What is Bootstrapping in Bioinformatics

Bootstrapping in Phylogeny

In phylogenetics, where scientists reconstruct the evolutionary relationships between organisms, the phylogenetic tree is the central result. However, any single tree is only an estimate drawn from the data at hand. Bootstrapping is a statistical resampling technique that helps assess how strongly the data support each branch of a phylogenetic tree. It allows researchers to evaluate how robust the tree structure is and to identify branches that may be unreliable.

Beyond a Single Tree: Unveiling Uncertainty

Traditional phylogenetic methods, like maximum likelihood or parsimony, often produce a single "best" tree based on the chosen criteria. However, this single tree doesn't tell the whole story:

  • Limited Data: Phylogenetic analyses rely on a finite set of sequences. Unsampled genetic diversity or incomplete fossil records can lead to uncertainty in tree branching patterns.

  • Model Imperfections: The evolutionary models used in phylogenetic reconstruction are simplifications of reality. Factors like varying evolutionary rates or lateral gene transfer can introduce complexities not fully captured by the model.

The Power of Bootstrapping: Resampling with a Twist

Bootstrapping tackles these uncertainties by creating a multitude of pseudo-replicates of the original dataset:

  1. Resampling with Replacement: The core concept lies in resampling the columns (sites) of the original sequence alignment with replacement. Sites are randomly chosen from the original alignment, so the same site can be picked multiple times for a single replicate while other sites are left out entirely. Every sequence (taxon) is kept in every replicate. This creates a new alignment with the same number of sites as the original, but with a different weighting of the sites.

  2. Building Replicate Trees: For each replicate dataset generated, a new phylogenetic tree is constructed using the same method used for the original analysis (e.g., maximum likelihood). This process is repeated for a predetermined number of replicates (typically hundreds or even thousands).

  3. Frequency of Branches: By analyzing the resulting collection of replicate trees, researchers can assess the frequency with which specific branches appear. Branches that are consistently present across a high percentage of replicates are considered more reliable, reflecting strong support for those relationships in the tree. Conversely, branches appearing only in a small fraction of replicates suggest weaker support and potentially higher uncertainty.
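The resampling step can be sketched in a few lines of Python. In the standard phylogenetic bootstrap, the unit drawn with replacement is the alignment column (site), so every taxon stays in each replicate. This is a minimal sketch under stated assumptions: the toy alignment, the taxon names, and the function name `bootstrap_alignment` are all hypothetical, and the tree-building step (for example a maximum likelihood program run on each replicate) is deliberately left out.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one pseudo-replicate by resampling alignment columns with replacement.

    `alignment` maps taxon name -> aligned sequence (all of equal length).
    """
    n_sites = len(next(iter(alignment.values())))
    # Draw n_sites column indices with replacement: some columns repeat, some drop out.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    # Apply the same column choice to every taxon so the columns stay intact.
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}

# Toy alignment (hypothetical data, for illustration only).
alignment = {
    "taxonA": "ACGTACGTAC",
    "taxonB": "ACGTTCGTAC",
    "taxonC": "ACGAACGTTC",
    "taxonD": "TCGAACGTTC",
}

rng = random.Random(42)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(alignment, rng) for _ in range(100)]
```

Each entry in `replicates` would then be passed to the same tree-building method that was used for the original alignment.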

The Advantages of Bootstrapping: Quantifying Confidence

Bootstrapping offers a valuable tool for phylogenetic analysis by:

  • Estimating Branch Support: Bootstrap percentages (or bootstrap support values) are assigned to each internal branch in the original tree. Each percentage is the share of replicate trees in which that branch appeared, where a branch counts as appearing when the replicate tree contains the same grouping of taxa. Higher percentages indicate stronger support for the branching pattern.

  • Identifying Areas of Uncertainty: By highlighting branches with low bootstrap support, bootstrapping helps researchers identify areas in the tree where the data might be inconclusive or the model limitations might be impacting the results.

  • Guiding Further Research: Low bootstrap support can guide researchers to prioritize collecting additional data (more sequences) or exploring alternative evolutionary models for a more robust understanding of the relationships.
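To make the idea of a bootstrap percentage concrete, here is a small Python sketch of the counting step. It assumes rooted trees, each represented simply as a set of clades, with a clade stored as a `frozenset` of taxon names. The function name `bootstrap_support` and the five replicate trees are hypothetical, and real software typically works with bipartitions of unrooted trees instead.

```python
def bootstrap_support(reference_clades, replicate_trees):
    """Percentage of replicate trees that contain each clade of the reference tree."""
    n = len(replicate_trees)
    return {
        clade: 100.0 * sum(clade in tree for tree in replicate_trees) / n
        for clade in reference_clades
    }

# Clades over four taxa A, B, C, D (hypothetical example).
AB = frozenset("AB")
AC = frozenset("AC")
CD = frozenset("CD")
ABC = frozenset("ABC")
ABD = frozenset("ABD")

# Reference tree (((A,B),C),D) has the non-trivial clades AB and ABC.
reference_clades = [AB, ABC]

# Five made-up replicate trees, each given as its set of non-trivial clades.
replicate_trees = [
    {AB, ABC},
    {AB, ABC},
    {AB, CD},
    {AC, ABC},
    {AB, ABD},
]

support = bootstrap_support(reference_clades, replicate_trees)
for clade, pct in support.items():
    print(sorted(clade), pct)
```

In this toy example the clade AB is found in 4 of 5 replicates (80%) and ABC in 3 of 5 (60%), so the AB grouping would be labeled as the better supported of the two.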

Considerations and Limitations

While bootstrapping offers a valuable tool, it's essential to consider some limitations:

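  • Over- or Underestimation of Support: Bootstrap values are not exact probabilities that a branch is correct. They can overestimate support, particularly when the model fits the data poorly or the evolutionary scenario is complex, and they can also be conservative. With short sequences the estimates are noisy in either direction.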

  • Statistical Dependence on Replicates: The precision of bootstrap support values depends on the number of replicates used. More replicates reduce the random error in each percentage, but they do not correct errors that come from the data or the model.

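  • Not a Measure of Absolute Certainty: Even high bootstrap values don't guarantee that a branch is correct. They indicate that the signal for the branch is consistent across the sites of the alignment under the chosen model and method. If the method is systematically misled, for example by long-branch attraction, an incorrect branch can still receive high support.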
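The effect of the number of replicates can be illustrated with a rough calculation. The sketch below treats each replicate as an independent yes/no draw for a given branch, so the standard error of a support value p estimated from B replicates is approximately sqrt(p(1 - p) / B). This binomial approximation is an assumption made here for illustration, and the function name `support_standard_error` is hypothetical.

```python
import math

def support_standard_error(p, n_replicates):
    """Approximate standard error of a bootstrap proportion p from n_replicates trees."""
    return math.sqrt(p * (1.0 - p) / n_replicates)

# Random error, in percentage points, around a true support of 70%.
for n_replicates in (100, 1000, 10000):
    se_points = 100 * support_standard_error(0.7, n_replicates)
    print(n_replicates, round(se_points, 2))
```

Under this approximation, a branch with 70% support carries a random error of roughly 4.6 percentage points at 100 replicates, about 1.4 at 1,000, and under 0.5 at 10,000. This only describes the sampling noise from the finite number of replicates; it says nothing about bias from the model or the data.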


Bootstrapping serves as a cornerstone technique for assessing the confidence associated with phylogenetic trees. By providing a statistical framework for evaluating branch support, it allows researchers to move beyond a single "best" tree and examine the uncertainties within their analyses. As our understanding of evolutionary processes deepens and computational tools advance, bootstrapping will likely remain a crucial element in building a more robust and nuanced picture of the evolutionary web of life.
