Published by Janis Barnett. Modified over 9 years ago.
1
Molecular phylogenetics 4
Level 3 Molecular Evolution and Bioinformatics
Jim Provan
Page and Holmes: Sections 6.7-8
2
Have we got the true tree?
Several approaches have been developed to answer this question:
Analysis:
– In some cases (e.g. UPGMA) the phylogenetic method is simple enough that we can establish mathematically the exact conditions under which it will fail
– Parsimony can fail under particular distributions of edge lengths
Known phylogenies:
– The best evidence for the success of a tree-building method would be if it could accurately reconstruct a known phylogeny
– Typically, “known” phylogenies exist only for crop plants and laboratory animals, and even these are often suspect
– Growth of bacteriophage T7 in the presence of mutagens allowed comparison of tree-building methods
3
Have we got the true tree?
Several approaches (continued):
Simulation:
– Provide software with a tree and “evolve” DNA sequences along its branches according to some model
– Supply the resulting sequences to a range of tree-building methods and determine which (if any) recover the original tree
– An advantage of this approach is that we can explore the effects of a wide range of parameters on the performance of tree reconstruction methods
– A disadvantage is that the models used to generate the sequences may be unrealistic, and may bias the comparison towards a particular method
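The simulation idea can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the Jukes-Cantor model (the slides do not name a model), and the four-taxon tree, its branch lengths, the sequence length and the function names `evolve` and `simulate` are all made up for the example.

```python
import math
import random

BASES = "ACGT"

def evolve(seq, branch_length, rng):
    """Evolve a sequence along one branch under the Jukes-Cantor model.

    branch_length is the expected number of substitutions per site.
    """
    # Jukes-Cantor probability that a site ends the branch as a different base.
    p_change = 0.75 * (1.0 - math.exp(-4.0 * branch_length / 3.0))
    out = []
    for base in seq:
        if rng.random() < p_change:
            # Each of the three other bases is equally likely.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def simulate(tree, seq, rng):
    """Return {leaf_name: sequence} given the sequence at the root of tree.

    A tree is either a leaf name (str) or a list of
    (subtree, branch_length) pairs.
    """
    if isinstance(tree, str):
        return {tree: seq}
    leaves = {}
    for subtree, length in tree:
        leaves.update(simulate(subtree, evolve(seq, length, rng), rng))
    return leaves

rng = random.Random(1)
root = "".join(rng.choice(BASES) for _ in range(60))

# Hypothetical tree ((A,B),(C,D)) with made-up branch lengths.
tree = [
    ([("A", 0.1), ("B", 0.1)], 0.05),
    ([("C", 0.1), ("D", 0.1)], 0.05),
]

alignment = simulate(tree, root, rng)
for name, seq in sorted(alignment.items()):
    print(name, seq)
```

The simulated sequences would then be handed to each tree-building method in turn to see which ones recover ((A,B),(C,D)). Varying the branch lengths or sequence length is how the parameter space is explored.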
4
The “Felsenstein Zone”
[Figure: two panels, labelled UPGMA and Parsimony]
5
Congruence
Congruence is the agreement between estimates of phylogeny based on different characters:
– If data sets are independent, the probability of obtaining similar trees by chance alone is extremely small
– Conversely, if different data sets give similar trees, this suggests that both reflect the same underlying cause, namely the same evolutionary history
Two ways of using congruence:
– To validate a method of inference: a method that consistently recovers similar trees from different data sets will be preferred to one that produces different trees from different data sets
– To validate a new source of data: does a newly sequenced gene contain phylogenetic information?
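To get a feel for how small the chance of agreement is, one can count the possible trees. The sketch below uses the standard result that n labelled taxa have (2n - 5)!! distinct unrooted bifurcating trees. It assumes, as a deliberate simplification, that an uninformative data set would pick one of these trees uniformly at random. The function name is illustrative.

```python
def num_unrooted_trees(n_taxa):
    """Number of distinct unrooted bifurcating trees for n_taxa labelled tips.

    This is the double factorial (2n - 5)!! = 3 * 5 * ... * (2n - 5).
    """
    if n_taxa < 3:
        raise ValueError("need at least three taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count

for n in (4, 5, 10):
    trees = num_unrooted_trees(n)
    # Under the uniform assumption, 1 / trees is the chance that two
    # unrelated data sets return exactly the same topology.
    print(n, trees, 1 / trees)
```

With only ten taxa there are already over two million unrooted trees, so identical trees from two independent data sets are very unlikely to be a coincidence.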
6
Sampling error
If a data set contains homoplasy then different nucleotide sites support different trees:
– Which tree(s) a given data set supports depends on which characters have been sampled
– Estimates of phylogeny based on samples will be accompanied by sampling error
Effects of sampling error are evident when comparing trees for different mitochondrial genes:
– Since there is no recombination, all mitochondrial genes share the same evolutionary history
– Nevertheless, several different trees were obtained
Sampling of taxa is also important
7
Bootstrapping
Bootstrapping is a way of estimating sampling error without taking repeated samples from the population / species under study:
– Mimics the technique of repeated sampling from the original population by resampling from the original sample
– Each resampling is a pseudoreplicate
Bootstrapping can be applied to phylogenetics by taking several pseudoreplicates:
– Sampling sites with replacement gives a new data set based on the original sample:
  – Some sites represented more than once
  – Some sites not represented at all
– Each pseudoreplicate can be used to construct a new tree
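The resampling step can be sketched as follows. This is a minimal illustration on a small five-taxon, nine-site example alignment. The function name `bootstrap_columns` and the fixed random seed are assumptions made for the example, and building a tree from each pseudoreplicate is left to whichever tree-building method is being assessed.

```python
import random

def bootstrap_columns(alignment, rng):
    """Return one pseudoreplicate and the sampled site indices (0-based).

    alignment maps taxon name -> aligned sequence (all the same length).
    """
    n_sites = len(next(iter(alignment.values())))
    # Draw as many sites as the original alignment has, with replacement.
    sites = [rng.randrange(n_sites) for _ in range(n_sites)]
    replicate = {
        name: "".join(seq[i] for i in sites)
        for name, seq in alignment.items()
    }
    return replicate, sites

alignment = {
    "Human":      "TCCTTAAAA",
    "Chimp":      "TTCTATAAA",
    "Gorilla":    "TTACAATAA",
    "Orang-utan": "CCACAAATA",
    "Gibbon":     "CCACAAAAT",
}

rng = random.Random(0)
replicate, sites = bootstrap_columns(alignment, rng)

print("sampled sites:", [i + 1 for i in sites])  # shown 1-based
for name, seq in replicate.items():
    print(f"{name:<11}{seq}")
```

Whole columns are resampled, so every taxon receives the same sites in the same order. Repeating the call (typically 100 or more times) and building a tree from each pseudoreplicate gives the frequencies with which each grouping is recovered.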
8
Bootstrapping
Original data:
Site         1 2 3 4 5 6 7 8 9
Human        T C C T T A A A A
Chimp        T T C T A T A A A
Gorilla      T T A C A A T A A
Orang-utan   C C A C A A A T A
Gibbon       C C A C A A A A T
Pseudoreplicate:
Site         2 7 7 3 1 7 4 9 6
Human        C A A C T A T A A
Chimp        C A A C T A T A T
Gorilla      A T T A T T C A A
Orang-utan   A A A A C A C A A
Gibbon       A A A A C A C T A
[Figure: original tree and bootstrap tree]
9
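Bootstrapping
[Figure: three alternative trees, with tips labelled C, G, H, B and O, and the number of bootstrap pseudoreplicates recovering each]
– Tree with tip order C G H B O: 41/100
– Tree with tip order B O G H C: 28/100
– Tree with tip order B O C H G: 31/100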
10
What can go wrong?
Sampling error:
– Almost all phylogenies are based on a sample of some sort
– This matters especially given the vagaries of homoplasy
Incorrect model of sequence evolution:
– All methods make implicit or explicit assumptions about the evolutionary process
– An example is the problem of base composition:
  – An AT-rich part of a gene may be more similar to an AT-rich part of a different gene purely by chance
Tree structure:
– Evolutionary history is not always simple:
  – Rapid cladogenesis
  – Widely differing rates of divergence
  – Horizontal gene transfer
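The base-composition problem can be made concrete with a little arithmetic. If two unrelated sequences have independent sites, the chance that a given site matches is the sum over bases of the product of their frequencies. The sketch below assumes that independence, and the AT-rich frequencies are invented for illustration.

```python
def expected_identity(freqs_a, freqs_b):
    """Chance that one site matches between two unrelated sequences,
    given the base frequencies of each sequence."""
    return sum(freqs_a[base] * freqs_b[base] for base in "ACGT")

# Hypothetical base frequencies.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
at_rich = {"A": 0.40, "C": 0.10, "G": 0.10, "T": 0.40}

print("uniform vs uniform:", expected_identity(uniform, uniform))
print("AT-rich vs AT-rich:", expected_identity(at_rich, at_rich))
print("AT-rich vs uniform:", expected_identity(at_rich, uniform))
```

Under these made-up frequencies, two AT-rich sequences are expected to match at about 34% of sites by chance, compared with 25% for sequences with equal base frequencies. A method that ignores composition could mistake that extra similarity for shared ancestry.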