1
Phylogenetic Trees - Parsimony
Tutorial #12
Next semester: Project in advanced algorithms for phylogenetic reconstruction (236512)
Initial details at: http://www.cs.technion.ac.il/~moran/lab06.htm
Come to me for more details.
2
Phylogenetic Reconstruction
We’d like to study the evolutionary history of species.
Distance-based approach:
- Calculate (ML) pairwise (evolutionary) distances between species
- Find the edge-weighted tree best describing this metric
- Major drawback: loss of information when reducing the data to pairwise distances
Character-based approach:
- Consider the character vector of each species: morphological characters, bio-molecular characters
- Optimization criteria: parsimony, likelihood / posterior probability
3
Most Parsimonious Tree
Parsimony score: the number of character changes (mutations) along the evolutionary tree (a tree that also has labels on its internal vertices).
Most parsimonious tree: the tree with minimal parsimony score (Minimal Evolution Principle).
[Figure: example trees labeled with the sequences AAA, AGA, AAG and GGA, with the number of changes marked on each edge; one tree has Score = 4, the other Score = 3.]
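The score defined above can be sketched in a few lines of Python. This is only an illustration: the `(label, children)` tuple representation, the function names and the two labeled trees are assumptions made for the example, and the trees are not necessarily the ones drawn on the slide.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def parsimony_score(tree):
    """Sum of character changes over all edges of a fully labeled tree.

    A tree is a pair (label, children); a leaf has an empty children list.
    """
    label, children = tree
    return sum(hamming(label, child[0]) + parsimony_score(child)
               for child in children)


# Two hypothetical labeled trees over the leaves AGA, GGA, AAA, AAG.
tree_a = ("AAA", [("AGA", [("AGA", []), ("GGA", [])]),
                  ("AAA", [("AAA", []), ("AAG", [])])])
tree_b = ("AAA", [("AAA", [("AGA", []), ("AAA", [])]),
                  ("AAA", [("AAG", []), ("GGA", [])])])

print(parsimony_score(tree_a))  # 3
print(parsimony_score(tree_b))  # 4
```

Under these assumed labelings, the first tree is the more parsimonious of the two.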
4
Small vs. Large Parsimony
We break the problem into two:
1. Small parsimony: given the topology, find the best assignment of states to the internal nodes
2. Large parsimony: find the topology that gives the best score
Large parsimony is NP-hard.
We’ll show solutions to small parsimony (Fitch’s and Sankoff’s algorithms).
Input to small parsimony: a tree with character-state assignments at the leaves.
Example:
A (Aardvark): CAGGTA
B (Bison): CAGACA
C (Chimp): CGGGTA
D (Dog): TGCACT
E (Elephant): TGCGTA
5
Fitch’s Algorithm
Execute independently for each character (dynamic programming framework):
1. Bottom-up phase: determine the set of possible states for each internal node
2. Top-down phase: pick a state for each internal node
[Figure: tree over Aardvark, Bison, Chimp, Dog and Elephant with the sequences CAGGTA, CAGACA, CGGGTA, TGCACT and TGCGTA; the third character (G, G, G, C, C) is highlighted, and the two phases are marked 1 and 2.]
6
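Fitch’s Algorithm: Bottom-up phase
Determine the set of possible states R_i for each internal node.
Initialization: R_i = {s_i} for each leaf i
Do a post-order (from leaves to root) traversal of the tree:
- Determine R_i of internal node i with children j, k:
  R_i = R_j ∩ R_k if R_j ∩ R_k ≠ ∅, otherwise R_i = R_j ∪ R_k
Parsimony score = number of union operations
[Figure: example tree whose nodes carry the state sets T, CT, GT and AGT; score = 3.]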
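A minimal Python sketch of the bottom-up phase for a single character follows. The nested-tuple tree representation and the name `fitch_bottom_up` are assumptions made for the example, and the topology is a hypothetical one chosen so that the sets resemble those in the slide's example (CT, GT, AGT, T).

```python
def fitch_bottom_up(tree):
    """Return (R_i, score) for the root of a binary tree.

    A leaf is a one-character string; an internal node is a pair
    (left_subtree, right_subtree). The score counts union operations.
    """
    if isinstance(tree, str):          # leaf: R_i = {s_i}
        return {tree}, 0
    r_j, score_j = fitch_bottom_up(tree[0])
    r_k, score_k = fitch_bottom_up(tree[1])
    common = r_j & r_k
    if common:                         # non-empty intersection: no change needed
        return common, score_j + score_k
    return r_j | r_k, score_j + score_k + 1   # union: one more change


# Hypothetical topology over the leaf states C, T, T, G, T, A.
example = ((("C", "T"), "T"), (("G", "T"), "A"))
root_set, score = fitch_bottom_up(example)
print(root_set, score)  # {'T'} 3
```

Here the three unions happen at (C, T), at (G, T) and at ({G, T}, A), so the score is 3.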
7
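Fitch’s Algorithm: Top-down phase
Pick a state for each internal node.
Pick an arbitrary state in R_root for the root.
Do a pre-order (from root to leaves) traversal of the tree:
- Determine s_j of internal node j with parent i:
  s_j = s_i if s_i ∈ R_j, otherwise s_j is an arbitrary state in R_j
Complexity: O(mnk), where m = #characters, n = #taxa/nodes, k = #states
[Figure: the example tree (state sets T, CT, GT, AGT) with one state picked at each node; score = 3.]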
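Both phases can be combined in a short Python sketch for a single character. The representation (nested tuples, leaves as one-character strings), the helper names and the example topology are assumptions; ties are broken with `min` only to make the "arbitrary" choice deterministic.

```python
def fitch_sets(tree):
    """Bottom-up phase: annotate every node as (R_i, children or None)."""
    if isinstance(tree, str):
        return ({tree}, None)
    left, right = fitch_sets(tree[0]), fitch_sets(tree[1])
    common = left[0] & right[0]
    return (common or (left[0] | right[0]), (left, right))


def fitch_top_down(annotated, parent_state=None):
    """Top-down phase: keep the parent's state when it lies in R_j."""
    states, children = annotated
    # Otherwise pick an arbitrary state of R_j (min: deterministic choice).
    state = parent_state if parent_state in states else min(states)
    if children is None:
        return state
    return (state,
            fitch_top_down(children[0], state),
            fitch_top_down(children[1], state))


def count_changes(labeled):
    """Count edges whose endpoints carry different states."""
    if isinstance(labeled, str):
        return 0
    state = labeled[0]
    total = 0
    for child in labeled[1:]:
        child_state = child if isinstance(child, str) else child[0]
        total += (child_state != state) + count_changes(child)
    return total


# Hypothetical topology over the leaf states C, T, T, G, T, A.
example = ((("C", "T"), "T"), (("G", "T"), "A"))
labeled = fitch_top_down(fitch_sets(example))
print(labeled)                 # every internal node ends up labeled 'T'
print(count_changes(labeled))  # 3
```

The number of changes in the labeled tree equals the number of union operations counted in the bottom-up phase.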
8
Weighted Parsimony: Sankoff’s algorithm
Each mutation a↔b has its own cost S(a,b).
1. Bottom-up phase: determine R_i(s), the cost of an optimal state assignment for the subtree of i when i is assigned state s
2. Top-down phase: pick optimal states for each internal node
Fitch’s algorithm as a special case: R_i is the set of states that yield a minimal-cost subtree of i.
Same as the algorithm for optimal lifted tree alignment (Tutorial #4).
9
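Sankoff’s Algorithm: Bottom-up phase
Determine R_i(s) for each internal node.
Initialization (leaf i with state s_i): R_i(s) = 0 if s = s_i, otherwise R_i(s) = ∞
Do a post-order (from leaves to root) traversal of the tree:
- Determine R_i of internal node i with children j, k:
  R_i(s) = min_{s'} [R_j(s') + S(s,s')] + min_{s'} [R_k(s') + S(s,s')]
- Remember pointers from each state s to the minimizing states s' of the children
Natural generalization for non-binary trees: one such minimum per child.
[Figure: example tree with leaf states C, T, A, G, T, T.]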
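The recursion can be sketched in Python as below. The cost function is a hypothetical one (transitions cost 1, transversions cost 2) chosen only for illustration; the tree representation and names are likewise assumptions. Because the sum runs over all children, the same code handles non-binary trees.

```python
import math

STATES = "ACGT"


def cost(a, b):
    """Hypothetical S(a, b): 0 if equal, 1 for transitions, 2 for transversions."""
    if a == b:
        return 0
    return 1 if {a, b} in ({"A", "G"}, {"C", "T"}) else 2


def sankoff_bottom_up(tree):
    """Return the table {s: R_i(s)} for the root of `tree`.

    A leaf is a one-character string; an internal node is a tuple of subtrees.
    """
    if isinstance(tree, str):
        return {s: (0 if s == tree else math.inf) for s in STATES}
    tables = [sankoff_bottom_up(child) for child in tree]
    return {
        s: sum(min(t[s2] + cost(s, s2) for s2 in STATES) for t in tables)
        for s in STATES
    }


# Hypothetical four-leaf tree.
table = sankoff_bottom_up((("C", "T"), ("A", "G")))
print(table)                # every root state costs 4 for this example
print(min(table.values()))  # 4
```

For this particular tree and cost function all four root states tie at cost 4; in general the minimum over the root table is the weighted parsimony score.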
10
Sankoff’s Algorithm: Top-down phase
Pick a state for each internal node.
Select the minimal-cost state for the root (the s minimizing R_root(s)).
Do a pre-order (from root to leaves) traversal of the tree:
- For internal node j with parent i, select the state that produced the minimal cost at i (use the pointers kept in the 1st stage)
Complexity: O(mnk^2), where m = #characters, n = #taxa/nodes, k = #states
[Figure: example tree with leaf states C, T, A, G, T, T.]
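A sketch of both phases with explicit pointers might look as follows. The dictionary-based node layout, the function names and the transition/transversion cost function are assumptions made for this illustration, not part of the tutorial.

```python
import math

STATES = "ACGT"


def cost(a, b):
    """Hypothetical S(a, b): 0 if equal, 1 for transitions, 2 for transversions."""
    if a == b:
        return 0
    return 1 if {a, b} in ({"A", "G"}, {"C", "T"}) else 2


def bottom_up(tree):
    """Compute R_i(s) and, for each s, the best state of every child."""
    if isinstance(tree, str):
        leaf_cost = {s: (0 if s == tree else math.inf) for s in STATES}
        return {"cost": leaf_cost, "ptr": {}, "children": []}
    children = [bottom_up(child) for child in tree]
    node = {"cost": {}, "ptr": {}, "children": children}
    for s in STATES:
        best = [min(STATES, key=lambda s2: c["cost"][s2] + cost(s, s2))
                for c in children]
        node["ptr"][s] = best          # pointers used by the top-down phase
        node["cost"][s] = sum(c["cost"][b] + cost(s, b)
                              for c, b in zip(children, best))
    return node


def top_down(node, state=None):
    """Follow the stored pointers from the root down to the leaves."""
    if state is None:                  # root: state minimizing R_root(s)
        state = min(STATES, key=node["cost"].get)
    if not node["children"]:
        return state
    picks = node["ptr"][state]
    return (state, *[top_down(c, b) for c, b in zip(node["children"], picks)])


# Hypothetical four-leaf tree.
root = bottom_up((("C", "T"), ("A", "G")))
print(min(root["cost"].values()))  # 4
print(top_down(root))              # ('A', ('C', 'C', 'T'), ('A', 'A', 'G'))
```

The labeling returned here has changes A→C (2), C→T (1) and A→G (1), which sum to the minimal cost 4; other labelings with the same cost exist because of ties.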
11
Fitch’s Algorithm as a special case of Sankoff’s algorithm
Unweighted parsimony:
Sankoff’s algorithm:
- R_i(s): cost of an optimal subtree of i when i is assigned state s
Fitch’s algorithm:
- Score(i): cost of an optimal state assignment for the subtree of i
- R_i: set of states that i takes in optimal state assignments for the subtree of i
We need to show that:
1. An optimal tree assigns node i a state from R_i.
2. Fitch’s bottom-up recursive formula for R_i is correct (check for yourselves):
   R_i = R_j ∩ R_k if R_j ∩ R_k ≠ ∅, otherwise R_i = R_j ∪ R_k
12
Fitch’s Algorithm as a special case of Sankoff’s algorithm
Unweighted parsimony:
- Score(i): cost of an optimal state assignment for the subtree of i
- R_i: set of states that i takes in optimal state assignments for the subtree of i
We need to show that:
1. An optimal tree assigns node i a state from R_i.
- Trivially true for the root.
- Assume (to the contrary) that in an optimal assignment some node j, with parent i, is assigned s_j ∉ R_j.
- Since the parsimony score is an integer, s_j ∉ R_j implies R_j(s_j) ≥ Score(j) + 1.
- By switching from s_j to some s ∊ R_j we do not raise the parsimony score: the subtree of j gets cheaper by at least 1, and the edge (i, j) gets costlier by at most 1.
Why is this not the case for the weighted version?
[Figure: path from the root through node i to its child j, with s_j ∉ R_j.]
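The special-case claim can be checked numerically on small examples. The sketch below is an assumption-laden illustration (nested-tuple trees, a four-letter alphabet, hypothetical function names): it runs Fitch’s recursion and Sankoff’s recursion with unit costs, then compares Fitch’s set R_i with the set of states minimizing R_i(s) at the root.

```python
import math

STATES = "ACGT"


def fitch(tree):
    """Fitch bottom-up: return (R_i, Score(i))."""
    if isinstance(tree, str):
        return {tree}, 0
    (r_j, score_j), (r_k, score_k) = fitch(tree[0]), fitch(tree[1])
    if r_j & r_k:
        return r_j & r_k, score_j + score_k
    return r_j | r_k, score_j + score_k + 1


def sankoff_unit(tree):
    """Sankoff bottom-up with S(a, b) = 1 for a != b and 0 otherwise."""
    if isinstance(tree, str):
        return {s: (0 if s == tree else math.inf) for s in STATES}
    tables = [sankoff_unit(child) for child in tree]
    return {
        s: sum(min(t[s2] + (s != s2) for s2 in STATES) for t in tables)
        for s in STATES
    }


def agree(tree):
    """True if Fitch's R_i and score match Sankoff's minimizers and minimum."""
    r_i, score = fitch(tree)
    table = sankoff_unit(tree)
    best = min(table.values())
    return r_i == {s for s in STATES if table[s] == best} and score == best


# Hypothetical topology over the leaf states C, T, T, G, T, A.
example = ((("C", "T"), "T"), (("G", "T"), "A"))
print(agree(example))          # True
print(agree(example[1]))       # True for the subtree ((G, T), A) as well
```

This is a spot check on one tree, not a proof; the argument on the slide is what establishes the claim in general.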
13
Exploring the Space of Trees
We saw how to find an optimal state assignment for a given tree topology.
We need to explore the space of topologies.
Given n sequences there are (2n-3)!! possible rooted trees and (2n-5)!! possible unrooted trees:
taxa (n)    # rooted trees    # unrooted trees
3           3                 1
4           15                3
5           105               15
6           945               105
8           135,135           10,395
10          34,459,425        2,027,025
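The counts in the table follow directly from the double-factorial formulas, as this short sketch shows (the function names are chosen for the example).

```python
def double_factorial(x):
    """x!! = x * (x - 2) * (x - 4) * ... down to 1 (or 2)."""
    result = 1
    while x > 1:
        result *= x
        x -= 2
    return result


def rooted_trees(n):
    """Number of rooted binary trees on n labeled taxa: (2n-3)!!"""
    return double_factorial(2 * n - 3)


def unrooted_trees(n):
    """Number of unrooted binary trees on n labeled taxa: (2n-5)!!"""
    return double_factorial(2 * n - 5)


for n in (3, 4, 5, 6, 8, 10):
    print(n, rooted_trees(n), unrooted_trees(n))
```

Already at n = 10 there are over 34 million rooted topologies, which is why exhaustive search is not practical.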
14
Exploring the Space of Trees
Possible solutions:
1. Heuristic solutions for “traveling” through “topology-space”
2. Find a (basic) topology using distance-based methods (NJ)
Notice another problem:
- We obtain state assignments to taxa using multiple alignment (MA)
- We obtain an optimal MA using the topology of the phylogenetic tree (e.g. CLUSTAL)
Solution: again, use some initial topology (via NJ).
[Figure: a gapped multiple alignment with columns labeled C_1, C_2, …, C_m.]