1 Phylogenetic Trees - Parsimony. Tutorial #11. © Ilan Gronau. Based on original slides of Ydo Wexler & Dan Geiger
2 Phylogenetic Reconstruction
The input: a species-characters matrix (n species, k characters). Each character represents some observable trait and takes values from a finite set.
The output: a tree with n leaves corresponding to the input species.
3 Example: Most Parsimonious Tree
Parsimony score: the number of character changes (mutations) along the evolutionary tree (a tree with labels on its internal vertices).
Most parsimonious tree: the tree with minimal parsimony score (Minimal Evolution Principle).
[Figure: two labeled trees over the sequences AAA, AGA, AAG and GGA, with per-edge mutation counts; Score = 4 for one tree and Score = 3 for the other.]
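A minimal sketch of how the parsimony score of a fully labeled tree can be computed: sum the per-position differences along every edge. The two edge lists below are assumptions, chosen only to be consistent with the scores 4 and 3 quoted on the slide. The figure's exact topologies are not recoverable from the transcript, and the names `hamming`, `parsimony_score`, `tree_a` and `tree_b` are illustrative.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def parsimony_score(edges):
    """Sum of mutations over all (parent_label, child_label) edges of a labeled tree."""
    return sum(hamming(parent, child) for parent, child in edges)


# Assumed tree A: ((AAA, AGA), (AAG, GGA)) with all internal nodes labeled AAA.
tree_a = [
    ("AAA", "AAA"), ("AAA", "AAA"),   # root -> two internal nodes
    ("AAA", "AAA"), ("AAA", "AGA"),   # first internal node -> leaves
    ("AAA", "AAG"), ("AAA", "GGA"),   # second internal node -> leaves
]

# Assumed tree B: ((AAA, AAG), (AGA, GGA)) with internal labels AAA and AGA.
tree_b = [
    ("AAA", "AAA"), ("AAA", "AGA"),   # root -> two internal nodes
    ("AAA", "AAA"), ("AAA", "AAG"),   # first internal node -> leaves
    ("AGA", "AGA"), ("AGA", "GGA"),   # second internal node -> leaves
]

score_a = parsimony_score(tree_a)
score_b = parsimony_score(tree_b)
print(score_a, score_b)
```

Under these assumed labelings the second tree is the more parsimonious of the two.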
4 Small vs. Large Parsimony
We break the problem into two:
1. Small parsimony: given the topology, find the best assignment to the internal nodes.
2. Large parsimony: find the topology that gives the best score.
Large parsimony is NP-hard. We’ll show solutions to small parsimony (Fitch’s and Sankoff’s algorithms).
Input to small parsimony: a tree with character-state assignments to the leaves.
Example (leaves Aardvark, Bison, Chimp, Dog, Elephant):
A: CAGGTA
B: CAGACA
C: CGGGTA
D: TGCACT
E: TGCGTA
5 Fitch’s Algorithm
Dynamic programming framework. Execute independently for each character:
1. Bottom-up phase: determine the set of possible states for each internal node.
2. Top-down phase: pick a state for each internal node.
[Figure: tree over Aardvark, Bison, Chimp, Dog, Elephant with leaf sequences CAGGTA, CAGACA, CGGGTA, TGCACT, TGCGTA; the third character (G, G, G, C, C) is singled out.]
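6 Fitch’s Algorithm: Bottom-up phase
Determine the set of possible states $R_i$ for each internal node.
Initialization: $R_i = \{s_i\}$ for all leaves.
Do a post-order (from leaves to root) traversal of the tree:
- Determine $R_i$ of internal node $i$ with children $j, k$: $R_i = R_j \cap R_k$ if $R_j \cap R_k \neq \emptyset$, and $R_i = R_j \cup R_k$ otherwise.
Parsimony score = number of union operations.
[Figure: example tree with leaf states C, T, A, G, T, T; the internal sets shown include {C,T}, {A,G,T}, {G,T} and {T}; score = 3.]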
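The bottom-up phase can be sketched as a short recursion for a single character. This is a minimal sketch: the nested-tuple tree encoding and the name `fitch_bottom_up` are assumptions made for illustration, and the topology is a guess that is consistent with the leaf states, internal sets and score of 3 on the slide, not necessarily the tree in the figure.

```python
def fitch_bottom_up(node):
    """Return (R_i, score) for a binary tree given as nested 2-tuples.

    Leaves are one-character strings; internal nodes are (left, right) tuples.
    """
    if isinstance(node, str):
        return {node}, 0                      # leaf: R_i = {s_i}
    left, right = node
    r_j, score_j = fitch_bottom_up(left)
    r_k, score_k = fitch_bottom_up(right)
    common = r_j & r_k
    if common:                                # non-empty intersection: no mutation
        return common, score_j + score_k
    return r_j | r_k, score_j + score_k + 1   # union: count one mutation


# Assumed topology over the leaf states C, T, A, G, T, T.
tree = (("C", "T"), (("A", ("G", "T")), "T"))
root_set, score = fitch_bottom_up(tree)
print(sorted(root_set), score)
```

The three unions happen at the nodes whose sets come out as {C,T}, {G,T} and {A,G,T}, which is where the score of 3 comes from.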
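7 Fitch’s Algorithm: Top-down phase
Pick states for each internal node.
Pick an arbitrary state in $R_{root}$ for the root.
Do a pre-order (from root to leaves) traversal of the tree:
- Determine $s_j$ of internal node $j$ with parent $i$: $s_j = s_i$ if $s_i \in R_j$; otherwise $s_j$ is an arbitrary state in $R_j$.
Complexity: $O(mnk)$, where m = #states, n = #taxa/nodes, k = #characters.
[Figure: the example tree from the bottom-up phase with states assigned; score = 3.]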
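A sketch of both phases together for one character, so that the top-down choice can be checked against the bottom-up score. The dictionary-based tree encoding, the node names (`r`, `u`, `v`, `w`, `x`, `l1` to `l6`) and the topology are assumptions for illustration. "Arbitrary" is implemented here as `min` of the set only to make the result deterministic.

```python
def fitch(children, leaf_state, root):
    """Run both Fitch phases for one character; return (sets, states, score)."""
    sets, states, score = {}, {}, 0

    def bottom_up(i):
        nonlocal score
        if i in leaf_state:
            sets[i] = {leaf_state[i]}
            return
        j, k = children[i]
        bottom_up(j)
        bottom_up(k)
        common = sets[j] & sets[k]
        if common:
            sets[i] = common
        else:
            sets[i] = sets[j] | sets[k]
            score += 1                      # one union = one mutation

    def top_down(j, parent_state):
        # Keep the parent's state when it is in R_j; otherwise pick any state of R_j.
        states[j] = parent_state if parent_state in sets[j] else min(sets[j])
        for child in children.get(j, ()):
            top_down(child, states[j])

    bottom_up(root)
    top_down(root, None)                    # None is never in R_root: root picks from R_root
    return sets, states, score


# Assumed topology: r -> (u, v), u -> (C, T), v -> (w, T), w -> (A, x), x -> (G, T).
children = {"r": ("u", "v"), "u": ("l1", "l2"), "v": ("w", "l6"),
            "w": ("l3", "x"), "x": ("l4", "l5")}
leaf_state = {"l1": "C", "l2": "T", "l3": "A", "l4": "G", "l5": "T", "l6": "T"}

sets, states, score = fitch(children, leaf_state, "r")
changes = sum(states[p] != states[c] for p, kids in children.items() for c in kids)
print(states, score, changes)
```

Counting the state changes along the edges of the resulting assignment gives the same number as the union count from the bottom-up phase.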
8 Weighted Parsimony: Sankoff’s algorithm
Each mutation a↔b has its own cost S(a,b).
1. Bottom-up phase: determine $R_i(s)$, the cost of the optimal state-assignment for the subtree of i when i is assigned state s.
2. Top-down phase: pick optimal states for each internal node.
Same as the algorithm for optimal lifted tree alignment (Tutorial #4).
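9 Sankoff’s Algorithm: Bottom-up phase
Determine $R_i(s)$ for each internal node.
Initialization: for each leaf $i$ with state $s_i$, $R_i(s) = 0$ if $s = s_i$ and $R_i(s) = \infty$ otherwise.
Do a post-order (from leaves to root) traversal of the tree:
- Determine $R_i$ of internal node $i$ with children $j, k$: $R_i(s) = \min_{s'} \{R_j(s') + S(s,s')\} + \min_{s'} \{R_k(s') + S(s,s')\}$.
Natural generalization for non-binary trees: add one such minimum per child.
Remember pointers from each $s$ to the minimizing $s'$.
[Figure: example tree with leaf states C, T, A, G, T, T.]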
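A minimal sketch of the bottom-up recursion, written so that it also covers non-binary nodes by adding one minimum per child. The cost function `transition_cost` (1 for A↔G or C↔T, 2 for any other change) is an assumed example of S(a,b), not taken from the slides. The nested-tuple encoding and the topology are likewise illustrative.

```python
import math

STATES = "ACGT"
PURINES = {"A", "G"}


def transition_cost(a, b):
    """Assumed example S(a,b): 0 if equal, 1 for a transition, 2 for a transversion."""
    if a == b:
        return 0
    return 1 if (a in PURINES) == (b in PURINES) else 2


def sankoff_bottom_up(node, cost=transition_cost):
    """Return the table {s: R_i(s)} for a tree of nested tuples with string leaves."""
    if isinstance(node, str):
        # Leaf: cost 0 for its observed state, infinity for every other state.
        return {s: (0 if s == node else math.inf) for s in STATES}
    table = {s: 0 for s in STATES}
    for child in node:                      # any number of children
        r_child = sankoff_bottom_up(child, cost)
        for s in STATES:
            table[s] += min(r_child[t] + cost(s, t) for t in STATES)
    return table


# Assumed topology over the leaf states C, T, A, G, T, T.
tree = (("C", "T"), (("A", ("G", "T")), "T"))
root_table = sankoff_bottom_up(tree)
print(root_table, min(root_table.values()))
```

Passing a unit cost, for example `lambda a, b: int(a != b)`, turns the same routine into an unweighted parsimony scorer.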
10 Sankoff’s Algorithm: Top-down phase
Pick states for each internal node.
Select the minimal-cost state for the root (the $s$ minimizing $R_{root}(s)$).
Do a pre-order (from root to leaves) traversal of the tree:
- For internal node j with parent i, select the state that produced the minimal cost at i (use the pointers kept in the first stage).
Complexity: $O(m^2 nk)$, where m = #states, n = #taxa/nodes, k = #characters.
[Figure: example tree with leaf states C, T, A, G, T, T.]
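The pointer bookkeeping and the top-down traceback might look like the sketch below. The dictionary-based tree, the node names, the topology and the transition/transversion cost function are all assumptions made for illustration; only the recursion and the use of pointers follow the slides.

```python
import math

STATES = "ACGT"
PURINES = {"A", "G"}


def cost(a, b):
    """Assumed example S(a,b): 0 if equal, 1 for a transition, 2 for a transversion."""
    if a == b:
        return 0
    return 1 if (a in PURINES) == (b in PURINES) else 2


def sankoff(children, leaf_state, root):
    """Return (optimal cost, chosen state per node) for one character."""
    R, ptr, states = {}, {}, {}

    def bottom_up(i):
        if i in leaf_state:
            R[i] = {s: (0 if s == leaf_state[i] else math.inf) for s in STATES}
            return
        for c in children[i]:
            bottom_up(c)
        R[i] = {}
        for s in STATES:
            # For each child, remember the s' minimizing R_c(s') + S(s, s').
            best = [min(STATES, key=lambda t: R[c][t] + cost(s, t)) for c in children[i]]
            ptr[i, s] = best
            R[i][s] = sum(R[c][t] + cost(s, t) for c, t in zip(children[i], best))

    def top_down(i, s):
        states[i] = s
        if i in children:
            for c, t in zip(children[i], ptr[i, s]):
                top_down(c, t)              # follow the stored pointer

    bottom_up(root)
    root_state = min(STATES, key=lambda s: R[root][s])
    top_down(root, root_state)
    return R[root][root_state], states


children = {"r": ("u", "v"), "u": ("l1", "l2"), "v": ("w", "l6"),
            "w": ("l3", "x"), "x": ("l4", "l5")}
leaf_state = {"l1": "C", "l2": "T", "l3": "A", "l4": "G", "l5": "T", "l6": "T"}

best_cost, states = sankoff(children, leaf_state, "r")
edge_total = sum(cost(states[p], states[c]) for p, kids in children.items() for c in kids)
print(best_cost, states)
```

Summing S over the edges of the traced-back assignment reproduces the minimal root cost, which is a quick sanity check on the pointers.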
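11 Fitch’s Algorithm as a special case of Sankoff’s algorithm
Unweighted parsimony: $S(a,b) = 1$ if $a \neq b$, and $S(a,a) = 0$.
Sankoff’s algorithm: $R_i(s)$ is the cost of the optimal state-assignment for the subtree of i when i is assigned state s.
Fitch’s algorithm: Score(i) is the cost of the optimal state-assignment for the subtree of i, and $R_i$ is the set of states that i takes in optimal state-assignments for the subtree of i. In Sankoff’s terms, $Score(i) = \min_s R_i(s)$ and $R_i = \{s : R_i(s) = Score(i)\}$.
We need to show that:
1. There is an optimal tree which assigns node i a state from $R_i$.
2. Fitch’s bottom-up recursive formula for $R_i$ is correct: $R_i = R_j \cap R_k$ if the intersection is non-empty, and $R_i = R_j \cup R_k$ otherwise.
(Slide annotations on these two items: "Check for yourselves" and "tricky part".)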
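12 Fitch’s Algorithm as a special case of Sankoff’s algorithm
Unweighted parsimony: Score(i) is the cost of the optimal state-assignment for the subtree of i, and $R_i$ is the set of states that i takes in optimal state-assignments for the subtree of i.
We need to show that:
1. There is an optimal tree which assigns node i a state from $R_i$.
Trivially true for the root.
Assume (to the contrary) that in an optimal assignment some node j is assigned $s_j \notin R_j$. Then $R_j(s_j) \geq Score(j) + 1$, because the parsimony score is an integer.
By switching from $s_j$ to some $s \in R_j$ (with an optimal assignment below j) we do not raise the parsimony score: the subtree of j becomes cheaper by at least 1, while the edge from j to its parent i becomes more expensive by at most 1.
Why is this not the case for the weighted version?
[Figure: path from the root through node i to its child j, with $s_j \notin R_j$.]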
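The claimed relationship can also be checked empirically. The sketch below compares Fitch's set and score with Sankoff's table under unit costs on random binary trees. It is a sanity check under assumed encodings (nested tuples, four states, a fixed random seed), not a proof, and all function names are illustrative.

```python
import math
import random

STATES = "ACGT"


def fitch(node):
    """Fitch bottom-up phase: return (R_i, Score(i)) for nested 2-tuples."""
    if isinstance(node, str):
        return {node}, 0
    (r_j, score_j), (r_k, score_k) = fitch(node[0]), fitch(node[1])
    common = r_j & r_k
    if common:
        return common, score_j + score_k
    return r_j | r_k, score_j + score_k + 1


def sankoff_unit(node):
    """Sankoff bottom-up phase with S(a,b) = 1 for a != b and 0 otherwise."""
    if isinstance(node, str):
        return {s: (0 if s == node else math.inf) for s in STATES}
    tables = [sankoff_unit(child) for child in node]
    return {s: sum(min(t[x] + (x != s) for x in STATES) for t in tables)
            for s in STATES}


def random_tree(n_leaves, rng):
    """Random binary tree with n_leaves leaves and random leaf states."""
    if n_leaves == 1:
        return rng.choice(STATES)
    left = rng.randint(1, n_leaves - 1)
    return (random_tree(left, rng), random_tree(n_leaves - left, rng))


def agrees(tree):
    """True if Score = min_s R(s) and Fitch's set equals the set of minimizers."""
    r_set, score = fitch(tree)
    table = sankoff_unit(tree)
    best = min(table.values())
    return score == best and r_set == {s for s in STATES if table[s] == best}


rng = random.Random(0)
all_agree = all(agrees(random_tree(rng.randint(1, 8), rng)) for _ in range(200))
print(all_agree)
```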
13 Exploring the Space of Trees
We saw how to find an optimal state-assignment for a given tree topology. We need to explore the space of topologies.
Given n sequences there are (2n-3)!! possible rooted trees and (2n-5)!! possible unrooted trees:
taxa (n)   # rooted trees   # unrooted trees
3          3                1
4          15               3
5          105              15
6          945              105
8          135,135          10,395
10         34,459,425       2,027,025
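The counts in the table follow directly from the two double-factorial formulas. A small sketch, with helper names that are illustrative only:

```python
def double_factorial(m):
    """m!! = m * (m - 2) * (m - 4) * ... down to 1 (for odd m >= 1)."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def rooted_trees(n):
    """Number of rooted binary tree topologies on n labeled leaves: (2n-3)!!."""
    return double_factorial(2 * n - 3)


def unrooted_trees(n):
    """Number of unrooted binary tree topologies on n labeled leaves: (2n-5)!!."""
    return double_factorial(2 * n - 5)


for n in (3, 4, 5, 6, 8, 10):
    print(n, rooted_trees(n), unrooted_trees(n))
```

The growth is fast enough that exhaustive enumeration stops being practical after a handful of taxa, which is the motivation for the heuristics that follow.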
14 Exploring the Space of Trees
Possible solutions:
1. Heuristic solutions for “traveling” through “topology-space”.
2. Find a (basic) topology using distance-based methods (NJ).
Notice another problem:
- We obtain state-assignments to taxa using multiple alignment (MA).
- We obtain an optimal MA using the topology of the phylogenetic tree (e.g. CLUSTAL).
Solution: again, use some initial topology (via NJ).
[Figure: a gapped multiple alignment whose columns are the characters C_1, C_2, …, C_m.]