Phylogenetic Trees - Parsimony. Tutorial #11 © Ilan Gronau. Based on original slides of Ydo Wexler & Dan Geiger.


1 Phylogenetic Trees - Parsimony. Tutorial #11 © Ilan Gronau. Based on original slides of Ydo Wexler & Dan Geiger.

2 Phylogenetic Reconstruction
The input: a species-characters matrix (n species, k characters). Each character represents some observable trait and takes values from a finite set.
The output: a tree with n leaves corresponding to the input species.

3 Most Parsimonious Tree (Minimal Evolution Principle)
Parsimony score: the number of character changes (mutations) along the evolutionary tree, where the tree carries labels on its internal vertices as well.
Most parsimonious tree: the tree with minimal parsimony score.
[Figure: two labeled example trees for the sequences AGA, AAA, AAG and GGA, with the number of changes marked on each edge. One tree has score 4, the other score 3.]
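The definition of the parsimony score can be sketched in a few lines of Python. This is a minimal illustration, not the slide's exact figure: the edge list below is a hypothetical fully labeled tree over the leaves AAA, AAG, AGA and GGA, and the function names `hamming` and `parsimony_score` are made up for this example.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def parsimony_score(edges):
    """Sum of character changes over all (parent label, child label) edges."""
    return sum(hamming(parent, child) for parent, child in edges)


# Hypothetical labeled tree (an assumption, chosen for illustration):
# root AAA has two internal children, labeled AAA and AGA.
edges = [
    ("AAA", "AAA"),  # root -> internal node AAA, 0 changes
    ("AAA", "AAA"),  # internal AAA -> leaf AAA, 0 changes
    ("AAA", "AAG"),  # internal AAA -> leaf AAG, 1 change
    ("AAA", "AGA"),  # root -> internal node AGA, 1 change
    ("AGA", "AGA"),  # internal AGA -> leaf AGA, 0 changes
    ("AGA", "GGA"),  # internal AGA -> leaf GGA, 1 change
]
print(parsimony_score(edges))  # prints 3
```

Relabeling the internal vertices changes the score, which is why the score is defined for a tree together with its internal labels.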

4 Small vs. Large Parsimony
We break the problem into two:
1. Small parsimony: given the topology, find the best assignment to the internal nodes.
2. Large parsimony: find the topology that gives the best score.
Large parsimony is NP-hard. We'll show solutions to small parsimony (Fitch's and Sankoff's algorithms).
Input to small parsimony: a tree with character-state assignments to the leaves.
Example: a tree over the leaves Aardvark, Bison, Chimp, Dog and Elephant with the sequences
A: CAGGTA
B: CAGACA
C: CGGGTA
D: TGCACT
E: TGCGTA

5 Fitch's Algorithm (dynamic programming framework)
Execute independently for each character:
1. Bottom-up phase: determine the set of possible states for each internal node.
2. Top-down phase: pick a state for each internal node.
[Figure: the tree over Aardvark, Bison, Chimp, Dog and Elephant with leaf sequences CAGGTA, CAGACA, CGGGTA, TGCACT and TGCGTA. The third character (G, G, G, C, C) is set apart from the rest.]

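6 Fitch's Algorithm: Bottom-up phase
Determine the set of possible states for each internal node.
Initialization: $R_i = \{s_i\}$ for all leaves.
Do a post-order (from leaves to root) traversal of the tree. Determine $R_i$ of internal node i with children j, k:
$R_i = R_j \cap R_k$ if $R_j \cap R_k \neq \emptyset$, and $R_i = R_j \cup R_k$ otherwise.
Parsimony score = number of union operations.
[Figure: example tree with leaf states C, T, A, G, T, T and internal sets {C,T}, {G,T}, {A,G,T} and {T}; score = 3.]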
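The bottom-up phase can be sketched as a short recursion. The representation is an assumption made for this example: a leaf is a one-character string and an internal node is a pair of subtrees. The topology of `tree` is also an assumption, chosen to be consistent with the leaf states and internal sets that survive from the slide's figure.

```python
def fitch_bottom_up(tree):
    """Return (R_i, score) for the root of a binary tree of nested 2-tuples."""
    if isinstance(tree, str):            # leaf: R_i = {s_i}, no cost
        return {tree}, 0
    left, right = tree
    r_left, cost_left = fitch_bottom_up(left)
    r_right, cost_right = fitch_bottom_up(right)
    common = r_left & r_right
    if common:                           # non-empty intersection: no new mutation
        return common, cost_left + cost_right
    # empty intersection: take the union and count one mutation
    return r_left | r_right, cost_left + cost_right + 1


# Assumed topology with leaf states C, T, A, G, T, T.
tree = ((("C", "T"), ("A", ("G", "T"))), "T")
root_set, score = fitch_bottom_up(tree)
print(root_set, score)  # prints {'T'} 3
```

The three unions happen at (C,T), (G,T) and (A,{G,T}), which gives the score of 3.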

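7 Fitch's Algorithm: Top-down phase
Pick a state for each internal node.
Pick an arbitrary state in $R_{root}$ for the root.
Do a pre-order (from root to leaves) traversal of the tree. Determine $s_j$ of internal node j with parent i:
$s_j = s_i$ if $s_i \in R_j$; otherwise $s_j$ is an arbitrary state in $R_j$.
Complexity: O(mnk), where m = #states, n = #taxa/nodes, k = #characters.
[Figure: the same example tree as in the bottom-up phase; score = 3.]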
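A minimal sketch of the top-down phase follows. It recomputes the bottom-up sets first so that the block stands alone. The tuple layout `(value, left, right)`, the helper names, and the use of `min()` as the "arbitrary" choice are all assumptions of this example.

```python
def fitch_sets(tree):
    """Bottom-up phase: annotate each node as (R_i, left, right)."""
    if isinstance(tree, str):
        return ({tree}, None, None)
    left, right = fitch_sets(tree[0]), fitch_sets(tree[1])
    common = left[0] & right[0]
    return (common or (left[0] | right[0]), left, right)


def fitch_top_down(node, parent_state=None):
    """Pre-order phase: return the tree as (state, left, right)."""
    r_set, left, right = node
    if parent_state in r_set:
        state = parent_state     # reuse the parent's state: no change on this edge
    else:
        state = min(r_set)       # arbitrary pick; min() keeps it deterministic
    if left is None:
        return (state, None, None)
    return (state, fitch_top_down(left, state), fitch_top_down(right, state))


def count_changes(node):
    """Number of edges whose endpoints carry different states."""
    state, left, right = node
    if left is None:
        return 0
    return sum((child[0] != state) + count_changes(child) for child in (left, right))


tree = ((("C", "T"), ("A", ("G", "T"))), "T")   # assumed example topology
labeled = fitch_top_down(fitch_sets(tree))
print(labeled[0], count_changes(labeled))  # prints T 3
```

Counting the changes in the labeled tree gives the same value as the number of unions in the bottom-up phase.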

8 Weighted Parsimony: Sankoff's algorithm
Each mutation a↔b has its own cost S(a,b).
1. Bottom-up phase: determine $R_i(s)$, the cost of an optimal state assignment for the subtree of i when i is assigned state s.
2. Top-down phase: pick optimal states for each internal node.
Same as the algorithm for optimal lifted tree alignment (Tutorial #4).

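9 Sankoff's Algorithm: Bottom-up phase
Determine $R_i(s)$ for each internal node.
Initialization: for each leaf i, $R_i(s) = 0$ if $s = s_i$ and $R_i(s) = \infty$ otherwise.
Do a post-order (from leaves to root) traversal of the tree. Determine $R_i$ of internal node i with children j, k:
$R_i(s) = \min_{s'} [R_j(s') + S(s,s')] + \min_{s'} [R_k(s') + S(s,s')]$
Remember the pointers s → s'. The formula generalizes naturally to non-binary trees.
[Figure: example tree with leaf states C, T, A, G, T, T.]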
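The recursion can be sketched as follows. The alphabet `STATES`, the nested-tuple tree representation, and the transition/transversion cost function are assumptions for illustration; the slides do not fix a particular cost matrix S. Summing over all children also covers the non-binary case.

```python
import math

STATES = "ACGT"  # assumed alphabet


def sankoff_costs(tree, cost):
    """Return {s: R_i(s)} for the root of `tree`.

    A leaf is a one-character string; an internal node is a tuple of
    children (any number of them).
    """
    if isinstance(tree, str):
        return {s: 0 if s == tree else math.inf for s in STATES}
    child_tables = [sankoff_costs(child, cost) for child in tree]
    return {
        s: sum(min(table[t] + cost(s, t) for t in STATES) for table in child_tables)
        for s in STATES
    }


def unit_cost(a, b):
    """Unweighted parsimony: every change costs 1."""
    return int(a != b)


def transition_transversion(a, b):
    """Hypothetical S(a,b): transitions (A<->G, C<->T) cost 1, transversions 2."""
    if a == b:
        return 0
    return 1 if {a, b} in ({"A", "G"}, {"C", "T"}) else 2


tree = ((("C", "T"), ("A", ("G", "T"))), "T")   # assumed example topology
print(sankoff_costs(tree, unit_cost))
print(sankoff_costs(tree, transition_transversion))
```

With unit costs the minimum over the root table is the unweighted parsimony score of the tree.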

10 Sankoff's Algorithm: Top-down phase
Pick a state for each internal node.
Select the minimal-cost state for the root (the s minimizing $R_{root}(s)$).
Do a pre-order (from root to leaves) traversal of the tree: for internal node j with parent i, select the state that produced the minimal cost at i (use the pointers kept in the 1st stage).
Complexity: O(m^2 nk), where m = #states, n = #taxa/nodes, k = #characters.
[Figure: example tree with leaf states C, T, A, G, T, T.]
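A sketch of both phases with explicit pointers is given below. The node layout `(table, pointers, children)`, the function names, and the unit-cost function used in the example are assumptions of this illustration. Ties are broken by the order of `STATES`, which is one arbitrary but valid choice.

```python
import math

STATES = "ACGT"  # assumed alphabet


def sankoff(tree, cost):
    """Bottom-up phase. Returns (table, pointers, children) for the node."""
    if isinstance(tree, str):
        table = {s: 0 if s == tree else math.inf for s in STATES}
        return table, {}, []
    children = [sankoff(child, cost) for child in tree]
    table, pointers = {}, {}
    for s in STATES:
        # For each child, remember the child state s' achieving the minimum.
        best = [min(STATES, key=lambda t: child[0][t] + cost(s, t)) for child in children]
        pointers[s] = best
        table[s] = sum(child[0][t] + cost(s, t) for child, t in zip(children, best))
    return table, pointers, children


def traceback(node, state):
    """Top-down phase: follow the stored pointers; returns (state, [children])."""
    table, pointers, children = node
    if not children:
        return (state, [])
    return (state, [traceback(child, t) for child, t in zip(children, pointers[state])])


def unit_cost(a, b):
    return int(a != b)


tree = ((("C", "T"), ("A", ("G", "T"))), "T")   # assumed example topology
root = sankoff(tree, unit_cost)
root_state = min(STATES, key=root[0].get)       # s minimizing R_root(s)
labeled = traceback(root, root_state)
print(root_state, root[0][root_state])  # prints T 3
```

Each node does O(m^2) work per child in the bottom-up phase, which is where the m^2 factor in the complexity comes from.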

11 Fitch's Algorithm as a special case of Sankoff's algorithm
Unweighted parsimony:
Sankoff's algorithm: $R_i(s)$ is the cost of an optimal subtree of i when i is assigned state s.
Fitch's algorithm: Score(i) is the cost of an optimal state assignment for the subtree of i, and $R_i$ is the set of states assigned to i in optimal state assignments for the subtree of i.
We need to show that:
1. There is an optimal tree which assigns node i a state from $R_i$.
2. Fitch's bottom-up recursive formula for $R_i$ is correct.
(Slide annotations on these two items: "Check for yourselves" and "tricky part".)

12 Fitch's Algorithm as a special case of Sankoff's algorithm (continued)
Unweighted parsimony: Score(i) is the cost of an optimal state assignment for the subtree of i, and $R_i$ is the set of states assigned to i in optimal state assignments for the subtree of i.
We need to show that:
1. There is an optimal tree which assigns node i a state from $R_i$.
This is trivially true for the root.
Assume (to the contrary) that in an optimal assignment, some node j is assigned $s_j \notin R_j$.
Because the parsimony score is an integer, $s_j \notin R_j$ implies $R_j(s_j) \geq Score(j) + 1$.
By switching from $s_j$ to some $s \in R_j$ we therefore do not raise the parsimony score: the subtree of j becomes cheaper by at least 1, while the edge from j to its parent i costs at most 1 more.
Why is this not the case for the weighted version?
[Figure: a path from the root through node i down to its child j.]
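The claimed relationship can be checked numerically on a small example. The sketch below assumes a nested-tuple tree and the alphabet `STATES`. For every node, it compares Fitch's set and score with the table that Sankoff's recursion produces under unit costs. It is a sanity check on one tree, not a proof.

```python
import math

STATES = "ACGT"  # assumed alphabet


def fitch(tree):
    """Return (R_i, Score(i)) using Fitch's intersection/union rule."""
    if isinstance(tree, str):
        return {tree}, 0
    (r1, c1), (r2, c2) = fitch(tree[0]), fitch(tree[1])
    if r1 & r2:
        return r1 & r2, c1 + c2
    return r1 | r2, c1 + c2 + 1


def sankoff_unit(tree):
    """Return {s: R_i(s)} with S(a,b) = 1 for a != b and 0 otherwise."""
    if isinstance(tree, str):
        return {s: 0 if s == tree else math.inf for s in STATES}
    tables = [sankoff_unit(child) for child in tree]
    return {s: sum(min(tb[t] + (s != t) for t in STATES) for tb in tables) for s in STATES}


def agrees(tree):
    """True if, at this node and all descendants, Score(i) equals the minimum
    of R_i(s), R_i is exactly the set of minimizers, and every other state
    costs at least Score(i) + 1."""
    r_set, score = fitch(tree)
    table = sankoff_unit(tree)
    ok = (
        score == min(table.values())
        and r_set == {s for s in STATES if table[s] == score}
        and all(table[s] >= score + 1 for s in STATES if s not in r_set)
    )
    if isinstance(tree, str):
        return ok
    return ok and all(agrees(child) for child in tree)


tree = ((("C", "T"), ("A", ("G", "T"))), "T")   # assumed example topology
print(agrees(tree))  # prints True
```

The last condition in `agrees` is the integrality step of the argument: with non-integer weights a state outside the optimal set could cost only slightly more than the optimum.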

13 Exploring the Space of Trees
We saw how to find an optimal state assignment for a given tree topology. We need to explore the space of topologies.
Given n sequences there are (2n-3)!! possible rooted trees and (2n-5)!! possible unrooted trees:
taxa (n)    # rooted trees    # unrooted trees
3           3                 1
4           15                3
5           105               15
6           945               105
8           135,135           10,395
10          34,459,425        2,027,025
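The counts in the table follow directly from the two double-factorial formulas. A short sketch reproduces them; the function names are made up for this example.

```python
def double_factorial(m):
    """m!! = m * (m - 2) * (m - 4) * ... down to 1 or 2; equals 1 for m <= 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def rooted_trees(n):
    """Number of rooted binary tree topologies on n labeled leaves (n >= 2)."""
    return double_factorial(2 * n - 3)


def unrooted_trees(n):
    """Number of unrooted binary tree topologies on n labeled leaves (n >= 3)."""
    return double_factorial(2 * n - 5)


for n in (3, 4, 5, 6, 8, 10):
    print(n, rooted_trees(n), unrooted_trees(n))
```

Each additional taxon multiplies the number of rooted topologies by 2n-3, so exhaustive search stops being practical after a handful of taxa.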

14 Exploring the Space of Trees
Possible solutions:
1. Heuristic solutions for “traveling” through “topology-space”.
2. Find a (basic) topology using distance-based methods (NJ).
Notice another problem:
We obtain state assignments to taxa using multiple alignment (MA).
We obtain the optimal MA using the topology of the phylogenetic tree (e.g. CLUSTAL).
Solution: again, use some initial topology (via NJ).
[Figure: a gapped multiple alignment whose columns C_1, C_2, …, C_m serve as the characters.]

