Phylogenetic Trees - Parsimony Tutorial #12

Next semester: Project in advanced algorithms for phylogenetic reconstruction (236512).
Initial details in: -
Come to me for more details.

Phylogenetic Reconstruction
We’d like to study the evolutionary history of species.
- Distance-based approach:
  - Calculate (ML) pairwise (evolutionary) distances between species
  - Find the edge-weighted tree that best describes this metric
  - Major drawback: loss of information when reducing the data to pairwise distances
- Character-based approach:
  - Consider the character vector of each species: morphological characters, bio-molecular characters
  - Optimization criteria: parsimony, likelihood / posterior probability

Most Parsimonious Tree
Parsimony score: the number of character changes (mutations) along the evolutionary tree (a tree containing labels on internal vertices).
Example (figure): two trees over the leaves AGA, AAA, AAG, GGA; one has score = 4, the other has score = 3.
Most parsimonious tree: the tree with minimal parsimony score (Minimal Evolution Principle).
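To make the score concrete, here is a minimal sketch that sums character changes over the edges of a fully labeled tree. The two topologies and the internal labels are assumptions chosen so that the leaves AGA, AAA, AAG, GGA give scores of 4 and 3; the trees in the original figure may differ, and the function names are illustrative only.

```python
def hamming(a, b):
    """Count positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def parsimony_score(edges, labels):
    """Sum the number of character changes over all (parent, child) edges."""
    return sum(hamming(labels[u], labels[v]) for u, v in edges)


# Assumed topology 1: ((AGA, AAA), (AAG, GGA)), every internal node labeled AAA.
labels_a = {"root": "AAA", "x": "AAA", "y": "AAA",
            "l1": "AGA", "l2": "AAA", "l3": "AAG", "l4": "GGA"}
edges_a = [("root", "x"), ("root", "y"),
           ("x", "l1"), ("x", "l2"), ("y", "l3"), ("y", "l4")]

# Assumed topology 2: ((AGA, GGA), (AAA, AAG)), with internal node x labeled AGA.
labels_b = {"root": "AAA", "x": "AGA", "y": "AAA",
            "l1": "AGA", "l4": "GGA", "l2": "AAA", "l3": "AAG"}
edges_b = [("root", "x"), ("root", "y"),
           ("x", "l1"), ("x", "l4"), ("y", "l2"), ("y", "l3")]

score_a = parsimony_score(edges_a, labels_a)
score_b = parsimony_score(edges_b, labels_b)
print(score_a, score_b)  # 4 3
```

Grouping AGA with GGA lets one G mutation be shared, which is why the second labeling is cheaper.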

Small vs. Large Parsimony
We break the problem into two:
- Small parsimony: given the topology, find the best assignment to internal nodes
- Large parsimony: find the topology that gives the best score
Large parsimony is NP-hard. We’ll show a solution to small parsimony (Fitch’s and Sankoff’s algorithms).
Input to small parsimony: a tree with character-state assignments to the leaves.
Example:
  A (Aardvark): CAGGTA
  B (Bison):    CAGACA
  C (Chimp):    CGGGTA
  D (Dog):      TGCACT
  E (Elephant): TGCGTA

Fitch’s Algorithm
Execute independently for each character:
1. Bottom-up phase: determine the set of possible states for each internal node
2. Top-down phase: pick states for each internal node
Dynamic programming framework.
Example (figure): tree over Aardvark, Bison, Chimp, Dog, Elephant with leaf sequences CAGGTA, CGGGTA, TGCGTA, CAGACA, TGCACT.
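Because each character is handled independently, the score of a tree is the sum of per-column scores. The sketch below illustrates this on the five example sequences. The topology `(((A,C),E),(B,D))` is a hypothetical choice (the tree in the original figure is not recoverable), and `fitch_score` is an illustrative helper that counts unions in the standard Fitch bottom-up pass.

```python
def fitch_score(tree, root, leaf_state):
    """Fitch bottom-up pass for ONE character; returns the number of unions."""
    score = 0

    def visit(node):
        nonlocal score
        if node not in tree:                 # leaf: singleton state set
            return {leaf_state[node]}
        left = visit(tree[node][0])
        right = visit(tree[node][1])
        if left & right:                     # non-empty intersection: no change
            return left & right
        score += 1                           # union: one mutation is needed
        return left | right

    visit(root)
    return score


sequences = {"A": "CAGGTA", "B": "CAGACA", "C": "CGGGTA",
             "D": "TGCACT", "E": "TGCGTA"}

# Hypothetical topology (((A,C),E),(B,D)); internal node names are arbitrary.
tree = {"root": ("ACE", "BD"), "ACE": ("AC", "E"),
        "AC": ("A", "C"), "BD": ("B", "D")}

length = len(next(iter(sequences.values())))
column_scores = [
    fitch_score(tree, "root", {taxon: seq[pos] for taxon, seq in sequences.items()})
    for pos in range(length)
]
total_score = sum(column_scores)
print(column_scores, total_score)  # [2, 2, 2, 1, 1, 1] 9
```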

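Fitch’s Algorithm: Bottom-up phase
Determine the set of possible states for each internal node.
- Initialization: $R_i = \{s_i\}$ for each leaf $i$ with state $s_i$
- Do a post-order (from leaves to root) traversal of the tree
- Determine $R_i$ of internal node $i$ with children $j$, $k$:
$$R_i = \begin{cases} R_j \cap R_k & \text{if } R_j \cap R_k \neq \emptyset \\ R_j \cup R_k & \text{otherwise} \end{cases}$$
Parsimony score = number of union operations.
Example (figure): leaves C, T, G, T, A, T; internal sets CT, GT, AGT, T, T; score = 3.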
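The bottom-up rule can be sketched as follows for a single character. The tree shape is an assumption chosen to be consistent with the sets listed in the example (CT, GT, AGT, T, T); node names such as `n1` are hypothetical.

```python
def fitch_bottom_up(tree, root, leaf_state):
    """Return (state set per node, number of unions) for one character.

    tree maps each internal node to its (left, right) children;
    leaf_state maps each leaf to its observed state.
    """
    sets = {}
    unions = 0

    def visit(node):
        nonlocal unions
        if node not in tree:                      # leaf: R_i = {s_i}
            sets[node] = {leaf_state[node]}
            return
        left, right = tree[node]
        visit(left)                               # post-order: children first
        visit(right)
        common = sets[left] & sets[right]
        if common:
            sets[node] = common
        else:
            sets[node] = sets[left] | sets[right]
            unions += 1                           # each union costs one change

    visit(root)
    return sets, unions


leaf_state = {"l1": "C", "l2": "T", "l3": "G", "l4": "T", "l5": "A", "l6": "T"}
# Assumed topology: ((C,T), (((G,T),A),T))
tree = {"root": ("n1", "n4"), "n1": ("l1", "l2"), "n4": ("n3", "l6"),
        "n3": ("n2", "l5"), "n2": ("l3", "l4")}

sets, score = fitch_bottom_up(tree, "root", leaf_state)
print(sorted(sets["n3"]), sorted(sets["root"]), score)  # ['A', 'G', 'T'] ['T'] 3
```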

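Fitch’s Algorithm: Top-down phase
Pick states for each internal node.
- Pick an arbitrary state in $R_{root}$ for the root
- Do a pre-order (from root to leaves) traversal of the tree
- Determine $s_j$ of internal node $j$ with parent $i$:
$$s_j = \begin{cases} s_i & \text{if } s_i \in R_j \\ \text{an arbitrary state in } R_j & \text{otherwise} \end{cases}$$
Complexity: $O(mnk)$, where $m$ = number of characters, $n$ = number of taxa/nodes, $k$ = number of states.
Example (figure): leaves C, T, G, T, A, T; internal sets CT, GT, AGT, T; score = 3.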
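A minimal sketch of the top-down pass, assuming the state sets from the bottom-up phase are already available. The topology and node names are hypothetical, and "arbitrary" is made deterministic here by taking the alphabetically smallest state.

```python
def fitch_top_down(tree, root, sets):
    """Assign one state per node given the bottom-up state sets."""
    assignment = {root: min(sets[root])}          # arbitrary pick, made deterministic
    stack = [root]
    while stack:                                  # pre-order: parent before children
        parent = stack.pop()
        for child in tree.get(parent, ()):
            if assignment[parent] in sets[child]:
                assignment[child] = assignment[parent]   # keep the parent's state
            else:
                assignment[child] = min(sets[child])     # otherwise pick from R_j
            stack.append(child)
    return assignment


# Assumed topology ((C,T), (((G,T),A),T)) with the sets of the bottom-up pass.
tree = {"root": ("n1", "n4"), "n1": ("l1", "l2"), "n4": ("n3", "l6"),
        "n3": ("n2", "l5"), "n2": ("l3", "l4")}
sets = {"l1": {"C"}, "l2": {"T"}, "l3": {"G"}, "l4": {"T"}, "l5": {"A"}, "l6": {"T"},
        "n1": {"C", "T"}, "n2": {"G", "T"}, "n3": {"A", "G", "T"},
        "n4": {"T"}, "root": {"T"}}

assignment = fitch_top_down(tree, "root", sets)
changes = sum(assignment[p] != assignment[c] for p, kids in tree.items() for c in kids)
print(assignment["n3"], changes)  # T 3
```

Leaves are treated like any other node: their sets are singletons, so they always keep their observed state.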

Weighted Parsimony: Sankoff’s algorithm
Each mutation a↔b has its own cost, S(a,b).
- Bottom-up phase: determine $R_i(s)$, the cost of the optimal state assignment for the subtree of $i$ when $i$ is assigned state $s$
- Top-down phase: pick optimal states for each internal node
Fitch’s algorithm as a special case: $R_i$ is the set of states that yield a minimal-cost subtree of $i$.
Same as the algorithm for optimal lifted tree alignment (Tutorial #4).

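Sankoff’s Algorithm: Bottom-up phase
Determine $R_i(s)$ for each internal node.
- Initialization, for each leaf $i$ with state $s_i$:
$$R_i(s) = \begin{cases} 0 & \text{if } s = s_i \\ \infty & \text{otherwise} \end{cases}$$
- Do a post-order (from leaves to root) traversal of the tree
- Determine $R_i$ of internal node $i$ with children $j$, $k$:
$$R_i(s) = \min_{s'} \left[ R_j(s') + S(s,s') \right] + \min_{s''} \left[ R_k(s'') + S(s,s'') \right]$$
- Natural generalization for non-binary trees: one minimization term per child
- Remember the pointers $s \rightarrow s'$
Example (figure): leaves C, T, G, T, A, T.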
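The recurrence can be sketched as below for one character. The cost function (transitions cost 1, transversions cost 2) and the tree topology are hypothetical choices for illustration; they are not taken from the slides.

```python
import math

PURINES = {"A", "G"}


def ti_tv_cost(a, b):
    """Hypothetical S(a,b): 0 if equal, 1 for a transition, 2 for a transversion."""
    if a == b:
        return 0
    return 1 if (a in PURINES) == (b in PURINES) else 2


def unit_cost(a, b):
    """Unweighted parsimony: every change costs 1."""
    return 0 if a == b else 1


def sankoff_bottom_up(tree, root, leaf_state, states, cost):
    """Return R[node][s]: minimal cost of the subtree of node when it has state s."""
    R = {}

    def visit(node):
        if node not in tree:                      # leaf initialization: 0 or infinity
            R[node] = {s: (0 if s == leaf_state[node] else math.inf) for s in states}
            return
        R[node] = dict.fromkeys(states, 0)
        for child in tree[node]:                  # works for any number of children
            visit(child)
            for s in states:
                R[node][s] += min(R[child][t] + cost(s, t) for t in states)

    visit(root)
    return R


leaf_state = {"l1": "C", "l2": "T", "l3": "G", "l4": "T", "l5": "A", "l6": "T"}
# Assumed topology: ((C,T), (((G,T),A),T))
tree = {"root": ("n1", "n4"), "n1": ("l1", "l2"), "n4": ("n3", "l6"),
        "n3": ("n2", "l5"), "n2": ("l3", "l4")}

weighted = sankoff_bottom_up(tree, "root", leaf_state, "ACGT", ti_tv_cost)
unweighted = sankoff_bottom_up(tree, "root", leaf_state, "ACGT", unit_cost)
print(min(weighted["root"].values()), min(unweighted["root"].values()))  # 5 3
```

With unit costs the minimum at the root equals the Fitch score for the same tree.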

Sankoff’s Algorithm: Top-down phase
Pick states for each internal node.
- Select the minimal-cost character for the root (the $s$ minimizing $R_{root}(s)$)
- Do a pre-order (from root to leaves) traversal of the tree: for internal node $j$ with parent $i$, select the state that produced the minimal cost at $i$ (use the pointers kept in the 1st stage)
Complexity: $O(mnk^2)$, where $m$ = number of characters, $n$ = number of taxa/nodes, $k$ = number of states.
Example (figure): leaves C, T, G, T, A, T.
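A compact sketch of both phases together, storing the pointers during the bottom-up pass and following them on the way down. The cost function, topology, and all names are assumptions for illustration; ties are broken by the order of `states`.

```python
import math


def ti_tv_cost(a, b):
    """Hypothetical S(a,b): 0 if equal, 1 for a transition, 2 for a transversion."""
    if a == b:
        return 0
    return 1 if (a in "AG") == (b in "AG") else 2


def sankoff(tree, root, leaf_state, states, cost):
    """Return (state per node, total cost) for one character."""
    R, pointer = {}, {}

    def bottom_up(node):
        if node not in tree:
            R[node] = {s: (0 if s == leaf_state[node] else math.inf) for s in states}
            return
        R[node] = dict.fromkeys(states, 0)
        for child in tree[node]:
            bottom_up(child)
            for s in states:
                # Best child state t for parent state s; remember the pointer s -> t.
                best = min(states, key=lambda t: R[child][t] + cost(s, t))
                pointer[(child, s)] = best
                R[node][s] += R[child][best] + cost(s, best)

    bottom_up(root)

    # Top-down: the root takes its cheapest state, children follow the pointers.
    assignment = {root: min(states, key=lambda s: R[root][s])}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in tree.get(node, ()):
            assignment[child] = pointer[(child, assignment[node])]
            stack.append(child)
    return assignment, R[root][assignment[root]]


leaf_state = {"l1": "C", "l2": "T", "l3": "G", "l4": "T", "l5": "A", "l6": "T"}
# Assumed topology: ((C,T), (((G,T),A),T))
tree = {"root": ("n1", "n4"), "n1": ("l1", "l2"), "n4": ("n3", "l6"),
        "n3": ("n2", "l5"), "n2": ("l3", "l4")}

assignment, total = sankoff(tree, "root", leaf_state, "ACGT", ti_tv_cost)
edge_sum = sum(ti_tv_cost(assignment[p], assignment[c])
               for p, kids in tree.items() for c in kids)
print(assignment["root"], total, edge_sum)  # T 5 5
```

The final check recomputes the cost edge by edge from the chosen states, which should match the value found at the root.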

Fitch’s Algorithm as special case of Sankoff’s algorithm
Unweighted parsimony:
- Sankoff’s algorithm: $R_i(s)$, the cost of the optimal subtree of $i$ when $i$ is assigned state $s$
- Fitch’s algorithm:
  - Score(i): the cost of the optimal state assignment for the subtree of $i$
  - $R_i$: the set of states assigned to $i$ in optimal state assignments for the subtree of $i$
We need to show that:
1. An optimal tree assigns node $i$ a state from $R_i$.
2. Fitch’s bottom-up recursive formula for $R_i$ is correct: check for yourselves.

Fitch’s Algorithm as special case of Sankoff’s algorithm
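Unweighted parsimony:
- Score(i): the cost of the optimal state assignment for the subtree of $i$
- $R_i$: the set of states assigned to $i$ in optimal state assignments for the subtree of $i$
We need to show that an optimal tree assigns node $i$ a state from $R_i$.
- Trivially true for the root.
- Assume (to the contrary) that in an optimal assignment some node $j$, with parent $i$, is assigned $s_j \notin R_j$.
- $s_j \notin R_j \Rightarrow R_j(s_j) \geq \text{Score}(j) + 1$, because the parsimony score is an integer. (Why is this not the case for the weighted version?)
- $\Rightarrow$ By switching from $s_j$ to some $s \in R_j$ we do not raise the parsimony score: the subtree of $j$ becomes cheaper by at least 1, and the edge between $i$ and $j$ becomes more expensive by at most 1.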
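As a sanity check on one example (not a proof), the sketch below compares the Fitch sets with the sets of cost-minimizing states from Sankoff's recurrence under unit costs. The topology and helper names are hypothetical.

```python
import math


def fitch_sets(tree, root, leaf_state):
    """Fitch bottom-up state sets for one character."""
    sets = {}

    def visit(node):
        if node not in tree:
            sets[node] = {leaf_state[node]}
            return
        left, right = tree[node]
        visit(left)
        visit(right)
        sets[node] = (sets[left] & sets[right]) or (sets[left] | sets[right])

    visit(root)
    return sets


def sankoff_unit(tree, root, leaf_state, states):
    """Sankoff bottom-up costs R[node][s] with S(a,b) = 1 for every a != b."""
    R = {}

    def visit(node):
        if node not in tree:
            R[node] = {s: (0 if s == leaf_state[node] else math.inf) for s in states}
            return
        R[node] = dict.fromkeys(states, 0)
        for child in tree[node]:
            visit(child)
            for s in states:
                R[node][s] += min(R[child][t] + (s != t) for t in states)

    visit(root)
    return R


def minimizers(costs):
    """States whose cost equals the minimum, i.e. Score(i) for this node."""
    best = min(costs.values())
    return {s for s, c in costs.items() if c == best}


leaf_state = {"l1": "C", "l2": "T", "l3": "G", "l4": "T", "l5": "A", "l6": "T"}
# Assumed topology: ((C,T), (((G,T),A),T))
tree = {"root": ("n1", "n4"), "n1": ("l1", "l2"), "n4": ("n3", "l6"),
        "n3": ("n2", "l5"), "n2": ("l3", "l4")}

F = fitch_sets(tree, "root", leaf_state)
R = sankoff_unit(tree, "root", leaf_state, "ACGT")
agree = all(F[node] == minimizers(R[node]) for node in F)
print(agree)  # True
```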

Exploring the Space of Trees
We saw how to find the optimal state assignment for a given tree topology. We need to explore the space of topologies.
Given n sequences there are (2n-3)!! possible rooted trees and (2n-5)!! possible unrooted trees:
  taxa (n)   # rooted trees   # unrooted trees
  8          135,135          10,395
  10         34,459,425       2,027,025
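The growth of the search space is easy to check numerically. This is a small sketch of the two counting formulas; the function names are illustrative.

```python
def double_factorial(k):
    """k!! = k * (k - 2) * (k - 4) * ... down to 1 or 2; equals 1 for k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def num_rooted_trees(n):
    """Number of rooted binary tree topologies on n >= 2 labeled leaves: (2n-3)!!"""
    return double_factorial(2 * n - 3)


def num_unrooted_trees(n):
    """Number of unrooted binary tree topologies on n >= 3 labeled leaves: (2n-5)!!"""
    return double_factorial(2 * n - 5)


for n in (4, 6, 8, 10):
    print(n, num_rooted_trees(n), num_unrooted_trees(n))
```

Already at n = 10 there are more than 34 million rooted topologies, which is why exhaustive search is impractical for realistic numbers of taxa.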

Exploring the Space of Trees
Possible solutions:
- Heuristic solutions for “traveling” through “topology space”
- Find a (basic) topology using distance-based methods (NJ)
Notice another problem:
- We obtain state assignments to taxa using multiple alignment (MA)
- We obtain an optimal MA using the topology of the phylogenetic tree (e.g. CLUSTAL)
Solution: again, use some initial topology (via NJ).
Figure: multiple alignment over the symbols A, -, T, G, C, with characters C1, C2, …, Cm.
