Phylogenetic Trees - Parsimony Tutorial #12


Next semester: project in advanced algorithms for phylogenetic reconstruction (236512).
Initial details at: http://www.cs.technion.ac.il/~moran/lab06.htm
Come to me for more details.

Phylogenetic Reconstruction
We’d like to study the evolutionary history of species.
- Distance-based approach:
  - Calculate (ML) pairwise (evolutionary) distances between species
  - Find the edge-weighted tree that best describes this metric
  - Major drawback: loss of information when reducing the data to pairwise distances
- Character-based approach:
  - Consider the character vector of each species: morphological characters, bio-molecular characters
  - Optimization criteria: parsimony, likelihood / posterior probability

Most Parsimonious Tree
- Parsimony score: number of character changes (mutations) along the evolutionary tree (a tree with labels on its internal vertices)
- Most parsimonious tree: the tree with minimal parsimony score (Minimal Evolution Principle)
- Example: [Figure: two labeled trees over the leaves AGA, AAA, AAG, GGA, one with score = 4 and one with score = 3]
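As an illustration of the parsimony score, here is a minimal Python sketch that sums the character changes along every edge of a fully labeled tree. The topology, the node names and both internal labelings are assumptions chosen to reproduce the scores 4 and 3 for the leaves AGA, AAA, AAG, GGA; they are not necessarily the trees drawn on the slide.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def parsimony_score(edges, labels):
    """Total number of character changes over all edges of a labeled tree."""
    return sum(hamming(labels[u], labels[v]) for u, v in edges)


# Hypothetical topology: root -> (u, w), u -> (L1, L2), w -> (L3, L4).
edges = [("root", "u"), ("root", "w"),
         ("u", "L1"), ("u", "L2"),
         ("w", "L3"), ("w", "L4")]
leaves = {"L1": "AAA", "L2": "AAG", "L3": "AGA", "L4": "GGA"}

# Two assumed labelings of the internal vertices.
labels_a = dict(leaves, root="AAA", u="AAA", w="AAA")
labels_b = dict(leaves, root="AAA", u="AAA", w="AGA")

print(parsimony_score(edges, labels_a))  # 4
print(parsimony_score(edges, labels_b))  # 3
```

The only difference between the two labelings is the label of w: choosing AGA there explains the shared G of AGA and GGA with a single mutation.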

Small vs. Large Parsimony
We break the problem into two:
- Small parsimony: given the topology, find the best assignment to the internal nodes
- Large parsimony: find the topology that gives the best score
Large parsimony is NP-hard. We’ll show a solution to small parsimony (Fitch’s and Sankoff’s algorithms).
Input to small parsimony: a tree with character-state assignments to its leaves.
Example:
  A (Aardvark): CAGGTA
  B (Bison):    CAGACA
  C (Chimp):    CGGGTA
  D (Dog):      TGCACT
  E (Elephant): TGCGTA

Fitch’s Algorithm
Execute independently for each character (dynamic programming framework):
1. Bottom-up phase: determine the set of possible states for each internal node
2. Top-down phase: pick a state for each internal node
[Figure: example tree with leaves Aardvark (CAGGTA), Bison (CAGACA), Chimp (CGGGTA), Dog (TGCACT), Elephant (TGCGTA)]

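Fitch’s Algorithm: Bottom-up phase
Determine the set of possible states for each internal node:
- Initialization: Ri = {si} for each leaf i
- Do a post-order (from leaves to root) traversal of the tree
- Determine Ri of internal node i with children j, k:
    Ri = Rj ∩ Rk   if Rj ∩ Rk ≠ ∅
    Ri = Rj ∪ Rk   otherwise
- Parsimony score = number of union operations
[Figure: example tree with leaf states C, T, G, T, A, T; internal sets CT, GT, AGT, T, T; score = 3]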
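The bottom-up phase can be sketched in a few lines of Python for a single character. The nested-tuple tree encoding and the function name are assumptions made for this illustration, and the topology below is only one hypothetical tree that is consistent with the leaf states and internal sets listed for the figure.

```python
def fitch_bottom_up(tree):
    """Return (state_set, score) for one character.

    A leaf is a one-letter string; an internal node is a pair (left, right).
    """
    if isinstance(tree, str):          # leaf i: Ri = {si}, no changes yet
        return {tree}, 0
    left_set, left_score = fitch_bottom_up(tree[0])
    right_set, right_score = fitch_bottom_up(tree[1])
    common = left_set & right_set
    if common:                         # non-empty intersection: no extra cost
        return common, left_score + right_score
    # empty intersection: take the union and count one change
    return left_set | right_set, left_score + right_score + 1


# Hypothetical topology over the leaf states C, T, G, T, A, T.
tree = ((("C", "T"), (("G", "T"), "A")), "T")
root_set, score = fitch_bottom_up(tree)
print(sorted(root_set), score)  # ['T'] 3
```

On this tree the unions occur at (C, T), (G, T) and ((G, T), A), which gives the score of 3; the two remaining internal nodes are intersections that yield {T}.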

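Fitch’s Algorithm: Top-down phase
Pick a state for each internal node:
- Pick an arbitrary state in Rroot for the root
- Do a pre-order (from root to leaves) traversal of the tree
- Determine sj of internal node j with parent i:
    sj = si                       if si ∈ Rj
    sj = an arbitrary state in Rj   otherwise
Complexity: O(mnk), where m = #characters, n = #taxa/nodes, k = #states
[Figure: the example tree (leaf states C, T, G, T, A, T; internal sets CT, GT, AGT, T, T) with the chosen states; score = 3]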
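A minimal Python sketch of both phases for one character follows. The tree encoding, the function names and the example topology are assumptions for illustration; ties ("arbitrary state") are broken with min() only to make the output deterministic.

```python
def bottom_up(tree):
    """Annotate a nested-tuple tree: return (state_set, children)."""
    if isinstance(tree, str):
        return ({tree}, ())
    left, right = bottom_up(tree[0]), bottom_up(tree[1])
    common = left[0] & right[0]
    # empty intersection is falsy, so fall back to the union
    return (common or (left[0] | right[0]), (left, right))


def top_down(node, parent_state=None):
    """Pick states in pre-order; return a leaf string or (state, left, right)."""
    states, children = node
    # keep the parent's state when allowed, otherwise pick any state in the set
    state = parent_state if parent_state in states else min(states)
    if not children:
        return state
    return (state, top_down(children[0], state), top_down(children[1], state))


def changes(labeled):
    """Count edges whose two endpoints carry different states."""
    if isinstance(labeled, str):
        return 0
    state, left, right = labeled
    total = 0
    for child in (left, right):
        child_state = child if isinstance(child, str) else child[0]
        total += (child_state != state) + changes(child)
    return total


# Hypothetical topology over the leaf states C, T, G, T, A, T.
tree = ((("C", "T"), (("G", "T"), "A")), "T")
labeled = top_down(bottom_up(tree))
print(labeled[0], changes(labeled))  # T 3
```

Counting the changes in the resulting labeling gives the same value as the number of union operations in the bottom-up phase.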

Weighted Parsimony: Sankoff’s algorithm
Each mutation a↔b has its own cost S(a,b).
- Bottom-up phase: determine Ri(s), the cost of the optimal state-assignment for the subtree of i when i is assigned state s
- Top-down phase: pick optimal states for each internal node
Fitch’s algorithm is a special case: Ri is the set of states that yield a minimal-cost subtree of i.
Same as the algorithm for optimal lifted tree alignment (Tutorial #4).

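Sankoff’s Algorithm: Bottom-up phase
Determine Ri(s) for each internal node:
- Initialization (leaf i with state si): Ri(s) = 0 if s = si, and Ri(s) = ∞ otherwise
- Do a post-order (from leaves to root) traversal of the tree
- Determine Ri of internal node i with children j, k:
    Ri(s) = min over s’ of [Rj(s’) + S(s,s’)] + min over s’’ of [Rk(s’’) + S(s,s’’)]
- Natural generalization for non-binary trees: add one such minimum per child
- Remember the pointers s → s’ (the child states that attain each minimum)
[Figure: example tree with leaf states C, T, G, T, A, T]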
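The recurrence can be sketched in Python as follows. The state alphabet "ACGT", the nested-tuple tree encoding, the function names and the example topology are assumptions for illustration; the cost function stands for S(a,b), and the unit cost used here is the unweighted special case.

```python
import math

STATES = "ACGT"  # assumed alphabet for this sketch


def sankoff_bottom_up(tree, cost):
    """Return a dict mapping each state s to Ri(s) for the root of `tree`.

    A leaf is a one-letter string; an internal node is a tuple of children
    (two or more, so non-binary trees are handled as well).
    """
    if isinstance(tree, str):
        return {s: (0 if s == tree else math.inf) for s in STATES}
    child_tables = [sankoff_bottom_up(child, cost) for child in tree]
    return {
        s: sum(min(table[t] + cost(s, t) for t in STATES)
               for table in child_tables)
        for s in STATES
    }


def unit_cost(a, b):
    """Unweighted parsimony: every substitution costs 1."""
    return 0 if a == b else 1


# Hypothetical topology over the leaf states C, T, G, T, A, T.
tree = ((("C", "T"), (("G", "T"), "A")), "T")
table = sankoff_bottom_up(tree, unit_cost)
print(table)  # {'A': 5, 'C': 5, 'G': 5, 'T': 3}
```

With unit costs the minimum of the root table equals the Fitch score, and the states attaining it form the Fitch set. Each internal node evaluates k states against k child states per child, which is where the k² factor in the complexity comes from.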

Sankoff’s Algorithm: Top-down phase
Pick a state for each internal node:
- Select the minimal-cost character for the root (the s minimizing Rroot(s))
- Do a pre-order (from root to leaves) traversal of the tree: for internal node j with parent i, select the state that produced the minimal cost at i (use the pointers kept in the 1st stage)
Complexity: O(mnk²), where m = #characters, n = #taxa/nodes, k = #states
[Figure: example tree with leaf states C, T, G, T, A, T]
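The pointer-based traceback might look like the sketch below. It repeats a bottom-up pass that also stores, for each state s of a node, the child states attaining the minima. The alphabet, encoding, names and example topology are again assumptions for illustration, and ties are broken by alphabet order.

```python
import math

STATES = "ACGT"  # assumed alphabet for this sketch


def bottom_up(tree, cost):
    """Return (table, pointers, children) for one character."""
    if isinstance(tree, str):
        table = {s: (0 if s == tree else math.inf) for s in STATES}
        return (table, {}, ())
    children = tuple(bottom_up(child, cost) for child in tree)
    table, pointers = {}, {}
    for s in STATES:
        # for every child, remember the child state that attains the minimum
        best = [min(STATES, key=lambda t, c=child: c[0][t] + cost(s, t))
                for child in children]
        pointers[s] = best
        table[s] = sum(child[0][t] + cost(s, t)
                       for child, t in zip(children, best))
    return (table, pointers, children)


def top_down(node, state=None):
    """Follow the stored pointers; return a leaf string or (state, *children)."""
    table, pointers, children = node
    if state is None:                  # root: pick the minimal-cost state
        state = min(STATES, key=table.get)
    if not children:
        return state
    return (state,) + tuple(top_down(child, t)
                            for child, t in zip(children, pointers[state]))


def unit_cost(a, b):
    return 0 if a == b else 1


# Hypothetical topology over the leaf states C, T, G, T, A, T.
tree = ((("C", "T"), (("G", "T"), "A")), "T")
root = bottom_up(tree, unit_cost)
print(root[0]["T"])    # 3
print(top_down(root))
```

Because the pointers are stored during the first pass, the second pass does no further minimization below the root.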

Fitch’s Algorithm as a special case of Sankoff’s algorithm
Unweighted parsimony: S(a,b) = 0 if a = b, and S(a,b) = 1 otherwise
- Sankoff’s algorithm: Ri(s) = cost of the optimal subtree of i when i is assigned state s
- Fitch’s algorithm:
  - Score(i) = cost of the optimal state-assignment for the subtree of i, i.e. the minimum of Ri(s) over s
  - Ri = set of states of i in optimal state-assignments for the subtree of i, i.e. {s : Ri(s) = Score(i)}
We need to show that:
1. An optimal tree assigns node i a state from Ri.
2. Fitch’s bottom-up recursive formula for Ri is correct (check for yourselves).

Fitch’s Algorithm as a special case of Sankoff’s algorithm
Unweighted parsimony:
- Score(i) = cost of the optimal state-assignment for the subtree of i
- Ri = set of states of i in optimal state-assignments for the subtree of i
We need to show that an optimal tree assigns node i a state from Ri:
- Trivially true for the root
- Assume (to the contrary) that in an optimal assignment some node j, with parent i, is assigned sj ∉ Rj
- sj ∉ Rj ⇒ Rj(sj) ≥ Score(j) + 1, because the parsimony score is an integer (why is this not the case for the weighted version?)
- ⇒ By switching from sj to some s ∊ Rj, with an optimal assignment below j, we do not raise the parsimony score: the subtree of j becomes cheaper by at least 1, while the edge (i, j) costs at most 1 more
[Figure: path from the root through node i to its child j]

Exploring the Space of Trees
- We saw how to find the optimal state-assignment for a given tree topology
- We need to explore the space of topologies
- Given n sequences there are (2n-3)!! possible rooted trees and (2n-5)!! possible unrooted trees

  taxa (n)   # rooted trees   # unrooted trees
  3          3                1
  4          15               3
  5          105              15
  6          945              105
  8          135,135          10,395
  10         34,459,425       2,027,025
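The counts in the table can be reproduced with a short Python sketch of the double factorial; the function names are chosen for this illustration.

```python
def double_factorial(n):
    """n!! = n * (n - 2) * (n - 4) * ... down to 1 (for odd n)."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def rooted_trees(n):
    """Number of rooted binary trees on n labeled taxa: (2n-3)!!"""
    return double_factorial(2 * n - 3)


def unrooted_trees(n):
    """Number of unrooted binary trees on n labeled taxa: (2n-5)!!"""
    return double_factorial(2 * n - 5)


for n in (3, 4, 5, 6, 8, 10):
    print(n, rooted_trees(n), unrooted_trees(n))
```

The number of rooted trees on n taxa equals the number of unrooted trees on n + 1 taxa, which is why each column of the table is a shifted copy of the other.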

Exploring the Space of Trees
Possible solutions:
- Heuristic solutions for “traveling” through “topology-space”
- Find a (basic) topology using distance-based methods (NJ)
Notice another problem:
- We obtain the state-assignments to taxa using multiple alignment
- We obtain the optimal MA using the topology of the phylogenetic tree (e.g. CLUSTAL)
- Solution: again, use some initial topology (via NJ)
[Figure: multiple alignment with columns C1, C2, …, Cm]