Phylogeny reconstruction BNFO 602 Roshan. Simulation studies.


Phylogeny reconstruction BNFO 602 Roshan

Simulation studies

Software
–Random trees: r8s
–Sequence evolution: seqgen
–Tree comparison: recidcm3 software

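Maximum Parsimony
–Character-based method
–NP-hard (equivalent to a Steiner tree problem in sequence space under Hamming distance)
–Widely used in phylogenetics
–Slower than neighbor joining (NJ) but more accurate
–Faster than maximum likelihood (ML)
–Assumes sites evolve i.i.d. (independently and identically distributed)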

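Maximum Parsimony
Input: Set S of n aligned sequences of length k
Output: A phylogenetic tree T
–leaf-labeled by sequences in S
–additional sequences of length k labeling the internal nodes of T
such that $\sum_{(u,v) \in E(T)} H(u, v)$ is minimized, where $H(u, v)$ is the Hamming distance between the sequences labeling the endpoints of edge $(u, v)$.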
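The objective above can be checked by hand on a small labeled tree. The following is a minimal sketch, assuming the tree is given as an edge list plus a dictionary of node labels; the node names `a`, `b`, `c`, `d`, `u`, `v` and the function names are hypothetical, chosen only for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def tree_length(edges, label):
    """Sum of Hamming distances over all edges of a fully labeled tree."""
    return sum(hamming(label[u], label[v]) for u, v in edges)


# Four leaves a, b, c, d and two internal nodes u, v (hypothetical names).
# Leaves a and b hang off u; leaves c and d hang off v.
edges = [("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"), ("d", "v")]
label = {"a": "ACT", "b": "ACA", "c": "GTT", "d": "GTA",
         "u": "ACA", "v": "GTA"}

length = tree_length(edges, label)
print(length)  # 1 + 0 + 2 + 1 + 0 = 4
```

Maximum parsimony asks for the tree and internal labels that make this sum as small as possible.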

Maximum parsimony (example)
Input: Four sequences
–ACT
–ACA
–GTT
–GTA
Question: which of the three trees has the best MP score?

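Maximum Parsimony: the three possible trees on the four sequences (each written as the two sibling pairs separated by the internal edge)
–Tree 1: (ACT, GTT) | (ACA, GTA)
–Tree 2: (ACT, GTA) | (ACA, GTT)
–Tree 3: (ACT, ACA) | (GTT, GTA)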

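Maximum Parsimony: scores of the three trees
–Tree (ACT, GTT) | (ACA, GTA): MP score = 5
–Tree (ACT, GTA) | (ACA, GTT), internal nodes labeled ACA and ACT: score = 7 for this labeling (the best labeling of this topology gives 6)
–Tree (ACT, ACA) | (GTT, GTA), internal nodes labeled ACA and GTA: MP score = 4 (optimal MP tree)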

Maximum Parsimony: computational complexity
–Example: the tree (ACT, ACA) | (GTT, GTA) with internal nodes labeled ACA and GTA has MP score = 4
–Finding the optimal MP tree is NP-hard
–For a fixed tree, an optimal labeling can be computed in linear time, O(nk)
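The linear-time labeling step for a fixed tree is usually done with Fitch's algorithm, which the slides do not name. The following is a minimal sketch under these assumptions: the tree is rooted on its internal edge, it is written as nested 2-tuples, and the leaves are the sequences themselves. The function names are hypothetical.

```python
def leaves(node):
    """Yield the leaf sequences below a node of a nested-tuple tree."""
    if isinstance(node, str):
        yield node
    else:
        for child in node:
            yield from leaves(child)


def site_cost(node, site):
    """Return (Fitch state set, number of changes) for one site below node."""
    if isinstance(node, str):              # leaf: the sequence itself
        return {node[site]}, 0
    left, left_cost = site_cost(node[0], site)
    right, right_cost = site_cost(node[1], site)
    common = left & right
    if common:                             # children agree: no extra change
        return common, left_cost + right_cost
    return left | right, left_cost + right_cost + 1


def fitch_score(tree):
    """Minimum number of changes over all internal labelings of a fixed tree."""
    k = len(next(leaves(tree)))
    return sum(site_cost(tree, site)[1] for site in range(k))


# The three possible trees on the four example sequences.
tree1 = (("ACT", "GTT"), ("ACA", "GTA"))
tree2 = (("ACT", "GTA"), ("ACA", "GTT"))
tree3 = (("ACT", "ACA"), ("GTT", "GTA"))

scores = [fitch_score(t) for t in (tree1, tree2, tree3)]
print(scores)  # [5, 6, 4]
```

Each site is handled in one pass over the tree, which gives the O(nk) bound for a fixed alphabet. For the second topology the minimum is 6; a score of 7 arises when the internal nodes carry the non-optimal labels ACA and ACT.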

Local search strategies
[Figure: cost plotted over the space of phylogenetic trees, marking a local optimum and the global optimum]

Local search for MP
Determine a candidate solution s
While s is not a local minimum
–Find a neighbor s' of s such that MP(s') < MP(s)
–If found, set s = s'
–Else return s and exit
Time complexity: unknown; the search could take very long or end quickly, depending on the starting tree and the local move
Need to specify how to construct the starting tree and the local move
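The loop above can be sketched generically. This is an illustration, not the course's implementation: `score` and `neighbors` are hypothetical callables, and a small one-dimensional cost list stands in for tree space. For MP, `score` would be the parsimony score and `neighbors` a local move such as NNI.

```python
def local_search(start, score, neighbors):
    """First-improvement local search: move to any strictly better neighbor
    until none exists, then return the local minimum reached."""
    s = start
    while True:
        current = score(s)
        better = next((t for t in neighbors(s) if score(t) < current), None)
        if better is None:
            return s                        # s is a local minimum
        s = better


# Toy stand-in for tree space (an assumption for illustration only):
# positions 0..7 with these costs; neighbors are adjacent positions.
COST = [5, 3, 4, 6, 2, 5, 7, 1]


def toy_score(i):
    return COST[i]


def toy_neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(COST)]


result = local_search(0, toy_score, toy_neighbors)
print(result, toy_score(result))  # 1 3
```

Starting at position 0 the search stops at position 1 with cost 3, although position 7 has cost 1. The outcome depends on the starting point and on the move set.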

Starting tree for MP
–Random phylogeny: O(n) time
–Greedy-MP

Greedy-MP takes O(n^2k^2) time
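One way to build a random starting phylogeny in O(n) time is random stepwise addition: start from a three-leaf star and attach each remaining taxon to a randomly chosen edge. This is a sketch of that idea, assuming an edge-list representation with integer internal nodes. The slides do not say which construction is used, and the function name is hypothetical.

```python
import random
from collections import Counter


def random_unrooted_tree(taxa, rng=None):
    """Return the edge list of a random unrooted binary tree on taxa.

    Leaves are the taxon names; internal nodes are integers 0, 1, 2, ...
    Each insertion takes constant time, so the whole build is O(n).
    """
    if len(taxa) < 3:
        raise ValueError("need at least three taxa")
    rng = rng or random.Random()
    edges = [(taxa[0], 0), (taxa[1], 0), (taxa[2], 0)]   # three-leaf star
    next_internal = 1
    for leaf in taxa[3:]:
        i = rng.randrange(len(edges))      # pick an edge uniformly at random
        u, v = edges[i]
        w = next_internal                  # new internal node splitting (u, v)
        next_internal += 1
        edges[i] = (u, w)
        edges.append((w, v))
        edges.append((w, leaf))
    return edges


taxa = ["t1", "t2", "t3", "t4", "t5", "t6"]
edges = random_unrooted_tree(taxa, random.Random(0))
degree = Counter(node for edge in edges for node in edge)
print(len(edges))  # 2n - 3 = 9
```

The result always has 2n-3 edges, leaves of degree 1, and internal nodes of degree 3.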

Local moves for MP: NNI (nearest neighbor interchange)
–For each internal edge we get two different topologies
–Neighborhood size is 2n-6
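The neighborhood size can be checked on a small tree. This is a sketch under the assumption that an unrooted binary tree is stored as an adjacency dictionary with string node names. All function names are hypothetical, and topologies are compared by their sets of nontrivial splits.

```python
def adjacency(edges):
    """Build {node: set of neighbors} from an edge list."""
    adj = {}
    for x, y in edges:
        adj.setdefault(x, set()).add(y)
        adj.setdefault(y, set()).add(x)
    return adj


def nni_neighbors(adj):
    """Return every tree one NNI move away from an unrooted binary tree."""
    result = []
    for u in adj:
        for v in adj[u]:
            # Internal edges join two degree-3 nodes; u < v visits each once.
            if len(adj[u]) == 3 and len(adj[v]) == 3 and u < v:
                b = max(adj[u] - {v})              # one subtree hanging off u
                for c in sorted(adj[v] - {u}):     # each subtree hanging off v
                    new = {x: set(ns) for x, ns in adj.items()}
                    new[u].remove(b)               # swap subtrees b and c
                    new[u].add(c)
                    new[v].remove(c)
                    new[v].add(b)
                    new[b].remove(u)
                    new[b].add(v)
                    new[c].remove(v)
                    new[c].add(u)
                    result.append(new)
    return result


def leaf_side(adj, u, v):
    """Leaves reachable from v without crossing the edge (u, v)."""
    stack, seen, found = [v], {u, v}, set()
    while stack:
        x = stack.pop()
        if len(adj[x]) == 1:
            found.add(x)
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return frozenset(found)


def topology(adj):
    """Set of nontrivial splits, each given by the side without the first leaf."""
    all_leaves = frozenset(x for x in adj if len(adj[x]) == 1)
    ref = min(all_leaves)
    splits = set()
    for u in adj:
        for v in adj[u]:
            if len(adj[u]) == 3 and len(adj[v]) == 3:
                side = leaf_side(adj, u, v)
                splits.add(side if ref not in side else all_leaves - side)
    return frozenset(splits)


# Five leaves A..E, internal nodes x, y, z: ((A,B),C,(D,E)).
tree = adjacency([("A", "x"), ("B", "x"), ("x", "y"), ("C", "y"),
                  ("y", "z"), ("D", "z"), ("E", "z")])
neighbors = nni_neighbors(tree)
print(len(neighbors))  # 2n - 6 = 4 for n = 5
```

A tree on n leaves has n-3 internal edges and each contributes two rearrangements, which gives the 2n-6 count.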

Local moves for MP: SPR (subtree pruning and regrafting)
–Neighborhood size is quadratic in the number of taxa
–Computing the minimum number of SPR moves between two rooted phylogenies is NP-hard

Local moves for MP: TBR (tree bisection and reconnection)
–Neighborhood size is cubic in the number of taxa
–Computing the minimum number of TBR moves between two unrooted phylogenies is NP-hard

Local optima are a problem

Iterated local search: escape local optima by perturbation
[Figure: local search reaches a local optimum; a perturbation moves it to a new point (the output of perturbation), and local search restarts from there]
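The perturb-then-search cycle can be sketched as follows. This is an illustration under stated assumptions rather than the method used in the course: the toy cost list, the random-jump perturbation, and the rule of keeping a new local optimum only when it is strictly better are all choices made for this example.

```python
import random


def local_search(start, score, neighbors):
    """Move to a strictly better neighbor until none exists."""
    s = start
    while True:
        current = score(s)
        better = next((t for t in neighbors(s) if score(t) < current), None)
        if better is None:
            return s
        s = better


def iterated_local_search(start, score, neighbors, perturb, rounds, rng):
    """Alternate perturbation and local search, keeping the best optimum."""
    best = local_search(start, score, neighbors)
    for _ in range(rounds):
        candidate = local_search(perturb(best, rng), score, neighbors)
        if score(candidate) < score(best):   # accept only improvements
            best = candidate
    return best


# Toy stand-in for tree space (an assumption for illustration only).
COST = [5, 3, 4, 6, 2, 5, 7, 1, 6, 0, 4]


def toy_score(i):
    return COST[i]


def toy_neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(COST)]


def toy_perturb(i, rng):
    """Jump up to three positions left or right, staying inside the range."""
    return min(max(i + rng.randint(-3, 3), 0), len(COST) - 1)


plain = local_search(0, toy_score, toy_neighbors)
ils = iterated_local_search(0, toy_score, toy_neighbors, toy_perturb,
                            rounds=50, rng=random.Random(1))
print(toy_score(plain), toy_score(ils))
```

Plain local search from position 0 stops at cost 3. The iterated version can never end with a worse cost, and its result is still a local minimum.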
