

BALANCED MINIMUM EVOLUTION

DISTANCE BASED PHYLOGENETIC RECONSTRUCTION
1. Compute distance matrix D.
2. Find binary tree using just D.
Balanced Minimum Evolution (BME) is a distance-based method to go from a distance matrix to a phylogenetic tree.

MINIMUM EVOLUTION PHYLOGENETIC RECONSTRUCTION
Fixed distance matrix.
Tree topology being considered.
Assign branch lengths using ME.
Sum up branch lengths (e.g., 36).
Goal: Find the tree topology T with the smallest sum of branch lengths (assigned by ME). That is, find the smallest sum of branch lengths over all (2n-5)!! unrooted binary tree topologies on n taxa!
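To get a feel for how quickly the (2n-5)!! search space grows, the count can be computed directly. This is a minimal sketch; the function name is chosen here for illustration and does not come from the slides.

```python
def num_unrooted_binary_topologies(n: int) -> int:
    """Return (2n-5)!!, the number of unrooted binary topologies on n labeled taxa."""
    if n < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd factors 3, 5, ..., 2n-5 (empty product for n = 3).
    for k in range(3, 2 * n - 4, 2):
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, num_unrooted_binary_topologies(n))
```

Already at 10 taxa there are over two million topologies, which is why exhaustive search is impractical and heuristics are used instead.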

MINIMUM EVOLUTION PHYLOGENETIC RECONSTRUCTION
Given the matrix of pairwise evolutionary distances, the ME approach estimates the length of any given tree topology and then selects the tree topology with the shortest length.
Minimum evolution is conceptually close to character-based parsimony.
It complies with Occam’s principle of scientific inference, which essentially maintains that simpler explanations are preferable to more complicated ones and that ad hoc explanations should be avoided.
Numerous variants of the ME principle exist, depending on how the branch lengths are estimated and how the tree length is calculated from these branch lengths.

MINIMUM EVOLUTION PHYLOGENETIC RECONSTRUCTION
Fixed distance matrix.
Tree topology being considered.
Assign branch lengths using ME.
Sum up branch lengths (e.g., 36).
How do we assign branch lengths to a tree topology?

LEAST SQUARES ESTIMATE (HOW TO ASSIGN BRANCH LENGTHS TO A TREE TOPOLOGY)
Least squares: Observe red data points. Find the blue quadratic that minimizes the sum of the squared distances from the red points to the blue quadratic.
ME analogy for least squares on trees:
Red dots: Estimated distances (D)
Blue quadratic: Binary tree
Residual/Error: Sum of branch lengths

MINIMUM EVOLUTION PHYLOGENETIC RECONSTRUCTION
Fixed distance matrix.
Tree topology being considered.
Assign branch lengths using least squares.
Sum up branch lengths (e.g., 36).
Goal: Find the tree topology T with the smallest sum of branch lengths (assigned by ME). That is, find the smallest sum of branch lengths over all (2n-5)!! unrooted binary tree topologies on n taxa!

LEAST SQUARES ASSIGNMENT OF BRANCH LENGTHS
If distance estimates are independent with the same variance, use ordinary least squares (OLS).
If distance estimates are independent with different variances, use weighted least squares (WLS). (This is BME!)
It is well known that distance estimates obtained from sequences do not have the same variance, because the largest distances are much more variable than the shortest ones (Fitch and Margoliash, 1967), and that they are mutually dependent when they share a common history (or path) in the true phylogeny (Nei and Jin, 1989). Thus ordinary least squares fits the features of evolutionary distance data poorly.
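As an illustration of the OLS case, the sketch below assigns branch lengths to a fixed four-taxon topology ((A,B),(C,D)) by solving the normal equations, then sums them to get the tree length. The helper names, the edge labels and the example distances are assumptions made for this illustration; the distances were generated from a tree with pendant edges a=1, b=2, c=3, d=4 and internal edge e=5, so OLS should recover those values.

```python
def solve_linear(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def ols_branch_lengths(paths, dist):
    """paths maps a taxon pair to the set of edges on its path; dist maps the pair to its distance."""
    edges = sorted({e for p in paths.values() for e in p})
    pairs = sorted(paths)
    # Design matrix: X[k][i] = 1 if edge i lies on the path of pair k.
    X = [[1.0 if e in paths[p] else 0.0 for e in edges] for p in pairs]
    y = [dist[p] for p in pairs]
    m, k = len(pairs), len(edges)
    XtX = [[sum(X[r][i] * X[r][j] for r in range(m)) for j in range(k)] for i in range(k)]
    Xty = [sum(X[r][i] * y[r] for r in range(m)) for i in range(k)]
    return dict(zip(edges, solve_linear(XtX, Xty)))


# Topology ((A,B),(C,D)): pendant edges a, b, c, d and internal edge e.
paths = {
    ("A", "B"): {"a", "b"},
    ("A", "C"): {"a", "e", "c"},
    ("A", "D"): {"a", "e", "d"},
    ("B", "C"): {"b", "e", "c"},
    ("B", "D"): {"b", "e", "d"},
    ("C", "D"): {"c", "d"},
}
dist = {("A", "B"): 3, ("A", "C"): 9, ("A", "D"): 10,
        ("B", "C"): 10, ("B", "D"): 11, ("C", "D"): 7}

lengths = ols_branch_lengths(paths, dist)
print(lengths, "tree length:", sum(lengths.values()))
```

A weighted variant would scale each row of the design matrix and each distance by the square root of that pair's weight before forming the normal equations.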

BALANCED MINIMUM EVOLUTION
In BME, sibling subtrees have equal weight, as opposed to standard unweighted OLS, where all taxa have the same weight and thus the weight of a subtree is equal to the number of its taxa.
BME is statistically consistent.
BME is NP-hard [W. Day (87)].
BME outperforms Neighbor Joining, BIONJ, WEIGHBOR and FITCH [Desper, Gascuel 2002].
Software (and web version): FastME is a heuristic that searches for the BME-optimal tree using nearest-neighbor interchange (NNI) and subtree pruning and regrafting (SPR) moves.

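WHY IS IT CALLED “BALANCED”?
$d^T_{AB}$ is the balanced average distance between the taxa in subtrees A and B of tree T.
If A and B each consist of a single taxon, a and b, then $d^T_{AB} = D_{ab}$, the distance estimate.
If B is composed of two subtrees B1 and B2:
$$d^T_{AB} = \tfrac{1}{2}\left(d^T_{AB_1} + d^T_{AB_2}\right)$$
Each sibling subtree contributes with weight 1/2, regardless of how many taxa it contains.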
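The balanced average can be sketched as a short recursion: whenever a subtree splits into two sibling subtrees, each sibling gets weight 1/2. The representation used here (nested 2-tuples of taxon names, distances keyed by frozenset) and the example values are assumptions for illustration only.

```python
def balanced_distance(A, B, dist):
    """Balanced average distance between subtrees A and B.

    A subtree is either a taxon name (str) or a 2-tuple of subtrees.
    dist maps frozenset({x, y}) to the estimated distance between taxa x and y.
    """
    if isinstance(A, tuple):
        A1, A2 = A
        return 0.5 * (balanced_distance(A1, B, dist) + balanced_distance(A2, B, dist))
    if isinstance(B, tuple):
        B1, B2 = B
        return 0.5 * (balanced_distance(A, B1, dist) + balanced_distance(A, B2, dist))
    return dist[frozenset((A, B))]


dist = {
    frozenset(("w", "x")): 4.0,
    frozenset(("w", "y")): 8.0,
    frozenset(("w", "z")): 12.0,
}

# B = (x, (y, z)): x gets weight 1/2, y and z get weight 1/4 each.
balanced = balanced_distance("w", ("x", ("y", "z")), dist)
unweighted = (4.0 + 8.0 + 12.0) / 3  # plain average over taxa, for comparison
print(balanced, unweighted)
```

In this example the balanced average is 7.0 while the plain per-taxon average is 8.0, because taxon x sits alone in its sibling subtree and therefore counts as much as y and z together.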

PAUPLIN’S FORMULA (SHORTCUT FOR BME!)
$$\ell(T) = \sum_{i<j} 2^{\,1-\tau_{ij}(T)}\, D_{ij}$$
D is the distance matrix.
T is the tree topology considered.
$\tau_{ij}(T)$ is the number of edges on the path between taxa i and j in T.
$\ell(T)$ is the sum of branch lengths assigned by BME.

BME VERSION 2.0 (PAUPLIN’S FORMULA)
Instead of assigning branch lengths to tree topology T using weighted least squares and then summing edge lengths, cut to the chase and use Pauplin’s formula! Given distance matrix D, find the binary tree T with the smallest total branch length:
$$T^{*} = \arg\min_{T} \sum_{i<j} 2^{\,1-\tau_{ij}(T)}\, D_{ij}$$
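Pauplin's formula weights each pairwise distance by 2^(1 - number of edges on the path between the two taxa) and sums over all pairs. The sketch below evaluates it for the three possible four-taxon topologies and picks the smallest. The adjacency-list tree representation, the helper names and the example distances are assumptions for illustration; this is not how FastME is implemented.

```python
from collections import deque
from itertools import combinations


def edge_counts_from(tree, start):
    """Breadth-first search: number of edges from start to every other node."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nb in tree[node]:
            if nb not in seen:
                seen[nb] = seen[node] + 1
                queue.append(nb)
    return seen


def pauplin_length(tree, leaves, dist):
    """Sum over leaf pairs of 2^(1 - tau_ij) * D_ij, tau_ij = edges on the i-j path."""
    total = 0.0
    for i, j in combinations(leaves, 2):
        tau = edge_counts_from(tree, i)[j]
        total += 2.0 ** (1 - tau) * dist[frozenset((i, j))]
    return total


def quartet(p, q, r, s):
    """Unrooted tree ((p,q),(r,s)) with internal nodes 'u' and 'v'."""
    return {p: ["u"], q: ["u"], r: ["v"], s: ["v"],
            "u": [p, q, "v"], "v": [r, s, "u"]}


leaves = ["A", "B", "C", "D"]
dist = {frozenset(k): v for k, v in {
    ("A", "B"): 3, ("A", "C"): 9, ("A", "D"): 10,
    ("B", "C"): 10, ("B", "D"): 11, ("C", "D"): 7}.items()}

candidates = {
    "AB|CD": quartet("A", "B", "C", "D"),
    "AC|BD": quartet("A", "C", "B", "D"),
    "AD|BC": quartet("A", "D", "B", "C"),
}
scores = {name: pauplin_length(t, leaves, dist) for name, t in candidates.items()}
print(scores, "best:", min(scores, key=scores.get))
```

For larger trees the same scoring function applies unchanged; what changes is that the candidate set can no longer be enumerated and has to be explored with local moves.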

EXERCISE
Which tree is BME-optimal? Why?

FASTME ON THE WEB
Submit distance matrix in Phylip format.
Initial tree: OLS_GME, balanced_GME, NJ or BIONJ.
Finds optimal tree using moves: OLS_NNI or balanced_NNI.
Enter and wait for results!
Self-contained executable available.
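For preparing the input, a distance matrix can be written in the classic square PHYLIP layout: the number of taxa on the first line, then one row per taxon with the name followed by its distances. The sketch below assumes the traditional 10-character name field; the function name is hypothetical, and the exact layout a given FastME version accepts should be checked against its documentation.

```python
def to_phylip(names, matrix):
    """Format a square distance matrix as PHYLIP-style text.

    Assumes the classic layout: taxon count, then rows of
    '<name padded to 10 chars> <distances...>'.
    """
    if len(names) != len(matrix) or any(len(row) != len(names) for row in matrix):
        raise ValueError("matrix must be square and match the number of names")
    lines = [str(len(names))]
    for name, row in zip(names, matrix):
        label = name[:10].ljust(10)
        lines.append(label + " " + " ".join(f"{x:.4f}" for x in row))
    return "\n".join(lines) + "\n"


names = ["A", "B", "C", "D"]
matrix = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]
print(to_phylip(names, matrix), end="")
```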

COMPUTATIONAL EXAMPLE
1. Download sequence at:
2. Calculate distance matrix (use HKY):
3. Compute BME tree:

REFERENCES
Desper R., Gascuel O. "Fast and accurate phylogeny reconstruction algorithms based on the minimum-evolution principle." Journal of Computational Biology (5):
Desper R., Gascuel O. "Theoretical foundation of the balanced minimum evolution method of phylogenetic inference and its relationship to weighted least-squares tree fitting." Molecular Biology and Evolution (3):
Desper R., Gascuel O. "Getting a Tree Fast: Neighbor Joining, FastME, and Distance-Based Methods." Current Protocols in Bioinformatics, edited by John Wiley & Sons
Gascuel O., Steel M. "Neighbor-Joining Revealed." Molecular Biology and Evolution (11):