In addition to maximum parsimony (MP) and likelihood methods, pairwise distance methods form the third large group of methods to infer evolutionary trees.

Slides:



Advertisements
Similar presentations
Computing a tree Genome 559: Introduction to Statistical and Computational Genomics Prof. James H. Thomas.
Advertisements

Phylogenetic Tree A Phylogeny (Phylogenetic tree) or Evolutionary tree represents the evolutionary relationships among a set of organisms or groups of.
Bioinformatics Phylogenetic analysis and sequence alignment The concept of evolutionary tree Types of phylogenetic trees Measurements of genetic distances.
An Introduction to Phylogenetic Methods
1 Dan Graur Methods of Tree Reconstruction. 2 3.
Computing a tree Genome 559: Introduction to Statistical and Computational Genomics Prof. James H. Thomas.
Lecture 13 CS5661 Phylogenetics Motivation Concepts Algorithms.
BALANCED MINIMUM EVOLUTION. DISTANCE BASED PHYLOGENETIC RECONSTRUCTION 1. Compute distance matrix D. 2. Find binary tree using just D. Balanced Minimum.
Parsimony based phylogenetic trees Sushmita Roy BMI/CS 576 Sep 30 th, 2014.
1 General Phylogenetics Points that will be covered in this presentation Tree TerminologyTree Terminology General Points About Phylogenetic TreesGeneral.
Phylogenetics - Distance-Based Methods CIS 667 March 11, 2204.
Phylogenetic reconstruction
Phylogenetic trees Sushmita Roy BMI/CS 576 Sep 23 rd, 2014.
IE68 - Biological databases Phylogenetic analysis
Molecular Evolution Revised 29/12/06
Tree Reconstruction.
Lecture 7 – Algorithmic Approaches Justification: Any estimate of a phylogenetic tree has a large variance. Therefore, any tree that we can demonstrate.
CENTER FOR BIOLOGICAL SEQUENCE ANALYSIS Phylogenetic Reconstruction: Distance Matrix Methods Anders Gorm Pedersen Molecular Evolution Group Center for.
. Computational Genomics 5a Distance Based Trees Reconstruction (cont.) Modified by Benny Chor, from slides by Shlomo Moran and Ydo Wexler (IIT)
. Phylogeny II : Parsimony, ML, SEMPHY. Phylogenetic Tree u Topology: bifurcating Leaves - 1…N Internal nodes N+1…2N-2 leaf branch internal node.
UPGMA and FM are distance based methods. UPGMA enforces the Molecular Clock Assumption. FM (Fitch-Margoliash) relieves that restriction, but still enforces.
Bioinformatics Algorithms and Data Structures
Distance matrix methods calculate a measure of distance between each pair of species, then find a tree that predicts the observed set of distances.
Fast Algorithms for Minimum Evolution Richard Desper, NCBI Olivier Gascuel, LIRMM.
Distance methods. UPGMA: similar to hierarchical clustering but not additive Neighbor-joining: more sophisticated and additive What is additivity?
The Tree of Life From Ernst Haeckel, 1891.
Branch lengths Branch lengths (3 characters): A C A A C C A A C A C C Sum of branch lengths = total number of changes.
CISC667, F05, Lec15, Liao1 CISC 667 Intro to Bioinformatics (Fall 2005) Phylogenetic Trees (II) Distance-based methods.
Protein Sequence Classification Using Neighbor-Joining Method
Lecture 24 Inferring molecular phylogeny Distance methods
Building Phylogenies Distance-Based Methods. Methods Distance-based Parsimony Maximum likelihood.
CENTER FOR BIOLOGICAL SEQUENCE ANALYSIS Distance Matrix Methods Anders Gorm Pedersen Molecular Evolution Group Center for Biological Sequence Analysis.
Phylogenetic trees Sushmita Roy BMI/CS 576
Phylogenetic analyses Kirsi Kostamo. The aim: To construct a visual representation (a tree) to describe the assumed evolution occurring between and among.
Phylogenetic Analysis. 2 Introduction Intension –Using powerful algorithms to reconstruct the evolutionary history of all know organisms. Phylogenetic.
Terminology of phylogenetic trees
Phylogenetics Alexei Drummond. CS Friday quiz: How many rooted binary trees having 20 labeled terminal nodes are there? (A) (B)
1 Dan Graur Molecular Phylogenetics Molecular phylogenetic approaches: 1. distance-matrix (based on distance measures) 2. character-state.
1 Building Phylogenetic Trees Yaw-Ling Lin ( 林耀鈴 ) Dept Computer Sci and Info Management Providence University, Taiwan WWW:
Phylogenetic Analysis. General comments on phylogenetics Phylogenetics is the branch of biology that deals with evolutionary relatedness Uses some measure.
Molecular phylogenetics 1 Level 3 Molecular Evolution and Bioinformatics Jim Provan Page and Holmes: Sections
BINF6201/8201 Molecular phylogenetic methods
Phylogenetics and Coalescence Lab 9 October 24, 2012.
Bioinformatics 2011 Molecular Evolution Revised 29/12/06.
Parsimony-Based Approaches to Inferring Phylogenetic Trees BMI/CS 576 Colin Dewey Fall 2010.
Applied Bioinformatics Week 8 Jens Allmer. Practice I.
OUTLINE Phylogeny UPGMA Neighbor Joining Method Phylogeny Understanding life through time, over long periods of past time, the connections between all.
Building phylogenetic trees. Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances  UPGMA method (+ an example)
Introduction to Phylogenetics
Calculating branch lengths from distances. ABC A B C----- a b c.
Using traveling salesman problem algorithms for evolutionary tree construction Chantal Korostensky and Gaston H. Gonnet Presentation by: Ben Snider.
Parsimony-Based Approaches to Inferring Phylogenetic Trees BMI/CS 576 Colin Dewey Fall 2015.
Phylogeny Ch. 7 & 8.
Phylogenetic trees Sushmita Roy BMI/CS 576 Sep 23 rd, 2014.
DATA MINING van data naar informatie Ronald Westra Dep. Mathematics Maastricht University.
Applied Bioinformatics Week 8 Jens Allmer. Theory I.
1 CAP5510 – Bioinformatics Phylogeny Tamer Kahveci CISE Department University of Florida.
Distance-Based Approaches to Inferring Phylogenetic Trees BMI/CS 576 Colin Dewey Fall 2010.
Distance-based methods for phylogenetic tree reconstruction Colin Dewey BMI/CS 576 Fall 2015.
CSCE555 Bioinformatics Lecture 13 Phylogenetics II Meeting: MW 4:00PM-5:15PM SWGN2A21 Instructor: Dr. Jianjun Hu Course page:
4. Vorlesung WS 2005/06Softwarewerkzeuge der Bioinformatik1 V4 Prediction of Phylogenies based on single genes Material of this lecture taken from - chapter.
Inferring a phylogeny is an estimation procedure.
Clustering methods Tree building methods for distance-based trees
Goals of Phylogenetic Analysis
Phylogenetic Trees.
BNFO 602 Phylogenetics Usman Roshan.
Lecture 7 – Algorithmic Approaches
Phylogeny.
Incorporating uncertainty in distance-matrix phylogenetics
Presentation transcript:

In addition to maximum parsimony (MP) and likelihood methods, pairwise distance methods form the third large group of methods to infer evolutionary trees from sequence data. Evolutionary model: When all the pairwise distances have been computed for a set of sequences, a tree topology can then be inferred by a variety of methods. Chapter 5 Phylogeny inference based on distance methods

The main distance-based tree-building methods are cluster analysis and minimum evolution. Ultrametricity is satisfied d AC ≦ max(d AB, d BC ) unweighted-pair group method with arithmetic means (UPGMA) that are most similar to each other (that is for which the genetic distance is the smallest). When two OTUs are group, they are treated as a new single OTU. From the new group of OTUs, the pair for which the similarity is highest is again identified, and so on, until only two OTUs are left. 5.2

UPGMA are extremely sensitive to unequal rates in different lineages.

Additive distances satisfy the following condition, known as the four-point metric condition (d AB +d CD ) ≦ max (d AC + d BD, d AD +d BC ) Only additive distances can be fitted precisely into an unrooted tree such that the genetic distance between a pair of OTUs equals the sum of the lengths of the branches connecting them, rather than an average, Minimum evolution and neighbor-joining

Minimum evolution (ME) was first described by Kidd and Sgaramella-Zonta (1971); Rzhetsky and Nei(1992) described a method with only a minor difference. In ME, the tree that minimizes the lengths of the tree, which is the sum of the lengths of the branches, is regarded as the best estimate of the phylogeny: n is the number of taxa in the tree and v i is the ith branch (remember that there are 2n-3 branches in an unrooted tree of n taxa).

Distances are rarely, exactly tree metrics, and hence one class of ‘goodness of fit’ methods seeks the metric tree that best accounts for the ‘observed’ distances. The goodness of fit F between observed distance d ij and tree distances p ij for each pair of sequences i and j is given by. In the example just given we were fitting an additive tree with (2n-3) branches to ( ) = n (n-1)/2 pairwise distances. n2n2 Distance methods

Minimum evolution Given an unrooted metric tree for n sequences there are (2n-3) branches, each with length e i. The sum of these branch lengths is the length L of the tree: The minimum evolution tree (ME) is the tree which minimizes L. More commonly, the branch lengths of the minimum evolution tree are estimated using least-squares methods. The branch lengths are estimated in the same way as for goodness of fit measures; however, rather than compare the fit of the observed distances the least squares branch lengths are added together to give the length of the tree.

A drawback of the ME method is that, in principle, all different tree topologies have to be investigated to find the minimum tree. However, this is impossible in practice because of the explosive increase in the number of tree topologies as the number of OTUs increases; an exhaustive search can no longer be applied when more than ten sequences are being used. Page 19

A good heuristic method for estimating the ME tree is the neighbor-joining (NJ) method, developed by Saitou and Nei (1987) and modified by Studier and Keppler (1988). Because NJ is conceptually related to clustering, but without assuming a clock- like behavior, it combines computational speed with uniqueness of results.

However, NJ trees have proven to be the same or similar to the ME tree. Several methods have been proposed to find ME trees, staring from an NJ tree but evaluating alternative topologies close to the NJ tree by conducting local rearrangements.

3 - (21+24) / 3

U 3/2 (21-24) /2 * 3

DC EC

519+17

S DW =d DE /2+(r D -r E )/2(N-2) S EW =d DE -S DW

However, NJ trees have proven to be the same or similar to the ME tree. Several methods have been proposed to find ME trees, staring from an NJ tree but evaluating alternative topologies close to the NJ tree by conducting local rearrangements.

Alternative versions of the NJ algorithm have been proposed, including BIONJ (Gascuel, 1997), weighted neighbor-joining (weighbor), and generalized neighbor-joining. BIONJ and weighbor both consider that long genetic distances present a higher variance than short ones when distances from a newly defined node to all other nodes are estimated.

The weighted neighbor-joining method of Bruno et al. (2000) uses a likelihood-based criterion rather than the ME criterion of Saitou and Nei (1987) to decide which pair of OTUs should be joined. The generalized neighbor-joining method of Pearson et al.(1999) keeps track of multiple, partial, and potentially good solutions during its execution, thus exploring a greater part of the tree space. As a result, the program is able to discover topologically distinct solutions that are close to the ME tree.

Consensus tree

Compare trees derived from different sequences, or from the same sequence using different methods.

Jackknifing