Building Phylogenies Parsimony 1. Methods Distance-based Parsimony Maximum likelihood.


Building Phylogenies Parsimony 1

Methods Distance-based Parsimony Maximum likelihood

Note: Some of the following figures come from:
–[S05] Swofford formatics_spring05
–[F05] Felsenstein 1/2005/

Parsimony methods
Goal: Find the tree that allows evolution of the sequences with the fewest changes.
–This is called a most parsimonious (MP) tree.
–Parsimony is implemented in PAUP*.
Compatibility methods are closely related to parsimony:
–Goal: Find the tree that perfectly fits the most characters.

Evolutionary Steps
[Figure: nucleotide states (G, A) on a tree; a change such as G → A along a branch is one step]
Steps can have weights

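Parsimony
Character matrix (rows: characters a–f; columns: species A–D):
     A  B  C  D
a    0  1  1  1
b    0  1  1  1
c    0  0  1  1
d    0  1  1  0
e    0  0  0  1
f    1  0  0  0
[Figure: tree on species A, B, C, D with character changes marked on its branches: f; a, b; d; c; e; d (character d changes twice)]
Typically, each site is treated separately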

Some numbers
Number of unrooted trees on $n > 2$ species: $U_n = (2n-5)(2n-7)(2n-9)\cdots(3)(1)$
Number of rooted trees on $n \ge 3$ species: $R_n = (2n-3)\,U_n$
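To get a feel for how fast these counts grow, the product formula can be evaluated directly. The sketch below is only illustrative: the function names are made up for this example, and it assumes the standard relation that a rooted tree arises by placing the root on any of the 2n − 3 branches of an unrooted tree.

```python
def num_unrooted_trees(n: int) -> int:
    """Number of unrooted bifurcating trees on n >= 3 labeled species.

    Evaluates the product (2n-5)(2n-7)...(3)(1).
    """
    if n < 3:
        raise ValueError("need at least 3 species")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5 (empty product for n = 3).
    for k in range(3, 2 * n - 5 + 1, 2):
        count *= k
    return count


def num_rooted_trees(n: int) -> int:
    """Number of rooted bifurcating trees on n >= 3 labeled species.

    The root can be placed on any of the 2n-3 branches of an unrooted tree.
    """
    return (2 * n - 3) * num_unrooted_trees(n)


if __name__ == "__main__":
    for n in (3, 4, 5, 10, 20):
        print(n, num_unrooted_trees(n), num_rooted_trees(n))
```

Already at 10 species there are more than two million unrooted trees, which is why exhaustive search is limited to small data sets.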

The number of rooted trees [F05]

Small versus Large Parsimony
Parsimony score of a tree: The smallest (weighted) number of steps required by the tree
(Large) Parsimony: Find the tree with the lowest parsimony score
Small Parsimony: Given a tree, find its parsimony score
Small parsimony is by far the easier problem.
–Used to solve large parsimony

A DNA data set [F05]

An example tree [F05]

Most parsimonious states for site 1

Most parsimonious states for site 2

Most parsimonious states for site 3

Most parsimonious states for sites 4 and 5

Most parsimonious states for site 6

Evolutionary steps on tree
Only one choice of reconstruction at each site is shown.
9 steps in all.

Algorithms for Small Parsimony
Fitch’s algorithm:
–Based on set operations
–Evolutionary steps have the same weight
Sankoff’s algorithm:
–Based on dynamic programming
–Allows steps to have different weights
Both algorithms compute the minimum (weighted) number of steps a tree requires at a given site.

Fitch’s Algorithm
Each node v in the tree has a set X(v).
If v is a leaf (tip), X(v) is the nucleotide observed at v
–if there is ambiguity, X(v) contains all possible nucleotides at v
If v is a node with descendants u and w,
–Let $Y = X(u) \cap X(w)$
–If $Y \ne \emptyset$, make $X(v) = Y$
–If $Y = \emptyset$, make $X(v) = X(u) \cup X(w)$ and count one step.
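The set rules above can be sketched as a short recursive function. This is a minimal illustration, not the code behind the slides: the nested-tuple tree encoding, the function name `fitch`, and the five-tip example data are all assumptions made for this sketch.

```python
def fitch(tree, states):
    """Fitch's algorithm at a single site.

    tree   : a leaf name (str) or a pair (left_subtree, right_subtree);
             this nested-tuple encoding is an assumption of the sketch.
    states : dict mapping leaf name -> string of possible nucleotides
             (one letter normally, several letters if ambiguous).
    Returns (X(v), number of steps counted in the subtree rooted at v).
    """
    if isinstance(tree, str):
        # Leaf: X(v) is the observed nucleotide (or all possible ones).
        return set(states[tree]), 0
    left, right = tree
    x_u, steps_u = fitch(left, states)
    x_w, steps_w = fitch(right, states)
    y = x_u & x_w                      # Y = X(u) ∩ X(w)
    if y:
        return y, steps_u + steps_w    # non-empty: X(v) = Y, no new step
    # empty: X(v) = X(u) ∪ X(w), and one step is counted
    return x_u | x_w, steps_u + steps_w + 1


if __name__ == "__main__":
    # Hypothetical five-tip tree and one site, for illustration only.
    tree = ((("a", "b"), "c"), ("d", "e"))
    site = {"a": "C", "b": "A", "c": "C", "d": "A", "e": "G"}
    root_set, steps = fitch(tree, site)
    print(sorted(root_set), steps)
```

Summing the returned step count over all sites gives the unweighted parsimony score of the tree.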

Fitch’s Algorithm: Example [F05]

Sankoff’s Algorithm
Let $c_{ij}$ be the cost of going from state i to state j.
–E.g., transitions ($A \leftrightarrow G$ or $C \leftrightarrow T$) are more probable than transversions, so give lower weight to transitions
Let $S_v(k)$ be the smallest (weighted) number of steps needed to evolve the subtree at or above node v, given that node v is in state k.

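Sankoff’s Algorithm
If v is a leaf (tip): $S_v(k) = 0$ if state k is observed at v, and $S_v(k) = \infty$ otherwise
If v is a node with descendants u and w: $S_v(k) = \min_i [c_{ki} + S_u(i)] + \min_j [c_{kj} + S_w(j)]$
The minimum number of (weighted) steps is $\min_k S_{\mathrm{root}}(k)$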
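The dynamic program can be sketched as follows, using the usual form of the recursion: a leaf costs 0 for its observed state and infinity otherwise, and an internal node adds, for each descendant, the cheapest combination of a change cost and that descendant's subtree cost. The tree encoding, the function names, and the particular weights (1 for a transition, 2 for a transversion) are assumptions chosen for illustration, not values from the slides.

```python
import math

STATES = "ACGT"
PURINES = {"A", "G"}


def ts_tv_cost(i, j):
    """Illustrative cost c_ij: 0 if unchanged, 1 for a transition, 2 for a transversion."""
    if i == j:
        return 0
    is_transition = (i in PURINES) == (j in PURINES)
    return 1 if is_transition else 2


def unit_cost(i, j):
    """Every change costs 1, which reproduces the unweighted (Fitch) count."""
    return 0 if i == j else 1


def sankoff(tree, observed, cost):
    """Return {k: S_v(k)} for the node at the top of `tree`.

    tree     : a leaf name (str) or a pair (left_subtree, right_subtree).
    observed : dict mapping leaf name -> observed nucleotide.
    cost     : function (i, j) -> cost of going from state i to state j.
    """
    if isinstance(tree, str):
        # Leaf: free if k is the observed state, impossible otherwise.
        return {k: (0 if k == observed[tree] else math.inf) for k in STATES}
    left, right = tree
    s_u = sankoff(left, observed, cost)
    s_w = sankoff(right, observed, cost)
    return {
        k: min(cost(k, i) + s_u[i] for i in STATES)
        + min(cost(k, j) + s_w[j] for j in STATES)
        for k in STATES
    }


if __name__ == "__main__":
    # Hypothetical five-tip tree and one site, for illustration only.
    tree = ((("a", "b"), "c"), ("d", "e"))
    site = {"a": "C", "b": "A", "c": "C", "d": "A", "e": "G"}
    root_costs = sankoff(tree, site, ts_tv_cost)
    print(root_costs, min(root_costs.values()))
```

The minimum over the root's table is the weighted score for the site; recording which i and j achieve each minimum is what a traceback would use to recover ancestral states.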

Sankoff’s Algorithm: Example

Sankoff’s Algorithm: Traceback

Searching for an MP tree
Exhaustive search (exact)
Branch-and-bound search (exact)
Heuristic search methods
–Stepwise addition
–Branch swapping
–Star decomposition
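As a rough sketch of exhaustive search, the code below enumerates every unrooted tree by attaching species one at a time to every branch, scores each tree with an unweighted Fitch count, and keeps the minimum. The tree encoding, the function names, and the small 0/1 character matrix are assumptions for illustration. Fixing the first species as an outgroup at the root is a convenience that does not change unweighted scores.

```python
def insert_everywhere(tree, leaf):
    """Yield each tree made by attaching `leaf` to one branch of `tree`."""
    yield (tree, leaf)                       # attach above the current node
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in insert_everywhere(left, leaf):
            yield (new_left, right)
        for new_right in insert_everywhere(right, leaf):
            yield (left, new_right)


def all_unrooted_trees(species):
    """All unrooted bifurcating trees, each rooted at species[0] for scoring."""
    subtrees = [species[1]]
    for leaf in species[2:]:
        subtrees = [t for s in subtrees for t in insert_everywhere(s, leaf)]
    return [(species[0], s) for s in subtrees]


def fitch_steps(tree, site):
    """Unweighted Fitch pass at one site; returns (state set, steps)."""
    if isinstance(tree, str):
        return {site[tree]}, 0
    (x_u, n_u), (x_w, n_w) = (fitch_steps(sub, site) for sub in tree)
    common = x_u & x_w
    if common:
        return common, n_u + n_w
    return x_u | x_w, n_u + n_w + 1


def parsimony_score(tree, sites):
    """Sum of Fitch step counts over all sites (each site treated separately)."""
    return sum(fitch_steps(tree, site)[1] for site in sites)


def exhaustive_search(species, sites):
    """Score every tree and return (best score, list of all best trees)."""
    scored = [(parsimony_score(t, sites), t) for t in all_unrooted_trees(species)]
    best = min(score for score, _ in scored)
    return best, [t for score, t in scored if score == best]


if __name__ == "__main__":
    species = ["A", "B", "C", "D"]
    # Illustrative binary characters, one string per character, in species order.
    patterns = ["0111", "0111", "0011", "0110", "0001", "1000"]
    sites = [dict(zip(species, p)) for p in patterns]
    print(exhaustive_search(species, sites))
```

With four species there are only three trees to score. The enumeration grows as (2n − 5)(2n − 7)···(3)(1), which is what makes branch-and-bound and heuristic searches necessary for larger data sets.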

Homology, orthology, and paralogy
Homology: Similarity attributed to descent from a common ancestor.
Orthologous sequences: Homologous sequences in different species that arose from a common ancestral gene during speciation; may or may not be responsible for a similar function.
Paralogous sequences: Homologous sequences within a single species that arose by gene duplication.

Orthology and Paralogy