Phylogenetic trees

Presentation transcript:

Phylogenetic trees

Trees [Figure: several drawings of trees on Chimp, Human and Gorilla, joined by "=" signs.]

Terminology: a branch = an edge; external node = leaf; internal nodes; the root. [Figure: tree on Human, Chimp, Gorilla and Chicken with these parts labeled.]

Ingroup / Outgroup: [Figure: tree on Human, Chimp, Gorilla and Chicken, with the labels INGROUP and OUTGROUP.]

The maximum parsimony principle (the shortest path). Modified from the book Inferring Phylogenies by Prof. Joe Felsenstein.

Genes: 0 = absence, 1 = presence. [Table: presence/absence of genes g1 to g6 (columns) in species s1 to s5 (rows).]

Evaluate this tree…

[Figure: tree with leaves s1, s4, s3, s2, s5; the states of gene number 1 are filled in over successive slides.]

Gene number 1. The most parsimonious ancestral character states. [Figure: tree with leaves s1, s4, s3, s2, s5.]

Gene number 1, Option number

Gene number 1, Option number 2. Minimal number of changes for gene 1 (character 1) =

Gene number 2. [Figure: tree with leaves s1, s4, s3, s2, s5.]

Gene number 2, Option number

Gene number 2, Option number

Gene number 2, Option number 3. Number of changes for gene 2 (character 2) = 2

Total number of changes given the tree: sum of changes = 9. [Table: presence/absence (0 = absence, 1 = presence) of genes g1 to g6 in species s1 to s5.]

Can we do better? Sum of changes = 9

Yes, we can! Sum of changes = 8, compared with sum of changes = 9 for the previous tree.

The MP (most parsimonious) tree: sum of changes for this tree topology = 8. [Figure: tree with leaves s1, s4, s3, s2, s5.]

Intermediate summary: the MP tree is the one for which the minimal number of changes is needed to explain the data. We can now search for the best tree under the MP criterion.
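The MP criterion summarized above can be written compactly. The notation is an assumption introduced here, not taken from the slides: T is a tree topology, m is the number of characters (genes or positions), and c_j(T) is the minimal number of changes needed to explain character j on T.

```latex
\mathrm{score}(T) = \sum_{j=1}^{m} c_j(T),
\qquad
T_{\mathrm{MP}} = \arg\min_{T} \, \mathrm{score}(T)
```

In the gene example above, the first tree has score 9 and the MP tree has score 8.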

Challenges: Evaluating a big tree "by hand" can be problematic; we want the computer to do it. Going over all the trees? How many trees are there? Can we generalize to nucleotides? To amino acids? Is the parsimony criterion ideal?
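To get a feeling for the "how many trees are there?" question, here is a minimal sketch that counts binary trees with labeled leaves. It relies on the standard counting formulas, (2n - 5)!! unrooted trees and (2n - 3)!! rooted trees for n taxa. These formulas are not stated in the slides, and the function names are made up for this illustration.

```python
def double_factorial(k: int) -> int:
    """Return k * (k - 2) * (k - 4) * ... down to 1 (1 if k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n_taxa: int) -> int:
    """Number of unrooted binary trees on n labeled leaves: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    return double_factorial(2 * n_taxa - 5)


def count_rooted_trees(n_taxa: int) -> int:
    """Number of rooted binary trees on n labeled leaves: (2n - 3)!!."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa")
    return double_factorial(2 * n_taxa - 3)


# Print the counts for a few sizes to show how quickly they grow.
for n in (3, 4, 5, 10, 20):
    print(n, count_unrooted_trees(n), count_rooted_trees(n))
```

The counts grow so fast that going over all the trees is only feasible for a small number of taxa.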

MP for nucleotides

Positions:
species  p1 p2 p3 p4 p5 p6
s1       A  A  G  T  A  A
s2       C  A  A  A  A  C
s3       C  A  G  G  A  A
s4       A  A  A  T  A  C
s5       G  C  G  C  C  A

Position number 1. [Figure: tree with leaves s1 = A, s4 = A, s3 = C, s2 = C, s5 = G, with ancestral states at the internal nodes.] Number of changes for position 1 = 2

Exercise: find the MP score of the tree for these sequences. [Figure: tree with leaves Human = GACA, Chimp = GGGA, Chicken = CAAG, Gorilla = GCGA, Duck = GAAA.]

How to efficiently compute the MP score of a tree

The Fitch algorithm (1971): postorder tree scan. At each internal node, if the intersection of the sets of its two children is empty, we apply a union operator; otherwise, an intersection. [Figure: tree with leaves Human = A, Chimp = G, Chicken = C, Gorilla = C, Duck = A; internal nodes labeled {A,G}, {A,C,G}, {A,C}.]

Total number of changes = number of union operators.
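The postorder scan described above can be sketched in a few lines. This is an illustration under stated assumptions, not the lecturer's code. A tree is represented here as a leaf name or a pair of subtrees, and the topology used below is a guess that reproduces the sets {A,G}, {A,C,G} and {A,C} from the slide, because the slide's figure is not preserved.

```python
def fitch(tree, leaf_states):
    """Postorder scan of a rooted binary tree for one character.

    tree: a leaf name (str) or a pair (left_subtree, right_subtree).
    Returns (state set at this node, number of union operations so far).
    """
    if isinstance(tree, str):
        # Leaf: the set holds only the observed state; no changes yet.
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, leaf_states)
    right_set, right_changes = fitch(right, leaf_states)
    common = left_set & right_set
    if common:
        # Non-empty intersection: keep it, no change is counted.
        return common, left_changes + right_changes
    # Empty intersection: take the union and count one change.
    return left_set | right_set, left_changes + right_changes + 1


# Assumed topology (hypothetical; chosen to match the sets on the slide).
tree = ((("Human", "Chimp"), "Gorilla"), ("Chicken", "Duck"))
states = {"Human": "A", "Chimp": "G", "Gorilla": "C", "Chicken": "C", "Duck": "A"}

root_set, changes = fitch(tree, states)
print(sorted(root_set), changes)
```

For a sequence with several positions, the same scan is repeated for each position and the counts are added up.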

Rooting the tree (figure from Wikimedia Commons)

Positions:
species  p1 p2 p3 p4 p5 p6
Human    A  A  G  T  A  A
Chimp    A  A  T  T  A  C
Gorilla  A  C  A  T  A  A

Leaf states Human = A, Chimp = A, Gorilla = A: total number of changes = 0 for all 3 possible tree topologies.

Leaf states Human = A, Chimp = A, Gorilla = C: total number of changes = 1 for all 3 possible tree topologies.

Leaf states Human = G, Chimp = T, Gorilla = A: total number of changes = 2 for all 3 possible tree topologies.

Total number of changes is always the same for all 3 possible tree topologies.
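This claim can be checked directly on the Human, Chimp and Gorilla sequences above. The sketch below scores each of the 3 rooted topologies with a Fitch-style count, summed over the six positions. The helper names `fitch` and `mp_score` and the tuple representation of trees are assumptions made for this illustration.

```python
def fitch(tree, leaf_states):
    """Return (state set, number of changes) for one character on a rooted tree."""
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left_set, left_changes = fitch(tree[0], leaf_states)
    right_set, right_changes = fitch(tree[1], leaf_states)
    common = left_set & right_set
    if common:
        return common, left_changes + right_changes
    return left_set | right_set, left_changes + right_changes + 1


def mp_score(tree, sequences):
    """Sum the per-position change counts over all aligned positions."""
    n_positions = len(next(iter(sequences.values())))
    total = 0
    for i in range(n_positions):
        column = {name: seq[i] for name, seq in sequences.items()}
        total += fitch(tree, column)[1]
    return total


# Sequences as given on the slides.
sequences = {"Human": "AAGTAA", "Chimp": "AATTAC", "Gorilla": "ACATAA"}

# The 3 possible rooted topologies for 3 taxa.
topologies = [
    (("Human", "Chimp"), "Gorilla"),
    (("Human", "Gorilla"), "Chimp"),
    (("Chimp", "Gorilla"), "Human"),
]

scores = [mp_score(tree, sequences) for tree in topologies]
print(scores)  # [4, 4, 4]
```

All three topologies need 4 changes in total, so parsimony cannot prefer one of them.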

With 4 taxa (adding Orangutan)

[Figure: 15 rooted trees on the four taxa Human (H), Chimp (C), Gorilla (G) and Orangutan (O); the following slides show 5 of them.]

Conclusion: the position of the root does not affect the MP score.

After "bending" the trees, the association of changes and branches does not change: rooting does not change the MP score. [Figures: two examples of a tree on Chimp, Orangutan, Gorilla and Human with nucleotide states at the nodes.]

Back to solving the relationships between human, chimp and gorilla… Using an outgroup

No MP with 3 species: all 3 tree topologies get the same score.

With 4 taxa, there are 3 different unrooted trees. [Figure: the three unrooted trees on Human, Chimp, Gorilla and Chicken.]

One tree gets a better score (fewer changes) than the other trees.
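A minimal sketch of this comparison follows. The four sequences are invented toy data, not real sequences and not from the slides, so the output says nothing about the real species. Each unrooted tree is scored by rooting it on its internal branch, which is harmless because the position of the root does not affect the MP score. The helper names are hypothetical.

```python
def fitch(tree, leaf_states):
    """Return (state set, number of changes) for one character on a rooted tree."""
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left_set, left_changes = fitch(tree[0], leaf_states)
    right_set, right_changes = fitch(tree[1], leaf_states)
    common = left_set & right_set
    if common:
        return common, left_changes + right_changes
    return left_set | right_set, left_changes + right_changes + 1


def mp_score(tree, sequences):
    """Sum the per-position change counts over all aligned positions."""
    n_positions = len(next(iter(sequences.values())))
    return sum(
        fitch(tree, {name: seq[i] for name, seq in sequences.items()})[1]
        for i in range(n_positions)
    )


# Invented toy sequences (hypothetical, for illustration only).
toy = {"Human": "AAGT", "Chimp": "AAGC", "Gorilla": "ACTC", "Chicken": "CCTA"}

# The 3 unrooted trees on 4 taxa, each rooted on its internal branch.
quartets = {
    "Human,Chimp | Gorilla,Chicken": (("Human", "Chimp"), ("Gorilla", "Chicken")),
    "Human,Gorilla | Chimp,Chicken": (("Human", "Gorilla"), ("Chimp", "Chicken")),
    "Human,Chicken | Chimp,Gorilla": (("Human", "Chicken"), ("Chimp", "Gorilla")),
}

scores = {name: mp_score(tree, toy) for name, tree in quartets.items()}
best = min(scores, key=scores.get)
for name, score in scores.items():
    print(name, score)
print("best:", best)
```

On this toy data the first split needs 5 changes and the other two need 7 each, so one tree is preferred.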

We then use external knowledge, that chicken is the outgroup, and get a rooted tree. [Figure: rooted tree on Human, Chimp, Gorilla and Chicken.]

Exercise: Can you root the unrooted tree to obtain the tree below? [Figure: an unrooted tree and a rooted tree with nodes labeled C, H, O, X and Y.]

Exercise: How many rooted trees result from an unrooted tree with n taxa?

Exercise: Assume you have three sequences and the MP score of the unrooted tree is X. You now add another sequence. Can the score of the 4-taxa tree be lower than that of the 3-taxa tree?