Branch lengths

Branch lengths (3 characters). [Figure: tree with character states A C A A C C A A C A C C and branch lengths 2, 0, 1, 0, 0.] Sum of branch lengths = total number of changes.



[Figure: trees with node states C and A; branch-length labels 0.5 and 0.]

Genes: 0 = missing, 1 = present. [Table: rows are the species s1 to s5, columns are the genes g1 to g6.]

Exercise: find the branch lengths of the tree. [Figure: tree with leaves s1, s3, s2, s4, s5.]

Problems with MP MP has many problems. We will go over a small sample of them.

Problems with MP 1. The statistical justification of MP is unclear. Why should a tree with the least number of changes be the most likely one given the data?

Problems with MP 2. Different substitutions should have different probabilities (e.g., transitions versus transversions). In MP, this can be accounted for using cost matrices. However, there is no objective way to assign the costs.

Problems with MP 3. Different characters may deserve different weights (e.g., having versus not having vertebrae should perhaps weigh more than having or not having nails). In MP, all characters are usually assigned equal weights. Different weights can be assigned to different characters, but there is no objective way to choose them.

Problems with MP 4. The chance of a change depends on evolutionary distance. An amino-acid replacement is more likely to have occurred between a cucumber and a human than between a chimp and a human. MP ignores evolutionary distances (branch lengths), i.e., each type of change is assigned the same cost regardless of the branch on which it is inferred to have occurred.

Problems with MP 5. The MP score does not change if we consider a rooted tree versus an unrooted one. However, branch lengths do change.

Gene number 2, Option number … [Figure, shown once per option: rooted tree with leaves s1, s4, s3, s2, s5.]

Number of changes for gene 2 (character 2) = 2

Gene number 2, Branch lengths. [Figure: rooted tree with leaves s1, s4, s3, s2, s5; surviving branch labels: …/3, 1/3.]

Gene number 2, The unrooted version. [Figure: unrooted tree with leaves s1, s4, s3, s2, s5.]

Branch lengths are different if one uses a rooted or an unrooted tree.

Problems with MP 6. MP ignores the chance of multiple substitutions per position. If we see A in one sequence and C in another, there is a chance that the evolution was in fact A->G->C. Similarly, if we have A in two sequences, the evolution may have been A->C->A. MP ignores these possibilities, which is unrealistic, and as a result MP also underestimates branch lengths.
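To make the underestimation concrete, here is a minimal sketch that uses the Jukes-Cantor (JC69) correction. This model is not part of the slides and is swapped in purely for illustration; the function name is hypothetical. Given the observed fraction p of differing sites, which is what a simple count of changes sees, it returns the expected number of substitutions per site once hidden multiple substitutions are allowed for.

```python
import math


def jukes_cantor_distance(p):
    """Expected substitutions per site under JC69 (an assumed model),
    given the observed fraction p of sites that differ."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the JC69 correction")
    return -0.75 * math.log(1 - 4 * p / 3)


# The corrected distance is always at least the observed fraction,
# and the gap widens as sequences diverge.
for p in (0.05, 0.2, 0.5):
    print(p, round(jukes_cantor_distance(p), 3))
```

The gap between p and the corrected value is the part of the branch length that counting visible changes cannot see.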

Problems with MP: MP underestimates branch lengths. [Figure: the same tree, with node states A, A, C, A, drawn twice. MP branch lengths: 1.0, 0, 0. A more realistic solution: 1.05, 0.05.]

Variants of parsimony

Variants of parsimony. The simple parsimony which counts changes is the Wagner parsimony. If different changes have different costs, this is weighted parsimony. Example cost matrix:

      A  C  G  T
  A   0  3  1  2
  C   3  0  2  1
  G   1  2  0  3
  T   2  1  3  0
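A weighted parsimony score can be computed with Sankoff's dynamic programming, which the slides mention by name. The following is only a sketch: the cost matrix is the one from the slide, while the tree encoding (nested tuples whose leaves are the observed nucleotides) and the function names are assumptions made for the example.

```python
from math import inf

STATES = "ACGT"
# Cost matrix from the slide: A<->G and C<->T cost 1, other changes 2 or 3.
COST = {
    "A": {"A": 0, "C": 3, "G": 1, "T": 2},
    "C": {"A": 3, "C": 0, "G": 2, "T": 1},
    "G": {"A": 1, "C": 2, "G": 0, "T": 3},
    "T": {"A": 2, "C": 1, "G": 3, "T": 0},
}


def sankoff(node):
    """Return {state: minimum cost of the subtree if node has that state}."""
    if isinstance(node, str):  # leaf: the observed nucleotide
        return {s: (0 if s == node else inf) for s in STATES}
    child_tables = [sankoff(child) for child in node]
    return {
        s: sum(min(COST[s][t] + table[t] for t in STATES) for table in child_tables)
        for s in STATES
    }


def weighted_parsimony_score(tree):
    """Minimum total cost over all assignments of states to internal nodes."""
    return min(sankoff(tree).values())


# Hypothetical rooted tree for a single site.
print(weighted_parsimony_score((("A", "G"), ("C", "T"))))
```

Replacing COST with a matrix of 0 on the diagonal and 1 elsewhere gives back the plain count of changes.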

Camin-Sokal parsimony. This method assumes that 0 is the ancestral state, so we can only observe 0->1 changes and never a reversal (1->0). Computation is easy: the parent node of a node in state 0 is always in state 0, and the total number of changes is the number of 0->1 changes. Example: small deletions in DNA (0 = no deletion). We assume that a deletion cannot revert to the original sequence.

Camin-Sokal parsimony. This method is directional: the root position influences the score. This parsimony is rarely used today.
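The counting rule can be sketched in a few lines. This assumes a rooted tree written as nested tuples with 0/1 leaves, and it assumes the ancestor above the root is in state 0; both the encoding and the function name are hypothetical. An internal node is assigned 1 only when all of its children are 1, otherwise it must be 0 and every child in state 1 accounts for one 0->1 change.

```python
def camin_sokal_changes(tree):
    """Count 0->1 changes on a rooted tree; the ancestor above the root is assumed to be 0."""

    def walk(node):
        # Returns (state assigned to node, number of 0->1 changes below it).
        if isinstance(node, int):
            return node, 0
        results = [walk(child) for child in node]
        states = [state for state, _ in results]
        changes = sum(count for _, count in results)
        if all(state == 1 for state in states):
            return 1, changes  # the single gain can be placed above this node
        # Some child is 0, so this node is 0; each child in state 1 gained it.
        return 0, changes + sum(states)

    state, changes = walk(tree)
    # If the root node itself is 1, one gain lies on the branch leading to it.
    return changes + state


# The same unrooted tree rooted in two places gives two different scores.
print(camin_sokal_changes(((1, 1), (0, 1))))
print(camin_sokal_changes((0, ((1, 1), 1))))
```

The two calls use the same leaf states and the same unrooted topology, which illustrates why the position of the root matters for this method.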

Ordinal scale. Used when 0 can change to 1 but not directly to 2, and so on (0 -> 1 -> 2), or when the states lie on a linear continuum, so that the distance between states 0.45 and 0.99 is abs(0.45 - 0.99) = 0.54. For example, this can be used to build a phylogeny based on finger length. Algorithm: very similar to Sankoff's.

Dollo parsimony. Dollo's law states that a complex character, once attained, cannot be attained in that form again. In 0/1 terms, if 0 is the ancestral state and 1 the complex state, 1 can evolve from 0 only once, but 1 can revert to state 0. Like Camin-Sokal parsimony, this is a directional method: the position of the root is important. This method was used to infer phylogenies from restriction enzyme sites.
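A minimal sketch of the Dollo count for one binary character follows, under the same assumed encoding as before (nested tuples, 0/1 leaves, ancestor above the root in state 0); the function name is hypothetical. The single gain is placed on the branch leading to the most recent common ancestor of all leaves in state 1, and each subtree below that ancestor that contains no 1 costs one loss.

```python
def dollo_changes(tree):
    """Return (gains, losses) for one 0/1 character under Dollo parsimony."""

    def has_one(node):
        if isinstance(node, int):
            return node == 1
        return any(has_one(child) for child in node)

    if not has_one(tree):
        return 0, 0  # the complex state never arose

    # Descend to the most recent common ancestor of all leaves in state 1.
    node = tree
    while not isinstance(node, int):
        with_one = [child for child in node if has_one(child)]
        if len(with_one) != 1:
            break
        node = with_one[0]

    def losses(current):
        # current is in state 1; a child subtree without any 1 is a single loss.
        if isinstance(current, int):
            return 0
        return sum(losses(child) if has_one(child) else 1 for child in current)

    return 1, losses(node)


print(dollo_changes(((1, 0), (1, 0))))
```

For the leaf pattern ((1, 0), (1, 0)) this gives one gain and two losses, whereas a method that forbids reversals would need two independent gains.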

Some additional remarks regarding MP

Monophyletic groups. [Figure: unrooted tree with leaves Human, Chimp, Gorilla, Chicken, Rat.] When an unrooted tree is given, you cannot know which groups are monophyletic; you can only say which are not. For example, Chicken + Rat might be monophyletic if the root were between Chicken + Rat and the rest. In fact, the real root of the tree is between Chicken and the rest, hence Chicken and Rat are not monophyletic. But Human and Gorilla are not monophyletic no matter where the root is.

Homoplasy. We have 6 characters. For each character, both 0 and 1 are present among the species, so the minimum number of changes is 6 (each character must change at least once). The reason we have more than 6 changes is that some character states have arisen more than once. This is called homoplasy.
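The amount of homoplasy can be illustrated by counting changes per character with Fitch's algorithm and comparing the total with the minimum of one change per character. This is a sketch only: the tree topology and the 0/1 matrix below are hypothetical stand-ins, not the data from the slides, and Fitch's algorithm is assumed here as the way to count changes.

```python
def fitch_changes(tree, states):
    """Minimum number of changes for one character on a rooted binary tree (Fitch)."""
    changes = 0

    def walk(node):
        nonlocal changes
        if isinstance(node, str):  # leaf: look up the observed state
            return {states[node]}
        left, right = (walk(child) for child in node)
        if left & right:
            return left & right
        changes += 1  # the child sets disagree, so one change is needed here
        return left | right

    walk(tree)
    return changes


# Hypothetical topology and hypothetical gene presence/absence data.
tree = ((("s1", "s2"), "s3"), ("s4", "s5"))
matrix = {"s1": "110010", "s2": "100110", "s3": "011001",
          "s4": "001101", "s5": "010011"}

per_character = [
    fitch_changes(tree, {species: row[i] for species, row in matrix.items()})
    for i in range(6)
]
minimum = 6  # each of the 6 characters shows both states, so at least one change each
extra = sum(per_character) - minimum
print(per_character, sum(per_character), extra)
```

Every change counted beyond the minimum reflects a state that arose more than once on this tree, which is the homoplasy described above.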