Tree Reconstruction Basic Principles of Phylogenetics Distance Parsimony Compatibility Inconsistency Likelihood.

Slides:



Advertisements
Similar presentations
Phylogenies and the Tree of Life
Advertisements

Phylogenetic Tree A Phylogeny (Phylogenetic tree) or Evolutionary tree represents the evolutionary relationships among a set of organisms or groups of.
. Phylogenetic Trees (2) Lecture 13 Based on: Durbin et al 7.4, Gusfield , Setubal&Meidanis 6.1.
. Class 9: Phylogenetic Trees. The Tree of Life Evolution u Many theories of evolution u Basic idea: l speciation events lead to creation of different.
Computing a tree Genome 559: Introduction to Statistical and Computational Genomics Prof. James H. Thomas.
BALANCED MINIMUM EVOLUTION. DISTANCE BASED PHYLOGENETIC RECONSTRUCTION 1. Compute distance matrix D. 2. Find binary tree using just D. Balanced Minimum.
Parsimony based phylogenetic trees Sushmita Roy BMI/CS 576 Sep 30 th, 2014.
Molecular Evolution and Phylogenetic Tree Reconstruction
1 General Phylogenetics Points that will be covered in this presentation Tree TerminologyTree Terminology General Points About Phylogenetic TreesGeneral.
Phylogenetics - Distance-Based Methods CIS 667 March 11, 2204.
Phylogenetic reconstruction
Molecular Evolution Revised 29/12/06
Tree Reconstruction.
BIOE 109 Summer 2009 Lecture 4- Part II Phylogenetic Inference.
. Phylogeny II : Parsimony, ML, SEMPHY. Phylogenetic Tree u Topology: bifurcating Leaves - 1…N Internal nodes N+1…2N-2 leaf branch internal node.
. Maximum Likelihood (ML) Parameter Estimation with applications to inferring phylogenetic trees Comput. Genomics, lecture 7a Presentation partially taken.
Phylogeny Tree Reconstruction
Maximum Likelihood. Historically the newest method. Popularized by Joseph Felsenstein, Seattle, Washington. Its slow uptake by the scientific community.
Building phylogenetic trees Jurgen Mourik & Richard Vogelaars Utrecht University.
BME 130 – Genomes Lecture 26 Molecular phylogenies I.
Realistic evolutionary models Marjolijn Elsinga & Lars Hemel.
. Class 9: Phylogenetic Trees. The Tree of Life D’après Ernst Haeckel, 1891.
Lecture 24 Inferring molecular phylogeny Distance methods
Molecular Evolution: Plan for week Monday 3.11: Basics of Molecular Evolution Lecture 1: Molecular Basis and Models I (JH) Computer :
Phylogeny Tree Reconstruction
Probabilistic methods for phylogenetic trees (Part 2)
Building Phylogenies Parsimony 1. Methods Distance-based Parsimony Maximum likelihood.
Building Phylogenies Distance-Based Methods. Methods Distance-based Parsimony Maximum likelihood.
. Class 9: Phylogenetic Trees. The Tree of Life D’après Ernst Haeckel, 1891.
Multiple Sequence Alignments and Phylogeny.  Within a protein sequence, some regions will be more conserved than others. As more conserved,
Phylogenetic Analysis. 2 Introduction Intension –Using powerful algorithms to reconstruct the evolutionary history of all know organisms. Phylogenetic.
Terminology of phylogenetic trees
P HYLOGENETIC T REE. OVERVIEW Phylogenetic Tree Phylogeny Applications Types of phylogenetic tree Terminology Data used to build a tree Building phylogenetic.
1 Dan Graur Molecular Phylogenetics Molecular phylogenetic approaches: 1. distance-matrix (based on distance measures) 2. character-state.
PHYLOGENETIC TREES Dwyane George February 24,
Phylogenetic Analysis. General comments on phylogenetics Phylogenetics is the branch of biology that deals with evolutionary relatedness Uses some measure.
Molecular phylogenetics 1 Level 3 Molecular Evolution and Bioinformatics Jim Provan Page and Holmes: Sections
BINF6201/8201 Molecular phylogenetic methods
Bioinformatics 2011 Molecular Evolution Revised 29/12/06.
Parsimony-Based Approaches to Inferring Phylogenetic Trees BMI/CS 576 Colin Dewey Fall 2010.
Molecular phylogenetics 4 Level 3 Molecular Evolution and Bioinformatics Jim Provan Page and Holmes: Sections
Phylogenetic Trees  Importance of phylogenetic trees  What is the phylogenetic analysis  Example of cladistics  Assumptions in cladistics  Frequently.
Building phylogenetic trees. Contents Phylogeny Phylogenetic trees How to make a phylogenetic tree from pairwise distances  UPGMA method (+ an example)
Calculating branch lengths from distances. ABC A B C----- a b c.
Using traveling salesman problem algorithms for evolutionary tree construction Chantal Korostensky and Gaston H. Gonnet Presentation by: Ben Snider.
Evolutionary tree reconstruction
More statistical stuff CS 394C Feb 6, Today Review of material from Jan 31 Calculating pattern probabilities Why maximum parsimony and UPGMA are.
Parsimony-Based Approaches to Inferring Phylogenetic Trees BMI/CS 576 Colin Dewey Fall 2015.
Ayesha M.Khan Spring Phylogenetic Basics 2 One central field in biology is to infer the relation between species. Do they possess a common ancestor?
1 CAP5510 – Bioinformatics Phylogeny Tamer Kahveci CISE Department University of Florida.
Mareike Fischer How many characters are needed to reconstruct the true tree? Mareike Fischer and Mike Steel Future Directions in Phylogenetic Methods and.
Probabilistic methods for phylogenetic tree reconstruction BMI/CS 576 Colin Dewey Fall 2015.
Probabilistic Approaches to Phylogenies BMI/CS 576 Sushmita Roy Oct 2 nd, 2014.
Statistical stuff: models, methods, and performance issues CS 394C September 3, 2009.
Distance-based methods for phylogenetic tree reconstruction Colin Dewey BMI/CS 576 Fall 2015.
CSCE555 Bioinformatics Lecture 13 Phylogenetics II Meeting: MW 4:00PM-5:15PM SWGN2A21 Instructor: Dr. Jianjun Hu Course page:
Building Phylogenies. Phylogenetic (evolutionary) trees Human Gorilla Chimp Gibbon Orangutan Describe evolutionary relationships between species Cannot.
Recombination and Pedigrees Genealogies and Recombination: The ARG Recombination Parsimony The ARG and Data Pedigrees: Models and Data Pedigrees & ARGs.
Chapter AGB. Today’s material Maximum Parsimony Fixed tree versions (solvable in polynomial time using dynamic programming) Optimal tree search.
Phylogenetic basis of systematics
Distance based phylogenetics
Inferring a phylogeny is an estimation procedure.
Multiple Alignment and Phylogenetic Trees
Goals of Phylogenetic Analysis
Patterns in Evolution I. Phylogenetic
Inferring phylogenetic trees: Distance and maximum likelihood methods
CS 581 Tandy Warnow.
Lecture 36 Section 12.2 Mon, Apr 23, 2007
Phylogeny.
But what if there is a large amount of homoplasy in the data?
Presentation transcript:

Tree Reconstruction Basic Principles of Phylogenetics Distance Parsimony Compatibility Inconsistency Likelihood

Central Principles of Phylogeny Reconstruction TTCAGT TCCAGT GCCAAT Parsimony s2 s1 s4 s Total Weight: 3 s2 s1 s4 s Distance s2 s1 s4 s3 L=3.1*10 -7 Parameter estimates Likelihood

From Distance to Phylogenies What is the relationship of a, b, c, d & e? a c b d e a c b a c bde a b c d e a b c d e Molecular clock No Molecular clock b e 14

UGPMA Unweighted Group Pairs Method using Arithmetic Averages From Molecular Systematics p486 A B C D E A B C D 4289 E AB C D E AB C D 4289 E ABE C D ABE C 2795 D ABE CD ABE 3310 CD A B 857 A B E 1096 A B 857 E 1096 D C 1347 A B 857 E 1096 D C UGPMA can fail: A and B are siblings, but A and C are closest Siblings will have [d(A,?)+d(B,?)-d(A,B)]/2 maximal. A B C ?

Assignment to internal nodes: The simple way. C A C C A C T G ? ? ? ? ? ? What is the cheapest assignment of nucleotides to internal nodes, given some (symmetric) distance function d(N 1,N 2 )?? If there are k leaves, there are k-2 internal nodes and 4 k-2 possible assignments of nucleotides. For k=22, this is more than

5S RNA Alignment & Phylogeny Hein, tatt-ctggtgtcccaggcgtagaggaaccacaccgatccatctcgaacttggtggtgaaactctgccgcggt--aaccaatact-cg-gg-gggggccct-gcggaaaaatagctcgatgccagga--ta 17 t--t-ctggtgtcccaggcgtagaggaaccacaccaatccatcccgaacttggtggtgaaactctgctgcggt--ga-cgatact-tg-gg-gggagcccg-atggaaaaatagctcgatgccagga--t- 9 t--t-ctggtgtctcaggcgtggaggaaccacaccaatccatcccgaacttggtggtgaaactctattgcggt--ga-cgatactgta-gg-ggaagcccg-atggaaaaatagctcgacgccagga--t- 14 t----ctggtggccatggcgtagaggaaacaccccatcccataccgaactcggcagttaagctctgctgcgcc--ga-tggtact-tg-gg-gggagcccg-ctgggaaaataggacgctgccag-a--t- 3 t----ctggtgatgatggcggaggggacacacccgttcccataccgaacacggccgttaagccctccagcgcc--aa-tggtact-tgctc-cgcagggag-ccgggagagtaggacgtcgccag-g--c- 11 t----ctggtggcgatggcgaagaggacacacccgttcccataccgaacacggcagttaagctctccagcgcc--ga-tggtact-tg-gg-ggcagtccg-ctgggagagtaggacgctgccag-g--c- 4 t----ctggtggcgatagcgagaaggtcacacccgttcccataccgaacacggaagttaagcttctcagcgcc--ga-tggtagt-ta-gg-ggctgtccc-ctgtgagagtaggacgctgccag-g--c- 15 g----cctgcggccatagcaccgtgaaagcaccccatcccat-ccgaactcggcagttaagcacggttgcgcccaga-tagtact-tg-ggtgggagaccgcctgggaaacctggatgctgcaag-c--t- 8 g----cctacggccatcccaccctggtaacgcccgatctcgt-ctgatctcggaagctaagcagggtcgggcctggt-tagtact-tg-gatgggagacctcctgggaataccgggtgctgtagg-ct-t- 12 g----cctacggccataccaccctgaaagcaccccatcccgt-ccgatctgggaagttaagcagggttgagcccagt-tagtact-tg-gatgggagaccgcctgggaatcctgggtgctgtagg-c--t- 7 g----cttacgaccatatcacgttgaatgcacgccatcccgt-ccgatctggcaagttaagcaacgttgagtccagt-tagtact-tg-gatcggagacggcctgggaatcctggatgttgtaag-c--t- 16 g----cctacggccatagcaccctgaaagcaccccatcccgt-ccgatctgggaagttaagcagggttgcgcccagt-tagtact-tg-ggtgggagaccgcctgggaatcctgggtgctgtagg-c--t- 1 a----tccacggccataggactctgaaagcactgcatcccgt-ccgatctgcaaagttaaccagagtaccgcccagt-tagtacc-ac-ggtgggggaccacgcgggaatcctgggtgctgt-gg-t--t- 18 a----tccacggccataggactctgaaagcaccgcatcccgt-ccgatctgcgaagttaaacagagtaccgcccagt-tagtacc-ac-ggtgggggaccacatgggaatcctgggtgctgt-gg-t--t- 2 a----tccacggccataggactgtgaaagcaccgcatcccgt-ctgatctgcgcagttaaacacagtgccgcctagt-tagtacc-at-ggtgggggaccacatgggaatcctgggtgctgt-gg-t--t- 5 g---tggtgcggtcataccagcgctaatgcaccggatcccat-cagaactccgcagttaagcgcgcttgggccagaa-cagtact-gg-gatgggtgacctcccgggaagtcctggtgccgcacc-c--c- 13 g----ggtgcggtcataccagcgttaatgcaccggatcccat-cagaactccgcagttaagcgcgcttgggccagcc-tagtact-ag-gatgggtgacctcctgggaagtcctgatgctgcacc-c--t- 6 g----ggtgcgatcataccagcgttaatgcaccggatcccat-cagaactccgcagttaagcgcgcttgggttggag-tagtact-ag-gatgggtgacctcctgggaagtcctaatattgcacc-c-tt Transitions 2, transversions 5 Total weight 843.

Cost of a history - minimizing over internal states A C G T d(C,G) +w C (left subtree)

Cost of a history – leaves (initialisation). A C G T G A Empty Cost 0 Empty Cost 0 Initialisation: leaves Cost(N)= 0 if N is at leaf, otherwise infinity

Compatibility and Branch Popping A GCACGTGCAGTTAGGA B GCACGTGCAGTTAGGA C TCTCGTGCAGTTAGGA D TCTCATGCAATTAGGA E TCTCATGCAATTATGA F TCTCATGCAATTATGA EFG ABC A GCACGTGCAGTTAGGA B GCACGTGCAGTTAGGA C TCTCGTGCAGTTAGGA D TCTCATGCAATTAGGA E TCTCATGCAATTATGA F TCTCATGCAATTATGA E ABC FG A GCACGTGCAGTTAGGA B GCACGTGCAGTTAGGA C TCTCGTGCAGTTAGGA D TCTCATGCAATTAGGA E TCTCATGCAATTATGA F TCTCATGCAATTATGA E C FG AB Definition: Two columns can be placed on the same tree – each explained by 1 mutation. This is equivalent to: In the two columns only 3 or the 4 possible character pairs are observed Multistate Definition: The number of mutations needed to explain a pair of columns is the sum of the mutations needed to explain the individual columns ? ? ? ? ? 2 + ? ? ? ? 3 + ? ? ? 4 + ? ? 5 + ? 6 + For imperfect data: Find the maximal compatible set of characters and then branch- pop

The Felsenstein Zone Felsenstein-Cavendar (1979) Patterns:(16 only 8 shown) s4 s3 s2 s1 True Tree s3 s1 s2 s4 Reconstructed Tree

Hadamard Conjugation & binary characters on a tree Closely related to inclusion-exclusion principle and Sieve Methods H1=H1= Hk=Hk= H k-1 H k-1 -H k-1 From branch lengths to bipartitions q=Hs From bipartition to lengths s=H -1 q Branch lengths – s, Bipartition lengths - q A B C D E True Tree with Clock A B C D E More Likely Tree Inconsistency in presence of a Clock: Felsenstein (2004) Inferring Phylogenies p 118

Bootstrapping Felsenstein (1985) ATCTGTAGTC T ATCTGTAGTCT 1 2 ?????????? ??????????