Phylogenetic basis of systematics

Presentation transcript:

Phylogenetic basis of systematics Linnaeus: Ordering principle is God. Darwin: Ordering principle is shared descent from common ancestors. Today, systematics is explicitly based on phylogeny.

Goals of Phylogenetic Analysis Given a multiple sequence alignment, determine the ancestral relationships among the species. We assume that residues in a column are homologous, and that all columns have the same history. [Figure: tree drawn against a time axis, with leaves labeled Hu, Ch, Go, Gi]

Types of Phylogenetic Trees: 1. Cladogram: shows the relationships between different organisms; branch lengths are arbitrary. 2. Phylogram: branch lengths represent evolutionary time and amount of change.

Data
Biomolecular sequences: DNA, RNA, amino acid, in a multiple alignment
Molecular markers (e.g., SNPs)
Morphology
Gene order and content
These are “character data”: each character is a function mapping the set of taxa to distinct states (equivalence classes), with evolution modelled as a process that changes the state of a character.
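
As a minimal sketch of the "character as a function" idea, each column of an alignment can be treated as one character that maps every taxon to a state, and the taxa sharing a state form one equivalence class. The taxon names `t1` to `t4`, the toy sequences, and the helper names `characters` and `equivalence_classes` are illustrative assumptions, not part of the slides.

```python
from collections import defaultdict

# Hypothetical toy alignment: taxon names and sequences are illustrative only.
alignment = {
    "t1": "AAG",
    "t2": "GGA",
    "t3": "AAA",
    "t4": "AGA",
}

def characters(aln):
    """Return one dict per alignment column, mapping each taxon to its state."""
    length = len(next(iter(aln.values())))
    return [{taxon: seq[i] for taxon, seq in aln.items()} for i in range(length)]

def equivalence_classes(character):
    """Group the taxa that share the same state for one character."""
    classes = defaultdict(set)
    for taxon, state in character.items():
        classes[state].add(taxon)
    return dict(classes)

chars = characters(alignment)
for index, character in enumerate(chars):
    print(index, equivalence_classes(character))
```

Morphological or gene-order characters would fit the same shape, with states other than nucleotides.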

DNA Sequence Evolution [Figure: DNA sequences evolving down a tree along a time axis marked -3 mil yrs, -2 mil yrs, -1 mil yrs, today. Sequences shown: AAGACTT, TGGACTT, AAGGCCT, AGGGCAT, TAGCCCT, AGCACTT, AGCGCTT, AGCACAA, TAGACTT, TAGCCCA]

Phylogenetic Analyses Step 1: Gather sequence data, and estimate the multiple alignment of the sequences. Step 2: Reconstruct trees on the data. (This can result in many trees.) Step 3: Apply consensus methods to the set of trees to figure out what is reliable.
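
Step 3 can be sketched as follows. The slide does not say which consensus method is meant, so this example assumes a majority-rule consensus over rooted trees: it counts how often each clade (a set of leaves below some internal node) appears and keeps the clades found in more than half of the trees. The trees, the taxon labels `A` to `D`, and the function names are hypothetical.

```python
from collections import Counter

def clades(tree):
    """Return (leaf set of `tree`, set of all multi-leaf clades inside it).

    A tree is either a leaf label (str) or a tuple of subtrees.
    """
    if not isinstance(tree, tuple):
        return frozenset([tree]), set()
    leaves, found = frozenset(), set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves = leaves | child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found

def majority_clades(trees, threshold=0.5):
    """Map each clade seen in more than `threshold` of the trees to its frequency."""
    counts = Counter()
    for tree in trees:
        all_leaves, found = clades(tree)
        # The clade containing every leaf is trivial, so skip it.
        counts.update(c for c in found if c != all_leaves)
    return {c: n / len(trees) for c, n in counts.items() if n / len(trees) > threshold}

# Three hypothetical rooted trees on the same four taxa.
trees = [
    (("A", "B"), ("C", "D")),
    ((("A", "B"), "C"), "D"),
    ((("A", "C"), "B"), "D"),
]
result = majority_clades(trees)
for clade, freq in result.items():
    print(sorted(clade), round(freq, 2))
```

Clades that fall below the threshold are the ones a consensus tree would collapse.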

Phylogeny Problem [Figure: five input sequences AGGGCAT, TAGCCCA, TAGACTT, TGCACAA, TGCGCTT and a tree whose leaves are labeled U, V, W, X, Y]

Types of Phylogenetic Methods
Character-based (Parsimony, Likelihood): involve optimizing a criterion based on fit of the residues to the tree.
Distance-based (Neighbor joining (NJ), UPGMA): involve optimizing a criterion based on fit of a matrix of pairwise distances to the tree.

Parsimony: select the tree that explains the data with the fewest substitutions.
Likelihood: select the tree that has the highest probability of producing the observed data.
Distance: select the tree that best recreates the observed pairwise distances.
http://study.com/academy/lesson/maximum-parsimony-likelihood-methods-in-phylogeny.html
https://www.youtube.com/watch?v=NRRErwFsIcw

Phylogenetic Tree Building Two basic types: Gene/protein tree: represents the evolutionary history of genes/proteins. Species tree: represents the evolutionary history of species based on characters (like protein sequences). [Figure: a rooted binary tree and an unrooted binary tree] * Can root a tree using an outgroup: a known distant relative.

[Figure: rooted tree with its parts labeled: root (ancestral species), nodes (common ancestor), edges, leaves (modern observations); branch lengths (“distance”) ~ time] Why is the structure of the tree important? Branching represents speciation into two new species.

[Figure: rooted tree with leaves numbered 1 to 8; root (ancestral species); branch lengths (“distance”) ~ time] This tree can also be denoted in text format: (((((3,4),(5,6)),7),(1,2)),8)
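
The nested-parenthesis notation resembles the Newick format used by most tree software. As a rough sketch, a string of this kind can be read into nested tuples with a small recursive parser. This example assumes the slide's string is wrapped in one more pair of parentheses so that leaf 8 joins at the root, and it handles only labels and parentheses (no branch lengths, no internal node names). The function name `parse_newick` is illustrative.

```python
def parse_newick(text):
    """Parse a bare Newick-style string into nested tuples of leaf labels."""
    s = text.replace(" ", "").rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        if pos < len(s) and s[pos] == "(":
            pos += 1  # consume "("
            children = [node()]
            while pos < len(s) and s[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            if pos >= len(s) or s[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1  # consume ")"
            return tuple(children)
        # Otherwise read a leaf label up to the next delimiter.
        start = pos
        while pos < len(s) and s[pos] not in ",()":
            pos += 1
        return s[start:pos]

    tree = node()
    if pos != len(s):
        raise ValueError("unexpected trailing characters")
    return tree

def leaves(tree):
    """List the leaf labels of a nested-tuple tree from left to right."""
    if not isinstance(tree, tuple):
        return [tree]
    return [label for child in tree for label in leaves(child)]

tree = parse_newick("(((((3,4),(5,6)),7),(1,2)),8)")
print(tree)
print(leaves(tree))
```

Keeping the tree as nested tuples makes it easy to pass to other small routines, such as a parsimony scorer.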

Building phylogenetic trees
Distance based methods
a. Calculate evolutionary distances between sequences
b. Build a tree based on those distances
Maximum Parsimony (character based method)
a. Find the simplest tree: the one that explains the data with the fewest substitutions
Maximum Likelihood (probabilistic method based on explicit model)
a. Find the tree that is most likely, given an evolutionary model
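
Step a of the distance-based approach can be illustrated with the simplest possible distance, the proportion of aligned sites that differ (often called the p-distance). The sketch below uses the five sequences from the Phylogeny Problem slide; pairing them with the labels U to Y in the order listed is an assumption. Real analyses usually correct such raw distances with a substitution model, which this example does not attempt, and step b (building the tree, for example with NJ or UPGMA) is not shown.

```python
from itertools import combinations

def p_distance(s, t):
    """Proportion of aligned positions at which two sequences differ."""
    if len(s) != len(t):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(s, t)) / len(s)

def distance_matrix(seqs):
    """Return a symmetric dict mapping (name, name) to p-distance."""
    names = list(seqs)
    d = {(n, n): 0.0 for n in names}
    for x, y in combinations(names, 2):
        d[x, y] = d[y, x] = p_distance(seqs[x], seqs[y])
    return d

# Label-to-sequence assignment is assumed from the order on the slide.
seqs = {
    "U": "AGGGCAT",
    "V": "TAGCCCA",
    "W": "TAGACTT",
    "X": "TGCACAA",
    "Y": "TGCGCTT",
}
d = distance_matrix(seqs)
for x, y in combinations(seqs, 2):
    print(x, y, round(d[x, y], 3))
```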

Building phylogenetic trees Distance based methods Maximum Parsimony (character based method) Search all possible trees and find the one requiring the fewest substitutions. [Figure: four sequences to be placed on a tree: a = AAG, b = GGA, c = AAA, d = AGA]

Building phylogenetic trees Distance based methods Maximum Parsimony (character based method) Search all possible trees and find the one requiring the fewest substitutions. [Figure: candidate tree pairing a (AAG) with c (AAA) and b (GGA) with d (AGA)] What are the ancestral sequences at each node? How many base changes are required for this tree?

Building phylogenetic trees Distance based methods Maximum Parsimony (character based method) Search all possible trees and find the one requiring the fewest substitutions. [Figure: the same tree with ancestral sequences filled in: AAA above a (AAG) and c (AAA), AGA above b (GGA) and d (AGA), and AAA or AGA at the root] What are the ancestral sequences at each node? How many base changes are required for this tree? 3 changes are required.

Building phylogenetic trees Distance based methods Maximum Parsimony (character based method) Search all possible trees and find the one requiring the fewest substitutions. [Figure: the same tree with ancestral sequences filled in: AAA above a (AAG) and c (AAA), AGA above b (GGA) and d (AGA), and AAA or AGA at the root] The score of the tree is the number of character changes. MP aims to minimize the score of the tree.
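
The score of a given tree can be computed without guessing ancestral sequences by hand. The sketch below uses Fitch's algorithm, a standard way to count the minimum number of changes on a fixed binary tree; the slides do not name an algorithm, so this choice is an assumption. It takes the leaf sequences as a = AAG, b = GGA, c = AAA, d = AGA, as read off the slide, and scores the three possible groupings of four taxa. The function and variable names are illustrative.

```python
def fitch_site(tree, states):
    """Return (candidate ancestral states, minimum changes) for one site.

    A tree is a leaf name (str) or a pair (left, right) of subtrees.
    """
    if not isinstance(tree, tuple):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # The children can agree on a state, so no change is needed here.
        return common, left_cost + right_cost
    # The children disagree: one change is required on this pair of branches.
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_score(tree, seqs):
    """Sum the per-site minimum change counts over all alignment columns."""
    length = len(next(iter(seqs.values())))
    return sum(
        fitch_site(tree, {name: seq[i] for name, seq in seqs.items()})[1]
        for i in range(length)
    )

seqs = {"a": "AAG", "b": "GGA", "c": "AAA", "d": "AGA"}
topologies = {
    "((a,c),(b,d))": (("a", "c"), ("b", "d")),
    "((a,b),(c,d))": (("a", "b"), ("c", "d")),
    "((a,d),(b,c))": (("a", "d"), ("b", "c")),
}
scores = {name: parsimony_score(tree, seqs) for name, tree in topologies.items()}
print(scores)  # {'((a,c),(b,d))': 3, '((a,b),(c,d))': 4, '((a,d),(b,c))': 4}
```

The grouping of a with c and b with d needs 3 changes, matching the slide, while the two alternatives need 4, so maximum parsimony prefers it.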

How can you tell if your tree is significant?
Bootstrapping: how dependent is the tree on the dataset?
1. Randomly choose n objects from your dataset of n, with replacement
2. Rebuild the tree based on the resampled data
3. Repeat 1,000 to 10,000 times
4. How often are the same children joined? If a given node is represented in <x trials, collapse the node for a ‘consensus’ tree
Jackknifing: how dependent is the tree on the dataset?
1. Randomly choose k objects from your dataset of n, without replacement
2. Rebuild the tree based on the subset of the data
3. Repeat 1,000 to 10,000 times
4. How often are the same children joined?
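
Step 1 of each procedure can be sketched as follows, assuming the "objects" being resampled are the columns of the alignment, which is the usual choice for sequence data. The sequences are the five from the Phylogeny Problem slide, with labels assigned in the order listed (an assumption). The function names are illustrative, and rebuilding the tree from each replicate is left out.

```python
import random

def bootstrap_columns(aln, rng):
    """Draw n columns with replacement from an alignment of n columns."""
    n = len(next(iter(aln.values())))
    cols = [rng.randrange(n) for _ in range(n)]
    return {name: "".join(seq[i] for i in cols) for name, seq in aln.items()}

def jackknife_columns(aln, k, rng):
    """Draw k distinct columns (without replacement), kept in original order."""
    n = len(next(iter(aln.values())))
    cols = sorted(rng.sample(range(n), k))
    return {name: "".join(seq[i] for i in cols) for name, seq in aln.items()}

aln = {
    "U": "AGGGCAT",
    "V": "TAGCCCA",
    "W": "TAGACTT",
    "X": "TGCACAA",
    "Y": "TGCGCTT",
}
rng = random.Random(0)  # fixed seed so the run is repeatable
boot = bootstrap_columns(aln, rng)
jack = jackknife_columns(aln, 4, rng)
print(boot)
print(jack)
```

Each replicate would then be passed to the same tree-building method, and the fraction of replicates in which a group of leaves stays together gives the support value for that node.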

How can you tell if your tree is significant? [Figure: tree with support values at its nodes: 70, 100, 80, 95, 100]

Maximum Likelihood tree showing Bayesian Inference/Maximum Parsimony/Maximum Likelihood support value at each node