Phylogenetic Analysis Gabor T. Marth Department of Biology, Boston College BI420 – Introduction to Bioinformatics Figures from Higgs & Attwood.

The goals of phylogenetics
To understand the evolutionary relationships among species, e.g.
- the order in which they diverged
- the time since divergence

The assumptions in phylogenetics
1. Any group of organisms is related by descent from a common ancestor
2. The relationships between organisms are described by a bifurcating tree
3. Change in characteristics between organisms occurs over time

Phylogenetic “objects”
Figure: phylogenetic tree with labeled taxon, clade, node, and branch

Constructing an evolutionary tree
Step 1. Selection of appropriate sequences
Step 2. Construction of multiple sequence alignment
Step 3. Calculation of pair-wise evolutionary distances
Step 4. Tree construction
Step 5. Tree evaluation

1. Sequence selection
- find sequences with an appropriate amount of divergence: there can be too little or too much divergence (e.g. genes identical across taxa, or non-conserved genomic sequence)
- try to select orthologous sequences, so that the genes used for tree construction are likely to have preserved functions

2. Multiple alignment
Figure: multiple alignment of the mitochondrial small subunit RNA gene
- informative sites
- alignment editing
- the mechanics of multiple alignment construction were covered in earlier classes in the course

3. Pair-wise distance
Sequence divergence D measures how diverged two sequences are:
ACGCGTTATTACAGTTGACT
ACACGTTATGACAGTTGACT
2 differences in 20 bp → D = 2/20 = 0.1 (10% divergence)
Evolutionary distance d measures how evolutionarily distant two sequences are:
Jukes-Cantor (JC): d = -3/4 ln(1 - 4D/3)
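The two distance measures on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names `p_distance` and `jukes_cantor` are made up here, and the code assumes the two sequences are already aligned, gap-free, and of equal length.

```python
import math

def p_distance(seq_a, seq_b):
    """Fraction of aligned positions at which the two sequences differ (D)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)

def jukes_cantor(p):
    """Jukes-Cantor distance d = -3/4 * ln(1 - 4/3 * D)."""
    if p >= 0.75:
        # The logarithm is undefined once D reaches 3/4 (saturation).
        raise ValueError("JC distance is undefined for D >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# The two sequences from the slide.
s1 = "ACGCGTTATTACAGTTGACT"
s2 = "ACACGTTATGACAGTTGACT"
D = p_distance(s1, s2)
d = jukes_cantor(D)
print(f"D = {D:.3f}, d = {d:.4f}")  # D = 0.100, d = 0.1073
```

For small D the JC correction is slight (0.1 becomes about 0.107); it grows as D approaches 0.75, where repeated substitutions at the same site hide most of the real change.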

Pair-wise distances
Figure: pair-wise JC distance matrix

More complex substitution models
Substitutions between less similar residues indicate more divergence than substitutions between more similar residues (e.g. hydrophobic vs. hydrophilic).
Example nucleotide substitution costs:
    A  C  G  T
A   -  2  1  2
C   2  -  2  1
G   1  2  -  2
T   2  1  2  -
ACGCGTTATTACAGTTGACT
ACACGTTATGACAGTTGACT
A/G (1) + T/G (2) → diff = 3
For proteins: amino acid substitution matrices (e.g. PAM, BLOSUM)
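The weighted count above can be reproduced with a short Python sketch. The cost rule is read off the slide's matrix (1 for A/G or C/T changes, 2 for every other change); the function names are illustrative, and treating any non-ACGT character as a cost-2 mismatch is an assumption of this sketch.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def substitution_cost(a, b):
    """Cost 0 for a match, 1 within purines or within pyrimidines, 2 otherwise."""
    if a == b:
        return 0
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return 1 if same_class else 2

def weighted_difference(seq_a, seq_b):
    """Sum the substitution costs over all aligned positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(substitution_cost(a, b) for a, b in zip(seq_a, seq_b))

diff = weighted_difference("ACGCGTTATTACAGTTGACT", "ACACGTTATGACAGTTGACT")
print(diff)  # 3: one A/G change (cost 1) plus one T/G change (cost 2)
```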

4. Tree construction
- the goal is to group (cluster) sequences in a hierarchical fashion
- each step creates a “node” that represents the common ancestor (CA) of all the species/sequences within the group
Figure labels: CA of group containing (A,B); CA of group containing (A,B,C,D)

UPGMA method for phylogeny construction
UPGMA (unweighted pair-group method with arithmetic mean) is conceptually very simple.
Step #1. Cluster the two nodes with the shortest distance: e.g. if d(C,D) is lower than d(A,B), d(A,C), etc., then group C and D together. CD is now a new “node”.
Step #2. Re-calculate the distance between the new node CD and all other current nodes, e.g.: d(CD, A) = ½ * (d(C,A) + d(D,A)). When a cluster already contains several sequences, the average is taken over all of its members.
Go to Step #1 until every node is clustered into a single group.
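The two steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the course's implementation: the function name `upgma`, the nested-tuple tree representation, and the example distance matrix are all made up for this sketch. Distances to a merged cluster are averaged with weights equal to the cluster sizes, which reduces to the ½ * (d(C,A) + d(D,A)) formula when both merged nodes are single sequences.

```python
from itertools import combinations

def upgma(names, matrix):
    """Cluster taxa with UPGMA.

    names:  list of leaf labels
    matrix: symmetric list-of-lists of pair-wise distances
    Returns (tree, merges): tree is a nested tuple, merges lists
    (left subtree, right subtree, node height) in merge order.
    """
    trees = {i: name for i, name in enumerate(names)}   # cluster id -> subtree
    sizes = {i: 1 for i in trees}                       # cluster id -> leaf count
    dist = {frozenset((i, j)): matrix[i][j] for i, j in combinations(trees, 2)}
    next_id = len(names)
    merges = []
    while len(trees) > 1:
        # Step 1: pick the pair of current clusters with the shortest distance.
        i, j = min(combinations(sorted(trees), 2), key=lambda p: dist[frozenset(p)])
        height = dist[frozenset((i, j))] / 2
        # Step 2: size-weighted average distance from the new cluster to the rest.
        for k in trees:
            if k not in (i, j):
                dist[frozenset((next_id, k))] = (
                    sizes[i] * dist[frozenset((i, k))]
                    + sizes[j] * dist[frozenset((j, k))]
                ) / (sizes[i] + sizes[j])
        merges.append((trees[i], trees[j], height))
        trees[next_id] = (trees.pop(i), trees.pop(j))
        sizes[next_id] = sizes.pop(i) + sizes.pop(j)
        next_id += 1
    (tree,) = trees.values()
    return tree, merges

# Hypothetical distances (not from the slides): C and D are the closest pair.
names = ["A", "B", "C", "D"]
matrix = [
    [0, 4, 8, 8],
    [4, 0, 8, 8],
    [8, 8, 0, 2],
    [8, 8, 2, 0],
]
tree, merges = upgma(names, matrix)
print(tree)  # (('C', 'D'), ('A', 'B'))
```

Each node height is half the distance between the two clusters it joins, which is how UPGMA places the common ancestor midway between them. Ties are broken by taking the first minimal pair, an arbitrary choice in this sketch.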

Example
UPGMA phylogeny from a given distance matrix.
First cluster: Chimp + Pygmy chimp

Example (cont’d)
After performing the complete clustering with UPGMA, we get a rooted tree (shown in the figure).
There are many other tree-building methods (see Higgs & Attwood).

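Branch lengths
- ultra-metricity: every leaf is at the same distance from the root
- additivity: the distance between two leaves equals the sum of the branch lengths on the path connecting them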
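Both properties can be tested directly on a distance matrix. The sketch below uses two standard conditions that are not stated on the slide: the three-point condition for ultra-metricity (in every triple of taxa the two largest distances are equal) and the four-point condition for additivity (in every quartet, the two largest of the three pair-sums are equal). Function names and both example matrices are hypothetical.

```python
from itertools import combinations
import math

def is_ultrametric(matrix, tol=1e-9):
    """Three-point condition: in every triple, the two largest distances are equal."""
    n = len(matrix)
    for i, j, k in combinations(range(n), 3):
        _, mid, top = sorted((matrix[i][j], matrix[i][k], matrix[j][k]))
        if not math.isclose(mid, top, abs_tol=tol):
            return False
    return True

def is_additive(matrix, tol=1e-9):
    """Four-point condition: in every quartet, the two largest pair-sums are equal."""
    n = len(matrix)
    for i, j, k, l in combinations(range(n), 4):
        sums = sorted((
            matrix[i][j] + matrix[k][l],
            matrix[i][k] + matrix[j][l],
            matrix[i][l] + matrix[j][k],
        ))
        if not math.isclose(sums[1], sums[2], abs_tol=tol):
            return False
    return True

# Clock-like distances: ultrametric, and therefore also additive.
clock_like = [
    [0, 4, 8, 8],
    [4, 0, 8, 8],
    [8, 8, 0, 2],
    [8, 8, 2, 0],
]
# Distances from a tree ((A,B),(C,D)) with unequal branch lengths:
# additive, but not ultrametric.
unequal_rates = [
    [0, 4, 4, 4],
    [4, 0, 6, 6],
    [4, 6, 0, 4],
    [4, 6, 4, 0],
]
print(is_ultrametric(clock_like), is_additive(clock_like))        # True True
print(is_ultrametric(unequal_rates), is_additive(unequal_rates))  # False True
```

The second matrix is the case where UPGMA, which assumes equal rates along all lineages, can be misled while a method that only requires additivity is not.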

Rooted vs. un-rooted trees
Figure: tree rooted with an outgroup (rodents)

5. Tree evaluation
Goal: to evaluate the strength of the phylogenetic signal in the data and the robustness of the tree.
Bootstrapping: re-sample the original columns of the alignment with replacement, and produce a random, artificial alignment.
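The re-sampling step can be sketched in Python as below. The function name `bootstrap_alignment` and the toy alignment are illustrative assumptions; the alignment is taken to be a list of equal-length strings, one per sequence.

```python
import random

def bootstrap_alignment(alignment, rng=random):
    """Return an artificial alignment built by sampling columns with replacement.

    The replicate has the same number of sequences and columns as the
    original; some original columns appear several times, others not at all.
    """
    n_cols = len(alignment[0])
    if any(len(row) != n_cols for row in alignment):
        raise ValueError("all sequences must have the same aligned length")
    picked = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(row[c] for c in picked) for row in alignment]

# Toy alignment (hypothetical data), seeded generator for repeatability.
alignment = ["ACGT", "ACGA", "TCGA"]
replicate = bootstrap_alignment(alignment, random.Random(0))
for row in replicate:
    print(row)
```

Whole columns are sampled, so each replicate column keeps the characters of all sequences at one original site. In practice the tree-building procedure is then rerun on many such replicates, typically hundreds.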

Bootstrap support
Report: for each node, the percentage of times the resampled alignments produced the same tree topology (from that node down to the leaves).
Figure labels: strong bootstrap support; weak bootstrap support
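Given the trees built from the resampled alignments, the per-node percentage can be tallied as in the sketch below. It makes a simplifying assumption that is common practice but looser than the wording above: a node counts as recovered when a replicate tree contains a node with the same set of leaves, regardless of how those leaves are arranged beneath it. Trees are rooted nested tuples, and all names and example trees are hypothetical.

```python
def clades(tree):
    """Return the leaf sets (as frozensets) under every internal node of a tree."""
    found = set()

    def leaves(node):
        if not isinstance(node, tuple):        # a leaf label
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found

def bootstrap_support(reference, replicates):
    """Percentage of replicate trees that contain each clade of the reference tree."""
    counts = {clade: 0 for clade in clades(reference)}
    for rep in replicates:
        rep_clades = clades(rep)
        for clade in counts:
            if clade in rep_clades:
                counts[clade] += 1
    return {clade: 100.0 * n / len(replicates) for clade, n in counts.items()}

reference = (("A", "B"), ("C", "D"))
replicates = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    ((("A", "B"), "C"), "D"),
]
support = bootstrap_support(reference, replicates)
print(support[frozenset("AB")])  # 75.0
print(support[frozenset("CD")])  # 50.0
```

Here the (A,B) node is recovered in three of four replicates and the (C,D) node in two of four; the root clade containing all taxa always scores 100%.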