© Wiley Publishing. 2007. All Rights Reserved. Phylogeny.

Slides:



Advertisements
Similar presentations
Computational Molecular Biology Biochem 218 – BioMedical Informatics Doug Brutlag Professor.
Advertisements

Phylogenetic Tree A Phylogeny (Phylogenetic tree) or Evolutionary tree represents the evolutionary relationships among a set of organisms or groups of.
1 Orthologs: Two genes, each from a different species, that descended from a single common ancestral gene Paralogs: Two or more genes, often thought of.
. Class 9: Phylogenetic Trees. The Tree of Life Evolution u Many theories of evolution u Basic idea: l speciation events lead to creation of different.
 Aim in building a phylogenetic tree is to use a knowledge of the characters of organisms to build a tree that reflects the relationships between them.
Lecture 3 Molecular Evolution and Phylogeny. Facts on the molecular basis of life Every life forms is genome based Genomes evolves There are large numbers.
1 General Phylogenetics Points that will be covered in this presentation Tree TerminologyTree Terminology General Points About Phylogenetic TreesGeneral.
Phylogenetic Trees Understand the history and diversity of life. Systematics. –Study of biological diversity in evolutionary context. –Phylogeny is evolutionary.
Phylogenetics - Distance-Based Methods CIS 667 March 11, 2204.
Summer Bioinformatics Workshop 2008 Comparative Genomics and Phylogenetics Chi-Cheng Lin, Ph.D., Professor Department of Computer Science Winona State.
Phylogenetic reconstruction
Phylogenetic trees Sushmita Roy BMI/CS 576 Sep 23 rd, 2014.
Molecular Evolution Revised 29/12/06
Tree Reconstruction.
Finding Orthologous Groups René van der Heijden. What is this lecture about? What is ‘orthology’? Why do we study gene-ancestry/gene-trees (phylogenies)?
Bioinformatics and Phylogenetic Analysis
BME 130 – Genomes Lecture 26 Molecular phylogenies I.
Bas E. Dutilh Phylogenomics Using complete genomes to determine the phylogeny of species.
. Class 9: Phylogenetic Trees. The Tree of Life D’après Ernst Haeckel, 1891.
Building Phylogenies Parsimony 1. Methods Distance-based Parsimony Maximum likelihood.
Phylogenetic Analysis. 2 Phylogenetic Analysis Overview Insight into evolutionary relationships Inferring or estimating these evolutionary relationships.
Bioinformatics tools for phylogeny and visualization
Phylogenetic trees Sushmita Roy BMI/CS 576
Phylogenetic Analysis
Multiple Sequence Alignments and Phylogeny.  Within a protein sequence, some regions will be more conserved than others. As more conserved,
Phylogenetic analyses Kirsi Kostamo. The aim: To construct a visual representation (a tree) to describe the assumed evolution occurring between and among.
Phylogeny Estimation: Traditional and Bayesian Approaches Molecular Evolution, 2003
Alexis Dereeper Homology analysis and molecular phylogeny CIBA courses – Brasil 2011.
Christian M Zmasek, PhD 15 June 2010.
Molecular evidence for endosymbiosis Perform blastp to investigate sequence similarity among domains of life Found yeast nuclear genes exhibit more sequence.
Phylogenetic Analysis Dayong Guo. Introduction Phylogenetics is the study of evolutionary relatedness among various species, populations, or among a set.
Phylogenetics Alexei Drummond. CS Friday quiz: How many rooted binary trees having 20 labeled terminal nodes are there? (A) (B)
COMPUTATIONAL MODELS FOR PHYLOGENETIC ANALYSIS K. R. PARDASANI DEPTT OF APPLIED MATHEMATICS MAULANA AZAD NATIONAL INSTITUTE OF TECHNOLOGY (MANIT) BHOPAL.
Sequence Alignment and Phylogenetic Analysis. Evolution.
Phylogenetic Analysis. General comments on phylogenetics Phylogenetics is the branch of biology that deals with evolutionary relatedness Uses some measure.
Phylogenetic trees School B&I TCD Bioinformatics May 2010.
Computational Biology, Part D Phylogenetic Trees Ramamoorthi Ravi/Robert F. Murphy Copyright  2000, All rights reserved.
BINF6201/8201 Molecular phylogenetic methods
© Wiley Publishing All Rights Reserved. Building Multiple- Sequence Alignments.
3- RIBOSOMAL RNA GENE RECONSTRUCITON  Phenetics Vs. Cladistics  Homology/Homoplasy/Orthology/Paralogy  Evolution Vs. Phylogeny  The relevance of the.
Bioinformatics 2011 Molecular Evolution Revised 29/12/06.
Building and visualizing phylogeny Henrik Lantz Dept. of Medical Biochemistry and Microbiology, BMC, Uppsala University.
Applied Bioinformatics Week 8 Jens Allmer. Practice I.
OUTLINE Phylogeny UPGMA Neighbor Joining Method Phylogeny Understanding life through time, over long periods of past time, the connections between all.
Introduction to Phylogenetic Trees
Phylogenetic Trees  Importance of phylogenetic trees  What is the phylogenetic analysis  Example of cladistics  Assumptions in cladistics  Frequently.
Introduction to Phylogenetics
Molecular Phylogeny. 2 Phylogeny is the inference of evolutionary relationships. Traditionally, phylogeny relied on the comparison of morphological features.
Phylogenetic Analysis Gabor T. Marth Department of Biology, Boston College BI420 – Introduction to Bioinformatics Figures from Higgs & Attwood.
Chapter 10 Phylogenetic Basics. Similarities and divergence between biological sequences are often represented by phylogenetic trees Phylogenetics is.
Why do trees?. Phylogeny 101 OTUsoperational taxonomic units: species, populations, individuals Nodes internal (often ancestors) Nodes external (terminal,
Phylogeny Ch. 7 & 8.
Phylogenetic trees Sushmita Roy BMI/CS 576 Sep 23 rd, 2014.
Applied Bioinformatics Week 8 Jens Allmer. Theory I.
Phylogenetics.
Ayesha M.Khan Spring Phylogenetic Basics 2 One central field in biology is to infer the relation between species. Do they possess a common ancestor?
1 CAP5510 – Bioinformatics Phylogeny Tamer Kahveci CISE Department University of Florida.
Distance-based methods for phylogenetic tree reconstruction Colin Dewey BMI/CS 576 Fall 2015.
Phylogenetic trees. 2 Phylogeny is the inference of evolutionary relationships. Traditionally, phylogeny relied on the comparison of morphological features.
Phylogeny and the Tree of Life
Introduction to Bioinformatics Resources for DNA Barcoding
Evolutionary genomics can now be applied beyond ‘model’ organisms
Phylogenetic basis of systematics
Phylogeny - based on whole genome data
Phylogenetic Inference
Methods of molecular phylogeny
Phylogenetic Trees.
Chapter 19 Molecular Phylogenetics
Phylogenetics Chapter 26.
Phylogeny and the Tree of Life
Presentation transcript:

© Wiley Publishing All Rights Reserved. Phylogeny

Learning Objectives Understand the most basic concepts of phylogeny Be able to compute simple phylogenetic trees Understand the difference between orthology and paralogy Understand what bootstrapping means in phylogeny

Outline Finding out what to do with phylogeny Gathering sequences to make a tree Preparing your multiple-sequence alignment Computing a tree Bootstrapping your tree to check its reliability Displaying your tree

Why Build a Phylogenetic Tree ? Phylogenetic trees reconstruct the evolutionary history of your sequences They tell you who is closer to whom in the big tree of life Phylogenetic trees are based on sequence similarity rather than morphologic characters

3 Ways to Use Your Tree Finding the closest relative of your organism Usually done with a tree based on the ribosomal RNA Discovering the function of a gene Finding the orthologues of your gene Finding the origin of your gene Finding whether your gene comes from another species

Orthology and Paralogy Orthologous genes Separated by speciation Often have the same function Paralogous genes Separated by duplications Can have different functions In the graph: A is paralogous with B A1 is orthologous with A2

Working on the Right Data Garbage in  garbage out The quality of your tree depends on the quality of the data Your first task is to assemble a very accurate MSA

DNA or Proteins Most phylogenetic methods work on Proteins and DNA sequences If possible, always compute a multiple-sequence alignment on the protein sequences Translate the sequences if the DNA is coding Align the sequences Thread the DNA sequences back onto the protein MSA with coot.embl.de/pal2nal If your DNA sequences are coding and have more than 70% identity... Compute the tree on the DNA multiple-sequence alignment If your DNA sequences are coding and have less than 70% identity... Compute the tree on the protein multiple-sequence alignment

Which Sequences ? Orthologous sequences Produce a species tree Show how the considered species have diverged Paralogous sequences Produce a gene tree Show the evolution of a protein family

Establishing Orthology Establishing orthology is very complicated It is common practice to establish orthology using the best reciprocal BLAST A is a gene of Genome X B is a gene of Genome Y BLAST (Gene A against Genome X) = B BLAST (Gene B against Genome Y) = A A is B’s best friend and B is A’s best friend… Phylogeny purists dislike this method

Creating the Perfect Dataset

Building the Right MSA Your MSA should have as few gaps as possible. Some variability but not too much! Some conservation but not too much!

Building the Right Tree There are two types of tree-reconstruction methods Distance-based methods Statistical methods Statistical methods are the most accurate Maximum likelihood of success Parsimony Statistical methods take more time Limited to small datasets

Distance-based Methods for Tree Reconstruction Distance-based methods are the most popular Neighbor Joining (NJ) UPGMA Distance-based methods involve 2 steps: Measure the distances between pairs of sequences in the MSA Transform the distance matrix into a tree The two most popular packages for making trees are Clustalw : very simple, not very sophisticated Phylip : very powerful, less convivial

Computing Your Tree The simplest way (offers little control) Cut and paste an MSA into the ClustalW server Available at The best way (offers full control) Use phylip online server on the Pasteur Web server Available at bioweb.pasteur.fr/intro-uk.html

Which Format ? Trees are displayed in graphic formats Always keep a version of your tree in newick format Also called new-hampshire, or nh Note the parentheses in this format Display your nh tree Use iubio.bio.indiana.edu/treeapp

Reading Your Tree There’s a lot of vocabulary in a tree Nodes correspond to common ancestors The root is the oldest ancestor Often artificial Only meaningful with a good outgroup Trees can be un-rooted Branch lengths are only meaningful when the tree is scaled Cladograms are often scaled Phenograms are usualy unscaled

Bootstrapping Use bootstrapping to verify the solidity of each node ClustalW and Phylip do bootstrap operations automatically Bootstrapping involves these steps: Select a subset of your MSA Redo the tree Repeat this operation N times (100 or 1000 times if you can) Compute a consensus tree of the N trees Measure how many of the N trees agree with the consensus tree on each node Each node gets a bootstrap figure between 0 and N High bootstrap  good node

A Bootstrapped Tree This tree was produced with 2 bootstrap cycles It shows some nodes as more robust than others In practice, always use more than 100 cycles

Going Farther A major improvement: PhyML a recently developed method that can do maximum likelihood on more than 20 sequences atgc.lirmm.fr/phyml/ A legendary resource for phylogeny evolution.genetics.washington.edu/phylip/software.html