Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

Slides:



Advertisements
Similar presentations
Linnaeus developed the scientific naming system still used today.
Advertisements

1 Orthologs: Two genes, each from a different species, that descended from a single common ancestral gene Paralogs: Two or more genes, often thought of.
ATPase dataset -> nj in figtree. ATPase dataset -> muscle -> phyml (with ASRV)– re-rooted.
No similarity vs no homology If two (complex) sequences show significant similarity in their primary sequence, they have shared ancestry, and probably.
Tree of Life Chapter 26.
Phylogenetic Trees Understand the history and diversity of life. Systematics. –Study of biological diversity in evolutionary context. –Phylogeny is evolutionary.
Basics of Comparative Genomics Dr G. P. S. Raghava.
Phylogenetic reconstruction
Trees and Sequence Space J. Peter Gogarten University of Connecticut Dept. of Molecular and Cell Biology Sculpture at Royal Botanical Gardens, Kew.
Adaptive evolution of bacterial metabolic networks by horizontal gene transfer Chao Wang Dec 14, 2005.
Branches, splits, bipartitions In a rooted tree: clades (for urooted trees sometimes the term clann is used) Mono-, Para-, polyphyletic groups, cladists.
Example of bipartition analysis for five genomes of photosynthetic bacteria (188 gene families) total 10 bipartitions R: Rhodobacter capsulatus, H: Heliobacillus.
. Class 9: Phylogenetic Trees. The Tree of Life D’après Ernst Haeckel, 1891.
Gene transfer Organismal tree: species B species A species C species D Gene Transfer seq. from B seq. from A seq. from C seq. from D molecular tree: speciation.
Trees as a Tool to Visualize Evolutionary History
Cenancestor (aka LUCA or MRCA) can be placed using the echo remaining from the early expansion of the genetic code. reflects only a single cellular component.
Branches, splits, bipartitions In a rooted tree: clades Mono-, Para-, polyphyletic groups, cladists and a natural taxonomy Terminology The term cladogram.
Trees? J. Peter Gogarten University of Connecticut Dept. of Molecular and Cell Biology Sculpture at Royal Botanical Gardens, Kew.
MCB 372 #14: Student Presentations, Discussion, Clustering Genes Based on Phylogenetic Information J. Peter Gogarten University of Connecticut Dept. of.
Phylogenetic trees Sushmita Roy BMI/CS 576
Phylogeny and the Tree of Life
The diversity of genomes and the tree of life
TGCAAACTCAAACTCTTTTGTTGTTCTTACTGTATCATTGCCCAGAATAT TCTGCCTGTCTTTAGAGGCTAATACATTGATTAGTGAATTCCAATGGGCA GAATCGTGATGCATTAAAGAGATGCTAATATTTTCACTGCTCCTCAATTT.
T-COFFEE Multiple Alignments of Orthologous Sequences Horizontal Gene Transfer (Phylogenetic Trees) WebLogo.
Chapter 26: Phylogeny and the Tree of Life Objectives 1.Identify how phylogenies show evolutionary relationships. 2.Phylogenies are inferred based homologies.
ATPase dataset -> nj in figtree. ATPase dataset -> muscle -> phyml (with ASRV)– re-rooted.
Phylogenetic Trees: Common Ancestry and Divergence 1B1: Organisms share many conserved core processes and features that evolved and are widely distributed.
Introduction to Phylogenetics
Phylogenetic analyses of cyanobacterial genomes: Quantification of horizontal gene transfer events Olga Zhaxybayeva, J. Peter Gogarten, Robert L. Charlebois,
Using blast to study gene evolution – an example.
Significance Tests for Max-Gap Gene Clusters Rose Hoberman joint work with Dannie Durand and David Sankoff.
Introduction to Phylogenetic trees Colin Dewey BMI/CS 576 Fall 2015.
ATPase dataset from last Friday Alignment clustal vs muscle Conserved part are aligned reproducibly.
Phylogeny & the Tree of Life
Classification. Cell Types Cells come in all types of shapes and sizes. Cell Membrane – cells are surrounded by a thin flexible layer Also known as a.
Classification.
Phylogeny & Systematics
Ayesha M.Khan Spring Phylogenetic Basics 2 One central field in biology is to infer the relation between species. Do they possess a common ancestor?
Bootstrap ? See herehere. Maximum Likelihood and Model Choice The maximum Likelihood Ratio Test (LRT) allows to compare two nested models given a dataset.Likelihood.
Evolutionary change involves genetic change   Phenotype   Genotype Study of evolution of macromolecules - nature of changes (in DNA, protein) & their.
{ Early Earth and the Origin of Life Chapter 15.  The Earth formed 4.6 billion years ago  Earliest evidence for life on Earth  Comes from 3.5 billion-year-old.
Systematics and Phylogenetics Ch. 23.1, 23.2, 23.4, 23.5, and 23.7.
Section 26.5: Horizontal Gene Transfer By Monica Macaro.
Phylogeny.
PHYOGENY & THE Tree of life Represent traits that are either derived or lost due to evolution.
The gradualist point of view Evolution occurs within populations where the fittest organisms have a selective advantage. Over time the advantages genes.
Universal Tree of Life  Universal tree ids the roadmap of life. It depicts the evolutionary history of the cells of all organism and the criteria reveals.
Chapter 26 Phylogeny and the Tree of Life
Lesson Overview Lesson Overview Modern Evolutionary Classification 18.2.
Taxonomy & Phylogeny. B-5.6 Summarize ways that scientists use data from a variety of sources to investigate and critically analyze aspects of evolutionary.
Phylogenetic reconstruction - How Distance analyses calculate pairwise distances (different distance measures, correction for multiple hits, correction.
Phylogeny and the Tree of Life
Phylogeny and the Tree of Life
Phylogeny & the Tree of Life
The Coral of Life (Darwin)
Pipelines for Computational Analysis (Bioinformatics)
The Ribosomal “Tree of Life”
Phylogeny & Systematics
Hierarchical Classification vs. Systematics
Chapter 26 Phylogeny and the Tree of Life
Chapter 26.5: Horizontal Gene Transfer
Archaea and the Origin(s) of DNA Replication Proteins
The Ribosomal “Tree of Life”
Phylogenetics Chapter 26.
Chapter 25 Essential Questions
Chapter 26 Phylogeny and the Tree of Life
Chapter 20 Phylogeny and the Tree of Life
Fig. 2. —Phylogenetic relationships and motif compositions of some representative MORC genes in plants and animals. ... Fig. 2. —Phylogenetic relationships.
Evolution Biology Mrs. Johnson.
Phylogeny and the Tree of Life
Presentation transcript:

Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology

Acknowledgements NSF Microbial Genetics NASA Exobiology Program Re. Coalescence: Andrew Martin (U of C Boulder) Joe Felsenstein (U of Wash) Hyman Hartman (MIT) Yuri Wolf (NCBI) Gogarten Lab: Olga Zhaxybayeva

Cenancestor – (from the Greek “kainos” meaning recent and “koinos” meaning common) – the most recent common ancestor of all the organisms that are alive today. The term was proposed by Fitch in All life forms on Earth share common ancestry Where is the root of the tree of life? On the branch leading to Bacteria (Gogarten, J.P. et al.,1989; Iwabe, N. et al., 1989) On the branch leading to Archaea (Woese, C.R.,1987) On the branch leading to Eukaryotes (Forterre, P. and Philippe, H., 1999; Lopez, P. et al., 1999) Under aboriginal trifurcation (Woese, C.R., 1978) Inconclusive results (Caetano-Anolles, G., 2002)

Detection of Ancient Gene Duplications using Whole Genome Data SELECTION OF GENOMES 12 representative completed genomes SELECTION OF GENOMES 12 representative completed genomes RANKING OF PUTATIVE ANCIENT PARALOGOUS PAIRS Elimination of duplicates; ranking by number of occurrences RANKING OF PUTATIVE ANCIENT PARALOGOUS PAIRS Elimination of duplicates; ranking by number of occurrences ANALYSIS OF PUTATIVE ANCIENT PARALOGOUS PAIRS Add homologous sequences, perform alignment and reconstruct phylogenetic trees ANALYSIS OF PUTATIVE ANCIENT PARALOGOUS PAIRS Add homologous sequences, perform alignment and reconstruct phylogenetic trees DETECTION OF PUTATIVE ANCIENT PARALOGOUS PAIRS Encoded proteins, Genome A Encoded proteins, Genome A Encoded proteins, Genome B Encoded proteins, Genome B BLAST, E-value cutoff (A1,B1) and (A2,B2) are putative anciently duplicated genes (A1,B1) and (A2,B2) are putative anciently duplicated genes Gene A1 Gene B1 Gene A2 Gene B2 Reciprocal top scoring BLAST hit

Ancient Gene Duplications and Root of Tree of Life Root Location Total number of datasets ReliableAmbiguous or Unresolved Between Domains32725 Within Archaea321 Within Bacteria909 Polyphyletic, only Archaeal sequences group with Archaeal ortholog of the A- B pair Polyphyletic, only Bacterial sequences group with Bacterial ortholog of the A- B pair 1459 Polyphyletic on both sides of root 110

SSU-rRNA Tree of Life

Another example of topology with long empty branches connecting three domains of life – proton pumping ATPases Olendzenski et al, 2000

Phylogenetic tree of phenylalanyl- tRNA synthetase beta-subunits. J.R. Brown, Syst. Biol. 50(4):497–512, 2001

H. Philippe, P. Forterre, J Mol Evol (1999) 49:509–523 Phylogenetic tree of Ile- and Val- tRNA synthetases

Hypotheses to explain the long empty branches connecting the three domains of life Early evolution prior to the split of the three domains was following different mechanisms. Wolfram Zillig, 1992 Otto Kandler, 1994 Carl Woese, 1998 Arthur Koch, 1994

Hypotheses to explain the long empty branches connecting the three domains of life (cont.) There were major catastrophic events in the past that led to a bottleneck with only few survivors For example: Tail of early heavy bombardment Rise of oxygen

Hypotheses to explain the long empty branches connecting the three domains of life (cont.) There were major catastrophic events in the past that led to a bottleneck with only few survivors

Hypotheses to explain the long empty branches connecting the three domains of life (cont.)

Coalescence – the process of tracing lineages backwards in time to their common ancestors. Every two extant lineages coalesce to their most recent common ancestor. Eventually, all lineages coalesce to the cenancestor. t/2 (Kingman, 1982) Illustration is from J. Felsenstein, “Inferring Phylogenies”, Sinauer, 2003

Clade – (from the Greek “klados” meaning branch or twig) – a group of organisms that includes all of the descendants of an ancestral taxon. In a rooted phylogeny every node defines a clade as the lineages originating from this node, including those that arise in successive furcations. Coalescence as an Approach to Study Cladogenesis Cladogenesis – the process of clade formation clade

Simulations of cladogenesis by coalescence One extinction and one speciation event per generation Horizontal transfer event once in 10 generations RED: organismal lineages (no HGT) BLUE: molecular lineages (with HGT) GRAY: extinct lineages RESULTS: Most recent common ancestors are different for organismal and molecular phylogenies Different coalescence times Long coalescence of the last two lineages

EXTANT LINEAGES FOR THE SIMULATIONS OF 50 LINEAGES

green: organismal lineages ; red: molecular lineages (with gene transfer) Number of extant lineages over time

Adam and Eve never met  Albrecht Durer, The Fall of Man, 1504 Mitochondrial Eve Y chromosome Adam Lived approximately 50,000 years ago Lived 166, ,000 years ago Thomson, R. et al. (2000) Proc Natl Acad Sci U S A 97, Underhill, P.A. et al. (2000) Nat Genet 26, Cann, R.L. et al. (1987) Nature 325, 31-6 Vigilant, L. et al. (1991) Science 253,

CONCLUSIONS Coalescence leads to long branches at the root. Using single genes as phylogenetic markers makes it difficult to trace organismal phylogeny in the presence of horizontal gene transfer. Each contemporary molecule has its own history and traces back to an individual molecular cenancestor, but these molecular ancestors were likely to be present in different organisms at different times. The model alone already explains some features of the observed topology of the tree of life. Therefore, it does not appear warranted to invoke more complex hypotheses involving bottlenecks and extinction events to explain these features of the tree of life.

OUTLOOK Incorporate non-random Horizontal Gene Transfer into the model. Consider the model as a null hypothesis and look for deviations from it. According to the model the bottom branches are expected to be long for every individual clade. This does not seem to be the case for bacterial domain. This deviation could be due to sudden radiation. Explore if the model can be used to infer parameters that describe the early evolution of life. E.g., number of species.