Download presentation
Presentation is loading. Please wait.
1
The Tree of Life (TOL) in the age of Genomics or a journey through the Phylogenetic Forest Eugene Koonin, NCBI / NLM / NIH RECOM BE, San Diego, May 23, 2010
2
Thinking of the history of life in terms of phylogenetic trees is as old as scientific biology (if not older) Charles Darwin (1859) Origin of Species [one and only illustration]: "descent with modification" Ernst Haeckel (1879) The Evolution of Man A brief history of TOL
3
Advent of molecular phylogenetics – trees can be made from alignments of homologous genes - expectations of objectively reconstructed complete Tree of Life Woese et al. (1990) Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. PNAS 87, 4576-4579 [Figure 1, modified] A brief history of TOL ribosomal RNA phylogeny
4
Enter genomics…and things get complicated Fleischmann et al. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science. 1995 Jul 28;269(5223):496-512.
5
Massive horizontal gene transfer (HGT) from archaea to a hyperthermophilic bacterium
6
Significant HGT from eukaryotes to a (facultative) symbiotic bacterium
7
Phylogenetic trees for aaRSs – 20 key enzymes of protein biosynthesis diversity of HGT signals Wolf Y I et al. Genome Res. 1999;9:689-710 ©1999 by Cold Spring Harbor Laboratory Press
8
Wolf Y I et al. Genome Res. 1999;9:689-710 ©1999 by Cold Spring Harbor Laboratory Press Phylogenetic trees for aaRS: diversity of HGT signals
9
Wolf Y I et al. Genome Res. 1999;9:689-710 ©1999 by Cold Spring Harbor Laboratory Press
10
Wolf Y I et al. Genome Res. 1999;9:689-710 ©1999 by Cold Spring Harbor Laboratory Press Phylogenetic trees for aaRS: diversity of HGT signals
11
Wolf Y I et al. Genome Res. 1999;9:689-710 ©1999 by Cold Spring Harbor Laboratory Press Phylogenetic trees for aaRS: diversity of HGT signals
12
Wolf Y I et al. Genome Res. 1999;9:689-710 ©1999 by Cold Spring Harbor Laboratory Press Phylogenetic trees for aaRS: diversity of HGT signals
13
Doolittle WF. (2000) Uprooting the tree of life. Sci. Am. 282, 90-95 [modified] History of TOL BacteriaArchaea Eukaryotes BacteriaArchaea Eukaryotes Troubled times – "uprooting" of TOL for prokaryotes. horizontal gene transfer is rampant; no gene is exempt histories of individual genes are non-coherent with each other tree-like signal is completely lost (or never existed at all) there are no species (or other taxa) in prokaryotes the consistent tree-likesignal we observe is created by biases in HGT "Standard Model""Net of Life"
14
So – in the light of HGT - is there a Tree of Life? Or was the history of life more like a net? To address this question, explore the Forest of Life – trees for ALL conserved genes
15
ANALYSIS OF THE FOL RECONSTRUCTION OF THE FOL 1. SELECTION OF ORTHOLOGOUS GENES 2. MULTIPLE ALIGNMENT 3. CONSTRUCTION OF ML PHYLOGENETIC TREES Tree11 Tree20.4921 Tree30.5910.4871 ………… Tree N 0.325 0.4850.1121 Tree comparison networks cluster analysis 4. TREE COMPARISON METHODS Matrix of distances between trees FOL NUTs > 90% species Exploration of the Forest of Life (FOL): bioinformatic pipeline
16
FOL and NUTs Forest of Life (FOL) and Nearly Universal Trees (NUTs) most of the trees contain relatively few species 102 "nearly universal" trees (90+ species): NUTs 6901 trees (FOL)
17
Archaea and Bacteria in NUTs How well archaea and bacteria are separated? 56% of NUTs show perfect separation between archaea and bacteria; 92% of NUTs show partial, non-random separation archaea and bacteria separated archaea and bacteria mixed archaea invade bacterial subtree bacteria invade archaeal subtree
18
NUTs vs Random Trees Inconsistency Score (IS) compares a tree to a set of trees. The score is based on the frequency of bipartitions derived from the given tree among the whole set of trees. Range ~[0..1]. NUTs are much more consistent other than random trees random trees NUTs
19
NUTs Pattern of Similarity An edge connects two vertices (trees) if the distance between them is below the threshold A single connected cluster appears and gradually grows to encompass all NUTs as the threshold is lowered.
20
Are NUTs Clustered? The 102x102 matrix of distances between NUTs is projected into a lower- dimensional space using multidimensional scaling (CMDS). Analysis using gap function approach (Tibshirani et al. 2001) shows lack of distinct separate clusters in the tree topology space
21
NUTs vs FOL Similarity between NUTs and the rest of the FOL. The NUTs are connected to 2505 trees (36%) from the FOL at a 0.8 similarity cut-off. The mean similarity between the NUTs and the FOL is ~0.5 (only ~0.1 for random trees).
22
NUTs vs FOL The matrix of distances between all the trees in the FOL is projected into a lower-dimensional space using CMDS. FOL/COG trees form 7 distinct clusters in tree topology space. Clusters differ largely by phyletic patterns. NUTs form a tight group within one of the clusters and are approximately equidistant to all clusters. NUTs NUTs vs FOL clusters
23
Interim conclusions Analysis of the full Forest of Life in comparison to NUTs shows that: a considerable fraction of FOL trees are very similar to NUTs: average FOL-NUTs similarity is dramatically above the random level unlike NUTs, topologies of the FOL trees show distinct clustering largely determined by the phyletic patterns (i.e. set of species present) in the tree topology space NUTs form a comparatively tight centrally located group compared to NUTs, FOL trees show much wider diversity of their topologies; however, the "central" tree-like signal still exists for a large part of the FOL a "consensus" tree make sense at least as a crude representation of the common trend in the FOL (especially so for the NUTs)
24
Distinguishing Tree and Net signals in the FOL using Quartets of species The FOL shows a great diversity of phyletic and phylogenetic patterns. We employ the quartet analysis to measure the vertical and horizontal evolutionary signal in different areas of the Forest. there are C 4 100 4x10 6 species quartets from the set of 100 species each quartet of species can resolve into 3 different tree topologies (~12x10 6 combinations total) any tree containing all 4 species from a given quartet resolves them into one of these 3 topologies for any quartet one can compute the support for all 3 topologies within a set of trees (i.e. relative frequencies of the topologies); ~8x10 10 comparisons for the whole FOL (or any subset of trees)) 1 2 3 3 f 1 + f 2 + f 3 = 1
25
Looking for tree-like and net-like evolution in the FOL Species Matrix – NUTs – mostly tree-like signal NUTs ArchaeaBacteria Crenarchaeota Euryarchaeota Cyanobacteria Proteobacteria often rarely together
26
Species Matrix – Rest of the FOL – much stronger net-like (horizontal) signal FOL-NUTs ArchaeaBacteria Crenarchaeota Euryarchaeota Cyanobacteria Proteobacteria often rarely together
27
Species Matrix vs Function J U K L F H I N O S D MECG RQ P T V
28
Conclusions – 3 Quartet-based analysis of species position in the Forest of Life shows that: relationships between species in NUTs roughly follow the conventional microbial taxonomy – probably, a consequence of significant contribution of tree-like evolution the “TOL” signal declines with decreasing number of species in the tree different functional classes of genes show substantially different balance between tree-like and net-like modes of gene transfer and possibly in preferred routes of HGT
29
The Take-Home Message There is no single "Tree of Life" describing the evolutionary history of all or even the majority of the prokaryote genomes Yet, there is a central tree-like trend of evolution compatible with a common history of descent of prokaryotic groups This trend is more evident at shallow phylogenetic depths, in more ubiquitous genes and among some functional categories of genes (eg, translation) Observations are compatible with the ancient divergence between the bacterial and archaeal clades followed by rapid diversification of major phyla followed by HGT that distorted but did not destroy the tree-like signal Altogether, HGT might dominate evolution but the tree-like signal is stronger than the signal from any particular route of HGT Puigbo P, Wolf YI, Koonin EV. (2009) Search for a 'Tree of Life' in the thicket of the phylogenetic forest. J. Biol. 8, 59 Koonin EV, Wolf YI, Puigbo P. (2009) The Phylogenetic Forest and the Quest for the Elusive Tree of Life. Cold Spring Harb Symp Quant Biol. [advanced epub] Koonin EV, Wolf YI. (2009) The fundamental units, processes and patterns of evolution, and the tree of life conundrum. Biol Direct. 4, 33 Puigbo P, Wolf YI, Koonin EV. The tree and net signals in the evolution of prokaryotes, In preparation
30
The Take-Home Message To see the forest for the trees… …but also to see trees for the forest.
31
A quick redux on bioinformatics bioinformatics is a toolbox for computational analysis of biological problems the toolbox is already quite large and versatile, so in many cases, combination of available tools into pipelines allows researchers to address new questions of course, new tools often have to be developed and added to the toolbox biologists must understand the capacities but also the limitations of the tools Can bioinformatics – using the vast amounts of data produced by genomics and systems biology – help transform biology “from stamp collection to physics”?
32
Acknowledgments Pere Puigbo Fellow explorers of the Forest of Life Yuri Wolf
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.