Microbial Evolution
Zoology/Anthro/Botany 410
Nicole T. Perna
April 24, 2014


A couple of key facts
Prokaryotes have been around a long time ( GYA).
Bacteria and Archaea diverged a very long time ago and are not more closely related to each other than to eukaryotes.
Prokaryotes exhibit tremendous diversity of habitats, lifestyles, and metabolic strategies.

Important applications of microbial evolution

Critical Topics Already Introduced
Genomic revolution and genome evolution
  – Core vs. variable fractions of genomes
  – Pan-genome
  – Genome size and organization
Horizontal (Lateral) Gene Transfer (HGT)
  – There is no “tree of life”
  – How frequent is HGT?
Bacterial species: is there such a thing? What do we mean?

Assigned reading

Microbial genome sequence availability is exponentially increasing

NCBI Genome Project List
As of 4/19/2012 (2013 figures in parentheses):
2029 complete bacterial genomes (2510)
134 complete archaea (262)
3313 draft bacteria (>10K)
44 draft archaea
4600 bacteria – no data yet
49 archaea – no data yet

How well sampled is prokaryotic diversity by current genome sequences?
Koonin and Wolf 2008 perspective:
Uncultivated organisms remain problematic
Only 10% of the genes in major metagenomic samplings have no detectable homologs
“The possibility, certainly, remains that major new and, perhaps, unusual groups of archaea and bacteria dwell in complex and unusual habitats. Nevertheless, it appears likely that the current collections of archaeal and bacterial genomes provide a reasonable approximation of the diversity of prokaryotic life forms on earth.”

Genomic Encyclopedia of Bacteria and Archaea (GEBA) Project
Objective – sequence genomes selected solely for their phylogenetic novelty (plus in-depth sampling of a single phylum), based on a 16S rDNA tree
Wu et al. Nature 2009 Dec 24; 462(7276):

DY Wu et al. Nature 462, (2009) doi: /nature08656
Maximum-likelihood phylogenetic tree of the bacterial domain based on a concatenated alignment of 31 broadly conserved protein-coding genes. Phyla are distinguished by colour of the branch and GEBA genomes are indicated in red in the outer circle of species names.
53 GEBA bacteria accounted for 2.8–4.4 times more phylogenetic diversity than randomly sampled subsets of 53 non-GEBA bacterial genomes

DY Wu et al. Nature 462, (2009) doi: /nature08656
Rate of discovery of protein families as a function of phylogenetic breadth of genomes.
The project even discovered a bacterial homolog of the eukaryotic cytoskeleton protein actin.

Evolution-oriented reasons to target genomes for sequencing
Maximize sampling of diversity
Understand structure of particular populations and/or species
Make targeted comparisons to understand the genetic basis of phenotypic differences

Size and organization of microbial genomes (Koonin and Wolf 2008)
Size range = 180 Kbp – 13 Mbp

Structure of a prokaryotic genome One circular chromosome is typical. Some have other replicons, such as linear or circular plasmids. Some have more than one chromosome, generally distinguished from a plasmid by the presence of at least one “essential” gene. Some have linear chromosomes.

Fitch WM. Trends Genet May;16(5):

Analogy vs Homology
Analogy: The relationship of any two characters that have descended convergently from unrelated ancestors.
Homology: The relationship of any two characters that have descended, usually with divergence, from a common ancestral character.

Orthology: The relationship of any two homologous characters whose common ancestor lies in the cenancestor of the taxa from which the two sequences were obtained.
Paralogy: The relationship of any two homologous characters arising from a duplication of the gene for that character.
Xenology: The relationship of any two homologous characters whose history, since their common ancestor, involves an interspecies (horizontal) transfer of the genetic material for at least one of those characters.

Test Yourself
A1 – B1
A1 – B2
A1 – C3
B1 – C2
C2 – C3
B2 – C3
C3 – AB1

Homology on a Genome-Scale
How many and which genes are common to two or more organisms?
Which genes differentiate one organism from another?
How is homology related to function?

A phylogenetic perspective
Orthologs are the set of genes/proteins with gene trees identical to the species tree.
We can understand other types of homology relationships by comparison to the species tree.
But often we don’t know the species tree, and phylogenetic methods are complex.

Consider two genomes
Use BLASTP to compare one set of proteins (proteome) to the other
Which set will you use as the query and which as the database?
What criteria will you use to define “a match”?
[Figure: GenomeA genes 1–3 and GenomeB genes 1–3]
A1, A3, B2 and B3 are homologs (assuming the aligned regions overlap)
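One way to make “a match” concrete is to filter BLASTP hits on E-value and on how much of the query the alignment covers. The following is a minimal sketch only: the dictionary field names, the thresholds (E-value 1e-5, 50% query coverage), and the example hits are illustrative assumptions, not values from the lecture.

```python
def is_match(hit, max_evalue=1e-5, min_query_coverage=0.5):
    """Return True if a hit passes both (illustrative) thresholds."""
    aligned_length = hit["q_end"] - hit["q_start"] + 1
    coverage = aligned_length / hit["q_len"]
    return hit["evalue"] <= max_evalue and coverage >= min_query_coverage

# Hypothetical hits from searching GenomeA proteins against GenomeB.
hits = [
    {"query": "A1", "subject": "B3", "evalue": 1e-40, "q_start": 1, "q_end": 280, "q_len": 300},
    {"query": "A1", "subject": "B2", "evalue": 1e-8, "q_start": 10, "q_end": 60, "q_len": 300},
    {"query": "A2", "subject": "B1", "evalue": 0.5, "q_start": 1, "q_end": 200, "q_len": 210},
]

# Keep only the (query, subject) pairs that count as a match.
matches = [(h["query"], h["subject"]) for h in hits if is_match(h)]
print(matches)
```

Here the second hit fails on coverage (a short local alignment) and the third fails on E-value, which shows why the choice of criteria changes which genes are called homologs.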

Reciprocal Best Hits
Use BLASTP to compare sets of proteins (proteomes) to each other
  – First using GenomeA to query against GenomeB
  – Then using GenomeB to query against GenomeA
  – Save only one best match for each query
  – Save only the reciprocal best matches as “orthologs”
[Figure: three panels, each listing GenomeA genes 1–3 and GenomeB genes 1–3]
Lose A3-B2 and A1-B3 homology
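The reciprocal-best-hit steps can be sketched in a few lines. This is a minimal illustration, assuming hits have already been parsed into (query, subject, bit score) tuples; the scores below are made up so that the result matches the slide (A3-B2 and A1-B3 are lost), and ties are broken by keeping the first hit seen.

```python
def best_hits(hits):
    """Map each query to its single highest-scoring subject."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Keep pairs (a, b) where b is a's best hit and a is b's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)

# Hypothetical bit scores; A1, A3, B2 and B3 are all homologs of one another.
a_vs_b = [("A1", "B2", 500), ("A1", "B3", 200), ("A3", "B3", 450), ("A3", "B2", 180)]
b_vs_a = [("B2", "A1", 500), ("B2", "A3", 180), ("B3", "A3", 450), ("B3", "A1", 200)]

orthologs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(orthologs)
```

Only A1-B2 and A3-B3 survive as predicted orthologs; the weaker A1-B3 and A3-B2 relationships are real homology that the method discards by design.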

Software/Methods for Predicting Orthologs from Genome Sequences
RBH, RSD (Reciprocal Shortest Distance), INPARANOID, RIO, Orthostrapper, Ortholuge, TribeMCL, OrthoMCL

Method Comparison Chen F, Mackey AJ, Vermunt JK, Roos DS. PLoS ONE Apr 18;2(4):e383.

Core and variable genes – single genome perspective
A small number of genes have orthologs in all microbial genomes (core)
More genes have orthologs in many genomes, but not all (shell)
Some genes are rare and have orthologs in only a few genomes (cloud)
Some are unique to one genome (ORFans)
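The four categories can be assigned from a simple presence table. The sketch below is an assumption-laden illustration: the slide gives no numeric boundary between “many” (shell) and “a few” (cloud) genomes, so the 15% cutoff, the function name, and the family names are all hypothetical.

```python
def classify_gene_families(presence, n_genomes, cloud_max_fraction=0.15):
    """Label each family as core, shell, cloud, or ORFan.

    presence: dict mapping family name -> set of genomes that contain it.
    cloud_max_fraction: assumed cutoff separating cloud from shell.
    """
    categories = {}
    for family, genomes in presence.items():
        count = len(genomes)
        if count == n_genomes:
            categories[family] = "core"
        elif count == 1:
            categories[family] = "ORFan"
        elif count / n_genomes <= cloud_max_fraction:
            categories[family] = "cloud"
        else:
            categories[family] = "shell"
    return categories

genomes = [f"g{i}" for i in range(20)]
presence = {
    "famA": set(genomes),        # present in all 20 genomes
    "famB": set(genomes[:12]),   # present in 12 of 20
    "famC": set(genomes[:2]),    # present in 2 of 20
    "famD": {"g7"},              # present in a single genome
}
categories = classify_gene_families(presence, len(genomes))
print(categories)
```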

Core and variable genes – species perspective (pan-genome)
For some species as a whole, the number of core (plus shell) genes can be much smaller than the variable fraction (cloud plus ORFans), and the pan-genome can be very large.
Touchon et al. PLoS Genetics. 2009
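At the species level, the core genome is the intersection of the strains' gene sets and the pan-genome is their union. A minimal sketch, assuming each strain has already been reduced to a set of gene-family identifiers (the strain and family names are hypothetical, and at least one strain is required):

```python
def core_and_pan_genome(strains):
    """strains: dict mapping strain name -> set of gene family identifiers."""
    gene_sets = list(strains.values())
    core = set.intersection(*gene_sets)  # families found in every strain
    pan = set.union(*gene_sets)          # families found in any strain
    return core, pan

strains = {
    "strain1": {"fam1", "fam2", "fam3", "fam4"},
    "strain2": {"fam1", "fam2", "fam5"},
    "strain3": {"fam1", "fam2", "fam6", "fam7"},
}
core, pan = core_and_pan_genome(strains)
variable = pan - core
print(len(core), len(pan), len(variable))
```

Even in this toy case the variable fraction (5 families) is larger than the core (2 families), which is the pattern the slide describes for some species.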

Different types of pan-genomes
Figure 3. Power law regression for species with open and closed pan-genomes.
Tettelin et al. Curr Opin Microbiology 2008:11(5).

Open vs Closed Pan-genomes
Open
  – Number of new genes discovered continues to grow as additional genomes of the species are sequenced
  – Organisms live in diverse environments and are genetically amenable to horizontal gene transfer
Closed
  – Number of new genes discovered is very small as additional genomes of the species are sequenced
  – Organisms have little exposure to other organisms and/or are refractory to horizontal gene transfer
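The open/closed distinction is often quantified with a power-law fit of new genes against the number of genomes sequenced, n = κ·N^(−α). The sketch below fits that curve by least squares on log-transformed values. The data are synthetic, and the rule used to label the result (α > 1 closed, otherwise open) is stated here as an assumption about the usual convention, since the slides do not give the threshold.

```python
import math

def fit_power_law(n_genomes, new_genes):
    """Fit log(new_genes) = log(kappa) - alpha * log(N); return (kappa, alpha)."""
    xs = [math.log(n) for n in n_genomes]
    ys = [math.log(g) for g in new_genes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    kappa = math.exp(mean_y - slope * mean_x)
    return kappa, -slope

def pan_genome_type(alpha):
    """Assumed convention: alpha > 1 means closed, otherwise open."""
    return "closed" if alpha > 1 else "open"

# Synthetic new-gene counts for the 2nd through 8th genome of two species.
counts = list(range(2, 9))
open_new = [400 * n ** -0.5 for n in counts]    # slow decay
closed_new = [400 * n ** -2.0 for n in counts]  # fast decay

_, alpha_open = fit_power_law(counts, open_new)
_, alpha_closed = fit_power_law(counts, closed_new)
print(pan_genome_type(alpha_open), pan_genome_type(alpha_closed))
```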

Horizontal Gene Transfer
Mechanisms include conjugation, transduction and transformation
Can introduce entirely new genes and gene clusters into genomes (grow the pan-genome)
Can replace existing genes with functionally equivalent (?) xenologs (scramble phylogenetic history)

Horizontal Gene Transfer
How prevalent is it?
  – We don’t know. Debates continue, largely because it is hard to separate the error associated with phylogenetic reconstruction from true differences in phylogenetic signal
Who is doing it?
  – We don’t know. Same problem as above.
  – Good evidence that it is much more frequent within (some) species than between
  – Some evidence for a relationship with evolutionary distance and/or commonality of environment

SSU rDNA perspective

EVOLUTION: Genome Data Shake Tree of Life E Pennisi - Science, sciencemag.org
The ring of life provides evidence for a genome fusion origin of eukaryotes MC Rivera, JA Lake - Nature, 2004
The net of life: reconstructing the microbial phylogenetic network V Kunin, L Goldovsky, N Darzentas, CA … - Genome Research 2005
The tree of one percent T Dagan, W Martin - Genome biology, 2006
Uprooting the tree of life WF Doolittle - Evolution: a Scientific American reader, 2006

Comparison of phylogenies for nearly universally conserved genes
102 ML trees for 100 taxa
Objective – compare topological distance between trees
New metric called IS (inconsistency score) = fraction of splits two trees have in common
The network of similarities among the nearly universal trees (NUTs). (a) Each node (green dot) denotes a NUT, and nodes are connected by edges if the similarity between the respective edges exceeds the indicated threshold. (b) The connectivity of 102 NUTs and the 14 1:1 NUTs depending on the topological similarity threshold.
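Comparing trees by their splits can be illustrated with a small sketch. Each internal branch of a tree divides the taxa into two groups (a split); two trees are compared by counting the splits they share. The code below is an assumption-based illustration, not the exact score from the cited study: trees are written as nested tuples of taxon names, and similarity is taken as shared splits divided by all distinct splits in the two trees.

```python
def leaves(node):
    """Return the set of taxon names under a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))

def splits(tree):
    """Collect the non-trivial splits of a tree given as nested tuples."""
    all_taxa = leaves(tree)
    reference = min(all_taxa)  # used to pick one canonical side of each split
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaves(node)
        other = all_taxa - side
        if len(side) > 1 and len(other) > 1:
            found.add(other if reference in side else side)
        for child in node:
            visit(child)

    visit(tree)
    return found

def shared_split_fraction(tree_a, tree_b):
    """Shared splits divided by all distinct splits in the two trees."""
    a, b = splits(tree_a), splits(tree_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

tree1 = ((("A", "B"), "C"), ("D", "E"))
tree2 = ((("A", "C"), "B"), ("D", "E"))
print(shared_split_fraction(tree1, tree2))
```

The two example trees agree on the split separating D and E from the rest but disagree on how A, B and C are grouped, so they share one of three distinct splits.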

Real trees are more similar to each other than randomly simulated trees
Although no single tree appears to represent the evolutionary history of these organisms, there is distinctly preserved phylogenetic signal across the dataset as a whole.

The big divide?
Look for evidence of HGT between bacteria and archaea
56% of NUTs separated the groups perfectly
44% show at least one HGT
  – 13% from archaea to bacteria
  – 23% from bacteria to archaea
  – 8% both directions
The supernetwork of the NUTs.
Puigbò et al. Journal of Biology :59 doi: /jbiol159

Expanding to ~6800 other predicted ortholog clusters
Network connectivity is greatly reduced
Different functional categories of genes show different levels of connectedness
Network representation of the 6,901 trees of the forest of life. The 102 NUTs are shown as red circles in the middle. The NUTs are connected to trees with similar topologies: trees with at least 50% of similarity with at least one NUT (P-value < 0.05) are shown as purple circles and connected to the NUTs. The rest of the trees are shown as green circles.
Puigbò et al. Journal of Biology :59 doi: /jbiol159

Proc Natl Acad Sci U S A Oct 4;102(40):

Highways of obligate gene transfer within and among phyla and divisions of prokaryotes, based on analysis of the 22,348 protein trees for which a minimal edit path could be resolved
Beiko R G et al. PNAS 2005;102: ©2005 by National Academy of Sciences

Horizontal Transfer within species
Within the species E. coli, a given base pair is estimated to be 100 times more likely to have undergone a recombination event than a point mutation. So how can we justify representing the relationship between strains with a tree-like structure?
Modeling and simulation support inference of a tree summarizing the dominant signal AS LONG AS patterns of recombination are more or less random between lineages
Touchon et al. PLoS Genetics. 2009

Major processes affecting prokaryotic genome evolution (Koonin and Wolf, 2008)
(1) Genome streamlining under strong selection.
(2) Neutral gene loss and genome degradation under weak selection (or neutral).
(3) Innovation and complexification via gene duplication.
(4) Innovation via operon shuffling.
(5) Innovation and complexification via HGT, in particular, of partially selfish operons, a process that often leads to nonorthologous gene displacement.
(6) Replicon fusion, propagation of mobile elements and other interactions between the relatively stable chromosomes and the mobilome.

Test Yourself
A1 – B1 = Ortho
A1 – B2 = Ortho
A1 – C3 = Ortho
B1 – C2 = Para (out)
C2 – C3 = Para (in)
B2 – C3 = Ortho
C3 – AB1 = Xeno