Charting the function of microbes and microbial communities Curtis Huttenhower 11-17-11 Harvard School of Public Health Department of Biostatistics.

Presentation transcript:

Charting the function of microbes and microbial communities Curtis Huttenhower Harvard School of Public Health Department of Biostatistics

Valm et al, PNAS 2011

What to do with your metagenome?
– Diagnostic or prognostic biomarker for host disease
– Public health tool monitoring population health and interactions
– Comprehensive snapshot of microbial ecology and evolution
– Reservoir of gene and protein functional information
Who’s there? What are they doing?
– Who’s there varies: your microbiota is plastic and personalized. This personalization holds at the level of phyla, genera, species, strains, and sequence variants.
– What they’re doing is adapting to their environment: you, your body, and your environment.

The NIH Human Microbiome Project (HMP): A comprehensive microbial survey
What is a “normal” human microbiome?
– 300 healthy human subjects
– Multiple body sites: 15 male, 18 female
– Multiple visits
– Clinical metadata
Slides by Dirk Gevers

A three-tier study design…
[Figure: three data tiers labeled 16S, WGS, and ref]

…for mining metagenomic data
16S tier (>3k reads per sample): filtering/trimming → chimera removal → taxonomic classification (RDP) and clustering into OTUs → organismal census at different taxonomic levels
WGS tier (~100M reads per sample): assembly into contigs → annotation of genes (~90M proteins); mapping onto ref; BLAST against functional DBs → pathways
Figure labels whose placement is not recoverable from the transcript: ~50%, ~36%, ~57%
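The "organismal census at different taxonomic levels" step can be pictured as collapsing per-read lineages to a chosen rank. The following is a minimal sketch only, assuming each read already carries a semicolon-separated lineage; the input format, the `RANKS` list, and the function name `census` are illustrative assumptions, not the output format of RDP or of the HMP pipeline.

```python
from collections import Counter

# Assumed rank order for the illustrative lineage strings below.
RANKS = ["domain", "phylum", "class", "order", "family", "genus"]

def census(lineages, rank):
    """Relative abundance of taxa at `rank` from per-read lineage strings."""
    depth = RANKS.index(rank) + 1
    counts = Counter()
    for lineage in lineages:
        taxa = [t for t in lineage.split(";") if t]
        # Reads not classified down to the requested rank are pooled together.
        label = taxa[depth - 1] if len(taxa) >= depth else "unclassified"
        counts[label] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}

# Toy input: four reads, one of them classified only to class level.
reads = [
    "Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Bacteroidaceae;Bacteroides",
    "Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Rikenellaceae;Alistipes",
    "Bacteria;Firmicutes;Clostridia;Clostridiales;Ruminococcaceae;Faecalibacterium",
    "Bacteria;Firmicutes;Clostridia",
]
phylum_census = census(reads, "phylum")
genus_census = census(reads, "genus")
```

The same reads give a coarser or finer census depending on the rank requested, which is why the fraction of unclassified reads grows toward the genus level.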

“Pathogen” carriage varies a lot
22 uniquely identifiable, nonzero-abundance “pathogens” from NIAID’s list of 135
[Figure: labeled genera include Gardnerella, Alistipes, Capnocytophaga, Actinomyces, Gemella; additional label: +Propionibacterium >0.66]

Phenotypes that explain variation (or not) can be surprising
[Figure: y-axis, normalized relative abundance]

A functional perspective on the human microbiome
Data: 100 subjects, 1-3 visits/subject, ~7 body sites/visit, M reads/sample, 100bp reads
Metagenomic reads → BLAST against functional sequence databases (KEGG + MetaCyc; CAZy, TCDB, VFDB, MEROPS…) → enzymes and pathways
HUMAnN: HMP Unified Metabolic Analysis Network
Figure labels, microbiome side: taxon abundances, enzyme family abundances, pathway abundances
Figure labels, host side: healthy/IBD, BMI, diet, gene expression, SNP genotypes

HUMAnN: Metabolic reconstruction
[Figure: two panels, pathway coverage and pathway abundance; pathways against samples, with samples grouped by body site: Vaginal, Skin, Nares, Gut, Oral (SupP), Oral (BM), Oral (TD)]
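The two panels summarize each pathway in two different ways: coverage asks how much of the pathway appears to be present, abundance asks how much of it there is. The sketch below is a deliberately simplified illustration of that distinction, assuming a pathway is just a set of enzyme families and that per-sample enzyme abundances are already available. HUMAnN's actual scoring is more involved; the function names and the enzyme identifiers `E1`…`E4` are hypothetical.

```python
def pathway_coverage(pathway_enzymes, enzyme_abundance, min_abundance=0.0):
    """Fraction of the pathway's enzyme families detected in the sample."""
    present = [e for e in pathway_enzymes
               if enzyme_abundance.get(e, 0.0) > min_abundance]
    return len(present) / len(pathway_enzymes)

def pathway_abundance(pathway_enzymes, enzyme_abundance):
    """Mean abundance over the pathway's enzyme families (absent ones count as 0)."""
    values = [enzyme_abundance.get(e, 0.0) for e in pathway_enzymes]
    return sum(values) / len(values)

# Toy pathway with four hypothetical enzyme families, two of them detected.
pathway = ["E1", "E2", "E3", "E4"]
sample = {"E1": 4.0, "E2": 2.0, "E3": 0.0}

coverage = pathway_coverage(pathway, sample)    # 2 of 4 families detected
abundance = pathway_abundance(pathway, sample)  # (4 + 2 + 0 + 0) / 4
```

A pathway can therefore be well covered but rare, or abundant in a few enzymes but poorly covered, which is why the two quantities are shown side by side.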

A portrait of the healthy human microbiome: Who’s there vs. what they’re doing
[Figure: phylotype abundance and pathway abundance per subject, grouped by body site: Vaginal, Skin, Nares, Gut, Oral (SupP), Oral (BM), Oral (TD)]

Niche specialization in human microbiome function
[Figure: metabolic modules in the KEGG functional catalog enriched at one or more body habitats, across ~700 HMP communities]
– 16 (of 251) modules strongly “core” at 90%+ coverage in 90%+ individuals at 7 body sites
– 24 modules at 33%+ coverage
– 71 modules (28%) weakly “core” at 33%+ coverage in 66%+ individuals at 6+ body sites
– Contrast zero phylotypes or OTUs meeting this threshold!
– Only 24 modules (<10%) differentially covered by body site
– Compare with 168 modules (>66%) differentially abundant by body site
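The "core" thresholds above combine three cutoffs: a per-sample coverage level, a fraction of individuals, and a number of body sites. A minimal sketch of such a filter follows, assuming module coverage is available as nested dictionaries. The data layout, function names, and toy values are assumptions for illustration and are not taken from the HMP analysis itself.

```python
def core_modules(coverage, min_coverage, min_fraction):
    """Modules covered at >= min_coverage in >= min_fraction of subjects.

    coverage: {module: {subject: coverage in [0, 1]}} for a single body site.
    """
    core = set()
    for module, per_subject in coverage.items():
        if not per_subject:
            continue
        hits = sum(1 for c in per_subject.values() if c >= min_coverage)
        if hits / len(per_subject) >= min_fraction:
            core.add(module)
    return core

def core_across_sites(coverage_by_site, min_coverage, min_fraction, min_sites):
    """Modules that are core (as defined above) at >= min_sites body sites."""
    site_counts = {}
    for coverage in coverage_by_site.values():
        for module in core_modules(coverage, min_coverage, min_fraction):
            site_counts[module] = site_counts.get(module, 0) + 1
    return {m for m, n in site_counts.items() if n >= min_sites}

# Toy data: two body sites, three subjects, two hypothetical modules.
toy = {
    "gut":  {"M_a": {"s1": 1.00, "s2": 0.95, "s3": 0.90},
             "M_b": {"s1": 0.50, "s2": 0.40, "s3": 0.10}},
    "skin": {"M_a": {"s1": 0.92, "s2": 1.00, "s3": 0.97},
             "M_b": {"s1": 0.35, "s2": 0.00, "s3": 0.60}},
}
strong = core_across_sites(toy, min_coverage=0.90, min_fraction=0.90, min_sites=2)
weak = core_across_sites(toy, min_coverage=0.33, min_fraction=0.66, min_sites=2)
```

In the toy data only `M_a` passes the strict setting, while both modules pass the relaxed one, mirroring the way the relaxed threshold on the slide admits many more modules than the strict one.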

Proteoglycan degradation by the gut microbiota
[Figure: proteoglycan structure, with an AA core and glycosaminoglycans (polysaccharide chains)]

Proteoglycan degradation: From pathways to enzymes
[Figure: enzyme relative abundance]
– Heparan sulfate degradation missing due to the absence of heparanase, a eukaryotic enzyme
– Other pathways not bottlenecked by individual genes
– HUMAnN links microbiome-wide pathway reconstructions → site-specific pathways → individual gene families

Patterns of variation in human microbiome function by niche
Three main axes of variation:
– Eukaryotic exterior
– Low-diversity vaginal
– Gut metabolism
Oral vs. tooth hard surface
Only broad patterns: every human-associated habitat is functionally distinct!

Normal varies a lot at the genus level (16S)
[Figure: relative frequency of genera within stool; 200 subjects, 343 genera; labeled genera: Bacteroides, Alistipes, Faecalibacterium, Parabacteroides. Credit: Dirk Gevers]

Normal varies a lot at the species level (WGS)
[Figure: relative frequency of Bacteroides species within stool; 123 samples; labeled species: Bacteroides vulgatus, Bacteroides sp., Bacteroides uniformis, Bacteroides sp., Bacteroides stercoris, Bacteroides caccae. Credit: Dirk Gevers]

What’s wrong with this picture?
[Figure: strain-level profile of posterior fornix microbiomes. Row labels as extracted (some strain identifiers are truncated): Lactobacillus crispatus MV-1A-US; Lactobacillus crispatus JV-V01; Lactobacillus crispatus CHN; Lactobacillus crispatus; Lactobacillus crispatus MV-3A-US; Lactobacillus crispatus ST1; Lactobacillus gasseri JV-V03; Lactobacillus gasseri; Lactobacillus gasseri; Lactobacillus gasseri MV-22; Bifidobacterium breve DSM; Bifidobacterium dentium ATCC; Mycoplasma hominis; Clostridiales genomosp BVAB3 str UPII9-5; Clostridiales genomosp BVAB3 UPII9-5; Gardnerella vaginalis AMD; Prevotella timonensis CRIS 5C-B1; Megasphaera genomosp type 1 str 28L; Porphyromonas uenonis 60-3; Gardnerella vaginalis; Gardnerella vaginalis 5-1; Atopobium vaginae DSM; Gardnerella vaginalis ATCC; Lactobacillus jensenii 1153; Lactobacillus jensenii; Lactobacillus jensenii SJ-7A-US; Lactobacillus jensenii; Lactobacillus jensenii JV-V16; Lactobacillus jensenii 27-2-CHN; Lactobacillus jensenii CHN; Lactobacillus iners AB-1; Lactobacillus iners DSM]
Species and strains matter – but so does your method for identifying them in a community!

Core gene families
A core gene is a gene strongly conserved within a clade.
If Gene X is a core gene for Clade Y:
– All subclades of Clade Y must have Gene X as a core gene (strict definition)
– Gene X may be a core gene of several (unrelated) clades
The definition has to be relaxed to account for:
– Low-level gene losses
– Sequencing errors
– Gene-calling errors

Examples of core genes

Clade-specific marker genes
Gene X is a marker gene for Clade Y if X is a core gene for Y and X never appears outside Clade Y.
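The core-gene and marker-gene definitions translate directly into set operations. The sketch below assumes each genome is represented as a set of gene-family identifiers and that clades are flat (the nested-subclade requirement of the strict definition is not modeled). The `min_fraction` parameter stands in for the relaxed definition, and all names and toy data are hypothetical.

```python
def core_genes(genomes_by_clade, clade, min_fraction=1.0):
    """Genes present in at least min_fraction of the clade's genomes."""
    genomes = genomes_by_clade[clade]
    counts = {}
    for genes in genomes:
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    return {g for g, n in counts.items() if n / len(genomes) >= min_fraction}

def marker_genes(genomes_by_clade, clade, min_fraction=1.0):
    """Core genes of the clade that never appear in any genome outside it."""
    outside = set()
    for other, genomes in genomes_by_clade.items():
        if other != clade:
            for genes in genomes:
                outside |= genes
    return core_genes(genomes_by_clade, clade, min_fraction) - outside

# Toy data: geneX is in every clade_Y genome and nowhere else;
# geneA is core for clade_Y but also occurs in clade_Z.
toy = {
    "clade_Y": [{"geneX", "geneA", "geneB", "geneE"},
                {"geneX", "geneA", "geneE"},
                {"geneX", "geneA", "geneC"}],
    "clade_Z": [{"geneA", "geneD"},
                {"geneA", "geneD", "geneC"}],
}
strict_core = core_genes(toy, "clade_Y")
relaxed_core = core_genes(toy, "clade_Y", min_fraction=0.6)
strict_markers = marker_genes(toy, "clade_Y")
relaxed_markers = marker_genes(toy, "clade_Y", min_fraction=0.6)
```

Relaxing `min_fraction` admits `geneE`, which is missing from one genome, as both a core gene and a marker; this is the kind of tolerance for gene loss and sequencing or gene-calling errors that the relaxed definition is meant to provide.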

Examples of marker genes

The BactoChip: high-throughput microbial species identification
With Olivier Jousson, Annalisa Ballarini

BactoChip: detecting single species
With Olivier Jousson, Annalisa Ballarini

MetaPhlAn: Metagenomic Phylogenetic Analysis
Inferring microbial abundances from metagenomic data using marker genes
– Map metagenomic reads to marker genes to infer microbial abundances, normalizing for copy number, gene length, etc.
– Much faster than existing approaches, as the marker gene database is ~50 times smaller than the whole microbial sequence DB
→ A few hours instead of weeks for Illumina samples with 100Gb of sequence data
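The normalization idea on this slide can be sketched in a few lines: reads mapped to each marker are divided by marker length, averaged within a clade, and rescaled to sum to one. This is only an illustration of length-normalized marker counting under assumed inputs. It is not MetaPhlAn's actual estimator, it ignores copy number, and the function and variable names are hypothetical.

```python
def clade_abundances(read_counts, marker_lengths, marker_clade):
    """Relative clade abundances from reads mapped to clade-specific markers.

    read_counts:    {marker: number of mapped reads}
    marker_lengths: {marker: marker length in bases}
    marker_clade:   {marker: clade the marker is specific to}
    """
    per_clade = {}
    for marker, clade in marker_clade.items():
        # Reads per base, so that long markers do not dominate.
        density = read_counts.get(marker, 0) / marker_lengths[marker]
        per_clade.setdefault(clade, []).append(density)
    # Average over each clade's markers, then rescale to relative abundance.
    mean_density = {c: sum(v) / len(v) for c, v in per_clade.items()}
    total = sum(mean_density.values())
    if total == 0:
        return {c: 0.0 for c in mean_density}
    return {c: d / total for c, d in mean_density.items()}

# Toy input: two markers for clade_A, one for clade_B.
lengths = {"m1": 1000, "m2": 500, "m3": 2000}
clades = {"m1": "clade_A", "m2": "clade_A", "m3": "clade_B"}
counts = {"m1": 100, "m2": 50, "m3": 600}
abundances = clade_abundances(counts, lengths, clades)
```

Here `clade_A` has a read density of 0.1 per base on both markers and `clade_B` has 0.3, so the relative abundances come out at roughly one quarter and three quarters despite `clade_A` having two markers.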

MetaPhlAn: synthetic validation on log-normal abundances
Summary of 8 synthetic communities composed of 2M reads drawn from 200 organisms with log-normally distributed abundances
[Figure: panels at species level and class level]
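A synthetic community of this kind can be mocked up by drawing organism abundances from a log-normal distribution and then assigning reads to organisms in proportion to those abundances. The sketch below assumes illustrative log-normal parameters (`mu`, `sigma`), which the slide does not specify, and only assigns read counts to organisms; it does not simulate sequences or account for genome length.

```python
import random
from collections import Counter

def synthetic_community(n_organisms=200, n_reads=2_000_000,
                        mu=0.0, sigma=1.0, seed=0):
    """Return (relative abundances, per-organism read counts)."""
    rng = random.Random(seed)
    # Log-normally distributed raw abundances, rescaled to sum to 1.
    raw = [rng.lognormvariate(mu, sigma) for _ in range(n_organisms)]
    total = sum(raw)
    abundances = [x / total for x in raw]
    # Assign each read to an organism with probability equal to its abundance.
    draws = rng.choices(range(n_organisms), weights=abundances, k=n_reads)
    counts = Counter(draws)
    reads = [counts.get(i, 0) for i in range(n_organisms)]
    return abundances, reads

# Smaller read count than on the slide, to keep the example quick.
true_abundances, read_counts = synthetic_community(n_reads=20_000)
```

Comparing a profiler's estimated abundances against `true_abundances` is then a straightforward way to score it, at whatever taxonomic level the organisms are grouped.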

Matching 16S and more

The human microbiome at species-level resolution

Whence enterotypes?
[Figure: panels for genera and species]

Microbial community function and structure in the human microbiome: the story so far?
Who’s there varies even in health
– What they’re doing doesn’t (as much)
– Both correlate with niche
– By the way: both change during disease and treatment
There are patterns in this variation
– Function correlates with membership and phenotype
– “Pathogenicity” correlates with lower prevalence
– Membership means species, strains, or variants
– Patterns aren’t always as simple as enterotypes
~1/3 to 2/3 of human metagenome characterized
– Job security!

Ask both what you can do for your microbiome and what your microbiome can do for you

Thanks!
Wendy Garrett, Michelle Rooks, Ramnik Xavier, Harry Sokol, Nicola Segata, Levi Waldron, Fah Sathira
Human Microbiome Project / HMP Metabolic Reconstruction
Owen White, George Weinstock, Karen Nelson, Joe Petrosino, Mihai Pop, Pat Schloss, Makedonka Mitreva, Erica Sodergren, Vivien Bonazzi, Jane Peterson, Lita Proctor, Sahar Abubucker, Yuzhen Ye, Beltran Rodriguez-Mueller, Jeremy Zucker, Qiandong Zeng, Mathangi Thiagarajan, Brandi Cantarel, Maria Rivera, Barbara Methe, Bill Klimke, Daniel Haft, Dirk Gevers, Bruce Birren, Mark Daly, Doyle Ward, Eric Alm, Ashlee Earl, Lisa Cosimi, Joseph Moon, Vagheesh Narasimhan, Tim Tickle, Xochi Morgan, Josh Reyes, Jeroen Raes, Karoline Faust, Jacques Izard, Olivier Jousson, Annalisa Ballarini

Linking function to community composition
[Figure: taxa and correlated metabolic pathways across 52 posterior fornix microbiomes]
Taxa: Lactobacillus crispatus, Lactobacillus jensenii, Lactobacillus gasseri, Lactobacillus iners, Gardnerella/Atopobium, Candida/Bifidobacterium
Correlated pathway groups: F-type ATPase, THF; sugar transport; phosphate and peptide transport; AA and small molecule biosynthesis; Embden-Meyerhof glycolysis, phosphotransferases; eukaryotic pathways
Plus ubiquitous pathways: transcription, translation, cell wall, portions of central carbon metabolism…

Linking communities to host phenotype
[Figure: normalized relative abundance against vaginal pH (posterior fornix) and against body mass index; top correlates with BMI in stool]
Vaginal pH, community metabolism, and community composition represent a strong, direct link between phenotype and function in these data.