Computational metagenomics and the human microbiome Curtis Huttenhower 01-21-11 Harvard School of Public Health Department of Biostatistics.

Presentation transcript:

Computational metagenomics and the human microbiome Curtis Huttenhower Harvard School of Public Health Department of Biostatistics

What to do with your metagenome? (slide 2)

(x10^10)

Diagnostic or prognostic biomarker for host disease
Public health tool monitoring population health and interactions
Comprehensive snapshot of microbial ecology and evolution
Reservoir of gene and protein functional information

Who’s there?
What are they doing?
What do functional genomic data tell us about microbiomes?
What can our microbiomes tell us about us?

* Using terabases of sequence and thousands of experimental results

The Human Microbiome Project

Ongoing
300 “normal” adults, 16S rDNA + WGS
5 sites/18 samples + blood:
Oral cavity: saliva, tongue, palate, buccal mucosa, gingiva, tonsils, throat, teeth
Skin: ears, inner elbows
Nasal cavity
Gut: stool
Vagina: introitus, mid, fornix
Reference genomes (~ )
All healthy subjects; follow-up projects in psoriasis, Crohn’s, colitis, obesity, acne, cancer, antibiotic-resistant infection…
Hamady, 2009; Kolenbrander, 2010

HMP Organisms: Everyone and everywhere is different (slide 4)

[Figure: organisms (taxa) versus body sites + individuals. Labels: ear, gut, nose, mouth, vagina, arm; mucosa, palate, gingiva, tonsils, saliva, sub. plaq., sup. plaq., throat, tongue]

Every microbiome is surprisingly different
Most organisms are rare in most places
Even common organisms vary tremendously in abundance among individuals
Aerobicity, interaction with the immune system, and extracellular medium appear to be major determinants
There are few organismal biotypes in health

HUMAnN: Community metabolic and functional reconstruction (slide 5)

HUMAnN: HMP Unified Metabolic Analysis Network

Data: 300 subjects, 1-3 visits/subject, ~6 body sites/visit, M reads/sample, 100bp reads
Flow: WGS reads → Genes (KOs) → Pathways (KEGGs) → Pathways/modules
Functional seq.: KEGG + MetaCYC; CAZy, TCDB, VFDB, MEROPS…

BLAST → Genes (BLAST ?)
Genes → Pathways: MinPath (Ye 2009)
Smoothing: Witten-Bell
Gap filling: c(g) = max( c(g), median )
Taxonomic limitation: Rem. paths in taxa < ave.
Xipe: Distinguish zero/low (Rodriguez-Mueller in review)
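The gap-filling rule on this slide, c(g) = max( c(g), median ), can be illustrated with a minimal Python sketch. It is not HUMAnN's own code. The slide does not say which values the median is taken over, so the sketch assumes it is the median abundance of the genes assigned to one pathway. The function name `gap_fill` and the gene identifiers are hypothetical.

```python
from statistics import median


def gap_fill(gene_abundance):
    """Apply c(g) = max(c(g), median) within one pathway.

    gene_abundance maps gene id -> abundance for the genes of a single
    pathway (assumption: the median is taken over these genes).
    """
    if not gene_abundance:
        return {}
    med = median(gene_abundance.values())
    # Genes below the pathway median are raised to it; others are unchanged.
    return {gene: max(count, med) for gene, count in gene_abundance.items()}


# Hypothetical pathway with one gene missed entirely (abundance 0).
pathway = {"geneA": 0.0, "geneB": 4.0, "geneC": 5.0, "geneD": 6.0, "geneE": 12.0}
filled = gap_fill(pathway)
print(filled)
```

The effect is that a gene with an unexpectedly low count inside an otherwise well-supported pathway no longer drags the pathway estimate down, while genes at or above the median are left alone.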

HUMAnN: Community metabolic and functional reconstruction (slide 6)

[Figure panels: Pathway coverage; Pathway abundance]

HUMAnN: Validating gene and pathway abundances on synthetic data (slide 7)

Validated on individual genes, module coverage + abundance
False negatives: short genes (<100bp), taxonomically rare pathways
False positives: large and multicopy (not many in bacteria)

HUMAnN: The steps that didn’t make the cut (slide 8)

[Figure panels: Abundance; Coverage]

Functional modules in 741 HMP samples (slide 9)

[Figure: Coverage and Abundance panels, pathways versus samples. Body-site labels: ANO(BM)PFO(SP)SRCRCO(TD)]

Zero microbes (of ~1,000) are core among body sites
Zero microbes are core among individuals
19 (of ~220) pathways are present in every sample
53 pathways are present in 90%+ samples
Only 31 (of 1,110) pathways are present/absent from exactly one body site
263 pathways are differentially abundant in exactly one body site
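Counts such as "present in every sample" and "present in 90%+ samples" come down to a prevalence calculation over a presence/absence table. The Python sketch below shows one way to compute it. The data layout, function names, sample names, and pathway names are all hypothetical, and it assumes presence has already been called for each sample.

```python
def pathway_prevalence(presence):
    """presence maps sample id -> set of pathways detected in that sample.

    Returns pathway -> fraction of samples in which it was detected.
    """
    n_samples = len(presence)
    counts = {}
    for pathways in presence.values():
        for pathway in pathways:
            counts[pathway] = counts.get(pathway, 0) + 1
    return {pathway: c / n_samples for pathway, c in counts.items()}


def core_pathways(presence, min_prevalence=1.0):
    """Pathways detected in at least min_prevalence of all samples."""
    prevalence = pathway_prevalence(presence)
    return sorted(p for p, frac in prevalence.items() if frac >= min_prevalence)


# Hypothetical toy data: three samples, three pathways.
samples = {
    "stool_1": {"glycolysis", "tca_cycle", "motility"},
    "tongue_1": {"glycolysis", "tca_cycle"},
    "nares_1": {"glycolysis"},
}
strict_core = core_pathways(samples)          # present in every sample
relaxed_core = core_pathways(samples, 0.6)    # present in 60%+ of samples
print(strict_core, relaxed_core)
```

Lowering `min_prevalence` from 1.0 to 0.9 corresponds to moving from the "every sample" count to the "90%+ samples" count on the slide.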

Microbial environment trumps host environment (in health) (slide 10)

[Figure panels: HMP stool, colored by BMI; MetaHIT stool, colored by IBD. Labels: Microbes; Pathways; Aerobic body sites; Gastrointestinal body sites; Pathways in all body sites (“core”)]

Human microbiome structure dictated primarily by microbial niche, not host (in health)
Huge variation in who’s there; small variation in what they’re doing
Note: definitely variation in how these functions are implemented
Does not yet speak to environment (diet!), genetics, or disease

Metagenomic biomarker discovery (slide 11)

[Workflow diagram labels, in slide order]
Gene expression
SNP genotypes
Healthy/IBD
BMI
Diet
Taxa & pathways
Batch effects?
Population structure?
Niches & phylogeny
Test for correlates
Multiple hypothesis correction
Feature selection (p >> n)
Confounds/stratification/environment
Cross-validate
Biological story?
Independent sample
Intervention/perturbation
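The slide lists multiple hypothesis correction as a step but does not name a method. As an illustration only, the Python sketch below uses the Benjamini-Hochberg false discovery rate procedure, which is a common choice when many taxa or pathways are tested at once. Choosing this method is an assumption, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from testing four features against a phenotype.
pvals = [0.01, 0.04, 0.03, 0.20]
qvals = benjamini_hochberg(pvals)
print(qvals)
```

With p >> n, as the slide notes, the number of tests is large relative to the number of samples, so some correction of this kind is needed before any feature is reported as a candidate biomarker.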

LEfSe: Metagenomic class comparison and explanation (slide 12)

LEfSe: LDA + Effect Size
Nicola Segata

LEfSe: Evaluation on synthetic data (slide 13)

Microbes characteristic of the oral and gut microbiota (slide 14)

Aerobic, microaerobic and anaerobic communities

High oxygen: skin, nasal
Mid oxygen: vaginal, oral
Low oxygen: gut

LEfSe: The TRUC murine colitis microbiota (slide 16)

With Wendy Garrett

MetaHIT: The gut microbiome and IBD (slide 17)

124 subjects: 99 healthy, 21 UC + 4 CD
Qin 2010
ReBLASTed against KEGG since published data obfuscates read counts
Flow: WGS reads → Taxa (Phymm, Brady 2009); WGS reads → Genes (KOs) → Pathways (KEGGs) → Pathways/modules
With Ramnik Xavier, Joshua Korzenik

MetaHIT: Taxonomic CD biomarkers (slide 18)

[Figure labels: Firmicutes; Enterobacteriaceae; Up in CD; Down in CD; UC]

MetaHIT: Functional CD biomarkers (slide 19)

[Figure panels: Subset of enriched modules in CD patients; Subset of enriched pathways in CD patients. Labels: Motility; Transporters; Sugar metabolism; Growth/replication; Down in CD; Up in CD]

Sleipnir: Software for scalable functional genomics (slide 20)

Massive datasets require efficient algorithms and implementations.

Sleipnir C++ library for computational functional genomics
Data types for biological entities: microarray data, interaction data, genes and gene sets, functional catalogs, etc.
Network communication, parallelization
Efficient machine learning algorithms: generative (Bayesian) and discriminative (SVM)
And it’s fully documented!
It’s also speedy: microbial data integration computation takes <3hrs

Thanks! (slide 21)

Jacques Izard, Wendy Garrett, Pinaki Sarder, Nicola Segata, Levi Waldron, Larisa Miropolsky

Interested? We’re recruiting students and postdocs!

Human Microbiome Project; HMP Metabolic Reconstruction
George Weinstock, Jennifer Wortman, Owen White, Makedonka Mitreva, Erica Sodergren, Vivien Bonazzi, Jane Peterson, Lita Proctor, Sahar Abubucker, Yuzhen Ye, Beltran Rodriguez-Mueller, Jeremy Zucker, Qiandong Zeng, Mathangi Thiagarajan, Brandi Cantarel, Maria Rivera, Barbara Methe, Bill Klimke, Daniel Haft

Ramnik Xavier, Dirk Gevers, Bruce Birren, Mark Daly, Doyle Ward, Eric Alm, Ashlee Earl, Lisa Cosimi, Sarah Fortune

The LEfSe algorithm (slide 23)

Statistical consistency
Biological consistency
Overall effect size
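The slide names the three stages but not the tests behind them. In the published LEfSe method the first stage, statistical consistency, is a Kruskal-Wallis rank test of each feature across classes. The Python sketch below computes only that test's H statistic, without tie correction and without a p-value, as a way to see what the stage measures. It is not LEfSe's implementation, and the function names and abundance values are hypothetical.

```python
def average_ranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of groups."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum * rank_sum / len(group)
        start += len(group)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical relative abundances of one taxon in two classes of samples.
healthy = [1.0, 2.0, 3.0]
disease = [4.0, 5.0, 6.0]
h_stat = kruskal_wallis_h([healthy, disease])
print(h_stat)
```

A larger H means the feature's ranks separate the classes more cleanly. In the published method, features passing this stage are then checked for consistency across subclasses and ranked by an LDA-based effect size, matching the second and third items on the slide.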

HMP: Metabolism, host-microbiome interactions, and microbial taxa (slide 24)

>3200 gene families differential in the mucosa
>1500 upregulated outside the mucosa and not in any Actinobacterial genome

[Figure labels: 16S; WGS]