An Introduction to Meta’omic Analyses


1 An Introduction to Meta’omic Analyses
Curtis Huttenhower, Galeb Abu-Ali, Eric Franzosa
Harvard T.H. Chan School of Public Health, Department of Biostatistics

2 Sequencing as a tool for microbial community analysis
Who’s there? (Taxonomic profiling)
What are they doing? (Functional profiling)
What does it all mean? (Statistical analysis)
Lyse cells, extract DNA (and/or RNA)
Amplicons: PCR to amplify a single marker gene, e.g. 16S rRNA; classify sequence -> microbe; Samples x Microbes: taxon counts / %s
Meta’omic: taxon counts, gene catalog / counts, genomes, pathway reconstructions, genetic variants...
Image: George Rice, Montana State University

3 What to do with your metagenome?
Reservoir of gene and protein functional information
Comprehensive snapshot of microbial ecology and evolution
Public health tool monitoring population health and epidemiology
Diagnostic or prognostic biomarker for host disease

4 A foundational metagenomic study: Global Ocean Sampling
2003/ ongoing

5 What is a “normal” human microbiome? 300 healthy human subjects
Slide by Dirk Gevers
The NIH Human Microbiome Project (HMP): A comprehensive microbial survey
Multiple body sites (15 male, 18 female)
Multiple visits
Clinical metadata
5,200 16S samples, spanning 300 subjects, 18 sites
700 shotgun samples, subset of 100 subjects, six sites
2,500
Notes: We have collected microbial samples consistently across multiple body sites (15 in men and 18 in women) from 300 different subjects, and for a number of these subjects we included samples from up to 3 different time points. In all, we will be able to define the microbes present in samples

6 How to find biology in your meta’ome
Looking for ecology? Diversity metrics, k-mer analysis, curve fitting
Looking for specific bugs? Assembly: +novelty, -difficulty. Mapping: +speed/ease, -novelty
Looking for specific processes? Intrinsic annotation: +novelty, -difficulty. Extrinsic annotation: +sensitivity, -novelty
Looking for variants? Clustering: +specificity, -difficulty. Mapping: +sensitivity, -novelty
What else?
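As a minimal sketch of one of the diversity metrics named on this slide, the Shannon index can be computed from per-taxon counts in a sample. The function name and the toy counts are illustrative assumptions and do not come from the presentation.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no counts")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h

# Hypothetical taxon counts: an even community and a dominated one.
even = [25, 25, 25, 25]
dominated = [97, 1, 1, 1]
print(shannon_index(even))       # equals ln(4) for four equally abundant taxa
print(shannon_index(dominated))  # lower, because one taxon dominates
```

The index rises with both the number of taxa and the evenness of their abundances, which is why it is often reported alongside simple richness.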

7 Typical shotgun metagenome and metatranscriptome analyses
Taxonomic profiling: Samples x Microbes, relative abundances
Assembly
Functional profiling: Samples x Genes or Pathways, relative abundances

8 From Bugs to Drugs in Inflammatory Bowel Disease
The gut microbiota varies in IBD
Diversity is reduced, specific clades are enriched or depleted, and consistent functional dysbioses are induced
Differential within CD and UC, and heterogeneous within these diseases
To be actionable, requires...
New-onset patients to stratify disease subtypes and response to treatment
Longitudinal data to predict onset and resolution of flares
Microbial molecular data for new potential bioactives
Host molecular data to identify targetable pathways

9 Taxonomic and functional dysbioses in IBD
Dirk Gevers, Xochitl Morgan
Gevers, CHM 2014

10 The “HMP2” IBD Multi’omics Data resource

11 Preliminary IBD microbiome multi’omics
Alexandra Sirota-Madi
HMP2 Cross-Sectional Stool (157): CD - 69, UC - 55, Non-IBD Controls - 33
HMP2 Longitudinal (80), Time1 to Time10: CD - 5, UC - 2, IC - 1
NLIBD Cross-Sectional (65): CD - 20, UC - 23, Non-IBD Controls - 22
Shotgun Metagenomic Sequencing: Taxonomic Profiling, Functional Profiling
Metabolomic Profiling: Non-targeted LC-MS Metabolomics
1. Lipids - positive ion mode
2. HILIC - polar metabolites, positive ion mode
3. FFA - free fatty acids and bile acids, negative ion mode
4. CMH - polar metabolites, negative ion mode
Notes: The experimental design included shotgun metagenomic sequencing of both the PRISM cross-sectional and the longitudinal samples, from which we can get taxonomic profiling and functional profiling. The new, and I can say exciting, addition to the functional profiling component is that now we can get not only the functional profile of the community but also the contribution of each species to each function, and I will touch on that later on. Finally, all three cohorts were run through the metabolomics profiling pipeline, using 4 methods, as Clary mentioned before.
The main goals are:
To develop tools and methods for integration of multi-omic data in the context of IBD
Specifically, we would like to know whether there is a unique metabolic signature for IBD patients
Can we use those metabolites as potential targets for small molecule screening?
Can these metabolites be validated with an independent cohort?
Can we connect metabolic profiles to taxonomic profiles and to a specific function?

12 MetaPhlAn2: metagenomic taxonomic profiling
Nicola Segata
Gene X is a unique marker gene for clade Y
Notes: Nicola’s taken advantage of this catalog for several computational methods, but the one I’d like to talk about today relies on identifying high-quality, taxonomically unique marker sequences guaranteed to arise from exactly one microbial clade. By organizing the gene catalog of IMG into groups of gene families (not orthologous families, but highly nucleotide-similar sequences) we can identify gene families that are core to one or more clades. This means that the gene is conserved throughout the clade, although it may appear elsewhere due to conservation or horizontal gene transfer. Core genes are thus a superset of unique marker genes, which are both core to a clade and unique there: they never appear elsewhere, even by horizontal transfer. Nicola’s developed a system called ChocoPhlAn, which I’m pretty sure is an acronym for something, that identifies all genes core or unique for any clade within IMG. This results in a high-quality set of about two million unique markers, with uniqueness verified by whole-genome BLAST against the entire database. About 400 thousand of these proved sufficient to uniquely identify all 1200 species in the database, plus several hundred higher-level clades, with several hundred markers for most organisms.
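The core versus unique distinction described in these notes can be sketched with plain set operations. This is an illustration under stated assumptions, not the ChocoPhlAn implementation: the function name, the input dictionaries, and the toy genomes are all hypothetical, and the nucleotide-similarity clustering and BLAST verification steps from the notes are not modeled.

```python
def find_markers(genomes, clade_of):
    """genomes: genome id -> set of gene-family ids.
    clade_of: genome id -> clade label.
    Returns (core, unique), each mapping clade -> set of gene families."""
    members = {}
    for genome, clade in clade_of.items():
        members.setdefault(clade, []).append(genome)

    core, unique = {}, {}
    for clade, in_clade in members.items():
        # Core: families present in every genome of the clade.
        core[clade] = set.intersection(*(genomes[g] for g in in_clade))
        # Families seen in any genome outside the clade.
        outside = set()
        for genome, families in genomes.items():
            if clade_of[genome] != clade:
                outside |= families
        # Unique markers: core families never seen outside the clade.
        unique[clade] = core[clade] - outside
    return core, unique

# Hypothetical toy input: two clades with two genomes each.
genomes = {
    "g1": {"famA", "famB", "famC"},
    "g2": {"famA", "famB"},
    "g3": {"famB", "famD"},
    "g4": {"famB", "famD", "famC"},
}
clade_of = {"g1": "Y", "g2": "Y", "g3": "Z", "g4": "Z"}
core, unique = find_markers(genomes, clade_of)
print(sorted(core["Y"]), sorted(unique["Y"]))
```

In the toy input, famB is core to both clades and therefore cannot be a unique marker for either, which mirrors the statement that core genes are a superset of unique markers.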

13 Representative Differentially Abundant Microbes and Metabolites
This is a zoom into the top ten differentially abundant (DA) metabolites, which include taurine, chenodeoxycholic acid (part of the bile acid biosynthesis pathway), and arachidonic acid, which are up in IBD, and other metabolites such as urobilin and cholesterol, which are down in IBD. On the bottom panel, we can see the distribution of DA species such as Ruminococcus gnavus and Blautia producta that are up in IBD, and the rest of the species, which are present in controls and depleted in IBD, such as F. prausnitzii, Eubacterium rectale, and Coprococcus comes.

14 Co-variation between Gut Microbes and Metabolites
Spearman of residuals after regressing disease, medication, and age
Figure labels: Bacteria; Up in IBD; Down in IBD; FDR < 0.1
Notes: Correlation matrix between residuals of differentially abundant bugs and differentially abundant metabolites after correcting for disease, age, and medication, with warmer colors indicating a higher Spearman correlation coefficient. Between-subject Spearman correlations (FDR < 0.1) were calculated, and taxa and metabolites with at least one significant correlation were hierarchically clustered.
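The analysis named on this slide (regress out covariates, rank-correlate the residuals, control the false discovery rate) can be sketched in plain Python as follows. All function names and toy values are assumptions made for illustration, the slide does not say which software was used, and the Benjamini-Hochberg procedure is assumed here as the FDR method. The sketch does not compute p-values for the correlations; `benjamini_hochberg` simply adjusts whatever p-values a separate test provides.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def residuals(y, covariates):
    """Residuals of y after least squares on the covariates plus an intercept."""
    rows = [[1.0] + list(r) for r in covariates]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    q = [0.0] * n
    running = 1.0
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running = min(running, pvalues[i] * n / rank)
        q[i] = running
    return q

# Hypothetical data: covariates are (disease status, age) per subject.
covariates = [(0, 30), (0, 40), (0, 50), (1, 35), (1, 45), (1, 55)]
bug = [1.0, 1.2, 0.9, 3.1, 2.8, 3.3]
metabolite = [5.0, 5.3, 4.8, 9.2, 8.7, 9.5]
rho = spearman(residuals(bug, covariates), residuals(metabolite, covariates))
print(rho)
```

Correlating residuals rather than raw abundances keeps a shared dependence on disease status from showing up as an apparent microbe-metabolite association.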

15 HUMAnN2: Organism-specific functional profiling of metagenomes and metatranscriptomes
Eric Franzosa, Lauren McIver

Gene Family  RPKs
UniRef50_A6L0N6  67
UniRef50_A6L0N6|s__Bacteroides_fragilis  8
UniRef50_A6L0N6|s__Bacteroides_finegoldii  5
UniRef50_A6L0N6|s__Bacteroides_stercoris  4
UniRef50_A6L0N6|unclassified  1
UniRef50_G9S1V7  60
UniRef50_G9S1V7|s__Bacteroides_vulgatus  31
UniRef50_G9S1V7|s__Bacteroides_thetaiotaomicron  22
UniRef50_G9S1V7|s__Bacteroides_stercoris  7

Pathway  RPKs
indole-3-acetate activation  57
indole-3-acetate activation|unclassified  32.3
indole-3-acetate activation|s__Bacteroides_ovatus  4.5
indole-3-acetate activation|s__Alistipes_putredinis  3
indole-3-acetate activation|s__Bacteroides_caccae  2.25
melibiose degradation  55
melibiose degradation|unclassified  17
melibiose degradation|s__Parabacteroides_merdae  8
melibiose degradation|s__Bacteroides_caccae  6

Notes: The relative abundance of gene i in a metagenome is the number of reads j that map to a gene sequence in the family, weighted by the inverse p-value of each mapping and normalized by the average length of all gene sequences in the orthologous family.
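Rows in the layout shown on this slide, where a community total is followed by "feature|species" strata, can be split with a few lines of Python. This is a sketch of reading that layout, not the HUMAnN2 API: the function names are hypothetical, and the values are the UniRef50_G9S1V7 rows from the slide.

```python
def parse_stratified(rows):
    """Split (name, value) rows into community totals and per-species strata.
    A name such as 'UniRef50_X|s__Species' is a stratum of feature 'UniRef50_X'."""
    totals, strata = {}, {}
    for name, value in rows:
        feature, sep, species = name.partition("|")
        if sep:
            strata.setdefault(feature, {})[species] = value
        else:
            totals[feature] = value
    return totals, strata

def contributions(totals, strata, feature):
    """Fraction of a feature's total abundance attributed to each stratum."""
    total = totals[feature]
    return {sp: v / total for sp, v in strata.get(feature, {}).items()}

# Rows copied from the gene family table on the slide (RPK values).
rows = [
    ("UniRef50_G9S1V7", 60),
    ("UniRef50_G9S1V7|s__Bacteroides_vulgatus", 31),
    ("UniRef50_G9S1V7|s__Bacteroides_thetaiotaomicron", 22),
    ("UniRef50_G9S1V7|s__Bacteroides_stercoris", 7),
]
totals, strata = parse_stratified(rows)
for species, fraction in contributions(totals, strata, "UniRef50_G9S1V7").items():
    print(species, round(fraction, 3))
```

For this gene family the three strata sum to the community total of 60 RPK, so the fractions describe the full species-level breakdown.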

16 Microbial Contributions to Bile Acid Dysmetabolism
Figure labels: Conjugated bile acid hydrolases produced by the intestinal microbiota; Samples; Log2; Up in IBD; Down in IBD
Notes: Next, we wanted to see whether we can connect bile acid metabolites to a function, identify which of the bacteria have the potential to deconjugate, and determine the contribution of each species. Here you can see the same function, “Conjugated bile acid hydrolase”, and the contribution of each bacterium to the total functional potential. In the middle, you can see a high signal from bacteria that are abundant in healthy individuals, where deconjugation occurs normally, compared to bacteria that are up in IBD, which also carry the enzymatic capability, but probably not at the same levels as in healthy individuals. Most Gram-positive bacteria inhabiting the gastrointestinal tract are capable of hydrolysing bile salts. Bile salts play an important role in lipid digestion in mammals. In the liver, bile salts are synthesized from cholesterol and conjugated with the amino acids glycine or taurine. Following excretion into the intestinal lumen, the amino acid part of the bile salts can be hydrolysed (i.e. deconjugated) by bile salt hydrolases (Bsh; EC ), also designated choloylglycine hydrolase or conjugated bile acid hydrolase (CBAH), produced by the intestinal microbiota.

17 Gene-based fingerprints capture strain variation
in individuals’ most abundant (stable) bugs

18 PanPhlAn: the approach
Nicola Segata
Metagenomic sample -> read mapping -> microbial pangenomes -> gene coverage -> cluster to gene families -> pan-gene family coverage
Plot: abundance-sorted pan-gene families against coverage, showing multi-copy genes, a plateau of genes from one metagenome’s strain, and absent genes

19 Gene-family distribution curves
Select samples with “step” distribution (colored curves): E. coli strain is present
Reject non-step (gray) curves
Plot axes: E. coli gene-families against base coverage
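The plateau idea from these two slides can be sketched as a simple threshold rule on per-family coverage. This is not the PanPhlAn algorithm: the function names, the use of the median as the plateau estimate, and every threshold below are assumptions chosen only to illustrate how a "step" sample might be told apart from a non-step one.

```python
import statistics

def classify_gene_families(coverage, low=0.5, high=2.0):
    """coverage: gene family -> mean base coverage in one sample.
    Labels each family relative to the plateau, estimated (as an assumption)
    by the median coverage of families with nonzero coverage."""
    covered = [c for c in coverage.values() if c > 0]
    if not covered:
        return {fam: "absent" for fam in coverage}
    plateau = statistics.median(covered)
    labels = {}
    for fam, c in coverage.items():
        if c < low * plateau:
            labels[fam] = "absent"
        elif c > high * plateau:
            labels[fam] = "multi-copy"
        else:
            labels[fam] = "present"
    return labels

def has_step_profile(coverage, min_coverage=2.0, min_plateau_fraction=0.5):
    """Accept a sample if the plateau is deep enough and wide enough."""
    covered = [c for c in coverage.values() if c > 0]
    if not covered or statistics.median(covered) < min_coverage:
        return False
    labels = classify_gene_families(coverage)
    on_plateau = sum(1 for label in labels.values() if label == "present")
    return on_plateau / len(coverage) >= min_plateau_fraction

# Hypothetical coverage values for seven pan-gene families in two samples.
strain_sample = {"f1": 40, "f2": 11, "f3": 10, "f4": 9, "f5": 10, "f6": 0.5, "f7": 0}
empty_sample = {"f1": 0.3, "f2": 0, "f3": 0.1, "f4": 0, "f5": 0.2, "f6": 0, "f7": 0}
print(classify_gene_families(strain_sample))
print(has_step_profile(strain_sample), has_step_profile(empty_sample))
```

In the first toy sample most families sit near the same coverage, one is far above it, and two are essentially uncovered, which is the step shape the slide describes; the second sample has no plateau and would be rejected.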

20 PanPhlAn for “meta-epidemiology”
Metagenomes from [Loman et al., 2013]

21 StrainPhlAn: metagenomic strain identification and tracking

22 A tool for strain level population genomics
Countries: China, Denmark, Estonia, Finland, Peru, Hungary, Italy, Norway, France, Spain, Sweden, USA, Germany
P. copri as an example species
Alignment length: 66k nt
Median SNPs: 830 [3.6%]
# pos. samples: 123
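Strain comparisons of this kind rest on counting SNPs between aligned marker sequences from different samples. The following is a minimal sketch of that counting step, assuming pre-aligned sequences of equal length; the function names, the gap characters, and the toy sequences are hypothetical and this is not the StrainPhlAn implementation.

```python
def snp_rate(seq_a, seq_b, gap_chars="-N"):
    """Fraction of mismatches over positions where both sequences have a called base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # skip gaps and uncalled bases
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return mismatches / compared

def pairwise_rates(alignment):
    """alignment: sample name -> aligned sequence. Returns rates for all sample pairs."""
    names = sorted(alignment)
    return {
        (a, b): snp_rate(alignment[a], alignment[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

# Hypothetical aligned marker sequences from three samples.
alignment = {
    "sample1": "ACGTACGT",
    "sample2": "ACGTACGA",
    "sample3": "ACG-ACGT",
}
for pair, rate in pairwise_rates(alignment).items():
    print(pair, round(rate, 3))
```

Normalizing by the number of comparable positions rather than the full alignment length keeps samples with partial coverage from looking artificially similar.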

23 Most bugs are dominated by one stable strain
Intra HMP (520 comparisons)
Intra MetaHIT (330 comp.)
Intra house (59 comp.)
Inter HMP (22k comp.)
Inter MetaHIT (180k comp.)
Inter everyone (1.5M comp.)

24 Thanks! http://huttenhower.sph.harvard.edu Human Microbiome Project 2
Lita Procter Jon Braun Dermot McGovern Subra Kugathasan Ted Denson Janet Jansson Bruce Birren Chad Nusbaum Clary Clish Joe Petrosino Thad Stappenbeck Alex Kostic Ayshwarya Subramanian Xochitl Morgan Casey DuLong Daniela Boernigen Lauren McIver Ramnik Xavier Human Microbiome Project Jane Peterson Sarah Highlander Barbara Methe Karen Nelson George Weinstock Owen White George Weingart Emma Schwager Eric Franzosa Boyu Ren Tiffany Hsu Ali Rahnavard Hera Vlamakis Levi Waldron Joseph Moon Jim Kaminski Tommi Vatanen Koji Yasuda Siyuan Ma Galeb Abu-Ali Dirk Gevers Nicola Segata Clary Clish Justin Scott Wendy Garrett Bahar Sayoldin Randall Schwager Melanie Schirmer Himel Mallick Moran Yassour Alexandra Sirota-Madi
Notes: Finally, thanks again to the ISMB organizers and ISCB for putting together a fantastic conference as always, to our funders, the rest of my lab back in Boston, our collaborators, and everyone in the audience today, many thanks again.
