Functional profiling with HUMAnN2


1 Functional profiling with HUMAnN2
Curtis Huttenhower Galeb Abu-Ali Eric Franzosa Harvard T.H. Chan School of Public Health Department of Biostatistics

2 The two big questions of microbial community analysis...
What are they doing? Who is there?

3 Metagenomic analyses: molecular functions and biological roles
Orthology: Grouping genes by conserved sequence features (COG, KO, FIGfam…)
Structure: Grouping genes by similar protein domains (Pfam, TIGRfam, SMART, EC…)
Biological roles: Grouping genes by pathway and process involvement (GO, KEGG, MetaCyc, SEED…)
Warnecke, 2007; Turnbaugh, 2009; DeLong, 2006

4 “Who’s there,” versus, “What they’re doing,” in the healthy human microbiome
[Figure: per-subject phylum abundance and pathway abundance at each body site: Vaginal, Skin, Nares, Oral (SupP), Oral (BM), Oral (TD), Gut]

Instead, consider this view that Dirk showed earlier of the taxa in a collection of the HMP samples, specifically the gut communities. Each color here represents a taxon, and as he described, there’s huge variation among individuals, with Bacteroides ranging from tremendously dominant to a minority organism. We can use this same view for the metabolic reconstruction data, where now each color represents a specific metabolic pathway. There’s much less variation in community function among individuals than there is in community composition: roughly the same pathways are present at the same abundances in each individual’s gut, regardless of variation in membership. This is true not only in the gut, but in all of the HMP’s body sites. The body sites themselves differ somewhat in metabolism, as I’ll discuss later, but again not nearly as dramatically as they differ in composition. This is now an overview of most of the HMP’s 16S data and all 700 of our metabolic reconstructions, and it captures the important functional patterns in the human microbiome perhaps better than the previous slide. From this high-level perspective, though, I’d like to zoom in to a specific pathway or set of enzymes and talk about…

5 HUMAnN2: Organism-specific functional profiling of metagenomes and metatranscriptomes
The relative abundance of gene family i in a metagenome is the number of reads j that map to a gene sequence in the family, weighted by the inverse p-value of each mapping and normalized by the average length of all gene sequences in the orthologous family.
Eric Franzosa, Lauren McIver
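The abundance definition on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not HUMAnN2's actual code: "inverse p-value" is read here as a weight of (1 - p), and the function names, input layout, and toy numbers are all hypothetical.

```python
from collections import defaultdict

def family_abundance(hits, family_lengths):
    """Length-normalized, p-value-weighted read counts per gene family.

    hits: iterable of (family_id, p_value), one entry per mapped read.
    family_lengths: dict of family_id -> list of member gene lengths.
    """
    weighted = defaultdict(float)
    for family, p_value in hits:
        # Assumption: "inverse p-value" weighting is taken as (1 - p).
        weighted[family] += 1.0 - p_value
    abundance = {}
    for family, total in weighted.items():
        lengths = family_lengths[family]
        mean_length = sum(lengths) / len(lengths)
        abundance[family] = total / mean_length
    return abundance

def to_relative(abundance):
    """Scale abundances so they sum to 1 across families."""
    total = sum(abundance.values())
    return {k: v / total for k, v in abundance.items()} if total else {}

# Toy input (hypothetical): two reads hit famA, one read hits famB.
hits = [("famA", 0.0), ("famA", 0.0), ("famB", 0.0)]
lengths = {"famA": [1000, 1000], "famB": [500]}
relative = to_relative(family_abundance(hits, lengths))
```

In the toy input, famA collects twice as many reads as famB but its genes are twice as long on average, so after length normalization both families end up with the same relative abundance.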

6 HUMAnN2: stratified output
UniRef gene cluster: Gene name                                  Total gene abundance (RPK)
UniRef90_R6K3Z5: IMP dehydrogenase                              600.95
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_caccae           234.76
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_dorei            107.38
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_ovatus           92.18
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_stercoris        83.95
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_vulgatus         57.27
UniRef90_R6K3Z5: IMP dehydrogenase|unclassified                 25.41

The total gene abundance is the sum (Σ) of the per-species and unclassified stratifications.

MetaCyc pathway                                                 Pathway abundance    Coverage
PWY-7221: GTP biosynthesis                                      200.35               1
PWY-7221: GTP biosynthesis|Bacteroides_caccae                   120.23
PWY-7221: GTP biosynthesis|Bacteroides_dorei                    11.12
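As a small illustration of the stratified gene-family rows above, the following Python sketch splits labels on the "|" separator and checks that the per-species and unclassified values add up to the community total. The function name and the in-memory row layout are assumptions made for this example; the numbers are the ones shown on the slide.

```python
def split_stratified(rows):
    """Separate community totals from per-species strata.

    rows: list of (label, value); stratified labels look like
    "feature|species", totals have no "|".
    """
    totals, strata = {}, {}
    for label, value in rows:
        if "|" in label:
            feature, species = label.split("|", 1)
            strata.setdefault(feature, {})[species] = value
        else:
            totals[label] = value
    return totals, strata

feature = "UniRef90_R6K3Z5: IMP dehydrogenase"
rows = [
    (feature, 600.95),
    (feature + "|Bacteroides_caccae", 234.76),
    (feature + "|Bacteroides_dorei", 107.38),
    (feature + "|Bacteroides_ovatus", 92.18),
    (feature + "|Bacteroides_stercoris", 83.95),
    (feature + "|Bacteroides_vulgatus", 57.27),
    (feature + "|unclassified", 25.41),
]

totals, strata = split_stratified(rows)
stratified_sum = sum(strata[feature].values())
# Compare with a small tolerance because the values are floats.
matches_total = abs(stratified_sum - totals[feature]) < 1e-6
print(feature, round(stratified_sum, 2), matches_total)
```

The check is applied only to the gene-family rows, since the slide states the summation property for those; the pathway rows shown are a partial listing.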

7 HUMAnN2 synthetic evaluation (genes)
HUMAnN2 tiered search is more accurate…
Comprehensive search suffers from spurious hits
…and is >7x faster: ~0.3 hours versus ~2.3 hours for comprehensive search (10M reads, 8 cores)
...and provides accurate per-species quantification!
Evaluation: compare expected vs. observed gene abundance on a synthetic human gut metagenome (top 20 species), at 1x coverage and at staggered abundance (~0.1x to 100x coverage)

8 The other HMP2: HMP1-II
Shotgun metagenomes
18 body sites in 5 areas
6 intensively sampled sites
2,285 samples from 264 people
Up to 3 time points per (person, site)

[Figure: sample counts per body area and site; labels on the slide: 2012, 2016, Ns, Nt]
Gut: 152, 49, 159, 548
Oral: 403, 44, 153, 1248
Urogenital: 68, 17, 54, 183
Skin: 118, 18, 64, 306
Tongue Dorsum: 133, 41, 116, 413
Supragingival Plaque: 121, 35, 114, 378
Buccal Mucosa: 117, 30, 105, 370

New analyses:
HUMAnN2: Enzyme and pathway abundances, stratified by species
MetaPhlAn2: Eukaryotes and viruses
StrainPhlAn: Strain profiling
Temporal dynamics
Fully updated IDBA-UD assemblies + gene catalog / annotations

9 HUMAnN2 real-world performance
Applied HUMAnN2’s tiered search to profile 100s of human metagenomes (HMP, six major body sites)
Pangenome search tier: bowtie2 w/ sample-specific pangenome db; ~60% of reads align before translated search
Translated search tier: DIAMOND w/ comprehensive protein db; ~15% more reads align during translated search (total ~75%)
Pangenome search tier is 1-2 orders of magnitude faster than comprehensive translated search
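The control flow of a tiered search can be sketched as follows. This is a minimal illustration of the idea only: the two search callables are toy stand-ins, not calls to bowtie2 or DIAMOND, and every name and number in the example is hypothetical.

```python
def tiered_search(reads, nucleotide_search, translated_search):
    """Run a fast first tier, then pass only the leftovers to a slow tier.

    Each search callable takes a list of reads and returns a dict
    {read: hit} containing only the reads it managed to align.
    """
    tier1_hits = nucleotide_search(reads)
    leftover = [r for r in reads if r not in tier1_hits]
    tier2_hits = translated_search(leftover)
    unaligned = [r for r in leftover if r not in tier2_hits]
    return tier1_hits, tier2_hits, unaligned

# Toy stand-ins (hypothetical): reads are integers 0..19.
def toy_nucleotide_search(reads):
    # Pretend the first tier aligns reads 0..11.
    return {r: "pangenome_gene" for r in reads if r < 12}

def toy_translated_search(reads):
    # Pretend the second tier aligns reads 12..14.
    return {r: "protein_family" for r in reads if r < 15}

reads = list(range(20))
tier1, tier2, unaligned = tiered_search(
    reads, toy_nucleotide_search, toy_translated_search
)
fraction_tier1 = len(tier1) / len(reads)
fraction_total = (len(tier1) + len(tier2)) / len(reads)
```

The toy numbers are chosen to mirror the proportions on the slide (about 60% aligned in the first tier, about 75% in total). The design point is that the expensive translated search only sees the reads the fast tier could not place.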

10 HUMAnN2 identifies body site-specific “signature pathways” in the human microbiome
“signature for area i” → Q1( area i ) > Q3( area j ) for all j ≠ i; very stringent!
≈50 total signature pathways across 4 major body areas (20 shown)
Values plotted = median (Q2) abundance for samples from that area
Max area ≈2% relative abundance (other areas square-root scaled)
Zoom in…
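The signature criterion, Q1( area i ) > Q3( area j ) for all j ≠ i, can be written out directly. The sketch below uses Python's standard `statistics.quantiles`; the choice of the "inclusive" quartile method, the function names, and the toy abundances are assumptions for illustration and may differ from what was used for the slide.

```python
from statistics import quantiles

def q1_q3(values):
    """First and third quartile of a list of per-sample abundances."""
    # Assumption: inclusive quartiles; the slide does not say which method.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3

def signature_areas(abundance_by_area):
    """Areas whose Q1 exceeds the Q3 of every other area.

    abundance_by_area: dict of area -> list of abundances of one
    pathway across that area's samples (at least two per area).
    """
    quartiles = {area: q1_q3(v) for area, v in abundance_by_area.items()}
    return [
        area for area in quartiles
        if all(
            quartiles[area][0] > quartiles[other][1]
            for other in quartiles if other != area
        )
    ]

# Toy data (hypothetical abundances for a single pathway).
pathway = {
    "gut": [0.8, 0.9, 1.0, 1.1, 1.2],
    "oral": [0.1, 0.2, 0.1, 0.3, 0.2],
    "skin": [0.0, 0.1, 0.0, 0.2, 0.1],
}
result = signature_areas(pathway)
```

Because the lower quartile of one area must clear the upper quartile of every other area, at most one area can qualify per pathway, which is why the slide calls the test very stringent.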

11 HUMAnN2 identifies body site-specific “signature pathways” in the human microbiome
Zoom in…
L-rhamnose degradation (RHAMCAT-PWY) is a signature of the human gut microbiome
Unclassified abundance in the minority

12 HUMAnN2 reveals three distinct mechanisms of cross-environment functional conservation
Mechanism 1: Complex
Multiple contributing species per individual
Ex: L-rhamnose degradation in gut microbiome

13 HUMAnN2 reveals three distinct mechanisms of cross-environment functional conservation
Mechanism 2: Per-person-dominant
One dominant contributing species per individual
Ex: peptidoglycan biosynthesis in vaginal microbiome

14 HUMAnN2 reveals three distinct mechanisms of cross-environment functional conservation
Mechanism 3: Universal-dominant
Same, dominant contributing species in all individuals
Ex: trehalose degradation in skin microbiome
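One way to make the three mechanisms concrete is a rule of thumb over stratified pathway contributions. The sketch below is purely illustrative and is not the procedure used in the talk: the 0.5 dominance cutoff, the majority rule, the function name, and the toy species labels are all hypothetical choices.

```python
def classify_mechanism(contributions, dominance=0.5):
    """Label a pathway as complex, per-person-dominant, or universal-dominant.

    contributions: dict of sample -> {species: abundance} for one pathway.
    dominance: hypothetical cutoff; a species "dominates" a sample when it
    contributes at least this fraction of the pathway's abundance there.
    """
    top_species = []
    for strata in contributions.values():
        total = sum(strata.values())
        if total <= 0:
            continue  # skip samples where the pathway is absent
        species, value = max(strata.items(), key=lambda kv: kv[1])
        # None marks a sample in which no single species passes the cutoff.
        top_species.append(species if value / total >= dominance else None)
    dominated = [s for s in top_species if s is not None]
    # Hypothetical rule: fewer than half the samples dominated -> complex.
    if not dominated or len(dominated) * 2 < len(top_species):
        return "complex"
    if len(set(dominated)) == 1:
        return "universal-dominant"
    return "per-person-dominant"

# Toy inputs with made-up species labels.
universal = {"s1": {"sp_A": 9, "sp_B": 1}, "s2": {"sp_A": 8, "sp_B": 2}}
per_person = {"s1": {"sp_A": 9, "sp_B": 1}, "s2": {"sp_A": 1, "sp_B": 9}}
mixed = {"s1": {"sp_A": 3, "sp_B": 3, "sp_C": 4},
         "s2": {"sp_A": 4, "sp_B": 3, "sp_C": 3}}
labels = [classify_mechanism(c) for c in (universal, per_person, mixed)]
```

The three toy inputs correspond to the three slides: one species dominating everywhere, a different dominant species in each person, and no dominant species at all.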

15 HUMAnN2 reveals unusual “relative expression” in paired metatranscriptomes & metagenomes
Sucrose degradation follows a complex attribution pattern across ~200 human gut metagenomes…
…but its expression can be dominated by a single species in paired gut metatranscriptomes!
In collaboration with the STARR Consortium & HPFS cohort

16 Conclusions
HUMAnN2 implements a tiered approach to facilitate meta’omic functional profiling
This approach is more accurate & much faster than traditional comprehensive meta’omic search
Results stratify by species for free: answering “Who’s there?” and “What are they doing?” in tandem

17 Further improving short reads and regions of local homology among proteins
Protein of interest
Belongs to a family
Local homology to unrelated families
Short reads from unrelated families may map to protein of interest (spurious hits)

18 ShortBRED Identify
Find unique markers for interesting prots
Jim Kaminski
Prots of interest + reference database
Cluster into families
Identify short, common regions
Marker types: True Marker, Junction Marker, Quasi Marker

19 ShortBRED Quantify
Use markers for highly specific profiling
Jim Kaminski
Metagenome reads + ShortBRED markers
Translated search for high ID hits
Normalize relative abundances

20 ShortBRED: ABR in human gut metagenomes

21 ShortBRED: Functional profiling of microbial genomes

22 Thanks!
http://huttenhower.sph.harvard.edu
Human Microbiome Project 2
Lita Procter, Jon Braun, Dermot McGovern, Subra Kugathasan, Ted Denson, Janet Jansson, Bruce Birren, Chad Nusbaum, Clary Clish, Joe Petrosino, Thad Stappenbeck, Alex Kostic, Ayshwarya Subramanian, Xochitl Morgan, Casey DuLong, Daniela Boernigen, Lauren McIver, Ramnik Xavier
Human Microbiome Project
Jane Peterson, Sarah Highlander, Barbara Methe, Karen Nelson, George Weinstock, Owen White, George Weingart, Emma Schwager, Eric Franzosa, Boyu Ren, Tiffany Hsu, Ali Rahnavard, Hera Vlamakis, Levi Waldron, Joseph Moon, Jim Kaminski, Tommi Vatanen, Koji Yasuda, Siyuan Ma, Galeb Abu-Ali, Dirk Gevers, Nicola Segata, Clary Clish, Justin Scott, Wendy Garrett, Bahar Sayoldin, Randall Schwager, Melanie Schirmer, Himel Mallick, Moran Yassour, Alexandra Sirota-Madi

Finally, thanks again to the ISMB organizers and ISCB for putting together a fantastic conference as always, to our funders, the rest of my lab back in Boston, our collaborators, and everyone in the audience today. Many thanks again.
