1
Functional profiling with HUMAnN2
Eric Franzosa, Jason Lloyd-Price, Curtis Huttenhower, Galeb Abu-Ali, Ali Rahnavard
STAMPS 2017
Harvard T.H. Chan School of Public Health, Department of Biostatistics
2
The two big questions of microbial community profiling:
Who is there? (taxonomic profiling)
What are they doing? (functional profiling)
Like many great bioinformatics problems, answering these questions begins with sequence search!
3
HUMAnN2 for taxon-specific metagenome and metatranscriptome functional profiling
The relative abundance of gene family i in a metagenome is the number of reads j that map to a gene sequence in the family, with each mapping weighted by its inverse p-value, normalized by the average length of all gene sequences in the orthologous family.
Eric Franzosa, Lauren McIver
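The weighting rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not HUMAnN2's implementation: the `family_abundance` function, the `(read_id, family, p_value)` hit tuples, the p-value floor, and the choice to rescale each read's weights so that one read contributes a total weight of 1 are all hypothetical.

```python
from collections import defaultdict


def family_abundance(hits, mean_length_kb, p_floor=1e-300):
    """Length-normalized gene family abundance from weighted read mappings.

    hits: iterable of (read_id, family, p_value) tuples (hypothetical format).
    mean_length_kb: dict mapping family -> average gene length in kilobases.
    """
    # Weight every mapping by its inverse p-value, grouped by read.
    per_read = defaultdict(lambda: defaultdict(float))
    for read_id, family, p_value in hits:
        per_read[read_id][family] += 1.0 / max(p_value, p_floor)

    # Assumption: rescale so each read contributes a total weight of 1.
    weighted_reads = defaultdict(float)
    for weights in per_read.values():
        total = sum(weights.values())
        for family, weight in weights.items():
            weighted_reads[family] += weight / total

    # Normalize by the average gene length of the family.
    return {fam: n / mean_length_kb[fam] for fam, n in weighted_reads.items()}


# Toy data: read r2 maps equally well to two families and is split between them.
hits = [
    ("r1", "famA", 1e-10),
    ("r2", "famA", 1e-10),
    ("r2", "famB", 1e-10),
    ("r3", "famB", 1e-5),
]
result = family_abundance(hits, {"famA": 1.5, "famB": 0.5})
print(result)
```

In the toy data each family receives 1.5 weighted reads, so the shorter family (0.5 kb) ends up with the higher length-normalized abundance.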
4
HUMAnN2: stratified output
UniRef gene cluster: Gene name                                  Total gene abundance (RPK)
UniRef90_R6K3Z5: IMP dehydrogenase                              600.95
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_caccae           234.76
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_dorei            107.38
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_ovatus           92.18
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_stercoris        83.95
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_vulgatus         57.27
UniRef90_R6K3Z5: IMP dehydrogenase|unclassified                 25.41
Σ: per-species & unclassified stratifications (total row ~HUMAnN1)

MetaCyc pathway                                                 Pathway abundance & coverage
PWY-7221: GTP biosynthesis                                      200.35   1
PWY-7221: GTP biosynthesis|Bacteroides_caccae                   120.23
PWY-7221: GTP biosynthesis|Bacteroides_dorei                    11.12
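A small Python sketch can make the stratification concrete: it splits feature names on `|` and checks that the per-species and unclassified rows add up to the community total. The `split_stratified` helper is hypothetical, and the sketch assumes tab-separated rows in the style of a HUMAnN2 gene families file; the numbers are the ones on the slide.

```python
from collections import defaultdict

# Rows from the slide, assumed tab-separated as in a gene families file.
TABLE = (
    "UniRef90_R6K3Z5: IMP dehydrogenase\t600.95\n"
    "UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_caccae\t234.76\n"
    "UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_dorei\t107.38\n"
    "UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_ovatus\t92.18\n"
    "UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_stercoris\t83.95\n"
    "UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_vulgatus\t57.27\n"
    "UniRef90_R6K3Z5: IMP dehydrogenase|unclassified\t25.41\n"
)


def split_stratified(text):
    """Return (totals, strata): community totals and per-taxon contributions."""
    totals = {}
    strata = defaultdict(dict)
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        feature, value = line.rsplit("\t", 1)
        if "|" in feature:
            name, taxon = feature.split("|", 1)
            strata[name][taxon] = float(value)
        else:
            totals[feature] = float(value)
    return totals, dict(strata)


totals, strata = split_stratified(TABLE)
for name, total in totals.items():
    stratified_sum = sum(strata.get(name, {}).values())
    print(f"{name}: total={total:.2f}, sum of strata={stratified_sum:.2f}")
```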
5
HUMAnN2 real-world performance
Applied HUMAnN2’s tiered search to profile >2K human metagenomes (HMP1-II, six major body sites)
~60% of reads align before translated search
~15% more reads align during translated search (total ~75%)
Pangenome search tier is 1-2 orders of magnitude faster than comprehensive translated search
Pangenome search: bowtie2 w/ sample-specific pangenome db
Translated search: DIAMOND w/ comprehensive protein db
6
And it works on non-human meta’omes, too
Luke Thompson
7
Quantifying the diversity of species contributing a function within and across subjects
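A pathway’s contributional alpha-diversity is calculated from the distribution of taxa providing it (DNA or RNA) within a community; contributional beta-diversity is the corresponding comparison between communities.
[Figure: within-subject diversity (low to high) vs. between-subject diversity (low to high)]
Low within-subject, low between-subject: simple, consistent
Low within-subject, high between-subject: simple, variable
High within-subject, low between-subject: complex, consistent
High within-subject, high between-subject: complex, variable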
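The slide does not say which diversity indices are used, so the following Python sketch is an assumption-laden illustration only: it uses the Gini-Simpson index for contributional alpha-diversity and Bray-Curtis dissimilarity on relative contributions for contributional beta-diversity. The function names are hypothetical; `subject_1` reuses the PWY-7221 strata shown earlier, and `subject_2` is made up.

```python
def relative(contrib):
    """Convert per-taxon contributions to fractions of the pathway total."""
    total = sum(contrib.values())
    return {taxon: value / total for taxon, value in contrib.items()}


def gini_simpson(contrib):
    """Alpha-diversity of contributors within one community (assumed index)."""
    return 1.0 - sum(p * p for p in relative(contrib).values())


def bray_curtis(contrib_a, contrib_b):
    """Beta-diversity between two communities' contributor profiles (assumed index)."""
    a, b = relative(contrib_a), relative(contrib_b)
    shared = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in set(a) | set(b))
    return 1.0 - shared  # both profiles sum to 1, so this is Bray-Curtis


# Per-species contributions to one pathway in two subjects.
subject_1 = {"Bacteroides_caccae": 120.23, "Bacteroides_dorei": 11.12}
subject_2 = {"Bacteroides_dorei": 50.0}  # hypothetical single-contributor case

print("alpha, subject 1:", gini_simpson(subject_1))
print("alpha, subject 2:", gini_simpson(subject_2))
print("beta, 1 vs 2:", bray_curtis(subject_1, subject_2))
```

A pathway carried by one species scores 0 for alpha-diversity ("simple"), and two subjects with identical contributor profiles score 0 for beta-diversity ("consistent").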
8
HUMAnN2 reveals unusual “relative expression” in paired metatranscriptomes & metagenomes
Sucrose degradation follows a complex attribution pattern across ~200 human gut metagenomes…
…but its expression can be dominated by a single species in paired gut metatranscriptomes!
In collaboration with the STARR Consortium & HPFS cohort
9
The “HMP2” IBD Multi’omics Data resource
With Ramnik Xavier
10
The IBD Multi’omics DataBase
Cesar Arze
11
The IBD metatranscriptome in the HMP2 IBDMDB
117 Subjects:
59 Crohn’s Disease
34 Ulcerative Colitis
24 non-IBD Controls
Gender:
57 Female
59 Male
1 unknown
Cohorts:
32 MGH adult new onset
30 Cedars-Sinai adult establ.
31 Cincinnati peds new onset
11 Emory peds new onset
13 MGH peds new onset
Melanie Schirmer
12
Different microbes can transcribe shared pathways
HISDEG-PWY: L-histidine degradation I
Histidine is an α-amino acid that is used in the biosynthesis of proteins
A. putredinis has been implicated in IBD
Major contributor to transcription in subsets of IBD patients
13
PWY-7094: fatty acid salvage
Pathways can be contributed by different microbes over time
Faecalibacterium prausnitzii
Time-courses for individual patients: CD Patient 1, CD Patient 2
14
https://bitbucket.org/biobakery/biobakery/wiki/humann2
HUMAnN2 tutorial
16
HUMAnN2 synthetic evaluation (genes)
Synthetic human gut metagenome (top 20 species)
Staggered abundance, ~0.1x to 100x coverage
Compare exp. vs. obs. gene abundance
HUMAnN2 tiered search is more accurate…
Comprehensive search suffers from spurious hits
…and is ~3x faster: ~0.7 hours vs. ~2.1 hours (10M reads, 8 cores)
…and provides accurate per-species quantification!
18
Considerations for paired metatranscriptomes & metagenomes
$ humann2_rna_dna_norm --input_dna <DNA genefamilies file> --input_rna <RNA genefamilies file> --output_basename <basename of the 3 output files>
Calculates RNA/DNA abundance ratios
Smooths the RNA and DNA abundances prior to taking the ratio
Also outputs smoothed RNA and DNA files
UniRef90_R6K3Z5: IMP dehydrogenase                              2.02
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_caccae           5.96
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_dorei            3.82
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_ovatus           1.80
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_stercoris        0.87
UniRef90_R6K3Z5: IMP dehydrogenase|Bacteroides_vulgatus         0.34
UniRef90_R6K3Z5: IMP dehydrogenase|unclassified                 1.96
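Why smoothing matters before taking a ratio can be shown with a short Python sketch. The slide does not describe the smoothing method, so this is not the script's actual algorithm: it assumes a simple additive pseudocount followed by sum-normalization of each sample, and the `smoothed_ratio` function and toy abundances are hypothetical.

```python
def smoothed_ratio(rna, dna, pseudocount=1.0):
    """RNA/DNA ratio per feature after additive smoothing (assumed method).

    rna, dna: dicts mapping feature name -> abundance.
    Features missing from one sample are treated as zero before smoothing,
    so the ratio is always finite.
    """
    features = sorted(set(rna) | set(dna))
    rna_s = {f: rna.get(f, 0.0) + pseudocount for f in features}
    dna_s = {f: dna.get(f, 0.0) + pseudocount for f in features}

    # Assumption: normalize each sample to relative abundance so that
    # differences in sequencing depth do not drive the ratio.
    rna_total = sum(rna_s.values())
    dna_total = sum(dna_s.values())
    return {
        f: (rna_s[f] / rna_total) / (dna_s[f] / dna_total) for f in features
    }


# Toy abundances: gene "b" is present in DNA but has no RNA reads.
rna = {"a": 9.0, "b": 0.0}
dna = {"a": 4.0, "b": 4.0}
ratios = smoothed_ratio(rna, dna)
for feature, ratio in ratios.items():
    print(f"{feature}\t{ratio:.2f}")
```

Without the pseudocount, a feature with zero DNA abundance would produce a division by zero, and a feature with zero RNA abundance would collapse to a ratio of exactly 0 regardless of sequencing depth.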