Amplicon functional profiling with PICRUSt

Slides:



Advertisements
Similar presentations
Lab 1: Using data output from Qiime, transformations, quality control
Advertisements

Meta’omic functional profiling with HUMAnN Curtis Huttenhower Harvard School of Public Health Department of Biostatistics U. Oregon META Center.
CSS Cascading Style Sheets. Objectives Using Inline Styles Working with Selectors Using Embedded Styles Using an External Style Sheet Applying a Style.
XSL eXtensible Stylesheet Language. What is XSL? XSL is a language that allows one to describe a browser how to process an XML file. XSL can convert an.
Meta’omic functional profiling with HUMAnN Curtis Huttenhower Harvard School of Public Health Department of Biostatistics U. Oregon.
The Earth Microbiome Project: Planetary Scale Systems Ecology Sean M. Gibbons, Antonio González, Emmanuel Prestat, Jai Rideout, J. Gregory Caporaso, Rob.
Computational Analysis of the Taxanomical Classification of Short 16S rRNA Sequences Christel Chehoud Mentor: Brian Haas.
Practical Bioinformatics Community structure measures for meta-genomics István Albert Bioinformatics Consulting Center Penn State.
Structure and formatting HTML pages Helen Treharne Department of Computing.
Sahar Abubucker, Nicola Segata,
Computational metagenomics and the human microbiome Curtis Huttenhower Harvard School of Public Health Department of Biostatistics.
The following slides in this presentation are templates for poster sized 42 inches by 56 inches. You can page through the templates and choose one you.
Discovery of new biomarkers as indicators of watershed health and water quality Anamaria Crisan & Mike Peabody.
This Must Be the Place: The Abundance and Distribution of Microbes using Maximum Entropy Will Shoemaker.
Microsoft Excel 2007 © Wiley Publishing All Rights Reserved. The L Line The Express Line to Learning L Line.
G053 Lecture 12 Introduction To HTML Mr C Johnston ICT Teacher
Human Microbiome Conference
Analysis of microbial communities with QIIME Justin Kuczynski February 2012.
Microbial diversity and virulence probing of five different body sites Anu Rebbapragada, Pub. Health Ontario Central Lab. Canada Wei-Jen Lin, Cal State.
Assignment feedback Everyone is doing very well!
Meta’omic functional profiling with ShortBRED Galeb Abu-Ali Curtis Huttenhower Harvard T.H. Chan School of Public Health Department of Biostatistics.
Meta’omic Analysis with MetaPhlAn, HUMAnN, and LEfSe Curtis Huttenhower Harvard School of Public Health Department of Biostatistics.
Microsoft® Excel Key and format dates and times. 1 Use Date & Time functions. 2 Use date and time arithmetic. 3 Use the IF function. 4 Create.
Metagenomics at Second Genome
Download all complete prokaryotic genomes from the NCBI RefSeq database Extract 16S rRNA sequences from each genome. Use UCLUST algorithm to cluster 16S.
RNASTAR, Greengenes, and QIIMEdb Rob Knight HHMI/CU Boulder.
Compositionality and Sparseness in 16S rRNA data Anthony Fodor Associate Professor Bioinformatics and Genomics UNC Charlotte.
1 Title Line on a Divider Slide Format >Level one bullet text for a divider slide.
Meta’omic functional profiling with ShortBRED Curtis Huttenhower Harvard School of Public Health Department of Biostatistics U. Oregon.
The Treatment-Naive Microbiome in New-Onset Crohn’s Disease Dirk Gevers, Subra Kugathasan, Lee A. Denson, Yoshiki Vázquez-Baeza, Will Van Treuren, Boyu.
Assignment 3 Your going to modify your last Best Movies page and Recipe page Your going to modify your last Best Movies page and Recipe page You will need.
Wikis in Education: Part III Wiki Basics University School of Milwaukee.
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
An Introduction to Meta’omic Analyses Curtis Huttenhower Galeb Abu-Ali Eric Franzosa Harvard T.H. Chan School of Public Health Department of Biostatistics.
Functional profiling with HUMAnN2
Using the bioBakery Curtis Huttenhower
Meta’omic functional profiling with ShortBRED
Strain profiling with StrainPhlAn and PanPhlAn
Bare bones notes.
An Introduction to Meta’omic Analyses
Automating reproducible analyses with AnADAMA2 and bioBakery Workflows
Functional profiling with HUMAnN2
Microsoft PowerPoint 2003 Illustrated Introductory
Microsoft Office Illustrated Introductory, Premium Edition
Taxonomic profiling with MetaPhlAn2
Identifying personal microbiomes using metagenomic codes
Systematic Characterization and Analysis of the Taxonomic Drivers of Functional Shifts in the Human Microbiome  Ohad Manor, Elhanan Borenstein  Cell Host.
Taxonomic profiling with MetaPhlAn2
Strain profiling with StrainPhlAn
Microsoft Office 2003 Illustrated Introductory
Hui Wang OARDC How to use SSRHunter Hui Wang OARDC
Curtis Huttenhower Galeb Abu-Ali Eric Franzosa
Taxonomic composition of subway microbial communities.
Volume 17, Issue 2, Pages (February 2015)
One-Line Title. One-Line Title Sample Title, Too Long For One Line.
Volume 20, Issue 5, Pages (November 2014)
A Microbiome Foundation for the Study of Crohn’s Disease
One-Line Title. One-Line Title Sample Title, Too Long For One Line.
One-Line Title. One-Line Title Sample Title, Too Long For One Line.
(a) PCoA of the abundance of unique OTUs per sample from the 16S marker gene sequencing data from the AGP data repository (small spheres) and the San Diego.
Volume 10, Issue 4, Pages (October 2011)
ENTER YOUR PRESENTATION TITLE
Example usage of mockrobiota MC resource for marker gene and metagenome sequencing pipelines. Example usage of mockrobiota MC resource for marker gene.
Volume 20, Issue 5, Pages (November 2014)
Volume 17, Issue 3, Pages (March 2015)
A Microbiome Foundation for the Study of Crohn’s Disease
A typical current computational meta'omic pipeline to analyze and contrast microbial communities. A typical current computational meta'omic pipeline to.
A Presentation by Regina Strelecki
Presentation transcript:

Amplicon functional profiling with PICRUSt Curtis Huttenhower 08-15-14 Harvard School of Public Health Department of Biostatistics U. Oregon META Center

Who is there? What are they doing? The two big questions… Who is there? (taxonomic profiling) What are they doing? (functional profiling) In marker gene data

Gene families in one HMP hard palate sample PICRUSt: Inferring community metagenomic potential from marker gene sequencing With Rob Knight, Rob Beiko Seq. genomes One can recover general community function with reasonable accuracy from 16S profiles. http://picrust.github.com Gene families in one HMP hard palate sample HMP stool sample If function is so important, what about the thousands of 16S-based microbial community taxonomic profiles? Metagenomic abundance 16S predicted abundance Pathways and modules Orthologous gene families Taxon abundances PI-CRUST has been an excellent collaboration among three groups, ours, Rob Knight’s, and Rob Beiko’s, with the goal being to relate 16S-based taxonomic profiles with metabolic or functional profiles as might be recovered from shotgun metagenomic data. If you consider a 16S profile to identify the relative abundances of organisms from without a phylogenetic tree, often an exact reference genome is available that can quite specifically identify the functional potential of an organism. Sometimes a nearby genome is available instead, allowing an educated guess, and sometimes a more complex ancestral state reconstruction is needed to determine the most likely functional profile of an organism. Given such ancestral state reconstructions with appropriate confidence intervals, however, each reconstruction’s gene catalog can then be “multiplied” by the corresponding organism’s 16S abundance to infer community gene abundances. I honestly never expected this to work well, but when comparing inferred and real gene abundance profiles in communities from the HMP where both are available, the reconstruction actually performs remarkably well, even when I don’t cherry pick just the best few examples to show you. Morgan Langille from Rob Beiko’s group has an excellent poster detailing this process, and what it means in our context of IBD is that we can reasonably accurately go from 16S organismal abundances to gene family abundances, which can then be reconstructed into pathways using the same HUMAnN tool mentioned earlier for metagenomes. HUMAnN Reconstructed “genomes” Relative abundance

Setup notes reminder Slides with green titles or text include instructions not needed today, but useful for your own analyses Keep an eye out for red warnings of particular importance Command lines and program/file names appear in a monospaced font. Commands you should specifically copy/paste are in monospaced bold blue.

Installing PICRUSt http://picrust.github.io/picrust/install.html Requires Python (>=2.7, easy) PyCogent (http://pycogent.org) BIOM v1 (http://biom-format.org) And PICRUSt!

Installing PICRUSt http://picrust.github.io/picrust/install.html

Installing PICRUSt data http://picrust.github.io/picrust/picrust_precalculated_files.html Need to download the precomputed data used by PICRUSt separately It’s big! Saves you the trouble in the software itself

Picking PICRUSt-compatible OTUs http://picrust.github.io/picrust/tutorials/otu_picking.html PICRUSt uses precomputed ancestral state reconstructions OTUs in your data must match those used during precalculation This means Greengenes, either 18may2012 or 13.5

Picking PICRUSt-compatible OTUs This means that you must use completely closed-reference OTU picking in QIIME Produces an OTU table in which all features are Greengenes IDs:

Picking PICRUSt-compatible OTUs

Picking PICRUSt-compatible OTUs

Munging BIOMs Great format, hard to read! Easy to convert to/from TSV convert_biom.py -i hmp_otu_subset.tsv -o hmp_otu_subset.biom --biom_table_type="otu table"

Everything I ever needed to know about PICRUSt I learned from this web site http://picrust.github.io/picrust/tutorials/metagenome_prediction.html

Step 1: Normalize OTUs by 16S copy number We can get better predictions by dividing “raw” OTU counts by their expected 16S copy number export PYTHONPATH=`pwd`/picrust-1.0.0 ./picrust-1.0.0/scripts/normalize_by_copy_number.py -i hmp_otu_subset.biom -o hmp_otu_subset_normalized.biom -g 18may2012

Step 1: Normalize OTUs by 16S copy number convert_biom.py -i hmp_otu_subset_normalized.biom -o hmp_otu_subset_normalized.tsv -b

Step 2: Predict metagenome contents Given a normalized OTU table, PICRUSt uses gene copy numbers associated with each Greengenes tree tip to multiply and infer community gene copy numbers ./picrust-1.0.0/scripts/predict_metagenomes.py -i hmp_otu_subset_normalized.biom -o hmp_ko_subset.tsv -f -g 18may2012

Step 2: Predict metagenome contents

PICRUSt optional behavior You can associate predicted gene abundances with the organism that contains them metagenome_contributions.py You can show the confidence intervals of all predictions --with_confidence argument to predict_metagenomes.py You can summarize per-gene predictions to per-pathway predictions categorize_by_function.py You can input a PICRUSt output BIOM file into HUMAnN to reconstruct pathways

PICRUSt in Galaxy http://huttenhower.sph.harvard.edu/picrust First get data! scp from /class/stamps-shared/biobakery/data/hmp_otu_subset.tsv

PICRUSt in Galaxy Next normalize by copy number

PICRUSt in Galaxy Then predict your metagenome

You should change the extension to .tsv after downloading PICRUSt in Galaxy You can download the resulting gene table… You should change the extension to .tsv after downloading

PICRUSt in Galaxy

PICRUSt in Galaxy …or you can summarize genes to pathways Note that this only works on BIOM files

PICRUSt in Galaxy …continuing where we left off…

You should change the extension to .tsv after downloading PICRUSt in Galaxy …and then download them. You should change the extension to .tsv after downloading

PICRUSt in Galaxy

Interested? We’re recruiting postdoctoral fellows! Thanks! http://huttenhower.sph.harvard.edu Human Microbiome Project 2 Ramnik Xavier Dirk Gevers Lita Procter Jon Braun Dermot McGovern Subra Kugathasan Ted Denson Janet Jansson Bruce Birren Chad Nusbaum Clary Clish Joe Petrosino Thad Stappenbeck Alex Kostic Levi Waldron Xochitl Morgan Tim Tickle Daniela Boernigen Soumya Banerjee Human Microbiome Project Jane Peterson Sarah Highlander Barbara Methe Karen Nelson George Weinstock Owen White George Weingart Emma Schwager Eric Franzosa Boyu Ren Tiffany Hsu Ali Rahnavard Joseph Moon Jim Kaminski Regina Joice Koji Yasuda Kevin Oh Galeb Abu-Ali Morgan Langille Rob Beiko Rob Knight Greg Caporaso Interested? We’re recruiting postdoctoral fellows! Afrah Shafquat Randall Schwager Chengwei Luo Keith Bayer Moran Yassour Alexandra Sirota Jesse Zaneveld