Genboree Microbiome Toolset Kevin Riehle 4/3/12 – NIH Cloud Workshop Boulder, Colorado
Kevin Riehle – Collaborators: Aleks Milosavljevic, Cristi Coarfa, Andrew Jackson, Arpit Tandon, Sameer Paithankar, Sriram Raghuraman, Matt Roth – Aagaard Lab: Kjersti Aagaard, Jun Ma – Versalovic Lab: James Versalovic, Emily B. Hollister, Delphine Saulnier, Toni-Ann Mistretta, Sabeen Raza
Overview Genboree Introduction – Manuscripts – Overview Data + Tools – Lean vs. obese twins study – Grid viewer + 16S Samples Data + Mashups – Grid viewer + WGS Genes / Pathways + KEGG Virtual Integration – Multiple databases existing within multiple servers in different physical locations Conclusions
Genboree Microbiome Toolset Riehle K, Coarfa C, Jackson A, Ma J, Tandon A, Paithankar S, Raghuraman S, Mistretta TA, Saulnier D, Raza S, Diaz MA, Shulman R, Aagaard K, Versalovic J, Milosavljevic A. The Genboree Microbiome Toolset and the analysis of 16S rRNA microbial sequences. BMC Bioinformatics 2012; In Press.
Large Scale Applications Metagenomic-Based Approach to a Comprehensive Characterization of the Vaginal Microbiome Signature in Pregnancy – Kjersti Aagaard, in review
Genboree Introduction Groups Permissions Databases Projects Browser Workbench
Genboree Introduction Genboree.org – Everyone should have received an email regarding their Genboree account Genboree.org/microbiome – Tutorial – FAQ – This PowerPoint – Etc. Questions: – Ask later, interrupt now, etc.
16S rRNA: SFF / SRA, Sample Meta Data, Quality Filtered Sequences, Multi-step OTU Picking, Remove Chimeras, Taxonomic Classification, Representative Sequences, OTU Table, Phylogenetic Tree, Beta Diversity, Alpha Diversity, Classification, Feature Selection, Taxonomic Abundance
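To make the alpha and beta diversity steps of the pipeline concrete, here is a minimal sketch on a toy OTU table. The slides do not say which metrics the toolset computes, so the Shannon index (alpha) and Bray-Curtis dissimilarity (beta) are assumed here as common choices; the sample names and counts are made up for illustration.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over OTUs with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors over the same OTUs."""
    if len(a) != len(b):
        raise ValueError("count vectors must cover the same OTUs")
    denom = sum(a) + sum(b)
    if denom == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denom


# Hypothetical OTU table: one row of OTU counts per sample.
otu_table = {
    "sample_A": [10, 10, 0, 0],
    "sample_B": [5, 5, 5, 5],
}

# Alpha diversity: within-sample diversity, one value per sample.
alpha = {name: shannon_index(counts) for name, counts in otu_table.items()}

# Beta diversity: between-sample dissimilarity for one pair of samples.
beta = bray_curtis(otu_table["sample_A"], otu_table["sample_B"])
```

In this toy table, sample_B spreads its reads evenly over four OTUs and so has the higher alpha diversity, while the Bray-Curtis value summarizes how different the two samples are from each other.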
Data Tree Selector Genboree Workbench Various Data Types Item Details Data Type Filter Input Data Output Targets
Activated Tool Non-Activated Tool Genboree Workbench
Workbench Flow: Transfer, Associate, Initialize, Analyze – SRR, SFF Sequences – Subject Meta Data – Sample Record – Quality Filtered Sample Sequences – Samples – Sample Set – α β
Samples Import Samples – If sample does not exist, create it – If sample exists: add metadata if metadata does not exist; update metadata if metadata exists and differs Sample – File Linker Add Sample Set Delete Sample Set(s) Add Samples to Sample Set Remove Samples from Sample Set(s)
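The Import Samples rules above amount to an upsert. The following is a minimal sketch of that logic, assuming samples are held in a plain dictionary keyed by sample name; the function name `import_samples`, the in-memory `store`, and the example metadata are hypothetical and are not the Genboree implementation.

```python
def import_samples(store, incoming):
    """Upsert samples into `store` (sample name -> metadata dict).

    Returns a summary of what was created, added, and updated.
    """
    summary = {"created": [], "metadata_added": [], "metadata_updated": []}
    for name, metadata in incoming.items():
        # If the sample does not exist, create it.
        if name not in store:
            store[name] = dict(metadata)
            summary["created"].append(name)
            continue
        existing = store[name]
        for key, value in metadata.items():
            if key not in existing:
                # Add metadata if it does not exist yet.
                existing[key] = value
                summary["metadata_added"].append((name, key))
            elif existing[key] != value:
                # Update metadata only if it exists and differs.
                existing[key] = value
                summary["metadata_updated"].append((name, key))
    return summary


# Hypothetical example data.
store = {"S1": {"body_site": "stool", "primer_region": "V6"}}
incoming = {
    "S1": {"body_site": "gut", "TYPE": "lean"},
    "S2": {"body_site": "stool"},
}
summary = import_samples(store, incoming)
```

Metadata keys that are absent from the incoming record are left untouched, so re-importing a partial record does not erase existing annotations.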
Data + Tools Lean vs obese twin study
Lean vs. Obese Twins Study
94 samples – 49 Lean – 45 Obese V6 primer region 454 – 16S rRNA
Genboree Project Integration bin/project.jsp?projectName=Turnbaugh_lean_obese_twins_project
Phylogenetic Visualizations - iTOL
Lean vs. Obese Twins Study Vol 457|22 January 2009| doi: /nature07540
Grid Viewer – Provides an interactive view of Samples from one or more databases – Databases may exist in different physical locations (virtual integration, discussed later) – Users can save Sample Sets for analysis – Users can select Samples whose Genes and Pathways they want to explore (WGS only)
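One way to picture the grid is as a table of sample counts, with one metadata attribute on each axis and samples pooled from every selected database. The sketch below assumes that reading; the function `grid_counts`, the in-memory layout, and the metadata values are illustrative assumptions, not the viewer's actual code.

```python
from collections import Counter


def grid_counts(databases, x_attr, y_attrs):
    """Count samples per (x, y) grid cell across several databases.

    databases: dict of database name -> list of sample metadata dicts
    x_attr:    attribute used for the columns
    y_attrs:   attributes combined into a single row label
    """
    counts = Counter()
    for samples in databases.values():
        for sample in samples:
            x = sample.get(x_attr, "unknown")
            y = " + ".join(str(sample.get(a, "unknown")) for a in y_attrs)
            counts[(x, y)] += 1
    return counts


# Hypothetical samples from two databases that could live on different servers.
databases = {
    "HMP-16S-I-II": [
        {"primer_region": "V3-V5", "body_site": "stool"},
        {"primer_region": "V3-V5", "body_site": "stool"},
        {"primer_region": "V1-V3", "body_site": "saliva"},
    ],
    "Disease_X": [
        {"primer_region": "V3-V5", "body_site": "stool"},
    ],
}

grid = grid_counts(databases, "primer_region", ["body_site"])
```

Each cell then corresponds to a group of samples that could be saved as a Sample Set for later analysis.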
HMP Data Metrics Phase I and Phase II – RP RP – RP RP – > 13,000 samples
Grid Viewer
16S rRNA Sample Grid Viewer
Then show how we can use these sample sets for analysis on the GMT
16S rRNA Sample Grid Viewer
bin/sampleGridViewer.jsp?dbList=http%3A%2F%2Fgenboree.org%2FREST%2Fv1%2Fgrp%2FHMP-16S-rRNA-phaseI-phaseII%2Fdb%2FHMP-16S-I-II%3F&gbGridXAttr=primer_region&gbGridYAttr=body_site&xlabel=primer_region&ylabel=body_site&gridTitle=Samples%20from%20HMP-16S-I-II&pageTitle=Sample%20Grid%20Viewer:%20Samples%20from%20HMP-16S-I-II
bin/sampleGridViewer.jsp?dbList=http%3A%2F%2Fgenboree.org%2FREST%2Fv1%2Fgrp%2FHMP-16S-rRNA-phaseI-phaseII%2Fdb%2FHMP-16S-I-II%3F&gbGridXAttr=seq_center&gbGridYAttr=primer_region_PLUS_body_site&xlabel=seq_center&ylabel=primer_region_PLUS_body_site&gridTitle=Samples%20from%20HMP-16S-I-II&pageTitle=Sample%20Grid%20Viewer:%20Samples%20from%20HMP-16S-I-II
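The Grid Viewer links above follow a regular pattern: a percent-encoded list of REST database URLs in `dbList`, axis attributes joined with `_PLUS_`, and titles with spaces encoded as `%20`. The sketch below rebuilds such a link with Python's standard library. The helper name `grid_viewer_query` is hypothetical, the parameter names are copied from the links on the slide, and the path is left relative because the slide does not show the host portion.

```python
from urllib.parse import quote


def grid_viewer_query(db_urls, x_attrs, y_attrs, db_labels):
    """Build a Sample Grid Viewer link in the style shown on the slide."""
    # Each REST database URL is fully percent-encoded; a literal comma separates them.
    db_list = ",".join(quote(url, safe="") for url in db_urls)
    # Several attributes on one axis are joined with _PLUS_.
    x = "_PLUS_".join(x_attrs)
    y = "_PLUS_".join(y_attrs)
    grid_title = "Samples from " + ",".join(db_labels)
    page_title = "Sample Grid Viewer: " + grid_title
    params = [
        ("dbList", db_list),
        ("gbGridXAttr", x),
        ("gbGridYAttr", y),
        ("xlabel", x),
        ("ylabel", y),
        # Colons and commas stay literal in the slide's links; spaces become %20.
        ("gridTitle", quote(grid_title, safe=":,")),
        ("pageTitle", quote(page_title, safe=":,")),
    ]
    return "bin/sampleGridViewer.jsp?" + "&".join(f"{k}={v}" for k, v in params)


query = grid_viewer_query(
    db_urls=["http://genboree.org/REST/v1/grp/HMP-16S-rRNA-phaseI-phaseII/db/HMP-16S-I-II?"],
    x_attrs=["primer_region"],
    y_attrs=["body_site"],
    db_labels=["HMP-16S-I-II"],
)
```

Passing `["primer_region", "body_site"]` as `y_attrs` yields the combined `primer_region_PLUS_body_site` axis used in the second link.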
Data + Mashup Genes and Pathways – View samples + tracks within Grid Viewer – View output within Gene Browser and Pathway Browser – View Pathways within KEGG
Data + Mashup
Virtual Integration Accessing data that exists within different physical servers
Virtual Integration
phaseI-phaseII%2Fdb%2FHMP-16S-I-II%3F,http%3A%2F%2Fgenboree.org%2FREST%2Fv1%2Fgrp%2FPublic_16S_experiment_data%2Fdb%2FDisease_X%3F&gbGridXAttr=TYPE&gbGridYAttr=DNA_extraction_site_PLUS_seq_center_PLUS_body_site_PLUS_primer_region&xlabel=TYPE&ylabel=DNA_extraction_site_PLUS_seq_center_PLUS_body_site_PLUS_primer_region&gridTitle=Samples%20from%20HMP-16S-I-II,Disease_X&pageTitle=Sample%20Grid%20Viewer:%20Samples%20from%20HMP-16S-I-II,Disease_X
phaseI-phaseII%2Fdb%2FHMP-16S-I-II%3F,http%3A%2F%2Fgenboree.org%2FREST%2Fv1%2Fgrp%2FPublic_16S_experiment_data%2Fdb%2FDisease_X%3F&gbGridXAttr=primer_region&gbGridYAttr=body_site&xlabel=primer_region&ylabel=body_site&gridTitle=Samples%20from%20HMP-16S-I-II,Disease_X&pageTitle=Sample%20Grid%20Viewer:%20Samples%20from%20HMP-16S-I-II,Disease_X
phaseI-phaseII%2Fdb%2FHMP-16S-I-II%3F,http%3A%2F%2Fgenboree.org%2FREST%2Fv1%2Fgrp%2FPublic_16S_experiment_data%2Fdb%2FDisease_X%3F&gbGridXAttr=barcode&gbGridYAttr=primer_region_PLUS_body_site&xlabel=barcode&ylabel=primer_region_PLUS_body_site&gridTitle=Samples%20from%20HMP-16S-I-II,Disease_X&pageTitle=Sample%20Grid%20Viewer:%20Samples%20from%20HMP-16S-I-II,Disease_X
phaseI-phaseII%2Fdb%2FHMP-16S-I-II%3F,http%3A%2F%2Fgenboree.org%2FREST%2Fv1%2Fgrp%2FPublic_16S_experiment_data%2Fdb%2FDisease_X%3F&gbGridXAttr=TYPE&gbGridYAttr=primer_region_PLUS_body_site&xlabel=TYPE&ylabel=primer_region_PLUS_body_site&gridTitle=Samples%20from%20HMP-16S-I-II,Disease_X&pageTitle=Sample%20Grid%20Viewer:%20Samples%20from%20HMP-16S-I-II,Disease_X
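In these virtual integration links, `dbList` carries more than one REST database URL, separated by a literal comma. The sketch below decodes such a query string back into its databases and axis attributes. Because the links on the slide are cut off at the start, the example query is reconstructed on the assumption that the first database is the same HMP database used in the earlier Grid Viewer links; `parse_grid_query` is a hypothetical helper.

```python
from urllib.parse import unquote


def parse_grid_query(query):
    """Decode a Grid Viewer query string into databases and axis attributes."""
    raw = dict(pair.split("=", 1) for pair in query.split("&"))
    return {
        # Split on the literal comma first, then decode each REST URL.
        "databases": [unquote(url) for url in raw["dbList"].split(",")],
        "x_attrs": raw["gbGridXAttr"].split("_PLUS_"),
        "y_attrs": raw["gbGridYAttr"].split("_PLUS_"),
        "grid_title": unquote(raw["gridTitle"]),
    }


# Assumed full form of a two-database link (the slide shows it truncated).
query = (
    "dbList=http%3A%2F%2Fgenboree.org%2FREST%2Fv1%2Fgrp%2F"
    "HMP-16S-rRNA-phaseI-phaseII%2Fdb%2FHMP-16S-I-II%3F,"
    "http%3A%2F%2Fgenboree.org%2FREST%2Fv1%2Fgrp%2F"
    "Public_16S_experiment_data%2Fdb%2FDisease_X%3F"
    "&gbGridXAttr=TYPE&gbGridYAttr=primer_region_PLUS_body_site"
    "&xlabel=TYPE&ylabel=primer_region_PLUS_body_site"
    "&gridTitle=Samples%20from%20HMP-16S-I-II,Disease_X"
)

parsed = parse_grid_query(query)
for db in parsed["databases"]:
    print(db)
```

Splitting before decoding matters: each encoded URL contains no literal comma, so the comma can only be the separator between databases, which may sit on different servers.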