From Metagenomic Sample to Useful Visual. Anna Shcherbina, 01/10/2013. Bioinformatics Challenge Day, 02/02/2013.

Presentation transcript:

From Metagenomic Sample to Useful Visual
Anna Shcherbina
Bioinformatics Challenge Day, 02/02/2013
This work is sponsored by the Defense Threat Reduction Agency under Air Force Contract #FA C
Opinions, interpretations, recommendations and conclusions are those of the authors and are not necessarily endorsed by the United States Government.
Distribution Statement A: Approved for public release; distribution is unlimited.

The Opportunity
NGS instruments have recently given us the ability to characterize the microbiomes that we live in and that live in us.
We can get a step closer to this goal by creating a visualization program that facilitates manual data curation by a human.

Your Mission
Invent novel visualization approaches to represent metagenomic data.
Subgoals:
–Pick out anomalies within a given dataset.
–Generate time series representations of multiple datasets.
–Compress data efficiently to allow visualization of huge datasets.

The Data (I)
Metagenomic datasets (FASTQ format) from clinical and environmental samples.
Metagenome of the human oral cavity under healthy and diseased conditions, with a focus on supragingival dental plaque and cavities
–“oral_healthy” and “oral_diseased” datasets
–Roche 454
Nose/throat swab from Nicaraguan child with acute respiratory illness
–“nicaragua” dataset
–Illumina

The Data (II)
Skin surface from the palm of a human hand
–“palm” dataset
–Roche 454
Human abscess sample of unknown etiology
–“abscess” dataset
–Illumina
Cultivated corn soil metagenome
–“soil” dataset
–Illumina
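
The datasets are described as FASTQ, while several downstream tools expect FASTA input. The following is a minimal sketch of that conversion, assuming the common four-lines-per-record FASTQ layout (no wrapped sequence lines). The function names `read_fastq` and `fastq_to_fasta` are illustrative and not part of the challenge materials.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:], sequence, quality


def fastq_to_fasta(fastq: TextIO, fasta: TextIO) -> int:
    """Write every FASTQ read as a two-line FASTA record; return the read count."""
    count = 0
    for read_id, sequence, _quality in read_fastq(fastq):
        fasta.write(f">{read_id}\n{sequence}\n")
        count += 1
    return count


# Tiny in-memory demonstration with made-up reads.
demo_in = io.StringIO("@r1\nACGT\n+\nIIII\n@r2 desc\nGG\n+\n!!\n")
demo_out = io.StringIO()
n_reads = fastq_to_fasta(demo_in, demo_out)
```

Quality scores are discarded here; a real workflow would usually filter or trim on quality before dropping them.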

Our Processing Pipeline
Raw FASTA reads
BLAST against virus, bacteria, and archaea databases (from GenBank)
Data Processing
–Parsed CSV summary of BLAST hits
–BLAST hits sorted by species, FASTA format
–Other BLAST parsers
Data is available from each stage of the processing pipeline.
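
As a starting point for working with the parsed CSV stage, the sketch below tallies BLAST hits per organism. It assumes the CSV has a header row and a column named `Target Organism`. That name is taken from the field labels in the single-hit example, and the real file's header may differ. The demo rows are made up.

```python
import csv
import io
from collections import Counter
from typing import TextIO


def count_hits_by_organism(csv_handle: TextIO,
                           column: str = "Target Organism") -> Counter:
    """Count rows per organism in a parsed BLAST CSV (column name is an assumption)."""
    counts: Counter = Counter()
    for row in csv.DictReader(csv_handle):
        organism = (row.get(column) or "").strip()
        if organism:  # ignore rows where the column is missing or empty
            counts[organism] += 1
    return counts


# Made-up rows for illustration only.
demo = io.StringIO(
    "Query Name,Target Organism\n"
    "read1,Neisseria subflava\n"
    "read2,Neisseria subflava\n"
    "read3,Streptococcus mitis\n"
)
demo_counts = count_hits_by_organism(demo)
```

A per-organism tally like this is the simplest input for most of the visualizations discussed in the rest of the deck.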

Parsed BLAST File Example for a Single Hit
Query Name: S _
Query Strand: +
Query Start: 1
Query End: 232
Query Organism: Neisseria meningitidis
Query Taxonomy: Bacteria; Proteobacteria; Betaproteobacteria;
Identities: 232
Percent: 100
Number Gaps: 0
Number Characters: 0
Target Name: GU
Target Strand: -
Target Start: 47
Target End: 278
Target Organism: Neisseria subflava
Target Taxonomy: Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Neisseria.
Query Sequence: CTGGGCCGTGTCTCAGTCCCAGTGTGGC
Target Sequence: CTGGGCCGTGTCTCAGTCCCAGTGTGGC
Analysis Program: BLASTN
Database: bacteria.gdna
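
The numbers in this example hit can be checked against one another. The sketch below holds the coordinates in a small record type and recomputes the percent identity. The `BlastHit` class and its field names are hypothetical. The identity formula assumes an ungapped alignment, which holds for this hit because the gap count is 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlastHit:
    """Hypothetical container for a few fields of one parsed BLAST hit."""
    query_start: int
    query_end: int
    target_start: int
    target_end: int
    identities: int
    gaps: int

    @property
    def query_span(self) -> int:
        # Coordinates are 1-based and inclusive.
        return abs(self.query_end - self.query_start) + 1

    @property
    def target_span(self) -> int:
        return abs(self.target_end - self.target_start) + 1

    @property
    def percent_identity(self) -> float:
        # Assumption: no gaps, so the alignment length equals the query span.
        return 100.0 * self.identities / self.query_span


# Values copied from the example hit.
hit = BlastHit(query_start=1, query_end=232,
               target_start=47, target_end=278,
               identities=232, gaps=0)
```

For this hit the query span (1 to 232) and the target span (47 to 278) are both 232 bases, and 232 identities over 232 aligned bases agrees with the reported 100 percent.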

Your Open-Source Toolkit
MEGAN4
IMG/M
KRONA (included with PhymmBL)
MG-RAST
METAREP
Mothur
Feel free to use any additional tools you think are useful.

MEGAN4: MEtaGenome ANalyzer
A simple lowest common ancestor algorithm assigns reads to taxa.
Taxonomic level reflects the degree of conservation of a sequence.
Dissects large datasets without assembly or the targeting of specific phylogenetic markers.
Graphical and statistical output for comparing different datasets.
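
The lowest common ancestor idea can be illustrated in a few lines. This is a simplified sketch, not MEGAN's implementation, which also filters hits by score before assignment. It assumes each hit carries a semicolon-separated lineage like the taxonomy strings in the parsed BLAST example. A read is placed at the deepest rank shared by all of its hits, so reads from conserved sequences end up at higher taxonomic levels.

```python
from typing import Iterable, List, Sequence


def parse_lineage(text: str) -> List[str]:
    """Split 'Bacteria;Proteobacteria;...' into clean rank names."""
    ranks = []
    for piece in text.split(";"):
        cleaned = piece.strip().rstrip(".").strip()
        if cleaned:
            ranks.append(cleaned)
    return ranks


def lowest_common_ancestor(lineages: Iterable[Sequence[str]]) -> List[str]:
    """Return the longest shared prefix of the given lineages."""
    lineages = list(lineages)
    if not lineages:
        return []
    common: List[str] = []
    for ranks_at_level in zip(*lineages):  # stops at the shortest lineage
        if all(rank == ranks_at_level[0] for rank in ranks_at_level):
            common.append(ranks_at_level[0])
        else:
            break
    return common


# The two taxonomy strings from the single-hit example.
full = parse_lineage(
    "Bacteria;Proteobacteria;Betaproteobacteria;Neisseriales;Neisseriaceae;Neisseria."
)
partial = parse_lineage("Bacteria; Proteobacteria; Betaproteobacteria;")
assigned = lowest_common_ancestor([full, partial])
```

With these two lineages the read would be assigned at the class level (Betaproteobacteria), the deepest rank both hits agree on.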

MEGAN4: MEtaGenome ANalyzer
[Figure panels: Oral Diseased Bacteria; Oral Healthy Bacteria; Oral Diseased Virus; Oral Healthy Virus]

MEGAN4: MEtaGenome ANalyzer
[Figure panels: Oral healthy vs. oral diseased, bacteria; Oral healthy vs. oral diseased, virus]

Web interface: IMG/M – Integrated Microbial Genomes with Microbial Samples
source:

IMG/M: Phylogenetic Distribution of Genes Based on Distribution of BLAST Hits
source:

IMG/M: Abundance Profile Overview
source:

KRONA
KRONA allows hierarchical data to be explored with zoomable pie charts.
–Excel template or KRONA tools.
–Support for several bioinformatics tools and raw data formats.
source:
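
One way to feed hierarchical data to the KRONA tools is a tab-delimited text file in which each line holds a count followed by the hierarchy levels from the top down. To my understanding this is the layout read by the `ktImportText` script in KronaTools, but check the documentation of the installed version. The sketch below builds such lines from semicolon-separated lineages. The function name and the demo lineages are illustrative.

```python
from collections import Counter
from typing import Iterable, List


def krona_text_lines(lineages: Iterable[str]) -> List[str]:
    """Collapse lineages into 'count<TAB>level1<TAB>level2...' lines."""
    counts: Counter = Counter()
    for lineage in lineages:
        ranks = tuple(
            piece.strip().rstrip(".")
            for piece in lineage.split(";")
            if piece.strip()
        )
        if ranks:
            counts[ranks] += 1
    # Sort by lineage so the output is stable between runs.
    return ["\t".join([str(n), *ranks]) for ranks, n in sorted(counts.items())]


# Made-up lineages, one per BLAST hit.
demo_lines = krona_text_lines([
    "Bacteria;Proteobacteria",
    "Bacteria;Proteobacteria",
    "Bacteria;Firmicutes",
])
```

Writing `"\n".join(demo_lines)` to a file gives a small input that can be used to try out the chart before scaling up to a full dataset.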

MG-RAST
[Figure: Oral Diseased]
source:

MG-RAST
[Figure: Oral Healthy]
source:

MG-RAST
[Figure panels: Oral Diseased; Oral Healthy]
source:

JCVI Metagenomics Reports (METAREP)
A Web 2.0 application to analyze and compare annotated metagenomic datasets.
Compare absolute and relative counts of multiple datasets at various functional and taxonomic levels.
Statistical tests, multidimensional scaling, heatmap and hierarchical clustering plots.
[Figure panels: Heatmap Plot; Hierarchical Clustering Plot; METASTAT Results]
source:
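
The distinction between absolute and relative counts matters when datasets differ in size. The following is a minimal sketch of that normalization, not METAREP code. The function names are illustrative and the counts are invented for the demonstration; they are not results from the challenge datasets.

```python
from typing import Dict, Mapping


def relative_counts(absolute: Mapping[str, int]) -> Dict[str, float]:
    """Convert per-taxon hit counts to fractions of the dataset total."""
    total = sum(absolute.values())
    if total == 0:
        return {taxon: 0.0 for taxon in absolute}
    return {taxon: n / total for taxon, n in absolute.items()}


def compare_datasets(
    datasets: Mapping[str, Mapping[str, int]]
) -> Dict[str, Dict[str, float]]:
    """Return taxon -> dataset -> relative abundance (0.0 where a taxon is absent)."""
    relative = {name: relative_counts(counts) for name, counts in datasets.items()}
    taxa = sorted({taxon for counts in datasets.values() for taxon in counts})
    return {
        taxon: {name: relative[name].get(taxon, 0.0) for name in datasets}
        for taxon in taxa
    }


# Invented counts: the second dataset has twice as many hits as the first.
table = compare_datasets({
    "oral_healthy": {"Neisseria": 30, "Streptococcus": 70},
    "oral_diseased": {"Streptococcus": 50, "Prevotella": 150},
})
```

In this made-up example Streptococcus drops from 0.7 to 0.25 of the hits, a change that the raw counts (70 versus 50) understate because the second dataset is larger.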

Mothur: 16S rRNA Sequence Analysis
A single platform for sequence alignment, pairwise distance calculation, and distance matrix analysis.
Venn diagrams, community trees, heat maps, sample-based rarefaction curves.
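
To make the idea of a sample-based rarefaction curve concrete, the sketch below computes the expected number of taxa seen in k samples drawn without replacement from M samples, using the hypergeometric expectation. This is an analytic stand-in and not Mothur's implementation, which may rely on repeated random resampling instead. The function names and the incidence numbers are illustrative.

```python
from math import comb
from typing import List, Sequence


def expected_richness(incidence: Sequence[int], total_samples: int, k: int) -> float:
    """Expected number of taxa observed in k of total_samples samples.

    incidence[i] is the number of samples in which taxon i was detected.
    """
    if not 0 <= k <= total_samples:
        raise ValueError("k must lie between 0 and total_samples")
    if any(m < 0 or m > total_samples for m in incidence):
        raise ValueError("incidence values must lie between 0 and total_samples")
    denominator = comb(total_samples, k)
    # A taxon is missed only if all k samples come from the samples lacking it.
    return sum(
        1 - comb(total_samples - m, k) / denominator
        for m in incidence
        if m > 0
    )


def rarefaction_curve(incidence: Sequence[int], total_samples: int) -> List[float]:
    """Expected richness for k = 0 .. total_samples."""
    return [expected_richness(incidence, total_samples, k)
            for k in range(total_samples + 1)]


# Made-up example: three taxa detected in 3, 1, and 2 of 3 samples.
curve = rarefaction_curve([3, 1, 2], total_samples=3)
```

A curve that is still climbing at the last point suggests that additional samples would keep revealing new taxa.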