16S RNA sequencing analysis

Bioinformatics and 16S rRNA sequencing analysis
http://okinbretest.ouhsc.edu/
http://okinbretest.ouhsc.edu/Bioinformatics.aspx

Three 16S rRNA Sequencing Questions to Answer
- What is 16S rRNA sequencing?
- How is it performed?
- Why would we want to do it?

Terms to Define: What is bioinformatics?
- Biology
- Computer Science
- Math/Statistics

What is Amplicon Sequencing?
A particular gene or gene fragment is amplified and its sequence determined. Applied to microbiomes (all the microorganisms in a particular environment), this gives insight into which organisms are present.
http://users.ugent.be/~avierstr/principles/pcr.html

Metagenome: the genomes of the total microbiota in a community. Metagenomics allows us to extract DNA sequences directly from a microbial community in nature, bypassing the need for cultures.

What is the 16S gene and why use it?
- 16S: named for the rate at which the molecule sediments in an ultracentrifuge (Svedberg units)
- Encodes the 16S rRNA, the RNA component of the small ribosomal subunit (present in all bacteria and archaea)
- Ubiquitous and highly conserved, because mutations would probably be deleterious
- Extreme sequence conservation is useful for designing primers
- Variable regions allow organism identification
- Well-annotated reference databases
http://www.alimetrics.net/en/index.php/dna-sequence-analysis
http://www.nature.com/nrmicro/journal/v12/n9/fig_tab/nrmicro3330_F1.html

Variable and conserved regions within the 16S rRNA gene
Adapted from: Kevin E. Ashelford et al. Appl. Environ. Microbiol. 2005;71:7724-7736
http://www.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/16s/16s-metagenomic-library-prep-guide-15044223-b.pdf
Illustrating variable regions within the 16S rRNA gene and location of chimeric breakpoints.
(A) The frequency of occurrence of the most common nucleotide residue at each base position within the 16S rRNA gene, as determined from 4,383 RDP-listed type strains, with E. coli U00096 as a reference. These frequencies are measures of variability within the gene.
(B) Smoothing the data, by taking the mean frequency within a window of 50 bases and moving one base at a time along the gene, creates the plot shown in panel B. The locations of the hypervariable regions are labeled, with gray bars on the x axis defining these regions as V1 to V9 (the Comparative RNA Web Site [http://www.rna.icmb.utexas.edu/]).
(C) Histogram of all chimera breakpoints identified in this study and that of Hugenholtz and Huber (8).
Fry and Ellis, Patent Publication number WO2014138119 A2
Example of primers

Variation in rRNA gene copy number
- Most bacteria contain more than one rRNA operon, and copy number varies
  - This can affect abundance estimates
- Some bacteria have high levels of sequence divergence among their rRNA operons
  - This can inflate diversity estimates
- Correction can be attempted (PICRUSt, CopyRighter)
Valdivia-Anistro, J.A., et al., Front. Microbiol. 05 January 2016 | http://dx.doi.org/10.3389/fmicb.2015.01486
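The effect of copy number on abundance estimates can be illustrated with a little arithmetic. The sketch below is only an illustration: the taxon names, read counts, and copy numbers are hypothetical, and it is not how PICRUSt or CopyRighter are implemented (those tools predict copy numbers from reference genomes). It simply divides each read count by an assumed copy number and renormalises.

```python
# Hypothetical read counts per taxon and hypothetical 16S copies per genome.
reads = {"taxon_A": 700, "taxon_B": 200, "taxon_C": 100}
copies = {"taxon_A": 7, "taxon_B": 2, "taxon_C": 1}


def relative_abundance(counts):
    """Convert counts to proportions that sum to 1."""
    total = sum(counts.values())
    return {taxon: value / total for taxon, value in counts.items()}


def copy_number_corrected(counts, copy_numbers):
    """Divide each count by the taxon's copy number, then renormalise."""
    adjusted = {taxon: counts[taxon] / copy_numbers[taxon] for taxon in counts}
    return relative_abundance(adjusted)


raw = relative_abundance(reads)
corrected = copy_number_corrected(reads, copies)
for taxon in reads:
    print(f"{taxon}: raw {raw[taxon]:.2f}, corrected {corrected[taxon]:.2f}")
```

With these made-up numbers the three taxa look very unequal in the raw reads but come out equally abundant once copy number is accounted for.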

16S pipeline
1. Sampling
2. Extracting DNA
3. Prepping Sample
4. Sequencing
5. Bioinformatics
Image sources:
http://newsexaminer.net/opinion/were-treating-soil-like-dirt-its-a-fatal-mistake-because-all-human-life-depends-on-it/
http://mrdnalab.com/dna-sequencing/illumina-miseq.html
http://www.copybook.com/pharmaceutical/companies/anachem/articles/the-new-kapa-express-extract-from-anachem-ltd
https://www.neb.com/protocols/2015/01/23/setting-up-the-pcr-reaction-e7600
http://www.clipartbest.com/cartoon-computer-pictures

Which sequencing chemistry to use?
Kuczynski et al., Nat. Rev. Genet. 13: 47-58 (2012)

Sequencing by Synthesis and Basecalling http://openwetware.org/wiki/BioMicroCenter:Sequencing

Illumina Inc. Videos Illustrating Prepping and Sequencing by Synthesis
- Illumina Sequencing Technology
- Intro to Sequencing by Synthesis: Industry-leading Data Quality
https://www.youtube.com/watch?v=womKfikWlxM
https://www.youtube.com/watch?v=HMyCqWhwB8E

Bioinformatics

Workflow for deep sequencing of 16S rRNA gene amplicons
Schematic overview of the workflow used to analyze microbiota composition by deep sequencing of 16S rRNA gene amplicons.
Ines Yang et al. FEMS Microbiol Rev 2013;37:736-761

Operational Taxonomic Units (OTUs) and Annotation - MiSeq Reporter
- Operational taxonomic unit: treated as an extant taxon
  - Cluster of similar amplicon sequences
  - 97% identity commonly used
- Pre-clustering to collapse all identical sequences into one category speeds analysis
- OTU determination
  - De novo methods cluster by similarity, with no reference to outside sequences
  - Taxonomy-based methods cluster based on similarity to known sequences (MiSeq Reporter uses ClassifyReads, a proprietary algorithm)
  - Combined taxonomy + de novo methods
- Known sequence databases
  - Ribosomal Database Project (https://rdp.cme.msu.edu/)
  - GreenGenes (http://greengenes.lbl.gov/cgi-bin/nph-index.cgi)
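Pre-clustering and de novo clustering at 97% identity can be sketched in a few lines. This is a toy illustration, not ClassifyReads or any production OTU picker: it assumes reads are already aligned and of equal length, measures identity as the fraction of matching positions, and uses a simple greedy scheme in which the most abundant unique sequences become cluster centroids. The function names and the example reads are made up for the illustration.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions; assumes equal-length, aligned reads."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_otus(reads, threshold=0.97):
    """Greedy de novo clustering: returns a list of [centroid, read_count]."""
    # Pre-clustering: collapse identical reads into one entry with a count.
    unique = Counter(reads)
    otus = []
    # Visit unique sequences from most to least abundant.
    for seq, count in unique.most_common():
        for otu in otus:
            if identity(seq, otu[0]) >= threshold:
                otu[1] += count  # join the first centroid that is close enough
                break
        else:
            otus.append([seq, count])  # no match, so start a new OTU
    return otus


# Toy 40-base reads: one mismatch gives 97.5% identity, three give 92.5%.
base = "ACGT" * 10
one_mismatch = "T" + base[1:]
three_mismatches = "TTTT" + base[4:]
reads = [base] * 5 + [one_mismatch] * 2 + [three_mismatches] * 3

otus = greedy_otus(reads)
for centroid, count in otus:
    print(count, centroid)
```

Here the single-mismatch variant is absorbed into the OTU of the abundant sequence, while the more divergent read forms its own OTU.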

Statistically Speaking and Interpreting the Data

Species Diversity Results
- Species richness
  - Number of species present in a sample
  - Determined by the number of OTUs present
  - Influenced by sequencing depth
- Species evenness
  - How close in numbers each species is in a sample
- Species diversity
  - Composite of species richness and species evenness
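Why richness depends on sequencing depth can be seen by subsampling a community to different depths and counting the OTUs that remain. The sketch below uses a hypothetical community of four OTUs with very uneven abundances; the function names, the OTU labels, and the fixed random seed are assumptions made for the illustration.

```python
import random


def richness(sample):
    """Number of distinct OTUs observed in a list of read assignments."""
    return len(set(sample))


def rarefied_richness(sample, depth, seed=0):
    """Richness after randomly subsampling `depth` reads without replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return richness(rng.sample(sample, depth))


# Hypothetical community: one dominant OTU and three progressively rarer ones.
community = ["otu1"] * 900 + ["otu2"] * 80 + ["otu3"] * 15 + ["otu4"] * 5

for depth in (10, 100, 1000):
    print(depth, rarefied_richness(community, depth))
```

At shallow depths the rare OTUs are easily missed, so observed richness can only reach the true value of four as depth approaches the full sample.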

α-diversity and Shannon's diversity index
- α-diversity: diversity within a sample
- Shannon's diversity index is an information statistic: it measures the amount of information needed to describe every member of the community
- If p_i is the proportion of individuals of species i, and S is the number of species, then the diversity (H') is:
  H' = -Σ p_i ln(p_i), summed over i = 1 to S
  (the natural log is shown here; some sources use log2, which changes only the units)
- From this, one can calculate evenness, which is the ratio of the actual H' to its maximum value, H'max = ln(S), and so ranges from 0 to 1
Adapted from: http://ww2.tnstate.edu/ganter/B412%20L16%20Communities.html

Which field is more diverse?

Flower species   Field 1   Field 2
Daisy                300        20
Dandelion            335        49
Buttercup            365       931
Total               1000      1000

Answer: Field 1
https://www.greenthumb.co.uk/help-and-advice/lawn-problems/weeds
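The two fields can be compared numerically with Shannon's index and evenness. The sketch below assumes the natural-log form of the index, H' = -Σ p_i ln(p_i), and evenness as H' divided by ln(S); the function names are made up for the illustration.

```python
import math


def shannon(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over species with counts > 0."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def evenness(counts):
    """Ratio of H' to its maximum, ln(S), where S is the number of species."""
    species = sum(1 for c in counts if c > 0)
    return shannon(counts) / math.log(species)


# Daisy, dandelion, buttercup counts from the table.
field_1 = [300, 335, 365]
field_2 = [20, 49, 931]

for name, counts in (("Field 1", field_1), ("Field 2", field_2)):
    print(f"{name}: H' = {shannon(counts):.3f}, evenness = {evenness(counts):.3f}")
```

Both fields have the same richness (three species), so the difference comes entirely from evenness: Field 1 gives an H' of roughly 1.10 against roughly 0.29 for Field 2, which is dominated by buttercups.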

Hierarchical Clustering Dendrogram
- Dendrogram: a tree diagram
- Topology: shows how closely the samples are related
- The two most similar samples (clusters) are joined first, forming a new cluster
- At each subsequent step, the two closest remaining clusters are joined, until all samples belong to a single tree
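The merging procedure behind a dendrogram can be sketched directly. The example below is a minimal illustration using average linkage (the UPGMA rule) on a hypothetical distance matrix for four samples; the slides do not say which linkage rule was used, so that choice, the sample labels, and the distances are all assumptions.

```python
def average_linkage(dist, labels):
    """Agglomerative clustering; returns the merges as (left, right, distance)."""
    clusters = [(i,) for i in range(len(labels))]  # start with singletons
    merges = []

    def cluster_distance(a, b):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        pairs = [
            (cluster_distance(a, b), a, b)
            for idx, a in enumerate(clusters)
            for b in clusters[idx + 1:]
        ]
        d, a, b = min(pairs)
        # Replace the two clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append(([labels[i] for i in a], [labels[i] for i in b], d))
    return merges


# Hypothetical pairwise dissimilarities between four samples.
labels = ["S1", "S2", "S3", "S4"]
dist = [
    [0.0, 0.1, 0.6, 0.7],
    [0.1, 0.0, 0.5, 0.8],
    [0.6, 0.5, 0.0, 0.3],
    [0.7, 0.8, 0.3, 0.0],
]

merges = average_linkage(dist, labels)
for left, right, d in merges:
    print(left, "+", right, f"at distance {d:.2f}")
```

The order and height of the merges are exactly what a dendrogram draws: S1 and S2 join first, then S3 and S4, and the two pairs join last.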

What is a Principal Coordinate Analysis (PCoA)?
- Graphically represents similarities or dissimilarities in the data
- Begins with a distance matrix and ends with the computation of eigenvalues and eigenvectors
- Objects ordinated closer to one another are more similar than those ordinated further apart
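The path from a distance matrix to eigenvalues and coordinates can be sketched without any external library. The example below is a simplified illustration that recovers only the first principal coordinate: it double-centres the squared distances and uses power iteration to find the dominant eigenvector. It assumes a symmetric distance matrix whose dominant eigenvalue is positive; the function name and the three-sample distance matrix are made up. Real analyses would normally use an established package and keep several axes.

```python
import math


def pcoa_first_axis(dist, iterations=200):
    """Return (eigenvalue, coordinates) for the first principal coordinate."""
    n = len(dist)
    # Square the distances, then double-centre: B = -1/2 * J * D^2 * J.
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
         for j in range(n)]
        for i in range(n)
    ]
    # Power iteration for the dominant eigenvector of B.
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigenvalue = sum(
        v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n)
    )
    # Coordinates are the eigenvector scaled by sqrt(eigenvalue).
    return eigenvalue, [x * math.sqrt(eigenvalue) for x in v]


# Hypothetical samples lying on a line at positions 0, 1 and 3.
dist = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]

eigenvalue, coords = pcoa_first_axis(dist)
print("eigenvalue:", round(eigenvalue, 3))
print("coordinates:", [round(c, 3) for c in coords])
```

Because these toy samples lie on a single line, one axis reproduces the original distances exactly: the gaps between the returned coordinates match the entries of the distance matrix, up to an arbitrary sign flip.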

Why do 16S rRNA sequencing? http://commonfund.nih.gov/hmp/overview

Scheme for studying the “normal” human microbiome. Bacterial distribution by body site. This figure shows the distribution by body site of bacteria that have been sequenced under the HMP or are in the sequencing pipelines. The NIH HMP Working Group, Peterson J, Garges S, et al. The NIH Human Microbiome Project. Genome Research. 2009;19(12):2317-2323. doi:10.1101/gr.096651.109.

α Diversity rarefaction curves of cutaneous microbiota in psoriasis (lesion), unaffected and control specimens. (A) Taxonomical richness trends towards decreasing α diversity in unaffected and lesion specimens relative to control, with no statistically significant differences between skin types. (B) Shannon index is significantly different (decreases from control to unaffected to lesion) among skin types at all taxonomic levels (P <0.05), except at the operational taxonomical unit (OTU) level. (C) Analysis of taxa sharing. Taxa present in <3 samples excluded from the analysis. Taxa that are only observed in one clinical skin type are denoted as ‘unique’. Taxa that are present in two types of skin are denoted as ‘shared’. The data show that nearly all taxa are represented in all three types of skin. The shading represents the relative distribution (heatmap) for each column number (green = low, yellow = intermediate, red = high). Alekseyenko AV, Perez-Perez GI, De Souza A, et al. Community differentiation of the cutaneous microbiota in psoriasis. Microbiome. 2013;1:31. doi:10.1186/2049-2618-1-31.

Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences
Morgan G I Langille, et al. Nature Biotechnology 31, 814–821 (2013). doi:10.1038/nbt.2676

(Charts: Shannon Species Diversity and Number of Species Identified, Sites 1 to 6.)

Sample             Shannon Species Diversity   Number of Species Identified
St. James Place    2.625                        653
53rd Street        2.553                       1490
44th Street West   2.571                       1478
44th Street East   2.659                       1178
38th Street        2.727                       1545
Coombs Road        2.826                        462

Summary: What? How? Why?
What?
- Bioinformatics
- Amplicon sequencing and metagenomics
- 16S gene
How?
- Pipeline and different sequencing platforms
- Bioinformatics step
  - Operational Taxonomic Units
  - Statistical terms
    - Alpha diversity and Shannon index
    - Species diversity, richness and evenness
    - Hierarchical clustering dendrogram and PCoA
Why?
- Human Microbiome Project
- Microbiota in psoriasis
- Demonstration project: Cameron University's Wolf Creek Microbial Diversity Project