Microbiome: 16S rRNA Sequencing

Slides:

Advertisements

Similar presentations

16S sequencing for microbiome studies Nicola Segata and Nick Loman

Advertisements

Use of the genomic data o Reconstruction of metabolic properties o Nature’s Microbiome o NGS in Population Genetics.

Metabarcoding 16S RNA targeted sequencing

MICROBIAL TAXONOMY Phenotypic Analysis Genotypic Analysis.

Practical Bioinformatics Community structure measures for meta-genomics István Albert Bioinformatics Consulting Center Penn State.

The Sorcerer II Global ocean sampling expedition Katrine Lekang Global Ocean Sampling project (GOS) Global Ocean Sampling project (GOS) CAMERA CAMERA METAREP.

The Microbiome and Metagenomics

Zachary Bendiks. Jonathan Eisen  UC Davis Genome Center  Lab focus: “Our work focuses on genomic basis for the origin of novelty in microorganisms (how.

Microbial taxonomy and phylogeny

H = -Σp i log 2 p i. SCOPI Each one of the many microbial communities has its own structure and ecosystem, depending on the body environment it exists.

Accurate estimation of microbial communities using 16S tags Julien Tremblay, PhD

Prokaryote Taxonomy & Diversity

Taxonomy of Cellular Life Taxonomy: classification (hierarchical grouping based on characteristics); nomenclature (naming); identification (define characteristics.

Identification and Classification of Prokaryotes

Abstract Our current understanding of the taxonomic and phylogenetic diversity of cellular organisms, especially the bacteria and archaea, is mostly based.

Analyzing Time Course Data: How can we pick the disappearing needle across multiple haystacks? IEEE-HPEC Bioinformatics Challenge Day Dr. C. Nicole Rosenzweig.

The Microbiome and Metagenomics

Accurate estimation of microbial communities using 16S tags

Major characteristics used in taxonomy

Higher Human Biology Unit 1 Human Cells KEY AREA 5: Human Genomics.

The use of 16S rRNA gene sequences to study phylogeny and taxonomy.

Canadian Bioinformatics Workshops

Date of download: 6/23/2016 Copyright © 2016 McGraw-Hill Education. All rights reserved. Pipeline for culture-independent studies of a microbiota. (A)

Date of download: 7/7/2016 Copyright © 2016 McGraw-Hill Education. All rights reserved. Pipeline for culture-independent studies of a microbiota. (A) DNA.

Tools for microbial community analysis. What I am not going to talk  Culture dependent analysis  Isolate all possible colonies  Infer community  Test.

16S rRNA Experimental Design

Quantitative Phylogenetic Assessment of Microbial Communities in Diverse Environments Xinjun Zhang.

16S RNA sequencing analysis

Rob Edwards San Diego State University

Canadian Bioinformatics Workshops

Metagenomic Species Diversity.

Classification BIO – Explain the historical development and changing nature of classification systems. BIO – Analyze the classification of.

How to Use This Presentation

Metagenomics: From Bench to Data Analysis 19-23rd September S rRNA-based surveys for Community Analysis: How Quantitative are they? Dr.

Peter Sterk EBI Metagenomics Course 2014

Classification BIO – Explain the historical development and changing nature of classification systems. BIO – Analyze the classification of.

Chapter 17 Table of Contents Section 1 Biodiversity

PNAS 2012 Alpha diversity: how many species are in each sample?

BIOLOGY: Characteristics of Living Things

Copyright Pearson Prentice Hall

Introductory Microbiology Dr. Hala Al Daghistani

Microbial Taxonomy and the Evolution of Diversity

Microbial Taxonomy Classification Systems Levels of Classification

Research in Computational Molecular Biology , Vol (2008)

Everyone is a Biologist: Studier of Life!

Unit 1 – Science Inquiry Biology.

Workshop on the analysis of microbial sequence data using ARB

Classification of Organisms

Human Cells Human genomics

DNA-based technology New and old technologies that are utilized in biotechnology DNA cloning DNA libraries Polymerase chain reaction (PCR) Genome sequencing.

Copyright Pearson Prentice Hall

Genomes and Their Evolution

Chapter 17 Table of Contents Section 1 Biodiversity

H = -Σpi log2 pi.

Introductory Microbiology Dr. Hala Al Daghistani

Microbiome: Metagenomics

Daniel A. Peterson, Daniel N. Frank, Norman R. Pace, Jeffrey I. Gordon

Copyright Pearson Prentice Hall

Classifying Organisms

Chapter 17 Table of Contents Section 1 Biodiversity

Classification Chapter 18.

Systematics Systematics is the science of categorizing organisms into like groups and establishing their relationship relative to each other. Eight major.

Reading Phylogenetic Trees

Microbiome studies for microbial disease pathogenesis research

Copyright Pearson Prentice Hall

Volume 10, Issue 4, Pages (October 2011)

Unit Genomic sequencing

Daniel A. Peterson, Daniel N. Frank, Norman R. Pace, Jeffrey I. Gordon

Copyright Pearson Prentice Hall

Toward Accurate and Quantitative Comparative Metagenomics

Presentation transcript:

Microbiome: 16S rRNA Sequencing 3/30/2018

Skills from Previous Lectures

Central Dogma of Biology Lecture 3: Genetics and Genomics Lecture 4: Microarrays Lecture 12: ChIP-Seq

Phylogenetics Lecture 13: Phylogenetics

(Linnaean) Taxonomy Domain Kingdom Phylum Class Order Family Genus Species Eukaryota Animalia Chordata Mammalia Primates Hominidae Homo Homo sapiens Bacteria Proteobacteria Gammaproteobacteria Enterobacteriales Enterobacteriaceae Escherichia Escherichia coli Lecture 13: Phylogenetics

Illumina Sequencing Read Length Single- vs. Paired-End Reads Number of Reads Coverage/Depth Lecture 5: 2nd (Next) Generation Sequencing Lecture 10: RNA-Seq

Quality Assessment & Trimming Lecture 6: Sequence Analysis 1 – WGS/WES Lecture 10: RNA-Seq

Sequence Analysis Assembly: Putting short sequences together to reconstruct a longer, source sequence Mapping: Locating where one short sequence is found in a longer sequence Pattern Recognition: Looking for specific patterns within sequences that have special meaning Lecture 3: Genetics and Genomics Lecture 5: 2nd (Next) Generation Sequencing

Sequence Alignment Lecture 3: Genetics and Genomics Lecture 13: Phylogenetics

Microbiome

Microbiome Bacteria + Archaea Fungi Viruses

Microbiome Not all microbes are bad Still trying to understand microbe-microbe and microbe-host interactions Traditional approaches rely on isolating microbes We are unable to culture many microbial species → Need to directly sequence

Microbiome Want to understand the function and structure of microbial communities Function = genes + metabolic pathways Structure = species richness and distribution

Human Microbiome Project Launched 2008, Finished (phase 1) 2012 Second phase is ongoing Championed culture-independent methods 16S rRNA Sequencing Whole Genome Sequencing Created reference genomes! Not the first study to survey microbiomes Just the first to survey the human microbiome large-scale https://hmpdacc.org/ https://commonfund.nih.gov/hmp/

Human Microbiome Composition Different microbes colonize different body areas Lots of variation between individuals The Human Microbiome Project Consortium. (2012). Structure, function and diversity of the healthy human microbiome. Nature

16S rRNA Sequencing Biology

Metagenomics Genomics: Single organism Metagenomics: Group of (micro)-organisms

Why rRNA? Relatively short (~1.5 kb) Highly conserved Short = cheap Highly conserved Forms a ribosome, which is highly translated, so most mutations will impact fitness Generally different between species Also enough information in smaller (variable) regions == cheap Is found in all organisms Component of self-replicating systems Is readily isolated Is highly conserved Probably because it is critical to cell function Has a low rate of horizontal gene transfer and recombination Has sufficient genetic information to differentiate closely-related organisms it is relatively small (~1.5 kb), it has a high enough level of sequence conservation between microbial species to permit reliable alignments, and it possesses sufficient variation to infer evolutionary relationships. Got “hot” in the 1990s “Comparative analysis of molecular sequences has become a powerful approach to determining evolutionary relationships.” “To determine relationships covering the entire spectrum of extant living systems, one optimally needs a molecule of appropriately broad distribution…ribosomal RNA [fits this requirement].” “It is a component of all self-replicating systems; it is readily isolated; and its sequence changes but slowly with time – permitting the detection of relatedness among very distant species.” Woese & Fox (1977). Phylogenetic structure of the prokaryotic domain: the primary kingdoms. PNAS Lane, Pace, Olsen, Stahl, Sogin, & Pace (1985). Rapid determination of 16S ribosomal RNA sequence for phylogenetic analyses. PNAS

Central Dogma Translation requires ribosomes… …ribosomal RNA (rRNA) encode ribosomes

16S rRNA Gene Ribosomal RNA gene 9 variable regions Encodes the 30S small subunit 9 variable regions Used as signatures to determine phylogeny Choose which region(s) to sequence Several universal regions Used to create primers S = “Svedberg” Svedberg = unit of molecular size V5&V6 V4 V7&V8 V3 V1&V2 V9

rRNA Genes as a Marker 16S rRNA 18S rRNA Woese used rRNA to classify life into three domains Woese, Kandler, & Wheelis (1990). Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. PNAS

Microbiome Bacteria + Archaea Fungi Viruses Metagenomic/ Whole Genome Shotgun Sequencing Bacteria + Archaea Fungi Viruses 16S rRNA Sequencing 18S rRNA Sequencing

Taxonomic Resolution Domain Kingdom Phylum Class Order Family Genus Species 16S/18S rRNA does not yield species-level information → genus-level, at best, but usually higher Closely-related species have a high sequence similarity across the 16S gene Typically don’t sequence the whole gene, just 1(+) variable regions

Population Analysis Alpha Diversity: Within a sample Beta Diversity: Between samples Evenness: Distribution of taxa Richness: Number of taxa Rarefaction Curves: Alpha diversity vs number of observations Used to test whether an environment has been sufficiently sequenced to observe all taxa PCA/PCoA: Principal Component/Coordinate Analysis NMDS: Non-Metric Multidimenstional Scaling Dimensionality reduction techniques for visualization

PCR PCR: Polymerase Chain Reaction “Molecular photocopying” Technique to amplify/copy small segments of DNA How do you know you discovered something important? Win Nobel Prize Have people stop citing your work

16S rRNA Sequencing Survey: Taxonomic Classification Sample 16S rRNA Who’s there? Taxonomic Classification Population Analysis Alpha diversity Beta diversity Over/under-representation 16S rRNA Sequencing Sample

16S rRNA Sequencing Bioinformatics

Basic Workflow Sample microbiome Harvest gDNA Amplify 16S rRNA gene (or region) Sequence Taxonomic Classification Population analysis Alpha diversity Beta diversity Over/under representation

Things to Keep in Mind Sequencing platform Error rate, biases, read length, noise Choice of variable region(s) Amplification process Error rate, biases, choice of primer, DNA template concentration, PCR cycle number, introduction of chimeras Coverage/Depth

Preprocessing Quality Assessment & Trimming Remove adapters, PCR primers, and low-quality bases Demultiplex using barcodes, discard reads without a barcode

Binning OTU: Operational Taxonomy Unit (also “phylotype”) A group of sequences clustered together based purely on similarity and an arbitrary threshold May or may not be equivalent to taxonomical entities (species, genera, etc.) Can cluster based on similarity to a reference database or de novo (compared to each other) Can also cluster de novo and then assign taxonomy

Binning Cutoffs: What percent sequence identify should you use? → Will depend on the error rate, etc.

Software Packages mothur (your project) QIIME (“chime”) bioBakery* (PICRUSt) CloVR* *Not currently on the SCC, except for specific projects

Taxonomy Databases Greengenes MG-RAST NCBI RDP SILVA *A different database may give you different taxonomic results

Notes 16S rRNA Sequencing analysis is qualitative Surveying who is there PCR depends on the a priori knowledge assumption of universal primers May yield an altered/incomplete estimation of diversity Also, uneven primer annealing, uneven amplification, etc… Are converting either to binary data (presence/absence) or normalizing (relative abundance) Make sure you are using the appropriate statistical/analysis tools for binary and normalized data!

Notes Taxonomic classification will depend on The resolution which variable region of the 16S rRNA gene is used The primers used for PCR Which database is used Which software package is used What cutoff is used Sequence coverage/depth Sequencing platform The species composition in the community being analyzed When you perform the analysis …

Considerations Define the question as precisely as possible. What controls do you need? What sequencing platform will you use? Illumina is the typical platform (right now) What region of the 16S rRNA gene will you amplify? V4 usually yields genus-level How many reads do you need per sample? Coverage/Depth What are hidden technical issues? Example: Chimeras What analysis tool will you use? How will you display your data? How will you compare your results with other published studies?