Next generation sequencing

Next generation sequencing: platforms, chemistries, and applications

Outline. Sanger sequencing: chain termination with modified dNTPs. “Next generation sequencing” (NGS): “sequencing by synthesis” systems; pyrosequencing refers to the Roche GS FLX (formerly “454”). 3rd generation sequencing (discussed by Kristen), e.g., NanoString.

Sanger sequencing: method of choice for years; based on chain-terminating nucleotides; automated by Applied Biosystems using fluorescently labeled chain terminators and capillary electrophoresis.

Method. (1) Extract DNA. (2) Shear/digest and clone. (3) PCR amplify (cloning optional). (4) Sequencing reaction: primer, DNA polymerase, regular dNTPs, and fluorescently tagged, chain-terminating ddNTPs. (5) Imaging: a CCD reads fluorescence as fragments pass through the capillary.

Sanger sequencing: pros & cons. Pros: long read lengths (up to ~700 bp); most flexible in throughput (from 1 to 1,000s of samples); convenient (found in many facilities). Cons: expensive (~$3/sequence); requires PCR or bacterial-mediated pre-amplification; cannot quantify genome copies or transcripts from DNA/cDNA libraries*. *Unless doing SAGE.

Next generation sequencing. Definition: massively parallel, cloning-free sequencing (by synthesis). Platforms: Roche GS FLX (pyrosequencing), Illumina (Solexa sequencing), Applied Biosystems (SOLiD).

Roche GS FLX (“454”): the original “pyrosequencer”. Pyrosequencing is not new (Nyren 1996); it was converted into a high-throughput system in 2005 (Margulies et al., Nature).

GS FLX library preparation: shear DNA/cDNA and ligate to adaptors. The amount of shearing depends on the desired read length. New reagents “claim” reads up to 500 bp. How much variation does this lead to?

Bind to beads & PCR amplify in emulsion (ePCR)

Spot beads onto picotitre plate (flow cell)

GS FLX sequencing chemistry

Output: creates an image for every read; ~13 Mbp/hr, ~400-500 bp/read. Best instrument for de novo work.

GS FLX: pros & cons vs. Sanger. Pros: cloning-free; generates Mbp of DNA sequence; massively parallel (all sequencing done simultaneously); quantitative (# reads => # molecules in sample); cheaper than Sanger per bp. Cons: shorter read lengths (200-400 bp); low biological replication (n = 8 for a $10k run); low flexibility in throughput (must do high throughput).

Illumina (formerly Solexa): polymerase-based sequencing by synthesis

Protocol: shear DNA/cDNA and link to adaptors; adaptors bind to probes on the flow cell, which form an adaptor “lawn” (similar to a probe array).

Clonal amplification of individual molecules

Sequencing chemistry: fluorescently labeled bases are initially blocked to prevent further polymerization; a laser reads the fluorescence; the base is then unblocked so that the next base can be added.
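The block/read/unblock cycle described on this slide can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: the function name `sequence_by_synthesis` is hypothetical, strand orientation is ignored, and every cycle is assumed to incorporate exactly one error-free base.

```python
# Toy model of reversible-terminator sequencing by synthesis.
# Assumption: one complementary base is incorporated per cycle, with no errors.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template, n_cycles):
    """Return the bases called over n_cycles for a single template strand."""
    read = []
    for cycle in range(min(n_cycles, len(template))):
        # 1. A blocked, fluorescently labeled base pairs with the template;
        #    the block prevents a second base from being added in this cycle.
        incorporated = COMPLEMENT[template[cycle]]
        # 2. The laser excites the label and the colour is recorded as a base call.
        read.append(incorporated)
        # 3. Label and block are removed, so the next cycle can extend by one base.
    return "".join(read)


called = sequence_by_synthesis("ATGGCA", n_cycles=4)
print(called)
```

The read length is set by the number of cycles run, which is why read length and run time are tied together on this kind of platform.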

Output: superimposed image of 4 colors. RNA-seq application (Kristen).

Illumina: pros & cons vs. Sanger. Pros: cloning-free; generates Gbp of DNA sequence; massively parallel (all sequencing done simultaneously); quantitative (# reads => # molecules in sample); cheaper per bp. Cons: short read lengths (20-100 bp); low biological replication (n = 8 for a $10k run); low flexibility in throughput (must do high throughput); run lasts 1-3 days.

Applied Biosystems SOLiD: Supported oligonucleotide ligation and detection system. Similar to FLX but uses DNA ligase; ePCR beads are coated onto a slide.

SOLiD chemistry

Coverage: 20X
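For reference, average coverage is usually estimated as (number of reads × read length) / genome size. The sketch below applies that formula; the 5 Mbp genome and the helper names are assumptions chosen for illustration, not values from the slides.

```python
import math


def coverage(n_reads, read_length_bp, genome_size_bp):
    """Average depth of coverage: total sequenced bases divided by genome size."""
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(target_coverage, read_length_bp, genome_size_bp):
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


# Hypothetical example: a 5 Mbp genome sequenced with 50 bp reads to 20X.
genome_size = 5_000_000
n = reads_needed(20, 50, genome_size)
print(n, coverage(n, 50, genome_size))
```

This is an average only; real coverage varies along the genome, so some regions fall below the nominal depth.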

SOLiD: pros & cons vs. Sanger. Pros: cloning-free; generates Gbp of DNA sequence; massively parallel (all sequencing done simultaneously); quantitative (# reads => # molecules in sample); cheaper per bp. Cons: short read lengths (25-50 bp); low biological replication (n = 8 for a $12k run); low flexibility in throughput (must do high throughput); run lasts 3-6 days.

Platform comparison

Applications: genome sequencing; resequencing; transcriptome characterization; comparative transcriptomics; miRNA profiling; epigenetics; ChIP sequencing.

Hypothetical experimental

Hypothetical experiment: sequence cDNA libraries from each bucket and/or treatment; count reads for each transcript; compare transcript abundances between treatments; BLAST against a reference genome.
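The counting and comparison steps can be sketched as follows. This is a minimal illustration, assuming reads have already been assigned to transcripts; the gene names, counts, and the choice of reads-per-million scaling with a pseudocount are assumptions made here, not part of the original slides.

```python
import math


def reads_per_million(counts):
    """Scale raw read counts so that libraries of different depth are comparable."""
    total = sum(counts.values())
    return {name: 1e6 * n / total for name, n in counts.items()}


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2(treated / control) per transcript, on reads-per-million values.

    The pseudocount avoids division by zero for transcripts missing in one library.
    """
    rpm_c = reads_per_million(control)
    rpm_t = reads_per_million(treated)
    names = sorted(set(rpm_c) | set(rpm_t))
    return {
        name: math.log2(
            (rpm_t.get(name, 0.0) + pseudocount) / (rpm_c.get(name, 0.0) + pseudocount)
        )
        for name in names
    }


# Hypothetical read counts per transcript for two treatments.
control = {"geneA": 500, "geneB": 300, "geneC": 200}
treated = {"geneA": 500, "geneB": 1200, "geneC": 300}

lfc = log2_fold_change(control, treated)
for name, value in lfc.items():
    print(f"{name}: {value:+.2f}")
```

Note that geneA has the same raw count in both libraries but a negative fold change after scaling, because the treated library is twice as deep. A real analysis would also need biological replicates and a statistical test.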

NGS vs. microarray. With microarray: must have sequences in hand to design probes. With NGS: there is no such bias. Sequence everything. # of reads is proportional to # of transcripts. Also no bias toward a particular gene region.

Fu et al. 2008

Microarrays: a dying technology? Must generate sequences first; difficulty in interpreting data; probe hybridization issues; can only resolve large differences; NGS shows higher correlation with protein. But NGS is a bioinformatics nightmare!

The beginning of the end of the microarray? Requires knowledge of the sequences on the array; cross-hybridization is problematic if sequences are similar; difficult to detect low-abundance species; reproducibility between labs and platforms.

RNA-Seq: a new tool for transcriptomics. “Shotgun transcriptomic sequencing/short read”; a more precise method of measuring expression. Platforms: Illumina, Applied Biosystems SOLiD, 454 Life Sciences. Transcriptomics on non-model organisms; reveals SNPs; reveals connectivity between exons (long or paired reads). High accuracy, on par with qPCR. Quantitation: spike-in RNA standards; no upper limit, 5 orders of magnitude; no extensive normalization required across treatments.

Total RNA or polyA(+) RNA; cDNA production; adaptor ligation (one or both ends); paired-end or single-end reads; reads 30-400 bp. Wang et al. 2009, Nature Reviews Genetics

Illumina sequencing ~35bp, single end reads, ~ 15 M reads Nagalakshmi et al. 2008, Science

RNA-Seq pitfalls. Difficulty with the following: mapping short reads to the genome; appropriate assignment of ‘multi-mapping’ reads; identification of new splice junctions; sample comparison to identify differentially expressed genes; reads mapping outside annotated boundaries; genomic DNA contamination; pre-spliced heterogeneous nuclear RNA. A bioinformatic challenge. Shendure 2008, Nature Methods
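To make the ‘multi-mapping’ problem concrete, here is one simple heuristic: split each ambiguous read among its candidate genes in proportion to their uniquely mapped read counts. This is a sketch of one possible approach, not necessarily the method used in the cited papers; the function name and the example data are hypothetical.

```python
def assign_multireads(unique_counts, multireads):
    """Distribute multi-mapping reads in proportion to unique read counts.

    unique_counts: dict of gene -> number of uniquely mapped reads
    multireads:    iterable of tuples, each listing the genes one read maps to
    Returns a dict of gene -> estimated (possibly fractional) read count.
    """
    totals = {gene: float(n) for gene, n in unique_counts.items()}
    for candidates in multireads:
        weights = [unique_counts.get(gene, 0) for gene in candidates]
        weight_sum = sum(weights)
        for gene, weight in zip(candidates, weights):
            # With no unique evidence for any candidate, split the read evenly.
            share = weight / weight_sum if weight_sum else 1.0 / len(candidates)
            totals[gene] = totals.get(gene, 0.0) + share
    return totals


# Hypothetical data: 10 reads map equally well to geneA and geneB.
unique = {"geneA": 90, "geneB": 10}
multi = [("geneA", "geneB")] * 10
estimated = assign_multireads(unique, multi)
print(estimated)
```

Discarding the ambiguous reads instead would undercount genes with close paralogs, which is why their assignment is listed as a pitfall.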

Marioni et al. 2008

NanoString Technology: minimal background signal; no amplification (which can induce bias); less sample needed; improved detection of low-expression RNAs (single copy per cell). Fortina and Surrey 2008, Nature Biotechnology

Probe design: 2 ssDNA probes per mRNA (35-50 bp oligos); overnight hybridization to mRNA (solution-based); slide adhesion via a biotin-labeled capture probe; reporter probe with 4 spectrally distinct dyes in 7 positions (the ‘barcode’), giving 4^7 = 16,384 barcodes.
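The figure of 16,384 barcodes follows from 4 dyes at each of 7 positions, i.e. 4^7. A short check, with the dye names below being placeholders chosen for illustration:

```python
from itertools import product

# Placeholder colour names; the slide only states that there are 4 distinct dyes.
DYES = ("blue", "green", "yellow", "red")
POSITIONS = 7

# Each barcode is an ordered choice of one dye per position.
n_barcodes = len(DYES) ** POSITIONS
print(n_barcodes)

# Enumerating lazily gives the same count and shows what a barcode looks like.
first_barcode = next(product(DYES, repeat=POSITIONS))
print(first_barcode)
```

This is the combinatorial upper bound; any design constraints on which dye sequences are usable in practice would reduce the number.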