Next Generation Sequencing


Next Generation Sequencing Miluše Hroudová Laboratory of Genomics and Bioinformatics Institute of Molecular Genetics of the ASCR, v.v.i. The presentation is supported from the project OP EC CZ.1.07/2.3.00/30.0027 “Founding the Centre of Transgenic Technologies”

Outline
- Introduction to Next Generation Sequencing (NGS)
- Material: DNA / RNA (types, characteristics, applications), genomics vs. transcriptomics
- Technologies: principles, workflow, parameters
- Data analysis (basic pipeline)
- Project example (IMG)
- Technology progression

Basic Terms
- Base-pair: basic building block of double-stranded DNA, unit of DNA segment length (bp)
- Read: continuous sequence produced by the sequencer
- Coverage: the number of short reads that overlap each other within a specific genomic region (how many times a particular base or region is read)
- Consensus sequence: idealized sequence in which each position represents the base most often found when many sequences are compared
- Contig: set of overlapping segments (reads) of DNA sequence forming a continuous consensus sequence
- Assembly: aligning and merging fragments of DNA sequence (reads, contigs) in order to reconstruct the original sequence
- Scaffold: set of linked non-contiguous genomic sequences, consisting of contigs separated by gaps of known length
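The definitions of coverage and consensus sequence can be illustrated with a minimal sketch. It assumes the reads are already aligned without gaps, each given as a hypothetical (start, sequence) pair on a reference of known length; the toy reads and the function names are illustrative and are not taken from the presentation.

```python
from collections import Counter

def per_base_coverage(reads, ref_len):
    """Count how many reads cover each reference position (0-based starts)."""
    depth = [0] * ref_len
    for start, seq in reads:
        for pos in range(start, min(start + len(seq), ref_len)):
            depth[pos] += 1
    return depth

def consensus(reads, ref_len):
    """Majority-vote base per position; 'N' where no read covers the position."""
    columns = [Counter() for _ in range(ref_len)]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos < ref_len:
                columns[pos][base] += 1
    return "".join(col.most_common(1)[0][0] if col else "N" for col in columns)

# Toy data: four 6 bp reads; the last one carries a mismatch at position 4.
reads = [(0, "ACGTAC"), (2, "GTACGG"), (4, "ACGGTT"), (3, "TTCGGT")]
depth = per_base_coverage(reads, 10)
print(depth)                   # coverage at each position
print(sum(depth) / len(depth)) # mean coverage
print(consensus(reads, 10))    # the mismatch is outvoted by the other reads
```

Real assemblers also handle gaps, base qualities, and ties; this sketch only shows why higher coverage makes the consensus more robust to individual read errors.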

Next Generation Sequencing: Introduction
- Modern high-throughput DNA sequencing technologies
- Massive, parallel, rapid ...
- Decreasing: price, time, workflow complexity, error rate
- Increasing: data quantity and quality, read length (data storage capacity), repertoire of bioinformatics tools
- Wide range of applications
- Third Generation Sequencing (single molecule, real time, in situ ...)

Input Material, Target Sequence: DNA
- Applications: de novo genome seq, resequencing (ChIP-Seq), amplicon seq (16S), sequence capture, base modification detection, genomic variations
- Genome sources: eukaryotic, viral, prokaryotic
- => Genomics

Genomics
- Area of genetics that concerns the sequencing and analysis of an organism's genetic information
- DNA sequencing + bioinformatics => sequence, assemble and analyze the function and structure of genomes (the complete set of DNA within a single cell of an organism)
- [Figures: bacterial genome; human genome]

Input Material, Target Sequence: RNA
- Applications: RNA Seq (Whole Transcriptome Shotgun Seq, WTSS; normalized), SNP detection, RNA species other than mRNA, quantitative seq (without normalization)
- Total RNA
  - Coding RNA (4 % of total): pre-mRNA (hnRNA), mRNA
  - Functional RNA (96 % of total): pre-rRNA, pre-tRNA, rRNA, tRNA, snRNA, snoRNA, miRNA, siRNA
  - [Diagram labels: all organisms; eukaryotes only]
- => Transcriptomics

Transcriptomics
- Study of the transcriptome: the complete set of RNA transcripts produced from the genome under specific circumstances, at a particular place and time
- Methods: RT PCR, microarrays, mRNA seq
- mRNA sequencing procedure:
  - Total RNA
  - polyA mRNA selection / rRNA depletion => mRNA
  - Temperature-based fragmentation => fragmented mRNA
  - Reverse transcription => cDNA
  - Normalization (optional) => normalized cDNA
  - Library preparation (adapter ligation, size selection) => cDNA library (DNA sequencing procedure)
  - Sequencing run => raw data (reads)

RNA Quality
- Quality of the starting total RNA: RNA integrity number (RIN)
- RIN < 7 => unequal read distribution along 5' and 3' ends => bad sequencing results
- [Figures: 454 read distribution (number of reads) for RIN < 7 and RIN > 9; Agilent Bioanalyzer traces]

cDNA synthesis
- Input: total RNA (µg), mRNA with polyA 3' end
- SMARTer II A Oligo: 5'-AAGCAGTGGTATCAACGCAGAGTACGCGGG-3'
- Modified CDS Primer: 5'-AAGCAGTGGTATCAACGCAGAGTTTTTGTTTTTTTCTTTTTTTTTTVN-3'
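The Modified CDS Primer ends in the degenerate bases VN. In standard IUPAC nucleotide notation, V stands for A, C or G (anything but T) and N for any base. In anchored oligo(dT) primers this kind of 3' end is generally used so that the primer sits at the start of the polyA tail rather than anywhere inside it. The sketch below simply enumerates the concrete 3' ends that "VN" stands for; the `expand` helper is illustrative and is not part of any kit software.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes needed for this primer.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "V": "ACG", "N": "ACGT"}

def expand(seq):
    """Return every concrete sequence matching a degenerate IUPAC sequence."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in seq))]

anchor_variants = expand("VN")
print(len(anchor_variants))  # number of distinct 3' anchors in the primer pool
print(anchor_variants)
```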

cDNA normalization
- TRIMMER cDNA normalization kit (Evrogen)
- DSN = duplex-specific nuclease
- [Figure labels: abundant transcripts; rare transcripts]

Sequencing Principles
- Sequencing by Synthesis
  - Sanger/dideoxy chain termination (Life Technologies, Applied Biosystems)
  - Pyrosequencing (Roche/454)
  - Reversible terminator (Illumina)
  - Ion Proton semiconductor (Life Technologies)
  - Zero Mode Waveguide (Pacific Biosciences)
- Sequencing by Oligo Ligation Detection
  - SOLiD (Applied Biosystems)
- Other
  - Asynchronous virtual terminator chemistry: HeliScope (Helicos)

Current Sequencing Platforms
- Roche/454 (GS FLX+/GS Junior)
- Illumina Genome Analyzer (HiSeq/MiSeq/NextSeq)
- Life Technologies (3500 Genetic Analyzer, Ion Torrent Proton/PGM)
- Pacific Biosciences (PacBio RSII)
- Applied Biosystems (SOLiD, 3730xl DNA Analyzer)

Sanger (3500 GA, 3730xl DNA Analyzer): sequencing by synthesis

Oligo Ligation Detection (SOLiD): sequencing by ligation

Reversible Terminator (HiSeq, MiSeq, NextSeq): cluster generation on a flow-cell surface

Reversible Terminator (HiSeq, MiSeq, NextSeq): sequencing by synthesis

Pyrosequencing (GS FLX, GS Junior): sequencing by synthesis

Sequencing Matrices
- Sanger, 96-well, 8 capillaries: 96 x 600 bp / 24 h, 1400 €
- Pyrosequencing, 2 regions: 1,000,000 x 600 bp / 20 h, 5500 €
- Reversible terminator, MiSeq: 10,000,000 x 250 bp / 40 h, 1150 €
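The figures on this slide can be turned into total output, throughput per hour, and price per megabase with simple arithmetic. The sketch below treats the read counts and read lengths as nominal maxima (an assumption: real runs yield fewer and shorter reads), and the `summarize` helper is only illustrative.

```python
# (reads per run, read length in bp, run time in hours, price in EUR), from the slide
runs = {
    "Sanger": (96, 600, 24, 1400),
    "Pyrosequencing": (1_000_000, 600, 20, 5500),
    "MiSeq": (10_000_000, 250, 40, 1150),
}

def summarize(reads, read_len, hours, price_eur):
    """Nominal output in Mb, throughput in Mb/h, and price in EUR/Mb."""
    total_mb = reads * read_len / 1e6
    return {
        "total_mb": total_mb,
        "mb_per_hour": total_mb / hours,
        "eur_per_mb": price_eur / total_mb,
    }

summary = {name: summarize(*values) for name, values in runs.items()}
for name, s in summary.items():
    print(f"{name}: {s['total_mb']:.4g} Mb, "
          f"{s['mb_per_hour']:.3g} Mb/h, {s['eur_per_mb']:.3g} EUR/Mb")
```

Under these nominal assumptions the price per megabase differs by several orders of magnitude between Sanger and the two NGS runs, which is the point the slide makes in tabular form.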

General Workflow
- Nucleic acid isolation/purification
- RNA: selection of particular RNA species, cDNA synthesis
- DNA: fragmentation, size selection (shotgun vs. paired end)
- Seq library preparation (ligation of platform-specific adaptors, indexes)
- Amplification of seq library (DNA-binding beads and other carriers)
- Sequencing run set-up
- Image processing (images => sequence + quality information)
- Data analysis (assembly, mapping, annotation ...)
- Special tricks for amplicons, SeqCap, ChIP-Seq, small RNAs ...
- [Responsibility labels on the slide: user, service, service, user]

Pyrosequencing workflow
- Library preparation: fragmentation, adaptor ligation
- Emulsion PCR amplification
- Bead deposition onto PicoTiter Plate (PTP)

Paired-end vs. Mate-pair
- Paired-end: sequencing from both fragment ends (< 1 kb)
- Mate-pair: longer (3-20 kb) molecules circularized via internal adapter
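Paired-end sequencing can be sketched in a few lines: one read is taken from each end of the same fragment, and the second read is reported as the reverse complement because it is read from the opposite strand. The fragment, the read length, and the function names below are illustrative assumptions. Mate-pair libraries are not modelled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def paired_end_reads(fragment, read_len):
    """Return (read 1, read 2) taken from the two ends of one fragment."""
    read1 = fragment[:read_len]
    read2 = reverse_complement(fragment[-read_len:])
    return read1, read2

fragment = "ACGTTGCAAGGCTTAACCGG"  # toy 20 bp fragment
r1, r2 = paired_end_reads(fragment, 5)
inner_distance = len(fragment) - 2 * 5  # unsequenced bases between the reads
print(r1, r2, inner_distance)
```

Because both reads come from one fragment of roughly known size, their expected distance and orientation give assemblers and mappers an extra constraint beyond the read sequences themselves.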

Mate-pair types
- Mate-pair: longer (3-20 kb) molecules circularized via internal adapter

Parameter Comparison
- Platform: PacBio RSII
- Principle: sequencing by synthesis
- Read length: > 4000 bp
- Accuracy: 99.999 %
- Run time: 30 min to 3 hours
- Output: 1.6 GB
- Reads per run: 0.06 M
- Advantages: read length, fast, no amplification, real-time record
- Disadvantages: low throughput, low accuracy
Liu et al. 2012. Comparison of Next-Generation Sequencing Systems. Journal of Biomedicine and Biotechnology. 251364.

Parameter Comparison
Liu et al. 2012. Comparison of Next-Generation Sequencing Systems. Journal of Biomedicine and Biotechnology. 251364.

Parameter Comparison of Benchtop Variants
- GS Junior (pyrosequencing): 700 bp; 70 Mb; 18 hours; 2 days
- Minimize hands-on time, increase emPCR reproducibility
- On/off instrument; µg
Liu et al. 2012. Comparison of Next-Generation Sequencing Systems. Journal of Biomedicine and Biotechnology. 251364.

Applications and Suitable Seq Type
- De novo DNA/RNA seq: Illumina, Roche/454 (PE), PacBio
- Resequencing: SOLiD, Illumina
- SNP detection: Roche/454, PacBio (vs. InDel variation: Illumina, SOLiD)
- Sequence capture: Illumina
- Sanger: low-coverage sequencing of individual positions and regions (e.g., diagnostic genotyping) or the sequencing of virus- and phage-sized genomes
- Ion Torrent: short amplicons
- SOLiD: quantitative applications, small RNAs, epigenomics
- HeliScope: quantitative applications
- Combination of methods

Data Analysis, Assembly, Annotation

Data Analysis, Assembly, Annotation
- Technology-compatible software (user friendly, ineffective)
- General, free-access software (search for the optimal tool)
- User-developed software (lack of qualified bioinformaticians)
- Combining data from different platforms vs. problems with assemblers
- Platform-specific errors, incompatible software parameters
- Multiple data filtering procedures
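One common example of a data filtering procedure is discarding reads whose mean base quality is too low. The sketch below assumes FASTQ-style quality strings in Phred+33 encoding and an arbitrary threshold of Q20. Both are assumptions made for illustration, not settings taken from the presentation.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality_string]

def passes_filter(quality_string, min_mean_q=20):
    """Keep a read only if its mean Phred score reaches the threshold."""
    scores = phred_scores(quality_string)
    return sum(scores) / len(scores) >= min_mean_q

# Hypothetical reads as (sequence, quality string) pairs.
reads = [("ACGT", "IIII"), ("ACGT", "!!!!"), ("ACGT", "5555")]
kept = [seq_qual for seq_qual in reads if passes_filter(seq_qual[1])]
print(len(kept), "of", len(reads), "reads kept")
```

In practice, filtering usually also covers adapter trimming, length cut-offs, and platform-specific error patterns, which is why the slide speaks of multiple procedures.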

Machine/Service Availability
- IMG: Roche/454 GS FLX+ (full run including library prep, 5500 €/0.7 GB); Illumina NextSeq (next year?)
- Illumina MiSeq: IEM AS CR, GeneCore EMBL (1150 €/10 GB)
- Illumina: GeneCore EMBL (HiSeq lane, 100 bp PE, 2500 €/200 GB)
- Ion Torrent: GeneCore EMBL, TU Liberec
- PacBio: Netherlands (Macrogen), Germany, Switzerland
- Beijing Genomics Institute (BGI, China): Illumina HiSeq 2000, Roche GS FLX+, SOLiD 4, Ion Torrent, Sanger 3730xl DNA Analyzer

Our Sequencing Projects
- Platforms: GS FLX+ (Roche 454), HiSeq2000/MiSeq (Illumina)
- Amplicon seq (environmental samples, 16S rDNA genes)
- De novo genome sequencing (bacteria, protozoa, platyhelminthes, plants ...)
- Sequence capture (human cancer research, animal population genetics ...)
- Metagenomics (simple bacterial consortia vs. complex environmental samples)
- Transcriptomics (protozoa, cnidarians, insects, human cancer research ...)
- Beckman CEQ 2000XL: minor sequencing analyses

Transcriptomics (Evo-Devo Studies)
- Craspedacusta sowerbyi: early evolution of Six and Pou genes
- Hroudova et al. 2012. PLoS ONE, 7(4): e36420

De Novo Genome Seq
- Achromobacter xylosoxidans isolated from biphenyl-contaminated soil
- 2-chlorobenzoate and 2,5-dichlorobenzoate degrader
- Strnad et al. 2011. J Bacteriol 193: 791-792

Metagenomics
- Ecosystem => total DNA => DNA fragments => sequencing => analysis
- [Figure labels: F. myxofaciens, At. ferrooxidans, others]

Metagenomic Research Examples
- Cow rumen and biotechnology: fishing out genes for cellulose biodegradation
- Lean vs. obese phenotype: microbiome transplantation
- Functional profiling and comparison of nine biomes

Amplicon Sequencing
- 16S rDNA genes
- Bacterial consortia actively degrading biphenyl, benzoate, and naphthalene in a long-term contaminated soil
- Uhlik et al. 2012. PLoS ONE, 7(7): e40653

Sequencing: Hot Today and in the Near Future
- Single-Molecule Real-Time seq (SMRT), PacBio (no amplification needed for signal detection)

Sequencing: Hot Today and in the Near Future
- Single-cell DNA/RNA seq based on micro/nanofluidics technology (without WGA based on MDA - Φ29 DNA polymerase)
- Nanopores: Oxford Nanopore Technologies (reduced enzymatic steps, electric-current-based detection)
- Silicon-based nanopores: IBM
- Human genome (30x) for under $1000 already announced by Illumina (HiSeq X Ten)
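To give a sense of what "30x" means in raw data, the expected coverage is roughly (number of reads x read length) / genome size, so the number of reads needed follows directly. The sketch below assumes a human genome size of about 3.1 Gb and a read length of 150 bp. Both values are illustrative assumptions rather than figures from the slide.

```python
def reads_needed(genome_size_bp, coverage, read_length_bp):
    """Smallest number of reads whose total bases reach the target coverage."""
    total_bases = genome_size_bp * coverage
    return -(-total_bases // read_length_bp)  # ceiling division on integers

HUMAN_GENOME_BP = 3_100_000_000  # assumed approximate genome size
reads = reads_needed(HUMAN_GENOME_BP, coverage=30, read_length_bp=150)
print(f"{reads:,} reads, {HUMAN_GENOME_BP * 30 / 1e9:.0f} Gb of sequence")
```

This ignores duplicates, unmappable reads, and uneven coverage, so real projects typically sequence somewhat more than the nominal figure.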

Before You Start Planning a Seq Experiment
- Sufficient sample source
- Targeted application/platform
- Computational capacity (storage, back-up, operations)
- Bioinformatics support

Take-away message
- NGS: high-throughput, massive, parallel, rapid DNA sequencing
- Third generation: single molecule, real time, reduced chemistry
- Basic NGS principles: synthesis, ligation
- Basic workflow: sample - fragmentation - library prep - seq run - data analysis
- Applications: de novo seq, reseq, amplicons, SeqCap, RNA seq (quantitative expression analysis vs. normalized cDNA seq)
- Choose the right application and prepare the sample appropriately
- Basic data analysis pipeline: image acquisition, quality metrics - filtering - contig building - annotation

Acknowledgement
- Laboratory of Genomics and Bioinformatics, IMG AS CR, Prague: Čestmír Vlček, Václav Pačes, Jan Pačes, Hynek Strnad, Michal Kolář, Jakub Rídl, Šárka Pinkasová
- Laboratory of Transcriptional Regulation, IMG (Dr. Zbyněk Kozmik)
- Core facility of Genomics and Bioinformatics, IMG (Mgr. Šárka Kocourková, Mgr. Marcela Vedralová)
- GeneCore, EMBL, Heidelberg (Dr. Vladimír Beneš)
- Roche CR (Diagnostic Division), Genetica CR (Illumina Division)

Thank you for your attention!
Miluše Hroudová, Institute of Molecular Genetics of the ASCR, v.v.i., hroudova@img.cas.cz