Next–generation DNA sequencing technologies – theory & practice


Outline
Next-Generation sequencing (NGS) technologies – overview
NGS targeted re-sequencing – fishing out the regions of interest
NGS workflow: data collection and processing – the exome sequencing pipeline

PART I: NGS technologies Next-Generation sequencing (NGS) technologies – overview

DNA Sequencing – the next generation
The automated Sanger method is considered a ‘first-generation’ technology, and newer methods are referred to as next-generation sequencing (NGS).

Landmarks in DNA sequencing
1953 Discovery of DNA double helix structure
1977 A. Maxam and W. Gilbert, "DNA seq by chemical degradation"; F. Sanger, "DNA sequencing with chain-terminating inhibitors"
1984 DNA sequence of the Epstein-Barr virus, 170 kb
1987 Applied Biosystems – first automated sequencer
1991 Sequencing of human genome in Venter's lab
1996 P. Nyrén and M. Ronaghi – pyrosequencing
2001 A draft sequence of the human genome
2003 Human genome completed
2004 454 Life Sciences markets first NGS machine

2005

DNA Sequencing – the next generation
Random genome sequencing: 25 Mb, 300k reads, 110 bp
Sanger sequencing: targeted, 700-1000 bp

DNA Sequencing – the next generation
The newer technologies constitute various strategies that rely on a combination of:
Library/template preparation
Sequencing and imaging

DNA Sequencing – the next generation
Commercially available technologies
Roche – 454 GS FLX Titanium, GS Junior
Illumina – HiSeq 2000, MiSeq
Life – SOLiD 5500xl, Ion Torrent
Helicos BioSciences – HeliScope
Pacific Biosciences – PacBio RS

DNA Sequencing – the next generation

Template preparation: STEP 1
Produce a non-biased source of nucleic acid material from the genome

Template preparation
Produce a non-biased source of nucleic acid material from the genome
Current methods:
Randomly break genomic DNA into smaller sizes
Ligate adaptors
Attach or immobilize the template to a solid surface or support
The spatially separated template sites allow thousands to billions of sequencing reactions to be performed simultaneously

Template preparation
Clonal amplification: Roche – 454, Illumina – HiSeq, Life – SOLiD
Single molecule sequencing: Helicos BioSciences – HeliScope, Pacific Biosciences – PacBio RS

Template preparation: Clonal amplification
In solution – emulsion PCR (emPCR): Roche – 454, Life – SOLiD
Solid phase – bridge PCR: Illumina – HiSeq

Template preparation: Clonal amplification - emPCR

Sequencing SOLiD 454

Pyrosequencing Picotitre plate Pyrosequencing

Pyrosequencing
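
The pyrosequencing read-out can be illustrated with a toy sketch. It assumes the usual textbook description of the method: nucleotides are flowed one at a time in a fixed cyclic order, and the light signal of each flow is proportional to the number of bases incorporated. The flow order "TACG" and the function names are illustrative assumptions, and real signals are noisy intensities rather than exact integer counts.

```python
def flowgram(read, flow_order="TACG"):
    """Simulate an idealized flowgram for the sequence being synthesized.

    Returns a list of (flowed_base, incorporated_count) pairs.
    Hypothetical helper for illustration only.
    """
    if any(base not in flow_order for base in read):
        raise ValueError("read contains a base that is never flowed")
    signals = []
    i = 0      # position in the read
    flow = 0   # index of the current nucleotide flow
    while i < len(read):
        base = flow_order[flow % len(flow_order)]
        run = 0
        # A homopolymer stretch is incorporated within a single flow.
        while i < len(read) and read[i] == base:
            run += 1
            i += 1
        signals.append((base, run))
        flow += 1
    return signals


def decode(signals):
    """Rebuild the read from an idealized flowgram."""
    return "".join(base * count for base, count in signals)


signals = flowgram("TTACGG")
print(signals)
print(decode(signals))
```

Because a homopolymer such as "TT" produces one stronger signal rather than two separate ones, calling the exact length of long homopolymers from real, noisy intensities is the hard part of this chemistry.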

Sequencing by ligation

Template preparation: Clonal amplification – Bridge PCR

Template preparation: Single molecule templates
HeliScope, PacBio

HiSeq Heliscope

DNA Sequencing – the next generation
The major advance offered by NGS is the ability to cheaply produce an enormous volume of data.
The arrival of NGS technologies in the marketplace has changed the way we think about scientific approaches in basic, applied and clinical research.

PART II: NGS targeted re-sequencing – fishing out the regions of interest

The beginning
Random genome sequencing: ???
Sanger sequencing: targeted, 700-1000 bp

DNA Sequencing – the next generation
Library/template preparation
Library enrichment for target
Sequencing and imaging

Target enrichment strategies
Random genome sequencing
Hybrid capture
PCR based
Sanger sequencing

Target enrichment strategies

Target enrichment strategies: MIP

Hybrid Capture
In solution: Agilent, Nimblegen, ...
Solid phase: Febit

Hybrid Capture
In solution: relatively cheap; high throughput is possible; small amounts of DNA sufficient
Solid phase: straightforward method; flexible; higher amounts of DNA required

Target enrichment strategies

PCR based approaches
Uniplex
Multiplex
Fluidigm
Raindance
Multiplicon
Long-range PCR products

PCR based approaches: Raindance

PCR based approaches: Fluidigm 48.48 Access Array

Target enrichment strategies

PART III: NGS workflow: data collection and processing – the exome sequencing pipeline

Whole Exome Sequencing
The human genome: Genome = 3 Gb, Exome = 30 Mb, 180 000 exons
Protein coding genes constitute only approximately 1% of the human genome.
It is estimated that 85% of the mutations with large effects on disease-related traits can be found in exons or splice sites.
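
The figures on this slide can be cross-checked with a few lines of arithmetic. This is only a sketch using the slide's round numbers (3 Gb genome, 30 Mb exome, 180 000 exons); the variable names are illustrative.

```python
GENOME_BP = 3_000_000_000   # 3 Gb, as quoted on the slide
EXOME_BP = 30_000_000       # 30 Mb, as quoted on the slide
EXON_COUNT = 180_000        # number of exons, as quoted on the slide

# Fraction of the genome covered by the exome.
exome_fraction = EXOME_BP / GENOME_BP

# Average exon length implied by the slide's numbers.
mean_exon_bp = EXOME_BP / EXON_COUNT

print(f"Exome share of genome: {exome_fraction:.1%}")
print(f"Implied mean exon length: {mean_exon_bp:.0f} bp")
```

The first value reproduces the "approximately 1%" on the slide; the second shows that the quoted totals imply short targets, on the order of a couple of hundred bases per exon.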

Exome sequencing
gDNA: 3 Gb
Exome: 38 Mb
NGS

The past, present & future

Exome sequencing capacity
HiSeq specifications:
2 flow cells
16 lanes (8 per flow cell)
200-300 Gbases per flow cell
10 days for a single run
Exome throughput:
96 exomes @ 60x coverage per run
3000 exomes @ 60x coverage per year
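
The throughput numbers above can be related to each other with a short calculation. This is a rough sketch built only from the slide's figures; it assumes back-to-back runs on a single instrument with no downtime, which is an idealization, and the variable names are illustrative.

```python
FLOW_CELLS = 2
LANES_PER_FLOW_CELL = 8
GBASES_PER_FLOW_CELL = (200, 300)   # range quoted on the slide
DAYS_PER_RUN = 10
EXOMES_PER_RUN = 96
EXOMES_PER_YEAR_QUOTED = 3000

lanes = FLOW_CELLS * LANES_PER_FLOW_CELL
exomes_per_lane = EXOMES_PER_RUN / lanes

# Raw yield available per exome if the run is shared evenly.
gbases_per_exome = tuple(
    FLOW_CELLS * g / EXOMES_PER_RUN for g in GBASES_PER_FLOW_CELL
)

# Upper bound on yearly throughput with no idle time (assumption).
max_runs_per_year = 365 // DAYS_PER_RUN
max_exomes_per_year = max_runs_per_year * EXOMES_PER_RUN

# Number of runs implied by the slide's yearly figure.
runs_for_quoted_figure = EXOMES_PER_YEAR_QUOTED / EXOMES_PER_RUN

print(lanes, exomes_per_lane)
print(gbases_per_exome)
print(max_runs_per_year, max_exomes_per_year, runs_for_quoted_figure)
```

Under these assumptions the slide's 96 exomes per run corresponds to 6 exomes per lane, and the quoted 3000 exomes per year corresponds to roughly 31 runs, a little below the no-downtime ceiling of 36.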

Data processing workflow
Data formatting & QC
Mapping & QC
Variant calling
Variant annotation
Variant filtering/comparison

Data processing

DATA GENERATION, DATA PROCESSING, DATA STORAGE, INTERPRETATION RESULTS, REPORTING & VALIDATION

DATA GENERATION
Prepare sample library
Perform exome capture
Perform sequencing

DATA GENERATION, DATA PROCESSING, DATA STORAGE
Image processing
Base calling
Sequence data: 10-15 Gb / exome

NGS data processing: overview
1 Mapping
2 Duplicate marking
3 Local realignment
4 Base quality recalibration
5 Analysis-ready mapped reads
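
Step 2 above, duplicate marking, can be illustrated with a deliberately simplified sketch. It assumes that reads mapping to the same chromosome, start position and strand are copies of one original fragment and keeps the read with the highest mapping quality; the `Read` type and `mark_duplicates` function are hypothetical and do not reproduce the exact rules of any particular tool.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    name: str
    chrom: str
    pos: int      # leftmost mapped position
    strand: str   # "+" or "-"
    mapq: int     # mapping quality


def mark_duplicates(reads):
    """Return the names of reads flagged as duplicates (toy rule)."""
    groups = defaultdict(list)
    for read in reads:
        groups[(read.chrom, read.pos, read.strand)].append(read)

    duplicates = set()
    for group in groups.values():
        # Keep one representative per group; flag the rest.
        best = max(group, key=lambda r: r.mapq)
        duplicates.update(r.name for r in group if r is not best)
    return duplicates


reads = [
    Read("r1", "chr1", 100, "+", 60),
    Read("r2", "chr1", 100, "+", 30),   # same start and strand as r1
    Read("r3", "chr1", 100, "-", 60),   # opposite strand, kept
    Read("r4", "chr2", 100, "+", 60),   # different chromosome, kept
]
print(mark_duplicates(reads))
```

Duplicates are flagged rather than deleted so that later steps such as variant calling can ignore them without losing the original data.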

DATA GENERATION, DATA PROCESSING, DATA STORAGE
Image processing
Base calling
Sequence data: 10-15 Gb / exome
QC sequencing
Mapping sequences
QC capture experiment

DATA PROCESSING: QC NGS, mapping, QC HC (hybrid capture)

DATA GENERATION, DATA PROCESSING, DATA STORAGE
Image processing
Base calling
Sequence data: 10-15 Gb / exome
QC sequencing
Mapping sequences
QC capture experiment
Mapping results: 5 Gb / exome
Variant calling
Variant annotation

DATA GENERATION, DATA PROCESSING, DATA STORAGE
Image processing
Base calling
Sequence data: 10-15 Gb / exome
QC sequencing
Mapping sequences
QC capture experiment
Mapping results: 5 Gb / exome
Variant calling
Variant annotation
Variant calls: 100 Mb / exome
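
The per-exome storage figures on this slide add up quickly at the yearly throughput quoted earlier. The sketch below is a back-of-the-envelope estimate; it assumes that "Gb" and "Mb" mean gigabytes and megabytes with 1 Gb = 1000 Mb, that all three data types are kept, and that 3000 exomes are processed per year (the figure from the capacity slide).

```python
SEQUENCE_MB = (10_000, 15_000)   # 10-15 Gb of sequence data per exome
MAPPING_MB = 5_000               # 5 Gb of mapping results per exome
VARIANTS_MB = 100                # 100 Mb of variant calls per exome
EXOMES_PER_YEAR = 3000           # from the capacity slide (assumption here)


def yearly_storage_tb(sequence_mb):
    """Total storage per year in terabytes for one sequence-data size."""
    per_exome_mb = sequence_mb + MAPPING_MB + VARIANTS_MB
    return per_exome_mb * EXOMES_PER_YEAR / 1_000_000


low_tb, high_tb = (yearly_storage_tb(s) for s in SEQUENCE_MB)
print(f"Estimated storage: {low_tb:.1f} to {high_tb:.1f} TB per year")
```

The raw sequence data dominates the total, which is why the decision of how long to keep it matters far more than the size of the variant calls.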

SNPs vs Indels
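
The SNP/indel distinction can be expressed as a small rule on the reference and alternate alleles. This sketch assumes alleles are given as plain strings in the style of the REF and ALT columns of a variant call file; the function name and the "other" category are illustrative choices.

```python
def classify_variant(ref, alt):
    """Classify a variant from its reference and alternate alleles."""
    if ref == alt:
        raise ValueError("ref and alt are identical: not a variant")
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"      # single base substituted
    if len(ref) != len(alt):
        return "indel"    # bases inserted or deleted
    return "other"        # e.g. a multi-base substitution


print(classify_variant("A", "G"))
print(classify_variant("AT", "A"))
print(classify_variant("C", "CTT"))
```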

exonic vs non-exonic

Exonic

DATA GENERATION, DATA PROCESSING, DATA STORAGE
Image processing
Base calling
Sequence data: 10-15 Gb / exome
QC sequencing
Mapping sequences
QC capture experiment
Mapping results: 5 Gb / exome
Variant calling
Variant annotation
Variant calls: 100 Mb / exome
Variant filtering
Database of known variants (public & private)
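
The variant filtering step against a database of known variants can be sketched as a set lookup. This is a minimal illustration, assuming each variant is identified by a (chromosome, position, ref, alt) tuple and that the known variants fit in memory; the function name and example records are hypothetical.

```python
def filter_known(calls, known):
    """Keep only the calls that are absent from the known-variant set."""
    known_keys = set(known)
    return [variant for variant in calls if variant not in known_keys]


# Hypothetical example records: (chrom, pos, ref, alt)
calls = [
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 500, "AT", "A"),
]
known = [
    ("chr1", 1000, "A", "G"),   # already present in a database
]

novel = filter_known(calls, known)
print(novel)
```

Matching on the full tuple means a different alternate allele at a known position is still treated as novel; whether that is the desired behaviour depends on the study design.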

DATA GENERATION, DATA PROCESSING, DATA STORAGE, INTERPRETATION RESULTS, REPORTING & VALIDATION
Image processing
Base calling
Sequence data: 10-15 Gb / exome
QC sequencing
Mapping sequences
QC capture experiment
Mapping results: 5 Gb / exome
Variant calling
Variant annotation
Variant calls: 100 Mb / exome
Variant filtering
Database of known variants (public & private)
Validated variants in candidate genes