Next-generation sequencing: from basics to future diagnostics PART I: NGS technologies and standard workflow Sangwoo Kim, Ph.D. Assistant Professor, Severance.

Next-generation sequencing: from basics to future diagnostics PART I: NGS technologies and standard workflow Sangwoo Kim, Ph.D. Assistant Professor, Severance Biomedical Research Institute, Yonsei University College of Medicine

Overview
PART I. NGS technologies and standard workflow: next generation sequencing; history and technology; data and its meaning; process workflow; discussion
PART II. NGS analysis to find variants: single nucleotide variants (SNVs); copy number variations (CNVs); structural variations (SVs)
PART III. NGS application to diagnostics: NGS in genomic medicine; potential application to forensic science

Background

6.7% of Japanese patients with NSCLC harbor a fusion of EML4 with the intracellular kinase domain of ALK

PF-2341066 (Crizotinib) | cMet/ALK inhibitor

57% response rate, 27% stable disease

“The FDA approved the Pfizer drug in 2011 based on 250 patients, four years after the ALK-mutation link was discovered. That is lightning speed in an industry accustomed to spending a decade with thousands of test subjects to get drug approval.”

Genomic medicine is a reality (McCarthy et al., 2013, Sci Transl Med).

The first breakthrough: The Human Genome Project (1990~2003). Began in 1990. The consortium comprised institutions in the U.S., U.K., France, Australia, Japan, etc. “Rough draft” in 2000; “complete genome” published in 2003. 13 years, $3 billion.

The second breakthrough: Massively Parallel Sequencing (a.k.a. next-generation sequencing), via spatially separated, clonally amplified DNA templates or single DNA molecules (Metzker, Nat Rev Genet, 2010). Platforms shown: Illumina HiSeq2500, 5500 SOLiD system, Ion Torrent PGM.

1000 Genomes Project: launched in 2008. Sequencing of 1092 individual genomes was announced in 2012. A great repository for population genomics.

Genome 10K Project: inaugural publication in 2009. Aims to assemble a genomic zoo (10,000 vertebrate species).

Project announced in 2013, aiming for completion within 5 years. Its goal is to identify cancer genes (taking heterogeneity into account) and the genetics of rare diseases.

Overwhelmed by data “The challenges turns from data generation into data analysis!” Alex Sanchez, Introduction to NGS data analysis, 2012

Overwhelmed by data: …“Within a few years, Ponting predicts, analysis, not sequencing, will be the main expense hurdle to many genome projects. And that’s assuming there’s someone who can do it; bioinformaticists are in short supply everywhere.”... Alex Sanchez, Introduction to NGS data analysis, 2012; Elizabeth Pennisi, Science 2011

From a data-poor to a data-rich environment. “In the past, ‘classical’ bioinformatics consisted mainly of algorithms for sequence homology analysis, alignment, and reconstruction. But highly parallelized, large-scale biological data has begun to demand integration and interpretation that go beyond simple analysis.” “Today, data is generated everywhere. We now live in an era in which data ‘simply gets generated’.” Prof. Ju Han Kim, SNU Conference on Biomedical Informatics. Prof. Atul Butte, Stanford Univ.: Hypothesis driven data → Data driven hypothesis

Next generation sequencing

Traditional Sequencing: Genomic DNA is fragmented, then cloned into a plasmid vector and used to transform E. coli. For each sequencing reaction, a single bacterial colony is picked and plasmid DNA isolated. Each cycle sequencing reaction takes place within a microliter-scale volume.

Sanger Sequencing

Next Generation Sequencing. No cloning: the DNA to be sequenced is used to construct a library of fragments that have synthetic DNAs (adapters) added covalently to each fragment end by DNA ligase. Amplification can be done in parallel: library fragments are amplified in situ on a solid surface. Sequencing can be done in parallel, in 3 iterative steps: a nucleotide addition step, a detection step, and a wash step.
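The three iterative steps can be pictured with a toy simulation. This is only a sketch under simplifying assumptions that are not on the slide: exactly one base is incorporated per cycle on every fragment, detection is error-free, and each template is written in the order the polymerase reads it. The function name `sequence_in_parallel` is made up for illustration.

```python
# Watson-Crick pairing: the base added opposite each template base.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def sequence_in_parallel(templates, n_cycles):
    """Toy model: every template gains one detected base per cycle."""
    reads = [""] * len(templates)
    for cycle in range(n_cycles):
        # Step 1, nucleotide addition: one complementary base per template
        # (assumption: exactly one incorporation per cycle).
        added = [COMPLEMENT[t[cycle]] if cycle < len(t) else "" for t in templates]
        # Step 2, detection: record which base was added on each template.
        reads = [read + base for read, base in zip(reads, added)]
        # Step 3, wash: nothing is carried into the next cycle, so the
        # model simply starts the next iteration with a fresh `added` list.
    return reads


reads = sequence_in_parallel(["ACGT", "TTGCA"], n_cycles=3)
print(reads)
```

All fragments advance together in each cycle, which is the point of the "in parallel" wording: the number of cycles sets the read length, and the number of fragments sets the throughput.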

Illumina Sequencing

Illumina Sequencing https://www.youtube.com/watch?v=HMyCqWhwB8E

Ion Torrent Sequencing DNA capture on beads Single bead in a well Attach one nucleotide (A/T/G/C) at one time Detect pH change Measure the level of change for homopolymer detection
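The idea of flowing one nucleotide at a time and reading the signal level can be sketched as follows. This is a simplified model, not the vendor's base caller: the flow order `"TACG"`, the function names, and the noise-free integer signal are assumptions made for illustration. In the model, the signal for a flow equals the number of identical bases incorporated in a row (the homopolymer length).

```python
def flowgram(read, flow_order="TACG", n_flows=8):
    """Return (base, signal) per flow for the strand being synthesized.

    `flow_order` is a hypothetical cyclic order of nucleotide flows.
    The signal models the size of the pH change: 0 if the flowed base is
    not incorporated, k if a homopolymer of length k is incorporated.
    """
    signals = []
    pos = 0
    for i in range(n_flows):
        base = flow_order[i % len(flow_order)]
        run = 0
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
    return signals


def call_bases(signals):
    """Turn (possibly noisy) signal levels back into bases by rounding."""
    return "".join(base * round(level) for base, level in signals)


signals = flowgram("TTAGGGC")
print(signals)
print(call_bases(signals))
```

Because a homopolymer is called from the magnitude of a single signal rather than from separate events, a noisy measurement such as 2.5 is ambiguous between two and three bases, which is why homopolymer length is the hard part of this chemistry.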

PacBio SMRT sequencing: zero-mode waveguide (ZMW). http://www.pacificbiosciences.com/products/smrt-technology/

Nanopore sequencing https://www.youtube.com/watch?v=3UHw22hBpAk

Comparison

NGS data and processing overview

FASTA format: a format for DNA (or protein) sequences
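As an illustration of the format, here is a minimal FASTA reader. It is a sketch that assumes the common convention of a `>` header line followed by one or more sequence lines; the record names and sequences in the example are invented, and `parse_fasta` is a hypothetical helper rather than a library function.

```python
def parse_fasta(text):
    """Parse FASTA text into {name: sequence}.

    Assumes each record starts with a '>' header line; the name is the
    first whitespace-delimited word of the header, and the sequence may
    be wrapped over several lines.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = (line[1:].split() or [""])[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


example = ">seq1 demo record\nACGT\nTTGA\n>seq2\nGGCC\n"
print(parse_fasta(example))
```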

FASTQ format (NGS raw data): a format for NGS reads (FASTA + quality). Figure labels: sequence, quality, one read.
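A FASTQ file stores one read as four lines: an `@` header, the sequence, a `+` separator line, and a quality string of the same length as the sequence. The reader below is a minimal sketch under that four-line assumption (no line wrapping); `parse_fastq` and the example reads are made up for illustration.

```python
def parse_fastq(text):
    """Yield (name, sequence, quality) for each 4-line FASTQ record."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain 4 lines per read")
    # Walk in fixed chunks of 4 instead of searching for '@', because
    # '@' is also a legal character inside a quality string.
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        yield header[1:], seq, qual


example = "@r1\nACGT\n+\nIIII\n@r2\nGG\n+r2\n#5\n"
for name, seq, qual in parse_fastq(example):
    print(name, seq, qual)
```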

Mapping back to genome: Where is this sequence in the human genome? TAACACCTGGGAAATTCATCACAAAAAGATCTTAGCCTAGGCACATTGTCATTAGGTTATCCAAAGTTAAGACAAAGGAAAGAATCTTAAGAGCTGTGAGA
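The question "where is this sequence?" can be illustrated with a naive exact-match search on both strands. This is only a teaching sketch: the reference string is invented, and the helper names are hypothetical. Practical read aligners rely on indexed data structures and tolerate mismatches and gaps, neither of which is attempted here.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_read(reference, read):
    """Return (0-based position, strand) for every exact occurrence."""
    hits = []
    for strand, query in (("+", read), ("-", reverse_complement(read))):
        start = reference.find(query)
        while start != -1:
            hits.append((start, strand))
            start = reference.find(query, start + 1)
    return hits


# Invented mini "genome" containing the first bases of the read above.
reference = "GGGTAACACCTGGGAAATTTT"
print(find_read(reference, "TAACACCTGGGAAA"))
print(find_read(reference, reverse_complement("TAACACC")))
```

Searching the reverse complement matters because a read can come from either strand of the double helix, while the reference stores only one.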

Quality: Each basecall (a call for a nucleotide: ‘A’, ‘T’, ‘C’, ‘G’) has its own quality. Quality is the machine's confidence in the call. Genome Informatics I (2015 Spring)

Phred scale quality: Q = -10 log10(e), where Q is the quality score and e is the probability of the base call being wrong.
Q = 10, 20, 30, 40… corresponds to e = 10%, 1%, 0.1%, 0.01%...
Each score is stored as the ASCII character with code Q + 33 (see an ASCII code table): +, 5, ?, I…
Example read:
@SRR1798798.1 D4LHBFN1:204:D1B2UACXX:6:1101:1156:1996 length=101
NCTCTCACCGAGCTCCACGAACGATAAGGGAATCAGTCTTAAAAGAGCCGCGAGTTACAGGCACACCTGAGAGAAAGAGATGTTTGTATTCACCTTAGAAC
+SRR1798798.1 D4LHBFN1:204:D1B2UACXX:6:1101:1156:1996 length=101
#1:BDDDDF?FF@B>:ACFIBCGB3BF@C<?F9?DFBFCFEBFEFIFEIFFFDC>@ABBBB?BBBBBBBB?@:?AA@B@?(:4:>?<AB@:B@@B>>ABBB
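The Phred relation and the +33 ASCII offset can be checked with a few lines of Python. The function names are illustrative, not from any library; the offset of 33 is the one stated on the slide.

```python
import math


def phred_to_error(q):
    """Error probability e for a Phred score Q, from Q = -10 log10(e)."""
    return 10 ** (-q / 10)


def error_to_phred(e):
    """Phred score Q for an error probability e."""
    return -10 * math.log10(e)


def decode_quality(qual, offset=33):
    """Convert a FASTQ quality string into a list of Phred scores."""
    return [ord(ch) - offset for ch in qual]


# The four characters listed on the slide, then the first four quality
# characters of the example read.
print(decode_quality("+5?I"))
print(decode_quality("#1:B"))
print([phred_to_error(q) for q in (10, 20, 30, 40)])
```

The leading `#` of the example read decodes to Q = 2, a very low score, which fits the uncalled base `N` at the first position of that read.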

Workflow figure (panel labels recovered from the slide): A. Data Generation: disease and control samples, sequencing, quality control, raw reads (FASTQ files), short read alignment (BAM files). B. Variant Finding: germ-line mutation, somatic mutation, copy number variation (CNV), structural variation (SV), xenogeneic sequence. C. Variant Analysis: recurrence analysis, variant impact prediction, mutation filtration/selection, tumor heterogeneity inference. D. Validation and functional assessment: variant confirmation, pathway analysis, functional study.
Box 1. Sequencing types and platforms. Depending on the sequencing purpose, various platforms can be considered for optimization. Whole genome sequencing (WGS) allows inspection of all genomic regions and is applicable to CNV and SV analysis. Whole exome sequencing (WES) interrogates only coding regions (1~2% of the genome), at lower cost and throughput. WGS and WES are frequently used for novel causative variant discovery, and control sample sequencing is generally mandatory. When only limited regions are to be tested (as in a diagnostic kit), a set of targeted genes is amplified and sequenced (targeted/panel sequencing). In this case, the control is usually omitted when the target sites (hotspots) are clear.
Kim S and Paik S, in preparation

Discussion