Next Now-Generation Genomics: methods and applications for modern disease research Aaron J. Mackey, Ph.D. Center for Public Health.

Presentation transcript:

Next Now-Generation Genomics: methods and applications for modern disease research
Aaron J. Mackey, Ph.D., Center for Public Health Genomics
Wednesday, October 7th, 2009
BIMS 853 Special Topics in Cardiovascular Research

“omic” Disease Research
source: Francis Ouellette, OICR

source: Francis Ouellette, OICR

Basics of the “old” technology
Clone the DNA. Generate a ladder of labeled (colored) molecules that differ by 1 nucleotide. Separate the mixture on some matrix. Detect fluorochrome by laser. Interpret peaks as a string of DNA. Strings are 500 to 1,000 letters long. 1 machine generates 57,000 nucleotides/run. Assemble all strings into a genome.
source: Francis Ouellette, OICR

Basics of the “new” technology
Get DNA. Attach it to something. Extend and amplify signal with some color scheme. Detect fluorochrome by microscopy. Interpret series of spots as short strings of DNA. Strings are letters long. Multiple images are interpreted as 0.4 to 1.2 GB/run (1,200,000,000 letters/day). Map or align strings to one or many genomes.
source: Francis Ouellette, OICR

Differences between platforms:
Nanotechnology used.
Resolution of the image analysis.
Chemistry and enzymology.
Signal-to-noise detection in the software.
Software/images/file size/pipeline.
Cost $$$
source: Francis Ouellette, OICR

Adapted from Richard Wilson, School of Medicine, Washington University, “Sequencing the Cancer Genome” 3 Gb == source: Francis Ouellette, OICR
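For a sense of scale, the throughput figures quoted on the slides above (57,000 nucleotides per run for the "old" technology, up to 1,200,000,000 letters per day for the "new" one, and a 3 Gb genome) can be combined in a back-of-the-envelope calculation. This is only a sketch: it assumes single-fold (1x) coverage and no failed or overlapping reads, which is far more optimistic than any real project; the function name is illustrative.

```python
import math

GENOME_BP = 3_000_000_000        # 3 Gb, as quoted on the slide
OLD_BP_PER_RUN = 57_000          # "old" technology: nucleotides per run
NEW_BP_PER_DAY = 1_200_000_000   # "new" technology: upper figure, letters per day


def runs_needed(genome_bp, bp_per_run, coverage=1):
    """Whole runs required to produce genome_bp * coverage nucleotides."""
    return math.ceil(genome_bp * coverage / bp_per_run)


old_runs = runs_needed(GENOME_BP, OLD_BP_PER_RUN)
new_days = GENOME_BP / NEW_BP_PER_DAY

print(f"old technology: {old_runs} runs for 1x coverage")
print(f"new technology: {new_days} days for 1x coverage")
```

Under these assumptions the "old" technology needs tens of thousands of runs where the "new" one needs a few instrument-days; raising `coverage` scales both numbers linearly.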

NGS technologies
Roche/454 Life Sciences
Illumina (Solexa)
ABI SOLiD
Helicos
Complete Genomics
Pacific Biosciences
Polonator

Roche/454 pyrosequencing

454 flowgram
454 has difficulty quantizing the luminescence of long homopolymers; the problem gets worse with homopolymer length.
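The homopolymer problem can be illustrated with a toy model of a flowgram. The sketch below assumes a cyclic nucleotide flow order of T, A, C, G and a signal that is simply proportional to the number of bases incorporated in each flow; the function names and the 10% signal loss are illustrative choices, not the vendor's actual base-calling software.

```python
FLOW_ORDER = "TACG"  # assumed cyclic flow order for this sketch


def sequence_to_flowgram(seq, flow_order=FLOW_ORDER):
    """Ideal, noise-free flowgram: one homopolymer length per nucleotide flow."""
    if any(base not in flow_order for base in seq):
        raise ValueError("sequence contains a base that is never flowed")
    flows, i, k = [], 0, 0
    while i < len(seq):
        base = flow_order[k % len(flow_order)]
        run = 0
        while i < len(seq) and seq[i] == base:
            run += 1
            i += 1
        flows.append(float(run))  # 0.0 when this flow incorporates nothing
        k += 1
    return flows


def call_bases(flowgram, flow_order=FLOW_ORDER):
    """Call each flow by rounding its signal to the nearest whole number of bases."""
    called = []
    for k, signal in enumerate(flowgram):
        n = max(0, int(signal + 0.5))
        called.append(flow_order[k % len(flow_order)] * n)
    return "".join(called)


true_seq = "TTACGGGGGGA"
ideal = sequence_to_flowgram(true_seq)
dimmed = [0.9 * s for s in ideal]  # every signal reads 10% low

print(call_bases(ideal))   # recovers the original sequence
print(call_bases(dimmed))  # the six-G run is called as five Gs
```

The same 10% relative error leaves the 1- and 2-base flows correctly called but shifts the 6-base flow across a rounding boundary, which is the sense in which the problem grows with homopolymer length.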

Roche/454
first commercially available NGS platform
long reads (most bp; soon 1000bp)
paired-end module available
relatively expensive runs
homopolymer error rate is high
common uses: metagenomics, bacterial genome (re)sequencing
James Watson’s genome done entirely on 454
UVA Biology Dept. has one (Martin Wu)

Illumina (Solexa)
75 bp reads, PE bp fragments
8 lanes per flowcell
~3 Gbp per lane
< 5% error rate
available at UVA BRF DNA Core

ABI SOLiD

SOLiD “color space”

ABI SOLiD
short reads (~35 bp)
cheapest cost/base
high fidelity reads (easy to detect errors)
common uses: SNP discovery, 1000 genome project
with PET libraries, all applications within reach …
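Why color space makes errors "easy to detect" can be shown with a small sketch. It assumes the usual two-base encoding in which each color is the XOR of 2-bit codes for adjacent bases (A=0, C=1, G=2, T=3), and it simplifies real reads by taking the first base as known rather than deriving it from the adapter. Function names are illustrative.

```python
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"


def encode_colors(seq):
    """Return (first_base, colors); each color encodes one pair of adjacent bases."""
    codes = [BASE_TO_CODE[b] for b in seq]
    colors = [a ^ b for a, b in zip(codes, codes[1:])]
    return seq[0], colors


def decode_colors(first_base, colors):
    """Rebuild the base sequence by walking the colors from the known first base."""
    code = BASE_TO_CODE[first_base]
    bases = [first_base]
    for color in colors:
        code ^= color
        bases.append(CODE_TO_BASE[code])
    return "".join(bases)


def count_differences(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


reference = "ATGGCA"
first, ref_colors = encode_colors(reference)

# A real single-base variant (G -> T at position 3) changes two adjacent colors.
_, snp_colors = encode_colors("ATGTCA")
print(count_differences(ref_colors, snp_colors))

# A single miscalled color changes only one color, and if decoded naively
# it corrupts every base downstream of the error.
bad_colors = list(ref_colors)
bad_colors[2] = 1
print(decode_colors(first, bad_colors))
```

A lone color mismatch against the reference is therefore more likely a measurement error than a SNP, since a true SNP should appear as two adjacent color changes.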

Comparing Sequencers

                       Roche (454)      Illumina           SOLiD
Chemistry              Pyrosequencing   Polymerase-based   Ligation-based
Amplification          Emulsion PCR     Bridge Amp         Emulsion PCR
Paired ends/sep        Yes/3 kb         Yes/200 bp         Yes/3 kb
Mb/run                 100 Mb           1300 Mb            3000 Mb
Time/run               7 h              4 days             5 days
Read length            250 bp           32-40 bp           35 bp
Cost per run (total)   $8439            $8950              $17447
Cost per Mb            $84.39           $5.97              $5.81

source: Stefan Bekiranov, UVA
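The "Cost per Mb" row can be recomputed from the "Cost per run" and "Mb/run" rows of the comparison above. The sketch below does only that division, using the figures as given. The 454 value reproduces exactly and the SOLiD value agrees to within rounding, but the Illumina value comes out near $6.88 rather than the quoted $5.97, so the quoted figure may rest on a different yield than the 1300 Mb listed.

```python
# Figures copied from the comparison; dictionary keys are illustrative names.
PLATFORMS = {
    "Roche (454)": {"cost_per_run": 8439, "mb_per_run": 100},
    "Illumina": {"cost_per_run": 8950, "mb_per_run": 1300},
    "SOLiD": {"cost_per_run": 17447, "mb_per_run": 3000},
}


def cost_per_mb(cost_per_run, mb_per_run):
    """Dollars per megabase for a single run."""
    return cost_per_run / mb_per_run


for name, figures in PLATFORMS.items():
    print(f"{name}: ${cost_per_mb(**figures):.2f} per Mb")
```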

Other NGS platforms
Helicos (Stephen Quake, Stanford)
– single molecules on slide
– like Illumina, but no PCR, greater density
Complete Genomics
– sequencing factory
– 10K human genomes/year, $10K each
Pacific Biosciences
– SMRT
– DNA polymerase bound to laser/camera hookup
– records a movie of DNA replication with fluorescent dNTPs as single strand moves through nanopore
Polonator (Shendure and Church)
– homebrew, $200K flowcell+laser machine
– allows custom chemistry protocols

NGS applications
genome (re)sequencing
– de novo genomes: 454 in Bact, small Euks
– SNP discovery and genotyping (barcoded pools)
– targeted, “deep” gene resequencing
– metagenomics
structural/copy-number variation
– Tumor genome SV/CNV: Illumina/PET
epigenomics
– last week’s seminar
RNA-seq: now-generation transcriptomics
ChIP-seq: now-generation DNA-binding

RNA-seq: RNA abundance

RNA-seq: alternative splicing

RNA-seq
“unbiased” digital measure of abundance
– residual PCR artifacts? Helicos says “yes”
larger dynamic range than microarray
– depends on sequencing depth, and therefore cost
ability to see alt./edited transcripts
– multiple AS sites confounded; 454?
Total RNA vs. cDNA
– 3’ end bias of cDNA
– non-polyA transcripts in total RNA
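Because the read count for a transcript depends on sequencing depth, and also on transcript length, raw counts are usually normalized before they are read as abundance. The slides do not name a method; the sketch below uses reads per kilobase per million mapped reads (RPKM), one common choice, purely as an illustration. The gene length and read counts are made-up numbers.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1_000_000_000 / (gene_length_bp * total_mapped_reads)


# Hypothetical gene of 2,000 bp sequenced at two depths.
shallow = rpkm(read_count=500, gene_length_bp=2_000, total_mapped_reads=10_000_000)
deep = rpkm(read_count=1_000, gene_length_bp=2_000, total_mapped_reads=20_000_000)

print(shallow, deep)
```

Doubling the depth doubles the raw count but leaves the normalized value unchanged; what deeper sequencing buys is the ability to see rarer transcripts at all, which is where dynamic range trades against cost.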

ChIP-seq: protein-DNA binding

PET: Paired End Tag libraries

PET applications

some things I didn’t get to talk about much:
personal genome sequencing/medicine
microbial metagenomics
ENCODE/modENCODE projects
HapMap project
human 1000 Genome Project (1KGP)
targeted- and/or deep-resequencing
microRNAs, piRNAs, ncRNAs, …
SVs and CNVs (cancer)
read alignment issues (“mappability”)

Questions?