Next generation sequencing Why? What? How? Marcel Dinger Developmental Biology Divisional Seminar 7 October 2010.

Presentation transcript:

Applications of Next-Generation Sequencing

Next-Generation Sequencing Workflow
Illumina, Roche 454 or ABI SOLiD?

Sample generation and cluster generation
200,000 clusters per tile
62.5 million reads per lane
100 bp reads -> 12.5 Gb per lane
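
The per-lane figures on this slide can be checked with a little arithmetic. The sketch below is not from the talk: the function name `lane_yield_gb` and the `ends` parameter are hypothetical, and it assumes that the 12.5 Gb figure refers to paired-end runs, where each cluster is read from both ends. The slide itself does not say this.

```python
def lane_yield_gb(reads_per_lane, read_length_bp, ends=1):
    """Return the bases sequenced per lane in gigabases (1 Gb = 1e9 bases).

    ends=1 models a single-end run, ends=2 a paired-end run (assumption).
    """
    return reads_per_lane * read_length_bp * ends / 1e9


# Figures taken from the slide: 62.5 million reads per lane, 100 bp reads.
single = lane_yield_gb(62.5e6, 100, ends=1)
paired = lane_yield_gb(62.5e6, 100, ends=2)

print(f"single-end: {single:.2f} Gb per lane")
print(f"paired-end: {paired:.2f} Gb per lane")
```

Under these assumptions a single-end run gives 6.25 Gb per lane, and only the paired-end case reproduces the 12.5 Gb quoted on the slide.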

Cluster generation and preparation for sequencing

Sequencing by Synthesis (SBS)

Base Calling

Terminology
Single end - sequence of the first nt of DNA fragments
Paired end - sequence ~75 nt from each end of the fragment. Fragment length can be adjusted from nt.
Mate pair - sequence of the first nt of DNA fragments

cDNA normalization
One disadvantage of RNA-seq is the “diminishing returns” with increased sequencing depth, i.e. the majority of reads represent common RNAs. To detect rare transcripts, very deep sequencing is necessary.
Ribo-minus, capture arrays and cDNA normalization reduce this problem.
cDNA normalization can now be achieved simply with a duplex-specific nuclease based approach.
Normalized RNA can no longer be used for quantitative expression studies, but is essential for rare transcript discovery and characterization.
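
To make the "diminishing returns" point concrete, here is a minimal sketch of how sequencing depth scales with transcript rarity. Everything in it is illustrative: the function `reads_needed`, the abundances and the 10-read threshold are assumptions rather than figures from the talk, and it uses only the expected read count, ignoring sampling noise and transcript length.

```python
def reads_needed(one_in_n, min_reads):
    """Total reads to sequence so that a transcript making up 1/one_in_n
    of the library is expected to be sampled min_reads times."""
    return one_in_n * min_reads


# Hypothetical abundances, expressed as "one molecule in N".
before_normalization = 10_000_000  # rare transcript in an untreated library
after_normalization = 100_000      # same transcript after common RNAs are depleted

for label, one_in_n in [("untreated", before_normalization),
                        ("normalized", after_normalization)]:
    print(f"{label}: {reads_needed(one_in_n, min_reads=10):,} reads")
```

In this toy case a 100-fold enrichment of the rare transcript cuts the required depth 100-fold. The same step also distorts relative abundances, which is why normalized libraries are unsuitable for quantitative expression work.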

Target enrichment (sequence capture)

Where do I start?
Commercial
Advantages: Quality guarantee; Access to cutting-edge
Disadvantages: Expensive
Collaborate
Advantages: Cheaper; Free support and expertise
Disadvantages: No guarantees; Slower; Share authorship

Commercial providers
Contact: Mark Crowe (AGRF). Platform: Illumina GAIIx, Roche 454 and 2 HiSeq 2000 (coming soon)
Contact: Rob King (GeneWorks). Platform: Illumina GAIIx
Contact: Karolina Janitz (Ramaciotti). Platform: Illumina GAIIx and Roche 454

Collaborative options
Contact: Vikki Marshall (QBI). Platform: Illumina HiSeq 2000 (from December)
Contact: Evgeny Glazov (DI). Platform: Illumina GAIIx, 2-3 HiSeq 2000 (coming soon)
Contact: Peter Wilson (QCMG) / Sean Grimmond. Platform: 11 ABI SOLiD (committed to ICGC till at least 3Q 2011), other technology, e.g. IonTorrent forthcoming?

What will it cost?
At least two stages in next-gen sequencing: library preparation and sequencing
Extra costs for capture arrays, normalization etc.

What will it cost?
Costs can be dramatically lower if library preparation is done in-house and if working with a collaborator such as Diamantina or QBI (<$2,500 per lane).
Total RNA required depends on protocol - count on at least 100 ng (but as low as 1 ng is possible!).
Small RNA needs to be considered independently of long RNA.

Data analysis
Next generation sequencing data is really BIG.
Genomes typically need to be covered 30-fold to get good assembly and be able to detect SNPs.
For RNA-sequencing million tags are necessary for each time point for differential expression studies and for coverage of rare transcripts and isoforms.
As well as a server with a lot of memory and processors, terabytes of space are required for analysis and organization of next-gen sequencing data.
Cost of data analysis will be much greater than the cost of the sequencing (consider expertise in the lab, collaborate with an informatics group or engage with a commercial service, such as QFAB).
Software for next-gen sequence analysis is improving, but still in its infancy. Considerable computational expertise is necessary to get the most out of the data.
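
The 30-fold coverage rule translates directly into a sequencing budget. The sketch below is an illustration only: the helper names are hypothetical, the genome size of roughly 3.1 Gb (about human-sized) is an assumption, and the lane yield reuses the 12.5 Gb per lane quoted earlier in the talk. It counts raw bases and ignores reads lost to filtering or failed mapping.

```python
def bases_for_coverage(genome_size_bp, fold_coverage):
    """Raw bases needed to cover a genome fold_coverage times on average."""
    return genome_size_bp * fold_coverage


def lanes_needed(genome_size_bp, fold_coverage, lane_yield_bp):
    """Whole lanes required, rounding up (integer ceiling division)."""
    total = bases_for_coverage(genome_size_bp, fold_coverage)
    return -(-total // lane_yield_bp)


GENOME_SIZE_BP = 3_100_000_000   # assumed, roughly human-sized genome
LANE_YIELD_BP = 12_500_000_000   # 12.5 Gb per lane, as quoted in the talk

total_bases = bases_for_coverage(GENOME_SIZE_BP, 30)
lanes = lanes_needed(GENOME_SIZE_BP, 30, LANE_YIELD_BP)
print(f"{total_bases / 1e9:.0f} Gb of raw sequence, {lanes} lanes")
```

With these numbers, 30-fold coverage needs 93 Gb of raw sequence, which rounds up to 8 lanes. This gives a sense of why storage and compute have to be planned alongside the sequencing itself.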

Summary
Next generation sequencing (NGS) is transforming molecular biology.
NGS can intersect and contribute to (and even revolutionize!) practically any research program.
NGS is not prohibitively expensive... but does require some bioinformatics expertise to get the most from the data (remember to factor this into your grant applications!).
Experimental planning is critical: before embarking on an NGS experiment, be sure to talk with the people who are going to be analyzing the data.

Acknowledgments
Mark Crowe (AGRF)
Evgeny Glazov (DI)
Karolina Janitz (Ramaciotti)
Rob King (GeneWorks)
Arjuna Kumarasuriyar (Illumina)
Vikki Marshall (QBI)
Peter Wilson (QCMG)
Questions??