Introduction to next-gen sequencing bioinformatics.ca Canadian Bioinformatics Workshops www.bioinformatics.ca.

Module 1 Introduction to next-gen sequencing

Overview
– “Next-gen” or “next-next-gen”: why are we here?
– What kinds of sequencing are we doing?
– How does DNA sequencing work?
– Trying to stay away from vendor-specific challenges, but can we really?
– Where next?

History of DNA Sequencing
[Timeline figure: milestones plotted against sequencing efficiency (bp/person/year). The dates and axis values were lost in extraction, except the year label 1986.]
Milestones, in the order they appear:
– Avery: proposes DNA as “genetic material”
– Watson & Crick: double helix structure of DNA
– Holley: sequences yeast tRNA-Ala
– Miescher: discovers DNA
– Wu: sequences cohesive end DNA
– Sanger: dideoxy chain termination
– Gilbert: chemical degradation
– Messing: M13 cloning
– Hood et al.: partial automation
– Cycle sequencing; improved sequencing enzymes; improved fluorescent detection schemes
– Next generation sequencing: improved enzymes and chemistry, new image processing
Adapted from Eric Green, NIH; adapted from Messing & Llaca, PNAS (1998).

Why are we sequencing?
Before next-generation:
– Reductionist perspective on life
– DNA, RNA, (proteins), (populations), sampling, averages, consensus
Problems: sampling, averages, consensus.
After next-generation:
– We are still reductionist, but better
– Genome sequence and structure
– Less cloning/PCR
– Single molecules (for some)

Basics of the “old” technology
– Clone the DNA.
– Generate a ladder of labeled (colored) molecules that differ by 1 nucleotide.
– Separate the mixture on some matrix.
– Detect the fluorochrome by laser.
– Interpret the peaks as a string of DNA.
– Strings are 500 to 1,000 letters long.
– 1 machine generates 57,000 nucleotides/run.
– Assemble all strings into a “whole”.

Sanger (old-gen) sequencing vs. now-gen sequencing
Whole genome
– Sanger: Human (early drafts), model organisms, bacteria, viruses and mitochondria (chloroplast), low coverage
– Now-gen: New human (!), individual genome, 1,000 normal, 25,000 cancer matched control pairs, rare samples
RNA
– Sanger: cDNA clones, ESTs, full-length insert cDNAs, other RNAs
– Now-gen: RNA-Seq: digitization of transcriptome, alternative splicing events, miRNA
Communities
– Sanger: Environmental sampling, 16S RNA populations, ocean sampling
– Now-gen: Human microbiome, deep environmental sequencing, Bar-Seq
Other: Epigenome, rearrangements, ChIP-Seq

Differences between the various platforms:
– Nanotechnology used.
– Resolution of the image analysis.
– Chemistry and enzymology.
– Signal-to-noise detection in the software.
– Software/images/file size/pipeline.
– Cost $$$

Next Generation DNA Sequencing Technologies
Adapted from Richard Wilson, School of Medicine, Washington University, “Sequencing the Cancer Genome”
Human genome: 6 GB == 6000 MB
[Three-column table flattened in extraction; lost headers and values are shown as “…”.]
Platform: … | … | Illumina
Req’d coverage: … | … | …
bp/read: … | … | …X75
reads/run: 96 | 500,000 | 100,…
bp/run: 57,… | … GB | 15 GB
# runs req’d: 625,… | … | …
runs/day: 2 | 1 | 0.1
Machine days/human genome: 312,500 (856 years) | … | …
Cost/run: $48 | $6,800 | $9,300
Total cost: $15,000,000 | $979,200 | $111,600
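The surviving figures in this comparison can be sanity-checked with a little arithmetic. The sketch below is not part of the original slide: the function names are made up for illustration, the run counts are inferred by dividing total cost by cost per run, and the coverage figure is inferred from the 15 GB/run and 6 GB genome values. Treat the derived numbers as plausible readings of a damaged table, not as the presenter's own values.

```python
# Back-of-envelope checks on figures quoted in the platform comparison.
# Quoted inputs come from the slide; every derived value is an inference.

DAYS_PER_YEAR = 365


def machine_years(machine_days: float) -> float:
    """Convert machine days into years of continuous operation."""
    return machine_days / DAYS_PER_YEAR


def implied_runs(total_cost: float, cost_per_run: float) -> float:
    """Number of runs implied by a total cost and a per-run cost."""
    return total_cost / cost_per_run


def implied_machine_days(runs: float, runs_per_day: float) -> float:
    """Days one machine needs to complete the given number of runs."""
    return runs / runs_per_day


def implied_coverage(runs: float, gb_per_run: float, genome_gb: float) -> float:
    """Fold coverage implied by total output divided by genome size."""
    return runs * gb_per_run / genome_gb


def mean_read_length(bp_per_run: float, reads_per_run: float) -> float:
    """Average read length implied by per-run totals."""
    return bp_per_run / reads_per_run


if __name__ == "__main__":
    # Slide: 312,500 machine days, quoted as 856 years.
    print("capillary years:", round(machine_years(312_500)))

    # Slide: 57,000 nucleotides/run and 96 reads/run for the capillary machine.
    print("capillary bp/read:", mean_read_length(57_000, 96))

    # Second column: $979,200 total at $6,800/run, 1 run/day.
    runs_col2 = implied_runs(979_200, 6_800)
    print("column 2 runs:", runs_col2)
    print("column 2 machine days:", implied_machine_days(runs_col2, 1))

    # Illumina column: $111,600 total at $9,300/run, 0.1 runs/day, 15 GB/run.
    runs_illumina = implied_runs(111_600, 9_300)
    print("Illumina runs:", runs_illumina)
    print("Illumina machine days:", round(implied_machine_days(runs_illumina, 0.1)))
    print("Illumina implied coverage:", implied_coverage(runs_illumina, 15, 6))
```

The implied mean read length for the capillary machine (just under 600 bp) falls inside the 500 to 1,000 letter range quoted for the “old” technology, which suggests the surviving numbers are at least internally consistent.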

Next-gen sequencers
[Figure: platforms plotted by read length (10 bp to 1,000 bp) against bases per machine run (1 Mb to 100 Gb). Some values were lost in extraction and are shown as “…”.]
– AB/SOLiDv3, Illumina/GAII short-read sequencers (10+ Gb in … bp reads, >100M reads, 4-8 days)
– 454 GS FLX pyrosequencer (… Mb in … bp reads, 0.5-1M reads, 5-10 hours)
– ABI capillary sequencer (… Mb in … bp reads, 96 reads, 1-3 hours)
From John McPherson, OICR

2009/10 Promises?
[Figure: same axes as the previous slide, read length (10 bp to 1,000 bp) against bases per machine run (1 Mb to 100 Gb). Some values were lost in extraction and are shown as “…”.]
– ABI capillary sequencer (… Mb, … bp reads)
– 454 GS FLX Titanium: … Gb, … bp reads
– Illumina GAII: 90 Gb, 175 bp reads
– AB SOLiDv3: 120 Gb, 100 bp reads
From John McPherson, OICR

Solexa-based Whole Genome Sequencing Adapted from Richard Wilson, School of Medicine, Washington University, “Sequencing the Cancer Genome”

Illumina (Solexa)

From Debbie Nickerson, Department of Genome Sciences, University of Washington,

AB SOLiD: file management

SOLiD color space

AB SOLiD

Sample AB data
[Only the leading base of each read survived extraction.]
>443_1087_001_F3 T
>443_1087_002_F3 T
>443_1087_003_F3 T
>443_1087_004_F3 T
>443_1088_005_F3 T
>443_1088_006_F3 T
>443_1088_007_F3 T
>443_1088_008_F3 T
>443_1088_009_F3 T
>443_1088_010_F3 T
Lab
– Get sequence assignment from instructor.
– Work with people at your table.
– Use info from lecture notes (Panel E).
– BLAST sequence at NCBI.
– What is it?
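The lab asks for color-space reads to be turned into a base sequence before running BLAST. A minimal sketch of that conversion follows, assuming the standard SOLiD two-base encoding in which each color (0 to 3) describes the transition between two adjacent bases and the read starts with a known primer base. The function names and the example read `T0123` are hypothetical; the color digits of the sample reads above did not survive, so none of them is decoded here.

```python
# SOLiD two-base encoding, assuming the standard color table:
#   0: same base (AA, CC, GG, TT)   1: A<->C, G<->T
#   2: A<->G, C<->T                 3: A<->T, C<->G
# With A=0, C=1, G=2, T=3, the color equals the XOR of the two base codes.

BASES = "ACGT"


def decode_color_space(read: str) -> str:
    """Decode a color-space read (primer base followed by digits 0-3).

    The primer base anchors the decoding and is not returned.
    This sketch does not handle '.' (missing color) calls.
    """
    primer, colors = read[0], read[1:]
    prev = BASES.index(primer)
    decoded = []
    for color in colors:
        prev ^= int(color)  # apply the transition to get the next base
        decoded.append(BASES[prev])
    return "".join(decoded)


def encode_color_space(primer: str, seq: str) -> str:
    """Encode a base sequence as a color-space read starting at primer."""
    prev = BASES.index(primer)
    colors = []
    for base in seq:
        cur = BASES.index(base)
        colors.append(str(prev ^ cur))
        prev = cur
    return primer + "".join(colors)


if __name__ == "__main__":
    print(decode_color_space("T0123"))      # TGAT
    print(encode_color_space("T", "TGAT"))  # T0123
    # One wrong color changes every base from that point on:
    print(decode_color_space("T0023"))      # TTCG
```

The last call shows why color-space reads are usually aligned in color space rather than decoded first: a single miscalled color shifts every downstream base, whereas a true SNP changes two adjacent colors and leaves the rest of the read intact.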

Module 1 lab

Roche / 454: GS FLX
– Also known as “pyrosequencing”
– … million bp/run
– 10 hr run
– … bp/read & > 1 M reads

Roche / 454: GS FLX
– Made for de novo sequencing.
– Too expensive for resequencing.
– For example, this platform will be used a lot by laboratories sequencing new bacterial genomes.
– The Baylor Genome Center, involved in the sea urchin, bee, and platypus genomes, has a number of 454 machines.

It’s more complicated!
– Get files with quality scores.
– Get files with mismatches.
– Need to align them to a reference genome.
– Multiple tools do this today … and there will be more later.
– What do you do? Do it all!
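To make the terms on this slide concrete, here is a toy sketch of what “quality scores” and “mismatches against a reference” mean. It is not one of the alignment tools the slide alludes to. The Phred relation between a score Q and error probability (p = 10^(-Q/10)) is standard; the ASCII offset of 33 is an assumption, since some pipelines of the period used a different offset. The function names, the example reference, and the example read are all hypothetical.

```python
from typing import List, Tuple


def decode_qualities(qual: str, offset: int = 33) -> List[int]:
    """Turn an ASCII-encoded quality string into Phred scores.

    offset=33 is an assumption; check what your pipeline writes.
    """
    return [ord(ch) - offset for ch in qual]


def phred_to_error_prob(q: int) -> float:
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def best_ungapped_hit(read: str, reference: str) -> Tuple[int, int]:
    """Slide the read along the reference without gaps.

    Returns (position, mismatches) for the placement with the fewest
    mismatches. Brute force: fine for a toy, far too slow for a genome.
    """
    best_pos, best_mm = -1, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mm = sum(1 for a, b in zip(read, window) if a != b)
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos, best_mm


if __name__ == "__main__":
    print(decode_qualities("I5!"))  # [40, 20, 0]
    print(phred_to_error_prob(20))
    print(best_ungapped_hit("TGCT", "ACGTTGCAACGT"))  # (4, 1)
```

Real aligners index the reference instead of scanning it, allow gaps, and weigh mismatches by base quality, which is part of why the slide says several tools exist and more are coming.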

Pacific Biosciences (PacBio), July 2008

Things to keep in mind
– Everyone is still learning. If you don’t know, ask; they probably won’t know either, and you can figure it out together!
– The technology is changing. This workshop will be totally different next year!
– We can only do so much in two days. You will need to find things, find people who can help you, and teach your friends!

Other factors
Changing technology
– New and disappearing companies?
Changing price structure
– Cost of machine
– Cost of operation (reagents/people)
– Service from the company
– 1 machine vs. 2 or 3 machines vs. 40 machines
Changing software and processing

OICR Informatics: servers, CPU, storage, and backups
[Architecture diagram; the layout was lost in extraction. Surviving labels, grouped where the grouping is reasonably clear:]
– 14 sequencers
– Cluster: 200 × (8 cores, 16 GB RAM) and 5 × (8 cores, 96 or 256 GB RAM); 1640 cores
– Other server labels: Web, Dev, SVN, MS-Windows; counts shown: 125 ×, 12 ×, 50 ×, 10 ×
– Storage: local (150 GB), seq (9 TB), FC (25 TB), N-series SATA (25 TB), BlueArc SATA (1 PB), SAS (40 TB); 1259 TB
– Backup: storage robot, 800 GB/tape, 12 drives, > 300 tape library

What have we learned?
– Sequencing technologies are changing fast.
– They allow new biology to be performed and new questions to be asked.
– Understand the difference between some of the technologies.
– You can work in “color space”.

What next?

Day 1

URLs