Published by Janel Dickerson. Modified over 8 years ago.
1
Introduction to next-gen sequencing bioinformatics.ca Canadian Bioinformatics Workshops www.bioinformatics.ca
3
Module 1 Introduction to next-gen sequencing
4
Overview
“Next-gen” or “next-next-gen”: why are we here?
What kinds of sequencing are we doing?
How does DNA sequencing work?
We are trying to stay away from vendor-specific challenges, but can we really?
Where next?
5
History of DNA Sequencing (timeline figure; adapted from Eric Green, NIH, and from Messing & Llaca, PNAS (1998))
1870: Miescher discovers DNA
1940: Avery proposes DNA as ‘genetic material’
1953: Watson & Crick: double helix structure of DNA
1965: Holley sequences yeast tRNA Ala
1970: Wu sequences cohesive end DNA
1977: Sanger: dideoxy chain termination; Gilbert: chemical degradation
1980: Messing: M13 cloning
1986: Hood et al.: partial automation
1990-2002: Cycle sequencing; improved sequencing enzymes; improved fluorescent detection schemes
2009: Next generation sequencing; improved enzymes and chemistry; new image processing
Efficiency (bp/person/year) axis values: 1; 15; 150; 1,500; 15,000; 25,000; 50,000; 200,000; 50,000,000; 100,000,000,000
6
Why are we sequencing?
Before next-generation:
– Reductionist perspective on life
– DNA, RNA, (proteins), (populations), sampling, averages, consensus
Problems: sampling, averages, consensus.
After next-generation:
– We are still reductionist, but better
– Genome sequence and structure
– Less cloning/PCR
– Single molecules (for some)
7
Basics of the “old” technology
Clone the DNA.
Generate a ladder of labeled (colored) molecules that differ in length by 1 nucleotide.
Separate the mixture on some matrix.
Detect the fluorochrome by laser.
Interpret the peaks as a string of DNA.
Strings are 500 to 1,000 letters long.
1 machine generates about 57,000 nucleotides/run.
Assemble all strings into a “whole”.
8
Sanger (old-gen) sequencing vs. now-gen sequencing
Whole genome
– Sanger: human (early drafts), model organisms, bacteria, viruses and mitochondria (chloroplast), low coverage
– Now-gen: new human (!), individual genome, 1,000 normal, 25,000 cancer matched control pairs, rare samples
RNA
– Sanger: cDNA clones, ESTs, full-length insert cDNAs, other RNAs
– Now-gen: RNA-Seq: digitization of transcriptome, alternative splicing events, miRNA
Communities
– Sanger: environmental sampling, 16S RNA populations, ocean sampling
– Now-gen: human microbiome, deep environmental sequencing, Bar-Seq
Other
– Now-gen: epigenome, rearrangements, ChIP-Seq
9
Differences between the various platforms:
Nanotechnology used.
Resolution of the image analysis.
Chemistry and enzymology.
Signal-to-noise detection in the software.
Software/images/file size/pipeline.
Cost $$$
10
Next Generation DNA Sequencing Technologies
Adapted from Richard Wilson, School of Medicine, Washington University, “Sequencing the Cancer Genome” http://tinyurl.com/5f3alk
Human genome: 6 Gb == 6,000 Mb

                            3730                  454        Illumina
Req’d coverage              6                     12         30
bp/read                     600                   400        2 x 75
reads/run                   96                    500,000    100,000,000
bp/run                      57,600                0.5 Gb     15 Gb
# runs req’d                625,000               144        12
runs/day                    2                     1          0.1
Machine days/human genome   312,500 (856 years)   144        120
Cost/run                    $48                   $6,800     $9,300
Total cost                  $15,000,000           $979,200   $111,600
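The rows of the table above follow from simple arithmetic: runs required = genome size x coverage / bp per run, machine days = runs x days per run, and total cost = runs x cost per run. The sketch below recomputes them from the slide's own inputs. The names `PLATFORMS` and `plan` are illustrative, not from the slides, and runs/day is stored as its reciprocal (days per run) to keep the arithmetic exact.

```python
GENOME_BP = 6_000_000_000  # 6 Gb human genome, as stated on the slide

# Inputs copied from the table: (required coverage, bp per run,
# days per run = 1 / runs per day, cost per run in USD).
PLATFORMS = {
    "3730": (6, 57_600, 0.5, 48),
    "454": (12, 500_000_000, 1, 6_800),
    "Illumina": (30, 15_000_000_000, 10, 9_300),
}


def plan(coverage, bp_per_run, days_per_run, cost_per_run):
    """Return (runs required, machine days, total cost) for one genome."""
    total_bp = GENOME_BP * coverage
    # Round up: a partial run still has to be performed in full.
    runs = (total_bp + bp_per_run - 1) // bp_per_run
    machine_days = runs * days_per_run
    total_cost = runs * cost_per_run
    return runs, machine_days, total_cost


if __name__ == "__main__":
    for name, params in PLATFORMS.items():
        runs, days, cost = plan(*params)
        print(f"{name}: {runs:,} runs, {days:,.0f} machine days, ${cost:,}")
```

For the 454 and Illumina columns this reproduces the table's run counts, machine days, and totals ($979,200 and $111,600). For the 3730 it reproduces 625,000 runs and 312,500 machine days, but runs x cost/run gives $30,000,000, whereas the slide lists $15,000,000, which equals machine days x cost/run. The slide does not say which was intended.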
11
Next-gen sequencers (plot of bases per machine run, 1 Mb to 100 Gb, against read length, 10 bp to 1,000 bp; from John McPherson, OICR)
ABI capillary sequencer: 0.04-0.08 Mb in 450-800 bp reads, 96 reads, 1-3 hours
454 GS FLX pyrosequencer: 100-500 Mb in 100-400 bp reads, 0.5-1M reads, 5-10 hours
AB/SOLiDv3, Illumina/GAII short-read sequencers: 10+ Gb in 50-100 bp reads, >100M reads, 4-8 days
12
2009/10 Promises? (plot of bases per machine run, 1 Mb to 100 Gb, against read length, 10 bp to 1,000 bp; from John McPherson, OICR)
ABI capillary sequencer: 0.04-0.08 Mb, 450-800 bp reads
454 GS FLX Titanium: 0.4-0.6 Gb, 100-400 bp reads
Illumina GAII: 90 Gb, 175 bp reads
AB SOLiDv3: 120 Gb, 100 bp reads
13
http://tinyurl.com/nk9rkm
14
Solexa-based Whole Genome Sequencing Adapted from Richard Wilson, School of Medicine, Washington University, “Sequencing the Cancer Genome” http://tinyurl.com/5f3alk
15
Illumina (Solexa)
16
Illumina (Solexa)
17
Illumina (Solexa)
18
From Debbie Nickerson, Department of Genome Sciences, University of Washington, http://tinyurl.com/6zbzh4
20
AB SOLiD: file management
21
SOLiD color space
22
SOLiD color space
23
SOLiD color space
24
SOLiD color space
25
SOLiD color space
26
SOLiD color space
27
AB SOLiD
28
SOLiD color space
29
SOLiD color space
31
SOLiD color space
33
SOLiD color space
35
http://solidsoftwaretools.com/gf/project/dh10bfrag/
36
Sample AB data lab
>443_1087_001_F3
T12111121313231331100020021211112211
>443_1087_002_F3
T01121100201303232033213132212320123
>443_1087_003_F3
T21333200110101330330011101121132111
>443_1087_004_F3
T21322103331203331001002121021323111
>443_1088_005_F3
T32311301011311231133321301012223110
>443_1088_006_F3
T13211113031122103020002220012122101
>443_1088_007_F3
T21112301301221022023212000311310313
>443_1088_008_F3
T12133033210200001231010301011012031
>443_1088_009_F3
T23330012121212103111123012012320300
>443_1088_010_F3
T10213330331021322130123311011312110
Get your sequence assignment from the instructor.
Work with the people at your table.
Use the info from the lecture notes (Panel E).
BLAST the sequence at NCBI.
What is it?
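The lab asks for the reads to be translated to bases before BLASTing them. Panel E of the lecture notes is not reproduced in this transcript, so the sketch below assumes the standard SOLiD two-base encoding: with A=0, C=1, G=2, T=3, the color between two adjacent bases is the XOR of their codes, and the leading letter of each read is the last base of the sequencing primer. The function names are illustrative only.

```python
# Assumed standard SOLiD two-base encoding (Panel E itself is not available
# here): color = code(previous base) XOR code(current base), with the codes
# given by each base's position in this string.
BASES = "ACGT"


def decode_colorspace(read):
    """Translate a color-space read such as 'T0123' into bases.

    The leading letter is the last primer base. It seeds the decoding
    and is not included in the returned sequence.
    """
    current = BASES.index(read[0])
    decoded = []
    for color in read[1:]:
        current ^= int(color)  # color 0 keeps the base, 1-3 change it
        decoded.append(BASES[current])
    return "".join(decoded)


def encode_colorspace(primer_base, sequence):
    """Inverse of decode_colorspace: bases back to a color-space read."""
    previous = BASES.index(primer_base)
    colors = []
    for base in sequence:
        current = BASES.index(base)
        colors.append(str(previous ^ current))
        previous = current
    return primer_base + "".join(colors)


if __name__ == "__main__":
    # First read of the sample data (443_1087_001_F3).
    print(decode_colorspace("T12111121313231331100020021211112211"))
```

Decoding read by read like this is enough for the BLAST exercise, but every base depends on all the colors before it, so a single miscalled color changes every base downstream of it.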
37
Module 1 lab
39
Roche / 454: GS FLX
Also known as “pyrosequencing”
500 million bp/run
10 hr run
400-500 bp/read and > 1 M reads
http://www.454.com/products-solutions/system-features.asp
40
Roche / 454: GS FLX
Made for de novo sequencing.
Too expensive for resequencing.
For example, this platform will be used a lot by laboratories sequencing new bacterial genomes.
The Baylor Genome Center was involved in the sea urchin, bee, and platypus genomes: they have a number of 454 machines.
41
Roche / 454: GS FLX
42
Roche / 454: GS FLX
43
Roche / 454: GS FLX
44
It’s more complicated!
Get files with quality scores.
Get files with mismatches.
Need to align them to a reference genome.
Multiple tools do this today … and there will be more later.
What do you do? Do it all!
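The slide mentions quality scores without saying how they are scaled. As a minimal sketch, assuming the widely used Phred convention Q = -10 x log10(p), where p is the estimated probability that a base call is wrong, the conversion in both directions looks like this. The function names are illustrative, and platform-specific file encodings of these scores are not covered.

```python
import math


def phred_to_error_prob(q):
    """Error probability for a Phred-scaled quality score (assumed convention)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred-scaled quality score for an error probability 0 < p <= 1."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    for q in (10, 20, 30):
        print(f"Q{q}: about 1 error in {round(1 / phred_to_error_prob(q))} calls")
```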
45
Pacific Biosciences (PacBio), July 2008
46
Pacific Biosciences (PacBio)
49
Things to keep in mind
Everyone is still learning. If you don’t know, ask; the person you ask may not know either, and you can figure it out together!
The technology is changing: this workshop will be totally different next year!
We can only do so much in two days: you will need to find things, find people who can help you, and teach your friends!
50
Other factors
Changing technology
– New and disappearing companies?
Changing price structure
– Cost of machine
– Cost of operation (reagents/people)
– Service from the company
– 1 machine vs (2 or 3 machines) vs 40 machines
Changing software and processing
51
OICR Informatics: servers, CPU, storage, and backups (infrastructure diagram)
Sequencers: 14
Cluster nodes: 8 core with 16 GB RAM; 8 core with 96 or 256 GB RAM
Other servers: Web, Dev, SVN; MS-Windows
Counts shown in the diagram (their association with components is not recoverable): 200 X, 5 X, 125 X, 12 X, 50 X, 10 X
Storage: local (150 GB); seq (9 TB); FC (25 TB); N-series SATA (25 TB); BlueArc SATA (1 PB); SAS (40 TB)
Backup: storage robot, 800 GB/tape, 12 drives, > 300 tape library
Totals: 1640 cores, 1259 TB
52
What have we learned?
Sequencing technologies are changing fast.
They allow new biology to be performed and new questions to be asked.
You should now understand the differences between some of the technologies.
You can work in “color space”.
53
What next?
54
Day 1
55
URLs
http://454.com/
http://illumina.com/
http://appliedbiosystems.com/
http://pacificbiosciences.com/
http://helicosbio.com