Published by Justina Sutton. Modified over 8 years ago.
cse587A/Bio 5747: L2 1/19/06

DNA sequencing: Basic idea
Background: test tube DNA synthesis
- DNA polymerase (a natural enzyme) extends 2-stranded DNA over a 1-stranded template
- Primer extension by polymerase:
    primer    5’ TTACAGGTCCATACTA 3’
    template  3’ AATGTCCAGGTATGATACATAGG 5’
- Can buy DNA polymerase and do this in a tube.
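
The primer extension on this slide can be sketched in a few lines of Python. This is a toy illustration, not a model of the enzyme: the function name `extend_primer` and the dictionary `COMPLEMENT` are hypothetical, and only the two sequences come from the slide.

```python
# Toy model of primer extension. Names are illustrative, not from the lecture.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def extend_primer(primer_5to3: str, template_3to5: str) -> str:
    """Return the fully extended strand (5'->3') for a primer annealed
    at the 3' end of the template."""
    # The primer must pair base-by-base with the start of the template.
    for p, t in zip(primer_5to3, template_3to5):
        if COMPLEMENT[t] != p:
            raise ValueError("primer does not anneal to template")
    # Polymerase adds the complement of each remaining template base.
    extension = "".join(COMPLEMENT[t] for t in template_3to5[len(primer_5to3):])
    return primer_5to3 + extension

primer = "TTACAGGTCCATACTA"           # 5'->3', from the slide
template = "AATGTCCAGGTATGATACATAGG"  # 3'->5', from the slide
product = extend_primer(primer, template)
print(product)
```

The product starts with the primer itself; the bases after it are the ones the polymerase would add while copying the rest of the template.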

DNA sequencing, cont. (slides 2-5: figures and a Quicktime animation)

Modern Sanger sequencing
Dye terminator sequencing
- Fluorescent label on terminator, not primer
- Different colors for ddA, ddC, ddG, ddT
- Run all 4 reactions in a single lane
- Image under 4 colors of laser
Capillary electrophoresis
- Each sequence is sized through a separate, thin tube (capillary)
- Avoids lane tracking errors
Automated readout: Phred

Limitations of technology
- Error prone, especially at beginning & end
  - But Phred estimates error probability
- Not useful beyond 500-800 bp

Chromatogram (trace) figures: whole trace, start of trace, end of trace

Base calling, assembly, editing
Software tools
- PHRED calls bases from traces, producing reads
  - Estimates error probability for each base (quality values)
- PHRAP assembles reads into a longer sequence
  - Uses quality values
  - Not intended for whole-genome assembly
- Research on assembly algorithms is ongoing
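
The slide does not spell out how a quality value relates to an error probability. The standard Phred convention is Q = -10 log10(p), where p is the estimated probability that the base call is wrong. A minimal sketch of that conversion follows; the function names are illustrative, not part of the PHRED program.

```python
import math

def error_prob_to_quality(p: float) -> float:
    """Phred-scaled quality for an estimated per-base error probability p."""
    return -10.0 * math.log10(p)

def quality_to_error_prob(q: float) -> float:
    """Inverse mapping: error probability implied by quality value q."""
    return 10.0 ** (-q / 10.0)

# Each step of 10 in quality corresponds to a tenfold drop in error probability.
for q in (10, 20, 30, 40):
    print(f"Q{q}: p(error) = {quality_to_error_prob(q):g}")
```

On this scale a base with quality 20 has an estimated 1-in-100 chance of being wrong, which is why an assembler can weight disagreements between reads by their quality values.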

Sequencing Genomes
Michael Brent, Dept. of Computer Science, Washington University

Why sequence a genome?
- Cool technology
- Infrastructure for molecular science
  - E.g. cloning & studying a gene of interest
  - “Parts list for the human body”
- Genome science
  - Evolution and dynamics of genomes
- Medicine
  - Genomic causes of disease and health

Which genomes?

How can I sequence a genome?
Shotgun sequencing: simple version
1. Cut your DNA at random locations
2. Get ~700-800 bp of sequence from the end of each fragment: AAGTCGTGGG….
3. Use overlapping sequences to reassemble
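
Steps 1 and 2 of the simple version can be simulated as a rough sketch. Several assumptions here are mine, not the slide's: many copies of the genome are cut independently (so that reads from different copies overlap), only one end of each fragment is read, reads are error-free, and the genome is a random string. The function name `shotgun_reads` and all parameter values are hypothetical.

```python
import random

def shotgun_reads(genome, copies, cuts_per_copy, read_len, rng):
    """Cut several copies of `genome` at random and read one end of each fragment."""
    reads = []
    for _ in range(copies):
        # Step 1: cut this copy at random, distinct locations.
        cuts = sorted(rng.sample(range(1, len(genome)), cuts_per_copy))
        bounds = [0] + cuts + [len(genome)]
        for start, end in zip(bounds, bounds[1:]):
            # Step 2: sequence at most read_len bases from the fragment's left end.
            reads.append(genome[start:min(end, start + read_len)])
    return reads

rng = random.Random(0)  # fixed seed so the sketch is repeatable
genome = "".join(rng.choice("ACGT") for _ in range(20_000))
reads = shotgun_reads(genome, copies=8, cuts_per_copy=10, read_len=750, rng=rng)
print(len(reads), "reads; longest is", max(len(r) for r in reads), "bp")
```

Step 3, reassembly from overlaps, is deliberately left out of this sketch.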

Step 1: cutting & cloning
A. Cut/break the DNA
  - Physical shear (put it in a blender), or
  - Restriction digest
B. Separate fragments by size & select

1C. Clone selected fragments

2. Sequence random clones
- Pick a clone containing copies of 1 insert from the plate
- Separate the plasmids from the cells
- Sequence the inserts using primers complementary to the vector

3. Assemble fragments: Idea
- Common end sequences may indicate overlap in original sequence
Overlapping shotgun sequences:
    …CTGACTAAGTCAUGTTACAG
                   TTACAGCAGGTATGATA…
Assembled sequence:
    …CTGACTAAGTCAUGTTACAGCAGGTATGATA…
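
The overlap idea can be written as a short suffix-prefix comparison. This is a minimal sketch for two error-free reads, not how PHRAP or any production assembler works; the function names and the `min_len` threshold are assumptions, and the two read strings are copied verbatim from the slide.

```python
def suffix_prefix_overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.
    Returns 0 if no overlap of at least `min_len` bases exists."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

def merge(a: str, b: str, min_len: int = 3):
    """Join `a` and `b` at their best exact overlap, or return None."""
    k = suffix_prefix_overlap(a, b, min_len)
    if k == 0:
        return None
    return a + b[k:]

left = "CTGACTAAGTCAUGTTACAG"   # first read, as printed on the slide
right = "TTACAGCAGGTATGATA"     # second read, as printed on the slide
print(suffix_prefix_overlap(left, right))
print(merge(left, right))
```

With these two reads the shared end is `TTACAG`, and the merged string matches the assembled sequence shown on the slide.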

3. Assemble fragments: Problems
- Sequencing error may obscure true overlap
- Common end sequences can occur by chance
- Repeats: DNA of higher eukaryotes contains many copies of nearly identical sequences
  - This means overlaps are often from different copies of the same repeat element
  - Repeats are the major issue in sequencing
- Polymorphism

Genome assembly: Challenge
- Can’t assemble sequencing reads based on overlapping ends in long repeats
    …CTGACTAAGTCAUGTTACAG
                   TTACAGCAGGTATGATA
- Overlaps may be from different repeat copies, leading to large-scale misassembly
- Polymorphic mismatches may prevent good joins
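
A small constructed example shows how a repeat causes misassembly. Everything here is made up for illustration: the unique segments `U1`, `U2`, `U3`, the repeat `R`, and the helper `overlap_len` are hypothetical. One read ends in the first copy of the repeat and the other begins in the second copy, so their ends overlap perfectly even though they come from different places.

```python
def overlap_len(a: str, b: str) -> int:
    """Longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

# Made-up genome: unique segments U1, U2, U3 separated by two copies of repeat R.
U1, U2, U3 = "ACGTTGCA", "CATGCAAT", "GTCCAGTA"
R = "TTAGGCATCGATCCGA"
genome = U1 + R + U2 + R + U3

read_a = U1 + R  # ends inside the FIRST copy of the repeat
read_b = R + U3  # starts inside the SECOND copy of the repeat

k = overlap_len(read_a, read_b)
merged = read_a + read_b[k:]
print("overlap:", k, "bases")
print("merged sequence occurs in genome:", merged in genome)
```

The overlap spans the whole repeat, yet the merged sequence skips `U2` and does not occur anywhere in the genome. An assembler that trusts end overlaps alone would make this join.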

Single-molecule sequencing
Since ~2007, we can sequence individual molecules without cloning
1. Many molecules are attached to a surface and copied, forming a cluster of identical templates
2. Reversible dye terminators are incorporated according to templates (1 bp)
3. Slide is imaged sequentially under 4 color lasers, showing which dye was incorporated at each cluster

Single-molecule sequencing, cont.
4. Terminator is cleaved off and 2nd-strand synthesis continues for next cycle
- Each cycle is one position in the sequence
- 10^8 50-nt reads / 2-day run (Solexa)
- 10^6 400-nt reads / 5-day run (454)
- For Sanger, ~10^3 700-nt reads / day
- Read-length vs. throughput tradeoff
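
The read-length vs. throughput tradeoff is easier to see as bases per day. The sketch below only does the arithmetic on the approximate figures quoted on the slide; the dictionary layout and the name `nt_per_day` are illustrative.

```python
# Approximate figures quoted on the slide: (number of reads, read length in nt, run length in days).
platforms = {
    "Solexa": (10**8, 50, 2),
    "454": (10**6, 400, 5),
    "Sanger": (10**3, 700, 1),
}

def nt_per_day(reads: int, read_len: int, days: int) -> float:
    """Raw sequence output per day, ignoring quality and overlap between reads."""
    return reads * read_len / days

for name, (reads, read_len, days) in platforms.items():
    print(f"{name:>6}: {read_len:>3} nt reads, {nt_per_day(reads, read_len, days):.1e} nt/day")
```

On these numbers the shortest reads come with the highest raw output per day, and Sanger gives the longest reads at the lowest output.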