Published by Jean Little. Modified over 9 years ago.
1
Genome Characterization
DNA sequence: the ULTIMATE map
DNA sequencing methods
Assembly/sequencing
BIO520 Bioinformatics, Jim Lund
Assigned reading: Service 2006 review paper
Assigned listening: Eric Lander genomics lecture
2
DNA Sequence Project Size/Type
Size          Type
500 bases     1 EST, STS
2500 bases    whole cDNA/EST
10 kbp        Gene, virus
150 kbp       BAC, big virus
3 Mbp         Bacterial genome, YAC-size
              –simple
              –repeats
3 Gbp         Human, mouse
31 Gbp        Salamander
3
Eukaryotic genome sizes (animals and plants)
Nematode (Caenorhabditis elegans): 100 Mb
Thale cress (Arabidopsis thaliana): 160 Mb
Fruit fly (Drosophila melanogaster): 180 Mb
Puffer fish (Takifugu rubripes): 400 Mb
Rice (Oryza sativa): 490 Mb
Human (Homo sapiens): 3.5 Gb
Leopard frog (Rana pipiens): 6.5 Gb
Onion (Allium cepa): 16.4 Gb
Mountain grasshopper (Podisma pedestris): 16.5 Gb
Tiger salamander (Ambystoma tigrinum): 31 Gb
Easter lily (Lilium longiflorum): 34 Gb
Marbled lungfish (Protopterus aethiopicus): 130 Gb
4
DNA Sequencing Methods
Chain termination/Dideoxy/Sanger
–Fluorescence paradigm, ABI
–Main method
Next generation sequencing
–Polymerase addition sequencing
–454 Sequencing, Illumina
Chips
–Affymetrix
5
Dideoxy / Chain Terminator / Sanger
Template
Primer
Extension Chemistry
–polymerase
–termination
–labeling
Separation
Detection
6
Chain Terminator Basics
[Figure: a primer annealed to the target template (TGCA) is extended in the presence of labeled terminators ddA, ddG, ddC and ddT, giving the nested products ddA, AddC, ACddG and ACGddT.]
dN : ddN = 100 : 1
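The 100 : 1 ratio of dN to ddN sets how long the extension products get. A minimal sketch of that arithmetic follows, assuming (as a simplification not stated on the slide) that every incorporation step independently picks a terminator with probability 1/101; real polymerases discriminate between dNTPs and ddNTPs, so the function names and the model are illustrative only.

```python
def termination_probability(position, dn_to_ddn=100.0):
    """Probability that a chain stops exactly at `position` (1-based).

    Assumes each incorporation independently uses a ddNTP with
    probability 1 / (ratio + 1); this is a simplifying assumption.
    """
    p_stop = 1.0 / (dn_to_ddn + 1.0)
    # Survive (position - 1) steps, then terminate on this one.
    return (1.0 - p_stop) ** (position - 1) * p_stop


def mean_fragment_length(dn_to_ddn=100.0):
    """Mean of the geometric distribution above: 1 / p_stop."""
    return dn_to_ddn + 1.0


if __name__ == "__main__":
    for pos in (1, 100, 500, 1000):
        print(pos, termination_probability(pos))
    print("mean length:", mean_fragment_length())
```

Under this model, raising the dN : ddN ratio shifts the products toward longer fragments, and lowering it favors short ones.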
7
Electrophoresis
Sequencing reaction products
Polyacrylamide Gel Electrophoresis (PAGE)
8
DNA sequencing trace file
9
Separation
Gel Electrophoresis
Capillary Electrophoresis
–suited to automation
–rapid (2 hrs vs 12 hrs)
–re-usable
–simple temperature control
–96 well format
10
Paradigm Instrument
Applied Biosystems http://www.appliedbiosystems.com/
–ABI3730XL (2002, 96 samples, 1000 base reads, ~$350,000, higher sensitivity, lower reagent cost, ~$1/reaction)
–700 Kbp / 24 hours
384 capillary sequencers
–5700 sequences / 24 hr day
–2.8 Mbp / 24 hours
11
384-well capillary sequencing
Results are shown as an electropherogram with a peak for each base. From the peak heights and widths, a Phred score is assigned to each base. A high Phred score indicates high certainty about the identity of that base.
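A Phred score is a log-scaled error probability, Q = -10 log10(P), so Q20 means a 1 in 100 chance that the base call is wrong and Q30 means 1 in 1000. The conversion can be sketched as below; the function names are illustrative, and how a base caller derives P from peak shape is not covered here.

```python
import math


def phred_to_error_prob(q):
    """Error probability implied by a Phred quality score."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality score for a given base-call error probability."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    for q in (10, 20, 30, 40):
        print(f"Q{q}: error probability {phred_to_error_prob(q):g}")
```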
12
Sample Output 1 lane
13
1 trace = 1000 bases or less
–ABI: 1000 bp reads
–Illumina: 50-100 bp reads
–454 Sequencing: 300-400 bp reads
How do we cover a genome?
–DIVIDE AND CONQUER: assemble these short sequence fragments.
14
Assembly/Trace Editing
Consed
–UNIX
EBI’s Phusion
EditView (ABI PRISM)
–Mac
Chromas (free/pay versions)
–Windows
15
Sequencing Strategies
Ordered
–Divide and Conquer
Random Sequence
–Brute Force
The random approach now predominates for big projects
16
Random Method (details for Sanger seq)
Shear DNA (nebulize)
–finish ends, ligate into vector
Produce template
Sequence to 8X to 10X coverage
–Sequence both ends of templates.
–Read length (1,000bp typical)
–Accuracy (99% good)
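Coverage ties genome size, read length and read count together: coverage = reads × read length / genome size. The sketch below works that out and adds the standard Poisson (Lander-Waterman style) estimate that a fraction of about e^(-coverage) of bases is hit by no read. The function names and the 3 Mbp example genome are illustrative assumptions, and the estimate ignores repeats and cloning bias.

```python
import math


def reads_needed(genome_size_bp, read_length_bp, coverage):
    """Number of reads required to reach the requested average coverage."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)


def expected_fraction_uncovered(coverage):
    """Poisson estimate of the fraction of bases covered by zero reads."""
    return math.exp(-coverage)


if __name__ == "__main__":
    genome = 3_000_000   # hypothetical bacterial genome, 3 Mbp
    read_len = 1_000     # typical Sanger read length from the slide
    for cov in (8, 10):
        n = reads_needed(genome, read_len, cov)
        gap = expected_fraction_uncovered(cov)
        print(f"{cov}X: {n} reads, ~{gap * genome:.0f} bases uncovered")
```

This is one way to see why random projects aim for 8X to 10X: at lower coverage the expected number of unsequenced bases grows quickly.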
17
Assembly Problem CONTIG
18
Contigs, Islands
[Figure: overlapping reads grouped into contigs, with an isolated island.]
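The picture can be sketched in code: reads that overlap merge into one contig, and a read that overlaps nothing stays on its own. This toy version assumes each read's position on the genome is already known as a half-open (start, end) interval, which a real assembler has to infer from sequence overlaps; the function name and coordinates are hypothetical.

```python
def merge_into_contigs(read_intervals):
    """Merge overlapping (start, end) read intervals into contigs.

    Intervals are half-open; two reads are joined only if they
    share at least one base.
    """
    contigs = []
    for start, end in sorted(read_intervals):
        if contigs and start < contigs[-1][1]:
            # Read overlaps the current contig: extend it if needed.
            contigs[-1][1] = max(contigs[-1][1], end)
        else:
            # No overlap: this read starts a new contig.
            contigs.append([start, end])
    return [tuple(c) for c in contigs]


if __name__ == "__main__":
    reads = [(0, 1000), (600, 1500), (2000, 2900), (2500, 3400), (5000, 5800)]
    for contig in merge_into_contigs(reads):
        print(contig)
```

In this example the last read forms a contig by itself, and the stretches between contigs are the gaps left for later finishing.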
19
Assembling random sequences
Problem regions in an assembly:
–No coverage
–Only 1 strand
–DISAGREEMENT between reads (e.g. T, T, C at one position)
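The three problem cases can be flagged one alignment column at a time. The sketch below uses a simple majority vote, which is an assumption made for illustration; assemblers and editors such as Consed weigh base qualities rather than counting votes. The `(base, strand)` representation and the function name are hypothetical.

```python
from collections import Counter


def call_consensus(column):
    """Call a consensus base for one alignment column.

    `column` is a list of (base, strand) tuples, strand being "+" or "-".
    Returns (consensus_base, list_of_flags).
    """
    if not column:
        return "N", ["no coverage"]
    flags = []
    strands = {strand for _, strand in column}
    if len(strands) == 1:
        flags.append("only 1 strand")
    counts = Counter(base for base, _ in column)
    if len(counts) > 1:
        flags.append("disagreement")
    # Majority vote; quality-aware callers would weight each read instead.
    base = counts.most_common(1)[0][0]
    return base, flags


if __name__ == "__main__":
    print(call_consensus([("T", "+"), ("T", "-"), ("C", "+")]))
    print(call_consensus([("A", "+"), ("A", "+")]))
    print(call_consensus([]))
```

Flagged columns are the ones a person would inspect in a trace editor or target with additional reads.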
20
Assembly programs
Celera Assembler (Eugene Myers et al.)
Arachne (Serafim Batzoglou et al.)
PCAP (Xiaoqiu Huang, Iowa State University)
Phusion (EBI)
21
Continuing rapid improvement in sequencing technology
22
1990s: Human genome (3 Gbp), $300 million (just sequencing)
Current: Mammalian genome (3 Gbp): $1 million
Goal: $100,000 genome, 10X cheaper (and faster), likely 2012!
New goal! $1,000 genome.
UK’s (University of Kentucky) sequencing center has one: http://www.uky.edu/Centers/AGTC/
23
454 Sequencing’s Genome Sequencer FLX
Pyrosequencing (sequencing by detection of nucleotides added during DNA synthesis).
350-400 million bases per run (10 hrs.).
400 bp sequence reads.
1,000,000 reads per run.
$6,600 per run, 60 kb/$1, or $0.0000165/bp (0.00165 cents/bp).
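The per-run figures can be checked with a few lines of arithmetic: 1,000,000 reads of 400 bp is 400 million bases, and $6,600 spread over that many bases comes to roughly 60 kb per dollar, or about 1.65e-5 dollars (0.00165 cents) per base. The function names below are illustrative.

```python
def bases_per_run(reads_per_run, read_length_bp):
    """Total bases produced in one run."""
    return reads_per_run * read_length_bp


def cost_per_base(run_cost_usd, total_bases):
    """Dollars spent per sequenced base."""
    return run_cost_usd / total_bases


def bases_per_dollar(run_cost_usd, total_bases):
    """Bases sequenced per dollar spent."""
    return total_bases / run_cost_usd


if __name__ == "__main__":
    total = bases_per_run(1_000_000, 400)
    print("bases per run:", total)
    print("dollars per base:", cost_per_base(6600, total))
    print("bases per dollar:", round(bases_per_dollar(6600, total)))
```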