1
Genome Assembly
Bonnie Hurwitz, Graduate student, TMPL
2
Genome assembly
4
Shotgun sequencing (WGS)
genomic DNA
sheared clone library (insert sizes of 1-2, 3-4, 30-40, 100 kb)
end sequence clones (f / r)
assemble reads by alignment identity
Example overlapping reads:
…ACGGCTGCGTTACATCGATCAT
ACATCGATCATTTACGATACCATTG…
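The "assemble reads by alignment identity" step can be illustrated with the two read fragments shown on this slide. The sketch below is a minimal illustration, not how a production assembler works: it assumes exact suffix/prefix matches only (real assemblers tolerate sequencing errors), and the function names `longest_overlap` and `merge_reads` and the `min_len` threshold are hypothetical.

```python
def longest_overlap(left: str, right: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `left` that exactly
    matches a prefix of `right`, or 0 if none is at least `min_len` long."""
    max_len = min(len(left), len(right))
    # Try the longest candidate first so the first hit is the best one.
    for k in range(max_len, min_len - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def merge_reads(left: str, right: str, min_len: int = 3) -> str:
    """Join two reads into one contig across their exact overlap."""
    k = longest_overlap(left, right, min_len)
    if k == 0:
        raise ValueError("reads do not overlap; they cannot be merged")
    # Keep all of `left`, then append the part of `right` past the overlap.
    return left + right[k:]


# The two read fragments from the slide (flanking "…" omitted).
read_a = "ACGGCTGCGTTACATCGATCAT"
read_b = "ACATCGATCATTTACGATACCATTG"

overlap = longest_overlap(read_a, read_b)
contig = merge_reads(read_a, read_b)
print(overlap)  # 11 (the shared segment ACATCGATCAT)
print(contig)   # ACGGCTGCGTTACATCGATCATTTACGATACCATTG
```

The two reads share the 11-base segment ACATCGATCAT, so the merged contig is 22 + 25 - 11 = 36 bases long.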
5
Genome scaffolding
[Figure labels: reads 1-8; contigs A-H; break; mate pair linkage; contig; scaffold ABCDFG H E’E’’; “composite” genome]
6
Sanger sequencing costs
[Figure: sequence production (billions of bases/month) and cost (cents per base) by year, 1989-2005. Cost labels on the plot: 0.57 ¢, 0.19 ¢, 0.35 ¢, 0.46 ¢, 0.10 ¢]
2008: ~ $1/read
7
454 Pyrosequencing - the generations
Stats / run: GS20 | FLX | Titanium
Total sequence (Mb): 40 | 100 | 1,000
Read length (bp): 100 | >200 | >400
# reads: 400,000, 1M
Paired ends?: NO, Y, 50%
Cost / bp: 0.03 ¢ | 0.01 ¢ | 0.003 ¢ (Sanger is currently 0.1 ¢)
8
When is a genome “finished”? (by Poisson calculations)
Fold coverage: percent of genome sequenced
0.25x: 22%
0.50x: 39%
0.75x: 53%
1x: 63%
2x: 86%
3x: 95%
4x: 98%
5x: 99.3%
6x: 99.75%
7x: 99.91%
8x: 99.97%
9x: 99.99%
10x: 99.995%
Coverage: the average number of reads representing a given nucleotide in the reconstructed sequence. It can be calculated from the length of the original genome (G), the number of reads (N), and the average read length (L) as NL / G.
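The coverage formula and the Poisson estimate can be sketched in a few lines. This assumes the standard Poisson model in which reads land uniformly at random, so the expected fraction of the genome covered at fold coverage c is 1 - e^(-c). The function names and the read counts in the usage lines are hypothetical, chosen only for illustration.

```python
import math


def fold_coverage(num_reads: int, avg_read_len: float, genome_len: int) -> float:
    """Coverage c = N * L / G, as defined on the slide."""
    return num_reads * avg_read_len / genome_len


def fraction_sequenced(coverage: float) -> float:
    """Expected fraction of the genome covered by at least one read,
    assuming reads are placed uniformly at random (Poisson model)."""
    return 1.0 - math.exp(-coverage)


# Hypothetical example: 1,000 reads of 500 bp from a 50 kb genome.
c = fold_coverage(num_reads=1000, avg_read_len=500, genome_len=50_000)
print(f"coverage = {c:.1f}x")

# Expected percent of the genome sequenced at several coverage levels.
for cov in (0.25, 0.5, 1, 2, 5, 10):
    print(f"{cov}x -> {100 * fraction_sequenced(cov):.3f}%")
```

Under this model, 1x coverage leaves roughly 37% of the genome unsequenced, and 2x coverage gives about 86.5%, which is why assemblies typically aim for much higher fold coverage.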
9
Tablet: Assembly Viewer
[Screenshot labels: Sequence Overlap Consensus Sequence reads Contig info Current location]
10
Our goal today
Assemble a phage genome
– Assemble a phage genome with different levels of coverage
– Compute basic statistics on each genome assembly
– View the assemblies
– Compare the best assembly to the finished genome
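For the "basic statistics" step, one possible sketch is shown below. The slides do not say which statistics are meant; contig count, total length, longest contig, and N50 are assumed here because they are commonly reported for assemblies. The function name `assembly_stats` and the contig lengths in the usage lines are hypothetical.

```python
def assembly_stats(contig_lengths):
    """Summarize an assembly from a list of contig lengths (in bp).

    N50 is the length of the contig at which the running total, taken
    from the longest contig downward, first reaches half the assembly size.
    """
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    n50 = 0
    running = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "num_contigs": len(lengths),
        "total_length": total,
        "longest_contig": lengths[0] if lengths else 0,
        "n50": n50,
    }


# Hypothetical contig lengths from one assembly run.
stats = assembly_stats([400, 100, 300, 200])
print(stats)
```

Comparing these numbers across assemblies built at different coverage levels gives a quick sense of how contiguity improves as coverage increases.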