Sequencing technology and assembly
Sanger sequencing
  Sanger sequencing with radioactivity
  High-throughput Sanger sequencing with fluorescence
Roche/454 sequencing
  Yield: 500,000,000 bp
  Cost: $5,000
  Time: ~1 min per bp
  Read length: 450 bp to >1 kb
Pyrosequencing
Illumina sequencing
  Yield: 8,000,000,000 to 80,000,000,000 bp
  Time: ~1 hour per bp
  Read length: ~150 bp
  Cost:
    Sample extraction: $14.00/sample
    Automated sample library: $90.00/sample
    MiSeq (2x250), 1 lane, 8-10 Gb/lane: $1,700.00/sample
    MiSeq (2x300), 1 lane, 10-12 Gb/lane: $2,100.00/sample
    HiSeq2500 (2x150), 1 lane, ~40 Gb/lane: $2,500.00/lane
    HiSeq2500 (2x250), 1 lane, ~65 Gb/lane: $3,500.00/lane
Illumina sequencing
Ion Torrent
  Yield: 50,000,000 bp
  Time: 2 hours (<1 min per bp)
  Read length: 500 bp
  Cost: $500
Ion Torrent
PacBio
  Long reads (5-10 kb)
  High error rate, but read at 150x coverage
  Library prep: $600
  Sequencing: $300
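Why deep coverage can offset a high per-read error rate: if errors are mostly independent between reads, a per-position vote across many reads recovers the correct base. The sketch below is a toy illustration only. It assumes the reads are already aligned and of equal length, and it ignores insertions and deletions. The function name `majority_consensus` is hypothetical and is not taken from any PacBio tool.

```python
from collections import Counter


def majority_consensus(aligned_reads):
    """Column-wise majority vote over equal-length, pre-aligned reads.

    Assumption: reads are already aligned and contain substitutions only.
    Ties are broken by whichever base appears first in the column.
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be pre-aligned to the same length")

    consensus = []
    for column in zip(*aligned_reads):
        # most_common(1) returns [(base, count)] for the top-voted base
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


if __name__ == "__main__":
    # Five noisy copies of the same region; each error appears in one read only.
    reads = ["ACGTAC", "ACCTAC", "ACGTAC", "AGGTAC", "ACGTTC"]
    print(majority_consensus(reads))
```

With more reads per position, a single erroneous read is less likely to change the vote, which is the intuition behind sequencing to high coverage.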
PacBio
MinION
  Quick sample prep
  Long reads (~50 kb)
  High error rate
  $150 per run
MinION
Base calling
  Need to be sure which base you have identified
  Depends on the technology
  Each machine includes software
  Phred is a historical package developed at U. Washington
  Phred scores encode the probability that the base call is wrong
Quality values
  Phred 10: 1 in 10 (1 x 10^-1) chance that the base is wrong
  Phred 99: the base is correct!
  FASTQ scores are the Phred score + 33, then converted to an ASCII character
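The quality-value arithmetic can be sketched in a few lines. This uses the standard Phred relation Q = -10 log10(P_error) and the +33 ASCII offset mentioned on the slide. The function names are illustrative and do not come from the Phred package itself.

```python
import math

PHRED_OFFSET = 33  # the "+ 33" offset used when writing FASTQ quality characters


def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred score for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


def phred_to_fastq_char(q):
    """Encode an integer Phred score as a single FASTQ quality character."""
    return chr(q + PHRED_OFFSET)


def fastq_char_to_phred(c):
    """Decode a single FASTQ quality character back to a Phred score."""
    return ord(c) - PHRED_OFFSET


def decode_quality_string(quals):
    """Decode a whole FASTQ quality line into a list of Phred scores."""
    return [fastq_char_to_phred(c) for c in quals]


if __name__ == "__main__":
    for q in (10, 20, 30, 40):
        print(q, phred_to_error_prob(q), phred_to_fastq_char(q))
    print(decode_quality_string("!+5I"))
```

Each step of 10 in the Phred score is a tenfold drop in the chance of a wrong base, so Phred 20 is 1 in 100 and Phred 30 is 1 in 1,000.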
Homopolymeric errors
  Homopolymeric runs: signal is not linear
  Not clear if 5 or 6 bases
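To see where homopolymer length errors are most likely, one can scan a sequence for runs of identical bases. The sketch below is a minimal example. The function name `homopolymer_runs` and the default threshold of 3 are illustrative assumptions, not values from the slides.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=3):
    """Return (start, base, length) for every run of identical bases
    that is at least min_len long. Positions are 0-based."""
    runs = []
    pos = 0
    for base, group in groupby(seq):
        length = sum(1 for _ in group)  # number of consecutive copies of base
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


if __name__ == "__main__":
    # The run of six Ts is the kind of region where 5 vs 6 bases is hard to call.
    print(homopolymer_runs("ACGTTTTTTACCCG"))
```

Regions flagged this way are where pyrosequencing or Ion Torrent reads deserve the most caution about the exact base count.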
Errors
  Different technologies have different error rates:
  Pyrosequencing/Ion Torrent: homopolymeric tracts
  Illumina: substitution errors
  PacBio: machines cannot keep up with biology
  MinION: noise coming through the membrane