Published by Lorraine Spencer. Modified over 8 years ago.
1
What should a bioinformatician know about DNA sequencing, and why?
2
Update this table: remove SOLiD; add Life Technologies Ion Proton (PGM) and Illumina MiSeq. Update all with latest info on read length.
3
What are the error types and rates of the different platforms?
4
Quality scores
Phred: www.phrap.com/phred/
Q = -10 log10(e), where e is the probability that the base call is wrong

Quality score   Prob. wrong base call   Accuracy of base call
10              1/10                    90%
20              1/100                   99%
30              1/1000                  99.9%
40              1/10,000                99.99%
50              1/100,000               99.999%
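The Phred relation above can be checked numerically. The following is a minimal sketch, not part of the original slides; the function names `phred_to_error_prob` and `error_prob_to_phred` are illustrative choices.

```python
import math

def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong: e = 10^(-q/10)."""
    return 10 ** (-q / 10)

def error_prob_to_phred(e):
    """Phred score for an error probability e: Q = -10 * log10(e)."""
    return -10 * math.log10(e)

# Reproduce the rows of the quality-score table.
for q in (10, 20, 30, 40, 50):
    e = phred_to_error_prob(q)
    print(f"Q{q}: P(wrong) = {e:g}, accuracy = {(1 - e) * 100:g}%")
```

Each step of 10 in Q corresponds to a tenfold drop in the error probability, which is why the table rows differ by one decimal place.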
5
Wikipedia.org
6
FASTQ format
4 lines, sequence + quality scores

@SEQ_ID (+ optional description)
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+ (optional repeat of line 1, often left as just the + character to save space)
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65

But beware! There are at least 3 different FASTQ file standards, indistinguishable in format but incompatible with each other.
Wikipedia.org
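As an illustration of the four-line layout, here is a minimal reader sketch. It assumes each record occupies exactly four lines (no wrapped sequence or quality lines), which the format does not strictly guarantee; `FastqRecord` and `read_fastq` are hypothetical names, and real work is usually better served by an established parser.

```python
from typing import Iterable, Iterator, NamedTuple

class FastqRecord(NamedTuple):
    header: str    # line 1 without the leading '@' (ID plus optional description)
    sequence: str  # line 2
    quality: str   # line 4, one character per base

def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from an iterable of lines, assuming 4 lines per record."""
    it = (line.rstrip("\r\n") for line in lines)
    for header in it:
        if not header:
            continue  # tolerate blank lines between records
        sequence, plus, quality = next(it, None), next(it, None), next(it, None)
        if quality is None:
            raise ValueError("truncated FASTQ record")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        yield FastqRecord(header[1:], sequence, quality)

# The Wikipedia example record.
example = [
    "@SEQ_ID",
    "GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT",
    "+",
    "!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65",
]
records = list(read_fastq(example))
print(records[0].header, len(records[0].sequence))
```

The length check matters because the quality line is only meaningful when it has exactly one character per base.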
7
FASTQ variants

Name                                 ASCII range, offset   Q score type   Q score range
Sanger standard; fastq-sanger        33-126, 33            PHRED          0 to 93 (raw 0-40)
Solexa/Illumina <1.3; fastq-solexa   59-126, 64            Solexa         -5 to 62 (raw -5-40)
Illumina 1.3+; fastq-illumina        64-126, 64            PHRED          0 to 62 (raw 0-40)
Illumina 1.5+                        64-126, 64            PHRED          3 to 62 (raw 3-40)
Illumina 1.8+                        33-126, 33            PHRED          0 to 93 (raw 0-41)
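To show why the offset matters, the sketch below decodes a quality string under the offsets listed in the table. The helper names and the `OFFSETS` dictionary are illustrative, and `guess_offset` is only a rough heuristic: characters below ASCII 59 can only come from an offset-33 file, but anything else stays ambiguous. The Solexa-to-Phred conversion uses the standard relation Q_phred = 10 log10(10^(Q_solexa/10) + 1).

```python
import math

# ASCII offsets taken from the variants table.
OFFSETS = {"fastq-sanger": 33, "fastq-solexa": 64, "fastq-illumina": 64}

def decode_quality(quality, variant="fastq-sanger"):
    """Convert a quality string to a list of integer scores for the given variant."""
    offset = OFFSETS[variant]
    return [ord(c) - offset for c in quality]

def solexa_to_phred(q_solexa):
    """Convert a Solexa-scaled score to the equivalent Phred score."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)

def guess_offset(quality):
    """Heuristic only: return 33 if any character is below ASCII 59, else None."""
    lowest = min(ord(c) for c in quality)
    return 33 if lowest < 59 else None

# The same characters give very different scores under different offsets.
print(decode_quality("@Th", "fastq-illumina"))
print(decode_quality("@Th", "fastq-sanger"))
```

Decoding with the wrong offset shifts every score by 31, so low-quality Illumina 1.3+ reads would look excellent if read as Sanger.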
8
What use is the quality score?
9
What factors should be considered in the choice of a DNA sequencing platform?