What should a bioinformatician know about DNA sequencing, and why?


1 What should a bioinformatician know about DNA sequencing, and why?

2 Update this table: remove SOLiD; add Life Technologies Ion Proton (PGM) and Illumina MiSeq. Update all entries with the latest information on read length.

3 What are the error types and rates of the different platforms?

4 Quality scores
Phred: www.phrap.com/phred/
Q = -10 log10(e), where e is the probability that the base call is wrong

Quality score   Prob. wrong base call   Accuracy of base call
10              1/10                    90%
20              1/100                   99%
30              1/1,000                 99.9%
40              1/10,000                99.99%
50              1/100,000               99.999%
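The relation between Q and the error probability can be checked numerically. The following is a minimal sketch of the formula above; the function names are illustrative and are not part of Phred or any sequencing toolkit.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Return the probability e of a wrong base call for Phred score q."""
    return 10 ** (-q / 10)


def error_prob_to_phred(e: float) -> float:
    """Return the Phred score Q = -10 * log10(e) for error probability e."""
    if not 0 < e <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(e)


if __name__ == "__main__":
    # Print the rows of the table for Q = 10, 20, ..., 50.
    for q in (10, 20, 30, 40, 50):
        e = phred_to_error_prob(q)
        print(f"Q={q}  e={e:g}  accuracy={100 * (1 - e):g}%")
```

Because the scale is logarithmic, each step of 10 in Q corresponds to a tenfold drop in the error probability.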

5 Wikipedia.org

6 FASTQ format
4 lines per record: sequence + quality scores

@SEQ_ID (+ optional description)
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+ (optional repeat of line 1, often left as just the + character to save space)
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65

But beware! There are at least 3 different FASTQ file standards, indistinguishable in format but incompatible with each other.
Source: Wikipedia.org
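The four-line layout can be read with a few lines of code. This is a minimal sketch that assumes each record occupies exactly four lines (no line-wrapped sequences); `FastqRecord` and `parse_fastq` are hypothetical names chosen for illustration, not the API of an existing library.

```python
from typing import Iterable, Iterator, NamedTuple


class FastqRecord(NamedTuple):
    seq_id: str
    description: str
    sequence: str
    quality: str  # still ASCII-encoded; decoding depends on the FASTQ variant


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from strictly four-line FASTQ text."""
    it = (line.rstrip("\r\n") for line in lines)
    for header in it:
        if not header:
            continue  # tolerate blank lines between records
        try:
            sequence, plus, quality = next(it), next(it), next(it)
        except StopIteration:
            raise ValueError("truncated FASTQ record") from None
        if not header.startswith("@"):
            raise ValueError(f"header does not start with '@': {header!r}")
        if not plus.startswith("+"):
            raise ValueError(f"third line does not start with '+': {plus!r}")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        seq_id, _, description = header[1:].partition(" ")
        yield FastqRecord(seq_id, description, sequence, quality)


if __name__ == "__main__":
    example = ["@SEQ_ID example read\n", "GATTACA\n", "+\n", "IIIIIII\n"]
    for record in parse_fastq(example):
        print(record.seq_id, record.sequence, record.quality)
```

Counting lines rather than searching for `@` matters because `@` is also a valid quality character and can therefore begin a quality line.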

7 FASTQ variants

Name                                  ASCII range, offset   Q score type   Q score range
Sanger standard; fastq-sanger         33-126, 33            PHRED          0 to 93 (raw 0 to 40)
Solexa/Illumina <1.3; fastq-solexa    59-126, 64            Solexa         -5 to 62 (raw -5 to 40)
Illumina 1.3+; fastq-illumina         64-126, 64            PHRED          0 to 62 (raw 0 to 40)
Illumina 1.5+                         64-126, 64            PHRED          3 to 62 (raw 3 to 40)
Illumina 1.8+                         33-126, 33            PHRED          0 to 93 (raw 0 to 41)
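The offsets in the table translate directly into a decoding step: subtract the offset from each character's ASCII code. The sketch below assumes the three variant names from the table; `guess_offset` is a rough heuristic based on the ASCII ranges and can be wrong or undecided for short reads. The Solexa-to-PHRED conversion is the standard one from the published FASTQ format description, not something stated on the slides.

```python
import math
from typing import List, Optional

# ASCII offsets per variant, taken from the table above.
OFFSETS = {"fastq-sanger": 33, "fastq-solexa": 64, "fastq-illumina": 64}


def decode_quality(quality: str, variant: str = "fastq-sanger") -> List[int]:
    """Convert an ASCII quality string to integer scores for the given variant."""
    offset = OFFSETS[variant]
    return [ord(ch) - offset for ch in quality]


def solexa_to_phred(q_solexa: float) -> float:
    """Map a Solexa score to the PHRED scale: 10 * log10(10^(Qs/10) + 1)."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


def guess_offset(quality: str) -> Optional[int]:
    """Heuristically guess the ASCII offset; return None when ambiguous."""
    codes = [ord(ch) for ch in quality]
    if min(codes) < 59:
        return 33  # characters below ';' cannot occur with offset 64
    if max(codes) > 74:
        return 64  # above 'J' (33 + 41), unusual for raw offset-33 reads
    return None  # every character is plausible under either offset


if __name__ == "__main__":
    print(decode_quality("II5!", "fastq-sanger"))
    print(decode_quality("hhTB", "fastq-illumina"))
    print(guess_offset("!''*(((("), guess_offset("hhhh"), guess_offset("CCCC"))
```

Decoding the same string with the wrong offset shifts every score by 31, which is why the variant has to be known (or inferred cautiously) before quality filtering.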

8 What use is the quality score?

9 What factors should be considered in the choice of a DNA sequencing platform?

