EGASP 2005 Evaluation Protocol
Paul Flicek, EBI
Basics
  The evaluations are probably wrong
  GTF is not standard
  There are hidden assumptions
    Filters, overlaps, clusters
  Terminology varies
    Genes, exons, etc.
EGASP 2005 Evaluations
Evaluation Measures
  Exons and introns
  Transcript
  Gene
  Sensitivity (Sn)
  Specificity (Sp)
  Exon length
  Exons per transcript
  Transcript Sn / Sp
  Overlap
  Gene
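As a rough illustration of the Sn and Sp measures at the exon level, the sketch below assumes the convention common in gene-prediction assessments: Sn = TP / (TP + FN) and Sp = TP / (TP + FP), so Sp is the fraction of predictions that are correct. It also assumes an exon counts as correct only when both boundaries match exactly. The function names and the (start, end) tuples are illustrative and are not taken from the EGASP evaluation code.

```python
def sensitivity(true_positives, annotated_total):
    """Fraction of annotated features that were predicted correctly."""
    return true_positives / annotated_total if annotated_total else 0.0


def specificity(true_positives, predicted_total):
    """Fraction of predicted features that are correct (TP / (TP + FP))."""
    return true_positives / predicted_total if predicted_total else 0.0


def exon_sn_sp(annotated_exons, predicted_exons):
    """Exon-level Sn and Sp, assuming an exon is correct only on an exact
    (start, end) match. Duplicate exons are counted once."""
    annotated = set(annotated_exons)
    predicted = set(predicted_exons)
    correct = len(annotated & predicted)
    return (sensitivity(correct, len(annotated)),
            specificity(correct, len(predicted)))


# Hypothetical coordinates, for illustration only.
annotated = [(100, 200), (300, 400), (500, 650), (800, 900)]
predicted = [(100, 200), (300, 400), (520, 650)]

sn, sp = exon_sn_sp(annotated, predicted)
print(f"Exon Sn = {sn:.2f}, Exon Sp = {sp:.2f}")
```

Two of the four annotated exons are recovered, and two of the three predictions are exact, so the two measures differ even though they share the same numerator.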
Definitions
  Positive Transcript
    Correct translation start
    Correct translation stop
    Every splice site correct
  Positive Gene
    At least one positive transcript
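The two definitions can be written as simple predicates. This is a minimal sketch under stated assumptions: the Transcript class, its field names, and the idea of storing splice sites as a sorted tuple of genomic positions are all hypothetical choices made for illustration, not the data model used in the EGASP evaluation.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Transcript:
    translation_start: int         # genomic position of the start codon
    translation_stop: int          # genomic position of the stop codon
    splice_sites: Tuple[int, ...]  # every donor/acceptor position, sorted


def matches(predicted: Transcript, annotated: Transcript) -> bool:
    """True when start, stop and every splice site agree."""
    return (predicted.translation_start == annotated.translation_start
            and predicted.translation_stop == annotated.translation_stop
            and predicted.splice_sites == annotated.splice_sites)


def is_positive_transcript(predicted: Transcript,
                           annotated_transcripts: Iterable[Transcript]) -> bool:
    """A predicted transcript is positive if it matches any annotated one."""
    return any(matches(predicted, a) for a in annotated_transcripts)


def is_positive_gene(predicted_transcripts: Iterable[Transcript],
                     annotated_transcripts: Iterable[Transcript]) -> bool:
    """A predicted gene is positive if at least one transcript is positive."""
    annotated = list(annotated_transcripts)
    return any(is_positive_transcript(t, annotated)
               for t in predicted_transcripts)


# Hypothetical coordinates, for illustration only.
annotated = [Transcript(120, 980, (200, 300, 450, 600))]
exact = Transcript(120, 980, (200, 300, 450, 600))
wrong_splice = Transcript(120, 980, (200, 310, 450, 600))

print(is_positive_transcript(exact, annotated))
print(is_positive_transcript(wrong_splice, annotated))
print(is_positive_gene([wrong_splice, exact], annotated))
```

A single shifted splice site is enough to make a transcript negative, while the gene that contains it still counts as positive as long as one sibling transcript matches.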
Examples (Annotation vs. Prediction)
Example  Trans Sn  Trans Sp  Gene Sn  Gene Sp
1        0.5       1.0       1.0      1.0
2        0.5       1.0       1.0      1.0
3        0.0       0.0       0.0      0.0
4        1.0       1.0       1.0      1.0
5        0.5       0.5       1.0      1.0
6        1.0       0.67      1.0      1.0
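The example values can be reproduced with a short calculation. The annotation and prediction drawings are not recoverable here, so the inputs below are hypothetical configurations chosen only because they are consistent with two of the examples; the function name and the dictionary layout are likewise assumptions. Each string stands in for a full transcript structure (start, stop and splice sites), and Sp is taken as correct predictions divided by all predictions.

```python
def transcript_and_gene_scores(annotation, prediction):
    """annotation / prediction: dict mapping a gene name to the set of its
    transcript structures. Transcripts match only when identical."""
    annotated = set().union(*annotation.values())
    predicted = set().union(*prediction.values())
    shared = annotated & predicted

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    # An annotated gene is found if any of its transcripts was predicted.
    genes_found = sum(1 for ts in annotation.values() if ts & predicted)
    # A predicted gene is positive if any of its transcripts is annotated.
    genes_correct = sum(1 for ts in prediction.values() if ts & annotated)

    return {
        "trans_sn": ratio(len(shared), len(annotated)),
        "trans_sp": ratio(len(shared), len(predicted)),
        "gene_sn": ratio(genes_found, len(annotation)),
        "gene_sp": ratio(genes_correct, len(prediction)),
    }


# Hypothetical case: two annotated transcripts, one predicted exactly.
first = transcript_and_gene_scores(
    {"geneA": {"t1", "t2"}},
    {"pred1": {"t1"}},
)

# Hypothetical case: both annotated transcripts predicted, plus one extra.
last = transcript_and_gene_scores(
    {"geneA": {"t1", "t2"}},
    {"pred1": {"t1", "t2", "t3"}},
)

print(first)
print({name: round(value, 2) for name, value in last.items()})
```

Under these assumed inputs, the first case gives Trans Sn = 0.5 and Trans Sp = 1.0, and the second gives Trans Sn = 1.0 and Trans Sp = 0.67, with gene-level Sn and Sp of 1.0 in both.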
The winners are… (there are clear trends)
  The most successful programs use expressed sequences
  Programs using evolutionary conservation are more successful than those that do not
  Exon and nucleotide measures are similar
  We are improving
Spear Catching Time
EGASP 2005 Evaluations Block 1: Expressed Sequence Methods
Paul Flicek, EBI
Nucleotide
Exon
Intron
Transcript
Gene
Transcript Overlap
Average Exon Length
Exons per transcript
Number of Genes 1027 1389
Unique Exons
Summary
EGASP 2005 Evaluations Block 2: Evolutionary Conservation (Dual/Multiple Genome) Methods
Paul Flicek, EBI
Nucleotide
Exon
Intron
Transcript
Gene
Transcript Overlap
Average Exon Length
Exons per transcript
Number of Genes 1027 1389
Unique Exons
Summary
EGASP 2005 Evaluations Block 3a: Ab initio (single genome) and Exon only Methods
Paul Flicek, EBI
Nucleotide
Exon
Intron
Transcript
Gene
Transcript Overlap
Average Exon Length
Exons per transcript
Number of Genes 1027 1389
Unique Exons
Summary
EGASP 2005 Evaluations Block 3b: Open (Any) Methods
Paul Flicek, EBI
Nucleotide
Exon
Intron
Transcript
Gene
Transcript Overlap
Average Exon Length
Exons per transcript
Number of Genes 1027 1389
Unique Exons
Summary