Introduction to Bioinformatics II
Lecture 12: Expressed Sequence Tags
By Ms. Shumaila Azam
Expressed Sequence Tags (ESTs)
- Unedited, short, single-pass sequences generated from the 5' or 3' end of randomly selected clones from cDNA libraries prepared from the cells, tissues, or organs of interest
- Length: 200-700 bp (average 360 bp)
- Can be generated quickly and at low cost (“poor-man’s genome”)
- EST annotations carry very little biological information
EST…
- An expressed sequence tag (EST) is a short sub-sequence of a cDNA sequence.
- ESTs may be used to identify gene transcripts, and are instrumental in gene discovery and gene sequence determination.
- 74.2 million ESTs are now available in public databases.
- Because cDNA is complementary to mRNA, ESTs represent portions of expressed genes.
- They may be represented in databases either as the cDNA/mRNA sequence or as the reverse complement of the mRNA, the template strand.
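Since the same EST may be stored in either orientation, it helps to be able to convert between the two. The following is a minimal sketch in plain Python, not part of the lecture; the function name `reverse_complement` and the ten-base example fragment are made up for illustration.

```python
# Map each base to its Watson-Crick complement.
# N (an unknown base, common in single-pass reads) maps to itself.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    # Complement every base, then reverse the string.
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical EST fragment written in the cDNA/mRNA orientation.
est_as_cdna = "ATGGCCAAGT"

# The same fragment written as the template strand.
est_as_template = reverse_complement(est_as_cdna)
print(est_as_template)  # ACTTGGCCAT
```

Applying the function twice returns the original sequence, which gives a quick sanity check when comparing an EST against entries stored in the opposite orientation.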
EST data repositories
dbEST release 061507 (June 2007)
www.ncbi.nlm.nih.gov/dbEST/
43,396,096 ESTs from 659 different organisms

Organism                              Number of ESTs
Homo sapiens (human)                  8,119,106
Mus musculus (mouse)                  4,850,243
Danio rerio (zebrafish)               1,350,105
Bos taurus (cattle)                   1,318,208
Arabidopsis thaliana (thale cress)    1,276,692
Xenopus tropicalis                    1,271,375
Oryza sativa (rice)                   1,211,418
Zea mays (maize)                      1,161,241
Triticum aestivum (wheat)             1,050,267
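To put the dbEST counts listed above in proportion, the short sketch below computes each organism's share of the release total. It uses only the figures given on this slide; the variable names are illustrative.

```python
# Figures copied from the dbEST release 061507 summary above.
TOTAL_ESTS = 43_396_096

est_counts = {
    "Homo sapiens": 8_119_106,
    "Mus musculus": 4_850_243,
    "Danio rerio": 1_350_105,
    "Bos taurus": 1_318_208,
    "Arabidopsis thaliana": 1_276_692,
    "Xenopus tropicalis": 1_271_375,
    "Oryza sativa": 1_211_418,
    "Zea mays": 1_161_241,
    "Triticum aestivum": 1_050_267,
}

# Share of the whole release contributed by each listed organism.
shares = {name: 100 * n / TOTAL_ESTS for name, n in est_counts.items()}
for name, pct in shares.items():
    print(f"{name:<22} {pct:5.1f}%")

# Combined share of the nine listed organisms out of all 659.
listed_total = sum(est_counts.values())
listed_share = 100 * listed_total / TOTAL_ESTS
print(f"Listed organisms: {listed_total:,} ESTs ({listed_share:.1f}% of release)")
```

On these figures, human ESTs alone make up roughly 19% of the release, and the nine listed organisms together account for about half of it.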
EST Applications
- Gene discovery
- Gene structure prediction
- Expression maps
- Alternative splicing
- Identification and characterization of SNPs
- Gene expression studies
  - tissue or disease specific
  - developmental stage
- Proteomics (for example, peptide mass fingerprinting)
- Identification of drug and vaccine candidates