1 Computational Approaches (1/7)
Computational methods can be divided into four categories, according to what the prediction is based on:
(i) The overall protein amino acid composition
(ii) Known targeting sequences
(iii) Sequence homology and/or motifs
(iv) A combination of several sources of information (hybrid methods)
2 Computational Approaches (2/7)
(i) The overall protein amino acid composition (1/2)
- Nakashima and Nishikawa
  - Method for discriminating between intracellular and extracellular proteins
  - Uses the distance between the overall amino acid composition vectors
- Cedano et al. (ProtLock)
  - Predicts five classes of subcellular localization: extracellular, intracellular, integral membrane, anchored membrane, and nuclear
- Reinhardt and Hubbard (NNPSL)
  - An approach using artificial neural networks (ANNs)
  - Predicts four eukaryotic and three prokaryotic subcellular localizations
- Chou et al.
  - SVM-based method for predicting twelve different subcellular localizations, taking sequence-order effects into account
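The composition-vector idea on this slide can be sketched in a few lines. This is a minimal illustration, not the published method: the Euclidean distance, the nearest-centroid decision, and all function names are assumptions made here, since the slide does not say which distance measure or reference data Nakashima and Nishikawa used.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(seq):
    """Return the 20-dimensional amino acid composition vector of a sequence."""
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]

def centroid(vectors):
    """Average several composition vectors into one class centroid."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def euclidean(u, v):
    """Euclidean distance (an assumed choice of distance measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_class(seq, centroids):
    """Assign the class whose centroid is closest to the query composition."""
    vec = composition(seq)
    return min(centroids, key=lambda label: euclidean(vec, centroids[label]))

if __name__ == "__main__":
    # Toy training sequences, invented purely for illustration.
    centroids = {
        "intracellular": centroid([composition("MKKLEEDDKRAE"), composition("MEEKLKDDRAEK")]),
        "extracellular": centroid([composition("MCSSTNCGSWTC"), composition("MSCTNGSCWSTC")]),
    }
    print(nearest_class("MKELDDKKRAEE", centroids))
```

In practice the centroids would be computed from large sets of proteins with experimentally known localization.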
3 Computational Approaches (3/7)
(i) The overall protein amino acid composition (2/2)
- Huang et al.
  - Uses a fuzzy k-NN algorithm
  - Describes each protein by the dipeptide composition of its whole sequence; covers eleven different localizations
- Yu, C.S. (CELLO method)
  - Predicts five subcellular localizations in Gram-negative bacteria
  - Based on the composition of peptides of varying lengths
- Andrade et al.
  - First to incorporate structural information into the amino acid composition vectors
  - Used the composition of eukaryotic proteins with known structure
  - Rationale: the interiors of proteins have stayed fairly constant during evolution, whereas surface residues adapt to the subcellular environment
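To make the dipeptide-composition representation concrete, the sketch below builds the 400-dimensional vector and classifies a query with a plain majority-vote k-NN. Note the substitution: this is ordinary k-NN, not the fuzzy k-NN of Huang et al., and the function names and toy data are assumptions made for illustration.

```python
from collections import Counter
from itertools import product
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered residue pairs, in a fixed order.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]

def dipeptide_composition(seq):
    """Return the 400-dimensional dipeptide frequency vector of a sequence."""
    seq = seq.upper()
    pairs = (seq[i:i + 2] for i in range(len(seq) - 1))
    counts = Counter(
        p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard dipeptides")
    return [counts[d] / total for d in DIPEPTIDES]

def knn_predict(query, training, k=3):
    """Plain majority-vote k-NN over (sequence, label) training pairs."""
    qv = dipeptide_composition(query)
    scored = sorted(
        (math.dist(qv, dipeptide_composition(seq)), label)
        for seq, label in training
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]
```

A fuzzy k-NN would replace the hard vote with class memberships weighted by distance to each neighbour; the feature vector stays the same.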
4 Computational Approaches (4/7)
(ii) Known targeting sequences
- Gunnar von Heijne (TargetP)
  - The most comprehensive method based on N-terminal targeting sequences
  - Predicts chloroplast, mitochondrial, secretory pathway, and other proteins
- Claros M.G. (MitoProt and Predotar)
  - Specifically discriminate chloroplast from mitochondrial proteins
- Bannai H. et al. (iPSORT)
  - Uses knowledge-based rules for prediction based on protein sequence features
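Methods in this category look at the N-terminal region of the sequence rather than the whole protein. The sketch below only extracts a few simple features from that region; it does not reproduce TargetP, MitoProt, Predotar, or iPSORT. The window size of 30 residues, the residue groupings, and the function name are all assumptions chosen for illustration.

```python
def nterminal_features(seq, window=30):
    """Summarise the N-terminal region of a protein sequence.

    The window length and residue groups are illustrative assumptions,
    not parameters taken from any published predictor.
    """
    region = seq.upper()[:window]
    if not region:
        raise ValueError("empty sequence")
    n = len(region)
    return {
        "length": n,
        # Fraction of positively charged residues (Lys, Arg).
        "positive": sum(region.count(a) for a in "KR") / n,
        # Fraction of acidic residues (Asp, Glu).
        "negative": sum(region.count(a) for a in "DE") / n,
        # Fraction of hydrophobic residues (assumed grouping).
        "hydrophobic": sum(region.count(a) for a in "AILMFVW") / n,
    }
```

Features like these could feed either a trained classifier or hand-written rules; the thresholds that separate targeting classes would have to come from annotated data.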
5 Computational Approaches (5/7)
(iii) Sequence homology and/or motifs
- Marcotte et al.
  - Assigns the subcellular localization by constructing phylogenetic profiles of the proteins
- Cokol M. et al. (PredictNLS)
  - Specialized in recognizing nuclear proteins
  - Based on a collection of nuclear localization sequences
- Lu et al. (Proteome Analyst)
  - Based on SWISS-PROT keywords and the annotation of homologous proteins
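Motif-based prediction amounts to scanning a sequence against a collection of known patterns. The sketch below shows the mechanics with a single pattern, the classic monopartite consensus K-(K/R)-X-(K/R), written as a regular expression. It is not the PredictNLS motif collection, and the dictionary and function names are assumptions made here.

```python
import re

# One illustrative pattern; a real collection would hold many curated motifs.
NLS_PATTERNS = {
    "monopartite_consensus": re.compile(r"K[KR].[KR]"),
}

def find_nls_motifs(seq):
    """Return (pattern name, start index, matched text) for each hit.

    Matches are non-overlapping, as returned by re.finditer.
    """
    seq = seq.upper()
    hits = []
    for name, pattern in NLS_PATTERNS.items():
        for match in pattern.finditer(seq):
            hits.append((name, match.start(), match.group()))
    return hits

def looks_nuclear(seq):
    """Flag a sequence as a nuclear candidate if any motif matches."""
    return bool(find_nls_motifs(seq))
```

For example, the SV40 large T antigen signal PKKKRKV contains a match to this pattern. A single short pattern like this also matches many non-nuclear proteins, which is why a curated motif collection is needed for useful specificity.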
6 Computational Approaches (6/7)
(iv) Hybrid methods
- Nakai K. and Kanehisa (PSORT)
  - One of the first methods developed for predicting subcellular localization
  - Uses the overall amino acid composition, N-terminal targeting sequence information, and motifs
  - Uses a set of knowledge-based "if-then" rules
  - Predicts 14 animal and 17 plant subcellular localizations
  - PSORT II and PSORT-B are extensions of PSORT
- Drawid and G
  - Method that incorporates information about sequence motifs, overall sequence properties, and mRNA expression levels
  - Based on a Bayesian prediction model; tested on the yeast genome
- Guda C. et al. (MITOPRED)
  - Specialized for predicting mitochondrial proteins
  - Based on amino acid composition
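The Bayesian way of combining several sources of evidence can be sketched as repeated application of Bayes' rule: start from a prior over localizations and multiply in the likelihood of each observed feature. This is a generic naive-Bayes style illustration, not the published model; the function names, feature names, and all probabilities are invented for the example.

```python
def bayes_update(prior, likelihood):
    """One Bayes step: posterior is proportional to prior times likelihood."""
    unnormalized = {loc: prior[loc] * likelihood[loc] for loc in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observed feature is impossible under every class")
    return {loc: p / total for loc, p in unnormalized.items()}

def predict_localization(prior, feature_likelihoods, observed):
    """Fold each observed feature into the distribution, one at a time.

    Treating features as conditionally independent is an assumption
    of this sketch.
    """
    posterior = dict(prior)
    for feature in observed:
        posterior = bayes_update(posterior, feature_likelihoods[feature])
    return posterior

if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    prior = {"nucleus": 0.5, "cytoplasm": 0.5}
    feature_likelihoods = {
        "has_nls_motif": {"nucleus": 0.8, "cytoplasm": 0.2},
        "high_expression": {"nucleus": 0.3, "cytoplasm": 0.6},
    }
    print(predict_localization(prior, feature_likelihoods, ["has_nls_motif"]))
```

Each evidence source (motifs, overall sequence properties, expression level) enters as one more likelihood table, which is what makes this kind of model convenient for hybrid methods.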