Published by Warren Pitts. Modified over 9 years ago.
1
Copyright 2003 limsoon wong
Recognition of Protein Features
Limsoon Wong
Institute for Infocomm Research
BI6103 guest lecture on ?? March 2004
2
Lecture Plan
Membrane proteins
Subcellular localization
3
Recognition of Transmembrane Helices
4
Eukaryotic Cells
Eukaryotic cells have membrane-bound compartments with specialized functions
5
Lipids & Membrane
A membrane is a double layer of lipids and associated proteins that defines a subcellular compartment or encloses the cell
Lipids consist of a “polar head group” and long-chain fatty acids
This dual nature promotes the formation of lipid bilayers
“Hydrophobic tails” are shielded from the aqueous environment
Water-soluble (i.e., charged or polar) molecules cannot pass through this impermeable barrier
Permeability across the bilayer is regulated by membrane proteins that span the bilayer and function like channels or pores
6
Membrane Proteins
Two types of membrane proteins: integral vs peripheral
Two types of integral membrane proteins: all-α vs β-barrel
(Figure panels: all-α, β-barrel)
7
Topography & Topology
Topography: predict the location of transmembrane segments
Topology: predict the location of the N- and C-termini with respect to the lipid bilayer
We focus on topography prediction for all-α membrane proteins
(Figure label: lipid molecules)
8
Datasets
Jayasinghe et al., Protein Sci., 10:455--458, 2001
–59 high resolution membrane proteins
–www.biocomp.unibo.it/gigi/ENSEMBLE
Moller et al., Bioinformatics, 16:1159--1160, 2000
–151 low resolution membrane proteins
Jones et al., Biochem., 33(10):3038--3049, 1994
–38 multi-spanning and 45 single-spanning membrane proteins
–topologies experimentally determined
Sonnhammer et al., ISMB, 6:175--182, 1998
–108 multi-spanning and 52 single-spanning membrane proteins
–mostly experimentally determined topologies, but less reliably determined than in Jones et al.
9
Monne et al., JMB, 288:141--145, 1999: Turn Propensity Scale for TM Helices
E. coli Lep protein contains two TM domains (H1, H2) and a C-terminal domain P2
Translocation of P2 to the lumenal side is easy to test by glycosylation
Replace H2 by a 40-residue poly-L segment: LIK4 L21 X L7 V L10 Q3 P (numbers are repeat counts)
The poly-L segment can form either one long TM helix or 2 closely spaced TM helices, depending on what is substituted for X
(Figure label: ER)
10
Monne et al., JMB, 288:141--145, 1999: Turn Propensity Scale for TM Helices
Using the poly-L segment, measure the “turn” propensity of the 20 amino acids by substituting each of them for the X in the poly-L segment
Hydrophobic residues (I, V, L, F, C, M, A) do not induce a turn
Charged and polar residues (except S & T) induce a turn
Exercise:
–What are the charged/polar residues?
–What could be the reason that S & T do not induce a turn?
(Figure labels: glycosylated, non-glycosylated)
11
Monne et al., JMB, 288:141--145, 1999: Turn Propensity Scale for TM Helices
In all-α membrane proteins,
–hydrophobic residues prefer the membrane environment and have low turn propensity
–charged & polar residues induce turn formation to avoid the membrane interior
This supports
–prediction of TM helices
–distinction of 1 long TM helix vs 2 closely spaced TM helices
12
Weiss et al., ISMB, 1:420--421, 1993: Hydrophobicity Approach
The inside of the cellular membrane is hydrophobic
A protein segment that spans the membrane is expected to contain many hydrophobic amino acids
Locate segments that have a high average “hydrophobicity” score
13
Weiss et al., ISMB, 1:420--421, 1993: Hydrophobicity Approach
Find a segment of 10 to 70aa with hp > 0.71
Expand to a longer segment with hp > 0.35
Mark this segment as TM
Repeat the above, starting from the position after the previous segment
The thresholds are adjustable
Caveats:
–may be unable to distinguish the hydrophobic core of non-membrane proteins from transmembrane regions
–what are the right thresholds?
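The seed-and-expand procedure above can be sketched roughly as follows. This is only an illustration under stated assumptions: the slide does not name its hydrophobicity scale, so the widely used Kyte-Doolittle hydropathy values are swapped in, and the thresholds `seed_thr` and `grow_thr` are hypothetical values chosen for that scale rather than the 0.71 and 0.35 of the slide. The sketch also expands only toward the C-terminus.

```python
# Kyte-Doolittle hydropathy values (an assumed stand-in; the slide's scale is not given).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hp(seq, start, end):
    """Average hydropathy of seq[start:end]."""
    return sum(KD[a] for a in seq[start:end]) / (end - start)

def find_tm_segments(seq, seed_len=10, seed_thr=1.6, grow_thr=1.0, max_len=70):
    """Find a seed window above seed_thr, then extend it to the right
    while the average of the whole segment stays above grow_thr."""
    seq = seq.upper()
    segments = []
    i = 0
    while i + seed_len <= len(seq):
        if mean_hp(seq, i, i + seed_len) > seed_thr:
            end = i + seed_len
            while (end < len(seq) and end - i < max_len
                   and mean_hp(seq, i, end + 1) > grow_thr):
                end += 1
            segments.append((i, end))  # half-open interval [i, end)
            i = end                    # resume after the previous segment
        else:
            i += 1
    return segments
```

Running it on a toy sequence such as ten D, twenty L, ten K yields a single segment that covers the leucine run but also spills into the flanking residues, which illustrates why the choice of thresholds matters.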
14
An Example: Bacteriorhodopsin
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=protein&list_uids=461610&dopt=GenPept&term=bacteriorhodopsin&qty=1
  1 gigtllmlig tfyfiargwg vtdkkareyy aitilvpgia saaylsmffg iglttvevag
 61 maepleiyya ryadwlfttp lllldlalla nadrttigtl igvdalmivt gligalshtp
121 larytwwlfs tiaflfvlyy lltvlrsaaa elsedvqttf ntltalvavl wtaypilwii
181 gtegagvvgl gvetlafmvl dvta
7 transmembrane helices
15
An Example: Bacteriorhodopsin
After applying hydrophobicity scale...
(Same sequence as on slide 14.)
16
An Example: Bacteriorhodopsin
Compute hydrophobicity score, hp > 7
(Same sequence as on slide 14.)
TM identified: 6/7, TM FP: 0
TM residue identified: 62/117, TM residue FP: 4
17
An Example: Bacteriorhodopsin
Expand segment, maintain hp > 5, avoid low hydrophobicity
(Same sequence as on slide 14.)
TM identified: 6/7, TM FP: 0
TM residue identified: 100/117, TM residue FP: 15
18
Sonnhammer et al., ISMB, 6:175--182, 1998: TMHMM, An HMM Approach
There are 3 main locations of a residue:
–TM helix core (viz., in the hydrophobic tail region of the membrane)
–TM helix cap (viz., in the head region of the membrane), on the cytoplasmic vs non-cytoplasmic side of the helix core
–loops: cytoplasmic vs non-cytoplasmic (short) vs non-cytoplasmic (long)
So an HMM with 7 states is needed
Exercise: What is the 7th state for?
(Figure labels: cyto, non-cyto)
19
Sonnhammer et al., ISMB, 6:175--182, 1998: TMHMM, Architecture
Each state has an associated probability distribution over the 20 amino acids, characterizing the variability of amino acids in the region it models
(Figure labels: cyto, non-cyto)
20
Sonnhammer et al., ISMB, 6:175--182, 1998: TMHMM, Architecture
The first 3 and last 2 core states have to be traversed, but all other core states can be bypassed
This models core regions of 5--25 residues
21
Sonnhammer et al., ISMB, 6:175--182, 1998: TMHMM, Architecture
The states of the globular, loop, & cap regions
The caps are 5 residues each. Since the core is 5--25 residues, this allows for helices 15--35 residues long
(Figure annotation: to model bias in amino acid usage near the cap)
(Figure annotation: to model neutral amino acid distribution)
22
Sonnhammer et al., ISMB, 6:175--182, 1998: TMHMM, Training the HMM
Stage 1: Baum-Welch is used for maximum likelihood estimation from “diluted” labeled training data. As the precise end of a TM helix is only approximately known, we “dilute” by unlabeling 3 residues on each side of a helix boundary
Stage 2: Baum-Welch is used for maximum likelihood estimation from “relabeled” training data. The original training data are diluted by unlabeling 5 residues on each side of a helix boundary. The model from Stage 1 is used to produce the “relabeled” training data by relabeling this part under the constraints of the remaining labels
Stage 3: The model from Stage 2 is further tuned by a method for “discriminative” training, to maximize the probability of correct prediction (Krogh, ISMB, 5:179--186, 1997)
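The “dilution” step of Stage 1 can be illustrated with a small sketch. The label alphabet used here (`i` for inside, `M` for membrane helix, `o` for outside, `?` for unlabeled) is an assumption made for the example and is not taken from the slide.

```python
def dilute_labels(labels, k=3, helix="M", unknown="?"):
    """Unlabel k residues on each side of every helix boundary.

    labels: one character per residue (assumed alphabet: 'i', 'M', 'o').
    """
    out = list(labels)
    for pos in range(1, len(labels)):
        prev_in = labels[pos - 1] == helix
        cur_in = labels[pos] == helix
        if prev_in != cur_in:  # a helix boundary lies between pos-1 and pos
            for j in range(max(0, pos - k), min(len(labels), pos + k)):
                out[j] = unknown
    return "".join(out)

print(dilute_labels("iiiiiMMMMMMMMMMooooo"))  # ii??????MMMM??????oo
```

Calling the same function with `k=5` would correspond to the wider dilution described for Stage 2.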
23
Krogh, ISMB, 5:179--186, 1997: Discriminative HMM Training
24
Sonnhammer et al., ISMB, 6:175--182, 1998: TMHMM, Example
(Figure legend: non-cytoplasmic, cytoplasmic, TM segment)
Datasets
–Jones et al., Biochem., 33(10):3038--3049, 1994
–Sonnhammer et al., ISMB, 6:175--182, 1998
25
Sonnhammer et al., ISMB, 6:175--182, 1998: TMHMM, Accuracy (10-fold CV)
(Table for the Jones et al. and Sonnhammer et al. datasets; values not preserved. Measures: all TM segments & their orientation correctly predicted; all TM segments correctly predicted, ignoring orientation; precision)
26
ENSEMBLE
Martelli et al., Bioinformatics, 19:i205--i211, 2003
(Figure: NN, HMM1, and HMM2 feed into ENSEMBLE)
27
ENSEMBLE: The Neural Network Part
The NN part is a cascade of feed-forward back-propagation neural networks, a la Rost et al., Protein Science, 1995
(Figure labels: 17 * 20 input units; 15 hidden units; input layer with 17*2 inputs; HMM, LOOP)
28
ENSEMBLE: The HMM1 Part
HMM1 models the hydrophobic nature of most TM helices, a la Krogh et al., JMB, 2001 & Sonnhammer et al., ISMB, 1998
29
ENSEMBLE: The HMM2 Part
HMM2 models TM helices that are a mix of hydrophobic and hydrophilic residues, a la Martelli et al., Bioinformatics, 2002
30
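ENSEMBLE: Predicting if a residue is in TM
$\mathrm{NN}(p,i) = \mathrm{NN}(H,p,i) - \mathrm{NN}(L,p,i)$
$\mathrm{HMM}_1(p,i) = \mathrm{AP}_1(H,p,i) - \mathrm{AP}_1(I,p,i) - \mathrm{AP}_1(O,p,i)$
$\mathrm{HMM}_2(p,i) = \mathrm{AP}_2(H,p,i) - \mathrm{AP}_2(I,p,i) - \mathrm{AP}_2(O,p,i)$
$E(p,i) = (\mathrm{NN}(p,i) + \mathrm{HMM}_1(p,i) + \mathrm{HMM}_2(p,i)) / 3$
where p is the protein, i is the position, H is helix, and L is loop (inner I, outer O)
$E(p,i) > 0$ means residue i of protein p is in a TM helix
(Figure: NN, HMM1, and HMM2 feed into ENSEMBLE)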
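A minimal sketch of the averaging step, under one stated assumption: the operators between the terms are not legible in the slide text, so each predictor's signal is taken here as its helix score minus its loop score(s), which is the reading consistent with the E(p,i) > 0 decision rule. The function and argument names are hypothetical.

```python
def ensemble_signal(nn_h, nn_l, ap1, ap2):
    """Per-residue ensemble score for one protein.

    nn_h, nn_l: lists of NN outputs for helix (H) and loop (L).
    ap1, ap2:   dicts with per-residue lists under keys 'H', 'I', 'O'
                for HMM1 and HMM2.
    Assumption: each signal is helix score minus loop score(s).
    """
    scores = []
    for i in range(len(nn_h)):
        nn = nn_h[i] - nn_l[i]
        hmm1 = ap1["H"][i] - ap1["I"][i] - ap1["O"][i]
        hmm2 = ap2["H"][i] - ap2["I"][i] - ap2["O"][i]
        scores.append((nn + hmm1 + hmm2) / 3)
    return scores

def in_tm(scores):
    """Residue i is called TM helix when its ensemble score is positive."""
    return [s > 0 for s in scores]
```

Averaging the three signals means a residue is called helix only when the predictors agree on balance, so one confident dissenting predictor can be outvoted by the other two.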
31
Ensemble: Topography Prediction
Fariselli et al., Bioinformatics, 2003
(Figure: NN, HMM1, and HMM2 feed into ENSEMBLE, whose output is passed to MaxSubSeq)
(Figure annotation: TM helix found by MaxSubSeq that would be missed without it)
(Figure annotation: taking this path means positions m to j form a helix)
32
Ensemble: Topography Prediction Results
A prediction is considered correct if (a) the number of TM segments is correct and (b) the overlap between a predicted and a real TM segment is > 8aa
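The correctness criterion can be written down as a short check. This sketch assumes segments are given as inclusive (start, end) residue positions and that predicted and real segments are paired in sequence order; neither detail is stated on the slide.

```python
def overlap(a, b):
    """Number of positions shared by two inclusive (start, end) segments."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)

def topography_correct(predicted, observed, min_overlap=8):
    """True if the segment counts match and every paired overlap exceeds min_overlap."""
    if len(predicted) != len(observed):
        return False
    pairs = zip(sorted(predicted), sorted(observed))
    return all(overlap(p, o) > min_overlap for p, o in pairs)

observed = [(10, 30), (50, 70)]
print(topography_correct([(12, 32), (48, 66)], observed))  # True
print(topography_correct([(12, 32)], observed))            # False: wrong number of segments
```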
33
Topology Prediction: Positive-Inside Rule
Gavel et al., FEBS, 282:41--46, 1991
Positively charged residues (Lys and Arg) are enriched more than 2-fold in stromal vs luminal loops
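As a rough illustration of how the positive-inside rule can drive a topology call, the sketch below counts Lys and Arg in alternating loops and places the richer set on the inside. The input format (loop sequences in N-to-C order, alternating sides of the membrane) and the tie handling are assumptions made for the example, not details from the slide.

```python
def count_positive(loop):
    """Number of Lys (K) and Arg (R) residues in a loop sequence."""
    loop = loop.upper()
    return loop.count("K") + loop.count("R")

def predict_orientation(loops):
    """Predict the side of the first loop from K+R counts.

    loops: loop sequences in N-to-C order; consecutive loops lie on
    opposite sides of the membrane (assumed input format).
    """
    even = sum(count_positive(l) for l in loops[0::2])
    odd = sum(count_positive(l) for l in loops[1::2])
    if even == odd:
        return "undetermined"
    return "inside" if even > odd else "outside"

print(predict_orientation(["MKRKS", "DEG", "RRK", "NT"]))  # inside
```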
34
Topology Prediction: Ensemble
“positive-inside” rule
35
Ensemble: Topology Prediction Results
36
Short Break
37
Subcellular Localization
38
Compartments and Sorting
Eukaryotic cells require that proteins be targeted to their subcellular destinations
Protein sorting is determined by specific amino acid sequences, or “signals”, within the protein
The secretory pathway targets proteins to the plasma membrane and to some membrane-bound organelles such as lysosomes, or exports proteins from the cell
39
Secretory Pathway
The secretory pathway consists of the endoplasmic reticulum (ER), Golgi apparatus, and transport vesicles
The transport vesicles carry proteins from one compartment to the other
Exocytosis is mediated by fusion of secretory vesicles with the plasma membrane
Endocytosis is the opposite of exocytosis and involves the uptake of extracellular material by pinching off vesicles from the plasma membrane
The contents of the endocytic vesicles are delivered to the lysosomes by membrane fusion
Lysosomes contain hydrolytic enzymes that break down macromolecules into smaller subunits, which the cell can use for its own biosynthesis
40
Datasets
Reinhardt & Hubbard, NAR, 26:2230--2236, 1998
–2427 eukaryotic proteins for 4 locations (cytoplasmic, extracellular, nuclear, & mitochondrial)
–997 prokaryotic proteins for 3 locations (cytoplasmic, extracellular, & periplasmic)
Park & Kanehisa, Bioinformatics, 19:1656--1663, 2003
–7589 eukaryotic proteins from 709 organisms for 12 locations (chloroplast, cytoplasmic, cytoskeleton, ER, extracellular, Golgi, lysosomal, mitochondrial, nuclear, peroxisomal, plasma membrane, vacuolar)
Chou & Cai, JBC, 277:45765--45769, 2002
–2191 proteins for 12 locations
Emanuelsson et al., JMB, 300:1005--1016, 2000
Gardy et al., NAR, 31:3613--3617, 2003
41
Common Eukaryotic Protein Sorting Signals
For a comprehensive list of cellular localization sites, see http://mendel.imp.univie.ac.at/CELL_LOC/index.html
42
Schematic View of Sorting Signals
(Figure labels: cleavage site, ~25aa)
43
Sequence Logos of SP, mTP, & cTP
SP: signal peptide
mTP: mitochondrial targeting peptide
cTP: chloroplast transit peptide
44
Neural Network Approach: TargetP
Emanuelsson et al., JMB, 300:1005--1016, 2000
cTP, mTP, and SP networks
–4 hidden units
–feedforward NNs
–input windows: 55aa (cTP), 35aa (mTP), 27aa (SP), sparsely encoded
Integrating network
–0 hidden units
–feedforward NN
–input is taken from the outputs of the cTP, mTP, and SP networks over 100aa at the N-terminal
cTP: chloroplast transit peptide, mTP: mitochondrial targeting peptide, SP: signal peptide
45
TargetP: Performance
Dataset: Emanuelsson et al., JMB, 2000
46
Expert System Approach: PSORT
Horton & Nakai, ISMB, 1997
(Figure: a simplified version of the decision tree that PSORT uses to check and reason over various sorting signals)
47
A Refinement: PSORT-B
Gardy et al., NAR, 31:3613--3617, 2003
Modules: SCL-BLAST, Motifs, HMMTOP, Outer Membrane Protein, SubLocC, Signal Peptides
A Bayesian network combines the module outputs into localization sites or “unknown”
Sites considered
–cytoplasm
–inner membrane
–periplasm
–outer membrane
–extracellular space
48
PSORT-B: SCL-BLAST
Homology to a protein of known localization is a good indicator of a protein’s actual localization site
BLAST the target protein against a database of proteins whose localization sites are known
Return the localization sites of hits at an E-value of 10e-10 over 80% of length
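The filtering step can be sketched as below. The hit records and their keys (`evalue`, `align_len`, `site`) are hypothetical stand-ins for parsed BLAST output, and the sketch assumes that the 80% coverage is measured against the query length and that the E-value cutoff is an upper bound; the slide does not specify either.

```python
def scl_blast_sites(hits, query_len, max_evalue=10e-10, min_coverage=0.8):
    """Return localization sites of hits that pass both cutoffs.

    hits: list of dicts with hypothetical keys 'evalue', 'align_len', 'site'.
    """
    sites = []
    for h in hits:
        covered = h["align_len"] / query_len
        if h["evalue"] <= max_evalue and covered >= min_coverage:
            if h["site"] not in sites:  # keep each site once, in hit order
                sites.append(h["site"])
    return sites

hits = [
    {"evalue": 1e-30, "align_len": 95, "site": "periplasm"},
    {"evalue": 1e-3, "align_len": 100, "site": "cytoplasm"},       # E-value too weak
    {"evalue": 1e-20, "align_len": 40, "site": "outer membrane"},  # too short
]
print(scl_blast_sites(hits, query_len=100))  # ['periplasm']
```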
49
PSORT-B: Motifs
Some motifs in PROSITE may be able to identify subcellular localization with 100% precision
Scan the target protein against a database of such motifs (28 such 100%-precision motifs are known)
Return the localization sites corresponding to the motif hits
50
PSORT-B: HMMTOP
An α-helical transmembrane region is a reliable indicator of localization to the inner membrane
Scan the target protein for transmembrane helices using HMMTOP
Return the localization site as “inner membrane” if >2 helices are found
51
PSORT-B: Outer Membrane Proteins
Outer-membrane proteins have a characteristic β-barrel structure
Identify frequent sequences occurring only in β-barrel proteins (279 such frequent sequences are known)
Scan the target protein for these frequent sequences
Return the localization site as “outer membrane” if >2 such frequent sequences are found
52
PSORT-B: SubLocC
Overall amino acid composition is useful for recognizing cytoplasmic proteins
Train an SVM on overall amino acid composition to predict cytoplasmic vs non-cytoplasmic, as in SubLoc
Analyze the target protein’s amino acid composition using this SVM
53
PSORT-B: Signal Peptides
Presence of a signal peptide at the N-terminal means the protein is not cytoplasmic
Train an HMM and an SVM to recognize signal peptides and their cleavage sites
If a high-confidence cleavage site is found by the HMM in the first 70aa of the target protein, then “non-cytoplasmic”
If a low-confidence cleavage site is found, pass the candidate signal peptide to the SVM to confirm
If confirmed, then “non-cytoplasmic”
Otherwise, “unknown”
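The decision flow of this module can be summarized in a few lines. The sketch models only the control flow: the HMM result is passed in as a hypothetical confidence label and the SVM as a callable, since the trained models themselves are not described on the slide.

```python
def signal_peptide_call(hmm_confidence, svm_confirms=None):
    """Decision flow for the signal peptide module.

    hmm_confidence: 'high', 'low', or None (no cleavage site found by the
                    HMM in the first 70aa); these labels are hypothetical.
    svm_confirms:   callable returning True/False, consulted only for a
                    low-confidence site.
    """
    if hmm_confidence == "high":
        return "non-cytoplasmic"
    if hmm_confidence == "low" and svm_confirms is not None and svm_confirms():
        return "non-cytoplasmic"
    return "unknown"

print(signal_peptide_call("low", svm_confirms=lambda: True))  # non-cytoplasmic
```

Passing the SVM as a callable keeps the second classifier from being evaluated unless the HMM result is ambiguous.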
54
PSORT-B: Bayesian Network
The Bayesian network integrates results from the 6 modules
It produces a score for each of the 5 possible localization sites
If a site scores >7.5, it is predicted as a localization site of the target protein
If no site scores >7.5, no prediction is made
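The final thresholding step is simple enough to state as code. The scores below are made-up inputs; how the Bayesian network computes them is outside this sketch.

```python
def call_sites(scores, threshold=7.5):
    """Return every site whose score exceeds the threshold.

    scores: dict mapping localization site -> score (illustrative inputs).
    An empty list means no prediction is made.
    """
    return sorted(site for site, s in scores.items() if s > threshold)

scores = {"cytoplasm": 0.4, "inner membrane": 8.9, "periplasm": 0.3,
          "outer membrane": 0.2, "extracellular space": 0.2}
print(call_sites(scores))  # ['inner membrane']
```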
55
PSORT-B: Performance of Individual Modules
Dataset: Gardy et al., NAR, 2003
56
PSORT-B: Performance wrt Localization Sites
PSORT-B is a considerable improvement over the original PSORT
Dataset: Gardy et al., NAR, 2003
57
PSORT vs PSORT-B: Some Remarks
PSORT considers various signals/features in a top-down way, driven by its reasoning tree
PSORT-B generates all signals/features in a bottom-up way, then integrates them for decision making using a Bayesian network
Machine learning “beats” human expert? Probably the number of features/rules needed is too large and too complicated
58
Amino acid compositions of proteins residing in different sites are different
59
Amino Acid Composition Differences
Each cellular location has its own characteristic physico-chemical environment
Proteins in each location have adapted through evolution to that environment
This adaptation is reflected in the protein structure and amino acid composition
If the above is true, the amino acid composition differences with respect to cellular location should be more pronounced on protein surfaces than in protein interiors
Exercise: Why?
60
Adaptation of Protein Surfaces
Andrade et al., JMB, 1998
To test the theory of adaptation of protein surfaces to subcellular localization, we plot 3 types of composition vectors along their first two principal components
(Formula annotation: proportion of the j-th amino acid type in the i-th protein)
61
Adaptation of Protein Surfaces
Andrade et al., JMB, 1998
(Figure panels: total amino acid composition vector; surface amino acid composition vector; interior amino acid composition vector)
Clearly, total & surface composition vectors show better separation than interior composition vectors
62
Amino Acid Composition
This means we can use amino acid composition vectors, especially those from protein surfaces, to predict subcellular localization!
Let’s see how this turns out...
63
Neural Networks: NNPSL
Reinhardt & Hubbard, NAR, 26:2230--2236, 1998
Inputs: Input 1 to Input 20, the fraction of each amino acid in the input protein
Outputs: cytoplasmic, extracellular, mitochondrial, nuclear
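The 20 inputs are straightforward to compute. A minimal sketch follows; the alphabetical ordering of the 20 standard amino acids and the skipping of non-standard letters are choices made here, not details from the slide.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, alphabetical by one-letter code

def composition(seq):
    """Fraction of each of the 20 amino acids in seq (non-standard letters are skipped)."""
    counted = [a for a in seq.upper() if a in AMINO_ACIDS]
    n = len(counted)
    return [counted.count(a) / n if n else 0.0 for a in AMINO_ACIDS]

vec = composition("AAKL")
print(vec[AMINO_ACIDS.index("A")])  # 0.5
```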
64
NNPSL: Performance
The outputs of NNPSL have values from 0 to 1. The difference (Δ) between the highest and the next highest output nodes can be used as a reliability index
(Table bins: 0 < Δ < 0.2; 0.2 < Δ < 0.4; 0.4 < Δ < 0.6; 0.6 < Δ < 0.8; 0.8 < Δ < 1)
Dataset: Reinhardt & Hubbard, NAR, 1998
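The reliability index described above is the gap between the two largest outputs. A small sketch, with made-up output values:

```python
def reliability(outputs):
    """Return (predicted location, gap between the top two outputs).

    outputs: dict mapping location -> network output in [0, 1];
    needs at least two locations.
    """
    ranked = sorted(outputs.items(), key=lambda kv: kv[1], reverse=True)
    (best, top), (_, second) = ranked[0], ranked[1]
    return best, top - second

site, gap = reliability({"cytoplasmic": 0.9, "nuclear": 0.3,
                         "extracellular": 0.1, "mitochondrial": 0.05})
print(site)  # cytoplasmic
```

A large gap means one output node clearly dominates, so predictions can be binned by the gap to trade coverage for accuracy.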
65
Performance
Emanuelsson, BIB, 3:361--376, 2002
(Figure labels: 940 proteins; 2738 proteins)
Dataset: Emanuelsson et al., JMB, 2000
66
Markov Chain
Yuan, FEBS Letters, 451:23--26, 1999
Why?
67
Markov Chain: Performance
(Table columns: NNPSL; 4th Order Markov (Eukaryotic))
Dataset: Reinhardt & Hubbard, NAR, 1998
68
Support Vector Machines: SubLoc
Hua & Sun, Bioinformatics, 17:721--728, 2001
Input: 20-dimensional vector giving the amino acid composition of the input protein
One X-vs-rest SVM per location (extracellular vs rest, nuclear vs rest, cytoplasmic vs rest, mitochondrial vs rest); the prediction is the argmax over X of the X-vs-rest SVM outputs
The SVMs use
–polynomial kernel with d = 9 (prokaryotic), $K(X_i, X_j) = (X_i \cdot X_j + 1)^d$
–RBF kernel with $\gamma = 16$ (eukaryotic), $K(X_i, X_j) = \exp(-\gamma |X_i - X_j|^2)$
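The two kernels and the argmax step can be sketched in plain Python. The symbol lost before "=16" on the slide is read here as the RBF width parameter gamma, and the decision values passed to `predict_location` are made-up numbers standing in for trained SVM outputs.

```python
import math

def poly_kernel(x, y, d=9):
    """Polynomial kernel (x . y + 1)^d."""
    return (sum(a * b for a, b in zip(x, y)) + 1) ** d

def rbf_kernel(x, y, gamma=16.0):
    """RBF kernel exp(-gamma * |x - y|^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def predict_location(decision_values):
    """Pick the location whose X-vs-rest SVM gives the largest output."""
    return max(decision_values, key=decision_values.get)

print(predict_location({"nuclear": 0.2, "cytoplasmic": 1.3,
                        "extracellular": -0.7, "mitochondrial": -1.1}))  # cytoplasmic
```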
69
SubLoc: Performance
(Table columns: NNPSL; SubLoc (Eukaryotic))
Dataset: Reinhardt & Hubbard, NAR, 1998
70
SubLoc: Robustness of Amino Acid Composition Approach
Amazingly, the accuracy of SubLoc is virtually unaffected when the first 10, 20, 30, or 40 amino acids in a protein are deleted
Amino acid composition is a robust indicator of subcellular localization, and is insensitive to errors in N-terminal sequences
71
Amino Acid Composition: Taking it Further
How about pairs of consecutive amino acids? (a.k.a. 2-grams)
How about 3-grams, …, k-grams?
How about pseudo amino acid composition?
How about presence of entire functional domains? (i.e., think of the presence/absence of a functional domain as a summary of amino acid sequence info...)
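Extending composition from single residues to k-grams is a small change. The sketch below returns a sparse dictionary of k-gram fractions rather than a fixed-length vector, which is a representation choice made here for brevity.

```python
from collections import Counter

def kgram_composition(seq, k=2):
    """Fraction of each overlapping k-gram in seq, as a sparse dict."""
    seq = seq.upper()
    grams = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    total = len(grams)
    if total == 0:
        return {}
    return {g: c / total for g, c in Counter(grams).items()}

print(kgram_composition("AAKA", k=1))  # {'A': 0.75, 'K': 0.25}
```

With k = 1 this reduces to ordinary amino acid composition; the feature space grows as 20^k, so larger k quickly becomes sparse.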
72
Functional Domain Composition
Chou & Cai, JBC, 277:45765--45769, 2002
Training seqs of various localization sites
BLAST against db of known functional domains (SBASE-A)
Functional domain vector x, where x_i = 1 means the i-th domain is present
Amino acid composition + functional domain vector
Train SVM using these vectors
73
Functional Domain Composition: Performance
Not so good. Why?
–Number of known domains in SBASE-A is too small
–Need to handle the situation where a protein has no hit in known domains
Dataset: Reinhardt & Hubbard, NAR, 1998
74
Functional Domain Composition
Cai & Chou, BBRC, 305:407--411, 2003
Training seqs of various localization sites
BLAST against db of known functional domains (InterPro)
NN-5875D: train k-NN (k=1) using the functional domain vectors
or, if no hit is found,
NN-40D: train k-NN (k=1) using vectors of amino acid composition and pseudo amino acid composition
If a protein gets a hit in InterPro, use NN-5875D; else use NN-40D
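The fallback logic between the two nearest-neighbour predictors can be sketched as follows. The vectors in the example are tiny toy stand-ins for the 5875-dimensional and 40-dimensional vectors, and squared Euclidean distance is an assumed choice of metric.

```python
def nearest_label(query, training):
    """1-nearest-neighbour label by squared Euclidean distance.

    training: list of (vector, label) pairs.
    """
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(training, key=lambda item: dist(query, item[0]))[1]

def predict(domain_vec, composition_vec, domain_training, composition_training):
    """Use the domain-based predictor if any domain is present, else fall back."""
    if any(domain_vec):
        return nearest_label(domain_vec, domain_training)
    return nearest_label(composition_vec, composition_training)

domain_training = [([1, 0, 0], "nuclear"), ([0, 1, 0], "cytoplasmic")]
composition_training = [([0.9, 0.1], "extracellular"), ([0.1, 0.9], "mitochondrial")]
print(predict([0, 0, 0], [0.8, 0.2], domain_training, composition_training))  # extracellular
```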
75
Functional Domain Composition: Performance
Dataset: Reinhardt & Hubbard, NAR, 1998
76
Notes
77
References (Transmembrane)
Weiss et al. “Transmembrane segment prediction from protein sequence data”, ISMB, 1:420--421, 1993
Gavel et al. “The positive-inside rule applies to thylakoid membrane proteins”, FEBS, 282:41--46, 1991
Monne et al. “A turn propensity scale for transmembrane helices”, JMB, 288:141--145, 1999
Sonnhammer et al. “A hidden Markov model for predicting transmembrane helices in protein sequences”, ISMB, 6:175--182, 1998
Martelli et al. “An ENSEMBLE machine learning approach for the prediction of all-alpha membrane proteins”, Bioinformatics, 19(suppl):i205--i211, 2003
78
References (Transmembrane)
Von Heijne. “Membrane protein structure prediction”, JMB, 225:487--494, 1992
Jacoboni et al. “Prediction of the transmembrane regions of beta-barrel membrane proteins with a neural network-based predictor”, Protein Sci., 10:779--787, 2001
Martelli et al. “A sequence-profile-based HMM for predicting and discriminating beta barrel membrane proteins”, Bioinformatics, 18:S46--S53, 2002
Moller et al. “Evaluation of methods for the prediction of membrane spanning regions”, Bioinformatics, 17:646--653, 2001
Fariselli et al. “MaxSubSeq: an algorithm for segment-length optimization. The case study of the transmembrane spanning segments”, Bioinformatics, 19:500--505, 2003
79
References (Transmembrane)
Rost et al. “Transmembrane helices predicted at 95% accuracy”, Protein Sci., 4:521--533, 1995
Krogh et al. “Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes”, JMB, 305:567--580, 2001
Andersson et al. “Different positively charged amino acids have similar effects on the topology of a polytopic transmembrane protein in E. coli”, JBC, 267:1491--1495, 1992
80
References (Subcellular Localization)
Horton & Nakai. “Better prediction of protein cellular localization sites with the k-nearest neighbours classifier”, ISMB, 5:147--152, 1997
Gardy et al. “PSORT-B: Improving protein subcellular localization for Gram-negative bacteria”, NAR, 31:3613--3617, 2003
Emanuelsson. “Predicting protein subcellular localization from amino acid sequence information”, BIB, 3:361--376, 2002
Andrade et al. “Adaptation of protein surfaces to subcellular location”, JMB, 276:517--525, 1998
Yuan. “Prediction of protein subcellular locations using Markov chain models”, FEBS Letters, 451:23--26, 1999
81
References (Subcellular Localization)
Emanuelsson et al. “ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites”, Protein Sci., 8:978--984, 1999
Emanuelsson et al. “Predicting subcellular localization of proteins based on their N-terminal amino acid sequence”, JMB, 300:1005--1016, 2000
Hua & Sun. “Support vector machine approach for protein subcellular localization prediction”, Bioinformatics, 17:721--728, 2001
Reinhardt & Hubbard. “Using neural networks for prediction of the subcellular location of proteins”, NAR, 26:2230--2236, 1998
82
References (Subcellular Localization)
Cai & Chou. “Nearest neighbour algorithm for predicting protein subcellular location by combining functional domain composition and pseudo-amino acid composition”, BBRC, 305:407--411, 2003
Chou & Cai. “Using functional domain composition and support vector machines for prediction of protein subcellular location”, JBC, 277:45765--45769, 2002
Park & Kanehisa. “Prediction of protein subcellular locations by support vector machines using compositions of amino acids and amino acid pairs”, Bioinformatics, 19:1656--1663, 2003