Download presentation
Presentation is loading. Please wait.
1
Artificial Neural Networks Thomas Nordahl Petersen & Morten Nielsen
2
A data-driven method to predict a feature, given a set of training data In biology input features could be amino acid sequence or nucleotides Secondary structure prediction Signal peptide prediction Surface accessibility Propeptide prediction Use of artificial neural networks N C Signal peptide Propeptide Mature/active protein
3
Neural network prediction methods http://www.cbs.dtu.dk/services/ http://www.cbs.dtu.dk/services/
4
Pattern recognition
5
Biological Neural network
6
Biological neuron structure Synapse Neuron Terminal Several connections
7
Diversity of interactions in a network enables complex calculations Similar in biological and artificial systems Excitatory (+) and inhibitory (-) relations between compute units fire 0 1
8
Transfer of biological principles to artificial neural network algorithms Non-linear relation between input and output Massively parallel information processing Data-driven construction of algorithms Ability to generalize to new data items
9
Sparse encoding of amino acid sequence windows
10
Sparse encoding Inp Neuron 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 AAcid A 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 R 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 N 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 D 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 C 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Q 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 E 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
11
BLOSUM encoding (Blosum50 matrix) A R N D C Q E G H I L K M F P S T W Y V A 4 -1 -2 -2 0 -1 -1 0 -2 -1 -1 -1 -1 -2 -1 1 0 -3 -2 0 R -1 5 0 -2 -3 1 0 -2 0 -3 -2 2 -1 -3 -2 -1 -1 -3 -2 -3 N -2 0 6 1 -3 0 0 0 1 -3 -3 0 -2 -3 -2 1 0 -4 -2 -3 D -2 -2 1 6 -3 0 2 -1 -1 -3 -4 -1 -3 -3 -1 0 -1 -4 -3 -3 C 0 -3 -3 -3 9 -3 -4 -3 -3 -1 -1 -3 -1 -2 -3 -1 -1 -2 -2 -1 Q -1 1 0 0 -3 5 2 -2 0 -3 -2 1 0 -3 -1 0 -1 -2 -1 -2 E -1 0 0 2 -4 2 5 -2 0 -3 -3 1 -2 -3 -1 0 -1 -3 -2 -2 G 0 -2 0 -1 -3 -2 -2 6 -2 -4 -4 -2 -3 -3 -2 0 -2 -2 -3 -3 H -2 0 1 -1 -3 0 0 -2 8 -3 -3 -1 -2 -1 -2 -1 -2 -2 2 -3 I -1 -3 -3 -3 -1 -3 -3 -4 -3 4 2 -3 1 0 -3 -2 -1 -3 -1 3 L -1 -2 -3 -4 -1 -2 -3 -4 -3 2 4 -2 2 0 -3 -2 -1 -2 -1 1 K -1 2 0 -1 -3 1 1 -2 -1 -3 -2 5 -1 -3 -1 0 -1 -3 -2 -2 M -1 -1 -2 -3 -1 0 -2 -3 -2 1 2 -1 5 0 -2 -1 -1 -1 -1 1 F -2 -3 -3 -3 -2 -3 -3 -3 -1 0 0 -3 0 6 -4 -2 -2 1 3 -1 P -1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4 7 -1 -1 -4 -3 -2 S 1 -1 1 0 -1 0 0 0 -1 -2 -2 0 -1 -2 -1 4 1 -3 -2 -2 T 0 -1 0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1 1 5 -2 -2 0 W -3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1 1 -4 -3 -2 11 2 -3 Y -2 -2 -2 -3 -2 -1 -2 -3 2 -1 -1 -2 -1 3 -3 -2 -2 2 7 -1 V 0 -3 -3 -3 -1 -2 -2 -3 -3 3 1 -2 1 -1 -2 -2 0 -3 -1 4
12
Sequence encoding (continued) Sparse encoding –V:0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 –L:0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 –V. L=0 (unrelated) Blosum encoding –V: 0 -3 -3 -3 -1 -2 -2 -3 -3 3 1 -2 1 -1 -2 -2 0 -3 -1 4 –L:-1 -2 -3 -4 -1 -2 -3 -4 -3 2 4 -2 2 0 -3 -2 -1 -2 -1 1 –V. L = 0.88 (highly related) –V. R = -0.08 (close to unrelated)
13
I1I2I3 h1h2 O1 Input h1 h1 hidden output = 1/ (1+e -x ) o=H1*v1,1 + H2*v2,1 O1 = (o) w1,1 v1,1 w3,1 v2,1 Error = O - True w1,2
14
Sigmodial or logistic function
16
Training and error reduction
17
18
Size matters
19
ß-strand Helix Turn Bend Secondary Structure Elements
20
Neural Network Architecture I K E E H V I I Q A E H E C IKEEHVIIQAEFYLNPDQSGEF….. Window Input Layer Hidden Layer Output Layer Weights
21
Normally the best prediction is obtained by averaging results from several predictions - “wisdom of the crowd Two types of neural networks Prediction of features in classes/bins e.g. H, E or C (1,0,0) Values close to 1 or 0 are more accurate than values close to 1/2 Prediction of real values e.g. Surface accessibility (0.43) Reliability of a prediction is more difficult to estimate Predictions and reliability of a prediction
23
Eukaryotic SP & TM Signal peptide cleavage 1523 seq C-terminal end of TM-regions 669 seq
24
Signal peptide prediction Signal pepdide likeness Cleavage site Combined information
25
Propeptide prediction Many secretory proteins and peptides are synthesized as inactive precursors that in addition to signal peptide cleavage undergo post-translational processing to become biologically active polypeptides. Precursors are usually cleaved at sites composed of single or paired basic amino acid residues by members of the subtilisin/kexin-like proprotein convertase (PC) family. In mammals, seven members have been identified, with furin being the one first discovered and best characterized. Recently, the involvement of furin in diseases ranging from Alzheimer's disease and cancer to anthrax and Ebola fever has created additional focus on proprotein processing. We have developed a method for prediction of cleavage sites for PCs based on artificial neural networks. Two different types of neural networks have been constructed: a furin-specific network based on experimental results derived from the literature, and a general PC-specific network trained on data from the Swiss-Prot protein database. The method predicts cleavage sites in independent sequences with a sensitivity of 95% for the furin neural network and 62% for the general PC network. Protein Engineering, Design and Selection: 17: 107-112, 2004. General cleavage: R/K-X n -R/K, n=0, 2, 4, 6 Furin cleavage: R-X-R/K-R
26
Propeptide prediction Furin cleavage
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.