Protein Secondary Structures
Assignment and prediction
Pernille Haste Andersen
Outline
What is protein secondary structure?
How can it be used?
Different prediction methods: alignment to homologues, propensity methods, neural networks
Evaluation of prediction methods
Links to prediction servers
Secondary Structure Elements
β-strand, helix, bend, turn
Use of Secondary Structure
Classification of protein structures
Definition of loops (active sites)
Use in fold recognition methods
Improvement of alignments
Definition of domain boundaries
Classification of Secondary Structure
Defining features: dihedral angles, hydrogen bonds, geometry
Assigned manually by crystallographers, or automatically by:
DSSP (Kabsch & Sander, 1983)
STRIDE (Frishman & Argos, 1995)
DSSPcont (Andersen et al., 2002)
Dihedral Angles
phi: dihedral angle of the N-Calpha bond
psi: dihedral angle of the Calpha-C bond
omega: dihedral angle of the C-N (peptide) bond
Helices
Hydrogen-bond pattern (backbone C=O of residue i to N-H of residue i+n):
alpha-helix: i -> i+4
pi-helix: i -> i+5
310 helix: i -> i+3
(omega = 180 deg for the trans peptide bond)
[Table of characteristic phi/psi values per helix type not preserved]
Beta Strands
Antiparallel and parallel strand arrangements.
[Table of characteristic phi, psi, and omega values for antiparallel and parallel beta strands not preserved]
Secondary Structure Type Descriptions (DSSP)
* H = alpha helix
* G = 310 helix
* I = pi helix (5-turn helix)
* E = extended strand, participates in beta ladder
* B = residue in isolated beta-bridge
* T = hydrogen-bonded turn
* S = bend
* C = coil
Automatic Assignment Programs
DSSP, STRIDE, DSSPcont
The Protein Data Bank visualizes DSSP assignments on structures in the database.
[Sample DSSP output: per-residue lines with columns RESIDUE, AA, STRUCTURE, BP1, BP2, ACC, H-bond partners and energies, TCO, KAPPA, ALPHA, PHI, PSI, and CA coordinates; numeric values not preserved]
Secondary Structure Prediction
What to predict? All 8 DSSP types, or pool the types into 3 groups (Q3: H, E, C):
* H = alpha helix -> H
* G = 310 helix -> H
* I = pi helix -> H
* E = extended strand -> E
* B = beta-bridge -> E
* T = hydrogen-bonded turn -> C
* S = bend -> C
* C = coil -> C
Secondary Structure Prediction: Straight HEC
An alternative pooling keeps only the unambiguous classes:
* H = alpha helix -> H
* E = extended strand -> E
* T, S, C, G, I, B (all other types) -> C
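The two pooling schemes above amount to simple lookup tables. A minimal sketch in Python (names are illustrative, not from any specific tool):

```python
# Pool the 8 DSSP classes into the 3-state H/E/C alphabet.

# Scheme 1: helices (H, G, I) -> H, strands (E, B) -> E, the rest -> C
DSSP_TO_HEC = {
    "H": "H", "G": "H", "I": "H",
    "E": "E", "B": "E",
    "T": "C", "S": "C", "C": "C",
}

# Scheme 2 ("straight HEC"): only H -> H and E -> E, everything else -> C
DSSP_TO_HEC_STRICT = {c: ("H" if c == "H" else "E" if c == "E" else "C")
                      for c in "HGIEBTSC"}

def pool(dssp_string, table):
    """Map a string of per-residue DSSP codes to 3-state codes."""
    return "".join(table[c] for c in dssp_string)

print(pool("HGIEBTSC", DSSP_TO_HEC))         # -> HHHEECCC
print(pool("HGIEBTSC", DSSP_TO_HEC_STRICT))  # -> HCCECCCC
```

The choice of pooling matters when reporting Q3: the same raw prediction scores differently under the two schemes.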
Secondary Structure Prediction Methods
Simple alignments: align to a close homologue whose structure has been experimentally solved.
Heuristic methods (e.g., Chou-Fasman, 1974): assign a score to each amino acid and sum over a window.
Neural networks:
Raw sequence input (late 1980s)
Blosum matrix input (e.g., PHD, early 1990s)
Position-specific alignment profiles (e.g., PSIPRED, late 1990s)
Multiple-network balloting, probability conversion, output expansion (Petersen et al., 2000)
Improvement of Accuracy
1974 Chou & Fasman ~50-53%
1978 Garnier 63%
1987 Zvelebil 66%
1988 Qian & Sejnowski 64.3%
1993 Rost & Sander (value missing)
1997 Frishman & Argos <75%
1999 Cuff & Barton 72.9%
1999 Jones (value missing)
2000 Petersen et al. ~80%
Simple Alignments
A solved structure of a homologue of the query is needed.
Homologous proteins have ~88% identical secondary structure (3-state).
If no close homologue can be identified, alignment gives almost random results.
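The alignment-based approach can be sketched as copying the template's secondary structure onto aligned query residues. A minimal sketch, assuming a gapped pairwise alignment is already available (function and gap conventions are illustrative):

```python
# Transfer 3-state secondary structure from a solved homologue (template)
# to the query via a pairwise alignment with '-' gap characters.

def transfer_ss(query_aln, templ_aln, templ_ss):
    """Copy the template's secondary structure onto query residues.

    query_aln, templ_aln: aligned sequences of equal length.
    templ_ss: 3-state string for the ungapped template sequence.
    Query residues aligned to a template gap get 'C' (no information).
    """
    pred, t_pos = [], 0
    for q, t in zip(query_aln, templ_aln):
        if t != "-":
            if q != "-":
                pred.append(templ_ss[t_pos])
            t_pos += 1
        elif q != "-":
            pred.append("C")  # query insertion: template is silent here
    return "".join(pred)

print(transfer_ss("IKEE-HVI", "IKEEQHV-", "CCHHHHH"))  # -> CCHHHHC
```

This works well only when the homologue is close; as the slide notes, without a close homologue the transferred structure is nearly random.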
Propensities: Amino Acid Preferences in the alpha-Helix
Note capping preferences at the helix termini.
Propensities: Amino Acid Preferences in the beta-Strand
Propensities: Amino Acid Preferences in Coil
Chou-Fasman Propensities
[Table of Chou-Fasman parameters per amino acid: helix propensity P(a), strand propensity P(b), turn propensity P(turn), and turn-position frequencies f(i), f(i+1), f(i+2), f(i+3); numeric values not preserved]
Chou-Fasman
Generally applicable: works for sequences with no solved homologues.
But the accuracy is low. The problem is that the method does not use enough information about the structural context of a residue.
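The window-scoring idea behind Chou-Fasman can be sketched in a few lines. The propensity values below cover only a few residues and are given as commonly quoted; treat them as illustrative rather than the authoritative published table:

```python
# Toy Chou-Fasman-style scoring: average the per-residue propensity over a
# sliding window; a region is helix-like when the average exceeds 1.0.
# Values below are illustrative entries for a few residues only.

P_HELIX = {"A": 1.42, "E": 1.51, "L": 1.21, "G": 0.57, "P": 0.57, "V": 1.06}
P_STRAND = {"A": 0.83, "E": 0.37, "L": 1.30, "G": 0.75, "P": 0.55, "V": 1.70}

def window_scores(seq, table, w=6):
    """Average propensity over each length-w window (Chou-Fasman style)."""
    return [sum(table.get(aa, 1.0) for aa in seq[i:i + w]) / w
            for i in range(len(seq) - w + 1)]

print(window_scores("AEELGPV", P_HELIX, w=6))
```

The sketch makes the slide's criticism concrete: each residue contributes the same score regardless of its neighbours, so structural context beyond the raw window sum is ignored.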
Neural Networks
Benefits:
Generally applicable
Can capture higher-order correlations
Can take inputs other than sequence information
Drawbacks:
Need a large amount of data (different solved structures); however, today nearly 2500 structures with low mutual sequence identity and high resolution are solved
Complex method with several pitfalls
Architecture
A window of residues from the sequence (e.g., IKEEHVIIQAEFYLNPDQSGEF...) feeds the input layer; weights connect it to a hidden layer and then to an output layer with one unit per class (H, E, C). The network predicts the structure of the central residue in the window.
Sparse Encoding
Each residue in the input window is encoded by 20 input neurons (A, R, N, D, C, Q, E, ...), with a 1 for the neuron matching the amino acid and 0 elsewhere.
Input layer example: the window IKEEHVIIQAE, one one-hot vector per residue.
BLOSUM Encoding
Instead of a one-hot vector, each residue in the window IKEEHVIIQAE is encoded by its BLOSUM62 substitution-score row over the 20 amino acids, so similar residues receive similar input vectors.
[BLOSUM62 matrix and per-residue score columns not preserved]
Secondary Networks (Structure-to-Structure)
A second network takes a window of first-stage H/E/C predictions as input and outputs a refined prediction for the central residue, sliding along the sequence (IKEEHVIIQAEFYLNPDQSGEF...).
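The effect of the structure-to-structure stage is to smooth out isolated errors. As a stand-in for the trained second network, this sketch applies a simple majority filter over a window of first-stage labels (the real method learns this filtering from data rather than using a fixed rule):

```python
# Smooth a 3-state prediction string: re-label each position by the
# majority class inside a window centred on it, which removes isolated
# mispredictions such as a lone coil inside a helix.

from collections import Counter

def smooth(pred, w=5):
    """Majority-filter a 3-state prediction string with window width w."""
    half = w // 2
    out = []
    for i in range(len(pred)):
        window = pred[max(0, i - half):i + half + 1]
        out.append(Counter(window).most_common(1)[0][0])
    return "".join(out)

print(smooth("HHHCHHHEEEEE"))  # -> HHHHHHHEEEEE (the lone C is removed)
```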
PHD Method (Rost and Sander)
Combines neural networks with sequence profiles: a 6-8 percentage-point increase in prediction accuracy over standard neural networks.
Uses a second-layer "structure-to-structure" network to filter predictions.
Uses a jury of predictors.
Originally set up as a mail server.
PSIPRED (Jones)
Uses alignments from iterative sequence searches (PSI-BLAST) as input to a neural network.
Better predictions due to better sequence profiles.
Available as a stand-alone program and via the web.
Position-Specific Scoring Matrices (PSI-BLAST profiles)
Each position of the query (1 I, 2 K, 3 E, ... 17 D) gets a row of scores over the 20 amino acids (A R N D C Q E G H I L K M F P S T W Y V) derived from the PSI-BLAST multiple alignment.
[Numeric profile values not preserved]
Several Different Architectures
Sequence-to-structure: window sizes 15, 17, 19, and 21; 50 and 75 hidden units; 10-fold cross-validation => 80 predictions.
Structure-to-structure: window size 17; 40 hidden units; 10-fold cross-validation => 800 predictions.
Example outputs: C C H H C C C (sequence-to-structure) and C C C C C C C (structure-to-structure).
The Majority Rules
Combining predictions from several networks improves the prediction. Combinations of 800 different networks were used in the method described by Petersen TN et al. (2000), "Prediction of protein secondary structure at 80% accuracy", Proteins.
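The balloting idea can be sketched as a per-residue majority vote across the ensemble. This is a simplification: the actual method of Petersen et al. combines network activities rather than hard labels:

```python
# Per-position majority vote over an ensemble of equal-length 3-state
# prediction strings, in the spirit of multiple-network balloting.

from collections import Counter

def ballot(predictions):
    """Return the majority class at each position across all predictors."""
    return "".join(Counter(col).most_common(1)[0][0]
                   for col in zip(*predictions))

ensemble = ["HHHCEE", "HHHHEE", "CHHCEE"]
print(ballot(ensemble))  # -> HHHCEE
```

Each network makes somewhat independent errors, so the vote is usually right even where individual networks disagree.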
Activities to Probabilities
The networks output helix and strand activities; these are converted to probabilities, and the coil probability is then calculated from them.
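One simple conversion scheme is sketched below, assuming the networks output helix and strand activities in [0, 1]; this is an illustrative scheme, not necessarily the exact conversion used by Petersen et al.:

```python
# Convert helix/strand activities to a 3-state probability triple: clip
# the activities, rescale if they overshoot, and assign the remaining
# probability mass to coil (which the networks do not predict directly).

def to_probabilities(h_act, e_act):
    """Return (P_helix, P_strand, P_coil) summing to 1."""
    h, e = max(0.0, h_act), max(0.0, e_act)
    total = h + e
    if total > 1.0:            # rescale if the two activities overshoot
        h, e = h / total, e / total
    return h, e, 1.0 - h - e   # coil is the remaining probability mass

print(to_probabilities(0.7, 0.1))  # roughly (0.7, 0.1, 0.2)
```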
Benchmarking Secondary Structure Predictions
EVA: every week, newly solved structures are sent to the prediction servers and the predictions are evaluated.
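The Q3 accuracies reported by EVA and throughout these slides are simply the fraction of residues predicted correctly over the three states:

```python
# Q3: percentage of residues whose 3-state prediction matches the
# assigned (e.g., DSSP-derived, pooled) secondary structure.

def q3(predicted, assigned):
    """Percentage of correctly predicted residues over three states."""
    assert len(predicted) == len(assigned)
    hits = sum(p == a for p, a in zip(predicted, assigned))
    return 100.0 * hits / len(assigned)

print(q3("HHHECC", "HHHCCC"))  # one mismatch out of six residues
```

Note that Q3 depends on the pooling scheme (which 8-state classes count as H, E, or C), so published numbers are only comparable under the same pooling.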
EVA Results (Rost et al., 2001)
PROFphd 77.0%
PSIPRED 76.8%
SAM-T99sec 76.1%
SSpro (value missing)
Jpred (value missing)
PHD (value missing)
Cubic.columbia.edu/eva
Links to Servers
ProfPHD
PSIPRED
JPred
Practical Conclusions
If you need a secondary structure prediction, use one of the newer methods based on advanced machine learning, such as ProfPHD, PSIPRED, or JPred, and not one of the older ones such as Chou-Fasman or Garnier.