Published by Emory Mitchell. Modified over 9 years ago.
1
Comparative methods
Basic logic: The 3D structure of the protein is deduced from:
1. Similarities between the protein and other proteins
2. Statistical tendencies characteristic of its sequence
Physical aspects of the structure are not included in the prediction.
Major categories of comparative structure prediction:
1. Secondary structure prediction
2. Homology modeling
3. Fold recognition
2
1. Secondary structure prediction
Basic methodology:
Each amino acid has a statistical propensity to appear in certain secondary structures (e.g. helix, sheet, turn).
The individual amino acid propensities are additive.
Thus, the propensity of an entire protein segment can be calculated.
By using a 'sliding window', protein segments with strong secondary structure propensities can be identified.
3
Major steps in secondary structure prediction
1. Chou and Fasman (1974)
Residue propensities + a sliding window for prediction
P(H), P(E), P(turn): frequency parameters for appearing in an α-helix, β-sheet, and turn
F(i), F(i+1), F(i+2), F(i+3): frequencies of being in the 1st to 4th position of a β-turn
4
Success rate: ~50%
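The sliding-window idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the Chou and Fasman procedure itself: the propensity values are toy numbers chosen for the example (not the published table), and the window length, the threshold and the function names are assumptions.

```python
# Toy helix propensities, for illustration only.
# These are NOT the published Chou-Fasman parameters.
TOY_HELIX_PROPENSITY = {"A": 1.4, "E": 1.5, "L": 1.2, "G": 0.6, "P": 0.6, "S": 0.8}
DEFAULT_PROPENSITY = 1.0  # assumed neutral value for residues missing from the toy table


def window_scores(seq, table, window=4):
    """Return the mean propensity of every window of the given length."""
    scores = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        # Propensities are treated as additive, then normalised by window length.
        total = sum(table.get(res, DEFAULT_PROPENSITY) for res in segment)
        scores.append(total / window)
    return scores


def strong_windows(seq, table, window=4, threshold=1.05):
    """Return the start positions of windows whose mean propensity exceeds the threshold."""
    scores = window_scores(seq, table, window)
    return [i for i, score in enumerate(scores) if score > threshold]


if __name__ == "__main__":
    sequence = "AEELGPGS"
    print(window_scores(sequence, TOY_HELIX_PROPENSITY))
    print(strong_windows(sequence, TOY_HELIX_PROPENSITY))
```

With these toy values, only the first two windows of `AEELGPGS` score above the threshold, so the helix-like signal is confined to the start of the sequence.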
5
2. Sternberg (1987)
Incorporates evolutionary information into the calculation, in the form of multiple sequence alignments (MSAs); homologous proteins tend to have similar secondary structures.
Success rate: 69%
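One simple way to picture how an alignment adds information is to pool the propensities of all residues found in the same alignment column. The sketch below does exactly that and nothing more; it is an assumption made for illustration, not the published method, and the propensity values and the name `column_propensities` are hypothetical.

```python
# Toy helix propensities, for illustration only (not a published table).
TOY_HELIX_PROPENSITY = {"A": 1.4, "E": 1.5, "L": 1.2, "G": 0.6, "P": 0.6}


def column_propensities(msa, table, default=1.0):
    """Average the propensity over each alignment column, ignoring gaps ('-')."""
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all aligned sequences must have the same length")
    averaged = []
    for col in range(length):
        residues = [row[col] for row in msa if row[col] != "-"]
        if not residues:
            # A column made only of gaps carries no information.
            averaged.append(default)
        else:
            total = sum(table.get(res, default) for res in residues)
            averaged.append(total / len(residues))
    return averaged


if __name__ == "__main__":
    alignment = ["AEG", "AEP", "L-G"]
    print(column_propensities(alignment, TOY_HELIX_PROPENSITY))
```

The per-column averages can then be fed into the same sliding-window scan used for a single sequence.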
6
3. Rost and Sander (1994) (PHD-Sec)
Combines neural networks (i.e. machine learning) with multiple sequence alignments
Success rates: PHD-Sec: 72%; PREDATOR: 75%; PSIPRED: 77%
7
Common problems in secondary structure prediction
Prediction is problematic at the ends of secondary structure elements.
The success rate is always below 100%, possibly because of tertiary effects in proteins.
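The success rates quoted for these methods can be read as per-residue accuracies. Assuming that interpretation (often called Q3 for the three states helix, strand and coil), a minimal sketch of the calculation looks like this; the state letters H, E and C and the example strings are made up for illustration.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose predicted state (H, E or C) matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have the same length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


if __name__ == "__main__":
    predicted = "HHHHCCEEEE"
    observed = "HHHCCCEEEC"
    print(q3_accuracy(predicted, observed))
```

In this made-up example both mismatches fall at the ends of an element (the last helix residue and the last strand residue), which is the kind of error described above.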
8
2. Homology modeling
Basic logic:
Homologous proteins (proteins with a common ancestor; high sequence identity) share similar structures.
Thus, the structure of a protein can be predicted from its sequence similarity to proteins of known structure (its family).
9
Homology modeling includes the following steps:
1. Finding a 'template' protein with high enough sequence identity to the query protein (desirable: at least 30%) [PSI-BLAST]
2. Aligning the two sequences
3. Transferring the coordinates of identical amino acids from the template to the query protein (for non-identical residues, other prediction methods are used)
10
4. Performing energy optimization to get rid of clashes and distortions
11
5.
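Steps 1 to 3 of the list above can be sketched as follows. This is a simplified illustration under stated assumptions: the two sequences are already aligned (gaps written as `-`), identity is counted over aligned, non-gap positions only, and the template coordinates are a plain list with one (x, y, z) tuple per template residue. The function names and the example data are hypothetical.

```python
def percent_identity(aligned_query, aligned_template):
    """Percent identity over positions where neither sequence has a gap."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    aligned = identical = 0
    for q, t in zip(aligned_query, aligned_template):
        if q == "-" or t == "-":
            continue
        aligned += 1
        if q == t:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


def transfer_coordinates(aligned_query, aligned_template, template_coords):
    """Copy template coordinates to query residues that are identical in the alignment.

    Returns a dict mapping query residue index (ungapped) to an (x, y, z) tuple.
    Non-identical and gapped positions are left out; they need other methods.
    """
    model = {}
    q_idx = t_idx = 0
    for q, t in zip(aligned_query, aligned_template):
        if q != "-" and t != "-" and q == t:
            model[q_idx] = template_coords[t_idx]
        if q != "-":
            q_idx += 1
        if t != "-":
            t_idx += 1
    return model


if __name__ == "__main__":
    query = "MK-LV"
    template = "MKALI"
    # Made-up coordinates, one per template residue (M, K, A, L, I).
    coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
              (11.4, 0.0, 0.0), (15.2, 0.0, 0.0)]
    identity = percent_identity(query, template)
    print(identity)
    if identity >= 30.0:  # the 'desirable' threshold from step 1
        print(transfer_coordinates(query, template, coords))
```

The residues missing from the returned dictionary (here the final V of the query) are the ones that still need to be built by other means before energy optimization.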
12
Problems:
1. The number of proteins of known structure that can serve as templates (i.e. > 30% sequence identity) is limited
2. Predicting loops: loops are rich in insertions and deletions, and are therefore difficult to predict
Partial solution: a combination of sequence-based methods and hydrophobicity profiles makes it possible to infer the structure of loops
13
3. Fold recognition (profile)
Basic logic:
The sequence-based statistical tendencies (polarity, exposure, secondary structure) of the query protein are compared to those of other proteins with known structure.
The best match represents the protein whose fold is closest to that of the query protein.
Useful for:
1. Finding the fold of a query protein
2. Predicting whether a query protein has a novel fold
14
3. Fold recognition (profile): steps
1. Each of the 20 amino acids is classified according to 3 basic structure-related statistical tendencies: polarity, solvent exposure and secondary structure
2. Each position in the query protein is assigned a code describing the specific tendencies of this position. This yields a structure-based sequence profile for the query protein
3. The profile is systematically compared to a library containing the profiles of all proteins of known structure
4. A match represents a protein with a similar fold
5. If a match is not found, the query protein is assumed to have a novel fold
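The five steps can be sketched with a toy encoding. Everything specific here is an assumption for illustration: the three-letter class codes, the handful of residues covered, the position-by-position comparison of equal-length profiles (real methods align profiles), and the 0.7 match threshold.

```python
# Hypothetical class codes: polarity (N/P), exposure (B=buried, X=exposed),
# secondary structure tendency (H=helix, E=strand, T=turn). Toy assignments only.
TOY_CLASSES = {
    "A": "NBH", "L": "NBH",
    "E": "PXH", "K": "PXH",
    "V": "NBE", "T": "PXE",
    "G": "NXT",
}
UNKNOWN = "???"


def encode(seq):
    """Step 2: assign each position a code describing its tendencies."""
    return [TOY_CLASSES.get(res, UNKNOWN) for res in seq]


def similarity(profile_a, profile_b):
    """Fraction of positions with identical codes (equal-length profiles only)."""
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and of equal length")
    return sum(a == b for a, b in zip(profile_a, profile_b)) / len(profile_a)


def best_fold(query_seq, library, threshold=0.7):
    """Steps 3 to 5: scan the library; return (None, score) if nothing matches well."""
    query_profile = encode(query_seq)
    best_name, best_score = None, 0.0
    for name, profile in library.items():
        if len(profile) != len(query_profile):
            continue  # simplification: no alignment, so lengths must agree
        score = similarity(query_profile, profile)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score  # treated as a possible novel fold
    return best_name, best_score


if __name__ == "__main__":
    library = {
        "toy_helical_fold": encode("AELKAL"),
        "toy_strand_fold": encode("VTVTVT"),
    }
    print(best_fold("LKAELA", library))
    print(best_fold("GGGGGG", library))
```

Note that `LKAELA` matches the helical entry perfectly even though its sequence differs from `AELKAL`: the comparison is between structural tendencies, not residue identities.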
15
4. Fold recognition (Threading)
A combination of homology modeling and structural profiles
Like homology modeling, it predicts the structure of the query protein based on sequence alignments with template proteins.
However, instead of one 3D model, many low-resolution models are constructed by using different alignments.
The different models are evaluated based on residue-residue preferences in known structures (converted to energy terms by the Boltzmann equation).
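The conversion of residue-residue preferences into energy terms is commonly written as E = -kT ln(f_observed / f_expected). The sketch below applies that relation to made-up contact frequencies and uses the resulting energies to rank two toy models. The frequencies, the contact lists and the function names are all assumptions; kT is taken as roughly 0.592 kcal/mol, its value near room temperature.

```python
import math

KT = 0.592  # kcal/mol at about 298 K (approximate)


def contact_energy(observed, expected, kt=KT):
    """Boltzmann inversion: pairs seen more often than expected get negative energy."""
    if observed <= 0 or expected <= 0:
        raise ValueError("frequencies must be positive")
    return -kt * math.log(observed / expected)


def score_model(contacts, energies):
    """Sum the pair energies over all residue-residue contacts in a model."""
    return sum(energies[frozenset(pair)] for pair in contacts)


if __name__ == "__main__":
    # Made-up (observed, expected) contact frequencies for two residue pairs.
    toy_frequencies = {("L", "V"): (0.04, 0.02), ("K", "K"): (0.01, 0.02)}
    energies = {
        frozenset(pair): contact_energy(obs, exp)
        for pair, (obs, exp) in toy_frequencies.items()
    }
    # Two hypothetical low-resolution models, described only by their contacts.
    models = {
        "model_a": [("L", "V"), ("V", "L")],
        "model_b": [("K", "K")],
    }
    scores = {name: score_model(contacts, energies) for name, contacts in models.items()}
    print(scores)
    print(min(scores, key=scores.get))  # the lowest-energy model is preferred
```

Using `frozenset` as the key makes the pair energy independent of residue order, so an L-V contact and a V-L contact are scored identically.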