1
Protein Structure Prediction [Based on Structural Bioinformatics, section VII]
2
Predicting protein 3D structure
Goal: 3D structure from 1D sequence
What kind of fold may the given sequence adopt?
- An existing fold: fold recognition, comparative modeling
- A new fold: ab initio
3
Measuring progress
CASP: Critical Assessment of Structure Prediction
CAFASP: Critical Assessment of Fully Automated Structure Prediction
Targets: unpublished NMR or X-ray structures
Goal: predict the target's 3D structure and submit it for independent, comparative review
4
What Forces Hold the Structure? Hydrogen Bonds
5
What Forces Hold the Structure?
Charge-charge interactions: positively charged groups prefer to sit close to negatively charged groups
Hydrophobic effect
6
What Forces Hold the Structure?
Disulfide bonds
- S-S bonds between cysteine residues
7
Homology modeling
Based on two major observations:
1. The structure of a protein is uniquely defined by its amino acid sequence.
2. Similar sequences adopt practically identical structures, and distantly related sequences still fold into similar structures.
8
Growth of the Protein Data Bank
9
Fraction of New Folds
10
[Rost, Protein Eng. 1999] Two zones of sequence alignment
11
The 7 steps to homology modeling
1. Template recognition and initial alignment
- BLAST, FASTA
2. Alignment correction
- Better alignment, MSA
12
The 7 steps to homology modeling
3. Backbone generation
- Copy backbone atoms [and side-chains of conserved residues]
4. Loop modeling
- Knowledge based
- Energy based
13
The 7 steps to homology modeling
5. Side-chain modeling
- Rotamer: a low-energy side-chain conformation
- Rotamer library [backbone-independent or backbone-dependent]
- HUGE search space [~5^N]
High accuracy for residues in the hydrophobic core [90%], much lower for residues on the surface [50%]
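To get a feel for the ~5^N estimate, the following minimal sketch counts side-chain assignments under the assumption that N is the number of residues and that each residue independently picks one of about 5 rotamers. The function name `rotamer_combinations` is made up for this illustration.

```python
def rotamer_combinations(n_residues: int, rotamers_per_residue: int = 5) -> int:
    """Count side-chain assignments when every residue picks a rotamer independently.

    Assumption for illustration: the same number of rotamers (5) for every residue.
    """
    return rotamers_per_residue ** n_residues


if __name__ == "__main__":
    # Print the size of the search space for a few chain lengths.
    for n in (10, 50, 100):
        print(f"N = {n:3d}: {rotamer_combinations(n):.2e} combinations")
```

Even for a small protein the count is far beyond exhaustive enumeration, which is why rotamer libraries are combined with heuristic or pruned searches.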
14
The 7 steps to homology modeling
6. Model optimization
- Predict the side-chains, then the resulting shifts in the backbone, then the rotamers for the new backbone …
7. Model validation
- Calculating the model’s energy
- Determination of normality indices:
  - bond lengths, bond and torsion angles
  - inside/outside distribution of polar residues
  - radial distribution function
15
Predicting protein 3D structure
Goal: 3D structure from 1D sequence
What kind of fold may the given sequence adopt?
- An existing fold: fold recognition, comparative modeling
- A new fold: ab initio
16
Fold recognition Which of the known folds is likely to be similar to the (unknown) fold of a new protein when only its amino-acid sequence is known?
17
Fraction of new folds (PDB new entries in 1998) Koppensteiner et al., 2000, JMB 296:1139-1152.
18
Unrelated proteins adopt similar folds
Only 100 folds account for ~50% of all protein superfamilies
Possible explanations:
1. Divergent evolution
2. Convergent evolution
3. Limited number of folds
4. Misguided analysis
19
Proteins as seen by a Biologist
Does a new protein sequence belong to a given family of proteins (with a specific set of mutation rules)?
Fold recognition is based on:
- Sequence alignment, multiple sequence alignment
- Profile HMM, PSI-BLAST
20
Proteins as seen by a Physicist
“Thermodynamic hypothesis”: the native conformation of a protein corresponds to a global free energy minimum of the system (protein + solvent)
Naïve approach: given a correct energy function, search the conformational space for the native structure
21
Threading
Threading: energy-based fold recognition
Define:
1. Protein model and interaction description
2. Alignment algorithm
3. Energy parameterization
[Figure: the sequence A C C E C A D A A C placed on structure positions 1-10, scored with a pairwise energy table E_ab]
E_ab    A    C    D    E  ...
A      -3   -1    0    0  ...
C      -1   -4    1    2  ...
D       0    1    5    6  ...
E       0    2    6    7  ...
...
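As a rough illustration of how a pairwise table such as E_ab scores a sequence placed on a fold, the sketch below sums E_ab over residue pairs that are in contact. The energy values are the ones on the slide; the contact list is hypothetical (the slide does not give one), and `threading_energy` is a name invented for this example.

```python
# Pairwise energies E_ab from the slide's example table (symmetric, so only
# the upper triangle is stored).
E_AB = {
    ("A", "A"): -3, ("A", "C"): -1, ("A", "D"): 0, ("A", "E"): 0,
    ("C", "C"): -4, ("C", "D"): 1, ("C", "E"): 2,
    ("D", "D"): 5, ("D", "E"): 6,
    ("E", "E"): 7,
}


def pair_energy(a: str, b: str) -> int:
    """Look up E_ab regardless of the order of the two residue types."""
    return E_AB[(a, b)] if (a, b) in E_AB else E_AB[(b, a)]


def threading_energy(sequence: str, contacts: list[tuple[int, int]]) -> int:
    """Sum pairwise energies over structure positions that are in contact.

    `contacts` holds 0-based position pairs defined by the template fold.
    """
    return sum(pair_energy(sequence[i], sequence[j]) for i, j in contacts)


# Hypothetical contact map for a 10-position template (not from the slide).
example_sequence = "ACCECADAAC"
example_contacts = [(0, 4), (1, 2), (3, 6), (5, 8)]

if __name__ == "__main__":
    print(threading_energy(example_sequence, example_contacts))
```

In a real threading method the alignment of sequence to structure positions is itself optimized; here the sequence is simply assumed to occupy positions 1-10 in order.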
22
Fold recognition (threading)
Find the best fold for a protein sequence
[Figure: the query sequence MAHFPGFGQSLLFGYPVYVFGD... is scored against each potential fold 1) ... 56) ... n); example scores shown: -10 ... -123 ... 20.5]
23
GenTHREADER (Jones, 1999, JMB 287:797-815)
For each template provide an MSA
- align the query sequence with the MSA
- assess the alignment by sequence alignment score
- assess the alignment by pairwise potentials
- assess the alignment by solvation function
- record lengths of: alignment, query, template
24
Essentials of GenTHREADER
25
Predicting protein 3D structure
Goal: 3D structure from 1D sequence
What kind of fold may the given sequence adopt?
- An existing fold: fold recognition, comparative modeling
- A new fold: ab initio
26
Ab-initio folding
Goal: predict structure from “first principles”
Requires:
- A free energy function, sufficiently close to the “true potential”
- A method for searching the conformational space
Benefits:
- Works for novel folds
- Shows that we understand the process
27
Ab-initio folding – the challenge
1. Current potential functions have limited accuracy
2. The conformational space is HUGE
Possible simplifications:
- Reduced representation
- Simplified potentials
- Coarse search strategies
28
Representation
Detailed representation – includes all atoms of the protein and the surrounding solvent; computationally expensive
Implicit solvent models
United atom representation
Side-chain as centroid or Cα
Restricted side-chain configurations (rotamers)
Restricted backbone torsion angles
29
Rosetta [Simons et al. 1997]
“Structural” signatures recur within protein structures
Use these as cues during structure search
I-sites Library – a catalog of local sequence-structure correlations
Examples: serine hairpin, Type-I hairpin, frayed helix
30
Rosetta: a folding simulation program
Fragment insertion Monte Carlo:
1. Choose a fragment from the fragment library (backbone torsion angles)
2. Change the backbone angles
3. Convert to 3D
4. Evaluate with the energy function
5. Accept or reject
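The fragment insertion loop can be sketched in a few lines. This is a toy under stated assumptions, not Rosetta's implementation: the fragment library, the starting conformation, and `toy_energy` are all hypothetical, the energy is evaluated directly on torsion angles (the "convert to 3D" step is skipped), and acceptance uses the standard Metropolis criterion.

```python
import math
import random

Angles = list[tuple[float, float]]  # one (phi, psi) pair per residue, in degrees


def toy_energy(angles: Angles) -> float:
    """Hypothetical stand-in for an energy function.

    Penalizes deviation from helix-like torsions (-60, -45); a real method
    would build 3D coordinates and score those instead.
    """
    return sum((phi + 60.0) ** 2 + (psi + 45.0) ** 2 for phi, psi in angles) / 1000.0


def fragment_insertion_mc(n_residues: int, fragments: list[Angles],
                          steps: int = 2000, temperature: float = 1.0,
                          seed: int = 0) -> tuple[Angles, float]:
    """Monte Carlo search that replaces windows of torsion angles with fragments."""
    rng = random.Random(seed)
    frag_len = len(fragments[0])
    angles: Angles = [(-120.0, 120.0)] * n_residues  # start from an extended chain
    energy = toy_energy(angles)
    for _ in range(steps):
        fragment = rng.choice(fragments)                     # choose a fragment
        start = rng.randrange(n_residues - frag_len + 1)     # choose where to insert it
        trial = angles[:start] + list(fragment) + angles[start + frag_len:]
        trial_energy = toy_energy(trial)                     # evaluate
        delta = trial_energy - energy
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            angles, energy = trial, trial_energy
    return angles, energy


# Hypothetical 3-residue fragment library (all fragments have the same length).
example_fragments: list[Angles] = [
    [(-60.0, -45.0)] * 3,
    [(-120.0, 120.0)] * 3,
    [(-90.0, 0.0), (-60.0, -45.0), (-120.0, 120.0)],
]

if __name__ == "__main__":
    final_angles, final_energy = fragment_insertion_mc(12, example_fragments)
    print(final_energy)
```

Working in torsion space keeps every move chemically plausible at the local level, because each inserted window comes from a known structure.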
31
Potential functions
Molecular mechanics – models the forces that determine protein conformation
- Van der Waals: Lennard-Jones 12-6
- Electrostatic: Coulomb’s law
Scoring functions – empirically derived from solved structures
- Useful with reduced complexity models
- Useful in treating aspects of protein thermodynamics
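For reference, the two molecular mechanics terms named on this slide have simple closed forms. The sketch below uses the textbook Lennard-Jones 12-6 expression and Coulomb's law; the parameter defaults (epsilon, sigma, dielectric) are placeholders, and the conversion constant of about 332 kcal·Å/(mol·e²) is the approximate value commonly used when distances are in Å and charges in units of the electron charge.

```python
def lennard_jones(r: float, epsilon: float = 1.0, sigma: float = 1.0) -> float:
    """Lennard-Jones 12-6 energy: 4*eps*[(sigma/r)^12 - (sigma/r)^6].

    The r^-12 term models short-range repulsion, the r^-6 term attraction.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Approximate Coulomb constant in kcal*Angstrom/(mol*e^2).
COULOMB_CONSTANT = 332.06


def coulomb(r: float, q1: float, q2: float, dielectric: float = 1.0) -> float:
    """Coulomb energy between two point charges separated by r (Angstrom)."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


if __name__ == "__main__":
    r_min = 2.0 ** (1.0 / 6.0)   # location of the Lennard-Jones minimum for sigma = 1
    print(lennard_jones(r_min))  # equals -epsilon up to rounding
    print(coulomb(3.0, 1.0, -1.0))
```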
32
Search methods
Molecular dynamics – simulates the motion of a molecule in a given potential
Impractical …
Coarse sampling of the energy landscape: simulated annealing, genetic algorithms, …
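Simulated annealing, one of the coarse sampling strategies listed here, can be illustrated on a one-dimensional toy landscape. Everything in this sketch is an assumption made for the example: the function `rugged_energy`, the geometric cooling schedule, and the step size. A protein search would use a conformational move set and an energy function instead.

```python
import math
import random
from typing import Callable


def rugged_energy(x: float) -> float:
    """Toy landscape with several local minima (hypothetical, for illustration)."""
    return 0.1 * x * x + math.sin(3.0 * x)


def simulated_annealing(energy: Callable[[float], float], x0: float,
                        steps: int = 5000, t_start: float = 2.0,
                        t_end: float = 0.01, step_size: float = 0.5,
                        seed: int = 0) -> tuple[float, float]:
    """Metropolis search with a temperature that decays geometrically."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for k in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (k / (steps - 1))
        x_new = x + rng.uniform(-step_size, step_size)
        e_new = energy(x_new)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


if __name__ == "__main__":
    print(simulated_annealing(rugged_energy, x0=4.0))
```

High temperatures early on let the search cross barriers between minima; the low temperatures at the end turn it into a local refinement.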