Protein Structure Prediction [Based on Structural Bioinformatics, section VII]


1 Protein Structure Prediction [Based on Structural Bioinformatics, section VII]

2 Predicting protein 3D structure
Goal: 3D structure from 1D sequence
What kind of fold may the given sequence adopt?
- An existing fold: fold recognition, comparative modeling
- A new fold: ab initio

3 Measuring progress
CASP: Critical Assessment of Structure Prediction
CAFASP: Critical Assessment of Fully Automated Structure Prediction
Targets: unpublished NMR or X-ray structures
Goal: predict the target 3D structure and submit it for independent and comparative review

4 What Forces Hold the Structure? Hydrogen Bonds

5 What Forces Hold the Structure?
Charge-charge interactions: positively charged groups prefer to be situated against negatively charged groups
Hydrophobic effect

6 What Forces Hold the Structure?
Disulfide bonds
- S-S bonds between cysteine residues

7 Homology modeling
Based on two major observations:
1. The structure of a protein is uniquely defined by its amino acid sequence.
2. Similar sequences adopt practically identical structures; distantly related sequences still fold into similar structures.

8 Growth of the Protein Data Bank

9 Fraction of New Folds

10 [Rost, Protein Eng. 1999] Two zones of sequence alignment

11 The 7 steps to homology modeling
1. Template recognition and initial alignment
   - BLAST, FASTA
2. Alignment correction
   - Better alignment, MSA

12 The 7 steps to homology modeling
3. Backbone generation
   - Copy backbone atoms [and side-chains of conserved residues]
4. Loop modeling
   - Knowledge based
   - Energy based

13 The 7 steps to homology modeling
5. Side-chain modeling
   - Rotamer: a low-energy side-chain conformation
   - Rotamer library [backbone independent or backbone dependent]
   - HUGE search space [~5^N for N residues]
High accuracy for residues in the hydrophobic core [90%], much lower for residues on the surface [50%]
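
To get a feel for the ~5^N estimate, the sketch below counts the side-chain assignments that an exhaustive search would have to visit. It assumes, as a simplification, that every residue has the same number of rotamers (5 by default); real rotamer libraries vary by residue type. The function name is chosen here for illustration only.

```python
def rotamer_combinations(n_residues: int, rotamers_per_residue: int = 5) -> int:
    """Count side-chain assignments if each residue independently picks one rotamer.

    Assumption: the same number of rotamers for every residue.
    """
    return rotamers_per_residue ** n_residues


# Print the count in scientific notation for a few chain lengths.
for n in (10, 50, 100):
    print(n, f"{rotamer_combinations(n):.2e}")
```

Even a short chain is far beyond exhaustive enumeration, which is why the optimization step is done iteratively rather than by brute force.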

14 The 7 steps to homology modeling
6. Model optimization
   - Predict the side-chains, then the resulting shifts in the backbone, then the rotamers for the new backbone …
7. Model validation
   - Calculating the model's energy
   - Determination of normality indices:
     - bond lengths, bond and torsion angles
     - Inside/outside distribution of polar residues
     - Radial distribution function

15 Predicting protein 3D structure
Goal: 3D structure from 1D sequence
What kind of fold may the given sequence adopt?
- An existing fold: fold recognition, comparative modeling
- A new fold: ab initio

16 Fold recognition Which of the known folds is likely to be similar to the (unknown) fold of a new protein when only its amino-acid sequence is known?

17 Fraction of new folds (PDB new entries in 1998) Koppensteiner et al., 2000, JMB 296:1139-1152.

18 Unrelated proteins adopt similar folds
Only 100 folds account for ~50% of all protein superfamilies
Possible explanations:
1. Divergent evolution
2. Convergent evolution
3. Limited number of folds
4. Misguided analysis

19 Proteins as seen by a Biologist
Does a new protein sequence belong to a given family of proteins (with a specific set of mutation rules)?
Fold recognition is based on:
- Sequence alignment, multiple sequence alignment
- Profile HMM, PSI-BLAST

20 Proteins as seen by a Physicist
“Thermodynamic hypothesis”: the native conformation of a protein corresponds to a global free energy minimum of the system (protein + solvent)
Naïve approach: given a correct energy function, search for the native structure in the conformational space

21 Threading
Threading: energy-based fold recognition
Define:
1. Protein model and interaction description
2. Alignment algorithm
3. Energy parameterization
[Figure: a template structure with positions numbered 1-10, each labeled with a residue type (A, C, D or E)]
Pairwise interaction energies E_ab:
       A    C    D    E   ...
  A   -3   -1    0    0   ...
  C   -1   -4    1    2   ...
  D    0    1    5    6   ...
  E    0    2    6    7   ...
  ...
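
As a minimal sketch of how a pairwise energy table like the one on this slide can score a sequence on a template, the code below sums E_ab over the template's contacts. The matrix values are the ones shown on the slide; the contact list, the gapless alignment, and the function names are assumptions made here for illustration, not part of the original presentation.

```python
# Pairwise contact energies E_ab for residue types A, C, D, E (values from the slide).
RESIDUES = "ACDE"
E = [
    [-3, -1, 0, 0],
    [-1, -4, 1, 2],
    [0, 1, 5, 6],
    [0, 2, 6, 7],
]
INDEX = {aa: i for i, aa in enumerate(RESIDUES)}


def pair_energy(a: str, b: str) -> int:
    """Look up E_ab for two residue types."""
    return E[INDEX[a]][INDEX[b]]


def threading_energy(sequence: str, contacts: list[tuple[int, int]]) -> int:
    """Sum E_ab over template contacts (i, j), given as 0-based positions.

    Assumption: gapless alignment, so residue k of the sequence sits at
    template position k.
    """
    return sum(pair_energy(sequence[i], sequence[j]) for i, j in contacts)


# Hypothetical contact map for a 5-position template.
CONTACTS = [(0, 3), (1, 4), (0, 4)]

for seq in ("ACDCA", "DEDED"):
    print(seq, threading_energy(seq, CONTACTS))
```

With this table a lower total means a better fit, so a sequence that places A and C residues at contacting positions scores better on this template than one that places D and E there.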

22 Find best fold for a protein sequence: Fold recognition (threading)
Query sequence: MAHFPGFGQSLLFGYPVYVFGD...
Potential fold   1) ...   -10
                 ...
                56) ...   -123
                 ...
                 n) ...   20.5

23 GenTHREADER (Jones, 1999, JMB 287:797-815)
For each template provide MSA
- align the query sequence with the MSA
- assess the alignment by sequence alignment score
- assess the alignment by pairwise potentials
- assess the alignment by solvation function
- record lengths of: alignment, query, template

24 Essentials of GenTHREADER

25 Predicting protein 3D structure
Goal: 3D structure from 1D sequence
What kind of fold may the given sequence adopt?
- An existing fold: fold recognition, comparative modeling
- A new fold: ab initio

26 Ab-initio folding
Goal: Predict structure from “first principles”
Requires:
- A free energy function, sufficiently close to the “true potential”
- A method for searching the conformational space
Benefits:
- Works for novel folds
- Shows that we understand the process

27 Ab-initio folding – the challenge
1. Current potential functions have limited accuracy
2. The conformational space is HUGE
Possible simplifications:
- Reduced representation
- Simplified potentials
- Coarse search strategies

28 Representation
Detailed representation: include all atoms of the protein and the surrounding solvent, which is computationally expensive
- Implicit solvent models
- United atom representation
- Side-chain as centroid or Cα
- Restricted side-chain configurations (rotamers)
- Restricted backbone torsion angles

29 Rosetta [Simons et al. 1997]
“Structural” signatures recur within protein structures
Use these as cues during structure search
I-sites Library: a catalog of local sequence-structure correlations
Examples: serine hairpin, type-I hairpin, frayed helix

30 Fragment insertion Monte Carlo
Rosetta: a folding simulation program
[Flowchart, repeated in a loop]
1. Choose a fragment (from the fragments library)
2. Change backbone torsion angles
3. Convert to 3D
4. Evaluate (energy function)
5. Accept or reject
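
The loop on this slide can be sketched roughly as follows. This is a toy illustration, not Rosetta's implementation: the energy function is a hypothetical stand-in that works directly on torsion angles (so the "convert to 3D" step is skipped), the fragment library holds two made-up three-residue fragments, and the accept/reject step is assumed to use the standard Metropolis criterion.

```python
import math
import random


def toy_energy(torsions):
    """Hypothetical stand-in for an energy function: favors helix-like (phi, psi)."""
    return sum((phi + 60.0) ** 2 + (psi + 45.0) ** 2 for phi, psi in torsions) / 1000.0


def fragment_insertion_mc(torsions, fragments, steps=2000, temperature=1.0, seed=0):
    """Fragment-insertion Monte Carlo over a list of (phi, psi) pairs."""
    rng = random.Random(seed)
    current = list(torsions)
    current_e = toy_energy(current)
    for _ in range(steps):
        # Choose a fragment and a window in the chain to overwrite.
        fragment = rng.choice(fragments)
        start = rng.randrange(len(current) - len(fragment) + 1)
        # Change backbone angles: copy the fragment's torsions into the window.
        trial = current[:start] + list(fragment) + current[start + len(fragment):]
        # Evaluate (a real program would first build 3D coordinates here).
        trial_e = toy_energy(trial)
        # Accept or reject (Metropolis criterion, assumed here).
        delta = trial_e - current_e
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = trial, trial_e
    return current, current_e


# Hypothetical fragment library: one helix-like and one strand-like fragment.
FRAGMENTS = [
    [(-60.0, -45.0)] * 3,
    [(-120.0, 130.0)] * 3,
]
START = [(-150.0, 150.0)] * 9  # extended starting chain

final, final_e = fragment_insertion_mc(START, FRAGMENTS)
print(toy_energy(START), final_e)
```

Because each move replaces a whole window of torsion angles with values taken from a library, the search only ever visits locally plausible backbone conformations, which is the point of using fragments.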

31 Potential functions
Molecular mechanics: models the forces that determine protein conformation
- Van der Waals: Lennard-Jones 12-6
- Electrostatic: Coulomb's law
Scoring functions: empirically derived from solved structures
- Useful with reduced complexity models
- Useful in treating aspects of protein thermodynamics
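
The two molecular-mechanics terms named on this slide have simple closed forms, sketched below. The parameter names (`epsilon`, `sigma`, `dielectric`) and the unit convention (distances in Å, charges in units of e, energies in kcal/mol, with a conversion constant of about 332) are assumptions chosen for illustration; force fields differ in the exact constants and parameters they use.

```python
def lennard_jones(r: float, epsilon: float, sigma: float) -> float:
    """12-6 Lennard-Jones energy: 4 * epsilon * ((sigma/r)**12 - (sigma/r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Approximate Coulomb conversion constant in kcal * Angstrom / (mol * e^2) (assumed units).
COULOMB_CONSTANT = 332.06


def coulomb(r: float, q1: float, q2: float, dielectric: float = 1.0) -> float:
    """Coulomb energy between two point charges separated by distance r."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


# The Lennard-Jones term is zero at r = sigma and has its minimum, -epsilon,
# at r = 2**(1/6) * sigma.
r_min = 2.0 ** (1.0 / 6.0)
print(lennard_jones(1.0, 0.5, 1.0), lennard_jones(r_min, 0.5, 1.0))
print(coulomb(3.0, 1.0, -1.0))
```

The steep r^-12 repulsion keeps atoms from overlapping, the r^-6 term gives a weak attraction at short range, and the Coulomb term is attractive for opposite charges and repulsive for like charges.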

32 Search methods
Molecular dynamics: simulates the motion of a molecule in a given potential
- Impractical …
Coarse sampling of energy landscape:
- Simulated annealing, genetic algorithms, …
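
As a rough illustration of one of the coarse sampling strategies named here, the sketch below runs simulated annealing on a made-up one-dimensional energy landscape. The landscape, the geometric cooling schedule, the step size, and all parameter names are assumptions for illustration; a real folding search works over a far higher-dimensional conformational space.

```python
import math
import random


def rugged_energy(x: float) -> float:
    """Hypothetical 1D landscape with several local minima."""
    return x * x + 10.0 * math.sin(3.0 * x)


def simulated_annealing(energy, x0, t_start=10.0, t_end=0.01, cooling=0.99,
                        step=0.5, seed=0):
    """Minimize `energy` from x0; returns the best point seen and its energy."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    t = t_start
    while t > t_end:
        # Propose a random move near the current point.
        candidate = x + rng.uniform(-step, step)
        candidate_e = energy(candidate)
        delta = candidate_e - e
        # Always accept downhill moves; accept uphill moves with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, e = candidate, candidate_e
            if e < best_e:
                best_x, best_e = x, e
        t *= cooling  # geometric cooling schedule (assumed)
    return best_x, best_e


best_x, best_e = simulated_annealing(rugged_energy, x0=4.0)
print(best_x, best_e)
```

Accepting some uphill moves at high temperature lets the search escape local minima early on, while the low-temperature end of the schedule behaves like a local downhill refinement.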

