Published by Isaiah Taylor. Modified over 10 years ago.
1
Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)
2
The Protein Folding Problem
Central question of molecular biology: Given a particular sequence of amino acid residues (primary structure), what will the tertiary/quaternary structure of the resulting protein be?
Input: AAVIKYGCAL…
Output: φ1 ψ1, φ2 ψ2 … = backbone conformation (no side chains yet)
3
Disulfide Bonds
Two cysteines in close proximity will form a covalent bond, called a disulfide bond, disulfide bridge, or dicysteine bond.
This bond significantly stabilizes tertiary structure.
4
Protein Folding – Biological perspective
Central dogma: Sequence specifies structure
Denature – to unfold a protein back to random coil configuration
–β-mercaptoethanol – breaks disulfide bonds
–Urea or guanidine hydrochloride – denaturant
–Also heat or pH
Anfinsen's experiments
–Denatured ribonuclease
–Spontaneously regained enzymatic activity
–Evidence that it re-folded to native conformation
5
Folding intermediates
Levinthal's paradox – Consider a 100-residue protein. If each residue can take only 3 positions, there are 3^100 ≈ 5 × 10^47 possible conformations.
–If it takes 10^-13 s to convert from one structure to another, exhaustive search would take 1.6 × 10^27 years!
Folding must proceed by progressive stabilization of intermediates
–Molten globules – most secondary structure formed, but much less compact than native conformation.
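The arithmetic behind Levinthal's estimate can be checked in a few lines. This is a minimal sketch; the 365.25-day year is an assumption, since the slide does not say which year length was used.

```python
# Levinthal's estimate: 100 residues, 3 positions each, 1e-13 s per conversion.
residues = 100
positions_per_residue = 3
seconds_per_conversion = 1e-13

conformations = positions_per_residue ** residues   # exact integer, about 5e47
search_seconds = conformations * seconds_per_conversion

# Assumption: a Julian year of 365.25 days.
seconds_per_year = 365.25 * 24 * 3600
search_years = search_seconds / seconds_per_year

print(f"conformations: {conformations:.1e}")
print(f"exhaustive search: {search_years:.1e} years")
```

The result agrees with the slide's figures to the precision quoted there.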
6
Forces driving protein folding
It is believed that hydrophobic collapse is a key driving force for protein folding
–Hydrophobic core
–Polar surface interacting with solvent
Minimum volume (no cavities)
Disulfide bond formation stabilizes
Hydrogen bonds
Polar and electrostatic interactions
7
Folding help
Proteins are, in fact, only marginally stable
–Native state is typically only 5 to 10 kcal/mol more stable than the unfolded form
Many proteins help in folding
–Protein disulfide isomerase – catalyzes shuffling of disulfide bonds
–Chaperones – break up aggregates and (in theory) unfold misfolded proteins
8
The Hydrophobic Core
Hemoglobin A is the protein in red blood cells (erythrocytes) responsible for binding oxygen.
The mutation E6→V in the β chain places a hydrophobic Val on the surface of hemoglobin.
The resulting sticky patch causes hemoglobin S to agglutinate (stick together) and form fibers, which deform the red blood cell and do not carry oxygen efficiently.
Sickle cell anemia was the first identified molecular disease.
9
Sickle Cell Anemia Sequestering hydrophobic residues in the protein core protects proteins from hydrophobic agglutination.
10
Computational Problems in Protein Folding
Two key questions:
–Evaluation – how can we tell a correctly folded protein from an incorrectly folded protein?
H-bonds, electrostatics, hydrophobic effect, etc.
Derive a function, see how well it does on real proteins
–Optimization – once we get an evaluation function, can we optimize it?
Simulated annealing/Monte Carlo
EC (evolutionary computing)
Heuristics
11
Fold Optimization
Simple lattice models (HP-models)
–Two types of residues: hydrophobic and polar
–2-D or 3-D lattice
–The only force is hydrophobic collapse
–Score = number of H–H contacts
12
Scoring Lattice Models
H/P model scoring: count noncovalent hydrophobic interactions.
Sometimes:
–Penalize for buried polar or surface hydrophobic residues
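The basic H/P score can be sketched as follows for a 2-D square lattice. The function name `hp_score` and the coordinate-list input are illustrative choices, not something the slides define. The sketch assumes the fold has no bumps (every residue sits on its own lattice site) and leaves out the optional burial penalties.

```python
def hp_score(sequence, coords):
    """Count H-H contacts: pairs of H residues on adjacent lattice sites
    that are not neighbors along the chain (i.e. noncovalent contacts).

    sequence: string of 'H' and 'P'
    coords:   list of (x, y) lattice sites, one per residue, assumed bump-free
    """
    index_at = {site: i for i, site in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each pair is counted once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbor)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return contacts


# Four residues folded around a unit square: the two end residues touch.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(hp_score("HPPH", square))
```

Chain neighbors are excluded because they are always adjacent on the lattice, whatever the fold.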
13
What can we do with lattice models?
For smaller polypeptides, exhaustive search can be used
–Looking at the best fold, even in such a simple model, can teach us interesting things about the protein folding process
For larger chains, other optimization and search methods must be used
–Greedy, branch and bound
–Evolutionary computing, simulated annealing
–Graph theoretical methods
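For very short chains, exhaustive search can be written as brute-force enumeration of every absolute-direction string. The sketch below assumes a 2-D square lattice and the plain H–H contact score; the helper names (`decode`, `hh_contacts`, `best_fold`) are hypothetical. The number of candidates grows as 4^(n-1), so this is only practical for small n.

```python
from itertools import product

MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def decode(moves):
    """Turn a direction string into lattice sites; None if the walk bumps."""
    x, y = 0, 0
    coords = [(x, y)]
    seen = {(x, y)}
    for m in moves:
        dx, dy = MOVES[m]
        x, y = x + dx, y + dy
        if (x, y) in seen:
            return None
        seen.add((x, y))
        coords.append((x, y))
    return coords


def hh_contacts(sequence, coords):
    """Number of adjacent H-H pairs that are not chain neighbors."""
    index_at = {site: i for i, site in enumerate(coords)}
    total = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbor)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                total += 1
    return total


def best_fold(sequence):
    """Try every direction string; return (best score, one best string)."""
    best_score, best_moves = -1, None
    for moves in product("UDLR", repeat=len(sequence) - 1):
        coords = decode(moves)
        if coords is None:          # skip folds with bumps
            continue
        score = hh_contacts(sequence, coords)
        if score > best_score:
            best_score, best_moves = score, "".join(moves)
    return best_score, best_moves


print(best_fold("HPPHPPH"))
```

Many of the enumerated folds are rotations or mirror images of each other. A more careful search would remove that symmetry, which is one motivation for the pruning methods listed above.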
14
Learning from Lattice Models
The hydrophobic zipper effect (Ken Dill, ~1997)
15
Representing a lattice model
Absolute directions
–UURRDLDRRU
Relative directions
–LFRFRRLLFL
–Advantage: immediate reversals (UD or RL in absolute directions) cannot occur
–Only three directions: LRF
What about bumps? LFRRR
–Bad score
–Use a better representation
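Decoding a relative-direction string can be sketched as follows. The slides do not state a starting orientation, so the initial heading of "up" is an assumption, as are the function names. With that assumption, the string LFRRR ends on a site the chain already occupies, which is the bump mentioned above.

```python
def decode_relative(moves, heading=(0, 1)):
    """Decode L/F/R moves into (x, y) lattice sites, starting at (0, 0).

    heading is the assumed initial direction of travel (default: up).
    Each move first turns (L or R) or keeps the heading (F), then steps.
    """
    x, y = 0, 0
    dx, dy = heading
    coords = [(x, y)]
    for m in moves:
        if m == "L":
            dx, dy = -dy, dx      # rotate 90 degrees counterclockwise
        elif m == "R":
            dx, dy = dy, -dx      # rotate 90 degrees clockwise
        elif m != "F":
            raise ValueError(f"unknown move: {m!r}")
        x, y = x + dx, y + dy
        coords.append((x, y))
    return coords


def count_bumps(coords):
    """Number of residues placed on an already-occupied site."""
    return len(coords) - len(set(coords))


print(count_bumps(decode_relative("LFRFRRLLFL")))  # the slide's example string
print(count_bumps(decode_relative("LFRRR")))       # the slide's bump example
```

Three consecutive turns in the same direction close a unit square, which is why LFRRR collides regardless of the starting orientation.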
16
Preference-order representation
Each position has two preferences
–If it can't have either of the two, it will take the least favorite path if possible
Example: {LR},{FL},{RL},{FR},{RL},{RL},{FR},{RF}
Can still cause bumps: {LF},{FR},{RL},{FL},{RL},{FL},{RF},{RL},{FL}
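One way to decode the preference-order representation is sketched below. Two details are assumptions, because the slides do not specify them: the chain starts at (0, 0) heading up, and when all three directions are blocked the decoder takes the first preference anyway and records a bump. The name `decode_preferences` is illustrative.

```python
# How each relative move transforms the current heading (dx, dy).
TURN = {
    "L": lambda dx, dy: (-dy, dx),
    "F": lambda dx, dy: (dx, dy),
    "R": lambda dx, dy: (dy, -dx),
}


def decode_preferences(prefs, heading=(0, 1)):
    """Decode a list of two-letter preference pairs such as ["LR", "FL"].

    Each pair is assumed to hold two distinct letters from L, F, R.
    Returns (coords, bumps).
    """
    x, y = 0, 0
    coords = [(x, y)]
    occupied = {(x, y)}
    bumps = 0
    for pair in prefs:
        least_favorite = (set("LFR") - set(pair)).pop()
        for move in (pair[0], pair[1], least_favorite):
            dx, dy = TURN[move](*heading)
            if (x + dx, y + dy) not in occupied:
                break                      # first free site in preference order
        else:
            # Assumption: every direction is blocked, so fall back to the
            # first preference and count the collision.
            dx, dy = TURN[pair[0]](*heading)
            bumps += 1
        heading = (dx, dy)
        x, y = x + dx, y + dy
        coords.append((x, y))
        occupied.add((x, y))
    return coords, bumps


clean = ["LR", "FL", "RL", "FR", "RL", "RL", "FR", "RF"]
stuck = ["LF", "FR", "RL", "FL", "RL", "FL", "RF", "RL", "FL"]
print(decode_preferences(clean)[1])
print(decode_preferences(stuck)[1])
```

Under these assumptions the first example from the slide decodes without collisions, while the second walks into a dead end at its last position, where all three neighboring sites are already taken.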
17
Decoding the representation
The optimizer works on the representation, but to score a fold we have to decode it into a structure that lets us check for bumps.
Example: How many bumps in: URDDLLDRURU?
We can do it on graph paper
–Start at 0,0
–Fill in the graph
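The graph-paper procedure translates directly into code. This sketch counts a bump each time a step lands on a site that is already occupied, which is one reasonable reading of "bump"; the function name is illustrative.

```python
STEP = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def count_bumps_absolute(moves):
    """Walk an absolute-direction string from (0, 0) and count collisions."""
    x, y = 0, 0
    occupied = {(x, y)}
    bumps = 0
    for m in moves:
        dx, dy = STEP[m]
        x, y = x + dx, y + dy
        if (x, y) in occupied:
            bumps += 1            # this residue lands on an earlier one
        occupied.add((x, y))
    return bumps


print(count_bumps_absolute("URDDLLDRURU"))  # → 3
```

In the slide's example the last three moves (U, R, U) retrace sites visited earlier, at (0, -1), (1, -1), and (1, 0).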
18
More realistic models
Higher resolution lattices (45° lattice, etc.)
Off-lattice models
–Local moves
–Optimization/search methods and φ/ψ representations
Greedy search
Branch and bound
EC, Monte Carlo, simulated annealing, etc.
19
Threading: Fold recognition
Given:
–Sequence: IVACIVSTEYDVMKAAR…
–A database of molecular coordinates
Map the sequence onto each fold
Evaluate
–Objective 1: improve scoring function
–Objective 2: folding
20
X-Ray Crystallography
~0.5 mm
The crystal is a mosaic of millions of copies of the protein.
As much as 70% is solvent (water)!
May take months (and a green thumb) to grow.
21
X-Ray diffraction
Image is averaged over:
–Space (many copies)
–Time (of the diffraction experiment)
22
The Protein Data Bank
http://www.rcsb.org/pdb/
ATOM 1 N ALA E 1 22.382 47.782 112.975 1.00 24.09 3APR 213
ATOM 2 CA ALA E 1 22.957 47.648 111.613 1.00 22.40 3APR 214
ATOM 3 C ALA E 1 23.572 46.251 111.545 1.00 21.32 3APR 215
ATOM 4 O ALA E 1 23.948 45.688 112.603 1.00 21.54 3APR 216
ATOM 5 CB ALA E 1 23.932 48.787 111.380 1.00 22.79 3APR 217
ATOM 6 N GLY E 2 23.656 45.723 110.336 1.00 19.17 3APR 218
ATOM 7 CA GLY E 2 24.216 44.393 110.087 1.00 17.35 3APR 219
ATOM 8 C GLY E 2 25.653 44.308 110.579 1.00 16.49 3APR 220
ATOM 9 O GLY E 2 26.258 45.296 110.994 1.00 15.35 3APR 221
ATOM 10 N VAL E 3 26.213 43.110 110.521 1.00 16.21 3APR 222
ATOM 11 CA VAL E 3 27.594 42.879 110.975 1.00 16.02 3APR 223
ATOM 12 C VAL E 3 28.569 43.613 110.055 1.00 15.69 3APR 224
ATOM 13 O VAL E 3 28.429 43.444 108.822 1.00 16.43 3APR 225
ATOM 14 CB VAL E 3 27.834 41.363 110.979 1.00 16.66 3APR 226
ATOM 15 CG1 VAL E 3 29.259 41.013 111.404 1.00 17.35 3APR 227
ATOM 16 CG2 VAL E 3 26.811 40.649 111.850 1.00 17.03 3APR 228
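The ATOM records above can be read with a few lines of Python. This is a rough sketch that splits each record on whitespace, which works for the sample shown here. Real PDB files use fixed column positions, so whitespace splitting is an assumption that can fail when fields run together. The field names in the dictionary are illustrative.

```python
import math

# Four records copied from the slide (trailing ID/line-number fields omitted).
SAMPLE = """\
ATOM 1 N ALA E 1 22.382 47.782 112.975 1.00 24.09
ATOM 2 CA ALA E 1 22.957 47.648 111.613 1.00 22.40
ATOM 6 N GLY E 2 23.656 45.723 110.336 1.00 19.17
ATOM 7 CA GLY E 2 24.216 44.393 110.087 1.00 17.35
"""


def parse_atom(line):
    """Split one whitespace-separated ATOM record into named fields."""
    f = line.split()
    return {
        "serial": int(f[1]),        # atom serial number
        "name": f[2],               # atom name, e.g. CA for alpha carbon
        "res_name": f[3],           # residue name
        "chain": f[4],              # chain identifier
        "res_seq": int(f[5]),       # residue sequence number
        "xyz": (float(f[6]), float(f[7]), float(f[8])),  # coordinates in Å
        "occupancy": float(f[9]),
        "b_factor": float(f[10]),   # temperature factor
    }


atoms = [parse_atom(line) for line in SAMPLE.splitlines() if line.startswith("ATOM")]
alpha_carbons = [a for a in atoms if a["name"] == "CA"]

# Distance between the alpha carbons of residues 1 and 2.
d = math.dist(alpha_carbons[0]["xyz"], alpha_carbons[1]["xyz"])
print(f"CA-CA distance between residues 1 and 2: {d:.2f} Å")
```

The distance comes out close to 3.8 Å, the usual spacing between consecutive alpha carbons, which is a quick sanity check that the coordinates were read correctly.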