Published by Rachel Hunter. Modified over 9 years ago.
1
4. Modeling of side chains
2
Side chain modeling is part of structure prediction
Protein structure prediction:
– given: sequence of protein
– predict: structure of protein
Challenges:
– conformation space. Goal: describe the continuous, immense space of conformations in an efficient and representative way
– realistic energy function. Goal: energy minimum at or near the experimentally derived (native) structure
– efficient and reliable search algorithm. Goal: locate the minimum (global minimum energy conformation, GMEC)
Prediction of side chain conformations:
– subtask of protein structure prediction
3
The importance of side chain modeling
Side chain prediction is a subtask of protein structure prediction:
– given: correct backbone conformation
– predict: side chain conformations (i.e. whole protein)
– successful prediction of protein structure depends on successful prediction of the side chain conformations
– complete details are not solved by experiment
– allows evaluation of a protocol at the detailed, full-atom level
– allows flexibility in docking
4
Today’s menu: prediction of side chain conformations
1. rotamer libraries
2. dependence on backbone accuracy
3. approaches that locate GMEC or MECs: Rosetta & other approaches (DEE: dead-end elimination; SCWRL; BP: belief propagation; LP: linear integer programming)
5
Side chains are described as rotamers
Dihedral angles ($\chi$) define the side chain conformation (assuming equilibrium bond lengths and angles).
Figure: from Wikipedia.
6
Side chains assume discrete conformations
Serine $\chi_1$ preferences: t = 180°, g− = −60°, g+ = +60°
Staggered conformations minimize collisions with neighboring atoms.
Lovell, 2000
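The assignment of a measured dihedral angle to one of the three staggered wells can be sketched as follows. This is a minimal illustration, not taken from the slides: the function names `dihedral` and `classify_chi` are hypothetical, and the well centers (g+ = +60°, t = 180°, g− = −60°) follow the convention stated on this slide.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four 3D points."""
    def sub(a, b):
        return [a[k] - b[k] for k in range(3)]
    def dot(a, b):
        return sum(a[k] * b[k] for k in range(3))
    def cross(a, b):
        return [a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0]]
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)   # normals of the two planes
    x = dot(n1, n2)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    return math.degrees(math.atan2(y, x))

# Well centers as given on the slide (assumed convention).
WELLS = {"g+": 60.0, "t": 180.0, "g-": -60.0}

def classify_chi(angle_deg):
    """Return the name of the nearest staggered well for a chi angle."""
    def circular_distance(a, b):
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)
    return min(WELLS, key=lambda name: circular_distance(angle_deg, WELLS[name]))

if __name__ == "__main__":
    chi = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
    print(round(chi, 1), classify_chi(chi))
```

For real structures the four points would be the atoms that define the angle (for $\chi_1$: N, Cα, Cβ and the γ atom).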
7
Rotamer libraries contain preferred conformations
Rotamer: discrete side chain conformation defined by its $\chi$ dihedral angles
Libraries: Dunbrack, 2002; Shapovalov and Dunbrack, 2011* (table row: BBDEP, 3854, 1.8)
* Shapovalov & Dunbrack, Structure 2011
8
Representative rotamer libraries are surprisingly small
Ponder & Richards, 1987:
– analysis of ~20 proteins (~2000 side chains)
– 67 rotamers can adequately represent side chain conformations (for 17/20 aa)
9
Backbone dependent rotamer libraries
Dunbrack & Karplus, 1993:
– for each (20° x 20°) $(\phi, \psi)$ bin, derive statistics on $\chi$ values
– reflects dependence of side chain conformation on backbone conformation
10
Rotamer preferences depend on backbone conformation: example Valine
Observed frequencies of gauche+, gauche− and trans are very different in different backbone conformations: sheet, helix, and coil regions
(n = 850 proteins, <1.7 Å resolution, and pairwise seqid < 50%)
11
Bayesian statistical analysis of rotamer library (Dunbrack 1997)
Use Bayesian statistics to estimate populations for all rotamers, of all side chain types, for each (10° x 10°) $(\phi, \psi)$ bin: $P(\chi_1, \chi_2, \chi_3, \chi_4 \mid \phi, \psi)$
Using Bayesian formalism, combine
– a prior distribution based on $P(\chi \mid \phi) \cdot P(\chi \mid \psi)$
– fully $(\phi, \psi)$ dependent data
… to describe both well-sampled regions and sparsely sampled regions
12
Rotamer energy ($E_{dun}$): a knowledge-based score
1. Calculate $p_{obs}$: frequencies of rotamers (or any other feature)
2. Convert into an effective potential energy using the Boltzmann equation: $\Delta G = -RT \ln(p_{obs}/p_{exp})$
Boas & Harbury, 2007
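The conversion from an observed frequency to an effective energy can be sketched as below. This is an illustrative sketch only: the function name, the temperature default and the choice of kcal/mol units are assumptions, and the reference probability $p_{exp}$ must be supplied by the caller.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def knowledge_based_energy(p_obs, p_exp, temperature=298.15):
    """Boltzmann inversion: dG = -RT ln(p_obs / p_exp), in kcal/mol."""
    if p_obs <= 0.0 or p_exp <= 0.0:
        raise ValueError("probabilities must be positive")
    return -R_KCAL * temperature * math.log(p_obs / p_exp)

if __name__ == "__main__":
    # Hypothetical frequencies for three chi1 wells, compared with a
    # uniform expectation of 1/3 per well.
    observed = {"g+": 0.60, "t": 0.35, "g-": 0.05}
    for name, p in observed.items():
        print(name, round(knowledge_based_energy(p, 1.0 / 3.0), 2))
```

Rotamers observed more often than expected get a negative (favorable) score, and rare rotamers get a positive one.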
13
Structure determination revisited
Refit electron density maps:
– 15% of non-rotameric side chains can be refitted to 1 (or 2) rotameric conformations
(Shapovalov & Dunbrack, 2007)
14
Structure determination revisited
Refit electron density maps:
– rotameric side chains have lower entropy (dispersion of electron density around $\chi$) than side chains with multiple conformations in the PDB, or non-rotameric side chains
Figure: $\chi_1$ entropy by residue type.
(Shapovalov & Dunbrack, 2007)
15
2011: Improved Dunbrack library
Many good reasons:
1. More structural data
2. Improved set: electron density calculations remove highly dynamic side chains
3. Derive accurate and smooth density estimates of rotamer populations (incl. rare rotamers) as a continuous function of backbone dihedral angles
4. Derive smooth estimates of the mean values and variances of rotameric side-chain dihedral angles
5. Improve treatment of non-rotameric degrees of freedom
Shapovalov & Dunbrack, 2011
16
The 2011 Dunbrack library
Calculate the rotamer preference for a given $(\phi, \psi)$ bin. Adaptive kernel density estimation allows:
– smoother density function (prevents steep derivatives in Rosetta minimizations!)
– more detailed binning
1. For each rotamer r of aa: determine a probability density estimate $\rho(\phi, \psi \mid r)$ (= Ramachandran distribution for each rotamer)
2. Use Bayes’ rule to invert this density to produce an estimate of the rotamer probability: $P(r \mid \phi, \psi) = \frac{P(r)\,\rho(\phi, \psi \mid r)}{\sum_{r'} P(r')\,\rho(\phi, \psi \mid r')}$
$P(r)$: backbone independent probability of rotamer r
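The two steps can be sketched on toy data as follows. This is a simplified illustration, not the published procedure: it uses a fixed-bandwidth von Mises kernel rather than an adaptive one, the density is only defined up to a constant factor (which cancels in Bayes' rule because the same bandwidth is used for every rotamer), and all names and sample values are hypothetical.

```python
import math

def kernel_density(phi, psi, samples, kappa=10.0):
    """Unnormalized von Mises kernel estimate of rho(phi, psi | r).

    samples: list of (phi, psi) pairs in degrees observed for one rotamer.
    kappa:   fixed concentration (assumption; the 2011 library adapts it).
    """
    total = 0.0
    for s_phi, s_psi in samples:
        total += math.exp(
            kappa * (math.cos(math.radians(phi - s_phi)) - 1.0)
            + kappa * (math.cos(math.radians(psi - s_psi)) - 1.0)
        )
    return total / len(samples)

def rotamer_posterior(phi, psi, prior, samples_by_rotamer):
    """Bayes' rule: P(r | phi, psi) is proportional to P(r) * rho(phi, psi | r)."""
    joint = {
        r: prior[r] * kernel_density(phi, psi, samples_by_rotamer[r])
        for r in prior
    }
    norm = sum(joint.values())
    return {r: value / norm for r, value in joint.items()}

if __name__ == "__main__":
    # Hypothetical observations: g+ seen mostly near helical backbone,
    # t seen mostly near extended backbone.
    samples = {
        "g+": [(-60, -45), (-65, -40), (-58, -50)],
        "t": [(-120, 130), (-125, 135), (-115, 125)],
    }
    prior = {"g+": 0.5, "t": 0.5}
    print(rotamer_posterior(-60, -45, prior, samples))
```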
17
Smoother density function
$P(r = g+ \mid \phi, \psi, aa = Ser)$
Figure panels: histogram; original probability density; using adaptive density kernels (integrate over a neighborhood of adaptive size).
18
Better description of non-rotameric side chains
Not all side chain atoms show a rotameric distribution.
Example: GLN $\chi_3$ angles for ($\chi_1$ = g+; $\chi_2$ = t)
Figure: original library vs. new library; panels Met $\chi_1$ (sp3) and Gln $\chi_3$ (sp2); alpha helix, beta sheet, loops (polyP II).
19
Better description of non-rotameric side chains
Example: ASN $\chi_2$ angles for ($\chi_1$ = g+)
20
… leads to a slight improvement in modeling
21
Some conclusions about rotamer libraries
Rotamer frequency:
– rare conformations reflect increased internal strain: important to take frequency into account
– frequency can be used as an energy term: $E_i = -K \ln P_i$
Increasing availability of high-resolution structures narrows the distribution around each rotamer in the library:
– indicates that errors are responsible for outliers
Refitting of electron density maps:
– non-rotameric conformations are often incorrectly modeled and high in entropy
22
Some conclusions about rotamer libraries
Rotamericity <100%: include more side chain conformations!
– Position-dependent rotamers (example: unbound conformations in docking predictions)
– Additional conformations around rotamer (± sd)
– Non-rotameric side chain angles: describe as continuous density function
23
Today’s menu: prediction of side chain conformations
1. rotamer libraries
2. dependence on backbone accuracy
3. approaches that locate GMEC or MECs: Rosetta & other approaches (DEE: dead-end elimination; SCWRL; BP: belief propagation; LP: linear integer programming)
24
Backrub motions: “How protein backbone shrugs when side chain dances”
– Most common local backbone move in ultra-high resolution structures (<1.0 Å)
– Changes side chain orientation without effect on backbone
– 3 rotations around $C_\alpha$-$C_\alpha$ axes
– In 3% of all residues (1/4 = Serine)
– Two distinct rotamers related by backrub moves for Ile (tt, mm)
Figure: change of 1,3; compensatory changes of 1,2 and 2,3.
Davis, 2006
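The geometric core of such a move, rotating an atom about the axis through two $C_\alpha$ atoms, can be sketched with Rodrigues' rotation formula. This is a minimal sketch under stated assumptions: it shows only a single rotation about one axis (not the two compensatory rotations), and the function name and coordinates are hypothetical.

```python
import math

def rotate_about_axis(point, axis_start, axis_end, angle_deg):
    """Rotate `point` by `angle_deg` about the line axis_start -> axis_end."""
    # Unit vector along the rotation axis.
    k = [e - s for s, e in zip(axis_start, axis_end)]
    norm = math.sqrt(sum(c * c for c in k))
    k = [c / norm for c in k]
    # Point expressed relative to a point on the axis.
    v = [p - s for p, s in zip(point, axis_start)]
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_cross_v = [k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0]]
    k_dot_v = sum(a * b for a, b in zip(k, v))
    # Rodrigues' formula: v cos(t) + (k x v) sin(t) + k (k.v)(1 - cos(t))
    rotated = [v[n] * cos_t + k_cross_v[n] * sin_t + k[n] * k_dot_v * (1.0 - cos_t)
               for n in range(3)]
    return tuple(r + s for r, s in zip(rotated, axis_start))

if __name__ == "__main__":
    # Hypothetical coordinates: axis between two C-alpha atoms on the z axis,
    # one atom of the central residue displaced along x.
    print(rotate_about_axis((1.0, 0.0, 1.0), (0.0, 0.0, 0.0), (0.0, 0.0, 2.0), 10.0))
```

In a full backrub move the same rotation would be applied to every atom between the two axis atoms, followed by the smaller compensatory rotations.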
25
Today’s menu: prediction of side chain conformations
1. rotamer libraries
2. dependence on backbone accuracy
3. approaches that locate GMEC or MECs: Rosetta & other approaches (DEE: dead-end elimination; SCWRL; BP: belief propagation; LP: linear integer programming)
26
Prediction of side chain conformations using rotamers
Given:
– protein backbone
– for each residue: set of possible conformations (rotamers from library)
Wanted: the combination of rotamers that results in the lowest total energy
$E_{GMEC} = \min \left( \sum_i E_{i_r} + \sum_{i<j} E_{i_r j_s} \right)$
with self energy $E_{i_r}$ and pair energy $E_{i_r j_s}$.
Location of the GMEC is NP-hard (Fraenkel, 1997; Pierce, 2002).
Figure: residues i, i+1, i+2 with self and pair energies.
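For a very small problem the GMEC can be found by exhaustive enumeration, which makes the energy expression concrete. The sketch below uses an invented three-residue energy table; the data layout (a list of self energies per position and a dictionary keyed by `(i, r, j, s)` for pairs) is an assumption made for illustration.

```python
import itertools

def total_energy(assignment, self_e, pair_e):
    """Sum of self energies plus pair energies over all residue pairs i < j."""
    energy = sum(self_e[i][r] for i, r in enumerate(assignment))
    for i in range(len(assignment)):
        for j in range(i + 1, len(assignment)):
            # Missing pair entries are treated as non-interacting (0.0).
            energy += pair_e.get((i, assignment[i], j, assignment[j]), 0.0)
    return energy

def brute_force_gmec(self_e, pair_e):
    """Enumerate every rotamer combination and return the lowest-energy one."""
    choices = [range(len(rotamers)) for rotamers in self_e]
    best = min(itertools.product(*choices),
               key=lambda a: total_energy(a, self_e, pair_e))
    return best, total_energy(best, self_e, pair_e)

# Hypothetical energies: 3 positions, 2 rotamers each.
SELF_E = [[0.0, 1.0], [0.5, 0.0], [0.0, 2.0]]
PAIR_E = {(0, 0, 1, 1): 5.0, (1, 1, 2, 0): 3.0, (0, 1, 1, 1): -1.0}

if __name__ == "__main__":
    print(brute_force_gmec(SELF_E, PAIR_E))
```

In this toy table, picking the best self energy at each position independently gives the assignment (0, 1, 0) with total energy 8.0, while the GMEC is (0, 0, 0) with 0.5: pair energies cannot be ignored.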
27
Side chain modeling = find best combination of rotamers
How?
1. Systematic scan. For a protein with
– 50 residues, and
– 9 rotamers/residue
the number of combinations to scan is $N = 9^{50} \approx 10^{47}$!
Feasible only for small proteins: the search space needs to be reduced.
$E_{tot} = \sum_i E_i + \sum_{i<j} E_{ij}$
Pairwise energy table (rotamers a, b at positions i and j):
Pos      $i_a$          $i_b$          …
$j_a$    $e_{ia,ja}$    $e_{ib,ja}$    …
$j_b$    $e_{ia,jb}$    $e_{ib,jb}$    …
Figure: residues i, i+1, i+2 with rotamers $i_a$, $i_b$, $i_c$.
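The size of the search space can be checked with a few lines of arithmetic. The evaluation rate used for the time estimate is a purely hypothetical assumption, chosen only to give a sense of scale.

```python
import math

def search_space(residues, rotamers_per_residue):
    """Number of rotamer combinations and its base-10 logarithm."""
    combinations = rotamers_per_residue ** residues
    log10_combinations = residues * math.log10(rotamers_per_residue)
    return combinations, log10_combinations

def years_to_enumerate(combinations, evals_per_second=1e9):
    """Time for a full scan at an assumed (hypothetical) evaluation rate."""
    seconds_per_year = 3600 * 24 * 365
    return combinations / evals_per_second / seconds_per_year

if __name__ == "__main__":
    n, log_n = search_space(50, 9)
    print(f"9^50 has log10 = {log_n:.1f}")
    print(f"years at 1e9 evaluations/s: {years_to_enumerate(n):.1e}")
```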
28
Search strategies for locating GMEC or MECs
Deterministic approaches (e.g. DEE):
– guarantee location of the GMEC
– can be slow
– advantageous when the GMEC is (the only) near-native conformation
Heuristic approaches (e.g. MC):
– locate a population of low-energy models (not necessarily the GMEC)
– faster, often converge
29
Guaranteed finding of GMEC
DEE (dead-end elimination):
– prune impossible rotamers, determine GMEC from the reduced rotamer set
Residue-interacting graphs (SCWRL):
– use dynamic programming on the graph to find the GMEC
– start with “leaves”: residues with low connectivity in the graph
Linear programming (Kingsford):
– solve a set of linear constraints
– can locate the GMEC for sparsely connected graphs
– dependent on energy function
30
Dead End Elimination (DEE)
Approach: remove rotamers that cannot be part of the GMEC.
Rotamer r at position i can be eliminated if there exists a rotamer t such that:
$E_{i_r} + \sum_{j \neq i} \min_s E_{i_r j_s} > E_{i_t} + \sum_{j \neq i} \max_s E_{i_t j_s}$
Iterative application of DEE removes many rotamers; at certain positions only one rotamer is left.
(Note that some rotamers can be removed from the beginning because they clash with the backbone: too high self energy.)
Figure: energy E of rotamers r and t over combinations of rotamers at positions j≠i.
Desmet & Lasters, 1992
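Iterative pruning with the original criterion (the best case for r is still worse than the worst case for t) can be sketched as below. The energy table and data layout are invented for illustration, and this simple loop is not meant to reflect how any particular program implements DEE.

```python
def pair_energy(pair_e, i, r, j, s):
    """Pair energies are stored once per residue pair (i < j); missing = 0."""
    if i < j:
        return pair_e.get((i, r, j, s), 0.0)
    return pair_e.get((j, s, i, r), 0.0)

def can_eliminate(i, r, t, alive, self_e, pair_e):
    """Original DEE test: min energy with r exceeds max energy with t."""
    best_r = self_e[i][r]
    worst_t = self_e[i][t]
    for j in range(len(alive)):
        if j == i:
            continue
        best_r += min(pair_energy(pair_e, i, r, j, s) for s in alive[j])
        worst_t += max(pair_energy(pair_e, i, t, j, s) for s in alive[j])
    return best_r > worst_t

def dee_prune(self_e, pair_e):
    """Apply the criterion repeatedly until no more rotamers can be removed."""
    alive = [set(range(len(rotamers))) for rotamers in self_e]
    changed = True
    while changed:
        changed = False
        for i in range(len(alive)):
            for r in sorted(alive[i]):
                others = [t for t in alive[i] if t != r]
                if any(can_eliminate(i, r, t, alive, self_e, pair_e) for t in others):
                    alive[i].discard(r)
                    changed = True
    return alive

# Hypothetical energies: 2 positions, 2 rotamers each.
SELF_E = [[0.0, 10.0], [0.0, 0.0]]
PAIR_E = {(0, 0, 1, 0): 1.0, (0, 0, 1, 1): 2.0,
          (0, 1, 1, 0): 0.0, (0, 1, 1, 1): 1.0}

if __name__ == "__main__":
    print(dee_prune(SELF_E, PAIR_E))
```

In this toy case rotamer 1 at position 0 is removed first, and its removal then makes rotamer 1 at position 1 eliminable as well.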
31
Refined DEE
Approach: remove rotamers that cannot be part of the GMEC, second criterion.
Rotamer r at position i can be eliminated if there exists a rotamer t such that:
$E_{i_r} - E_{i_t} + \sum_{j \neq i} \min_s \left( E_{i_r j_s} - E_{i_t j_s} \right) > 0$
This criterion allows removal of additional rotamers.
Figure: energy E of rotamers r and t over combinations of rotamers at positions j≠i.
Goldstein, 1994
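The difference between the two criteria can be seen on a small invented example in which t is better than r for every neighbor rotamer, but the energy ranges of r and t overlap. The sketch assumes the Goldstein criterion in its usual form (sum over neighbors of the minimum energy difference) and uses hypothetical function names.

```python
def pair_energy(pair_e, i, r, j, s):
    """Pair energies are stored once per residue pair (i < j); missing = 0."""
    if i < j:
        return pair_e.get((i, r, j, s), 0.0)
    return pair_e.get((j, s, i, r), 0.0)

def original_criterion(i, r, t, self_e, pair_e):
    """Desmet-style test: best case of r is worse than worst case of t."""
    lhs, rhs = self_e[i][r], self_e[i][t]
    for j in range(len(self_e)):
        if j == i:
            continue
        partners = range(len(self_e[j]))
        lhs += min(pair_energy(pair_e, i, r, j, s) for s in partners)
        rhs += max(pair_energy(pair_e, i, t, j, s) for s in partners)
    return lhs > rhs

def goldstein_criterion(i, r, t, self_e, pair_e):
    """Goldstein-style test: r is worse than t against every neighbor choice."""
    diff = self_e[i][r] - self_e[i][t]
    for j in range(len(self_e)):
        if j == i:
            continue
        diff += min(pair_energy(pair_e, i, r, j, s) - pair_energy(pair_e, i, t, j, s)
                    for s in range(len(self_e[j])))
    return diff > 0.0

# Hypothetical energies: rotamer 0 at position 0 is always 1.0 worse than
# rotamer 1, but both depend strongly on the rotamer chosen at position 1.
SELF_E = [[0.0, 0.0], [0.0, 0.0]]
PAIR_E = {(0, 0, 1, 0): 1.0, (0, 0, 1, 1): 5.0,
          (0, 1, 1, 0): 0.0, (0, 1, 1, 1): 4.0}

if __name__ == "__main__":
    print("original: ", original_criterion(0, 0, 1, SELF_E, PAIR_E))
    print("Goldstein:", goldstein_criterion(0, 0, 1, SELF_E, PAIR_E))
```

Here the original test fails (1.0 is not greater than 4.0) while the refined test succeeds, so the refined criterion removes a rotamer the original one keeps.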
32
More sophisticated DEE criteria…
Approach: remove rotamers that cannot be part of the GMEC, additional criterion.
Rotamer r at position i can be eliminated if there exist rotamers t1 and t2 such that, for every combination of rotamers at the other positions, at least one of them has a lower energy than r.
– takes more time to compute
– at the end, we are left with 1 combination, or with only a few combinations, that need to be evaluated using other criteria
Figure: energy E of rotamers r, t1 and t2 over combinations of rotamers at positions j≠i.
33
DEE-based approaches
DEE is guaranteed to find the GMEC…
… but may miss conformations that have only slightly worse energy.
Given that the energy function is not perfect, we also want to find additional conformations with comparable energy.
Approach used in Orbit: use MC to find additional low-energy combinations that resemble the GMEC.
34
Design of a sequence that adopts a zinc finger fold without zinc
Local sampling starting from the GMEC reveals the conservation pattern of designs.
Figure panels: alignment with zif268 second finger; conservation across 1000 simulations; ranking of predicted sequences.
Dahiyat & Mayo (1997)
35
SCWRL: residue-interacting graphs
– DEE: remain with residues with > 1 rotamer (“active residues”)
– undirected graph of active residues: side chains = vertices; interacting rotamer pairs are connected by an edge
– identify articulation points (break cluster apart) and bi-connected components (cannot be broken into different parts by removing one node)
– very simple energy function: only Dunbrack energy and repulsion
Canutescu, 2003
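Articulation points of a residue-interaction graph can be found with a standard depth-first search (Tarjan's low-link method). The sketch below is a generic graph routine on an invented five-residue graph; it is not taken from SCWRL, and the residue labels are hypothetical.

```python
def articulation_points(graph):
    """Return the nodes whose removal disconnects an undirected graph.

    graph: dict mapping each node to the set of its neighbours.
    """
    discovery, low, points = {}, {}, set()
    counter = [0]

    def dfs(node, parent):
        discovery[node] = low[node] = counter[0]
        counter[0] += 1
        children = 0
        for neighbour in graph[node]:
            if neighbour == parent:
                continue
            if neighbour in discovery:
                # Back edge: node can reach an earlier-discovered vertex.
                low[node] = min(low[node], discovery[neighbour])
            else:
                children += 1
                dfs(neighbour, node)
                low[node] = min(low[node], low[neighbour])
                # The subtree under `neighbour` cannot bypass `node`.
                if parent is not None and low[neighbour] >= discovery[node]:
                    points.add(node)
        # A DFS root is an articulation point only with 2+ DFS children.
        if parent is None and children > 1:
            points.add(node)

    for node in graph:
        if node not in discovery:
            dfs(node, None)
    return points

# Hypothetical active residues: A, B, C interact mutually (one bi-connected
# component); D hangs off C, and E hangs off D.
GRAPH = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

if __name__ == "__main__":
    print(sorted(articulation_points(GRAPH)))
```

The recursive form is adequate for small clusters; very large graphs would need an iterative traversal to stay within Python's recursion limit.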
36
SCWRL: residue-interacting graphs
– Solve a cluster using bi-connected components
– For each, calculate the best energy given a specific rotamer in the bi-connected residue
– Pruning is easy since the energy function is only positive
– [Backtracking: when a certain threshold is used, a specific rotamer (combination) can be deleted]
Canutescu, 2003
37
Heuristic approaches
– Define cutoff values to prune branches that probably do not contain low-energy conformations
– Mean-field approach, belief propagation
– Self-consistent algorithms
– Monte-Carlo sampling
38
Sc modeling in Rosetta: part of a cycle
START → random perturbation → side chain optimization → rigid body minimization → MC (repeat) → FINISH
Figure: energy vs. rigid body orientations; rigid body optimization, backbone optimization.
39
Side chain modeling protocols in Rosetta
Monte-Carlo procedure:
– heuristic, does not converge: several runs needed to locate solution
– uses backbone-dependent rotamer library (Dunbrack)
Approaches:
– “Repacking”: model side chain conformation from scratch
– “Rotamer Trial”: refine side chain conformations
– “Rotamer Trial with minimization” (RTmin): off-rotamer sampling by minimization
40
Monte Carlo sampling
Pre-calculate the $E_{i_r}$ and $E_{i_r j_t}$ matrices:
– Self energy: energy between rotamer r at position i and the constant part
– Pairwise energy: between rotamer r at position i and rotamer t at position j (sparse matrix)
$E_{total} = \sum_i E_{i_r} + \sum_{i<j} E_{i_r j_t}$
Simulated annealing:
– make a random change
– start with a high acceptance rate, gradually lower the temperature
– acceptance based on the Boltzmann distribution
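A simulated annealing search over rotamer assignments can be sketched as follows. This is a toy illustration, not the Rosetta packer: the energy table, the geometric cooling schedule and all parameter values are assumptions, and the total energy is recomputed at every step for clarity rather than updated incrementally from the precomputed matrices.

```python
import math
import random

def total_energy(assignment, self_e, pair_e):
    """Self energies plus pair energies over residue pairs i < j."""
    energy = sum(self_e[i][r] for i, r in enumerate(assignment))
    for i in range(len(assignment)):
        for j in range(i + 1, len(assignment)):
            energy += pair_e.get((i, assignment[i], j, assignment[j]), 0.0)
    return energy

def anneal(self_e, pair_e, steps=2000, t_start=10.0, t_end=0.1, seed=0):
    """Metropolis Monte Carlo with a geometrically decreasing temperature."""
    rng = random.Random(seed)
    n = len(self_e)
    current = [rng.randrange(len(self_e[i])) for i in range(n)]
    e_current = total_energy(current, self_e, pair_e)
    best, e_best = list(current), e_current
    for step in range(steps):
        temperature = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Random change: pick one position and propose a random rotamer.
        i = rng.randrange(n)
        trial = list(current)
        trial[i] = rng.randrange(len(self_e[i]))
        e_trial = total_energy(trial, self_e, pair_e)
        delta = e_trial - e_current
        # Boltzmann (Metropolis) acceptance.
        if delta <= 0.0 or rng.random() < math.exp(-delta / temperature):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
    return tuple(best), e_best

# Hypothetical energies: 3 positions, 2 rotamers each.
SELF_E = [[0.0, 1.0], [0.5, 0.0], [0.0, 2.0]]
PAIR_E = {(0, 0, 1, 1): 5.0, (1, 1, 2, 0): 3.0, (0, 1, 1, 1): -1.0}

if __name__ == "__main__":
    print(anneal(SELF_E, PAIR_E))
```

Running with several seeds and keeping the lowest-energy result mirrors the point that a single heuristic run is not guaranteed to reach the GMEC.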
41
“Repacking”: full combinatorial side chain optimization
– remove all side chains
– gradually add side chains: select from the backbone-dependent rotamer library
– add position-specific rotamers (e.g. from the unbound conformation): set their energy to the minimum rotamer energy, to ensure acceptance
– use simulated annealing to create increasingly well packed side chains
– repeat to sample a range of low-energy conformations
42
“Rotamer trial”: side chain adjustment
Find better rotamers for an existing structure:
– pick a residue at random
– search for a rotamer with lower energy
– replace the rotamer
Repeated until all high-energy positions are improved.
Fast.
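The loop described on this slide can be sketched as a greedy single-position refinement. This is a simplified illustration with an invented energy table, not Rosetta's implementation: positions are visited in random order, and each is set to its lowest-energy rotamer given the current neighbors, until a full sweep brings no improvement.

```python
import random

def total_energy(assignment, self_e, pair_e):
    """Self energies plus pair energies over residue pairs i < j."""
    energy = sum(self_e[i][r] for i, r in enumerate(assignment))
    for i in range(len(assignment)):
        for j in range(i + 1, len(assignment)):
            energy += pair_e.get((i, assignment[i], j, assignment[j]), 0.0)
    return energy

def rotamer_trials(assignment, self_e, pair_e, seed=0, max_sweeps=100):
    """Greedily replace single rotamers until no replacement lowers the energy."""
    rng = random.Random(seed)
    current = list(assignment)
    for _ in range(max_sweeps):
        improved = False
        order = list(range(len(current)))
        rng.shuffle(order)  # visit residues in random order
        for i in order:
            def energy_with(r, i=i):
                trial = current[:i] + [r] + current[i + 1:]
                return total_energy(trial, self_e, pair_e)
            best_r = min(range(len(self_e[i])), key=energy_with)
            if energy_with(best_r) < energy_with(current[i]) - 1e-12:
                current[i] = best_r
                improved = True
        if not improved:
            break
    return tuple(current)

# Hypothetical energies: 3 positions, 2 rotamers each.
SELF_E = [[0.0, 1.0], [0.5, 0.0], [0.0, 2.0]]
PAIR_E = {(0, 0, 1, 1): 5.0, (1, 1, 2, 0): 3.0, (0, 1, 1, 1): -1.0}

if __name__ == "__main__":
    start = (0, 1, 0)
    refined = rotamer_trials(start, SELF_E, PAIR_E)
    print(start, total_energy(start, SELF_E, PAIR_E))
    print(refined, total_energy(refined, SELF_E, PAIR_E))
```

The result is a local minimum with respect to single-rotamer substitutions, which is why this step is fast but is used for refinement rather than for packing from scratch.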
43
Side chain modeling: Summary
– Side chain modeling is based on rotamer libraries
– Combinatorial problem
– Approaches for side chain modeling involve smart reduction of combinatorial complexity (heuristic or exact)
– Side chain modeling as a “toy model” for structural modeling
– Side chain modeling can be extended to design by adding rotamer options of different amino acids