4. Modeling of side chains


1 4. Modeling of side chains

2 Side chain modeling is part of structure prediction
Protein structure prediction:
– given: sequence of protein
– predict: structure of protein
Challenges:
– conformation space. Goal: describe the continuous, immense space of conformations in an efficient and representative way
– realistic energy function. Goal: energy minimum at or near the experimentally derived (native) structure
– efficient and reliable search algorithm. Goal: locate the global minimum energy conformation (GMEC)
Prediction of side chain conformations:
– subtask of protein structure prediction

3 The importance of side chain modeling
Side chain prediction is a subtask of protein structure prediction:
– given: correct backbone conformation
– predict: side chain conformations (i.e. the whole protein)
Successful prediction of protein structure depends on successful prediction of the side chain conformations:
– complete details are not solved by experiment
– allows evaluation of a protocol at the detailed, full-atom level
– allows flexibility in docking

4 Today's menu
Prediction of side chain conformations:
1. rotamer libraries
2. dependence on backbone accuracy
3. approaches that locate the GMEC or MECs: Rosetta and other approaches (DEE - dead-end elimination; SCWRL; BP - belief propagation; LP - linear integer programming)

5 Side chains are described as rotamers
Dihedral angles χ1 to χ4 define the side chain conformation (assuming equilibrium bond lengths and angles).
Figure from Wikipedia.

6 Side chains assume discrete conformations
Serine χ1 preferences: t = 180°, g- = -60°, g+ = +60°
Staggered conformations minimize collisions with neighboring atoms.
(Lovell, 2000)
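
A minimal sketch of how a measured χ1 angle could be assigned to one of the three staggered states listed above. The function name and the nearest-center rule are illustrative assumptions; the state labels follow this slide's convention (t = 180°, g- = -60°, g+ = +60°), which differs between publications.

```python
def chi1_rotamer(chi_deg: float) -> str:
    """Assign a chi1 angle (degrees) to the nearest staggered state."""
    # Centers use the slide's naming convention (an assumption of this sketch).
    centers = {"t": 180.0, "g-": -60.0, "g+": 60.0}

    def angular_distance(a: float, b: float) -> float:
        # Smallest absolute difference between two angles on a circle.
        d = (a - b) % 360.0
        return min(d, 360.0 - d)

    return min(centers, key=lambda name: angular_distance(chi_deg, centers[name]))


if __name__ == "__main__":
    for angle in (-170.0, 65.0, -55.0):
        print(angle, chi1_rotamer(angle))
```

The circular distance matters because -170° and +180° are only 10° apart, although their numeric difference is 350°.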

7 Rotamer libraries contain preferred conformations
Rotamer: discrete side chain conformation defined by χ1 to χ4
Library table (Dunbrack, 2002), recoverable entry: Shapovalov and Dunbrack* 2011, BBDEP, 3854, 1.8
* Shapovalov & Dunbrack, Structure 2011

8 Representative rotamer libraries are surprisingly small
Ponder & Richards, 1987:
– analysis of ~20 proteins (~2000 side chains)
– 67 rotamers can adequately represent side chain conformations (for 17 of the 20 amino acids)

9 Backbone-dependent rotamer libraries
Dunbrack & Karplus, 1993:
– for each φ,ψ (20° x 20°) bin, derive statistics on χ angle values
– reflects the dependence of side chain conformation on backbone conformation
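
The binning idea can be sketched as follows: observations are grouped into φ,ψ bins and rotamer frequencies are computed per bin. This is a plain counting sketch, not the published procedure; the function names and the `(phi, psi, rotamer)` tuple layout are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def bin_index(angle_deg: float, width: float = 20.0) -> int:
    # Map an angle in degrees onto an integer bin; -180 and +180 share bin 0.
    return int(((angle_deg + 180.0) % 360.0) // width)


def rotamer_frequencies(observations, width: float = 20.0):
    """observations: iterable of (phi, psi, rotamer_label) tuples."""
    counts = defaultdict(Counter)
    for phi, psi, rotamer in observations:
        counts[(bin_index(phi, width), bin_index(psi, width))][rotamer] += 1
    # Normalize the counts inside each (phi, psi) bin to frequencies.
    return {
        b: {r: n / sum(c.values()) for r, n in c.items()}
        for b, c in counts.items()
    }


if __name__ == "__main__":
    toy = [(-62, -45, "g-"), (-65, -50, "g-"), (-70, -41, "t"), (-120, 130, "t")]
    print(rotamer_frequencies(toy))
```

With 20° bins there are 18 x 18 = 324 backbone bins per residue type, so many bins hold few observations; this sparsity is what motivates the statistical smoothing used in later libraries.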

10 Rotamer preferences depend on backbone conformation: example Valine
The observed frequencies of gauche+, gauche- and trans differ strongly between backbone conformations (sheet, helix, and coil regions).
Data set: n = 850 proteins, resolution < 1.7 Å, pairwise sequence identity < 50%.

11 Bayesian statistical analysis of rotamer library
Dunbrack, 1997:
– use Bayesian statistics to estimate populations for all rotamers, of all side chain types, for each φ,ψ (10° x 10°) bin: P(χ1, χ2, χ3, χ4 | φ, ψ)
– using the Bayesian formalism, combine a prior distribution based on P(φ)*P(ψ), i.e. the product of the φ-dependent and the ψ-dependent probabilities, with the fully φ,ψ-dependent data
– … to describe both well-sampled regions and sparsely sampled regions

12 Rotamer energy ($E_{dun}$): a knowledge-based score
1. Calculate $p_{obs}$: frequencies of rotamers (or any other feature)
2. Convert into an effective potential energy using the Boltzmann equation: $\Delta G = -RT \ln(p_{obs}/p_{exp})$
(Boas & Harbury, 2007)
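
A small numerical sketch of the conversion ΔG = -RT ln(p_obs / p_exp). The temperature (298.15 K), the units (kcal/mol) and the uniform reference p_exp = 1/3 over three χ1 states are assumptions chosen for illustration, not values taken from the slides.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def knowledge_based_energy(p_obs: float, p_exp: float, temperature: float = 298.15) -> float:
    """Effective energy from an observed and an expected frequency."""
    return -R_KCAL * temperature * math.log(p_obs / p_exp)


if __name__ == "__main__":
    p_exp = 1.0 / 3.0  # assumed reference: three equally likely chi1 states
    for label, p_obs in (("common", 0.60), ("average", 1.0 / 3.0), ("rare", 0.02)):
        print(label, round(knowledge_based_energy(p_obs, p_exp), 3), "kcal/mol")
```

Rotamers observed more often than expected get a negative (favorable) score, rotamers observed less often get a positive one, and p_obs = p_exp gives exactly zero.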

13 Structure determination revisited
Refit electron density maps:
– 15% of non-rotameric side chains can be refitted to 1 (or 2) rotameric conformations
(Shapovalov & Dunbrack, 2007)

14 Structure determination revisited
Refit electron density maps:
– rotameric side chains have lower entropy (dispersion of electron density around χ) than side chains with multiple conformations in the PDB, or non-rotameric side chains
Figure: χ1 entropy by residue type.
(Shapovalov & Dunbrack, 2007)

15 2011: Improved Dunbrack library
Many good reasons:
1. More structural data
2. Improved set: electron density calculations remove highly dynamic side chains
3. Derive accurate and smooth density estimates of rotamer populations (incl. rare rotamers) as a continuous function of backbone dihedral angles
4. Derive smooth estimates of the mean values and variances of rotameric side-chain dihedral angles
5. Improve treatment of non-rotameric degrees of freedom
(Shapovalov & Dunbrack, 2011)

16 The 2011 Dunbrack library
Calculate the rotamer preference for a given φ,ψ bin. Adaptive kernel density estimation allows:
– smoother density function (prevents steep derivatives in Rosetta minimizations!)
– more detailed binning
1. For each rotamer r of an amino acid: determine a probability density estimate ρ(φ,ψ | r) (= Ramachandran distribution for each rotamer)
2. Use Bayes' rule to invert this density to produce an estimate of the rotamer probability: $P(r \mid \phi,\psi) = \dfrac{P(r)\,\rho(\phi,\psi \mid r)}{\sum_{r'} P(r')\,\rho(\phi,\psi \mid r')}$
P(r): backbone-independent probability of rotamer r
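
The Bayes inversion in step 2 can be sketched numerically: given the backbone-independent priors P(r) and the per-rotamer densities ρ(φ,ψ | r) evaluated at one backbone point, the posterior is the normalized product. The function name and the dictionary layout are assumptions; the density values would come from the kernel density estimate, which is not implemented here.

```python
def rotamer_posterior(prior, density_at_point):
    """Return P(r | phi, psi) from P(r) and rho(phi, psi | r) at one (phi, psi).

    prior: {rotamer: P(r)}
    density_at_point: {rotamer: rho(phi, psi | r)} at the query point
    """
    joint = {r: prior[r] * density_at_point[r] for r in prior}
    total = sum(joint.values())
    if total <= 0.0:
        raise ValueError("all densities are zero at this backbone point")
    return {r: value / total for r, value in joint.items()}


if __name__ == "__main__":
    prior = {"g+": 0.5, "t": 0.3, "g-": 0.2}       # made-up numbers
    density = {"g+": 1.0, "t": 2.0, "g-": 0.5}     # made-up numbers
    print(rotamer_posterior(prior, density))
```

In this toy input the rotamer "t" has a lower prior than "g+" but a higher density at the query point, so it ends up with the larger posterior probability.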

17 Smoother density function
P(r = g+ | φ,ψ, aa = Ser)
Figure panels: histogram (original probability density); density using adaptive kernels (integrate over a neighborhood of adaptive size).

18 Better description of non-rotameric side chains
Not all side chain atoms show a rotameric distribution: Met χ1 (sp3) versus Gln χ3 (sp2).
Example: Gln χ3 angles (for χ1 = g+, χ2 = t), original library versus new library.
Figure panels: alpha helix; beta sheet; loops (polyP II).

19 Better description of non-rotameric side chains
Example: Asn χ2 angles (for χ1 = g+)

20 ... leads to a slight improvement in modeling

21 Some conclusions about rotamer libraries
Rotamer frequency:
– rare conformations reflect increased internal strain
– important to take frequency into account
– frequency can be used as an energy term: $E_i = -K \ln P_i$
Increasing availability of high-resolution structures:
– narrows the distribution around each rotamer in the library
– indicates that errors are responsible for outliers
Refitting of electron density maps:
– non-rotameric conformations are often incorrectly modeled and high in entropy

22 Some conclusions about rotamer libraries
Rotamericity < 100%: include more side chain conformations!
– position-dependent rotamers (example: unbound conformations in docking predictions)
– additional conformations around each rotamer (± sd)
– non-rotameric side chain angles: describe as a continuous density function

23 Today's menu
Prediction of side chain conformations:
1. rotamer libraries
2. dependence on backbone accuracy
3. approaches that locate the GMEC or MECs: Rosetta and other approaches (DEE - dead-end elimination; SCWRL; BP - belief propagation; LP - linear integer programming)

24 Backrub motions: “How protein backbone shrugs when side chain dances”
– most common local backbone move in ultra-high-resolution structures (< 1.0 Å)
– changes the side chain orientation without affecting the rest of the backbone
– 3 rotations around Cα-Cα axes: change of the 1,3 rotation angle, with compensatory changes of the 1,2 and 2,3 rotation angles
– occurs in 3% of all residues (one quarter of them serine)
– example: two distinct Ile rotamers (tt, mm) related by a backrub move
(Davis, 2006)

25 Today's menu
Prediction of side chain conformations:
1. rotamer libraries
2. dependence on backbone accuracy
3. approaches that locate the GMEC or MECs: Rosetta and other approaches (DEE - dead-end elimination; SCWRL; BP - belief propagation; LP - linear integer programming)

26 Prediction of side chain conformations using rotamers
Given:
– protein backbone
– for each residue: set of possible conformations (rotamers from library)
Wanted: the combination of rotamers that results in the lowest total energy:
$E_{GMEC} = \min \left( \sum_i E_{i_r} + \sum_i \sum_{j>i} E_{i_r j_s} \right)$
– self energy: $E_{i_r}$ (rotamer r at position i)
– pair energy: $E_{i_r j_s}$ (rotamer r at position i with rotamer s at position j)
Location of the GMEC is NP-hard (Fraenkel, 1997; Pierce, 2002).
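
To make the objective concrete, here is a brute-force sketch that enumerates every rotamer combination of a toy problem and returns the one with the lowest total energy. The data layout (a list of self-energy lists, and a dictionary of pair energies keyed `(i, r, j, s)` with `i < j`) and all energy values are assumptions made for this example.

```python
from itertools import product


def total_energy(assignment, self_e, pair_e):
    """Sum of self energies plus pair energies, each residue pair counted once."""
    energy = sum(self_e[i][r] for i, r in enumerate(assignment))
    n = len(assignment)
    for i in range(n):
        for j in range(i + 1, n):
            # Missing keys mean the two rotamers do not interact.
            energy += pair_e.get((i, assignment[i], j, assignment[j]), 0.0)
    return energy


def brute_force_gmec(self_e, pair_e):
    """Enumerate all combinations; only feasible for very small problems."""
    choices = [range(len(rotamers)) for rotamers in self_e]
    return min(product(*choices), key=lambda a: total_energy(a, self_e, pair_e))


if __name__ == "__main__":
    self_e = [[0.0, 1.0], [0.5, 0.0], [0.2, 0.3]]       # 3 positions, 2 rotamers each
    pair_e = {(0, 0, 1, 1): 5.0, (1, 0, 2, 0): -1.0}    # one clash, one good contact
    best = brute_force_gmec(self_e, pair_e)
    print(best, total_energy(best, self_e, pair_e))
```

Note that the best combination is not the one built from the best self energy at each position: the clash between positions 0 and 1 forces position 1 into its second-best rotamer.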

27 Side chain modeling = find best combination of rotamers
How?
1. Systematic scan. For a protein with
– 50 residues, and
– 9 rotamers/residue
the number of combinations to scan is $N = 9^{50} \approx 5 \times 10^{47}$!
→ feasible only for small proteins
→ search space needs to be reduced
$E_{tot} = \sum_i E_i + \sum_i \sum_{j>i} E_{ij}$
Pair energy table for positions i and j:
Pos   i_a        i_b        …
j_a   e(ia,ja)   e(ib,ja)   …
j_b   e(ia,jb)   e(ib,jb)   …
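
The size of the search space can be checked directly. The throughput of one billion scored combinations per second is a hypothetical figure used only to put the number in perspective.

```python
import math


def search_space_size(n_residues: int, rotamers_per_residue: int) -> int:
    # Every residue picks one rotamer independently, so the counts multiply.
    return rotamers_per_residue ** n_residues


n = search_space_size(50, 9)
exponent = math.log10(n)

# Hypothetical throughput: 1e9 combinations evaluated per second.
years = n / 1e9 / (3600 * 24 * 365)

if __name__ == "__main__":
    print(f"N = 9^50 is about 10^{exponent:.1f}")
    print(f"exhaustive scan at 1e9 combinations/s: about {years:.1e} years")
```

Since the count grows exponentially with chain length, adding one more residue with 9 rotamers multiplies the work by 9, which is why pruning or heuristic search is needed.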

28 Search strategies for locating GMEC or MECs
Deterministic approaches (e.g. DEE):
– guarantee location of the GMEC
– can be slow
– advantageous when the GMEC is (the only) near-native conformation
Heuristic approaches (e.g. MC):
– locate a population of low-energy models (not necessarily the GMEC)
– faster, often converge

29 Guaranteed finding of GMEC
DEE (dead-end elimination):
– prune impossible rotamers, determine the GMEC from the reduced rotamer set
Residue-interacting graphs (SCWRL):
– use dynamic programming on the graph to find the GMEC
– start with “leaves”: residues with low connectivity in the graph
Linear programming (Kingsford):
– solve a set of linear constraints
– can locate the GMEC for sparsely connected graphs
– dependent on the energy function

30 Dead End Elimination (DEE)
Approach: remove rotamers that cannot be part of the GMEC.
Rotamer r at position i can be eliminated if there exists a rotamer t such that:
$E_{i_r} + \sum_{j \neq i} \min_s E_{i_r j_s} > E_{i_t} + \sum_{j \neq i} \max_s E_{i_t j_s}$
Iterative application of DEE removes many rotamers; at certain positions only one rotamer is left.
(Note that some rotamers can be removed from the beginning because they clash with the backbone: self energy $E_{i_t}$ too high.)
Figure: energy E of rotamers r and t over the combinations of rotamers at positions j≠i.
(Desmet & Lasters, 1992)
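
A sketch of the original elimination test in its standard form: rotamer r at position i is a dead end if its best possible energy, E(i_r) + Σ_j min_s E(i_r, j_s), is still higher than the worst possible energy of some alternative t, E(i_t) + Σ_j max_s E(i_t, j_s). The data layout (pair energies keyed `(i, r, j, s)` with `i < j`, and `alive` as one set of surviving rotamer indices per position) and the function names are assumptions of this example.

```python
def pair_energy(pair_e, i, r, j, s):
    # Pair energies are stored once per residue pair, lower position index first.
    key = (i, r, j, s) if i < j else (j, s, i, r)
    return pair_e.get(key, 0.0)


def dee_eliminates(i, r, t, self_e, pair_e, alive):
    """True if rotamer t proves that rotamer r at position i cannot be in the GMEC."""
    best_case_r = self_e[i][r]
    worst_case_t = self_e[i][t]
    for j in range(len(self_e)):
        if j == i:
            continue
        best_case_r += min(pair_energy(pair_e, i, r, j, s) for s in alive[j])
        worst_case_t += max(pair_energy(pair_e, i, t, j, s) for s in alive[j])
    return best_case_r > worst_case_t


if __name__ == "__main__":
    self_e = [[0.0, 10.0], [0.0, 0.0]]
    pair_e = {(0, 0, 1, 0): 1.0, (0, 0, 1, 1): 2.0,
              (0, 1, 1, 0): -1.0, (0, 1, 1, 1): 0.5}
    alive = [{0, 1}, {0, 1}]
    print(dee_eliminates(0, 1, 0, self_e, pair_e, alive))
```

In the toy input, rotamer 1 at position 0 carries a large self energy (for example a backbone clash), so even its most favorable neighbor cannot bring it below the worst case of rotamer 0.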

31 Refined DEE
Approach: remove rotamers that cannot be part of the GMEC, second criterion:
Rotamer r at position i can be eliminated if there exists a rotamer t such that:
$E_{i_r} - E_{i_t} + \sum_{j \neq i} \min_s \left( E_{i_r j_s} - E_{i_t j_s} \right) > 0$
This criterion allows removal of additional rotamers.
Figure: energy E of rotamers r and t over the combinations of rotamers at positions j≠i.
(Goldstein, 1994)
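
A sketch of the Goldstein test, E(i_r) - E(i_t) + Σ_j min_s [E(i_r, j_s) - E(i_t, j_s)] > 0, applied iteratively until no further rotamer can be removed. The data layout (pair energies keyed `(i, r, j, s)` with `i < j`), the function names and the toy energies are assumptions of this example.

```python
def pair_energy(pair_e, i, r, j, s):
    # Pair energies are stored once per residue pair, lower position index first.
    key = (i, r, j, s) if i < j else (j, s, i, r)
    return pair_e.get(key, 0.0)


def goldstein_eliminates(i, r, t, self_e, pair_e, alive):
    """True if switching r -> t at position i lowers the energy in every context."""
    diff = self_e[i][r] - self_e[i][t]
    for j in range(len(self_e)):
        if j != i:
            diff += min(
                pair_energy(pair_e, i, r, j, s) - pair_energy(pair_e, i, t, j, s)
                for s in alive[j]
            )
    return diff > 0.0


def dee_prune(self_e, pair_e):
    """Apply the Goldstein test repeatedly until the rotamer sets stop shrinking."""
    alive = [set(range(len(rotamers))) for rotamers in self_e]
    changed = True
    while changed:
        changed = False
        for i in range(len(self_e)):
            for r in sorted(alive[i]):  # iterate over a snapshot of the set
                if any(t != r and goldstein_eliminates(i, r, t, self_e, pair_e, alive)
                       for t in alive[i]):
                    alive[i].discard(r)
                    changed = True
    return alive


if __name__ == "__main__":
    self_e = [[0.0, 1.0], [0.0, 0.0]]
    pair_e = {(0, 0, 1, 1): 5.0, (0, 1, 1, 0): 0.5, (0, 1, 1, 1): 5.5}
    print(dee_prune(self_e, pair_e))
```

In this toy input the original min/max test cannot remove rotamer 1 at position 0 (best case 1.5 versus worst case 5.0 for rotamer 0), while the Goldstein test can, because it compares the two rotamers in the same neighbor context. Because the inequality is strict, at least one rotamer always survives at each position.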

32 More sophisticated DEE criteria…
Approach: remove rotamers that cannot be part of the GMEC, additional criterion:
Rotamer r at position i can be eliminated if there exist rotamers t1 and t2 that together satisfy a combined elimination criterion (formula shown on the slide).
– takes more time to compute
– at the end, we are left with 1 combination, or with only a few combinations, that need to be evaluated using other criteria
Figure: energy E of rotamers r, t1 and t2 over the combinations of rotamers at positions j≠i.

33 DEE-based approaches
DEE guarantees to find the GMEC…
… but may miss conformations that have only slightly worse energy.
Given that the energy function is not perfect, we also want to find additional conformations with comparable energy.
Approach used in Orbit: use MC to find additional low-energy combinations that resemble the GMEC.

34 Design of a sequence that adopts a zinc finger fold without zinc
Local sampling starting from the GMEC reveals the conservation pattern of designs.
Figure panels: alignment with zif268 second finger; conservation across 1000 simulations; ranking of predicted sequences.
(Dahiyat & Mayo, 1997)

35 SCWRL - residue-interacting graphs
– DEE: remain with residues with > 1 rotamer (“active residues”)
– undirected graph of active residues:
  – side chains = vertices
  – interacting rotamer pairs: connected by an edge
– identify:
  – articulation points (break a cluster apart), and
  – bi-connected components (cannot be broken into different parts by removing one node)
Very simple energy function: only Dunbrack energy and repulsion
(Canutescu, 2003)
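
The graph decomposition step can be illustrated with a standard depth-first search for articulation points. This is a generic textbook algorithm, not SCWRL's own code; the adjacency-dictionary layout and the residue labels are assumptions made for the example.

```python
def articulation_points(graph):
    """Return the nodes whose removal disconnects an undirected graph.

    graph: dict mapping each node to the set of its neighbours.
    """
    disc, low, result = {}, {}, set()
    counter = [0]

    def dfs(u, parent):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        children = 0
        for v in graph[u]:
            if v == parent:
                continue
            if v in disc:
                # Back edge: u can reach an earlier-discovered node.
                low[u] = min(low[u], disc[v])
            else:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # The subtree under v cannot reach above u without passing u.
                if parent is not None and low[v] >= disc[u]:
                    result.add(u)
        # A DFS root is an articulation point only if it has several subtrees.
        if parent is None and children > 1:
            result.add(u)

    for node in graph:
        if node not in disc:
            dfs(node, None)
    return result


if __name__ == "__main__":
    # Triangle A-B-C with a tail C-D-E: removing C or D splits the cluster.
    residues = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
                "D": {"C", "E"}, "E": {"D"}}
    print(sorted(articulation_points(residues)))
```

Once the articulation residues are known, each bi-connected component can be optimized separately for every rotamer of the residue that connects it to the rest of the cluster.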

36 SCWRL - residue-interacting graphs
Solve a cluster using bi-connected components:
– for each, calculate the best energy given a specific rotamer in the bi-connected residue
– pruning is easy since the energy function is only positive
[Backtracking: when a certain threshold is used, a specific rotamer (combination) can be deleted]
(Canutescu, 2003)

37 Heuristic approaches
– define cutoff values to prune branches that probably do not contain low-energy conformations
– mean-field approach, belief propagation
– self-consistent algorithms
– Monte Carlo sampling

38 Side chain modeling in Rosetta: part of a cycle
Flowchart: cycle of random perturbation, side chain optimization, rigid body minimization and MC, from START to FINISH.
Figure: energy versus rigid body orientations (rigid body optimization, backbone optimization).

39 Side chain modeling protocols in Rosetta
Monte Carlo procedure:
– heuristic, does not converge: several runs are needed to locate a solution
– uses the backbone-dependent rotamer library (Dunbrack)
Approaches:
– “Repacking”: model side chain conformations from scratch
– “Rotamer trial”: refine side chain conformations
– “Rotamer trial with minimization” (RTmin): off-rotamer sampling by minimization

40 Monte Carlo sampling
Pre-calculate the $E_{i_r}$ and $E_{i_r j_t}$ matrices:
– self energy: energy between rotamer r at position i and the constant part
– pairwise energy: between rotamer r at position i and rotamer t at position j (sparse matrix)
$E_{total} = \sum_i E_{i_r} + \sum_i \sum_{j>i} E_{i_r j_t}$
Simulated annealing:
– make a random change
– start with a high acceptance rate, gradually lower the temperature
– acceptance based on the Boltzmann distribution
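
A minimal simulated-annealing sketch over precomputed self and pair energies. It is not Rosetta's packer: the geometric cooling schedule, the step count, the temperature range, the data layout (pair energies keyed `(i, r, j, s)` with `i < j`) and the toy energies are all assumptions chosen for illustration.

```python
import math
import random


def total_energy(state, self_e, pair_e):
    energy = sum(self_e[i][r] for i, r in enumerate(state))
    for i in range(len(state)):
        for j in range(i + 1, len(state)):
            energy += pair_e.get((i, state[i], j, state[j]), 0.0)
    return energy


def anneal(self_e, pair_e, steps=5000, t_start=10.0, t_end=0.1, seed=0):
    rng = random.Random(seed)
    state = [rng.randrange(len(rotamers)) for rotamers in self_e]
    energy = total_energy(state, self_e, pair_e)
    best, best_energy = list(state), energy
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i = rng.randrange(len(state))
        old_r, new_r = state[i], rng.randrange(len(self_e[i]))
        if new_r == old_r:
            continue
        state[i] = new_r
        new_energy = total_energy(state, self_e, pair_e)
        delta = new_energy - energy
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta <= 0.0 or rng.random() < math.exp(-delta / temp):
            energy = new_energy
            if energy < best_energy:
                best, best_energy = list(state), energy
        else:
            state[i] = old_r  # reject: restore the previous rotamer
    return best, best_energy


if __name__ == "__main__":
    self_e = [[0.0, 1.0], [0.5, 0.0], [0.2, 0.3]]
    pair_e = {(0, 0, 1, 1): 5.0, (1, 0, 2, 0): -1.0}
    print(anneal(self_e, pair_e))
```

For clarity the sketch recomputes the full energy after every move; with precomputed matrices only the terms involving the changed position need updating, which is what makes the pre-calculation worthwhile. The result is a low-energy combination, with no guarantee that it is the GMEC.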

41 “Repacking”: full combinatorial side chain optimization
– remove all side chains
– gradually add side chains: select from the backbone-dependent rotamer library
– add position-specific rotamers (e.g. from the unbound conformation): set their energy to the minimum rotamer energy, to ensure acceptance
– use simulated annealing to create increasingly well-packed side chains
– repeat to sample a range of low-energy conformations

42 “Rotamer trial”: side chain adjustment
Find better rotamers for an existing structure:
– pick a residue at random
– search for a rotamer with lower energy
– replace the rotamer
Repeat until all high-energy positions are improved.
Fast.
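
The adjustment loop can be sketched as a greedy, one-residue-at-a-time search. This is an illustration of the idea rather than Rosetta's implementation: it sweeps over all positions in random order instead of targeting high-energy positions, and the data layout (pair energies keyed `(i, r, j, s)` with `i < j`) and toy energies are assumptions.

```python
import random


def rotamer_trials(start, self_e, pair_e, seed=0):
    """Greedy refinement: keep swapping single rotamers while the energy drops."""
    rng = random.Random(seed)
    state = list(start)
    n = len(state)

    def local_energy(i, r):
        # Energy terms that involve position i when it carries rotamer r.
        energy = self_e[i][r]
        for j in range(n):
            if j != i:
                key = (i, r, j, state[j]) if i < j else (j, state[j], i, r)
                energy += pair_e.get(key, 0.0)
        return energy

    improved = True
    while improved:
        improved = False
        order = list(range(n))
        rng.shuffle(order)  # visit residues in random order
        for i in order:
            best_r = min(range(len(self_e[i])), key=lambda r: local_energy(i, r))
            if local_energy(i, best_r) < local_energy(i, state[i]) - 1e-12:
                state[i] = best_r
                improved = True
    return state


if __name__ == "__main__":
    self_e = [[0.0, 1.0], [0.5, 0.0], [0.2, 0.3]]
    pair_e = {(0, 0, 1, 1): 5.0, (1, 0, 2, 0): -1.0}
    print(rotamer_trials([1, 1, 1], self_e, pair_e))
```

Every accepted swap strictly lowers the total energy, so the loop terminates, but it stops at the first combination that no single swap can improve, which may be a local minimum.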

43 Side chain modeling: Summary
– side chain modeling based on rotamer libraries → combinatorial problem
– approaches for side chain modeling involve smart reduction of combinatorial complexity (heuristic or exact)
– side chain modeling as a “toy model” for structural modeling
– side chain modeling can be extended to design by adding rotamer options of different amino acids

