Dead-End Elimination for Protein Design with Flexible Rotamers Ivelin Georgiev Donald Lab 02/19/2008
Computational Design
Inputs: input structure (wildtype), rotamer library, energy function → protein design algorithm → mutant. Goals: stability, specificity, novel function, drug design, …
The input to a protein design algorithm typically includes the following: a wildtype fixed-backbone structure that is to be redesigned; a rotamer library of low-energy side-chain conformations that discretizes the continuous side-chain conformation space and makes the computational search feasible; and an energy function for scoring and ranking the candidate structures. The energy function typically consists of some of the standard molecular mechanics energy terms (such as vdW, electrostatics, and solvation energies), but may also include statistical and other terms. Based on the input model, the protein design algorithm predicts mutations to the wildtype sequence that achieve a desired novel functionality, such as improving the thermostability of the protein, switching the enzyme specificity towards a novel substrate, or redesigning the protein to perform a completely novel function.
Contributions
MinDEE: provable, with energy minimization
Ensembles: ε-approximation algorithm for the partition function (q, q*); provable, with energy minimization
Redesign for Leu
Traditional-DEE (Desmet et al., 1992)
Fixed backbone/side-chains. Rotamer pruning of candidate i_r by competitor i_t (E lower bound vs. E upper bound), O(q²n²); then enumerate the remaining conformations to find the GMEC.
Dead-End Elimination (DEE) is a provably-accurate deterministic algorithm that reduces the search space for protein design problems by pruning rotamers that are provably not part of the global minimum energy conformation (GMEC). Specifically, for a given residue position i, DEE compares the pruning candidate rotamer i_r (here, the notation i_r means rotamer identity r at residue position i) against a competitor rotamer i_t. If a lower bound on the energy of any conformation with i_r is still greater than an upper bound on the energy of any conformation with i_t, then we can always obtain a lower-energy conformation by switching from rotamer i_r to rotamer i_t. Thus, rotamer i_r can be provably pruned from further consideration, since it cannot belong to the GMEC. This pruning step is repeated for all rotamers as pruning candidates, for all competitors, and for all residue positions, until no more rotamers can be provably pruned. This typically results in a significantly reduced set of conformations that have to be subsequently enumerated, making the mutation search computationally feasible. Unfortunately, DEE is provably-accurate only for a fixed backbone. The question then arises: can there be an algorithm for backbone flexibility that offers the same provable guarantees as DEE does for a fixed backbone?
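The pruning test described in these notes can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the talk's implementation: the tables `E_self` and `E_pair`, the helper names, and the toy numbers are all hypothetical, and the lower and upper bounds are taken to be the min/max sums of the original rigid-rotamer DEE criterion.

```python
# Toy sketch of traditional DEE pruning with rigid rotamers (hypothetical data layout).
# E_self[i][r]: rotamer-to-template energy of rotamer r at position i.
# E_pair[(i, j)][r][s]: pairwise energy between i_r and j_s, stored once with i < j.

def pair_energy(E_pair, i, r, j, s):
    """Look up a pairwise energy regardless of the order of the two positions."""
    if (i, j) in E_pair:
        return E_pair[(i, j)][r][s]
    return E_pair[(j, i)][s][r]

def can_prune(E_self, E_pair, rotamers, i, r, t):
    """True if the lower bound with i_r exceeds the upper bound with competitor i_t."""
    lower_r = E_self[i][r]
    upper_t = E_self[i][t]
    for j in rotamers:
        if j == i:
            continue
        lower_r += min(pair_energy(E_pair, i, r, j, s) for s in rotamers[j])
        upper_t += max(pair_energy(E_pair, i, t, j, s) for s in rotamers[j])
    return lower_r > upper_t

def dee_prune(E_self, E_pair, n_rot):
    """Sweep all positions, candidates, and competitors until nothing more is pruned."""
    rotamers = {i: set(range(n)) for i, n in n_rot.items()}
    changed = True
    while changed:
        changed = False
        for i in rotamers:
            for r in sorted(rotamers[i]):
                competitors = [t for t in rotamers[i] if t != r]
                if any(can_prune(E_self, E_pair, rotamers, i, r, t) for t in competitors):
                    rotamers[i].discard(r)
                    changed = True
    return rotamers

# Toy example: two positions with two rotamers each (made-up energies).
E_self = {0: [0.0, 5.0], 1: [0.0, 0.0]}
E_pair = {(0, 1): [[0.0, 1.0], [1.0, 0.0]]}
surviving = dee_prune(E_self, E_pair, {0: 2, 1: 2})
```

In this toy case one rotamer survives at each position, so the enumeration step that follows pruning has a single conformation left to score.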
Traditional-DEE with Rigid Rotamers/Backbone [Figure: Energy vs. Conformations for rotamer i_t]
Traditional-DEE with Side-chain Dihedral Flexibility [Figure: Energy (min, max) vs. Conformations for rotamer i_t]
Traditional-DEE, rigid energies: provably-correct (√)
Traditional-DEE, E minimization: not provably-correct (X)
MinDEE, E minimization: provably-correct (√)
MinDEE: voxels in continuous side-chain dihedral space bound rotamer movement
Instead of sampling the backbone dihedrals, we define restraining boxes around each residue that bound the displacement of this residue from its original pose. In effect, through kinematic constraints, these boxes bound the backbone movement. Instead of a discrete set of backbones, we thus have a continuous space of possible backbone conformations. The restraining boxes can be obtained in many ways, but we use two bounding conditions. First, we bound the displacement of Cα atoms from their original position, and second, we bound the change in the phi and psi angles for each flexible residue.
MinDEE: lower / upper energy bounds [Figure: pairwise energy E(i_r, j_s) over the dihedrals χ_ir and χ_js; rotamers i_r, i_t, j_s]
We can then compute lower and upper energy bounds on the rotamer-to-backbone and pairwise rotamer-to-rotamer interaction energies within the restraining boxes for each rotamer. For example, we compute a lower energy bound for a rotamer pair i_r – j_s by allowing the backbone to flex in order to minimize the interaction energy between these two rotamers, within the rotamer restraining boxes. The minimization process in (phi, psi) space is shown by the red path in the figure on the right.
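To make the idea of lower and upper bounds inside a box concrete, here is a rough sketch that samples a toy energy surface on a grid within a box around two angles. Everything in it is an assumption for illustration: the energy function is made up, the 9-degree half-width is arbitrary, and grid sampling only estimates the bounds (the sampled minimum can lie above the true minimum), whereas the talk describes bounds obtained by minimization.

```python
import itertools
import math

def toy_pair_energy(angle_i, angle_j):
    """Hypothetical smooth pairwise energy over two angles given in degrees."""
    return math.cos(math.radians(angle_i)) + math.cos(math.radians(angle_j - angle_i))

def sampled_bounds(energy, center_i, center_j, half_width, steps=10):
    """Estimate (lower, upper) energy bounds by sampling a grid inside the box.

    The box spans center +/- half_width in each angle; an even number of
    steps guarantees that the box center itself is one of the samples.
    """
    offsets = [-half_width + 2.0 * half_width * k / steps for k in range(steps + 1)]
    values = [energy(center_i + a, center_j + b)
              for a, b in itertools.product(offsets, offsets)]
    return min(values), max(values)

# Rigid energy at the box center versus the sampled bounds around it.
rigid = toy_pair_energy(60.0, 180.0)
lower, upper = sampled_bounds(toy_pair_energy, 60.0, 180.0, half_width=9.0)
```

The rigid energy always falls between the two sampled values, which is the relationship the pruning criterion relies on.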
MinDEE criterion: [lower bound on i_r conformation energies] - [upper bound on i_t conformation energies] - [possible energy changes due to rotamer movement] > 0
Rotamer roles: i_r pruning candidate, i_t competitor, j_s witness. The last term is not in trad-DEE.
The E(minus) terms represent a lower bound on the rotamer-to-template and rotamer-to-rotamer interaction energies that involve the pruning candidate rotamer i_r, and the E(plus) terms are the respective upper energy bounds that involve the competitor rotamer i_t. The main difference from the traditional-DEE criterion is the inclusion of the E(delta) term, which takes into account possible energy changes due to backbone movement.
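A sketch of how such a criterion could be evaluated from precomputed bounds follows. The exact expression for the E(delta) term does not survive in the slide text, so the form used here (the largest gap between upper and lower bounds for every interaction that does not involve position i) is an assumption, as are the table layout, the function names, and the toy numbers.

```python
# Hypothetical bound tables:
#   lo_self[(i, r)], hi_self[(i, r)]: lower / upper rotamer-to-template bounds.
#   lo_pair[(i, r, j, s)], hi_pair[(i, r, j, s)]: pairwise bounds, stored with i < j.

def pair(table, i, r, j, s):
    """Pairwise bound lookup that accepts the two positions in either order."""
    return table[(i, r, j, s)] if i < j else table[(j, s, i, r)]

def mindee_can_prune(lo_self, hi_self, lo_pair, hi_pair, rotamers, i, r, t):
    """Evaluate: lower bound (i_r) - upper bound (i_t) - delta > 0."""
    others = [j for j in rotamers if j != i]
    lower_r = lo_self[(i, r)] + sum(
        min(pair(lo_pair, i, r, j, s) for s in rotamers[j]) for j in others)
    upper_t = hi_self[(i, t)] + sum(
        max(pair(hi_pair, i, t, j, s) for s in rotamers[j]) for j in others)
    # Assumed E(delta): largest bound gap for each term not involving position i.
    delta = sum(max(hi_self[(j, s)] - lo_self[(j, s)] for s in rotamers[j])
                for j in others)
    for a, j in enumerate(others):
        for k in others[a + 1:]:
            delta += max(pair(hi_pair, j, s, k, u) - pair(lo_pair, j, s, k, u)
                         for s in rotamers[j] for u in rotamers[k])
    return lower_r - upper_t - delta > 0

# Toy example: position 0 has two rotamers, positions 1 and 2 have one each.
rotamers = {0: [0, 1], 1: [0], 2: [0]}
lo_self = {(0, 0): -1.0, (0, 1): 4.0, (1, 0): -0.5, (2, 0): -0.3}
hi_self = {(0, 0): 0.0, (0, 1): 5.0, (1, 0): 0.0, (2, 0): 0.0}
lo_pair = {(0, 0, 1, 0): -0.2, (0, 1, 1, 0): 0.8, (0, 0, 2, 0): -0.1,
           (0, 1, 2, 0): 0.9, (1, 0, 2, 0): -0.4}
hi_pair = {(0, 0, 1, 0): 0.0, (0, 1, 1, 0): 1.0, (0, 0, 2, 0): 0.0,
           (0, 1, 2, 0): 1.0, (1, 0, 2, 0): 0.0}
prune_r1 = mindee_can_prune(lo_self, hi_self, lo_pair, hi_pair, rotamers, 0, 1, 0)
prune_r0 = mindee_can_prune(lo_self, hi_self, lo_pair, hi_pair, rotamers, 0, 0, 1)
```

With zero-width bounds (lower equal to upper everywhere) the delta term vanishes and the test reduces to the rigid comparison of a lower bound against an upper bound.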
MinDEE: Side-chain Dihedral Flexibility [Figure: traditional-DEE vs. MinDEE]
Also add some of Bruce Tidor’s summary of the MinDEE algorithm (appropriate manipulation of 1- and 2-body energy bounds) from the K* grant.
MinDEE Applications
GMEC-based (min): single lowest-energy conformation
Ensemble-based (∫, 1/Z; JCB’05): weighted average
K*: provably-accurate approximation to the binding constant via conformational ensembles
Example ranking (sequence, K*): TIAAIC 7.3; GIRMQM 3.1; TGIAIV 2.9; LMLAIS 1.7; TWAIGY 0.3
MinDEE/A*: GMEC-based Method
MinDEE pruning, O(n²r²) → C’ → A* search (E lower bounds) → … → full E minimization → … → minGMEC
Halting condition: B(c) > E(best)
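The enumerate-then-minimize loop on this slide can be sketched as follows. This is a simplified illustration, not the talk's code: the A* search is replaced by a pre-sorted list of (lower bound, conformation) pairs, `minimize` stands in for full energy minimization, and all names and numbers are hypothetical.

```python
def find_min_gmec(candidates, minimize):
    """Scan conformations in non-decreasing order of their energy lower bound B(c).

    Each conformation is fully minimized until B(c) exceeds the best minimized
    energy found so far; no later conformation can then beat E(best).
    Returns (best conformation, its minimized energy, number of minimizations).
    """
    best_conf, best_energy = None, float("inf")
    evaluated = 0
    for bound, conf in candidates:
        if bound > best_energy:  # halting condition B(c) > E(best)
            break
        energy = minimize(conf)
        evaluated += 1
        if energy < best_energy:
            best_conf, best_energy = conf, energy
    return best_conf, best_energy, evaluated

# Toy example: lower bounds (sorted) and made-up minimized energies.
candidates = [(-10.0, "c1"), (-9.0, "c2"), (-7.5, "c3"), (-6.0, "c4")]
minimized = {"c1": -8.0, "c2": -8.5, "c3": -7.0, "c4": -5.0}
best_conf, best_energy, evaluated = find_min_gmec(candidates, minimized.get)
```

The loop is only correct if every minimized energy is at least its lower bound and the candidates arrive in bound order, which is what an A* search over lower bounds is assumed to provide here.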
Hybrid-K*: Ensembles Method
Volume filter → seq1 … seqn
For each sequence: C → DEE pruning → C’ → A* search (E lower bounds) → … → full E minimization → p’, q*
If q* < (1-ε)q: repeat search
If q* ≥ (1-ε)q: the ε-approximation for that sequence is reached
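One way to check the condition q* ≥ (1-ε)q without knowing q is to bound q from above. The sketch below does that under explicit assumptions that are not stated on the slides: p’ is treated as an upper bound on the partition-function contribution of the pruned conformations, the A* search is replaced by a pre-sorted list, the temperature is taken as 298.15 K, and all names and energies are hypothetical.

```python
import math

RT = 1.987e-3 * 298.15  # kcal/mol; gas constant times an assumed temperature

def approx_partition_function(candidates, minimize, p_prime, epsilon):
    """Accumulate q* over conformations given in non-decreasing lower-bound order.

    `candidates` must list every unpruned conformation as (lower bound, conf).
    Every conformation not yet minimized has energy >= the current bound, so
    q <= q* + remaining * exp(-bound / RT) + p_prime. If q* reaches (1 - epsilon)
    times that upper bound, then q* >= (1 - epsilon) * q as well.
    Returns (q_star, reached).
    """
    candidates = list(candidates)
    q_star = 0.0
    remaining = len(candidates)
    for bound, conf in candidates:
        rest_upper = remaining * math.exp(-bound / RT)
        if q_star >= (1.0 - epsilon) * (q_star + rest_upper + p_prime):
            return q_star, True
        q_star += math.exp(-minimize(conf) / RT)
        remaining -= 1
    # All unpruned conformations were minimized; only p_prime is unaccounted for.
    return q_star, q_star >= (1.0 - epsilon) * (q_star + p_prime)

# Toy example with made-up lower bounds and minimized energies (kcal/mol).
candidates = [(-5.0, "c1"), (-4.0, "c2"), (-1.0, "c3"), (0.0, "c4")]
minimized = {"c1": -4.5, "c2": -3.8, "c3": -0.5, "c4": 0.5}
q_star, reached = approx_partition_function(candidates, minimized.get, 0.0, 0.03)
q_bad, reached_bad = approx_partition_function(candidates, minimized.get, 1e6, 0.03)
```

When `reached` is False because p’ is too large, the slide's "repeat search" branch applies: the search would be rerun with less aggressive pruning.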
Hybrid-K*: Inter-mutation Pruning
Volume filter → seq1 … seqn
For each sequence: C → DEE pruning → C’ → A* search (E lower bounds) → … → full E minimization → p’, q*
seq1: q* ≥ (1-ε)q
seqn: the scores K*_i and Ķ*_n are compared; when K*_i >>> Ķ*_n, seqn is left at q* < (1-ε)q
Hybrid-K*: Intra-mutation Pruning
Volume filter → seq1 … seqn
For each sequence: C → DEE pruning → C’ → A* search (E lower bounds) → … → full E minimization → p’, q*
seq1 and seqn: q* ≥ (1-ε)q
Results
Previous: Traditional-DEE (single structure, rigid energies); K* (RECOMB’04) (ensembles, E minimization)
This work: MinDEE: GMEC-based (single structure, E minimization); MinDEE: Ensembles (ensembles, E minimization)
Structural Model
GrsA-PheA Active Site, 1AMU (Conti et al., 1997)
Residues: 39 (flexible: 9; steric shell: 30)
Flexible ligand; AMP
Richardsons’ rotamer library
AMBER (vdW, elect, dihed) + EEF1
2-point mutation search for Leu; GAVLIFYWM allowed
Positions: 235 236 239 278 299 301 322 330 331
Residue labels (as extracted): D A W T I C
Comparison to Traditional-DEE
minGMEC vs. trad-GMEC
minGMEC positions: 235 236 239 278 299 301 322 330 331
Labels (as extracted): D M W T I A* C / 2 5 3 6 - 9
* minGMEC rotamer pruned by traditional-DEE
trad-GMEC ranked 397th
E(minGMEC) < E(rigid-GMEC) by ≈ 6 kcal/mol
Hybrid-K*
Computational: 9 hrs. on 24 processors
Stage            Conf. Remaining    Pruning Factor (%)
Initial          6.8 x 10^8         -
Volume Filter    2.04 x 10^8        3.33 (70.0)
MinDEE Filter    4.13 x 10^6        49.43 (98.0)
Steric Filter    3.86 x 10^6        1.07 (6.5)
A* Filter        7.82 x 10^4        49.41 (98.0)
Original K* fully-evaluated 30% more conformations
K* w/o filters: ≈ 3,263 days
Top 40 Mutations – Hybrid-K* Predictions
T278M/A301G (Stachelhaus et al., 1999) ranked 3rd
G301 in all known natural Leu adenylation domains
Experimental verification
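The pruning factors in the filter table can be recomputed from the "Conf. Remaining" column as the ratio of conformations before and after each filter. The short check below assumes that reading of the table; because the slide reports rounded counts, the recomputed factors for the MinDEE and A* filters come out near 49.4 rather than matching the last digits shown.

```python
# Conformation counts copied from the slide; the helper name is hypothetical.
stages = [
    ("Initial", 6.8e8),
    ("Volume Filter", 2.04e8),
    ("MinDEE Filter", 4.13e6),
    ("Steric Filter", 3.86e6),
    ("A* Filter", 7.82e4),
]

def pruning_factors(stages):
    """Return (filter name, before/after ratio, percent pruned) for each filter."""
    rows = []
    for (_, before), (name, after) in zip(stages, stages[1:]):
        factor = before / after
        percent = 100.0 * (1.0 - after / before)
        rows.append((name, factor, percent))
    return rows

rows = pruning_factors(stages)
overall = stages[0][1] / stages[-1][1]  # combined reduction across all filters

for name, factor, percent in rows:
    print(f"{name:<14} {factor:6.2f} ({percent:.1f}%)")
```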
MinDEE/A*
Top 40 Mutations – MinDEE/A*
Ew = 12.5 kcal/mol; 4 days on a single processor
206 of 421 rotamers pruned
over 60,000 extracted conformations
7,261 conformations (221 unique sequences) within Ew
minGMEC: A236M/A322M
[Figure: Rotamer Diversity for A236M/A322M]
[Figure: Conf Energies vs. RMSD for A236M/A322M]
Conclusions and Future Work
Conclusions:
Traditional-DEE not correct with energy minimization
MinDEE provably-correct and efficient
MinDEE capable of returning lower-energy conformations
Ensemble-based and GMEC-based redesign predictions are substantially different
MinDEE: Ensembles Method successfully predicts both known and novel redesigns
Future work:
Improve MinDEE pruning efficiency
Improve model accuracy
Marriage of MinDEE and BD
Acknowledgments
Bruce Donald, Ryan Lilien, Amy Anderson, Serkan Apaydin, John MacMaster, Tony Yan, all members of Donald Lab
Funding: NIH, NSF