Dead-End Elimination for Protein Design with Flexible Rotamers

Presentation transcript:

Dead-End Elimination for Protein Design with Flexible Rotamers Ivelin Georgiev Donald Lab 02/19/2008

Computational Design
The input to a protein design algorithm typically includes the following: a wildtype fixed-backbone structure that is to be redesigned; a rotamer library of low-energy side-chain conformations, which discretizes the continuous side-chain conformation space and makes the computational search feasible; and an energy function for scoring and ranking the candidate structures. The energy function typically consists of some of the standard molecular mechanics energy terms (such as vdW, electrostatics, and solvation energies), but may also include statistical and other terms. Based on the input model, the protein design algorithm predicts mutations to the wildtype sequence in order to achieve a desired novel functionality, such as improving the thermostability of the protein, switching the enzyme specificity towards a novel substrate, or redesigning the protein to perform a completely novel function.
Slide labels: input structure (wildtype), rotamer library, energy function -> protein design algorithm -> mutant. Applications: stability, specificity, novel function, drug design.

Contributions
MinDEE: provable energy minimization
Ensembles (partition function): ε-approximation algorithm (q* approximating q)
Redesign for Leu

Traditional-DEE (Desmet et al., 1992)
Dead-End Elimination (DEE) is a provably-accurate deterministic algorithm that reduces the search space for protein design problems by pruning rotamers that are provably not part of the global minimum energy conformation (GMEC). Specifically, for a given residue position i, DEE compares the pruning candidate rotamer i_r (the notation i_r means rotamer identity r at residue position i) against a competitor rotamer i_t. If a lower bound on the energy of any conformation with i_r is still greater than an upper bound on the energy of any conformation with i_t, then we can always obtain a lower-energy conformation by switching from rotamer i_r to rotamer i_t. Thus, rotamer i_r can be provably pruned from further consideration, since it cannot belong to the GMEC. This pruning step is repeated for all rotamers as pruning candidates, for all competitors, and for all residue positions, until no more rotamers can be provably pruned. This typically leaves a significantly reduced set of conformations to enumerate, making the mutation search computationally feasible. Unfortunately, DEE is provably-accurate only for a fixed backbone. The question then arises: can there be an algorithm for backbone flexibility that incorporates the same provable guarantees as DEE for a fixed backbone?
Slide labels: fixed backbone/side-chains; rotamer pruning in O(q^2 n^2); E lower bound for i_r compared against E upper bound for i_t; enumerate the remaining conformations to find the GMEC.
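The pruning loop described above can be sketched in a few lines. This is a minimal illustration, assuming the standard form of the bounds (self energy plus the minimum, or maximum, pairwise energy at every other position); the data layout (`self_e[i][r]`, `pair_e[(i, j)][r][s]`) and the toy energies are hypothetical and not taken from the talk.

```python
from typing import Dict, List, Set, Tuple

def dee_prune(self_e: List[List[float]],
              pair_e: Dict[Tuple[int, int], List[List[float]]]) -> List[Set[int]]:
    """Repeatedly prune rotamer r at position i when some competitor t satisfies
    lower_bound(i_r) > upper_bound(i_t). Returns the surviving rotamers per position.

    self_e[i][r]       : rotamer-to-template energy (assumed layout)
    pair_e[(i, j)][r][s]: pairwise energy between i_r and j_s, given for i != j
    """
    n = len(self_e)
    alive = [set(range(len(row))) for row in self_e]
    changed = True
    while changed:  # sweep until no rotamer can be pruned
        changed = False
        for i in range(n):
            others = [j for j in range(n) if j != i]
            for r in sorted(alive[i]):
                for t in sorted(alive[i]):
                    if t == r:
                        continue
                    # Lower bound on any conformation containing i_r.
                    lower_r = self_e[i][r] + sum(
                        min(pair_e[(i, j)][r][s] for s in alive[j]) for j in others)
                    # Upper bound on any conformation containing i_t.
                    upper_t = self_e[i][t] + sum(
                        max(pair_e[(i, j)][t][s] for s in alive[j]) for j in others)
                    if lower_r > upper_t:
                        alive[i].discard(r)
                        changed = True
                        break
    return alive

# Toy problem: 2 positions, 2 rotamers each (made-up energies).
toy_self = [[0.0, 5.0], [0.0, 0.0]]
toy_pair = {
    (0, 1): [[-1.0, 0.0], [1.0, 2.0]],
    (1, 0): [[-1.0, 1.0], [0.0, 2.0]],  # transpose of (0, 1)
}
toy_alive = dee_prune(toy_self, toy_pair)
```

In the toy problem one rotamer survives at each position, so no enumeration is needed; in general the survivors still have to be enumerated to find the GMEC.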

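Traditional-DEE with Rigid Rotamers/Backbone
[Figure: energy vs. conformations, with competitor rotamer i_t marked]

Traditional-DEE with Side-chain Dihedral Flexibility
[Figure: energy vs. conformations, showing the min and max energies and competitor rotamer i_t]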

Traditional-DEE, rigid energies: provably-correct (√)
Traditional-DEE, E minimization: not provably-correct (X)
MinDEE, E minimization: provably-correct (√)

MinDEE
Slide labels: continuous side-chain dihedral space; voxels bound rotamer movement.
Instead of sampling the backbone dihedrals, we define restraining boxes around each residue that bound the displacement of this residue from its original pose. In effect, through kinematic constraints, these boxes bound the backbone movement. Instead of a discrete set of backbones, we thus have a continuous space of possible backbone conformations. The restraining boxes can be obtained in many ways, but we use two bounding conditions. First, we bound the displacement of Cα atoms from their original position, and second, we bound the change in the phi and psi angles for each flexible residue.

MinDEE: lower / upper energy bounds
Slide labels: E(i_r, j_s) plotted over χ_ir and χ_js; rotamers i_r, i_t, j_s.
We can then compute lower and upper energy bounds on the rotamer-to-backbone and pairwise rotamer-to-rotamer interaction energies within the restraining boxes for each rotamer. For example, we compute a lower energy bound for a rotamer pair i_r – j_s by allowing the backbone to flex in order to minimize the interaction energy between these two rotamers, within the rotamer restraining boxes. The minimization process in (phi, psi) space is shown by the red path in the figure on the right.

MinDEE: lower / upper energy bounds
Rotamer roles: i_r is the pruning candidate, i_t is the competitor, j_s is the witness.
MinDEE criterion (schematic): (lower bound on i_r conformation energies) - (upper bound on i_t conformation energies) - (possible energy changes due to rotamer movement; not in trad-DEE) > 0
The E(minus) terms represent a lower bound on the rotamer-to-template and rotamer-to-rotamer interaction energies that involve the pruning candidate rotamer i_r, and the E(plus) terms are the respective upper energy bounds that involve the competitor rotamer i_t. The main difference from the traditional-DEE criterion is the inclusion of the E(delta) term, which takes into account possible energy changes due to backbone movement.
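The three-term criterion above can be written as a small predicate. This is only a sketch: the exact form of the E(delta) term is not recoverable from the slide, so the code assumes it is the sum, over the other positions and pairs of positions, of the largest gap between upper and lower energy bounds. The data layout and the numbers are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Pair = Dict[Tuple[int, int], List[List[float]]]

def mindee_can_prune(i: int, r: int, t: int,
                     lo_self: List[List[float]], hi_self: List[List[float]],
                     lo_pair: Pair, hi_pair: Pair,
                     alive: List[Set[int]]) -> bool:
    """True if i_r can be pruned against competitor i_t under the assumed criterion."""
    others = [j for j in range(len(alive)) if j != i]
    # E(minus): lower bound on energies involving the pruning candidate i_r.
    lower_r = lo_self[i][r] + sum(
        min(lo_pair[(i, j)][r][s] for s in alive[j]) for j in others)
    # E(plus): upper bound on energies involving the competitor i_t.
    upper_t = hi_self[i][t] + sum(
        max(hi_pair[(i, j)][t][s] for s in alive[j]) for j in others)
    # E(delta), ASSUMED form: worst-case energy change of the rest of the
    # system when rotamers move inside their voxels.
    delta = sum(max(hi_self[j][s] - lo_self[j][s] for s in alive[j]) for j in others)
    delta += sum(
        max(hi_pair[(j, k)][s][u] - lo_pair[(j, k)][s][u]
            for s in alive[j] for u in alive[k])
        for j in others for k in others if k > j)
    return lower_r - upper_t - delta > 0

# Toy bounds for 2 positions with 2 rotamers each (made-up numbers).
lo_self = [[0.0, 5.0], [0.0, 0.0]]
hi_self = [[1.0, 6.0], [0.5, 0.5]]
lo_pair = {(0, 1): [[-1.0, 0.0], [1.0, 2.0]]}
hi_pair = {(0, 1): [[0.0, 1.0], [2.0, 3.0]]}
alive = [{0, 1}, {0, 1}]

prune_r1 = mindee_can_prune(0, 1, 0, lo_self, hi_self, lo_pair, hi_pair, alive)
prune_r0 = mindee_can_prune(0, 0, 1, lo_self, hi_self, lo_pair, hi_pair, alive)
```

With rigid energies the lower and upper bounds coincide, the delta term vanishes, and the check reduces to the traditional-DEE comparison.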

MinDEE: Side-chain Dihedral Flexibility
[Figure: traditional-DEE vs. MinDEE]
Note: also add some of Bruce Tidor’s summary of the MinDEE algorithm (appropriate manipulation of 1- and 2-body energy bounds) from the K* grant.

MinDEE Applications
GMEC-based (min): single lowest-energy conformation
Ensemble-based (JCB’05; ∫, 1/Z): weighted average over conformations
K*: provably-accurate approximation to the binding constant via conformational ensembles

sequence   K*
TIAAIC     7.3
GIRMQM     3.1
TGIAIV     2.9
LMLAIS     1.7
TWAIGY     0.3

MinDEE/A*: GMEC-based Method
1. Pruning, O(n^2 r^2): C -> C’
2. A* search over C’ (E lower bounds), extracting conformations in order of increasing lower bound
3. Full E minimization of each extracted conformation
4. Stop once B(c) > E(best); the best minimized conformation is the minGMEC
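The halting rule B(c) > E(best) can be sketched as follows. The sketch assumes conformations arrive in order of increasing lower bound (as an A* search would emit them) and that each lower bound B(c) never exceeds the minimized energy of c; the `minimize` callable and the toy numbers are hypothetical stand-ins.

```python
from typing import Callable, Hashable, Iterable, Optional, Tuple

def find_min_gmec(confs_by_bound: Iterable[Tuple[float, Hashable]],
                  minimize: Callable[[Hashable], float]
                  ) -> Tuple[Optional[Hashable], float, int]:
    """Scan conformations in increasing lower-bound order and minimize each one.

    Stops as soon as the next lower bound exceeds the best minimized energy,
    because no later conformation can then beat the current best.
    Returns (best conformation, its minimized energy, number of minimizations).
    """
    best_e, best_conf, evaluated = float("inf"), None, 0
    for bound, conf in confs_by_bound:
        if bound > best_e:  # B(c) > E(best): the minGMEC has been found
            break
        energy = minimize(conf)
        evaluated += 1
        if energy < best_e:
            best_e, best_conf = energy, conf
    return best_conf, best_e, evaluated

# Toy data: lower bounds (sorted) and made-up minimized energies.
toy_bounds = [(-10.0, "a"), (-9.0, "b"), (-7.5, "c"), (-6.0, "d")]
toy_minimized = {"a": -8.0, "b": -8.5, "c": -7.0, "d": -5.0}
gmec_conf, gmec_e, n_minimized = find_min_gmec(toy_bounds, toy_minimized.__getitem__)
```

Here the scan stops before minimizing "c", since its lower bound of -7.5 already exceeds the best minimized energy of -8.5.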

Hybrid-K*: Ensembles Method
For each sequence seq1 … seqn that passes the volume filter:
1. C -> DEE pruning -> C’
2. A* search over C’ (E lower bounds)
3. Full E minimization of each extracted conformation (quantities tracked: p’, q*)
4. If q* ≥ (1-ε)q, the sequence is done; if the search ends with q* < (1-ε)q, repeat the search
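The stopping test q* ≥ (1-ε)q can be illustrated with a simplified partial partition function. This sketch assumes Boltzmann weights exp(-E/RT), conformations sorted by increasing lower bound, and a crude bound on the unevaluated remainder (count times the weight at the current lower bound). It ignores the p’ term on the slide, and the value of RT and all energies are assumptions.

```python
import math
from typing import Callable, Hashable, Sequence, Tuple

RT = 0.593  # kcal/mol near room temperature (assumed value)

def partial_partition(confs_by_bound: Sequence[Tuple[float, Hashable]],
                      minimize: Callable[[Hashable], float],
                      epsilon: float, rt: float = RT) -> Tuple[float, int]:
    """Accumulate q* until it is provably within a factor (1 - epsilon) of q.

    Every unevaluated conformation has energy >= the current lower bound, so
    its weight is at most exp(-bound / rt); that caps the missing part of q.
    Returns (q*, number of conformations minimized).
    """
    q_star = 0.0
    for idx, (bound, conf) in enumerate(confs_by_bound):
        remaining = len(confs_by_bound) - idx
        rest_upper = remaining * math.exp(-bound / rt)
        if q_star >= (1.0 - epsilon) * (q_star + rest_upper):
            return q_star, idx  # q* >= (1 - epsilon) q is guaranteed
        q_star += math.exp(-minimize(conf) / rt)
    return q_star, len(confs_by_bound)

# Toy data: sorted lower bounds and made-up minimized energies.
k_bounds = [(-5.0, "a"), (-4.0, "b"), (0.0, "c"), (5.0, "d")]
k_minimized = {"a": -4.5, "b": -3.5, "c": 0.5, "d": 6.0}
q_star, n_eval = partial_partition(k_bounds, k_minimized.__getitem__, epsilon=0.05)
q_full = sum(math.exp(-e / RT) for e in k_minimized.values())
```

In the toy run only the two lowest-energy conformations are minimized, yet q* is already within 5% of the full sum.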

Hybrid-K*: Inter-mutation Pruning
Sequences seq1 … seqn pass through the same pipeline: volume filter; C -> DEE pruning -> C’; A* search (E lower bounds); full E minimization; p’, q*.
A sequence i whose search reaches q* ≥ (1-ε)q yields a score K*i.
For another sequence n, Ķ*n is compared against K*i: when K*i >>> Ķ*n, the evaluation of seqn stops even though q* < (1-ε)q.

Hybrid-K*: Intra-mutation Pruning
Sequences seq1 … seqn pass through the same pipeline: volume filter; C -> DEE pruning -> C’; A* search (E lower bounds); full E minimization; p’, q*.
The search for each sequence halts once q* ≥ (1-ε)q.

Results
Previous: Traditional-DEE (single structure, rigid energies); K* (RECOMB’04) (ensembles)
This work: MinDEE: GMEC-based (single structure, E minimization); MinDEE: Ensembles (ensembles, E minimization)

Structural Model
GrsA-PheA Active Site: 1AMU (Conti et al., 1997)
Residues: 39 (flexible: 9; steric shell: 30)
Flexible ligand; AMP
Richardsons’ rotamer library
AMBER (vdW, elect, dihed) + EEF1
2-point mutation search for Leu; GAVLIFYWM allowed
Flexible positions: 235 236 239 278 299 301 322 330 331
Residue labels on slide: D A W T I C

Comparison to Traditional-DEE
minGMEC and trad-GMEC differ.
minGMEC (positions 235 236 239 278 299 301 322 330 331): D M W T I A* C; rotamers: 2 5 3 6 - 9
* minGMEC rotamer pruned by traditional-DEE
trad-GMEC ranked 397th
E(minGMEC) < E(rigid-GMEC) by ≈ 6 kcal/mol

Hybrid-K*

Stage            Conf. Remaining   Pruning Factor (%)
Initial          6.8 x 10^8        -
Volume Filter    2.04 x 10^8       3.33 (70.0)
MinDEE Filter    4.13 x 10^6       49.43 (98.0)
Steric Filter    3.86 x 10^6       1.07 (6.5)
A* Filter        7.82 x 10^4       49.41 (98.0)

Computation: 9 hrs. on 24 processors
Original K* fully-evaluated 30% more conformations
K* w/o filters: ≈ 3,263 days

Top 40 Mutations – Hybrid-K* Predictions
T278M/A301G (Stachelhaus et al., 1999) ranked 3rd
G301 in all known natural Leu adenylation domains
Experimental verification
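The pruning factors in the table appear to be the ratio of conformations before and after each filter, with the percentage being the fraction removed. The short check below recomputes both from the rounded counts on the slide; the function name is hypothetical, and small differences from the slide values (for example 49.39 against 49.43) are consistent with the counts having been rounded for display.

```python
from typing import List, Tuple

# Conformations remaining after each stage, as printed on the slide.
stages: List[Tuple[str, float]] = [
    ("Initial", 6.8e8),
    ("Volume Filter", 2.04e8),
    ("MinDEE Filter", 4.13e6),
    ("Steric Filter", 3.86e6),
    ("A* Filter", 7.82e4),
]

def pruning_table(rows: List[Tuple[str, float]]) -> List[Tuple[str, float, float]]:
    """For each filter, return (name, before/after ratio, percent of conformations removed)."""
    out = []
    for (_, before), (name, after) in zip(rows, rows[1:]):
        factor = before / after
        percent = 100.0 * (1.0 - after / before)
        out.append((name, factor, percent))
    return out

table = pruning_table(stages)
for name, factor, percent in table:
    print(f"{name:14s} factor {factor:6.2f}  removed {percent:5.1f}%")

# Overall reduction from the initial space to what A* leaves for minimization.
overall_factor = stages[0][1] / stages[-1][1]
```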

MinDEE/A*
Top 40 Mutations – MinDEE/A*
Ew = 12.5 kcal/mol
4 days on a single processor
206 of 421 rotamers pruned
over 60,000 extracted conformations
7,261 conformations (221 unique sequences) within Ew
minGMEC: A236M/A322M
[Figures: Rotamer Diversity for A236M/A322M; Conf Energies vs. RMSD for A236M/A322M]

Conclusions and Future Work
Conclusions:
Traditional-DEE not correct with energy minimization
MinDEE provably-correct and efficient
MinDEE capable of returning lower-energy conformations
Ensemble-based and GMEC-based redesign predictions are substantially different
MinDEE: Ensembles Method successfully predicts both known and novel redesigns
Future work:
Improve MinDEE pruning efficiency
Improve model accuracy
Marriage of MinDEE and BD

Acknowledgments
Bruce Donald, Ryan Lilien, Amy Anderson, Serkan Apaydin, John MacMaster, Tony Yan, and all members of Donald Lab
Funding: NIH, NSF