Analysis of Biomolecular Interactions Using a Robotics-Inspired Approach with Applications to Tissue Engineering

David Schwarz 1, Mark Moll 1, Lydia E. Kavraki 3, Allison Heath 1
1 Dept. of Computer Science, Rice University; 2 Dept. of Chemistry, Rice University; 3 Dept. of Computer Science and Dept. of Bioengineering, Rice University

Acknowledgements
Work on this paper by the authors has been supported in part by NSF, EIA, a Texas ATP grant, a Whitaker Biomedical Engineering Grant and a Sloan Fellowship to Lydia Kavraki. David Schwarz has been partially supported by a National Defense Science and Engineering Graduate Fellowship from the Office of Naval Research and a President's Graduate Fellowship from Rice University.

Applications to molecular simulation
  Protein structure prediction
  Computer-aided pharmaceutical design: Modeling receptor flexibility

Two known structures of HIV-1 protease, a protein vital to the life cycle of the human immunodeficiency virus, bound to inhibitors. A pharmaceutical company screening the bulky inhibitor on the right, but only testing it on the closed protein structure on the left, would fail to identify it as a potential inhibitor, and therefore a potential drug.

1) Generation of molecular dynamics simulation trajectory
  a) Start with known protein structure (from RCSB Protein Data Bank)
  b) Run 2 nanosecond simulation (1,000,000 steps)
HIV-1 protease structures generated by molecular dynamics

2) Determination of collective coordinates by principal component analysis (PCA) of trajectory
  a) Singular value decomposition on representative conformations from trajectory
  b) Output: Set of vectors representing coordinated motions of receptor, in order of decreasing contribution to overall variation of structure

First principal component of HIV-1 protease from simulation of structure 4HVP

Geometric Space Search: Molecular Expansive Spaces
  Loosely based on Expansive Spaces Tree (EST) path planning algorithm from robotics
  Designed for rapid coverage of space
  Here we adapt an EST-like method for coverage of molecular conformation spaces

Algorithm:
  Existing point chosen randomly for expansion based on:
    Energy of explored points
    Average distance to nearest neighbors
    Number of times point has already been used for expansion
  New point generated within set radius of chosen point
  Two candidate methods to get new point:
    Simple (Gaussian neighbor generation)
    More complex (Random bounce walk)

Illustration of space-covering properties of expansive spaces search. Each point represents a conformation of the receptor. a) Expansive search b) Random walk

Work in Progress and Future Work
  Experiments to determine effectiveness of search algorithm independent of physical model
  Molecular docking experiments on results of search to determine usefulness as drug-design target structures
  Experiments with alternative parameterizations (such as dihedral coordinates)

Results
[Charts: Average pairwise distance of generated structures (Å RMSD); Search set diameter (expansiveness) (Å RMSD)]
Results are for conformational searches of HIV-1 protease starting from PDB structures 1AID and 4HVP and FK506-binding protein (FKBP) starting from PDB structures 1A7X-A and 1FKR-17.
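Step 2 above (singular value decomposition of trajectory conformations) can be illustrated with a few lines of NumPy. This is a minimal sketch, not the authors' code. It assumes the conformations are already superimposed on a common reference and flattened to rows of 3N Cartesian coordinates. The function names and the synthetic "trajectory" are hypothetical stand-ins for real simulation frames.

```python
import numpy as np

def collective_coordinates(conformations, n_components=5):
    """PCA of a trajectory via SVD.

    conformations: (n_frames, 3 * n_atoms) array, assumed pre-aligned.
    Returns the mean structure, the leading principal directions (rows),
    and the fraction of total variance each direction explains.
    """
    X = np.asarray(conformations, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    # Economy SVD: rows of vt are orthonormal directions, already sorted
    # by decreasing singular value (decreasing contribution to variance).
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2 / (X.shape[0] - 1)
    fraction = variance / variance.sum()
    return mean, vt[:n_components], fraction[:n_components]

def to_reduced(x, mean, components):
    """Project a full conformation onto the collective coordinates."""
    return (x - mean) @ components.T

def from_reduced(z, mean, components):
    """Rebuild an approximate full conformation from reduced coordinates."""
    return mean + z @ components

# Synthetic stand-in for a trajectory: one dominant coordinated motion
# plus small independent noise (hypothetical data, 10 atoms, 200 frames).
rng = np.random.default_rng(0)
n_frames, n_atoms = 200, 10
direction = rng.normal(size=3 * n_atoms)
direction /= np.linalg.norm(direction)
amplitudes = rng.normal(scale=5.0, size=(n_frames, 1))
frames = amplitudes * direction + rng.normal(scale=0.1, size=(n_frames, 3 * n_atoms))

mean, components, fraction = collective_coordinates(frames, n_components=5)
reduced = to_reduced(frames, mean, components)
```

In this toy case the first component should recover the planted motion (up to sign), and `reduced` has five columns per frame instead of thirty.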
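The expansive search outlined above can be sketched as follows. The poster names the three selection criteria (energy, average distance to nearest neighbors, expansion count) but not how they are combined, so the weighting here is an assumption made purely for illustration: a Boltzmann-like energy factor times neighbor sparsity, divided by one plus the expansion count. `toy_energy` is a hypothetical placeholder for a molecular force field. Only the simple Gaussian neighbor generation is shown.

```python
import numpy as np

def toy_energy(z):
    """Hypothetical placeholder energy; a real run would use a force field."""
    return 0.05 * float(z @ z)

def selection_weights(points, energies, counts, k=3, temperature=1.0):
    """Assumed weighting: favor low energy, sparse neighborhoods, few expansions."""
    pts = np.asarray(points)
    if len(pts) > 1:
        dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
        # Column 0 after sorting is the zero self-distance; skip it.
        sparsity = np.sort(dist, axis=1)[:, 1:k + 1].mean(axis=1)
    else:
        sparsity = np.ones(1)
    e = np.asarray(energies)
    boltzmann = np.exp(-(e - e.min()) / temperature)
    w = boltzmann * sparsity / (1.0 + np.asarray(counts))
    return w / w.sum()

def gaussian_neighbor(center, radius, rng):
    """Gaussian step around center, clipped to stay within the set radius."""
    step = rng.normal(scale=radius / 3.0, size=center.shape)
    norm = np.linalg.norm(step)
    if norm > radius:
        step *= radius / norm
    return center + step

def expansive_search(start, n_points, radius, rng, energy=toy_energy):
    """Grow a set of points by repeatedly expanding a randomly chosen member."""
    points = [np.asarray(start, dtype=float)]
    energies = [energy(points[0])]
    counts = [0]
    while len(points) < n_points:
        weights = selection_weights(points, energies, counts)
        i = rng.choice(len(points), p=weights)
        counts[i] += 1
        new_point = gaussian_neighbor(points[i], radius, rng)
        points.append(new_point)
        energies.append(energy(new_point))
        counts.append(0)
    return np.array(points)

# Search a five-dimensional space, e.g. one spanned by five collective coordinates.
rng = np.random.default_rng(1)
found = expansive_search(np.zeros(5), n_points=200, radius=1.0, rng=rng)
```

Recomputing all pairwise distances at every iteration keeps the sketch short. A practical implementation would update neighbor information incrementally.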
RMSD = Root Mean Squared Distance
[Charts: Average distance to known structures (binding site RMSD) and Average distance to known structures (all atom RMSD), each in Å RMSD with standard deviation; panel label: FKBP]

Author list (continued): Cecilia Clementi 2

Why model protein flexibility?
[Figure labels: HIV-1 protease; Inhibitors (drug candidates)]

Our approach
  Explicitly modeling receptor flexibility is computationally impossible
  Dimensional reduction: Collective coordinates
  Powerful search algorithm: Expansive spaces search

Dimensional reduction: Collective Coordinates
  Collective coordinates = reduced basis for motion of the receptor (dimensionality reduction)
  Example: HIV-1 protease
    3120 atoms, each with three Cartesian degrees of freedom (x,y,z), for a total of 9360 dimensions: computationally intractable
    Use first five principal components as a reduced basis: a five-dimensional space is likely to be tractable

Distinct structures: At least 1 Å RMSD apart
  Monte Carlo simulation is a standard but slow conformational search method
  Expansive search generates more distinct structures than Monte Carlo, and the complex neighbor generation scheme works best

Set diameter: Maximum distance between any two structures in result set
  Expansive search consistently generates broader search sets than random walk or Monte Carlo simulation
  Indicates better coverage of conformation space
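The coverage measures defined above (average pairwise distance, set diameter, and number of distinct structures at least 1 Å RMSD apart) can be computed as in the sketch below. It assumes structures are given as pre-aligned (n_atoms, 3) coordinate arrays. The greedy counting rule for distinct structures is an assumption, since the poster gives only the 1 Å cutoff. The four two-atom "structures" are made-up toy data.

```python
import numpy as np

def pairwise_rmsd(structures):
    """All-pairs RMSD matrix for pre-aligned structures of shape (n, n_atoms, 3)."""
    s = np.asarray(structures, dtype=float)
    diff = s[:, None] - s[None, :]                 # (n, n, n_atoms, 3)
    return np.sqrt((diff ** 2).sum(axis=-1).mean(axis=-1))

def average_pairwise_distance(dist):
    """Mean RMSD over all unordered pairs of distinct structures."""
    upper = np.triu_indices(dist.shape[0], k=1)
    return float(dist[upper].mean())

def set_diameter(dist):
    """Maximum distance between any two structures in the result set."""
    return float(dist.max())

def count_distinct(dist, cutoff=1.0):
    """Greedy count (assumed rule): keep a structure only if it lies at
    least `cutoff` away from every structure kept so far."""
    kept = []
    for i in range(dist.shape[0]):
        if all(dist[i, j] >= cutoff for j in kept):
            kept.append(i)
    return len(kept)

# Toy data: four two-atom structures, rigidly shifted along x (in Å).
shifts = np.array([0.0, 0.5, 2.0, 3.5])
structures = np.zeros((4, 2, 3))
structures[:, :, 0] = shifts[:, None]

dist = pairwise_rmsd(structures)
print(average_pairwise_distance(dist))  # 2.0
print(set_diameter(dist))               # 3.5
print(count_distinct(dist))             # 3
```

The all-pairs matrix uses memory quadratic in the number of structures, which is fine for a sketch but would need chunking for large result sets.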