Modeling Protein Flexibility with Spatial and Energetic Constraints

Yi-Chieh Wu (1), Amarda Shehu (2), Lydia Kavraki (2,3)

(1) Dept. of Electrical and Computer Engineering, Rice University
(2) Dept. of Computer Science, Rice University
(3) Dept. of Bioengineering, Rice University

Conclusions
  - Provided an approach to generating physical conformations of a protein
  - Modeled flexibility of the binding site
  Future work
    - Investigate other modes of motion
    - Incorporate multiple motion vectors

Acknowledgements
  - Computing Research Association's Committee on the Status of Women in Computing Research Distributed Mentor Project
  - W. M. Keck Center Undergraduate Research Training Program
  - Physical and Biological Computing Group, Rice University

For questions, comments, and preprint requests: Yi-Chieh Wu

References
  - A. A. Canutescu and R. L. Dunbrack. Cyclic coordinate descent: A robotics algorithm for protein loop closure. Protein Science, 12.
  - A. Shehu. (2004). Sampling Biomolecular Conformations with Spatial and Energetic Constraints. MS Thesis, Rice University.

Problem Statement
  Problem Definition
    - Generate a set of conformations that capture the most important motions
    - Follow along collective modes of motion starting from an initial structure
    - Limited by local search: analysis fails far from the native
  Motivation
    - Most current methods consider proteins as rigid structures
    - Models incorporating protein flexibility provide better representations

Protein Flexibility
  Applications
    - Protein native state behavior
    - Molecular interactions
    - Drug design and discovery

Model System
  HIV-1 protease
    - A viral protein that assists in HIV replication
    - Target of drug design: single point of failure
    - Native structure: fully minimized structure of 4HVP from the Protein Data Bank
  Principal component analysis (PCA)
    - Identifies major modes of motion
    - Direct physical interpretations
    - HIV-1 protease: the first eigenvector corresponds to opening and closing of the flaps surrounding the binding site
  Figure 4. Backbone representation of HIV-1 protease bound to an inhibitor (orange).

Feature Definition
  - Features: residues with constrained positions
  - Choose atoms with the largest displacements (Figure 5)
  - Internal features moved along the first principal component: capture flap movement
  - End features unmoved: keep the rest of the protein native-like to maintain low energy
  Figure 5. Atom displacements along the first principal component. Red circles mark the indices of our chosen features.

Spatial and Energetic Constraints
  Spatial Constraints
    - Inverse kinematics: CCD
    - Features defined along the backbone, so sidechains are kept rigid
    - Displacement only valid in a small neighborhood
  Energetic Constraints
    - Full conjugate gradient minimization of CHARMM energy
    - Energy cutoff of 600 kcal/mol
    - "Rewind" to the previous conformation if a high-energy barrier is encountered

Algorithm
  Input
    - Protein (native)
    - PCA vector, step size
    - Features
    - Energy cutoff
    - Randomization, CCD, and minimization parameters
  Program
    1. Initialize protein and features
    2. Move features by PCA
    3. Use CCD to satisfy features
    4. Minimize energy
    5. Check energy
       - within cutoff: proceed
       - outside cutoff: rewind to previous conformation
  Output
    - Conformations
    - Time
    - Analysis: closure satisfaction, energies, RMSD

Results
  RMSDs of Recovered Conformations along the First Principal Component
  (The highest RMSD as measured against the native structure is given. RMSD is measured in angstroms.)
    Columns: Step Size | Flap All-Atom RMSD | Flap Backbone RMSD | Flap Sidechain RMSD | Rest All-Atom RMSD | Total All-Atom RMSD
    Rows: Close, Open
  Figure 6. Backbone representation of flap movement along the first principal component. Features used are shown as gray spheres.

Discussion
  - Modeled flap movement of HIV-1 protease using the first principal component
  - Opened and closed the flaps but kept the protein stable
  - Movement concentrated in the flaps
  - Open-flap conformations are less constrained: recovered conformations with higher RMSDs

Energy Landscape
  - Funnel-shaped → thermodynamically stable native structure
  - Varying energetic constraints → non-symmetric for open- and close-flap conformations
  - More conformations around the native

Proteins as Robotic Manipulators
  Rigid geometry model
    - Dihedrals are the only degrees of freedom
    - Reduce problem dimensionality
  Robotic Representation
    - Atoms ≡ joints
    - Bonds ≡ links
    - Apply robotic techniques
  Cyclic Coordinate Descent (CCD)
    - Iterative, heuristic approach to solving inverse kinematics
    - Adjusts one dihedral at a time to move an atom to its constrained position
    - Computationally fast and analytically simple
  Figure 1†. Rigid geometry model. Only dihedral angles are used as degrees of freedom. Backbone dihedrals (phi and psi) are depicted.
  Figure 2. A protein modeled as an articulated mechanism.
  Figure 3†. Using CCD to satisfy spatial constraints. One joint (circled in green) is rotated at a time to bring the end-effector (blue) closer to the target position (red).
  † Figures adapted from: I. Lotan. (2004). Algorithms exploiting the chain structure of proteins. PhD Thesis, Stanford University.
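
The CCD idea described above (rotate one dihedral at a time so that a chosen atom moves toward its constrained position) can be illustrated with a minimal sketch. This is not the poster's implementation: it assumes a bare chain of points standing in for backbone atoms, treats the last atom as the end-effector, and uses a single position target. The names `rotate_about_axis` and `ccd_close` are hypothetical. Each rotation is about a bond axis, so bond lengths and bond angles stay fixed, as in the rigid geometry model.

```python
import numpy as np

def rotate_about_axis(points, origin, axis, theta):
    """Rotate points by theta about the line through origin along unit vector axis."""
    v = points - origin
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    # Rodrigues' rotation formula, applied row by row.
    rotated = (v * cos_t
               + np.cross(axis, v) * sin_t
               + np.outer(v @ axis, axis) * (1.0 - cos_t))
    return rotated + origin

def ccd_close(coords, target, tol=1e-3, max_sweeps=200):
    """Toy CCD: bring the last atom of a chain as close as possible to target.

    coords is an (N, 3) array of consecutive atom positions (an assumption of
    this sketch). Only rotations about bonds are allowed.
    """
    coords = np.array(coords, dtype=float)  # work on a copy
    n = len(coords)
    for _ in range(max_sweeps):
        for i in range(n - 2):
            # Bond i -> i+1 is the rotation axis; atoms i+2.. move with it.
            origin = coords[i + 1]
            axis = coords[i + 1] - coords[i]
            axis /= np.linalg.norm(axis)
            r = coords[-1] - origin          # end-effector relative to the axis
            f = target - origin              # target relative to the axis
            r_perp = r - (r @ axis) * axis   # component that actually rotates
            # Angle that minimizes the end-effector-to-target distance.
            theta = np.arctan2(f @ np.cross(axis, r_perp), f @ r_perp)
            coords[i + 2:] = rotate_about_axis(coords[i + 2:], origin, axis, theta)
        if np.linalg.norm(coords[-1] - target) < tol:
            break
    return coords, np.linalg.norm(coords[-1] - target)

# Usage on a small made-up chain (not protein coordinates).
chain = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]], dtype=float)
closed, residual = ccd_close(chain, np.array([2.0, 0.5, 0.5]))
print("residual distance:", residual)
```

Because each step picks the best angle for one joint with the others held fixed, the distance to the target never increases, but convergence to the target is not guaranteed, which matches the "heuristic" label above.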
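
The overall sampling loop (move features along the PCA vector, satisfy them, minimize, check the energy, rewind on failure) might be sketched as below. Everything here is an assumption made for illustration: `sample_along_mode` and its parameters are hypothetical names, the spatial-constraint solver, minimizer, and energy function are passed in as callables rather than calling CCD or CHARMM, and the walk simply stops after a rewind, which the poster does not specify.

```python
import numpy as np

def sample_along_mode(native, mode, feature_idx, step, n_steps,
                      satisfy, minimize, energy, cutoff):
    """Walk along one motion vector, keeping only conformations under the cutoff.

    native, mode : (N, 3) arrays (coordinates and per-atom displacement vector)
    feature_idx  : indices of the atoms whose positions are constrained
    satisfy      : callable(conf, feature_idx, targets) -> conf (CCD stand-in)
    minimize     : callable(conf) -> conf (energy-minimizer stand-in)
    energy       : callable(conf) -> float
    """
    conformations = [np.array(native, dtype=float)]
    for _ in range(n_steps):
        previous = conformations[-1]
        # Move the features a small step along the motion vector.
        targets = previous[feature_idx] + step * mode[feature_idx]
        candidate = satisfy(previous.copy(), feature_idx, targets)
        candidate = minimize(candidate)
        if energy(candidate) > cutoff:
            # "Rewind": discard the candidate; the previous conformation stays
            # the last accepted one. Stopping here is this sketch's assumption.
            break
        conformations.append(candidate)
    return conformations

# Toy stand-ins so the sketch runs; none of these model real protein physics.
native = np.zeros((4, 3))
mode = np.zeros((4, 3))
mode[1] = [1.0, 0.0, 0.0]

def snap(conf, idx, targets):
    conf[idx] = targets          # place features exactly on their targets
    return conf

def toy_energy(conf):
    return 100.0 * float(np.sum((conf - native) ** 2))

walk = sample_along_mode(native, mode, [1], step=1.0, n_steps=10,
                         satisfy=snap, minimize=lambda c: c,
                         energy=toy_energy, cutoff=600.0)
print("accepted conformations:", len(walk))
```

Passing the solver, minimizer, and energy as callables keeps the loop independent of any particular force field or closure method, so the same skeleton could wrap a real CCD routine and minimizer.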