Two Examples of Docking Algorithms With thanks to Maria Teresa Gil Lucientes
Example: HIV-1 Protease Active Site (Aspartyl groups) Docking to find drug candidates
Example: HIV-1 Protease Docking to find drug candidates
Why is this difficult?
• The number of possible conformations is astronomical
  – thousands of degrees of freedom (DOF)
• Free energy changes are small
  – below the accuracy of our energy functions
• Molecules are flexible
  – they alter each other’s structure as they interact
Some techniques
• Surface representation: efficiently represents the docking surface and identifies the regions of interest (cavities and protrusions)
  – Connolly surface
  – Lenhoff technique
  – Kuntz et al. clustered spheres
  – Alpha shapes
• Surface matching: matches surfaces to optimize a binding score
  – Geometric hashing
Surface Representation
• Each atomic sphere is given the van der Waals radius of the atom
• Rolling a probe sphere over the van der Waals surface leads to the solvent-reentrant surface, or Connolly surface
Lenhoff technique
• Computes a “complementary” surface for the receptor instead of the Connolly surface, i.e. it computes the possible positions for the atom centers of the ligand
[Figure labels: atom centers of the ligand; van der Waals surface]
Kuntz et al. Clustered Spheres
• Uses clustered spheres to identify cavities on the receptor and protrusions on the ligand
• Compute a sphere for every pair of surface points, i and j, with the sphere center on the normal from point i
• Regions where many spheres overlap are either cavities (on the receptor) or protrusions (on the ligand)
[Figure: surface points i and j]
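The sphere for a pair (i, j) follows from two conditions: its center lies on the normal at point i, and it passes through both points. Writing d = p_j − p_i and n for the unit normal at i, the radius is r = |d|² / (2 d·n). A minimal sketch, assuming surface points and normals are already available as 3-tuples (the function name `sphere_through_pair` is hypothetical and not part of DOCK):

```python
import math

def sphere_through_pair(p_i, n_i, p_j):
    """Sphere touching the surface at p_i (center on the normal n_i)
    that also passes through p_j. Returns (center, radius) or None."""
    # Normalise the normal so the radius is a true distance
    norm = math.sqrt(sum(c * c for c in n_i))
    n = [c / norm for c in n_i]
    d = [b - a for a, b in zip(p_i, p_j)]
    d_dot_n = sum(a * b for a, b in zip(d, n))
    # p_j must lie on the side the normal points to, otherwise no such sphere
    if d_dot_n <= 1e-12:
        return None
    radius = sum(c * c for c in d) / (2.0 * d_dot_n)
    center = tuple(p + radius * c for p, c in zip(p_i, n))
    return center, radius
```

For example, with p_i at the origin, normal (0, 0, 1) and p_j = (1, 0, 1), the sphere has radius 1 and center (0, 0, 1). How the resulting spheres are filtered and clustered is not specified on the slide and is left out here.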
Alpha Shapes
• Formalizes the idea of “shape”
• In 2D, an “edge” between two points is “alpha-exposed” if there exists a circle of radius alpha such that the two points lie on the boundary of the circle and the circle contains no other points from the point set
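The 2D definition can be checked directly: there are at most two circles of radius alpha through a given pair of points, and the edge is alpha-exposed if either one is empty. A minimal sketch, assuming points are (x, y) tuples; the function name `is_alpha_exposed` and the tolerance `eps` are illustrative choices, and alpha = infinity is treated as the limiting case where the circle becomes a half-plane:

```python
import math

def is_alpha_exposed(points, i, j, alpha, eps=1e-9):
    """True if the edge points[i]-points[j] is alpha-exposed in 2D."""
    (px, py), (qx, qy) = points[i], points[j]
    dx, dy = qx - px, qy - py
    d = math.hypot(dx, dy)
    if d == 0.0:
        return False
    others = [p for k, p in enumerate(points) if k not in (i, j)]
    if math.isinf(alpha):
        # Limiting case: all other points must lie on one side of the edge
        sides = [dx * (y - py) - dy * (x - px) for x, y in others]
        return all(s >= -eps for s in sides) or all(s <= eps for s in sides)
    if d > 2.0 * alpha:
        return False  # no circle of radius alpha passes through both points
    # The two candidate centers sit on the perpendicular bisector of the edge
    h = math.sqrt(max(alpha * alpha - (d / 2.0) ** 2, 0.0))
    mx, my = (px + qx) / 2.0, (py + qy) / 2.0
    ux, uy = -dy / d, dx / d
    for sign in (1.0, -1.0):
        cx, cy = mx + sign * h * ux, my + sign * h * uy
        if all(math.hypot(x - cx, y - cy) >= alpha - eps for x, y in others):
            return True
    return False
```

This brute-force test costs O(n) per edge; practical alpha-shape codes usually derive the exposed edges from a Delaunay triangulation instead.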
Alpha Shapes: Example
[Figure panels: alpha = infinity; alpha = 3.0 Å]
Surface Matching
• Find the transformation (rotation + translation) that maximizes the number of matching surface points from the receptor and the ligand
  – First satisfy steric constraints: find the best fit of the receptor and ligand using only geometrical constraints
  – Then use energy calculations to refine the docking: select the fit that has the minimum energy
Geometric Hashing
Building the hash table:
  – For each triplet of points from the ligand, generate a unique system of reference
  – Store the position and orientation of all remaining points in this coordinate system in the hash table
Searching in the hash table:
  – For each triplet of points from the receptor, generate a unique system of reference
  – Compute the coordinates of each remaining receptor point in this system and find the appropriate hash table bin; for every entry there, vote for the basis
Geometric Hashing
  – Determine the entries that received more than a threshold number of votes; each such entry corresponds to a potential match
  – For each potential match, recover the transformation T that gives the best least-squares match between all corresponding triplets
  – Transform the features of the model according to the recovered transformation T and verify it. If the verification fails, choose a different receptor triplet and repeat the search.
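The table-building and voting steps can be sketched as follows. This is an illustrative toy version, not the code of any docking program: it assumes points are 3-tuples, builds the reference frame from an ordered triplet (origin at the first point, x axis toward the second, z axis perpendicular to the triplet plane), and quantizes local coordinates with a hypothetical `bin_size`. The least-squares recovery of T and the verification step are omitted.

```python
import itertools
import math
from collections import Counter, defaultdict

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(v):
    n = math.sqrt(_dot(v, v))
    return None if n < 1e-9 else tuple(x / n for x in v)

def frame_from_triplet(a, b, c):
    """Orthonormal frame defined by an ordered, non-collinear triplet."""
    x = _unit(_sub(b, a))
    if x is None:
        return None
    z = _unit(_cross(x, _sub(c, a)))
    if z is None:
        return None  # collinear points define no unique frame
    return a, (x, _cross(z, x), z)

def local_key(p, frame, bin_size):
    """Quantized coordinates of p in the given frame (the hash key)."""
    origin, axes = frame
    rel = _sub(p, origin)
    return tuple(round(_dot(rel, axis) / bin_size) for axis in axes)

def build_hash_table(ligand, bin_size=1.0):
    """Map each bin to the ligand bases under which some point falls there."""
    table = defaultdict(list)
    for basis in itertools.permutations(range(len(ligand)), 3):
        frame = frame_from_triplet(*(ligand[k] for k in basis))
        if frame is None:
            continue
        for k, p in enumerate(ligand):
            if k not in basis:
                table[local_key(p, frame, bin_size)].append(basis)
    return table

def vote(table, receptor, basis, bin_size=1.0):
    """Votes per ligand basis for one receptor triplet."""
    votes = Counter()
    frame = frame_from_triplet(*(receptor[k] for k in basis))
    if frame is None:
        return votes
    for k, p in enumerate(receptor):
        if k not in basis:
            for ligand_basis in table.get(local_key(p, frame, bin_size), ()):
                votes[ligand_basis] += 1
    return votes
```

Calling `vote(build_hash_table(ligand), receptor, (0, 1, 2))` returns a counter whose highest-scoring ligand bases are the candidate matches to pass on to the least-squares step. Two simplifications are worth noting: points near a bin boundary can be missed unless neighbouring bins are also searched, and enumerating all ordered triplets scales as O(n⁴) in table size, so real implementations restrict which triplets are used.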
Example Docking Programs
• DOCK (I. D. Kuntz, UCSF)
• AutoDock (A. Olson, Scripps)
• RosettaDock (Baker, U Wash.; Gray, JHU)
More information in:
DOCK
DOCK works in 5 steps:
• Step 1: Start with the coordinates of the target receptor
• Step 2: Generate the molecular surface for the receptor
• Step 3: Fill the active site of the receptor with spheres
  – potential locations for ligand atoms
• Step 4: Match sphere centers to ligand atoms
  – determines possible orientations for the ligand
• Step 5: Find the top-scoring orientation
Other Docking Programs
AutoDock
  – Designed to dock flexible ligands into receptor binding sites
  – Has a range of powerful optimization algorithms
RosettaDock
  – Models physical forces
  – Creates a large number of decoys
  – Degeneracy after clustering is the final criterion in selecting decoys to output
A Protein-Protein Docking Algorithm (Gray & Baker)
• Goal: to predict protein-protein complexes from the coordinates of the unbound monomer components
• Two steps: a low-resolution Monte Carlo search and a final optimization using Monte Carlo minimization
• Up to 10^5 independent simulations produce “decoys” that are ranked using an energy function
• The top-ranking decoys are clustered for output
Docking protocol
Docking protocol: Step 1
RANDOM START POSITION
• Creation of a decoy begins with a random orientation of each partner and a translation of one partner along the line of protein centers to create a glancing contact between the proteins
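One way to picture this step is sketched below: each partner is rotated about its own centroid by a uniformly random rotation, and the second partner is then slid inward along the line of centers until the closest pair of atoms is within a contact distance. This is a toy illustration, not the authors' implementation; the function names, the `contact` distance, and the `step` size are all assumptions.

```python
import math
import random

def random_rotation(rng):
    """Uniform random rotation matrix from a normalised 4D Gaussian quaternion."""
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    n = math.sqrt(sum(c * c for c in q))
    w, x, y, z = (c / n for c in q)
    return ((1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)),
            (2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)),
            (2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)))

def centroid(points):
    return tuple(sum(p[k] for p in points) / len(points) for k in range(3))

def rotate_about_centroid(points, rot):
    c = centroid(points)
    rotated = []
    for p in points:
        r = [p[k] - c[k] for k in range(3)]
        rotated.append(tuple(c[i] + sum(rot[i][k] * r[k] for k in range(3))
                             for i in range(3)))
    return rotated

def min_distance(a_pts, b_pts):
    return min(math.dist(a, b) for a in a_pts for b in b_pts)

def random_start(a_pts, b_pts, rng, contact=4.0, step=0.5):
    """Randomly orient both partners, then slide B toward A until they touch."""
    a = rotate_about_centroid(a_pts, random_rotation(rng))
    b = rotate_about_centroid(b_pts, random_rotation(rng))
    ca, cb = centroid(a), centroid(b)
    axis = [cb[k] - ca[k] for k in range(3)]
    length = math.sqrt(sum(c * c for c in axis))
    axis = [c / length for c in axis] if length > 1e-9 else [1.0, 0.0, 0.0]
    # Start far enough apart that no atom pair can be within the contact distance
    extent = max(math.dist(ca, p) for p in a) + max(math.dist(cb, p) for p in b)
    sep = extent + contact + step
    while True:
        shift = [ca[k] + sep * axis[k] - cb[k] for k in range(3)]
        moved = [tuple(p[k] + shift[k] for k in range(3)) for p in b]
        if min_distance(a, moved) <= contact or sep <= 0.0:
            return a, moved
        sep -= step
```

The brute-force `min_distance` is quadratic in the number of atoms, which is acceptable for a sketch but would be replaced by a spatial grid or neighbor list for real proteins.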
Docking protocol: Step 2
LOW-RESOLUTION MONTE CARLO SEARCH
• Low-resolution representation: N, Cα, C, O for the backbone and a “centroid” for the side-chain
• One partner is translated and rotated around the surface of the other through 500 Monte Carlo move attempts
• The score terms: a reward for contacting residues, a penalty for overlapping residues, an alignment score, residue environment and residue-residue interactions
Docking protocol: Step 3
HIGH-RESOLUTION REFINEMENT
• Explicit side-chains are added to the protein backbones using a rotamer packing algorithm, thus changing the energy surface
• An explicit minimization finds the nearest local minimum accessible via rigid body translation and rotation
• Start and finish positions are compared by the Metropolis criterion
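The Metropolis criterion accepts a move that lowers the energy and accepts an uphill move with probability exp(−ΔE / T). A minimal sketch of the criterion and of a generic perturb, minimize, accept loop is shown below. The function names, the `temperature` value, and the callables `energy`, `perturb`, and `minimize` are placeholders supplied by the caller, not the actual scoring or packing routines of the protocol.

```python
import math
import random

def metropolis_accept(e_start, e_finish, temperature, rng):
    """Accept downhill moves always, uphill moves with Boltzmann probability."""
    if e_finish <= e_start:
        return True
    return rng.random() < math.exp(-(e_finish - e_start) / temperature)

def monte_carlo_minimization(energy, perturb, minimize, x0, rng,
                             n_cycles=50, temperature=1.0):
    """Each cycle: perturb the current state, minimize it locally,
    then compare start and finish energies by the Metropolis criterion."""
    current = minimize(x0)
    e_current = energy(current)
    best, e_best = current, e_current
    for _ in range(n_cycles):
        trial = minimize(perturb(current, rng))
        e_trial = energy(trial)
        if metropolis_accept(e_current, e_trial, temperature, rng):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best
```

Because the comparison is made between minimized states, the search effectively moves between local minima rather than across the raw energy surface, which is the point of combining Monte Carlo moves with minimization.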
Docking protocol: Step 3
• Before each cycle, the position of one protein is perturbed by random translations and by random rotations
• To simultaneously optimize the side-chain conformations and the rigid body position, the side-chain packing and the minimization operations are repeated 50 times
Docking protocol: Step 3
COMPUTATIONAL EFFICIENCY
1. The packing algorithm usually varies the conformation of one residue at a time; rotamer optimization is performed once every eight cycles
2. Periodically filter to detect and reject inferior decoys without further refinement
Docking protocol: Step 4
CLUSTERING & PREDICTIONS
• Repeat the search to create approximately 10^5 decoys per target
• Cluster the best 200 decoys by a hierarchical clustering algorithm using RMSD
• The clusters with the most members become predictions, ranked by cluster size
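The selection and clustering step can be sketched as follows. The slide does not say which hierarchical algorithm or RMSD cutoff is used, so this sketch makes two assumptions: clusters are formed by single linkage (any two decoys within `cutoff` are merged, which is equivalent to cutting a single-linkage dendrogram at that height), and the `cutoff` value itself is a placeholder. Decoys are assumed to be lists of 3-tuples in the same atom order and the same reference frame, with lower scores being better.

```python
import math

def rmsd(a, b):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))

def rank_clusters(decoys, scores, n_best=200, cutoff=2.5):
    """Cluster the n_best lowest-scoring decoys and rank clusters by size.
    Returns lists of decoy indices, largest cluster first."""
    best = sorted(range(len(decoys)), key=lambda i: scores[i])[:n_best]
    parent = {i: i for i in best}

    def find(i):
        # Union-find root lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for pos, i in enumerate(best):
        for j in best[pos + 1:]:
            if rmsd(decoys[i], decoys[j]) <= cutoff:
                parent[find(i)] = find(j)

    clusters = {}
    for i in best:
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values(), key=len, reverse=True)
```

Single linkage can chain distant decoys together through intermediates; average or complete linkage would give tighter clusters at the cost of a slightly longer implementation.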
Docking protocol: Results
CAPRI Challenge (2002)
• At least one docking partner presented in its unbound form
• Participants permitted 5 attempts for each target
The 7 CAPRI Docking Targets
CAPRI Challenge Participants & Algorithms
Results: CAPRI Challenge
These were the results for the different predictors and targets:
Conclusions
The computational molecular docking problem is far from being solved. There are two major bottlenecks:
1. The algorithms handle limited flexibility
2. Need selective and efficient scoring functions