Download presentation
Presentation is loading. Please wait.
Published byArline Simpson Modified over 8 years ago
1
Vicky Choi Assistant Professor Department of Computer Science Virginia Tech Yucca: An Efficient Algorithm for Small Molecule Docking
2
Outline - Introduction to Molecular Docking - Available docking algorithms & scoring functions - Yucca: New Algorithm Results on recent 100-complex benchmark Details of the algorithm
3
Molecular Docking - Computational prediction of the structure of receptor-ligand complexes Receptor: Protein Ligand: Protein or Small Molecule Protein-Protein Docking Protein-Small molecule Docking
4
Protein-Protein Docking Barnase Barstar 1BRS : Barnase + Barstar
5
Protein-Small Molecule Docking Receptor: Adipocyte lipid-binding protein PDB code: 1LIC Ligand: Hexadecanesulfonic acid
6
Why is Docking Important? - Molecular interactions are central to most of biological processes - The number of known molecular structures continues to grow, computational analysis of molecular interactions is increasingly important - Computational prediction of molecular interactions is an invaluable tool for structure- based drug design
7
Example: HIV-protease Image adopted from Nature Rev. Drug Discov. 2, 369-378 (2003)
8
Formulation of Docking Problem structures of the complex - score X-ray structure ? A search algorithm that finds the docking complex structure measured by the scoring function. A scoring function that can discriminate correct (experimentally observed) docking complex structure from incorrect ones.
9
Terminology: conformation, configuration, pose Conformation: the relative positions of atoms in the 3D structure of a molecule, independent of the coordinate system 2 different conformations of a ligand
10
Configuration/placement: the positions of atoms of a molecule after undergoing a rigid transformation (rotation and translation) in a coordinate system 2 different configurations (of same conformation) Terminology: conformation, configuration, pose
11
Pose: a configuration of a conformation of a molecule in a coordinate system 2 different poses of a ligand Terminology: conformation, configuration, pose
12
Why is Docking Difficult ? - Scoring Function: Estimate the binding affinity between ligand and receptor Factors: van der Waals interactions, hydrogen bonding, hydrophobic effects etc - Search Space is high-dimensional: Both molecules are flexible – hundreds to thousands of degrees of freedom (DOF) Total possible poses are astronomical
13
Hydrogen bond hydrogen bond: a hydrogen is “sandwiched” between two electron-attracting atoms From : http://www.accessexcellence.org/RC/VL/GG
14
Why is Docking Difficult ? - Scoring Function: Estimate the binding affinity between ligand and receptor Factors: van der Waals interactions, hydrogen bonding, hydrophobic effects etc - Search Space is high-dimensional: Both molecules are flexible – hundreds to thousands of degrees of freedom (DOF) Total possible poses are astronomical
15
Types of Docking Problems - Protein-Protein Docking Bound docking (“rigid redocking problem”): 6 degrees of freedom: 3 for rotation, 3 for translation Unbound docking : side chain flexibility - Protein-Small Molecule Docking Rigid receptor, rigid ligand Rigid receptor, flexible ligand Flexible receptor, flexible ligand
16
X-ray pose Rigid-Receptor Flexible-Ligand Docking −Rigid Receptor: (hold fixed) −Flexible Ligand: Find a pose of the ligand which is close to its X-ray pose (bound conformation). RMSD: Root-Mean-Square-Distance
17
Outline - Introduction to Molecular Docking - Available docking algorithms & scoring functions - Yucca: New Algorithm Results on recent 100-complex benchmark Details of the algorithm
18
Available Docking Software DOCK (Kuntz et al, 1982, Ewing & Kuntz 2001) FlexX (Rarey et al 1996) Hammerhead (Welch et al 1996) Surflex (Jain 2003) SLIDE (Kuhn et al 2002) AutoDock (Olson et al 1990, Morris et al 1998) ICM (Abagyan et al 1994) MCDock (Liu & Wang 1999) GOLD (Jones et al 1997) GemDock (Yang & Chen 2004) FRED (McGann et al 2002) Glide (Friesner et al 2004) Yucca (Choi 2005) …
19
Docking Algorithms - Stochastic Search: Genetic Algorithm, Monte Carlo simulated annealing AutoDock, MCDock, ICM, GOLD, Glide - Incremental Construction: Rigid fragments with rotatable bonds Incremental : preferred torsion angles DOCK, FlexX, SLIDE, Surflex - Multiconformer: Generate a set of low-energy conformers Rigid docking FLOG, FRED, Yucca
20
Incremental construction (FlexX & DOCK) 0. Fragmentation: 2. Anchor fragment placement 3. Incremental addition of other fragments a set of preferred torsion angles (<13) branch-and-bound heuristic 1. Base (“Anchor”) fragment selection: - specificity - placeability - placeability
21
Docking Algorithms - Stochastic Search: Genetic Algorithm, Monte Carlo simulated annealing AutoDock, MCDock, ICM, GOLD, Glide - Incremental Construction: Rigid fragments with rotatable bonds Incremental : preferred torsion angles DOCK, FlexX, SLIDE, Surflex - Multiconformer: Generate a set of low-energy conformers Rigid docking FLOG, FRED, Yucca
22
Types of Scoring Functions Force Field-Based: use non-bonded energies of force fields (e.g AMBER and CHARMM) Empirical-Based: derive from a set of protein-ligand complexes with measured binding affinity Knowledge-Based: statistical atom pair potentials derived from structural databases (use Boltzmann law)
23
Grid: precompute scoring function Most of docking algorithms are capable of dealing with different (additive) scoring functions
24
Scoring Functions Comparison Comparative Evaluation of 11 Scoring Functions for molecular Docking by R. Wang, Y. Lu & S. Wang, J. Med Chem, 2003
25
Outline - Introduction to Molecular Docking - Available docking algorithms & scoring functions - Yucca: New Algorithm Results on recent 100-complex benchmark Details of the algorithm
26
Comparative Study Benchmark: 100 protein-ligand complexes Diversity: molecular weight, number of rotatable bonds, volume of binding site cavity, polar surface area of the ligands 8 docking algorithms: Dock, FlexX, FRED, Glide, GOLD, Slide, Surflex, QXP Comparative Evaluation of Eight Docking Tools for Docking and Virtual Screening Accuracy by E. Kellenberger, J. Rodrigo, P. Muller,D. Rognan Proteins: Structure, Function, and Bioinformatics (2004)
27
Docking & ranking accuracy Docking accuracy: Among the 30 top-scored poses, the smallest RMSD < 2A Ranking accuracy: The top-scored pose’s RMSD<2A
28
Docking Accuracy Glide, GOLD, Surflex, QXP : > 80% FlexX: 66% FRED: 62% DOCK, SLIDE: ~50% Yucca: 76% Remark: QXP used some information of the bound-conformation.
29
Ranking Accuracy Glide, GOLD, Surflex, FlexX: 50-55% DOCK, FRED, Slide, QXP : <40% Yucca: 45%
30
Speed Average CPU time (seconds) on a 270MHz SGI R12K processor Running IRIX6.5: FRED 18 DOCK 46 FlexX 67 QXP 108 SLIDE 118 Surflex 135 GOLD 137 GLIDE 234 Yucca: average 4 seconds on a Pentium IV (3.0GHz) computer
31
Outline - Introduction to Molecular Docking - Available docking algorithms & scoring functions - Yucca: New Algorithm Results on recent 100-complex benchmark Details of the algorithm Multiconformer docking Our Scoring Function Local Improvement
32
Yucca: Multiconformer docker - Generate a set of comformers Use OMEGA (OpenEyes Scientific Co.) to generate a set of low- energy conformers Divide-and-Conquer: fragmentation + a set of preferred torsion angles Allow up to maximum 500 conformers for each molecule. Total: 5967 conformers (100-complex benchmark) Average 1.4 seconds per ligand - Rigid docking each conformer Coarse sampling ! a set of initial configurations Locally improve each configuration to a local minimum configuration
33
Rigid docking 1.Move to quasi-centroid; 2.Rotate about quasi-centroid by an angle; 3. Locally improve each configuration. Local improvement
34
Our Scoring Function - 2 components: - Energy - Bump Energy(Receptor, Ligand) = a 2 Receptor b 2 Ligand Energy(a,b) Bump(Receptor, Ligand) = a 2 Receptor b 2 Ligand Bump(a,b) Objective: Energy is minimized with Bump · Tolerance
35
Piecewise Linear Potentials (PLP) Atom Types: o hydrogen bond donor o hydrogen bond acceptor o hydrogen bond donor/acceptor o nonpolar Interaction Types: o H-bond: donor and acceptor o replusion : donor-donor or acceptor-acceptor o dispersion : other contacts Molecular recognition of the inbibitor AG-1343 by HIV-1 protease: conformationally flexible docking by evolution programming. D. K. Gehlhaar, et al. Chemistry & Biology, 1995.
36
PLP cont. donoracceptorbothnonpolar donorrepulsionH-bond dispersion accptorH-bondrepulsionH-bonddispersion bothH-bond dispersion nonpolardispersion H-bond: donor - acceptor A=2.3, B=2.6, C=3.1, D=3.4 E=-2, F=20 E total = E H-bond + E repulsion + E dispersion
37
Our PLP-based scoring function - Energy(a,b) = PLP energy (a,b) Example: dist(a,b) = 3.1, energy = -2 - Bump(a,b) = 1 if PLP energy(a,b)>0 Energy(Receptor, Ligand) = a 2 Receptor b 2 Ligand Energy(a,b) Bump(Receptor, Ligand) = a 2 Receptor b 2 Ligand Bump(a,b) Objective: Energy is minimized with Bump · Tolerance
38
Yucca: The Algorithm 0. Preprocessing – precompute grids; Rigid docking of each conformer: 1. Coarse sample a set of initial configurations; 2. Locally improve each configuration.
39
Preprocessing: Compute grids For each atom type: - Energy grid (0.2 A) - Bump grid (0.2 A) Energy(a) = b 2 Receptor Energy(a,b)
40
Attractor grid Attractor grid (0.8 A): - According to the distance to the protein atoms, find the lowest energy grid point within the local neighborhood.
41
Bump-free grid Bump-free grid (0.2 A): -The nearest bump-free grid point within the neighborhood.
42
Yucca: The Algorithm 0. Preprocessing – precompute grids; Rigid docking of each conformer: 1. Coarse sample a set of initial configuration; 2. Locally improve each configuration.
43
Yucca: Coarse Sampling Step - Translation : centroid ! quasi-centroid - Rotate about qausi-centroid ! initial configuration Quasi-centroid = centroid of the grid points with energy<-2, bump=0 Distance (quasi-centroid, centroid of bound ligand) < 2.5 A Sample around the quasi-centroid (a cube with distance 2 A)
44
Rotation A rotation in R 3 can be specified by a rotation angle about a rotation axis u – represented by unit quaternion. Rotation axes: 20 uniformly distributed points on unit sphere Rotation angle = max{5 /radius(ligand), /6} Total initial configurations: · 9*20*6=1080
45
Yucca: Local Improvement Step Step 1: Outer Loop – lower energy Step 2: Inner Loop – resolve collision
46
Tool: Weighted Least-Squares Superposition = WLSS(w, B, C) : i w i || (b i ) – c i || 2 is minimized
47
Outer Loop: decreasing energy - For each ligand atom, match it with the lowest energy grid point within its neighborhood by looking up from attractor grid; - Apply Least-Squares Superposition;
48
Collision resolution
49
Inner Loop: collision resolution - If an atom is bump free, match it to its original position; - If an atom causes bump, match it with the nearest bump-free grid point using bump-free grid; Set a larger weight (proportional to its inverse square distance); - Apply weighted least square superposition
50
Example 1 Notation: [Energy, Bump, RMSD] Root Mean Square Distance Input: [2575, 38, 2.31] Outer iteration 1: [-2402, 41, 1.64] Inner loop: [-6706, 27, 1.50] ! [-8468,23,1.27] ! [-10279, 14, 0.97] ! [-11158, 6, 0.82] Outer iteration 2: [-10376, 14, 1.01] Inner loop: [-10956, 11, 0.80] ! [-10951, 8, 0.63] ! [-10482, 5, 0.57] Outer iteration 3: [-9586, 15, 0.83] Inner loop: [-11140, 3, 0.65]
51
Example 2 Notation: [Energy, Bump, RMSD] Root Mean Square Distance Input: [22133, 82, 5.20] Outer iteration 1: [25597, 101, 4.97] Inner loop: [20871, 87, 5.20] ! [14601, 71, 5.22] ! [7508, 51, 5.18] ! [3638, 38, 5.26] ! [1810, 29, 5.32] Outer iteration 2: [6644, 59, 5.05] Inner loop: [2408, 37, 4.98] ! [-27, 30, 4.88] ! [-1521, 27, 5.03] ! [-2249, 20, 4.97] ! [-2238, 15, 4.89] Outer iteration 3: [-461, 31, 4.80] Inner loop: [-1924, 27, 4.97] ! [-2051, 25, 4.82] ! [-3153, 21, 4.76] ! [-3627, 17, 4.61] ! [-3889, 13, 4.58]
52
Yucca: The Algorithm 0. Preprocessing – precompute grids; Rigid docking of each conformer: 1. Coarse sample a set of initial configuration; 2. Locally improve each configuration: Outer Loop - Lower energy Inner Loop – Resolving collisions
53
Our algorithm Yucca’s performance - Average 4 seconds on Pentium IV (3GHz) - Docking accuracy : 76% - Ranking accuracy: 45%
54
14 Difficult Docking Cases (among 100-complex) - No more than 2 programs (among the 8 programs) manage to successfully dock the ligand 1LIC:
55
14 Difficult Docking Cases (among 100-complex) - No more than 2 programs (among the 8 programs) manage to successfully dock the ligand - Yucca’s results: Without including bound conformation in OMEGA : 6 successes, 8 fails With including bound conformation in OMEGA: 12 sucesses, 2 fails
56
Work in Progress - Conformer generator - Use the available databases to mine the correlated torsion angles - “Directed tweak” to resolve the collisions - Better scoring function - Flexible receptor docking - Virtual Screening
57
Acknowledgement David Bevan (Biochemistry, VT) Gavin Tsai (NCI/NIH) Joel Gillespie (VBI, VT) Bradley Feuston (Merck Research Laboratory)
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.