Vicky Choi Assistant Professor Department of Computer Science Virginia Tech Yucca: An Efficient Algorithm for Small Molecule Docking.

1 Vicky Choi Assistant Professor Department of Computer Science Virginia Tech Yucca: An Efficient Algorithm for Small Molecule Docking

2 Outline - Introduction to Molecular Docking - Available docking algorithms & scoring functions - Yucca: New Algorithm • Results on recent 100-complex benchmark • Details of the algorithm

3 Molecular Docking - Computational prediction of the structure of receptor-ligand complexes - Receptor: Protein - Ligand: Protein or Small Molecule - Two cases: Protein-Protein Docking, Protein-Small Molecule Docking

4 Protein-Protein Docking Barnase Barstar 1BRS : Barnase + Barstar

5 Protein-Small Molecule Docking Receptor: Adipocyte lipid-binding protein PDB code: 1LIC Ligand: Hexadecanesulfonic acid

6 Why is Docking Important? - Molecular interactions are central to most biological processes - As the number of known molecular structures continues to grow, computational analysis of molecular interactions is increasingly important - Computational prediction of molecular interactions is an invaluable tool for structure-based drug design

7 Example: HIV-protease Image adapted from Nature Rev. Drug Discov. 2, 369-378 (2003)

8 Formulation of Docking Problem [Figure: score over structures of the complex, with the X-ray structure marked] - A search algorithm that finds the docking complex structure ranked best by the scoring function. - A scoring function that can discriminate the correct (experimentally observed) docking complex structure from incorrect ones.

9 Terminology: conformation, configuration, pose Conformation: the relative positions of atoms in the 3D structure of a molecule, independent of the coordinate system 2 different conformations of a ligand

10 Terminology: conformation, configuration, pose Configuration/placement: the positions of atoms of a molecule after undergoing a rigid transformation (rotation and translation) in a coordinate system 2 different configurations (of same conformation)

11 Terminology: conformation, configuration, pose Pose: a configuration of a conformation of a molecule in a coordinate system 2 different poses of a ligand

12 Why is Docking Difficult? - Scoring Function: • Estimate the binding affinity between ligand and receptor • Factors: van der Waals interactions, hydrogen bonding, hydrophobic effects, etc. - Search Space is high-dimensional: • Both molecules are flexible – hundreds to thousands of degrees of freedom (DOF) • Total possible poses are astronomical

13 Hydrogen bond - Hydrogen bond: a hydrogen is “sandwiched” between two electron-attracting atoms. From: http://www.accessexcellence.org/RC/VL/GG

14 Why is Docking Difficult? - Scoring Function: • Estimate the binding affinity between ligand and receptor • Factors: van der Waals interactions, hydrogen bonding, hydrophobic effects, etc. - Search Space is high-dimensional: • Both molecules are flexible – hundreds to thousands of degrees of freedom (DOF) • Total possible poses are astronomical

15 Types of Docking Problems - Protein-Protein Docking • Bound docking (“rigid redocking problem”): 6 degrees of freedom: 3 for rotation, 3 for translation • Unbound docking: side chain flexibility - Protein-Small Molecule Docking • Rigid receptor, rigid ligand • Rigid receptor, flexible ligand • Flexible receptor, flexible ligand

16 Rigid-Receptor Flexible-Ligand Docking [Figure: X-ray pose] − Rigid Receptor: (hold fixed) − Flexible Ligand: Find a pose of the ligand which is close to its X-ray pose (bound conformation). RMSD: Root-Mean-Square-Distance
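
The RMSD named on this slide can be sketched in a few lines. This is a minimal illustration, assuming both poses list the same atoms in the same order and are already in the same coordinate frame; the function name and the sample coordinates are made up for the example.

```python
import math

def rmsd(pose_a, pose_b):
    """Root-mean-square distance between two poses with matching atom order."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and list the same atoms")
    # Sum of squared per-atom distances, averaged over the atoms.
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(pose_a, pose_b))
    return math.sqrt(squared / len(pose_a))

# Hypothetical two-atom ligand: the docked pose is shifted 1 Å along y.
xray_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
docked_pose = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0)]
print(rmsd(xray_pose, docked_pose))
```

A docked pose is usually compared with the X-ray pose without any further superposition, since the receptor is held fixed and defines the frame.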

17 Outline - Introduction to Molecular Docking - Available docking algorithms & scoring functions - Yucca: New Algorithm • Results on recent 100-complex benchmark • Details of the algorithm

18 Available Docking Software DOCK (Kuntz et al, 1982, Ewing & Kuntz 2001) FlexX (Rarey et al 1996) Hammerhead (Welch et al 1996) Surflex (Jain 2003) SLIDE (Kuhn et al 2002) AutoDock (Olson et al 1990, Morris et al 1998) ICM (Abagyan et al 1994) MCDock (Liu & Wang 1999) GOLD (Jones et al 1997) GemDock (Yang & Chen 2004) FRED (McGann et al 2002) Glide (Friesner et al 2004) Yucca (Choi 2005) …

19 Docking Algorithms - Stochastic Search: • Genetic Algorithm, Monte Carlo simulated annealing • AutoDock, MCDock, ICM, GOLD, Glide - Incremental Construction: • Rigid fragments with rotatable bonds • Incremental: preferred torsion angles • DOCK, FlexX, SLIDE, Surflex - Multiconformer: • Generate a set of low-energy conformers • Rigid docking • FLOG, FRED, Yucca

20 Incremental construction (FlexX & DOCK) 0. Fragmentation 1. Base (“Anchor”) fragment selection: specificity, placeability 2. Anchor fragment placement 3. Incremental addition of other fragments: a set of preferred torsion angles (<13), branch-and-bound heuristic

21 Docking Algorithms - Stochastic Search: • Genetic Algorithm, Monte Carlo simulated annealing • AutoDock, MCDock, ICM, GOLD, Glide - Incremental Construction: • Rigid fragments with rotatable bonds • Incremental: preferred torsion angles • DOCK, FlexX, SLIDE, Surflex - Multiconformer: • Generate a set of low-energy conformers • Rigid docking • FLOG, FRED, Yucca

22 Types of Scoring Functions • Force Field-Based: use non-bonded energies of force fields (e.g., AMBER and CHARMM) • Empirical-Based: derived from a set of protein-ligand complexes with measured binding affinity • Knowledge-Based: statistical atom pair potentials derived from structural databases (use Boltzmann law)
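
The slide only says that knowledge-based potentials "use Boltzmann law". A commonly used form of that idea (the inverse Boltzmann relation) is sketched below. The symbols are assumptions rather than the talk's notation: $g_{ij}(r)$ is the observed frequency of atom types $i$ and $j$ at distance $r$ in the structural database, and $g_{\mathrm{ref}}$ is a reference frequency.

```latex
E_{ij}(r) = -k_B T \,\ln \frac{g_{ij}(r)}{g_{\mathrm{ref}}}
```

Pairs seen more often than the reference at a given distance get a negative (favorable) energy, and pairs seen less often get a positive one.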

23 Grid: precompute scoring function - Most docking algorithms are capable of dealing with different (additive) scoring functions

24 Scoring Functions Comparison - Comparative Evaluation of 11 Scoring Functions for Molecular Docking, by R. Wang, Y. Lu & S. Wang, J. Med. Chem., 2003

25 Outline - Introduction to Molecular Docking - Available docking algorithms & scoring functions - Yucca: New Algorithm • Results on recent 100-complex benchmark • Details of the algorithm

26 Comparative Study - Benchmark: 100 protein-ligand complexes - Diversity: molecular weight, number of rotatable bonds, volume of binding site cavity, polar surface area of the ligands - 8 docking algorithms: DOCK, FlexX, FRED, Glide, GOLD, SLIDE, Surflex, QXP - Source: Comparative Evaluation of Eight Docking Tools for Docking and Virtual Screening Accuracy, by E. Kellenberger, J. Rodrigo, P. Muller, D. Rognan, Proteins: Structure, Function, and Bioinformatics (2004)

27 Docking & ranking accuracy - Docking accuracy: among the 30 top-scored poses, the smallest RMSD < 2 Å - Ranking accuracy: the top-scored pose’s RMSD < 2 Å
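
The two success criteria on this slide can be written down directly. The sketch below assumes each complex is represented by the RMSD values of its poses, sorted from best to worst score; the function names and the four sample complexes are hypothetical.

```python
def docking_success(rmsds_by_rank, top_n=30, cutoff=2.0):
    """True if any of the top_n scored poses is within cutoff Å of the X-ray pose."""
    return any(r < cutoff for r in rmsds_by_rank[:top_n])

def ranking_success(rmsds_by_rank, cutoff=2.0):
    """True if the top-scored pose itself is within cutoff Å of the X-ray pose."""
    return bool(rmsds_by_rank) and rmsds_by_rank[0] < cutoff

def accuracy(results, criterion):
    """Percentage of complexes that satisfy the given success criterion."""
    return 100.0 * sum(criterion(r) for r in results) / len(results)

# Made-up RMSD lists (Å), one list per complex, best-scored pose first.
results = [
    [0.8, 3.1, 4.0],  # top-scored pose is correct
    [5.2, 1.4, 6.3],  # a correct pose exists but is not ranked first
    [4.9, 5.5, 7.1],  # no pose within 2 Å
    [1.9, 2.5],       # top-scored pose is correct
]
print(accuracy(results, docking_success))  # 75.0
print(accuracy(results, ranking_success))  # 50.0
```

Ranking accuracy can never exceed docking accuracy, because a correct top-scored pose is also among the 30 top-scored poses.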

28 Docking Accuracy Glide, GOLD, Surflex, QXP : > 80% FlexX: 66% FRED: 62% DOCK, SLIDE: ~50% Yucca: 76% Remark: QXP used some information of the bound-conformation.

29 Ranking Accuracy Glide, GOLD, Surflex, FlexX: 50-55% DOCK, FRED, Slide, QXP : <40% Yucca: 45%

30 Speed - Average CPU time (seconds) on a 270 MHz SGI R12K processor running IRIX 6.5: FRED 18, DOCK 46, FlexX 67, QXP 108, SLIDE 118, Surflex 135, GOLD 137, Glide 234 - Yucca: average 4 seconds on a Pentium IV (3.0 GHz) computer

31 Outline - Introduction to Molecular Docking - Available docking algorithms & scoring functions - Yucca: New Algorithm • Results on recent 100-complex benchmark • Details of the algorithm • Multiconformer docking • Our Scoring Function • Local Improvement

32 Yucca: Multiconformer docker - Generate a set of conformers • Use OMEGA (OpenEye Scientific Software) to generate a set of low-energy conformers • Divide-and-Conquer: fragmentation + a set of preferred torsion angles • Allow up to a maximum of 500 conformers for each molecule. Total: 5967 conformers (100-complex benchmark) • Average 1.4 seconds per ligand - Rigid docking of each conformer • Coarse sampling → a set of initial configurations • Locally improve each configuration to a local minimum configuration

33 Rigid docking 1. Move to quasi-centroid; 2. Rotate about quasi-centroid by an angle; 3. Locally improve each configuration. [Figure: local improvement]

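34 Our Scoring Function - 2 components: Energy and Bump - $\mathrm{Energy}(\mathrm{Receptor}, \mathrm{Ligand}) = \sum_{a \in \mathrm{Receptor}} \sum_{b \in \mathrm{Ligand}} \mathrm{Energy}(a,b)$ - $\mathrm{Bump}(\mathrm{Receptor}, \mathrm{Ligand}) = \sum_{a \in \mathrm{Receptor}} \sum_{b \in \mathrm{Ligand}} \mathrm{Bump}(a,b)$ - Objective: Energy is minimized with $\mathrm{Bump} \le \mathrm{Tolerance}$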

35 Piecewise Linear Potentials (PLP) Atom Types: • hydrogen bond donor • hydrogen bond acceptor • hydrogen bond donor/acceptor • nonpolar Interaction Types: • H-bond: donor and acceptor • repulsion: donor-donor or acceptor-acceptor • dispersion: other contacts Reference: Molecular recognition of the inhibitor AG-1343 by HIV-1 protease: conformationally flexible docking by evolutionary programming. D. K. Gehlhaar, et al. Chemistry & Biology, 1995.

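36 PLP cont.
Interaction type for each pair of atom types:
            donor       acceptor    both        nonpolar
donor       repulsion   H-bond      H-bond      dispersion
acceptor    H-bond      repulsion   H-bond      dispersion
both        H-bond      H-bond      H-bond      dispersion
nonpolar    dispersion  dispersion  dispersion  dispersion
H-bond (donor - acceptor) parameters: A=2.3, B=2.6, C=3.1, D=3.4, E=-2, F=20
$E_{\mathrm{total}} = E_{\mathrm{H\text{-}bond}} + E_{\mathrm{repulsion}} + E_{\mathrm{dispersion}}$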
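
The slide gives only the six H-bond parameters, not the curve itself. The sketch below assumes the usual piecewise-linear shape associated with the cited Gehlhaar et al. potential: a repulsive ramp below A, a descent to the well between A and B, a flat well of depth E between B and C, and a rise back to zero at D. The function name and the exact treatment of the breakpoints are assumptions.

```python
def plp(r, A, B, C, D, E, F):
    """Piecewise linear pair potential at distance r (assumed shape, see lead-in)."""
    if r < A:
        return F * (A - r) / A          # repulsive ramp: F at r=0, 0 at r=A
    if r < B:
        return E * (r - A) / (B - A)    # descends from 0 to the well depth E
    if r < C:
        return E                        # flat bottom of the well
    if r < D:
        return E * (D - r) / (D - C)    # rises from E back to 0
    return 0.0                          # no interaction beyond D

# H-bond (donor - acceptor) parameters as listed on the slide.
HBOND = dict(A=2.3, B=2.6, C=3.1, D=3.4, E=-2.0, F=20.0)

for r in (0.0, 2.3, 2.6, 3.1, 3.4, 5.0):
    print(r, plp(r, **HBOND))
```

With these parameters the potential is positive only for distances below A = 2.3 Å.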

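37 Our PLP-based scoring function - Energy(a,b) = PLP energy(a,b) • Example: dist(a,b) = 3.1, energy = -2 - Bump(a,b) = 1 if PLP energy(a,b) > 0 - $\mathrm{Energy}(\mathrm{Receptor}, \mathrm{Ligand}) = \sum_{a \in \mathrm{Receptor}} \sum_{b \in \mathrm{Ligand}} \mathrm{Energy}(a,b)$ - $\mathrm{Bump}(\mathrm{Receptor}, \mathrm{Ligand}) = \sum_{a \in \mathrm{Receptor}} \sum_{b \in \mathrm{Ligand}} \mathrm{Bump}(a,b)$ - Objective: Energy is minimized with $\mathrm{Bump} \le \mathrm{Tolerance}$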
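
The two sums on this slide can be illustrated with a brute-force double loop. This is a sketch, not Yucca's implementation: atom types are ignored, atoms are bare coordinates, and `toy_pair_energy` is a hypothetical stand-in for the PLP term.

```python
import math

def score(receptor, ligand, pair_energy):
    """Return (Energy, Bump) summed over all receptor-ligand atom pairs."""
    energy = 0.0
    bump = 0
    for a in receptor:
        for b in ligand:
            e = pair_energy(math.dist(a, b))
            energy += e
            if e > 0:          # Bump(a,b) = 1 when the pair energy is positive
                bump += 1
    return energy, bump

def toy_pair_energy(d):
    # Hypothetical potential: clash below 2 Å, attraction up to 4 Å, zero beyond.
    if d < 2.0:
        return 10.0
    return -1.0 if d < 4.0 else 0.0

receptor = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
ligand = [(1.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
energy, bump = score(receptor, ligand, toy_pair_energy)
print(energy, bump)  # 8.0 1

tolerance = 2          # hypothetical bump tolerance
print(bump <= tolerance)
```

Keeping Bump as a separate count lets a pose with a few mild clashes stay in play, as long as the count remains within the tolerance, while Energy is being minimized.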

38 Yucca: The Algorithm 0. Preprocessing – precompute grids; Rigid docking of each conformer: 1. Coarse sample a set of initial configurations; 2. Locally improve each configuration.

39 Preprocessing: Compute grids - For each atom type: Energy grid (0.2 Å), Bump grid (0.2 Å) - $\mathrm{Energy}(a) = \sum_{b \in \mathrm{Receptor}} \mathrm{Energy}(a,b)$
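
The grid idea is that Energy(a) is computed once per grid point for each atom type, so that scoring a ligand atom later becomes a table lookup. A minimal sketch follows, assuming a regular axis-aligned grid, nearest-grid-point lookup, and a single atom type; `toy_pair_energy`, the grid origin and the grid shape are hypothetical.

```python
import math

def build_energy_grid(receptor, origin, shape, spacing, pair_energy):
    """grid[i][j][k] = sum over receptor atoms b of pair_energy(dist(point, b))."""
    nx, ny, nz = shape
    grid = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                point = (origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + k * spacing)
                grid[i][j][k] = sum(pair_energy(math.dist(point, b)) for b in receptor)
    return grid

def lookup(grid, origin, spacing, position):
    """Energy of a ligand atom at `position`, read from the nearest grid point."""
    i, j, k = (round((position[d] - origin[d]) / spacing) for d in range(3))
    return grid[i][j][k]  # no bounds checking in this sketch

def toy_pair_energy(d):
    # Hypothetical potential: clash below 0.3 Å, otherwise a constant attraction.
    return 10.0 if d < 0.3 else -1.0

origin, spacing = (0.0, 0.0, 0.0), 0.2   # 0.2 Å spacing as on the slide
grid = build_energy_grid([(0.0, 0.0, 0.0)], origin, (5, 5, 5), spacing, toy_pair_energy)
print(lookup(grid, origin, spacing, (0.41, 0.02, 0.0)))  # -1.0
```

The cost of building the grid is paid once per receptor and atom type; every later pose evaluation costs one lookup per ligand atom instead of a sum over all receptor atoms.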

40 Attractor grid - Attractor grid (0.8 Å): according to the distance to the protein atoms, find the lowest-energy grid point within the local neighborhood.

41 Bump-free grid - Bump-free grid (0.2 Å): the nearest bump-free grid point within the neighborhood.

42 Yucca: The Algorithm 0. Preprocessing – precompute grids; Rigid docking of each conformer: 1. Coarse sample a set of initial configurations; 2. Locally improve each configuration.

43 Yucca: Coarse Sampling Step - Translation: centroid → quasi-centroid - Rotate about quasi-centroid → initial configuration - Quasi-centroid = centroid of the grid points with energy < -2, bump = 0 - Distance(quasi-centroid, centroid of bound ligand) < 2.5 Å - Sample around the quasi-centroid (a cube with distance 2 Å)
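
The quasi-centroid definition and the translation step can be sketched as follows. The data layout (a list of `(position, energy, bump)` triples) and the function names are assumptions made for the example, and the sample grid points are invented.

```python
def quasi_centroid(grid_points, energy_cutoff=-2.0):
    """Centroid of the grid points with energy < energy_cutoff and bump == 0."""
    favorable = [p for p, energy, bump in grid_points
                 if energy < energy_cutoff and bump == 0]
    if not favorable:
        raise ValueError("no favorable bump-free grid points")
    n = len(favorable)
    return tuple(sum(p[d] for p in favorable) / n for d in range(3))

def translation_to(ligand, target):
    """Translation vector that moves the ligand's centroid onto `target`."""
    n = len(ligand)
    centroid = tuple(sum(atom[d] for atom in ligand) / n for d in range(3))
    return tuple(target[d] - centroid[d] for d in range(3))

# Hypothetical grid points: (position, energy, bump count).
grid_points = [
    ((0.0, 0.0, 0.0), -3.0, 0),
    ((2.0, 0.0, 0.0), -5.0, 0),
    ((9.0, 9.0, 9.0), -1.0, 0),   # energy not low enough
    ((4.0, 4.0, 4.0), -6.0, 1),   # low energy but bumps into the receptor
]
qc = quasi_centroid(grid_points)
print(qc)                                                       # (1.0, 0.0, 0.0)
print(translation_to([(0.0, 0.0, 0.0), (0.0, 2.0, 0.0)], qc))   # (1.0, -1.0, 0.0)
```

The quasi-centroid uses only the receptor grids, so it is available without any knowledge of the bound ligand.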

44 Rotation - A rotation in $\mathbb{R}^3$ can be specified by a rotation angle $\theta$ about a rotation axis $u$, represented by a unit quaternion. - Rotation axes: 20 uniformly distributed points on the unit sphere - Rotation angle = $\max\{5\pi/\mathrm{radius}(\mathrm{ligand}),\ \pi/6\}$ - Total initial configurations: $\le 9 \times 20 \times 6 = 1080$
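
A rotation given as an angle about an axis can be applied through a unit quaternion as sketched below. The function names are hypothetical, the 90° test rotation is an arbitrary example, and how Yucca chooses its 20 axes is not shown on the slide, so no axis set is generated here.

```python
import math

def quaternion(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation by `angle` about `axis`."""
    norm = math.sqrt(sum(c * c for c in axis))
    s = math.sin(angle / 2.0) / norm
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def rotate(point, q, center=(0.0, 0.0, 0.0)):
    """Rotate `point` about `center` by the unit quaternion q."""
    w, x, y, z = q
    px, py, pz = (point[d] - center[d] for d in range(3))
    # v' = v + 2w (u x v) + 2 u x (u x v), with u = (x, y, z)
    cx, cy, cz = y * pz - z * py, z * px - x * pz, x * py - y * px
    dx, dy, dz = y * cz - z * cy, z * cx - x * cz, x * cy - y * cx
    return (px + 2.0 * (w * cx + dx) + center[0],
            py + 2.0 * (w * cy + dy) + center[1],
            pz + 2.0 * (w * cz + dz) + center[2])

# Rotating (1, 0, 0) by 90 degrees about the z axis gives (0, 1, 0).
rotated = rotate((1.0, 0.0, 0.0), quaternion((0.0, 0.0, 1.0), math.pi / 2))
print(rotated)

# Upper bound quoted on the slide: the three factors 9, 20 and 6.
print(9 * 20 * 6)  # 1080
```

Passing the quasi-centroid as `center` rotates the ligand in place, which is what the coarse sampling step needs.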

45 Yucca: Local Improvement Step Step 1: Outer Loop – lower energy Step 2: Inner Loop – resolve collision

46 Tool: Weighted Least-Squares Superposition - $T = \mathrm{WLSS}(w, B, C)$: the rigid transformation $T$ for which $\sum_i w_i \lVert T(b_i) - c_i \rVert^2$ is minimized
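
The slide states the minimization problem but not how it is solved. One standard way to solve it is Horn's closed-form quaternion method, sketched below in pure Python; this is a swapped-in technique for illustration, not necessarily what Yucca uses. The largest eigenvector is found by plain power iteration, which assumes the point sets are not degenerate (for example, not all points coincident).

```python
import math

def wlss(weights, B, C, iterations=500):
    """Return (R, t) minimizing sum_i w_i * ||R b_i + t - c_i||^2."""
    total = sum(weights)
    # Weighted centroids of the moving points B and the target points C.
    cb = [sum(w * b[k] for w, b in zip(weights, B)) / total for k in range(3)]
    cc = [sum(w * c[k] for w, c in zip(weights, C)) / total for k in range(3)]
    # Weighted cross-covariance S[j][k] = sum_i w_i * b'_ij * c'_ik.
    S = [[sum(w * (b[j] - cb[j]) * (c[k] - cc[k]) for w, b, c in zip(weights, B, C))
          for k in range(3)] for j in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = S
    # Horn's symmetric 4x4 matrix; its top eigenvector is the optimal quaternion.
    N = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Shift by a Gershgorin bound so power iteration converges to the largest eigenvalue.
    shift = max(sum(abs(v) for v in row) for row in N)
    q = [1.0, 0.5, 0.25, 0.125]
    for _ in range(iterations):
        q = [shift * q[r] + sum(N[r][k] * q[k] for k in range(4)) for r in range(4)]
        norm = math.sqrt(sum(v * v for v in q))
        q = [v / norm for v in q]
    w, x, y, z = q
    R = [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]
    # The translation maps the rotated centroid of B onto the centroid of C.
    t = [cc[r] - sum(R[r][k] * cb[k] for k in range(3)) for r in range(3)]
    return R, t

def apply(R, t, p):
    """Apply the rigid transformation (R, t) to point p."""
    return tuple(sum(R[r][k] * p[k] for k in range(3)) + t[r] for r in range(3))

# C is B rotated by 90 degrees about z and shifted by (1, 2, 3).
B = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
C = [(1.0, 3.0, 3.0), (-1.0, 2.0, 3.0), (1.0, 2.0, 6.0), (0.0, 3.0, 4.0)]
R, t = wlss([1.0, 2.0, 1.0, 0.5], B, C)
print([tuple(round(v, 6) for v in apply(R, t, b)) for b in B])
```

With exact correspondences the transformation is recovered regardless of the weights; the weights matter when the targets cannot all be matched at once, which is the situation in the local improvement step.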

47 Outer Loop: decreasing energy - For each ligand atom, match it with the lowest energy grid point within its neighborhood by looking up from attractor grid; - Apply Least-Squares Superposition;

48 Collision resolution

49 Inner Loop: collision resolution - If an atom is bump free, match it to its original position; - If an atom causes a bump, match it with the nearest bump-free grid point using the bump-free grid; set a larger weight (proportional to its inverse square distance); - Apply weighted least-squares superposition
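
The matching rule of the inner loop can be sketched as a function that builds the weights and targets for one superposition step. Everything beyond the slide's wording is an assumption: bump-free atoms get weight 1, a bumping atom gets `bump_scale / d**2` where d is its distance to the matched bump-free grid point, and `bump_scale` and the input layout are hypothetical.

```python
import math

def collision_targets(positions, is_bumping, nearest_bump_free, bump_scale=1.0):
    """Build (weights, targets) for one weighted least-squares superposition step."""
    weights, targets = [], []
    for atom, pos in enumerate(positions):
        if is_bumping[atom]:
            # Pull the atom toward the nearest bump-free grid point.
            target = nearest_bump_free[atom]
            d2 = math.dist(pos, target) ** 2
            weights.append(bump_scale / d2)   # assumed reading of "inverse square distance"
            targets.append(target)
        else:
            # A bump-free atom is matched to where it already is.
            weights.append(1.0)
            targets.append(pos)
    return weights, targets

# Hypothetical two-atom ligand: atom 1 bumps, and its bump-free point is 0.5 Å away.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
weights, targets = collision_targets(
    positions,
    is_bumping=[False, True],
    nearest_bump_free={1: (1.0, 0.5, 0.0)},
)
print(weights)  # [1.0, 4.0]
print(targets)  # [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0)]
```

The resulting weights, current positions and targets are the inputs of the weighted least-squares superposition, so the whole ligand moves rigidly in a way that favors fixing the bumping atoms.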

50 Example 1 Notation: [Energy, Bump, RMSD] (RMSD: Root Mean Square Distance) Input: [2575, 38, 2.31] Outer iteration 1: [-2402, 41, 1.64] Inner loop: [-6706, 27, 1.50] → [-8468, 23, 1.27] → [-10279, 14, 0.97] → [-11158, 6, 0.82] Outer iteration 2: [-10376, 14, 1.01] Inner loop: [-10956, 11, 0.80] → [-10951, 8, 0.63] → [-10482, 5, 0.57] Outer iteration 3: [-9586, 15, 0.83] Inner loop: [-11140, 3, 0.65]

51 Example 2 Notation: [Energy, Bump, RMSD] (RMSD: Root Mean Square Distance) Input: [22133, 82, 5.20] Outer iteration 1: [25597, 101, 4.97] Inner loop: [20871, 87, 5.20] → [14601, 71, 5.22] → [7508, 51, 5.18] → [3638, 38, 5.26] → [1810, 29, 5.32] Outer iteration 2: [6644, 59, 5.05] Inner loop: [2408, 37, 4.98] → [-27, 30, 4.88] → [-1521, 27, 5.03] → [-2249, 20, 4.97] → [-2238, 15, 4.89] Outer iteration 3: [-461, 31, 4.80] Inner loop: [-1924, 27, 4.97] → [-2051, 25, 4.82] → [-3153, 21, 4.76] → [-3627, 17, 4.61] → [-3889, 13, 4.58]

52 Yucca: The Algorithm 0. Preprocessing – precompute grids; Rigid docking of each conformer: 1. Coarse sample a set of initial configurations; 2. Locally improve each configuration: Outer Loop – Lower energy; Inner Loop – Resolve collisions

53 Our algorithm Yucca’s performance - Average 4 seconds on Pentium IV (3GHz) - Docking accuracy : 76% - Ranking accuracy: 45%

54 14 Difficult Docking Cases (among 100-complex) - No more than 2 programs (among the 8 programs) manage to successfully dock the ligand 1LIC:

55 14 Difficult Docking Cases (among 100-complex) - No more than 2 programs (among the 8 programs) manage to successfully dock the ligand - Yucca’s results: • Without including the bound conformation in OMEGA: 6 successes, 8 failures • With the bound conformation included in OMEGA: 12 successes, 2 failures

56 Work in Progress - Conformer generator - Use the available databases to mine the correlated torsion angles - “Directed tweak” to resolve the collisions - Better scoring function - Flexible receptor docking - Virtual Screening

57 Acknowledgement David Bevan (Biochemistry, VT) Gavin Tsai (NCI/NIH) Joel Gillespie (VBI, VT) Bradley Feuston (Merck Research Laboratory)

