Computational Drug Design Apr 2010 Postgrad course on Comp Chem Noel M. O’Boyle.

Slides:



Advertisements
Similar presentations
Case Study: Dopamine D 3 Receptor Anthagonists Chapter 3 – Molecular Modeling 1.
Advertisements

1 Sequential Screening S. Stanley Young NISS HTS Workshop October 25, 2002.
Determination of Protein Structure. Methods for Determining Structures X-ray crystallography – uses an X-ray diffraction pattern and electron density.
AutoDock 4 and AutoDock Vina -Brief Intruction
3D Molecular Structures C371 Fall Morgan Algorithm (Leach & Gillet, p. 8)
Improving enrichment rates A practical solution to an impractical problem Noel O’Boyle Cambridge Crystallographic Data Centre
Insight into Molecular Geometry and Interactions using Small Molecule Crystallographic Data John Liebeschuetz Cambridge Crystallographic.
Why multiple scoring functions can improve docking performance - Testing hypotheses for rescoring success Noel M. O’Boyle, John W. Liebeschuetz and Jason.
How to approximate complex physical and thermodynamic interactions? Employ rigid or flexible structures for ligand and receptor (Side-chains or Back-bone.
Lipinski’s rule of five
Protein Threading Zhanggroup Overview Background protein structure protein folding and designability Protein threading Current limitations.
1 PharmID: A New Algorithm for Pharmacophore Identification Stan Young Jun Feng and Ashish Sanil NISSMPDM 3 June 2005.
Molecular Docking G. Schaftenaar Docking Challenge Identification of the ligand’s correct binding geometry in the binding site ( Binding Mode ) Observation:
Two Examples of Docking Algorithms With thanks to Maria Teresa Gil Lucientes.
M. Wagener 3D Database Searching and Scaffold Hopping Markus Wagener NV Organon.
Quantitative Structure-Activity Relationships (QSAR) Comparative Molecular Field Analysis (CoMFA) Gijs Schaftenaar.
FLEX* - REVIEW.
Bioinformatics IV Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA) Martin Ott.
Development and Validation of a Genetic Algorithm for Flexible Docking Gareth Jones, Peter Willet, Robert C. Glen, Andrew R. Leach and Robin Taylor J.
BL5203: Molecular Recognition & Interaction Lecture 5: Drug Design Methods Ligand-Protein Docking (Part I) Prof. Chen Yu Zong Tel:
4. Modeling 3D-periodic systems Cut-off radii, charges groups Ewald summation Growth units, bonds, attachment energy Predicting crystal structures.
Molecular Docking Using GOLD Tommi Suvitaival Seppo Virtanen S Basics for Biosystems of the Cell Fall 2006.
RAPID: Randomized Pharmacophore Identification for Drug Design PW Finn, LE Kavraki, JC Latombe, R Motwani, C Shelton, S Venkatasubramanian, A Yao Presented.
Structure Based Drug Design
Quantitative Structure-Activity Relationships (QSAR)  Attempts to identify and quantitate physicochemical properties of a drug in relation to its biological.
Inverse Kinematics for Molecular World Sadia Malik April 18, 2002 CS 395T U.T. Austin.
Comparative Evaluation of 11 Scoring Functions for Molekular Docking Authors: Renxiao Wang, Yipin Lu and Shaomeng Wang Presented by Florian Lenz.
eHiTS Score Darryl Reid, Zsolt Zsoldos, Bashir S. Sadjad, Aniko Simon, The next stage in scoring function evolution: a new statistically.
QSAR Qualitative Structure-Activity Relationships Can one predict activity (or properties in QSPR) simply on the basis of knowledge of the structure of.
Computational Chemistry. Overview What is Computational Chemistry? How does it work? Why is it useful? What are its limits? Types of Computational Chemistry.
Protein Tertiary Structure Prediction
Construyendo modelos 3D de proteinas ‘fold recognition / threading’
Molecular Descriptors
Molecular Docking S. Shahriar Arab. Overview of the lecture Introduction to molecular docking: Definition Types Some techniques Programs “Protein-Protein.
ClusPro: an automated docking and discrimination method for the prediction of protein complexes Stephen R. Comeau, David W.Gatchell, Sandor Vajda, and.
A genetic algorithm for structure based de-novo design Scott C.-H. Pegg, Jose J. Haresco & Irwin D. Kuntz February 21, 2006.
Conformational Sampling
Increasing the Value of Crystallographic Databases Derived knowledge bases Knowledge-based applications programs Data mining tools for protein-ligand complexes.
Function first: a powerful approach to post-genomic drug discovery Stephen F. Betz, Susan M. Baxter and Jacquelyn S. Fetrow GeneFormatics Presented by.
In silico discovery of inhibitors using structure-based approaches Jasmita Gill Structural and Computational Biology Group, ICGEB, New Delhi Nov 2005.
Conformational Entropy Entropy is an essential component in ΔG and must be considered in order to model many chemical processes, including protein folding,
SimBioSys Inc.© Slide #1 Enrichment and cross-validation studies of the eHiTS high throughput screening software package.
SimBioSys Inc.© 2004http:// Conformational sampling in protein-ligand complex environment Zsolt Zsoldos SimBioSys Inc., © 2004 Contents:
A two-state homology model of the hERG K + channel: application to ligand binding Ramkumar Rajamani, Brett Tongue, Jian Li, Charles H. Reynolds J & J PRD.
Altman et al. JACS 2008, Presented By Swati Jain.
Virtual Screening C371 Fall INTRODUCTION Virtual screening – Computational or in silico analog of biological screening –Score, rank, and/or filter.
Hierarchical Database Screenings for HIV-1 Reverse Transcriptase Using a Pharmacophore Model, Rigid Docking, Solvation Docking, and MM-PB/SA Junmei Wang,
R L R L L L R R L L R R L L water DOCKING SIMULATIONS.
Structure- based Structure-based computer-aided drug discovery (SB-CADD) approach: helps to design and evaluate the quality, in terms of affinity, of series.
Lecture 9: Theory of Non-Covalent Binding Equilibria Dr. Ronald M. Levy Statistical Thermodynamics.
FlexWeb Nassim Sohaee. FlexWeb 2 Proteins The ability of proteins to change their conformation is important to their function as biological machines.
Structural classification of Proteins SCOP Classification: consists of a database Family Evolutionarily related with a significant sequence identity Superfamily.
Surflex: Fully Automatic Flexible Molecular Docking Using a Molecular Similarity-Based Search Engine Ajay N. Jain UCSF Cancer Research Institute and Comprehensive.
CS-ROSETTA Yang Shen et al. Presented by Jonathan Jou.
Protein structure prediction Computer-aided pharmaceutical design: Modeling receptor flexibility Applications to molecular simulation Work on this paper.
CoMFA Study of Piperidine Analogues of Cocaine at the Dopamine Transporter: Exploring the Binding Mode of the 3  -Substituent of the Piperidine Ring Using.
Molecular mechanics Classical physics, treats atoms as spheres Calculations are rapid, even for large molecules Useful for studying conformations Cannot.
Elon Yariv Graduate student in Prof. Nir Ben-Tal’s lab Department of Biochemistry and Molecular Biology, Tel Aviv University.
Molecular Modeling in Drug Discovery: an Overview
1 of 21 SDA development -Description of sda Description of sda-5a - Sda for docking.
Page 1 Computer-aided Drug Design —Profacgen. Page 2 The most fundamental goal in the drug design process is to determine whether a given compound will.
Lipinski’s rule of five
DATA MINING FOR SMALL MOLECULE ALLOSTERIC INHIBITORS
Virtual Screening.
Ligand Docking to MHC Class I Molecules
Alexey Sulimov, Ekaterina Katkova, Vladimir Sulimov,
Mr.Halavath Ramesh 16-MCH-001 Dept. of Chemistry Loyola College University of Madras-Chennai.
Mr.Halavath Ramesh 16-MCH-001 Dept. of Chemistry Loyola College University of Madras-Chennai.
Mr.Halavath Ramesh 16-MCH-001 Dept. of Chemistry Loyola College University of Madras-Chennai.
Mr.Halavath Ramesh 16-MCH-001 Dept. of Chemistry Loyola College University of Madras-Chennai.
Presentation transcript:

Computational Drug Design Apr 2010 Postgrad course on Comp Chem Noel M. O’Boyle

References Computational Drug Design, David C. Young My EBI lecture on protein-ligand docking, An Introduction to Cheminformatics, AR Leach, VJ Gillet

Rational Drug Design Use knowledge of protein or ligand structures –Does not rely on trial-and-error or screening –Computer-aided drug design (CADD) now plays an important role in rational design Structure-based drug design –Uses protein structure directly –CADD: Protein-ligand docking Ligand-based drug design –Derive information from ligand structures –Protein structure not always available 40% of all prescription pharmaceuticals target GPCRs –Protein structure has large degree of flexibility Structure deforms to accommodate ligands or gross movements occur on binding –CADD: Pharmacophore approach, Quantitative structure- activity relationship (QSAR)

Computer-aided drug design (CADD) Known ligand(s)No known ligand Known protein structure Unknown protein structure Structure-based drug design (SBDD) Protein-ligand docking Ligand-based drug design (LBDD) 1 or more ligands Similarity searching Several ligands Pharmacophore searching Many ligands (20+) Quantitative Structure-Activity Relationships (QSAR) De novo design CADD of no use Need experimental data of some sort Can apply ADMET filters

Virtual screening Virtual screening is the computational or in silico analogue of biological screening The aim is to score, rank or filter a set of structures using one or more computational procedures It can be used –to help decide which compounds to screen (experimentally) –which libraries to synthesise –which compounds to purchase from an external company –to analyse the results of an experiment, such as a HTS run

Virtual screening AR Leach, VJ Gillet, An Introduction to Cheminformatics

What is a Pharmacophore? Two somewhat distinct usages: That substructure of a molecule that is responsible for its pharmacological activity (c.f. chromophore) A set of geometrical constraints between specific functional groups that enable the molecule to have biological activity Bojarski, Curr. Top. Med. Chem. 2006, 6, 2005.

Overview of Pharmacophore-based Drug Design Activity data Search compound library for actives Generate pharmacophore Test activity Buy or synthesise ‘hits’ See also John Van Drie’s

Pharmacophore generation and searching Protein structure not required –There are also approaches that create pharmacophores from the active site Assumes that all (or the majority) of the known actives bind to the same location Pharmacophore generation –Identify pharmacophoric features (hydrogen bond donors and acceptors, lipophilic groups, charges) –Find a geometrical arrangement of pharmacophoric features that all actives that match with a low-energy conformation Pharmacophore searching –Given a pharmacophore, find all molecules in a database that can match it in a low-energy conformation Some pharmacophore software gives an estimate of activity, but most just give true or false for a match –Scaffold-hopping possible Doesn’t require structural similarity Just needs to match the pharmacophore

Protein-ligand docking Predicts... The pose of the molecule in the binding site The binding affinity or a score representing the strength of binding A Structure-Based Drug Design (SBDD) method –“structure” means “using protein structure” Computational method that mimics the binding of a ligand to a protein Given...

Protein-ligand docking II Typically, protein-ligand docking software consist of two main components which work together: 1. Search algorithm –Generates a large number of poses of a molecule in the binding site 2. Scoring function –Calculates a score or binding affinity for a particular pose The difficulty with protein–ligand docking is in part due to the fact that it involves many degrees of freedom –The translation and rotation of one molecule relative to another involves six degrees of freedom –There are in addition the conformational degrees of freedom of both the ligand and the protein –The solvent may also play a significant role in determining the protein–ligand geometry The search algorithm generates poses, orientations of particular conformations of the molecule in the binding site –Tries to cover the search space, if not exhaustively, then as extensively as possible –There is a tradeoff between time and search space coverage

Ligand conformations Conformations are different three-dimensional structures of molecules that result from rotation about single bonds –That is, they have the same bond lengths and angles but different torsion angles For a molecule with N rotatable bonds, if each torsion angle is rotated in increments of θ degrees, number of conformations is (360º/ θ) N Question –If the torsion angles are incremented in steps of 30º, how many conformations does a molecule with 5 rotatable bonds have, compared to one with 4 rotatable bonds? Having too many rotatable bonds results in “combinatorial explosion” Also ring conformations Taxol

Types of search algorithms Classified based on the degrees of freedom that they consider Rigid docking –The ligand is treated as a rigid structure during the docking Only the translational and rotational degrees of freedom are considered –To deal with the problem of ligand conformations, a large number of conformations of each ligand are generated in advance and each is docked separately Flexible docking is more common today –Conformations of each molecule are generated on-the-fly by the search algorithm during the docking process –Avoids considering conformations that do not fit –Exhaustive (systematic) searching computationally too expensive as the search space is very large –One common approach is to use stochastic search methods These don’t guarantee optimum solution, but good solution within reasonable length of time Stochastic means that they incorporate a degree of randomness Includes genetic algorithms (GOLD), simulated annealing (AutoDock)

Handling protein conformations Most docking software treats the protein as rigid –Rigid Receptor Approximation This approximation may be invalid for a particular protein-ligand complex as... –the protein may deform slightly to accommodate different ligands (ligand-induced fit) –protein side chains in the active site may adopt different conformations Some docking programs allow protein side-chain flexibility –For example, selected side chains are allowed to undergo torsional rotation around acyclic bonds –Increases the search space Larger protein movements can only be handled by separate dockings to different protein conformations

The perfect scoring function will… Accurately calculate the binding affinity –Will allow actives to be identified in a virtual screen –Be able to rank actives in terms of affinity Score the poses of an active higher than poses of an inactive –Will rank actives higher than inactives in a virtual screen Score the correct pose of the active higher than an incorrect pose of the active –Will allow the correct pose of the active to be identified Broadly speaking, scoring functions can be divided into the following classes: –Forcefield-based Based on terms from molecular mechanics forcefields GoldScore, DOCK, AutoDock –Empirical Parameterised against experimental binding affinities ChemScore, PLP, Glide SP/XP –Knowledge-based potentials Based on statistical analysis of observed pairwise distributions PMF, DrugScore, ASP

Böhm’s empirical scoring function The hydrogen bonding and ionic terms are both dependent on the geometry of the interaction, with large deviations from ideal geometries (ideal distance R, ideal angle α) being penalised. The lipophilic term is proportional to the contact surface area (Alipo) between protein and ligand involving non-polar atoms. The conformational entropy term is the penalty associated with freezing internal rotations of the ligand. It is largely entropic in nature. Here the value is directly proportional to the number of rotatable bonds in the ligand (NROT). The ∆G values on the right of the equation are all constants determined using multiple linear regression on experimental binding data for 45 protein–ligand complexes –Hence “empirical” In general, scoring functions assume that the free energy of binding can be written as a linear sum of terms to reflect the various contributions to binding Bohm, J. Comput.-Aided Mol. Des., 1994, 8, 243 Bohm’s scoring function included contributions from hydrogen bonding, ionic interactions, lipophilic interactions and the loss of internal conformational freedom of the ligand.

Pose prediction accuracy Given a set of actives with known crystal poses, can they be docked accurately? Accuracy measured by RMSD (root mean squared deviation) compared to known crystal structures –RMSD = square root of the average of (the difference between a particular coordinate in the crystal and that coordinate in the pose) 2 –Within 2.0Å RMSD considered cut-off for accuracy –More sophisticated measures have been proposed, but are not widely adopted In general, the best docking software predicts the correct pose about 70% of the time Note: it’s always easier to find the correct pose when docking back into the active’s own crystal structure –More difficult to cross-dock

Assess performance of a virtual screen Need a dataset of N act known actives, and inactives Dock all molecules, and rank each by score Ideally, all actives would be at the top of the list –In practice, we are interested in any improvement over what is expected by chance Define enrichment, E, as the number of actives found (N found ) in the top X% of scores ( typically 1% or 5% ), compared to how many expected by chance –E = N found / (N act * X/100) –E > 1 implies “positive enrichment”, better than random –E < 1 implies “negative enrichment”, worse than random Why use a cut-off instead of looking at the mean rank of the actives? –Typically, the researchers might test only have the resources to experimentally test the top 1% or 5% of compounds More sophisticated approaches have been developed (e.g. BEDROC) but enrichment is still widely used

Preparing the protein structure The Protein Data Bank (PDB) is a repository of protein crystal structures, often in complexes with inhibitors PDB structures often contain water molecules –In general, all water molecules are removed except where it is known that they play an important role in coordinating to the ligand PDB structures are missing all hydrogen atoms –Many docking programs require the protein to have explicit hydrogens. In general these can be added unambiguously, except in the case of acidic/basic side chains An incorrect assignment of protonation states in the active site will give poor results Glutamate, Aspartate have COO- or COOH –OH is hydrogen bond donor, O- is not Histidine is a base and its neutral form has two tautomers

Preparing the protein structure For particular protein side chains, the PDB structure can be incorrect Crystallography gives electron density, not molecular structure –In poorly resolved crystal structures of proteins, isoelectronic groups can give make it difficult to deduce the correct structure Affects asparagine, glutamine, histidine Important? Affects hydrogen bonding pattern May need to flip amide or imidazole –How to decide? Look at hydrogen bonding pattern in crystal structures containing ligands