Insight into Molecular Geometry and Interactions using Small Molecule Crystallographic Data John Liebeschuetz Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge, UK
How much Data is Available? CSD Growth
– Growth of the Cambridge Structural Database over 40 years: ,768 entries (June 2007)
– Predicted growth to 2010: >500,000 entries during 2009
CSD Data Content
4-Oxonicotinamide-1-(1’-beta-D-2’,3’,5’-tri-O-acetyl-ribofuranoside)
C17 H20 N2 O9
Literature Reference: G. Bringmann, M. Ochse, K. Wolf, J. Kraus, K. Peters, E-M. Peters, M. Herderich, L. Ake, F. Tayman, Phytochemistry 51, 1999, 271
Other text:
– R-factor: 0.0506
– Colour: pale yellow
– Habit: acicular
– Polymorph: Form IV
– Source: Rothmannia longiflora
Molecular Interactions as well as Geometry HEPPEX
Cambridge Structural Database System
– Cambridge Structural Database
– PreQuest: Database production
– ConQuest: Database search
– VISTA: Statistical analysis
– Mercury: Graphical display, packing analysis
Knowledge Bases
– Mogul: Library of molecular geometry
– IsoStar: Library of intermolecular interactions
Using Structural Data in molecular modelling for pharmaceutical design
Intramolecular – 3D geometry
– Designing in the desired conformer
– Validation that models have correct geometry
Intermolecular – Interactions between molecules
– Design of pharmacophores
– Validation of interactions found during modelling
– Identification of new ways to satisfy binding motifs
– Knowledge-based scoring functions for docking
Designing in the right Conformation (1)
Using ConQuest, it is possible to generate incidence histograms for any geometric feature of any substructure, provided that enough high-quality structures containing that substructure are present in the CSD.
Brameld, K.A., Kuhn, B., Reuter, D.C. and Stahl, M., J. Chem. Inf. Model., 48(1)
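As an illustration of what such an incidence histogram involves, the sketch below computes a signed torsion angle from four atom positions and bins a list of torsions. It is not ConQuest code: the function names `torsion_deg` and `torsion_histogram` and the 30 degree bin width are assumptions made for this example, and the coordinates would in practice come from CSD search hits.

```python
import math
from collections import Counter

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def torsion_deg(p1, p2, p3, p4):
    """Signed torsion angle p1-p2-p3-p4 in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def torsion_histogram(angles, bin_width=30):
    """Count torsions per bin; keys are the lower bin edges in degrees."""
    n_bins = 360 // bin_width
    counts = Counter()
    for angle in angles:
        # Clamp so that exactly +180 falls into the last bin.
        index = min(int((angle + 180.0) // bin_width), n_bins - 1)
        counts[-180 + index * bin_width] += 1
    return dict(sorted(counts.items()))
```

Calling `torsion_histogram` on the torsions measured for one substructure across many hits gives the kind of distribution shown on these slides.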
Designing in the right Conformation (2)
– Sulphonamide is common in drug molecules; its conformational behaviour is well captured by the CSD
– Ortho substitution (blue histogram) shifts the maximum
– Pyramidalisation of the sulphonamide N can also be explored. This is a common effect in sulphonamides (and piperidines) but is poorly reproduced by modelling software
Brameld, K.A., Kuhn, B., Reuter, D.C. and Stahl, M., J. Chem. Inf. Model., 48(1)
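One simple way to quantify pyramidalisation is the sum of the three valence angles at the nitrogen: 360 degrees for a planar centre, lower values as the centre becomes more pyramidal. The sketch below is an assumed, generic measure and is not necessarily the descriptor used in the cited analysis; the function names are hypothetical.

```python
import math

def angle_deg(centre, a, b):
    """Valence angle a-centre-b in degrees."""
    va = [a[i] - centre[i] for i in range(3)]
    vb = [b[i] - centre[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(x * x for x in vb))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cosine))

def angle_sum(centre, s1, s2, s3):
    """Sum of the three valence angles at a three-coordinate centre."""
    return (angle_deg(centre, s1, s2)
            + angle_deg(centre, s2, s3)
            + angle_deg(centre, s1, s3))
```

Histogramming `angle_sum` over many CSD hits for a given substructure would show how often, and how strongly, the nitrogen departs from planarity.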
Designing in the right Conformation (3): Example 1
CSD analysis indicates that the bioactive conformation is stable only for the most active structure
Brameld, K.A., Kuhn, B., Reuter, D.C. and Stahl, M., J. Chem. Inf. Model., 48(1)
Validation of Model Geometry: Mogul
– Rapid access to geometric information from the CSD
– Incorporates pre-computed libraries of bond lengths, valence angles and torsion angles
– >20 million individual geometrical parameters, derived entirely from the CSD and updated annually
– Sketch or import a molecule, then click on a feature of interest to view its distribution, mean values and statistics
Bruno et al., J. Chem. Inf. Comput. Sci., 44, 2004
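The idea of checking a model parameter against a library distribution can be sketched with a z-score. This is an illustration only, not Mogul's algorithm: the function names, the threshold of 2 standard deviations and the sample bond lengths are all invented for the example. A z-score suits roughly unimodal bond-length and angle distributions; multimodal torsion distributions need a different treatment.

```python
import statistics

def z_score(query, observations):
    """Number of sample standard deviations between query and the mean."""
    mean = statistics.mean(observations)
    spread = statistics.stdev(observations)
    return (query - mean) / spread

def is_unusual(query, observations, threshold=2.0):
    """Flag a value lying more than `threshold` deviations from the mean."""
    return abs(z_score(query, observations)) > threshold

# Made-up bond lengths (in angstroms) standing in for a library distribution.
library_lengths = [1.50, 1.52, 1.54, 1.53, 1.51]
print(is_unusual(1.60, library_lengths))
print(is_unusual(1.52, library_lengths))
```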
Select fragment
Validation of PDB Ligand Geometry
– PDB structures suffer from less well defined electron density
– Protein X-ray refinement force fields are often poorly parameterised to reproduce ligand geometries
– Sometimes protein crystallographers start with a poor ligand model
Validation of ligand structures in the PDB via the protein/ligand analysis tool Relibase+
Validation of ligand structures found in the PDB
– Ligand from 1HAK: two abnormal torsions indicated
– Further examination reveals the piperidine to be in the boat form
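To show how ring torsions distinguish a boat from a chair, the sketch below classifies a six-membered ring from its six consecutive endocyclic torsions. It is a crude heuristic written for this example, not the method used by Mogul or Relibase+; the function name and both thresholds (20 and 30 degrees) are assumptions. An ideal chair has torsions alternating around +55 and -55 degrees, whereas an ideal boat has two opposite torsions near 0 degrees.

```python
def classify_six_ring(torsions, flat=20.0, puckered=30.0):
    """Label a six-membered ring 'chair', 'boat' or 'other'.

    torsions: the six endocyclic torsion angles in degrees, in ring order.
    """
    if len(torsions) != 6:
        raise ValueError("expected six ring torsions")

    # Chair: neighbouring torsions have opposite signs and all are puckered.
    signs_alternate = all(
        torsions[i] * torsions[(i + 1) % 6] < 0 for i in range(6)
    )
    if signs_alternate and all(abs(t) > puckered for t in torsions):
        return "chair"

    # Boat: one pair of opposite torsions is flat, the other four are puckered.
    for i in range(3):
        if abs(torsions[i]) < flat and abs(torsions[i + 3]) < flat:
            others = [torsions[j] for j in range(6) if j not in (i, i + 3)]
            if all(abs(t) > puckered for t in others):
                return "boat"

    return "other"
```

Twist-boat and half-chair rings fall into "other" here; a real ring knowledge base would treat them explicitly.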
Validation of ligand structures found in the PDB using Mogul
– 15% of 100 recent PDB entries have ligand geometries that are almost certainly in significant error (in-house analysis using Relibase+/Mogul)
– The good news: for structures deposited before 2000 the figure is 26%
Designing in the right Conformation (3): Large Rings
Brameld, K.A., Kuhn, B., Reuter, D.C. and Stahl, M., J. Chem. Inf. Model., 48(1)
Mogul 1.3: Ring Conformations
– Mogul currently holds data on bonds, angles and torsions
– In the 2010 release of the Cambridge Structural Database System, Mogul will also contain a comprehensive ring knowledge base
– Ring libraries from α-Mogul 1.3 have been introduced into Gold 4.1 to allow knowledge-based ring-flexing during docking
IsoStar: A Knowledge Base of Intermolecular Interactions
Experimental data from:
– Cambridge Structural Database
– Protein Data Bank (protein-ligand complexes only)
Theoretical potential energy minima (DMA, IMPT)
Typical Uses:
– Probability of an interaction occurring
– Preferred geometries
– Design strategies
IsoStar Methodology
– Central group: -CONH2; contact group: NH
– Search the CSD or PDB for structures containing the contact
– Superimpose the hits and display the distribution
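The superposition step can be sketched by expressing each contact atom in a coordinate frame attached to its own central group, so that all hits share one frame. This is an assumed simplification, not IsoStar's actual procedure: it defines the frame from three central-group atoms instead of fitting the whole group, and every function name is hypothetical.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _unit(v):
    length = math.sqrt(_dot(v, v))
    return (v[0] / length, v[1] / length, v[2] / length)

def local_frame(origin, x_atom, plane_atom):
    """Right-handed orthonormal axes built from three central-group atoms."""
    ex = _unit(_sub(x_atom, origin))
    ez = _unit(_cross(ex, _sub(plane_atom, origin)))
    ey = _cross(ez, ex)
    return ex, ey, ez

def to_local(point, origin, frame):
    """Coordinates of point relative to origin, along the frame axes."""
    rel = _sub(point, origin)
    return tuple(_dot(rel, axis) for axis in frame)

def overlay_contacts(hits):
    """hits: iterable of (origin, x_atom, plane_atom, contact) coordinates.

    Returns every contact position in the shared central-group frame.
    """
    return [to_local(c, o, local_frame(o, x, p)) for o, x, p, c in hits]
```

The list returned by `overlay_contacts` is what a scatterplot displays; binning those points on a grid would give a density map.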
IsoStar Scatterplots vs. Density Maps
N-H donors around amide C=O, shown as a scatterplot and as a contour surface
IsoStar – indole and isoxazole interactions with faces of phenyl rings
Using Intermolecular information to build pharmacophores from proteins
– Use intermolecular information (IsoStar) to map a protein binding site (e.g. using SuperStar, an extra module of the CSDS)
– Create a pharmacophore from this information (possible in SuperStar)
– cf. GRID/FLAP
Motif searching
– Certain signature interaction motifs might be key to identifying inhibitor substructures of interest
– Can we identify such motifs in the CSD and thereby uncover new ideas?
Materials Mercury: A new tool for the drug development and crystal design community
– Most tools are specific to small molecule crystals. However…
Packing Feature Search
1. Comparison of crystal structures (polymorphs, solvates etc.) can identify significant ‘packing features’.
2. We can then search the CSD using ‘Packing Feature Search’.
H-bonding Motif Search: Kinase Binding Motifs
– CDK2 complex: 1ke8
– Set up a ‘Packing Feature Search’ around the hinge region
H-bonding Motif Search: Kinase Binding Motifs
– MISTOX, WUSQAC
– Provides ideas for new motifs: fragment-based design
Knowledge-based scoring using small molecule structural data
– Protein/ligand docking relies on a scoring function to rank binding poses
– Scoring functions may be molecular mechanics based, empirical or knowledge based
– A knowledge-based score is calculated as the sum of atom-atom potentials derived from a crystallographic database
– The atom-atom potential $= -\log\left(\dfrac{\text{observed interactions}}{\text{reference state}}\right)$
– Knowledge-based scoring functions (PMF, Bleep, DrugScore, ASP) have been developed using protein-ligand data (PDB)
– The CSD contains better resolved structures and a much greater variety of chemical functionality than the PDB
– DrugScore CSD has demonstrably improved performance over DrugScore (Velec, Gohlke & Klebe, J. Med. Chem., 48 (2005), 6296)
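The scoring scheme above can be sketched in a few lines: each atom-atom potential is the negative log of the ratio of observed to reference counts, and a pose score is the sum of the potentials of its contacts. This is a minimal illustration, not the implementation of PMF, Bleep, DrugScore or ASP. The natural logarithm, the key layout (atom type, atom type, distance bin), the atom type labels and all counts are assumptions made for the example.

```python
import math

def pair_potential(observed, reference):
    """Atom-atom potential: -ln(observed / reference)."""
    if observed <= 0 or reference <= 0:
        raise ValueError("counts must be positive")
    return -math.log(observed / reference)

def build_potentials(observed_counts, reference_counts):
    """Potential for every (type_a, type_b, distance_bin) key."""
    return {
        key: pair_potential(observed_counts[key], reference_counts[key])
        for key in observed_counts
    }

def score_pose(contacts, potentials):
    """Sum of the potentials of all atom-atom contacts in a pose."""
    return sum(potentials[contact] for contact in contacts)

# Invented counts: the N...O contact is seen twice as often as expected.
observed = {("N.donor", "O.acceptor", 3.0): 200, ("C", "C", 4.0): 100}
reference = {("N.donor", "O.acceptor", 3.0): 100, ("C", "C", 4.0): 100}
potentials = build_potentials(observed, reference)
print(score_pose(list(observed), potentials))
```

Contacts seen more often than the reference state predicts get a negative (favourable) potential, so lower scores indicate better poses under this convention.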
Uses of Small Molecule Structural Data in Drug Design: Conclusions
Use in Model Validation
– Geometry of designed synthetic candidates
– Geometry of X-ray derived ligand structures
– Intermolecular interactions of a candidate structure with a model of the binding site
Design of Pharmacophores
Search for fragments fitting a binding motif
Creation of robust and versatile knowledge-based scoring functions for docking
Acknowledgements
Jana Henneman
James Chisholm
Thank you for your attention