Solubility is an important issue in drug discovery and a major source of attrition This is expensive for the pharma industry A good model for predicting.

Slides:



Advertisements
Similar presentations
Computers in Chemistry Dr John Mitchell & Rosanna Alderson University of St Andrews.
Advertisements

Evaluating Free Energies of Binding using Amber: The MM-PBSA Approach.
Crystallography, Birkbeck MOLECULAR SIMULATIONS ALL YOU (N)EVER WANTED TO KNOW Julia M. Goodfellow Dynamic Processes: Lecture 1 Lecture Notes.
Design of Experiments Lecture I
Section 06 General Concepts of Chemical Equilibrium.
Computational simulation of Molecular dynamics Talk given at the theory group, School of Physics, USM 31 Oct pm.
In silico calculation of aqueous solubility Dr John Mitchell University of St Andrews.
In silico prediction of solubility: Solid progress but no solution? Dr John Mitchell University of St Andrews.
Introduction to Molecular Orbitals
Chemistry 6440 / 7440 Semi-Empirical Molecular Orbital Methods.
Lipinski’s rule of five
Computers in Chemistry Dr John Mitchell University of St Andrews.
Chemistry 6440 / 7440 Models for Solvation
Quantum Chemical and Machine Learning Approaches to Property Prediction for Druglike Molecules Dr John Mitchell University of St Andrews.
Case Studies Class 5. Computational Chemistry Structure of molecules and their reactivities Two major areas –molecular mechanics –electronic structure.
Cheminformatics II Apr 2010 Postgrad course on Comp Chem Noel M. O’Boyle.
The Advanced Chemical Engineering Thermodynamics The retrospect of the science and the thermodynamics Q&A -1- 9/16/2005(1) Ji-Sheng Chang.
Experimental Evaluation
Computers in Chemistry Dr John Mitchell University of St Andrews.
Monte Carlo Methods in Partial Differential Equations.
Computational Chemistry. Overview What is Computational Chemistry? How does it work? Why is it useful? What are its limits? Types of Computational Chemistry.
An Introduction to Molecular Orbital Theory. Levels of Calculation Classical (Molecular) Mechanics quick, simple; accuracy depends on parameterization;
Calculation of Molecular Structures and Properties Molecular structures and molecular properties by quantum chemical methods Dr. Vasile Chiş Biomedical.
1 Can we Predict Anything Useful from 2-D Molecular Structure? Dr John Mitchell Unilever Centre for Molecular Science Informatics Department of Chemistry.
Objectives of this course
Molecular Modeling Fundamentals: Modus in Silico C372 Introduction to Cheminformatics II Kelsey Forsythe.
By: Lea Versoza. Chemistry  A branch of physical science, is the study of the composition, properties and behavior of matter.  Is concerned with atoms.
In Silico Methods for ADMET and Solubility Prediction
Calibration Guidelines 1. Start simple, add complexity carefully 2. Use a broad range of information 3. Be well-posed & be comprehensive 4. Include diverse.
David Kim Allergan Inc. SoCalBSI California State University, Los Angeles.
Chem 1140; Molecular Modeling Molecular Mechanics Semiempirical QM Modeling CaCHE.
Outline 1-D regression Least-squares Regression Non-iterative Least-squares Regression Basis Functions Overfitting Validation 2.
1.Solvation Models and 2. Combined QM / MM Methods See review article on Solvation by Cramer and Truhlar: Chem. Rev. 99, (1999)
Quantum Chemical and Machine Learning Calculations of the Intrinsic Aqueous Solubility of Druglike Molecules Dr John Mitchell University of St Andrews.
1 Modelling in Chemistry: High and Low-Throughput Regimes Dr John Mitchell Unilever Centre for Molecular Science Informatics Department of Chemistry University.
Hypothesis testing and effect sizes Sylvain Chartier Laboratory for Computational Neurodynamics and Cognition Centre for Neural Dynamics.
Understanding Molecular Simulations Introduction
Properties of Pure Substances Chapter 3. Why do we need physical properties?  As we analyze thermodynamic systems we describe them using physical properties.
NCHRP Project Development of Verification and Validation Procedures for Computer Simulation use in Roadside Safety Applications SURVEY OF PRACTITIONERS.
Real Gas Relationships
Informed by Informatics? Dr John Mitchell Unilever Centre for Molecular Science Informatics Department of Chemistry University of Cambridge, U.K.
Protein Structure Prediction: Homology Modeling & Threading/Fold Recognition D. Mohanty NII, New Delhi.
In silico calculation of aqueous solubility Dr John Mitchell Unilever Centre for Molecular Science Informatics Department of Chemistry University of Cambridge,
Stats Lunch: Day 3 The Basis of Hypothesis Testing w/ Parametric Statistics.
Chapter 2 Modeling Approaches  Physical/chemical (fundamental, global) Model structure by theoretical analysis  Material/energy balances  Heat, mass,
Quantum Mechanics/ Molecular Mechanics (QM/MM) Todd J. Martinez.
LITERATURE SEARCH ASSIGNMENT A) Properties of diatomic molecules A diatomic molecule is a molecule composed of two atoms. For homonuclear diatomics the.
Javier Junquera Introduction to atomistic simulation methods in condensed matter Alberto García Pablo Ordejón.
12 Thermodynamics 12.1 Types of Enthalpy Change 12.2 Born-Haber Cycles 12.3 Enthalpy Changes – Enthalpy of Solution 12.4 Mean Bond Enthalpy 12.5 Entropy.
Interacting Molecules in a Dense Fluid
Role of Theory Model and understand catalytic processes at the electronic/atomistic level. This involves proposing atomic structures, suggesting reaction.
Review Session BS123A/MB223 UC-Irvine Ray Luo, MBB, BS.
In silico calculation of aqueous solubility Dr John Mitchell University of St Andrews.
Generalized van der Waals Partition Function
Development of Methods for Predicting Solvation and Separation of Energetic Materials in Supercritical Fluids Jason Thompson, Casey Kelly, Benjamin Lynch,
Advanced methods of molecular dynamics 1.Monte Carlo methods 2.Free energy calculations 3.Ab initio molecular dynamics 4.Quantum molecular dynamics 5.Trajectory.
Chapter 14 Introduction to Regression Analysis. Objectives Regression Analysis Uses of Regression Analysis Method of Least Squares Difference between.
Physiochemical properties of drugs Using the Sirius T3 to make measurements.
Lipinski’s rule of five
Topic 3: Thermal physics 3.1 – Thermal concepts
Applications of the Canonical Ensemble: Simple Models of Paramagnetism
S as an energy relationship
Overview of Molecular Dynamics Simulation Theory
Applications of the Canonical Ensemble:
TOPIC 17 EQUILIBRIUM 17.1 THE EQUILIBRIUM LAW.
The loss function, the normal equation,
Mathematical Foundations of BME Reza Shadmehr
Humanity v The Machines
Quantum One.
Presentation transcript:

Solubility is an important issue in drug discovery and a major source of attrition This is expensive for the pharma industry A good model for predicting the solubility of druglike molecules would be very valuable.

How should we approach the prediction/estimation/calculation of the aqueous solubility of druglike molecules? Two (apparently) fundamentally different approaches

Modelling in Chemistry Density Functional Theory ab initio Molecular Dynamics Monte Carlo Docking PHYSICS-BASED EMPIRICAL ATOMISTIC Car-Parrinello NON-ATOMISTIC DPD CoMFA 2-D QSAR/QSPR Machine Learning AM1, PM3 etc. Fluid Dynamics

Density Functional Theory ab initio Molecular Dynamics Monte Carlo Docking Car-Parrinello DPD CoMFA 2-D QSAR/QSPR Machine Learning AM1, PM3 etc. HIGH THROUGHPUT LOW THROUGHPUT Fluid Dynamics

Density Functional Theory ab initio Molecular Dynamics Monte Carlo Docking Car-Parrinello DPD CoMFA 2-D QSAR/QSPR Machine Learning AM1, PM3 etc. INFORMATICS THEORETICAL CHEMISTRY NO FIRM BOUNDARIES! Fluid Dynamics

Although their scientific domains overlap, they represent distinct philosophies …

The Two Faces of Computational Chemistry Theoretical Chemistry Informatics

“The problem is too difficult to solve using physics and chemistry, so we will design a black box to link structure and solubility”

Informatics and Empirical Models In general, Informatics methods represent phenomena mathematically, but not in a physics-based way. Inputs and output model are based on an empirically parameterised equation or more elaborate mathematical model. Do not attempt to simulate reality. Usually High Throughput.

Density Functional Theory ab initio Molecular Dynamics Monte Carlo Docking Car-Parrinello DPD CoMFA 2-D QSAR/QSPR (regression) Machine Learning (Support Vector Machine, Random Forest, Neural Network etc.) AM1, PM3 etc. Fluid Dynamics Towards Solubility Calculation: Informatics Techniques

Theoretical Chemistry “The problem is difficult, but by making suitable approximations we can solve it at reasonable cost based on our understanding of physics and chemistry”

Theoretical Chemistry Calculations and simulations based on real physics. Calculations are either quantum mechanical or use parameters derived from quantum mechanics. Attempt to model or simulate reality. Usually Low Throughput.

Drug Disc.Today, 10 (4), 289 (2005)

Density Functional Theory ab initio Molecular Dynamics Monte Carlo Docking Car-Parrinello DPD CoMFA 2-D QSAR/QSPR Machine Learning AM1, PM3 etc. Fluid Dynamics Towards Solubility Calculation: Theoretical Chemistry Techniques

The state of the art is that … … solubility has proved a difficult property to calculate. It involves different phases (solid & solution) and different substances (solute and solvent), and both enthalpy & entropy are important. The theoretical approaches are generally based around thermodynamic cycles and involve some empirical element.

The state of the art is that … … there has been solid progress but no solution. Some examples of good papers in the area …

We want to construct a theoretical model that will predict solubility for druglike molecules … We expect our model to use real physics and chemistry and to give some insight … We don’t expect it to be fast by QSPR standards, but it should be reasonably accurate … Now to our own work … We may need to include some empirical parameters…

Reference

Where do the data come from? Supersaturated Solution Subsaturated Solution 8 Intrinsic solubility values

Solubility measurements Classical Method Shake Flask Method ● Many published variations of this method ● With DMSO (normal method in industry) Kinetic Solubility ● Equilibria reached? ● Filtering Big errors ● Detection by UV-Vis Chromophores needed Datasets compiled from diverse literature data may have significant random and systematic errors.

K0K0 KaKa Intrinsic solubility- Of an ionisable compound is the thermodynamic solubility of the free acid or base form (Horter, D, Dressman, J. B., Adv. Drug Deliv. Rev., 1997, 25, 3-14) A NaA - ………. Na + AH S 0 is essentially the solubility of the neutral form only.

Diclofenac In Solution Powder ● We continue “Chasing equilibrium” until a specified number of crossing points have been reached ● A crossing point represents the moment when the solution switches from a saturated solution to a subsaturated solution; no change in pH, gradient zero, no re-dissolving nor precipitating…. SOLUTION IS IN EQUILIBRIUM Error less than 0.05 log units !!!! Supersaturated Solution Subsaturated Solution 8 Intrinsic solubility values * A. Llinàs, J. C. Burley, K. J. Box, R. C. Glen and J. M. Goodman. Diclofenac solubility: independent determination of the intrinsic solubility of three crystal forms. J. Med. Chem. 2007, 50(5), ● First precipitation – Kinetic Solubility (Not in Equilibrium) ● Thermodynamic Solubility through “Chasing Equilibrium”- Intrinsic Solubility (In Equilibrium) Supersaturation Factor SSF = S kin – S 0 “CheqSol”

For this study we (Toni Llinàs) measured 30 solubilities using the CheqSol method and took another 30 from other high quality studies (Bergstrom & Rytting). We use a Sirius glpKa instrument

Our goal is to ask …

Can we use theoretical chemistry to calculate solubility via a thermodynamic cycle?

 G sub from lattice energy & an entropy term (DMAREL based on B3LYP/6-31G*)  G solv from a semi-empirical solvation model (SCRF B3LYP/6-31G* in Jaguar) (i.e., different kinds of theoretical/computational methods, albeit with consistent functional and basis set )  G tr from ClogP

 G sub =  H sub – T  S sub  H sub = -U latt – 2RT  S sub = S trans,gas + S rot,gas – S phonon,xl U latt and S phonon,xl are obtained from DMAREL, using B3LYP/6-31G*. Intramolecular vibrational entropy is assumed to cancel.

 G solv from a semi-empirical solvation model (SCRF B3LYP/6-31G* in Jaguar) This is likely to be the least accurate term in our equation. We also tried SM5.4 with AM1 & PM3 in Spartan, with similar results.

 G tr from ClogP ClogP is a fragment-based (informatics) method of estimating the octanol-water partition coefficient.

What Error is Acceptable? For typically diverse sets of druglike molecules, a “good” QSPR will have an RMSE ≈ 0.7 logS units. An RMSE > 1.0 logS unit is probably unacceptable. This corresponds to an error range of 4.0 to 5.7 kJ/mol in  G sol.

What Error is Acceptable? A useless model would have an RMSE close to the SD of the test set logS values: ~ 1.4 logS units; The best possible model would have an RMSE close to the SD resulting from the experimental error in the underlying data: ~ 0.5 logS units?

Results from Theoretical Calculations

● Direct calculation was a nice idea, but didn’t quite work – errors larger than QSPR ● “Why not add a correction factor to account for the difference between the theoretical methods?” ● This was originally intended to calibrate the different theoretical approaches, but … Results from Theoretical Calculations

● Direct calculation was a nice idea, but didn’t quite work – errors larger than QSPR ● “Why not add a correction factor to account for the difference between the theoretical methods?” ● This was originally intended to calibrate the different theoretical approaches, but … Results from Theoretical Calculations

● Direct calculation was a nice idea, but didn’t quite work – errors larger than QSPR ● “Why not add a correction factor to account for the difference between the theoretical methods?” ● This was originally intended to calibrate the different theoretical approaches, but … Results from Theoretical Calculations

… ● Within a week this had become a hybrid method, essentially a QSPR with the theoretical energies as descriptors

Results from Hybrid Model

We find that  G solv is poorly correlated with logS and fails to appear in the regression equation. B_rotR is the proportion of bonds that are rotatable and describes the propensity of flexible molecules to be more soluble. We can also write an almost equivalent equation in the form …

This regression equation gave r 2 =0.77 and RMSE=0.71

How Well Did We Do? For a training-test split of 34:26, we obtain an RMSE of 0.71 logS units for the test set. This is comparable with the performance of “pure” QSPR models. This corresponds to an error of about 4.0 kJ/mol in  G sol.

Drug Disc.Today, 10 (4), 289 (2005)

U latt  S sub & b_rotR  G solv & ClogP

Solubility by TD Cycle: Conclusions ● We have a hybrid part-theoretical, part-empirical method. ● An interesting idea, but relatively low throughput - and a crystal structure is needed. ● Similarly accurate to pure QSPR for a druglike set. ● Instructive to compare with literature of theoretical solubility studies.

Solubility by TD Cycle: Conclusions ● We have a hybrid part-theoretical, part-empirical method. ● An interesting idea, but relatively low throughput - and a crystal structure is needed. ● Similarly accurate to pure QSPR for a druglike set. ● Instructive to compare with literature of theoretical solubility studies.

Solubility by TD Cycle: Conclusions ● We have a hybrid part-theoretical, part-empirical method. ● An interesting idea, but relatively low throughput - and a crystal structure is needed. ● Similarly accurate to pure QSPR for a druglike set. ● Instructive to compare with literature of theoretical solubility studies.

Solubility by TD Cycle: Conclusions ● We have a hybrid part-theoretical, part-empirical method. ● An interesting idea, but relatively low throughput - and a crystal structure is needed. ● Similarly accurate to pure QSPR for a druglike set. ● Instructive to compare with literature of theoretical solubility studies.

Thanks Pfizer & PIPMS Unilever Dave Palmer Toni Llinàs Florian Nigsch Laura Hughes