Molecular Flexibility
Esther Kellenberger, Faculté de Pharmacie, UMR 7200, Illkirch
Tel: 03 68 85 42 21, e-mail: ekellen@unistra.fr
Outline: introduction, force field, geometry-based sampling, energy-based sampling, conclusion
Molecules have geometries…
… « good » geometries in bioactive conformations.
Methotrexate is used in the treatment of cancer and autoimmune diseases.
Figure: methotrexate bound to its therapeutic targets (dihydrofolate reductase and thymidylate synthase).
Molecules have geometries…
… and there are impossible conformations: unusual bond lengths, steric collisions, distorted rings, …
The number of molecular conformations
… depends on the molecular degrees of freedom, i.e. the number of rotatable bonds (NROT), approximately the number of single bonds between two non-hydrogen atoms.
For methotrexate, NROT = 10.
Considering 3 possible angular values for each rotatable bond yields $3^{10} = 59\,049$ different conformations.
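The count above follows a simple power law. A minimal sketch, assuming every rotatable bond is sampled on the same number of discrete angular values (the function name is illustrative, not from the slides):

```python
def conformer_count(nrot: int, angles_per_bond: int = 3) -> int:
    """Number of conformers on a regular torsion grid."""
    return angles_per_bond ** nrot


# Methotrexate: NROT = 10, 3 angular values per rotatable bond
print(conformer_count(10))       # 59049
# Same molecule on a finer grid (12 values per bond, i.e. 30-degree steps)
print(conformer_count(10, 12))   # 61917364224
```

The second call shows how quickly the count grows when the angular grid is refined.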
How to evaluate the conformations? By their potential energy.
In physics, potential energy exists when a force acts upon an object and tends to restore it to a lower-energy configuration.
Potential energy is the energy stored in a body or in a system due to its position in a force field or due to its configuration (SI unit: joule; common unit: kcal/mol; 1 cal = 4.1868 J).
A force field is a vector field that describes a non-contact force acting on a particle at various positions in space.
Stable (good) conformation: low energy.
Unstable (bad) conformation: high energy.
An experimental property of a molecule is the mean of the properties of its populated conformers
Boltzmann's probability distribution: $P(\text{conformer of energy } E) \propto \exp(-E / k_B T)$
Boltzmann averaging for the observed property: $\text{Property(molecule)} = \sum_{\text{conformers}} P(\text{conformer}) \times \text{property(conformer)}$
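Boltzmann averaging can be sketched in a few lines. The example below assumes energies in kcal/mol, a temperature of 298.15 K, and normalizes the Boltzmann factors so that the populations sum to 1; the conformer energies and property values are made up for illustration.

```python
import math

KB_KCAL = 0.0019872  # Boltzmann (gas) constant in kcal/(mol*K)


def boltzmann_weights(energies, temperature=298.15):
    """Normalized populations P_i proportional to exp(-E_i / kB T)."""
    e_min = min(energies)  # shifting by the minimum avoids overflow
    factors = [math.exp(-(e - e_min) / (KB_KCAL * temperature)) for e in energies]
    total = sum(factors)
    return [f / total for f in factors]


def boltzmann_average(energies, properties, temperature=298.15):
    """Population-weighted mean of a per-conformer property."""
    weights = boltzmann_weights(energies, temperature)
    return sum(w * p for w, p in zip(weights, properties))


# Hypothetical conformers: relative energies (kcal/mol) and a property value
energies = [0.0, 0.5, 2.0]
properties = [1.0, 2.0, 5.0]
print(boltzmann_weights(energies))
print(boltzmann_average(energies, properties))
```

Only energy differences matter here, which is why shifting all energies by the minimum leaves the populations unchanged.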
Chapter 1: Evaluation of the potential energy of conformers
Molecular mechanics
Molecular systems are modeled using Newton's laws:
- each atom is simulated as a single particle
- each particle is assigned a radius (van der Waals), a polarizability, and a constant net charge
- bonded interactions are treated as "springs" with an equilibrium distance equal to the bond length
The potential energy (E) of the molecular system in a given conformation is a sum of individual energy terms:
$E = E_{covalent} + E_{non\,covalent}$
Covalent contributions to E
Bond stretching. Examples of « standard » values: r0 = 1.53 Å for Csp3-Csp3, r0 = 1.09 Å for C-H.
Angle bending. Examples of « standard » values: θ0 = 109.5° for Csp3, θ0 = 120° for Csp2, θ0 = 180° for Csp.
Torsion correction term. Example of values: for Csp3-Csp3, n = 3 and γ = 0, so that Etors = 0 at 60°, 180° and -60°.
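The three covalent terms can be sketched with the functional forms most force fields use. This is an assumption, since the equations themselves did not survive on the slide: harmonic springs for bonds and angles, and a periodic cosine term for torsions. The force constants below are hypothetical placeholders; only r0, θ0, n and γ come from the slide.

```python
import math


def bond_energy(r, r0, k_bond):
    """Harmonic bond stretching: E = 1/2 k (r - r0)^2."""
    return 0.5 * k_bond * (r - r0) ** 2


def angle_energy(theta_deg, theta0_deg, k_angle):
    """Harmonic angle bending; angles in degrees, k in energy/rad^2."""
    delta = math.radians(theta_deg - theta0_deg)
    return 0.5 * k_angle * delta ** 2


def torsion_energy(phi_deg, v_n, n=3, gamma_deg=0.0):
    """Periodic torsion term: E = V_n/2 [1 + cos(n*phi - gamma)]."""
    return 0.5 * v_n * (1.0 + math.cos(math.radians(n * phi_deg - gamma_deg)))


# Csp3-Csp3 bond stretched by 0.05 Angstrom (k_bond is a made-up value)
print(bond_energy(1.58, 1.53, k_bond=300.0))
# Csp3 angle opened by 5 degrees (k_angle is a made-up value)
print(angle_energy(114.5, 109.5, k_angle=100.0))
# Threefold torsion with n = 3, gamma = 0: zero at 60, 180 and -60 degrees
for phi in (-60.0, 0.0, 60.0, 180.0):
    print(phi, round(torsion_energy(phi, v_n=2.0), 6))
```

With n = 3 and γ = 0 the torsion term vanishes at the staggered angles and peaks at the eclipsed ones (0° and ±120°), in line with the values quoted on the slide.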
Non covalent contributions to E
Van der Waals term: Lennard-Jones potential (6-12)
$E_{VdW} = A / r_{ij}^{12} - B / r_{ij}^{6}$, where $A = 4\varepsilon\sigma^{12}$ and $B = 4\varepsilon\sigma^{6}$
ε = depth of the well; σ = distance at which $E_{VdW} = 0$ (the minimum lies slightly further out, at $r_{ij} = 2^{1/6}\sigma \approx 1.12\,\sigma$)
Electrostatic term: Coulomb's law
$E_{Coulomb} = \delta^{+}\delta^{-} / (4\pi\varepsilon_0 r_{ij})$, where δ = charge and ε0 = dielectric constant (permittivity) of the medium, here the solvent
Desolvation and hydrophobic term
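A short sketch of the two pairwise terms, assuming distances in Å, energies in kcal/mol and charges in units of the elementary charge. The conversion factor 332.06 and the `dielectric` argument (a relative dielectric constant that screens the interaction) are conventions of this example, not values taken from the slide.

```python
def lennard_jones(r, epsilon, sigma):
    """6-12 potential: E = A / r^12 - B / r^6 with A = 4*eps*sigma^12, B = 4*eps*sigma^6."""
    a = 4.0 * epsilon * sigma ** 12
    b = 4.0 * epsilon * sigma ** 6
    return a / r ** 12 - b / r ** 6


COULOMB_CONST = 332.06  # 1/(4*pi*eps0) in kcal*Angstrom/(mol*e^2)


def coulomb(q_i, q_j, r, dielectric=1.0):
    """Coulomb energy between two point charges in a uniform medium."""
    return COULOMB_CONST * q_i * q_j / (dielectric * r)


# Hypothetical atom pair: well depth 0.1 kcal/mol, sigma 3.4 Angstrom
sigma, epsilon = 3.4, 0.1
r_min = 2.0 ** (1.0 / 6.0) * sigma
print(lennard_jones(sigma, epsilon, sigma))   # approximately 0 at r = sigma
print(lennard_jones(r_min, epsilon, sigma))   # approximately -epsilon at the minimum
# Opposite partial charges attract; a higher dielectric weakens the interaction
print(coulomb(0.4, -0.4, 3.0))
print(coulomb(0.4, -0.4, 3.0, dielectric=80.0))
```

In this form the Lennard-Jones energy crosses zero at r = σ and reaches its minimum, -ε, at r = 2^(1/6) σ.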
Key points on the energy surface
Figure: energy as a function of conformational state. Minima (local minima and the global minimum) correspond to « good » geometries, high-energy regions to « ugly » geometries; minima are separated by low or high barriers.
Energy minimization
Given a starting geometry, deterministic algorithms find the adjacent local minimum.
Figure: energy as a function of conformational state, with two starting points, each relaxing to the final geometry at its nearest minimum.
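The dependence on the starting geometry can be illustrated with steepest descent, one of the simplest deterministic minimizers. The one-dimensional double-well function below is a toy stand-in for a real energy surface, and the step size and tolerance are arbitrary choices for this sketch.

```python
def energy(x):
    """Toy 1D energy surface with two minima (not a real force field)."""
    return x ** 4 - 3.0 * x ** 2 + x


def gradient(x):
    """Analytical derivative of the toy energy."""
    return 4.0 * x ** 3 - 6.0 * x + 1.0


def steepest_descent(x, step=0.01, tol=1e-8, max_iter=100_000):
    """Follow the negative gradient until it (nearly) vanishes."""
    for _ in range(max_iter):
        g = gradient(x)
        if abs(g) < tol:
            break
        x -= step * g
    return x


left = steepest_descent(-2.0)   # relaxes into the left-hand minimum
right = steepest_descent(2.0)   # relaxes into the right-hand minimum
print(left, energy(left))
print(right, energy(right))
```

Both runs converge, but to different minima: the one started at x = 2 ends in the higher, local minimum and never sees the lower one on the other side of the barrier.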
The limits of conformational exploration by molecular dynamics
A molecular dynamics trajectory may be seen as an exchange of potential and kinetic energy, with the total energy being conserved.
The dynamic system consists of moving particles (i.e. the atoms of the molecule, with coordinates and velocities).
Particle positions as a function of time are obtained by solving the equations of motion derived from Newton's laws.
Sampling depends on the number of frames (time).
The amplitude of motion is controlled by heat.
Figure: energy as a function of conformational state, showing a starting geometry, heating, then minimisation.
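The exchange between potential and kinetic energy can be seen on a single particle attached to a spring. The sketch below integrates Newton's equation with the velocity Verlet scheme, a common choice in molecular dynamics codes and an assumption here, since the slide does not name an integrator; units, mass and force constant are arbitrary.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equation of motion; returns the list of (x, v) frames."""
    frames = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        frames.append((x, v))
    return frames


K_SPRING = 1.0  # hypothetical force constant
MASS = 1.0      # hypothetical particle mass


def spring_force(x):
    return -K_SPRING * x


def total_energy(x, v):
    return 0.5 * K_SPRING * x ** 2 + 0.5 * MASS * v ** 2


frames = velocity_verlet(1.0, 0.0, spring_force, MASS, dt=0.01, n_steps=1000)
energies = [total_energy(x, v) for x, v in frames]
drift = max(energies) - min(energies)
print(len(frames), drift)
```

The total energy stays nearly constant along the trajectory while the particle oscillates, and the number of frames sets how much of the motion is sampled.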
Chapter 2: Exploration of the molecular energy landscape
Torsions: the gateway to conformational sampling
Figure: energy surface with respect to two torsions.
Systematic search and random search
Selected rotatable bonds are changed either by a fixed angular increment (systematic search) or by random amounts (random search).
Solutions are sorted by (relative) energy.
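Both search strategies can be sketched on a vector of torsion angles. The energy function below is a toy sum of threefold torsion terms standing in for a force-field evaluation; the 60° increment and the number of random trials are arbitrary choices for the example.

```python
import itertools
import math
import random


def toy_energy(torsions_deg):
    """Sum of threefold torsion terms; a stand-in for a force-field energy."""
    return sum(0.5 * (1.0 + math.cos(math.radians(3 * t))) for t in torsions_deg)


def systematic_search(n_torsions, increment_deg=60):
    """Enumerate every combination of angles on a regular grid."""
    grid = range(-180, 180, increment_deg)
    conformers = [(toy_energy(c), c) for c in itertools.product(grid, repeat=n_torsions)]
    return sorted(conformers)


def random_search(n_torsions, n_trials, seed=0):
    """Draw each torsion angle uniformly at random."""
    rng = random.Random(seed)
    conformers = []
    for _ in range(n_trials):
        c = tuple(rng.uniform(-180.0, 180.0) for _ in range(n_torsions))
        conformers.append((toy_energy(c), c))
    return sorted(conformers)


grid_solutions = systematic_search(n_torsions=2)
e_best = grid_solutions[0][0]
for e, c in grid_solutions[:5]:
    print(c, round(e - e_best, 3))   # relative energy of the best grid points
print(len(grid_solutions), len(random_search(2, 50)))
```

The systematic search visits 6 values per torsion, so its cost grows as 6^n with the number of torsions, whereas the random search has a fixed cost but no guarantee of coverage.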
Generation of haloperidol 3D conformers by omega
http://www.eyesopen.com/products/applications/omega.html
1. Enumeration of ring conformations and invertible nitrogen atoms (fragment library)
2. Torsion alteration
3. Reassembly
4. Evaluation
MMFF force field; knowledge-based tables
Pairwise rmsd > 2.5 Å, energy threshold
28 conformers
Increasing complexity of the energy hypersurface …
Geometry-based sampling methods:
- a systematic search is possible if NROT < 4-5
- enumeration is restricted to a fixed number of conformers for flexible compounds (e.g. 200 in omega)
Energy-based sampling methods:
- (molecular dynamics)
- stochastic sampling: Monte Carlo and genetic algorithm
Monte Carlo
Random modification of conformations combined with an acceptance criterion.
Motion toward energetically favored regions.
Figure: energy as a function of conformational state.
Monte Carlo algorithm
1. Initial state: torsion vector (Χ11, Χ12, …, Χ1n).
2. Perform a move: random rotation around a randomly chosen torsional axis, giving (Χ21, Χ22, …, Χ2n).
3. Evaluate E(x).
4. Better energy? If yes, replace the state. If no, apply the acceptance test: if it passes (yes), replace the state; if it fails (no), restore the previous state.
5. Repeat from step 2 for X steps.
Acceptance criteria
The Boltzmann statistics: $P = \exp\left(-\frac{E_f - E_i}{kT}\right)$, where k is the Boltzmann constant and T the temperature. P is also called the Boltzmann factor.
Test:
- if Ef < Ei, the new pose is accepted
- if Ef > Ei, calculate the probability P of acceptance and compare P with a random number h drawn uniformly between 0 and 1: if h < P, the new pose is accepted; if h > P, restart from the last accepted pose
Large energy differences and low temperatures lower the Boltzmann factor P: the acceptance range goes down.
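The Monte Carlo loop and its acceptance test can be put together in a short sketch. It assumes energies in kcal/mol, a toy threefold torsion energy in place of a force field, and a maximum rotation of 30° per move; all names and parameter values are illustrative.

```python
import math
import random

KB_KCAL = 0.0019872  # Boltzmann (gas) constant in kcal/(mol*K)


def acceptance_probability(e_initial, e_final, temperature):
    """1 for a downhill move, otherwise the Boltzmann factor exp(-(Ef - Ei) / kT)."""
    if e_final <= e_initial:
        return 1.0
    return math.exp(-(e_final - e_initial) / (KB_KCAL * temperature))


def toy_energy(torsions_deg):
    """Sum of threefold torsion terms; a stand-in for a force-field energy."""
    return sum(0.5 * (1.0 + math.cos(math.radians(3 * t))) for t in torsions_deg)


def monte_carlo(torsions, n_steps, temperature=300.0, max_rotation=30.0, seed=0):
    """Return the lowest-energy state (energy, torsions) met during the walk."""
    rng = random.Random(seed)
    state = list(torsions)
    e_state = toy_energy(state)
    best = (e_state, tuple(state))
    for _ in range(n_steps):
        trial = list(state)
        axis = rng.randrange(len(trial))                          # random torsional axis
        trial[axis] += rng.uniform(-max_rotation, max_rotation)   # random rotation
        e_trial = toy_energy(trial)
        p = acceptance_probability(e_state, e_trial, temperature)
        if rng.random() < p:             # h < P: accept and replace the state
            state, e_state = trial, e_trial
            if e_state < best[0]:
                best = (e_state, tuple(state))
        # otherwise the previous state is kept for the next move
    return best


print(acceptance_probability(0.0, 1.0, 300.0))   # uphill by 1 kcal/mol at 300 K
print(acceptance_probability(0.0, 1.0, 100.0))   # same move at a lower temperature
print(monte_carlo([0.0, 0.0], n_steps=2000))
```

Because uphill moves are sometimes accepted, the walk can cross barriers that would trap a pure minimizer, and lowering the temperature makes such crossings rarer.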
Genetic algorithm
Genetics in the real world
Genotype: ensemble of genes contained in the chromosomes. Diploid organism: 2 copies of each gene.
Phenotype: ensemble of individual features, resulting from gene expression.
Evolution: the environment exerts a selection pressure; individuals survive if their phenotype is adapted.
Figure: generation 1 (parent 1 + parent 2, whose chromosomes carry 2 copies of each gene) gives generation 2 by reproduction (child 1, child 2, child 3), then generation 3 by evolution. Labels: dominant genes and adapted phenotype; recessive genes and unadapted phenotype.
Genetics in the real world (continued)
Diversity is increased by crossover and by mutation (marked *).
Figure: generation 1 (parent 1 + parent 2) gives generation 2 (child 1, child 2) by reproduction.
« Virtual genetics »
« chromosome »: fingerprint which codes the ligand conformation (e.g. torsions: binary coding of the angle value)
  parent 1  1101100100110110
  parent 2  1100111000011110
« crossover »: mixing of 2 chromosomes (at a random position)
  parent 1  11011 | 00100110110
  parent 2  11001 | 11000011110
  child 1   11011 | 11000011110
  child 2   11001 | 00100110110
« mutation »: random modification of one (or more) bits of the string
  parent 1  1101111000011110
  parent 2  1100100100110110
  child 1   1101011000011110
  child 2   1101101100110110
« selection »: energy below a selection threshold (fitness)
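The crossover and mutation operators act on plain bit strings, so they are easy to sketch. The example reuses the two parent chromosomes of the slide with the cut after the fifth bit. The decoding of bits into torsion angles (4 bits per torsion on a regular 22.5° grid) is a hypothetical coding chosen for the illustration.

```python
import random


def crossover(parent1, parent2, point):
    """Single-point crossover: swap the tails of two chromosomes."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[b] if rng.random() < rate else b for b in chromosome)


def decode_torsions(chromosome, bits_per_torsion=4):
    """Hypothetical coding: each group of bits indexes an angle on a regular grid."""
    step = 360.0 / (2 ** bits_per_torsion)
    groups = [chromosome[i:i + bits_per_torsion]
              for i in range(0, len(chromosome), bits_per_torsion)]
    return [int(g, 2) * step - 180.0 for g in groups]


parent1 = "1101100100110110"
parent2 = "1100111000011110"
child1, child2 = crossover(parent1, parent2, point=5)
print(child1, child2)

rng = random.Random(0)
print(mutate(child1, rate=0.1, rng=rng))
print(decode_torsions(parent1))
```

In a full genetic algorithm the crossover point would be drawn at random and the decoded torsions passed to an energy function that plays the role of the fitness.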
Figure: one cycle of the genetic algorithm.
Initial population, size (4): Χ11 Χ12 … Χ1n to Χ41 Χ42 … Χ4n; individuals sorted by energy (color: from high fitness to low fitness).
Genetic operators (crossover rate, mutation rate) produce the intermediate population: the 4 parents (Χ1 to Χ4) plus 4 children (Χ51 Χ52 … Χ5n to Χ81 Χ82 … Χ8n).
Selection (random; fitness score shown in green) with a survival rate (4) gives the final population of 4 individuals.
The cycle is repeated up to a max number of generations. Convergence: evolution of the average/best fitness.
Genetic algorithm is an optimization method: how to preserve the diversity?
- Selection pressure: child chromosomes replace the worst members of the population; bias in the selection of parent chromosomes (towards high fitness, or favoring torsion values seen in previous populations).
- Multiple islands model: the population is split into sub-populations, simulated in parallel and occasionally swapping solutions (migration).
- Discarding redundant chromosomes (requires a metric to evaluate the similarity of individuals).
- Niche model: a niche is an ensemble of similar individuals in a population (as estimated by RMSD). If there are more than niche-size individuals in the niche, the new individual replaces the worst individual of the niche rather than the worst individual of the population, in order to preserve diversity within the population.
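The niche rule can be sketched as a replacement function. Several details are assumptions of this example: individuals are torsion vectors, similarity is a root-mean-square deviation over torsion values rather than an atomic RMSD, the niche counts the existing individuals within `niche_radius` of the newcomer, and the niche counts as full once it holds `niche_size` of them.

```python
import math


def rms_distance(a, b):
    """Root-mean-square deviation between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def insert_with_niche(population, newcomer, niche_radius, niche_size):
    """Replace the worst niche member if the niche is full, else the worst overall.

    `population` is a list of (energy, individual) pairs; a new list is returned.
    """
    _, x_new = newcomer
    niche = [i for i, (_, x) in enumerate(population)
             if rms_distance(x, x_new) <= niche_radius]
    candidates = niche if len(niche) >= niche_size else range(len(population))
    worst = max(candidates, key=lambda i: population[i][0])  # highest energy
    updated = list(population)
    updated[worst] = newcomer
    return updated


# Hypothetical population of one-torsion individuals: (energy, torsions)
population = [(1.0, (0.0,)), (2.0, (1.0,)), (5.0, (100.0,))]
newcomer = (3.0, (0.5,))
print(insert_with_niche(population, newcomer, niche_radius=2.0, niche_size=2))
print(insert_with_niche(population, newcomer, niche_radius=2.0, niche_size=3))
```

In the first call the niche around the newcomer is already full, so the newcomer displaces a similar individual and the distant conformer at 100° survives; in the second call the niche has room and the worst member of the whole population is replaced instead.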
CONCLUSION
Conformational sampling is the key element for understanding molecular behavior.
It may range from very simple to extremely difficult, to impossible.
If you don't do it well, better not do it at all: empirical methods based on molecular topology only may be more accurate than 3D models based on wrong (or too few) conformations.
Two main sources of errors: A) a wrongly calculated energy-geometry landscape (poor force field parameterization) and B) insufficient sampling!
Thanks to Dragos Howarth!