Force-field-based conformational sampling & docking: status, results, issues of the project Dragos Horvath ANR (Agence Nationale de la Recherche)


1 Force-field-based conformational sampling & docking: status, results, issues of the Docking@GRID project
Dragos Horvath
ANR (Agence Nationale de la Recherche) – sponsored project « ANR DOCK »
http://www2.lifl.fr/~talbi/docking/
E.-G. Talbi, A.-A. Tantar, J.C. Boisson, N. Melab – LIFL, INRIA, Univ. Lille 1, FR
S. Roy, L. Brillet – DGSV, CEA Grenoble, FR
D. Horvath, S. Conilleau – Laboratoire d’InfoChimie UMR 7177, CNRS Univ. Strasbourg, FR

2 Outline…
Conformational Sampling & Docking using a Hybrid Genetic Algorithm
–Goals of the Docking@Grid project
–GRID deployment strategies – the challenge.
The force field issue
Preliminary results & status
–Small protein folding
–Flexible Docking including protein loop sampling

3 The Challenge…
[Figure: energy landscape, Energy = f(Geometry), defined by the Empirical Force Field. Labels: “Well”-docked (folded) zone; “Misdocked” (folded) conformers; PDB; Absolute Energy Minimum; Native-like: one local clash; Microstates contributing to macroscopic property]
Publisher’s Force Field: « Nice H bond »
My Force Field: « Bad Contact »

4 Why Docking@Grid?
Classical Molecular Modeling software is workstation-based and has extremely limited sampling capacities.
–Typically, it explores an extremely limited phase space zone: the one that matches experimental constraints, if available; otherwise, a likely irrelevant one!
–Modeling is a jungle of empirical parameters: it is easier to refit these than to improve the sampling.
–Massively parallel computing tools are new to the field and are not necessarily used innovatively, but rather as a large pool of workstations (more sampling-deficient docking attempts lead to more false positives to test).

5 The Ultimate Goals…
Design of GRID-based folding and docking tools, performing extensive conformational sampling.
Calibration of the associated molecular force field:
–Generally applicable to proteins, sugars, organic ligands: full-atom simulations, no large protein folding
–Tailor-made for use with torsional degrees of freedom only! Continuum model for solvent effects!
–Consistent, in the sense that docking affinities & folding propensities should be directly linked to computed force field energies of sampled ensembles: no a posteriori rescoring of docking poses! Docking is just simultaneous conformational sampling of several molecules!

6 GA-driven Conformational Sampling & Docking Tool
Based on a Genetic Algorithm, coding conformers as "chromosomes" in which each locus stands for a
–Torsional angle value
–Euler angle value
–Translation value
The In Silico Darwinian Evolution, leading to fitter and fitter (lower energy) conformers, was enhanced by
–hybridization with various optimization heuristics
–fine-tuning of the parameters controlling the evolutionary strategy
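The chromosome coding described on this slide can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_energy` merely stands in for the force field, every locus is treated as an angle in degrees, and the operators (one-point crossover, uniform mutation, truncation selection that keeps the fitter half) are generic textbook choices, not the project's actual operators.

```python
import math
import random

def toy_energy(chrom):
    """Hypothetical stand-in for the force field: zero when every locus is 0 degrees."""
    return sum(1.0 - math.cos(math.radians(x)) for x in chrom)

def crossover(a, b, rng):
    """One-point crossover: head of parent a, tail of parent b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(chrom, rate, rng):
    """Redraw each locus uniformly in [-180, 180] with probability `rate`."""
    return [rng.uniform(-180.0, 180.0) if rng.random() < rate else x
            for x in chrom]

def evolve(n_loci=6, pop_size=30, generations=50, mutation_rate=0.1, seed=0):
    """Evolve toward lower energy; return the best chromosome and its energy history."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-180.0, 180.0) for _ in range(n_loci)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=toy_energy)
        history.append(toy_energy(pop[0]))
        parents = pop[: pop_size // 2]  # the fitter half survives unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   mutation_rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    pop.sort(key=toy_energy)
    history.append(toy_energy(pop[0]))
    return pop[0], history

best, history = evolve()
print(f"best energy after evolution: {history[-1]:.3f}")
```

Because the fitter half of each generation is carried over unchanged, the best energy in `history` can never increase from one generation to the next.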

7 Hybrid Heuristics: Targeted torsion choice vs. Taboo search
"Traditionalism": favoring torsion values seen in ‘seed’ ancestor solutions
Seed (ancestor) cross-overs to (stochastically) replace random chromosome initialization with user-defined probability
Taboo Search & Intrapopulation diversity control:
–Discarding chromosomes that are too similar to fitter conformers or to previously visited geometries
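The discarding rule for taboo search and diversity control might be sketched as follows. The similarity measure (largest per-locus periodic angle difference) and the 30-degree threshold are assumptions chosen for illustration; the slide does not say which dissimilarity metric the project uses.

```python
def torsion_distance(a, b):
    """Largest per-locus angular difference in degrees, respecting 360-degree periodicity."""
    return max(abs((x - y + 180.0) % 360.0 - 180.0) for x, y in zip(a, b))

def filter_population(scored, taboo, min_dist=30.0):
    """Keep chromosomes fittest-first, discarding any that lie within
    `min_dist` of an already kept (fitter) chromosome or of a taboo geometry.

    scored: list of (energy, chromosome) pairs
    taboo:  list of previously visited chromosomes
    """
    kept = []
    for energy, chrom in sorted(scored, key=lambda pair: pair[0]):
        near_taboo = any(torsion_distance(chrom, t) < min_dist for t in taboo)
        near_fitter = any(torsion_distance(chrom, k) < min_dist for _, k in kept)
        if not (near_taboo or near_fitter):
            kept.append((energy, chrom))
    return kept

scored = [
    (1.0, [0.0, 0.0]),
    (2.0, [5.0, 5.0]),       # near-duplicate of the fitter [0, 0]
    (3.0, [90.0, 90.0]),
    (0.5, [175.0, -175.0]),  # within 5 degrees of the taboo geometry below
]
taboo = [[-180.0, 180.0]]
print(filter_population(scored, taboo))
```

In this example only the chromosomes `[0.0, 0.0]` and `[90.0, 90.0]` survive: the lowest-energy candidate is rejected because it revisits a taboo geometry once periodicity is taken into account.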

8 Search for Optimal Sampling Setups in the Strategy Parameter Space…
[Figure: strategy parameter vector p1, p2, p3, p4, p5, p6, …, p14, p15]
Population management
–Population size
–Number of parallel processes
–Migration rate between ‘islands’
Evolution management
–Crossover rate
–Mutation rate
–One/two point crossover rate
–Selection pressure
–Dissimilarity limit
–Maximal age
Convergence management
–Apocalypse (population reset) frequency
–Elitism
–Global stop condition
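One way to hold such a strategy parameter set in code is a small immutable record. This is only a sketch: the field names paraphrase the labels on the slide, all default values are placeholders invented for illustration, the global stop condition is reduced to a hypothetical generation cap, and the figure suggests the real vector has more entries (up to p15) than the twelve named here.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class StrategyParams:
    # Population management (placeholder defaults, not the project's values)
    population_size: int = 100
    n_parallel_processes: int = 4
    migration_rate: float = 0.05
    # Evolution management
    crossover_rate: float = 0.8
    mutation_rate: float = 0.05
    two_point_crossover_rate: float = 0.5
    selection_pressure: float = 1.5
    dissimilarity_limit: float = 30.0
    max_age: int = 20
    # Convergence management
    apocalypse_frequency: int = 50   # generations between population resets
    elitism: bool = True
    max_generations: int = 1000      # hypothetical form of the global stop condition

    def __post_init__(self):
        # Rates are probabilities, counts must be positive.
        for name in ("migration_rate", "crossover_rate",
                     "mutation_rate", "two_point_crossover_rate"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")
        for name in ("population_size", "n_parallel_processes", "max_age",
                     "apocalypse_frequency", "max_generations"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be at least 1")

base = StrategyParams()
variant = replace(base, mutation_rate=0.2)  # one point in the strategy space
print(variant)
```

Keeping the record frozen means each candidate setup explored in the strategy parameter space is a distinct value, which makes it straightforward to log alongside the sampling success it produced.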

9 GRID 5000-based ‘Planetary’ Model (www.grid5000.fr)
[Figure: workflow diagram with the following elements]
If (free node): DEPLOY Island Model
–Executables
–Molecule File
–Constraint Files
–Seeds List
–Taboo List
–Operational Pars
Returned results
–Stablest Chromosomes
–Sampling Success Score
Solution Merger & Clusterer
Conformer & Cluster Database
‘Panspermia’ policy center
–‘recent’ clusters: seeds
–‘old’ clusters: taboo
Operational Pars Selector: Sampling Success vs. Operational Pars
Stop:
–max. ‘Mission Nr.’
–no new clusters since N ‘missions’
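The ‘Panspermia’ policy and the two stop criteria can be sketched in a few lines. The slide does not define what makes a cluster ‘recent’ or ‘old’, so the rule used here (age measured in missions since the cluster was first found, with a hypothetical `recent_window`) is an assumption, as are all function and parameter names.

```python
def split_clusters(first_seen, current_mission, recent_window=3):
    """Split clusters into seeds ('recent') and taboo ('old').

    first_seen: dict mapping cluster id -> mission number at which it was first found
    """
    seeds, taboo = [], []
    for cluster_id, mission in first_seen.items():
        if current_mission - mission < recent_window:
            seeds.append(cluster_id)
        else:
            taboo.append(cluster_id)
    return seeds, taboo

def should_stop(first_seen, current_mission, max_missions, patience):
    """Stop at the mission cap, or when no new cluster appeared for `patience` missions."""
    if current_mission >= max_missions:
        return True
    last_new = max(first_seen.values(), default=0)
    return current_mission - last_new >= patience

first_seen = {"a": 1, "b": 5, "c": 6}
print(split_clusters(first_seen, current_mission=7))
print(should_stop(first_seen, current_mission=12, max_missions=100, patience=5))
```

With these inputs, cluster `a` is old enough to go on the taboo list while `b` and `c` are reused as seeds, and the run stops at mission 12 because the last new cluster appeared at mission 6.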

10 The Force Field Challenge
Robust sampling is no good if the energy model is not chemically meaningful!
All force fields are equal[ly wrong], but some are more equal than others!
[Figure labels: Effective interatomic distance; d⁰ᵢⱼ; ‘Smoothing’ distance; dᵢⱼ]

11 The Force Field Fitting Procedure…
[Figure: flowchart with the following steps]
–Install a NEW FF parameter configuration
–For each training molecule
–Locally explore neighborhood of experimental geometry
–Run GA-driven Exhaustive Sampler
–Add all sampled conformers to Data Base & calculate RMS Deviation from "native" geometry
–Recalculate energies of stored conformers according to current FF setup
–Calculate Folding ΔG according to chosen RMS radius
–All ΔG < 0? Branch labels: “Yes, for the first time!”, “OK!”, “Yes, reconfirmed!”, “NO!”
Training molecules
–Tryptophan cage 1L2Y
–Villin headpiece 1VII
–Tryptophan zipper 1LE4
–PIN1 WW domain 1PIN
–Chignolin 1UAO
–Cyclodextrins (6 & 7 glucose units)
Fitted parameters
–Distance-dependent dielectric constant
–Weighting factor of the desolvation penalty
–Weighting factor of the hydrophobic contacts
–Weighting factor of repulsive van der Waals
–Attractive & repulsive van der Waals coefficients of the following types: 'co' (carbonyl C), 'o' (ether-type O), 'h' (aliphatic H), 'cp' (aromatic C), 'oc' (carbonyl O)
[Figure axis: RMS deviation from native]
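The "Folding ΔG according to chosen RMS radius" step could look like the sketch below. The slide gives no formula, so this assumes a standard two-state Boltzmann estimate: conformers within the RMS radius of the native geometry count as folded, all others as unfolded, and ΔG = −RT ln(Z_folded / Z_unfolded). Energies in kcal/mol, the 300 K temperature, and the function names are likewise assumptions.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)

def folding_delta_g(conformers, rms_radius, temperature=300.0):
    """Two-state folding free energy from a sampled ensemble.

    conformers: sequence of (energy_kcal_per_mol, rmsd_to_native) pairs
    Returns -RT * ln(Z_folded / Z_unfolded); negative means the folded
    basin dominates under the current force field setup.
    """
    conformers = list(conformers)
    rt = R_KCAL * temperature
    e_min = min(e for e, _ in conformers)  # shift energies for numerical stability
    z_folded = sum(math.exp(-(e - e_min) / rt)
                   for e, rmsd in conformers if rmsd <= rms_radius)
    z_unfolded = sum(math.exp(-(e - e_min) / rt)
                     for e, rmsd in conformers if rmsd > rms_radius)
    if z_folded == 0.0 or z_unfolded == 0.0:
        raise ValueError("need sampled weight on both sides of the RMS radius")
    return -rt * math.log(z_folded / z_unfolded)

def all_fold(ensembles, rms_radius):
    """The 'All ΔG < 0?' test over every training molecule's ensemble."""
    return all(folding_delta_g(c, rms_radius) < 0.0 for c in ensembles.values())

ensembles = {
    "good": [(-10.0, 0.5), (-8.0, 4.0), (-7.0, 6.0)],  # native-like conformer is lowest
    "bad": [(-7.0, 0.5), (-10.0, 4.0)],                # a decoy is lowest
}
print(all_fold(ensembles, rms_radius=2.0))
```

Since only stored energies and RMS deviations enter the calculation, re-evaluating ΔG after a parameter change requires recomputing energies of the stored conformers but no new sampling, which matches the "recalculate energies of stored conformers" step on the slide.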

12 Trp cage 1L2Y
Ab initio folding of Trp cage 1L2Y: native structure (reproducibly) found and ranked as most stable. Planetary model used max. 20 nodes for 4…5 days.
[Figure label: PDB]

13 Villin headpiece 1VII
Ab initio folding of the Villin headpiece 1VII: helical parts are seen to fold in a matter of days (40 nodes), although not properly oriented.
[Figure label: PDB]

14 Chignolin
Good news for the β-hairpin of Chignolin: out of the top 10 best-ranked conformers, 8 are native-like. Number one is not, but in this case that may not be a problem.
[Figure labels: PDB, #1, #5]

15 The 1LE1 β-sheet is not the absolute energy minimum according to the current setup!
However, proper folding of 1LE1 could be achieved (though not reproducibly!) with previous force field versions – is the current setup too helix-specific?
[Figure label: PDB]

16 Casein Kinase 2 (3BQC)
Docking simulations in presence of flexible loops, such as the hinge region of Casein Kinase 2 (3BQC): the pose of ligand emodin and the loop geometry are correctly predicted (3BQC not in FF training set).
[Figure labels: Flexible hinge region; PDB, #1]

17 Conclusions & Perspectives
Although force field refitting seems to go in the right direction, it is far from the established goals of universality and robustness
–We are looking for an empirical expression of the energy as a function of the molecular geometry, with the property that the low-energy zones of these landscapes match experimentally evidenced conformers for, ideally, all molecules and ligand-protein complexes.
–So far, only the parameters of the current functional form have been varied. Maybe the functional dependence will have to be changed… or maybe there is NO APPROPRIATE FUNCTION SATISFYING THIS PROBLEM!
–An interesting question is whether (given that in the force field approach only the location of relevant minima matters) a functional form describing less rugged, easy-to-sample energy landscapes may be found.
Technically, this development is extremely demanding in terms of computational resources
–Exhaustive conformational sampling/docking with 100…200 degrees of freedom is already a challenging task: days to weeks on 20…40 nodes (less for docking problems).
–These runs have to be repeated with each new force field parameter set, in order to update the database of well-folded and decoy structures.
–Readjusting the force field parameters is, per se, difficult because each objective function evaluation implies the recalculation of energy values of millions of conformers for several compounds!
THANKS

