Free energy calculations: General methods


Free energy is the most important quantity that characterizes a dynamical process. There are two types of free energy calculations:
1. Path-independent methods for calculating relative binding free energies, e.g. free energy perturbation (FEP) and thermodynamic integration (TI).
2. Path-dependent methods for calculating absolute binding free energies, e.g. umbrella sampling (US) with the weighted histogram analysis method (WHAM), and steered MD with Jarzynski's equation (JE).

Example: Binding of a K+ ion to gramicidin A (gA)
Initial state (A): K+ ion in bulk, a water molecule (W) at the binding site.
Final state (B): K+ ion at the binding site, water in place of the ion.
In the path-independent method, we calculate the binding free energy of the K+ ion from the free energy difference between the two states:
A: gA·W + K+ (bulk)  →  B: gA·K+ + W (bulk)

In the path-dependent method, we first choose a continuous path from the binding site to bulk water (the reaction coordinate). In the case of gramicidin, the channel axis is the obvious choice for the reaction coordinate. The free energy profile of the K+ ion along this path is calculated using a method such as umbrella sampling. The free energy of binding is given by the difference in free energy between the binding site and the bulk.
[Figure: K+ ion at the gramicidin binding site (B) and in bulk (A)]

Free energy perturbation (FEP)
Free energy differences can be calculated relatively easily, and several methods have been developed for this purpose. The starting point for most approaches is Zwanzig's perturbation formula for the free energy difference between two states A and B:
$$\Delta G(A \to B) = G_B - G_A = -kT \ln \left\langle e^{-(H_B - H_A)/kT} \right\rangle_A$$
$$\Delta G(B \to A) = G_A - G_B = -kT \ln \left\langle e^{-(H_A - H_B)/kT} \right\rangle_B$$
The equality $\Delta G(A \to B) = -\Delta G(B \to A)$ should hold if there is sufficient sampling. However, if the two states are not similar enough, this is difficult to achieve and there will be a large hysteresis effect (i.e. the forward and backward results will be very different).
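The perturbation formula, ΔG = -kT ln⟨exp(-ΔH/kT)⟩, and the forward/backward hysteresis check can be illustrated on a toy problem. The sketch below is not from the slides: it assumes two 1D harmonic "states" with spring constants `k_a` and `k_b` (hypothetical names), for which the exact answer is ΔG = ½ kT ln(k_b/k_a), and it samples each state directly from its Gaussian Boltzmann distribution instead of running MD.

```python
import numpy as np

def zwanzig(delta_h, kT=1.0):
    """Return -kT ln <exp(-delta_h/kT)> averaged over the given samples."""
    delta_h = np.asarray(delta_h, dtype=float)
    return -kT * np.log(np.mean(np.exp(-delta_h / kT)))

rng = np.random.default_rng(0)
kT, k_a, k_b, n = 1.0, 1.0, 1.5, 200_000

# Boltzmann samples of H = 0.5*k*x^2 are Gaussian with variance kT/k.
x_a = rng.normal(0.0, np.sqrt(kT / k_a), n)
x_b = rng.normal(0.0, np.sqrt(kT / k_b), n)

dg_forward = zwanzig(0.5 * (k_b - k_a) * x_a**2, kT)   # A -> B, sampled in A
dg_backward = zwanzig(0.5 * (k_a - k_b) * x_b**2, kT)  # B -> A, sampled in B
dg_exact = 0.5 * kT * np.log(k_b / k_a)

print(dg_forward, -dg_backward, dg_exact)
```

With states this similar the forward and negated backward estimates should nearly coincide; making `k_b` much larger than `k_a` is a simple way to see the hysteresis grow.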

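Derivation of the perturbation formula
From statistical mechanics, the Helmholtz free energy is given by (we will assume it is the same as the Gibbs free energy and use G for it)
$$G = -kT \ln Z, \qquad Z = \int e^{-H/kT}\, d\Gamma \quad (Z:\ \text{partition function})$$
so that
$$\Delta G(A \to B) = -kT \ln \frac{Z_B}{Z_A} = -kT \ln \frac{\int e^{-(H_B - H_A)/kT}\, e^{-H_A/kT}\, d\Gamma}{\int e^{-H_A/kT}\, d\Gamma} = -kT \ln \left\langle e^{-(H_B - H_A)/kT} \right\rangle_A$$
where it is assumed that the states A and B are similar.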

FEP with alchemical transformation
To obtain accurate results with the perturbation formula, the energy difference between the states should be < 2 kT, which is not satisfied for most biomolecular processes. To deal with this problem, one introduces a hybrid Hamiltonian
$$H(\lambda) = (1 - \lambda) H_A + \lambda H_B$$
and performs the transformation from A to B gradually by changing the parameter $\lambda$ from 0 to 1 in small steps. That is, one divides [0, 1] into n subintervals with $\{\lambda_i,\ i = 0, \dots, n\}$, and for each $\lambda_i$ value calculates the free energy difference from the ensemble average
$$\Delta G_i = -kT \ln \left\langle e^{-[H(\lambda_{i+1}) - H(\lambda_i)]/kT} \right\rangle_{\lambda_i}$$

The total free energy change is then obtained by summing the contributions from each subinterval:
$$\Delta G = \sum_{i=0}^{n-1} \Delta G_i$$
The number of subintervals is chosen such that the free energy change at each step is < 2 kT; otherwise the method may lose its validity.
Points to be aware of:
1. Most codes use equal subintervals for $\lambda_i$, but the changes in $\Delta G_i$ are usually highly non-linear. One should try to choose $\lambda_i$ such that $\Delta G_i$ remains around 1-2 kT for all values.
2. The simulation times (equilibration + production) have to be chosen carefully. They cannot be extended in case of non-convergence (one has to start over).
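The staged procedure can be sketched on a toy system where a single perturbation step would be far too large. The example below is an illustration under stated assumptions, not the slides' protocol: it assumes a linear hybrid Hamiltonian H(λ) = (1-λ)H_A + λH_B, two unit-stiffness harmonic wells 4 length units apart (so the exact total ΔG is 0), and direct Gaussian sampling at each λ in place of MD. The helper names `staged_fep` and `sample_x` are hypothetical.

```python
import numpy as np

def staged_fep(lambdas, sample_x, h_a, h_b, kT=1.0):
    """Per-window free energy changes for H(lam) = (1-lam)*H_A + lam*H_B."""
    dg = []
    for lam0, lam1 in zip(lambdas[:-1], lambdas[1:]):
        x = sample_x(lam0)                          # configurations at lambda_i
        delta_h = (lam1 - lam0) * (h_b(x) - h_a(x))  # H(lam_{i+1}) - H(lam_i)
        dg.append(-kT * np.log(np.mean(np.exp(-delta_h / kT))))
    return np.array(dg)

rng = np.random.default_rng(1)
shift = 4.0
h_a = lambda x: 0.5 * x**2
h_b = lambda x: 0.5 * (x - shift) ** 2
# For this toy model H(lam) is a unit-stiffness well centred at lam*shift.
sample_x = lambda lam: rng.normal(lam * shift, 1.0, 20_000)

lambdas = np.linspace(0.0, 1.0, 21)   # 20 equal subintervals
dg_i = staged_fep(lambdas, sample_x, h_a, h_b)
dg_total = dg_i.sum()

print(np.abs(dg_i).max(), dg_total)
```

Printing `dg_i` shows how unevenly the per-window changes are distributed even in this simple case, which is the motivation for non-uniform λ spacing.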

Thermodynamic integration (TI)
Another way to obtain the free energy difference is to integrate the derivative of the hybrid Hamiltonian $H(\lambda)$:
$$\Delta G = \int_0^1 \left\langle \frac{\partial H(\lambda)}{\partial \lambda} \right\rangle_\lambda d\lambda$$
This integral is evaluated most efficiently using Gaussian quadrature. In typical calculations for ions, 7-point quadrature is sufficient (but check that 9-point quadrature gives the same result in other cases). The advantage of TI over FEP is that the production run can be extended as long as necessary and the convergence of the free energy can be monitored (when the cumulative $\Delta G$ flattens, it has converged).
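The quadrature step, including the 7-point versus 9-point check, can be sketched as follows. This is a minimal illustration with assumptions: the ensemble average ⟨∂H/∂λ⟩ is replaced by its analytic value for a toy harmonic system whose stiffness changes linearly from `k_a` to `k_b` (hypothetical names), so no simulation is involved; in a real calculation each node would require its own simulation window. The nodes come from NumPy's `leggauss`, rescaled from [-1, 1] to [0, 1].

```python
import numpy as np

kT, k_a, k_b = 1.0, 1.0, 4.0

def mean_dh_dlambda(lam):
    """Analytic <dH/dlambda> for H(lam) = 0.5*[(1-lam)*k_a + lam*k_b]*x^2."""
    k_lam = (1.0 - lam) * k_a + lam * k_b
    return 0.5 * (k_b - k_a) * kT / k_lam   # uses <x^2> = kT/k_lam

def ti_gauss_legendre(n_points):
    """Integrate <dH/dlambda> over [0, 1] with n-point Gauss-Legendre quadrature."""
    nodes, weights = np.polynomial.legendre.leggauss(n_points)
    lam = 0.5 * (nodes + 1.0)               # map [-1, 1] -> [0, 1]
    return float(np.sum(0.5 * weights * mean_dh_dlambda(lam)))

dg_7 = ti_gauss_legendre(7)
dg_9 = ti_gauss_legendre(9)
dg_exact = 0.5 * kT * np.log(k_b / k_a)

print(dg_7, dg_9, dg_exact)
```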

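Example: Free energy change in mutation of a ligand
A very common question is how a mutation in a ligand (or protein) changes the free energy of the protein-ligand complex.
Thermodynamic cycle:
protein + A (bulk) → protein·A (bound): $\Delta G_A$
protein + B (bulk) → protein·B (bound): $\Delta G_B$
A → B in bulk: $\Delta G_{\rm bulk}(A \to B)$
A → B in the binding site: $\Delta G_{\rm bs}(A \to B)$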
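Because free energy is a state function, the two routes around the cycle from "A in bulk" to "B bound" must give the same total. A short worked relation, with the sign convention assumed here (binding legs go from bulk to bound, mutation legs go from A to B), is:

```latex
\Delta G_A + \Delta G_{\rm bs}(A \to B) = \Delta G_{\rm bulk}(A \to B) + \Delta G_B
\quad\Longrightarrow\quad
\Delta\Delta G \equiv \Delta G_B - \Delta G_A
  = \Delta G_{\rm bs}(A \to B) - \Delta G_{\rm bulk}(A \to B)
```

The change in binding free energy caused by the mutation is therefore obtained from two alchemical A → B transformations (one in the binding site, one in bulk), without simulating the binding process itself.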

Applications
1. Ion selectivity of potassium channels.
2. Selectivity of amino acid transporters (e.g. the glutamate transporter).
3. Free energy change when a sidechain is mutated in a bound ligand. The calculation is similar to the one above and is important in developing drug leads from peptides.

2. Path-dependent methods
Consider the previous example of binding of a K+ ion to the gramicidin channel. In the path-dependent method, the K+ ion is moved from bulk to the binding site in small steps and the free energy profile W(z) (also called the potential of mean force, or PMF) is constructed. The relative binding free energy is given by
$$\Delta G_b = W(z_b) - W(z_0)$$
where $z_b$ is the binding site and $z_0$ is a point in bulk. The binding constant and the absolute binding free energy are determined from the PMF by invoking a 1D approximation.
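The slide's own expression for the 1D approximation did not survive in the transcript. One commonly used form, assumed here rather than taken from the source, treats the ligand as confined to a cylinder of radius R around the reaction coordinate: K_eq = πR² ∫ exp(-W(z)/kT) dz, with the absolute binding free energy ΔG_b = -kT ln(K_eq C0), where C0 is the standard concentration (1 M). The sketch below evaluates both with a trapezoidal rule; the function names, the temperature and the cylinder radius are all illustrative assumptions.

```python
import numpy as np

KT = 0.596          # kcal/mol, assuming T of about 300 K
C0 = 1.0 / 1661.0   # 1 M expressed in molecules per cubic angstrom

def binding_constant(z, w, radius):
    """K_eq = pi*R^2 * integral of exp(-W(z)/kT) dz, in cubic angstroms."""
    z = np.asarray(z, dtype=float)
    boltz = np.exp(-np.asarray(w, dtype=float) / KT)
    integral = np.sum(0.5 * (boltz[1:] + boltz[:-1]) * np.diff(z))  # trapezoid
    return np.pi * radius**2 * integral

def absolute_binding_free_energy(k_eq):
    """Delta G_b = -kT ln(K_eq * C0), in kcal/mol."""
    return -KT * np.log(k_eq * C0)

# Illustrative PMF: a 5 kcal/mol Gaussian-shaped well near z = 0 (made-up numbers).
z = np.linspace(-5.0, 15.0, 401)
w = -5.0 * np.exp(-z**2 / 2.0)
k_eq = binding_constant(z, w, radius=1.0)
print(k_eq, absolute_binding_free_energy(k_eq))
```

Both the integration limits and R must be chosen to cover the region where the ligand counts as bound, so results from this form should be reported together with those choices.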

Calculation of PMF from umbrella sampling
One samples the ligand position along a reaction coordinate and determines the potential of mean force (PMF) from the Boltzmann equation
$$W(z) = W(z_0) - kT \ln \frac{\rho(z)}{\rho(z_0)}$$
Here $\rho(z)$ is the sampled density of the ligand and $z_0$ is a reference point, e.g. a point in bulk where W vanishes. In general, a particle cannot be adequately sampled at high-energy points. To counter that, one introduces harmonic potentials, which restrain the particle at desired points, and then removes their effect from the result (unbiasing). For convenience, one introduces umbrella potentials at regular intervals along the reaction coordinate (e.g. ~0.5 Å). The PMFs obtained in each interval are unbiased and optimally combined using the Weighted Histogram Analysis Method (WHAM).
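The Boltzmann inversion and the unbiasing of a single window can be sketched in a few lines. This is a simplified illustration, not WHAM: it assumes a harmonic umbrella potential of the form ½k(z - z_c)² and recovers the PMF within one window only up to an additive constant; combining windows optimally is what WHAM adds. The function names are hypothetical.

```python
import numpy as np

def pmf_from_density(rho, ref_index, kT=1.0):
    """W(z) - W(z0) = -kT ln[rho(z)/rho(z0)] for an unbiased density."""
    rho = np.asarray(rho, dtype=float)
    return -kT * np.log(rho / rho[ref_index])

def unbias_window(z, rho_biased, k, z_center, kT=1.0):
    """PMF (up to a constant) from one umbrella window's biased density."""
    z = np.asarray(z, dtype=float)
    u_bias = 0.5 * k * (z - z_center) ** 2
    w = -kT * np.log(np.asarray(rho_biased, dtype=float)) - u_bias
    return w - w.min()

# Synthetic check: a linear "true" PMF sampled under a harmonic restraint.
z = np.linspace(-1.0, 1.0, 21)
w_true = 0.3 * z
rho_b = np.exp(-(w_true + 0.5 * 5.0 * z**2))   # biased density with kT = 1
w_est = unbias_window(z, rho_b, k=5.0, z_center=0.0)
print(np.max(np.abs((w_est - w_est[0]) - (w_true - w_true[0]))))
```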

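Points to consider in umbrella sampling
The two main parameters in umbrella sampling are the force constant, k, and the distance between windows, d. In bulk, the position of the ligand will have a Gaussian distribution given by (assuming an umbrella potential of the form $\frac{1}{2}k(z - z_0)^2$)
$$P(z) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(z - z_0)^2 / 2\sigma^2}, \qquad \sigma = \sqrt{k_B T / k}$$
The overlap between two Gaussian distributions separated by d is
$$\%\ \text{overlap} = 100 \times \left[ 1 - \operatorname{erf}\left( \frac{d}{\sqrt{8}\,\sigma} \right) \right]$$
The parameters should be chosen such that 5% < % overlap < 10%. If the overlap is too small, the PMF will have discontinuities. If it is too large, the simulations are not very efficient.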
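The overlap criterion is easy to evaluate before launching any windows. The sketch below assumes a restraint of the form ½k(z - z0)², so that the bulk distribution in each window is Gaussian with σ = sqrt(kT/k), and uses the overlap area of two equal-width Gaussians a distance d apart, 100 × [1 - erf(d / (√8 σ))]. The function name and the default kT (0.596 kcal/mol, roughly 300 K) are assumptions.

```python
import math

def percent_overlap(force_constant, spacing, kT=0.596):
    """Percent overlap of neighbouring window distributions in bulk.

    force_constant: k in kcal/mol/A^2 for a restraint 0.5*k*(z - z0)^2
    spacing:        distance d between window centres, in angstroms
    """
    sigma = math.sqrt(kT / force_constant)
    return 100.0 * (1.0 - math.erf(spacing / (math.sqrt(8.0) * sigma)))

# Scan a few force constants at a fixed 0.5 A window spacing.
for k in (10.0, 20.0, 30.0, 40.0):
    print(k, round(percent_overlap(k, 0.5), 1))
```

Inside a binding site the underlying PMF changes the width of each distribution, so this bulk estimate is a starting point that should still be checked against the actual histograms.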

Steered MD (SMD) simulations and Jarzynski's equation
Steered MD is a more recent method where a harmonic force is applied to an atom on a peptide and the reference point of this force is pulled with a constant velocity. It has been used to study unfolding of proteins and binding of ligands. The discovery of Jarzynski's equation in 1997 enabled determination of the PMF from SMD, which has boosted its applications.
Jarzynski's equation:
$$e^{-\Delta G/kT} = \left\langle e^{-W/kT} \right\rangle$$
where W is the work done by the harmonic force and the average is taken over repeated pulling trajectories.
This method seems to work well in simple systems and when $\Delta G$ is large, but beware of its application to complex systems!
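Jarzynski's equation, exp(-ΔG/kT) = ⟨exp(-W/kT)⟩, turns a set of non-equilibrium work values into a free energy estimate. A minimal sketch of the estimator follows; the function name is hypothetical and the work values are made up for illustration. The exponential average is computed with a log-sum-exp shift because it is dominated by the few smallest work values, which is also why the estimate converges slowly when the work distribution is broad.

```python
import math

def jarzynski_free_energy(work_values, kT=1.0):
    """Delta G = -kT ln <exp(-W/kT)>, evaluated with a log-sum-exp shift."""
    x = [-w / kT for w in work_values]
    x_max = max(x)
    mean_exp = sum(math.exp(xi - x_max) for xi in x) / len(x)
    return -kT * (x_max + math.log(mean_exp))

work = [5.2, 6.1, 4.8, 7.5, 5.9]   # illustrative work values in units of kT
dg = jarzynski_free_energy(work)
mean_work = sum(work) / len(work)
print(dg, mean_work)               # the estimate never exceeds the mean work
```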

Beware of: 1) Problems with force fields
The force fields that are commonly used in MD simulations (e.g. CHARMM, AMBER, GROMOS) neglect the polarization interaction. The effects of induced polarization are included in a mean-field sense by boosting the partial charges, but such an approximation is expected to work only in the environment for which the force field was optimized, not in a different one. The most relevant case is the force fields for proteins, which are optimized for bulk water. One has to be wary of using the same force fields for membrane proteins, because lipid molecules have very different polarization characteristics from water (the dielectric constants are 2 and 80, respectively). Other cases that require caution are interfaces and highly charged ions.

2) Problems with sampling
At zero temperature, the potential function U is sufficient to characterize the system completely. At room temperature, the fundamental quantity is the free energy, F = U - TS, which creates the sampling problem.
Example: F = -24, U = -41, and TS = -17 (kJ/mol) for liquid water at STP.
Statistical weight: a state with lower potential energy is not necessarily the more probable one. Even when $U_1 < U_2$, if $S_2 \gg S_1$ we may have $F_2 < F_1$.

Examples from gramicidin A channel
Dimer formed by two right-handed β helices
Each monomer consists of 16 amino acid residues
Pore is 26 Å long, 4 Å in diameter
Structure is stabilized by hydrogen bonds
Occupied by a single-file water chain (~7 molecules)
Water dipoles are aligned with the channel axis
Conducts monovalent cations at diffusion rates (divalent ions bind and block)

1. Potential energy profile for a K+ ion in gramicidin A
BD simulations (inverting data gives): U_w = 8 kT, U_b = 5 kT
MD simulations (potential of mean force): U_w = 5 kT, U_b = 22 kT

Free energy calculations
Free energy differences are calculated using the thermodynamic integration (TI) and free energy perturbation (FEP) methods, e.g. a K+ ion in bulk is translocated to the gA center while the water molecule at that position is translocated to the ion's position. A two-step process (via a neutral water, W0) is used to minimize fluctuations:
W → W0 → K+ (gA center)
K+ → W0 → W (bulk)
To check hysteresis effects, free energy differences are calculated in both the forward ($\Delta G_+$) and backward ($-\Delta G_-$) directions.
[Table: free energies (kT); columns $\Delta G_+$, $\Delta G_-$, $\Delta G_{av}$; rows TI, FEP, PMF]

Free energy of translocating a K+ ion to the gA center
[Figure: running average for 700 ps. Solid: forward (bulk to gA). Dashed: backward]
1. Convergence: the free energy plot should become flat.
2. No hysteresis: the two results should agree within 1 kcal/mol.

Distribution of water dipole moments in bulk and in gramicidin
In the presence of a K+ ion, the dipole moment of hydration waters decreases in bulk but increases in the gramicidin A channel. Ab initio simulations in gramicidin show the importance of the polarization interaction.

Electrostatic energy of a K+ ion + 6 waters

PMF results for K+, Ca2+ and Cl- ions
Each window is simulated for 400 ps.
Well depths: Ub(K) ~ 7 kT, Ub(Ca) ~ 2 kT
Problems with divalent ions: Ca2+ binding to gA and blocking of K+ ions cannot be explained.

Lessons from the gramicidin simulations
1. Current force fields, which ignore polarization, are not expected to work in narrow pores where water and ions form a single file. Ab initio MD calculations indicate that hydration waters of a K+ ion are more polarized in gA than in bulk.
2. Hydration waters around a divalent ion are more polarized than those of a monovalent ion.
Example: dipole moment of water from ab initio calculations:
Bulk water: 3.0 Debye
Hydration shell of K+ ion: 2.8 Debye
Hydration shell of Ca2+ ion: 3.4 Debye
Thus the current force fields, which are optimised for monovalent ions, cannot work well for divalent ions.

2. Sampling problem in a simple vs complex system: Test of Jarzynski's Equation
[Figure panels: carbon nanotube; gramicidin A channel]

Comparison of K+ ion PMFs obtained from umbrella sampling & WHAM and from Jarzynski's equality using steered MD simulations
[Figure panels: carbon nanotube; gramicidin A channel. Legend: pulling velocity v (Å/ns)]

Sampling is more difficult in non-equilibrium methods
1. In a carbon nanotube, the interaction of the K+ ion (and the hydration waters) with the C atoms on the wall is short range, hence equilibration of the system is quite fast. In such a situation, Jarzynski's Equation works as well as umbrella sampling, and because it is simpler to implement, it would be the method of choice.
2. In the gramicidin channel, the K+ ion (and the hydration waters) interact with the charged atoms on the protein wall. Because the Coulomb interaction is long range, equilibration takes more time. In such cases, Jarzynski's Equation is not very reliable, and umbrella sampling should be preferred for accurate results.

Equilibration and convergence issues in PMF calculations
Finite resources mean we need to make optimal choices for equilibration and production times in free energy calculations. The equilibration period is the initial part of the simulation, where the system is still evolving (not equilibrated yet); these data must be thrown away. Choosing it too short will contaminate the result, and choosing it too long will waste computing time. During production, the system fluctuates around equilibrium. It must be run long enough to allow the system to sample all energetically important states; otherwise the calculations will not be accurate. Convergence tests can be used for this purpose, but note that there are no absolute criteria that one can use (running longer is the only choice if you are in doubt).

Example: ab initio calculation of PMFs for Na-Cl and Ca-Cl
The ion-ion potentials in force fields are determined from combination rules with no direct experimental input. This is not satisfactory, and any guidance from ab initio calculations would be very useful. In the examples below, the PMFs for the dissociation of Na-Cl and Ca-Cl ion pairs are calculated from ab initio MD (Car-Parrinello MD) simulations using the constraint-force method (faster than umbrella sampling). The average force needed to keep the ions at a fixed distance, r, is calculated for a range of r values at regular intervals, and these are integrated to determine the PMF. Note that ion-water dynamics is fast, which makes these picosecond ab initio calculations feasible. They would not be feasible for proteins.
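The integration step of the constraint-force method can be sketched as a cumulative trapezoidal sum. Several things here are assumptions and not taken from the slides: `mean_force` is defined as the average force along r exerted by the system on the ion pair (so W(r) = -∫ F dr; the force needed to hold the constraint has the opposite sign), the PMF is zeroed at the largest separation, and any Jacobian (entropic) correction for the radial coordinate is left out.

```python
import numpy as np

def pmf_from_mean_force(r, mean_force):
    """Integrate W(r) = -integral of <F> dr on a grid, with W = 0 at the largest r."""
    r = np.asarray(r, dtype=float)
    f = np.asarray(mean_force, dtype=float)
    steps = 0.5 * (f[1:] + f[:-1]) * np.diff(r)      # trapezoid per interval
    w = np.concatenate(([0.0], -np.cumsum(steps)))   # W relative to r[0]
    return w - w[-1]

# Synthetic check with a harmonic well centred at r = 3 (arbitrary units).
r = np.linspace(2.0, 6.0, 41)
force = -(r - 3.0)                                   # F = -dW/dr for W = 0.5*(r-3)^2
w = pmf_from_mean_force(r, force)
print(r[np.argmin(w)], w.min())
```

With real data each mean force carries a statistical error, and those errors accumulate along the integration path, so the uncertainty of W(r) grows with distance from the reference point.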

Example: PMF for dissociation of Na-Cl
The total run is 6 ps. The data are divided into 1 ps blocks to check equilibration. Here 2 ps of data are dropped and the PMF is obtained from the last 4 ps.
[Figure legend: 1-2 ps equilibration; 3-6 ps production (black line)]

Another way to check the equilibration is to drop successively more data for equilibration and see if the result changes.
[Figure panels: r = 3.1 Å, r = 3.9 Å, r = 4.7 Å]
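This check, dropping successively more of the initial data and watching whether the estimate moves, can be sketched for a generic time series such as the constraint force at one distance. The helper name and the synthetic data are assumptions for illustration; the average stands in for whatever quantity is being converged.

```python
import numpy as np

def mean_after_dropping(series, n_blocks):
    """Mean of the series after discarding the first 0, 1, 2, ... blocks."""
    blocks = np.array_split(np.asarray(series, dtype=float), n_blocks)
    return [float(np.concatenate(blocks[i:]).mean()) for i in range(n_blocks)]

# Synthetic series: an initial transient followed by a steady value.
series = [10.0, 10.0, 1.0, 1.0, 1.0, 1.0]
means = mean_after_dropping(series, n_blocks=3)
print(means)   # the estimate stops changing once the transient is excluded
```

The equilibration time can be taken as the smallest amount of dropped data beyond which the estimate no longer shifts by more than its statistical noise.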

Comparison of ab initio and classical PMFs for Na-Cl
None of the classical force fields can match the ab initio PMF. In particular, AMBER has a deep contact minimum, which leads to crystallization.

Accelerated MD for speeding up convergence
Using biasing potentials in the low-energy regions of the potential energy surface, barriers can be lowered, leading to faster convergence. For the Na-Cl PMF considered here, this leads to a ~4-fold speed-up.

Example: PMF for dissociation of Ca-Cl
Convergence could not be obtained after 23 ps of ab initio simulation. Inspection of the forces shows large variations according to the Ca hydration number, n(Ca):
n(Ca) = 5 for r < 3.7 Å
n(Ca) = 5 or 6 for r = 3.7-4.9 Å
n(Ca) = 6 for r > 4.9 Å

Ca-Cl PMF and its dependence on n(Ca)
The PMF with n(Ca) = 5 in the intermediate region is unphysical. Switching to n(Ca) = 6 yields a reasonable PMF.

Comparison of ab initio and classical PMFs for Na-Cl
The CHARMM force field does a better job than that of Dang & Smith, but it still needs to be improved.

Example: PMF for a K+ ion in the Kv1.2 potassium channel
The trigger for permeation of K+ ions is the entry of the cavity K+ ion into the S4 binding site. To find out whether the K+ ion can bind to S4 while the filter ions occupy S1-S3, or whether they have to move to S0-S2 to enable binding, two PMFs are constructed with the final states S1-S3-S4 and S0-S2-S4.

The first check in PMF calculations is whether there are sufficient overlaps between the neighbouring windows. This can be achieved by visually checking the density plots for all the windows (top) or by directly calculating the overlaps and plotting them in a bar graph (bottom).

Next, decide on the equilibration time and start collecting density data. How long do we run? An efficient way to decide is to run in small blocks and check for convergence in the accumulated data. In the example here, the total run is 600 ps and 100 ps is dropped for equilibration. To show convergence, data are added in 100 ps blocks and a PMF is constructed from the accumulated data every 100 ps.

One accumulates a great deal of trajectory data during the PMF calculations, and it would be a pity if all one extracted from it were the reaction coordinate of the ion. A detailed picture of the reaction process can be obtained using visualisation methods (e.g. making a video!). Here is an example showing that in the S1-S3-S4 PMF, the cavity ion does not trap the water at S4 as it moves in.

The K+ ions in the filter move together, e.g. from S1-S3 to S0-S2 or S2-S4. This can be studied by constructing the PMF for the center of mass of the ions. But how do we know that this is a reliable reaction coordinate? A simple way to show this is to plot the distribution of ion-ion distances and show that it remains Gaussian as the pair moves across the filter (as if the ions were connected by a spring).

Example: PMF for binding of charybdotoxin to a K+ channel
From the previous examples, we have seen that ions equilibrate quite fast (~100 ps) and a production run of < 1 ns is sufficient for the PMF. For complex ligands, the situation is obviously more complicated. For one thing, the ligand may be distorted, which will lead to erroneous results. One also requires much longer equilibration of the system (typically > 1 ns) and longer production runs (> 1 ns).

Convergence of the toxin PMF
Force constant: k = 20 kcal/mol/Å²
Umbrella windows at 0.5 Å intervals
Each color represents 400 ps of sampling. The first 1.2 ns is dropped for equilibration and the PMF is obtained from the last 2 ns (black line).

Equilibration and convergence issues in FEP & TI
1. FEP calculations
In FEP, one has to decide on the number of windows and the equilibration time in advance. The windows are created serially, so if the equilibration time is inadequate, the calculation has to be repeated with a longer equilibration time and the initial data are wasted. A second potential problem in FEP calculations is the requirement that $\Delta G_i$ remains around 1-2 kT for all windows. Because the change in the free energy is nonlinear, it is very difficult to guess the number of windows one should use. For the same reason, using fixed intervals is not optimal. Exponentially spaced intervals would reduce the required number of windows by half.

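Example: Na+ binding energy in glutamate transporter
[Table: $\Delta G$(Na+; b.s. → bulk) by window scheme, for equally spaced (eq) and exponentially spaced (exp.) windows; surviving entry: exp., 27.1]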

Free energy change $\Delta G_i$ at each step of the FEP calculation

Exponential versus equal spacing for $\lambda$
The interval [0, 0.5] is mapped to an exponential for 40 windows. (Fold it over to get the interval [0.5, 1].)
[Figure legend: exp.; equal]

2. TI calculations
In TI, one only needs to specify the number of windows in advance. The data can be divided into equilibration and production parts later. Moreover, one can continue accumulating data if there is a problem with convergence, so no data are wasted. Convergence can be monitored by plotting the running average of the free energy. Flattening out of the curve is usually taken as a sign of convergence. Because a small number of windows is used in TI, equilibration may prove difficult in some systems. An initial FEP calculation with a large number of windows can resolve this problem (start each TI window from the nearest FEP window).
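Monitoring convergence through a running average, alongside averages over fixed-size blocks, can be sketched as below. The function names are hypothetical and the input stands for any per-frame quantity collected in one TI window (for example the sampled values of the λ-derivative of the Hamiltonian); the numbers are made up.

```python
import numpy as np

def running_average(values):
    """Cumulative mean after each new data point."""
    values = np.asarray(values, dtype=float)
    return np.cumsum(values) / np.arange(1, len(values) + 1)

def block_averages(values, block_size):
    """Means over consecutive blocks; a trailing partial block is dropped."""
    values = np.asarray(values, dtype=float)
    n_blocks = len(values) // block_size
    return values[: n_blocks * block_size].reshape(n_blocks, block_size).mean(axis=1)

samples = [2.0, 4.0, 6.0, 4.0, 3.0, 5.0]   # illustrative values
print(running_average(samples))            # should flatten as data accumulate
print(block_averages(samples, 2))          # scatter shows the remaining noise
```

A flat running average with block averages scattered evenly around it is the usual sign of convergence; a systematic drift in the block averages suggests the equilibration cut was too short.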

Example: Na+ and Asp binding energies in glutamate transporter
TI calculation of the binding free energy of a Na+ ion to binding site 1 in GltPh. Integration is done using Gaussian quadrature with 7 points. Thick lines show the running averages, which flatten out as the data accumulate. Thin lines show averages over 50 ps blocks of data.

Asp binding energy in glutamate transporter
TI calculation of the binding free energy of Asp to the binding site in GltPh. Asp is substituted with 5 water molecules. The first 400 ps of data are treated as equilibration and the following 1 ns of data is used for production.