An Introduction to Free Energy Calculations
Alan E. Mark
School of Molecular and Microbial Sciences (SMMS), Chemistry Building (#68), University of Queensland, Brisbane, QLD 4072, Australia
Free Energy: The ‘Holy Grail’ of Computational Chemistry Almost all measurable properties depend (directly or indirectly) on the free energy of the system Probability of finding the system in a given state Hamiltonian (energy function) How the system preferentially evolves in time. Free Energy - conformational equilibria - association constants (equilibrium properties) - rates of reaction - transition paths (non-equilibrium processes)
Table 1: Classification of Properties
Class of properties | Examples (MD simulations) | Experimental quantity
Instantaneous properties (only current state) | potential energy, kinetic energy, configuration ... (easy) | heat of vaporization, temperature, pressure, structure
Time dependent properties (current state + previous states) | position f(t), velocity f(t), orientation f(t), dipole moment f(t) ... (depends on time scale) | diffusion, order parameters, dielectric properties
Global properties (all possible states) | free energy, entropy ... (hard, depends on the extent of the available phase space) | binding constants, relative stability, barrier heights, heat capacity, compressibility, most things experimentally measurable
Thermodynamic Relations
Classical Statistical Mechanics. For a system of N particles: masses: $m_i$; Cartesian coordinates: $r_i$; conjugate momenta: $p_i$; interaction function: $V(r)$; Hamiltonian: $H(p, r) = K(p) + V(r)$ (kinetic + potential), with $K(p) = \sum_i p_i^2 / 2m_i$.
Basic statistical mechanics. For a system in equilibrium which can be in M energy states, the probability $P_n$ of the system having an energy $E_n$ is given by: $P_n = e^{-E_n/kT} / Z$, where k = Boltzmann constant, T = temperature, and Z = the canonical partition function, $Z = \sum_{n=1}^{M} e^{-E_n/kT}$: the sum or integral over all possible states of the system (normalization). Link to thermodynamics: Helmholtz free energy, $F = -kT \ln Z$.
$\langle A \rangle = \sum_n A_n P_n$ = ensemble average of A
Classical Statistical Mechanics: Partition Function. $Z = \frac{1}{h^{3N} N!} \int\!\!\int e^{-H(p, r)/kT} \, dp \, dr$ (N! for indistinguishable particles). Free energy: $F = -kT \ln Z = F_{ideal\,gas} + F_{configurational}$. h = Planck's constant, k = Boltzmann's constant, T = temperature.
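As a small illustration of the relations on this slide (Boltzmann probabilities, Z as the normalization, and F = -kT ln Z), here is a minimal sketch for a handful of discrete energy levels. The function name and the choice of molar units (kJ/mol) are assumptions made for the example, not part of the lecture.

```python
import math

K_B = 0.0083144626  # Boltzmann constant per mole, kJ/(mol K); unit choice is an assumption


def boltzmann_summary(energies, temperature):
    """Return (probabilities P_n, free energy F) for discrete levels E_n in kJ/mol."""
    kT = K_B * temperature
    e_min = min(energies)  # shift energies so the exponentials stay well scaled
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z_shifted = sum(weights)  # Z divided by exp(-e_min/kT)
    probabilities = [w / z_shifted for w in weights]
    free_energy = e_min - kT * math.log(z_shifted)  # F = -kT ln Z
    return probabilities, free_energy


probs, f = boltzmann_summary([0.0, 0.0], 300.0)  # two degenerate states
```

For two degenerate states the probabilities are equal and F is lowered by kT ln 2 relative to a single state, which is the entropic contribution of having more accessible states.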
Methods to Compute Free Energies (cont.) Direct Determination of the Absolute Free Energy: Task: Find the regions of configurational space (r) that determine Z. Problems: 1. We never sample all of configurational space. 2. Each additional part of phase space sampled makes a negative (favorable) contribution to the free energy, because the integrand is always > 0. Consequences: i) The absolute free energy can be obtained only if the complete partition function can be enumerated. ii) With simulation techniques the absolute free energy is systematically over-estimated.
Methods to Compute Free Energies (cont.) Absolute Free Energy as Ensemble Average (1): Q. Can one estimate the absolute free energy as an ensemble average? Step 1. Integrate momenta and positions separately. Step 2. Divide by a volume integral (no change). Step 3. Invert to express as an ensemble average; this gives the excess free energy.
Methods to Compute Free Energies (cont.) Absolute Free Energy as Ensemble Average (2): Absolute (excess) free energy as an ensemble average: $F_{excess} = kT \ln \langle e^{+V(r)/kT} \rangle$. NOTE: The exponent is +ve. Low energy states dominate the ensemble BUT high energy states dominate the free energy (entropy is important). The probability of sampling states that contribute little (or nothing) to F is high; the probability of sampling states that contribute a lot to F is low. Very poor convergence!! Q. Can one estimate the absolute free energy as an ensemble average? NO
Methods to Compute Free Energies (cont.) Absolute free energy: Simulation: no. Experiment: no. Experimentally one can only measure relative free energies, i.e. the difference in free energy ΔF between two (or more) states of a system: ΔF_AB = F_B – F_A
Methods to Compute Free Energies (cont.) Relative free energies: Examples. ΔF_binding = F_bound – F_unbound; ΔF_stability = F_folded – F_unfolded; ΔF_partition = F_solvent B – F_solvent A
Methods to Compute Free Energies: ΔG from experiment. Two possibilities: 1. Relative probability of finding the system in one of two states. 2. Work required to go from an initial to a final state via a reversible path. Example 1. Relative probability: free energy of binding, ΔG_binding. ligand + acceptor ↔ ligand:acceptor (complex). Association constant, K_A = relative probability of bound/free: $K_A = [\mathrm{ligand{:}acceptor}] / ([\mathrm{ligand}][\mathrm{acceptor}])$, $\Delta G_{binding} = -kT \ln K_A$.
Methods to Compute Free Energies: ΔG from experiment. Example 2. Reversible work: change of state at constant T, using V(x) = potential.
Methods to Compute Free Energies: ΔG from simulation. The same two possibilities as in experiment: A. In terms of a probability function, P(r). B. As reversible work along a coordinate: work = integral of the force (derivative of the potential w.r.t. a given coordinate), with V(r) = potential. The choice of coordinate system is arbitrary.
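To make the link between an association constant and a binding free energy concrete, the sketch below applies ΔG = -RT ln K_A in molar units. The function names are hypothetical, and K_A is assumed to be expressed relative to a 1 M standard state so that the logarithm is dimensionless.

```python
import math

R = 0.0083144626  # gas constant, kJ/(mol K)


def binding_free_energy(k_assoc, temperature=298.15):
    """Binding free energy in kJ/mol from an association constant (1 M standard state assumed)."""
    return -R * temperature * math.log(k_assoc)


def association_constant_ratio(delta_delta_g, temperature=298.15):
    """Factor by which K_A changes for a free energy difference given in kJ/mol."""
    return math.exp(delta_delta_g / (R * temperature))


ratio = association_constant_ratio(12.0)  # effect of a 12 kJ/mol difference on K_A
```

At room temperature a difference of about 12 kJ/mol corresponds to roughly two orders of magnitude in the association constant, which is why errors of a few kT matter so much for binding predictions.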
Methods to Compute Free Energies (cont.) Temperature Integration: Reference State: Can use low temperature harmonic state as a reference but phase transitions are a problem. Method: Perform constant V,N simulations at different temperatures to obtain average of E. Integrate numerically
Methods to Compute Free Energies (cont.) Pressure Integration: Reference State: A state with a large volume (treat as ideal gas). Method: Perform constant N,T simulations at different volumes to compute the pressure. Integrate numerically, as for temperature integration. Problems with phase transitions.
Free Energy as a Function of a Spatial Coordinate, ξ. Integration along a path: Derivative of free energy as function of ξ 1. Work along a reversible path
Free Energy as a Function of a Spatial Coordinate, ξ. 1. Work along a reversible path. Derivative of the free energy as a function of ξ: ξ is a spatial coordinate (Cartesian or internal), therefore $dF/d\xi = \langle \partial V / \partial \xi \rangle_\xi$. Free energy profile (PMF = potential of mean force) = integral of the average force acting along ξ: $F(\xi) - F(\xi_0) = \int_{\xi_0}^{\xi} \langle \partial V / \partial \xi' \rangle_{\xi'} \, d\xi'$. Method: Simulate at given values of ξ, average the force acting along ξ, integrate.
Zangi, R., et al. Proteins 43 (2001). Free energy barrier estimation of unfolding the α-helical surfactant-associated polypeptide C. Compares the stability of lung epithelium surfactant protein C (SP-C) in water and methanol. SP-C sequence: LRIPCCPVNLKRLLVVVVVVVLVVVVIVGALLMGL. Figure: free energy of unwinding two helix turns (Val25 to Leu32) in water and in methanol.
Free Energy as a Function of a Spatial Coordinate, ξ. 2. Probability Distribution: Estimate F(ξ) from the probability distribution along ξ, P(ξ): $F(\xi) = -kT \ln P(\xi) + \mathrm{const}$. ξ is one dimension of a multi-dimensional space. Sampling is possible when ξ has a limited range (e.g. -π < ξ < π for a dihedral).
Free Energy as a Function of a Spatial Coordinate, ξ. 2. Probability Distribution (cont.): Estimate F(ξ) from the probability distribution along ξ, P(ξ). Example: PMF from the radial distribution function (rdf), $W(r) = -kT \ln g(r)$. Here the PMF reflects the average force from the other H2O molecules. Use in an implicit solvent model to reduce vacuum effects.
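The method on this slide (simulate at fixed ξ, average the derivative of the potential along ξ, integrate) can be sketched as a simple trapezoidal integration. The function name is hypothetical and the input averages are assumed to come from separate simulations at each ξ value; here they are replaced by an analytic harmonic example.

```python
def pmf_from_mean_derivative(xi_values, mean_dv_dxi):
    """Integrate <dV/dxi> along xi with the trapezoid rule; the PMF is zero at the first point."""
    pmf = [0.0]
    for i in range(1, len(xi_values)):
        step = xi_values[i] - xi_values[i - 1]
        average = 0.5 * (mean_dv_dxi[i] + mean_dv_dxi[i - 1])
        pmf.append(pmf[-1] + average * step)
    return pmf


# Harmonic stand-in: V = 0.5 * k * xi**2 with k = 2, so dV/dxi = 2 * xi
xi = [0.0, 0.5, 1.0]
profile = pmf_from_mean_derivative(xi, [2.0 * x for x in xi])
```

The trapezoid rule is exact for this linear derivative; for real data the spacing of the ξ points has to be fine enough that the averaged derivative varies smoothly between them.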
Free Energy as a Function of a Spatial Coordinate, ξ. 2. Probability Distribution (cont.): Estimate F(ξ) from probability distribution along ξ, P(ξ) Problems: Sampling must be continuous and sufficient for all ξ.
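A minimal sketch of estimating F(ξ) = -kT ln P(ξ) from sampled values of ξ is given below. The function name and binning scheme are assumptions for illustration. Bins that are never visited get an infinite free energy, which is exactly the sampling problem noted on this slide.

```python
import math


def free_energy_profile(samples, lower, upper, n_bins, kT):
    """Histogram samples of xi and return F per bin, shifted so the minimum is zero."""
    counts = [0] * n_bins
    width = (upper - lower) / n_bins
    for s in samples:
        if lower <= s < upper:
            index = min(int((s - lower) / width), n_bins - 1)  # guard against rounding
            counts[index] += 1
    total = sum(counts)
    profile = [-kT * math.log(c / total) if c > 0 else math.inf for c in counts]
    f_min = min(profile)
    return [f - f_min for f in profile]


profile = free_energy_profile([0.1, 0.1, 0.6], 0.0, 1.0, 2, 1.0)
```

The sketch assumes at least one sample falls inside the range; with no samples at all the profile is undefined.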
Methods to Compute Free Energies: Umbrella Sampling. Aim: Bias or focus sampling in important regions of phase space. Method: 1. Modify the energy function V(r) with a weighting function U(r) (e.g. a harmonic restraining potential) to restrict sampling to the region(s) of interest. 2. Correct for the umbrella (weighting function), which gives non-Boltzmann weights. Correction: unbiased ensemble average from a biased ensemble, $\langle A \rangle = \langle A \, e^{+U/kT} \rangle_U / \langle e^{+U/kT} \rangle_U$. Umbrella sampling is a widely used general method to improve sampling.
Methods to Compute Free Energies: Umbrella Sampling. 3. Umbrella Sampling along ξ. A. Modify the potential energy function to focus (bias) sampling along ξ using a weighting function U(ξ), e.g. a harmonic restraining potential. B. Sampling with the umbrella potential gives a biased probability distribution $P_U(\xi)$. C. Determine F(ξ) (unbiased) from the biased distribution: $F(\xi) = -kT \ln P_U(\xi) - U(\xi) + C$, where $P_U(\xi)$ is the biased distribution, U(ξ) is the umbrella potential, and C is an unknown offset (the work to apply the umbrella potential).
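Step C, recovering the unbiased F(ξ) from a biased distribution, can be sketched for a single harmonic umbrella window as F(ξ) = -kT ln P_U(ξ) - U(ξ) + C. The function name and the harmonic form of the umbrella are assumptions for the example; combining several windows (which fixes the offsets C relative to each other) is not shown.

```python
import math


def unbias_window(xi_values, biased_prob, force_constant, xi_center, kT):
    """Return F(xi) up to an unknown offset from one harmonic umbrella window."""
    profile = []
    for xi, p in zip(xi_values, biased_prob):
        umbrella = 0.5 * force_constant * (xi - xi_center) ** 2
        profile.append(-kT * math.log(p) - umbrella)
    return profile


# Check on a flat underlying free energy: the biased distribution is then exp(-U/kT)
xs = [-1.0, 0.0, 1.0]
raw = [math.exp(-0.5 * 2.0 * x * x) for x in xs]
biased = [w / sum(raw) for w in raw]
flat = unbias_window(xs, biased, 2.0, 0.0, 1.0)
```

Because the true profile in this check is flat, the recovered values are all equal; only the constant offset remains undetermined.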
Interaction of the cell-penetrating peptides with lipid bilayers Pulling of a single penetratin molecule through a DPPC bilayer. Semen Yesylevskyy
Thermodynamic Cycles: Basic principles. 1. Free Energy is a State Function: Difference in free energy between two states is independent of the path used to go between them.
Mark, et al. (1991) J. Chem. Phys. 94, Calculation of relative free energy via indirect pathways Optimal choice of path maximizes the relaxation of environment
Thermodynamic Cycles: Basic principles. 2. ΔG for Any Closed Path = 0: A thermodynamic cycle is a closed path, for which ΔG = 0.
Thermodynamic Cycles: Relative free energies 1. Hydration Free Energy: Difference between A and B
Thermodynamic Cycles (cont.). 2. Binding Free Energy: Modification of a ligand
Methods to Compute Free Energies: Coupling Parameter Approach. Aim: A general expression for the difference in free energy between two arbitrary states A and B. Express the Hamiltonian as a function of a coupling parameter, λ. Example: the mass of an atom. The Hamiltonian becomes a function of λ, H(λ); the partition function becomes a function of λ, Z(λ); the free energy becomes a function of λ, F(λ). Note: A and B effectively become two states of a single system.
Methods to Compute Free Energies: Coupling Parameter Approach. 1. Perturbation Formula: Express ΔF_AB as the ratio of the probability of state A and state B: $\Delta F_{AB} = -kT \ln \langle e^{-[H(\lambda_B) - H(\lambda_A)]/kT} \rangle_A$. Estimate ΔF_AB from the probability of finding a configuration appropriate to state B [H(λ_B)] in an ensemble of states generated at state A [H(λ_A)], or vice versa.
Methods to Compute Free Energies: Coupling Parameter Approach. 1. Perturbation Formula (cont.): In principle the perturbation formula is correct for a mutation between any two states A and B; the ensemble average ⟨…⟩ refers to sampling over all possible states. In practice the perturbation approach only converges to the correct answer if there is a strong overlap of configurational space (low energy states) in states A and B (i.e. if the difference between A and B is small). It is therefore expressed generally as a sum over a series of small changes in λ.
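The perturbation formula, ΔF_AB = -kT ln ⟨exp(-[H_B - H_A]/kT)⟩_A, can be tried on a case with a known answer. The sketch below uses two one-dimensional harmonic wells, for which ΔF = (kT/2) ln(k_B/k_A) analytically; the wells, function names and sample size are illustrative assumptions, and state A is sampled directly from its Gaussian distribution rather than by MD or MC.

```python
import math
import random


def fep_delta_f(delta_h, kT):
    """Perturbation estimate from energy differences H_B - H_A sampled in state A."""
    shift = min(delta_h)  # factor out the largest Boltzmann weight for stability
    mean = sum(math.exp(-(d - shift) / kT) for d in delta_h) / len(delta_h)
    return shift - kT * math.log(mean)


def sample_harmonic_delta_h(k_a, k_b, kT, n_samples, seed=0):
    """Energy differences for x drawn from the Boltzmann distribution of well A."""
    rng = random.Random(seed)
    sigma = math.sqrt(kT / k_a)
    return [0.5 * (k_b - k_a) * rng.gauss(0.0, sigma) ** 2 for _ in range(n_samples)]


estimate = fep_delta_f(sample_harmonic_delta_h(1.0, 2.0, 1.0, 200000), 1.0)
exact = 0.5 * math.log(2.0)
```

Here the two ensembles overlap strongly, so the estimate converges quickly. Making k_B much larger than k_A reduces the overlap and the same number of samples gives a visibly poorer estimate.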
Methods to Compute Free Energies: Coupling Parameter Approach. 2. Integration Formula: Reversible work along a path (defined by the λ dependence of the Hamiltonian). Analytical derivative for F'(λ): $dF/d\lambda = \langle \partial H / \partial \lambda \rangle_\lambda$. Integration formula: $\Delta F_{AB} = \int_{\lambda_A}^{\lambda_B} \langle \partial H / \partial \lambda \rangle_\lambda \, d\lambda$, the integral of the average derivative of the potential w.r.t. an arbitrary coordinate.
Methods to Compute Free Energies: Coupling Parameter Approach. 1. Perturbation Formula: Note: Convergence depends on the overlap of the low energy regions of configurational space at λ and λ + Δλ; Δλ must have a small effect on the configurations sampled. Requirements: 1. Equilibrium. 2. Convergence of the ensemble average.
Methods to Compute Free Energies: Coupling Parameter Approach. 2. Integration Formula: Perform the integration by: 1. Making λ a function of time and slowly changing λ during the simulation (single-configurational TI or slow growth). 2. Simulating at fixed λ values and integrating numerically (multi-configurational TI or numerical quadrature). Requirements (at each λ): A. Equilibrium. B. ⟨∂H/∂λ⟩ must converge. C. F(λ) must be a smooth function.
Methods to Compute Free Energies: Coupling Parameter Approach. Example: Difference in free energy of hydration between water (H-O-H) and methanol (CH3-O-H). Gradually change water into methanol by changing the charge, LJ and angle parameters during the course of a simulation. Calculate the work done. Figure: free energy as a function of the coupling parameter between water and methanol.
Perturbation or Integration? Ensembles overlap: use perturbation. Derivative of the free energy changes slowly with X: use integration. Figure: potential V as a function of X for states C and C', with minima at x0 and x'0.
Perturbation: Phase space overlap. Figure panels: efficient; inefficient.
Methods to Compute Free Energies: Entropy and Enthalpy. Q. Why not calculate entropy and enthalpy? Free energy: $F = E - TS$, and entropy: $S = -(\partial F / \partial T)_{N,V}$. Combine with $dF/d\lambda = \langle \partial H / \partial \lambda \rangle_\lambda$ to obtain the derivative of S w.r.t. λ: $\partial S / \partial \lambda = -\frac{1}{kT^2} [\langle H \, \partial H / \partial \lambda \rangle_\lambda - \langle H \rangle_\lambda \langle \partial H / \partial \lambda \rangle_\lambda]$. The entropy is expressed in terms of the correlations between H and ∂H/∂λ.
Methods to Compute Free Energies: Entropy and Enthalpy (cont.) Thermodynamic integration formula for ΔS_AB: integrate ∂S/∂λ over λ from A to B. Enthalpy: the enthalpy difference, ΔE_AB, is given by the difference between the two end states. NOTE: ΔF is expressed only in terms of ∂H/∂λ, i.e. the interactions which change as a function of λ (local interactions). ΔS and ΔE depend on H (all the interactions). Accuracy of ΔS and ΔE << ΔF (1-2 orders of magnitude).
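The numerical quadrature route (simulate at fixed λ values, then integrate ⟨∂H/∂λ⟩ numerically) can be sketched with Simpson's rule. To keep the example self-contained, the ensemble averages are replaced by the analytic result for a harmonic well whose force constant is interpolated linearly in λ; that model system and all function names are assumptions, not the lecture's example.

```python
import math


def simpson(values, spacing):
    """Composite Simpson's rule for an odd number of equally spaced points."""
    if len(values) < 3 or len(values) % 2 == 0:
        raise ValueError("Simpson's rule needs an odd number of points (>= 3)")
    total = values[0] + values[-1]
    total += 4.0 * sum(values[1:-1:2]) + 2.0 * sum(values[2:-1:2])
    return total * spacing / 3.0


def harmonic_dv_dlambda(lam, k_a, k_b, kT):
    """Analytic <dV/dlambda> for V = 0.5 * k(lambda) * x**2, k linear in lambda."""
    k_lam = (1.0 - lam) * k_a + lam * k_b
    return 0.5 * (k_b - k_a) * kT / k_lam  # uses <x**2> = kT / k(lambda)


def ti_delta_f(k_a, k_b, kT, n_points=9):
    """Integrate <dV/dlambda> from lambda = 0 to 1 on n_points equally spaced values."""
    spacing = 1.0 / (n_points - 1)
    values = [harmonic_dv_dlambda(i * spacing, k_a, k_b, kT) for i in range(n_points)]
    return simpson(values, spacing)


delta_f = ti_delta_f(1.0, 2.0, 1.0)  # exact answer: 0.5 * kT * ln(2)
```

With a smooth integrand, nine points already reproduce the analytic value closely. If ⟨∂H/∂λ⟩ changes sharply anywhere along the path, more λ points are needed in that region.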
Table 2: Summary of Methods to Calculate Free Energies
Methods to Compute Free Energies: Applications. Aim: Estimate the excess chemical potential, µ. Excess chemical potential = excess free energy per particle, where the excess free energy is the free energy in excess of the ideal gas (configurational part). Methods: 1. Switch off all intermolecular interactions (N,V,T), i.e. turn the system into an ideal gas and estimate the work. 2. Add or remove 1 particle (V,T or P,T).
(Widom) Particle Insertion (Test Particle Insertion). Particle insertion is a perturbation approach. For an equilibrium configuration (MD or MC) of N particles: 1. Randomly attempt to add 1 additional particle (single step). 2. Estimate the free energy using the perturbation formula. State A: N particles. State B: N+1 particles. ΔV = change in V for N+1 particles. Randomly insert a test (ghost) particle M times: $\mu_{excess} = -kT \ln \langle e^{-\Delta V / kT} \rangle_N$.
(Widom) Particle Insertion (Test Particle Insertion). Dense systems: Low energy (good) configurations are never sampled. There are no spontaneously formed cavities large enough to accept the test (ghost) particle, so particle insertion fails. Solution: Slowly grow particles and allow the system to adapt.
(Widom) Particle Insertion (or Test Particle Insertion). Particle insertion works well for low density systems but not for high density systems. Question: For what type of system would you expect particle deletion to work well? Hint: For particle insertion you must sample locations for which it is favourable to add a particle (i.e. a cavity). For particle deletion you need to sample locations where it is unfavourable to have a particle (i.e. where particles overlap).
Example: Creation of a cavity (purely repulsive, the repulsive part of LJ) in SPC water. SPC = Simple Point Charge (water model). Particle insertion: Works well if many appropriately sized cavities are sampled; if there are no appropriate cavities the method fails. Thermodynamic integration: Larger cavities can be created because the system can relax.
Methods to Compute Free Energies. Example: Creation of a cavity (purely repulsive) in SPC water. Thermodynamic integration: Simulate at a number of different cavity sizes, then integrate. The ensemble average behaves differently for different cavity sizes: depending on how the water molecules pack around the cavity, the free energy of certain cavity sizes is harder to predict. Figure: relative accuracy of the excess Gibbs free energy (derivative) for different cavity sizes.
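The insertion estimate is the perturbation formula applied to the energy change on adding one ghost particle, µ_excess = -kT ln ⟨exp(-ΔV/kT)⟩. The sketch below only shows the averaging step. The function name is hypothetical, the insertion energies are assumed to have been computed elsewhere, and an overlap with a hard core is represented by an infinite energy.

```python
import math


def widom_excess_mu(insertion_energies, kT):
    """Excess chemical potential from the energy changes of M trial insertions."""
    boltzmann = [math.exp(-dv / kT) for dv in insertion_energies]
    mean = sum(boltzmann) / len(boltzmann)
    if mean == 0.0:
        return math.inf  # no trial found a cavity: the estimate has not converged
    return -kT * math.log(mean)


ideal = widom_excess_mu([0.0, 0.0, 0.0], 1.0)         # non-interacting ghost particle
half_blocked = widom_excess_mu([0.0, math.inf], 1.0)  # half of the trials overlap
dense = widom_excess_mu([math.inf, math.inf], 1.0)    # every trial overlaps
```

The last case mimics a dense system: when every trial insertion overlaps with an existing particle, the average is zero and the method gives no usable estimate.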
Binding of p-substituted phenols to α-Cyclodextrin. System: α-cyclodextrin, H2O; constant T,P. Relatively small host-guest system: α-cyclodextrin has 6 sugar (glucose) units (cyclic). Analyze: a) Orientation and motion of ligand in pocket b) Interaction energies c) Binding free energies. Figure: α-cyclodextrin with p-methoxyphenol, showing the orientation of the guest when inserted.
Cyclodextrin. Figure: VDW surface of α-cyclodextrin with the front cut away; α-cyclodextrin with p-chloro-phenol. Note the tight fit of the ring and the distortion of the cyclodextrin.
Motion of Guest in α-Cyclodextrin. Figure: side view of α-CD with p-Cl-phenol showing the motion of the guest inside the cavity (water not shown); α-CD with p-Cl-phenol showing van der Waals contacts.
Studies on Structure and Binding of Cyclodextrin. Motion of p-Cl-phenol in the α-cyclodextrin ring (rotation of guest). Relaxation times (guest): a) Rotational averaging: 20-40 ps b) In/out motion: 60-80 ps c) Tilt averaging: >80 ps
Part C. Estimation of Binding Energies of p-Substituted Phenols to α-CD: Comparison of host-guest interaction energies and experimental binding free energies (no entropic contributions). Average error between interaction energy and binding free energy > 12 kJ/mol (4.6 kT or 2 orders of magnitude in association constants).
Studies on Structure and Binding of Cyclodextrin. Part D. Binding Free Energies of p-Substituted Phenols to α-CD: Free Energy Calculations. Example: Difference in binding free energy, ΔΔG_binding, between p-methylphenol and p-chlorophenol bound to α-cyclodextrin. Coupling Parameter Approach: Express the Hamiltonian as a function of a coupling parameter λ such that λ = λ_A corresponds to the initial state and λ = λ_B to the final state. Calculate the work to go from the initial to the final state over a reversible path (integration formula).
Difference in binding free energy, ΔΔG_binding: p-methylphenol (p-CH3) to p-chlorophenol (p-Cl) bound to α-cyclodextrin. Thermodynamic Cycle: Express the difference in the experimental binding free energies ΔG_1 and ΔG_2 in terms of the non-physical mutations ΔG_3 and ΔG_4. Free energy is a state function: ΔG is independent of path, and ΔG (any cycle) = 0.
Methods To Calculate Free Energy: Make the Hamiltonian dependent on λ. The λ dependency determines the pathway from A to B. λ = 0: state A; λ = 1: state B.
Methods To Calculate Free Energy: Slow growth. Integration Formula, Slow Growth: 1. Make λ a function of time and slowly change λ during the simulation. Approximate the integral by a sum (errors accumulate). Compare ΔG_forward and ΔG_reverse; estimate the error from the hysteresis.
Methods To Calculate Free Energy: Slow growth. Requirements (for each λ): A. System in equilibrium: τ_equil > τ_system. B. Sufficient sampling: τ_sample >> τ_system. τ_system = relaxation time.
Slow Growth: Slowly change λ as a function of time during the simulation. Hysteresis first increases then decreases with longer sampling.
Methods To Calculate Free Energy: Integration Formula, Numerical Quadrature: 1. Simulate at fixed λ values and integrate numerically. Requirements (at each λ): A. Equilibrium. B. ⟨∂H/∂λ⟩ must converge. C. F(λ) must be a smooth function.
The ensemble average of ∂H/∂λ must converge.
Figure: change in free energy (kJ mol⁻¹) as a function of sampling time in water. The integral from λ = 0 to 1: 9 equally spaced points, Simpson's rule approximation.
The ensemble average of ∂H/∂λ must converge. Figure: change in free energy (kJ mol⁻¹) as a function of sampling time in cyclodextrin. The integral from λ = 0 to 1: 9 equally spaced points, Simpson's rule approximation.
Methods To Calculate Free Energy: F(λ) must be a smooth function. Example: Binding free energies of p-substituted phenols to α-CD. Figure: effect of increasing the number of points used to perform the integration.
Comparison of Experimental and Calculated Binding Free Energies. Average difference between the experimental and calculated binding free energies, including entropic contributions, < 3 kJ/mol (approx. 1 kT or a factor of 2 in association constant).
Example: Glycogen phosphorylase (GP). Anti-diabetic target; glucose-analog active site inhibitors. Large number of high resolution GP-inhibitor complex crystal structures (validate calculations). Table: inhibitors with substituents R' and R2 and their Ki values; legible entries include 1 (R' = H), Ki 3.1; R' = OH, Ki 39.3; 6 (NHCO2CH3, H), Ki 86; (NHCOCH3, H), Ki 32.
TI cycle in water; TI cycle in protein. Cycles are well converged in water and in protein. Accurately predicts the structures of derivatives. The force field leads to deviations of ~5-10 kJ/mol from experimental binding constants.
Multi-configurational TI (MCTI) vs. single-configurational TI (slow growth). Figure: the value of dG(λ)/dλ as a function of λ for the forward (black) and reverse (red) MCTI mutation of 7 to 8. Error bars are also shown. Table: ΔG(7,8) for 7 to 8 in water and 8 to 7 in water, from SG (20 ns) and MCTI (15 points); legible entry: hysteresis 0.56.
Practical Challenges: Avoiding singularities and numerical instabilities when growing or deleting atoms. Lennard-Jones potential: singularity as r → 0; the derivative has a singularity at r_ij = 0. Beutler, T. C. et al. (1994) Chem. Phys. Lett. 222,
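One common way to remove the r → 0 singularity is a soft-core Lennard-Jones interaction of the kind associated with the paper cited on this slide. The sketch below is an illustration under stated assumptions: the parameter values (alpha = 0.5, power n = 1) are arbitrary choices, and the convention that λ = 1 is the fully interacting atom and λ = 0 the deleted atom is chosen for the example and may differ from the convention used in the lecture.

```python
import math


def lennard_jones(r, epsilon, sigma):
    """Standard 12-6 Lennard-Jones potential (singular at r = 0)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def soft_core_lj(r, lam, epsilon, sigma, alpha=0.5, power=1):
    """Soft-core LJ: finite at r = 0 for lam < 1, equal to plain LJ at lam = 1."""
    s = alpha * (1.0 - lam) ** 2 + (r / sigma) ** 6
    return 4.0 * epsilon * lam ** power * (1.0 / (s * s) - 1.0 / s)


at_contact = soft_core_lj(0.0, 0.5, 1.0, 1.0)  # finite even at zero separation
full = soft_core_lj(1.2, 1.0, 1.0, 1.0)        # matches lennard_jones(1.2, 1.0, 1.0)
```

The extra term in the denominator vanishes at λ = 1, so the end state is unchanged, while at intermediate λ other particles can pass through the growing atom without producing infinite energies or forces.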
Scale 0.01
Example: Simultaneously grow multiple atoms J. Am. Chem. Soc. 1996, 118, Free Energies of Transfer of Trp Analogs from Chloroform to Water Ac-Trp to Trp-NMe. Solid line: water Broken line: methanol
Non-Equilibrium Methods: Jarzynski Equality. $e^{-\Delta G / kT} = \langle e^{-W / kT} \rangle$, given a sufficiently large sample of the work W. Gives the equilibrium work from non-equilibrium simulations (or experiments); closely related to the fluctuation dissipation theorem and reaction path sampling. Described as "remarkable", "amazing", "unexpected". Can be considered as a coordinate transformation where time is treated as a constraint (importance sampling in trajectory space).
Test case: move a particle from location A to location B in a box of water; ΔG(A→B) = 0. Very simple test case for non-equilibrium pulling in a dissipative environment.
Move particle from A to B along x, ΔG(A→B) = 0. Method 1. Reversible work (trivial): a. Very slowly pull the particle from A to B (system always in equilibrium). b. Determine the average force.
Move particle from A to B along x, ΔG(A→B) = 0. Non-equilibrium pulling: many paths have w > 0, some paths have w = 0, and one must find paths for which w < 0. The longer the path and the faster the pull, the more likely w > 0; one must then find paths with w << 0.
Move particle from A to B along x, ΔG(A→B) = 0. Non-equilibrium pulling: Do any paths give w = 0? Yes, but they are rare.
Move particle from A to B along x, ΔG(A→B) = 0. Non-equilibrium pulling: Which, if any, paths give w < 0? Those in which the particle is pushed from A to B by the environment.
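The Jarzynski estimate takes the exponential average of the work over many non-equilibrium pulls, ΔG = -kT ln ⟨exp(-W/kT)⟩. In the sketch below the work values are drawn from a Gaussian distribution purely as a stand-in for real pulling simulations; for Gaussian work with mean m and standard deviation s the exact result is ΔG = m - s²/(2kT). All names and numbers are assumptions for illustration.

```python
import math
import random


def jarzynski_delta_g(work_values, kT):
    """Exponential work average over repeated non-equilibrium trajectories."""
    shift = min(work_values)  # the smallest work dominates the average
    mean = sum(math.exp(-(w - shift) / kT) for w in work_values) / len(work_values)
    return shift - kT * math.log(mean)


rng = random.Random(1)
work = [rng.gauss(2.0, 1.0) for _ in range(200000)]  # synthetic work values, in units of kT
estimate = jarzynski_delta_g(work, 1.0)              # exact value for this model: 1.5
mean_work = sum(work) / len(work)                    # about 2.0, i.e. 0.5 kT dissipated
```

The estimate lies below the average work, and the difference is the dissipated work. As the spread of the work distribution grows, the average is dominated by ever rarer low-work trajectories and the number of pulls needed increases steeply.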
Methane in water Equilibrium versus non-equilibrium work: Moving particles in a condensed phase
Non-Equilibrium Methods: Jarzynski Equality. Does it work? YES. Advantages: 1. Can be the only option (AFM experiments). 2. All of the tricks developed for equilibrium free energy calculations can be applied. 3. Unifies free energy calculations and keeps many theoreticians off the streets. Disadvantages: 1. Suffers from the same convergence problems as perturbation methods. 2. The number of trials grows exponentially with the average dissipative work. 3. Changes can only correspond to observed fluctuations in the system at equilibrium. 4. Equilibrium methods are generally more efficient.
Summary: Free energy; probability of finding the system in a given state; Hamiltonian (energy function); how the system preferentially evolves in time. If we know one we can derive the others. The choice of coordinate system is arbitrary (Cartesian, parameter, trajectory). Must sample a representative ensemble. The secret is to transform the system or bias the sampling to improve sampling.