1
Bioinformatics Data Analysis & Tools
Molecular simulations & sampling techniques
17 Jan 2006
2
Molecular Simulations: Brief History
1936   Gelatine balls (Morell and Hildebrand)
1953   MC simulations (Metropolis et al.)
1957   MC of Lennard-Jones spheres (Wood and Parker)
1964   MD of liquid argon, 10 ps (Rahman)
1970s  Non-equilibrium methods
1970s  Stochastic dynamics methods
1974   MD of liquid water (Stillinger and Rahman)
1977   MD of protein in vacuo, 20 ps (McCammon et al.)
1980s  Quantum-mechanical effects
1983   MD of protein in water, 20 ps (van Gunsteren et al.)
1998   MD of peptide folding, 100 ns (Daura et al.)
1998   MD of protein folding, 1 μs (Duan and Kollman)
Today  Large proteins or complexes in water or membrane; up to microseconds (10-100 CPU days; ~$10^{14}$ times slower than nature; computer speed ×10 every 6 years)
2029   Protein folding, 1 ms
2034   E. coli, $10^{11}$ atoms, 1 ns
2056   Cell, $10^{15}$ atoms, 1 ns
2080   Protein folding as fast as in nature
3
Protein flexibility
Even a correctly folded protein is dynamic
– Crystal structure yields the average position of the atoms
– Overall 'breathing' motion is possible
4
B-factors
The average motion of an atom around its mean position
[Figure labels: alpha helices, beta-sheet]
5
Peptide folding from simulation
A small (beta-)peptide forms a helical structure according to NMR
Computer simulations of the atomic motions: molecular dynamics
6
Folding and unfolding in 200 ns
Unfolded structures: all different? How different? $3^{21} \approx 10^{10}$ possibilities!
Folded structures: all the same
[Figure labels: folded, unfolded]
7
Temperature dependence
The folding equilibrium depends on temperature
[Figure panels: 298 K, 320 K, 340 K, 350 K, 360 K; labels: folded, unfolded]
8
Pressure dependence
The folding equilibrium depends on pressure
[Figure panels: 1 atm, 1000 atm, 2000 atm; labels: folded, unfolded]
9
Surprising result
The number of relevant non-folded structures is very much smaller than the number of possible non-folded structures
If the number of relevant non-folded structures increases proportionally with the folding time, only $10^{9}$ protein structures need to be simulated instead of $10^{95}$ structures
The folding mechanism is perhaps simpler after all…

System    Amino acids in chain   Folding time (exp/sim) (s)   Possible structures          Relevant (observed) structures
peptide   10                     $10^{-8}$                    $3^{20} \approx 10^{9}$      $10^{3}$
protein   100                    $10^{-2}$                    $3^{200} \approx 10^{95}$    $10^{9}$
10
Phase Space
Defines the state of a classical system of N particles:
– coordinates $q = (x_1, y_1, z_1, x_2, \ldots, z_N)$
– momenta $p = (p_{x1}, p_{y1}, p_{z1}, p_{x2}, \ldots, p_{zN})$
One conformation (+ momenta) is one point $(p, q)$ in phase space
Motion is a curved line in phase space
– trajectory: $(p(t), q(t))$
11
Molecular Motions: Time & Length-scales
12
Newton Dynamics
Sir Isaac Newton
$t \to t + \Delta t$
13
Classical (Newton) Mechanics
A system has coordinates q and momenta p (= mv):
$p = (p_1, p_2, \ldots, p_N)$
$q = (q_1, q_2, \ldots, q_N)$
Together, p and q span the phase space; q alone spans the configuration space.
The total energy can be split into two components:
– kinetic energy (K): $K(p) = \tfrac{1}{2} m v^2 = p^2 / (2m)$
– potential energy (V): $V(q)$ depends on the interaction(s)
The potential energy is described by
– bonded interactions (e.g. bond stretching, angle bending)
– non-bonded interactions (e.g. van der Waals, electrostatic)
Non-bonded interactions determine the conformational variation that we observe, for example, in protein motions.
14
The Hamilton Function
The Hamiltonian function represents the total energy: $H(p, q) = K(p) + V(q)$
It is the generalised expression of classical mechanics
Two differential expressions give Newton's equations of motion, but in a very elegant way:
$\dot{q}_k = \frac{\partial q_k}{\partial t} = \frac{\partial H}{\partial p_k}$
$\dot{p}_k = \frac{\partial p_k}{\partial t} = -\frac{\partial H}{\partial q_k}$
Use 'generalised coordinates' (p and q):
– can use any coordinate system, e.g. Cartesian coordinates or Euler angles
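As a short worked check of the two Hamilton expressions, take a single particle with the kinetic and potential terms defined earlier. The symbol $F(q)$ for the force is introduced here only for illustration; the steps below are a sketch, not part of the original slide.

```latex
\begin{aligned}
H(p, q) &= \frac{p^2}{2m} + V(q) \\
\dot{q} &= \frac{\partial H}{\partial p} = \frac{p}{m} \\
\dot{p} &= -\frac{\partial H}{\partial q} = -\frac{\partial V}{\partial q} \equiv F(q) \\
m\,\ddot{q} &= \dot{p} = F(q)
\end{aligned}
```

The last line is Newton's second law, which is the sense in which Hamilton's equations contain Newton's equations of motion.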
15
Hamilton's Principle
"The variation of the time integral of $p\,\dot{q} - H(p, q)$ vanishes": $\delta \int_{t_1}^{t_2} \left( p\,\dot{q} - H(p, q) \right) dt = 0$
Hamilton's principle is most fundamental
– Newton's equations of motion are only one set of equations that can be derived from Hamilton's principle.
The integral is called the 'action', meaning:
– If we integrate along the trajectory of an object in the space given by positions q and momenta p between time points (integration limits) $t_1$ and $t_2$, then the value of the integral (= the 'action') of a 'real' trajectory is a minimum (more precisely, an extremum) compared to all other trajectories.
Example: why does a thrown stone follow a parabolic trajectory?
– If you vary the trajectory and calculate the action, the parabolic trajectory will yield the smallest 'action'.
16
Harmonic oscillator: 1-dimensional motion
2 dimensions in phase space:
– position (1-dimensional)
– momentum (1-dimensional)
Analytical solution for integration:
– $q(t) = b \cos\left(\sqrt{k/m}\; t\right)$
– $p(t) = -b \sqrt{mk}\, \sin\left(\sqrt{k/m}\; t\right)$
[Figure axes: p(t), q(t)]
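A minimal sketch of the analytical solution, assuming the usual harmonic potential V(q) = ½kq² (the slide does not spell it out). The function names and the default values b = k = m = 1 are illustrative choices. It evaluates q(t) and p(t) and checks that the Hamiltonian stays constant along the trajectory.

```python
import math


def harmonic_state(t, b=1.0, k=1.0, m=1.0):
    """Analytical q(t), p(t) of a 1-D harmonic oscillator released at rest from q(0) = b."""
    omega = math.sqrt(k / m)
    q = b * math.cos(omega * t)
    p = -b * math.sqrt(m * k) * math.sin(omega * t)
    return q, p


def hamiltonian(q, p, k=1.0, m=1.0):
    """Total energy H = K(p) + V(q), with V(q) = k q^2 / 2 (assumed form)."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


# Energy at 100 points along the trajectory; every value should equal k b^2 / 2.
energies = [hamiltonian(*harmonic_state(0.1 * i)) for i in range(100)]
```

In phase space the points (q(t), p(t)) trace a closed ellipse, which is why the energy is the same at every sampled time.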
17
Calculating Averages
Integration of phase space, with 2 values per coordinate (e.g. up, down):
– 1 particle: 1 × 6 degrees of freedom (dof); $2^{6}$ = 64 points
– 2 particles: 2 × 6 dof; $2^{12}$ = 4,096 points
– 3 particles: 3 × 6 dof; $2^{18}$ = 262,144 points
– 4 particles: 4 × 6 dof; $2^{24}$ = 16,777,216 points
Need the whole of phase space?
– only low-energy states are relevant
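The growth of the grid can be reproduced with a few lines. This is only an arithmetic sketch; the function name and its defaults (2 values per coordinate, 6 degrees of freedom per particle: 3 positions and 3 momenta) are taken from the counting on the slide.

```python
def grid_points(n_particles, values_per_dof=2, dof_per_particle=6):
    """Number of phase-space grid points for a crude discretisation."""
    return values_per_dof ** (dof_per_particle * n_particles)


for n in (1, 2, 3, 4):
    print(n, "particle(s):", grid_points(n), "points")
```

Even with only two values per coordinate the count grows exponentially with the number of particles, which is why exhaustive integration over phase space is not feasible for molecular systems.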
18
Solving Complex Systems
No analytical solutions
Numerical integration:
– by time (Molecular Dynamics)
– by ensemble (Monte Carlo)
Molecular Dynamics: numerical integration in time
– Euler's approximation:
  $q(t + \Delta t) = q(t) + p(t)/m \cdot \Delta t$
  $p(t + \Delta t) = p(t) + m \cdot a(t) \cdot \Delta t$
– Verlet / Leap-frog
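A minimal sketch of Euler's approximation, applied to the 1-D harmonic oscillator as a test system. The force F = −kq, the function names and the parameter values are assumptions chosen for illustration. For this system the energy grows by a factor (1 + (k/m)Δt²) per step, which illustrates why plain Euler integration is not used in practice.

```python
def euler_step(q, p, force, m, dt):
    """One explicit Euler step: both updates use the state at time t."""
    a = force(q) / m
    q_new = q + p / m * dt
    p_new = p + m * a * dt
    return q_new, p_new


def run_euler(n_steps, dt=0.01, k=1.0, m=1.0, q0=1.0, p0=0.0):
    """Integrate a harmonic oscillator (assumed force F = -k q) with Euler steps."""
    def force(q):
        return -k * q

    q, p = q0, p0
    for _ in range(n_steps):
        q, p = euler_step(q, p, force, m, dt)
    return q, p


def energy(q, p, k=1.0, m=1.0):
    """Total energy of the harmonic oscillator."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


q_end, p_end = run_euler(1000)
print("initial energy:", energy(1.0, 0.0))
print("energy after 1000 Euler steps:", energy(q_end, p_end))
```

The systematic energy drift is the motivation for the Verlet and leap-frog schemes named on the slide.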
19
Features of Newton Dynamics
Newton's equations:
– Energy conservative
– Time reversible
– Deterministic
Numerical integration by the Verlet algorithm ('simulation'):
$r(t + \Delta t) \approx 2 r(t) - r(t - \Delta t) + \frac{F(t)}{m} \Delta t^2 \; [+ O(\Delta t^4)]$
In a 'real' simulation:
– Rounding errors (cumulative): not fully reversible, no full energy conservation
– Coupling to thermal bath (re-scaling): not fully deterministic
– 'Lyapunov' instability: trajectories diverge
20
Derivation: Verlet
Taylor expansion:
– $q(t+\Delta t) = q(t) + q'(t)\Delta t + \frac{1}{2!} q''(t)\Delta t^2 + \frac{1}{3!} q'''(t)\Delta t^3 + \ldots$
  where $q'(t) = v(t)$ (1st derivative, velocity) and $q''(t) = a(t)$ (2nd derivative, acceleration)
– Add the forward and backward expansions:
  $q(t+\Delta t) = q(t) + q'(t)\Delta t + \frac{1}{2!} q''(t)\Delta t^2 + \frac{1}{3!} q'''(t)\Delta t^3$
  $q(t-\Delta t) = q(t) - q'(t)\Delta t + \frac{1}{2!} q''(t)\Delta t^2 - \frac{1}{3!} q'''(t)\Delta t^3$
  $q(t+\Delta t) + q(t-\Delta t) = 2q(t) + 2 \cdot \frac{1}{2!} q''(t)\Delta t^2$
– Rearrange: $q(t+\Delta t) = 2q(t) - q(t-\Delta t) + a(t)\Delta t^2$
2nd order, but 3rd-order accuracy
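The rearranged Verlet formula can be sketched as follows for the 1-D harmonic oscillator. The acceleration a = −kq/m, the function name and the parameters are illustrative assumptions. The first step needs q(−Δt), which is taken here from the analytical solution.

```python
import math


def verlet_trajectory(n_steps, dt=0.01, k=1.0, m=1.0, b=1.0):
    """Positions q(0), q(dt), ... from q(t+dt) = 2 q(t) - q(t-dt) + a(t) dt^2."""
    omega = math.sqrt(k / m)
    q_prev = b * math.cos(-omega * dt)  # q(-dt), from the analytical solution
    q = b                               # q(0): released at rest from q = b
    trajectory = [q]
    for _ in range(n_steps):
        a = -k * q / m                  # harmonic acceleration (assumed force)
        q_next = 2.0 * q - q_prev + a * dt * dt
        q_prev, q = q, q_next
        trajectory.append(q)
    return trajectory


dt = 0.01
trajectory = verlet_trajectory(1000, dt=dt)
# Largest deviation from the analytical solution q(t) = cos(t) for b = k = m = 1.
max_error = max(abs(q - math.cos(i * dt)) for i, q in enumerate(trajectory))
print("largest deviation over 1000 steps:", max_error)
```

Note that the scheme needs no velocities at all: the odd-order terms cancel when the two expansions are added, so only positions and accelerations appear.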
21
What do we obtain?
Trajectory: $q(t)$ and $p(t)$
Probability of occurrence: $P(p, q) = \frac{1}{Z} e^{-H(p,q)/kT}$
Averages along the trajectory: $\langle A \rangle = \frac{1}{T} \int A(q(t), p(t))\, dt$
(where T denotes the total time, not the temperature!)
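For evenly spaced trajectory frames, the time average reduces to a plain mean over the frames. The sketch below assumes a hypothetical helper `time_average` and uses one period of the harmonic oscillator (b = k = m = 1) as an illustrative trajectory, for which the averages of q² and of the kinetic energy are known analytically (1/2 and 1/4).

```python
import math


def time_average(observable, trajectory):
    """Discrete estimate of <A> = (1/T) * integral of A(q(t), p(t)) dt for evenly spaced frames."""
    values = [observable(q, p) for q, p in trajectory]
    return sum(values) / len(values)


# One full period of the harmonic oscillator: q = cos(t), p = -sin(t).
n_frames = 1000
period = 2.0 * math.pi
trajectory = [
    (math.cos(period * i / n_frames), -math.sin(period * i / n_frames))
    for i in range(n_frames)
]

mean_q2 = time_average(lambda q, p: q * q, trajectory)
mean_kinetic = time_average(lambda q, p: 0.5 * p * p, trajectory)
print("<q^2> =", mean_q2, " <K> =", mean_kinetic)
```

In a real simulation the same mean is taken over saved frames, and its reliability depends on how much of phase space the trajectory has visited.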
22
Convergence
Amount of phase space covered
– "Sampling"
Impossible to prove: you cannot know what you don't know
Energy "landscape" in phase space
– there might be a "next valley"
23
Example: Convergence (1)
24
Example: Convergence (2)
25
Example: Convergence (3)
Apparent convergence on all timescales, 100 ps to 10 ns!
26
Efficiency
Time step limited by vibrational frequencies
– heavy-atom–hydrogen bond vibration: $10^{-14}$ s (10 fs)
– 10-20 integration steps per vibrational period: 0.5 fs time step; 2,000,000 steps for 1 ns
Removal of fast vibrations (constraining):
– hydrogen atom bond and angle motion
– heavy-atom bond motion
– out-of-plane motions (e.g. aromatic groups)
In practice: 1-2 fs time step
– 5-7 fs maximum
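The step counts follow directly from the time step. A small arithmetic sketch (the function name is hypothetical):

```python
def steps_needed(sim_time_ns, dt_fs):
    """Number of integration steps to cover sim_time_ns with a time step of dt_fs."""
    fs_per_ns = 1_000_000
    return round(sim_time_ns * fs_per_ns / dt_fs)


for dt_fs in (0.5, 1.0, 2.0):
    print(f"dt = {dt_fs} fs -> {steps_needed(1, dt_fs):,} steps per ns")
```

Going from a 0.5 fs to a 2 fs time step by constraining the fastest vibrations cuts the number of force evaluations per nanosecond by a factor of four.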
27
Constraining
to remove degrees of freedom, e.g.:
– bond i-j vibrations: keep distance i-j constant
– angle i-j-k vibrations: keep distance i-k constant
Constraint algorithms
– SHAKE: iterative adjustment of Lagrange multipliers
– LINCS: Taylor expansion of matrix inversion; non-iterative (more stable); no highly connected constraints
– SETTLE: analytical solution for symmetric 3-atom molecules (like water)
28
Improving Performance
Pairwise potential: $F_{ij} = -F_{ji}$
Potential $E(r) \approx 0$ at large r: cut-off
– Coulomb: ~ $1/r$
– Lennard-Jones: ~ $1/r^6$
Atoms move little in one step: pair-list
– Evaluating r is expensive: $r = \sqrt{|\mathbf{r}_j - \mathbf{r}_i|^2}$
Large distances change less: twin-range
– short range each step; long range less often
Multiple time-step methods
Many processor/compiler/language-specific optimizations:
– use of Fortran vs. C
– optimize cache performance: arrays of positions, velocities, forces and parameters are very large
– compiler optimizations
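Three of these ideas (pair-list, cut-off and $F_{ij} = -F_{ji}$) can be sketched together. This is a plain-Python illustration, not how a production MD code is written. The function names, the 12-6 Lennard-Jones form $V = 4\varepsilon[(\sigma/r)^{12} - (\sigma/r)^{6}]$ and the reduced-unit defaults are assumptions. Squared distances are compared throughout, so no square root is taken.

```python
def build_pair_list(positions, r_list):
    """All pairs (i, j), i < j, closer than r_list; compares squared distances."""
    r_list2 = r_list * r_list
    pairs = []
    for i in range(len(positions) - 1):
        for j in range(i + 1, len(positions)):
            d2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
            if d2 < r_list2:
                pairs.append((i, j))
    return pairs


def lj_forces(positions, pairs, r_cut, epsilon=1.0, sigma=1.0):
    """Lennard-Jones forces over a pair list, truncated at r_cut."""
    r_cut2 = r_cut * r_cut
    forces = [[0.0, 0.0, 0.0] for _ in positions]
    for i, j in pairs:
        d = [a - b for a, b in zip(positions[i], positions[j])]  # r_i - r_j
        r2 = sum(c * c for c in d)
        if r2 >= r_cut2:
            continue  # beyond the cut-off: interaction neglected
        s6 = (sigma * sigma / r2) ** 3
        scale = 24.0 * epsilon * (2.0 * s6 * s6 - s6) / r2
        for c in range(3):
            forces[i][c] += scale * d[c]  # force on i
            forces[j][c] -= scale * d[c]  # F_ji = -F_ij: computed once, applied twice
    return forces


positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
pairs = build_pair_list(positions, r_list=3.0)
forces = lj_forces(positions, pairs, r_cut=2.5)
```

The pair list is built with a slightly larger radius than the cut-off so that it can be reused for several steps while atoms move only a little.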
29
Ignoring Degrees of Freedom
Internal:
– bonds, angles → constraint algorithm: larger time steps
External:
– "Solvent" → Langevin dynamics: fewer (explicit) particles
– Inertia & "solvent" → Brownian dynamics: larger time steps
30
Trajectory on Energy Surface
31
Sampling in Conformational Space
Most of the computational time is spent on calculating (local, harmonic) vibrations.
[Figure labels: entropy, energy, E >> kT, vibration]
32
Barriers
Kitao et al. (1998) Proteins 33, 496-517.
33
Psychology of Theorists
[Optimist scale, from 100% down to 0%]
– "In theory, there should be no difference between theory and practice. In practice, however, there is always a difference..." (Witten and Frank)
– "For every complex question there is a simple and wrong solution." (Albert Einstein)
– "All models are wrong, but some are useful." (George Box)
34
Monte Carlo Sampling
Ergodic hypothesis:
– Sampling over time (Molecular Dynamics approach); and
– Ensemble averaging (Monte Carlo approach)
yield the same result: $\langle A(r) \rangle_{\text{time}} = \langle A(r) \rangle_{NVE}$
Detailed balance condition: $p(o)\, \pi(o \to n) = p(n)\, \pi(n \to o)$
(o = old state, n = new state, $\pi$ = transition probability)
35
Metropolis Selection Scheme
Metropolis acceptance rule that satisfies detailed balance:
– $\mathrm{acc}(o \to n) = p(n)/p(o) = e^{-\Delta E/kT}$ if $p(n) < p(o)$
– $\mathrm{acc}(o \to n) = 1$ if $p(n) \ge p(o)$
Metropolis Monte Carlo: ergodic
Probability density for configurations around $r^N$:
$p(r^N) = \frac{e^{-E(r^N)/kT}}{\int e^{-E(r^N)/kT}\, dr^N}$
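A minimal sketch of the Metropolis acceptance rule, applied to a single coordinate in a harmonic potential E(x) = ½kx². The test system, the function names, the move size and the random seed are all illustrative assumptions. Since p(n) ≥ p(o) is equivalent to ΔE ≤ 0, the rule is written in terms of the energy difference.

```python
import math
import random


def metropolis_accept_prob(delta_e, kT):
    """acc(o -> n): 1 for downhill moves, exp(-dE/kT) for uphill moves."""
    return 1.0 if delta_e <= 0.0 else math.exp(-delta_e / kT)


def sample_harmonic(n_steps, kT=1.0, k=1.0, max_move=1.5, seed=0):
    """Metropolis Monte Carlo sampling of x in E(x) = k x^2 / 2 (assumed test system)."""
    rng = random.Random(seed)
    x = 0.0
    e_old = 0.5 * k * x * x
    samples = []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-max_move, max_move)  # symmetric trial move
        e_new = 0.5 * k * x_new * x_new
        if rng.random() < metropolis_accept_prob(e_new - e_old, kT):
            x, e_old = x_new, e_new                   # accept the new state
        samples.append(x)                             # a rejected move counts the old state again
    return samples


samples = sample_harmonic(100_000)
mean_x2 = sum(x * x for x in samples) / len(samples)
print("<x^2> from sampling:", mean_x2, "(Boltzmann value kT/k = 1)")
```

Only energy differences enter the rule, so the normalising integral in the denominator of $p(r^N)$ never has to be evaluated.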
36
Search Strategies
37
Leaps
38
Computational Scheme
Reduction of the leaps will lead to classical dynamics
Control parameter:
– RMSD
– Angle deviation
39
Computational Load: Solvation
Most computational time (>95%) is spent on calculating (bulk) water-water interactions
40
Implicit Solvation
41
POPS
Solvent accessible area
– fast and accurate area calculation
– resolution: POPS-A (per atom), POPS-R (per residue)
– parametrised on 120,000 atoms and 12,000 residues
– differentiable → usable in MD
Free energy of solvation: $\Delta G_{\text{solv}}^{i} = \text{area}_i \cdot \sigma_i$
POPS is implemented in GROMOS96
Parameters 'sigma' from simulations in water:
– amino acids in helix, sheet and extended conformation
– peptides in helix and sheet conformation
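The per-atom (or per-residue) contributions add up to a total solvation free energy. The sketch below shows only that sum. The function name is hypothetical and the numbers in the usage lines are made-up placeholders, not POPS parameters or POPS output.

```python
def solvation_free_energy(areas, sigmas):
    """Sum of area_i * sigma_i over all atoms (or residues)."""
    if len(areas) != len(sigmas):
        raise ValueError("need one sigma parameter per area")
    return sum(area * sigma for area, sigma in zip(areas, sigmas))


# Placeholder values for illustration only (not real POPS areas or sigma parameters).
example_areas = [10.0, 20.0]
example_sigmas = [0.5, -0.25]
total = solvation_free_energy(example_areas, example_sigmas)
```

Because the area is a differentiable function of the coordinates, the same expression also yields forces, which is what makes the model usable in MD.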
42
POPS server
43
Test molecules: alanine dipeptide
44
Test molecules: BPTI / Y35G-BPTI
[Figure panels: classical MD, leap-dynamics, essential dynamics]
45
Calmodulin domains
Apparent unfolding temperatures (CD)
– C-domain: 315 K (42 °C)
– N-domain: 328 K (55 °C)
LD simulations:
– 3 ns
– 4 trajectories
[Figure labels: 290 K, 325 K, 360 K]
46
Snapshots
47
Trajectories
48
Example: Protein & Ligand Dynamics
49
Example: Essential Dynamics Analysis
Cyt-P450 BM3
7 × 10 ns "free" MD simulations
50
CD
51
Comparison CD / simulation
52
Example: Minima
53
Example: Conformations
54
Levinthal's paradox
Protein folding problem:
– Predict the 3D structure from the sequence
– Understand the folding process
55
Folding energy
Each protein conformation has a certain energy and a certain flexibility (entropy)
It corresponds to a point on a multidimensional free energy surface
– may have higher energy but lower free energy than another conformation
Three coordinates per atom: 3N−6 dimensions possible
$\Delta G = \Delta H - T \Delta S$
[Figure axes: energy E(x) vs. coordinate x]
56
Folded state
Native state = lowest point on the free energy landscape
Many possible routes
Many possible local minima (misfolded structures)
57
Molten globule
First step: hydrophobic collapse
Molten globule: globular structure, not yet correctly folded
Local minimum on the free energy surface
58
Force Field
"the collection of all forces that we consider to occur in a mechanical atomic system"
A generalised description: $E_{\text{total}} = E_{\text{bonded}} + E_{\text{non-bonded}} + E_{\text{crossterm}}$
Crossterms:
– non-bonded interactions influence the bonded interactions (and vice versa).
– Some force fields neglect those terms.
Note that force fields are (mostly) designed for pairwise atom interactions.
– Higher-order interactions are implicitly included in the pairwise interaction parameters.
59
Force Field Components: Bonded Interactions
60
Force Field Components: Non-Bonded Interactions
61
All Together…
62
Reduced Units
Generalise the description of (atomic) systems
– express all quantities in basic units derived from the system's dimensions
For example, a Lennard-Jones interaction: $V_{LJ} = \varepsilon\, f(r/\sigma)$
– $\varepsilon$ is the characteristic interaction energy; $\sigma$ is the characteristic interaction distance
Choose basic units:
– unit of length, $\sigma$
– unit of energy, $\varepsilon$
– unit of mass, m (mass of the atoms in the system)
All other units can be derived from these, e.g.:
– time: $\sigma \sqrt{m/\varepsilon}$
– temperature: $\varepsilon / k_B$
(from: Frenkel and Smit, 'Understanding Molecular Simulation', Academic Press.)
Other choices, e.g. 'MD' units:
– length nm ($10^{-9}$ m), mass u, time ps ($10^{-12}$ s), charge e, temperature K
– energy kJ mol$^{-1}$, velocity nm ps$^{-1}$, pressure kJ mol$^{-1}$ nm$^{-3}$
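As a worked example of deriving units from σ, ε and m, the sketch below converts the reduced time and temperature units to SI. The argon-like Lennard-Jones parameters (σ = 0.3405 nm, ε/k_B = 119.8 K, m = 39.948 u) are commonly quoted textbook values and are an assumption here, not taken from the slides. The function name is hypothetical.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
AMU = 1.66053907e-27  # atomic mass unit, kg


def reduced_time_unit(sigma_m, epsilon_j, mass_kg):
    """Unit of time in reduced units: sigma * sqrt(m / epsilon), in seconds."""
    return sigma_m * math.sqrt(mass_kg / epsilon_j)


# Assumed argon-like parameters (illustrative, not from the slides).
sigma = 0.3405e-9        # m
epsilon = 119.8 * K_B    # J
mass = 39.948 * AMU      # kg

tau = reduced_time_unit(sigma, epsilon, mass)
temperature_unit = epsilon / K_B
print(f"time unit: {tau * 1e12:.2f} ps, temperature unit: {temperature_unit:.1f} K")
```

With these parameters one reduced time unit is about 2 ps, so a reduced time step of 0.005 corresponds to roughly 10 fs for this model.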
63
Main points