1 The Monte Carlo method
Nathan Baker (baker@biochem.wustl.edu)
BME 540

2 What is Monte Carlo?
A method to evaluate integrals. Popular in all areas of science, economics, sociology, etc. Involves:
- Random sampling of the domain
- Evaluation of the function
- Acceptance/rejection of points

3 Why Monte Carlo?
The domain of integration is very complex:
- Find the integral of a complicated solvent-solute energy function outside the boundary of a protein.
- Find the volume of an object from a tomographic data set.
The integrals are very high-dimensional:
- A partition function or free energy for a protein
- Analyzing metabolic networks
- Predicting trends in the stock market

4 Basic Monte Carlo Algorithm
Suppose we want to approximate an integral I = ∫ f(x) dx over a high-dimensional space.
For i = 1 to n:
- Pick a point x_i at random
- Accept or reject the point based on a criterion
- If accepted, add f(x_i) to the total sum
Error estimates are "free" if you also accumulate the sum of squares. The error typically decays as n^(-1/2) (see the sketch below).
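
A minimal sketch of this loop in Python (the function names and the test integrand are illustrative, not from the lecture):

```python
import numpy as np

def mc_integrate(f, lo, hi, n=100_000, rng=None):
    """Estimate the integral of f over the box [lo, hi]^d by uniform sampling."""
    rng = rng or np.random.default_rng(0)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    volume = np.prod(hi - lo)
    x = rng.uniform(lo, hi, size=(n, lo.size))  # n random points in the box
    fx = f(x)
    # The "free" error estimate: the sample variance gives a standard
    # error that decays as n^(-1/2)
    return volume * fx.mean(), volume * fx.std(ddof=1) / np.sqrt(n)

# Example: integrate x^2 + y^2 + z^2 over the unit cube (exact value: 1)
est, err = mc_integrate(lambda x: (x**2).sum(axis=1), [0, 0, 0], [1, 1, 1])
print(f"{est:.4f} ± {err:.4f}")
```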

5 Example: the area of a circle
- Sample points randomly from a square surrounding a circle of radius 5
- 10,000 sample points
- Acceptance criterion: the point lies inside the circle
- Actual area: πr² = 25π ≈ 78.54
- Calculated area: (fraction of points accepted) × (area of the square); see the sketch below
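
A sketch of this example (the seed and variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(42)
r, n = 5.0, 10_000

# Sample points uniformly from the square [-r, r] x [-r, r]
pts = rng.uniform(-r, r, size=(n, 2))

# Acceptance criterion: the point lies inside the circle
inside = (pts**2).sum(axis=1) <= r**2

# Calculated area = (fraction accepted) x (area of the square)
area = inside.mean() * (2 * r) ** 2
print(f"calculated area = {area:.2f}, actual area = {np.pi * r**2:.2f}")
```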

6 Example: more complicated shapes

7 Example: multiple dimensions
What is the average of a variable over an N-dimensional probability distribution? Two approaches:
Quadrature:
- Discretize each dimension into a set of n points
- Possibly use adaptivity to guide the discretization
- For a reasonably smooth function, the error decreases as a power of 1/n per dimension, but the total number of points grows as n^N, so the cost of a given accuracy grows exponentially with dimension
Monte Carlo:
- Sample m points from the space
- Possibly weight the sampling based on a reference function
- Error decreases as m^(-1/2), independent of the dimension

8 Problems: sampling tails of distributions
We want to:
- Integrate a sharply-peaked function
- Use Monte Carlo with uniformly-distributed random numbers
What happens?
- Very few points contribute to the integral (~9%)
- Poor computational efficiency/convergence
Can we ignore the tails? NO!
Solution: use a different distribution.

9 Improved sampling: change of variables
One way to improve sampling is to change variables:
- The new distribution is flatter
- Uniform variates become more useful
Advantages:
- Simplicity
- Very useful for generating distributions of non-uniform variates (coming up)
Disadvantages:
- Most useful for invertible functions

10 Change of variables: method
- Given an integral I = ∫ f(x) dx
- Transform variables: substitute y = g(x)
- Choose the transformation so that the new integrand is (nearly) flat
- Integrate in the transformed variable
(The identity is sketched below.)
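
As a sketch of the step this slide describes, in standard notation (g is an assumed monotone transformation; the specific functions from the lecture are not preserved in the transcript):

```latex
I \;=\; \int_a^b f(x)\,dx
  \;=\; \int_{g(a)}^{g(b)} f\!\left(g^{-1}(y)\right)\,
        \left(g^{-1}\right)'(y)\,dy,
\qquad y = g(x).
```

Choosing g so that f(g⁻¹(y)) (g⁻¹)′(y) is nearly constant makes uniform sampling in y efficient.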

11 Change of variables application: exponential
- Given an integral of a decaying exponential, e.g. I = ∫ e^(−x) dx
- Transform variables: for example, let u = 1 − e^(−x), so du = e^(−x) dx
- This choice gives a (nearly) flat distribution: the transformed integrand is constant
- Integrate: I = ∫ du over the transformed domain

12 Change of variables application: exponential
Before transformation:
- 0.5% of points in the domain contribute to the integral
- Slow convergence

13 Change of variables example: exponential
Before transformation:
- 0.5% of points in the domain contribute to the integral
- Slow convergence
After transformation:
- All of the points in the domain contribute
- Rapid (exact) convergence
In practice: the speed-up is best when the inversion is nearly exact (see the sketch below).
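
A minimal sketch of the before/after comparison, assuming the integral is I = ∫₀¹⁰ e^(−x) dx (the lecture's exact integrand and domain are assumptions here):

```python
import numpy as np

rng = np.random.default_rng(1)
n, xmax = 10_000, 10.0
exact = 1.0 - np.exp(-xmax)              # I = ∫_0^xmax e^{-x} dx

# Before: uniform points in x; most land in the tail where e^{-x} ≈ 0
x = rng.uniform(0.0, xmax, n)
naive = xmax * np.exp(-x).mean()

# After: u = 1 - e^{-x} gives du = e^{-x} dx, so I = ∫_0^{exact} 1 du.
# The transformed integrand is identically 1: every point contributes,
# and the estimate is exact up to floating-point error.
u = rng.uniform(0.0, exact, n)
transformed = exact * np.ones_like(u).mean()

print(f"exact={exact:.6f}  naive={naive:.6f}  transformed={transformed:.6f}")
```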

14 Importance sampling
Functions aren't always invertible. Is there another way to improve sampling of the "important" regions of the function?
- Find flat distributions
- Bias the sampling
Find a function g that is almost proportional to the integrand f, then rewrite your integral as a "weighted" integral: ∫ f(x) dx = ∫ [f(x)/g(x)] g(x) dx, i.e., sample x from g and average the weight f/g.

15 Importance sampling example: a lumpy Gaussian
- Our original integrand is a "lumpy" Gaussian
- This is close to a (pure) Gaussian
- Therefore: sample random numbers from the Gaussian (we'll talk about how to do this later)
- Evaluate the weighted integrand f/g over the random number distribution (see the sketch below)
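
A sketch with an assumed integrand (the f below is a stand-in "lumpy" Gaussian, not necessarily the one from the lecture), using the standard normal density as the reference g:

```python
import numpy as np

rng = np.random.default_rng(7)

def f(x):
    # Hypothetical "lumpy" Gaussian integrand (stand-in for the slide's)
    return np.exp(-0.5 * x**2) * (1.0 + 0.1 * np.sin(5 * x) ** 2)

def g(x):
    # Reference density: standard normal, easy to sample from
    return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)

n = 100_000
x = rng.standard_normal(n)         # sample from g
weights = f(x) / g(x)              # importance weights f/g
estimate = weights.mean()
stderr = weights.std(ddof=1) / np.sqrt(n)
print(f"integral ≈ {estimate:.4f} ± {stderr:.4f}")
```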

16 Importance sampling example: a lumpy Gaussian
Convergence is pretty good (actual value …)

17 Evolution of Monte Carlo methods so far…
- Uniform points and the original integrand… but this had very poor efficiency…
- Uniform points and a transformed integrand… but this only worked for certain integrands…
- Non-uniform points and a scaled integrand… but this is very cumbersome for complicated integrands…
Now, we try Markov chain approaches…

18 Markov chains
Properties:
- A sequence of randomly-chosen states
- The probability of transitions between states is independent of history
- The entire chain represents a stationary probability distribution
Examples:
- Random number generators
- Brownian motion
- Hidden Markov Models
- (Perfectly) encrypted data

19 Detailed balance
What sorts of Markov chains reach stationary distributions? This is the same question as posed for the Liouville equation…
- Equilibrium processes have stationary distributions
- What does it mean to be at equilibrium? Reversibility
- Detailed balance relates stationary probabilities to transitions (stated below):
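
In standard notation (π the stationary distribution, T the transition probability), the detailed balance condition is:

```latex
\pi(x)\,T(x \to y) \;=\; \pi(y)\,T(y \to x)
\qquad \text{for all states } x, y.
```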

20 Detailed balance: example
- A Markov chain with a Gaussian stationary probability distribution
- Detailed balance satisfied

21 Detailed balance: example
- A Markov chain with a Gaussian stationary probability distribution, EXCEPT: steps to the right are 1% more favorable
- Detailed balance violated
- When both states are unfavorable, a 1% bias makes a big difference!

22 Markov chain Monte Carlo
Assembling the entire distribution for MC is usually hard:
- Complicated energy landscapes
- High-dimensional systems
- Extraordinarily difficult normalization
Solution: build up the distribution from a Markov chain
- Choose local transition probabilities that generate the distribution of interest (i.e., ensure detailed balance)
- Each random variable is chosen based on the previous variable in the chain
- "Walk" along the Markov chain until convergence is reached
Result: normalization is not required, and the calculations are local

23 Application: microcanonical ensemble
Consider a particle in an N-dimensional box. Monte Carlo algorithm for the microcanonical ensemble:
- Sample states (sets of quantum numbers) at random
- Determine the number of states with energy ≤ E
- Relate to the microcanonical partition function by differentiation
(A sketch follows.)
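
One plausible sketch of this count (for a particle in a box, E ∝ Σᵢ nᵢ², so the algorithm reduces to counting lattice points inside a hypersphere octant; the cutoff below is an arbitrary illustration, not a value from the lecture):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 9                  # dimensions (slide 24 shows a 9-D box)
R2 = 400.0             # energy cutoff, in units where E = sum(n_i^2)
n_max = int(np.sqrt(R2)) + 1
samples = 100_000

# Sample quantum numbers uniformly from {1, ..., n_max}^N and count
# the fraction with energy <= R2; scale by the total lattice volume.
n = rng.integers(1, n_max + 1, size=(samples, N))
frac = ((n**2).sum(axis=1) <= R2).mean()
states = frac * n_max**N
print(f"estimated number of states with E <= cutoff: {states:.3g}")
```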

24 Application: microcanonical ensemble
(Figure: 9-dimensional particle in a box)

25 Markov chain Monte Carlo: flavors
- Molecular simulation: Metropolis
- Bayesian inference: Gibbs
- Hidden Markov Models: Viterbi
- …etc.…

26 Application: stochastic transitions
This is a very simplistic version of kinetic Monte Carlo…
- Each Monte Carlo step corresponds to a time interval
- The probability of moving between the states A ⇌ B in that time interval is related to the rate constants
- Simulate to give mean first passage times, transient populations, etc. (a sketch follows)
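
A minimal sketch of this fixed-time-step scheme for the A → B transition (the rate constant and time step are illustrative assumptions): each step advances time by dt, and the jump fires with probability k·dt.

```python
import numpy as np

rng = np.random.default_rng(5)
k_AB, dt = 0.02, 1.0                 # illustrative rate constant and time step

def first_passage_time(max_steps=10**6):
    """Time for one walker to jump A -> B, with probability k_AB*dt per step."""
    for step in range(1, max_steps + 1):
        if rng.random() < k_AB * dt:
            return step * dt
    return np.nan

times = [first_passage_time() for _ in range(1000)]
print(f"mean first passage time A->B: {np.nanmean(times):.1f}"
      f" (theory: {1 / k_AB:.1f})")
```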

27 Metropolis Monte Carlo
- Start with the detailed balance condition
- Derive an "acceptance ratio" condition
- Choose a particular acceptance ratio
(The three steps are written out below.)
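
In standard notation, with a symmetric trial move and acceptance probability A, the three steps are:

```latex
% 1. Detailed balance:
\pi(x)\,A(x \to y) \;=\; \pi(y)\,A(y \to x)
% 2. The acceptance-ratio condition that follows:
\frac{A(x \to y)}{A(y \to x)} \;=\; \frac{\pi(y)}{\pi(x)}
% 3. Metropolis' particular choice satisfying it:
A(x \to y) \;=\; \min\!\left(1,\; \frac{\pi(y)}{\pi(x)}\right)
```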

28 Application: canonical ensemble
Our un-normalized probabilities look like Boltzmann factors: π(x) ∝ exp(−E(x)/kT). Our acceptance ratio is therefore A(x → y) = min{1, exp(−[E(y) − E(x)]/kT)}.

29 Algorithm: NVT Metropolis Monte Carlo
Metropolis MC in a harmonic potential (a sketch follows below).
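
A minimal sketch of the algorithm for one coordinate in a harmonic potential E(x) = ½kx² (the step size and units are illustrative; kT = 1):

```python
import numpy as np

rng = np.random.default_rng(11)
k, kT, step, n_steps = 1.0, 1.0, 1.0, 100_000

def energy(x):
    return 0.5 * k * x**2                      # harmonic potential

x, samples, accepted = 0.0, [], 0
for _ in range(n_steps):
    x_new = x + rng.uniform(-step, step)       # symmetric trial move
    dE = energy(x_new) - energy(x)
    # Metropolis criterion: accept with probability min(1, e^{-dE/kT})
    if dE <= 0 or rng.random() < np.exp(-dE / kT):
        x, accepted = x_new, accepted + 1
    samples.append(x)

samples = np.array(samples)
print(f"acceptance rate: {accepted / n_steps:.2f}")
print(f"<x^2> = {np.mean(samples**2):.3f} (theory: kT/k = {kT / k:.3f})")
```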

30 Advantages of Metropolis MC simulations
Does not require forces:
- Rapidly-changing energy functions
- No differentiation required
Amenable to complex move sets:
- Torsions
- Rotamers
- Tautomers
- Etc.

31 Monte Carlo "machinery"
Boundary conditions:
- Finite
- Periodic
Interactions:
- Complete
- Truncated
How do we choose "moves"?

32 Monte Carlo moves
Trial moves:
- Rigid body translation
- Rigid body rotation
- Internal conformational changes (soft vs. stiff modes)
- Titration/electronic states
Questions:
- How "big" a move should we take?
- Move one particle or many?

33 Monte Carlo moves
How "big" a move should we take?
- Smaller moves: better acceptance rate, slower sampling
- Bigger moves: faster sampling, poorer acceptance rate
- Amortize: maximize mean squared displacement per unit of CPU time
- "Rules of thumb": target a moderate acceptance rate (figures in the 30-50% range are commonly quoted)
Move one particle or many?
- It is possible to achieve more efficient sampling with correct multi-particle moves
- One-particle moves must choose particles at random

34 Random variates
Don't develop your own uniform variate method:
- Look for a long period, lack of bias, etc.
- These properties are usually fulfilled by random(), drand48(), etc.
- I like the "Mersenne twister"
Use uniform variates as the starting point for other distributions; you can use all the methods we've described:
- Standard sampling (rejection method)
- Transformations
- Metropolis (watch out for correlations)
(A short sketch follows.)
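
For example, CPython's random module is a Mersenne twister, and NumPy ships a separately vetted default generator; a short sketch of using either as the uniform-variate source:

```python
import random
import numpy as np

# CPython's random module is a Mersenne twister (long period, well tested)
random.seed(2024)
u = random.random()                     # uniform variate in [0, 1)

# NumPy's default_rng (PCG64) is another well-tested generator
rng = np.random.default_rng(2024)
uniforms = rng.random(5)                # starting point for other distributions
exponentials = -np.log(1.0 - uniforms)  # transform: inverse CDF of Exp(1)
print(u, exponentials)
```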

35 Random variates: transformations
- Useful for functions with easy inverses
- Sample a uniform variate and transform via the inverse to give the particular variates
- Tougher distributions require rejection with respect to an "easy" distribution (sketched below)…
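
A minimal rejection sketch with an assumed target (p(x) ∝ sin²(x) e^(−x) is illustrative, not from the lecture); the envelope is the easy Exp(1) distribution, and since p/q = sin²(x) ≤ 1 the acceptance probability is simply sin²(x):

```python
import numpy as np

rng = np.random.default_rng(9)

def sample_target(n):
    """Rejection sampling: target p(x) ∝ sin^2(x) e^{-x}, envelope Exp(1)."""
    out = []
    while len(out) < n:
        x = rng.exponential(1.0)        # easy distribution (inverse transform)
        # Accept with probability p(x) / (c * q(x)) = sin^2(x), since c = 1
        if rng.random() < np.sin(x) ** 2:
            out.append(x)
    return np.array(out)

xs = sample_target(10_000)
print(f"mean of accepted samples: {xs.mean():.3f}")
```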

36 Problem: poor convergence
- Error in Monte Carlo decreases as N^(-1/2)
- What if you want to reduce the error to 1% of its current value? Since the error scales as N^(-1/2), that takes 10,000 times as many points.
- This is due to the "filling behavior" of uniformly-distributed random points
- It affects nearly all areas of MC simulations
Convergence can also be difficult to detect:
- Observe the decrease in the variance of the data
- Employ error estimates
- Sub-sample the data

37 Quasi-random sequences
Compare to the "pseudo-random" methods we've discussed so far. Can we find a sub-random sequence with low internal correlations that fills space better?
Solution: a maximally-avoiding set of points. Number-theoretic methods (sketched below):
- Base transformations
- Primes
- Polynomials
Convergence can reach N^(-1) (Sobol sequence). Caveats:
- How many points (resolution)?
- What about sharply-varying functions?
(Figure: Hammersley quasi-random sequence)
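
A sketch of the base-transformation idea behind such sequences: the Halton construction reverses the digits of the point index in a different prime base per dimension (a standard construction, not code from the lecture):

```python
def radical_inverse(i, base):
    """Reverse the base-`base` digits of i about the radix point."""
    inv, f = 0.0, 1.0 / base
    while i > 0:
        inv += (i % base) * f
        i //= base
        f /= base
    return inv

def halton(n, primes=(2, 3)):
    """First n points of the Halton sequence in len(primes) dimensions."""
    return [[radical_inverse(i, p) for p in primes] for i in range(1, n + 1)]

for point in halton(5):
    print(point)   # low-discrepancy points in the unit square
```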

38 Application: the bootstrap method
Useful for estimating the error in fit parameters:
- Resampling of the original data
- Examination of the resulting fit parameters
Algorithm (sketched below):
- Generate fit parameters for the original data
- For (number of resamples desired):
  - Resample the data with replacement
  - Generate new fit parameters
  - Save the deviation with respect to the original fit parameters
- Analyze the deviation statistics
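
A minimal sketch of this algorithm for the slope of a straight-line fit (the data and resample count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(13)

# Synthetic data: y = 2x + noise
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + rng.normal(0.0, 0.2, x.size)

slope0, intercept0 = np.polyfit(x, y, 1)   # fit to the original data

deviations = []
for _ in range(2000):                      # number of resamples desired
    idx = rng.integers(0, x.size, x.size)  # resample with replacement
    slope, _ = np.polyfit(x[idx], y[idx], 1)
    deviations.append(slope - slope0)      # deviation from the original fit

print(f"slope = {slope0:.3f} ± {np.std(deviations):.3f} (bootstrap)")
```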

39 Summary
A method for sampling:
- Integrals
- Configuration space
- Error space
Any distribution that can be represented by transitions can be sampled. Sampling can be accelerated using various tricks.

