Importance sampling for MC simulation

Importance sampling for MC simulation (“importance-weighted random walk”)

Sampling points from a uniform distribution may not be the best strategy for MC. When most of the weight of the integral comes from a small range of x where f(x) is large, sampling more often in that region increases the accuracy of the MC estimate.
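The identity behind this (standard importance sampling, stated here for completeness): for any normalized weight w(x) > 0 on [a, b],

  I = ∫_a^b f(x) dx = ∫_a^b [f(x)/w(x)] w(x) dx ≈ (1/M) Σ_{i=1..M} f(x_i)/w(x_i),  with the x_i drawn from w(x).

Regions where w(x) is large are visited more often, and each visit is compensated by the factor 1/w(x_i).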

Example: importance of a small region: measuring the depth of the Nile. [Figure: systematic quadrature or uniform sampling over the whole map vs. importance sampling (an importance-weighted random walk) concentrated along the river. From Frenkel and Smit, Understanding Molecular Simulation.]

Example: importance of a small region: the energy funnel in protein folding. Cyrus Levinthal formulated the “Levinthal paradox” (late 1960s): Consider a protein molecule composed of (only) 100 residues, each of which can assume (only) 3 different conformations. The number of possible structures of this protein is 3^100 ≈ 5×10^47. Assume that it takes (only) 100 fs to convert from one structure to another. It would then require 5×10^34 s ≈ 1.6×10^27 years to “systematically” explore all possibilities. This time wildly disagrees with the actual folding time (μs~ms): Levinthal’s paradox.

Decreasing the error of an MC simulation is costly: the statistical error decreases only as

  error ≈ σ_f / √M,  with σ_f² = ⟨f²⟩ − ⟨f⟩²,

where σ_f is the standard deviation of the observable O = f(x) itself; it measures how much f(x) deviates from its average over the integration region. σ_f is independent of the number of trials M and can be estimated from a single simulation. Since the error falls only as 1/√M, halving it requires four times as many samples.

[Figure: two cases over [a, b]. A flat function f(x) ≈ ⟨f⟩ gives a sharp probability distribution p(O) (the ideal case, σ_f = 0); a fluctuating, varying function gives a broad distribution (the non-ideal, real case, σ_f > 0). Importance sampling aims to move from the second case toward the first.]
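As a concrete illustration, the one-simulation error estimate can be obtained by accumulating running sums of f and f². A minimal sketch, using the Lab 3 integrand exp(−x²) and the C library rand() in place of the course’s ran3() so the example is self-contained:

  #include <math.h>
  #include <stdio.h>
  #include <stdlib.h>

  /* Estimate I = int_0^1 exp(-x*x) dx by uniform sampling and
     report the one-simulation error estimate sigma_f / sqrt(M). */
  int main(void)
  {
      long M = 1000000;
      double sum = 0.0, sum2 = 0.0;

      for (long i = 0; i < M; ++i) {
          double x = rand() / (RAND_MAX + 1.0);  /* uniform in [0,1) */
          double f = exp(-x * x);
          sum  += f;
          sum2 += f * f;
      }
      double mean  = sum / M;                 /* <f> */
      double var   = sum2 / M - mean * mean;  /* sigma_f^2 = <f^2> - <f>^2 */
      double error = sqrt(var / M);           /* falls only as 1/sqrt(M) */
      printf("I ~ %g +/- %g\n", mean, error);
      return 0;
  }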

Importance sampling for MC simulation: example. Going from N-step uniform sampling to normalized importance sampling weighted by w(x) gains roughly a factor of 3 in accuracy for the same number of steps.
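A hedged sketch of such a comparison: the snippet below estimates the same integral with the normalized linear weight w(x) = (4 − 2x)/3, which roughly follows exp(−x²) on [0,1]. This particular w(x) is an assumption chosen for illustration, not necessarily the weight used on the original slide; x is drawn from w by inverting the CDF F(x) = (4x − x²)/3.

  #include <math.h>
  #include <stdio.h>
  #include <stdlib.h>

  /* Importance sampling of I = int_0^1 exp(-x*x) dx with the
     (assumed, illustrative) normalized weight w(x) = (4 - 2x)/3.
     Solving F(x) = (4x - x^2)/3 = r gives x = 2 - sqrt(4 - 3r). */
  int main(void)
  {
      long M = 1000000;
      double sum = 0.0, sum2 = 0.0;

      for (long i = 0; i < M; ++i) {
          double r = rand() / (RAND_MAX + 1.0);
          double x = 2.0 - sqrt(4.0 - 3.0 * r); /* x distributed as w(x) */
          double w = (4.0 - 2.0 * x) / 3.0;
          double f = exp(-x * x) / w;           /* estimator f(x)/w(x) */
          sum  += f;
          sum2 += f * f;
      }
      double mean = sum / M;
      double err  = sqrt((sum2 / M - mean * mean) / M);
      printf("I ~ %g +/- %g\n", mean, err);
      return 0;
  }

Because f(x)/w(x) varies much less over [0,1] than f(x) itself, the reported error is noticeably smaller than for the uniform-sampling version above at the same M.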

Lab 3: Importance sampling for MC simulation

What’s new in Lab 3: importance sampling

* Include “cpu.h”, compile “cpu.c”, and call cpu() to measure the CPU time of the run:

  tstart = cpu();
  /* ... the measurement loop ... */
  tend = cpu();
  printf("CPU time: %5.5f s, CPU time/measure: %g s\n",
         tend - tstart, (tend - tstart) / M);

* Calculate the normalization constant N for each probability distribution function ρ(x).

* Include “ran3.h” & “fran3.h”; compile fran3.c; call fexp() or flin() defined in fran3. The three sampling variants are:

  /* uniform sampling, rho(x) = 1 */
  r = ran3(&seed);
  x = r;                 /* the uniform random number is x itself */
  f_x = exp(-x*x);

  /* importance sampling, exponential rho(x) */
  r = ran3(&seed);
  rho = fexp(r, &x);     /* draws x, returns rho(x) */
  f_x = exp(-x*x) / rho;

  /* importance sampling, linear rho(x) */
  r = ran3(&seed);
  rho = flin(r, &x);
  f_x = exp(-x*x) / rho;

* Display histograms for the distribution of x values generated by fran3 (for step 4 only). After sampling x in each variant above, accumulate:

  i_hist = (int) (x * inv_dx);
  if (i_hist < n_hist) hist[i_hist] += 1.0;

* Display histograms for the distribution of x values generated by fran3 (full version):

  /* Number of bins in histogram */
  n_hist = 50;

  /* Allocate memory for histogram */
  hist = (double *) allocate_1d_array(n_hist, sizeof(double));

  /* Initialize histogram */
  for (i_hist = 0; i_hist < n_hist; ++i_hist)
      hist[i_hist] = 0.0;

  /* Size of the histogram bins (1.0 is the size of the interval) */
  dx = 1.0 / n_hist;
  inv_dx = 1.0 / dx;

  /* Accumulate the histogram (inside the sampling loop) */
  i_hist = (int) (x * inv_dx);
  if (i_hist < n_hist) hist[i_hist] += 1.0;

  /* Write histogram */
  fp = fopen("hist_2.dat", "w");
  for (i_hist = 0; i_hist < n_hist; ++i_hist) {
      x = (i_hist + 0.5) * dx;
      fprintf(fp, "%g %g\n", x, hist[i_hist] / M);
  }
  fclose(fp);

Plot with gnuplot, excel, origin, etc., and fit to a function. What is the resulting function?

Further reading: sampling a non-uniform & discrete probability distribution {p_i} (tower sampling). (Ref.) Gould, Tobochnik & Christian, Ch. 11.5.
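A minimal C sketch of tower sampling: build the cumulative “tower” of the p_i, draw a uniform r, and return the first level the tower exceeds. The distribution below is the loaded-die example used later in these slides; everything else is illustrative.

  #include <stdio.h>
  #include <stdlib.h>

  /* Tower sampling: return index i with probability p[i].
     P[i] = p[0] + ... + p[i]; return the first i with r < P[i]. */
  int tower_sample(const double *p, int n, double r)
  {
      double cum = 0.0;
      for (int i = 0; i < n; ++i) {
          cum += p[i];
          if (r < cum)
              return i;
      }
      return n - 1; /* guard against rounding when r ~ 1 */
  }

  int main(void)
  {
      /* The loaded-die distribution from the dice-analogy slide */
      double p[6] = {1/8., 1/2., 0., 1/8., 1/8., 1/8.};
      long count[6] = {0};

      for (long t = 0; t < 1000000; ++t) {
          double r = rand() / (RAND_MAX + 1.0);
          ++count[tower_sample(p, 6, r)];
      }
      for (int i = 0; i < 6; ++i)
          printf("face %d: frequency %g (p = %g)\n",
                 i + 1, count[i] / 1e6, p[i]);
      return 0;
  }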

Further reading: sampling a non-uniform & continuous probability distribution ρ(x).
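For the continuous case, the standard tool is the inverse-transform method: if u is uniform in [0,1) and F is the CDF of ρ, then x = F⁻¹(u) is distributed as ρ(x). The lab’s flin() presumably does something along these lines; its exact form is not shown here, so the sketch below assumes a linear density ρ(x) = 2x on [0,1] purely for illustration, with an interface mimicking flin().

  #include <math.h>

  /* Inverse-transform sampling for the (assumed) density
     rho(x) = 2x on [0,1]: F(x) = x^2, so x = sqrt(u).
     Returns the sampled x and stores rho(x) in *rho_x. */
  double sample_linear(double u, double *rho_x)
  {
      double x = sqrt(u);
      *rho_x = 2.0 * x;
      return x;
  }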

Example

Lab 3: importance sampling for MC simulation. [Figure: the integrand f(x) together with three normalized trial distributions, labeled w(x), ρ(x), and p(x): a constant distribution function (uniform), a good distribution function that follows f(x), and a bad one that does not. All three are normalized.]

Results: importance sampling for MC simulation. [Figure: one experiment with M measures, and n = 100 experiments, each with M measures.]

Importance sampling = importance-weighted average.

Analogy: throw a die with the results {1, 2, 2, 2, 2, 4, 5, 6}. The (discrete) probability distribution is {p_i, i = 1..6} = {1, 4, 0, 1, 1, 1} / 8, where 8 = 1 + 4 + 0 + 1 + 1 + 1, i.e., {1/8, 1/2, 0, 1/8, 1/8, 1/8}.

Mean value:
  <A> = 3 = 24/8
      = (1 + 2 + 2 + 2 + 2 + 4 + 5 + 6) / 8
      = (1 + 2×4 + 4 + 5 + 6) / 8
      = 1×(1/8) + 2×(4/8) + 3×(0/8) + 4×(1/8) + 5×(1/8) + 6×(1/8)

The last line is the importance-weighted average: each outcome weighted by its probability.

Beyond 1D integrals: a system of N particles in a container of volume V in contact with a thermostat at temperature T (constant NVT, the external constraint). The particles interact with each other through a potential energy U(r^N) (e.g., a sum of pair potentials); U(r^N) is the potential energy of a microstate {r^N} = {x1, y1, z1, …, xN, yN, zN}.

ρ(r^N) is the probability of finding the microstate {r^N} under the constant-NVT constraint (with β = 1/kT):

  ρ(r^N) = exp[−βU(r^N)] / Z,   or ρ_i = exp(−βU_i) / Z for discrete microstates.

The partition function Z (required for normalization) is the weighted sum over all microstates compatible with the constant-NVT condition:

  Z = ∫ dr^N exp[−βU(r^N)],   or Z = Σ_i exp(−βU_i) for discrete microstates.

The average of an observable O, <O>, over all microstates compatible with constant NVT is the ensemble average over the “canonical ensemble”:

  <O> = ∫ dr^N O(r^N) ρ(r^N) = (1/Z) ∫ dr^N O(r^N) exp[−βU(r^N)],

  or <O> = Σ_i O_i exp(−βU_i) / Z for discrete microstates.
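To make the discrete formulas concrete, the sketch below evaluates Z and <E> by brute-force enumeration for a tiny assumed model (N = 4 two-state units with energies 0 and ε, chosen only to instantiate the sums above). Such enumeration is feasible only for toy systems, which is exactly why importance-weighted sampling of microstates is needed for a real U(r^N).

  #include <math.h>
  #include <stdio.h>

  /* Brute-force canonical averages for a toy discrete system:
     N two-state units, each with energy 0 or eps (assumed model).
     Instantiates Z = sum_i exp(-beta*U_i) and
     <E> = sum_i U_i exp(-beta*U_i) / Z. */
  int main(void)
  {
      const int N = 4;
      const double eps = 1.0, beta = 1.0;  /* beta = 1/kT */
      double Z = 0.0, Esum = 0.0;

      for (int s = 0; s < (1 << N); ++s) { /* all 2^N microstates */
          int excited = 0;
          for (int j = 0; j < N; ++j)
              excited += (s >> j) & 1;
          double U = excited * eps;        /* U_i of microstate i */
          double w = exp(-beta * U);       /* Boltzmann weight */
          Z    += w;
          Esum += U * w;
      }
      printf("Z = %g, <E> = %g\n", Z, Esum / Z);
      return 0;
  }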