Download presentation
Presentation is loading. Please wait.
1
Lecture 10 Outline Monte Carlo methods Monte Carlo methods History of methods History of methods Sequential random number generators Sequential random number generators Parallel random number generators Parallel random number generators Generating non-uniform random numbers Generating non-uniform random numbers Monte Carlo case studies Monte Carlo case studies
2
Monte Carlo Methods Monte Carlo is another name for statistical sampling methods of great importance to physics and computer science Monte Carlo is another name for statistical sampling methods of great importance to physics and computer science Applications of Monte Carlo Method Applications of Monte Carlo Method Evaluating integrals of arbitrary functions of 6+ dimensions Predicting future values of stocks Solving partial differential equations Sharpening satellite images Modeling cell populations Finding approximate solutions to NP-hard problems
3
In 1738, Swiss physicist and mathematician Daniel Bernoulli published Hydrodynamica which laid the basis for the kinetic theory of gases: great numbers of molecules moving in all directions, that their impact on a surface causes the gas pressure that we feel, and that what we experience as heat is simply the kinetic energy of their motion.Daniel Bernoullikinetic theory of gasesheat In 1859, Scottish physicist James Clerk Maxwell formulated the distribution of molecular velocities, which gave the proportion of molecules having a certain velocity in a specific range. This was the first-ever statistical law in physics. Maxwell used a simple thought experiment: particles must move independent of any chosen coordinates, hence the only possible distribution of velocities must be normal in each coordinate.James Clerk Maxwell distribution In 1864, Ludwig Boltzmann, a young student in Vienna, came across Maxwell’s paper and was so inspired by it that he spent much of his long, distinguished, and tortured life developing the subject further.Ludwig Boltzmann An Interesting History
4
History of Monte Carlo Method Credit for inventing the Monte Carlo method is shared by Stanislaw Ulam, John von Neuman and Nicholas Metropolis. Credit for inventing the Monte Carlo method is shared by Stanislaw Ulam, John von Neuman and Nicholas Metropolis. Ulam, a Polish born mathematician, worked for John von Neumann on the Manhattan Project. Ulam is known for designing the hydrogen bomb with Edward Teller in 1951. In a thought experiment he conceived of the MC method in 1946 while pondering the probabilities of winning a card game of solitaire. Ulam, a Polish born mathematician, worked for John von Neumann on the Manhattan Project. Ulam is known for designing the hydrogen bomb with Edward Teller in 1951. In a thought experiment he conceived of the MC method in 1946 while pondering the probabilities of winning a card game of solitaire. Ulam, von Neuman, and Metropolis developed algorithms for computer implementations, as well as exploring means of transforming non-random problems into random forms that would facilitate their solution via statistical sampling. This work transformed statistical sampling from a mathematical curiosity to a formal methodology applicable to a wide variety of problems. It was Metropolis who named the new methodology after the casinos of Monte Carlo. Ulam and Metropolis published a paper called “The Monte Carlo Method” in Journal of the American Statistical Association, 44 (247), 335- 341, in 1949. Ulam, von Neuman, and Metropolis developed algorithms for computer implementations, as well as exploring means of transforming non-random problems into random forms that would facilitate their solution via statistical sampling. This work transformed statistical sampling from a mathematical curiosity to a formal methodology applicable to a wide variety of problems. It was Metropolis who named the new methodology after the casinos of Monte Carlo. Ulam and Metropolis published a paper called “The Monte Carlo Method” in Journal of the American Statistical Association, 44 (247), 335- 341, in 1949.
5
Solving Integration Problems via Statistical Sampling: Monte Carlo Approximation How to evaluate integral of f(x)? How to evaluate integral of f(x)?
6
Integration Approximation Can approximate using another function g(x) Can approximate using another function g(x)
7
Integration Approximation Can approximate by taking the average or expected value Can approximate by taking the average or expected value
8
Integration Approximation Estimate the average by taking N samples Estimate the average by taking N samples
9
Monte Carlo Integration I m = Monte Carlo estimate I m = Monte Carlo estimate N = number of samples N = number of samples x 1, x 2, …, x N are uniformly distributed random numbers between a and b x 1, x 2, …, x N are uniformly distributed random numbers between a and b
10
Monte Carlo Integration
11
We have the definition of expected value and how to estimate it. We have the definition of expected value and how to estimate it. Since the expected value can be expressed as an integral, the integral is also approximated by the sum. Since the expected value can be expressed as an integral, the integral is also approximated by the sum. To simplify the integral, we can substitute g(x) = f(x)p(x). To simplify the integral, we can substitute g(x) = f(x)p(x).
12
Variance The variance describes how much the sampled values vary from each other. The variance describes how much the sampled values vary from each other. Variance proportional to 1/N Variance proportional to 1/N
13
Variance Standard Deviation is just the square root of the variance Standard Deviation is just the square root of the variance Standard Deviation proportional to 1 / sqrt(N) Standard Deviation proportional to 1 / sqrt(N) Need 4X samples to halve the error Need 4X samples to halve the error
14
Variance Problem: Problem: Variance (noise) decreases slowly Using more samples only removes a small amount of noise
15
Variance Reduction There are several ways to reduce the variance There are several ways to reduce the variance Importance Sampling Stratified Sampling Quasi-random Sampling Metropolis Random Mutations
16
Importance Sampling Idea: use more samples in important regions of the function Idea: use more samples in important regions of the function If function is high in small areas, use more samples there If function is high in small areas, use more samples there
17
Importance Sampling Want g/p to have low variance Want g/p to have low variance Choose a good function p similar to g: Choose a good function p similar to g:
18
Stratified Sampling Partition S into smaller domains S i Partition S into smaller domains S i Evaluate integral as sum of integrals over S i Evaluate integral as sum of integrals over S i Example: jittering for pixel sampling Example: jittering for pixel sampling Often works much better than importance sampling in practice Often works much better than importance sampling in practice
19
Parallelism in Monte Carlo Methods Monte Carlo methods often amenable to parallelism Monte Carlo methods often amenable to parallelism Find an estimate about p times faster Find an estimate about p times faster OR OR Reduce error of estimate by p 1/2 Reduce error of estimate by p 1/2
20
Random versus Pseudo-random Virtually all computers have “random number” generators Virtually all computers have “random number” generators Their operation is deterministic Their operation is deterministic Sequences are predictable Sequences are predictable More accurately called “pseudo-random number” generators More accurately called “pseudo-random number” generators In this chapter “random” is shorthand for “pseudo- random” In this chapter “random” is shorthand for “pseudo- random” “RNG” means “random number generator” “RNG” means “random number generator”
21
Properties of an Ideal RNG Uniformly distributed Uniformly distributed Uncorrelated Uncorrelated Never cycles Never cycles Satisfies any statistical test for randomness Satisfies any statistical test for randomness Reproducible Reproducible Machine-independent Machine-independent Changing “seed” value changes sequence Changing “seed” value changes sequence Easily split into independent subsequences Easily split into independent subsequences Fast Fast Limited memory requirements Limited memory requirements
22
No RNG Is Ideal Finite precision arithmetic finite number of states cycles Finite precision arithmetic finite number of states cycles Period = length of cycle If period > number of values needed, effectively acyclic Reproducible correlations Reproducible correlations Often speed versus quality trade-offs Often speed versus quality trade-offs
23
Linear Congruential RNGs Multiplier Additive constant Modulus Sequence depends on choice of seed, X 0
24
Period of Linear Congruential RNG Maximum period is M Maximum period is M For 32-bit integers maximum period is 2 32, or about 4 billion For 32-bit integers maximum period is 2 32, or about 4 billion This is too small for modern computers This is too small for modern computers Use a generator with at least 48 bits of precision Use a generator with at least 48 bits of precision
25
Producing Floating-Point Numbers X i, a, c, and M are all integers X i, a, c, and M are all integers X i s range in value from 0 to M-1 X i s range in value from 0 to M-1 To produce floating-point numbers in range [0, 1), divide X i by M To produce floating-point numbers in range [0, 1), divide X i by M
26
Defects of Linear Congruential RNGs Least significant bits correlated Least significant bits correlated Especially when M is a power of 2 k-tuples of random numbers form a lattice k-tuples of random numbers form a lattice Points tend to lie on hyperplanes Especially pronounced when k is large
27
Lagged Fibonacci RNGs p and q are lags, p > q p and q are lags, p > q * is any binary arithmetic operation * is any binary arithmetic operation Addition modulo M Addition modulo M Subtraction modulo M Subtraction modulo M Multiplication modulo M Multiplication modulo M Bitwise exclusive or Bitwise exclusive or
28
Properties of Lagged Fibonacci RNGs Require p seed values Require p seed values Careful selection of seed values, p, and q can result in very long periods and good randomness Careful selection of seed values, p, and q can result in very long periods and good randomness For example, suppose M has b bits For example, suppose M has b bits Maximum period for additive lagged Fibonacci RNG is (2 p -1)2 b-1 Maximum period for additive lagged Fibonacci RNG is (2 p -1)2 b-1
29
Ideal Parallel RNGs All properties of sequential RNGs All properties of sequential RNGs No correlations among numbers in different sequences No correlations among numbers in different sequences Scalability Scalability Locality Locality
30
Parallel RNG Designs Manager-worker Manager-worker Leapfrog Leapfrog Sequence splitting Sequence splitting Independent sequences Independent sequences
31
Manager-Worker Parallel RNG Manager process generates random numbers Manager process generates random numbers Worker processes consume them Worker processes consume them If algorithm is synchronous, may achieve goal of consistency If algorithm is synchronous, may achieve goal of consistency Not scalable Not scalable Does not exhibit locality Does not exhibit locality
32
Leapfrog Method Process with rank 1 of 4 processes
33
Properties of Leapfrog Method Easy modify linear congruential RNG to support jumping by p Easy modify linear congruential RNG to support jumping by p Can allow parallel program to generate same tuples as sequential program Can allow parallel program to generate same tuples as sequential program Does not support dynamic creation of new random number streams Does not support dynamic creation of new random number streams
34
Sequence Splitting Process with rank 1 of 4 processes
35
Properties of Sequence Splitting Forces each process to move ahead to its starting point Forces each process to move ahead to its starting point Does not support goal of reproducibility Does not support goal of reproducibility May run into long-range correlation problems May run into long-range correlation problems Can be modified to support dynamic creation of new sequences Can be modified to support dynamic creation of new sequences
36
Independent Sequences Run sequential RNG on each process Run sequential RNG on each process Start each with different seed(s) or other parameters Start each with different seed(s) or other parameters Example: linear congruential RNGs with different additive constants Example: linear congruential RNGs with different additive constants Works well with lagged Fibonacci RNGs Works well with lagged Fibonacci RNGs Supports goals of locality and scalability Supports goals of locality and scalability
37
Statistical Simulation: Metropolis Algorithm Metropolis algorithm. [Metropolis, Rosenbluth, Rosenbluth, Teller, Teller 1953] Metropolis algorithm. [Metropolis, Rosenbluth, Rosenbluth, Teller, Teller 1953] Simulate behavior of a physical system according to principles of statistical mechanics. Globally biased toward "downhill" lower-energy steps, but occasionally makes "uphill" steps to break out of local minima. Gibbs-Boltzmann function. The probability of finding a physical system in a state with energy E is proportional to e -E / (kT), where T > 0 is temperature and k is a constant. Gibbs-Boltzmann function. The probability of finding a physical system in a state with energy E is proportional to e -E / (kT), where T > 0 is temperature and k is a constant. For any temperature T > 0, function is monotone decreasing function of energy E. System more likely to be in a lower energy state than higher one. T large: high and low energy states have roughly same probability T small: low energy states are much more probable
38
Metropolis algorithm. Metropolis algorithm. Given a fixed temperature T, maintain current state S. Randomly perturb current state S to new state S' N(S). If E(S') E(S), update current state to S' Otherwise, update current state to S' with probability e - E / (kT), where E = E(S') - E(S) > 0. Convergence Theorem. Let f S (t) be fraction of first t steps in which simulation is in state S. Then, assuming some technical conditions, with probability 1: Convergence Theorem. Let f S (t) be fraction of first t steps in which simulation is in state S. Then, assuming some technical conditions, with probability 1: Intuition. Simulation spends roughly the right amount of time in each state, according to Gibbs-Boltzmann equation. Intuition. Simulation spends roughly the right amount of time in each state, according to Gibbs-Boltzmann equation.
39
Simulated Annealing Simulated annealing. Simulated annealing. T large probability of accepting an uphill move is large. T small uphill moves are almost never accepted. Idea: turn knob to control T. Cooling schedule: T = T(i) at iteration i. Physical analog. Physical analog. Take solid and raise it to high temperature, we do not expect it to maintain a nice crystal structure. Take a molten solid and freeze it very abruptly, we do not expect to get a perfect crystal either. Annealing: cool material gradually from high temperature, allowing it to reach equilibrium at succession of intermediate lower temperatures.
40
Other Distributions Analytical transformations Analytical transformations Box-Muller Transformation Box-Muller Transformation Rejection method Rejection method
41
Analytical Transformation -probability density function f(x) -cumulative distribution F(x) In theory of probability, a quantile function of a distribution is the inverse of its cumulative distribution function.theory of probabilitydistribution inversecumulative distribution function
42
Exponential Distribution: An exponential distribution arises naturally when modeling the time between independent events that happen at a constant average rate and are memoryless. One of the few cases where the quartile function is known analytically. 1.0
43
Example 1: Produce four samples from an exponential distribution with mean 3 Produce four samples from an exponential distribution with mean 3 Uniform sample: 0.540, 0.619, 0.452, 0.095 Uniform sample: 0.540, 0.619, 0.452, 0.095 Take natural log of each value and multiply by -3 Take natural log of each value and multiply by -3 Exponential sample: 1.850, 1.440, 2.317, 7.072 Exponential sample: 1.850, 1.440, 2.317, 7.072
44
Example 2: Simulation advances in time steps of 1 second Simulation advances in time steps of 1 second Probability of an event happening is from an exponential distribution with mean 5 seconds Probability of an event happening is from an exponential distribution with mean 5 seconds What is probability that event will happen in next second? What is probability that event will happen in next second? F(x=1/5) =1 - exp(-1/5)) = 0.181269247 F(x=1/5) =1 - exp(-1/5)) = 0.181269247 Use uniform random number to test for occurrence of event (if u < 0.181 then ‘event’ else ‘no event’) Use uniform random number to test for occurrence of event (if u < 0.181 then ‘event’ else ‘no event’)
45
Normal Distributions: Box-Muller Transformation Cannot invert cumulative distribution function to produce formula yielding random numbers from normal (gaussian) distribution Cannot invert cumulative distribution function to produce formula yielding random numbers from normal (gaussian) distribution Box-Muller transformation produces a pair of standard normal deviates g 1 and g 2 from a pair of normal deviates u 1 and u 2 Box-Muller transformation produces a pair of standard normal deviates g 1 and g 2 from a pair of normal deviates u 1 and u 2
46
Box-Muller Transformation repeat v 1 2u 1 - 1 v 2 2u 2 - 1 r v 1 2 + v 2 2 until r > 0 and r < 1 f sqrt (-2 ln r /r ) g 1 f v 1 g 2 f v 2 This is a consequence of the fact that the chi- square distribution with two degrees of freedom is an easily-generated exponential random variable.
47
Example Produce four samples from a normal distribution with mean 0 and standard deviation 1 Produce four samples from a normal distribution with mean 0 and standard deviation 1 u1u1u1u1 u2u2u2u2 v1v1v1v1 v2v2v2v2rf g1g1g1g1 g2g2g2g2 0.2340.784-0.5320.5680.6051.290-0.6860.732 0.8240.0390.648-0.9211.269 0.4300.176-0.140-0.6480.4391.935-0.271-1.254
48
Rejection Method
49
Example Generate random variables from this probability density function Generate random variables from this probability density function
50
Example (cont.) So h(x) f(x) for all x
51
Example (cont.) xixixixi uiuiuiui u i h(x i ) f(x i ) Outcome 0.8600.9750.6890.681Reject 1.5180.3570.2520.448Accept 0.3570.9200.6500.349Reject 1.3060.2720.1920.523Accept Two samples from f(x) are 1.518 and 1.306
52
Case Studies (Topics Introduced) Temperature inside a 2-D plate (Random walk) Temperature inside a 2-D plate (Random walk) Two-dimensional Ising model (Metropolis algorithm) Two-dimensional Ising model (Metropolis algorithm) Room assignment problem (Simulated annealing) Room assignment problem (Simulated annealing) Parking garage (Monte Carlo time) Parking garage (Monte Carlo time) Traffic circle (Simulating queues) Traffic circle (Simulating queues)
53
Temperature Inside a 2-D Plate Random walk
54
Example of Random Walk 13121322
55
NP-Hard Assignment Problems TSP:Find a tour of US cities that minimizes distance.
56
Physical Annealing Heat a solid until it melts Heat a solid until it melts Cool slowly to allow material to reach state of minimum energy Cool slowly to allow material to reach state of minimum energy Produces strong, defect-free crystal with regular structure Produces strong, defect-free crystal with regular structure
57
Simulated Annealing Makes analogy between physical annealing and solving combinatorial optimization problem Makes analogy between physical annealing and solving combinatorial optimization problem Solution to problem = state of material Solution to problem = state of material Value of objective function = energy associated with state Value of objective function = energy associated with state Optimal solution = minimum energy state Optimal solution = minimum energy state
58
How Simulated Annealing Works Iterative algorithm, slowly lower T Iterative algorithm, slowly lower T Randomly change solution to create alternate solution Randomly change solution to create alternate solution Compute , the change in value of objective function Compute , the change in value of objective function If < 0, then jump to alternate solution If < 0, then jump to alternate solution Otherwise, jump to alternate solution with probability e - /T Otherwise, jump to alternate solution with probability e - /T
59
Performance of Simulated Annealing Rate of convergence depends on initial value of T and temperature change function Rate of convergence depends on initial value of T and temperature change function Geometric temperature change functions typical; e.g., T i+1 = 0.999 T i Geometric temperature change functions typical; e.g., T i+1 = 0.999 T i Not guaranteed to find optimal solution Not guaranteed to find optimal solution Same algorithm using different random number streams can converge on different solutions Same algorithm using different random number streams can converge on different solutions Opportunity for parallelism Opportunity for parallelism
60
Convergence Starting with higher initial temperature leads to more iterations before convergence
61
Parking Garage Parking garage has S stalls Parking garage has S stalls Car arrivals fit Poisson distribution with mean A: Exponentially distributed inter- arrival times Car arrivals fit Poisson distribution with mean A: Exponentially distributed inter- arrival times Stay in garage fits a normal distribution with mean M and standard deviation M/S Stay in garage fits a normal distribution with mean M and standard deviation M/S
62
Implementation Idea 101.2142.170.391.7223.1 64.2 Current Time Times Spaces Are Available 15 Car CountCars Rejected 2
63
Summary (1/3) Applications of Monte Carlo methods Applications of Monte Carlo methods Numerical integration Simulation Random number generators Random number generators Linear congruential Lagged Fibonacci
64
Summary (2/3) Parallel random number generators Parallel random number generators Manager/worker Leapfrog Sequence splitting Independent sequences Non-uniform distributions Non-uniform distributions Analytical transformations Box-Muller transformation (Rejection method)
65
Summary (3/3) Concepts revealed in case studies Concepts revealed in case studies Monte Carlo time Random walk Metropolis algorithm Simulated annealing Modeling queues
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.