Published by Leslie Turner. Modified over 9 years ago.
1
Kevin Stevenson AST 4762/5765
2
What is MCMC?
- A random-sampling algorithm
- Estimates model parameters and their uncertainties
- Samples only regions of high probability rather than sampling uniformly
  - Faster
  - More efficient
- The sampled region is called "phase space"
3
Phase Space
- The space in which all possible states of a system are represented
- Each state corresponds to one unique point
- Every parameter (or degree of freedom) is represented by an axis
  - E.g. 3 position coordinates (x, y, z) require a 3-dimensional phase space
  - E.g. add time to produce a 4-D phase space
- Can be represented very easily in Python using arrays
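The remark above about Python arrays can be sketched with NumPy (a minimal sketch; the array names and sizes are illustrative, not from the slides):

```python
import numpy as np

# A single state: 3 position coordinates (x, y, z) = one point in phase space
state = np.array([1.0, -2.5, 0.3])

# A chain of 1000 states in a 4-D phase space (x, y, z, t):
# each row is one unique point in phase space
chain = np.zeros((1000, 4))
chain[0] = [0.0, 0.0, 0.0, 0.0]  # the starting state
```

Each axis of phase space simply becomes one column of the array.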
4
Markov Chain
- A stochastic (random) process having the Markov property
  - The future is indeterminate; its evolution is described by probability distributions
- "Given the present state, future states are independent of the past states"
- In other words:
  - At a given step, the system has a set of parameters that define its state
  - At the next step, the system might change states or remain in the same state, according to a certain probability
  - Each prospective step is determined ONLY by the current state (no memory of the past)
5
Example: Random Walk
- Consider a drunk standing under a lamppost, trying to get home
- He takes a step in a random direction (N, E, S, W), each having equal probability
- Having forgotten his previous step, he again takes a step in a random direction
- This forms a Markov chain: each step depends only on his current position
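The drunkard's walk above can be sketched in a few lines of Python (the function name and step count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# N, E, S, W unit steps, each chosen with equal probability
steps = np.array([[0, 1], [1, 0], [0, -1], [-1, 0]])

def random_walk(n_steps):
    """2-D random walk: each step depends only on the current position."""
    position = np.zeros(2)
    path = [position.copy()]
    for _ in range(n_steps):
        position += steps[rng.integers(4)]  # direction chosen with no memory
        path.append(position.copy())
    return np.array(path)

path = random_walk(100)
```

Because the next position is drawn using only the current position, the sequence of positions satisfies the Markov property.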
6
Random Walk Methods
- Metropolis-Hastings algorithm
  - Vary all parameters simultaneously
  - Accept each step with a certain probability
- Gibbs sampling
  - A special (usually faster) case of M-H
  - Hold all parameters constant except one
  - Vary that parameter to find the best fit
  - Choose the next parameter and repeat
- Slice sampling
- Multiple-try Metropolis
7
Avoiding Random Walk
- May want the stepper to avoid doubling back
  - Faster convergence
  - Harder to implement
- Methods:
  - Successive over-relaxation: a variation on Gibbs sampling
  - Hybrid (Hamiltonian) Monte Carlo: introduces momentum
8
Metropolis-Hastings Algorithm
- Goal: estimate model parameters and their uncertainties
- The M-H algorithm generates a sequence of samples from a probability distribution that is difficult to sample from directly
  - The distribution may not be Gaussian
  - We may not know the distribution at all
- How does it generate this set?
9
Preferential Probability
- Want to visit a point x with a probability proportional to some given distribution function, π(x)
  - Called the "probability distribution" or "target density"
  - Preferentially samples where π(x) is large
- Probability distribution: the probability of x falling within a particular interval
- Ergodic: must, in principle, be able to reach every point in the region of interest
10
Let Me Propose…
- Proposal distribution/density, Q:
  - Depends on the current state, x1
  - Generates a new proposed sample, x2
  - Must also be ergodic
- Can be approximated by a Gaussian centered on x1
- May be symmetric: Q(x1, x2) = Q(x2, x1)
11
Target & Proposal Densities
- P(x) = target density
- Q(x, x_t) = proposal density
12
Don’t We All Want To Feel Accepted?
- Acceptance probability: α = [P(x2) Q(x2, x1)] / [P(x1) Q(x1, x2)]
- If α ≥ 1: accept the proposed step
  - The current state becomes x2
- If α < 1: accept the step with probability α
  - Reject the step with probability 1 − α, in which case the state remains at x1
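The acceptance rule above, together with the Gaussian proposal from the earlier slides, gives a minimal Metropolis-Hastings sampler. This is a sketch, not the course's code: the 1-D Gaussian target and all names are illustrative assumptions, and a symmetric proposal is assumed so that α reduces to P(x2)/P(x1):

```python
import numpy as np

rng = np.random.default_rng(42)

def target(x):
    """Unnormalized target density pi(x): a 1-D Gaussian (illustrative choice)."""
    return np.exp(-0.5 * x**2)

def metropolis_hastings(n_steps, x0=0.0, sigma=1.0):
    """Random-walk M-H with a symmetric Gaussian proposal Q centered on x1."""
    chain = np.empty(n_steps)
    x1 = x0
    for i in range(n_steps):
        x2 = x1 + rng.normal(0.0, sigma)        # propose from Q centered on x1
        alpha = target(x2) / target(x1)         # acceptance probability (Q cancels)
        if alpha >= 1 or rng.random() < alpha:  # accept with probability min(1, alpha)
            x1 = x2                             # state becomes x2
        chain[i] = x1                           # on rejection, state remains x1
    return chain

chain = metropolis_hastings(20000)
```

Note that the normalization of π(x) cancels in the ratio, which is exactly why M-H can sample distributions known only up to a constant.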
13
Not Too Hot, Not Too Cold
- Acceptance rate: the fraction of accepted steps
- Want an acceptance rate of 30 – 70%
  - Too high => slow convergence
  - Too low => small sample size
- Must tune the proposal density, Q, to obtain an acceptable acceptance rate
  - If Q is Gaussian, then we tune the standard deviation, σ
  - Think of σ as a step size
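The effect of σ on the acceptance rate can be checked empirically. A hedged sketch (the Gaussian target, step count, and σ values are illustrative assumptions): a tiny σ accepts nearly every step, a huge σ rejects most of them:

```python
import numpy as np

rng = np.random.default_rng(1)

def target(x):
    """Illustrative 1-D Gaussian target density."""
    return np.exp(-0.5 * x**2)

def acceptance_rate(sigma, n_steps=5000):
    """Run a short random-walk M-H chain; return the fraction of accepted steps."""
    x, accepted = 0.0, 0
    for _ in range(n_steps):
        proposal = x + rng.normal(0.0, sigma)
        if rng.random() < min(1.0, target(proposal) / target(x)):
            x = proposal
            accepted += 1
    return accepted / n_steps

# Step size controls the acceptance rate: tune sigma toward the 30-70% window
rates = {s: acceptance_rate(s) for s in (0.1, 1.0, 10.0)}
```

In practice one adjusts σ until the measured rate falls in the desired window.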
14
What Is π?
15
Where to Start
- Some starting positions are better than others
- The equilibrium distribution is rapidly approached from any starting position, x0
  - Why: due to ergodicity, choosing any point as the starting point is equivalent to jumping into the equilibrium chain at that particular point in time
- Suggestions for choosing a starting point:
  - Best or mean parameters from a previous run
  - A least-squares fit (e.g. scipy.optimize)
  - Several starting locations from the corners of phase space
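The least-squares suggestion above can be sketched with scipy.optimize.curve_fit (the linear model and synthetic data here are illustrative stand-ins for a real dataset):

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(3)

def model(t, a, b):
    """Illustrative model: a straight line with slope a and intercept b."""
    return a * t + b

# Synthetic noisy data (stand-in for real observations)
t = np.linspace(0.0, 10.0, 50)
y = model(t, 2.0, 1.0) + rng.normal(0.0, 0.1, t.size)

# The least-squares best fit makes a sensible MCMC starting position x0
popt, _ = curve_fit(model, t, y)
```

The fitted parameters `popt` can then seed the Markov chain, shortening the burn-in.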
16
Infinite Iterations
- How long do you run MCMC?
- As the number of iterations -> ∞, the algorithm converges to a precise value
  - This is NOT the true value, but the best apparent value for the dataset
- Run MCMC long enough to:
  - Forget the initial conditions (burn-in)
  - Characterize your distribution
  - Make the error in your parameter mean smaller than the observed dispersion in your Markov chain
17
Burn-in
- Need burn-in to "forget" the starting position
- Remove AT LEAST the first 2% of the total run length
- Better yet, look at your data!
18
Through The Fire And Flames
- The remaining set of states represents a sample from the distribution π(x)
- Compute the mean (or median) and error of each parameter in your set
- Use every m-th step for computations and histograms, where m should be longer than the correlation timescale between steps
  - m ~ 10 – 100
- The relation between apparent and true values is indicated by the width of the distribution
  - Plot a histogram to see the shape
  - Fit a Gaussian to determine the width and, hence, the error
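The burn-in, thinning, and error-estimation steps above can be sketched as follows (a sketch on synthetic data: in practice `raw_chain` would be the M-H output; the 2% cut and m = 10 follow the slides):

```python
import numpy as np

rng = np.random.default_rng(7)

# Stand-in for a raw chain (in practice, the output of the M-H run)
raw_chain = rng.normal(5.0, 0.3, size=50000)

burn = int(0.02 * raw_chain.size)  # discard at least the first 2% (burn-in)
m = 10                             # thinning interval > correlation timescale
samples = raw_chain[burn::m]       # keep every m-th post-burn-in step

best = np.median(samples)            # apparent parameter value
error = np.std(samples, ddof=1)      # width of the distribution -> 1-sigma error

hist, edges = np.histogram(samples, bins=50)  # shape of the distribution
```

Fitting a Gaussian to `hist` would recover the same width that `np.std` reports when the distribution is close to normal.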
19
Now Here We Stand
- Recap:
  - Chose our proposal distribution, initial parameters, and number of iterations
  - Ran MCMC and removed the burn-in portion
  - Determined the mean/median of the apparent values
  - Computed their errors
- What's next?
  - Plug those parameters into the model
  - Analyze your results (do science!!!)