Markov Chain Monte Carlo


Kevin Stevenson
AST 4762/5765

What is MCMC?
- A random sampling algorithm
- Estimates model parameters and their uncertainties
- Samples only regions of high probability, rather than sampling uniformly
  - Faster
  - More efficient
- The sampled region is called "phase space"

Phase Space
- The space in which all possible states of a system are represented
- Each point corresponds to one unique state
- Every parameter (or degree of freedom) is represented by an axis
- E.g. 3 position coordinates (x, y, z) require a 3-dimensional phase space
- E.g. adding time produces a 4-D phase space
- Can be represented very easily in Python using arrays, as sketched below
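A minimal sketch of that representation with NumPy (the variable names are illustrative, not from the slides):

```python
import numpy as np

# One state in a 4-D phase space: (x, y, z, t)
state = np.array([1.0, -0.5, 2.3, 0.0])

# A chain of 10,000 states is just a 2-D array;
# each row is one point in phase space
chain = np.zeros((10000, 4))
chain[0] = state
```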

Markov Chain
- A stochastic (or random) process having the Markov property
- The future is indeterminate; its evolution is described by probability distributions
- "Given the present state, future states are independent of the past states"
- In other words:
  - At a given step, the system has a set of parameters that define its state
  - At the next step, the system might change states or remain in the same state, according to a certain probability
  - Each prospective step is determined ONLY by the current state (no memory of the past)
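As a concrete illustration (not from the slides), simulating a two-state Markov chain needs nothing more than its transition probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)

# Transition matrix: row i gives P(next state | current state = i)
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

state = 0
chain = [state]
for _ in range(1000):
    # The next state depends ONLY on the current state
    state = rng.choice(2, p=P[state])
    chain.append(state)
```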

Example: Random Walk
- Consider a drunk standing under a lamppost, trying to get home
- He takes a step in a random direction (N, E, S, W), each having equal probability
- Having forgotten his previous step, he again takes a step in a random direction
- This forms a Markov chain (simulated in the sketch below)
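A sketch of the walk on a 2-D grid:

```python
import numpy as np

rng = np.random.default_rng(1)

# N, E, S, W as unit steps on a grid
steps = np.array([[0, 1], [1, 0], [0, -1], [-1, 0]])

position = np.zeros(2, dtype=int)  # start at the lamppost
path = [position.copy()]
for _ in range(100):
    # Each step has equal probability and is independent of the past
    position = position + steps[rng.integers(4)]
    path.append(position.copy())
```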

Random Walk Methods
- Metropolis-Hastings algorithm
  - Vary all parameters simultaneously
  - Accept each step with a certain probability
- Gibbs sampling
  - A special (usually faster) case of M-H
  - Hold all parameters constant, except one
  - Vary that parameter to find the best fit
  - Choose the next parameter and repeat (see the sketch after this list)
- Slice sampling
- Multiple-try Metropolis
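As an illustration of the Gibbs idea (not from the slides), here is a sketch for a bivariate normal target with an assumed correlation rho, where each one-parameter conditional is itself a normal that can be sampled exactly:

```python
import numpy as np

rng = np.random.default_rng(2)
rho = 0.8                      # assumed correlation of the bivariate normal target
cond_sd = np.sqrt(1 - rho**2)  # standard deviation of each conditional

x, y = 0.0, 0.0
samples = np.empty((5000, 2))
for i in range(len(samples)):
    # Hold y fixed, draw x from its conditional p(x | y) = N(rho*y, 1 - rho^2)
    x = rng.normal(rho * y, cond_sd)
    # Hold x fixed, draw y from its conditional p(y | x) = N(rho*x, 1 - rho^2)
    y = rng.normal(rho * x, cond_sd)
    samples[i] = (x, y)
```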

Avoiding Random Walk
- May want the stepper to avoid doubling back on itself
  - Faster convergence
  - Harder to implement
- Methods:
  - Successive over-relaxation: a variation on Gibbs sampling
  - Hybrid (Hamiltonian) Monte Carlo: introduces momentum

Metropolis-Hastings Algorithm
- Goal: estimate model parameters and their uncertainties
- The M-H algorithm generates a sequence of samples from a probability distribution that is difficult to sample from directly
  - The distribution may not be Gaussian
  - We may not know the distribution's form at all
- How does it generate this set?

Preferential Probability
- Want to visit a point x with probability proportional to some given distribution function, π(x)
  - Called the "probability distribution" or "target density"
  - Preferentially samples where π(x) is large
- Probability distribution: the probability of x falling within a particular interval
- Ergodic: must, in principle, be able to reach every point in the region of interest

Let Me Propose…
- Proposal distribution (density) Q(x2; x1)
  - Depends on the current state, x1
  - Generates a new proposed sample, x2
  - Must also be ergodic
- Can be approximated by a Gaussian centered on x1
- May be symmetric: Q(x2; x1) = Q(x1; x2)
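In code, a symmetric Gaussian proposal is a one-liner; sigma is the step size tuned two slides below (a sketch, with an assumed default):

```python
import numpy as np

rng = np.random.default_rng(3)

def propose(x1, sigma=0.5):
    # Symmetric Gaussian proposal centered on the current state x1:
    # the density of proposing x2 from x1 equals that of x1 from x2
    return rng.normal(x1, sigma)
```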

Target & Proposal Densities
- P(x) = target density
- Q(x; xt) = proposal density

Don’t We All Want To Feel Accepted?
- Acceptance probability: α = [P(x2) Q(x1; x2)] / [P(x1) Q(x2; x1)]
- If α ≥ 1:
  - Accept the proposed step
  - The current state becomes x2
- If α < 1:
  - Accept the step with probability α
  - Reject the step with probability 1 − α, in which case the state remains at x1
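Putting the last few slides together, a minimal M-H sketch in Python; the two-bump target density here is an assumption made purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)

def target(x):
    # Unnormalized target density pi(x): an illustrative two-bump example
    return np.exp(-0.5 * (x - 2)**2) + np.exp(-0.5 * (x + 2)**2)

def metropolis_hastings(n_steps=10000, sigma=1.0, x=0.0):
    chain = np.empty(n_steps)
    for i in range(n_steps):
        x_new = rng.normal(x, sigma)       # symmetric Gaussian proposal
        alpha = target(x_new) / target(x)  # Q terms cancel for a symmetric proposal
        if rng.random() < alpha:           # accept with probability min(1, alpha)
            x = x_new
        chain[i] = x                       # on rejection, the state remains at x
    return chain

chain = metropolis_hastings()
```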

Not Too Hot, Not Too Cold
- Acceptance rate: the fraction of proposed steps that are accepted
- Want an acceptance rate of 30–70%
  - Too high => slow convergence
  - Too low => small sample size
- Must tune the proposal density, Q, to obtain an acceptable acceptance rate
- If Q is Gaussian, then we tune the standard deviation, σ
  - Think of σ as a step size
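Continuing the sketch above, one rough way to check the tuning: repeated values in the chain mark rejected steps, since a continuous proposal almost never repeats a value exactly.

```python
def acceptance_rate(chain):
    # Fraction of steps where the chain actually moved
    return np.mean(chain[1:] != chain[:-1])

# If the rate is above ~70%, increase sigma; below ~30%, decrease it
print(acceptance_rate(metropolis_hastings(sigma=1.0)))
```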

What Is π?

Burn-in
- The equilibrium distribution is rapidly approached from any starting position, x0
  - Proof: due to ergodicity, choosing any point as the starting point is equivalent to jumping into the equilibrium chain at that particular point in time
- Need burn-in to "forget" the starting position
- Remove AT LEAST the first 2% of the total run length
- Better yet, look at your data!
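Discarding burn-in from the sketch's chain is a single slice (the 2% figure is just the slide's rule of thumb):

```python
burn = int(0.02 * len(chain))  # at least the first 2% of the run
posterior = chain[burn:]
```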

After The Fire
- The remaining set of states represents a sample from the distribution π(x)
- Compute the mean (or median) and standard deviation of each parameter in your set
- Plug those parameters into the model
- DONE!!!
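For the 1-D sketch above, that final step is:

```python
estimate = np.mean(posterior)    # or np.median(posterior)
uncertainty = np.std(posterior)
print(f"parameter = {estimate:.3f} +/- {uncertainty:.3f}")
```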