Markov Chain Monte Carlo


Kevin Stevenson
AST 4762/5765

What is MCMC?
- A random sampling algorithm
- Estimates model parameters and their uncertainties
- Samples only regions of high probability, rather than sampling uniformly
  - Faster
  - More efficient
- The sampled region is called "phase space"

Phase Space
- The space in which all possible states of a system are represented
- Each point corresponds to one unique state
- Every parameter (or degree of freedom) is represented by an axis
- E.g. 3 position coordinates (x, y, z) require a 3-dimensional phase space
- E.g. adding time produces a 4-D phase space
- Can be represented very easily in Python using arrays, as sketched below
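A minimal sketch of that representation with NumPy (the variable names are illustrative, not from the slides):

```python
import numpy as np

# One state in a 4-D phase space: (x, y, z, t)
state = np.array([1.0, -0.5, 2.3, 0.0])

# A chain of 10,000 states is just a 2-D array;
# each row is one point in phase space
chain = np.zeros((10000, 4))
chain[0] = state
```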

Markov Chain
- A stochastic (or random) process having the Markov property
- The future is indeterminate; its evolution is described by probability distributions
- "Given the present state, future states are independent of the past states"
- In other words:
  - At a given step, the system has a set of parameters that define its state
  - At the next step, the system might change states or remain in the same state, according to a certain probability
  - Each prospective step is determined ONLY by the current state (no memory of the past)
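As a concrete illustration (not from the slides), simulating a two-state Markov chain needs nothing more than its transition probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)

# Transition matrix: row i gives P(next state | current state = i)
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

state = 0
chain = [state]
for _ in range(1000):
    # The next state depends ONLY on the current state
    state = rng.choice(2, p=P[state])
    chain.append(state)
```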

Example: Random Walk
- Consider a drunk standing under a lamppost, trying to get home
- He takes a step in a random direction (N, E, S, W), each having equal probability
- Having forgotten his previous step, he again takes a step in a random direction
- This forms a Markov chain (simulated in the sketch below)
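A sketch of the walk on a 2-D grid:

```python
import numpy as np

rng = np.random.default_rng(1)

# N, E, S, W as unit steps on a grid
steps = np.array([[0, 1], [1, 0], [0, -1], [-1, 0]])

position = np.zeros(2, dtype=int)  # start at the lamppost
path = [position.copy()]
for _ in range(100):
    # Each step has equal probability and is independent of the past
    position = position + steps[rng.integers(4)]
    path.append(position.copy())
```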

Random Walk Methods
- Metropolis-Hastings algorithm
  - Vary all parameters simultaneously
  - Accept each step with a certain probability
- Gibbs sampling
  - A special (usually faster) case of M-H
  - Hold all parameters constant, except one
  - Vary that parameter to find the best fit
  - Choose the next parameter and repeat (see the sketch after this list)
- Slice sampling
- Multiple-try Metropolis
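As an illustration of the Gibbs idea (not from the slides), here is a sketch for a bivariate normal target with an assumed correlation rho, where each one-parameter conditional is itself a normal that can be sampled exactly:

```python
import numpy as np

rng = np.random.default_rng(2)
rho = 0.8                      # assumed correlation of the bivariate normal target
cond_sd = np.sqrt(1 - rho**2)  # standard deviation of each conditional

x, y = 0.0, 0.0
samples = np.empty((5000, 2))
for i in range(len(samples)):
    # Hold y fixed, draw x from its conditional p(x | y) = N(rho*y, 1 - rho^2)
    x = rng.normal(rho * y, cond_sd)
    # Hold x fixed, draw y from its conditional p(y | x) = N(rho*x, 1 - rho^2)
    y = rng.normal(rho * x, cond_sd)
    samples[i] = (x, y)
```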

Avoiding Random Walk
- May want the stepper to avoid doubling back on itself
  - Faster convergence
  - Harder to implement
- Methods:
  - Successive over-relaxation: a variation on Gibbs sampling
  - Hybrid (Hamiltonian) Monte Carlo: introduces momentum

Metropolis-Hastings Algorithm
- Goal: estimate model parameters and their uncertainties
- The M-H algorithm generates a sequence of samples from a probability distribution that is difficult to sample from directly
  - The distribution may not be Gaussian
  - We may not know the distribution's form at all
- How does it generate this set?

Preferential Probability
- Want to visit a point x with probability proportional to some given distribution function, π(x)
  - Called the "probability distribution" or "target density"
  - Preferentially samples where π(x) is large
- Probability distribution: the probability of x falling within a particular interval
- Ergodic: must, in principle, be able to reach every point in the region of interest

Let Me Propose…
- Proposal distribution (density) Q(x2; x1)
  - Depends on the current state, x1
  - Generates a new proposed sample, x2
  - Must also be ergodic
- Can be approximated by a Gaussian centered on x1
- May be symmetric: Q(x2; x1) = Q(x1; x2)
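In code, a symmetric Gaussian proposal is a one-liner; sigma is the step size tuned two slides below (a sketch, with an assumed default):

```python
import numpy as np

rng = np.random.default_rng(3)

def propose(x1, sigma=0.5):
    # Symmetric Gaussian proposal centered on the current state x1:
    # the density of proposing x2 from x1 equals that of x1 from x2
    return rng.normal(x1, sigma)
```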

Target & Proposal Densities
- P(x) = target density
- Q(x; xt) = proposal density

Don’t We All Want To Feel Accepted?
- Acceptance probability: α = [P(x2) Q(x1; x2)] / [P(x1) Q(x2; x1)]
- If α ≥ 1:
  - Accept the proposed step
  - The current state becomes x2
- If α < 1:
  - Accept the step with probability α
  - Reject the step with probability 1 − α, in which case the state remains at x1
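Putting the last few slides together, a minimal M-H sketch in Python; the two-bump target density here is an assumption made purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)

def target(x):
    # Unnormalized target density pi(x): an illustrative two-bump example
    return np.exp(-0.5 * (x - 2)**2) + np.exp(-0.5 * (x + 2)**2)

def metropolis_hastings(n_steps=10000, sigma=1.0, x=0.0):
    chain = np.empty(n_steps)
    for i in range(n_steps):
        x_new = rng.normal(x, sigma)       # symmetric Gaussian proposal
        alpha = target(x_new) / target(x)  # Q terms cancel for a symmetric proposal
        if rng.random() < alpha:           # accept with probability min(1, alpha)
            x = x_new
        chain[i] = x                       # on rejection, the state remains at x
    return chain

chain = metropolis_hastings()
```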

Not Too Hot, Not Too Cold
- Acceptance rate: the fraction of proposed steps that are accepted
- Want an acceptance rate of 30–70%
  - Too high => slow convergence
  - Too low => small sample size
- Must tune the proposal density, Q, to obtain an acceptable acceptance rate
- If Q is Gaussian, then we tune the standard deviation, σ
  - Think of σ as a step size
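Continuing the sketch above, one rough way to check the tuning: repeated values in the chain mark rejected steps, since a continuous proposal almost never repeats a value exactly.

```python
def acceptance_rate(chain):
    # Fraction of steps where the chain actually moved
    return np.mean(chain[1:] != chain[:-1])

# If the rate is above ~70%, increase sigma; below ~30%, decrease it
print(acceptance_rate(metropolis_hastings(sigma=1.0)))
```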

What Is π?

Burn-in
- The equilibrium distribution is rapidly approached from any starting position, x0
  - Proof: due to ergodicity, choosing any point as the starting point is equivalent to jumping into the equilibrium chain at that particular point in time
- Need burn-in to "forget" the starting position
- Remove AT LEAST the first 2% of the total run length
- Better yet, look at your data!
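Discarding burn-in from the sketch's chain is a single slice (the 2% figure is just the slide's rule of thumb):

```python
burn = int(0.02 * len(chain))  # at least the first 2% of the run
posterior = chain[burn:]
```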

After The Fire
- The remaining set of states represents a sample from the distribution π(x)
- Compute the mean (or median) and standard deviation of each parameter in your set
- Plug those parameters into the model
- DONE!!!
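For the 1-D sketch above, that final step is:

```python
estimate = np.mean(posterior)    # or np.median(posterior)
uncertainty = np.std(posterior)
print(f"parameter = {estimate:.3f} +/- {uncertainty:.3f}")
```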