1 June 2, 2015 MARKOV CHAIN MONTE CARLO: A Workhorse for Modern Scientific Computation Xiao-Li Meng, Department of Statistics, Harvard University

2 June 2, 2015 Introduction

3 June 2, 2015 Applications of Monte Carlo Physics Chemistry Astronomy Biology Environment Engineering Traffic Sociology Education Psychology Arts Linguistics History Medical Science Economics Finance Management Policy Military Government Business …

4 June 2, 2015 Monte Carlo 之应用 ("Applications of Monte Carlo") A Chinese version of the previous slide: 物理 化学 天文 生物 环境 工程 交通 社会 教育 心理 人文 语言 历史 医学 经济 金融 管理 政策 军事 政府 商务 …

5 June 2, 2015 Monte Carlo Integration Suppose we want to compute $I = \int g(x) f(x)\,dx$, where f(x) is a probability density. If we have samples $x_1, \dots, x_n \sim f(x)$, we can estimate I by $\hat{I} = \frac{1}{n}\sum_{i=1}^{n} g(x_i)$.
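A minimal sketch of this estimator in Python, assuming for illustration that f is the standard normal density and g(x) = x² (so the true value of I is 1); the function names are mine, not from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_integrate(g, sampler, n=100_000):
    """Estimate I = E_f[g(X)] by averaging g over n draws from f."""
    x = sampler(n)
    return g(x).mean()

# Illustrative choice (assumption): f = N(0, 1) and g(x) = x^2, so I = Var(X) = 1.
estimate = mc_integrate(lambda x: x**2, lambda n: rng.standard_normal(n))
print(estimate)   # close to 1 for large n
```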

6 June 2, 2015 Monte Carlo Optimization We want to maximize p(x). Simulate from $f_\tau(x) \propto p^{\tau}(x)$. As $\tau \to \infty$, the simulated draws become more and more concentrated around the maximizer of p(x). (Plots shown for $\tau = 1$, $\tau = 2$, $\tau = 20$.)
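A small sketch of why raising p to a power concentrates the draws, using a one-dimensional grid so that sampling from $f_\tau$ is exact; the bimodal p below and the symbol τ are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(-3, 3, 601)
# An assumed bimodal objective p(x) with its global maximizer near x = 1.
p = np.exp(-0.5 * (grid - 1.0) ** 2) + 0.5 * np.exp(-2.0 * (grid + 1.5) ** 2)

for tau in (1, 2, 20):
    w = p ** tau
    w /= w.sum()                              # normalized p^tau on the grid
    draws = rng.choice(grid, size=5000, p=w)
    # As tau grows, the draws pile up ever more tightly around the maximizer.
    print(tau, round(draws.mean(), 2), round(draws.std(), 2))
```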

7 June 2, 2015 Simulating from a Distribution What does it mean? Suppose a random variable (随机变量) X can only take two values, with P(X = 0) = 0.25 and P(X = 1) = 0.75. Simulating from the distribution of X means that we want a collection of 0's and 1's, x_1, …, x_n, such that about 25% of them are 0's and about 75% of them are 1's when n, the simulation size, is large. The {x_i, i = 1,…,n} don't have to be independent.
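A minimal sketch in Python of producing such a collection; here the draws are taken independently, which is the simplest (but, as the slide notes, not the only) option:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = (rng.random(n) < 0.75).astype(int)   # each x_i is 1 with probability 0.75, else 0
print(x.mean())                          # roughly 0.75: about 75% of the draws are 1's
```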

8 June 2, 2015 Simulating from a Complex Distribution Continuous variable X, described by a density function f(x). "Complex" can refer to the form of f(x) or to the dimension of x.

9 June 2, 2015 Markov Chain Monte Carlo Construct a Markov chain $x^{(t+1)} = g(x^{(t)}, U^{(t+1)})$, where {U^{(t)}, t = 1, 2, …} are identically and independently distributed. Under regularity conditions, the distribution of x^{(t)} converges to f(x) as t → ∞. So we can treat {x^{(t)}, t = N_0, …, N} as an approximate sample from f(x), the stationary/limiting distribution.
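As a concrete instance of such a recursion (my example, not from the slides), an AR(1) chain has exactly the form x^{(t+1)} = g(x^{(t)}, U^{(t+1)}) and its stationary distribution is N(0, 1), so the limiting behaviour can be checked directly:

```python
import numpy as np

rng = np.random.default_rng(0)
rho, N, N0 = 0.9, 50_000, 1_000
x = np.empty(N)
x[0] = 10.0                                    # start deliberately far from the stationary mean
for t in range(1, N):
    # x^(t) = g(x^(t-1), U^(t)) with U^(t) ~ N(0, 1); the stationary law is N(0, 1)
    x[t] = rho * x[t - 1] + np.sqrt(1 - rho**2) * rng.standard_normal()

draws = x[N0:]                                 # discard the first N0 draws as burn-in
print(draws.mean(), draws.var())               # approximately 0 and 1
```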

10 June 2, 2015 Gibbs Sampler Target density: f(x_1, x_2). We know how to simulate from the conditional distributions f(x_1 | x_2) and f(x_2 | x_1). For the previous example, each conditional is a normal distribution N(μ, σ²), the familiar "bell curve".

11 June 2, 2015

12 June 2, 2015 Statistical Inference Point Estimator: the average of the retained draws, $\bar{x} = \frac{1}{N-N_0}\sum_{t=N_0+1}^{N} x^{(t)}$. Variance Estimator: an estimate $\widehat{\mathrm{Var}}(\bar{x})$ that accounts for the correlation among the draws. Interval Estimator: $\bar{x} \pm 2\sqrt{\widehat{\mathrm{Var}}(\bar{x})}$.
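The transcript does not show the slide's formulas, so here is one standard way to compute the three quantities from MCMC output, using batch means to estimate the variance of the mean in the presence of autocorrelation (a common choice, not necessarily the speaker's):

```python
import numpy as np

def mcmc_summary(draws, n_batches=50):
    """Point estimate, batch-means variance estimate, and a rough 95% interval."""
    m = len(draws) // n_batches
    batch_means = draws[: m * n_batches].reshape(n_batches, m).mean(axis=1)
    point = draws.mean()
    var_of_mean = batch_means.var(ddof=1) / n_batches   # estimates Var of the overall mean
    half_width = 2 * np.sqrt(var_of_mean)
    return point, var_of_mean, (point - half_width, point + half_width)
```

Applied to the AR(1) draws above, `mcmc_summary(draws)` gives a point estimate near 0 with an interval wide enough to reflect the chain's autocorrelation.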

13 June 2, 2015 Gibbs Sampler (k steps) Select an initial value (x_1^{(0)}, x_2^{(0)}, …, x_k^{(0)}). For t = 0, 1, 2, …, N: Step 1: Draw x_1^{(t+1)} from f(x_1 | x_2^{(t)}, x_3^{(t)}, …, x_k^{(t)}); Step 2: Draw x_2^{(t+1)} from f(x_2 | x_1^{(t+1)}, x_3^{(t)}, …, x_k^{(t)}); …; Step k: Draw x_k^{(t+1)} from f(x_k | x_1^{(t+1)}, x_2^{(t+1)}, …, x_{k-1}^{(t+1)}). Output {(x_1^{(t)}, x_2^{(t)}, …, x_k^{(t)}): t = 1, 2, …, N}. Discard the first N_0 draws and use {(x_1^{(t)}, x_2^{(t)}, …, x_k^{(t)}): t = N_0+1, …, N} as (approximate) samples from f(x_1, x_2, …, x_k).
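A minimal sketch of this scheme for k = 2, assuming for illustration a standard bivariate normal target with correlation ρ, whose full conditionals are the normal distributions mentioned on the earlier Gibbs slide:

```python
import numpy as np

rng = np.random.default_rng(0)
rho, N, N0 = 0.8, 20_000, 1_000
cond_sd = np.sqrt(1 - rho**2)             # standard deviation of each full conditional

x1, x2 = 0.0, 0.0                         # initial value (x1^(0), x2^(0))
samples = []
for t in range(N):
    x1 = rng.normal(rho * x2, cond_sd)    # Step 1: x1 | x2 ~ N(rho * x2, 1 - rho^2)
    x2 = rng.normal(rho * x1, cond_sd)    # Step 2: x2 | x1 ~ N(rho * x1, 1 - rho^2)
    samples.append((x1, x2))

samples = np.array(samples[N0:])          # discard the first N0 draws
print(samples.mean(axis=0), np.corrcoef(samples.T)[0, 1])   # near (0, 0) and near rho
```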

14 June 2, 2015 Data Augmentation We want to simulate from f(x). But f(x) is just the marginal distribution of a joint density f(x, y): $f(x) = \int f(x, y)\,dy$. So once we have simulations {(x^{(t)}, y^{(t)}): t = 1, 2, …, N} from f(x, y), we also obtain draws {x^{(t)}: t = 1, 2, …, N} from f(x).
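A small illustration of this marginalization trick (my example, not from the slides): the Student-t density is the x-marginal of a joint density f(x, y) in which y is a gamma-distributed precision, so drawing (x, y) jointly and keeping only the x's yields t-distributed draws:

```python
import numpy as np

rng = np.random.default_rng(0)
nu, N = 5, 100_000                                    # degrees of freedom, number of draws

y = rng.gamma(shape=nu / 2, scale=2 / nu, size=N)     # y^(t) ~ Gamma(nu/2, rate nu/2)
x = rng.normal(0.0, 1.0 / np.sqrt(y))                 # x^(t) | y^(t) ~ N(0, 1 / y^(t))

# Keeping only the x^(t) gives draws from the t-distribution with nu degrees of
# freedom, whose variance is nu / (nu - 2).
print(x.var(), nu / (nu - 2))
```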

15 June 2, 2015 A More Complicated Example

16 June 2, 2015 Metropolis-Hastings Algorithm Simulate from an approximate distribution q(z_1 | z_2), then: Step 0: Select z^{(0)}; now for t = 0, 1, 2, …, N, repeat; Step 1: Draw z_1 from q(z_1 | z_2 = z^{(t)}); Step 2: Calculate the acceptance probability $\alpha = \min\left\{1, \frac{f(z_1)\,q(z^{(t)} \mid z_1)}{f(z^{(t)})\,q(z_1 \mid z^{(t)})}\right\}$; Step 3: Set z^{(t+1)} = z_1 with probability α (accept) and z^{(t+1)} = z^{(t)} with probability 1 − α (reject). Discard the first N_0 draws.
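A generic sketch of this algorithm in Python, written for an arbitrary (possibly unnormalized) target and proposal; the function and argument names are mine, not the speaker's:

```python
import numpy as np

def metropolis_hastings(log_f, propose, log_q, z0, N, rng):
    """log_f(z): log target density up to a constant; propose(z, rng): draw z1 ~ q(.|z);
    log_q(z1, z2): log q(z1 | z2). Returns the full chain; discard the burn-in afterwards."""
    chain = [np.asarray(z0, dtype=float)]
    for _ in range(N):
        z = chain[-1]
        z1 = propose(z, rng)
        log_alpha = (log_f(z1) + log_q(z, z1)) - (log_f(z) + log_q(z1, z))
        if np.log(rng.random()) < min(0.0, log_alpha):   # accept z1 with probability alpha
            chain.append(z1)
        else:                                            # reject: stay at the current state
            chain.append(z)
    return np.array(chain)
```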

17 June 2, 2015 M-H Algorithm: An Intuitive Explanation Assume the proposal is symmetric, q(z_1 | z_2) = q(z_2 | z_1); then the acceptance probability reduces to $\alpha = \min\{1, f(z_1)/f(z_2)\}$: a move to a point of higher target density is always accepted, while a move to lower density is accepted only with probability f(z_1)/f(z_2), so the chain spends more time where f is large.

18 June 2, 2015 M-H: A Terrible Implementation We choose an independence proposal q(z | z_2) = q(z) = φ(x − 4)φ(y − 4), where φ is the standard normal density. Step 1: Draw x ∼ N(4, 1), y ∼ N(4, 1); denote z_1 = (x, y). Step 2: Calculate $\alpha = \min\left\{1, \frac{f(z_1)\,q(z^{(t)})}{f(z^{(t)})\,q(z_1)}\right\}$. Step 3: Draw u ∼ U[0, 1] and let z^{(t+1)} = z_1 if u ≤ α, and z^{(t+1)} = z^{(t)} otherwise.
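To see concretely why such an independence proposal behaves badly, here is a sketch that plugs it into the generic `metropolis_hastings` function above; the standard bivariate normal target is purely an illustrative assumption, since the slides' actual target is not shown in the transcript:

```python
import numpy as np

# Assumed target for illustration: standard bivariate normal, log f(z) = -|z|^2 / 2.
log_f = lambda z: -0.5 * np.sum(z**2)

# Independence proposal centered at (4, 4): q(z1 | z2) = q(z1), ignoring the current state.
propose_ind = lambda z, rng: rng.normal([4.0, 4.0], 1.0)
log_q_ind = lambda z1, z2: -0.5 * np.sum((z1 - np.array([4.0, 4.0]))**2)

rng = np.random.default_rng(0)
chain_bad = metropolis_hastings(log_f, propose_ind, log_q_ind, z0=[4.0, 4.0], N=5000, rng=rng)
accept_rate = np.mean(np.any(np.diff(chain_bad, axis=0) != 0, axis=1))
print(accept_rate)   # very low: the chain gets stuck and fails to explore where f has its mass
```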

19 June 2, 2015 Why is it so bad?

20 June 2, 2015 M-H: A Better Implementation Starting from some arbitrary (x^{(0)}, y^{(0)}): Step 1: Draw x ∼ N(x^{(t)}, 1), y ∼ N(y^{(t)}, 1) (a "random walk" proposal). Step 2: Denote z_1 = (x, y) and calculate $\alpha = \min\{1, f(z_1)/f(z^{(t)})\}$ (the q terms cancel because the proposal is symmetric). Step 3: Draw u ∼ U[0, 1] and let z^{(t+1)} = z_1 if u ≤ α, and z^{(t+1)} = z^{(t)} otherwise.
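And the random-walk version, again plugged into the generic `metropolis_hastings` sketch and using the same assumed standard bivariate normal target as in the previous snippet:

```python
import numpy as np

# Random-walk proposal: q(z1 | z2) = N(z1; z2, I), symmetric in its two arguments.
propose_rw = lambda z, rng: rng.normal(z, 1.0)
log_q_rw = lambda z1, z2: -0.5 * np.sum((z1 - z2)**2)   # symmetric, so it cancels in alpha

rng = np.random.default_rng(0)
chain_good = metropolis_hastings(log_f, propose_rw, log_q_rw, z0=[4.0, 4.0], N=5000, rng=rng)
accept_rate = np.mean(np.any(np.diff(chain_good, axis=0) != 0, axis=1))
print(accept_rate, chain_good[1000:].mean(axis=0))   # healthy acceptance rate; mean near (0, 0)
```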

21 June 2, 2015 Much Improved!

22 June 2, 2015 Further Discussion How large should N_0 and N be? Not an easy problem! Key difficulty: multiple modes in unknown areas. We would like to know all (major) modes, as well as their surrounding mass, not just the global mode. We need "automatic, hill-climbing" algorithms ⇒ the Expectation/Maximization (EM) Algorithm, which can be viewed as a deterministic version of the Gibbs sampler.

23 June 2, 2015 Drive/Drink Safely, Don't Become a Statistic; Go to Graduate School, Become a Statistician!