The Markov Chain Monte Carlo Method Isabelle Stanton May 8, 2008 Theory Lunch.

Monte Carlo vs Las Vegas
- Las Vegas algorithms are randomized and always give the correct result, but gamble with computation time (e.g., Quicksort).
- Monte Carlo algorithms have fixed running time, but may be wrong (e.g., simulated annealing, estimating volume).

Markov Chains
A Markov chain is a memoryless stochastic process, e.g., flipping a coin. (The slide's diagram shows a die-roll chain in which every transition has probability 1/6.)
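A minimal sketch of the die-roll chain from the diagram: every state moves to each of the six faces with probability 1/6, so the next state never depends on the current one. The names `P` and `step` are my own, not from the talk.

```python
import random

# The die as a Markov chain: from any face, move to each of the
# six faces with probability 1/6 (memorylessness).
P = {s: {t: 1.0 / 6 for t in range(1, 7)} for s in range(1, 7)}

def step(state, transitions, rng):
    """Sample the next state from the current state's transition row."""
    nxt = list(transitions[state].keys())
    probs = list(transitions[state].values())
    return rng.choices(nxt, weights=probs, k=1)[0]

rng = random.Random(0)
state = 1
counts = {s: 0 for s in range(1, 7)}
for _ in range(60_000):
    state = step(state, P, rng)
    counts[state] += 1
# Each face should appear roughly 10,000 times.
```

Because every row of `P` is identical, this chain mixes in a single step; the generic `step` function works for any transition table, not just this one.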

Other Examples of Markov Chains
- Shuffling cards
- Flipping a coin
- PageRank model
- Particle systems – the focus of MCMC work

General Idea
- Model the system using a Markov chain.
- Use a Monte Carlo algorithm to perform some computational task.

Applications
- Approximate counting – the number of solutions to 3-SAT or Knapsack
- Statistical physics – when do phase transitions occur?
- Combinatorial optimization – simulated-annealing-type algorithms
We'll focus on counting.

Monte Carlo Counting
- How do you estimate the volume of a complex solid?
- Render with environment maps efficiently?
- Estimate an integral numerically?

(Picnic) Knapsack
The knapsack holds 20; the items weigh 4, 10, 4, 2, and 5.
What is a solution? How many solutions are there?
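For an instance this small we can answer the second question by brute force. A sketch (the variable names are my own):

```python
from itertools import product

# The picnic instance above: item weights and a knapsack that holds 20.
weights = [4, 10, 4, 2, 5]
capacity = 20

# A solution is any 0/1 vector x with total carried weight <= capacity.
solutions = [
    x for x in product((0, 1), repeat=len(weights))
    if sum(w * xi for w, xi in zip(weights, x)) <= capacity
]
num_solutions = len(solutions)  # 28 of the 32 subsets fit
```

Brute force enumerates all 2^n vectors, which is exactly what becomes infeasible for large n and motivates the sampling approach below.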

Counting Knapsack Solutions
- Item weights: a = (a_0, ..., a_{n-1})
- Knapsack size: a real number b
- Estimate the number of {0,1} vectors x that satisfy a·x ≤ b
- Let N denote the number of solutions

Naïve Solution
- Randomly generate x and calculate a·x.
- If a·x ≤ b, return 2^n; else return 0.
This returns N in expectation: (0·(2^n − N) + 2^n·N) / 2^n = N.
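The naïve estimator above can be sketched as follows (function and parameter names are mine, not from the talk):

```python
import random

def naive_estimate(a, b, trials, rng):
    """Unbiased but high-variance estimator of N = |{x in {0,1}^n : a.x <= b}|.

    Each trial draws x uniformly from {0,1}^n and returns 2^n if x is a
    solution, else 0; averaging over trials gives expectation
    (0*(2^n - N) + 2^n * N) / 2^n = N.
    """
    n = len(a)
    total = 0
    for _ in range(trials):
        x = [rng.randint(0, 1) for _ in range(n)]
        if sum(ai * xi for ai, xi in zip(a, x)) <= b:
            total += 2 ** n
    return total / trials

rng = random.Random(42)
# On the picnic instance (true N = 28), most random x are solutions,
# so the estimate converges quickly -- the next slide shows why this
# breaks down when solutions are rare.
est = naive_estimate([4, 10, 4, 2, 5], 20, 20_000, rng)
```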

Is this fast?
Counterexample: a = (1, ..., 1) and b = n/3. Any solution has fewer than n/3 ones, and there are about (n choose n/3)·2^{n/3} solutions.

No.
- Pr[sampled x has ||x||_1 ≤ n/3] < (n choose n/3)·2^{−2n/3}
- In expectation, we need to generate on the order of 2^{n/3} x's before we see a single solution!
- Any polynomial number of trials will grossly underestimate N.

Knapsack with MCMC
Let M_knap be a Markov chain with state space Ω(b) = {x | a·x ≤ b}. This will allow us to sample a solution.

Various M_knap
Example instances: a = (0, 0.5, 0.5) with b = 1.5, and a = (0, 1, 1) with b = … (state-transition diagrams omitted).

M_knap Transitions
- With probability 1/2, x transitions to itself (a lazy self-loop).
- Otherwise, select an index i uniformly at random from 0 to n−1 and flip the i-th bit of x. If the result x' is a solution, transition there; otherwise stay at x.
(Example diagram for a = (0, 1, 1) omitted.)
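A sketch of one transition of this chain in Python (the name `mknap_step` is my own), followed by a short walk on the picnic instance:

```python
import random

def mknap_step(x, a, b, rng):
    """One transition of the lazy M_knap chain described above."""
    if rng.random() < 0.5:
        return x                      # with probability 1/2, stay put
    i = rng.randrange(len(a))         # pick a bit position u.a.r.
    xp = list(x)
    xp[i] ^= 1                        # flip the i-th bit
    if sum(ai * xi for ai, xi in zip(a, xp)) <= b:
        return tuple(xp)              # x' is a solution: move there
    return x                          # otherwise stay at x

# Walk on the picnic instance, starting from the empty knapsack.
a, b = [4, 10, 4, 2, 5], 20
rng = random.Random(1)
x = (0,) * len(a)
visited = set()
for _ in range(2000):
    x = mknap_step(x, a, b, rng)
    visited.add(x)
# Every visited state is a solution, since the chain rejects moves
# that would leave the state space.
```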

Connected?
Is M_knap connected? Yes: to get from x to x', go through the all-zeros vector 0 (turning 1-bits off one at a time never leaves the solution set).

Ergodicity
- What is the stationary distribution of M_knap? Uniform over the solutions – each is sampled with probability 1/N.
- A Markov chain is ergodic if the probability distribution over the states converges to the stationary distribution of the system, regardless of the starting configuration.
- Is M_knap ergodic? Yes.

Algorithm Idea
Start at 0 and simulate M_knap for enough steps that the distribution over the states is close to uniform.
- Why does uniformity matter?
- Does this fix the problem yet?

The trick
- Assume that a_0 ≤ a_1 ≤ ... ≤ a_{n-1}.
- Let b_0 = 0 and b_i = min{b, Σ_{j<i} a_j}.
- |Ω(b_{i-1})| ≤ |Ω(b_i)| – why?
- |Ω(b_i)| ≤ (n+1)|Ω(b_{i-1})| – why? Any element of Ω(b_i) can be changed into one of Ω(b_{i-1}) by switching its rightmost 1 to a 0, and this map is at most (n+1)-to-1.

How does that help?
|Ω(b)| = |Ω(b_n)| = |Ω(b_n)|/|Ω(b_{n-1})| × |Ω(b_{n-1})|/|Ω(b_{n-2})| × ... × |Ω(b_1)|/|Ω(b_0)| × |Ω(b_0)|
We can estimate each of these ratios by doing a walk on Ω(b_i) and computing the fraction of samples that land in Ω(b_{i-1}). This is a good estimate since |Ω(b_{i-1})| ≤ |Ω(b_i)| ≤ (n+1)|Ω(b_{i-1})|.
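A hedged sketch of the whole telescoping estimator, using the transition rule described above. All names (`mknap_step`, `samples_per_ratio`, `walk_len`) are my own; it assumes positive weights (so Ω(b_0) = {0}) and, crucially, that `walk_len` steps suffice to get close to uniform — exactly the mixing-time assumption questioned on the later slides.

```python
import random

def mknap_step(x, a, b, rng):
    """One lazy M_knap step: flip a random bit, keep only solutions."""
    if rng.random() < 0.5:
        return x
    i = rng.randrange(len(a))
    xp = list(x)
    xp[i] ^= 1
    return tuple(xp) if sum(w * v for w, v in zip(a, xp)) <= b else x

def estimate_count(a, b, samples_per_ratio, walk_len, rng):
    """Estimate |Omega(b)| via the telescoping product of ratios."""
    a = sorted(a)                     # assume a_0 <= ... <= a_{n-1}
    n = len(a)
    prefix = [0]
    for w in a:
        prefix.append(prefix[-1] + w)
    bs = [min(b, prefix[i]) for i in range(n + 1)]   # b_0 .. b_n
    estimate = 1.0                                   # |Omega(b_0)| = 1
    for i in range(1, n + 1):
        hits = 0
        x = (0,) * n                                 # start at all-zeros
        for _ in range(samples_per_ratio):
            for _ in range(walk_len):                # walk on Omega(b_i)
                x = mknap_step(x, a, bs[i], rng)
            if sum(w * v for w, v in zip(a, x)) <= bs[i - 1]:
                hits += 1                            # sample fell in Omega(b_{i-1})
        ratio = max(hits, 1) / samples_per_ratio     # ~ |Omega(b_{i-1})|/|Omega(b_i)|
        estimate /= ratio                            # multiply up the telescope
    return estimate

rng = random.Random(7)
estimate = estimate_count([4, 10, 4, 2, 5], 20, 500, 30, rng)  # true count: 28
```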

Analysis
- Ignoring bias, the expectation of each trial is |Ω(b_{i-1})|/|Ω(b_i)|.
- We perform t = 17ε^{-2}n^2 steps.
- For MCMC methods, the key quantity in analyzing efficiency is Var(X)/E[X]^2.

Analysis
- If Z is the product of the trials, E[Z] = Π_i |Ω(b_{i-1})|/|Ω(b_i)|.
- *Magic Statistics Steps*: Var(Z)/E[Z]^2 ≤ ε^2/16.
- By Chebyshev's inequality, the resulting estimate of |Ω(b)| satisfies Pr[(1−ε/2)|Ω(b)| ≤ estimate ≤ (1+ε/2)|Ω(b)|] ≥ 3/4.

Analysis
- We used nt = 17ε^{-2}n^3 steps in total.
- This is an FPRAS (Fully Polynomial Randomized Approximation Scheme).
- Except... what assumption did I make?

Mixing Time
- Assumption: we are close to the uniform distribution after 17ε^{-2}n^2 steps.
- The number of steps needed is known as the mixing time.
- It is unknown whether this chain mixes in polynomial time.

Mixing Time
What does mix in polynomial time?
- Dice – 1 transition
- Shuffling cards – 7 shuffles
- The ferromagnetic Ising model at high temperature – O(n log n)
What doesn't?
- The ferromagnetic Ising model at low temperature – it starts to form magnets
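Mixing is usually measured by total variation distance to the stationary distribution. A small illustrative sketch (not from the talk): the exact distribution of a lazy random walk on a 6-cycle, tracked step by step, converges geometrically to uniform.

```python
def step_distribution(dist):
    """One step of a lazy walk on a cycle: stay w.p. 1/2, move to
    either neighbor w.p. 1/4 each."""
    n = len(dist)
    return [
        0.5 * dist[i] + 0.25 * dist[(i - 1) % n] + 0.25 * dist[(i + 1) % n]
        for i in range(n)
    ]

def tv_distance(p, q):
    """Total variation distance between two distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

n = 6
uniform = [1.0 / n] * n
dist = [1.0] + [0.0] * (n - 1)        # start concentrated on one state
tv = []
for _ in range(50):
    dist = step_distribution(dist)
    tv.append(tv_distance(dist, uniform))
# tv is non-increasing and shrinks geometrically toward 0: the chain mixes.
```

The mixing time is the first t at which this distance drops below a fixed threshold (commonly 1/4), from the worst starting state.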

Wes Weimer Memorial Conclusion Slide
- The Markov chain Monte Carlo method models the problem as a Markov chain and then uses random walks.
- Mixing time is important.
- #P problems are hard.
- Wes likes trespassing.