Markov Chain Monte Carlo: Metropolis and Glauber Chains


1 Markov Chain Monte Carlo: Metropolis and Glauber Chains
Chapter 3, presented by Yael Harel.

2 Contents
Reminders from previous weeks
  - Definitions
  - Theorems
Motivation
Metropolis chains
  - What is it? Construction over symmetric matrices, example, construction over asymmetric matrices
Glauber dynamics
  - Examples
Metropolis chains vs. Glauber dynamics
Summary

3 Reminders from previous weeks
Definitions
- Ω – finite state space (configurations).
- P – transition matrix: the same matrix for every t, and each row sums to 1.
- μ_t – row vector giving the probability of being at each x∈Ω at time t.
- Moving to state y at time t+1: μ_{t+1}(y) = Σ_{x∈Ω} μ_t(x)·P(x,y), so μ_{t+1} = μ_t·P and μ_t = μ_0·P^t.
- The chain P is irreducible if ∀x,y∈Ω ∃t such that P^t(x,y) > 0, i.e. it is possible to move from any state to any other state using transitions of positive probability.
- Stationary distribution: π = π·P.
- Detailed balance equations: ∀x,y∈Ω, π(x)·P(x,y) = π(y)·P(y,x).
- Reversible chain – a chain satisfying the detailed balance equations.
- Regular graph – each vertex has the same degree.
- Simple random walk on a graph: P(x,y) = 1/deg(x) if x~y, and 0 otherwise.
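As a small illustration of these definitions (not from the slides), the sketch below evolves μ_{t+1} = μ_t·P and checks the stationarity and detailed balance conditions numerically; the toy 3-state matrix and the candidate π are made up for the example.

```python
# Minimal sketch: distribution evolution and stationarity checks on a toy 3-state chain.
import numpy as np

P = np.array([[0.5, 0.5, 0.0],     # toy transition matrix; each row sums to 1
              [0.25, 0.5, 0.25],
              [0.0, 0.5, 0.5]])

mu = np.array([1.0, 0.0, 0.0])     # mu_0: start in state 0
for t in range(100):               # mu_{t+1} = mu_t * P
    mu = mu @ P

pi = np.array([0.25, 0.5, 0.25])   # candidate stationary distribution
print(np.allclose(pi @ P, pi))     # pi = pi * P  -> True
# detailed balance: pi(x)*P(x,y) == pi(y)*P(y,x) for all x, y
M = pi[:, None] * P
print(np.allclose(M, M.T))         # -> True, so the chain is reversible
print(mu)                          # mu_t is already close to pi
```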

4 Reminders from previous weeks
Theorems
- If the chain is irreducible, then there exists a unique stationary distribution π.
- If π satisfies the detailed balance equations, then π is stationary.
- If P is symmetric and π is the uniform distribution, then the detailed balance equations hold, so the chain is reversible.

5 Motivation
The problem in most of the book: given a probability distribution π, and assuming a Markov chain can be constructed such that π is stationary, how large should t be in order that X_t will be close enough to π? (By the Convergence Theorem, X_t is close to π when t is large enough.)
The problem in this chapter: can we construct a Markov chain such that π is its stationary distribution? This is Markov chain Monte Carlo – a way of sampling from a given probability distribution when direct sampling is difficult.
Example – q-coloring. Given a graph G = (V,E) and a set of colors {1,2,…,q}, a proper q-coloring assigns a color to each vertex such that neighbors do not get the same color (deciding whether one exists is NP-complete). The goal: given a graph, sample a proper q-coloring uniformly at random.
Why do we want to do this?
- The size of Ω can be estimated: under the uniform distribution each proper coloring has probability 1/|Ω|, though extracting |Ω| from samples is not so easy (Chapter 14 discusses this).
- Characteristics of colorings can be studied.

6 Metropolis Chains
Given: Ω – state space, ψ – symmetric transition matrix, π – distribution.
The goal: modify ψ into P such that π = π·P.
The new chain construction: for y ≠ x set P(x,y) = ψ(x,y)·a(x,y), where a(x,y) ≤ 1 is an acceptance probability, and put the leftover mass on P(x,x) so that each row sums to 1.
In order that π will be stationary, detailed balance should be satisfied:
π(x)·ψ(x,y)·a(x,y) = π(y)·ψ(y,x)·a(y,x).
Since ψ is symmetric, ψ(x,y) = ψ(y,x), and this reduces to π(x)·a(x,y) = π(y)·a(y,x).
Since a(x,y), a(y,x) ≤ 1, both sides are at most min(π(x), π(y)) = π(x)∧π(y).
We would like P(x,x) ≈ 0, so that the chain does not get stuck in the same state, i.e. maximize ψ(x,y)·a(x,y), i.e. maximize a(x,y). Taking π(x)·a(x,y) = π(x)∧π(y) gives
a(x,y) = 1 ∧ π(y)/π(x),
and with this choice π is the stationary distribution of P.
Notice that P depends only on the ratio π(y)/π(x)! So if π(x) = h(x)/Z, where Z is a normalizing factor that is difficult to compute (for example because Ω is too big), we still have no problem constructing the chain.
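A minimal sketch of this construction in Python, assuming a symmetric proposal given by a `neighbors(x)` function and an unnormalized weight `h(x)` proportional to π(x) (both placeholder names, not from the slides):

```python
import random

def metropolis_step(x, neighbors, h):
    """One step of the Metropolis chain built from a symmetric psi."""
    y = random.choice(neighbors(x))          # propose y with psi(x, y) = psi(y, x)
    a = min(1.0, h(y) / h(x))                # a(x, y) = 1 ∧ pi(y)/pi(x); Z cancels
    return y if random.random() < a else x   # reject -> stay at x (contributes to P(x, x))
```

Note how only the ratio h(y)/h(x) appears, so the normalizing factor Z never has to be computed.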

7 Metropolis Chains
Example – uniform distribution over the global maxima
Given: Ω – vertex set of a regular graph, f – a function defined on Ω.
The goal: find Ω* := {x∈Ω : f(x) = f* := max_{y∈Ω} f(y)}.
Hill climb
- The algorithm: move from x to a neighbor y if f(y) > f(x).
- The problem: we can get stuck in a local maximum.
Building a Metropolis chain
- For simplicity, take ψ to be a simple random walk; on a regular graph its transition matrix is symmetric.
- Define π_λ(x) = λ^{f(x)} / Z(λ) for λ ≥ 1, where Z(λ) := Σ_{x∈Ω} λ^{f(x)} normalizes π_λ to be a probability distribution. Since λ^{f(x)} grows exponentially in f(x), π_λ(x) increases with f(x). As noted before, there is no need to compute Z.
- The Metropolis chain uses a(x,y) = 1 ∧ π(y)/π(x) and P(x,y) = ψ(x,y)·a(x,y):
  - If f(y) ≥ f(x): π(y)/π(x) = λ^{f(y)-f(x)} ≥ 1, so P(x,y) = ψ(x,y)·1 = ψ(x,y).
  - If f(y) < f(x): π(y)/π(x) = λ^{f(y)-f(x)} < 1, so P(x,y) = ψ(x,y)·π(y)/π(x) < ψ(x,y).
- As λ → ∞, write π_λ(x) = λ^{f(x)} / (Σ_{x∈Ω*} λ^{f(x)} + Σ_{x∉Ω*} λ^{f(x)}) and divide numerator and denominator by λ^{f*}:
  - For x∈Ω*: lim_{λ→∞} π_λ(x) = 1/|Ω*|.
  - For x∉Ω*: lim_{λ→∞} π_λ(x) = 0.
  So as λ → ∞ the stationary distribution π_λ converges to the uniform distribution over the global maxima of f.
- In real applications λ must be increased gradually so that the chain does not get stuck; chains whose λ increases over time are called simulated annealing.
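A sketch of the λ-chain with λ increased over time (simulated annealing), assuming placeholder `neighbors` and `f` functions; the schedule `lambdas` is whatever increasing sequence one chooses:

```python
import random

def anneal(x0, neighbors, f, lambdas):
    """Run the Metropolis chain for pi_lambda(x) ∝ lambda**f(x), raising lambda gradually."""
    x = x0
    for lam in lambdas:                          # e.g. an increasing sequence of lambda values
        y = random.choice(neighbors(x))          # simple-random-walk proposal (symmetric)
        a = min(1.0, lam ** (f(y) - f(x)))       # pi(y)/pi(x) = lambda**(f(y) - f(x))
        if random.random() < a:
            x = y
    return x                                     # for large lambda, x tends toward a global maximum
```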

8 Metropolis Chains
Given: Ω – state space, ψ – transition matrix (not necessarily symmetric), π – distribution.
The goal: modify ψ into P such that π = π·P.
The new chain construction: as before, P(x,y) = ψ(x,y)·a(x,y) for y ≠ x.
In order that π will be stationary, detailed balance should be satisfied:
π(x)·ψ(x,y)·a(x,y) = π(y)·ψ(y,x)·a(y,x).
Since a(x,y), a(y,x) ≤ 1, both sides are at most min(π(x)·ψ(x,y), π(y)·ψ(y,x)).
We would like P(x,x) ≈ 0, i.e. maximize ψ(x,y)·a(x,y), i.e. maximize a(x,y). Taking
π(x)·ψ(x,y)·a(x,y) = (π(x)·ψ(x,y)) ∧ (π(y)·ψ(y,x)) gives
a(x,y) = 1 ∧ [π(y)·ψ(y,x)] / [π(x)·ψ(x,y)].
With this choice π is the stationary distribution of P. Notice that, again, P depends only on the ratio π(y)/π(x)!
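A sketch of the same construction for a non-symmetric ψ (the Metropolis-Hastings acceptance ratio), assuming hypothetical helpers `propose`, `psi`, and `pi`, each computable up to a common constant:

```python
import random

def metropolis_hastings_step(x, propose, psi, pi):
    """One step of the Metropolis chain built from a possibly asymmetric psi."""
    y = propose(x)                                           # draw y with probability psi(x, y)
    a = min(1.0, (pi(y) * psi(y, x)) / (pi(x) * psi(x, y)))  # a(x,y) = 1 ∧ pi(y)psi(y,x)/(pi(x)psi(x,y))
    return y if random.random() < a else x
```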

9 Metropolis Chains
Example – uniform distribution over the vertices of an irregular graph (e.g. the Facebook graph)
Each vertex is familiar only with its neighbors.
ψ – the matrix of a simple random walk; on an irregular graph ψ is not symmetric.
π – the uniform distribution over V.
The goal: modify ψ into P such that π is its stationary distribution.
According to what we saw:
a(x,y) = 1 ∧ [π(y)·ψ(y,x)] / [π(x)·ψ(x,y)] = 1 ∧ [(1/|V|)·(1/deg(y))] / [(1/|V|)·(1/deg(x))] = 1 ∧ deg(x)/deg(y).
Although we don't know what the whole graph looks like, from each vertex we can decide where to go next!
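A sketch of this degree-corrected walk, assuming the graph is given as a hypothetical dict mapping each vertex to its neighbor list; only local degree information is used:

```python
import random

def uniform_walk_step(x, graph):
    """One step of the Metropolis walk whose stationary distribution is uniform over the vertices."""
    y = random.choice(graph[x])                        # simple-random-walk proposal: psi(x,y) = 1/deg(x)
    a = min(1.0, len(graph[x]) / len(graph[y]))        # a(x,y) = 1 ∧ deg(x)/deg(y)
    return y if random.random() < a else x
```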

10 Glauber Dynamics (Gibbs sampler)
Given:
- V – vertex set of a graph.
- S – a finite set; each vertex v∈V gets a label from S (a color, a sign, …).
- S^V – the functions from V to S.
- Ω ⊆ S^V – the proper configurations.
- π – a probability distribution on Ω.
The goal: construct P such that π = π·P.
The new chain construction. Let x∈Ω and v∈V. Define
Ω(x,v) = {y∈Ω : y(w) = x(w) for all w∈V, w ≠ v}.
Move to another configuration:
- Select v∈V uniformly at random.
- Move to y∈Ω(x,v) with probability π(y) / Σ_{z∈Ω(x,v)} π(z).
So P(x,y) = (1/|V|) · π(y) / Σ_{z∈Ω(x,v)} π(z) when x and y agree off v.
The detailed balance equations are satisfied: Ω(x,v) = Ω(y,v) (the same “blob”), so
π(x) · (1/|V|) · π(y)/Σ_{z∈Ω(x,v)} π(z) = π(y) · (1/|V|) · π(x)/Σ_{z∈Ω(y,v)} π(z),
hence π is stationary for P.
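A generic Glauber update as a sketch, assuming hypothetical helpers `omega_xv(x, v)` returning the list of configurations in Ω(x,v) and `pi(y)` returning a value proportional to π(y):

```python
import random

def glauber_step(x, vertices, omega_xv, pi):
    """One Glauber (Gibbs) update: resample the label at a uniformly chosen vertex."""
    v = random.choice(vertices)                  # pick v in V uniformly
    block = omega_xv(x, v)                       # configurations agreeing with x off v
    weights = [pi(y) for y in block]             # proportional to pi restricted to the block
    return random.choices(block, weights=weights, k=1)[0]
```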

11 Glauber Dynamics (Gibbs sampler)
Example – q-coloring
Given: G = (V,E), S = {1,2,…,q}, π = the uniform distribution over the proper configurations (proper q-colorings).
The goal: construct a Markov chain on the set of proper q-colorings.
For a proper configuration x and v∈V: a color j∈S is allowable at v if j ∉ {x(w) : w~v}; let A_v(x) = {j∈S : j is allowable}.
Moving from a proper configuration x to another proper configuration:
- Select v∈V uniformly at random.
- Select j∈A_v(x) uniformly at random and recolor v with j.
So P(x,y) = (1/|V|) · 1/|A_v(x)| when x and y differ only at v. Since A_v(x) = A_v(y), we get P(x,y) = P(y,x), so the uniform distribution π(x) = 1/|Ω| is stationary.
(The slide's diagram shows a 4-vertex example with transition probabilities such as (1/4)·(1/3) and (1/4)·(1/2); a configuration marked with a red X belongs to S^V but not to Ω.)
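A sketch of this update for proper q-colorings, with the coloring stored as a hypothetical dict from vertex to color and the graph as a dict from vertex to neighbor list:

```python
import random

def coloring_glauber_step(coloring, graph, q):
    """One Glauber update for proper q-colorings: recolor a random vertex with an allowable color."""
    v = random.choice(list(graph))                            # pick a vertex uniformly
    neighbor_colors = {coloring[w] for w in graph[v]}
    allowable = [j for j in range(1, q + 1) if j not in neighbor_colors]  # A_v(x); never empty for proper x
    new = dict(coloring)
    new[v] = random.choice(allowable)                         # uniform over A_v(x)
    return new
```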

12 Glauber Dynamics (Gibbs sampler)
Example – particle configuration
Given: G = (V,E), S = {0,1}, π = the uniform distribution over the proper configurations.
x is a configuration: x(v) = 1 means v is occupied, x(v) = 0 means v is vacant.
A configuration is proper if ∀(v,w)∈E, x(v)·x(w) = 0, i.e. no two neighboring vertices are both occupied.
Moving from a proper configuration x to another proper configuration:
- Select v∈V uniformly at random.
- If there exists (v,w)∈E such that w is occupied, stay at the configuration x.
- Otherwise set y(v) = 1 with probability 1/2 and y(v) = 0 with probability 1/2 (y agrees with x at every other vertex, so y = x is possible).
Again P(x,y) = P(y,x), so the uniform distribution π(x) = 1/|Ω| is stationary.
(The slide's diagram shows a 5-vertex example with transition probabilities such as (1/5)·(1/2).)
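The corresponding sketch for particle configurations (the hardcore model), with the same hypothetical dict representations (`config` maps vertex to 0/1, `graph` maps vertex to its neighbor list):

```python
import random

def hardcore_glauber_step(config, graph):
    """One Glauber update for particle configurations (hardcore model)."""
    v = random.choice(list(graph))
    new = dict(config)
    if any(config[w] == 1 for w in graph[v]):
        return new                                   # some neighbor is occupied: stay at x
    new[v] = 1 if random.random() < 0.5 else 0       # otherwise occupy v with probability 1/2
    return new
```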

13 Metropolis Chains vs. Glauber Dynamics
Given:
- G = (V,E).
- S – a finite set.
- π – a probability distribution over S^V.
- ψ – the chain with the following rule: select v∈V at random, select s∈S at random, and update v to s.
Metropolis construction (as in slide 6, ψ being symmetric): P(x,y) = ψ(x,y)·(1 ∧ π(y)/π(x)).
Glauber construction: P(x,y) = (1/|V|) · π(y) / Σ_{z∈Ω(x,v)} π(z).
In both constructions π is stationary for P, but… will the Glauber and the Metropolis chains be equal? Similar?

14 Metropolis Chains vs. Glauber Dynamics
Example – q-coloring: the chains are different!
Metropolis chain
- ψ – the original matrix: ψ(x,y) = (1/|V|)·(1/q), i.e. pick v∈V at random, pick a color in {1,…,q} at random, and update v; the result may not be a proper configuration!
- P such that the uniform π over proper q-colorings is stationary:
  - If y is a proper configuration (π(y) > 0): P(x,y) = ψ(x,y), since the ratio π(y)/π(x) equals 1.
  - Else: P(x,y) = 0.
Glauber chain
- For all proper configurations x, y:
  - If x and y differ in at most one vertex v: P(x,y) = (1/|V|)·1/|A_v(x)|, where A_v(x) = {j∈S : j is allowable at v}; note that v is different for each pair (x,y).
  - Else: P(x,y) = 0.
The chains are different! Given that v∈V was selected, the probability of remaining at the current configuration is:
- Metropolis: (q − |A_v(x)|)/q + 1/q – choose a non-allowable color, or choose the current color of v.
- Glauber: 1/|A_v(x)| – choose the current color of v.
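A tiny numeric check of these two holding probabilities, with made-up values for q and |A_v(x)|:

```python
# Holding probability at a selected vertex v, Metropolis vs. Glauber (illustrative values).
q, A = 5, 3                              # q colors, |A_v(x)| = 3 allowable colors at v
metropolis_hold = (q - A) / q + 1 / q    # pick a non-allowable color, or v's current color
glauber_hold = 1 / A                     # pick v's current color out of A_v(x)
print(metropolis_hold, glauber_hold)     # 0.6 vs. 0.333...: the chains differ
```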

15 Metropolis Chains vs. Glauber Dynamics
Example – particle configuration
Metropolis chain
- ψ – the original matrix: ψ(x,y) = (1/|V|)·(1/2), i.e. pick v∈V at random, pick s∈{0,1} at random, and update v; the result may not be a proper configuration!
- P such that the uniform π over proper particle configurations is stationary:
  - If y is a proper configuration (π(y) > 0): P(x,y) = ψ(x,y).
  - Else: P(x,y) = 0.
Glauber chain
- For all proper configurations x, y:
  - If x and y differ in at most one vertex: P(x,y) = (1/|V|)·(1/2).
  - Else: P(x,y) = 0.
The chains are equal!

16 Summary
- Chain construction with a given stationary distribution:
  - Metropolis – given a transition matrix ψ.
  - Glauber – without any transition matrix.
  - The two constructions can be equal, or merely similar.
- Example – q-coloring: the underlying decision problem is NP-complete and the number of proper configurations is unknown, so we construct a chain whose stationary distribution is uniform.
- Simulation (see the sketch below): for i = 1 to N, run the chain for T iterations and save the result, then study how the configurations are distributed. After T iterations the distribution is close to π: the same probability for each configuration.
- In the coming weeks: how do we find T?
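A sketch of this simulation loop, assuming hypothetical `initial()` and `step(x)` functions for whichever chain (Metropolis or Glauber) was constructed:

```python
def simulate(initial, step, N, T):
    """Run N independent chains for T steps each and collect the final configurations."""
    samples = []
    for _ in range(N):
        x = initial()
        for _ in range(T):        # after T iterations, x is approximately pi-distributed
            x = step(x)
        samples.append(x)
    return samples                # study how the configurations are distributed
```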

17 Thank you 

