Gibbs Sampling: A Little Bit of Theory

Outline:
- What is a Markov chain?
- When does a Markov chain reach a stationary distribution, and what properties are required?
- Gibbs sampling defines a Markov chain
- P(z) is the stationary distribution of that chain

Motivation: for structured learning (e.g. the structured perceptron) we need to know how to solve the inference problem; sampling gives an inference method that works whenever we use an MRF.
Notes: ergodicity is the true condition for a unique stationary distribution; the condition used later (every transition probability nonzero) is sufficient but not necessary. Possibly also: slice sampling.
Gibbs Sampling

Gibbs sampling from a distribution $P(z)$, where $z = \{z_1, \dots, z_N\}$:

Initialize $z^0 = (z_1^0, z_2^0, \dots, z_N^0)$.
For $t = 1$ to $T$:
  $z_1^t \sim P(z_1 \mid z_2 = z_2^{t-1}, z_3 = z_3^{t-1}, z_4 = z_4^{t-1}, \dots, z_N = z_N^{t-1})$
  $z_2^t \sim P(z_2 \mid z_1 = z_1^t, z_3 = z_3^{t-1}, z_4 = z_4^{t-1}, \dots, z_N = z_N^{t-1})$
  $z_3^t \sim P(z_3 \mid z_1 = z_1^t, z_2 = z_2^t, z_4 = z_4^{t-1}, \dots, z_N = z_N^{t-1})$
  ...
  $z_N^t \sim P(z_N \mid z_1 = z_1^t, z_2 = z_2^t, z_3 = z_3^t, \dots, z_{N-1} = z_{N-1}^t)$
  Output $z^t = (z_1^t, z_2^t, \dots, z_N^t)$.

(Markov chain: a stochastic process in which future states are independent of past states given the present state.)

The sequence $z^1, z^2, z^3, \dots, z^T$ is treated as samples from $P(z)$. Why?
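In code, one sweep of this procedure looks roughly as follows. This is a minimal sketch, not from the slides; `sample_conditional(i, z)` stands for a hypothetical user-supplied routine that draws $z_i$ from $P(z_i \mid \text{all other coordinates of } z)$.

```python
def gibbs_sampling(z0, sample_conditional, T):
    """Run T sweeps of Gibbs sampling from the initial state z0.

    sample_conditional(i, z) is assumed to return a draw of z_i from
    P(z_i | all coordinates of z other than i).
    """
    z = list(z0)                      # current state z^t
    samples = []
    for t in range(T):
        # Update each coordinate in turn, conditioning on the newest values
        # of the already-updated coordinates and the old values of the rest.
        for i in range(len(z)):
            z[i] = sample_conditional(i, z)
        samples.append(list(z))       # record z^t = (z_1^t, ..., z_N^t)
    return samples                    # z^1, ..., z^T, treated as samples from P(z)
```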
Markov Chain

Example: a traveler and three cities A, B, and C. Each day he decides which city to visit next, with a probability distribution that depends only on the city he is currently in.

[Transition diagram over the three cities; the probabilities shown, reconstructed from the diagram and the later stationarity check, are 2/3 (stay in A), 1/6 (A to B), 1/6 (A to C), 1/2 (B to A), 1/2 (B to C), 1/2 (C to A), 1/2 (C to B).]

This is a random walk; the same idea is used on web pages, e.g. PageRank. The traveler records the city he visits each day, for example ... A, B, C, A, A, ... This sequence is a Markov chain, and each city is a state.
Markov Chain

With sufficiently many samples, the visit frequencies converge to A : B : C = 0.6 : 0.2 : 0.2, independent of the starting city. Three independent runs of 10,000 days each:

  Run   A      B      C
  1     5915   2025   2060
  2     6016   2002   1982
  3     5946   2016   2038
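A small simulation reproduces this behaviour. This is a sketch, not from the slides; the transition probabilities are my reading of the diagram on the previous slide.

```python
import random
from collections import Counter

# Transition probabilities, as read from the slide's diagram (an assumption):
# from A stay with prob. 2/3, move to B or C with prob. 1/6 each;
# from B or C move to each of the other two cities with prob. 1/2.
P_T = {
    'A': (('A', 2/3), ('B', 1/6), ('C', 1/6)),
    'B': (('A', 1/2), ('C', 1/2)),
    'C': (('A', 1/2), ('B', 1/2)),
}

def random_walk(start, days):
    """Simulate the traveler for `days` days and count visits to each city."""
    city, visits = start, Counter()
    for _ in range(days):
        next_cities, weights = zip(*P_T[city])
        city = random.choices(next_cities, weights=weights)[0]
        visits[city] += 1
    return visits

print(random_walk('A', 10000))   # roughly A: 6000, B: 2000, C: 2000
print(random_walk('C', 10000))   # similar counts: the starting city does not matter
```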
Markov Chain

Once the distribution over cities is P(A) = 0.6, P(B) = 0.2, P(C) = 0.2, it no longer changes from one day to the next. Such a distribution is a stationary distribution: it satisfies
$$P_T(A \mid A)\,P(A) + P_T(A \mid B)\,P(B) + P_T(A \mid C)\,P(C) = P(A)$$
$$P_T(B \mid A)\,P(A) + P_T(B \mid B)\,P(B) + P_T(B \mid C)\,P(C) = P(B)$$
$$P_T(C \mid A)\,P(A) + P_T(C \mid B)\,P(B) + P_T(C \mid C)\,P(C) = P(C)$$
For example, for city A: $\tfrac{2}{3} \times 0.6 + \tfrac{1}{2} \times 0.2 + \tfrac{1}{2} \times 0.2 = 0.6$.
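A quick numerical check of these three equations, using the transition probabilities assumed above (a sketch, not part of the slides):

```python
import numpy as np

# P_T[i, j] = probability of moving from state i to state j (order A, B, C),
# as assumed from the diagram.
P_T = np.array([[2/3, 1/6, 1/6],
                [1/2, 0.0, 1/2],
                [1/2, 1/2, 0.0]])
pi = np.array([0.6, 0.2, 0.2])

print(pi @ P_T)                    # [0.6 0.2 0.2] -> unchanged after one step
print(np.allclose(pi @ P_T, pi))   # True: pi is a stationary distribution
```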
Unique Stationary Distribution

A Markov chain can have more than one stationary distribution; which one is reached then depends on the starting state. [Counter-example diagram over states A, B, C with a transition of probability 1, so that part of the chain is never left.]

A Markov chain that fulfills certain conditions (irreducible and aperiodic) has a unique stationary distribution. In particular, if $P_T(s' \mid s) > 0$ for all states $s$ and $s'$, the chain has a unique stationary distribution. This is a sufficient but not necessary condition.
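For contrast, here is a toy reducible chain (my own example, not from the slides) with more than one stationary distribution: states A and B never reach C, and C never leaves, so both distributions below are stationary and the limit depends on where you start.

```python
import numpy as np

P_T = np.array([[0.5, 0.5, 0.0],    # A -> {A, B} only
                [0.5, 0.5, 0.0],    # B -> {A, B} only
                [0.0, 0.0, 1.0]])   # C is absorbing

pi1 = np.array([0.5, 0.5, 0.0])     # reached when starting in A or B
pi2 = np.array([0.0, 0.0, 1.0])     # reached when starting in C
print(np.allclose(pi1 @ P_T, pi1),  # True
      np.allclose(pi2 @ P_T, pi2))  # True: two distinct stationary distributions
```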
Markov Chain from Gibbs Sampling

Consider the Gibbs sampling procedure above for $P(z)$, $z = \{z_1, \dots, z_N\}$: at step $t$, each coordinate is drawn as $z_i^t \sim P(z_i \mid z_1 = z_1^t, \dots, z_{i-1} = z_{i-1}^t, z_{i+1} = z_{i+1}^{t-1}, \dots, z_N = z_N^{t-1})$.

The output sequence $z^1, z^2, z^3, \dots, z^T$ is a Markov chain: $z^t$ depends only on $z^{t-1}$, i.e. future states are independent of past states given the present state. Each $z^t$ is a state of the chain.
Markov Chain from Gibbs Sampling

Claim: the Markov chain defined by Gibbs sampling has a unique stationary distribution, and that distribution is $P(z)$. We show this in two steps: first that the stationary distribution is unique, then that $P(z)$ is stationary.
Markov Chain from Gibbs Sampling

Does the Markov chain from Gibbs sampling have a unique stationary distribution? Yes: $P_T(z' \mid z) > 0$ for any $z$ and $z'$, provided none of the conditional probabilities is zero. One sweep
  $z_1^t \sim P(z_1 \mid z_2 = z_2^{t-1}, z_3 = z_3^{t-1}, \dots, z_N = z_N^{t-1})$
  $z_2^t \sim P(z_2 \mid z_1 = z_1^t, z_3 = z_3^{t-1}, \dots, z_N = z_N^{t-1})$
  $z_3^t \sim P(z_3 \mid z_1 = z_1^t, z_2 = z_2^t, \dots, z_N = z_N^{t-1})$
  ...
  $z_N^t \sim P(z_N \mid z_1 = z_1^t, z_2 = z_2^t, \dots, z_{N-1} = z_{N-1}^t)$
can therefore produce any $z^t$ from any $z^{t-1}$.
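As a sanity check (my own toy example, not from the slides), for a strictly positive joint distribution over two binary variables, the one-sweep transition kernel $P_T(z' \mid z) = P(z_1' \mid z_2)\,P(z_2' \mid z_1')$ has no zero entries:

```python
import numpy as np

# P[z1, z2]: a strictly positive joint distribution over two binary variables.
P = np.array([[0.1, 0.2],
              [0.3, 0.4]])

def sweep_kernel(P):
    """T[z1, z2, z1n, z2n]: probability that one systematic sweep
    (update z1, then z2) moves the state from (z1, z2) to (z1n, z2n)."""
    P_z1_given_z2 = P / P.sum(axis=0, keepdims=True)   # P(z1 | z2)
    P_z2_given_z1 = P / P.sum(axis=1, keepdims=True)   # P(z2 | z1)
    T = np.zeros((2, 2, 2, 2))
    for z1 in range(2):
        for z2 in range(2):
            for z1n in range(2):
                for z2n in range(2):
                    # First draw z1' given the old z2, then z2' given the new z1'.
                    T[z1, z2, z1n, z2n] = (P_z1_given_z2[z1n, z2]
                                           * P_z2_given_z1[z1n, z2n])
    return T

T = sweep_kernel(P)
print((T > 0).all())   # True: any z' is reachable from any z in one sweep
```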
Markov Chain from Gibbs Sampling

Show that $P(z)$ is a stationary distribution:
$$\sum_{z} P_T(z' \mid z)\,P(z) = P(z')$$
where, for one systematic sweep,
$$P_T(z' \mid z) = P(z_1' \mid z_2, z_3, z_4, \dots, z_N) \times P(z_2' \mid z_1', z_3, z_4, \dots, z_N) \times P(z_3' \mid z_1', z_2', z_4, \dots, z_N) \times \dots \times P(z_N' \mid z_1', z_2', z_3', \dots, z_{N-1}')$$
This can be proven for both the random-scan and the systematic-sweep versions of Gibbs sampling. Since the chain has only one stationary distribution, that distribution must be $P(z)$, and we are done.
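The slide states this without the intermediate steps; the following is my reconstruction of the standard argument, written out for $N = 3$ (the general case repeats the same two moves: marginalize one old coordinate, then absorb one conditional):

\begin{align*}
\sum_{z} P_T(z'\mid z)\,P(z)
  &= \sum_{z_2,z_3} P(z_1'\mid z_2,z_3)\,P(z_2'\mid z_1',z_3)\,P(z_3'\mid z_1',z_2')
     \underbrace{\sum_{z_1} P(z_1,z_2,z_3)}_{=\,P(z_2,z_3)} \\
  &= \sum_{z_3} P(z_2'\mid z_1',z_3)\,P(z_3'\mid z_1',z_2')
     \underbrace{\sum_{z_2} P(z_1'\mid z_2,z_3)\,P(z_2,z_3)}_{=\,\sum_{z_2} P(z_1',z_2,z_3)\,=\,P(z_1',z_3)} \\
  &= P(z_3'\mid z_1',z_2') \sum_{z_3} \underbrace{P(z_2'\mid z_1',z_3)\,P(z_1',z_3)}_{=\,P(z_1',z_2',z_3)} \\
  &= P(z_3'\mid z_1',z_2')\,P(z_1',z_2')
   = P(z_1',z_2',z_3') = P(z').
\end{align*}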
Thank you for your attention!