Gibbs Sampling: A Little Bit of Theory

Outline:
- What is a Markov chain?
- When does a Markov chain reach a stationary distribution, and what properties are required?
- Gibbs sampling defines a Markov chain
- P(z) is the stationary distribution of that chain

Motivation: for structured learning (e.g. the structured perceptron) we need to know how to solve the inference problem; sampling gives an inference method that works whenever we use an MRF.
Notes: ergodicity is the true condition for a unique stationary distribution; the condition used later (every transition probability nonzero) is sufficient but not necessary. Possibly also: slice sampling.
Gibbs Sampling

Gibbs sampling from a distribution $P(z)$, where $z = \{z_1, \dots, z_N\}$:

Initialize $z^0 = (z_1^0, z_2^0, \dots, z_N^0)$.
For $t = 1$ to $T$:
  $z_1^t \sim P(z_1 \mid z_2 = z_2^{t-1}, z_3 = z_3^{t-1}, z_4 = z_4^{t-1}, \dots, z_N = z_N^{t-1})$
  $z_2^t \sim P(z_2 \mid z_1 = z_1^t, z_3 = z_3^{t-1}, z_4 = z_4^{t-1}, \dots, z_N = z_N^{t-1})$
  $z_3^t \sim P(z_3 \mid z_1 = z_1^t, z_2 = z_2^t, z_4 = z_4^{t-1}, \dots, z_N = z_N^{t-1})$
  ...
  $z_N^t \sim P(z_N \mid z_1 = z_1^t, z_2 = z_2^t, z_3 = z_3^t, \dots, z_{N-1} = z_{N-1}^t)$
  Output $z^t = (z_1^t, z_2^t, \dots, z_N^t)$.

(Markov chain: a stochastic process in which future states are independent of past states given the present state.)

The sequence $z^1, z^2, z^3, \dots, z^T$ is treated as samples from $P(z)$. Why?
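In code, one sweep of this procedure looks roughly as follows. This is a minimal sketch, not from the slides; `sample_conditional(i, z)` stands for a hypothetical user-supplied routine that draws $z_i$ from $P(z_i \mid \text{all other coordinates of } z)$.

```python
def gibbs_sampling(z0, sample_conditional, T):
    """Run T sweeps of Gibbs sampling from the initial state z0.

    sample_conditional(i, z) is assumed to return a draw of z_i from
    P(z_i | all coordinates of z other than i).
    """
    z = list(z0)                      # current state z^t
    samples = []
    for t in range(T):
        # Update each coordinate in turn, conditioning on the newest values
        # of the already-updated coordinates and the old values of the rest.
        for i in range(len(z)):
            z[i] = sample_conditional(i, z)
        samples.append(list(z))       # record z^t = (z_1^t, ..., z_N^t)
    return samples                    # z^1, ..., z^T, treated as samples from P(z)
```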
Markov Chain

Example: a traveler and three cities A, B, and C. Each day he decides which city to visit next, with a probability distribution that depends only on the city he is currently in.

[Transition diagram over the three cities; the probabilities shown, reconstructed from the diagram and the later stationarity check, are 2/3 (stay in A), 1/6 (A to B), 1/6 (A to C), 1/2 (B to A), 1/2 (B to C), 1/2 (C to A), 1/2 (C to B).]

This is a random walk; the same idea is used on web pages, e.g. PageRank. The traveler records the city he visits each day, for example ... A, B, C, A, A, ... This sequence is a Markov chain, and each city is a state.
Markov Chain

With sufficiently many samples, the visit frequencies converge to A : B : C = 0.6 : 0.2 : 0.2, independent of the starting city. Three independent runs of 10,000 days each:

  Run   A      B      C
  1     5915   2025   2060
  2     6016   2002   1982
  3     5946   2016   2038
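A small simulation reproduces this behaviour. This is a sketch, not from the slides; the transition probabilities are my reading of the diagram on the previous slide.

```python
import random
from collections import Counter

# Transition probabilities, as read from the slide's diagram (an assumption):
# from A stay with prob. 2/3, move to B or C with prob. 1/6 each;
# from B or C move to each of the other two cities with prob. 1/2.
P_T = {
    'A': (('A', 2/3), ('B', 1/6), ('C', 1/6)),
    'B': (('A', 1/2), ('C', 1/2)),
    'C': (('A', 1/2), ('B', 1/2)),
}

def random_walk(start, days):
    """Simulate the traveler for `days` days and count visits to each city."""
    city, visits = start, Counter()
    for _ in range(days):
        next_cities, weights = zip(*P_T[city])
        city = random.choices(next_cities, weights=weights)[0]
        visits[city] += 1
    return visits

print(random_walk('A', 10000))   # roughly A: 6000, B: 2000, C: 2000
print(random_walk('C', 10000))   # similar counts: the starting city does not matter
```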
Markov Chain

Once the distribution over cities is P(A) = 0.6, P(B) = 0.2, P(C) = 0.2, it no longer changes from one day to the next. Such a distribution is a stationary distribution: it satisfies
$$P_T(A \mid A)\,P(A) + P_T(A \mid B)\,P(B) + P_T(A \mid C)\,P(C) = P(A)$$
$$P_T(B \mid A)\,P(A) + P_T(B \mid B)\,P(B) + P_T(B \mid C)\,P(C) = P(B)$$
$$P_T(C \mid A)\,P(A) + P_T(C \mid B)\,P(B) + P_T(C \mid C)\,P(C) = P(C)$$
For example, for city A: $\tfrac{2}{3} \times 0.6 + \tfrac{1}{2} \times 0.2 + \tfrac{1}{2} \times 0.2 = 0.6$.
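A quick numerical check of these three equations, using the transition probabilities assumed above (a sketch, not part of the slides):

```python
import numpy as np

# P_T[i, j] = probability of moving from state i to state j (order A, B, C),
# as assumed from the diagram.
P_T = np.array([[2/3, 1/6, 1/6],
                [1/2, 0.0, 1/2],
                [1/2, 1/2, 0.0]])
pi = np.array([0.6, 0.2, 0.2])

print(pi @ P_T)                    # [0.6 0.2 0.2] -> unchanged after one step
print(np.allclose(pi @ P_T, pi))   # True: pi is a stationary distribution
```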
Unique Stationary Distribution

A Markov chain can have more than one stationary distribution; which one is reached then depends on the starting state. [Counter-example diagram over states A, B, C with a transition of probability 1, so that part of the chain is never left.]

A Markov chain that fulfills certain conditions (irreducible and aperiodic) has a unique stationary distribution. In particular, if $P_T(s' \mid s) > 0$ for all states $s$ and $s'$, the chain has a unique stationary distribution. This is a sufficient but not necessary condition.
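For contrast, here is a toy reducible chain (my own example, not from the slides) with more than one stationary distribution: states A and B never reach C, and C never leaves, so both distributions below are stationary and the limit depends on where you start.

```python
import numpy as np

P_T = np.array([[0.5, 0.5, 0.0],    # A -> {A, B} only
                [0.5, 0.5, 0.0],    # B -> {A, B} only
                [0.0, 0.0, 1.0]])   # C is absorbing

pi1 = np.array([0.5, 0.5, 0.0])     # reached when starting in A or B
pi2 = np.array([0.0, 0.0, 1.0])     # reached when starting in C
print(np.allclose(pi1 @ P_T, pi1),  # True
      np.allclose(pi2 @ P_T, pi2))  # True: two distinct stationary distributions
```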
Markov Chain from Gibbs Sampling

Consider the Gibbs sampling procedure above for $P(z)$, $z = \{z_1, \dots, z_N\}$: at step $t$, each coordinate is drawn as $z_i^t \sim P(z_i \mid z_1 = z_1^t, \dots, z_{i-1} = z_{i-1}^t, z_{i+1} = z_{i+1}^{t-1}, \dots, z_N = z_N^{t-1})$.

The output sequence $z^1, z^2, z^3, \dots, z^T$ is a Markov chain: $z^t$ depends only on $z^{t-1}$, i.e. future states are independent of past states given the present state. Each $z^t$ is a state of the chain.
Markov Chain from Gibbs Sampling

Claim: the Markov chain defined by Gibbs sampling has a unique stationary distribution, and that distribution is $P(z)$. We show this in two steps: first that the stationary distribution is unique, then that $P(z)$ is stationary.
Markov Chain from Gibbs Sampling

Does the Markov chain from Gibbs sampling have a unique stationary distribution? Yes: $P_T(z' \mid z) > 0$ for any $z$ and $z'$, provided none of the conditional probabilities is zero. One sweep
  $z_1^t \sim P(z_1 \mid z_2 = z_2^{t-1}, z_3 = z_3^{t-1}, \dots, z_N = z_N^{t-1})$
  $z_2^t \sim P(z_2 \mid z_1 = z_1^t, z_3 = z_3^{t-1}, \dots, z_N = z_N^{t-1})$
  $z_3^t \sim P(z_3 \mid z_1 = z_1^t, z_2 = z_2^t, \dots, z_N = z_N^{t-1})$
  ...
  $z_N^t \sim P(z_N \mid z_1 = z_1^t, z_2 = z_2^t, \dots, z_{N-1} = z_{N-1}^t)$
can therefore produce any $z^t$ from any $z^{t-1}$.
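As a sanity check (my own toy example, not from the slides), for a strictly positive joint distribution over two binary variables, the one-sweep transition kernel $P_T(z' \mid z) = P(z_1' \mid z_2)\,P(z_2' \mid z_1')$ has no zero entries:

```python
import numpy as np

# P[z1, z2]: a strictly positive joint distribution over two binary variables.
P = np.array([[0.1, 0.2],
              [0.3, 0.4]])

def sweep_kernel(P):
    """T[z1, z2, z1n, z2n]: probability that one systematic sweep
    (update z1, then z2) moves the state from (z1, z2) to (z1n, z2n)."""
    P_z1_given_z2 = P / P.sum(axis=0, keepdims=True)   # P(z1 | z2)
    P_z2_given_z1 = P / P.sum(axis=1, keepdims=True)   # P(z2 | z1)
    T = np.zeros((2, 2, 2, 2))
    for z1 in range(2):
        for z2 in range(2):
            for z1n in range(2):
                for z2n in range(2):
                    # First draw z1' given the old z2, then z2' given the new z1'.
                    T[z1, z2, z1n, z2n] = (P_z1_given_z2[z1n, z2]
                                           * P_z2_given_z1[z1n, z2n])
    return T

T = sweep_kernel(P)
print((T > 0).all())   # True: any z' is reachable from any z in one sweep
```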
Markov Chain from Gibbs Sampling

Show that $P(z)$ is a stationary distribution:
$$\sum_{z} P_T(z' \mid z)\,P(z) = P(z')$$
where, for one systematic sweep,
$$P_T(z' \mid z) = P(z_1' \mid z_2, z_3, z_4, \dots, z_N) \times P(z_2' \mid z_1', z_3, z_4, \dots, z_N) \times P(z_3' \mid z_1', z_2', z_4, \dots, z_N) \times \dots \times P(z_N' \mid z_1', z_2', z_3', \dots, z_{N-1}')$$
This can be proven for both the random-scan and the systematic-sweep versions of Gibbs sampling. Since the chain has only one stationary distribution, that distribution must be $P(z)$, and we are done.
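The slide states this without the intermediate steps; the following is my reconstruction of the standard argument, written out for $N = 3$ (the general case repeats the same two moves: marginalize one old coordinate, then absorb one conditional):

\begin{align*}
\sum_{z} P_T(z'\mid z)\,P(z)
  &= \sum_{z_2,z_3} P(z_1'\mid z_2,z_3)\,P(z_2'\mid z_1',z_3)\,P(z_3'\mid z_1',z_2')
     \underbrace{\sum_{z_1} P(z_1,z_2,z_3)}_{=\,P(z_2,z_3)} \\
  &= \sum_{z_3} P(z_2'\mid z_1',z_3)\,P(z_3'\mid z_1',z_2')
     \underbrace{\sum_{z_2} P(z_1'\mid z_2,z_3)\,P(z_2,z_3)}_{=\,\sum_{z_2} P(z_1',z_2,z_3)\,=\,P(z_1',z_3)} \\
  &= P(z_3'\mid z_1',z_2') \sum_{z_3} \underbrace{P(z_2'\mid z_1',z_3)\,P(z_1',z_3)}_{=\,P(z_1',z_2',z_3)} \\
  &= P(z_3'\mid z_1',z_2')\,P(z_1',z_2')
   = P(z_1',z_2',z_3') = P(z').
\end{align*}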
Thank you for your attention!