6. Markov Chain

State Space The state space is the set of values a random variable X can take. E.g.: the integers 1 to 6 in a dice experiment, the locations of a random walker, the coordinates of a set of molecules, or the spin configurations of the Ising model.

Markov Process A stochastic process is a sequence of random variables X0, X1, …, Xn, … The process is characterized by the joint probability distribution P(X0, X1, …). If P(Xn+1|X0, X1, …, Xn) = P(Xn+1|Xn), then it is a Markov process. For simplicity, we consider only a discrete state space, so Xn takes integer values. A capital letter X denotes a random variable, while a lower case x denotes a specific value. A Markov process remembers only its immediate past. See J. R. Norris, “Markov Chains”, Cambridge (1997), for a more mathematical treatment.

Markov Chain A Markov chain is completely characterized by an initial probability distribution P0(X0) and the transition matrix W(Xn->Xn+1) = P(Xn+1|Xn). Thus, the probability that a sequence X0=a, X1=b, …, Xn=z appears is P0(a) W(a->b) W(b->c) … W(y->z). The term “stochastic process” refers to a general random process (in time); a Markov process has no “long-term” memory; a Markov chain is a Markov process on a discrete state space. See N. G. van Kampen, “Stochastic Processes in Physics and Chemistry”, North-Holland (1981), for more information. See also J. R. Norris, “Markov Chains”, Cambridge (1997).
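
As an illustration, the probability of a particular path is just the product above, accumulated step by step. A minimal sketch; the two-state chain and all its numbers are invented for the example, not taken from the slides:

```python
import numpy as np

# Hypothetical 2-state chain, for illustration only (not from the slides).
P0 = np.array([0.5, 0.5])            # initial distribution P0(X0)
W = np.array([[0.9, 0.1],            # W[a, b] = P(X_{n+1} = b | X_n = a)
              [0.4, 0.6]])

def path_probability(path, P0, W):
    """P0(a) W(a->b) W(b->c) ... for the path a, b, c, ..."""
    p = P0[path[0]]
    for a, b in zip(path, path[1:]):
        p *= W[a, b]
    return p

print(path_probability([0, 0, 1], P0, W))   # 0.5 * 0.9 * 0.1 = 0.045
```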

Properties of Transition Matrix Since W(x->y) = P(y|x) is a conditional probability, we must have W(x->y) ≥ 0. Since the probability of going anywhere is 1, ∑y W(x->y) = 1. Matrices with these properties are known as stochastic matrices.

Evolution Given the current distribution Pn(x), the distribution at the next step, n+1, is obtained from Pn+1(y) = ∑x Pn(x) W(x->y). In matrix form, this is Pn+1 = Pn W, where Pn is a row vector. Why is this so? These equations follow from the definition of conditional probability, P(A,B) = P(B) P(A|B), and the marginal probability P(A) = ∑B P(A,B).
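
In code, one evolution step is simply a row-vector-times-matrix product. A sketch, reusing the same hypothetical two-state W as above:

```python
import numpy as np

# Same illustrative 2-state chain as before (invented numbers).
W = np.array([[0.9, 0.1],
              [0.4, 0.6]])
Pn = np.array([1.0, 0.0])   # current distribution P_n

# P_{n+1}(y) = sum_x P_n(x) W(x->y): a row vector times the matrix.
Pn1 = Pn @ W
print(Pn1)                  # [0.9, 0.1]
```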

Chapman-Kolmogorov Equation We note that the conditional probability of the state after k steps is P(Xk=b|X0=a) = [Wk]ab. We have [Wk+s]ab = ∑c [Wk]ac [Ws]cb, which, in matrix notation, is Wk+s = Wk Ws. The subscript of X is the step or time; [W]ab means the (a,b) element of the matrix W. Andrei Nikolaevich Kolmogorov (1903–1987) was a Russian mathematician best known for his work on probability theory and turbulence.
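
A quick numerical check of the Chapman-Kolmogorov relation Wk+s = Wk Ws, again with the hypothetical two-state W:

```python
import numpy as np

W = np.array([[0.9, 0.1],
              [0.4, 0.6]])
k, s = 2, 3

lhs = np.linalg.matrix_power(W, k + s)   # W^(k+s)
rhs = np.linalg.matrix_power(W, k) @ np.linalg.matrix_power(W, s)
print(np.allclose(lhs, rhs))             # True
```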

Probability Distribution of States at Step n Given the probability distribution P0 initially at n = 0, the distribution at step n is Pn = P0 Wn (the n-th matrix power of W).

Example: Random Walker A drunken walker walks in discrete steps. In each step, he has probability ½ of stepping to the right and probability ½ of stepping to the left. He does not remember his previous steps. Picture from http://spaceplace.jpl.nasa.gov/walker.gif What is the variable X? What is the transition matrix W? At time t=0 the walker is at the origin; what is the probability that he makes a left-left-right move?
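
A small simulation sketch of the walker's first three steps. Since the steps are independent with probability ½ each, the estimated left-left-right probability should approach (1/2)³ = 1/8; the trial count and seed are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
trials = 100_000

# Each step is -1 (left) or +1 (right) with probability 1/2 each.
steps = rng.choice([-1, 1], size=(trials, 3))

# A left-left-right move is the exact sequence (-1, -1, +1).
hits = np.all(steps == np.array([-1, -1, 1]), axis=1).mean()
print(hits)   # close to (1/2)**3 = 0.125
```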

The Questions Under what conditions is Pn(X) independent of the time (or step) n and of the initial condition P0, and does it approach a limit P(X)? Given W(X->X’), how do we compute P(X)? Given P(X), how do we construct W(X->X’)?

Some Definitions: Recurrence and Transience A state i is recurrent if we visit it an infinite number of times as n -> ∞: P(Xn = i for infinitely many n) = 1. A transient state j is visited only a finite number of times as n -> ∞.

Irreducible From any state i to any other state j, there is a nonzero probability of going from i to j after some n steps, i.e., [Wn]ij > 0 for some n.
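
One way to test irreducibility numerically is the standard fact that an m-state chain is irreducible iff every entry of (I + W)^(m-1) is positive (i.e., every state can reach every other). A sketch; the three-state matrix below is an invented reducible example:

```python
import numpy as np

def is_irreducible(W):
    """m-state chain is irreducible iff (I + W)^(m-1) has all positive entries."""
    m = W.shape[0]
    M = np.linalg.matrix_power(np.eye(m) + W, m - 1)
    return bool(np.all(M > 0))

# Reducible example: state 2 is absorbing, so nothing returns from it.
W = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.0, 0.0, 1.0]])
print(is_irreducible(W))   # False
```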

Absorbing State An absorbing state, once entered, cannot be left. A closed subset: once the chain is in the set, there is no escape from the set.

Example 1 2 4 5 3 {1,5} is closed, {3} is closed/absorbing. 1/2 1/2 1/4 1/2 1/4 4 5 1/4 3 {1,5} is closed, {3} is closed/absorbing. It is not irreducible.

Aperiodic State A state i is called aperiodic if [Wn]ii > 0 for all sufficiently large n. This means that the probability for state i to return to i after n steps is nonzero for all n > nmax. A periodic state, by contrast, can return to itself only at multiples of some period p > 1.
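
The period of a state can be computed as the gcd of the return times n with [Wn]ii > 0; a state is aperiodic iff its period is 1. A sketch (the n_max cutoff and the two-state example are illustrative):

```python
import numpy as np
from math import gcd
from functools import reduce

def period(W, i, n_max=50):
    """Period of state i: gcd of all n <= n_max with [W^n]_{ii} > 0."""
    returns = []
    Wn = np.eye(W.shape[0])
    for n in range(1, n_max + 1):
        Wn = Wn @ W
        if Wn[i, i] > 0:
            returns.append(n)
    return reduce(gcd, returns) if returns else 0

# Deterministic 2-cycle: the chain returns to a state only every 2 steps.
W = np.array([[0.0, 1.0],
              [1.0, 0.0]])
print(period(W, 0))   # 2, so state 0 is periodic
```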

Invariant or Equilibrium Distribution If P(x) = ∑x' P(x') W(x'->x), i.e., P = P W, we say that the probability distribution P(x) is invariant with respect to the transition matrix W(x->x’).

Convergence to Equilibrium Let W be irreducible and aperiodic, and suppose that W has an invariant distribution p. Then for any initial distribution, P(Xn=j) -> pj as n -> ∞, for all j. This theorem tells us when to expect a unique limiting distribution.

Limit Distribution One also has independent of the initial state i, such that P = P W, [P]j = pj.

Condition for Approaching Equilibrium The irreducible and aperiodic conditions can be combined to mean: for all states j and k, [Wn]jk > 0 for sufficiently large n. Such a chain is also referred to as ergodic. See the book of J. R. Norris for proofs of all the theorems quoted.

Urn Example There are two urns. Urn A holds two balls, urn B holds three balls. One draws a ball from each urn and switches them. There are two white balls and three red balls in total. What are the states, the transition matrix W, and the equilibrium distribution P? Example taken from L. E. Reichl, “A Modern Course in Statistical Physics”, Edward Arnold (1980), pages 164–165.

The Transition Matrix 1 3 2 Note that elements of W2 are all positive. 1/6 1 1/3 3 2 2/3 Note that elements of W2 are all positive. W thus is irreducible and ergodic.

Eigenvalue Problem Determining P is a (left) eigenvalue problem: P = P W. The solution is P1 = 1/10, P2 = 6/10, P3 = 3/10. What is the physical meaning of these numbers?
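
Numerically, P is the left eigenvector of W with eigenvalue 1, equivalently a right eigenvector of WT; a sketch:

```python
import numpy as np

W = np.array([[0,   1,   0  ],
              [1/6, 1/2, 1/3],
              [0,   2/3, 1/3]])

# P = P W means P is a left eigenvector of W with eigenvalue 1,
# i.e. a right eigenvector of W^T with eigenvalue 1.
vals, vecs = np.linalg.eig(W.T)
P = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
P /= P.sum()            # normalize to a probability distribution
print(P)                # [0.1, 0.6, 0.3]
```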

Convergence to Equilibrium Distribution Let P0 = (1, 0, 0)
P1 = P0 W = (0, 1, 0)
P2 = P1 W = P0 W2 = (1/6, 1/2, 1/3)
P3 = P2 W = P0 W3 = (1/12, 23/36, 5/18)
P4 = P3 W = P0 W4 ≈ (0.106, 0.588, 0.306)
P5 = P4 W = P0 W5 ≈ (0.098, 0.604, 0.298)
. . .
P0 W∞ = (0.1, 0.6, 0.3)

Time Reversal Suppose X0, X1, …, XN is a Markov chain with an (irreducible) transition matrix W(X->X’) and equilibrium distribution P(X). What transition probability would produce the time-reversed process Y0 = XN, Y1 = XN-1, …, YN = X0?

Answer The new matrix WR should satisfy P(x) WR(x->x’) = P(x’) W(x’->x). (*) The original process, P(x0, x1, …, xN) = P(x0) W(x0->x1) W(x1->x2) … W(xN-1->xN), must equal the reversed process, P(xN, xN-1, …, x0) = P(xN) WR(xN->xN-1) WR(xN-1->xN-2) … WR(x1->x0). Equation (*) guarantees this.

Reversible Markov Chain A Markov chain is said to be reversible if it satisfies detailed balance: P(X) W(X->Y) = P(Y) W(Y->X). Nearly all Markov chains used in the Monte Carlo method satisfy this condition by construction. In a reversible Markov chain, WR = W: one cannot statistically distinguish a chain running forward from one running backward.

An example of a chain that does not satisfy detailed balance 1 2/3 2/3 1/3 1/3 1/3 3 2 Equilibrium distribution is P=(1/3,1/3,1/3). The reverse chain has transition matrix WR = WT (transpose of W). WR ≠ W. 2/3 Example taking from J R Morris, “Markov Chains”, page 48-49. Is the urns example reversible Markov chain?

Realization of Samples in Monte Carlo and Markov Chain Theory A Monte Carlo sampling does not deal with the probability P(X) directly; rather, the samples, considered over many realizations, follow that distribution. Monte Carlo generates the next sample y from the current sample x using the transition probability W(x->y).
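
A sketch of such a realization: sample a long trajectory of the urn chain using only W(x->y), and compare the visit frequencies with the equilibrium distribution (the trajectory length and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)

# Urn chain again; we never use P(X) directly, only W(x -> y).
W = np.array([[0,   1,   0  ],
              [1/6, 1/2, 1/3],
              [0,   2/3, 1/3]])

x = 0                                   # arbitrary starting state
counts = np.zeros(3)
for _ in range(200_000):
    x = rng.choice(3, p=W[x])           # draw the next state from W(x -> .)
    counts[x] += 1

print(counts / counts.sum())            # approaches (0.1, 0.6, 0.3)
```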