1
Markov Chains and Mixing Times
By Levin, Peres and Wilmer. Chapter 4: Introduction to Markov Chain Mixing, Sections , pp. 47-61. Presented by Dani Dorfman
2
Planned topics
Total Variation Distance
Coupling
The Convergence Theorem
Measuring Distance from Stationarity
3
Total Variation Distance
4
Definition
Given two distributions $\mu, \nu$ on $\Omega$, we define the total variation distance to be:
$$\|\mu - \nu\|_{TV} = \max_{A \subseteq \Omega} |\mu(A) - \nu(A)|$$
5
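Example
Coin tossing frog.
$$P = \begin{pmatrix} 1-p & p \\ q & 1-q \end{pmatrix}, \qquad \pi = \left( \frac{q}{p+q}, \frac{p}{p+q} \right)$$
(Figure: the two states $w$ and $e$, with jump probabilities $p$ and $q$.)
Define $\mu_0 = (1, 0)$, $\mu_t = \mu_0 P^t$ and $\Delta_t = \mu_t(e) - \pi(e)$ $(= \pi(w) - \mu_t(w))$. An easy computation shows:
$$\|\mu_t - \pi\|_{TV} = |\Delta_t| = |1-p-q|^t \, |\Delta_0|$$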
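The closed form for the frog chain can be checked numerically. The following is a minimal sketch, assuming the illustrative values `p = 0.3` and `q = 0.2` (not from the slides) and computing the total variation distance as half the $\ell^1$ distance; the helper names `step` and `tv` are hypothetical.

```python
def step(mu, P):
    """Advance a distribution one step: the row vector mu times the matrix P."""
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]

def tv(mu, nu):
    """Total variation distance, computed as half the l1 distance."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))

# Illustrative parameters (assumed, not taken from the slides).
p, q = 0.3, 0.2
P = [[1 - p, p], [q, 1 - q]]           # index 0 = e, index 1 = w
pi = [q / (p + q), p / (p + q)]        # stationary distribution

mu = [1.0, 0.0]                         # mu_0: the frog starts at e
delta0 = mu[0] - pi[0]
observed, predicted = [], []
for t in range(10):
    observed.append(tv(mu, pi))                              # ||mu_t - pi||_TV
    predicted.append(abs(1 - p - q) ** t * abs(delta0))      # |1-p-q|^t |Delta_0|
    mu = step(mu, P)
```

The two lists `observed` and `predicted` should agree up to floating-point error.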
6
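"An Easy Computation"
Induction on $t$.
$t = 0$: $\Delta_0 = (1-p-q)^0 \Delta_0$.
$t \to t+1$:
$$\begin{aligned}
\Delta_{t+1} &= \mu_{t+1}(e) - \pi(e) \\
&= (1-p)\,\mu_t(e) + q\,(1 - \mu_t(e)) - \pi(e) \\
&= (1-p-q)\,\mu_t(e) + q - \pi(e) \\
&= (1-p-q)\,\mu_t(e) + q - \frac{q}{p+q} \\
&= (1-p-q)\,\mu_t(e) - (1-p-q)\,\pi(e) \\
&= (1-p-q)\,\Delta_t
\end{aligned}$$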
7
Proposition 4.2
Let $\mu$ and $\nu$ be two probability distributions on $\Omega$. Then:
$$\|\mu - \nu\|_{TV} = \frac{1}{2} \sum_{x \in \Omega} |\mu(x) - \nu(x)|$$
(Figure: $\mu$ and $\nu$ with regions I, II and III and the sets $B$, $B^C$.)
8
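Proof
Define $B = \{x \mid \mu(x) \ge \nu(x)\}$ and let $A \subseteq \Omega$ be an event. Clearly:
$$\mu(A) - \nu(A) \le \mu(A \cap B) - \nu(A \cap B) \le \mu(B) - \nu(B)$$
A parallel argument gives:
$$\nu(A) - \mu(A) \le \nu(A \cap B^C) - \mu(A \cap B^C) \le \nu(B^C) - \mu(B^C)$$
Note that both upper bounds are equal, because $\mu(B) + \mu(B^C) = \nu(B) + \nu(B^C) = 1$. Taking $A = B$ achieves the upper bound, therefore:
$$\|\mu - \nu\|_{TV} = \mu(B) - \nu(B) = \frac{1}{2}\left[ \mu(B) - \nu(B) + \nu(B^C) - \mu(B^C) \right] = \frac{1}{2} \sum_{x \in \Omega} |\mu(x) - \nu(x)|$$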
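Proposition 4.2 can be sanity-checked on a small state space by comparing the brute-force maximum over all events with the half-$\ell^1$ formula. This is a minimal sketch; the distributions `mu` and `nu` and the function names are made up for illustration.

```python
from itertools import combinations

def tv_max_over_events(mu, nu):
    """Definition: maximize |mu(A) - nu(A)| over every subset A of the state space."""
    n = len(mu)
    best = 0.0
    for size in range(n + 1):
        for A in combinations(range(n), size):
            gap = abs(sum(mu[x] for x in A) - sum(nu[x] for x in A))
            best = max(best, gap)
    return best

def tv_half_l1(mu, nu):
    """Proposition 4.2: half the sum of pointwise absolute differences."""
    return 0.5 * sum(abs(m - v) for m, v in zip(mu, nu))

# Illustrative distributions on a 4-point space (assumed values).
mu = [0.5, 0.3, 0.2, 0.0]
nu = [0.25, 0.25, 0.25, 0.25]

by_definition = tv_max_over_events(mu, nu)
by_formula = tv_half_l1(mu, nu)
```

The brute-force version enumerates $2^{|\Omega|}$ events, so it is only practical for tiny examples; the formula is linear in $|\Omega|$.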
9
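Remarks
From the last proof we easily deduce:
$$\|\mu - \nu\|_{TV} = \sum_{x \in \Omega,\ \mu(x) \ge \nu(x)} [\mu(x) - \nu(x)]$$
Notice that the TV distance is half the $\ell^1$ distance, and therefore it satisfies the triangle inequality:
$$\|\mu - \nu\|_{TV} \le \|\mu - \eta\|_{TV} + \|\eta - \nu\|_{TV}$$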
10
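Proposition 4.5
Let $\mu$ and $\nu$ be two probability distributions on $\Omega$. Then:
$$\|\mu - \nu\|_{TV} = \frac{1}{2} \sup \left\{ \sum_{x \in \Omega} f(x)\mu(x) - \sum_{x \in \Omega} f(x)\nu(x) \ :\ \max_{x \in \Omega} |f(x)| \le 1 \right\}$$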
11
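Proof
Clearly the following function achieves the supremum:
$$f^*(x) = \begin{cases} 1 & \mu(x) - \nu(x) \ge 0 \\ -1 & \mu(x) - \nu(x) < 0 \end{cases}$$
Therefore:
$$\begin{aligned}
\frac{1}{2} \sum_{x \in \Omega} \left[ f^*(x)\mu(x) - f^*(x)\nu(x) \right]
&= \frac{1}{2} \left[ \sum_{x \in \Omega,\ \mu(x) - \nu(x) \ge 0} [\mu(x) - \nu(x)] + \sum_{x \in \Omega,\ \mu(x) - \nu(x) < 0} [\nu(x) - \mu(x)] \right] \\
&= \frac{1}{2} \left[ \|\mu - \nu\|_{TV} + \|\mu - \nu\|_{TV} \right] = \|\mu - \nu\|_{TV}
\end{aligned}$$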
12
Coupling & Total Variation
13
Definition
A coupling of two probability distributions $\mu, \nu$ is a pair of random variables $(X, Y)$ s.t. $\mathbb{P}(X = x) = \mu(x)$ and $\mathbb{P}(Y = x) = \nu(x)$ for every $x$. Given a coupling $(X, Y)$ of $\mu, \nu$ one can define $q(x, y) = \mathbb{P}(X = x, Y = y)$, which represents the joint distribution of $(X, Y)$. Thus:
$$\mu(x) = \sum_{y \in \Omega} q(x, y), \qquad \nu(y) = \sum_{x \in \Omega} q(x, y)$$
14
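Example
$\mu, \nu$ both represent a fair coin flip. We can build several couplings:
$(X, Y)$ s.t. $\forall x, y$: $\mathbb{P}(X = x, Y = y) = \frac{1}{4}$. Then
$$q = \begin{pmatrix} 1/4 & 1/4 \\ 1/4 & 1/4 \end{pmatrix}, \qquad \mathbb{P}(X \ne Y) = \frac{1}{2}$$
$(X, Y)$ s.t. $X = Y$ and $\forall x$: $\mathbb{P}(X = Y = x) = \frac{1}{2}$. Then
$$q = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}, \qquad \mathbb{P}(X \ne Y) = 0$$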
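Both couplings of the fair coin can be written as joint matrices and checked directly. A minimal sketch; the helper names `marginals` and `prob_differ` are hypothetical.

```python
def marginals(q):
    """Return (row sums, column sums) of a joint matrix q, i.e. the laws of X and Y."""
    rows = [sum(row) for row in q]
    cols = [sum(q[x][y] for x in range(len(q))) for y in range(len(q[0]))]
    return rows, cols

def prob_differ(q):
    """P(X != Y): the total mass off the diagonal of the joint matrix."""
    return sum(q[x][y] for x in range(len(q)) for y in range(len(q[0])) if x != y)

# Two couplings of a fair coin with itself.
independent = [[0.25, 0.25], [0.25, 0.25]]   # X and Y flipped independently
identical = [[0.5, 0.0], [0.0, 0.5]]         # X = Y always

fair = [0.5, 0.5]
independent_ok = marginals(independent) == (fair, fair)
identical_ok = marginals(identical) == (fair, fair)
```

Both matrices have the same marginals, yet `prob_differ` differs between them, which is why the choice of coupling matters.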
15
Proposition 4.7
Let $\mu$ and $\nu$ be two probability distributions on $\Omega$. Then:
$$\|\mu - \nu\|_{TV} = \inf \left\{ \mathbb{P}(X \ne Y) \ :\ (X, Y) \text{ is a coupling of } \mu \text{ and } \nu \right\}$$
16
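Proof
In order to show $\|\mu - \nu\|_{TV} \le \inf_{(X,Y)} \mathbb{P}(X \ne Y)$, note that for every coupling $(X, Y)$ and every $A \subseteq \Omega$:
$$\mu(A) - \nu(A) = \mathbb{P}(X \in A) - \mathbb{P}(Y \in A) \le \mathbb{P}(X \in A, Y \notin A) \le \mathbb{P}(X \ne Y)$$
Thus it suffices to find a coupling $(X, Y)$ s.t. $\mathbb{P}(X \ne Y) = \|\mu - \nu\|_{TV}$.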
17
Proof Cont.
(Figure: regions I, II and III.)
18
Proof Cont.
Define the coupling $(X, Y)$ as follows:
With probability $p = 1 - \|\mu - \nu\|_{TV}$ take $X = Y$ according to the distribution $\gamma_{III}$. Otherwise take $X$ from $B = \{x \mid \mu(x) - \nu(x) > 0\}$ and $Y$ from $B^C$, according to the distributions $\gamma_I$ and $\gamma_{II}$ respectively. Clearly:
$$\mathbb{P}(X \ne Y) = \|\mu - \nu\|_{TV}$$
19
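Proof Cont.
All that is left is to define $\gamma_I, \gamma_{II}, \gamma_{III}$:
$$\gamma_I(x) = \begin{cases} \dfrac{\mu(x) - \nu(x)}{\|\mu - \nu\|_{TV}} & \mu(x) - \nu(x) > 0 \\ 0 & \text{else} \end{cases}$$
$$\gamma_{II}(x) = \begin{cases} \dfrac{\nu(x) - \mu(x)}{\|\mu - \nu\|_{TV}} & \mu(x) - \nu(x) \le 0 \\ 0 & \text{else} \end{cases}$$
$$\gamma_{III}(x) = \frac{\min\{\mu(x), \nu(x)\}}{1 - \|\mu - \nu\|_{TV}}$$
Note that:
$$\mu = p\,\gamma_{III} + (1-p)\,\gamma_I, \qquad \nu = p\,\gamma_{III} + (1-p)\,\gamma_{II}$$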
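The construction can be turned into an explicit joint matrix. The sketch below assumes that, in the "otherwise" branch, $X$ and $Y$ are drawn independently from $\gamma_I$ and $\gamma_{II}$; the function name `optimal_coupling` and the example distributions are made up for illustration.

```python
def optimal_coupling(mu, nu):
    """Joint matrix q(x, y) of the coupling built from gamma_I, gamma_II, gamma_III."""
    n = len(mu)
    tv = 0.5 * sum(abs(m - v) for m, v in zip(mu, nu))
    q = [[0.0] * n for _ in range(n)]
    for x in range(n):
        # Diagonal mass: (1 - tv) * gamma_III(x) = min(mu(x), nu(x)).
        q[x][x] = min(mu[x], nu[x])
    if tv > 0:
        for x in range(n):
            for y in range(n):
                excess_mu = max(mu[x] - nu[x], 0.0)   # equals tv * gamma_I(x)
                excess_nu = max(nu[y] - mu[y], 0.0)   # equals tv * gamma_II(y)
                # Off-diagonal mass: tv * gamma_I(x) * gamma_II(y).
                q[x][y] += excess_mu * excess_nu / tv
    return q, tv

# Illustrative distributions (assumed values).
mu = [0.5, 0.3, 0.2]
nu = [0.2, 0.3, 0.5]
q, tv_value = optimal_coupling(mu, nu)

row_sums = [sum(row) for row in q]                                   # law of X
col_sums = [sum(q[x][y] for x in range(3)) for y in range(3)]        # law of Y
off_diagonal = sum(q[x][y] for x in range(3) for y in range(3) if x != y)
```

Here `row_sums` and `col_sums` should reproduce `mu` and `nu`, and `off_diagonal`, which is $\mathbb{P}(X \ne Y)$, should equal `tv_value`.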
20
The Convergence Theorem
21
Theorem 4.9
Suppose that $P$ is irreducible and aperiodic, with stationary distribution $\pi$. Then $\exists \alpha \in (0, 1),\ C > 0$ s.t.:
$$\forall t: \quad \max_{x \in \Omega} \|P^t(x, \cdot) - \pi\|_{TV} < C\alpha^t$$
22
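Lemma (Prop. 1.7)
If $P$ is irreducible and aperiodic, then $\exists r > 0$ s.t. $\forall x, y$: $P^r(x, y) > 0$.
Proof: For every $x$ define $T(x) = \{t \mid P^t(x, x) > 0\}$; then $\gcd T(x) = 1$ by aperiodicity, and $T(x)$ is closed under addition. From number theory: $\forall x\ \exists r_x\ \forall r > r_x$: $r \in T(x)$. From irreducibility: $\forall x, y\ \exists r_{x,y} < n$ s.t. $P^{r_{x,y}}(x, y) > 0$, where $n = |\Omega|$. Take $r = n + \max_{x \in \Omega} r_x$. Then $r - r_{x,y} > r_x$, so $P^r(x, y) \ge P^{r - r_{x,y}}(x, x)\, P^{r_{x,y}}(x, y) > 0$, which ends the proof.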
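For a concrete chain, an exponent $r$ with $P^r > 0$ entrywise can be found by brute force. This is a minimal sketch, not the argument of the lemma: it simply multiplies powers until every entry is positive. The 3-state chain and the cap `max_power` are illustrative assumptions.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def first_positive_power(P, max_power=100):
    """Smallest r <= max_power with every entry of P^r positive, else None."""
    M = P
    for r in range(1, max_power + 1):
        if all(entry > 0 for row in M for entry in row):
            return r
        M = matmul(M, P)
    return None

# Irreducible and aperiodic: it contains a cycle of length 3 and one of length 2.
P = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.5, 0.5, 0.0]]

# Irreducible but periodic (period 2): no power is entrywise positive.
flip = [[0.0, 1.0],
        [1.0, 0.0]]

r = first_positive_power(P)
r_flip = first_positive_power(flip)
```

For the periodic chain `flip` the search returns `None`, which illustrates why aperiodicity is needed.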
23
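Proof of Theorem 4.9
The last lemma gives us the existence of $r$ s.t. $\forall x, y$: $P^r(x, y) > 0$. Let $\Pi$ be the matrix with $|\Omega|$ rows, each of which is $\pi$. Then $\exists \delta > 0$ s.t. $\forall x, y \in \Omega$: $P^r(x, y) \ge \delta\pi(y) = \delta\Pi(x, y)$. Let $Q$ be the stochastic matrix defined by the equation:
$$P^r = (1 - \theta)\Pi + \theta Q, \qquad \theta = 1 - \delta$$
Clearly $P\Pi = \Pi P = \Pi$, and likewise $Q\Pi = \Pi$ since $Q$ is stochastic. By induction one can see:
$$\forall k: \quad P^{rk} = (1 - \theta^k)\Pi + \theta^k Q^k$$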
24
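Proof of Induction
The case $k = 1$ holds by definition.
$k \to k+1$:
$$\begin{aligned}
P^{r(k+1)} &= P^{rk} P^r \\
&= \left[ (1 - \theta^k)\Pi + \theta^k Q^k \right] P^r \\
&= (1 - \theta^k)\Pi + \theta^k Q^k P^r \\
&= (1 - \theta^k)\Pi + \theta^k Q^k \left[ (1 - \theta)\Pi + \theta Q \right] \\
&= (1 - \theta^k)\Pi + \theta^k (1 - \theta) Q^k \Pi + \theta^{k+1} Q^{k+1} \\
&= (1 - \theta^k)\Pi + \theta^k (1 - \theta)\Pi + \theta^{k+1} Q^{k+1} \\
&= (1 - \theta^{k+1})\Pi + \theta^{k+1} Q^{k+1}
\end{aligned}$$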
25
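Proof of Theorem 4.9 Cont.
The induction gives:
$$P^{rk+j} = P^{rk} P^j = \left[ (1 - \theta^k)\Pi + \theta^k Q^k \right] P^j$$
Therefore, since $\Pi P^j = \Pi$:
$$\forall j: \quad P^{rk+j} - \Pi = \theta^k \left( Q^k P^j - \Pi \right)$$
Row $x$ of $Q^k P^j$ is a probability distribution, and the TV distance between two distributions is at most 1. Finally:
$$\forall x: \quad \|P^{rk+j}(x, \cdot) - \pi\|_{TV} \le \theta^k$$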
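The decomposition and the resulting bound can be checked numerically on a small chain. The sketch below is an illustration under assumptions: the 3-state chain, its stationary distribution `pi = (0.2, 0.4, 0.4)` and the exponent `r = 5` are chosen for the example, and $\delta$ is taken as the largest admissible value $\min_{x,y} P^r(x,y)/\pi(y)$.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(A, k):
    """A raised to the non-negative integer power k."""
    n = len(A)
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        M = matmul(M, A)
    return M

def tv(mu, nu):
    """Total variation distance as half the l1 distance."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))

# Illustrative chain (assumed): pi P = pi holds for pi = (0.2, 0.4, 0.4),
# and P^5 is the first entrywise-positive power.
P = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.5, 0.5, 0.0]]
pi = [0.2, 0.4, 0.4]
r, n = 5, 3

Pr = matpow(P, r)
delta = min(Pr[x][y] / pi[y] for x in range(n) for y in range(n))
theta = 1 - delta
# Q is defined by P^r = (1 - theta) * Pi + theta * Q.
Q = [[(Pr[x][y] - delta * pi[y]) / theta for y in range(n)] for x in range(n)]

def max_tv_to_pi(k):
    """max over x of || P^{rk}(x, .) - pi ||_TV."""
    return max(tv(row, pi) for row in matpow(P, r * k))

def decomposition_error(k):
    """Largest entrywise gap between P^{rk} and (1 - theta^k) Pi + theta^k Q^k."""
    lhs, Qk = matpow(P, r * k), matpow(Q, k)
    return max(abs(lhs[x][y] - ((1 - theta ** k) * pi[y] + theta ** k * Qk[x][y]))
               for x in range(n) for y in range(n))
```

For each `k`, `max_tv_to_pi(k)` should stay below `theta ** k` and `decomposition_error(k)` should be zero up to rounding.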
26
Standardizing Distance From Stationarity
27
Definitions
Given a stochastic matrix $P$ with its stationary distribution $\pi$, we define:
$$d(t) = \max_{x \in \Omega} \|P^t(x, \cdot) - \pi\|_{TV}$$
$$\bar{d}(t) = \max_{x, y \in \Omega} \|P^t(x, \cdot) - P^t(y, \cdot)\|_{TV}$$
28
Lemma 4.11
For every stochastic matrix $P$ and its stationary distribution $\pi$:
$$d(t) \le \bar{d}(t) \le 2d(t)$$
Proof: The second inequality follows from the triangle inequality. For the first, note that by stationarity:
$$\pi(A) = \sum_{y \in \Omega} \pi(y) P^t(y, A)$$
29
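Proof Cont.
$$\begin{aligned}
\|P^t(x, \cdot) - \pi\|_{TV} &= \max_{A \subseteq \Omega} |P^t(x, A) - \pi(A)| \\
&= \max_{A \subseteq \Omega} \left| \sum_{y \in \Omega} \pi(y) \left[ P^t(x, A) - P^t(y, A) \right] \right| \\
&\le \max_{A \subseteq \Omega} \sum_{y \in \Omega} \pi(y) \left| P^t(x, A) - P^t(y, A) \right| \\
&\le \sum_{y \in \Omega} \pi(y) \max_{A \subseteq \Omega} \left| P^t(x, A) - P^t(y, A) \right| \\
&= \sum_{y \in \Omega} \pi(y) \|P^t(x, \cdot) - P^t(y, \cdot)\|_{TV} \\
&\le \sum_{y \in \Omega} \pi(y)\, \bar{d}(t) = \bar{d}(t)
\end{aligned}$$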
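The two-sided bound $d(t) \le \bar{d}(t) \le 2d(t)$ can be tabulated for a small chain. A minimal sketch; the 3-state chain and its stationary distribution `pi = (0.2, 0.4, 0.4)` are illustrative assumptions, and the helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def tv(mu, nu):
    """Total variation distance as half the l1 distance."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))

def d(Pt, pi):
    """d(t): worst-case distance from a row of P^t to pi."""
    return max(tv(row, pi) for row in Pt)

def d_bar(Pt):
    """d-bar(t): worst-case distance between two rows of P^t."""
    return max(tv(a, b) for a in Pt for b in Pt)

# Illustrative chain (assumed); pi satisfies pi P = pi.
P = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.5, 0.5, 0.0]]
pi = [0.2, 0.4, 0.4]

table = []                     # entries (t, d(t), d_bar(t))
Pt = P
for t in range(1, 11):
    table.append((t, d(Pt, pi), d_bar(Pt)))
    Pt = matmul(Pt, P)
```

Each row of `table` should satisfy the lemma up to floating-point error.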
30
Observations
$$d(t) = \max_{\mu} \|\mu P^t - \pi\|_{TV}$$
$$\bar{d}(t) = \max_{\mu, \nu} \|\mu P^t - \nu P^t\|_{TV}$$
31
Lemma 4.12
The function $\bar{d}$ is submultiplicative, i.e. $\forall s, t$: $\bar{d}(s + t) \le \bar{d}(s)\,\bar{d}(t)$.
Proof: Fix $x, y \in \Omega$ and let $(X_s, Y_s)$ be the optimal coupling of $P^s(x, \cdot)$ and $P^s(y, \cdot)$. Note that:
$$P^{t+s}(x, w) = (P^s P^t)(x, w) = \sum_{z \in \Omega} P^s(x, z) P^t(z, w) = \mathbb{E}\left[ P^t(X_s, w) \right]$$
The same argument gives us: $P^{t+s}(y, w) = \mathbb{E}\left[ P^t(Y_s, w) \right]$.
32
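Proof Cont.
Note:
$$P^{t+s}(x, w) - P^{t+s}(y, w) = \mathbb{E}\left[ P^t(X_s, w) \right] - \mathbb{E}\left[ P^t(Y_s, w) \right]$$
Summing over all $w$ yields:
$$\begin{aligned}
\|P^{t+s}(x, \cdot) - P^{t+s}(y, \cdot)\|_{TV} &= \frac{1}{2} \sum_{w \in \Omega} \left| \mathbb{E}\left[ P^t(X_s, w) - P^t(Y_s, w) \right] \right| \\
&\le \mathbb{E}\left[ \frac{1}{2} \sum_{w \in \Omega} \left| P^t(X_s, w) - P^t(Y_s, w) \right| \right] \\
&\le \bar{d}(t)\, \mathbb{P}(X_s \ne Y_s) \\
&\le \bar{d}(t)\, \bar{d}(s)
\end{aligned}$$
The second inequality holds because the inner sum vanishes when $X_s = Y_s$ and is at most $\bar{d}(t)$ otherwise; the last holds because the coupling is optimal, so $\mathbb{P}(X_s \ne Y_s) = \|P^s(x, \cdot) - P^s(y, \cdot)\|_{TV} \le \bar{d}(s)$.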
33
Remarks
From submultiplicativity, and since $\bar{d}(s) \le 1$, we note that $\bar{d}(t)$ is non-increasing.
Also, for every positive integer $c$:
$$d(ct) \le \bar{d}(ct) \le \bar{d}(t)^c$$
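Submultiplicativity and monotonicity of $\bar{d}$ can be checked empirically on a small chain. A minimal sketch; the 3-state chain, the horizon `T = 12` and the tolerance `tol` are illustrative assumptions.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def tv(mu, nu):
    """Total variation distance as half the l1 distance."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))

def d_bar(Pt):
    """d-bar(t): worst-case distance between two rows of P^t."""
    return max(tv(a, b) for a in Pt for b in Pt)

# Illustrative irreducible, aperiodic chain (assumed).
P = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.5, 0.5, 0.0]]
T = 12

values = {}                    # values[t] = d_bar(t) for t = 1..T
Pt = P
for t in range(1, T + 1):
    values[t] = d_bar(Pt)
    Pt = matmul(Pt, P)

tol = 1e-12                    # slack for floating-point rounding
# Pairs (s, t) where d_bar(s + t) <= d_bar(s) * d_bar(t) would fail.
violations = [(s, t) for s in range(1, T) for t in range(1, T - s + 1)
              if values[s + t] > values[s] * values[t] + tol]
non_increasing = all(values[t + 1] <= values[t] + tol for t in range(1, T))
```

An empty `violations` list is consistent with Lemma 4.12; it is a numerical check on one chain, not a proof.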
34
Thank you for your attention!