Coupling. Seminar on Markov Chains and Mixing Times, Fall 16/17. Jay Tenenbaum
Reminders In case you forgot…
Definition - Total variation distance Let μ and ν be two probability distributions on Ω. The total variation distance between μ and ν is given by: ‖μ − ν‖_TV = max_{A⊆Ω} |μ(A) − ν(A)|. [Figure: densities of μ and ν; with overlap area γ, the regions satisfy 1 − γ = α = β = ‖μ − ν‖_TV.]
Definition - Total variation distance Proposition: ‖μ − ν‖_TV = max_{A⊆Ω} |μ(A) − ν(A)| = (1/2) Σ_{x∈Ω} |μ(x) − ν(x)|
Definition - Coupling A coupling of two probability distributions μ and ν (both over Ω) is a pair of random variables (X, Y) defined over Ω × Ω such that: the marginal distribution of X is μ [P(X = x) = Σ_{y∈Ω} P(x, y) = μ(x)]; the marginal distribution of Y is ν [P(Y = y) = Σ_{x∈Ω} P(x, y) = ν(y)].
Proposition: Let μ and ν be two probability distributions on Ω. Then: ‖μ − ν‖_TV = inf { P(X ≠ Y) : (X, Y) a coupling of μ and ν }
Definition – Distance to Stationarity From now on we assume that P is an ergodic (aperiodic, irreducible) Markov chain with stationary distribution π. Bounding the maximal distance (over starting states x_0 ∈ Ω) between P^t(x_0, ·) and π is among our primary objectives. It is therefore convenient to define: d(t) := max_{x∈Ω} ‖P^t(x, ·) − π‖_TV
Definition – Mixing Time It is useful to introduce a parameter measuring the time required by a Markov chain for the distance to stationarity to become small. The mixing time is defined by: t_mix(ε) := min { t : d(t) ≤ ε }. We standardize: t_mix := t_mix(1/4).
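To make these definitions concrete, here is a short sketch (an illustrative addition, not part of the original slides) that computes d(t) exactly for a small chain by powering its transition matrix; the chain used, the lazy random walk on an 8-cycle, is an assumed example chosen because it appears later in the talk.

```python
# Illustrative sketch: compute d(t) = max_x ||P^t(x,.) - pi||_TV exactly by
# powering the transition matrix, and read off t_mix = min{t : d(t) <= 1/4}.
# The example chain (lazy walk on an 8-cycle) is an assumption for this demo.

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lazy_cycle(n):
    # Hold with prob 1/2, step to each neighbour with prob 1/4.
    P = [[0.0] * n for _ in range(n)]
    for j in range(n):
        P[j][j] = 0.5
        P[j][(j + 1) % n] = 0.25
        P[j][(j - 1) % n] = 0.25
    return P

def d_of(Pt, pi):
    # d(t) = max_x (1/2) * sum_z |P^t(x,z) - pi(z)|
    return max(0.5 * sum(abs(row[z] - pi[z]) for z in range(len(pi)))
               for row in Pt)

def t_mix(P, pi, eps=0.25):
    Pt, t = P, 1
    while d_of(Pt, pi) > eps:
        Pt, t = mat_mul(Pt, P), t + 1
    return t

n = 8
P = lazy_cycle(n)
pi = [1.0 / n] * n            # uniform stationary distribution
print(t_mix(P, pi))           # well below the n^2 bound proved later
```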
Random Walk Coupling example
Simple random walk on {0, 1, ..., n} (Markov chain coupling motivation) Move up or down with probability 1/2 each, if possible. Do nothing if the move would leave the interval.
Simple random walk on {0, 1, ..., n} Claim 4. If 0 ≤ x ≤ y ≤ n, then for all t ≥ 0: P^t(y, 0) ≤ P^t(x, 0)
Simple random walk on {0, 1, ..., n} Claim 4. If 0 ≤ x ≤ y ≤ n, then for all t ≥ 0: P^t(y, 0) ≤ P^t(x, 0). Proof: Define a coupling (X_t, Y_t) of P^t(x, ·) and P^t(y, ·): X_0 = x, Y_0 = y. Let b_1, b_2, ... be i.i.d. {±1}-valued Bernoulli(1/2) variables. At the i-th step, attempt to add b_i to both X_{i−1} and Y_{i−1}. Stop after t steps.
Simple random walk on {0, 1, ..., n} For all t, X_t ≤ Y_t. Therefore: P^t(y, 0) = P(Y_t = 0) ≤ P(X_t = 0) = P^t(x, 0). Note: in this case, once the coupled chains meet, they stick together.
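The monotone coupling above can be simulated directly; this sketch (an illustrative addition, with the starting points x = 2, y = 7 chosen arbitrarily) checks that X_t ≤ Y_t holds along every path, which is exactly why a visit of Y to 0 forces a visit of X to 0.

```python
# Illustrative simulation of the Claim 4 coupling: both walkers attempt the
# same +-1 step, clamped to {0, ..., n}, so X_t <= Y_t at every step.
import random

def coupled_step(pos, b, n):
    return min(max(pos + b, 0), n)    # stay put if the move leaves the interval

def run(x, y, n, t, rng):
    X, Y = x, y
    for _ in range(t):
        b = rng.choice((-1, 1))       # the shared Bernoulli(1/2) increment b_i
        X, Y = coupled_step(X, b, n), coupled_step(Y, b, n)
        assert X <= Y                 # monotonicity is preserved
    return X, Y

rng = random.Random(0)
n, t = 10, 30
hits_x = hits_y = 0                   # visits to state 0 at time t
for _ in range(20000):
    X, Y = run(2, 7, n, t, rng)
    hits_x += (X == 0)                # walker started at x = 2
    hits_y += (Y == 0)                # walker started at y = 7
print(hits_y <= hits_x)               # prints True: Y_t = 0 forces X_t = 0
```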
Markov Chain Coupling
Coupling Example
Markov Chain Coupling Definition Def: A coupling of a Markov chain with transition matrix P is a Markovian process (X_t, Y_t)_{t=0}^∞ such that each of the processes {X_t}, {Y_t} is a Markov chain with transition matrix P: ∀a, b ∈ Ω, Pr(X_{t+1} = b | X_t = a) = P(a, b) = Pr(Y_{t+1} = b | Y_t = a). Note that (X_t)_{t=0}^∞ and (Y_t)_{t=0}^∞ may have different starting distributions X_0, Y_0.
Markov Chain Coupling - Stickiness We can modify a Markov chain coupling so that the chains stay together after meeting: if X_s = Y_s, then X_t = Y_t for all t ≥ s. [Simply run the coupling until the chains meet, then run them together according to the original Markov chain.] We call such couplings sticky couplings.
Markov Chain Coupling - Notation Given a Markov chain coupling (X_t, Y_t)_{t=0}^∞ of a Markov chain with transition matrix P, we denote by P_{x,y} the measure P_{x,y}(A) = P(A | X_0 = x, Y_0 = y), i.e. the probability assuming X_0 starts at x and Y_0 starts at y. [Recall that (X_t)_{t=0}^∞ and (Y_t)_{t=0}^∞ may have different starting distributions.]
Markov Chain Coupling - Observation Let (X_t, Y_t)_{t=0}^∞ be a coupling of P with X_0 = x, Y_0 = y. Observation: ∀t ≥ 0, (X_t, Y_t) is a coupling of P^t(x, ·) with P^t(y, ·). Proof: P^t(x, z) = P_{x,y}(X_t = z) and P^t(y, z) = P_{x,y}(Y_t = z).
Bounding Distance to Stationary Using Couplings
Bounding Total Variation Distance As usual, P is an ergodic Markov chain over Ω with stationary distribution π. Theorem: Let (X_t, Y_t)_{t=0}^∞ be a sticky coupling of P with X_0 = x, Y_0 = y. Let τ_couple be the first time the chains meet: τ_couple := min { t : X_t = Y_t }. Then: ‖P^t(x, ·) − P^t(y, ·)‖_TV ≤ P_{x,y}(τ_couple > t). Proof: ‖P^t(x, ·) − P^t(y, ·)‖_TV ≤ P_{x,y}(X_t ≠ Y_t) (since ‖μ − ν‖_TV = inf { P(X ≠ Y) : (X, Y) a coupling of μ and ν }) = P_{x,y}(τ_couple > t) (by the definition of τ_couple plus stickiness).
Bounding Total Variation Distance [Theorem: ‖P^t(x, ·) − P^t(y, ·)‖_TV ≤ P_{x,y}(τ_couple > t)] Corollary: d(t) ≤ max_{x,y∈Ω} P_{x,y}(τ_couple > t)
Reminder Given P an ergodic Markov chain with stationary distribution π, we defined: d(t) := max_{x∈Ω} ‖P^t(x, ·) − π‖_TV. We also defined: d̄(t) := max_{x,y∈Ω} ‖P^t(x, ·) − P^t(y, ·)‖_TV, and showed that: d(t) ≤ d̄(t) ≤ 2d(t). We will use the fact that d(t) ≤ d̄(t).
Bounding Distance to Stationarity [Theorem: ‖P^t(x, ·) − P^t(y, ·)‖_TV ≤ P_{x,y}(τ_couple > t)] Corollary: d(t) ≤ max_{x,y∈Ω} P_{x,y}(τ_couple > t). Proof: d(t) ≤ d̄(t) = max_{x,y∈Ω} ‖P^t(x, ·) − P^t(y, ·)‖_TV ≤ max_{x,y∈Ω} P_{x,y}(τ_couple > t), using the theorem for each pair x, y ∈ Ω. Strategy: design a coupling that brings X and Y together fast (for each pair x, y ∈ Ω).
Examples – Bounding Mixing Time In all the following examples we consider Markov chains. For each example we:
1. Define a suitable coupling (X_t, Y_t)_{t=0}^∞.
2. Bound P_{x,y}(τ_couple > t) for each pair x, y ∈ Ω.
3. Use the corollary d(t) ≤ max_{x,y∈Ω} P_{x,y}(τ_couple > t) to bound d(t).
4. Find the minimal t that ensures d(t) ≤ 1/4. That t is an upper bound for the mixing time!
Lazy Random Walk On The Cycle Bounding mixing time
Lazy Random Walk On The Cycle Ω = Z_n = {1, …, n}. P(j, k) = 1/4 if k ≡ j + 1 (mod n); 1/4 if k ≡ j − 1 (mod n); 1/2 if k ≡ j; 0 otherwise. [Figure: the n-cycle; each state holds with probability 1/2 and moves to each neighbour with probability 1/4.]
Lazy Random Walk On The Cycle We construct a coupling (X_t, Y_t)_{t=0}^∞ of two particles walking lazily on the cycle, one starting from x, the other from y. At each move: flip a coin. If heads, (X_t) moves CW or CCW based on an additional coin flip. If tails, (Y_t) moves CW or CCW based on an additional coin flip. Assume stickiness…
Lazy Random Walk On The Cycle Coupling summary: choose X_t or Y_t 50-50; move it CW or CCW 50-50. Claim: this is indeed a Markov chain coupling. Proof: from each particle's point of view, at each step it moves with probability 1/2, and if it moves, it goes CW or CCW with probability 1/2 each, which is exactly the lazy walk on the cycle.
Lazy Random Walk On The Cycle – Bounding τ_couple Coupling summary: choose X_t or Y_t 50-50; move it CW or CCW 50-50. Bounding: let D_t be the clockwise distance from X_t to Y_t. Note that D_t is a simple random walk on the interior of {0, …, n} that gets absorbed at either 0 or n. We saw in the first lecture that if τ is the time required to get absorbed and D_0 = k, then: E_k(τ) = k(n − k). Notice that τ = τ_couple and D_0 is the clockwise distance between x and y. Therefore: ∀x, y: E_{x,y}(τ_couple) ≤ max_{k∈{0,…,n−1}} k(n − k) ≤ n²/4
Lazy Random Walk On The Cycle Coupling summary: choose X_t or Y_t 50-50; move it CW or CCW 50-50. Bounding: ∀x, y: E_{x,y}(τ_couple) ≤ n²/4. Therefore, by Markov's inequality: d(t) ≤ max_{x,y∈Ω} P_{x,y}(τ_couple > t) ≤ (1/t) max_{x,y∈Ω} E_{x,y}(τ_couple) ≤ n²/(4t). Therefore, for t ≥ n² we have d(t) ≤ n²/(4t) ≤ 1/4, so: t_mix = min { t : d(t) ≤ 1/4 } ≤ n²
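The cycle coupling is easy to simulate; this sketch (an illustrative addition, with n = 12 and the worst-case starting distance k = n/2 as assumed parameters) estimates E(τ_couple) and compares it against the k(n − k) formula used above.

```python
# Illustrative simulation of the cycle coupling: a fair coin picks which
# walker moves, another fair coin picks CW/CCW; the clockwise distance D_t is
# then a simple random walk, so E[tau_couple] = k(n-k) from distance k.
import random

def tau_couple(x, y, n, rng):
    t = 0
    while x != y:
        step = 1 if rng.random() < 0.5 else -1    # CW or CCW
        if rng.random() < 0.5:                    # which walker moves
            x = (x + step) % n
        else:
            y = (y + step) % n
        t += 1
    return t

rng = random.Random(1)
n = 12
k = n // 2                                        # worst-case initial distance
trials = 5000
mean = sum(tau_couple(0, k, n, rng) for _ in range(trials)) / trials
print(round(mean))                                # should be near k*(n-k) = 36
```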
Lazy Random Walk On The Hypercube Bounding mixing time
Sampling a Hypercube Toy problem: Randomly sample from the vertices of a k-dimensional hypercube. k = 3
Lazy Random Walk On Hypercube Ω = {0, 1}^n. Markov chain [equivalent to the lazy walk]: pick a coordinate i ∈ {1, …, n} uniformly; pick a value b ∈ {0, 1} uniformly; set x(i) = b. The Markov chain is ergodic and symmetric => the stationary distribution is uniform. [Figure: for n = 3, each vertex holds with probability 1/2 and moves to each of its 3 neighbours with probability 1/6.]
Coupling For Lazy Random Walk On Hypercube Assuming X_0 = x, Y_0 = y, define the transition (X_t, Y_t) → (X_{t+1}, Y_{t+1}): pick a coordinate i ∈ {1, …, n} uniformly; pick a value b ∈ {0, 1} uniformly; set X_{t+1}(i) = b, Y_{t+1}(i) = b. This is indeed a coupling. Example run (n = 5):
t=0: (0,0,1,0,1) (1,1,0,0,1), draw i=3, b=0
t=1: (0,0,0,0,1) (1,1,0,0,1), draw i=5, b=1
t=2: (0,0,0,0,1) (1,1,0,0,1), draw i=3, b=1
t=3: (0,0,1,0,1) (1,1,1,0,1), draw i=1, b=0
t=4: (0,0,1,0,1) (0,1,1,0,1), draw i=2, b=1
t=5: (0,1,1,0,1) (0,1,1,0,1)
Coupling For Lazy Random Walk On Hypercube Denote by τ the first time all the coordinates have been selected at least once. The two walkers agree by time τ (τ_couple ≤ τ), for every initial pair x, y! τ is distributed like the coupon collector random variable (with n coupons). Therefore (proved on the next slide): P(τ > n ln n + cn) ≤ e^{−c}. Therefore: d(n ln n + ln(4)·n) ≤ max_{x,y∈Ω} P_{x,y}(τ_couple > n ln n + ln(4)·n) ≤ max_{x,y∈Ω} P_{x,y}(τ > n ln n + ln(4)·n) ≤ e^{−ln 4} = 1/4. Therefore: t_mix ≤ n ln n + ln(4)·n = O(n log n)
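The coordinate-by-coordinate coupling can be simulated directly; this sketch (an illustrative addition, with n = 16 and the opposite-corners start as assumed parameters) verifies that the chains have coalesced by the coupon-collector time τ.

```python
# Illustrative simulation of the hypercube coupling: both chains set the same
# uniform coordinate i to the same uniform bit b, so they agree as soon as
# every coordinate has been chosen at least once (tau_couple <= tau).
import random

def coalescence_time(n, rng, max_steps=10**6):
    X, Y = [0] * n, [1] * n              # opposite corners of the hypercube
    chosen = set()
    for t in range(1, max_steps + 1):
        i = rng.randrange(n)             # shared coordinate
        b = rng.randrange(2)             # shared bit
        X[i] = Y[i] = b
        chosen.add(i)
        if len(chosen) == n:             # coupon-collector time tau reached
            assert X == Y                # the chains must agree by now
            return t
    return None

rng = random.Random(2)
times = [coalescence_time(16, rng) for _ in range(2000)]
print(all(t is not None for t in times))  # prints True
```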
Proof of the Coupon Collector Bound Theorem: Let τ be a coupon collector random variable, as seen before. For any c > 0 we have: P(τ > n ln n + cn) ≤ e^{−c}. Proof: Let A_i be the event that the i-th type does not appear among the first n ln n + cn coupons drawn. Then, by independence of the trials: P(A_i) = (1 − 1/n)^{n ln n + cn}. Now: P(τ > n ln n + cn) = P(∪_{i=1}^n A_i) ≤ Σ_{i=1}^n P(A_i) = n(1 − 1/n)^{n ln n + cn} ≤ n·e^{−(n ln n + cn)/n} = n·e^{−ln n}·e^{−c} = e^{−c}, using 1 − 1/n ≤ e^{−1/n}.
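The tail bound can also be checked empirically; this sketch (an illustrative addition, with n = 20, c = 1 as assumed parameters) estimates P(τ > n ln n + cn) by sampling coupon-collector times.

```python
# Illustrative check of the coupon-collector tail bound
# P(tau > n ln n + c n) <= e^{-c} from the theorem above.
import math
import random

def coupon_time(n, rng):
    seen, t = set(), 0
    while len(seen) < n:                 # draw until all n types are seen
        seen.add(rng.randrange(n))
        t += 1
    return t

rng = random.Random(3)
n, c, trials = 20, 1.0, 10000
threshold = n * math.log(n) + c * n
tail = sum(coupon_time(n, rng) > threshold for _ in range(trials)) / trials
print(tail <= math.exp(-c))              # empirical tail respects the bound
```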
Card Shuffling – Random Transposition Bounding mixing time
Card Shuffling – Random Transposition Irreducible. Aperiodic (a card may be transposed with itself). P is symmetric: ∀x, y: P(x, y) = P(y, x) => the stationary distribution π is uniform.
Random Transposition – Coupling Definition We construct a coupling (σ_t, σ_t′)_{t=0}^∞ of the random transposition MC; Ω = S_n. At each move (σ_t, σ_t′) → (σ_{t+1}, σ_{t+1}′): choose a card X_t and an independent position Y_t, both uniformly; switch the card X_t with the card at position Y_t, in both σ_t and σ_t′. Let M_t = # of cards at the same position in σ_t and σ_t′. This is indeed a coupling!
Random Transposition – Case 1 At each move: choose a card X_t and an independent position Y_t uniformly; switch the card X_t with the card at position Y_t in both σ_t, σ_t′. M_t = # of cards at the same position in σ_t and σ_t′. Case 1: X_t is at the same position in both decks => M_{t+1} = M_t
Random Transposition – Case 2 At each move: choose a card X_t and an independent position Y_t uniformly; switch the card X_t with the card at position Y_t in both σ_t, σ_t′. M_t = # of cards at the same position in σ_t and σ_t′. Case 2: X_t is at different positions and σ_t(Y_t) = σ_t′(Y_t) => M_{t+1} = M_t
Random Transposition – Case 3 X_t is at different positions and σ_t(Y_t) ≠ σ_t′(Y_t) => M_{t+1} > M_t (the possible outcomes are M_{t+1} = M_t + 1, M_{t+1} = M_t + 2, or M_{t+1} = M_t + 3)
Random Transposition – Cases Summary Case 1: X_t in the same position => M_{t+1} = M_t. Case 2: X_t in different positions, σ_t(Y_t) = σ_t′(Y_t) => M_{t+1} = M_t. Case 3: X_t in different positions, σ_t(Y_t) ≠ σ_t′(Y_t) => M_{t+1} > M_t. We now calculate P(M_{t+1} > M_t | M_t = i) for 0 ≤ i ≤ n − 1. The only case in which M_{t+1} > M_t is Case 3, which has probability ((n − i)/n) · ((n − i)/n) = (n − i)²/n². Therefore: P(M_{t+1} > M_t | M_t = i) = (n − i)²/n²
Random Transposition – Bounding Mixing Time Theorem: Let τ be the first time M_t = n (the time it takes the decks to match each other). Then for every pair of initial permutations σ_0, σ_0′: E(τ) < (π²/6)·n². Proof: Let τ_i be the number of steps between the first time M_t ≥ i − 1 and the first time M_t ≥ i. [Since M_t can increase by 1, 2 or 3, τ_i may equal zero.] Then τ = τ_1 + … + τ_n. We have seen P(M_{t+1} > M_t | M_t = i) = (n − i)²/n², therefore: E(τ_{i+1} | M_t = i) = n²/(n − i)². When no value of t satisfies M_t = i, τ_{i+1} = 0. Therefore: E(τ) ≤ Σ_{i=0}^{n−1} n²/(n − i)² = n² Σ_{i=0}^{n−1} 1/(n − i)² < n² Σ_{i=1}^∞ 1/i² = (π²/6)·n²
Random Transposition – Bounding Mixing Time We have just seen that E(τ_couple) < (π²/6)·n², no matter what the initial permutations σ_0, σ_0′ are. Therefore, by Markov's inequality: d(t) ≤ max_{σ_0,σ_0′∈Ω} P_{σ_0,σ_0′}(τ_couple > t) ≤ (1/t) max_{σ_0,σ_0′∈Ω} E_{σ_0,σ_0′}(τ_couple) ≤ (π²/6)·n²/t. Therefore, for t ≥ (2π²/3)·n² we have d(t) ≤ (π²/6)·n²/t ≤ ((π²/6)·n²)/((2π²/3)·n²) = 1/4, so: t_mix = min { t : d(t) ≤ 1/4 } ≤ (2π²/3)·n² = O(n²)
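The transposition coupling can also be simulated; this sketch (an illustrative addition, with n = 8 and a fully mismatched starting pair as assumed parameters) estimates E(τ_couple) and compares it to the (π²/6)n² bound.

```python
# Illustrative simulation of the random-transposition coupling: pick a card
# X_t and an independent position Y_t uniformly, perform the same swap in both
# decks, and count steps until the decks match.
import math
import random

def tau_couple(n, rng):
    a = list(range(n))                 # deck sigma:  a[p] = card at position p
    b = list(range(n - 1, -1, -1))     # deck sigma': reversed, no card matches
    t = 0
    while a != b:
        card = rng.randrange(n)        # card X_t
        pos = rng.randrange(n)         # position Y_t
        for deck in (a, b):
            i = deck.index(card)       # where card X_t currently sits
            deck[i], deck[pos] = deck[pos], deck[i]
        t += 1
    return t

rng = random.Random(4)
n, trials = 8, 2000
mean = sum(tau_couple(n, rng) for _ in range(trials)) / trials
bound = (math.pi ** 2 / 6) * n * n     # ~105.3 for n = 8
print(mean < bound)                    # the empirical mean respects the bound
```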
Lazy Random Walk On a Binary Tree Bounding mixing time
Finite Binary Tree
Reminders - Tree A tree is a connected (undirected!) graph with no cycles. The root is a distinguished vertex. The depth of a vertex v is its distance from the root. A level of a tree is the set of vertices at the same depth. The children of a vertex v ∈ V are the neighbors of v with depth larger than v's. A leaf is a vertex with degree one.
Reminders – Binary Tree of Depth k A tree where: the root has degree 2; every vertex at distance ≠ 0, k from the root has degree 3; vertices at distance k from the root are leaves.
Lazy Random Walk On a Binary Tree We now consider the lazy random walk on the finite binary tree. [We denote by N(u) the number of neighbors of a given u ∈ V.] P(u, v) = 1/(2N(u)) if (u, v) ∈ E; 1/2 if u = v; 0 otherwise.
Coupling Definition We construct a coupling (X_t, Y_t)_{t=0}^∞ of two particles walking lazily on the binary tree, in two phases. Phase 1 (as long as X_t, Y_t are not at the same depth): toss a coin to decide which of the two chains moves, and make a random-walk move on the chosen one. Phase 2 (X_t, Y_t at the same depth): make a lazy random-walk move on chain X_t and move Y_t accordingly [up / down-right / down-left]. Assume it to be sticky… This is indeed a coupling!
Binary Tree Coupling - Observation Define the time t_0 of a run of the coupling as the first time (X_t)_{t=0}^∞ has first visited a leaf and then visited the root. We argue that X_{t_0} = Y_{t_0}, so τ_couple ≤ t_0. Definition: the commute time is the time it takes a lazy random walk starting from the root to visit a leaf and then return to the root. Coupling reminder: Phase 1: move chain 1 or 2, chosen uniformly; Phase 2: move both in sync. Therefore: ∀x, y: E(τ_couple) ≤ E(t_0) ≤ E[commute time] ≤ 4n (by a non-trivial lemma; n = |V|).
Binary Tree Coupling - Observation ∀x, y: E_{x,y}(τ_couple) ≤ 4n. Therefore, by Markov's inequality: d(t) ≤ max_{x,y∈Ω} P_{x,y}(τ_couple > t) ≤ (1/t) max_{x,y∈Ω} E_{x,y}(τ_couple) ≤ 4n/t. Therefore, for t ≥ 16n we have d(t) ≤ 4n/(16n) = 1/4, so: t_mix = min { t : d(t) ≤ 1/4 } ≤ 16n = O(n)
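The commute-time bound can be checked numerically; this sketch (an illustrative addition, with depth k = 4 as an assumed parameter) uses the fact that for this walk the walker's depth is itself a Markov chain on {0, …, k}, and estimates the expected root-to-leaf-to-root time.

```python
# Illustrative check of the commute-time bound for the lazy walk on the
# depth-k binary tree: project onto the walker's depth (up w.p. 1/3, down
# w.p. 2/3 at internal vertices) and estimate E[root -> leaf -> root] <= 4n.
import random

def commute_time(k, rng):
    depth, t, seen_leaf = 0, 0, False
    while True:
        t += 1
        if rng.random() < 0.5:           # lazy: hold with probability 1/2
            continue
        if depth == 0:
            depth = 1                    # root: both neighbours are children
        elif depth == k:
            depth -= 1                   # leaf: only neighbour is the parent
        else:
            depth += 1 if rng.random() < 2 / 3 else -1  # 2 children, 1 parent
        seen_leaf = seen_leaf or depth == k
        if seen_leaf and depth == 0:
            return t

rng = random.Random(5)
k = 4
n = 2 ** (k + 1) - 1                     # |V| of the depth-4 binary tree
trials = 10000
mean = sum(commute_time(k, rng) for _ in range(trials)) / trials
print(mean <= 4 * n)                     # the slide's (non-trivial) bound
```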
Thanks! Jay Tenenbaum