Improved Bounds for Sampling Colorings

Presentation on theme: "Improved Bounds for Sampling Colorings"— Presentation transcript:

1 Improved Bounds for Sampling Colorings
Sitan Chen, Ankur Moitra Merged with: Michelle Delcourt, Guillem Perarnau, Luke Postle

2 Preliminaries

3 Background Main Question: Can we approximately count k-colorings in bounded-degree graphs? When do such colorings even exist? Brooks' Theorem: Any graph G of maximum degree d has a (d+1)-coloring. Conjecture: We can approximately count arbitrarily well* as long as k ≥ d+1! (*for any ε > 0, a multiplicative (1+ε)-approximation in time poly(1/ε, n).) Also remark that it is NP-hard to decide whether a graph is (d−1)-colorable, though it is unclear whether we could get sampling at k ≥ d.
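The (d+1)-coloring guarantee above already follows from a greedy scan (each vertex sees at most d colored neighbors); a minimal illustrative sketch, not part of the talk:

```python
def greedy_coloring(adj):
    """Greedily color a graph given as {vertex: set_of_neighbors}.

    Scanning vertices in any order, each vertex has at most d already-colored
    neighbors, so colors {0, ..., d} always suffice (d = max degree).
    """
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# 5-cycle: max degree 2, so greedy uses at most 3 colors
cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
col = greedy_coloring(cycle)
assert all(col[v] != col[u] for v in cycle for u in cycle[v])
assert max(col.values()) <= 2
```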

4 Motivation- counting complexity
Can we efficiently approximately count for #P-complete problems whose decision version is easy?
matchings (monomer-dimer systems)
independent sets (hardcore model)
perfect matchings (dimer systems)
partition function of the Ising model
satisfying assignments in bounded-degree CNFs
estimating the volume of a convex body

5 Motivation- phase transitions
In inference in graphical models: #k-colorings is the partition function Z of the antiferromagnetic Potts model at zero temperature:
μ_G(σ) = (1/Z) exp(−β · Σ_{i~j} 𝟙[σ_i = σ_j]), β → ∞
Uniqueness: decay of long-range correlations. [Jonasson '01]: Uniqueness for the infinite d-regular tree at k ≥ d+1. Phase transitions: onset of computational hardness with constant interaction parameter and zero external field.

7 Motivation- phase transitions
Does the uniqueness threshold coincide with the computational hardness threshold for approximately counting k-colorings? [Sly, Sun '12] + [Weitz '06] + [Sinclair, Srivastava, Thurley '11]: Yes for general antiferromagnetic 2-spin systems on bounded-degree graphs. [Galanis, Stefankovic, Vigoda '14]: Hardness for k-colorings at k < d, and for the antiferromagnetic Potts model in the (conjectured) non-uniqueness region. What about the algorithmic side? Note that the hardness result at k < d surprisingly holds even for triangle-free graphs.

8 Counting vs. sampling Theorem (informal): Given an approximate sampler, you can approximately count. Proof: Successively remove the edges of G to get G = G_m, G_{m−1}, …, G_0.
Z(G) = Z(G_m)/Z(G_{m−1}) · Z(G_{m−1})/Z(G_{m−2}) ⋯ Z(G_1)/Z(G_0) · Z(G_0)
G_0 is n isolated vertices, so Z(G_0) = k^n.
Z(G_i)/Z(G_{i−1}) = Pr[the endpoints of the removed edge get different colors in a uniform coloring of G_{i−1}]
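The telescoping reduction can be sketched end to end on a tiny instance. As an illustration (not from the talk), the approximate sampler is replaced by a brute-force exact sampler, and the names `estimate_count`/`exact_count` are ours:

```python
import itertools
import random

def is_proper(col, edges):
    return all(col[u] != col[v] for u, v in edges)

def exact_count(n, edges, k):
    """Brute-force Z(G) for comparison."""
    return sum(is_proper(c, edges)
               for c in itertools.product(range(k), repeat=n))

def estimate_count(n, edges, k, samples=20000, rng=None):
    """Estimate Z(G) by the telescoping product: start from G_0 (n isolated
    vertices, Z = k^n) and multiply in Z(G_i)/Z(G_{i-1}) = Pr[the endpoints
    of the next edge get different colors] under a uniform proper coloring
    of G_{i-1} (drawn here by enumeration, standing in for the sampler)."""
    rng = rng or random.Random(0)
    z, current = float(k) ** n, []
    for (u, v) in edges:
        props = [c for c in itertools.product(range(k), repeat=n)
                 if is_proper(c, current)]
        hits = 0
        for _ in range(samples):
            c = rng.choice(props)           # one uniform coloring of G_{i-1}
            hits += c[u] != c[v]
        z *= hits / samples
        current.append((u, v))
    return z

triangle = [(0, 1), (1, 2), (0, 2)]
est = estimate_count(3, triangle, 4)
assert exact_count(3, triangle, 4) == 24    # k(k-1)(k-2)
assert abs(est - 24) / 24 < 0.05
```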

9 Simple candidate sampler
Single-site Glauber dynamics: Initialize with an arbitrary coloring. Repeat for T steps: pick a random vertex v and a random color c; recolor v with c if allowed. Output the final coloring. Fact: If k ≥ d+2, the stationary distribution is the uniform distribution over k-colorings. Conjecture: If k ≥ d+2, single-site Glauber dynamics mixes in time T = O(n log n).
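The chain above is a few lines of code; a minimal sketch (the greedy initialization and the example graph are our own choices):

```python
import random

def glauber(adj, k, steps, rng=random.Random(1)):
    """Single-site Glauber dynamics for k-colorings.

    Start from a greedy proper coloring; each step picks a uniformly random
    vertex v and color c and recolors v with c if no neighbor of v currently
    has color c.  For k >= d + 2 the chain is ergodic with the uniform
    distribution over proper k-colorings as its stationary law.
    """
    color = {}
    for v in adj:                       # greedy initial proper coloring
        used = {color[u] for u in adj[v] if u in color}
        color[v] = min(c for c in range(k) if c not in used)
    verts = list(adj)
    for _ in range(steps):
        v = rng.choice(verts)
        c = rng.randrange(k)
        if all(color[u] != c for u in adj[v]):
            color[v] = c
    return color

# 4-cycle with k = 4 = d + 2: the output is always a proper coloring
square = {i: {(i - 1) % 4, (i + 1) % 4} for i in range(4)}
col = glauber(square, 4, 1000)
assert all(col[u] != col[v] for u in square for v in square[u])
```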

10 Simple candidate sampler
Single-site Glauber dynamics: Initialize with an arbitrary coloring. Repeat for T steps: pick a random vertex v and a random color c; recolor v with c if allowed. Output the final coloring. Note the related conjecture that *some* Markov chain mixes for k ≥ d+1. Also note that k = d+1 is the uniqueness threshold, i.e., the point at which correlations decay on trees; an algorithm would establish another case where the computational threshold coincides with uniqueness. Fact: If k ≥ d+2, the stationary distribution is the uniform distribution over k-colorings. Conjecture: If k ≥ d+2, single-site Glauber dynamics mixes in time T = O(n log n).

21 Prior work (k/d thresholds, with degree and girth assumptions where required)
Jerrum '95: k/d = 2
Vigoda '99: k/d = 11/6
Dyer, Frieze '01: k/d = 1.763, degree Ω(log n), girth Ω(log d)
Molloy '02: k/d = 1.489
Hayes '03: girth ≥ 6
Hayes, Vigoda '03: k/d = 1 + ε, degree Ω(exp(1/ε) log n), girth ≥ 11
Dyer, Frieze, Hayes, Vigoda '04: degree Ω(1), girth ≥ 5 / ≥ 7
We will show how to improve upon Vigoda's result for general bounded-degree graphs

23 Prior work (other graph classes; k/d with assumptions)
Hayes, Vigoda, Vera '15: Ω(1/log d), planar
Mossel, Sly '08: k ≥ O_d(1), G(n, d/n)
Efthymiou, Hayes, Stefankovic, Vigoda '17: 5/2, G(n, d/n)
Efthymiou '16: 1 + ε, G(n, d/n) (not a Markov chain)
Gamarnik, Katz '06: 2.843, triangle-free, deterministic
Tetali, Vera, Vigoda, Yang '12: 1/log d, tree
Vardi '17: O(log n) pathwidth
We will show how to improve upon Vigoda's result for general bounded-degree graphs

24 Our results Theorem 1: There is an absolute constant η > 0 such that as long as k ≥ (11/6 − η)d, Glauber dynamics mixes in time O(n²). Theorem 2: As long as k ≥ (11/6 − η)d, Glauber dynamics for sampling list k-colorings also mixes in time O(n²). Corollary: The k-state zero-temperature antiferromagnetic Potts model on ℤ^d lies in the disordered phase when k ≥ (11/3 − 2η)d.

25 Technical tools

26 Review: coupling A t-step coupling of a Markov chain is a process (X_i, Y_i)_{0≤i≤t} such that: X_i and Y_i are faithful copies of the chain, and if X_i = Y_i then X_{i+1} = Y_{i+1}. Take a finite state space Ω with metric δ. Lemma: If there exists a t-step coupling which contracts, i.e.
E[δ(X_t, Y_t) | X_0 = x, Y_0 = y] ≤ (1 − ε) δ(x, y)
for all x, y ∈ Ω, then the chain mixes in time O(t · log(diam Ω)/ε). Two standard ways to bound the mixing time of a Markov chain are 1) spectral and 2) combinatorial (coupling); 1) is typically more powerful but has proven too difficult to implement for sampling colorings.

27 Review: path coupling Designing a good coupling is often difficult.
Theorem [Bubley, Dyer '97]: If there exists a one-step coupling which contracts for all x, y ∈ Ω with δ(x, y) = 1, then the chain mixes rapidly. For k-colorings: δ is the Hamming distance. Goal: given any two k-colorings σ, τ differing on exactly one vertex, specify a distribution over the next step (σ′, τ′) s.t. E[δ(σ′, τ′)] ≤ 1 − ε.

28 Review: k > 3d Given: colorings σ, τ which differ only on vertex v.
Henceforth: say σ(v) = blue, τ(v) = yellow. Use the identity coupling, i.e., make the same move in both copies of the Markov chain. Suppose we attempt to recolor vertex u with c:
If u = v and c is not among the colors of N(v): distance −1 (at least k − d available colors for v)
If u ∈ N(v) and c ∈ {blue, yellow}: distance +1 (at most 2d such moves: {blue/yellow} × neighbors of v)
Otherwise: distance +0
Expected change: (1/kn)(−(k − d) + 2d), which is negative iff k > 3d.
One technicality: the state space must be extended to all (not necessarily proper) colorings so that the premetric extends to the Hamming metric; this arises in Vigoda's analysis and in ours, but we ignore it in this talk.

39 Review: k > 2d [Jerrum '95]
Given: colorings σ, τ which differ only on vertex v. Suppose σ attempts to recolor vertex u with c:
If u = v and c is not among the colors of N(v), do the same in τ: distance −1
If u ∈ N(v) and c = blue, try to recolor u to yellow in τ: distance +0
If u ∈ N(v) and c = yellow, try to recolor u to blue in τ: distance +1
Otherwise, do the same in τ; the distance doesn't change: distance +0
Expected change: (1/kn)(−(k − d) + d), which is negative iff k > 2d (at least k − d available colors for v, at most d bad neighbor moves).
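Both one-step analyses can be checked exactly by enumerating all nk moves on the tight star example (center v with d distinctly colored leaves). This is a self-contained sanity check, not from the talk; `expected_change` and the concrete star instance are illustrative:

```python
from fractions import Fraction

def expected_change(d, k, coupling):
    """Exact one-step expected change in Hamming distance, under path
    coupling, for two colorings of the star K_{1,d} that differ only at the
    center v: sigma(v) = 0 ('blue'), tau(v) = 1 ('yellow'), and leaf i
    carries the distinct color i + 2."""
    n = d + 1                                   # vertex 0 is the center v
    sigma = [0] + [i + 2 for i in range(d)]
    tau = [1] + [i + 2 for i in range(d)]
    def nbrs(u):
        return range(1, n) if u == 0 else [0]
    def move(col, u, c):                        # one attempted Glauber move
        new = list(col)
        if all(new[w] != c for w in nbrs(u)):
            new[u] = c
        return new
    total = Fraction(0)
    for u in range(n):
        for c in range(k):
            c2 = c
            if coupling == "jerrum" and u != 0 and c in (0, 1):
                c2 = 1 - c                      # swap blue/yellow at N(v)
            s, t = move(sigma, u, c), move(tau, u, c2)
            dist = sum(a != b for a, b in zip(s, t))
            total += Fraction(dist - 1, n * k)  # change from distance 1
    return total

d, k = 4, 10                                    # 2d < k < 3d
assert expected_change(d, k, "identity") == Fraction(-(k - d) + 2 * d, (d + 1) * k)
assert expected_change(d, k, "jerrum") == Fraction(-(k - d) + d, (d + 1) * k)
assert expected_change(d, k, "jerrum") < 0 < expected_change(d, k, "identity")
```

In this regime the identity coupling drifts apart while Jerrum's swapped coupling contracts, matching the two formulas above.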

46 Tight example for Jerrum's analysis
Each of the d neighbors of v is assigned a distinct color, and may have some yellow or blue neighbors other than v, but not both. Expected change = (1/kn)(−(k − d) + d): exactly k − d available colors for v, and all d neighbors contribute.

47 What makes high-girth graphs easier?
If G is triangle-free, then for a "typical" pair σ, τ, N(v) does not have all distinct colors. For any fixing of the vertices outside N(v), the conditional distribution on N(v) is a product measure. So:
E_σ[#colors you could recolor v with] ≈ k(1 − 1/k)^d ≫ k − d
(If k > 1.763d, then k e^{−d/k} > d.)
(Requires extra work/assumptions to get a high-probability statement about the typical colorings encountered in the Markov chain.)
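A quick numeric check (illustrative, not from the talk) of the product-measure estimate and of 1.763 as (approximately) the root of x = e^{1/x}:

```python
import math

# Under the product measure on N(v), a fixed color is missed by all d
# neighbors with probability (1 - 1/k)^d, so the expected number of colors
# available at v is k(1 - 1/k)^d ~ k e^{-d/k}.  This beats the worst-case
# count k - d well below k = 2d: k e^{-d/k} > d already once k/d exceeds
# the root (about 1.763) of x = e^{1/x}.
d = 1000
for ratio in (1.77, 2.0, 3.0):
    k = int(ratio * d)
    product_bound = k * (1 - 1 / k) ** d
    assert product_bound > 0.999 * k * math.exp(-d / k)  # the two are close
    assert k * math.exp(-d / k) > d                      # above the threshold
    assert product_bound > k - d                         # beats the worst case

# the threshold sits between 1.763 and 1.764
assert 1.763 * math.exp(-1 / 1.763) < 1 < 1.764 * math.exp(-1 / 1.764)
```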

48 Vigoda’s analysis

49 Kempe components Given a coloring σ, a vertex v, and a color c, the associated Kempe component S_σ(v, c) is the maximal bichromatic connected component containing v and colored only with σ(v) and c. Denote the (multi)set of Kempe components by 𝒮_σ. To flip a Kempe component, interchange the colors σ(v) and c. The set of Kempe components is a multiset: for each c not appearing in N(v), S_σ(v, c) = {v} is regarded as a distinct element. (Subsequent figures draw components via their edges, but the important part is that they are subsets of vertices; in the figure, c = blue.)
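The definition translates directly into a breadth-first search; a minimal sketch (the helper names and the path example are ours):

```python
from collections import deque

def kempe_component(adj, coloring, v, c):
    """S_sigma(v, c): the maximal connected set containing v whose vertices
    are colored only with coloring[v] or c, found by BFS."""
    pair = {coloring[v], c}
    comp, queue = {v}, deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in comp and coloring[w] in pair:
                comp.add(w)
                queue.append(w)
    return comp

def flip(coloring, comp, a, b):
    """Interchange colors a and b on comp.  A proper coloring stays proper:
    by maximality, every neighbor of comp outside it avoids both a and b."""
    new = dict(coloring)
    for u in comp:
        new[u] = b if coloring[u] == a else a
    return new

# path 0-1-2-3 colored r,g,r,b: the (0, "g") component is {0, 1, 2}
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
col = {0: "r", 1: "g", 2: "r", 3: "b"}
comp = kempe_component(adj, col, 0, "g")
assert comp == {0, 1, 2}
col2 = flip(col, comp, "r", "g")
assert col2 == {0: "g", 1: "r", 2: "g", 3: "b"}
assert all(col2[u] != col2[w] for u in adj for w in adj[u])
```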

50 Bigger flips are better

51 Bigger flips are better
The point is that these flips aren't so aggressive that they break the scenarios where we were fine before.

52 Tradeoff What do we couple this flip to?

53 Fancier Markov chain SWK (Swendsen-Wang-Kotecký, pronounced like "kotetski") dynamics with flip parameters 1 ≥ p_1 ≥ p_2 ≥ p_3 ≥ … Initialize with an arbitrary coloring. Repeat for T steps: pick any Kempe component S_σ(v, c) from 𝒮_σ with probability 1/nk, and flip it with probability p_s where s = |S_σ(v, c)|. Output the final coloring. Note that Glauber dynamics corresponds to choosing p_1 = 1 and p_i = 0 for all other i. Fact: If k ≥ d+2, the stationary distribution is the uniform distribution over k-colorings. Theorem [Vigoda '99]: If k > 11d/6, there exist flip parameters for which the SWK dynamics mixes in time T = O(n log n).

54 Fancier coupling Given: colorings σ, τ which differ only on vertex v. Let c_1, …, c_m be the set of distinct colors in N(v), and let u_1, …, u_d be the neighbors of v. Observation: 𝒮_σ △ 𝒮_τ consists of
{S_σ(v, c_j)}_{1≤j≤m}, {S_τ(v, c_j)}_{1≤j≤m}, {S_σ(u_i, τ(v))}_{1≤i≤d}, {S_τ(u_i, σ(v))}_{1≤i≤d}
Here we assume v has degree exactly d, but degree at most d suffices.

61 Fancier coupling Observation:
S_σ(v, c_j) = ⋃_{u_i : σ(u_i) = c_j} S_τ(u_i, σ(v)) ∪ {v}
Note that the same holds with σ, τ swapped.

62 Fancier coupling Observation:
S_τ(v, c_j) = ⋃_{u_i : σ(u_i) = c_j} S_σ(u_i, τ(v)) ∪ {v}
Note that the same holds with σ, τ swapped.

64 Fancier coupling Fix a color c and let w_1, …, w_{m_c} be the neighbors of v colored c. Greedily couple flips of S_σ(v, c) and the S_σ(w_i, τ(v))'s to flips of S_τ(v, c) and the S_τ(w_i, σ(v))'s. Writing A^c = |S_σ(v, c)|, B^c = |S_τ(v, c)|, a_i^c = |S_σ(w_i, τ(v))|, b_i^c = |S_τ(w_i, σ(v))|, the contribution to E[δ(σ′, τ′)] − 1 is
H(A^c, B^c, a^c, b^c) := (A^c − a^c_max − 1) p_{A^c} + (B^c − b^c_max − 1) p_{B^c} + Σ_i [a_i^c q_i + b_i^c q_i′ − min(q_i, q_i′)]
where q_i = p_{a_i^c} − p_{A^c} · 𝟙[i = i_max] and q_i′ = p_{b_i^c} − p_{B^c} · 𝟙[i = j_max].
Don't try to remember this formula; the primary takeaway is that it is some linear function of p_{A^c}, p_{B^c}, the p_{a_i^c}'s and the p_{b_i^c}'s.

65 Fancier coupling If we could find p_i's and λ > 0 so that
(*) H(A^c, B^c, a^c, b^c) ≤ −1 + λ · m_c for all (A^c, B^c, a^c, b^c),
then
E[δ(σ′, τ′)] − 1 = −Σ_{c : m_c = 0} 1 + Σ_{c : m_c ≠ 0} H(A^c, B^c, a^c, b^c) ≤ −k + λ · d < 0 if k > λ · d.
Note that it suffices to quantify only over "realizable" tuples (A, B, a, b), and some additional constraints are needed to handle improper colorings. Vigoda essentially solves this LP by hand.
[Vigoda '99]: Can satisfy (*) using λ = 11/6 and p_1 = 1, p_2 = 13/42, p_3 = 1/6, p_4 = 2/21, p_5 = 1/21, p_6 = 1/84, p_i = 0 ∀ i > 6.

66 Our approach

67 Vigoda's analysis as an LP
Linear Program 1: minimize λ subject to
H(A^c, B^c, a^c, b^c) ≤ −1 + λ · m_c for all (A^c, B^c, a^c, b^c)
1 = p_1 ≥ p_2 ≥ p_3 ≥ …
Natural to wonder: if you allow flips of bigger components and use a less crude approximation than Vigoda to capture the infinite number of constraints and variables, can you do better? Q: Is Vigoda's solution optimal for this LP? A: Yes!

68 Optimality of 11/6 In fact, the optimality of 11/6 can be seen just by looking at the constraints arising from the two "extremal configurations" (A^c, B^c; a^c, b^c) = (3, 2; (2), (1)) and (7, 3; (3, 3), (1, 1)):
Linear Program 2: minimize λ subject to
p_1 + p_2 − 2p_3 − min(p_1 − p_2, p_2 − p_3) ≤ −1 + λ
2p_1 + 5p_3 − min(p_1 − p_3, p_3 − p_7) ≤ −1 + 2λ
1 = p_1 ≥ p_2 ≥ p_3 ≥ …
If you check the slackness of the constraints under Vigoda's assignment, you'll find that only six of them are actually tight, giving a small collection of extremal configurations that witness the optimality of 11/6. There's a bit of room in which to move the minimizer, but these two extremal configurations already suffice to force 11/6. Just to emphasize that there's no magic going on, the constraints are shown here.
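Both extremal constraints can be checked to be tight at λ = 11/6 with exact arithmetic (using p_2 = 13/42, the value from Vigoda '99 that the transcript garbles):

```python
from fractions import Fraction as F

# Vigoda's flip parameters (p_i = 0 for i > 6) and lambda = 11/6.
p = [F(1), F(13, 42), F(1, 6), F(2, 21), F(1, 21), F(1, 84)] + [F(0)] * 10
lam = F(11, 6)

# Constraint from configuration (3, 2; (2), (1)):
lhs1 = p[0] + p[1] - 2 * p[2] - min(p[0] - p[1], p[1] - p[2])
# Constraint from configuration (7, 3; (3, 3), (1, 1)):
lhs2 = 2 * p[0] + 5 * p[2] - min(p[0] - p[2], p[2] - p[6])

assert lhs1 == -1 + lam              # both constraints are tight
assert lhs2 == -1 + 2 * lam
assert all(p[i] >= p[i + 1] for i in range(len(p) - 1))  # 1 = p_1 >= p_2 >= ...
```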

69 Extremal configurations
(A^c, B^c; a^c, b^c) = (3, 2; (2), (1)); figure with c = green, showing S_σ(v, c), S_τ(v, c), S_τ(u_1, blue), and S_σ(u_1, yellow)

74–78 Extremal configurations
(A_c, B_c; a_c, b_c) = ((7,3); (3,3),(1,1)), with c = green.
[Figure, animated across slides 74–78: the neighborhood of v in σ and τ, highlighting in turn S_σ(v,c), S_τ(v,c), S_τ(u_1, blue) and S_τ(u_2, blue), then S_σ(u_1, yellow) and S_σ(u_2, yellow).]

83 Bottleneck for one-step couplings
Theorem: For k < 11d/6, there exists no choice of the p_i and no one-step coupling of the SWK dynamics that contracts simultaneously for both (G_1, σ_1, τ_1) and (G_2, σ_2, τ_2) under the Hamming metric. Formally, the optimality of 11/6 for the LP only shows that Vigoda's greedy coupling doesn't work below 11/6, but it is not hard to check that if any one-step coupling works, Vigoda's greedy coupling also works. [Figure: the two witnessing instances (G_1, σ_1, τ_1) and (G_2, σ_2, τ_2).]

84 Variable-length coupling
If one-step couplings fail below 11/6, let's try a t-step coupling. What should t be?
Theorem [Bubley, Dyer '97]: If there exists a t-step coupling that contracts for all x, y ∈ Ω with δ(x, y) = 1, the chain mixes rapidly.
Theorem [Hayes, Vigoda '07]: If there exists a t-step coupling, where t is a random stopping time, that (1−ε)-contracts for all x, y ∈ Ω with δ(x, y) = 1, the chain mixes rapidly.
Our coupling: take t = the first time δ(σ′, τ′) ≠ 1, i.e.
Repeatedly run Vigoda's one-step greedy coupling.
Stop right after the distance between the two colorings has changed.
The mixing time depends on how long the variable-length coupling typically lasts; in our setting it is Θ(n) w.h.p., so we should expect the same mixing time as in the one-step case.
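The stopping rule above can be written as a short driver around any one-step coupling. A minimal sketch on a toy instance — a 3-vertex path with 4 colors and a trivial identity coupling standing in for Vigoda's greedy coupling; the graph, colorings, and step function are illustrative assumptions, not part of the talk:

```python
import random

# Toy instance: a 3-vertex path with K = 4 colors.
NEIGHBORS = {0: (1,), 1: (0, 2), 2: (1,)}
K = 4

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def recolor(col, u, c):
    # Single-site update: accept color c at vertex u only if it stays proper.
    if all(col[w] != c for w in NEIGHBORS[u]):
        col = col[:u] + (c,) + col[u + 1:]
    return col

def step(x, y, rng):
    # Identity coupling: propose the same (vertex, color) in both copies.
    u = rng.randrange(len(x))
    c = rng.randrange(K)
    return recolor(x, u, c), recolor(y, u, c)

def variable_length_coupling(step_fn, x, y, dist, rng):
    # Run the one-step coupling repeatedly; stop right after the
    # distance between the two colorings changes (the slide's rule).
    d0 = dist(x, y)
    t = 0
    while dist(x, y) == d0:
        x, y = step_fn(x, y, rng)
        t += 1
    return x, y, t

x, y, t = variable_length_coupling(step, (0, 1, 0), (0, 2, 0),
                                   hamming, random.Random(0))
print(t, hamming(x, y))
```

Starting from two proper colorings differing only at the middle vertex, the driver returns the first time the Hamming distance leaves 1: it either drops to 0 (the copies coalesce) or jumps to 2 (a neighbor's update succeeds in one copy but not the other).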

85 Brittleness of extremal configurations
Formally, the optimality of 11/6 for the LP only shows that Vigoda’s greedy coupling doesn’t work below 11/6, but it is not hard to check that if any one-step coupling works, Vigoda’s greedy coupling will also work.

90 Avoiding extremal configurations
If you remove the two extremal-configuration constraints from Linear Program 1, the objective value becomes 161/88 = 1.8295…
We can't hope to never encounter an extremal configuration, but the extremal configurations seem incredibly brittle.
Nice Scenario: right before the end of the coupling, (σ,τ) satisfies |{c : (A_c, B_c; a_c, b_c) extremal}| / |{c : m_c ≠ 0}| ≤ γ for some absolute constant γ < 1.
In the Nice Scenario, we'd already beat 11/6. We show the Nice Scenario holds in expectation.
Can't hope to completely avoid extremal configurations, but what if, for a typical (σ,τ), only a constant fraction of the colors in N(v) are extremal? Why do we care only about the second kind of extremal configuration? Because the flip dynamics' advantage over Jerrum's chain is that it handles the case of all-unique colors around v much better. Indeed, if all the colors in N(v) were unique, we would be able to couple perfectly, but at the cost of handling the case where some colors are repeated.

91 Avoiding extremal configurations
If you remove the two extremal-configuration constraints from Linear Program 1, the objective value becomes 161/88 = 1.8295…
We can't hope to never encounter an extremal configuration, but the extremal configurations seem incredibly brittle.
We will call (σ,τ) c-bad if (A_c, B_c; a_c, b_c) = ((7,3); (3,3),(1,1)).
Nice Scenario: right before the end of the coupling, (σ,τ) satisfies |{c : (A_c, B_c; a_c, b_c) = ((7,3); (3,3),(1,1))}| / |{c : m_c ≥ 2}| ≤ γ for some absolute constant γ < 1.
In the Nice Scenario, we'd already beat 11/6. We show the Nice Scenario holds in expectation.

92 Avoiding extremal configurations
If you remove the two extremal-configuration constraints from Linear Program 1, the objective value becomes 161/88 = 1.8295…
We can't hope to never encounter an extremal configuration, but the extremal configurations seem incredibly brittle…
Nice Scenario: right before the end of the coupling, (σ,τ) satisfies |{c : (σ,τ) c-bad}| / |{c : m_c ≥ 2}| ≤ γ for some absolute constant γ < 1.
In the Nice Scenario, we'd already beat 11/6. We show the Nice Scenario holds in expectation.

93 Nice Scenario in expectation
Key Lemma: Suppose k ≥ 1.833d. Starting from any pair of colorings differing only at v,
𝔼[|{c : (σ,τ) c-bad}|] / 𝔼[|{c : m_c ≥ 2}|] ≤ γ
for some absolute constant γ < 1, where the expectation is over the (σ,τ) right before the end of the variable-length coupling.
We already know the Nice Scenario would let us beat 11/6. By linearity of expectation, so does the Key Lemma.

94 Proof sketch
Let p_bad(c) be the probability that (σ,τ) is c-bad right before the end of the coupling. Let p_good(c) be the probability that (σ,τ) is c-good, i.e. not c-bad and m_c ≥ 2, right before the end of the coupling. By linearity of expectation, it is enough to show: p_bad(c) ≤ γ′ · p_good(c) for all c, for some absolute constant γ′ > 0.
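To spell out the linearity-of-expectation step (a short derivation; it assumes, consistent with the extremal configuration above, that c-bad implies m_c ≥ 2, so c-good and c-bad partition the colors with m_c ≥ 2):

```latex
\mathbb{E}\big[\#\{c : (\sigma,\tau)\ c\text{-bad}\}\big]
  = \sum_c p_{\mathrm{bad}}(c)
  \;\le\; \gamma' \sum_c p_{\mathrm{good}}(c)
  \;=\; \gamma'\,\mathbb{E}\big[\#\{c : (\sigma,\tau)\ c\text{-good}\}\big],
```

and since the c-good and c-bad colors partition $\{c : m_c \ge 2\}$,

```latex
\frac{\mathbb{E}[\#\,\text{bad}]}{\mathbb{E}\big[\#\{c : m_c \ge 2\}\big]}
  = \frac{\mathbb{E}[\#\,\text{bad}]}{\mathbb{E}[\#\,\text{bad}] + \mathbb{E}[\#\,\text{good}]}
  \;\le\; \frac{\gamma'}{1+\gamma'} =: \gamma < 1 .
```

So any finite constant γ′ in the per-color bound yields the constant γ = γ′/(1+γ′) < 1 in the Key Lemma.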

95 Proof sketch: bad → good
(σ,τ) becomes c-good as soon as some w_i is selected and successfully recolored, with total probability ≥ (4/n) · (k−d−1)/k = Ω(1/n).
The coupling only terminates when one of the O(d) components in 𝒮_σ △ 𝒮_τ is chosen, with total probability ≤ O(d/(nk)) = O(1/n).
[Figure: v with c-colored neighbors w_1, w_2, w_3, w_4; here c is green. Transition diagram: (σ,τ) c-bad → (σ,τ) c-good with probability Ω(1/n); (σ,τ) c-bad → coupling terminates with probability O(1/n).]
Even if the pair of colorings becomes c-good at some point, how do you ensure it's more likely to stay c-good than revert to c-bad?

96 Proof sketch: good → bad
Toy case: suppose m_c > 2. At least m_c − 2 of the c-colored neighbors must be flipped. There are at most (k−2) · m_c such Kempe components, with total probability ≤ (p_{m_c−2}/(nk)) · (k−2) · m_c = O(1/n), provided m_c · p_{m_c−2} = O(1) — a condition we can add to the LP for free.
The coupling terminates as soon as v is chosen to be recolored with any color c′ not appearing in N(v), with total probability ≥ (1/n) · (k−d−1)/k = Ω(1/n).
[Transition diagram: (σ,τ) c-good → (σ,τ) c-bad with probability O(1/n); (σ,τ) c-good → coupling terminates with probability Ω(1/n).]
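The two transition diagrams above suggest a two-state caricature: from c-bad the next relevant event is "become good" (probability Ω(1/n)) or "terminate" (O(1/n)), and from c-good it is "become bad" (O(1/n)) or "terminate" (Ω(1/n)). A minimal sketch comparing the closed-form absorption probability of this jump chain with a simulation — the numerical rates are made up for illustration, not taken from the analysis:

```python
import random

def p_stop_while_bad(pg, ps, qb, qs):
    """Closed form for u = P(terminate while bad | start bad) in the
    two-state jump chain: from BAD go GOOD w.p. pg/(pg+ps), else stop;
    from GOOD go BAD w.p. qb/(qb+qs), else stop.
    Solves u = (ps + pg*v)/(ps+pg), v = qb*u/(qb+qs)."""
    return ps * (qb + qs) / ((ps + pg) * (qb + qs) - pg * qb)

def simulate(pg, ps, qb, qs, runs, rng):
    # Monte Carlo estimate of the same probability.
    stopped_bad = 0
    for _ in range(runs):
        state = "bad"
        while True:
            if state == "bad":
                if rng.random() < ps / (ps + pg):
                    stopped_bad += 1
                    break
                state = "good"
            else:
                if rng.random() < qs / (qs + qb):
                    break
                state = "bad"
    return stopped_bad / runs

# Illustrative rates (a common 1/n factor cancels throughout):
pg, ps = 16.0, 6.0   # bad -> good (Omega(1/n))  vs  bad -> stop (O(1/n))
qb, qs = 5.5, 4.0    # good -> bad (O(1/n))      vs  good -> stop (Omega(1/n))
u = p_stop_while_bad(pg, ps, qb, qs)
est = simulate(pg, ps, qb, qs, 20000, random.Random(0))
print(u, est)  # both ≈ 0.47: bounded away from 1
```

Because entering the bad state is no more likely than leaving it, the chain terminates while bad with probability bounded away from 1 — the heuristic behind p_bad(c) ≤ γ′ · p_good(c).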

97 Recap Reformulated Vigoda’s analysis as solving an LP.
Identified the two extremal configurations that obstruct us from beating 11/6 with any one-step coupling. Modified the coupling to terminate only once the distance changes. Argued that, right before the coupling terminates, only a constant fraction of the configurations around v are extremal in expectation.

98 An alternative approach
Delcourt, Perarnau, Postle '18 independently showed essentially the same result by doing a one-step coupling analysis on a deformation of the Hamming metric. Idea: take the Hamming metric minus a bonus term that counts the number of non-extremal configurations around v. Win-win analysis: few extremal configurations, so the Hamming term decreases; many extremal configurations, so the bonus term increases. This analysis is morally the same as the proof of the Key Lemma.

99 Open questions
Can we use this approach plus local-uniformity results to push below 11/6, or to handle high-girth graphs, via the SWK dynamics?
What other approximate counting problems are amenable to a more aggressive Markov chain and this approach?
Deterministic algorithm for k > 2d?
If k > (1+ε)d, can we show Pr_σ[σ has a Kempe component of size Ω(log n)] ≤ 1/poly(n)?
Question 4 is equivalent to showing the most basic decay of correlations: given any two vertices u, v in the graph that are log n apart, are their colors approximately independent? The best bound known for general graphs was k > 2d, though we think we now have it for

100 Thanks!

