1
Approximating The Permanent. Amit Kagan. Seminar in Complexity, 04/06/2001.
2
Topics: Description of the Markov chain. Analysis of its mixing time.
3
Definitions. Let G = (V₁, V₂, E) be a bipartite graph on n+n vertices. Let M denote the set of perfect matchings in G, and let M(y, z) denote the set of near-perfect matchings with holes only at y and z.
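The definitions above can be made concrete with a small brute-force enumerator. This is illustrative only (it is not part of the presentation, and the function names and the choice of K₃,₃ as a demo graph are mine); both sides of the bipartition are modeled as {0, …, n−1}:

```python
from itertools import permutations

def perfect_matchings(n, edges):
    """All perfect matchings of a bipartite graph on [n]+[n]."""
    E = set(edges)
    return [frozenset((i, p[i]) for i in range(n))
            for p in permutations(range(n))
            if all((i, p[i]) in E for i in range(n))]

def near_perfect_matchings(n, edges, y, z):
    """Near-perfect matchings with holes exactly at y in V1 and z in V2."""
    E = set(edges)
    rows = [i for i in range(n) if i != y]
    cols = [j for j in range(n) if j != z]
    return [frozenset((rows[k], p[k]) for k in range(n - 1))
            for p in permutations(cols)
            if all((rows[k], p[k]) in E for k in range(n - 1))]

# K_{3,3}: every pair (i, j) is an edge.
K33 = [(i, j) for i in range(3) for j in range(3)]
print(len(perfect_matchings(3, K33)))             # 6 = 3!
print(len(near_perfect_matchings(3, K33, 0, 0)))  # 2 = 2!
```

Brute force is of course exponential in n; it only serves to make |M| and |M(y, z)| tangible on tiny graphs.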
4
|M(u,v)|/|M| Exponentially Large. Observe the following bipartite graph, with distinguished vertices u and v: it has only one perfect matching...
5
|M(u,v)|/|M| Exponentially Large. But it has two near-perfect matchings with holes at u and v.
6
|M(u,v)|/|M| Exponentially Large. Concatenating another hexagon adds a constant number of vertices, but doubles the number of near-perfect matchings, while the number of perfect matchings remains 1. Thus we can force the ratio |M(u,v)|/|M| to be exponentially large.
7
The Breakthrough. Jerrum, Sinclair, and Vigoda [2000] introduced an additional weight factor. Any hole pattern (including the one with no holes) is equally likely in the stationary distribution π. Consequently, π assigns Ω(1/n²) weight to the perfect matchings.
8
Edge Weights. For each edge (y, z) ∈ E, we introduce a positive activity λ(y, z). For a matching M, λ(M) = Π_{(i,j)∈M} λ(i, j). For a set of matchings S, λ(S) = Σ_{M∈S} λ(M). We will work with the complete graph on n+n vertices: λ(e) = 1 for all e ∈ E, and λ(e) = ξ ≈ 0 for all e ∉ E.
9
The Stationary Distribution. The desired distribution π over Ω is π(M) ∝ Λ(M), where Λ(M) = λ(M)·w(u, v) if M ∈ M(u, v), and Λ(M) = λ(M) if M ∈ M; here w : V₁ × V₂ → ℝ⁺ is the weight function, to be specified shortly.
10
The Markov Chain.
1. Choose an edge e = (u, v) uniformly at random.
2. (i) If M ∈ M and e ∈ M, let M′ = M \ {e};
(ii) if M ∈ M(u, v), let M′ = M ∪ {e};
(iii) if M ∈ M(u, z) where z ≠ v, and (y, v) ∈ M, let M′ = M ∪ {e} \ {(y, v)};
(iv) if M ∈ M(y, v) where y ≠ u, and (u, z) ∈ M, let M′ = M ∪ {e} \ {(u, z)}.
3. Metropolis rule: with probability min{1, Λ(M′)/Λ(M)} go to M′; otherwise, stay at M.
11
The Markov Chain (cont.). Finally, we add a self-loop probability of ½ to every state. This ensures the MC is aperiodic. We also have irreducibility.
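One transition of the chain just described can be sketched in Python. This is a minimal sketch, not the presenters' implementation: it assumes the complete bipartite graph K_{n,n}, represents a matching as a dict from matched left-vertices to right-vertices, and all names (`step`, `lam`, `w`) are my own:

```python
import math
import random

def step(M, n, lam, w, rng=random):
    """One transition on perfect/near-perfect matchings of K_{n,n}.

    M   : dict mapping matched rows to columns
    lam : lam[(i, j)] -> activity of edge (i, j)
    w   : w[(u, v)]   -> hole-pattern weight (slide 9)
    """
    def weight(m):
        # Lambda(m): product of edge activities, times w(u, v) if m has holes
        total = math.prod(lam[e] for e in m.items())
        if len(m) < n:  # near-perfect: one hole on each side
            u = next(i for i in range(n) if i not in m)
            v = next(j for j in range(n) if j not in m.values())
            total *= w[(u, v)]
        return total

    if rng.random() < 0.5:                       # self-loop probability 1/2
        return M
    u, v = rng.randrange(n), rng.randrange(n)    # uniform edge e = (u, v)
    Mp = dict(M)
    if len(M) == n and M.get(u) == v:            # (i) remove e
        del Mp[u]
    elif len(M) < n and u not in M and v not in M.values():
        Mp[u] = v                                # (ii) add e, closing both holes
    elif len(M) < n and u not in M and v in M.values():
        y = next(i for i in M if M[i] == v)      # (iii) slide holes: drop (y, v)
        del Mp[y]
        Mp[u] = v
    elif len(M) < n and u in M and v not in M.values():
        Mp[u] = v                                # (iv) replaces (u, z)
    else:
        return M                                 # no move defined
    # Metropolis rule: accept with min{1, Lambda(M')/Lambda(M)}
    if rng.random() < min(1.0, weight(Mp) / weight(M)):
        return Mp
    return M
```

With all activities and weights equal to 1 every proposal is accepted, and the state simply wanders between perfect and near-perfect matchings.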
12
Detailed Balance. Consider two adjacent matchings M and M′ with Λ(M) ≤ Λ(M′). The transition probabilities between M and M′ may be written so that π(M)P(M, M′) = π(M′)P(M′, M) =: Q(M, M′), with P(M, M′) > 0.
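The detailed-balance identity above can be checked numerically for the Metropolis rule; the uniform proposal probability is the same in both directions, so it cancels and is omitted here (the function name is mine, for illustration only):

```python
def metropolis_flow(pi_x, pi_y):
    """Probability flow x -> y: pi(x) times acceptance min{1, pi(y)/pi(x)}."""
    return pi_x * min(1.0, pi_y / pi_x)

# pi(M) P(M, M') = pi(M') P(M', M) holds for arbitrary positive weights:
a, b = 0.3, 1.2
print(metropolis_flow(a, b), metropolis_flow(b, a))  # both equal 0.3
```

Whichever state is heavier, the flow in each direction equals min{π(M), π(M′)} times the proposal probability, which is exactly the quantity Q(M, M′) used later in the congestion bound.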
13
The Ideal Weight. Recall that π(M) ∝ Λ(M). Ideally, we would take w = w*, where w*(u, v) = λ(M)/λ(M(u, v)). Then Λ(M(u, v)) = Σ_{M∈M(u,v)} λ(M)w*(u, v) = λ(M(u, v))·w*(u, v) = λ(M), so every hole pattern carries the same total weight.
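For the uniform activities the algorithm starts from, w* has a closed form, which makes the definition easy to sanity-check (the function name is mine; the computation follows the formula w*(u,v) = λ(M)/λ(M(u,v)) on this slide):

```python
from math import factorial

def ideal_weight_uniform(n):
    """w*(u, v) on K_{n,n} with every activity equal to 1:
    lambda(M) = |M| = n!, lambda(M(u, v)) = |M(u, v)| = (n-1)!, so w* = n."""
    return factorial(n) / factorial(n - 1)

print(ideal_weight_uniform(3))  # 3.0
```

With this w*, each of the n² hole patterns gets total weight (n−1)!·n = n!, equal to the weight of the perfect matchings, exactly the "every hole pattern equally likely" goal of slide 7.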
14
The Concession. We will content ourselves with weights w satisfying w*(y, z)/2 ≤ w(y, z) ≤ 2w*(y, z), i.e. approximating w* within ratio 2. This perturbation will reduce the relative weight of perfect and near-perfect matchings by at most a constant factor (4).
15
The Mixing Time Theorem. Assuming the weight function w satisfies the above inequality for all (y, z) ∈ V₁ × V₂, the mixing time of the MC is bounded above by τ(δ) = O(m⁶n⁸(n log n + log δ⁻¹)), provided the initial state is a perfect matching of maximum activity.
16
Edge Weights Revisited. We will work with the complete graph on n+n vertices. Think of non-edges e ∉ E as having a very small activity of 1/n!; the combined weight of all invalid matchings is then at most 1. We begin with activities whose ideal weights w* are easy to compute, λ(e) ≡ 1 for all e, and progress towards our target activities: λ*(e) = 1 for all e ∈ E, λ*(e) = 1/n! for all e ∉ E.
17
Step I. We assume that at the beginning of the phase w(u, v) approximates w*(u, v) within ratio 2 for all (u, v). Before updating an activity, we find for each (u, v) a better approximation, one within ratio c for some 1 < c < 2. For this purpose we use the identity w*(u, v) = w(u, v)·π(M)/π(M(u, v)).
18
Step I (cont.). The mixing time theorem allows us to sample, in polynomial time, from a distribution π′ that is within variation distance δ of π. We choose δ = c₁/n², where c₁ > 0 is a sufficiently small constant, take O(n² log η⁻¹) samples from π′, and use sample averages. Using a few Chernoff bounds, we obtain, with probability at least 1 − (n² + 1)η, an approximation within ratio c to all of the w*(u, v).
19
Step I (conclusion). Taking c = 6/5 and using O(n² log η⁻¹) samples, we obtain refined estimates w(u, v) satisfying 5w*(u, v)/6 ≤ w(u, v) ≤ 6w*(u, v)/5.
20
Step II. We update the activity of an edge e: λ(e) ← λ(e)·exp(−1/2). The ideal weight function w* then changes by at most a factor of exp(1/2). Since (6/5)exp(1/2) ≈ 1.978 < 2, our estimates w after Step I approximate w* within ratio 2 for the new activities.
21
Step II (cont.). We use the above procedure repeatedly to reduce the initial activities λ(e) ≡ 1 to the target activities: λ*(e) = 1 for all e ∈ E, λ*(e) = 1/n! for all e ∉ E. This requires O(n² · n log n) phases. Each phase requires O(n² log η⁻¹) samples. Each sample requires O(n²¹ log n) simulation steps (mixing time theorem). Overall time: O(n²⁶ log² n · log η⁻¹).
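The phase count in the schedule above comes from simple arithmetic: each exp(−1/2) update shrinks one activity by a constant factor, and it must fall from 1 all the way to 1/n!. A short computation (function name mine, for illustration) makes this concrete:

```python
import math

def phases_per_edge(n):
    """Number of exp(-1/2) reductions to bring an activity from 1 down to 1/n!:
    smallest t with exp(-t/2) <= 1/n!, i.e. t = ceil(2 ln n!)  =  O(n log n)."""
    return math.ceil(2 * math.log(math.factorial(n)))

n = 10
per_edge = phases_per_edge(n)
total = n * n * per_edge          # ~n^2 non-edges, annealed one update per phase
print(per_edge, total)            # 31 phases per edge, 3100 in total for n = 10
```

Multiplying the O(n² · n log n) phases by O(n² log η⁻¹) samples and O(n²¹ log n) steps per sample reproduces the overall O(n²⁶ log² n · log η⁻¹) bound of the slide.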
22
The Error. We need to set η so that the overall failure probability is strictly less than 1, say 1/2. The probability that any phase fails is at most O(n³ log n · n²η). We will take η = c₂/(n⁵ log n).
23
Time Complexity. Running time of generating a sample: O(m⁶n⁸(n log n + log δ⁻¹)) = O(n²¹ log n) simulation steps, with m = n² and δ = c₁/n². Running time of the initialization: O(n²⁶ log² n · log η⁻¹) = O(n²⁶ log³ n), with η = c₂/(n⁵ log n).
24
Conductance. The conductance of a reversible MC is defined as Φ = min_S Φ(S), where Φ(S) = Q(S, S̄)/(π(S)π(S̄)) and Q(S, S̄) = Σ_{x∈S, y∈S̄} π(x)P(x, y). Theorem: for an ergodic, reversible Markov chain with self-loop probabilities P(x, x) ≥ ½ for all states x, τ_x(δ) = O(Φ⁻²(log π(x)⁻¹ + log δ⁻¹)).
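For tiny chains, the conductance can be computed by brute force over all cuts. A sketch (names mine), using the π(S)π(S̄)-normalized variant that matches the Φ(S) = Q(S, Ŝ)/π(S)π(Ŝ) expression on slide 37:

```python
import itertools

def conductance(P, pi):
    """Phi = min over nontrivial S of Q(S, S-bar) / (pi(S) * pi(S-bar)),
    where Q(x, y) = pi(x) * P(x, y). Exponential in the state count."""
    n = len(pi)
    best = float("inf")
    for r in range(1, n):
        for S in itertools.combinations(range(n), r):
            Sbar = [x for x in range(n) if x not in S]
            Q = sum(pi[x] * P[x][y] for x in S for y in Sbar)
            best = min(best, Q / (sum(pi[x] for x in S) * sum(pi[x] for x in Sbar)))
    return best

# lazy random walk on two states, uniform stationary distribution
P = [[0.5, 0.5], [0.5, 0.5]]
pi = [0.5, 0.5]
print(conductance(P, pi))  # Q = 0.25, pi(S)pi(S-bar) = 0.25, so Phi = 1.0
```

A chain with a bottleneck (small flow across some cut relative to the weight on each side) gets a small Φ, and, by the theorem, a correspondingly large mixing time.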
25
Canonical Paths. We define canonical paths γ_{I,F} from every I ∈ Ω to every F ∈ M. Denote Γ = {γ_{I,F} : (I, F) ∈ Ω × M}. Certain transitions on a canonical path will be deemed chargeable. For each transition t denote cp(t) = {(I, F) : γ_{I,F} contains t as a chargeable transition}.
26
I ⊕ F. If I ∈ M, then I ⊕ F consists of a collection of alternating cycles. If I ∈ M(y, z), then I ⊕ F consists of a collection of alternating cycles together with a single alternating path from y to z.
27
Type A Path. Assume I ∈ M. A cycle v₀ v₁ … v_{2k} = v₀, where w.l.o.g. the edge (v₀, v₁) belongs to I, is unwound by: (i) removing the edge (v₀, v₁); (ii) successively, for each 1 ≤ i ≤ k − 1, exchanging the edge (v_{2i}, v_{2i+1}) with (v_{2i−1}, v_{2i}); (iii) adding the edge (v_{2k−1}, v_{2k}). All these transitions are deemed chargeable.
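The unwinding schedule for a type A path can be written out mechanically; this sketch (function name and transition encoding mine) just lists the transitions for a cycle given as its vertex sequence v₀ … v_{2k−1}:

```python
def unwind_cycle(cycle):
    """Transitions unwinding an alternating cycle v0 v1 ... v_{2k-1} v0,
    where (v0, v1), (v2, v3), ... are the edges of the initial matching I."""
    k = len(cycle) // 2
    steps = [("remove", (cycle[0], cycle[1]))]
    for i in range(1, k):
        # exchange the I-edge (v_{2i}, v_{2i+1}) in for (v_{2i-1}, v_{2i})
        steps.append(("exchange",
                      (cycle[2 * i], cycle[2 * i + 1]),
                      (cycle[2 * i - 1], cycle[2 * i])))
    steps.append(("add", (cycle[2 * k - 1], cycle[0])))  # v_{2k} = v0
    return steps

print(unwind_cycle([0, 1, 2, 3, 4, 5]))
```

A cycle of length 2k is unwound in exactly k + 1 transitions: one removal, k − 1 exchanges, one addition; the intermediate states are the near-perfect matchings the canonical path passes through.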
28
Type A Path Illustrated. [figure: unwinding an alternating cycle v₀ v₁ … v₇]
29
Type B Path. Assume I ∈ M(y, z). The alternating path y = v₀ … v_{2k+1} = z is unwound by: (i) successively, for each 1 ≤ i ≤ k, exchanging the edge (v_{2i−1}, v_{2i}) with (v_{2i−2}, v_{2i−1}); and (ii) adding the edge (v_{2k}, v_{2k+1}). Here, only the above transitions are deemed chargeable.
30
Type B Path Illustrated. [figure: unwinding an alternating path from y to z]
31
Congestion. We define a notion of congestion of Γ: τ(Γ) = max_t Q(t)⁻¹ Σ_{(I,F)∈cp(t)} π(I)π(F), the maximum taken over transitions t. Lemma I: Assuming the weight w approximates w* within ratio 2, τ(Γ) ≤ 16m.
32
Lemma II. Let u, y ∈ V₁ and v, z ∈ V₂. Then: (i) λ(u, v)·λ(M(u, v)) ≤ λ(M); (ii) λ(u, v)·λ(M(u, z))·λ(M(y, v)) ≤ λ(M)·λ(M(y, z)), for all distinct vertices u, v, y, z. Proof idea: observe that M_{u,z} ⊕ M_{y,v} ∪ {(u, v)} decomposes into a collection of cycles together with an odd-length path O joining y and z.
33
Corollary III. Let u, y ∈ V₁ and v, z ∈ V₂. Then: (i) w*(u, v) ≥ λ(u, v); (ii) w*(u, z)·w*(y, v) ≥ λ(u, v)·w*(y, z), for all distinct vertices u, v, y, z; (iii) w*(u, z)·w*(y, v) ≥ λ(u, v)·λ(y, z), for all distinct vertices u, v, y, z.
34
Proof of Lemma I. For any transition t = (M, M′) and any pair of states I, F ∈ cp(t), we will define an encoding η_t(I, F) ∈ Ω such that η_t : cp(t) → Ω is an injection, and π(I)π(F) ≤ 8 min{π(M), π(M′)}·π(η_t(I, F)) = 16m Q(t)·π(η_t(I, F)). Summing over (I, F) ∈ cp(t), we get Q(t)⁻¹ Σ_{(I,F)∈cp(t)} π(I)π(F) ≤ 16m Σ_{(I,F)∈cp(t)} π(η_t(I, F)) ≤ 16m.
35
The Injection η_t. For a transition t = (M, M′) which is involved in stage (ii) of unwinding a cycle, the encoding is η_t(I, F) = I ⊕ F ⊕ (M ∪ M′) \ {(v₀, v₁)}. Otherwise, the encoding is η_t(I, F) = I ⊕ F ⊕ (M ∪ M′).
36
From Congestion to Conductance. Corollary IV: Assuming the weight function w approximates w* within ratio 2 for all (y, z) ∈ V₁ × V₂, then Φ ≥ 1/(100τ³n⁴) ≥ 1/(10⁶m³n⁴). Proof: Set α = 1/(10τn²). Let (S, Ŝ) be a partition of the state space.
37
Case I. π(S ∩ M)/π(S) ≥ α and π(Ŝ ∩ M)/π(Ŝ) ≥ α, with α = 1/(10τn²). Just looking at canonical paths of type A, we have a total flow of π(S ∩ M)·π(Ŝ ∩ M) ≥ α²π(S)π(Ŝ) across the cut. Thus τQ(S, Ŝ) ≥ α²π(S)π(Ŝ), and Φ(S) = Q(S, Ŝ)/(π(S)π(Ŝ)) ≥ α²/τ = 1/(100τ³n⁴).
38
Case II. Otherwise, say π(S ∩ M)/π(S) < α. Note the following estimates: π(M) ≥ 1/(4(n² + 1)) ≥ 1/(5n²); π(S ∩ M) < απ(S) < α; π(S \ M) = π(S) − π(S ∩ M) > (1 − α)π(S); Q(S \ M, S ∩ M) ≤ π(S ∩ M) < απ(S).
39
Case II (cont.). Consider the cut (S \ M, Ŝ ∪ (S ∩ M)). The weight of canonical paths crossing it (all chargeable, as they cross the cut) is π(S \ M)·π(M) ≥ (1 − α)π(S)/(5n²) ≥ π(S)/(6n²), with α = 1/(10τn²). Hence, τQ(S \ M, Ŝ ∪ (S ∩ M)) ≥ π(S)/(6n²). Subtracting the small flow into S ∩ M: Q(S, Ŝ) ≥ π(S)/(6τn²) − απ(S) = π(S)/(15τn²) ≥ π(S)π(Ŝ)/(15τn²). Thus Φ(S) = Q(S, Ŝ)/(π(S)π(Ŝ)) ≥ 1/(15τn²).
40
Summing It Up. Starting from an initial state X₀ of maximum activity guarantees π(X₀) ≥ 1/n!, and hence log(π(X₀)⁻¹) = O(n log n). We showed Φ(S) ≥ 1/(100τ³n⁴), and hence Φ⁻¹ = O(τ³n⁴) = O(m³n⁴). Thus, according to the conductance theorem, τ_{X₀}(δ) = O(Φ⁻²(log π(X₀)⁻¹ + log δ⁻¹)) = O(m⁶n⁸(n log n + log δ⁻¹)).