
1 Approximating The Permanent Amit Kagan Seminar in Complexity 04/06/2001

2 Topics Description of the Markov chain Analysis of its mixing time

3 Definitions Let G = (V1, V2, E) be a bipartite graph on n+n vertices. Let M denote the set of perfect matchings in G, and let M(y, z) denote the set of near-perfect matchings with holes only at y and z.
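
For concreteness, these definitions can be checked by brute force on tiny graphs. A minimal sketch (helper names are our own; exponential time, so toy examples only):

```python
from itertools import permutations

def perfect_matchings(adj):
    """All perfect matchings of a bipartite graph on n+n vertices;
    adj[i][j] is True iff (i, j) is an edge between V1 and V2."""
    n = len(adj)
    return [list(enumerate(p)) for p in permutations(range(n))
            if all(adj[i][j] for i, j in enumerate(p))]

def near_perfect_matchings(adj, y, z):
    """The set M(y, z): matchings leaving exactly y in V1 and z in V2 unmatched."""
    n = len(adj)
    rows = [i for i in range(n) if i != y]
    cols = [j for j in range(n) if j != z]
    return [list(zip(rows, p)) for p in permutations(cols)
            if all(adj[i][j] for i, j in zip(rows, p))]
```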

4 |M(u,v)|/|M| Exponentially Large Observe the following bipartite graph: it has only one perfect matching. [Figure: a chain of hexagons with distinguished vertices u and v]

5 |M(u,v)|/|M| Exponentially Large But it has two near-perfect matchings with holes at u and v. [Figure: the two near-perfect matchings]

6 |M(u,v)|/|M| Exponentially Large Concatenating another hexagon adds a constant number of vertices, but doubles the number of near-perfect matchings, while the number of perfect matchings remains 1. Thus we can force the ratio |M(u,v)|/|M| to be exponentially large. (See the sketch below.)
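
The doubling can be verified by brute force. The sketch below assumes one concrete reading of the figure (hexagons joined by bridge edges, with pendant vertices u and v attached to the first and last hexagon), so treat it as an illustration of the construction, not the original drawing:

```python
def count_matchings_covering(edges, vertices):
    """Count matchings of `edges` that cover exactly the set `vertices` (brute force)."""
    def rec(remaining):
        if not remaining:
            return 1
        x = min(remaining)                       # match the smallest uncovered vertex
        return sum(rec(remaining - {a, b}) for (a, b) in edges
                   if (a == x and b in remaining) or (b == x and a in remaining))
    return rec(frozenset(vertices))

def hexagon_chain(k):
    """k hexagons c{i}_0..c{i}_5, bridges c{i}_3 -- c{i+1}_0, pendants u and v."""
    edges, verts = [('u', 'c0_0'), ('v', f'c{k-1}_3')], ['u', 'v']
    for i in range(k):
        cyc = [f'c{i}_{j}' for j in range(6)]
        verts += cyc
        edges += [(cyc[j], cyc[(j + 1) % 6]) for j in range(6)]
        if i > 0:
            edges.append((f'c{i-1}_3', cyc[0]))
    return edges, verts

for k in (1, 2, 3):
    edges, verts = hexagon_chain(k)
    print(k,
          count_matchings_covering(edges, verts),                    # perfect: 1
          count_matchings_covering(edges, set(verts) - {'u', 'v'}))  # near-perfect: 2**k
```

Under this reading, each extra hexagon adds 6 vertices, keeps the perfect matching unique (parity forces every bridge edge into it), and doubles |M(u,v)| (no bridge can be used, so each hexagon contributes its 2 internal matchings independently).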

7 The Breakthrough Jerrum, Sinclair, and Vigoda [2000] introduced an additional weight factor. Any hole pattern (including that with no holes) is equally likely in the stationary distribution π. Since there are only n²+1 hole patterns, π will assign Ω(1/n²) weight to perfect matchings.

8 Edge Weights For each edge (y, z) ∈ E, we introduce a positive activity λ(y, z). For a matching M, λ(M) = ∏(i,j)∈M λ(i, j). For a set of matchings S, λ(S) = Σ M∈S λ(M). We will work with the complete graph on n+n vertices: λ(e) = 1 for all e ∈ E, and λ(e) = ξ ≈ 0 for all e ∉ E.
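
In code the activities are just a map from edges to positive reals; a small sketch in the same vein (requires Python ≥ 3.8 for math.prod):

```python
from math import prod

def lam_matching(lam, M):
    """lam(M): the product of the activities of the edges in the matching M."""
    return prod(lam[e] for e in M)

def lam_set(lam, S):
    """lam(S): the sum of lam(M) over all matchings M in the set S."""
    return sum(lam_matching(lam, M) for M in S)

# On the complete graph: lam[e] = 1 for e in E, lam[e] = xi (close to 0) otherwise.
```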

9 The Stationary Distribution The desired distribution π over Ω is π(M) ∝ Λ(M), where Λ(M) = λ(M)·w(u, v) for M ∈ M(u,v) and Λ(M) = λ(M) for M ∈ M; here w : V1 × V2 → ℝ+ is the weight function, to be specified shortly.

10 The Markov Chain 1. Choose an edge e = (u, v) uniformly at random. 2. (i) If M ∈ M and e ∈ M, let M’ = M \ {e}; (ii) if M ∈ M(u,v), let M’ = M ∪ {e}; (iii) if M ∈ M(u,z) where z ≠ v, and (y, v) ∈ M, let M’ = M ∪ {e} \ {(y, v)}; (iv) if M ∈ M(y,v) where y ≠ u, and (u, z) ∈ M, let M’ = M ∪ {e} \ {(u, z)}. 3. (Metropolis rule) With probability min{1, Λ(M’)/Λ(M)} go to M’; otherwise, stay at M.
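
The four move types transcribe almost literally into code. A sketch on the complete bipartite graph, with matchings as frozensets of pairs (i, j), i ∈ V1, j ∈ V2; `Lambda` stands for the weight Λ above, and `holes` is our own helper:

```python
import random

def holes(M, n):
    """Hole pair (u, v) of a near-perfect matching, or None if M is perfect."""
    free1 = [i for i in range(n) if all(i != a for (a, _) in M)]
    free2 = [j for j in range(n) if all(j != b for (_, b) in M)]
    return (free1[0], free2[0]) if free1 else None

def step(M, n, Lambda):
    """One transition of the chain (self-loop of 1/2 as on the next slide)."""
    if random.random() < 0.5:
        return M
    u, v = random.randrange(n), random.randrange(n)   # edge e = (u, v), uniform
    e, h = (u, v), holes(M, n)
    if h is None and e in M:                          # (i)  remove e
        M2 = M - {e}
    elif h == (u, v):                                 # (ii) add e
        M2 = M | {e}
    elif h is not None and h[0] == u and h[1] != v:   # (iii) v is matched to some y
        y = next(a for (a, b) in M if b == v)
        M2 = (M | {e}) - {(y, v)}
    elif h is not None and h[1] == v and h[0] != u:   # (iv)  u is matched to some z
        z = next(b for (a, b) in M if a == u)
        M2 = (M | {e}) - {(u, z)}
    else:
        return M
    # Metropolis rule:
    return M2 if random.random() < min(1.0, Lambda(M2) / Lambda(M)) else M
```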

11 The Markov Chain (cont.) Finally, we add a self-loop probability of ½ to every state. This ensures the MC is aperiodic. The chain is also irreducible.

12 Detailed Balance Consider two adjacent matchings M and M’ with Λ(M) ≤ Λ(M’). The transition probabilities between M and M’ may be written P(M, M’) = 1/(2m) and P(M’, M) = Λ(M)/(2m·Λ(M’)), so that π(M)P(M, M’) = π(M’)P(M’, M) =: Q(M, M’) > 0, and the chain is reversible with respect to π.

13 The Ideal Weight Recall that π(M) ∝ Λ(M), where Λ(M) = λ(M)·w(u, v) for M ∈ M(u,v) and Λ(M) = λ(M) for M ∈ M. Ideally, we would take w = w*, where w*(u, v) = λ(M)/λ(M(u,v)). With this choice, Λ(M(u,v)) = λ(M(u,v))·w*(u, v) = λ(M) = Λ(M), so every hole pattern carries the same total weight.

14 The Concession We will content ourselves with weights w satisfying w*(y, z)/2 ≤ w(y, z) ≤ 2w*(y, z), i.e., approximating w* within ratio 2. This perturbation will reduce the relative weight of perfect and near-perfect matchings by at most a constant factor (4).

15 The Mixing Time Theorem If the weight function w approximates w* within ratio 2 for all (y, z) ∈ V1 × V2, then the mixing time of the MC is bounded above by τ(δ) = O(m⁶n⁸(n log n + log δ⁻¹)), provided the initial state is a perfect matching of maximum activity.

16 Edge Weights Revisited We will work with the complete graph on n+n vertices. Think of non-edges e ∉ E as having a very small activity of 1/n!. The combined weight of all invalid matchings is then at most 1. We begin with activities whose ideal weights w* are easy to compute (λ(e) ≡ 1 for all e), and progress towards our target activities: λ*(e) = 1 for all e ∈ E, and λ*(e) = 1/n! for all e ∉ E.

17 Step I We assume at the beginning of the phase that w(u, v) approximates w*(u, v) within ratio 2 for all (u, v). Before updating an activity, we will find for each (u, v) a better approximation, one within ratio c for some 1 < c < 2. For this purpose we use the identity w*(u, v) = w(u, v) · π(M)/π(M(u,v)).
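
Written out in the slides' notation, the identity follows from the definitions on slides 9 and 13 (our derivation):

```latex
\frac{\pi(\mathcal{M}(u,v))}{\pi(\mathcal{M})}
  = \frac{\Lambda(\mathcal{M}(u,v))}{\Lambda(\mathcal{M})}
  = \frac{\lambda(\mathcal{M}(u,v))\, w(u,v)}{\lambda(\mathcal{M})}
  = \frac{w(u,v)}{w^*(u,v)}
\;\Longrightarrow\;
w^*(u,v) \;=\; w(u,v)\,\frac{\pi(\mathcal{M})}{\pi(\mathcal{M}(u,v))}.
```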

18 Step I (cont.) The mixing time theorem allows us to sample, in polynomial time, from a distribution π’ that is within variation distance δ of π. We choose δ = c₁/n², where c₁ > 0 is a sufficiently small constant, take O(n² log η⁻¹) samples from π’, and use sample averages. Using a few Chernoff bounds, we have, with probability 1 − (n²+1)η, approximation within ratio c to all of the w*(u, v).
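
A sketch of one refinement round, assuming a black-box `sample()` that returns matchings distributed (approximately) as π, and a one-argument variant of the earlier `holes` helper returning None on perfect matchings; both are our own stand-ins:

```python
from collections import Counter

def refine_weights(w, sample, holes, num_samples):
    """Estimate w*(u,v) = w(u,v) * pi(M) / pi(M(u,v)) by sample averages."""
    counts = Counter(holes(M) for M in (sample() for _ in range(num_samples)))
    p_perfect = counts[None] / num_samples          # empirical pi(M)
    return {(u, v): w_uv * p_perfect / (counts[(u, v)] / num_samples)
            for (u, v), w_uv in w.items() if counts[(u, v)] > 0}
```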

19 Step I (conclusion) Taking c = 6/5 and using O(n² log η⁻¹) samples, we obtain refined estimates w(u, v) satisfying 5w*(u, v)/6 ≤ w(u, v) ≤ 6w*(u, v)/5.

20 Step II We update the activity of an edge e: λ(e) ← λ(e)·exp(−1/2). The ideal weight function w* then changes by at most a factor of exp(1/2). Since (6/5)·exp(1/2) ≈ 1.978 < 2, our estimates w after Step I approximate w* within ratio 2 for the new activities.
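
The update itself is one line; what matters is that the drift stays inside the ratio-2 budget (a sketch):

```python
from math import exp

DECAY = exp(-0.5)                  # one phase: lam(e) <- lam(e) * e^(-1/2)

def update_activity(lam, e):
    lam[e] *= DECAY                # w* then moves by at most a factor e^(1/2)

assert (6 / 5) * exp(0.5) < 2      # 1.978... < 2: ratio-6/5 estimates survive
```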

21 Step II (cont.) We use the above procedure repeatedly to reduce the initial activities (λ(e) ≡ 1) to the target activities (λ*(e) = 1 for all e ∈ E, λ*(e) = 1/n! for all e ∉ E). This requires O(n² · n log n) phases. Each phase requires O(n² log η⁻¹) samples. Each sample requires O(n²¹ log n) simulation steps (mixing time theorem, with m = n²). ⇒ Overall time: O(n²⁶ log² n log η⁻¹).
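
The phase count is a quick calculation: each of the O(n²) non-edges must have its activity driven from 1 down to 1/n!, i.e. through 2·ln(n!) = O(n log n) factors of e^(-1/2). A back-of-the-envelope sketch:

```python
from math import factorial, log

def num_phases(n):
    per_non_edge = 2 * log(factorial(n))   # steps of e^(-1/2) from 1 down to 1/n!
    return int(n * n * per_non_edge)       # O(n^2) non-edges => O(n^3 log n) phases
```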

22 The η Error We need to set η so that the overall failure probability is strictly less than ε, say at most ε/2. The probability that any phase fails is at most O(n³ log n · n²η). We will take η = c₂ε/(n⁵ log n).

23 Time Complexity Running time of generating a sample: O(n²¹ log n) simulation steps (mixing time theorem). Running time of the initialization: O(n²⁶ log² n log η⁻¹), as computed on the previous slide.

24 Conductance The conductance of a reversible MC is defined as Φ = min S Φ(S), where Φ(S) = Q(S, Ŝ)/(π(S)π(Ŝ)) and Q(x, y) = π(x)P(x, y). Theorem: For an ergodic, reversible Markov chain with self-loop probabilities P(x, x) ≥ ½ for all states x, τ_x(δ) ≤ (2/Φ²)(ln π(x)⁻¹ + ln δ⁻¹).
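
For intuition, the definition can be evaluated exhaustively on a toy chain. A sketch using the form of Φ(S) adopted on these slides, with P a transition matrix as nested lists and pi a probability vector (exponential in the number of states):

```python
from itertools import combinations

def conductance(P, pi):
    """Phi = min over proper subsets S of Q(S, Sbar) / (pi(S) * pi(Sbar))."""
    states, best = range(len(pi)), float('inf')
    for r in range(1, len(pi)):
        for S in combinations(states, r):
            Sbar = [y for y in states if y not in S]
            Q = sum(pi[x] * P[x][y] for x in S for y in Sbar)
            piS = sum(pi[x] for x in S)
            best = min(best, Q / (piS * (1 - piS)))  # 1 - piS = pi(Sbar)
    return best
```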

25 Canonical Paths We define canonical paths γI,F from all I ∈ Ω to all F ∈ M. Denote Γ = {γI,F : (I, F) ∈ Ω × M}. Certain transitions on a canonical path will be deemed chargeable. For each transition t, denote cp(t) = {(I, F) : γI,F contains t as a chargeable transition}.

26 I ⊕ F If I ∈ M, then I ⊕ F consists of a collection of alternating cycles. If I ∈ M(y,z), then I ⊕ F consists of a collection of alternating cycles together with a single alternating path from y to z. [Figure: alternating cycles plus an alternating path from y to z]

27 Type A Path Assume I ∈ M, and w.l.o.g. that the edge (v0, v1) belongs to I. A cycle v0 → v1 → … → v2k = v0 is unwound by: (i) removing the edge (v0, v1), (ii) successively, for each 1 ≤ i ≤ k − 1, exchanging the edge (v2i, v2i+1) with (v2i−1, v2i), and (iii) adding the edge (v2k−1, v2k). All these transitions are deemed chargeable.
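
The unwinding is mechanical enough to state as code; a sketch with edges stored as frozensets so orientation along the cycle doesn't matter (helper names ours):

```python
def edge(a, b):
    return frozenset((a, b))

def unwind_cycle(M, cycle):
    """Unwind an alternating cycle [v0, ..., v_{2k-1}] (v_{2k} = v0) with
    edge (v0, v1) in M; returns the successive states of the canonical path."""
    k, states = len(cycle) // 2, [M]
    M = M - {edge(cycle[0], cycle[1])}                                  # (i)
    states.append(M)
    for i in range(1, k):                                               # (ii)
        M = M - {edge(cycle[2*i], cycle[2*i+1])} | {edge(cycle[2*i-1], cycle[2*i])}
        states.append(M)
    M = M | {edge(cycle[2*k-1], cycle[0])}                              # (iii)
    states.append(M)
    return states
```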

28 Type A Path Illustrated [Figure: the successive matchings while unwinding an 8-cycle v0, v1, …, v7]

29 Type B Path Assume I ∈ M(y,z). The alternating path y = v0 → … → v2k+1 = z is unwound by: (i) successively, for each 1 ≤ i ≤ k, exchanging the edge (v2i−1, v2i) with (v2i−2, v2i−1), and (ii) adding the edge (v2k, v2k+1). Here, only the above transitions are deemed chargeable.
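
The Type B unwinding in the same style, reusing the edge helper above (a sketch):

```python
def unwind_path(M, path):
    """Unwind an alternating path [y = v0, ..., v_{2k+1} = z]: k exchanges,
    then one edge addition, exactly as in the slide."""
    states = [M]
    for i in range(1, len(path) // 2):                                  # (i)
        M = M - {edge(path[2*i-1], path[2*i])} | {edge(path[2*i-2], path[2*i-1])}
        states.append(M)
    M = M | {edge(path[-2], path[-1])}                                  # (ii)
    states.append(M)
    return states
```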

30 Type B Path Illustrated [Figure: unwinding the alternating path from y to z]

31 Congestion We define a notion of congestion of Γ: τ(Γ) = max over transitions t of (1/Q(t)) Σ {π(I)π(F) : (I, F) ∈ cp(t)}. Lemma I If the weight w approximates w* within ratio 2, then τ(Γ) ≤ 16m.
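
Given the charged pairs cp(t) per transition, the congestion is a one-liner; a sketch with π and Q as dictionaries:

```python
def congestion(cp, pi, Q):
    """tau(Gamma) = max over transitions t of the charged weight
    sum of pi(I)*pi(F) over (I, F) in cp(t), divided by Q(t)."""
    return max(sum(pi[I] * pi[F] for (I, F) in pairs) / Q[t]
               for t, pairs in cp.items())
```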

32 Lemma II Let u, y ∈ V1 and v, z ∈ V2. Then, (i) λ(u,v)λ(M(u,v)) ≤ λ(M) for all u, v; (ii) λ(u,v)λ(M(u,z))λ(M(y,v)) ≤ λ(M)λ(M(y,z)) for all distinct vertices u, v, y, z. Observe that M_{u,z} ⊕ M_{y,v} ⊕ {(u, v)} (for M_{u,z} ∈ M(u,z), M_{y,v} ∈ M(y,v)) decomposes into a collection of cycles together with an odd-length path O joining y and z.

33 Corollary III Let u, y ∈ V1 and v, z ∈ V2. Then, (i) w*(u,v) ≥ λ(u,v) for all u, v; (ii) w*(u,z)w*(y,v) ≥ λ(u,v)·w*(y,z) for all distinct vertices u, v, y, z; (iii) w*(u,z)w*(y,v) ≥ λ(u,v)·λ(y,z) for all distinct vertices u, v, y, z.

34 Proof of Lemma I For any transition t = (M, M’) and any pair of states I, F ∈ cp(t), we will define an encoding ηt(I,F) ∈ Ω such that ηt : cp(t) → Ω is an injection, and π(I)π(F) ≤ 8 min{π(M), π(M’)}·π(ηt(I,F)) = 16m·Q(t)·π(ηt(I,F)). Summing over all (I, F) ∈ cp(t) and using injectivity, we get τ(Γ) ≤ 16m.

35 The Injection ηt For a transition t = (M, M’) which is involved in stage (ii) of unwinding a cycle, the encoding is ηt(I,F) = I ⊕ F ⊕ (M ∪ M’) \ {(v0, v1)}. Otherwise, the encoding is ηt(I,F) = I ⊕ F ⊕ (M ∪ M’).

36 From Congestion to Conductance Corollary IV If the weight function w approximates w* within ratio 2 for all (y, z) ∈ V1 × V2, then Φ ≥ 1/(100τ³n⁴) ≥ 1/(10⁶m³n⁴). Proof Set α = 1/(10τn²). Let (S, Ŝ) be a partition of the state-space.

37 Case I π(S ∩ M)/π(S) ≥ α and π(Ŝ ∩ M)/π(Ŝ) ≥ α. Just looking at canonical paths of type A, we have a total flow of π(S ∩ M)π(Ŝ ∩ M) ≥ α²π(S)π(Ŝ) across the cut. Thus, τQ(S, Ŝ) ≥ α²π(S)π(Ŝ), and Φ(S) = Q(S, Ŝ)/(π(S)π(Ŝ)) ≥ α²/τ = 1/(100τ³n⁴).

38 Case II Otherwise, π(S ∩ M)/π(S) < α. Note the following estimates: π(M) ≥ 1/(4(n²+1)) ≥ 1/(5n²); π(S ∩ M) < απ(S) < α; π(S \ M) = π(S) − π(S ∩ M) > (1 − α)π(S); Q(S \ M, S ∩ M) ≤ π(S ∩ M) < απ(M).

39 Case II (cont.) Consider the cut (S \ M, Ŝ ∪ M). The weight of canonical paths (all chargeable, as they cross the cut) is π(S \ M)π(M) ≥ (1 − α)π(S)/(5n²) ≥ π(S)/(6n²), where α = 1/(10τn²). Hence, τQ(S \ M, Ŝ ∪ M) ≥ π(S)/(6n²), and Q(S, Ŝ) ≥ … ≥ π(S)π(Ŝ)/(15τn²). ⇒ Φ(S) = Q(S, Ŝ)/(π(S)π(Ŝ)) ≥ 1/(15τn²).

40 Summing It Up Starting from an initial state X0 of maximum activity guarantees π(X0) ≥ 1/n!, and hence log(π(X0)⁻¹) = O(n log n). We showed Φ(S) ≥ 1/(100τ³n⁴), and hence Φ⁻¹ = O(τ³n⁴) = O(m³n⁴). Thus, according to the conductance theorem, τ_X0(δ) = O(m⁶n⁸(n log n + log δ⁻¹)).

