1
Adaptive annealing: a near-optimal connection between sampling and counting. Daniel Štefankovič (University of Rochester), Santosh Vempala, Eric Vigoda (Georgia Tech)
2
Adaptive annealing: a near-optimal connection between sampling and counting. Daniel Štefankovič (University of Rochester), Santosh Vempala, Eric Vigoda (Georgia Tech). If you want to count using MCMC, then statistical physics is useful.
3
Outline: 1. Counting problems 2. Basic tools: Chernoff, Chebyshev 3. Dealing with large quantities (the product method) 4. Statistical physics 5. Cooling schedules (our work) 6. More…
4
Counting: independent sets, spanning trees, matchings, perfect matchings, k-colorings.
6
Compute the number of spanning trees.
7
Compute the number of spanning trees. Kirchhoff's Matrix Tree Theorem: the number of spanning trees of G equals det((D - A)_vv), the determinant of the Laplacian D - A with the row and column of any one vertex v deleted. Example (4-cycle): A has rows (0,1,0,1), (1,0,1,0), (0,1,0,1), (1,0,1,0) and D = diag(2,2,2,2); deleting one row and column of D - A leaves a 3×3 matrix whose determinant is 4, the number of spanning trees of the 4-cycle.
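As a quick illustration of the Matrix Tree Theorem, here is a minimal sketch (not part of the talk) that counts spanning trees via the determinant of a reduced Laplacian; it assumes numpy is available and uses the 4-cycle from the slide.

```python
import numpy as np

def spanning_trees(adj):
    """Kirchhoff's Matrix Tree Theorem: the number of spanning trees equals
    the determinant of the Laplacian D - A with one row and column deleted."""
    A = np.array(adj, dtype=float)
    D = np.diag(A.sum(axis=1))
    L = D - A
    minor = L[1:, 1:]                 # delete the row/column of vertex 0
    return int(round(np.linalg.det(minor)))

# 4-cycle: 4 spanning trees
c4 = [[0, 1, 0, 1],
      [1, 0, 1, 0],
      [0, 1, 0, 1],
      [1, 0, 1, 0]]
print(spanning_trees(c4))  # 4
```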
8
Compute the number of spanning trees: there is a polynomial-time algorithm mapping G to the number of spanning trees of G.
9
Counting: independent sets, spanning trees, matchings, perfect matchings, k-colorings. Which of the remaining problems can be solved in polynomial time?
10
Compute the number of independent sets of a graph (hard-core gas model). An independent set = a subset S of vertices, no two of which are neighbors.
11
Example: # independent sets = 7. (Independent set = a subset S of vertices, no two of which are neighbors.)
12
# independent sets of G_1, G_2, G_3, ..., G_{n-2}, G_{n-1}, G_n = ?
13
# independent sets of G_1, G_2, G_3, ..., G_{n-2}, G_{n-1}, G_n = 2, 3, 5, ..., F_{n-1}, F_n, F_{n+1} (the Fibonacci numbers).
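The Fibonacci pattern comes from a simple dynamic program; the following sketch (my own illustration, assuming the graphs G_i are paths on i vertices, which is what the F_{n+1} answer suggests) reproduces the counts 2, 3, 5, 8, ...

```python
def count_independent_sets_path(n):
    """Independent sets of the path on n vertices, by dynamic programming:
    `exclude` counts sets not using the last vertex, `include` counts sets
    using it; the total follows the Fibonacci recursion."""
    if n == 0:
        return 1
    exclude, include = 1, 1          # one vertex: {} and {v}
    for _ in range(n - 1):
        exclude, include = exclude + include, exclude
    return exclude + include

print([count_independent_sets_path(n) for n in range(1, 6)])  # [2, 3, 5, 8, 13]
```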
14
Example: # independent sets = 5598861. (Independent set = a subset S of vertices, no two of which are neighbors.)
15
Compute the number of independent sets: is there a polynomial-time algorithm mapping G to the number of independent sets of G?
16
Compute the number of independent sets with a polynomial-time algorithm mapping G to the number of independent sets of G? Unlikely!
17
Counting independent sets (graph G → # independent sets in G) is #P-complete, and remains #P-complete even for 3-regular graphs (Dyer, Greenhill, 1997). (FP vs. #P plays the role that P vs. NP plays for decision problems.)
18
graph G → # independent sets in G: can approximation and/or randomization help?
19
graph G → # independent sets in G: approximation? randomization? Which is more important?
20
graph G → # independent sets in G: approximation? randomization? Which is more important? My world-view: (true) randomness is important conceptually but NOT computationally (i.e., I believe P = BPP); approximation makes problems easier (i.e., I believe #P = BPP).
21
We would like to know Q. Goal: a random variable Y such that P( (1-ε)Q ≤ Y ≤ (1+ε)Q ) ≥ 1-δ; "Y gives a (1±ε)-estimate."
22
We would like to know Q. Goal: a random variable Y such that P( (1-ε)Q ≤ Y ≤ (1+ε)Q ) ≥ 1-δ. FPRAS (fully polynomial randomized approximation scheme): an algorithm that, given G, ε, δ, produces such a Y in time polynomial in the size of G, 1/ε, and ln(1/δ).
23
Outline: 1. Counting problems 2. Basic tools: Chernoff, Chebyshev 3. Dealing with large quantities (the product method) 4. Statistical physics 5. Cooling schedules (our work) 6. More...
24
We would like to know Q. 1. Get an unbiased estimator X, i.e., E[X] = Q. 2. "Boost the quality" of X: Y = (X_1 + X_2 + ... + X_n)/n.
25
The Bienaymé-Chebyshev inequality: P( Y gives a (1±ε)-estimate ) ≥ 1 - (V[Y]/E[Y]²)·(1/ε²).
26
The Bienaymé-Chebyshev inequality: for Y = (X_1 + X_2 + ... + X_n)/n, P( Y gives a (1±ε)-estimate ) ≥ 1 - (V[Y]/E[Y]²)·(1/ε²), and V[Y]/E[Y]² = (1/n)·V[X]/E[X]², where V[X]/E[X]² is the squared coefficient of variation (SCV).
27
The Bienaymé-Chebyshev inequality: let X_1,...,X_n, X be independent, identically distributed random variables with Q = E[X], and let Y = (X_1 + X_2 + ... + X_n)/n. Then P( Y gives a (1±ε)-estimate of Q ) ≥ 1 - V[X]/(n·E[X]²)·(1/ε²).
28
Chernoff's bound: let X_1,...,X_n, X be independent, identically distributed random variables with 0 ≤ X ≤ 1 and Q = E[X], and let Y = (X_1 + X_2 + ... + X_n)/n. Then P( Y gives a (1±ε)-estimate of Q ) ≥ 1 - e^{-ε²·n·E[X]/3}.
30
Number of samples to achieve precision ε with confidence δ: Chebyshev: n ≥ (V[X]/E[X]²)·(1/ε²)·(1/δ); Chernoff (for 0 ≤ X ≤ 1): n ≥ (1/E[X])·(1/ε²)·3 ln(1/δ).
31
Number of samples to achieve precision ε with confidence δ: Chebyshev: n ≥ (V[X]/E[X]²)·(1/ε²)·(1/δ) (the 1/δ factor is BAD); Chernoff (for 0 ≤ X ≤ 1): n ≥ (1/E[X])·(1/ε²)·3 ln(1/δ) (the ln(1/δ) factor is GOOD, but the requirement 0 ≤ X ≤ 1 is BAD).
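A small numeric illustration of the two sample-size formulas above (the function names are mine): for the same precision, the 1/δ factor of Chebyshev dwarfs the ln(1/δ) factor of Chernoff.

```python
from math import ceil, log

def chebyshev_samples(scv, eps, delta):
    """Samples for a (1 +/- eps)-estimate with confidence 1 - delta via Chebyshev:
    n >= SCV / (eps^2 * delta), where SCV = V[X]/E[X]^2."""
    return ceil(scv / (eps**2 * delta))

def chernoff_samples(mean, eps, delta):
    """Samples via Chernoff for 0 <= X <= 1 with E[X] = mean:
    n >= 3*ln(1/delta) / (eps^2 * mean)."""
    return ceil(3 * log(1 / delta) / (eps**2 * mean))

# confidence dependence: 1/delta vs ln(1/delta)
print(chebyshev_samples(scv=1.0, eps=0.1, delta=1e-6))   # about 10^8
print(chernoff_samples(mean=0.5, eps=0.1, delta=1e-6))   # about 8300
```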
32
Median "boosting trick": by Bienaymé-Chebyshev, with n ≈ 4·(V[X]/E[X]²)·(1/ε²) samples, Y = (X_1 + X_2 + ... + X_n)/n lands in [(1-ε)Q, (1+ε)Q] with probability ≥ 3/4.
33
Median trick – repeat 2T times: by Bienaymé-Chebyshev, each repetition lands in [(1-ε)Q, (1+ε)Q] with probability ≥ 3/4; by Chernoff, with probability ≥ 1 - e^{-T/4} more than T out of the 2T repetitions land there, and then the median lands in [(1-ε)Q, (1+ε)Q].
34
With the median trick: Chebyshev + median: n ≈ 32·(V[X]/E[X]²)·(1/ε²)·ln(1/δ); Chernoff (0 ≤ X ≤ 1): n ≈ (1/E[X])·(1/ε²)·3 ln(1/δ). Both now depend only on ln(1/δ); the requirement 0 ≤ X ≤ 1 remains the BAD part.
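For concreteness, here is a hedged sketch of the median-of-means construction just described; `sample` is an assumed zero-argument callable returning one draw of X, and the constants follow the Chebyshev/Chernoff calculation on the slides.

```python
import math
import random
from statistics import median

def median_of_means(sample, scv_bound, eps, delta):
    """Median trick: average enough i.i.d. samples for a 3/4-confident
    (1 +/- eps)-estimate (Chebyshev), repeat 2T times, return the median."""
    n = math.ceil(4 * scv_bound / eps**2)     # per-repetition sample size
    T = math.ceil(4 * math.log(1 / delta))    # so that e^{-T/4} <= delta
    means = [sum(sample() for _ in range(n)) / n for _ in range(2 * T)]
    return median(means)

# toy example: estimate E[X] = 0.5 for X uniform on [0,1]  (SCV = 1/3)
print(median_of_means(random.random, scv_bound=1/3, eps=0.05, delta=0.01))
```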
35
Creating an "approximator" from X: n ≈ (V[X]/E[X]²)·(1/ε²)·ln(1/δ) samples, where ε = precision and δ = confidence.
36
Outline: 1. Counting problems 2. Basic tools: Chernoff, Chebyshev 3. Dealing with large quantities (the product method) 4. Statistical physics 5. Cooling schedules (our work) 6. More...
37
(Approximate) counting reduces to sampling: Valleau, Card '72 (physical chemistry); Babai '79 (for matchings and colorings); Jerrum, Valiant, V. Vazirani '86. The outcome of the JVV reduction: random variables X_1, X_2, ..., X_t such that 1) E[X_1 X_2 ... X_t] = "WANTED", and 2) the X_i are easy to estimate: V[X_i]/E[X_i]² = O(1) (squared coefficient of variation, SCV).
38
Theorem (Dyer-Frieze '91): given 1) E[X_1 X_2 ... X_t] = "WANTED" and 2) the X_i easy to estimate (V[X_i]/E[X_i]² = O(1)), O(t²/ε²) samples (O(t/ε²) from each X_i) give a (1±ε)-estimator of "WANTED" with probability ≥ 3/4. So (approximate) counting reduces to sampling.
39
JVV for independent sets. GOAL: given a graph G, estimate the number of independent sets of G. Key identity: # independent sets = 1 / P(∅), where P(∅) is the probability that a uniformly random independent set is the empty set.
40
JVV for independent sets: by the chain rule P(A ∩ B) = P(A)·P(B|A), write P(∅) as a product of conditional probabilities X_1, X_2, X_3, X_4, ... (one per vertex: the probability that the vertex is excluded, given that the previous vertices are excluded). Each X_i ∈ [0,1] with E[X_i] ≥ ½, so V[X_i]/E[X_i]² = O(1).
42
Self-reducibility for independent sets: in the running example, P(a fixed vertex v is not in a uniformly random independent set) = 5/7.
43
Self-reducibility for independent sets: 5/7 = (# independent sets avoiding v) / (# independent sets of G) = (# independent sets of G - v) / (# independent sets of G).
44
Self-reducibility for independent sets: hence # independent sets of G = (7/5)·(# independent sets of G - v), and we can recurse on the smaller graph G - v.
45
Self-reducibility, next step: on the smaller graph, P(the next vertex is not in a random independent set) = 3/5.
46
Self-reducibility, next step: 3/5 = (# independent sets of the further-reduced graph) / 5, so the count 5 is (5/3) times the next count.
47
Self-reducibility: the ratios telescope, (5/7)·(3/5)·(2/3) = 2/7, so inverting them (together with the count for the final trivial graph) recovers # independent sets of G = 7.
48
SAMPLER ORACLE: graph G → uniformly random independent set of G. JVV: if we have such a sampler oracle, then we get an FPRAS using O(n²) samples.
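A rough sketch of how the self-reducibility argument turns a sampler oracle into a counter; `sampler_oracle` is an assumed black box returning a uniformly random independent set of the remaining graph, and the bookkeeping of edges is elided.

```python
def jvv_count_independent_sets(vertices, sampler_oracle, samples_per_step=1000):
    """Self-reducibility sketch: vertex by vertex, estimate the probability
    that v is NOT in a random independent set (which is #IS(G - v)/#IS(G)),
    multiply the inverse ratios together, and recurse on the smaller graph."""
    graph = set(vertices)      # placeholder: a real version also tracks edges
    count = 1.0
    for v in vertices:
        hits = sum(1 for _ in range(samples_per_step)
                   if v not in sampler_oracle(graph))
        p_out = max(hits, 1) / samples_per_step   # at least 1/2 in expectation
        count /= p_out                             # times #IS(G) / #IS(G - v)
        graph.discard(v)                           # recurse on G - v
    return count                                   # empty graph has exactly 1 IS
```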
49
JVV: a sampler oracle (graph G → uniformly random independent set of G) gives an FPRAS using O(n²) samples. ŠVV: a sampler oracle (graph G → set drawn from the hard-core gas-model Gibbs distribution at inverse temperature β) gives an FPRAS using O*(n) samples.
50
Application – independent sets: O*(|V|) samples suffice for counting. Cost per sample (Vigoda '01, Dyer-Greenhill '01): time O*(|V|) for graphs of degree ≤ 4. Total running time: O*(|V|²).
51
Other applications (total running time): matchings O*(n²m) (using Jerrum, Sinclair '89); spin systems: Ising model O*(n²) for β < β_C (using Marinelli, Olivieri '95); k-colorings O*(n²) for k > 2Δ (using Jerrum '95).
52
Outline: 1. Counting problems 2. Basic tools: Chernoff, Chebyshev 3. Dealing with large quantities (the product method) 4. Statistical physics 5. Cooling schedules (our work) 6. More…
53
easy = hot hard = cold
54
Hamiltonian: a function H assigning an energy to each configuration (the pictured configurations have H = 0, 1, 2, 4).
55
H : Ω → {0,...,n}, where Ω is a big set. Goal: estimate |H⁻¹(0)|, by writing |H⁻¹(0)| = E[X_1] ··· E[X_t] as a product of expectations.
56
Distributions between hot and cold (Gibbs distributions): μ_β(x) ∝ exp(-β·H(x)), where β = inverse temperature. β = 0: hot, uniform on Ω; β = ∞: cold, uniform on H⁻¹(0).
57
Distributions between hot and cold: μ_β(x) = exp(-β·H(x)) / Z(β). The normalizing factor is the partition function Z(β) = Σ_x exp(-β·H(x)).
58
Partition function: Z(β) = Σ_x exp(-β·H(x)). We have Z(0) = |Ω|; we want Z(∞) = |H⁻¹(0)|.
59
Partition function – example: Z(β) = Σ_x exp(-β·H(x)). Here Z(β) = 1·e^{-4β} + 4·e^{-2β} + 4·e^{-β} + 7, so Z(0) = 16 and Z(∞) = 7.
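A two-line numerical check of this example (the coefficients a_k are read off the slide): Z(0) = 16 and Z(β) tends to 7 as β grows.

```python
import math

# a_k = number of configurations with H(x) = k, from the slide's example
a = {4: 1, 2: 4, 1: 4, 0: 7}

def Z(beta):
    """Z(beta) = sum_k a_k * exp(-beta * k)."""
    return sum(ak * math.exp(-beta * k) for k, ak in a.items())

print(Z(0))     # 16 = |Omega|
print(Z(50))    # about 7 = |H^{-1}(0)|, the cold limit
```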
60
Assumption: we have a sampler oracle for μ_β(x) = exp(-β·H(x)) / Z(β): given a graph G and β, it returns a subset of V drawn from μ_β.
61
Assumption: we have a sampler oracle for μ_β(x) = exp(-β·H(x)) / Z(β); let W be a sample drawn from μ_β.
62
Assumption: we have a sampler oracle for μ_β(x) = exp(-β·H(x)) / Z(β); for W ~ μ_β, define X = exp(H(W)·(β - β')).
63
With W ~ μ_β and X = exp(H(W)·(β - β')), we can obtain the following ratio: E[X] = Σ_s μ_β(s)·X(s) = Z(β')/Z(β).
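In code, the ratio estimator looks roughly as follows; `gibbs_sampler` and `H` are assumed black boxes (the sampler oracle and the Hamiltonian), and the estimate is only reliable when β' is close enough to β that the SCV stays O(1).

```python
import math

def estimate_ratio(gibbs_sampler, H, beta, beta_next, samples=10000):
    """Estimate Z(beta_next)/Z(beta): draw W ~ mu_beta from the assumed
    `gibbs_sampler(beta)` black box and average X = exp(H(W)*(beta - beta_next)).
    E[X] = Z(beta_next)/Z(beta); keeping the SCV O(1) is exactly what the
    cooling schedule has to ensure."""
    total = 0.0
    for _ in range(samples):
        w = gibbs_sampler(beta)
        total += math.exp(H(w) * (beta - beta_next))
    return total / samples
```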
64
Our goal restated. Partition function: Z(β) = Σ_x exp(-β·H(x)); goal: estimate Z(∞) = |H⁻¹(0)| via the telescoping product Z(∞) = [Z(β_1)/Z(β_0)]·[Z(β_2)/Z(β_1)] ··· [Z(β_t)/Z(β_{t-1})]·Z(0), where β_0 = 0 < β_1 < β_2 < ... < β_t = ∞.
65
Our goal restated: Z(∞) = [Z(β_1)/Z(β_0)]·[Z(β_2)/Z(β_1)] ··· [Z(β_t)/Z(β_{t-1})]·Z(0). How to choose the cooling schedule β_0 = 0 < β_1 < β_2 < ... < β_t = ∞? For each step, E[X_i] = Z(β_i)/Z(β_{i-1}); minimize the length t while keeping V[X_i]/E[X_i]² = O(1).
67
Outline: 1. Counting problems 2. Basic tools: Chernoff, Chebyshev 3. Dealing with large quantities (the product method) 4. Statistical physics 5. Cooling schedules (our work) 6. More...
68
Parameters: A and n. Z(0) = A and H : Ω → {0,...,n}; Z(β) = Σ_x exp(-β·H(x)) = Σ_{k=0}^{n} a_k·e^{-βk}, where a_k = |H⁻¹(k)|.
69
Parameters (Z(0) = A, H : Ω → {0,...,n}): independent sets: A = 2^|V|, n = |E|; matchings and perfect matchings: A = |V|!, n = |V|; k-colorings: A = k^|V|, n = |E|.
70
Illustration for matchings: matchings = # ways of marrying the two sides so that there is no unhappy couple.
73
For the annealing formulation, marry ignoring "compatibility"; the Hamiltonian = the number of unhappy couples (so H⁻¹(0) is exactly the set we want to count).
74
Parameters (recap): independent sets A = 2^|V|, n = |E|; matchings and perfect matchings A = |V|!, n = |V|; k-colorings A = k^|V|, n = |E|.
75
Previous cooling schedules (Bezáková, Štefankovič, Vigoda, V. Vazirani '06), for Z(0) = A, H : Ω → {0,...,n}, β_0 = 0 < β_1 < β_2 < ... < β_t = ∞. "Safe steps": β → β + 1/n; β → β·(1 + 1/ln A); once β ≥ ln A, jump to ∞. These give cooling schedules of length O(n ln A), and, combined, of length O((ln n)(ln A)).
77
Why the safe step β → β + 1/n works (Bezáková, Štefankovič, Vigoda, V. Vazirani '06): with Z(β) = Σ_{k=0}^n a_k e^{-βk} and X = exp(H(W)·(β - β')), we have H(W) ≤ n, so 1/e ≤ X ≤ 1; hence E[X] ≥ 1/e, and since X ≤ 1, V[X]/E[X]² ≤ 1/E[X] - 1 ≤ e - 1.
78
Why the safe step ln A → ∞ works: Z(∞) = a_0 ≥ 1 and Z(ln A) ≤ a_0 + 1 (the remaining terms contribute at most A·e^{-ln A} = 1), so E[X] = Z(∞)/Z(ln A) ≥ 1/2.
79
Why the safe step β → β·(1 + 1/ln A) works: here one shows E[X] ≥ 1/(2e), so the SCV is again O(1).
80
Previous cooling schedules (Bezáková, Štefankovič, Vigoda, V. Vazirani '06): using only the +1/n safe step gives the schedule 1/n, 2/n, 3/n, ..., (ln A)/n, ..., ln A of length O(n ln A); combining the safe steps gives cooling schedules of length O((ln n)(ln A)).
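A sketch of a non-adaptive schedule assembled from the safe steps (my own illustration, not the exact BŠVV schedule): it greedily takes whichever safe step moves further, giving roughly (ln n)(ln A) steps instead of n·ln A.

```python
import math

def safe_schedule(n, A):
    """Non-adaptive cooling schedule built only from the "safe steps"
    beta -> beta + 1/n and beta -> beta*(1 + 1/ln A), stopping at ln A
    (from there the cold limit is one more safe step away)."""
    lnA = math.log(A)
    betas, beta = [0.0], 1.0 / n                 # first additive step from 0
    while beta < lnA:
        betas.append(beta)
        beta = max(beta + 1.0 / n, beta * (1 + 1.0 / lnA))  # larger safe step
    betas.append(lnA)
    return betas

sched = safe_schedule(n=100, A=2**100)
print(len(sched))   # roughly (ln n)(ln A) steps rather than n*ln A
```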
81
No better fixed schedule is possible. THEOREM: consider the partition functions Z_a(β) = (A/(1+a))·(1 + a·e^{-nβ}) for a ∈ [0, A-1] (each has Z_a(0) = A and H : Ω → {0,...,n}); any schedule that works for all of them has length Ω((ln n)(ln A)).
82
Parameters: Z(0) = A, H : Ω → {0,...,n}. Previously: non-adaptive schedules of length Θ*(ln A). Our main result: an adaptive schedule of length O*((ln A)^{1/2}).
83
Related work: Lovász-Vempala compute the volume of convex bodies in O*(n⁴), using a cooling schedule of length O(n^{1/2}); that is a non-adaptive schedule exploiting specific properties of the "volume" partition functions. We get an adaptive schedule of length O*((ln A)^{1/2}) for general partition functions.
84
Existential part. Lemma: for every partition function there exists a cooling schedule of length O*((ln A)^{1/2}).
85
Cooling schedule (definition refresh): Z(∞) = [Z(β_1)/Z(β_0)]·[Z(β_2)/Z(β_1)] ··· [Z(β_t)/Z(β_{t-1})]·Z(0), with β_0 = 0 < β_1 < ... < β_t = ∞ and E[X_i] = Z(β_i)/Z(β_{i-1}); minimize the length while keeping V[X_i]/E[X_i]² = O(1).
86
Express the SCV using the partition function (going from β to β'): with W ~ μ_β and X = exp(H(W)·(β - β')), E[X] = Z(β')/Z(β) and V[X]/E[X]² + 1 = E[X²]/E[X]² = Z(2β' - β)·Z(β) / Z(β')² =: C.
87
Proof idea: let f(β) = ln Z(β). The condition E[X²]/E[X]² = Z(2β' - β)·Z(β)/Z(β')² ≤ C becomes, with C' = (ln C)/2, the condition (f(2β' - β) + f(β))/2 ≤ C' + f(β'); this is a statement about the graph of f at the points β, β', 2β' - β.
88
Properties of partition functions: f(β) = ln Z(β) is decreasing and convex, with f'(0) ≥ -n and f(0) ≤ ln A.
89
Properties of partition functions: f(β) = ln Z(β) = ln Σ_{k=0}^n a_k e^{-βk} is decreasing and convex, f'(0) ≥ -n, f(0) ≤ ln A; using (ln g)' = g'/g, f'(β) = -(Σ_{k=0}^n a_k·k·e^{-βk}) / (Σ_{k=0}^n a_k·e^{-βk}).
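The monotonicity and convexity claims follow from a standard computation that the slide only hints at: differentiating f = ln Z gives

```latex
f'(\beta) \;=\; \frac{-\sum_{k=0}^{n} a_k\,k\,e^{-\beta k}}{\sum_{k=0}^{n} a_k\,e^{-\beta k}}
          \;=\; -\,\mathbb{E}_{\mu_\beta}[H]\ \in\ [-n,\,0],
\qquad
f''(\beta) \;=\; \mathrm{Var}_{\mu_\beta}[H]\ \ge\ 0 .
```

So f is decreasing, its slope at β = 0 is no smaller than -n, and it is convex; together with f(0) = ln A these are exactly the properties used in the argument below.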
90
GOAL: proving the Lemma: for every partition function there exists a cooling schedule of length O*((ln A)^{1/2}). Proof idea: in each step, either f or ln|f'| changes a lot; set K := (change in f)·(change in ln|f'|) and show K ≥ 1 per step, so that (summing over steps and applying Cauchy-Schwarz) the number of steps is at most [(total change of f)·(total change of ln|f'|)]^{1/2} = O*((ln A)^{1/2}).
91
Proof: let K := [f(a) - f(b)]·[ln|f'(a)| - ln|f'(b)|]. Take c := (a+b)/2 and Δ := b - a, and suppose f(c) = (f(a) + f(b))/2 - 1. Since f is convex and decreasing, (f(a) - f(c)) / |f'(a)| ≤ Δ/2 ≤ (f(c) - f(b)) / |f'(b)|.
92
Combining the two inequalities with f(c) = (f(a) + f(b))/2 - 1 gives |f'(b)| / |f'(a)| ≤ 1 - 1/Δf ≤ e^{-1/Δf}, where Δf := f(a) - f(b); hence ln|f'(a)| - ln|f'(b)| ≥ 1/Δf, i.e., K ≥ 1.
93
Consequence: a convex decreasing f : [a,b] → R can be "approximated" using about [ (f(a) - f(b)) · ln( f'(a)/f'(b) ) ]^{1/2} segments.
94
Proof, technicality: the SCV condition involves Z(2β' - β), so the schedule must also reach the point 2β' - β.
95
Technicality: getting to 2β' - β (the figure shows the schedule points β_i and β_{i+1}).
96
Technicality: getting to 2β' - β (the figure shows β_i, β_{i+1}, β_{i+2}).
97
Technicality: getting to 2β' - β (β_i, β_{i+1}, β_{i+2}, β_{i+3}); handling this costs only ln ln A extra steps.
98
From existential to algorithmic: there exists an adaptive schedule of length O*((ln A)^{1/2}); next we show that we can actually construct one.
99
Algorithmic construction, our main result: using a sampler oracle for μ_β(x) ∝ exp(-β·H(x)), we can construct a cooling schedule of length at most 38·(ln A)^{1/2}·(ln ln A)·(ln n), with a total number of oracle calls at most 10⁷·(ln A)·(ln ln A + ln n)⁷·ln(1/δ).
100
Algorithmic construction: from the current inverse temperature β, ideally move to β' such that E[X] = Z(β')/Z(β) ≤ B₁ and E[X²]/E[X]² ≤ B₂.
101
Algorithmic construction: from the current β, move to β' with E[X] = Z(β')/Z(β) ≤ B₁ and E[X²]/E[X]² ≤ B₂; the second condition says X is "easy to estimate".
102
Algorithmic construction: from the current β, move to β' with E[X] = Z(β')/Z(β) ≤ B₁ and E[X²]/E[X]² ≤ B₂; the first condition (with B₁ < 1) says we make progress.
103
Algorithmic construction: from the current β, move to β' with E[X] = Z(β')/Z(β) ≤ B₁ and E[X²]/E[X]² ≤ B₂; for the second condition we need to construct a "feeler".
104
Algorithmic construction: E[X²]/E[X]² = [Z(β)/Z(β')]·[Z(2β' - β)/Z(β')], so we need a "feeler" for this quantity.
105
Algorithmic construction: E[X²]/E[X]² = [Z(β)/Z(β')]·[Z(2β' - β)/Z(β')]; estimating these ratios naively gives a bad "feeler".
106
Estimator for Z(β)/Z(β'): since Z(β) = Σ_{k=0}^n a_k e^{-βk}, for W ~ μ_β we have P(H(W) = k) = a_k e^{-βk} / Z(β).
107
Estimator for Z(β)/Z(β'): for W ~ μ_β, P(H(W) = k) = a_k e^{-βk}/Z(β); for U ~ μ_{β'}, P(H(U) = k) = a_k e^{-β'k}/Z(β'). If the event H = k is likely at both temperatures, comparing these two probabilities gives an estimator for Z(β)/Z(β').
109
For W ~ μ_β, P(H(W) = k) = a_k e^{-βk}/Z(β); for U ~ μ_{β'}, P(H(U) = k) = a_k e^{-β'k}/Z(β'). Hence [P(H(U) = k) / P(H(W) = k)]·e^{k(β' - β)} = Z(β)/Z(β'): an estimator for Z(β)/Z(β').
110
For W ~ μ_β, P(H(W) = k) = a_k e^{-βk}/Z(β); for U ~ μ_{β'}, P(H(U) = k) = a_k e^{-β'k}/Z(β'); [P(H(U) = k) / P(H(W) = k)]·e^{k(β' - β)} = Z(β)/Z(β'). PROBLEM: P(H(W) = k) can be too small.
111
Rough estimator for Z(β)/Z(β'): use an interval instead of a single value. For W ~ μ_β, P(H(W) ∈ [c,d]) = Σ_{k=c}^{d} a_k e^{-βk} / Z(β); for U ~ μ_{β'}, P(H(U) ∈ [c,d]) = Σ_{k=c}^{d} a_k e^{-β'k} / Z(β').
112
Rough estimator for Z(β)/Z(β'): [P(H(U) ∈ [c,d]) / P(H(W) ∈ [c,d])]·e^{c(β' - β)} = [Z(β)/Z(β')]·[Σ_{k=c}^d a_k e^{-β'(k-c)} / Σ_{k=c}^d a_k e^{-β(k-c)}]; if |β - β'|·|d - c| ≤ 1, the bracketed correction factor lies between 1/e and e, so we recover Z(β)/Z(β') up to a factor e. We also need P(H(U) ∈ [c,d]) and P(H(W) ∈ [c,d]) to be large.
113
Split {0,1,...,n} into h ≤ 4·(ln n)·(ln A) intervals [0], [1], [2], ..., [c, c(1 + 1/ln A)], ... We will use: for any inverse temperature β there exists an interval I with P(H(W) ∈ I) ≥ 1/(8h); we say that such an I is HEAVY for β.
115
Algorithm: repeat: find an interval I which is heavy for the current inverse temperature β; see how far I stays heavy (up to some β*); use the interval I as the feeler for Z(β)·Z(2β' - β)/Z(β')². ANALYSIS: in each iteration we either make progress, or eliminate the interval I, or make a "long move".
116
Picture: the distribution of H(X) for X ~ μ_β; I is a heavy interval at β.
117
Picture: as the inverse temperature increases, the distribution of H(X) shifts; the interval I that was heavy at β is no longer heavy at the larger inverse temperature.
118
Picture: I is still heavy at an intermediate β', but not heavy at a larger β''.
119
With I = [a,b]: use binary search to find β*, the largest inverse temperature at which I is still heavy (I is heavy at β* but not at β* + 1/(2n)); the relevant reach beyond β* is min{1/(b - a), ln A}.
120
With I = [a,b]: use binary search to find β*, the largest inverse temperature at which I is still heavy (heavy at β* but not at β* + 1/(2n)); the relevant reach is min{1/(b - a), ln A}. But how do you know that you can use binary search?
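A sketch of the binary search for β* (the names and the sampling-based heaviness test are mine; the next slides justify why the search is valid):

```python
def find_beta_star(is_heavy, beta, beta_max, tol):
    """Binary search for beta*, the largest inverse temperature at which the
    interval I is still heavy.  Valid because (by the Descartes-rule argument
    on the following slides) the set of temperatures where I is heavy is an
    interval.  `is_heavy(b)` is an assumed black box, e.g. an empirical test
    that P(H(W) in I) >= 1/(8h) from samples at inverse temperature b."""
    if is_heavy(beta_max):
        return beta_max
    lo, hi = beta, beta_max           # invariant: heavy at lo, not heavy at hi
    while hi - lo > tol:              # tol plays the role of 1/(2n)
        mid = (lo + hi) / 2
        if is_heavy(mid):
            lo = mid
        else:
            hi = mid
    return lo
```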
121
How do you know that you can use binary search? Lemma: the set of temperatures for which I is h-heavy is an interval. (I is h-heavy at β if P(H(X) ∈ I) ≥ 1/(8h) for X ~ μ_β, i.e., Σ_{k∈I} a_k e^{-βk} ≥ (1/(8h))·Σ_{k=0}^n a_k e^{-βk}.)
122
Why binary search works: with x = e^{-β}, the heaviness condition Σ_{k∈I} a_k e^{-βk} ≥ (1/(8h))·Σ_{k=0}^n a_k e^{-βk} becomes a polynomial inequality c_0 x⁰ + c_1 x¹ + c_2 x² + ... + c_n xⁿ ≥ 0. Descartes' rule of signs: the number of positive roots is at most the number of sign changes in the coefficient sequence.
123
Descartes' rule of signs (the number of positive roots is at most the number of sign changes), examples: -1 + x + x² + x³ + ... + xⁿ has one sign change, hence at most one positive root; 1 + x + x²⁰ has no sign change, hence no positive root.
124
Concluding the binary-search argument: in the polynomial c_0 x⁰ + c_1 x¹ + ... + c_n xⁿ obtained from the heaviness condition, the coefficients are positive exactly for k ∈ I (a block of consecutive indices) and negative outside it, so there are at most two sign changes; by Descartes' rule of signs there are at most two positive roots, and hence the set of x (equivalently of β, since x = e^{-β} is monotone) where the inequality holds is an interval.
125
With I = [a,b] heavy on [β, β*]: we can roughly compute the ratio Z(β)/Z(β') for any β' ∈ [β, β*], as long as |β - β'|·|b - a| ≤ 1.
126
Using the feeler: find the largest β' such that Z(β)·Z(2β' - β)/Z(β')² ≤ C. Outcomes: 1. success (we make progress to β'); 2. eliminate the interval I; 3. make a "long move".
128
If we have sampler oracles for μ_β, then we can get an adaptive schedule of length t = O*((ln A)^{1/2}). Resulting total running times: independent sets O*(n²) (using Vigoda '01, Dyer-Greenhill '01); matchings O*(n²m) (using Jerrum, Sinclair '89); spin systems: Ising model O*(n²) for β < β_C (using Marinelli, Olivieri '95); k-colorings O*(n²) for k > 2Δ (using Jerrum '95).
129
Outline: 1. Counting problems 2. Basic tools: Chernoff, Chebyshev 3. Dealing with large quantities (the product method) 4. Statistical physics 5. Cooling schedules (our work) 6. More...
130
Outline of part 6 (More…): a) proof of Dyer-Frieze; b) independent sets revisited; c) warm starts.
131
Appendix: proof of Theorem (Dyer-Frieze '91): given 1) E[X_1 X_2 ... X_t] = "WANTED" and 2) the X_i easy to estimate (V[X_i]/E[X_i]² = O(1)), O(t²/ε²) samples (O(t/ε²) from each X_i) give a (1±ε)-estimator of "WANTED" with probability ≥ 3/4.
132
How precise do the X_i have to be? First attempt, term by term: (1 ± ε/t)(1 ± ε/t)···(1 ± ε/t) ≈ 1 ± ε, so each term needs precision ε/t; by n ≈ (V[X]/E[X]²)·(1/ε²)·ln(1/δ) this costs Θ(t²/ε²) samples per term, Θ(t³/ε²) in total.
133
How precise do the X_i have to be? Analyzing the SCV is better (Dyer-Frieze '91): P( X gives a (1±ε)-estimate ) ≥ 1 - (V[X]/E[X]²)·(1/ε²), where V[X]/E[X]² is the squared coefficient of variation (SCV). GOAL: SCV(X) ≤ ε²/4 for X = X_1 X_2 ··· X_t.
134
Analyzing the SCV is better (Dyer-Frieze '91). Key identity: SCV(X) = (1 + SCV(X_1))···(1 + SCV(X_t)) - 1, where SCV(X) = V[X]/E[X]² = E[X²]/E[X]² - 1. Main idea: drive the SCV of each (averaged) X_i below roughly ε²/(4t); then SCV(X) ≤ (1 + ε²/(4t))^t - 1 ≤ e^{ε²/4} - 1, which is below ε²/2 for small ε. Proof of the identity:
135
Proof of the identity: if X_1, X_2 are independent then E[X_1 X_2] = E[X_1]·E[X_2]; moreover X_1², X_2² are also independent, so E[(X_1 X_2)²] = E[X_1²]·E[X_2²]. Dividing, 1 + SCV(X_1 X_2) = E[(X_1 X_2)²]/E[X_1 X_2]² = (1 + SCV(X_1))·(1 + SCV(X_2)), i.e., SCV(X_1 X_2) = (1 + SCV(X_1))(1 + SCV(X_2)) - 1; iterate over the t factors.
136
How precise do the X_i have to be? Conclusion (Dyer-Frieze '91): for X = X_1 X_2 ··· X_t, driving each SCV(X_i) down to O(ε²/t) costs Θ(t/ε²) samples per term, hence Θ(t²/ε²) in total; analyzing the SCV is better than the term-by-term analysis.
137
Outline of part 6 (More…): a) proof of Dyer-Frieze; b) independent sets revisited; c) warm starts.
138
Hamiltonian (recap): the pictured configurations have H = 0, 1, 2, 4.
139
Hamiltonian: many possibilities; the picture shows one choice (values 0, 1, 2, ...) corresponding to the hard-core lattice gas model.
140
What would be a natural Hamiltonian for planar graphs?
141
What would be a natural Hamiltonian for planar graphs? Take H(G) = number of edges. Natural Markov chain: pick u, v uniformly at random; with probability λ/(1+λ) try adding the edge (G + {u,v}), with probability 1/(1+λ) try removing it (G - {u,v}).
142
Natural MC: pick u, v uniformly at random; try G + {u,v} with probability λ/(1+λ) and G - {u,v} with probability 1/(1+λ). For G, G' differing in the edge {u,v}, the transition probabilities are P(G, G') = λ / ((1+λ)·n(n-1)/2) and P(G', G) = 1 / ((1+λ)·n(n-1)/2).
143
With π(G) ∝ λ^{number of edges of G}, where λ = exp(-β), these transition probabilities satisfy the detailed balance condition π(G)·P(G,G') = π(G')·P(G',G).
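A minimal sketch of this edge-flip chain (my own illustration): `is_allowed` stands in for whatever constraint defines the state space, e.g. planarity; with no constraint the stationary distribution is proportional to λ^{#edges} over all graphs on n labelled vertices.

```python
import random

def step(edges, n, lam, is_allowed):
    """One step of the edge-flip chain from the slides: pick a pair {u,v}
    uniformly; propose adding it with probability lam/(1+lam) and removing it
    with probability 1/(1+lam).  `is_allowed(edges)` is an assumed predicate
    (e.g. planarity) that rejects moves leaving the state space."""
    u, v = random.sample(range(n), 2)
    e = frozenset((u, v))
    if random.random() < lam / (1 + lam):        # try to add the edge
        if e not in edges and is_allowed(edges | {e}):
            edges = edges | {e}
    else:                                         # try to remove the edge
        if e in edges and is_allowed(edges - {e}):
            edges = edges - {e}
    return edges

# toy run: subgraphs of K_4 weighted by lam^{#edges}, no extra constraint
G = frozenset()
for _ in range(10000):
    G = step(G, n=4, lam=0.5, is_allowed=lambda es: True)
print(len(G))
```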
144
Outline of part 6 (More…): a) proof of Dyer-Frieze; b) independent sets revisited; c) warm starts.
145
Mixing time: τ_mix = the smallest t such that ||μ_t - π||_TV ≤ 1/e. Relaxation time: τ_rel = 1/(1 - λ_2). In general τ_rel ≤ τ_mix ≤ τ_rel·ln(1/π_min); the pictured example (n = 3) illustrates a chain with τ_mix = Θ(n ln n) and τ_rel = Θ(n). (The discrepancy may be substantially bigger for, e.g., matchings.)
146
Estimating π(S). METHOD 1: obtain (near-)independent samples X_1, X_2, X_3, ..., X_s by running the chain for about τ_mix steps per sample; set Y = 1 if X ∈ S and 0 otherwise, so E[Y] = π(S).
147
Estimating π(S). METHOD 1: independent samples X_1, ..., X_s, each from a run of about τ_mix steps, with Y = 1 if X ∈ S and 0 otherwise, E[Y] = π(S). METHOD 2 (Gillman '98, Kahale '96, ...): use the consecutive states X_1, X_2, X_3, ..., X_s of a single long run.
148
Further speed-up of METHOD 2 (Gillman '98, Kahale '96, ...): ||μ_t - π||_TV ≤ exp(-t/τ_rel)·Var_π(μ_0/π), where Var_π(μ_0/π) = (Σ_x π(x)·(μ_0(x)/π(x) - 1)²)^{1/2}; a starting distribution μ_0 for which this quantity is small is called a warm start.
149
Further speed-up of METHOD 2 (Gillman '98, Kahale '96, ...): ||μ_t - π||_TV ≤ exp(-t/τ_rel)·Var_π(μ_0/π), with Var_π(μ_0/π) = (Σ_x π(x)·(μ_0(x)/π(x) - 1)²)^{1/2}; small means a warm start. A sample at inverse temperature β can be used as a warm start at the neighboring β' to which the cooling schedule steps.
150
A sample at β can be used as a warm start at the neighboring β' of the cooling schedule. Maintain "well mixed" states at β_0, β_1, β_2, β_3, ..., β_m, with m = O((ln n)(ln A)).
151
Using the "well mixed" states at β_0, β_1, β_2, β_3, ..., β_m as starting points, run our cooling-schedule algorithm with METHOD 2 (averaging along a single run X_1, X_2, X_3, ..., X_s).
152
Output of our algorithm: β_0, β_1, ..., β_k with k = O*((ln A)^{1/2}); a small augmentation (so that a sample at the current β can be used as a warm start at the next) keeps the length O*((ln A)^{1/2}). Use an analogue of the Dyer-Frieze theorem for independent samples of vector-valued variables with slightly dependent coordinates.
153
If we have sampler oracles for μ_β, then we can get an adaptive schedule of length t = O*((ln A)^{1/2}). Resulting total running times: independent sets O*(n²) (using Vigoda '01, Dyer-Greenhill '01); matchings O*(n²m) (using Jerrum, Sinclair '89); spin systems: Ising model O*(n²) for β < β_C (using Marinelli, Olivieri '95); k-colorings O*(n²) for k > 2Δ (using Jerrum '95).