Published by Ethel Harrington. Modified over 9 years ago.
1 Techniques for Time-Space Tradeoff Lower Bounds for Branching Programs: Part I
Paul Beame, University of Washington
Joint work with Erik Vee, Mike Saks, T.S. Jayram, Xiaodong Sun
2 Branching programs
[Figure: a branching program with nodes labeled by the variables they query and sinks labeled 0 and 1]
To compute $f:\{0,1\}^n \to \{0,1\}$ on input $(x_1,\dots,x_n)$, follow the path from source to sink, e.g. on $x=(1,1,0,1,\dots)$
Time $T$ = length of longest path
Space $S = \log_2$(# of nodes)
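A minimal Python sketch of these definitions follows. The dictionary encoding, the node names, and the three-node example are illustrative assumptions, not taken from the slides.

```python
from math import log2

# Hypothetical encoding (an assumption, not from the slides): each non-sink
# node maps to (variable index, successor on 0, successor on 1);
# the sinks are the strings "0" and "1".
EXAMPLE_BP = {
    "s": (0, "a", "b"),   # queries x_1
    "a": (1, "0", "b"),   # queries x_2
    "b": (2, "0", "1"),   # queries x_3
}

def evaluate(bp, source, x):
    """Follow the path from the source to a sink on input x; return (output, path)."""
    node, path = source, [source]
    while node not in ("0", "1"):
        var, if_zero, if_one = bp[node]
        node = if_one if x[var] else if_zero
        path.append(node)
    return int(node), path

def longest_path(bp, node):
    """Time T: number of queries on the longest path from node to a sink (the BP is acyclic)."""
    if node in ("0", "1"):
        return 0
    _, if_zero, if_one = bp[node]
    return 1 + max(longest_path(bp, if_zero), longest_path(bp, if_one))

def space(bp):
    """Space S = log2(number of nodes), counting the two sinks."""
    return log2(len(bp) + 2)
```

For instance, `evaluate(EXAMPLE_BP, "s", (0, 1, 1))` follows the path s, a, b and reaches the 1-sink.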
3 Branching program properties
Simulate random-access machines: same time $T$ and space $S$
Multi-way version for $x_i$ in domain $D$: good for modeling RAM input registers
BPs will be leveled wlog: same time $T$, at most $2^S$ nodes per level
4 Overall approach to lower bounds
If $f: D^n \to \{0,1\}$ is computed using small time and space, then $f^{-1}(1)$ has a special combinatorial structure.
Lower bounds for $f$ follow if $f^{-1}(1)$ does not have the structure.
How do we find such structures?
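5 Levelled BPs and Layers
[Figure: a levelled BP of height $kn$ with source $v_0$, sinks 0 and 1, divided into layers $L_1, L_2, \dots, L_r$]
Assume time $T \le kn$ and wlog that the BP is levelled ($\le 2^S$ nodes per level)
Break the BP into $r$ layers $L_1,\dots,L_r$ of height $kn/r$
Partition (a subset of) the layers $L_j$ into sets $\mathcal{L}_1, \mathcal{L}_2, \dots, \mathcal{L}_p$, $p \ge 2$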
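6 The Trace of an Input
[Figure: a levelled BP of height $kn$ with source $v_0$, sinks 0 and 1, layers $L_1, L_2, \dots, L_5$, and nodes $v_1, v_2, v_3$ marked on the path of an input]
Partition of (a subset of) the layers $L_j$ into sets $\mathcal{L}_1, \mathcal{L}_2, \dots, \mathcal{L}_p$, $p \ge 2$
The trace of input $x$: the sequence of nodes reached on input $x$ as the computation moves from one set $\mathcal{L}_i$ to another, e.g. trace$(x) = (v_1, v_2, v_3)$
$a$ = length of trace = # of alternations in the partition
At most $2^{Sa}$ possible traces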
7 Branching program time-space lower bounds using these ideas
Oblivious - same variable queried per level: [Chandra-Furst-Lipton 83], [Alon-Maass 86], [Babai-Nisan-Szegedy 89]
(Syntactic) read-k - no variable queried more than $k$ times on any path: [Borodin-Razborov-Smolensky 89], [Okol’nishnikova 89]
General BPs: [B-Jayram-Saks 98], [Ajtai 99a], [Ajtai 99b], [B-Saks-Sun-Vee 00], [B-Vee 02]
8 The Case of Oblivious BPs
[Figure: a levelled BP of height $kn$ with source $v_0$, sinks 0 and 1, layers $L_1, L_2, \dots, L_5$, and trace nodes $v_1, v_2, v_3$]
Partition of the layers $L_j$ into sets $\mathcal{L}_1, \mathcal{L}_2, \dots, \mathcal{L}_p$, $p \ge 2$
When the BP is oblivious:
Each $\mathcal{L}_i$ is associated with the subset $A_i$ of variables read in levels in $\mathcal{L}_i$
trace$(x)$ can be used as the messages on input $x$ in a communication protocol between $p$ players computing $f$, where the $i$-th player has the values of the variables in $A_i$
9 The Oblivious Case
Let $C = \bigcap_{i \le p} A_i$ be the common variables for the players and $A'_i = A_i - C$
For any assignment to $C$, the trace can be used to compute $f$
Space bound: $S \ge CC(f; A'_1,\dots,A'_p)/a$ for any such assignment
Want: $n-|A'_i|$ large for all $i$, and a small # of alternations $a$
10 The Read-k Case
Wlog first make the read-k BP uniform: for any pair of nodes $u,v$ the multi-set of variables queried between $u$ and $v$ is the same on any path
Call the set $A_{uv}$
Add extra ‘dummy’ queries on each path if necessary
Then apply levelling etc.
[Figure: two nodes $u$ and $v$ joined by several paths]
11 Read-k Case Argument Overview
Variation of the usual argument
First fix the node sequence $s=(v_0,v_1,\dots,v_r)$ for the $r$ layers
This defines sets of inputs $A_{v_0 v_1},\dots,A_{v_{r-1} v_r}$ read during these layers
$f_s$ is an AND of functions defined on these sets of variables: a $(k,r)$-rectangle
Then choose a layer partition $\mathcal{L}_1, \mathcal{L}_2$ that is good for $A_{v_0 v_1},\dots,A_{v_{r-1} v_r}$
The subsequence of $(v_0,v_1,\dots,v_r)$ at alternations forms the trace - also good
[Figure: a BP with sinks 0 and 1 and the node sequence $v_0, v_1, v_2, v_3, v_4, \dots, v_r$]
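12 Partitioning the layers
$r$ layers (of height $kn/r$)
Let Layers$(x,i)$ be the set of layers in which variable $x_i$ is read on input $x$; $|$Layers$(x,i)| \le k$
For a set $\mathcal{L}$ of layers:
unread$(x,\mathcal{L}) = \{ i : \text{Layers}(x,i) \cap \mathcal{L} = \emptyset \}$
core$(x,\mathcal{L}) = \{ i : \text{Layers}(x,i) \subseteq \mathcal{L} \}$
The partition is good if these are large for $\mathcal{L} = \mathcal{L}_1, \mathcal{L}_2$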
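A small Python sketch of Layers, unread and core, assuming the queries made on one input are given as a list of variable indices in query order. The function names and this encoding are illustrative assumptions.

```python
def layers_of(queries, height):
    """Map each variable index to the set of layers in which it is read.

    queries: variable indices in the order they are queried on one input x.
    height:  number of levels per layer (kn/r on the slide).
    """
    layers = {}
    for level, var in enumerate(queries):
        layers.setdefault(var, set()).add(level // height)
    return layers

def unread(layers, n, chosen):
    """Variables whose set of layers is disjoint from the chosen set of layers."""
    return {i for i in range(n) if not (layers.get(i, set()) & chosen)}

def core(layers, n, chosen):
    """Variables all of whose layers lie inside the chosen set of layers."""
    return {i for i in range(n) if layers.get(i, set()) <= chosen}
```

When two sets of layers partition all layers, the core of one equals the unread set of the other, which is the identity used for the sets A and B on the next slide.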
13 How to partition the layers
Assign every layer to $\mathcal{L}_1$ or $\mathcal{L}_2$
$A$ = core$(x,\mathcal{L}_1)$ = unread$(x,\mathcal{L}_2)$
$B$ = core$(x,\mathcal{L}_2)$ = unread$(x,\mathcal{L}_1)$
$C$ = set of variables read in common
Two techniques, both using the probabilistic method:
[Borodin-Razborov-Smolensky 89]: $|A|, |B| \ge n/2^{k+1}$, $a \le r = O(k^2 2^k)$
[Okol’nishnikova 89]: $|A| \ge n/k^{O(k)}$, $|B| \ge n/2$, $a = 2k$, $r = 2k^2$
14 The Read-k Case: Fixing the Trace
[Figure: a levelled BP of height $kn$ with source $v_0$, sinks 0 and 1, layers $L_1, L_2, \dots, L_5$, and nodes $v_1, v_2, v_3, v_f$]
Fix a node sequence and then partition the layers $L_j$ into sets $\mathcal{L}_1, \mathcal{L}_2$, yielding a trace $t$
Define $f_t(x)=1 \iff f(x)=1$ and $x$ follows $t$
Again, by uniformity, the trace determines which variables are read in each component of the partition
$f_t(x) = g(x_{A \cup C}) \wedge h(x_{B \cup C})$
$f_t^{-1}(1)$ is a pseudo-rectangle
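15 Rectangles and Pseudo-rectangles
Ordinary combinatorial rectangle in $\{0,1\}^n$:
Partition $[n]$ into $A$ and $B$; $R_A \times R_B$ for sets $R_A \subseteq \{0,1\}^A$ and $R_B \subseteq \{0,1\}^B$
Alternatively, $\{x : x_A \in R_A \text{ and } x_B \in R_B\}$
Pseudo-rectangle:
$[n] = D \cup E$, sets $R_D \subseteq \{0,1\}^D$ and $R_E \subseteq \{0,1\}^E$: $\{x : x_D \in R_D \text{ and } x_E \in R_E\}$
Or, partition $[n]$ into $A$, $B$ and $C$: $\{x : x_{A \cup C} \in R_{A \cup C} \text{ and } x_{B \cup C} \in R_{B \cup C}\}$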
16 Read-k lower bounds
If $f$ is computed by a (nondeterministic) read-k branching program of size $2^S$, then the ones of $f$, $f^{-1}(1)$, can be covered by $2^{Sa}$ pseudo-rectangles $R$ with $|A|$ and $|B|$ large and $f(R)=1$
$|A|, |B| \ge n/2^{k+1}$, $a = O(k^2 2^k)$ [BRS 89]
$|A| \ge n/k^{O(k)}$, $|B| \ge n/2$, $a=2k$ [Okol 89]
Prove an upper bound $\rho$ on the # of inputs in any such pseudo-rectangle on which $f$ is constant 1
Then $2^S \ge (|f^{-1}(1)|/\rho)^{1/a}$, or $S \ge \frac{1}{a}\log_2(|f^{-1}(1)|/\rho)$
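The counting step behind this bound can be written out as a short derivation. Here $\rho$ denotes the upper bound on the number of inputs in any such pseudo-rectangle; the symbol is an assumption, since the slide's own symbol is not legible.

```latex
\[
|f^{-1}(1)| \;\le\; 2^{Sa}\cdot\rho
\quad\Longrightarrow\quad
2^{S} \;\ge\; \bigl(|f^{-1}(1)|/\rho\bigr)^{1/a}
\quad\Longrightarrow\quad
S \;\ge\; \frac{1}{a}\,\log_2\bigl(|f^{-1}(1)|/\rho\bigr).
\]
```

The first inequality holds because the $2^{Sa}$ pseudo-rectangles cover $f^{-1}(1)$ and each contains at most $\rho$ inputs.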
17 Lower bounds for general BPs [BST 98]
Major problem to handle:
Fixing the node sequence and the layer partition does not fix the sets $A$ = core$(x,\mathcal{L}_1)$ or $B$ = core$(x,\mathcal{L}_2)$
Solutions:
Apply one layer partition for all inputs: use an extension of the [BRS 89] partition method
Ignore inputs for which the partition is bad: a probabilistic method argument bounds the # of bad inputs
Partition the remaining inputs based on the values of core$(x,\mathcal{L}_1)$ and core$(x,\mathcal{L}_2)$ as well as on their traces
18 Lower bounds for general BPs [BST 98]
Number of rectangles increases: multiply $2^{Sa}$ by the number of choices of core$(x,\mathcal{L}_1)$ and core$(x,\mathcal{L}_2)$
A priori bound is $3^n$ since the sets are disjoint
Observation: a pseudo-rectangle w.r.t. $A,B,C$ remains a pseudo-rectangle w.r.t. $A',B',C'$ if $A' \subseteq A$, $B' \subseteq B$, and $C' = C \cup (A-A') \cup (B-B')$
Partition based on only the first $m=n/2^{k+1}$ elements of core$(x,\mathcal{L}_1)$ and core$(x,\mathcal{L}_2)$
# of choices is at most $2^{4m\log_2(n/m)}$
19 Lower bounds for general BPs [BST 98]
If $f$ is computed by a (nondeterministic) time $kn$ branching program of size $2^S$, then most of $f^{-1}(1)$ can be covered by pseudo-rectangles with $|A|=|B|=m=n/2^{k+1}$, where $a = O(k^2 2^k)$ (the cover is a partition if the program is deterministic)
# of pseudo-rectangles is at most $2^{4\log_2(n/m)\,m+Sa} = 2^{4(k+1)m+Sa}$
Is that good?
20 Using the Bound: Embedded Rectangles
Pseudo-rectangles are hard to reason about
Easier objects: embedded rectangles
Start with a pseudo-rectangle on $A,B,C$
Fix an assignment $\sigma$ to the common set $C$; we get a simpler object with:
a combinatorial rectangle $R_A \times R_B$ on $A \times B$
an assignment $\sigma$ to $C=\overline{A \cup B}$, the spine
The result is an embedded rectangle
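As a brute-force illustration of this step, the Python sketch below builds a tiny pseudo-rectangle over {0,1}^5 and fixes the common variables. The predicates `g` and `h` and the index sets are hypothetical choices made only for the example.

```python
from itertools import product

def pseudo_rectangle(n, g, h):
    """All x in {0,1}^n with g(x) and h(x); g may read only A and C, h only B and C."""
    return {x for x in product((0, 1), repeat=n) if g(x) and h(x)}

def embedded_rectangle(points, A, B, C, sigma):
    """Fix x_C = sigma; return the surviving points and their projections R_A, R_B."""
    fixed = [x for x in points if tuple(x[i] for i in C) == sigma]
    R_A = {tuple(x[i] for i in A) for x in fixed}
    R_B = {tuple(x[i] for i in B) for x in fixed}
    return fixed, R_A, R_B

# Illustrative instance (an assumption): n = 5, A = {0,1}, B = {2,3}, C = {4}.
A, B, C = (0, 1), (2, 3), (4,)
g = lambda x: (x[0] ^ x[1]) == x[4]      # depends on A and C only
h = lambda x: (x[2] | x[3]) >= x[4]      # depends on B and C only
R = pseudo_rectangle(5, g, h)
fixed, R_A, R_B = embedded_rectangle(R, A, B, C, sigma=(1,))
# Once the spine is fixed, the surviving inputs are exactly the product R_A x R_B.
is_product = len(fixed) == len(R_A) * len(R_B)
```

The product check succeeds because, with the common variables fixed, `g` constrains only the A-coordinates and `h` only the B-coordinates.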
21 Partition of most of $f^{-1}(1)$ into embedded rectangles
Input space is $D^n$
Each pseudo-rectangle can be partitioned into at most $|D|^{n-2m}$ embedded rectangles $R$ with $|A|=|B|=m=n/2^{k+1}$; $A, B$ are the feet of $R$
Total number of such embedded rectangles partitioning most of $f^{-1}(1)$ is at most $2^{4(k+1)m+Sa}\,|D|^{n-2m}$
Total number of inputs is $|D|^n$
Non-trivial only if, e.g., $|D| \ge 2^{3(k+1)}$: large domain
22 Lower bound on embedded rectangle size for which f is constant
Suppose $|f^{-1}(1)| \approx |D|^n$
Since there are at most $2^{4(k+1)m+Sa}\,|D|^{n-2m}$ embedded rectangles, the average size is at least $2^{-4(k+1)m-Sa-1}\,|D|^{2m}$, and at least 1/4 of $f^{-1}(1)$ is covered by those of size at least $2^{-4(k+1)m-Sa-2}\,|D|^{2m}$
Such a rectangle, defined by $(\sigma,A,B,R_A,R_B)$, must have $|R_A|/|D^m|,\ |R_B|/|D^m| \ge 2^{-4(k+1)m-Sa-2}$
Typical 2-party communication complexity results* say $|R_A|/|D^m|,\ |R_B|/|D^m| \le |D|^{-\beta m}$
* With extra work to handle $\sigma$ and easiest $A,B$
23 The time space tradeoff lower bounds [BST 98]
Therefore, for such a hard $f$: $2^{-4(k+1)m-Sa-2} \le |D|^{-\beta m}$
So if $\beta$ is constant and $|D| \ge 2^{9(k+1)/\beta}$: $Sa \ge [\beta \log |D| - 4(k+1)]\,m - c \ge (\beta/2)\, m \log |D|$
Since $m=n/2^{k+1}$ and $a = O(k^2 2^k)$, for some $C > 1$: $S \ge C^{-k}\, n \log |D|$
Therefore $T/n = k \ge c' \log((n \log|D|)/S)$, i.e. $T = \Omega(n \log((n \log|D|)/S))$
24 What functions are this hard?
Computing $x^T M x \equiv 0 \pmod q$, $q \ge n$ [BST 98]
Non-optimal bound when $M$ is a Sylvester matrix
Let $\gamma < 1/2$ and $c \ge 2/(1 - H_2(\gamma))$
HAM: $[n^c]^n \to \{0,1\}$: is any pair $(x_i,x_j)$ close in Hamming distance, $\Delta(x_i,x_j) \le \gamma c \log n$?
Any two sets in $[n^c]^m$, each of density $n^{-\beta m}$, contain a pair of coordinates that are within $\gamma c \log n$ of each other
Defined in [Ajtai 99a], where weaker lower bounds are proved using a generalization of [Okol 89] instead of [BRS 89]
Best bounds follow immediately from [BST 98]
25 What functions are this hard?
Computing $x^T M_y x \equiv 0 \pmod q$ for $x \in GF(q)^n$, $y \in GF(q)^{2n-1}$, $q \ge n$
Function defined in [Ajtai 99b]; the case $q=2$ is used for Boolean lower bounds
Key to improvement: for some $y$, $M_y$ has better rigidity properties than Sylvester matrices have
Defining these matrices and analyzing their rigidity properties is the key contribution of [Ajtai 99b]
Most of the hard work in Boolean lower bounds is in the second half of [Ajtai 99a], much of which does not fit in the STOC version
26 Ajtai’s matrices
[Figure: the matrix $M_y$, with entries labeled $0$ and $y_1, y_2, y_3, y_4, \dots, y_n, y_{n+1}, y_{n+2}, \dots, y_{2n-2}, y_{2n-1}$]
$M_y$ is constant on anti-diagonals below the main diagonal
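27 $x^T M_y x$ on an embedded $(m,\alpha)$-rectangle
[Figure: the matrix $M_y$ with the rows indexed by $A$ and the columns indexed by $B$ marked, multiplied by $x$ on both sides]
For every $\sigma$ on $\overline{A \cup B}$: $f(x_{A \cup B}, \sigma, y) = x_A^T M_{AB}\, x_B + g(x_A,y) + h(x_B,y)$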
28 Rectangles, rank, & rigidity
Largest rectangle on which $x_A^T M x_B$ is constant has density $q^{-\mathrm{rank}(M)}$ [BRS 89]
Lemma [Ajtai 99b]: can fix $y$ s.t. every $\delta n \times \delta n$ minor $M_{AB}$ of $M_y$ has rank$(M_{AB}) \ge c\,\delta n/\log^2(1/\delta) \ge \delta^{1+\epsilon} n$
Better than the comparable rigidity bound of $\delta^2 n$ for Sylvester matrices [BRS 89], [BST 98]
29 How to partition the layers
Assign every layer to $\mathcal{L}_1$ or $\mathcal{L}_2$
$A$ = core$(x,\mathcal{L}_1)$ = unread$(x,\mathcal{L}_2)$
$B$ = core$(x,\mathcal{L}_2)$ = unread$(x,\mathcal{L}_1)$
$C$ = set of variables read in common
Two techniques for the read-k case, both using the probabilistic method:
[Borodin-Razborov-Smolensky 89]: $|A|, |B| \ge n/2^{k+1}$, $a \le r = O(k^2 2^k)$
[Okol’nishnikova 89]: $|A| \ge n/k^{O(k)}$, $|B| \ge n/2$, $a = 2k$, $r = 2k^2$
30 Read-k case: Branching program with node sequence
[Figure: a levelled BP of height $kn$ with layers $L_1, L_2, \dots, L_r$, sinks 0 and 1, and the node sequence $v_0, v_1, v_2, \dots, v_{r-1}, v_r$]
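31 Partitioning the layers
$r$ layers (of height $kn/r$)
Let Layers$(x,i)$ be the set of layers in which variable $x_i$ is read on input $x$; $|$Layers$(x,i)| \le k$
For a set $\mathcal{L}$ of layers:
unread$(x,\mathcal{L}) = \{ i : \text{Layers}(x,i) \cap \mathcal{L} = \emptyset \}$
core$(x,\mathcal{L}) = \{ i : \text{Layers}(x,i) \subseteq \mathcal{L} \}$
The partition is good if these are large for $\mathcal{L} = \mathcal{L}_1, \mathcal{L}_2$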
32 Partitioning the layers [Okol’nishnikova 89]
Fix a node sequence $s$ and an $x$ that follows $s$
Choose a random subset $\mathcal{L}_1$ of $k$ of the $r$ layers
For each index $i$: [formula missing]
Thus: [formula missing]
Fix a partition achieving the average
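The two displayed formulas on this slide are not legible in the transcript. The derivation below is one reconstruction consistent with the stated bound $|A| \ge n/k^{O(k)}$ for $r = 2k^2$. It is an assumption rather than the slide's exact content; $\mathcal{L}_1$ denotes the random set of $k$ layers and $\ell = |\mathrm{Layers}(x,i)| \le k$.

```latex
\[
\Pr\bigl[\mathrm{Layers}(x,i) \subseteq \mathcal{L}_1\bigr]
  \;=\; \binom{r-\ell}{k-\ell} \Big/ \binom{r}{k}
  \;\ge\; \binom{r}{k}^{-1},
\]
\[
\mathrm{E}\bigl[\,|\mathrm{core}(x,\mathcal{L}_1)|\,\bigr]
  \;\ge\; n \binom{r}{k}^{-1}
  \;\ge\; \frac{n}{(er/k)^{k}}
  \;=\; \frac{n}{(2ek)^{k}}
  \qquad\text{for } r = 2k^{2}.
\]
```

The second line uses linearity of expectation over the $n$ variables together with the standard estimate $\binom{r}{k} \le (er/k)^k$.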
33 Partitioning the layers [Okol’nishnikova 89]
I.e., for each such $x$:
Only $k$ layers of height $kn/r$
At most $a=2k$ alternations
Total of $k^2 n/r \le n/2$ variables read in $\mathcal{L}_1$ if $r=2k^2$
$|$core$(x,\mathcal{L}_2)| \ge n/2$
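34 Partitioning the layers [BRS 89]
Assign each layer independently: $\Pr[L_i \in \mathcal{L}_1]=\Pr[L_i \in \mathcal{L}_2]=1/2$
For $\mathcal{L} = \mathcal{L}_1$ or $\mathcal{L}_2$:
Let $\chi_i=1$ if Layers$(x,i) \subseteq \mathcal{L}$ and 0 otherwise
$\Pr[\chi_i = 1]=\Pr[\text{Layers}(x,i) \subseteq \mathcal{L}] \ge 1/2^k$, since each variable is read in at most $k$ layers
$E[\sum_i \chi_i]=E[\#\{ i: \text{Layers}(x,i) \subseteq \mathcal{L} \}] \ge n/2^k$
i.e., $E[|\text{core}(x,\mathcal{L})|] \ge n/2^k$ and $E[|\text{unread}(x,\mathcal{L})|] \ge n/2^k$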
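The expectation bound can be checked exactly on a toy instance by enumerating every assignment of layers. In the Python sketch below, the instance (4 variables, 4 layers, k = 2) and the function name are hypothetical.

```python
from itertools import product
from fractions import Fraction

def expected_core_size(layers, r):
    """Exact E[|core|] when each of the r layers is chosen independently with probability 1/2.

    layers[i] is the set of layers in which variable i is read on the fixed input x.
    """
    total = 0
    for bits in product((0, 1), repeat=r):
        chosen = {j for j in range(r) if bits[j]}
        # Count the variables whose layers all fall inside the chosen set.
        total += sum(1 for ls in layers if ls <= chosen)
    return Fraction(total, 2 ** r)

# Hypothetical instance: n = 4 variables, r = 4 layers, each read in at most k = 2 layers.
layers = [{0, 1}, {0, 2}, {1, 3}, {3}]
n, k = 4, 2
mean = expected_core_size(layers, r=4)
```

Here `mean` equals the sum over variables of 2 to the power minus the number of layers in which the variable is read, which is at least n/2^k as the slide states.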
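35 Modification for general BP [BST 98]
Let $\ell(i) = |\text{Layers}(x,i)|$; $\sum_i \ell(i) \le kn$
$\Pr[\chi_i = 1] = \Pr[\text{Layers}(x,i) \subseteq \mathcal{L}] = 2^{-\ell(i)}$
$E[|\text{core}(x,\mathcal{L})|] = E[\sum_i \chi_i] = \sum_i 2^{-\ell(i)}$
By the arithmetic-geometric mean inequality this is at least $n\,2^{-\sum_i \ell(i)/n} \ge n/2^k$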
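36 Second Moment Method [BRS 89][BST 98]
If $r$ is big enough, $|\text{core}(x,\mathcal{L})|$ is concentrated around its mean
Bound $\mathrm{Var}[|\text{core}(x,\mathcal{L})|] = \mathrm{Var}[\sum_i \chi_i]$
Events for $i, j$ are correlated only if $x_i$ and $x_j$ are read in the same layer
At most $\ell(i)\,kn/r$ variables are read in the same layer as $x_i$
Each contributes at most $\Pr[\chi_i = 1]= 1/2^{\ell(i)}$ to the variance
$\mathrm{Var}[\sum_i \chi_i] \le (kn/r) \sum_i \ell(i)\, 2^{-\ell(i)} \le (k/r) \bigl(\sum_j \ell(j)\bigr) \sum_i 2^{-\ell(i)} \le (k^2 n/r) \sum_i 2^{-\ell(i)} = (k^2 n/r)\, E[|\text{core}(x,\mathcal{L})|]$
The middle step is the FKG-like inequality of Chebyshev: the terms are anti-correlated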
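37 Second Moment Method [BRS 89][BST 98]
$\mathrm{Var}[|\text{core}(x,\mathcal{L})|] \le (k^2 n/r)\, E[|\text{core}(x,\mathcal{L})|] = (k^2 n/r)\,\mu$
By Chebyshev’s inequality: $\Pr[\mu/2 \le |\text{core}(x,\mathcal{L})| \le 3\mu/2] \ge 1 - \mathrm{Var}[|\text{core}(x,\mathcal{L})|]/(\mu/2)^2 \ge 1 - 4k^2 2^k/r$, since $\mu \ge n/2^k$
Choose $r=8k^2 2^k$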
38 The Boolean case is much harder
[BST 98]: showed only $T \ge 1.017n$ for $S=o(n)$ for the quadratic form problem. Uses pseudo-rectangles but specialized to splitting the BP only at the $T/2$ level; deterministic
[Ajtai 99a]: shows lower bounds for Element Distinctness over $[n^2]$ that work for density $2^{-\epsilon m}$. Embedded rectangles, not pseudo-rectangles; deterministic
[Ajtai 99b]: $T=O(n) \Rightarrow S=\Omega(n)$ for Boolean BPs!!!
[B-Saks-Sun-Vee 00]: improved bounds and extension to the $O(n/T)$-error randomized case. Talk later
39 Power of the Large Domain Technique
For oblivious BPs, the best bound using two-party CC is $T=\Omega(n \log (n/S))$ [Alon-Maass 86]
Bounds match for general BPs over large domains
Best oblivious BP bounds use multiparty CC: $T=\Omega(n \log^2 (n/S))$ [Babai-Nisan-Szegedy 89]
[B-Vee 02]: matching bounds for general BPs over large domains. Erik Vee talk later