1 Techniques for Time-Space Tradeoff Lower Bounds for Branching Programs: Part I
Paul Beame, University of Washington
Joint work with Erik Vee, Mike Saks, T.S. Jayram, Xiaodong Sun

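2 Branching programs
[Figure: example branching program with nodes labeled by the queried variables ($x_1, x_2, x_3, x_4, x_5, x_7, x_8$), edges labeled 0 and 1, and example input $x = (1,1,0,1,\dots)$.]
To compute $f: \{0,1\}^n \to \{0,1\}$ on input $(x_1,\dots,x_n)$, follow the path from the source to a sink.
Time $T$ = length of the longest path.
Space $S$ = $\log_2$(# of nodes).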
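The definition above can be made concrete with a small sketch. The dictionary encoding of nodes and sinks, the function names, and the two-variable example are all assumptions chosen for illustration, not notation from the talk.

```python
import math
from itertools import product

# Hypothetical encoding: a non-sink node maps to (queried variable index,
# {value: next node}); a sink maps to its output bit.
def evaluate(nodes, sinks, source, x):
    """Follow the path from the source to a sink on input x.

    Returns (output bit, number of queries made on that path)."""
    node, steps = source, 0
    while node not in sinks:
        var, edges = nodes[node]
        node = edges[x[var]]
        steps += 1
    return sinks[node], steps

def space(nodes, sinks):
    """Space S = log2(# of nodes), counting sinks as nodes."""
    return math.log2(len(nodes) + len(sinks))

# Toy branching program computing x_0 AND x_1 (illustrative only).
nodes = {"s": (0, {0: "zero", 1: "u"}), "u": (1, {0: "zero", 1: "one"})}
sinks = {"zero": 0, "one": 1}

# Time T = length of the longest path, found here by trying every input.
T = max(evaluate(nodes, sinks, "s", x)[1] for x in product((0, 1), repeat=2))
S = space(nodes, sinks)
```

Brute-forcing all inputs to find T only works for toy sizes; it is used here to keep the sketch short.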

3 Branching program properties
Simulate random-access machines: same time $T$ and space $S$.
Multi-way version for $x_i$ in domain $D$: good for modeling RAM input registers.
BPs will be leveled wlog: same time $T$, at most $2^S$ nodes per level.

4 Overall approach to lower bounds
If $f: D^n \to \{0,1\}$ is computed using small time and space, then $f^{-1}(1)$ has a special combinatorial structure.
Lower bounds for $f$ follow if $f^{-1}(1)$ does not have the structure.
How do we find such structures?

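5 Levelled BPs and Layers
[Figure: a levelled BP of height $kn$ with source $v_0$ and sinks 0 and 1, divided into layers $L_1, L_2, \dots, L_r$.]
Assume time $T \le kn$ and wlog that the BP is levelled ($\le 2^S$ nodes per level).
Break the BP into $r$ layers $L_1,\dots,L_r$ of height $kn/r$.
Partition (a subset of) the layers $L_j$ into sets $\Lambda_1, \Lambda_2, \dots, \Lambda_p$, $p \ge 2$.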

6 The Trace of an Input
[Figure: a levelled BP of height $kn$ with source $v_0$, sinks 0 and 1, layers $L_1, L_2, \dots, L_5$, and nodes $v_1, v_2, v_3$ marked on the path of an input.]
Partition of (a subset of) the layers $L_j$ into sets $\Lambda_1, \Lambda_2, \dots, \Lambda_p$, $p \ge 2$.
The trace of input $x$ is the sequence of nodes reached on input $x$ as the computation moves from one set $\Lambda_i$ to another, e.g. trace$(x) = (v_1, v_2, v_3)$.
$a$ = length of trace = # of alternations in the partition.
$\le 2^{Sa}$ possible traces.

7 Branching program time-space lower bounds using these ideas
Oblivious (same variable queried per level): [Chandra-Furst-Lipton 83], [Alon-Maass 86], [Babai-Nisan-Szegedy 89]
(Syntactic) read-$k$ (no variable queried $> k$ times on any path): [Borodin-Razborov-Smolensky 89], [Okol’nishnikova 89]
General BPs: [B-Jayram-Saks 98], [Ajtai 99a], [Ajtai 99b], [B-Saks-Sun-Vee 00], [B-Vee 02]

8 The Case of Oblivious BPs
[Figure: a levelled BP of height $kn$ with source $v_0$, sinks 0 and 1, layers $L_1, L_2, \dots, L_5$, and trace nodes $v_1, v_2, v_3$.]
Partition of the layers $L_j$ into sets $\Lambda_1, \Lambda_2, \dots, \Lambda_p$, $p \ge 2$.
When the BP is oblivious:
Each $\Lambda_i$ is associated with the subset $A_i$ of variables read in levels in $\Lambda_i$.
trace$(x)$ can be used as the messages on input $x$ in a communication protocol between $p$ players computing $f$, where the $i$th player has the values of the variables in $A_i$.

9 The Oblivious Case
Let $C = \bigcap_{i \le p} A_i$ be the common variables for the players and $A'_i = A_i - C$.
For any assignment $\sigma$ to $C$, the trace can be used to compute $f_\sigma$.
Space bound: $S \ge CC(f_\sigma; A'_1,\dots,A'_p)/a$ for any $\sigma$.
Want: $n - |A'_i|$ large for all $i$; small # of alternations $a$.

10 The Read-k Case
Wlog first make the read-$k$ BP uniform: for any pair of nodes $u, v$, the multi-set of variables queried between $u$ and $v$ is the same on any path. Call the set $A_{uv}$.
Add extra ‘dummy’ queries on each path if necessary.
Then apply levelling etc.
[Figure: paths between nodes $u$ and $v$.]

11 Read-k Case Argument Overview
Variation of the usual argument:
First fix the node sequence $s = (v_0, v_1, \dots, v_r)$ for the $r$ layers.
This defines the sets of inputs $A_{v_0 v_1}, \dots, A_{v_{r-1} v_r}$ read during these layers.
$f_s$ is an AND of functions defined on these sets of variables: a $(k,r)$-rectangle.
Then choose a layer partition $\Lambda_1, \Lambda_2$ that is good for $A_{v_0 v_1}, \dots, A_{v_{r-1} v_r}$.
The subsequence of $(v_0, v_1, \dots, v_r)$ at alternations forms the trace, which is also good.
[Figure: a BP with sinks 0 and 1 and node sequence $v_0, v_1, v_2, v_3, v_4, \dots, v_r$.]

12 Partitioning the layers
$r$ layers (of height $\le kn/r$).
Let Layers$(x,i)$ be the set of layers in which variable $x_i$ is read on input $x$; $|\text{Layers}(x,i)| \le k$.
For a set $\Lambda$ of layers:
unread$(x,\Lambda) = \{ i : \text{Layers}(x,i) \cap \Lambda = \emptyset \}$
core$(x,\Lambda) = \{ i : \text{Layers}(x,i) \subseteq \Lambda \}$
Partition is good if these are large for $\Lambda = \Lambda_1, \Lambda_2$.
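The two set definitions can be checked on a toy example. This is a minimal sketch: the per-layer query sets, the function names, and the names `lam1`, `lam2` for the two sides of the layer partition are assumptions made for illustration.

```python
def layers_read(queries_by_layer, n):
    """queries_by_layer[j] = set of variable indices queried in layer j on a
    fixed input x. Returns Layers(x, i) for every variable index i."""
    layers = {i: set() for i in range(n)}
    for j, queried in enumerate(queries_by_layer):
        for i in queried:
            layers[i].add(j)
    return layers

def unread(layers, lam):
    """Variables whose layers are disjoint from the layer set lam."""
    return {i for i, ls in layers.items() if not (ls & lam)}

def core(layers, lam):
    """Variables read only in layers belonging to lam."""
    return {i for i, ls in layers.items() if ls <= lam}

# Assumed toy input: 4 variables, 4 layers, each variable read at most twice.
layers = layers_read([{0, 1}, {1, 2}, {0}, {3}], n=4)
lam1, lam2 = {0, 2}, {1, 3}      # a partition of all four layers

A = core(layers, lam1)           # read only inside lam1
B = core(layers, lam2)           # read only inside lam2
C = set(layers) - A - B          # read on both sides
```

When the two sets partition all layers, core with respect to one side coincides with unread with respect to the other, which is the identity used on the next slide.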

13 How to partition the layers
Assign every layer to $\Lambda_1$ or $\Lambda_2$.
$A$ = core$(x,\Lambda_1)$ = unread$(x,\Lambda_2)$
$B$ = core$(x,\Lambda_2)$ = unread$(x,\Lambda_1)$
$C$ = set of variables read in common
Two techniques, both using the probabilistic method:
[Borodin-Razborov-Smolensky 89]: $|A|, |B| \ge n/2^{k+1}$, $a \le r = O(k^2 2^k)$
[Okol’nishnikova 89]: $|A| \ge n/k^{O(k)}$, $|B| \ge n/2$, $a = 2k$, $r = 2k^2$

14 The Read-k Case: Fixing the Trace
[Figure: a levelled BP of height $kn$ with source $v_0$, sinks 0 and 1, layers $L_1, L_2, \dots, L_5$, trace nodes $v_1, v_2, v_3$ and final node $v_f$.]
Fix a node sequence and then partition the layers $L_j$ into sets $\Lambda_1, \Lambda_2$, yielding a trace $t$.
Define $f_t(x) = 1 \iff f(x) = 1$ and $x$ follows $t$.
Again, by uniformity, the trace determines which variables are read in each component of the partition.
$f_t(x) = g(x_{A \cup C}) \wedge h(x_{B \cup C})$
$f_t^{-1}(1)$ is a pseudo-rectangle.

15 Rectangles and Pseudo-rectangles
Ordinary combinatorial rectangle in $\{0,1\}^n$:
Partition $[n]$ into $A$ and $B$; $R_A \times R_B$ for sets $R_A \subseteq \{0,1\}^A$ and $R_B \subseteq \{0,1\}^B$.
Alternatively, $\{x : x_A \in R_A \text{ and } x_B \in R_B\}$.
Pseudo-rectangle:
$[n] = D \cup E$, sets $R_D \subseteq \{0,1\}^D$ and $R_E \subseteq \{0,1\}^E$; $\{x : x_D \in R_D \text{ and } x_E \in R_E\}$.
Or, partition $[n]$ into $A$, $B$ and $C$: $\{x : x_{A \cup C} \in R_{A \cup C} \text{ and } x_{B \cup C} \in R_{B \cup C}\}$.
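The A, B, C form of the definition can be enumerated directly for tiny $n$. This is a sketch under assumed names: `restrict`, `pseudo_rectangle`, and the three-variable instance are illustrative only.

```python
from itertools import product

def restrict(x, idx):
    """Project assignment x onto the coordinates listed in idx."""
    return tuple(x[i] for i in idx)

def pseudo_rectangle(n, AC, BC, R_AC, R_BC):
    """All x in {0,1}^n whose restriction to A∪C lies in R_AC and whose
    restriction to B∪C lies in R_BC."""
    return {x for x in product((0, 1), repeat=n)
            if restrict(x, AC) in R_AC and restrict(x, BC) in R_BC}

# Assumed toy instance: A = {0}, B = {1}, C = {2}.
AC, BC = (0, 2), (1, 2)
R_AC = {(0, 0), (1, 1)}   # constraint x_0 = x_2
R_BC = {(0, 0), (1, 1)}   # constraint x_1 = x_2
R = pseudo_rectangle(3, AC, BC, R_AC, R_BC)

# Fixing the common variable x_2 = 1 leaves an ordinary rectangle on A x B.
slice_1 = {restrict(x, (0, 1)) for x in R if x[2] == 1}
```

In this instance the two constraints interact only through the shared coordinate, so each fixed value of $x_2$ yields a product set on the remaining coordinates.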

16 Read-k lower bounds
If $f$ is computed by a (nondeterministic) read-$k$ branching program of size $2^S$, then the ones of $f$, $f^{-1}(1)$, can be covered by $2^{Sa}$ pseudo-rectangles $R$ with $|A|$ and $|B|$ large and $f(R) = 1$:
$|A|, |B| \ge n/2^{k+1}$, $a = O(k^2 2^k)$ [BRS 89]
$|A| \ge n/k^{O(k)}$, $|B| \ge n/2$, $a = 2k$ [Okol 89]
Prove an upper bound $\beta$ on the # of inputs in any such pseudo-rectangle on which $f$ is constant 1.
Then $2^S \ge (|f^{-1}(1)|/\beta)^{1/a}$, or $S \ge \frac{1}{a}\log_2(|f^{-1}(1)|/\beta)$.

17 Lower bounds for general BPs [BST 98]
Major problem to handle: fixing the node sequence and the layer partition does not fix the sets $A$ = core$(x,\Lambda_1)$ or $B$ = core$(x,\Lambda_2)$.
Solutions:
Apply one layer partition for all inputs, using an extension of the [BRS 89] partition method.
Ignore inputs for which the partition is bad; a probabilistic-method argument bounds the # of bad inputs.
Partition the remaining inputs based on the values of core$(x,\Lambda_1)$ and core$(x,\Lambda_2)$ as well as on their traces.

18 Lower bounds for general BPs [BST 98]
Number of rectangles increases: multiply $2^{Sa}$ by the number of choices of core$(x,\Lambda_1)$ and core$(x,\Lambda_2)$.
A priori bound is $\le 3^n$ since the sets are disjoint.
Observation: a pseudo-rectangle w.r.t. $A, B, C$ remains a pseudo-rectangle w.r.t. $A', B', C'$ if $A' \subseteq A$, $B' \subseteq B$, and $C' = C \cup (A - A') \cup (B - B')$.
Partition based on only the first $m = n/2^{k+1}$ elements of core$(x,\Lambda_1)$ and core$(x,\Lambda_2)$.
# of choices is at most [formula not preserved in the transcript]

19 Lower bounds for general BPs [BST 98]
If $f$ is computed by a (nondeterministic) time-$kn$ branching program of size $2^S$, then most of $f^{-1}(1)$ can be covered by $2^{Sa}$ pseudo-rectangles with $|A| = |B| = m = n/2^{k+1}$, where $a = O(k^2 2^k)$ (the cover is a partition if the program is deterministic).
# of pseudo-rectangles is at most $2^{4\log_2(n/m)\,m + Sa} = 2^{4(k+1)m + Sa}$.
Is that good?

20 Using the Bound: Embedded Rectangles
Pseudo-rectangles are hard to reason about. Easier objects: embedded rectangles.
Start with a pseudo-rectangle on $A, B, C$.
Fix an assignment to the common set $C$; we get a simpler object with:
a combinatorial rectangle $R_A \times R_B$ on $A \times B$,
an assignment $\sigma$ to $C = \overline{A \cup B}$ (the spine).
The result is an embedded rectangle.

21 Partition of most of $f^{-1}(1)$ into embedded rectangles
Input space is $D^n$.
Each pseudo-rectangle can be partitioned into at most $|D|^{n-2m}$ embedded rectangles $R$ with $|A| = |B| = m = n/2^{k+1}$ ($A$, $B$ are the feet of $R$).
Total number of such embedded rectangles partitioning most of $f^{-1}(1)$: $\le 2^{4(k+1)m + Sa}\,|D|^{n-2m}$.
Total number of inputs is $|D|^n$.
Non-trivial only if, e.g., $|D| \ge 2^{3(k+1)}$: large domain.

22 Lower bound on embedded rectangle size for which f is constant
Suppose $|f^{-1}(1)| \ge \delta\,|D|^n$.
Since there are at most $2^{4(k+1)m + Sa}\,|D|^{n-2m}$ embedded rectangles, the average size is at least $\delta\, 2^{-4(k+1)m - Sa - 1}\,|D|^{2m}$, and at least 1/4 of $f^{-1}(1)$ is covered by those of size $\ge \delta\, 2^{-4(k+1)m - Sa - 2}\,|D|^{2m}$.
Such a rectangle, defined by $(\sigma, A, B, R_A, R_B)$, must have $|R_A|/|D^m|,\ |R_B|/|D^m| \ge \delta\, 2^{-4(k+1)m - Sa - 2}$.
Typical 2-party communication complexity results* say $|R_A|/|D^m|,\ |R_B|/|D^m| \le |D|^{-\gamma m}$.
* With extra work to handle $\sigma$ and easiest $A, B$.

23 The time-space tradeoff lower bounds [BST 98]
Therefore, for such a hard $f$: $\delta\, 2^{-4(k+1)m - Sa - 2} \le |D|^{-\gamma m}$.
So if $\delta$ is constant and $|D| \ge 2^{9(k+1)/\gamma}$: $Sa \ge [\gamma \log|D| - 4(k+1)]\,m - c_\delta \ge (\gamma/2)\,m \log|D|$.
Since $m = n/2^{k+1}$ and $a = O(k^2 2^k)$, for some $C > 1$: $S \ge C^{-k}\, n \log|D|$.
Therefore $T/n = k \ge c' \log((n \log|D|)/S)$, i.e. $T = \Omega(n \log((n \log|D|)/S))$.
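The first two steps on this slide are a matter of taking logarithms. The derivation below is a sketch with assumed symbol names, since the transcript lost the originals: $\delta$ is the density of $f^{-1}(1)$, $\gamma$ is the exponent in the communication-complexity bound, and logarithms are base 2.

```latex
\begin{align*}
\delta\, 2^{-4(k+1)m - Sa - 2} &\le |D|^{-\gamma m} \\
\log_2 \delta - 4(k+1)m - Sa - 2 &\le -\gamma m \log_2 |D| \\
Sa &\ge \bigl[\gamma \log_2 |D| - 4(k+1)\bigr]\, m - \bigl(2 + \log_2(1/\delta)\bigr) \\
\intertext{If $|D| \ge 2^{9(k+1)/\gamma}$ then $4(k+1) \le \tfrac{4}{9}\gamma \log_2 |D|$, so}
Sa &\ge \tfrac{5}{9}\,\gamma\, m \log_2 |D| - \bigl(2 + \log_2(1/\delta)\bigr)
   \;\ge\; \tfrac{\gamma}{2}\, m \log_2 |D|
\end{align*}
```

The last inequality holds once $m$ is large enough that $2 + \log_2(1/\delta) \le \tfrac{1}{18}\gamma\, m \log_2|D|$, which is where the assumption that $\delta$ is constant is used.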

24 What functions are this hard?
Computing $x^T M x \equiv 0 \pmod q$, $q \ge n$ [BST 98]. Non-optimal bound when $M$ is a Sylvester matrix.
Let $\epsilon < 1/2$ and $c \ge 2/(1 - H_2(\epsilon))$.
$\mathrm{HAM}_\epsilon : [n^c]^n \to \{0,1\}$: Is any pair $(x_i, x_j)$ close in Hamming distance, $\Delta(x_i, x_j) \le \epsilon c \log n$?
Any two sets in $[n^c]^m$, each of density $\ge n^{-\gamma m}$, contain a pair of coordinates that are within $\epsilon c \log n$ of each other.
Defined in [Ajtai 99a], where weaker lower bounds were proved using a generalization of [Okol 89] instead of [BRS 89].
Best bounds follow immediately from [BST 98].

25 What functions are this hard?
Computing $x^T M_y x \equiv 0 \pmod q$ for $x \in GF(q)^n$, $y \in GF(q)^{2n-1}$, $q \ge n$.
Function defined in [Ajtai 99b]; the case $q = 2$ is used for Boolean lower bounds.
Key to improvement: for some $y$, $M_y$ has better rigidity properties than Sylvester matrices have.
Defining these matrices and analyzing their rigidity properties is the key contribution of [Ajtai 99b].
Most of the hard work in Boolean lower bounds is in the second half of [Ajtai 99a], much of which does not fit in the STOC version.

26 Ajtai’s matrices
[Figure: the matrix $M_y$, showing a region of 0 entries and the entries $y_1, y_2, y_3, y_4, \dots, y_n, y_{n+1}, y_{n+2}, \dots, y_{2n-2}, y_{2n-1}$.]
$M_y$ is constant on anti-diagonals below the main diagonal.

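27 $x^T M_y x$ on an embedded $(m,\alpha)$-rectangle
[Figure: the matrix $M_y$ with the rows indexed by $A$ and the columns indexed by $B$ marked, together with the vector $x$.]
For every $\sigma$ on $\overline{A \cup B}$: $f(x_{A \cup B}, \sigma, y) = x_A^T M_{AB}\, x_B + g(x_A, y) + h(x_B, y)$.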

28 Rectangles, rank, & rigidity
Largest rectangle on which $x_A^T M x_B$ is constant has density $\le q^{-\mathrm{rank}(M)}$ [BRS 89].
Lemma [Ajtai 99b]: Can fix $y$ s.t. every $\delta n \times \delta n$ minor $M_{AB}$ of $M_y$ has $\mathrm{rank}(M_{AB}) \ge c\,\delta n/\log^2(1/\delta) \ge \delta^{1+\epsilon}\, n$.
Better than the comparable rigidity bound of $\delta^2 n$ for Sylvester matrices [BRS 89], [BST 98].
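The rank bound on constant rectangles can be sanity-checked by brute force on very small matrices. The sketch below assumes a square $m \times m$ matrix over the integers mod a prime $q$ and takes density to mean $|R_A|\,|R_B|/q^{2m}$; the function names and both test matrices are illustrative assumptions, and the search is exponential, so it is only usable for toy sizes.

```python
from itertools import product, combinations

def bilinear(x, M, y, q):
    """Value of x^T M y modulo q."""
    return sum(x[i] * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y))) % q

def largest_constant_rectangle(M, q):
    """Max |R_A| * |R_B| over rectangles on which x^T M y is constant.

    Tries every nonempty R_A; for each target value c the widest matching
    R_B is the set of all y that give value c against every x in R_A."""
    m = len(M)
    points = list(product(range(q), repeat=m))
    best = 0
    for size in range(1, len(points) + 1):
        for RA in combinations(points, size):
            for c in range(q):
                RB = [y for y in points
                      if all(bilinear(x, M, y, q) == c for x in RA)]
                best = max(best, len(RA) * len(RB))
    return best

identity = [[1, 0], [0, 1]]   # rank 2 over GF(2)
all_ones = [[1, 1], [1, 1]]   # rank 1 over GF(2)
best_identity = largest_constant_rectangle(identity, 2)
best_all_ones = largest_constant_rectangle(all_ones, 2)
```

With $q = 2$ and $m = 2$ there are $q^{2m} = 16$ input pairs, so the two results correspond to densities $4/16 = 2^{-2}$ and $8/16 = 2^{-1}$, matching $q^{-\mathrm{rank}(M)}$ in both cases.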

29 How to partition the layers
Assign every layer to $\Lambda_1$ or $\Lambda_2$.
$A$ = core$(x,\Lambda_1)$ = unread$(x,\Lambda_2)$
$B$ = core$(x,\Lambda_2)$ = unread$(x,\Lambda_1)$
$C$ = set of variables read in common
Two techniques for the read-$k$ case, both using the probabilistic method:
[Borodin-Razborov-Smolensky 89]: $|A|, |B| \ge n/2^{k+1}$, $a \le r = O(k^2 2^k)$
[Okol’nishnikova 89]: $|A| \ge n/k^{O(k)}$, $|B| \ge n/2$, $a = 2k$, $r = 2k^2$

30 Read-k case: Branching program with node sequence
[Figure: a BP of height $kn$ with sinks 0 and 1, layers $L_1, L_2, \dots, L_r$, and node sequence $v_0, v_1, v_2, \dots, v_{r-1}, v_r$.]

31 Partitioning the layers
$r$ layers (of height $\le kn/r$).
Let Layers$(x,i)$ be the set of layers in which variable $x_i$ is read on input $x$; $|\text{Layers}(x,i)| \le k$.
For a set $\Lambda$ of layers:
unread$(x,\Lambda) = \{ i : \text{Layers}(x,i) \cap \Lambda = \emptyset \}$
core$(x,\Lambda) = \{ i : \text{Layers}(x,i) \subseteq \Lambda \}$
Partition is good if these are large for $\Lambda = \Lambda_1, \Lambda_2$.

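32 Partitioning the layers [Okol’nishnikova 89]
Fix node sequence $s$ and $x$ that follows $s$.
Choose a random subset $\Lambda_1$ of $k$ of the $r$ layers.
For each index $i$: [formula not preserved in the transcript]
Thus: [formula not preserved in the transcript]
Fix a partition achieving the average.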

33 Partitioning the layers [Okol’nishnikova 89]
I.e., for each such $x$: [formula not preserved in the transcript]
Only $k$ layers of height $kn/r$:
At most $a = 2k$ alternations.
Total $\le k^2 n/r \le n/2$ vars read in $\Lambda_1$ if $r = 2k^2$ $\Rightarrow$ $|\text{core}(x,\Lambda_2)| \ge n/2$.

34 Partitioning the layers [BRS 89]
Assign each layer independently: $\Pr[L_j \in \Lambda_1] = \Pr[L_j \in \Lambda_2] = 1/2$.
For $\Lambda = \Lambda_1$ or $\Lambda_2$, let $\chi_i = 1$ if Layers$(x,i) \subseteq \Lambda$ and 0 otherwise.
$\Pr[\chi_i = 1] = \Pr[\text{Layers}(x,i) \subseteq \Lambda] \ge 1/2^k$, since each variable is read in at most $k$ layers.
$E[\sum_i \chi_i] = E[\#\{ i : \text{Layers}(x,i) \subseteq \Lambda \}] \ge n/2^k$,
i.e., $E[|\text{core}(x,\Lambda)|] \ge n/2^k$ and $E[|\text{unread}(x,\Lambda)|] \ge n/2^k$.
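The expectation bound can be verified exactly on a toy instance by enumerating every assignment of layers instead of sampling. This is a sketch: the function name, the list-of-sets encoding of Layers$(x,i)$, and the four-variable example are assumptions for illustration.

```python
from itertools import product
from fractions import Fraction

def expected_core_size(layers, r):
    """Exact E[|core(x, Lam)|] when each of the r layers joins Lam
    independently with probability 1/2.

    layers[i] is the set of layers in which variable i is read."""
    total = 0
    for bits in product((0, 1), repeat=r):
        lam = {j for j in range(r) if bits[j]}
        total += sum(1 for ls in layers if ls <= lam)
    return Fraction(total, 2 ** r)

# Assumed toy input: n = 4 variables, r = 4 layers, read-k with k = 2.
layers = [{0, 1}, {2}, {1, 3}, {0}]
n, r, k = 4, 4, 2

expectation = expected_core_size(layers, r)
closed_form = sum(Fraction(1, 2 ** len(ls)) for ls in layers)
lower_bound = Fraction(n, 2 ** k)
```

The enumeration agrees with the sum of $2^{-|\text{Layers}(x,i)|}$ over variables, which is at least $n/2^k$ because no variable is read in more than $k$ layers.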

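35 Modification for general BP [BST 98]
Let $\ell(i) = |\text{Layers}(x,i)|$, so $\sum_i \ell(i) \le kn$.
$\Pr[\chi_i = 1] = \Pr[\text{Layers}(x,i) \subseteq \Lambda] = 2^{-\ell(i)}$
$E[|\text{core}(x,\Lambda)|] = E[\sum_i \chi_i] = \sum_i 2^{-\ell(i)}$
By the arithmetic-geometric mean inequality this is $\ge n\, 2^{-\sum_i \ell(i)/n} \ge n/2^k$.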

36 Second Moment Method [BRS 89][BST 98]
If $r$ is big enough, $|\text{core}(x,\Lambda)|$ is concentrated around its mean.
Bound $\mathrm{Var}[|\text{core}(x,\Lambda)|] = \mathrm{Var}[\sum_i \chi_i]$:
Events for $\chi_i, \chi_j$ are correlated only if $x_i$ and $x_j$ are read in the same layer.
At most $\ell(i)\,kn/r$ vars are read in the same layer as $x_i$.
Each contributes at most $\Pr[\chi_i = 1] = 1/2^{\ell(i)}$ to the variance.
$\mathrm{Var}[\sum_i \chi_i] \le (kn/r) \sum_i \ell(i)\, 2^{-\ell(i)} \le (k/r) \bigl(\sum_j \ell(j)\bigr) \sum_i 2^{-\ell(i)} \le (k^2 n/r) \sum_i 2^{-\ell(i)} = (k^2 n/r)\, E[|\text{core}(x,\Lambda)|]$
The second step is the FKG-like inequality of Chebyshev: the terms $\ell(i)$ and $2^{-\ell(i)}$ are anti-correlated.

37 Second Moment Method [BRS 89][BST 98]
$\mathrm{Var}[|\text{core}(x,\Lambda)|] \le (k^2 n/r)\, E[|\text{core}(x,\Lambda)|] = (k^2 n/r)\,\mu$
By Chebyshev’s inequality:
$\Pr[\mu/2 \le |\text{core}(x,\Lambda)| \le 3\mu/2] \ge 1 - \mathrm{Var}[|\text{core}(x,\Lambda)|]/(\mu/2)^2 \ge 1 - 4k^2 2^k/r$, since $\mu \ge n/2^k$.
Choose $r = 8k^2 2^k$.
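Plugging the chosen number of layers into the Chebyshev bound gives a concentration probability of at least 1/2 for every $k$. The short sketch below checks that arithmetic with exact fractions; the function names are assumptions for illustration.

```python
from fractions import Fraction

def concentration_lower_bound(k, r):
    """Lower bound 1 - 4 k^2 2^k / r on the probability that |core| lies
    within a factor of 1/2 of its mean."""
    return 1 - Fraction(4 * k * k * 2 ** k, r)

def layers_chosen(k):
    """The choice r = 8 k^2 2^k from the slide."""
    return 8 * k * k * 2 ** k

bounds = {k: concentration_lower_bound(k, layers_chosen(k)) for k in range(1, 6)}
```

Any larger $r$ pushes the bound closer to 1, at the cost of more layers and therefore a longer trace.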

38 The Boolean case is much harder
[BST 98]: Showed only $T \ge 1.017n$ for $S = o(n)$ for the quadratic form problem. Uses pseudo-rectangles but specialized to splitting the BP only at the $T/2$ level; deterministic.
[Ajtai 99a]: Shows lower bounds for Element Distinctness over $[n^2]$ that work for density $2^{-\epsilon m}$. Embedded rectangles, not pseudo-rectangles; deterministic.
[Ajtai 99b]: $T = O(n) \Rightarrow S = \Omega(n)$ for Boolean BPs!
[B-Saks-Sun-Vee 00]: Improved bounds and extension to the $O(n/T)$-error randomized case. Talk later.

39 Power of the Large Domain Technique
For oblivious BPs, the best bound using two-party CC is $T = \Omega(n \log(n/S))$ [Alon-Maass 86]. Bounds match for general BPs over large domains.
The best oblivious BP bounds use multiparty CC: $T = \Omega(n \log^2(n/S))$ [Babai-Nisan-Szegedy 89].
[B-Vee 02]: Matching bounds for general BPs over large domains. Erik Vee talk later.
