Branching Programs Part 3 Paul Beame University of Washington
Outline
- Branching program basics
- Space (size) lower bounds
- Multi-output functions
  - Time-Space tradeoff lower bounds for general BPs
- Single-output functions
  - Restricted classes of BPs: OBDDs, Read-once (FBDDs), Oblivious, Read-k
  - Lower bound methods for restricted classes
  - Lower bound methods for general BPs
  - Applying tradeoffs: BPs and static data structures
- Multi-output functions using single-output techniques
  - Lower bound for encoding good codes
Limited Branching Program Forms
- Structure-based
  - Oblivious: For each BP level, all the nodes on that level have the same variable name. e.g. Parity BP
- Time-Based
  - Read-Once: On every path through the BP each variable is queried at most once. e.g. Parity BP
  - Read-$k$: On every path through the BP each variable is queried at most $k$ times.
  - Time-Bounded: Every path in the BP has length at most $T = kn$.
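The restricted forms above can be checked mechanically on small examples. The following is a minimal sketch, assuming a hypothetical dictionary representation of a BP (node -> (variable, 0-child, 1-child), with sinks "0" and "1"); the helper names `parity_bp`, `max_reads`, and `is_oblivious` are illustrative and not from the lecture.

```python
def parity_bp(n):
    """Leveled BP for parity of n bits; node (i, p) reads x[i] with running parity p."""
    bp = {}
    for i in range(n):
        last = (i == n - 1)
        for p in (0, 1):
            lo = str(p) if last else (i + 1, p)          # x[i] = 0 keeps the parity
            hi = str(p ^ 1) if last else (i + 1, p ^ 1)  # x[i] = 1 flips it
            bp[(i, p)] = (i, lo, hi)
    return bp, (0, 0)

def max_reads(bp, root):
    """Smallest k such that the BP is read-k: checks every graph path, consistent or not."""
    best, stack = 0, [(root, {})]
    while stack:
        node, counts = stack.pop()
        if node not in bp:  # reached a sink
            best = max(best, max(counts.values(), default=0))
            continue
        var, lo, hi = bp[node]
        new = dict(counts)
        new[var] = new.get(var, 0) + 1
        stack.append((lo, new))
        stack.append((hi, new))
    return best

def is_oblivious(bp, root):
    """For a leveled BP: every level queries a single variable."""
    frontier = {root}
    while frontier:
        internal = [v for v in frontier if v in bp]
        if len({bp[v][0] for v in internal}) > 1:
            return False
        frontier = {child for v in internal for child in bp[v][1:]}
    return True
```

On the parity BP both checks succeed with $k = 1$, matching the two "e.g. Parity BP" remarks. The path enumeration is exponential, so this is only meant for toy sizes.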
Oblivious vs Read-k and Best Bounds
- Recall the argument for oblivious BPs: if the length is $T = kn$ then $\ge n/2$ variables are read at most $2k$ times. So oblivious length $kn$ ≈ Read-$2k$.
- Exponential Read-$k$ size lower bounds for simple explicit Boolean functions for $k = O(\log n)$. Inspired by 2-party communication complexity. [Borodin-Razborov-Smolensky 1989][Okol'nishnikova 1989]
- Exponential size lower bound for an explicit function over a large domain for $k = O(\log^2 n)$. Inspired by multiparty communication complexity. [B-Vee 2002]
  - Drawback: the function is not known to be in NP.
- No larger $k$ is possible until the oblivious case is improved.
Read-k BPs
- On every path through the BP each variable is queried at most $k$ times.
  - Unlike the read-once case, there may be paths that are not consistent with any input. We assume that those paths are restricted, too.
- Lower bound methods for read-$k$ BPs often also apply to nondeterministic read-$k$ BPs.
  - The "every path" constraint is essential there.
- Defn: Nondeterministic BPs (NBPs) generalize BPs by allowing many out-edges from a vertex with the same label. An NBP outputs 1 on input $x$ iff there is some path that $x$ can take that leads to the 1-sink node.
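The NBP acceptance rule can be illustrated with a small evaluator. This is a sketch under an assumed edge-list representation (node -> list of (variable, value, target)); the example function "some two adjacent bits are both 1" and the names `nbp_accepts` and `adjacent_ones_nbp` are hypothetical, chosen only to show several out-edges sharing a label.

```python
def nbp_accepts(edges, root, x):
    """Accept iff some path consistent with x leads from root to the 1-sink."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node == "1":
            return True
        if node in seen or node not in edges:
            continue
        seen.add(node)
        for var, value, target in edges[node]:
            if x[var] == value:  # x may follow this edge
                stack.append(target)
    return False

def adjacent_ones_nbp(n):
    """NBP guessing an index i with x[i] = x[i+1] = 1."""
    edges = {}
    for i in range(n - 1):
        nxt = ("g", i + 1) if i + 1 < n - 1 else "0"
        edges[("g", i)] = [
            (i, 0, nxt),
            (i, 1, nxt),        # keep scanning ...
            (i, 1, ("c", i)),   # ... or guess that i is the witness (same label)
        ]
        edges[("c", i)] = [(i + 1, 1, "1"), (i + 1, 0, "0")]
    return edges, ("g", 0)
```

Every graph path in this example queries each variable at most once, so it is a nondeterministic read-once BP in the sense of the definition above.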
Read-k BPs
- We can also separate the levels of the Read-$k$ BP hierarchy:
  - e.g., a small Read-2 BP can tell whether an $n \times n$ binary matrix is a permutation matrix, unlike small Read-Once BPs.
  - There are explicit Boolean functions with small Read-$(k+1)$ BPs that require exponential size Read-$k$ BPs for $k \le \log n / 2$, even allowing nondeterminism or randomization. [Jayram S. Thathachar 1998]
- The techniques are easier, but in the case of non-Boolean inputs they are similar enough to those for general time-bounded BPs that we do them together.
Breaking up a BP via its Traces
- Split BP $P$ with input set $D^n$ into $L$ layers.
- Let $\mathrm{traces}(P) = \{\mathrm{trace}(x) \mid x \in D^n\}$.
- For $\tau \in \mathrm{traces}(P)$ let $f_\tau$ be the function that has value 1 on input $x$ iff $\mathrm{trace}(x) = \tau$ and $f(x) = 1$.
- Since $P$ computes $f$: $f = \bigvee_{\tau \in \mathrm{traces}(P)} f_\tau$.
- The $f_\tau$ are disjoint and $|\mathrm{traces}(P)| \le 2^{SL}$.
- Can extend this to nondeterministic BPs:
  - Each $x$ may have multiple traces.
  - The $f_\tau$ are no longer disjoint.
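A small experiment makes the trace decomposition concrete. The sketch below assumes that trace(x) records the node at which the computation on x enters each layer, and that layers have equal length; both the representation and the helper names are illustrative assumptions.

```python
from itertools import product

def parity_bp(n):
    """Leveled parity BP: node -> (variable, 0-child, 1-child); sinks are "0" and "1"."""
    bp = {}
    for i in range(n):
        last = (i == n - 1)
        for p in (0, 1):
            lo = str(p) if last else (i + 1, p)
            hi = str(p ^ 1) if last else (i + 1, p ^ 1)
            bp[(i, p)] = (i, lo, hi)
    return bp, (0, 0)

def trace(bp, root, x, layer_len):
    """Return (tuple of nodes at the start of each layer, sink reached) on input x."""
    node, steps, boundary = root, 0, []
    while node in bp:
        if steps % layer_len == 0:
            boundary.append(node)
        var, lo, hi = bp[node]
        node = hi if x[var] else lo
        steps += 1
    return tuple(boundary), node

def decompose(bp, root, n, layer_len):
    """Group the accepted inputs by trace: one set f_tau^{-1}(1) per trace tau."""
    parts = {}
    for x in product((0, 1), repeat=n):
        tau, sink = trace(bp, root, x, layer_len)
        parts.setdefault(tau, set())
        if sink == "1":
            parts[tau].add(x)
    return parts
```

With $n = 6$ and $L = 3$ layers, the parity BP has only 4 traces, far below the bound obtained by choosing one of the $2^S$ nodes per layer. Because the BP is deterministic, each input lands in exactly one part.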
Read-k BPs and Traces
- Split BP $P$ with input set $D^n$ into $L$ layers.
- If $P$ is a read-$k$ BP then w.l.o.g. for every pair of nodes $u, v$ in $P$ the same set of variables is read on every path from $u$ to $v$.
  - We only need to avoid variables that are read $k$ times on some pair of paths above $u$ and below $v$.
- So each trace $\tau$ yields a fixed sequence of $L$ sets of variables read, each of size $\le kn/L$.
- Can assign the layers for each $f_\tau$ as we did for oblivious BPs.
  - Get $2^{SL}$ assignments, one for each $\tau$.
Recall: Strategy for Assigning Layers
- Assign each of $L$ layers to either Alice or Bob for $T \le kn$.
- Goal: maximize the number of bits per player, $m$, while minimizing $L$.
- Flip an independent coin for each layer: $m = n/2^{k+1}$, $L = 8k^2 2^k$ equal length layers. [Borodin-Razborov-Smolensky 1989, B-Jayram-Saks 2001]
- Use $4k^2$ equal layers and give a random subset of $2k$ of them to Bob: $m = n/(2ek)^{2k}$, $L = 4k^2$. [Okol'nishnikova 1989, Ajtai 1999]
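The coin-flip strategy can be simulated directly on a query sequence. This is a minimal sketch, assuming the sequence length is a multiple of the number of layers and that a variable is "private" to a player when every layer reading it belongs to that player; the function name `assign_layers` is hypothetical.

```python
import random

def assign_layers(queries, num_layers, rng):
    """Split the query sequence into equal layers and give each to Alice ("A") or Bob ("B")."""
    assert len(queries) % num_layers == 0, "sketch assumes equal-length layers"
    size = len(queries) // num_layers
    layers = [queries[i:i + size] for i in range(0, len(queries), size)]
    owner = [rng.choice("AB") for _ in layers]  # one independent fair coin per layer
    readers = {}
    for who, layer in zip(owner, layers):
        for var in layer:
            readers.setdefault(var, set()).add(who)
    alice_private = {v for v, who in readers.items() if who == {"A"}}
    bob_private = {v for v, who in readers.items() if who == {"B"}}
    return layers, owner, alice_private, bob_private
```

For example, `assign_layers(list(range(8)) * 2, 4, random.Random(0))` treats a length-$2n$ sequence with $n = 8$. The bounds on $m$ quoted above are about how large both private sets can be guaranteed to be, which this sketch does not attempt to prove.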
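Communication Complexity
- Alice holds input $x \in X$ and Bob holds input $y \in Y$. They exchange messages (e.g. 010, 11, …) in order to compute $f(x, y)$.
- $C(f)$ = number of bits Alice and Bob need to exchange to compute $f$ on $X \times Y$.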
Communication Complexity
- Defn: A (combinatorial) rectangle in $X \times Y$ is a subset $U \times V$ where $U \subseteq X$ and $V \subseteq Y$.
- Lemma: Any deterministic $c$-bit protocol for $f: X \times Y \to \{0,1\}$ yields a partition of $X \times Y$ into $2^c$ rectangles on which $f$ is constant.
- Lemma: Nondeterministic $c$-bit protocols correspond to coverings of $f^{-1}(1)$ by $2^c$ rectangles.
- To prove that $f$ requires large (non)deterministic communication complexity it suffices to prove:
  - $f^{-1}(1)$ is large.
  - Any rectangle in $f^{-1}(1)$ is small.
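The "large 1-set, small rectangles" recipe can be tried by brute force on a toy function. The sketch below uses equality on a 4-element domain as an illustrative choice (not one of the lecture's functions); `is_rectangle` and `largest_one_rectangle` are hypothetical helper names.

```python
from itertools import combinations, product

def is_rectangle(points):
    """A set of (x, y) pairs is a rectangle iff it equals the product of its projections."""
    points = set(points)
    us = {x for x, _ in points}
    vs = {y for _, y in points}
    return points == set(product(us, vs))

def nonempty_subsets(items):
    items = list(items)
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            yield set(combo)

def largest_one_rectangle(f, X, Y):
    """Size of the largest rectangle U x V contained in f^{-1}(1) (exhaustive search)."""
    best = 0
    for U in nonempty_subsets(X):
        for V in nonempty_subsets(Y):
            if all(f(x, y) for x, y in product(U, V)):
                best = max(best, len(U) * len(V))
    return best
```

For equality on $\{0,1,2,3\}$ the 1-set has 4 points while every rectangle inside it has size 1, so any cover of the 1-set needs at least 4 rectangles, i.e. at least 2 bits of nondeterministic communication by the covering lemma.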
Best Partition and Fixed Variables
- For BP lower bounds, we don't know a priori how the input $\{0,1\}^n$ or $[m]^n$ to $f$ is partitioned into $X \times Y$.
  - Need to analyze rectangles for all possible partitions of $[n]$: "best partition" communication complexity.
- Also, in the case of oblivious BPs, most input variables (the ones seen by both parties) were fixed ahead of time.
  - Implicitly, rectangle size was only important relative to the space of unfixed variables.
  - For Read-$k$ and general time-bounded BPs this aspect is even more important, so we need to make it explicit.
Embedded Rectangles
- Defn: For disjoint sets $A, B \subseteq [n]$ and $\alpha \in D^{[n]-A-B}$, the embedded rectangle $R \subseteq D^n$ with footprint $(A, B)$, tail $\alpha$, and body $R_A \times R_B$ for $R_A \subseteq D^A$, $R_B \subseteq D^B$ is the set $R = \{x \in D^n \mid x_A \in R_A,\ x_B \in R_B,\ x_{[n]-A-B} = \alpha\}$.
- Defn: The density $\delta_R$ of $R$ is $|R_A \times R_B| / |D^A \times D^B|$. The footsize $m_R$ of $R$ is $\min\{|A|, |B|\}$.
- For oblivious BP lower bounds, the footprint $(A, B)$ is the same for all of the embedded rectangles associated with the partition of the input sequence $\sigma$.
  - This lets us use communication lower bounds over the reduced variable set $A \cup B$.
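The definitions of footprint, tail, body, density, and footsize can be spelled out on a tiny instance. The following sketch uses 0-based coordinates and hypothetical helper names; it is only an illustration of the definitions, not part of any lower bound argument.

```python
from fractions import Fraction
from itertools import product

def embedded_rectangle(n, A, B, RA, RB, tail):
    """All x in D^n with x_A in RA, x_B in RB and the remaining coordinates fixed by tail.

    A, B: disjoint tuples of coordinates; RA, RB: sets of value tuples indexed like A and B;
    tail: dict mapping every coordinate outside A and B to its fixed value.
    """
    points = set()
    for a, b in product(RA, RB):
        x = dict(tail)
        x.update(zip(A, a))
        x.update(zip(B, b))
        points.add(tuple(x[i] for i in range(n)))
    return points

def density(D, A, B, RA, RB):
    """|R_A x R_B| / |D^A x D^B| as an exact fraction."""
    return Fraction(len(RA) * len(RB), len(D) ** (len(A) + len(B)))

def footsize(A, B):
    return min(len(A), len(B))
```

For example, with $D = \{0,1\}$, $n = 4$, footprint $A = (0)$, $B = (2, 3)$, body $R_A = \{(1)\}$, $R_B = \{(0,0), (1,1)\}$ and tail fixing coordinate 1 to 0, the rectangle has two points, density $1/4$, and footsize 1.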
Lower Bounds from Embedded Rectangles
- Strategy: Write $f = \bigvee_{i=1}^{E} f_i$ where each $f_i^{-1}(1)$ is a union of embedded rectangles with footsize $m$, the same footprint $(A_i, B_i)$, but different tails.
  - For Read-$k$ BPs: $E \le 2^{LS}$ (one per trace).
  - For general length $kn$ BPs?
- Show that any embedded rectangle in $f^{-1}(1)$ with footsize $\ge m$ has density $\le \delta$.
- Implies that $E \cdot \delta \ge |f^{-1}(1)| / |D^n|$.
Decomposition for Length $kn$
- Recall that $f = \bigvee_{\tau} f_\tau$. Fix one of the $2^{SL}$ traces $\tau$.
- Apply the layer assignment separately for each $x$ with trace $\tau$ to the sequence of variables queried on input $x$.
  - This gives one of $\le 2^L$ possible layer assignments $\lambda = \mathrm{layers}(x)$.
- Let $\mathrm{private}(x)$ be the pair consisting of the first $m$ variables in the private inputs for Alice and Bob, respectively, under $\lambda$.
  - There are at most $\binom{n}{m}^2$ choices $(A, B)$ for $\mathrm{private}(x)$.
- Claim: For disjoint $A, B \subseteq [n]$ with $|A| = |B| = m$, values $\alpha \in D^{[n]-A-B}$, trace $\tau$ and layer assignment $\lambda$, the set $R = \{x \in D^n \mid \mathrm{private}(x) = (A, B),\ x_{[n]-A-B} = \alpha,\ \mathrm{trace}(x) = \tau,\ \mathrm{layers}(x) = \lambda\}$ is an embedded rectangle with footprint $(A, B)$ in $f^{-1}(1)$.
- Claim ⇒ we can choose $E \le 2^{SL} \cdot 2^{L} \cdot \binom{n}{m}^2$.
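Proving the Claim
- Claim: For disjoint $A, B \subseteq [n]$ with $|A| = |B| = m$, values $\alpha \in D^{[n]-A-B}$, trace $\tau$ and layer assignment $\lambda$, the set $R = \{x \in D^n \mid \mathrm{private}(x) = (A, B),\ x_{[n]-A-B} = \alpha,\ \mathrm{trace}(x) = \tau,\ \mathrm{layers}(x) = \lambda\}$ is an embedded rectangle with footprint $(A, B)$ in $f^{-1}(1)$.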
Lower Bounds from Embedded Rectangles
- Strategy: Write $f = \bigvee_{i=1}^{E} f_i$ where each $f_i^{-1}(1)$ is a union of embedded rectangles with footsize $m$, the same footprint $(A_i, B_i)$, but different tails.
  - For Read-$k$ BPs: $E \le 2^{LS}$ (one per trace).
  - For general length $kn$ BPs: $E \le 2^{(S+1)L} \binom{n}{m}^2$.
- Show that any embedded rectangle in $f^{-1}(1)$ with footsize $\ge m$ has density $\le \delta$.
- Implies that $E \cdot \delta \ge |f^{-1}(1)| / |D^n|$.
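The final implication is a counting argument. A sketch of it, assuming that $|A_i| = |B_i| = m$ and that each $f_i^{-1}(1)$ contains at most one embedded rectangle $R_{i,\alpha}$ per tail $\alpha$ (as in the decomposition by trace, layer assignment, and private variables):

```latex
\begin{align*}
|R_{i,\alpha}| &= \delta_{R_{i,\alpha}} \cdot |D|^{2m} \le \delta \cdot |D|^{2m}
  && \text{(density bound)} \\
|f_i^{-1}(1)| &\le \sum_{\alpha \in D^{[n]-A_i-B_i}} |R_{i,\alpha}|
  \le |D|^{n-2m} \cdot \delta \cdot |D|^{2m} = \delta \cdot |D^n|
  && \text{(at most } |D|^{n-2m} \text{ tails)} \\
|f^{-1}(1)| &\le \sum_{i=1}^{E} |f_i^{-1}(1)| \le E \cdot \delta \cdot |D^n|
  && \text{(since } f = \textstyle\bigvee_i f_i \text{)}
\end{align*}
```

Dividing by $|D^n|$ gives $E \cdot \delta \ge |f^{-1}(1)| / |D^n|$, so an upper bound on $E$ in terms of $S$ and $L$ turns into a lower bound on $S \cdot L$.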
Functions with Embedded Rectangle Tradeoffs
- Show that any embedded rectangle in $f^{-1}(1)$ with footsize $= m$ has density $\le \delta$:
  - $\delta$ cannot be smaller than $|D|^{-2m}$ (just one point).
- Functions $f$ with $\delta = |D|^{-\varepsilon m}$:
  - Hamming separation $\mathrm{HAM}_\gamma$ [Ajtai 2002]: e.g. $D = [n^6] = \{0,1\}^{6 \log n}$. Output is 1 iff $\Delta(x_i, x_j) \ge 5 \log_2 n$ for all $i \ne j$.
  - Membership in linear codes over the finite field $\mathbb{F}_{2^{4k}}$. [Jukna 2009]
  - Middle bit of integer multiplication of numbers with $D = \{0,1\}^b$, i.e. $n$ $b$-bit blocks. [Sauerhoff-Woelfel 2003]
- All have $|f^{-1}(1)| / |D^n| \ge 1/|D|$.
Lower Bounds for BPs/NBPs/RAMs for large $D$
- Suppose that $T \le kn$ and the BRS layer assignment based on independent coin-flips is used. Then for 99% of inputs $x$ it achieves:
  - $m = n/2^{k+1}$, $L = 8k^2 2^k$
  - $E \cdot \delta \ge 0.99\,|f^{-1}(1)| / |D^n| \ge 0.99/|D|$
  - $E \le 2^{(S+1)L} \binom{n}{m}^2$ and $\delta = |D|^{-\varepsilon m}$
- For these values $\binom{n}{m}^2 \le (e\,2^{k+1})^{2m} < 2^{(2k+6)m}$.
- If $\log_2 |D| > 4k/\varepsilon$ then $2^{SL} \ge |D|^{\varepsilon m/4}$.
- Taking logs we get $S \cdot L \ge \varepsilon' m \log_2 |D|$.
- Plugging in as before yields $T = \Omega(n \log((n \log |D|)/S))$. [B-Jayram-Saks 2001, B-Saks-Sun-Vee 2003]
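The "plugging in" step can be written out as follows. This is a sketch with unoptimized constants; it assumes $k \ge 4$ (so that $k^2 \le 2^k$) and uses only the values of $m$ and $L$ stated above.

```latex
\begin{align*}
S \cdot L \ge \varepsilon' m \log_2 |D|
  &\;\Longrightarrow\; S \cdot 8k^2 2^k \ge \varepsilon' \cdot \frac{n}{2^{k+1}} \cdot \log_2 |D| \\
  &\;\Longrightarrow\; 16\,k^2 4^k \ge \frac{\varepsilon' \, n \log_2 |D|}{S} \\
  &\;\Longrightarrow\; 16 \cdot 8^k \ge \frac{\varepsilon' \, n \log_2 |D|}{S}
     && (k^2 \le 2^k) \\
  &\;\Longrightarrow\; k \ge \frac{1}{3} \log_2 \frac{\varepsilon' \, n \log_2 |D|}{16\,S}
     = \Omega\!\left(\log \frac{n \log |D|}{S}\right).
\end{align*}
```

So any BP of length $T \le kn$ and space $S$ computing $f$ must have $k = \Omega(\log((n \log |D|)/S))$, which is the stated bound on $T$ when $T = kn$, provided the side condition $\log_2 |D| > 4k/\varepsilon$ holds.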
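Boolean Domains and $\mathrm{ED}_{n,n^2}$
- For $\mathrm{ED}_{n,n^2}$ we only have $\delta = 2^{-\varepsilon m}$.
- Over $\mathbb{F}_2^{3n}$ Ajtai defined an explicit cubic form $f(x, y) = x^{T} M_y\, x$ that requires $\delta = 2^{-\varepsilon m}$.
  - Alternatively: $f(x) = 1$ iff the number of pairs $(i, j)$ such that $i < j$ and $x_i = x_j = x_{i+j} = 1$ is odd.
- [Figure: the matrix $M_y$, with entries $y_1, y_2, \ldots, y_{2n-1}$]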
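The alternative Boolean function is simple to evaluate directly. The sketch below takes the definition literally with 1-based indices and, as an assumption, only counts pairs with $i + j \le n$ so that $x_{i+j}$ exists; the name `triple_parity` is hypothetical.

```python
def triple_parity(x):
    """1 iff the number of pairs i < j with x_i = x_j = x_{i+j} = 1 is odd (1-based indices)."""
    n = len(x)

    def bit(t):
        return x[t - 1]  # translate the 1-based index to Python's 0-based list

    count = 0
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            if i + j <= n and bit(i) and bit(j) and bit(i + j):
                count += 1
    return count % 2
```

For instance, on $x = 111$ the only qualifying pair is $(1, 2)$, so the value is 1, while on $x = 1111$ the pairs $(1, 2)$ and $(1, 3)$ both qualify and the value is 0.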
BP Lower Bound Technology for $\delta = 2^{-\varepsilon m}$
- Much more complicated argument that holds only up to small amounts of nondeterminism. [Ajtai 2005]
- Uses:
  - correlations between $\mathrm{private}(x)$ values for related inputs;
  - an independent layer assignment that leaves most layers unassigned to either player;
  - a probability of assigning a layer for input $x$ to one of Alice or Bob that depends on the typical number of different layers in which input variables are read on input $x$.
- Theorem: [Ajtai 2005, B-Saks-Sun-Vee 2003] $\mathrm{ED}_{n,n^2}$ and $x^{T} M_y\, x$ both require $T = \Omega\!\left(n \sqrt{\frac{\log (n/S)}{\log\log (n/S)}}\right)$.
BPs and Static Data Structures
- Theorem: [Miltersen-Nisan-Safra-Wigderson 1998] With query set $\{0,1\}^m$, time lower bounds of $\omega(m)$ for size $n^{O(1)}$ static cell-probe data structures require non-trivial time-space tradeoffs (i.e. $O(\log n)$ space requires superlinear time).
- Proof:
  - View the query $x \in \{0,1\}^m$ as the input vector.
  - For each fixed dataset $\mathcal{D}$, have a different branching program $P_{\mathcal{D}}$.
  - Can use each cell of the cell-probe data structure to hold a node of $P_{\mathcal{D}}$. The values are the index of the variable and the names of the two pointers.
  - A size $n^{O(1)}$ BP implies an $n^{O(1)}$ size cell-probe data structure with $w = 3 \log n$-bit words (simply follow the branching program). Time is preserved.
  - Can extend each step to a full tree of height $k$ at the cost of a $2^k$ factor larger word-size $w$. This saves a factor $k$ in time.
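The proof idea, one BP node per memory cell and a query that just follows pointers, can be sketched in a few lines. The parity BP stands in for $P_{\mathcal{D}}$ here purely as an illustration, and the encoding of sinks as negative pointers is an assumption of this sketch.

```python
def parity_bp(n):
    """Leveled parity BP: node -> (variable, 0-child, 1-child); sinks are "0" and "1"."""
    bp = {}
    for i in range(n):
        last = (i == n - 1)
        for p in (0, 1):
            lo = str(p) if last else (i + 1, p)
            hi = str(p ^ 1) if last else (i + 1, p ^ 1)
            bp[(i, p)] = (i, lo, hi)
    return bp, (0, 0)

def bp_to_cells(bp, root):
    """One cell per node, holding (variable index, pointer for 0, pointer for 1)."""
    order = [root] + [v for v in bp if v != root]  # the root lives in cell 0
    addr = {v: a for a, v in enumerate(order)}

    def ptr(target):
        # Sinks are encoded as negative pointers: -1 means output 0, -2 means output 1.
        return addr[target] if target in bp else -1 - int(target)

    return [(bp[v][0], ptr(bp[v][1]), ptr(bp[v][2])) for v in order]

def query(cells, x):
    """Answer the query x by following pointers; returns (output, number of cell probes)."""
    p, probes = 0, 0
    while p >= 0:
        var, lo, hi = cells[p]
        probes += 1
        p = hi if x[var] else lo
    return -1 - p, probes
```

Each cell stores three numbers, matching the $3 \log n$-bit word size, and the number of probes equals the length of the path followed, so time is preserved.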
A Converse
- Theorem: [B-Vee 2002] A static data structure with $2^S$ cells, extra work space at most $S$, and a time $T$ query algorithm that reads $\le k$ consecutive bits of the query in one step yields a $2^k$-way BP $P_{\mathcal{D}}$ of time $O(T)$ and space $O(S + \log T)$ for every dataset $\mathcal{D}$.
- Proof: Each BP node corresponds to a cell name plus a configuration of the extra storage.
  - Memory contents are fixed by $\mathcal{D}$.
  - The input bits accessed are determined by the algorithm and the fixed memory cell contents just read.
Application to $\lambda$-Near Neighbor [B-Vee 2002]
- Hamming separation $\mathrm{HAM}_\gamma$: e.g. $D = [n^6] = \{0,1\}^{6 \log n}$. Output is 1 iff $\Delta(x_i, x_j) \ge 5 \log_2 n$ for all $i \ne j$.
- Can solve $\mathrm{HAM}_\gamma$ using a $\lambda$-Near Neighbor data structure:
  - Encode each coordinate $x_i \in D$ using twice the bits, so that its distance from 0 is fixed.
  - Choose $\mathcal{D}$ to be the set of all possible strings of the form $0^{i-1}\, a\, 0^{j-i-1}\, a\, 0^{n-j}$.
  - $\mathrm{HAM}_\gamma(x) = 0$ iff $\mathcal{D}$ contains a string close to $x$.
- So the BP lower bound for $\mathrm{HAM}_\gamma$ implies:
- Theorem: Any $\lambda$-Near Neighbor data structure for Hamming distance on $\{0,1\}^m$ that reads $O(\log n)$ consecutive bits per time step and uses $2^{(mn)^{o(1)}}$ memory cells requires time $\Omega(m)$.
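One natural way to double the length so that every encoded block has the same distance from the all-zero string is to append the bitwise complement. The slides do not say which encoding is used, so the sketch below is an assumption made only to illustrate the idea; `encode` and `hamming` are hypothetical names.

```python
def encode(bits):
    """Assumed encoding: the block followed by its bitwise complement (twice the bits)."""
    return tuple(bits) + tuple(1 - b for b in bits)

def hamming(u, v):
    """Hamming distance between two equal-length bit tuples."""
    return sum(a != b for a, b in zip(u, v))
```

Under this encoding every block of $b$ bits has weight exactly $b$, and distances between blocks exactly double. A fixed block weight means the blocks facing the zero regions of a dataset string $0^{i-1}\, a\, 0^{j-i-1}\, a\, 0^{n-j}$ contribute a constant amount, so closeness depends only on how blocks $i$ and $j$ compare with $a$.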
Larger bounds for Huge Domains
- Inspired by multiparty NOF communication complexity.
  - Uses embedded cylinder intersections instead of embedded rectangles.
- Theorem: There is an explicit function over a huge domain for which $T = \Omega(n \log^2 (n \log|D| / S))$ is needed. [B-Vee 2002]
- Drawbacks:
  - Domain size $|D|$ requires $\Theta(\log^3 n)$ bits to encode.
  - The function, which is based on tensored, interleaved Reed-Solomon codes, is not known to be in NP.
Single-Output Methods for Multi-output Problems
Open Problems
- Prove general BP lower bounds for out-degree 2 (arbitrary) directed graph reachability.
  - Savitch's Theorem implies $S = O(\log^2 n)$ and we don't expect $S = O(\log n)$ is possible at all. (Would imply NL/poly = L/poly.)
  - Prove that $S = O(\log n)$ implies $T = \Omega(n^2)$ or $T = \Omega(n^{1+\varepsilon})$.
  - At least match the oblivious BP bound of $T = \Omega(n \log^2 (n/S))$ for out-degree 1.
- Improve the best lower bound for Boolean functions from $T = \Omega\!\left(n \sqrt{\frac{\log (n/S)}{\log\log (n/S)}}\right)$ to $T = \Omega(n \log(n/S))$ to match the large domain and oblivious BP bounds.
- Generalize embedded rectangle techniques for Boolean inputs to embedded cylinder intersections.
Open Problems
- Prove any oblivious BP lower bound for an explicit single-output function that holds for time $T = n \log^{\omega(1)} n$ or even $T = \omega(n \log^2 n)$.
- Prove a $T = \Omega(n \log^2 (n/S))$ oblivious BP lower bound for a wider range of natural functions.
Open Problems
- Prove an $\Omega(n^2)$ size lower bound for an explicit Boolean function.
- Find better time-space tradeoff lower bounds for other multi-output functions, e.g.
  - Encoding asymptotically good error-correcting codes. [Bazzi-Mitter 2005] conjectured $T = \Omega(n^2/S)$.
  - Element distinctness in sliding windows [B-Clifford-Machmouchi 2013]: $T = \Omega(n^{3/2}/S^{1/2})$?