Faster Space-Efficient Algorithms for Subset Sum, k-Sum and related problems
Nikhil Bansal, Shashwat Garg, Jesper Nederlof, Nikhil Vyas
Available at arXiv:1612.02788
Disclaimer: no turtles or hares were harmed during this research.
Subset Sum
Given integers $w_1,\ldots,w_n,t$, find $X \subseteq [n]$ with $\sum_{i \in X} w_i = t$.
$O(nt)$ time by DP.
$\tilde{O}(n+t)$ randomized time: B (SODA'17).
$\tilde{O}(nt)$ randomized time, poly space: B (SODA'17).
$O^*(2^n)$ time, poly space ($O^*$ omits factors polynomial in the input size).
$O^*(2^{n/2})$ time, $O^*(2^{n/2})$ space: HS (JACM72).
$O^*(2^{n/2})$ time, $O^*(2^{n/4})$ space: SS (SICOMP'81).
Can solve any instance in either $O^*(2^{0.49991n})$ time, or $O^*(2^{0.999n})$ time and poly space: AKKN (STACS'16).
Main Result: There is a Monte Carlo algorithm for Subset Sum using $O^*(2^{0.86n})$ time and poly space, assuming random read-only access to random bits.
[Figure: "Algo" sends the query "i'th bit?" to a random bit string (0100111100...) and gets back 0/1; labelled $\mathrm{poly}(n)$ time.]
[Roadmap figure. Techniques: Floyd (cycle finding); BCM (FOCS13) (shuffle function); crypto (list merging); HS (JACM72) (MitM); AKKN (STACS16) (Subset Sum distribution is smooth); LN (STOC10) (save space with DFT); hash mod p. Problems: BCM Element Distinctness; List Disjointness (with small freqs²); Subset Sum (many distinct sums); Subset Sum (few distinct sums); Subset Sum.]
[Roadmap, current part: Floyd (cycle finding) and BCM (FOCS13) (shuffle function), leading to BCM Element Distinctness.]
Element Distinctness (ED): BCM (FOCS13)
Given $z \in [m]^n$ with $m \le \mathrm{poly}(n)$, are all values distinct?
  Unlimited space: $\tilde{O}(n)$ time (sort).
  $O(\log n)$ space: $\tilde{O}(n^2)$ time (brute force).
Suppose $z \in_R [n]^n$; find a repeated value pair whp.
  Unlimited space: $\tilde{O}(\sqrt{n})$ time ($\tilde{O}(\sqrt{n})$ samples).
  $O(\log n)$ space: $\tilde{O}(n)$ time ($\tilde{O}(1)$ samples and compare).
Theorem (BCM): $\tilde{O}(n^{1.5})$ time, $O(\log n)$ space, assuming random read-only access to random bits.
We'll first see $\tilde{O}(\sqrt{n})$ time, $O(\log n)$ space if $z \in_R [n]^n$.
Approach: set $f(i) = z_i$, and use cycle finding to find $i \ne j$ such that $f(i) = f(j)$.
Floyd's Cycle Finding
Finds such $i,j$ using little space.
Basic algorithm in crypto, much more obscure in TCS.
View $f$ as a digraph with arcs $(i, f(i))$.
[Figure: digraph of $f$ with start vertex $s$.]
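Floyd's Cycle Finding (turtle and hare)
[Figure: the walk from start vertex $s$ runs along a stem into a cycle; $i$ (on the stem) and $j$ (on the cycle) are the two vertices with $f(i) = f(j)$.]
$T$ = number of steps the turtle takes in the first round (6 in the example).
$p$ = stem length (3 in the example), $q$ = cycle length (6 in the example).
When turtle and hare meet, the hare has taken $2T$ steps, so $2T = T + xq$ for some integer $x$, hence $T = xq$, hence $T + p = xq + p$.
So walking $p$ more steps from the meeting point ends at the same vertex as walking $p$ steps from $s$: the vertex where the stem enters the cycle.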
Floyd's Cycle Finding: caveats
Only works if the start vertex $s$ is outside the cycle!
Works well assuming $f$ is random: $p$ and $q$ are $\Theta(\sqrt{n})$ whp (birthday paradox).
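The turtle-and-hare procedure from these slides can be sketched as follows. This is a minimal illustration, not the authors' code: the name `floyd_collision` and the convention of returning `None` when the start lies on the cycle are assumptions made here.

```python
def floyd_collision(f, s):
    """Return (i, j) with i != j and f(i) == f(j), or None if s lies on the cycle.

    Uses O(1) stored vertices; f must map a finite set into itself.
    """
    # Round 1: the turtle takes one step per iteration, the hare two,
    # until they meet somewhere on the cycle (after T turtle steps).
    turtle, hare = f(s), f(f(s))
    while turtle != hare:
        turtle, hare = f(turtle), f(f(hare))

    # Round 2: restart one pointer at s and move both one step at a time.
    # Because T is a multiple of the cycle length, both reach the cycle
    # entry after p steps; one step earlier they are the pair we want.
    i, j = s, hare
    if i == j:
        return None  # stem length 0: the start vertex is on the cycle
    while f(i) != f(j):
        i, j = f(i), f(j)
    return i, j


# Example matching the slide: stem length 3, cycle length 6.
# 0 -> 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 3
example_f = lambda v: v + 1 if v < 8 else 3
```

On `example_f` with start 0, the pointers meet after T = 6 turtle steps and the second round returns the stem vertex 2 and the cycle vertex 8, both of which map to 3.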
`Shuffling' $f$
Theorem (BCM): $\tilde{O}(n^{1.5})$ time, $O(\log n)$ space, assuming random read-only access to random bits.
What if $f$ is not random? Answer: `shuffle' $f$.
Let $h : [m] \to [n]$ be a random function.
  We cannot remember $h$, but we can use the assumed oracle.
Define $f(i) = h(z_i)$.
Use Floyd to sample $i \ne j$ such that $f(i) = f(j)$.
Call $(i,j)$ a bad pair if $h(z_i) = h(z_j)$ but $z_i \ne z_j$.
Expect at most $n^2/n = n$ bad pairs.
Using $\tilde{O}(n)$ samples, expect to see a real solution.
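A minimal sketch of the shuffling idea, under loud assumptions: SHA-256 keyed by a seed stands in for the random oracle $h$ (it is recomputed on every call, never stored), a fresh seed is used for every sample, and the start vertex is simply `seed % n`. The names `make_oracle` and `find_repeated_value` are hypothetical, and this is not the exact sampling scheme of BCM.

```python
import hashlib


def make_oracle(seed, n):
    """Stand-in for the random function h: values -> {0, ..., n-1}."""
    def h(value):
        data = f"{seed}:{value}".encode()
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big") % n
    return h


def floyd_collision(f, s):
    """Return (i, j), i != j, f(i) == f(j); None if s lies on the cycle."""
    turtle, hare = f(s), f(f(s))
    while turtle != hare:
        turtle, hare = f(turtle), f(f(hare))
    i, j = s, hare
    if i == j:
        return None
    while f(i) != f(j):
        i, j = f(i), f(j)
    return i, j


def find_repeated_value(z, max_seeds=200):
    """Try to find indices i != j with z[i] == z[j] using O(1) stored indices."""
    n = len(z)
    if n == 0:
        return None
    for seed in range(max_seeds):
        h = make_oracle(seed, n)
        # Shuffled function f(i) = h(z_i); equal values always collide under f.
        pair = floyd_collision(lambda i: h(z[i]), seed % n)
        # Discard bad pairs, where h collides but the values differ.
        if pair is not None and z[pair[0]] == z[pair[1]]:
            return pair
    return None
```

If all values are distinct, every sampled pair is a bad pair and the function returns `None`; a returned pair is always a genuine repetition.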
[Roadmap, current part: Floyd (cycle finding), BCM (FOCS13) (shuffle function) and crypto (list merging), leading from BCM Element Distinctness to List Disjointness (with small freqs²).]
List Disjointness
Given two lists $x, y \in [m]^n$ with $m \le \mathrm{poly}(n)$, do they share a common value?
  $x = (1, 9, 2, 4, 9, 4, 6, 5, 2)$
  $y = (7, 8, 5, 0, 3, 7, 3, 0, 8)$
Very similar to ED, but we want equal values from different lists.
Define $p(x) = \sum_{v \in [m]} |x^{-1}(v)|^2$; e.g. $p(y) = 1 + 4 \cdot 2^2 = 17$.
Define $p(x,y) = p(x) + p(y)$; this counts the number of pseudo-solutions.
Theorem: There is an $\tilde{O}(n\sqrt{p})$ time, $O(\log n)$ space algorithm for List Disjointness, if given $p$ with $p(x,y) \le p$, assuming random read-only access to random bits.
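The quantity $p(\cdot)$ is just the sum of squared value frequencies. A small sketch that evaluates it on the two lists from this slide (the function name `pseudo_solutions` is an assumption):

```python
from collections import Counter


def pseudo_solutions(values):
    """p(values): sum over all values v of (number of occurrences of v)^2."""
    return sum(count * count for count in Counter(values).values())


x = (1, 9, 2, 4, 9, 4, 6, 5, 2)
y = (7, 8, 5, 0, 3, 7, 3, 0, 8)

p_x = pseudo_solutions(x)   # three values occur once, three occur twice
p_y = pseudo_solutions(y)   # one value occurs once, four occur twice
p_xy = p_x + p_y
```

For these lists, `p_x` is 3 + 3·4 = 15 and `p_y` is 1 + 4·4 = 17, matching the slide's example for $p(y)$.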
List Disjointness: algorithm
Theorem: There is an $\tilde{O}(n\sqrt{p})$ time, $O(\log n)$ space algorithm for List Disjointness, if given $p$ with $p(x,y) \le p$, assuming random read-only access to random bits.
Define $z$ by merging $x$ and $y$.
  E.g. just concatenate $x$ and $y$.
  In the paper we set $z_i := x_i$ or $z_i := y_i$, each with probability 1/2, so that $z \in [m]^n$.
Sample $i, j$ such that $f(i) = f(j)$ as before.
If $z_i = z_j$ and $i \ne j$, also check whether $x_i = y_j$ or $x_j = y_i$.
Need $p(x,y)$ samples.
Expect $O(n/\sqrt{p(x,y)})$ vertices needed for a sample, for $p(x,y) \cdot n/\sqrt{p(x,y)} = n\sqrt{p(x,y)}$ steps in total.
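A rough sketch of this procedure using the simpler "just concatenate" merge, not the paper's random merge. Assumptions made here: SHA-256 with a per-sample seed stands in for the random oracle, the start vertex is `seed % len(z)`, and the names `floyd_collision` and `common_value_witness` are hypothetical.

```python
import hashlib


def floyd_collision(f, s):
    """Return (i, j), i != j, f(i) == f(j); None if s lies on the cycle."""
    turtle, hare = f(s), f(f(s))
    while turtle != hare:
        turtle, hare = f(turtle), f(f(hare))
    i, j = s, hare
    if i == j:
        return None
    while f(i) != f(j):
        i, j = f(i), f(j)
    return i, j


def common_value_witness(x, y, max_seeds=2000):
    """Return (a, b) with x[a] == y[b], or None if no witness was sampled."""
    z = list(x) + list(y)          # merge by concatenation
    n, size = len(x), len(z)
    if size == 0:
        return None
    for seed in range(max_seeds):
        def f(i, seed=seed):
            # Shuffled function f(i) = h(z_i); h is recomputed, never stored.
            data = f"{seed}:{z[i]}".encode()
            return int.from_bytes(hashlib.sha256(data).digest()[:8], "big") % size
        pair = floyd_collision(f, seed % size)
        if pair is None:
            continue
        i, j = sorted(pair)
        # Keep only real collisions whose two indices come from different lists;
        # equal values inside one list are the pseudo-solutions counted by p(x, y).
        if z[i] == z[j] and i < n <= j:
            return i, j - n
    return None
```

On the example lists from the List Disjointness slide, the only shared value is 5, at position 7 of `x` and position 2 of `y` (0-based), so that is the only witness this sketch can return.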
[Roadmap, current part: HS (JACM72) (MitM) added on top of List Disjointness (with small freqs²), leading to Subset Sum (many distinct sums).]
Meet in the Middle
Integers $w_1,\ldots,w_n,t$. Denote $w(X) := \sum_{i \in X} w_i$.
Split the weights into $L = \{w_1,\ldots,w_{n/2}\}$ and $R = \{w_{n/2+1},\ldots,w_n\}$.
Reduce SSS on $n$ integers to List Disjointness on lists of length $2^{n/2}$, then run the sorting algorithm:
  $x = (w(X))_{X \subseteq L}$
  $y = (t - w(X))_{X \subseteq R}$
The LD instance is solved in $O(2^{n/2}\,\mathrm{polylog}(2^{n/2}))$ time, which is $O^*(2^{n/2})$. It also uses $2^{n/2}$ space.
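The classical exponential-space version of this reduction can be sketched as below; the function names are assumptions, and the sorted list plus binary search plays the role of "the sorting algo".

```python
from bisect import bisect_left


def subset_sums(weights):
    """All 2^len(weights) subset sums, in no particular order."""
    sums = [0]
    for w in weights:
        sums += [s + w for s in sums]
    return sums


def meet_in_the_middle(w, t):
    """Decide Subset Sum with about 2^(n/2) time and space (up to poly factors)."""
    half = len(w) // 2
    x = subset_sums(w[:half])                          # sums w(X), X in L
    y = sorted(t - s for s in subset_sums(w[half:]))   # values t - w(X), X in R
    for value in x:
        k = bisect_left(y, value)
        if k < len(y) and y[k] == value:
            return True   # x and y share a value: w(X_L) + w(X_R) = t
    return False
```

For instance, with weights `[3, 34, 4, 12, 5, 2]` the target 9 is reachable (4 + 5) while 30 is not.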
Meet in the Middle with the new algorithm
Same reduction: $x = (w(X))_{X \subseteq L}$ and $y = (t - w(X))_{X \subseteq R}$ are lists of length $2^{n/2}$, but now run the new List Disjointness algorithm in place of the sorting algorithm.
This gives $O^*(1)$ space and $O^*(2^{n/2}\sqrt{p})$ time.
$p(x) = |\{(X,Y) : X, Y \subseteq L,\ w(X) = w(Y)\}| \le 2^n$.
If all $w_i = 0$, then $p(x) = p(y) = 2^n$, which gives $2^n$ time.
How do instances with $p(x) \ge 2^{0.99n}$ look?
[Roadmap, current part: AKKN (STACS16) (Subset Sum distribution is smooth) added, feeding into Subset Sum (many distinct sums).]
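Subset Sum Distribution is smooth (AKKN)
Lemma: $p(x) \cdot |\{w(X) : X \subseteq L\}| \le 6^{n/2}$.
Note: $6^{n/2} < 2^{1.3n}$.
Examples with five weights in $L$ (the slide also shows a histogram of the subset sums for each):
  $w_L$             $p(x)$                                               $|\{w(X) : X \subseteq L\}|$
  (0, 0, 0, 0, 0)    $32^2$                                               1
  (1, 2, 4, 8, 16)   $32 \cdot 1^2 = 32$                                  32
  (1, 2, 3, 4, 5)    $6 \cdot 1^2 + 4 \cdot 2^2 + 6 \cdot 3^2 = 76$       16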
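The three example rows can be recomputed by brute force. The sketch below (function name `sum_statistics` is an assumption) enumerates all subsets of a small weight vector, builds the histogram of subset sums, and reports $p(x)$ together with the number of distinct sums.

```python
from collections import Counter
from itertools import product


def sum_statistics(weights):
    """Return (p, distinct): sum of squared histogram heights, number of distinct sums."""
    histogram = Counter(
        sum(w for w, bit in zip(weights, bits) if bit)
        for bits in product((0, 1), repeat=len(weights))
    )
    p = sum(count * count for count in histogram.values())
    return p, len(histogram)


rows = {
    (0, 0, 0, 0, 0): sum_statistics((0, 0, 0, 0, 0)),
    (1, 2, 4, 8, 16): sum_statistics((1, 2, 4, 8, 16)),
    (1, 2, 3, 4, 5): sum_statistics((1, 2, 3, 4, 5)),
}
```

In each row the product of the two numbers stays below $6^5 = 7776$, as the lemma requires when $L$ holds five weights.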
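Subset Sum Distribution is smooth (AKKN): how we use it
Lemma: $p(x) \cdot |\{w(X) : X \subseteq L\}| \le 6^{n/2}$. Note: $6^{n/2} < 2^{1.3n}$.
We use this as follows:
  Suppose $|\{w(X) : X \subseteq [n]\}| \ge 2^{0.99n}$.
  Then $|\{w(X) : X \subseteq L\}| \ge 2^{0.49n}$, since $R$ contributes at most $2^{n/2}$ distinct sums,
  and thus $p(x) \le 6^{n/2} / 2^{0.49n} \le 2^{0.99n}$.
Cor: If $|\{w(X) : X \subseteq [n]\}| \ge 2^{0.99n}$, we can solve SSS in $O^*(2^{n/2}\sqrt{2^{0.99n}}) = O^*(2^{0.995n})$ time and poly space, assuming random read-only access to random bits.
The proof of the Lemma uses a connection to UDCPs.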
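Uniquely Decodable Code Pairs (UDCP)
A pair $\mathcal{C}_1, \mathcal{C}_2 \subseteq \{0,1\}^d$ such that $|\mathcal{C}_1 + \mathcal{C}_2| = |\mathcal{C}_1| \cdot |\mathcal{C}_2|$.
Here $\mathcal{C}_1 + \mathcal{C}_2 = \{x + y : (x,y) \in \mathcal{C}_1 \times \mathcal{C}_2\}$, where $x + y$ is addition over $\mathbb{Z}^d$.
[Figure: example code sets $\mathcal{C}_1$ = {1011100, 1101101, 0000000, 1010011, 0101010, 1111101, 1100111} and $\mathcal{C}_2$ = {0011001, 1010101, 0011011, 0110110}, with an illustration of adding a codeword from each and decoding the sum.]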
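Uniquely Decodable Code Pairs (UDCP): example and bounds
A pair $\mathcal{C}_1, \mathcal{C}_2 \subseteq \{0,1\}^d$ such that $|\mathcal{C}_1 + \mathcal{C}_2| = |\mathcal{C}_1| \cdot |\mathcal{C}_2|$.
$\mathcal{C}_1 = \{10, 01\}$, $\mathcal{C}_2 = \{00, 01, 11\}$ is a UDCP: $\mathcal{C}_1 + \mathcal{C}_2 = \{10, 11, 21, 01, 02, 12\}$.
$|\mathcal{C}_1 + \mathcal{C}_2| \le |\{0,1,2\}^d| \le 3^d$.
Side remark: in general, $|\mathcal{C}_1| \cdot |\mathcal{C}_2| \le 2^{1.5d}$ is the best known bound (elegant upper bound, might tell afterwards).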
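The UDCP condition is easy to test by brute force on small code sets. A minimal sketch (the name `is_udcp` and the tuple encoding of codewords are assumptions):

```python
def is_udcp(c1, c2):
    """True iff all coordinate-wise integer sums u + v (u in c1, v in c2) are distinct."""
    sums = {
        tuple(a + b for a, b in zip(u, v))
        for u in c1
        for v in c2
    }
    return len(sums) == len(c1) * len(c2)


# The example from the slide: C1 = {10, 01}, C2 = {00, 01, 11}.
c1 = [(1, 0), (0, 1)]
c2 = [(0, 0), (0, 1), (1, 1)]

# A non-example: 00 + 10 and 10 + 00 give the same sum vector.
d1 = [(0, 0), (1, 0)]
d2 = [(0, 0), (1, 0)]
```

The first pair yields six distinct sum vectors, the second only three out of four, so only the first is uniquely decodable.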
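Subset Sum Distribution is smooth (AKKN): proof of the Lemma
Lemma: $p(x) \cdot |\{w(X) : X \subseteq L\}| \le 6^{n/2}$.
There exists a frequent sum $v$ such that, with $B_v = \{x \in \{0,1\}^{n/2} : w \cdot x = v\}$, we have $p(x) \le |B_v| \cdot 2^{n/2}$.
  (Take $v$ maximizing $|B_v|$; then $p(x) = \sum_u |B_u|^2 \le |B_v| \sum_u |B_u| = |B_v| \cdot 2^{n/2}$.)
Let $A \subseteq \{0,1\}^{n/2}$ be such that for all $a_1, a_2 \in A$: $a_1 \cdot w = a_2 \cdot w$ implies $a_1 = a_2$.
  (Picking one vector per distinct sum gives $|A| = |\{w(X) : X \subseteq L\}|$.)
Then $|A + B_v| = |A| \cdot |B_v|$ (i.e. $A, B_v$ is a UDCP):
  Suppose $a_1 + b_1 = a_2 + b_2$ (addition in $\mathbb{Z}^{n/2}$) with $a_1, a_2 \in A$ and $b_1, b_2 \in B_v$.
  Then $w \cdot (a_1 + b_1) = w \cdot (a_2 + b_2)$, so $w \cdot a_1 + w \cdot b_1 = w \cdot a_2 + w \cdot b_2$.
  Since $w \cdot b_1 = w \cdot b_2 = v$, this gives $w \cdot a_1 = w \cdot a_2$, hence $a_1 = a_2$, hence $b_1 = b_2$.
Thus $|A| \cdot |B_v| = |A + B_v| \le 3^{n/2}$, and the lemma follows: $p(x) \cdot |A| \le |B_v| \cdot 2^{n/2} \cdot |A| \le 6^{n/2}$.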
[Roadmap, current part: hash mod p + DFT added, leading to Subset Sum (few distinct sums); together with Subset Sum (many distinct sums) this gives Subset Sum.]
Subset Sum with few Distinct Sums
Lemma: We can solve an instance $w_1,\ldots,w_n,t$ in time $O^*(|\{w(X) : X \subseteq [n]\}|)$ and poly space.
  Done by hashing the numbers modulo a prime of order $|\{w(X) : X \subseteq [n]\}|$ and then running the $O^*(t)$ time, $O^*(1)$ space algorithm that uses the DFT.
Combining with the previous lemma we obtain:
Main Result': There is a Monte Carlo algorithm for Subset Sum using $O^*(2^{0.995n})$ time and poly space, assuming random read-only access to random bits.
Omitted: the optimization needed to get $O^*(2^{0.86n})$.
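To give a feel for how a DFT saves space, here is a toy coefficient-extraction sketch in that spirit. It is not the algorithm referenced on the slide: it skips the hashing modulo a prime, assumes non-negative integer weights, uses $N = 1 + \sum_i w_i$ evaluation points, and relies on floating-point arithmetic, so it is only meant for small inputs. The function name is an assumption.

```python
import cmath


def count_subsets_with_sum(weights, t):
    """Number of subsets of `weights` summing to t, storing only O(1) numbers.

    Reads off the coefficient of x^t in prod_i (1 + x^{w_i}) via an inverse
    DFT over the N-th roots of unity, evaluating the product point by point
    so that no table of subset sums is ever stored.
    """
    n_points = sum(weights) + 1          # N exceeds the degree: no wrap-around
    if t < 0 or t >= n_points:
        return 0

    def root(exponent, k):
        # omega_k ** exponent, where omega_k = exp(2*pi*i*k/N)
        return cmath.exp(2j * cmath.pi * k * exponent / n_points)

    total = 0
    for k in range(n_points):
        value = root(-t, k)
        for w in weights:
            value *= 1 + root(w, k)      # evaluate the product at omega_k
        total += value
    return round(total.real / n_points)
```

For example, with weights `[3, 34, 4, 12, 5, 2]` exactly two subsets sum to 9, namely {4, 5} and {3, 4, 2}, and none sums to 30.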
[Roadmap, full picture: in addition to the earlier nodes, Random k-Sum, and Knapsack & Binary Linear Programming via NvLvdZ (MFCS12) (reduce without adding variables).]
Further Results
Theorem: Binary LP on $n$ variables, $d$ constraints, and maximum integer $m$ can be solved in time $O^*(2^{0.86n} (\log(mn)\, n)^{O(d)})$ and poly space, assuming random read-only access to random bits.
  Using the reduction to Subset Sum of NvLvdZ (MFCS'12).
  Established by a simple recursive rounding scheme.
Time/space tradeoffs for List Disjointness:
  By extending techniques from BCM.
  List Disjointness in $\tilde{O}(n\sqrt{p/s})$ time given space $s \le n^2/p$.
Theorem: Random 3-Sum in $\tilde{O}(n^{2.5})$ time and $O(\log n)$ space, if the integers are drawn independently and u.a.r. from $[m]$, with $m \le \mathrm{poly}(n)$ a multiple of $n$.
Further Research
How strong is the random oracle assumption exactly?
  It is weaker than the existence of sufficiently strong PRGs.
We still don't know the exact (low-space) complexity of ED!!
  Can we do something specific for SSS?
Solve Subset Sum in time $O^*(2^{(0.5-\epsilon)n})$.
  Can restrict to $w_1,\ldots,w_n,t \le 2^{0.997n}$: AKKN (STACS'16).
Show $|\mathcal{C}_1| \cdot |\mathcal{C}_2| \le 2^{1.49999d}$ for every UDCP $\mathcal{C}_1, \mathcal{C}_2 \subseteq \{0,1\}^d$.
  True if $|\mathcal{C}_1| \ge 2^{0.995d}$: AKKN (ISIT'16).
  Big open question in information theory.
Study Subset Sum combinatorics.
  If there is a spike of size $2^{\epsilon n}$ for some constant $\epsilon > 0$, bound $|\{w(X) : X \subseteq [n]\}| \le 2^{(1-\epsilon')n}$ for some $\epsilon' := f(\epsilon) > 0$.
Take-home Messages
Cycle finding is a great tool for low-space algorithms.
Several worst-case algorithms for SSS are inspired by techniques from the literature on the average-case complexity of SSS:
  Cycle finding in the poly-space setting (this work).
  Howgrave-Graham approach in the exponential-space setting (AKKN).
Win/win approach for many/few distinct sums.
  In the exponential-space setting it remains to `win' in the case of few sums.
Thanks for listening!!