Faster Space-Efficient Algorithms for Subset Sum Jesper Nederlof (Eindhoven University, Netherlands) Joint work with Nikhil Bansal, Shashwat Garg, and Nikhil Vyas of 11
Subset Sum
Given integers $w_1, \ldots, w_n, t$, is there an $X \subseteq [n]$ with $w(X) := \sum_{i \in X} w_i = t$?
Today: how efficiently can we solve this in the worst case?
NP-complete -> exponential time
Basic DP [Bellman (50's)] runs in $O^*(t)$ time and space
What if $t$ is huge (e.g. $2^n$)?
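The basic DP mentioned above can be sketched as follows. This is a minimal illustration, assuming non-negative integer weights and $t \ge 0$; the function name `subset_sum_dp` is chosen here for illustration and does not come from the talk.

```python
def subset_sum_dp(w, t):
    """Bellman-style DP: reachable[s] is True iff some subset of w sums to s.

    Assumes all weights are non-negative integers and t >= 0.
    Uses O(n * t) time and O(t) space.
    """
    reachable = [False] * (t + 1)
    reachable[0] = True  # the empty subset sums to 0
    for wi in w:
        # Go downwards so that each weight is used at most once.
        for s in range(t, wi - 1, -1):
            if reachable[s - wi]:
                reachable[s] = True
    return reachable[t]


if __name__ == "__main__":
    print(subset_sum_dp([3, 34, 4, 12, 5, 2], 9))
```

The table has $t + 1$ entries, which is why this approach stops being attractive once $t$ is exponential in $n$.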
Meet in the Middle [HS(JACM'74)]
1. Let $L = (w_1, \ldots, w_{n/2})$ and $R = (w_{n/2+1}, \ldots, w_n)$
2. Compute all $2^{n/2}$ possible sums on each side: $x = (w(X))_{X \subseteq L}$ and $y = (t - w(X))_{X \subseteq R}$
3. For each $v \in x$, check whether $v \in y$
Sort $y$ + binary search in 3: $O^*(2^{n/2})$ time and space
[SS(SICOMP'81)] improved to $O^*(2^{n/2})$ time, $O^*(2^{n/4})$ space
Natural question: can we beat $O^*(2^n)$ using polynomial space?
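A small sketch of the three steps with sorting and binary search, assuming integer weights. The helper names `subset_sums` and `meet_in_the_middle` are illustrative only; this is the exponential-space version, not the [SS(SICOMP'81)] refinement.

```python
from bisect import bisect_left


def subset_sums(items):
    """Return the list of all 2^len(items) subset sums (with repetitions)."""
    sums = [0]
    for a in items:
        sums += [s + a for s in sums]
    return sums


def meet_in_the_middle(w, t):
    """Decide Subset Sum in O*(2^(n/2)) time and space."""
    half = len(w) // 2
    x = subset_sums(w[:half])                          # sums over L
    y = sorted(t - s for s in subset_sums(w[half:]))   # t minus sums over R
    for v in x:
        i = bisect_left(y, v)                          # binary search for v in y
        if i < len(y) and y[i] == v:
            return True
    return False


if __name__ == "__main__":
    print(meet_in_the_middle([3, 34, 4, 12, 5, 2], 9))
```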
Our contribution
Main thm: SSS in $O^*(2^{0.86n})$ time and poly space, if a random oracle is given.
The oracle stores the random bits for us!
Weaker assumption than sufficiently strong PRGs
[Figure: the algorithm asks the oracle "i'th bit?" about a random bit string (010011110010000010100001100001101010010) and receives 0/1; label: $poly(n)$ time]
Our contribution
Main thm: SSS in $O^*(2^{0.86n})$ time and poly space, if a random oracle is given.
Generalization to Knapsack
Further generalization to Binary IP with few constraints
Random* 3SUM in $O(n^{2.5})$ time and $O(\log n)$ space without oracle
Main Idea
Consider $w(2^{[n]}) = \{w \cdot x : x \in \{0,1\}^n\}$, i.e. all possible $2^n$ sums generated by $w = (w_1, \ldots, w_n)$
Let $d = |w(2^{[n]})|$, i.e. # distinct possible sums
Case 1: If $d < 2^{0.86n}$ (few distinct sums)
a) Hash mod $O(d)$, which makes $t = O^*(d)$ (union bound over $d$ events)
b) $O^*(t)$ time DP, but in poly space [LN(STOC'10)]: interpolates $p(x) = \prod_{i=1}^{n} (1 + x^{w_i})$ to find the coeff. of $x^t$
Case 2: If $d > 2^{0.86n}$ (many distinct sums)
a) Upper bound max bin size via combinatorics of subset sums
b) Use Floyd's cycle finding (and the oracle)
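To illustrate step b) of Case 1, here is a rough sketch of reading off the coefficient of $x^t$ in $p(x) = \prod_i (1 + x^{w_i})$ from evaluations only, without ever storing a table of size $t$. As a simplifying assumption it evaluates $p$ at complex roots of unity in floating point, which is only reliable for small instances; the exact method of [LN(STOC'10)] is not reproduced here, and the name `count_subsets` is hypothetical.

```python
import cmath


def count_subsets(w, t):
    """Number of subsets of w summing to t, i.e. the coefficient of x^t in
    p(x) = prod_i (1 + x^{w_i}).

    Assumes non-negative integer weights. Uses a discrete Fourier inversion:
    coeff_t = (1/N) * sum_k p(omega^k) * omega^(-k t), with N > deg(p).
    Only a constant number of values is kept in memory besides the input.
    Floating point is an assumption of this sketch (small inputs only).
    """
    N = sum(w) + 1            # strictly larger than the degree of p
    if t < 0 or t >= N:
        return 0
    total = 0
    for k in range(N):
        omega = cmath.exp(2j * cmath.pi * k / N)   # k-th power of a primitive N-th root
        value = 1
        for wi in w:
            value *= 1 + omega ** wi               # evaluate p at omega^k
        total += value * omega ** (-t)
    return round((total / N).real)


if __name__ == "__main__":
    print(count_subsets([1, 2, 3], 3))
```

The running time is proportional to $n$ times the number of evaluation points, which is why hashing the weights down to a small range first matters in Case 1.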
Main Idea
Case 2: If $d > 2^{0.86n}$ (many distinct sums)
a) Prove that the max bin size is low via combinatorics of subset sums
Max bin size: $b_{max} = \max_v |\{x \in \{0,1\}^n : w \cdot x = v\}|$
Lemma [AKKN(STACS'16)]: $d \cdot b_{max} \le 2^{1.5n}$
Examples ($n = 5$): $w = (0, 0, 0, 0, 0)$ has $d = 1$ and $b_{max} = 32$; $w = (1, 2, 4, 8, 16)$ has $d = 32$ and $b_{max} = 1$
Cool additive combinatorics result: smoothness of the Subset Sum distribution
Proved via a simple connection with `Uniquely Decodable Code Pairs'
As $d > 2^{0.86n}$, $b_{max} \le 2^{0.64n}$
b) Use Floyd's cycle finding (and the oracle)
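The quantities $d$ and $b_{max}$ can be checked by brute force on tiny instances. The sketch below is only meant to make the definitions concrete; the function name `distinct_sums_and_max_bin` is made up for this example, and the enumeration takes $2^n$ time.

```python
from collections import Counter
from itertools import product


def distinct_sums_and_max_bin(w):
    """Return (d, b_max): the number of distinct subset sums of w and the
    largest number of 0/1 vectors x that share the same value w . x."""
    bins = Counter(
        sum(wi for wi, bit in zip(w, bits) if bit)
        for bits in product((0, 1), repeat=len(w))
    )
    return len(bins), max(bins.values())


if __name__ == "__main__":
    for w in ([0, 0, 0, 0, 0], [1, 2, 4, 8, 16], [1, 1, 1, 1]):
        d, b_max = distinct_sums_and_max_bin(w)
        # The lemma states d * b_max <= 2^(1.5 n).
        print(w, d, b_max, d * b_max <= 2 ** (1.5 * len(w)))
```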
Meet in the Middle [HS(JACM'74)]
1. Let $L = (w_1, \ldots, w_{n/2})$ and $R = (w_{n/2+1}, \ldots, w_n)$
2. Compute all $2^{n/2}$ possible sums on each side: $x = (w(X))_{X \subseteq L}$ and $y = (t - w(X))_{X \subseteq R}$
3. For each $v \in x$, check whether $v \in y$
Sort $y$ + binary search in 3: $O^*(2^{n/2})$ time and space
Looks like the Element Distinctness (ED) problem: given a list $z$ of $m$ ints, find two positions with equal ints, if they exist
Examples: (5, 3, 4, 3, 2, 10, 7, 8, 1, 6) has a solution (the two 3s); (5, 3, 4, 9, 2, 10, 7, 8, 1, 6) has none
Repeat:
1. Define $z$ as the concatenation of $x$ and $y$
2. (Almost) uniformly sample a solution of the ED instance $z$
3. Check whether it is a `fake' solution or a real SSS solution
How many times? $O(\#\text{fake sols}) \le O(2^{0.89n})$, using the max bin bound!
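To make the real/fake distinction concrete: a pair of equal entries of $z$ is a real solution when one position lies in $x$ and the other in $y$, and fake when both lie on the same side. The sketch below counts both kinds by brute force on a tiny instance. It stores $z$ explicitly, so it does not reflect the space-efficient sampling of the talk, and the names are illustrative.

```python
from itertools import combinations


def subset_sums(items):
    """Return the list of all 2^len(items) subset sums (with repetitions)."""
    sums = [0]
    for a in items:
        sums += [s + a for s in sums]
    return sums


def classify_collisions(w, t):
    """Count (real, fake) solutions of the ED instance z = x + y."""
    half = len(w) // 2
    x = subset_sums(w[:half])
    y = [t - s for s in subset_sums(w[half:])]
    z = x + y
    real = fake = 0
    for i, j in combinations(range(len(z)), 2):   # i < j
        if z[i] == z[j]:
            if i < len(x) <= j:
                real += 1   # one position in x, one in y: a Subset Sum solution
            else:
                fake += 1   # both positions on the same side
    return real, fake


if __name__ == "__main__":
    print(classify_collisions([1, 1, 2, 2], 3))
```

If a sampled solution is close to uniform over all equal pairs, the expected number of repetitions before hitting a real one grows with the number of fake pairs, which is where the max bin bound enters.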
Sample solution of ED instance
A priori, even for finding a solution, $O(m^2)$ time seems best with polylog $m$ space
Thm [BCM(FOCS'13)]: ED in $O(m^{1.5})$ time, $O(\log m)$ space, if given a random oracle
Crucially relies on Floyd's rho algorithm
LD (List Disjointness): given two lists $x, y \in [m]^N$, is there a common value?
Define $f(x,y) = \sum_{v=1}^{m} \left( |x^{-1}(v)|^2 + |y^{-1}(v)|^2 \right)$
Refined analysis of [BCM(FOCS'13)]: sample time $O(m/\sqrt{f})$
Thm: LD in $O(m\sqrt{f})$ time and $O(\log m)$ space, if given $f \ge f(x,y)$ and a random oracle
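The cycle-finding ingredient can be sketched in isolation. Given any function `g` on a finite domain and a start point, the walk start, g(start), g(g(start)), ... eventually enters a cycle, and the two distinct predecessors of the cycle entry form a collision. The sketch below finds such a pair while storing only a constant number of values. In the talk's setting the function would be derived from the lists and the random oracle; here `g` is an arbitrary function supplied by the caller, and the name `floyd_collision` is an assumption of this example.

```python
def floyd_collision(g, start):
    """Return a pair (a, b) with a != b and g(a) == g(b) found on the walk
    from `start`, or None if `start` already lies on the cycle.

    Assumes g maps a finite set into itself. Keeps O(1) values in memory.
    """
    # Phase 1: tortoise moves one step, hare two steps, until they meet.
    tortoise, hare = g(start), g(g(start))
    while tortoise != hare:
        tortoise = g(tortoise)
        hare = g(g(hare))

    # Phase 2: restart the tortoise; both now move one step at a time and
    # reach the cycle entry simultaneously.
    tortoise = start
    if tortoise == hare:
        return None  # no tail, hence no collision on this walk
    while True:
        next_t, next_h = g(tortoise), g(hare)
        if next_t == next_h:
            return tortoise, hare  # distinct points with the same image
        tortoise, hare = next_t, next_h


if __name__ == "__main__":
    nxt = [1, 2, 3, 4, 2]          # walk 0 -> 1 -> 2 -> 3 -> 4 -> 2 -> ...
    print(floyd_collision(lambda v: nxt[v], 0))
```

A walk that starts on the cycle yields no collision, so in practice one would retry from a different start point or with a different function.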
Take-home Messages
Cycle finding is a great tool for low-space algorithms
Win/win approach for many/few distinct sums
Similar idea useful for other problems?
To improve [HS(JACM'74)] it remains to `win' in the case of few sums
Thanks for listening, hope to see you at our poster!