Cell-Probe Lower Bounds for Succinct Partial Sums. Mihai Pătrașcu, Emanuele Viola.


Cell-Probe Lower Bounds for Succinct Partial Sums. Mihai Pătrașcu, Emanuele Viola

Succinct Data Structures
Given some input of N bits => build a data structure of close to N bits that answers useful queries.
Why?
Practice: functional data compression. You often want to query compressed data, so the data structure on top had better be small, too.
Theory: algorithmic ideas with a nice information-theory flavor: N + o(N), N + O(N/lg N), N + O(√N), …

Interesting Upper Bounds
[Dodis, P, Thorup ’10] Store a vector A[1..n] from alphabet Σ
– Space ⌈n log₂ |Σ|⌉ bits
– Constant time to read or write A[i]
[P, FOCS’08] Store a vector A[1..n] of bits
– Query time O(t) for RANK(k) := A[1] + … + A[k]
– Space n + n/(w/t)^t

Rank / Select
RANK(k) = A[1] + … + A[k]
SELECT(k) = index of the k-th one in A[1..n]
A staple of succinct data structures.
Example: representing trees succinctly. [figure: a binary tree on nodes A–F and its bit-vector encoding]
LeftChild(k) = 2 · RANK(k)
RightChild(k) = 2 · RANK(k) + 1
Parent(k) = SELECT(⌊k/2⌋)
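These operations, and the navigation formulas quoted on the slide, can be sketched in a few lines of naive Python. This is illustration only: everything below runs in linear time and stores nothing extra, whereas the structures under discussion answer RANK/SELECT in O(1) time with o(n) extra bits.

```python
# Naive (non-succinct) RANK/SELECT over a bit array, plus the
# tree-navigation formulas from the slide. A is a Python list whose
# element A[0] plays the role of A[1] on the slide (1-based positions).

def rank(A, k):
    """RANK(k) = A[1] + ... + A[k]."""
    return sum(A[:k])

def select(A, k):
    """SELECT(k) = 1-based index of the k-th one in A[1..n]."""
    count = 0
    for i, bit in enumerate(A, start=1):
        count += bit
        if count == k:
            return i
    raise ValueError("fewer than k ones in A")

# Navigation formulas as stated on the slide (k is a 1-based position):
def left_child(A, k):
    return 2 * rank(A, k)

def right_child(A, k):
    return 2 * rank(A, k) + 1

def parent(A, k):
    return select(A, k // 2)
```

For example, with A = [1, 0, 1, 1, 0, 1], rank(A, 4) = 3 and select(A, 3) = 4.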

Interesting Lower Bounds
[Gál, Miltersen ’03] polynomial evaluation => redundancy · query time ≥ Ω(n)
=> nobody really expects a succinct solution
[Golynski SODA ’09] store a permutation and query π(·), π⁻¹(·)
With space 2n lg n, query time is 1
If space is (1 + ε) n lg n => query time is Ω(1/√ε)
[HERE] RANK / SELECT
For query time t => space ≥ n + n/w^O(t)
NB: upper bound was n + n/(w/t)^Θ(t)

Models
“Standard Model” (a.k.a. Word RAM)
memory = words of w bits
w ≥ lg n (to store pointers, indices)
constant-time ops: read/write memory, +, -, *, /, %, ==, >>, <<, &, |, ^
Lower bounds: “Cell-Probe Model”
memory = words (cells) of w bits
time = # cell reads/writes
Nice: information-theoretic; holds even with exotic instructions

The Lower Bound Proof

Step 1: Understand the Upper Bound
[figure: the array A[1..n] is split into blocks of B elements; block i has sum Σᵢ, and Σ* is the total. Store Σ* plus H(Σ₁, …, Σ_B | Σ*) bits, from which each Σᵢ can be decoded.]

Recursion
[figure: a tree of partial sums ∑ over the blocks; the sums at the roots are stored non-succinctly]
Spend constant time per node => decode Σᵢ from Σ* and the H(Σ₁, …, Σ_B | Σ*) memory bits
Set parameters such that H(Σ₁, …, Σ_B | Σ*) = O(w) => B ≈ w
Can go t levels up => redundancy ≈ n / w^t
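One level of this scheme can be illustrated with a plain block decomposition. This is a hedged sketch: here the block sums are kept as uncompressed prefix sums, whereas the actual structure stores them in H(Σ₁, …, Σ_B | Σ*) bits and recurses for t levels to reach redundancy ≈ n/w^t.

```python
# One level of the block decomposition: keep the prefix sum at every
# block boundary; RANK(k) is then one lookup plus a scan of one block.
# Space overhead of this uncompressed version is ~ (n/B) words for the
# boundary array; the real structure compresses these sums and recurses.

def build_boundaries(A, B):
    """boundary[j] = A[1] + ... + A[j*B] (prefix sum at each block boundary)."""
    boundary = [0]
    for j in range(0, len(A), B):
        boundary.append(boundary[-1] + sum(A[j:j + B]))
    return boundary

def rank(A, boundary, B, k):
    """RANK(k) = A[1] + ... + A[k], via one boundary lookup + in-block scan."""
    j = k // B
    return boundary[j] + sum(A[j * B:k])
```

With B ≈ w, the in-block scan touches O(1) machine words when the bits are packed, which is where the constant time per level comes from.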

Published Bits
For the induction, use a stronger model:
memory = cells of w bits; cost 1 to read a cell
P published bits, free to read (= cache)
Initially, set P = redundancy -> publish some arbitrary (!) P bits
Intuitively, think of published bits ≈ sums at the roots of the subtrees

Published Bits Die by Direct Sum
Do published bits trivialize the problem? No.
Consider T = 100·P subproblems.
For the average subproblem, the published bits carry only 0.01 bits of information => can’t really help
[figure: A[1..n] split into Problem 1, …, Problem T]

Recursion Intuition
Intuition: { first cells probed by Q₀ } = { first cells probed by Q_k } = the roots of the trees
So just publish { first cells probed by Q₀ } => get rid of 1 probe for all queries
Upper Bound: remove the top level of the trees => t -= 1, redundancy *= w
Lower Bound (Main Lemma): remove one cell probe, increase P *= O(w)
[figure: queries Q₀ and Q₁ over A[1..n]]

Careful with Bad Algorithms!
Intuition: { first cells probed by Q₀ } = { first cells probed by Q_k } = the roots of the trees
So just publish { first cells probed by Q₀ } => get rid of 1 probe for all queries
Upper Bound: remove the top level of the trees => t -= 1, redundancy *= w
Lower Bound (Main Lemma): remove one cell probe, increase P *= O(w)
But we can’t make an argument based on the first probe!

Intuition Fix
Footprint: Foot(Q) = { cells probed by Q }
Intuition: |Foot(Q₀) ∩ Foot(Q_k)| = Ω(P)
So just publish Foot(Q₀) => get rid of 1 probe for most queries
Upper Bound: remove the top level of the trees => t -= 1, redundancy *= w
Lower Bound (Main Lemma): remove one cell probe, increase P *= O(poly(w))

Why Are Cells Reused?
Suppose, for contradiction, that |Foot(Q₀) ∩ Foot(Q_k)| = o(P)
So the answers to most of Q_k can be decoded from Foot(Q₀) [details ignored in this talk]
But Answers(Q₀) and Answers(Q_k) are highly correlated:
H(Ans(Q₀)) + H(Ans(Q_k)) >> H(Ans(Q₀), Ans(Q_k))
So the data structure is an inefficient encoding (non-succinct).

Entropy Computation
H(Ans(Q₀)) ≈ H(Ans(Q_k)) ≈ T · H(binomial on n/T trials) ≈ T · c lg(n/T), for some constant c
H(Ans(Q₀), Ans(Q_k)) = H(Ans(Q₀)) + H(Ans(Q_k) | Ans(Q₀))
  ≈ T · c lg(n/T) + T · H(binomial on k trials)
  ≈ T · c lg(n/T) + T · c lg k
  < T · c lg(n/T) + T · c lg(n/(2T))
  = H(Ans(Q₀)) + H(Ans(Q_k)) − Ω(T)
For k in the “first half”; otherwise symmetric.
[figure: queries Q₀ and Q_k over A[1..n], split into Problem 1, …, Problem T]
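The step H(binomial on m trials) ≈ c·lg m can be sanity-checked numerically. For uniform random bits c = 1/2: the entropy of Binomial(m, 1/2) is ½·lg m + O(1), so quadrupling m should add about ½·lg 4 = 1 bit. A small sketch:

```python
import math

def binom_entropy(m, p=0.5):
    """Shannon entropy, in bits, of a Binomial(m, p) random variable."""
    h = 0.0
    for i in range(m + 1):
        q = math.comb(m, i) * p ** i * (1 - p) ** (m - i)
        if q > 0:
            h -= q * math.log2(q)
    return h

# Entropy grows like (1/2) lg m: going from m to 4m adds roughly 1 bit.
gap = binom_entropy(400) - binom_entropy(100)
```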

Proof by Encoding
Claim: we can encode A with fewer than n bits (impossible).
ΔH = [H(Ans(Q₀)) + H(Ans(Q_k))] − H(Ans(Q₀), Ans(Q_k)) = Ω(T)
Write down … bits required:
– Published bits: P
– Answers to Q₀, Q_k: H(Ans(Q₀), Ans(Q_k))
– Cells in Foot(Q₀): w·|Foot(Q₀)| − H(Ans(Q₀))   [using H(X | f(X)) = H(X) − H(f(X))]
– Other cells: n − w·|Foot(Q₀)| − H(Ans(Q_k))
– TOTAL: n + P − ΔH
Contradiction for T >> P.

The End