Sublinear-Time Error-Correction and Error-Detection Luca Trevisan U.C. Berkeley
Contents: Survey of results on error-correcting codes with sub-linear time checking and decoding procedures. Most of the results were not proved by the speaker. Some of the results have not yet been proved by anybody.
Minimum Distance
Ideally: constant information rate; linear minimum distance; very efficient decoding. Sipser-Spielman: linear-time deterministic procedure.
Sub-linear time decoding? Must be probabilistic. Must have some probability of incorrect decoding. Even so, is it possible?
Reasons to be interested: Sub-linear time decoding is useful for worst-case to average-case reductions and in information-theoretic Private Information Retrieval. Sub-linear time checking arises in PCP. Useful in practice?
Hadamard Code
“Constant time” decoding
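A minimal sketch of the standard two-query local decoder for the Hadamard code, assuming this is the procedure the slide title refers to. The encoding of x lists the inner products <a, x> mod 2 for every a in {0,1}^n, and bit x_i is recovered as word[a] xor word[a xor e_i] for a random a. The function names are illustrative, not taken from the talk.

```python
import random

def hadamard_encode(x_bits):
    """Encode n bits as the 2^n inner products <a, x> mod 2, indexed by a."""
    n = len(x_bits)
    x = sum(b << i for i, b in enumerate(x_bits))
    return [bin(a & x).count("1") % 2 for a in range(1 << n)]

def two_query_decode(word, i, a):
    """Guess x_i from two positions: word[a] xor word[a xor e_i]."""
    return word[a] ^ word[a ^ (1 << i)]

def local_decode_bit(word, n, i, rng):
    """Randomized decoder: pick a uniformly and make the two queries."""
    a = rng.randrange(1 << n)
    return two_query_decode(word, i, a)

if __name__ == "__main__":
    x_bits = [1, 0, 1, 1]
    word = hadamard_encode(x_bits)
    word[5] ^= 1  # corrupt a single position
    rng = random.Random(0)
    print([local_decode_bit(word, len(x_bits), i, rng) for i in range(len(x_bits))])
```

Each of the two queries is uniformly distributed on its own, so if a fraction delta of the positions is corrupted, the guess is wrong with probability at most 2*delta. The number of queries does not depend on n, but the encoding length is 2^n.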
A Lower Bound: If the code is linear, the alphabet is small, and the decoding procedure uses two queries, then exponential encoding length is necessary (Goldreich-Trevisan, Samorodnitsky).
More trade-offs For k queries and binary alphabet: More complicated formulas for bigger alphabet
Construction without polynomials
Negative result 1: Suppose C:{0,1}^n -> {0,1}^m is a code with a decoding procedure that reads only k bits of the corrupted encoding. Pick a random x, compute C(x), project C(x) onto m^{(k-1)/k} coordinates, and prove that the projection still contains Ω(n) bits of information about x. A string of m^{(k-1)/k} bits carries at most that many bits of information, so it must be that m = Ω(n^{k/(k-1)}) (Katz-Trevisan).
Negative Result 2: Suppose C:{0,1}^n -> {0,1}^m is a linear code with a decoding procedure that reads only 2 bits of the corrupted encoding. Then there are vectors a_1,…,a_m in {0,1}^n such that for each i=1,…,n there are Ω(m) disjoint pairs j1,j2 such that a_{j1} xor a_{j2} = e_i. Then it must be that m = exp(Ω(n)) (Goldreich-Trevisan, Samorodnitsky).
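The pair condition above can be checked by brute force on small examples. The sketch below is only an illustration, not part of the lower-bound proof. It encodes each vector a_j as a Python integer bitmask and assumes the vectors are distinct, so every index has at most one partner and the pairs found are automatically disjoint. The function name disjoint_pairs is hypothetical.

```python
def disjoint_pairs(vectors, i):
    """List pairs (j1, j2), j1 < j2, with vectors[j1] xor vectors[j2] == e_i.

    Assumes the vectors are distinct integers (bitmasks), so each index
    appears in at most one pair.
    """
    index = {v: j for j, v in enumerate(vectors)}
    e_i = 1 << i
    pairs = []
    for j, v in enumerate(vectors):
        k = index.get(v ^ e_i)
        if k is not None and j < k:
            pairs.append((j, k))
    return pairs

if __name__ == "__main__":
    n = 4
    hadamard_vectors = list(range(1 << n))  # one vector per a in {0,1}^n
    for i in range(n):
        print(i, len(disjoint_pairs(hadamard_vectors, i)))
```

For the Hadamard code, where the vectors are all of {0,1}^n and m = 2^n, every i has m/2 disjoint pairs, so the condition holds with the exponential length that the bound says is unavoidable.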
Checking polynomial codes: Consider encoding with multivariate low-degree polynomials. Given p, pick a random z, run the decoding procedure for p(z), and compare the result with the actual value of p(z). “Simple” case of low-degree test. Rejection probability is proportional to the distance from the code (Rubinfeld-Sudan).
Bivariate Low Degree Test: A degree-d bivariate polynomial p: F x F -> F is represented as 2|F| elements of F^d (the univariate polynomial q_a(y) = p(a,y) for each a, and the univariate polynomial r_b(x) = p(x,b) for each b). Test: pick random a and b, read q_a and r_b, and check that q_a(b) = r_b(a).
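The row/column test can be sketched in a few lines over a small prime field. The field size P = 13, the coefficient layout c[i][j] (coefficient of x^i y^j), and all function names are assumptions made for illustration. Each row and column polynomial is stored as a list of coefficients.

```python
import random

P = 13  # hypothetical small prime; the field is the integers mod P

def eval_poly(coeffs, t):
    """Horner evaluation of sum_k coeffs[k] * t^k mod P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * t + c) % P
    return acc

def encode_rows_cols(c):
    """c[i][j] is the coefficient of x^i y^j. Returns (rows, cols) where
    rows[a] holds the coefficients of q_a(y) = p(a, y) and
    cols[b] holds the coefficients of r_b(x) = p(x, b)."""
    dx, dy = len(c), len(c[0])
    rows = [[sum(c[i][j] * pow(a, i, P) for i in range(dx)) % P
             for j in range(dy)] for a in range(P)]
    cols = [[sum(c[i][j] * pow(b, j, P) for j in range(dy)) % P
             for i in range(dx)] for b in range(P)]
    return rows, cols

def low_degree_test(rows, cols, rng):
    """One round: random a, b; accept iff q_a(b) == r_b(a)."""
    a, b = rng.randrange(P), rng.randrange(P)
    return eval_poly(rows[a], b) == eval_poly(cols[b], a)

def count_rejecting_pairs(rows, cols):
    """Number of (a, b) pairs on which the test rejects."""
    return sum(eval_poly(rows[a], b) != eval_poly(cols[b], a)
               for a in range(P) for b in range(P))

if __name__ == "__main__":
    rows, cols = encode_rows_cols([[1, 2, 0], [3, 0, 4], [0, 5, 6]])
    rows[3][0] = (rows[3][0] + 1) % P  # corrupt one row polynomial
    print(count_rejecting_pairs(rows, cols), "of", P * P, "pairs reject")
```

Shifting the single row polynomial q_3 by a constant makes the test reject on every pair (3, b), which is a 1/P fraction of all pairs. This matches the fraction of corrupted symbols in this toy case only. It says nothing about the general analysis.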
Analysis: If |F| is a constant factor bigger than d, then the rejection probability is proportional to the distance from the code (Arora-Safra, ALMSS, Polishchuk-Spielman).
Efficiency of Decoding vs Checking
Tensor Product Codes: Suppose we have a linear code C with codewords in {0,1}^m. Define a new code C’ with codewords in {0,1}^(m x m): a “matrix” is a codeword of C’ if each row and each column is a codeword of C. If C has many codewords and large minimum distance, the same is true of C’.
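A small worked instance of the tensor product construction, as a sketch. The base code here is the length-3 parity code (4 codewords, minimum distance 2), chosen only for illustration. The generator matrix G and all function names are hypothetical.

```python
from itertools import product

# Hypothetical base code C: the length-3 even-weight (parity) code.
G = [[1, 0, 1],
     [0, 1, 1]]

def encode(msg, G):
    """Multiply the message row vector by the generator matrix over GF(2)."""
    return tuple(sum(msg[i] * G[i][j] for i in range(len(G))) % 2
                 for j in range(len(G[0])))

def codewords(G):
    """All codewords of the linear code generated by G."""
    return {encode(msg, G) for msg in product([0, 1], repeat=len(G))}

def tensor_encode(X, G):
    """Encode a k x k message matrix: encode each row, then each column."""
    k, m = len(G), len(G[0])
    rows = [encode(r, G) for r in X]                       # k x m
    cols = [encode([rows[i][j] for i in range(k)], G)      # m columns,
            for j in range(m)]                             # each of length m
    return [[cols[j][i] for j in range(m)] for i in range(m)]

def in_tensor_code(Y, C):
    """Check that every row and every column of Y is a codeword of C."""
    return (all(tuple(r) in C for r in Y)
            and all(tuple(col) in C for col in zip(*Y)))

def min_weight(words):
    """Minimum nonzero weight, which equals minimum distance for a linear code."""
    return min(sum(w) for w in words if any(w))

if __name__ == "__main__":
    C = codewords(G)
    Y = tensor_encode([[1, 0], [0, 0]], G)
    print(Y, in_tensor_code(Y, C))
```

In this instance C’ has 2^4 = 16 codewords of length 9 and minimum distance 4. In general, a base code with 2^k codewords and distance d gives a tensor code with 2^(k^2) codewords and distance d^2.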
Generalization of the Bivariate Low Degree Test: Suppose C has K codewords. Define a code C’’ over alphabet [K], with codewords of length 2m. C’’ has as many codewords as C’. For each codeword y of C’, the corresponding codeword of C’’ contains the value of each row and each column of y. Test: pick a random “row” and a random “column”, and check that they agree at their intersection. Analysis?
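The test itself is easy to state in code, even though its analysis is left open here. The sketch below represents a word of C’’ as a list of m row symbols followed by a list of m column symbols, each symbol being a full vector of length m. This representation and the function names are assumptions for illustration.

```python
import random

def to_row_col_word(Y):
    """Turn an m x m matrix into its 2m symbols: m rows, then m columns."""
    rows = [tuple(r) for r in Y]
    cols = [tuple(c) for c in zip(*Y)]
    return rows, cols

def row_col_test(rows, cols, rng):
    """One round: random row i and column j; accept iff they agree at (i, j)."""
    i = rng.randrange(len(rows))
    j = rng.randrange(len(cols))
    return rows[i][j] == cols[j][i]

def count_rejecting_pairs(rows, cols):
    """Number of (i, j) pairs on which the test rejects."""
    return sum(rows[i][j] != cols[j][i]
               for i in range(len(rows)) for j in range(len(cols)))

if __name__ == "__main__":
    # Every row and column of Y has even weight (length-3 parity code).
    Y = [[1, 1, 0],
         [1, 1, 0],
         [0, 0, 0]]
    rows, cols = to_row_col_word(Y)
    rows[0] = (0, 1, 1)  # replace one row symbol by a different codeword
    print(count_rejecting_pairs(rows, cols))
```

Replacing one row symbol by a different codeword of C makes the test reject on exactly as many pairs as the two codewords differ, which is at least the minimum distance of C. This accounts for a single corrupted symbol only, not for arbitrary corruptions.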
Negative Results? No known lower bound for locally checkable codes. Is it possible to get encoding length n^(1+o(1)) with O(1)-query checking over the {0,1} alphabet? Is it possible to get encoding length O(n) with O(1) queries and a small alphabet?
Applications? Better locally decodable codes have applications to PIR. A general or simple analysis of checkable proofs could have applications to PCP (linear-length PCP, a simple proof of the PCP theorem). Applications to the practice of fault-tolerant data storage/transmission?