Sublinear-Time Error-Correction and Error-Detection Luca Trevisan U.C. Berkeley luca@eecs.berkeley.edu
Contents Survey of results on error-correcting codes with sub-linear time checking and decoding procedures. Results originated in complexity theory.
Error-correction
Error-detection
Minimum Distance
Ideally: constant information rate, linear minimum distance, very efficient decoding. Sipser-Spielman: linear time deterministic procedure.
Sub-linear time decoding? Must be probabilistic. Must have some probability of incorrect decoding. Even so, is it possible?
Motivations & Context Sub-linear time decoding is useful for worst-case to average-case reductions, and in information-theoretic Private Information Retrieval. Sub-linear time checking arises in PCP. Useful in practice?
Error-correction
Hadamard Code
Example is… Encoding of… 1
“Constant time” decoding
Analysis
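The decoding procedure and its analysis did not survive in this text, so the following is only a minimal sketch of the standard two-query Hadamard decoder, not necessarily the exact procedure on the slides. It assumes the message x is a list of n bits and that position a of the codeword holds the inner product of a and x mod 2. The function names are hypothetical. Each query is uniformly distributed, so if a fraction delta of the codeword is corrupted, a single trial errs with probability at most 2*delta.

```python
import random

def hadamard_encode(x):
    """Return the 2^n-bit Hadamard encoding of bit list x: entry a is <a, x> mod 2."""
    n = len(x)
    msg = sum(bit << i for i, bit in enumerate(x))
    return [bin(a & msg).count("1") % 2 for a in range(2 ** n)]

def decode_bit(word, i, rng=random):
    """Estimate x_i with two queries: word[r] XOR word[r + e_i] for a random r."""
    r = rng.randrange(len(word))
    return word[r] ^ word[r ^ (1 << i)]

def decode_bit_majority(word, i, trials=31, rng=random):
    """Repeat the two-query decoder and take a majority vote (assumed amplification step)."""
    votes = sum(decode_bit(word, i, rng) for _ in range(trials))
    return int(2 * votes > trials)
```

On an uncorrupted codeword every choice of r returns x_i exactly; with one corrupted position out of 2^n, exactly two choices of r give the wrong answer.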
A Lower Bound If the code is linear, the alphabet is small, and the decoding procedure uses two queries, then exponential encoding length is necessary. Goldreich-Trevisan, Samorodnitsky
More trade-offs For k queries and binary alphabet: More complicated formulas for bigger alphabet
Construction without polynomials
Construction with polynomials View message as polynomial p: F^k -> F of degree d (F is a field, |F| >> d). Encode message by evaluating p at all |F|^k points. To encode an n-bit message, can have |F| polynomial in n, and d, k around (log n)^O(1).
To reconstruct p(x) Pick a random line in F^k passing through x; evaluate p on d+1 points of the line; by interpolation, find the degree-d univariate polynomial that agrees with p on the line. Use the interpolating polynomial to estimate p(x). Algorithm reads p in d+1 points, each uniformly distributed. Beaver-Feigenbaum; Lipton; Gemmell-Lipton-Rubinfeld-Sudan-Wigderson
[Figure: the points x, x+y, x+2y, ..., x+(d+1)y on a random line through x]
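The reconstruction step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the field is the integers mod a hypothetical prime P = 101, the polynomial is stored as a dictionary from exponent tuples to coefficients, and `oracle` stands for query access to the (possibly corrupted) encoding. The query points are x + t*y for t = 1, ..., d+1, so d+1 < P is required.

```python
import random

P = 101  # hypothetical prime field size; must exceed d + 1

def eval_poly(coeffs, point):
    """Evaluate a multivariate polynomial {exponent tuple: coefficient} at point, mod P."""
    total = 0
    for exps, c in coeffs.items():
        term = c
        for v, e in zip(point, exps):
            term = term * pow(v, e, P) % P
        total = (total + term) % P
    return total

def interpolate_at_zero(ts, vals):
    """Lagrange-interpolate the points (ts[j], vals[j]) mod P and return the value at t = 0."""
    result = 0
    for j, (tj, vj) in enumerate(zip(ts, vals)):
        num, den = 1, 1
        for m, tm in enumerate(ts):
            if m != j:
                num = num * (-tm) % P
                den = den * (tj - tm) % P
        result = (result + vj * num * pow(den, -1, P)) % P
    return result

def local_decode(oracle, x, d, rng=random):
    """Estimate p(x) from d+1 queries on a random line through x."""
    y = [rng.randrange(P) for _ in x]          # random direction of the line
    ts = list(range(1, d + 2))                 # t = 1, ..., d+1 (never t = 0)
    vals = [oracle(tuple((xi + t * yi) % P for xi, yi in zip(x, y))) for t in ts]
    return interpolate_at_zero(ts, vals)
```

The point x itself (t = 0) is never queried, which is why each individual query is uniformly distributed over F^k. The modular inverse `pow(den, -1, P)` needs Python 3.8 or later.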
Error-detection
Checking polynomial codes Consider encoding with multivariate low-degree polynomials. Given p, pick random z, do the decoding for p(z), compare with actual value of p(z). “Simple” case of low-degree test. Rejection probability proportional to distance from code. Rubinfeld-Sudan
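A minimal sketch of this check, assuming the word to be tested is given as a full table (a dictionary from points of F^k to values) over a hypothetical small prime field with P = 7. The helper names are illustrative. One run interpolates the table along a random line through z and compares the extrapolated value with the stored value at z.

```python
import random

P = 7  # hypothetical small prime field; must exceed d + 1

def predict_from_line(table, z, y, d):
    """Interpolate table on z + t*y for t = 1..d+1 and extrapolate to t = 0."""
    ts = range(1, d + 2)
    pred = 0
    for tj in ts:
        point = tuple((zi + tj * yi) % P for zi, yi in zip(z, y))
        weight = 1  # Lagrange basis polynomial for tj, evaluated at t = 0
        for tm in ts:
            if tm != tj:
                weight = weight * (-tm) * pow((tj - tm) % P, -1, P) % P
        pred = (pred + table[point] * weight) % P
    return pred

def low_degree_check(table, k, d, rng=random):
    """One run of the check: accept iff the decoded value at a random z matches table[z]."""
    z = tuple(rng.randrange(P) for _ in range(k))
    y = tuple(rng.randrange(P) for _ in range(k))
    return predict_from_line(table, z, y, d) == table[z]
```

A table that really is a polynomial of degree at most d passes for every choice of z and y. A table far from every such polynomial fails on some choices, and the check is repeated to amplify the rejection probability.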
Bivariate Code A degree-d bivariate polynomial p: F x F -> F can be represented as 2|F| univariate degree-d polynomials (the “rows” and the “columns”). Example: p(x,y) = 2x^2 + xy + y^2 + 1 mod 5. Fixing x = 0, 1, 2, 3, 4 gives y^2+1, y^2+y+3, y^2+2y+4, y^2+3y+4, y^2+4y+3; fixing y = 0, 1, 2, 3, 4 gives 2x^2+1, 2x^2+x+2, 2x^2+2x, 2x^2+3x, 2x^2+4x+2.
Bivariate Low-Degree Test Pick a random row and a random column. Check that they agree on the intersection. If |F| is a constant factor bigger than d, then rejection probability is proportional to distance from code. Arora-Safra, ALMSS, Polishchuk-Spielman
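The mod-5 example and the row/column test can be worked through in a few lines. This is a sketch under the assumption that each univariate polynomial is stored as a coefficient tuple (c0, c1, c2) and that the restrictions are indexed by the fixed coordinate; the names `rows`, `cols` and `row_column_test` are illustrative.

```python
import random

Q = 5  # field size from the example

def p(x, y):
    """The example polynomial 2x^2 + xy + y^2 + 1 mod 5."""
    return (2 * x * x + x * y + y * y + 1) % Q

# Restriction for a fixed x: coefficients of c0 + c1*y + c2*y^2.
rows = {x: ((2 * x * x + 1) % Q, x % Q, 1) for x in range(Q)}
# Restriction for a fixed y: coefficients of c0 + c1*x + c2*x^2.
cols = {y: ((y * y + 1) % Q, y % Q, 2) for y in range(Q)}

def eval_univariate(coeffs, t):
    """Evaluate c0 + c1*t + c2*t^2 + ... mod Q."""
    return sum(c * t ** i for i, c in enumerate(coeffs)) % Q

def row_column_test(rows, cols, rng=random):
    """Pick a random row and column; accept iff they agree where they intersect."""
    x, y = rng.randrange(Q), rng.randrange(Q)
    return eval_univariate(rows[x], y) == eval_univariate(cols[y], x)
```

For instance, `rows[2]` is (4, 2, 1), that is y^2+2y+4, and `cols[4]` is (2, 4, 2), that is 2x^2+4x+2. Replacing a single row by a different polynomial makes the test fail on some of the intersections involving that row.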
Efficiency of Decoding vs Checking
Tensor Product Codes Suppose we have a linear code C with codewords in {0,1}^m. Define a new code C’ with codewords in {0,1}^(m x m); a “matrix” is a codeword of C’ if each row and each column is a codeword of C. If C has lots of codewords and large minimum distance, the same is true for C’.
Generalization of the Bivariate Low Degree Test Suppose C has K codewords. Define code C’’ over alphabet [K], with codewords of length 2m. C’’ has as many codewords as C’. For each codeword y of C’, the corresponding codeword in C’’ contains the value of each row and each column of y. Test: pick a random “row” and a random “column”, check that they agree on the intersection. Analysis?
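A toy instance of C’, C’’ and the row/column test, as a sketch only. The base code C is assumed here to be the even-parity code of length m (an illustrative choice, not one made in the slides), and a symbol of the alphabet [K] is represented directly as the tuple of bits of a codeword of C. All function names are hypothetical.

```python
import random
from itertools import product

def in_parity_code(word):
    """Membership in the assumed base code C: binary words of even weight."""
    return sum(word) % 2 == 0

def in_tensor_code(matrix, in_code=in_parity_code):
    """A matrix is in C' iff every row and every column is a codeword of C."""
    return all(in_code(r) for r in matrix) and all(in_code(c) for c in zip(*matrix))

def tensor_codewords(m, in_code=in_parity_code):
    """Enumerate all m x m binary matrices in C' by brute force (tiny m only)."""
    for bits in product([0, 1], repeat=m * m):
        matrix = [bits[r * m:(r + 1) * m] for r in range(m)]
        if in_tensor_code(matrix, in_code):
            yield matrix

def row_column_views(matrix):
    """Codeword of C'': the m row symbols followed by the m column symbols."""
    return [tuple(r) for r in matrix] + [tuple(c) for c in zip(*matrix)]

def intersection_test(views, m, in_code=in_parity_code, rng=random):
    """Pick a random row symbol and column symbol; accept iff they agree at the intersection."""
    i, j = rng.randrange(m), rng.randrange(m)
    row, col = views[i], views[m + j]
    # The membership checks stand in for the restriction of symbols to the alphabet [K].
    return in_code(row) and in_code(col) and row[j] == col[i]
```

With m = 3 the base code has 4 codewords and minimum distance 2, and the brute-force enumeration finds 16 codewords of C’ with minimum distance 4, matching the claim that a good C yields a good C’.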
Negative Results? No known lower bound for locally checkable codes. Possible to get encoding length n^(1+o(1)) and checking with O(1) queries and {0,1} alphabet? Possible to get encoding length O(n) with O(1) queries and small alphabet?
Applications? Better locally decodable codes have applications to PIR. General/simple analysis of checkable proofs could have application to PCP (linear-length PCP, simple proof of the PCP theorem). Applications to the practice of fault-tolerant data storage/transmission?