Error-Correcting Codes and Pseudorandom Projections Luca Trevisan U.C. Berkeley
About this talk Take the input, encode it with an error-correcting code, and restrict the codeword to a (pseudo)randomly chosen subset of its bits. This approach works to construct hash functions, randomness extractors, pseudorandom generators, and more. There are several different applications of the approach, with very different analyses. If the moral of [Vadhan, 2001] is right, all of the constructed objects are essentially the same, so it is not surprising that the same approach works for all of them. Then why the differences in the analyses?
Disclaimers This talk will –be technically imprecise –lack proper credits and "historical" perspective –have an open finale rather than a happy ending
Cast of Characters Hash functions map an input to a "random" output. Randomness extractors map a "weakly random" input to a random output. Pseudorandom generators map a short random input to a long pseudorandom output. Error-correcting codes.
Hash Functions A hash function H takes an input x and a seed s and outputs H_s(x). For x ≠ y, and for a random s, H_s(x) is very likely to be different from H_s(y).
Error-correcting Codes A code C maps x to C(x): an injective map of n bits into N bits (typically, N = O(n)). If x ≠ y, then C(x) and C(y) differ in several places (typically, in Ω(N) positions).
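In symbols (my notation; Δ_H denotes Hamming distance, and the Ω(N) bound is the "typical" case mentioned above):

```latex
C : \{0,1\}^n \to \{0,1\}^N \ \text{injective}, \qquad
x \neq y \;\Longrightarrow\; \Delta_H\bigl(C(x),\,C(y)\bigr) \;\ge\; \Omega(N).
```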
Hash Functions from Error-Correcting Codes Used several times: ISW00, MV99, M98,..., GW94,... Encode x and y with C; projecting C(x) and C(y) to the positions selected by s gives H_s(x) and H_s(y).
Analysis Say that C maps n bits into N bits, and that if x ≠ y then C(x) and C(y) differ in at least N/3 places. The seed s describes a subset of the N positions of size m, and H_s(x) is C(x) "projected" to the positions in s. If x ≠ y, then Pr_s[ H_s(x) = H_s(y) ] < (2/3)^m. The seed s can be specified using about m·log n random bits (m positions, log N ≈ log n bits each).
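A minimal runnable sketch of this "encode, then project" recipe (my own toy parameters, not the talk's; a Reed-Solomon code over a small prime field stands in for the generic code C):

```python
# Toy illustration of "encode with a code, then project to a random subset of positions".
# A Reed-Solomon code over F_257 stands in for the generic code C; all parameters here
# are my own toy choices, not the talk's.
import random

P = 257     # prime field size
N = 256     # codeword length (number of evaluation points)

def rs_encode(msg, p=P, n_points=N):
    """C(x): evaluate the polynomial with coefficient vector `msg` at 0, 1, ..., n_points-1."""
    return [sum(c * pow(a, i, p) for i, c in enumerate(msg)) % p for a in range(n_points)]

def hash_s(msg, seed, m):
    """H_s(x): the codeword C(x) restricted to the m positions selected by the seed s."""
    positions = random.Random(seed).choices(range(N), k=m)   # the seed = m positions, ~ m*log N bits
    codeword = rs_encode(msg)
    return tuple(codeword[i] for i in positions)

# Two distinct 8-symbol messages agree on at most 7 of the 256 evaluation points,
# so each projected position collides with probability at most 7/256.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, 2, 3, 4, 5, 6, 7, 9]
collisions = sum(hash_s(x, s, m=4) == hash_s(y, s, m=4) for s in range(10_000))
print(collisions)   # expected to be 0 or very close to it
```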
Other Observations The projection points can be chosen along a random walk on an expander –Collision prob. 2^{-k} achievable with log n + O(k) random bits –Used in one of the hash functions of [Goldreich-Wigderson, 1994] In a RAM with division and multiplication, the error-correcting code and the projection (and so the hash function) are computable in O(1) time [Miltersen, 1998]
Extractors An extractor E maps an n-bit input x and a uniform d-bit seed s to an m-bit output E(s,x). If x is sampled from a distribution with min-entropy k, and s is uniform, then E(s,x) is almost uniform. Similar to hash functions, but we want d = O(log n).
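For reference, the standard formal statement behind this picture (my notation; the error parameter ε is left implicit on the slide):

```latex
E : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m \ \text{is a } (k,\varepsilon)\text{-extractor if,
for every } X \text{ with } H_\infty(X) \ge k, \quad
\Delta\bigl(E(U_d, X),\, U_m\bigr) \;\le\; \varepsilon,
```

where Δ denotes statistical distance and H_∞(X) = min_x log(1/Pr[X = x]).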
Pseudorandom Projections The input x is encoded as C(x); the seed s drives a pseudorandom projection of C(x), whose selected bits form the output E(s,x).
The Nisan-Wigderson projection generator The seed s is split into overlapping blocks of bits; each block specifies one projection point a_1, a_2, a_3, a_4.
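A minimal sketch of the standard polynomial-based NW design and the projection of a string it induces (toy parameters of my own; the final reduction modulo the string length is a simplification):

```python
# A toy Nisan-Wigderson design: one set S_f = {(a, f(a)) : a in F_q} for each polynomial f
# of degree <= DEG over F_q.  Each set has size q, two distinct sets intersect in at most
# DEG points, and the seed s has length q*q.  Parameters are my own toy choices.
import itertools
import random

q, DEG = 7, 2    # q must be prime here; seed length q*q = 49, number of sets q**(DEG+1) = 343

def nw_design():
    sets = []
    for coeffs in itertools.product(range(q), repeat=DEG + 1):
        f = lambda a, c=coeffs: sum(ci * pow(a, i, q) for i, ci in enumerate(c)) % q
        sets.append([a * q + f(a) for a in range(q)])   # flatten the pair (a, f(a)) to an index in [q*q]
    return sets

def nw_projection(seed_bits, encoded_x):
    """For each design set S_i, read the seed bits s|_{S_i} as a number and use it as a
    position in the encoded string; the i-th output is that bit (cf. the extractor below)."""
    out = []
    for S in nw_design():
        pos = int("".join(str(seed_bits[j]) for j in S), 2) % len(encoded_x)   # modulo is a simplification
        out.append(encoded_x[pos])
    return out

rng = random.Random(0)
s = [rng.randrange(2) for _ in range(q * q)]       # the seed
cx = [rng.randrange(2) for _ in range(2 ** 10)]    # a stand-in for the encoded input C(x)
print(len(nw_projection(s, cx)))                   # 343 projected bits
```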
Notions of almost-independence The standard notion of "almost" independence for random variables A_1,…,A_m implies: –there are conditional distributions where A_1,…,A_{m-1} are fixed, yet A_m still has high entropy In NW, A_m is determined once A_1,…,A_{m-1} are. However, in NW there are conditional distributions where A_m is completely random, yet each of A_1,…,A_{m-1} has very low entropy.
Properties Let NW(s,x) be x projected to the coordinates generated from s using the NW generator. Suppose that D is a procedure that, for a random s, distinguishes NW(s,x) from uniform. Then there is a string x' "close" to x such that: –x' has a small description given D –x' is "efficiently computable" given D and some small amount of additional information
Extractor Based on NW The input x is encoded as C(x); the seed s, fed to the NW projection generator, selects the positions of C(x) that form the output E(s,x).
Analysis If it were not a good extractor, there would be a distribution X of high min-entropy and a function D such that, for a random s, D distinguishes NW(s, C(X)) from uniform. For most (fixed) x taken from X, D would distinguish NW(s, C(x)) from uniform. For each such x, there is an x' close to C(x) with a small description. But if X has high min-entropy, then with high probability C(X) is not close to any string of small description complexity. Contradiction, so it is a good extractor.
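One way to make the final counting step precise (my notation; the distance parameter and the list size L of the code are left implicit in the talk):

```latex
\#\{\text{strings with a description of } \le t \text{ bits}\} \;\le\; 2^{t+1}
\quad\Longrightarrow\quad
\Pr_{x \sim X}\bigl[\,C(x)\ \text{is close to some such string}\,\bigr]
\;\le\; 2^{t+1} \cdot L \cdot 2^{-k},
```

which is much smaller than 1 whenever k ≫ t + log L; here L bounds how many inputs x can have C(x) close to any one fixed string.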
B.B. (black-box) Pseudorandom Generators A generator G_f takes the description of a function f of high circuit complexity and a uniform d-bit seed s, and outputs the m-bit pseudorandom string G_f(s). If f has high circuit complexity, and s is uniform, then G_f(s) is indistinguishable from uniform. Similar to extractors, but with computational requirements.
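In symbols (the circuit-size bound S and the error ε are placeholders of mine, left implicit on the slide):

```latex
G_f : \{0,1\}^d \to \{0,1\}^m, \qquad
\Bigl|\; \Pr_{s \sim U_d}\bigl[D(G_f(s)) = 1\bigr] \;-\; \Pr_{r \sim U_m}\bigl[D(r) = 1\bigr] \;\Bigr| \;\le\; \varepsilon
\quad \text{for every circuit } D \text{ of size at most } S,
```

provided f has no circuits of size S', for a suitable S' related to S.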
A Construction Encode the truth-table of the function f using an error-correcting code based on multivariate polynomials. Project the encoded truth-table to a subset of entries chosen using the seed s and the NW projection generator. Essentially the same as the extractor seen before. Analysis: –Need the error-correcting code to have a "sub-linear time list-decoding" procedure [STV99] –Need the computational version of the analysis of the NW projection generator
Notes Things were discovered in reverse order (pseudorandom generator first, extractor later) In original proof [Impagliazzo-Wigderson, 1997], encoding of f not presented as a good error-correcting code (and analysis does not use list-decoding)
Fully Algebraic Construction? In the NW-based extractor, and in a possible implementation of the NW-based pseudorandom generator: –The input x (resp., the function f) is encoded as a multivariate polynomial p –The seed s is used to generate points a_1,…,a_m –The output is p(a_1),…,p(a_m) [up to minor cheating] There is no algebraic meaning to a_1,…,a_m. How about taking a_1,…,a_m on a random line? (See the sketch below.)
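A runnable sketch of the objects in this question (my own toy parameters): encode x by its multilinear extension over a small prime field, then evaluate the encoding along a random, non-axis-parallel line. This only illustrates the question; it is not any of the cited constructions.

```python
# Encode x as a multivariate polynomial (its multilinear extension) and evaluate it
# along a random line in F_P^v.  Parameters are my own toy choices.
import random

P = 101    # prime field size (assumption)

def multilinear_extension(x_bits, z):
    """p(z) = sum over w in {0,1}^v of x[w] * prod_i (z_i if w_i = 1 else 1 - z_i), mod P.
    p agrees with x on {0,1}^v and has degree at most 1 in each variable."""
    v = len(z)
    assert len(x_bits) == 2 ** v
    total = 0
    for w in range(2 ** v):
        term = x_bits[w]
        for i in range(v):
            term = term * (z[i] if (w >> i) & 1 else (1 - z[i])) % P
        total = (total + term) % P
    return total

def values_on_random_line(x_bits, v, rng):
    """Pick a random line l(t) = a + t*b in F_P^v and return p(l(0)), ..., p(l(P-1))."""
    a = [rng.randrange(P) for _ in range(v)]
    b = [rng.randrange(P) for _ in range(v)]
    return [multilinear_extension(x_bits, [(a[i] + t * b[i]) % P for i in range(v)])
            for t in range(P)]

rng = random.Random(0)
x = [rng.randrange(2) for _ in range(2 ** 4)]       # a 16-bit input; v = 4 variables
line_values = values_on_random_line(x, 4, rng)      # the candidate "projection" of x
print(line_values[:8])
```

Restricted to a line, p becomes a univariate polynomial in t of degree at most v, which is what makes line restrictions algebraically natural.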
Miltersen-Vinodchandran Encode the function f as a multivariate polynomial p. Use the seed s to pick an axis-parallel line. Output the values of p restricted to the line. Does not give an extractor or a pseudorandom generator, but (with some more machinery) gives a good hitting-set generator. The analysis uses the observation that a random projection of a code is a good hash function.
Ta-Shma-Zuckerman-Safra Encode the input x as a multivariate polynomial p. Use the seed s to select an axis-parallel line and a starting point on the line. Output the values of p on a few consecutive points of the line, beginning with the starting point. Gives a good extractor. The analysis has the same high-level structure as the analysis of the NW-based extractor: a distinguisher implies a short description for x. Note: the short description is not computationally efficient; the construction does not imply a p.r.g.
Shaltiel-Umans Encode the input x as a multivariate polynomial p over F^d. Use the seed s to pick a generator g of F^d (i.e., of its multiplicative group). Evaluate p on g, g^2,... A distinguisher implies that x has a (computationally efficient) short description. Gives extractors and p.r.g.s; performance as good as the best optimized previous constructions.
Conclusions? Which choices of pseudorandom projections are good for turning error-correcting codes into extractors / pseudorandom generators, and why? Do good extractors / PRGs follow from encoding with multivariate polynomials and projecting onto parts of a random (non-axis-parallel) line? The NW projections give extractors using any error-correcting code. Are there alternative methods with the same generality?