New extractors and condensers from Parvaresh-Vardy codes. Amnon Ta-Shma, Tel-Aviv University. Joint work with Chris Umans (Caltech).
Plan
- The problem
- The GUV condenser
- Our variant of the GUV condenser
- Concluding remarks
An extractor is a hash function $E: \{0,1\}^n \times \{0,1\}^t \to \{0,1\}^m$
- Input: $f \in \{0,1\}^n$
- Seed: $y \in \{0,1\}^t$
- Output: $E(f,y) \in \{0,1\}^m$
- With the property that, for every $X \subseteq \{0,1\}^n$ of size $2^k$: $E(X, U_t) \approx_\varepsilon U_m$
Parameters
- We hash $n$ bits to fewer, $m$, bits using $t$ auxiliary truly random bits, such that any source with $k$ "entropy" is mapped to a source that is $\varepsilon$-close to uniform.
- The entropy loss of the extractor is $k - m$.
- Our goal is to simultaneously minimize the seed length and the entropy loss.
Extractor's best parameters
Construction               | Seed length               | Entropy loss                   | Remarks
Non-explicit & lower bound | $O(\log n/\varepsilon)$   | $2\log(1/\varepsilon) + O(1)$  |
LRVW02                     | $O(\log n)$               | $\Omega(k)$                    | Constant $\varepsilon$
GUV07                      | $O(\log n/\varepsilon)$   | $\Omega(k)$                    | Sub-constant $\varepsilon$
DKSS09                     | $O(\log n/\varepsilon)$   | $k/\mathrm{polylog}(n)$        | Sub-constant $\varepsilon$
- We match the DKSS09 result with a direct construction.
A condenser is a hash function $G: \{0,1\}^n \times \{0,1\}^t \to \{0,1\}^m$
- Input: $f \in \{0,1\}^n$
- Seed: $y \in \{0,1\}^t$
- Output: $G(f,y) \in \{0,1\}^m$
- With the property that, for every $X \subseteq \{0,1\}^n$ of size $2^k$: $G(X, U_t)$ is $\varepsilon$-close to having $k'$ entropy.
Parameters
- We hash $n$ bits to fewer, $m$, bits using $t$ auxiliary truly random bits, such that any source with $k$ "entropy" is mapped to a source that is $\varepsilon$-close to having $k'$ "entropy".
- The entropy loss of the condenser is $k - k'$.
- The entropy rate of the condenser is $k'/m$.
Our goal
- Our goal is to simultaneously minimize the seed length, minimize the entropy loss, and maximize the entropy rate.
- A condenser with $o(k)$ entropy loss and $1 - o(1)$ entropy rate yields extractors with sub-linear entropy loss.
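Condenser's best parameters
Construction               | Seed length               | Entropy loss  | Entropy rate
Non-explicit & lower bound | $O(\log n/\varepsilon)$   | $0$           | $1 - o(1)$
GUV07                      | $O(\log n/\varepsilon)$   | $0$           | Constant
Our main result            | $O(\log n/\varepsilon)$   | $k/\log(n)$   | $1 - 1/\log(n)$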
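Lossless condensers as unbalanced expanders
- Left vertices: $x \in \{0,1\}^n$. Right vertices: pairs $(y, w)$ with seed $y \in \{0,1\}^t$ and $w \in \{0,1\}^m$.
- The edge $(x, (y, w))$ is present if $G(x, y) = w$.
- Any set of size $2^k$ expands to at least $(1 - \varepsilon) \cdot 2^t \cdot 2^k$ neighbors.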
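The expander view can be checked by brute force on tiny examples. The following is a minimal sketch, not part of the talk: `neighbors`, `is_lossless_on` and the map `toy_G` (lines over the integers mod 5) are hypothetical names chosen only for illustration.

```python
def neighbors(G, S, seed_space):
    """Right-hand neighbours of the left vertex set S: all pairs (y, G(x, y))."""
    return {(y, G(x, y)) for x in S for y in seed_space}

def is_lossless_on(G, S, seed_space, eps):
    """Check the bound |N(S)| >= (1 - eps) * (#seeds) * |S| for one given set S."""
    return len(neighbors(G, S, seed_space)) >= (1 - eps) * len(seed_space) * len(S)

# Toy map (an assumption for illustration): x = (c0, c1) is read as the line
# c0 + c1*Y over the integers mod 5 and evaluated at the seed y.
def toy_G(x, y, q=5):
    c0, c1 = x
    return (c0 + c1 * y) % q

S = [(0, 0), (1, 0), (0, 1)]
seeds = range(5)
print(len(neighbors(toy_G, S, seeds)), "neighbours out of a possible", len(seeds) * len(S))
```

Here two of the fifteen edges collide, so this particular set loses a little expansion; a lossless condenser must keep such collisions below an $\varepsilon$ fraction for every set of the relevant size.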
The GUV condenser
The basic condenser: $G: \mathbb{F}_q^n \times \mathbb{F}_q \to \mathbb{F}_q$, $G(f, y) = f(y)$
- The input: $f \in \mathbb{F}_q^n$ is interpreted as a polynomial $f(Y)$ of degree less than $n$ over $\mathbb{F}_q$.
- The seed: $y \in \mathbb{F}_q$, from the base field $\mathbb{F}_q$.
- The output: an element of the base field $\mathbb{F}_q$.
- This is the standard way to view an RS code as a condenser: encode, then use the seed to choose a symbol from the encoded string.
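As a minimal sketch of this "encode, then pick a symbol" view, assuming $q$ is prime so that the field is just the integers mod $q$ (the function name `rs_condenser` is hypothetical):

```python
def rs_condenser(f, y, q):
    """Evaluate f(Y) = f[0] + f[1]*Y + ... at the seed y over Z_q (q prime), by Horner's rule."""
    acc = 0
    for coeff in reversed(f):
        acc = (acc * y + coeff) % q
    return acc

q = 7
f = [3, 0, 2]  # the input, read as the polynomial 3 + 2*Y^2 over Z_7

# The Reed-Solomon codeword of f is the list of all evaluations;
# the seed y simply selects one symbol of it.
codeword = [rs_condenser(f, y, q) for y in range(q)]
print(codeword)
```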
The GUV condenser: $G: \mathbb{F}_q^n \times \mathbb{F}_q \to (\mathbb{F}_q)^m$, $G(f, y) = (f_0(y), \ldots, f_{m-1}(y))$
- The input: $f \in \mathbb{F}_q^n$ is interpreted as a polynomial $f(Y)$ of degree less than $n$ over $\mathbb{F}_q$.
- The seed: $y \in \mathbb{F}_q$, from the base field $\mathbb{F}_q$.
- The output: $m$ elements of the base field $\mathbb{F}_q$.
- Where: $f_k = f^{h^k}$, with operations in $\mathbb{F}_{q^n}$.
- This is the standard way to view a PV code as a condenser: encode, then use the seed to choose a symbol from the encoded string.
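A toy instance of this map can be sketched as follows. The assumptions are mine, not the talk's: $q$ is prime, $\mathbb{F}_{q^n}$ is represented as $\mathbb{F}_q[Y] \bmod E$ for a monic irreducible $E$ of degree $n$ supplied by the caller, and all function names are hypothetical.

```python
def poly_mulmod(a, b, E, q):
    """Product of a and b in F_q[Y]/(E); coefficient lists, lowest degree first; E monic."""
    n = len(E) - 1
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % q
    # Reduce using Y^n = -(E[0] + E[1]*Y + ... + E[n-1]*Y^(n-1)), highest degree first.
    for d in range(len(prod) - 1, n - 1, -1):
        c = prod[d]
        if c:
            prod[d] = 0
            for k in range(n):
                prod[d - n + k] = (prod[d - n + k] - c * E[k]) % q
    prod = prod[:n]
    return prod + [0] * (n - len(prod))

def poly_powmod(f, e, E, q):
    """f^e in F_q[Y]/(E) by repeated squaring."""
    n = len(E) - 1
    result = [1] + [0] * (n - 1)
    base = list(f)
    while e:
        if e & 1:
            result = poly_mulmod(result, base, E, q)
        base = poly_mulmod(base, base, E, q)
        e >>= 1
    return result

def evaluate(f, y, q):
    """Horner evaluation of f at y over Z_q."""
    acc = 0
    for coeff in reversed(f):
        acc = (acc * y + coeff) % q
    return acc

def guv_condenser(f, y, h, m, E, q):
    """(f_0(y), ..., f_{m-1}(y)) with f_k = f^(h^k) computed mod E."""
    return tuple(evaluate(poly_powmod(f, h ** k, E, q), y, q) for k in range(m))

# Toy parameters: q = 5, n = 2, E = Y^2 + 2 (irreducible over Z_5, since -2 = 3 is not a square mod 5).
print(guv_condenser([1, 1], 3, h=2, m=2, E=[2, 0, 1], q=5))
```

For $f = 1 + Y$ the second coordinate uses $f^2 \bmod E = 4 + 2Y$, which evaluates to $0$ at $y = 3$.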
The PV curve $C: \mathbb{F}_{q^n} \to (\mathbb{F}_{q^n})^m$ is defined by $C(f) = (f_0, \ldots, f_{m-1})$ with $f_k = f^{h^k}$, where operations are in $\mathbb{F}_{q^n}$.
The GUV condenser is an excellent lossless condenser … but has a bottleneck with the entropy rate
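Analyzing GUV (simplified case)
- The map: $f \mapsto (f_0(y), \ldots, f_{m-1}(y))$, from $\mathbb{F}_q^n$ to $(\mathbb{F}_q)^m$, with seed $y$.
- Claim: any $S \subseteq \mathbb{F}_q^n$ of size $h^m$ has an image of size at least $h^m$.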
Proof idea
1. Assume $G(S)$ has size $< h^m$.
2. Find a non-zero $Q(x_0, \ldots, x_{m-1})$ such that each variable has local degree $< h$ and $Q$ vanishes on $G(S)$.
3. Prove that for all $f \in S$: $Q(f, f^h, \ldots, f^{h^{m-1}}) = 0$.
   - For every $f \in S$, the polynomial $Q(f_0, \ldots, f_{m-1})(y) = Q(f_0(y), \ldots, f_{m-1}(y))$ has $q$ roots (one for each $y \in \mathbb{F}_q$).
   - $\deg(Q(f_0, \ldots, f_{m-1})) < \deg(Q) \cdot n < hmn$.
   - Thus, if $q > hmn$, then $Q(f_0, \ldots, f_{m-1}) = 0$ in $\mathbb{F}_q[Y]$, and therefore also in $\mathbb{F}_q[Y] \bmod E$.
4. Prove that $R(f) = Q(f, f^h, \ldots, f^{h^{m-1}})$ is a non-zero polynomial and conclude that $|S| \le \deg(R) < h^m$.
   - As the local degrees in $Q$ are less than $h$, the coefficient of $x_0^{i_0} \cdots x_{m-1}^{i_{m-1}}$ in $Q(x_0, \ldots, x_{m-1})$ is the same as the coefficient of $f^i$ in $Q(f, f^h, \ldots, f^{h^{m-1}})$, where $(i_0, \ldots, i_{m-1})$ is the base-$h$ representation of $i$.
   - And so $R$ is non-zero iff $Q$ is.
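The base-$h$ argument in the last step can be illustrated with a short sketch. It treats $R$ as a formal polynomial in one variable $T$ (substituting $x_j \to T^{h^j}$) rather than working inside $\mathbb{F}_{q^n}$; the dictionary representation and function names are assumptions made for illustration.

```python
from itertools import product

def substitute_exponent(exps, h):
    """Exponent of T in the image of x_0^{i_0} ... x_{m-1}^{i_{m-1}} under x_j -> T^(h^j):
    the integer whose base-h digits are (i_0, ..., i_{m-1}), lowest digit first."""
    return sum(i * h ** j for j, i in enumerate(exps))

def substitute(Q, h):
    """Q is {exponent tuple: coefficient}; returns R as {exponent of T: coefficient}."""
    R = {}
    for exps, c in Q.items():
        e = substitute_exponent(exps, h)
        R[e] = R.get(e, 0) + c
    return R

h, m = 3, 2
# Every monomial with local degrees < h lands on a distinct power of T below h^m,
# so no two coefficients of Q can cancel each other in R.
images = sorted(substitute_exponent(e, h) for e in product(range(h), repeat=m))
print(images)

Q = {(1, 0): 2, (2, 1): 1, (0, 2): 4}
print(substitute(Q, h))
```

Because the map on exponents is a bijection onto $\{0, \ldots, h^m - 1\}$, $R$ has the same non-zero coefficients as $Q$ and degree below $h^m$.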
The GUV condenser has constant entropy rate
- For the analysis to work we need $q > hmn$.
- For logarithmic seed length we need $q = \mathrm{poly}(n)$.
- Thus, we must have $q = h^c$ for some $c > 1$, so $\log(q^m) = c \log(h^m)$ and the entropy rate is constant.
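Spelled out, under the simplifying assumption that the output carries about $\log(h^m)$ bits of entropy inside $\log(q^m)$ output bits, the bottleneck reads:

```latex
\[
\text{entropy rate} \;\approx\; \frac{\log(h^m)}{\log(q^m)}
  \;=\; \frac{m \log h}{c \, m \log h} \;=\; \frac{1}{c},
\qquad
q = h^c > hmn \;\Longrightarrow\; h^{c-1} > mn
  \;\Longrightarrow\; c > 1 + \frac{\log(mn)}{\log h}.
\]
```

Since $\log h < \log q = O(\log n)$ when $q = \mathrm{poly}(n)$, the term $\log(mn)/\log h$ is bounded below by a constant, so $c$ stays bounded away from 1 and the rate $1/c$ stays bounded away from 1.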
A remark
- The basic condenser also has constant entropy rate.
- For example, the set of all squares in $\mathbb{F}_q$ has as pre-image all square polynomials, so the entropy rate is 1/2.
To overcome the bottleneck [DW08], [DKSS09]
- Dvir gave a simple algebraic proof that every Kakeya set must be large.
- Dvir and Wigderson extended the technique to build better mergers, and from those better extractors.
- DKSS improved the result by using multiplicities.
Our variant of the GUV condenser
First modification: a two-stage PV construction
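Two levels of extension
- We take the extension fields $\mathbb{F}_p \subset \mathbb{F}_q \subset \mathbb{F}_{q^n}$, where:
- $q = p^2$ and $\mathbb{F}_q = \mathbb{F}_p[Y] \bmod F$, $\deg(F) = 2$, and,
- as before, $\mathbb{F}_{q^n} = \mathbb{F}_q[Z] \bmod E$, $\deg(E) = n$.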
Applying PV twice
- The input: $f \in \mathbb{F}_q^n$.
- The seed: $a \in \mathbb{F}_q$ and $b \in \mathbb{F}_p$.
- First stage: $f \mapsto (f_0(a), \ldots, f_{m-1}(a)) \in (\mathbb{F}_q)^m$.
- The output: $m$ elements of $\mathbb{F}_p$, namely $(f_0(a)(b), \ldots, f_{m-1}(a)(b)) \in (\mathbb{F}_p)^m$.
- Where: $f_i \in \mathbb{F}_{q^n}$ is a polynomial of degree less than $n$ over $\mathbb{F}_q$; $f_i(a) \in \mathbb{F}_q$ is a polynomial of degree less than 2 over $\mathbb{F}_p$; and $f_i(a)(b) \in \mathbb{F}_p$.
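The two evaluations for a single coordinate $f_i$ can be sketched as below. The concrete choices are assumptions for illustration only: $p = 3$, $\mathbb{F}_q = \mathbb{F}_3[Y] \bmod (Y^2 + 1)$ with elements stored as pairs $(c_0, c_1) = c_0 + c_1 Y$, and hypothetical function names. Computing $f_i = f^{h^i}$ itself is omitted.

```python
P = 3  # base field F_p; F_q = F_p[Y]/(Y^2 + 1) has q = p^2 = 9 elements
       # (Y^2 + 1 is irreducible over Z_3 because -1 = 2 is not a square mod 3)

def fq_add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def fq_mul(u, v):
    # (u0 + u1*Y)(v0 + v1*Y) with Y^2 = -1
    return ((u[0] * v[0] - u[1] * v[1]) % P, (u[0] * v[1] + u[1] * v[0]) % P)

def outer_eval(f, a):
    """First stage: evaluate f(Z), a list of F_q coefficients, at a in F_q (Horner)."""
    acc = (0, 0)
    for coeff in reversed(f):
        acc = fq_add(fq_mul(acc, a), coeff)
    return acc

def inner_eval(u, b):
    """Second stage: read u = (c0, c1) as the polynomial c0 + c1*Y over F_p, evaluate at b."""
    return (u[0] + u[1] * b) % P

def two_stage(f, a, b):
    return inner_eval(outer_eval(f, a), b)

f = [(1, 0), (0, 1)]   # f(Z) = 1 + Y*Z
a, b = (1, 1), 2       # a = 1 + Y in F_q, b = 2 in F_p
print(outer_eval(f, a), two_stage(f, a, b))
```

In this example $f(a) = 1 + Y(1 + Y) = Y$, and evaluating $Y$ at $b = 2$ gives the output symbol $2 \in \mathbb{F}_3$.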
Applying PV twice
- Similar to concatenated codes: hash and then hash again.
- But for the analysis to work, we need to analyze the process as a whole.
Applying PV twice: analysis (simplified case)
- The map: $f \mapsto (f_0(a), \ldots, f_{m-1}(a)) \in (\mathbb{F}_q)^m \mapsto (f_0(a)(b), \ldots, f_{m-1}(a)(b)) \in (\mathbb{F}_p)^m$.
1. Assume $G(S)$ has size $< h^m$.
2. Find a non-zero $Q(x_0, \ldots, x_{m-1})$ such that each variable has local degree $< h$ and $Q$ vanishes on $G(S)$.
3. $\forall f \in S$, $\forall a \in \mathbb{F}_q$: $Q(f_0(a), \ldots, f_{m-1}(a)) = 0$, provided that $p > \deg(Q) = hm$.
4. $\forall f \in S$: $Q(f, f^h, \ldots, f^{h^{m-1}}) = 0$, provided that $q = p^2 > n \deg(Q) = nhm$.
5. Prove that $R(f) = Q(f, f^h, \ldots, f^{h^{m-1}})$ is a non-zero polynomial and conclude that $|S| \le \deg(R) < h^m$.
What did we gain?
- For the analysis to work we need:
- $p > \deg(Q) = hm$ (the key equation), and,
- $q = p^2 > n \deg(Q)$, which translates to $p > n$ and is fine.
- Compare with the base-field requirement $q > \deg(Q) \cdot n = hmn$ we had before.
- We still need to gain the $m$ factor.
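A rough numeric comparison of the two requirements, as a sketch under stated assumptions: the rate is approximated by $\log h / \log(\text{output alphabet size})$, the alphabet is taken as the smallest integer meeting each inequality (ignoring that field sizes must be prime powers), and the function names and sample values are hypothetical.

```python
import math

def single_stage_rate(h, m, n):
    """log(h^m)/log(q^m) for the smallest integer q with q > h*m*n."""
    q = h * m * n + 1
    return math.log(h) / math.log(q)

def two_stage_rate(h, m, n):
    """log(h^m)/log(p^m) for the smallest integer p with p > h*m and p*p > n*h*m."""
    p = max(h * m, math.isqrt(n * h * m)) + 1
    return math.log(h) / math.log(p)

h, m, n = 2 ** 20, 8, 2 ** 10
print(round(single_stage_rate(h, m, n), 3), round(two_stage_rate(h, m, n), 3))
```

With these sample values the factor $n$ disappears from the denominator, moving the ratio from about $20/33$ to about $20/23$; the remaining gap comes from the $m$ factor.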
Massaging deg(Q)
- To gain the $m$ factor we need to work with total degree, and work with multiplicities.
- We should choose $Q$ that vanishes with multiplicity $t$ on the set $B = G(S)$, for some parameter $t$ ($t = m^2$), and this would make the parameters optimal.
We now face a problem
- How do we know that $Q(f, f^h, \ldots, f^{h^{m-1}})$ is not the zero polynomial?
- The argument before used that $Q$ has local degree less than $h$ in each variable.
- The argument does not carry over to high total degree ($ht = hm^2$).
Second modification
1. A two-stage PV construction.
2. Change the curve $C: \mathbb{F}_{q^n} \to (\mathbb{F}_{q^n})^m$ from the PV curve $C_k(f) = f^{h^k}$ to the "covering curve".
- The covering curve has the property that $\deg(C_i) = h^{m-1}$, and $C: \mathbb{F}_{p^m} \to (\mathbb{F}_p)^m$ covers $(\mathbb{F}_p)^m$.
Modifying the analysis
- Choose $Q$ that vanishes with multiplicity $t$ over $B = G(S)$, where $|B| = (p/2)^m$ and $\deg(Q) < pt/2$.
- $Q$ has low degree, and so it cannot vanish with multiplicity $t/2$ over $(\mathbb{F}_p)^m$ [DKSS].
- The curve $C$ covers $(\mathbb{F}_p)^m$, and so $Q$ cannot vanish with multiplicity $t/2$ over the curve.
- Thus, some $t/2$-derivative of $Q$ does not vanish on the curve, and does vanish with multiplicity $t/2$ over $B$.
- Call this derivative $Q$ and work with it.
Three modifications that work in concert
1. A two-stage PV construction.
2. Change the curve $C: \mathbb{F}_{q^n} \to (\mathbb{F}_{q^n})^m$ from the PV curve $C_k(f) = f^{h^k}$ to the "covering curve".
3. Use total degree and multiplicities, plus a new argument to show that $Q$ does not vanish over the curve.
Concluding remarks
A limit on the covering curve approach
- We want to argue that for every large set $B$ there exists a $Q$ of degree at most $ht - 1$ that vanishes with multiplicity $t$ on $B$ and does not vanish on $(\mathbb{F}_p)^m$.
- However, there exists a Kakeya set $B$ of size about $(p/2)^m$ such that any homogeneous polynomial $Q$ of degree at most $pt - 1$ that vanishes with multiplicity $t$ over $B$ also vanishes over $(\mathbb{F}_p)^m$.
- Indeed, we deal with sets $B$ of size at most $(p/2)^m$.
Open problems
1. Can another variant and/or analysis of GUV construct condensers with $O(\log n)$ entropy loss and $O(\log n)$ seed length?
2. Our results for condensers and extractors (and also previous constructions) work for error $\varepsilon$ ≥2 -log n (for any constant >0). Improve it to $\varepsilon = 1/n$.
3. Our construction for a condenser with $\varepsilon > 0$ error is not strong. Make it strong.
A step in a chain
- Early work: extractors as hash functions
- Trevisan: extractors as ECC with good distance
- TZS, SU, U: extractors from RM code
- GUV: condensers from RS, PV code
- This work: condensers from PV$^2$ code, and a special curve
- What's next?