Extractors: Optimal Up to Constant Factors


1 Extractors: Optimal Up to Constant Factors
Avi Wigderson (IAS, Princeton & Hebrew U., Jerusalem). Joint work with Chi-Jen Lu, Omer Reingold, Salil Vadhan. To appear: STOC '03.

2 Original Motivation [B84,SV84,V85,VV85,CG85,V87,CW89,Z90-91]
Randomization is pervasive in CS: algorithm design, cryptography, distributed computing, … We typically assume a perfect random source: unbiased, independent random bits. Can we use a "weak" random source? (Randomness) extractors convert weak random sources into almost perfect randomness.

3 Extractors [Nisan & Zuckerman `93]
EXT takes a k-source of length n plus d random bits (a short "seed") and outputs m almost-uniform bits. X has min-entropy k (is a k-source) if ∀x: Pr[X = x] ≤ 2^-k (i.e. no heavy elements).

4 Extractors [Nisan & Zuckerman `93]
EXT takes a k-source of length n plus d random bits (a short "seed") and outputs m bits ε-close to uniform. X has min-entropy k (is a k-source) if ∀x: Pr[X = x] ≤ 2^-k (i.e. no heavy elements). Measure of closeness: statistical difference (a.k.a. variation distance, a.k.a. half the L1-norm).
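As a concrete illustration of these two definitions, here is a small Python sketch (not from the talk; the example distributions are made up) computing min-entropy and statistical difference:

```python
from math import log2

def min_entropy(dist):
    """Min-entropy: -log2 of the heaviest outcome's probability."""
    return -log2(max(dist.values()))

def statistical_distance(p, q):
    """Statistical (variation) distance: half the L1 distance."""
    support = set(p) | set(q)
    return sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support) / 2

# A 1-source on 2 bits: no outcome heavier than 2^-1.
X = {'00': 0.5, '01': 0.25, '10': 0.25}
print(min_entropy(X))               # -> 1.0, so X is a 1-source

U = {s: 0.25 for s in ('00', '01', '10', '11')}
print(statistical_distance(X, U))   # -> 0.25, so X is 0.25-close to uniform
```

Note how the heavy outcome '00' accounts for both the min-entropy and the distance from uniform.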

5 Applications of Extractors
Derandomization of error reduction in BPP [Sip88, GZ97, MV99, STV99]. Derandomization of space-bounded algorithms [NZ93, INW94, RR99, GW02]. Distributed & Network Algorithms [WZ95, Zuc97, RZ98, Ind02]. Hardness of Approximation [Zuc93, Uma99, MU01]. Cryptography [CDHKS00, MW00, Lu02, Vad03]. Data Structures [Ta02].

6 Unifying Role of Extractors
Extractors are intimately related to: Hash Functions [ILL89,SZ94,GW94], Expander Graphs [NZ93, WZ93, GW94, RVW00, TUZ01, CRVW02], Samplers [G97, Z97], Pseudorandom Generators [Trevisan 99, …], Error-Correcting Codes [T99, TZ01, TZS01, SU01, U02] ⇒ they unify the theory of pseudorandomness.

7 Extractors as graphs
A (k,ε)-extractor is a function Ext: {0,1}^n × {0,1}^d → {0,1}^m. View it as a bipartite graph: left vertices x ∈ {0,1}^n, right vertices {0,1}^m, with x connected to Ext(x,y) for each seed y. This single object captures sampling, hashing, amplification, coding, and expanders. Discrepancy: for every B ⊆ {0,1}^m, for all but 2^k of the x ∈ {0,1}^n, | |Γ(x) ∩ B| / 2^d − |B| / 2^m | < ε.
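The discrepancy view can be checked by brute force at a toy scale. The sketch below is illustrative only: the parameters n, d, m, ε and the random-function Ext are arbitrary choices (echoing the non-constructive argument that a random function is a good extractor w.h.p.), and it counts the inputs x that see a test set B with the "wrong" density:

```python
import random

random.seed(1)
n, d, m, eps = 8, 4, 3, 0.3          # toy parameters, illustration only

# A uniformly random function serves as Ext (non-constructive flavor).
Ext = {(x, y): random.randrange(2 ** m)
       for x in range(2 ** n) for y in range(2 ** d)}

B = set(range(2 ** (m - 1)))          # a test set of density |B|/2^m = 1/2

# Count the x's whose seed-neighborhood sees B with the wrong density.
bad = sum(
    1 for x in range(2 ** n)
    if abs(sum(Ext[(x, y)] in B for y in range(2 ** d)) / 2 ** d
           - len(B) / 2 ** m) >= eps
)
print(bad, 2 ** 5)   # far fewer bad x's than 2^k (here k = 5)
```

Since fewer than 2^k inputs are bad, no k-source can concentrate on them, which is exactly why discrepancy implies extraction.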

8 Extractors - Parameters
EXT takes a k-source of length n plus a (short) "seed" of d random bits and outputs m bits ε-close to uniform. Goals: minimize d, maximize m. Non-constructive & optimal [Sip88,NZ93,RT97]: Seed length d = log(n−k) + 2 log(1/ε) + O(1). Output length m = k + d − 2 log(1/ε) − O(1).

9 Extractors - Parameters
EXT takes a k-source of length n plus a (short) "seed" of d random bits and outputs m bits ε-close to uniform. Goals: minimize d, maximize m. Non-constructive & optimal [Sip88,NZ93,RT97], for ε = 0.01 and k ≤ n/2: Seed length d = log n + O(1). Output length m = k + d − O(1).

10 Explicit Constructions
A large body of work [..., NZ93, WZ93, GW94, SZ94, SSZ95, Zuc96, Ta96, Ta98, Tre99, RRV99a, RRV99b, ISW00, RSW00, RVW00, TUZ01, TZS01, SU01] (+ those I forgot …). Some results hold only for particular values of k (small k and large k are easier). Very useful example [Zuc96]: k=Ω(n), d=O(log n), m=.99k. For general k, previous constructions optimize either the seed length or the output length. Previous records [RSW00]: d=O(log n · poly loglog n) with m=.99k, or d=O(log n) with m=k/log k; both are off by polylog factors.

11 This Work
Main Result: any k, d=O(log n), m=.99k.
Other results (mainly for general ε). Technical contributions: new condensers w/ constant seed length; augmenting the "win-win repeated condensing" paradigm of [RSW00] w/ error reduction à la [RRV99]; a general construction of mergers [TaShma96] from locally decodable error-correcting codes.

12 Condensers [RR99,RSW00,TUZ01]
A (k,k′,ε)-condenser: Con takes a k-source of length n plus a d-bit random seed and outputs an (ε,k′)-source of length n′. Con is a lossless condenser if k′ = k (in this case, denote it a (k,ε)-condenser).

13 Repeated Condensing [RSW00]
Start from a k-source of length n and condense repeatedly, t = log(n/k) times: with seed0, Con yields an (ε0,k)-source of length n/2; with seed1, Con yields a (2ε0,k)-source of length n/4; …; after t steps, a (t·ε0,k)-source of length O(k).
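The parameter bookkeeping of this slide can be mimicked in a few lines of Python. This is a schematic accounting only: `Con` itself is not implemented, and the halving-per-step and the numeric parameters are illustrative assumptions; we just track length, accumulated error, and seed usage:

```python
from math import log2

def repeated_condensing(n, k, eps0, seed_len):
    """Track length/error/seed through t = log(n/k) halving steps."""
    t = int(log2(n / k))
    length, err, seeds = n, 0.0, 0
    for _ in range(t):
        length //= 2        # each Con application halves the length
        err += eps0         # errors add up: after i steps, error i * eps0
        seeds += seed_len   # each step consumes a fresh seed
    return length, err, seeds

length, err, seeds = repeated_condensing(n=2**20, k=2**10, eps0=0.001, seed_len=3)
print(length, err, seeds)   # length shrinks to k, error grows to t * eps0
```

The linear growth of `err` in `t` is exactly the error-accumulation problem discussed on the next slide.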

14 1st Challenge: Error Accumulation
Number of steps: t = log(n/k). Final error > t·ε0 ⇒ need ε0 < 1/t. Condenser seed length > log(1/ε0) > log t ⇒ extractor seed length > t·log t, which may be as large as log n · loglog n (this partially explains the seed length of [RSW00]). Solution idea: start with constant ε0, and combine repeated condensing w/ error reduction (à la [RRV99]) to prevent error accumulation.

15 Error Reduction
Con0 w/ error ε; condensation ρ; seed length d. Apply Con0 twice to the k-source of length n, with seeds seed0 and seed1; each copy yields an (ε,k)-source of length ρn. The combined Con has condensation 2ρ and seed length 2d. Hope: error ε². Only if the error comes from the seeds!

16 Parallel composition
Con0: seed error ε; condensation ρ; seed length d; source error δ; entropy loss Δ = k+d−k′. Apply Con0 to the same length-n source with seeds seed0 and seed1, and concatenate the two length-ρn outputs. Con: seed error O(ε²); condensation 2ρ; seed d+O(log(1/ε)); source error δ; entropy loss Δ+1.

17 Serial composition
Con0: seed error ε; condensation ρ; seed d; source error δ; entropy loss Δ = k+d−k′. Apply Con0 with seed0, then apply Con0 to the result with seed1. Con: seed error O(ε); condensation ρ²; seed 2d; source error δ(1+1/ρ); entropy loss 2Δ.

18 Repeated condensing revisited
Start with Con0 w/ constant seed error ε0 ≤ 1/18; constant condensation ρ; constant seed length d; (source error δ0; entropy loss Δ0). Alternate parallel and serial composition loglog(n/k) + O(1) times. ⇒ Con w/ seed error ε; condenses to O(k) bits; [optimal] seed length d=O(log(n/k)); (source error polylog(n/k)·δ0; entropy loss O(log(n/k))·Δ0). Home? Not so fast …

19 2nd Challenge: No Such Explicit Lossless Condensers
Previous condensers with constant seed length: a (k,ε)-condenser w/ n′=n−O(1) [CRVW02]; a (k,Ω(k),ε)-condenser w/ n′=n/100 for k=Ω(n) [Vad03]. Here: a (k,Ω(k),ε)-condenser w/ n′=n/100 for any k (see them later). Still not lossless! The challenge persists …

20 Win-Win Condensers [RSW00]
Assume Con is a (k,Ω(k),ε)-condenser. Then for every k-source X, we are in one of two good cases: (1) Con(X,Y) contains almost k bits of randomness ⇒ Con is almost lossless. (2) X still has some randomness even conditioned on Con(X,Y) ⇒ (Con(X,Y), X) is a "block source" [CG85]. Good extractors for block sources are already known (based on [NZ93,…]), so Ext(Con(X,Y), X) is uniform on Ω(k) bits.

21 Win-Win under composition
More generally: (Con,Som) is a win-win condenser if for every k-source X, either Con(X,Y) is lossless, or Som(X,Y) is somewhere random: a list of b sources, one of which is uniform on Ω(k) bits. Parallel composition generalized: Con′(X,Y1Y2) = Con(X,Y1) ∘ Con(X,Y2); Som′(X,Y1Y2) = Som(X,Y1) ∪ Som(X,Y2). Serial composition generalized: Con′(X,Y1Y2) = Con(Con(X,Y1),Y2); Som′(X,Y1Y2) = Som(X,Y1) ∪ Som(Con(X,Y1),Y2).

22 Partial Summary
We give a constant-seed-length (k,Ω(k),ε)-condenser w/ n′=n/100 (still to be seen). This implies a lossless win-win condenser with constant seed length. Iterate repeated condensing and (seed-)error reduction loglog(n/k) + O(1) times. The result is a win-win condenser (Con,Som) where Con condenses to O(k) bits and Som produces a "short" list of t sources, one of which is a block source (t can be made as small as log^(c) n).

23 3rd Challenge: Mergers [TaShma96]
Now we have a somewhere-random source X1,X2,…,Xt (one of the Xi is random); t can be made as small as log^(c) n. An extractor for such a source is called a merger [TaS96]. Previous construction: mergers w/ seed length d=O(log t · log n) [TaS96]. Here: mergers w/ seed length d=O(log t) and seed length d=O(t) (both independent of n).

24 New Mergers From LDCs
Example: mergers from Hadamard codes. Input: a somewhere k-source X=X1,X2,…,Xt (one of the Xi is a k-source). Seed: Y is t bits. Define: Con(X,y) = ⊕_{i∈y} Xi (the XOR of the blocks selected by y). Claim: with probability ½ over Y, Con(X,Y) has entropy ≥ k/2. Proof idea: assume w.l.o.g. that X1 is a k-source. For every y, Con(X,y) ⊕ Con(X,y⊕e1) = X1 ⇒ at least one of Con(X,y) and Con(X,y⊕e1) contains entropy ≥ k/2.
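This Hadamard merger is concrete enough to run. The sketch below is illustrative Python (the block width and the adversarial constant blocks are arbitrary choices): it implements Con(X,y) = ⊕_{i∈y} Xi and checks the identity Con(X,y) ⊕ Con(X,y⊕e1) = X1 used in the proof:

```python
import random

random.seed(0)
t, nbits = 4, 16
# A somewhere k-source: X1 is truly random, the rest are constant (adversarial).
X = [random.getrandbits(nbits)] + [0xDEAD] * (t - 1)

def con(X, y):
    """Hadamard merger: XOR the blocks selected by the seed bits of y."""
    out = 0
    for i in range(len(X)):
        if (y >> i) & 1:
            out ^= X[i]
    return out

# Flipping the first seed bit toggles X1's contribution and nothing else, so
# Con(X, y) XOR Con(X, y XOR e1) = X1 for every seed y.
e1 = 1
for y in range(2 ** t):
    assert con(X, y) ^ con(X, y ^ e1) == X[0]
print("identity holds for all", 2 ** t, "seeds")
```

Since the two outputs XOR to the full k-source X1, they cannot both have entropy below k/2, which is the claim.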

25 Old Debt – The Condenser
Promised: a (k,Ω(k),ε)-condenser w/ constant seed length and n′=n/100. Two new simple ways of obtaining them: (1) based on any error-correcting code (gives the best parameters; influenced by [RSW00,Vad03]); (2) based on the new mergers [Ran Raz]. Mergers ⇒ Condensers: let X be a k-source. For any constant t, X=X1,X2,…,Xt is (close to) a somewhere k/t-source ⇒ the Hadamard merger is also a condenser w/ the desired parameters.

26 Some Open Problems
Improved dependence on ε. Possible direction: mergers for t blocks with seed length f(t) + O(log(n/ε)). Getting the right constants: d = log n + O(1); m = k + d − O(1). Possible directions: lossless condensers w/ constant seed; lossless mergers; better locally decodable codes.

27 New Mergers From LDCs
Generally: view the somewhere k-source X=X1,X2,…,Xt ∈ ({0,1}^n)^t as a t×n matrix. Encode each column with a code C: Σ^t → Σ^u. Output a random row of the encoded matrix: d = log u (independent of n).
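A minimal Python sketch of this matrix view, using the Hadamard code as the column code C so that it reproduces the merger of the earlier example (parameters and the specific row check are illustrative assumptions):

```python
import random

random.seed(0)
t, n = 3, 8
# Somewhere k-source as a t x n bit matrix (rows are the blocks X1..Xt).
M = [[random.randrange(2) for _ in range(n)] for _ in range(t)]

def hadamard_encode_column(col):
    """C: {0,1}^t -> {0,1}^(2^t); codeword position y holds the parity <y, col>."""
    u = 2 ** len(col)
    return [sum(((y >> i) & 1) * col[i] for i in range(len(col))) % 2
            for y in range(u)]

# Encode each column; a seed y then selects one row of the encoded matrix,
# so the seed length is d = log(2^t) = t bits, independent of n.
encoded = [hadamard_encode_column([M[i][j] for i in range(t)]) for j in range(n)]

def merger_output(y):
    return [encoded[j][y] for j in range(n)]

# Row index y = 0b001 is the parity of just the first bit of each column,
# i.e. it recovers X1 exactly.
print(merger_output(0b001) == M[0])   # -> True
```

The local decodability of C is what lets a few rows of the encoded matrix recover any block Xi, which drives the entropy argument on the next slide.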

28 New Mergers From LDCs
C: Σ^t → Σ^u is a (q,ε) locally decodable erasure code if: for any fraction ε of non-erased codeword symbols S, and any message position i, the ith message symbol can be recovered using q codeword symbols from S. Using such a C, the above mergers essentially turn a somewhere k-source into a k/q-source with probability at least 1−ε.

29 New Mergers From LDCs
Hadamard mergers w/ smaller error: seed length d=O(t·log(1/ε)); transform a somewhere k-source X1,X2,…,Xt into a (k/2, ε)-source. Reed-Muller mergers w/ smaller error: seed length d=O(√t·log(1/ε)); transform a somewhere k-source X1,X2,…,Xt into an (Ω(k), ε)-source. Note: the seed length doesn't depend on n! Efficient enough to obtain the desired extractors.

30 Error Reduction
Main Extractor: any k, m=.99k, d=O(log n). Caveat: constant ε. Error reduction for extractors [RRV99] is not efficient enough for this case. Our new mergers + [RRV99,RSW00] give improved error reduction ⇒ various new extractors for general ε: any k, m=.99k, d=O(log n), ε=exp(−log n / log^(c) n); any k, m=.99k, d=O(log n·(log* n)² + log(1/ε)).

31 Source vs. Seed Error Cont.
Defining bad inputs for Con: {0,1}^n → {0,1}^n′ on a (k+a)-source X. A "heavy" output z: Pr[Con(X,Y)=z] > 2^-k′ ⇒ # heavy outputs < 2^k′. A bad input x: Pr[Con(x,Y) is heavy] > 2ε ⇒ # bad inputs < 2^k (a fraction 2^-a of the source).

32 Source vs. Seed Error Conclusion
More formally: for a (k,k′,ε)-condenser Con and a (k+log(1/δ))-source X, there is a set G of "good" pairs (x,y) s.t.: for a 1−δ density of x ∈ X, Pr[(x,Y)∈G] > 1−2ε, and Con(X,Y) | (X,Y)∈G is a (k′−log(1/ε))-source. ⇒ We can differentiate the source error δ from the seed error ε. Source error is free in seed length! ⇒ (Con(x,y1),Con(x,y2)) has source error δ and seed error O(ε²). We don't need random y1,y2 (an expander edge suffices).

