Extractors: Optimal Up to Constant Factors


1 Extractors: Optimal Up to Constant Factors
Avi Wigderson (IAS, Princeton & Hebrew U., Jerusalem). Joint work with Chi-Jen Lu, Omer Reingold, Salil Vadhan. To appear: STOC '03.

2 Original Motivation [B84,SV84,V85,VV85,CG85,V87,CW89,Z90-91]
Randomization is pervasive in CS: algorithm design, cryptography, distributed computing, … We typically assume a perfect random source: unbiased, independent random bits. Can we use a "weak" random source? (Randomness) extractors convert weak random sources into almost perfect randomness.

3 Extractors [Nisan & Zuckerman `93]
EXT takes a k-source of length n plus d random bits (a short "seed") and outputs m almost-uniform bits. X has min-entropy k (is a k-source) if ∀x: Pr[X = x] ≤ 2^-k (i.e. no heavy elements).

4 Extractors [Nisan & Zuckerman `93]
EXT takes a k-source of length n plus d random bits (a short "seed") and outputs m bits ε-close to uniform. X has min-entropy k (is a k-source) if ∀x: Pr[X = x] ≤ 2^-k (i.e. no heavy elements). Measure of closeness: statistical difference (a.k.a. variation distance, a.k.a. half the L1-norm).
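As a concrete illustration of these two definitions, here is a small Python sketch (not from the talk; the example distributions are made up) computing min-entropy and statistical difference:

```python
from math import log2

def min_entropy(dist):
    """Min-entropy: -log2 of the heaviest outcome's probability."""
    return -log2(max(dist.values()))

def statistical_distance(p, q):
    """Statistical (variation) distance: half the L1 distance."""
    support = set(p) | set(q)
    return sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support) / 2

# A 1-source on 2 bits: no outcome heavier than 2^-1.
X = {'00': 0.5, '01': 0.25, '10': 0.25}
print(min_entropy(X))               # -> 1.0, so X is a 1-source

U = {s: 0.25 for s in ('00', '01', '10', '11')}
print(statistical_distance(X, U))   # -> 0.25, so X is 0.25-close to uniform
```

Note how the heavy outcome '00' accounts for both the min-entropy and the distance from uniform.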

5 Applications of Extractors
Derandomization of error reduction in BPP [Sip88, GZ97, MV99, STV99]. Derandomization of space-bounded algorithms [NZ93, INW94, RR99, GW02]. Distributed & Network Algorithms [WZ95, Zuc97, RZ98, Ind02]. Hardness of Approximation [Zuc93, Uma99, MU01]. Cryptography [CDHKS00, MW00, Lu02, Vad03]. Data Structures [Ta02].

6 Unifying Role of Extractors
Extractors are intimately related to: Hash Functions [ILL89,SZ94,GW94], Expander Graphs [NZ93, WZ93, GW94, RVW00, TUZ01, CRVW02], Samplers [G97, Z97], Pseudorandom Generators [Trevisan 99, …], Error-Correcting Codes [T99, TZ01, TZS01, SU01, U02] ⇒ they unify the theory of pseudorandomness.

7 Extractors as graphs
A (k,ε)-extractor is a function Ext: {0,1}^n × {0,1}^d → {0,1}^m. View it as a bipartite graph: left vertices x ∈ {0,1}^n, right vertices {0,1}^m, with x connected to Ext(x,y) for each seed y. This single object captures sampling, hashing, amplification, coding, and expanders. Discrepancy: for every B ⊆ {0,1}^m, for all but 2^k of the x ∈ {0,1}^n, | |Γ(x) ∩ B| / 2^d − |B| / 2^m | < ε.
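The discrepancy view can be checked by brute force at a toy scale. The sketch below is illustrative only: the parameters n, d, m, ε and the random-function Ext are arbitrary choices (echoing the non-constructive argument that a random function is a good extractor w.h.p.), and it counts the inputs x that see a test set B with the "wrong" density:

```python
import random

random.seed(1)
n, d, m, eps = 8, 4, 3, 0.3          # toy parameters, illustration only

# A uniformly random function serves as Ext (non-constructive flavor).
Ext = {(x, y): random.randrange(2 ** m)
       for x in range(2 ** n) for y in range(2 ** d)}

B = set(range(2 ** (m - 1)))          # a test set of density |B|/2^m = 1/2

# Count the x's whose seed-neighborhood sees B with the wrong density.
bad = sum(
    1 for x in range(2 ** n)
    if abs(sum(Ext[(x, y)] in B for y in range(2 ** d)) / 2 ** d
           - len(B) / 2 ** m) >= eps
)
print(bad, 2 ** 5)   # far fewer bad x's than 2^k (here k = 5)
```

Since fewer than 2^k inputs are bad, no k-source can concentrate on them, which is exactly why discrepancy implies extraction.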

8 Extractors - Parameters
EXT takes a k-source of length n plus a (short) "seed" of d random bits and outputs m bits ε-close to uniform. Goals: minimize d, maximize m. Non-constructive & optimal [Sip88,NZ93,RT97]: Seed length d = log(n−k) + 2 log(1/ε) + O(1). Output length m = k + d − 2 log(1/ε) − O(1).

9 Extractors - Parameters
EXT takes a k-source of length n plus a (short) "seed" of d random bits and outputs m bits ε-close to uniform. Goals: minimize d, maximize m. Non-constructive & optimal [Sip88,NZ93,RT97], for ε = 0.01 and k ≤ n/2: Seed length d = log n + O(1). Output length m = k + d − O(1).

10 Explicit Constructions
A large body of work [..., NZ93, WZ93, GW94, SZ94, SSZ95, Zuc96, Ta96, Ta98, Tre99, RRV99a, RRV99b, ISW00, RSW00, RVW00, TUZ01, TZS01, SU01] (+ those I forgot …). Some results hold only for particular values of k (small k and large k are easier). Very useful example [Zuc96]: k=Ω(n), d=O(log n), m=.99k. For general k, previous constructions optimize either the seed length or the output length. Previous records [RSW00]: d=O(log n · poly loglog n) with m=.99k, or d=O(log n) with m=k/log k; both are off by polylog factors.

11 This Work
Main Result: any k, d=O(log n), m=.99k.
Other results (mainly for general ε). Technical contributions: new condensers w/ constant seed length; augmenting the "win-win repeated condensing" paradigm of [RSW00] w/ error reduction à la [RRV99]; a general construction of mergers [TaShma96] from locally decodable error-correcting codes.

12 Condensers [RR99,RSW00,TUZ01]
A (k,k′,ε)-condenser: Con takes a k-source of length n plus a d-bit random seed and outputs an (ε,k′)-source of length n′. Con is a lossless condenser if k′ = k (in this case, denote it a (k,ε)-condenser).

13 Repeated Condensing [RSW00]
Start from a k-source of length n and condense repeatedly, t = log(n/k) times: with seed0, Con yields an (ε0,k)-source of length n/2; with seed1, Con yields a (2ε0,k)-source of length n/4; …; after t steps, a (t·ε0,k)-source of length O(k).
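The parameter bookkeeping of this slide can be mimicked in a few lines of Python. This is a schematic accounting only: `Con` itself is not implemented, and the halving-per-step and the numeric parameters are illustrative assumptions; we just track length, accumulated error, and seed usage:

```python
from math import log2

def repeated_condensing(n, k, eps0, seed_len):
    """Track length/error/seed through t = log(n/k) halving steps."""
    t = int(log2(n / k))
    length, err, seeds = n, 0.0, 0
    for _ in range(t):
        length //= 2        # each Con application halves the length
        err += eps0         # errors add up: after i steps, error i * eps0
        seeds += seed_len   # each step consumes a fresh seed
    return length, err, seeds

length, err, seeds = repeated_condensing(n=2**20, k=2**10, eps0=0.001, seed_len=3)
print(length, err, seeds)   # length shrinks to k, error grows to t * eps0
```

The linear growth of `err` in `t` is exactly the error-accumulation problem discussed on the next slide.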

14 1st Challenge: Error Accumulation
Number of steps: t = log(n/k). Final error > t·ε0 ⇒ need ε0 < 1/t. Condenser seed length > log(1/ε0) > log t ⇒ extractor seed length > t·log t, which may be as large as log n · loglog n (this partially explains the seed length of [RSW00]). Solution idea: start with constant ε0, and combine repeated condensing w/ error reduction (à la [RRV99]) to prevent error accumulation.

15 Error Reduction
Con0 w/ error ε; condensation ρ; seed length d. Apply Con0 twice to the k-source of length n, with seeds seed0 and seed1; each copy yields an (ε,k)-source of length ρn. The combined Con has condensation 2ρ and seed length 2d. Hope: error ε². Only if the error comes from the seeds!

16 Parallel composition
Con0: seed error ε; condensation ρ; seed length d; source error δ; entropy loss Δ = k+d−k′. Apply Con0 to the same length-n source with seeds seed0 and seed1, and concatenate the two length-ρn outputs. Con: seed error O(ε²); condensation 2ρ; seed d+O(log(1/ε)); source error δ; entropy loss Δ+1.

17 Serial composition
Con0: seed error ε; condensation ρ; seed d; source error δ; entropy loss Δ = k+d−k′. Apply Con0 with seed0, then apply Con0 to the result with seed1. Con: seed error O(ε); condensation ρ²; seed 2d; source error δ(1+1/ρ); entropy loss 2Δ.

18 Repeated condensing revisited
Start with Con0 w/ constant seed error ε0 ≤ 1/18; constant condensation ρ; constant seed length d; (source error δ0; entropy loss Δ0). Alternate parallel and serial composition loglog(n/k) + O(1) times. ⇒ Con w/ seed error ε; condenses to O(k) bits; [optimal] seed length d=O(log(n/k)); (source error polylog(n/k)·δ0; entropy loss O(log(n/k))·Δ0). Home? Not so fast …

19 2nd Challenge: No Such Explicit Lossless Condensers
Previous condensers with constant seed length: a (k,ε)-condenser w/ n′=n−O(1) [CRVW02]; a (k,Ω(k),ε)-condenser w/ n′=n/100 for k=Ω(n) [Vad03]. Here: a (k,Ω(k),ε)-condenser w/ n′=n/100 for any k (see them later). Still not lossless! The challenge persists …

20 Win-Win Condensers [RSW00]
Assume Con is a (k,Ω(k),ε)-condenser. Then for every k-source X, we are in one of two good cases: (1) Con(X,Y) contains almost k bits of randomness ⇒ Con is almost lossless. (2) X still has some randomness even conditioned on Con(X,Y) ⇒ (Con(X,Y), X) is a "block source" [CG85]. Good extractors for block sources are already known (based on [NZ93,…]), so Ext(Con(X,Y), X) is uniform on Ω(k) bits.

21 Win-Win under composition
More generally: (Con,Som) is a win-win condenser if for every k-source X, either Con(X,Y) is lossless, or Som(X,Y) is somewhere random: a list of b sources, one of which is uniform on Ω(k) bits. Parallel composition generalized: Con′(X,Y1Y2) = Con(X,Y1) ∘ Con(X,Y2); Som′(X,Y1Y2) = Som(X,Y1) ∪ Som(X,Y2). Serial composition generalized: Con′(X,Y1Y2) = Con(Con(X,Y1),Y2); Som′(X,Y1Y2) = Som(X,Y1) ∪ Som(Con(X,Y1),Y2).

22 Partial Summary
We give a constant-seed-length (k,Ω(k),ε)-condenser w/ n′=n/100 (still to be seen). This implies a lossless win-win condenser with constant seed length. Iterate repeated condensing and (seed-)error reduction loglog(n/k) + O(1) times. The result is a win-win condenser (Con,Som) where Con condenses to O(k) bits and Som produces a "short" list of t sources, one of which is a block source (t can be made as small as log^(c) n).

23 3rd Challenge: Mergers [TaShma96]
Now we have a somewhere-random source X1,X2,…,Xt (one of the Xi is random); t can be made as small as log^(c) n. An extractor for such a source is called a merger [TaS96]. Previous construction: mergers w/ seed length d=O(log t · log n) [TaS96]. Here: mergers w/ seed length d=O(log t) and seed length d=O(t) (both independent of n).

24 New Mergers From LDCs
Example: mergers from Hadamard codes. Input: a somewhere k-source X=X1,X2,…,Xt (one of the Xi is a k-source). Seed: Y is t bits. Define: Con(X,y) = ⊕_{i∈y} Xi (the XOR of the blocks selected by y). Claim: with probability ½ over Y, Con(X,Y) has entropy ≥ k/2. Proof idea: assume w.l.o.g. that X1 is a k-source. For every y, Con(X,y) ⊕ Con(X,y⊕e1) = X1 ⇒ at least one of Con(X,y) and Con(X,y⊕e1) contains entropy ≥ k/2.
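This Hadamard merger is concrete enough to run. The sketch below is illustrative Python (the block width and the adversarial constant blocks are arbitrary choices): it implements Con(X,y) = ⊕_{i∈y} Xi and checks the identity Con(X,y) ⊕ Con(X,y⊕e1) = X1 used in the proof:

```python
import random

random.seed(0)
t, nbits = 4, 16
# A somewhere k-source: X1 is truly random, the rest are constant (adversarial).
X = [random.getrandbits(nbits)] + [0xDEAD] * (t - 1)

def con(X, y):
    """Hadamard merger: XOR the blocks selected by the seed bits of y."""
    out = 0
    for i in range(len(X)):
        if (y >> i) & 1:
            out ^= X[i]
    return out

# Flipping the first seed bit toggles X1's contribution and nothing else, so
# Con(X, y) XOR Con(X, y XOR e1) = X1 for every seed y.
e1 = 1
for y in range(2 ** t):
    assert con(X, y) ^ con(X, y ^ e1) == X[0]
print("identity holds for all", 2 ** t, "seeds")
```

Since the two outputs XOR to the full k-source X1, they cannot both have entropy below k/2, which is the claim.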

25 Old Debt – The Condenser
Promised: a (k,Ω(k),ε)-condenser w/ constant seed length and n′=n/100. Two new simple ways of obtaining them: (1) based on any error-correcting code (gives the best parameters; influenced by [RSW00,Vad03]); (2) based on the new mergers [Ran Raz]. Mergers ⇒ Condensers: let X be a k-source. For any constant t, X=X1,X2,…,Xt is (close to) a somewhere k/t-source ⇒ the Hadamard merger is also a condenser w/ the desired parameters.

26 Some Open Problems
Improved dependence on ε. Possible direction: mergers for t blocks with seed length f(t) + O(log(n/ε)). Getting the right constants: d = log n + O(1); m = k + d − O(1). Possible directions: lossless condensers w/ constant seed; lossless mergers; better locally decodable codes.

27 New Mergers From LDCs
Generally: view the somewhere k-source X=X1,X2,…,Xt ∈ ({0,1}^n)^t as a t×n matrix. Encode each column with a code C: Σ^t → Σ^u. Output a random row of the encoded matrix: d = log u (independent of n).
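A minimal Python sketch of this matrix view, using the Hadamard code as the column code C so that it reproduces the merger of the earlier example (parameters and the specific row check are illustrative assumptions):

```python
import random

random.seed(0)
t, n = 3, 8
# Somewhere k-source as a t x n bit matrix (rows are the blocks X1..Xt).
M = [[random.randrange(2) for _ in range(n)] for _ in range(t)]

def hadamard_encode_column(col):
    """C: {0,1}^t -> {0,1}^(2^t); codeword position y holds the parity <y, col>."""
    u = 2 ** len(col)
    return [sum(((y >> i) & 1) * col[i] for i in range(len(col))) % 2
            for y in range(u)]

# Encode each column; a seed y then selects one row of the encoded matrix,
# so the seed length is d = log(2^t) = t bits, independent of n.
encoded = [hadamard_encode_column([M[i][j] for i in range(t)]) for j in range(n)]

def merger_output(y):
    return [encoded[j][y] for j in range(n)]

# Row index y = 0b001 is the parity of just the first bit of each column,
# i.e. it recovers X1 exactly.
print(merger_output(0b001) == M[0])   # -> True
```

The local decodability of C is what lets a few rows of the encoded matrix recover any block Xi, which drives the entropy argument on the next slide.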

28 New Mergers From LDCs
C: Σ^t → Σ^u is a (q,ε) locally decodable erasure code if: for any fraction ε of non-erased codeword symbols S, and any message position i, the ith message symbol can be recovered using q codeword symbols from S. Using such a C, the above mergers essentially turn a somewhere k-source into a k/q-source with probability at least 1−ε.

29 New Mergers From LDCs
Hadamard mergers w/ smaller error: seed length d=O(t·log(1/ε)); transform a somewhere k-source X1,X2,…,Xt into a (k/2, ε)-source. Reed-Muller mergers w/ smaller error: seed length d=O(√t·log(1/ε)); transform a somewhere k-source X1,X2,…,Xt into an (Ω(k), ε)-source. Note: the seed length doesn't depend on n! Efficient enough to obtain the desired extractors.

30 Error Reduction
Main Extractor: any k, m=.99k, d=O(log n). Caveat: constant ε. Error reduction for extractors [RRV99] is not efficient enough for this case. Our new mergers + [RRV99,RSW00] give improved error reduction ⇒ various new extractors for general ε: any k, m=.99k, d=O(log n), ε=exp(−log n / log^(c) n); any k, m=.99k, d=O(log n·(log* n)² + log(1/ε)).

31 Source vs. Seed Error Cont.
Defining bad inputs for Con: {0,1}^n → {0,1}^n′ on a (k+a)-source X. A "heavy" output z: Pr[Con(X,Y)=z] > 2^-k′ ⇒ # heavy outputs < 2^k′. A bad input x: Pr[Con(x,Y) is heavy] > 2ε ⇒ # bad inputs < 2^k (a fraction 2^-a of the source).

32 Source vs. Seed Error Conclusion
More formally: for a (k,k′,ε)-condenser Con and a (k+log(1/δ))-source X, there is a set G of "good" pairs (x,y) s.t.: for a 1−δ density of x ∈ X, Pr[(x,Y)∈G] > 1−2ε, and Con(X,Y) | (X,Y)∈G is a (k′−log(1/ε))-source. ⇒ We can differentiate the source error δ from the seed error ε. Source error is free in seed length! ⇒ (Con(x,y1),Con(x,y2)) has source error δ and seed error O(ε²). We don't need random y1,y2 (an expander edge suffices).

