Randomness Extraction: A Survey David Zuckerman University of Texas at Austin
Randomness in Computer Science Many uses of randomness in CS. Randomized algorithms Cryptography Distributed computing But: high-quality randomness expensive. Can low-quality (weak) randomness suffice?
Models for Weak Randomness. Independent bits with same, unknown bias [von Neumann ’51]. Semirandom sources [Santha-Vazirani ’84]: δ < Pr[X_i = 1 | X_1=x_1,…,X_{i-1}=x_{i-1}] < 1-δ. Block sources [Chor-Goldreich ’85]. Bit-fixing sources [CFGHRS ’85,…]: k uniform bits; others set by adversary.
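A minimal sketch of von Neumann's trick for the first model (independent bits with the same unknown bias), assuming the input is a Python list of 0/1 values; the function name is illustrative. Bits are read in pairs: 01 yields 0, 10 yields 1, and 00 or 11 is discarded, so both outputs occur with the same probability p(1-p).

```python
def von_neumann_extract(bits):
    """Turn independent, identically biased bits into unbiased bits."""
    out = []
    # Walk over disjoint pairs (bits[0], bits[1]), (bits[2], bits[3]), ...
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:
            # Pr[01] = Pr[10] = p(1-p), so the first bit of the pair is unbiased.
            out.append(a)
    return out
```

The output length is random: on average only a p(1-p) fraction of the pairs survives.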
General Weak Random Source [Z ’90]. Random variable X on {0,1}^n. General model: min-entropy H∞(X) = min_x log(1/Pr[X=x]), so H∞(X) ≥ k means no string has probability above 2^{-k}. Flat source: uniform on a set A ⊆ {0,1}^n with |A| ≥ 2^k.
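A small sketch of the min-entropy definition, assuming a distribution is represented as a dict from outcomes to probabilities (a representation chosen here for illustration). A flat source on a set of size 2^k has min-entropy exactly k.

```python
import math

def min_entropy(dist):
    """H_inf(X) = -log2(max_x Pr[X = x]) for a dict outcome -> probability."""
    return -math.log2(max(dist.values()))

def flat_source(A):
    """Uniform distribution on the set A."""
    A = list(A)
    return {x: 1.0 / len(A) for x in A}
```

Min-entropy is governed by the single most likely outcome: the distribution {0: 1/2, 1: 1/4, 2: 1/4} has min-entropy 1 even though its Shannon entropy is 1.5.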
General Weak Random Source [Z ‘90] Can arise in different ways: Physical source of randomness. Cryptography: condition on adversary’s information, e.g. bounded storage model. Pseudorandom generators (for space s machines): condition on TM configuration.
Goal: Extract Randomness. Ext: {0,1}^n → {0,1}^m maps an n-bit weak source to m bits that are close to uniform, with statistical error ε. Problem: Impossible, even for k=n-1, m=1, ε<1/2.
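For reference, the statistical error is the statistical (total variation) distance between the output distribution and uniform. A minimal sketch, assuming distributions are dicts from outcomes to probabilities:

```python
def statistical_distance(p, q):
    """Half the L1 distance between two distributions given as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(z, 0.0) - q.get(z, 0.0)) for z in support)
```

A constant bit is at distance 1/2 from a uniform bit, which is why ε < 1/2 is the threshold in the impossibility statement.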
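Impossibility Proof. Suppose f:{0,1}^n → {0,1} satisfies: for all sources X with H∞(X) ≥ n-1, f(X) ≈ U. One of f^{-1}(0), f^{-1}(1) has size at least 2^{n-1}; say it is f^{-1}(0). Take X uniform on f^{-1}(0). Then H∞(X) ≥ n-1, but f(X) is constant, a contradiction.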
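The argument can be checked by brute force for small n. The sketch below (function name illustrative) takes any candidate f on n-bit tuples and returns a flat source of min-entropy at least n-1 on which f is constant.

```python
from itertools import product

def constant_source(f, n):
    """Return (b, A): f is constantly b on A, and |A| >= 2^(n-1)."""
    preimage = {0: [], 1: []}
    for x in product((0, 1), repeat=n):
        preimage[f(x)].append(x)
    # The larger preimage has at least half of all 2^n strings.
    b = 0 if len(preimage[0]) >= len(preimage[1]) else 1
    return b, preimage[b]
```

For example, with f the parity of 4 bits, the returned set has 8 strings, all of even parity.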
Randomness Extractor: short seed [Nisan-Z ’93,…, Guruswami-Umans-Vadhan ’07]. Ext takes an n-bit source X and a uniformly random seed Y of d = O(log(n/ε)) bits, and outputs m = .99k bits with statistical error ε. Strong extractor: (Ext(X,Y),Y) ≈ Uniform.
Outline. Seeded Extractors: basic applications; alternate view with applications; sketch of two constructions. Seedless Extractors for Structured Sources: algebraic sources (independent, affine, …); applications in cryptography; complexity-theoretic sources. Crypto-Tailored Extractors.
Simulating Randomized Algorithms. Randomized algorithm R using m random bits. Assume the only available random bits X have H∞(X) ≥ k > m; no high-quality randomness available. Given: Ext for H∞(X) ≥ k with seed length d, output length m. Simulation with factor 2^d blowup: run R with random strings Ext(x,y_1),…,Ext(x,y_{2^d}), one per seed. Take majority vote or median.
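The simulation can be sketched as follows. The extractor is passed in as a function; `toy_ext` is a hypothetical placeholder used only to make the example runnable and is not an extractor in any proven sense.

```python
from collections import Counter

def simulate(R, ext, x, d):
    """Run R once per seed y in {0,1}^d on ext(x, y); return the majority answer."""
    outputs = [R(ext(x, y)) for y in range(2 ** d)]
    return Counter(outputs).most_common(1)[0][0]

def toy_ext(x, y):
    # Hypothetical stand-in mapping (sample, seed) to a 3-bit string.
    return (x * (2 * y + 1)) % 8
```

With an R that errs only on the random string 5, `simulate(lambda r: r != 5, toy_ext, 3, 2)` evaluates R on the strings 3, 1, 7, 5 and the majority vote is still correct. For numeric outputs, the median plays the role of the majority vote.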
Use in Privacy Amplification [Bennett, Brassard, Robert 1985]. Goal: convert weak shared secret X to uniform secret. Unbounded passive adversary. Pick seed Y and send it over the public channel. Shared secret = Ext(X,Y). Correct by strong extractor definition.
PRGs for Space-Bounded Machines. Basic PRG: G(x,y) = (x,Ext(x,y)) [Nisan-Z]. Condition on configuration v after reading x. Whp over v, X conditioned on v retains high min-entropy. Hence whp Ext(X,Y) is close to uniform. G:{0,1}^{O(s)} → {0,1}^{poly(s)} fools space s TMs [Nisan-Z]. Sometimes can avoid union bound! O(log n log log n) bit seed fools read-once polylog-width “regular” BPs [BRRY ’10, BV ’10]. O(log n) bit seed fools read-once O(1)-width permutation BPs [KNP]. BRRY: Braverman-Rao-Raz-Yehudayoff. BV: Brody-Verbin. KNP: Koucky-Nimbhorkar-Pudlak.
PRGs from Shrinkage. Hardness vs. Randomness paradigm: lower bounds give PRGs [Nisan-Wigderson,…]. But: need superpolynomial lower bounds. Known: polynomial lower bounds for restricted models. E.g., formulas: Ω(n^3/polylog n) [Andreev, Hastad]. [Impagliazzo, Meka, Z 2012]: polynomial lower bounds proved via shrinkage give PRGs. E.g., seed length s^{1/3+o(1)} fools size s formulas.
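Graph-Theoretic View: “Expansion”. Bipartite graph with N=2^n left vertices, M=2^m right vertices, and left degree D=2^d: x is joined to Ext(x,y) for each seed y. Every set of K=2^k left vertices has at least (1-ε)M neighbors (output ≈ uniform). Can use this to construct expanders beating eigenvalue bound [WZ].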
K-Expanding Graphs. Graph on N vertices in which every set A with |A| ≥ K has |Γ(A)| > N-K. Useful for sorting, networks. Goal: minimize degree D. Need D > N/K. Random graphs: D = O((N/K) log(N/K)). 2nd eigenvalue: D ≥ (N/K)^2/2. Extractors: D = N^{1+o(1)}/K [Wigderson-Z ’93].
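Extractors ⇒ K-Expanding Graphs. In Ext, every set of K left vertices has at least (1-ε)M neighbors among the M right vertices, so for ε < 1/2 any two such sets share a neighbor. K-Expanding Graph: V=[N], E = paths of length 2 in Ext.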
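Alternate View. N=2^n, M=2^m, D=2^d. For a set S of outputs, BAD_S is the set of inputs x for which the fraction of seeds y with Ext(x,y) in S differs from |S|/M by more than ε. An extractor has small BAD_S for every S. Other direction: Error_S ≤ |BAD_S|·2^{-k} + ε.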
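The "other direction" bound can be derived in one line. The sketch below assumes the standard definition BAD_S = {x : |Pr_y[Ext(x,y) ∈ S] − μ(S)| > ε} with μ(S) = |S|/M, and a source X with H∞(X) ≥ k, so that each x has probability at most 2^{-k}.

```latex
\begin{align*}
\mathrm{Error}_S
  &= \bigl|\Pr[\mathrm{Ext}(X,Y)\in S] - \mu(S)\bigr| \\
  &\le \sum_{x} \Pr[X=x]\,\bigl|\Pr_y[\mathrm{Ext}(x,y)\in S] - \mu(S)\bigr| \\
  &\le \Pr[X\in \mathrm{BAD}_S]\cdot 1 \;+\; \Pr[X\notin \mathrm{BAD}_S]\cdot\varepsilon \\
  &\le |\mathrm{BAD}_S|\,2^{-k} + \varepsilon .
\end{align*}
```

The second step splits the sum into x inside BAD_S, where the deviation is at most 1, and x outside it, where the deviation is at most ε by definition.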
Averaging Sampler via Alternate View [Z ’96]. Goal: estimate the mean μ of f:{0,1}^m → [0,1], given black box access to f. Algorithm: pick x randomly in {0,1}^n. Sample f at Γ(x) = {x_1,…,x_D}. Output the sample mean of f over Γ(x). Pr[error] = |BAD_f|/2^n. Can use 1.01m random bits with Pr[error] = 2^{-Ω(m)}.
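A sketch of the sampler and of the set BAD_f, computed by brute force. The neighbor function `toy_gamma` is a hypothetical placeholder (not derived from a real extractor), and integers stand in for bit strings.

```python
def sampler_estimate(f, gamma, x):
    """Average f over the sample points gamma(x) = [x_1, ..., x_D]."""
    points = gamma(x)
    return sum(f(z) for z in points) / len(points)

def bad_set(f, gamma, N, M, eps):
    """All x in [N] whose estimate misses the true mean of f on [M] by more than eps."""
    mu = sum(f(z) for z in range(M)) / M
    return [x for x in range(N) if abs(sampler_estimate(f, gamma, x) - mu) > eps]

def toy_gamma(x):
    # Hypothetical neighbor function with D = 4 sample points in [16].
    return [(x + i * i) % 16 for i in range(4)]
```

Since x is uniform, the error probability is `len(bad_set(...)) / N`, matching Pr[error] = |BAD_f|/2^n.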
Extractor Perspective Helps. Proposition: sampler using O(m) random bits implies sampler using 1.01m random bits. Equivalent statement: extractor outputting Ω(k) bits implies extractor outputting .99k bits. Ext(x,(y_1,y_2)) = Ext(x,y_1) followed by Ext(x,y_2) [Wigderson-Z]. Conditioned on Ext(X,y_1) of length m, still ≈ k-m bits of entropy in X.
Extractor Codes via Alt-View [Ta-Shma-Z 2001]. List recovery: generalizes list decoding. S=(S_1,…,S_D), agreement = |{i : x_i in S_i}|. |{Codewords with agreement ≥ (μ(S) + ε)D}| ≤ |BAD_S|. Extractor codes with efficient decoding give hardcore bits Ext(x,y) wrt 1-way (f(x),y). Codes and extractors: [Tre, TZS, SU, GUV].
Max Clique and Chromatic Number. [FGLSS,…,Hastad]: Max Clique inapproximable to n^{1-ε}, any ε>0, assuming NP ≠ ZPP. [LY,…,FK]: Same for Chromatic Number. Derandomize with linear degree extractors: Thm [Z]: Both inapproximable to n^{1-ε}, any ε>0, assuming NP ≠ P.
Constructions of Strong Extractors
Construction | Restrictions | Degree D=2^d | Output length m
Existence | None | (n-k)/ε^2 | k - 2lg(1/ε)
Leftover Hash Lemma [ILL] | | 2^n |
GUV 2007 | | (n/ε)^{O(1)} | (1-α)k
GUV 2007 | | n^{O(log(k/ε))} | k - 2lg(1/ε) - O(1)
DKSS 2009 | ε ≥ 1/log^c n | n^{O(1)} | (1-1/log^c n)k
Z 2006 | k=Ω(n), ε=Ω(1) | O(n) |
DKSS: Dvir-Kopparty-Saraf-Sudan
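To illustrate the Leftover Hash Lemma entry, here is a brute-force check on a tiny example. It assumes a specific universal family chosen for simplicity, namely random m×n matrices over GF(2), which uses mn seed bits rather than the n bits in the table. The lemma bounds the strong-extractor error of a universal family on a flat source of size 2^k by (1/2)·sqrt(2^m/2^k).

```python
from itertools import product

def hash_ext(x, rows):
    """Ext(x, M) = M x over GF(2); x and each row of M are n-bit integers."""
    return tuple(bin(r & x).count("1") % 2 for r in rows)

def strong_extractor_error(A, n, m):
    """Average over all seeds M of the distance between M x (x uniform on A) and uniform."""
    seeds = list(product(range(2 ** n), repeat=m))
    total = 0.0
    for rows in seeds:
        counts = {}
        for x in A:
            z = hash_ext(x, rows)
            counts[z] = counts.get(z, 0) + 1
        total += 0.5 * sum(abs(counts.get(z, 0) / len(A) - 2 ** -m)
                           for z in product((0, 1), repeat=m))
    return total / len(seeds)
```

For n=4, m=1 and the flat source A = {0,…,7} (k=3), the measured error is 0.0625, below the bound (1/2)·sqrt(2/8) = 0.25.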
Pseudorandom Generators. A PRG stretches a short random seed into a longer pseudorandom string. Cryptographically secure PRGs: run in time less than adversary. Exist iff one-way functions exist [HILL]. PRGs for derandomization: can take slightly more time than adversary. Exist iff “hard” functions exist [Nisan-Wigderson ...].
PRGs from Hard Functions [Nisan-Wigderson 1988 …]. [Figure: the PRG maps a random seed to an output with computational error ε.]
NW-Style PRGs Give Extractors [Trevisan 1999]. Ext takes the n-bit source x and a seed y, with statistical error ε. View x as the truth table of a hard function f:{0,1}^{lg n} → {0,1}. Most functions are hard. Set Ext(x,y) = NW-PRG(f,y). Better: Ext(x,y) = NW-PRG(Code(f),y).
Linear Degree Extractor [Z] (Sketch). Condense: entropy rate δ → rate .9, using O(1) random bits. Extract: rate .9 → uniform, using lg n + O(1) random bits.
Condensing via Incidence Graph. Points P = F_q^2, L = lines. (L,P) an edge iff P on L. |P|^{3/2} edges. 1-Bit Somewhere Condenser: Input: edge. Output: random endpoint. Condenses rate δ to rate (1+α)δ, some α > 0. Proof uses bound on incidences [BKT] + probabilistic lemma. Combine with technique of [Raz] to get actual condenser.
High Entropy Extractor Chernoff bound for random walks on expanders [Gillman,Kahale] Implies Sampler Implies Extractor.
Seeded Extractor Techniques/History. Hashing based: Z ’90-91, Nisan-Z ’93, Wigderson-Z ’93, Srinivasan-Z ’94, Z ’96, Ta-Shma ’96, Raz-Reingold-Vadhan ’99, Reingold-Shaltiel-Wigderson ’00. NW-PRG based: Trevisan ’99, Raz-Reingold-Vadhan ’99, Impagliazzo-Shaltiel-Wigderson ’99-00, Ta-Shma-Umans-Z ’01. Algebraic/coding theory based: Ta-Shma-Z-Safra ’01, Shaltiel-Umans ’01, Lu-Reingold-Vadhan-Wigderson ’03, Guruswami-Umans-Vadhan ’07, Ta-Shma-Umans ’12. Additive combinatorics based: Barak-Kindler-Shaltiel-Sudakov-Wigderson ’05, Raz ’05, Z ’07, Dvir-Wigderson ’08, Dvir-Kopparty-Saraf-Sudan ’09.
Seedless (Deterministic) Extractors for Structured Sources. Probabilistic Method: If ≤ sources of min-entropy k: can deterministically extract m=(1-α)k bits with error 2^{-αk/3}. Algebraic sources: bit-fixing, affine. Independent sources. Complexity-theoretic sources: AC0 sources, small-space sources.
Oblivious Bit-Fixing Sources. Example: ?0010?111??11, where ? = uniform on {0,1}. (n-k) bits fixed by adversary; k uniform bits. Parity extracts 1 bit. For k ≥ log^c n, can extract k-o(k) bits [GRS, Rao]. Application: Exposure Resilient Cryptography. Adversary learns many bits of secret key. Can still do cryptography. GRS: Gabizon-Raz-Shaltiel.
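The claim that parity extracts one bit can be checked exhaustively on the example pattern. A minimal sketch (function name illustrative) that enumerates every completion of the "?" positions and counts how often the parity is 0 or 1:

```python
from itertools import product

def parity_counts(pattern):
    """Return [#completions with even parity, #completions with odd parity]."""
    free = [i for i, c in enumerate(pattern) if c == "?"]
    counts = [0, 0]
    for fill in product("01", repeat=len(free)):
        s = list(pattern)
        for i, b in zip(free, fill):
            s[i] = b
        counts[s.count("1") % 2] += 1
    return counts
```

As soon as one position is uniform, flipping it flips the parity, so the two counts are equal no matter how the adversary fixes the rest.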
Affine Extractors. X = random element from affine subspace. Generalizes bit-fixing sources. Extractor for min-entropy αn, any α>0 [Bourgain]. 1-bit disperser for min-entropy exp(log^{.9} n) [Shaltiel]. Large fields: any k>0 [Gabizon-Raz].
Independent Sources. [Figure: Ext takes independent n-bit sources and outputs m = Ω(k) bits with statistical error ε.]
Classical: entropy rate > 1/2. Lindsey Lemma: H∞(X) + H∞(Y) > n+t implies X·Y ≈ U, with error 2^{-t/2}.
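A brute-force sketch of the inner-product extractor on flat sources, with n-bit strings encoded as integers. It uses the usual form of Lindsey's lemma: for sets A, B of n-bit strings, |Σ (-1)^{x·y}| ≤ sqrt(|A||B|2^n), so the bias of X·Y is at most sqrt(2^n/(|A||B|)). Function names are illustrative.

```python
import math

def inner_product(x, y):
    """Inner product mod 2 of two bit strings encoded as integers."""
    return bin(x & y).count("1") % 2

def ip_bias(A, B):
    """|Pr[X.Y = 0] - Pr[X.Y = 1]| for X uniform on A, Y uniform on B."""
    total = sum((-1) ** inner_product(x, y) for x in A for y in B)
    return abs(total) / (len(A) * len(B))

def lindsey_bound(n, size_a, size_b):
    """Upper bound on the bias given by Lindsey's lemma."""
    return math.sqrt(2 ** n / (size_a * size_b))
```

With n=4, A={0,…,7}, B={8,…,15}, the bias is 0.125 against a bound of 0.5. The statistical distance from uniform is half the bias.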
Independent Sources: constructions by number of sources and min-entropy k=H∞(X). Two sources: Existence, k ≥ 2 log n, no restrictions; Bourgain, k ≥ .499n; BRSW, k ≥ n^α, disperser only. Three sources: Li, k ≥ n^{1/2+α}; Rao-Z, uneven lengths. O(1) sources: k ≥ log^3 n. BRSW: Barak-Rao-Shaltiel-Wigderson.
Cryptography with Weak Sources Players have independent weak sources. Allow Byzantine faults. For 2 players, impossible [DOPS]. For more players, possible! DOPS: Dodis-Ong-Prabhakaran-Sahai DO: Dodis-Oliveira GSV: Goldwasser-Sudan-Vaikuntanathan
Network Extractor Protocol [Goldwasser-Sudan-Vaikuntanathan ’05, Dodis-Oliveira ’03]. Input: x_1,…,x_p ∈ {0,1}^n from independent weak random sources. p processors communicate via point-to-point channels. An unknown set of t processors is faulty; faults are Byzantine, so faulty processors can send arbitrary messages. Channels are not private: the adversary sees all communication (the “full information model”). This talk covers only the synchronous setting. Output: z_1,…,z_p ∈ {0,1}^m, private nearly-uniform random strings (for honest parties).
Network Extractor Protocols After running network extractor protocol, run standard protocol, e.g., Byzantine Agreement. Naïve idea to design protocol: A few players broadcast sources. Remaining players apply independent-source extractor to those sources and own source. Problem: what if only malicious players broadcast?
Network Extractor Constructions. Information-theoretic setting [Kalai-Li-Rao-Z]: for k ≥ exp(log^α n), any α>0, can still tolerate a linear number of faults in BA and leader election. Computational setting [Kalai-Li-Rao]: under certain crypto assumptions, for k = αn, secure multiparty computation if ≥ 2 honest players; under certain crypto assumptions, 2-source extractors for k = αn, any α>0.
Complexity-Theoretic Sources X=f(U), complexity(f) small. Deterministic extraction possible under assumptions [Trevisan-Vadhan ‘00]. No assumptions: NC0 [De-Watson ‘11, Viola ‘11] AC0 [Viola ‘11] Proofs reduce to low-weight affine extractors [Rao ‘09].
Small Space Sources. Space s source: min-entropy k source generated by width 2^s branching program. [Figure: branching program with n+1 layers and width 2^s; each edge is labeled with a transition probability and an output bit, e.g. (0.8, 1).]
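Bit-fixing sources can be modelled by space 0 sources. [Figure: width-1 branching program for the pattern ? 1 ? ? 0 1; each uniform position has two edges labeled (0.5, 0) and (0.5, 1), and each fixed position has a single edge labeled (1, b) for its fixed bit b.]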
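A sketch of a small-space source as a layered branching program, together with the width-1 program for a bit-fixing pattern. The data layout is an assumption made for illustration: each layer maps a state to a list of (probability, output bit, next state) edges.

```python
def source_distribution(layers, start=0):
    """Exact output distribution of a branching-program source."""
    dist = {("", start): 1.0}  # (output so far, current state) -> probability
    for layer in layers:
        nxt = {}
        for (out, state), p in dist.items():
            for prob, bit, to in layer[state]:
                key = (out + str(bit), to)
                nxt[key] = nxt.get(key, 0.0) + p * prob
        dist = nxt
    result = {}
    for (out, _), p in dist.items():
        result[out] = result.get(out, 0.0) + p
    return result

def bit_fixing_layers(pattern):
    """Width-1 (space 0) program: '?' is a fair coin, '0'/'1' is a fixed bit."""
    layers = []
    for c in pattern:
        if c == "?":
            layers.append({0: [(0.5, 0, 0), (0.5, 1, 0)]})
        else:
            layers.append({0: [(1.0, int(c), 0)]})
    return layers
```

For the pattern "?1??01" the program outputs 8 strings, each with probability 1/8, so the source has min-entropy 3.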
Extractors for Small Space Sources For k ≥ αn, any α>0, space αβn, β>0 sufficiently small, can extract k-o(k) bits [Kamp-Rao-Vadhan-Z ‘06]. Proof reduces to variants of independent sources by conditioning on intermediate states.
Crypto-Tailored Extractors. Fuzzy extractors: noise tolerant [Dodis-Ostrovsky-Reyzin-Smith ’04]. Correlation extractors [Ishai-Kushilevitz-Ostrovsky-Sahai ’09]. Non-malleable extractors [Dodis-Wichs ’09].
Privacy Amplification With Active Adversary. Pick seed Y and send it over the public channel. Shared secret = Ext(X,Y). Problem: Active adversary could change Y to Y’.
Active Adversary Can arbitrarily insert, delete, modify, and reorder messages. E.g., can run several rounds with one party before resuming execution with other party.
Non-Malleable Extractor [Dodis-Wichs 2009]. Strong extractor: (Ext(X,Y),Y) ≈ (U,Y). nmExt is a non-malleable extractor if for arbitrary A:{0,1}^d → {0,1}^d with y’ = A(y) ≠ y: (nmExt(X,Y),nmExt(X,Y’),Y) ≈ (U,nmExt(X,Y’),Y). Can’t ignore a bit of the seed. Existence: k > log log n + c, d = log n + O(1), m = (k-log d)/2.01. Gives privacy amplification with active adversary in 2 rounds with optimal entropy loss.
Explicit Non-Malleable Extractor. Even k=n-1, m=1 nontrivial. E.g., Ext(x,y) = x·y fails: for X=0??...? and y’=A(y) flipping the first bit, x·y’ = x·y. Dodis-Li-Wooley-Z 2011: H∞(X) > n/2. Cohen-Raz-Segev 2012: seed length O(log n). Li 2012: H∞(X) > .499n. Connection with 2-source extractors.
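The inner-product counterexample can be verified exhaustively for small n. In this sketch (names illustrative) strings are integers and the "first bit" is the most significant of the n bits.

```python
def inner_product(x, y):
    """Inner product mod 2 of two bit strings encoded as integers."""
    return bin(x & y).count("1") % 2

def flip_first_bit(y, n):
    """The adversary's map A: flip the first (most significant) seed bit."""
    return y ^ (1 << (n - 1))

def adversary_wins(n):
    """True if x.y == x.A(y) for every x in the source 0??...? and every seed y."""
    source = range(2 ** (n - 1))  # all n-bit x whose first bit is 0
    return all(inner_product(x, y) == inner_product(x, flip_first_bit(y, n))
               for x in source for y in range(2 ** n))
```

Here the source has min-entropy n-1, yet the output on the modified seed reveals the output on the real seed exactly, so the pair is far from (U, nmExt(X,Y’)).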
A Simple 1-Bit Construction [Li]. Sidon set: set S with all sums s+t, s,t in S, distinct. Example: S={(x,x^3) | x in F_{2^{n/2}}}. Thm [Li]: f(x,y) = x·y, y uniform from S, is a non-malleable extractor for H∞(X) > n/2. Proof: H∞(Y) = n/2, so X·Y ≈ U (Lindsey’s lemma). Suffices to show X·Y+X·A(Y) ≈ U (XOR lemma). X·Y+X·A(Y) = X·(Y+A(Y)). H∞(Y+A(Y)) = H∞(Y) = n/2.
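The Sidon property of S = {(x, x^3)} can be checked by brute force in a small field. The sketch assumes n/2 = 4 and represents F_16 as polynomials over GF(2) modulo the irreducible polynomial x^4 + x + 1 (one standard choice). Addition is XOR, and "all sums distinct" is read as distinct over unordered pairs of distinct elements.

```python
from itertools import combinations

def gf_mul(a, b, k=4, poly=0b10011):
    """Multiply a and b in GF(2^k), reducing modulo poly (default x^4 + x + 1)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> k:
            a ^= poly
    return r

def sidon_candidate(k=4):
    """S = {(x, x^3)}, each pair packed into one 2k-bit integer (default field only)."""
    return [(x << k) | gf_mul(x, gf_mul(x, x)) for x in range(2 ** k)]

def is_sidon(S):
    """True if the sums s + t over unordered pairs of distinct elements are all distinct."""
    sums = [s ^ t for s, t in combinations(S, 2)]
    return len(sums) == len(set(sums))
```

`is_sidon(sidon_candidate())` checks all 120 pairwise sums of the 16 elements. To use another field size, pass a matching irreducible polynomial to `gf_mul`.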
Proof Via Character Sum Estimate. For m=1, we show a character sum estimate. For larger m, consider (χ,χ’) with χ’ nontrivial. Give “non-uniform” XOR lemma: nmExt(x,A(y)) need not be uniform.
Conclusions. Extractors connect to crypto, expanders, coding theory, inapproximability, and PRGs. Interesting mathematics used in constructions: additive combinatorics, coding theory, random walks on expander graphs, hashing, …
Open Questions. Seeded Extractors: O(n) degree for all min-entropy; O(log n) seed to extract k - 2log(1/ε) - O(1). Seedless Extractors: 2-source extractors for min-entropy αn, any α>0; affine extractors for min-entropy n^α; other general models. Crypto-Tailored Extractors: non-malleable extractors for min-entropy αn. Other Applications & Connections.