Pseudo-randomness
Randomized complexity classes
Model: probabilistic Turing machine
– deterministic TM with an additional read-only tape containing “coin flips”
BPP (Bounded-error Probabilistic Poly-time)
– L ∈ BPP if there is a p.p.t. TM M such that:
  x ∈ L ⇒ Pr_y[M(x,y) accepts] ≥ 2/3
  x ∉ L ⇒ Pr_y[M(x,y) rejects] ≥ 2/3
– “p.p.t.” = probabilistic polynomial time
Randomized complexity classes
RP (Random Polynomial-time)
– L ∈ RP if there is a p.p.t. TM M such that:
  x ∈ L ⇒ Pr_y[M(x,y) accepts] ≥ ½
  x ∉ L ⇒ Pr_y[M(x,y) rejects] = 1
coRP (complement of Random Polynomial-time)
– L ∈ coRP if there is a p.p.t. TM M such that:
  x ∈ L ⇒ Pr_y[M(x,y) accepts] = 1
  x ∉ L ⇒ Pr_y[M(x,y) rejects] ≥ ½
Randomized complexity classes
One more important class: ZPP (Zero-error Probabilistic Poly-time)
– ZPP = RP ∩ coRP
– Pr_y[M(x,y) outputs “fail”] ≤ ½
– otherwise M outputs the correct answer
OPEN: is BPP = NEXP?
[Figure: inclusion diagram relating P, RP, coRP, NP, coNP, BPP, and PSPACE]
Theorem (Adleman): BPP ⊆ P/poly
BPP and Boolean circuits
Proof:
– let L ∈ BPP
– error reduction gives a TM M such that:
  if x ∈ L has length n: Pr_y[M(x,y) accepts] ≥ 1 - (½)^{n^2}
  if x ∉ L has length n: Pr_y[M(x,y) rejects] ≥ 1 - (½)^{n^2}
BPP and Boolean circuits
– say “y is bad for x” if M(x,y) gives the incorrect answer
– for fixed x: Pr_y[y is bad for x] ≤ (½)^{n^2}
– union bound over all 2^n inputs of length n: Pr_y[y is bad for some x] ≤ 2^n (½)^{n^2} < 1 (for n ≥ 2)
– conclude: there exists some y on which M(x,y) is correct for every x of length n
– build a circuit for M and hardwire this y
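The counting step in this proof is a union bound. Written out as a worked inequality (the restriction n ≥ 2 is added here so that the final bound is strict):

```latex
\Pr_y\bigl[\exists\, x \in \{0,1\}^n : y \text{ is bad for } x\bigr]
  \;\le\; \sum_{x \in \{0,1\}^n} \Pr_y\bigl[\,y \text{ is bad for } x\,\bigr]
  \;\le\; 2^n \cdot 2^{-n^2}
  \;=\; 2^{\,n - n^2} \;<\; 1 \qquad (n \ge 2).
```

Since the probability of a bad y is strictly below 1, at least one good y exists for each input length n, and that y is the advice string hardwired into the circuit.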
Next: further explore the relationship between randomness and nonuniformity
Main tool: pseudo-random generators
BMY: definition of pseudorandomness
A generator G is pseudorandom if:
– G is efficiently computable
– G “stretches” t bits into m bits
– G “fools” small circuits: for all circuits C of size at most s:
  |Pr_y[C(y) = 1] - Pr_z[C(G(z)) = 1]| ≤ ε
  (y uniform in {0,1}^m, z uniform in {0,1}^t)
Historical pseudo-random generators
Linear-congruential generators:
– X_0 = random seed mod n
– X_{i+1} = a·X_i + b mod n
– b_i = m.s.b.(X_i) {rightmost half of bits}
Blum-Micali [84]:
– X_0 = random seed mod p
– X_{i+1} = g^{X_i} mod p
– b_i = m.s.b.(X_i)
Historical pseudo-random generators
Blum-Blum-Shub:
– X_0 = random seed mod n, where n = pq and p ≡ q ≡ 3 (mod 4)
– X_{i+1} = (X_i)^2 mod n
– b_i = l.s.b.(X_i)
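The three historical recurrences can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, the output bit is taken from each state X_i starting with X_0, and the tiny parameters in the usage note are toy values with no security whatsoever.

```python
def _msb(x, modulus):
    """Most significant bit of x, viewed as a string of (modulus-1).bit_length() bits."""
    nbits = (modulus - 1).bit_length()
    return (x >> (nbits - 1)) & 1

def lcg_bits(seed, a, b, n, count):
    """Linear-congruential: X_{i+1} = a*X_i + b mod n, output m.s.b.(X_i)."""
    x = seed % n
    out = []
    for _ in range(count):
        out.append(_msb(x, n))
        x = (a * x + b) % n
    return out

def blum_micali_bits(seed, g, p, count):
    """Blum-Micali: X_{i+1} = g^{X_i} mod p, output m.s.b.(X_i)."""
    x = seed % p
    out = []
    for _ in range(count):
        out.append(_msb(x, p))
        x = pow(g, x, p)  # modular exponentiation
    return out

def bbs_bits(seed, p, q, count):
    """Blum-Blum-Shub: X_{i+1} = X_i^2 mod pq with p = q = 3 mod 4, output l.s.b.(X_i)."""
    assert p % 4 == 3 and q % 4 == 3, "both primes must be 3 mod 4"
    n = p * q
    x = seed % n
    out = []
    for _ in range(count):
        out.append(x & 1)
        x = (x * x) % n
    return out
```

With toy parameters, `lcg_bits(1, 5, 3, 16, 4)` gives `[0, 1, 1, 1]`, `blum_micali_bits(3, 5, 23, 5)` gives `[0, 0, 0, 0, 1]`, and `bbs_bits(3, 7, 11, 5)` gives `[1, 1, 0, 0, 1]`.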
Derandomization
Goal: try to simulate BPP in subexponential time (or better)
– use a Pseudo-Random Generator (PRG)
– a PRG is “good” if it passes all efficient statistical tests
[Figure: G maps a seed of t bits to an output string of m bits]
Our requirements:
– G is efficiently computable
– G “stretches” t bits into m bits
– G “fools” small circuits: for all circuits C of size at most s:
  |Pr_y[C(y) = 1] - Pr_z[C(G(z)) = 1]| ≤ ε
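For very small t and m the fooling requirement can be checked by brute force. The sketch below computes the gap |Pr_y[C(y) = 1] - Pr_z[C(G(z)) = 1]| exactly for one fixed test C. The names `fooling_gap`, `toy_generator`, and `xor_test` are hypothetical, and the toy generator is deliberately a bad one, chosen to show what failing a test looks like.

```python
from fractions import Fraction
from itertools import product

def fooling_gap(C, G, t, m):
    """Exact |Pr_y[C(y)=1] - Pr_z[C(G(z))=1]|, y uniform on m bits, z uniform on t bits."""
    uniform_hits = sum(C(y) for y in product((0, 1), repeat=m))
    generator_hits = sum(C(G(z)) for z in product((0, 1), repeat=t))
    return abs(Fraction(uniform_hits, 2 ** m) - Fraction(generator_hits, 2 ** t))

def toy_generator(z):
    """Toy 3-bit to 4-bit 'generator': append the XOR of the first two seed bits."""
    return z + (z[0] ^ z[1],)

def xor_test(y):
    """A small test that accepts iff the 4th bit equals the XOR of the first two."""
    return int(y[3] == (y[0] ^ y[1]))

def first_bit_test(y):
    """A test that only looks at the first bit, which the toy generator passes."""
    return y[0]
```

Here `fooling_gap(xor_test, toy_generator, 3, 4)` is 1/2, so the toy generator is not even ε-fooling for ε = 1/6, while `fooling_gap(first_bit_test, toy_generator, 3, 4)` is 0. A real PRG must keep the gap at most ε for every circuit of size at most s, not just for one test.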
Simulating BPP using PRGs
Recall: L ∈ BPP implies there exists a p.p.t. TM M such that:
  x ∈ L ⇒ Pr_y[M(x,y) accepts] ≥ 2/3
  x ∉ L ⇒ Pr_y[M(x,y) rejects] ≥ 2/3
Given an input x:
– convert M into a circuit C(x,y)
– simplification: pad y so that |C| = |y| = m
– hardwire input x to get a circuit C_x:
  Pr_y[C_x(y) = 1] ≥ 2/3 (“yes”)
  Pr_y[C_x(y) = 1] ≤ 1/3 (“no”)
Simulating BPP using PRGs
Use a PRG G with:
– output length m
– seed length t « m
– error ε < 1/6
– fooling size s = m
Compute Pr_z[C_x(G(z)) = 1] exactly:
– evaluate C_x(G(z)) on every seed z ∈ {0,1}^t
– running time: (O(m) + (time for G)) · 2^t
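Simulating BPP using PRGs
Knowing Pr_z[C_x(G(z)) = 1], we can distinguish between two cases:
– “yes”: Pr_z[C_x(G(z)) = 1] ≥ 2/3 - ε > 1/2
– “no”: Pr_z[C_x(G(z)) = 1] ≤ 1/3 + ε < 1/2
[Figure: two number lines marked 0, 1/3, 1/2, 2/3, 1, showing the “yes” and “no” acceptance probabilities shifted by at most ε]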
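The simulation amounts to a loop over all 2^t seeds followed by a comparison with 1/2. A minimal sketch, assuming the circuit C_x and the generator G are given as Python callables on bit tuples; `toy_generator` is a hypothetical stand-in with no pseudorandomness guarantee, used only so the example runs.

```python
from fractions import Fraction
from itertools import product

def acceptance_fraction(circuit, generator, t):
    """Exact Pr_z[C_x(G(z)) = 1], computed by evaluating every seed z in {0,1}^t."""
    accepts = sum(circuit(generator(z)) for z in product((0, 1), repeat=t))
    return Fraction(accepts, 2 ** t)

def derandomized_decide(circuit, generator, t):
    """Answer 'yes' iff the acceptance fraction over all seeds exceeds 1/2.

    This is only correct when the generator eps-fools the circuit with eps < 1/6.
    """
    return acceptance_fraction(circuit, generator, t) > Fraction(1, 2)

def toy_generator(z):
    """Toy stretch from 3 bits to 6 bits (pairwise XORs appended); NOT a real PRG."""
    a, b, c = z
    return (a, b, c, a ^ b, b ^ c, a ^ c)

def mostly_accepting(y):
    """Stand-in for a 'yes' circuit: accepts unless the first two bits are both 0."""
    return y[0] | y[1]

def mostly_rejecting(y):
    """Stand-in for a 'no' circuit: accepts only if the first three bits are all 1."""
    return y[0] & y[1] & y[2]
```

The loop makes 2^t evaluations, each costing the time to run G plus the O(m) circuit evaluation, which is the running time stated above. With the toy inputs, `derandomized_decide(mostly_accepting, toy_generator, 3)` is `True` (fraction 3/4) and `derandomized_decide(mostly_rejecting, toy_generator, 3)` is `False` (fraction 1/8).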
Blum-Micali-Yao PRG
Initial goal: for all 1 > δ > 0, we will build a family of PRGs {G_m} with:
– output length m
– fooling size s = m
– seed length t = m^δ
– running time m^c
– error ε < 1/6
This implies: BPP ⊆ ∩_{δ>0} TIME(2^{n^δ}) ⊊ EXP
Why? With circuit size m = n^k on inputs of length n, the simulation runs in time O(m + m^c)(2^{m^δ}) = O(2^{m^{2δ}}) = O(2^{n^{2kδ}})
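The running-time estimate can be spelled out step by step. The block below is a sketch of the arithmetic under the assumptions that m = n^k for a constant k depending on the BPP machine, and that m is large enough for the polynomial factor to be absorbed into the exponent:

```latex
\begin{aligned}
O(m + m^{c}) \cdot 2^{m^{\delta}}
  &\le m^{c+1} \cdot 2^{m^{\delta}}
   = 2^{\,m^{\delta} + (c+1)\log_2 m} \\
  &\le 2^{\,m^{2\delta}}
   && \text{(for large } m\text{, since } (c+1)\log_2 m \le m^{2\delta} - m^{\delta}\text{)} \\
  &= 2^{\,n^{2k\delta}}
   && \text{(substituting } m = n^{k}\text{).}
\end{aligned}
```

Because δ can be taken arbitrarily small, the exponent 2kδ can be made smaller than any fixed δ' > 0, which gives containment in TIME(2^{n^{δ'}}) for every δ' > 0.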
Blum-Micali-Yao PRG
PRGs of this type imply the existence of one-way functions
– we’ll use widely believed cryptographic assumptions
Definition: One Way Function (OWF): function family f = {f_n}, f_n: {0,1}^n → {0,1}^n
– f_n is computable in poly(n) time
– for every family of poly-size circuits {C_n}:
  Pr_x[C_n(f_n(x)) ∈ f_n^{-1}(f_n(x))] ≤ ε(n)
– ε(n) = o(n^{-c}) for all c
Blum-Micali-Yao PRG
We believe one-way functions exist
– e.g. integer multiplication, discrete log, RSA (w/ minor modifications)
Definition: One Way Permutation (OWP): OWF in which f_n is 1-1
– can simplify “Pr_x[C_n(f_n(x)) ∈ f_n^{-1}(f_n(x))] ≤ ε(n)” to:
  Pr_y[C_n(y) = f_n^{-1}(y)] ≤ ε(n)
First attempt
Attempt at a PRG from OWF f:
– t = m^δ
– y_0 ∈ {0,1}^t
– y_i = f_t(y_{i-1})
– G(y_0) = y_{k-1} y_{k-2} y_{k-3} … y_0
– k = m/t
– computable in time at most k·t^c = m·t^{c-1} < m^c
First attempt
The output is “unpredictable”:
– no poly-size circuit C can output y_{i-1} given y_{k-1} y_{k-2} y_{k-3} … y_i with non-negligible success probability
– if C could, then given y_i we can compute y_{i+1}, y_{i+2}, …, y_{k-1} by applying f_t repeatedly, and feed y_{k-1} … y_i to C
– the result is a poly-size circuit that computes y_{i-1} = f_t^{-1}(y_i) from y_i
– note: we’re using that f_t is 1-1
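First attempt
Attempt:
– y_0 ∈ {0,1}^t
– y_i = f_t(y_{i-1})
– G(y_0) = y_{k-1} y_{k-2} y_{k-3} … y_0
[Figure: G(y_0) builds the chain y_0, y_1, …, y_5 by applying f_t forward from a random y_0; G’(y_3) starts from a random y_3, applies f_t forward to obtain y_4, y_5 and f_t^{-1} backward to obtain y_2, y_1, y_0. Both produce the same distribution!]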
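The "same distribution" observation can be checked exhaustively on a toy permutation. In the sketch below, `f` is a hypothetical stand-in (an affine map on 4-bit integers, which is a permutation but certainly not one-way), and `G_prime` mirrors the G’ of the figure: start in the middle of the chain, go forward with f and backward with its inverse.

```python
from collections import Counter

T = 4                # toy seed length in bits
MOD = 1 << T         # states are the integers 0 .. 2^T - 1

def f(y):
    """Toy permutation of {0,1}^T: y -> 5y + 3 mod 2^T (easy to invert, NOT one-way)."""
    return (5 * y + 3) % MOD

def f_inv(y):
    """Inverse of f; 13 is the inverse of 5 modulo 16 because 5 * 13 = 65 = 4*16 + 1."""
    return (13 * (y - 3)) % MOD

def G(y0, k):
    """First-attempt generator: output y_{k-1} ... y_0 where y_i = f(y_{i-1})."""
    ys = [y0]
    for _ in range(k - 1):
        ys.append(f(ys[-1]))
    return tuple(reversed(ys))

def G_prime(yj, j, k):
    """Same chain, but seeded at position j: forward with f, backward with f_inv."""
    ys = [None] * k
    ys[j] = yj
    for i in range(j + 1, k):
        ys[i] = f(ys[i - 1])
    for i in range(j - 1, -1, -1):
        ys[i] = f_inv(ys[i + 1])
    return tuple(reversed(ys))
```

Enumerating all 16 seeds, `Counter(G(y, 6) for y in range(MOD))` equals `Counter(G_prime(y, 3, 6) for y in range(MOD))`. This relies only on f being a permutation, which is why the unpredictability argument needs f_t to be 1-1.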
First attempt
One problem:
– it is hard to compute y_{i-1} from y_i
– but it might be easy to compute a single bit (or several bits) of y_{i-1} from y_i
– this could be used to build a small circuit C that distinguishes G’s output from the uniform distribution on {0,1}^m
First attempt
Second problem:
– we don’t know if “unpredictability” given a prefix is sufficient to meet the fooling requirement:
  |Pr_y[C(y) = 1] - Pr_z[C(G(z)) = 1]| ≤ ε
Hard bits
If {f_n} is a one-way permutation we know:
– no poly-size circuit can compute f_n^{-1}(y) from y with non-negligible success probability:
  Pr_y[C_n(y) = f_n^{-1}(y)] ≤ ε’(n)
We want to identify a single bit position j for which:
– no poly-size circuit can compute (f_n^{-1}(y))_j from y with non-negligible advantage over a coin flip:
  Pr_y[C_n(y) = (f_n^{-1}(y))_j] ≤ ½ + ε(n)
Hard bits
For some specific functions f we know of such a bit position j
More general: a function h_n: {0,1}^n → {0,1} rather than just a bit position j
Hard bits
Definition: a hard bit for g = {g_n} is a family h = {h_n}, h_n: {0,1}^n → {0,1}, such that if a circuit family {C_n} of size s(n) achieves:
  Pr_x[C_n(x) = h_n(g_n(x))] ≥ ½ + ε(n)
then there is a circuit family {C’_n} of size s’(n) that achieves:
  Pr_x[C’_n(x) = g_n(x)] ≥ ε’(n)
with:
– ε’(n) = (ε(n)/n)^{O(1)}
– s’(n) = (s(n)·n/ε(n))^{O(1)}
Goldreich-Levin
To get a generic hard bit, we first need to modify our one-way permutation
Define f’_n: {0,1}^n × {0,1}^n → {0,1}^{2n} as:
  f’_n(x,y) = (f_n(x), y)
Goldreich-Levin
Recall: f’_n(x,y) = (f_n(x), y)
Two observations:
– f’ is a permutation if f is
– if circuit C_n achieves
  Pr_{x,y}[C_n(x,y) = f’_n^{-1}(x,y)] ≥ ε(n)
  then for some y*
  Pr_x[C_n(x,y*) = f’_n^{-1}(x,y*) = (f_n^{-1}(x), y*)] ≥ ε(n)
  and so f’ is a one-way permutation if f is
Goldreich-Levin
The Goldreich-Levin function GL_{2n}: {0,1}^n × {0,1}^n → {0,1} is defined by:
  GL_{2n}(x,y) = ⊕_{i: y_i = 1} x_i
– parity of the subset of bits of x selected by the 1’s of y
– inner product of the n-vectors x and y over GF(2)
Theorem (G-L): for every function f, GL is a hard bit for f’. (proof: problem set)
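The Goldreich-Levin bit and the padded function f’ are both short to write down. A minimal sketch, representing bit strings as tuples of 0/1; the names `gl` and `f_prime` are chosen for this example, and `f` is whatever function the caller supplies.

```python
def gl(x_bits, y_bits):
    """GL(x, y): XOR of the bits of x selected by the 1's of y (inner product over GF(2))."""
    assert len(x_bits) == len(y_bits), "x and y must have the same length"
    return sum(xi & yi for xi, yi in zip(x_bits, y_bits)) % 2

def f_prime(f, x_bits, y_bits):
    """f'(x, y) = (f(x), y): apply f to the first half, pass the second half through."""
    return (f(x_bits), y_bits)
```

For example, `gl((1, 0, 1, 1), (0, 1, 1, 0))` is 1 because y selects positions 2 and 3 of x, whose bits are 0 and 1. The function is linear in x for fixed y, meaning GL(x ⊕ x’, y) = GL(x, y) ⊕ GL(x’, y).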
Distinguishers and predictors
Distribution D on {0,1}^n
D ε-passes statistical tests of size s if for all circuits C of size s:
  |Pr_{y←U_n}[C(y) = 1] - Pr_{y←D}[C(y) = 1]| ≤ ε
– a circuit violating this is sometimes called an efficient “distinguisher”
Distinguishers and predictors
D ε-passes prediction tests of size s if for all circuits C of size s and all positions i:
  Pr_{y←D}[C(y_1 y_2 … y_{i-1}) = y_i] ≤ ½ + ε
– a circuit violating this is sometimes called an efficient “predictor”
Having a predictor seems stronger than having a distinguisher, but Yao showed they are essentially the same!
– important result and proof (“hybrid argument”)
Distinguishers and predictors
Theorem (Yao): if a distribution D on {0,1}^n (ε/n)-passes all prediction tests of size s, then it ε-passes all statistical tests of size s’ = s - O(n).
Distinguishers and predictors
Proof:
– idea: proof by contradiction
– given a size s’ distinguisher C:
  |Pr_{y←U_n}[C(y) = 1] - Pr_{y←D}[C(y) = 1]| > ε
– produce a size s predictor P:
  Pr_{y←D}[P(y_1 y_2 … y_{i-1}) = y_i] > ½ + ε/n
– work with distributions that are “hybrids” of the uniform distribution U_n and D
Distinguishers and predictors
– given a size s’ distinguisher C:
  |Pr_{y←U_n}[C(y) = 1] - Pr_{y←D}[C(y) = 1]| > ε
– define n+1 hybrid distributions
– hybrid distribution D_i:
  sample b = b_1 b_2 … b_n from D
  sample r = r_1 r_2 … r_n from U_n
  output: b_1 b_2 … b_i r_{i+1} r_{i+2} … r_n
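Distinguishers and predictors
Hybrid distributions:
– D_0 = U_n: all n bits uniform
– D_n = D: all n bits from D
– D_{i-1}: first i-1 bits from D, remaining bits uniform
– D_i: first i bits from D, remaining bits uniform
[Figure: bars showing which positions of D_0, D_n, D_{i-1}, and D_i come from D and which are uniform]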
Distinguishers and predictors
– define: p_i = Pr_{y←D_i}[C(y) = 1]
– note: p_0 = Pr_{y←U_n}[C(y) = 1]; p_n = Pr_{y←D}[C(y) = 1]
– by assumption: ε < |p_n - p_0|
– triangle inequality: |p_n - p_0| ≤ Σ_{1 ≤ i ≤ n} |p_i - p_{i-1}|
– there must be some i for which |p_i - p_{i-1}| > ε/n
– WLOG assume p_i - p_{i-1} > ε/n (can invert the output of C if necessary)
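The hybrid probabilities p_0, …, p_n can be computed exactly when n is tiny. The sketch below represents D as a dictionary from bit tuples to probabilities and C as a Python function. The toy distribution `EVEN_PARITY` and the test `parity_test` are illustrative assumptions, chosen so that the whole distinguishing gap sits at the last position.

```python
from fractions import Fraction
from itertools import product

def hybrid_accept_prob(D, C, n, i):
    """p_i = Pr[C(y) = 1] for y drawn from D_i: first i bits from D, last n-i bits uniform."""
    suffix_weight = Fraction(1, 2 ** (n - i))
    total = Fraction(0)
    for b, prob_b in D.items():
        for r in product((0, 1), repeat=n - i):
            total += prob_b * suffix_weight * C(b[:i] + r)
    return total

def largest_gap_index(D, C, n):
    """Return (i, |p_i - p_{i-1}|) for the position i in 1..n with the largest gap."""
    p = [hybrid_accept_prob(D, C, n, i) for i in range(n + 1)]
    return max(((i, abs(p[i] - p[i - 1])) for i in range(1, n + 1)), key=lambda g: g[1])

# Toy D on 3 bits: uniform over the strings whose third bit is the XOR of the first two.
EVEN_PARITY = {y: Fraction(1, 4) for y in product((0, 1), repeat=3) if sum(y) % 2 == 0}

def parity_test(y):
    """Toy distinguisher: accept iff the string has even parity."""
    return int(sum(y) % 2 == 0)
```

For this toy pair the hybrid probabilities are p_0 = p_1 = p_2 = 1/2 and p_3 = 1, so ε can be taken just below |p_3 - p_0| = 1/2 and `largest_gap_index(EVEN_PARITY, parity_test, 3)` returns position 3 with gap 1/2, comfortably above ε/n.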
Distinguishers and predictors
– define distribution D_i’ to be D_i with the i-th bit flipped
– p_i’ = Pr_{y←D_i’}[C(y) = 1]
– notice: D_{i-1} = (D_i + D_i’)/2, so p_{i-1} = (p_i + p_i’)/2
[Figure: bars comparing D_{i-1}, D_i, and D_i’]
Distinguishers and predictors
Randomized predictor P’ for the i-th bit:
– input: u = y_1 y_2 … y_{i-1} (which comes from D)
– flip a coin: d ∈ {0,1}
– sample w = w_{i+1} w_{i+2} … w_n ← U_{n-i}
– evaluate C(udw)
– if 1, output d; if 0, output ¬d
Claim: Pr_{y←D, d, w←U_{n-i}}[P’(y_1 … y_{i-1}) = y_i] > ½ + ε/n.
Distinguishers and predictors
P’ is a randomized procedure
– there must be some fixing of its random bits d, w that preserves the success probability
– the final predictor P has d* and w* hardwired; it may need to add a ¬ gate to the output of C
– size is s’ + O(n) = s, as promised
[Figure: circuit for P, consisting of C with the inputs d* and w* hardwired]
Distinguishers and predictors
Proof of claim (with u = y_1 y_2 … y_{i-1}):
Pr_{y←D, d, w←U_{n-i}}[P’(y_1 … y_{i-1}) = y_i]
  = Pr[y_i = d | C(u,d,w) = 1]·Pr[C(u,d,w) = 1] + Pr[y_i = ¬d | C(u,d,w) = 0]·Pr[C(u,d,w) = 0]
  = Pr[y_i = d | C(u,d,w) = 1]·p_{i-1} + Pr[y_i = ¬d | C(u,d,w) = 0]·(1 - p_{i-1})
Distinguishers and predictors
– observe (with u = y_1 y_2 … y_{i-1}):
  Pr[y_i = d | C(u,d,w) = 1]
    = Pr[C(u,d,w) = 1 | y_i = d]·Pr[y_i = d] / Pr[C(u,d,w) = 1]
    = p_i/(2p_{i-1})
  Pr[y_i = ¬d | C(u,d,w) = 0]
    = Pr[C(u,d,w) = 0 | y_i = ¬d]·Pr[y_i = ¬d] / Pr[C(u,d,w) = 0]
    = (1 - p_i’)/(2(1 - p_{i-1}))
– here udw is distributed as D_{i-1} overall, as D_i when y_i = d, and as D_i’ when y_i = ¬d
Distinguishers and predictors
Success probability:
  Pr[y_i = d | C(u,d,w) = 1]·p_{i-1} + Pr[y_i = ¬d | C(u,d,w) = 0]·(1 - p_{i-1})
We know:
– Pr[y_i = d | C(u,d,w) = 1] = p_i/(2p_{i-1})
– Pr[y_i = ¬d | C(u,d,w) = 0] = (1 - p_i’)/(2(1 - p_{i-1}))
– p_{i-1} = (p_i + p_i’)/2, so p_i’/2 = p_{i-1} - p_i/2
– p_i - p_{i-1} > ε/n
Conclude:
  Pr[P’(y_1 … y_{i-1}) = y_i] = p_i/2 + (1 - p_i’)/2
    = ½ + (p_i - p_i’)/2
    = ½ + p_i/2 - (p_{i-1} - p_i/2)
    = ½ + p_i - p_{i-1}
    > ½ + ε/n.
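Both the claim and the step that fixes the random bits d, w can be checked by exact enumeration on a small example. The sketch below uses the same dictionary representation of D as an assumption. `SKEWED` and `parity_test` are made-up toy inputs, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def hybrid_accept_prob(D, C, n, i):
    """p_i = Pr[C(y) = 1] when the first i bits come from D and the rest are uniform."""
    weight = Fraction(1, 2 ** (n - i))
    return sum(pb * weight * C(b[:i] + r)
               for b, pb in D.items() for r in product((0, 1), repeat=n - i))

def fixed_success(D, C, i, d, w):
    """Success probability over y ~ D of the predictor with d and w hardwired."""
    total = Fraction(0)
    for y, py in D.items():
        u = y[:i - 1]                                    # the prefix y_1 ... y_{i-1}
        guess = d if C(u + (d,) + w) == 1 else 1 - d     # output d on 1, its negation on 0
        if guess == y[i - 1]:
            total += py
    return total

def randomized_success(D, C, n, i):
    """Success probability of P': average of fixed_success over uniform d and w."""
    fixings = [(d, w) for d in (0, 1) for w in product((0, 1), repeat=n - i)]
    return sum(fixed_success(D, C, i, d, w) for d, w in fixings) / len(fixings)

def best_fixing(D, C, n, i):
    """A choice (d*, w*) that does at least as well as the average over d, w."""
    fixings = [(d, w) for d in (0, 1) for w in product((0, 1), repeat=n - i)]
    return max(fixings, key=lambda dw: fixed_success(D, C, i, dw[0], dw[1]))

# Toy distribution on 3 bits supported on even-parity strings, with unequal weights.
SKEWED = {(0, 0, 0): Fraction(1, 2), (0, 1, 1): Fraction(1, 4),
          (1, 0, 1): Fraction(1, 8), (1, 1, 0): Fraction(1, 8)}

def parity_test(y):
    """Toy distinguisher: accept iff the string has even parity."""
    return int(sum(y) % 2 == 0)
```

For every position i, `randomized_success` equals ½ + p_i - p_{i-1} exactly, which is the identity derived above, and the fixing returned by `best_fixing` succeeds at least as often as the randomized predictor. That averaging step is what lets d* and w* be hardwired into the final circuit P.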
The BMY Generator
Recall goal: for all 1 > δ > 0, a family of PRGs {G_m} with:
– output length m
– fooling size s = m
– seed length t = m^δ
– running time m^c
– error ε < 1/6
If one-way permutations exist then WLOG there is an OWP f = {f_n} with hard bit h = {h_n}
The BMY Generator
Generator G^δ = {G^δ_m}:
– t = m^δ
– y_0 ∈ {0,1}^t
– y_i = f_t(y_{i-1})
– b_i = h_t(y_i)
– G^δ_m(y_0) = b_{m-1} b_{m-2} b_{m-3} … b_0
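The generator itself is a short loop: iterate the permutation, extract one bit per state, and emit the bits in reverse order. A minimal sketch with the permutation f and the hard bit h passed in as functions. `toy_f` and `toy_h` are hypothetical placeholders with no hardness whatsoever, present only so that the code runs.

```python
def bmy_generator(y0, m, f, h):
    """Output b_{m-1} ... b_0, where y_i = f(y_{i-1}) and b_i = h(y_i)."""
    y = y0
    bits = [h(y)]                 # b_0 = h(y_0)
    for _ in range(m - 1):
        y = f(y)                  # y_i = f(y_{i-1})
        bits.append(h(y))         # b_i = h(y_i)
    return bits[::-1]             # emit in reverse order: b_{m-1} first

def toy_f(y):
    """Toy permutation on 4-bit integers (easy to invert, NOT one-way)."""
    return (5 * y + 3) % 16

def toy_h(y):
    """Toy 'hard bit': parity of the bits of y (not actually hard for toy_f)."""
    return bin(y).count("1") % 2
```

Starting from y_0 = 1 the toy chain of states is 1, 8, 11, 10, 5, 12, so `bmy_generator(1, 6, toy_f, toy_h)` returns `[0, 0, 0, 1, 1, 1]`. The reversed output order matters for the proof: a predictor for the next output bit must compute a hard bit of a preimage, whereas in forward order the next bit would be easy to compute.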
The BMY Generator
Theorem (BMY): for every δ > 0, there is a constant c such that for all d, e, G^δ is a PRG with:
– error ε < 1/m^d
– fooling size s = m^e
– running time m^c
Note: this is stronger than we needed
– sufficient to have ε < 1/6; s = m
The BMY Generator
Proof:
– computable in time at most m·t^c < m^{c+1}
– assume G^δ does not (1/m^d)-pass a statistical test C = {C_m} of size m^e:
  |Pr_{y←U_m}[C(y) = 1] - Pr_{z←D}[C(z) = 1]| > 1/m^d
  where D is the output distribution of G^δ_m on a uniformly random seed
The BMY Generator
– transform this distinguisher into a predictor P of size m^e + O(m):
  Pr_y[P(b_{m-1} … b_{m-i}) = b_{m-i-1}] > ½ + 1/m^{d+1}
The BMY Generator
– a procedure to compute h_t(f_t^{-1}(y)):
  set y_{m-i} = y; b_{m-i} = h_t(y_{m-i})
  compute y_j, b_j for j = m-i+1, m-i+2, …, m-1 as above
  evaluate P(b_{m-1} b_{m-2} … b_{m-i})
– f a permutation implies b_{m-1} b_{m-2} … b_{m-i} is distributed as (a prefix of) the output of the generator:
  Pr_y[P(b_{m-1} b_{m-2} … b_{m-i}) = b_{m-i-1}] > ½ + 1/m^{d+1}
The BMY Generator
  Pr_y[P(b_{m-1} b_{m-2} … b_{m-i}) = b_{m-i-1}] > ½ + 1/m^{d+1}
– what is b_{m-i-1}?
  b_{m-i-1} = h_t(y_{m-i-1}) = h_t(f_t^{-1}(y_{m-i})) = h_t(f_t^{-1}(y))
– we have described a family of polynomial-size circuits that computes h_t(f_t^{-1}(y)) from y with success greater than ½ + 1/poly(m)
– contradiction.
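The reduction in this proof is mechanical: plant y at position m-i of the chain, run the chain forward, and hand the resulting bits to the predictor. A sketch under toy assumptions: `toy_f` is an easily inverted permutation, `lsb` is a deliberately weak candidate for a hard bit, and `flip_last` is a hypothetical predictor that happens to work for this pair. None of these come from the construction itself.

```python
def guess_hard_bit(y, i, f, h, P):
    """Guess h(f^{-1}(y)) using a next-bit predictor P for the generator's output.

    Sets y_{m-i} = y, computes y_{m-i+1}, ..., y_{m-1} by applying f, and evaluates
    P on the bits b_{m-1} ... b_{m-i}; P's answer is its guess for b_{m-i-1}.
    """
    ys = [y]
    for _ in range(i - 1):
        ys.append(f(ys[-1]))
    prefix = [h(v) for v in reversed(ys)]   # b_{m-1}, ..., b_{m-i}
    return P(prefix)

def toy_f(y):
    """Toy permutation on 4-bit integers (NOT one-way)."""
    return (5 * y + 3) % 16

def toy_f_inv(y):
    """Inverse of toy_f; 13 is the inverse of 5 modulo 16."""
    return (13 * (y - 3)) % 16

def lsb(y):
    """Least significant bit; it alternates along the toy chain, so it is not hard."""
    return y & 1

def flip_last(prefix):
    """Toy predictor: guess that the next bit is the negation of the last bit seen."""
    return 1 - prefix[-1]
```

With these toy choices `guess_hard_bit(y, 3, toy_f, lsb, flip_last)` equals `lsb(toy_f_inv(y))` for every 4-bit y, so a perfect predictor yields a perfect algorithm for the bit of the preimage. Under the theorem's hypotheses that outcome is impossible, because h is a hard bit; the toy pair shows what goes wrong when it is not.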
The BMY Generator
[Figure: G(y_0) builds the chain y_0, y_1, …, y_5 by applying f_t forward and outputs the bits b_0, …, b_5; G’(y_3) starts from a random y_3, applies f_t forward and f_t^{-1} backward, and outputs the corresponding bits b_0, …, b_5. Both produce the same distribution.]