Derandomization: New Results and Applications
Emanuele Viola
Harvard University
March 2006

Randomness in Computation
Useful throughout Computer Science
  – Algorithms
  – Learning Theory
  – Complexity Theory
Question: Is randomness necessary?
[Figure: Input → Answer (error probability 1%)]

Derandomization
Goal: remove randomness
Why study derandomization?
  – Breakthrough [R '04]: Connectivity in logarithmic space ($\mathrm{SL} = \mathrm{L}$)
  – Breakthrough [AKS '02]: Primality in polynomial time ($\mathrm{PRIMES} \in \mathrm{P}$)

Randomness vs. Time
Goal: simulate randomized computation deterministically
Trivial derandomization: if $A$ uses $n$ random bits, enumerate all $2^n$ possibilities
  – Probabilistic polynomial time $\subseteq$ exponential time: $\mathrm{BPP} \subseteq \mathrm{Time}(2^{\mathrm{poly}(n)})$
Strong belief: $\mathrm{BPP} = \mathrm{P}$ ($= \mathrm{Time}(\mathrm{poly}(n))$)
  – Complexity assumptions $\Rightarrow$ $\mathrm{BPP} = \mathrm{P}$ [BFNW,NW,IW,…]

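The trivial derandomization above can be sketched in a few lines. This is only an illustration: `derandomize` and the toy algorithm `noisy_is_even` are hypothetical names introduced here, not part of the talk. The sketch runs the algorithm on every random string and returns the majority answer, which costs a factor of $2^n$ in time.

```python
from itertools import product


def derandomize(algorithm, x, num_random_bits):
    """Run `algorithm` on all 2**num_random_bits random strings; return the majority answer."""
    votes = 0
    total = 2 ** num_random_bits
    for bits in product((0, 1), repeat=num_random_bits):
        if algorithm(x, bits):
            votes += 1
    return 2 * votes > total


def noisy_is_even(x, bits):
    """Hypothetical randomized test for 'x is even': wrong only when every random bit is 1."""
    correct = (x % 2 == 0)
    return (not correct) if all(bits) else correct


# With 7 random bits the toy algorithm errs on 1 of 128 strings (below 1%).
print(derandomize(noisy_is_even, 10, 7))
print(derandomize(noisy_is_even, 7, 7))
```

The loop makes the exponential cost explicit; the results later in the talk are about doing better than this for restricted models.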
Outline
Overview of derandomization
Derandomization of restricted models
  – Application: Hardness Amplification in NP
  – New derandomization
Derandomization of general models
  – BPP vs. PH
  – Proof of Lower Bound

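Constant-Depth Circuits
Probabilistic constant-depth circuit ($\mathrm{BP}\cdot\mathrm{AC}^0$)
Theorem [N '91]: $\mathrm{BP}\cdot\mathrm{AC}^0 \subseteq \mathrm{Time}(n^{\mathrm{polylog}\, n})$
  – Compare to $\mathrm{BPP} \subseteq \mathrm{Time}(2^{\mathrm{poly}(n)})$
[Figure: constant-depth circuit with alternating layers of ∨ and ∧ gates; the bottom layer reads the input and the random bits; the number of layers is labeled "Depth"]
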
Application: Avg-Case Hardness of NP
Study hardness of NP on random instances
  – Natural question, essential for cryptography
Currently cannot relate to $\mathrm{P} \ne \mathrm{NP}$ [FF,BT,V]
Hardness amplification
Definition: $f : \{0,1\}^n \to \{0,1\}$ is $\delta$-hard if for every efficient algorithm $M$: $\Pr_x[M(x) \ne f(x)] \ge \delta$
[Figure: $f$, $.01$-hard → Hardness Amplification → $f'$, $(1/2 - \epsilon)$-hard]

Previous Results
Yao's XOR Lemma: $f'(x_1,\ldots,x_n) := f(x_1) \oplus \cdots \oplus f(x_n)$
  – $f'$ is $\approx (1/2 - 2^{-n})$-hard, almost optimal
Cannot use XOR in NP: $f \in \mathrm{NP} \not\Rightarrow f' \in \mathrm{NP}$
Idea: $f'(x_1,\ldots,x_n) = C(f(x_1),\ldots,f(x_n))$, $C$ monotone
  – e.g. $f(x_1) \wedge (f(x_2) \vee f(x_3))$
  – $f \in \mathrm{NP} \Rightarrow f' \in \mathrm{NP}$
Theorem [O'D]: There is $C$ s.t. $f'$ is $\approx (1/2 - 1/n)$-hard
Barrier: No monotone $C$ can do better!

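The difference between the XOR combiner and a monotone combiner can be checked by brute force on small cases. The sketch below is illustrative only; the function names are hypothetical. It tests monotonicity directly: raising any input bit from 0 to 1 must never lower the output. Monotonicity is what lets an NP witness for $f'$ consist of witnesses for just those $f(x_i) = 1$ that force $C$ to 1.

```python
from functools import reduce
from itertools import product


def xor_combiner(bits):
    """Yao-style combiner: parity of the bits."""
    return reduce(lambda a, b: a ^ b, bits, 0)


def example_monotone_combiner(bits):
    """The slide's example: b1 AND (b2 OR b3)."""
    b1, b2, b3 = bits
    return b1 & (b2 | b3)


def is_monotone(combiner, k):
    """Check on all 2**k inputs that flipping a 0 to a 1 never decreases the output."""
    for bits in product((0, 1), repeat=k):
        for i in range(k):
            if bits[i] == 0:
                raised = bits[:i] + (1,) + bits[i + 1:]
                if combiner(raised) < combiner(bits):
                    return False
    return True


print(is_monotone(example_monotone_combiner, 3))
print(is_monotone(xor_combiner, 3))
```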
Our Result on Hardness Amplification
Theorem [HVV]: Amplification in NP up to $\approx 1/2 - 1/2^{n}$
  – Matches the XOR Lemma
Technique: Derandomize!
Intuitively, $f' := C(f(x_1),\ldots,f(x_n),\ldots,f(x_{2^n}))$
  – $f'$ is $(1/2 - 1/2^{n})$-hard by previous result
  – Problem: input length $= 2^n$
Note $C$ is constant-depth
Derandomize: input length $\to n$, keep hardness
[Figure: circuit $C$ applied to $f(x_1),\ldots,f(x_n),\ldots,f(x_{2^n})$, computing $f'$]

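Previous Results
Recall Theorem [N]: $\mathrm{BP}\cdot\mathrm{AC}^0 \subseteq \mathrm{Time}(n^{\mathrm{polylog}\, n})$
But $\mathrm{AC}^0$ is weak: $\mathrm{Majority} \notin \mathrm{AC}^0$
  – $\mathrm{Majority}(x_1,\ldots,x_n) := \sum_i x_i > n/2$ ?
Theorem [LVW]: $\mathrm{BP}\cdot\mathrm{Maj}\cdot\mathrm{AND} \subseteq \mathrm{Time}(2^{n^{\epsilon}})$
[N] and [LVW] derandomize incomparable classes
[Figure: two circuits side by side, a constant-depth circuit of ∨ and ∧ layers, and a single Maj gate on top of a layer of ∧ gates]
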
Our New Derandomization
Theorem [V]: $\mathrm{BP}\cdot\mathrm{Maj}\cdot\mathrm{AC}^0 \subseteq \mathrm{Time}(2^{n^{\epsilon}})$
Derandomize constant-depth circuits with few Majority gates
  – Improves on [LVW]
  – Slower than [N] but richer
  – Richest probabilistic circuit class in $\mathrm{Time}(2^{n^{\epsilon}})$
Techniques: communication complexity + switching lemma [BNS,HG,H,HM,CH]
[Figure: constant-depth circuit of ∧, ∨ and a few Maj gates over the input and the random bits]

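BPP vs. POLY-TIME HIERARCHY
Probabilistic Polynomial Time (BPP): for every $x$, $\Pr[M(x)\ \text{errs}] \le 1\%$
Strong belief: $\mathrm{BPP} = \mathrm{P}$ [NW,BFNW,IW,…]
Still open: $\mathrm{BPP} \subseteq \mathrm{NP}$ ?
Theorem [SG,L; '83]: $\mathrm{BPP} \subseteq \Sigma_2\mathrm{P}$
Recall:
  – $\mathrm{NP} = \Sigma_1\mathrm{P} \to \exists y\, M(x,y)$
  – $\Sigma_2\mathrm{P} \to \exists y\, \forall z\, M(x,y,z)$
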
The Problem we Study
More precisely, [SG,L] give $\mathrm{BPTime}(t) \subseteq \Sigma_2\mathrm{Time}(t^2)$
Question [rest of this talk]: Is the quadratic slow-down necessary?
Motivation: lower bounds
  – Know $\mathrm{NTime} \ne \mathrm{Time}$ on some models [P+,F+,…]
  – Technique: speed up computation with quantifiers
  – To prove $\mathrm{NTime} \ne \mathrm{BPTime}$, cannot afford $\Sigma_2\mathrm{Time}(t^2)$

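Approximate Majority
Input: a bit string $R$
Task: tell $\Pr_i[R_i = 1] \ge 99\%$ from $\Pr_i[R_i = 1] \le 1\%$
  – Do not care if $\Pr_i[R_i = 1] \sim 50\%$ (approximate)
Model: depth-3 circuit
[Figure: depth-3 circuit over $R$, an ∨ gate on top of ∧ gates on top of ∨ gates; the three layers are labeled "Depth"]
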
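As a plain specification of the promise problem above (not of the circuit model), a minimal sketch might look as follows. The function name `approx_majority` and the choice to return `None` outside the promise are assumptions made for illustration.

```python
def approx_majority(bits):
    """Return 1 if at least 99% of the bits are 1, 0 if at most 1% are 1,
    and None when the promise is violated (any answer would be acceptable there)."""
    ones = sum(bits)
    n = len(bits)
    if 100 * ones >= 99 * n:
        return 1
    if 100 * ones <= n:
        return 0
    return None


print(approx_majority([1] * 99 + [0]))
print(approx_majority([0] * 99 + [1]))
print(approx_majority([0, 1] * 50))
```

Integer arithmetic is used for the thresholds so that the boundary cases (exactly 99% and exactly 1%) are classified without floating-point error.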
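The connection [FSS]
$M(x;u) \in \mathrm{BPTime}(t)$; let $R_i = M(x;i)$, so $|R| = 2^t$
Compute $M(x)$: tell $\Pr_u[M(x;u) = 1] \ge 99\%$ from $\Pr_u[M(x;u) = 1] \le 1\%$, that is, compute Appr-Maj on $R$
$\mathrm{BPTime}(t) \subseteq \Sigma_2\mathrm{Time}(t') = \exists\,\forall\,\mathrm{Time}(t')$ corresponds to a depth-3 circuit for Appr-Maj
Running time $t'$ corresponds to bottom fan-in $f = t'/t$
  – run $M$ at most $t'/t$ times
[Figure: depth-3 circuit (∨ of ∧ of ∨) over $R$ with bottom fan-in $f$]
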
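To make the correspondence concrete, the sketch below builds the table $R_i = M(x;i)$ and evaluates an ∃∀ predicate in which each innermost check looks up only $k$ entries of $R$, the analogue of bottom fan-in. It assumes the standard shifting argument associated with [L] (Lautemann), which the slide does not spell out; `toy_machine` and all function names are hypothetical, and the brute-force search is only feasible for tiny $t$.

```python
from itertools import product


def truth_table(machine, x, t):
    """R_i = M(x; i) for every t-bit random string i, so len(R) == 2**t."""
    return [1 if machine(x, i) else 0 for i in range(2 ** t)]


def exists_forall_accepts(R, k):
    """Depth-3 view: OR over k-tuples of shifts, AND over all z, OR of k lookups into R."""
    n = len(R)  # a power of two, so z ^ s stays a valid index
    return any(
        all(any(R[z ^ s] for s in shifts) for z in range(n))
        for shifts in product(range(n), repeat=k)
    )


def toy_machine(x, i):
    """Hypothetical machine deciding 'x is even'; it errs only on random string 5."""
    correct = (x % 2 == 0)
    return (not correct) if i == 5 else correct


R_yes = truth_table(toy_machine, 10, 7)  # 127 of 128 entries are 1
R_no = truth_table(toy_machine, 7, 7)    # 1 of 128 entries is 1
print(exists_forall_accepts(R_yes, 2), exists_forall_accepts(R_no, 2))
```

Each innermost `any` reads $k$ entries of $R$, that is, it runs $M$ $k$ times, which mirrors "run $M$ at most $t'/t$ times" on the slide.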
Our Results
Theorem [V]: Small depth-3 circuits for Approximate Majority on $N$ bits have bottom fan-in $\Omega(\log N)$
Corollary: Quadratic slow-down is necessary for relativizing techniques: $\mathrm{BPTime}^A(t) \not\subseteq \Sigma_2\mathrm{Time}^A(t^{1.99})$
Theorem [DvM,V]: $\mathrm{BPTime}(t) \subseteq \Sigma_3\mathrm{Time}(t \cdot \log^5 t)$
  – Previous result [A]: $\mathrm{BPTime}(t) \subseteq \Sigma_{O(1)}\mathrm{Time}(t)$
For time, the level is the third!

Our Negative Result
Theorem [V]: $2^{N^{\epsilon}}$-size depth-3 circuits for Approximate Majority on $N$ bits have bottom fan-in $f = \Omega(\log N)$
Recall: the circuit tells $R \in \mathrm{YES} := \{ R : \Pr_i[R_i = 1] \ge 99\% \}$ from $R \in \mathrm{NO} := \{ R : \Pr_i[R_i = 1] \le 1\% \}$
[Figure: depth-3 circuit (∨ of ∧ of ∨) with bottom fan-in $f$ over $R$, $|R| = N$]

Proof
Circuit is an OR of $s$ depth-2 circuits $C_1, C_2, C_3, \ldots, C_s$
By definition of OR:
  – $R \in \mathrm{YES} \Rightarrow$ some $C_i(R) = 1$
  – $R \in \mathrm{NO} \Rightarrow$ all $C_i(R) = 0$
By averaging, fix $C = C_i$ s.t.
  – $\Pr_{R \in \mathrm{YES}}[C(R) = 1] \ge 1/s$
  – $\forall R \in \mathrm{NO}$: $C(R) = 0$
Claim: Impossible if $C$ has bottom fan-in $\le \epsilon \log N$

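CNF Claim
Depth-2 circuit $\Rightarrow$ CNF, e.g. $(x_1 \vee x_2 \vee \neg x_3) \wedge (\neg x_4) \wedge (x_5 \vee x_3)$
  – bottom fan-in $\Rightarrow$ clause size
Claim: For every CNF $C$ with clauses of size $\epsilon \cdot \log N$,
  either $\Pr_{R \in \mathrm{YES}}[C(R) = 1] \le 1/2^{N^{\epsilon}}$
  or there is $R \in \mathrm{NO}$ with $C(R) = 1$
Note: Claim $\Rightarrow$ Theorem
[Figure: CNF drawn as an ∧ gate over ∨ gates over the variables $x_1, x_2, x_3, \ldots, x_N$]
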
Proof Outline
To show: either $\Pr_{R \in \mathrm{YES}}[C(R) = 1] \le 1/2^{N^{\epsilon}}$ or $\exists R \in \mathrm{NO}$: $C(R) = 1$
Definition: $S \subseteq \{x_1, x_2, \ldots, x_N\}$ is a covering if every clause has a variable in $S$
  – E.g.: $S = \{x_3, x_4\}$ for $C = (x_1 \vee x_2 \vee \neg x_3) \wedge (\neg x_4) \wedge (x_5 \vee x_3)$
Proof idea: Consider a smallest covering $S$
  – Case $|S|$ BIG: $\Pr_{R \in \mathrm{YES}}[C(R) = 1] \le 1/2^{N^{\epsilon}}$
  – Case $|S|$ tiny: Fix few variables and repeat

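The covering definition and the greedy choice of disjoint clauses used in the BIG case can be tried out on the slide's example. The sketch below is illustrative: the signed-integer encoding of literals (3 for $x_3$, -3 for $\neg x_3$) and all function names are assumptions made here, and the exhaustive search for a smallest covering is only meant for toy inputs.

```python
from itertools import combinations


def variables(clause):
    """Set of variable indices appearing in a clause of signed literals."""
    return {abs(lit) for lit in clause}


def is_covering(S, cnf):
    """S is a covering if every clause contains a variable of S."""
    return all(variables(clause) & S for clause in cnf)


def smallest_covering(cnf):
    """Exhaustive search over subsets by increasing size (exponential; toy inputs only)."""
    all_vars = sorted(set().union(*(variables(c) for c in cnf)))
    for size in range(len(all_vars) + 1):
        for candidate in combinations(all_vars, size):
            if is_covering(set(candidate), cnf):
                return set(candidate)


def greedy_disjoint_clauses(cnf):
    """Pick clauses one by one, skipping any clause sharing a variable with a picked one."""
    used, chosen = set(), []
    for clause in cnf:
        if not (variables(clause) & used):
            chosen.append(clause)
            used |= variables(clause)
    return chosen


cnf = [[1, 2, -3], [-4], [5, 3]]  # (x1 v x2 v ~x3) ^ (~x4) ^ (x5 v x3)
print(smallest_covering(cnf))
print(greedy_disjoint_clauses(cnf))
```

The variables of a maximal set of disjoint clauses always form a covering, since any clause left out shares a variable with a chosen one. This is why a large smallest covering forces many disjoint clauses.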
Case |S| BIG
$|S| \ge N^{\delta}$ $\Rightarrow$ have $N^{\delta}/(\epsilon \cdot \log N)$ disjoint clauses $\Gamma_i$
  – Can find $\Gamma_i$ greedily
$\Pr_{R \in \mathrm{YES}}[C(R) = 1] \le \Pr[\forall i,\ \Gamma_i(R) = 1]$
  $= \prod_i \Pr[\Gamma_i(R) = 1]$ (independence)
  $\le \prod_i \left(1 - 1/100^{\epsilon \log N}\right)$
  $= \prod_i \left(1 - 1/N^{O(\epsilon)}\right)$
  $= \left(1 - 1/N^{O(\epsilon)}\right)^{|S|}$
  $\le e^{-N^{\Omega(1)}}$

Case |S| tiny
$|S| < N^{\delta}$ $\Rightarrow$ Fix the variables in $S$
  – Choose the values that maximize $\Pr_{R \in \mathrm{YES}}[C(R) = 1]$
Note: $S$ covering $\Rightarrow$ clauses shrink
Example: with $x_3 \leftarrow 0$ and $x_4 \leftarrow 1$,
  $(x_1 \vee x_2 \vee x_3) \wedge (\neg x_3) \wedge (x_5 \vee \neg x_4)$ becomes $(x_1 \vee x_2) \wedge (x_5)$
Repeat: Consider a smallest covering $S'$, etc.

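The shrinking step in the example above can be reproduced mechanically. In the sketch below the function name `restrict` and the signed-integer literal encoding (3 for $x_3$, -3 for $\neg x_3$) are assumptions made for illustration. The assignment is taken as given; the proof additionally picks the values that maximize the acceptance probability, which this sketch does not do.

```python
def restrict(cnf, assignment):
    """Fix some variables (dict: variable index -> 0 or 1) and simplify the CNF.

    Satisfied clauses are dropped and falsified literals are removed.
    An empty result means the CNF became the constant 1;
    an empty clause in the result means the CNF became the constant 0.
    """
    result = []
    for clause in cnf:
        satisfied = False
        remaining = []
        for lit in clause:
            var = abs(lit)
            if var in assignment:
                value = assignment[var] if lit > 0 else 1 - assignment[var]
                if value == 1:
                    satisfied = True
                    break
            else:
                remaining.append(lit)
        if not satisfied:
            result.append(remaining)
    return result


cnf = [[1, 2, 3], [-3], [5, -4]]  # (x1 v x2 v x3) ^ (~x3) ^ (x5 v ~x4)
print(restrict(cnf, {3: 0, 4: 1}))
```

Because $S$ is a covering, every surviving clause loses at least one literal, which is what bounds the number of repetitions.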
Finish up
Recall: Repeat $\Rightarrow$ shrink clauses
  – So repeat at most $\epsilon \cdot \log N$ times
When you stop:
  – Either the smallest covering has size $\ge N^{\delta}$ (Case $|S|$ BIG applies)
  – Or $C = 1$
    Fixed $\le (\epsilon \cdot \log N)\, N^{\delta} \ll N$ variables
    Set the rest to 0 $\Rightarrow$ $R \in \mathrm{NO}$ with $C(R) = 1$
Q.E.D.

Conclusion
Derandomization: powerful technique
Restricted models: constant-depth circuits ($\mathrm{AC}^0$)
  – Derandomization of $\mathrm{AC}^0$ [N]
  – Application: Hardness Amplification in NP [HVV]
  – Derandomization of $\mathrm{AC}^0$ with few Maj gates [V]
General models: BPP vs. PH
  – $\mathrm{BPTime}(t) \subseteq \Sigma_2\mathrm{Time}(t^2)$ [SG,L]
  – $\mathrm{BPTime}(t) \not\subseteq \Sigma_2\mathrm{Time}(t^{1.99})$ (w.r.t. oracle) [V]
    Lower bound for Approximate Majority

Thank you!