If NP languages are hard on the worst-case then it is easy to find their hard instances Danny Gutfreund, Hebrew U. Ronen Shaltiel, Haifa U. Amnon Ta-Shma, Tel-Aviv U.
Impagliazzo's worlds. Algorithmica: SAT ∈ P, i.e., NP = P. Heuristica: NP is easy on average on EASY distributions. Pessiland: NP is hard on average, but there are no OWFs. Minicrypt: OWFs exist, but there is no public-key crypto. Cryptomania: public-key cryptography exists.
Pseudo-P [IW98, Kab01]. L ∈ Pseudo_p-P if there exists a polynomial-time algorithm A = A_L such that for every samplable distribution D = {D_n} and every input length n, Pr_{x←D_n}[A(x) = L(x)] > p. D = {D_n} is samplable if there exists S ∈ P such that S(1^n, U_{p(n)}) = D_n (in the sampler, p(n) is a polynomial bounding the number of random bits).
Distributional complexity (Levin). Def: L ∈ Avg_{p(n)}-P if for every samplable distribution D there exists A = A_{L,D} ∈ P such that Pr_{x←D_n}[A(x) = L(x)] ≥ p(n).
Heuristica vs. Super-Heuristica. Heuristica: every (avg-case) solution to some hard problem is bound to a specific distribution. If the distribution changes, we need to come up with a new solution. Natural for cryptography (and lower bounds). Super-Heuristica: once a good heuristic for some NP-complete problem is developed, every new problem just needs to be reduced to it. Natural for algorithms, natural for complexity, and appears naturally in derandomization [IW98, Kab01, ...].
Connection to cryptography. Average-case hardness on a specific samplable distribution is the right hardness for cryptography. E.g., a standard assumption in cryptography is that (FACTORING, D) ∉ AvgBPP, where D is the samplable distribution obtained by sampling primes p, q and outputting N = pq.
A remark about reductions. For the distributional setting one needs to define "approximation preserving reductions" [Levin]. [L86, BDCGL90, Gur90, Gur91] showed complete problems in DistNP. [IL90] showed a reduction to the uniform distribution. For Pseudo-classes any (Cook/Karp) reduction is good: if L reduces to L' via R, then for every samplable D, R(D) is samplable. So SAT ∈ Pseudo-P ⇒ NP ⊆ Pseudo-P.
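The remark about Pseudo-classes can be illustrated with a minimal Python sketch. Everything in it is a toy assumption made for illustration: the language L (even parity), the target language L' (strings ending in 0), the reduction `reduction_R`, and the helper names are not from the talk. It shows the two facts the slide uses: composing a sampler for D with R gives a sampler for R(D), and composing an algorithm for L' with R gives an algorithm for L.

```python
import random

def sample_D(n, rng):
    """Toy sampler for D_n: a uniformly random n-bit string (assumption)."""
    return "".join(rng.choice("01") for _ in range(n))

def in_L(x):
    """Toy language L: strings with an even number of ones."""
    return x.count("1") % 2 == 0

def reduction_R(x):
    """Toy Karp reduction: x is in L iff R(x) is in L' (strings ending in 0)."""
    return "0" if x.count("1") % 2 == 0 else "1"

def decide_L_prime(y):
    """Stand-in for an algorithm A' that is good for L' on samplable inputs."""
    return y.endswith("0")

def push_forward(sampler, reduction):
    """Sampler for R(D): run the sampler for D, then apply the reduction."""
    def sampler_for_image(n, rng):
        return reduction(sampler(n, rng))
    return sampler_for_image

def decide_L(x):
    """Algorithm for L obtained by running A' on R(x)."""
    return decide_L_prime(reduction_R(x))

if __name__ == "__main__":
    rng = random.Random(0)
    sample_RD = push_forward(sample_D, reduction_R)
    print([sample_RD(8, rng) for _ in range(5)])
```

Since R(D) is itself samplable, A' succeeds on it with probability greater than p, and therefore `decide_L` succeeds on D with the same probability. The sketch ignores the change of input length caused by R.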
A refined picture. Algorithmica: NP = P. Super-Heuristica: NP ⊆ Pseudo-P. Heuristica: NP ⊆ Avg-P. Pessiland: NP is hard on average, but there are no OWFs.
Our main result. NP ≠ P ⇒ NP ⊄ Pseudo_{2/3+ε}-P. That is, worst-case hardness ⇒ weak average-case hardness. Also, NP ⊄ BPP ⇒ NP ⊄ Pseudo_{2/3+ε}-BPP. Compare with the open problem: NP ⊄ BPP ⇒? NP ⊄ Avg_{1-1/p(n)}-BPP.
Back to Impagliazzo's worlds. Algorithmica: NP = P. Super-Heuristica: NP ⊆ Pseudo-P. Heuristica: NP ⊆ Avg-P. Pessiland: NP is hard on average, but there are no OWFs.
In words: Super-Heuristica does not exist. If we do well on every samplable distribution, we do well on every input. Heuristics for NP-complete problems will always have to be bound to specific samplable distributions (unless NP is easy in the worst case).
Main Lemma. Given a description of a poly-time algorithm that fails to solve SAT, we can efficiently produce, on input 1^n, up to 3 formulas (of length n) such that at least one is hard for the algorithm (i.e., the algorithm errs on it). We also have a probabilistic version.
Proof of the main lemma. We are given a description of DSAT, and we know that DSAT fails to solve SAT. The idea is to use DSAT to find instances on which it fails. Think of DSAT as an adversary.
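First question to DSAT: can you solve SAT in the worst case? Write as a formula: Φ(n) = ∃φ ∈ {0,1}^n [SAT(φ) ≠ DSAT(φ)]. Problem: this is not an NP formula, since Φ(n) = ∃φ ∈ {0,1}^n ( [∃α φ(α) = true ∧ DSAT(φ) = 0] ∨ [∀α φ(α) = false ∧ DSAT(φ) = 1] ), and the second disjunct contains a universal quantifier.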
Search to decision. E.g., start with a SAT sentence φ(x_1, …, x_n) that DSAT claims is satisfiable. For each variable x_i in turn, try setting x_i = 0 and x_i = 1. If DSAT says neither restriction is satisfiable, we have found a contradiction. Otherwise, choose the value of x_i for which DSAT says the restricted formula is satisfiable. At the end, check that the assignment is satisfying.
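The search-to-decision step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the construction from the talk: formulas are CNFs in DIMACS-style integer lists, and `decider` is a hypothetical stand-in for DSAT that may answer incorrectly. The search either returns a satisfying assignment or returns the formulas on which the decider contradicts itself.

```python
from itertools import product

def simplify(cnf, var, value):
    """Substitute var := value: drop satisfied clauses, delete false literals."""
    true_lit = var if value else -var
    return [[l for l in clause if l != -true_lit]
            for clause in cnf if true_lit not in clause]

def brute_force_decider(cnf):
    """An honest (exponential-time) decider, used only for the demo."""
    variables = sorted({abs(l) for clause in cnf for l in clause})
    for bits in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, bits))
        if all(any(a[abs(l)] == (l > 0) for l in clause) for clause in cnf):
            return True
    return False

def search_with_decider(cnf, num_vars, decider):
    """Return ("assignment", dict), ("rejected", [cnf]) or ("contradiction", formulas)."""
    if not decider(cnf):
        return "rejected", [cnf]
    current, assignment = cnf, {}
    for var in range(1, num_vars + 1):
        f0, f1 = simplify(current, var, False), simplify(current, var, True)
        if decider(f0):
            current, assignment[var] = f0, False
        elif decider(f1):
            current, assignment[var] = f1, True
        else:
            # decider said yes on `current` but no on both restrictions
            return "contradiction", [current, f0, f1]
    if current == []:
        return "assignment", assignment
    # every variable is set, an empty clause remains, yet the decider said yes
    return "contradiction", [current]

if __name__ == "__main__":
    cnf = [[1, 2], [-1, 3], [-2, -3]]
    print(search_with_decider(cnf, 3, brute_force_decider))
    print(search_with_decider(cnf, 3, lambda f: f == cnf)[0])
```

In the "contradiction" case at least one of the returned formulas is an instance on which the decider errs, which is where the "up to 3 formulas" in the main lemma comes from.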
Can SSAT solve SAT in the worst case? SSAT (the search algorithm obtained from DSAT) has one-sided error: it can't say "yes" on an unsatisfiable formula. Φ(n) = ∃φ ∈ {0,1}^n ∃α [φ(α) = true ∧ SSAT(φ) = no]. Notice that Φ(n) ∈ SAT. Notice that we use the code of DSAT.
First question to DSAT: can SSAT solve SAT in the worst case? If DSAT(Φ(n)) = false, output Φ(n). [Note that Φ(n) ∈ SAT.] Otherwise, run the search algorithm on Φ(n) with DSAT. Case 1: the search algorithm fails. Output the three contradicting statements: DSAT(Φ(n)[α_1 … α_i x_{i+1} … x_m]) = true, DSAT(Φ(n)[α_1 … α_i 0 … x_m]) = false, and DSAT(Φ(n)[α_1 … α_i 1 … x_m]) = false. Case 2: the search algorithm succeeds. We hold φ ∈ SAT such that SSAT(φ) = false. Or DSAT(Φ(n)[α_1 … α_m]) = false.
Are we done? We hold φ on which SSAT is wrong (φ ∈ SAT but SSAT(φ) = false). What we need is a sentence on which DSAT is wrong.
Now work with φ. If DSAT(φ) = false, output φ. [Note that φ ∈ SAT.] Otherwise, run the search algorithm on φ with DSAT. Case 1: the search algorithm fails. Output the three contradicting statements. Case 2: the search algorithm succeeds, i.e., SSAT finds a satisfying assignment for φ. Case 2 never happens, since SSAT(φ) = false.
Comments about the reduction. Our reduction is non-black-box (because we use the description of the TM and the search-to-decision reduction), and it is adaptive (even if we use parallel search-to-decision reductions [BDCGL90]). So it does not fall into the categories ruled out by [Vio03, BT03] (for average classes).
Dealing with probabilistic algorithms. If we proceed as before we get: Φ(n) = ∃φ ∈ {0,1}^n ∃α [φ(α) = t ∧ Pr_r[SSAT(φ, r) = 1] < 2/3]. Problem: Φ(n) is an MA statement. We do not know how to derandomize without unproven assumptions. Solution: derandomize using Adleman (BPP ⊆ P/poly).
Back to the proof. We replace the formula Φ(n) = ∃φ ∈ {0,1}^n ∃α [φ(α) = t ∧ Pr_r[SSAT(φ, r) = 1] < 2/3] with a distribution over formulas, one for each random string r': Φ(n, r') = ∃φ ∈ {0,1}^n ∃α [φ(α) = t ∧ SSAT'(φ, r') = 0]. With very high probability over r', SSAT'(input, r') behaves like SSAT, and the argument continues as before.
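The idea of fixing the random string r' can be illustrated with a toy Python sketch of Adleman's argument. All of it is an assumption made for illustration: the language (odd parity) stands in for SAT, `noisy_decider` is a hypothetical algorithm that errs on each input for about a third of its coins, and the amplified version is obtained by a majority vote, which is the standard step in Adleman's proof rather than something the slides spell out.

```python
import random
from itertools import product

def in_language(x):
    """Toy stand-in for SAT: bit strings with an odd number of ones."""
    return x.count("1") % 2 == 1

def noisy_decider(x, rng):
    """Hypothetical randomized algorithm: wrong on x for about 1/3 of its coins."""
    coins = rng.getrandbits(16)
    wrong = (coins + 7 * int(x, 2)) % 3 == 0
    return in_language(x) != wrong

def amplified(x, r_prime, repetitions=101):
    """Majority vote over many runs; all coins are derived from the one string r_prime."""
    rng = random.Random(r_prime)
    votes = sum(noisy_decider(x, rng) for _ in range(repetitions))
    return 2 * votes > repetitions

def good_for_length(r_prime, n):
    """Check that the fixed string r_prime is correct on every input of length n."""
    inputs = ("".join(bits) for bits in product("01", repeat=n))
    return all(amplified(x, r_prime) == in_language(x) for x in inputs)

def find_good_r_prime(n, tries=50):
    """Try a few candidate strings; most of them should work after amplification."""
    for r_prime in range(tries):
        if good_for_length(r_prime, n):
            return r_prime
    return None

if __name__ == "__main__":
    print(find_good_r_prime(4))
```

After amplification the error on each input is so small that a union bound over all inputs of length n leaves most strings r' good for every input at once. The proof samples r' at random instead of searching for it, which is why it ends up with a distribution over formulas rather than a single formula.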
A weak Avg version. Thm: Assume NP ≠ RP. Let f(n) = n^{ω(1)}. Then there exists a distribution D samplable in time f(n), such that for every NP-complete language L, (L, D) ∉ Avg_{1-1/n^3}-BPP. Remark: the corresponding assertion for deterministic classes can be proved directly by diagonalization. The distribution is more complicated than the algorithms it's hard for.
Why are worst-case to avg-case reductions hard? Here are two possible explanations: 1. An exponential search space. 2. A weak machine has to beat stronger machines. Thm 2 says that the first is not the problem.
Proof of Theorem 2 – cont. We define the distribution D = {D_n} as follows: on input 1^n, choose uniformly a machine M from K_{log(n)} and run it for (say) n^{log(n)} steps. If it didn't halt, output 0^n; otherwise, output the output of M (trimmed or padded to length n). K_m is the set of all probabilistic TMs of description length at most m.
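The sampler for D can be mimicked by a small Python sketch. It is only a toy model under loud assumptions: "machines" are Python generators that yield once per step, the table `MACHINES` and its three entries are hypothetical, and descriptions with no entry are treated as machines that never halt. It is meant for small n only.

```python
import math
import random
from itertools import product

def machine_constant(n, rng):
    yield                      # one step of work
    return "1" * n

def machine_loop(n, rng):
    while True:                # never halts
        yield

def machine_coin(n, rng):
    out = []
    for _ in range(n):         # one coin flip per step
        out.append(rng.choice("01"))
        yield
    return "".join(out)

# Hypothetical description table; real descriptions would encode probabilistic TMs.
MACHINES = {"0": machine_constant, "1": machine_loop, "00": machine_coin}

def descriptions_up_to(m):
    """All binary descriptions of length 1..m (the toy version of K_m)."""
    return ["".join(b) for k in range(1, m + 1) for b in product("01", repeat=k)]

def run_with_budget(machine, n, rng, budget):
    """Run the generator for at most `budget` steps; None means it did not halt."""
    gen = machine(n, rng)
    for _ in range(budget):
        try:
            next(gen)
        except StopIteration as stop:
            return stop.value
    return None

def sample_D(n, rng):
    m = max(1, int(math.log2(n)))
    budget = int(n ** math.log2(n))            # the (say) n^{log n} step bound
    machine = MACHINES.get(rng.choice(descriptions_up_to(m)))
    out = run_with_budget(machine, n, rng, budget) if machine else None
    if out is None:
        return "0" * n                         # did not halt: output 0^n
    return out[:n].ljust(n, "0")               # trim or pad to length n

if __name__ == "__main__":
    rng = random.Random(0)
    print([sample_D(8, rng) for _ in range(5)])
```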
Proof of Theorem 2 – cont. By Thm 1, for every algorithm A there exists a samplable distribution D that outputs hard instances for it. With probability at least n^{-2} we choose the machine that generates D, and then with probability > 1/3 we get a hard instance for A.
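The arithmetic behind the 1 - 1/n^3 bound can be written out as follows. This is a sketch under assumptions not stated on the slide: descriptions are binary strings, M_A denotes the machine sampling the hard distribution for A (its description has constant length, so it belongs to K_{log n} for large enough n), and D' denotes the distribution over machines defined above.

```latex
\begin{align*}
|K_{\log n}| &\le \sum_{i=0}^{\log n} 2^{i} < 2^{\log n + 1} = 2n, \\
\Pr[\text{the sampler picks } M_A] &\ge \frac{1}{2n} \ge n^{-2} \qquad (n \ge 2), \\
\Pr_{x \leftarrow D'_n}[A(x) \ne L(x)] &\ge n^{-2} \cdot \frac{1}{3} = \frac{1}{3n^{2}} > \frac{1}{n^{3}} \qquad (n > 3).
\end{align*}
```

So A succeeds on D' with probability below 1 - 1/n^3 for all large enough n, which matches the Avg_{1-1/n^3} bound in the theorem.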
Hardness amplification for Pseudo-classes. Reminiscent of hardness amplification for AvgBPP, but: Many old techniques don't work. E.g., many reconstruction proofs don't work, because the reconstructing algorithm cannot sample the distribution. Some techniques work better. E.g., boosting: if for every samplable distribution the algorithm can find a witness for a non-negligible fraction of inputs, then it finds a witness for almost all inputs in any samplable distribution.
We proved: NP ⊄ BPP ⇒ P^{||,NP} ⊄ Pseudo_{½+ε}-BPP. Open problem: NP ⊄ BPP ⇒ NP ⊄ Pseudo_{½+ε}-BPP. Using: the [VV, BDCGL90] parallel search-to-decision reduction, error-correcting codes, and boosting. The first two are common ingredients in hardness amplification. Boosting is a kind of replacement for the hard-core lemma.
Some more slides
Summarizing the proof. We have reduced the problem of searching for hard instances to an NP search problem (via downward self-reducibility and Adleman). We then run the search with DSAT, which either contradicts itself or keeps on cheating on formulas of decreasing size until we can check for ourselves. (Similar to the idea that underlies the proof that IP = PSPACE.)
Part 2: Our results
Proof of Theorem 1. Recall Thm 1: Follows (almost) directly from the main lemma. The only problem is that we jump between input lengths. We define two samplable distributions, one from the first search (on Φ(n)) and the second from the second search (on φ). The lemma says that if an algorithm succeeds on both, then it succeeds on every input.
Avg-P (Levin + Impagliazzo). L ∈ Avg_{p(n)}-P if for every samplable distribution D = {D_n}, (L, D) ∈ Avg_{p(n)}-P. In words: for every samplable distribution, there exists a good algorithm.
Foundations of cryptography. Can we get OWF from worst-case hardness? Example: lattice-based problems [Ajt96, AD96, Reg03, RM04]. Can we base OWF on worst-case hardness of NP-complete languages? The lattice problems are probably not NP-complete [GG00, AR04].
Discussion: Worst-case to avg-case reductions. Can we use the min-max theorem? Examples: hardcore lemma [Imp95], pseudoentropy [BSW03].
Other open questions Instance checker for NP. Hardness amplification in the pseudo setting. Is it necessary to use the description of the machines in the reduction?
Part 1: Definitions. Distributional complexity; samplable distributions; Avg_{p(n)}-P; Pseudo_{p(n)}-P.
P. L ∈ P if there exists a polynomial-time algorithm A such that for every input x ∈ {0,1}*, A(x) = L(x).
Samplable distributions. Def: D = {D_n} is samplable if each D_n is a distribution over {0,1}^n and there exists a probabilistic polynomial-time algorithm S that on input 1^n outputs an n-bit string according to the distribution D_n: S(1^n, U_{p(n)}) = D_n.
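A concrete toy instance of the definition may help. The following Python sketch is an assumption made for illustration and not taken from the talk: the sampler uses p(n) = 2n coin bits and outputs a biased n-bit string, and `success_rate` gives an empirical estimate of the quantity Pr_{x←D_n}[A(x) = L(x)] that the Avg and Pseudo definitions refer to.

```python
import random

def sampler(n, coins):
    """S(1^n, r): a deterministic map from p(n) = 2n coin bits to an n-bit string.

    Output bit i is the AND of coins 2i and 2i+1, so each bit is 1 with probability 1/4.
    """
    assert len(coins) == 2 * n
    return "".join(
        "1" if coins[2 * i] == "1" and coins[2 * i + 1] == "1" else "0"
        for i in range(n)
    )

def draw(n, rng):
    """Draw x from D_n by feeding uniform coins U_{p(n)} to the sampler."""
    coins = "".join(rng.choice("01") for _ in range(2 * n))
    return sampler(n, coins)

def success_rate(algorithm, language, n, trials, rng):
    """Empirical estimate of Pr over x drawn from D_n that algorithm(x) == language(x)."""
    hits = sum(algorithm(x) == language(x)
               for x in (draw(n, rng) for _ in range(trials)))
    return hits / trials

if __name__ == "__main__":
    rng = random.Random(0)
    language = lambda x: x.count("1") % 2 == 1   # toy language: odd parity
    always_no = lambda x: False                  # a trivial heuristic
    print(success_rate(always_no, language, 8, 1000, rng))
```

The trivial heuristic does better than one half here only because this particular D_n is biased toward strings with few ones, which shows how success depends on the distribution.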
Pseudo-BPP: Natural from an algorithmic point of view Often we do not know the distribution on the inputs except that the distribution was generated by an efficient process. We would like to have one algorithm that is good for every samplable distribution.
Pseudo-BPP: Arises naturally in derandomization Arises naturally in the context of derandomization under uniform assumptions [IW98,Kab01,Lu01,TV02,GST03].
Worst-case to avg-case reductions – known results High complexity classes (EXP,PSPACE,#P) – possible [BFNW93,IW97,STV99,TV02]. Main idea: encode the truth-table of the function (via ECC). S.t.: 1. The encoded tt appears in the encoding. 2. Local decoding. Requires access to the whole truth-table.
Worst-case to avg-case reductions – known results Within NP – lattice problems, probably not NP-complete. Negative results for NP-complete problems: 1. Impossible via black-box reductions (unconditionally) [Vio03]. 2. Impossible via non-adaptive self-correction (conditionally) [BT03].