NP-complete Problem Prof. S M Lee Department of Computer Science
Decision Problems To keep things simple, we will mainly concern ourselves with decision problems. These problems only require a single bit of output: ``yes'' or ``no''. How would you solve the following decision problems? Is this directed graph acyclic? Is there a spanning tree of this undirected graph with total weight less than w? Does this bipartite graph have a perfect (all nodes matched) matching? Does the pattern p appear as a substring in text t?
P P is the set of decision problems that can be solved in worst-case polynomial time: If the input is of size $n$, the running time must be $O(n^k)$ for some constant $k$. Note that $k$ can depend on the problem class, but not on the particular instance. All the decision problems mentioned above are in P.
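As one possible answer to the first question, here is a minimal Python sketch of a polynomial-time acyclicity test. It uses Kahn's algorithm (repeatedly removing nodes with no incoming edges), which is one standard choice and not necessarily the method intended in the lecture. The function name `is_acyclic` and the edge-list input format are assumptions made for illustration.

```python
from collections import deque


def is_acyclic(num_nodes, edges):
    """Return True if the directed graph on nodes 0..num_nodes-1 has no cycle.

    `edges` is an iterable of (u, v) pairs meaning an edge from u to v.
    (Hypothetical input format chosen for this sketch.)
    """
    successors = [[] for _ in range(num_nodes)]
    indegree = [0] * num_nodes
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    # Kahn's algorithm: repeatedly remove nodes that have no incoming edges.
    ready = deque(node for node in range(num_nodes) if indegree[node] == 0)
    removed = 0
    while ready:
        u = ready.popleft()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)

    # A cycle exists exactly when some node could never be removed.
    return removed == num_nodes
```

The output is a single bit, and the running time is linear in the number of nodes plus edges, so this decision problem is solvable in polynomial time.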
Nice Puzzle The class NP (meaning non-deterministic polynomial time) is the set of problems that might appear in a puzzle magazine: ``Nice puzzle.'' What makes these problems special is that they might be hard to solve, but a short answer can always be printed in the back, and it is easy to see that the answer is correct once you see it. Example... Does matrix A have an LU decomposition? No guarantee if answer is ``no''.
NP Technically speaking: A problem is in NP if it has a short accepting certificate. An accepting certificate is something that we can use to quickly show that the answer is ``yes'' (if it is yes). Quickly means in polynomial time. Short means polynomial size. This means that all problems in P are in NP (since we don't even need a certificate to quickly show the answer is ``yes''). But other problems in NP may not be in P. Given an integer x, is it composite? How do we know this is in NP?
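For the compositeness question, one natural accepting certificate is a nontrivial divisor of x. The sketch below shows what the polynomial-time check could look like; the name `verify_composite` is hypothetical, and the choice of a divisor as the certificate is an assumption of this example.

```python
def verify_composite(x, certificate):
    """Check a claimed nontrivial divisor of x.

    Returns True only if `certificate` proves that x is composite.
    A False result says nothing about x: the certificate may simply be wrong.
    """
    d = certificate
    return 1 < d < x and x % d == 0
```

The certificate is short (it has no more bits than x), and the check is a single division, which takes time polynomial in the number of bits of x. For example, `verify_composite(91, 7)` accepts, while `verify_composite(97, 7)` rejects.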
Good Guessing Another way of thinking of NP is that it is the set of problems that can be solved efficiently by a really good guesser. The guesser essentially picks the accepting certificate out of the air (Non-deterministic Polynomial time). It can then convince itself that it is correct using a polynomial time algorithm. (Like a right-brain, left-brain sort of thing.) Clearly this isn't a practically useful characterization: how could we build such a machine?
Exponential Upper Bound Another useful property of the class NP is that all NP problems can be solved in exponential time (EXP). This is because we can always list out all short certificates in exponential time and check all $O(2^{n^k})$ of them. Thus, P is contained in NP, and NP is contained in EXP. Although we know that P is not equal to EXP, it is possible that NP = P, or NP = EXP, or neither. Frustrating!
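The enumeration argument can be sketched on a concrete NP problem. Subset Sum is used here purely as an illustration (it does not appear in the slides): a certificate is one 0/1 flag per input number, the verifier runs in polynomial time, and the solver simply tries every certificate. All function names are hypothetical.

```python
from itertools import product


def verify_subset_sum(numbers, target, certificate):
    """Polynomial-time check: do the flagged numbers add up to target?

    `certificate` holds one 0/1 flag per entry of `numbers`.
    """
    return sum(n for n, bit in zip(numbers, certificate) if bit) == target


def solve_by_enumeration(numbers, target):
    """Try all 2**len(numbers) certificates: exponential time in the worst case."""
    for certificate in product((0, 1), repeat=len(numbers)):
        if verify_subset_sum(numbers, target, certificate):
            return True
    return False
```

The verifier is fast and the loop is slow, which is exactly the pattern behind the exponential upper bound: a polynomial-time check repeated over exponentially many candidate certificates.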
NP-hardness As we will see, some problems are at least as hard to solve as any problem in NP. We call such problems NP-hard. How might we argue that problem X is at least as hard (to within a polynomial factor) as problem Y? If X is at least as hard as Y, how would we expect an algorithm that is able to solve X to behave?
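One common way to answer these questions is a reduction: if any algorithm for X can be used, with only polynomial extra work, to solve Y, then X is at least as hard as Y. The sketch below illustrates this with X = Vertex Cover and Y = Independent Set, an example chosen here and not taken from the slides. The brute-force `has_vertex_cover` is only a stand-in for "some algorithm that solves X"; both function names are hypothetical.

```python
from itertools import combinations


def has_vertex_cover(num_nodes, edges, size):
    """Stand-in solver for X: is there a vertex cover with at most `size` nodes?

    Brute force, for illustration only. Any correct solver could be used instead.
    """
    if size < 0:
        return False
    # A cover of at most `size` nodes can be padded to exactly min(size, n) nodes.
    size = min(size, num_nodes)
    for cover in combinations(range(num_nodes), size):
        chosen = set(cover)
        if all(u in chosen or v in chosen for u, v in edges):
            return True
    return False


def has_independent_set(num_nodes, edges, size):
    """Solve Y with a single call to the solver for X.

    A node set is independent exactly when its complement touches every edge,
    so an independent set of at least `size` nodes exists if and only if a
    vertex cover of at most num_nodes - size nodes exists.
    """
    return has_vertex_cover(num_nodes, edges, num_nodes - size)
```

The translation between the two problems costs only a subtraction, so a fast algorithm for Vertex Cover would immediately give a fast algorithm for Independent Set.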
HARD AND EASY PROBLEMS (a very informal introduction)
Good starting points for precise definitions and a formal introduction are:
Papadimitriou: Computational Complexity, Addison-Wesley, 1994
Garey and Johnson: Computers and Intractability, Freeman, 1979
Schrijver: Theory of Linear and Integer Programming, Wiley, 1986 (Chapter 2)
P: Collection Z of problems is in P (polynomial-time solvable) if there exists a polynomial-time algorithm that solves any problem in Z, i.e., an algorithm that requires at most $f(s)$ basic steps, where $s$ is the size of the input and $f$ is a polynomial.
NP: Collection Z of decision problems is in NP (nondeterministic polynomial-time solvable) if there exists a polynomial-time algorithm to check the correctness of the YES answer.
co-NP: replace YES by NO in the above definition.
One of the central problems of (theoretical) computer science, widely and intensively studied for 30 years, is to prove that (a) $P \neq NP$ and (b) $NP \neq$ co-NP. All evidence indicates that these conjectures are true. Disproving either of these two conjectures would not only be considered truly spectacular, but would also come as a tremendous surprise (with a variety of far-reaching counterintuitive consequences). NP-complete: Collection Z of problems is NP-complete if (a) it is in NP and (b) if a polynomial-time algorithm existed for solving problems in Z, then P=NP. (Figure: diagram of the classes P, NP, and co-NP.)
Theorem. (Karp-Papadimitriou, ‘80) If collection Z of membership problems is NP-complete, then either (1) NP=co-NP or (2) CLDs for Z cannot be described (in terms of giving a finite number of classes/rules for defining halfspaces from the CLD). Precise definitions of what form of description is ruled out (unless NP=co-NP) by the theorem can be found in Karp and Papadimitriou: “On linear characterization of combinatorial optimization problems”, SIAM Journal on Computing 11 (1982). Also, Chapter 18 of Schrijver’s Theory of Linear and Integer Programming (Wiley, 1986) has a discussion of the result. (The result is often stated in terms of the class of all possible “rational” polyhedra. However, focusing on any NP-complete class is equivalent to studying the general class.)
Membership Problem for $S$. INPUT: $y \in \mathbb{R}^d$. Is $y \in S$?
Linear Optimization Problem for $S$. INPUT: $w \in \mathbb{R}^d$. Find $\max \{ w \cdot s \mid s \in S \}$.
Theorem. (Groetschel, Lovasz, Schrijver ‘88) The complexity class of any collection $Z_M$ of membership problems is equal to the complexity class of the collection $Z_{LO}$ of the corresponding linear optimization problems.
The details can be found in Groetschel, Lovasz, Schrijver: Geometric Algorithms and Combinatorial Optimization, Springer-Verlag, 1988. Also, see Chapter 14 of Schrijver’s Theory of Linear and Integer Programming (Wiley, 1986) for an overview of the result and its widespread implications.
EXAMPLE: BINARY CHOICE
$D = \{ \{a \succ b\} \mid \{a,b\} \subseteq C \}$ (note that $d = |D| = n(n-1)/2$)
Binary Choice Polytope $P_{BC}(n)$ = the convex hull of $\{ x_i \mid i = 1, \dots, n! \}$
Define $Z = \{ P_{BC}(n),\ n = 1, 2, 3, \dots \}$
Theorem. The collection $Z_{LO}$ of linear optimization problems on polytopes from $Z$ is NP-complete.
Thus, by GLS Theorem, $Z_M$ is NP-complete.
Thus, by KP Theorem, there is no hope for finding a CLD for the Binary Choice Polytope (unless NP=co-NP).
EXAMPLE: MULTIPLE CHOICE
$D = \{ \{a \succ b : b \in B \setminus \{a\}\} \mid (a,B) \text{ such that } a \in B \subseteq C \}$ (note that $d = |D| = n 2^{n-1}$)
Multiple Choice Polytope $P_{MC}(n)$ = the convex hull of $\{ x_i \mid i = 1, \dots, n! \}$
Define $Z = \{ P_{MC}(n),\ n = 1, 2, 3, \dots \}$
Theorem. The collection $Z_{LO}$ of linear optimization problems on polytopes from $Z$ is in P.
Thus, by GLS Theorem, $Z_M$ is in P.
Thus, KP Theorem is not restrictive in this case. (In fact, a CLD was given by Falmagne ‘79.)
Linear Optimization on $P_{MC}(n)$ is easy.
INPUT: $w \in \mathbb{R}^d$
Find $\max \{ f(s) = w \cdot s \mid s \in P_{MC}(n) \}$.
The max is attained at some extreme point of $P_{MC}(n)$, i.e., at some $x_i$ (that represents $R_i$).
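Because the maximum is attained at an extreme point, the problem can in principle be solved by evaluating the objective at the vertex of every ranking. The sketch below does exactly that. It rests on assumptions not spelled out on this slide: alternatives are numbered 0..n-1, the weight vector is a dictionary keyed by pairs (a, B) with B a frozenset, and the vertex of a ranking has a 1 in coordinate (a, B) exactly when a is the best-ranked element of B. All names are hypothetical.

```python
from itertools import combinations, permutations


def nonempty_subsets(items):
    """Yield every nonempty subset of `items` as a frozenset."""
    items = tuple(items)
    for size in range(1, len(items) + 1):
        for subset in combinations(items, size):
            yield frozenset(subset)


def ranking_value(ranking, w):
    """Objective value w . x at the vertex x of `ranking` (best alternative first)."""
    position = {a: i for i, a in enumerate(ranking)}
    total = 0
    for B in nonempty_subsets(ranking):
        best = min(B, key=position.__getitem__)  # top-ranked element of B
        total += w[(best, B)]
    return total


def maximize_by_enumeration(n, w):
    """Evaluate all n! vertices and return the largest objective value."""
    return max(ranking_value(r, w) for r in permutations(range(n)))
```

This enumeration takes n! evaluations, so it is only a correctness baseline for small n and says nothing yet about polynomial-time solvability.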
Dynamic Programming:
For every $a \in C$ do Value($\{a\}$) := $w(a,\{a\})$
For $i = 2, \dots, n$ do
  For every $B$ such that $|B| = i$ do
    Find Value($B$)
Value($C$) is the solution to the linear optimization problem.
The inside loop requires at most $K i 2^i$ steps for some constant $K$. There are fewer than $2^n$ repetitions of the inside loop. Thus the algorithm needs at most $K'(n 2^n)^2$ steps.
In terms of the input size $d = n 2^{n-1}$: the algorithm requires at most $L d^2$ steps for some constant $L$. (These are very crude estimates, but we don't have to be precise for our purpose here.)
Linear Optimization on $P_{MC}(n)$ can be done in polynomial time.
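The update rule for Value(B) did not survive in the text of this slide, so the sketch below uses a reconstruction that is an assumption: if a is ranked first within B, then a wins every query (a, A) with a in A and A a subset of B, and the rest of B is ranked optimally, giving Value(B) = max over a in B of [sum of w(a, A) over those A] + Value(B \ {a}). This matches the base case Value({a}) = w(a, {a}) and the stated cost of about i 2^i steps per subset of size i. Alternatives are numbered 0..n-1 and w is a dictionary keyed by (a, frozenset B); all names are hypothetical.

```python
from itertools import combinations


def maximize_by_subset_dp(n, w):
    """Max of w . x over all rankings of range(n), by dynamic programming on subsets.

    `w` maps (a, frozenset B), with a in B, to a number.
    The recurrence is a reconstruction, not taken verbatim from the slides.
    """
    value = {frozenset(): 0}
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            B = frozenset(subset)
            best = None
            for a in B:
                rest = B - {a}
                # If a is ranked first in B, it is chosen from every A with
                # a in A and A a subset of B.
                gain = 0
                for k in range(len(rest) + 1):
                    for others in combinations(sorted(rest), k):
                        gain += w[(a, frozenset(others) | {a})]
                candidate = gain + value[rest]
                if best is None or candidate > best:
                    best = candidate
            value[B] = best
    return value[frozenset(range(n))]
```

For a subset of size i the inner work is i choices of a times 2^(i-1) sets A, which is where a bound of the form K i 2^i per subset comes from. Since the input itself has d = n 2^(n-1) weights, the total running time is polynomial in the input size.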
The probability distribution over the set of all $n!$ rankings explains the data. Given the set of queries $D$: If the number of queries $d < f(n)$ for some polynomial $f$, finding a CLD for the corresponding polytope is likely to be hopeless. If the number of queries $d > K a^n$, there is hope. Exercise: Show that the linear optimization problem corresponding to the membership problem for the Approval Voting Polytope is easy (i.e., can be solved in polynomial time). Hint: What is the input size? Hint: Devise a dynamic programming algorithm for finding a maximum-weight maximal chain.
Note that the rankings played no conceptual role. General Model Claim: “Given any data set, there is a probability distribution over the finite set of alternatives (states of nature) that explains the data.” The finite set of queries/questions: Then, for any Q, the probability that Q holds is Get polytopal representation of the model. How to choose D?
(Figure: Possible States of Nature $S_1, \dots, S_s$; Queries ($d$ of them); Polytope (in $[0,1]^d$); Model Characterization; Linear Optimization.)
NP-Complete Problems