Lecture 24: Coping with NPC and Unsolvable Problems. When a problem is unsolvable, that's generally very bad news: it means there is no general algorithm that is guaranteed to solve the problem in all cases. However, it's rare that we actually need to solve a problem in all possible cases. We can: specialize for particular applications; try heuristics. For example, we know that optimally compressing a string is unsolvable (we proved it), and even approximate compression is unsolvable. But everybody compresses data anyway, and some people are making money out of it. Theory and practice are often very far apart!

When a problem is NP-hard … Similarly, when we prove a problem is NP-complete, that means that no one currently has a polynomial-time algorithm for the problem. But that's absolutely not a reason to give up: worst-case hardness proofs can be deceiving about typical instances. For optimization problems, we are often willing to settle for solutions that are not the best possible but come pretty close to it. For example, for the travelling salesman problem, finding the tour of least cost is nice, but in real life we would often be content with a tour that is close to optimal. This leads to the idea of approximation algorithms.

Exhaustive search. Although exhaustive search is too slow for large instances of NP-complete problems, since the solution space can grow exponentially, there are tricks that speed up the computation in many cases. For example, although the travelling salesman problem is NP-complete, we can find optimal travelling salesman tours for real-world instances with hundreds or even thousands of cities by using such search techniques, as in the sketch below.
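
A minimal sketch of this kind of search, assuming a symmetric cost matrix dist (the function names and the small instance are illustrative, not from the lecture): it enumerates tours recursively and prunes any partial tour that already costs at least as much as the best complete tour found so far.

    # Exhaustive travelling-salesman search with simple pruning.
    def tsp_exhaustive(dist):
        n = len(dist)
        best = [float('inf')]

        def extend(path, cost, remaining):
            if cost >= best[0]:
                return  # prune: cannot beat the best tour found so far
            if not remaining:
                best[0] = min(best[0], cost + dist[path[-1]][path[0]])
                return
            for city in sorted(remaining):
                extend(path + [city], cost + dist[path[-1]][city],
                       remaining - {city})

        extend([0], 0, frozenset(range(1, n)))
        return best[0]

    dist = [[0, 2, 9, 10],
            [2, 0, 6, 4],
            [9, 6, 0, 3],
            [10, 4, 3, 0]]
    print(tsp_exhaustive(dist))  # 18, the tour 0-1-3-2-0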

Backtracking. Backtracking and exhaustive search are things we have "avoided" at all costs in this course. But are they really that bad?

Example. This often works. For the input (x1 OR ~x2) AND (~x2 OR x4) AND (x1 OR x2 OR x3), by setting x1 = 0 and x1 = 1 we reduce it to simpler formulas:

                 (x1 OR ~x2) AND (~x2 OR x4) AND (x1 OR x2 OR x3)
                        /                           \
                  x1 = 0                             x1 = 1
                      /                               \
    (~x2) AND (~x2 OR x4) AND (x2 OR x3)         (~x2 OR x4)
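
A minimal backtracking sketch of exactly this branching, assuming a hypothetical clause encoding in which a clause is a list of signed integers (1 stands for x1, -2 for ~x2, and so on):

    def simplify(clauses, lit):
        # Set lit to true: drop satisfied clauses, delete the opposite literal.
        out = []
        for clause in clauses:
            if lit in clause:
                continue
            reduced = [l for l in clause if l != -lit]
            if not reduced:
                return None  # an empty clause: this branch is contradictory
            out.append(reduced)
        return out

    def sat(clauses):
        if not clauses:
            return True  # no clauses left: formula satisfied
        var = abs(clauses[0][0])
        for lit in (-var, var):  # branch on var = 0, then var = 1
            reduced = simplify(clauses, lit)
            if reduced is not None and sat(reduced):
                return True
        return False

    # The formula (x1 OR ~x2) AND (~x2 OR x4) AND (x1 OR x2 OR x3):
    print(sat([[1, -2], [-2, 4], [1, 2, 3]]))  # True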

Branch and Bound. Branch-and-bound is a natural idea applied to optimization problems. We keep track of the cost of the best solution or partial solution found so far, and we reject a partial solution as soon as a lower bound on its total cost exceeds the cost of the best solution found so far; often we can compute such a bound from the partial solution.

Example: travelling salesman. Suppose we have a partial solution given by a simple path from a to b passing through the vertices of S; denote it by [a, S, b]. We extend this to a full tour by finding a path [b, V − (S ∪ {a, b}), a]. We do this extension edge by edge: if there is an edge (b, c) in the graph, then [a, S, b] gets replaced by [a, S ∪ {b}, c]. How do we estimate the cost of a partial solution? Given a partial tour [a, S, b], the remainder of the tour is a path through V − S − {a, b}, plus edges from a and b into this set. Therefore its cost is at least the sum of the least-weight edge from a to V − S − {a, b}, the least-weight edge from b to V − S − {a, b}, and the weight of a minimum spanning tree of V − S − {a, b}, all of which are easy to compute.
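
A minimal sketch of this lower bound, again assuming a symmetric cost matrix dist; the MST weight of the remaining vertices is computed with Prim's algorithm.

    # Prim's algorithm on the subgraph induced by `vertices`.
    def mst_weight(vertices, dist):
        vs = list(vertices)
        if len(vs) <= 1:
            return 0
        in_tree, total = {vs[0]}, 0
        while len(in_tree) < len(vs):
            w, v = min((dist[u][x], x)
                       for u in in_tree for x in vs if x not in in_tree)
            in_tree.add(v)
            total += w
        return total

    # Lower bound on any full tour extending the partial path [a, S, b].
    def lower_bound(a, b, S, V, dist, path_cost):
        rest = [v for v in V if v not in S and v != a and v != b]
        if not rest:
            return path_cost + dist[b][a]  # nothing left: just close the tour
        return (path_cost
                + min(dist[a][v] for v in rest)  # cheapest edge back into a
                + min(dist[b][v] for v in rest)  # cheapest edge out of b
                + mst_weight(rest, dist))        # MST of remaining vertices

If this bound already exceeds the best tour found so far, the whole subtree of extensions of [a, S, b] can be discarded.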

Approximation Algorithms. If we cannot solve a problem exactly, we can look for approximate solutions. For example, for the travelling salesman problem, we might settle for a tour whose cost is within some constant factor of the best. For minimization problems, the approximation ratio of an approximation algorithm A is defined to be (cost of A's solution) / (cost of an optimal solution), in the worst case over all inputs.

Vertex Cover: approximation ratio 2. A matching in a graph is a subset of the edges such that no vertex appears in two or more of them. A matching is maximal if one cannot add any new edge to it and still preserve the matching property. Maximal matching algorithm: examine the edges consecutively and add each one to our matching if it is disjoint from the edges already chosen; this takes polynomial time. The vertex cover we output is the set of all endpoints of the chosen edges (see the sketch below).
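
A minimal sketch of this algorithm, assuming the graph is given as a list of edges (pairs of vertices); it builds a maximal matching greedily and returns all matched endpoints as the cover.

    def vertex_cover_2approx(edges):
        cover = set()
        for u, v in edges:
            # Take (u, v) into the matching only if both endpoints are
            # still unmatched; then both endpoints join the cover.
            if u not in cover and v not in cover:
                cover.add(u)
                cover.add(v)
        return cover

    # A path on 4 vertices: the optimum cover is {2, 3}, size 2;
    # the algorithm returns all 4 endpoints, illustrating the ratio 2.
    print(vertex_cover_2approx([(1, 2), (2, 3), (3, 4)]))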

Ratio-2 Vertex Cover, continued … Clearly: (1) the number of vertices in any vertex cover of G is at least as large as the number of edges in any maximal matching, since the matched edges are disjoint and each needs its own cover vertex; (2) the set of all endpoints of a maximal matching is a vertex cover, since any uncovered edge could still be added to the matching. So, letting M be the set of edges in a maximal matching and C a smallest vertex cover, we have |C| ≥ |M| by (1), and by (2) our algorithm outputs a cover of size 2|M| ≤ 2|C|. It follows that our algorithm for constructing a vertex cover has an approximation ratio bounded by 2. Dinur and Safra proved that you can't do better than a factor of about 1.36 unless P = NP.

Shortest Common Superstring. Approximation algorithms are usually simple, but the proofs of their approximation guarantees are usually hard. Here is one example. Given n strings s1, s2, …, sn, find the shortest common superstring s (i.e., each si is a substring of s, and s is the shortest such string). The problem is NP-complete. Greedy algorithm: repeatedly merge the two strings with maximum overlap until only one string is left. Theorem: this greedy algorithm is 4 × optimal.
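
A minimal sketch of this greedy algorithm (names are illustrative; it assumes no input string is a substring of another, which can be ensured by preprocessing):

    def overlap(a, b):
        # Length of the longest suffix of a that is a prefix of b.
        for k in range(min(len(a), len(b)), 0, -1):
            if a.endswith(b[:k]):
                return k
        return 0

    def greedy_superstring(strings):
        strings = list(strings)
        while len(strings) > 1:
            # Find the ordered pair with maximum overlap and merge it.
            k, i, j = max((overlap(a, b), i, j)
                          for i, a in enumerate(strings)
                          for j, b in enumerate(strings) if i != j)
            merged = strings[i] + strings[j][k:]
            strings = [s for t, s in enumerate(strings) if t not in (i, j)]
            strings.append(merged)
        return strings[0]

    print(greedy_superstring(["CATG", "ATGC", "TGCA"]))  # CATGCA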

Widely used in DNA sequencing. It is widely used in DNA shotgun sequencing (especially with the new generation of sequencers, which promise to sequence fragments of 40k base pairs): make many copies (single strands), cut them into fragments of length ~500, sequence each of the fragments, then assemble all fragments into the shortest common superstring by GREEDY: repeatedly merge the pair with maximum overlap until finished. The December release of the mouse genome: 33 million reads, covering 2.5G bases (10x coverage).

Many have worked on this. Many people (well-known scientists, including one author of our textbook) have worked on this problem and have improved the constant below 4.

Theorem: GREEDY achieves 4n, where n = opt. Proof by picture. Given S = {s1, …, sm}, construct a graph G: the nodes are s1, …, sm, and for each ordered pair (si, sj) add an edge of weight pref(si, sj), the length of the prefix of si left over after its maximum overlap with sj; that is, |si| = pref(si, sj) + (overlap length with sj). Then |SCS(S)| = the length of the shortest Hamiltonian cycle in G. Modified greedy: find a set of cycles of minimum total weight covering G, then open each cycle and concatenate to obtain the final superstring. (Note: regular greedy creates no cycles.) [Figure: si drawn overlapping sj, with the non-overlapping prefix of si marked as pref.]
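
A minimal sketch of this edge weight, reusing the overlap function from the previous sketch; pref(si, sj) is the part of si that remains when si is immediately followed by sj in a superstring.

    def pref(a, b):
        # |a| = pref(a, b) + overlap(a, b)
        return len(a) - overlap(a, b)

    def overlap_graph(strings):
        # Complete weighted digraph G; the shortest Hamiltonian
        # cycle in G corresponds to the shortest common superstring.
        return {(a, b): pref(a, b)
                for a in strings for b in strings if a != b}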

Assume the initial shortest Hamiltonian cycle C has weight w(C) = n. Merging si with sj is then equivalent to breaking C into two cycles C1 and C2, and we claim w(C1) + w(C2) ≤ n. Proof: we merged (si, sj) because they have the maximum overlap. Let s' be the successor of si on C and s'' the predecessor of sj. Since s'' and s' must overlap at least that much, the two new edges cost no more than the two old ones: d(si, sj) + d(s'', s') ≤ d(si, s') + d(s'', sj). Continue this process; we end with self-cycles C1, C2, C3, C4, … with Σ w(Ci) ≤ n. [Figure: the cycle C with edges (si, s') and (s'', sj) replaced by (si, sj) and (s'', s'), splitting C into C1 and C2.]

Then we open the cycles and concatenate. Let wi = w(Ci) and Li = |the longest string in Ci|. Opening cycle Ci gives a string of length |open Ci| ≤ wi + Li, and we know n ≥ Σ wi. Lemma: the longest strings S1 and S2 from two different cycles overlap by at most w1 + w2. By the Lemma, Σ (Li − 2wi) ≤ n, since the Li's must all appear in the final SCS and their pairwise overlaps are bounded. Therefore |Greedy'(S)| ≤ Σ (Li + wi) = Σ (Li − 2wi) + Σ 3wi ≤ n + 3n = 4n. QED

Open Question. Show that GREEDY achieves approximation ratio 2.