
1 Approximation Algorithms

2 Motivation By now we’ve seen many NP-Complete problems. We conjecture that none of them has a polynomial-time algorithm.

3 Motivation Is this a dead end? Should we give up altogether?

4 Motivation Or maybe we can settle for good approximation algorithms?

5 Introduction
Objectives:
–To formalize the notion of approximation.
–To demonstrate several such algorithms.
Overview:
–Optimization and Approximation
–VERTEX-COVER, SET-COVER

6 Optimization Many of the problems we’ve encountered so far are really optimization problems, i.e., the task can be naturally rephrased as finding a maximal/minimal solution. For example: finding a maximum clique in a graph.

7 Approximation An algorithm which returns an answer C which is “close” to the optimal solution C* is called an approximation algorithm. “Closeness” is usually measured by the ratio bound ρ(n) the algorithm produces: a function satisfying, for any input size n, max{C/C*, C*/C} ≤ ρ(n).

8 VERTEX-COVER
Instance: an undirected graph G=(V,E).
Problem: find a set C ⊆ V of minimal size s.t. for any (u,v) ∈ E, either u ∈ C or v ∈ C.
Example: (figure omitted)

9 Minimum VC is NP-hard
Proof: It is enough to show that the decision problem below is NP-Complete:
Instance: an undirected graph G=(V,E) and a number k.
Problem: to decide if there exists a set V′ ⊆ V of size k s.t. for any (u,v) ∈ E, u ∈ V′ or v ∈ V′.
This follows immediately from the following observation.

10 Minimum VC is NP-hard
Observation: Let G=(V,E) be an undirected graph. The complement V\C of a vertex cover C is an independent set of G.
Proof: Two vertices outside a vertex cover cannot be connected by an edge.

11 VC - Approximation Algorithm
C ← ∅
E′ ← E
while E′ ≠ ∅ do
  let (u,v) be an arbitrary edge of E′
  C ← C ∪ {u,v}
  remove from E′ every edge incident to either u or v
return C
COR(B) 523-524

12 Demo Compare this cover to the one from the example.

13 Polynomial Time
C ← ∅                                                   O(1)
E′ ← E                                                  O(n²)
while E′ ≠ ∅ do
  let (u,v) be an arbitrary edge of E′
  C ← C ∪ {u,v}                                         O(1)
  remove from E′ every edge incident to either u or v   O(n)
return C
Overall: O(n²), where n = |V|.
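
The algorithm above translates almost line for line into Python. A minimal sketch (the edge-list representation and the function name are choices of this sketch, not from the slides):

    # 2-approximation for VERTEX-COVER; edges is an iterable of (u, v) pairs.
    def approx_vertex_cover(edges):
        cover = set()
        remaining = set(edges)            # E' <- E
        while remaining:                  # while E' is not empty
            u, v = next(iter(remaining))  # an arbitrary edge of E'
            cover.update((u, v))          # C <- C U {u, v}
            # remove from E' every edge incident to either u or v
            remaining = {e for e in remaining if u not in e and v not in e}
        return cover

    # Example: a path a-b-c-d; the optimum cover {b, c} has size 2,
    # so the returned cover has size at most 4.
    print(approx_vertex_cover([("a", "b"), ("b", "c"), ("c", "d")]))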

14 Correctness The set of vertices our algorithm returns is clearly a vertex cover, since we iterate until every edge is covered.

15 How Good an Approximation is it? Observe the set of edges our algorithm chooses: no two of them share a common vertex. Any vertex cover must contain at least 1 vertex of each such edge, while our cover contains both, hence our cover is at most twice as large as the optimal one.

16 The Traveling Salesman Problem

17 The Mission: A Tour Around the World

18 The Problem: Traveling Costs Money (figure: a $1795 fare)

19 Introduction
Objectives:
–To explore the Traveling Salesman Problem.
Overview:
–TSP: Formal definition & Examples
–TSP is NP-hard
–Approximation algorithm for special cases
–Inapproximability result

20 TSP
Instance: a complete weighted undirected graph G=(V,E) (all weights are non-negative).
Problem: to find a Hamiltonian cycle of minimal cost.
(figure: an example graph with edge weights)

21 Polynomial Algorithm for TSP? What about the greedy strategy: at any point, choose the closest vertex not explored yet?

22 The Greedy $trategy Fails (figure: a weighted counterexample graph on which the greedy tour is expensive)

23 The Greedy $trategy Fails (figure: the same counterexample, continued)

24 TSP is NP-hard
The corresponding decision problem:
Instance: a complete weighted undirected graph G=(V,E) and a number k.
Problem: to decide if there exists a Hamiltonian cycle whose cost is at most k.

25 TSP is NP-hard
Theorem: HAM-CYCLE ≤p TSP.
Proof: By the straightforward efficient reduction illustrated below: edges of G get weight 0, non-edges get weight 1, and k=0. (verify!)

26 What Next? We’ll show an approximation algorithm for TSP, which yields a ratio bound of 2 for cost functions which satisfy a certain property.

27 The Triangle Inequality Definition: We’ll say the cost function c satisfies the triangle inequality if ∀u,v,w ∈ V: c(u,v) + c(v,w) ≥ c(u,w).
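
For intuition, here is a tiny Python check of this definition on a cost matrix (the matrix representation and function name are assumptions of this sketch):

    # Check whether a symmetric cost matrix satisfies the triangle inequality.
    def satisfies_triangle_inequality(c):
        n = len(c)
        return all(c[u][v] + c[v][w] >= c[u][w]
                   for u in range(n) for v in range(n) for w in range(n))

    print(satisfies_triangle_inequality([[0, 1, 2], [1, 0, 1], [2, 1, 0]]))  # True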

28 Approximation Algorithm
1. Grow a Minimum Spanning Tree (MST) for G.
2. Return the cycle resulting from a preorder walk on that tree.
COR(B) 525-527
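
A self-contained Python sketch of this algorithm. The slides only prescribe "MST + preorder walk"; the use of Prim's algorithm and the distance-matrix input format are implementation choices of this sketch:

    # 2-approximation for TSP when dist satisfies the triangle inequality.
    # dist: symmetric matrix of non-negative costs, dist[u][v] = c(u, v).
    def approx_metric_tsp(dist):
        n = len(dist)
        # Step 1: grow a Minimum Spanning Tree from vertex 0 (Prim's algorithm).
        in_tree = [False] * n
        parent = [0] * n
        best = [float("inf")] * n    # cheapest known edge into the growing tree
        best[0] = 0
        children = [[] for _ in range(n)]
        for _ in range(n):
            u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
            in_tree[u] = True
            if u != 0:
                children[parent[u]].append(u)
            for v in range(n):
                if not in_tree[v] and dist[u][v] < best[v]:
                    best[v], parent[v] = dist[u][v], u
        # Step 2: a preorder walk of that tree gives the tour.
        tour, stack = [], [0]
        while stack:
            u = stack.pop()
            tour.append(u)
            stack.extend(reversed(children[u]))
        return tour   # visit in this order, then return to tour[0]

    # Example: four points on a line (a metric); the tour costs at most 2 * OPT.
    print(approx_metric_tsp([[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]))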

29 Demonstration and Analysis The cost of a minimal Hamiltonian cycle ≥ the cost of an MST (deleting any edge from the cycle yields a spanning tree).

30 Demonstration and Analysis The cost of a preorder walk is twice the cost of the tree (every tree edge is traversed exactly twice).

31 Demonstration and Analysis Due to the triangle inequality, the Hamiltonian cycle obtained by shortcutting the walk is not worse.

32 What About the General Case? We’ll show TSP cannot be approximated within any constant factor ρ ≥ 1 (unless P=NP), by showing the corresponding gap version is NP-hard. COR(B) 528

33 gap-TSP[ρ]
Instance: a complete weighted undirected graph G=(V,E).
Problem: to distinguish between the following two cases:
–There exists a Hamiltonian cycle whose cost is at most |V|.
–The cost of every Hamiltonian cycle is more than ρ·|V|.

34 Instances (figure: example instances placed on the number line of minimum tour cost, relative to |V| and ρ·|V|)

35 What Should an Algorithm for gap-TSP Return? If the minimum cost is at most |V|: YES! If it is more than ρ·|V|: NO! On instances falling in the gap between them: DON’T-CARE...

36 gap-TSP & Approximation Observation: an efficient approximation of factor ρ for TSP implies an efficient algorithm for gap-TSP[ρ] (run the approximation and answer YES iff the tour it returns costs at most ρ·|V|).

37 gap-TSP is NP-hard
Theorem: For any constant ρ ≥ 1, HAM-CYCLE ≤p gap-TSP[ρ].
Proof Idea: Edges from G cost 1. Other edges cost much more.

38 The Reduction Illustrated
(figure: HAM-CYCLE → gap-TSP; edges of G get cost 1, all other edges get cost ρ·|V|+1)
Verify (a) correctness (b) efficiency.

39 Approximating TSP is NP-hard
gap-TSP[ρ] is NP-hard ⇒ approximating TSP within factor ρ is NP-hard.

40 Summary We’ve studied the Traveling Salesman Problem (TSP). We’ve seen it is NP-hard. Nevertheless, when the cost function satisfies the triangle inequality, there exists an approximation algorithm with ratio bound 2.

41 Summary For the general case we’ve proven there is probably no efficient approximation algorithm for TSP. Moreover, we’ve demonstrated a generic method for showing approximation problems are NP-hard.

42 SET-COVER
Instance: a finite set X and a family F of subsets of X, such that every element of X belongs to some set in F (i.e., ∪_{S∈F} S = X).
Problem: to find a set C ⊆ F of minimal size which covers X, i.e., ∪_{S∈C} S = X.

43 SET-COVER: Example (figure omitted)

44 SET-COVER is NP-Hard Proof: Observe the corresponding decision problem. Clearly, it’s in NP (Check!). We’ll sketch a reduction from (decision) VERTEX-COVER to it:

45 VERTEX-COVER ≤p SET-COVER
(figure: VC → SC; one element for every edge, and one set for every vertex, containing the edges it covers)

46 Greedy Algorithm
C ← ∅
U ← X
while U ≠ ∅ do
  select S ∈ F that maximizes |S ∩ U|
  C ← C ∪ {S}
  U ← U − S
return C
The loop iterates at most min{|X|, |F|} times and each iteration costs O(|F|·|X|).
COR(B) 530-533
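
A direct Python rendering of the greedy algorithm (the set-based representation and names are choices of this sketch):

    # Greedy SET-COVER: X is an iterable of elements, F a list of sets.
    def greedy_set_cover(X, F):
        uncovered = set(X)                                # U <- X
        cover = []                                        # C <- empty
        while uncovered:
            S = max(F, key=lambda s: len(s & uncovered))  # maximizes |S ∩ U|
            if not (S & uncovered):
                raise ValueError("F does not cover X")
            cover.append(S)                               # C <- C U {S}
            uncovered -= S                                # U <- U - S
        return cover

    # Example:
    print(greedy_set_cover(range(1, 7), [{1, 2, 3}, {3, 4}, {4, 5, 6}, {2, 4, 6}]))
    # -> [{1, 2, 3}, {4, 5, 6}]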

47 Demonstration (figure: the greedy algorithm step by step; compare to the optimal cover)

48 Is Being Greedy Worthwhile? How Do We Proceed From Here? We can easily bound the approximation ratio by log n. A more careful analysis yields a tight bound of ln n.

49 Loose Ratio-Bound
Claim: If ∃ a cover of size k, then after k iterations the algorithm has covered at least ½ of the elements.
Proof: Suppose it doesn’t, and observe the situation after k iterations: more than ½ of the n elements remain uncovered.

50 Loose Ratio-Bound Since the uncovered part can also be covered by the k sets of the assumed cover...

51 Loose Ratio-Bound ...there must be a set, not chosen yet, which covers at least (½n)·(1/k) of the elements.

52 Loose Ratio-Bound Thus in each of the k iterations we’ve covered at least (½n)·(1/k) new elements, i.e., at least ½n in total, contradicting our assumption, and the claim is proven!

53 Loose Ratio-Bound By the claim, every k iterations cover at least half of the still-uncovered elements. Therefore after k·log n iterations (i.e., after choosing k·log n sets) all the n elements must be covered, and the bound is proved.

54 Tight Ratio-Bound Claim: The greedy algorithm approximates the optimal set-cover within factor H(max{|S| : S ∈ F}), where H(d) is the d-th harmonic number: H(d) = 1 + 1/2 + ... + 1/d = Σ_{i=1}^{d} 1/i.

55 Tight Ratio-Bound (figure omitted)

56 Claim’s Proof Whenever the algorithm chooses a set, charge it 1 and split the cost between all newly covered elements. (figure: e.g., a set covering 5 new elements charges 0.2 to each)

57 Analysis That is, we charge every element x ∈ X with c_x = 1/|S_i − (S_1 ∪ ... ∪ S_{i−1})|, where S_i is the first set which covers x.

58 Lemma
Lemma: For every S ∈ F, Σ_{x∈S} c_x ≤ H(|S|).
Proof: Fix an S ∈ F. For any i, define u_i = |S − (S_1 ∪ ... ∪ S_i)|, the number of members of S left uncovered after i iterations. Let k be the smallest index for which u_k = 0. For every 1 ≤ i ≤ k, S_i covers u_{i−1} − u_i elements from S.

59 Lemma
This last observation yields:
Σ_{x∈S} c_x = Σ_{i=1}^{k} (u_{i−1} − u_i) · 1/|S_i − (S_1 ∪ ... ∪ S_{i−1})|
  (since for any 1 ≤ i ≤ |C| we defined u_i as |S − (S_1 ∪ ... ∪ S_i)|)
≤ Σ_{i=1}^{k} (u_{i−1} − u_i) · 1/u_{i−1}
  (our greedy strategy promises that S_i (1 ≤ i ≤ k) covers at least as many new elements as S)
≤ Σ_{i=1}^{k} (H(u_{i−1}) − H(u_i))
  (for any b > a in ℕ, H(b) − H(a) = 1/(a+1) + ... + 1/b ≥ (b−a)·1/b)
= H(u_0) − H(u_k)   (this is a telescopic sum)
= H(|S|)   (u_0 = |S|, u_k = 0, H(0) = 0)

60 Analysis Now we can finally complete our analysis:
|C| = Σ_{x∈X} c_x ≤ Σ_{S∈C*} Σ_{x∈S} c_x ≤ Σ_{S∈C*} H(|S|) ≤ |C*| · H(max{|S| : S ∈ F})

61 Summary As it turns out, we can sometimes find efficient approximation algorithms for NP-hard problems. We’ve seen two such algorithms:
–for VERTEX-COVER (factor 2)
–for SET-COVER (logarithmic factor).

62 The Subset Sum Problem
Problem definition:
–Given a finite set S and a target t, find a subset S′ ⊆ S whose elements sum to t.
All possible sums:
–S = {x_1, x_2, ..., x_n}
–L_i = set of all possible sums of {x_1, x_2, ..., x_i}
–L_i = L_{i−1} ∪ (L_{i−1} + x_i)
Example:
–S = {1, 4, 5}
–L_1 = {0, 1}
–L_2 = {0, 1, 4, 5} = L_1 ∪ (L_1 + x_2)
–L_3 = {0, 1, 4, 5, 6, 9, 10} = L_2 ∪ (L_2 + x_3)

63 Subset Sum, revisited
Given a set S of numbers, find a subset S′ that adds up to some target number t. To find the largest possible sum that doesn’t exceed t:
T = {0}
for each x in S {
    T = union(T, x+T)    // x + T adds x to each element in the set T
    remove elements from T that exceed t
}
return largest element in T
(Aside: How should we implement T?) T can potentially double at each step, so the complexity is O(2^n).
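
The exponential-time version as runnable Python (the function name is a choice of this sketch; the example numbers are the ones used on a later slide):

    # Exact subset sum: largest achievable sum not exceeding t.
    # Worst-case exponential, since T can double at every step.
    def exact_subset_sum(S, t):
        T = {0}
        for x in S:
            T |= {x + y for y in T}       # union(T, x + T)
            T = {y for y in T if y <= t}  # remove elements that exceed t
        return max(T)

    print(exact_subset_sum([104, 102, 201, 101], 308))   # -> 307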

64 Trimming To reduce the size of the set T at each stage, we apply a trimming process: if z and y are consecutive elements with z < y and (1−δ)y ≤ z, then y is removed (it is represented by z). If δ = 0.1: {10,11,12,15,20,21,22,23,24,29} → {10,12,15,20,23,29}

65 Subset Sum with Trimming
Incorporate trimming in the previous algorithm (with trimming parameter δ, 0 < δ < 1/n):
T = {0}
for each x in S {
    T = union(T, x+T)
    T = trim(δ, T)
    remove elements from T that exceed t
}
return largest element in T
Trimming only eliminates values, it doesn’t create new ones. So the final result is still the sum of a subset of S that doesn’t exceed t.

66 At each stage, values in the trimmed T are within a factor somewhere between (1−δ) and 1 of the corresponding values in the untrimmed T. The final result (after n iterations) is within a factor somewhere between (1−δ)^n and 1 of the result produced by the original algorithm.

67 After trimming, the ratio between successive elements in T is at least 1/(1−δ), and all of the values are between 0 and t. Hence the maximum number of elements in T is log_{1/(1−δ)} t ≈ (log t)/δ. This is enough to give us a polynomial bound on the running time of the algorithm.

68 Subset Sum – Trim
Want to reduce the size of a list by “trimming”:
–L: an original list
–L′: the list after trimming L
–d: trimming parameter, d ∈ [0,1]
–y: an element that is removed from L
–z: corresponding (representing) element in L′ (also in L)
–(y−z)/y ≤ d, i.e., (1−d)y ≤ z ≤ y
Example:
–L = {10, 11, 12, 15, 20, 21, 22, 23, 24, 29}
–d = 0.1
–L′ = {10, 12, 15, 20, 23, 29}
–11 is represented by 10: (11−10)/11 ≤ 0.1
–21, 22 are represented by 20: (21−20)/21 ≤ 0.1
–24 is represented by 23: (24−23)/24 ≤ 0.1

69 Subset Sum – Trim (2)
Trim(L, d)    // L: y_1, y_2, ..., y_m, sorted
1. L′ = {y_1}
2. last = y_1    // most recent element z in L′ which represents elements in L
3. for i = 2 to m do
4.   if last < (1−d)·y_i then    // y_i cannot be represented by last: (1−d)y ≤ z ≤ y fails
5.     append y_i onto the end of L′
6.     last = y_i
7. return L′
Running time: O(m).
Example:
–L = {10, 11, 12, 15, 20, 21, 22, 23, 24, 29}
–d = 0.1
–L′ = {10, 12, 15, 20, 23, 29}
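
The same procedure in Python, a direct transcription of the pseudocode above:

    # Trim a sorted list L with parameter d: drop any y that is
    # represented by the last kept element z, i.e. (1-d)y <= z <= y.
    def trim(L, d):
        trimmed = [L[0]]
        last = L[0]
        for y in L[1:]:
            if last < (1 - d) * y:   # y cannot be represented by last
                trimmed.append(y)
                last = y
        return trimmed

    print(trim([10, 11, 12, 15, 20, 21, 22, 23, 24, 29], 0.1))
    # -> [10, 12, 15, 20, 23, 29]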

70 Subset Sum – Approximate Algorithm
Approx_subset_sum(S, t, e)    // S = x_1, x_2, ..., x_n
1. L_0 = {0}
2. for i = 1 to n do
3.   L_i = L_{i−1} ∪ (L_{i−1} + x_i)
4.   L_i = Trim(L_i, e/n)
5.   remove elements that are greater than t from L_i
6. return the largest element in L_n
Example:
–S = {104, 102, 201, 101}, t = 308, e = 0.20, d = e/n = 0.05
–L_0 = {0}
–L_1 = {0, 104}
–L_2 = {0, 102, 104, 206}; after trimming 104: L_2 = {0, 102, 206}
–L_3 = {0, 102, 201, 206, 303, 407}; after trimming 206: L_3 = {0, 102, 201, 303, 407}; after removing 407: L_3 = {0, 102, 201, 303}
–L_4 = {0, 101, 102, 201, 203, 302, 303, 404}; after trimming 102, 203, 303: L_4 = {0, 101, 201, 302, 404}; after removing 404: L_4 = {0, 101, 201, 302}
–Return 302 (= 201 + 101)
The optimal answer is 104 + 102 + 101 = 307.
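
Putting the pieces together in Python (this sketch reuses the trim function from the previous sketch); running it on the slide’s example reproduces the trace above:

    # FPTAS for subset sum: returns z with (1 - e) * OPT <= z <= t.
    def approx_subset_sum(S, t, e):
        n = len(S)
        L = [0]
        for x in S:
            L = sorted(set(L) | {x + y for y in L})  # L_i = L_{i-1} U (L_{i-1} + x_i)
            L = trim(L, e / n)                       # Trim(L_i, e/n)
            L = [y for y in L if y <= t]             # drop elements greater than t
        return max(L)

    print(approx_subset_sum([104, 102, 201, 101], 308, 0.20))   # -> 302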

71 Subset Sum – Correctness
The approximate solution C is not smaller than (1−e) times an optimal solution C*, i.e., (1−e)·C* ≤ C ≤ C*.
Proof:
–For every element y in L there is a z in L′ such that (1−e/n)·y ≤ z ≤ y.
–For every element y in L_i there is a z in L′_i such that (1−e/n)^i·y ≤ z ≤ y.
–If y* is an optimal solution in L_n, then there is a corresponding z in L′_n with (1−e/n)^n·y* ≤ z ≤ y*.
–Since (1−e) ≤ (1−e/n)^n [(1−e/n)^n is increasing in n]:
(1−e)·y* ≤ (1−e/n)^n·y* ≤ z
–So the value z returned is not smaller than (1−e) times the optimal solution y*.
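
The key step is the inequality (1−e) ≤ (1−e/n)^n, a consequence of Bernoulli’s inequality; a quick numeric sanity check (illustrative only, not part of the proof):

    # Spot-check (1 - e) <= (1 - e/n)^n for a few sample values.
    for n in (1, 2, 10, 1000):
        for e in (0.01, 0.2, 0.9):
            assert (1 - e) <= (1 - e / n) ** n
    print("inequality holds on all samples")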

72 Subset Sum – Correctness (2)
The approximation algorithm is fully polynomial.
Proof:
–Successive elements z and z′ in L′_i must satisfy z′/z > 1/(1−e/n), i.e., they differ by a factor of more than 1/(1−e/n).
–The number of elements in each L_i is at most
log_{1/(1−e/n)} t = (ln t) / (−ln(1−e/n)) ≤ (ln t) / (e/n) = (n ln t) / e
[t is the largest possible value; Eq. 2.10: x/(1+x) ≤ ln(1+x) ≤ x, for x > −1]
–So the length of each L_i is polynomial.
–So the running time of the algorithm is polynomial.

73 Summary Not all problems are computable. Some problems can be solved in polynomial time (P). Some problems can be verified in polynomial time (NP). Nobody knows whether P=NP. But the existence of NP-complete problems is often taken as an indication that P ≠ NP. In the meantime, we use approximation to find “good-enough” solutions to hard problems.

74 What’s Next? But where can we draw the line? Does every NP-hard problem have an approximation? And to within which factor? Can approximation be NP-hard as well?

