Chapter 11 Approximation Algorithms
Slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved.
Approximation Algorithms

Q. Suppose I need to solve an NP-hard problem. What should I do?
A. Theory says you're unlikely to find a poly-time algorithm. Must sacrifice one of three desired features.
- Solve problem to optimality.
- Solve problem in poly-time.
- Solve arbitrary instances of the problem.

ρ-approximation algorithm.
- Guaranteed to run in poly-time.
- Guaranteed to solve arbitrary instance of the problem.
- Guaranteed to find solution within ratio ρ of true optimum.

Challenge. Need to prove a solution's value is close to optimum, without even knowing what optimum value is!
11.1 Load Balancing
Load Balancing

Input. m identical machines; n jobs, job j has processing time t_j.
- Job j must run contiguously on one machine.
- A machine can process at most one job at a time.

Def. Let J(i) be the subset of jobs assigned to machine i. The load of machine i is L_i = Σ_{j ∈ J(i)} t_j.

Def. The makespan is the maximum load on any machine: L = max_i L_i.

Load balancing. Assign each job to a machine to minimize makespan.
Load Balancing: List Scheduling

List-scheduling algorithm.
- Consider n jobs in some fixed order.
- Assign job j to machine whose load is smallest so far.

Implementation. O(n log n) using a priority queue.

List-Scheduling(m, n, t_1, t_2, …, t_n) {
   for i = 1 to m {
      L_i ← 0                      (load on machine i)
      J(i) ← ∅                     (jobs assigned to machine i)
   }
   for j = 1 to n {
      i = argmin_k L_k             (machine i has smallest load)
      J(i) ← J(i) ∪ {j}            (assign job j to machine i)
      L_i ← L_i + t_j              (update load of machine i)
   }
}
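For concreteness, here is a minimal runnable sketch of list scheduling in Python (my code, not from the slides), using a heap as the priority queue; the job order is whatever order times arrives in:

import heapq

def list_scheduling(m, times):
    """Greedy list scheduling: assign each job to the currently least-loaded machine.
    Returns (makespan, assignment) where assignment[i] lists the jobs on machine i."""
    heap = [(0, i) for i in range(m)]   # (load, machine) pairs
    heapq.heapify(heap)
    assignment = [[] for _ in range(m)]
    for j, t in enumerate(times):
        load, i = heapq.heappop(heap)   # machine with smallest load
        assignment[i].append(j)
        heapq.heappush(heap, (load + t, i))
    return max(load for load, _ in heap), assignment

# The tight instance below with m = 3: six unit jobs, then one job of length 3.
makespan, _ = list_scheduling(3, [1] * 6 + [3])
print(makespan)   # 5, while the optimal makespan is 3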
Load Balancing: List Scheduling Analysis

Theorem. [Graham, 1966] Greedy algorithm is a 2-approximation.
- First worst-case analysis of an approximation algorithm.
- Need to compare resulting solution with optimal makespan L*.

Lemma 1. The optimal makespan L* ≥ max_j t_j.
Pf. Some machine must process the most time-consuming job. ▪

Lemma 2. The optimal makespan L* ≥ (1/m) Σ_j t_j.
Pf.
- The total processing time is Σ_j t_j.
- One of m machines must do at least a 1/m fraction of total work. ▪
Load Balancing: List Scheduling Analysis

Theorem. Greedy algorithm is a 2-approximation.
Pf. Consider load L_i of bottleneck machine i.
- Let j be last job scheduled on machine i.
- When job j was assigned to machine i, i had smallest load. Its load before assignment was L_i − t_j, so L_i − t_j ≤ L_k for all 1 ≤ k ≤ m.

[Figure: the schedule on machine i; the blue jobs scheduled before j fill L_i − t_j time, and L = L_i is the makespan.]
Load Balancing: List Scheduling Analysis

Theorem. Greedy algorithm is a 2-approximation.
Pf. Consider load L_i of bottleneck machine i.
- Let j be last job scheduled on machine i.
- When job j was assigned to machine i, i had smallest load. Its load before assignment was L_i − t_j, so L_i − t_j ≤ L_k for all 1 ≤ k ≤ m.
- Sum these inequalities over all k and divide by m:
     L_i − t_j ≤ (1/m) Σ_k L_k = (1/m) Σ_k t_k ≤ L*   (Lemma 2)
- Now L_i = (L_i − t_j) + t_j ≤ L* + L* = 2L*, since t_j ≤ L* by Lemma 1. ▪
Load Balancing: List Scheduling Analysis

Q. Is our analysis tight?
A. Essentially yes.

Ex: m machines, m(m−1) jobs of length 1, plus one job of length m.

[Figure: m = 10. List scheduling spreads the unit jobs evenly, then the length-10 job lands on top of one machine while machines 2–10 sit idle at the end; list scheduling makespan = 19.]
Load Balancing: List Scheduling Analysis

Q. Is our analysis tight?
A. Essentially yes.

Ex: m machines, m(m−1) jobs of length 1, plus one job of length m.

[Figure: m = 10. The optimal schedule runs the length-10 job on its own machine; optimal makespan = 10.]
Load Balancing: LPT Rule

Longest processing time (LPT). Sort n jobs in descending order of processing time, and then run the list scheduling algorithm.

LPT-List-Scheduling(m, n, t_1, t_2, …, t_n) {
   Sort jobs so that t_1 ≥ t_2 ≥ … ≥ t_n
   for i = 1 to m {
      L_i ← 0                      (load on machine i)
      J(i) ← ∅                     (jobs assigned to machine i)
   }
   for j = 1 to n {
      i = argmin_k L_k             (machine i has smallest load)
      J(i) ← J(i) ∪ {j}            (assign job j to machine i)
      L_i ← L_i + t_j              (update load of machine i)
   }
}
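LPT is a one-line change on top of the list_scheduling sketch above: sort before scheduling. (Again an illustrative sketch, not from the slides.)

def lpt_scheduling(m, times):
    # Longest processing time first: sort descending, then run list scheduling.
    return list_scheduling(m, sorted(times, reverse=True))

makespan, _ = lpt_scheduling(3, [1] * 6 + [3])
print(makespan)   # 3: sorting first fixes the bad instance above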
Load Balancing: LPT Rule

Observation. If at most m jobs, then list-scheduling is optimal.
Pf. Each job put on its own machine. ▪

Lemma 3. If there are more than m jobs, L* ≥ 2 t_{m+1}.
Pf.
- Consider first m+1 jobs t_1, …, t_{m+1}.
- Since the t_i's are in descending order, each takes at least t_{m+1} time.
- There are m+1 jobs and m machines, so by the pigeonhole principle, at least one machine gets two jobs. ▪

Theorem. LPT rule is a 3/2-approximation algorithm.
Pf. Same basic approach as for list scheduling. By the observation, we can assume the number of jobs is > m, so the last job j on the bottleneck machine has t_j ≤ t_{m+1} ≤ ½ L* by Lemma 3, and hence L_i = (L_i − t_j) + t_j ≤ L* + ½ L* = (3/2) L*. ▪
Load Balancing: LPT Rule

Q. Is our 3/2 analysis tight?
A. No.

Theorem. [Graham, 1969] LPT rule is a 4/3-approximation.
Pf. More sophisticated analysis of same algorithm.

Q. Is Graham's 4/3 analysis tight?
A. Essentially yes.

Ex: m machines, n = 2m+1 jobs: two jobs each of length m, m+1, …, 2m−1, plus one extra job of length m.
11.2 Center Selection
Center Selection Problem

Input. Set of n sites s_1, …, s_n.

Center selection problem. Select k centers C so that maximum distance from a site to nearest center is minimized.

[Figure: sites in the plane, k = 4 chosen centers, and the covering radius r(C).]
Center Selection Problem

Input. Set of n sites s_1, …, s_n.

Center selection problem. Select k centers C so that maximum distance from a site to nearest center is minimized.

Notation.
- dist(x, y) = distance between x and y.
- dist(s_i, C) = min_{c ∈ C} dist(s_i, c) = distance from s_i to closest center.
- r(C) = max_i dist(s_i, C) = smallest covering radius.

Goal. Find set of centers C that minimizes r(C), subject to |C| = k.

Distance function properties.
- dist(x, x) = 0   (identity)
- dist(x, y) = dist(y, x)   (symmetry)
- dist(x, y) ≤ dist(x, z) + dist(z, y)   (triangle inequality)
Center Selection Example

Ex: each site is a point in the plane, a center can be any point in the plane, dist(x, y) = Euclidean distance.

Remark: search can be infinite!

[Figure: sites and centers in the plane with covering radius r(C).]
Greedy Algorithm: A False Start

Greedy algorithm. Put the first center at the best possible location for a single center, and then keep adding centers so as to reduce the covering radius each time by as much as possible.

Remark: arbitrarily bad!

[Figure: k = 2 centers; the greedy first center lands between two clusters of sites, so the result is arbitrarily worse than placing one center in each cluster.]
Center Selection: Greedy Algorithm

Greedy algorithm. Repeatedly choose the next center to be the site farthest from any existing center.

Greedy-Center-Selection(k, n, s_1, s_2, …, s_n) {
   Select any site s and let C = {s}
   repeat k−1 times {
      Select a site s_i with maximum dist(s_i, C)   (site farthest from any center)
      Add s_i to C
   }
   return C
}

Observation. Upon termination all centers in C are pairwise at least r(C) apart.
Pf. By construction of algorithm. ▪
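A compact runnable sketch of the farthest-point greedy (mine, not the slides'), with Euclidean distance standing in for the metric:

import math

def greedy_center_selection(k, sites):
    """Farthest-point greedy: repeatedly add the site farthest from the chosen centers."""
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    centers = [sites[0]]                        # any site works as the first center
    # d[i] = distance from site i to its closest chosen center
    d = [dist(s, centers[0]) for s in sites]
    for _ in range(k - 1):
        i = max(range(len(sites)), key=lambda i: d[i])   # farthest site
        centers.append(sites[i])
        d = [min(d[j], dist(sites[j], sites[i])) for j in range(len(sites))]
    return centers, max(d)                      # centers and covering radius r(C)

centers, r = greedy_center_selection(2, [(0, 0), (1, 0), (10, 0), (11, 0)])
print(centers, r)   # one center per cluster, r = 1.0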
Center Selection: Analysis of Greedy Algorithm

Theorem. Let C* be an optimal set of centers. Then r(C) ≤ 2r(C*).
Pf. (by contradiction) Assume r(C*) < ½ r(C).
- For each site c_i in C, consider the ball of radius ½ r(C) around it.
- These balls are disjoint (the centers in C are pairwise ≥ r(C) apart), and there is exactly one optimal center c_i* in each ball; let c_i be the site paired with c_i*.
- Consider any site s and its closest center c_i* in C*.
- dist(s, C) ≤ dist(s, c_i) ≤ dist(s, c_i*) + dist(c_i*, c_i) ≤ 2r(C*), by the triangle inequality; each of the two terms is ≤ r(C*) since c_i* is the closest optimal center to s and to c_i, respectively.
- Thus r(C) ≤ 2r(C*), contradicting the assumption. ▪
Center Selection

Theorem. Let C* be an optimal set of centers. Then r(C) ≤ 2r(C*).

Theorem. Greedy algorithm is a 2-approximation for the center selection problem.

Remark. Greedy algorithm always places centers at sites, but is still within a factor of 2 of the best solution that is allowed to place centers anywhere (e.g., points in the plane).

Question. Is there hope of a 3/2-approximation? 4/3?

Theorem. Unless P = NP, there is no ρ-approximation for the center-selection problem for any ρ < 2.
11.6 LP Rounding: Vertex Cover
Weighted Vertex Cover

Weighted vertex cover. Given an undirected graph G = (V, E) with vertex weights w_i ≥ 0, find a minimum weight subset of nodes S such that every edge is incident to at least one vertex in S.

[Figure: a weighted graph on vertices A–J; the highlighted cover has total weight 55.]
Weighted Vertex Cover: IP Formulation

Weighted vertex cover. Given an undirected graph G = (V, E) with vertex weights w_i ≥ 0, find a minimum weight subset of nodes S such that every edge is incident to at least one vertex in S.

Integer programming formulation.
- Model inclusion of each vertex i using a 0/1 variable x_i. Vertex covers are in 1-1 correspondence with 0/1 assignments: S = {i ∈ V : x_i = 1}.
- Objective function: minimize Σ_i w_i x_i.
- Must take either i or j: x_i + x_j ≥ 1.
Weighted Vertex Cover: IP Formulation

Weighted vertex cover. Integer programming formulation.

Observation. If x* is an optimal solution to (ILP), then S = {i ∈ V : x*_i = 1} is a min weight vertex cover.
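For reference, (ILP) written out in full (this is the standard weighted vertex cover formulation assembled from the bullets above):

(ILP)   min   Σ_{i ∈ V} w_i x_i
        s.t.  x_i + x_j ≥ 1   for all (i, j) ∈ E
              x_i ∈ {0, 1}    for all i ∈ V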
Integer Programming

INTEGER-PROGRAMMING. Given integers a_ij and b_i, find integers x_j that satisfy:

   Σ_j a_ij x_j ≥ b_i   for 1 ≤ i ≤ m
   x_j ≥ 0 and integral  for 1 ≤ j ≤ n

Observation. The vertex cover formulation proves that integer programming is an NP-hard search problem, even if all coefficients are 0/1 and at most two variables appear per inequality.
Linear Programming

Linear programming. Max/min linear objective function subject to linear inequalities.
- Input: integers c_j, b_i, a_ij.
- Output: real numbers x_j.

Linear. No x², xy, arccos(x), x(1−x), etc.

Simplex algorithm. [Dantzig 1947] Can solve LP in practice.
Ellipsoid algorithm. [Khachian 1979] Can solve LP in poly-time.
LP Feasible Region

LP geometry in 2D.

[Figure: feasible region bounded by the lines x_1 + 2x_2 = 6, 2x_1 + x_2 = 6, x_1 = 0, and x_2 = 0.]
Weighted Vertex Cover: LP Relaxation

Weighted vertex cover. Linear programming formulation:

(LP)    min   Σ_{i ∈ V} w_i x_i
        s.t.  x_i + x_j ≥ 1   for all (i, j) ∈ E
              x_i ≥ 0         for all i ∈ V

Observation. Optimal value of (LP) is ≤ optimal value of (ILP).
Pf. LP has fewer constraints.

Note. LP is not equivalent to vertex cover: on a triangle, x_i = ½ for every vertex is feasible with value 3/2, yet no integral solution achieves that value.

Q. How can solving LP help us find a small vertex cover?
A. Solve LP and round fractional values.
Weighted Vertex Cover

Theorem. If x* is an optimal solution to (LP), then S = {i ∈ V : x*_i ≥ ½} is a vertex cover whose weight is at most twice the min possible weight.

Pf. [S is a vertex cover]
- Consider an edge (i, j) ∈ E.
- Since x*_i + x*_j ≥ 1, either x*_i ≥ ½ or x*_j ≥ ½ ⇒ (i, j) covered.

Pf. [S has desired cost]
- Let S* be an optimal vertex cover. Then
     Σ_{i ∈ S*} w_i ≥ Σ_{i ∈ V} w_i x*_i ≥ Σ_{i ∈ S} w_i x*_i ≥ ½ Σ_{i ∈ S} w_i,
  where the first inequality holds because LP is a relaxation, and the last because x*_i ≥ ½ for all i ∈ S. ▪
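A sketch of this rounding algorithm using scipy's LP solver (scipy is an assumption of this sketch; any LP solver works):

from scipy.optimize import linprog

def lp_rounding_vertex_cover(n, edges, weights):
    """Solve the LP relaxation of weighted vertex cover, then round x*_i >= 1/2 up."""
    # x_i + x_j >= 1 becomes -x_i - x_j <= -1 in linprog's A_ub @ x <= b_ub form.
    A_ub = []
    for (i, j) in edges:
        row = [0.0] * n
        row[i] = row[j] = -1.0
        A_ub.append(row)
    b_ub = [-1.0] * len(edges)
    res = linprog(c=weights, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n)
    return [i for i in range(n) if res.x[i] >= 0.5 - 1e-9]   # tolerance for float noise

cover = lp_rounding_vertex_cover(3, [(0, 1), (1, 2), (0, 2)], [1.0, 1.0, 1.0])
print(cover)   # [0, 1, 2] on the triangle (LP optimum all-1/2); optimum is 2 vertices,
               # so the output is within the factor-2 guarantee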
Weighted Vertex Cover and Bipartite Vertex Cover

Weighted vertex cover. Linear programming formulation.

Observation. Replace each variable by two copies, and each constraint by two constraints:
   x_i ← (x_iL + x_iR) / 2
   x_i + x_j ≥ 1  ←  x_iL + x_jR ≥ 1 and x_iR + x_jL ≥ 1

The new LP represents vertex cover in a bipartite graph (which can be solved integrally using network flow algorithms). The LP solution gives a half-integral solution: x_i ∈ {0, ½, 1}.
Weighted Vertex Cover

Theorem. The rounding algorithm is a 2-approximation algorithm for weighted vertex cover.

Theorem. [Dinur-Safra 2001] If P ≠ NP, then there is no ρ-approximation for ρ < 1.3607 (= 10√5 − 21), even with unit weights.

Open research problem. Close the gap.
11.8 Knapsack Problem
Polynomial Time Approximation Scheme

PTAS. (1 + ε)-approximation algorithm for any constant ε > 0.
- Load balancing. [Hochbaum-Shmoys 1987]
- Euclidean TSP. [Arora 1996]

Consequence. PTAS produces arbitrarily high quality solution, but trades off accuracy for time.

This section. PTAS for knapsack problem via rounding and scaling.
Knapsack Problem

Knapsack problem.
- Given n objects and a "knapsack."
- Item i has value v_i > 0 and weighs w_i > 0 (we'll assume w_i ≤ W).
- Knapsack can carry weight up to W.
- Goal: fill knapsack so as to maximize total value.

Ex: W = 11; the subset { 3, 4 } has value 40.

Item   Value   Weight
  1       1       1
  2       6       2
  3      18       5
  4      22       6
  5      28       7
Knapsack is NP-Complete

KNAPSACK: Given a finite set X, nonnegative weights w_i, nonnegative values v_i, a weight limit W, and a target value V, is there a subset S ⊆ X such that
   Σ_{i ∈ S} w_i ≤ W  and  Σ_{i ∈ S} v_i ≥ V ?

SUBSET-SUM: Given a finite set X, nonnegative values u_i, and an integer U, is there a subset S ⊆ X whose elements sum to exactly U?

Claim. SUBSET-SUM ≤_P KNAPSACK.
Pf. Given instance (u_1, …, u_n, U) of SUBSET-SUM, create the KNAPSACK instance with v_i = w_i = u_i and V = W = U: then Σ_{i ∈ S} u_i ≤ U and Σ_{i ∈ S} u_i ≥ U hold together iff Σ_{i ∈ S} u_i = U. ▪
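The reduction is mechanical enough to write out as a tiny sketch (function name is mine):

def subset_sum_to_knapsack(u, U):
    """Reduction sketch: a SUBSET-SUM instance (u_1..u_n, U) becomes the KNAPSACK
    instance with v_i = w_i = u_i and V = W = U; sum(S) <= U and sum(S) >= U can
    then only hold together when sum(S) == U."""
    return {"values": list(u), "weights": list(u), "V": U, "W": U}

print(subset_sum_to_knapsack([3, 5, 7], 12))
# A subset summing to exactly 12, here {5, 7}, is exactly a feasible knapsack solution.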
Knapsack Problem: Dynamic Programming I

Def. OPT(i, w) = max value subset of items 1, …, i with weight limit w.
- Case 1: OPT does not select item i.
  OPT selects best of 1, …, i−1 using up to weight limit w.
- Case 2: OPT selects item i.
  New weight limit = w − w_i; OPT selects best of 1, …, i−1 using up to weight limit w − w_i.

   OPT(i, w) = 0                                           if i = 0
             = OPT(i−1, w)                                  if w_i > w
             = max { OPT(i−1, w), v_i + OPT(i−1, w − w_i) } otherwise

Running time. O(n W).
- W = weight limit.
- Not polynomial in input size! (W takes only log W bits to write down.)
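A runnable sketch of this DP (mine, not the slides'), indexed exactly as in the recurrence:

def knapsack_by_weight(values, weights, W):
    """O(nW) DP over weight limits: opt[i][w] = best value using items 1..i, weight <= w."""
    n = len(values)
    opt = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(W + 1):
            opt[i][w] = opt[i - 1][w]                      # Case 1: skip item i
            if weights[i - 1] <= w:                        # Case 2: take item i
                opt[i][w] = max(opt[i][w],
                                values[i - 1] + opt[i - 1][w - weights[i - 1]])
    return opt[n][W]

print(knapsack_by_weight([1, 6, 18, 22, 28], [1, 2, 5, 6, 7], 11))   # 40 (items 3 and 4)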
Knapsack Problem: Dynamic Programming II

Def. OPT(i, v) = min weight subset of items 1, …, i that yields value exactly v.
- Case 1: OPT does not select item i.
  OPT selects best of 1, …, i−1 that achieves exactly value v.
- Case 2: OPT selects item i.
  Consumes weight w_i, new value needed = v − v_i; OPT selects best of 1, …, i−1 that achieves exactly value v − v_i.

Running time. O(n V*) = O(n² v_max).
- V* = optimal value = maximum v such that OPT(n, v) ≤ W.
- Not polynomial in input size! (Note V* ≤ n v_max.)
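A runnable sketch of DP II (my code), using a 1D table over values, with the min-weight convention of the definition:

def knapsack_by_value(values, weights, W):
    """DP over values: opt[v] = min weight needed to achieve value exactly v.
    Returns the maximum value v with opt[v] <= W."""
    INF = float('inf')
    V = sum(values)                      # upper bound on achievable value
    opt = [0] + [INF] * V
    for vi, wi in zip(values, weights):
        for v in range(V, vi - 1, -1):   # iterate downward so each item is used once
            if opt[v - vi] + wi < opt[v]:
                opt[v] = opt[v - vi] + wi
    return max(v for v in range(V + 1) if opt[v] <= W)

print(knapsack_by_value([1, 6, 18, 22, 28], [1, 2, 5, 6, 7], 11))   # 40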
Knapsack: FPTAS

Intuition for approximation algorithm.
- Round all values up to lie in smaller range.
- Run dynamic programming algorithm on rounded instance.
- Return optimal items in rounded instance.

Original instance (W = 11):
Item      Value     Weight
  1         134,221     1
  2         656,342     2
  3       1,810,013     5
  4      22,217,800     6
  5      28,343,199     7

Rounded instance (W = 11):
Item   Value   Weight
  1       2       1
  2       7       2
  3      19       5
  4      23       6
  5      29       7
Knapsack: FPTAS

Knapsack FPTAS. Round up all values:
   v̄_i = ⌈v_i / θ⌉ · θ,   v̂_i = ⌈v_i / θ⌉
- v_max = largest value in original instance
- ε = precision parameter
- θ = scaling factor = ε v_max / n

Observation. Optimal solutions to the problems with values v̄ and v̂ are equivalent (the objectives differ by the constant factor θ).

Intuition. v̄ is close to v, so an optimal solution using v̄ is nearly optimal; v̂ is small and integral, so the dynamic programming algorithm is fast.

Running time. O(n³ / ε).
- Dynamic program II running time is O(n² v̂_max), where v̂_max = ⌈v_max / θ⌉ = ⌈n / ε⌉.
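A runnable end-to-end sketch of the FPTAS (my code): round with θ = ε·v_max/n as above, run the DP over rounded values, and track which items realize each value so the chosen set can be returned.

import math

def knapsack_fptas(values, weights, W, eps):
    """(1+eps)-approximation sketch: round values up via theta = eps*v_max/n,
    solve the rounded instance exactly with the DP over values, return the items."""
    n = len(values)
    theta = eps * max(values) / n
    vhat = [math.ceil(v / theta) for v in values]          # small integral values
    V = sum(vhat)
    INF = float('inf')
    opt = [0] + [INF] * V                                  # min weight to reach value v
    choice = [set() for _ in range(V + 1)]                 # items realizing opt[v]
    for i in range(n):
        for v in range(V, vhat[i] - 1, -1):                # downward: each item once
            if opt[v - vhat[i]] + weights[i] < opt[v]:
                opt[v] = opt[v - vhat[i]] + weights[i]
                choice[v] = choice[v - vhat[i]] | {i}
    best = max(v for v in range(V + 1) if opt[v] <= W)
    return choice[best], sum(values[i] for i in choice[best])

items, total = knapsack_fptas([134221, 656342, 1810013, 22217800, 28343199],
                              [1, 2, 5, 6, 7], W=11, eps=0.5)
print(sorted(items), total)   # items {0, 1, 4}, true value 29,133,762 -- here the exact optimum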
Knapsack: FPTAS

Knapsack FPTAS. Round up all values: v̄_i = ⌈v_i / θ⌉ · θ.

Theorem. If S is the solution found by our algorithm and S* is any other feasible solution, then (1 + ε) Σ_{i ∈ S} v_i ≥ Σ_{i ∈ S*} v_i.

Pf. Let S* be any feasible solution satisfying the weight constraint.
   Σ_{i ∈ S*} v_i ≤ Σ_{i ∈ S*} v̄_i       (always round up)
                 ≤ Σ_{i ∈ S} v̄_i         (solve rounded instance optimally)
                 ≤ Σ_{i ∈ S} (v_i + θ)    (never round up by more than θ)
                 ≤ Σ_{i ∈ S} v_i + nθ     (|S| ≤ n)
                 = Σ_{i ∈ S} v_i + ε v_max (nθ = ε v_max)
                 ≤ (1 + ε) Σ_{i ∈ S} v_i   (DP alg can always take the item of value v_max). ▪
The Hamiltonian Cycle Problem

Hamiltonian Cycle Problem: Given an undirected graph, find a cycle visiting every vertex exactly once. The Hamiltonian cycle problem is NP-complete.

Eulerian Tour Problem: Given an undirected graph, find a closed walk visiting every edge exactly once. Notice that in a walk some vertices may be visited more than once. The Eulerian tour problem is polynomial time solvable: a connected graph has an Eulerian tour if and only if every vertex has even degree (and an Eulerian path, which need not return to its start, if and only if at most two vertices have odd degree).
The Hamiltonian Cycle Problem

Application: seating assignment problem. Given n persons and a big round table of n seats, each pair of persons may like each other or hate each other. Can we find a seating assignment so that everyone likes his/her neighbors (only two neighbors)?

Construct a graph as follows. For each person, create a vertex. For each pair of vertices, add an edge if and only if they like each other. Then the problem of finding a seating assignment is exactly the Hamiltonian cycle problem on this graph.
The Traveling Salesman Problem

Traveling Salesman Problem (TSP): Given a complete graph with nonnegative edge costs, find a minimum cost cycle visiting every vertex exactly once.

Given a number of cities and the costs of traveling from any city to any other city, what is the cheapest round-trip route that visits each city exactly once and then returns to the starting city?

One of the most well-studied problems in combinatorial optimization.
NP-completeness of the Traveling Salesman Problem

Hamiltonian Cycle Problem: Given an undirected graph, find a cycle visiting every vertex exactly once.

Traveling Salesman Problem (TSP): Given a complete graph with nonnegative edge costs, find a minimum cost cycle visiting every vertex exactly once.

Reduction: for each edge of the graph, add an edge of cost 1; for each non-edge, add an edge of cost 2. The Hamiltonian cycle problem is then a special case of the Traveling Salesman Problem:
- If there is a Hamiltonian cycle, then there is a tour of cost n.
- If there is no Hamiltonian cycle, then every tour has cost at least n+1.
Inapproximability of the Traveling Salesman Problem

Theorem: There is no constant factor approximation algorithm for TSP, unless P = NP.

Idea: Use the Hamiltonian cycle problem. For each edge of the graph, add an edge of cost 1. For each non-edge, add an edge of cost 2nk.
- If there is a Hamiltonian cycle, then there is a tour of cost n.
- If there is no Hamiltonian cycle, then every tour has cost greater than nk.

So, given a k-approximation algorithm for TSP, one just needs to check whether the returned solution has cost at most nk. If yes, then the original graph has a Hamiltonian cycle. Otherwise, the original graph has no Hamiltonian cycle.
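A tiny illustrative constructor for this reduction (my sketch; vertices are 0…n−1 and edges is the set of graph edges):

def tsp_gap_instance(n, edges, k):
    """Build the TSP cost matrix used in the hardness proof: cost 1 on graph edges,
    a huge cost on non-edges, so any k-approximate tour separates yes from no instances."""
    big = 2 * n * k
    cost = [[0] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            c = 1 if (u, v) in edges or (v, u) in edges else big
            cost[u][v] = cost[v][u] = c
    return cost

# If the graph has a Hamiltonian cycle, the optimal tour costs n, so a k-approximation
# returns a tour of cost <= kn; otherwise every tour costs > nk. Comparing the returned
# cost against nk decides Hamiltonicity.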
Inapproximability of the Traveling Salesman Problem

Theorem: There is no constant factor approximation algorithm for TSP, unless P = NP.
- If there is a Hamiltonian cycle, then there is a tour of cost n.
- If there is no Hamiltonian cycle, then every tour has cost greater than nk.

The strategy is usually like this: create a gap between yes and no instances. The bigger the gap, the harder the problem is to approximate. This type of theorem is called a "hardness result" in the literature. Just as the name suggests, such results are usually very hard to obtain.
Approximation Algorithm for TSP

There is no constant factor approximation algorithm for TSP; what can we do then?

Observation: in real world situations, the costs satisfy the triangle inequality: a + b ≥ c for any triangle with side costs a, b, c. For example, think of the cost of an edge as the distance between two points.
Approximation Algorithm for Metric TSP

Metric Traveling Salesman Problem (metric TSP): Given a complete graph with edge costs satisfying triangle inequalities, find a minimum cost cycle visiting every vertex exactly once.

First check the bad example from the hardness proof: a triangle with two edges of cost 1 and one edge of cost nk violates the triangle inequality, so that gap construction is ruled out.

How could triangle inequalities help in finding an approximation algorithm?
Lower Bounds for TSP

What can be a good lower bound on the cost of an optimal TSP tour?

A tour contains a matching. Let OPT be the cost of an optimal tour; since a tour decomposes into two perfect matchings (alternating edges, for even n), the cost of a minimum weight perfect matching is at most OPT/2.

A tour contains a spanning tree (drop any one edge). So the cost of a minimum spanning tree is at most OPT.
Spanning Tree and TSP

Let the thick edges have cost 1, and all other edges have cost greater than 1, so the thick edges form a minimum spanning tree. But the tree doesn't look like a Hamiltonian cycle at all! Consider a Hamiltonian cycle: the edges of the cycle which are not in the minimum spanning tree might have very high costs.

Not really! Each such edge has cost at most 2 because of the triangle inequality.

[Figure: a star-shaped minimum spanning tree with unit-cost edges; each non-tree edge of the cycle has cost at most 2.]
Spanning Tree and TSP

Strategy: Construct the TSP tour from a minimum spanning tree, using the edges of the minimum spanning tree as much as possible.

[Figure: a star with tree edges of costs a, b, c, d, e. The tour hops between consecutive leaves; since the center vertex already has degree 2 in the tour, the tour skips it and goes directly to the next leaf.]

By the triangle inequality, each hop costs at most the two spokes it replaces:

   Cost ≤ (e + a) + (a + b) + (b + c) + (c + d) + (d + e)
        = 2a + 2b + 2c + 2d + 2e
        = 2(a + b + c + d + e)
        = 2 × (cost of the minimum spanning tree)

Since MST ≤ OPT and SOL ≤ 2·MST, we get SOL ≤ 2·OPT.
Spanning Tree and TSP

How to formalize the idea of "following" a minimum spanning tree?
Spanning Tree and TSP

How to formalize the idea of "following" a minimum spanning tree?

Key idea: double all the edges and find an Eulerian tour. The doubled graph has cost 2·MST, and every vertex has even degree, so an Eulerian tour exists.
Spanning Tree and TSP

Strategy: shortcut this Eulerian tour.
Spanning Tree and TSP

By the triangle inequalities, the shortcut tour is no longer than the Eulerian tour. Each directed edge is used exactly once in the shortcut tour.
A 2-Approximation Algorithm for Metric TSP

(Metric TSP – Factor 2)
1. Find an MST, T, of G.
2. Double every edge of the MST to obtain an Eulerian graph.
3. Find an Eulerian tour, T*, on this graph.
4. Output the tour C that visits the vertices of G in the order of their first appearance in T*. (That is, shortcut T*.)

Analysis:
1. cost(T) ≤ OPT   (because an MST is a lower bound on TSP)
2. cost(T*) = 2 cost(T)   (because every edge appears twice)
3. cost(C) ≤ cost(T*)   (because of triangle inequalities, shortcutting)
4. So, cost(C) ≤ 2 OPT.
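A runnable sketch of the whole algorithm for points in the plane (my code; it uses the standard fact that first appearances on an Eulerian tour of the doubled MST are exactly a DFS preorder of the tree):

import math

def metric_tsp_2approx(points):
    """Factor-2 sketch for Euclidean (hence metric) TSP: build an MST with Prim's
    algorithm, then output vertices in DFS preorder of the tree -- the same order
    as first appearances on an Eulerian tour of the doubled MST."""
    n = len(points)
    d = lambda i, j: math.dist(points[i], points[j])
    # Prim's algorithm: grow the MST from vertex 0.
    in_tree = [False] * n
    parent = [0] * n
    best = [math.inf] * n
    best[0] = 0.0
    children = [[] for _ in range(n)]
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if u != 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and d(u, v) < best[v]:
                best[v], parent[v] = d(u, v), u
    # DFS preorder = shortcut of the Eulerian tour on the doubled tree.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    cost = sum(d(tour[i], tour[(i + 1) % n]) for i in range(n))
    return tour, cost

tour, cost = metric_tsp_2approx([(0, 0), (0, 1), (1, 1), (1, 0)])
print(tour, cost)   # [0, 1, 2, 3] with cost 4.0, the optimal square tour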
Better Approximation?

There is a 1.5-approximation algorithm for metric TSP (Christofides' algorithm). Hint: add a minimum weight perfect matching on the odd-degree vertices of a minimum spanning tree (instead of doubling the whole tree).

Major open problem: improve this to 4/3?

Just for fun: can we design an approximation algorithm for the seating assignment problem?

An aside: a hardness result is not an excuse to stop working, but a guide to identifying interesting cases.
Extra Slides
Load Balancing on 2 Machines

Claim. Load balancing is hard even if there are only 2 machines.
Pf. NUMBER-PARTITIONING ≤_P LOAD-BALANCE. (NUMBER-PARTITIONING is NP-complete by Exercise 8.26.)

[Figure: jobs a–g split across machine 1 and machine 2 so that both machines finish at the same time L; a yes-instance of NUMBER-PARTITIONING corresponds to a schedule with makespan exactly half the total length.]
Center Selection: Hardness of Approximation

Theorem. Unless P = NP, there is no ρ-approximation algorithm for the metric k-center problem for any ρ < 2.

Pf. We show how we could use a (2 − ε)-approximation algorithm for k-center to solve DOMINATING-SET in poly-time. (DOMINATING-SET is NP-complete; see Exercise 8.29.)
- Let G = (V, E), k be an instance of DOMINATING-SET.
- Construct instance G' of k-center with sites V and distances
     d(u, v) = 1 if (u, v) ∈ E
     d(u, v) = 2 if (u, v) ∉ E
- Note that G' satisfies the triangle inequality.
- Claim: G has a dominating set of size k iff there exist k centers C* with r(C*) = 1.
- Thus, if G has a dominating set of size k, a (2 − ε)-approximation algorithm on G' must find a solution C* with r(C*) = 1, since it cannot use any edge of distance 2. ▪