Princeton University COS 423: Theory of Algorithms, Spring 2001. Kevin Wayne.
Approximation Algorithms
These lecture slides are adapted from CLRS.
2 Coping With NP-Hardness
Suppose you need to solve NP-hard problem X.
• Theory says you aren't likely to find a polynomial-time algorithm.
• Should you just give up?
  – Probably yes, if your goal is really to find a polynomial-time algorithm.
  – Probably no, if your job depends on it.
3 Coping With NP-Hardness
Brute-force algorithms.
• Develop clever enumeration strategies.
• Guaranteed to find the optimal solution.
• No guarantees on running time.
Heuristics.
• Develop intuitive algorithms.
• Guaranteed to run in polynomial time.
• No guarantees on quality of solution.
Approximation algorithms.
• Guaranteed to run in polynomial time.
• Guaranteed to find a "high quality" solution, say within 1% of optimum.
• Obstacle: need to prove a solution's value is close to optimum, without even knowing what the optimum value is!
4 Approximation Algorithms and Schemes
ρ-approximation algorithm.
• An algorithm A for problem P that runs in polynomial time.
• For every problem instance, A outputs a feasible solution within ratio ρ of the true optimum for that instance.
Polynomial-time approximation scheme (PTAS).
• A family of approximation algorithms {A_ε : ε > 0} for a problem P.
• A_ε is a (1 + ε)-approximation algorithm for P.
• A_ε runs in time polynomial in the input size for a fixed ε.
Fully polynomial-time approximation scheme (FPTAS).
• PTAS where A_ε runs in time polynomial in both the input size and 1/ε.
5 Approximation Algorithms and Schemes
Types of approximation algorithms.
• Fully polynomial-time approximation scheme.
• Constant factor.
6 Knapsack Problem
Knapsack problem.
• Given N objects and a "knapsack."
• Item i weighs w_i > 0 Newtons and has value v_i > 0.
• Knapsack can carry weight up to W Newtons.
• Goal: fill knapsack so as to maximize total value.

Item  Value  Weight
 1      1      1
 2      6      2
 3     18      5
 4     22      6
 5     28      7

W = 11. OPT value = 40: {3, 4}. Greedy (by ratio v_i / w_i) = 35: {5, 2, 1}.
7 Knapsack is NP-Hard
KNAPSACK: Given a finite set X, nonnegative weights w_i, nonnegative values v_i, a weight limit W, and a desired value V, is there a subset S ⊆ X such that Σ_{i∈S} w_i ≤ W and Σ_{i∈S} v_i ≥ V?
SUBSET-SUM: Given a finite set X, nonnegative values u_i, and an integer t, is there a subset S ⊆ X whose elements sum to exactly t?
Claim. SUBSET-SUM ≤_P KNAPSACK.
Proof. Given instance (X, t) of SUBSET-SUM, create KNAPSACK instance:
• v_i = w_i = u_i
• V = W = t
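A minimal sketch of this reduction in Python (the function name and the returned tuple layout are illustrative, not from the slides):

import typing

def subset_sum_to_knapsack(X: typing.List[int], t: int):
    """Map a SUBSET-SUM instance (X, t) to a KNAPSACK instance.
    A subset S sums to exactly t iff it has weight <= W = t and value >= V = t."""
    values = list(X)    # v_i = u_i
    weights = list(X)   # w_i = u_i
    return weights, values, t, t   # (weights, values, W, V)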
8 Knapsack: Dynamic Programming Solution 1
OPT(n, w) = max profit subset of items {1, ..., n} with weight limit w.
• Case 1: OPT selects item n.
  – new weight limit = w − w_n
  – OPT selects best of {1, 2, ..., n − 1} using this new weight limit
• Case 2: OPT does not select item n.
  – OPT selects best of {1, 2, ..., n − 1} using weight limit w
Directly leads to an O(N·W) time algorithm.
• W = weight limit.
• Not polynomial in input size!
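As a runnable sketch in Python (a minimal bottom-up version; the names are ours, not from the slides):

def knapsack_by_weight(weights, values, W):
    """OPT[n][w] = max value using items 1..n with weight limit w.
    O(N*W) time: pseudo-polynomial, since W may be exponential in the input size."""
    N = len(values)
    OPT = [[0] * (W + 1) for _ in range(N + 1)]
    for n in range(1, N + 1):
        for w in range(W + 1):
            OPT[n][w] = OPT[n - 1][w]                # Case 2: OPT does not select item n
            if weights[n - 1] <= w:                  # Case 1: OPT selects item n
                OPT[n][w] = max(OPT[n][w],
                                values[n - 1] + OPT[n - 1][w - weights[n - 1]])
    return OPT[N][W]

# On the slide-6 instance: knapsack_by_weight([1, 2, 5, 6, 7], [1, 6, 18, 22, 28], 11) == 40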
9 Knapsack: Dynamic Programming Solution 2
OPT(n, v) = min knapsack weight that yields value exactly v using a subset of items {1, ..., n}.
• Case 1: OPT selects item n.
  – new value needed = v − v_n
  – OPT selects best of {1, 2, ..., n − 1} using this new value
• Case 2: OPT does not select item n.
  – OPT selects best of {1, 2, ..., n − 1} that achieves value v
Directly leads to an O(N·V*) time algorithm.
• V* = optimal value.
• Not polynomial in input size!
10 Knapsack: Bottom-Up

Bottom-Up Knapsack
INPUT: N, W, w_1, …, w_N, v_1, …, v_N
ARRAY: OPT[0..N, 0..V*]   (V* ≤ Σ_i v_i)
OPT[0, 0] = 0
FOR v = 1 to V*
    OPT[0, v] = ∞
FOR n = 1 to N
    FOR v = 0 to V*
        IF (v_n > v)
            OPT[n, v] = OPT[n-1, v]
        ELSE
            OPT[n, v] = min {OPT[n-1, v], w_n + OPT[n-1, v - v_n]}
v* = max {v : OPT[N, v] ≤ W}
RETURN v*
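The same procedure as a runnable Python sketch, assuming V* is bounded by the sum of all values:

def knapsack_by_value(weights, values, W):
    """OPT[n][v] = min weight needed to achieve value exactly v using items 1..n.
    O(N*V*) time, with V* <= sum(values). Returns the optimal value v*."""
    N, INF = len(values), float("inf")
    Vmax = sum(values)                    # upper bound on V*
    OPT = [[INF] * (Vmax + 1) for _ in range(N + 1)]
    OPT[0][0] = 0
    for n in range(1, N + 1):
        for v in range(Vmax + 1):
            OPT[n][v] = OPT[n - 1][v]                 # Case 2: do not select item n
            if values[n - 1] <= v:                    # Case 1: select item n
                OPT[n][v] = min(OPT[n][v],
                                weights[n - 1] + OPT[n - 1][v - values[n - 1]])
    return max(v for v in range(Vmax + 1) if OPT[N][v] <= W)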
11 Knapsack: FPTAS
Intuition for approximation algorithm.
• Round all values down to lie in a smaller range.
• Run the O(N·V*) dynamic programming algorithm on the rounded instance.
• Return the optimal items of the rounded instance.
[Tables: an original instance with values in the millions, W = 11, next to its rounded instance with small integer values, W = 11.]
12 Knapsack: FPTAS
Knapsack FPTAS.
• Round all values: v̄_i = ⌊v_i / θ⌋, where
  – V = largest value in original instance
  – ε = precision parameter
  – θ = scaling factor = εV / N
• Bound on optimal value V*: V* ≥ V, assuming w_n ≤ W for all n.
Running time. Each rounded value is at most N/ε, so the optimal rounded value is at most N²/ε, and the O(N·V*) dynamic program runs in O(N³/ε) time.
13 Knapsack: FPTAS
Knapsack FPTAS.
• Round all values: v̄_i = ⌊v_i / θ⌋, with V = largest original value, ε = precision parameter, θ = εV / N.
• Bound on optimal value V*: V* ≥ V, assuming w_n ≤ W for all n.
Proof of correctness. Let S̄ be an optimal solution of the rounded instance and S* an optimal solution of the original instance. Since θ·v̄_i ≤ v_i < θ·(v̄_i + 1):
Σ_{i∈S̄} v_i ≥ θ Σ_{i∈S̄} v̄_i ≥ θ Σ_{i∈S*} v̄_i ≥ Σ_{i∈S*} (v_i − θ) ≥ V* − Nθ = V* − εV ≥ (1 − ε) V*.
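Putting the rounding and the by-value DP together, a sketch of the FPTAS in Python (the backtracking to recover the item set is our addition, and we assume every item fits, w_i ≤ W):

def knapsack_fptas(weights, values, W, eps):
    """Round values down to multiples of theta = eps*V/N, solve the rounded
    instance exactly, and return the original value of the chosen items.
    Guarantees a (1 - eps) factor of optimum; runs in O(N^3 / eps) time."""
    N, INF = len(values), float("inf")
    V = max(values)                           # largest value; assumes w_i <= W for all i
    theta = eps * V / N                       # scaling factor
    vbar = [int(v // theta) for v in values]  # rounded values, each at most N/eps
    Vmax = sum(vbar)                          # at most N^2/eps
    OPT = [[INF] * (Vmax + 1) for _ in range(N + 1)]
    OPT[0][0] = 0
    for n in range(1, N + 1):
        for v in range(Vmax + 1):
            OPT[n][v] = OPT[n - 1][v]
            if vbar[n - 1] <= v:
                OPT[n][v] = min(OPT[n][v],
                                weights[n - 1] + OPT[n - 1][v - vbar[n - 1]])
    best = max(v for v in range(Vmax + 1) if OPT[N][v] <= W)
    S, v = [], best                           # backtrack to recover the item set
    for n in range(N, 0, -1):
        if OPT[n][v] != OPT[n - 1][v]:        # item n was selected
            S.append(n - 1)
            v -= vbar[n - 1]
    return sum(values[i] for i in S), sorted(S)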
14 Knapsack: State of the Art
This lecture.
• "Rounding and scaling" method finds a solution within a (1 − ε) factor of optimum for any ε > 0.
• Takes O(N³/ε) time and space.
Ibarra-Kim (1975), Lawler (1979).
• Faster FPTAS: O(N log(1/ε) + 1/ε⁴) time.
• Idea: group items by value into "large" and "small" classes.
  – run dynamic programming algorithm only on large items
  – insert small items according to ratio v_n / w_n
  – clever analysis
15 Approximation Algorithms and Schemes
Types of approximation algorithms.
• Fully polynomial-time approximation scheme.
• Constant factor.
16 Traveling Salesperson Problem TSP: Given a graph G = (V, E), nonnegative edge weights c(e), and an integer C, is there a Hamiltonian cycle whose total cost is at most C? Is there a tour of length at most 1570?
17 Traveling Salesperson Problem TSP: Given a graph G = (V, E), nonnegative edge weights c(e), and an integer C, is there a Hamiltonian cycle whose total cost is at most C? Is there a tour of length at most 1570? Yes, red tour = 1565.
18 Hamiltonian Cycle Reduces to TSP
HAM-CYCLE: Given an undirected graph G = (V, E), does there exist a simple cycle C that contains every vertex in V?
TSP: Given a complete (undirected) graph G, integer edge weights c(e) ≥ 0, and an integer C, is there a Hamiltonian cycle whose total cost is at most C?
Claim. TSP is NP-complete.
Proof. (HAM-CYCLE is NP-complete and transforms to TSP)
• Given G = (V, E), we want to decide if it is Hamiltonian.
• Create instance of TSP with G' = complete graph on V.
• Set c(e) = 1 if e ∈ E, c(e) = 2 if e ∉ E, and choose C = |V|.
• G Hamiltonian ⇒ the corresponding cycle in G' has cost exactly |V|.
• G not Hamiltonian ⇒ every Hamiltonian cycle in G' has cost at least |V| + 1.
[Figure: G on vertices a, b, c, d and its completion G', with weight-1 edges for e ∈ E and weight-2 edges otherwise.]
19 TSP
TSP-OPT: Given a complete (undirected) graph G = (V, E) with integer edge weights c(e) ≥ 0, find a Hamiltonian cycle of minimum cost.
Claim. If P ≠ NP, there is no ρ-approximation for TSP for any ρ ≥ 1.
Proof (by contradiction).
• Suppose A is a ρ-approximation algorithm for TSP.
• We show how to solve an instance G of HAM-CYCLE.
• Create instance of TSP with G' = complete graph on V.
• Let c(e) = 1 if e ∈ E, and c(e) = ρ|V| + 1 if e ∉ E.
• G Hamiltonian ⇒ a minimum-cost tour in G' has cost exactly |V|.
• G not Hamiltonian ⇒ every tour in G' has cost more than ρ|V|.
• Gap ⇒ if G has a Hamiltonian cycle, then A must return one.
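In symbols, the gap argument (a reconstruction of the slide's missing formula): if G is Hamiltonian, then c(H*) = |V| in G', so A returns a tour of cost at most ρ|V|; such a tour cannot use a non-edge, since a single non-edge already costs ρ|V| + 1. If G is not Hamiltonian, every tour of G' uses at least one non-edge, so

$$c(\text{tour}) \;\ge\; (|V| - 1) + (\rho|V| + 1) \;>\; \rho\,|V|.$$

Hence A's output would decide HAM-CYCLE in polynomial time.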
20 TSP Heuristic
APPROX-TSP(G, c)
• Find a minimum spanning tree T for (G, c).
[Figures: the input points on vertices a–h (assume Euclidean distances), and an MST on them.]
21 TSP Heuristic
APPROX-TSP(G, c)
• Find a minimum spanning tree T for (G, c).
• W ← ordered list of vertices in preorder walk of T.
• H ← cycle that visits the vertices in the order W.
[Figures: full walk W = (a, b, c, b, h, b, a, d, e, f, e, g, e, d, a); Hamiltonian cycle H = (a, b, c, h, d, e, f, g, a).]
22 TSP Heuristic
APPROX-TSP(G, c)
• Find a minimum spanning tree T for (G, c).
• W ← ordered list of vertices in preorder walk of T.
• H ← cycle that visits the vertices in the order W.
[Figures: the Hamiltonian cycle H (assuming Euclidean distances) next to an optimal tour.]
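A compact sketch of APPROX-TSP in Python using NetworkX (the library and its minimum_spanning_tree / dfs_preorder_nodes helpers are our assumption, not part of the lecture):

import networkx as nx

def approx_tsp(G):
    """G: complete nx.Graph whose 'weight' attributes satisfy the triangle inequality.
    Returns a Hamiltonian cycle of cost at most twice the optimum."""
    T = nx.minimum_spanning_tree(G, weight="weight")   # step 1: MST
    root = next(iter(T.nodes))
    W = list(nx.dfs_preorder_nodes(T, source=root))    # step 2: preorder walk of T
    return W + [root]                                  # step 3: visit in preorder, close the cycle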
23 TSP With Triangle Inequality
Δ-TSP: TSP where costs satisfy the Δ-inequality:
• For all u, v, and w: c(u, w) ≤ c(u, v) + c(v, w).
Claim. Δ-TSP is NP-complete.
Proof. The transformation from HAM-CYCLE satisfies the Δ-inequality.
Ex. Euclidean points in the plane, e.g. (0, 0), (5, 9), (−10, 5).
• Euclidean TSP is NP-hard, but not known to be in NP.
• PTAS for Euclidean TSP. (Arora 1996, Mitchell 1996)
24 TSP With Triangle Inequality
Theorem. APPROX-TSP is a 2-approximation algorithm for Δ-TSP.
Proof. Let H* denote an optimal tour. Need to show c(H) ≤ 2c(H*).
• c(T) ≤ c(H*) since we obtain a spanning tree by deleting any edge from the optimal tour.
[Figures: an optimal tour next to the MST T.]
25 TSP With Triangle Inequality
Theorem. APPROX-TSP is a 2-approximation algorithm for Δ-TSP.
Proof. Let H* denote an optimal tour. Need to show c(H) ≤ 2c(H*).
• c(T) ≤ c(H*) since we obtain a spanning tree by deleting any edge from the optimal tour.
• c(W) = 2c(T) since every edge of T is visited exactly twice.
[Figures: MST T next to the walk W = (a, b, c, b, h, b, a, d, e, f, e, g, e, d, a).]
26 TSP With Triangle Inequality
Theorem. APPROX-TSP is a 2-approximation algorithm for Δ-TSP.
Proof. Let H* denote an optimal tour. Need to show c(H) ≤ 2c(H*).
• c(T) ≤ c(H*) since we obtain a spanning tree by deleting any edge from the optimal tour.
• c(W) = 2c(T) since every edge of T is visited exactly twice.
• c(H) ≤ c(W) by the Δ-inequality: shortcutting repeated vertices cannot increase the cost.
[Figures: the walk W next to the Hamiltonian cycle H = (a, b, c, h, d, e, f, g, a).]
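Chaining the three inequalities above gives the claimed bound:

$$c(H) \;\le\; c(W) \;=\; 2\,c(T) \;\le\; 2\,c(H^*).$$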
27 TSP: Christofides Algorithm
Theorem. There exists a 1.5-approximation algorithm for Δ-TSP.
CHRISTOFIDES(G, c)
• Find a minimum spanning tree T for (G, c).
• M ← min-cost perfect matching of the odd-degree nodes in T.
[Figures: MST T; matching M on the odd-degree nodes.]
28 TSP: Christofides Algorithm
Theorem. There exists a 1.5-approximation algorithm for Δ-TSP.
CHRISTOFIDES(G, c)
• Find a minimum spanning tree T for (G, c).
• M ← min-cost perfect matching of the odd-degree nodes in T.
• G' ← union of spanning tree and matching edges.
[Figures: matching M; G' = MST + matching.]
29 TSP: Christofides Algorithm
Theorem. There exists a 1.5-approximation algorithm for Δ-TSP.
CHRISTOFIDES(G, c)
• Find a minimum spanning tree T for (G, c).
• M ← min-cost perfect matching of the odd-degree nodes in T.
• G' ← union of spanning tree and matching edges.
• E ← Eulerian tour in G'.
[Figures: matching M; Eulerian tour E in G'.]
30 TSP: Christofides Algorithm
Theorem. There exists a 1.5-approximation algorithm for Δ-TSP.
CHRISTOFIDES(G, c)
• Find a minimum spanning tree T for (G, c).
• M ← min-cost perfect matching of the odd-degree nodes in T.
• G' ← union of spanning tree and matching edges.
• E ← Eulerian tour in G'.
• H ← shortcut version of the Eulerian tour E.
[Figures: Eulerian tour E in G'; Hamiltonian cycle H.]
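A sketch of CHRISTOFIDES in Python on top of NetworkX (≥ 3.0) primitives; the library choice and its helper functions are assumptions, not part of the lecture:

import networkx as nx

def christofides(G):
    """G: complete nx.Graph whose 'weight' attributes satisfy the triangle inequality."""
    T = nx.minimum_spanning_tree(G, weight="weight")
    odd = [v for v in T if T.degree(v) % 2 == 1]       # always an even number of these
    M = nx.min_weight_matching(G.subgraph(odd))        # min-cost perfect matching on odd nodes
    H = nx.MultiGraph(T)                               # G' = MST + matching edges
    H.add_edges_from((u, v, {"weight": G[u][v]["weight"]}) for u, v in M)
    tour, seen = [], set()                             # shortcut the Eulerian tour
    for u, _ in nx.eulerian_circuit(H):
        if u not in seen:
            seen.add(u)
            tour.append(u)
    return tour + [tour[0]]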
31 TSP: Christofides Algorithm
Theorem. There exists a 1.5-approximation algorithm for Δ-TSP.
Proof. Let H* denote an optimal tour. Need to show c(H) ≤ 1.5 c(H*).
• c(T) ≤ c(H*) as before.
• c(M) ≤ ½ c(τ*) ≤ ½ c(H*), where τ* is an optimal tour on the odd-degree nodes.
  – the second inequality follows from the Δ-inequality (shortcut H* to the odd-degree nodes)
  – there is an even number of odd-degree nodes
  – a Hamiltonian cycle on an even number of nodes is comprised of two perfect matchings
[Figures: matching M; optimal tour τ* on the odd-degree nodes.]
32 TSP: Christofides Algorithm
Theorem. There exists a 1.5-approximation algorithm for Δ-TSP.
Proof. Let H* denote an optimal tour. Need to show c(H) ≤ 1.5 c(H*).
• c(T) ≤ c(H*) as before.
• c(M) ≤ ½ c(τ*) ≤ ½ c(H*).
• The union of the MST and the matching edges is Eulerian.
  – every node has even degree
• Can shortcut to produce H with c(H) ≤ c(M) + c(T).
[Figures: MST + matching; Hamiltonian cycle H.]
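Combining the three facts above:

$$c(H) \;\le\; c(T) + c(M) \;\le\; c(H^*) + \tfrac{1}{2}\,c(H^*) \;=\; \tfrac{3}{2}\,c(H^*).$$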
33 Load Balancing
Load balancing input.
• m identical machines.
• n jobs; job j has processing time p_j.
Goal: assign each job to a machine to minimize the makespan.
• If the subset of jobs S_i is assigned to machine i, then machine i works for a total time of T_i = Σ_{j∈S_i} p_j.
• Minimize the maximum T_i.
34 Load Balancing on 2 Machines
2-LOAD-BALANCE: Given a set of jobs J of varying lengths p_j ≥ 0 and an integer T, can the jobs be processed on 2 identical parallel machines so that they all finish by time T?
[Figure: jobs A–G of various lengths, to be packed on machines 1 and 2 along a time axis from 0 to T.]
35 Load Balancing on 2 Machines
2-LOAD-BALANCE: Given a set of jobs J of varying lengths p_j ≥ 0 and an integer T, can the jobs be processed on 2 identical parallel machines so that they all finish by time T?
Yes. [Figure: machine 1 runs A, D, F and machine 2 runs B, C, E, G, all finishing by time T.]
36 Load Balancing is NP-Hard
PARTITION: Given a set X of nonnegative integers, is there a subset S ⊆ X such that Σ_{x∈S} x = ½ Σ_{x∈X} x?
2-LOAD-BALANCE: Given a set of jobs J of varying lengths p_j, and an integer T, can the jobs be processed on 2 identical parallel machines so that they all finish by time T?
Claim. PARTITION ≤_P 2-LOAD-BALANCE.
Proof. Let X be an instance of PARTITION.
• For each integer x ∈ X, include a job j of length p_j = x.
• Set T = ½ Σ_{x∈X} x.
Conclusion: the load balancing optimization problem is NP-hard.
37 Load Balancing
Greedy algorithm.
• Consider jobs in some fixed order.
• Assign job j to the machine whose load is smallest so far.
• Note: this is an "on-line" algorithm.

LIST-SCHEDULING(m, n, p_1, ..., p_n)
FOR i = 1 to m
    T_i ← 0, S_i ← ∅
FOR j = 1 to n
    i = argmin_k T_k          (machine with smallest load)
    S_i ← S_i ∪ {j}           (assign job j to machine i)
    T_i ← T_i + p_j
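A runnable Python version of LIST-SCHEDULING; the heap, which makes each argmin cost O(log m), is an implementation choice of ours:

import heapq

def list_schedule(m, jobs):
    """Assign each job to the machine with the smallest current load.
    Returns (makespan, assignment)."""
    loads = [(0, i) for i in range(m)]        # heap of (T_i, machine i)
    assignment = [[] for _ in range(m)]
    for j, p in enumerate(jobs):
        T_i, i = heapq.heappop(loads)         # i = argmin_k T_k
        assignment[i].append(j)               # S_i <- S_i U {j}
        heapq.heappush(loads, (T_i + p, i))   # T_i <- T_i + p_j
    return max(T for T, _ in loads), assignment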
38 Load Balancing
Theorem (Graham, 1966). Greedy algorithm is a 2-approximation.
• First worst-case analysis of an approximation algorithm.
• Need to compare the resulting solution with the optimal makespan T*.
Lemma 1. The optimal makespan is at least T* ≥ (1/m) Σ_j p_j.
• The total processing time is Σ_j p_j.
• One of the m machines must do at least a 1/m fraction of the total work.
Lemma 2. The optimal makespan is at least T* ≥ max_j p_j.
• Some machine must process the most time-consuming job.
39 Load Balancing
Lemma 1. The optimal makespan is at least T* ≥ (1/m) Σ_j p_j.
Lemma 2. The optimal makespan is at least T* ≥ max_j p_j.
Theorem. Greedy algorithm is a 2-approximation.
Proof. Consider the bottleneck machine i that works for T units of time.
• Let j be the last job scheduled on machine i.
• When job j was assigned to machine i, i had the smallest load. Its load before the assignment was T_i − p_j, so T_i − p_j ≤ T_k for all 1 ≤ k ≤ m.
[Figure: machine i is the bottleneck; its last job j starts at T_i − p_j and finishes at T = T_i.]
40 Load Balancing
Lemma 1. The optimal makespan is at least T* ≥ (1/m) Σ_j p_j.
Lemma 2. The optimal makespan is at least T* ≥ max_j p_j.
Theorem. Greedy algorithm is a 2-approximation.
Proof. Consider the bottleneck machine i that works for T units of time.
• Let j be the last job scheduled on machine i.
• When job j was assigned to machine i, i had the smallest load. Its load before the assignment was T_i − p_j, so T_i − p_j ≤ T_k for all 1 ≤ k ≤ m.
• Sum these inequalities over all k, divide by m, and apply Lemma 1.
• Finish off using Lemma 2.
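Written out (a reconstruction of the slide's omitted algebra): summing T_i − p_j ≤ T_k over all k and dividing by m gives

$$T \;=\; T_i \;=\; (T_i - p_j) + p_j \;\le\; \frac{1}{m}\sum_{k=1}^{m} T_k \;+\; p_j \;=\; \frac{1}{m}\sum_{j'} p_{j'} \;+\; p_j \;\le\; T^* + T^* \;=\; 2\,T^*.$$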
41 Load Balancing
Is our analysis tight?
• Essentially yes.
• We give an instance where the greedy solution is almost a factor of 2 from optimal.
  – m machines, m(m−1) jobs of length 1, followed by 1 job of length m
  – e.g. 10 machines, 90 jobs of length 1, then 1 job of length 10
[Figure: list schedule — each of the 10 machines gets 9 unit jobs; the length-10 job then lands on one of them.]
List-schedule makespan = 19.
42 Load Balancing
Is our analysis tight?
• Essentially yes.
• We give an instance where the greedy solution is almost a factor of 2 from optimal.
  – m machines, m(m−1) jobs of length 1, followed by 1 job of length m
  – e.g. 10 machines, 90 jobs of length 1, then 1 job of length 10
[Figure: optimal schedule — nine machines take 10 unit jobs each; the tenth takes the length-10 job.]
Optimal makespan = 10.
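Running the list_schedule sketch from above on this instance exhibits the gap (2m − 1 versus m):

m = 10
jobs = [1] * (m * (m - 1)) + [m]     # 90 unit jobs, then one job of length 10
makespan, _ = list_schedule(m, jobs)
print(makespan)                      # 19 = 2m - 1; the optimal makespan is m = 10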
43 Load Balancing: State of the Art
What's known.
• 2-approximation algorithm.
• 3/2-approximation algorithm: homework.
• 4/3-approximation algorithm: extra credit.
• PTAS.