Metric k-center
Given a complete undirected graph G = (V, E) with nonnegative edge costs satisfying the triangle inequality, and a positive integer k. For any set S ⊆ V and vertex v, define connect(v, S) = min{cost(u, v) | u ∈ S}, the cost of the cheapest edge from v to a vertex in S. Find a set S ⊆ V with |S| = k so as to minimize max_v {connect(v, S)}. The metric k-center problem is NP-hard.
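The objective can be made concrete with a small sketch. The function names, the cost-matrix representation, and the four-point instance are illustrative assumptions, not part of the slides, and the exhaustive search is only there to show what is being minimized.

```python
from itertools import combinations

def connect(v, S, cost):
    """Cost of the cheapest edge from v to a vertex in S."""
    return min(cost[u][v] for u in S)

def radius(S, cost):
    """Objective value max_v connect(v, S) of a candidate center set S."""
    return max(connect(v, S, cost) for v in range(len(cost)))

def brute_force_k_center(cost, k):
    """Try every k-subset; exponential, for tiny illustrative inputs only."""
    return min(combinations(range(len(cost)), k), key=lambda S: radius(S, cost))

# Hypothetical instance: four points on a line, cost = distance (a metric).
# cost[u][u] = 0, so a center connects to itself at cost 0.
points = [0, 1, 5, 6]
cost = [[abs(p - q) for q in points] for p in points]
best = brute_force_k_center(cost, 2)
print(best, radius(best, cost))
```

Picking one center from each cluster gives objective value 1, whereas both centers in the same cluster would give 5.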
Parametric pruning (1)
If we know the cost of an optimal solution, we may be able to prune away irrelevant parts of the input and thereby simplify the search for a good solution. However, computing the cost of an optimal solution is precisely the difficult core of NP-hard NP-optimization problems. The technique of parametric pruning gets around this difficulty as follows. A parameter t is chosen, which can be viewed as a “guess” at the cost of an optimal solution. For each value of t, the given instance I is pruned by removing parts that will not be used in any solution of cost at most t. Denote the pruned instance by I(t).
Parametric pruning (2)
The algorithm consists of two steps. In the first step, the family of instances I(t) is used for computing a lower bound on OPT, say t*. In the second step, a solution is found in instance I(αt*), for a suitable choice of α.
Parametric pruning
Sort the edges of G in nondecreasing order of cost, i.e., cost(e_1) ≤ cost(e_2) ≤ … ≤ cost(e_m). Let G_i = (V, E_i), where E_i = {e_1, e_2, …, e_i}. For each G_i, we have to check whether there exists a subset S ⊆ V with |S| ≤ k such that every vertex in V – S is adjacent to a vertex in S.
Dominating Set
A dominating set in an undirected graph G = (V, E) is a subset S ⊆ V such that every vertex in V – S is adjacent to a vertex in S. Let dom(G) denote the size of a minimum cardinality dominating set in G. Computing dom(G) is NP-hard.
k-Center
The k-center problem is equivalent to finding the smallest index i such that G_i has a dominating set of size at most k, i.e., G_i contains k stars (K_{1,p}) spanning all vertices.
[Figure: the star K_{1,7}]
G^2
An independent set (stable set) in G = (V, E) is a subset I ⊆ V of pairwise non-adjacent vertices. Define the square of a graph G = (V, E) to be the graph G^2 = (V, E′) containing an edge (u, v) ∈ E′ whenever G has a path of length at most 2 between u and v, u ≠ v.
[Figure: G = K_{1,4} and its square G^2 = K_5]
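A minimal sketch of the squaring operation, assuming vertices are numbered 0..n-1 and the graph is given as an edge list (both are choices made here for illustration). It reproduces the star example: the square of K_{1,4} is K_5.

```python
def square(n, edges):
    """Edge set of G^2: pairs u != v joined by a path of length at most 2 in G."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    squared = set()
    for u in range(n):
        reach = set(adj[u])          # paths of length 1
        for w in adj[u]:
            reach |= adj[w]          # paths of length 2 through w
        reach.discard(u)             # no self-loops
        for v in reach:
            squared.add((min(u, v), max(u, v)))
    return squared

# K_{1,4}: center 0 joined to leaves 1..4.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
star_squared = square(5, star)
print(len(star_squared))
```

Every pair of leaves is joined through the center, so all 10 pairs of the 5 vertices appear.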
Lower bound
Lemma 4.1 Given a graph H, let I be an independent set in H^2. Then |I| ≤ dom(H).
Hochbaum-Shmoys Algorithm (1986)
Input (G, cost: E → Q+)
1) Construct G_1^2, G_2^2, …, G_m^2.
2) Compute a maximal independent set I_r in each graph G_r^2.
3) Find the smallest index r such that |I_r| ≤ k, say j.
Output (I_j)
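The three steps can be sketched as follows. This is an illustrative implementation under assumptions that are not in the slides: the input is a symmetric cost matrix, the maximal independent set is chosen greedily in vertex order, and G_r^2 is never built explicitly (vertices within two hops in G_r are simply blocked).

```python
def maximal_independent_set_in_square(adj):
    """Greedy maximal independent set of G^2, given adjacency sets of G."""
    blocked = [False] * len(adj)
    chosen = []
    for v in range(len(adj)):
        if blocked[v]:
            continue
        chosen.append(v)
        blocked[v] = True
        for w in adj[v]:             # block everything within distance 2 of v
            blocked[w] = True
            for x in adj[w]:
                blocked[x] = True
    return chosen

def hochbaum_shmoys(cost, k):
    """Return (centers, cost(e_j)) for the smallest j with |I_j| <= k."""
    n = len(cost)
    edges = sorted((cost[u][v], u, v) for u in range(n) for v in range(u + 1, n))
    adj = [set() for _ in range(n)]
    for c, u, v in edges:            # G_r is G_{r-1} plus the edge e_r
        adj[u].add(v)
        adj[v].add(u)
        centers = maximal_independent_set_in_square(adj)
        if len(centers) <= k:
            return centers, c
    return list(range(n)), 0         # only reached when the graph has no edges

# Hypothetical instance: four points on a line, cost = distance.
points = [0, 1, 5, 6]
cost = [[abs(p - q) for q in points] for p in points]
centers, threshold = hochbaum_shmoys(cost, 2)
print(centers, threshold)
```

On this instance the loop stops at the second edge, where the two clusters {0, 1} and {2, 3} each contribute one center.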
Approximation ratio of Hochbaum-Shmoys Algorithm-1
Theorem 4.2 The Hochbaum-Shmoys Algorithm achieves an approximation factor of 2 for the metric k-center problem.
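Main Lemma
Lemma 4.3 For j as defined in the algorithm, cost(e_j) ≤ OPT.
Proof. Let r* be the smallest index such that G_{r*} has a dominating set of size at most k, so that OPT = cost(e_{r*}). For every r < j we have |I_r| > k. Now by Lemma 4.1, dom(G_r) ≥ |I_r| > k. So r* > r, and hence r* ≥ j. Therefore cost(e_j) ≤ cost(e_{r*}) = OPT.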
Proof of Theorem 4.2
A maximal independent set I_j in the graph G_j^2 is also a dominating set. Thus there exist stars in G_j^2 centered on the vertices of I_j, covering all vertices. By the triangle inequality, each edge used in constructing these stars has cost at most 2 cost(e_j). Lemma 4.3 implies 2 cost(e_j) ≤ 2 OPT.
Metric weighted k-center
Given a complete undirected graph G = (V, E) with nonnegative edge costs satisfying the triangle inequality, a weight function on vertices w: V → R+, and a bound W ∈ R+. For any set S ⊆ V and vertex v, define connect(v, S) = min{cost(u, v) | u ∈ S}. Find a set S ⊆ V of total weight at most W so as to minimize max_v {connect(v, S)}.
Weighted dominating set
Let wdom(G) denote the weight of a minimum weight dominating set in G. Computing wdom(G) is NP-hard.
Parametric pruning
Sort the edges of G in nondecreasing order of cost, i.e., cost(e_1) ≤ cost(e_2) ≤ … ≤ cost(e_m). Let G_i = (V, E_i), where E_i = {e_1, e_2, …, e_i}. We need to find the smallest index i such that wdom(G_i) ≤ W. If i* is this index, then the cost of the optimal solution is OPT = cost(e_{i*}).
Lightest neighbors
Given a vertex weighted graph G = (V, E), let I be an independent set in G^2. For each u ∈ I, let s(u) denote a lightest neighbor of u in G, where u is also considered a neighbor of itself. Let S = {s(u) | u ∈ I}.
Lower Bound
Lemma 4.4 Given a graph H, let I be an independent set in H^2 and let S = {s(u) | u ∈ I}. Then w(S) ≤ wdom(H).
Proof. Let D be a minimum weight dominating set of H. Then there exists a set of disjoint stars in H, centered on the vertices of D and covering all the vertices. Since each of these stars becomes a clique in H^2, the set I can pick at most one vertex from each of them. Thus each vertex u ∈ I has the center of the corresponding star available as a neighbor in H (or is that center itself), so w(s(u)) is at most the weight of that center, and distinct vertices of I correspond to distinct centers. Hence w(S) ≤ w(D) = wdom(H).
Hochbaum-Shmoys Algorithm-2
Input (G, cost: E → Q+, w: V → R+, W)
1) Construct G_1^2, G_2^2, …, G_m^2.
2) Compute a maximal independent set I_r in each graph G_r^2.
3) Compute S_r = {s_r(u) | u ∈ I_r}, where s_r(u) is a lightest neighbor of u in G_r.
4) Find the minimum index r such that w(S_r) ≤ W, say j.
Output (S_j)
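A sketch of the weighted variant, under the same illustrative assumptions as before (symmetric cost matrix, greedy maximal independent set, G_r^2 handled implicitly). Returning None when no prefix of the edge list yields a set within the budget is a choice made here, not something the slides specify, and ties between equally light neighbors are broken by vertex number.

```python
def maximal_independent_set_in_square(adj):
    """Greedy maximal independent set of G^2, given adjacency sets of G."""
    blocked = [False] * len(adj)
    chosen = []
    for v in range(len(adj)):
        if blocked[v]:
            continue
        chosen.append(v)
        blocked[v] = True
        for w in adj[v]:
            blocked[w] = True
            for x in adj[w]:
                blocked[x] = True
    return chosen

def weighted_k_center(cost, weight, W):
    """Return (S_j, cost(e_j)) for the smallest j with w(S_j) <= W, else None."""
    n = len(cost)
    edges = sorted((cost[u][v], u, v) for u in range(n) for v in range(u + 1, n))
    adj = [set() for _ in range(n)]
    for c, u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
        independent = maximal_independent_set_in_square(adj)
        # s_r(u): a lightest neighbor of u in G_r, with u counted as its own neighbor.
        S = {min(adj[u] | {u}, key=lambda x: (weight[x], x)) for u in independent}
        if sum(weight[x] for x in S) <= W:
            return S, c
    return None

# Hypothetical instance: the two outer points are heavy, the inner ones light.
points = [0, 1, 5, 6]
cost = [[abs(p - q) for q in points] for p in points]
weight = [5, 1, 1, 5]
result = weighted_k_center(cost, weight, 2)
print(result)
```

Here the independent set found at the second edge is {0, 2}, and vertex 0 is swapped for its lighter neighbor 1 to fit the budget W = 2.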
Approximation ratio of Hochbaum-Shmoys Algorithm-2
Theorem 4.5 Hochbaum-Shmoys Algorithm-2 achieves an approximation factor of 3 for the metric weighted k-center problem.
Proof
By Lemma 4.4, cost(e_j) is a lower bound on OPT; the argument is identical to that in Lemma 4.3. Since I_j is a dominating set in G_j^2, we can cover V with stars of G_j^2 centered on vertices of I_j. By the triangle inequality these stars use edges of cost at most 2 cost(e_j). Each star center is adjacent to a vertex in S_j, using an edge of cost at most cost(e_j). Move each of the centers to the adjacent vertex in S_j and redefine the stars. Again, by the triangle inequality, the largest edge cost used in constructing the final stars is at most 3 cost(e_j) ≤ 3 OPT.
Tight Example (W = 3)
[Figure: weighted graph G with labeled vertices a, b, c, d; the labels shown include 1, 2, and 1+ε]
Tight Example
[Figure: the same example, showing I_{n+3} = {b}, S_{n+3} = {a}, and OPT = {a, c}]
Shortest superstring
Given a finite alphabet Σ and a set of n strings S = {s_1, …, s_n} ⊆ Σ+. Find a shortest string s that contains each s_i as a substring. Without loss of generality, we may assume that no string s_i is a substring of another string s_j, i ≠ j.
Overlap, prefix
We begin by developing a good lower bound on OPT. Let us assume that s_1, s_2, …, s_n are numbered in order of leftmost occurrence in the shortest superstring s. Let overlap(s_i, s_j) denote the maximum overlap between s_i and s_j, i.e., the longest suffix of s_i that is a prefix of s_j. Let prefix(s_i, s_j) be the prefix of s_i obtained by removing its overlap with s_j.
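The two definitions translate directly into a short sketch. One assumption is made explicit here: the overlap is taken to be shorter than both strings, which is automatic for distinct strings when no string is a substring of another and gives a sensible value for the overlap of a string with itself.

```python
def overlap(a, b):
    """Longest suffix of a that is a prefix of b, shorter than both strings."""
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:k]):
            return b[:k]
    return ""

def prefix(a, b):
    """The prefix of a left after removing its overlap with b."""
    return a[:len(a) - len(overlap(a, b))]

print(overlap("abcde", "cdefg"), prefix("abcde", "cdefg"))
```

Note that prefix(a, b) followed by b always contains a, which is what makes chaining prefixes along a tour produce a superstring.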
Prefix
[Figure: the superstring s with s_1, s_2, …, s_{n–1}, s_n laid out in order of occurrence, marking pref(s_1, s_2), …, pref(s_{n–1}, s_n), pref(s_n, s_1), and over(s_n, s_1)]
Define the prefix graph of S as the directed graph G_pref on vertex set V = {1, …, n} that contains an edge i → j of weight |prefix(s_i, s_j)| for each i, j. The sum |prefix(s_1, s_2)| + |prefix(s_2, s_3)| + … + |prefix(s_n, s_1)| is the weight of the tour 1 → 2 → … → n → 1, and OPT equals this sum plus |overlap(s_n, s_1)|. Hence the minimum weight of a travelling salesman tour of the prefix graph gives a lower bound on OPT. Unfortunately, this lower bound is not very useful, since computing a minimum weight TSP tour is NP-hard.
Lower Bound
We will use the minimum weight of a cycle cover of the prefix graph. A cycle cover is a collection of disjoint cycles covering all vertices. Since a Hamiltonian cycle is a cycle cover, the minimum weight of a cycle cover lower-bounds OPT. Unlike a minimum TSP tour, a minimum weight cycle cover can be computed in polynomial time.
Cycle → prefix
If c = (i_1 → i_2 → … → i_l → i_1) is a cycle in the prefix graph, let α(c) = prefix(s_{i_1}, s_{i_2}) ○ … ○ prefix(s_{i_{l–1}}, s_{i_l}) ○ prefix(s_{i_l}, s_{i_1}). Let w(c) be the weight of c, w(c) = |α(c)|. Notice that each string s_{i_1}, s_{i_2}, …, s_{i_l} is a substring of (α(c))^∞. Next, let σ(c) = α(c) ○ s_{i_1}. Then σ(c) is a superstring of s_{i_1}, s_{i_2}, …, s_{i_l}. In the above construction, we “opened” cycle c at an arbitrary string s_{i_1}. For the rest of the algorithm, we will call s_{i_1} the representative string for c.
Example
Strings of the cycle, in order (the first is repeated to close the cycle):
abcdeabcdeabcde
bcdeabcdeabcdea
cdeabcdeabcdeabc
deabcdeabcdeabcd
abcdeabcdeabcde
α(c) = abcde, |α(c)| = 5, (α(c))^2 = abcdeabcde; bcdeabcdeabcdea is a substring of (α(c))^4.
σ(c) = α(c) ○ s_{i_1} = abcdeabcdeabcdeabcde
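The example can be replayed with a small sketch. The helper names and the representation of a cycle as a list of indices are assumptions made for illustration; the overlap is taken to be shorter than both strings.

```python
def overlap_len(a, b):
    """Length of the longest suffix of a that is a prefix of b (shorter than both)."""
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def prefix(a, b):
    return a[:len(a) - overlap_len(a, b)]

def alpha(cycle, strings):
    """Concatenate the prefixes along the cycle, closing it back to its first string."""
    l = len(cycle)
    return "".join(
        prefix(strings[cycle[t]], strings[cycle[(t + 1) % l]]) for t in range(l)
    )

def sigma(cycle, strings):
    """Open the cycle at its first string, the representative."""
    return alpha(cycle, strings) + strings[cycle[0]]

strings = [
    "abcdeabcdeabcde",
    "bcdeabcdeabcdea",
    "cdeabcdeabcdeabc",
    "deabcdeabcdeabcd",
]
cycle = [0, 1, 2, 3]
print(alpha(cycle, strings), sigma(cycle, strings))
```

The four prefixes along the cycle are a, b, c, and de, so the cycle has weight 5 even though every string on it has length 15 or 16.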
Algorithm Superstring
Input (S = {s_1, …, s_n})
1) Construct the prefix graph G_pref corresponding to the strings in S.
2) Find a minimum weight cycle cover of G_pref, C = {c_1, …, c_k}.
Output (σ(c_1) ○ … ○ σ(c_k))
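The two steps can be sketched end to end. One substitution should be named plainly: the minimum weight cycle cover is found here by brute force over all permutations, which is exponential and serves only as a stand-in for a polynomial-time method. A permutation p encodes the cycle cover with edges i → p[i], fixed points being self-loops; all names are illustrative.

```python
from itertools import permutations

def overlap_len(a, b):
    """Length of the longest suffix of a that is a prefix of b (shorter than both)."""
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def superstring(strings):
    n = len(strings)
    # Step 1: prefix graph, w[i][j] = |prefix(s_i, s_j)|.
    w = [[len(a) - overlap_len(a, b) for b in strings] for a in strings]
    # Step 2: minimum weight cycle cover (brute-force stand-in, tiny inputs only).
    cover = min(
        permutations(range(n)),
        key=lambda p: sum(w[i][p[i]] for i in range(n)),
    )
    # Output: sigma(c) for each cycle c, opened at its lowest-numbered string.
    seen, pieces = set(), []
    for start in range(n):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = cover[i]
        alpha = "".join(strings[a][:w[a][cover[a]]] for a in cycle)
        pieces.append(alpha + strings[start])
    return "".join(pieces)

result = superstring(["abc", "bcd", "cde"])
print(result)
```

On this input the cheapest cover is the single cycle abc → bcd → cde → abc of weight 5, so the output is abcde followed by the representative abc. The shortest superstring is abcde, so the output has length 8 against an optimum of 5.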
Remark
Clearly, the output σ(c_1) ○ … ○ σ(c_k) is a superstring of the strings in S. Notice that if in each of the cycles we can find a representative string of length at most the weight of the cycle, then the output string has length at most 2 OPT. Thus, the hard case is when all strings of some cycle c are long.
Example
The same strings, with bars marking the copies of α(c):
abcde|abcde|abcde
bcde|abcde|abcde|a
cde|abcde|abcde|abc
de|abcde|abcde|abcd
abcde|abcde|abcde
α(c) = abcde, |α(c)| = 5, (α(c))^2 = abcdeabcde; bcdeabcdeabcdea is a substring of (α(c))^4.
σ(c) = α(c) ○ s_{i_1} = abcde|abcde|abcde|abcde
New lower bound
Lemma 4.6 If each string in S′ ⊆ S is a substring of t^∞ for a string t, then there is a cycle of weight at most |t| in the prefix graph covering all the vertices corresponding to strings in S′.
Proof of Lemma 4.6
For each string in S′, locate the starting point of its first occurrence in t^∞. All these starting points will be distinct and will lie in the first copy of t. Consider the cycle in the prefix graph visiting the corresponding vertices in this order. Clearly, the weight of this cycle is at most |t|.
Upper bound on overlap
Lemma 4.7 Let c and c′ be two cycles in C (a minimum weight cycle cover), and let r, r′ be representative strings from these cycles. Then |overlap(r, r′)| < w(c) + w(c′).
|overlap(r, r′)| ≥ w(c) + w(c′)
Suppose, for contradiction, that |overlap(r, r′)| ≥ w(c) + w(c′). Let α be the prefix of length w(c) of overlap(r, r′), and let α′ be the prefix of length w(c′) of overlap(r, r′). Since |overlap(r, r′)| ≥ w(c) + w(c′), it follows that α and α′ commute: α ○ α′ = α′ ○ α.
[Figure: r and r′ aligned on overlap(r, r′), which is tiled by copies of α and by copies of α′]
|overlap(r, r′)| ≥ w(c) + w(c′) (continued)
Since α ○ α′ = α′ ○ α, it follows that α^∞ = (α′)^∞: for any N > 0, the prefix of length N of α^∞ is the same as that of (α′)^∞.
Proof of Lemma 4.7
Now, by Lemma 4.6, there is a cycle of weight at most w(c) in the prefix graph covering all strings in c and c′, contradicting the fact that C is a minimum weight cycle cover. So we have |overlap(r, r′)| < w(c) + w(c′).
Approximation ratio of Algorithm Superstring
Theorem 4.8 Algorithm Superstring achieves an approximation factor of 4 for the shortest superstring problem.
Proof
r_i is a representative string for c_i.
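One way to complete the argument is sketched below. It assumes that the r_i are numbered in order of leftmost occurrence in a shortest superstring of the representatives (whose length is at most OPT), that Lemma 4.7 is applied to consecutive pairs, and that w(C) ≤ OPT because the minimum weight cycle cover lower-bounds OPT.

```latex
\begin{align*}
|\sigma(c_1) \circ \dots \circ \sigma(c_k)|
  &= \sum_{i=1}^{k} |\sigma(c_i)| = w(C) + \sum_{i=1}^{k} |r_i|, \\
\mathrm{OPT}
  &\ge \sum_{i=1}^{k} |r_i| - \sum_{i=1}^{k-1} |\mathrm{overlap}(r_i, r_{i+1})|
   \ge \sum_{i=1}^{k} |r_i| - 2\,w(C), \\
|\sigma(c_1) \circ \dots \circ \sigma(c_k)|
  &\le w(C) + \mathrm{OPT} + 2\,w(C) \le 4\,\mathrm{OPT}.
\end{align*}
```

The factor 2 in the second line arises because each cycle weight w(c_i) appears in at most two of the consecutive overlap bounds.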
Exercise 4.1
Show that the metric k-center problem cannot be approximated within a factor smaller than 2, unless P = NP. Hint: show that such an algorithm can solve the dominating set problem in polynomial time.
Dominating set: Given an undirected graph G = (V, E) and a number k ∈ N, is there a dominating set X ⊆ V(G) with |X| ≤ k?