Approximation Algorithms for Prize-Collecting Forest Problems with Submodular Penalty Functions
Chaitanya Swamy, University of Waterloo
Joint work with Yogeshwer Sharma and David Williamson, Cornell University
Prize-collecting Steiner tree (PCST)
Given: graph $G=(V,E)$, edge costs $c_e \ge 0$, root $r \in V$, penalties $p_v \ge 0$ on vertices
Goal: choose a set of edges $F \subseteq E$ so as to minimize
$\sum_{e \in F} c_e + \sum_{v \text{ not connected to } r} p_v$
i.e., cost of edges picked + penalty of nodes disconnected from $r$
Bienstock et al.: gave a 3-approx. LP-rounding algorithm
Goemans-Williamson (GW): gave a primal-dual 2-approx. algorithm
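As a concrete reading of the PCST objective, the following minimal sketch evaluates the cost of a candidate edge set. The function name `pcst_cost` and the dictionary-based graph encoding are illustrative choices, not part of the talk; it only evaluates a solution and does not implement either approximation algorithm.

```python
from collections import defaultdict

def pcst_cost(cost, penalty, root, chosen):
    """Return (cost of chosen edges) + (penalties of vertices not connected to root).

    cost:    dict mapping an edge (u, v) to its cost c_e
    penalty: dict mapping a vertex v to its penalty p_v
    chosen:  list of edges (keys of `cost`) forming the candidate set F
    """
    # Build adjacency lists for the subgraph (V, F).
    adj = defaultdict(list)
    for u, v in chosen:
        adj[u].append(v)
        adj[v].append(u)

    # Depth-first search from the root to find the connected vertices.
    reached = {root}
    stack = [root]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in reached:
                reached.add(w)
                stack.append(w)

    edge_cost = sum(cost[e] for e in chosen)
    penalty_cost = sum(p for v, p in penalty.items() if v not in reached)
    return edge_cost + penalty_cost

# Toy instance (hypothetical data): path r - a - b.
cost = {("r", "a"): 1, ("a", "b"): 5}
penalty = {"a": 3, "b": 2}
print(pcst_cost(cost, penalty, "r", [("r", "a")]))
```

On this toy instance, buying only the cheap edge and paying the penalty for `b` is cheaper than connecting everything or buying nothing.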
PCST with submodular penalty function
Given: graph $G=(V,E)$, edge costs $c_e \ge 0$, root $r \in V$; penalty is given by a set function $p: 2^V \to \mathbb{R}_{\ge 0}$
– $p(A)$: penalty if set $A \subseteq V$ is disconnected from $r$
– $p$ is submodular: $p(A) + p(B) \ge p(A \cap B) + p(A \cup B)$, e.g., $p(A) = \min(|A|, M)$
Goal: choose a set of edges $F \subseteq E$ so as to minimize
$\sum_{e \in F} c_e + p(\{v : v \text{ not connected to } r\})$
Generalizes penalty function of PCST
Introduced by Hayrapetyan-S-Tardos: gave a 2-approximation algorithm by extending GW primal-dual algorithm
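The submodularity condition can be checked by brute force on a small ground set. The sketch below does this for the capped-cardinality example $p(A) = \min(|A|, M)$; the helper names (`subsets`, `is_submodular`, `capped_cardinality`) are made up for illustration, and the check is exponential in the ground-set size, so it is only a sanity test.

```python
from itertools import combinations

def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

def is_submodular(p, ground):
    """Check p(A) + p(B) >= p(A | B) + p(A & B) for all pairs of subsets."""
    all_sets = list(subsets(ground))
    return all(p(A) + p(B) >= p(A | B) + p(A & B)
               for A in all_sets for B in all_sets)

def capped_cardinality(M):
    """The example penalty from the slide: p(A) = min(|A|, M)."""
    return lambda A: min(len(A), M)

print(is_submodular(capped_cardinality(2), range(4)))   # the slide's example
print(is_submodular(lambda A: len(A) ** 2, range(4)))   # a non-submodular contrast
```

The squared-cardinality function is included only as a contrast: for two disjoint singletons it gives $1 + 1 < 4 + 0$, violating the inequality.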
Prize-collecting Steiner forest (PCSF)
Given: graph $G=(V,E)$, edge costs $c_e \ge 0$, source-sink pairs $s_i$-$t_i$, penalties $p_i \ge 0$ on each $s_i$-$t_i$ pair
Goal: choose a set of edges $F \subseteq E$ so as to minimize
$\sum_{e \in F} c_e + \sum_{i :\ s_i \text{ not connected to } t_i \text{ in } F} p_i$
Generalizes connectivity function of PCST
Introduced by Jain-Hajiaghayi: gave a 3-approx. primal-dual algorithm
General framework for Prize-Collecting Forest Problems
Prize-collecting Steiner tree is generalized by both
– PCST with submodular penalty function
– Prize-collecting Steiner forest
Prize-Collecting Forest (PCF) generalizes both of these
– connectivity function: arbitrary 0-1 function
– penalty function: submodular function on collections of sets of vertices
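Prize-Collecting Forest (PCF)
Given: graph $G=(V,E)$ ($|V|=n$), edge costs $c_e \ge 0$,
connectivity function $f: 2^V \to \{0,1\}$
– $f(S)=1$ means we need an edge from the border of $S$, $\delta(S) := \{(u,v) \in E : \text{exactly one of } u, v \text{ is in } S\}$
penalty function $p: 2^{2^V} \to \mathbb{R}_{\ge 0}$
– $p(\mathcal{S})$: penalty if collection $\mathcal{S}$ of subsets is violated
Goal: choose a set of edges $F \subseteq E$ so as to minimize
$\sum_{e \in F} c_e + p(\{S \subseteq V : f(S)=1,\ F \cap \delta(S) = \emptyset\})$
where the argument of $p$ is the collection of violated subsets
Example: Prize-collecting Steiner forest
– $f(S) = 1$ iff there exists some $i$ such that exactly one of $s_i, t_i$ is in $S$
– $p(\mathcal{S}) = \sum_{i :\ \exists S \in \mathcal{S} \text{ that separates } s_i\text{-}t_i} p_i$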
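The PCSF example can be written out directly as a pair of oracles. This is a minimal sketch in which the names `separates`, `f_pcsf` and `p_pcsf` and the list-based encoding of pairs are assumptions made for illustration; a collection of subsets is passed as a list of Python sets.

```python
def separates(S, pair):
    """True if exactly one endpoint of the pair (s_i, t_i) lies in S."""
    s, t = pair
    return (s in S) != (t in S)

def f_pcsf(S, pairs):
    """Connectivity function: 1 iff S separates some source-sink pair."""
    return 1 if any(separates(S, pair) for pair in pairs) else 0

def p_pcsf(collection, pairs, penalties):
    """Penalty of a collection: sum of p_i over pairs separated by some member."""
    return sum(pen for pair, pen in zip(pairs, penalties)
               if any(separates(S, pair) for S in collection))

# Toy instance (hypothetical data) on V = {1, 2, 3, 4}.
pairs = [(1, 2), (3, 4)]
penalties = [5, 7]
V = {1, 2, 3, 4}
A = {1}
print(f_pcsf(A, pairs), p_pcsf([A, {1, 3}], pairs, penalties))
# Complement property on this instance: p({A, A^c}) equals p({A}).
print(p_pcsf([A, V - A], pairs, penalties) == p_pcsf([A], pairs, penalties))
```

Each pair is charged at most once no matter how many sets in the collection separate it, which is what makes this penalty depend on the collection rather than on individual sets.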
PCF: properties of p(.)
– $p(\emptyset) = 0$
– Monotonicity: if $\mathcal{S} \subseteq \mathcal{T}$ then $p(\mathcal{S}) \le p(\mathcal{T})$
– Submodularity: $p(\mathcal{S}) + p(\mathcal{T}) \ge p(\mathcal{S} \cap \mathcal{T}) + p(\mathcal{S} \cup \mathcal{T})$
– Complement property: for $A \subseteq V$, $p(\{A, A^c\}) = p(\{A\})$
– Union property: for $A, B \subseteq V$, $p(\{A, B, A \cup B\}) = p(\{A, B\})$
– Inactivity property: if $f(A) = 0$, then $p(\{A\}) = 0$
For any 0-1 connectivity function $f$, can define penalty function $p_f(\mathcal{S}) = M$ (a very large number) if $\exists S \in \mathcal{S}$ with $f(S) = 1$, and $0$ otherwise.
Solving PCF with $(f, p_f)$ is equivalent to solving the network design problem with connectivity function $f$, so we need certain restrictions on p(.)
If $f(\emptyset) = 0$, then $f$ is 0-1 proper iff $p_f$ satisfies the above properties.
p(.) will be given as an oracle (ground set has $2^{|V|}$ elements)
Our Results
Give a primal-dual 3-approximation algorithm
– Requires novel ideas in implementation and analysis to overcome difficulties caused by the exponential size of the ground set of p(.)
Give an LP-rounding 2.54-approximation algorithm
– solving the LP relaxation poses a significant challenge
– LP has $2^n$ constraints and $2^{2^n}$ variables: not clear if even a basic solution has a polynomial description
– Reformulate LP as a convex program, solve via ellipsoid method; evaluating the objective function and computing a subgradient both require solving an LP of size $2^n \times 2^{2^n}$
– overcome difficulty by proving certain structural properties; also required for the rounding procedure
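An Integer Program
$x_e$: indicates if edge $e$ is picked
$z_{\mathcal{S}}$: indicates if penalty is incurred for collection $\mathcal{S} \subseteq 2^V$
Minimize $\sum_e c_e x_e + \sum_{\mathcal{S}} p(\mathcal{S}) z_{\mathcal{S}}$
subject to $\sum_{e \in \delta(S)} x_e + \sum_{\mathcal{S} : S \in \mathcal{S}} z_{\mathcal{S}} \ge f(S)$ for each $S \subseteq V$
$x_e, z_{\mathcal{S}} \in \{0, 1\}$ for each $e$, $\mathcal{S}$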
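A Linear Program
$x_e$: indicates if edge $e$ is picked
$z_{\mathcal{S}}$: indicates if penalty is incurred for collection $\mathcal{S} \subseteq 2^V$
Minimize $\sum_e c_e x_e + \sum_{\mathcal{S}} p(\mathcal{S}) z_{\mathcal{S}}$ (PCF-LP)
subject to $\sum_{e \in \delta(S)} x_e + \sum_{\mathcal{S} : S \in \mathcal{S}} z_{\mathcal{S}} \ge f(S)$ for each $S \subseteq V$
$x_e, z_{\mathcal{S}} \ge 0$ for each $e$, $\mathcal{S}$ (relaxing the integrality constraint $x_e, z_{\mathcal{S}} \in \{0, 1\}$)
LP has $2^{2^n}$ variables and $2^n$ constraints
Not clear if even a basic solution has a polynomial-size description: what does "solving the LP" mean?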
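A Compact Formulation
$x_e$: indicates if edge $e$ is picked
$z_{\mathcal{S}}$: indicates if penalty is incurred for collection $\mathcal{S} \subseteq 2^V$
Minimize $h(x) := \sum_e c_e x_e + g(x)$ s.t. $0 \le x_e \le 1$ for each $e$ (PCF-CP)
where $g(x) := \min \sum_{\mathcal{S}} p(\mathcal{S}) z_{\mathcal{S}}$ (Pen-P)
s.t. $\sum_{\mathcal{S} : S \in \mathcal{S}} z_{\mathcal{S}} \ge f(S) - \sum_{e \in \delta(S)} x_e$ for each $S \subseteq V$
$z_{\mathcal{S}} \ge 0$ for each $\mathcal{S}$
$g(x)$ is convex, so (PCF-CP) is a convex program
Equivalent to earlier LP.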
The Overall Strategy
1. Get an optimal (or $(1+\epsilon)$-optimal) solution $x$ to the convex program using the ellipsoid method.
2. Round fractional solution $x$ to an integer solution
– need that $f$ is a 0-1 proper function, or is weakly supermodular
– use a 2-approx. algorithm for the network-design problem without penalties (Goemans-Williamson or Jain).
Obtain a 2.54-approximation algorithm for the prize-collecting forest problem.
The Ellipsoid Method
Min $h(x)$ subject to $x \in P$.
Start with ball containing polytope $P$.
$y_i$ = center of current ellipsoid.
If $y_i$ is infeasible, use violated inequality to chop off infeasible half-ellipsoid.
New ellipsoid = min. volume ellipsoid containing "unchopped" half-ellipsoid.
If $y_i \in P$: how to make progress?
– add inequality $h(x) \le h(y_i)$? Separation becomes difficult.
– $d \in \mathbb{R}^m$ is a subgradient of h(.) at $u$ if for every $v$, $h(v) - h(u) \ge d \cdot (v - u)$.
– Let $d$ = subgradient at $y_i$. Use subgradient cut $d \cdot (x - y_i) \le 0$. Generate new min. volume ellipsoid.
$x_1, x_2, \ldots, x_k$: points in $P$. Can show $\min_{i=1,\ldots,k} h(x_i) \le OPT + \epsilon$.
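The loop described on this slide can be sketched in a few lines for the special case where $P$ is a box. This is a generic central-cut ellipsoid method using the standard textbook update formulas, not the authors' implementation; the function names, the box-shaped feasible region, and the toy objective are all assumptions made for illustration, and the update requires dimension $n \ge 2$.

```python
import math

def box_cut(center, lower, upper):
    """Return a cut direction d for a violated box constraint, or None if feasible."""
    for i, (c, lo, hi) in enumerate(zip(center, lower, upper)):
        if c < lo or c > hi:
            d = [0.0] * len(center)
            d[i] = -1.0 if c < lo else 1.0   # feasible points satisfy d.(x - center) <= 0
            return d
    return None

def ellipsoid_minimize(h, subgrad, lower, upper, iters=200):
    """Minimize convex h over the box [lower, upper] (dimension n >= 2)."""
    n = len(lower)
    center = [(lo + hi) / 2 for lo, hi in zip(lower, upper)]
    radius_sq = sum(((hi - lo) / 2) ** 2 for lo, hi in zip(lower, upper))
    # Ellipsoid {x : (x - center)^T A^{-1} (x - center) <= 1}; start with a ball.
    A = [[radius_sq if i == j else 0.0 for j in range(n)] for i in range(n)]
    best_x, best_val = None, math.inf
    for _ in range(iters):
        d = box_cut(center, lower, upper)
        if d is None:                         # feasible center: record it, use subgradient cut
            val = h(center)
            if val < best_val:
                best_x, best_val = center[:], val
            d = subgrad(center)
        Ad = [sum(A[i][j] * d[j] for j in range(n)) for i in range(n)]
        dAd = sum(d[i] * Ad[i] for i in range(n))
        if dAd <= 0.0:                        # zero subgradient (or numerical breakdown)
            break
        step = math.sqrt(dAd)
        # Minimum-volume ellipsoid containing the half-ellipsoid d.(x - center) <= 0.
        center = [center[i] - Ad[i] / ((n + 1) * step) for i in range(n)]
        scale = n * n / (n * n - 1.0)
        A = [[scale * (A[i][j] - 2.0 / (n + 1) * Ad[i] * Ad[j] / dAd)
              for j in range(n)] for i in range(n)]
    return best_x, best_val

# Toy nonsmooth convex objective (hypothetical), minimized at (0.3, 0.7).
def h(x):
    return abs(x[0] - 0.3) + 2 * abs(x[1] - 0.7)

def h_subgrad(x):
    sign = lambda t: (t > 0) - (t < 0)
    return [float(sign(x[0] - 0.3)), 2.0 * sign(x[1] - 0.7)]

best_x, best_val = ellipsoid_minimize(h, h_subgrad, [0.0, 0.0], [1.0, 1.0])
print(best_x, best_val)
```

The sketch returns the best feasible center seen, mirroring the claim that the minimum of $h$ over the feasible centers approaches OPT. In the talk's setting the expensive part is the subgradient oracle, which this toy example sidesteps.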
Computing a subgradient
$h(x) := \sum_e c_e x_e + g(x)$
$g(x) := \min \sum_{\mathcal{S}} p(\mathcal{S}) z_{\mathcal{S}}$
s.t. $\sum_{\mathcal{S} : S \in \mathcal{S}} z_{\mathcal{S}} \ge f(S) - \sum_{e \in \delta(S)} x_e \ \ \forall S \subseteq V$; $z_{\mathcal{S}} \ge 0 \ \ \forall \mathcal{S}$
$\phantom{g(x) :}= \max \sum_S \bigl(f(S) - \sum_{e \in \delta(S)} x_e\bigr) y_S$
s.t. $\sum_{S \in \mathcal{S}} y_S \le p(\mathcal{S}) \ \ \forall \mathcal{S}$; $y_S \ge 0 \ \ \forall S$
Consider a point $u \in \mathbb{R}^m$. Let $y$ be an optimal dual solution to $g(u)$. So
$h(u) = \sum_e c_e u_e + \sum_S \bigl(f(S) - \sum_{e \in \delta(S)} u_e\bigr) y_S = \sum_e d_e u_e + \sum_S f(S) y_S$
where $d_e = c_e - \sum_{S : e \in \delta(S)} y_S$.
At any point $v \in \mathbb{R}^m$, $y$ is a feasible solution to the dual of $g(v)$, so
$h(v) \ge \sum_e c_e v_e + \sum_S \bigl(f(S) - \sum_{e \in \delta(S)} v_e\bigr) y_S = \sum_e d_e v_e + \sum_S f(S) y_S$
Lemma: For any point $v \in \mathbb{R}^m$, we have $h(v) - h(u) \ge d \cdot (v - u)$, so $d$ is a subgradient of h(.) at point $u$.
Solving the dual
$g(x) = \max \sum_S [f(S) - x(\delta(S))] y_S$ (Pen-D)
s.t. $\sum_{S \in \mathcal{S}} y_S \le p(\mathcal{S})$ for all $\mathcal{S} \subseteq 2^V$
$y_S \ge 0$ for all $S$
Notation: $x(\delta(S)) = \sum_{e \in \delta(S)} x_e$, $p_{\mathcal{S}}(A) = p(\mathcal{S} \cup \{A\}) - p(\mathcal{S})$
Bad: Dual has $2^n$ variables and $2^{2^n}$ constraints
Good: It is a polymatroid: p(.) is a monotone submodular function. Edmonds' greedy algorithm yields an optimal solution
– Sort the sets $S$ in decreasing order of $[f(S) - x(\delta(S))]$
– For the $i$-th set $S_i$, if $[f(S_i) - x(\delta(S_i))] > 0$, set $y_{S_i} = p_{\{S_1, \ldots, S_{i-1}\}}(S_i)$
Bad: Reduces complexity to $2^n$, but still not polytime
Good: Show that there is an optimal solution where the sets $S$ with $y_S > 0$ form a laminar family (key structural lemma)
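Edmonds' greedy rule can be sketched on a generic polymatroid. In the sketch below the ground elements are arbitrary labels with weights $w_e$ (in the talk they would be the sets $S$ with weight $f(S) - x(\delta(S))$), and `p` is any monotone submodular oracle with $p(\emptyset) = 0$; the function name `polymatroid_greedy` and the toy data are illustrative assumptions.

```python
def polymatroid_greedy(weights, p):
    """Maximize sum_e w_e * y_e subject to y(S) <= p(S) for all S, y >= 0.

    weights: dict mapping each ground element to its weight w_e
    p:       monotone submodular set function on frozensets, with p(empty) = 0
    """
    # Process elements in decreasing order of weight.
    order = sorted(weights, key=weights.get, reverse=True)
    y = {e: 0.0 for e in weights}
    prefix = frozenset()
    for e in order:
        if weights[e] <= 0:
            break                      # non-positive weights keep y_e = 0
        # Marginal value of e with respect to the elements processed so far.
        y[e] = p(prefix | {e}) - p(prefix)
        prefix = prefix | {e}
    return y

# Toy instance (hypothetical): capped-cardinality function with M = 2.
p = lambda A: min(len(A), 2)
weights = {"a": 3, "b": 2, "c": 1, "d": -1}
y = polymatroid_greedy(weights, p)
print(y, sum(weights[e] * y[e] for e in weights))
```

The rule needs one oracle call per ground element, which is why it still takes time proportional to $2^n$ when the ground set is all subsets of $V$.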
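Useful properties of p(.)
– If $A, B \in \mathcal{S}$, then $p_{\mathcal{S}}(T) = p_{\mathcal{S}}(T^c) = 0$ for all sets $T$ in $\{A \cap B, A \cup B, A \setminus B, B \setminus A, A^c, B^c\}$ (due to complementarity and union properties)
– If $p(\{A\}) = 0$, then for any $B \subseteq V$, $p_{\mathcal{S} \cup \{A\}}(B) = p_{\mathcal{S}}(B)$ (due to submodularity), so the ordering of sets $A$ with $f(A) = 0$ is irrelevant
– If $p_{\mathcal{S} \cup \{A\}}(B) = p_{\mathcal{S} \cup \{B\}}(A) = 0$, then for any set $T \subseteq V$, $p_{\mathcal{S} \cup \{A\}}(T) = p_{\mathcal{S} \cup \{B\}}(T)$ (by submodularity)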
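Solving the dual (contd.)
Structural lemma yields following algorithm:
Initialize $y_S = 0$ for all sets $S$, laminar family $\mathcal{L} \leftarrow \emptyset$.
While $\exists$ set $S$ that does not cross any set of $\mathcal{L}$
– find $T = \arg\min \{x(\delta(S)) : S \text{ does not cross } \mathcal{L}\}$
– if $x(\delta(T)) \ge 1$ return; else set $y_T = p_{\mathcal{L}}(T)$, $\mathcal{L} \leftarrow \mathcal{L} \cup \{T\}$
Theorem: $y$ is an optimal solution to (Pen-D).
Let $\mathcal{L}' = \{T \in \mathcal{L} : y_T > 0\} = \{T_1, \ldots, T_k\}$, and let $\mathcal{T}_i$ = maximal superset of $\{T_1, \ldots, T_i\}$ s.t. $p(\mathcal{T}_i) = p(\{T_1, \ldots, T_i\})$
Theorem: Setting $z_{\mathcal{T}_i} = x(\delta(T_{i+1})) - x(\delta(T_i))$ (with $x(\delta(T_{k+1})) := 1$) for $i = 1, \ldots, k$, and $z_{\mathcal{S}} = 0$ for all other $\mathcal{S}$, yields an optimal solution to (Pen-P).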
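A short telescoping check shows why the primal solution in the second theorem covers the sets $T_1, \ldots, T_k$. This is a sketch under two assumptions that the slide suggests but does not state: the sets are indexed in the order the algorithm adds them, so that $x(\delta(T_1)) \le \cdots \le x(\delta(T_k)) < 1$ and every $z_{\mathcal{T}_i} \ge 0$; and each collection $\mathcal{T}_i$ contains $T_1, \ldots, T_i$. For any $j \le k$:

```latex
\sum_{\mathcal{S} \,:\, T_j \in \mathcal{S}} z_{\mathcal{S}}
  \;\ge\; \sum_{i=j}^{k} z_{\mathcal{T}_i}
  \;=\; \sum_{i=j}^{k} \bigl[ x(\delta(T_{i+1})) - x(\delta(T_i)) \bigr]
  \;=\; x(\delta(T_{k+1})) - x(\delta(T_j))
  \;=\; 1 - x(\delta(T_j))
  \;\ge\; f(T_j) - x(\delta(T_j)).
```

The first inequality uses $T_j \in \mathcal{T}_i$ for every $i \ge j$ together with $z \ge 0$; the last uses $f(T_j) \le 1$. This only verifies the (Pen-P) constraint for the sets in the laminar family; the constraint for the remaining sets, and optimality, rely on the structural lemma.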
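Rounding procedure
Given: fractional solution $x$, sets $T_1, \ldots, T_k$
– gives succinct description of collections $\mathcal{T}_1, \ldots, \mathcal{T}_k$, and hence optimal solution $z$ to (Pen-P)
Let $\alpha \in [0, 1]$ be a parameter.
– Define 0-1 connectivity function $f_\alpha(S) = 1$ if $f(S) = 1$ and $\sum_{\mathcal{S} : S \in \mathcal{S}} z_{\mathcal{S}} < \alpha$; $0$ otherwise.
– Solve network design problem with connectivity function $f_\alpha$.
If $f$ is proper or weakly supermodular, then so is $f_\alpha$, therefore cost of edges picked is bounded
Penalty is at most $p(\{S \subseteq V : \sum_{\mathcal{S} : S \in \mathcal{S}} z_{\mathcal{S}} \ge \alpha\}) \le \bigl[\sum_{\mathcal{S}} p(\mathcal{S}) z_{\mathcal{S}}\bigr] / \alpha$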
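The two bounds on this slide combine into a simple guarantee for a fixed threshold. In the sketch below, $\alpha \in (0,1)$ denotes the threshold parameter and $f_\alpha$ the thresholded connectivity function (symbol names chosen here), and the 2-approximation for network design is assumed to be relative to the cut-covering LP, which is an assumption of this sketch rather than a statement from the talk. For every $S$ with $f_\alpha(S) = 1$, the constraint of (PCF-LP) gives

```latex
x(\delta(S)) \;\ge\; f(S) - \sum_{\mathcal{S} \,:\, S \in \mathcal{S}} z_{\mathcal{S}} \;>\; 1 - \alpha
\quad\Longrightarrow\quad
\frac{x}{1-\alpha} \text{ is a feasible fractional solution for } f_\alpha .
```

Writing $F$ for the edge set returned, the total cost is then bounded by

```latex
\sum_{e \in F} c_e + \text{penalty}
  \;\le\; \frac{2}{1-\alpha} \sum_e c_e x_e \;+\; \frac{1}{\alpha} \sum_{\mathcal{S}} p(\mathcal{S}) z_{\mathcal{S}}
  \;\le\; \max\!\left( \frac{2}{1-\alpha},\, \frac{1}{\alpha} \right) \cdot OPT_{LP},
\qquad
\alpha = \tfrac{1}{3} \;\Rightarrow\; \text{factor } 3 .
```

This deterministic choice only gives a factor of 3; the 2.54 factor stated in the talk needs a more careful choice of $\alpha$, which this sketch does not reproduce.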
Open Questions
– Is there a compact description of the LP? Or a more efficient procedure to solve it?
– Obtaining a 2-approximation algorithm: iterative rounding may be the way to go
– Applications to 2-stage stochastic network design: can the second-stage cost be captured by a "nice" penalty function?
– Extensions to higher connectivity requirements.
Thank You.