The Matroid Median Problem
Viswanath Nagarajan, IBM Research
Joint work with R. Krishnaswamy, A. Kumar, Y. Sabharwal, B. Saha
k-Median Problem
Set of locations in a metric space (V, d): d is symmetric and satisfies the triangle inequality.
Place k facilities such that the sum of connection costs (each location to its nearest facility) is minimized:
$\min_{F \subseteq V,\ |F| \le k} \ \sum_{u \in V} d(u, F)$
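To make the objective concrete, here is a minimal brute-force sketch for tiny instances. The function names and the toy line metric are illustrative assumptions, not part of the talk, and the enumeration is exponential in k.

```python
from itertools import combinations

def connection_cost(points, dist, facilities):
    """Sum over all points of the distance to the nearest open facility."""
    return sum(min(dist(u, f) for f in facilities) for u in points)

def brute_force_k_median(points, dist, k):
    """Enumerate every facility set of size exactly k.

    Opening more facilities never increases the cost, so size-k sets suffice
    (assumes k <= len(points)).
    """
    best = min(combinations(points, k),
               key=lambda F: connection_cost(points, dist, F))
    return set(best), connection_cost(points, dist, best)

# Toy instance (assumed for illustration): points on a line, d(u, v) = |u - v|.
points = [0, 1, 2, 10, 11]
facilities, cost = brute_force_k_median(points, lambda u, v: abs(u - v), k=2)
```

On this toy instance one facility serves the cluster {0, 1, 2} and the other serves {10, 11}.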
k-Median Results
poly(log n)-approximation via tree embeddings [B ’96]
LP rounding, O(1)-approximation [CGST ’99]
Lagrangian relaxation + primal-dual [JV ’01]
Local search with p-exchanges [AGKMMP ’04]: best known ratio $3 + \epsilon$
Hardness of approximation $\approx 1.46$ [GK ’98]
Red-Blue Median
Facilities are of two different types: partition V into red and blue sets.
Separate bounds $k_r$ and $k_b$ on the number of red and blue facilities.
Recently introduced [HKK ’10].
Motivated by Content Distribution Networks with T facility types (RB Median is T = 2).
O(1)-approximation ratio via local search.
[Figure: example instance with $k_r = 3$, $k_b = 2$]
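Matroid Median
Given a matroid M on ground set V.
Locate facilities F that are independent in M, minimizing the connection cost.
Recap: matroid $M = (V, \mathcal{I} \subseteq 2^V)$ with the exchange property: $A, B \in \mathcal{I}$ and $|A| < |B|$ $\Rightarrow$ $\exists\, e \in B \setminus A : A \cup \{e\} \in \mathcal{I}$.
Substantial generalization of RB Median.
The CDN application with T facility types reduces to a partition matroid constraint.
[Figure: exchange property with sets A, B and element e; partition matroid with bounds $k_1 = 2$, $k_2 = 3$, $k_3 = 1$, $k_4 = 2$]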
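As an illustration of the partition matroid constraint, the sketch below builds an independence oracle and brute-force checks the exchange property on a small ground set. The names `make_partition_oracle` and `satisfies_exchange` and the red/blue toy data are assumptions made for this example only.

```python
from collections import Counter
from itertools import combinations

def make_partition_oracle(part_of, bounds):
    """Independence oracle: S is independent iff it uses at most bounds[p]
    elements from every part p."""
    def is_independent(S):
        counts = Counter(part_of[e] for e in S)
        return all(counts[p] <= bounds[p] for p in counts)
    return is_independent

def satisfies_exchange(ground, is_independent):
    """Brute-force check of the exchange property (tiny ground sets only)."""
    subsets = [frozenset(c) for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    indep = [S for S in subsets if is_independent(S)]
    for A in indep:
        for B in indep:
            if len(A) < len(B) and not any(is_independent(A | {e}) for e in B - A):
                return False
    return True

# Toy red-blue instance (assumed): at most 2 red and 1 blue facility.
part_of = {"a": "red", "b": "red", "c": "red", "d": "blue", "e": "blue"}
bounds = {"red": 2, "blue": 1}
oracle = make_partition_oracle(part_of, bounds)
```

With T parts the same oracle models the CDN setting; only `part_of` and `bounds` change.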
Talk Outline
Thm: 16-approximation for Matroid Median
Bad example for local search
LP relaxation
Phase I: sparsification
Phase II: reformulation
Local Search?
Partition matroid with T parts.
(T-1)-exchange local search: swap up to T-1 facilities in each step.
Unlikely to work beyond T = O(1).
Example (T = 5): uniform metric on T+1 points, with n = mT + 1 clients.
OPT = 1 (small fac.), LOPT = m (big fac.)
Locality gap $\Omega(n/T)$.
[Figure: T+1 points with client counts m, m, m, m, m, 1]
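LP relaxation
$\min \ \sum_{u \in V} \sum_{v \in V} d(u,v) \cdot x_{uv}$
s.t. $\sum_{v \in V} x_{uv} = 1 \quad \forall u \in V$ (connection constraints)
$x_{uv} \le y_v \quad \forall u, v \in V$
$\sum_{v \in S} y_v \le r(S) \quad \forall S \subseteq V$ (matroid rank constraints, i.e. $y \in M$)
$x, y \ge 0$
[Figure: bipartite graph between facilities v and clients u, edges labeled $x_{uv}$]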
Solving the LP
Exponential number of rank constraints.
Use a separation oracle: $\min_{S \subseteq V} \left( r(S) - \sum_{v \in S} y_v \right)$, an instance of submodular minimization.
There are also more efficient algorithms to separate over the matroid polytope [C ’84].
Solvable in poly-time via the Ellipsoid algorithm.
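The separation problem can be illustrated by brute force on a tiny ground set. This sketch enumerates all subsets, so it only shows what the oracle must return; it is not the submodular-minimization routine or the algorithm of [C ’84]. The function name and the uniform-matroid toy data are assumptions.

```python
from itertools import combinations

def most_violated_rank_constraint(ground, rank, y):
    """Return (S, slack) minimizing slack = r(S) - y(S) over nonempty S.

    A negative slack means the rank constraint for S is violated.
    Returns (frozenset(), 0.0) when no constraint is violated.
    """
    best_set, best_slack = frozenset(), 0.0
    for size in range(1, len(ground) + 1):
        for c in combinations(ground, size):
            S = frozenset(c)
            slack = rank(S) - sum(y[v] for v in S)
            if slack < best_slack - 1e-12:  # small tolerance against float noise
                best_set, best_slack = S, slack
    return best_set, best_slack

# Toy matroid (assumed): uniform matroid of rank 2 on four elements.
ground = ["a", "b", "c", "d"]

def rank(S):
    return min(len(S), 2)

y_feasible = {v: 0.5 for v in ground}
y_infeasible = {"a": 0.9, "b": 0.9, "c": 0.9, "d": 0.0}
violated_set, slack = most_violated_rank_constraint(ground, rank, y_infeasible)
```

Inside the Ellipsoid method, a returned set with negative slack yields the violated inequality $\sum_{v \in S} y_v \le r(S)$ as the separating hyperplane.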
Idea for approach (1)
The problem is non-trivial even if the metric is a tree: even an O(log n)-approximation is not obvious.
What's easier than a tree?
Suppose the input is a special star-like instance:
One root facility (can help any client).
All other facilities are private (each helps only one client).
[Figure: root facility connected to client 1, client 2, client 3]
Idea for approach (2)
Recall the LP variables: $y_j$ is facility opening (in the matroid polytope), $x_{ij}$ is connection.
For any client i and private facility $j \in P(i)$, we may assume (WMA) $x_{ij} = y_j$.
Connection constraint: $\sum_j x_{ij} = 1$.
So $x_{ir} = 1 - \sum_{j \in P(i)} x_{ij} = 1 - \sum_{j \in P(i)} y_j$.
Can eliminate all connection variables!
[Figure: client i with root r and private facilities P(i)]
Idea for approach (3)
Reformulate the LP:
$\min \ \sum_i \left[ \sum_{j \in P(i)} d_{ij} \cdot y_j + d_{ir} \cdot \left( 1 - \sum_{j \in P(i)} y_j \right) \right]$
s.t. $\sum_{j \in P(i)} y_j \le 1 \quad \forall$ clients i (to ensure $x_{ir} \ge 0$)
$y \in M$ (matroid constraint)
This is just an instance of the intersection of M with the partition matroid given by the P(i)s.
Idea for approach (4)
Start with the LP optimum (x, y) of an arbitrary matroid median instance.
Phase I: use (x, y) to form clusters of disjoint star-like instances.
Phase II: re-solve the new star-LP, because (x, y) itself restricted to the stars is not integral.
Show that the new LP is integral ($\approx$ matroid intersection).
Phase I: sparsify LP solution
Outline
Modify the LP connections x in four steps, similar to [CGST ’99].
Key: no change in the facility variables y.
Need to ensure y remains in the matroid polytope.
Not true in [CGST ’99].
Requires some more (technical) work.
Step 1: cluster clients
$L_u = \sum_v d_{uv} \cdot x_{uv}$, the contribution of u to the LP objective.
B(u) is the local ball of u: vertices within distance $2 \cdot L_u$ from u.
Order clients u in increasing $L_u$.
Pick a maximal disjoint set of local balls; T is the set of chosen clients.
Move each client to a T-client close to it.
Loss in objective $\le 4 \cdot LP^*$ (additive).
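The greedy selection in Step 1 can be sketched as follows, with $L_u = \sum_v d_{uv} x_{uv}$ and B(u) the set of vertices within distance $2 L_u$ of u. Two things are assumptions of this sketch rather than statements from the talk: every vertex is treated as both client and facility, and an unchosen client is moved to the first chosen client whose ball blocks it (one natural reading of "close to it").

```python
def cluster_clients(vertices, dist, x):
    """Greedy Step 1 sketch.

    x[u] maps facility v -> fractional connection x_uv (summing to 1).
    Returns (L, ball, chosen, moved_to).
    """
    # L_u: fractional connection cost of u in the LP solution.
    L = {u: sum(dist(u, v) * x_uv for v, x_uv in x[u].items()) for u in vertices}
    # Local ball: all vertices within distance 2 * L_u of u.
    ball = {u: {v for v in vertices if dist(u, v) <= 2 * L[u]} for u in vertices}

    chosen, moved_to = [], {}
    for u in sorted(vertices, key=lambda u: L[u]):  # increasing L_u
        blocker = next((t for t in chosen if not ball[u].isdisjoint(ball[t])), None)
        if blocker is None:
            chosen.append(u)        # ball of u is disjoint from all chosen balls
            moved_to[u] = u
        else:
            moved_to[u] = blocker   # assumption: move u to the blocking T-client
    return L, ball, chosen, moved_to

# Toy fractional solution (assumed) on the line metric d(u, v) = |u - v|.
vertices = [0, 1, 2, 10, 11]
x = {0: {1: 1.0}, 1: {1: 1.0}, 2: {1: 1.0},
     10: {10: 0.5, 11: 0.5}, 11: {10: 0.5, 11: 0.5}}
L, ball, chosen, moved_to = cluster_clients(vertices, lambda u, v: abs(u - v), x)
```

Processing clients in increasing $L_u$ means a blocked client always has an $L$ value at least as large as that of its blocker, which is what keeps the move cheap.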
Obs on step 1
Local balls of T clients are disjoint.
The y-value inside any local ball is $\ge 1/2$ (Markov inequality).
Restrict to clients T (now weighted).
For any $p, q \in T$: $d(p,q) \ge 2 \cdot (L_p + L_q)$, i.e. the clients in T are well separated.
[Figure: two separated local balls, each with y-value $\ge 1/2$]
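The Markov step can be spelled out as follows. For a fixed client u, the values $x_{uv}$ form a probability distribution over facilities whose expected distance from u is $L_u$; the derivation below uses only the LP constraints $\sum_v x_{uv} = 1$ and $x_{uv} \le y_v$.

```latex
% Mass of u's connections outside the local ball B(u) = { v : d(u,v) <= 2 L_u }:
\sum_{v \notin B(u)} x_{uv}
  \;\le\; \frac{\sum_{v} d(u,v)\, x_{uv}}{2 L_u}
  \;=\; \frac{L_u}{2 L_u} \;=\; \frac{1}{2}
  \qquad (L_u > 0).

% Hence the facility mass inside the ball:
y(B(u)) \;=\; \sum_{v \in B(u)} y_v
  \;\ge\; \sum_{v \in B(u)} x_{uv}
  \;=\; 1 - \sum_{v \notin B(u)} x_{uv}
  \;\ge\; \frac{1}{2}.
```

If $L_u = 0$, all of u's connection mass sits at distance 0, so it lies inside B(u) and the bound holds trivially.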
More obs on step 1
Suppose the y-value in each T-client's local ball is $\ge 1$.
Then this is an instance of matroid intersection: the matroid M and the partition matroid from the local balls of T.
Re-solving a suitable LP $\Rightarrow$ integral solution.
In general, will need intersection with 'laminar' constraints, not just a partition matroid.
Step 2: private facilities
Ensure that each facility is either in some T-ball or helps at most one client (i.e. is private).
For a facility j outside all T-balls, break its connections from all clients except the closest one (client 1).
Reconnect to facilities in B(1), which has y-value $\ge 1/2$.
Total reconnection for any client $\le 1/2$.
Constant factor loss in objective.
Step 3: uniform objective
Each connection from client p to any facility in B(q) will pay the same objective d(p,q).
Since p, q are well separated, $d(p,q) \le O(1) \cdot d(p,j)$ for any $j \in B(q)$.
Constant factor loss in objective.
Step 4: building stars
WMA each client $i \in T$ is connected to:
its private facilities P(i), OR
its closest other client $k \in T$, i.e. a facility in B(k).
The set of 'outer' connections $\approx$ a directed tree: unique out-edge from each client.
Lem: Can modify the outer connections into a 'star'.
Constant factor loss in objective.
The star structure
One pseudo-root $\{r, r'\}$.
Every other client is connected to either r or r'.
All LP connections x are from client i to:
a private facility $j \in P(i)$, with objective d(i,j), OR
a facility in B(k) with $k \in \{r, r'\}$, with uniform objective d(i,k).
[Figure: star with pseudo-root r, r' and a client i]
Phase II: using star
Will drop all the connection x-variables.
WMA $x_{ij} = y_j$ for private facilities $j \in P(i)$.
Total outer connection $= 1 - \sum_{j \in P(i)} x_{ij} = 1 - \sum_{j \in P(i)} y_j$.
Each outer connection pays the same objective d(i,r).
Want the property (in an integral solution) that $P(i) = \emptyset \Rightarrow$ there is a recourse connection to r.
Do not quite ensure this, but…
Phase II contd.
Add the constraint $y(P(r)) + y(P(r')) \ge 1$.
Indeed feasible for (x, y), since each local ball has y-value $\ge 1/2$.
This ensures (in an integral solution) that $P(i) = \emptyset \Rightarrow$ there is a recourse connection to r or r'.
Lose another constant factor in objective.
Phase II: new LP
Apply the constraints for each star to get the LP:
$\min \ \sum_i \left[ \sum_{j \in P(i)} d_{ij} \cdot y_j + d(i, r(i)) \cdot \left( 1 - \sum_{j \in P(i)} y_j \right) \right]$
s.t. $\sum_{j \in P(i)} y_j \le 1 \quad \forall$ clients i (laminar constraints)
$y(P(r)) + y(P(r')) \ge 1 \quad \forall$ pseudo-roots $\{r, r'\}$ (laminar constraints)
$y \in M$ (matroid constraint)
Lem: Integral polytope (via a proof similar to matroid intersection).
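To see how the star LP is evaluated once the connection variables are gone, the sketch below computes its objective and checks the laminar constraints for a given vector y. It deliberately omits the matroid constraint $y \in M$, which needs a separate rank or independence oracle. All names and the toy data are assumptions for illustration.

```python
def star_lp_objective(private, d_private, d_outer, y):
    """Objective of the star LP using only facility variables.

    private[i]       -> list of private facilities P(i) of client i
    d_private[i, j]  -> distance d_ij from client i to private facility j
    d_outer[i]       -> uniform outer-connection cost d(i, r(i))
    """
    total = 0.0
    for i, P in private.items():
        inner = sum(y[j] for j in P)  # mass served by private facilities
        total += sum(d_private[i, j] * y[j] for j in P)
        total += d_outer[i] * (1 - inner)  # remaining mass goes to the root
    return total

def satisfies_laminar_constraints(private, pseudo_roots, y, tol=1e-9):
    """Check y(P(i)) <= 1 for all clients and y(P(r)) + y(P(r')) >= 1."""
    if any(sum(y[j] for j in P) > 1 + tol for P in private.values()):
        return False
    return all(
        sum(y[j] for j in private[r]) + sum(y[j] for j in private[r2]) >= 1 - tol
        for r, r2 in pseudo_roots
    )

# Toy star (assumed): pseudo-root {r, r2} plus one ordinary client i.
private = {"r": ["a"], "r2": ["b"], "i": ["c"]}
d_private = {("r", "a"): 1, ("r2", "b"): 1, ("i", "c"): 2}
d_outer = {"r": 3, "r2": 3, "i": 5}
y = {"a": 1.0, "b": 0.0, "c": 0.5}
value = star_lp_objective(private, d_private, d_outer, y)
```

Each client's term is a convex combination of its private distances and its outer cost, which is why the constraint $y(P(i)) \le 1$ is needed to keep the outer share nonnegative.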
Summary
Using the LP solution and metric properties, reduce to star-like instances.
Formulate a new LP for star-like instances, with only facility variables.
The new LP is integral.
Other Results
O(1)-approximation for the prize-collecting version of matroid median.
Knapsack Median problem (knapsack constraint on open facilities): give a bi-criteria approximation that violates the budget by $w_{\max}$.
Can we get a true O(1)-approximation?
Handle other constraints in k-median?
Thank You