Network Design and Bidimensionality Mohammad T. Hajiaghayi University of Maryland.

2 Outline  Buy-at-bulk Network Design  Prize-collecting Network Design  Bidimensionality Theory

3 Steiner Trees  Defined by Gauss in 1836  Given a graph and a subset of nodes (terminals), find a subgraph that connects these nodes (e.g., clients and a server)  Objective: Minimize the total connection cost (e.g., cable installation cost)  NP-hard [Garey and Johnson’79]  Different from Minimum Spanning Trees: the solution may use intermediate (non-terminal) nodes
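
The role of intermediate nodes can be seen in a small sketch. The code below is not the 1.39-approximation cited in these slides; it is the classic metric-closure heuristic (take a minimum spanning tree over shortest-path distances between terminals), whose cost is known to be at most twice the optimum. The graph, node names, and function names are assumptions made for illustration.

```python
import heapq
from itertools import combinations

def dijkstra(adj, src):
    """Shortest-path distances from src; adj maps node -> [(neighbour, weight)]."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def steiner_upper_bound(adj, terminals):
    """Cost of an MST of the metric closure on the terminals (Kruskal)."""
    dist = {t: dijkstra(adj, t) for t in terminals}
    edges = sorted((dist[a][b], a, b) for a, b in combinations(terminals, 2))
    parent = {t: t for t in terminals}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0
    for w, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            total += w
    return total

# Hypothetical star: terminals a, b, c each at distance 1 from Steiner node s.
star = {
    "s": [("a", 1), ("b", 1), ("c", 1)],
    "a": [("s", 1)],
    "b": [("s", 1)],
    "c": [("s", 1)],
}
heuristic_cost = steiner_upper_bound(star, ["a", "b", "c"])
```

On this star the optimal Steiner tree uses the intermediate node s and costs 3, while the heuristic pays 4 because it only sees terminal-to-terminal distances of 2.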

4 Approximating the Optimal Steiner Tree  Approximation: an algorithm is measured by its approximation factor, the worst-case ratio between the cost of the approximate solution and the cost of an optimal solution  Why are approximation algorithms important?  Approximation factors are worst-case bounds; practical performance is often much better  Can be combined with other heuristics, like local search  Give a better understanding of how to design heuristics  Provide provable lower bounds on the optimum  The best approximation factor for Steiner trees is 1.39 [BGRS’10]

5 Steiner Forests  More generally, connecting a set of pairs (e.g., multiple servers for multiple VPNs)  Objective: Minimize total connection cost  Solution is a forest, not necessarily a tree  The best approximation factor is 2 by a greedy algorithm [AKR’91, GW’95]  Let’s see a generalization with profound practical applications in telecommunication (e.g., at AT&T, Bell Labs)

6 Buy-at-Bulk Generalization  Buying bandwidth to meet demands between a set of pairs of nodes  Cost of buying bandwidth satisfies economies of scale  Different cable types like T1, T2, T3, OC12, OC48, etc.  Capacity on a link can be purchased  at discrete units: $u_1 < u_2 < \dots < u_r$  with associated costs: $c_1 < c_2 < \dots < c_r$  where: $c_1/u_1 > c_2/u_2 > \dots > c_r/u_r$ (economies of scale)  So, if you buy in bulk, you save

7 Generalization (cont’d)  A monotone non-decreasing concave (or, more generally, sub-additive) function $f_e : \mathbb{R}^+ \to \mathbb{R}^+$ for each edge e, where $f_e(b)$ is the minimum cost of cable installation with bandwidth b on edge e  [Figure: cost $f_e(b)$ plotted against bandwidth b]  Multi-Commodity Buy-at-Bulk (MC-BB): Given a set of bandwidth demand pairs, install sufficient capacities at minimum total cost
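
To make the per-edge cost function concrete, here is a minimal sketch that computes the cheapest combination of cables covering a bandwidth b, assuming integer bandwidths. The cable menu (capacity, cost) is made up for illustration and is not taken from the slides; the function name is likewise an assumption.

```python
def min_cable_cost(b, cables):
    """Cheapest multiset of cables with total capacity >= b.

    cables: list of (capacity, cost) pairs with positive integer capacity.
    Dynamic programme over the bandwidth still to be covered.
    """
    best = [0] + [float("inf")] * b
    for need in range(1, b + 1):
        for cap, cost in cables:
            rest = max(0, need - cap)  # bandwidth left after buying this cable
            best[need] = min(best[need], best[rest] + cost)
    return best[b]

# Hypothetical menu with decreasing cost per unit: 1.0, 0.75, 0.5.
CABLES = [(1, 1), (4, 3), (16, 8)]
costs = [min_cable_cost(b, CABLES) for b in range(0, 21)]
```

With this menu, 12 units of bandwidth are cheaper to cover with one large cable (cost 8) than with three medium ones (cost 9). The resulting function is non-decreasing and sub-additive, though not necessarily concave.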

8 Cost-Distance  Multi-commodity buy-at-bulk is equivalent to the cost-distance problem (up to a factor $1 + \varepsilon$):  On each edge  cost function (installation cost) $c : E \to \mathbb{R}^+$  length function (per-use routing cost) $l : E \to \mathbb{R}^+$  Also a set of pairs $(s_i, t_i)$ of nodes with a traffic demand $d_i$ between them  Goal: minimize total cost of installation plus routing

9 Cost-Distance (more formally)  Feasible solution: a subset $E'$ of $E$ such that all pairs $s_i, t_i$ are connected in $G[E']$  Cost of the solution: $\sum_{e \in E'} c(e) + \sum_i d_i \cdot l_{E'}(s_i, t_i)$, where $l_{E'}(s_i, t_i)$ is the length of the shortest $l$-path between $s_i$ and $t_i$ in $G[E']$  Goal: minimize total cost
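
The objective can be evaluated directly for a candidate edge set: pay the installation cost of every chosen edge once, then route each demand along a shortest path by length. The sketch below assumes an undirected graph given as a dictionary; the instance, node names, and function names are hypothetical.

```python
import heapq

def shortest_length(edges, src, dst):
    """Shortest src-dst distance by length using only the chosen edges."""
    adj = {}
    for (u, v), (_cost, length) in edges.items():
        adj.setdefault(u, []).append((v, length))
        adj.setdefault(v, []).append((u, length))
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return float("inf")  # pair not connected: infeasible solution

def cost_distance(edges, demands):
    """edges: {(u, v): (install_cost, length)}; demands: [(s, t, d)]."""
    install = sum(cost for cost, _ in edges.values())
    routing = sum(d * shortest_length(edges, s, t) for s, t, d in demands)
    return install + routing

# Hypothetical instance: two unit demands share the expensive edge (u, t).
chosen = {("s1", "u"): (2, 1), ("s2", "u"): (2, 1), ("u", "t"): (14, 1)}
pairs = [("s1", "t", 1), ("s2", "t", 1)]
total = cost_distance(chosen, pairs)
```

Here the shared edge is installed once (14) and used twice (2 * 1), while each access edge contributes 2 + 1, for a total of 22.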

10 Example (all demands $d_i = 1$)  An edge with c = 14 and l = 1 that carries two unit demands contributes 14 + 2*1 = 16 to the total cost  An edge with c = 0 and l = 3 that carries two unit demands contributes 0 + 2*3 = 6 to the total cost

11 Special Cases  Single-source (SS-BB) case: all $s_i$ (sources) are equal  Uniform case: the cost and length functions are the same on all edges, i.e., each edge e has cost $c + l \cdot \text{demand-passing}(e)$ for constants c and l  [Figure: single-source example]

12 Algorithms for Special Cases  O(log n) approximation algorithms for special cases:  Single source:  [Guha, Meyerson, and Munagala ’01]  [Talwar ’02]  [Gupta, Kumar, and Roughgarden ’02]  [Meyerson, Munagala, and Plotkin ’00]  [Goel and Estrin ’03]  [Chekuri, Khanna, and Naor ’01]  …  Uniform multicommodity:  [Awerbuch and Azar ’97]  [Bartal ’98]  [Gupta, Kumar, Pal, and Roughgarden ’03]  …  Almost logarithmic hardness in these cases [Andrews ’04].  But for over a decade there was no algorithm with a good (e.g., polylogarithmic) approximation factor for the most general multi-commodity (non-uniform) buy-at-bulk case

13 Our Main Result [Chekuri, Hajiaghayi, Kortsarz, Salavatipour, FOCS’06, SICOMP’10]  Theorem: For h pairs $(s_i, t_i)$, we obtain a (practical) polynomial-time algorithm with approximation ratio $O(\log^4 h)$.  For simplicity, we present the unit-demand case (i.e., $d_i = 1$ for all i) and show $\tilde{O}(\log^4 n)$

14 Overview of the Algorithm  The algorithm iteratively finds a partial solution connecting some of the residual pairs  The pairs are then removed from the set; repeat until all pairs are connected (routed)  Density of a partial solution = (cost of the partial solution) / (# of new pairs routed)  Density is the average cost per new routed pair  The algorithm tries to find a low-density partial solution at each iteration
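
The iterative scheme can be sketched as a generic loop around an oracle that returns a low-density partial solution. The oracle interface `find_partial` and the toy menu below are assumptions for illustration only; in the algorithm of these slides the oracle is the junction-tree procedure, which is not implemented here.

```python
def greedy_by_density(pairs, find_partial):
    """Repeatedly buy a partial solution until every pair is routed.

    find_partial(remaining) is a hypothetical oracle returning
    (cost, routed_pairs) for some low-density partial solution.
    """
    remaining = set(pairs)
    total_cost = 0.0
    history = []  # (cost, newly routed pairs, density) per iteration
    while remaining:
        cost, routed = find_partial(remaining)
        newly = set(routed) & remaining
        if not newly:
            raise ValueError("oracle routed no new pair")
        history.append((cost, newly, cost / len(newly)))
        total_cost += cost
        remaining -= newly
    return total_cost, history

# Toy oracle (an assumption): from a fixed menu, pick the candidate with the
# lowest cost per newly routed pair.
MENU = [(6.0, {1, 2, 3}), (2.0, {3}), (5.0, {4})]

def toy_oracle(remaining):
    useful = [(c, p) for c, p in MENU if p & remaining]
    return min(useful, key=lambda cp: cp[0] / len(cp[1] & remaining))

total, history = greedy_by_density({1, 2, 3, 4}, toy_oracle)
```

Only newly routed pairs count toward the density, which is why the second menu entry becomes useless once pair 3 has been routed.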

15 Overview of the Algorithm (cont’d)  Will show the density of each partial solution in our algorithm is at most $\tilde{O}(\log^3 n) \cdot (OPT / h')$ where  OPT is the cost of the optimum solution  h' is the number of unrouted pairs  A simple analysis (like for set cover) shows: Total Cost $\le \tilde{O}(\log^3 n) \cdot OPT \cdot (1/n^2 + 1/(n^2 - 1) + \dots + 1) \le \tilde{O}(\log^4 n) \cdot OPT$
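
The set-cover-style analysis can be spelled out as follows. The notation $k_j$ (number of pairs routed in iteration j) and $h'_j$ (number of unrouted pairs before iteration j) is introduced here for illustration; the argument uses that the optimum for the residual pairs costs at most OPT and that $h \le n^2$.

```latex
\begin{align*}
\text{Total cost}
  &\le \sum_{j} k_j \cdot \tilde{O}(\log^3 n) \cdot \frac{OPT}{h'_j}
   \le \tilde{O}(\log^3 n) \cdot OPT \cdot \sum_{j} \;\sum_{i = h'_j - k_j + 1}^{h'_j} \frac{1}{i} \\
  &=   \tilde{O}(\log^3 n) \cdot OPT \cdot \sum_{i=1}^{h} \frac{1}{i}
   \le \tilde{O}(\log^3 n) \cdot OPT \cdot (1 + \ln h) \\
  &\le \tilde{O}(\log^3 n) \cdot OPT \cdot (1 + 2 \ln n)
   =   \tilde{O}(\log^4 n) \cdot OPT .
\end{align*}
```

The second inequality holds because each of the $k_j$ terms $1/h'_j$ is at most the corresponding term $1/i$ with $i \le h'_j$, and the ranges of i for different iterations are disjoint and together cover $1, \dots, h$.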

16 Structure of (near) Optimum  How to compute a low-density partial solution?  Prove the existence of a low-density one with a very specific structure: junction tree  Junction tree: given a set P of pairs, a tree T rooted at r is a junction tree if  It contains all pairs of P  For every pair $(s_i, t_i) \in P$ the path connecting them in T goes through r  Why junction trees? Knowing the pairs reduces the problem to single-source buy-at-bulk (SS-BB) (with O(log n) approx.)

17 Summary of the Algorithm  So two main ingredients in the proof  Theorem 2: There is always a partial solution that is a junction tree with density $\tilde{O}(\log n) \cdot (OPT / h')$  Theorem 3: There is an $O(\log^2 n)$ approximation for finding the lowest-density junction tree (this is low-density SS-BB).  Corollary: We can find a partial solution with density $\tilde{O}(\log^3 n) \cdot (OPT / h')$  This implies an approximation $\tilde{O}(\log^4 n)$ for MC-BB

18 Notations for Proof of Thm 2  We provide a junction tree partial solution with density $\tilde{O}(\log n) \cdot (OPT / h')$  Consider an optimum solution OPT  Let  $E^*$ be the edge set that OPT installs,  $OPT_c$ be its (installation) cost  $OPT_l$ be the total length (per-use routing cost).  Thus $OPT = OPT_c + OPT_l$

19 Removing Cycles  OPT may have cycles!  By the results of [Elkin, Emek, Spielman, and Teng ’05; Abraham, Bartal, Neiman ’08] on probabilistic embeddings into spanning trees, and by losing a factor $\tilde{O}(\log n)$ on length, we can assume $E^*$ is a forest T (WLOG assume T is connected).

20 Junction Tree with Low Density  From T we obtain a collection of rooted subtrees (in the form of junction trees) $T_1, \dots, T_a$ such that  any edge e of T is included in at most O(log n) of the subtrees  for every pair there is exactly one index i such that both vertices are in $T_i$ and their path in $T_i$ goes through the root of $T_i$  The total cost of the junction trees is at most $O(\log n) \cdot OPT_c + \tilde{O}(\log n) \cdot OPT_l = \tilde{O}(\log n) \cdot OPT$  Thus at least one of the junction trees $T_1, \dots, T_a$ has the desired density of $\tilde{O}(\log n) \cdot (OPT / h')$
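
The last step is an averaging argument. Writing $P_i$ for the set of pairs assigned to $T_i$ (notation introduced here, with trees that receive no pair discarded), every pair belongs to exactly one $P_i$, so $\sum_i |P_i| = h'$ and the smallest ratio is at most the ratio of the sums:

```latex
\[
\min_{i} \frac{\operatorname{cost}(T_i)}{|P_i|}
  \;\le\; \frac{\sum_{i} \operatorname{cost}(T_i)}{\sum_{i} |P_i|}
  \;\le\; \frac{\tilde{O}(\log n) \cdot OPT}{h'} .
\]
```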

21 Decomposition into Junction Trees  Given T, pick a centroid $r_1$ (i.e., after removing $r_1$ the largest remaining component has at most (2/3) |V(T)| vertices)  Add the tree T rooted at $r_1$, together with the pairs whose paths go through $r_1$, to the collection  Remove $r_1$ from T and apply the procedure recursively to each of the resulting components  Each pair is on exactly one subtree in the collection  Each edge is on O(log n) subtrees since the depth of recursion is O(log n)  We are done with the first main theorem
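
The recursive decomposition can be sketched as a standard centroid decomposition of a tree. The sketch below records only the recursion depth at which each vertex becomes a root; attaching pairs to roots is omitted. It picks a centroid whose removal leaves components of size at most half, which in particular satisfies the 2/3 bound on the slide. The function name and the path example are assumptions.

```python
def centroid_decomposition(adj):
    """adj: dict node -> set of neighbours of a tree.

    Returns {vertex: recursion depth at which it is chosen as a root}.
    """
    removed = set()
    depth_of = {}

    def component(start):
        comp, stack, seen = [start], [start], {start}
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen and v not in removed:
                    seen.add(v)
                    comp.append(v)
                    stack.append(v)
        return comp

    def centroid(comp):
        comp_set = set(comp)
        root = comp[0]
        parent = {root: None}
        order, stack = [], [root]
        while stack:  # iterative DFS to get a parent map and visiting order
            u = stack.pop()
            order.append(u)
            for v in adj[u]:
                if v in comp_set and v != parent[u]:
                    parent[v] = u
                    stack.append(v)
        size = {u: 1 for u in comp}
        for u in reversed(order):  # accumulate subtree sizes bottom-up
            if parent[u] is not None:
                size[parent[u]] += size[u]
        n = len(comp)
        for u in comp:
            largest = n - size[u]  # component containing u's parent
            for v in adj[u]:
                if v in comp_set and parent.get(v) == u:
                    largest = max(largest, size[v])
            if 2 * largest <= n:
                return u

    work = [(next(iter(adj)), 0)]
    while work:
        start, depth = work.pop()
        c = centroid(component(start))
        depth_of[c] = depth
        removed.add(c)
        for v in adj[c]:
            if v not in removed:
                work.append((v, depth + 1))
    return depth_of

# Hypothetical example: a path on 7 vertices, 0 - 1 - ... - 6.
path = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 6} for i in range(7)}
depths = centroid_decomposition(path)
```

A vertex chosen at depth d lies in d + 1 of the nested components, so every edge appears in at most (maximum depth + 1) subtrees, which is logarithmic in the number of vertices.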

22 Details of Proof of Thm 3  Theorem 3: There is an $O(\log^2 n)$ approximation for finding the lowest-density junction tree  Very similar to single-source except that we have to find a lowest-density solution  Goal: connect a subset of pairs to the root r with lowest density (= cost of solution / # of pairs in the solution)  Formulate the problem as an integer program (IP) and then consider its linear programming (LP) relaxation

23 First Low Density Single-Sink  Let T be the set of terminals to be connected to r  $y_i$ is one if we connect terminal i to r  $x(e)$ is one if the edge e is in our solution  Let $P_i$ be the set of paths from terminal i to r  $f_p$ is the flow on path p  The IP over these variables (its formulation is not reproduced here) captures the lowest-density (lowest average cost) way of connecting a set of terminals from T to r
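
One plausible way to write the relaxation over these variables, for unit demands, is sketched below. This is a reconstruction under assumptions, not necessarily the formulation used on the original slide: the normalization $\sum_i y_i = 1$ is the usual device for turning the ratio objective (cost per connected terminal) into a linear one, and $l(p)$ denotes the total length of path p.

```latex
\begin{align*}
\min\quad & \sum_{e \in E} c(e)\, x(e) \;+\; \sum_{i \in T} \sum_{p \in P_i} l(p)\, f_p \\
\text{s.t.}\quad
  & \sum_{i \in T} y_i = 1, \\
  & \sum_{p \in P_i} f_p = y_i              && \forall i \in T, \\
  & \sum_{p \in P_i :\, e \in p} f_p \le x(e) && \forall i \in T,\ \forall e \in E, \\
  & x(e),\; f_p,\; y_i \ge 0 .
\end{align*}
```

The second constraint sends $y_i$ units of flow from terminal i to r, and the third says a terminal's flow may only use an edge to the extent that the edge is bought.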

24 Finding Low Density Junction Tree  Solve the above LP and partition the terminals of T into log n classes (1/2, 1], (1/4, 1/2], (1/8, 1/4], … with almost equal y variables  Find a class S of terminals among the log n classes with max sum of y variables and scale up (lose a factor O(log n))  Use the O(log n) approx of [MMP’00, CKN’01] for SS-BB on S
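
The bucketing step can be sketched as follows, assuming the fractional values are available as a dictionary. The function name, the cutoff parameter `num_classes`, and the sample values are assumptions for illustration; terminals whose value falls below the last class are simply ignored here.

```python
import math

def heaviest_class(y, num_classes):
    """Group terminals by y-value into classes (2^-(j+1), 2^-j], j = 0, 1, ...

    Returns (index, terminals) of the class with the largest total y-value.
    """
    sums = [0.0] * num_classes
    members = [[] for _ in range(num_classes)]
    for terminal, val in y.items():
        if val <= 0:
            continue
        j = math.floor(-math.log2(val))  # val lies in (2^-(j+1), 2^-j]
        if 0 <= j < num_classes:
            sums[j] += val
            members[j].append(terminal)
    best = max(range(num_classes), key=lambda j: sums[j])
    return best, members[best]

# Hypothetical fractional solution with total y-value 1.
y_values = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}
best_class, chosen = heaviest_class(y_values, 4)
```

Within one class the values differ by less than a factor of 2, so scaling them all up to 1 distorts relative costs by at most a constant; choosing the heaviest of the log n classes is where the O(log n) factor is lost.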

25 Some Recent Extensions  $O(\log^3 n)$ approx for non-uniform buy-at-bulk when demands are polynomial [Kortsarz and Nutov ’07]  $O(\log^4 n)$ approx can be extended to the node-weighted case but requires some new ideas and some extra work [Chekuri, Hajiaghayi, Kortsarz, Salavatipour ’07]  $O(\log^4 n)$ approx when we want two disjoint paths between each demand pair [Chekuri, Antonakopoulos, Shepherd and Zhang ’11]  $O(n^{1/2})$ approx for the multicommodity case in directed graphs [Chekuri, Even, Gupta, and Segev ’08]  Our results can be extended to stochastic Steiner tree with non-uniform inflation (by losing an extra factor O(log n)) [Gupta, Hajiaghayi, and Kumar ’07]  The same technique has been used in the Dial-a-Ride problem [Gupta, Hajiaghayi, Ravi, and Nagarajan ’07]  Oblivious network design with ratio $O(\log^3 n)$ for uniform buy-at-bulk, i.e., costs of all edges are the same sub-additive function f [Gupta, Hajiaghayi, and Raecke ’07]  Currently thinking of Capacitated Network Design

26 Prize-collecting Network Design  Prize-collecting problems: classic optimization problems with various demands to be "served" by some lowest-cost structure  However, if some demands are too expensive to serve, then refuse them and instead pay a penalty  Several applications both in  Theory: Game theory, Lagrangian relaxation  Practice: Real-world AT&T application saving millions of dollars in the design of fiber networks  Studied for several problems, e.g., [B’89, GW’92, HJ’06, CRR’99, KNN’10, BHM’11, BH’10, HN’10, ABHK’11]

27 Prize-collecting Steiner Trees (PCST)  Given: graph G=(V, E), edge costs $c_e \ge 0$, root r, penalties $p_v \ge 0$ on vertices  Goal: choose a subtree T containing r that minimizes the cost of edges in T plus the penalties of nodes not connected to r, i.e., $\sum_{e \in T} c_e + \sum_{v \text{ not connected to } r} p_v$  AT&T Application: Design a fiber build connecting new customers to the existing network.  Graph: street network  Root: existing fiber (supernode)  Edge cost: digging trench and laying fiber  Prize: monthly income for each new customer
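
The PCST objective is easy to evaluate for a candidate tree: pay for the chosen edges, then pay the penalty of every vertex the tree does not reach from the root. The small instance and the function name below are assumptions for illustration.

```python
def pcst_cost(tree_edges, edge_cost, penalty, root):
    """Edge cost of the chosen tree plus penalties of vertices not reached from root."""
    adj = {}
    for u, v in tree_edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    reached, stack = {root}, [root]
    while stack:  # depth-first search over the chosen edges only
        u = stack.pop()
        for v in adj.get(u, []):
            if v not in reached:
                reached.add(v)
                stack.append(v)
    edges_paid = sum(edge_cost[e] for e in tree_edges)
    penalties_paid = sum(p for v, p in penalty.items() if v not in reached)
    return edges_paid + penalties_paid

# Hypothetical instance rooted at "r".
EDGE_COST = {("r", "a"): 3, ("a", "b"): 10, ("r", "c"): 4}
PENALTY = {"a": 5, "b": 2, "c": 1}

connect_nothing = pcst_cost([], EDGE_COST, PENALTY, "r")
connect_a_only = pcst_cost([("r", "a")], EDGE_COST, PENALTY, "r")
connect_all = pcst_cost(list(EDGE_COST), EDGE_COST, PENALTY, "r")
```

In this instance connecting only a (cost 3 plus penalties 2 and 1) beats both serving everyone and serving no one, which is the trade-off the prize-collecting model is meant to capture.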

28 Our Improvement [Archer, Bateni, Hajiaghayi, Karloff, FOCS’09, SICOMP’11]  Balas’89: introduces PCST  Bienstock et al.’93: give a 3-approx. via LP rounding  Goemans-Williamson’92: 2-approx. via primal-dual  Several other heuristics since then [CRR’99, LR’00]  Improving on factor 2 was a famous open problem for 17 years  We obtain a 1.967-approx for PCST problems via a Prize-Collecting Clustering technique  Why is it important?  Breaks the barrier and opens the path for others  A little improvement (e.g., 2%) can save a lot of money in practice  Technique is new and exciting

29 Prize-Collecting Clustering [Bateni, Hajiaghayi, Marx, STOC’10, J. ACM]  New clustering paradigm based on prize-collecting frameworks  Cluster the vertices of a graph, where each vertex has a budget  A cluster: a tree connecting its vertices  The connecting cost of a cluster is payable by the budgets of its vertices  The cost of connecting different clusters is not payable by their budgets

30 PC-Clustering Applications:  1. PC Steiner tree: 1.967-approx [Archer, Bateni, Hajiaghayi, Karloff ’09]  2. PCTSP (and Tour): 1.980-approx [Archer, Bateni, Hajiaghayi, Karloff ’09]  3. Planar Steiner forest: PTAS ($1 + \varepsilon$) [Bateni, Hajiaghayi, Marx ’10]  4. Planar submodular prize-collecting Steiner forest: Reduction to bounded-treewidth graphs [Bateni, Chekuri, Ene, Hajiaghayi, Korula, Marx ’11]  5. Planar multiway cut: PTAS ($1 + \varepsilon$) [Bateni, Hajiaghayi, Klein, Mathieu ’12], improving over factor 1.34 for general graphs

31 Bidimensionality Theory  Main (theoretical) approaches to solve NP-hard network design problems:  Special instances: Planar graphs, bounded genus graphs (fiber networks in ground), etc.  Approximation algorithms (PTAS): Within a factor C of the optimal solution (PTAS if $C = 1 + \varepsilon$ for an arbitrary constant $\varepsilon$)  Fixed-parameter algorithms: Parameterize the problem by a parameter P (typically, the cost of the optimal solution) and aim for running time $f(P) \cdot n^{O(1)}$ (or even $f(P) + n^{O(1)}$)  We consider all of the above in Bidimensionality and aim for general algorithmic frameworks
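
As a reminder of what a running time of the form $f(P) \cdot n^{O(1)}$ looks like, here is the textbook bounded-search-tree algorithm for vertex cover parameterized by the solution size k. It is not a bidimensionality-based algorithm and is not taken from these slides; it runs in roughly $2^k$ times the number of edges. The function name and the sample graph are assumptions.

```python
def vertex_cover_at_most_k(edges, k):
    """Return a vertex cover with at most k vertices, or None if none exists.

    Branches on an uncovered edge (u, v): any cover must contain u or v,
    so the search tree has depth at most k and at most 2^k leaves.
    """
    if not edges:
        return set()
    if k == 0:
        return None
    u, v = edges[0]
    for pick in (u, v):
        rest = [e for e in edges if pick not in e]  # edges still uncovered
        sub = vertex_cover_at_most_k(rest, k - 1)
        if sub is not None:
            return sub | {pick}
    return None

# Hypothetical graph: a triangle 1-2-3 with a pendant vertex 4 attached to 3.
GRAPH = [(1, 2), (2, 3), (1, 3), (3, 4)]
cover = vertex_cover_at_most_k(GRAPH, 2)
```

The exponential part depends only on k, which is the defining feature of a fixed-parameter algorithm; the subexponential bounds mentioned for bidimensional problems improve the dependence on k further.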

32 Overview  For any network design problem in a large class (“bidimensional”)  Vertex cover, dominating set, connected dominating set, r-dominating set, feedback vertex set, TSP, k-cut, Steiner tree, Steiner forest, multiway cut, …  In broad classes of networks generalizing planar networks (most “minor-closed” graph families)  We obtain (in a series of more than 25 papers):  Strong combinatorial properties  Fixed-parameter algorithms  Often subexponential: $2^{O(\sqrt{k})} n^{O(1)}$ where k = |OPT|  Approximation algorithms  Often PTASs ($1 + \varepsilon$ approx): $f(1/\varepsilon) \cdot n^{O(1)}$

33 Summary of Results  A general algorithmic framework (with Rajesh)  Introducing the concept of graph contraction instead of graph minor  Simplifying network decompositions: decompose networks into algorithmically simple instances instead of necessarily small-size networks  Improving the deep graph-minor theory of Robertson-Seymour and making it algorithmic  Three workshops so far on the theory: Berlin (2007), Dagstuhl (2009), Dagstuhl (2013)


