
1 Some Graph Minor Theory and its Uses in Algorithms
Julia Chuzhoy Toyota Technological Institute at Chicago

2 Node-Disjoint Paths (NDP)
Input: Graph G, source-sink pairs (s1,t1),…,(sk,tk). I’ll start with a concrete problem: Node-Disjoint Paths, a very basic graph routing problem. Input: a graph G and a number of pairs of its vertices (s1,t1),…,(sk,tk), which we call demand pairs, or source-destination pairs. In each pair (si,ti), the two vertices want to communicate with each other.

3 Node-Disjoint Paths (NDP)
Input: Graph G, source-sink pairs (s1,t1),…,(sk,tk). Goal: Route as many pairs as possible via node-disjoint paths. We want to route as many of these pairs as possible. To route a pair, we need to choose a path connecting it. The paths that we choose should be disjoint in their vertices (and hence in their edges). Subject to this, we want to route as many pairs as possible.

4 Node-Disjoint Paths (NDP)
Input: Graph G, source-sink pairs (s1,t1),…,(sk,tk). Goal: Route as many pairs as possible via node-disjoint paths Example: route (s1,t1) on red path

5 Node-Disjoint Paths (NDP)
Input: Graph G, source-sink pairs (s1,t1),…,(sk,tk). Goal: Route as many pairs as possible via node-disjoint paths. Solution value: 2. Route (s3,t3) on the green path; we can’t route any more pairs because the paths must be disjoint. The solution value is the number of routed pairs, here 2, and we would like it to be as large as possible.

6 Node-Disjoint Paths (NDP)
Input: Graph G, source-sink pairs (s1,t1),…,(sk,tk). Goal: Route as many pairs as possible via node-disjoint paths. Solution value: 2. A very closely related problem is Edge-Disjoint Paths, defined in the same way, but now the paths can share vertices as long as they don’t share edges. In the EDP setting, we can route all 3 pairs. Even though it feels like EDP should be easier, the two problems seem to behave in the same way; most things I’ll say about NDP are also true for EDP. I’ll focus on NDP for now. Edge-disjoint Paths (EDP): paths must be edge-disjoint.

7 Node-Disjoint Paths (NDP)
Input: Graph G, source-sink pairs (s1,t1),…,(sk,tk). Goal: Route as many pairs as possible via node-disjoint paths. Terminals: the vertices participating in the demand pairs are called terminals. To make life easier, I will assume that all demand pairs are disjoint, all terminals are distinct, and every terminal has degree 1; this does not make much difference. Standard notation: n is the number of graph vertices, k the number of demand pairs.

8 Node-Disjoint Paths (NDP)
Input: Graph G, source-sink pairs (s1,t1),…,(sk,tk). Goal: Route as many pairs as possible via node-disjoint paths. Assumptions: all terminals are distinct; each terminal has degree 1. n – number of graph vertices. k – number of demand pairs.

9 Node-Disjoint Paths (NDP)
Input: Graph G, source-sink pairs (s1,t1),…,(sk,tk). Goal: Route as many pairs as possible via node-disjoint paths. Can we solve it efficiently? As usual, we want to know whether we can solve the problem efficiently. Let's start with something simple: with just one demand pair, this is a simple connectivity problem, so we can solve it efficiently. What if there are two demand pairs? k=1? k=2?

10 NDP with k=2 NP-hard in directed graphs [Fortune, Hopcroft, Wyllie '80] Efficiently solvable in undirected graphs [Jung ‘70, Shiloach ’80, Thomassen ‘80, Robertson-Seymour ’90] Surprisingly, for directed graphs the problem is already NP-hard; I’ll only talk about undirected graphs today. For undirected graphs, we can still solve it efficiently. But the statement of the theorem that comes with the algorithm is very non-trivial: either we can route the two demand pairs, or the graph is planar and can be drawn so that s1, s2, t1, t2 appear on the boundary of the drawing in this circular order.

11 NDP with k=2 NP-hard in directed graphs [Fortune, Hopcroft, Wyllie '80] Efficiently solvable in undirected graphs [Jung ‘70, Shiloach ’80, Thomassen ‘80, Robertson-Seymour ’90] flat graph. If we can draw the graph like this, then it is clear we can’t route the two pairs. But the theorem is an if-and-only-if: either we can find the routing, or the drawing. (I’m glossing over some insignificant technical details here: the graph is flat rather than planar, but that’s unimportant.) What if k is larger – say 3 or 4, or some other constant independent of n? Larger k?

12 Can we get a simpler algorithm for k=3?
Larger k? Constant k: efficiently solvable [Robertson, Seymour ’90]. Running time: f(k)·n² [Kawarabayashi, Kobayashi, Reed ‘12]. Robertson and Seymour, as part of the Graph Minors series, gave an efficient algorithm for this problem. The Graph Minors series is a fundamental body of work that spans more than 20 papers and took more than 20 years to complete; a not insignificant part of it is this algorithm for NDP, which spans hundreds of pages. The running time is f(k)·n², where f(k) is a tower function of k. This is an improved bound: before, the height of the tower was itself a tower, whose height is a tower, and so on, a constant number of times; today, the height of the tower is a constant. What constant? For EDP, the height of the tower can be made maybe 2-5; for NDP I am not sure. Let me right away mention an open question: for k=3, we don’t have any algorithm other than going through Robertson and Seymour’s proof. Can we do it more simply? Can we get a simpler algorithm for k=3?

13 Larger k? Constant k: efficiently solvable [Robertson, Seymour ’90]
Running time: f(k)n2 [Kawarabayashi, Kobayashi, Reed ‘12] NP-hard when k is part of input [Knuth, Karp ’74] The main scenario we’ll consider is when k is allowed to grow with the input size, can be as large as say \sqrt{n} or n/10. Then the problem is NP-hard. To put things into perspective, here is a very simple example, that we will also use later.

14 Example G Set of demand pairs is SxT: can solve efficiently
Max s-t flow. We are given a graph G and two sets S, T of its vertices, and we want as many disjoint paths as possible from S to T, without caring who gets connected to whom. This can be solved by max-flow: add a source connected to everyone in S and a destination connected to everyone in T, require that at most one flow unit goes through each vertex, and use the integrality of flow to get a set of disjoint paths. We get a collection of node-disjoint paths, but we don’t know which vertex will be connected to which vertex – which matching will be routed. We can’t control this matching. Set of demand pairs is S×T: can solve efficiently. Demand pairs are a specific matching between S and T: NP-hard.


16 Example G Set of demand pairs is SxT: can solve efficiently
If we want to route a specific matching, then this is the Node-Disjoint Paths problem, which is NP-hard. Set of demand pairs is S×T: can solve efficiently (as sketched below). Demand pairs are a specific matching between S and T: NP-hard.
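A minimal sketch of this max-flow reduction, assuming the networkx library (the graph is undirected, so each edge becomes two arcs; the function name and node labels are mine):

```python
# Sketch of the S-T routing reduction described above. Vertex capacity 1
# is enforced by splitting each vertex v into v_in -> v_out.
import networkx as nx

def max_node_disjoint_ST_paths(G, S, T):
    """Max number of node-disjoint paths from S to T (matching not controlled)."""
    H = nx.DiGraph()
    for v in G.nodes():
        H.add_edge((v, 'in'), (v, 'out'), capacity=1)   # vertex capacity 1
    for u, v in G.edges():
        H.add_edge((u, 'out'), (v, 'in'), capacity=1)   # undirected edge ->
        H.add_edge((v, 'out'), (u, 'in'), capacity=1)   # two directed arcs
    for v in S:
        H.add_edge('src', (v, 'in'), capacity=1)        # super-source to S
    for v in T:
        H.add_edge((v, 'out'), 'sink', capacity=1)      # T to super-sink
    value, _ = nx.maximum_flow(H, 'src', 'sink')        # integral: capacities are integers
    return value
```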

17 Want: Good Approximation Algorithm
How? Multicommodity Flow relaxation: send (fractional) flow between the si-ti pairs. Maximize the total amount of flow; at most 1 flow unit through each vertex. Since I've been working most of my life on approximation algorithms, you know what's coming next: we’ll try to solve the problem approximately and design a good approximation algorithm. The natural way to do it is to relax the problem to a multicommodity flow relaxation: instead of connecting each s-t pair with one path, send flow between the pairs, which can be fractional. We want to maximize the total amount of flow we send, under the constraint that at most one flow unit goes through each vertex. This can be formulated as a Linear Program and solved efficiently. Let me show you the LP.

18 Multicommodity Flow LP
Pi – the set of all paths from si to ti. f(P) = 1 if P is in the solution, 0 otherwise; fractionally, f(P) is the flow through path P. Here is the LP that we will use. Denote by Pi the set of all paths connecting si to ti. For each such path, introduce an LP variable f(P), which we think of as the indicator of whether the path is in the solution; in the fractional sense, it is the flow that path P carries.

19 Multicommodity Flow LP
Pi – the set of all paths from si to ti. f(P) = 1 if P is in the solution, 0 otherwise.

(LP)  maximize  Σi ΣP∈Pi f(P)
      s.t.  ΣP: v∈P f(P) ≤ 1   for every vertex v
            f(P) ≥ 0           for every path P

We want to maximize the total flow, subject to the constraint that each vertex carries at most one flow unit; the flows are non-negative. Even though this LP has an exponential number of variables, we can solve it efficiently using standard techniques (an equivalent edge-based flow formulation).


21 Multicommodity Flow LP
Plan: compute the optimal flow solution (value OPTflow); turn it into an integral solution (value at least OPTflow/α). Pi – the set of all paths from si to ti. OPTflow ≥ OPT. This is just standard LP-rounding: compute the flow solution, then turn it into an integral solution while losing only a small fraction of the flow. If we get an integral solution whose value is within an α factor of the flow solution, this is an α-approximation algorithm, because the flow solution can only be better than the optimal integral solution. α-approximation LP-rounding algorithm.

22 Approximation Algorithm [Kolliopoulos, Stein ‘98]
While there is a path P with f(P)>0: Add such shortest path P to the solution For each path P’ sharing vertices with P, set f(P’) to 0 Here is a very simple greedy algorithm that follows this plan. Look at the current flow solution. Among all flow-paths with non-zero flow on them, take the shortest one and add it to our solution. Now update the flow: for every flow-path sharing vertices with our path, set the flow on it to 0. Keep on doing this as long as there is any flow in the graph.

23 Approximation Algorithm [Kolliopoulos, Stein ‘98]
While there is a path P with f(P)>0: add such a shortest path P to the solution; for each path P’ sharing vertices with P, set f(P’) to 0. O(√n)-approximation. How well does this algorithm do? We can show that it achieves an O(√n)-approximation. Idea: in every iteration, as we add one path to our solution, we destroy at most √n flow units. If the chosen shortest path has at most √n vertices then, since each vertex carries at most one flow unit, zeroing out the paths that share its vertices removes at most √n flow units; and if every remaining flow-path is longer than √n, the total remaining flow is at most n/√n = √n. This means we are within a √n factor of the optimal solution.
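A minimal sketch of this greedy rounding, assuming the fractional solution has already been decomposed into flow-paths (the input representation is my assumption):

```python
# Greedy rounding of a path-decomposed flow: repeatedly take the shortest
# positive-flow path, then discard every flow-path that shares a vertex with it.
def greedy_round(flow_paths):
    """flow_paths: list of (path, flow) pairs; each path is a sequence of vertices."""
    remaining = [tuple(P) for P, f in flow_paths if f > 0]
    solution = []
    while remaining:
        P = min(remaining, key=len)        # shortest path with nonzero flow
        solution.append(P)
        used = set(P)
        # drop every path sharing a vertex with P (including P itself)
        remaining = [Q for Q in remaining if used.isdisjoint(Q)]
    return solution
```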

24 Can We Do Better? Not if we use the maximum multicommodity flow approach! Can we do better than that? Turns out that if we stick to the same approach, of rounding the multicommodity flow, then we can’t, because this relaxation has a \sqrt{n} integrality gap. Here is an example showing this.

25 Bad Example: the graph is a grid;
the sources s1,…,sk are on the top boundary, and the destinations tk,…,t1 on the bottom boundary, in the opposite order. Now I’ll show you the flow solution.

26 Bad Example s1 s2 s3 … sk … tk t3 t2 t1
send 1/3 flow unit on the red path, connecting s1 to t1.

27 Bad Example s1 s2 s3 … sk … tk t3 t2 t1
send 1/3 flow unit on the blue path, connecting s2 to t2.

28 Bad Example s1 s2 s3 … sk … tk t3 t2 t1 keep on doing it.
For each pair, select a different row to route it on. It is not hard to see that each vertex is used by at most 3 paths, so sending 1/3 flow unit per path gives a feasible flow solution.

29 Integrality gap of the flow relaxation
Bad Example: OPTflow = k/3, OPT = 1, gap: Ω(√n). We send k/3 flow units in total, which is roughly √n, since the grid has n ≈ k² vertices. Integrally, we can connect only one pair: if we try to connect two, the paths would cross, and in a grid crossing paths must share a vertex. This √n gap between the integral and fractional solutions, called the integrality gap, shows that we can’t get a better than √n approximation by rounding the multicommodity flow relaxation.

30 Approximation Status of NDP
O(√n)-approximation algorithm [Kolliopoulos, Stein ’98]. Ω(log^{1/2-ε} n)-hardness of approximation for any ε>0 [Andrews, Zhang ‘05], [Andrews, C, Guruswami, Khanna, Talwar, Zhang ’10]. Maybe we should try other techniques to solve this problem? We don’t have other techniques. Maybe the problem is hard to approximate? We only have a roughly √(log n)-hardness of approximation. Until recently, the O(√n)-approximation algorithm was the best known even for grid graphs and planar graphs. The lower bound uses a very non-planar graph and does not work for grids and planar graphs, where only NP-hardness was known. It seemed clear that one should start attacking this problem by trying to improve the algorithms for grids and planar graphs.

31 Approximation Status of NDP
O(√n)-approximation algorithm [Kolliopoulos, Stein ’98]. Ω(log^{1/2-ε} n)-hardness of approximation for any ε>0 [Andrews, Zhang ‘05], [Andrews, C, Guruswami, Khanna, Talwar, Zhang ’10]. Even in grids/planar graphs, only NP-hardness is known.


33 Approximation Status of NDP
O(√n)-approximation algorithm [Kolliopoulos, Stein ’98]. Ω(log^{1/2-ε} n)-hardness of approximation for any ε>0 [Andrews, Zhang ‘05], [Andrews, C, Guruswami, Khanna, Talwar, Zhang ’10]. Modest improvements for NDP in grids and planar graphs [C, Kim ’15], [C, Kim, Li ‘16]. In these two papers, we made very modest progress on approximation algorithms for the problem on planar graphs and grids: an n^{1/4}-approximation for grids and a slightly better than √n approximation for planar graphs. This gave some hope that good algorithms can be found at least for these special cases.

34 Approximation Status of NDP
O(√n)-approximation algorithm [Kolliopoulos, Stein ’98]. Ω(log^{1/2-ε} n)-hardness of approximation for any ε>0 [Andrews, Zhang ‘05], [Andrews, C, Guruswami, Khanna, Talwar, Zhang ’10]. New: 2^{Ω(√log n)}-hardness of approximation for subgraphs of grids [C, Kim, Nimavat ’16]. Newest: NDP is hard to approximate up to almost polynomial factors even on grids [C, Kim, Nimavat ’17]. Unfortunately, a recent result that will be presented later today shows that the problem is very hard to approximate even on planar graphs that are subgraphs of grids; the hardness factor is 2^{√(log n)}. This still left the hope that one can find a decent algorithm for grid graphs, but we have a new result that gives an even stronger hardness, even for grid graphs.

35 New Results?
NDP on grids is very hard to approximate [C, Kim, Nimavat ‘17]: 2^{Ω(log^{1-ε} n)}-hardness for any constant ε>0; n^{Ω(1/(log log n)²)}-hardness. Disclaimer: this result is work in progress. It has not been carefully verified yet and may turn out to be incorrect! Assuming that it is correct, we get almost polynomial hardness factors for NDP on grids, which is a very surprising result (I don’t know of any other problem that is so difficult on grids).

36 New Results? NDP on grids is very hard to approximate [C, Kim, Nimavat ‘17] -hardness for any constant -hardness unless all problems in NP have randomized quasi-poly-time algorithms almost polynomial hardness for grid graphs – stronger than what was known for general graphs depending on complexity assumption, different results. under randomized ETH (need almost exponential time to solve SAT by randomized alg)

37 Summary for Approximability Status of NDP
Hopeless The summary for the Node-Disjoint Paths problem from approximation viewpoint: it seems hopeless.

38 Approximability Status of EDP
Exactly the Same Except: has good approximation on grids and nearly-Eulerian planar graphs [Chekuri, Khanna, Shepherd ‘04], [Aumann, Rabani ‘95], [Kleinberg, Tardos ‘95], [Kleinberg, Tardos ‘98], [Kawarabayashi, Kobayashi ‘13]. Let me say a few words about EDP. Its approximability status is pretty much the same as that of NDP, but there are some notable special cases for which better algorithms are known. Those include planar Eulerian and nearly-Eulerian graphs, which include grid graphs.

39 A Wall But if we look at this graph, called a wall, then EDP on wall graphs seems to behave in the same way as NDP on grids.

40 Approximability Status of EDP
Exactly the Same Except: has good approximation on grids and nearly-Eulerian planar graphs [Chekuri, Khanna, Shepherd ‘04], [Aumann, Rabani ‘95], [Kleinberg, Tardos ‘95], [Kleinberg, Tardos ‘98], [Kawarabayashi, Kobayashi ‘13]. So whatever results we have for NDP on grids, including the negative results, also hold for EDP on walls; the same results are known. EDP on walls ≅ NDP on grids


42 Summary so Far NDP and EDP are very hard to approximate, even on planar graphs, and possibly grids/walls So far, our summary is that both problems are very hard to approximate, even on planar graphs, grids and walls. But let me end this part with some positivity.

Some Positivity… If global min-cut in G is at least polylog(n), then EDP has a polylog(n)-approximation [Rao, Zhou ‘10]. Very good algorithms for NDP/EDP on expanders [Broder, Frieze, Suen, Upfal ’94], [Kleinberg, Rubinfeld ’96], [Frieze ‘01]. There are some interesting special cases where we can get good algorithms for EDP, in addition to those I already mentioned. If the graph is moderately well-connected, meaning that the global minimum cut is at least polylogarithmic, one can get a polylog-approximation using the same flow relaxation. There are also very good approximation algorithms for both NDP and EDP on expander graphs.

44 Some Positivity… If global min-cut in G is at least polylog(n), then EDP has a polylog(n)-approximation [Rao, Zhou ‘10]. Very good algorithms for NDP/EDP on expanders [Broder, Frieze, Suen, Upfal ’94], [Kleinberg, Rubinfeld ’96] [Frieze ‘01] Example: use greedy A quick intuition of why routing on expanders is easier. If the graph is an expander, we can arrange the flow solution, so that all the flow-paths have polylogarithmic length. When the flow-paths are short, the same greedy algorithm we saw at the beginning works much better, and will give us a polylogarithmic approximaiton.

45 Some Positivity… If global min-cut in G is at least polylog(n), then EDP has a polylog(n)-approximation [Rao, Zhou ‘10]. Very good algorithms for NDP/EDP on expanders [Broder, Frieze, Suen, Upfal ’94], [Kleinberg, Rubinfeld ’96] [Frieze ‘01] Example: use greedy What if we allow a small congestion? Can we exploit these results for expanders to getting something useful for general graphs? The obvious answer is no, because the problems are hard to approximate. But the next obvious step is to try and make the problems a little easier, by allowing some congestion. Instead of routing the pairs on disjoint paths, we will allow them to share vertices or edges to some extent. Can we exploit this for general graphs?

46 EDP/NDP with Congestion
An α-approximation algorithm with congestion c routes at least OPT/α demand pairs with congestion at most c: up to c paths can share an edge or a vertex. This gives rise to these problems whose names don’t make much sense: Node-Disjoint Paths with congestion and Edge-Disjoint Paths with congestion. Routing with congestion c means that we allow up to c paths to share an edge or a vertex, depending on whether we talk about EDP or NDP.

47 EDP/NDP with Congestion
An α-approximation algorithm with congestion c routes at least OPT/α demand pairs with congestion at most c. The solution value remains the number of pairs routed; the approximation factor is computed with respect to that.

48 EDP/NDP with Congestion
An α-approximation algorithm with congestion c routes at least OPT/α demand pairs with congestion at most c. Here OPT is the optimum number of pairs routable with no congestion allowed (or, alternatively, the optimum number routable with congestion c). What is OPT here? We could define it either way; either one works, and we’ll stick to the first for historical reasons.

49 Randomized Rounding [Raghavan, Thompson ‘87]
For each demand pair (si,ti), choose at most one path P ∈ Pi, with probability f(P). Expected solution value: OPTLP. Maximum vertex congestion: O(log n/log log n) with high probability. Maybe the first obvious algorithm to try is the randomized rounding technique of Raghavan and Thompson. We think of the f(P) values as probabilities: for each pair (si,ti), we choose up to one path connecting it, with probability equal to the flow on that path. That’s our solution. The expected number of pairs routed is the LP solution value, and by a Chernoff bound, the maximum congestion on any vertex is O(log n/log log n) with high probability.

50 Randomized Rounding [Raghavan, Thompson ‘87]
Constant approximation with O(log n/log log n) congestion. Want: much smaller congestion (say 2), with polylog approximation. This immediately gives a constant approximation in the number of pairs routed, with almost logarithmic congestion. That’s nice, but the congestion is very high. What if we want much smaller congestion, say 2? We’ll be looking for a polylogarithmic approximation, because there is a lower bound (next slide): if we route with congestion c, we cannot beat roughly a log^{1/(c+1)} n approximation, so polylog is the best we can hope for.

51 Randomized Rounding [Raghavan, Thompson ‘87]
Lower bound: Ω(log^{1/(c+1)} n)-hardness with congestion c [Andrews, C, Guruswami, Khanna, Talwar, Zhang ’10].
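A minimal sketch of the Raghavan-Thompson rounding step just described, assuming the LP solution is given per demand pair as a list of (path, f(P)) pairs with Σ f(P) ≤ 1:

```python
import random

# For each demand pair, pick at most one of its flow-paths: path P with
# probability f(P), and no path with the leftover probability 1 - sum f(P).
def randomized_rounding(lp_solution):
    """lp_solution: dict {pair: [(path, fP), ...]} -> dict {pair: chosen path}."""
    chosen = {}
    for pair, paths in lp_solution.items():
        r = random.random()
        for P, fP in paths:
            if r < fP:
                chosen[pair] = P
                break
            r -= fP
    return chosen
```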

52 EDP with Congestion
Congestion O(log n/log log n): constant approximation [Raghavan, Thompson ’87]
Congestion c: O(n^{1/c})-approximation [Azar, Regev ’01], [Baveja, Srinivasan ’00], [Kolliopoulos, Stein ‘04]
Congestion poly(log log n): polylog(n)-approximation [Andrews ‘10]
Congestion 2: O(n^{3/7})-approximation [Kawarabayashi, Kobayashi ’11]
Congestion 14: polylog(k)-approximation [C ‘11]
Congestion 2: polylog(k)-approximation [C, Li ’12]
polylog(k)-approximation for NDP with congestion 2 [Chekuri, Ene ’12], [Chekuri, C ‘16]
The problem has a very long history. This long line of work eventually led to the following result: with congestion 2, we can get a polylogarithmic approximation for both NDP and EDP. This is quite striking: with no congestion allowed, the problem is very hard to approximate, but with congestion 2 we suddenly get a reasonable approximation.

53 EDP with Congestion
All these results are based on the multicommodity flow relaxation. “Tight” due to known hardness results. With congestion 2, the multicommodity flow relaxation gives a polylog approximation, even though with congestion 1 it has a √n integrality gap. These results are almost tight: to beat congestion 2, one would need to solve NDP itself. There is polylog-hardness for routing with any constant congestion c, though the power of the log differs between the upper and lower bounds, so there is still big room for improvement.

54 EDP with Congestion
Structural results about graphs – new results in graph theory! The most exciting thing for me is that this line of work led to new insights about graphs that were not known in the graph theory community, and eventually to new results in graph theory, which I will mention later.


56 Node-Disjoint Paths with Constant Congestion
Will round the Multicommodity Flow LP. Preprocessing: reduce vertex degrees to O(log n). For now we will focus on node-disjoint paths with constant congestion. Our algorithm will perform LP-rounding on the multicommodity flow LP. Before we even start, it would be great to ensure that all vertex degrees are small, say at most logarithmic; ideally polylog(k). We’ll see later how to get there. Whenever I say that the max vertex degree in a graph is bounded, I mean bounded by polylogs, not constants. Let me first motivate why this is helpful: there is a little trick that we are going to use again and again, and it only works in bounded vertex-degree scenarios.

57 Node-Disjoint Paths with Constant Congestion
Ideally: polylog(k). Will round the Multicommodity Flow LP. Preprocessing: reduce vertex degrees to O(log n).

58 Small Edge-Congestion vs Node-Disjointness in Bounded-Degree Graphs
Suppose there are k S-T paths causing edge-congestion at most c. Then there are at least k/c edge-disjoint S-T paths. Suppose we have our graph G and two sets S and T of its vertices (these have nothing to do with our demand pairs). Assume that there are k paths connecting vertices in S to vertices in T with edge-congestion at most c. We claim that we can then get many edge-disjoint paths connecting S to T – at least k/c such paths. Note that in this scenario we don’t care which specific matching is being routed; this will only be used as a subroutine.

59 Small Edge-Congestion vs Node-Disjointness in Bounded-Degree Graphs
Suppose there are k S-T paths causing edge-congestion at most c. S T s t Then there are at least k/c edge-disjoint S-T paths. We can set up an s-t flow problem as before. Every edge participates in up to c green paths. So if we send 1/c flow units on each green path, we’ll get an s-t flow of value k/c and no edge congestion. Using the integrality of flow, we can get k/c edge-disjoint paths connecting S to T. integrality of flow Send 1/c flow units on each green path s-t flow of value k/c and no edge-congestion

60 Small Edge-Congestion vs Node-Disjointness in Bounded-Degree Graphs
Suppose there are k S-T paths causing edge-congestion at most c. S T s t Then there are at least k/c edge-disjoint S-T paths. It is important to note that we may not be routing the same pairs. We may be routing a different matching. We cannot control the matching that we route. In fact, if we wanted to preserve the demand pairs routed, we know that this statement is false. We cannot reduce congestion like that. This only works if we don’t care who gets connected to who. We’ll only be using this as a subroutine, not to solve the whole problem. Important: the pairs routed may not be preserved!!!

61 Small Edge-Congestion vs Node-Disjointness in Bounded-Degree Graphs
Suppose there are k S-T paths causing edge-congestion at most c. Then there are at least k/c edge-disjoint S-T paths. What happens if, on top of this, the maximum vertex degree is small? In this case we can get many node-disjoint paths connecting S to T. What if the maximum vertex degree is at most Δ?

62 Small Edge-Congestion vs Node-Disjointness in Bounded-Degree Graphs
Suppose there are k S-T paths causing edge-congestion at most c. S T s t Then there are at least k/(Δc) node-disjoint S-T paths. We’ll route k/(\Delta c) node-disjoint paths. The argument is the same. Each edge carries up to c flow units, so we have at most (\Delta c) flow units going through each vertex. If we send 1/(\Delta c) flow units on each path, we’ll get an s-t flow of value k/(\Delta c), and no vertex-congestion, so flow at most one through each vertex. Again, using the integrality of flow, we can get k/(\Delta c) node-disjoint paths connecting S to T. integrality of flow Send 1/(Δc) flow units on each green path s-t flow of value k/(Δc) and no vertex-congestion

63 Small Edge-Congestion vs Node-Disjointness in Bounded-Degree Graphs
Suppose there are k S-T paths causing edge-congestion at most c. S T s t Can move between low-edge-congestion and node-disjoint routing Then there are at least k/(Δc) node-disjoint S-T paths. So if the maximum vertex degree is small, we can move between low-edge-congestion routing and node-disjoint routing! This is a very useful little trick that we use many times. This only works when we don’t care who gets connected to who, because the matching that we route may get changed. We won’t use it for the problem itself, only in subroutines where we try to find some structures in our graph. only when we don’t need to preserve the matching routed! Send 1/(cΔ) flow units on each green path s-t flow of value k/(Δc) and no vertex-congestion
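To spell out the arithmetic behind this trick (a worked restatement of the argument above, nothing beyond it): each edge lies on at most c of the k paths, and each vertex meets at most Δ edges, so

\[
\#\{\text{paths through a vertex}\} \le \Delta c,
\qquad
\text{flow through a vertex} \le \Delta c \cdot \frac{1}{\Delta c} = 1,
\qquad
\text{total } s\text{-}t \text{ flow} = \frac{k}{\Delta c},
\]

and by the integrality of flow there are at least k/(Δc) node-disjoint S-T paths.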

64 Randomized Rounding [Raghavan, Thompson ‘87]
For each demand pair (si,ti), choose a path with probability f(P). Expected solution value: OPTLP. Maximum vertex congestion: O(log n/log log n) with high probability. Let us now go back to the randomized rounding technique of Raghavan and Thompson. We can use it to reduce the maximum vertex degree to logarithmic. Recall that each demand pair chose up to one path connecting it, with probability f(P). The maximum vertex congestion is at most logarithmic (I’ll ignore the log log n term here). This congestion is too high for us.

65 Randomized Rounding [Raghavan, Thompson ‘87]
Feasible flow solution: send 1/log n flow units through each chosen path; delete all edges that carry no flow; all vertex degrees become at most log n. What happens if we send 1/log n flow units on each path we chose? This gives a feasible flow solution, with the flow value going down by a log n factor. Now delete all edges that don’t carry any flow. The max vertex degree becomes at most log n, because each vertex participates in at most log n chosen paths.

66 Randomized Rounding [Raghavan, Thompson ‘87]
New Flow: flow value at least Ω(OPTLP/log n); all vertex degrees at most log n; each demand pair sends 1/log n flow units via a single path. We get a new feasible flow solution that looks nice. The flow value goes down by a log n factor (we don’t need to lose this, but we do for simplicity). We can now assume that all vertex degrees are at most log n, and that every demand pair sends 1/log n flow units along one path, by discarding all demand pairs that send no flow. This is a much cleaner flow solution to work with, and we have achieved bounded vertex degree.

67 Recap: Expanders To remind you, we said that NDP and EDP are easy on expanders. Here is a recap of expanders. I’m putting it here because we’ll be comparing it with some other definitions later on. An expander is a graph that expands very fast, and does not have small cuts. Whenever we partition the vertices into two subsets, the number of edges going across should be comparable to the number of vertices on the smaller side. There is a lot of work on NDP/EDP on expanders. Bottom line: can get a good approximation. Very good algorithms for Node-Disjoint Paths on expanders (if max vertex degree not too large).

68 Main Idea: Exploit Algorithms for Expanders!
But our graph is nothing like an expander. Find expander-like structures in the graph and use them for routing! For our algorithm, we would like to exploit the fact that we can get good routing on expander graphs. Unfortunately, our graph may be nothing like an expander. The idea is to find expander-like structures in the graph and use them for routing. In fact, we will embed many expanders into our graph and then solve the problem on the expanders. To do this, we need to introduce a very important graph-theoretic notion called well-linkedness.

69 Well-Linkedness So now I am taking a detour to define well-linkedness and its variations, and we’ll get back to NDP later.

70 Well-Linkedness [Robertson,Seymour], [Chekuri, Khanna, Shepherd], [Raecke]
terminals Well-linkedness was used independently in TCS and graph theory, in slightly different ways. As a consequence, there are several different variations of this notion. I’ll start with one and then we’ll see the others. We start with a graph G. red vertices – a subset of vertices, called terminals. Black vertices – the rest of the vertices, non-terminal vertices. I don’t care about the black vertices that much. We would like to define what it means for the terminals to be well-linked, so they need to be very well-connected. Here is the definition.

71 Well-Linkedness [Robertson,Seymour], [Chekuri, Khanna, Shepherd], [Raecke]
terminals. Set T of terminals is well-linked in G iff for every partition (A,B) of V(G), |E(A,B)| ≥ min(|A∩T|, |B∩T|). Whenever we partition all vertices into two subsets, the number of edges going across is at least the number of terminals on the smaller side. This is similar to expansion; in expansion, we would compare the number of edges to the total number of vertices on the smaller side. Here we ignore all other vertices and only compare to the terminals – like using non-uniform sparsest cut, as opposed to the uniform sparsest cut that defines expansion. So this is like expansion, only with respect to the terminals.

72 Equivalent Definition
For every pair (T’,T’’) of subsets of T with |T’|=|T’’|: can connect every vertex of T’ to a distinct vertex of T’’ with edge-disjoint paths. Set T of terminals is well-linked in G iff for every partition (A,B) of V(G), |E(A,B)| ≥ min(|A∩T|, |B∩T|). By the equivalence between flows and cuts, this is the same as saying: for any pair of equal-cardinality subsets of the terminals, we can connect the two sides with edge-disjoint paths. The specific matching is not guaranteed – we don’t control who connects to whom – but every terminal on the left gets connected to a distinct terminal on the right, and vice versa.

73 Equivalent Definition
For every pair (T’,T’’) of subsets of T with |T’|=|T’’|: can connect every vertex of T’ to a distinct vertex of T’’ with edge-disjoint paths. Note: we do not ask for routing a specific matching.

74 Weaker Definition: α-Well-Linkedness
Given a parameter 0 < α ≤ 1: set T of terminals is α-well-linked in G iff for every partition (A,B) of V(G), |E(A,B)| ≥ α·min(|A∩T|, |B∩T|). Next, I want to define a weaker, more general notion of well-linkedness. It may look like a boring technicality, but it is very important, because it gives us more power and flexibility when dealing with well-linkedness; we’ll soon see why. Now, whenever we partition all vertices into two subsets, the number of edges going across should be at least α times the number of terminals on the smaller side. The difference from the original definition is the slack of α.

75 Weaker Definition: α-Well-Linkedness
Think: α ≈ 1/polylog(n), though α can be anything between 0 and 1. Given parameter α, set T of terminals is α-well-linked in G iff for every partition (A,B) of V(G), |E(A,B)| ≥ α·min(|A∩T|, |B∩T|).

76 Weaker Definition: α-Well-Linkedness
Equivalent: for every pair (T’,T’’) of subsets of T with |T’|=|T’’|, can connect every vertex of T’ to a distinct vertex of T’’ with edge-congestion at most 1/α. Going to the equivalent flow-based definition: whenever we have two equal-sized subsets of the terminals, we can connect them with paths as before, only now we are only guaranteed that the paths cause small edge-congestion, namely 1/α. For example, if α = 1/polylog(n), we get polylogarithmic congestion. Everything else is as before: every terminal on the left connects to a distinct terminal on the right.

77 Weaker Definition: α-Well-Linkedness
Previous definition: α-well-linkedness with α = 1; we will call it 1-well-linkedness. Comparing with the original definition: the original is the special case α = 1, so the new definition is a bit more general. A tiny brute-force checker of the cut-based definition follows.
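To make the cut-based definition concrete, here is a brute-force checker (exponential in |V(G)|, so only for toy examples; assumes a networkx-style graph object with unit edge capacities):

```python
from itertools import combinations

# Brute-force check of the cut-based definition: for every partition (A,B)
# of V(G), require |E(A,B)| >= alpha * min(|A ∩ T|, |B ∩ T|).
def is_alpha_well_linked(G, T, alpha):
    nodes = list(G.nodes())
    T = set(T)
    for r in range(1, len(nodes)):           # skip the two trivial partitions
        for A in map(set, combinations(nodes, r)):
            B = set(nodes) - A
            crossing = sum(1 for u, v in G.edges() if (u in A) != (v in A))
            if crossing < alpha * min(len(A & T), len(B & T)):
                return False
    return True
```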

78 Strongest Version: Node-Well-Linkedness
Set T of terminals is Node-Well-Linked in G iff: for every pair (T’,T’’) of subsets of T, with |T’|=|T’’|, can connect every vertex of T’ to a distinct vertex of T’’ via node-disjoint paths There is a third definition that I have to introduce, because it plays a central role in graph theory. In fact that’s the only definition that’s used in graph theory. They don’t use the other ones. It’s called node-well-linkedness. Here I’m only giving the flow-based view, even though one can define a cut-based view as before, only we would have to use vertex cuts. Now whenever we take two equal-sized subsets of T, we should be able to connect them by paths as before, only this time, the paths need to be vertex-disjoint. This is stronger than the previous two definitions, where we asked that the paths are either edge-disjoint or cause a small edge-congestion, because now the paths are vertex-disjoint.

79 Why All These Definitions?
Node well-linkedness (can connect any pair of subsets via disjoint paths): plays central role in graph minor theory is closely related to treewidth α-well-linkedness (can connect any pair of subsets of terminals with edge-congestion 1/α) is much more convenient to work with. 1-well-linkedness: clean natural definition; useful in EDP Why bother with all these definitions? Node well-linkedness, where the paths are completely disjoint is very important to graph theory, and the only definition that has been used there. \alpha well-linkedness is very convenient to work with, gives you a lot more power and flexibility. The special case of 1-well-linkedness (edge-disjoint paths) makes a lot of sense and has been used a lot in routing problems.

80 Why All These Definitions?
All these definitions are morally equivalent! The cool thing is that we can move between them; that is the main reason α-well-linkedness is so useful. Often, when you need node-well-linkedness, you can get away with using α-well-linkedness, which is much easier to handle.

81 Boosting Theorems [Chekuri, Khanna, Shepherd], [Chekuri, C]
Suppose set T of vertices is α-well-linked in G. Then there is T’ ⊆ T that is 1-well-linked in G, with |T’| ≥ Ω(α·|T|). If the max vertex degree in G is Δ, then there is T’’ ⊆ T that is node-well-linked, with |T’’| ≥ Ω(α·|T|/Δ). The reason is these boosting theorems: whenever we start with a set of terminals that is α-well-linked, we can find in it a large subset that is 1-well-linked, of size at least proportional to α times the size of the original set. On top of that, if the maximum vertex degree is not too large, we can even get node-well-linkedness for a subset of size Ω(α·|T|/Δ).

82 Boosting Theorems [Chekuri, Khanna, Shepherd], [Chekuri, C]
Bounded vertex degree really helps! Can efficiently find a slightly smaller T’ with the same properties. We can also find such subsets of terminals efficiently, but then we lose some additional polylogarithmic and poly(Δ) factors. This looks like a technical statement, but it is incredibly convenient: in many situations, as we are going to see, we need perfect node-well-linkedness, yet we can relax it and get away with the much weaker α-well-linkedness thanks to this statement. I’ll skip the proofs of these theorems in the interest of time. This is another good reason why we want bounded vertex degree; it makes our life much easier.

83 Back to NDP We are done with the detour and now we are going back to NDP.

84 NDP: Well-Linked Instances
Terminals: vertices participating in the demand pairs. An instance is well-linked iff the set of terminals is node-well-linked in G. If the input instance is well-linked, we could try to embed an expander over the terminals into it. We would like the instance to be well-linked because then, intuitively, it feels like we should be able to embed an expander, defined over the terminals, into our graph.

85 NDP: Well-Linked Instances
Terminals: vertices participating in the demand pairs. An instance is well-linked iff the set of terminals is node-well-linked in G. But the input instance may not be well-linked! Turn it into one! In general, the instance may not be well-linked at all; the good thing about well-linkedness is that we can enforce this property, in some sense, through something called well-linked decompositions.

86 Well-Linked Decomposition [Chekuri, Khanna, Shepherd ’04]
Start: Flow solution; every demand pair sends 1/log n flow units End: Cut the graph into well-linked instances; keep at least 1/polylog n of the flow. So we start with some instance of the NDP problem, where every demand pair sends 1/log n flow units to each other. We would like to partition this instance into a number of well-linked instances, and make sure that we preserve at least 1/polylog n-fraction of the flow. This kind of an algorithm is called a well-linked decomposition.

87 Well-Linked Decomposition
Flow Solution: each demand pair sends 1/log n flow units via one path. Want: (1/log² n)-well-linkedness. If the terminals are not (1/log² n)-well-linked: find a sparse cut. So this is our starting point. For now, assume that we are only interested in a much weaker version of well-linkedness, namely (1/log² n)-well-linkedness of the terminals. If we already have this property, there is nothing to do. Otherwise, there is a very sparse cut in the graph with respect to the terminals, and we can use approximation algorithms to compute one.

88 Well-Linked Decomposition
Flow Solution: each demand pair sends 1/log n flow units via one path. Want: (1/log² n)-well-linkedness. If the terminals are not (1/log² n)-well-linked: find a sparse cut; continue on both sides. The key point is that, because the cut is so sparse, we have discarded very little flow relative to the number of demand pairs that remain in the two clusters, and very little flow compared to how much remains on both sides (each pair sends 1/log n flow units, so there is a lot of flow in the system and we only deleted a small fraction of it).

89 Well-Linked Decomposition
We then continue this process on both sides.

90 Well-Linked Decomposition: End
In each cluster the terminals are (1/log² n)-well-linked. Removed at most half the flow. In the end, we have cut the graph into pieces so that the terminals inside each piece are (1/log² n)-well-linked. Because we were always cutting along sparse cuts, we lost only a little flow – say, at most half of it. We then perform boosting in each cluster separately to obtain node-well-linkedness. We need to be careful to preserve the demand pairs: we select a subset of the terminals in each instance that is node-well-linked, and this subset must preserve the demand pairs, which requires some extra work. A schematic version of the decomposition loop is sketched below.
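A schematic sketch of the decomposition loop, under stated assumptions: sparsest_terminal_cut is a hypothetical oracle (e.g., an approximate non-uniform sparsest cut routine) returning a partition of the cluster together with its sparsity relative to the terminals; the bookkeeping of flow-paths is omitted.

```python
# Hypothetical sketch of well-linked decomposition. sparsest_terminal_cut(H, TH)
# is an assumed oracle returning (A, B, sparsity) with
# sparsity = |E(A,B)| / min(|A ∩ TH|, |B ∩ TH|).
def well_linked_decomposition(G, T, alpha):
    """Cut G into clusters whose terminals are alpha-well-linked."""
    pending, finished = [(G, set(T))], []
    while pending:
        H, TH = pending.pop()
        if len(TH) <= 1:
            finished.append((H, TH))
            continue
        A, B, sparsity = sparsest_terminal_cut(H, TH)
        if sparsity >= alpha:
            finished.append((H, TH))          # terminals already well-linked
        else:
            # the cut is sparse, so few flow units are lost; recurse on both sides
            pending.append((H.subgraph(A).copy(), TH & set(A)))
            pending.append((H.subgraph(B).copy(), TH & set(B)))
    return finished
```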

91 Node-Disjoint Paths with Congestion
At the cost of losing a poly(log k)-factor in the approximation, we can assume that the terminals are node-well-linked [Chekuri, Khanna Shepherd ‘04], [Chekuri, C ‘15]. Main idea: embed an expander over the terminals into G, and exploit routing algorithms on expanders. Well-linkedness feels like an expansion property: expansion defined only with respect to the terminals. Unfortunately, even if our terminals are well-linked, the graph may be far from an expander. But the intuition is that it should contain an expander defined over the terminals; we want to find that expander and then exploit known algorithms for routing on expanders. So eventually we will embed an expander, defined over some subset of the terminals, into G, and use it to find the routing.

92 Node-Disjoint Paths with Congestion
At the cost of losing a poly(log k)-factor in the approximation, can assume that the terminals are node-well-linked [Chekuri, Khanna Shepherd ‘04], [Chekuri, C ‘15]. Main idea: embed an expander over the terminals into G. Exploit routing algorithms on expanders. Will get back to this… We have now done all this work defining the different notions of well-linkedness and connecting them; before showing how the embedding works, I’d like to show you another simple application of this notion.

93 More Uses of Well-Linkedness:
Vertex Cut Sparsifiers We will take another detour. I will show another simple example where well-linkedness comes in useful, and will use this example to demonstrate another way to do well-linked decompositions. The example is vertex cut sparsifiers.

94 Vertex Sparsifiers Input: graph G with edge capacities
a subset T of k vertices called terminals. Goal: find a small graph H, containing T, that approximately preserves the routing properties of G. In general, in a vertex sparsifier, we are given a graph G with capacities on its edges, together with a subset of its vertices that we call terminals.

95 Vertex Sparsifiers Input: graph G with edge capacities
a subset T of k vertices called terminals. Goal: find a small graph H, containing T, that approximately preserves some properties of G w.r.t. T. These are the red vertices here. We would like to simulate the graph G on a much smaller graph H, which we call a sparsifier. The sparsifier has to contain all the terminals, and we want it to behave in the same way as G with respect to the terminals. Which properties we want to preserve determines the type of sparsifier.

96 Vertex Cut Sparsifiers [Moitra ’09]
Want to preserve: min-cutG(TA,TB) for all partitions (TA,TB). For example, in cut sparsifiers we would like to preserve all cuts between the terminals: whenever we partition the terminals into two subsets and look at the capacity of the minimum cut separating them in G, we would like to preserve this value, for all such partitions.

97 Vertex Cut Sparsifiers [Moitra ’09]
Want to preserve: min-cutG(TA,TB) for all partitions (TA,TB). Capacity of the min-cut separating TA from TB.

98 Vertex Cut Sparsifiers [Moitra ’09]
Want to preserve: min-cutG(TA,TB) for all partitions (TA,TB). A quality-q sparsifier preserves each such cut to within factor q.

99 Vertex Cut Sparsifiers [Moitra ’09]
Want to preserve: min-cutG(TA,TB) for all partitions (TA,TB). Important parameters: sparsifier quality; sparsifier size. As you may have guessed, the two parameters that interest us most are the quality and the size of the sparsifier. A quality-q cut sparsifier preserves all min-cut values between terminal sets to within factor q. (A brute-force quality check for toy instances is sketched below.)
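A brute-force way to verify sparsifier quality on toy instances (assumes networkx and unit edge capacities; exponential in k, and the function names are mine):

```python
from itertools import combinations
import networkx as nx

def mincut_between(G, TA, TB):
    """Min total capacity of edges of G separating TA from TB."""
    D = nx.DiGraph()
    for u, v in G.edges():
        for x, y in ((u, v), (v, u)):      # undirected edge as two arcs;
            if D.has_edge(x, y):           # parallel edges add capacity
                D[x][y]['capacity'] += 1
            else:
                D.add_edge(x, y, capacity=1)
    for a in TA:
        D.add_edge('s*', a, capacity=float('inf'))
    for b in TB:
        D.add_edge(b, 't*', capacity=float('inf'))
    value, _ = nx.minimum_cut(D, 's*', 't*')
    return value

def sparsifier_quality(G, H, terminals):
    """Worst ratio of min-cut values over all terminal bipartitions."""
    ts, worst = list(terminals), 1.0
    for r in range(1, len(ts)):
        for TA in combinations(ts, r):
            TB = [t for t in ts if t not in TA]
            cg, ch = mincut_between(G, TA, TB), mincut_between(H, TA, TB)
            if min(cg, ch) > 0:
                worst = max(worst, cg / ch, ch / cg)
    return worst
```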

100 Motivation
Faster algorithms. Better approximation factors. Interesting graph-theoretic question. Connections to kernels in FPT. The motivation for such sparsifiers is simple: if we need to solve some problem on G again and again, it may be faster to solve it on H; and if we have algorithms whose approximation factors depend on the size of G, we get a black-box way to turn them into algorithms whose approximation factors depend only on the number of terminals – assuming the problems we consider depend only on the cut values being preserved. This is also a very interesting graph-theoretic question, naturally connected to the notion of kernels in Fixed Parameter Tractability.

101 Vertex Cut Sparsifiers
We are going to make some assumptions: that all edge-capacities are 1, and all terminals have degree 1. This assumption is important to make the algorithm go through; without it we would still get some results, but they would be weaker. We'll also assume that all vertices have constant degree. This assumption is not necessary, but the construction will be a bit easier and consistent with what I'm going to show later. So the last assumption can be removed for free. The other ones can also be removed, but then the result will be weaker. This is just to demonstrate the technique. Assumptions: all edge-capacities are 1 all terminals have degree 1 all vertices have constant degree

102 Contraction-Based Cut Sparsifiers
H We don’t contract the terminals! Here is a very natural way to build sparsifiers. Take the graph G, and partition all non-terminal vertices into clusters. Then contract the clusters. Note that we don’t contract the terminals, the sparsifier must contain the terminals, because we don’t put them into clusters. How good is this sparsifier? Cut V\T into clusters. Contract the clusters.

103 Contraction-Based Cut Sparsifiers
min-cut values can't decrease! The good news is that this cannot decrease the cut values. This is because whenever we have any cut in H, we have the same cut, separating the same terminals, in G. Cut V\T into clusters. Contract the clusters.

104 Contraction-Based Cut Sparsifiers
min-cut values can't decrease! But unfortunately the cut values may increase. For example, maybe the best way to separate these two sets of terminals would be to cut through these clusters, because there are cheap cuts inside. But in H we can't do it: we have to put each cluster completely on one of the sides. This may increase the cut. Can we make this scheme work? As you may have guessed, we can, through well-linkedness. Cut V\T into clusters. Contract the clusters. … but they may increase

105 Contraction-Based Cut Sparsifiers
What kind of clustering would work? min-cut values can't decrease! But unfortunately the cut values may increase. For example, maybe the best way to separate these two sets of terminals would be to cut through these clusters, because there are cheap cuts inside. But in H we can't do it: we have to put each cluster completely on one of the sides. This may increase the cut. Can we make this scheme work? As you may have guessed, we can, through well-linkedness. Cut V\T into clusters. Contract the clusters. … but they may increase Use well-linkedness!

106 Boundary Well-Linkedness, a.k.a. Bandwidth Property [Raecke ‘02]
Boundary vertices So suppose G is our graph, and C is some subgraph of G. We look at the boundary vertices of C – these are all the vertices of C that have a neighbor outside of C. We want them to be well-linked inside the graph induced by C. If they are α-well-linked, then we say that C has the α-bandwidth property. (I don't know if my attribution of this definition to Raecke is correct, but that's the first paper I am aware of where this notion appeared.) This is a very useful notion of well-linkedness, that we will use later. Intuitively, whatever happens inside this cluster is not that interesting, and contracting it does not change the graph too much. Cluster C has the α-bandwidth property iff the set of its boundary vertices is α-well-linked in G[C].
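To make the definition concrete, here is a brute-force check of the bandwidth property, a sketch for tiny graphs only; for simplicity it tests the edge version of well-linkedness, which behaves like the node version up to constants here because we assumed constant degrees.

```python
# Brute-force check of the α-bandwidth property (edge version of
# well-linkedness; exponential time, so a sketch for tiny graphs only).
from itertools import combinations

def has_bandwidth_property(G, C, alpha):
    C = set(C)
    B = {v for v in C if any(u not in C for u in G[v])}  # boundary of C
    H = G.subgraph(C)
    for r in range(1, len(C) // 2 + 1):
        for S in map(set, combinations(C, r)):
            small = min(len(S & B), len(B - S))
            if small == 0:
                continue
            cut = sum(1 for u in S for v in H[u] if v not in S)
            if cut < alpha * small:   # found a cut too sparse w.r.t. B
                return False
    return True
```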

107 Contraction-Based Cut Sparsifiers
If every cluster has the α-bandwidth property, then H is a quality-O(1/α) cut sparsifier Let's now go back to our plan to obtain cut sparsifiers by contracting clusters. The claim is that, if we can ensure that each cluster has the α-bandwidth property, then the resulting sparsifier has quality O(1/α). We already noticed that cuts in H cannot decrease, because every cut in H is also a cut in G. We also noted that they may increase, because maybe the cheapest way to separate these terminals is to cut through these clusters. Minimum cuts separating subsets of terminals don't decrease

108 Contraction-Based Cut Sparsifiers
If every cluster has the α-bandwidth property, then H is a quality-O(1/α) cut sparsifier Let's now look at one of these clusters and its boundary vertices. The boundary vertices are α-well-linked inside the cluster. So the cut going across the cluster cannot be that small: it cannot be a very sparse cut with respect to the black vertices. If we move the whole cluster to one side – the side that has more black vertices – then the cut size won't go up by much, at most a 1/α factor.

109 Contraction-Based Cut Sparsifiers
If every cluster has the α-bandwidth property, then H is a quality-O(1/α) cut sparsifier (We use the fact that all vertices have constant degrees; otherwise we would have to define the bandwidth property differently, and we don't want to do that.)

110 Contraction-Based Cut Sparsifiers
If every cluster has the α-bandwidth property, then H is a quality-O(1/α) cut sparsifier Now we just keep on doing this, moving the next cluster

111 Contraction-Based Cut Sparsifiers
If every cluster has the α-bandwidth property, then H is a quality-O(1/α) cut sparsifier and the next cluster

112 Contraction-Based Cut Sparsifiers
If every cluster has the α-bandwidth property, then H is a quality-O(1/α) cut sparsifier until each cluster is completely contained in one side of the cut. Now this defines a cut in H. The cut value only goes up by a 1/α factor.
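The rounding step of this argument is short enough to write out; a sketch, where `side` describes a cut of G and `boundary` lists each cluster's boundary vertices (both assumed given):

```python
# Sketch of the rounding step of the quality analysis: each cluster moves
# wholly to the side of the cut holding the majority of its boundary vertices.
def round_cut_to_clusters(side, clusters, boundary):
    """side: dict vertex -> 0/1 describing a cut of G;
    boundary: dict cluster index -> boundary vertices of that cluster."""
    cluster_side = {}
    for i in range(len(clusters)):
        votes = [side[v] for v in boundary[i]]
        cluster_side[i] = 1 if 2 * sum(votes) >= len(votes) else 0
    return cluster_side
```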

113 Cut Sparsifiers Cut V\T into clusters. Contract the clusters.
So if we go back to our plan, we now have more details. Cut V\T into clusters. Contract the clusters.

114 Well-linked decomposition
Cut Sparsifiers Well-linked decomposition We want to cut the non-terminal vertices into a small number of clusters that have the α-bandwidth property, for some α that's close to 1. We will do this using another variation of the well-linked decomposition, which is important in its own right. Cut V\T into few clusters that have the α-bandwidth property. Contract the clusters.

115 Well-Linked Decomposition
Are the boundary vertices α-well-linked? Yes No The main idea of the algorithm is very simple. We start with the graph G and the terminals. Our initial cluster contains all non-terminal vertices. We look at all its boundary vertices – the vertices that connect to the terminals – and ask if they are α-well-linked inside that cluster. If so, we are done. If not, then there must be a sparse cut of the cluster with respect to the boundary vertices. Done! Sparse cut w.r.t. boundary vertices

116 Well-Linked Decomposition
Recurse on both sides! we create few new boundary vertices We split the cluster along the cut. The cut is sparse, so the number of edges going across is much smaller than the number of boundary vertices on each side. Because of this we create few new boundary vertices. We then recurse on both sides. Intuition: # of green edges << # of old boundary vertices
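Put together, the decomposition is a short recursion; in this sketch the sparsest-cut step is brute force (tiny graphs only), standing in for the exact or constant-approximate sparsest-cut computation the real algorithm needs.

```python
# Sketch of the well-linked decomposition. The sparsest-cut subroutine is
# brute force, a stand-in for exact/approximate sparsest-cut algorithms.
from itertools import combinations

def sparsest_cut(G, C, B):
    """Sparsest cut of G[C] w.r.t. the boundary set B (edge version)."""
    best_val, best_side = float("inf"), None
    for r in range(1, len(C) // 2 + 1):
        for S in map(set, combinations(C, r)):
            small = min(len(S & B), len(B - S))
            if small == 0:
                continue
            cut = sum(1 for u in S for v in G[u] if v in C - S)
            if cut / small < best_val:
                best_val, best_side = cut / small, S
    return best_val, best_side

def well_linked_decomposition(G, C, alpha):
    C = set(C)
    B = {v for v in C if any(u not in C for u in G[v])}  # boundary of C
    val, S = sparsest_cut(G, C, B)
    if S is None or val >= alpha:    # boundary is already alpha-well-linked
        return [C]
    return (well_linked_decomposition(G, S, alpha)
            + well_linked_decomposition(G, C - S, alpha))
```

A typical initial call is `well_linked_decomposition(G, set(G) - set(terminals), 0.5)`, matching the setup above where the initial cluster holds all non-terminal vertices.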

117 Well-Linked Decomposition
sparsifier quality What well-linkedness factor can we achieve? How many clusters? We continue this process until every cluster has the α-bandwidth property. What we want to know is two things. What is the well-linkedness factor we can achieve? This will be the quality of our sparsifier. How many clusters do we create? This will determine the sparsifier size. As you would expect, there is a tradeoff between these two parameters. The analysis is done by a pretty simple charging scheme, but it's a bit different in different regimes of the parameters. sparsifier size It's a tradeoff!

118 Well-Linked Decomposition
k - # of terminals α – well-linkedness parameter N - # of edges connecting the clusters Ex: ½-well-linkedness; N≤k^5 Non-constructive Let's denote by k the number of terminals. As before, α is the well-linkedness that we are trying to achieve. The number of edges connecting the clusters is denoted by N. We will use it to bound the number of clusters. We can achieve a constant well-linkedness with N, the number of edges, polynomial in k. The closer the well-linkedness is to 1, the higher the polynomial. For example, we can get ½-well-linkedness with N at most k^5. This does not give an efficient algorithm, because we need to be able to compute sparsest cuts exactly, or to within a constant approximation. constant well-linkedness with N=poly(k)

119 Well-Linked Decomposition
k - # of terminals α – well-linkedness parameter N - # of edges connecting the clusters If we are willing to go to worse well-linkedness, say 1/polylog k, then we get this expression. In particular, we can ensure that the number of edges connecting the clusters is much smaller than k. This is a constructive result, because now we can use approximation algorithms for sparsest cut. This regime is very useful, and we will be using it later today. It gives a very compact graph, with few edges. For vertex sparsifiers, only the first regime is interesting though. constructive constant well-linkedness with N=poly(k) smaller well-linkedness with N≤O(αk log k)

120 Well-Linked Decomposition
k - # of terminals α – well-linkedness parameter N - # of edges connecting the clusters can get N<<k with 1/polylog(k) well-linkedness! If we are willing to go to worse well-linkedness, say 1/polylog k, then we get this expression. In particular, we can ensure that the number of edges connecting the clusters is much smaller than k. This is a constructive result, because now we can use approximation algorithms for sparsest cut. This regime is very useful, and we will be using it later today. It gives a very compact graph, with few edges. For vertex sparsifiers, only the first regime is interesting though. constant well-linkedness with N=poly(k) smaller well-linkedness with N≤O(αk log k)

121 Cut Sparsifiers So the conclusion is that we can get a constant-quality sparsifier with poly(k) clusters. Kratsch and Wahlstrom got somewhat similar results using very different techniques. Cut V\T into few clusters that have the α-bandwidth property. Contract the clusters.

122 Cut Sparsifiers Can get a constant-quality sparsifier with poly(k) clusters [C ‘12], [Kratsch, Wahlstrom ’12] So the conclusion is that we can get a constant-quality sparsifier with poly(k) clusters. Kratsch and Wahlstrom got somewhat similar results using very different techniques. Cut V\T into few clusters that have the α-bandwidth property. Contract the clusters.

123 Vertex Cut Sparsifiers
Lots of great work done Lots of great open questions! Many interesting connections to graph theory. Unfortunately I don't have time to talk more about vertex sparsifiers. There is lots of great work that was done on vertex sparsifiers, and there are many really great open problems there; we don't understand them well at all. There are also lots of interesting connections to graph theory. Unfortunately, due to lack of time I can't talk more about this…

124 Well-Linked Decomposition
k - # of terminals α – well-linkedness parameter N - # of edges connecting the clusters can get N<<k with 1/polylog(k) well-linkedness! One important take-away from this part is the bandwidth property, and this different kind of well-linked decomposition. They are very important; we will use them later. One thing to remember is that if we are willing to compromise on well-linkedness and settle for 1/polylog k, we can get a very small number of edges connecting the clusters, even much smaller than k. We will use this later. Constant well-linkedness with N=poly(k) smaller well-linkedness with N≤O(αk log k)

125 Back to NDP with Congestion
This is the end of the second detour. Now we return to NDP with congestion. Let me remind you where we stopped.

126 NDP: Well-Linked Instances
Terminals: vertices participating in the demand pairs An instance is well-linked iff the set of terminals is node-well-linked in G. Enough to get polylog(k)-approximation with congestion 2 on well-linked instances. We said that terminals are vertices participating in demand pairs. We said that an instance is well-linked iff its terminals are node-well-linked. And we said that it is enough to get a polylog(k)-approximation on well-linked instances, at the cost of losing a polylog(k) factor in the solution value.

127 Main Idea [Chekuri, Khanna, Shepherd], [Rao, Zhou]
Embed an expander over the terminals into G! The main idea has matured over several papers. Well-linkedness reminds us of expansion. The feeling is that if the terminals are well-linked, then there is an expander over the terminals sitting in the graph, and we want to find that expander. In other words, we'll try to embed an expander over the terminals into G. Like, maybe, X is an expander that we want to embed into G. The vertices of X are the terminals of G, or a subset of them.

128 Main Idea [Chekuri, Khanna, Shepherd], [Rao, Zhou]
Embed an expander over the terminals into G! What does it mean to embed an expander X into G? Each vertex is embedded as a connected sub-graph of G, containing the right terminal. The clusters must be disjoint. Each edge is a path connecting the corresponding clusters.

129 Main Idea [Chekuri, Khanna, Shepherd], [Rao, Zhou]
Embed an expander over the terminals into G! The clusters are disjoint from each other and the paths are disjoint from each other. But a path may visit many clusters. The clusters are disjoint The paths are disjoint But a path may visit many clusters

130 Main Idea [Chekuri, Khanna, Shepherd], [Rao, Zhou]
Embed an expander over the terminals into G! For example, a path may look like this. This is the reason why we get congestion 2. The clusters are disjoint The paths are disjoint But a path may visit many clusters

131 Main Idea [Chekuri, Khanna, Shepherd], [Rao, Zhou]
1. Embed an expander over the terminals into G 2. Find a routing on node-disjoint paths in the expander 3. Translate it into congestion-2 routing in G Here is the plan for the rest of the algorithm. Once we embed an expander over the terminals into G, we can use known algorithms to route many pairs via node-disjoint paths in the expander. We can then translate it into congestion-2 routing in G. Let's see why. The clusters are disjoint The paths are disjoint But a path may visit many clusters

132 Embedding an Expander into G
Look at some path in the expander. Maybe it connects s to t. We want to translate it into a path connecting s to t in G. For each edge on the path, take its embedding – the red paths. The blue clusters are connected, so we can stitch them together into a path. If the paths in X are vertex-disjoint, then each red path and each blue component will be used at most once, so each vertex of G will be used at most twice; this is because a path can go through many clusters. This gives routing with congestion at most 2 in G. Routing on vertex-disjoint paths in X gives a routing in G with vertex-congestion 2!
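A sketch of the stitching, under assumed inputs: `cluster_of[v]` is the connected vertex set of G embedding expander vertex v, and `path_of[(u, v)]` is the path embedding edge (u, v), running from cluster_of[u] to cluster_of[v]; both names are hypothetical data structures, not part of any library.

```python
# Sketch of translating one expander path into a walk in G. Assumed given:
# cluster_of[v] = connected vertex set embedding expander vertex v;
# path_of[(u, v)] = path embedding edge (u, v), from cluster u to cluster v.
import networkx as nx

def stitch(G, expander_path, cluster_of, path_of):
    route = []
    for u, v in zip(expander_path, expander_path[1:]):
        p = path_of.get((u, v)) or list(reversed(path_of[(v, u)]))
        if route:
            # connect the previous endpoint to p's start inside cluster_of[u],
            # which is connected by the definition of the embedding
            hop = nx.shortest_path(G.subgraph(cluster_of[u]), route[-1], p[0])
            route += hop[1:]
            route += p[1:]
        else:
            route += p
    return route
```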

133 Main Idea 1. Embed an expander over the terminals into G
2. Find a routing on node-disjoint paths in the expander 3. Translate it into congestion-2 routing in G Going back to our plan, it's more or less clear how to do the last two steps. The main question is: how do we do the first step?

134 Main Idea 1. Embed an expander over the terminals into G
2. Find a routing on node-disjoint paths in the expander 3. Translate it into congestion-2 routing in G That's what we'll focus on. There is a really nice framework that allows one to embed an expander into a graph, called the cut-matching game. For now, think about it as some abstract game. We will later show how to embed an expander into a graph using it.

135 Cut-Matching Game [Khandekar, Rao, Vazirani ’06], [Orecchia, Schulman, Vazirani, Vishnoi ‘08]
Cut Player: wants to build an expander Matching Player: wants to delay its construction The game is played between two players, the cut player and the matching player. We start with a graph that has an even number of vertices and no edges. In every iteration we'll add some edges to the graph, until it becomes an expander. The cut player wants it to happen as quickly as possible; the matching player wants to delay the construction of the expander.

136 Cut-Matching Game [Khandekar, Rao, Vazirani ’06], [Orecchia, Schulman, Vazirani, Vishnoi ‘08]
Cut Player: wants to build an expander Matching Player: wants to delay its construction There is a strategy for the cut player, s.t. after O(log²n) iterations, we get an expander! In every iteration, the cut player computes a partition of the vertices into two equal-sized subsets. The matching player returns a complete matching between them. We add the matching edges to the expander we are building. This ends the iteration. In the second iteration, the cut player computes a different partition into equal-sized subsets. The matching player produces another matching. The edges are added to the graph. We keep on doing this, until the graph is an expander. The question is for how long do we need to do this? Khandekar, Rao and Vazirani show that no matter what the matching player does, the cut player can play this game so that O(log²n) iterations are sufficient. We even have an efficient algorithm that computes the partition for the cut player in every step.
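Here is the skeleton of the game in Python. The random bisection below is only a naive stand-in for the real KRV cut-player strategy (which is based on random walks), and `matching_player` is any function returning a perfect matching between the two halves.

```python
# Skeleton of the cut-matching game. The random bisection is a naive
# stand-in for the real KRV cut-player strategy.
import math
import random
import networkx as nx

def cut_matching_game(nodes, matching_player):
    nodes = list(nodes)                  # assumed: even number of vertices
    X = nx.MultiGraph()
    X.add_nodes_from(nodes)
    rounds = int(math.log2(len(nodes)) ** 2) + 1       # O(log^2 n) rounds
    for _ in range(rounds):
        A = set(random.sample(nodes, len(nodes) // 2))  # naive cut player
        B = set(nodes) - A
        X.add_edges_from(matching_player(A, B))  # perfect matching A <-> B
    return X

# e.g. a legal (if unhelpful) matching player:
# matching_player = lambda A, B: zip(sorted(A), sorted(B))
```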

137 Embedding Expander into Graph
How does it help us with embedding an expander into G? The blue vertices are the terminals; they are well-linked. We'll build an expander and embed it into G by playing that game. Start with an empty graph that contains only the terminals. In the first iteration, we compute the partition of the terminals as the cut player would. The terminals are node-well-linked, so we can connect the two sets of terminals by disjoint paths. This routing defines a matching, and we add those edges to the expander we are building.
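One iteration of this embedding is easy to sketch with networkx: route A to B by node-disjoint paths via the standard super-source/super-sink trick (well-linkedness guarantees a full routing exists), then read the matching off the path endpoints.

```python
# Sketch of one iteration: connect terminal sets A and B by node-disjoint
# paths using a super-source and a super-sink, then read off the matching.
import networkx as nx

def route_and_match(G, A, B):
    H = G.copy()
    H.add_edges_from(("s*", a) for a in A)   # super-source attached to A
    H.add_edges_from((b, "t*") for b in B)   # super-sink attached to B
    paths = [p[1:-1] for p in nx.node_disjoint_paths(H, "s*", "t*")]
    matching = [(p[0], p[-1]) for p in paths]  # pairs an A-vertex with a B-vertex
    return paths, matching
```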

138 Embedding Expander into Graph
After O(log²k) iterations, we get an expander embedded into G. We continue like this: in the second iteration, we compute the partition, route the two sets, get a matching, and so on. We are guaranteed that after O(log²k) iterations we get an expander, and simultaneously we get its embedding into G. This scheme was suggested for faster algorithms for sparsest cut, and it works well there. But it does not work so well for us. The problem is that in every iteration we compute the paths from scratch, and we may reuse the vertices and the edges, resulting in congestion log²k. Problem: congestion Ω(log²k)

139 Embedding Expander into Graph
After O(log²k) iterations, we get an expander embedded into G. As an aside, we can use the cut-matching game to reduce the maximum vertex degree to log²k, just like we used randomized rounding to reduce it to O(log n). We will need to re-do the boosting to get the node-well-linkedness back. Overall, as we said, the problem with this approach is that in every iteration we compute the paths from scratch, and we may reuse the vertices and the edges, resulting in congestion log²k. If we could somehow split the graph into pieces and compute the flow in each piece separately, we would not accumulate the congestion. This brings us to defining another very useful combinatorial structure, called the Path-of-Sets System. Can use the cut-matching game to reduce max vertex degree to O(log²k) Will need to redo boosting Problem: congestion Ω(log²k)

140 Path-of-Sets System

141 A Path-of-Sets System: width w, length L
L disjoint connected clusters Two disjoint sets Ai, Bi of w vertices in each cluster Ci Ai ∪ Bi is node-well-linked in Ci For all i, a set Pi of w disjoint paths connecting Bi to Ai+1 All paths are disjoint from each other and internally disjoint from the clusters For now, think about it as some abstract combinatorial object. The system has two parameters: width w and length L. We start with L disjoint clusters. Each is a connected subgraph of G. Each cluster has two sets A_i, B_i of distinct vertices, of cardinality w each (it has more vertices, but these are special). We require that A_i ∪ B_i is well-linked in C_i. Every consecutive pair of clusters is connected by a set of w paths; they connect B_i to A_{i+1}. The paths are disjoint from each other and internally disjoint from the clusters: each path starts in a cluster, ends in another cluster, and all its other vertices are disjoint from the clusters. Parameter L: number of clusters; parameter w: number of paths in each set.

142 From Well-Linkedness to Path-of-Sets
Theorem [C, ’11], [C, Li ’12], [Chekuri, C ’13]: Suppose G has a set of k node-well-linked vertices and max vertex degree polylog(k). Then we can efficiently construct a path-of-sets system in G with parameters L and w, provided L and w are not too large. This is a very useful theorem, that evolved from a long sequence of work. The theorem shows that if we have a set of k well-linked vertices in our graph, and the maximum vertex degree is bounded by polylog(k), we can build a PoS system, as long as L and w are not too large. The tradeoff appears here. We can improve it if the proof is existential, non-constructive.

143 From Well-Linkedness to Path-of-Sets
We'll use: L=O(log²k) w=k/polylog k Theorem [C, ’11], [C, Li ’12], [Chekuri, C ’13]: Suppose G has a set of k node-well-linked vertices and max vertex degree polylog(k). Then we can efficiently construct a path-of-sets system in G with parameters L and w, provided L and w are not too large. For our problem, we will use this setting of parameters: the number of clusters is log²k; the width is k/polylog k. On top of that, the theorem gives us some extras: we can ensure that there are w terminals, that form demand pairs, and connect to A1 by disjoint paths. So here is the picture we'll get. Extras: Can connect w terminals to A1 by disjoint paths Can make sure they form demand pairs!

144 From Well-Linkedness to Path-of-Sets
The paths are disjoint from each other and from the PoS system Here is the picture that we get. The terminals form demand pairs. Their paths are disjoint from the PoS system (except for the last vertex). We claim that this structure is sufficient in order to embed the expander into our graph. Let's see why. The terminals form demand pairs Given the PoS, can embed an expander!

145 Embedding the Expander
Let's focus on some cluster Ci of the Path-of-Sets system. Recall that Ai and Bi are node-well-linked inside the cluster, so we can connect them by node-disjoint paths inside the cluster. We don't know what matching we are going to route, but we can rearrange the vertices so it will look like this. We can then go and do it to each cluster in turn. We'll get w paths that go across all clusters in order. Ai ∪ Bi is node-well-linked inside Ci

146 Embedding the Expander
We now add back the terminals and their paths. Here we'll try to build an expander on a subset of the terminals. We'll only use the terminals that are connected to A1.

147 Embedding the Expander
Each vertex of the expander needs to be embedded into some connected sub-graph of G. We embed it into the corresponding path. Expander vertex → the path containing the terminal

148 Embedding the Expander
In this picture, the orange terminal embeds into the orange path, the blue terminal embeds into the blue path, and so on. Expander vertex → the path containing the terminal

149 Embedding the Expander
What about the edges? We play the cut-matching game in order to embed the edges. Use the cut player to partition the vertices into two subsets. Go to the first cluster. This defines a partition of the vertices on the left. Expander edges? cut-matching game!

150 Embedding the Expander
These vertices are node-well-linked inside the cluster, so we can connect them by disjoint paths inside C1. Expander edges? cut-matching game!

151 Embedding the Expander
node-disjoint paths Maybe these are the paths. The black paths are disjoint from each other, but not from the paths into which the terminals are embedded; that's why we get congestion 2. The paths define some matching, that we treat as the answer of the matching player. We add the matching edges to the expander we are constructing. Expander edges? cut-matching game!

152 Embedding the Expander
Then we go to the second iteration. Use the cut player to partition the vertices. Now we go to the second cluster. This defines a partition of the vertices on the left. Because they are well-linked, we can route them by disjoint paths. This defines a matching, whose edges are added to the expander.

153 Embedding the Expander
We continue like this for log²k iterations. In each iteration we can use a distinct cluster. After log²k iterations, we are done. We have embedded the expander with congestion 2! After O(log²k) iterations, we obtain an expander embedded into G with vertex-congestion 2.

154 Algorithm for NDPwC in Well-Linked Instances
Find a Path-of-Sets System Embed an expander into G Find vertex-disjoint routing in the expander To summarize, here is how the algorithm works: start with a well-linked instance; compute a Path-of-Sets system (I still owe you this proof); use the Path-of-Sets system to embed an expander through the cut-matching game; solve the routing problem on the expander; transform the solution into a routing in G. Transform into routing in G

155 Structural Result If G contains a large node-well-linked set of vertices, and has a bounded max vertex degree, then it contains a large Path-of-Sets System Treewidth sparsifiers Along the way we have shown an important structural result: if G contains a large well-linked set of vertices, and has bounded max vertex degree, then it contains a large Path-of-Sets system. This result turned out to be very useful in several other scenarios. I will now talk a bit about one of them: the excluded grid theorem. Excluded grid theorem Vertex flow sparsifiers Large-treewidth graph decompositions

156 Excluded Grid Theorem [Robertson, Seymour ‘86]
The theorem was proved as part of Robertson and Seymour’s graph minor series. It deals with the notion of treewidth. So first I’ll give some intuition on treewidth.

157 Treewidth: Motivation
Simple graphs We all like trees. Trees are simple graphs that are usually easy to understand. Many problems have efficient algorithms on trees, usually through dynamic programming.

158 Treewidth: Motivation
Simple graphs One reason they are easy is that they have very small vertex separators: you can delete one vertex and cut the tree into pieces. This allows us to do dynamic programming easily on trees for many problems.

159 Treewidth: Motivation
Simple graphs Complicated graphs And then there are complicated graphs, where often problems are hard to solve. But these are really two extremes. It’s not like we are either here or there. It feels like there should be a more gradual transition from trees to these complicated graphs. If we take a tree and add a few edges to it, it does not suddenly become a very complicated graph. It would be really great to be able to quantify how complex the graph is.

160 Treewidth: Motivation
Simple graphs Complicated graphs Treewidth: measures how complex the graph is. Treewidth does exactly this. The treewidth of a graph is a number between 1 and n-1. If the treewidth is 1, your graph is a tree or a forest. As it gets closer to n-1, the graph is more complex. Intuitively, if the treewidth is k, then we can get dynamic-programming based algorithms with running time exponential in k (not a promise, just intuition). Treewidth is defined by means of a tree decomposition. Treewidth k ⇒ DP-based algorithms with running time 2^O(k)·poly(n).

161 Idea: simulate the graph by a tree-like structure
Tree Decomposition Idea: simulate the graph by a tree-like structure Suppose we are given some graph. I want to define a tree decomposition for it. Think about it as if we simulate this graph by a tree-like structure. Example from Bodlaender's talk

162 Tree Decomposition
In order to build a tree decomposition, we start with some tree – any tree we want. Every vertex of the tree has a bag associated with it. Into each bag we can put any vertices of our graph. Each vertex may be added many times. In order for it to be a valid tree decomposition, there are two rules that we need to observe. Example from Bodlaender's talk

163 Tree Decomposition The first rule is that, if we take any vertex of our graph and look at the bags containing it, then there must be at least one such bag, and all such bags must form a connected sub-graph of the tree. For every vertex of G, bags containing it must form a non-empty connected sub-graph of the tree Example from Bodlaender's talk

164 Tree Decomposition For example, let's take the vertex a. For every vertex of G, bags containing it must form a non-empty connected sub-graph of the tree Example from Bodlaender's talk

165 Tree Decomposition It belongs to these three pink bags. For every vertex of G, bags containing it must form a non-empty connected sub-graph of the tree Example from Bodlaender's talk

166 Tree Decomposition They clearly form a connected non-empty subtree of our tree. For every vertex of G, bags containing it must form a non-empty connected sub-graph of the tree Example from Bodlaender's talk

167 Tree Decomposition
Now to the second rule.

168 For every edge of G, some bag must contain both of its endpoints
Tree Decomposition The second rule is that for every edge of G, at least one bag must contain both of its endpoints. For example, let's take some edge, say (a,g). For every edge of G, some bag must contain both of its endpoints Example from Bodlaender's talk

169 For every edge of G, some bag must contain both of its endpoints
Tree Decomposition For example, let's take some edge: (a,g). For every edge of G, some bag must contain both of its endpoints Example from Bodlaender's talk

170 For every edge of G, some bag must contain both of its endpoints
Tree Decomposition The pink bag has both a and g. This has to happen for each edge in the original graph. If we follow these two rules, we get a valid tree decomposition of G. For every edge of G, some bag must contain both of its endpoints Example from Bodlaender's talk

171 Tree Decomposition width: 2
width: 2 Assume now that we are given some valid tree-decomposition of G. The decomposition width is the maximum number of vertices in any bag, minus 1. The "-1" here is a bit funny and can be ignored; it's there to ensure that the treewidth of a tree is 1. This decomposition, for example, has width 2, because each bag has at most 3 vertices in it. The treewidth of a graph is the smallest width of any tree decomposition. Decomposition width = max # of vertices in a bag -1 Treewidth: min width of any decomposition
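Both rules and the width are mechanical to check; a sketch, where T is the decomposition tree (a networkx graph) and `bags` maps each node of T to a set of vertices of G:

```python
# Checking the two rules of a tree decomposition and computing its width.
import networkx as nx

def is_tree_decomposition(G, T, bags):
    for v in G:
        # rule 1: bags containing v form a non-empty connected subtree of T
        hits = [t for t in T if v in bags[t]]
        if not hits or not nx.is_connected(T.subgraph(hits)):
            return False
    for u, v in G.edges:
        # rule 2: some bag contains both endpoints of the edge
        if not any(u in bags[t] and v in bags[t] for t in T):
            return False
    return nx.is_tree(T)

def width(bags):
    return max(len(b) for b in bags.values()) - 1   # the funny "-1"
```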

172 small treewidth ⇒ small separators
Tree Decomposition When the treewidth is small, the graph behaves like a tree. This is because it has small vertex separators. For example, what happens when we take one vertex out of the tree? small treewidth ⇒ small separators

173 small treewidth ⇒ small separators
Tree Decomposition For example, what happens when we take one vertex out of the tree? This would mean we take all the vertices in its bag out of the graph. We claim that the graph decomposes into pieces. small treewidth ⇒ small separators

174 Tree Decomposition
The tree itself decomposes into these areas once we delete the vertex. Let's look at all the vertices of the graph sitting in the bags of each one of these areas. The vertices in the bag may also belong to these areas; let's take them out.

175 Every bag defines a vertex separator in G!
Tree Decomposition So we took out the vertices that belong to the bag. The definition of treewidth ensures that we cannot have edges connecting different pieces. So every bag defines a separator in our graph. This can be naturally exploited in dynamic programming-based algorithms. Typically, but not always, the running time would be exponential in the treewidth. Every bag defines a vertex separator in G! Can exploit in DP!

176 On Treewidth Computing tw(G) is NP-hard [Arnborg, Corneil, Proskurowski ‘87] But can compute in time exponential in tw(G) and linear in n [Bodlaender ‘96] Efficient approximation A few useful facts about treewidth: computing the treewidth is NP-hard. But it can be done in time exponential in the treewidth and linear in n. We can also compute the treewidth and the corresponding tree-decomposition approximately, with approximation factor √(log(treewidth)), using algorithms for balanced cuts.

177 Treewidth of Some Graphs
Tree or forest: 1 Cycle: 2 (√n×√n)-grid: √n n-vertex constant-degree expander: Ω(n) complete graph: n-1 Here are treewidth values for some graphs. A tree or forest has treewidth 1; a cycle: treewidth 2; a √n by √n grid has treewidth √n; an n-vertex constant-degree expander: Ω(n); a complete graph on n vertices: n-1.
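These values are easy to experiment with: networkx ships a heuristic, `treewidth_min_degree`, that returns an upper bound on the treewidth together with a decomposition (an upper bound only, so it cannot certify the Ω(n) entries).

```python
# Sanity-checking the small entries of the table above with a heuristic
# that returns an upper bound on the treewidth plus a tree decomposition.
import networkx as nx
from networkx.algorithms.approximation import treewidth_min_degree

for name, G in [("path", nx.path_graph(10)),
                ("cycle", nx.cycle_graph(10)),
                ("5x5 grid", nx.grid_2d_graph(5, 5))]:
    upper_bound, decomposition = treewidth_min_degree(G)
    print(name, upper_bound)    # expect 1, 2, and about 5
```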

178 Treewidth and Well-Linkedness
Thm. Let k be the maximum size of a node-well-linked set of vertices in G. Then: k/4-1≤treewidth(G)≤k-1 There is a very useful connection between treewidth and well-linkedness that we will use. Let's look at the largest-cardinality set of vertices that's node-well-linked, and let's look at its cardinality. Then this cardinality is the same as the treewidth, to within a factor of 4. So we could have defined treewidth as the largest cardinality of a node-well-linked set of vertices, but then we wouldn't get the nice tree decomposition. I'll only show one direction of this theorem, the easier one, and a slightly weaker bound.

179 Proof of: tw(G)≥k/8 Given: a set of k node-well-linked terminals
Given: a set of k node-well-linked terminals We'll only show that the treewidth cannot be too small relative to k, where k is the cardinality of the largest node-well-linked set of vertices. So we are given a set of k vertices that are node-well-linked; let's call them terminals. Remember that every bag defines a separator in our graph. We can travel down the tree until we find a bag that partitions the terminals roughly evenly, so it defines a balanced cut with respect to the terminals. Maybe it's this bag. Some tree bag defines a balanced separator w.r.t. the terminals

180 k/4 node-disjoint paths must go through the bag!
Proof of: tw(G)≥k/8 k/4 node-disjoint paths must go through the bag! ≥k/4 terminals ≥k/4 terminals If we look at these areas of the tree, then at least two of them should contain at least a quarter of the terminals. The terminals are node-well-linked, so we should be able to connect the terminals in the two pieces with node-disjoint paths. No edges connect these pieces, so the paths have to go through the bag. So we can't have a tree decomposition of width less than k/8.

181 Treewidth and Well-Linkedness
Thm. Let k be the maximum size of a node-well-linked set of vertices in G. Then: k/4-1≤treewidth(G)≤k-1 I will not show the other direction, but we will be using this theorem.

182 High-Treewidth Graphs
Low-Treewidth Graphs High-Treewidth Graphs Trees Now the picture is more nuanced. We have trees that we all like. Low-treewidth graphs are kind of OK. But then there are large-treewidth graphs, that we don’t know what to do with.

183 High-Treewidth Graphs
Low-Treewidth Graphs High-Treewidth Graphs Trees The question is: can we say something useful about them? Something that we could later use to handle such graphs in many different scenarios. Robertson and Seymour developed a big hammer to deal with such graphs, called the excluded grid theorem.

184 Excluded Grid Theorem [Robertson, Seymour ‘86]
If the treewidth of G is large, then G contains a large grid as a minor. Can obtain the grid from G by a sequence of: edge / vertex deletions and edge-contractions. The theorem states that if the treewidth of a graph is large, then it must contain a large grid as a minor. We already discussed treewidth; let's now define what it means that one graph is a minor of another. This is a grid. Saying that it's a minor of my graph means that I should be able to delete some edges and vertices from my graph, and contract some edges, and get a grid. So there is a sequence of edge deletions, vertex deletions, and edge contractions that will take me from my graph to the grid. This is a standard definition of minors. There is another equivalent one that I like more.

185 Minors by Embedding To say that this grid is a minor of G is the same as saying that we can embed the grid into G. The embedding is very similar to how we embedded the expander. Every vertex embeds into a connected cluster. Every edge embeds into a path.

186 Minors by Embedding But now we don’t allow any congestion, so:
The clusters are disjoint The paths are disjoint And the paths can’t visit the clusters along the way.

187 G has a large grid minor ≅ G contains a subdivision of a large wall
A Wall There is a third definition. This one only works for grids, unlike the previous two definitions that worked for any graph. To say that G contains a large grid as a minor, is the same as saying that G contains a large wall as a subgraph, or a subdivision of this wall, so we can replace every edge by a path. G has a large grid minor ≅ G contains a subdivision of a large wall

188 Why “Excluded Grid Theorem”?
[Robertson, Seymour ‘86] If the treewidth of G is large, then G contains a large grid as a minor. Why “Excluded Grid Theorem”? Excluded Grid Theorem [Robertson, Seymour ‘86] If G excludes a large grid as a minor, then its treewidth is small. You may also wonder why the theorem was called “excluded grid theorem”. It’s because we can restate it in this way: If G excludes a large grid as a minor, then its treewidth is small.

189 Excluded Grid Theorem [Robertson, Seymour ‘86]
If the treewidth of G is large, then G contains a large grid as a minor. Why a grid? Another thing you may wonder about: why a grid, and not some other graph as a minor? Wouldn't it be nice to get something else, like a clique minor? So let's try replacing the grid with some other family X of graphs.

190 If the treewidth of G is large, then G contains a large X as a minor.
Excluded Grid Theorem [Robertson, Seymour ‘86] If the treewidth of G is large, then G contains a large X as a minor. Some graph family In the new statement of the theorem, X is some family of graphs, such as grids, but it can also be cliques, or any other family you like. Our first observation is that if X is a family of non-planar graphs, like cliques, then the theorem is false: this is because the grid itself has a large treewidth, but it cannot contain any non-planar graph as a minor. The second observation is that if X is any planar graph, then it is a minor of a large enough grid. So if we prove the theorem for grids, we prove it for all X's that are planar. So the statement of the theorem now makes perfect sense; it is the most general statement that's true.

191 grid has large treewidth, does not contain X as a minor
Excluded Grid Theorem [Robertson, Seymour ‘86] If the treewidth of G is large, then G contains a large X as a minor. grid has large treewidth, does not contain X as a minor If X is non-planar, the theorem is false! In the new statement of the theorem, X is some family of graphs, such as grids, but it can also be cliques, or any other family you like. Our first observation is that if X is a family of non-planar graphs, like cliques, then the theorem is false: this is because the grid itself has a large treewidth, but it cannot contain any non-planar graph as a minor. The second observation is that if X is any planar graph, then it is a minor of a large enough grid. So if we prove the theorem for grids, we prove it for all X's that are planar. So the statement of the theorem now makes perfect sense; it is the most general statement that's true.

192 proving the Thm for grid proves it for all planar X’s.
Excluded Grid Theorem [Robertson, Seymour ‘86] If the treewidth of G is large, then G contains a large X as a minor. If X is non-planar, the theorem is false! If X is planar then it is a minor of a large enough grid. In the new statement of the theorem, X is some family of graphs, such as grids, but it can also be cliques, or any other family you like. Our first observation is that if X is a family of non-planar graphs, like cliques, then the theorem is false: this is because the grid itself has a large treewidth, but it cannot contain any non-planar graph as a minor. The second observation is that if X is any planar graph, then it is a minor of a large enough grid. So if we prove the theorem for grids, we prove it for all X's that are planar. So the statement of the theorem now makes perfect sense; it is the most general statement that's true. proving the Thm for grid proves it for all planar X's.

193 unless you need to solve NDP…
Grids are Great! unless you need to solve NDP… Luckily, grids are also really great graphs (unless you need to solve the NDP problem on them). They are well-structured and easy to understand. Just knowing that G contains a large grid as a minor gives us a lot of useful information about G: it has many disjoint cycles; it has many disjoint cycles of length 0 mod m, if m is much smaller than the grid size; unless you are trying to solve NDP, for all other routing problems the grid is a very convenient routing structure, so we know that G contains one; the size of the vertex cover in G is large, just by looking at the grid; the feedback vertex set is large; and so on.

194 Grids are Great! If G has a large grid as a minor, then:
G contains many disjoint cycles G contains many disjoint cycles of length 0 mod m, for m<<(grid size) G contains a convenient routing structure The size of the vertex cover in G is large Luckily, grids are also really great graphs (unless you need to solve the NDP problem on them). They are well-structured and easy to understand. Just knowing that G contains a large grid as a minor gives us a lot of useful information about G: it has many disjoint cycles; it has many disjoint cycles of length 0 mod m, if m is much smaller than the grid size; for all routing problems other than NDP the grid is a very convenient routing structure; the size of the vertex cover in G is large; the feedback vertex set is large; and so on.

195 Applications Fixed parameter tractability Erdos-Posa type results
Graph minor theory Algorithm for NDP where k is small Algorithms for graph crossing number The theorem has found many applications in many different areas. We’ll mention a couple of typical ways in which the theorem is being used.

196 Application 1: Node-Disjoint Paths
The first application is node-disjoint paths.

197 Node-Disjoint Paths for Constant k
Constant k: efficiently solvable [Robertson, Seymour ’90] Running time: f(k)·n² [Kawarabayashi, Kobayashi, Reed ‘12] I have shown this slide at the beginning of the talk. It refers to the algorithm of Robertson and Seymour for NDP, when the number k of the demand pairs is a constant. Here is how the algorithm works at a very high level.

198 Node-Disjoint Paths for Constant k
If tw(G)<w, can solve by dynamic programming in time 2^O(k+w)·poly(n). Will use: w=f(k) What if tw(G)>f(k)? First, if the treewidth is bounded by some value w, we can solve the problem by using dynamic programming. The running time is exponential in k (the number of demand pairs) and in the treewidth, times poly(n). We will use some large function of k as the threshold w for the treewidth. If the treewidth is below f(k), then we are OK. What happens if the treewidth is greater than that threshold?

199 Node-Disjoint Paths for Constant k
If tw(G)<w, can solve by dynamic programming in time 2^O(k+w)·poly(n). Will use: w=f(k) What if tw(G)>f(k)? Need to prove: the deletion does not affect the routing Then we know that the graph contains a gigantic wall. The wall has a lot of routing capacity, but we only need to route k demand pairs. So we can argue that we don't need that much routing capacity, and that it's OK to delete some vertex of this wall without hurting the routing. Intuitively, it's the middle vertex of the wall, which is far enough from all terminals (though in reality things are a bit more complicated). Proving that this deletion does not hurt the routing requires a lot of work. Idea: G contains a large wall W Delete a vertex in the middle of W

200 Application 2: Fixed-Parameter Tractability
Bidimensionality Theory [Demaine, Hajiaghayi 07] The second example is the use of the theorem in fixed-parameter tractability, and in particular in bidimensionality theory. I am going to show a very simple example. For this example this is not necessarily the best possible algorithm. But it will be convenient to demonstrate the technique on it.

201 Example: Feedback Vertex Set
Feedback Vertex Set: given a graph G, select a min-cardinality subset U of vertices, such that G\U has no cycles. k: size of the feedback vertex set Want: a fixed-parameter tractable algorithm, with running time f(k)·poly(n). The example is feedback vertex set. We are given a graph G. A feedback vertex set is a subset of vertices such that, if we remove them, G will not contain any cycles. Let's say that the minimum feedback vertex set has cardinality k. We want to design an FPT algorithm, so its running time should be some function of k times poly(n).

202 The Algorithm Feedback vertex set size is k, so G cannot contain a grid minor of size more than O(√k)×O(√k). So tw(G)<g(k) for some function g. Use dynamic programming on the tree decomposition to solve the problem in time 2^O(g(k))·poly(n). Here is a simple algorithm. Since the size of the feedback vertex set in the graph is k, the graph cannot contain a grid minor that's too large. For example, if it contained a grid minor of size 4√k by 4√k, then we would need to remove more than k vertices to disconnect all cycles. This can't happen. But then its treewidth is bounded by some function of k, and we can solve the problem by running dynamic programming on the tree decomposition. The running time is exponential in g(k) and polynomial in n. Note: If the graph is planar or excludes a fixed minor, then we can make g(k)=O(√k), and get algorithms that are sub-exponential in k. We'll get back to this algorithm later… In planar/excluded-minor graphs, can take g(k)=O(√k) and obtain algorithms that are sub-exponential in k!
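A toy skeleton of this win/win, with loud caveats: the DP is replaced by brute force (tiny graphs only), g(k) is schematic, and `treewidth_min_degree` only gives an upper bound on the treewidth, while the real algorithm needs a certified lower bound before answering "no".

```python
# Toy win/win skeleton for Feedback Vertex Set via the excluded grid
# theorem. Caveats: brute force instead of the tree-decomposition DP;
# schematic g(k); heuristic treewidth bound where a certified one is needed.
from itertools import combinations
import networkx as nx
from networkx.algorithms.approximation import treewidth_min_degree

def min_fvs_brute(G):
    """Placeholder for the DP: smallest U with G - U acyclic (tiny graphs)."""
    for r in range(G.number_of_nodes()):
        for U in combinations(G.nodes, r):
            if nx.is_forest(G.subgraph(set(G.nodes) - set(U))):
                return set(U)
    return set(G.nodes)

def fvs_at_most_k(G, k):
    g_k = 4 * int(k ** 0.5) + 4               # schematic g(k) = O(sqrt(k))
    width, _ = treewidth_min_degree(G)        # caveat: an upper bound only
    if width > g_k:
        return False   # valid once `width` is a certified tw lower bound
    return len(min_fvs_brute(G)) <= k
```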

203 Excluded Grid Theorem [Robertson, Seymour ‘86]
If the treewidth of G is k, then G contains a grid of size f(k)×f(k) as a minor. To make the theorem more concrete, let's put parameters in. If the treewidth of G is k, then G contains a grid minor of size f(k) by f(k).

204 Excluded Grid Theorem [Robertson, Seymour ‘86]
If the treewidth of G is k, then G contains a grid of size f(k)×f(k) as a minor. How large is f(k)? [Robertson, Seymour ‘94]: f(k)=O(√(k/log k)) Conjecture [Robertson, Seymour ‘94]: This is tight. But: in planar and excluded-minor graphs, f(k)=Ω(k)! [Robertson, Seymour, Thomas ‘94], [Demaine, Hajiaghayi ’08] The important question is: how large is f(k)? The bigger it is, the more useful this theorem is. What is the best we can hope for? It is easy to see that we can't beat √k. A complete graph on n vertices has treewidth n-1, but cannot contain a grid minor of size larger than √n×√n, because we don't have enough vertices for that. Robertson and Seymour used a little more subtle argument to show that the bound is at most √k/√(log k). They also conjectured that this is the right answer, and that we should be able to find grids that are that large.

205 Excluded Grid Theorem [Robertson, Seymour ‘86]
If the treewidth of G is k, then G contains a grid of size f(k)×f(k) as a minor. How large is f(k)? [Robertson, Seymour ‘94]: f(k)=O(√(k/log k)) Conjecture [Robertson, Seymour ‘94]: This is tight. But: in planar and excluded-minor graphs, f(k)=Ω(k)! [Robertson, Seymour, Thomas ‘94], [Demaine, Hajiaghayi ’08] However these lower bounds don't hold if G is a planar or an excluded-minor graph. For these special cases, if the treewidth of a graph is k, we can get a grid minor of size Ω(k)×Ω(k), and this bound is tight (just think of a grid itself).

206 A planar graph of treewidth k has a grid minor of size Ω(k)×Ω(k).
Excluded Grid Theorem [Robertson, Seymour ‘86] If the treewidth of G is k, then G contains a grid of size f(k)×f(k) as a minor. A planar graph of treewidth k has a grid minor of size Ω(k)×Ω(k). How large is f(k)? [Robertson, Seymour ‘94]: f(k)=O(√(k/log k)) Conjecture [Robertson, Seymour ‘94]: This is tight. But: in planar and excluded-minor graphs, f(k)=Ω(k)! [Robertson, Seymour, Thomas ‘94], [Demaine, Hajiaghayi ’08] However these lower bounds don't hold if G is a planar or an excluded-minor graph. For these special cases, if the treewidth of a graph is k, we can get a grid minor of size Ω(k)×Ω(k), and this bound is tight (just think of a grid itself).

207 Excluded Grid Theorem [Robertson, Seymour ‘86]
If the treewidth of G is k, then G contains a grid of size f(k)×f(k) as a minor. How large is f(k)? [Robertson, Seymour ‘94]: f(k)=O(√(k/log k)) Conjecture [Robertson, Seymour ‘94]: This is tight. But: in planar and excluded-minor graphs, f(k)=Ω(k)! [Robertson, Seymour, Thomas ‘94], [Demaine, Hajiaghayi ’08] But for general graphs, this is the only negative result that we have, and we don't have anything better. Let's see what was known on the positive side.

208 Excluded Grid Theorem [Robertson, Seymour, Thomas ‘89]:
[Diestel, Gorbunov, Jensen, Thomassen ‘99] – simpler proof [Kawarabayashi, Kobayashi ‘12], [Leaf, Seymour ‘12]: There was a long line of work trying to improve the bounds on f(k). For a while things were stuck on a sub-logarithmic bound. So at this point the best negative result is roughly √k, and the best positive result is roughly √(log k), and the question is: what is the right bound? In particular, can we build a grid minor whose size is polynomial in the treewidth?

209 Open problem: get tight (constructive) bounds on the theorem
Excluded Grid Theorem [Robertson, Seymour, Thomas ‘89]: [Diestel, Gorbunov, Jensen, Thomassen ‘99] – simpler proof [Kawarabayashi, Kobayashi ‘12], [Leaf, Seymour ‘12]: [Chekuri, C ‘13]: [C, ‘16]: Together with Chandra Chekuri, we have answered this question in the affirmative, for a pretty tiny polynomial, building on the work that was done for the graph routing problems. This was recently improved to a somewhat better polynomial, but the latter proof is non-constructive. The first result is constructive: it gives an efficient algorithm to build the grid minor. The second result is not so constructive: it gives an algorithm whose running time is exponential in k and polynomial in n. We can't use this approach to get running time that's poly(n,k). Getting tight bounds on the theorem is an important open problem. My guess would be that we should be improving the positive result (the lower bound). Open problem: get tight (constructive) bounds on the theorem

210 Main Idea
Thm: If G contains a path-of-sets system of width and length Θ(g²), then there is a (g×g)-grid minor in G. One of the main ideas in the proof of the polynomial bound for the theorem is to notice that, if we get a large enough path-of-sets system, then we can build a large grid minor in the graph. This was noted independently by Leaf & Seymour and Chandra and myself. To get a (g×g) grid minor, it is enough to build a path-of-sets system of width g² and length g². [Leaf, Seymour ‘12] [Chekuri, C ’13]

211 Main Idea
Thm: If G contains a path-of-sets system of width and length Θ(g²), then there is a (g×g)-grid minor in G. So now it's enough to show that, if the treewidth of your graph is k, then G has a path-of-sets system of width and length k^ε for some constant ε. Enough to show: Treewidth(G)=k implies G has a path-of-sets system of width and length k^ε

212 Main Idea
Thm: If G contains a path-of-sets system of width and length Θ(g²), then there is a (g×g)-grid minor in G. If max vertex degree bounded The rest of the proof is quite easy: if G has a large treewidth, this means that it has a large well-linked set. Provided that the maximum vertex degree is not too large, it then has a large PoS. G has large treewidth ⇒ large node-well-linked set of vertices ⇒ large Path-of-Sets system

213 Excluded Grid Theorem: Rest of Proof
Large PoS system gives a large grid minor. Can reduce max vertex degree while approximately preserving treewidth If G has a large set of vertices that's node-well-linked, and max vertex degree in G is bounded, then G contains a large PoS system. Here is what the rest of the proof of the Excluded Grid Theorem looks like. First, we want to show that a large path-of-sets system implies the existence of a large grid minor. I'll only show a sketch of the proof. Next, we show that in any graph, we can reduce its maximum vertex degree, without changing the treewidth by too much. Finally, this part is also needed for the disjoint-paths problem: the theorem that states that if a bounded-degree graph has a large node-well-linked set of vertices, then it contains a large Path of Sets system. Also needed for NDP

214 1. Large PoS System Implies a Large Grid Minor
Let’s start with the first part: if we have a large Path-of-Sets system, then we can get a large grid minor. I will only give a sketch of the proof.

215 Building the Grid How does one build a grid?
A grid consists of a number of horizontal paths, the rows of the grid, and a number of vertical paths – its columns. Here I'll relax the notion of a column a bit, and will allow columns that look like this. I claim that if we can show that our graph contains this structure as a subgraph, or a subdivision of this structure, then our graph contains a large grid minor.

216 Building the Grid This is because we are allowed to contract edges.
I can take these green edges and contract them.

217 Building the Grid Then I’ll get the first column.
Then I’ll contract these green edges, and will get the second column.

218 Building the Grid I’ll keep doing this until I get a grid.

219 Building the Grid

220 Building the Grid
So if this structure is a subgraph of my graph, or a subdivision of it is, we'll be done. What does it have to do with the path-of-sets system? In the system we already have this set of horizontal paths, which we can treat as corresponding to the blue horizontal paths. We only need to route the vertical edges. The idea is to route each such edge in turn in a different cluster. So the first edge we'll try to route in the first cluster, the second edge in the second cluster, and so on. This may sound easy because each cluster is connected. So of course inside C1 I can connect P1 to P2. The problem is that the path connecting P1 to P2 must go directly from P1 to P2, and it cannot visit any other path along the way.

221 May require re-routing the horizontal paths
Building the Grid For each Ci, we'll be looking for a direct path connecting some consecutive pair of horizontal paths May require re-routing the horizontal paths So eventually, in each Ci, we'll try to find a direct path connecting some consecutive pair of horizontal paths. This requires some work, but it's not very difficult. This may also require fixing and re-routing the horizontal paths a bit. Because of the well-linkedness of the Ai's and Bi's, there is enough routing capacity in each cluster.

222 2. Can reduce max vertex degree while approximately preserving the treewidth
Let’s now move to the second part: that we can reduce maximum vertex degree in a graph, without changing the treewidth by too much.

223 Degree Reduction Theorem: Any graph G of treewidth ≥ k contains a sub-graph G’ of treewidth ≥ k’ and maximum vertex degree ≤ d. Can use the cut-matching game to get k’=k/polylog k, d=polylog k. [Reed, Wood ‘12]: k’=Ω(k^(1/4)) and d=4 [Chekuri, Ene ‘13]: k’=k/polylog k and d constant [Chekuri, C ‘14]: k’=k/polylog k and d=3 We want a theorem like this: If G has treewidth k, then we can find a subgraph of G whose maximum vertex degree is at most d, and treewidth is at least k’, where we want k’ to be pretty close to k. We can use the cut-matching game to lower the degree to polylogarithmic, and keep the treewidth almost the same.

224 Degree Reduction Theorem: Any graph G of treewidth ≥ k contains a sub-graph G’ of treewidth ≥ k’ and maximum vertex degree ≤ d. Can use cut-matching game to get k’=k/polylog k, d=polylog k. [Reed, Wood ‘12]: and d=4 [Chekuri, Ene ‘13]: and d constant [Chekuri, C ‘14]: and d=3 Reed and Wood used very different techniques to show that we can get degree 4 and get treewidth k^{1/4}. Using the algorithms for node-disjoint paths, we can get constant degree, and treewidth k/polylog k. A result with Chandra Chekuri shows that we can lower the degree all the way to 3, and still get k/polylog k treewidth. The last 2 results rely on a construction of a path-of-sets system, so if you want a clean proof you may want to stick with the first two results. Finally, we need to show the last part.

225 Use a construction of a PoS system
Degree Reduction Theorem: Any graph G of treewidth ≥ k contains a sub-graph G' of treewidth ≥ k' and maximum vertex degree ≤ d. Can use cut-matching game to get k'=k/polylog k, d=polylog k. [Reed, Wood '12]: k'=Ω(k^{1/4}) and d=4. [Chekuri, Ene '13]: k'=k/polylog k and d constant. [Chekuri, C '14]: k'=k/polylog k and d=3. Use a construction of a PoS system. The last two results rely on a construction of a path-of-sets system, so if you want a clean proof you may want to stick with the first two results. Finally, we need to show the last part.

226 3. Building the PoS system
Theorem [C, ’11], [C, Li ’12], [Chekuri, C ’13]: Suppose G has a set of k node-well-linked vertices and max vertex degree polylog(k). Then we can efficiently construct a path-of-sets system in G with parameters L and w, if: This is the theorem we are trying to prove: if G has k vertices that are node-well-linked, and maximum vertex degree at most polylog(k), then we can build a path-of-sets system with these parameters. I’ll show you the main ideas of a non-constructive proof, which is easier. Will show: non-constructive version

227 Main Idea T: set of k node-well-linked vertices, called terminals
PoS System, length 1, width k/2 The high-level idea is very simple. We have a set T of k vertices, that are node-well-linked. Let’s call them terminals. Split them into 2 groups of k/2 vertices each, and draw the graph like that. You can think of it as a path of sets system of length 1 (1 cluster) and width k/2. k/2 terminals

228 Main Idea T: set of k node-well-linked vertices, called terminals
Now we'll do iterations, where in each iteration we'll try to double the length of the PoS system, while making sure that the width only goes down by a constant factor. After the first iteration we get length 2, with the width down by a constant factor; then length 4, and so on, until we reach the length that we want (see the parameter sketch below). In order to be able to carry this out, we need a procedure that takes one cluster and splits it into two.
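As a sanity check on the numbers, here is a small sketch (my own, in Python; c stands for the constant-factor width loss per cluster split, a value the talk quantifies later) of how the parameters evolve:

```python
# Each iteration doubles the Path-of-Sets length and divides the width
# by c**2 (one factor of c lost on each side of every split).
def pos_parameters(k, c, target_length):
    length, width = 1, k / 2          # starting point: one cluster
    while length < target_length:
        length *= 2
        width /= c ** 2
    return length, width

# After log2(L) iterations the width is (k/2) / c**(2*log2(L)),
# i.e. (k/2) / L**(2*log2(c)); with c = 2**9 this is k / (2 * L**18).
```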

229 Finding a Path-of-Sets System
Interface vertices; node-well-linked. Must be boundary vertices, but there can be more boundary vertices. Let's look at some cluster C. Look at the vertices where the edges on the left come in, and the vertices where the edges from the right come in. We call them interface vertices. Recall that they must be node-well-linked. If we look at the whole graph where this path-of-sets system sits, then these interface vertices have to be boundary vertices of the cluster, but there can be more boundary vertices in this cluster. Boundary vertices are vertices that have neighbors outside the cluster.

230 Finding a Path-of-Sets System
w1≤w2 w1 vertices w2 vertices We'll make the interface vertices on the left blue, and the ones on the right green. Let's assume that there are w1 blue vertices and w2 green ones, so there are fewer blue than green vertices. Their numbers should really be the same; we'll see in a bit why we need this asymmetry.

231 Finding a Path-of-Sets System
The interface of Ci is node-well-linked inside Ci w1≤w2 w1/c edges w1/c vertices w2/c vertices What we would like is to split this cluster into two. One of the sides, C1, should contain a constant fraction of the blue vertices; the other side should contain a constant fraction of the green vertices. The number of edges going across should be at least w1/c. On top of this, we want the interface of each cluster to be node-well-linked inside that cluster: for C1 it's the blue and the black vertices, and for C2 it's the green and the black vertices. I claim that if we can do that, then we're done.

232 Starting Point T: set of k node-well-linked vertices, called terminals
PoS System, length 1, width k/2 Going back to the whole algorithm, we start with 1 cluster – the whole graph. This is a path-of-sets system of length 1 and width k/2 This is a starting point. We would now like to execute a step.

233 Step C2 C1 CL … Start: length L width w End: length 2L width w/c’
In a step, we start with a PoS system that has length L and width w. We would like to double the length so that the width only decreases by some constant factor. We'll do this by splitting the clusters one by one. There is just one little subtlety. Look at the edges connecting C_1 to C_2. When we split C_1, we'll lose a constant fraction of them. When we split C_2, we'll also lose a constant fraction of them. If we are not careful, we'll lose all of them. So we need to coordinate between the clusters and do the splitting one by one.

234 Step C2 C1 CL First, we split the first cluster

235 Step C2 CL C’1 C’’1 Here the empty circles are the blue vertices that don’t lie in the first cluster, and the green vertices that don’t lie in the second cluster. We throw them out, and their adjacent edges too.

236 Step C2 CL C’1 C’’1 Now when we try to split C_2, we’ll only take into account the edges that survived.

237 Step C2 CL C’1 C’’1 So now we’ll have fewer blue vertices than green vertices. We split C_2

238 Step CL C’1 C’’1 C’2 C’’2 Discard these vertices, and keep on going

239 Step CL C’1 C’’1 C’2 C’’2 Until we split all clusters.

240 Step CL C’1 C’’1 C’2 C’’2

241 Step C’L C’’L … C’1 C’’1 C’2 C’’2 Iteration:
In the end, the length doubles, and the width goes down by a factor of c^2. For example, consider the edges that connected C_1 to C_2: we kept only a 1/c fraction of them while splitting C_1, and another 1/c fraction while splitting C_2, so overall we have retained a 1/c^2 fraction. To summarize: we start with a Path-of-Sets system of length L and width w, and we double the length; the width goes down by c^2, where c is what we lose in a single splitting. Iteration: Start: Path-of-Sets system of length L, width w. End: Path-of-Sets system of length 2L, width w/c^2, for some constant c.

242 Final Accounting Start: Path-of-Sets system of length 1, width k/2
log_2(L) iterations End: Path-of-Sets system of length L, width If , can find PoS system of width w and length L. For the final accounting: we start with a PoS system of length 1 and width k/2. To reach length L, we need log_2(L) iterations. The width goes down by a factor of c^2 in each iteration, which gives the final width. So if this inequality holds, we will find a PoS system as required. To remind you, for the Excluded Grid Theorem we need L and w to be k^ε for some constant ε; for NDP we need the length to be Ω(log^2 k) and the width k/polylog(k). For Excluded-Grid Theorem: need L,w=k^ε For NDP: need L=Ω(log^2 k), w=k/polylog(k)
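Spelling out the final width (my own derivation from the recursion above; the inequality on the slide did not survive extraction, so take this as a plausible reconstruction):

w_final = (k/2) / (c^2)^{log_2 L} = k / (2 L^{2 log_2 c}),

so it suffices to have k ≥ 2 w L^{2 log_2 c} in order to obtain a PoS system of width w and length L.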

243 Final Accounting Start: Path-of-Sets system of length 1, width k/2
log_2(L) iterations End: Path-of-Sets system of length L, width Currently: c is about 2^9 If , can find PoS system of width w and length L. The critical question is: what is c, i.e., how much do we lose in one splitting procedure? It determines the final parameters that we obtain. Right now, we can get c of approximately 2^9. The approach is non-constructive: because we can only afford to lose constant factors, we need to compute sparsest cuts exactly. For Excluded-Grid Theorem: need L,w=k^ε For NDP: need L=Ω(log^2 k), w=k/polylog(k)

244 This approach is non-constructive!
Final Accounting Start: Path-of-Sets system of length 1, width k/2 log_2(L) iterations End: Path-of-Sets system of length L, width This approach is non-constructive! If , can find PoS system of width w and length L. For Excluded-Grid Theorem: need L,w=k^ε For NDP: need L=Ω(log^2 k), w=k/polylog(k)

245 Finding a Path-of-Sets System
To summarize, right now, all we need to do is to show this splitting procedure, that will split one cluster into two.

246 Finding a Path-of-Sets System
Want: The interface of Ci is node-well-linked inside Ci To remind you, we wanted to make sure that for each one of the two resulting clusters, the interface vertices are node-well-linked inside the cluster. But because we can use boosting theorems, we can relax this requirement.

247 Finding a Path-of-Sets System
Enough: The interface of Ci is α-well-linked inside Ci, for some constant 0<α<1. Terminals: node-well-linked. It is enough to ensure that the interface vertices are α-well-linked for some constant α. From now on we can focus on C and ignore the rest of the graph. The blue and the green vertices of C will be called terminals, and they are node-well-linked. To remind you, we want: C_1 to contain a constant fraction of the blue terminals, C_2 to contain a constant fraction of the green terminals, and many edges connecting the two clusters. A structure like that, which preserves a constant fraction of the green and blue terminals, is what we are looking for; I'll call it a 2-cluster chain.

248 Finding a Path-of-Sets System
Enough: The interface of Ci is α-well-linked inside Ci, for some constant 0<α<1 w green terminals w/c blue terminals The number of green terminals will be denoted by w for now, and let’s say that we have w/c blue terminals. a constant fraction of green and blue terminals is preserved. a 2-cluster chain

249 Finding a 2-Cluster Chain: a weak 2-cluster chain
I will relax the notion of a 2-cluster chain to something even weaker, called a weak 2-cluster chain, and then I’ll show you that we can get away with that too.

250 Weak 2-Cluster Chain: So now I have the cluster C and I’ll only look at the green terminals, I’ll ignore the blue ones. We would like to find two disjoint clusters C_1 and C_2 in C, that don’t contain the terminals.

251 Weak 2-Cluster Chain: We would like the boundary vertices of each cluster to be \alpha-well-linked inside the cluster, for some constant \alpha. This is what we used to call the \alpha-bandwidth property As before, the boundary vertices are all vertices that have neighbors outside of the cluster. The boundary of Ci is α-well-linked inside Ci, for some constant 0<α<1 Can connect each cluster to terminals with many disjoint paths, that do not pass through the clusters.

252 Weak 2-Cluster Chain: Additionally, we would like to make sure that each cluster can connect with lots of paths to the terminals. The number of paths should be \Omega(w), where w is the number of terminals. The paths cannot visit the other cluster along the way. Why is this enough? We also needed to connect the two clusters to each other, and we also needed to connect the blue terminals to one of them. The boundary of Ci is α-well-linked inside Ci, for some constant 0<α<1 Can connect each cluster to the terminals with Ω(w) disjoint paths, that do not pass through the clusters.

253 Weak 2-Cluster Chain: We exploit the fact that the terminals are node-well-linked. For example, we can connect green terminals on the left to green terminals on the right by disjoint paths. The boundary of Ci is α-well-linked inside Ci, for some constant 0<α<1 Can connect each cluster to terminals with Ω(w) disjoint paths, that do not pass through the clusters.

254 Weak 2-Cluster Chain: Then we can take a ride on the paths that connected then to the clusters. This way we can build many node-disjoint paths connecting the two clusters. We need to do this carefully, so that we still preserve many green paths connecting the green terminals to the clusters. This requires some work but is not too hard. What about connecting the blue terminals to the clusters? The boundary of Ci is α-well-linked inside Ci, for some constant 0<α<1 Can connect each cluster to terminals with Ω(w) disjoint paths, that do not pass through the clusters.

255 Weak 2-Cluster Chain: Again, we use well-linkedness of the terminals to connect the blue terminals to some green terminals. Then we can take a ride on the paths connecting those terminals to the clusters. The boundary of Ci is α-well-linked inside Ci, for some constant 0<α<1 Can connect each cluster to terminals with Ω(w) disjoint paths, that do not pass through the clusters.

256 Weak 2-Cluster Chain: Again, we need to be careful that we leave enough green paths untouched, and enough paths connecting the two clusters. This again requires some work. The boundary of Ci is α-well-linked inside Ci, for some constant 0<α<1 Can connect each cluster to terminals with Ω(w) disjoint paths, that do not pass through the clusters.

257 Weak 2-Cluster Chain: But ultimately, if we manage to build a weak 2-cluster chain, we will be done. This is a bit technical but not very difficult. I will now further weaken our requirements a little bit, by posing the following concrete question. The boundary of Ci is α-well-linked inside Ci, for some constant 0<α<1 Can connect each cluster to terminals with Ω(w) disjoint paths, that do not pass through the clusters.

258 Concrete question: G Input: graph G set T of k terminals
T is node-well-linked Here is a concrete question. We are given a graph G a set T of k terminals. the terminals are node-well-linked in G.

259 Concrete question: G Input: graph G set T of k terminals
T is node-well-linked We would like to find 2 disjoint clusters C_1,C_2 that only contain non-terminal vertices, such that the following holds: if we look at the boundary vertices in each cluster – these are the vertices that have neighbors outside the cluster, then we get many such vertices, at least \Omega(k), where k is the number of terminals. The second property is that the boundary vertices are \alpha-well-linked inside each cluster. Goal: find 2 disjoint clusters C1,C2 of non-terminal vertices: Each cluster has Ω(k) boundary vertices Boundary vertices are α-well-linked in each cluster

260 Concrete question: G Input: graph G set T of k terminals
T is node-well-linked Remember that this α-well-linkedness property, where the boundary vertices are α-well-linked inside a cluster, is what we called the α-bandwidth property when we discussed vertex sparsifiers. This is very close to what we wanted in a 2-cluster chain. The difference is that there we also needed to make sure that each cluster can send lots of flow to the terminals; here we replace that with the condition that the boundary is large. This is the minimum that we should ask for, and it is already an interesting and useful question in its own right. Goal: find 2 disjoint clusters C1,C2 of non-terminal vertices: Each cluster has Ω(k) boundary vertices Boundary vertices are α-well-linked in each cluster α-bandwidth property

261 Concrete question: G Input: graph G set T of k terminals
T is node-well-linked Ideally, α is a constant Will show: for α=1/polylog(n) For the 2-cluster chain, we need \alpha to be a constant. I’ll show you an idea of a less technical but constructive proof that will give \alpha=1/\polylog(n). This is already sufficient to get a weaker construction of a path-of-sets system, and some new results for Node Disjoint Paths and Excluded Grid theorem. I will also use this opportunity to introduce another technique that turns out to be very helpful in this kind of problems. Goal: find 2 disjoint clusters C1,C2 of non-terminal vertices: Each cluster has Ω(k) boundary vertices Boundary vertices are α-well-linked in each cluster

262 Concrete question: G Input: graph G set T of k terminals
T is node-well-linked Good clusters Whenever we have a cluster with these two properties, I’ll call it a good cluster. So our goal is to find two disjoint good clusters Goal: find 2 disjoint clusters C1,C2 of non-terminal vertices: Each cluster has Ω(k) boundary vertices Boundary vertices are α-well-linked in each cluster

263 Main Idea: Graph Compression
Compress the graph to get rid of irrelevant information. It is easier to look for the clusters in the compressed graph. How to compress? The main idea is to use a trick that seems to work really well in this type of problem. I first saw this trick in Raecke's paper on oblivious routing. The intuition is that the graph contains lots of information that is not relevant to us, and we'd like to compress the graph so that this information is hidden. The hope is that it will be easier to find what we are looking for in the compressed graph. How exactly would we compress the graph? We'll do something similar to what we did in vertex cut sparsifiers. Like in vertex cut sparsifiers

264 Vertex Cut Sparsifiers
H Boundary vertices are α-well-linked This is a slide from before, about vertex cut sparsifiers To remind you, we took all non-terminal vertices and partitioned them into clusters that have the \alpha-bandwidth property, and then contracted the clusters. Again, \alpha-bandwidth property means that the boundary vertices are \alpha-well-linked in the cluster. Cut V\T into clusters that have α-bandwidth property. Contract the clusters.
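For concreteness, here is a small sketch (my own, assuming the networkx package; not code from the talk) of the contraction operation itself. Finding a clustering whose clusters have the α-bandwidth property is the hard part and is not shown:

```python
import networkx as nx

# Contract each cluster of non-terminal vertices into a single vertex,
# keeping every terminal as its own (uncontracted) block. Assumes the
# clusters, together with the terminal singletons, partition V(G).
def contract_clusters(G, terminals, clusters):
    partition = [{t} for t in terminals] + [set(C) for C in clusters]
    return nx.quotient_graph(G, partition)
```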

265 Graph Compression H
Cut V\T into clusters that: have the α-bandwidth property; have fewer than k/(1000Δ) boundary vertices. Contract the clusters.
Reminder: good clusters have the α-bandwidth property and have Ω(k) boundary vertices. We will do something similar to before, but this time we will only contract clusters that have the α-bandwidth property, as before, and that have fewer than k/(1000Δ) boundary vertices, where Δ is the maximum vertex degree, which is about polylog k. To remind you, the two clusters we are looking for are just the opposite: we want them to have the α-bandwidth property, but also more than k/1000 boundary vertices.

266 the number of edges in the contracted graph decreases
The Plan Contract the graph iteratively. Start: every vertex is a separate cluster. Step: either find 2 good clusters, or contract the graph even more. End: will find 2 good clusters. The number of edges in the contracted graph decreases. We are going to perform a number of iterations, where in each iteration we compress the graph more and more (a schematic driver for this loop is sketched below). At the beginning, the compressed graph is the original graph itself, so every vertex is in a separate cluster. In every iteration, we show that we can either find what we are looking for, the 2 good clusters, or contract the graph even more; by that we mean that the number of edges in the contracted graph goes down. We can't keep contracting forever, because we only contract clusters that contain no terminals and have a small boundary. So eventually we'll have to stop, and at that point we'll have found the 2 good clusters we are looking for.
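In code form, the plan is just the following loop (a sketch under the assumption that a step procedure with the stated guarantee exists; nothing here is from the talk itself):

```python
# A schematic driver (my own sketch; `step` is a stand-in for the
# iteration described next). step() must either return two good
# clusters, or a refined clustering whose contracted graph has
# strictly fewer edges -- which is why the loop must terminate.
def find_two_good_clusters(G, terminals, step):
    clustering = [{v} for v in G.nodes if v not in terminals]
    while True:
        good_clusters, clustering = step(G, terminals, clustering)
        if good_clusters is not None:
            return good_clusters
```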

267 The Plan Contract the graph iteratively
Start: every vertex is a separate cluster Step: either find 2 good clusters, or contract the graph even more End: will find 2 good clusters So as long as I can show you how to execute a step, we will be done. Let’s now focus on the step.

268 Step Execution Input: a contracted graph Output: smaller contracted graph; or 2 good clusters To remind you, we start with the current contracted graph. We want to either contract it further, or find the 2 good clusters.

269 Iteration Description
Current contracted graph So let’s say that H is our current contracted graph. The red vertices are the terminals.

270 Iteration Description
The blue area is all non-terminal vertices. Each of them is a contracted cluster. It is important that: Maximum vertex degree in this graph is at most k/1000. This is because all clusters have small boundary – less than k/(1000Δ) The number of edges inside the blue area is at least k/4, because the terminals are node-well-linked and the degrees are small. Max vertex degree ≤ k/1000 At least k/4 edges

271 Iteration Description
A and B each contain constant fraction of edges of H We take the vertices in the blue area and randomly partition them into two subsets, A and B. We expect each of A and B to contain a constant fraction of the edges of H. (because of the bounded degree and many edges) B A Max vertex degree ≤ k/1000 At least k/4 edges
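A quick empirical illustration of this expectation argument (my own toy check in Python, assuming networkx; it plays no role in the proof):

```python
import random
import networkx as nx

# Under a uniformly random bipartition, each edge has both endpoints
# in A with probability 1/4, and likewise for B; bounded degree is
# what gives concentration around these means.
def edges_inside_each_side(G, seed=0):
    rng = random.Random(seed)
    A = {v for v in G.nodes if rng.random() < 0.5}
    in_A = sum(1 for u, v in G.edges if u in A and v in A)
    in_B = sum(1 for u, v in G.edges if u not in A and v not in A)
    return in_A, in_B

G = nx.random_regular_graph(4, 1000, seed=0)
a, b = edges_inside_each_side(G)
print(a / G.number_of_edges(), b / G.number_of_edges())  # both ≈ 0.25
```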

272 Iteration Description
A and B each contain constant fraction of edges of H Also, the number of edges sticking out of each one of these sets cannot be too large – no more than all edges in the graph B A |out(A)|,|out(B)|≤|E(H)|

273 Iteration Description
A and B each contain constant fraction of edges of H So if we focus, for example, on A, there are many edges sitting inside A relatively to the number of edges sticking out of A. B A |out(A)|,|out(B)|≤|E(H)|

274 Iteration Description
uncontract |E(A)|>|out(A)|/8 We can say, for example, that the number of edges inside A is at least 1/8 of the number of edges sticking out of A. Intuitively, it feels like A contains too many edges, and we should be able to contract it more. But we can't simply contract A on top of the already-contracted clusters; that is forbidden. We need to open up the clusters that sit inside A and find a new, more compact clustering. Let's uncontract all these clusters, so now A is a subgraph of the original graph. I showed the boundary vertices in red. Their number is comparable to the number of edges sticking out of A, because in the original graph the degree is bounded by polylog k. # boundary vertices ≈ |out(A)|

275 Iteration Description
uncontract All clusters have α-bandwidth property |E(A)|>|out(A)|/8 well-linked decomposition Let us now do a well-linked decomposition inside this new cluster A, exactly like we did for vertex sparsifiers. We view these red boundary vertices as terminals. We want to make sure that all new clusters have the α-bandwidth property. If we are OK with α being small enough, say 1/log^2 k, then we can make sure that there is a very small number of edges connecting different clusters: the number of green edges will be much smaller than the number of red edges. At the beginning, we had many more green edges; their number was comparable to the number of red edges. # boundary vertices ≈ |out(A)| # green edges << # red edges

276 Iteration Description
uncontract A smaller contracted graph? |E(A)|>|out(A)|/8 well-linked decomposition now |E(A)|<<|out(A)| So we contract the clusters back, get a new contracted graph, and it has much fewer edges inside than what A used to have. We don't touch B at all, all clusters stay the same here. Does this mean that we have managed to contract the graph much more? contract

277 Iteration Description
uncontract A smaller contracted graph? Yes, unless some cluster has large boundary well-linked decomposition According to our rules, we can contract a cluster only if it has the \alpha-bandwidth property and a small boundary. (We have used the small boundary in our algorithm, it is essential). So if all clusters have small boundaries, this gives a better compression of the whole graph. Otherwise, one of the clusters must have a large boundary, and this means that it is a good cluster. contract But then we’ve found a good cluster!

278 Iteration Description
uncontract If all clusters have small boundary, we managed to contract the graph more well-linked decomposition So if all clusters in the decomposition have a small boundary, we have found an even smaller contracted graph. Otherwise, we have found a good cluster in A. Otherwise, we’ve found a good cluster! contract

279 Iteration Description
Going back to our picture, we took all non-terminal vertices in the current contracted graph, and partitioned them into two sets A and B. B A

280 Iteration Description
Run the procedure on A and B separately. If either returns a more contracted graph, done. Otherwise, each contains a good cluster. We run the procedure on A and B separately. If one of the two runs returns a better contracted graph, then we are done. Otherwise, each of A and B contains a good cluster, and we are done again. B A

281 Concrete question: G Input: graph G set T of k terminals
T is node-well-linked We get: α=1/polylog(n) Need: α constant So to remind you where we stand, I showed you how to execute this procedure. We can obtain an efficient algorithm that does it. But the well-linkedness factor that we will get is only 1/\polylog k. For the path-of-sets system we need it to be a constant. We can do the same with constant \alpha, only the proof is more technical and non-constructive, because we’ll need to compute sparsest cuts exactly. Goal: find 2 disjoint clusters C1,C2 of non-terminal vertices: Each cluster has Ω(k) boundary vertices Boundary vertices are α-well-linked in each cluster

282 Weak 2-Cluster Chain: Excluded Grid Theorem
What we were really after is the 2-cluster chain. It has the same properties as the 2 clusters we have found, only we should also be able to connect each cluster to the terminals with many paths. This requires some minor modification of the procedure that I won’t go into. So at this point we are done with the proof of the Excluded Grid theorem. The boundary of Ci is α-well-linked inside Ci, for some constant 0<α<1 Can connect each cluster to terminals with Ω(w) disjoint paths, that do not pass through the clusters.

283 Concrete question: G Input: graph G set T of k terminals
T is node-well-linked But let us go back to the question here, we don’t need to stop at 2 clusters. The same procedure would work if we wanted many more than 2 clusters. Goal: find 2 disjoint clusters C1,C2 of non-terminal vertices: Each cluster has Ω(k) boundary vertices Boundary vertices are α-well-linked in each cluster

284 Can find many disjoint subgraphs of G with large treewidth!
Concrete question: Input: graph G set T of k terminals T is node-well-linked Can find many disjoint subgraphs of G with large treewidth! We can find L clusters, for any choice of L. We use the same procedure. Instead of partitioning into 2 subsets A and B, we can partition into more subsets, and get the same result for more clusters. We’ll get fewer boundary vertices and worse \alpha in the well-linkedness. What this tells us is that we can find many disjoint subgraphs of G that have large treewidth. Each of the clusters has a large set of well-linked vertices – the boundary vertices. After optimizing the parameters, we can distill this into the following theorem. Goal: find L disjoint clusters C1,…,CL of non-terminal vertices: Each cluster has many boundary vertices Boundary vertices are α-well-linked in each cluster

285 Large-Treewidth Graph Decomposition [C, Chekuri ‘13]
Treewidth k. If we have a graph of treewidth k, then we can partition it into L disjoint clusters, each of which has treewidth at least w, as long as L and w are not too large compared to k. You could get this theorem directly from the excluded grid theorem or the path-of-sets system, but the bounds would be much worse. This theorem is also very useful: in many applications of the excluded grid theorem, we don't need the power of the whole theorem, and it is enough to use this theorem instead and get better bounds. Let's see an example. L clusters, each of treewidth ≥ w.

286 Getting around the Excluded Grid Theorem?
Example: FPT algorithm for Feedback Vertex Set. In many applications where the Excluded Grid Theorem is traditionally used, we can use this decomposition theorem instead and get better bounds. Here is one example: the algorithm that I showed before for Feedback Vertex Set.

287 The Algorithm Feedback vertex set size is k, so G cannot contain a grid minor of size more than 4√k × 4√k. So tw(G)<g(k) for some function g. Use dynamic programming on the tree decomposition to solve the problem in time Here is how the algorithm worked. We know that if the feedback vertex set size is k, then G cannot contain a grid minor of size more than 4√k × 4√k. From the excluded grid theorem, the treewidth of G is bounded by some function g(k). Then we do dynamic programming, and get this running time.
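(A step the slide leaves implicit, for the record: the g×g grid contains ⌊g/2⌋² vertex-disjoint 4-cycles, one inside each disjoint 2×2 block of vertices, so its feedback vertex set must have size at least ⌊g/2⌋² ≈ g²/4. For g = 4√k this is about 4k > k, which is why a feedback vertex set of size k rules out such a grid minor.)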

288 The Algorithm Feedback vertex set size is k, so G cannot contain a grid minor of size more than 4√k × 4√k. So tw(G)<g(k) for some function g. Use dynamic programming on the tree decomposition to solve the problem in time What is g(k)? The question is: what is g(k)? It determines the running time of our algorithm. Using the current bounds on the excluded grid theorem, we can choose g(k) to be roughly k^{9.5}, and we'll get a running time that is exponential in this. Can choose g(k)=O(k^{9.5}); running time exponential in k^{9.5}.

289 The Algorithm Feedback vertex set size is k, so G cannot contain a grid minor of size more than 4√k × 4√k. So tw(G)<g(k) for some function g. Use dynamic programming on the tree decomposition to solve the problem in time What is g(k)? Using the previous theorem, it is enough to choose g(k) to be O(k polylog(k)). This gives a running time that is only exponential in O(k polylog k). Even if we could get better bounds for the excluded grid theorem, it would never give us bounds this good. Can choose g(k)=O(k polylog(k)); running time exponential in O(k polylog k).

290 Large-Treewidth Graph Decomposition
If the treewidth is large enough, which O(k polylog k) already is, we can partition the graph into k+1 clusters of treewidth at least 2 each. Each such cluster contains a cycle, and the clusters are disjoint, so this is enough to certify that the feedback vertex set has size more than k. k+1 clusters of treewidth ≥ 2; FVS value is at least k+1.

291 Conclusion FPT algorithms Well-linked sets and decompositions
Approximation algorithms for Node-Disjoint Paths Treewidth Vertex Sparsifiers Excluded Grid Theorem To conclude: I have talked about some problems traditionally studied in theoretical computer science, such as approximation algorithms for node-disjoint paths, vertex sparsifiers, and FPT algorithms, and about some structural graph-theoretic notions, such as well-linked sets and well-linked decompositions, treewidth, the excluded grid theorem and its uses, and path-of-sets systems. Not surprisingly, there are lots of interesting connections between the two areas, and a lot of room for more interaction between them. Many of us are working on graph-related problems and can benefit from interacting more with the graph theory community. FPT algorithms Path-of-Sets Systems

292 Lots of Interesting Open Problems!
Vertex cut sparsifiers. Approximation algorithms for congestion minimization. Tighter bounds for the Excluded Grid Theorem. Simpler algorithms for NDP with 3 demand pairs? I didn't have time to mention them, but there are lots of interesting open problems. Many of the most basic questions about vertex sparsifiers remain open, such as the tradeoff between a sparsifier's size and its quality. The main remaining open problem in the area of approximation for routing problems is congestion minimization: here we need to route all demand pairs and would like to minimize the congestion. The best approximation algorithm achieves an O(log n/log log n) factor, while the lower bound is Ω(log log n); the problem is open even on planar graphs, and even the integrality gap is open. We would also like tight bounds for the excluded grid theorem, and simpler algorithms for NDP with 3 demand pairs. Many more… Thank you!

