A New Approach to the Maximum-Flow Problem. Andrew V. Goldberg, Robert E. Tarjan. Presented by Andrew Guillory.
Outline: Background; Definitions; Push-Relabel Algorithm; Correctness / Termination Proofs; Sequential Implementation; Dynamic Tree Implementation.
Maximum Flow Problem: A classic problem in operations research. Many problems reduce to max flow: maximum cardinality bipartite matching; maximum number of edge-disjoint paths; minimum cut (Max-Flow Min-Cut Theorem). Machine learning applications: Structured Prediction, Dual Extragradient and Bregman Projections (Taskar, Lacoste-Julien, Jordan, JMLR 2006); Local Search for Balanced Submodular Clusterings (Narasimhan, Bilmes, IJCAI 2007).
Relation to Optimization: Special case of submodular function minimization. Special case of linear programming. Integer edge capacities permit integer maximum flows (constructive proof).
History of Algorithms: Augmenting-path algorithms: Ford-Fulkerson (1962), O(mU); Edmonds-Karp (1969), O(nm^2); …; O(n^3); O(nm log n); O(nm log U). Push-relabel algorithms: Goldberg (1985), O(n^3); Goldberg and Tarjan (1986), O(nm log(n^2/m)); Ahuja and Orlin, O(nm + n^2 log U).
Definitions: Graph G = (V, E) with |V| = n and |E| = m. G is a flow network if it has a source s, a sink t, and a capacity c(v,w) for each edge (v,w) in E; c(v,w) = 0 for (v,w) not in E.
Definitions (continued): A flow f on G is a real-valued function on vertex pairs such that f(v,w) <= c(v,w) for all (v,w); f(v,w) = -f(w,v); and ∑_u f(u,v) = 0 for all v in V - {s,t}. The value of a flow, |f|, is ∑_v f(v,t). A maximum flow is a flow of maximum value.
Definitions (continued again): A preflow f on G is a real-valued function on vertex pairs such that f(v,w) <= c(v,w) for all (v,w); f(v,w) = -f(w,v); and ∑_u f(u,v) >= 0 for all v in V - {s}. The flow excess of v is e(v) = ∑_u f(u,v). Intuition: flow into a vertex can exceed flow out.
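The flow and preflow conditions above can be checked mechanically. The following is a minimal sketch, assuming f and c are stored as dictionaries keyed by vertex pairs with missing pairs treated as 0; the function names and the three-vertex network are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def excess(f, v, vertices):
    """Flow excess e(v): the sum of f(u, v) over all vertices u."""
    return sum(f[(u, v)] for u in vertices)

def is_preflow(f, c, vertices, s):
    """Check the capacity, antisymmetry, and nonnegative-excess constraints."""
    pairs = [(v, w) for v in vertices for w in vertices]
    capacity_ok = all(f[p] <= c[p] for p in pairs)
    antisymmetric = all(f[(v, w)] == -f[(w, v)] for v, w in pairs)
    excess_ok = all(excess(f, v, vertices) >= 0 for v in vertices if v != s)
    return capacity_ok and antisymmetric and excess_ok

def is_flow(f, c, vertices, s, t):
    """A flow is a preflow whose excess is zero everywhere except at s and t."""
    return is_preflow(f, c, vertices, s) and all(
        excess(f, v, vertices) == 0 for v in vertices if v not in (s, t))

# Hypothetical network: s -> a (capacity 3), a -> t (capacity 1).
vertices = ["s", "a", "t"]
c = defaultdict(int, {("s", "a"): 3, ("a", "t"): 1})
# Saturating only (s, a) leaves 3 units of excess at a:
# this is a preflow but not a flow.
f = defaultdict(int, {("s", "a"): 3, ("a", "s"): -3})
```

Here `is_preflow(f, c, vertices, "s")` holds while `is_flow(f, c, vertices, "s", "t")` does not, because vertex a has positive excess.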
Intuition: Starting with a preflow, push excess flow closer towards the sink. If excess flow cannot reach the sink, push it backwards to the source. Eventually the preflow becomes a flow, and in fact a maximum flow.
Residual Graph: The residual capacity r_f(v, w) of a vertex pair is c(v, w) - f(v, w). If v has positive excess and (v,w) has positive residual capacity, we can push δ = min(e(v), r_f(v, w)) units of flow from v to w. Edge (v,w) is saturated if r_f(v, w) = 0. The residual graph is G_f = (V, E_f), where E_f is the set of residual edges (v,w) with r_f(v, w) > 0.
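As a small illustration of these definitions, the sketch below computes residual capacities and the residual edge set E_f. It assumes the same dictionary representation of c and f as a convenience; the helper names and the example network are hypothetical.

```python
from collections import defaultdict

def residual_capacity(c, f, v, w):
    """r_f(v, w) = c(v, w) - f(v, w)."""
    return c[(v, w)] - f[(v, w)]

def residual_edges(c, f, vertices):
    """E_f: all vertex pairs with positive residual capacity."""
    return {(v, w) for v in vertices for w in vertices
            if v != w and residual_capacity(c, f, v, w) > 0}

# Hypothetical network: s -> a (capacity 3), a -> t (capacity 1),
# with the edge (s, a) saturated.
vertices = ["s", "a", "t"]
c = defaultdict(int, {("s", "a"): 3, ("a", "t"): 1})
f = defaultdict(int, {("s", "a"): 3, ("a", "s"): -3})
E_f = residual_edges(c, f, vertices)
```

Because (s, a) is saturated it drops out of E_f, while the reverse pair (a, s) appears with residual capacity 3; this reverse edge is what lets excess return to the source.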
Labeling: A valid labeling is a function d from vertices to nonnegative integers such that d(s) = n, d(t) = 0, and d(v) <= d(w) + 1 for every residual edge (v,w). If d(v) < n, d(v) is a lower bound on the distance from v to the sink in G_f. If d(v) >= n, d(v) - n is a lower bound on the distance from v to the source in G_f.
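The three validity conditions translate directly into a check. This is a minimal sketch; the function name, the residual edge set, and the two labelings are hypothetical examples.

```python
def is_valid_labeling(d, residual_edges, s, t, n):
    """Check d(s) = n, d(t) = 0, and d(v) <= d(w) + 1 on every residual edge."""
    labels_ok = all(isinstance(x, int) and x >= 0 for x in d.values())
    return (labels_ok and d[s] == n and d[t] == 0
            and all(d[v] <= d[w] + 1 for v, w in residual_edges))

# Hypothetical residual graph on vertices s, a, t (so n = 3).
edges = {("a", "s"), ("a", "t")}
good = {"s": 3, "a": 1, "t": 0}
bad = {"s": 3, "a": 2, "t": 0}   # violates d(a) <= d(t) + 1
```

The labeling `good` is valid, whereas `bad` claims a is at distance at least 2 from t even though (a, t) is a residual edge.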
Push Operation: Push(v,w). Precondition: v is active (e(v) > 0), r_f(v, w) > 0, and d(v) = d(w) + 1. Action: push δ = min(e(v), r_f(v, w)) from v to w: f(v,w) = f(v,w) + δ; f(w,v) = f(w,v) - δ; e(v) = e(v) - δ; e(w) = e(w) + δ.
Relabel Operation: Relabel(v). Precondition: v is active (e(v) > 0), and r_f(v, w) > 0 implies d(v) <= d(w). Action: d(v) = min{d(w) + 1 | (v,w) in E_f}.
Generic Push-Relabel Algorithm: Starting from an initial preflow and labeling: while there is an active vertex, choose an active vertex v and apply Push(v,w) for some w, or Relabel(v).
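The generic loop can be sketched in a few lines. This is an illustrative sketch rather than a tuned implementation: it assumes the usual initialization that saturates every edge out of s and sets d(s) = n and d(v) = 0 elsewhere (the slide does not spell this out), it treats only vertices other than s and t as active, and it scans all vertices to find an applicable push. The function name and the example network are hypothetical.

```python
from collections import defaultdict

def push_relabel(vertices, capacity, s, t):
    """Generic push-relabel; returns (value of the maximum flow, flow dict)."""
    n = len(vertices)
    c = defaultdict(int, capacity)
    f = defaultdict(int)
    e = dict.fromkeys(vertices, 0)
    d = dict.fromkeys(vertices, 0)
    d[s] = n
    # Assumed initial preflow: saturate every edge leaving the source.
    for w in vertices:
        if c[(s, w)] > 0:
            f[(s, w)] = c[(s, w)]
            f[(w, s)] = -c[(s, w)]
            e[w] += c[(s, w)]
            e[s] -= c[(s, w)]

    while True:
        active = [v for v in vertices if v not in (s, t) and e[v] > 0]
        if not active:
            return e[t], f
        v = active[0]                      # any active vertex will do
        for w in vertices:
            r = c[(v, w)] - f[(v, w)]      # residual capacity r_f(v, w)
            if r > 0 and d[v] == d[w] + 1:
                delta = min(e[v], r)       # Push(v, w)
                f[(v, w)] += delta
                f[(w, v)] -= delta
                e[v] -= delta
                e[w] += delta
                break
        else:
            # No push applies, so Relabel(v) does.
            d[v] = min(d[w] + 1 for w in vertices
                       if c[(v, w)] - f[(v, w)] > 0)

# Hypothetical chain s -> a -> b -> t with capacities 3, 1, 2.
chain = {("s", "a"): 3, ("a", "b"): 1, ("b", "t"): 2}
value, flow = push_relabel(["s", "a", "b", "t"], chain, "s", "t")
```

On the chain, one unit reaches t through the capacity-1 edge and the remaining two units at a are eventually pushed back to s once d(a) exceeds n, so `value` is 1.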
Example (figure sequence; edges are labeled flow/capacity, S is the source, T is the sink): 1) Flow network with edges 0/3, 0/1, 0/2. 2) Initial preflow / labeling. 3) Select an active vertex. 4) Relabel active vertex. 5) Select an active vertex. 6) Push excess from active vertex (the capacity-1 edge becomes 1/1). 7) Select an active vertex. 8) Relabel active vertex. 9) Select an active vertex. 10) Push excess from active vertex (the capacity-2 edge becomes 1/2). 11) Select an active vertex. 12) Relabel active vertex. 13) Select an active vertex. 14) Push excess from vertex. 15) Maximum flow, with 1/1 and 1/2 on the capacity-1 and capacity-2 edges.
Correctness: Lemma 2.1: If f is a preflow, d is a valid labeling, and v is active, then either a push or a relabel operation is applicable to v. Lemma 3.1: The algorithm maintains a valid labeling d. Theorem 3.2: A flow is maximum iff there is no path from s to t in G_f (Ford and Fulkerson [7]).
Correctness (continued): Lemma 3.3: If f is a preflow and d is a valid labeling for f, there is no path from s to t in G_f. Proof by contradiction: a simple path s = v_0, v_1, …, v_l = t in G_f has l <= n - 1 edges, so d(s) <= d(v_1) + 1 <= d(v_2) + 2 <= … <= d(t) + l = l < n, which contradicts d(s) = n.
Correctness (continued): Theorem 3.4: If the algorithm terminates with a valid labeling, the preflow is a maximum flow. Proof: If the algorithm terminates, all vertices in V - {s,t} have zero excess, so the preflow is a flow. By Lemma 3.3, the sink is not reachable from the source in G_f. By Theorem 3.2, the flow is maximum.
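Termination: Lemma 3.5: If f is a preflow and v is an active vertex, then the source is reachable from v in G_f. Proof: Let S be the set of vertices reachable from v in G_f, and suppose s is not in S. For every u, w with w in S and u not in S, the pair (w,u) is not a residual edge, so f(u,w) <= 0. Then ∑_{w in S} e(w) = ∑_{u in V, w in S} f(u,w) = ∑_{u not in S, w in S} f(u,w) + ∑_{u in S, w in S} f(u,w) = ∑_{u not in S, w in S} f(u,w) <= 0, where the sum over pairs inside S vanishes by antisymmetry. Since s is not in S, every e(w) with w in S is nonnegative, so e(w) = 0 for all w in S, contradicting e(v) > 0. Lemma 3.6: A vertex's label never decreases.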
Termination (continued): Lemma 3.7: At any time the label of any vertex is at most 2n - 1. Only the labels of active vertices are changed, and an active vertex v can reach s in G_f (Lemma 3.5). A simple path v = v_0, v_1, …, v_l = s has l <= n - 1 edges, so d(v) <= d(v_1) + 1 <= d(v_2) + 2 <= … <= d(s) + l <= n + n - 1 = 2n - 1.
Termination (continued): Lemma 3.8: There are at most 2n^2 relabeling operations. Only the labels of vertices in V - {s,t} may be changed. Each of these n - 2 labels can only increase and never exceeds 2n - 1. Hence there are at most (2n - 1)(n - 2) < 2n^2 relabelings.
Termination (continued): Lemma 3.9: The number of saturating pushes is at most 2nm. For any pair (v,w), d(w) must increase by at least 2 between saturating pushes from v to w; similarly, d(v) must increase by at least 2 between saturating pushes from w to v. d(v) + d(w) >= 1 on the first saturating push and d(v) + d(w) <= 4n - 3 on the last, so there are at most 2n - 1 saturating pushes per edge.
Termination (continued): Lemma 3.10: The number of nonsaturating pushes is at most 4n^2 m. Let Φ = ∑ d(v), summed over active vertices v. Each nonsaturating push decreases Φ by at least 1. Each saturating push increases Φ by at most 2n - 1, for a total increase of at most (2n - 1) · 2nm. The total increase in Φ from relabeling is at most (2n - 1)(n - 2). Φ is 0 initially and 0 at termination.
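The arithmetic behind the 4n^2 m figure can be spelled out as follows. Since Φ starts and ends at 0 and every nonsaturating push lowers it by at least 1, the number of nonsaturating pushes is bounded by the total increase in Φ. The last step below assumes m >= n - 1 (for example, a connected network), which the slide does not state explicitly.

```latex
\begin{align*}
\#\{\text{nonsaturating pushes}\}
  &\le (2n-1)\cdot 2nm + (2n-1)(n-2) \\
  &= 4n^2 m - 2nm + 2n^2 - 5n + 2 \\
  &\le 4n^2 m
  \qquad \text{since } 2nm \ge 2n(n-1) \ge 2n^2 - 5n + 2 .
\end{align*}
```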
Termination: Theorem 3.11: The algorithm terminates after O(n^2 m) push and relabel operations. Total operations = # nonsaturating pushes + # saturating pushes + # relabeling operations <= 4n^2 m + 2nm + 2n^2 = O(n^2 m).
Implementation: At each step, select an active vertex and apply either Push or Relabel. Problem: determining which operation to perform and, in the case of Push, finding a residual edge. Solution: for each vertex, maintain a list of the edges that touch that vertex, together with a current edge.
Push/Relabel Operation: Push/Relabel(v). Precondition: v is active. Action: if Push(v,w) is applicable to the current edge (v,w), then Push(v,w); else if (v,w) is not the last edge in v's list, advance the current edge; else reset the current edge to the first edge in the list and Relabel(v).
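The current-edge bookkeeping can be sketched as below. The class and method names are hypothetical, the edge list of v holds every vertex joined to v by an edge in either direction, and the initial preflow that saturates the source edges is an assumption carried over from the standard algorithm.

```python
from collections import defaultdict

class Network:
    def __init__(self, vertices, capacity, s, t):
        self.vertices, self.s, self.t = vertices, s, t
        self.c = defaultdict(int, capacity)
        self.f = defaultdict(int)
        self.e = dict.fromkeys(vertices, 0)
        self.d = dict.fromkeys(vertices, 0)
        self.d[s] = len(vertices)
        # Edge list of v: all w joined to v by an edge in either direction.
        self.edges = {v: [w for w in vertices
                          if self.c[(v, w)] > 0 or self.c[(w, v)] > 0]
                      for v in vertices}
        self.current = dict.fromkeys(vertices, 0)  # index of the current edge
        for w in self.edges[s]:                    # assumed initial preflow
            self._push(s, w, self.c[(s, w)])

    def _push(self, v, w, delta):
        self.f[(v, w)] += delta
        self.f[(w, v)] -= delta
        self.e[v] -= delta
        self.e[w] += delta

    def push_relabel(self, v):
        """One Push/Relabel(v) step; returns which case was taken."""
        w = self.edges[v][self.current[v]]
        r = self.c[(v, w)] - self.f[(v, w)]
        if r > 0 and self.d[v] == self.d[w] + 1:
            self._push(v, w, min(self.e[v], r))
            return "push"
        if self.current[v] + 1 < len(self.edges[v]):
            self.current[v] += 1                   # advance the current edge
            return "advance"
        self.current[v] = 0                        # reset, then relabel
        self.d[v] = min(self.d[u] + 1 for u in self.edges[v]
                        if self.c[(v, u)] - self.f[(v, u)] > 0)
        return "relabel"

    def run(self):
        """Apply Push/Relabel to active vertices until none remain."""
        while True:
            active = [v for v in self.vertices
                      if v not in (self.s, self.t) and self.e[v] > 0]
            if not active:
                return self.e[self.t]
            self.push_relabel(active[0])

# Hypothetical chain s -> a -> b -> t with capacities 3, 1, 2.
net = Network(["s", "a", "b", "t"],
              {("s", "a"): 3, ("a", "b"): 1, ("b", "t"): 2}, "s", "t")
max_flow_value = net.run()
```

Each step does O(1) work apart from the relabel, which scans the edge list once; that is the accounting behind charging O(1) per nonsaturating push.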
Push/Relabel Operation (continued): Lemma 4.1: The push/relabel operation does a relabeling only when relabeling is applicable. Theorem 4.2: The push/relabel implementation runs in O(nm) time plus O(1) time per nonsaturating push operation.
O(n^3) bound: Vertices can be selected in arbitrary order. Certain vertex selection strategies give O(n^3) bounds: maximum distance method (proved here); first-in, first-out method (proved in paper); wave method.
Maximum distance method: At each step, select the active vertex with maximum distance label d(v).
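The selection rule itself is a one-liner. This sketch uses a plain linear scan over dictionaries of excesses and labels, which is an assumption made for brevity; the function name and sample values are hypothetical, and practical implementations typically keep active vertices in buckets indexed by label instead.

```python
def select_max_distance(e, d, s, t):
    """Return the active vertex with the largest label d(v), or None."""
    active = [v for v in e if v not in (s, t) and e[v] > 0]
    return max(active, key=lambda v: d[v]) if active else None

# Hypothetical state: a and b are active, and b has the larger label.
e = {"s": -5, "a": 2, "b": 1, "t": 2}
d = {"s": 4, "a": 1, "b": 3, "t": 0}
chosen = select_max_distance(e, d, "s", "t")
```

With these values `chosen` is b, since d(b) = 3 exceeds d(a) = 1.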
Maximum distance method: Theorem: The maximum distance method performs at most 4n^3 nonsaturating pushes. Consider D = max d(x), taken over active vertices x. D increases only because of relabeling, so D increases at most 2n^2 times. D starts at 0 and ends nonnegative, so D changes at most 4n^2 times. There is at most one nonsaturating push per vertex per value of D.
Maximum distance method: Theorem: The maximum distance method runs in O(n^3) time using the push/relabel implementation. This follows from the previous theorem and Theorem 4.2.
First-In First-Out Method: Discharge(). Precondition: the queue is not empty. Action: apply Push/Relabel to the vertex v at the front of the queue until e(v) = 0 or d(v) increases. If a vertex w becomes active during a Push/Relabel, add w to the back of the queue. If v is still active afterwards, add v to the back of the queue.
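A compact sketch of the first-in, first-out method follows, combining the current-edge Push/Relabel step with a queue of active vertices. The function name is hypothetical, and the initial preflow (saturate the edges out of s, d(s) = n) is assumed rather than taken from the slide.

```python
from collections import defaultdict, deque

def fifo_push_relabel(vertices, capacity, s, t):
    """First-in, first-out push-relabel; returns the maximum flow value."""
    c = defaultdict(int, capacity)
    f = defaultdict(int)
    e = dict.fromkeys(vertices, 0)
    d = dict.fromkeys(vertices, 0)
    d[s] = len(vertices)
    edges = {v: [w for w in vertices if c[(v, w)] > 0 or c[(w, v)] > 0]
             for v in vertices}
    current = dict.fromkeys(vertices, 0)
    queue = deque()

    def push(v, w, delta):
        f[(v, w)] += delta
        f[(w, v)] -= delta
        e[v] -= delta
        if delta > 0 and e[w] == 0 and w not in (s, t):
            queue.append(w)                # w has just become active
        e[w] += delta

    for w in edges[s]:                     # assumed initial preflow
        push(s, w, c[(s, w)])

    while queue:
        v = queue.popleft()                # Discharge the front vertex
        old_label = d[v]
        while e[v] > 0 and d[v] == old_label:
            w = edges[v][current[v]]
            r = c[(v, w)] - f[(v, w)]
            if r > 0 and d[v] == d[w] + 1:
                push(v, w, min(e[v], r))
            elif current[v] + 1 < len(edges[v]):
                current[v] += 1            # advance the current edge
            else:
                current[v] = 0             # reset the current edge, relabel
                d[v] = min(d[u] + 1 for u in edges[v]
                           if c[(v, u)] - f[(v, u)] > 0)
        if e[v] > 0:
            queue.append(v)                # relabeled but still active
    return e[t]

# Hypothetical network with maximum flow 5.
example = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1,
           ("a", "t"): 2, ("b", "t"): 3}
fifo_value = fifo_push_relabel(["s", "a", "b", "t"], example, "s", "t")
```

A vertex enters the queue only when its excess rises from zero or when it is relabeled while still active, which is the property the pass-counting argument relies on.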
First-In First-Out Method: Lemma 4.3: The number of passes over the queue is at most 4n^2. The proof is very similar to the proof of the O(n^3) bound for the maximum distance method. Corollary 4.4: The number of nonsaturating pushes is at most 4n^3 (one per vertex per pass).
First-In First-Out Method: Theorem 4.5: The first-in, first-out method runs in O(n^3) time. This follows from Corollary 4.4 and Theorem 4.2.
Intuition (dynamic trees): Maintain trees such that each link from a child to its parent corresponds to an edge in the residual graph on which a push is permitted. Send flow up the branches of the trees. The queue contains trees with active roots.
Send Operation: Send(v). Precondition: v is active. Action: while v is not the root of its tree and e(v) > 0, send flow up the tree from v and cut the tree along the bottleneck edge(s).
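The effect of Send can be illustrated with a deliberately naive forest of parent pointers. This sketch walks the whole path on every iteration, so it does not achieve the O(log k) cost of real dynamic trees; it only shows what the operation does to excesses, residual capacities, and tree links. The function name, the data layout, and the three-vertex tree are hypothetical, with numbers chosen to reproduce an excess sequence 3, 2, 1.

```python
def send(v, parent, r, e):
    """Send excess from v toward the root of its tree, cutting saturated edges.

    parent[u] is u's parent (None for a root); r[(u, w)] is the residual
    capacity of the pair (u, w); e[u] is the excess of u.
    """
    while parent[v] is not None and e[v] > 0:
        path = [v]
        while parent[path[-1]] is not None:
            path.append(parent[path[-1]])
        tree_edges = list(zip(path, path[1:]))
        delta = min(e[v], min(r[edge] for edge in tree_edges))
        for u, w in tree_edges:            # send delta up the tree path
            r[(u, w)] -= delta
            r[(w, u)] = r.get((w, u), 0) + delta
        e[v] -= delta
        e[path[-1]] += delta               # the root receives the flow
        for u, w in tree_edges:
            if r[(u, w)] == 0:
                parent[u] = None           # cut along the bottleneck edge(s)

# Hypothetical tree a -> b -> t with residual capacities 2 and 1;
# vertex a starts with excess 3.
parent = {"a": "b", "b": "t", "t": None}
r = {("a", "b"): 2, ("b", "t"): 1}
e = {"a": 3, "b": 0, "t": 0}
send("a", parent, r, e)
```

The first iteration moves one unit to t and cuts (b, t); the second moves one unit to b and cuts (a, b), leaving a as a root with one unit of excess.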
Example (figure sequence; edges are labeled flow/capacity, S is the source, T is the sink): 1) Preflow / labeling, with a capacity-3 edge and edges 0/2 and 0/1. 2) Residual graph, excess = 3. 3) Dynamic tree over the residual graph, excess = 3. 4) Select active vertex, excess = 3. 5) Send flow up tree, excess = 2. 6) Cut along bottleneck edges, excess = 2. 7) Send flow up tree, excess = 1. 8) Cut along bottleneck edges, excess = 1. 9) New preflow / labeling, with the edges now 2/2 and 1/1.
Tree-Push/Relabel Operation: Tree-Push/Relabel(v). Precondition: v is the root of a tree and is active. Action: 1) If Push is applicable to the current edge (v,w): 1a) if the trees of v and w can be combined without the resulting tree exceeding size k, make w the parent of v and Send(v); 1b) else Push(v,w) and Send(w). 2) Else: 2a) if (v,w) is not the last edge, advance the current edge; 2b) else cut v's children out of the tree, relabel v, and reset the current edge.
Dynamic Tree Implementation: Lemma 5.1: The dynamic tree algorithm runs in O(nm log k) time plus O(log k) time per addition of a vertex to the queue. Trees are kept to size at most k by 1a), so each tree operation takes O(log k) time. Each Tree-Push/Relabel operation takes O(1) tree operations plus O(1) tree operations per cut. Relabeling takes O(nm) time in total. There are O(nm) cuts. Tree-Push/Relabel is performed O(nm) times plus once per addition to the queue.
Dynamic Tree Implementation: Lemma 5.2: The number of times a vertex is added to the queue is O(nm + n^3/k). A vertex v is added only after d(v) changes or e(v) increases from zero. Labels change O(n^2) times in total. e(v) increases only in 1a) or 1b). The number of vertices added to the queue in 1a) or 1b) is at most the number of cuts performed (2nm) plus one per occurrence of each subcase.
Dynamic Tree Implementation: Lemma 5.2 (continued): Subcase 1a) occurs at most 2nm times (the number of links). Subcase 1b) occurs at most 2nm times when it causes a cut and at most 2nm times when the push from v to w is saturating.
Dynamic Tree Implementation: Lemma 5.2 (continued): An occurrence of subcase 1b) is nonsaturating if it causes neither a cut nor a saturating push from v to w. In a nonsaturating occurrence of 1b), the tree of either v or w is large (size greater than k/2). There are at most 2n/k large trees in the queue at the beginning of a pass. If the large tree has changed since the beginning of this pass, charge the operation to the cut or link that changed it (at most one per link and two per cut, 6nm in total). Else charge the operation to that tree (at most 2n/k per pass over 2n^2 passes, 4n^3/k in total).
Dynamic Tree Implementation: Theorem 5.3: The dynamic tree algorithm runs in O(nm log(n^2/m)) time if k is chosen to be n^2/m.
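The choice of k can be seen by combining the two lemmas. The derivation below is a sketch that writes T(k) for the total running time (a symbol introduced here for convenience) and assumes k >= 2 so that log k is positive.

```latex
\begin{align*}
T(k) &= O(nm \log k) + O(\log k)\cdot O\!\left(nm + \frac{n^3}{k}\right), \\
k = \frac{n^2}{m} &\;\Longrightarrow\; \frac{n^3}{k} = nm, \\
T\!\left(\frac{n^2}{m}\right) &= O\!\left(nm \log \frac{n^2}{m}\right).
\end{align*}
```

Choosing k = n^2/m balances the n^3/k term from Lemma 5.2 against the nm term, so neither dominates.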
Closing Comments: Parallel version: discharge all active vertices in parallel (O(n^2 log n)). Maximum distance method: related work shows an O(n^2 m^(1/2)) bound. Implementation tricks: global relabeling, gap relabeling. Is the maximum distance method better than the tree version in practice?