Toward Constant Time Enumeration by Amortized Analysis
27/Aug/2015, Lorentz Workshop for Enumeration Algorithms Using Structures
Takeaki Uno (National Institute of Informatics, Japan)
Approach from Computation
I would like to start from PHILOSOPHY.
Personally, I think algorithm research is divided into three approaches:
+ Mathematical approaches: BP, approximation, on-line, embedding, interior point method, sampling, …
+ Structural approaches: graph classes, minors, perfect graph conjecture, Voronoi diagram, …
+ Computation approaches: DP, branch & bound, k-best, plane sweep, prefix array, …
There are many studies in the former and fewer in the latter (I don't know whether that is good or not).
Computation Approach for Enumeration
In enumeration algorithms, much research seems to follow computation approaches and structural approaches.
There are still many unstudied problems, so there should be a big frontier.
Today, I would like to talk about amortized analysis of enumeration algorithms, which belongs to the computation approach.
So the talk is a combination of complexity analysis, algorithm design, and structures.
Complexity Analysis
Complexity analysis is a base notion of algorithm design.
With complexity analysis we can design efficient algorithms:
+ k-best selection: reduce problems to 12/13 iteratively with tricky pivots
+ merge sort: balancing the problem partition reduces the time complexity
We cannot understand why the time complexity is small without understanding the complexity analysis.
Targets in This Talk
Output sensitive algorithms, especially (average) polynomial delay time algorithms.
Simple problems (paths, matchings, trees) for which the polynomiality is clear.
Reduce the "(average) delay", ex., $O(n^2) \to O(n \log n)$, $O(n) \to O(1)$ ($O^*(1.5^n) \to O^*(1.5^n)$).
Essence: difficulty, approach, example.
2 Motivation to Amortization
There is an Algorithm
Suppose that there is an enumeration algorithm Z.
We want to know the time complexity of Z (output polynomiality).
What is needed? What will we obtain as a result?
We assume that Z is a tree-shaped recursion algorithm (the structure of the recursion is a tree) and that the problem is combinatorial (it has at most $2^n$ solutions).
Iteration = O(X)
We now know that each iteration takes O(X) time. Can we do something?
No. There is the possibility of "exponentially many iterations, with few solutions".
ex) feasible solutions for SAT, with a branch-and-bound algorithm
Solution for Each
We now know that each iteration outputs a solution. Can we do something?
Yes! #solutions = #iterations, so "O(X) time for each solution".
Solutions at Leaves
We now know that a solution is output at each leaf. Can we do something?
ex) s-t paths, spanning trees, …
No. There is the possibility of "exponentially many inner iterations, with few leaves".
Bounded Depth
We now know that the height of the recursion tree is at most H (counted as the number of iterations on a root-to-leaf path). Can we do something?
YES! [#iterations] ≤ [#solutions] × [height], so "O(X・H) time for each solution".
At Least Two Children
Instead of the height, we now know that each non-leaf iteration has at least two children. Can we do something?
YES! [#iterations] < 2 × [#solutions], so "O(X) time for each solution".
Good Three Cases
These three cases are the typical ones in which we can bound the time complexity efficiently.
In each case, the bound on the time per solution depends on the maximum computation time of an iteration.
If we want to do better, we have to use amortized analysis (the average computation time of an iteration).
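The counting bounds of the good cases can be checked on small recursion trees. The following is only a sketch: the nested-list encoding of trees and all function names are assumptions made for illustration, and the height is counted as the number of iterations on a root-to-leaf path.

```python
# A recursion tree is a nested list: a node is the list of its children, [] is a leaf.

def count_nodes(t):
    """Number of iterations in the recursion tree."""
    return 1 + sum(count_nodes(c) for c in t)

def count_leaves(t):
    """Number of leaves (solutions, when solutions are output at leaves)."""
    return 1 if not t else sum(count_leaves(c) for c in t)

def height(t):
    """Number of iterations on the longest root-to-leaf path."""
    return 1 + max((height(c) for c in t), default=0)

def is_branching(t):
    """True if every non-leaf iteration has at least two children."""
    return not t or (len(t) >= 2 and all(is_branching(c) for c in t))

path = [[[[]]]]                        # a chain: 4 iterations, 1 leaf
caterpillar = [[], [[], [[], []]]]     # biased tree: 7 iterations, 4 leaves

# Bounded depth: #iterations <= #leaves * height
assert count_nodes(path) <= count_leaves(path) * height(path)
# At least two children: #iterations < 2 * #leaves
assert is_branching(caterpillar)
assert count_nodes(caterpillar) < 2 * count_leaves(caterpillar)
```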
3 Basic Analysis
Bottom-wideness
An iteration of an enumeration algorithm generates recursive calls for solving "subproblems".
Subproblems are usually smaller than the original problem.
Many bottom level iterations take a short time, and few iterations take a long time (we call this bottom-wideness).
Can we do something better?
Bad Case
An iteration of an enumeration algorithm generates recursive calls for solving "subproblems".
Subproblems are usually smaller than the original problem.
Many bottom level iterations take a short time, and few iterations take a long time.
Can we do something better?
No, this is not sufficient: in the case shown in the figure (iteration times n, n-1, n-2, …, 2), an iteration takes O(n) time on average.
Balanced
The recursion tree was biased. What if it is balanced?
No, this is not sufficient: in the case shown in the figure, an iteration takes O(n) time on average.
What is sufficient to reduce the amortized time complexity?
Sudden Decrease
In these cases, a sudden decrease occurs: the computation times of a parent and its child differ greatly.
We shall clarify a good characterization of "no sudden decrease".
Then, what is good?
Toy Case
If every iteration has two children and the computation time decreases constantly, the amortized computation time will be reduced.
Ex) $n + 2(n-1) + 4(n-2) + \dots + 2^{n-1} \cdot 1 < 2 \cdot 2^n$, so [computation time] / [#iterations] = O(1).
This holds for any polynomial: $\{ \sum_i 2^{n-i}\,\mathrm{poly}(i) \} / (2 \cdot 2^n) = O(1)$
Analysis
This holds for any polynomial of the form $\mathrm{poly}(i) = i^2, i^3, \dots$: $\{ \sum_i 2^{n-i}\,\mathrm{poly}(i) \} / (2 \cdot 2^n) = O(1)$
Compare the computation time on adjacent levels:
$2^{n-(i+1)}\,\mathrm{poly}(i+1) \,/\, 2^{n-i}\,\mathrm{poly}(i) = \mathrm{poly}(i+1) \,/\, 2\,\mathrm{poly}(i)$
There is a constant α < 1 s.t. $\mathrm{poly}(i+1) \,/\, 2\,\mathrm{poly}(i) < \alpha$ for every sufficiently large i.
If this holds on every level and a bottom level iteration takes 1 unit time, total computation time < $2^n \cdot 1/(1-\alpha)$.
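The bound on the total time is a geometric series. A short derivation, under the assumptions that levels are indexed from the bottom starting at 0, that $W_i$ (a name introduced here) denotes the total time on level i, and that the ratio bound holds on every level:

```latex
\begin{align*}
W_i &= 2^{n-i}\,\mathrm{poly}(i), \qquad W_0 = 2^n, \qquad W_{i+1} \le \alpha\, W_i \quad (\alpha < 1) \\
\sum_{i=0}^{n} W_i &\le W_0 \sum_{i=0}^{n} \alpha^{i}
  < W_0 \sum_{i=0}^{\infty} \alpha^{i}
  = \frac{2^n}{1-\alpha}
\end{align*}
```

Since the tree has more than $2^n$ iterations, dividing by the number of iterations gives less than $1/(1-\alpha) = O(1)$ per iteration.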
Generalization of the Toy Case
Assume that poly(i) is an arbitrary polynomial.
There are constants δ and α < 1 s.t. $\mathrm{poly}(i+1) \,/\, 2\,\mathrm{poly}(i) < \alpha$ for any i ≥ δ.
When i < δ, poly(i) is bounded by a constant, thus any iteration on a level below δ takes constant time.
For i ≥ δ, $\{ \sum_{i \ge \delta} 2^{n-i}\,\mathrm{poly}(i) \} / (2 \cdot 2^{n-\delta}) = O(1)$
Therefore, the amortized computation time for one iteration is O(1).
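The claim can be checked numerically. The sketch below assumes a complete binary recursion tree whose leaves are on level 1 and whose root is on level n; the function name and this indexing are choices made for illustration only.

```python
def amortized_time(n, poly):
    """Average time per iteration in a complete binary recursion tree.

    Level i (i = n at the root, i = 1 at the leaves) has 2**(n - i)
    iterations, each costing poly(i) time units.
    """
    total_time = sum(2 ** (n - i) * poly(i) for i in range(1, n + 1))
    iterations = 2 ** n - 1
    return total_time / iterations

# Linear cost per iteration (the toy case): the average stays below 2.
print(amortized_time(10, lambda i: i))
# Cubic cost per iteration: the average stays bounded as n grows.
print(amortized_time(30, lambda i: i ** 3))
```

The average does not grow with n because the levels near the leaves dominate the sum.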
More Than Two Children
Consider the case in which an iteration may generate more than two recursive calls (so iterations have three or more children).
Let N(i) be the number of iterations on level i; the computation time on level i is bounded by N(i) poly(i), and the total by $\sum_i N(i)\,\mathrm{poly}(i)$.
Compare adjacent levels: $N(i+1)\,\mathrm{poly}(i+1) \,/\, N(i)\,\mathrm{poly}(i) \le \mathrm{poly}(i+1) \,/\, 2\,\mathrm{poly}(i)$, since N(i) ≥ 2 N(i+1).
Thus, in the same way, we can show that the amortized computation time for one iteration is O(1).
Application
You may think this is too trivial to apply to enumeration algorithms.
However, surprisingly, there are applications.
Consider the enumeration of elimination orderings.
Elimination ordering: for a given structure (graph, set, etc.), a way of removing its elements one by one until the structure is empty, while keeping a given property.
Elimination Ordering for Connectivity
Ex) For a given connected graph G=(V,E), remove vertices one by one while keeping the connectivity.
We can enumerate these elimination orderings by simple backtracking.
Each iteration takes $O(|V|^2)$ time.
    Iter (G, S)
      if G is empty then output S
      for each vertex v in G,
        if G-v is connected then call Iter (G-v, S+v)
…unpublished
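The backtracking above can be sketched in Python as follows. This is a minimal illustration, not the talk's implementation: the adjacency-dictionary input, the function names, and the convention that the empty graph counts as connected are assumptions, and the connectivity test is a plain depth-first search with no attempt at the stated time bound.

```python
def is_connected(adj, vertices):
    """DFS restricted to `vertices`; the empty vertex set counts as connected."""
    if not vertices:
        return True
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in vertices and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)

def elimination_orderings(adj, vertices=None, prefix=()):
    """Yield every removal order that keeps the remaining graph connected."""
    if vertices is None:
        vertices = frozenset(adj)
    if not vertices:                      # G is empty: output S
        yield prefix
        return
    for v in sorted(vertices):
        rest = vertices - {v}
        if is_connected(adj, rest):       # G-v is connected
            yield from elimination_orderings(adj, rest, prefix + (v,))

path_graph = {0: [1], 1: [0, 2], 2: [1]}
for order in elimination_orderings(path_graph):
    print(order)
```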
Necessary Condition
(1) In any connected graph (with at least two vertices), there are at least two vertices whose removal leaves the graph connected.
    Thus each (internal) iteration has at least two children.
(2) The computation time of an iteration on level i is $O(i^2)$.
Hence the amortized computation time of an iteration is O(1).
    Iter (G, X)
      if G is empty then output X
      for each vertex v in G
        if G-v is connected then call Iter (G-v, X+v)
Small Pitfalls
Q. How to output X in O(1) time?
   Output X by its difference from the previously output one.
   Since the number of additions and deletions is linear in the number of iterations, the amortized output time is O(1).
Q. How to give G and X to the recursive call in O(1) time?
   Always update them in place, and pass them as pointers to G and X.
   Before the recursive call, we remove v and its incident edges from G, and add v to X.
   After the recursive call, we add v and its incident edges back to G, and remove v from X.
   This doesn't increase the time/space complexity.
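Both tricks can be sketched together. In the illustration below, the event format ('+', '-', 'out'), the function names, and the use of a shared set of remaining vertices are assumptions; the connectivity test and the sorted copy of the vertex set are not constant time, so this shows only the update-and-undo pattern and the difference-based output, not the claimed bound.

```python
def connected(adj, alive):
    """DFS restricted to `alive`; the empty set counts as connected."""
    if not alive:
        return True
    start = next(iter(alive))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in alive and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(alive)

def enumerate_by_difference(adj):
    """Yield ('+', v) / ('-', v) updates to X and ('out', None) at each solution."""
    alive = set(adj)                  # the current graph, shared by all calls
    def rec():
        if not alive:
            yield ('out', None)
            return
        for v in sorted(alive):
            alive.remove(v)           # update G before the recursive call
            if connected(adj, alive):
                yield ('+', v)        # X gains v
                yield from rec()
                yield ('-', v)        # X loses v again
            alive.add(v)              # restore G after the recursive call
    yield from rec()

def replay(events):
    """Rebuild the full solutions from the difference stream."""
    x, solutions = [], []
    for op, v in events:
        if op == '+':
            x.append(v)
        elif op == '-':
            x.pop()
        else:
            solutions.append(tuple(x))
    return solutions

path_graph = {0: [1], 1: [0, 2], 2: [1]}
print(replay(enumerate_by_difference(path_graph)))
```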
Other Elimination Orderings
There are many kinds of elimination orderings:
+ perfect elimination ordering (chordal graphs)
+ strongly perfect elimination ordering (chordal graphs)
+ vertex on the surface of the convex hull (points)
+ …
+ edge coloring of bipartite graphs can also be solved
At least for the first two, constant time algorithms have been proposed (technical, to achieve amortized constant time for each iteration).
We can have the same results with very, very simple algorithms.
[Ruskey et al. '03] [Matsui & U '05]
4 Amortize by Children
Biased Recursion Trees
The previous cases are, in a sense, perfect:
+ the height (depth) is equal everywhere
+ the computation time depends on the height
We want to have stronger tools that can be applied to biased cases.
Well-known Case
Let T(X) be the computation time of iteration X.
If every child takes at most T(X) / α for some α > 1, the height of the tree is O(log n).
+ useful in complexity analysis
+ however, #iterations is bounded by a polynomial (not fit for enumeration)
We need another idea.
Local Amortization
If #children is large, the amortized time complexity will be small even though a sudden decrease occurs.
Let C(X) be the set of children of iteration X, and assign the computation time T(X) to X and its children, so that each receives T(X) / (|C(X)| + 1).
The amortized time complexity of an iteration is $O( \max_X \{ T(X) / ( |C(X)| + 1 ) \} )$.
We can use #grandchildren instead of #children.
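The bound follows because the values |C(X)| + 1 sum to 2N - 1 over a tree with N iterations, so the total time is at most (2N - 1) times the maximum ratio. A small sketch that checks this on a tree; the (time, children) encoding and the function name are assumptions made for illustration.

```python
def stats(node):
    """Return (total_time, iterations, max_ratio) for a tree of (time, children) pairs.

    max_ratio is the maximum of T(X) / (|C(X)| + 1) over all iterations X.
    """
    time, children = node
    total, count, worst = time, 1, time / (len(children) + 1)
    for child in children:
        t, c, w = stats(child)
        total, count, worst = total + t, count + c, max(worst, w)
    return total, count, worst

# A root that costs 6 but has 5 cheap children: a sudden decrease, yet cheap on average.
star = (6, [(1, [])] * 5)
total, count, worst = stats(star)
assert total <= worst * (2 * count - 1)   # so the average is below 2 * max_ratio
print(total / count, worst)
```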
Estimating #(Grand)Children
This analysis needs an estimate of #children (and #grandchildren); this will be a technical part of the proof:
+ estimate by the degree of the pivot vertex
+ #edges in a cycle
+ #edges in a cut
+ …
Enumeration of s-* Paths
Problem: given a graph G=(V,E) and a vertex s, enumerate all simple paths one of whose ends is s.
We can solve this simply by backtracking, where t is the current end of the path (initially t = s).
Each iteration takes O(d(t)) time, where d(t) is the degree of t, so the time complexity of an iteration is O(|V|).
    Iter (G=(V,E), t, S)
      output S
      for each vertex v in G adjacent to t
        call Iter (G-t, v, S+v)
…folklore
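A minimal Python sketch of this backtracking, under some assumptions: the graph is an adjacency dictionary, the function name is hypothetical, and instead of deleting t from G the sketch marks vertices on the current path, so the scan of a neighbor list costs the degree in the original graph rather than in the current one.

```python
def paths_from(adj, s):
    """Yield every simple path that starts at s, as a tuple of vertices."""
    path = [s]
    on_path = {s}
    def rec(t):
        yield tuple(path)                 # output S
        for v in adj[t]:                  # each vertex adjacent to the end t
            if v not in on_path:
                path.append(v)
                on_path.add(v)
                yield from rec(v)
                path.pop()
                on_path.remove(v)
    yield from rec(s)

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
for p in paths_from(triangle, 0):
    print(p)
```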
Amortization
+ Each iteration takes O(d(t)) time
+ Each iteration generates d(t) recursive calls
Thus, $\max_X \{ T(X) / ( |C(X)| + 1 ) \} = O(1)$, and the amortized time complexity of an iteration is O(1).
    Iter (G=(V,E), t, S)
      output S
      for each vertex v in G adjacent to t
        call Iter (G-t, v, S+v)
Other Problems
By using #grandchildren, the complexities of enumeration algorithms for the following structures are established:
+ spanning trees of a given graph: $O(|V|) \to O(1)$
+ trees of size k in a given graph: $O(|E|) \to O(k)$
etc…
5 Push out Amortization
Computation Time "Increases"
In the "toy" cases, the key property was that "the total computation time on each level increases by a constant factor when going to a deeper level":
$2^{n-(i+1)}\,\mathrm{poly}(i+1) \,/\, 2^{n-i}\,\mathrm{poly}(i) = \mathrm{poly}(i+1) \,/\, 2\,\mathrm{poly}(i)$
It seems that "an increase of computation time is good for us" (it implicitly forbids a "sudden decrease").
Since the tree is biased, apply this idea locally: to a parent and its children (or descendants).
Local Increase
In the "toy" cases, we compared the total computation time on a level with that on the neighboring level.
Instead, we compare the computation time of a parent with the total time of its children.
The condition of the toy case is implemented as follows:
$\sum_{\text{child } Z \text{ of } X} T(Z) \ge \alpha\,T(X)$ for some α > 1
We will characterize good cases by this condition.
PO (Push Out) Condition
T(X): computation time of iteration X
T*: the maximum computation time on a leaf
C(X): set of child iterations of X
PO Condition: $\alpha\,T(X) \le \sum_{Y \in C(X)} T(Y)$
α, β > 1: constants (β is not used in this simplified form)
That is, the time of X ≤ time of children / α.
Formula
S(X): computation time given to X by its parent
T*: the maximum computation time on a leaf
C(X): set of child iterations of X
Theorem: if the PO condition holds in all iterations, S(X) = O( T(X) ), thus the amortized computation time of each iteration is O(T*).
Proof: each iteration X gives its computation time to its children (just moved, for the analysis), so that each child Z receives
$T(X) \times ( T(Z) \,/\, \sum_{Y \in C(X)} T(Y) )$
Each child gives the received time and its own time to its children, recursively (so, to the grandchildren).
Induction
Each child gives the received computation time and its own computation time to its children (so, to the grandchildren).
Observation: under this rule, any iteration Z receives at most T(Z) / (α-1) from its parent
(intuitively, T(Z) / α from the parent, T(Z) / α² from the grandparent, ...).
Suppose that an iteration X satisfies this bound and the PO condition $\sum_{Y \in C(X)} T(Y) \,/\, T(X) \ge \alpha$. Then its child Z receives at most
$\{ T(Z) / \sum T(Y) \} \{ T(X) + T(X)/(\alpha-1) \}$
$= \{ T(Z) / \sum T(Y) \} \{ T(X)\,\alpha/(\alpha-1) \}$
$\le \{ T(Z)/\alpha \} \{ \alpha/(\alpha-1) \} = T(Z)/(\alpha-1)$
Amortized Time on Leaf
Claim: under this rule, any iteration Z receives at most T(Z) / (α-1) from its parent.
Each leaf receives at most T* / (α-1) time from its parent.
After this move, only leaves have computation time.
Each leaf has at most T* + T* / (α-1) = O(T*) time, so the amortized time complexity of an iteration is O(T*).
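The push-out argument can be simulated on a small tree. The sketch below assumes the simplified PO condition (α T(X) ≤ sum of the children's times) and a (time, children) encoding of the recursion tree; the function name and the tolerance in the assertion are choices made for illustration.

```python
def push_out(node, alpha, received=0.0):
    """Move all computation time down to the leaves and return the leaf loads.

    node is (time, children). Each iteration passes its own time plus the
    time it received to its children, in proportion to the children's times.
    """
    time, children = node
    if not children:
        return [time + received]
    child_sum = sum(t for t, _ in children)
    assert alpha * time <= child_sum, "PO condition violated"
    loads = []
    for child in children:
        share = (time + received) * child[0] / child_sum
        # Claim from the induction: a child Z receives at most T(Z) / (alpha - 1)
        assert share <= child[0] / (alpha - 1) + 1e-9
        loads += push_out(child, alpha, share)
    return loads

leaf = (1.0, [])
tree = (1.0, [(1.0, [leaf, leaf]), (1.0, [leaf, leaf])])
loads = push_out(tree, alpha=2.0)
print(loads)   # every leaf load is at most T* + T*/(alpha - 1) = 2
```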
PO (Push Out) Condition
T(X): computation time of iteration X
T*: the maximum computation time on a leaf
C(X): set of child iterations of X
PO Condition: $\alpha\,( T(X) - \beta T^* (|C(X)|+1) ) \le \sum_{Y \in C(X)} T(Y)$
α, β > 1: constants
The term βT* (|C(X)|+1) is assigned to X and its children; the rest ≤ time of children / α.
First, for each iteration X, assign βT* of its computation time to each child and to itself (βT* (|C(X)|+1) in total), and forget about this part.
Then, the situation is the same as in the previous theorem.
Formula
T*: the maximum computation time on a leaf
T(X): computation time of iteration X
C(X): set of child iterations of X
S(X): computation time given to X by its parent
PO Condition: $\alpha\,( T(X) - \beta T^* (|C(X)|+1) ) \le \sum_{Y \in C(X)} T(Y)$
Theorem: if the PO condition holds in all iterations, the amortized computation time of each iteration is O(T*).
Connected Component Enumeration
Connected Component Enumeration
Problem: for a given graph G = (V, E), output all vertex sets of G that induce a connected subgraph.
In particular, we consider the enumeration of those including a specified vertex r.
Terms:
  d(v): the degree of v
  G-v: the graph obtained from G by removing v
  G/e: the graph obtained from G by contracting edge e
[AvisFukuda96, U.98]
Simple Binary Partition
Enumerated by simple binary partition, with a vertex v adjacent to r, as follows:
● vertex subsets including v: recur with G/(r,v)
● vertex subsets not including v: recur with G - v
An iteration takes O(|V|) time.
The PO condition does not hold with a naïve observation.
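The binary partition can be sketched as follows. This illustration does not build G/(r,v) or G - v explicitly: it keeps the set of vertices merged into r and the set of removed vertices, which is an assumption of the sketch (as are the function name and the adjacency-dictionary input), and it recomputes the neighborhood in every iteration, so it does not reflect the time bounds discussed in the talk.

```python
def connected_sets(adj, r):
    """Yield every vertex set containing r that induces a connected subgraph."""
    def rec(comp, excluded):
        # vertices adjacent to the (contracted) vertex r in the current graph
        frontier = sorted({w for u in comp for w in adj[u]} - comp - excluded)
        if not frontier:
            yield frozenset(comp)
            return
        v = frontier[0]
        yield from rec(comp | {v}, excluded)    # sets including v: G/(r,v)
        yield from rec(comp, excluded | {v})    # sets not including v: G - v
    yield from rec({r}, set())

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
for s in connected_sets(triangle, 0):
    print(sorted(s))
```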
Observing in Detail
An iteration takes O( d(r) + d(v) ) time, actually.
Child iterations take time Ω( (d(r) + d(v) - 2) / 2 + ○○ ) and Ω( d(r) - 1 + △△ ).
● Here, let T(X) = 3d(r) + d(v) and T* = 2
● Sum of child iterations:
  $3(d(r)+d(v)-2)/2 + 3d(r) - 3 = (9d(r)+3d(v))/2 - 6 \ge 1.5\,T(X) - 3T^*$
Tricky! The PO condition holds!
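The arithmetic can be checked mechanically against the general PO condition with |C(X)| = 2. In this sketch the function names are hypothetical, the children's times are the lower bounds used on the slide, and α = 1.5, β = 1.5 are example constants chosen for illustration.

```python
def po_condition_holds(t_x, child_times, t_star, alpha, beta):
    """Check alpha * (T(X) - beta*T* * (|C(X)|+1)) <= sum of T(Y) over children Y."""
    rest = t_x - beta * t_star * (len(child_times) + 1)
    return alpha * rest <= sum(child_times)

def slide_instance(d_r, d_v):
    """T(X) and the lower bounds on the two children's times used on the slide."""
    t_x = 3 * d_r + d_v
    contracted_child = 3 * (d_r + d_v - 2) / 2    # recursion on G/(r,v)
    deleted_child = 3 * d_r - 3                   # recursion on G - v
    return t_x, [contracted_child, deleted_child]

for d_r in range(1, 6):
    for d_v in range(1, 6):
        t_x, children = slide_instance(d_r, d_v)
        assert sum(children) == 1.5 * t_x - 3 * 2          # equals 1.5 T(X) - 3 T*
        assert po_condition_holds(t_x, children, 2, 1.5, 1.5)
```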
Matching Enumeration
Enumeration of Matchings
Problem: for a given graph G = (V, E), output all matchings of G.
matching: an edge subset s.t. no two edges are adjacent
Terms:
  d(v): the degree of v
  G-e: the graph obtained from G by removing e
  $G^+(e)$: the graph obtained from G by removing edge e and the edges adjacent to e
  G-u: the graph obtained from G by removing vertex u and the edges incident to u
Basic Algorithm
Matching enumeration is done, with an edge e = (u,v), as follows:
● enumerate matchings including e: recur with $G^+(e)$
● enumerate matchings not including e: recur with G - e
That is, binary partition (branch & bound).
An iteration takes O(|V|) time (or the maximum degree).
T* = O(1), but PO doesn't hold when the degrees of u and v are large.
…folklore
Trick on Branching
Partition the problem by the edges $e_1, \dots, e_k$ incident to u:
● matchings including $e_1$ ($G^+(e_1)$)
● matchings including $e_2$ ($G^+(e_2)$)
  …
● matchings including $e_k$ ($G^+(e_k)$)
● matchings including none of $e_1, \dots, e_k$ (G - u)
An iteration takes O(|E|) time, which is an increase. T* = O(1) doesn't change.
A child with $G^+((u,v))$ spends O(|E|-d(u)-d(v)+1) time (×d(u) children), and the child with G-u spends O(|E|-d(u)) time (×1).
[WADS2015]
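The branching on all edges incident to u can be sketched as below. Assumptions of this illustration: the graph is a list of edges of a simple graph, the function name is hypothetical, u is taken as a maximum degree vertex, and subgraphs are rebuilt as new lists in each call, so the sketch does not achieve the time bounds discussed in the talk.

```python
from collections import Counter

def matchings(edges):
    """Yield every matching of the simple graph `edges` as a tuple of edges."""
    if not edges:
        yield ()
        return
    degree = Counter(x for e in edges for x in e)
    u = max(degree, key=degree.get)               # pivot: a maximum degree vertex
    for e in [f for f in edges if u in f]:
        # matchings including e: recur with G+(e), i.e. drop e and adjacent edges
        rest = [f for f in edges if not set(e) & set(f)]
        for m in matchings(rest):
            yield (e,) + m
    # matchings including no edge incident to u: recur with G - u
    yield from matchings([f for f in edges if u not in f])

path_edges = [(0, 1), (1, 2), (2, 3)]
for m in matchings(path_edges):
    print(m)
```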
PO Condition
d(u) children spend O(|E|-d(u)-d(v)+1) time each, and one child spends O(|E|-d(u)) time.
In total, all children take
$O( (d(u)+1)(|E|-d(u)) - \sum d(v) + d(u) ) = O( d(u)(|E|-d(u)) + |E| - \sum d(v) ) = O( d(u)(|E|-d(u)) )$ time.
Choose a maximum degree vertex as u; then
+ d(u) = 1: d(v) is also 1, so the two subproblems have ≥ |E|-2 edges
+ d(u) ≥ c|E|: #children ≥ c|E|
+ otherwise: d(u)(|E|-d(u)) ≥ c|E| for some c
In any case, the PO condition holds, thus O(1) time for each solution.
Spanning Tree Enumeration
Enumeration of Spanning Trees
Problem: for a given graph G = (V,E), output all spanning trees of G.
Terms:
  G-e: the graph obtained from G by removing e
  G/e: the graph obtained from G by contracting edge e
[WADS2015]
Basic Algorithm
● spanning trees including edge e: those in the graph obtained by shrinking (contracting) e
● spanning trees not including e: those in the graph obtained by removing e
An iteration takes O(|V|) time, or O(log |V|).
Existing algorithms attained O(children + grandchildren), by the choice of pivot and data structures.
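The contract/remove partition can be sketched in a few lines. This is an illustration under assumptions: vertices are 0..n-1, edges are (u, v, label) triples of a multigraph, and the function name is hypothetical. The sketch does not test for bridges, so some branches end without output; it is therefore not an output-polynomial implementation, only a demonstration of the partition itself.

```python
def spanning_trees(n, edges):
    """Yield each spanning tree of a multigraph on vertices 0..n-1 as a frozenset of labels.

    edges is a list of (u, v, label) triples.
    """
    def rec(num_vertices, edges, chosen):
        if num_vertices == 1:                  # everything contracted: a tree is complete
            yield frozenset(chosen)
            return
        if not edges:                          # graph became disconnected: no tree here
            return
        (u, v, label), rest = edges[0], edges[1:]
        # trees including e: contract e (merge v into u and drop self-loops)
        contracted = []
        for a, b, l in rest:
            a, b = (u if a == v else a), (u if b == v else b)
            if a != b:
                contracted.append((a, b, l))
        yield from rec(num_vertices - 1, contracted, chosen + [label])
        # trees not including e: remove e
        yield from rec(num_vertices, rest, chosen)
    yield from rec(n, [e for e in edges if e[0] != e[1]], [])

triangle = [(0, 1, 'a'), (1, 2, 'b'), (0, 2, 'c')]
for tree in spanning_trees(3, triangle):
    print(sorted(tree))
```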
Trick on Branching
T* = O(1), but PO doesn't hold when e has many parallel/series edges.
● e has parallel edges $e_1, \dots, e_k$: trees including one of $e, e_1, \dots, e_k$, or including none
● e has series edges $e_1, \dots, e_k$: trees not including one of $e, e_1, \dots, e_k$, or including all
In these cases, divide the problem in other ways.
Trick on Branching
T* = O(1), but PO doesn't hold when e has many parallel/series edges.
● $e_0$ has parallel edges $e_1, \dots, e_k$: recur with $(G - \{e_1, \dots, e_k\}) / e_0$, or $G - \{e_0, e_1, \dots, e_k\}$
● $e_0$ has series edges $e_1, \dots, e_k$: recur with $G / (\{e_0, \dots, e_k\} - e_i) - e_i$, or $G / \{e_0, e_1, \dots, e_k\}$
Construction of each graph from another takes O(1) time, so an iteration is still done in O(|E|) time.
The PO condition holds when e has parallel/series edges, since it then has many children.
Checking PO Condition
● $e_0$ has parallel edges $e_1, \dots, e_k$: recur with $(G - \{e_1, \dots, e_k\}) / e_0$, or $G - \{e_0, e_1, \dots, e_k\}$
● $e_0$ has series edges $e_1, \dots, e_k$: recur with $G / (\{e_0, \dots, e_k\} - e_i) - e_i$, or $G / \{e_0, e_1, \dots, e_k\}$
An iteration is still done in O(|E|) time.
+ if $e_0$ has < c|E| parallel/series edges, the PO condition holds since the children are large
+ if $e_0$ has ≥ c|E| parallel/series edges, the PO condition holds since there are c|E| children
So, it is O(1) time for each solution.
Conclusion
Conclusion
+ Mechanism of amortization: an enumeration algorithm spends much of its time on the bottom levels
+ Basic (toy) case (elimination ordering): even toy cases are interesting!
+ Local amortization (path enumeration): the cost of a parent is assigned to its children and grandchildren
+ Biased (general) case (matching enumeration): just modify the algorithm so that the conditions are satisfied
References
PO Condition
  T. Uno, Constant Time Enumeration by Amortization, WADS 2015, LNCS 9214, (2015)
Elimination orderings
  L. Chandran, L. Ibarra, F. Ruskey, J. Sawada, Generating and characterizing the perfect elimination orderings of a chordal graph, TCS, (2003)
  Y. Matsui, T. Uno, On the Enumeration of Bipartite Minimum Edge Colorings, Graph Theory in Paris, Trends in Mathematics 2007, (2007)
Matching
  T. Uno, Algorithms for Enumerating All Perfect, Maximum and Maximal Matchings in Bipartite Graphs, ISAAC97, LNCS 1350, (1997)
  T. Uno, A Fast Algorithm for Enumerating Bipartite Perfect Matchings, ISAAC2001, LNCS 2223, (2001)
  T. Uno, A Fast Algorithm for Enumerating Non-Bipartite Maximal Matchings, J. National Institute of Informatics 3, (2001)
References
Connected component
  D. Avis, K. Fukuda, Reverse Search for Enumeration, DAM 65, (1996)
  T. Uno, Studies on Speeding Up Enumeration Algorithms, PhD thesis (1998)
Spanning Trees
  H. N. Kapoor and H. Ramesh, Algorithms for Generating All Spanning Trees of Undirected, Directed and Weighted Graphs, LNCS 519, (1992)
  A. Shioura, A. Tamura and T. Uno, An Optimal Algorithm for Scanning All Spanning Trees of Undirected Graphs, SIAM J. Comp. 26, (1997)
  T. Uno, An Algorithm for Enumerating All Directed Spanning Trees in a Directed Graph, ISAAC96, LNCS 1178, (1996)
  T. Uno, A New Approach for Speeding Up Enumeration Algorithms, ISAAC98, LNCS 1533, (1998)
  T. Uno, A New Approach for Speeding Up Enumeration Algorithms and Its Application for Matroid Bases, COCOON 99, LNCS 1627, (1999)