Fixed parameter tractability II: Kernelization Algorithms & Networks Today: Thomas van Dijk
Overview: Fixed parameter complexity recap. Two kernelizations: Vertex Cover again, Cluster Editing. Aspects of kernelization (Feedback Vertex Set): algorithm, analysis, implementation, experimentation.
Quick reintroduction to FPT / Kernelization
Hard problems: ‘Natural’ problems tend to be either in P (nice) or NP-hard (nasty). How to solve nasty problems? Solve only small instances. Don’t actually solve the problem: … approximate … Exploit properties of certain instances.
Fixed parameter complexity In many applications, some number can be assumed to be small Time of algorithm can be exponential in this small number, but should be polynomial in usual size of problem
Fixed parameter problems. Given: Graph G, integer k, … Parameter: k. Question: Does G have a ??? of size at least (at most) k? Examples: vertex cover, independent set, coloring, … k-Coloring: NP-complete for k>2… k-Clique: seems to require $\Omega( n^{f(k)} )$ time… k-Vertex Cover: solvable in $O( 2^k (n+m) )$ time.
Fixed parameter complexity theory: To distinguish between behavior: $O( f(k) \cdot \mathrm{poly}(n) )$: Fixed Parameter Tractable. $O( n^{f(k)} )$: not so nice. Proposed by Downey and Fellows.
Kernelization algorithms: Kernelization = “Preprocessing with quality guarantee”. Polynomial time algorithm. Input: Graph G, parameter k. Output: Graph G’, parameter k’ ≤ k, such that ( G’, k’ ) is YES iff ( G, k ) is YES, and G’ has O( f(k) ) vertices.
Kernel iff FPT. Kernelization ⇒ FPT: Kernelize, then solve the kernel by any exact algorithm. Runtime: $O( p_{\text{kernelize-time}}(n) + f_{\text{solve-time}}( f_{\text{kernel-size}}(k) ) )$. FPT ⇒ Kernelization: By definition of FPT, the problem has an $O( p(n) \cdot f(k) )$ time algorithm. If n ≤ f(k), the input is already a kernel. If n > f(k), solve the problem with the FPT algorithm, which then takes $O( p(n) \cdot n )$ time and is therefore polynomial; output some fixed small YES or NO instance accordingly.
Typical form of kernelization Repeat some rules, until no rule is possible Rules can do some necessary modification and decrease k. Rules can remove some part of the graph. Rules can output YES or NO. Sometimes add ‘annotation’ to the graph
Kernel for Vertex Cover
Vertex cover: simple kernel Rule 1: if there is a vertex v of deg(v)>k, then remove v, decrease k by one. Rule 1 is safe. Consider v not in vertex cover. Covering v’s edges costs >k vertices. Resulting instance is NO. Therefore, if instance is YES, then v is in the cover. Therefore, instance is YES iff converted instance is YES.
Vertex cover: simple kernel. Rule 2: if rule 1 does not apply and $m > k^2$, then answer NO. Rule 2 is safe: By [not rule 1], no vertex has degree > k. Therefore, a vertex covers at most k edges. Therefore, k vertices cover at most $k^2$ edges, and an instance with more than $k^2$ edges must be NO. Analysis of kernel size: By rule 2, $m \le k^2$. Once isolated vertices are removed, $n \le 2m \le 2k^2$. Analysis of runtime: clearly polynomial.
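A minimal Python sketch of Rules 1 and 2, assuming the graph is given as a list of vertex pairs. The function name `vertex_cover_kernel` and the return convention (`None` for NO) are illustrative choices, not part of the lecture.

```python
def vertex_cover_kernel(edges, k):
    """Apply Rule 1 and Rule 2 to (edges, k).

    Returns None when the rules prove the instance is NO, otherwise
    (kernel_edges, remaining_k, forced), where `forced` holds the
    vertices that Rule 1 put into the cover.
    """
    edges = {frozenset(e) for e in edges if len(set(e)) == 2}
    forced = set()
    while True:
        if k < 0:
            return None  # Rule 1 used up more budget than we had
        degree = {}
        for e in edges:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        high = [v for v, d in degree.items() if d > k]
        if not high:
            break
        v = high[0]                      # Rule 1: deg(v) > k, so v is in the cover
        forced.add(v)
        edges = {e for e in edges if v not in e}
        k -= 1
    if len(edges) > k * k:               # Rule 2: too many edges for k low-degree vertices
        return None
    return edges, k, forced


# A star with 5 leaves plus one extra edge, k = 2:
# Rule 1 forces the center 0, leaving the edge {6, 7} with k = 1.
star = [(0, i) for i in range(1, 6)] + [(6, 7)]
print(vertex_cover_kernel(star, 2))
```

Isolated vertices never appear in the edge set, so the returned kernel has at most 2k² vertices for the remaining k.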
Kernel for Cluster Editing
Cluster editing. Instance: undirected graph G=(V,E), integer k. Parameter: k. Question: can we make at most k modifications to G, by adding edges and deleting edges, such that each connected component is a clique? Known kernels: $O( k^2 )$ vertices: Gramm, Guo, Hüffner, Niedermeier, CIAC’03. $O( k )$ vertices: e.g. Fellows, Langston, Rosamond, Shaw, FCT’07.
Trivial rules and a plan Rule 1: If a connected component of G is a clique, remove this connected component Rule 2: If we have more than k connected components and Rule 1 does not apply: answer NO Consequence: after Rule 1 and Rule 2, there are at most k connected components Plan: find rules that make connected component small Annotate the graph: pairs of vertices can be permanent or forbidden.
Observation and rule 3 If two vertices have k+1 neighbors in common, they must belong to the same clique in a solution Rule 3: Suppose vertices v, w have k+1 neighbors in common. If the edge did not exist, add it and decrease k by 1 Set the edge {v,w} to be permanent
Another observation and rule 4 If there are at least k+1 vertices that are adjacent to exactly one of v and w, then {v,w} cannot be an edge in the solution Rule 4: Suppose vertex v has k+1 neighbors that do not neighbor vertex w If {v,w} is an edge: delete it and decrease k by one Mark the pair {v,w} as forbidden Rule 5: if a pair is both forbidden and permanent then there is no solution
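The tests behind Rules 3, 4 and 5 can be sketched as follows. This is an illustration only: the function name `annotate_pairs` and the adjacency-dictionary input are assumptions, Rule 4 is read in its one-sided form (in either direction), and the sketch only classifies pairs without adding or deleting edges or updating k.

```python
from itertools import combinations


def annotate_pairs(adj, k):
    """Classify vertex pairs as permanent (Rule 3) or forbidden (Rule 4).

    adj maps each vertex to the set of its neighbors.
    Returns (permanent, forbidden, conflict); conflict is True when some
    pair is both permanent and forbidden (Rule 5: no solution).
    """
    permanent, forbidden = set(), set()
    for v, w in combinations(sorted(adj), 2):
        pair = frozenset((v, w))
        common = adj[v] & adj[w]
        only_v = adj[v] - adj[w] - {w}   # neighbors of v that do not neighbor w
        only_w = adj[w] - adj[v] - {v}   # neighbors of w that do not neighbor v
        if len(common) >= k + 1:
            permanent.add(pair)          # Rule 3
        if len(only_v) >= k + 1 or len(only_w) >= k + 1:
            forbidden.add(pair)          # Rule 4
    return permanent, forbidden, bool(permanent & forbidden)


# A 4-cycle 0-2-1-3-0 with k = 1: the opposite corners share two neighbors.
cycle = {0: {2, 3}, 1: {2, 3}, 2: {0, 1}, 3: {0, 1}}
permanent, forbidden, conflict = annotate_pairs(cycle, 1)
```

A full kernelization would re-run this classification after every edge change, since adding or deleting an edge alters the neighborhoods.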
Transitivity (triangles). Rule 6: if {v,w} is permanent, and {w,x} is permanent, then set {v,x} to be permanent; if the edge did not exist, add it, and decrease k by one. Rule 7: if {v,w} is permanent and {w,x} is forbidden, then set {v,x} to be forbidden; if the edge existed, delete it, and decrease k by one.
Runtime: Rules can be executed in polynomial time. With properly chosen data structures, can kernelize a graph in $O(n^3)$ time.
Analysis: plan of attack Already know: at most k connected components Small components are fine So suppose component is big … we’ll show contradiction We’ll aim at a quadratic kernel
Analysis: counting Consider a connected component C with at least 4k+1 vertices. Let K = vertices not involved in a modification K, by definition, must already form a clique Since k modifications cannot touch > 2k vertices: |K| ≥ 2k+1 All edges in K already are permanent by rule 3
Counting continued: Consider a vertex v in C-K. Case 1: v is adjacent to at least k+1 vertices in K. By rule 3, v gets permanent edges to all of K; that is, v joins K. Case 2: v is non-adjacent to at least k+1 vertices in K. By rule 4, v gets forbidden edges to all of K; that is, v is separated from all of K. Since |K| ≥ 2k+1, any v fits one of the above cases. Therefore, each connected component has size at most 4k. In total: at most $4k^2$ vertices.
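The arithmetic behind the counting argument, written out as a sketch. Here $\deg_K(v)$ denotes the number of neighbors of v inside K, a notation introduced only for this note.

```latex
\begin{align*}
|K| &\ge |C| - 2k \ge (4k+1) - 2k = 2k+1, \\
\deg_K(v) + \bigl(|K| - \deg_K(v)\bigr) &= |K| \ge 2k+1
  \;\Longrightarrow\; \max\bigl(\deg_K(v),\, |K| - \deg_K(v)\bigr) \ge k+1, \\
\text{total number of vertices} &\le \underbrace{k}_{\text{components}} \cdot \underbrace{4k}_{\text{vertices each}} = 4k^2 .
\end{align*}
```

The second line is why every vertex outside K falls into at least one of the two cases.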
Fixed parameter complexity of feedback set problems
Overview: Feedback set problems. Cubic kernel for Feedback Vertex Set. Algorithmic considerations. Experimental evaluation.
Feedback Vertex Set. Instance: An undirected graph G = (V,E) and an integer k. Question: Does there exist an $S \subseteq V$ of size at most k such that G[ V-S ] is acyclic?
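To make the definition concrete, here is a small checker that tests whether a given set S is a feedback vertex set. It is a sketch using union-find; the name `is_feedback_vertex_set` and the edge-list input are illustrative assumptions. Parallel edges and self-loops in the list are treated as cycles.

```python
def is_feedback_vertex_set(edges, removed):
    """Return True iff deleting the vertices in `removed` leaves a forest.

    edges is a list of pairs (u, v); repeated pairs are parallel edges and
    (v, v) is a self-loop, both of which count as cycles.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        if u in removed or v in removed:
            continue                       # edge disappears with its endpoint
        ru, rv = find(u), find(v)
        if ru == rv:
            return False                   # this edge closes a cycle
        parent[ru] = rv
    return True


triangle = [(0, 1), (1, 2), (2, 0)]
print(is_feedback_vertex_set(triangle, {0}))    # True
print(is_feedback_vertex_set(triangle, set()))  # False
```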
Example
Example: an FVS
Example: not an FVS
Example: a lower bound
Feedback set problems: Feedback edge set: easy (spanning tree). Feedback vertex set, Feedback arc set, Directed feedback vertex set (and Loop Cutset): NP-complete.
Feedback Vertex Set. Instance: An undirected graph G = (V,E). Parameter: An integer k. Question: Does there exist an $S \subseteq V$ of size at most k such that G[ V-S ] is acyclic?
Feedback set problems, parameterized on size of solution: Feedback vertex set: fixed parameter tractable. Feedback edge set: easy. Feedback arc set. Directed feedback vertex set (and Loop Cutset).
Feedback set problems, parameterized on size of solution: Feedback vertex set (and Loop Cutset): fixed parameter tractable. Feedback edge set: easy. Feedback arc set. Directed feedback vertex set.
Feedback set problems, parameterized on size of solution: Feedback vertex set; Directed feedback vertex set (and Loop Cutset): fixed parameter tractable. Feedback edge set: easy. And Loop Cutset? Feedback arc set.
Intuition: “why” FVS is FPT. For n→∞, fixed k, instances become either easy (“because” they are sparse), or complicated but in a way that is easy to recognize: “There is no way this can be done with only k vertices!” It is hard to make any complicated structure without making lots of cycles.
Feedback Vertex Set is FPT: Almost directly from some deep theorems of fixed parameter complexity; not practical. There are several FPT algorithms, e.g. $O^*( 5^k )$: Chen, Fomin, Liu, Lu, Villanger. $O^*( 4^k )$ probabilistic: Becker, Bar-Yehuda, Geiger. Practical…?
Kernels for feedback vertex set: $O(k^{11})$ vertices: Burrage, Estivill-Castro, Fellows, Langston, Mac & Rosamond, IWPEC’06. $O(k^3)$ vertices: Bodlaender, STACS’06. $O(k^2)$ vertices: Thomassé, SODA’09.
Kernelization overview: A collection of rules. A proof: if none of the rules apply, then the graph can have only $O( k^3 )$ vertices. Complicated; not in this presentation. An algorithm: just keep trying the rules until none apply.
Simple rules. Islet rule: Remove degree 0 vertices. Twig rule: Remove degree 1 vertices. Triple edge rule: Reduce any edge of multiplicity greater than two to a double edge.
Degree two rule: Suppose v has degree 2. Bypass v: remove v and connect its two neighbors by an edge. There exists an optimal FVS that does not use v: choose one of its neighbors instead.
Self-loops Self-loop rule: Suppose v has a self-loop. Remove v and decrease k by one. Vertex v must be in the FVS.
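The simple rules so far (Islet, Twig, Triple edge, Degree two, Self-loop) can be sketched on a multigraph as below. This is an illustration under stated assumptions, not the implementation discussed later in the talk: the name `reduce_simple_rules`, the edge-list input and the nested-counter representation are all hypothetical, and a self-loop at v is stored once as `g[v][v]`.

```python
from collections import Counter, defaultdict


def reduce_simple_rules(edge_list, k):
    """Exhaustively apply the Islet, Twig, Triple edge, Degree two and
    Self-loop rules. Returns (graph, remaining_k, chosen); a negative
    remaining_k means the instance is NO."""
    g = defaultdict(Counter)             # g[u][v] = multiplicity of edge {u, v}
    for u, v in edge_list:
        g[u][v] += 1
        if u != v:
            g[v][u] += 1
    chosen = set()

    def remove_vertex(v):
        for w in list(g[v]):
            if w != v:
                del g[w][v]
        del g[v]

    changed = True
    while changed:
        changed = False
        for v in list(g):
            if v not in g:
                continue                 # already removed in this pass
            if g[v][v] > 0:              # Self-loop rule: v must be in the FVS
                remove_vertex(v)
                chosen.add(v)
                k -= 1
                changed = True
                continue
            for w in g[v]:               # Triple edge rule: cap multiplicity at 2
                if g[v][w] > 2:
                    g[v][w] = g[w][v] = 2
                    changed = True
            degree = sum(g[v].values())
            if degree <= 1:              # Islet and Twig rules
                remove_vertex(v)
                changed = True
            elif degree == 2:            # Degree two rule: bypass v
                a, b = g[v].elements()   # the two edge ends; a == b for a double edge
                remove_vertex(v)
                g[a][b] += 1             # a == b turns into a self-loop on a
                if a != b:
                    g[b][a] += 1
                changed = True
    return {v: dict(nb) for v, nb in g.items()}, k, chosen


# A triangle collapses completely: two bypasses create a self-loop,
# and the Self-loop rule then picks the last vertex.
graph, k_left, chosen = reduce_simple_rules([(0, 1), (1, 2), (2, 0)], 1)
```

Note how bypassing a degree-2 vertex whose two edges go to the same neighbor produces a self-loop, which is what makes the cascade between these rules work.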
Large double degree Large double degree rule: Suppose v has k+1 double edges. Remove v and decrease k by one.
Flower rule example, k ≤ 2
Flower rule Flower rule: Suppose there are k+1 cycles that are vertex-disjoint except all include v. Remove v and decrease k by one. If v is not chosen in the FVS then all k+1 cycles need to be broken separately.
Cascade of degree-2 rule
Improvement rule example, k ≤ 2
Improvement rule Improvement rule: Suppose there are k+2 vertex-disjoint paths between vertices u and v. Add a double edge (u,v). Any two of the paths form a cycle. If neither u nor v is chosen, at least k+1 of the paths must be broken separately. So at least one of u and v must be chosen.
Improvement rule example, k ≤ 2
Improvement rule example, k ≤ 1
Large double degree, k ≤ 1
2-Approximation rule: Suppose a 2-approximate FVS is larger than 2k. Conclude NO. A 2-approximate FVS can be found in polynomial time: Becker, Geiger; Bafna, Berman, Fujito.
Rule: Abdication Suppose a vertex v governs a piece X. If it has one edge into X, remove that edge. If it has multiple edges into X, select v in the FVS.
Rule overview: Islet, Twig, Triple edge; Degree two; Self-loop, Flower; Improvement; 2-Approximation; Abdication.
Analysis: Let A be a 2-approximate FVS. By the 2-Approximation rule: |A| ≤ 2k. Let B be the vertices with a double edge to A. By the Large Double Degree rule: |B| in $O(k^2)$. Call a connected component of G[V-A-B] a piece. We will bound the number of pieces by $O(k^3)$. More analysis gives $O(k^3)$ vertices and edges; skipped here.
Analysis: Border of a piece: its neighbors in A and B. Associate each piece with a pair of vertices in its border that does not have a double edge. Such a pair exists for each piece, by Twig and Abdication. Case distinction on the type of those two vertices. Case both in A: k+2 or more such pieces for one pair would trigger Improvement. $O( k^2 )$ pairs of vertices in A, at most k+1 pieces each: $O( k^3 )$ pieces of this type.
Analysis: Case both in B. No two such pieces for the same pair: together they would form a cycle not broken by A. Possibly $\Omega( k^4 )$ pairs of vertices in B; not done yet. We know A is a feedback vertex set: B is not in the feedback vertex set, and the pieces are not in the feedback vertex set. So the vertices in B and how they are connected by pieces must form a forest! $O( k^2 )$ vertices in B; therefore $O( k^2 )$ pieces of this type.
Analysis: Case one in A, one in B. $O( k^3 )$ such pairs. If only one piece for the pair: done. Consider the pairs (a,b) that have multiple pieces associated with them. For any a, at most k such b: Flower rule. For any (a,b), at most k+1 pieces: Improvement rule. At most 2k vertices in A, each paired with at most k vertices in B, each pair with at most k+1 pieces: $O( k^3 )$ such pieces. QED
Related results: Quadratic kernel: Thomassé, SODA’09. Linear kernel on planar FVS: Bodlaender & Penninkx, IWPEC’08. Directed FVS is FPT: Chen, Liu, Lu, O’Sullivan & Razgon, STOC’08.
Algorithmic considerations: A bunch of rules … are they all “required?” The complicated ones are. Dropping any one of those: size of output no longer bounded in k. The $O(k^3)$ size bound, is it tight? Yes: there exists a reduced instance with $\Theta(k^3)$ vertices. How to efficiently check for the rules?
Order of rule checking Matters a lot in an implementation Some aspects: Cascading small-degree rules. Improvement makes double edges, leads to Large Double Degree Abdication rule uses pieces. Only available when 2-approximation is up-to-date.
Justification of the rules The Flower rule is required
Justification of the rules The Improvement rule is required The Abdication rule is required
Justification of the rules: The 2-Approximation rule is required. Consider a grid. The Degree-2 rule cuts the corners. Nothing else happens if k ≥ 3: not Flower (just 2 petals), not Improvement (just 4 paths). Only the 2-Approximation rule applies.
Checking for the improvement rule?
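Checking for the improvement rule: Node-capacity flow. Standard reduction to arc-capacity flow: split each vertex into an in-copy and an out-copy, joined by an arc that carries the node’s capacity.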
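A sketch of the node-capacity flow check, assuming the graph is given as an edge list: every vertex other than the two endpoints is split into an in-copy and an out-copy joined by a unit-capacity arc, and augmenting paths are found by BFS. The name `count_disjoint_paths` is hypothetical, and the plain augmenting-path max-flow used here is just one possible choice of flow algorithm.

```python
from collections import defaultdict, deque


def count_disjoint_paths(edges, s, t):
    """Maximum number of internally vertex-disjoint s-t paths.

    Parallel edges between s and t each count as one path.
    """
    cap = defaultdict(int)               # residual capacities of arcs
    adj = defaultdict(set)               # residual adjacency (both directions)

    def add_arc(a, b, c):
        cap[(a, b)] += c
        adj[a].add(b)
        adj[b].add(a)

    vertices = {x for e in edges for x in e}
    for x in vertices:
        if x not in (s, t):
            add_arc((x, "in"), (x, "out"), 1)   # node capacity 1
    for a, b in edges:
        add_arc((a, "out"), (b, "in"), 1)
        add_arc((b, "out"), (a, "in"), 1)

    source, sink = (s, "out"), (t, "in")
    flow = 0
    while True:
        pred = {source: None}
        queue = deque([source])
        while queue and sink not in pred:       # BFS in the residual graph
            x = queue.popleft()
            for y in adj[x]:
                if y not in pred and cap[(x, y)] > 0:
                    pred[y] = x
                    queue.append(y)
        if sink not in pred:
            return flow
        y = sink
        while pred[y] is not None:              # push one unit along the path
            x = pred[y]
            cap[(x, y)] -= 1
            cap[(y, x)] += 1
            y = x
        flow += 1


# Three disjoint routes from 0 to 5, plus a cross edge between two of them.
edges = [(0, 1), (1, 5), (0, 2), (2, 5), (0, 3), (3, 4), (4, 5), (1, 2)]
print(count_disjoint_paths(edges, 0, 5))  # 3
```

The Improvement rule applies to a pair (u, v) when this count reaches k+2.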
Checking for the Flower rule?
Checking for the Flower rule If v is a flower with k petals, there is a selection of edges where: v has 2k of its edges selected All other vertices have either 2 or none of their edges selected
Flower with 3 petals (figure: number of selected edges per vertex, 6 at the center v and 2 at every petal vertex)
Checking for the Flower rule Reduce to perfect matching in general graphs Then “just” use Edmonds’ algorithm
{0,2}-Matching widget: No edges selected (vertex not in petal)
{0,2}-Matching widget: Two edges selected (vertex in petal)
Implementation: C++ with LEDA (Library of Efficient Data types and Algorithms) by ‘Algorithmic Solutions’: quite nice. Interesting experimental results.
Runtime on random graphs, k=20
Runtime on random graphs
Effectiveness of the kernelization After kernelization, one still needs to solve the problem on the kernel So compare runtime of solve, versus kernelize and solve kernel Following data: solve using a simple branch-and-bound
Effectiveness of the kernelization
Effective branching factor (histogram) With kernelization Without kernelization
UAI’06 Networks: runtime
UAI’06 Networks: effectiveness as heuristic (histogram)
Effectiveness on UAI’06 Networks: Unfortunately, those are way too big to find an optimal loop cutset for: thousands of nodes … Don’t actually need an optimal loop cutset for cutset conditioning. Kernelization improves quality of heuristics?
Random graphs
UAI’06 Networks
Some results from the experiments: The kernelization runs quickly. Kernelization helps practical exact runtime a lot! Kernelization helps the Becker/Geiger 2-approximation quality a lot! As a heuristic, seems much more practical than the Becker/Bar-Yehuda/Geiger $O^*(4^k)$ algorithm.
Kernelization for Loop Cutset
Loop Cutset. Instance: A directed acyclic graph G = (V,A). Parameter: An integer k. Question: Does there exist an $S \subseteq V$ of size at most k such that for all simple loops, S contains a vertex that is not a ‘sink’ on that loop?
Kernelization for Loop Cutset: Transform to Blackout FVS. Kernelize Blackout FVS (based on Bodlaender’s FVS kernelization). Transform back.
Blackout Feedback Vertex Set. Instance: An undirected graph G=(V,E) and a set $X \subseteq V$ of ‘blacked out’ vertices. Parameter: An integer k. Question: Does there exist an $S \subseteq V \setminus X$ of size at most k such that G[ V-S ] is acyclic?
Transform to Blackout FVS: Split all vertices v into $v_{in}$ and $v_{out}$. Black out all in-vertices.
Some additional rules Degree two rule becomes somewhat more complicated Handle the blacked out vertices…
Transform back to Loop Cutset: Possibly edges between two allowed vertices, or between an allowed and a blacked out vertex. Not between two blacked out vertices. Blacked out component rule.