Exploiting Structure in Symmetry Detection for CNF
Paul T. Darga, Mark H. Liffiton, Karem A. Sakallah, and Igor L. Markov
The University of Michigan

Hello! Next slide…
TODO in general: increase size of Greek fonts
Structure in SAT
- Human-designed artifacts possess considerable structure
  - Manifested in instances of Boolean satisfiability (SAT)
- Symmetry
  - Some rearrangement of the components of the design that preserves its structure
- Sparsity
  - Most design elements are directly related to only a few other elements in the whole design
- We can exploit this structure to speed up SAT solving!

TODO: generalize the idea of “structure” to designs; it is manifested in the CNF formulas derived from these generic designs
We don’t “define” symmetry; just say what it is… same with sparsity

The key observation is that in any domain, and in particular EDA, human-designed artifacts possess considerable structure. The human mind is capable of processing only so much complexity simultaneously, and so
Symmetry Breaking
- Shatter (Aloul et al.)
  - Converts the CNF formula to a colored undirected graph (Crawford)
  - Uses nauty (McKay), a graph symmetry detection tool, to find the symmetries present in the graph
  - Converts the symmetries back into additional symmetry-breaking predicates (SBPs)
  - Appends the SBPs to the original formula
- The resulting formula is often much faster to solve

Original formula: φ = (a'+b+c)(a+b'+c')(b'+c)
Symmetry: γ = (a,a')(b,c')(b',c)
  (a'+b+c)  → (a+b'+c')
  (a+b'+c') → (a'+b+c)
  (b'+c)    → (b'+c)
Formula with SBP appended: φ = (a'+b+c)(a+b'+c')(b'+c)(a')

TODO: replace Karnaugh map with:
- The list of clauses (vertically)
- Their mapping to clauses in the “symmetric” instance
- An explanation of the cycle notation for permutations
Eight assignments to search reduced to four
Mention that this is finding structural, not functional, symmetries
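To make the cycle notation concrete, here is a minimal sketch that checks whether a literal permutation is a structural symmetry of the example formula, that is, whether it maps the set of clauses onto itself. The string encoding of literals and the names `PHI`, `GAMMA`, `apply_permutation`, and `is_symmetry` are assumptions made for this illustration; they are not part of Shatter.

```python
# Literals are plain strings: "a" is the variable a, "a'" is its negation.
PHI = [{"a'", "b", "c"}, {"a", "b'", "c'"}, {"b'", "c"}]

# The permutation (a,a')(b,c')(b',c) written out as an explicit mapping.
GAMMA = {"a": "a'", "a'": "a", "b": "c'", "c'": "b", "b'": "c", "c": "b'"}


def apply_permutation(cnf, mapping):
    """Rename every literal; literals absent from the mapping stay fixed."""
    return {frozenset(mapping.get(lit, lit) for lit in clause) for clause in cnf}


def is_symmetry(cnf, mapping):
    """A structural symmetry maps the set of clauses onto itself."""
    return apply_permutation(cnf, mapping) == {frozenset(c) for c in cnf}


print(is_symmetry(PHI, GAMMA))              # the original formula
print(is_symmetry(PHI + [{"a'"}], GAMMA))   # the formula with (a') appended
```

The first check succeeds and the second fails: once the unit clause (a') is appended, the permutation would map it to (a), which is not in the formula. That is the sense in which the added predicate breaks the symmetry.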
Symmetry Breaking

Benchmark    Sym     Search   Total   % Sym
Hole-n        0.38    0.07     0.45   84.4
Urq           0.76    1.17     1.93   39.4
GRoute       38.76    5.08    43.84   88.4
FPGARoute     3.24    0.21     3.45   93.9
ChnlRoute    25.86    0.17    26.03   99.4
XOR          11.43    2.41    13.84   82.6
2pipe        23.50    8.01    31.51   74.6

- On all but the synthetic Urquhart instances, symmetry detection with nauty dominates the run time of the Shatter flow
- Further improvements must come from improved symmetry detection

TODO: remember to explain the table columns and highlight the %Sym information
Outline
- Graph construction
- nauty: "No Automorphisms, Yes?"
  - Problem description
  - Partition refinement
  - The search tree
- saucy: a new symmetry detection tool
  - Algorithmic improvements exploiting structure
  - Experimental results: saucy run time insignificant compared to SAT solver
- Conclusions and future work

TODO: give the bottom line here to entice the audience
Graph Construction for CNF
φ = C1 C2 C3 = (a'+b+c)(a+b'+c')(b'+c)
- Two vertices and one edge for each variable
- One vertex for each clause
- One edge for each literal in each clause
- Color literals and clauses differently
[Figure: graph with clause vertices C1, C2, C3 and literal vertices a, a', b, b', c, c']
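The construction rules above can be sketched in a few lines. This is an illustration only: the function name `cnf_to_graph`, the string encoding of literals ("a" and "a'"), and the clause names "C1", "C2", … are assumptions, and the sketch presumes no variable is itself named like a clause.

```python
def cnf_to_graph(clauses):
    """Build the colored graph for a CNF formula.

    clauses: list of lists of literal strings such as "a" or "a'".
    Returns (color, edges): a dict vertex -> color name and a set of
    undirected edges, each stored as a frozenset of two vertices.
    """
    color = {}
    edges = set()
    variables = sorted({lit.rstrip("'") for clause in clauses for lit in clause})
    for var in variables:
        # Two vertices and one edge for each variable.
        color[var] = "literal"
        color[var + "'"] = "literal"
        edges.add(frozenset((var, var + "'")))
    for index, clause in enumerate(clauses, start=1):
        # One vertex for each clause, colored differently from literals.
        name = f"C{index}"
        color[name] = "clause"
        for lit in clause:
            # One edge for each literal in the clause.
            edges.add(frozenset((name, lit)))
    return color, edges


color, edges = cnf_to_graph([["a'", "b", "c"], ["a", "b'", "c'"], ["b'", "c"]])
print(len(color), "vertices,", len(edges), "edges")
```

For the three-clause example this gives 9 vertices (6 literals, 3 clauses) and 11 edges (3 variable edges plus 8 literal occurrences).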
Symmetry Detection Problem
- What precisely is a symmetry of a graph G?
- A symmetry is a permutation γ of the labels assigned to vertices of G such that G^γ = G
- The set of all symmetries is denoted Aut(G)
[Figure: G next to two relabeled copies G^γ, drawn on the vertices a, a', b, b', c, c', C1, C2, C3]
γ = (a,a')(b,c')(b',c)(C1,C2): G^γ = G
γ = (a,b',c')(a',b,c): G^γ ≠ G, since C3 = (b'+c) would map to (c'+a'), which is not a clause

TODO: highlight edges as we reconstruct the graph on the right-hand side
Symmetry Detection Problem

1234               2134 (1,2)         3124 (1,3,2)       4123 (1,4,3,2)
1243 (3,4)         2143 (1,2)(3,4)    3142 (1,3,4,2)     4132 (1,4,2)
1324 (2,3)         2314 (1,2,3)       3214 (1,3)         4213 (1,4,3)
1342 (2,3,4)       2341 (1,2,3,4)     3241 (1,3,4)       4231 (1,4)
1423 (2,4,3)       2413 (1,2,4,3)     3412 (1,3)(2,4)    4312 (1,4,2,3)
1432 (2,4)         2431 (1,2,4)       3421 (1,3,2,4)     4321 (1,4)(2,3)

- Problem: there are n! possible labelings!
- Can we prune the search space?
[Figure: four-vertex graph with vertices labeled 1 to 4]

TODO: get rid of the naïve approach business
Highlight the green after showing the table
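The naive approach in the table amounts to trying all n! labelings and keeping those that preserve the edge set. A minimal sketch follows; the function name `automorphisms` and the two test graphs (a four-vertex path and a four-vertex cycle) are assumptions chosen for illustration, since the graph on the slide did not survive extraction.

```python
from itertools import permutations


def automorphisms(vertices, edges):
    """Brute force: test every one of the n! relabelings of the vertices."""
    edge_set = {frozenset(e) for e in edges}
    found = []
    for image in permutations(vertices):
        gamma = dict(zip(vertices, image))
        # gamma is a symmetry when it maps the edge set onto itself.
        if {frozenset((gamma[u], gamma[v])) for u, v in edges} == edge_set:
            found.append(gamma)
    return found


path = [(1, 2), (2, 3), (3, 4)]
cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(len(automorphisms([1, 2, 3, 4], path)))
print(len(automorphisms([1, 2, 3, 4], cycle)))
```

Of the 24 candidate labelings, the path keeps only the identity and the end-to-end reversal, while the cycle keeps 8. The factorial growth of the candidate set is what makes pruning necessary.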
Partition Refinement
- We can rule out many candidate labelings
  - Distinguish vertices that cannot possibly be mapped into each other by any symmetry
- Fast distinguishing method: degree
  - Select a color in the graph
  - Compute the number of connections every vertex has to that color
  - Distinguish vertices within colors based on that count
  - Repeat until coloring stabilizes
- Refinement distinguished all vertices! This graph has no symmetry besides the identity.
[Figure: refinement animated on an example graph, with per-vertex connection counts]

Mention at the end that we have a “discrete coloring”… important for the search tree
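The refinement loop described above can be sketched as follows. This is a simplified variant, not nauty's or saucy's implementation: instead of selecting one refining color at a time, it counts each vertex's connections to every color in each round, which should arrive at the same stable partition. The function name `refine`, the adjacency-dict input, and the seven-vertex test tree are assumptions for this example.

```python
def refine(adj, color):
    """Split colors by connection counts until the coloring is stable.

    adj:   dict vertex -> list of neighbours
    color: dict vertex -> integer color id
    """
    color = dict(color)
    while True:
        signature = {}
        for v in adj:
            counts = {}
            for w in adj[v]:
                counts[color[w]] = counts.get(color[w], 0) + 1
            # Keep the old color in the signature so colors only ever split.
            signature[v] = (color[v], tuple(sorted(counts.items())))
        ids = {s: i for i, s in enumerate(sorted(set(signature.values())))}
        refined = {v: ids[signature[v]] for v in adj}
        if len(ids) == len(set(color.values())):
            return refined  # no color was split: the coloring is stable
        color = refined


# A path 0-1-2-3-4-5 with an extra leaf 6 attached to vertex 2.
tree = {0: [1], 1: [0, 2], 2: [1, 3, 6], 3: [2, 4], 4: [3, 5], 5: [4], 6: [2]}
stable = refine(tree, {v: 0 for v in tree})
print(len(set(stable.values())), "colors for", len(tree), "vertices")
```

On this tree every vertex ends up in its own color (a discrete coloring), so the only symmetry is the identity. On a three-vertex path the two endpoints stay together, reflecting the swap symmetry.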
Partition Refinement
- In a stable coloring:
  - vertices in different colors definitely cannot map into each other in any symmetry of the graph
  - vertices in the same color may map into each other (i.e. refinement is only an approximation)
[Figure: example graph on vertices 1 to 7, and the CNF graph with vertices C1, C2, C3, a, a', b, b', c, c']

TODO: get rid of some text here, especially on the second half
Make sure to define “stable coloring” as one where no further refinement is possible
The Search Tree
[Figure: search tree over colorings of a six-vertex graph with vertices 0 to 5; first leaf ζ = 035421]
- Select a non-singleton color T (for target) and generate |T| colorings, each with one element of T artificially distinguished from the remainder of T
- Discrete colorings (leaf nodes) yield likely symmetries
- Out of 6! = 720 possible labelings, partition refinement pruned away all but the six symmetries of the graph

γ0 = (identity)
γ1 = (2,4)(3,5)
γ2 = (0,3)(1,2)
γ3 = (0,3,5)(1,2,4)
γ4 = (0,5,3)(1,4,2)
γ5 = (0,5)(1,4)

Suppose we are given this graph, and we wish to find its symmetries. All vertices are initially colored the same, meaning there are 6 factorial, or 720, different possible labelings of this graph.

We run partition refinement on this coloring and are able to divide the vertices into two groups, representing the inner triangle and the outer vertices. This has left us with 3 factorial times 3 factorial, or 36, labelings remaining to examine. However, partition refinement cannot help us any further with this coloring, since it is stable.

Therefore, we select one color in the stable coloring and create new colorings, one for each vertex of the selected color, in which one vertex is artificially distinguished from the others. This has, in effect, partitioned the labelings. The benefit of this operation is that by artificially coloring one vertex differently, we have created new colorings that are no longer stable, so we can again run partition refinement on them to further prune the number of labelings. We do this in a depth-first manner.

Looking at the leftmost coloring, we run partition refinement on it, and find that we’ve been able to prune away more labelings, but the new coloring is not discrete. We thus recursively run the search tree algorithm… … run partition refinement … and finally arrive at a discrete coloring. The first discrete coloring gets a special name: zeta. All the other discrete colorings we find during the search will be compared with zeta to produce the symmetries of the graph.

We backtrack to the leftmost node and examine its other child, run partition refinement on it, and get another discrete coloring. Note that we can get this coloring from zeta by swapping 2 and 4, and 3 and 5. This swapping is the permutation represented by this leaf of the search tree. We check the permutation against the graph and see that it is a symmetry of the graph, representing a horizontal reflection.

We then backtrack, and repeat this process with the next coloring, and find two more permutations that happen to be symmetries. Finally, we examine the last coloring, and get two more symmetries. At this point, the search ends, since there are no unexamined colorings. We see that in the end, six discrete colorings were examined, rather than the 720 labelings we would have examined without partition refinement. These six symmetries are the entire set of symmetries for this graph.
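The walk-through above can be reproduced with a small individualize-and-refine search. This is a sketch without any generator-based pruning, not the nauty or saucy code. The graph is an assumption: the drawing did not survive, so the example uses a triangle on vertices 0, 3, 5 with one outer vertex attached to each corner (1, 2, 4), a labeling that is consistent with the six permutations listed. The names `refine`, `search`, and `symmetries` are hypothetical.

```python
def refine(adj, color):
    """Split colors by neighbour-color counts until stable (label-independent)."""
    color = dict(color)
    while True:
        sig = {}
        for v in adj:
            counts = {}
            for w in adj[v]:
                counts[color[w]] = counts.get(color[w], 0) + 1
            sig[v] = (color[v], tuple(sorted(counts.items())))
        ids = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        refined = {v: ids[sig[v]] for v in adj}
        if len(ids) == len(set(color.values())):
            return refined
        color = refined


def search(adj, color, leaves):
    """Depth-first search tree: refine, pick a target color, individualize."""
    color = refine(adj, color)
    cells = {}
    for v, c in color.items():
        cells.setdefault(c, []).append(v)
    target = next((cells[c] for c in sorted(cells) if len(cells[c]) > 1), None)
    if target is None:
        leaves.append(color)  # discrete coloring: a leaf of the tree
        return
    for v in target:
        # Place v "in front of" the rest of its color, keeping color order.
        child = {u: 2 * c + 1 for u, c in color.items()}
        child[v] = 2 * color[v]
        search(adj, child, leaves)


def symmetries(adj):
    """Compare every leaf with the first leaf (zeta) and keep real symmetries."""
    leaves = []
    search(adj, {v: 0 for v in adj}, leaves)
    zeta = leaves[0]
    found = []
    for leaf in leaves:
        at_position = {pos: v for v, pos in leaf.items()}
        gamma = {v: at_position[zeta[v]] for v in adj}
        # A leaf only yields a candidate; verify it against the graph.
        if all(gamma[w] in adj[gamma[v]] for v in adj for w in adj[v]):
            found.append(gamma)
    return found


graph = {0: [3, 5, 1], 3: [0, 5, 2], 5: [0, 3, 4], 1: [0], 2: [3], 4: [5]}
for gamma in symmetries(graph):
    print({v: w for v, w in gamma.items() if v != w} or "identity")
```

On this graph the search visits six leaves and all six pass the check, matching the count on the slide. The explicit verification step matters in general, because a discrete coloring only yields a likely symmetry.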
Pruning Using Generators
- Too many symmetries: |Aut(G)| is O(n!)
- Group theory provides the answer: generators
  - Irredundant set H ⊆ Aut(G) implicitly represents entire set of symmetries
  - Exponential compression: |H| ≤ log2 |Aut(G)|
- We prune away subtrees guaranteed to yield symmetries that we can already generate with previously discovered symmetries

TODO: focus on major bullets
Review proof that we can get exponential compression
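As a small illustration of how a generator set stands in for the whole group, the sketch below closes a set of permutations under composition. It uses two of the six symmetries from the search-tree example, (2,4)(3,5) and (0,3)(1,2), written as tuples where position i holds the image of i. The names `compose` and `generate` are assumptions for this example.

```python
from math import log2


def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))


def generate(generators):
    """Close a set of permutations under composition (finite group closure)."""
    identity = tuple(range(len(generators[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            gh = compose(g, h)
            if gh not in group:
                group.add(gh)
                frontier.append(gh)
    return group


H = [
    (0, 1, 4, 5, 2, 3),  # (2,4)(3,5)
    (3, 2, 1, 0, 4, 5),  # (0,3)(1,2)
]
group = generate(H)
print(len(H), "generators produce", len(group), "symmetries")
print(len(H) <= log2(len(group)))
```

Two generators recover all six symmetries, and 2 ≤ log2 6 ≈ 2.58, in line with the bound on the slide. For groups with exponentially many elements, the saving is correspondingly large.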
saucy: Exploiting Structure
- nauty works very well on small graphs (and thus small formulas) but fails to scale
  - Takes considerably longer than the SAT solver after adding SBPs to the CNF formula
  - Runs out of memory on formulas with corresponding graphs having >50,000 vertices
- saucy improvement #1: sparse representation
- saucy improvement #2: use knowledge of graph construction
  - Clause vertices only connected to their literals
  - Never connected to each other

TODO: rework #2 so that the structure stuff are sub-bullets
saucy: Algorithmic Improvements
[Figure: colors grouped as positive literals, negative literals, and clauses]
- nauty:
  - Iterate over all colors
  - For each vertex, count connections to refining color, and sort
- saucy improvement #3:
  - Determine directly connected colors
- saucy improvement #4:
  - For each vertex in refining color, count connections
  - For every color touched, sort the counts
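The difference between the two counting strategies can be sketched as below. This is an illustration under assumptions, not either tool's implementation: both functions use adjacency lists, the names `counts_all_vertices` and `counts_from_refining_color` are hypothetical, and the graph is the colored graph of (a'+b+c)(a+b'+c')(b'+c).

```python
from collections import defaultdict


def counts_all_vertices(adj, refining):
    """First strategy: visit every vertex and count its links into `refining`."""
    members = set(refining)
    return {v: sum(1 for w in adj[v] if w in members) for v in adj}


def counts_from_refining_color(adj, color, refining):
    """Second strategy: walk only the adjacency lists of the refining color.

    Returns a dict: touched color -> {vertex: connection count}. Colors with
    no connection to the refining color never appear, so no work is spent
    on them.
    """
    counts = defaultdict(int)
    for v in refining:
        for w in adj[v]:
            counts[w] += 1
    touched = defaultdict(dict)
    for w, k in counts.items():
        touched[color[w]][w] = k
    return dict(touched)


adj = {
    "C1": ["a'", "b", "c"], "C2": ["a", "b'", "c'"], "C3": ["b'", "c"],
    "a": ["a'", "C2"], "a'": ["a", "C1"],
    "b": ["b'", "C1"], "b'": ["b", "C2", "C3"],
    "c": ["c'", "C1", "C3"], "c'": ["c", "C2"],
}
color = {v: ("clause" if v.startswith("C") else "literal") for v in adj}
touched = counts_from_refining_color(adj, color, ["C1", "C2", "C3"])
print(sorted(touched))                 # only colors adjacent to the clauses
print(sorted(touched["literal"].items()))
```

When refining with the clause color, only the literal color is touched; the clause color itself is never visited, which is the effect described for clause-to-clause connections. Sorting the counts per touched color then splits the literals into those in one clause and those in two.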
saucy: Asymptotic Performance
- Partition refinement
  - nauty implementation: O(n^3)
  - saucy improvement #4: O(n^2 log n)
- Search tree size
  - Worst case: exponential
  - No “bad leaves”: O(n^3)
  - In practice: O(n)
- Complete algorithm
  - No “bad leaves”: O(n^5 log n)
  - Much lower in practice
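The bound for the complete algorithm is consistent with multiplying the two bounds above it, presumably one refinement per search-tree node. Written out (the per-node interpretation is an assumption, not stated on the slide):

```latex
\[
\underbrace{O(n^{3})}_{\text{search-tree nodes, no bad leaves}}
\;\times\;
\underbrace{O(n^{2}\log n)}_{\text{partition refinement per node}}
\;=\; O(n^{5}\log n)
\]
```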
saucy: Empirical Performance

Inst.       n       zChaff   nauty     % Sym   saucy   % Sym
s4-4-3-1    10354   218.53     88.74   28.9    0.11     0.05
s4-4-3-2     9974   877.59     79.67    8.3    0.10     0.01
s4-4-3-3     9970   884.78     75.98    7.9    0.09
s4-4-3-4    10714   464.46    155.31   25.1    0.14     0.03
s4-4-3-5    11072   134.09    101.63   43.1             0.08
s4-4-3-6     9620    13.24     76.48   85.2             0.75
s4-4-3-7    10362    18.27     78.96   81.2             0.54
s4-4-3-8     6608     0.68     28.42   97.7    0.06     8.11
2pipe        3575     0.13      2.93   95.8    0.02    13.33
3pipe       10048     6.44     57.53   89.9             1.98
4pipe       21547   153.50    523.64   77.3    0.49     0.32
5pipe       38746   122.85   3144.85   96.2    1.65     1.33
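The % Sym figures appear to be the share of total run time spent on symmetry detection, that is, detection time divided by detection time plus SAT solving time. That reading is an inference from the numbers, and the helper name `percent_sym` is hypothetical. A quick check against two fully populated rows:

```python
def percent_sym(sym_time, sat_time):
    """Share of the total flow (symmetry detection + SAT solving) spent on detection."""
    return 100.0 * sym_time / (sym_time + sat_time)


# (zChaff time, nauty time, saucy time), taken from the table.
rows = {
    "s4-4-3-1": (218.53, 88.74, 0.11),
    "5pipe": (122.85, 3144.85, 1.65),
}
for name, (zchaff, nauty, saucy) in rows.items():
    print(name,
          round(percent_sym(nauty, zchaff), 1),
          round(percent_sym(saucy, zchaff), 2))
```

Under this reading the computed values reproduce the table entries for both rows (28.9 and 0.05 for s4-4-3-1; 96.2 and 1.33 for 5pipe).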
saucy : Empirical Performance TODO: possibly get rid of the bullets
Conclusions and Future Work
- CNF formulas from EDA applications exhibit considerable structure (symmetry and sparsity)
- saucy, a new implementation of the nauty symmetry-detection system
  - Exploits structure to improve symmetry detection performance by several orders of magnitude
  - Symmetry-detection time insignificant compared to SAT solver
- Future work
  - Apply saucy to more sparse domains which may benefit from symmetry detection
  - Find other applications of partition refinement, a surprisingly general framework for distinguishing objects in a finite domain

TODO: mention again the %sym number
Thank You!
saucy: Exploiting Structure
- Graphs from typical CNF formulas possess a particular structure
  - By construction, clauses are never connected to other clauses
  - Thus, when refining the partition with a color of clauses, we can ignore all colors containing clauses, since we know that the connection count for every vertex will be zero
- Such graphs are also very sparse
  - Few literals are connected to most clauses
  - Few clauses are connected to most literals
  - Thus, we aggressively avoid work by maintaining data structures (like adjacency lists) which explicitly direct the search and refinement procedures
saucy example: Hole-3
saucy: Exploiting Structure
- We can generalize this idea of avoiding obviously disconnected colors
- saucy improvement #3
  - Iterate over a color's adjacency lists to determine connected colors
  - Compute connection counts only for those colors
saucy example: Hole-3
[Figure: Hole-3 graph with colors grouped as positive literals, negative literals, and clauses; nauty compared with saucy improvement #3]
- nauty:
  - Iterate over all colors
  - For each vertex, count connections to refining color, and sort
- saucy improvement #3:
  - Determine directly connected colors
- saucy improvement #4:
  - For each vertex in refining color, count connections
  - For every color touched, sort the counts
saucy: Exploiting Structure
- nauty works very well on small graphs (and thus small formulas) but fails to scale
  - Takes considerably longer than the SAT solver after adding SBPs
  - Runs out of memory on formulas with corresponding graphs having >50,000 vertices
- saucy improvement #1: sparse representation
  - Input graph is represented in adjacency-list format
saucy: Exploiting Structure
- Graphs from CNF formulas possess a particular structure
  - Clause vertices only connected to their literals
  - Never connected to each other
- saucy improvement #2: ignore colors containing clauses when refining with clauses
  - We know that the connection count for every vertex will be zero
Symmetry Detection Problem
- What precisely is a symmetry of a graph G?
- A symmetry is a permutation γ of the labels assigned to vertices of G such that G^γ = G
- The set of all symmetries is denoted Aut(G)
[Figure: G next to two relabeled copies G^γ, drawn on the vertices a, a', b, b', c, c', C1, C2, C3]
γ = (a,a')(b,c')(b',c)(C1,C2): G^γ = G
γ = (a,c',b')(a',c,b): G^γ ≠ G, since C3 = (b'+c) would map to (a+b), which is not a clause
The Search Tree
- We have a stable ordered partition of the vertices of the graph. How can we extract Aut(G)?
  - Recall the naïve approach: we need labelings (i.e. discrete colorings)
- We select a non-singleton color T (for target) and generate |T| colorings, each with one element of T individualized "in front of" the remainder of T
  - Partitions the set of all discrete colorings descended from that coloring
  - We can then recursively apply partition refinement to further prune the search space!
- Fix the first discrete coloring found as ζ; the remaining discrete colorings yield likely candidates for Aut(G)
The Search Tree
[Figure: search tree over colorings of a six-vertex graph with vertices 0 to 5; first leaf ζ = 035421]
- Out of 6! = 720 possible labelings, partition refinement pruned away 714, leaving the six symmetries of the graph
- Discrete colorings do not necessarily yield symmetries
  - Refinement is only an approximation
  - Only occurs on highly regular graphs, which are uncommon in EDA applications

γ0 = (identity)
γ1 = (2,4)(3,5)
γ2 = (0,3)(1,2)
γ3 = (0,3,5)(1,2,4)
γ4 = (0,5,3)(1,4,2)
γ5 = (0,5)(1,4)
saucy: Empirical Performance

Instance    Vertices   nauty      saucy #4   Speedup   zChaff    w/SBPs
Urq              299      0.02     0.01         2.00     1.73
Hole             301      0.025                 2.50   523.39
Xor              464      0.06                  6.00   timeout     1.75
Fpga             671      0.1825   0.0125      14.60               0.03
s4-4-3-1       10354     88.74     0.11       806.73   441.18    218.53
s4-4-3-2        9974     79.67     0.1        796.70   204.29    877.59
s4-4-3-3        9970     75.98     0.09       844.22             884.78
s4-4-3-4       10714    155.31     0.14      1109.36             464.46
s4-4-3-5       11072    101.63                923.91             134.09
s4-4-3-6        9620     76.48                764.80   679.13     13.24
s4-4-3-7       10362     78.96                789.60   831.01     18.27
s4-4-3-8        6608     28.42                473.67   123.82      0.68
s4-4-3-9       12920    209.52     0.17      1232.47              75.21
2pipe           3575      2.93                146.50     0.18      0.13
3pipe          10048     57.53                442.54     3.20      6.44
4pipe          21547    523.64     0.49      1068.65   228.82    153.50
5pipe          38746   3144.85     1.65      1905.97   347.92    122.85
Inst.   n     zChaff    SBPs   nauty    %Sym    saucy    %Sym
Urq     299     1.73    0.02            50.0%   0.01     33.3%
Hole    301   523.39           0.025    55.6%
Xor     464   timeout   1.75   0.06      3.3%             0.6%
Fpga    671             0.03   0.1825   85.9%   0.0125
saucy: Empirical Performance

Inst.   n       zChaff   nauty     % Sym   saucy   % Sym
2pipe    3575     0.13      2.93   95.8    0.02    13.3
3pipe   10048     6.44     57.53   89.9             1.98
4pipe   21547   153.50    523.64   77.3    0.49     0.32
5pipe   38746   122.85   3144.85   96.2    1.65     1.33
saucy: Runtime Performance
- Speedup is roughly linear in the number of vertices
  - Primarily due to efficient use of sparsity within the partition refinement procedure
  - Search tree maintenance has relatively low overhead
- We ran saucy and nauty on the complement graphs
  - Isomorphic search trees!
  - Isolate performance difference in refinement
  - Slowdown is roughly linear, which is expected given difference in representation
Ordered Partitions
- The partition refinement algorithm based on degree operates independently of the labeling of the graph
- To guarantee identical partition representations for isomorphic graphs, we impose an ordering on the colors in the partition
  - When a color is split, the new colors are assigned in sorted order of degree with the refining color
  - Refining colors are chosen based on position within the partition ordering, not based on label
- The search algorithm absolutely requires the refinement procedure to be labeling-independent!
Partition Refinement
- In a stable coloring:
  - vertices in different colors definitely cannot map into each other in any symmetry of the graph
  - vertices in the same color may map into each other (i.e. refinement is only an approximation)
[Figure: graph on vertices 1 to 7 made of a triangle and a square, and the CNF graph with vertices C1, C2, C3, a, a', b, b', c, c']
- Refinement can't distinguish any vertices (all have degree two)
- Vertices in the triangle and square cannot map into each other
- Fortunately, this rarely happens with EDA instances
- An exact, polynomial-time partition refinement algorithm would prove that the graph isomorphism problem is in P

TODO: get rid of some text here, especially on the second half
Additional Pruning Methods
- Enumerating all symmetries is not an option
  - |Aut(G)| is O(n!)
  - Many EDA-related instances possess exponentially many symmetries
- Group theory provides the answer: generators
  - Find a set H ⊆ Aut(G) that implicitly represents the entire set of symmetries
  - Every element of Aut(G) is a product (composition) of integer powers of elements of H
  - Exponential compression: |H| ≤ log2 |Aut(G)|
- We prune away subtrees guaranteed to yield symmetries that we can already generate with previously discovered symmetries

TODO: focus on major bullets
Review proof that we can get exponential compression