Exploiting Structure in Symmetry Detection for CNF

Presentation transcript:

Exploiting Structure in Symmetry Detection for CNF
Paul T. Darga, Mark H. Liffiton, Karem A. Sakallah, and Igor L. Markov
The University of Michigan
Notes: Hello! Next slide… TODO in general: increase size of Greek fonts.

Structure in SAT
- Human-designed artifacts possess considerable structure
  - Manifested in instances of Boolean satisfiability (SAT)
- Symmetry
  - Some rearrangement of the components of the design that preserves its structure
- Sparsity
  - Most design elements are directly related to only a few other elements in the whole design
- We can exploit this structure to speed up SAT solving!
Notes: TODO: generalize the idea of "structure" to designs; it is manifested in the CNF formulas derived from these generic designs. We don't "define" symmetry; just say what it is… same with sparsity. The key observation is that in any domain, and in particular EDA, human-designed artifacts possess considerable structure. The human mind is capable of processing only so much complexity simultaneously, and so

Symmetry Breaking
Example: φ = (a'+b+c)(a+b'+c')(b'+c) has the symmetry (a,a')(b,c')(b',c), which swaps the clauses (a'+b+c) and (a+b'+c') and maps (b'+c) to itself. With a symmetry-breaking predicate appended: φ = (a'+b+c)(a+b'+c')(b'+c)(a')
- Shatter (Aloul et al.)
  - Converts the CNF formula to a colored undirected graph (Crawford)
  - Uses nauty (McKay), a graph symmetry detection tool, to find the symmetries present in the graph
  - Converts the symmetries back into additional symmetry-breaking predicates (SBPs)
  - Appends the SBPs to the original formula
- The resulting formula is often much faster to solve
Notes: TODO: replace Karnaugh map with: the list of clauses (vertically); their mapping to clauses in the "symmetric" instance; an explanation of the cycle notation for permutations; eight assignments to search reduced to four. Mention that this is finding structural, not functional, symmetries.
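
The example on this slide can be checked by brute force. The following is a minimal sketch, not Shatter's implementation: the helper names (`neg`, `is_symmetry`, `models`) and the string encoding of literals are assumptions made for illustration. It tests whether the permutation (a,a')(b,c')(b',c) maps the clause set onto itself and counts satisfying assignments before and after the clause (a') is appended.

```python
from itertools import product

def neg(lit):
    """Complement a literal written as x or x'."""
    return lit[:-1] if lit.endswith("'") else lit + "'"

def is_symmetry(perm, clauses):
    """True if the literal permutation maps the clause set onto itself."""
    # A valid literal permutation must map complements to complements.
    consistent = all(perm[neg(l)] == neg(perm[l]) for l in perm)
    image = {frozenset(perm[l] for l in c) for c in clauses}
    return consistent and image == {frozenset(c) for c in clauses}

def models(clauses, variables="abc"):
    """Brute-force list of satisfying assignments."""
    found = []
    for bits in product([False, True], repeat=len(variables)):
        true_lits = {v if bit else v + "'" for v, bit in zip(variables, bits)}
        if all(true_lits & c for c in clauses):
            found.append(bits)
    return found

phi = [{"a'", "b", "c"}, {"a", "b'", "c'"}, {"b'", "c"}]
# The permutation (a,a')(b,c')(b',c) written as a literal-to-literal map.
gamma = {"a": "a'", "a'": "a", "b": "c'", "c'": "b", "b'": "c", "c": "b'"}
phi_sbp = phi + [{"a'"}]  # formula with the clause (a') appended

print(is_symmetry(gamma, phi))
print(len(models(phi)), len(models(phi_sbp)))
```

In this sketch the original formula has four models and the formula with (a') has two, so satisfiability is preserved while a is fixed to false and only four of the eight assignments remain to be searched.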

Symmetry Breaking
Benchmark   Sym     Search  Total   % Sym
Hole-n      0.38    0.07    0.45    84.4
Urq         0.76    1.17    1.93    39.4
GRoute      38.76   5.08    43.84   88.4
FPGARoute   3.24    0.21    3.45    93.9
ChnlRoute   25.86   0.17    26.03   99.4
XOR         11.43   2.41    13.84   82.6
2pipe       23.50   8.01    31.51   74.6
- On all but the synthetic Urquhart instances, symmetry detection with nauty dominates the run time of the Shatter flow
- Further improvements must come from improved symmetry detection
Notes: TODO: remember to explain the table columns and highlight the %Sym information.
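
The table values are consistent with Total = Sym + Search and % Sym = Sym / Total. The short sketch below recomputes the last column under that assumption; the function name `percent_sym` is hypothetical, and small differences from the slide (for example on ChnlRoute) presumably come from rounding of the displayed times.

```python
# (Sym, Search, % Sym as shown on the slide)
rows = {
    "Hole-n": (0.38, 0.07, 84.4),
    "Urq": (0.76, 1.17, 39.4),
    "GRoute": (38.76, 5.08, 88.4),
    "FPGARoute": (3.24, 0.21, 93.9),
    "ChnlRoute": (25.86, 0.17, 99.4),
    "XOR": (11.43, 2.41, 82.6),
    "2pipe": (23.50, 8.01, 74.6),
}

def percent_sym(sym, search):
    """Share of the total run time spent on symmetry detection."""
    return 100.0 * sym / (sym + search)

for name, (sym, search, shown) in rows.items():
    print(f"{name:10s} {percent_sym(sym, search):5.1f}  (slide: {shown})")
```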

Outline
- Graph construction
- nauty: "No Automorphisms, Yes?"
  - Problem description
  - Partition refinement
  - The search tree
- saucy: a new symmetry detection tool
  - Algorithmic improvements exploiting structure
  - Experimental results: saucy run time insignificant compared to SAT solver
- Conclusions and future work
Notes: TODO: give the bottom line here to entice the audience.

Graph Construction for CNF
φ = C1 C2 C3 = (a'+b+c)(a+b'+c')(b'+c)
- Two vertices and one edge for each variable
- One vertex for each clause
- One edge for each literal in each clause
- Color literals and clauses differently
[Figure: the graph for φ, with clause vertices C1, C2, C3 and literal vertices a, a', b, b', c, c']
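
The four construction rules can be sketched in a few lines. This is an illustrative sketch only: the function name `cnf_to_graph`, the adjacency-set representation, and the color names "literal" and "clause" are assumptions, not the data structures used by Shatter or saucy.

```python
def cnf_to_graph(clauses):
    """Build the colored graph for a CNF formula given as lists of literals."""
    adj, color = {}, {}

    def add_edge(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Two vertices and one edge for each variable.
    variables = sorted({lit.rstrip("'") for clause in clauses for lit in clause})
    for x in variables:
        add_edge(x, x + "'")
        color[x] = color[x + "'"] = "literal"

    # One vertex per clause, one edge per literal occurrence.
    for i, clause in enumerate(clauses, start=1):
        name = f"C{i}"
        color[name] = "clause"
        adj.setdefault(name, set())
        for lit in clause:
            add_edge(name, lit)
    return adj, color

phi = [["a'", "b", "c"], ["a", "b'", "c'"], ["b'", "c"]]
adj, color = cnf_to_graph(phi)
num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
print(len(adj), num_edges)
```

For the slide's formula this gives 9 vertices (6 literals, 3 clauses) and 11 edges (3 variable edges plus 8 literal occurrences).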

Symmetry Detection Problem
- What precisely is a symmetry of a graph G?
- A symmetry is a permutation γ of the labels assigned to vertices of G such that G^γ = G
- The set of all symmetries is denoted Aut(G)
- γ = (a,a')(b,c')(b',c)(C1,C2) gives G^γ = G
- γ = (a,b',c')(a',b,c) gives G^γ ≠ G
[Figure: the graph G next to its relabelings under the two permutations]
Notes: TODO: highlight edges as we reconstruct the graph on the right-hand side.
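
The definition can be tested directly on the graph for (a'+b+c)(a+b'+c')(b'+c). The sketch below assumes an explicit edge list and hypothetical helpers `parse_cycles` and `is_symmetry`; it applies a permutation given in cycle notation and checks that colors and the edge set are preserved.

```python
edges = [("a", "a'"), ("b", "b'"), ("c", "c'"),
         ("C1", "a'"), ("C1", "b"), ("C1", "c"),
         ("C2", "a"), ("C2", "b'"), ("C2", "c'"),
         ("C3", "b'"), ("C3", "c")]
color = {v: ("clause" if v.startswith("C") else "literal")
         for edge in edges for v in edge}

def parse_cycles(text):
    """Turn cycle notation such as (a,a')(b,c') into a dict; fixed points are omitted."""
    perm = {}
    for cycle in text.strip("()").split(")("):
        items = cycle.split(",")
        for src, dst in zip(items, items[1:] + items[:1]):
            perm[src] = dst
    return perm

def is_symmetry(perm):
    """True if relabeling every vertex by perm preserves colors and edges."""
    def image(v):
        return perm.get(v, v)
    same_colors = all(color[v] == color[image(v)] for v in color)
    mapped = {frozenset((image(u), image(v))) for u, v in edges}
    return same_colors and mapped == {frozenset(e) for e in edges}

print(is_symmetry(parse_cycles("(a,a')(b,c')(b',c)(C1,C2)")))
print(is_symmetry(parse_cycles("(a,b',c')(a',b,c)")))
```

The second permutation fails because it sends the edge between C3 and b' to a pair (C3, c') that is not an edge.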

Symmetry Detection Problem
π     γ           π     γ           π     γ           π     γ
1234  ι           2134  (1,2)       3124  (1,3,2)     4123  (1,4,3,2)
1243  (3,4)       2143  (1,2)(3,4)  3142  (1,3,4,2)   4132  (1,4,2)
1324  (2,3)       2314  (1,2,3)     3214  (1,3)       4213  (1,4,3)
1342  (2,3,4)     2341  (1,2,3,4)   3241  (1,3,4)     4231  (1,4)
1423  (2,4,3)     2413  (1,2,4,3)   3412  (1,3)(2,4)  4312  (1,4,2,3)
1432  (2,4)       2431  (1,2,4)     3421  (1,3,2,4)   4321  (1,4)(2,3)
- Problem: there are n! possible labelings!
- Can we prune the search space?
[Figure: example graph with vertices labeled 1 to 4]
Notes: TODO: get rid of the naïve approach business. Highlight the green after showing the table.
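
The naïve approach amounts to trying all n! labelings. A minimal sketch follows; the four-vertex path used here is a hypothetical stand-in, since the slide's example graph is not recoverable from the transcript.

```python
from itertools import permutations

def naive_symmetries(vertices, edges):
    """Try every relabeling and keep those that preserve the edge set."""
    edge_set = {frozenset(e) for e in edges}
    found = []
    for image in permutations(vertices):
        perm = dict(zip(vertices, image))
        if {frozenset((perm[u], perm[v])) for u, v in edges} == edge_set:
            found.append(perm)
    return found

# Assumed example: the path 1 - 2 - 3 - 4.
vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 4)]
symmetries = naive_symmetries(vertices, edges)
print(len(symmetries), "of 24 labelings are symmetries")
```

Only the identity and the reversal survive for the path, but all 24 candidates had to be examined, which is what partition refinement is meant to avoid.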

Partition Refinement
- We can rule out many candidate labelings
  - Distinguish vertices that cannot possibly be mapped into each other by any symmetry
- Fast distinguishing method: degree
  - Select a color in the graph
  - Compute the number of connections every vertex has to that color
  - Distinguish vertices within colors based on that count
  - Repeat until coloring stabilizes
- Refinement distinguished all vertices! This graph has no symmetry besides the identity.
[Figure: example graph whose vertex colors are updated at each refinement step]
Notes: Mention at the end that we have a "discrete coloring"… important for the search tree.
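
The degree-based refinement loop can be sketched as follows. This is a simple quadratic illustration of the steps on this slide, not nauty's or saucy's implementation; the name `refine_by_degree` and both example graphs are assumptions (the slide's own graph is not recoverable from the transcript).

```python
def refine_by_degree(adj, cells):
    """Split colors (cells) until every vertex in a cell has the same
    number of neighbours in every cell, i.e. the coloring is stable."""
    cells = [list(c) for c in cells]
    changed = True
    while changed:
        changed = False
        for refiner in cells:                 # select a color
            members = set(refiner)
            for i, cell in enumerate(cells):
                groups = {}
                for v in cell:                # count connections to that color
                    groups.setdefault(len(adj[v] & members), []).append(v)
                if len(groups) > 1:           # distinguish vertices by count
                    cells[i:i + 1] = [groups[k] for k in sorted(groups)]
                    changed = True
                    break
            if changed:
                break                         # restart with the new coloring
    return cells

# Assumed example 1: a path 1-2-3-4-5-6 with an extra leaf 7 attached to 3.
tree = {1: {2}, 2: {1, 3}, 3: {2, 4, 7}, 4: {3, 5}, 5: {4, 6}, 6: {5}, 7: {3}}
# Assumed example 2: a 4-cycle, where every vertex looks the same.
cycle = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}

print(refine_by_degree(tree, [sorted(tree)]))
print(refine_by_degree(cycle, [sorted(cycle)]))
```

On the tree, refinement ends in a discrete coloring (seven singleton cells), so the identity is its only symmetry. On the 4-cycle nothing is split, and a search is still needed.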

Partition Refinement
- In a stable coloring:
  - vertices in different colors definitely cannot map into each other in any symmetry of the graph
  - vertices in the same color may map into each other (i.e. refinement is only an approximation)
[Figure: a seven-vertex example graph (vertices 1 to 7) and the CNF graph with clause vertices C1, C2, C3]
Notes: TODO: get rid of some text here, especially on the second half. Make sure to define "stable coloring" as one where no further refinement is possible.

The Search Tree
- Select a non-singleton color T (for target) and generate |T| colorings, each with one element of T artificially distinguished from the remainder of T
- Discrete colorings (leaf nodes) yield likely symmetries
- First discrete coloring: ζ = 035421
- Symmetries found: γ0 = ι, γ1 = (2,4)(3,5), γ2 = (0,3)(1,2), γ3 = (0,3,5)(1,2,4), γ4 = (0,5,3)(1,4,2), γ5 = (0,5)(1,4)
- Out of 6! = 720 possible labelings, partition refinement pruned away all but the six symmetries of the graph
[Figure: the search tree of colorings for a six-vertex graph with vertices 0 to 5]
Notes: Suppose we are given this graph, and we wish to find its symmetries. All vertices are initially colored the same, meaning there are 6 factorial, or 720, different possible labelings of this graph. We run partition refinement on this coloring and are able to divide the vertices into two groups, representing the inner triangle and the outer vertices. This has left us with 3 factorial times 3 factorial, or 36, labelings remaining to examine. However, partition refinement cannot help us anymore with this coloring since it is stable.
Therefore, we select one color in the stable coloring and create new colorings, one for each vertex of the selected color, in which one vertex is artificially distinguished from the others. This has, in effect, partitioned the labelings. The benefit of this operation is that by artificially coloring one vertex differently, we have created new colorings that are no longer stable, so we can again run partition refinement on them to further prune the number of labelings. We do this in a depth-first manner.
Looking at the leftmost coloring, we run partition refinement on it, and find that we've been able to prune away more labelings, but the new coloring is not discrete. We thus recursively run the search tree algorithm… … run partition refinement … and finally arrive at a discrete coloring. The first discrete coloring gets a special name: zeta (ζ). All the other discrete colorings we find during the search will be compared with zeta to produce the symmetries of the graph.
We backtrack to the leftmost node and examine its other child, run partition refinement on it, and get another discrete coloring. Note that we can get this coloring from zeta by swapping 2 and 4, and 3 and 5. This swapping is the permutation represented by this leaf of the search tree. We check the permutation against the graph and see that it is a symmetry of the graph, representing a horizontal reflection. We then backtrack, and repeat this process with the next coloring, and find two more permutations that happen to be symmetries. Finally, we examine the last coloring, and get two more symmetries. At this point, the search ends, since there are no unexamined colorings. We see that in the end, six discrete colorings were examined, rather than the 720 labelings we would have examined without partition refinement. These six symmetries are the entire set of symmetries for this graph.
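
The walk-through above can be reproduced with a small individualize-and-refine search. Everything here is a sketch under stated assumptions: the graph is an assumed reconstruction (triangle 0-3-5 with one outer vertex attached to each corner: 0-1, 3-2, 5-4) chosen to be consistent with the six permutations on the slide, the function names are hypothetical, and none of nauty's pruning is included. The order in which leaves are visited may differ from the slide, so the first discrete coloring found here need not be 035421.

```python
def refine(adj, cells):
    """Degree-based refinement of an ordered partition (list of cells)."""
    cells = [list(c) for c in cells]
    changed = True
    while changed:
        changed = False
        for refiner in cells:
            members = set(refiner)
            for i, cell in enumerate(cells):
                groups = {}
                for v in cell:
                    groups.setdefault(len(adj[v] & members), []).append(v)
                if len(groups) > 1:
                    # New cells are ordered by connection count, not by label.
                    cells[i:i + 1] = [groups[k] for k in sorted(groups)]
                    changed = True
                    break
            if changed:
                break
    return cells

def leaves(adj, cells):
    """Depth-first search yielding every discrete coloring (leaf)."""
    cells = refine(adj, cells)
    target = next((i for i, c in enumerate(cells) if len(c) > 1), None)
    if target is None:
        yield [c[0] for c in cells]
        return
    for v in cells[target]:
        # Individualize v in front of the remainder of the target cell.
        rest = [u for u in cells[target] if u != v]
        yield from leaves(adj, cells[:target] + [[v], rest] + cells[target + 1:])

def automorphisms(adj):
    """Compare every leaf with the first one (zeta) and keep real symmetries."""
    edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    zeta, found = None, []
    for leaf in leaves(adj, [sorted(adj)]):
        if zeta is None:
            zeta = leaf
        perm = dict(zip(zeta, leaf))
        if {frozenset((perm[u], perm[v])) for u in adj for v in adj[u]} == edges:
            found.append(tuple(perm[v] for v in sorted(adj)))
    return found

# Assumed graph: triangle 0-3-5 plus pendant edges 0-1, 3-2, 5-4.
adj = {0: {1, 3, 5}, 1: {0}, 2: {3}, 3: {0, 2, 5}, 4: {5}, 5: {0, 3, 4}}
symmetries = automorphisms(adj)
print(len(symmetries))
```

Each symmetry is reported as a tuple whose entry i is the image of vertex i, so (0, 1, 4, 5, 2, 3) corresponds to (2,4)(3,5). On this graph the search visits six leaves and all six turn out to be symmetries.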

Pruning Using Generators
- Too many symmetries: |Aut(G)| is O(n!)
- Group theory provides the answer: generators
  - Irredundant set H ⊆ Aut(G) implicitly represents entire set of symmetries
  - Exponential compression: |H| ≤ log2 |Aut(G)|
- We prune away subtrees guaranteed to yield symmetries that we can already generate with previously discovered symmetries
Notes: TODO: focus on major bullets. Review proof that we can get exponential compression.
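
To illustrate how a small generating set stands in for the whole group, the sketch below closes two of the symmetries from the search-tree example, (2,4)(3,5) and (0,3)(1,2), under composition. The tuple encoding (entry i is the image of vertex i) and the function names are assumptions made for this example.

```python
from math import log2

def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def generate_group(generators):
    """Close a set of permutations under composition."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            candidate = compose(g, h)
            if candidate not in group:
                group.add(candidate)
                frontier.append(candidate)
    return group

g1 = (0, 1, 4, 5, 2, 3)  # (2,4)(3,5)
g2 = (3, 2, 1, 0, 4, 5)  # (0,3)(1,2)
group = generate_group([g1, g2])
print(len(group), log2(len(group)))
```

Two generators recover all six symmetries, in line with the bound |H| ≤ log2 |Aut(G)| since log2 6 is about 2.58.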

saucy: Exploiting Structure
- nauty works very well on small graphs (and thus small formulas) but fails to scale
  - Takes considerably longer than the SAT solver after adding SBPs to the CNF formula
  - Runs out of memory on formulas with corresponding graphs having >50,000 vertices
- saucy improvement #1: sparse representation
- saucy improvement #2: use knowledge of graph construction
  - Clause vertices only connected to their literals
  - Never connected to each other
Notes: TODO: rework #2 so that the structure stuff are sub-bullets.

saucy: Algorithmic Improvements
- nauty:
  - Iterate over all colors
  - For each vertex, count connections to refining color, and sort
- saucy improvement #3:
  - Determine directly connected colors
- saucy improvement #4:
  - For each vertex in refining color, count connections
  - For every color touched, sort the counts
[Figure: example coloring with colors labeled Positive, Negative, and Clauses]
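
The difference between the two counting strategies can be sketched on the graph for (a'+b+c)(a+b'+c')(b'+c). This is an illustration of the idea only, not code from nauty or saucy: `dense_counts` visits every vertex of every color, while `sparse_counts` walks only the adjacency lists of the refining color and records which colors were touched. Both function names and the three-color starting partition are assumptions.

```python
from collections import defaultdict

edges = [("a", "a'"), ("b", "b'"), ("c", "c'"),
         ("C1", "a'"), ("C1", "b"), ("C1", "c"),
         ("C2", "a"), ("C2", "b'"), ("C2", "c'"),
         ("C3", "b'"), ("C3", "c")]
adj = defaultdict(list)
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

# Colors: positive literals, negative literals, clauses.
cells = [["a", "b", "c"], ["a'", "b'", "c'"], ["C1", "C2", "C3"]]
cell_of = {v: i for i, cell in enumerate(cells) for v in cell}

def dense_counts(refiner):
    """Visit every vertex of every color and count its links to the refiner."""
    members = set(refiner)
    return {v: sum(u in members for u in adj[v]) for cell in cells for v in cell}

def sparse_counts(refiner):
    """Walk only the adjacency lists of the refining color."""
    counts = defaultdict(int)
    touched = set()
    for v in refiner:
        for u in adj[v]:
            counts[u] += 1
            touched.add(cell_of[u])
    return dict(counts), touched

counts, touched = sparse_counts(cells[2])  # refine with the clause color
print(counts, touched)
```

Refining with the clause color touches only the two literal colors, so the clause color itself never needs to be examined or sorted.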

saucy: Asymptotic Performance
- Partition refinement
  - nauty implementation: O(n^3)
  - saucy improvement #4: O(n^2 log n)
- Search tree size
  - Worst case: exponential
  - No "bad leaves": O(n^3)
  - In practice: O(n)
- Complete algorithm
  - No "bad leaves": O(n^5 log n)
  - Much lower in practice

saucy: Empirical Performance
Inst.     n      zChaff  nauty    % Sym  saucy  % Sym
s4-4-3-1  10354  218.53  88.74    28.9   0.11   0.05
s4-4-3-2  9974   877.59  79.67    8.3    0.10   0.01
s4-4-3-3  9970   884.78  75.98    7.9    0.09   -
s4-4-3-4  10714  464.46  155.31   25.1   0.14   0.03
s4-4-3-5  11072  134.09  101.63   43.1   -      0.08
s4-4-3-6  9620   13.24   76.48    85.2   -      0.75
s4-4-3-7  10362  18.27   78.96    81.2   -      0.54
s4-4-3-8  6608   0.68    28.42    97.7   0.06   8.11
2pipe     3575   0.13    2.93     95.8   0.02   13.33
3pipe     10048  6.44    57.53    89.9   -      1.98
4pipe     21547  153.50  523.64   77.3   0.49   0.32
5pipe     38746  122.85  3144.85  96.2   1.65   1.33

saucy: Empirical Performance
Notes: TODO: possibly get rid of the bullets.

Conclusions and Future Work
- CNF formulas from EDA applications exhibit considerable structure (symmetry and sparsity)
- saucy, a new implementation of the nauty symmetry-detection system
  - Exploits structure to improve symmetry detection performance by several orders of magnitude
  - Symmetry-detection time insignificant compared to SAT solver
- Future work
  - Apply saucy to more sparse domains which may benefit from symmetry detection
  - Find other applications of partition refinement, a surprisingly general framework for distinguishing objects in a finite domain
Notes: TODO: mention again the %Sym number.

Thank You!

saucy: Exploiting Structure
- Graphs from typical CNF formulas possess a particular structure
  - By construction, clauses are never connected to other clauses
  - Thus, when refining the partition with a color of clauses, we can ignore all colors containing clauses, since we know that the connection count for every vertex will be zero
- Such graphs are also very sparse
  - Most clauses are connected to only a few literals
  - Most literals are connected to only a few clauses
  - Thus, we aggressively avoid work by maintaining data structures (like adjacency lists) which explicitly direct the search and refinement procedures

saucy example: Hole-3

saucy: Exploiting Structure
- We can generalize this idea of avoiding obviously disconnected colors
- saucy improvement #3
  - Iterate over a color's adjacency lists to determine connected colors
  - Compute connection counts only for those colors

saucy example: Hole-3
- nauty:
  - Iterate over all colors
  - For each vertex, count connections to refining color, and sort
- saucy improvement #3:
  - Determine directly connected colors
- saucy improvement #4:
  - For each vertex in refining color, count connections
  - For every color touched, sort the counts
[Figure: Hole-3 colorings under nauty and under saucy improvement #3, with colors labeled Positive, Negative, and Clauses]

saucy: Exploiting Structure
- nauty works very well on small graphs (and thus small formulas) but fails to scale
  - Takes considerably longer than the SAT solver after adding SBPs
  - Runs out of memory on formulas with corresponding graphs having >50,000 vertices
- saucy improvement #1: sparse representation
  - Input graph is represented in adjacency-list format

saucy: Exploiting Structure
- Graphs from CNF formulas possess a particular structure
  - Clause vertices only connected to their literals
  - Never connected to each other
- saucy improvement #2: ignore colors containing clauses when refining with clauses
  - We know that the connection count for every vertex will be zero

Symmetry Detection Problem
- What precisely is a symmetry of a graph G?
- A symmetry is a permutation γ of the labels assigned to vertices of G such that G^γ = G
- The set of all symmetries is denoted Aut(G)
- γ = (a,a')(b,c')(b',c)(C1,C2) gives G^γ = G
- γ = (a,c',b')(a',c,b) gives G^γ ≠ G
[Figure: the graph G next to its relabelings under the two permutations]

The Search Tree
- We have a stable ordered partition π of the vertices of the graph. How can we extract Aut(G)?
- Recall the naïve approach: we need labelings (i.e. discrete colorings)
- We select a non-singleton color T (for target) and generate |T| colorings, each with one element of T individualized "in front of" the remainder of T
  - Partitions the set of all discrete colorings descendant from π
- We can then recursively apply partition refinement to further prune the search space!
- Fix the first discrete coloring found as ζ; the remaining discrete colorings yield likely candidates for Aut(G)

The Search Tree
- Out of 6! = 720 possible labelings, partition refinement pruned away 714, leaving the six symmetries of the graph
- Discrete colorings do not necessarily yield symmetries
  - Refinement is only an approximation
  - Only occurs on highly regular graphs, which are uncommon in EDA applications
- First discrete coloring: ζ = 035421
- Symmetries found: γ0 = ι, γ1 = (2,4)(3,5), γ2 = (0,3)(1,2), γ3 = (0,3,5)(1,2,4), γ4 = (0,5,3)(1,4,2), γ5 = (0,5)(1,4)
[Figure: the search tree of colorings for a six-vertex graph with vertices 0 to 5]

saucy: Empirical Performance
Instance  Vertices  nauty    saucy #4  Speedup  zChaff  w/SBPs
Urq       299       0.02     0.01      2.00     1.73
Hole      301       0.025    -         2.50     523.39
Xor       464       0.06     -         6.00     timeout 1.75
Fpga      671       0.1825   0.0125    14.60    0.03
s4-4-3-1  10354     88.74    0.11      806.73   441.18 218.53
s4-4-3-2  9974      79.67    0.1       796.70   204.29 877.59
s4-4-3-3  9970      75.98    0.09      844.22   884.78
s4-4-3-4  10714     155.31   0.14      1109.36  464.46
s4-4-3-5  11072     101.63   -         923.91   134.09
s4-4-3-6  9620      76.48    -         764.80   679.13 13.24
s4-4-3-7  10362     78.96    -         789.60   831.01 18.27
s4-4-3-8  6608      28.42    -         473.67   123.82 0.68
s4-4-3-9  12920     209.52   0.17      1232.47  75.21
2pipe     3575      2.93     -         146.50   0.18 0.13
3pipe     10048     57.53    -         442.54   3.20 6.44
4pipe     21547     523.64   0.49      1068.65  228.82 153.50
5pipe     38746     3144.85  1.65      1905.97  347.92 122.85

Inst.  n    zChaff   SBPs  nauty   %Sym   saucy   %Sym
Urq    299  1.73     0.02  -       50.0%  0.01    33.3%
Hole   301  523.39   -     0.025   55.6%  -       -
Xor    464  timeout  1.75  0.06    3.3%   -       0.6%
Fpga   671  -        0.03  0.1825  85.9%  0.0125  -

saucy: Empirical Performance
Inst.  n      zChaff  nauty    % Sym  saucy  % Sym
2pipe  3575   0.13    2.93     95.8   0.02   13.3
3pipe  10048  6.44    57.53    89.9   -      1.98
4pipe  21547  153.50  523.64   77.3   0.49   0.32
5pipe  38746  122.85  3144.85  96.2   1.65   1.33

saucy: Runtime Performance
- Speedup is roughly linear in the number of vertices
  - Primarily due to efficient use of sparsity within the partition refinement procedure
  - Search tree maintenance has relatively low overhead
- We ran saucy and nauty on the complement graphs
  - Isomorphic search trees!
  - Isolate performance difference in refinement
  - Slowdown is roughly linear, which is expected given difference in representation

Ordered Partitions
- The partition refinement algorithm based on degree operates independently of the labeling of the graph
- To guarantee identical partition representations for isomorphic graphs, we impose an ordering on the colors in the partition
  - When a color is split, the new colors are assigned in sorted order of degree with the refining color
  - Refining colors are chosen based on position within the partition ordering, not based on label
- The search algorithm absolutely requires the refinement procedure to be labeling-independent!

Partition Refinement
- In a stable coloring:
  - vertices in different colors definitely cannot map into each other in any symmetry of the graph
  - vertices in the same color may map into each other (i.e. refinement is only an approximation)
- Refinement can't distinguish any vertices (all have degree two)
  - Vertices in the triangle and square cannot map into each other
  - Fortunately, this rarely happens with EDA instances
  - An exact, polynomial-time partition refinement algorithm would prove that the graph isomorphism problem is in P
[Figure: a seven-vertex graph made of a triangle and a square (vertices 1 to 7), and the CNF graph with clause vertices C1, C2, C3]
Notes: TODO: get rid of some text here, especially on the second half.

Additional Pruning Methods
- Enumerating all symmetries is not an option
  - |Aut(G)| is O(n!)
  - Many EDA-related instances possess exponentially many symmetries
- Group theory provides the answer: generators
  - Find a set H ⊆ Aut(G) that implicitly represents the entire set of symmetries
  - Every element of Aut(G) is a product (composition) of integer powers of elements of H
  - Exponential compression: |H| ≤ log2 |Aut(G)|
- We prune away subtrees guaranteed to yield symmetries that we can already generate with previously discovered symmetries
Notes: TODO: focus on major bullets. Review proof that we can get exponential compression.
