Data Structures for SAT Solvers The 2-Literal Representation Gábor Kusper Eszterházy Károly College Eger, Hungary
Boolean Satisfiability (SAT) Identify a truth assignment that satisfies a Boolean formula, or prove that none exists. Well-known NP-complete problem.
Outline Notation Data structures used by SAT solvers –Literal matrix (Scherzo) –Adjacency lists (GRASP, …) –Head/tail lists (SATO) –Watched literals (Chaff) New data structure: –2-Literal Matrix
Conjunctive Normal Form (CNF) Clause Positive Literal Negative Literal ( a + c ) ( b + c ) (¬a + ¬b + ¬c )
Literal & Clause Classification
(a + ¬b)(¬a + b + ¬c)(a + c + d)(¬a + ¬b + ¬c)
a assigned 0, b assigned 1, c and d unassigned
–(a + ¬b): unsat
–(¬a + b + ¬c): satisfied
–(a + c + d): unresolved
–(¬a + ¬b + ¬c): satisfied
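The classification above can be sketched in a few lines of Java. This is only an illustration: the class name `ClauseClassifier`, the DIMACS-style integer literals (+v for a variable, -v for its negation) and the numbering a=1, b=2, c=3, d=4 are assumptions made for the example, not part of the slides.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: classifies one clause under a partial assignment.
final class ClauseClassifier {
    enum Status { SATISFIED, UNSATISFIED, UNRESOLVED }

    // Literals are ints: +v is variable v, -v is its negation (assumed encoding).
    // Variables missing from the map are unassigned.
    static Status classify(int[] clause, Map<Integer, Boolean> assignment) {
        int unassigned = 0;
        for (int lit : clause) {
            Boolean value = assignment.get(Math.abs(lit));
            if (value == null) {
                unassigned++;                 // literal is still free
            } else if (value == (lit > 0)) {
                return Status.SATISFIED;      // one true literal satisfies the clause
            }
        }
        // No true literal: unsat if nothing is left to assign, otherwise unresolved.
        return unassigned == 0 ? Status.UNSATISFIED : Status.UNRESOLVED;
    }

    public static void main(String[] args) {
        Map<Integer, Boolean> assignment = new HashMap<>();
        assignment.put(1, false);             // a assigned 0
        assignment.put(2, true);              // b assigned 1
        int[][] formula = { {1, -2}, {-1, 2, -3}, {1, 3, 4}, {-1, -2, -3} };
        for (int[] clause : formula) {
            System.out.println(classify(clause, assignment));
        }
    }
}
```

Running `main` classifies the four clauses of the slide's formula in order.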
Additional Definitions
Resolution
–Example: ω1 = (¬a + b + c), ω2 = (a + b + d)
–Resolution: res(ω1, ω2, a) = (b + c + d)
Unit Propagation
–An unresolved clause is unit if it has exactly one unassigned literal
–A unit clause has exactly one option for being satisfied
–Example: φ = (a + c)(b + c)(¬a + ¬b + ¬c); with a and b assigned 1, the clause (¬a + ¬b + ¬c) is unit and c must be set to 0
–Boolean Constraint Propagation: iterated application of unit propagation
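A minimal sketch of both definitions, assuming DIMACS-style integer literals (+v / -v) and the numbering a=1, b=2, c=3, d=4. The class and method names are hypothetical, and the propagation loop is deliberately naive (it rescans all clauses), so it illustrates the definition rather than an efficient solver.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper illustrating resolution and Boolean Constraint Propagation.
final class ResolutionAndBcp {

    // res(c1, c2, var): union of both clauses without var and its negation.
    static Set<Integer> resolve(Set<Integer> c1, Set<Integer> c2, int var) {
        boolean clash = (c1.contains(var) && c2.contains(-var))
                     || (c1.contains(-var) && c2.contains(var));
        if (!clash) {
            throw new IllegalArgumentException("clauses do not clash on " + var);
        }
        Set<Integer> resolvent = new TreeSet<>(c1);
        resolvent.addAll(c2);
        resolvent.remove(var);
        resolvent.remove(-var);
        return resolvent;
    }

    // Iterated unit propagation. Extends the assignment in place and
    // returns false as soon as some clause has all literals false.
    static boolean propagate(List<int[]> clauses, Map<Integer, Boolean> assignment) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int[] clause : clauses) {
                int unassigned = 0;
                int last = 0;
                boolean satisfied = false;
                for (int lit : clause) {
                    Boolean v = assignment.get(Math.abs(lit));
                    if (v == null) {
                        unassigned++;
                        last = lit;           // remember the free literal
                    } else if (v == (lit > 0)) {
                        satisfied = true;
                        break;
                    }
                }
                if (satisfied) continue;
                if (unassigned == 0) return false;   // unsat clause: conflict
                if (unassigned == 1) {               // unit clause: forced value
                    assignment.put(Math.abs(last), last > 0);
                    changed = true;
                }
            }
        }
        return true;
    }
}
```

With a and b assigned 1, calling `propagate` on (a + c)(b + c)(¬a + ¬b + ¬c) sets c to 0, and `resolve` on (¬a + b + c) and (a + b + d) over a yields (b + c + d).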
Data Structures
Literal matrix (Scherzo)
–View CNF formula as a matrix, where the rows denote the clauses and the columns the variables
2-Literal matrix (new)
Adjacency lists (most SAT solvers)
–Counter-based state maintenance
–Keep counters of sat, unsat and unassigned (free) literals for each clause
Lazy data structures
–Head/Tail lists (SATO)
–Watched literals (Chaff)
State-of-the-art SAT Solvers MiniSAT solver: ods/MiniSat/ Java SAT solver: A paper about data structures: Efficient data structures for backtrack search SAT solvers Inês Lynce and João Marques-Silva
Literal Matrix
View CNF formula as a matrix, where the rows denote the clauses and the columns the variables
–Assigned variables result in unsat literals
–Satisfied clauses result in sat clauses
–Each clause is an array of bits
–Each clause contains counters of sat, unsat and unassigned literals
Used in the past in Binate Covering algorithms
–E.g.: Scherzo, by Coudert et al., DAC’95 and DAC’96
1-Literal Matrix Representation
We can call the Literal Matrix the 1-Literal Matrix.
We encode combinations of 1-clauses; each 1-clause corresponds to a bit: 01: -, 10: + 01: a, 10: ā
The representation:
00  sat
10  ā
01  a
11  unsat
1-Literal Matrix
(a + ¬b)(¬a + b + ¬c)(a + c + d)(¬a + ¬b + ¬c)
(+ positive literal, - negative literal, x no literal or unsat literal)

                a  b  c  d
a + ¬b          +  -  x  x
¬a + b + ¬c     -  +  -  x
a + c + d       +  x  +  +
¬a + ¬b + ¬c    -  -  -  x

a assigned 0:

                a  b  c  d
a + ¬b          x  -  x  x
¬a + b + ¬c     sat
a + c + d       x  x  +  +
¬a + ¬b + ¬c    sat

b assigned 1: the last remaining literal of a + ¬b becomes x, so that clause is unsat.
Definition of k-clause A k-clause has k literals. Example: ( a + c ) ( b + c ) (¬a + ¬b + ¬c ) –The only 3-clause in this formula is: (¬a + ¬b + ¬c ) –The 2-clauses in this formula are: (a + c) (b + c) –There is no unit, i.e., 1-clause, in this example.
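2-Literal Matrix Representation
We encode combinations of 2-clauses. Each 2-clause corresponds to a bit:
1000: (a + e), 0100: (a + ē), 0010: (ā + e), 0001: (ā + ē)
It can encode every Boolean function of two variables.
The representation (a code is the conjunction of the 2-clauses whose bits are set):

0  0000  sat               8  1000  (a + e)
1  0001  (ā + ē)           9  1001  (a + e)(ā + ē)
2  0010  (ā + e)           A  1010  e
3  0011  ā                 B  1011  ā, e
4  0100  (a + ē)           C  1100  a
5  0101  ē                 D  1101  a, ē
6  0110  (a + ē)(ā + e)    E  1110  a, e
7  0111  ā, ē              F  1111  unsat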
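The 4-bit codes can be checked mechanically. The sketch below assumes the reading that a set bit includes the corresponding 2-clause in a conjunction (so 1100 stands for (a + e)(a + ē), which is equivalent to the unit a). The class name `TwoLiteralCode` and its methods are hypothetical.

```java
// Hypothetical helper: interprets a 4-bit 2-literal code over variables a and e.
final class TwoLiteralCode {
    static final int A_E   = 0b1000;  // (a + e)
    static final int A_NE  = 0b0100;  // (a + ē)
    static final int NA_E  = 0b0010;  // (ā + e)
    static final int NA_NE = 0b0001;  // (ā + ē)

    // True iff the assignment (a, e) satisfies every 2-clause whose bit is set.
    static boolean satisfies(int code, boolean a, boolean e) {
        if ((code & A_E)   != 0 && !(a || e))   return false;
        if ((code & A_NE)  != 0 && !(a || !e))  return false;
        if ((code & NA_E)  != 0 && !(!a || e))  return false;
        if ((code & NA_NE) != 0 && !(!a || !e)) return false;
        return true;
    }

    // Number of the four assignments to (a, e) that satisfy the code.
    static int modelCount(int code) {
        int count = 0;
        boolean[] values = { false, true };
        for (boolean a : values) {
            for (boolean e : values) {
                if (satisfies(code, a, e)) count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Print the model count of each of the 16 codes.
        for (int code = 0; code < 16; code++) {
            System.out.println(Integer.toHexString(code).toUpperCase()
                    + " has " + modelCount(code) + " model(s)");
        }
    }
}
```

Each 2-clause rules out exactly one of the four assignments to (a, e), so every subset of assignments, and therefore every Boolean function of two variables, corresponds to exactly one code.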
2-Literal Matrix [Figure: the 1-literal matrix of (a + ¬b)(¬a + b + ¬c)(a + c + d)(¬a + ¬b + ¬c) is rearranged step by step. The columns are reordered from a b c d to a c b d, the rows are reordered to a + ¬b, ¬a + ¬b + ¬c, ¬a + b + ¬c, a + c + d, and pairs of columns are replaced by 4-bit 2-literal codes, e.g. x x becomes 1111.]
2-Literal Matrix [Figure: the 2-literal matrix over the columns a c b d (x x is encoded as 1111), shown after (a + c) is assigned 1 and after a is assigned 1.]
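Unit Propagation

// Requires java.util.BitSet. nLiterals, unSatLit, subsumed and
// numberOfEffectiveLiterals are fields of the enclosing clause class.
public void unitPropagation(int column, BitSet unitToProp) {
    // A cell that is already unsat cannot change any further.
    if (nLiterals[column].equals(unSatLit)) return;
    // If every bit set in the cell is also set in unitToProp, the clause is subsumed.
    BitSet clone = (BitSet) nLiterals[column].clone();
    clone.and(unitToProp);
    if (clone.equals(nLiterals[column])) subsumed = true;
    // Merge the propagated bits into the cell.
    nLiterals[column].or(unitToProp);
    // A cell that just became unsat no longer counts as an effective literal.
    if (nLiterals[column].equals(unSatLit)) numberOfEffectiveLiterals--;
}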
n-Literal Matrix Representation We encode combinations of n-clauses; each n-clause corresponds to a bit. It can encode every Boolean function of n variables. We need 2^n bits per cell. The 1-literal and the 2-literal matrix have the same size: 2 bits for each variable versus 4 bits for each pair of variables.
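As a worked size check, assume the v variables of the formula are split into disjoint groups of n and that each clause stores one 2^n-bit cell per group. The grouping and the symbol v are assumptions made for this calculation; the slides only state the cell size.

```latex
\text{bits per clause}(n) = \frac{v}{n}\cdot 2^{n},
\qquad
n = 1:\ 2v, \quad
n = 2:\ \frac{v}{2}\cdot 4 = 2v, \quad
n = 3:\ \frac{v}{3}\cdot 8 \approx 2.67\,v
```

Under this assumption n = 1 and n = 2 cost the same, and every larger n costs more bits per clause.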
1-Literal vs. 2-Literal Matrix 1-Literal Matrix: –Advantages: Easy to implement. Unit propagation results either in a sat clause or in an unsat literal –Disadvantages: Wasteful: 4 bits store only 9 distinct pieces of information
1-Literal vs. 2-Literal Matrix 2-Literal Matrix: –Advantages: Economical: 4 bits store 15 distinct pieces of information. One can propagate more (1110) or less (1000) information at once than a normal unit (1100) –Disadvantages: Unit propagation by a 2-literal does not necessarily result in a sat clause or an unsat literal
Standard CNF Representation
Adjacency list representation:
–Each clause contains: a list of literals; counters of sat, unsat and unassigned (free) literals
–Each variable x keeps a list with all clauses with literals on x
–Number of references kept in variables equals total number of literals, |L|
–Used in some SAT solvers: GRASP, rel-sat (some versions), POSIT, etc.
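The counter-based scheme can be sketched as follows. The class name `CounterBasedCnf`, the integer literal encoding (+v / -v) and the split of each variable's occurrence list into a positive and a negative list are assumptions made for the example, not a description of GRASP or any other solver's code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical counter-based (adjacency list) clause database.
final class CounterBasedCnf {
    private final int[][] clauses;
    private final int[] sat;     // number of true literals per clause
    private final int[] unsat;   // number of false literals per clause
    // Occurrence lists: clause indices per variable, split by polarity.
    private final List<List<Integer>> pos = new ArrayList<>();
    private final List<List<Integer>> neg = new ArrayList<>();

    CounterBasedCnf(int numVars, int[][] clauses) {
        this.clauses = clauses;
        this.sat = new int[clauses.length];
        this.unsat = new int[clauses.length];
        for (int v = 0; v <= numVars; v++) {
            pos.add(new ArrayList<>());
            neg.add(new ArrayList<>());
        }
        // One reference per literal occurrence: |L| references in total.
        for (int c = 0; c < clauses.length; c++) {
            for (int lit : clauses[c]) {
                (lit > 0 ? pos : neg).get(Math.abs(lit)).add(c);
            }
        }
    }

    void assign(int var, boolean value)   { update(var, value, +1); }

    // Backtracking: must be called with the value that was assigned.
    void unassign(int var, boolean value) { update(var, value, -1); }

    private void update(int var, boolean value, int delta) {
        for (int c : pos.get(var)) {
            if (value) sat[c] += delta; else unsat[c] += delta;
        }
        for (int c : neg.get(var)) {
            if (value) unsat[c] += delta; else sat[c] += delta;
        }
    }

    boolean isSatisfied(int c)   { return sat[c] > 0; }
    boolean isUnsatisfied(int c) { return unsat[c] == clauses[c].length; }
    boolean isUnit(int c)        { return sat[c] == 0 && unsat[c] == clauses[c].length - 1; }
    int freeLiterals(int c)      { return clauses[c].length - sat[c] - unsat[c]; }
}
```

Every assignment and every backtrack step touches all clauses that contain the variable, which is the cost the lazy data structures try to avoid.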
Lazy Data Structures Head/Tail Lists –Each clause contains a list of literals –Each unresolved clause is only referenced in two unassigned variables (but possibly in several assigned variables) –Each time a variable is assigned, referenced clauses either become unit, sat, unsat or a new reference becomes associated with another of the clause’s unassigned variables Unit and unsat clauses can then be identified in constant time –Clause can be declared unit/unsat by inspection of two references –When backtracking, previous references are recovered Knowledge of the order of literal assignments is maintained and it is essential
Examples of Lazy Structures [Figure: clauses drawn as rows of literals. Legend: unsatisfied literal, satisfied literal, unassigned literal, the search decision depth at which each literal was assigned, and the literal references kept in variables.] Largest number of literal references in variables: |L| Smallest number of literal references in variables: 2|C|
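Head/Tail Lists [Figure: positions of the head (H) and tail (T) references of a clause as its literals are assigned, when the clause becomes unit, and after backtracking.]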
Lazy Data Structures Watched Literals –Each unresolved clause is only referenced in two unassigned variables (and not in any assigned variables) –Each time a variable is assigned, referenced clauses either become unit, sat, unsat or, of the two clause references, one becomes associated with another of the clause’s unassigned variables Unit and unsat clauses can only be identified in linear time –Must visit all literals to confirm that clause is unit or unsat –When backtracking, do nothing Knowledge of the order of literal assignments in clause is not (and cannot be) maintained
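A minimal sketch of the watched-literal scheme, assuming integer literals (+v / -v), clauses with at least two distinct literals, and the common convention of keeping the two watched literals in positions 0 and 1 of the clause array. The class `WatchedLiterals` and its methods are hypothetical and do not reproduce Chaff's code; a real solver would also queue the returned unit literals and check them for conflicts.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical watched-literal clause database.
final class WatchedLiterals {
    // For each literal, the clauses currently watching it.
    private final Map<Integer, List<int[]>> watches = new HashMap<>();
    private final Map<Integer, Boolean> value = new HashMap<>();

    // Assumes at least two distinct literals; the first two become the watches.
    void addClause(int... lits) {
        int[] clause = lits.clone();
        watch(clause[0], clause);
        watch(clause[1], clause);
    }

    private void watch(int lit, int[] clause) {
        watches.computeIfAbsent(lit, k -> new ArrayList<>()).add(clause);
    }

    private boolean isTrue(int lit) {
        Boolean v = value.get(Math.abs(lit));
        return v != null && v == (lit > 0);
    }

    private boolean isFalse(int lit) {
        Boolean v = value.get(Math.abs(lit));
        return v != null && v != (lit > 0);
    }

    // Makes lit true. Returns the literals forced by new unit clauses,
    // or null if some clause became unsat.
    List<Integer> assign(int lit) {
        value.put(Math.abs(lit), lit > 0);
        List<Integer> units = new ArrayList<>();
        List<int[]> keep = new ArrayList<>();
        boolean conflict = false;
        for (int[] c : watches.getOrDefault(-lit, new ArrayList<>())) {
            if (c[0] == -lit) {               // keep the false watch in position 1
                c[0] = c[1];
                c[1] = -lit;
            }
            boolean moved = false;
            if (!isTrue(c[0])) {
                // Linear scan for a replacement watch that is not false.
                for (int i = 2; i < c.length; i++) {
                    if (!isFalse(c[i])) {
                        c[1] = c[i];
                        c[i] = -lit;
                        watch(c[1], c);
                        moved = true;
                        break;
                    }
                }
            }
            if (moved) continue;
            keep.add(c);                      // clause keeps watching -lit
            if (isTrue(c[0])) continue;       // already satisfied
            if (isFalse(c[0])) conflict = true; else units.add(c[0]);
        }
        watches.put(-lit, keep);
        return conflict ? null : units;
    }

    // Backtracking only forgets the value; the watch lists are left untouched.
    void unassign(int var) {
        value.remove(var);
    }
}
```

For the clause (¬a + ¬b + ¬c) with a=1, b=2, c=3, `assign(1)` moves a watch to ¬c, and `assign(2)` then finds no replacement and reports ¬c as the forced literal.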
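Watched Literals [Figure: positions of the two watched (W) references of a clause as its literals are assigned, when the clause becomes unit, and after backtracking.]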
HT vs. WL Head/Tail Lists: –Advantages: Order relation between the two (H and T) references –More efficient identification of unit and unsat clauses When one reference attempts to visit the other, clause is either unit or unsat –Better accuracy in characterizing the dynamic size of clauses –Disadvantages: Larger overhead during backtracking Worst-case number of references for each clause equals number of literals –Total (worst-case):|L| Similar to adjacency lists in the worst-case
HT vs. WL Watched Literals (WL): –Advantages: Smaller overhead Constant number (2) of references for each clause –Total (worst-case):2|C| Twice the number of clauses, and |C| << |L| –Disadvantages: Lack of order relation between the two (W) references –Identification of new unit or unsat clauses is always linear in clause size –Worse accuracy in characterizing the dynamic size of clauses
Matrix vs. Lazy Data Structures
Matrix data structures:
–Each clause is an array of bits
Lazy data structures:
–Each clause is a list of literals
Matrix data structures:
–Advantages: Can identify not only unit clauses but also binary and ternary ones
–Disadvantages: Space is also needed for literals that do not occur in the clause. Unit propagation is a |C| time method. Backtracking is a |C| time method
Matrix vs. Lazy Data Structures Lazy data structures: –Advantages: Unit propagation is a |P| + |N| time method –|P| + |N| <= |C| –Disadvantages: The current size of a clause is not known, so only unit clauses can be identified