1 A theory-based decision heuristic for DPLL(T). Dan Goldwasser, Ofer Strichman, Shai Fine. Haifa University, Technion, IBM-HRL.
2 DPLL. [Flowchart; nodes: Decide, BCP, Analyze conflict, Backtrack; edge labels: full assignment, partial assignment, conflict; outcomes: SAT, UNSAT.]
3 DPLL(T). [Flowchart; nodes: Decide, BCP, Deduction, Add Clauses, Analyze conflict, Backtrack; edge labels: full assignment, partial assignment, conflict, T-propagation / T-conflict; outcomes: SAT, UNSAT.]
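The Decide / BCP / Backtrack loop named on this slide can be sketched as follows. This is a minimal illustration, not the solver used in this work: it backtracks chronologically instead of analyzing conflicts and learning clauses, and the names `bcp` and `dpll` are hypothetical.

```python
from typing import Dict, List, Optional

Clause = List[int]            # DIMACS-style literals: 3 means x3, -3 means (not x3)
Assignment = Dict[int, bool]


def bcp(clauses: List[Clause], assignment: Assignment) -> Optional[Assignment]:
    """Boolean constraint propagation. Returns None on a conflict."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned = []
            satisfied = False
            for lit in clause:
                var = abs(lit)
                if var in assignment:
                    if assignment[var] == (lit > 0):
                        satisfied = True
                        break
                else:
                    unassigned.append(lit)
            if satisfied:
                continue
            if not unassigned:
                return None                      # every literal is false: conflict
            if len(unassigned) == 1:             # unit clause: the literal is implied
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0
                changed = True
    return assignment


def dpll(clauses: List[Clause], assignment: Optional[Assignment] = None) -> Optional[Assignment]:
    """Returns a satisfying full assignment, or None if the clauses are UNSAT."""
    assignment = bcp(clauses, assignment or {})
    if assignment is None:
        return None                              # conflict: the caller backtracks
    variables = {abs(lit) for clause in clauses for lit in clause}
    free = sorted(variables - set(assignment))
    if not free:
        return assignment                        # full assignment: SAT
    var = free[0]                                # Decide (naive variable order)
    for value in (True, False):                  # a value heuristic would plug in here
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None


clauses = [[1, 2], [-1, 2], [-2, 3]]
model = dpll(clauses)
```

The order in which `value` is tried in the decision step is the hook that a value-selection heuristic would replace.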
4 Theory propagation. Matters for efficiency, not correctness. Depending on the theory, the best strategy can be: one T-implication at a time; all possible T-implications (“exhaustive theory propagation”); or only cheap-to-compute T-implications.
5 Theory propagation for LA. In this work we are interested in Linear Arithmetic (LA). We will see: the potential of theory propagation; why it does not work today; how it can be approximated efficiently.
6 A geometric interpretation. Let H be a finite set of hyperplanes in d dimensions, and let n = |H|. An arrangement of H, denoted A(H), is a partition of $\mathbb{R}^d$. [Figure: an arrangement in d = 2.] # cells $\le n^d$.
7 A geometric interpretation. Consider a consistent partial assignment of size r, e.g. an assignment to $(l_1, l_2, l_3)$, hence r = 3. How many T-implications are there? [Figure: arrangement with n = 6, r = 3; current partial assignment (1,0,0); labels: $l_1$, $l_4$, $l_5$, “T-implied”.]
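To get a feel for the bound, the sketch below compares it with the number of Boolean assignments to n constraints. It uses the standard count of full-dimensional cells for n hyperplanes in general position in d dimensions, $\sum_{i=0}^{d} \binom{n}{i}$, which is not stated on the slide; the function name `max_cells` is hypothetical.

```python
from math import comb


def max_cells(n: int, d: int) -> int:
    """Number of d-dimensional cells cut out by n hyperplanes in general position in R^d."""
    return sum(comb(n, i) for i in range(d + 1))


n, d = 6, 2
cells = max_cells(n, d)      # 1 + 6 + 15 = 22
bound = n ** d               # the slide's bound: 36
boolean = 2 ** n             # Boolean assignments to the n constraints: 64
print(cells, bound, boolean)
```

For n = 6 lines in the plane, at most 22 of the 64 Boolean assignments correspond to a non-empty cell.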
8 A geometric interpretation. Consider a consistent partial assignment of size r. Theorem 1 [HW87]: $O((n \log r) / r)$ of the remaining constraints intersect the cell, with high probability ($1 - 1/r^c$). A remaining constraint that does not intersect the cell is T-implied (either the constraint or its negation). Some example numbers: r = 3, ~47% of the remaining constraints are implied; r = 12, ~70%; r = 60, ~90%. [HW87] D. Haussler and E. Welzl. Epsilon-nets and simplex range queries. Comput. Geom., 2: , 1987.
9 Assigned vs. implied in practice. Two benchmarks; measured averages at T-consistent points.
10 Theory propagation for LA. Let $l_1, l_2, l_3$ be asserted. Is $l_4$ (or $\neg l_4$) T-implied? Two techniques for finding T-implications. 1. “Plunging”: check satisfiability of $(l_1 \land l_2 \land l_3 \land l_4)$ and of $(l_1 \land l_2 \land l_3 \land \neg l_4)$. Requires solving a linear system. Too expensive in practice (see e.g. [DdM06]). [DdM06] B. Dutertre and L. de Moura. Integrating Simplex with DPLL(T). SRI-CSL-06-01.
11 Theory propagation for LA. Let $l_1, l_2, l_3$ be asserted. Is $l_4$ (or $\neg l_4$) T-implied? Two techniques for finding T-implications. 2. Check whether all vertices of the current cell are on the same side of $l_4$. There is an exponential number of vertices. Too expensive in practice.
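Plunging can be illustrated with a small self-contained sketch. To stay within the standard library it decides satisfiability by Fourier-Motzkin elimination over the rationals rather than by Simplex, which is a swapped-in technique and exponential in the worst case. The constraint encoding and the names `make`, `negate`, `feasible` and `plunge` are assumptions made for this example.

```python
from fractions import Fraction
from typing import List, Optional, Tuple

# (coeffs, bound, strict) encodes coeffs.x <= bound, or coeffs.x < bound if strict.
Constraint = Tuple[Tuple[Fraction, ...], Fraction, bool]


def make(coeffs, bound, strict=False) -> Constraint:
    return tuple(Fraction(c) for c in coeffs), Fraction(bound), strict


def negate(c: Constraint) -> Constraint:
    # not (a.x <= b) is (-a.x < -b); not (a.x < b) is (-a.x <= -b)
    coeffs, bound, strict = c
    return tuple(-a for a in coeffs), -bound, not strict


def feasible(constraints: List[Constraint]) -> bool:
    """Fourier-Motzkin elimination: eliminate one variable at a time."""
    if not constraints:
        return True
    for j in range(len(constraints[0][0])):
        pos = [c for c in constraints if c[0][j] > 0]
        neg = [c for c in constraints if c[0][j] < 0]
        combined = [c for c in constraints if c[0][j] == 0]
        for pa, pb, ps in pos:
            for na, nb, ns in neg:
                # Normalise both rows on x_j and add them, so that x_j cancels.
                coeffs = tuple(p / pa[j] - n / na[j] for p, n in zip(pa, na))
                combined.append((coeffs, pb / pa[j] - nb / na[j], ps or ns))
        constraints = combined
    # Only variable-free rows "0 <= b" or "0 < b" remain.
    return all(b > 0 or (b == 0 and not strict) for _, b, strict in constraints)


def plunge(asserted: List[Constraint], candidate: Constraint) -> Optional[bool]:
    """True if candidate is T-implied, False if its negation is, None otherwise."""
    can_be_true = feasible(asserted + [candidate])
    can_be_false = feasible(asserted + [negate(candidate)])
    if can_be_true and not can_be_false:
        return True
    if can_be_false and not can_be_true:
        return False
    return None   # not implied either way (or the asserted set is itself inconsistent)


# Asserted literals: x >= 0, y >= 0, x + y <= 1 (a triangle in the plane).
asserted = [make((-1, 0), 0), make((0, -1), 0), make((1, 1), 1)]
implied = plunge(asserted, make((1, 0), 2))                   # x <= 2
undecided = plunge(asserted, make((1, 0), Fraction(1, 2)))    # x <= 1/2
refuted = plunge(asserted, make((-1, -1), -3))                # x + y >= 3
```

Each call to `plunge` performs two full satisfiability checks, which is the cost the slide refers to.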
12 Approximating theory propagation. Problem 1: how can we use conjectured information without losing soundness? Problem 2: how can we cheaply find good conjectures, i.e., conjectured T-implications?
13 Problem 1: how to use conjectures? We use conjectured implications just to bias decisions: SAT chooses a variable to decide, and we conjecture its value. This might be better than the alternative, since SAT’s heuristics are T-ignorant.
14 Problem 2: conjecturing T-implications. We examined two methods. 1. k-vertices: find k vertices. If they are all on the same side of $l_4$, conjecture that $l_4$ is implied. [Figure; label: $l_4$.] In this case we conjecture $\neg l_4$.
15 Problem 2: conjecturing T-implications. We examined two methods. 1. k-vertices: find k vertices. If they are all on the same side of $l_4$, conjecture that $l_4$ is implied. [Figure; label: $l_4$.] In this case we conjecture nothing.
16 Problem 2: conjecturing T-implications. We examined two methods. 1. k-vertices: find k vertices. If they are all on the same side of $l_4$, conjecture that $l_4$ is implied. [Figure; label: $l_4$.] In this case we (falsely) conjecture $l_4$.
17 Problem 2: conjecturing T-implications. We examined two methods. 1. k-vertices: find k vertices. If they are all on the same side of $l_4$, conjecture that $l_4$ is implied. Too expensive in practice.
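The three outcomes of the k-vertices check (conjecture the literal, conjecture its negation, conjecture nothing) can be sketched as below. The sketch assumes the k vertices are already given and that $l_4$ has the form a·x ≤ b; how the vertices are found, which is the expensive part, is not shown. The name `conjecture_from_vertices` is hypothetical.

```python
from typing import Optional, Sequence


def conjecture_from_vertices(
    vertices: Sequence[Sequence[float]],
    coeffs: Sequence[float],
    bound: float,
) -> Optional[bool]:
    """Conjecture the value of the literal coeffs.x <= bound from sampled vertices.

    Returns True if all vertices satisfy the literal, False if none does,
    and None (no conjecture) if the vertices lie on both sides.
    """
    sides = {sum(a * x for a, x in zip(coeffs, v)) <= bound for v in vertices}
    if sides == {True}:
        return True
    if sides == {False}:
        return False
    return None


triangle = [(0, 0), (1, 0), (0, 1)]                 # all vertices of the cell
two_of_three = [(0, 0), (0, 1)]                     # only k = 2 of them

all_inside = conjecture_from_vertices(triangle, (1, 0), 2)        # x <= 2
all_outside = conjecture_from_vertices(triangle, (-1, -1), -3)    # x + y >= 3
split = conjecture_from_vertices(triangle, (1, 0), 0.5)           # x <= 0.5
wrong = conjecture_from_vertices(two_of_three, (1, 0), 0.5)       # x <= 0.5 again
```

The last call shows the false conjecture from the slides: with only two of the three vertices sampled, the check conjectures x ≤ 0.5 even though the full cell is cut by that constraint. Since conjectures only bias decisions, this costs search time but not soundness.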
18 Problem 2: conjecturing T-implications. We examined two methods. 2. One approximated point: here we always conjecture a T-implication. [Figure; label: $l_4$.]
21 Problem 2: conjecturing T-implications. We examined two methods. 2. One approximated point. The idea: use the assignment maintained by Simplex. It comes for free. Competitive SMT solvers: use general Simplex [DdM06], not classical Simplex; do not activate Simplex after each assignment; only update the assignment according to the ‘simple’ constraints (e.g. “x < c”).
22 Problem 2: conjecturing T-implications. The assignment maintained by general Simplex is updated after each partial (Boolean) assignment, based on simple constraints only. Several possibilities: the partial assignment is T-inconsistent; the partial assignment is T-consistent but the Simplex assignment does not satisfy it; the partial assignment is T-consistent and the Simplex assignment satisfies it: 22%.
23 Problem 2: conjecturing T-implications. Our hope: the Simplex assignment is ‘close’ to the polygon, and can therefore be successful in guessing implications. Even if $l_4$ is not T-implied, it can guide the search. [Figure; label: $l_4$.]
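The one-point method amounts to evaluating the chosen literal at the assignment that Simplex already maintains. A minimal sketch, assuming the literal has the form Σ cᵢ·xᵢ ≤ b and the current assignment is available as a dictionary; the names `beta` and `conjectured_value` are hypothetical and do not come from ArgoLib or any other solver.

```python
from fractions import Fraction
from typing import Dict


def conjectured_value(
    beta: Dict[str, Fraction],
    coeffs: Dict[str, Fraction],
    bound: Fraction,
) -> bool:
    """Value of the literal sum(coeffs[x] * x) <= bound at the point beta.

    beta stands for the assignment maintained by general Simplex. It need not
    lie inside the current cell, so the result is a conjecture used only to
    choose the polarity of a decision, never a T-implication.
    """
    lhs = sum(c * beta[x] for x, c in coeffs.items())
    return lhs <= bound


beta = {"x": Fraction(1), "y": Fraction(2)}

# SAT picked the Boolean variable of l4: x + y <= 4. Decide it as True.
first = conjectured_value(beta, {"x": Fraction(1), "y": Fraction(1)}, Fraction(4))

# SAT picked the Boolean variable of l5: y - x <= 0. Decide it as False.
second = conjectured_value(beta, {"x": Fraction(-1), "y": Fraction(1)}, Fraction(0))
```

Unlike the k-vertices check, a single point always falls on one side of the constraint, which is why this method always produces a conjecture.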
24 Results. Some results for the 200 benchmarks from SMT-COMP’07. Implementation on top of ArgoLib. Each column refers to a different strategy of choosing the value.
25 0-pivot vs. MiniSat. [Plot; axis label: MiniSat.]
26 Back to the future. The number of cells is exponential in d rather than exponential in n: $n^d$ rather than $2^n$. For n sufficiently larger than d, better worst-case complexity. SMT-LIB + SRI’s GDP benchmarks. Examples (n : d):
    QF_RDL_SCHEDULING                        10.9 : 1
    QF_RDL_SAL                                6.7 : 1
    QF_LRA_SC                                 3.9 : 1
    QF_LRA_START_UP                           6.9 : 1
    QF_LRA_UART                               6.1 : 1
    QF_LRA_CLOCK_SYNCH                        3.3 : 1
    QF_LRA_SPIDER_BENCHMARKS                  3.2 : 1
    QF_LRA_SAL                                6.1 : 1
    MathSAT benchmarks (difference logic)    44.5 : 1
    SEP benchmarks (difference logic)          17 : 1
27 P#2: a reversed lazy approach. Current SAT-based ‘lazy’ approaches: search the Boolean domain, check the assignment in the theory domain. A ‘reversed lazy approach’: search the theory domain, check the assignment in the Boolean domain. [Diagram; labels: T-solver, SAT.]
28 How can we enumerate the cells? There exists a data structure (“incidence graph”) that represents the linear arrangement. It is too large in practice: it corresponds to an explicit representation of the search space. Constructing a symbolic representation seems as hard as building the arrangement. For two years we worked on a random, incremental algorithm, each time adding a constraint and consulting SAT. The short summary: we were unable to beat Yices…
29 Summary. We showed how to use ‘free’ information computed by general Simplex in order to improve SAT’s decisions. This somewhat compensates for the fact that there is no theory propagation. Future research: how can we let the theory lead efficiently?
30 How many T-implications are there? Let p be a polygon defined by a (consistent) assignment to r ($r \le n$) hyperplanes. Theorem 1 [HW87]: $O((n \log r) / r)$ of the remaining constraints intersect the cell, with high probability ($1 - 1/r^c$). In practice, fewer constraints are implied: because of the constant in the O; because assignments and predicates are not random; and because most decisions are made at low decision levels. [HW87] D. Haussler and E. Welzl. Epsilon-nets and simplex range queries. Comput. Geom., 2: , 1987.
31 Let’s summarize our failed attempts… For two years we worked on a random algorithm: choose r constraints at random. Build the corresponding arrangement H(r). Now each cell corresponds to a partial assignment. Together with BCP, it may lead to a conflict. Otherwise, we need to refine. … The short summary: we could never beat Yices.