Towards More Efficient SAT-Based Model Checking Joao Marques-Silva Electronics & Computer Science University of Southampton LAA C&V Workshop, Isaac Newton Institute, Cambridge, May 2006
2 Motivation Remarkable improvements made to SAT solvers over the last decade –Clause learning; lazy data structures; adaptive branching heuristics; search restarts Very successful application of SAT in model checking –Bounded and unbounded model checking Existing (industry motivated) challenges –Ability to handle ever increasing systems –Ability to find deep counterexamples –Ability to prove difficult properties Lines of research –More efficient SAT solvers (?) –Better uses of SAT technology in SAT-based model checking
3 Goals of this talk SAT & SAT-based model checking Interpolants in SAT-based model checking Optimizations to the utilization of interpolants
4 Outline SAT & SAT-based model checking –Organization of a modern SAT solver –SAT-based bounded model checking (BMC) –Interpolant-based unbounded model checking (UMC) Improvements to SAT-based model checking Results & conclusions
5 Modern SAT algorithms Follow the organization of the DPLL algorithm –Backtrack search with unit propagation Several key techniques are used: –Clause learning[Marques-Silva&Sakallah’96] Infer new clauses from causes of conflicts –Allows implementing non-chronological backtracking –Exploiting structure of conflicts[Marques-Silva&Sakallah’96] Identify Unique Implication Points (UIPs) –Dominators in graph of implied assignments –Optimized data structures[Moskewicz et al.’01] Lazy evaluation of clause state –Adaptive branching heuristics[Moskewicz et al.’01] Variable branching metrics are affected by number of conflicts Aging mechanisms for focusing on most recent conflicts –Search restarts[Gomes,Selman&Kautz’98] Opportunistically restart backtrack search [Davis et al.’62]
6 Evolution of SAT solvers Remarkable improvements over the last decade
7 Bounded model checking Verification of safety properties: F f Characteristic functions for representing initial states and transition relation, respectively I 0 and T –Resulting Boolean formula: k = I 0 U k F k Where: –Interpretation: [Biere et al.’99] T0T0 T1T1 T k-1 FkFk I0I0 Y0Y0 Y1Y1 YkYk
8 Bounded model checking A possible BMC algorithm: –Given some initial k –While k user-specified time-bound UB Generate CNF formula for I 0 U k F k Invoke SAT solver on If formula is satisfiable, then a counterexample within k time steps has been found –Return counterexample Otherwise, increase k The BMC algorithm is incomplete –But complete if completeness threshold is known BMC loop
9 Towards completeness Unbounded model checking –Utilization of induction Standard BMC loop –Stop BMC loop for some i, if cannot have loop-free path of size i that can be reached from I 0 or if cannot have loop-free path of size i that can reach F k –Maximum unfolding bounded by largest loop-free path –... –Utilization of interpolants BMC and Craig interpolants allow SAT-based computation of abstractions of reachable states –Avoid computing exact sets of reachable states –One of the most promising approaches in practice Maximum unfolding bounded by largest shortest path between any two states [McMillan’03] [Sheeran et al.’00] [Chauhan et al.’02;Gupta et al.’03]
10 Interpolants Given two subsets of clauses A and B, assume A B is unsatisfiable. Then, there exists an interpolant A’ for the pair (A, B) with the following properties: –A implies A’ –A’ B is unsatisfiable –A’ refers only to the common variables of A and B –Example: A = p q, B = q r –A’ = q Size of interpolants: –Given a resolution refutation of A B, can compute interpolant for the pair (A, B) in linear time on the size of the resolution refutation SAT solvers can be instructed to output resolution refutation ! Computing interpolants: –Different algorithms can be used Pudlak’97, McMillan’03 [Craig’57] [Pudlak’97] [McMillan’03]
11 (¬c) (c) (¬b) (b c) Deriving resolution refutations For unsatisfiable formulas: –Learned clauses capture a resolution refutation from a subset of the original clauses –SAT solvers can be instructed to recreate resolution refutation for unsatisfiable formula (a b)(¬a c) [Zhang&Malik’03] = (a b) (¬a c) (¬b) (¬c) 22 33 44 11 c = 0 b = 0 a = 0 11 11 22 (3)(3) (4)(4)
12 Computing interpolants Interpolant is a Boolean circuit that follows structure of resolution refutation –Can map circuit into CNF in linear time and space [Tseitin’68; Plaisted&Greenbaum’86] A = (r y)( r x)B = ( y a)( y a)( x) (r y)( r x) (y x)( y a) (x)(x)(y)(y) ( y a) (x)(x) yx A’ = y + x A implies A’; A’ B is unsatisfiable A’ with variables common to A and B
13 Abstraction of reachable states For each iteration of BMC loop, call to SAT solver returns unsat unless counterexample is found –Analysis of resolution refutation yields abstractions of reachable states –Given A and B, and a resolution refutation for A B, compute Craig interpolant A’: A = I 0 T 0 implies A’ A’ B is unsatisfiable A’ solely represented with state variables –If A holds, then A’ holds A 1 = A’ represents abstraction of states reachable from I 0 in 1 time step !
14 Fixpoint of reachable states Can iterate computation of interpolants: T0T0 T1T1 T k-1 FkFk A1A1 AB A2A2 Abstraction of states reachable in 2 time steps T0T0 T1T1 T k-1 FkFk A i-1 AB AiAi Abstraction of states reachable in i time steps T0T0 T1T1 T k-1 FkFk I0I0 AB A1A1 Abstraction of states reachable in 1 time step If A i I 0 A 1 A 2 ... A i-1, then a fixpoint is reached; all reachable states identified !
15 If F k is satisfied from I 0, then we have a counterexample ! If a fixpoint of the reachable states is identified, then no reachable state can satisfy property ! If A B is sat, may have abstracted too much; must unfold more time steps Maximum value of k is bounded by largest shortest path between any two states UMC algorithm k = 0 repeat if from I 0 can satisfy F k within k steps return reachable R = I 0 let A = I 0 T 0, and B = T 1 T 2 T k-1 F k while A B = false P = unsat_proof(A B) A’ = interpolant(P, A, B) if A’ R, return unreachable R = A’ R A = A’ T 0 end while increase k end repeat Fixpoint BMC loop Calls to SAT solver
16 Outline SAT & SAT-based model checking Improvements to SAT-based model checking –Rescheduling the BMC loop Can exploit feedback from the fixpoint checking loop –Reusing computed interpolants Interpolants readily available if fixed point condition is based on interpolants Can envision alternative fixpoint conditions Results & conclusions
17 Rescheduling the BMC loop k = 0 repeat if from I 0 can satisfy F k within k steps return reachable R = I 0 let A = I 0 T 0, and B = T 1 T 2 T k-1 F k while A B = false P = unsat_proof(A B) A’ = interpolant(P, A, B) if A’ R, return unreachable R = A’ R A = A’ T 0 end while increase k end repeat BMC loop Number of iterations can be used to restrict when to check again the BMC condition ! Fixpoint
18 Rescheduling the BMC loop Fixpoint checking with i+1 iterations (last iteration is sat): Checked all states reachable in up to k+i states, with an unfolding of size k; no counterexample was found Need to check BMC condition only when unfolding of FSM exceeds k+i time steps In general useful if counterexample exists I0I0 A1A1 A2A2 A i+1 while A B = false P = unsat_proof(A B) A’ = interpolant(P, A, B) if A’ R, return unreachable R = A’ R A = A’ T 0 end while Fixpoint
19 Interpolant reuse Boolean formula N is usable for B iff B N –B satisfiable iff B N satisfiable Learnt interpolants can be reused –For requiring states from a set of states –For preventing states from a set of states A different organization of BMC: [Copty et al.’01]
20 Interpolant reuse Different ways for computing interpolants –Computed interpolants can be direct or inverse –Interpolants can be computed at different time steps Direct interpolants –Over-approximation of reachable states –Under-approximation of states that do not satisfy failing property Inverse interpolants –Under-approximation of unreachable states –Over-approximation of states that satisfy failing property T0T0 T1T1 T k-1 FkFk I0I0 Can compute (multiple) interpolants
21 Direct interpolants P r,t : –Direct interpolant computed r time steps from I 0 and t time steps to F k From the initial state, P r,t, t=k-r: In general, P r+u,t :
22 Conditions for interpolant reuse I Conditions on direct interpolants: –P r,t (Y r ) is usable for k, with t 0 and r k – P r,t (Y k-t ) is usable for k, with r 0 and t k T0T0 I0I0 T k-1 FkFk T r-1 T k-t-1 P r,t P r,t TrTr T k-t
23 An example I Standard UMC model checking, with BMC and fixpoint loops Automaton with unfolding of size k+1 Fixed point checking for j+1 iterations –Last iteration yields spurious counterexample; j interpolants computed Interpolants computed at Y 1 : –P 1,k, P 2,k,..., P j,k Examples of interpolant reuse: –P i,k (Y i ), 1 i j, is usable for m, m k P i,k represents over-approximation of the states reachable in i time steps –With unfolding of size k+1, P i,k (Y 1 ), 1 i j, is usable for k+1 P i,k represents under-approximation of the states that do not satisfy failing property in k time steps –With unfolding of size m k, P i,k (Y m-k ), 1 i j, is usable for m T0T0 T1T1 TkTk F k+1 I0I0 Y0Y0 Y1Y1 Y k+1
24 Inverse interpolants Q r,t : –Reverse interpolant computed r time steps from I 0 and t time steps to F k From the initial state, Q r,t, t=k-r: In general, Q r+u,t :
25 Conditions for interpolant reuse II Conditions on inverse interpolants: –Q r,t (Y k-t ) is usable for k, with r 0 and t k – Q r,t (Y r ) is usable for k, with t 0 and r k T0T0 I0I0 T k-1 FkFk T r-1 T k-t-1 Q r,t Q r,t TrTr T k-t
26 An example II Standard UMC model checking, with BMC and fixpoint loops Automaton with unfolding of size k+1 Fixed point checking for j+1 iterations –Last iteration yields spurious counterexample; j interpolants computed Interpolants computed at Y 1 : –Q 1,k, Q 2,k,..., Q j,k Examples of interpolant reuse: – Q i,k (Y i ), 1 i j, is usable for m, m k Q i,k represents under-approximation of the states unreachable in i time steps –With unfolding of size k+1, Q i,k (Y 1 ), 1 i j, is usable for k+1 Q i,k represents over-approximation of the states that satisfy failing property in k time steps –With unfolding of size m k, Q i,k (Y m-k ), 1 i j, is usable for m T0T0 T1T1 TkTk F k+1 I0I0 Y0Y0 Y1Y1 Y k+1
27 An example III Inverse UMC model checking, with BMC and fixpoint loops Automaton with unfolding of size k+1 Fixed point checking for j+1 iterations –Last iteration yields spurious counterexample; j interpolants computed Interpolants computed at Y k-1 : –Q k,1, Q k,2,..., Q k,j Examples of interpolant reuse: – Q k,i (Y k ), 1 i j, is usable for m, m k Q k,i represents under-approximation of the states unreachable in k time steps –With unfolding of size k+1, Q k,i (Y k+1-i ), 1 i j, is usable for k+1 Q k,i represents over-approximation of the states that satisfy failing property in i time steps –With unfolding of size m k, Q k,i (Y m-i ), 1 i j, is usable for m T0T0 T1T1 TkTk F k+1 I0I0 Y0Y0 Y1Y1 Y k+1
28 More on interpolant reuse All interpolants computed in standard interpolant-based UMC flow can be reused –Easy to integrate with existing interpolant-based UMC flow Learning and reusing of interpolants can be integrated into any approach for BMC or UMC –Plain BMC algorithm –Different approaches for UMC Inverse interpolants provide alternative fixpoint condition (from previous slide): –If Q k,i F k+1 Q k,1 Q k,i-1 is satisfiable, then we have a fixpoint Potentially interesting; depends on automaton
29 Outline SAT & SAT-based model checking Improvements to SAT-based model checking Results & conclusions
30 Results on rescheduling Evaluated rescheduling on different benchmarks –Specifically designed and industrial examples Evaluated both the plain UMC algorithm and rescheduling
31 Experience with reuse Experimented interpolant reuse on industrial benchmarks –Plain (incomplete) BMC loop –Direct interpolants computed at each step (for last time step) Interpolants not used for checking fixed point condition Experience so far: –Search space is reduced –CPU times increase The problems observed: –Large interpolants Naive simplifications Computed solely for search pruning purposes –Ineffective representation One Reduced Boolean Circuit (RBC) for each interpolant
32 Conclusions SAT technology has improved dramatically over the last decade –Key techniques: Clause learning, optimized data structures, adaptive branching heuristics, search restarts SAT has been applied to model checking with success –Bounded and unbounded model checking Described optimizations to the utilization of interpolants in SAT- based model checking Results preliminary –Rescheduling can allow number of iterations to be significantly reduced Not significant on industrial benchmarks –Reuse of interpolants reduces amount of search, increases run times
33 Many challenges Effectiveness of rescheduling in industrial context? Can interpolant reuse yield performance gains? Can we find “good” interpolants to learn and reuse? –E.g. size/depth of interpolant (or CNF representation)