Bounded Model Checking Ilham Amezzane 13/04/2007
Satisfiability (SAT) Given a Boolean function in conjunctive normal form (CNF), decide whether it is satisfiable. What is a CNF? A conjunction of clauses. Each clause is a disjunction of literals. A CNF formula is satisfiable if a single assignment satisfies every clause simultaneously. SAT is an NP-complete problem.
SAT Applications EDA (Electronic Design Automation), Bounded Model Checking. Tools: GRASP, SATO, Chaff ...
SAT Solver Progress 1960-2010
How to Solve SAT Try all possible assignments to variables. For n variables there are 2^n possible combinations. This explodes even for moderate n. Alternatively, apply heuristics to generate assignments which are likely to satisfy the Boolean function.
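The exhaustive approach can be sketched in a few lines. This is only an illustration: the integer encoding of literals (k for variable k, -k for its negation) and the function name brute_force_sat are assumptions made here, not part of the slides.

```python
from itertools import product

def brute_force_sat(clauses, num_vars):
    """Try all 2**num_vars assignments; return a satisfying one or None."""
    for bits in product([False, True], repeat=num_vars):
        # Literal k > 0 stands for variable k, k < 0 for its negation.
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return bits
    return None

# (x1 + x2)(x1' + x2)(x2' + x3)
print(brute_force_sat([[1, 2], [-1, 2], [-2, 3]], 3))
# x1 x1' has no satisfying assignment
print(brute_force_sat([[1], [-1]], 1))
```

The loop body runs up to 2^n times, which is why this is only usable for very small n.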
DPLL SAT solver Most modern SAT solvers are based on the Davis-Putnam-Logemann-Loveland (DPLL) algorithm (1960, 1962), which performs a branching search with backtracking. The DPLL algorithm is sound and complete, i.e., it finds a solution if and only if the formula is satisfiable.
Some Definitions Unit clause - a clause where all but one of its literals are false, and the remaining literal is unassigned. Unit literal - the unassigned literal in a unit clause. Unit clause rule - if all but one of the literals of a clause have been assigned the value 0, then the remaining (unassigned) literal must be assigned 1 for this clause to be satisfied. Decision - the process of choosing a variable and assigning it a value (either value). Backtracking - the process of reassigning a decision which caused a conflict.
DPLL Algorithm
DPLL Algorithm decide-next-branch(): chooses an unassigned branch variable and assigns a value to it. deduce() - unit propagation: makes the variable assignments deducible from this decision using BCP (Boolean Constraint Propagation). BCP consists of iterative application of the unit clause rule (which is invoked when a clause becomes a unit clause). BCP rule - the last unassigned literal is implied to be true. It avoids the search path where the last literal is also false, since such a path can’t lead to a solution. If no conflict is discovered - choose the next variable for making a decision, otherwise - resolve the conflict by backtracking.
DPLL Algorithm analyze-conflict(): finds the reason for a conflict (a conflict occurs when a variable is implied to be true as well as false). backtrack(): undoes some decisions and their implications.
In other words The initial step consists of some preprocessing, during which it may be discovered that the formula is unsatisfiable. The outer loop starts by choosing an unassigned variable, and a value to assign to it (decide-next-branch). If no such variable exists, a solution has been found. Otherwise, the variable assignments deducible from this decision are made (using deduce), through a procedure called Boolean Constraint Propagation (BCP). It typically consists of iterative application of the unit clause rule, which is invoked whenever a clause becomes a unit clause, i.e., all but one of its literals are false and the remaining literal is unassigned. According to the rule, the last unassigned literal is implied to be true – this avoids the search path where the last literal is also false, since such a path cannot lead to a solution. A conflict occurs when a variable is implied to be true as well as false. If no conflict is discovered during BCP, then the outer loop is repeated, by choosing the next variable for making a decision. However, if a conflict does occur, backtracking is performed within an inner loop in order to undo some decisions and their implications. If all decisions need to be undone (i.e., the backtracking level blevel is 0), the formula is declared unsatisfiable since the entire search space has been exhausted.
Example f = a (b + c + d) (b’ + c) (b’ + d) (x’ + y’) (x + z’) (x’ + b’ + y) (x + b’ + z) (c + d + y’ + z’). Unit clause: a = 1. f = (b + c + d) (b’ + c) (b’ + d) (x’ + y’) (x + z’) (x’ + b’ + y) (x + b’ + z) (c + d + y’ + z’). Branching variable: b = 1. f = c d (x’ + y’) (x + z’) (x’ + y) (x + z) (c + d + y’ + z’). Unit clauses: c = 1, d = 1, which also satisfies (c + d + y’ + z’). f = (x’ + y’) (x + z’) (x’ + y) (x + z). Branching variable: x = 1. f = y’ y. Conflict.
Example contd. f = (x’ + y’) (x + z’) (x’ + y) (x + z). Flip the most recent decision variable, i.e., x = 0. f = z’ z. Still a conflict, so flip the second most recent decision variable, i.e., b = 0 (this also undoes the implied c = 1, d = 1). f = (c + d) (x’ + y’) (x + z’) (c + d + y’ + z’). Branching variable: x = 1. f = (c + d) y’ (c + d + y’ + z’). y = 0. f = (c + d). Branching variable: c = 1. f = 1. f is satisfiable.
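The DPLL procedure and the hand trace above can be reproduced with a small sketch. It is a simplified recursive variant, not the iterative loop with backtracking levels described in the slides: recursion plays the role of chronological backtracking, and there is no conflict analysis. The integer literal encoding and the names unit_propagate and dpll are assumptions made for illustration.

```python
def unit_propagate(clauses, assignment):
    """BCP: apply the unit clause rule until fixpoint. Returns False on conflict."""
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned, satisfied = [], False
            for lit in clause:
                val = assignment.get(abs(lit))
                if val is None:
                    unassigned.append(lit)
                elif val == (lit > 0):
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                return False              # every literal is false: conflict
            if len(unassigned) == 1:      # unit clause: imply the last literal
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0
                changed = True
    return True

def dpll(clauses, assignment=None):
    """Return a satisfying assignment (dict var -> bool) or None."""
    assignment = dict(assignment or {})
    if not unit_propagate(clauses, assignment):
        return None
    variables = {abs(lit) for clause in clauses for lit in clause}
    free = sorted(variables - set(assignment))
    if not free:
        return assignment                 # all variables assigned, no conflict
    var = free[0]                         # naive decide-next-branch
    for value in (True, False):           # try 1 first, then flip on conflict
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None

# The example formula, with a, b, c, d, x, y, z numbered 1..7.
example = [[1], [2, 3, 4], [-2, 3], [-2, 4], [-5, -6], [5, -7],
           [-5, -2, 6], [5, -2, 7], [3, 4, -6, -7]]
model = dpll(example)
print(model)
```

As in the hand trace, the branch b = 1 fails for both values of x, so the returned model has b = 0. The remaining decisions may differ from the slides because the sketch always branches on the lowest-numbered free variable.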
Deduction Algorithm (BCP) Most SAT solvers spend about 80% of their running time in deduce(). Unit clause example: ( v1 + v2 + ¬v3 ) with v1 = F, v2 = F implies v3 = F. When can it occur? When all but one of the literals in a clause are assigned 0. BCP (Boolean Constraint Propagation): how to implement it? By keeping counters for each clause.
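One way the counter idea could look in code is sketched below. The class name CounterBCP, its fields, and the integer literal encoding are hypothetical. The sketch assumes no variable occurs twice in a clause and does not undo assignments after a conflict.

```python
from collections import defaultdict

class CounterBCP:
    """Counter-based BCP sketch: per clause, count true and false literals."""

    def __init__(self, clauses):
        self.clauses = clauses
        self.assignment = {}
        self.num_true = [0] * len(clauses)
        self.num_false = [0] * len(clauses)
        self.occurs = defaultdict(list)   # literal -> indices of clauses containing it
        for i, clause in enumerate(clauses):
            for lit in clause:
                self.occurs[lit].append(i)

    def assign(self, lit):
        """Make `lit` true and propagate. Returns False on conflict."""
        queue = [lit]
        while queue:
            lit = queue.pop()
            var, value = abs(lit), lit > 0
            if var in self.assignment:
                if self.assignment[var] != value:
                    return False          # implied both true and false
                continue
            self.assignment[var] = value
            for i in self.occurs[lit]:
                self.num_true[i] += 1
            for i in self.occurs[-lit]:
                self.num_false[i] += 1
                if self.num_true[i] > 0:
                    continue              # clause already satisfied
                remaining = len(self.clauses[i]) - self.num_false[i]
                if remaining == 0:
                    return False          # all literals false: conflict
                if remaining == 1:        # unit clause: imply the last literal
                    queue.append(next(l for l in self.clauses[i]
                                      if abs(l) not in self.assignment))
        return True

bcp = CounterBCP([[1, 2, -3]])            # ( v1 + v2 + ¬v3 )
bcp.assign(-1)                            # v1 = F
bcp.assign(-2)                            # v2 = F, so v3 = F is implied
print(bcp.assignment)
```

Every assignment touches the counters of all clauses containing the variable, which is the cost that motivates the lazier scheme used in Chaff.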
ZChaff: Branching Heuristics The choice of branching variable significantly affects the height of the decision tree. The most frequently occurring literal seems the natural choice. Search should be localized, i.e., after resolving conflicts one should branch on the literals which were causing conflicts. Assign each variable a counter; for each additional clause, increase the counter of each variable occurring in this clause. At any stage choose the variable with the highest counter among the undecided variables. Since each conflict resolution introduces clauses, this strategy will keep the search localized.
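A minimal sketch of this counter scheme follows. The class and method names are hypothetical. The decay step is an addition taken from the VSIDS idea (counters are periodically scaled down so that recent conflict clauses dominate), and the factor 0.5 is an arbitrary choice, not a value from the slides.

```python
from collections import Counter

class BranchingCounters:
    """Per-variable activity counters, bumped for every clause added."""

    def __init__(self, clauses):
        self.score = Counter()
        for clause in clauses:
            self.add_clause(clause)

    def add_clause(self, clause):
        # Called for the original clauses and for each learned conflict clause.
        for lit in clause:
            self.score[abs(lit)] += 1

    def decay(self, factor=0.5):
        # Assumed VSIDS-style decay: older occurrences count for less.
        for var in self.score:
            self.score[var] *= factor

    def pick(self, assignment):
        """Return the undecided variable with the highest counter, or None."""
        free = [v for v in self.score if v not in assignment]
        # Ties are broken in favour of the lowest variable number.
        return max(free, key=lambda v: (self.score[v], -v)) if free else None

h = BranchingCounters([[1, 2], [-2, 3], [2, -3]])
print(h.pick({}))           # variable 2 occurs most often
print(h.pick({2: True}))    # among the rest, variable 3
```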
ZChaff: BCP optimization For BCP we do not need to check every clause to see whether it has become a unit clause. For each clause choose two literals to watch; the clause can become a unit clause only when one of these two is assigned zero. An assignment (x = 1) will convert a non-unit irredundant clause into a unit clause only if x’ occurs in the clause. A similar property holds for the assignment (x = 0). This calls for an efficient data structure.
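The data structure commonly used for this is the two-watched-literal scheme. The sketch below is a simplified illustration, not zChaff's actual code. It assumes every clause has at least two literals, keeps the watched literals at positions 0 and 1 of each clause (reordering clauses in place), and uses hypothetical function names.

```python
from collections import defaultdict

def value(assignment, lit):
    """True/False if the literal is assigned, None otherwise."""
    v = assignment.get(abs(lit))
    return None if v is None else v == (lit > 0)

def init_watches(clauses):
    watches = defaultdict(list)           # literal -> clauses watching it
    for i, clause in enumerate(clauses):
        watches[clause[0]].append(i)
        watches[clause[1]].append(i)
    return watches

def assign_and_propagate(clauses, watches, assignment, lit):
    """Make `lit` true and run BCP. Returns False on conflict."""
    queue = [lit]
    while queue:
        lit = queue.pop()
        if value(assignment, lit) is True:
            continue
        if value(assignment, lit) is False:
            return False
        assignment[abs(lit)] = lit > 0
        false_lit = -lit                  # only clauses watching this are visited
        pending, keep = watches[false_lit], []
        watches[false_lit] = keep
        for pos, i in enumerate(pending):
            clause = clauses[i]
            if clause[0] == false_lit:    # keep the false watch at index 1
                clause[0], clause[1] = clause[1], clause[0]
            if value(assignment, clause[0]) is True:
                keep.append(i)
                continue
            for k in range(2, len(clause)):
                if value(assignment, clause[k]) is not False:
                    clause[1], clause[k] = clause[k], clause[1]
                    watches[clause[1]].append(i)   # move the watch
                    break
            else:                         # no replacement: unit or conflict
                keep.append(i)
                if value(assignment, clause[0]) is False:
                    keep.extend(pending[pos + 1:])
                    return False
                queue.append(clause[0])
    return True

clauses = [[1, 2, 3], [-1, 2], [-2, -3, 4]]
watches, assignment = init_watches(clauses), {}
assign_and_propagate(clauses, watches, assignment, 1)
print(assignment)
```

Assigning a variable visits only the clauses watching the literal that became false. A further benefit reported for this scheme is that nothing has to be updated in the clauses when assignments are undone during backtracking.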
Conflict analysis The original DPLL algorithm used chronological backtracking, i.e., it would backtrack up to the most recent decision for which the other value of the variable had not been tried. Modern SAT solvers use conflict analysis techniques to analyze the reasons for a conflict. Conflict analysis is used to perform conflict-driven learning and conflict-driven backtracking. Conflict-driven learning consists of adding conflict clauses to the formula, in order to avoid the same conflict in the future. Conflict-driven backtracking allows non-chronological backtracking, i.e., up to the closest decision which caused the conflict. These techniques greatly improve the performance of the SAT solver on structured problems.
ZChaff: Analyze Conflicts Effectively While resolving conflicts one should also add new clauses to avoid exploring the same part of the search space again. E.g., if the variables are explored in the order a, b, c, d, e, then we need not flip the variables in chronological order: if the conflict was due to variable a, flipping e, d, c, b first would only repeat the same search.
Reference zChaff: Matthew W. Moskewicz, Conor F. Madigan, Ying Zhao, Lintao Zhang, Sharad Malik: Chaff: Engineering an Efficient SAT Solver. DAC 2001: 530-535, ACM.
Model Checking The target of model checking is the verification of sequential properties of dynamic systems. A dynamic system has a state component which changes over time. Model checking, in the first place, is only applicable to finite systems.
Model Checking (cont.) Sequential properties are usually represented in temporal logic. Formulas of temporal logic try to express system behavior over time. There are various variants of temporal logic, such as Linear Temporal Logic (LTL) or Computation Tree Logic (CTL), which usually require dedicated algorithms.
Modelling Languages As a language for describing system models we can for example use: Petri nets, labelled transition systems (LTSs) and process algebras, Java programs, UML (Unified Modelling Language) state machines, the Promela language (input language of the Spin model checker), and the VHDL, Verilog, or SMV languages (mostly for HW design).
Some Model Checking Approaches Explicit State Model Checking: Tools include Spin, Murϕ, Java PathFinder, Maria, PROD, CPN Tools, CADP, etc. BDD-based Symbolic Model Checking: Tools include NuSMV 2, VIS, Cadence SMV, etc. Bounded Model Checking: Tools include BMC, CBMC, NuSMV 2, VIS, Cadence SMV, etc.
Symbolic Model Checking Method used by most “industrial strength” model checkers. Uses Boolean encoding for state machine and sets of states. Can handle much larger designs – hundreds of state variables. BDDs traditionally used to represent Boolean functions.
Problems with BDDs BDDs are a canonical representation. Often become too large. Variable ordering must be uniform along paths. Selecting right variable ordering very important for obtaining small BDDs. Often time consuming or needs manual intervention. Sometimes, no space efficient variable ordering exists.
Advantages of SAT Procedures SAT procedures also operate on Boolean formulas but do not use canonical forms. Do not suffer from the potential space explosion of BDDs. Different split orderings possible on different branches. Very efficient implementations exist.
Bounded Model Checking Originally presented in the paper: Armin Biere, Alessandro Cimatti, Edmund M. Clarke, Yunshan Zhu: Symbolic Model Checking without BDDs. TACAS 1999: 193-207, LNCS 1579.
Bounded Model Checking as SAT Given a property p (e.g. “signal_a = signal_b”): is there a state reachable in k cycles which satisfies p? (Figure: a path of states s0, s1, s2, ..., sk-1, sk, with p evaluated at each state.)
Models and Properties
Basics of Bounded Model Checking The basic idea is the following: Encode all the executions of the system M of length k into a propositional formula. Conjoin this formula with a formula which is satisfied by exactly those executions of the system of length k which violate the property. If the resulting formula is satisfiable, a counterexample has been found. If the formula is unsatisfiable, no counterexample of length k exists.
Basic Setup For simplicity first consider the following setup: As system models we consider systems whose state vector s consists of n Boolean state variables ⟨s[0], s[1], ..., s[n−1]⟩. We take k+1 copies of the system state vector, denoted by s0, s1, ..., sk. Let I(s) be the initial state predicate of the system, and T(s, s’) be the transition relation, both expressed as propositional formulas.
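To make the setup concrete, here is a toy sketch for a hypothetical 2-bit counter. The system, the invariant ("the counter never reaches 3"), and all function names are invented for illustration. Exhaustive enumeration stands in for the SAT solver, so it only works for tiny n and k.

```python
from itertools import product

N = 2  # number of Boolean state variables

def init(s):
    """I(s): the counter starts at 0."""
    return s == (False, False)

def trans(s, t):
    """T(s, s'): the counter is incremented modulo 4."""
    val = lambda u: 2 * u[0] + u[1]
    return val(t) == (val(s) + 1) % 4

def prop(s):
    """Invariant to check: the counter never reaches 3."""
    return s != (True, True)

def bmc(k):
    """Search all assignments to the k+1 state copies for a violating execution."""
    for bits in product([False, True], repeat=N * (k + 1)):
        path = [bits[i * N:(i + 1) * N] for i in range(k + 1)]
        formula = (init(path[0])
                   and all(trans(path[i], path[i + 1]) for i in range(k))
                   and any(not prop(s) for s in path))
        if formula:
            return path          # satisfying assignment = counterexample
    return None                  # unsatisfiable: no counterexample of length k

print(bmc(2))   # no violation within 2 steps
print(bmc(3))   # the counter reaches 3 after 3 steps
```

A real bounded model checker would instead translate I, T, and the property into CNF over the N * (k + 1) variables and hand the result to a SAT solver.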
Unrolling the Transition Relation
Circuit BMC Unrolling
Circuit BMC Unrolling Solution
Expressing Invariants
Final formula
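The formula from this slide is not reproduced here. In the notation of the Basic Setup slide, the standard BMC formula for checking an invariant up to bound k would read as follows, where the name P for the invariant is an assumption:

```latex
\[
  I(s_0) \;\wedge\; \bigwedge_{i=0}^{k-1} T(s_i, s_{i+1}) \;\wedge\; \bigvee_{i=0}^{k} \neg P(s_i)
\]
```

The first two conjuncts describe the executions of length k starting from an initial state; the last one requires that P fails in at least one of the states s0, ..., sk.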
Reachability Diameter If the formula is unsatisfiable, we have proved that there is no execution of length at most 3 that violates the invariant. Clearly, for every finite state system there is some bound d, called the reachability diameter, such that every reachable state is reachable from the initial state with an execution of length at most d. By taking d = 2^n, where n is the number of state bits, we could guarantee completeness. Unfortunately, computing better approximations of d is computationally hard in the general case.
BMC: Pros and Cons Pros: Boolean formulas can be more compact than BDDs; leverages efficient SAT-solver technology; minimal-length counterexamples (often, not always). Cons: the basic method is incomplete; not always better than BDD-based methods or explicit state model checking.
References Zchaff: A fast SAT solver. 2002. http://bears.ece.ucsb.edu/class/256bd/RCFB256B3450-Zchoff.pdf A survey of recent advances in SAT-based formal verification, Mukul Prasad, Armin Biere, Aarti Gupta, in Software Tools for Technology. Boolean Satisfiability Solver Performance Comparison. Conclusion. Chaff: VSIDS(Variable State Independent Decaying Sum). http://www.cis.upenn.edu/~lee/02cis640/slides/SAT.ppt