SAT, Interpolants and Software Model Checking Ken McMillan Cadence Berkeley Labs.

Presentation transcript:

SAT, Interpolants and Software Model Checking Ken McMillan Cadence Berkeley Labs

Applications of SAT solvers
SAT solvers have been applied in many ways in software verification:
BMC of programs using SAT (e.g., CBMC)
SAT solvers in decision procedures
–Eager approach (e.g., UCLID)
–Lazy approach (Verifun, ICS, many others)
SAT-based image computation
–Applied to predicate abstraction (Lahiri, et al)
...
We will consider instead the lessons learned from solving SAT that can be applied to software verification.

Outline
SAT solvers
–How do they work?
–What general lessons can we learn from the experience?
Software model checking survey
–How various methods do or do not embody lessons from SAT
A modest proposal
–An attempt to apply the lessons of SAT to software verification

SAT solvers
DP (variable elimination) → DPLL (backtrack search) → SATO, GRASP, CHAFF, etc.
Solvers characterized by
–Exhaustive BCP
–Conflict-driven learning (resolution)
–Deduction-based decision heuristics

Lesson #1: Be Lazy
DP approach
–Eliminate variables by exhaustive resolution
–Extremely eager: deduces all facts about remaining variables
–Essentially quantifier elimination -- explodes.
DPLL approach
–Lazy: only resolves clauses when model search fails
–Resolution used as a form of failure generalization
–Learns general facts from model search failure
Implications:
1. Make expensive deductions only when their relevance can be justified.
2. Don't do quantifier elimination.
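To make "exhaustive resolution" concrete, here is a minimal Python sketch of one DP-style variable-elimination step. It assumes clauses are frozensets of DIMACS-style integer literals; the function name `eliminate_variable` is illustrative and not taken from any solver.

```python
from itertools import product

def eliminate_variable(clauses, var):
    """One Davis-Putnam step: resolve every clause containing `var`
    against every clause containing `-var`, then drop the originals.

    Clauses are frozensets of non-zero ints (DIMACS-style literals).
    """
    positive = [c for c in clauses if var in c]
    negative = [c for c in clauses if -var in c]
    untouched = {c for c in clauses if var not in c and -var not in c}
    resolvents = set()
    for p, n in product(positive, negative):
        resolvent = (p - {var}) | (n - {-var})
        # A resolvent containing both l and -l is a tautology; skip it.
        if not any(-lit in resolvent for lit in resolvent):
            resolvents.add(frozenset(resolvent))
    return untouched | resolvents

# Three clauses with literal 1 and three with literal -1.
clauses = {frozenset(c) for c in
           ([1, 2], [1, 3], [1, 4], [-1, 5], [-1, 6], [-1, 7])}
after = eliminate_variable(clauses, 1)
print(len(clauses), "->", len(after))
```

With m positive and n negative occurrences, one step can produce up to m·n resolvents, all deduced whether or not any of them is ever needed. That is the eagerness the slide warns about.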

Lesson #2: Be Eager
In a DPLL solver, we always close deduction under unit resolution (BCP) before making a decision.
–Guides decision making in model search
–Guides resolution steps in failure generalization
–BCP updated after decision making and clause learning
Implications:
1. Be eager with inexpensive deduction.
2. Deduce all the cheap facts before trying any expensive ones.
3. Let the expensive deduction drive the cheap deduction.
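Closing an assignment under unit resolution can be sketched in a few lines of Python. This is a naive fixpoint loop for illustration only; production solvers use watched literals and other optimizations. The clause encoding (lists of DIMACS-style integer literals) and the name `unit_propagate` are assumptions of this sketch.

```python
def unit_propagate(clauses, assignment):
    """Close `assignment` under unit resolution (BCP).

    clauses: list of lists of non-zero ints (DIMACS-style literals).
    assignment: dict mapping variable -> bool (not modified).
    Returns ("conflict", partial) or ("ok", closed_assignment).
    """
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned = []
            satisfied = False
            for lit in clause:
                var, wanted = abs(lit), lit > 0
                if var not in assignment:
                    unassigned.append(lit)
                elif assignment[var] == wanted:
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                # Every literal is false: the clause is violated.
                return "conflict", assignment
            if len(unassigned) == 1:
                # Unit clause: the remaining literal is forced.
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0
                changed = True
    return "ok", assignment

status, model = unit_propagate([[1], [-1, 2], [-2, 3]], {})
```

A DPLL loop would call this after every decision and after every learned clause, so the cheap consequences are always up to date before the next expensive step.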

Lesson #3: Learn from the Past
Facts useful in one particular case are likely to be useful in other cases.
This principle is embodied in
–Clause learning
–Deduction-based decision heuristics (e.g., VSIDS)
Implication: Deduce facts that have been useful in the past.
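As a rough illustration of a deduction-based decision heuristic, the sketch below keeps an activity score per variable, bumps it whenever the variable appears in a learned clause, and decays all scores periodically. It is a simplified variant in the spirit of VSIDS, not the scheme of any particular solver; the class name, the per-variable (rather than per-literal) scores and the decay factor are assumptions.

```python
class ActivityHeuristic:
    """Simplified VSIDS-like scoring: recently learned clauses steer decisions."""

    def __init__(self, variables, decay=0.5):
        self.score = {v: 0.0 for v in variables}
        self.decay = decay

    def on_learned_clause(self, clause):
        # Variables involved in a recent conflict become more attractive.
        for lit in clause:
            self.score[abs(lit)] += 1.0

    def decay_all(self):
        # Older conflicts count for less than recent ones.
        for v in self.score:
            self.score[v] *= self.decay

    def pick(self, assigned):
        """Return the unassigned variable with the highest score
        (lowest index on ties), or None if all are assigned."""
        candidates = [v for v in self.score if v not in assigned]
        if not candidates:
            return None
        return max(candidates, key=lambda v: (self.score[v], -v))

heuristic = ActivityHeuristic([1, 2, 3])
heuristic.on_learned_clause([-2, 3])
heuristic.on_learned_clause([3])
first_choice = heuristic.pick(set())
```

Here variable 3 has appeared in two learned clauses, so it is chosen first. The decision order is driven by which facts were useful in past conflicts.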

Static Analysis
Compute the least fixed-point of an abstract transformer
–This is the strongest invariant the analysis can provide
Inexpensive analyses:
–value set analysis
–affine equalities, etc.
These analyses lose information at a merge: merging x = y with x = z yields T.
Be eager with inexpensive deductions: ✓
Be lazy with expensive deductions: X
Learn from the past: N/A

Predicate abstraction
Abstract transformer:
–strongest Boolean postcondition over given predicates
Advantage: does not lose information at a merge
–join is disjunction: merging x = y with x = z yields $x=y \lor x=z$
Disadvantage:
–Abstract post is very expensive!
–Computes information about predicates with no relevance justification
Be eager with inexpensive deductions: X
Be lazy with expensive deductions: X
Learn from the past: N/A
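The difference between the two merge behaviours can be shown with a toy Python model. It is a deliberate simplification: facts are opaque strings, a conjunctive state is a set of facts, and a disjunctive state is a set of cubes. The function names are illustrative.

```python
def join_conjunctive(facts_a, facts_b):
    """Join in a domain of conjunctions of facts: keep only the facts
    both branches agree on. The empty set stands for T (no information)."""
    return facts_a & facts_b

def join_disjunctive(cubes_a, cubes_b):
    """Join in a disjunctive domain: a state is a set of cubes
    (each cube a frozenset of predicates) and join is set union."""
    return cubes_a | cubes_b

then_branch = frozenset({"x=y"})
else_branch = frozenset({"x=z"})

merged_conj = join_conjunctive(then_branch, else_branch)      # T
merged_disj = join_disjunctive({then_branch}, {else_branch})  # x=y or x=z
```

The conjunctive join is cheap but forgets both equalities. The disjunctive join keeps them, at the price of a state that can grow with every merge.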

PA with CEGAR loop
Loop: choose initial T#; model check abstraction T# (true: done; otherwise a Cex); can the Cex be extended from T# to T? (yes: Cex; no: add predicates to T# and model check again)
Choose predicates to refute cex's
–Generalizes failures
–Some relevance justification
Still performs expensive deduction without justification
–strongest Boolean postcondition
Fails to learn from past
–Start fresh each iteration
–Forgets expensive deductions
Be eager with inexpensive deductions: X
Be lazy with expensive deductions: X+
Learn from the past: X

Boolean Programs
Abstract transformer
–Weaker than predicate abstraction
–Evaluates predicates independently -- loses correlations
–Predicate abstraction: {T} x=y; {$x=0 \Leftrightarrow y=0$}
–Boolean programs: {T} x=y; {T}
Advantages
–Computes less expensive information eagerly
Disadvantages
–Still computes expensive information without justification
–Still uses CEGAR loop
Be eager with inexpensive deductions: X
Be lazy with expensive deductions: X++
Learn from the past: X

Lazy Predicate Abstraction
Unwind the program CFG into a tree
–Refine paths as needed to refute errors
(Figure: an unwinding tree in which a path through x=y and y=0 reaches ERR!)
Add predicates along path to allow refutation of error
Refinement is local to an error path
Search continues after refinement
–Do not start fresh -- no big CEGAR loop
Previously useful predicates applied to new vertices

Lazy Predicate Abstraction
Be eager with inexpensive deductions: X
Be lazy with expensive deductions: ✓-
Learn from the past: ✓

SAT-based BMC
Pipeline: Program → Loop Unwinding → Convert to Bit Level → SAT
Inherits all the properties of SAT
Deduction limited to propositional logic
–Cannot directly infer facts like $x \le y$
–Inexpensive deduction limited to BCP
Be eager with inexpensive deductions: ✓--
Be lazy with expensive deductions: ✓
Learn from the past: ✓

SAT-based with Static Analysis
Pipeline: Program → Loop Unwinding → Convert to Bit Level → SAT, with Static Analysis alongside (figure example: x=y; x=z; with a decision x=z)
Allows richer class of inexpensive deductions
Inexpensive deductions not updated after decisions and clause learning
–Coupling could be tighter
–Perhaps using lazy decision procedures?
Be eager with inexpensive deductions: ✓-
Be lazy with expensive deductions: ✓
Learn from the past: ✓

Lazy abstraction and interpolants
A way to apply the lessons of SAT to lazy abstraction
Keep the advantages of lazy abstraction...
–Local refinement (be lazy)
–No "big loop" as in CEGAR (learn from the past)
...while avoiding the disadvantages of predicate abstraction...
–no eager image computation
...and propagating inexpensive deductions eagerly
–as in static analysis

Interpolation Lemma (Craig, 57)
Notation: $L(\phi)$ is the set of FO formulas over the symbols of $\phi$.
If $A \land B = \mathit{false}$, there exists an interpolant $A'$ for $(A, B)$ such that:
–$A \Rightarrow A'$
–$A' \land B = \mathit{false}$
–$A' \in L(A) \cap L(B)$
Example:
–$A = p \land q$, $B = \neg q \land r$, $A' = q$
Interpolants from proofs
–In certain quantifier-free theories, we can obtain an interpolant for a pair $A, B$ from a refutation in linear time. [McMillan 05]
–In particular, we can have linear arithmetic, uninterpreted functions, and restricted use of arrays.
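The three conditions of the lemma can be checked by brute force for propositional formulas. The Python sketch below reads the slide's example as A = p ∧ q, B = ¬q ∧ r, A' = q. Formulas are modeled as Python predicates over an environment dict, and the helper name `is_interpolant` is an assumption of this sketch. It only verifies a candidate interpolant; it does not derive one from a proof.

```python
from itertools import product

def is_interpolant(a, b, itp, a_vars, b_vars, itp_vars):
    """Check by truth table that `itp` is an interpolant for (a, b):
    a implies itp, itp and b is unsatisfiable, and itp uses only
    symbols common to a and b."""
    if not set(itp_vars) <= (set(a_vars) & set(b_vars)):
        return False
    all_vars = sorted(set(a_vars) | set(b_vars))
    for values in product([False, True], repeat=len(all_vars)):
        env = dict(zip(all_vars, values))
        if a(env) and not itp(env):
            return False  # a does not imply itp
        if itp(env) and b(env):
            return False  # itp is consistent with b
    return True

def formula_a(env):
    return env["p"] and env["q"]

def formula_b(env):
    return (not env["q"]) and env["r"]

def candidate(env):
    return env["q"]

ok = is_interpolant(formula_a, formula_b, candidate,
                    {"p", "q"}, {"q", "r"}, {"q"})
```

The check also shows why neither p nor True would do here: p is not a shared symbol, and True is consistent with B.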

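Interpolants for sequences
Let $A_1 \ldots A_n$ be a sequence of formulas.
A sequence $A'_0 \ldots A'_n$ is an interpolant for $A_1 \ldots A_n$ when
–$A'_0 = \mathit{True}$
–$A'_{i-1} \land A_i \Rightarrow A'_i$, for $i = 1..n$
–$A'_n = \mathit{False}$
–and finally, $A'_i \in L(A_1 \ldots A_i) \cap L(A_{i+1} \ldots A_n)$
(Figure: the formulas $A_1, A_2, A_3, \ldots, A_k$ in a row, with $\mathit{True}, A'_1, A'_2, A'_3, \ldots, A'_{k-1}, \mathit{False}$ beneath them, each implying the next across the formula between them.)
In other words, the interpolant is a structured refutation of $A_1 \ldots A_n$.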

Interpolants as Floyd-Hoare proofs
Path: x=y; y++; [x=y]
SSA sequence: $x_1 = y_0$, $y_1 = y_0 + 1$, $x_1 = y_1$
Interpolant: $\mathit{True}$, $x_1 = y_0$, $y_1 > x_1$, $\mathit{False}$
1. Each formula implies the next
2. Each is over common symbols of prefix and suffix
3. Begins with true, ends with false
Path refinement procedure: SSA sequence → Prover → proof → Interpolation → structured proof → Path Refinement
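The sequence-interpolant conditions for this path can be sanity-checked by enumeration. The Python sketch below assumes the SSA steps are x1 = y0, y1 = y0 + 1 and x1 = y1 (the guard [x=y]), and tests the labels True, x1 = y0, y1 > x1, False over a small integer range. A bounded enumeration is evidence, not a proof; the helper name and the bound are assumptions, and the symbol condition is not checked.

```python
from itertools import product

def check_sequence_interpolant(steps, labels, variables, bound=4):
    """Bounded check that labels[0] is True, labels[-1] is False and
    labels[i] and steps[i] together imply labels[i + 1]."""
    assert len(labels) == len(steps) + 1
    for values in product(range(-bound, bound + 1), repeat=len(variables)):
        env = dict(zip(variables, values))
        if not labels[0](env) or labels[-1](env):
            return False
        for i, step in enumerate(steps):
            if labels[i](env) and step(env) and not labels[i + 1](env):
                return False
    return True

steps = [
    lambda e: e["x1"] == e["y0"],        # x = y
    lambda e: e["y1"] == e["y0"] + 1,    # y++
    lambda e: e["x1"] == e["y1"],        # [x = y]
]
labels = [
    lambda e: True,
    lambda e: e["x1"] == e["y0"],
    lambda e: e["y1"] > e["x1"],
    lambda e: False,
]
valid = check_sequence_interpolant(steps, labels, ["x1", "y0", "y1"])
```

Read as a Floyd-Hoare proof, the labels say: after x=y we know x = y, after y++ we know y > x, and so the guard [x=y] can never be taken.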

Lazy abstraction -- an example
Program fragment:
    do {
        lock();
        old = new;
        if (*) {
            unlock();
            new++;
        }
    } while (new != old);
Control-flow graph edges: L=0; L=1; old=new; [L!=0]; L=0; new++; [new==old]; [new!=old]

Unwinding the CFG
(Figure: the control-flow graph beside its unwinding. Vertex 0 is labeled T, vertex 1 is reached by L=0 and vertex 2 by [L!=0]; after refinement the labels along the path are T, L=0 and F.)
Label error state with false, by refining labels on path.

Unwinding the CFG
(Figure: the unwinding extended with vertex 3 via L=1; old=new, vertex 4 via L=0; new++, vertex 5 via [new!=old] and vertex 6 via [L!=0].)
Covering: state 5 is subsumed by state 1.

Unwinding the CFG
(Figure: the unwinding extended further, with vertices 6 through 11 and labels including L=0, old=new, T and F.)
Another cover. Unwinding is now complete.

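Covering step
If $\psi(x) \Rightarrow \psi(y)$...
–add covering arc $x \rhd y$
–remove all $z \rhd w$ for $w$ descendant of $y$
(Figure: a vertex labeled $x=y$ covered by a vertex labeled $x \le y$; an existing covering arc is crossed out.)
We restrict covers to be descending in a suitable total order on vertices. This prevents covering from diverging.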

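Refinement step
Label an error vertex False by refining the path to that vertex with an interpolant for that path.
By refining with interpolants, we avoid predicate image computation.
(Figure: an unwinding with edges x = 0, [x=y], [x≠y], y++, [y=0] and y=2, all vertices initially labeled T; the refined path carries labels x=0, y=0, y≠0 and F, and one covering arc is crossed out.)
Refinement may remove covers.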

Forced cover
Try to refine a sub-path to force a cover
–show that path from nearest common ancestor of x, y proves $\psi(x)$ at y
(Figure: the same unwinding, with the sub-path to refine marked "refine this path" and the label y≠0 established at its end.)
Forced covers allow us to efficiently handle nested control structure.

Incremental static analysis
Update static analysis of unwinding incrementally
–Static analysis can prevent many interpolant-based refinements
–Interpolant-based refinements can refine static analysis
(Figure: an unwinding with branches [x=z] and [x≠z] assigning y=1 and y=2, giving the value set {1,2} for y. Annotations: "y=2 from value set analysis", "refine this path", "value set refined".)

Applying the lessons from SAT
Be lazy with expensive deductions
–All path refinements justified
–No eager predicate image computation
Be eager with inexpensive deductions
–Static analysis updated after all changes
–Refinement and static analysis interact
Learn from the past
–Refinements incremental: no big CEGAR loop
–Re-use of historically useful facts by forced covering

Experiments
Windows device driver benchmarks from BLAST benchmark suite
–programs flattened to "simple goto programs"
Compare performance against BLAST, a lazy predicate abstraction tool
No static analysis.
name      source LOC  SGP LOC
kbfiltr   12K         2.3K
diskperf  14K         3.9K
cdaudio   44K         6.3K
floppy    18K         8.7K
parclass  138K        8.8K
parport   61K         13K
(The BLAST (s), IMPACT (s), BLAST and IMPACT columns carry no values in this transcript.)
Almost all BLAST time spent in predicate image operation.

The Saga Continues
After these results, Ranjit Jhala modified BLAST
–vertices inherit predicates from their parents, reducing refinements
–fewer refinements allows more predicate localization
Impact also made more eager, using value set analysis
name      source LOC  SGP LOC
kbfiltr   12K         2.3K
diskperf  14K         3.9K
cdaudio   44K         6.3K
floppy    18K         8.7K
parclass  138K        8.8K
parport   61K         13K
(The BLAST (s), IMPACT (s), BLAST and IMPACT columns carry no values in this transcript.)

Conclusions
Caveats
–Comparing different implementations is dangerous
–More and better software model checking benchmarks are needed
Tentative conclusions
–For control-dominated codes, predicate abstraction is too "eager"; better to be more lazy about expensive deductions
–Propagating inexpensive deductions can produce substantial speedup: roughly one order of magnitude for Windows examples
–Perhaps by applying the lessons of SAT, we can obtain the same kind of rapid performance improvements obtained in that area
Note 2-3 orders of magnitude speedup in lazy model checking in 6 months!

Future work
Procedure summaries
–Many similar subgraphs in unwinding due to procedure expansions
–Cannot handle recursion
–Can we use interpolants to compute approximate procedure summaries?
Quantified interpolants
–Can be used to generate program invariants with quantifiers
–Works for simple examples, but need to prevent number of quantifiers from increasing without bound
Richer theories
–In this work, all program variables modeled by integers
–Need an interpolating prover for bit vector theory
Concurrency...

Unwinding the CFG
An unwinding is a tree with an embedding in the CFG.
(Figure: the control-flow graph and an unwinding of it, related by a vertex map $M_v$ and an edge map $M_e$.)

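Expansion
Every non-leaf vertex of the unwinding must be fully expanded...
(Figure: vertex 0 of the unwinding mapped by $M_v$ to a CFG vertex with outgoing edge L=0, mapped by $M_e$ from the unwinding edge to vertex 1.)
If this is not a leaf...
...and this exists...
...then this exists.
...but we allow unexpanded leaves (i.e., we are building a finite prefix of the infinite unwinding)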

Labeled unwinding
A labeled unwinding is equipped with...
–a labeling function $\psi : V \to L(S)$
–a covering relation $\rhd \subseteq V \times V$
(Figure: the example unwinding with its labels and covering arcs.)
These two nodes are covered (they have an ancestor at the tail of a covering arc).

Well-labeled unwinding
An unwinding is well-labeled when...
–$\psi(\epsilon) = \mathit{True}$, where $\epsilon$ is the root
–every edge is a valid Hoare triple
–if $x \rhd y$ then $y$ not covered
(Figure: the example unwinding with its labels and covering arcs.)

Safe and complete
An unwinding is
–safe if every error vertex is labeled False
–complete if every nonterminal leaf is covered
(Figure: the completed example unwinding, with error vertices labeled F and the remaining leaves covered.)
Theorem: A CFG with a safe complete unwinding is safe.
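The two definitions translate directly into checks over a small tree representation. The Python sketch below is a toy model: the unwinding is a parent map, labels are strings, and a vertex counts as covered when it or one of its ancestors is the tail of a covering arc. Counting the vertex itself as its own ancestor is an assumption here, and all names are illustrative.

```python
def ancestors_or_self(vertex, parent):
    """Yield `vertex`, its parent, its grandparent, ... up to the root."""
    while vertex is not None:
        yield vertex
        vertex = parent.get(vertex)

def is_covered(vertex, parent, covering_arcs):
    """A vertex is covered if it or an ancestor is the tail of a covering arc."""
    tails = {tail for (tail, _head) in covering_arcs}
    return any(v in tails for v in ancestors_or_self(vertex, parent))

def is_safe(error_vertices, label):
    """Safe: every error vertex is labeled False."""
    return all(label[v] == "False" for v in error_vertices)

def is_complete(leaves, terminal, parent, covering_arcs):
    """Complete: every nonterminal leaf is covered."""
    return all(is_covered(v, parent, covering_arcs)
               for v in leaves if v not in terminal)

# Toy unwinding: 0 -> 1 -> {2 (error, terminal), 3 (leaf covered by 1)}.
parent = {0: None, 1: 0, 2: 1, 3: 1}
label = {0: "True", 1: "L=0", 2: "False", 3: "L=0"}
covering_arcs = {(3, 1)}

safe = is_safe({2}, label)
complete = is_complete([2, 3], {2}, parent, covering_arcs)
```

When both checks hold, the theorem on the slide says the underlying CFG is safe, so a verifier built this way can stop expanding.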

Unwinding steps
Three basic operations:
–Expand a nonterminal leaf
–Cover: add a covering arc
–Refine: strengthen labels along a path so error vertex labeled False

Overall algorithm
1. Do as much covering as possible
2. If a leaf can't be covered, try forced covering
3. If the leaf still can't be covered, expand it
4. Label all error states False by refining with an interpolant
5. Continue until unwinding is safe and complete