Software Model Checking with SMT Ken McMillan Microsoft Research TexPoint fonts used in EMF: A A A A A.

Slides:



Advertisements
Similar presentations
Model Checking Base on Interoplation
Advertisements

SMELS: Sat Modulo Equality with Lazy Superposition Christopher Lynch – Clarkson Duc-Khanh Tran - MPI.
Automated abstraction refinement II Heuristic aspects Ken McMillan Cadence Berkeley Labs.
The behavior of SAT solvers in model checking applications K. L. McMillan Cadence Berkeley Labs.
Exploiting SAT solvers in unbounded model checking
Functional Decompositions for Hardware Verification With a few speculations on formal methods for embedded systems Ken McMillan.
Quantified Invariant Generation using an Interpolating Saturation Prover Ken McMillan Cadence Research Labs TexPoint fonts used in EMF: A A A A A.
Exploiting SAT solvers in unbounded model checking K. L. McMillan Cadence Berkeley Labs.
Consequence Generation, Interpolants, and Invariant Discovery Ken McMillan Cadence Berkeley Labs.
Applications of Craig Interpolation to Model Checking K. L. McMillan Cadence Berkeley Labs.
Copyright 2000 Cadence Design Systems. Permission is granted to reproduce without modification. Introduction An overview of formal methods for hardware.
Automated Theorem Proving Lecture 1. Program verification is undecidable! Given program P and specification S, does P satisfy S?
Quantified Invariant Generation using an Interpolating Saturation Prover Ken McMillan Cadence Research Labs TexPoint fonts used in EMF: A A A A A.
Abstraction of Source Code (from Bandera lectures and talks)
50.530: Software Engineering
Introducing Formal Methods, Module 1, Version 1.1, Oct., Formal Specification and Analytical Verification L 5.
Automatic Verification Book: Chapter 6. What is verification? Traditionally, verification means proof of correctness automatic: model checking deductive:
50.530: Software Engineering Sun Jun SUTD. Week 10: Invariant Generation.
UIUC CS 497: Section EA Lecture #2 Reasoning in Artificial Intelligence Professor: Eyal Amir Spring Semester 2004.
Proofs from SAT Solvers Yeting Ge ACSys NYU Nov
Hoare’s Correctness Triplets Dijkstra’s Predicate Transformers
Rigorous Software Development CSCI-GA Instructor: Thomas Wies Spring 2012 Lecture 11.
Timed Automata.
Interpolation and Widening Ken McMillan Microsoft Research TexPoint fonts used in EMF: A A A A A.
Logic as the lingua franca of software verification Ken McMillan Microsoft Research TexPoint fonts used in EMF: A A A A A Joint work with Andrey Rybalchenko.
Interpolants from Z3 proofs Ken McMillan Microsoft Research TexPoint fonts used in EMF: A A A A A.
Rigorous Software Development CSCI-GA Instructor: Thomas Wies Spring 2012 Lecture 13.
August Moscow meeting1August Moscow meeting1August Moscow meeting11 Deductive tools in insertion modeling verification A.Letichevsky.
Copyright © 2006 Addison-Wesley. All rights reserved.1-1 ICS 410: Programming Languages Chapter 3 : Describing Syntax and Semantics Axiomatic Semantics.
SAT and Model Checking. Bounded Model Checking (BMC) A.I. Planning problems: can we reach a desired state in k steps? Verification of safety properties:
An Integration of Program Analysis and Automated Theorem Proving Bill J. Ellis & Andrew Ireland School of Mathematical & Computer Sciences Heriot-Watt.
The Software Model Checker BLAST by Dirk Beyer, Thomas A. Henzinger, Ranjit Jhala and Rupak Majumdar Presented by Yunho Kim Provable Software Lab, KAIST.
Revisiting Generalizations Ken McMillan Microsoft Research Aws Albarghouthi University of Toronto.
Constraint Logic Programming Ryan Kinworthy. Overview Introduction Logic Programming LP as a constraint programming language Constraint Logic Programming.
Proof-system search ( ` ) Interpretation search ( ² ) Main search strategy DPLL Backtracking Incremental SAT Natural deduction Sequents Resolution Main.
Formal Verification Group © Copyright IBM Corporation 2008 IBM Haifa Labs SAT-based unbounded model checking using interpolation Based on a paper “Interpolation.
Last time Proof-system search ( ` ) Interpretation search ( ² ) Quantifiers Equality Decision procedures Induction Cross-cutting aspectsMain search strategy.
Semantics with Applications Mooly Sagiv Schrirber html:// Textbooks:Winskel The.
ESE601: Hybrid Systems Introduction to verification Spring 2006.
1 Abstraction Refinement for Bounded Model Checking Anubhav Gupta, CMU Ofer Strichman, Technion Highly Jet Lagged.
Describing Syntax and Semantics
1 Formal Engineering of Reliable Software LASER 2004 school Tutorial, Lecture1 Natasha Sharygina Carnegie Mellon University.
Invisible Invariants: Underapproximating to Overapproximate Ken McMillan Cadence Research Labs TexPoint fonts used in EMF: A A A A A.
272: Software Engineering Fall 2012 Instructor: Tevfik Bultan Lecture 4: SMT-based Bounded Model Checking of Concurrent Software.
Formal Methods 1. Software Engineering and Formal Methods  Every software engineering methodology is based on a recommended development process  proceeding.
1 Program Correctness CIS 375 Bruce R. Maxim UM-Dearborn.
SAT and SMT solvers Ayrat Khalimov (based on Georg Hofferek‘s slides) AKDV 2014.
Introduction to Satisfiability Modulo Theories
Ethan Jackson, Nikolaj Bjørner and Wolfram Schulte Research in Software Engineering (RiSE), Microsoft Research 1. A FORMULA for Abstractions and Automated.
Formal Verification Lecture 9. Formal Verification Formal verification relies on Descriptions of the properties or requirements Descriptions of systems.
Lazy Annotation for Program Testing and Verification Speaker: Chen-Hsuan Adonis Lin Advisor: Jie-Hong Roland Jiang November 26,
Ch. 13 Ch. 131 jcmt CSE 3302 Programming Languages CSE3302 Programming Languages (notes?) Dr. Carter Tiernan.
Scientific Debugging. Errors in Software Errors are unexpected behaviors or outputs in programs As long as software is developed by humans, it will contain.
Symbolic and Concolic Execution of Programs Information Security, CS 526 Omar Chowdhury 10/7/2015Information Security, CS 5261.
Nikolaj Bjørner Microsoft Research DTU Winter course January 2 nd 2012 Organized by Flemming Nielson & Hanne Riis Nielson.
1 First order theories (Chapter 1, Sections 1.4 – 1.5) From the slides for the book “Decision procedures” by D.Kroening and O.Strichman.
SAT-Based Model Checking Without Unrolling Aaron R. Bradley.
Static Techniques for V&V. Hierarchy of V&V techniques Static Analysis V&V Dynamic Techniques Model Checking Simulation Symbolic Execution Testing Informal.
Daniel Kroening and Ofer Strichman Decision Procedures An Algorithmic Point of View Deciding Combined Theories.
CS357 Lecture 13: Symbolic model checking without BDDs Alex Aiken David Dill 1.
Daniel Kroening and Ofer Strichman 1 Decision Procedures An Algorithmic Point of View Basic Concepts and Background.
CSC3315 (Spring 2009)1 CSC 3315 Languages & Compilers Hamid Harroud School of Science and Engineering, Akhawayn University
Knowledge Repn. & Reasoning Lecture #9: Propositional Logic UIUC CS 498: Section EA Professor: Eyal Amir Fall Semester 2005.
© Anvesh Komuravelli Spacer Model Checking with Proofs and Counterexamples Anvesh Komuravelli Carnegie Mellon University Joint work with Arie Gurfinkel,
SMT-Based Verification of Parameterized Systems
Solving Linear Arithmetic with SAT-based MC
Automating Induction for Solving Horn Clauses
Introduction to Software Verification
Introduction to verification
Predicate Abstraction
Presentation transcript:

Software Model Checking with SMT Ken McMillan Microsoft Research TexPoint fonts used in EMF: A A A A A

Software model checking and SMT In checking properties of software as opposed to hardware or protocols, we have several interesting issues to deal with: –Software is not finite-state –Extensive use is made of arithmetic –We have arrays and linked data structures The theory reasoning capabilities of SMT solvers are thus extensively used in software model checking. This also means software verification relies heavily on abstraction. Moreover, software has structure that we can exploit such as –Sequential control flow –Procedural abstraction Software model checking algorithms are specialized to exploit this structure.

Role of SMT In this lecture, we'll consider the uses of SMT in software model checking. SMT is used in three distinct roles: –Counterexample generation (from satisfying assignments) –Testing and strengthening conjectures (from UNSAT cores) –Abstraction refinement (from interpolants) In this lecture, we will focus in the latter two roles, using the SMT solver as an engine for generating proofs. Finding counterexamples using SMT is essentially similar to the finite state case. As we'll see, counterexamples are often a by-product of abstraction refinement.

Model Checking yes! no! p q ModelChecker p q SystemModel G(p ) F q)LogicalSpecificationCounterexample A great advantage of model checking is the ability to produce behavioral counterexamples to explain what is going wrong.

Temporal logic (LTL)...

Types of temporal properties We will focus on safety properties.

Safety and reachabilityI States = valuations of state variables Transitions = execution steps Initial state(s) F Bad state(s) Breadth-first search Counterexample!

Reachable state set I F Remove the “bug” Breadth-first search Fixed point = reachable state set Safety property verified! Model checking is a little more complex than this, but reachability captures the essence for our purposes. Model checking can find very subtle bugs in circuits and protocols, but suffers from state explosion, and doesn't terminate in the infinite-state case.

Abstraction Problem: verify a shallow property of a very large system Solution: Abstraction –Extract just the facts about the system state that are relevant to the proving the shallow property. An abstraction is a restricted deduction system that focuses our reasoning on relevant facts, and thus makes proof easier.

Relevance and refinement Problem: how do we decide what deductions are relevant to a property? –Is relevance even a well defined notion? Relevance: –A relevant deduction is one that is used in a simple proof of the desired property. Generalization principle: –Deductions used in the proof of special cases tend to be relevant to the overall proof.

Proofs A proof is a series of deductions, from premises to conclusions Each deduction is an instance of an inference rule Usually, we represent a proof as a tree... P1P1P1P1 P2P2P2P2 P3P3P3P3 P4P4P4P4 P5P5P5P5 CPremisesConclusion P 1 P 2 C If the conclusion is “false”, the proof is a refutation SMT solvers can be instrumented to provide refutations in the UNSAT case.

Inference rules The inference rules depend on the theory we are reasoning in _ _ _ _  Resolution rule: Boolean logic Linear arithmetic x1 · y1x1 · y1x1 · y1x1 · y1 x2 · y2x2 · y2x2 · y2x2 · y2 x 1 +x 2 · y 1 +y 2 Sum rule:

Reachable states: complex Inductive invariants I F A Boolean-valued formula over the system state  :::: Partitions the state space into two regions Forms a barrier between the initial states and bad states No transitions cross this way Inductive invariant: simple!

Invariants and relevance A predicate is relevant if it is used in a simple inductive invariant l 1 : x = y = 0; l 2 : while(*) l 3 : x++, y++; l 4 : while(x != 0) l 5 : x--, y--; l 6 : assert (y == 0); state variables: pc, x, y inductive invariant = property + pc = l 1 Ç x = y pc = l 1 and x = yRelevant predicates: pc = l 1 and x = y Irrelevant (but provable) predicate: x ¸ 0Irrelevant (but provable) predicate: x ¸ 0 property: pc = l 6 ) y = 0

What is Abstraction By abstraction, we mean something like "reasoning with limited information". The purpose of abstraction is to let us ignore irrelevant details, and thus simplify our reasoning. In abstract interpretation, we think of an abstraction as a restricted domain of information about the state of a system. Here, we will take a slightly broader view: An abstraction is a restricted deduction system We can think of an abstraction as a language for expressing facts, and a set of deduction rules for inferring conclusions in that language.

The function of abstraction The function of abstraction is to reduce the cost of proof search by reducing the space of proofs. RichDeductionSystem Abstraction Automated tool can search this space for a proof. An abstraction is a way to express our knowledge of what deductions may be relevant to proving a particular fact.

Symbolic transition systems

Proof by Inductive Invariant Many different choices have been made in practice. We will discuss a few...

Inductive invariants from fixed points In abstract interpretation, we use our abstraction to deduce facts about executions of increasing length until a fixed point is reached.

Abstraction languages Difference bounds Affine equalities

Abstraction languages

Example Let's try some abstraction languages on an example... l 1 : x = y = 0; l 2 : while(*) l 3 : x++, y++; l 4 : while(x != 0) l 5 : x--, y--; l 6 : assert (y == 0); Difference bounds Affine equalities

Another example Let's try an even simpler example... l 1 : x = 0; l 2 : if(*) l 3 : x++; l 4 : else l 5 : x--; l 6 : assert (x != 0); Difference bounds Affine equalities

Deduction systems Up to now, we have implicitly assumed we have an oracle that can prove any valid formulas of the forms:

Localization abstraction

Example Boolean Programs A Boolean program is defined by a set of such facts. l 1 : int x = *; l 2 : if(x > 0){ l 3 : x--; l 4 : assert(x >= 0); l 5 : } In practice, we may add some disjunctions to our set of allowed deductions, to avoid adding more predicates.

Boolean Programs with SMT We can carry out deduction within our restricted proof system using an SMT solver. This amounts to testing a collection of conjectures. Example:

Boolean Programs with SMT FFFF FT?? TFFF TT x-- In one case, we get no information, one is infeasible

Conjecture explosion Eager deduction: conjectures tested without evidence that they are useful. Important general point: eager deduction must be limited to a tractable amount (example: Boolean constraint propagation in SAT) In case the proof fails because of weakening, we can recover by adding a conjecture that is relevant to the failure (lazy deduction)

Strengthening by UNSAT core Let's say we check this conjecture: We do this by checking unsatisfiability of the following formula: If UNSAT, the solver can produce an UNSAT core, e.g.: This gives us a strengthened fact: This can strengthen conjectures based on counterexamples (see IC3) and also reduce the space of conjectures to test.

Generalized counterexamples

Predicate abstraction and SMT Moving to predicate abstraction, we now have the same abstract language, but we allow to deduce disjunctions on the RHS, e.g., This is not fundamentally different from Boolean programs, but disjunctions can explode the conjecture set. In practice, disjunctions are usual not conjectured eagerly, but in response to proof failures (see Das and Dill method).

Invariant search In general, making the space of proofs smaller will make the proof search easier.

Relevance and abstraction The key to proving a property with abstraction is to choose a small space of deductions that are relevant to the property. How do we choose... –Predicates for predicate abstraction? –System components for localization? –Disjunctions for Boolean programs? Next, we will observe that deductions that are relevant to particular cases tend to be relevant in general. This gives us a methodology of abstraction refinement.

Basic refinement framework A common paradigm in theorem proving is to let proof search guide counterexample search and vice-versa. This occurs in CDCL, for example, where BCP guides model search and model search failure (conflict) generates new deduction (conflict clauses) ProofSearchCounterexamplesearch failure failure

Basic framework for software Abstraction and refinement are proof systems –spaces of possible proofs that we search Abstractor Refiner General proof system Incomplete Specialized proof system Completeprog. pf. special case cex. pf. of special case Refinement = augmenting abstractor’s proof system to replicate proof of special case generated by refiner. Narrow the abstractor’s proof space to relevant facts.

Background Simple program statements (and their Hoare axioms) {}{}{}{} [][][][] {  )  } {}{}{}{} x := e {  [e/x]} {}{}{}{} havoc x { 8 x  } A compound stmt is a sequence simple statements  1 ;...;  k A CFG (program) is an NFA whose alphabet is compound statements. – –The accepting states represent safety failures. x = 0; while(*) x++; x++; assert x >= 0; [x<0] x := x +1 x := 0

Hoare logic proofs Write H (L) for the Hoare logic over logical language L. A proof of program C in H (L) maps vertices of C to L such that: –the initial vertex is labeled True –the accepting vertices are labeled False –every edge is a valid Hoare triple. [x<0] x := x +1 x := 0 {True}{False} {x ¸ 0} This proves the failure vertex not reachable, or equivalently, no accepting path can be executed.

Path reductiveness An abstraction is path-reductive if, whenever it fails to prove program C, it also fails to prove some (finite) path of program C. Example, H (L) is path-reductive if L is finite L is finite L closed under disjunction/conjunction L closed under disjunction/conjunction Path reductiveness allows refinement by proof of paths. In place of “path”, we could use other program fragments, including restricted paths (with extra guards), paths with loops, procedure calls... We will focus on paths for simplicity.

Example x = y = 0; while(*) x++; y++; while(x != 0) x--; y--; assert (y == 0); x:=0;y:=0 x:=x+1; y:=y+1 [x  0]; x:=x-1; y:=y-1 [x=0]; [y  0] Try to prove with predicate abstraction, with predicates {x=0,y=0} Predicate abstraction with P is Hoare logic over the Boolean combinations of P

{x=0 Æ y=0} {x  0 Æ y  0} {True} {x = y} {False} {True} Unprovable path x = y = 0; x++; y++; [x!=0]; x--; y--; [x!=0]; x--; y--; [x == 0] [y != 0] Cannot prove with PA({x=0,y=0}) Ask refiner to prove it! Augment P with new predicate x=y. PA can replicate proof. {x = y Æ x=0} {x = y Æ x  0} {x = y} {False} {True} Abstraction refinement: Path unprovable to abstraction Path unprovable to abstraction Refiner proves Refiner proves Abstraction replicates proof Abstraction replicates proof

Path reductiveness Path reductive abstractions can be characterized by the path proofs they can replicate –Predicate abstraction over P replicates all the path proofs over Boolean combinations of P. –The Boolean program abstraction replicates all the path proofs over the cubes of P. For these cases, it is easy to find an augmentation that replicates a proof (if the proof is QF). In general, finding the least augmentation might be hard... But where do the path proofs come from?

Refinement methods Strongest postcondition (SLAM1) Weakest precondition (Magic,FSoft,Yogi) Interpolant methods –Feasible interpolation (BLAST, IMPACT) –Bounded provers (SATABS) –Constraint-based (ARMC) Local proof There are many technical approaches to this problems. All can be viewed, however, as a search for a local proof.

Interpolation Lemma If A  B = false, there exists an interpolant A' for (A,B) such that: A  A' A' ^ B = false A' 2 L (A) \ L (B) Example: –A = p  q, B =  q  r, A' = q [Craig,57] In many logics, an interpolant can be derived in linear time from a refutaion proofs of A ^ B.

Interpolants as Floyd-Hoare proofs False x 1 =y 0 True y 1 >x 1 ) ) ) 1. Each formula implies the next 2. Each is over common symbols of prefix and suffix 3. Begins with true, ends with false Proving in-line programs SSA sequence Prover Interpolation Hoare Proof proof x=y; y++; [x=y] x 1 = y 0 y 1 =y 0 +1 x1y1x1y1 {False} {x=y} {True} {y>x} x = y y++ [x == y]

Local proofs and interpolants x=y; y++; [y · x] x 1 =y 0 y 1 =y 0 +1 y1·x1y1·x1 y0 · x1y0 · x1y0 · x1y0 · x1 x 1 +1 · y 1 y 1 · x 1 +1 y 1 · y · 0 FALSE x1 · y0x1 · y0x1 · y0x1 · y0 y 0 +1 · y 1 TRUE x 1 · y x 1 +1 · y 1 FALSE This is an example of a local proof...

Definition of local proof x 1 =y 0 y 1 =y 0 +1 y1·x1y1·x1 y0y0y0y0 scope of variable = range of frames it occurs in y1y1y1y1 x1x1x1x1 vocabulary of frame = set of variables “in scope” {x 1,y 0 } {x 1,y 0,y 1 } {x 1,y 1 } x 1 +1 · y 1 x1 · y0x1 · y0x1 · y0x1 · y0 y 0 +1 · y 1 deduction “in scope” here Local proof: Every deduction written Every deduction written in vocabulary of some in vocabulary of some frame. frame.

Forward local proof x 1 =y 0 y 1 =y 0 +1 y1·x1y1·x1 {x 1,x 0 } {x 1,y 0,y 1 } {x 1,y 1 } Forward local proof: each deduction can be assigned a frame such that all the deduction arrows go forward. x 1 +1 · y 1 1 · 0 FALSE x1 · y0x1 · y0x1 · y0x1 · y0 y 0 +1 · y 1 For a forward local proof, the (conjunction of) assertions crossing frame boundary is an interpolant. TRUE x 1 · y x 1 +1 · y 1 FALSE

Reverse local proof x 1 =y 0 y 1 =y 0 +1 y1·x1y1·x1 {x 1,x 0 } {x 1,y 0,y 1 } {x 1,y 1 } Reverse local proof: each deduction can be assigned a frame such that all the deduction arrows go backward. For a reverse local proof, the negation of assertions crossing frame boundary is an interpolant. TRUE : y 0 +1 · x 1 : y 1 · x 1 FALSE y 0 +1 · y 1 y 0 +1 · x 1 x1 · y0x1 · y0x1 · y0x1 · y0 1 · 0 FALSE

General local proof x 1 =3y 0 x 1 · 2 1 · x 1 {x 1,y 0 } {x 1 } General local proof: each deduction can be assigned a frame, but deduction arrows can go either way. For a general local proof, the interpolants contain implications. TRUE x 1 · 2 ) x 1 · 0 x1 · 0x1 · 0x1 · 0x1 · 0 FALSE x1 · 0x1 · 0x1 · 0x1 · 0 1 · 0 FALSE

Refinement methods Strongest postcondition (SLAM1) Weakest precondition (Magic,FSoft,Yogi) Interpolant methods –Feasible interpolation (BLAST, IMPACT) –Bounded provers (SATABS) –Constraint-based (ARMC) Local proof

Refinement with SP The strongest post-condition of  w.r.t. progam , written SP( ,  ), is the strongest  such that {  }  {  }. The SP exactly characterizes the states reachable via . False x 1 =y 0 True y 1 >x 1 x=y; y++; [x=y] x 1 = y 0 y 1 =y 0 +1 x1y1x1y1 {False} {x=y} {True} {y=x+1} x = y y++ [y · x] Refinement with SP: Syntactic SP computation: {}{}{}{} [][][][] { Æ }{ Æ }{ Æ }{ Æ } {}{}{}{} x := e { 9 v  [v/x] Æ x = e[v/x]} {}{}{}{} havoc x { 9 x  } This is viewed as symbolic execution, but there is a simpler view.

SP as local proof Order the variables by their creation in SSA form: x 0 Â y 0 Â x 1 Â y 1 Â  Refinement with SP corresponds to local deduction with these rules:  x = e  e/x] x max. in  FALSE  unsat. We encode havoc specially in the SSA: havoc x x =  i where  i is a fresh Skolem constant fresh Skolem constant Think of the  i ’s as implicitly existentially quantified

SP example y 0 =  1 x 1 =y 0 y 1 =y 0 +1 y1·x1y1·x1 {x 1,y 0 } {x 1,y 0,y 1 } {x 1,y 1 } Ordering of rewrites ensures forward local proof. The (conjunction of) assertions crossing frame boundary is an interpolant with  i ’s existentially quantifed. TRUE 9  1 (x 1 =  1 Æ y 0 =  1 ) FALSE x 1 =  1 y 1 =  1 +1 y1 · 1y1 · 1y1 · 1y1 · 1  1 +1 ·  1 FALSE 9  1 (x 1 =  1 Æ y 1 =  1 +1) x 1 = y 0 y 1 = x We can use quantifier elimination if our logic supports it.

Refinement quality Refinement with SP and WP is incomplete –May exists a refinement that proves program but we never find one These are weak proof systems that tend to yield low-quality proofs Example program: x = y = 0; while(*) x++; y++; while(x != 0) x--; y--; assert (y == 0); {x == y} invariant:

{y = 0} {y = 1} {y = 2} {y = 1} {y = 0} {False} {True} {x = y} {False} {True} Execute the loops twice This simple proof contains invariants for both loops Predicates diverge as we unwind A practical method must somehow prevent this kind of divergence! x = y = 0; x++; y++; [x!=0]; x--; y--; [x!=0]; x--; y--; [x == 0] [y != 0] Refine with SP (and proof reduction) Same result with WP! We need refinement methods that can generate simple proofs!

Refinement methods Strongest postcondition (SLAM1) Weakest precondition (Magic,FSoft,Yogi) Interpolant methods –Feasible interpolation (BLAST, IMPACT) –Bounded provers (SATABS) –Constraint-based (ARMC) Local proof

Bounded Provers [SATABS] Define a (local) proof system –Can contain whatever proof rules you want Define a cost metric for proofs –For example, number of distinct predicates after dropping subscripts Exhaustive search for lowest cost proof –May restrict to forward or reverse proofs  x = e  e/x] x max. in  FALSE  unsat. Allow simple arithmetic rewriting.

Loop example x 0 = 0 y 0 = 0 x 1 =x 0 +1 y 1 =y 0 +1 TRUE x 0 = 0 Æ y 0 = 0... x 1 =1 Æ y 1 = 1 x 2 =x 1 +1 y 2 =y x 1 = 1 y 1 = 1 x 2 = 2 y 2 = cost: 2N x 2 =2 Æ y 2 = 2 x 0 = y 0 x 1 = y 0 +1 x 1 = y 1 x 2 = y 1 +1 x 2 = y 2 TRUE x 0 = y 0... x 1 = y 1 cost: 2 x 2 = y 2 Lowest cost proof is simpler, avoids divergence.

Lowest-cost proofs Lowest-cost proof strongly depends on choice of proof rules –This is a heuristic choice –Rules might include bit vector arithmetic, arrays, etc... –May contain SP or WP (so complete for refuting program paths) Search for lowest cost proof may be expensive! –Hope is that lowest-cost proof is short –Require fixed truth value for all atoms (refines restricted case) Divergence is still possible when a terminating refinement exists –However, heuristically, will diverge less often than SP or WP.

Refinement methods Strongest postcondition (SLAM1) Weakest precondition (Magic,FSoft,Yogi) Interpolant methods –Feasible interpolation (BLAST, IMPACT) –Bounded provers (SATABS) –Constraint-based (ARMC) Local proof

Constraint-based interpolants Farkas’ lemma: If a system of linear inequalities is UNSAT, there is a refutation proof by summing the inequalities with non-neg. coefficients. Farkas’ lemma proofs are local proofs! x 0 · 0 0 · y 0 x 1 · x 0 +1 z 1 · x 1 -1 y 0 +1 · y 1 y 1 +1 · x 1 1 (y 0 +1 · y 1 ) 1 (y 1 +1 · x 1 ) 1 (x 0 · 0) 1 (0 · y 0 ) x0 · y0x0 · y0x0 · y0x0 · y0 1 (x 1 · x 0 +1) 0 (z 1 · x 1 -1) x1 · y0x1 · y0x1 · y0x1 · y0 1 · 0 Intermediate sums are the interpolants! x0 · y0x0 · y0x0 · y0x0 · y0 x1 · y0x1 · y0x1 · y0x1 · y0 1 · 0 0 · 0 Coefficients can be found by solving an LP. Interpolants can be controlled with additional constraints..

Refinement methods Strongest postcondition (SLAM1) Weakest precondition (Magic,FSoft,Yogi) Interpolant methods –Feasible interpolation (BLAST, IMPACT) –Bounded provers (SATABS) –Constraint-based (ARMC) Local proof

Interpolation and SMT An SMT solver has heuristics for generating simple proofs –CDCL focuses on relevant decisions and deductions –Theory solvers generate simple theory lemmas However, it generates non-local proofs. Feasible interpolation can be seen as factoring a non-local proof into a local proof. This allows us to use an SMT solver to generate the simple local proofs we require for refinement. Interpolation (factoring into local form) can be done efficiently for certain theories –propositional logic –linear arithmetic (integer or real) –equality, function symbols, arrays

Non-local to local For linear arithmetic, interpolation amounts to putting a Farkas proof in the right order. x 0 · y 0 x 1 · x 0 -1 x 2 · x 1 -1 y 0 · x 2 x0 · y0x0 · y0x0 · y0x0 · y0 x 1 · y · -2 0 · 0 x 2 · x 0 -2 x 2 · y · -2 Non-local! Interpolation re-orders the sum to make the proof local. x 1 · y 0 -1 x 2 · y · -2

Interpolant quality We measure the cost or complexity of a local proof by the deductions produced. Simple non-local proofs generated by the SMT solver can be far from optimal when translated to local form, however. Farkas proof Interpolant A Farkas proof involving different variables or constraints might yield simpler interpolant. To produce good quality (simple) interpolants, we may need to modify the theory solvers to search for better proofs.

Feasible interpolation and SMT Think of an interpolant as a proof in interpolated form:

Feasible interpolation and resolution Propositional resolution has linear-time interpolation... Boolean circuit CDCL produces resolution proofs (conflict clause generation = resolution)

Interpolation of theory lemmas Extending this to Nelson/Oppen combinations is a bit complex (see McMillan TACAS 2004, Yorsh and Musuvathi CADE 2005). Various theory combinations have been implemented in interpolating SMT solvers. Farkas proof......theory clause interpolation

Interpolating provers Quite a number of provers are now available that produce interpolants, typically by constructing a proof in interpolated form. ProverTheory or proof systemlocal FOCIQF_AUFIDL, QF_UFLRAyes* CSISatQF_UFLRAno MathSAT5QF_UFLIA, QF_UFLRAno iPrincessLIAno SMTInterpolQF_LIA, QF_LRAno Vampyresuperposition + LRAyes* iZ3AUFLIAno *Some of these provers try to produce simpler interpolants by searching for simple local proofs.

Basic Framework Abstraction and refinement are proof systems –spaces of possible proofs that we search Abstractor Refiner General proof system Incomplete Specialized proof system Completeprog. pf. special case cex. pf. of special case An effective refiner produces simple proofs that generalize away from the special case and yield simple abstractions. One way to accomplish this is to use an SMT solver and feasible interpolation.

Lazy v. Eager Abstraction We can classify abstractors as eager or lazy based on the extent to which they test conjectures speculatively. At the extreme lazy end is lazy abstraction. This method does not generalize deductions from one execution path to another. Lazy abstraction (as in BLAST) performs eager deduction using predicate abstraction along individual execution paths. Lazy abstraction with interpolants (as in IMPACT) is yet more lazy, as essentially all deduction is done be the refiner. IMPACT relies on SMT as its only proof engine.

An exampledo{ lock(); lock(); old = new; old = new; if(*){ if(*){ unlock(); unlock(); new++; new++; } } while (new != old); program fragment L=0 L=1; old=new old=new [L!=0] L=0; L=0; new++ new++ [new==old] [new!=old] control-flow graph

1L=0T 2 [L!=0]T IMPACT algorithm L=0 L=1; old=new old=new [L!=0] L=0; L=0; new++ new++ [new==old] [new!=old] control-flow graph 0T FL=0 Label error state with false, by refining labels on path [CAV 2006]

6 [L!=0]T 5[new!=old]T 4 L=0; L=0; new++ new++T 3 L=1; L=1;old=new T IMPACT algorithm L=0 L=1; old=new old=new [L!=0] L=0; L=0; new++ new++ [new==old] [new!=old] control-flow graph 0 12 L=0 [L!=0] F L=0 F L=0L=0T Cover: state 5 is subsumed by state 1.

T 11 [L!=0]T 10[new!=old]T 8 T IMPACT algorithm L=0 L=1; old=new old=new [L!=0] L=0; L=0; new++ new++ [new==old] [new!=old] control-flow graph L=0 L=1; L=1;old=new [L!=0] L=0; L=0; new++ new++ [new!=old] F L=0 6 [L!=0] F L=0 L=0 7[new==old]T old=newF old=new F T Another cover. Unwinding is now complete. 9 T Uncovered prefix of ART is now an inductive invariant.

Abstract counterexamples An abstract counterexample is an explanation of why a given proofs system (abstraction) does not prove a particular case. An unprovable path looks like this: 1111 2222 3333 4444 5555 usually cubes

...in predicate refinement We can use the abstract counterexample as a restriction in refinement x=0 x++ x++ x++ [x < 0] [x=0][x=0] [x=1] [x=2] [x  0,1,2] Restricted path, from PA({x=0,x=1,x=2}) Lowest-cost proof leads to divergence! Lowest-cost proof without restriction. {x=3}{False} {True} {0 · x} {False} Restricting paths can make the refiner’s job easier. However, it also skews the proof cost metric. This can cause the refiner to miss globally optimal proofs, leading to divergence.

Summary Abstraction and refinement can be thought of as two proof systems: –Abstractor is general, but incomplete –Refiner is specialized, but complete. –Analogous to BCP/conflict clause generation in CDCL The abstractor is eager, within narrowly restricted deduction system. –When proof fails, we must analyze failures to find a special case that fails. –These special case become goals for the refiner. Refiners can be viewed as focused local proof systems –The refiner's proof augments the abstractor's deduction system, adding terms, conjectures, hypotheses, etc. –A good refinement is a simple local proof

Summary, cont. SMT solvers have several important roles in this process. –Testing and strengthening conjectures (with UNSAT cores) –Refinement though feasible interpolation –Generating counterexamples when refinement fails In these roles, we exploit several qualities of modern SMT solvers –Ability to handle the needed theories, such as integers, bit-vectors, arrays. –Efficiency in checking large numbers of conjectures (incrementally) –CDCL and theory heuristics to find simple proofs and small cores Because of these capabilities, SMT solvers are the primary engines of proof in many software model checking tools.