SLAM internals Sriram K. Rajamani Rigorous Software Engineering Microsoft Research, India.

Slides:



Advertisements
Similar presentations
A SAT characterization of boolean-program correctness K. Rustan M. Leino Microsoft Research, Redmond, WA 14 Nov 2002 IFIP WG 2.4 meeting, Schloβ Dagstuhl,
Advertisements

Advanced programming tools at Microsoft
Automated Theorem Proving Lecture 1. Program verification is undecidable! Given program P and specification S, does P satisfy S?
Abstraction of Source Code (from Bandera lectures and talks)
Semantics Static semantics Dynamic semantics attribute grammars
Bebop: A Symbolic Model Checker for Boolean Programs Thomas Ball Sriram K. Rajamani
1 Lecture 08(c) – Predicate Abstraction and SLAM Eran Yahav.
Rigorous Software Development CSCI-GA Instructor: Thomas Wies Spring 2012 Lecture 13.
A survey of techniques for precise program slicing Komondoor V. Raghavan Indian Institute of Science, Bangalore.
Chair of Software Engineering Software Verification Stephan van Staden Lecture 10: Model Checking.
Automatic Predicate Abstraction of C-Programs T. Ball, R. Majumdar T. Millstein, S. Rajamani.
1 Automatic Predicate Abstraction of C Programs Parts of the slides are from
ISBN Chapter 3 Describing Syntax and Semantics.
Verification of parameterised systems
Symmetry-Aware Predicate Abstraction for Shared-Variable Concurrent Programs Alastair Donaldson, Alexander Kaiser, Daniel Kroening, and Thomas Wahl Computer.
CS 355 – Programming Languages
The Software Model Checker BLAST by Dirk Beyer, Thomas A. Henzinger, Ranjit Jhala and Rupak Majumdar Presented by Yunho Kim Provable Software Lab, KAIST.
Using Statically Computed Invariants Inside the Predicate Abstraction and Refinement Loop Himanshu Jain Franjo Ivančić Aarti Gupta Ilya Shlyakhter Chao.
Synergy: A New Algorithm for Property Checking
Katz Formal Specifications Larch 1 Algebraic Specification and Larch Formal Specifications of Complex Systems Shmuel Katz The Technion.
Secrets of Software Model Checking Thomas Ball Sriram K. Rajamani Software Productivity Tools Microsoft Research
1 Predicate Abstraction of ANSI-C Programs using SAT Edmund Clarke Daniel Kroening Natalia Sharygina Karen Yorav (modified by Zaher Andraus for presentation.
Counter Example Guided Refinement CEGAR Mooly Sagiv.
CS 267: Automated Verification Lectures 14: Predicate Abstraction, Counter- Example Guided Abstraction Refinement, Abstract Interpretation Instructor:
Automatically Validating Temporal Safety Properties of Interfaces Thomas Ball and Sriram K. Rajamani Software Productivity Tools, Microsoft Research Presented.
Validating High-Level Synthesis Sudipta Kundu, Sorin Lerner, Rajesh Gupta Department of Computer Science and Engineering, University of California, San.
Predicate Abstraction for Software and Hardware Verification Himanshu Jain Model checking seminar April 22, 2005.
ESC Java. Static Analysis Spectrum Power Cost Type checking Data-flow analysis Model checking Program verification AutomatedManual ESC.
Software Verification with BLAST Tom Henzinger Ranjit Jhala Rupak Majumdar.
Review: forward E { P } { P && E } TF { P && ! E } { P 1 } { P 2 } { P 1 || P 2 } x = E { P } { \exists … }
Review: forward E { P } { P && E } TF { P && ! E } { P 1 } { P 2 } { P 1 || P 2 } x = E { P } { \exists … }
Describing Syntax and Semantics
Automatic Predicate Abstraction of C Programs Thomas BallMicrosoft Rupak MajumdarUC Berkeley Todd MillsteinU Washington Sriram K. RajamaniMicrosoft
Model Checking Lecture 5. Outline 1 Specifications: logic vs. automata, linear vs. branching, safety vs. liveness 2 Graph algorithms for model checking.
Formal Verification of SpecC Programs using Predicate Abstraction Himanshu Jain Daniel Kroening Edmund Clarke Carnegie Mellon University.
Symbolic Path Simulation in Path-Sensitive Dataflow Analysis Hari Hampapuram Jason Yue Yang Manuvir Das Center for Software Excellence (CSE) Microsoft.
Prof. Bodik CS 164 Lecture 16, Fall Global Optimization Lecture 16.
272: Software Engineering Fall 2012 Instructor: Tevfik Bultan Lecture 4: SMT-based Bounded Model Checking of Concurrent Software.
Deciding a Combination of Theories - Decision Procedure - Changki pswlab Combination of Theories Daniel Kroening, Ofer Strichman Presented by Changki.
Software Model Checking with SLAM Thomas Ball Testing, Verification and Measurement Sriram K. Rajamani Software Productivity Tools Microsoft Research
272: Software Engineering Fall 2012 Instructor: Tevfik Bultan Lecture 3: Modular Verification with Magic, Predicate Abstraction.
Rule Checking SLAM Checking Temporal Properties of Software with Boolean Programs Thomas Ball, Sriram K. Rajamani Microsoft Research Presented by Okan.
Aditya V. Nori, Sriram K. Rajamani Microsoft Research India.
SAT & SMT. Software Verification & Testing SAT & SMT Test case generation Predicate Abstraction Verifying Compilers.
Newton: A tool for generating abstract explanations of infeasibility1 The Problem P (C Program) BP (Boolean Program of P) CFG(P) CFG(BP)
CS 363 Comparative Programming Languages Semantics.
Thomas Ball Sriram K. Rajamani
Programming Languages and Paradigms Imperative Programming.
Automatically Validating Temporal Safety Properties of Interfaces Thomas Ball, Sriram K. MSR Presented by Xin Li.
Convergence of Model Checking & Program Analysis Philippe Giabbanelli CMPT 894 – Spring 2008.
Chapter 3 Part II Describing Syntax and Semantics.
Localization and Register Sharing for Predicate Abstraction Himanshu Jain Franjo Ivančić Aarti Gupta Malay Ganai.
The Yogi Project Software property checking via static analysis and testing Aditya V. Nori, Sriram K. Rajamani, Sai Deep Tetali, Aditya V. Thakur Microsoft.
Static Techniques for V&V. Hierarchy of V&V techniques Static Analysis V&V Dynamic Techniques Model Checking Simulation Symbolic Execution Testing Informal.
Daniel Kroening and Ofer Strichman Decision Procedures An Algorithmic Point of View Deciding Combined Theories.
1 Automatically Validating Temporal Safety Properties of Interfaces - Overview of SLAM Parts of the slides are from
/ PSWLAB Thread Modular Model Checking by Cormac Flanagan and Shaz Qadeer (published in Spin’03) Hong,Shin Thread Modular Model.
Counter Example Guided Refinement CEGAR Mooly Sagiv.
Boolean Programs: A Model and Process For Software Analysis By Thomas Ball and Sriram K. Rajamani Microsoft technical paper MSR-TR Presented by.
#1 Having a BLAST with SLAM. #2 Software Model Checking via Counter-Example Guided Abstraction Refinement Topic: Software Model Checking via Counter-Example.
The software model checker BLAST Dirk Beyer, Thomas A. Henzinger, Ranjit Jhala, Rupak Majumdar Presented by Yunho Kim TexPoint fonts used in EMF. Read.
Having a BLAST with SLAM
Software Model Checking with SLAM
Symbolic Implementation of the Best Transformer
Over-Approximating Boolean Programs with Unbounded Thread Creation
Semantics In Text: Chapter 3.
Abstractions from Proofs
Predicate Abstraction
Course: CS60030 Formal Systems
Presentation transcript:

SLAM internals Sriram K. Rajamani Rigorous Software Engineering Microsoft Research, India

C- Types  ::= void | bool | int | ref  Expressionse::=c | x | e 1 op e 2 | &x | *x LExpression l ::= x | *x Declarationd::=  x 1,x 2,…,x n Statementss::=skip | goto L 1,L 2 …L n | L: s | assume(e) | l = e | l = f (e 1,e 2,…,e n ) |return x |s 1 ; s 2 ;… ; s n Procedures p ::=  f (x 1 :  1,x 2 :  2,…,x n :  n ) Programg ::= d 1 d 2 … d n p 1 p 2 … p n

C-- Types  ::= void | bool | int Expressionse::=c | x | e 1 op e 2 LExpression l ::= x Declarationd::=  x 1,x 2,…,x n Statementss::=skip | goto L 1,L 2 …L n | L: s | assume(e) | l = e |f (e 1,e 2,…,e n ) |return |s 1 ; s 2 ;… ; s n Procedures p ::= f (x 1 :  1,x 2 :  2,…,x n :  n ) d s Programg ::= d 1 d 2 … d n p 1 p 2 … p n

BP Types  ::= void | bool Expressionse::=c | x | e 1 op e 2 LExpression l ::= x Declarationd::=  x 1,x 2,…,x n Statementss::=skip | goto L 1,L 2 …L n | L: s | assume(e) | l = e |f (e 1,e 2,…,e n ) |return |s 1 ; s 2 ;… ; s n Procedures p ::= f (x 1 :  1,x 2 :  2,…,x n :  n ) d s Programg ::= d 1 d 2 … d n p 1 p 2 … p n

Syntactic sugar if (e) { S1; } else { S2; } S3; goto L1, L2; L1: assume(e); S1; goto L3; L2: assume(!e); S2; goto L3; L3: S3;

Example, in C void cmp (int a, int b) { if (a == b) g = 0; else g = 1; } int g; main(int x, int y){ cmp(x, y); if (!g) { if (x != y) assert(0); }

void cmp(int a, int b) { goto L1, L2; L1: assume(a==b); g = 0; return; L2: assume(a!=b); g = 1; return; } int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x != y) assert(0); } Example, in C--

c2bp: Predicate Abstraction for C Programs Given P : a C program F = {e 1,...,e n } –each e i a pure boolean expression –each e i represents set of states for which e i is true Produce a boolean program B(P,F) same control-flow structure as P boolean vars {b 1,...,b n } to match {e 1,...,e n } properties true of B(P,F) are true of P

Assumptions Given P : a C program F = {e 1,...,e n } –each e i a pure boolean expression –each e i represents set of states for which e i is true Assume: each e i uses either: –only globals (global predicate) –local variables from some procedure (local predicate for that procedure) Mixed predicates: –predicates using both local variables and global variables –complicate “return” processing –covered in Step 2

C2bp Algorithm Performs modular abstraction –abstracts each procedure in isolation Within each procedure, abstracts each statement in isolation –no control-flow analysis –no need for loop invariants

void cmp (int a, int b) { goto L1, L2 L1: assume(a==b); g = 0; return; L2: assume(a!=b); g = 1; return; } int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x != y) assert(0); } Preds: {x==y} {g==0} {a==b}

void cmp (int a, int b) { goto L1, L2 L1: assume(a==b); g = 0; return; L2: assume(a!=b); g = 1; return; } int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x != y) assert(0); } decl {g==0} ; main( {x==y} ) { } void cmp ( {a==b} ) { } Preds: {x==y} {g==0} {a==b}

void equal (int a, int b) { goto L1, L2 L1: assume(a==b); g = 0; return; L2: assume(a!=b); g = 1; return; } int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x != y) assert(0); } decl {g==0} ; main( {x==y} ) { cmp( {x==y} ); assume( {g==0} ); assume( !{x==y} ); assert(0); } void cmp ( {a==b} ) { goto L1, L2; L1: assume( {a==b} ); {g==0} = T; return; L2: assume( !{a==b} ); {g==0} = F; return; } Preds: {x==y} {g==0} {a==b}

C-- Types  ::= void | bool | int Expressionse::=c | x | e 1 op e 2 LExpression l ::= x Declarationd::=  x 1,x 2,…,x n Statementss::=skip | goto L 1,L 2 …L n | L: s | assume(e) | l = e | f (e 1,e 2,…,e n ) | return | s 1 ; s 2 ;… ; s n Procedures p ::= f (x 1 :  1,x 2 :  2,…,x n :  n ) Programg ::= d 1 d 2 … d n p 1 p 2 … p n

Abstracting Assignments Suppose you are given an assignment s if Implies F (WP(s, e i )) is true before s then –e i is true after s if Implies F (WP(s, !e i )) is true before s then –e i is false after s {e i } = Implies F (WP(s, e i )) ?true : Implies F (WP(s, !e i )) ? false : *;

Abstracting Expressions via F F = { e 1,...,e n } Implies F (e) –best boolean function over F that implies e ImpliedBy F (e) –best boolean function over F implied by e –ImpliedBy F (e) = !Implies F (!e)

Implies F (e) and ImpliedBy F (e) e Implies F (e) ImpliedBy F (e)

Computing Implies F (e) F = { e 1,...,e n } minterm m = d 1 &&... && d n –where d i = e i or d i = !e i Implies F (e) –disjunction of all minterms that imply e Naïve approach –generate all 2 n possible minterms –for each minterm m, use decision procedure to check validity of each implication m  e Many optimizations possible

do { KeAcquireSpinLock(); nPacketsOld = nPackets; b = true; if(request){ request = request->Next; KeReleaseSpinLock(); nPackets++; b = b ? false : *; } } while (nPackets != nPacketsOld); !b KeReleaseSpinLock(); Example Add new predicate to boolean program (c2bp) b : (nPacketsOld == nPackets) U L L L L U L U U U E

Assignment Example Statement in P: Predicates in F: y = y+1; {x==y} Weakest Precondition: WP(y=y+1, x==y) = x==y+1 Implies F ( x==y+1 ) = false Implies F ( x!=y+1 ) = x==y Abstraction of assignment in B: {x==y} = {x==y} ? false : *;

Absracting Assumes WP( assume(e), Q) = e  Q assume(e) is abstracted to: assume( ImpliedBy F (e) ) Example: F = {x==2, x<5} assume(x < 2) is abstracted to: assume( {x<5} && !{x==2} )

Abstracting Procedures Each predicate in F is annotated as being either global or local to a particular procedure Procedures abstracted in two passes: –a signature is produced for each procedure in isolation –procedure calls are abstracted given the callees’ signatures

Abstracting a procedure call Procedure call –a sequence of assignments from actuals to formals –see assignment abstraction Procedure return –NOP for C-- with assumption that all predicates mention either only globals or only locals –with pointers and with mixed predicates: Most complicated part of c2bp Covered in Step 2

void cmp (int a, int b) { Goto L1, L2 L1: assume(a==b); g = 0; return; L2: assume(a!=b); g = 1; return; } int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x != y) assert(0); } decl {g==0} ; main( {x==y} ) { cmp( {x==y} ); assume( {g==0} ); assume( !{x==y} ); assert(0); } void cmp ( {a==b} ) { Goto L1, L2 L1: assume( {a==b} ); {g==0} = T; return; L2: assume( !{a==b} ); {g==0} = F; return; } {x==y} {g==0} {a==b}

Precision loss during predicate abstraction For program P and F = {e 1,...,e n }, there exist two “ideal” abstractions: –Boolean(P,F) : most precise abstraction –Cartesian(P,F) : less precise abstraction, where each boolean variable is updated independently Theory: –with an “ideal” theorem prover, c2bp can compute Cartesian(P,F) Practice: –c2bp computes a less precise abstraction than Cartesian(P,F) –we use Das/Dill’s technique to incrementally improve precision –with an “ideal” theorem prover, the combination of c2bp + Das/Dill can compute Boolean(P,F)

Cartesian abstraction F = { (x!=5), (y 5)} Statement: y = x Abstract statement in the boolean program: {y < 5} = {x != 5} ? * : false {y > 5} = {x != 5} ? * : false Suppose {x != 5} is true before the assignment Then, only one of {y 5} can be true after the assignment This correlation is lost in cartesian abstraction

prog. P’ prog. P SLIC rule The SLAM Process boolean program path predicates slic c2bp bebop newton

Bebop Model checker for boolean programs Based on CFL reachability Explicit representation of CFG Implicit representation of path edges and summary edges Generation of hierarchical error traces

Domains g  Global l  Local f  Frame s  Stack = Frame* State = Global X Local X Stack Transitions: T  (Global X Local) X ( Global X Local) T +  Local X (Local X Frame) T -  (Local X Frame) X Local

Transition Relation STEP T (g, l, g’, l’) ________________ (g,l,s)  (g’,l’, s) PUSH T + (l, l’, f) __________________ (g,l,s)  (g’,l’, s.f) POP T - (l, f, l’’) _______________________ (g,l,s.f)  (g’,l’, s) g  Global l  Local f  Frame s  Stack = Frame* State = Global X Local X Stack Transitions: T  (Global X Local) X ( Global X Local) T +  Local X (Local X Frame) T -  (Local X Frame) X Local

Naïve model checking STEP T (g, l, g’, l’) ________________ (g,l,s)  (g’,l’, s) PUSH T + (l, l’, f) __________________ (g,l,s)  (g’,l’, s.f) POP T - (l, f, l’’) _______________________ (g,l,s.f)  (g’,l’, s) Start with initial state: (g 0,l 0,  ) Find all the states s such that (g 0,l 0,  )  * s Even if Globals and Locals are finite, the set of reachable states could be infinite since the stack can grow infinite No guarantee for termination Still, assertion checking is decidable Need to use a different algorithm (CFL reachability)

CFL reachability CFL-STEP P(g 1, l 1, g 2, l 2 ) T(g 2, l 2, g 3, l 3 ) ________________ P(g 1, l 1, g 3, l 3 ) CFL-PUSH P(g 1, l 1, g 2, l 2 ) T + (l 2, l 3,f, ) __________________ P(g 2, l 3, g 2, l 3 ) CFL-SUM P(g 1, l 1, g 2, l 2 ) T - (l 2, f, l 3, ) __________________ Sum(g 1, l 1, f, g 2, l 3 ) CFL-POP P(g 1, l 1, g 2, l 2 ) T + (l 2, l 3,f, ) Sum( g 2, l 3, f, g 3, l 4 ) _______________________ P(g 1, l 1, g 3, l 4 ) P  (Global X Local) X ( Global X Local) Sum  (Global X Local) X Frame X ( Global X Local) CFL-INIT _______________ P(g 0, l 0, g 0, l 0 ) [Sharir-Pnueli 81] [Reps-Sagiv-Horwitz 95]

decl g; void main() decl u,v; [1] u := !v; [2] equal(u,v); [3] if (g) then R: skip; fi [4] return; void equal(a, b) [5] if (a = b) then [6] g := 1; else [7] g := 0; fi [8] return; [1] [2] u!=v g=0 [5] [7] [3] [4] [1] [8] [7] u=v g=0 u!=v g=1 u=v g=1

Symbolic CFL reachability  Partition path edges by their “target”  PE(v) = { |  }  What is for boolean programs?  A bit-vector!  What is PE(v)?  A set of bit-vectors  Use a BDD (attached to v) to represent PE(v)

decl g; void main() decl u,v; [1] u := !v; [2] equal(u,v); [3] if (g) then R: skip; fi [4] return; void equal(a, b) [5] if (a = b) then [6] g := 1; else [7] g := 0; fi [8] return; [1] [2] u!=v g=0 [5] [7] [3] [4] [1] [8] [7] u=v g=0 u!=v g=1 u=v g=1 g=g’& u=u’& v=v’ g=g’& v!=u’ & v=v’ g=g’ & a=a’ & b=b’ & a!=b g’=0 & v’!=u’ & v=v’ g’=0 & v!=u’ & v=v’ g=g’& a=a’ & b=b’& a!=b g’=0 & a=a’ & b=b’& a!=b

Bebop: summary  Explicit representation of CFG  Implicit representation of path edges and summary edges  Generation of hierarchical error traces  Complexity: O(E * 2 O(N) )  E is the size of the CFG  N is the max. number of variables in scope

prog. P’ prog. P SLIC rule The SLAM Process boolean program path predicates slic c2bp bebop newton

Type 1: False error trace due to approximate predicate abstraction

Type 2: False error trace due to insufficient predicates

prog. P’ prog. P SLIC rule The SLAM Process (more accurate) boolean program path predicates slic c2bp bebop newton

Type 1: False error trace due to approximate predicate abstraction

With this technique we can be as precise as boolean abstraction (rather than cartesian)

Type 2: False error trace due to insufficient predicates

prog. P’ prog. P SLIC rule The SLAM Process (more accurate) boolean program path predicates slic c2bp bebop newton

Newton Given an error path p in boolean program B –is p a feasible path of the corresponding C program? Yes: found an error No: find predicates that explain the infeasibility –uses the same interfaces to the theorem provers as c2bp.

Newton Execute path symbolically Check conditions for inconsistency using theorem prover (satisfiability) After detecting inconsistency: –minimize inconsistent conditions –traverse dependencies –obtain predicates

do { KeAcquireSpinLock(); if(*){ KeReleaseSpinLock(); } } while (*); KeReleaseSpinLock(); Example Model checking boolean program (bebop) U L L L L U L U U U E

do { KeAcquireSpinLock(); nPacketsOld = nPackets; if(request){ request = request->Next; KeReleaseSpinLock(); nPackets++; } } while (nPackets != nPacketsOld); KeReleaseSpinLock(); Example Is error path feasible in C program? (newton) U L L L L U L U U U E

do { KeAcquireSpinLock(); nPacketsOld = nPackets; b = true; if(request){ request = request->Next; KeReleaseSpinLock(); nPackets++; b = b ? false : *; } } while (nPackets != nPacketsOld); !b KeReleaseSpinLock(); Example Add new predicate to boolean program (c2bp) b : (nPacketsOld == nPackets) U L L L L U L U U U E

Symbolic simulation for C-- Domains –variables: names in the program –values: constants + symbols State of the simulator has 3 components: –store: map from variables to values –conditions: predicates over symbols –history: past valuations of the store

Symbolic simulation Algorithm Input: path p For each statement s in p do match s with Assign(x,e): let val = Eval(e) in if (Store[x]) is defined then History[x] := History[x]  Store[x] Store[x] := val Assume(e): let val = Eval(e) in Cond := Cond and val let result = CheckConsistency(Cond) in if (result == “inconsistent”) then GenerateInconsistentPredicates() End Say “Path p is feasible”

Symbolic Simulation: Caveats Procedure calls –add a stack to the simulator –push and pop stack frames on calls and returns –implement mappings to keep values “in scope” at calls and returns Dependencies –for each condition or store, keep track of which values where used to generate this value –traverse dependency during predicate generation

void cmp (int a, int b) { Goto L1, L2 L1: assume(a==b); g = 0; return; L2: assume(a!=b); g = 1; return; } int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x != y) assert(0); }

void cmp (int a, int b) { Goto L1, L2 L1: assume(a==b); g = 0; return; L2: assume(a!=b); g = 1; return; } int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x != y) assert(0); } Global: main: (1)x: X (2)y: Y Conditions :

void cmp (int a, int b) { Goto L1, L2 L1: assume(a==b); g = 0; return; L2: assume(a!=b); g = 1; return; } int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x != y) assert(0); } Global: main: (1)x: X (2)y: Y cmp : (3)a: A (4)b: B Conditions : Map : X  A Y  B

void cmp (int a, int b) { Goto L1, L2 L1: assume(a==b); g = 0; return; L2: assume(a!=b); g = 1; return; } int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x != y) assert(0); } Global: (6) g: 0 main: (1)x: X (2)y: Y cmp : (3)a: A (4)b: B Conditions : (5)(A == B) [3, 4] Map : X  A Y  B

void cmp (int a, int b) { Goto L1, L2 L1: assume(a==b); g = 0; return; L2: assume(a!=b); g = 1; return; } int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x != y) assert(0); } Global: (6) g: 0 main: (1)x: X (2)y: Y cmp : (3)a: A (4)b: B Conditions : (5)(A == B) [3, 4] (6)(X == Y) [5] Map : X  A Y  B

void cmp (int a, int b) { Goto L1, L2 L1: assume(a==b); g = 0; return; L2: assume(a!=b); g = 1; return; } int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x != y) assert(0); } Global: (6) g: 0 main: (1)x: X (2)y: Y cmp : (3)a: A (4)b: B Conditions : (5)(A == B) [3, 4] (6)(X == Y) [5] (7) (X != Y) [1, 2] Contradictory!

void cmp (int a, int b) { Goto L1, L2 L1: assume(a==b); g = 0; return; L2: assume(a!=b); g = 1; return; } int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x != y) assert(0); } Global: (6) g: 0 main: (1)x: X (2)y: Y cmp : (3)a: A (4)b: B Conditions : (5)(A == B) [3, 4] (6)(X == Y) [5] (7) (X != Y) [1, 2] Contradictory!

void cmp (int a, int b) { Goto L1, L2 L1: assume(a==b); g = 0; return; L2: assume(a!=b); g = 1; return; } int g; main(int x, int y){ cmp(x, y); assume(!g); assume(x != y) assert(0); } Predicates after simplification: { x == y, a == b }

Refinement Using Interpolants Given formulas  -,  + s.t.  -   + is unsatisfiable An interpolant for  -,  + is a formula  s.t. 1.  - implies  2.  has symbols common to  -,  + 3.    + is unsatisfiable  computable from proof of unsat. of  -   + [Henzinger-Jhala-Majumdar-McMillan POPL 04] give a way to generate predicates for refinement from infeasible traces using interpolants

Interpolants: Example (1) pc 1 : x = ctr pc 2 : ctr = ctr + 1 pc 3 : y = ctr pc 4 : assume(x = i-1) pc 5 : assume(y  i) TraceTrace Formula Cut + Interpolate at each point Pred. Map: pc i  Interpolant from cut i -- ++ Interpolate x 1 = ctr 0 Predicate Map pc 2 : x= ctr x 1 = ctr 0  ctr 1 = ctr  y 1 = ctr 1  x 1 = i 0 – 1  y 1  i 0

pc 1 : x = ctr pc 2 : ctr = ctr + 1 pc 3 : y = ctr pc 4 : assume(x = i-1) pc 5 : assume(y  i) TraceTrace Formula Cut + Interpolate at each point Pred. Map: pc i  Interpolant from cut i -- ++ Interpolate Predicate Map pc 2 : x = ctr pc 3 : x= ctr-1 x 1 = ctr 1 -1 x 1 = ctr 0  ctr 1 = ctr  y 1 = ctr 1  x 1 = i 0 – 1  y 1  i 0 Interpolants: Example (2)

-- ++ Interpolate  pc 1 : x = ctr pc 2 : ctr = ctr + 1 pc 3 : y = ctr pc 4 : assume(x = i-1) pc 5 : assume(y  i) TraceTrace Formula y 1 = x x 1 = ctr 0  ctr 1 = ctr  y 1 = ctr 1  x 1 = i 0 – 1  y 1  i 0 Predicate Map pc 2 : x = ctr pc 3 : x= ctr-1 pc 3 : y= x+1 Cut + Interpolate at each point Pred. Map: pc i  Interpolant from cut i Interpolants: Example (3)

pc 1 : x = ctr pc 2 : ctr = ctr + 1 pc 3 : y = ctr pc 4 : assume(x = i-1) pc 5 : assume(y  i) TraceTrace Formula Cut + Interpolate at each point Pred. Map: pc i  Interpolant from cut i -- ++ Interpolate Predicate Map pc 2 : x = ctr pc 3 : x = ctr-1 pc 4 : y = x+1 pc 5 : y= i y 1 = i 0 x 1 = ctr 0  ctr 1 = ctr  y 1 = ctr 1  x 1 = i 0 – 1  y 1  i 0 Interpolants: Example (4)

pc 1 : x = ctr pc 2 : ctr = ctr + 1 pc 3 : y = ctr pc 4 : assume(x = i-1) pc 5 : assume(y  i) TraceTrace Formula Predicate Map pc 2 : x = ctr pc 3 : x = ctr-1 pc 4 : y = x+1 pc 5 : y = i Theorem: Predicate map makes trace abstractly infeasible x 1 = ctr 0  ctr 1 = ctr  y 1 = ctr 1  x 1 = i 0 – 1  y 1  i 0 Interpolants: Example (5)

Homework Weakest Precondition: WP( *p=3, x==5 ) = ?? What if *p and x alias? Statement in P:Predicates in E: *p = 3 {x==5}

Advanced Topics: Pointers

prog. P’ prog. P SLIC rule So far.. boolean program path predicates slic c2bp bebop newton

C-- Types  ::= void | bool | int Expressionse::=c | x | e 1 op e 2 LExpression l ::= x Declarationd::=  x 1,x 2,…,x n Statementss::=skip | goto L 1,L 2 …L n | L: s | assume(e) | l = e |f (e 1,e 2,…,e n ) |return |s 1 ; s 2 ;… ; s n Procedures p ::= f (x 1 :  1,x 2 :  2,…,x n :  n ) d s Programg ::= d 1 d 2 … d n p 1 p 2 … p n

C- Types  ::= void | bool | int | ref  Expressionse::=c | x | e 1 op e 2 | &x | *x LExpression l ::= x | *x Declarationd::=  x 1,x 2,…,x n Statementss::=skip | goto L 1,L 2 …L n | L: s | assume(e) | l = e | l = f (e 1,e 2,…,e n ) |return x |s 1 ; s 2 ;… ; s n Procedures p ::=  f (x 1 :  1,x 2 :  2,…,x n :  n ) d s Programg ::= d 1 d 2 … d n p 1 p 2 … p n

Abstracting Procedure Returns Let a be an actual at call-site P(…) –pre(a) = the value of a before transition to P Let f be a formal of a procedure P –pre(f) = the value of f upon entry to P

int R (int f) { int r = f+1; f = 0; return r; } WP(x=r, x==2) = r==2 Q() { int x = 1; x = R(x) } pre(f) == x x = r {x==1} {x==2} {f==pre(f)} {r==pre(f)+1} predicatecall/return relationcall/return assign {x==2} = s & {x==1}; } bool R ( {f==pre(f)} ) { {r==pre(f)+1} = {f==pre(f)}; {f==pre(f)} = *; return {r==pre(f)+1}; } WP(f=x, f==pre(f) ) = x==pre(f) f = x x==pre(f) is true at the call to R pre(f)==x and x==1 and r==pre(f)+1 implies r==2 Q() { {x==1},{x==2} = T,F; s = R(T); {x==1} s

int R (int f) { int r = f+1; f = 0; return r; } WP(x=r, x==2) = r==2 Q() { int x = 1; x = R(x) } pre(f) == x x = r {x==1} {x==2} {f==pre(f)} {r==pre(f)+1} predicatecall/return relationcall/return assign bool R ( {f==pre(f)} ) { {r==pre(f)+1} = {f==pre(f)}; {f==pre(f)} = *; return {r==pre(f)+1}; } WP(f=x, f==pre(f) ) = x==pre(f) f = x x==pre(f) is true at the call to R pre(f)==x and x==1 and r==pre(f)+1 implies r==2 Q() { {x==1},{x==2} = T,F; s = R(T); {x==1} s {x==1}, {x==2} = *, s & {x==1}; }

Pointers and SLAM With pointers, C supports call by reference –Strictly speaking, C supports only call by value –With pointers and the address-of operator, one can simulate call-by-reference Boolean programs support only call-by-value-result –SLAM mimics call-by-reference with call-by-value-result Extra complications: –address operator (&) in C –multiple levels of pointer dereference in C

What changes with pointers? C2bp –abstracting assignments –abstracting procedure returns Newton –simulation needs to handle pointer accesses –need to copy local heap across scopes to match Bebop’s semantics –need to refine imprecise alias analysis using predicates Bebop –remains unchanged!

Recall : Abstracting Assignments Suppose you are given an assignment s if Implies F (WP(s, e i )) is true before s then –e i is true after s if Implies F (WP(s, !e i )) is true before s then –e i is false after s {e i } = Implies F (WP(s, e i )) ?true : Implies F (WP(s, !e i )) ? false : *;

Assignments + Pointers Weakest Precondition: WP( *p=3, x==5 ) = x==5 What if *p and x alias? We use Das’s pointer analysis [PLDI 2000] to prune disjuncts representing infeasible alias scenarios. Correct Weakest Precondition: (p==&x and 3==5) or (p!=&x and x==5) Statement in P:Predicates in E: *p = 3 {x==5}

Abstracting Procedure Return Need to account for –lhs of procedure call –mixed predicates –side-effects of procedure Boolean programs support only call-by- value-result –C2bp models all side-effects using return processing

Extending Pre-states Suppose formal parameter is a pointer –eg. P(int *f) pre( *f ) –value of *f upon entry to P –can’t change during P * pre( f ) –value of dereference of pre( f ) –can change during P

int R (int *a) { *a = *a+1; } Q() { int x = 1; R(&x); } pre(a) = &x {x==1} {x==2} predicatecall/return relationcall/return assign {x==2} = s & {x==1}; } bool R ( {a==pre(a)}, {*pre(a)==pre(*a)} ) { {*pre(a)==pre(*a)+1} = {*pre(a)==pre(*a)} ; return {*pre(a)==pre(*a)+1}; } a = &x Q() { {x==1},{x==2} = T,F; s = R(T,T); {a==pre(a)} {*pre(a)==pre(*a)} {*pre(a)=pre(*a)+1} pre(x)==1 and pre(*a)==pre(x) and *pre(a)==pre(*a)+1 and pre(a)==&x implies x==2 {x==1 } s

Newton: what changes with pointers? Simulation needs to handle pointer accesses Need to copy local heap across scopes to match Bebop’s semantics

void foo (int *a) { assume(*a > 5); assert(0); } main(int *x){ assume(*x < 5); foo(x); }

Contradictory! void foo (int *a) { assume(*a > 5); assert(0); } main(int *x){ assume(*x < 5); foo(x); } main: (1) x: X (2)*X: Y [1] foo : (3) a: A (4)*A: B [3] Conditions : (5)(Y< 5) [1,2] (6)(B < 5) [3,4,5] (7)(B > 5) [3,4] Predicates after simplification: *x 5

prog. P’ prog. P SLIC rule Non progress.. boolean program path predicates slic c2bp bebop newton

Non-progress due to alias analysis imprecision x = 0; y = 0; p = &y *p = *p + 1; assume(x!=0); p = &x; {x==0} {x==0} = T; skip; {x==0} = *; assume( !{x==0} ); skip; Pts-to(p) = { &x, &y}

x = 0; y = 0; p = &y if (p == &x) x = x + 1; else y = y + 1; assume(x!=0); p = &x; x = 0; y = 0; p = &y *p = *p + 1; assume(x!=0); p = &x; {x==0} = T; skip; {x==0} = *; assume(!{x==0} ); skip; Non-progress due to alias analysis imprecision

Consider values in abstract trace x = 0; y = 0; p = &y *p = *p + 1; assume(x!=0); p = &x; {x==0} = T; skip; {x==0} = *; assume(!{x==0} ); skip; {x==0} is true {x==0} is false INCONSISTENT

Consider values in abstract trace x = 0; y = 0; p = &y *p = *p + 1; assume(x!=0); p = &x; {x==0} = T; skip; {x==0} = *; assume(!{x==0} ); skip; WP(*p = *p +1, !(x==0) ) = ((p == &x) and !(x==-1)) or ((p != &x) and !(x==0)) (p!=&x) gets added as a predicate

Generalization of refinement predicates main() { int x; x = 0; while(*) { x++; } assert(x >= 0); } Iterative refinement produces predicates: (x == 0), (x == 1), (x ==2) … Need to generate (x >= 0) automatically! Difficult to come up with general methods to do such generalizations

Areas for future work Abstraction-refinement loop with richer models (than just booleans) for solving the progress problem with pointers Combination of over-approximation and under- approximation based methods to avoid refining in irrelevant places in the program