SymDiff: A differential program verifier

Presentation transcript:

SymDiff: A differential program verifier Shuvendu Lahiri Research in Software Engineering (RiSE), Microsoft Research, Redmond, WA USA Workshop on Program Equivalence (April 2016)

What is SymDiff? A platform for leveraging and extending program verification to reason about “program differences”. http://research.microsoft.com/symdiff Contributors: C. Hawblitzel, K. McMillan, O. Strichman, Z. Rakamaric, S. He. Interns: R. Sharma, M. Kawaguchi, H. Rebelo, R. Sinha, N. Partush, A. Gyori, …

Differential verification: verifying properties of program differences instead of the program itself.
Motivations: No specs! Proving assertions statically is harder (program-specific invariants, environment modeling, ..).
Applications: program evolution; compilers (equivalence preserving, approximate); comparing independent implementations.
Notes: Approximate compilers are a new class of compilers that sacrifice equivalence for performance. Example: loop perforation (skip every other iteration). This applies where precision does not matter that much, in certain domains (images, video processing). Output_new = \delta * Output_old, where Output_old is the output of the old implementation. Independent implementations: two independent implementations of SSL, with no structural similarities (Dawson Engler).

SymDiff architecture
Language agnostic: relies on translators from C/C#/Java/x86 to Boogie (bpl).
Diagram (labels only): P1.bpl, P2.bpl, Product, P1P2.bpl, Invariant inference, P1P2.bpl + invs, Diff spec, Boogie program verifier + Z3.
A bug is a failure of a differential property such as equivalence, DAC, relative termination, etc.
Notes: The spec lives in a separate file. A mutual summary of f, f’ is proved per pair of functions. If f’ was split in two, you can still specify it, so the spec is an expression over the inputs/outputs of functions from both sides.

Language: a subset of Boogie (an intermediate verification language) [FMCO’05].
Commands:
  x := E
  assert E
  assume E
  S; T
  goto L1, L2, … Ln   // non-deterministic jump to labels
  call x,y := Foo(e1,e2,..)   // procedure call
Loops are encoded as tail-recursive procedures.
Arrays can be encoded using the SMT theory of arrays (sel/upd):
  x[e]  →  sel(x, e)
  x[y] := z  →  x := upd(x, y, z)
  x == y  →  ∀ i. sel(x,i) == sel(y,i)   (extensional arrays)
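The sel/upd encoding can be mimicked in a few lines of Python. This is only an illustrative sketch: the helper names `sel`, `upd` and `ext_equal` follow the slide's notation, while the choice of a dict with a default value of 0 for unwritten indices is an assumption made here, not something SymDiff or Boogie prescribes.

```python
# Sketch of the functional array operations named on the slide.
# Assumption: unwritten indices read as DEFAULT (0); this is for illustration only.
DEFAULT = 0

def sel(arr, i):
    """Read index i of a functional map."""
    return arr.get(i, DEFAULT)

def upd(arr, i, v):
    """Return a NEW map equal to arr except that index i maps to v."""
    new = dict(arr)
    new[i] = v
    return new

def ext_equal(x, y):
    """Extensional equality: sel(x, i) == sel(y, i) for every index i.
    Indices absent from both maps read as DEFAULT on both sides,
    so checking the union of written indices is enough."""
    return all(sel(x, i) == sel(y, i) for i in set(x) | set(y))

a = {}
b = upd(a, 3, 7)                      # models  b := upd(a, 3, 7)
c1 = upd(upd(a, 1, 5), 2, 6)          # two writes in one order
c2 = upd(upd(a, 2, 6), 1, 5)          # the same writes in the other order
```

Because `upd` never mutates its argument, `a` is unchanged after the write, which matches the value semantics of maps in the array theory.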

Modeling imperative programs/heaps
Various translators to Boogie: C (HAVOC/SMACK/VCC/..), Java (Joogie/..), C# (BCT).
E.g., C heap modeling in HAVOC [CHLQ, POPL’09]: a pointer is represented as an integer (int), with one heap map per scalar/pointer structure field and pointer type.
  struct A { int f; int g; } x;
  Mem_A_f : [int]int
  Mem_A_g : [int]int
Simple example:
  C code:  x->f = 1;
  Boogie:  Mem_A_f[x + Offset(f,A)] := 1;
Notes: There is a map for each field of the structure. A is the name of the structure.
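A rough Python analogue of the per-field heap maps may help. The names `Mem_A_f`, `Mem_A_g` and `Offset` come from the slide; the concrete offsets (0 and 4), the address 1000 and the helper `write_field` are assumptions made for this sketch only.

```python
# Sketch of a field-split heap: one integer-indexed map per struct field.
# Assumption: struct A lays out f at offset 0 and g at offset 4.
OFFSET = {("f", "A"): 0, ("g", "A"): 4}

def Offset(field, struct):
    """Offset of `field` inside `struct` (hypothetical layout table)."""
    return OFFSET[(field, struct)]

def write_field(mem, field, struct, ptr, value):
    """Model  Mem[ptr + Offset(field, struct)] := value  as a functional update."""
    new = dict(mem)
    new[ptr + Offset(field, struct)] = value
    return new

Mem_A_f = {}          # heap map for field f of struct A
Mem_A_g = {}          # heap map for field g of struct A

x = 1000              # a pointer is just an integer address (assumed value)
Mem_A_f = write_field(Mem_A_f, "f", "A", x, 1)   # models  x->f = 1;
```

The write touches only `Mem_A_f`, so a verifier sees immediately that `x->g` is unaffected, without reasoning about aliasing between fields.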

Differential specs: mutual summaries
  void F1(int x1){ if(x1 < 100){ g1 := g1 + x1; F1(x1 + 1); }}
  void F2(int x2){ if(x2 < 100){ g2 := g2 + 2* x2; F2(x2 + 1); }}
  MS(F1, F2): (x1 = x2 && g1 <= g2 && x1 >= 0) ==> g1’ <= g2’
A specification over the I/O vocabulary of (F1, F2). Inputs: parameters, globals (g). Outputs: return values, next state of globals (g’). [Hawblitzel, Kawaguchi, Lahiri, Rebelo CADE’13]
Notes: How the specification is written. Last line: formal definition.
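To make the mutual summary concrete, the two procedures can be transcribed into Python and MS(F1, F2) checked on a small range of inputs. This is bounded testing, not a proof, and it is not how SymDiff discharges the property. The global `g` is passed and returned explicitly, and the helper names and input ranges are choices made for this sketch.

```python
def F1(x1, g1):
    """Transcription of F1; returns the final value of global g1 (g1')."""
    if x1 < 100:
        return F1(x1 + 1, g1 + x1)
    return g1

def F2(x2, g2):
    """Transcription of F2; returns the final value of global g2 (g2')."""
    if x2 < 100:
        return F2(x2 + 1, g2 + 2 * x2)
    return g2

def mutual_summary(x1, g1, x2, g2):
    """MS(F1, F2): (x1 = x2 && g1 <= g2 && x1 >= 0) ==> g1' <= g2'."""
    precondition = (x1 == x2) and (g1 <= g2) and (x1 >= 0)
    return (not precondition) or (F1(x1, g1) <= F2(x2, g2))

def holds_on_samples():
    """Check the summary on a small grid of inputs (assumed ranges)."""
    return all(
        mutual_summary(x, g1, x, g2)
        for x in range(0, 105)
        for g1 in range(-3, 4)
        for g2 in range(g1, g1 + 4)
    )
```

The conjunct `x1 >= 0` matters: starting both procedures at -200 with `g1 == g2 == 0` gives `g1' = -15150` and `g2' = -30300`, so `g1' <= g2'` fails once negative arguments are allowed.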

And now... verification.
Differential verification ⇒ single-program verification.
Leverage existing verification machinery: VC generation, SMT solvers.
Invariant inference to infer intermediate specifications.
A novel product (P1 x P2) construction [FSE’13]: allows interprocedural reasoning; synchronizes at procedure boundaries only; can map one procedure to many procedures.
Notes: Symmetric products rely on similarity of the CFGs.

The product program
  proc f1(x1): r1 modifies g1 { w1 := call h1(e1); t1 }
  proc f2(x2): r2 modifies g2 { s2; L2: w2 := call h2(e2); t2 }
Diagram labels: f1, f2, Instrument calls; Replay, constrain, restore.
Notes: Suppose we have two procedures f1 and f2 that call procedures h1 and h2. We can compose them to obtain a joint procedure for f1 and f2. The details are in the paper, but the most important part of this composition is that the joint procedure of f1 and f2 calls the joint procedure for h1 and h2. This transformation helps us prove the following result. The second call to h1_h2 is only required for the proof. It may contain a specification, which will be a postcondition of h1_h2. What we see here is the product f1_f2, which can be given to any invariant-generation tool.

Property of the product
Let p1_p2 be the product of (p1, p2).
Theorem: If S_p1_p2 is a summary of p1_p2, then it is a mutual summary of (p1, p2).
Aids in differential specification: a specification of the summary S_p1_p2 (e.g. partial equivalence) can be added as a postcondition of p1_p2.
More importantly, aids differential invariant inference: one can perform analysis on the program P1xP2 to infer sound mutual summaries of (P1, P2), and can infer the summaries of intermediate procedures.

Automatic differential invariant inference
Performing invariant inference on the product program.
Experience with Duality (infers invariants): it diverges, e.g., ((x1 = 0 ∧ x2 = 1) ∨ (x1 = 1 ∧ x2 = 2) ∨ …) instead of (x1 < x2).
Current approach is based on predicate abstraction: infer a Boolean combination of predicates, or Houdini (a conjunction over a predefined set of predicates).
Automatically provide all simple differential predicates: x1 ⋈ x2, where x1 in p1, x2 in p2, ⋈ ∈ {=, ≤, ≥, ⇒, …}.
Notes: With Houdini, you have to provide the predicates.
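The Houdini idea (start from all candidate predicates and drop those that are refuted until the remaining conjunction is inductive) can be sketched in Python. Several things here are assumptions for illustration: the "verifier" is replaced by an exhaustive check over a small grid of states, the paired loops `lockstep` and `faster_second` are made-up examples, and the `<` candidate is added alongside the relations listed on the slide. SymDiff itself relies on Boogie and Z3 for these checks.

```python
import itertools
import operator

# Candidate differential predicates x1 ⋈ x2 (the "<" entry is an assumed extra).
CANDIDATES = {
    "x1 == x2": operator.eq,
    "x1 <= x2": operator.le,
    "x1 >= x2": operator.ge,
    "x1 < x2": operator.lt,
}

# Finite grid of (x1, x2) states standing in for a real verifier's search.
STATES = list(itertools.product(range(-5, 6), repeat=2))

def houdini(candidates, init_state, step, states=STATES):
    """Return the names of candidates whose conjunction holds initially
    and is preserved by `step` on every grid state that satisfies it."""
    alive = {n: p for n, p in candidates.items() if p(*init_state)}
    while True:
        broken = set()
        for state in states:
            if all(p(*state) for p in alive.values()):
                successor = step(state)
                broken |= {n for n, p in alive.items() if not p(*successor)}
        if not broken:
            return set(alive)
        for name in broken:
            del alive[name]

def lockstep(state):
    """Both versions increment their counter by one per iteration."""
    x1, x2 = state
    return (x1 + 1, x2 + 1)

def faster_second(state):
    """The second version advances twice as fast."""
    x1, x2 = state
    return (x1 + 1, x2 + 2)
```

Starting `lockstep` from (0, 1) mirrors the slide's example: instead of enumerating (x1 = 0 ∧ x2 = 1), (x1 = 1 ∧ x2 = 2), and so on, the surviving conjunction contains `x1 < x2`.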

Applications (1 / 3) Equivalence checking Compiler translation validation [CADE’13] Cross-version compiler validation by comparing binaries [FSE’13] Translation from binary to Boogie. Semantic slicing. Taint – things that changed.

Example (equivalence checking)
  f1(n) { if (n == 0) { return 1; } else { return n * f1(n - 1); } }
  main(n) { return f1(n); }
  f2(n, a) { if (n == 0) { return a; } else { return f2(n - 1, a * n); } }
  main(n) { return f2(n, 1); }
Spec: MS(main1, main2) = (n1 == n2 ⇒ r1’ == r2’)
MS(f1, f2): (n1 == n2 ⇒ a2*r1’ == r2’)
The proof involves non-trivial diff specs; the user only provides the predicate (a2*r1’ == r2’).
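The two factorial versions and their mutual summaries can be tried out directly in Python. The sketch below only tests the summaries on small non-negative inputs (both recursions diverge on negative n); the function names `main1`, `main2` and the sampled ranges are assumptions made here, and a test run is of course weaker than the proof SymDiff aims for.

```python
def f1(n):
    """Plain recursive factorial."""
    return 1 if n == 0 else n * f1(n - 1)

def f2(n, a):
    """Tail-recursive factorial with accumulator a."""
    return a if n == 0 else f2(n - 1, a * n)

def main1(n):
    return f1(n)

def main2(n):
    return f2(n, 1)

def ms_f1_f2(n1, n2, a2):
    """MS(f1, f2): n1 == n2  ==>  a2 * r1' == r2'."""
    return (n1 != n2) or (a2 * f1(n1) == f2(n2, a2))

def ms_main(n1, n2):
    """MS(main1, main2): n1 == n2  ==>  r1' == r2'."""
    return (n1 != n2) or (main1(n1) == main2(n2))

def summaries_hold_on_samples():
    """Check both summaries for n in 0..10 and accumulators in -3..3."""
    inner = all(ms_f1_f2(n, n, a) for n in range(11) for a in range(-3, 4))
    outer = all(ms_main(n, n) for n in range(11))
    return inner and outer
```

The summary for `main` follows from the summary for the helpers by instantiating `a2 = 1`, which is why the intermediate predicate `a2*r1’ == r2’` is the only non-obvious ingredient.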

Applications (2 / 3) Differential assertion checking [FSE’13]

Differential Assertion Checking (DAC) [Lahiri et al. FSE’13; Joshi, Lahiri, Lal POPL’12]
Correctness → relative correctness: an input that can* satisfy p cannot fail p’. Note: this is an asymmetric check.
How? Replace assert A with ok := ok && A; then write a mutual summary: MS(f1,f2) = ((i1 == i2 && ok1’) ==> ok2’)
Notes: Originally i1 == i2 && g1 = g2 => ok1 = ok2. We instead say that globals are part of the inputs.
* Nondeterminism
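The assert-to-ok rewriting can be illustrated with a toy pair of program versions in Python. Everything concrete here (the buffer size 4, the three versions `old`, `patched`, `bad_patch`, and the helper `dac_violations`) is made up for the sketch, and the check is done by enumerating inputs rather than by verification.

```python
def old(i):
    """Old version; `assert 0 <= i < 4` is replaced by an ok flag."""
    ok = True
    ok = ok and (0 <= i < 4)
    return ok

def patched(i):
    """Patched version: wraps large indices before the same check."""
    ok = True
    if i >= 4:
        i = 0
    ok = ok and (0 <= i < 4)
    return ok

def bad_patch(i):
    """A hypothetical faulty patch with a stricter assertion."""
    ok = True
    ok = ok and (0 <= i < 2)
    return ok

def dac_violations(p1, p2, inputs):
    """Inputs that break MS(p1, p2): (i1 == i2 && ok1') ==> ok2'."""
    return [i for i in inputs if p1(i) and not p2(i)]

INPUTS = range(-3, 10)
```

Swapping the arguments shows the asymmetry noted on the slide: the patched version passes on inputs 4 through 9 where the old one fails, which is allowed in one direction and reported as violations in the other.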

DAC application: verifying bug fixes
Does a fix inadvertently introduce new bugs?
Verisec suite: “snippets of open source programs which contain buffer overflow vulnerabilities, and corresponding patched versions.”
Relative memory safety (e.g. buffer overflow) checking on snippets from apache, madwifi, sendmail, etc.
Verified several bug fixes using automatic differential invariant inference.
Notes: First, let us talk about verifying bug fixes. We are interested in answering the question whether a bug fix can inadvertently introduce new errors. For this case study we use the Verisec benchmark suite. This suite has buggy and patched versions of snippets of open source software. Since the bugs are buffer overflow errors, we validate relative buffer overflows in the patched version w.r.t. the buggy version. Our tool is able to automatically prove the relative correctness of these snippets, thus ensuring that a new buffer overflow vulnerability was not introduced during the fix. For more details please refer to the paper, but to give an idea of these patches, I will show an example.

Example: DAC
  int main_buggy() { … fb = 0; while ((c1 = read()) != EOF) { fbuf[fb] = c1; fb++; } }
  int main_patched() { … fb = 0; while ((c1 = read()) != EOF) { fbuf[fb] = c1; fb++; if (fb >= MAX) fb = 0; } }
Differential loop invariant: fb2 ≤ fb1
Buffer overflow: can verify (relative) memory safety automatically, without manual preconditions.
Notes: Here is a buggy version for one of the benchmarks in the Verisec suite. The variable fb increases beyond its bound and can overflow fbuf. In the patched version, fb is re-initialized to zero when it exceeds a bound. Houdini is able to infer that fb of the patched version always has a value less than or equal to fb of the buggy version, and uses this invariant to automatically prove the relative buffer overflow specification.
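The differential loop invariant can be observed by running the two loops in lockstep on the same input. The Python sketch below is a simulation under assumptions: `MAX = 4` is an arbitrary buffer size, the actual buffer write is replaced by the bounds check `fb < MAX` recorded in an ok flag, and `run_pair` is a helper name invented here.

```python
MAX = 4  # assumed buffer size for the sketch

def run_pair(stream):
    """Run the buggy (1) and patched (2) loops side by side on `stream`.
    Returns (ok1, ok2, invariant_held), where invariant_held records
    whether fb2 <= fb1 was true after every iteration."""
    fb1 = fb2 = 0
    ok1 = ok2 = True
    invariant_held = True
    for _ in stream:
        # Buggy version: fbuf[fb1] = c1; fb1++;
        ok1 = ok1 and (fb1 < MAX)
        fb1 += 1
        # Patched version: fbuf[fb2] = c1; fb2++; if (fb2 >= MAX) fb2 = 0;
        ok2 = ok2 and (fb2 < MAX)
        fb2 += 1
        if fb2 >= MAX:
            fb2 = 0
        invariant_held = invariant_held and (fb2 <= fb1)
    return ok1, ok2, invariant_held

short_run = run_pair("abc")        # fits in the buffer for both versions
long_run = run_pair("abcdefgh")    # overflows only the buggy version
```

On the long input the buggy version's flag goes false while the patched one stays true, which is consistent with the relative property ok1 ⇒ ok2; the invariant fb2 ≤ fb1 is what lets a verifier conclude this for all inputs rather than for the two sampled here.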

Applications (3 / 3)
Current research:
Verifying approximate transformations: preserve assertions, termination and accuracy [w/ Rakamaric, He].
Semantic change impact analysis: injecting change semantics into a dataflow-based taint analysis [w/ Partush, Gyori].
Notes: Translation from binary to Boogie. Semantic slicing. Taint – things that changed.