Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: An Appetizer Verification/Analysis tools need some form of Symbolic Reasoning.

Slides:



Advertisements
Similar presentations
SMELS: Sat Modulo Equality with Lazy Superposition Christopher Lynch – Clarkson Duc-Khanh Tran - MPI.
Advertisements

Quantified Invariant Generation using an Interpolating Saturation Prover Ken McMillan Cadence Research Labs TexPoint fonts used in EMF: A A A A A.
Demand-driven inference of loop invariants in a theorem prover
Technologies for finding errors in object-oriented software K. Rustan M. Leino Microsoft Research, Redmond, WA Lecture 0 Summer school on Formal Models.
Automated Theorem Proving Lecture 1. Program verification is undecidable! Given program P and specification S, does P satisfy S?
Leonardo de Moura Microsoft Research. Z3 is a new solver developed at Microsoft Research. Development/Research driven by internal customers. Free for.
The Model Evolution Calculus with Built-in Theories Peter Baumgartner MPI Informatik, Saarbrücken
Satisfiability Modulo Theories and Network Verification Nikolaj Bjørner Microsoft Research Formal Methods and Networks Summer School Ithaca, June
Semantics Static semantics Dynamic semantics attribute grammars
Finding bugs: Analysis Techniques & Tools Symbolic Execution & Constraint Solving CS161 Computer Security Cho, Chia Yuan.
Satisfiability Modulo Theories (An introduction)
Abstraction and Modular Reasoning for the Verification of Software Corina Pasareanu NASA Ames Research Center.
Linked List Implementation class List { private List next; private Object data; private static List root; private static int size; public static void addNew(Object.
UIUC CS 497: Section EA Lecture #2 Reasoning in Artificial Intelligence Professor: Eyal Amir Spring Semester 2004.
Rigorous Software Development CSCI-GA Instructor: Thomas Wies Spring 2012 Lecture 11.
Methods of Proof Chapter 7, second half.. Proof methods Proof methods divide into (roughly) two kinds: Application of inference rules: Legitimate (sound)
Leonardo de Moura and Nikolaj Bjørner Microsoft Research.
ISBN Chapter 3 Describing Syntax and Semantics.
An Integration of Program Analysis and Automated Theorem Proving Bill J. Ellis & Andrew Ireland School of Mathematical & Computer Sciences Heriot-Watt.
Leonardo de Moura Microsoft Research. Verification/Analysis tools need some form of Symbolic Reasoning.
Leonardo de Moura Microsoft Research. Satisfiability Modulo Theories: A Calculus of Computation Verification/Analysis tools need some form of Symbolic.
Formal Methods of Systems Specification Logical Specification of Hard- and Software Prof. Dr. Holger Schlingloff Institut für Informatik der Humboldt.
Plan for today Proof-system search ( ` ) Interpretation search ( ² ) Quantifiers Equality Decision procedures Induction Cross-cutting aspectsMain search.
Constraint Logic Programming Ryan Kinworthy. Overview Introduction Logic Programming LP as a constraint programming language Constraint Logic Programming.
Last time Proof-system search ( ` ) Interpretation search ( ² ) Quantifiers Equality Decision procedures Induction Cross-cutting aspectsMain search strategy.
Nikolaj Bjørner Microsoft Research Lecture 3. DayTopicsLab 1Overview of SMT and applications. SAT solving, Z3 Encoding combinatorial problems with Z3.
Proof-system search ( ` ) Interpretation search ( ² ) Main search strategy DPLL Backtracking Incremental SAT Natural deduction Sequents Resolution Main.
Ch. 1: Software Development (Read) 5 Phases of Software Life Cycle: Problem Analysis and Specification Design Implementation (Coding) Testing, Execution.
K. Rustan M. Leino Research in Software Engineering (RiSE) Microsoft Research, Redmond, WA part 0 Summer School on Logic and Theorem-Proving in Programming.
Yeting Ge Leonardo de Moura New York University Microsoft Research.
Search in the semantic domain. Some definitions atomic formula: smallest formula possible (no sub- formulas) literal: atomic formula or negation of an.
Program Exploration with Pex Nikolai Tillmann, Peli de Halleux Pex
Last time Proof-system search ( ` ) Interpretation search ( ² ) Quantifiers Equality Decision procedures Induction Cross-cutting aspectsMain search strategy.
Nikolaj Bjørner Leonardo de Moura Nikolai Tillmann Microsoft Research August 11’th 2008.
1 A propositional world Ofer Strichman School of Computer Science, Carnegie Mellon University.
Describing Syntax and Semantics
Austin, Texas 2011 Theorem Proving Tools for Program Analysis SMT Solvers: Yices & Z3 Austin, Texas 2011 Nikolaj Bjørner 2, Bruno Dutertre 1, Leonardo.
272: Software Engineering Fall 2012 Instructor: Tevfik Bultan Lecture 4: SMT-based Bounded Model Checking of Concurrent Software.
Leonardo de Moura Microsoft Research. Many approaches Graph-based for difference logic: a – b  3 Fourier-Motzkin elimination: Standard Simplex General.
Nikolaj Bjørner and Leonardo de Moura Microsoft Research Microsoft Corporation McMaster University, November 28, 2007.
Leonardo de Moura Microsoft Research. Quantifiers in Satisfiability Modulo Theories Logic is “The Calculus of Computer Science” (Z. Manna). High computational.
Nikolaj Bjørner, Leonardo de Moura Microsoft Research Bruno Dutertre SRI International.
CMU, Oct 4 DPLL-based Checkers for Satisfiability Modulo Theories Cesare Tinelli Department of Computer Science The University of Iowa Joint work with.
SAT and SMT solvers Ayrat Khalimov (based on Georg Hofferek‘s slides) AKDV 2014.
Leonardo de Moura and Nikolaj Bjørner Microsoft Research.
Introduction to Satisfiability Modulo Theories
Leonardo de Moura Microsoft Research. Is formula F satisfiable modulo theory T ? SMT solvers have specialized algorithms for T.
SAT & SMT. Software Verification & Testing SAT & SMT Test case generation Predicate Abstraction Verifying Compilers.
Leonardo de Moura Microsoft Research. SATTheoriesSMT Arithmetic Bit-vectors Arrays …
Lazy Annotation for Program Testing and Verification Speaker: Chen-Hsuan Adonis Lin Advisor: Jie-Hong Roland Jiang November 26,
Reasoning about programs March CSE 403, Winter 2011, Brun.
Explorations in Artificial Intelligence Prof. Carla P. Gomes Module Logic Representations.
Integrating high-level constructs into programming languages Language extensions to make programming more productive Underspecified programs –give assertions,
Leonardo de Moura Microsoft Research. Quantifiers in Satisfiability Modulo Theories Logic is “The Calculus of Computer Science” (Z. Manna). High computational.
Leonardo de Moura Microsoft Research. Verification/Analysis tools need some form of Symbolic Reasoning.
Decision methods for arithmetic Third summer school on formal methods Leonardo de Moura Microsoft Research.
Leonardo de Moura and Nikolaj Bjørner Microsoft Research.
Symbolic and Concolic Execution of Programs Information Security, CS 526 Omar Chowdhury 10/7/2015Information Security, CS 5261.
Nikolaj Bjørner Microsoft Research DTU Winter course January 2 nd 2012 Organized by Flemming Nielson & Hanne Riis Nielson.
Random Interpretation Sumit Gulwani UC-Berkeley. 1 Program Analysis Applications in all aspects of software development, e.g. Program correctness Compiler.
Leonardo de Moura Microsoft Research. Applications and Challenges in Satisfiability Modulo Theories Verification/Analysis tools need some form of Symbolic.
CUTE: A Concolic Unit Testing Engine for C Koushik SenDarko MarinovGul Agha University of Illinois Urbana-Champaign.
Logic Engines as a Service Leonardo de Moura and Nikolaj Bjørner Microsoft Research.
Finding Conflicting Instances of Quantified Formulas in SMT Andrew Reynolds Cesare Tinelli Leonardo De Moura July 18, 2014.
Selected Decision Procedures and Techniques for SMT More on combination – theories sharing sets – convex theory Un-interpreted function symbols (quantifier-free.
Knowledge Repn. & Reasoning Lecture #9: Propositional Logic UIUC CS 498: Section EA Professor: Eyal Amir Fall Semester 2005.
Satisfiability Modulo Theories and DPLL(T) Andrew Reynolds March 18, 2015.
Inference and search for the propositional satisfiability problem
Solving Linear Arithmetic with SAT-based MC
Symbolic Implementation of the Best Transformer
Presentation transcript:

Leonardo de Moura Microsoft Research

Satisfiability Modulo Theories: An Appetizer Verification/Analysis tools need some form of Symbolic Reasoning

Logic is “The Calculus of Computer Science” (Z. Manna). High computational complexity Satisfiability Modulo Theories: An Appetizer

Test case generationVerifying CompilersPredicate Abstraction Invariant Generation Type Checking Model Based Testing

VCC Hyper-V Terminator T-2 NModel HAVOC F7 SAGE Vigilante SpecExplorer Satisfiability Modulo Theories: An Appetizer

unsigned GCD(x, y) { requires(y > 0); while (true) { unsigned m = x % y; if (m == 0) return y; x = y; y = m; } We want a trace where the loop is executed twice. (y 0 > 0) and (m 0 = x 0 % y 0 ) and not (m 0 = 0) and (x 1 = y 0 ) and (y 1 = m 0 ) and (m 1 = x 1 % y 1 ) and (m 1 = 0) SolverSolver x 0 = 2 y 0 = 4 m 0 = 2 x 1 = 4 y 1 = 2 m 1 = 0 SSASSA Satisfiability Modulo Theories: An Appetizer

Signature: div : int, { x : int | x  0 }  int Satisfiability Modulo Theories: An Appetizer Subtype Call site: if a  1 and a  b then return div(a, b) Verification condition a  1 and a  b implies b  0

Satisfiability Modulo Theories: An Appetizer Is formula F satisfiable modulo theory T ? SMT solvers have specialized algorithms for T

b + 2 = c and f(read(write(a,b,3), c-2) ≠ f(c-b+1) Satisfiability Modulo Theories: An Appetizer

ArithmeticArithmetic b + 2 = c and f(read(write(a,b,3), c-2) ≠ f(c-b+1) Satisfiability Modulo Theories: An Appetizer

ArithmeticArithmetic Array Theory b + 2 = c and f(read(write(a,b,3), c-2) ≠ f(c-b+1) Satisfiability Modulo Theories: An Appetizer

ArithmeticArithmetic Array Theory Uninterpreted Functions b + 2 = c and f(read(write(a,b,3), c-2) ≠ f(c-b+1) Satisfiability Modulo Theories: An Appetizer

A Theory is a set of sentences Alternative definition: A Theory is a class of structures Satisfiability Modulo Theories: An Appetizer

Z3 is a new solver developed at Microsoft Research. Development/Research driven by internal customers. Free for academic research. Interfaces: Z3 TextC/C++.NETOCaml Satisfiability Modulo Theories: An Appetizer

For most SMT solvers: F is a set of ground formulas Many Applications Bounded Model Checking Test-Case Generation Satisfiability Modulo Theories: An Appetizer

a = b, b = c, d = e, b = s, d = t, a  e, a  s Satisfiability Modulo Theories: An Appetizer a a b b c c d d e e s s t t

a = b, b = c, d = e, b = s, d = t, a  e, a  s Satisfiability Modulo Theories: An Appetizer a a b b c c d d e e s s t t a,b

a = b, b = c, d = e, b = s, d = t, a  e, a  s Satisfiability Modulo Theories: An Appetizer c c d d e e s s t t a,b a,b,c

d,e a = b, b = c, d = e, b = s, d = t, a  e, a  s Satisfiability Modulo Theories: An Appetizer d d e e s s t t a,b,c

a,b,c,s a = b, b = c, d = e, b = s, d = t, a  e, a  s Satisfiability Modulo Theories: An Appetizer s s t t a,b,c d,e

a = b, b = c, d = e, b = s, d = t, a  e, a  s Satisfiability Modulo Theories: An Appetizer t t d,e a,b,c,s d,e,t

a = b, b = c, d = e, b = s, d = t, a  e, a  s Satisfiability Modulo Theories: An Appetizer a,b,c,s d,e,t

a = b, b = c, d = e, b = s, d = t, a  e, a  s Satisfiability Modulo Theories: An Appetizer a,b,c,s d,e,t Unsatisfiable

a = b, b = c, d = e, b = s, d = t, a  e Satisfiability Modulo Theories: An Appetizer a,b,c,s d,e,t Model |M| = { 0, 1 } M(a) = M(b) = M(c) = M(s) = 0 M(d) = M(e) = M(t) = 1

a = b, b = c, d = e, b = s, d = t, f(a, g(d))  f(b, g(e)) Satisfiability Modulo Theories: An Appetizer a,b,c,s d,e,t g(d) f(a,g(d)) g(e) f(b,g(e)) Congruence Rule: x 1 = y 1, …, x n = y n implies f(x 1, …, x n ) = f(y 1, …, y n )

a = b, b = c, d = e, b = s, d = t, f(a, g(d))  f(b, g(e)) Satisfiability Modulo Theories: An Appetizer a,b,c,s d,e,t g(d) f(a,g(d)) g(e) f(b,g(e)) Congruence Rule: x 1 = y 1, …, x n = y n implies f(x 1, …, x n ) = f(y 1, …, y n ) g(d),g(e)

a = b, b = c, d = e, b = s, d = t, f(a, g(d))  f(b, g(e)) Satisfiability Modulo Theories: An Appetizer a,b,c,s d,e,t f(a,g(d)) f(b,g(e)) Congruence Rule: x 1 = y 1, …, x n = y n implies f(x 1, …, x n ) = f(y 1, …, y n ) g(d),g(e) f(a,g(d)),f(b,g(e))

a = b, b = c, d = e, b = s, d = t, f(a, g(d))  f(b, g(e)) Satisfiability Modulo Theories: An Appetizer a,b,c,s d,e,t g(d),g(e) f(a,g(d)),f(b,g(e)) Unsatisfiable

(fully shared) DAGs for representing terms Union-find data-structure + Congruence Closure O(n log n) Satisfiability Modulo Theories: An Appetizer

x 2 y – 1 = 0, xy 2 – y = 0, xz – z + 1 = 0 Tool: Gröbner Basis

Satisfiability Modulo Theories: An Appetizer Polynomial Ideals: Algebraic generalization of zeroness 0  I p  I, q  I implies p + q  I p  I implies pq  I

Satisfiability Modulo Theories: An Appetizer The ideal generated by a finite collection of polynomials P = { p 1, …, p n } is defined as: I(P) = {p 1 q 1 + … + p n q n | q 1, …, q n are polynomials} P is called a basis for I(P). Intuition: For all s  I(P), p 1 = 0, …, p n = 0 implies s = 0

Satisfiability Modulo Theories: An Appetizer Hilbert’s Weak Nullstellensatz p 1 = 0, …, p n = 0 is unsatisfiable over C iff I({p 1, …, p n }) contains all polynomials 1  I({p 1, …, p n })

Satisfiability Modulo Theories: An Appetizer 1 st Key Idea: polynomials as rewrite rules. xy 2 – y = 0 Becomes xy 2  y The rewriting system is terminating but it is not confluent. xy 2  y, x 2 y  1 x2y2x2y2 xy y

Satisfiability Modulo Theories: An Appetizer 2 nd Key Idea: Completion. xy 2  y, x 2 y  1 x2y2x2y2 xy y Add polynomial: xy – y = 0 xy  y

Satisfiability Modulo Theories: An Appetizer x 2 y – 1 = 0, xy 2 – y = 0, xz – z + 1 = 0 x 2 y  1, xy 2  y, xz  z – 1 x 2 y  1, xy 2  y, xz  z – 1, xy  y xy  1, xy 2  y, xz  z – 1, xy  y y  1, xy 2  y, xz  z – 1, xy  y y  1, x  1, xz  z – 1, xy  y y  1, x  1, 1 = 0, xy  y

In practice, we need a combination of theory solvers. Nelson-Oppen combination method. Reduction techniques. Model-based theory combination. Satisfiability Modulo Theories: An Appetizer

M | F Partial model Set of clauses Satisfiability Modulo Theories: An Appetizer

Guessing (case-splitting) p,  q | p  q,  q  r p | p  q,  q  r Satisfiability Modulo Theories: An Appetizer

Deducing p, s| p  q,  p  s p | p  q,  p  s Satisfiability Modulo Theories: An Appetizer

Backtracking p, s| p  q, s  q,  p   q p,  s, q | p  q, s  q,  p   q Satisfiability Modulo Theories: An Appetizer

Efficient indexing (two-watch literal) Non-chronological backtracking (backjumping) Lemma learning … Satisfiability Modulo Theories: An Appetizer

Efficient decision procedures for conjunctions of ground literals. a=b, a 10 Satisfiability Modulo Theories: An Appetizer

a=b, a > 0, c > 0, a + c < 0 | F backtrack

Satisfiability Modulo Theories: An Appetizer SMT Solver = DPLL + Decision Procedure Standard question: Why don’t you use CPLEX for handling linear arithmetic? Standard question: Why don’t you use CPLEX for handling linear arithmetic?

Satisfiability Modulo Theories: An Appetizer Decision Procedures must be: Incremental & Backtracking Theory Propagation a=b, a<5 | … a<6  f(a) = a a=b, a<5, a<6 | … a<6  f(a) = a

Satisfiability Modulo Theories: An Appetizer Decision Procedures must be: Incremental & Backtracking Theory Propagation Precise (theory) lemma learning a=b, a > 0, c > 0, a + c < 0 | F Learn clause:  (a=b)   (a > 0)   (c > 0)   (a + c < 0) Imprecise! Precise clause:  a > 0   c > 0   a + c < 0

For some theories, SMT can be reduced to SAT bvmul 32 (a,b) = bvmul 32 (b,a) Higher level of abstraction Satisfiability Modulo Theories: An Appetizer

F  TF  T First-order Theorem Prover T may not have a finite axiomatization Satisfiability Modulo Theories: An Appetizer

Test (correctness + usability) is 95% of the deal: Dev/Test is 1-1 in products. Developers are responsible for unit tests. Tools: Annotations and static analysis (SAL + ESP) File Fuzzing Unit test case generation Satisfiability Modulo Theories: An Appetizer

Security is critical Security bugs can be very expensive: Cost of each MS Security Bulletin: $600k to $Millions. Cost due to worms: $Billions. The real victim is the customer. Most security exploits are initiated via files or packets. Ex: Internet Explorer parses dozens of file formats. Security testing: hunting for million dollar bugs Write A/V Read A/V Null pointer dereference Division by zero Satisfiability Modulo Theories: An Appetizer

Two main techniques used by “black hats”: Code inspection (of binaries). Black box fuzz testing. Black box fuzz testing: A form of black box random testing. Randomly fuzz (=modify) a well formed input. Grammar-based fuzzing: rules to encode how to fuzz. Heavily used in security testing At MS: several internal tools. Conceptually simple yet effective in practice Satisfiability Modulo Theories: An Appetizer

Execution Path Run Test and Monitor Path Condition Solve seed New input Test Inputs Constraint System Known Paths Satisfiability Modulo Theories: An Appetizer

PEX Implements DART for.NET. SAGE Implements DART for x86 binaries. YOGI Implements DART to check the feasibility of program paths generated statically. Vigilante Partially implements DART to dynamically generate worm filters. Satisfiability Modulo Theories: An Appetizer

Test input generator Pex starts from parameterized unit tests Generated tests are emitted as traditional unit tests Satisfiability Modulo Theories: An Appetizer

class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } }

class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } } Inputs

(0,null) class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } }

InputsObserved Constraints (0,null)!(c<0) class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } } c < 0  false

InputsObserved Constraints (0,null)!(c<0) && 0==c class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } } 0 == c  true

InputsObserved Constraints (0,null)!(c<0) && 0==c class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } } item == item  true This is a tautology, i.e. a constraint that is always true, regardless of the chosen values. We can ignore such constraints.

Constraints to solve InputsObserved Constraints (0,null)!(c<0) && 0==c !(c<0) && 0!=c class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } }

Constraints to solve InputsObserved Constraints (0,null)!(c<0) && 0==c !(c<0) && 0!=c(1,null) class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } }

Constraints to solve InputsObserved Constraints (0,null)!(c<0) && 0==c !(c<0) && 0!=c(1,null)!(c<0) && 0!=c class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } } 0 == c  false

Constraints to solve InputsObserved Constraints (0,null)!(c<0) && 0==c !(c<0) && 0!=c(1,null)!(c<0) && 0!=c c<0 class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } }

Constraints to solve InputsObserved Constraints (0,null)!(c<0) && 0==c !(c<0) && 0!=c(1,null)!(c<0) && 0!=c c<0(-1,null) class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } }

Constraints to solve InputsObserved Constraints (0,null)!(c<0) && 0==c !(c<0) && 0!=c(1,null)!(c<0) && 0!=c c<0(-1,null)c<0 class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } } c < 0  true

Constraints to solve InputsObserved Constraints (0,null)!(c<0) && 0==c !(c<0) && 0!=c(1,null)!(c<0) && 0!=c c<0(-1,null)c<0 class ArrayList { object[] items; int count; ArrayList(int capacity) { if (capacity < 0) throw...; items = new object[capacity]; } void Add(object item) { if (count == items.Length) ResizeArray(); items[this.count++] = item; }... class ArrayListTest { [PexMethod] void AddItem(int c, object item) { var list = new ArrayList(c); list.Add(item); Assert(list[0] == item); } }

Rich Combination Linear arithmetic BitvectorArrays Free Functions Models Model used as test inputs  -Quantifier Used to model custom theories (e.g.,.NET type system) API Huge number of small problems. Textual interface is too inefficient. Satisfiability Modulo Theories: An Appetizer

Pex “sends” several similar formulas to Z3. Plus: backtracking primitives in the Z3 API. push pop Reuse (some) lemmas. Satisfiability Modulo Theories: An Appetizer

Given a set of constraints C, find a model M that minimizes the interpretation for x 0, …, x n. In the ArrayList example: Why is the model where c = less desirable than the model with c = 1? !(c<0) && 0!=c Simple solution: Assert C while satisfiable Peek x i such that M[x i ] is big Assert x i < n, where n is a small constant Return last found model Satisfiability Modulo Theories: An Appetizer

Given a set of constraints C, find a model M that minimizes the interpretation for x 0, …, x n. In the ArrayList example: Why is the model where c = less desirable than the model with c = 1? !(c<0) && 0!=c Refinement: Eager solution stops as soon as the system becomes unsatisfiable. A “bad” choice (peek x i ) may prevent us from finding a good solution. Use push and pop to retract “bad” choices. Satisfiability Modulo Theories: An Appetizer

Apply DART to large applications (not units). Start with well-formed input (not random). Combine with generational search (not DFS). Negate 1-by-1 each constraint in a path constraint. Generate many children for each parent run. parent generation 1 Satisfiability Modulo Theories: An Appetizer

Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser h: ; h: ; h: ; h: ; h: ; h: ; h: ;.... Generation 0 – seed file Satisfiability Modulo Theories: An Appetizer

Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser h: ; RIFF h: ; h: ; h: ; h: ; h: ; h: ;.... Generation 1 Satisfiability Modulo Theories: An Appetizer

` Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser h: ** ** ** ; RIFF....*** h: ; h: ; h: ; h: ; h: ; h: ;.... Generation 2

Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser h: D ** ** ** ; RIFF=...*** h: ; h: ; h: ; h: ; h: ; h: ;.... Generation 3 Satisfiability Modulo Theories: An Appetizer

Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser h: D ** ** ** ; RIFF=...*** h: ; h: ; h: ;....strh h: ; h: ; h: ;.... Generation 4 Satisfiability Modulo Theories: An Appetizer

Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser h: D ** ** ** ; RIFF=...*** h: ; h: ; h: ;....strh....vids h: ; h: ; h: ;.... Generation 5 Satisfiability Modulo Theories: An Appetizer

Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser h: D ** ** ** ; RIFF=...*** h: ; h: ; h: ;....strh....vids h: ;....strf h: ; h: ;.... Generation 6 Satisfiability Modulo Theories: An Appetizer

Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser h: D ** ** ** ; RIFF=...*** h: ; h: ; h: ;....strh....vids h: ;....strf....( h: ; h: ;.... Generation 7 Satisfiability Modulo Theories: An Appetizer

Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser h: D ** ** ** ; RIFF=...*** h: ; h: ; h: ;....strh....vids h: ;....strf....( h: C9 9D E4 4E ; ɝäN h: ;.... Generation 8 Satisfiability Modulo Theories: An Appetizer

Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser h: D ** ** ** ; RIFF=...*** h: ; h: ; h: ;....strh....vids h: ;....strf....( h: ; h: ;.... Generation 9 Satisfiability Modulo Theories: An Appetizer

Starting with 100 zero bytes … SAGE generates a crashing test for Media1 parser h: D ** ** ** ; RIFF=...*** h: ; h: ; h: ;....strh....vids h: B A ;....strf²uv:( h: ; h: ;.... Generation 10 – CRASH Satisfiability Modulo Theories: An Appetizer

SAGE is very effective at finding bugs. Works on large applications. Fully automated Easy to deploy (x86 analysis – any language) Used in various groups inside Microsoft Powered by Z3. Satisfiability Modulo Theories: An Appetizer

Formulas are usually big conjunctions. SAGE uses only the bitvector and array theories. Pre-processing step has a huge performance impact. Eliminate variables. Simplify formulas. Early unsat detection. Satisfiability Modulo Theories: An Appetizer

Annotated Program Verification Condition F pre/post conditions invariants and other annotations pre/post conditions invariants and other annotations

class C { private int a, z; invariant z > 0 public void M() requires a != 0 { z = 100/a; }

… terminates diverges goes wrong

State Cartesian product of variables Execution trace Nonempty finite sequence of states Infinite sequence of states Nonempty finite sequence of states followed by special error state … (x: int, y: int, z: bool)

x := E x := x + 1 x := 10 havoc x S ; T assert P assume P S  T

Hoare triple{ P } S { Q }says that every terminating execution trace of S that starts in a state satisfying P does not go wrong, and terminates in a state satisfying Q

Hoare triple{ P } S { Q }says that every terminating execution trace of S that starts in a state satisfying P does not go wrong, and terminates in a state satisfying Q Given S and Q, what is the weakest P’ satisfying {P’} S {Q} ? P' is called the weakest precondition of S with respect to Q, written wp(S, Q) to check {P} S {Q}, check P  P’

wp( x := E, Q ) = wp( havoc x, Q ) = wp( assert P, Q ) = wp( assume P, Q ) = wp( S ; T, Q ) = wp( S  T, Q ) = Q[ E / x ] (  x  Q ) P  Q P  Q wp( S, wp( T, Q )) wp( S, Q )  wp( T, Q )

if E then S else T end = assume E; S  assume ¬E; T

while E invariant J do S end =assert J; havoc x; assume J; (assume E; S; assert J; assume false  assume ¬ E ) where x denotes the assignment targets of S “fast forward” to an arbitrary iteration of the loop check that the loop invariant holds initially check that the loop invariant is maintained by the loop body

BIG and-or tree (ground) BIG and-or tree (ground)  Axioms ( (non-ground)  Axioms ( (non-ground) Control & Data Flow

Meta OS: small layer of software between hardware and OS Mini: 60K lines of non-trivial concurrent systems C code Critical: must provide functional resource abstraction Trusted: a verification grand challenge Hardware Hypervisor

VCs have several Mb Thousands of non ground clauses Developers are willing to wait at most 5 min per VC Satisfiability Modulo Theories: An Appetizer

Partial solutions Automatic generation of: Loop Invariants Houdini-style automatic annotation generation Satisfiability Modulo Theories: An Appetizer

Quantifiers, quantifiers, quantifiers, … Modeling the runtime  h,o,f: IsHeap(h)  o ≠ null  read(h, o, alloc) = t  read(h,o, f) = null  read(h, read(h,o,f),alloc) = t Satisfiability Modulo Theories: An Appetizer

Quantifiers, quantifiers, quantifiers, … Modeling the runtime Frame axioms  o, f: o ≠ null  read(h 0, o, alloc) = t  read(h 1,o,f) = read(h 0,o,f)  (o,f)  M Satisfiability Modulo Theories: An Appetizer

Quantifiers, quantifiers, quantifiers, … Modeling the runtime Frame axioms User provided assertions  i,j: i  j  read(a,i)  read(b,j) Satisfiability Modulo Theories: An Appetizer

Quantifiers, quantifiers, quantifiers, … Modeling the runtime Frame axioms User provided assertions Theories  x: p(x,x)  x,y,z: p(x,y), p(y,z)  p(x,z)  x,y: p(x,y), p(y,x)  x = y Satisfiability Modulo Theories: An Appetizer

Quantifiers, quantifiers, quantifiers, … Modeling the runtime Frame axioms User provided assertions Theories Solver must be fast in satisfiable instances. We want to find bugs! Satisfiability Modulo Theories: An Appetizer

There is no sound and refutationally complete procedure for linear integer arithmetic + free function symbols Satisfiability Modulo Theories: An Appetizer

Heuristic quantifier instantiationCombining SMT with Saturation proversComplete quantifier instantiationDecidable fragmentsModel based quantifier instantiation Satisfiability Modulo Theories: An Appetizer

Is the axiomatization of the runtime consistent? False implies everything Partial solution: SMT + Saturation Provers Found many bugs using this approach Satisfiability Modulo Theories: An Appetizer

Standard complain “I made a small modification in my Spec, and Z3 is timingout” This also happens with SAT solvers (NP-complete) In our case, the problems are undecidable Partial solution: parallelization Satisfiability Modulo Theories: An Appetizer

Joint work with Y. Hamadi (MSRC) and C. Wintersteiger Multi-core & Multi-node (HPC) Different strategies in parallel Collaborate exchanging lemmas Strategy 1 Strategy 2 Strategy 3 Strategy 4 Strategy 5 Satisfiability Modulo Theories: An Appetizer

Logic as a platform Most verification/analysis tools need symbolic reasoning SMT is a hot area Many applications & challenges Thank You! Satisfiability Modulo Theories: An Appetizer

SMT solvers use heuristic quantifier instantiation. E-matching (matching modulo equalities). Example:  x: f(g(x)) = x { f(g(x)) } a = g(b), b = c, f(a)  c Trigger Satisfiability Modulo Theories: An Appetizer

SMT solvers use heuristic quantifier instantiation. E-matching (matching modulo equalities). Example:  x: f(g(x)) = x { f(g(x)) } a = g(b), b = c, f(a)  c x=b f(g(b)) = b Equalities and ground terms come from the partial model M Satisfiability Modulo Theories: An Appetizer

Integrates smoothly with DPLL. Efficient for most VCs Decides useful theories: Arrays Partial orders … Satisfiability Modulo Theories: An Appetizer

E-matching is NP-Hard. In practice Satisfiability Modulo Theories: An Appetizer

Trigger: f(x1, g(x1, a), h(x2), b) Trigger: f(x1, g(x1, a), h(x2), b) Instructions: 1.init(f, 2) 2.check(r4, b, 3) 3.bind(r2, g, r5, 4) 4.compare(r1, r5, 5) 5.check(r6, a, 6) 6.bind(r3, h, r7, 7) 7.yield(r1, r7) Instructions: 1.init(f, 2) 2.check(r4, b, 3) 3.bind(r2, g, r5, 4) 4.compare(r1, r5, 5) 5.check(r6, a, 6) 6.bind(r3, h, r7, 7) 7.yield(r1, r7) CompilerCompiler Similar triggers share several instructions. Combine code sequences in a code tree Satisfiability Modulo Theories: An Appetizer

Tight integration: DPLL + Saturation solver. BIG and-or tree (ground) BIG and-or tree (ground) Axioms ( (non-ground) Axioms ( (non-ground) Saturation Solver SMT Satisfiability Modulo Theories: An Appetizer

Inference rule: DPLL(  ) is parametric. Examples: Resolution Superposition calculus … Satisfiability Modulo Theories: An Appetizer

DPLL + Theories DPLL + Theories Saturation Solver Saturation Solver Ground literals Ground clauses Satisfiability Modulo Theories: An Appetizer