Willem Visser Stellenbosch University


Symbolic Execution Willem Visser Stellenbosch University

Overview
What is Symbolic Execution
History of Symbolic Execution
Symbolic PathFinder
Concolic Execution aka Dynamic SE
DSE vs classic SE
Resources: download the docker image at https://hub.docker.com/r/willemvisser/willem-jpf-mutation/
    docker run -it willemvisser/willem-jpf-mutation
    cd ..
    look for TUTORIAL files to follow steps
RW 745 - Willem Visser

Acknowledgements
Corina Pasareanu, my ex-colleague from NASA Ames and probably the world's leading expert on symbolic execution, for doing the YouTube video "Symbolic Execution and Model Checking for Testing" and for putting the presentation on how JPF's symbolic execution now works on the web at http://www.slideworld.com/slideshows.aspx/Symbolic-Execution-of-Java-Bytecode-ppt-823844

What is Symbolic Execution?
Static analysis technique that executes code in a non-standard way
Instead of concrete inputs, symbolic values are manipulated
At each program location, the state of the system is defined by:
    the current assignments to the symbolic inputs and local variables (a symbolic state represents a set of concrete states)
    a path condition that must hold for the execution to reach this location (a condition on the inputs)
    the program counter
At each branch in the code, both paths must be followed:
    on the true branch, the condition is added to the path condition
    on the false branch, the negation of the condition is added to the path condition
    if a branch is infeasible, execution along that branch is terminated
Idea first floated in the mid 1970s

Symbolic Execution: Walking Many Paths at Once
Concrete: [pres = 460; pres_min = 640; pres_max = 960]
    if ((pres < pres_min) || (pres > pres_max)) { … } else { … }
Symbolic: [pres = X; pres_min = MIN; pres_max = MAX] [PC: TRUE]
    if ((pres < pres_min) || (pres > pres_max)) { … } else { … }
Path conditions reached: [PC: X < MIN] [PC: X > MAX] [PC: X >= MIN && X <= MAX]

Concrete Execution Path (example)
    int x, y;
    if (x > y) {
      x = x + y;
      y = x - y;
      x = x - y;
      if (x > y) assert(false);
    }
Trace:
    x = 1, y = 0
    1 >? 0
    x = 1 + 0 = 1
    y = 1 - 0 = 1
    x = 1 - 1 = 0
    0 >? 1

Symbolic Execution Tree (example)
    int x, y;
    if (x > y) {
      x = x + y;
      y = x - y;
      x = x - y;
      if (x > y) assert(false);
    }
Tree:
    x = X, y = Y
    X >? Y
      [ X <= Y ] END
      [ X > Y ] x = X + Y
      [ X > Y ] y = X + Y - Y = X
      [ X > Y ] x = X + Y - X = Y
      [ X > Y ] Y >? X
        [ X > Y, Y <= X ] END
        [ X > Y, Y > X ] END
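A minimal sketch, not SPF itself: the three leaves of the tree above can be cross-checked by brute force over a small input range. The class name `SwapPaths` and the methods `path` and `explore` are hypothetical names introduced only for this illustration. The leaf [ X > Y, Y > X ] should never be reached, since its path condition is unsatisfiable.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: classify concrete runs by the path condition they satisfy.
class SwapPaths {
    // Returns the leaf of the symbolic execution tree that inputs (x, y) end up in.
    static String path(int x, int y) {
        if (x > y) {
            x = x + y;
            y = x - y;   // y now holds the original x
            x = x - y;   // x now holds the original y
            if (x > y) return "[X > Y, Y > X] assert(false)";
            return "[X > Y, Y <= X] END";
        }
        return "[X <= Y] END";
    }

    // Counts how many inputs in [-bound, bound] x [-bound, bound] follow each path.
    static Map<String, Integer> explore(int bound) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int x = -bound; x <= bound; x++)
            for (int y = -bound; y <= bound; y++)
                counts.merge(path(x, y), 1, Integer::sum);
        return counts;
    }

    public static void main(String[] args) {
        explore(20).forEach((pc, n) -> System.out.println(pc + " : " + n));
    }
}
```

Only two leaves show up in the output. A symbolic executor reaches the same conclusion without enumeration, by asking a solver whether X > Y and Y > X can hold together.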

Let's try SPF

History of Symbolic Execution
1975-76: James King, Lori Clarke
1980-2003: Nothing much happened; major improvement in SAT solving + Moore's Law
2003: Generalized Symbolic Execution (classic King/Clarke style, but for a modern programming language, namely Java)
2005: DART (Directed Automated Random Testing), the first concolic/DSE system

Popular SE Systems Dynamic Symbolic Execution CUTE (C) and jCUTE (Java) CREST (C) PEX (.NET) SAGE (x86 binaries) KLEE (C) ? [New] Jalangi (JavaScript) Classic Symbolic Execution Symbolic PathFinder (Java)

Generalized Symbolic Execution 2003 Khurshid, Pasareanu, Visser Main idea is how to handle complex data structures Secondary was the use of model checking as an underlying infrastructure for symbolic execution

Data Structure Example
    class Node {
      int elem;
      Node next;
      Node swapNode() {
        if (next != null)
          if (elem > next.elem) {
            Node t = next;
            next = t.next;
            t.next = this;
            return t;
          }
        return this;
      }
    }
[Figure: table of input lists with constraints (none, E0 > E1, E0 <= E1) and the corresponding output lists; labels include ?, null, E0, E1 and NullPointerException]

Lazy Initialization Algorithm
consider executing next = t.next;
Precondition: acyclic list
[Figure: the list E0 -> E1 with t pointing at E1, followed by the alternative heap shapes explored for t.next, including null and a fresh uninitialized node (?)]

First we need a quick primer on JPF. Let's start by playing with JPF.

JPF Overview
What is JPF?
Extending JPF: Listeners, Bytecode Factories, Model classes
Getting started: Download, Install and Run (in Eclipse)
Google Summer of Code

What is JPF? surprisingly hard to summarize - can be used for many things extensible virtual machine framework for Java bytecode verification: workbench to efficiently implement all kinds of verification tools typical use cases: software model checking (deadlock & race detection) deep inspection (numeric analysis, invalid access) test case generation (symbolic execution) ... and many more

History of JPF
not a new project: around for 10 years and continuously developed
1997: project started as front end for Spin model checker
1999: reimplementation as concrete virtual machine for software model checking (concurrency defects)
2003: introduction of extension interfaces
2005: open sourced on Sourceforge
2008: participation in Google Summer of Code
2009: moved to own server, hosting extension projects and Wiki

No Free Lunch
you need to learn: JPF is not a lightweight tool
flexibility has its price: configuration can be intimidating
might require extension for your SUT (properties, libraries)
you will encounter unimplemented/missing parts (e.g. UnsatisfiedLinkError)
    usually easy to implement
    exception: state-relevant native libraries (java.io, java.net), which can be either modeled or stubbed
you need suitable test drivers

Key Points JPF is research platform and production tool (basis) JPF is designed for extensibility JPF is open source JPF is an ongoing collaborative development project JPF cannot find all bugs JPF is moderately sized system (~200ksloc core + extensions) JPF represents >20 man year development effort JPF is pure Java application (platform independent)

JPF and the Host JVM verified Java program is executed by JPF, which is a virtual machine implemented in Java, i.e. runs on top of a host JVM ⇒ easy to get confused about who executes what

JPF Top-level Structure two major constructs: Search and JVM JVM produces program states Search is the JVM driver

Search Policies state explosion mitigation: search the interesting state space part first (“get to the bug early, before running out of memory”) Search instances encapsulate (configurable) search policies

Exploring Choices model checker needs choices to explore state space there are many potential types of choices (scheduling, data, ..) choice types should not be hardwired in model checker

Choice Generators transitions begin with a choice and extend until the next ChoiceGenerator (CG) is set (by instruction, native peer or listener) advance positions the CG on the next unprocessed choice (if any) backtrack goes up to the next CG with unprocessed choices Choice Generators are configurable as well, i.e. create your own
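The advance/backtrack behaviour described above can be sketched outside JPF as a depth-first search over a stack of choice generators. This is only an illustration under stated assumptions: `BinaryChoice`, `hasMoreChoices`, `advance` and `explore` are hypothetical names, not JPF's ChoiceGenerator API, and every choice point here is a plain 0/1 choice.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration of advance/backtrack over a stack of choice generators.
class ChoiceSearch {
    // Stand-in for a choice generator: two choices (0 and 1) plus a cursor.
    static final class BinaryChoice {
        private int next = 0;
        boolean hasMoreChoices() { return next < 2; }
        int advance() { return next++; }
    }

    // Enumerates every sequence of `depth` binary choices, depth-first.
    static List<String> explore(int depth) {
        List<String> paths = new ArrayList<>();
        if (depth < 1) return paths;
        Deque<BinaryChoice> stack = new ArrayDeque<>();
        StringBuilder trace = new StringBuilder();   // current choice per generator on the stack
        stack.push(new BinaryChoice());
        while (!stack.isEmpty()) {
            BinaryChoice cg = stack.peek();
            // Returning to a generator that already made a choice: undo that choice first.
            if (trace.length() == stack.size()) trace.setLength(trace.length() - 1);
            if (!cg.hasMoreChoices()) {              // backtrack to the previous generator
                stack.pop();
                continue;
            }
            trace.append(cg.advance());              // advance to the next unprocessed choice
            if (stack.size() == depth) paths.add(trace.toString());
            else stack.push(new BinaryChoice());     // the transition ends at a new choice point
        }
        return paths;
    }

    public static void main(String[] args) {
        System.out.println(explore(2));
    }
}
```

In JPF the search policy decides which generator to advance next; the fixed depth-first order here is just the simplest policy to write down.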

Listeners, the JPF Plugins

Example Listener: Checking NonNull Annotation on Return
    public class NonnullChecker extends ListenerAdapter {
      ...
      public void executeInstruction (JVM vm) {
        Instruction insn = vm.getLastInstruction();
        ThreadInfo ti = vm.getLastThreadInfo();
        if (insn instanceof ARETURN) { // check @NonNull method returns
          ARETURN areturn = (ARETURN)insn;
          MethodInfo mi = insn.getMethodInfo();
          if (areturn.getReturnValue(ti) == null) {
            if (mi.getAnnotation("java.annotation.Nonnull") != null) {
              Instruction nextPC = ti.createAndThrowException(
                  "java.lang.AssertionError",
                  "null return from @Nonnull method: " + mi.getCompleteName());
              ti.setNextPC(nextPC);
              return;
            }
          }
        }
      }
    }

Bytecode Instruction Factories

Example: Bytecode Factory
provide alternative Instruction classes for relevant bytecodes
create & configure InstructionFactory that instantiates them
Source:
    void notSoObvious(int x){
      int a = x*50;
      int b = 19437583;
      int c = a;
      for (int k=0; k<100; k++){
        c += b;
        System.out.println(c);
      }
    }
    ...
    notSoObvious(21474836);
Bytecode produced by the compiler:
    ... [20] iinc [21] goto 10 [10] iload_4 [11] bipush [12] if_icmpge 22 [13] iload_3 [14] iload_2 [15] iadd
JPF configuration:
    vm.insn_factory.class = .numeric.NumericInstructionFactory
Class loading, then code execution (by JPF):
    class IADD extends Instruction {
      Instruction execute (.., ThreadInfo ti) {
        int v1 = ti.pop();
        int v2 = ti.pop();
        int res = v1 + v2;
        if ((v1>0 && v2>0 && res<=0) …throw ArithmeticException..

Now back to SPF

JPF Symbolic Execution
JPF-SE: original approach based on program transformation (2003-2007)
SPF (Symbolic JPF): based on non-standard bytecode interpretation (2008-…)
The rest of the presentation focuses on SPF

Symbolic JPF
JPF search engine used
    to generate and explore the symbolic execution tree
    also to analyze thread interleavings and other forms of non-determinism that might be present in the code
No state matching performed: in general, undecidable
To limit the (possibly) infinite symbolic search state space resulting from loops, we put a limit on
    the model checker's search depth, or
    the number of constraints in the path condition
Off-the-shelf decision procedures/constraint solvers used to check path conditions
    model checker backtracks if the path condition becomes infeasible
Generic interface for multiple decision procedures
    Choco (for linear/non-linear integer/real constraints, mixed constraints), http://sourceforge.net/projects/choco/
    IASolver (for interval arithmetic), http://www.cs.brandeis.edu/~tim/Applets/IAsolver.html
    Say we use omega library

Implementation
Key mechanisms:
    JPF's bytecode instruction factory: replace or extend standard concrete execution semantics of byte-codes with non-standard symbolic execution
    Attributes associated w/ program state (stack operands, fields, local variables): store symbolic information, propagated as needed during symbolic execution
Other mechanisms:
    Choice generators: for handling branching conditions during symbolic execution
    Listeners: for printing results of symbolic analysis (method summaries), and for enabling dynamic change of execution semantics (from concrete to symbolic)
    Native peers: for modeling native libraries, e.g. capture Math library calls and send them to the constraint solver
[Figure: JPF Structure: Instruction Factory]

An Instruction Factory for Symbolic Execution of Byte-codes We created SymbolicInstructionFactory Contains instructions for the symbolic interpretation of byte-codes New Instruction classes derived from JPF’s core Conditionally add new functionality; otherwise delegate to super-classes Approach enables simultaneous concrete/symbolic execution JPF core: Implements concrete execution semantics based on stack machine model For each method that is executed, maintains a set of Instruction objects created from the method byte-codes Uses abstract factory design pattern to instantiate Instruction objects

Attributes for Storing Symbolic Information Used previous experimental JPF extension of slot attributes Additional, state-stored info associated with locals & operands on stack frame Generalized this mechanism to include field attributes Attributes are used to store symbolic values and expressions created during symbolic execution Attribute manipulation done mainly inside JPF core We only needed to override instruction classes that create/modify symbolic information E.g. numeric, compare-and-branch, type conversion operations Sufficiently general to allow arbitrary value and variable attributes Could be used for implementing other analyses E.g. keep track of physical dimensions and numeric error bounds or perform concolic execution Program state: A call stack/thread: Stack frames/executed methods Stack frame: locals & operands The heap (values of fields) Scheduling information

Handling Branching Conditions Symbolic execution of branching conditions involves: Creation of a non-deterministic choice in JPF’s search Path condition associated with each choice Add condition (or its negation) to the corresponding path condition Check satisfiability (mostly with z3) If un-satisfiable, instruct JPF to backtrack Created new choice generator public class PCChoiceGenerator extends IntIntervalGenerator { PathCondition[] PC; … }

Example: IADD
Concrete execution of IADD byte-code:
    public class IADD extends Instruction { …
      public Instruction execute(… ThreadInfo th){
        int v1 = th.pop();
        int v2 = th.pop();
        th.push(v1+v2,…);
        return getNext(th);
      }
    }
Symbolic execution of IADD byte-code:
    public class IADD extends ….bytecode.IADD { …
      public Instruction execute(… ThreadInfo th){
        Expression sym_v1 = ….getOperandAttr(0);
        Expression sym_v2 = ….getOperandAttr(1);
        if (sym_v1 == null && sym_v2 == null) // both values are concrete
          return super.execute(… th);
        else {
          int v1 = th.pop();
          int v2 = th.pop();
          th.push(0,…); // don't care
          …
          ….setOperandAttr(Expression._plus(sym_v1,sym_v2));
          return getNext(th);
        }
      }
    }
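The attribute idea behind the symbolic IADD can be tried in isolation with a toy operand stack. This is a sketch under stated assumptions, not SPF code: `Slot`, `iadd` and `add` are hypothetical names, and symbolic expressions are kept as plain strings instead of SPF's Expression objects.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration: an operand stack whose slots carry an optional symbolic attribute.
class SymbolicAdd {
    // One stack slot: a concrete value plus a symbolic attribute (null means "concrete only").
    static final class Slot {
        final int value;
        final String sym;
        Slot(int value, String sym) { this.value = value; this.sym = sym; }
        String show() { return sym != null ? sym : Integer.toString(value); }
    }

    // Adds the two top slots; falls back to concrete addition when no attribute is present.
    static void iadd(Deque<Slot> stack) {
        Slot v1 = stack.pop();   // second operand (top of stack)
        Slot v2 = stack.pop();   // first operand
        if (v1.sym == null && v2.sym == null) {
            stack.push(new Slot(v2.value + v1.value, null));
            return;
        }
        // The concrete value is a "don't care"; the attribute carries the symbolic sum.
        stack.push(new Slot(0, "(" + v2.show() + " + " + v1.show() + ")"));
    }

    // Convenience wrapper: pushes both operands, executes iadd, returns the result slot.
    static Slot add(Slot first, Slot second) {
        Deque<Slot> stack = new ArrayDeque<>();
        stack.push(first);
        stack.push(second);
        iadd(stack);
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(add(new Slot(3, null), new Slot(4, null)).show());
        System.out.println(add(new Slot(0, "X"), new Slot(5, null)).show());
    }
}
```

Because the concrete path is taken whenever both attributes are null, the same instruction class serves concrete and symbolic execution, which is the design choice the slide highlights.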

Example: IFGE
Concrete execution of IFGE byte-code:
    public class IFGE extends Instruction { …
      public Instruction execute(… ThreadInfo th){
        cond = (th.pop() >= 0);
        if (cond) next = getTarget();
        else next = getNext(th);
        return next;
      }
    }
Symbolic execution of IFGE byte-code:
    public class IFGE extends ….bytecode.IFGE { …
      public Instruction execute(… ThreadInfo th){
        Expression sym_v = ….getOperandAttr();
        if (sym_v == null) // the condition is concrete
          return super.execute(… th);
        else {
          PCChoiceGen cg = new PCChoiceGen(2); …
          cond = cg.getNextChoice()==0 ? false : true;
          if (cond) {
            pc._add_GE(sym_v,0);
            next = getTarget();
          } else {
            pc._add_LT(sym_v,0);
            next = getNext(th);
          }
          if (!pc.satisfiable()) … // JPF backtrack
          else cg.setPC(pc);
          return next;
        }
      }
    }

How to Execute a Method Symbolically
JPF run configuration:
    +vm.insn_factory.class=gov.nasa.jpf.symbc.SymbolicInstructionFactory
        Instruct JPF to use symbolic byte-code set
    +jpf.listener=gov.nasa.jpf.symbc.SymbolicListener
        Print PCs and method summaries
    +vm.peer_packages=gov.nasa.jpf.symbc:gov.nasa.jpf.jvm
        Use symbolic peer package for Math library
    +symbolic.dp=iasolver
        Use IASolver as a decision procedure
    +symbolic.method=UnitUnderTest(sym#sym#con)
        Method to be executed symbolically (3rd parameter left concrete)
    Main
        Main application class containing method under test
Symbolic input globals (fields) and method pre-conditions can be specified via user annotations

“Any Time” Symbolic Execution Can start at any point in the program Can use mixed symbolic and concrete inputs No special test driver needed – sufficient to have an executable program that uses the method/code under test Any time symbolic execution Use specialized listener to monitor concrete execution and trigger symbolic execution based on certain conditions Unit level analysis in realistic contexts Use concrete system-level execution to set-up environment for unit-level symbolic analysis Applications: Exercise deep system executions Extend/modify existing tests: e.g. test sequence generation for Java containers

Current State of SPF Downloadable as jpf-symbc from JPF website Recent Publication is the main reference for SPF “Symbolic PathFinder: Integrating Symbolic Execution with Model Checking for Java Bytecode Analysis” in Automated Software Engineering Journal 20(3) 2013

Let's play some more with SPF

Classic vs Dynamic Symbolic Execution Terminology Classic Symbolic Execution == Symbolic Execution Dynamic Symbolic Execution == Concolic Classic Everything is symbolic Need a special environment to run Dynamic Concrete and symbolic Execute the code for real and keep track of the symbolic world on the side

Example: Classic vs Dynamic SE
    int max(int a, int b) {
      if (a > b) return a;
      else return b;
    }
    void test(int x, int y) {
      int z = max(x,y);
      if (z == x) { // L1
      } else { // L2
      }
    }
Classic:
    [true] test(X,Y) {x = X, y = Y}
    [true] max(X,Y) {a = X, b = Y}
      [X > Y] ret X
        [X > Y] z = max(X,Y) {z = X}
          [X > Y & X == X] L1
          [X != X] (infeasible)
      [X <= Y] ret Y
        [X <= Y] z = max(X,Y) {z = Y}
          [X <= Y & Y == X] L1
          [X <= Y & Y != X] L2
Dynamic:
    Solve and run: [TRUE] ? test(0,1)
    Collect path condition: [X <= Y & Y != X] L2
    Negate, solve and run: [X <= Y & Y == X] ? test(0,0)
    Collect path condition: [X <= Y & Y == X] L1; [X <= Y & Y != X] done before
    Negate, solve and run: [X > Y] ? test(1,0)
    Collect path condition: [X > Y & X == X] L1
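The run / collect / negate / solve loop of the dynamic variant can be sketched in a few dozen lines. Everything here is an assumption made for illustration: `ConcolicMax`, `Constraint`, `run`, `solve` and `explore` are hypothetical names, the "instrumentation" is written by hand for this one program, and the "solver" is a brute-force search over inputs in -2..2 rather than a real decision procedure. The concrete inputs it picks therefore differ from the ones on the slide, but it finds the same three paths.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of a concolic exploration loop for test(x, y) with max(a, b).
class ConcolicMax {
    interface Pred { boolean test(int x, int y); }

    // A branch constraint over the symbolic inputs X and Y, with the text of its negation.
    static final class Constraint {
        final String text, negText;
        final Pred holds;
        Constraint(String text, String negText, Pred holds) {
            this.text = text; this.negText = negText; this.holds = holds;
        }
        Constraint negate() { return new Constraint(negText, text, (x, y) -> !holds.test(x, y)); }
    }

    // Runs test(x, y) concretely and records the path condition on the side.
    static List<Constraint> run(int x, int y) {
        List<Constraint> pc = new ArrayList<>();
        boolean zIsX = x > y;                               // branch inside max(a, b)
        Constraint first = new Constraint("X > Y", "X <= Y", (a, b) -> a > b);
        pc.add(zIsX ? first : first.negate());
        int z = zIsX ? x : y;
        // z is symbolically X or Y, so "z == x" reads "X == X" or "Y == X".
        Constraint second = zIsX
                ? new Constraint("X == X", "X != X", (a, b) -> true)
                : new Constraint("Y == X", "Y != X", (a, b) -> a == b);
        pc.add(z == x ? second : second.negate());
        return pc;
    }

    static String describe(List<Constraint> pc) {
        StringBuilder sb = new StringBuilder();
        for (Constraint c : pc) {
            if (sb.length() > 0) sb.append(" & ");
            sb.append(c.text);
        }
        return sb.toString();
    }

    // Brute-force stand-in for a constraint solver: first (x, y) in -2..2 satisfying all constraints.
    static int[] solve(List<Constraint> constraints) {
        for (int x = -2; x <= 2; x++)
            for (int y = -2; y <= 2; y++) {
                boolean ok = true;
                for (Constraint c : constraints) ok &= c.holds.test(x, y);
                if (ok) return new int[] {x, y};
            }
        return null;                                        // infeasible within the small domain
    }

    // Explores all paths reachable from the initial input by negating one constraint at a time.
    static List<String> explore(int x0, int y0) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<int[]> work = new ArrayDeque<>();
        work.add(new int[] {x0, y0});
        while (!work.isEmpty()) {
            int[] in = work.poll();
            List<Constraint> pc = run(in[0], in[1]);
            if (!seen.add(describe(pc))) continue;          // path done before
            visited.add("test(" + in[0] + "," + in[1] + "): " + describe(pc));
            for (int i = pc.size() - 1; i >= 0; i--) {      // negate the last constraint first
                List<Constraint> target = new ArrayList<>(pc.subList(0, i));
                target.add(pc.get(i).negate());
                int[] next = solve(target);
                if (next != null) work.add(next);
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        explore(0, 1).forEach(System.out::println);
    }
}
```

Negating X == X yields X != X, which the solver reports as infeasible, so no fourth run is attempted.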

Example: Why bother with Dynamic?
    native int max(int a, int b) {}
    void test(int x, int y) {
      int z = max(x,y);
      if (z == x) { // L1
      } else { // L2
      }
    }
Classic:
    [true] test(X,Y) {x = X, y = Y}
    [true] max(X,Y) {a = X, b = Y}
    ?????
    SPF has a "Symcrete" mode where we concretize on-the-fly; KLEE has a similar feature
Dynamic:
    Solve and run: [TRUE] ? test(0,1)
    Collect path condition: [Y != X] L2
    Negate, solve and run: [Y == X] ? test(0,0)
    Collect path condition: [Y == X] L1; [Y != X] done before

Approximations
    public class DART {
      public static void test(int x, int y) {
        if (x*x*x > 0) {
          if (x > 0 && y == 10) abort(); // A
        } else {
          if (x > 0 && y == 20) abort(); // B
        }
      }
      public static void main(String[] args) {
        test(2, 9);
      }
    }
PC1: x*x*x > 0 && x > 0 && y != 10
Instead of using x = 2 everywhere, we replace x with 2 until the formula becomes linear
Negated: 4x > 0 && x > 0 && y == 10
Thus it can handle:
    public class DART {
      public static void test(int x, int y) {
        if (x*x*x > 0) { fine(); } else { abort(); }
      }
    }

Models: it makes (D)SE tick
    int max(int a, int b) {
      if (a > b) return a;
      else return b;
    }
    void test(int x, int y) {
      int z = max(x,y);
      if (z == x) { // L1
      } else { // L2
      }
    }
Model/Summary for max(a,b): ((a > b) & (RET == a)) || ((a<=b) & (RET == b))
Solve and run: [TRUE] ? test(0,1)
PC collected: [((X > Y) & (Z == X)) || ((X<=Y) & (Z == Y)) & Z != X] L2, which simplifies to [X<=Y & Y != X]
Negated, solve and run: [X<=Y & Y == X] ? test(0,0)
PC collected: [((X > Y) & (Z == X)) || ((X<=Y) & (Z == Y)) & Z == X] L1, which simplifies to [X<=Y & X == Y]
Done after only 2 paths!

Models Implementations
Properties file:
    deepsea.config.dump = true
    deepsea.target = examples.simple.MaxChoice3
    deepsea.args =
    deepsea.triggers = examples.simple.MaxChoice3.compute(X:int,Y:int,Z:int)
    deepsea.delegates = java.lang.Math:za.ac.sun.cs.deepsea.models.Math
    deepsea.produceoutput = false
    deepsea.explorer = za.ac.sun.cs.deepsea.explorer.DepthFirstExplorer
    green.log.level = OFF
Model:
    public boolean max_II_I(Symbolizer symbolizer) {
      SymbolicFrame frame = symbolizer.getTopFrame();
      Expression arg0 = frame.pop();
      Expression arg1 = frame.pop();
      Expression var = new IntVariable(Symbolizer.getNewVariableName(), -1000, 1000);
      Expression pc = new Operation(Operator.OR,
          new Operation(Operator.AND,
              new Operation(Operator.GE, arg0, arg1),
              new Operation(Operator.EQ, arg0, var)),
          new Operation(Operator.AND,
              new Operation(Operator.LT, arg0, arg1),
              new Operation(Operator.EQ, arg1, var)));
      symbolizer.pushExtraConjunct(pc);
      frame.push(var);
      return true;
    }

Approaches How is DSE implemented? Instrument the source code Instrument the bytecode Instrument/Change the machine Screenshot from https://github.com/ksluckow/awesome-symbolic-execution

DEEPSEA High Level View
[Figure: JVM vs DEEPSEA. Test environment: tests and code in one JVM. DEEPSEA environment: DEEPSEA in one JVM, code in a second JVM, connected via JPDA. Annotation: TOO SLOW]
What is DEEPSEA
    Dynamic Symbolic Execution
    Uses Green for constraint solving: multiple back-end solvers, caching of constraint results
    No instrumentation: uses Java Platform Debugger Interface
    Parallelization for performance
    Summaries to reduce path explosion
Potential Benefits
    Better behavioral coverage: symbolic rather than concrete inputs
    High fidelity: code under test is untouched
    Easier to integrate with existing systems: code runs the same, only the debugger interface is attached
    Works for any version of Java: supports all bytecodes

COASTAL Overview
[Figure: main(...) calls the SUT; an InstrumentationClassLoader with an Instrumenter loads the SUT, Math .class and System classes; the SymbolicState holds frames, instanceData and the path condition A ∧ B ∧ ...; a Strategy (DepthFirstStrategy, BreadthFirstStrategy, RandomStrategy) with a PathTree turns the path condition into new values]

COASTAL Overview COASTAL repeatedly calls SUT.main() or some other routine A special class loader instruments class of interest Instrumented classes update the symbolic state as they execute frames store the expression stack instanceData store global and local data (including the heap) At the end of the run, the collected path condition is passed to a “strategy” to produce new values for variables strategy records visited paths generates values that will explore new behavior (potentially) uses an SMT solver to produce new values

COASTAL Instrumentation
    int choose(int x) {
      if (x < 10) { return -1; } else { return 1; }
    }
Actual instructions (almost):
    0: iload_0
    1: bipush 10
    3: if_icmpge 8
    6: iconst_m1
    7: ireturn
    8: iconst_1
    9: ireturn
Instrumented:
    SymbolicVM.startMethod()
    x = SymbolicVM.getValue("x")
    SymbolicVM.insn(ILOAD, 0)
    0: iload_0
    LDC 12 // instruction nr
    LDC 16 // BIPUSH
    LDC 10 // 10
    INVOKESTATIC SymbolicVM.insn (III)V
    1: bipush 10
    SymbolicVM.jump(IF_ICMPGE, 8)
    3: if_icmpge 8
    SymbolicVM.postJump()
    SymbolicVM.insn(ICONST, -1)
    6: iconst_m1
    SymbolicVM.insn(IRETURN)
    7: ireturn
    SymbolicVM.insn(ICONST, 1)
    8: iconst_1
    9: ireturn

COASTAL Overview
"Regular" instructions simply update the symbolic state
If configured, library calls are mirrored by routines that add extra constraints
Generic handling of library calls: a call with no model is handled by introducing an additional symbolic variable constrained to be equal to the return value
[Figure: main(...) and the SUT .class (BIPUSH 10, INVOKESTATIC Math.max, INVOKESTATIC Math.min) next to Math .class and System .class; the model for Math provides max__II__I(); the SymbolicState holds frames, instanceData and the path condition A ∧ B ∧ ...]

Model for Math.max
    public boolean max__II__I() {
      Expression arg0 = SymbolicVM.pop();
      Expression arg1 = SymbolicVM.pop();
      Expression var = new IntVariable(SymbolicVM.getNewVariableName(), min, max);
      Expression pc = new Operation(Operator.OR,
          new Operation(Operator.AND,
              new Operation(Operator.GE, arg0, arg1),
              new Operation(Operator.EQ, arg0, var)),
          new Operation(Operator.AND,
              new Operation(Operator.LT, arg0, arg1),
              new Operation(Operator.EQ, arg1, var)));
      SymbolicVM.pushExtraConjunct(pc);
      SymbolicVM.push(var);
      return true;
    }
Resulting constraint: ((a >= b) & (RET == a)) || ((a < b) & (RET == b))
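A quick sanity check of the summary formula, written as a sketch: over a small range, the constraint ((a >= b) & (RET == a)) || ((a < b) & (RET == b)) should hold exactly when RET equals Math.max(a, b). The class `MaxSummary` and its methods are hypothetical names used only for this check.

```java
// Hypothetical illustration: validate the max summary against Math.max by enumeration.
class MaxSummary {
    // The summary constraint from the model, evaluated on concrete values.
    static boolean summary(int a, int b, int ret) {
        return ((a >= b) && (ret == a)) || ((a < b) && (ret == b));
    }

    // True if the summary accepts exactly the return value of Math.max on [-bound, bound].
    static boolean agreesWithMathMax(int bound) {
        for (int a = -bound; a <= bound; a++)
            for (int b = -bound; b <= bound; b++)
                for (int ret = -bound; ret <= bound; ret++)
                    if (summary(a, b, ret) != (ret == Math.max(a, b))) return false;
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreesWithMathMax(10));
    }
}
```

A model that accepts too many return values makes the analysis report infeasible paths, and one that accepts too few hides real ones, so a cheap check like this is worth running on any hand-written summary.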

COASTAL Instrumentation, run(2,1), step 1
    void run(int x, int y) {
      int z = Math.max(x, y);
      int q = Math.min(z, 5);
      if (q == x) { A } else { B }
    }
Instrumented instructions:
    SymbolicVM.startMethod()
    SymbolicVM.insn(ILOAD, 0)
    iload 0
    SymbolicVM.insn(ILOAD, 1)
    iload 1
    SymbolicVM.method(INVOKESTATIC, "Math.max")
    invokestatic Math.max (II)I
    SymbolicVM.returnValue()
    SymbolicVM.insn(ISTORE, 2)
    ...
Path condition: (($0==X) ∧ (X>=Y)) ∨ (($0==Y) ∧ (Y>X))
[Figure: symbolic state showing frames, instanceData and path condition, with X, Y and $0]

COASTAL Instrumentation, run(2,1), step 2
    void run(int x, int y) {
      int z = Math.max(x, y);
      int q = Math.min(z, 5);
      if (q == x) { A } else { B }
    }
Instrumented instructions:
    SymbolicVM.returnValue()
    SymbolicVM.insn(ISTORE, 2)
    istore 2
    SymbolicVM.insn(ILOAD, 2)
    iload 2
    SymbolicVM.insn(ICONST, 5)
    iconst_5
    SymbolicVM.method("Math.min")
    invokestatic Math.min (II)I
    ...
Path condition: (($0==X) ∧ (X>=Y)) ∨ (($0==Y) ∧ (Y>X))
[Figure: symbolic state showing frames, instanceData and path condition, with X, Y, $0 and 5]

COASTAL Instrumentation, run(2,1), step 3
    void run(int x, int y) {
      int z = Math.max(x, y);
      int q = Math.min(z, 5);
      if (q == x) { A } else { B }
    }
Instrumented instructions:
    iconst_5
    SymbolicVM.method("Math.min")
    invokestatic Math.min (II)I
    SymbolicVM.returnValue()
    SymbolicVM.insn(ISTORE, 3)
    istore 3
    SymbolicVM.insn(ILOAD, 3)
    iload 3
    SymbolicVM.insn(ILOAD, 0)
    ...
Path condition: (($0==X) ∧ (X>=Y)) ∨ (($0==Y) ∧ (Y>X)), ($1==2)
[Figure: symbolic state showing frames, instanceData and path condition, with X, Y, $0 and $1]

COASTAL Instrumentation, run(2,1), step 4
    void run(int x, int y) {
      int z = Math.max(x, y);
      int q = Math.min(z, 5);
      if (q == x) { A } else { B }
    }
Instrumented instructions:
    iload 3
    SymbolicVM.insn(ILOAD, 0)
    iload 0
    SymbolicVM.jump(IF_ICMPNE, Label5)
    if_icmpne Label5
    SymbolicVM.postJump()
    A instructions
    SymbolicVM.jump(GOTO, Label8)
    Label5: ... and so on...
Path condition: (($0==X) ∧ (X>=Y)) ∨ (($0==Y) ∧ (Y>X)), ($1==2), ($1==X)
[Figure: symbolic state showing frames, instanceData and path condition, with X, Y, $0 and $1]

DSE gets “Lucky” public void test (boolean b, int x, int y) { if (b) if(y <= 0) { ... } else if(x <= 0 && identity(y) == 1) { HERE! } }

DSE gets "Lucky"
    public void test (boolean b, int x, int y) {
      if (b)
        if(y <= 0) { ... }
      else if(x <= 0 && identity(y) == 1) { HERE! }
    }
Run b=true,x=0,y=0 collects b=true,y<=0
Negating last constraint y <= 0 gives b=true,x=0,y=1
Now b branches done, so negate b: b=false,x=0,y=1

Some interesting new stuff Using Green Making analysis more efficient Probabilistic Symbolic Execution Adding model counting to Symbolic Execution Probabilistic Programming Using PSE for writing some cool programs

One-slide Green Optimized SAT and Model Counting Many features, but mostly for LIA Factorization splits constraints into independent parts Canonization puts a constraint into a canonical form Reuse results calculated in one run and across runs

Let's play with Green a little

Probabilistic Symbolic Execution Symbolic Execution asks SAT/UNSAT questions about constraints When it is SAT, one can also ask, how many solutions are there? #SAT Using model counters we can calculate the number of solutions One can do many interesting things with these numbers…

Symbolic Execution
    void test(int x, int y) {
      if (y == x*10) S0; else S1;
      if (x > 3 && y > 10) S2;
      S3;
    }
Tree:
    [ true ] test (X,Y)
      [ Y=X*10 ] S0
        [ X>3 & 10<Y=X*10 ] S2
        [ Y=X*10 & !(X>3 & Y>10) ] S3
      [ Y!=X*10 ] S1
        [ X>3 & 10<Y!=X*10 ] S2
        [ Y!=X*10 & !(X>3 & Y>10) ] S3

Symbolic Execution
    void test(int x, int y) {
      if (y == x*10) S0; else S1;
      if (x > 3 && y > 10) S2;
      S3;
    }
Tree:
    [ true ] test (X,Y)
      [ Y=X*10 ] S0
        [ X>3 & 10<Y=X*10 ] S2
        [ Y=X*10 & !(X>3 & Y>10) ] S3
      [ Y!=X*10 ] S1
        [ X>3 & 10<Y!=X*10 ] S2
        [ Y!=X*10 & !(X>3 & Y>10) ] S3
Tests:
    Test(1,10) reaches S0,S3
    Test(0,1) reaches S1,S3
    Test(4,11) reaches S1,S2

Probabilistic SE
    void test(int x, int y: 0..99) {
      if (y == x*10) S0; else S1;
      if (x > 3 && y > 10) S2;
      S3;
    }
Solution counts:
    [ true ]: 10^4
      y=10x: [ Y=X*10 ] 10
        x>3 & y>10: [ X>3 & 10<Y=X*10 ] 6
        otherwise: [ Y=X*10 & !(X>3 & Y>10) ] 4
      y!=10x: [ Y!=X*10 ] 9990
        x>3 & y>10: [ X>3 & 10<Y!=X*10 ] 8538
        otherwise: [ Y!=X*10 & !(X>3 & Y>10) ] 1452

LattE Model Counter http://www.math.ucdavis.edu/~latte/
Count solutions for conjunction of Linear Inequalities

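Probabilities
    [ true ]: 1
      y=10x: 0.001
        x>3 & y>10: 0.6, so [ X>3 & 10<Y=X*10 ] has probability 0.0006
        otherwise: 0.4, so [ Y=X*10 & !(X>3 & Y>10) ] has probability 0.0004
      y!=10x: 0.999
        x>3 & y>10: 0.855, so [ X>3 & 10<Y!=X*10 ] has probability 0.8538
        otherwise: 0.145, so [ Y!=X*10 & !(X>3 & Y>10) ] has probability 0.1452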
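The counts behind these probabilities can be reproduced by exhaustive enumeration, since x and y each range over only 0..99. This sketch stands in for a model counter such as LattE, which counts without enumerating; `PathCounts` and `count` are hypothetical names introduced for this illustration.

```java
// Hypothetical illustration: brute-force model counting for the four path conditions.
class PathCounts {
    // Returns counts in the order:
    // {Y=10X & inner, Y=10X & !inner, Y!=10X & inner, Y!=10X & !inner},
    // where inner is (X>3 & Y>10).
    static long[] count() {
        long[] counts = new long[4];
        for (int x = 0; x <= 99; x++)
            for (int y = 0; y <= 99; y++) {
                boolean eq = (y == x * 10);
                boolean inner = (x > 3 && y > 10);
                int index = (eq ? 0 : 2) + (inner ? 0 : 1);
                counts[index]++;
            }
        return counts;
    }

    public static void main(String[] args) {
        String[] labels = {
            "[ X>3 & 10<Y=X*10 ]", "[ Y=X*10 & !(X>3 & Y>10) ]",
            "[ X>3 & 10<Y!=X*10 ]", "[ Y!=X*10 & !(X>3 & Y>10) ]"
        };
        long[] counts = count();
        for (int i = 0; i < counts.length; i++)
            System.out.println(labels[i] + " count=" + counts[i]
                    + " probability=" + counts[i] / 10000.0);
    }
}
```

Dividing each count by the domain size 10^4 gives the path probability, assuming inputs are uniformly distributed over the domain.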

Let's play with PSE

Probabilistic Programming
THE HOT NEW THING!
Combine general purpose programming with probability distributions to answer interesting questions.
(Easily) encode Bayesian Networks, Hidden Markov Models, etc. as a (Java) program with a few special keywords
Our angle on this is an analytical inference algorithm, rather than the typical sampling approach (but ours only works in some cases)

Classic Example
    public static void FOSE1() {
      boolean c1 = flip(0.5);
      boolean c2 = flip(0.5);
      if (c1) probability(1);
      if (c2) probability(2);
    }
    Winning at 1 -> 0.5
    Winning at 2 -> 0.5

    public static void FOSE2() {
      boolean c1 = flip(0.5);
      boolean c2 = flip(0.5);
      observe(c1 || c2);
      if (c1) probability(1);
      if (c2) probability(2);
    }
    Winning at 1 -> 0.66667
    Winning at 2 -> 0.66667
observe conditions the input: probability(c1 | (c1 || c2)) and probability(c2 | (c1 || c2))
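Both results can be checked by enumerating the four outcomes of the two flips and renormalizing over the outcomes that pass the observation. This is a plain-Java sketch of that arithmetic, not the probabilistic programming tool itself; `FlipObserve` and `infer` are hypothetical names.

```java
// Hypothetical illustration: exact inference for two fair coin flips by enumeration.
class FlipObserve {
    // Returns {P(c1), P(c2)}, conditioned on (c1 || c2) when observe is true.
    static double[] infer(boolean observe) {
        boolean[] outcomes = {false, true};
        double total = 0, p1 = 0, p2 = 0;
        for (boolean c1 : outcomes)
            for (boolean c2 : outcomes) {
                double weight = 0.5 * 0.5;                 // two independent flip(0.5) calls
                if (observe && !(c1 || c2)) continue;      // outcome rejected by observe
                total += weight;
                if (c1) p1 += weight;
                if (c2) p2 += weight;
            }
        return new double[] {p1 / total, p2 / total};      // renormalize over accepted outcomes
    }

    public static void main(String[] args) {
        double[] plain = infer(false);
        double[] conditioned = infer(true);
        System.out.printf("FOSE1: %.5f %.5f%n", plain[0], plain[1]);
        System.out.printf("FOSE2: %.5f %.5f%n", conditioned[0], conditioned[1]);
    }
}
```

With the observation, the accepted mass is 0.75 and c1 holds in 0.5 of it, which gives 0.5 / 0.75 = 0.66667.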

Encode DTMC in Java
    public static void PRISMDiceExample() {
      int s = 0;
      int d = 0; // dice value
      while (true) {
        if (s == 0) { s = flip(0.5) ? 1 : 2; }
        else if (s == 1) { s = flip(0.5) ? 3 : 4; }
        else if (s == 2) { s = flip(0.5) ? 5 : 6; }
        else if (s == 3) { if (flip(0.5)) { s = 1; } else { s = 7; d = 1; } }
        else if (s == 4) { s = 7; d = flip(0.5) ? 2 : 3; }
        else if (s == 5) { s = 7; d = flip(0.5) ? 4 : 5; }
        else if (s == 6) { if (flip(0.5)) { s = 2; } else { s = 7; d = 6; } }
        else { /* s = 7 */ break; }
      }
      probability(d); // probability of seeing each value for d
    }
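The distribution of d can be approximated to machine precision by pushing probability mass through the chain step by step. This sketch does that in plain Java, without flip or probability; `DiceChain` and `distribution` are hypothetical names, and the transition table is transcribed from the program above.

```java
// Hypothetical illustration: propagate probability mass through the dice DTMC.
class DiceChain {
    // Returns d[1..6], the probability mass absorbed into each dice value after `steps` steps.
    static double[] distribution(int steps) {
        double[] mass = new double[7];    // mass[s] for the transient states s = 0..6
        double[] d = new double[7];       // d[v] for dice values v = 1..6 (d[0] unused)
        mass[0] = 1.0;
        for (int i = 0; i < steps; i++) {
            double[] next = new double[7];
            next[1] += 0.5 * mass[0];  next[2] += 0.5 * mass[0];   // s == 0
            next[3] += 0.5 * mass[1];  next[4] += 0.5 * mass[1];   // s == 1
            next[5] += 0.5 * mass[2];  next[6] += 0.5 * mass[2];   // s == 2
            next[1] += 0.5 * mass[3];  d[1] += 0.5 * mass[3];      // s == 3
            d[2] += 0.5 * mass[4];     d[3] += 0.5 * mass[4];      // s == 4
            d[4] += 0.5 * mass[5];     d[5] += 0.5 * mass[5];      // s == 5
            next[2] += 0.5 * mass[6];  d[6] += 0.5 * mass[6];      // s == 6
            mass = next;
        }
        return d;
    }

    public static void main(String[] args) {
        double[] d = distribution(200);
        for (int v = 1; v <= 6; v++) System.out.printf("d = %d -> %.6f%n", v, d[v]);
    }
}
```

Each value converges to 1/6, so the chain simulates a fair die using only fair coin flips.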