Symbolic Execution and Software Testing Corina Pasareanu Carnegie Mellon/NASA Ames

Slides:



Advertisements
Similar presentations
Cristian Cadar, Peter Boonstoppel, Dawson Engler RWset: Attacking Path Explosion in Constraint-Based Test Generation TACAS 2008, Budapest, Hungary ETAPS.
Advertisements

Masahiro Fujita Yoshihisa Kojima University of Tokyo May 2, 2008
Symbolic Execution with Mixed Concrete-Symbolic Solving
Compilation 2011 Static Analysis Johnni Winther Michael I. Schwartzbach Aarhus University.
Abstraction and Modular Reasoning for the Verification of Software Corina Pasareanu NASA Ames Research Center.
Parallel Symbolic Execution for Structural Test Generation Matt Staats Corina Pasareanu ISSTA 2010.
1 1 Regression Verification for Multi-Threaded Programs Sagar Chaki, SEI-Pittsburgh Arie Gurfinkel, SEI-Pittsburgh Ofer Strichman, Technion-Haifa Originally.
Model Counting >= Symbolic Execution Willem Visser Stellenbosch University Joint work with Matt Dwyer (UNL, USA) Jaco Geldenhuys (SU, RSA) Corina Pasareanu.
1 Symbolic Execution for Model Checking and Testing Corina Păsăreanu (Kestrel) Joint work with Sarfraz Khurshid (MIT) and Willem Visser (RIACS)
1/20 Generalized Symbolic Execution for Model Checking and Testing Charngki PSWLAB Generalized Symbolic Execution for Model Checking and Testing.
Symbolic execution © Marcelo d’Amorim 2010.
Background for “KISS: Keep It Simple and Sequential” cs264 Ras Bodik spring 2005.
Rigorous Software Development CSCI-GA Instructor: Thomas Wies Spring 2012 Lecture 13.
A survey of techniques for precise program slicing Komondoor V. Raghavan Indian Institute of Science, Bangalore.
Ongoing projects in the Program Analysis Group Marcelo d’Amorim Informatics Center, Federal University of Pernambuco (UFPE) Belo Horizonte, MG-Brazil,
Hybrid Concolic Testing Rupak Majumdar Koushik Sen UC Los Angeles UC Berkeley.
CMSC 345, Version 11/07 SD Vick from S. Mitchell Software Testing.
Constraint Logic Programming Ryan Kinworthy. Overview Introduction Logic Programming LP as a constraint programming language Constraint Logic Programming.
1 Today Another approach to “coverage” Cover “everything” – within a well-defined, feasible limit Bounded Exhaustive Testing.
1 ES 314 Advanced Programming Lec 2 Sept 3 Goals: Complete the discussion of problem Review of C++ Object-oriented design Arrays and pointers.
Recursion Chapter 7. Chapter 7: Recursion2 Chapter Objectives To understand how to think recursively To learn how to trace a recursive method To learn.
1 Loop-Extended Symbolic Execution on Binary Programs Pongsin Poosankam ‡* Prateek Saxena * Stephen McCamant * Dawn Song * ‡ Carnegie Mellon University.
1 Formal Engineering of Reliable Software LASER 2004 school Tutorial, Lecture1 Natasha Sharygina Carnegie Mellon University.
272: Software Engineering Fall 2012 Instructor: Tevfik Bultan Lecture 4: SMT-based Bounded Model Checking of Concurrent Software.
Symbolic Execution of Java Byte-code Corina Pãsãreanu Perot Systems/NASA Ames Research.
DART: Directed Automated Random Testing Koushik Sen University of Illinois Urbana-Champaign Joint work with Patrice Godefroid and Nils Klarlund.
Tao Xie (North Carolina State University) Nikolai Tillmann, Jonathan de Halleux, Wolfram Schulte (Microsoft Research, Redmond WA, USA)
Symbolic Execution with Mixed Concrete-Symbolic Solving (SymCrete Execution) Jonathan Manos.
CUTE: A Concolic Unit Testing Engine for C Technical Report Koushik SenDarko MarinovGul Agha University of Illinois Urbana-Champaign.
Symbolic (Java) PathFinder – Symbolic Execution of Java Byte-code Corina Pãsãreanu Carnegie Mellon University/NASA Ames Research.
DySy: Dynamic Symbolic Execution for Invariant Inference.
JPF Tutorial – Part 2 Symbolic PathFinder – Symbolic Execution of Java Byte-code Corina Pãsãreanu Carnegie Mellon University/NASA Ames Research.
Introduction. 2COMPSCI Computer Science Fundamentals.
Verification of Java Programs using Symbolic Execution and Loop Invariant Generation Corina Pasareanu (Kestrel Technology LLC) Willem Visser (RIACS/USRA)
Java Pathfinder JPF Tutorial - Test Input Generation With Java Pathfinder.
Dynamic Analysis of Multithreaded Java Programs Dr. Abhik Roychoudhury National University of Singapore.
Ongoing projects in the Program Analysis Group Marcelo d’Amorim Informatics Center, Federal University of Pernambuco (UFPE) Belo Horizonte, MG-Brazil,
Finding Feasible Counter-examples when Model Checking Abstracted Java Programs Corina S. Pasareanu, Matthew B. Dwyer (Kansas State University) and Willem.
Introduction to Problem Solving. Steps in Programming A Very Simplified Picture –Problem Definition & Analysis – High Level Strategy for a solution –Arriving.
Model Counting A Quest for Nails 2 Willem Visser Stellenbosch University Joint work with Matt Dwyer (UNL, USA) Jaco Geldenhuys (SU, RSA) Corina Pasareanu.
Model Checking Java Programs using Structural Heuristics
Symbolic Execution with Abstract Subsumption Checking Saswat Anand College of Computing, Georgia Institute of Technology Corina Păsăreanu QSS, NASA Ames.
Xusheng Xiao North Carolina State University CSC 720 Project Presentation 1.
jFuzz – Java based Whitebox Fuzzing
Learning Symbolic Interfaces of Software Components Zvonimir Rakamarić.
Symbolic Execution and Model Checking for Testing Corina Păsăreanu and Willem Visser Perot Systems/NASA Ames Research Center and SEVEN Networks.
CSV 889: Concurrent Software Verification Subodh Sharma Indian Institute of Technology Delhi Scalable Symbolic Execution: KLEE.
Symbolic and Concolic Execution of Programs Information Security, CS 526 Omar Chowdhury 10/7/2015Information Security, CS 5261.
A Test Case + Mock Class Generator for Coding Against Interfaces Mainul Islam, Christoph Csallner Software Engineering Research Center (SERC) Computer.
CS265: Dynamic Partial Order Reduction Koushik Sen UC Berkeley.
Random Interpretation Sumit Gulwani UC-Berkeley. 1 Program Analysis Applications in all aspects of software development, e.g. Program correctness Compiler.
Using Symbolic PathFinder at NASA Corina Pãsãreanu Carnegie Mellon/NASA Ames.
Static Techniques for V&V. Hierarchy of V&V techniques Static Analysis V&V Dynamic Techniques Model Checking Simulation Symbolic Execution Testing Informal.
CUTE: A Concolic Unit Testing Engine for C Koushik SenDarko MarinovGul Agha University of Illinois Urbana-Champaign.
Concrete Model Checking with Abstract Matching and Refinement Corina Păsăreanu QSS, NASA Ames Research Center Radek Pelánek Masaryk University, Brno, Czech.
( = “unknown yet”) Our novel symbolic execution framework: - extends model checking to programs that have complex inputs with unbounded (very large) data.
Random Test Generation of Unit Tests: Randoop Experience
CSE 331 SOFTWARE DESIGN & IMPLEMENTATION SYMBOLIC TESTING Autumn 2011.
Symstra: A Framework for Generating Object-Oriented Unit Tests using Symbolic Execution Tao Xie, Darko Marinov, Wolfram Schulte, and David Notkin University.
On the Relation Between Simulation-based and SAT-based Diagnosis CMPE 58Q Giray Kömürcü Boğaziçi University.
Willem Visser Stellenbosch University
Symbolic Execution Suman Jana
A Test Case + Mock Class Generator for Coding Against Interfaces
RDE: Replay DEbugging for Diagnosing Production Site Failures
White-Box Testing Using Pex
Over-Approximating Boolean Programs with Unbounded Thread Creation
Automatic Test Generation SymCrete
CUTE: A Concolic Unit Testing Engine for C
CSE 1020:Software Development
Presentation transcript:

Symbolic Execution and Software Testing Corina Pasareanu Carnegie Mellon/NASA Ames

Overview “Classical” symbolic execution and its variants Generalized symbolic execution Dynamic and concolic testing Challenges Multi-threading, complex data structures Complex constraints Handling loops, native libraries Scalability issues – path explosion problem Compositional and parallel techniques Abstraction Applications & Tools

Symbolic Execution King [Comm. ACM 1976], Clarke [IEEE TSE 1976] Analysis of programs with unspecified inputs Execute a program on symbolic inputs Symbolic states represent sets of concrete states For each path, build a path condition Condition on inputs for the execution to follow that path Check path condition satisfiability -- explore only feasible paths Symbolic state Symbolic values/expressions for variables Path condition Program counter

x = 1, y = 0 1 > 0 ? true x = = 1 y = 1 – 0 = 1 x = 1 – 1 = 0 0 > 1 ? false int x, y; if (x > y) { x = x + y; y = x – y; x = x – y; if (x > y) assert false; } Concrete Execution PathCode that swaps 2 integers Example: Standard Execution

[PC:true]x = X,y = Y [PC:true] X > Y ? [PC:X>Y]y = X+Y–Y = X [PC:X>Y]x = X+Y–X = Y [PC:X>Y]Y>X ? int x, y; if (x > y) { x = x + y; y = x – y; x = x – y; if (x > y) assert false; } Code that swaps 2 integers: Symbolic Execution Tree: [PC:X≤Y]END[PC:X>Y]x= X+Y false true [PC:X>Y  Y≤X]END[PC:X>Y  Y>X]END falsetrue path condition False! Solve path conditions → test inputs Example: Symbolic Execution

Questions What about loops? What about overflow? What about multi-threading? What about data structures?

Generalized Symbolic Execution [TACAS’03] Handles dynamically allocated data structures and multi- threading Key elements: Lazy initialization for input data structures Standard model checker (Java PathFinder) for multi-threading Model Checker Analyzes thread inter-leavings Leverages optimizations Symmetry and partial order reductions, abstraction etc. Generates and explores the symbolic execution tree Explores different heap configurations explicitly -- non-determinism handles aliasing

Example Analysis class Node { int elem; Node next; Node swapNode() { if (next != null) if (elem > next.elem) { Node t = next; next = t.next; t.next = this; return t; } return this; } } ? null E0E1 E0 E1 null E0E1? E0E1 E0E1 Input list +ConstraintOutput list E0 > E1 none E0 ≤ E1 none E0 > E1 E1E0? E1E0 E1E0 E1E0 null E0E1 E0 ? null NullPointerException

Lazy Initialization (illustration) E0 next E1 next t null t E0 next E1 next ? E0 next E1 t next E0 next E1 next t E0 next E1 next t consider executing next = t.next; Precondition: acyclic list E0E1 next t null next t E0E1 next ? Heap Configuration

Implementation Symbolic execution of Java programs Code instrumentation Programs instrumented to enable JPF to perform symbolic execution Replace concrete operations with calls to methods that implement symbolic operations General: could use/leverage any model checker Decision procedures used to check satisfiability of path conditions Omega library for integer linear constraints CVCLite, STP (Stanford), Yices (SRI)

program instrumentation counterexample(s)/test suite [heap+constraint+thread scheduling] Implementation via Instrumentation model checking decision procedure instrumented program correctness specification/ coverage criterion continue/ backtrack path condition (data) heap configuration thread scheduling state: original program OmegaLib, CVCLite, STP, Yices

No longer uses code instrumentation Implements a non-standard interpreter of byte-codes Enables JPF to perform symbolic analysis Replaces standard byte-code execution with non-standard symbolic execution During execution checks for assert violations, run-time errors, etc. Symbolic information: Stored in attributes associated with the program data Propagated dynamically during symbolic execution Choice generators and listeners: Non-deterministic choices handle branching conditions Listeners print results: path conditions, test vectors/sequences Native peers model native libraries: Capture Math calls and send them to the constraint solver Generic interface for multiple decision procedures Choco, IASolver, CVC3, Yices, HAMPI, CORAL [NFM11], etc. Symbolic PathFinder

Example: IADD public class IADD extends Instruction { … public Instruction execute(… ThreadInfo th){ int v1 = th.pop(); int v2 = th.pop(); th.push(v1+v2,…); return getNext(th); } public class IADD extends ….bytecode.IADD { … public Instruction execute(… ThreadInfo th){ Expression sym_v1 = ….getOperandAttr(0); Expression sym_v2 = ….getOperandAttr(1); if (sym_v1 == null && sym_v2 == null) // both values are concrete return super.execute(… th); else { int v1 = th.pop(); int v2 = th.pop(); th.push(0,…); // don’t care … ….setOperandAttr(Expression._plus( sym_v1,sym_v2)); return getNext(th); } Concrete execution of IADD byte-code:Symbolic execution of IADD byte-code:

Handling Branching Conditions Involves: Creation of a non-deterministic choice in JPF’s search Path condition associated with each choice Add condition (or its negation) to the corresponding path condition Check satisfiability (with Choco, IASolver, CVC3 etc.) If un-satisfiable, instruct JPF to backtrack Created new choice generator public class PCChoiceGenerator extends IntIntervalGenerator { PathCondition[] PC; … }

Example: IFGE public class IFGE extends Instruction { … public Instruction execute(… ThreadInfo th){ cond = (th.pop() >=0); if (cond) next = getTarget(); else next = getNext(th); return next; } public class IFGE extends ….bytecode.IFGE { … public Instruction execute(… ThreadInfo th){ Expression sym_v = ….getOperandAttr(); if (sym_v == null) // the condition is concrete return super.execute(… th); else { PCChoiceGen cg = new PCChoiceGen(2);… cond = cg.getNextChoice()==0?false:true; if (cond) { pc._add_GE(sym_v,0); next = getTarget(); } else { pc._add_LT(sym_v,0); next = getNext(th); } if (!pc.satisfiable()) … // JPF backtrack else cg.setPC(pc); return next; } } } Concrete execution of IFGE byte-code:Symbolic execution of IFGE byte-code:

Complex mathematical constraints Model-level interpretation of calls to math functions Math.sin $x + 1sin($x + 1) Symbolic expression (un-interpreted function) denoting the result value of the call

CORAL solver [NFM’11] Target applications: Symbolic execution of programs that manipulate floating- point variables Use floating-point arithmetic Call specific math functions (from java.lang.Math) Meta-heuristic solver Distance-based fitness function Particle swarm optimization (PSO) Search simulates movements in a group of animals Used opt4j library (see opt4j.sourceforge.net)opt4j.sourceforge.net Common in software from NASA Coral Solver sqrt(exp(x+z))) 0 ∧ y>1 ∧ z>1 ∧ y<x+2 ∧ w=x+2 {x=4.31, y=6.08, z=9.51, w=6.31}

CORAL solvers Meta-heuristic solver Distance-based fitness function Optimizations Identification of dependent variables Inference of variable domains Efficient evaluation of constraints Particle swarm optimization (PSO) Similar to GA, but with fixed-sized population Search simulates movements in a group of animals Implemented very efficiently (matrix operations) Parameters to calibrate local and global influence Used opt4j library (see opt4j.sourceforge.net)opt4j.sourceforge.net

Java component (Bin Search Tree, UI) add(e) remove(e) find(e) Interface Generated test sequence: BinTree t = new BinTree(); t.add(1); t.add(2); t.remove(1); SymbolicSequenceListener generates JUnit tests JUnit tests can be run directly by the developers Measure coverage (e.g. MC/DC) Support for abstract state matching [ISSTA’04, ISSTA’06] Applications: Test Input and Sequence Generation

Application: Onboard Abort Executive (OAE) Prototype for CEV ascent abort handling being developed by JSC GN&C Inputs Pick Highest Ranked Abort Checks Flight Rules to see if an abort must occur Select Feasible Aborts OAE Structure Results Baseline –Manual testing: time consuming (~1 week) –Guided random testing could not cover all aborts Symbolic PathFinder –Generates tests to cover all aborts and flight rules –Total execution time is < 1 min –Test cases: 151 (some combinations infeasible) –Errors: 1 (flight rules broken but no abort picked) –Found major bug in new version of OAE –Flight Rules: 27 / 27 covered –Aborts: 7 / 7 covered –Size of input data: 27 values per test case Integration with End-to-end Simulation –Input data constrained by physical laws Example: inertial velocity can not be ft/s when the geodetic altitude is 0 ft –Need to encode these constraints explicitly [ISSTA’08]

Generated Test Cases and Constraints Test cases: // Covers Rule: FR A_2_A_2_B_1: Low Pressure Oxidizer Turbo pump speed limit exceeded // Output: Abort:IBB CaseNum 1; CaseLine in.stage_speed=3621.0; CaseTime ; // Covers Rule: FR A_2_A_2_A: Fuel injector pressure limit exceeded // Output: Abort:IBB CaseNum 3; CaseLine in.stage_pres=4301.0; CaseTime ; … Constraints: //Rule: FR A_2_A_1_A: stage1 engine chamber pressure limit exceeded Abort:IA PC (~60 constraints): in.geod_alt(9000) < && in.geod_alt(9000) < && in.geod_alt(9000) < && in.pres_rate(-2) >= -2 && in.pres_rate(-2) >= -15 && in.roll_rate(40) <= 50 && in.yaw_rate(31) <= 41 && in.pitch_rate(70) <= 100 && …

Shown: Shown: Polyglot Framework for model-based analysis and test case-generation; test cases used to test the generated code and to discover discrepancies between models and code. Orion orbits the moon (Image Credit: Lockheed Martin). Polyglot Framework [ISSTA’11] –Analysis for UML, Stateflow and Rhapsody interactive models –Automated test sequence generation –High degree of coverage –state, transition, path –Pluggable semantics –Study discrepancies between multiple statechart formalisms Demonstrations: –Orion’s Pad Abort-1 –Ares-Orion communication –JPL’s MER Arbiter –Apollo lunar autopilot Test-Sequence Generation for Multiple Statechart Models

Dynamic Techniques Classic symbolic execution is a static technique Dynamic techniques Collect symbolic constraints during concrete executions DART = Directed Automated Random Testing Concolic (Concrete Symbolic) testing P. Godefroid

DART = Directed Automated Random Testing Dynamic test generation Run the program starting with some random inputs Gather symbolic constraints on inputs at conditional statements Use a constraint solver to generate new test inputs Repeat the process until a specific program path or statement is reached (classic dynamic test generation [Korel90]) Or repeat the process to attempt to cover ALL feasible program paths (DART [Godefroid et al PLDI’05]) Detect crashes, assert violations, runtime errors etc. P. Godefroid

Directed Search Concrete Execution Symbolic Execution Path Constraint x = 0, y = 0 create symbolic variables x, y int x, y; if (x > y) { x = x + y; y = x – y; x = x – y; if (x > y) assert false; }

Directed Search Concrete Execution Symbolic Execution Path Constraint x = 0, y = 0 create symbolic variables x, y int x, y; if (x > y) { x = x + y; y = x – y; x = x – y; if (x > y) assert false; } x ≤ y Solve: !(x≤y) Solution: x=1, y=0

Directed Search Concrete Execution Symbolic Execution Path Constraint x = 1, y = 0 create symbolic variables x, y int x, y; if (x > y) { x = x + y; y = x – y; x = x – y; if (x > y) assert false; }

Directed Search Concrete Execution Symbolic Execution Path Constraint x = 1, y = 0 create symbolic variables x, y int x, y; if (x > y) { x = x + y; y = x – y; x = x – y; if (x > y) assert false; } x > y

Directed Search Concrete Execution Symbolic Execution Path Constraint x = 1, y = 0 create symbolic variables x, y int x, y; if (x > y) { x = x + y; y = x – y; x = x – y; if (x > y) assert false; } x > y x = x+y

Directed Search Concrete Execution Symbolic Execution Path Constraint x = 1, y = 1 create symbolic variables x, y int x, y; if (x > y) { x = x + y; y = x – y; x = x – y; if (x > y) assert false; } x > y y = x x = x+y

Directed Search Concrete Execution Symbolic Execution Path Constraint x = 0, y = 1 create symbolic variables x, y int x, y; if (x > y) { x = x + y; y = x – y; x = x – y; if (x > y) assert false; } x > y y = x x = y

Directed Search Concrete Execution Symbolic Execution Path Constraint x = 0, y = 1 create symbolic variables x, y int x, y; if (x > y) { x = x + y; y = x – y; x = x – y; if (x > y) assert false; } x > y y = x x = y y ≤ x Solve: x> y AND !(y≤x) Impossible: DONE!

Dynamic Test Generation Very popular Implemented and extended in many interesting ways Many tools PEX, SAGE, CUTE, jCUTE, CREST, SPLAT, etc Many applications Bug finding, security, web and database applications, etc. EXE (Stanford Univ. [Cadar et al TISSEC 2008]) Related dynamic approach to symbolic execution

A Comparison [ISSTA’11] EXE results: stmt “S3” not covered DART results: path “S0;S4” not covered “Classic” sym exe w/ mixed concrete-symbolic solving: all paths covered Example Predicted path “S0;S4” != path taken “S1;S4” //hash(x)=10*x

Mixed Concrete-Symbolic Solving [ISSTA’11] Use un-interpreted functions for external library calls Split path condition PC into: simplePC – solvable constraints complexPC – non-linear constraints with un-interpreted functions Solve simplePC Use obtained solutions to simplify complexPC Check the result again for satisfiability Example (assume hash(x) = 10 *x): PC: X>3 ∧ Y>10 ∧ Y=hash(X) simplePC complexPC Solve simplePC; use solution X=4 to compute h(4)=40 Simplify complexPC: Y=40 Solve again: X>3 ∧ Y>10 ∧ Y=40 Satisfiable!

Challenge Path explosion Symbolic execution of a program may result in a very large, possibly infinite number of paths

Infinite symbolic execution tree void test(int n) { int x = 0; while(x < n) x = x + 1; } Example Code n:S PC:true n:S,x:0 PC:true n:S,x:1 PC:0<S n:S,x:0 PC:0<S n:S,x:0 PC:0>=S n:S,x:1 PC:0 =S n:S,x:1 PC:0<S ∧ 1<S... Problem: loops and recursion

Solutions Dealing with loops and recursion Put bound on search depth or on number of PCs Stop search when desired coverage achieved Loop abstraction [Saxena et al ISSTA’09] [Godefroid ISSTA’11] Solutions addressing path explosion Parallel Symbolic Execution Abstract State Matching Compositional DART = SMART

Parallel Symbolic Execution Path explosion Increases exponentially with the number of inputs specified as symbolic Very expensive in terms of time (weeks, months) Solution Speed-up symbolic execution using parallel or distributed techniques Symbolic execution is amenable to parallelization No sharing between sub-trees

Balancing partitions Nicely Balanced – linear speedup Poorly Balanced – no speedup Solutions Simple static partitioning [ISSTA’10] Dynamic partitioning [Andrew King’s Masters Thesis at KSU, Cloud9 at EPFL, Fujitsu]

Simple Static Partitioning Static partitioning of tree with light dynamic load balancing Flexible, little communication overhead Constraint-based partitioning Constraints used as initial pre-conditions Constraints are disjoint and complete Approach Shallow symbolic execution => produces large number of constraints Constraints selection – according to frequency of variables Combinatorial partition creation Intuition Commonly used variables likely to partition state space in useful ways Results maximum analysis time speedup of 90x observed using 128 workers and a maximum test generation time speedup of 70x observed using 64 workers.

Abstract State Matching State matching – subsumption checking [SPIN’06, J. STTT 2008] Obtained through DFS traversal of “rooted” heap configurations Roots are program variables pointing to the heap Unique labeling for “matched” nodes Check logical implication between numeric constraints Not enough to ensure termination Abstraction Store abstract versions of explored symbolic states Use subsumption checking to determine if an abstract state is re-visited Decide if the search should continue or backtrack

Abstract State Matching Enables analysis of under-approximation of program behavior Preserves errors to safety properties -- useful for testing Automated support for two abstractions (inspired by shape analysis [TVLA] Singly linked lists Arrays No refinement! See [Albarghouthi et al. CAV10] for symbolic execution with automatic abstraction-refinement

State Matching: Subsumption Checking E1E1 E2E2 E3E3 E4E4 E 1 > E 2  E 2 > E 3  E 2 ≤ E 4  E 1 > E 4 E1E1 E2E2 E3E3 E4E4 Stored state: New state:  Set of concrete states represented by stored state Set of concrete states represented by new state   E 1 > E 2  E 2 > E 3  E 2 < E 4  E 1 > E 4 1: 2: 4:3: 1: 2: 3:4: Normalized using existential quantifier elimination

Abstractions for Lists and Arrays Shape abstraction for singly linked lists Summarize contiguous list elements not pointed to by program variables into summary nodes Valuation of a summary node Union of valuations of summarized nodes Subsumption checking between abstracted states Same algorithm as subsumption checking for symbolic states Treat summary node as an “ordinary” node Abstraction for arrays Represent array as a singly linked list Abstraction similar to shape abstraction for linked lists

Abstraction for Lists E 1 = V 0  (E 2 = V 1  E 2 = V 2 )  E 3 = V 3 PC: V 0 ≤ v  V 1 ≤ v  V 2 ≤ v V0V0 next V1V1 n V2V2 this V3V3 next V0V0 { V 1 n, V 2 } next this V3V3 next V0V0 V1V1 n V2V2 this V0V0 next V1V1 n V2V2 this  Symbolic states Abstracted symbolic states 2:3:1: 2:3: PC: V 0 ≤ v  V 1 ≤ v PC: V 0 ≤ v  V 1 ≤ v  V 2 ≤ v E 1 = V 0  E 2 = V 1  E 3 = V 2 PC: V 0 ≤ v  V 1 ≤ v Unmatched!

Compositional DART [POPL’07] Idea: compositional dynamic test generation use summaries of individual functions like in inter-procedural static analysis if f calls g, analyze g separately, summarize the results, and use g’s summary when analyzing f A summary φ(g) is a disjunction of path constraints expressed in terms of input pre- conditions and output post-conditions: φ(g) = ∨ φ(w), with φ(w) = pre(w) ∧ post(w) g’s outputs are treated as symbolic inputs to calling function f SMART: Top-down strategy to compute summaries on a demand-driven basis from concrete calling contexts Same path coverage as DART but can be exponentially faster! Follow-up work Anand et al. [TACAS’08], Godefroid et al. [POPL’10] P. Godefroid

Example Program P = {top, is_positive} has 2 N feasible paths DART will perform 2 N runs SMART will perform only 4 runs 2 to compute summary φ(is_positive) = (x>0 ∧ ret=1) ∨ (x≤0 ∧ ret=0) 2 to execute both branches of (*) by solving: [(s[0]>0 ∧ ret 0 =1) ∨ (s[0]≤0 ∧ ret 0 =0)] ∧ [(s[1]>0 ∧ ret 1 =1) ∨ (s[1]≤0 ∧ ret 1 =0)] ∧ … ∧ [(s[N-1]>0 ∧ ret N-1 =1) ∨ (s[N-1]≤0 ∧ ret N-1 =0)] ∧ (ret 0 +ret 1 + … + ret N-1 =3) P. Godefroid int is_positive(int x) { if (x>0) return 1; return 0; } #define N 100 void top (int s[N]) {// N inputs int i, cnt=0; for (i=0;i<N; i++) cnt=cnt+is_positive(s[i]); if (cnt == 3) error(); // (*) return; }

1: d=d+1; 2: if (x > y) 3: return d / (x-y); else 4: return d / (y-x); PC: X>Y x: X, y: Y, d: D+1 PC: true PC: X<=Y PC: X>Y return: (D+1)/(X-Y) PC: X<=Y & Y-X!=0 return: (D+1)/(Y-X) PC: X<=Y & Y-X=0 Division by zero! Solve path conditions → test inputs Method m: Symbolic execution tree: [2:] [3:] [4:] x: X, y: Y, d: D Path condition PC: true [1:] Applications – An Example

Auto-generated JUnit public void t1() { m(1, 0, 1); public void t2() { m(0, 1, 1); public void t3() { m(1, 1, 1); } Achieves full path coverage Pass ✔ Fail ✗ PC: X<=Y & Y-X=0  X=Y

Program Repair and Synthesis Add JML Add argument check in m: If(x==y) throw new IllegalArgumentException(“requires: x!=y”) Add expected clause to test public void t3() { m(1, 1, 1); } Will fix the error or produce more useful output One can do more sophisticated program repairs. See [ICSE’11 “Angelic Debugging”]

Invariant Generation Pre-condition: “x!=y” Post-condition: “\result==((x>y) ? (d+1)/(x-y) : (d+1)/(y-x))” Use inductive and machine learning techniques to generate loop invariants See DySy [Csallner et al ICSE’08], also [SPIN’04]

Differential Symbolic Execution Computes logical difference between two program versions [FSE08, Person et al PLDI11]

Applications Automated test-input generation test vectors and test sequences Error detection, Invariant generation Program and data structure repair Security Robustness and stress testing Regression testing etc.

Scalability Compositional techniques [Godefroid, POPL’07] Pruning redundant paths [Boonstoppel et al, TACAS’08] Heuristic search [Brunim & Sen, ASE’08] [Majumdar & Se, ICSE’07] Parallel techniques [Siddiqui & Khurshid, ICSTE’10] [Staats & Pasareanu, ISSTA’10] Incremental techniques [Person et al, PLDI’11] Complex non-linear mathematical constraints Un-decidable or hard to solve Heuristic solving [Lakhotia et al., ICTSS’10][Souza et al, NFM’11] Testing web applications and security problems String constraints [Bjorner et al, 2009] … Mixed numeric and string constraints [ISSTA’11] [Fujitsu] Not covered: Symbolic execution for formal verification [Coen-Porisini et al, ESEC/FSE’01], [Dillon, ACM TOPLAS’90], [Harrison & Kemmerer’88] … Challenges

Symbolic Execution and Software Testing King [Comm. ACM 1976], Clarke [IEEE TSE 1976] Received renewed interest in recent years Algorithmic advances Increased availability of computational power and decision procedures Tools, many open-source NASA’s Symbolic (Java) Pathfinder UIUC’s CUTE and jCUTE Stanford’s KLEE UC Berkeley’s CREST and BitBlaze Microsoft’s Pex, SAGE, YOGI, PREfix IBM’s Apollo, Parasoft’s testing tools etc. Bibliography on symbolic execution (Saswat Anand): See ICSE’11 Impact Project talk on Thursday

Thank you!

References Java PathFinder [TACAS03]Sarfraz Khurshid, Corina S. Pasareanu, Willem Visser: Generalized Symbolic Execution for Model Checking and Testing. TACAS 2003: Sarfraz Khurshid, Corina S. Pasareanu, Willem Visser: Generalized Symbolic Execution for Model Checking and Testing. TACAS 2003: [CAV10] Aws Albarghouthi, Arie Gurfinkel, Ou Wei, Marsha Chechik: Abstract Analysis of Symbolic Executions. CAV 2010: Aws Albarghouthi, Arie Gurfinkel, Ou Wei, Marsha Chechik: Abstract Analysis of Symbolic Executions. CAV 2010: [FSE08] Suzette Person, Matthew B. Dwyer, Sebastian G. Elbaum, Corina S. Pasareanu: Differential symbolic execution. SIGSOFT FSE 2008: [FSE08] Suzette Person, Matthew B. Dwyer, Sebastian G. Elbaum, Corina S. Pasareanu: Differential symbolic execution. SIGSOFT FSE 2008: [PLDI11] Directed Incremental Symbolic Execution Suzette Person (NASA Langley Reseach Center), Guowei Yang (The Universiity of Texas at Austin), Neha Rungta (NASA Ames Research Center), and Sarfraz Khurshid (The University of Texas at Austin) [PLDI11] [ICSE08] C. Csallner, N. Tillmann, Y. Smaragdakis: DySy: dynamic symbolic execution for invariant inference. ICSE 2008: C. CsallnerN. TillmannICSE 2008 [SPIN04] Corina S. PasareanWillem Visser: Verification of Java Programs Using Symbolic Execution and Invariant Generation. SPIN 2004: Willem Visser: Verification of Java Programs Using Symbolic Execution and Invariant Generation. [ICSE11] Angelic Debugging Satish Chandra, Emina Torlak, Shaon Barman, and Rastislav Bodik IBM Research, USA; UC Berkeley, USA ISTTA 2011 – loop abstractions Give example of infinite lopps Prateek Saxena, Pongsin Poosankam, Stephen McCamant, Dawn Song: Loop-extended symbolic execution on binary programs. ISSTA 2009: Prateek Saxena, Pongsin Poosankam, Stephen McCamant, Dawn Song: Loop-extended symbolic execution on binary programs. ISSTA 2009: Rupak Majumdar, Indranil Saha: Symbolic Robustness Analysis. IEEE Real-Time Systems Symposium 2009: Indranil Saha: Symbolic Robustness Analysis. IEEE Real-Time Systems Symposium 2009: Kiran Lakhotia, Nikolai Tillmann, Mark Harman, Jonathan de Halleux: FloPSy - Search-Based Floating Point Constraint Solving for Symbolic Execution. ICTSS 2010: Kiran Lakhotia, Nikolai Tillmann, Mark Harman, Jonathan de Halleux: FloPSy - Search-Based Floating Point Constraint Solving for Symbolic Execution.