Symbolic execution © Marcelo d’Amorim 2010.



Goal and Input-Output

Automate test input data generation
– Input: parameterized function call
– Output: inputs such that all* paths are explored

    foo(int x, int y) {
      if (x > y) { ... } else { ... }
    }

    foo($a, $b);  →  Symbolic Execution  →  foo(1,0); foo(0,0)

Attention!

Function foo can be arbitrarily complex
– Other types, calls to other functions, loops and branches, etc.

One can obtain tests with user-defined assertions

Opening the box…

    foo($a, $b);  →  Symbolic Execution  →  foo(1,0); foo(0,0)

Symbolic execution has two parts: constraint generation, which produces path conditions, and constraint solving, which turns each path condition into concrete inputs.

A path condition is a description of a path as a function of the symbolic inputs. Symbolic execution explores all program paths.

    foo(int x, int y) {
      if (x > y) { ... } else { ... }
    }

Path conditions for foo: $a > $b and $a <= $b
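The two path conditions of foo can be written down as predicates over the symbolic inputs and handed to a solver. The following is a minimal sketch under stated assumptions: the class PathConditionsDemo, the PathCondition type, and the solve helper are hypothetical names introduced for illustration, and the "solver" is a naive bounded search standing in for a real constraint solver.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

class PathConditionsDemo {
    /** A path condition over the symbolic inputs $a and $b (hypothetical helper type). */
    static final class PathCondition {
        final String text;
        final BiPredicate<Integer, Integer> holds;

        PathCondition(String text, BiPredicate<Integer, Integer> holds) {
            this.text = text;
            this.holds = holds;
        }
    }

    /** One path condition per branch of `if (x > y)` in foo. */
    static List<PathCondition> pathConditionsOfFoo() {
        List<PathCondition> pcs = new ArrayList<>();
        pcs.add(new PathCondition("$a > $b", (a, b) -> a > b));
        pcs.add(new PathCondition("$a <= $b", (a, b) -> a <= b));
        return pcs;
    }

    /** Naive stand-in for a constraint solver: try small values in [0, bound]. */
    static int[] solve(PathCondition pc, int bound) {
        for (int a = 0; a <= bound; a++) {
            for (int b = 0; b <= bound; b++) {
                if (pc.holds.test(a, b)) {
                    return new int[] {a, b};
                }
            }
        }
        return null; // no solution within the bound
    }

    public static void main(String[] args) {
        for (PathCondition pc : pathConditionsOfFoo()) {
            int[] in = solve(pc, 3);
            System.out.println(pc.text + "  ->  foo(" + in[0] + "," + in[1] + ")");
        }
    }
}
```

With this search order the sketch produces foo(1,0) for $a > $b and foo(0,0) for $a <= $b, the same test inputs shown on the slide.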

Exercise

Generate the path conditions for this program.

    void bar1(int x) {
      if (x > 0) { … }
      else if (x < 0) { … }
      else { ERROR; }
    }

Exercise

Generate the path conditions for this program.

    void bar2(int x) {
      if (x > 0) {
        if (x > 10) {…}
      } else if (x < 0) {
        if (x < 2) {…}
      } else {
        ERROR;
      }
    }

Infeasible path! Inside the x < 0 branch, the condition x < 2 can never be false.
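One way to see the infeasible path in bar2 is to list all five path conditions and look for a witness input for each. The sketch below assumes hypothetical names (Bar2Paths, pathConditions, witness) and uses a bounded search instead of a real solver. A bounded search cannot prove infeasibility in general; here the condition x < 0 && x >= 2 is contradictory for every integer, so the missing witness is not an artifact of the bound.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntPredicate;

class Bar2Paths {
    /** The five path conditions of bar2, in program order. */
    static Map<String, IntPredicate> pathConditions() {
        Map<String, IntPredicate> pcs = new LinkedHashMap<>();
        pcs.put("x > 0 && x > 10", x -> x > 0 && x > 10);
        pcs.put("x > 0 && x <= 10", x -> x > 0 && x <= 10);
        pcs.put("x <= 0 && x < 0 && x < 2", x -> x <= 0 && x < 0 && x < 2);
        pcs.put("x <= 0 && x < 0 && x >= 2", x -> x <= 0 && x < 0 && x >= 2);
        pcs.put("x <= 0 && x >= 0", x -> x <= 0 && x >= 0);
        return pcs;
    }

    /** Returns some x in [lo, hi] satisfying pc, or null if there is none in that range. */
    static Integer witness(IntPredicate pc, int lo, int hi) {
        for (int x = lo; x <= hi; x++) {
            if (pc.test(x)) {
                return x;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, IntPredicate> e : pathConditions().entrySet()) {
            Integer w = witness(e.getValue(), -20, 20);
            System.out.println(e.getKey() + "  ->  " + (w == null ? "no witness" : "x = " + w));
        }
    }
}
```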

Exercise

Generate the path conditions for this program. Hint: ignore paths with length > 2.

    int fact(int n) {
      return (n > 0) ? n * fact(n - 1) : 1;
    }

Repeated states.

Part 1: constraint generator

Modifies program semantics to handle symbolic state
– Stack, heap, and static area hold symbolic values

Two popular alternatives
– Instrumentation
– Modified interpreter (e.g., Java Virtual Machine)

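Instrumentation

Original:

    foo(int x) {
      x = x + 1;
      if (x > 10) {
        // …
      } else {
        // …
      }
    }

Instrumented:

    foo(SymInt x) {
      x = x.add(ONE);
      if (x.gt(TEN).choose()) {
        // …
      } else {
        // …
      }
    }

The instrumentation changes two things:
– Types and operations: int becomes SymInt, + becomes add, > becomes gt
– Choice: choose() decides which branch to follow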
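To make the instrumented version concrete, here is a minimal sketch of what SymInt and choose() could look like. SymInt, add, gt, choose, ONE, and TEN come from the slide; everything else (SymBool, the decision script, explore) is a hypothetical design assumed for illustration, not the API of any particular tool. Symbolic values are kept as expression strings, and choose() replays a script of branch decisions so that repeated runs enumerate every path depth-first.

```java
import java.util.ArrayList;
import java.util.List;

class SymIntDemo {
    static final List<Boolean> script = new ArrayList<>();      // decisions to replay
    static final List<String> pathCondition = new ArrayList<>(); // constraints of the current run
    static int cursor;

    /** Symbolic integer: carries an expression instead of a concrete value. */
    static final class SymInt {
        final String expr;
        SymInt(String expr) { this.expr = expr; }
        SymInt add(SymInt o) { return new SymInt("(" + expr + " + " + o.expr + ")"); }
        SymBool gt(SymInt o) { return new SymBool(expr + " > " + o.expr); }
    }

    /** Symbolic boolean: choose() picks a branch and records the matching constraint. */
    static final class SymBool {
        final String expr;
        SymBool(String expr) { this.expr = expr; }
        boolean choose() {
            if (cursor == script.size()) script.add(true); // new branch point: try "true" first
            boolean taken = script.get(cursor++);
            pathCondition.add(taken ? expr : "!(" + expr + ")");
            return taken;
        }
    }

    static final SymInt ONE = new SymInt("1");
    static final SymInt TEN = new SymInt("10");

    /** The instrumented foo from the slide. */
    static void foo(SymInt x) {
        x = x.add(ONE);
        if (x.gt(TEN).choose()) { /* then-branch */ } else { /* else-branch */ }
    }

    /** Runs the program once per path and returns the path condition of each run. */
    static List<String> explore(Runnable program) {
        List<String> results = new ArrayList<>();
        script.clear();
        do {
            cursor = 0;
            pathCondition.clear();
            program.run();
            results.add(String.join(" && ", pathCondition));
            // Backtrack: drop trailing "false" decisions, then flip the last "true".
            while (!script.isEmpty() && !script.get(script.size() - 1)) {
                script.remove(script.size() - 1);
            }
            if (!script.isEmpty()) script.set(script.size() - 1, false);
        } while (!script.isEmpty());
        return results;
    }

    public static void main(String[] args) {
        for (String pc : explore(() -> foo(new SymInt("$x")))) {
            System.out.println(pc);
        }
    }
}
```

The sketch only generates constraints; it never checks them for satisfiability, so it would also report infeasible paths. A real tool would pass each path condition to a constraint solver.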

Discussion

– What would you need to modify in a JVM to run programs in symbolic execution mode?
– What are the pros and cons of an instrumentation-based solution vs. a modified JVM?

Part 2: constraint solver

Decision procedures can be used to solve simple constraints. For example:
– Integer linear arithmetic: x > y + z and z < y

Unfortunately, symbolic execution can generate complex constraints
– Undecidable, intractable, or just not handled by decision procedures

Pointers for the interested

– JVM symbolic execution: AQUA and SPF
– Complex constraints: CORAL or FloPSy

Links:
– AQUA and CORAL:
– SPF: google JPF and symb project
– FloPSy: us/people/nikolait/

Objects: Lazy initialization

A symbolic object is an “unknown blob”.
– Execution fills in the details of the blob on demand

Assignment example: o.f = exp
– Variable o holds the symbolic object ? (the blob)
– 3 possible outcomes depending on ?:
    – ? is null
    – ? is a not yet seen object
    – ? is an already seen object

Concretize the heap while making choices
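The three outcomes can be sketched as a function that enumerates the candidate values for an uninitialized symbolic reference. This is only an illustration under assumptions: LazyInitDemo, Node, and choices are hypothetical names, and a real implementation would also filter the already seen objects by type and fork the execution once per choice.

```java
import java.util.ArrayList;
import java.util.List;

class LazyInitDemo {
    /** Hypothetical heap node; only its name matters in this illustration. */
    static final class Node {
        final String name;
        Node(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    /** Enumerates the possible values of a symbolic reference on first access. */
    static List<Node> choices(List<Node> seen, String freshName) {
        List<Node> out = new ArrayList<>();
        out.add(null);                // 1. the reference is null
        out.add(new Node(freshName)); // 2. a not yet seen object
        out.addAll(seen);             // 3. an alias of an already seen object
        return out;
    }

    public static void main(String[] args) {
        List<Node> seen = new ArrayList<>();
        seen.add(new Node("$a"));
        for (Node choice : choices(seen, "$b")) {
            System.out.println("o = " + choice);
        }
    }
}
```

The number of choices grows with the number of already seen objects, which is one reason the heap is concretized only when a reference is actually used.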

Example

    Node root;
    add(Node n) {
      if (root == null) {
        root = n;
      } else {
        int v = root.val;
        if (v < n.val) {…}
        …
      }
    }

    BST bst = new BST();
    bst.add($a);
    bst.add($b);

Notation: Primitive fields inside the box. Reference fields outside (omission indicates null). Dashed borders indicate symbolic objects.

[Figure: heap diagrams for each step and outcome; labels: bst, root, left, right, $a, $b, $x, $y]

After bst.add($a), bst.root points to $a.

Path conditions after bst.add($b):
– $a == null
– $a != null and $a.val = $x and $b.val = $y and $y < $x
– $a != null and $a.val = $x and $b.val = $y and $x = $y
– $a != null and $a.val = $x and $b.val = $y and $y > $x

One outcome is marked NPE! (NullPointerException).

Strings

Two approaches
– A string is an array of symbolic characters
– Symbolic string + special interpretation of library methods

First approach can be too expensive. Why?

    foo(String s) {
      … if (s.equals("hello")) {…} …
    }

Automata for string constraints

The second approach builds finite automata for the string constraints produced by library calls.

Constraint solving = automata walk!
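The "automata walk" can be illustrated on a deliberately small case. The sketch below assumes a two-letter alphabet and two hand-built DFAs, one for s.startsWith("ab") and one for s.endsWith("a"); the class and method names are hypothetical. A breadth-first walk over the product of the two automata returns a shortest string accepted by both, which is a solution to the conjunction of the two constraints.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Set;

class StringAutomataDemo {
    static final String ALPHABET = "ab";

    /** Deterministic finite automaton; state 0 is the start state. */
    static final class Dfa {
        final int[][] next;        // next[state][index of symbol in ALPHABET]
        final boolean[] accepting;
        Dfa(int[][] next, boolean[] accepting) {
            this.next = next;
            this.accepting = accepting;
        }
    }

    /** s.startsWith("ab"): 0 = start, 1 = read "a", 2 = read "ab" (accept), 3 = dead. */
    static Dfa startsWithAb() {
        return new Dfa(new int[][] {{1, 3}, {3, 2}, {2, 2}, {3, 3}},
                       new boolean[] {false, false, true, false});
    }

    /** s.endsWith("a"): state 1 means the last symbol read was 'a'. */
    static Dfa endsWithA() {
        return new Dfa(new int[][] {{1, 0}, {1, 0}}, new boolean[] {false, true});
    }

    /** Breadth-first walk of the product automaton; returns a shortest common word or null. */
    static String solve(Dfa a, Dfa b) {
        ArrayDeque<int[]> states = new ArrayDeque<>();
        ArrayDeque<String> words = new ArrayDeque<>();
        Set<Integer> visited = new HashSet<>();
        states.add(new int[] {0, 0});
        words.add("");
        visited.add(0);
        while (!states.isEmpty()) {
            int[] s = states.poll();
            String w = words.poll();
            if (a.accepting[s[0]] && b.accepting[s[1]]) {
                return w;
            }
            for (int i = 0; i < ALPHABET.length(); i++) {
                int p = a.next[s[0]][i];
                int q = b.next[s[1]][i];
                // Encode the pair (p, q) as one integer to track visited product states.
                if (visited.add(p * b.accepting.length + q)) {
                    states.add(new int[] {p, q});
                    words.add(w + ALPHABET.charAt(i));
                }
            }
        }
        return null; // the intersection is empty: the constraints are unsatisfiable
    }

    public static void main(String[] args) {
        System.out.println(solve(startsWithAb(), endsWithA())); // prints "aba"
    }
}
```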

Exercise

Generate automata to characterize these constraints:

    $s.startsWith("hello") and $s.indexOf("class") != -1 and $s.endsWith(".")

Concolic execution (a.k.a. whitebox fuzzing)

Several problems with standard symbolic execution. In particular:
– Exploration of infeasible paths
– Symbolic arrays
– Handling of loops and recursion
– Native method calls

Concolic Execution: How it works

1. Execute the program with concrete and symbolic inputs
2. Save decisions as before, but execute a single path!
3. Solve pending decisions and go back to step 1

Can go from symbolic to concrete domain anytime during execution!
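The three steps can be sketched as a small driver loop. Everything in this example is an assumption made for illustration: the program under test is hand-instrumented through a hypothetical branch helper, and the "solver" is a bounded search over [-100, 100] rather than a real constraint solver. Each run follows a single concrete path while recording its decisions; the driver then negates each recorded decision in turn, solves for a new input, and runs again.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.IntPredicate;

class ConcolicDemo {
    /** A recorded decision: the condition over the input and the outcome seen concretely. */
    static final class Branch {
        final IntPredicate cond;
        final boolean taken;
        Branch(IntPredicate cond, boolean taken) { this.cond = cond; this.taken = taken; }
        boolean holds(int x) { return cond.test(x) == taken; }
    }

    /** Evaluates a branch concretely and records it in the trace. */
    static boolean branch(List<Branch> trace, IntPredicate cond, int x) {
        boolean taken = cond.test(x);
        trace.add(new Branch(cond, taken));
        return taken;
    }

    /** Hand-instrumented program under test. */
    static String program(int x, List<Branch> trace) {
        if (branch(trace, v -> v > 0, x)) {
            if (branch(trace, v -> v > 10, x)) return "big";
            return "small";
        }
        if (branch(trace, v -> v < 0, x)) return "negative";
        return "ERROR";
    }

    /** Stand-in solver: find x that follows trace[0..flip) and negates trace[flip]. */
    static Integer solve(List<Branch> trace, int flip) {
        for (int x = -100; x <= 100; x++) {
            boolean ok = !trace.get(flip).holds(x);
            for (int i = 0; ok && i < flip; i++) ok = trace.get(i).holds(x);
            if (ok) return x;
        }
        return null;
    }

    /** Concolic loop: run concretely, record decisions, solve pending ones, repeat. */
    static Set<String> explore(int seed) {
        Set<String> outcomes = new LinkedHashSet<>();
        Set<Integer> tried = new HashSet<>();
        ArrayDeque<Integer> worklist = new ArrayDeque<>();
        worklist.add(seed);
        tried.add(seed);
        while (!worklist.isEmpty()) {
            int x = worklist.poll();
            List<Branch> trace = new ArrayList<>();
            outcomes.add(program(x, trace));          // steps 1 and 2: one concrete path
            for (int i = 0; i < trace.size(); i++) {  // step 3: solve pending decisions
                Integer next = solve(trace, i);
                if (next != null && tried.add(next)) worklist.add(next);
            }
        }
        return outcomes;
    }

    public static void main(String[] args) {
        System.out.println(explore(0));
    }
}
```

Starting from the seed input 0, the loop reaches all four outcomes of the example program, including the ERROR branch.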

Summary

– Important technique to automate testing
– Found real errors in file systems, OS, network protocols, and several data structures
– See www.coverity.com for industrial applications

What I believe is still missing

– Automation of driver and oracle generation
– Exploit natural parallelism

[Figure: SYMB.EXE sends queries to solvers (Solver, YICES, …) and receives solutions]