jFuzz – Java based Whitebox Fuzzing

jFuzz – Java based Whitebox Fuzzing David Harvison Adam Kiezun

Summary Problem: Generating interesting test inputs for file-reading programs takes time. Approach: Create a smart fuzzer to generate inputs that cause programs to crash. Results: jFuzz generates many input files and creates a base for others to expand upon.

Problem Bugs in a program may cause crashes for specific input files. A compiler with buggy code, a media player with a corrupt file, etc. Creating input files by hand takes time. Some of the files may exercise the same code. Want a way to automatically generate input files that crash the program.

Idea Generate inputs that cause crashes by generating many inputs that exercise unique execution paths.

Program Example
File reading code. Want to generate files which exercise different program paths.

    if (car0 == '-') {
        neg = true;
    } else {
        neg = false;
        cnt++;
    }
    if (car1 >= '0' && car1 <= '5') {
        val = car1 - '0';
    } else {
        val = car1;
    }

Four inputs, one per path:
    car0 == '-', car1 == '5'
    car0 == '-', car1 == '7'
    car0 == '+', car1 == '3'
    car0 == '+', car1 == '9'
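The four example inputs can be checked against the two branches with a small sketch. `PathDemo` and `pathOf` are hypothetical names introduced here for illustration; they are not part of jFuzz. The method only records which way each condition goes (T for taken, F for not taken), so two inputs exercise the same path exactly when their labels match.

```java
// Hypothetical helper, not part of jFuzz: labels the path taken by the
// two-character reader shown on the slide.
class PathDemo {
    static String pathOf(char car0, char car1) {
        StringBuilder path = new StringBuilder();
        // First branch: is the leading character a minus sign?
        path.append(car0 == '-' ? "T" : "F");
        // Second branch: is the next character in the range '0'..'5'?
        path.append(car1 >= '0' && car1 <= '5' ? "T" : "F");
        return path.toString();
    }

    public static void main(String[] args) {
        char[][] inputs = { {'-', '5'}, {'-', '7'}, {'+', '3'}, {'+', '9'} };
        for (char[] in : inputs) {
            System.out.println(in[0] + "" + in[1] + " -> " + pathOf(in[0], in[1]));
        }
    }
}
```

Each of the four inputs yields a different label, which is why a fuzzer that aims for unique paths would keep all four and discard, say, a second input starting with '-' followed by '4'.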

Related Tools Cute, EXE, SAGE, catchconv, Apollo: smart fuzzers, that is, programs that generate interesting new inputs for programs. Not for Java. JCute: smart fuzzer for Java. Reinstruments code and requires source files. Has problems with the JDK.

Overall Idea Input: a compiled (into bytecode) Java program and a valid input file. Output: new input files which exercise unique control paths. Run the subject program in a modified JVM. A logic predicate, the Path Condition, is formed as the program executes; it describes the control flow of the execution. New inputs are created by manipulating the path condition.

Example
Subject program, first run on the input "good":

    public void top(char[] input) {
        int cnt = 0;
        if (input[0] == 'b') cnt++;
        if (input[1] == 'a') cnt++;
        if (input[2] == 'd') cnt++;
        if (input[3] == '!') cnt++;
        if (cnt > 3) crash();
    }

Path condition for "good": I0 != 'b', I1 != 'a', I2 != 'd', I3 != '!'
Negate constraints in the path condition. Solve the new path condition to create new inputs.
    I0 != 'b', I1 != 'a', I2 != 'd', I3 == '!' gives goo!
    I0 != 'b', I1 != 'a', I2 == 'd' gives godd
    Negating the remaining two constraints in the same way gives gaod and bood.
All paths are explored systematically.
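The negate-and-solve loop on this example can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `FlipDemo`, `children`, and `explore` are hypothetical names, `top` returns true where the slide calls `crash()`, and the "solver" is a toy that works only because every constraint here compares one input character with a constant. jFuzz itself hands the path condition to a real constraint solver.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class FlipDemo {
    // Constants compared against input[0..3] in the slide's top() method.
    static final char[] TARGET = {'b', 'a', 'd', '!'};

    // Subject program from the slide; returns true where the slide calls crash().
    static boolean top(char[] input) {
        int cnt = 0;
        if (input[0] == 'b') cnt++;
        if (input[1] == 'a') cnt++;
        if (input[2] == 'd') cnt++;
        if (input[3] == '!') cnt++;
        return cnt > 3;
    }

    // Negate each constraint in turn, last one first, and "solve" the result.
    // Toy solver (assumption): an equality is satisfied by the target
    // character, a disequality by the seed's character, which is assumed to
    // differ from the target at every position.
    static List<String> children(String parent, String seed) {
        List<String> out = new ArrayList<>();
        for (int i = TARGET.length - 1; i >= 0; i--) {
            char[] next = parent.toCharArray();
            next[i] = (next[i] == TARGET[i]) ? seed.charAt(i) : TARGET[i];
            out.add(new String(next));
        }
        return out;
    }

    // Worklist exploration: runs every new input once and returns the inputs
    // that reach crash().
    static Set<String> explore(String seed) {
        Set<String> seen = new LinkedHashSet<>();
        Set<String> crashes = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.add(seed);
        seen.add(seed);
        while (!work.isEmpty()) {
            String input = work.remove();
            if (top(input.toCharArray())) crashes.add(input);
            for (String child : children(input, seed)) {
                if (seen.add(child)) work.add(child);
            }
        }
        return crashes;
    }
}
```

Calling `FlipDemo.children("good", "good")` reproduces the first generation from the slides (goo!, godd, gaod, bood), and `FlipDemo.explore("good")` keeps flipping until it reaches the one crashing input, "bad!".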

Tool Used: NASA Java PathFinder (JPF). Dynamic analysis framework for Java implemented as a JVM. Allows backtracking, including saving and restoring the whole state of the VM. Can execute all thread interleavings. Can execute a program on all possible inputs.

Attributes Additional state-stored information. Associated with runtime values. JPF propagates attributes across assignment, method calls, etc. Allows us to keep track of how the variables relate to the input using symbolic expressions.

Concrete v. Concolic Concolic execution is both concrete and symbolic. Normal (concrete) execution: 3 + 4 = 7. With attributes: the operands 3 and 4 carry the symbolic attributes exp0 and exp1, and the result 7 carries Sum(exp0, exp1).

Concrete v. Concolic
Concrete:

    public class IADD extends Instruction {
        public Instruction execute (... ThreadInfo th) {
            int v1 = th.pop();
            int v2 = th.pop();
            th.push(v1 + v2, ...);
            return getNext(th);
        }
    }

Symbolic:

    public class IADD extends ...bytecode.IADD {
        public Instruction execute (... ThreadInfo th) {
            int v1 = th.pop();
            int v2 = th.pop();
            th.push(v1 + v2, ...);
            StackFrame sf = th.getTopFrame();
            IntExpr sym_v1 = sf.getOperandAttr();
            IntExpr sym_v2 = sf.getOperandAttr();
            // Neither operand depends on the input: nothing symbolic to track.
            if (sym_v1 == null && sym_v2 == null) return getNext(th);
            IntExpr result = sym_v1._plus(sym_v2);
            sf.setOperandAttr(result);
            return getNext(th);
        }
    }
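The idea behind the symbolic IADD can be tried outside JPF with a stand-alone sketch. `ConcolicInt` is a hypothetical class written for this illustration; it does not use JPF's attribute API, and it represents symbolic expressions as plain strings. It also handles the mixed case, where only one operand is symbolic, by treating the concrete operand as a constant (an assumption, since the slide code does not show that case).

```java
// Hypothetical illustration type, not JPF's attribute API: a concrete int
// paired with an optional symbolic expression over the program input.
class ConcolicInt {
    final int concrete;     // the value the program actually computes with
    final String symbolic;  // expression over the input, or null if input-independent

    ConcolicInt(int concrete, String symbolic) {
        this.concrete = concrete;
        this.symbolic = symbolic;
    }

    ConcolicInt plus(ConcolicInt other) {
        // The concrete part is always computed, exactly as in a normal run.
        int sum = concrete + other.concrete;
        // Neither operand depends on the input: nothing symbolic to track.
        if (symbolic == null && other.symbolic == null) {
            return new ConcolicInt(sum, null);
        }
        // A purely concrete operand enters the expression as a constant.
        String left = symbolic != null ? symbolic : Integer.toString(concrete);
        String right = other.symbolic != null ? other.symbolic : Integer.toString(other.concrete);
        return new ConcolicInt(sum, "Sum(" + left + ", " + right + ")");
    }
}
```

With the values from the earlier slide, `new ConcolicInt(3, "exp0").plus(new ConcolicInt(4, "exp1"))` has the concrete value 7 and the symbolic value `Sum(exp0, exp1)`.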

jFuzz Architecture Runs JPF many times on the subject program and input files. Each run: Collects the Path Condition (PC). Negates each constraint, reduces, and solves. Uses new PCs to generate new input files. Keeps track of inputs which caused exceptions to be thrown. Pipeline: subject and original input -> JPF -> PC -> negated PC -> solver -> new input. Output: inputs which cause crashes.

Creating New Inputs For a given execution some parts of the input may not be read. When the path condition is solved, only the read parts will have new values. The changes are written over the original input, preserving the unused parts.
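Writing the solved values over the original input can be sketched as follows. `InputPatcher` and `patch` are hypothetical names, and the solver result is assumed to arrive as a map from byte offset to new value; the slides do not say how jFuzz represents it.

```java
import java.util.Map;

// Hypothetical helper, not jFuzz's own code: overlays solver results on the
// original input so that unread bytes are preserved.
class InputPatcher {
    // solved maps a byte offset in the input to the new value chosen by the
    // solver; offsets that do not appear in the path condition are absent.
    static byte[] patch(byte[] original, Map<Integer, Byte> solved) {
        byte[] out = original.clone(); // start from an untouched copy
        for (Map.Entry<Integer, Byte> e : solved.entrySet()) {
            int offset = e.getKey();
            // Ignore offsets outside the file rather than growing it
            // (a simplification made for this sketch).
            if (offset >= 0 && offset < out.length) {
                out[offset] = e.getValue();
            }
        }
        return out;
    }
}
```

For example, patching the bytes of "good" with the single entry 3 -> '!' yields "goo!", while the first three bytes are copied from the original unchanged.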

Reducing the Path Condition Path conditions can be very long. Not all constraints are affected by the negation. Constraints that are not affected can be removed from the PC. Pipeline: subject and original input -> JPF -> PC -> negated PC -> PC minimizer -> solver -> new input. Output: inputs which cause crashes.

Example Reduction
Original path condition:
    [1] a + b < 10
    [2] b > 6
    [3] c < 15
    [4] a < 3
    [5] c + d > 7
    [6] e != 1
    [7] c - e = 5
    [8] a == 2
Start with fuzzing the last constraint: [8] becomes a != 2.
Select all constraints which contain variables in that constraint: [1], [4], and [8] contain a.
If one of the constraints contains multiple variables, select all constraints which contain those variables: [1] also contains b, so [2] is selected.
All other constraints can be removed. Variables not in the new PC are left unchanged.
Reduced path condition:
    [1] a + b < 10
    [2] b > 6
    [3] a < 3
    [4] a != 2

Reducing the Path Condition Reductions are performed for every constraint that is negated. jFuzz uses a UnionFind data structure to find which variables are connected to each other. In our case study, the average reduction was from around 250 constraints to about 5 constraints.
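A reduction of this kind can be sketched with a small union-find over variable names. This is a minimal illustration, not jFuzz's implementation: `PcReducer` and `reduce` are hypothetical names, each constraint is represented only by the list of variables it mentions, every constraint is assumed to mention at least one variable, and the caller passes whichever constraints make up the new path condition.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of path-condition reduction with union-find.
class PcReducer {
    private final Map<String, String> parent = new HashMap<>();

    // Returns the representative of v's set, compressing the path on the way.
    private String find(String v) {
        parent.putIfAbsent(v, v);
        String root = v;
        while (!parent.get(root).equals(root)) root = parent.get(root);
        String cur = v;
        while (!cur.equals(root)) {
            String next = parent.get(cur);
            parent.put(cur, root);
            cur = next;
        }
        return root;
    }

    private void union(String a, String b) {
        parent.put(find(a), find(b));
    }

    // constraintVars.get(i) lists the variables of constraint i (0-based).
    // Returns the indices of the constraints connected, directly or through
    // shared variables, to the negated constraint.
    static List<Integer> reduce(List<List<String>> constraintVars, int negated) {
        PcReducer uf = new PcReducer();
        // Variables that occur in the same constraint belong to the same set.
        for (List<String> vars : constraintVars) {
            for (String v : vars) uf.union(vars.get(0), v);
        }
        String target = uf.find(constraintVars.get(negated).get(0));
        List<Integer> keep = new ArrayList<>();
        for (int i = 0; i < constraintVars.size(); i++) {
            if (uf.find(constraintVars.get(i).get(0)).equals(target)) keep.add(i);
        }
        return keep;
    }
}
```

For the eight-constraint example, the variable lists are [a, b], [b], [c], [a], [c, d], [e], [c, e], [a]. Negating the last one (index 7) keeps indices 0, 1, 3, and 7, which are constraints [1], [2], [4], and [8] in the slide's numbering.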

Case Study
Subject: Sat4J. SAT solver written in Java. Takes inputs in DIMACS files. ~10 kloc.
Goals: Create inputs that crash Sat4J. Create a set of good inputs.
Seed input, test1.dimacs:

    c test 3 single clauses and 2
    c binary clauses
    p cnf 4 5
    1 0
    2 0
    3 0
    -2 4 0
    -3 4 0

Results After 30 minutes of execution: 12,000 input files were created. 70 crashes were found. The crashes are actually normal for Sat4J: 38 invalid DIMACS files, 27 contradictions, 4 assertion errors. A Java compiler would be a more compelling subject: any crash is due to a bug in the compiler, and it is a much larger program.

Performance Sat4J was run 100 times in each VM. The times are average runtime. Simplifying the Path Condition reduces the solving time by 30%.

Conclusions jFuzz is the first concolic tester for Java which will work for any bytecode. This opens the door for more advanced fuzzing techniques, such as grammar-based fuzzing.

Fuzzing the Input
Input: a b c d e
Before reduction, the path condition ([1] a + b < 10 through [8] a != 2) refers to every part of the input.
Reduced path condition:
    [1] a + b < 10
    [2] b > 6
    [3] a < 3
    [4] a != 2
The reduced PC does not affect all parts of the input. The solver will only return values for variables in the PC.
The new values are written over the original input: a' b' c d e
The new input only differs from the original in the values in the PC.