Paradyn Project Paradyn / Dyninst Week Madison, Wisconsin April 12-14, 2010 Binary Concolic Execution for Automatic Exploit Generation Todd Frederick.

Presentation transcript:

Paradyn Project Paradyn / Dyninst Week Madison, Wisconsin April 12-14, 2010 Binary Concolic Execution for Automatic Exploit Generation Todd Frederick

Vulnerabilities are everywhere… 2 Binary Concolic Execution

An exploit
Robert Morris (rtm), Finger Server, 1987
[Figure labels: exploit bytes DD8F2F736800DD8F2F6 2696ED05E5ADD00DD00 DD5ADD03D05E5CBC3B, Finger Server, shell# prompt]

The problem: exploiting vulnerable code
o Find an exploit state in a program
   o Use a known existing vulnerability
   o Previous work automatically finds vulnerable states [Giffin, Jha, Miller 2006]
o Find input that drives the program down a path to the exploit state
   o Analyze program control flow
   o Walk through the program, finding inputs to reach the current point
   o Explore paths in the program to reach the vulnerability

The problem
Assume we know of a vulnerability
[Figure labels: normal input, exploit, Program]

Running example
[Figure: the Program prompts login:; the input good leads to the password: prompt, and the input bad prints Using backdoor!]

Working with binary code
[Figure: disassembly of the Program; several addresses and operands are truncated]
: lea 0x4(%esp),%ecx
: and $0xfffffff0,%esp
: pushl 0xfffffffc(%ecx)
c: push %ebp
d: mov %esp,%ebp
f: push %ebx
: push %ecx
: sub $0x10,%esp
: call
: mov $0x3,%eax
e: mov $0x0,%ebx
80482a3: mov $0x80bd884,%ecx
80482a8: mov $0x10,%edx
80482ad: int $0x
af: mov %eax,0xfffffff0(%ebp)
80482b2: movzbl 0x80bd886,%eax
80482b9: movsbl %al,%edx
80482bc: movzbl 0x80bd884,%eax
80482c3: movsbl %al,%eax
80482c6: mov %edx,%ecx
80482c8: sub %eax,%ecx
80482ca: mov %ecx,%eax
80482cc: cmp $0x2,%eax
80482cf: jne
d1: movzbl 0x80bd886,%eax
80482d8: movsbl %al,%edx
80482db: movzbl 0x80bd885,%eax
80482e2: movsbl %al,%eax
80482e5: mov %edx,%ecx
80482e7: sub %eax,%ecx
80482e9: mov %ecx,%eax
80482eb: cmp $0x3,%eax
80482ee: jne
f0: movzbl 0x80bd886,%eax
80482f7: cmp $0x64,%al
80482f9: jne
fb: call c
: jmp
: call
: mov $0x1,%eax
c: mov $0x0,%ebx
: int $0x
: mov %eax,0xfffffff4(%ebp)
: mov $0x0,%eax
b: add $0x10,%esp
e: pop %ecx
f: pop %ebx
: pop %ebp
: lea 0xfffffffc(%ecx),%esp
: ret
[Figure label: exploit]

Conceptual approach
o Run program, tracking variables as expressions instead of actual (concrete) values
o Collect expressions along the current path
o Find concrete input to satisfy these expressions
[Diagram components: Program, Symbolic Executor, Path Conditions, Solver, Generated Input]

Conceptual approach
o Exponential number of paths
o Limit and prioritize the paths we will explore
[Diagram components: Program, Symbolic Executor, Path Conditions, Path Selector, Solver, Generated Input]

Traditional symbolic execution
read_input()
if( input[2]-input[0] == 2 )
   if( input[2]-input[1] == 3 )
      if( input[2] == 'd' )
         backdoor()
login()   (reached when any test fails)
[Animation: successive frames step through the nodes of this flowchart]
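
The path enumeration above can be sketched in Python. This is a minimal sketch under stated assumptions, not the talk's implementation: the three tests are written as Python predicates in the order the disassembly performs them, a brute-force search over three lowercase letters stands in for a real solver, and the names TESTS, enumerate_paths, and solve are hypothetical.

```python
from itertools import product
from string import ascii_lowercase

# The three branch tests of the running example, in control-flow order.
TESTS = [
    ("input[2]-input[0] == 2", lambda s: ord(s[2]) - ord(s[0]) == 2),
    ("input[2]-input[1] == 3", lambda s: ord(s[2]) - ord(s[1]) == 3),
    ("input[2] == 'd'", lambda s: s[2] == "d"),
]

def enumerate_paths():
    """Fork at every branch and yield (path_condition, destination) pairs.
    A path condition is a list of (label, predicate, wanted_outcome)."""
    prefix = []
    for label, pred in TESTS:
        # Failing side of the branch: control goes to login().
        yield prefix + [(label, pred, False)], "login"
        # Passing side: extend the prefix and continue to the next test.
        prefix = prefix + [(label, pred, True)]
    yield prefix, "backdoor"

def solve(path_condition, length=3):
    """Brute-force stand-in for a solver (assumption: lowercase input only)."""
    for chars in product(ascii_lowercase, repeat=length):
        s = "".join(chars)
        if all(pred(s) == want for _, pred, want in path_condition):
            return s
    return None

for condition, destination in enumerate_paths():
    terms = [("" if want else "not ") + label for label, _, want in condition]
    print(destination, solve(condition), terms)
```

Because the branches in this example are nested, there are only four paths. With independent branches the number of paths doubles at each one, which is why the state a symbolic executor must maintain can grow exponentially.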

Problems with symbolic execution
o Must maintain exponentially many symbolic states
o Expressions may be difficult or infeasible to solve
Solution: Run program concretely and symbolically
Concrete Execution + Symbolic Execution = Concolic Execution

Concolic execution overview
o Symbolic execution follows concrete path
o Some expressions use concrete values
[Diagram components: Program, Input, Concrete Executor, Instructions, Symbolic Executor, Path Conditions, Path Selector, Solver, Generated Input]

Concolic execution
Advantages
o Track less state in parallel by following a single path at a time
o Simplify expressions by substituting concrete values for difficult subexpressions
Disadvantage
o Concrete values only hold for a specific set of concrete inputs, so mixing concrete values and expressions may produce inaccurate expressions

Concolic execution example
read_input()
if( input[2]-input[0] == 2 )
   if( input[2]-input[1] == 3 )
      if( input[2] == 'd' )
         backdoor()
login()   (reached when any test fails)
o Input good, concrete memory buffer g,o,o,d: the first test fails ('o'-'g' is 8); generated input egg
o Input egg, concrete memory buffer e,g,g: the first test passes and the second fails ('g'-'g' is 0); generated input port
o Input port, concrete memory buffer p,o,r,t: the first two tests pass and the third fails ('r' is not 'd'); generated input bad
o Input bad, concrete memory buffer b,a,d: all three tests pass and backdoor() is reached; Success
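
The iteration in this example can be sketched as a small loop in Python. This is a sketch under stated assumptions rather than the system described in the talk: the program is modeled as three Python predicates, a brute-force search over three lowercase letters stands in for the solver, and run_concretely, solve, and concolic_search are hypothetical names. The generated inputs differ from the ones on the slides because any input satisfying the path condition is acceptable.

```python
from itertools import product
from string import ascii_lowercase

# Branch tests of the running example, in control-flow order.
TESTS = [
    lambda s: ord(s[2]) - ord(s[0]) == 2,
    lambda s: ord(s[2]) - ord(s[1]) == 3,
    lambda s: s[2] == "d",
]

def run_concretely(s):
    """Concrete executor: run on s, recording the outcome of each branch."""
    trace = []
    for test in TESTS:
        outcome = test(s)
        trace.append(outcome)
        if not outcome:
            return trace, "login"
    return trace, "backdoor"

def solve(wanted, length=3):
    """Brute-force stand-in for a solver: find an input whose first
    len(wanted) branch outcomes match wanted."""
    for chars in product(ascii_lowercase, repeat=length):
        s = "".join(chars)
        if all(TESTS[i](s) == w for i, w in enumerate(wanted)):
            return s
    return None

def concolic_search(seed):
    """Follow one concrete path at a time, flipping its last branch."""
    tried = [seed]
    while True:
        trace, destination = run_concretely(tried[-1])
        if destination == "backdoor":
            return tried
        # Keep the path prefix and negate the final (failed) branch.
        next_input = solve(trace[:-1] + [True])
        if next_input is None:
            return tried
        tried.append(next_input)

print(concolic_search("good"))
```

Each round satisfies at least one more test than the previous one, so the loop ends after at most three generated inputs for this program.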

Inaccurate expressions
o Some variables depend on input
o Replacing these variables with concrete values may yield inaccurate expressions
o Solving an inaccurate path condition may produce input that does not take the desired path
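
A small Python sketch can make this failure mode concrete. Everything in it is an illustrative assumption, not material from the talk: h is a made-up function standing in for a subexpression that is too difficult to solve, the solver is a brute-force search over two byte-sized inputs, and pinning the input is one common remedy rather than the approach the slides prescribe.

```python
def h(x):
    """Hypothetical 'difficult' function the solver cannot reason about."""
    return (31 * x * x + 7) % 256

def takes_target_branch(x, y):
    """Program under test: the desired branch is taken when h(x) == y."""
    return h(x) == y

def solve(constraints):
    """Brute-force stand-in for a solver over two byte-sized inputs."""
    for x in range(256):
        for y in range(256):
            if all(c(x, y) for c in constraints):
                return x, y
    return None

x0 = 5          # concrete input x of the current run
c = h(x0)       # concrete value substituted for the expression h(x)

# Inaccurate: h(x) was replaced by its concrete value, but x may still change.
inaccurate = [lambda x, y: y == c]
# Accurate but less general: also pin x to the value that produced c.
pinned = [lambda x, y: x == x0, lambda x, y: y == c]

for name, constraints in [("inaccurate", inaccurate), ("pinned", pinned)]:
    x, y = solve(constraints)
    print(name, (x, y), "takes branch:", takes_target_branch(x, y))
```

The first path condition is satisfiable, yet the input it produces misses the branch because the concrete value of h(x) only holds for x = 5.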

Concolic execution system design
[Diagram components: Program, Input, Concrete Executor, Instructions, Symbolic Executor, Path Conditions, Path Selector, STP (Solver), Generated Input]
[Supporting components: SymEval (Symbolic Executor); Dyninst and ProcControl API (Concrete Executor)]

Concrete execution components
o Concrete Executor
   o Redirects program input
   o Reads actual values of instruction operands
   o Tracks path taken
o Dyninst
   o Assists with static analysis
o ProcControl API
   o Runs program using single-stepping or breakpoints


Symbolic execution components
o Symbolic Executor
   o Symbolic memory
   o Identify input
   o Update symbolic memory
   o Extract conditional predicates
o SymEval
   o Represents instruction semantics as ASTs
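
The symbolic executor's duties listed above can be sketched on the first test of the disassembly. This Python sketch is illustrative only and does not use or imitate the SymEval API: expressions are plain tuples, zero and sign extension are ignored, %al is treated as %eax, and the names FRAGMENT, execute, and show are hypothetical. The buffer address 0x80bd884 is taken from the disassembly slide.

```python
BUFFER = 0x80bd884       # input buffer address from the disassembly
ALIAS = {"al": "eax"}    # simplification: low byte treated as the register

# (opcode, source, destination) in AT&T operand order.
FRAGMENT = [
    ("movzbl", 0x80bd886, "eax"),
    ("movsbl", "al", "edx"),
    ("movzbl", 0x80bd884, "eax"),
    ("movsbl", "al", "eax"),
    ("mov", "edx", "ecx"),
    ("sub", "eax", "ecx"),
    ("mov", "ecx", "eax"),
    ("cmp", 0x2, "eax"),
    ("jne", None, None),
]

def execute(instructions):
    """Update a symbolic register file and extract the branch predicate."""
    regs, predicate = {}, None
    for op, src, dst in instructions:
        if op in ("mov", "movzbl", "movsbl"):
            if isinstance(src, int):
                # Identify input: a load from the buffer becomes input[i].
                regs[dst] = ("input", src - BUFFER)
            else:
                regs[dst] = regs[ALIAS.get(src, src)]
        elif op == "sub":
            regs[dst] = ("sub", regs[dst], regs[src])
        elif op == "cmp":
            predicate = ("eq", regs[dst], ("const", src))
        elif op == "jne":
            # Falling through the jne requires the comparison to be equal.
            return predicate
    return predicate

def show(expr):
    """Render an expression tuple as text."""
    kind = expr[0]
    if kind == "input":
        return f"input[{expr[1]}]"
    if kind == "const":
        return str(expr[1])
    symbol = {"sub": "-", "eq": "=="}[kind]
    return f"({show(expr[1])} {symbol} {show(expr[2])})"

print(show(execute(FRAGMENT)))
```

The extracted predicate matches the first test in the flowchart, input[2]-input[0] == 2.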


Path searching components
o STP (Solver)
   o Designed for program analysis applications
   o Handles bit-vector data types
o Path Conditions
   o One term for each branch taken
o Path Selector
   o Decides where to branch off from current path
   o Is a depth-first search for now
   o Other strategies will use static CFG analysis
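
One common way to realize a depth-first path selector is sketched below in Python. It is an assumption about how such a selector could work, not the talk's design: the current path is a list with one (outcome, flipped) pair per branch taken, and next_path is a hypothetical name. The result is the list of branch outcomes to hand to the solver for the next run.

```python
def next_path(stack):
    """Depth-first selection of where to branch off the current path.

    stack holds one (outcome, flipped) pair per branch taken; flipped is
    True when the other side of that branch was already explored.
    Returns the new path prefix, or None when nothing is left to explore.
    """
    stack = list(stack)
    # Backtrack past branches whose alternatives are exhausted.
    while stack and stack[-1][1]:
        stack.pop()
    if not stack:
        return None
    outcome, _ = stack.pop()
    # Negate the deepest unexplored branch and mark it as flipped.
    stack.append((not outcome, True))
    return stack

path = [(True, False), (False, False)]
while path is not None:
    print([outcome for outcome, _ in path])
    path = next_path(path)
```

A selector guided by static CFG analysis would replace the rule "deepest unexplored branch" with a choice informed by which branches can still reach the vulnerability.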

Previous Work in Binary Concolic Execution
o IDS signature generation [Song, et al. 2008]
   o Combined exploit strings to create signatures
   o Required an initial exploit, or a patch for the vulnerability
o Program testing [Godefroid, et al. 2008]
   o Created test cases with maximum code coverage in mind
   o Used instruction-level tracing for concrete execution

Potential Benefits of our Approach
o Our approach will be capable of finding the initial exploit
o We will do concrete execution with instrumentation, which gives us the flexibility to instrument selectively
o We plan to develop smarter path selection techniques using static control flow analysis

Status
o Concrete execution partially implemented using ProcControlAPI
   o Using standard input
   o Will support network and environment as inputs
o Symbolic execution and path selection not implemented yet
o Driving development of SymEval
   o Instruction semantics
   o AST simplification

Conclusion
Finding the first exploit with binary concolic execution using instrumentation
[Figure: disassembly fragments of the three tests (cmp $0x2,%eax; cmp $0x3,%eax; cmp $0x64,%al), with the sequence movzbl 0x80bd886,%eax / cmp $0x64,%al / jne / call labeled input[2] == 'd']