Published by Mark Alan Reed. Modified over 9 years ago.
1
Binary Concolic Execution for Automatic Exploit Generation
Todd Frederick
Paradyn Project
Paradyn / Dyninst Week, Madison, Wisconsin, April 12-14, 2010
2
Vulnerabilities are everywhere… 2 Binary Concolic Execution
3
An exploit
[Figure labels: rtm (Robert Morris); exploit bytes DD8F2F736800DD8F2F6 2696ED05E5ADD00DD00 DD5ADD03D05E5CBC3B; Finger Server, 1987; shell#]
4
The problem: exploiting vulnerable code
o Find an exploit state in a program
  o Use a known existing vulnerability
  o Previous work automatically finds vulnerable states [Giffin, Jha, Miller 2006]
o Find input that drives the program down a path to the exploit state
  o Analyze program control flow
  o Walk through the program, finding inputs to reach the current point
  o Explore paths in the program to reach the vulnerability
5
The problem
Assume we know of a vulnerability
[Figure labels: normal input, Program, exploit]
6
Running example
[Figure: the Program prompts "login:". The input "good" is followed by "password:"; the input "bad" is followed by "Using backdoor!"]
7
Working with binary code
[Figure: the Program shown as a disassembly listing, with an "exploit" label]
  8048282: lea 0x4(%esp),%ecx
  8048286: and $0xfffffff0,%esp
  8048289: pushl 0xfffffffc(%ecx)
  804828c: push %ebp
  804828d: mov %esp,%ebp
  804828f: push %ebx
  8048290: push %ecx
  8048291: sub $0x10,%esp
  8048294: call 8048210
  8048299: mov $0x3,%eax
  804829e: mov $0x0,%ebx
  80482a3: mov $0x80bd884,%ecx
  80482a8: mov $0x10,%edx
  80482ad: int $0x80
  80482af: mov %eax,0xfffffff0(%ebp)
  80482b2: movzbl 0x80bd886,%eax
  80482b9: movsbl %al,%edx
  80482bc: movzbl 0x80bd884,%eax
  80482c3: movsbl %al,%eax
  80482c6: mov %edx,%ecx
  80482c8: sub %eax,%ecx
  80482ca: mov %ecx,%eax
  80482cc: cmp $0x2,%eax
  80482cf: jne 8048302
  80482d1: movzbl 0x80bd886,%eax
  80482d8: movsbl %al,%edx
  80482db: movzbl 0x80bd885,%eax
  80482e2: movsbl %al,%eax
  80482e5: mov %edx,%ecx
  80482e7: sub %eax,%ecx
  80482e9: mov %ecx,%eax
  80482eb: cmp $0x3,%eax
  80482ee: jne 8048302
  80482f0: movzbl 0x80bd886,%eax
  80482f7: cmp $0x64,%al
  80482f9: jne 8048302
  80482fb: call 804825c
  8048300: jmp 8048307
  8048302: call 8048236
  8048307: mov $0x1,%eax
  804830c: mov $0x0,%ebx
  8048311: int $0x80
  8048313: mov %eax,0xfffffff4(%ebp)
  8048316: mov $0x0,%eax
  804831b: add $0x10,%esp
  804831e: pop %ecx
  804831f: pop %ebx
  8048320: pop %ebp
  8048321: lea 0xfffffffc(%ecx),%esp
  8048324: ret
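Read as high-level logic, the listing appears to read up to 16 bytes from standard input into a buffer at 0x80bd884 and then test three conditions on the first three bytes. The Python sketch below is one reading of those checks, not the original source. The names `sign_extend8` and `reaches_exploit` are illustrative, and treating 0x804825c as the backdoor routine is an assumption based on the later slides.

```python
def sign_extend8(byte):
    """Mimic movzbl followed by movsbl: read a byte as a signed value."""
    byte &= 0xFF
    return byte - 0x100 if byte & 0x80 else byte


def reaches_exploit(buf):
    """Return True if the jne checks at 80482cf, 80482ee and 80482f9 all fall through."""
    data = buf.ljust(3, b"\x00")  # assumption: bytes never written are zero
    b0, b1, b2 = (sign_extend8(x) for x in data[:3])
    if b2 - b0 != 2:        # cmp $0x2,%eax ; jne 8048302
        return False
    if b2 - b1 != 3:        # cmp $0x3,%eax ; jne 8048302
        return False
    if data[2] != 0x64:     # cmp $0x64,%al ; jne 8048302 (0x64 is 'd')
        return False
    return True             # call 804825c


print(reaches_exploit(b"good"))  # False
print(reaches_exploit(b"bad"))   # True
```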
8
Conceptual approach
o Run program, tracking variables as expressions instead of actual (concrete) values
o Collect expressions along the current path
o Find concrete input to satisfy these expressions
[Figure labels: Program, Symbolic Execution, Generated Input]
9
Conceptual approach
o Run program, tracking variables as expressions instead of actual (concrete) values
o Collect expressions along the current path
o Find concrete input to satisfy these expressions
[Figure labels: Program, Symbolic Executor, Path Conditions, Solver, Generated Input]
10
Conceptual approach
o Exponential number of paths
o Limit and prioritize the paths we will explore
[Figure labels: Program, Symbolic Executor, Path Conditions, Path Selector, Solver, Generated Input]
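As a rough illustration of the executor-to-solver handoff, the sketch below writes the path conditions for one path of the running example as predicates over the input bytes and searches for a concrete input that satisfies all of them. The brute-force `solve` function is a toy stand-in for a real solver, and every name in it is hypothetical.

```python
from itertools import product
import string

# Path conditions for the path that reaches backdoor() in the running example.
# Each condition is a predicate over the symbolic input bytes.
PATH_CONDITIONS = [
    lambda inp: inp[2] - inp[0] == 2,
    lambda inp: inp[2] - inp[1] == 3,
    lambda inp: inp[2] == ord("d"),
]


def solve(conditions, length=3, alphabet=string.ascii_lowercase):
    """Toy solver: enumerate candidate inputs until all conditions hold."""
    for candidate in product(alphabet.encode(), repeat=length):
        if all(cond(candidate) for cond in conditions):
            return bytes(candidate)
    return None  # no input over this alphabet satisfies the conditions


print(solve(PATH_CONDITIONS))  # b'bad'
```

Enumeration only works here because the input is three lowercase letters; the exponential growth in paths and candidate inputs is what motivates the path selector and a real solver.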
11
Traditional symbolic execution
[Figure: control-flow graph with nodes read_input(), if( input[2]-input[0] == 2 ), if( input[2] == 'd' ), if( input[2]-input[1] == 3 ), backdoor(), login()]
12
Traditional symbolic execution
[Figure: control-flow graph frame with nodes if( input[2] == 'd' ), if( input[2]-input[1] == 3 ), backdoor(), login()]
14
Traditional symbolic execution
[Figure: control-flow graph frame with nodes read_input(), if( input[2]-input[0] == 2 ), if( input[2] == 'd' ), backdoor(), login()]
15
Traditional symbolic execution
[Figure: control-flow graph frame with nodes read_input(), if( input[2]-input[0] == 2 ), if( input[2]-input[1] == 3 ), backdoor(), login()]
16
Problems with symbolic execution
o Must maintain exponentially many symbolic states
o Expressions may be difficult or infeasible to solve
Solution: run the program concretely and symbolically
[Figure labels: Concrete Execution, Symbolic Execution, Concolic Execution]
17
Concolic execution overview
o Symbolic execution follows concrete path
o Some expressions use concrete values
[Figure labels: Input, Program, Concrete Executor, Instructions, Symbolic Executor, Path Conditions, Path Selector, Solver, Generated Input]
18
Concolic execution
Advantages
o Track less state in parallel by following a single path at a time
o Simplify expressions by substituting concrete values for difficult subexpressions
Disadvantage
o Concrete values only hold for a specific set of concrete inputs, so mixing concrete values and expressions may produce inaccurate expressions
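To make the second advantage concrete, the sketch below concretizes one hard subexpression. The function `opaque` is a made-up stand-in for something a solver handles poorly, and the branch condition is invented for illustration; neither comes from the slides. Because the solver keeps `inp[0]` fixed at its concrete value, the substituted number remains valid for the generated input.

```python
def opaque(x):
    """Hypothetical subexpression that is hard to solve symbolically."""
    return (x * 37 + 11) % 101


def branch_taken(inp):
    # Branch of interest: opaque(inp[0]) + inp[1] == 200
    return opaque(inp[0]) + inp[1] == 200


def concolic_solve(concrete_inp):
    """Replace opaque(inp[0]) by the value seen in the concrete run,
    then solve the remaining linear condition for inp[1]."""
    observed = opaque(concrete_inp[0])   # concrete value from the run
    needed = 200 - observed              # simplified condition: inp[1] == needed
    return bytes([concrete_inp[0], needed])


seed = b"A\x00"
new_input = concolic_solve(seed)
print(new_input, branch_taken(new_input))  # b'Ak' True
```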
19
Concolic execution example
Input: good
Concrete memory: buffer: (empty)
[Figure: control-flow graph with nodes read_input(), if( input[2]-input[0] == 2 ), if( input[2] == 'd' ), if( input[2]-input[1] == 3 ), backdoor(), login()]
20
Concolic execution example
Input: good
Concrete memory: buffer: g,o,o,d
[Figure: control-flow graph frame with nodes if( input[2] == 'd' ), if( input[2]-input[1] == 3 ), backdoor(), login()]
22
Concolic execution example
Input: good
Concrete memory: buffer: g,o,o,d
Generated input: egg
[Figure: control-flow graph frame with nodes read_input(), if( input[2]-input[0] == 2 ), if( input[2] == 'd' ), backdoor()]
23
Concolic execution example
Input: egg
Concrete memory: buffer: (empty)
[Figure: control-flow graph with nodes read_input(), if( input[2]-input[0] == 2 ), if( input[2] == 'd' ), if( input[2]-input[1] == 3 ), backdoor(), login()]
24
Concolic execution example
Input: egg
Concrete memory: buffer: e,g,g
[Figure: control-flow graph frame with nodes if( input[2] == 'd' ), if( input[2]-input[1] == 3 ), backdoor(), login()]
26
Concolic execution example
Input: egg
Concrete memory: buffer: e,g,g
[Figure: control-flow graph frame with nodes read_input(), if( input[2]-input[0] == 2 ), if( input[2] == 'd' ), backdoor(), login()]
27
Concolic execution example
Input: egg
Concrete memory: buffer: e,g,g
Generated input: port
[Figure: control-flow graph frame with nodes read_input(), if( input[2]-input[0] == 2 ), if( input[2]-input[1] == 3 ), backdoor()]
28
Concolic execution example
Input: port
Concrete memory: buffer: (empty)
[Figure: control-flow graph with nodes read_input(), if( input[2]-input[0] == 2 ), if( input[2] == 'd' ), if( input[2]-input[1] == 3 ), backdoor(), login()]
29
Concolic execution example
Input: port
Concrete memory: buffer: p,o,r,t
[Figure: control-flow graph frame with nodes if( input[2] == 'd' ), if( input[2]-input[1] == 3 ), backdoor(), login()]
31
Concolic execution example
Input: port
Concrete memory: buffer: p,o,r,t
[Figure: control-flow graph frame with nodes read_input(), if( input[2]-input[0] == 2 ), if( input[2] == 'd' ), backdoor(), login()]
32
Concolic execution example
Input: port
Concrete memory: buffer: p,o,r,t
[Figure: control-flow graph frame with nodes read_input(), if( input[2]-input[0] == 2 ), if( input[2]-input[1] == 3 ), backdoor(), login()]
33
Concolic execution example
Input: port
Concrete memory: buffer: p,o,r,t
Generated input: bad
[Figure: control-flow graph frame with nodes read_input(), if( input[2]-input[0] == 2 ), if( input[2] == 'd' ), if( input[2]-input[1] == 3 )]
34
Concolic execution example
Input: bad
Concrete memory: buffer: (empty)
[Figure: control-flow graph with nodes read_input(), if( input[2]-input[0] == 2 ), if( input[2] == 'd' ), if( input[2]-input[1] == 3 ), backdoor(), login()]
35
Concolic execution example
Input: bad
Concrete memory: buffer: b,a,d
[Figure: control-flow graph frame with nodes if( input[2] == 'd' ), if( input[2]-input[1] == 3 ), backdoor(), login()]
37
Concolic execution example
Input: bad
Concrete memory: buffer: b,a,d
[Figure: control-flow graph frame with nodes read_input(), if( input[2]-input[0] == 2 ), if( input[2] == 'd' ), backdoor(), login()]
38
Concolic execution example
Input: bad
Concrete memory: buffer: b,a,d
[Figure: control-flow graph frame with nodes read_input(), if( input[2]-input[0] == 2 ), if( input[2]-input[1] == 3 ), backdoor(), login()]
39
Concolic execution example
Input: bad
Concrete memory: buffer: b,a,d
Success
[Figure: control-flow graph frame with nodes read_input(), if( input[2]-input[0] == 2 ), if( input[2] == 'd' ), if( input[2]-input[1] == 3 ), login()]
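The walkthrough above can be sketched end to end in a few lines of Python. This is a simplified model under stated assumptions, not the tool described in the talk: the branches are tested in the order the disassembly uses, the toy solver enumerates lowercase strings, and the search only flips the last branch of each run. All names (`BRANCHES`, `run_concrete`, `solve`, `concolic_search`) are hypothetical. Because the toy solver picks different satisfying inputs than the one on the slides, the intermediate inputs differ from "egg" and "port".

```python
from itertools import product
import string

# Branch predicates of the running example, in the order the binary tests them.
BRANCHES = [
    lambda inp: inp[2] - inp[0] == 2,
    lambda inp: inp[2] - inp[1] == 3,
    lambda inp: inp[2] == ord("d"),
]


def run_concrete(inp):
    """Concrete execution: return (reached_backdoor, trace of (branch index, taken))."""
    trace = []
    for index, predicate in enumerate(BRANCHES):
        taken = predicate(inp)
        trace.append((index, taken))
        if not taken:
            return False, trace   # falls through to login()
    return True, trace            # reaches backdoor()


def solve(path_condition, length=3):
    """Toy solver: first lowercase string whose branches match the path condition."""
    for candidate in product(string.ascii_lowercase.encode(), repeat=length):
        if all(BRANCHES[i](candidate) == want for i, want in path_condition):
            return bytes(candidate)
    return None


def concolic_search(seed, max_runs=10):
    """Run, flip the last branch of the observed path, solve, and repeat."""
    inp, tried = seed, []
    for _ in range(max_runs):
        tried.append(inp)
        reached, trace = run_concrete(inp)
        if reached:
            return inp, tried
        last_index, last_taken = trace[-1]
        flipped = trace[:-1] + [(last_index, not last_taken)]
        inp = solve(flipped)
        if inp is None:
            break
    return None, tried


exploit, tried = concolic_search(b"good")
print(exploit, tried)  # b'bad' [b'good', b'aac', b'bad']
```

Each run contributes one more satisfied branch to the path condition, so the search needs one run per branch in the worst case for this program.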
40
Inaccurate expressions
o Some variables depend on input
o Replacing these variables with concrete values may yield inaccurate expressions
o Solving an inaccurate path condition may produce input that does not take the desired path
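A small sketch of how this can go wrong, using an invented branch condition and a made-up hard subexpression `opaque` (neither is from the slides). The value of `opaque(inp[0])` is frozen from the concrete run, but the solver then changes `inp[0]`, so the frozen value no longer holds and the generated input misses the branch.

```python
def opaque(x):
    """Hypothetical subexpression that is hard to solve symbolically."""
    return (x * 37 + 11) % 101


def branch_taken(inp):
    # Desired branch: opaque(inp[0]) == inp[0] - 60
    return opaque(inp[0]) == inp[0] - 60


def solve_with_concretization(concrete_inp):
    """Freeze opaque(inp[0]) at its observed value, then solve for inp[0]."""
    observed = opaque(concrete_inp[0])   # depends on inp[0], yet treated as a constant
    # Inaccurate condition given to the solver: observed == inp[0] - 60
    return bytes([observed + 60])


seed = b"A"
candidate = solve_with_concretization(seed)
print(candidate, branch_taken(candidate))  # b'\x99' False
```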
41
Concolic execution system design
[Figure labels: Input, Program, Concrete Executor, Instructions, Symbolic Executor, Path Conditions, Path Selector, Solver, Generated Input]
42
Concolic execution system design
[Figure labels: Input, Program, Concrete Executor (Dyninst, ProcControl API), Instructions, Symbolic Executor (SymEval), Path Conditions, Path Selector, STP (Solver), Generated Input]
43
Concrete execution components
[Figure labels: Concrete Executor, Dyninst, ProcControl API]
44
Concrete execution components
Concrete Executor
o Redirects program input
o Reads actual values of instruction operands
o Tracks path taken
Dyninst
o Assists with static analysis
ProcControl API
o Runs program using single-stepping or breakpoints
46
Symbolic execution components
[Figure labels: Symbolic Executor, SymEval]
47
Symbolic execution components
Symbolic Executor
o Symbolic memory
o Identify input
o Update symbolic memory
o Extract conditional predicates
SymEval
o Represents instruction semantics as ASTs
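To give a feel for what "instruction semantics as ASTs" might look like, here is a small Python sketch that builds expression trees for the instructions at 80482c6 through 80482cc and extracts the branch predicate. The node classes and the `evaluate` helper are hypothetical and do not reflect SymEval's actual interface; sign extension is ignored for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Input:
    index: int   # symbolic byte input[index]


@dataclass(frozen=True)
class Sub:
    left: object
    right: object


@dataclass(frozen=True)
class Eq:
    left: object
    right: object


def evaluate(node, inp):
    """Evaluate an expression tree against concrete input bytes."""
    if isinstance(node, Const):
        return node.value
    if isinstance(node, Input):
        return inp[node.index]
    if isinstance(node, Sub):
        return evaluate(node.left, inp) - evaluate(node.right, inp)
    if isinstance(node, Eq):
        return evaluate(node.left, inp) == evaluate(node.right, inp)
    raise TypeError(f"unknown node: {node!r}")


# Symbolic register state just before 80482c6.
state = {"edx": Input(2), "eax": Input(0)}
state["ecx"] = state["edx"]                      # mov %edx,%ecx
state["ecx"] = Sub(state["ecx"], state["eax"])   # sub %eax,%ecx
state["eax"] = state["ecx"]                      # mov %ecx,%eax
predicate = Eq(state["eax"], Const(2))           # cmp $0x2,%eax ; jne 8048302

print(predicate)
print(evaluate(predicate, b"bad"))   # True
```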
49
Path searching components
[Figure labels: STP (Solver), Path Conditions, Path Selector]
50
Path searching components
STP (Solver)
o Designed for program analysis applications
o Handles bit-vector data types
Path Conditions
o One term for each branch taken
Path Selector
o Decides where to branch off from current path
o Is a depth-first search for now
o Other strategies will use static CFG analysis
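One reason bit-vector support matters for binary code is that machine arithmetic wraps around at the register width. The sketch below, which uses a toy enumeration rather than STP and an invented condition, shows a byte constraint that has no solution over unbounded integers but exactly one under 8-bit wraparound.

```python
def solve_byte(predicate):
    """Toy 8-bit solver: return every byte value satisfying the predicate."""
    return [x for x in range(256) if predicate(x)]


# Over unbounded integers, x + 200 == 10 has no solution for a byte in 0..255.
print(solve_byte(lambda x: x + 200 == 10))            # []

# With 8-bit wraparound, as for an add on %al, there is exactly one.
print(solve_byte(lambda x: (x + 200) & 0xFF == 10))   # [66]
```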
51
Previous Work in Binary Concolic Execution
IDS signature generation [Song, et al. 2008]
o Combined exploit strings to create signatures
o Required an initial exploit, or a patch for the vulnerability
Program testing [Godefroid, et al. 2008]
o Created test cases with maximum code coverage in mind
o Used instruction-level tracing for concrete execution
52
Potential Benefits of our Approach
o Our approach will be capable of finding the initial exploit
o We will do concrete execution with instrumentation, which gives us the flexibility to instrument selectively
o We plan to develop smarter path selection techniques using static control flow analysis
53
Status
o Concrete execution partially implemented using ProcControlAPI
  o Using standard input
  o Will support network and environment as inputs
o Symbolic execution and path selection not implemented yet
o Driving development of SymEval
  o Instruction semantics
  o AST simplification
54
Conclusion
Finding the first exploit with binary concolic execution using instrumentation
[Figure: instruction fragments for the three branches of the running example; the last is labeled input[2] == 'd']
  mov %edx,%ecx
  sub %eax,%ecx
  mov %ecx,%eax
  cmp $0x2,%eax
  jne 8048302

  mov %edx,%ecx
  sub %eax,%ecx
  mov %ecx,%eax
  cmp $0x3,%eax
  jne 8048302

  movzbl 0x80bd886,%eax
  cmp $0x64,%al
  jne 8048302
  call 804825c