PASTE 2011 Szeged, Hungary September 5, 2011 Labeling Library Functions in Stripped Binaries Emily R. Jacobson, Nathan Rosenblum, and Barton P. Miller.

Presentation transcript:

PASTE 2011 Szeged, Hungary September 5, 2011 Labeling Library Functions in Stripped Binaries Emily R. Jacobson, Nathan Rosenblum, and Barton P. Miller Computer Sciences Department University of Wisconsin - Madison

Why Binary Code?
o Source code isn’t available
o Source code isn’t the right representation

Binary Tools Need Symbol Tables
o Debugging Tools
  o GDB, IDA Pro…
o Instrumentation Tools
  o PIN, Dyninst,…
o Static Analysis Tools
  o CodeSurfer/x86,…
o Security Analysis Tools
  o IDA Pro,…

Restoring Information
Function locations
Complicated by:
o Missing symbol information
o Variability in function layout (e.g. code sharing, outlined basic blocks)
o High degree of indirect control flow
(Figure: program binary with functions labeled targ80c3bd0 and targ80c3df4)

Restoring Information
What about semantic information?
o Program’s interaction with the operating system (system calls) encapsulated by wrapper functions
Library fingerprinting: identify functions based on patterns learned from exemplar libraries
(Figure: program binary with functions labeled targ80c3bd0 and targ80c3df4)

unstrip
stripped binary parsing + library fingerprinting + binary rewriting
(Figure: the labels targ80c3bd0 and targ80c3df4 in the stripped binary are replaced by recovered names such as getpid and accept)

    mov %ebx,%edx            Save registers
    mov $0x66,%eax           Set up system call arguments
    mov $0x5,%ebx
    lea 0x4(%esp),%ecx
    int $0x80                Invoke a system call
    mov %edx,%ebx            Error check and return
    cmp $0xffffff83,%eax
    jae syscall_error
    ret

The same function can be realized in a variety of ways in the binary

Variant 1:
    mov %ebx,%edx
    mov $0x66,%eax
    mov $0x5,%ebx
    lea 0x4(%esp),%ecx
    int $0x80
    mov %edx,%ebx
    cmp $0xffffff83,%eax
    jae syscall_error
    ret

Variant 2 (kernel entered through an indirect call):
    cmpl $0x0,%gs:0xc
    jne 80f669c
    mov %ebx,%edx
    mov $0x66,%eax
    mov $0x5,%ebx
    lea 0x4(%esp),%ecx
    call *0x814e93c
    mov %edx,%ebx
    cmp $0xffffff83,%eax
    jae syscall_error
    ret
    push %esi
    call enable_asyncancel
    mov %eax,%esi
    mov %ebx,%edx
    mov $0x66,%eax
    mov $0x5,%ebx
    lea 0x8(%esp),%ecx
    call *0x
    mov %edx,%ebx
    xchg %eax,%esi
    call disable_asyncancel
    mov %esi,%eax
    pop %esi
    cmp $0xffffff83,%eax
    jae syscall_error
    ret

Variant 3 (same layout, kernel entered through int $0x80):
    cmpl $0x0,%gs:0xc
    jne 80f669c
    mov %ebx,%edx
    mov $0x66,%eax
    mov $0x5,%ebx
    lea 0x4(%esp),%ecx
    int $0x80
    mov %edx,%ebx
    cmp $0xffffff83,%eax
    jae syscall_error
    ret
    push %esi
    call enable_asyncancel
    mov %eax,%esi
    mov %ebx,%edx
    mov $0x66,%eax
    mov $0x5,%ebx
    lea 0x8(%esp),%ecx
    int $0x80
    mov %edx,%ebx
    xchg %eax,%esi
    call disable_asyncancel
    mov %esi,%eax
    pop %esi
    cmp $0xffffff83,%eax
    jae syscall_error
    ret

Build captions on the slide: glibc 2.5 on RHEL with GCC; glibc 2.5 on RHEL with GCC; glibc on RHEL with GCC

Binary-level Code Variations
o Function inlining
o Code reordering
o Minor code changes
o Alternative code sequences

Semantic Descriptors
o Rather than recording byte patterns, we take a semantic approach
o Record information that is likely to be invariant across multiple versions of the function

    mov %ebx,%edx
    mov $0x66,%eax
    mov $0x5,%ebx
    lea 0x4(%esp),%ecx
    int $0x80
    mov %edx,%ebx
    cmp $0xffffff83,%eax
    jae
    ret
    mov %esi,%esi

Highlighted instructions: int $0x80; mov $0x66,%eax; mov $0x5,%ebx
Resulting descriptor: { }, 5
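As a rough illustration of why a semantic descriptor is more stable than a byte pattern, the sketch below models a descriptor as a set of (system call number, constant argument) pairs. This representation and the names `make_descriptor`, `plain_variant`, and `cancellable_variant` are assumptions made for the example; the slides do not spell out the exact contents of a descriptor.

```python
from typing import FrozenSet, Optional, Tuple

# Hypothetical representation: one entry per system call a wrapper makes,
# holding the call number (EAX) and a constant argument (EBX) if one is known.
SyscallFact = Tuple[int, Optional[int]]
Descriptor = FrozenSet[SyscallFact]


def make_descriptor(*facts: SyscallFact) -> Descriptor:
    """Collect facts into an order-independent, duplicate-free descriptor."""
    return frozenset(facts)


# The short variant reaches the kernel once with EAX=0x66 and EBX=0x5.
plain_variant = make_descriptor((0x66, 0x5))

# The longer variant has two code paths, each making the same call, so the
# same fact is observed twice; the set keeps a single copy.
cancellable_variant = make_descriptor((0x66, 0x5), (0x66, 0x5))

print(plain_variant == cancellable_variant)
```

Under these assumptions the two variants compare equal even though their instruction sequences differ, which is the property the slide asks of a descriptor.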

Building Semantic Descriptors
We parse an input binary, locate system calls and wrapper function calls, and employ dataflow analysis.

    reboot:
    push %ebp
    mov %esp,%ebp
    sub $0x10,%esp
    push %edi
    push %ebx
    mov 0x8(%ebp),%edx
    mov $0xfee1dead,%edi
    mov $0x ,%ecx
    push %ebx
    mov %edi,%ebx
    mov $0x58,%eax
    int $0x80
    …

Dataflow result at the system call: EAX = 0x58 (reboot); EBX = %edi = 0xfee1dead; ECX = 0x
Resulting descriptor: { }
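The dataflow step can be pictured with a much simpler stand-in: a backward scan over straight-line code that follows register-to-register copies until it reaches a constant. This is only a sketch under that simplification, not the analysis the tool uses; the tuple encoding of instructions and the name `resolve` are hypothetical, and the truncated ECX load from the listing is left out.

```python
from typing import List, Optional, Tuple

# (opcode, source, destination) in AT&T operand order; a made-up encoding.
Instr = Tuple[str, str, str]


def resolve(instrs: List[Instr], index: int, reg: str) -> Optional[int]:
    """Walk backwards from instrs[index] to find the constant held in reg.

    Handles straight-line code only and gives up (returns None) when the
    value does not come from an immediate or a register copy.
    """
    for i in range(index - 1, -1, -1):
        op, src, dst = instrs[i]
        if dst != reg:
            continue
        if op != "mov":
            return None  # computed value, not a plain copy
        if src.startswith("$"):
            return int(src[1:], 16)  # immediate operand
        if src.startswith("%"):
            return resolve(instrs, i, src)  # follow the register copy
        return None  # loaded from memory; unknown in this sketch
    return None


# The moves from the reboot listing that feed the system call.
reboot: List[Instr] = [
    ("mov", "0x8(%ebp)", "%edx"),
    ("mov", "$0xfee1dead", "%edi"),
    ("mov", "%edi", "%ebx"),
    ("mov", "$0x58", "%eax"),
    ("int", "$0x80", ""),
]

syscall_index = 4
print(hex(resolve(reboot, syscall_index, "%eax")))
print(hex(resolve(reboot, syscall_index, "%ebx")))
print(resolve(reboot, syscall_index, "%edx"))
```

The EBX lookup has to pass through `%edi` to reach the constant, which mirrors the register chain shown on the slide.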

Building Semantic Descriptors Recursively

    sethostid:
    …
    call open
    …
    call write
    …
    mov $0x6,%eax
    int $0x80
    …

    open:
    …
    mov $0x5,%eax
    int $0x80
    …

    write:
    …
    mov $0x4,%eax
    int $0x80
    …

open and write each have a one-element descriptor. The descriptor for sethostid has three elements: its own system call plus the information contributed by the calls to open and write.
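One way to read the recursive construction is as a union over the call graph: a function's descriptor is its own system calls plus whatever its callees contribute. The sketch below flattens callee descriptors into the caller's set, which is an assumption; the slide may instead record the called wrapper itself as an element. The tables `DIRECT_SYSCALLS` and `CALLS` are toy data taken from the three listings.

```python
from typing import Dict, FrozenSet, List, Optional, Set

# System call numbers loaded into EAX directly inside each function.
DIRECT_SYSCALLS: Dict[str, List[int]] = {
    "sethostid": [0x6],
    "open": [0x5],
    "write": [0x4],
}

# Wrapper functions each function calls.
CALLS: Dict[str, List[str]] = {
    "sethostid": ["open", "write"],
    "open": [],
    "write": [],
}


def descriptor(func: str, seen: Optional[Set[str]] = None) -> FrozenSet[int]:
    """Union of a function's own system calls and those of its callees."""
    seen = set() if seen is None else seen
    if func in seen:
        return frozenset()  # already visited; guards against call cycles
    seen.add(func)
    facts = set(DIRECT_SYSCALLS.get(func, []))
    for callee in CALLS.get(func, []):
        facts |= descriptor(callee, seen)
    return frozenset(facts)


print(sorted(hex(n) for n in descriptor("sethostid")))
```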

Building a Descriptor Database
glibc reference library -> unstrip -> Descriptor Database

unstrip steps:
1. Locate wrapper functions

    mov %ebx,%edx
    mov $0x66,%eax
    mov $0x5,%ebx
    lea 0x4(%esp),%ecx
    int $0x80
    …

2. Build semantic descriptors

Descriptor Database entries:
{ }: accept
{ }: listen
{ }: getpid
…

Building a Descriptor Database (multiple reference libraries)
The same two steps, locating wrapper functions and building semantic descriptors, are applied to each of several glibc reference libraries. Every library contributes its own entries ({ }: accept, { }: listen, { }: getpid, …) to a single Descriptor Database.
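A descriptor database built from several reference libraries can be sketched as a mapping from descriptor to the set of function names seen with it. The function name `build_database`, the dictionary layout, and the descriptor values below are illustrative assumptions, not the tool's storage format.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, Optional, Set, Tuple

Descriptor = FrozenSet[Tuple[int, Optional[int]]]
Library = Dict[str, Descriptor]  # function name -> descriptor (hypothetical)


def build_database(libraries: Iterable[Library]) -> Dict[Descriptor, Set[str]]:
    """Merge the (name, descriptor) pairs of every reference library."""
    database: Dict[Descriptor, Set[str]] = defaultdict(set)
    for library in libraries:
        for name, desc in library.items():
            database[desc].add(name)
    return dict(database)


# Two made-up reference libraries that agree on their descriptors.
lib_a: Library = {
    "accept": frozenset({(0x66, 0x5)}),
    "listen": frozenset({(0x66, 0x4)}),
    "getpid": frozenset({(0x14, None)}),
}
lib_b: Library = dict(lib_a)

database = build_database([lib_a, lib_b])
for desc, names in database.items():
    print(sorted(names), sorted(desc, key=str))
```

Keying on the descriptor means identical entries from different libraries collapse into one record, while functions that share a descriptor end up in the same name set.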

Pattern Matching Criteria
o Two stages
  1) Exact matches
  2) Best match based on coverage criterion
o Handle minor code variations by allowing flexible matches

Pattern Matching Criteria (continued)
(Figure: two sets, A and B, illustrate comparing a fingerprint from the database with a semantic descriptor from the code)
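The two-stage process might look like the sketch below. The slides do not define the coverage criterion, so this example substitutes set overlap (intersection size divided by union size) with a cutoff of 0.5; both the measure and the threshold are assumptions, as are the names `match` and `min_coverage`.

```python
from typing import Dict, FrozenSet, Set

Descriptor = FrozenSet[int]


def match(desc: Descriptor,
          database: Dict[Descriptor, Set[str]],
          min_coverage: float = 0.5) -> Set[str]:
    """Return candidate names for desc, or an empty set if nothing fits."""
    # Stage 1: exact match against a stored fingerprint.
    if desc in database:
        return set(database[desc])

    # Stage 2: best match under the assumed overlap-based coverage score.
    best_score = 0.0
    best_names: Set[str] = set()
    for fingerprint, names in database.items():
        union = desc | fingerprint
        score = len(desc & fingerprint) / len(union) if union else 0.0
        if score > best_score:
            best_score, best_names = score, set(names)
        elif score == best_score and score > 0:
            best_names |= names  # tie: keep every equally good candidate
    return best_names if best_score >= min_coverage else set()


database = {
    frozenset({0x4, 0x5, 0x6}): {"sethostid"},
    frozenset({0x5}): {"open"},
}

print(match(frozenset({0x5}), database))
print(match(frozenset({0x4, 0x5, 0x6, 0x5B}), database))
print(match(frozenset({0x14}), database))
```

The second query has one extra element compared with the stored fingerprint, so it fails the exact stage but still scores 0.75 in the flexible stage, which is how minor code variations are tolerated in this sketch.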

Multiple Matches
o It’s possible that two or more functions are indistinguishable
o Policy decision: return set of potential matches
o In practice, we’ve observed 8% of functions have multiple matches, but the size of the match set is small (≤ 3)

Identifying Functions in a Stripped Binary
stripped binary + Descriptor Database -> unstrip -> unstripped binary

For each wrapper function {
  1. Build the semantic descriptor.
  2. Search the database for a match (apply two-stage matching process).
  3. Add label to symbol table.
}
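The per-function loop can be sketched as follows. To stay short, the example performs only an exact database lookup and keeps a `targ<address>` label when nothing matches; the function name `label_functions`, the addresses, and the descriptor values are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Descriptor = FrozenSet[Tuple[int, Optional[int]]]


def label_functions(wrappers: Dict[int, Descriptor],
                    database: Dict[Descriptor, Set[str]]) -> Dict[int, List[str]]:
    """Map each wrapper address to its candidate names (exact lookup only)."""
    symbols: Dict[int, List[str]] = {}
    for address, desc in sorted(wrappers.items()):
        names = database.get(desc, set())
        # Keep a placeholder label when the database has no entry.
        symbols[address] = sorted(names) if names else [f"targ{address:x}"]
    return symbols


database = {
    frozenset({(0x14, None)}): {"getpid"},
    frozenset({(0x66, 0x5)}): {"accept"},
}

# Descriptors assumed to have been built for three wrapper functions.
wrappers = {
    0x80C3BD0: frozenset({(0x14, None)}),
    0x80C3DF4: frozenset({(0x66, 0x5)}),
    0x80C4000: frozenset({(0x99, None)}),
}

symbols = label_functions(wrappers, database)
for address, names in symbols.items():
    print(hex(address), names)
```

Returning a list of names per address leaves room for the case where several library functions share one descriptor.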

Implementation
stripped binary parsing + library fingerprinting + binary rewriting

Evaluation
o To evaluate across three dimensions of variation, we constructed three data sets:
  o GCC version
  o glibc version
  o distribution vendor
o In each set, compile statically-linked binaries, build a DDB, compare unstrip to IDA Pro’s FLIRT
o Evaluation measure is accuracy
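The slides name accuracy as the measure without defining it. One plausible reading, used in the sketch below as an assumption, is the fraction of wrapper functions whose true name appears in the set of names the tool returned. The function name `accuracy` and the sample data are made up for illustration and are not results from the study.

```python
from typing import Dict, Set


def accuracy(predicted: Dict[int, Set[str]], truth: Dict[int, str]) -> float:
    """Fraction of functions whose true name is among the predicted names."""
    if not truth:
        return 0.0
    correct = sum(
        1 for address, name in truth.items()
        if name in predicted.get(address, set())
    )
    return correct / len(truth)


# Toy ground truth (from an unstripped build) and toy predictions.
truth = {0x1000: "accept", 0x2000: "listen", 0x3000: "getpid", 0x4000: "reboot"}
predicted = {
    0x1000: {"accept"},
    0x2000: {"listen", "accept"},  # multiple matches, true name included
    0x3000: {"getuid"},            # wrong label
    # 0x4000 received no label at all
}

print(accuracy(predicted, truth))
```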

Evaluation Results: GCC Version Study

Evaluation Results: glibc Version Study

Evaluation Results: Distribution Study

unstrip is available at

Backup slides follow

Evaluation Results: GCC Version Study (Temporal: backwards)

Evaluation Results: glibc Version Study (Temporal: backwards)

Evaluation Results: Distribution Study (one predicts the rest)

Evaluation Results: GCC Version Study (one predicts the rest)

Evaluation Results: glibc Version Study (one predicts the rest)

Evaluation Results: Distribution Study (one predicts the rest)