Analysis Of Stripped Binary Code Laune Harris University of Wisconsin – Madison


2 Binary code
856c: 55
856d: 89e5
856f: 83ec08
8572: e8ddffffff
857b: c9
857c: c3
857d: 55
857e: 89e5
8581: 83ec18
858b: e8bfffffff
8591: c9
8592: c3

3 Binary code (with assembly)
856c: 55          push %ebp
856d: 89e5        mov %esp, %ebp
856f: 83ec08      sub $8, %esp
8572: e8ddffffff  call 857d
857b: c9          leave
857c: c3          ret
857d: 55          push %ebp
857e: 89e5        mov %esp, %ebp
8581: 83ec18      sub $0x18, %esp
858b: e8bfffffff  call 866c
8591: c9          leave
8592: c3          ret

4 Binary code (with symbol info)
main:
856c: 55          push %ebp
856d: 89e5        mov %esp, %ebp
856f: 83ec08      sub $8, %esp
8572: e8ddffffff  call foo
857b: c9          leave
857c: c3          ret
foo:
857d: 55          push %ebp
857e: 89e5        mov %esp, %ebp
8581: 83ec18      sub $0x18, %esp
858b: e8bfffffff  call printf
8591: c9          leave
8592: c3          ret

5 A lot of code is stripped
- Commercial applications (usually)
- Proprietary libraries (often)
- Viruses
- OS libraries and utilities (depends on OS and OS version)

6 Steps in symbol reconstruction
- Find and name functions
- Find function size

7 Finding functions
- Build a call graph and traverse it to find function start addresses
- Opportunistic parsing: use existing symbol names and addresses where available
- Works on a spectrum of binaries, ranging from binaries with all symbols to fully stripped binaries

8 Call graph creation
main:
856c: push %ebp

9 Call graph creation
main:
856c: push %ebp
856d: mov %esp, %ebp
856f: sub $8, %esp
8572: call 857d
857b: leave
857c: ret

10 Call graph creation
main:
856c: push %ebp
856d: mov %esp, %ebp
856f: sub $8, %esp
8572: call func857d
857b: leave
857c: ret
func857d:
857d: push %ebp

11 Call graph creation
main:
856c: push %ebp
856d: mov %esp, %ebp
856f: sub $8, %esp
8572: call func857d
857b: leave
857c: ret
func857d:
857d: push %ebp
857e: mov %esp, %ebp
8581: sub $0x18, %esp
858b: call 865e
8591: call 866d
8596: leave
8597: ret
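
The traversal shown on slides 8 to 11 can be sketched as a worklist algorithm. This is a minimal illustration, not the parser described in the talk: the `INSTRUCTIONS` table, the tuple layout, and the function name `find_function_starts` are hypothetical, and the instruction stream is hand-written from the slide rather than decoded from bytes.

```python
from collections import deque

# Toy program copied from slide 11: address -> (mnemonic, call target, next address).
# Hypothetical layout; a real parser would decode these from the binary.
INSTRUCTIONS = {
    0x856C: ("push", None, 0x856D),
    0x856D: ("mov", None, 0x856F),
    0x856F: ("sub", None, 0x8572),
    0x8572: ("call", 0x857D, 0x857B),
    0x857B: ("leave", None, 0x857C),
    0x857C: ("ret", None, None),
    0x857D: ("push", None, 0x857E),
    0x857E: ("mov", None, 0x8581),
    0x8581: ("sub", None, 0x858B),
    0x858B: ("call", 0x865E, 0x8591),
    0x8591: ("call", 0x866D, 0x8596),
    0x8596: ("leave", None, 0x8597),
    0x8597: ("ret", None, None),
}


def find_function_starts(instructions, known_symbols):
    """Return {start address: name}, seeded with known symbols and
    extended by following direct calls."""
    functions = dict(known_symbols)
    worklist = deque(known_symbols)
    while worklist:
        addr = worklist.popleft()
        # Walk the function until a return or the end of the listing.
        while addr is not None and addr in instructions:
            mnemonic, target, next_addr = instructions[addr]
            if mnemonic == "call" and target not in functions:
                # Name unnamed callees after their address, as the slides do.
                functions[target] = f"func{target:x}"
                worklist.append(target)
            if mnemonic == "ret":
                break
            addr = next_addr
    return functions


found = find_function_starts(INSTRUCTIONS, {0x856C: "main"})
for addr, name in sorted(found.items()):
    print(f"{addr:x}: {name}")
```

Seeding the worklist with whatever symbols survive is what makes the parse opportunistic: a fully stripped binary would start from the entry point alone, while a binary with all symbols starts with every function already named.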

12 Parsing Functions
- Disassemble function's code by traversing intra-procedural control flow graph
- Highest address determines function size
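
A rough sketch of this step, under stated assumptions: instructions are given as a hypothetical pre-decoded table mapping address to `(length, kind, target)`, where `kind` is one of `seq`, `call`, `jcc`, `jmp`, or `ret`. The function follows fall-through and branch edges from the entry point and reports the highest byte reached as the function's extent.

```python
def parse_function(instructions, entry):
    """Return (visited addresses, size in bytes) for the function at entry.

    instructions: {address: (length, kind, target)} -- a hypothetical,
    pre-decoded representation used only for this illustration.
    """
    visited = set()
    worklist = [entry]
    end = entry
    while worklist:
        addr = worklist.pop()
        if addr in visited or addr not in instructions:
            continue
        visited.add(addr)
        length, kind, target = instructions[addr]
        # The highest address reached determines the function size.
        end = max(end, addr + length)
        if kind == "ret":
            continue  # exit point: no successors inside the function
        if kind in ("jmp", "jcc"):
            worklist.append(target)  # branch edge
        if kind != "jmp":
            worklist.append(addr + length)  # fall-through edge
    return visited, end - entry


# Toy function: a conditional branch skips one instruction, then returns.
TOY = {
    0x1000: (1, "seq", None),
    0x1001: (2, "seq", None),
    0x1003: (2, "jcc", 0x1007),
    0x1005: (2, "seq", None),
    0x1007: (1, "seq", None),
    0x1008: (1, "ret", None),
    0x1009: (1, "seq", None),  # first byte of the next function, never reached
}

visited, size = parse_function(TOY, 0x1000)
print(size)  # 9
```

Calls are treated here as ordinary fall-through instructions, since the callee belongs to a different function.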

13 Error Detection And Recovery
- CFG exit points are sometimes hard to identify
- Assume branches that are not obvious exits are intra-procedural
- Errors result in overestimation of function size
- Overlapping functions indicate error
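
Because a missed exit inflates a function's size, the overlap check can be sketched as a scan over functions sorted by start address. The names and sizes below are invented for illustration, and only neighbouring functions are compared, so a function that overruns several successors is reported once, against the first of them.

```python
def find_overlaps(functions):
    """functions: {name: (start, size)}. Return (name_a, name_b) pairs where
    function a extends past the start of the next function b."""
    ordered = sorted(functions.items(), key=lambda item: item[1][0])
    overlaps = []
    for (name_a, (start_a, size_a)), (name_b, (start_b, _)) in zip(ordered, ordered[1:]):
        if start_a + size_a > start_b:
            overlaps.append((name_a, name_b))
    return overlaps


# Hypothetical parse result: func857d was overestimated and runs into func8598.
parsed = {
    "main": (0x856C, 0x11),
    "func857d": (0x857D, 0x30),
    "func8598": (0x8598, 0x10),
}
print(find_overlaps(parsed))  # [('func857d', 'func8598')]
```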

14 Problems and Solutions
Functions that are only called indirectly
- Problem: static call graph traversal does not discover these functions
- Solution: examine gaps in text space and use heuristics to find functions
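
The slides do not say which heuristics are used. One common choice, assumed here purely for illustration, is to search each gap for the x86 prologue bytes `55 89 e5` (`push %ebp; mov %esp, %ebp`) that open the functions on the earlier slides. The helper names and the toy text section are hypothetical.

```python
PROLOGUE = bytes.fromhex("5589e5")  # push %ebp; mov %esp, %ebp (assumed heuristic)


def find_gaps(text_start, text_end, functions):
    """functions: list of (start, size). Return [(gap_start, gap_end)] for
    byte ranges of the text section not covered by any known function."""
    gaps = []
    cursor = text_start
    for start, size in sorted(functions):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, start + size)
    if cursor < text_end:
        gaps.append((cursor, text_end))
    return gaps


def scan_gaps_for_prologues(text, text_start, functions):
    """Return candidate function start addresses found inside gaps."""
    candidates = []
    for gap_start, gap_end in find_gaps(text_start, text_start + len(text), functions):
        offset = gap_start - text_start
        limit = gap_end - text_start
        while True:
            offset = text.find(PROLOGUE, offset, limit)
            if offset == -1:
                break
            candidates.append(text_start + offset)
            offset += len(PROLOGUE)
    return candidates


# Toy text section: known function, nop padding, hidden function, known function.
text = bytes.fromhex("5589e5c9c3" "909090" "5589e5c9c3" "5589e5c3")
known = [(0x1000, 5), (0x100D, 4)]
print([hex(a) for a in scan_gaps_for_prologues(text, 0x1000, known)])  # ['0x1008']
```

A byte-pattern match is only a candidate: the same bytes can occur inside data or in the middle of another instruction, so a hit would still need to be confirmed by parsing from that address.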

15 Problems and Solutions cont'd
Indirect jumps
- Problem: need to find targets to complete CFG
- Solution: parse jump tables to find possible targets
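
As a sketch of jump-table parsing, assume the table is an array of 32-bit little-endian absolute addresses, as is typical for a switch statement compiled for 32-bit x86, and that it ends at the first entry pointing outside the enclosing function. Both the table layout and the stopping rule are assumptions for this example, as is the function name.

```python
import struct


def read_jump_table(data, table_offset, func_start, func_end, max_entries=256):
    """Read consecutive 32-bit little-endian addresses starting at table_offset
    and return those that fall inside [func_start, func_end)."""
    targets = []
    for i in range(max_entries):
        pos = table_offset + 4 * i
        if pos + 4 > len(data):
            break
        (target,) = struct.unpack_from("<I", data, pos)
        # Assumed stopping rule: the table ends at the first entry that
        # does not point into the enclosing function.
        if not (func_start <= target < func_end):
            break
        targets.append(target)
    return targets


# Three plausible case targets followed by unrelated data.
table = struct.pack("<4I", 0x8600, 0x8610, 0x8620, 0xDEADBEEF)
print([hex(t) for t in read_jump_table(table, 0, 0x85F0, 0x8640)])
# ['0x8600', '0x8610', '0x8620']
```

Each recovered target becomes an additional branch edge for the control flow graph traversal.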

16 Problems and Solutions cont'd
Exception handling code
- Problem: creates code blocks that appear unreachable
- Solution: get block addresses from exception table

17 Test Programs
Columns: program, size (MB) unstripped, size (MB) stripped, number of functions
- paradyn ,676
- condor_starter ,168
- gimp ,329
- eon ,163
- om
- alara
- bubba

18 Evaluation
- Parse time (includes CFG creation): ~1.4x faster than the previous parser (with CFG); ~1.7x slower than the previous parser (without CFG)
- Stripped parse time: varies, 1.2x - 1.9x slower than unstripped
- Symbol recreation: 80% - 98% of original functions

19 Related Work
- Binary rewriters/instrumentation tools: eel, emil, etch, goblin, leel, plto
- Disassemblers (lots available): IDA Pro, objdump, dumpbin, etc.
- Symbol table reconstructors: dress, objdump-output-beautifier

20 Status
- Implemented on x86
- Ready for measurement and instrumentation
- Good start for security, but needs work

21 Future Work
- Develop more accurate heuristics to identify code in unlit areas of the binary
- Data flow analyses
- Port to other platforms
- Support unconventional function constructs
- Comprehensive comparison with other tools
- Evaluation on obfuscated code