© 2006 Barton P. Miller, February 2006. Binary Code Analysis and Editing: A Framework for Binary Code Analysis, and Static and Dynamic Patching. Barton P. Miller.


A Framework for Binary Code Analysis, and Static and Dynamic Patching
Barton P. Miller, University of Wisconsin
Jeffrey Hollingsworth, University of Maryland
February 2006. © 2006 Barton P. Miller. Binary Code Analysis and Editing.

– 2 – Motivation
Multi-platform; Open architecture; Extensible; Open source; Testable; Suitable for batch processing; Accurate; Efficient
• Binary code analysis is a basic tool of security analysts, application developers, system designers and tool developers.
• We are designing and building a new foundation to support such analysis.
• Existing binary analysis tools have significant limitations.

– 3 – Why Binary Code?
• Access to the source code often is not possible:
  – Proprietary software packages.
  – Stripped executables.
  – Proprietary libraries: communication (MPI, PVM), linear algebra (NGA), database query (SQL libraries).
• Binary code is the only authoritative version of the program.
  – Changes occurring in the compile, optimize and link steps can create non-trivial semantic differences between the source and the binary.
• Worms and viruses are rarely provided with source code.

– 4 – Binary Analysis and Editing
• Analysis: processing of the binary code to extract syntactic and symbolic information.
  – Symbol tables (if present)
  – Decode (disassemble) instructions
  – Control-flow information: basic blocks, loops, functions
  – Data-flow information: from basic register information to highly sophisticated (and expensive) analyses.
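The control-flow step can be illustrated with a minimal sketch of basic-block discovery. It assumes a toy, already-decoded instruction list; the tuple layout and the opcode names (`br`, `jmp`, `ret`) are hypothetical and are not a real instruction set or the Dyninst API. The sketch uses the standard "leader" rule: the first instruction, every branch target, and every instruction that follows a branch start a new block.

```python
# Toy instruction: (address, opcode, branch_target_or_None).
# The opcodes are hypothetical; a real tool would decode machine code first.
BRANCHES = {"jmp", "br", "ret"}  # instructions that end a basic block


def find_basic_blocks(instructions):
    """Split a linear instruction list into basic blocks using leaders."""
    if not instructions:
        return []
    addresses = [addr for addr, _, _ in instructions]
    leaders = {addresses[0]}  # the entry instruction is always a leader
    for index, (_, opcode, target) in enumerate(instructions):
        if opcode in BRANCHES:
            if target is not None:
                leaders.add(target)  # a branch target starts a block
            if index + 1 < len(instructions):
                leaders.add(addresses[index + 1])  # so does the next instruction
    blocks, current = [], []
    for instruction in instructions:
        if instruction[0] in leaders and current:
            blocks.append(current)
            current = []
        current.append(instruction)
    blocks.append(current)
    return blocks


program = [
    (0, "load", None),
    (1, "br", 4),      # conditional branch to address 4
    (2, "add", None),
    (3, "jmp", 5),
    (4, "sub", None),
    (5, "ret", None),
]
blocks = find_basic_blocks(program)
block_starts = [block[0][0] for block in blocks]
```

Real binaries complicate this picture (indirect jumps, data mixed with code, missing symbols), which is part of why the slides describe analysis as ranging from basic to highly sophisticated.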

– 5 – Binary Analysis and Editing
• Binary rewriting: static (before execution) modification of a binary program:
  – Analyze the program and then insert, remove, or change the binary code, producing a new binary.
• Dynamic instrumentation: dynamic (during execution) modification of a binary program:
  – Analyze the code of the running program and then insert, remove, or change the binary code, changing the execution of the program.
  – Can operate on running programs and servers.

– 6 – Uses of Binary Analysis and Editing
• Cyber-forensics
  – Analysis: understand the nature of malicious code.
  – Binary rewriting: produce a new version of the code that might be instrumented, sandboxed, or modified for study.
  – Dynamic instrumentation: same features, but can do it interactively on an executing program.
  – Hybrid static/dynamic: control execution and produce intermediate versions of the binary that can be re-executed (and further instrumented).
• Program tracing: instructions, memory accesses, function calls, system calls, ...
• Debugging
• Testing
• Performance profiling
• Performance modeling
• Reverse engineering

– 7 – Our Starting Point: Dyninst
• A machine-independent library for machine-level code patching.
  – Functions for binary code analysis
  – Functions for binary code patching
• Clean abstractions to encapsulate the tool complexity.
• Originally designed as part of the Paradyn performance profiling tool, but now widely used in many areas, including cyber-security.

– 8 – Dynamic Instrumentation
• Does not require recompiling or relinking
  – Saves time: compile and link times are significant in real systems.
  – Can instrument without the source code (e.g., proprietary libraries).
  – Can instrument without linking (relinking is not always possible).
• Instrument optimized code.

– 9 – Dynamic Instrumentation (cont'd)
• Only instrument what you need, when you need it
  – No hidden cost of latent instrumentation.
  – Enables "one pass" tools.
• Can instrument running programs (such as Web or database servers)
  – Production systems.
  – Embedded systems.
  – Systems with complex start-up procedures.

– 10 – The Basic Mechanism
[Figure labels: Application Program; Function foo; Trampoline; Instrumentation; Relocated Instruction(s)]
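The labels on this slide (function foo, trampoline, instrumentation, relocated instruction) suggest the usual trampoline scheme, which can be sketched with a toy interpreter. Everything here is an assumption made for illustration: the instruction tuples, the `instrument` and `run` helpers, and the address 100 are hypothetical, and real instrumentation must also deal with instruction sizes and position-dependent instructions.

```python
def run(memory, pc=0, max_steps=1000):
    """Interpret toy instructions and return the names of the executed ops."""
    trace = []
    for _ in range(max_steps):
        instruction = memory[pc]
        kind = instruction[0]
        if kind == "halt":
            return trace
        if kind == "jump":
            pc = instruction[1]
        else:  # ("op", name): record it and fall through
            trace.append(instruction[1])
            pc += 1
    raise RuntimeError("step limit exceeded")


def instrument(memory, point, snippet, free_addr):
    """Overwrite `point` with a jump to a trampoline placed at `free_addr`.

    The trampoline holds the instrumentation snippet, then the relocated
    (displaced) instruction, then a jump back to the instruction after `point`.
    """
    displaced = memory[point]
    trampoline = list(snippet) + [displaced, ("jump", point + 1)]
    for offset, instruction in enumerate(trampoline):
        memory[free_addr + offset] = instruction
    memory[point] = ("jump", free_addr)


# Toy "function foo" living at addresses 0..2.
memory = {0: ("op", "foo_entry"), 1: ("op", "foo_body"), 2: ("halt",)}
before = run(dict(memory))

instrument(memory, point=0, snippet=[("op", "count_call")], free_addr=100)
after = run(memory)
```

In this model the original behaviour is preserved (`foo_entry` and `foo_body` still run, in order) and the inserted operation runs first; no recompilation or relinking of the toy program is involved.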

– 11 – The DynInst Interface
• Machine independent representation
• Write-once, analyze/instrument-many (portable)
• Object-based interface to insert new code: Abstract Syntax Trees (ASTs)
• Hides most of the complexity in the API
  – Easy to build tools: e.g., an MPI tracer: 250 lines of C++ code.

– 12 – Abstract Syntax Trees: Machine Independent Code
IA32 Code:
    incl ctr
SPARC Code:
    sethi %hi(ctr)
    ld [...],%o1
    add %o1,%o1,1
    st %o1,[...]
Power Code:
    cau r3,r0,hi%ctr
    l r4,lo%ctr(r3)
    addi r4,1(r4)
    st r4,lo%ctr(r3)
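A minimal sketch of the idea on this slide: one machine-independent tree node, with per-architecture code generators that turn it into text. The `Increment` class and the generator functions are hypothetical names, not the Dyninst interface. The emitted strings simply copy the IA32 and Power listings from the slide; SPARC is left out because the slide's listing elides its operands.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Increment:
    """Machine-independent node meaning `variable = variable + 1`."""
    variable: str


def gen_ia32(node):
    # Single instruction, as in the slide's IA32 listing.
    return [f"incl {node.variable}"]


def gen_power(node):
    # Four-instruction sequence, copied from the slide's Power listing.
    name = node.variable
    return [
        f"cau r3,r0,hi%{name}",
        f"l r4,lo%{name}(r3)",
        "addi r4,1(r4)",
        f"st r4,lo%{name}(r3)",
    ]


GENERATORS = {"ia32": gen_ia32, "power": gen_power}


def generate(node, target):
    """Pick the code generator for `target` and emit assembly text."""
    return GENERATORS[target](node)


tree = Increment("ctr")
ia32_code = generate(tree, "ia32")
power_code = generate(tree, "power")
```

The tool author writes the tree once; only the generator table knows about individual architectures.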

– 13 – Basic DynInst Operations
• Code query routines:
  – Find control-flow elements: modules, procedures, loops, basic blocks, instructions
    – For functions, find entry, exit, call sites.
    – For loops, find entry, exit, body.
  – Find data elements: variables and parameters
  – Call graph (parent/child) queries
  – Intra-procedural control-flow graph
  – Other symbol table information, e.g., line numbers.
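As an illustration of the parent/child call graph queries listed above, here is a small sketch over a hand-written graph. The graph contents and the helper names (`children`, `parents`, `reachable`) are hypothetical; an actual tool would build the graph by parsing the binary.

```python
from collections import deque

# Hypothetical call graph: caller -> list of callees.
CALLS = {
    "main": ["parse", "run"],
    "parse": ["read"],
    "run": ["read", "log"],
    "read": [],
    "log": [],
}


def children(graph, function):
    """Functions called directly by `function`."""
    return sorted(graph.get(function, []))


def parents(graph, function):
    """Functions that call `function` directly."""
    return sorted(caller for caller, callees in graph.items()
                  if function in callees)


def reachable(graph, root):
    """All functions reachable from `root`, including `root` itself."""
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for callee in graph.get(current, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


read_callers = parents(CALLS, "read")
main_callees = children(CALLS, "main")
from_run = reachable(CALLS, "run")
```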

– 14 – Basic DynInst Operations
• Code modification routines:
  – Remove Function Call
    – Disable an existing function call in the application.
  – Replace Function Call
    – Redirect a function call to a new function.
  – Replace Function
    – Redirect all calls (current and future) to a function to a new function.
  – Wrap Function
    – Allow the new function to call the replaced one (potentially with all its original parameters).
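The difference between replacing and wrapping a function can be sketched with a toy model in which the "application" calls its functions through a name table. The `Application` class and both helpers are hypothetical stand-ins, not Dyninst calls; only Replace Function and Wrap Function are modelled, not the per-call-site operations.

```python
class Application:
    """Toy application whose calls all go through a name -> function table."""

    def __init__(self):
        self.functions = {}

    def define(self, name, function):
        self.functions[name] = function

    def call(self, name, *args):
        return self.functions[name](*args)


def replace_function(app, name, new_function):
    """Redirect all current and future calls of `name` to `new_function`."""
    app.functions[name] = new_function


def wrap_function(app, name, make_wrapper):
    """Install a wrapper that can still call the function it replaces."""
    original = app.functions[name]
    app.functions[name] = make_wrapper(original)


app = Application()
app.define("checksum", lambda data: sum(data) % 256)

observed_args = []


def tracing(original):
    def wrapper(*args):
        observed_args.append(args)  # instrumentation: record the arguments
        return original(*args)      # forward the original parameters
    return wrapper


wrap_function(app, "checksum", tracing)
wrapped_result = app.call("checksum", [1, 2, 3])

replace_function(app, "checksum", lambda data: 0)
replaced_result = app.call("checksum", [1, 2, 3])
```

Wrapping keeps the original result and adds an observation; replacement discards the original behaviour entirely.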

– 15 – Basic DynInst Operations
• Process control:
  – Attach/create process
  – Monitor process status changes
  – Callbacks for fork/exec/exit
• Inferior (application process) operations:
  – Malloc/free
    – Allocate heap space in the application process.
  – Inferior RPC
    – Asynchronously execute a function in the application.
  – Load module
    – Cause a new .so/.dll to be loaded into the application.

– 16 – Basic DynInst Operations
• Building AST code sequences:
  – Control structures: if and goto
  – Arithmetic and Boolean expressions
  – Get PID/TID operations
  – Read/write registers and global variables
  – Read/write parameters and return value
  – Function call
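To make the building blocks above concrete, here is a small sketch of a snippet tree combining an `if`, a Boolean comparison, arithmetic, and a global-variable write. The node classes and the `evaluate` function are hypothetical; the sketch interprets the tree against a dictionary standing in for application memory, whereas an instrumentation library would generate machine code from it.

```python
import operator
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object


@dataclass(frozen=True)
class Assign:
    name: str
    value: object


@dataclass(frozen=True)
class If:
    condition: object
    then: object


OPS = {"+": operator.add, "-": operator.sub,
       "<": operator.lt, "==": operator.eq}


def evaluate(node, variables):
    """Interpret a snippet tree against a dict of global variables."""
    if isinstance(node, Const):
        return node.value
    if isinstance(node, Var):
        return variables[node.name]
    if isinstance(node, BinOp):
        return OPS[node.op](evaluate(node.left, variables),
                            evaluate(node.right, variables))
    if isinstance(node, Assign):
        variables[node.name] = evaluate(node.value, variables)
        return variables[node.name]
    if isinstance(node, If):
        if evaluate(node.condition, variables):
            return evaluate(node.then, variables)
        return None
    raise TypeError(f"unknown node: {node!r}")


# Snippet: if (count < 3) count = count + 1
snippet = If(BinOp("<", Var("count"), Const(3)),
             Assign("count", BinOp("+", Var("count"), Const(1))))

state = {"count": 0}
for _ in range(5):  # pretend the instrumentation point is hit five times
    evaluate(snippet, state)
```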

– 17 – Dyninst Automated Testing
• A test suite of almost 100 operation-specific tests.
• Runs each night on each platform on the nightly build.
• Variations for different compilers, languages (C, C++, Fortran), stripped vs. non-stripped code, etc.
• Results reported on the web (reachable from paradyn.org or dyninst.org home pages).

– 18 – BinInst Design Goals
• Tool-kit component architecture for binary analysis and editing
• Open source
• Open data structure definitions
• Machine-independent abstract interfaces
• Batch-enabled analyses
• Static and dynamic code patching
• All major analysis products are exportable
• Enhanced testability and accompanying test suites

– 19 – Static Editing Scenario (Binary Rewriting)
[Diagram components: Binary Code; Binary Decode and Parsing; Raw Disassembly; Symbol Table Dump; Call Graph; Intra-Proc CFG; Idiom Signatures; Code Queries and Instrumentation Requests; AST; Instr Control; Code Gen]

– 20 – Interactive Editing Scenario (Static or Dynamic)
[Diagram components: Binary Code; Binary Decode and Parsing; Raw Disassembly; Symbol Table Dump; Call Graph; Intra-Proc CFG; Idiom Signatures; Instr Control; Code Gen]

– 21 – Dynamic Editing Scenario (Dynamic Instrumentation)
[Diagram components: Binary Code; Binary Decode and Parsing; Raw Disassembly; Symbol Table Dump; Call Graph; Intra-Proc CFG; Idiom Signatures; Code Queries and Instrumentation Requests; AST; Instr Control; Code Gen; Process Control; User Process; Stack Walker]

– 22 – Analysis Scenario
[Diagram components: Binary Code; Binary Decode and Parsing; Raw Disassembly; Symbol Table Dump; Call Graph; Intra-Proc CFG; Idiom Signatures; Connector 2 Code Surfer; VSA; Buffer Overrun; Other Tool]

– 23 –
[Diagram components: Binary Code; Symbol Table Parser; Code Parser; Instruction Decoder; Idiom Detector; Raw Disassembly; Symbol Table Dump; Call Graph; Intra Proc CFG; Idiom Signatures; Code Queries and Instrumentation Requests; AST; Instr Control; Code Gen; Process Control; Stack Walker. File formats shown: PE, ELF, COFF. Architectures shown: IA32, AMD64, Power.]
