Paradyn Project Paradyn / Dyninst Week Madison, Wisconsin April 29 - May 1, 2013 Using Dyninst for Program Binary Analysis and Instrumentation Emily Jacobson.

Presentation transcript:

Paradyn Project Paradyn / Dyninst Week Madison, Wisconsin April 29 - May 1, 2013 Using Dyninst for Program Binary Analysis and Instrumentation Emily Jacobson

No Source Code, No Problem

With Dyninst we can:
o Find (stripped) code
  o in program binaries
  o in live processes
o Analyze code
  o functions
  o control-flow graphs
  o loop, dominator analyses
o Instrument code
  o statically (rewrite binary)
  o dynamically (instrument live process)

[Diagram: inputs are executables (a.out, prog.exe), libraries (lib.so, lib.dll), and live processes (an executable plus Library 1 … Library N).]

Choice of Static vs. Dynamic Instrumentation

Static Rewriting
o Amortize parsing and instrumentation time.
o Potential to generate more efficient modified binaries.
o 1st-party response to runtime events

Dynamic Instrumentation
o Execute instrumentation at a particular time (oneTimeCode).
o Insert and remove instrumentation at run time.
o 3rd-party response to runtime events

Example Dyninst Program

Target: ChaosPro ver 3.1
Goal: find memory leaks
o Add printfs to malloc, free
o Stackwalk malloc calls that are not freed
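The bookkeeping behind this example can be sketched independently of Dyninst. The block below is a toy model, not Dyninst API: `LeakTracker`, `onMalloc`, and `onFree` are hypothetical names standing in for whatever the inserted malloc/free instrumentation would report to the mutator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Toy model of leak bookkeeping. All names here are hypothetical.
class LeakTracker {
public:
    // Would be driven by instrumentation at malloc's exit points.
    void onMalloc(std::uintptr_t addr, std::size_t size) {
        if (addr != 0) live_[addr] = size;   // ignore failed allocations
    }
    // Would be driven by instrumentation at free's entry point.
    void onFree(std::uintptr_t addr) {
        live_.erase(addr);                   // free(NULL) and unknown pointers are no-ops
    }
    // Addresses that were allocated but never freed.
    std::vector<std::uintptr_t> leaks() const {
        std::vector<std::uintptr_t> out;
        for (const auto& kv : live_) out.push_back(kv.first);
        return out;
    }
    // Total size of all blocks still outstanding.
    std::size_t leakedBytes() const {
        std::size_t total = 0;
        for (const auto& kv : live_) total += kv.second;
        return total;
    }
private:
    std::unordered_map<std::uintptr_t, std::size_t> live_;
};
```

Whatever remains in the table when the program exits is a leak candidate. Attaching a stack walk to each `onMalloc` record would identify where the leaked block was allocated.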

Dyninst Components

o Symbol Table Parser (SymtabAPI)
o Code Parser (ParsingAPI)
o Instruction Decoder (InstructionAPI)
o Process Controller (ProcControlAPI)
o Stack Walker (StackwalkerAPI)
o Code Generator
o Instrumenter

[Diagram: the components operate on binary code and serve instrumentation requests, stack walk requests, and analysis requests.]

Process Control

o Several supported OSes: Linux, Windows

Process Control

o Several supported OSes
o Broad functionality
  o Attach/create process
  o Monitor process status changes
  o Callbacks for fork/exec/exit
  o Mutatee operations: malloc, load library, inferior RPC
o Uses debugger interface

[Diagram: the analyst program (mutator) links the Dyninst library, whose Process Controller uses the debugger interface to control the monitored process (mutatee), which contains the Dyninst runtime library.]

Dyninst's Process Interface

[Figure not captured in the transcript]

Example: Create a ChaosPro.exe Process

    #include <cstdio>
    #include "BPatch.h"
    #include "BPatch_process.h"

    BPatch bpatch;

    // Called when the mutatee exits.
    static void exitCallback(BPatch_thread*, BPatch_exitType) {
        printf("About to exit\n");
    }

    int main(int argc, char *argv[]) {
        if (argc < 2) {
            fprintf(stderr, "Usage: %s prog_filename\n", argv[0]);
            return 1;
        }
        // Create the mutatee; it starts out stopped.
        BPatch_process *proc =
            bpatch.processCreate(argv[1], (const char **)(argv + 1));
        bpatch.registerExitCallback(exitCallback);
        proc->continueExecution();
        // Wait until the mutatee terminates.
        while (!proc->isTerminated())
            bpatch.waitForStatusChange();
        return 0;
    }

Invocation:

    > mutator.exe C:\Chaos\ChaosPro.exe

Unified Abstractions

BPatch_addressSpace (shared by both cases below)
o Add/remove instrumentation
o Lookups by address
o Allocate variables in mutatee

BPatch_process (live process)
o Process state
o Threads
o One-time instrumentation

BPatch_binaryEdit (a.out, libc.so)
o Write file

Symbol Table Parsing

Question: where are malloc, free?

[Diagram: the mutator links the Dyninst library (Code Generator, Instrumenter, Stack Walker, Process Controller, Symbol Table Parser, Code Parser, Instruction Decoder); the mutatee consists of chaospro.exe, msvcrt.dll, and the runtime library.]

Symbol Table Parsing

Question: where are malloc, free?

Supported formats: PE, ELF, XCOFF

Information extracted by the Symbol Table Parser:
o Program headers
o Shared object dependencies
o Type information
o Exception information
o Symbols
o Symbol versions
o Section headers
o Section data
o Dynamic segment information
o Relocations
o Local variable information
o Line number information

[Table: columns Symbol, Address, Size; rows func1, func2, variable1; addresses shown: 0x0804cd1d, 0x0804cc84, 0x0804cd00]

[Diagram: mutatee with chaospro.exe, msvcrt.dll, and the runtime library]
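The two lookups a symbol table supports (name to address, and address to enclosing symbol) can be sketched with ordinary containers. This is a toy model, not SymtabAPI: `Symbol`, `SymbolTable`, and their members are hypothetical, and real symbol tables carry far more information than a name, an address, and a size.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical record type for illustration only.
struct Symbol {
    std::string name;
    std::uint64_t addr;
    std::uint64_t size;
};

class SymbolTable {
public:
    void add(const Symbol& s) {
        byName_[s.name] = s;
        byAddr_[s.addr] = s;
    }
    // Name -> symbol ("Where is malloc?"). Returns nullptr if unknown.
    const Symbol* findByName(const std::string& name) const {
        auto it = byName_.find(name);
        return it == byName_.end() ? nullptr : &it->second;
    }
    // Address -> symbol whose range [addr, addr + size) contains a.
    const Symbol* findByAddress(std::uint64_t a) const {
        auto it = byAddr_.upper_bound(a);   // first symbol starting after a
        if (it == byAddr_.begin()) return nullptr;
        --it;                               // last symbol starting at or before a
        const Symbol& s = it->second;
        return (a - s.addr < s.size) ? &s : nullptr;
    }
private:
    std::map<std::string, Symbol> byName_;
    std::map<std::uint64_t, Symbol> byAddr_;
};
```

Keeping a second index ordered by address is what makes "lookups by address" cheap: one `upper_bound` call finds the candidate symbol, and the size check rejects addresses that fall in a gap between symbols.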

Example: Find malloc

    int main(int argc, char *argv[]) {
        ...
        BPatch_image * image = proc->getImage();
        BPatch_module * libc = image->findModule("msvcrt");
        // Slide shorthand: the full findFunction call also takes an output vector.
        vector<BPatch_function *> * funcs = libc->findFunction("malloc");
        BPatch_function * bp_malloc = (*funcs)[0];
        Address start = (Address) bp_malloc->getBaseAddr();
        Address size = bp_malloc->getSize();
        printf("malloc: [%lx %lx]\n",
               (unsigned long) start, (unsigned long) (start + size));
        ...
    }

[Diagram: the mutator (Dyninst library) queries the mutatee (chaospro.exe, msvcrt.dll, runtime library).]

Decoding and Parsing of Binary Code

Goal: get parameters, return values for malloc, free

[Diagram: the mutator links the Dyninst library (Code Generator, Instrumenter, Stack Walker, Code Parser, Instruction Decoder, Process Controller, Symbol Table Parser); the mutatee consists of chaospro.exe, msvcrt.dll, and the runtime library.]

Instruction Decoding

Supported architectures: IA32, AMD64, POWER

[Diagram: the Instruction Decoder turns raw bytes from the mutatee into instructions with abstract syntax trees. Example: mov eax -> [ebx * 4 + ecx] has operands eax and [ebx * 4 + ecx]; the memory operand is the tree deref(add(mult(ebx, 4), ecx)).]
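The operand tree for `[ebx * 4 + ecx]` can be modeled as a small expression AST. The sketch below is a toy, not InstructionAPI: `Expr`, `Reg`, `Imm`, `Add`, `Mult`, and `exampleOperand` are hypothetical names, and only the address computation is modeled (the `deref` node would read memory at the computed address).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>

using RegFile = std::map<std::string, std::uint64_t>;

// Hypothetical AST node: evaluates to a value given register contents.
struct Expr {
    virtual ~Expr() = default;
    virtual std::uint64_t eval(const RegFile& regs) const = 0;
};
using ExprPtr = std::shared_ptr<const Expr>;

struct Reg : Expr {
    std::string name;
    explicit Reg(std::string n) : name(std::move(n)) {}
    std::uint64_t eval(const RegFile& regs) const override { return regs.at(name); }
};

struct Imm : Expr {
    std::uint64_t value;
    explicit Imm(std::uint64_t v) : value(v) {}
    std::uint64_t eval(const RegFile&) const override { return value; }
};

struct Add : Expr {
    ExprPtr lhs, rhs;
    Add(ExprPtr l, ExprPtr r) : lhs(std::move(l)), rhs(std::move(r)) { assert(lhs && rhs); }
    std::uint64_t eval(const RegFile& regs) const override {
        return lhs->eval(regs) + rhs->eval(regs);
    }
};

struct Mult : Expr {
    ExprPtr lhs, rhs;
    Mult(ExprPtr l, ExprPtr r) : lhs(std::move(l)), rhs(std::move(r)) { assert(lhs && rhs); }
    std::uint64_t eval(const RegFile& regs) const override {
        return lhs->eval(regs) * rhs->eval(regs);
    }
};

// Address expression for the memory operand [ebx * 4 + ecx].
inline ExprPtr exampleOperand() {
    return std::make_shared<Add>(
        std::make_shared<Mult>(std::make_shared<Reg>("ebx"), std::make_shared<Imm>(4)),
        std::make_shared<Reg>("ecx"));
}
```

With ebx = 3 and ecx = 0x1000, evaluating the tree yields the effective address 0x100C. Representing operands as trees lets an analysis ask structural questions (which registers are read, what the scale factor is) without re-decoding the bytes.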

Parsing

o Identify basic blocks, functions
o Build control-flow graph
o Operate on stripped code, but use symbol information opportunistically
o Parse-time analyses:

Supported architectures: IA32, AMD64, POWER

[Diagram: raw bytes from the mutatee pass through the Instruction Decoder (producing instructions such as mov eax -> [ebx * 4 + ecx]) and then the Code Parser.]

Binary Code Parsing

Task: instrument malloc at its entry and exit points, instrument free at its entry point
Subtask: find malloc and parse it

msvcrt.dll symbols:
    malloc    77C2C407
    free      77C2C21B
    atoi      77C1BE7B
    strcpy    77C46030
    memmove   77C472B0

[Diagram: the Process Controller and Symbol Table Parser locate malloc in the mutatee (chaospro.exe, msvcrt.dll); the Instruction Decoder and Code Parser then turn its bytes into instructions such as mov eax -> [ebx * 4 + ecx].]

Control Flow Traversal Parsing

o Function symbols may be sparse
  o Executables must provide only one function address
  o Libraries provide symbols for exported functions
o Parsing finds additional functions by following call edges

[Diagram: functions found in an example binary, each with its address range: _start, _init, _fini, main, targ3d4, targ400, targ440.]
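Following call edges from a few known entry points is a worklist traversal. The sketch below is a toy model under strong assumptions, not the Dyninst parser: `CallMap` and `discoverFunctions` are hypothetical, and the call targets of each function are assumed to be already decoded, whereas a real parser decodes instructions and follows intraprocedural edges as it goes.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <map>
#include <set>
#include <vector>

using Addr = std::uint64_t;

// Hypothetical pre-decoded view of a binary: for each function entry
// address, the direct call targets found in its body.
using CallMap = std::map<Addr, std::vector<Addr>>;

// Start from the known entry points (e.g. from symbols) and follow call
// edges to discover functions that have no symbol of their own.
inline std::set<Addr> discoverFunctions(const CallMap& calls,
                                        const std::vector<Addr>& seeds) {
    std::set<Addr> found;
    std::deque<Addr> worklist(seeds.begin(), seeds.end());
    while (!worklist.empty()) {
        Addr f = worklist.front();
        worklist.pop_front();
        if (!found.insert(f).second) continue;   // already visited
        auto it = calls.find(f);
        if (it == calls.end()) continue;         // no decoded calls for f
        for (Addr target : it->second) worklist.push_back(target);
    }
    return found;
}
```

The visited set makes the traversal terminate on recursive or mutually recursive call chains. Functions that are never the target of a direct call from a reachable function stay undiscovered in this model.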

Control Flow Graph

Graph elements:
o BPatch_function
o BPatch_basicBlock
o BPatch_edge

Instrumentation points: BPatch_point

    Address pointAddr;
    BPatch_procedureLocation type;
    enum { BPatch_entry, BPatch_exit, BPatch_subroutine, BPatch_address }

[Diagram: control flow graph with points marked E (entry), C (call), and R (return).]
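How entry, exit, and call-site points relate to a function's basic blocks can be sketched with a toy CFG. Nothing below is Dyninst API: `Block`, `Point`, `PointType`, and this `findPoints` are hypothetical, the first block is assumed to be the function entry, and each point is reported at its block's start address for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Addr = std::uint64_t;

enum class PointType { Entry, Exit, Subroutine };

// Hypothetical basic-block summary.
struct Block {
    Addr start;
    bool endsInCall;     // block ends with a call instruction
    bool endsInReturn;   // block ends with a return instruction
};

struct Point {
    Addr addr;
    PointType type;
};

// Collect the points of one kind in a function whose blocks are given
// with the entry block first.
inline std::vector<Point> findPoints(const std::vector<Block>& blocks, PointType want) {
    std::vector<Point> out;
    for (std::size_t i = 0; i < blocks.size(); ++i) {
        const Block& b = blocks[i];
        bool match = (want == PointType::Entry && i == 0) ||
                     (want == PointType::Subroutine && b.endsInCall) ||
                     (want == PointType::Exit && b.endsInReturn);
        if (match) out.push_back({b.start, want});
    }
    return out;
}
```

A function has one entry point in this model but may have several exit points, which is why exit instrumentation is typically inserted at every returned point rather than at a single location.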

Example: Find malloc's Exit Points

    vector<BPatch_function *> * funcs;
    funcs = bp_image->getProcedures();          // every function in the image
    funcs = bp_image->findFunction("malloc");   // only functions named malloc

Parsing is triggered automatically as needed.

[Diagram: mutatee modules chaospro.exe, msvcrt.dll (containing malloc), and kernel32.dll.]

Example: Find malloc's Exit Points

    vector<BPatch_function *> * funcs;
    funcs = bp_image->findFunction("malloc");   // search the whole image
    funcs = libc_mod->findFunction("malloc");   // search one module only

Parsing is triggered automatically as needed.

[Diagram: mutatee modules chaospro.exe, msvcrt.dll (containing malloc), and kernel32.dll.]

Example: Find malloc's Exit Points

    BPatch_function * bp_malloc = (*funcs)[0];
    vector<BPatch_point *> * points = bp_malloc->findPoints(BPatch_exit);
    // Other location types: BPatch_entry, BPatch_subroutine

[Diagram: control flow graph of malloc in msvcrt.dll with points marked E (entry), C (call), and R (return); the mutatee also contains chaospro.exe and kernel32.dll.]

Instrumentation (at last!)

[Diagram: the mutator links the Dyninst library (Code Generator, Instrumenter, Stack Walker, Code Parser, Instruction Decoder, Process Controller, Symbol Table Parser); the mutatee consists of chaospro.exe, msvcrt.dll, and the runtime library.]

Specifying Instrumentation Requests

o What: a snippet (abstract syntax tree)
o Where: instrumentation points

[Diagram: instrumentation requests are passed to the Code Generator and Instrumenter; the example points are return points (R).]

BPatch_snippet Subclasses

o BPatch_sequence( vector<BPatch_snippet *> items )
o BPatch_variableExpr()
o BPatch_constExpr, constructed from one of:
  o int value
  o char* value
  o void* value
o BPatch_ifExpr( BPatch_boolExpr condition, BPatch_snippet then_clause, BPatch_snippet else_clause )
o BPatch_funcCallExpr( BPatch_function * func, vector<BPatch_snippet *> args )
o BPatch_paramExpr( int param_number )
o BPatch_retExpr()
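Snippets are trees that are evaluated in the context of an instrumentation point. The sketch below is a toy interpreter, not Dyninst: the class names only echo the list above, `Context` is a hypothetical stand-in for the values visible at a point, and Dyninst compiles snippets to machine code rather than interpreting them.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Values visible at an instrumentation point in this toy model.
struct Context {
    std::vector<long> params;   // arguments of the instrumented function
    long returnValue = 0;       // meaningful only at exit points
};

struct Snippet {
    virtual ~Snippet() = default;
    virtual long eval(const Context& ctx) const = 0;
};
using SnippetPtr = std::shared_ptr<const Snippet>;

struct ConstExpr : Snippet {
    long value;
    explicit ConstExpr(long v) : value(v) {}
    long eval(const Context&) const override { return value; }
};

struct ParamExpr : Snippet {
    std::size_t index;
    explicit ParamExpr(std::size_t i) : index(i) {}
    long eval(const Context& ctx) const override { return ctx.params.at(index); }
};

struct RetExpr : Snippet {
    long eval(const Context& ctx) const override { return ctx.returnValue; }
};

struct GtExpr : Snippet {
    SnippetPtr lhs, rhs;
    GtExpr(SnippetPtr l, SnippetPtr r) : lhs(std::move(l)), rhs(std::move(r)) {}
    long eval(const Context& ctx) const override {
        return lhs->eval(ctx) > rhs->eval(ctx) ? 1 : 0;
    }
};

struct IfExpr : Snippet {
    SnippetPtr cond, thenClause, elseClause;
    IfExpr(SnippetPtr c, SnippetPtr t, SnippetPtr e)
        : cond(std::move(c)), thenClause(std::move(t)), elseClause(std::move(e)) {}
    long eval(const Context& ctx) const override {
        return cond->eval(ctx) ? thenClause->eval(ctx) : elseClause->eval(ctx);
    }
};
```

Building `IfExpr(GtExpr(ParamExpr(0), ConstExpr(100)), RetExpr(), ConstExpr(-1))` gives a tree that yields the return value only when the first argument exceeds 100. Because the tree says nothing about registers or calling conventions, the same snippet can be lowered for different architectures.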

BPatch_snippet Classes

[Figure not captured in the transcript]

Example: Forming printf Snippet

Call to insert at the entry point (E) of free(ptr):

    printf("free(%x)\n", arg0);

Snippet tree:

    BPatch_funcCallExpr( BPatch_function * func, vector<BPatch_snippet *> args )
        func: BPatch_function bp_printf
        args: vector of
            BPatch_constExpr  "free(%x)\n"
            BPatch_paramExpr  arg0 (0)

Example: Instrument free w/ call to printf

    BPatch_function * bp_free;
    vector<BPatch_point *> entryPoints;
    ...
    BPatch_constExpr arg0("free(%x)\n");   // format string
    BPatch_paramExpr arg1(0);              // first argument of free
    vector<BPatch_snippet *> printf_args;
    printf_args.push_back(&arg0);
    printf_args.push_back(&arg1);
    BPatch_funcCallExpr callPrintf(*bp_printf, printf_args);

    // Batch the insertions so they are applied together.
    bpatch.beginInsertionSet();
    for (size_t idx = 0; idx < entryPoints.size(); idx++)
        proc->insertSnippet(callPrintf, *entryPoints[idx]);
    bpatch.finalizeInsertionSet();

[Diagram: the BPatch_funcCallExpr tree (bp_printf with a vector holding BPatch_constExpr "free(%x)\n" and BPatch_paramExpr arg0(0)) is attached to the entry point (E) of free(ptr).]

Using Variables

o Find / create variable

      bp_image->findVariable("global1");
      bp_proc->malloc(*bp_image->findType("int"));

o Initialization instrumentation, e.g., assignment at entry point of main
o Manipulation instrumentation, e.g., arithmetic assignment expression
o Gather / print out values, e.g., through callback instrumentation

malloc instrumentation: save argument in a variable

Example: Instrumenting malloc

    void * malloc ( size_t size ) {
        MALLOC_ARG = size;          // inserted at the entry point (E)
        ...
        if (MALLOC_ARG > 100)       // inserted at each exit point (R)
            printf("%x = malloc(%x)\n", retnValue, MALLOC_ARG);
    }

[Diagram: the entry snippet is a BPatch_arithExpr with operator BPatch_assign and operands MALLOC_ARG and BPatch_constExpr 1.]

Example: Instrumenting malloc

    void * malloc ( size_t size ) {
        MALLOC_ARG = size;          // inserted at the entry point (E)
        ...
        if (MALLOC_ARG > 100)       // inserted at each exit point (R)
            printf("%x = malloc(%x)\n", retnValue, MALLOC_ARG);
    }

Exit snippet tree:

    BPatch_ifExpr
        BPatch_boolExpr (BPatch_gt)
            MALLOC_ARG
            BPatch_constExpr(100)
        BPatch_funcCallExpr
            BPatch_function bp_printf
            vector of arguments:
                BPatch_constExpr  "%x = malloc(.)\n"
                BPatch_retExpr    retnValue
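The effect of the two snippets can be pictured at source level as a wrapper that runs an entry hook and an exit hook around the real allocator. This is only an analogy under stated assumptions: `atMallocEntry`, `atMallocExit`, `instrumentedMalloc`, and `g_log` are hypothetical names, the log replaces printf so the result can be inspected, and binary instrumentation inserts code at the points themselves rather than wrapping the function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-ins for state the instrumentation would allocate
// inside the mutatee.
static std::size_t MALLOC_ARG = 0;
static std::vector<std::string> g_log;

// Entry-point snippet: MALLOC_ARG = size
static void atMallocEntry(std::size_t size) { MALLOC_ARG = size; }

// Exit-point snippet: if (MALLOC_ARG > 100) record "<ret> = malloc(<size>)"
static void atMallocExit(void* retnValue) {
    if (MALLOC_ARG > 100) {
        char buf[64];
        std::snprintf(buf, sizeof buf, "%p = malloc(%zx)", retnValue, MALLOC_ARG);
        g_log.push_back(buf);
    }
}

// Source-level picture of malloc after instrumentation.
void* instrumentedMalloc(std::size_t size) {
    atMallocEntry(size);
    void* p = std::malloc(size);
    atMallocExit(p);
    return p;
}
```

Saving the argument at entry matters because the exit snippet cannot rely on the parameter still being available when the function returns. A single global like `MALLOC_ARG` is not safe with multiple threads or reentrant calls; this sketch ignores both.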

Generating the Instrumentation Code

Supported architectures: IA32, AMD64, POWER

[Diagram: the Code Generator and Instrumenter combine the instrumentation snippet (BPatch_funcCallExpr calling bp_printf with BPatch_constExpr "free(%x)\n" and BPatch_paramExpr arg0(0)) with the code at the instrumented point (e.g. mov eax -> [ebx * 4 + ecx]).]

Stack Walking

[Diagram: the mutator links the Dyninst library (Code Generator, Instrumenter, Stack Walker, Code Parser, Instruction Decoder, Process Controller, Symbol Table Parser); the mutatee consists of chaospro.exe, msvcrt.dll, and the runtime library.]

Example: Stack Walk of malloc Call

o Choose instrumentation point: the exit points of malloc
o Insert callback instrumentation: use stopThreadExpr snippet
o Callback triggers stackwalk: BPatch_thread::getCallStack(…)

[Diagram: the Stack Walker in the mutator's Dyninst library walks the mutatee (chaospro.exe, msvcrt.dll, runtime library) from the exit points (R) of malloc.]
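The simplest form of stack walking follows a chain of saved frame pointers. The sketch below is a toy over simulated memory, not StackwalkerAPI: `Memory` and `walkStack` are hypothetical, and the frame layout (saved frame pointer at `fp`, return address at `fp + 8`) is an assumption. Real walkers must also handle functions without frame pointers, signal frames, and instrumentation frames.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using Addr = std::uint64_t;
using Memory = std::map<Addr, Addr>;   // toy memory: address -> 8-byte word

// Walk frames laid out as [fp] = caller's fp, [fp + 8] = return address.
// A saved fp of 0 marks the outermost frame.
inline std::vector<Addr> walkStack(const Memory& mem, Addr fp,
                                   std::size_t maxDepth = 64) {
    std::vector<Addr> returnAddrs;
    while (fp != 0 && returnAddrs.size() < maxDepth) {
        auto saved = mem.find(fp);
        auto ret = mem.find(fp + 8);
        if (saved == mem.end() || ret == mem.end()) break;   // unreadable frame
        returnAddrs.push_back(ret->second);
        fp = saved->second;
    }
    return returnAddrs;
}
```

The depth limit and the unreadable-frame check keep the walk from looping or running off the stack when a frame is corrupt. Mapping each return address back to a function is then a lookup by address in the symbol table.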

Implementation Session: Code Coverage

o Create a mutator that counts function invocations
o See description of the lab at
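The data structure behind such a mutator can be as small as a table of counters. The sketch below is a toy, not the lab's solution and not Dyninst API: `CoverageCounter` and `onEntry` are hypothetical, with `onEntry` standing in for an increment snippet inserted at each function's entry point.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical coverage table: one counter per function name.
class CoverageCounter {
public:
    // Would be driven by instrumentation at each function's entry point.
    void onEntry(const std::string& func) { ++counts_[func]; }

    // Number of recorded invocations; 0 for functions never entered.
    unsigned long count(const std::string& func) const {
        auto it = counts_.find(func);
        return it == counts_.end() ? 0 : it->second;
    }

    // Number of distinct functions entered at least once.
    std::size_t functionsCovered() const { return counts_.size(); }

private:
    std::map<std::string, unsigned long> counts_;
};
```

In an instrumented binary the counters would live in mutatee memory and be read back by the mutator when the program exits.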