Intermission
Binary parsing
The Deconstruction of Dyninst
[Figure: control flow graph over the functions _lock_foo, main, foo]
dynamic instrumentation, debugger, static binary analysis tools, malware analysis, binary editor/rewriter, …
Familiar territory
  Benjamin Schwarz, Saumya Debray, and Gregory R. Andrews. Disassembly of executable code revisited
  Cristina Cifuentes and K. John Gough. Decompilation of binary programs
  Richard L. Sites, Anton Chernoff, Matthew B. Kirk, Maurice P. Marks, and Scott G. Robinson. Binary translation
  Henrik Theiling. Extracting safe and precise control flow from binaries
  Ramkumar Chinchani and Eric van den Berg. A fast static analysis approach to detect exploit code inside network flows
  J. Troger and C. Cifuentes. Analysis of virtual method invocation for binary translation
  Laune C. Harris and Barton P. Miller. Practical analysis of stripped binary code
  Christopher Kruegel, William Robertson, Fredrik Valeur, and Giovanni Vigna. Static disassembly of obfuscated binaries
  Nathan Rosenblum, Xiaojin Zhu, Barton P. Miller, and Karen Hunt. Learning to analyze binary computer code
  Amitabh Srivastava and Alan Eustace. ATOM: a system for building customized program analysis tools
  Barton Miller, Jeffrey Hollingsworth, and Mark Callaghan. Dynamic Program Instrumentation for Scalable Performance Tools
We’ve been down this road…
  recursive traversal parsing
    non-contiguous functions
    code sharing
    non-returning functions
  “gap” parsing heuristics
    preamble scanning
    handles stripped binaries
  probabilistic code models
    learn to recognize function entry points
    very accurate gap parsing
All three feed into the DYNINST binary parser.
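The recursive traversal idea named on this slide can be sketched on a toy instruction model. Everything in the block (Insn, Kind, recursiveTraversal, gaps) is hypothetical and is not Dyninst code; a real parser decodes machine instructions instead of reading a prebuilt map. Addresses the traversal never reaches are what a gap parsing heuristic would examine next.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical toy instruction model; real parsers decode machine code.
enum class Kind { Plain, Jump, CondJump, Call, Return };

struct Insn {
    Kind kind;
    std::uint64_t next;    // address of the following instruction
    std::uint64_t target;  // branch/call target (unused for Plain, Return)
};

using Program = std::map<std::uint64_t, Insn>;

struct ParseResult {
    std::set<std::uint64_t> code;       // addresses reached by traversal
    std::set<std::uint64_t> functions;  // hinted or discovered entry points
};

// Follow control flow from the hinted entry points, visiting each address once.
ParseResult recursiveTraversal(const Program &prog,
                               const std::vector<std::uint64_t> &hints) {
    ParseResult r;
    std::vector<std::uint64_t> work(hints.begin(), hints.end());
    r.functions.insert(hints.begin(), hints.end());

    while (!work.empty()) {
        std::uint64_t addr = work.back();
        work.pop_back();
        auto it = prog.find(addr);
        if (it == prog.end() || !r.code.insert(addr).second)
            continue;  // outside the known code, or already visited
        const Insn &i = it->second;
        switch (i.kind) {
        case Kind::Plain:
            work.push_back(i.next);
            break;
        case Kind::Jump:
            work.push_back(i.target);
            break;
        case Kind::CondJump:
            work.push_back(i.target);
            work.push_back(i.next);
            break;
        case Kind::Call:
            r.functions.insert(i.target);  // call targets are function entries
            work.push_back(i.target);
            work.push_back(i.next);  // simplification: assumes the callee returns
            break;
        case Kind::Return:
            break;
        }
    }
    return r;
}

// Addresses never reached by traversal: candidates for gap parsing.
std::set<std::uint64_t> gaps(const Program &prog, const ParseResult &r) {
    std::set<std::uint64_t> g;
    for (const auto &kv : prog)
        if (!r.code.count(kv.first)) g.insert(kv.first);
    return g;
}
```

The Call case shows why non-returning functions matter: continuing at the instruction after a call is only correct if the callee comes back.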
What makes a parsing component?
[Diagram: the Parsing API and its three ingredients]
  1. Binary code source abstraction
  2. Simple, intuitive representation: functions, blocks, edges
  3. Platform independence, supported by previous Dyninst components: InstructionAPI, SymtabAPI
Flexible code sources
[Diagram: a binary code object feeding the Parser through a code source]
Code source requirements:
  code location (which regions are code, which are data)
  access to code bytes (unsigned char * buf: fe …)
  function hints & names (main, foo, bar, baz)
  a few (optional) facts: pointer width, external linkage (PLT)
Code source contract
Nine mandatory methods:
  bool isValidAddress
  bool isExecutableAddress
  void *getPtrToInstruction
  void *getPtrToData
  unsigned getAddressWidth
  bool isCode
  bool isData
  Address codeOffset
  Address codeLength
SymtabAPI implementation in 232 lines (including optional hints, function names)
Any binary code object that can be memory mapped can be parsed
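As an illustration of how small such a contract is, here is a sketch of an abstract class carrying the nine method names from the slide, plus a hypothetical implementation over a raw in-memory buffer. The slide gives only return types and names, so the parameter lists, the constness, the Address typedef, and the RawBufferSource class are all assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Address = std::uint64_t;  // assumption: the slide does not define Address

// The nine methods named on the slide; parameters and constness are guesses.
class CodeSource {
public:
    virtual ~CodeSource() = default;
    virtual bool isValidAddress(Address a) const = 0;
    virtual bool isExecutableAddress(Address a) const = 0;
    virtual void *getPtrToInstruction(Address a) const = 0;
    virtual void *getPtrToData(Address a) const = 0;
    virtual unsigned getAddressWidth() const = 0;
    virtual bool isCode(Address a) const = 0;
    virtual bool isData(Address a) const = 0;
    virtual Address codeOffset() const = 0;
    virtual Address codeLength() const = 0;
};

// Hypothetical code source: one buffer mapped at base_, treated as all code.
class RawBufferSource : public CodeSource {
    std::vector<unsigned char> buf_;
    Address base_;

public:
    RawBufferSource(std::vector<unsigned char> bytes, Address base)
        : buf_(std::move(bytes)), base_(base) {}

    bool isValidAddress(Address a) const override {
        return a >= base_ && a - base_ < buf_.size();
    }
    bool isExecutableAddress(Address a) const override {
        return isValidAddress(a);
    }
    void *getPtrToInstruction(Address a) const override {
        if (!isValidAddress(a)) return nullptr;
        return const_cast<unsigned char *>(buf_.data() + (a - base_));
    }
    // This toy source has no data region.
    void *getPtrToData(Address) const override { return nullptr; }
    // Width in bytes of a pointer on the host; a real source would report
    // the width of the binary being parsed.
    unsigned getAddressWidth() const override { return sizeof(void *); }
    bool isCode(Address a) const override { return isValidAddress(a); }
    bool isData(Address) const override { return false; }
    Address codeOffset() const override { return base_; }
    Address codeLength() const override { return buf_.size(); }
};
```

A parser written against CodeSource alone never needs to know whether the bytes came from an ELF file, a process image, or a test buffer like this one.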
Simple control flow interface
  Functions (contain Blocks): start addr., extents
  Blocks (joined by Edges): start addr., end addr., in edges, out edges
  Edges: src, targ, type
Views of control flow
Walking a control flow graph:
    while(!work.empty()) {
        Block *b = work.pop();
        /* do something with b */
        edgeiter eit = b->out().begin();
        while(eit != b->out().end()) {
            work.push((*eit++)->trg());  /* follow the edge to its target block */
        }
    }
[Figure annotation: "starting here" marks the block where the walk begins]
What if we only want intraprocedural edges?
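The slide's loop is shorthand; a self-contained version needs concrete types and a visited set so that cycles in the graph terminate. The sketch below uses hypothetical stand-in structs for Block and Edge (plain fields, not the ParseAPI classes) and a helper named reachable that is not part of any library.

```cpp
#include <cassert>
#include <set>
#include <stack>
#include <vector>

struct Block;

// Hypothetical stand-ins for the CFG objects on the previous slide.
enum class EdgeType { Fallthrough, Branch, Call };
struct Edge {
    Block *src;
    Block *trg;
    EdgeType type;
};
struct Block {
    unsigned long start;
    std::vector<Edge *> out;  // outgoing edges
};

// Worklist walk from entry; returns every block reached, each visited once.
std::set<Block *> reachable(Block *entry) {
    std::set<Block *> seen;
    std::stack<Block *> work;
    work.push(entry);
    while (!work.empty()) {
        Block *b = work.top();
        work.pop();
        if (!seen.insert(b).second)
            continue;  // already visited; this is what makes loops terminate
        /* do something with b */
        for (Edge *e : b->out)
            work.push(e->trg);
    }
    return seen;
}
```

Because the walk follows every outgoing edge, including calls, it crosses function boundaries freely.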
Edge predicates
Walking a control flow graph:
    while(!work.empty()) {
        Block *b = work.pop();
        /* do something with b */
        IntraProc pred;
        edgeiter eit = b->out().begin(&pred);
        while(eit != b->out().end()) {
            work.push((*eit++)->trg());  /* follow the edge to its target block */
        }
    }
Edge Predicates
  Tell iterator whether Edge argument should be returned
  Composable (and, or)
  Examples:
    Intraprocedural
    Single function context
    Direct branches only
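One way to picture composable predicates is as plain boolean functions over edges, combined with "and" and "or" helpers. The sketch below swaps the slide's predicate objects for std::function to stay short; the names EdgePredicate, both, either, intraProc, directOnly, and filteredOut, as well as the EdgeType values, are hypothetical and do not come from the ParseAPI.

```cpp
#include <cassert>
#include <functional>
#include <vector>

struct Block;

// Hypothetical edge kinds for illustration.
enum class EdgeType { Fallthrough, Direct, Indirect, Call, Return };
struct Edge {
    Block *src;
    Block *trg;
    EdgeType type;
};
struct Block {
    unsigned long start;
    std::vector<Edge *> out;
};

// A predicate tells the iteration whether an edge should be returned.
using EdgePredicate = std::function<bool(const Edge &)>;

// Composition: "and" and "or" of two predicates.
EdgePredicate both(EdgePredicate p, EdgePredicate q) {
    return [p, q](const Edge &e) { return p(e) && q(e); };
}
EdgePredicate either(EdgePredicate p, EdgePredicate q) {
    return [p, q](const Edge &e) { return p(e) || q(e); };
}

// Intraprocedural: skip edges that enter or leave a function.
bool intraProc(const Edge &e) {
    return e.type != EdgeType::Call && e.type != EdgeType::Return;
}

// Direct branches only.
bool directOnly(const Edge &e) { return e.type == EdgeType::Direct; }

// Stand-in for a filtering iterator: the out edges of b accepted by pred.
std::vector<Edge *> filteredOut(const Block &b, const EdgePredicate &pred) {
    std::vector<Edge *> keep;
    for (Edge *e : b.out)
        if (pred(*e)) keep.push_back(e);
    return keep;
}
```

Keeping the filter outside the walk means the traversal loop stays the same whichever view of the graph a tool wants.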
Extensible CFG objects
  ParseAPI Function: simple, only needs to represent the control flow graph
  Dyninst image_func (derived from Function): complex, handles instrumentation, liveness, relocation, etc.
Special callback points during parsing:
  parse parse parse -> unresBranchNotify(insn) [derived class does stuff] -> parse parse parse
Factory interface for CFG objects:
  parser -> custom factory -> mkfunc() returns (Function*) image_func
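The factory and callback mechanism can be sketched as follows. Only the names Function, image_func, mkfunc, and unresBranchNotify come from the slide; the Factory, ImageFactory, ParseCallback, RecordingCallback, and Parser classes, all signatures, and the use of std::unique_ptr in place of the slide's raw Function* are assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <memory>
#include <vector>

using Address = unsigned long;  // assumption for this sketch

// Minimal CFG object: all the parser itself needs.
struct Function {
    Address entry;
    explicit Function(Address a) : entry(a) {}
    virtual ~Function() = default;
};

// Tool-side object with extra state (stand-in for Dyninst's image_func).
struct image_func : Function {
    bool instrumentable = true;
    explicit image_func(Address a) : Function(a) {}
};

// The parser creates CFG objects only through a factory.
struct Factory {
    virtual ~Factory() = default;
    virtual std::unique_ptr<Function> mkfunc(Address a) {
        return std::unique_ptr<Function>(new Function(a));
    }
};

// Custom factory: hands the parser image_func objects as Function pointers.
struct ImageFactory : Factory {
    std::unique_ptr<Function> mkfunc(Address a) override {
        return std::unique_ptr<Function>(new image_func(a));
    }
};

// Callback point: the default does nothing, a derived class "does stuff".
struct ParseCallback {
    virtual ~ParseCallback() = default;
    virtual void unresBranchNotify(Address) {}
};

struct RecordingCallback : ParseCallback {
    std::vector<Address> unresolved;
    void unresBranchNotify(Address insn) override { unresolved.push_back(insn); }
};

// Hypothetical parser skeleton; the inputs stand in for real parsing work.
struct Parser {
    Factory &factory;
    ParseCallback &cb;
    std::vector<std::unique_ptr<Function>> funcs;

    Parser(Factory &f, ParseCallback &c) : factory(f), cb(c) {}

    void parse(const std::vector<Address> &entries,
               const std::vector<Address> &unresolvedBranches) {
        for (Address a : entries)
            funcs.push_back(factory.mkfunc(a));  // parser never names image_func
        for (Address a : unresolvedBranches)
            cb.unresBranchNotify(a);  // notify the tool mid-parse
    }
};
```

The parser stays ignorant of the tool's richer type, while the tool gets back exactly the objects it asked the factory to build.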
What’s in the box?*
  Binary Parser
    recursive descent parsing
    speculative gap parsing
    cross platform: x86, x86-64, PPC, IA64, SPARC
  Control Flow Graph Representation
    graph interface
    extensible objects for easy tool integration
    exports Dyninst InstructionAPI interface
  SymtabAPI-based Code Source
    cross-platform
    supports ELF, PE, XCOFF formats
* box to be released soon
Status
  conception, code refactoring, interface design
  Dyninst re-integration (major test case)
  other major test case: compiler provenance (come tomorrow!)