The Deconstruction of Dyninst: Experiences and Future Directions. Drew Bernat, Madhavi Krishnan, Bill Williams, Bart Miller. Paradyn Project.


Why components? Share tools. Build new tools quickly.

Share Tools

Dyninst Components: Dataflow API, Patch API, Instruction API, Parse API, Stackwalker API, ProcControl API, CodeGen API, Symtab API, DynC API, DyninstAPI.

Dyninst Component Users. Components shown: StackwalkerAPI, SymtabAPI.

Build New Tools Quickly: Dataflow Analysis. PowerPC jump tables and return instruction detection; malware return address tampering; behavior-preserving relocation.

Build New Tools Quickly: Binary Rewriter. Component shown: SymtabAPI.

Build New Tools Quickly: Unstrip. Placeholder labels in the stripped binary: targ8056f50, targ805c3bd0, targ805ee40, targ. Symbol table entries: getpid, kill. Components shown: ParseAPI, SymtabAPI.

Down The Memory Lane (July 2007). SymtabAPI: version 1.0. DynStackwalker: coming soon. InstructionAPI: proposed. BinInst: proposed.

Dyninst Components Timeline. Phases: Design and Implementation, Beta Release, First Release, Integration into Dyninst. Components: SymtabAPI, StackwalkerAPI, InstructionAPI, ParseAPI, PatchAPI, ProcControlAPI, DataflowAPI, DynC API.

Componentization: Design Decisions. Define the scope of the component. CFG elements shown: Block, Edge, Function, cached register liveness info, instrumentability, InstPoints, divided between the Dyninst CFG model and the ParseAPI CFG model.

Componentization: Design Decisions. Balance internal and external user requirements. Component shown: StackwalkerAPI.

Componentization: Design Decisions. Refine requirements. Component shown: PatchAPI.

Componentization: Design Decisions. Create the right level of abstraction. Component shown: SymtabAPI.

Componentization: Design Decisions. Design extensible and adaptable interfaces. Stack frame steppers: standard frame, debug frame, signal frame. Components shown: StackwalkerAPI, PatchAPI, ParseAPI.
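The stack frame stepper idea on this slide can be illustrated with a small sketch. This is a toy model in Python, not the StackwalkerAPI interface: the class names, the dictionary used as memory, and the frame layouts are all assumptions made for illustration. It shows only the design point, that a walker can ask a list of pluggable steppers to unwind one frame and a new frame type is supported by adding one more stepper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Frame:
    pc: int  # program counter of the frame
    fp: int  # frame pointer of the frame


class StandardFrameStepper:
    """Assumed layout: memory[fp] is the saved fp, memory[fp + 1] the return address."""

    def step(self, frame: Frame, memory: Dict[int, int]) -> Optional[Frame]:
        if frame.fp not in memory or frame.fp + 1 not in memory:
            return None  # this stepper cannot unwind the frame
        return Frame(pc=memory[frame.fp + 1], fp=memory[frame.fp])


class SignalFrameStepper:
    """Assumed layout: a saved context (pc, fp) sits at fp + 2 and fp + 3."""

    def __init__(self, trampoline_pcs: Set[int]):
        self.trampoline_pcs = trampoline_pcs

    def step(self, frame: Frame, memory: Dict[int, int]) -> Optional[Frame]:
        if frame.pc not in self.trampoline_pcs:
            return None  # only handles frames that sit in a signal trampoline
        return Frame(pc=memory[frame.fp + 2], fp=memory[frame.fp + 3])


class Walker:
    """Tries each registered stepper in order until one can unwind the frame."""

    def __init__(self, steppers):
        self.steppers = list(steppers)

    def walk(self, top: Frame, memory: Dict[int, int], max_depth: int = 64) -> List[Frame]:
        frames = [top]
        while len(frames) < max_depth:
            caller = None
            for stepper in self.steppers:
                caller = stepper.step(frames[-1], memory)
                if caller is not None:
                    break
            if caller is None or caller.pc == 0:  # pc 0 marks the stack bottom here
                break
            frames.append(caller)
        return frames


# Toy stack: a normal frame on top of a signal-handler frame on top of main.
memory = {100: 200, 101: 0x20, 202: 0x10, 203: 300, 300: 0, 301: 0}
top = Frame(pc=0x30, fp=100)
```

With only the standard stepper the walk stops at the signal frame; adding the signal stepper ahead of it lets the walk continue, without changing the walker itself.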

Componentization: Design Decisions. Plan for reintegration. Components shown: StackwalkerAPI, ProcControlAPI.

Ongoing Research

Ongoing Research. Lightweight, Self-Propelled Instrumentation (Wenbin Fang); Binary Editing (Andrew Bernat); Malware Analysis and Instrumentation (Kevin Roundy); Binary Provenance and Authorship (Nate Rosenblum); Instrumenting Virtualized Environments (Emily Jacobson).

Lightweight Instrumentation. Analyze intermittent bugs and fine-grained performance problems. Autonomy; little perturbation; high level of detail; rapid activation; ability to analyze black-box systems; user level and kernel level.

Self-Propelled Instrumentation. Diagram elements: User Mutator, Snippet. Component shown: PatchAPI.

How it Works. Diagram elements: Instrumenter.so, Process, Snippet. Components shown: PatchAPI, ProcControlAPI. Process code: void foo() { bar() } void bar() { baz() }
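A rough simulation of the propagation idea, assuming that a snippet running at a function's entry instruments that function's callees before control reaches them. The slide itself shows only the foo, bar, baz chain and the Instrumenter.so library; the call-graph dictionary, the function names main and qux, and the trace list below are hypothetical stand-ins, and no real code is patched.

```python
from typing import Dict, List, Set, Tuple


def run(program: Dict[str, List[str]], entry: str, seed: str) -> Tuple[List[str], Set[str]]:
    """Execute a toy acyclic call graph with self-propelling instrumentation.

    program maps each function name to the functions it calls, in order.
    seed is the one function instrumented before execution starts.
    Returns the trace recorded by the snippets and the set of functions
    that ended up instrumented.
    """
    instrumented = {seed}
    trace: List[str] = []

    def call(fn: str) -> None:
        if fn in instrumented:
            trace.append(fn)  # the snippet's payload: record the event
            for callee in program[fn]:
                instrumented.add(callee)  # propagate ahead of control flow
        for callee in program[fn]:
            call(callee)

    call(entry)
    return trace, instrumented


# foo -> bar -> baz as on the slide; main and qux are added for contrast.
program = {
    "main": ["foo", "qux"],
    "foo": ["bar"],
    "bar": ["baz"],
    "baz": [],
    "qux": [],
}
trace, instrumented = run(program, entry="main", seed="foo")
```

Only the chain reachable from the seed is traced; main and qux run uninstrumented, which is one way to read the "little perturbation" goal from the earlier slide.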

Binary Instrumentation. Components shown: PatchAPI, ParseAPI.

Binary Editing. Insert error checking and handling; predicate switching; dynamic patching; code surgery.

Malware Analysis and Instrumentation. Unpacking code; overwriting code; self-checksumming; address space sensitive.

SR-Dyninst. Parse reachable code; catch exceptions; dynamic code discovery; overcome sensitivity. Components shown: ParseAPI, PatchAPI, ProcControlAPI, DataflowAPI.

CFG of Conficker A

Why Provenance? Forensics; debugging remote deployments. Library list shown: linux-vdso.so.1, libpthread.so.0, libasound.so.2, libdl.so.2, libstdc++.so.6, libm.so.6, libgcc_s.so.1, libc.so.6, /lib64/ld-linux-x86-64.so.2, librt.so.1.

I C++ Binary Provenance and Authorship

Provenance System Overview. Diagram elements: training data, binary analysis tool (ParseAPI), learning framework, provenance model, program.

Provenance Evaluation. Accuracy (Acc.): Language .999; Compiler .998; Optimization (LO/HI) .993; Version. Data set: programs x 2,686 binaries, 955k functions.

Instrumenting Virtualized Environments

Status Update

Dyninst. Major new features: new platforms for the binary rewriter (x86 and x86_64: statically linked binaries; ppc32 and BlueGene/P: dynamically linked binaries); improvements to parsing speed; reductions in memory usage. Deprecated Solaris and IA64 platforms. AIX pending due to support difficulties.

Component Status Update.
SymtabAPI: speed and space optimizations.
InstructionAPI: PowerPC (ppc32, ppc64) platform; full integration with Dyninst.
ParseAPI: platform-independent API for parsing binaries; control flow graph representation; interprocedural edges (call and return); built on InstructionAPI and SymtabAPI; full integration with Dyninst.
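To make "control flow graph representation" and "interprocedural edges (call and return)" concrete, here is a minimal sketch of control-flow traversal parsing over a made-up instruction set. It does not use ParseAPI; the opcode names, the one-slot-per-instruction addressing, and the edge labels are assumptions for illustration, and each instruction is treated as its own node rather than being grouped into basic blocks.

```python
from collections import deque
from typing import Dict, Optional, Set, Tuple

Instr = Tuple[str, Optional[int]]  # (opcode, branch or call target)
Edge = Tuple[int, int, str]        # (source address, target address, edge type)


def parse(code: Dict[int, Instr], entry: int):
    """Follow control flow from entry, discovering functions and edges."""
    functions: Dict[int, Set[int]] = {}  # function entry -> instruction addresses
    returns: Dict[int, Set[int]] = {}    # function entry -> addresses of its rets
    edges: Set[Edge] = set()
    calls = []                           # (call site, callee entry)
    pending = deque([entry])
    while pending:
        func = pending.popleft()
        if func in functions:
            continue
        body, rets, work = set(), set(), [func]
        while work:
            addr = work.pop()
            if addr in body or addr not in code:
                continue
            body.add(addr)
            op, target = code[addr]
            if op == "ret":
                rets.add(addr)
            elif op == "jmp":
                edges.add((addr, target, "jump"))
                work.append(target)
            elif op == "br":  # conditional branch: two successors
                edges.add((addr, target, "taken"))
                edges.add((addr, addr + 1, "fallthrough"))
                work += [target, addr + 1]
            elif op == "call":
                edges.add((addr, target, "call"))
                edges.add((addr, addr + 1, "call_fallthrough"))
                calls.append((addr, target))
                pending.append(target)  # the callee is parsed as its own function
                work.append(addr + 1)
            else:
                edges.add((addr, addr + 1, "fallthrough"))
                work.append(addr + 1)
        functions[func], returns[func] = body, rets
    # Return edges: each ret in a callee goes back to every call site's successor.
    for site, callee in calls:
        for ret_addr in returns.get(callee, ()):
            edges.add((ret_addr, site + 1, "return"))
    return functions, edges


code = {
    0: ("call", 10), 1: ("br", 3), 2: ("nop", None), 3: ("ret", None),
    4: ("nop", None),  # follows a ret and is never reached by traversal
    10: ("nop", None), 11: ("ret", None),
}
functions, edges = parse(code, 0)
```

The instruction at address 4 never appears in any function because nothing transfers control to it, which is the basic property of traversal-based parsing as opposed to a linear sweep.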

Component Status Update.
StackwalkerAPI 2.1: significant reduction in memory usage.
ProcControlAPI: platform-independent interface for creating, monitoring, and controlling processes; high-level abstraction for process control, breakpoints, and callbacks for process events.
DynC API: instrumentation language for specifying snippets; C-like instrumentation snippets for easier-to-write, more legible mutators; handles creation and destruction of snippet-local variables.

Dyninst 8.0. ProcControl API: Windows and BlueGene. Stackwalker API: Windows and VxWorks. Stackwalker and ProcControl integration into Dyninst. PatchAPI and its integration into Dyninst. SR-Dyninst for tamper-resistant and obfuscated binaries. New platforms for the binary rewriter: dynamically linked binaries on ppc64 and Windows; statically linked binaries on ppc32 and BlueGene/P. Dataflow API official release.

MRNet. Support for loading several filters from the same library. Lightweight MRNet back-end support for non-blocking receives. CrayXT support for staging files using the ALPS tool helper. Improved build structure that permits configuration for multiple platforms from a single source distribution. Numerous bug fixes and enhancements.

Software and Manuals. Dyninst 7.0.1, MRNet 3.0.1: available now! Downloads: Dyninst 8.0 – 4th quarter, 2011. MRNet – coming soon!

New Environments. Virtual machines: whole-system profiling (guest + VMM) using instrumentation; VMM-level information to understand how and why an application's performance is affected by the virtualized environment; expand performance profiling in the virtualized environment, where traditional approaches do not work or may not be sufficient. Mobile environments – VxWorks, ARM GPUs.

Questions

Unstrip: Semantic Descriptors (from unstrip: Restoring Function Information to Stripped Binaries). We take a semantic approach: record information that is likely to be invariant across multiple versions of the function. Example wrapper code: mov %ebx,%edx; mov $0x66,%eax; mov $0x5,%ebx; lea 0x4(%esp),%ecx; int $0x80; mov %edx,%ebx; cmp $0xffffff83,%eax; jae; ret; mov %esi,%esi. Shown again with the descriptor { }, 5: int $0x80; mov $0x66,%eax; mov $0x5,%ebx.

unstrip: Identifying Functions in a Stripped Binary (from unstrip: Restoring Function Information to Stripped Binaries). Inputs and outputs: stripped binary, descriptor database, unstripped binary. For each wrapper function: (1) build the semantic descriptor; (2) search the database for a match (two stages); (3) add the label to the symbol table.
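The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the unstrip implementation: the descriptor is reduced to the set of (system call number, constant first argument) pairs seen at each int $0x80, and the two matching stages are assumed to be an exact match followed by a best-overlap fallback, since the slide does not say what the stages are. The instruction tuples, addresses, and database contents are hypothetical; the system call numbers are the 32-bit Linux ones (0x66 is socketcall, whose sub-call 5 is accept; 20 is getpid; 37 is kill).

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Instr = Tuple[str, object, Optional[str]]  # (opcode, source, destination register)
Descriptor = FrozenSet[Tuple[Optional[int], Optional[int]]]


def build_descriptor(instructions: List[Instr]) -> Descriptor:
    """Step 1: collect (eax, ebx) constants that are live at each int $0x80."""
    regs: Dict[str, Optional[int]] = {}
    found = set()
    for op, src, dst in instructions:
        if op == "mov" and isinstance(src, int):
            regs[dst] = src            # constant loaded into a register
        elif op == "mov":
            regs[dst] = regs.get(src)  # register copy; None means unknown
        elif op in ("lea", "add", "sub", "xor"):
            regs.pop(dst, None)        # destination no longer a known constant
        elif op == "int" and src == 0x80:
            found.add((regs.get("eax"), regs.get("ebx")))
    return frozenset(found)


def match(descriptor: Descriptor, database: Dict[str, Descriptor],
          threshold: float = 0.5) -> Optional[str]:
    """Step 2: exact match first, then the best Jaccard overlap (assumed stages)."""
    if not descriptor:
        return None
    for name, known in database.items():
        if known == descriptor:
            return name
    best_name, best_score = None, 0.0
    for name, known in database.items():
        score = len(descriptor & known) / len(descriptor | known)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


def unstrip(functions: Dict[int, List[Instr]],
            database: Dict[str, Descriptor]) -> Dict[int, str]:
    """Step 3: label every function whose descriptor matches a database entry."""
    symbols = {}
    for addr, instructions in functions.items():
        name = match(build_descriptor(instructions), database)
        if name is not None:
            symbols[addr] = name
    return symbols


database = {
    "accept": frozenset({(0x66, 5)}),
    "getpid": frozenset({(20, None)}),
    "kill": frozenset({(37, None)}),
}
stripped = {
    0x1000: [("mov", "ebx", "edx"), ("mov", 0x66, "eax"), ("mov", 5, "ebx"),
             ("lea", "0x4(%esp)", "ecx"), ("int", 0x80, None)],
    0x2000: [("mov", 20, "eax"), ("int", 0x80, None)],
    0x3000: [("mov", 1, "eax"), ("add", 2, "eax")],  # no system call: left unlabeled
}
symbols = unstrip(stripped, database)
```

The first function mirrors the instruction sequence on the semantic descriptor slide; the values kept in the descriptor are the ones that stay the same when the surrounding code changes between library versions.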

Performance: Capturing Fine-grained Behavior (from Introduction to the PatchAPI). Dyninst (3rd-party instrumentation): a user mutator uses DyninstAPI and PatchAPI to find points, insert snippets, and delete snippets in a process containing void foo () { } void bar () { } void baz () { }. Self-propelled instrumentation (1st-party instrumentation): Instrumenter.so, PatchAPI, and the snippet are inside the process, which contains void foo () { bar() } void bar () { baz() } void baz () { }.

New Component: PatchAPI (from Introduction to the PatchAPI). Diagram elements: Address Space, Snippet, CFG Parsing, Instrumentation Engine, Plugin Interface, Public Interface; labels: Dyninst, Internal, PatchAPI.

Dyninst is a toolbox for analysts. The analysis tool (mutator) specifies instrumentation, gets callbacks for runtime events, and builds high-level analysis. Dyninst takes a program binary and provides a control flow analyzer, an instrumenter, and a data flow analyzer over the CFG. Capabilities: loop, block, function, and instruction instrumentation; function replacement; call stack walking; forward and backward slices; loop analysis; process control; library injection; symbol table reading and writing; binary rewriting; machine language parsing.

What could we do because of components? SymtabAPI and StackwalkerAPI; DyninstAPI instrumentor; ROSE semantics engine. Tools we developed quickly: binary rewriter, unstrip.

Componentization Trade-offs. Internal requirements vs. external requirements. Early feedback vs. interface stability. Development time vs. scope. Structured vs. organic. Lessons learned: keep the project details where they belong; change code incrementally; test new interfaces.

Binary rewriter. Steps: read the binary file format from disk; parse the binary code and build the CFG; generate code for instrumentation; patch the code; emit the new binary file. Components shown: SymtabAPI, PatchAPI, DyninstAPI, ParseAPI.
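The five steps can be lined up as a pipeline. The sketch below is a toy, not the Dyninst rewriter: the "binary" is a JSON document with a symbol table and a list of instruction strings, the snippet is a made-up count instruction, and patching inserts code in place and shifts later symbols, ignoring the branch-target and relocation problems a real rewriter has to solve. All function names are hypothetical.

```python
import json
from typing import Dict, List, Tuple


def read_binary(blob: str) -> dict:
    """Step 1: read the (toy) file format."""
    return json.loads(blob)


def parse_code(binary: dict) -> Dict[str, Tuple[int, int]]:
    """Step 2: map each function to its [start, end) instruction range."""
    starts = sorted(binary["symbols"].items(), key=lambda item: item[1])
    ends = [offset for _, offset in starts[1:]] + [len(binary["code"])]
    return {name: (start, end) for (name, start), end in zip(starts, ends)}


def generate_snippet(name: str) -> List[str]:
    """Step 3: generate instrumentation code (a made-up counter instruction)."""
    return [f"count {name}"]


def patch(binary: dict, ranges: Dict[str, Tuple[int, int]], targets: List[str]) -> dict:
    """Step 4: insert a snippet at each target's entry and fix up the symbols."""
    code = list(binary["code"])
    symbols = dict(binary["symbols"])
    # Work from the highest entry down so lower offsets are still valid.
    for name in sorted(targets, key=lambda n: ranges[n][0], reverse=True):
        entry = ranges[name][0]
        snippet = generate_snippet(name)
        code[entry:entry] = snippet
        symbols = {other: offset + len(snippet) if offset > entry else offset
                   for other, offset in symbols.items()}
    return {"symbols": symbols, "code": code}


def emit(binary: dict) -> str:
    """Step 5: emit the new file."""
    return json.dumps(binary, sort_keys=True)


def rewrite(blob: str, targets: List[str]) -> str:
    binary = read_binary(blob)
    ranges = parse_code(binary)
    return emit(patch(binary, ranges, targets))


original = json.dumps({
    "symbols": {"foo": 0, "bar": 2},
    "code": ["call bar", "ret", "nop", "ret"],
})
rewritten = json.loads(rewrite(original, ["foo", "bar"]))
```

Even in this toy, the patch step has to update the symbol table so that bar still points at its own entry after foo grows, which hints at why the real pipeline needs the symbol, parsing, and patching components to cooperate.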

Binary rewriter. Components shown: SymtabAPI, ParseAPI, PatchAPI, StackwalkerAPI, ProcControlAPI, DataflowAPI.

Componentization: Design Decisions. Define the scope of the component. Balance internal and external user requirements. Refine the assumptions. Create the right level of abstraction. Build from scratch or improve existing code. Early feedback vs. interface stability. Users shown: Dyninst, Paradyn. Components shown: SymtabAPI, ProcControlAPI, InstructionAPI, StackwalkerAPI, PatchAPI.

Dyninst and the components. Components: DyninstAPI, Patch API, DynC API, Symtab API, Instruction API, Parse API, Dataflow API, Stackwalker API, ProcControl API, CodeGen API. Data shown: AST, Binary, Process.