1 Dimension: An Instrumentation Tool for Virtual Execution Environments
Jing Yang, Shukang Zhou and Mary Lou Soffa
Department of Computer Science, University of Virginia
VEE 2006, June 16th, 2006, Ottawa, Canada
2 Motivation
Increasing usage of VEEs in many areas
–Performance [Bala et al, PLDI ’00]
–Security [Scott et al, ACSAC ’02]
–Power consumption [Hazelwood et al, ISLPED ’04]
Increasing importance of instrumentation for VEEs
–Requested by both developers and users
Challenge: building instrumentation for VEEs
When to add instrumentation
–Instrumentation is added when a VEE is built
  Repetitive, time-consuming work
  Only for some preplanned purposes
–Instrumentation is added after a VEE is built
  A standalone instrumentation system that different VEEs can use for different purposes – even harder
3 Translation-based VEEs
We focus on translation-based VEEs
–Dynamically translate source binary to target binary
–Target binary is stored in a code cache for execution
–Handle two binaries simultaneously
  Input source binary
  Dynamically generated target binary
Instrumentation for translation-based VEEs
–Performed on both the source binary and the target binary
–A form of binary instrumentation
4 Dimension
Flexibility – plug and play
–Minimal modification to a VEE to use Dimension
–Minimal reconfiguration of Dimension (architecture, language)
Comprehensiveness
–Able to instrument both the source binary and the target binary
–Instrumentation can be done at various levels of granularity
Ease of Use
–Simple user specification for instrumentation
Efficiency
–Reasonable instrumentation overhead
5 Relationship between VEEs and Instrumentation
[Figure: three ways to combine a VEE with instrumentation, each running an application on top of OS + Hardware]
(a) VEE with instrumentation built in: significant modification, hard to reuse
(b) VEE and instrumentation as separate layers: unnecessary translations, unnecessary context-switches
(c) Instrumentation alongside the VEE: easy to reuse, lightweight modification, one translation and context-switch
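6 Scenario
[Figure: Dimension attached to a VEE that is written in Java and translates IA-32 to MIPS. Labels: Binary-Editing Utility Library (IA-32, MIPS, ……), Stub Functions Library (Java, ……), VEE (Java), Dimension (C), and the three interface calls Initialize, Instrument, Finalize.]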
7 When to Add Instrumentation
[Figure: a VEE (Initializer, Dispatcher, Translator, Code Cache, Finalizer) running an application, with Dimension attached. Labels: Instrumentation Unit, Translation Unit.]
Probe-based technique
Instrument source binary via corresponding target binary
Clear interfaces between Dimension and VEE
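8 Probe-Based Technique for Variable-length ISA
[Figure: original code, instrumented code and trampoline]
Original code (the instrumentation unit):
  01 D8      add eax, ebx
  29 D8      sub eax, edx
  83 C0 12   add eax, 0x12
  ……
Instrumented code: a JMP to the trampoline replaces the instrumentation unit
Trampoline:
  01 D8      add eax, ebx
  Save Context, Set Up Parameters, Call Analysis Routines, Restore Context
  29 D8      sub eax, edx
  83 C0 12   add eax, 0x12
  Save Context, Set Up Parameters, Call Analysis Routines, Restore Context
Analysis Routine: called from the trampoline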
9 Components and Interfaces
[Figure: VEE components (Initializer, Translator, Finalizer, Dispatcher, Code Cache) connected to Dimension components (Initialization Assistant, Instrumentation Assistant, Finalization Assistant, Instrumentation Repository, Instrumenter, Auxiliary Code Cache)]
Interface functions:
  void InitDimension();
  void StartInstrumentation(addr src_start, addr src_end, addr tgt_start, addr tgt_end, src_to_tgt_mapping map, bb_info bb);
  void FinalizeDimension();
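The three interface functions suggest a simple calling pattern from the VEE side. The sketch below is only an illustration of that pattern: the type definitions for `addr`, `src_to_tgt_mapping` and `bb_info`, the stub function bodies, the helper `vee_run`, and all addresses are assumptions made so the example compiles on its own; the slides name the types but do not define them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: the slides name these types but do not define them. */
typedef uintptr_t addr;
typedef struct { addr src; addr tgt; } map_entry;
typedef struct { const map_entry *entries; size_t count; } src_to_tgt_mapping;
typedef struct { const addr *leaders; size_t count; } bb_info;

/* Stub bodies so the sketch is self-contained; the real Dimension
   library would supply these three functions. */
static int dim_ready = 0;
static size_t units_seen = 0;

void InitDimension(void) { dim_ready = 1; units_seen = 0; }

void StartInstrumentation(addr src_start, addr src_end,
                          addr tgt_start, addr tgt_end,
                          src_to_tgt_mapping map, bb_info bb) {
    assert(dim_ready);                       /* must follow InitDimension */
    assert(src_start <= src_end && tgt_start <= tgt_end);
    (void)map;
    (void)bb;
    units_seen++;                            /* a real tool would plant probes here */
}

void FinalizeDimension(void) { dim_ready = 0; }

/* Hypothetical VEE skeleton: one StartInstrumentation call per
   freshly translated unit, bracketed by Init and Finalize. */
size_t vee_run(size_t n_units) {
    InitDimension();                         /* called from the VEE initializer */
    for (size_t i = 0; i < n_units; i++) {
        addr src = 0x1000 + 0x40 * i;        /* made-up source range */
        addr tgt = 0x9000 + 0x80 * i;        /* made-up code-cache range */
        map_entry m = { src, tgt };
        src_to_tgt_mapping map = { &m, 1 };
        bb_info bb = { &src, 1 };
        StartInstrumentation(src, src + 0x40, tgt, tgt + 0x80, map, bb);
    }
    size_t seen = units_seen;
    FinalizeDimension();                     /* called from the VEE finalizer */
    return seen;
}
```

Calling `vee_run(3)` walks through three translated units. In a real VEE the `StartInstrumentation` call would presumably sit in the translator, right after a unit is placed in the code cache.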
10 Instrumentation Algorithms
[Figure: Dimension's components (Instrumentation Repository, Initialization Assistant, Finalization Assistant, Instrumentation Assistant, Instrumenter, Auxiliary Code Cache), the inputs (Instrumentation Specification, Basic Block Information, Source-to-Target Mapping, Source Binary, Target Binary), and the stages Plan 1, Plan 2, Opt Plan and Trampoline]
12 Optimizing Instrumentation
Instrumentation overhead and optimizations
–Execute the jump which branches to the trampoline
  Probe-coalescing [Kumar et al, PASTE ’05]
  Parameters should remain available if coalesced
–Perform the context-switch
  Partial context-switch
  Registers in most platforms
–Transfer control to analysis routines
  Analysis routine inlining
  Only inline short ones to avoid code expansion
–Execute analysis routines
  Lightweight binary-to-binary optimization
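One plausible reading of the partial context-switch is that the trampoline saves and restores only the registers the analysis routine may overwrite, instead of the full register file. The sketch below illustrates that idea with a bitmask; the names `regmask`, `partial_save_set`, `save_count` and the choice of eight registers are assumptions for illustration, not Dimension's actual data structures.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_REGS = 8 };   /* assumed: e.g. the eight IA-32 general-purpose registers */

/* Bit i set means register i is written by the analysis routine.
   Hypothetical summary; a real tool would derive it by scanning the routine. */
typedef uint32_t regmask;

/* Registers a partial context-switch would save and restore. */
regmask partial_save_set(regmask clobbered_by_analysis) {
    assert(NUM_REGS < 32);                   /* keeps the shift below well defined */
    regmask all = (1u << NUM_REGS) - 1u;
    return clobbered_by_analysis & all;      /* ignore bits beyond the register file */
}

/* Number of registers in a save set; a full context-switch saves NUM_REGS. */
int save_count(regmask m) {
    int n = 0;
    while (m) {
        m &= m - 1;                          /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

Under these assumptions, an analysis routine that writes only two registers would cost two saves and two restores per probe rather than eight of each.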
13 Case Study
Strata [Scott et al, CGO ’03]
–SPARC/Solaris
–Single-entry translation units
–Mainly one-to-one mapping from source binary to target binary, except for some control-transfer instructions
Jikes RVM [Arnold et al, OOPSLA ’02]
–IA-32/Linux
–Multiple-entry translation units – basic block information provided
–Mapping from bytecode to machine code is maintained
Interface insertion points are easily located
14 Scenario
[Figure: the scenario for Strata, which translates SPARC to SPARC. Labels: Binary-Editing Utility Library (SPARC selected), Stub Functions Library, Strata (C), Dimension (C), and the three interface calls Initialize, Instrument, Finalize.]
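15 Scenario
[Figure: the scenario for Jikes RVM, which translates bytecode to IA-32. Labels: Binary-Editing Utility Library (Bytecode and IA-32 selected), Stub Functions Library (Java selected), Jikes RVM (Java), Dimension (C), and the three interface calls Initialize, Instrument, Finalize.]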
16 Evaluation
Experiments
–Effectiveness of optimizations
  Inlining, partial context-switch, probe coalescing
  Calculating the average number of integer-add instructions executed in each basic block
–Generality versus efficiency
  Dimension versus Jazz
  Branch coverage testing
–Comparison in traditional execution environments
  Strata-Dimension versus Valgrind, DynamoRIO and Pin
  Basic block counting
  The data for Valgrind, DynamoRIO and Pin is from [Luk et al, PLDI ’05]
17 Effectiveness of Optimizations
[Figure: target binary instrumentation for Strata; values labeled on the chart: 8.6x, 6.4x, 2.6x, 2.0x]
18 Effectiveness of Optimizations
[Figure: target binary instrumentation for Jikes RVM; values labeled on the chart: 2.4x, 2.1x, 1.4x, 1.1x]
19 Effectiveness of Optimizations
[Figure: source binary instrumentation for Jikes RVM; values labeled on the chart: 1.7x, 1.5x, 1.1x, 1.2x]
20 Evaluation
Experiments
–Effectiveness of optimizations
  Inlining, partial context-switch, probe coalescing
  Calculating the average number of integer-add instructions executed in each basic block
–Generality versus efficiency
  Dimension versus Jazz [Misurda et al, ICSE ’05]
  Branch coverage testing
–Comparison in traditional execution environments
  Strata-Dimension versus Valgrind, DynamoRIO and Pin
  Basic block counting
  The data for Valgrind, DynamoRIO and Pin is from [Luk et al, PLDI ’05]
21 Generality versus Efficiency
[Figure: comparison of slowdown from instrumentation between Jazz and Dimension]
22 Evaluation
Experiments
–Effectiveness of optimizations
  Inlining, partial context-switch, probe coalescing
  Calculating the average number of integer-add instructions executed in each basic block
–Generality versus efficiency
  Dimension versus Jazz
  Branch coverage testing
–Comparison in traditional execution environments
  Strata-Dimension versus three dynamic instrumentation systems
    Valgrind [Nethercote, Ph.D. thesis, Univ. of Cambridge, 2004]
    DynamoRIO [Bruening et al, CGO ’03]
    Pin [Luk et al, PLDI ’05]
  Basic block counting
  The data for Valgrind, DynamoRIO and Pin is from [Luk et al, PLDI ’05]
23 Comparison in Traditional Execution Environments
[Figure: comparison of slowdown from instrumentation in traditional execution environments; values labeled on the chart: 7.5x, 4.9x, 2.3x, 2.6x]
24 Related Work
Binary instrumentation systems developed for traditional execution environments
–Static instrumentation systems
  ATOM [Srivastava et al, PLDI ’94]
  Cannot handle a VEE’s target binary, which is generated on the fly
–Dynamic instrumentation systems
  DTrace [Cantrill et al, OSDI ’04], Pin [Luk et al, PLDI ’05]
  Cannot handle a VEE’s source binary if it is non-executable
Binary instrumentation systems designed for VEEs
–DynamoRIO [Bruening et al, CGO ’03]
–FIST [Kumar et al, WOSS ’04]
–Tightly bound to a specific VEE
–Cannot instrument both the source and target binaries
25 Conclusion
Dimension – first standalone instrumentation tool specially designed for VEEs
Easy for different VEEs to use
Generality does not impact efficiency
Reasonable instrumentation overhead compared to other systems
26 Instrumentation Specification
    #include <stdio.h>
    // EXPORT, FUNCPTR and the DIM_* and ARG_* names come from Dimension;
    // the slide does not show which header declares them.

    FILE *trace;

    // Declared up front because DIM_ProgramBegin refers to it
    void record_bb(void *addr);

    // Called when program begins
    EXPORT void DIM_ProgramBegin() {
        trace = fopen("trace.out", "w");
        DIM_InsertBBCall(SOURCE, ENTRY,
                         FUNCPTR(record_bb), ARG_BB_ADDR, ARG_END);
    }

    // Called when program ends
    EXPORT void DIM_ProgramEnd() {
        fclose(trace);
    }

    // Print a basic block record
    void record_bb(void *addr) {
        fprintf(trace, "%p\n", addr);
    }
27 Probe-Based Technique
Replace each instruction with a jump that branches to a trampoline, which is a code sequence that does the following:
–Execute the original instruction
–Perform a context-switch
–Prepare the parameters for the analysis routine
–Transfer control to the analysis routine
Problems with variable-length ISAs
–A jump is longer than the original instruction
  A jump replaces several instructions
  Each instrumentation unit should have a single entry at its top
–The instrumentation unit is shorter than the size of a jump
  Use a shorter but more expensive instruction instead of a jump
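The first variable-length problem above comes down to a small calculation: how many consecutive instructions must move into the trampoline before the jump fits in their place. The sketch below shows that calculation only. The function name `unit_insn_count`, the length-array representation, and the convention that `n_insns` covers just the instructions remaining in the current basic block (so the unit keeps a single entry at its top) are assumptions for illustration, not Dimension's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Returns how many instructions, starting at index `first`, must be moved
   into the trampoline so that a jump of `jump_size` bytes fits in their place.
   Returns 0 when the remaining instructions are too short for the jump,
   which corresponds to the "shorter but more expensive instruction" case. */
size_t unit_insn_count(const unsigned char *insn_len, size_t n_insns,
                       size_t first, size_t jump_size) {
    assert(insn_len != NULL && jump_size > 0);
    size_t bytes = 0;
    for (size_t i = first; i < n_insns; i++) {
        bytes += insn_len[i];                /* grow the unit by one instruction */
        if (bytes >= jump_size)
            return i - first + 1;            /* the jump now fits */
    }
    return 0;                                /* ran out of instructions */
}
```

For the three-instruction IA-32 sequence on slide 8 (lengths 2, 2 and 3 bytes) and the 5-byte near jump with a 32-bit displacement, `unit_insn_count(lens, 3, 0, 5)` covers all three instructions, since the first two together occupy only 4 bytes.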
28 Reconfiguration
For new architectures that VEEs are executing on
–Binary-editing utility library
–Provides general binary-editing services to Dimension
For new languages used in VEE implementation
–Dimension is written in C and compiled as a shared object
–If a VEE is not implemented in C, stub functions are needed to call C functions, e.g., Java native interface
–Parameter wrapping in stub functions, e.g., Java
Dimension needs no direct modification
29 Future Work
Overcome the ISA and VEE restrictions
–Fixed-length ISA: limited offset of a jump
–Variable-length ISA: short instrumentation unit problem
–VEE: fragment patching
Determine the information on its own
–Basic block information and source-to-target mapping
Automatic reconfiguration
–Binary-editing utility library and stub functions
High-level context capture
–An arbitrary local variable in a Java bytecode method
30 Acknowledgements
This paper benefited from fruitful discussions with Naveen Kumar and Jonathan Misurda.
We also thank the anonymous reviewers for their useful suggestions and comments on how to improve the work.