Published by Herbert Neal. Modified over 9 years ago.
1
Dimension: An Instrumentation Tool for Virtual Execution Environments
Master’s Project Presentation by Jing Yang
Department of Computer Science, University of Virginia
Advisor: Prof. Mary Lou E. Soffa
2
Outline
- Background
- Motivation
- Design Decisions
- Dimension
- Implementation
- Case Study
- Experiments and Evaluation
- Limitations and Future Work
- Conclusion
3
Background – virtual execution environments (VEEs)
- A self-contained operating environment
  - Sits between the CPU and the application
  - Facilitates programmatic modification of an executing program
  - Used for diverse purposes: performance, security, architecture portability, power consumption, instrumentation, ...
- Translation-based VEEs (the focus of this project)
  - Use dynamic translation to produce high-quality code and to utilize resources efficiently
  - Handle two binaries simultaneously
    - Input source binary
    - Dynamically generated target binary
4
Background – instrumentation
- Insert extra code into a program for profiling, monitoring, and controlling execution
- Classification
  - Source instrumentation
  - Binary instrumentation
    - Static instrumentation: rewrite a program before its execution
    - Dynamic instrumentation: insert extra code on demand during program execution
6
Motivation – program analysis
- Program analysis is important for both the source binary and the target binary of a VEE
  - Users: source binary analysis
  - Developers: both source and target binary analysis
- Instrumentation is widely used for program analysis
  - Profiling
  - Monitoring
  - Controlling
- How can we do instrumentation in virtual execution environments?
7
Motivation – existing instrumentation systems
- Developed for traditional execution environments; not suitable for VEEs
  - Source instrumentation systems: no source code is available at all
  - Binary instrumentation systems
    - Static instrumentation systems: cannot handle the target binary, which is generated on the fly during program execution
    - Dynamic instrumentation systems: cannot handle the source binary, which is non-executable
- Research efforts expended on instrumentation in VEEs
  - Tightly bound to a specific VEE
  - Cannot instrument both the source and target binaries
8
Motivation – goal
- A standalone instrumentation system specially designed for VEEs
- Flexible portability to different VEEs
  - Easy to communicate with
  - Easy to reconfigure
- Able to instrument both the source and target binaries
10
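Design Decisions – paradigm (1)
[Figure: three candidate paradigms, each drawn as a software stack from top to bottom]
(a) Application / VEE with instrumentation / OS + Hardware
(b) Application / VEE / Instrumentation / OS + Hardware
(c) Application / VEE and Instrumentation side by side / OS + Hardware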
11
Design Decisions – paradigm (2)
- Paradigm (a): implement instrumentation inside a VEE
  - Intermixes code for virtual execution and code for instrumentation
  - Requires significant modification to the VEE
  - Code for instrumentation is difficult to separate and reuse
- Paradigm (b): implement the instrumentation system as another VEE
  - Unnecessary translations
  - Unnecessary context-switches
12
Design Decisions – paradigm (3)
- Paradigm (c): develop the instrumentation system as a separate tool
  - The separate tool can provide instrumentation services to various VEEs
  - Directly changes code in a VEE’s code cache
    - No extra translations
    - No extra context-switches
- We choose this one
13
Design Decisions – conceptual modules in a VEE
[Figure: the conceptual modules of a VEE (Initializer, Dispatcher, Translator, Code Cache, Finalizer), with a callout labeled "Instrument here"]
14
Design Decisions – probe-based technique (1)
- Goal: avoid interfering with a VEE’s code generation and code cache management mechanisms
- Replace a program’s instruction with a jump that branches to a trampoline
- A trampoline is a code sequence that
  - Performs a context-switch
  - Prepares the parameters to be passed to the instrumentation code
  - Transfers control to the instrumentation code
- The VEE needs to provide the locations of the binaries
  - The instructions are analyzed and replaced with jumps
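The probe step above can be sketched in C. This is a minimal illustration, not Dimension's code: it assumes an IA-32 style `jmp rel32` (opcode 0xE9 followed by a 32-bit little-endian displacement), and it writes into an ordinary byte buffer standing in for the code cache. The names `probe`, `install_probe`, and `remove_probe` are hypothetical. A real tool would also have to deal with memory protection and instruction-cache coherence.

```c
#include <stdint.h>
#include <string.h>

enum { JMP_REL32_SIZE = 5 };  /* opcode 0xE9 + 32-bit displacement */

/* Hypothetical record of one installed probe. */
typedef struct {
    uint8_t *site;                   /* where the jump was written */
    uint8_t  saved[JMP_REL32_SIZE];  /* original bytes, kept for relocation or removal */
} probe;

/* Overwrite JMP_REL32_SIZE bytes at `site` with "jmp trampoline".
 * Returns 0 on success, -1 if the trampoline is out of rel32 range. */
int install_probe(probe *p, uint8_t *site, const uint8_t *trampoline)
{
    /* The displacement is relative to the address following the jump. */
    int64_t disp = (int64_t)(intptr_t)trampoline
                 - ((int64_t)(intptr_t)site + JMP_REL32_SIZE);
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    uint32_t u = (uint32_t)(int32_t)disp;
    p->site = site;
    memcpy(p->saved, site, JMP_REL32_SIZE);   /* save the replaced bytes */

    site[0] = 0xE9;                           /* jmp rel32 */
    site[1] = (uint8_t)(u & 0xFF);            /* little-endian displacement */
    site[2] = (uint8_t)((u >> 8) & 0xFF);
    site[3] = (uint8_t)((u >> 16) & 0xFF);
    site[4] = (uint8_t)((u >> 24) & 0xFF);
    return 0;
}

/* Restore the original bytes at the probe site. */
void remove_probe(const probe *p)
{
    memcpy(p->site, p->saved, JMP_REL32_SIZE);
}
```

Saving the overwritten bytes matters because the trampoline (or a relocated copy of the instructions) must still execute the original code after the instrumentation code returns.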
15
Design Decisions – probe-based technique (2)
- Definitions
  - Translation unit: a piece of code translated by a VEE at a time (an instruction, a basic block, a super block, or a method)
  - Instrumentation unit: a piece of code instrumented at a time
- Two kinds of ISAs
  - Fixed-length ISAs
    - A translation unit can always be an instrumentation unit
  - Variable-length ISAs
    - Require single-entry instrumentation units
    - If translation units are multiple-entry, the VEE needs to provide the identification of basic blocks
16
Design Decisions – source binary instrumentation
- Instrumentation before translation
  - Extra translation overhead is paid for the instrumentation code
- Instrumentation after translation
  - Source binary instrumentation is achieved by instrumenting the corresponding target binary
  - The VEE needs to provide the source-to-target mapping
  - Avoids translation overhead, but has communication overhead
- We choose this one for simplicity
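A source-to-target lookup of the kind this approach relies on might look as follows. The slides do not show how the mapping is represented, so the table layout here is an assumption: a flat array of (source address, target address) pairs, where a source instruction that translates to several target instructions simply appears more than once. The type and function names echo the slides' pseudocode, but their definitions are hypothetical.

```c
#include <stddef.h>

typedef const char *addr;

/* Hypothetical mapping entry: one source instruction address and one
 * target instruction address generated from it. */
typedef struct {
    addr src;
    addr tgt;
} map_entry;

/* Hypothetical mapping table handed over by the VEE. */
typedef struct {
    const map_entry *entries;
    size_t count;
} src_to_tgt_mapping;

/* Return the first target address recorded for `src`, or NULL if the
 * source instruction has no counterpart in the target binary. */
addr map_tgt(const src_to_tgt_mapping *map, addr src)
{
    for (size_t i = 0; i < map->count; i++) {
        if (map->entries[i].src == src)
            return map->entries[i].tgt;
    }
    return NULL;
}
```

Placing the probe at the first mapped target instruction is one possible policy for one-to-many mappings. A NULL result covers source instructions that were optimized away.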
17
Design Decisions – instrumentation specification (1)
- User model is similar to ATOM and Pin
- Instrumentation routines: specify instrumentation policies
  - Where to place calls to analysis routines
  - What arguments are passed
  - Tasks to be performed at the beginning and the end of program execution
- Analysis routines: invoked when execution hits some program points
18
Design Decisions – instrumentation specification (2)

    #include <stdio.h>

    FILE *trace;

    // Analysis routine, defined below; declared first so that
    // DIM_ProgramBegin can refer to it
    void record_bb(void *addr);

    // Called when program begins
    EXPORT void DIM_ProgramBegin() {
        trace = fopen("trace.out", "w");
        // Request a call to record_bb at basic block entry in the
        // source binary, passing the basic block address
        DIM_InsertBBCall(SOURCE, ENTRY,
                         FUNCPTR(record_bb), ARG_BB_ADDR, ARG_END);
    }

    // Called when program ends
    EXPORT void DIM_ProgramEnd() {
        fclose(trace);
    }

    // Print a basic block record
    void record_bb(void *addr) {
        fprintf(trace, "%p\n", addr);
    }
20
Dimension – characteristics (1)
- Flexibility
  - Easy to modify a VEE to use Dimension
  - Easy to reconfigure to interface with VEEs running on different architectures and written in different languages
- Comprehensiveness
  - Can instrument both the source and target binaries
  - Instrumentation can be done at various levels of granularity, from instruction level to method level
21
Dimension – characteristics (2)
- Easy-to-use
  - Dimension is transparent to the instrumentation users
  - User model is similar to ATOM and Pin
- Efficiency
  - Instrumentation optimizations are applied automatically
  - Slowdown from instrumentation is reasonable compared to other instrumentation systems
22
Dimension – component organization (1)
[Figure: component diagram with two sides]
- VEE: Initializer, Dispatcher, Translator, Code Cache, Finalizer
- Dimension: Initialization Assistant, Instrumentation Assistant, Finalization Assistant, Instrumentation Repository, Instrumenter, Auxiliary Code Cache
23
Dimension – component organization (2)
- Components that interface with VEEs
  - Initialization assistant: triggered by a VEE’s initializer
    - Sets up Dimension at the beginning of program execution
    - Loads the instrumentation specification
  - Instrumentation assistant: triggered by a VEE’s translator
    - Receives code segments with relevant information
    - Prepares for actual instrumentation
  - Finalization assistant: triggered by a VEE’s finalizer
    - Cleans up Dimension at the end of program execution
    - Records instrumentation results
24
Dimension – component organization (3)
- Other components
  - Instrumenter
    - Instruments the code from the instrumentation assistant according to the instrumentation specification
    - The only architecture-dependent component
  - Auxiliary Code Cache
    - Stores instrumentation code
  - Instrumentation Repository
    - Maintains information used for instrumentation purposes
      - Instrumentation specification
      - Source-to-target mapping
25
Dimension – communication interfaces (1)
- Interface InitDimension
  - Called by a VEE’s initializer to trigger Dimension’s initialization assistant
  - No parameters

    void InitDimension();

- Interface FinalizeDimension
  - Called by a VEE’s finalizer to trigger Dimension’s finalization assistant
  - No parameters

    void FinalizeDimension();
26
Dimension – communication interfaces (2)
- Interface StartInstrumentation
  - Called by a VEE’s translator to trigger Dimension’s instrumentation assistant
  - Six parameters

    void StartInstrumentation(addr src_start, addr src_end,
                              addr tgt_start, addr tgt_end,
                              src_to_tgt_mapping map, bb_info bb);
27
Dimension – communication interfaces (3)
- Parameters for interface StartInstrumentation
  - (1) The start address and (2) the end address of the source binary code of the translation unit
    - Needed for source binary instrumentation
    - Always readily available
  - (3) The start address and (4) the end address of the target binary code of the translation unit
    - Always needed
    - Always readily available
28
Dimension – communication interfaces (4)
- Parameters for interface StartInstrumentation
  - (5) Source-to-target mapping information
    - Needed for source binary instrumentation
    - The mapping can be one-to-one, one-to-many, or many-to-one
    - VEE developers should add this functionality to support Dimension
    - Previous work has shown that exact mapping information from unoptimized code to optimized code can be obtained
  - (6) Identification of basic blocks in translation units
    - Needed when a VEE executes on a variable-length ISA and uses a multiple-entry translation unit
    - Already maintained by a VEE for virtual execution purposes (e.g., garbage collection)
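To illustrate how parameter (6) could be used, the sketch below splits one multiple-entry translation unit into single-entry units at the basic block starts reported by the VEE. The slides do not define `bb_info`, so its layout here (a sorted array of basic block start addresses) and the `unit` type are assumptions made only for this example. Each unit's `end` is taken to be one past its last byte.

```c
#include <stddef.h>

typedef const char *addr;

/* Hypothetical basic block information: start addresses in ascending order. */
typedef struct {
    const addr *starts;
    size_t count;
} bb_info;

/* One single-entry instrumentation unit, covering [start, end). */
typedef struct {
    addr start;
    addr end;
} unit;

/* Split [start, end) at every basic block start that falls strictly
 * inside the range. Writes at most max_out units and returns how many
 * were written. With no basic block information the whole range is
 * returned as a single unit. */
size_t partition_bb(addr start, addr end, const bb_info *bb,
                    unit *out, size_t max_out)
{
    size_t n = 0;
    addr cur = start;

    if (bb != NULL) {
        /* Keep one slot free for the final unit. */
        for (size_t i = 0; i < bb->count && n + 1 < max_out; i++) {
            addr s = bb->starts[i];
            if (s > cur && s < end) {
                out[n].start = cur;
                out[n].end = s;
                n++;
                cur = s;
            }
        }
    }
    if (n < max_out) {
        out[n].start = cur;
        out[n].end = end;
        n++;
    }
    return n;
}
```

Because every unit begins at a basic block start, no control transfer can land in the middle of a unit, which is the single-entry property that variable-length ISAs require.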
29
Dimension – mechanism for instrumenting target binary (1)

    StartInstrumentation(addr src_start, addr src_end,
                         addr tgt_start, addr tgt_end,
                         src_to_tgt_mapping map, bb_info bb) {

        // Keep the mapping for later source-to-target lookups
        store_mapping(map, repository);
        // Split the translation unit into basic blocks
        bbs[] = partition_bb(src_start, src_end, bb);

        foreach basic block b in bbs[] {
            <usrc_start, usrc_end> =
                get_bb_boundary(b, <src_start, src_end>);
            <utgt_start, utgt_end> =
                get_bb_boundary(b, <tgt_start, tgt_end>, map);
            InstrumentUnit(usrc_start, usrc_end,
                           utgt_start, utgt_end);
        }
    }
30
Dimension – mechanism for instrumenting target binary (2)

    InstrumentUnit(addr usrc_start, addr usrc_end,
                   addr utgt_start, addr utgt_end) {

        p = load_policy(repository);

        // Source instrumentation: plan probes at the target
        // instructions that the selected source instructions map to
        if (p needs source instrumentation) {
            map = load_mapping(repository);
            foreach source insn si between usrc_start and usrc_end {
                if (si belongs to p.where) {
                    ti = map_tgt(map, si);
                    record_plan(plan_pool, ti.addr,
                                p.analysis_routine, p.parameter);
                }
            }
        }

        // Target instrumentation: plan probes directly
        if (p needs target instrumentation) {
            foreach target insn ti between utgt_start and utgt_end {
                if (ti belongs to p.where) {
                    record_plan(plan_pool, ti.addr,
                                p.analysis_routine, p.parameter);
                }
            }
        }

        // Optimize the plans, then install one probe per optimized plan
        opt_plan_pool = opt_plan(plan_pool);
        foreach optimized plan op in opt_plan_pool {
            trampoline = gen_trampoline(op.analysis_routine, op.parameter);
            replace_jump(op.addr, trampoline);
        }
    }
31
Dimension – mechanism for instrumenting source binary (1)

    StartInstrumentation(addr src_start, addr src_end,
                         addr tgt_start, addr tgt_end,
                         src_to_tgt_mapping map, bb_info bb) {

        // Keep the mapping for later source-to-target lookups
        store_mapping(map, repository);
        // Split the translation unit into basic blocks
        bbs[] = partition_bb(src_start, src_end, bb);

        foreach basic block b in bbs[] {
            <usrc_start, usrc_end> =
                get_bb_boundary(b, <src_start, src_end>);
            <utgt_start, utgt_end> =
                get_bb_boundary(b, <tgt_start, tgt_end>, map);
            InstrumentUnit(usrc_start, usrc_end,
                           utgt_start, utgt_end);
        }
    }
32
Dimension – mechanism for instrumenting source binary (2)

    InstrumentUnit(addr usrc_start, addr usrc_end,
                   addr utgt_start, addr utgt_end) {

        p = load_policy(repository);

        // Source instrumentation: plan probes at the target
        // instructions that the selected source instructions map to
        if (p needs source instrumentation) {
            map = load_mapping(repository);
            foreach source insn si between usrc_start and usrc_end {
                if (si belongs to p.where) {
                    ti = map_tgt(map, si);
                    record_plan(plan_pool, ti.addr,
                                p.analysis_routine, p.parameter);
                }
            }
        }

        // Target instrumentation: plan probes directly
        if (p needs target instrumentation) {
            foreach target insn ti between utgt_start and utgt_end {
                if (ti belongs to p.where) {
                    record_plan(plan_pool, ti.addr,
                                p.analysis_routine, p.parameter);
                }
            }
        }

        // Optimize the plans, then install one probe per optimized plan
        opt_plan_pool = opt_plan(plan_pool);
        foreach optimized plan op in opt_plan_pool {
            trampoline = gen_trampoline(op.analysis_routine, op.parameter);
            replace_jump(op.addr, trampoline);
        }
    }
33
Dimension – optimizing instrumentation (1)
- Sources of instrumentation overhead and the optimization applied to each
  - Executing the jump that branches to the trampoline: probe coalescing
  - Performing the context-switch: partial context-switch
  - Transferring control to analysis routines: analysis routine inlining
  - Executing analysis routines: lightweight binary-to-binary optimization
  - All of the above: dynamic instrumentation enabling/disabling
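The slides name probe coalescing without spelling out its mechanics. The sketch below assumes one plausible reading: instrumentation plans that target the same address are merged so that a single jump and a single trampoline serve all of their analysis routines. The `plan` and `probe_group` types and the `coalesce_plans` function are hypothetical and are not taken from Dimension.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical instrumentation plan: call routine `routine_id` at `addr`. */
typedef struct {
    uintptr_t addr;
    int routine_id;
} plan;

/* Hypothetical coalesced probe: plans[first .. first + count - 1]
 * share one jump and one trampoline at `addr`. */
typedef struct {
    uintptr_t addr;
    size_t first;
    size_t count;
} probe_group;

/* Order plans by address, then by routine id for a deterministic result. */
static int cmp_plan(const void *a, const void *b)
{
    const plan *pa = a;
    const plan *pb = b;
    if (pa->addr != pb->addr)
        return pa->addr < pb->addr ? -1 : 1;
    return (pa->routine_id > pb->routine_id) - (pa->routine_id < pb->routine_id);
}

/* Sort `plans` in place and group entries with equal addresses.
 * Returns the number of groups written, which is the number of jumps
 * that would have to be installed. */
size_t coalesce_plans(plan *plans, size_t n,
                      probe_group *groups, size_t max_groups)
{
    size_t g = 0;
    qsort(plans, n, sizeof *plans, cmp_plan);
    for (size_t i = 0; i < n; i++) {
        if (g > 0 && groups[g - 1].addr == plans[i].addr) {
            groups[g - 1].count++;       /* same site: reuse the probe */
            continue;
        }
        if (g == max_groups)
            break;                       /* output buffer is full */
        groups[g].addr = plans[i].addr;
        groups[g].first = i;
        groups[g].count = 1;
        g++;
    }
    return g;
}
```

Under this reading, five plans over three distinct addresses cost three jumps rather than five.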
34
Dimension – optimizing instrumentation (2)
35
Dimension – reconfiguration
- For new architectures
  - Binary-editing utility library
    - Provides general binary-editing services to Dimension
    - Dimension does not directly handle binary instructions
  - For new architectures, new functions are added to the library and Dimension needs no modification
  - The number of architectures is limited
- For new languages used in VEE implementation
  - Stub functions for a VEE to call Dimension’s interfaces
    - Most languages provide mechanisms to call C functions
  - For new languages implementing the VEE, new stub functions are written and Dimension needs no modification
  - The number of languages is limited
37
Implementation – probe-based instrumentation (1)
- Fixed-length ISAs
  - A jump always replaces a single complete instruction
- Variable-length ISAs
  - A jump may be longer than the original instruction
    - Replace several instructions
    - The instrumentation unit should be single-entry
  - The instrumentation unit may be shorter than the size of a jump
    - Use a shorter but expensive instruction instead of a jump
  - Dimension relocates every instruction of which any part is overwritten by a jump
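The variable-length case can be made concrete with a small helper that decides how many whole instructions a jump overwrites. This is a sketch under stated assumptions: the instruction lengths are taken as already decoded (the decoder itself is out of scope), and the function name and signature are hypothetical.

```c
#include <stddef.h>

/* Given the byte lengths of the instructions at the start of a
 * single-entry instrumentation unit, return how many leading
 * instructions must be relocated so that a jump of `jump_size` bytes
 * fits without splitting an instruction. `*covered` (if not NULL)
 * receives the number of bytes those instructions occupy.
 * Returns 0 when the whole unit is shorter than the jump, i.e. the
 * unit cannot be instrumented with this jump. */
size_t insns_to_relocate(const size_t *insn_len, size_t n,
                         size_t jump_size, size_t *covered)
{
    size_t bytes = 0;
    for (size_t i = 0; i < n; i++) {
        bytes += insn_len[i];
        if (bytes >= jump_size) {
            if (covered != NULL)
                *covered = bytes;
            return i + 1;
        }
    }
    if (covered != NULL)
        *covered = bytes;
    return 0;
}
```

For instruction lengths 2, 1, 3, 5 and a 5-byte jump, three instructions (6 bytes) have to be relocated. The sixth byte is left over and would typically be padded, since it no longer starts a valid instruction.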
38
Implementation – probe-based instrumentation (2)
39
Implementation – parameter wrapping
- The six interface parameters follow C language conventions
  - A start address and an end address represent a binary code segment
  - Each address is defined as a pointer to a byte (i.e., char *)
- Stub functions should wrap the information that a VEE can provide into the formats that the interfaces expect
  - In Java, use a byte array to represent a binary segment
  - Use GetByteArrayElements to get the addresses
41
Case Study – Strata
- Interface insertion
  - Easy to locate the insertion points
  - Single-entry translation units
  - Mainly one-to-one mapping from source binary to target binary
- SPARC/Solaris
  - Fixed-length ISA: easy encoding
  - Delay-slot instructions: instrument the control transfer instruction just before the delay slot instead
  - Partial context-switch: global registers, floating-point state register, condition codes, Y register
42
Case Study – Jikes RVM
- Interface insertion
  - Easy to locate the insertion points
  - Multiple-entry translation units: basic block information is provided and easily accessed
  - Mapping from bytecode to machine code is accessed directly after translation
- IA-32/Linux
  - Variable-length ISA: hard to decode and encode
  - Partial context-switch: eflags
44
Experiments – setup
- Strata
  - Architecture: UltraSPARC-III
  - OS: Solaris 9
  - Benchmark: SPECint2000
- Jikes RVM
  - Architecture: Pentium III
  - OS: Fedora Core release 3
  - Benchmark: SPECjvm98
45
Experiments – methodology
- Effectiveness of optimizations
  - Inlining, partial context-switch, probe coalescing
  - Dynamic instrumentation enabling/disabling: sampling
- Flexibility versus efficiency
  - Dimension versus Jazz
- Use in traditional execution environments
  - Strata-Dimension versus Valgrind, DynamoRIO, and Pin
46
Evaluation – effectiveness of optimizations (1)
47
Evaluation – effectiveness of optimizations (2)
48
Evaluation – effectiveness of optimizations (3)
49
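Evaluation – dynamic instrumentation enabling/disabling (1)

Sampling implemented inside the analysis routine:

    unsigned bb_count = 0;
    unsigned add_count = 0;

    // Called for every basic block
    void record_bb() {
        bb_count++;
    }

    // Called for every add instruction; records only in every tenth block
    void record_add(void *addr) {
        if (bb_count % 10 == 0) {
            add_count++;
            print_loc(addr);
        }
    }

Sampling implemented by enabling and disabling instrumentation:

    unsigned bb_count = 0;
    unsigned add_count = 0;

    // Called for every basic block; switches the add instrumentation
    // on for every tenth block and off again in the next one
    void record_bb() {
        bb_count++;
        if (bb_count % 10 == 0) {
            enable_instrumentation();
        }
        if (bb_count % 10 == 1) {
            disable_instrumentation();
        }
    }

    // Called only while instrumentation is enabled
    void record_add(void *addr) {
        add_count++;
        print_loc(addr);
    }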
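The difference between the two listings can be seen with a small simulation. This sketch makes several assumptions that the slides do not state: the program is a straight run of basic blocks with a fixed number of add instructions each, enabling and disabling take effect before the adds of the current block, and the basic block probe itself stays installed. The `stats` type and both function names are hypothetical.

```c
/* Counters collected by one simulated run. */
typedef struct {
    unsigned bb_count;   /* basic blocks executed */
    unsigned add_count;  /* add instructions recorded */
    unsigned add_calls;  /* times the add analysis routine was entered */
} stats;

/* First listing: every add enters record_add, which filters by bb_count. */
stats run_filtering(unsigned blocks, unsigned adds_per_block)
{
    stats s = { 0, 0, 0 };
    for (unsigned b = 0; b < blocks; b++) {
        s.bb_count++;                          /* record_bb */
        for (unsigned a = 0; a < adds_per_block; a++) {
            s.add_calls++;                     /* record_add is entered */
            if (s.bb_count % 10 == 0)
                s.add_count++;
        }
    }
    return s;
}

/* Second listing: record_bb toggles the add instrumentation, so
 * record_add is entered only while it is enabled. */
stats run_toggling(unsigned blocks, unsigned adds_per_block)
{
    stats s = { 0, 0, 0 };
    int enabled = 1;                           /* probes start installed */
    for (unsigned b = 0; b < blocks; b++) {
        s.bb_count++;                          /* record_bb */
        if (s.bb_count % 10 == 0)
            enabled = 1;                       /* enable_instrumentation */
        if (s.bb_count % 10 == 1)
            enabled = 0;                       /* disable_instrumentation */
        for (unsigned a = 0; a < adds_per_block; a++) {
            if (!enabled)
                continue;                      /* no probe, no call */
            s.add_calls++;                     /* record_add is entered */
            s.add_count++;
        }
    }
    return s;
}
```

Under these assumptions both versions record the same add instructions, but the toggling version enters the analysis routine only for the sampled blocks, which is where the overhead saving comes from.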
50
Evaluation – dynamic instrumentation enabling/disabling (2)
51
Evaluation – dynamic instrumentation enabling/disabling (3)
52
Evaluation – dynamic instrumentation enabling/disabling (4)
53
Evaluation – dynamic instrumentation enabling/disabling (5)
54
Evaluation – dynamic instrumentation enabling/disabling (6)
55
Evaluation – flexibility versus efficiency
56
Evaluation – use in traditional execution environments
58
Limitations – ISA
- Fixed-length ISA
  - The offset of a jump is limited
  - The code buffers for the VEE and Dimension cannot be arbitrarily large
- Variable-length ISA
  - Short basic block problem
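The jump-offset limit on fixed-length ISAs comes down to a range check like the one below. The helper is a generic sketch with a hypothetical name. As one example, SPARC V8 conditional branches encode a 22-bit signed displacement counted in 4-byte words, relative to the branch instruction itself.

```c
#include <stdint.h>

/* Return 1 if a branch located at `from` can reach `to` using a signed
 * displacement field of `bits` bits counted in units of `scale` bytes,
 * and 0 otherwise. Requires 1 <= bits <= 63 and scale >= 1. */
int branch_reaches(uint64_t from, uint64_t to, unsigned bits, unsigned scale)
{
    int64_t delta = (int64_t)(to - from);      /* byte distance, may be negative */
    if (delta % (int64_t)scale != 0)
        return 0;                              /* target is not aligned to the unit */

    int64_t units = delta / (int64_t)scale;
    int64_t max = ((int64_t)1 << (bits - 1)) - 1;
    int64_t min = -max - 1;
    return units >= min && units <= max;
}
```

With `bits = 22` and `scale = 4` the reachable window is roughly 8 MB in either direction, so the probe site in the VEE's code cache and the trampoline in the auxiliary code cache have to lie within that distance of each other.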
59
Limitations – information from VEEs
- Problem
  - Basic block information and source-to-target mapping are needed from VEEs
  - The instrumentation ability is lost if they are not provided
- Potential solution
  - Dimension partitions multiple-entry translation units into single-entry instrumentation units
  - Dimension uses de-optimization techniques to map between source instructions and target instructions
60
Limitations – architecture and language portability
- Problem
  - The binary-editing utility library is implemented manually
  - Stub functions are implemented manually
- Potential solution
  - Automatic generation of the binary-editing utility library based on an architecture specification
  - Automatic generation of stub functions based on a language specification
61
Limitations – capturing the high-level contexts
- Problem
  - Dimension does not capture the high-level contexts of either binary
  - E.g., an arbitrary local variable in a Java bytecode method
- Potential solution
  - More powerful binary analysis to capture these contexts
62
Future Work
- Limitations for ISAs
  - Broaden the interfaces
    - Negotiation between VEEs and Dimension to find the appropriate size for the code buffer
  - More powerful binary analysis
  - Use shorter but expensive instructions instead of jumps (e.g., trap instructions)
- Other limitations
  - Try the potential solutions
64
Conclusion
- First standalone instrumentation tool specially designed for VEEs
- Characteristics
  - Flexibility
  - Comprehensiveness
  - Easy-to-use
  - Efficiency
65
Acknowledgement
- Thanks to my advisor, Dr. Mary Lou E. Soffa
- Thanks to Shukang Zhou, Naveen Kumar, and Jonathan Misurda
- Thanks to Dr. Hazelwood and everyone attending my presentation
66
Questions?