
1 Dimension: An Instrumentation Tool for Virtual Execution Environments Master’s Project Presentation by Jing Yang Department of Computer Science University of Virginia Advisor: Prof. Mary Lou E. Soffa

2 Outline Background Motivation Design Decisions Dimension Implementation Case Study Experiments and Evaluation Limitations and Future Work Conclusion

3 Background – virtual execution environments (VEEs) A self-contained operating environment Sit between CPU and application Facilitate programmatic modification of an executing program Used for diverse purposes – performance, security, architecture portability, power consumption, instrumentation… Translation-based VEEs Use dynamic translation to produce high quality code and to utilize resources efficiently Handle two binaries simultaneously Input source binary Dynamically generated target binary Focus of this project

4 Background – instrumentation Insert extra code into a program for profiling, monitoring, and controlling execution Classification Source instrumentation Binary instrumentation Static instrumentation – rewrite a program before its execution Dynamic instrumentation – insert extra code on demand during program execution

5 Outline Background Motivation Design Decisions Dimension Implementation Case Study Experiments and Evaluation Limitations and Future Work Conclusion

6 Motivation – program analysis Program analysis is important for both the source binary and the target binary of a VEE Users – source binary analysis Developers – both source and target binary analysis Instrumentation is widely used for program analysis Profiling Monitoring Controlling How can we do instrumentation in virtual execution environments?

7 Motivation – existing instrumentation systems Developed for traditional execution environments – not suitable for VEEs Source instrumentation systems – no source at all Binary instrumentation systems Static instrumentation systems – cannot handle the target binary, which is generated on-the-fly during program execution Dynamic instrumentation systems – cannot handle the source binary, which is non-executable Research efforts expended on instrumentation in VEEs Tightly bound to a specific VEE Cannot instrument both the source and target binaries

8 Motivation – goal A standalone instrumentation system specially designed for VEEs Flexible portability to different VEEs Easy to communicate Easy to reconfigure Be able to instrument both the source and target binaries

9 Outline Background Motivation Design Decisions Dimension Implementation Case Study Experiments and Evaluation Limitations and Future Work Conclusion

10 Design Decisions – paradigm (1) Layer stacks, top to bottom: (a) Application / VEE with instrumentation / OS + Hardware (b) Application / VEE / Instrumentation / OS + Hardware (c) Application / VEE alongside Instrumentation / OS + Hardware

11 Design Decisions – paradigm (2) Paradigm (a) Implement instrumentation inside a VEE Intermix code for virtual execution and code for instrumentation Significant modification to the VEE Code for instrumentation is difficult to separate and reuse Paradigm (b) Implement the instrumentation system as another VEE Unnecessary translations Unnecessary context-switches

12 Design Decisions – paradigm (3) Paradigm (c) Develop the instrumentation system as another tool The separate tool can provide instrumentation services to various VEEs Directly change code in a VEE’s code cache No extra translations No extra context-switches We choose this one

13 Design Decisions – conceptual modules in a VEE VEE: Initializer, Dispatcher, Translator, Code Cache, Finalizer Instrument here

14 Design Decisions – probe-based technique (1) To avoid interfering with a VEE’s code generation and code cache management mechanisms Replace a program’s instruction with a jump that branches to a trampoline A trampoline is a code sequence Perform a context-switch Prepare the parameters to be passed to the instrumentation code Transfer control to the instrumentation code The VEE needs to provide the locations of binaries Analyze the instructions and replace them with jumps
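The probe replacement step above can be sketched in C. This is a minimal sketch, assuming an IA-32 target, where a near jump is the opcode byte 0xE9 followed by a 32-bit displacement measured from the end of the jump. The helper name `write_probe_jump` and the `saved` buffer are hypothetical and are not part of Dimension.

```c
#include <stdint.h>
#include <string.h>

enum { JMP_REL32_SIZE = 5 };  /* 0xE9 opcode + 4-byte displacement */

/* Hypothetical helper: overwrite the bytes at `site` with a jump to
 * `trampoline`. The original bytes are copied to `saved` so they can be
 * relocated or restored later. Returns 0 on success, or -1 if the
 * trampoline is out of rel32 range. */
int write_probe_jump(uint8_t *site, const uint8_t *trampoline,
                     uint8_t saved[JMP_REL32_SIZE])
{
    /* The displacement is relative to the instruction after the jump. */
    intptr_t disp = (intptr_t)trampoline - ((intptr_t)site + JMP_REL32_SIZE);
    if (disp > INT32_MAX || disp < INT32_MIN)
        return -1;

    int32_t rel = (int32_t)disp;
    memcpy(saved, site, JMP_REL32_SIZE);   /* keep the replaced bytes */
    site[0] = 0xE9;                        /* near jump, rel32 */
    memcpy(site + 1, &rel, sizeof rel);    /* little-endian host assumed */
    return 0;
}
```

The trampoline itself (context-switch, parameter setup, call to the instrumentation code) is not shown; it would have to be generated separately and end by resuming the replaced code.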

15 Design Decisions – probe-based technique (2) Definitions Translation unit – a piece of code translated by a VEE at a time an instruction, a basic block, a super block, a method Instrumentation unit – a piece of code instrumented at a time Two kinds of ISAs Fixed-length ISAs A translation unit can always be an instrumentation unit Variable-length ISAs Require single-entry instrumentation units If translation units are multiple-entry, the VEE needs to provide the identification of basic blocks
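The split of a multiple-entry translation unit into single-entry instrumentation units can be sketched as follows. This assumes the VEE reports basic block entries as sorted byte offsets into the translation unit; the type `unit_t` and the function `split_single_entry` are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* A half-open byte range [start, end) inside one translation unit. */
typedef struct {
    size_t start;
    size_t end;
} unit_t;

/* Split a translation unit of `len` bytes into single-entry units.
 * `entries` holds the sorted offsets at which control may enter the
 * translation unit (entries[0] is expected to be 0). Each unit runs from
 * one entry to the next, so control can only enter it at its first byte.
 * Returns the number of units written to `out`. */
size_t split_single_entry(const size_t *entries, size_t n_entries,
                          size_t len, unit_t *out)
{
    for (size_t i = 0; i < n_entries; i++) {
        out[i].start = entries[i];
        out[i].end = (i + 1 < n_entries) ? entries[i + 1] : len;
    }
    return n_entries;
}
```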

16 Design Decisions – source binary instrumentation Instrumentation before translation Extra translation overhead paid for instrumentation code Translation after instrumentation Source binary instrumentation is achieved by instrumenting the corresponding target binary The VEE needs to provide the source-to-target mapping Avoid translation overhead, but has communication overhead We choose this one for simplicity
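Instrumenting the source binary through the target binary depends on looking up the target address for a given source address. A minimal sketch of such a lookup, assuming a one-to-one mapping kept as a table sorted by source address; the table layout and the name `lookup_target` are assumptions, not Dimension's actual data structure.

```c
#include <stddef.h>
#include <stdint.h>

/* One source-to-target pair; the layout is hypothetical. */
typedef struct {
    uintptr_t src;  /* address of a source instruction */
    uintptr_t tgt;  /* address of its translated target instruction */
} map_entry;

/* Binary search over a table sorted by `src`.
 * Returns the target address, or 0 if `src` has no entry. */
uintptr_t lookup_target(const map_entry *map, size_t n, uintptr_t src)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (map[mid].src == src)
            return map[mid].tgt;
        if (map[mid].src < src)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

One-to-many and many-to-one mappings would need a richer entry type, for example a list of target addresses per source instruction.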

17 Design Decisions – instrumentation specification (1) User-model is similar to ATOM and Pin Instrumentation routines – specify instrumentation policies Where to place calls to analysis routines What arguments are passed Tasks to be performed at the beginning and the end of program execution Analysis routines – invoked when execution hits some program points

18 Design Decisions – instrumentation specification (2)
    #include <stdio.h>
    FILE *trace;
    // Forward declaration of the analysis routine defined below
    void record_bb(void *addr);
    // Called when program begins
    EXPORT void DIM_ProgramBegin() {
        trace = fopen("trace.out", "w");
        DIM_InsertBBCall(SOURCE, ENTRY,
                         FUNCPTR(record_bb), ARG_BB_ADDR, ARG_END);
    }
    // Called when program ends
    EXPORT void DIM_ProgramEnd() {
        fclose(trace);
    }
    // Print a basic block record
    void record_bb(void *addr) {
        fprintf(trace, "%p\n", addr);
    }

19 Outline Background Motivation Design Decisions Dimension Implementation Case Study Experiments and Evaluation Limitations and Future Work Conclusion

20 Dimension – characteristics (1) Flexibility Easy to modify a VEE to use Dimension Easy to be reconfigured to interface with VEEs running on different architectures and written in different languages Comprehensiveness Can instrument both the source and target binaries Instrumentation can be done at various levels of granularities from instruction level to method level

21 Dimension – characteristics (2) Easy-to-use Dimension is transparent to the instrumentation users User-model is similar to ATOM and Pin Efficiency Instrumentation optimizations are applied automatically Slowdown from instrumentation is reasonable, compared to other instrumentation systems

22 Dimension – component organization (1) VEE: Initializer, Translator, Finalizer, Dispatcher, Code Cache Dimension: Initialization Assistant, Instrumentation Assistant, Finalization Assistant, Instrumentation Repository, Instrumenter, Auxiliary Code Cache

23 Dimension – component organization (2) Components that interface with VEEs Initialization assistant – triggered by a VEE’s initializer Set up Dimension at the beginning of program execution Load the instrumentation specification Instrumentation assistant – triggered by a VEE’s translator Receive code segments with relevant information Prepare for actual instrumentation Finalization assistant – triggered by a VEE’s finalizer Clean up Dimension at the end of program execution Record instrumentation results

24 Dimension – component organization (3) Other components Instrumenter Instrument the code from the instrumentation assistant according to the instrumentation specification Only architecture-dependent component Auxiliary Code Cache Store instrumentation code Instrumentation Repository Maintain information used for instrumentation purposes: instrumentation specification, source-to-target mapping

25 Dimension – communication interfaces (1) Interface InitDimension Trigger Dimension’s initialization assistant by a VEE’s initializer No parameters void InitDimension(); Interface FinalizeDimension Trigger Dimension’s finalization assistant by a VEE’s finalizer No parameters void FinalizeDimension();

26 Dimension – communication interfaces (2) Interface StartInstrumentation Trigger Dimension’s instrumentation assistant by a VEE’s translator Six parameters void StartInstrumentation (addr src_start, addr src_end, addr tgt_start, addr tgt_end, src_to_tgt_mapping map, bb_info bb);

27 Dimension – communication interfaces (3) Parameters for Interface StartInstrumentation (1) The start address and (2) the end address of the source binary code of the translation unit Needed for source binary instrumentation Always readily available (3) The start address and (4) the end address of the target binary code of the translation unit Always needed Always readily available

28 Dimension – communication interfaces (4) Parameters for Interface StartInstrumentation (5) Source-to-target mapping information Needed for source binary instrumentation The mapping can be one-to-one, one-to-many and many-to-one VEE developers should add the functionality to support Dimension Previous work has shown that the exact mapping information from un-optimized code to optimized code can be achieved (6) Identification of basic blocks in translation units Needed when a VEE executes on a variable-length ISA and uses a multiple-entry translation unit Maintained by a VEE for virtual execution purposes (e.g., garbage collection)
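The three interfaces can be put together in a sketch of where a VEE would call them. The function signatures follow the slides and `addr` is a pointer to a byte as described for parameter wrapping; everything else is an assumption: the layouts of `src_to_tgt_mapping` and `bb_info`, the counting stubs that stand in for Dimension, and the `vee_run` skeleton.

```c
#include <stddef.h>

typedef char *addr;                                     /* pointer to a byte */
typedef struct { size_t n_pairs; } src_to_tgt_mapping;  /* hypothetical layout */
typedef struct { size_t n_blocks; } bb_info;            /* hypothetical layout */

/* Stand-in stubs that only count calls; the real functions are provided
 * by Dimension. */
int init_calls, instr_calls, fini_calls;

void InitDimension(void) { init_calls++; }
void FinalizeDimension(void) { fini_calls++; }
void StartInstrumentation(addr src_start, addr src_end,
                          addr tgt_start, addr tgt_end,
                          src_to_tgt_mapping map, bb_info bb)
{
    (void)src_start; (void)src_end; (void)tgt_start; (void)tgt_end;
    (void)map; (void)bb;
    instr_calls++;
}

/* Hypothetical VEE skeleton showing where each hook would be placed. */
void vee_run(char *src, size_t src_len, char *cache, size_t cache_len,
             size_t n_units)
{
    InitDimension();                         /* in the VEE's initializer */
    for (size_t i = 0; i < n_units; i++) {
        /* ... translate one unit from src into the code cache ... */
        src_to_tgt_mapping map = { 0 };
        bb_info bb = { 1 };
        StartInstrumentation(src, src + src_len,
                             cache, cache + cache_len,
                             map, bb);       /* in the VEE's translator */
    }
    FinalizeDimension();                     /* in the VEE's finalizer */
}
```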

29 Dimension – mechanism for instrumenting target binary (1)
    StartInstrumentation(addr src_start, addr src_end,
                         addr tgt_start, addr tgt_end,
                         src_to_tgt_mapping map, bb_info bb) {
        store_mapping(map, repository);
        bbs[] = partition_bb(src_start, src_end, bb);
        foreach basic block b in bbs[] {
            (usrc_start, usrc_end) = get_bb_boundary(b, …);
            (utgt_start, utgt_end) = get_bb_boundary(b, …, map);
            InstrumentUnit(usrc_start, usrc_end,
                           utgt_start, utgt_end);
        }
    }

30 Dimension – mechanism for instrumenting target binary (2)
    InstrumentUnit(addr usrc_start, addr usrc_end,
                   addr utgt_start, addr utgt_end) {
        p = load_policy(repository);
        if(p needs source instrumentation) {
            map = load_mapping(repository);
            foreach source insn si between usrc_start and usrc_end {
                if(si belongs to p.where) {
                    ti = map_tgt(map, si);
                    record_plan(plan_pool, ti.addr,
                                p.analysis_routine, p.parameter);
                }
            }
        }
        if(p needs target instrumentation) {
            foreach target insn ti between utgt_start and utgt_end {
                if(ti belongs to p.where) {
                    record_plan(plan_pool, ti.addr,
                                p.analysis_routine, p.parameter);
                }
            }
        }
        opt_plan_pool = opt_plan(plan_pool);
        foreach optimized plan op in opt_plan_pool {
            trampoline = gen_trampoline(op.analysis_routine, op.parameter);
            replace_jump(op.addr, trampoline);
        }
    }

31 Dimension – mechanism for instrumenting source binary (1)
    StartInstrumentation(addr src_start, addr src_end,
                         addr tgt_start, addr tgt_end,
                         src_to_tgt_mapping map, bb_info bb) {
        store_mapping(map, repository);
        bbs[] = partition_bb(src_start, src_end, bb);
        foreach basic block b in bbs[] {
            (usrc_start, usrc_end) = get_bb_boundary(b, …);
            (utgt_start, utgt_end) = get_bb_boundary(b, …, map);
            InstrumentUnit(usrc_start, usrc_end,
                           utgt_start, utgt_end);
        }
    }

32 Dimension – mechanism for instrumenting source binary (2)
    InstrumentUnit(addr usrc_start, addr usrc_end,
                   addr utgt_start, addr utgt_end) {
        p = load_policy(repository);
        if(p needs source instrumentation) {
            map = load_mapping(repository);
            foreach source insn si between usrc_start and usrc_end {
                if(si belongs to p.where) {
                    ti = map_tgt(map, si);
                    record_plan(plan_pool, ti.addr,
                                p.analysis_routine, p.parameter);
                }
            }
        }
        if(p needs target instrumentation) {
            foreach target insn ti between utgt_start and utgt_end {
                if(ti belongs to p.where) {
                    record_plan(plan_pool, ti.addr,
                                p.analysis_routine, p.parameter);
                }
            }
        }
        opt_plan_pool = opt_plan(plan_pool);
        foreach optimized plan op in opt_plan_pool {
            trampoline = gen_trampoline(op.analysis_routine, op.parameter);
            replace_jump(op.addr, trampoline);
        }
    }

33 Dimension – optimizing instrumentation (1) Instrumentation overhead and the optimization that targets it: Execute the jump which branches to the trampoline: probe-coalescing; Perform the context-switch: partial context-switch; Transfer control to analysis routines: analysis routine inlining; Execute analysis routines: lightweight binary-to-binary optimization; All of the above: dynamic instrumentation enabling/disabling

34 Dimension – optimizing instrumentation (2)

35 Dimension – reconfiguration For new architectures Binary-editing utility library Providing general binary-editing services to Dimension Dimension does not directly handle binary instructions For new architectures, new functions are added to the library and Dimension needs no modification For new languages used in VEE implementation Stub functions for a VEE to call Dimension’s interfaces Most languages provide mechanisms to call C functions For new languages implementing the VEE, new stub functions are written and Dimension needs no modification Number of architectures is limited Number of languages is limited

36 Outline Background Motivation Design Decisions Dimension Implementation Case Study Experiments and Evaluation Limitations and Future Work Conclusion

37 Implementation – probe-based instrumentation (1) Fixed-length ISAs A jump always replaces a single complete instruction Variable-length ISAs A jump may be longer than the original instruction Replace several instructions The instrumentation unit should be single-entry The instrumentation unit may be longer than the size of a jump Use a shorter but expensive instruction instead of a jump Dimension relocates each instruction as long as part of it is replaced by a jump
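On a variable-length ISA, the number of instructions to relocate follows from their lengths and the size of the jump. A minimal sketch, assuming the instruction lengths have already been decoded; the function name `insns_to_relocate` and its parameters are hypothetical.

```c
#include <stddef.h>

/* Given the byte lengths of consecutive instructions starting at the probe
 * site, return how many whole instructions a jump of `jump_size` bytes
 * overwrites, fully or partly. All of them must be relocated. The total
 * number of bytes they occupy is stored in `*covered_bytes` when that
 * pointer is non-NULL. Returns 0 if the instructions together are shorter
 * than the jump, in which case a shorter instruction would be needed. */
size_t insns_to_relocate(const unsigned char *insn_len, size_t n_insns,
                         size_t jump_size, size_t *covered_bytes)
{
    size_t bytes = 0;
    for (size_t i = 0; i < n_insns; i++) {
        bytes += insn_len[i];
        if (bytes >= jump_size) {
            if (covered_bytes)
                *covered_bytes = bytes;
            return i + 1;
        }
    }
    return 0;
}
```

For instructions of 1, 2, 3 and 2 bytes and a 5-byte jump, the first three instructions (6 bytes) are relocated, leaving one byte after the jump that is never executed in place.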

38 Implementation – probe-based instrumentation (2)

39 Implementation – parameter wrapping The choice of six parameters for interfaces follows C language conventions Use a start address and an end address to represent a binary code segment Each address is defined as a pointer to a byte (i.e., char *) Stub functions should wrap the information that the VEE can provide into the formats that the interfaces expect In Java, use a byte array to represent a binary segment Use GetByteArrayElements to get the addresses

40 Outline Background Motivation Design Decisions Dimension Implementation Case Study Experiments and Evaluation Limitations and Future Work Conclusion

41 Case Study – Strata Interfaces insertion Easy to locate the insertion points Single-entry translation units Mainly one-to-one mapping from source binary to target binary SPARC/Solaris Fixed-length ISA – easy encoding Delay-slot instructions – instrument the control transfer instruction just before it Partial context-switch – global registers, floating-point state register, condition codes, Y register

42 Case Study – Jikes RVM Interfaces insertion Easy to locate the insertion points Multiple-entry translation units – basic block information provided and easily accessed Mapping from bytecode to machine code is accessed directly after translation IA-32/Linux Variable-length ISA – hard to decode and encode Partial context-switch – eflags

43 Outline Background Motivation Design Decisions Dimension Implementation Case Study Experiments and Evaluation Limitations and Future Work Conclusion

44 Experiments – setup Strata Architecture – UltraSPARC-III OS – Solaris 9 Benchmark – SPECint2000 Jikes RVM Architecture – Pentium III OS – Fedora Core release 3 Benchmark – SPECjvm98

45 Experiments – methodology Effectiveness of optimizations Inlining, partial context-switch, probe coalescing Dynamic instrumentation enabling/disabling – sampling Flexibility versus efficiency Dimension versus Jazz Use in traditional execution environments Strata-Dimension versus Valgrind, DynamoRIO and Pin

46 Evaluation – effectiveness of optimizations (1)

47 Evaluation – effectiveness of optimizations (2)

48 Evaluation – effectiveness of optimizations (3)

49 Evaluation – dynamic instrumentation enabling/disabling (1)
    /* Version 1: sampling check inside the analysis routine */
    unsigned bb_count = 0;
    unsigned add_count = 0;
    void record_bb() {
        bb_count++;
    }
    void record_add(void *addr) {
        if(bb_count % 10 == 0) {
            add_count++;
            print_loc(addr);
        }
    }
    /* Version 2: sampling by enabling and disabling instrumentation */
    unsigned bb_count = 0;
    unsigned add_count = 0;
    void record_bb() {
        bb_count++;
        if(bb_count % 10 == 0) {
            enable_instrumentation();
        }
        if(bb_count % 10 == 1) {
            disable_instrumentation();
        }
    }
    void record_add(void *addr) {
        add_count++;
        print_loc(addr);
    }

50 Evaluation – dynamic instrumentation enabling/disabling (2)

51 Evaluation – dynamic instrumentation enabling/disabling (3)

52 Evaluation – dynamic instrumentation enabling/disabling (4)

53 Evaluation – dynamic instrumentation enabling/disabling (5)

54 Evaluation – dynamic instrumentation enabling/disabling (6)

55 Evaluation – flexibility versus efficiency

56 Evaluation – use in traditional execution environments

57 Outline Background Motivation Design Decisions Dimension Implementation Case Study Experiments and Evaluation Limitations and Future Work Conclusion

58 Limitations – ISA Fixed-length ISA Limited offset of a jump Code buffer for both the VEE and Dimension cannot be arbitrarily large Variable-length ISA Short basic block problem
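The jump-offset limit on a fixed-length ISA amounts to a range check on the displacement field. A minimal sketch, assuming 4-byte instructions and a signed displacement counted in instruction words; the function `jump_reaches` and the 22-bit width used in the note below are illustrative parameters, not figures taken from the presentation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if a jump located at `from` can reach `to` when the
 * instruction encodes a signed displacement of `disp_bits` bits, counted
 * in 4-byte instruction words. `disp_bits` is expected to be in 1..63. */
bool jump_reaches(uint64_t from, uint64_t to, unsigned disp_bits)
{
    int64_t delta = (int64_t)(to - from);
    if (delta % 4 != 0)
        return false;                       /* target is not word-aligned */

    int64_t words = delta / 4;
    int64_t max = ((int64_t)1 << (disp_bits - 1)) - 1;
    int64_t min = -((int64_t)1 << (disp_bits - 1));
    return words >= min && words <= max;
}
```

With a 22-bit word displacement the reach is roughly 8 MB in either direction, so the code cache and the auxiliary code cache would have to be placed within that distance of each other.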

59 Limitations – information from VEEs Problem Basic block information and source-to-target mapping are needed from VEEs Lose the instrumentation ability if not provided Potential solution Dimension partitions multiple-entry translation units into single-entry instrumentation units Dimension uses de-optimization techniques to map between source instructions and target instructions

60 Limitations – architecture and language portability Problem Binary-editing utility library is implemented manually Stub functions are implemented manually Potential solution Automatic generation of binary-editing utility library based on architecture specification Automatic generation of stub functions based on language specification

61 Limitations – capturing the high-level contexts Problem Do not capture the high-level contexts of both binaries E.g., an arbitrary local variable in a Java bytecode method Potential solution More powerful binary analysis to capture these contexts

62 Future Work Limitations for ISAs Broaden the interfaces Negotiation between VEEs and Dimension to find the appropriate size for code buffer More powerful binary analysis Use shorter but expensive instructions instead of jumps (e.g., trap instructions) Other limitations Try the potential solutions

63 Outline Background Motivation Design Decisions Dimension Implementation Case Study Experiments and Evaluation Limitations and Future Work Conclusion

64 Conclusion First standalone instrumentation tool specially designed for VEEs Characteristics Flexibility Comprehensiveness Easy-to-use Efficiency

65 Acknowledgement Thanks to my advisor – Dr. Mary Lou E. Soffa Thanks to Shukang Zhou, Naveen Kumar and Jonathan Misurda Thanks to Dr. Hazelwood and all people attending my presentation

66 Questions ?

