Download presentation
Presentation is loading. Please wait.
Published byImogen Jean Butler Modified over 9 years ago
1
JIT Instrumentation – A Novel Approach To Dynamically Instrument Operating Systems Marek Olszewski Keir Mierle Adam Czajkowski Angela Demke Brown University of Toronto
2
1 Instrumenting Operating Systems Operating systems are growing in complexity Kernel instrumentation can help Used for: debugging, profiling, monitoring, and security auditing... Dynamic instrumentation No recompilation & no reboot Good for debugging systemic problems Feasible in production settings
3
2 Current Approach: Probe-Based Dynamic instrumentation tools for OSs are probe based Overwrite existing code with jump/trap Efficient on fixed length architectures Slow on variable length architectures Not safe to overwrite multiple instructions with jump: Branch to between instructions might exist Thread might be sleeping in between the instructions Must use trap instruction
4
3 Current Approach: Trap-based Area of interestInstrumentation Code Trap Handler 1.Save processor state 2.Lookup which instrumentation to call 3.Call instrumentation 4.Emulate overwritten instruction 5.Restore processor state add $1,count_l adc $0,count_h inc 14(edx)int3 mov $ffffe000,edx and esp,edx mov 28(edi),eax add $1,eax or $c, eax and $3,eax mov eax,(ebx) sub $6c,esp or $f, ebp mov ebp,4(ebx) mov 2c(edi),ebx mov 30(edi),ebp add $2,ebp Very Expensive!
5
4 Alternative: JIT Instrumentation Propose to use just-in-time dynamic instrumentation Rewrite code to insert new instructions in between existing ones More Efficient. More Powerful. Supports: Instrumenting branch directions Basic block-level instrumentation Per execution-path instrumentation Proven itself in user space (Pin, Valgrind)
6
5 JIT Instrumentation Instrumentation Code Area of Interest mov $ffffe000,edx and esp,edx mov 28(edi),eax add $1,eax or $c, eax and $3,eax mov eax,(ebx) sub $6c,esp or $f, ebp mov ebp,4(ebx) mov 2c(edi),ebx mov 30(edi),ebp add $2,ebp inc 14(edx) mov $ffffe000,edx and esp,edx mov 28(edi),eax add $1,eax or $c, eax and $3,eax mov eax,(ebx) sub $6c,esp or $f, ebp mov ebp,4(ebx) mov 2c(edi),ebx mov 30(edi),ebp add $2,ebp inc 14(edx) adc $0,count_h add $1,count_l Code Cache
7
6 popf JIT Instrumentation Instrumentation Code Area of Interest mov $ffffe000,edx and esp,edx mov 28(edi),eax add $1,eax or $c, eax and $3,eax mov eax,(ebx) sub $6c,esp or $f, ebp mov ebp,4(ebx) mov 2c(edi),ebx mov 30(edi),ebp add $2,ebp inc 14(edx) mov $ffffe000,edx and esp,edx sub $6c,esp mov 28(edi),eax add $1,eax or $c, eax and $3,eax mov eax,(ebx) or $f, ebp mov ebp,4(ebx) mov 2c(edi),ebx mov 30(edi),ebp add $2,ebp inc 14(edx) adc $0,count_h add $1,count_l call instrmtn pushf Code Cache
8
7 JIT Instrumentation Instrumentation Code Area of Interest mov $ffffe000,edx and esp,edx mov 28(edi),eax add $1,eax or $c, eax and $3,eax mov eax,(ebx) sub $6c,esp or $f, ebp mov ebp,4(ebx) mov 2c(edi),ebx mov 30(edi),ebp add $2,ebp inc 14(edx) mov $ffffe000,edx and esp,edx sub $6c,esp adc $0,count_h add $1,count_lpopf mov 28(edi),eax add $1,eax or $c, eax and $3,eax mov eax,(ebx) or $f, ebp mov ebp,4(ebx) mov 2c(edi),ebx mov 30(edi),ebp add $2,ebp inc 14(edx) call instrmtn pushf Code Cache
9
8 Dynamic Binary Rewriting Use dynamic binary rewriting to insert the new instructions. Interleaves binary rewriting with execution Performed by a runtime system Typically at basic block granularity Code is rewritten into a code cache Rewritten code must be: Efficient Unaware of its new location
10
9 Dynamic Binary Rewriting Original Code:Code Cache: bb1 Runtime System bb3bb2 bb4 bb1
11
10 Dynamic Binary Rewriting Original Code:Code Cache: bb1 Runtime System bb3bb2 bb4 bb1 bb2
12
11 Dynamic Binary Rewriting Original Code:Code Cache: bb1 Runtime System bb3bb2 bb4 bb1 bb2 bb4 bb1 bb2 No longer need to enter runtime system
13
12 Dynamic Binary Rewriting Used for rewriting operating systems Virtualization (VMware) Emulation (QEMU) Never used for instrumentation of OSs Never used to rewrite host OS in a general manner Allows instrumentation of live system
14
13 Outline Prototype (JIFL) Design OS Issues Performance comparison Kprobes vs JIFL Example Plugin Checking branch hint directions
15
14 Prototype Design JIFL - JIT Instrumentation Framework for Linux Instruments code reachable from system calls
16
15 JIFL Software Architecture JIFL Plugin (Loadable Kernel Module) Linux Kernel (All code reachable from system calls) JIFL Plugin Starter User Space Kernel Space Code Cache Runtime System JIFL (Loadable Kernel Module) Heap Memory Allocator Dispatcher JIT Compiler JIFL Plugin Starter
17
16 Gaining Control Runtime System must gain control before it can start rewriting/instrumenting OS Update system call table entry to point to dynamically emitted entry stub Calls per-system call instrumentation Calls dispatcher Passing original system call pointer
18
17 Dispatcher Saves registers and condition code states Dispatcher checks if target basic block is in code cache If so it jumps to this basic block Otherwise it invokes the JIT to compile and instrument the new basic block
19
18 JIT Compiler Like conventional JIT compiler, except its input/output is x86 machine code Compiles at a dynamic basic block granularity All but the last control flow instruction are copied directly into the code cache Control flow instructions are modified to account for the new location of the code Communicates with the JIFL plugin to determine what instrumentation to insert
20
19 JIT: Inserting Instrumentation Instrumentation is added by inserting a call instruction into the basic block Additional instructions are also needed to: Push/Pop instrumentation parameters Save/Restore volatile registers (eax, edx, ecx) Save/Restore condition code register Several optimizations can be performed to reduce instrumentation cost
21
20 Eliminating Redundant State Saving Eliminate dead register and condition code saving code Perform liveness analysis Reduce state saving overhead Per-basic block Instrumentation Search for the cheapest place to insert it
22
21 Inlining Instrumentation Small instrumentation can be inlined into the basic block Removes the call and ret instructions Constant parameters are propagated to remove stack accesses Copy propagation and dead-code elimination is applied to specialize the instrumentation routine for context All done on native x86 code. No IR!
23
22 Effect of Optimizations Normalized Execution Time Average system call latencies with per-basic block instrumentation
24
23 Prototype Operating System Issues
25
24 Memory Allocator While JITing JIFL often needs to allocate dynamic memory Cannot rely on Linux kmalloc and vmalloc routines as they are not reentrant Instead, we created our own memory allocator Pre-allocate a heap when JIFL starts up
26
25 Releasing Control Calls to schedule() have to be redirected Otherwise, JIFL keeps control even after context switch Have to: Save return address in hash table Call schedule() Look up and call dispatcher
27
26 Performance Comparison JIFL vs. Kprobes
28
27 Performance Evaluation Instrument every system call with three types of instrumentation: System Call Monitoring (Coarse Grained) Call Tracing (Medium Grained) Basic Block Counting (Fine Grained) LMbench and ApacheBench2 benchmarks Test Setup 4-way Intel Pentium 4 Xeon SMP - 2.8GHz Linux 2.6.17.13 With SMP support and no preemption
29
28 System Call Monitoring Normalized Execution Time
30
29 Call Tracing Normalized Execution Time Log Scale
31
30 Basic Block Counting Normalized Execution Time Log Scale
32
31 Apache Throughput Normalized Requests / Second
33
32 Example Plugin Checking Correctness of Branch Hints
34
33 Example Plugin: Checking Branch Hints Int correct_count 0 Int incorrect_count 0 // Called for every newly discovered basic block. Procedure Basic_Block_Callback if last instruction is not a hinted branch return if hinted in the branch not taken direction call Insert_Branch_Not_Taken_Instrumentation( Increment_Counter, &correct_count) call Insert_Branch_Taken_Instrumentation( Increment_Counter, &incorrect_count) else // Insert same instrumentation but for reverse // branch directions // Executed for every instrumented branch. Procedure Increment_Counter(Int *Counter) *Counter *Counter + 1
35
34 Example Plugin: Checking Branch Hints 5 system calls with bad branch hint performance Misprediction rates > 75% Contained > 30% of hinted branch executed Examined using a second plugin Monitored individual branches Found 4 greatest contributors Mapped back to source code Can’t fix: Not hinted by programmer!
36
35 Conclusions JIT instrumentation viable for operating systems Developed a prototype for the Linux kernel (JIFL) Results are very competitive JIFL outperforms Kprobes by orders of magnitude Enables more powerful instrumentation e.g. Branch Hints
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.