Comprehensive Kernel Instrumentation via Dynamic Binary Translation
  Peter Feiner, Angela Demke Brown, Ashvin Goel
  University of Toronto
  Presenter: Chuong Ngo
THE ORIGIN STORY STARTING IN MEDIAS RES No parents, uncles, or girlfriends were killed during the creation of this presentation
DBT is the Answer!
  Emulation of one instruction set by another through translation of binary code during execution.
  More practical than static binary translation.
    ◦ Simplifies identification of executable code.
    ◦ Amortizes translation overhead over time.
…and I Remember Everything!
The Answer to What?
  Ports
    ◦ Abandonware
  Analysis
  Bug finding
  Security
Assemble! User Level JIFL PinOS Pin DynamoRIO Valgrind Power Level < 9K
IT’S A BIRD! IT’S A PLANE! IT’S DRK! All the way from Earth-1610 via Cataclysm
But Who Hides Behind the Mask?
  4 Goals for kernel DBT framework:
    ◦ Full coverage of kernel code.
    ◦ No direct overhead for user level code.
    ◦ Preserve original concurrency and execution interleaving.
    ◦ Be transparent.
  DynamoRIO for the kernel.
DynamoRIO Flashback!
  Code cache
  Control transfer instructions (CTIs) return control to dispatcher
  Direct branches patched to link fragments
  Next Executing Tail (NET) traces
  Client callbacks
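The code cache, dispatcher, and direct-branch patching can be pictured with a toy model. This is a minimal sketch, not DynamoRIO's implementation: `PROGRAM`, `Fragment`, and `Dispatcher` are hypothetical names, and a "block" here is just a label plus the address of its successor.

```python
# Toy guest program: block address -> (work label, successor address or None).
PROGRAM = {
    0x10: ("a", 0x20),
    0x20: ("b", 0x30),
    0x30: ("c", None),
}


class Fragment:
    """A translated copy of one basic block living in the code cache."""

    def __init__(self, addr, work, target):
        self.addr = addr
        self.work = work
        self.target = target  # original address of the successor block
        self.link = None      # set once the direct branch has been patched


class Dispatcher:
    """Hypothetical dispatcher that fills the cache and links fragments."""

    def __init__(self, program):
        self.program = program
        self.cache = {}        # original address -> Fragment
        self.translations = 0  # blocks copied into the cache
        self.dispatches = 0    # times control came back to the dispatcher

    def lookup(self, addr):
        """Return the cached fragment, translating the block on first use."""
        frag = self.cache.get(addr)
        if frag is None:
            work, target = self.program[addr]
            frag = Fragment(addr, work, target)
            self.cache[addr] = frag
            self.translations += 1
        return frag

    def run(self, entry):
        trace = []
        self.dispatches += 1
        frag = self.lookup(entry)
        while True:
            trace.append(frag.work)
            if frag.target is None:
                return trace
            if frag.link is None:
                # Unlinked exit: control returns to the dispatcher, which
                # finds or builds the successor and patches the branch.
                self.dispatches += 1
                frag.link = self.lookup(frag.target)
            # Linked exit: jump straight from fragment to fragment.
            frag = frag.link
```

Running the same entry point twice shows the amortization: the first run enters the dispatcher three times and translates three blocks, while the second run enters it once and translates nothing.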
Well Victor…I’ve been thinking.
  All kernel entry points point to dispatcher.
    ◦ Shadow descriptor table
  Self-contained dispatcher
    ◦ Custom heap allocator
    ◦ “Pull” I/O model
  CPU-private data
  Interrupts delayed in code cache, disabled in dispatcher.
  Exceptions use restored native states.
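The shadow descriptor table idea can be sketched as follows. This is an illustration under assumptions, not DRK's code: the table is modeled as a dictionary from vector number to handler name, and `make_shadow_table`, `take_interrupt`, and the handler names are all hypothetical.

```python
def make_shadow_table(native_table, dispatcher_entry):
    """Build a shadow table in which every vector points at the dispatcher.

    Returns (shadow, saved). `saved` remembers the handlers the kernel
    installed, so the dispatcher can still find the real target to translate.
    """
    saved = dict(native_table)
    shadow = {vector: dispatcher_entry for vector in native_table}
    return shadow, saved


def take_interrupt(vector, shadow, saved):
    """Model one delivery: hardware reads the shadow table, and the
    dispatcher then resolves the handler the kernel actually registered."""
    entry = shadow[vector]
    native_target = saved[vector]
    return entry, native_target
```

The kernel's own table is left untouched in this model; only the table the hardware consults is replaced, which is what keeps every kernel entry under the dispatcher's control.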
A Carbonadium Skeleton
DRK Initialization
  Individual CPU initialization
    ◦ Allocate CPU resources
    ◦ All kernel entry points redirected to dispatcher
    ◦ All interrupts redirected
  Allocates memory for heap
    ◦ Checks all processors for successful memory mapping.
    ◦ Must be within 2GB of text and data segments.
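The slide does not say why the heap must sit within 2GB of the text and data segments. One likely reason, assumed here, is that x86-64 relative branches and RIP-relative operands carry a signed 32-bit displacement. The sketch below checks that constraint; the function names and the example addresses are hypothetical.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def rel32_reachable(src, dst):
    """True if dst - src fits in a signed 32-bit displacement.

    Real displacements are measured from the end of the instruction;
    that small offset is ignored in this sketch.
    """
    return INT32_MIN <= dst - src <= INT32_MAX


def heap_is_usable(heap_start, heap_size, kernel_start, kernel_end):
    """Check that any heap byte and any kernel byte can reach each other.

    The displacement is linear in both addresses, so testing the four
    corner pairs covers every pair in between.
    """
    heap_last = heap_start + heap_size - 1
    kernel_last = kernel_end - 1
    corners = [
        (heap_start, kernel_start),
        (heap_start, kernel_last),
        (heap_last, kernel_start),
        (heap_last, kernel_last),
    ]
    return all(
        rel32_reachable(h, k) and rel32_reachable(k, h) for h, k in corners
    )
```

With a kernel image assumed at `0xffffffff80000000`, a 16 MB heap placed 512 MB above it passes the check, while a heap far away in the address space does not.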
DRK Normal Operations
  Determine target of control transfer instruction and dispatch.
  Kernel exit points executed via native instructions.
  Dispatcher creates and caches code fragment.
  Context switches to the code fragment.
You Can’t Escape This Timeline!
  Exceptions run native
    ◦ Native state must be restored.
  Interrupts are delayed and emulated.
    ◦ Other interrupts are disabled.
    ◦ Captured interrupt executed between block dispatches.
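The delayed-interrupt scheme can be sketched as a single pending slot that is drained at block boundaries. This is a toy model with hypothetical names (`Cpu`, `hardware_interrupt`, `dispatch_boundary`); in particular, a masked interrupt is simply logged here, whereas real hardware would keep it pending.

```python
class Cpu:
    """Toy per-CPU state for delayed interrupt delivery."""

    def __init__(self):
        self.pending = None  # at most one captured interrupt
        self.log = []

    def hardware_interrupt(self, vector):
        """An interrupt arrives while code-cache code is running."""
        if self.pending is not None:
            # One interrupt is already captured, so others are disabled.
            self.log.append(f"masked {vector}")
            return False
        self.pending = vector
        self.log.append(f"captured {vector}")
        return True

    def dispatch_boundary(self):
        """Between block dispatches, emulate delivery of the captured one."""
        if self.pending is not None:
            self.log.append(f"emulate {self.pending}")
            self.pending = None

    def run(self, blocks, arrivals):
        """Run blocks in order; `arrivals` maps a block index to the
        vectors that arrive while that block executes."""
        for index, block in enumerate(blocks):
            self.log.append(f"run {block}")
            for vector in arrivals.get(index, []):
                self.hardware_interrupt(vector)
            self.dispatch_boundary()
        return self.log
```

The point of the model is ordering: the handler never runs in the middle of a translated block, only once the block has finished and control is back at a known boundary.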
HOW DOES IT STACK UP? How did--? This… you… What are you?
I’ve always found hardware to be more reliable
  Test System: Dell Optiplex 980
    ◦ 8 GB RAM
    ◦ 4x Intel Core i7s at 2.8 GHz, no hyperthreading
  2 Clients:
    ◦ Null Client
    ◦ Instruction Count
  Filebench
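The two clients differ only in what they attach to each block. A minimal sketch, assuming a hypothetical callback named `on_block_created` that runs once when a block is built and returns the instrumentation to run on every execution (this is not the DynamoRIO client API):

```python
class NullClient:
    """Adds no instrumentation; measures the framework's own overhead."""

    def on_block_created(self, instructions):
        return lambda: None


class InstructionCounter:
    """Counts executed instructions with one addition per block execution."""

    def __init__(self):
        self.executed = 0

    def on_block_created(self, instructions):
        n = len(instructions)  # computed once, at translation time

        def bump():
            self.executed += n

        return bump


def run(client, blocks, schedule):
    """Execute block names in `schedule`, instrumenting each block once."""
    cache = {}
    for name in schedule:
        if name not in cache:
            cache[name] = client.on_block_created(blocks[name])
        cache[name]()
```

For example, a three-instruction loop block executed four times followed by a one-instruction exit block leaves `executed` at 13, while the callback itself was invoked only twice.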
I’m the best at what I do?
There’s a whole new master of magnetism in town!
I know everything. I can’t help it.
With great power…
  4 Goals for kernel DBT framework:
    ◦ Full coverage of kernel code.
    ◦ No direct overhead for user level code.
    ◦ Preserve original concurrency and execution interleaving.
    ◦ Be transparent.
I’ll be there…around every corner
  Full coverage of kernel code.
  Preserve original concurrency and execution interleaving.
Fastest man alive with a limp
  No direct overhead for user level code.
    ◦ Increased cache and TLB misses.
The cosmic rays…what did they do to us?
  Be transparent.
    ◦ No code cache consistency.
    ◦ Shadow descriptor tables readable via hardware registers.
    ◦ Page table inconsistencies.
    ◦ CPU-private data.
…comes great responsibility.
  4 Goals for kernel DBT framework:
    ◦ Full coverage of kernel code.
    ◦ No direct overhead for user level code.
    ◦ Preserve original concurrency and execution interleaving.
    ◦ Be transparent.
DRK APPLICATIONS This was the world that I had created.
DRK’s Shadow Memory
  Stores metadata about the memory in use.
  Ported Umbra.
    ◦ Simple indirect mapping.
    ◦ Copy-on-write.
    ◦ 10x overhead vs. native.
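Indirect mapping with copy-on-write can be sketched as a page table of shadow pages that all start out sharing one default page. This is an illustration of the general technique, not Umbra's or DRK's data layout; `ShadowMemory` and its one-byte-of-metadata-per-byte scheme are assumptions.

```python
PAGE_SIZE = 4096


class ShadowMemory:
    """One metadata byte per application byte, allocated lazily."""

    def __init__(self, default=0):
        # Shared, immutable page used for every address never written.
        self.default_page = bytes([default]) * PAGE_SIZE
        self.table = {}  # page number -> private bytearray

    def read(self, addr):
        """Look up the shadow page indirectly, falling back to the default."""
        page = self.table.get(addr // PAGE_SIZE, self.default_page)
        return page[addr % PAGE_SIZE]

    def write(self, addr, value):
        """Copy the default page on the first write to a shadow page."""
        number = addr // PAGE_SIZE
        page = self.table.get(number)
        if page is None:
            page = bytearray(self.default_page)
            self.table[number] = page
        page[addr % PAGE_SIZE] = value
```

Reads of untouched memory cost no storage at all in this model; only pages whose metadata has actually changed get a private copy.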
KAddrcheck
  Memory addressability checking tool.
  Scans slab allocator’s data structures to locate all pages and freelists.
    ◦ Triggers shadow memory allocations.
  Addressability checks run on every memory access.
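The addressability check itself reduces to a lookup per accessed byte. The sketch below uses a plain set in place of shadow memory, and the hook names (`on_alloc`, `on_free`, `on_access`) are hypothetical rather than taken from KAddrcheck.

```python
class AddrCheck:
    """Toy addressability checker driven by allocator and access events."""

    def __init__(self):
        self.addressable = set()  # stands in for per-byte shadow state
        self.errors = []          # (address, size) of each bad access

    def on_alloc(self, base, size):
        """Mark a freshly allocated object as addressable."""
        self.addressable.update(range(base, base + size))

    def on_free(self, base, size):
        """Mark a freed object as unaddressable again."""
        self.addressable.difference_update(range(base, base + size))

    def on_access(self, addr, size=1):
        """Check every byte of an access; record and report failures."""
        ok = all(a in self.addressable for a in range(addr, addr + size))
        if not ok:
            self.errors.append((addr, size))
        return ok
```

An 8-byte access that starts 12 bytes into a 16-byte object is flagged as an overrun, and any access after the object is freed is flagged as well.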
Stackcheck
  Stack overflow guard
  Modified KAddrcheck
    ◦ Checks for addressability errors.
    ◦ Kills calling thread and continues.
  Resolves overflow without system crash.
Triumph!
  DRK is a kernel-level DBT framework.
  DynamoRIO “port”.
  Heavy implementation.
  Missing a number of features.