AMD64/EM64T – Dyninst & Paradyn, March 17, 2005
The AMD64/EM64T Port of Dyninst and Paradyn
Greg Quinn, Ray Chen

Goals
64-bit Dyninst library and Paradyn daemon that handle both 32-bit and 64-bit mutatees
Leverage as much existing functionality as possible

Talk Outline
32-Bit Compatibility
64-Bit Mode
  – Architectural Overview
  – Issues for Dyninst
Current status and timeline for the port

32-Bit Compatibility

Problematic Porting
Conceptually simple
  – ISA extension
  – Hardware compatibility
  – Pre-existing code base
  – Nightly regression tests

System Structures
What's wrong with this code?

    struct link_map {
        /* Base address shared object is loaded at. */
        ElfW(Addr) l_addr;
        /* Absolute file name object was found in. */
        char *l_name;
        /* Dynamic section of the shared object. */
        ElfW(Dyn) *l_ld;
        /* Chain of loaded objects. */
        struct link_map *l_next, *l_prev;
    };

System Structures
Compile-time decisions are unacceptable
Structure size depends on the target platform
  – X86: sizeof( ElfW(Addr) ) == 4
  – X86_64: sizeof( ElfW(Addr) ) == 8
Similar problem with pointer data types

    /* Simplified; glibc's <link.h> routes this through helper macros. */
    #define ElfW(type) \
        Elf ## __ELF_NATIVE_CLASS ## type

System Structures
No backwards compatible structure
  – Must create and maintain our own
Multiple structures affected
  – link_map, r_debug, libelf routines

    struct link_map_dyn32 {
        Elf32_Addr l_addr;
        uint32_t l_name;
        uint32_t l_ld;
        uint32_t l_next, l_prev;
    };

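The fixed-width approach can be sketched as follows. This is a minimal illustration, assuming the mutator has already copied the raw bytes of the mutatee's link_map into a local buffer. The 64-bit sibling `link_map_dyn64` and the helper `read_link_map32` are hypothetical names introduced here, not taken from Dyninst, and `Elf32_Addr` is written as `uint32_t` so the sketch does not need `<elf.h>`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fixed-width mirror of a 32-bit mutatee's link_map (fields as on the slide).
// Pointers are stored as integers, so the layout does not depend on the
// platform the mutator itself was compiled for.
struct link_map_dyn32 {
    uint32_t l_addr;          // Elf32_Addr is a 32-bit unsigned integer
    uint32_t l_name;
    uint32_t l_ld;
    uint32_t l_next, l_prev;
};

// Hypothetical 64-bit sibling (assumption: same fields, 64 bits each).
struct link_map_dyn64 {
    uint64_t l_addr;
    uint64_t l_name;
    uint64_t l_ld;
    uint64_t l_next, l_prev;
};

// The sizes are now fixed regardless of how the mutator is built.
static_assert(sizeof(link_map_dyn32) == 20, "five 4-byte fields");
static_assert(sizeof(link_map_dyn64) == 40, "five 8-byte fields");

// Hypothetical helper: decode a link_map from bytes copied out of a
// 32-bit mutatee's address space.
inline link_map_dyn32 read_link_map32(const unsigned char *buf, std::size_t len) {
    assert(len >= sizeof(link_map_dyn32));
    link_map_dyn32 lm;
    std::memcpy(&lm, buf, sizeof lm);
    return lm;
}
```

Because the pointer fields are plain integers, following `l_next` means issuing another read from the mutatee at that address rather than dereferencing a local pointer.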
System Structures
Class based solution
  – Hierarchy with 32-bit and 64-bit siblings
  – Virtual functions instead of control structures
Multiple benefits
  – No code duplication
  – Less source clutter
  – Only minor function call overhead

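One way such a hierarchy could look is sketched below. All names (`LinkMapView`, `LinkMapViewImpl`, `make_link_map_view`) are hypothetical and chosen for illustration; the talk does not show Dyninst's actual classes. The sketch assumes the two siblings differ only in word size, so a template generates both from one body, and the only branch on address width is in the factory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>

// Hypothetical width-independent interface: callers always see 64-bit values.
class LinkMapView {
public:
    virtual ~LinkMapView() = default;
    virtual std::size_t raw_size() const = 0;          // bytes to read from the mutatee
    virtual void load(const unsigned char *raw) = 0;   // decode bytes read from the mutatee
    virtual uint64_t l_addr() const = 0;
    virtual uint64_t l_name() const = 0;
    virtual uint64_t l_next() const = 0;
};

// One implementation body serves both siblings; Word is uint32_t or uint64_t.
template <typename Word>
class LinkMapViewImpl final : public LinkMapView {
    struct Raw { Word l_addr, l_name, l_ld, l_next, l_prev; };
    Raw raw_{};
public:
    std::size_t raw_size() const override { return sizeof(Raw); }
    void load(const unsigned char *raw) override { std::memcpy(&raw_, raw, sizeof(Raw)); }
    uint64_t l_addr() const override { return raw_.l_addr; }
    uint64_t l_name() const override { return raw_.l_name; }
    uint64_t l_next() const override { return raw_.l_next; }
};

// The single place where the mutatee's address width is inspected.
inline std::unique_ptr<LinkMapView> make_link_map_view(int address_bytes) {
    assert(address_bytes == 4 || address_bytes == 8);
    if (address_bytes == 4)
        return std::make_unique<LinkMapViewImpl<uint32_t>>();
    return std::make_unique<LinkMapViewImpl<uint64_t>>();
}
```

Code that walks the chain of loaded objects can then be written once against `LinkMapView`, with the cost of a virtual call per field access.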
What Works?
Operation on 32-bit binaries at 95%
  – Passes most nightly regression tests
      Tests 1-12, attach, relocate
  – Save the World not fully tested
      Existing x86 shared library issue

64-Bit Mode: Architectural Overview

Registers
32-bit Mode:
  – Eight 32-bit registers: EAX EBX ECX EDX EBP ESP EDI ESI
64-bit Mode:
  – Registers extended to 64 bits: RAX RBX RCX RDX RBP RSP RDI RSI
  – Eight additional registers: R8 R9 R10 R11 R12 R13 R14 R15

Registers
Encoded using the REX prefix: 0100WRXB
  – W determines the width of the operation (32/64)
  – R, X, and B serve as high-order bits for register numbers

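As a small illustration of the REX layout, the sketch below builds the prefix byte from the operand width and three 4-bit register numbers. The function name `rex_prefix` and its parameter names are made up for this example; only the bit layout (0100 followed by W, R, X, B) comes from the architecture.

```cpp
#include <cassert>
#include <cstdint>

// Build a REX prefix byte: 0100 W R X B.
//   w     : true selects a 64-bit operand size
//   reg   : register number in ModRM.reg    (bit 3 becomes REX.R)
//   index : register number in SIB.index    (bit 3 becomes REX.X)
//   rm    : register number in ModRM.rm, SIB.base, or the opcode
//           register field                   (bit 3 becomes REX.B)
// The low three bits of each register number stay in ModRM/SIB as before.
inline uint8_t rex_prefix(bool w, unsigned reg, unsigned index, unsigned rm) {
    assert(reg < 16 && index < 16 && rm < 16);
    return static_cast<uint8_t>(0x40
                                | (w ? 0x08 : 0x00)
                                | ((reg >> 3) << 2)
                                | ((index >> 3) << 1)
                                | (rm >> 3));
}
```

For example, `rex_prefix(true, 0, 0, 0)` gives 0x48, the prefix seen on 64-bit operations that touch only the original eight registers, while any of R8 through R15 sets one of the low three bits.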
Immediate Values
Variable-length instructions allow for register-sized immediates (8 bytes)
  – MOV RAX, 0x abcdef
This is the only way to specify an 8-byte value in an instruction
Most importantly for Dyninst:
  – there is no JMP w/ 8-byte displacement

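The consequence for jump generation can be sketched as follows. This is an illustrative encoder, not Dyninst's code generator: `fits_rel32` and `emit_jump` are hypothetical names, and the far-jump fallback (load the target into RAX, then jump through it) is one common workaround that clobbers RAX, which real instrumentation would have to save or avoid.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// A direct JMP rel32 (opcode E9) is 5 bytes long; its signed 32-bit
// displacement is relative to the address of the next instruction.
inline bool fits_rel32(uint64_t from, uint64_t to) {
    int64_t disp = static_cast<int64_t>(to - (from + 5));
    return disp >= std::numeric_limits<int32_t>::min() &&
           disp <= std::numeric_limits<int32_t>::max();
}

// Emit a jump located at address `from` that transfers control to `to`.
inline std::vector<uint8_t> emit_jump(uint64_t from, uint64_t to) {
    std::vector<uint8_t> code;
    if (fits_rel32(from, to)) {
        uint32_t disp = static_cast<uint32_t>(to - (from + 5));
        code.push_back(0xE9);                          // JMP rel32
        for (int i = 0; i < 4; ++i)
            code.push_back(static_cast<uint8_t>(disp >> (8 * i)));
    } else {
        code.push_back(0x48);                          // REX.W
        code.push_back(0xB8);                          // MOV RAX, imm64
        for (int i = 0; i < 8; ++i)
            code.push_back(static_cast<uint8_t>(to >> (8 * i)));
        code.push_back(0xFF);                          // JMP RAX
        code.push_back(0xE0);
    }
    assert(code.size() == 5 || code.size() == 12);
    return code;
}
```

The near form costs 5 bytes at the instrumentation point and the far form 12, which is one reason to keep instrumentation code within reach of a 32-bit displacement.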
Handling 64-Bit Mutatees

Instruction Parsing
x86 instruction parser collects basic block information and searches for instrumentation points
We can use the same parsing algorithm for 64-bit mutatees
  – Architectural changes are abstracted away by instruction decoding
  – Bonus: support for stripped binaries

Executing Instrumentation
Dyninst maintains a heap of non-contiguous memory areas in the mutatee
Instrumentation points jump to code in nearby heap region
Code for this already exists (AIX, Solaris)
[Figure: Mutatee Address Space. The executable and the library code each have a dyninst heap region next to them; the two groups are separated by >> 4GB spacing.]

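Picking a nearby heap region for an instrumentation point might look like the sketch below. The `HeapRegion` type, the `kMaxReach` margin, and the function names are assumptions made for illustration; the existing AIX and Solaris code mentioned above is not shown in the talk. A region counts as reachable when both of its ends lie within the range of a 32-bit jump displacement from the point.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one heap area inside the mutatee.
struct HeapRegion {
    uint64_t base;
    uint64_t size;
};

// Conservative reach of a rel32 jump (assumption: a small margin covers
// instruction-length effects).
constexpr uint64_t kMaxReach = 0x7FFFFFFFull - 16;

inline uint64_t distance(uint64_t a, uint64_t b) {
    return a > b ? a - b : b - a;
}

// True if every byte of the region can be reached from `point`.
inline bool region_reachable(const HeapRegion &r, uint64_t point) {
    assert(r.size > 0);
    return distance(point, r.base) <= kMaxReach &&
           distance(point, r.base + r.size - 1) <= kMaxReach;
}

// Return the reachable region whose base is closest to `point`,
// or nullptr if a new region would have to be allocated nearby.
inline const HeapRegion *nearest_reachable(const std::vector<HeapRegion> &regions,
                                           uint64_t point) {
    const HeapRegion *best = nullptr;
    for (const HeapRegion &r : regions) {
        if (!region_reachable(r, point))
            continue;
        if (!best || distance(point, r.base) < distance(point, best->base))
            best = &r;
    }
    return best;
}
```

With one region near the executable and another near the shared libraries, a point in either area resolves to its own neighbor; a null result signals that no existing region is close enough.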
Code Generation
Improved architecture allows for more efficient code generation
  – Stack no longer used for passing arguments
  – More registers means stack no longer needed for temporary values

Good Things™
We have been able to leverage x86 port extensively (code reuse)
Some 32-bit headaches go away
  – Non-standard optimizations in mutatee code (_dl_open example)
More registers allow for more efficient instrumentation code

Status/Timeline
Now working:
  – 32-bit support
  – Instruction decoding, parsing
Left to do:
  – Code generation
  – Memory allocation
  – Counters, timers, and sampling code for Paradyn
Beta release: 2Q05
  – Available for partners and friends
Production release: 3Q05

Questions?