Dynamic Compilation and Modification CS 671 April 15, 2008.


CS 671 – Spring
So Far… Static Compilation
[Diagram: High-Level Programming Languages → Compiler → Machine Code, with Error Messages]
Digging Deeper…
[Diagram: High-Level Programming Languages → Front End → Back End → Machine Code, with Error Messages; the Front End and Back End together make up the Compiler]

Alternatives to the Traditional Model
Static Compilation
–All work is done “ahead-of-time”
Just-in-Time Compilation
–Postpone some compilation tasks
Multiversioning and Dynamic Feedback
–Include multiple options in binary
Dynamic Binary Optimization
–Traditional compilation model
–Executables can adapt

Move More of Compilation to Run Time
Execution environment may be quite different from the assumptions made at compile time
–Dynamically loaded libraries
–User inputs
–Hardware configurations
Dependence on software vendors
–Apps on tap
Incorporate profiling

Just-in-Time Compilation
Ship bytecodes (think IR) rather than binaries
–Binaries execute on machines
–Bytecodes execute on virtual machines
[Diagram: High-Level Programming Languages → Front End → Back End → Machine Code, with Error Messages; the Front End and Back End together make up the Compiler]

Just-in-Time Compilation
javac: the Java bytecode compiler
java: the Java virtual machine
Bytecode: machine independent, portable
Step One: “Compile” Circle.java
% javac Circle.java    (produces Circle.class)
Step Two: “Execute”
% java Circle    (the launcher takes the class name, not the .class file name)
javac: source → bytecode
java: bytecode → execute

Bytecodes
Each frame contains local variables and an operand stack
Instruction set
–Load/store between locals and operand stack
–Arithmetic on operand stack
–Object creation and method invocation
–Array/field accesses
–Control transfers and exceptions
The type of the operand stack at each program point is known at compile time

Bytecodes (cont.)
Example:
  iconst 2
  iload a
  iload b
  iadd
  imul
  istore c
Computes: c := 2 * (a + b)
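The stack discipline in this example can be traced with a small interpreter. The following is only a sketch under stated assumptions: the `Op` enum, the `Insn` struct, and the use of named locals instead of numbered slots are hypothetical simplifications for illustration and are not how a real JVM encodes bytecode.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical opcode set covering only the instructions in the slide's example.
enum class Op { IConst, ILoad, IStore, IAdd, IMul };

struct Insn {
    Op op;
    std::int32_t imm;   // used by IConst only
    std::string var;    // used by ILoad / IStore only
};

// Local variables of one frame, addressed by name for readability.
using Locals = std::map<std::string, std::int32_t>;

// Interpret the instructions one by one against an operand stack.
// Overflow behaviour is ignored in this sketch.
void interpret(const std::vector<Insn>& code, Locals& locals) {
    std::vector<std::int32_t> stack;  // operand stack
    auto pop = [&stack]() -> std::int32_t {
        assert(!stack.empty());
        std::int32_t v = stack.back();
        stack.pop_back();
        return v;
    };
    for (const Insn& in : code) {
        switch (in.op) {
            case Op::IConst: stack.push_back(in.imm); break;
            case Op::ILoad:  stack.push_back(locals.at(in.var)); break;
            case Op::IStore: locals[in.var] = pop(); break;
            case Op::IAdd: { std::int32_t r = pop(); std::int32_t l = pop(); stack.push_back(l + r); break; }
            case Op::IMul: { std::int32_t r = pop(); std::int32_t l = pop(); stack.push_back(l * r); break; }
        }
    }
}

// The slide's example: c := 2 * (a + b)
std::vector<Insn> example_program() {
    return {
        {Op::IConst, 2, ""},
        {Op::ILoad,  0, "a"},
        {Op::ILoad,  0, "b"},
        {Op::IAdd,   0, ""},
        {Op::IMul,   0, ""},
        {Op::IStore, 0, "c"},
    };
}
```

Calling `interpret(example_program(), locals)` with `a` and `b` already present in `locals` leaves the result in `locals["c"]`. The stack holds 2, then 2 a, then 2 a b, then 2 (a+b), then the product, and is empty after `istore c`.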

Executing Bytecode
java Circle: what happens?
Interpreting
–Map each bytecode to a machine code sequence
–For each bytecode, execute the sequence
Translation to machine code
–Map all the bytecodes to machine code (or a higher-level intermediate representation)
–Massage them (e.g., remove redundancies)
–Execute the machine code

Hotspot Compilation
A hybrid approach
–Initially interpret
–Find the “hot” (frequently executed) methods
–Translate only hot methods to machine code
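The hybrid policy can be modelled with a per-method invocation counter. This is a toy sketch, not how any particular JVM is implemented: `ToyVM`, `Method`, and the fixed threshold are hypothetical names chosen for illustration, and the "compiled" version is simply a second function standing in for generated machine code.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Toy model of a method: it starts out interpreted and is promoted to
// "compiled" once it has been invoked often enough.
struct Method {
    std::function<int(int)> interpreted;  // slow path
    std::function<int(int)> compiled;     // fast path, must compute the same result
    int invocations = 0;
    bool is_compiled = false;
};

class ToyVM {
public:
    explicit ToyVM(int hot_threshold) : hot_threshold_(hot_threshold) {
        assert(hot_threshold_ > 0);
    }

    void define(const std::string& name, Method m) { methods_[name] = std::move(m); }

    int call(const std::string& name, int arg) {
        Method& m = methods_.at(name);
        if (m.is_compiled) return m.compiled(arg);   // hot: run translated code
        if (++m.invocations >= hot_threshold_) {
            m.is_compiled = true;                    // promote for all later calls
        }
        return m.interpreted(arg);                   // cold: keep interpreting
    }

    bool is_compiled(const std::string& name) const { return methods_.at(name).is_compiled; }

private:
    int hot_threshold_;
    std::unordered_map<std::string, Method> methods_;
};
```

With a threshold of 3, the first three calls to a method take the interpreted path and every later call takes the compiled path, so translation cost is only paid for methods that turn out to be hot.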

The Virtual Machine
An extreme version of an old idea
[Diagram. Previously: separate MyApp binaries for x86 (P III, P IV), alpha, and pa-risc (PA-8000, PA-7000). Now: a single MyApp runs on a JVM (the VM layer), which in turn runs on P III, P IV, PA-8000, PA-7000.]

Compile-Time Multiversioning
Multiple versions of code sections are generated at compile-time
Most appropriate variant is selected at runtime based upon characteristics of the input data and/or machine environment
Multiple variants can cause code explosion
–Thus typically only a few versions are created
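A hand-written analogue of multiversioning is sketched below: two variants of one routine are both present in the binary, and a cheap runtime test on the input selects between them. The choice of sorting as the routine and the cutoff of 16 elements are assumptions made for illustration; a compiler would generate and tune such variants itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Variant 1: insertion sort, cheap for very small inputs.
void sort_small(std::vector<int>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        int key = v[i];
        std::size_t j = i;
        while (j > 0 && v[j - 1] > key) {
            v[j] = v[j - 1];
            --j;
        }
        v[j] = key;
    }
}

// Variant 2: general-purpose library sort for everything else.
void sort_large(std::vector<int>& v) { std::sort(v.begin(), v.end()); }

enum class Variant { Small, Large };

// Hypothetical cutoff; a real system would tune this per machine.
constexpr std::size_t kSmallCutoff = 16;

// Runtime selection based on a characteristic of the input data.
Variant choose_variant(std::size_t n) {
    return n <= kSmallCutoff ? Variant::Small : Variant::Large;
}

// Dispatch to the selected variant and report which one ran.
Variant sort_multiversioned(std::vector<int>& v) {
    Variant which = choose_variant(v.size());
    if (which == Variant::Small) {
        sort_small(v);
    } else {
        sort_large(v);
    }
    assert(std::is_sorted(v.begin(), v.end()));
    return which;
}
```

Each extra variant adds code to the binary, which is the code-explosion cost noted above; here there are only two.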

Another Alternative
Modify a traditional application as it executes
Why? Don’t have source code!
[Diagram: binary → ???? → modified binary]

A Dynamic Optimization System?
Transforms* an application at run time
* {translate, optimize, extend}
[Diagram components: Application, Transform, Code Cache, Execute, Profile]
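The transform / code cache / execute / profile cycle can be sketched as a dispatch loop. Everything here is a simplified model under stated assumptions: `DynamicOptimizer`, `Block`, and the convention that address 0 means "exit" are hypothetical, and a "translated block" is just a callable that returns the address of the next block.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

using Address = std::uint64_t;
// A translated block runs and returns the address of the next block (0 = exit).
using Block = std::function<Address()>;

class DynamicOptimizer {
public:
    explicit DynamicOptimizer(std::function<Block(Address)> translate)
        : translate_(std::move(translate)) {
        assert(translate_ && "a translation function is required");
    }

    void run(Address entry) {
        Address pc = entry;
        while (pc != 0) {
            auto it = cache_.find(pc);
            if (it == cache_.end()) {
                // Transform: translate the block once and keep it in the code cache.
                ++translations_;
                it = cache_.emplace(pc, translate_(pc)).first;
            }
            ++exec_count_[pc];   // Profile: count executions per block.
            pc = it->second();   // Execute: run from the code cache.
        }
    }

    std::uint64_t translations() const { return translations_; }

    std::uint64_t executions(Address pc) const {
        auto it = exec_count_.find(pc);
        return it == exec_count_.end() ? 0 : it->second;
    }

private:
    std::function<Block(Address)> translate_;
    std::unordered_map<Address, Block> cache_;
    std::unordered_map<Address, std::uint64_t> exec_count_;
    std::uint64_t translations_ = 0;
};
```

A block that executes many times is translated only once, and the per-block counts are the kind of profile a real system could use to decide what to optimize further.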

Classification
Dynamic binary optimizers (x86 → x86opt)
–Complement the static compiler
–User inputs, phases, DLLs, hardware features
–Examples: DynamoRIO, Mojo, Strata
Dynamic translators (x86 → PPC)
–Convert applications to run on a new architecture
–Examples: Rosetta, Transmeta CMS, DAISY
Binary instrumentation (x86 → x86instr)
–Inspect and/or add features to existing applications
–Examples: Pin, Valgrind
JITs + adaptive systems (Java bytecode → x86)

Dynamic Instrumentation Demo
Pin
–Four architectures: IA32, EM64T, IPF, XScale
–Four OSes: Linux, FreeBSD, MacOS, Windows

What is Instrumentation?
A technique that inserts extra code into a program to collect runtime information
Instrumentation approaches:
Source instrumentation:
–Instrument source programs
Binary instrumentation:
–Instrument executables directly

Why use Dynamic Instrumentation?
No need to recompile or relink
Discover code at runtime
Handle dynamically-generated code
Attach to running processes

How is Instrumentation used in PL/Compiler Research?
Program analysis
–Code coverage
–Call-graph generation
–Memory-leak detection
–Instruction profiling
Thread analysis
–Thread profiling
–Race detection

How is Instrumentation used in Computer Architecture Research?
Trace Generation
Branch Predictor and Cache Modeling
Fault Tolerance Studies
Emulating Speculation
Emulating New Instructions

Pin Features
Dynamic Instrumentation:
–Do not need source code, recompilation, post-linking
Programmable Instrumentation:
–Provides rich APIs to write in C/C++ your own instrumentation tools (called Pintools)
Multiplatform:
–Supports x86, x86-64, Itanium, XScale
–Supports Linux, Windows, MacOS
Robust:
–Instruments real-life applications: Database, web browsers, …
–Instruments multithreaded applications
–Supports signals
Efficient:
–Applies compiler optimizations on instrumentation code

Using Pin
Launch and instrument an application
$ pin -t pintool -- application
–pin: Instrumentation engine (provided in the kit)
–pintool: Instrumentation tool (write your own, or use one provided in the kit)
Attach to and instrument an application
$ pin -t pintool -pid 1234

Pin Instrumentation APIs
Basic APIs are architecture independent:
–Provide common functionalities like determining control-flow changes and memory accesses
Architecture-specific APIs
–e.g., Info about segmentation registers on IA32
Call-based APIs:
–Instrumentation routines
–Analysis routines

Instrumentation vs. Analysis
Instrumentation routines define where instrumentation is inserted
–e.g., before instruction
–Occurs the first time an instruction is executed
Analysis routines define what to do when instrumentation is activated
–e.g., increment counter
–Occurs every time an instruction is executed
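The difference in how often the two kinds of routine run can be seen in a small model that does not use the Pin API at all. `ToyInstrumenter`, `InstrumentationRoutine`, and `AnalysisRoutine` are hypothetical names for this sketch: the instrumentation routine is asked once per instruction address which analysis routine to attach, and the attached routine then runs on every execution of that address.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

using Address = std::uint64_t;
using AnalysisRoutine = std::function<void()>;
// Given an instruction address, decide what analysis code to attach to it.
using InstrumentationRoutine = std::function<AnalysisRoutine(Address)>;

class ToyInstrumenter {
public:
    explicit ToyInstrumenter(InstrumentationRoutine instrument)
        : instrument_(std::move(instrument)) {
        assert(instrument_ && "an instrumentation routine is required");
    }

    // Simulate executing the instruction at addr.
    void execute(Address addr) {
        auto it = inserted_.find(addr);
        if (it == inserted_.end()) {
            // First execution: run the instrumentation routine once
            // and remember what it inserted.
            ++instrumentation_calls_;
            it = inserted_.emplace(addr, instrument_(addr)).first;
        }
        it->second();  // Every execution: run the analysis routine.
    }

    std::uint64_t instrumentation_calls() const { return instrumentation_calls_; }

private:
    InstrumentationRoutine instrument_;
    std::unordered_map<Address, AnalysisRoutine> inserted_;
    std::uint64_t instrumentation_calls_ = 0;
};
```

Running a three-instruction loop ten times through this model calls the instrumentation routine three times and the analysis routine thirty times, which is why analysis routines dominate the overhead of a tool.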

Pintool 1: Instruction Count
  sub $0xff, %edx
  cmp %esi, %edx
  jle
  mov $0x1, %edi
  add $0x10, %eax
Inserted instrumentation: counter++;

Pintool 1: Instruction Count Output
$ /bin/ls
Makefile imageload.out itrace proccount imageload inscount0 atrace itrace.out
$ pin -t inscount0.so -- /bin/ls
Makefile imageload.out itrace proccount imageload inscount0 atrace itrace.out
Count

ManualExamples/inscount0.cpp

#include <iostream>
#include "pin.H"

UINT64 icount = 0;

// Analysis routine: runs every time an instrumented instruction executes.
void docount() { icount++; }

// Instrumentation routine: inserts a call to docount before each instruction.
void Instruction(INS ins, void *v) {
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END);
}

// Called when the application exits.
void Fini(INT32 code, void *v) {
    std::cerr << "Count " << icount << std::endl;
}

int main(int argc, char * argv[]) {
    PIN_Init(argc, argv);
    INS_AddInstrumentFunction(Instruction, 0);
    PIN_AddFiniFunction(Fini, 0);
    PIN_StartProgram();
    return 0;
}

Pintool 2: Instruction Trace
  sub $0xff, %edx
  cmp %esi, %edx
  jle
  mov $0x1, %edi
  add $0x10, %eax
Inserted instrumentation: Print(ip);
Need to pass ip argument to the analysis routine (printip())

Pintool 2: Instruction Trace Output
$ pin -t itrace.so -- /bin/ls
Makefile imageload.out itrace proccount imageload inscount0 atrace itrace.out
$ head -4 itrace.out
0x40001e90
0x40001e91
0x40001ee4
0x40001ee5

ManualExamples/itrace.cpp

#include <stdio.h>
#include "pin.H"

FILE * trace;

// Analysis routine: receives the instruction pointer and logs it.
void printip(void *ip) { fprintf(trace, "%p\n", ip); }

// Instrumentation routine: IARG_INST_PTR is the argument passed to the analysis routine.
void Instruction(INS ins, void *v) {
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)printip, IARG_INST_PTR, IARG_END);
}

// Called when the application exits.
void Fini(INT32 code, void *v) { fclose(trace); }

int main(int argc, char * argv[]) {
    trace = fopen("itrace.out", "w");
    PIN_Init(argc, argv);
    INS_AddInstrumentFunction(Instruction, 0);
    PIN_AddFiniFunction(Fini, 0);
    PIN_StartProgram();
    return 0;
}

Examples of Arguments to Analysis Routine
IARG_INST_PTR
–Instruction pointer (program counter) value
IARG_UINT32
–An integer value
IARG_REG_VALUE
–Value of the register specified
IARG_BRANCH_TARGET_ADDR
–Target address of the branch instrumented
IARG_MEMORY_READ_EA
–Effective address of a memory read
And many more … (refer to the Pin manual for details)

Instrumentation Points
Instrument points relative to an instruction:
Before (IPOINT_BEFORE)
After:
–Fall-through edge (IPOINT_AFTER)
–Taken edge (IPOINT_TAKEN_BRANCH)
[Figure: count() shown at the instrumentation points around the jle in the sequence below]
  cmp %esi, %edx
  jle
  mov $0x1, %edi
  :
  mov $0x8, %edi

Instrumentation Granularity
Instrumentation can be done at three different granularities:
Instruction
Basic block
–A sequence of instructions terminated at a control-flow changing instruction
–Single entry, single exit
Trace
–A sequence of basic blocks terminated at an unconditional control-flow changing instruction
–Single entry, multiple exits
Example (1 Trace, 2 BBs, 6 insts):
  sub $0xff, %edx
  cmp %esi, %edx
  jle
  mov $0x1, %edi
  add $0x10, %eax
  jmp

Pintool 3: Faster Instruction Count
Instrument at the granularity of basic blocks (bbl):
  sub $0xff, %edx
  cmp %esi, %edx
  jle
counter += 3
  mov $0x1, %edi
  add $0x10, %eax
counter += 2
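The saving from counting per basic block can be illustrated without Pin. The sketch below is a standalone model, not the Pintool itself: `Insn`, `basic_block_sizes`, and `Counters` are hypothetical names, and each instruction simply carries a flag saying whether it ends a basic block. The `jle` is written without a target because the slide text does not give one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct Insn {
    std::string text;
    bool ends_block;  // true for control-flow changing instructions
};

// Split a straight-line listing into basic blocks and return each block's size.
std::vector<std::size_t> basic_block_sizes(const std::vector<Insn>& code) {
    std::vector<std::size_t> sizes;
    std::size_t current = 0;
    for (const Insn& in : code) {
        ++current;
        if (in.ends_block) {
            sizes.push_back(current);
            current = 0;
        }
    }
    if (current > 0) sizes.push_back(current);  // trailing block without a branch
    return sizes;
}

struct Counters {
    std::uint64_t icount = 0;          // instructions counted
    std::uint64_t analysis_calls = 0;  // how many times analysis code ran
};

// Analysis routine for the block-level tool: one call per executed block.
void on_block_executed(Counters& c, std::size_t block_size) {
    assert(block_size > 0);
    c.icount += block_size;
    ++c.analysis_calls;
}

// The five instructions shown on the slide.
std::vector<Insn> slide_example() {
    return {
        {"sub $0xff, %edx", false},
        {"cmp %esi, %edx", false},
        {"jle", true},
        {"mov $0x1, %edi", false},
        {"add $0x10, %eax", false},
    };
}
```

For the slide's listing the block sizes are 3 and 2, so executing both blocks once adds 5 to the count with two analysis calls instead of five.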

Modifying Program Behavior
Pin allows you not only to observe but also to change program behavior
Ways to change program behavior:
–Add/delete instructions
–Change register values
–Change memory values
–Change control flow

Pin Internals

Pin’s Software Architecture
[Diagram: the Pintool, Pin, and the Application share one address space, above the Operating System and Hardware. Pin comprises a Virtual Machine (VM) with a JIT Compiler and an Emulation Unit, a Code Cache, and the Instrumentation APIs.]

Dynamic Instrumentation
[Diagram: Original code, Code cache, and Pin, shown in three steps]
Pin fetches the trace starting at block 1 and starts instrumentation
–The trace (1’, 2’, 7’) is placed in the code cache
–Exits point back to Pin
Pin transfers control into the code cache (block 1)
Pin fetches and instruments a new trace (3’, 5’, 6’)
–Trace linking

Implementation Challenges
Linking
–Straightforward for direct branches
–Tricky for indirects, invalidations
Re-allocating registers
Maintaining transparency
Self-modifying code
Supporting MT applications…

Pin’s Multithreading Support
Thread-safe accesses: Pin, Pintool, and App
–Pin: One thread in the VM at a time
–Pintool: Locks, ThreadID, event notification
–App: Thread-local spill area
Providing pthreads functions to instrumentation tools
[Diagram: the Pintool uses Pin’s mini-libpthread and the Application uses the System’s libpthread, each with a signal handler. Labels: set up signal handlers; redirect all other pthreads function calls to application’s libpthread]

Pin Overhead
[Chart: SPEC Integer 2006]

Adding User Instrumentation
[Chart not preserved in the transcript]

Dynamic Optimization Summary
Complement the static compiler
–Shouldn’t compete with static compilers
Observe execution pattern
–Optimize frequently executed code
–Optimization overhead could degrade performance
Exploits opportunities that
–Arise only at runtime: DLLs, runtime constants, hardware features, user patterns, etc.
–Are too expensive to fully exploit statically: path-sensitive optimizations