Overview: Motivations; Basic static and dynamic optimization methods; ADAPT; Dynamo
Motivations for Dynamic Optimization: Object-oriented languages result in delayed binding, which reduces the scope of static optimization. DLLs limit static compile-time optimizations. Heavyweight static compiler optimizations are impractical for Java JITs and dynamic binary translators.
Motivations for Dynamic Optimization: Computer system vendors rely entirely on software vendors to enable optimizations that take advantage of their hardware. Software is now commonly installed on a network file system server and run on machines of varying configurations.
Some Traditional Dynamic Optimization Techniques: Compile-Time Multiversioning –Multiple versions of code sections are generated at compile time –The most appropriate variant is selected at runtime based on characteristics of the input data and/or the machine environment –No runtime information can be exploited during code generation –Multiple variants can cause code explosion, so typically only a few versions are created
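A minimal sketch of compile-time multiversioning, written in Python for brevity. The two variants, the `SMALL_LIMIT` threshold, and the function names are all hypothetical; a real compiler would emit the variants and the selection test itself. The point is that both versions exist before the program runs, and only the choice between them happens at runtime.

```python
SMALL_LIMIT = 8  # hypothetical cutoff fixed ahead of time, not measured at runtime


def dot_small(xs, ys):
    # Variant for short inputs: a plain loop with no setup cost.
    total = 0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def dot_blocked(xs, ys, block=4):
    # Variant for longer inputs: works through fixed-size blocks,
    # standing in for an unrolled or tiled version.
    total = 0
    for start in range(0, len(xs), block):
        chunk = zip(xs[start:start + block], ys[start:start + block])
        total += sum(x * y for x, y in chunk)
    return total


def choose_variant(length):
    # Runtime selection uses only a characteristic of the input data.
    return dot_small if length <= SMALL_LIMIT else dot_blocked


def dot(xs, ys):
    return choose_variant(len(xs))(xs, ys)
```

Because every variant must be generated in advance, adding more selection criteria multiplies the number of versions, which is the code-explosion problem noted above.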
Some Traditional Dynamic Optimization Techniques: Dynamic Feedback –Similar to Compile-Time Multiversioning: multiple versions are generated at compile time, no runtime information can be exploited during code generation, and only a few versions are created to prevent code explosion –Chooses a variant by sampling: measures execution times for the variants and selects the fastest
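The sampling step can be sketched as follows. This is an illustration only: the function names, the `repeats` parameter, and the injectable `clock` argument are assumptions made so the example is easy to test, not part of any published dynamic feedback system.

```python
import time


def measure(fn, args, repeats=3, clock=time.perf_counter):
    # Run one variant a few times and keep its best observed time.
    best = float("inf")
    for _ in range(repeats):
        start = clock()
        fn(*args)
        best = min(best, clock() - start)
    return best


def select_by_sampling(variants, args, clock=time.perf_counter):
    # variants maps a name to a precompiled version of the same code section.
    timings = {name: measure(fn, args, clock=clock)
               for name, fn in variants.items()}
    fastest = min(timings, key=timings.get)
    return fastest, timings
```

The sampling phase itself costs time, since slow variants have to be run in order to be measured; this is one reason only a few variants are kept.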
Some Traditional Dynamic Optimization Techniques: Dynamic Compilation –Generates new code variants during program execution, taking advantage of runtime information –More overhead than the other methods –To reduce overhead, dynamic compilation is staged at compile time and is applied only to code sections that may benefit from it
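As a rough illustration of generating code from runtime information, the sketch below builds a function specialized for an exponent that is only known while the program runs. It uses Python's built-in `compile` and `exec`; the `specialize_power` helper is hypothetical and much simpler than a staged dynamic compiler.

```python
def specialize_power(exponent):
    # The exponent is a runtime value; the generated code has it "baked in",
    # so the specialized function contains no loop and no exponent test.
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    body = " * ".join(["x"] * exponent) if exponent > 0 else "1"
    source = f"def power(x):\n    return {body}\n"
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["power"]


cube = specialize_power(3)  # generates: def power(x): return x * x * x
```

Generating the code takes time on every new exponent, which mirrors the overhead concern above: it only pays off for sections that run often enough.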
ADAPT (Automated De-Coupled Adaptive Program Transformation) Michael J. Voss and Rudolf Eigenmann Purdue University
Overview of ADAPT: ADAPT tries to combine the features of the other methods. It uses a source-to-source compiler to perform optimizations. A dynamic selection mechanism selects the best code variant to run and does code generation (similar to a JIT).
Intervals: Optimization occurs at the granularity of intervals –Single entry, single exit; typically loop nests. The source-to-source compiler replaces each interval with an if-else block that selects between a call to the Dynamic Selector and the default static version.
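The if-else replacement might look roughly like the sketch below. The `DynamicSelector` class, its methods, and the `"scale"` interval name are hypothetical stand-ins, not ADAPT's real interface; the `calls` list exists only to show which path ran.

```python
class DynamicSelector:
    """Hypothetical stand-in for ADAPT's Dynamic Selector."""

    def __init__(self):
        self._variants = {}

    def register(self, interval_id, variant):
        self._variants[interval_id] = variant

    def has_variant(self, interval_id):
        return interval_id in self._variants

    def run(self, interval_id, *args):
        return self._variants[interval_id](*args)


selector = DynamicSelector()
calls = []  # records which version executed, for illustration only


def scale_static(values, factor):
    # Default static version of the interval (originally a loop nest).
    calls.append("static")
    return [v * factor for v in values]


def scale_optimized(values, factor):
    # An optimized variant that becomes available later, at runtime.
    calls.append("optimized")
    return [factor * v for v in values]


def scale_interval(values, factor):
    # The block inserted in place of the original interval.
    if selector.has_variant("scale"):
        return selector.run("scale", values, factor)
    else:
        return scale_static(values, factor)
```

Since an interval has a single entry and a single exit, swapping the whole region for a call is safe: control cannot enter or leave the middle of it.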
Compiler Component: ADAPT can use off-the-shelf compilers –Set different optimization flags or use different compilers to produce different variants. Optimizations include loop distribution, tiling, unrolling, and automatic parallelization.
ADAPT Components: The Inspector monitors the runtime environment –Timings of each interval and of each optimized variant of the interval –Machine configuration. This information is used to maintain and prioritize the Optimization Queue.
ADAPT Components: The Optimization Queue is a priority queue that orders interval descriptors by execution time –Used to minimize overhead by avoiding insignificant intervals. The Dynamic Selector chooses which variants to run –Variants become “stale” after a period of time and are removed
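A priority queue of this kind can be sketched with Python's `heapq` module. The class name, the `min_seconds` cutoff, and the use of a plain string as the interval descriptor are assumptions for illustration; the slides do not specify ADAPT's actual data structures.

```python
import heapq


class OptimizationQueue:
    """Hypothetical queue of interval descriptors, hottest first."""

    def __init__(self, min_seconds):
        self._heap = []
        self.min_seconds = min_seconds  # assumed cutoff for "insignificant"

    def report(self, interval_id, seconds):
        # Ignore intervals too cheap to be worth optimizing.
        if seconds < self.min_seconds:
            return False
        # heapq is a min-heap, so store negated time to pop the largest first.
        heapq.heappush(self._heap, (-seconds, interval_id))
        return True

    def pop_hottest(self):
        neg_seconds, interval_id = heapq.heappop(self._heap)
        return interval_id, -neg_seconds

    def __len__(self):
        return len(self._heap)
```

For example, after reporting `loop_a` at 2.0 s and `loop_c` at 7.5 s, `pop_hottest()` yields `loop_c` first, so optimization effort goes to the interval that dominates execution time.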
Sample Walkthrough: Source-to-source compilation. Start with the static version and run. The Inspector tracks which intervals are used frequently. Optimized variants of the frequently used intervals are generated, along with other runtime information. When an interval takes a sufficiently long time, the Dynamic Selector is called and chooses the best variant –If the variant takes too long, go back to the static version
Dynamo: A Transparent Dynamic Optimization System HP Labs Vasanth Bala, Evelyn Duesterwald, Sanjeev Banerjia
Overview of Dynamo: Takes binary instruction code as input. Optimizes the code dynamically without code annotations or binary means. Transparent operation: –Accepts and optimizes legacy code –Runs like a hybrid of a user DLL and a virtual machine
How Dynamo Starts: Dynamo takes over and takes a snapshot of the registers and environment stack. Dynamo then activates the “interpreter” –Intercepts and scans the native code from the program, like a filter. [Diagram labels: Program, Dynamo’s Interpreter, CPU]
How Dynamo Works [Flowchart; recoverable labels: “In cache already?” (yes / no), Fragment Cache, Potential Code Fragment, Start trace, End trace, Code Fragment, Optimize and link with other code fragments]
Code Fragments: The interpreter can create and optimize code fragments. Code fragments are code traces –A code trace starts when a certain piece of code has been executed many times –The program is most likely to follow the same path it takes while tracing
Code Fragments: Code fragments consist of: –Start, which is the instruction executed after a backward taken branch –End, which is a backward taken branch or a branch leading to another fragment. Code fragments are easy to optimize –One entrance, multiple exits –Requires one iteration of a backward and a forward data-flow analysis
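The start and end conditions can be illustrated with a small simulation over a stream of executed instruction addresses. This is a simplified sketch, not Dynamo's implementation: the `HOT_THRESHOLD` value is made up, a "backward taken branch" is approximated as any step to an address less than or equal to the previous one, and fragments are stored as plain tuples of addresses.

```python
from collections import Counter

HOT_THRESHOLD = 3  # hypothetical; the slides only say "executed many times"


def select_traces(pc_stream, threshold=HOT_THRESHOLD):
    counters = Counter()  # execution counts for backward-branch targets
    fragments = {}        # start address -> tuple of addresses in the trace
    recording = None      # addresses of the trace currently being recorded
    prev = None
    for pc in pc_stream:
        backward = prev is not None and pc <= prev
        if recording is not None:
            if backward or pc in fragments:
                # End of trace: backward branch, or entry to another fragment.
                fragments[recording[0]] = tuple(recording)
                recording = None
            else:
                recording.append(pc)
        if backward and recording is None and pc not in fragments:
            # Candidate start of trace: the target of a backward taken branch.
            counters[pc] += 1
            if counters[pc] >= threshold:
                recording = [pc]
        prev = pc
    return fragments
```

Feeding it five iterations of a loop over addresses 10 to 13 produces one fragment starting at address 10, recorded on the iteration in which the counter reaches the threshold.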
Optimizations of Fragments: Remove branches that only express fall-throughs; keep conditional branches. [Diagram labels: A, B, C, D; A, C, D] Optimizations include: constant propagation; copy propagation; loop-invariant code motion; strength reduction; branch, load, and assignment redundancy removal
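Because a fragment is a straight-line trace with no join points, an optimization such as constant propagation can be done in a single forward pass. The sketch below shows this on a made-up three-address instruction format (`const`, `copy`, `add`); the format and the function name are assumptions for illustration and are unrelated to the PA-8000 instruction set.

```python
def propagate_constants(trace):
    known = {}      # register name -> constant value known at this point
    optimized = []
    for op, dst, *srcs in trace:
        # Replace any source register whose value is a known constant.
        vals = [known.get(s, s) if isinstance(s, str) else s for s in srcs]
        if op == "const":
            known[dst] = vals[0]
            optimized.append(("const", dst, vals[0]))
        elif op == "copy":
            if isinstance(vals[0], int):
                known[dst] = vals[0]
                optimized.append(("const", dst, vals[0]))
            else:
                known.pop(dst, None)  # dst no longer holds a known constant
                optimized.append(("copy", dst, vals[0]))
        elif op == "add":
            if all(isinstance(v, int) for v in vals):
                known[dst] = vals[0] + vals[1]  # fold the addition
                optimized.append(("const", dst, known[dst]))
            else:
                known.pop(dst, None)
                optimized.append(("add", dst, *vals))
        else:
            raise ValueError(f"unknown opcode: {op}")
    return optimized
```

In a general control-flow graph the same analysis would need to iterate until the facts at merge points stop changing; on a single-entry trace one pass is enough.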
Linking Cached Fragments: Conditional branches and exits may be linked to other fragments in the cache, which speeds up Dynamo. If no fragment exists for a target, start another trace. [Diagram labels: A, C, D, B, D, E, F, G; true; false; Start tracing again]
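Fragment linking can be sketched as a table of exits that are patched when a matching fragment appears in the cache. The `Fragment` and `FragmentCache` classes and the `next_action` helper are hypothetical; real linking patches branch instructions in the cached machine code.

```python
class Fragment:
    def __init__(self, start, exit_targets):
        self.start = start
        # Each exit target maps to the linked fragment, or None if unlinked.
        self.links = {target: None for target in exit_targets}


class FragmentCache:
    def __init__(self):
        self.fragments = {}  # start address -> Fragment

    def add(self, frag):
        # Patch exits of existing fragments that lead to the new fragment.
        for other in self.fragments.values():
            if frag.start in other.links:
                other.links[frag.start] = frag
        self.fragments[frag.start] = frag
        # Link the new fragment's own exits to fragments already cached.
        for target in frag.links:
            frag.links[target] = self.fragments.get(target)

    def next_action(self, frag, target):
        # A linked exit jumps straight to cached code; otherwise, following
        # the slide, control leaves the cache and a new trace may start.
        linked = frag.links.get(target)
        if linked is not None:
            return ("jump", linked.start)
        return ("trace", target)
```

A linked exit avoids returning to the interpreter between fragments, which is where the speedup comes from.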
Cache Management: Adding a new fragment requires creating links between it and existing fragments. Deleting a fragment requires removing all of its links, which is slow. The cache may fill up with fragments –Flush when a lot of new code is being traced –This means the program has entered a new section
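One way to express the flush policy is to count fragment creations per time window and empty the whole cache when the rate spikes. The sketch below is an assumption-laden illustration: the `window` and `max_new_per_window` parameters, the `tick` method, and the class itself are hypothetical, and the slides do not give Dynamo's actual thresholds.

```python
class FlushingCache:
    """Hypothetical fragment cache that flushes on a burst of new fragments."""

    def __init__(self, window, max_new_per_window):
        self.window = window
        self.max_new_per_window = max_new_per_window
        self.fragments = {}
        self.new_in_window = 0
        self.ticks = 0
        self.flushes = 0

    def tick(self):
        # Called once per unit of execution; starts a new window when due.
        self.ticks += 1
        if self.ticks >= self.window:
            self.ticks = 0
            self.new_in_window = 0

    def add(self, start, code):
        self.new_in_window += 1
        if self.new_in_window > self.max_new_per_window:
            # Many new traces at once suggests a new program phase, so drop
            # everything in one step instead of unlinking fragments one by one.
            self.fragments.clear()
            self.flushes += 1
            self.new_in_window = 1
        self.fragments[start] = code
```

Flushing everything at once sidesteps the slow per-fragment link removal, at the cost of re-tracing any code that is still hot.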
Performance: Single PA-8000 processor. SpecInt95 benchmarks, compiled at O2, O4, O2+P, and O4+P, with and without Dynamo –O2 + Dynamo ran as well as O4 native –O4+P ran equally well with or without Dynamo. Overhead of Dynamo was 1.5% of the execution time on SpecInt95.
Conclusion: ADAPT and Dynamo –Take opposite approaches to internal representation: single-entry-single-exit intervals (ADAPT) versus single-entry-multiple-exits fragments (Dynamo); Dynamo uses binary code and ADAPT uses high-level source-to-source code –Both use standard compilers, need no special annotations, and utilize runtime information. ADAPT can allow programmers to customize the selection of optimizations.