Fast, Effective Dynamic Compilation
Joel Auslander, Matthai Philipose, Craig Chambers, et al.
PLDI’96
Department of Computer Science and Engineering, Univ. of Washington
Presented by Zhelong Pan, April 12th, 2002
Introduction
Enable optimizations based on the invariant data computed at run-time:
– Eliminate memory loads
– Perform constant propagation and folding
– Remove branches
– Fully unroll loops
Overhead: the run-time cost of dynamic compilation.
Goal: Fast compilation and high-quality code.
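To make the listed optimizations concrete, here is a minimal source-level analogy in Python. It is not the paper's machine-code mechanism: the function name `specialize_dot` and the dot-product example are assumptions chosen for illustration. Given a weight vector that is invariant at run time, the sketch folds the weights into the code (no loads of the weights), drops terms with zero weight (no branch on them), and fully unrolls the loop.

```python
def specialize_dot(weights):
    """Generate a dot product specialized to a fixed weight vector.

    Hypothetical illustration: `weights` plays the role of the run-time
    invariant data; the generated function only reads `x`.
    """
    # Fully unroll the loop and fold each weight in as a literal.
    # Terms with a zero weight are removed instead of tested at run time.
    terms = [f"{w!r} * x[{i}]" for i, w in enumerate(weights) if w != 0]
    body = " + ".join(terms) if terms else "0"
    source = f"def dot_specialized(x):\n    return {body}\n"
    namespace = {}
    exec(source, namespace)  # "compile" the specialized version
    return namespace["dot_specialized"], source
```

Calling `specialize_dot([2, 0, 3])` yields a function whose body is the single expression `2 * x[0] + 3 * x[2]`. The cost of generating and compiling that source is the overhead the slide refers to.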
Basic Ideas
Step 1: The programmer annotates the dynamic regions.
Step 2: The static compiler generates pre-optimized templates.
Step 3: At run time, compute the run-time constants and patch them into the templates.
Step 4: Use the new code during execution.
Templates: pre-compiled, optimized machine code containing holes to be filled with run-time data.
Set-up code: calculates the values of derived run-time constants.
Directives: instruct the dynamic compiler on how to produce executable code.
Annotation
dynamicRegion delineates the section of code to be dynamically compiled. Its arguments name the variables that are constant at entry to the dynamic region and remain unchanged within it.
The contents of arrays and pointer-based data reached through run-time constants are assumed to be run-time constants as well. Where that does not hold, the dynamic dereference operators should be used, e.g. x:=dynamic* p, x:=p dynamic->f, and x:=a dynamic[i].
The unroll annotation directs the compiler to completely unroll a loop.
To produce several compiled versions, each optimized for a different set of run-time constants, key variables are defined, e.g. dynamicRegion key(cache).
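The effect of key variables can be pictured as a table of compiled versions indexed by the key's value. The sketch below is an assumption about the behavior, not the paper's implementation; `make_keyed_region`, `compile_block_index`, and the block-size example are hypothetical names used only for illustration.

```python
def make_keyed_region(compile_version):
    """Return a runner that keeps one compiled version per key value."""
    versions = {}  # key value -> specialized code (must be hashable)

    def run(key, *args):
        if key not in versions:
            # First time this key is seen: pay the dynamic compilation cost.
            versions[key] = compile_version(key)
        # Later entries with the same key reuse the existing version.
        return versions[key](*args)

    return run, versions


compiled = []  # records which keys triggered a compilation


def compile_block_index(block_size):
    """Hypothetical 'dynamic compiler': specialize on a constant block size."""
    compiled.append(block_size)
    return lambda addr: addr // block_size


block_index, versions = make_keyed_region(compile_block_index)
```

Entering the region twice with the same key compiles once; a new key value produces a second, separately optimized version.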
Static Compiler
Identify the constant variables and expressions based on the annotated constants.
Split into set-up and template code subgraphs.
Apply the standard optimizations with few restrictions.
Generate machine code and stitcher directives.
Computing Derived Run-Time Constants
Start with the initial set of constants and propagate it through the control flow graph, updating the set after each instruction as follows:
+ x:=y iff y is a constant.
+ x:=y op z iff y and z are constants and op is an idempotent, side-effect-free, non-trapping operator.
+ x:=f(y1,…,yn) iff all yi are constants and f is an idempotent, side-effect-free, non-trapping function.
+ x:=*p iff p is a constant.
– x:=dynamic* p: x is not a constant.
– *p:=x
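The per-instruction rules can be sketched as a transfer function over a set of variable names. This is a minimal straight-line version under stated assumptions: the `Instr` encoding is hypothetical, every `op` and `call` is assumed to be idempotent, side-effect-free, and non-trapping, and a store is assumed to leave the set unchanged because the slide does not spell out that rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    kind: str          # "copy", "op", "call", "load", "dynamic_load", "store"
    dest: str = ""     # variable being assigned (unused for "store")
    srcs: tuple = ()   # variables read by the instruction


def transfer(consts, ins):
    """Return the set of run-time constant variables after one instruction."""
    if ins.kind in ("copy", "op", "call", "load"):
        # x is a derived constant iff every operand (or the pointer) is one.
        if all(s in consts for s in ins.srcs):
            return consts | {ins.dest}
        return consts - {ins.dest}
    if ins.kind == "dynamic_load":
        # x := dynamic* p never yields a constant.
        return consts - {ins.dest}
    # "store": assumption of this sketch, the variable set is left as-is.
    return consts


def analyze(initial, block):
    """Propagate the initial constants through a straight-line block."""
    consts = frozenset(initial)
    for ins in block:
        consts = transfer(consts, ins)
    return consts
```

For example, starting from `{"cache"}`, a load through `cache` produces a derived constant, while an operation that also reads a non-constant address does not.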
The computation is a forward dataflow analysis on the control flow graph.
[Figure: control flow graph examples labeled "test is not a constant" and "test is a constant"]
After a control flow merge, if a variable has the same run-time constant reaching definition along all predecessors, it is considered a constant after the merge.
Reachability Analysis (1)
A forward dataflow analysis that computes, at each program point, the branch conditions under which that point is reached.
Reachability Analysis (2)
If the reachability conditions of the merge predecessors are mutually exclusive, the merge is labeled a constant merge.
For a constant merge, the union of the predecessors' constant sets is taken. For a non-constant merge, the intersection is taken. (The paper implies this, although it does not state it explicitly.)
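The merge rule can be sketched as follows. The representation is an assumption made for illustration: a reachability condition is a set of conjunctions, each conjunction a frozenset of `(branch, outcome)` pairs over constant branches, and two conditions count as mutually exclusive when every pairing of their conjunctions disagrees on some branch. The names `contradicts`, `mutually_exclusive`, and `merge` are hypothetical.

```python
from itertools import combinations


def contradicts(conj_a, conj_b):
    """True if the two conjunctions require different outcomes of one branch."""
    outcomes = dict(conj_a)
    return any(b in outcomes and outcomes[b] != o for b, o in conj_b)


def mutually_exclusive(cond_a, cond_b):
    """True if no conjunction of cond_a can hold together with one of cond_b."""
    return all(contradicts(a, b) for a in cond_a for b in cond_b)


def merge(preds):
    """preds: list of (constant_set, reachability_condition), one per arc.

    Returns (is_constant_merge, constants_after_merge).
    """
    const_sets = [set(consts) for consts, _ in preds]
    conds = [cond for _, cond in preds]
    is_constant = all(
        mutually_exclusive(a, b) for a, b in combinations(conds, 2)
    )
    if is_constant:
        merged = set().union(*const_sets)       # constant merge: union
    else:
        merged = set.intersection(*const_sets)  # otherwise: intersection
    return is_constant, merged
```

With a single predecessor there are no pairs to compare, so the sketch reports a constant merge.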
Reachability Analysis (3)
The reachability conditions of the loop entry arc and the loop back edge are normally not mutually exclusive, so the loop head is treated as a non-constant merge.
For an unrolled loop, only one predecessor arc exists, so its loop head is a constant merge.
Setup Code, Templates, Directives
The static compiler divides each dynamic region into setup code and template code.
Setup code computes the run-time constants.
Templates contain all the remaining code, with “holes” embedded for the run-time constants.
Stitcher directives tell the stitcher how to patch the holes and unroll the loops.
Optimizations
Optimizations (e.g. CSE) can be performed both before and after the dynamic region is divided into setup and template code.
Optimizations performed afterwards must be modified slightly to deal with “holes”, which are marked as compile-time constants with unknown values.
Experimental Results
Discussion
+ Uses run-time data to drive optimization.
+ The idea of doing most of the work during static compilation is important for dynamic/adaptive compilers.
– The run-time constant assumption limits applicability. How much could real applications gain from it?
– Annotations are not added automatically by the compiler; doing so may require inter-procedural analysis and pointer analysis.