A Region-Based Compilation Technique for a Java Just-In-Time Compiler
  Toshio Suganuma, Toshiaki Yasue and Toshio Nakatani
  Presenter: Ioana Burcea
Agenda
  Research question
  FBC (function-based compilation) vs. RBC (region-based compilation)
  System overview
  Region exit handling
  Region selection
  Special optimizations
  Partial inlining
  Partial dead code elimination
  Escape analysis
  Experimental results
Research Question
  Method: the right unit of compilation?
  Methods often contain rarely or never executed code
    Waste of compilation time
    Conservative data flow analysis
    Restrictive method inlining
  Possible solution: region-based compilation (RBC)
    Region selection
    Partial inlining
    Region exit handling
Dynamic Optimization Framework
  Sampling profiler
    The more CPU time a method uses, the hotter it is
  Instrumenting profiler
    Basic block execution frequency
    Virtual/interface call receiver type distribution
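The sampling idea above can be illustrated with a minimal sketch. It assumes the profiler records which method is executing at each timer tick; the function name `rank_hot_methods` and the sample data are hypothetical and not taken from the paper.

```python
from collections import Counter

def rank_hot_methods(samples, top_n=3):
    """Rank methods by their share of timer samples.

    samples: one method name per timer tick (hypothetical input format).
    Returns up to top_n (method, share) pairs, hottest first.
    """
    counts = Counter(samples)
    total = sum(counts.values())
    # A method's share of samples approximates its share of CPU time.
    return [(method, count / total) for method, count in counts.most_common(top_n)]

# Made-up samples: the method observed at each of eight timer ticks.
samples = ["parse", "hash", "hash", "parse", "hash", "gc", "hash", "hash"]
ranking = rank_hot_methods(samples)
```

Here `hash` accounts for five of the eight samples, so it would be the first candidate for a higher optimization level.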
Region-Based Compilation
Intra-Procedural Region Selection
  Static heuristics
    Rare
      Backup blocks / guards (e.g., from devirtualization)
      Blocks that end with an exception-throwing instruction
      Exception handler blocks
      Blocks containing unresolved or uninitialized class references
    Non-rare
      Blocks that end with normal return instructions
  Dynamic profile information
    A block is non-rare if its dynamic count exceeds a certain threshold (which value do they use?)
    Takes priority over the static heuristics when the two conflict
  Iterative framework for region selection
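The selection steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's algorithm: the propagation rule (an unmarked block becomes rare when all of its successors are rare) and the default of treating leftover unmarked blocks as non-rare are assumptions, and all names and the example CFG are hypothetical.

```python
RARE, NON_RARE, UNKNOWN = "rare", "non-rare", "unknown"

def select_region(blocks, succs, static_marks, counts, threshold):
    """Classify basic blocks and return (region, marks).

    blocks: list of block ids; succs: block id -> list of successor ids.
    static_marks: seeds from the static heuristics (RARE or NON_RARE).
    counts: profiled execution counts; threshold: non-rare cutoff.
    """
    mark = {b: UNKNOWN for b in blocks}
    mark.update(static_marks)
    # Profile data overrides the static heuristics on conflict.
    for b, count in counts.items():
        if count > threshold:
            mark[b] = NON_RARE
    # Assumed propagation rule: an unmarked block that can only
    # continue into rare blocks is itself rare. Iterate to a fixed point.
    changed = True
    while changed:
        changed = False
        for b in blocks:
            s = succs.get(b, [])
            if mark[b] == UNKNOWN and s and all(mark[x] == RARE for x in s):
                mark[b] = RARE
                changed = True
    # Assumption: anything still unmarked stays in the region.
    for b in blocks:
        if mark[b] == UNKNOWN:
            mark[b] = NON_RARE
    return {b for b in blocks if mark[b] == NON_RARE}, mark

# Hypothetical CFG: 'backup' is statically rare but hot in the profile.
blocks = ["entry", "fast", "backup", "slow", "throw", "ret"]
succs = {"entry": ["fast", "backup", "slow"], "fast": ["ret"],
         "backup": ["ret"], "slow": ["throw"], "throw": [], "ret": []}
static_marks = {"backup": RARE, "throw": RARE, "ret": NON_RARE}
counts = {"entry": 1000, "fast": 900, "backup": 100, "ret": 1000}
region, mark = select_region(blocks, succs, static_marks, counts, threshold=50)
```

In this example `backup` stays in the region because the profile outranks the static heuristic, while `slow` is excluded because it can only reach the throwing block.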
Region Exit Handling
Partial Inlining
Special Optimizations
  Partial dead code elimination
    Pushes computations that are live only on region exit paths into the region exit basic blocks
  Partial escape analysis
    Objects that are local to the method are allocated on the stack
    Objects that are local to a single thread do not need synchronization
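Partial dead code elimination as described above can be sketched on a toy straight-line block. The sketch assumes an SSA-like form (each variable defined once), side-effect-free instructions, and a block that ends with a branch to either non-rare code or a region exit; the instruction format and the function name are hypothetical.

```python
def sink_to_region_exit(region_instrs, live_out_region, live_at_exit):
    """Split a block's instructions into (kept, sunk).

    region_instrs: list of (dest, uses) pairs in program order.
    live_out_region: variables needed by non-rare code after the block.
    live_at_exit: variables needed only on the region exit path.
    """
    needed_in_region = set(live_out_region)
    needed_at_exit = set(live_at_exit)
    kept, sunk = [], []
    # Walk backwards so that whole dependency chains can sink together.
    for dest, uses in reversed(region_instrs):
        if dest in needed_in_region:
            kept.append((dest, uses))
            needed_in_region.discard(dest)
            needed_in_region.update(uses)
        elif dest in needed_at_exit:
            # Live only on the exit path: move it to the exit block.
            sunk.append((dest, uses))
            needed_at_exit.discard(dest)
            needed_at_exit.update(uses)
        # Otherwise the result is never used and the instruction is dropped.
    kept.reverse()
    sunk.reverse()
    return kept, sunk

# 'msg' (and 't1', which feeds it) is needed only if the region is exited.
instrs = [("t1", ["a", "b"]), ("msg", ["t1"]), ("r", ["a"])]
kept, sunk = sink_to_region_exit(instrs, live_out_region={"r"},
                                 live_at_exit={"msg"})
```

After the transformation the common path computes only `r`; `t1` and `msg` are computed in the region exit block, in their original order.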
Partial Escape Analysis
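A toy illustration of why restricting escape analysis to the selected region can help. The event format, the set of escaping operations, and the block names are all hypothetical; any compensation that would be needed when a region exit is actually taken is not modeled here.

```python
# Operations assumed to make an object escape the method (hypothetical set).
ESCAPING_KINDS = {"store_static", "return", "pass_to_call"}

def escaping_objects(events, blocks=None):
    """Return the objects that escape, looking only at the given blocks.

    events: (block, kind, obj) triples describing what happens to each object.
    blocks: set of block ids to analyze, or None for the whole method.
    """
    return {obj for block, kind, obj in events
            if kind in ESCAPING_KINDS and (blocks is None or block in blocks)}

# 'p' is passed to a call only in the rare block b2; 'q' is returned in b1.
events = [("b0", "alloc", "p"), ("b1", "store_local", "p"),
          ("b2", "pass_to_call", "p"),
          ("b0", "alloc", "q"), ("b1", "return", "q")]
region = {"b0", "b1"}          # b2 is a rare block outside the region

whole_method = escaping_objects(events)            # p and q both escape
region_only = escaping_objects(events, region)     # only q escapes
stack_candidates = whole_method - region_only      # p becomes a candidate
```

Analyzed over the whole method, `p` escapes and would need heap allocation; analyzed over the region alone, it does not, which makes it a candidate for stack allocation or synchronization removal.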
Experimental Results
  Machine
    Pentium 4 Xeon 2.8 GHz, 1 GB, Windows XP
  Benchmarks
    SPECjvm: initial and max heap 128 MB
    SPECjbb: initial and max heap 256 MB
  Thresholds
    MMI to level-0 compilation: 500
    Timer interval for sampling profiler: 3 ms
    Number of samples: 10,000
    Max number of OSR: 10
    Other thresholds (rare BB, promotion to level-1 and level-2 compilation): ??
Configurations
  5 configurations
    FBC: function-based compilation (the baseline)
    RBC-noopt: RBC without any special optimizations
    RBC-nopi: RBC with partial dead code elimination and partial escape analysis, without partial inlining
    RBC-full: all optimizations on
    RBC-offline: RBC-full with offline profile information
      No OSR, no recompilation (?)
Statistics for RBC
Performance Improvement
Compile Time Ratio
Compiled Code Size Ratio
Peak Work Memory Usage Ratio
Discussion