A Region-Based Compilation Technique for a Java Just-In-Time Compiler
Toshio Suganuma, Toshiaki Yasue and Toshio Nakatani
Presenter: Ioana Burcea



Agenda
- Research question
- FBC (function-based compilation) vs. RBC (region-based compilation)
- System overview
- Region exit handling
- Region selection
- Special optimizations
- Partial inlining
- Partial dead code elimination
- Escape analysis
- Experimental results

Research Question
- Is the method the right unit of compilation?
- Methods often contain rarely or never executed code
  - Waste of compilation time
  - Conservative data flow analysis
  - Restrictive method inlining
- Possible solution: region-based compilation (RBC)
  - Region selection
  - Partial inlining
  - Region exit handling

Dynamic Optimization Framework
- Sampling profiler
  - The more CPU time a method uses, the hotter the method
- Instrumenting profiler
  - Basic block execution frequency
  - Virtual/interface call receiver type distribution
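The two kinds of data the instrumenting profiler collects can be illustrated with a small sketch. This is not the JIT's implementation: the class `InstrumentingProfiler`, its method names, and the `Circle`/`Square` receiver classes are hypothetical, chosen only to show what "basic block execution frequency" and "receiver type distribution" look like as data.

```python
from collections import Counter, defaultdict


class InstrumentingProfiler:
    """Toy stand-in for an instrumenting profiler (all names are hypothetical)."""

    def __init__(self):
        self.block_counts = Counter()                # basic block -> execution count
        self.receiver_types = defaultdict(Counter)   # call site -> receiver class counts

    def enter_block(self, block_id):
        self.block_counts[block_id] += 1

    def record_call(self, call_site, receiver):
        self.receiver_types[call_site][type(receiver).__name__] += 1

    def receiver_distribution(self, call_site):
        counts = self.receiver_types[call_site]
        total = sum(counts.values())
        return {name: n / total for name, n in counts.items()} if total else {}


class Circle:
    pass


class Square:
    pass


profiler = InstrumentingProfiler()
for i in range(10):
    profiler.enter_block("loop_body")
    shape = Square() if i == 9 else Circle()
    profiler.record_call("shape.area()", shape)
profiler.enter_block("exit")

print(profiler.block_counts)
print(profiler.receiver_distribution("shape.area()"))
```

A distribution dominated by one receiver class is the kind of evidence that makes guarded devirtualization attractive, with the guard's fallback path becoming a candidate rare block.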

Region-Based Compilation

Intra-Procedural Region Selection
- Static heuristics
  - Rare
    - Backup blocks / guards (e.g., devirtualizations)
    - Blocks that end with an exception-throwing instruction
    - Exception handler blocks
    - Blocks containing unresolved or uninitialized class references
  - Non-rare
    - Blocks that end with normal return instructions
- Dynamic profile information
  - A block is non-rare if its dynamic count is beyond a certain threshold (which value do they use?)
  - Higher priority when conflicting with static heuristics
- Iterative framework for region selection
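The selection rules above can be sketched as a seed-and-propagate pass over a control flow graph. The seeding follows the slide (static heuristics, overridden by profile counts above a threshold). The propagation rule is an assumption made for illustration, since the slide only says the framework is iterative: an unmarked block becomes rare if all its successors are rare, and non-rare if any successor is non-rare. `Block`, `seed`, and `select_region` are hypothetical names.

```python
from dataclasses import dataclass, field

RARE, NON_RARE, UNKNOWN = "rare", "non-rare", "unknown"


@dataclass
class Block:
    name: str
    succs: list = field(default_factory=list)  # names of successor blocks
    ends_with: str = "jump"                    # "jump", "return", or "throw"
    is_backup: bool = False                    # guard fallback, e.g. devirtualization
    is_handler: bool = False                   # exception handler block
    has_unresolved_class: bool = False
    count: int = None                          # dynamic count; None = no profile


def seed(block, threshold):
    # Dynamic profile information wins over the static heuristics.
    if block.count is not None and block.count > threshold:
        return NON_RARE
    if (block.is_backup or block.is_handler
            or block.has_unresolved_class or block.ends_with == "throw"):
        return RARE
    if block.ends_with == "return":
        return NON_RARE
    return UNKNOWN


def select_region(blocks, threshold=0):
    marks = {b.name: seed(b, threshold) for b in blocks}
    changed = True
    while changed:                             # iterate to a fixed point
        changed = False
        for b in blocks:
            if marks[b.name] != UNKNOWN or not b.succs:
                continue
            succ_marks = [marks[s] for s in b.succs]
            if all(m == RARE for m in succ_marks):
                marks[b.name] = RARE           # assumed rule: only leads to rare code
            elif any(m == NON_RARE for m in succ_marks):
                marks[b.name] = NON_RARE       # assumed rule: reaches non-rare code
            else:
                continue
            changed = True
    # Blocks still unknown are kept in the region (conservative choice).
    return {name for name, m in marks.items() if m != RARE}


cfg = [
    Block("entry", succs=["check"]),
    Block("check", succs=["fast", "guard_fail", "prep_throw"]),
    Block("fast", succs=["exit"]),
    Block("guard_fail", succs=["exit"], is_backup=True),
    Block("prep_throw", succs=["npe"]),
    Block("npe", ends_with="throw"),
    Block("exit", ends_with="return"),
]
region = select_region(cfg)
print(sorted(region))
```

In this example `prep_throw` has no static mark of its own but is excluded because it can only reach the throwing block, while `entry` and `check` are pulled into the region through `fast`.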

Region Exit Handling

Partial Inlining

Special Optimizations
- Partial dead code elimination
  - Pushing computations that are only live in the region exit paths to the region exit BB
- Partial escape analysis
  - Objects that are local to the method are allocated to the stack
  - Objects that are local to a single thread do not need synchronization
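The effect of partial dead code elimination can be shown with a hand-written before/after pair. This is only an illustration of the transformation's result, not of how the compiler performs it; `expensive`, `before`, `after`, and `hot_path_calls` are hypothetical names, and `expensive` stands in for any side-effect-free computation whose result is read only on the region exit path.

```python
calls = {"n": 0}  # counts how often the computation actually runs


def expensive(a, b):
    calls["n"] += 1
    return a * b + 1


def before(a, b, take_exit):
    t = expensive(a, b)        # computed on every execution
    if take_exit:              # rare path leaving the region
        return ("exit", t)     # the only use of t
    return a + b               # hot path never reads t


def after(a, b, take_exit):
    if take_exit:
        t = expensive(a, b)    # sunk into the region exit block
        return ("exit", t)
    return a + b


def hot_path_calls(fn, iterations=100):
    """Run fn on the hot path only and report how often 'expensive' ran."""
    calls["n"] = 0
    for i in range(iterations):
        fn(i, i + 1, False)
    return calls["n"]


print(hot_path_calls(before))  # 100
print(hot_path_calls(after))   # 0
```

Both versions return the same values on every path; the transformed one simply stops paying for the computation when the region exit is not taken.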

Partial Escape Analysis

Experimental Results
- Machine: Pentium 4 Xeon 2.8 GHz, 1 GB, Windows XP
- Benchmarks
  - SPECjvm: initial and max heap 128 MB
  - SPECjbb: initial and max heap 256 MB
- Thresholds
  - MMI to level-0 compilation: 500
  - Timer interval for sampling profiler: 3 ms
  - Number of samples: 10,000
  - Max number of OSR: 10
  - Other thresholds (rare basic block, promotion to level-1 and level-2 compilation): ??

Configurations
- 5 configurations
  - FBC: function-based compilation (the baseline)
  - RBC-noopt: RBC without any special optimizations
  - RBC-nopi: RBC with partial dead code elimination and partial escape analysis, without partial inlining
  - RBC-full: all optimizations on
  - RBC-offline: RBC-full with offline profile information
    - No OSR, no recompilation (?)

Statistics for RBC

Performance Improvement

Compile Time Ratio

Compiled Code Size Ratio

Peak Work Memory Usage Ratio

Discussion