The Use of Traces for Inlining in Java Programs Borys J. Bradel Tarek S. Abdelrahman Edward S. Rogers Sr.Department of Electrical and Computer Engineering.

Slides:



Advertisements
Similar presentations
Idempotent Code Generation: Implementation, Analysis, and Evaluation Marc de Kruijf ( ) Karthikeyan Sankaralingam CGO 2013, Shenzhen.
Advertisements

ECE 454 Computer Systems Programming Compiler and Optimization (I) Ding Yuan ECE Dept., University of Toronto
Software & Services Group, Developer Products Division Copyright© 2010, Intel Corporation. All rights reserved. *Other brands and names are the property.
1 IIES 2008 Thomas Heinz (Saarland University, CR/AEA3) | 22/03/2008 | © Robert Bosch GmbH All rights reserved, also regarding any disposal, exploitation,
Zhiguo Ge, Weng-Fai Wong, and Hock-Beng Lim Proceedings of the Design, Automation, and Test in Europe Conference, 2007 (DATE’07) April /4/17.
Overview Motivations Basic static and dynamic optimization methods ADAPT Dynamo.
Helper Threads via Virtual Multithreading on an experimental Itanium 2 processor platform. Perry H Wang et. Al.
Trace-Based Automatic Parallelization in the Jikes RVM Borys Bradel University of Toronto.
Online Performance Auditing Using Hot Optimizations Without Getting Burned Jeremy Lau (UCSD, IBM) Matthew Arnold (IBM) Michael Hind (IBM) Brad Calder (UCSD)
CS 536 Spring Intermediate Code. Local Optimizations. Lecture 22.
Wish Branches A Review of “Wish Branches: Enabling Adaptive and Aggressive Predicated Execution” Russell Dodd - October 24, 2006.
© 2002 IBM Corporation IBM Toronto Software Lab October 6, 2004 | CASCON2004 Interprocedural Strength Reduction Shimin Cui Roch Archambault Raul Silvera.
Dynamic Tainting for Deployed Java Programs Du Li Advisor: Witawas Srisa-an University of Nebraska-Lincoln 1.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Regression Test Selection for AspectJ Software Guoqing Xu and Atanas.
San Diego Supercomputer Center Performance Modeling and Characterization Lab PMaC Pin: Building Customized Program Analysis Tools with Dynamic Instrumentation.
Intermediate Code. Local Optimizations
Instrumentation and Profiling David Kaeli Department of Electrical and Computer Engineering Northeastern University Boston, MA
1 Feedback-directed optimizations with estimated edge profiles from hardware event sampling Open64 workshop, CGO 2008 April 6, 2008 Vinodha Ramasamy, Robert.
Adaptive Optimization in the Jalapeño JVM M. Arnold, S. Fink, D. Grove, M. Hind, P. Sweeney Presented by Andrew Cove Spring 2006.
Dynamic Optimization as typified by the Dynamo System See “Dynamo: A Transparent Dynamic Optimization System”, V. Bala, E. Duesterwald, and S. Banerjia,
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2009 Dynamic Compilation II John Cavazos University.
Secure Web Applications via Automatic Partitioning Stephen Chong, Jed Liu, Andrew C. Meyers, Xin Qi, K. Vikram, Lantian Zheng, Xin Zheng. Cornell University.
P ARALLEL P ROCESSING I NSTITUTE · F UDAN U NIVERSITY 1.
Fast, Effective Code Generation in a Just-In-Time Java Compiler Rejin P. James & Roshan C. Subudhi CSE Department USC, Columbia.
1 Dimension: An Instrumentation Tool for Virtual Execution Environments Jing Yang, Shukang Zhou and Mary Lou Soffa Department of Computer Science University.
7. Just In Time Compilation Prof. O. Nierstrasz Jan Kurs.
Adaptive Optimization in the Jalapeño JVM Matthew Arnold Stephen Fink David Grove Michael Hind Peter F. Sweeney Source: UIUC.
O VERVIEW OF THE IBM J AVA J UST - IN -T IME C OMPILER Presenters: Zhenhua Liu, Sanjeev Singh 1.
1 Introduction to JVM Based on material produced by Bill Venners.
1 Advance Computer Architecture CSE 8383 Ranya Alawadhi.
P ath & E dge P rofiling Michael Bond, UT Austin Kathryn McKinley, UT Austin Continuous Presented by: Yingyi Bu.
“Dynamo: A Transparent Dynamic Optimization System ” V. Bala, E. Duesterwald, and S. Banerjia, PLDI 2000 “Dynamo: A Transparent Dynamic Optimization System.
1 Fast and Efficient Partial Code Reordering Xianglong Huang (UT Austin, Adverplex) Stephen M. Blackburn (Intel) David Grove (IBM) Kathryn McKinley (UT.
Code Optimization 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture of a.
Lengthening Traces to Improve Opportunities for Dynamic Optimization Chuck Zhao, Cristiana Amza, Greg Steffan, University of Toronto Youfeng Wu Intel Research.
Instrumentation in Software Dynamic Translators for Self-Managed Systems Bruce R. Childers Naveen Kumar, Jonathan Misurda and Mary.
Statistical Analysis of Inlining Heuristics in Jikes RVM Jing Yang Department of Computer Science, University of Virginia.
Dynamo: A Transparent Dynamic Optimization System Bala, Dueterwald, and Banerjia projects/Dynamo.
Targeted Path Profiling : Lower Overhead Path Profiling for Staged Dynamic Optimization Systems Rahul Joshi, UIUC Michael Bond*, UT Austin Craig Zilles,
Limits of Instruction-Level Parallelism Presentation by: Robert Duckles CSE 520 Paper being presented: Limits of Instruction-Level Parallelism David W.
Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University IWPSE 2003 Program.
380C lecture 19 Where are we & where we are going –Managed languages Dynamic compilation Inlining Garbage collection –Opportunity to improve data locality.
Practical Path Profiling for Dynamic Optimizers Michael Bond, UT Austin Kathryn McKinley, UT Austin.
1 Garbage Collection Advantage: Improving Program Locality Xianglong Huang (UT) Stephen M Blackburn (ANU), Kathryn S McKinley (UT) J Eliot B Moss (UMass),
Trace Fragment Selection within Method- based JVMs Duane Merrill Kim Hazelwood VEE ‘08.
Compiler Optimizations ECE 454 Computer Systems Programming Topics: The Role of the Compiler Common Compiler (Automatic) Code Optimizations Cristiana Amza.
CSE 598c – Virtual Machines Survey Proposal: Improving Performance for the JVM Sandra Rueda.
A Generalized Architecture for Bookmark and Replay Techniques Thesis Proposal By Napassaporn Likhitsajjakul.
CS412/413 Introduction to Compilers Radu Rugina Lecture 18: Control Flow Graphs 29 Feb 02.
Dynamic Parallelization of JavaScript Applications Using an Ultra-lightweight Speculation Mechanism ECE 751, Fall 2015 Peng Liu 1.
Vertical Profiling : Understanding the Behavior of Object-Oriented Applications Sookmyung Women’s Univ. PsLab Sewon,Moon.
Jun 14, 2004RAM-SE'04 Workshop, Oslo, Norway 1 Negligent Class Loaders for Software Evolution Yoshiki Sato, Shigeru Chiba (Tokyo Institute of Technology.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2009 Method Profiling John Cavazos University.
1 ROGUE Dynamic Optimization Framework Using Pin Vijay Janapa Reddi PhD. Candidate - Electrical And Computer Engineering University of Colorado at Boulder.
Re-configurable Bus Encoding Scheme for Reducing Power Consumption of the Cross Coupling Capacitance for Deep Sub-micron Instructions Bus Siu-Kei Wong.
Introduction to Performance Tuning Chia-heng Tu PAS Lab Summer Workshop 2009 June 30,
University of Michigan Electrical Engineering and Computer Science 1 Low Cost Control Flow Protection Using Abstract Control Signatures Daya S Khudia and.
1 The Garbage Collection Advantage: Improving Program Locality Xianglong Huang (UT), Stephen M Blackburn (ANU), Kathryn S McKinley (UT) J Eliot B Moss.
1 Removing Impediments to Loop Fusion Through Code Transformations Bob Blainey 1, Christopher Barton 2, and Jos’e Nelson Amaral 2 1 IBM Toronto Software.
Online Subpath Profiling
The Simplest Heuristics May Be The Best in Java JIT Compilers
Feedback directed optimization in Compaq’s compilation tools for Alpha
Adaptive Code Unloading for Resource-Constrained JVMs
Inlining and Devirtualization Hal Perkins Autumn 2011
Inlining and Devirtualization Hal Perkins Autumn 2009
Adaptive Optimization in the Jalapeño JVM
Hyesoon Kim Onur Mutlu Jared Stark* Yale N. Patt
Lecture 17: Register Allocation via Graph Colouring
Garbage Collection Advantage: Improving Program Locality
Nikola Grcevski Testarossa JIT Compiler IBM Toronto Lab
Presentation transcript:

The Use of Traces for Inlining in Java Programs Borys J. Bradel Tarek S. Abdelrahman Edward S. Rogers Sr.Department of Electrical and Computer Engineering University of Toronto Toronto, Ontario, Canada

2 Introduction Feedback-directed systems provide information to a compiler regarding program behaviour Examples: Jikes RVM [AFG+00] Open Runtime Platform [Mic03] Source Code Compiler Program Feedback

3 Work Overview Explore whether traces are useful in offline feedback directed systems Create trace collection system for Jikes Use traces to guide Jikes’s built in optimizing compiler Help with a single optimization, inlining Improves execution time

4 Outline Background Implementation Results Related work Conclusion

5 Trace Definition A trace is a frequently executed sequence of unique basic blocks or instructions a=0 i=0 goto B2 a+=i i++ if (i<5) goto B1 return a B0 B1 B2 B3 Trace 1 public static int foo() { int a=0; for (int i=0;i<5;i++) a++; return a; }

6 Traces and Optimization Traces may offer a better opportunity for optimization: Enable inter-procedural analysis Reduce the amount of instructions optimized Simplify the control flow graph, allowing for more optimization

7 Multiple Methods Inter-procedural analysis without an additional framework Increase possibility of optimization B1,A1,B2 can be simplified to two instructions a+=(5+i) i++ B0 t=returned value a+=t i++ B3 B4 B1 call g(i) B2 t=5+i return t A1 Trace 1

8 Fewer Instructions Fewer instructions to optimize May allow for extra optimization If know that B3 is executed then know that t=5 B0 B6 B1 B5 B6 B2: t=f(...) Trace 1 B3: t=5 B4

9 Trace Exits Traces usually contain many basic blocks Traces may not execute completely Unlike basic blocks B0 B6 B1 B5 B6 B2 Trace 1 B3 B4

10 Trace Collection System Monitor program execution Record traces Start traces at frequently occurring events Backward branches Trace exits Returns Stop at backward branches and trace starts Captures frequently executed loops and functions a=0 i=0 goto B2 a+=i i++ if (i<5) goto B1 return a B0 B1 B2 B3 Trace 1

11 Jikes Baseline Compiler Optimizing Compiler Program Adaptive System

12 Jikes and our TCS Baseline Compiler Optimizing Compiler Program Adaptive System TCS Inform TCS Trace Information

13 Jikes – Second Phase Baseline Compiler Optimizing Compiler Program Adaptive System Trace Information

14 Inlining and Traces Traces are executed frequently Therefore invocations on traces should be inlined Reduce invocation overhead Allow for more opportunities for optimization May lead to large code expansion a:call b() b: … method a() … invoke b() … method b() …

15 Code Expansion Control There are ways to control inline expansion Inline sequences [HG03,BB04] Selectively inlining: What if compile method a()? What if compile method b()? a:call b() b:call c() c:…c:…

16 Code Expansion Control Compile method a() Inline methods b() and c() Compile method b() No inlining method a() … invoke b() … method c() … method b() … invoke c() … method b() … invoke c() … method c() …

17 Results Provide inline information to Jikes based on previous executions Compare our approach to two others: Inline information provided by the Adaptive system of Jikes A greedy algorithm based on work by Arnold et al. [Arn00] Evaluate two approaches: Just in Time and Ahead of Time Measure overhead of system

18 JIT Inlining – Execution Time

19 JIT Inlining – Compilation Time

20 JIT Inlining – Code Expansion

21 AOT Inlining – Execution Time

22 AOT Inlining – Compilation Time

23 Overhead

24 Related Work Arnold et al. [Arn00] Feedback-directed inlining in Java Collected edge counts at method invocations Used a greedy algorithm to select inlines that maximize invocations relative to code expansion Dynamo [BDB99] Trace collection system PA-RISC architecture Assembly Instructions Compiled traces

25 Conclusions Traces are beneficial for inlining: Decreased execution time compared to one approach Decrease competitive with another approach Increases compilation time and code size A potential avenue of future research

26 Future Work Different trace collection strategies Trace based compilation and execution Reduction of code size Application of traces to other optimizations Usage of an online feedback directed system

27 References [MSD00] Matthew Arnold, Stephen Fink, David Grove, Michael Hind, and Peter F. Sweeney. Adaptive optimization in the Jalapeno JVM. ACM SIGPLAN Notices, 35(10):47-65, [Mic03] Michael Cierniak et al. The open runtime platform: A flexible high-performance managed runtime environment. Intel Technology Journal, February [HG03] Kim Hazelwood and David Grove. Adaptive online context- sensitive inlining. International Symposium on Code Generation and Optimization, p , [BB04] Bradel, B.J.: The use of traces in optimization. Master’s thesis, University of Toronto (2004). [Arn00] Matthew Arnold et al: A comparative study of static and profile-based heuristics for inlining. SIGPLAN Workshop on Dynamic and Adaptive Compilation and Optimization. (2000) [BDB99] Vasanth Bala, Evelyn Duesterwald, and Sanjeev Banerjia. Transparent dynamic optimization: The design and implementation of dynamo. HP Laboratories Technical Report HPL1999 –78, 1999.

28 AOT – Compilation Time (Wall Time)