Uncovering Performance Problems in Java Applications with Reference Propagation Profiling PRESTO: Program Analyses and Software Tools Research Group, Ohio.

Slides:



Advertisements
Similar presentations
TWO STEP EQUATIONS 1. SOLVE FOR X 2. DO THE ADDITION STEP FIRST
Advertisements

Shared-Memory Model and Threads Intel Software College Introduction to Parallel Programming – Part 2.
Slide 1 Insert your own content. Slide 2 Insert your own content.
Software Transactional Objects Guy Eddon Maurice Herlihy TRAMP 2007.
Credit hours: 4 Contact hours: 50 (30 Theory, 20 Lab) Prerequisite: TB143 Introduction to Personal Computers.
© 2010 Pearson Addison-Wesley. All rights reserved. Addison Wesley is an imprint of Chapter 11: Structure and Union Types Problem Solving & Program Design.
6 Copyright © 2005, Oracle. All rights reserved. Building Applications with Oracle JDeveloper 10g.
Multiplication Facts Review. 6 x 4 = 24 5 x 5 = 25.
0 - 0.
1 1  1 =.
1  1 =.
2 pt 3 pt 4 pt 5 pt 1 pt 2 pt 3 pt 4 pt 5 pt 1 pt 2 pt 3 pt 4 pt 5 pt 1 pt 2 pt 3 pt 4 pt 5 pt 1 pt 2 pt 3 pt 4 pt 5 pt 1 pt Time Money AdditionSubtraction.
2 pt 3 pt 4 pt 5 pt 1 pt 2 pt 3 pt 4 pt 5 pt 1 pt 2 pt 3 pt 4 pt 5 pt 1 pt 2 pt 3 pt 4 pt 5 pt 1 pt 2 pt 3 pt 4 pt 5 pt 1 pt ShapesPatterns Counting Number.
Addition Facts
CS4026 Formal Models of Computation Running Haskell Programs – power.
Welcome to Who Wants to be a Millionaire
Year 6/7 mental test 5 second questions
EECE 310: Software Engineering Modular Decomposition, Abstraction and Specifications.
Copyright 2007, Information Builders. Slide 1 Performance and Tuning Mark Nesson, Vashti Ragoonath June 2008.
Runtime Techniques for Efficient and Reliable Program Execution Harry Xu CS 295 Winter 2012.
Randomized Algorithms Randomized Algorithms CS648 1.
Complex Test Suites Organization Victor Kuliamin ISP RAS, Moscow.
Chapter 1 Object Oriented Programming 1. OOP revolves around the concept of an objects. Objects are created using the class definition. Programming techniques.
Chapter 10: Virtual Memory
The Modular Structure of Complex Systems Team 3 Nupur Choudhary Aparna Nanjappa Mark Zeits.
1 CSC 221: Computer Programming I Fall 2006 interacting objects modular design: dot races constants, static fields cascading if-else, logical operators.
1 Directed Depth First Search Adjacency Lists A: F G B: A H C: A D D: C F E: C D G F: E: G: : H: B: I: H: F A B C G D E H I.
Past Tense Probe. Past Tense Probe Past Tense Probe – Practice 1.
1 Processes and Threads Chapter Processes 2.2 Threads 2.3 Interprocess communication 2.4 Classical IPC problems 2.5 Scheduling.
Processes Management.
DB analyzer utility An overview 1. DB Analyzer An application used to track discrepancies and other reports in Sanchay Post Constantly updated by SDC.
Addition 1’s to 20.
Test B, 100 Subtraction Facts
11 = This is the fact family. You say: 8+3=11 and 3+8=11
Week 1.
Bottoms Up Factoring. Start with the X-box 3-9 Product Sum
AVL Trees (Adelson – Velskii – Landis) AVL tree:
Foundations of Data Structures Practical Session #7 AVL Trees 2.
Go with the Flow: Profiling Copies to Find Run-time Bloat Guoqing Xu, Matthew Arnold, Nick Mitchell, Atanas Rountev, Gary Sevitsky Ohio State University.
Resurrector: A Tunable Object Lifetime Profiling Technique Guoqing Xu University of California, Irvine OOPSLA’13 Conference Talk 1.
Finding Low-Utility Data Structures Guoqing Xu 1, Nick Mitchell 2, Matthew Arnold 2, Atanas Rountev 1, Edith Schonberg 2, Gary Sevitsky 2 1 Ohio State.
Software Bloat Analysis: Detecting, Removing, and Preventing Performance Problems in Modern Large- Scale Object-Oriented Applications Guoqing Xu, Nick.
Scaling CFL-Reachability-Based Points- To Analysis Using Context-Sensitive Must-Not-Alias Analysis Guoqing Xu, Atanas Rountev, Manu Sridharan Ohio State.
LeakChaser: Helping Programmers Narrow Down Causes of Memory Leaks Guoqing Xu, Michael D. Bond, Feng Qin, Atanas Rountev Ohio State University.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Regression Test Selection for AspectJ Software Guoqing Xu and Atanas.
Detecting Inefficiently-Used Containers to Avoid Bloat Guoqing Xu and Atanas Rountev Department of Computer Science and Engineering Ohio State University.
Quarantine: A Framework to Mitigate Memory Errors in JNI Applications Du Li , Witawas Srisa-an University of Nebraska-Lincoln.
1 Memory Model of A Program, Methods Overview l Memory Model of JVM »Method Area »Heap »Stack.
CACHETOR Detecting Cacheable Data to Remove Bloat Khanh Nguyen Guoqing Xu UC Irvine USA.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University STATIC ANALYSES FOR JAVA IN THE PRESENCE OF DISTRIBUTED COMPONENTS AND.
PRESTO Research Group, Ohio State University Interprocedural Dataflow Analysis in the Presence of Large Libraries Atanas (Nasko) Rountev Scott Kagan Ohio.
Control Flow Resolution in Dynamic Language Author: Štěpán Šindelář Supervisor: Filip Zavoral, Ph.D.
Understanding Parallelism-Inhibiting Dependences in Sequential Java Programs Atanas (Nasko) Rountev Kevin Van Valkenburgh Dacong Yan P. Sadayappan Ohio.
Computer Science Detecting Memory Access Errors via Illegal Write Monitoring Ongoing Research by Emre Can Sezer.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to.
Rethinking Soot for Summary-Based Whole- Program Analysis PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Dacong Yan.
Replay Compilation: Improving Debuggability of a Just-in Time Complier Presenter: Jun Tao.
Static Detection of Loop-Invariant Data Structures Harry Xu, Tony Yan, and Nasko Rountev University of California, Irvine Ohio State University 1.
CBSE'051 Component-Level Dataflow Analysis Atanas (Nasko) Rountev Ohio State University.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Efficient Checkpointing of Java Software using Context-Sensitive Capture.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to.
CoCo: Sound and Adaptive Replacement of Java Collections Guoqing (Harry) Xu Department of Computer Science University of California, Irvine.
Detecting Inefficiently-Used Containers to Avoid Bloat Guoqing Xu and Atanas Rountev Department of Computer Science and Engineering Ohio State University.
Vertical Profiling : Understanding the Behavior of Object-Oriented Applications Sookmyung Women’s Univ. PsLab Sewon,Moon.
Static Analysis of Object References in RMI-based Java Software
Debugging Memory Issues
Compositional Pointer and Escape Analysis for Java Programs
CACHETOR Detecting Cacheable Data to Remove Bloat
Capriccio – A Thread Model
Demand-Driven Context-Sensitive Alias Analysis for Java
Presentation transcript:

Uncovering Performance Problems in Java Applications with Reference Propagation Profiling PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Dacong Yan 1, Guoqing Xu 2, Atanas Rountev 1 1 Ohio State University 2 University of California, Irvine

Overview Performance inefficiencies – Often exist in Java applications – Excessive memory usage – Long running times, even for simple tasks Challenges – Limited compiler optimizations – Complicated behavior – Large libraries and frameworks Solution: manual tuning assisted with performance analysis tools 2

An Example 1 class Vec { 2 double x, y; 3 sub(v) { 4 res=new Vec(x-v.x, y-v.y); 5 return res; 6 } 7 } 3 8 for (i = 0; i < N; i++) { 9 t = in[i+2].sub(a[i-1]); 10 q[i] = t; 11 t = in[i+1].sub(a[i-2]); 12 // use of fields of t 13 } … … 80 t=q[*]; 81 // use of fields of t

An Example 1 class Vec { 2 double x, y; 3 sub(v) { 4 res=new Vec(x-v.x, y-v.y); 5 return res; 6 } 7 } 4 8 for (i = 0; i < N; i++) { 9 t = in[i+2].sub(a[i-1]); 10 q[i] = t; 11 t = in[i+1].sub(a[i-2]); 12 // use of fields of t 13 } … … 80 t=q[*]; 81 // use of fields of t

An Example 1 class Vec { 2 double x, y; 3 sub(v) { 4 res=new Vec(x-v.x, y-v.y); 5 return res; 6 } 7 } 5 8 for (i = 0; i < N; i++) { 9 t = in[i+2].sub(a[i-1]); 10 q[i] = t; 11 t = in[i+1].sub(a[i-2]); 12 // use of fields of t 13 } … … 80 t=q[*]; 81 // use of fields of t

8 for (i = 0; i < N; i++) { 9 t = in[i+2].sub(a[i-1]); 10 q[i] = t; 11 t = in[i+1].sub(a[i-2]); 12 // use of fields of t 13 } … … 80 t=q[*]; 81 // use of fields of t An Example 1 class Vec { 2 double x, y; 3 sub(v) { 4 res=new Vec(x-v.x, y-v.y); 5 return res; 6 } 7 } 6

An Example 1 class Vec { 2 double x, y; 3 sub(v) { 4 res=new Vec(x-v.x, y-v.y); 5 return res; 6 } 7 } 7 8 for (i = 0; i < N; i++) { 9 t = in[i+2].sub(a[i-1]); 10 q[i] = t; 11 t = in[i+1].sub(a[i-2]); 12 // use of fields of t 13 } … … 80 t=q[*]; 81 // use of fields of t

Implementation Reference propagation profiling – Implemented in Jikes RVM – Modify the runtime compiler for code instrumentation – Create shadow locations to track data dependence – Instrument method calls to track interprocedural propagation Overheads – Space: 2-3× – Time: 30-50× 8

Code Shadow Graph 6 a = new A; a ʹ = RefAssign(6,6) 7 b = a; b ʹ = RefAssign(6,7) 8 c = new C; c ʹ = RefAssign(8, 8) 9 b.fld = c; b.fld ʹ = RefAssign(8, 9) Interprocedural propagation – Per-thread scratch space save and restore shadows for parameters and return variables Reference Propagation Profiling Intraprocedural propagation – Shadows for every memory location (stack and heap) to record last assignment that writes to it – Update shadows and the graph accordingly 9

Client Analyses Not-assigned-to-heap (NATH) analysis – Locate producer nodes that do not reach heap propagation nodes (heap reads and writes) – Variant: mostly-NATH analysis Cost-benefit imbalance analysis – Detect imbalance between the cost of interesting operations, and the benefits they produce – For example, analysis of write read imbalance Analysis of never-used allocations – Identify producer nodes that do not reach the consumer node – Variant: analysis of rarely-used allocations 10

8 for (i = 0; i < N; i++) { 9 t = in[i+2].sub(a[i-1]); 10 q[i] = t; 11 t = in[i+1].sub(a[i-2]); 12 // use of fields of t 13 } … … 80 t=q[*]; 81 // use of fields of t A Real Tuning Session 1 class Vec { 2 double x, y; 3 sub(v) { 4 res=new Vec(x-v.x, y-v.y); 5 return res; 6 } 7 } 11

8 for (i = 0; i < N; i++) { 9 t = in[i+2].sub(a[i-1]); 10 q[i] = t; 11 t = in[i+1].sub(a[i-2]); 12 // use of fields of t 13 } … … 80 t=q[*]; 81 // use of fields of t A Real Tuning Session 1 class Vec { 2 double x, y; 3 sub(v) { 4 res=new Vec(x-v.x, y-v.y); 5 return res; 6 } 7 }

A Real Tuning Session 1 class Vec { 2 double x, y; 3 sub_rev(v, res) { 4 res.x = x-v.x; 5 res.y = y-v.y; 6 } 7 } 13 8 for (i = 0; i < N; i++) { 9 t = in[i+2].sub(a[i-1]); 10 q[i] = t; 11 in[i+1].sub_rev(a[i-2], nt); 12 // use of fields of nt 13 } … … 80 t=q[*]; 81 // use of fields of t 1 class Vec { 2 double x, y; 3 sub(v) { 4 res=new Vec(x-v.x, y-v.y); 5 return res; 6 } 7 } tuning 8 for (i = 0; i < N; i++) { 9 t = in[i+2].sub(a[i-1]); 10 q[i] = t; 11 t = in[i+1].sub(a[i-2]); 12 // use of fields of t 13 } … … 80 t=q[*]; 81 // use of fields of t nt = new Vec; // reusable Reductions: 13% in running time and 73% in #allocated objects

Examples of Inefficiency Patterns Temporary objects for method returns – Reductions for euler : 13% in running time and 73% in #allocated objects Redundant data representation –mst : 63% and 40% Unnecessary eager object creation –chart : 8% and 8% –jflex : 3% and 27% Expensive specialization for sanity checks –bloat : 10% and 11% 14

Conclusions Reference propagation profiling in Jikes RVM Understanding reference propagation is a good starting point for performance tuning Client analyses can uncover performance inefficiencies, and lead to effective tuning solutions 15

Thank you 16