Analyzing Large-Scale Object-Oriented Software to Find and Remove Runtime Bloat Guoqing Xu CSE Department Ohio State University Ph.D. Thesis Defense Aug.

Slides:



Advertisements
Similar presentations
Uncovering Performance Problems in Java Applications with Reference Propagation Profiling PRESTO: Program Analyses and Software Tools Research Group, Ohio.
Advertisements

Runtime Techniques for Efficient and Reliable Program Execution Harry Xu CS 295 Winter 2012.
OOPSLA 2005 Workshop on Library-Centric Software Design The Diary of a Datum: An Approach to Modeling Runtime Complexity in Framework-Based Applications.
Automatic Memory Management Noam Rinetzky Schreiber 123A /seminar/seminar1415a.html.
ECE 454 Computer Systems Programming Compiler and Optimization (I) Ding Yuan ECE Dept., University of Toronto
Go with the Flow: Profiling Copies to Find Run-time Bloat Guoqing Xu, Matthew Arnold, Nick Mitchell, Atanas Rountev, Gary Sevitsky Ohio State University.
A Randomized Dynamic Program Analysis for Detecting Real Deadlocks Koushik Sen CS 265.
A Regression Test Selection Technique for Aspect- Oriented Programs Guoqing Xu The Ohio State University
CORK: DYNAMIC MEMORY LEAK DETECTION FOR GARBAGE- COLLECTED LANGUAGES A TRADEOFF BETWEEN EFFICIENCY AND ACCURATE, USEFUL RESULTS.
Recap from last time We were trying to do Common Subexpression Elimination Compute expressions that are available at each program point.
Pointer and Shape Analysis Seminar Context-sensitive points-to analysis: is it worth it? Article by Ondřej Lhoták & Laurie Hendren from McGill University.
Finding Low-Utility Data Structures Guoqing Xu 1, Nick Mitchell 2, Matthew Arnold 2, Atanas Rountev 1, Edith Schonberg 2, Gary Sevitsky 2 1 Ohio State.
Software Bloat Analysis: Detecting, Removing, and Preventing Performance Problems in Modern Large- Scale Object-Oriented Applications Guoqing Xu, Nick.
LeakChaser: Helping Programmers Narrow Down Causes of Memory Leaks Guoqing Xu, Michael D. Bond, Feng Qin, Atanas Rountev Ohio State University.
Chapter 1 Program Design
Detecting Inefficiently-Used Containers to Avoid Bloat Guoqing Xu and Atanas Rountev Department of Computer Science and Engineering Ohio State University.
CACHETOR Detecting Cacheable Data to Remove Bloat Khanh Nguyen Guoqing Xu UC Irvine USA.
Introduction to Systems Analysis and Design
Efficient Instruction Set Randomization Using Software Dynamic Translation Michael Crane Wei Hu.
DATA STRUCTURE Subject Code -14B11CI211.
Impact Analysis of Database Schema Changes Andy Maule, Wolfgang Emmerich and David S. Rosenblum London Software Systems Dept. of Computer Science, University.
XFindBugs: eXtended FindBugs for AspectJ Haihao Shen, Sai Zhang, Jianjun Zhao, Jianhong Fang, Shiyuan Yao Software Theory and Practice Group (STAP) Shanghai.
A Portable Virtual Machine for Program Debugging and Directing Camil Demetrescu University of Rome “La Sapienza” Irene Finocchi University of Rome “Tor.
P ARALLEL P ROCESSING I NSTITUTE · F UDAN U NIVERSITY 1.
Programming. What is a Program ? Sets of instructions that get the computer to do something Instructions are translated, eventually, to machine language.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2011 Course Overview John Cavazos University.
Mihir Daptardar Software Engineering 577b Center for Systems and Software Engineering (CSSE) Viterbi School of Engineering 1.
Bug Localization with Machine Learning Techniques Wujie Zheng
A Framework for Elastic Execution of Existing MPI Programs Aarthi Raveendran Graduate Student Department Of CSE 1.
Conrad Benham Java Opcode and Runtime Data Analysis By: Conrad Benham Supervisor: Professor Arthur Sale.
JCMP: Linking Architecture with Component Building Guoqing Xu, Zongyuan Yang and Haitao Huang Software Engineering Lab, East China Normal University SACT-01,
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Advanced Compilers CMPSCI 710 Spring 2004 Lecture 1 Emery Berger University of Massachusetts,
CS527 Topics in Software Engineering (Software Testing and Analysis) Darko Marinov September 9, 2010.
Replay Compilation: Improving Debuggability of a Just-in Time Complier Presenter: Jun Tao.
1 Program Slicing Amir Saeidi PhD Student UTRECHT UNIVERSITY.
Mark Marron IMDEA-Software (Madrid, Spain) 1.
Static Detection of Loop-Invariant Data Structures Harry Xu, Tony Yan, and Nasko Rountev University of California, Irvine Ohio State University 1.
Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University IWPSE 2003 Program.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Efficient Checkpointing of Java Software using Context-Sensitive Capture.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to.
© 2006, National Research Council Canada © 2006, IBM Corporation Solving performance issues in OTS-based systems Erik Putrycz Software Engineering Group.
CoCo: Sound and Adaptive Replacement of Java Collections Guoqing (Harry) Xu Department of Computer Science University of California, Irvine.
Detecting Inefficiently-Used Containers to Avoid Bloat Guoqing Xu and Atanas Rountev Department of Computer Science and Engineering Ohio State University.
A Generalized Architecture for Bookmark and Replay Techniques Thesis Proposal By Napassaporn Likhitsajjakul.
CPSC 871 John D. McGregor Module 8 Session 1 Testing.
PROGRAMMING TESTING B MODULE 2: SOFTWARE SYSTEMS 22 NOVEMBER 2013.
Performance Problems You Can Fix: A Dynamic Analysis of Memoization Opportunities Luca Della Toffola – ETH Zurich Michael Pradel – TU Darmstadt Thomas.
From Use Cases to Implementation 1. Structural and Behavioral Aspects of Collaborations  Two aspects of Collaborations Structural – specifies the static.
T EST T OOLS U NIT VI This unit contains the overview of the test tools. Also prerequisites for applying these tools, tools selection and implementation.
CS223: Software Engineering Lecture 21: Unit Testing Metric.
ECE 750 Topic 8 Meta-programming languages, systems, and applications Automatic Program Specialization for J ava – U. P. Schultz, J. L. Lawall, C. Consel.
From Use Cases to Implementation 1. Mapping Requirements Directly to Design and Code  For many, if not most, of our requirements it is relatively easy.
Beyond Application Profiling to System Aware Analysis Elena Laskavaia, QNX Bill Graham, QNX.
Introduction to Performance Tuning Chia-heng Tu PAS Lab Summer Workshop 2009 June 30,
CPSC 372 John D. McGregor Module 8 Session 1 Testing.
CS 412/413 Spring 2005Introduction to Compilers1 CS412/CS413 Introduction to Compilers Tim Teitelbaum Lecture 30: Loop Optimizations and Pointer Analysis.
Phoenix Based Dynamic Slicing Debugging Tool Eric Cheng Lin Xu Matt Gruskin Ravi Ramaseshan Microsoft Phoenix Intern Team (Summer '06)
Content Coverity Static Analysis Use cases of Coverity Examples
Code Optimization.
Chapter 1 Introduction.
Introduction to Compiler Construction
Architecture Concept Documents
Harvesting Runtime Values in Android Applications That Feature Anti-Analysis Techniques Presented by Vikraman Mohan.
Seminar in automatic tools for analyzing programs with dynamic memory
Chapter 1 Introduction.
Un</br>able’s MySecretSecrets
CACHETOR Detecting Cacheable Data to Remove Bloat
Jipeng Huang, Michael D. Bond Ohio State University
Optimizing Compilers CISC 673 Spring 2009 Course Overview
SPL – PS1 Introduction to C++.
Presentation transcript:

Analyzing Large-Scale Object-Oriented Software to Find and Remove Runtime Bloat Guoqing Xu CSE Department Ohio State University Ph.D. Thesis Defense Aug 3, 2011

Rapid Heap Growth Between 2002 and Mitchell et al. LCSD’05

The Big Pile-up * Framework-intensive SAP Netweaver App server 3 * Easy to have performance problems * Hard to diagnose * Millions lines of code

The Big Pile-up Causes Runtime Bloat Heaps are getting bigger Grown from 500M to 2-3G or more in the past few years But not necessarily supporting more users or functions Surprisingly common (all are from real apps): Supporting thousands of users (millions are expected) Saving 500K session state per user (2K is expected) Requiring 2M for a text index per simple document Creating 100K temporary objects per web hit Consequences for scalability, power usage, and performance 4

Problems and Insights Problem with compiler optimizations – limited scope – lack of developer insight Problem with human optimizations – lack of good tools Our insight: There is a way to identify larger optimization opportunities with a small amount of developer time Our methodology : Advocate compiler-assisted manual tuning – Static and dynamic analysis techniques to find and remove bloat 5

An Overview of Bloat Analysis Results – Dynamic: [ICSE’08, PLDI’09, PLDI’10-a, PLDI’11] – Static: [PLDI’10-b, Onging] – Roadmap: [FoSER’10] Platforms – Java Virtual Machine: IBM J9 [PLDI’09, PLDI’10-a] JikesRVM [PLDI’11] – Java Virtual Machine Tool Interface [ICSE’08] – Soot [PLDI’10-b, Ongoing] 6

Overview of Techniques Dynamic techniques Static techniques observe Execution Evidence Lots of copies [PLDI’09] Large cost/benefit [PLDI’10-a] High leak confidence [ICSE’ 08, PLDI’11] analyze Bloat/leak report Bytecode analyze Bloated containers [PLDI’10-b] Invariant objects [Ongoing] search Bloat warning 7

An Overview of My Work Software bloat analysis– dissertation work – [ICSE’08, PLDI’09, FoSER’10, PLDI’10-a, PLDI’10-b, PLDI’11, Ongoing] Scalable points-to and data-flow analysis – [CC’08, ISSTA’08, ECOOP’09, ISSTA’11] Control- and data-flow analysis for AspectJ – [ICSE’07, AOSD’08, ASE’09, TSE’11] Language-level checkpointing and replay – [FSE’07] 8

Research Areas Dynamic analysisStatic analysis Runtime SystemsCompilers PLDI’09, PLDI’10-a, PLDI’11 ISSTA’08, CC’08, ECOOP’09, PLDI’10-b, ISSTA’11, Ongoing Languages/ Software Engineering 9 ICSE’07, FSE’07, ICSE’08, AOSD’08, ASE’09, FoSER’10, TSE’11

Outline Introduction – Introduction to bloat – Overview of my thesis I: Finding low-utility data structures [PLDI’10-a] – Dynamic analysis to help performance tuning II: Finding loop-invariant data structures [Ongoing] – Static analysis that targets a specific bloat pattern found by I Future work and conclusions 10

Finding Low-Utility Data Structures Identify high-cost-low-benefit data structures that are likely to be performance bottlenecks To achieve this, we need to – Compute cost and benefit measurements for objects – Present to users a list of data structures ranked by their cost/benefit rates 11

A Systematic Way of Detecting Bloat Bloat can manifest itself through many symptoms temporary objects [Dufour FSE’08, Shankar OOPSLA’08] excessive data copies [Xu PLDI’09] highly-stale objects [Bond ASPLOS’06, Novak PLDI’09] problematic container behaviors [Xu ICSE’08, Shaham PLDI’09, Xu PLDI’10] … 12 Is there a common way to characterize bloat? At the heart of all inefficiencies lie computations, with great expense, produce data values that have little impact on the forward progress

What is Common About Bloat e.g. “java/io” boolean isPackage(String s){ List packs = directoryList(s); return packs != null; } List directoryList(String pkgName){ //find and return all files in directory pkgName // return null if nothing is found } --- from Eclipse 13 High cost-benefit rate 8% running time reduction achieved by specializing the method

Cost and Benefit Absolute cost of a heap value: total number of instructions executed to produce it – a = b or a = b + c, each has unit cost Benefit of heap value v: v (output) : benefit is a very large number v (not used) : benefit is 0 v v’ : benefit is M (amount of work) 14 M

Cost Computation b = 10; a = b + 3; c = b * 5; d = a / c; An example of a backward dynamic flow problem Requires dynamic slicing: record all mem. accesses 15 The cost of d is 4

Abstract Dynamic Slicing Trace generated by dynamic slicing is unbounded – The amount of memory needed depends on the dynamic behavior of a program Trace contains many more details than a client would need – Is it possible to capture only part of the execution relevant to the client For many problems, equivalence classes exist a = b.m 1 a = b.m 3 a = b.m 67 a = b.m 23 a = b.m 1235 … a = b.m 45 a = b.m 217 a = b.m 29 a = b.m 35 a = b.m 9 … 16 E1 E2

Cost Computation Use calling contexts to define equivalence classes “E 102, E 47, E 0 …” are object-sensitivity-based calling contexts (i.e., chains of receiver objects) 17 Absolute costAbstract cost a = b + c 20 a = b + c E0 b=e/f 34 c=h/g 2135 … b=e/f E102 c=h/g E47 … #nodesΣ(n.size) concrete dependence graph abstract dependence graph

Relative Abstract Cost – Cumulative cost measures the effort made from the beginning of the execution to produce a value – It is certain that the later a value is produced, the higher cost it has 18 Cumulative cost Program input f … … … … … g … h j … Relative cost O3O3 O2O2 O1O1

Relative Abstract Benefit 19 – Relative benefit: the effort made to transform a value read from a heap loc l to the value written into another heap loc l 19 Relative benefit Program input f … … … … g … h j … k … l O3O3 O2O2 O1O1 O4O4 O5O5

Key Problems and Ideas Problem 1: Dynamic slicing is too expensive and unnecessary – Insight 1: Abstract slicing Problem 2: How to abstract runtime instances for computing costs for object-oriented data structures – Insight 2: (Object-sensitivity-based) calling contexts Problem 3: Cumulative cost is not correlated with performance problems – Insight 3: Relative cost Solution: Compute relative abstract cost 20

Performance Improvements Implemented in IBM J9 Java Virtual Machine Case studies on real-world large applications 21

Impact The first dynamic analysis targeting general bloat Identified patterns leading to new techniques – Container inefficiencies [PLDI’10-b, context-free-language reachability formulation] – Loop-invariant data structures [Ongoing, type and effect system] – Problematic implementations of certain design patterns – Anonymous classes 22

Outline Introduction – Introduction to bloat – Overview of my thesis I: Finding low-utility data structures [PLDI’10-a] – Dynamic analysis to help performance tuning II: Finding loop-invariant data structures [Ongoing] – Creating objects with the same content many times in loops really hurts performance – Pulling them out of loops (i.e., hoisting) would potentially save a lot of computation 23

Example for(int i = 0; i < N; i++){ SimpleDateFormat sdf = new SimpleDateFormat(); try{ Date d = sdf.parse(date[i]); … } catch(...) {...} } From applications studied in IBM T. J. Watson Research Center Hoistable logical data structure Computing hoistability measurements O sdf O1O1 O2O2 … … … …… … … …

Hoistable Data Structures 25 Understand how a data structure is built up – points-to relationships Understand where the data comes from – dependence relationships Understand in which iterations objects are created – Compute iteration count abstraction (ICA) for each allocation site o : 0, 1, ┴ – 0: created outside the loop (i.e., exists before the loop starts) – 1: inside the loop and does not escape the current iter – ┴ : inside the loop and escapes to later iterations

ICA Annotation Annotate points-to and dependence relationships with ICAs 26 A a = new A(); //O 1,0 String s = null; for (int i = 0; i < 100, i++){ Triple t = new Triple(); // O 2,1 Integer p = new Integer(i); // O 3,1 if(i < 10) s = new String(“s”); // O 4, ┴ t.f = a; // connect O 2 and O 1 t.g = p; // connect O 2 and O 3 t.h = s; // connect O 2 and O 4 } O2O2 O1O1 f, (1, 0) O3O3 g, (1, 1) O4O4 h, (1, ┴ ) O 3. value i (1, 0) (0, 0) Annotated points-to relationships Annotated dependence relationships

Detecting Hoistable Data Structures Disjoint – Check: s does not contain objects with ┴ ICAs 27 O2O2 O1O1 f (1, 0) O3O3 g (1, 1) O4O4 h (1, ┴ ) Annotated points-to relationships

Detecting Hoistable Data Structures (Cond) Loop-invariant fields – Check: No dependence chain starting from a heap location in s is annotated with ┴ – Check: Nodes whose ICA are 0 are not involved in any cycle 28 O 3. value i (1, 0) (0, 0) Annotated dependence relationships

Hoistability Measurements Instead of doing transformation, we compute hoistability measurements Structure-based hoistability (SH) – How many objects in the data structure are disjoint Data-based hoistability (DH) – How many fields in the data structure are loop- invariant 29

Evaluation The technique was implemented using Soot 2.3 and evaluated on a set of 19 large Java programs A total 981 loop data structures considered – 155 are disjoint data structures(15.8%) 30

Case Studies We studied five large applications by inspecting the top 10 data structures ranked based on hoistability measurements Large performance improvements achieved by manual hoisting – ps: 82.1% – xalan: 10% (confirmed by the DaCapo team) – bloat: 11.1% – soot-c: 2.5% – sablecc-j: 6.7% No problems that we found have been reported before 31

Outline Introduction I: Finding low-utility data structures [PLDI’10-a] II: Finding and hoisting loop-invariant data structures[Ongoing] Future Work and Conclusions 32

Future Work--Bloat Analysis Look a bit deeper into bloat causes, we will see Object-orientation encourages excess Leverage techniques from other fields such as systems and architecture 33

Conclusions Static and dynamic analysis – Principles and applications My dissertation – Software bloat analysis – Dynamic analyses that can make more sense of heaps and executions – Static analyses that can remove and prevent bloat Hopes – Tool support: Tuning is no longer a daunting task – Education: Performance is important 34

Thank You Q & A 35