Static Detection of Loop-Invariant Data Structures Harry Xu, Tony Yan, and Nasko Rountev University of California, Irvine Ohio State University 1.

Slides:



Advertisements
Similar presentations
Uncovering Performance Problems in Java Applications with Reference Propagation Profiling PRESTO: Program Analyses and Software Tools Research Group, Ohio.
Advertisements

Object-Orientation Meets Big Data Language Techniques towards Highly- Efficient Data-Intensive Computing Harry Xu UC Irvine.
OOPSLA 2005 Workshop on Library-Centric Software Design The Diary of a Datum: An Approach to Modeling Runtime Complexity in Framework-Based Applications.
Object-oriented Software Change Dynamic Impact Analysis Lulu Huang and Yeong-Tae Song Dept. of Computer and Information Sciences Towson University Towson,
Go with the Flow: Profiling Copies to Find Run-time Bloat Guoqing Xu, Matthew Arnold, Nick Mitchell, Atanas Rountev, Gary Sevitsky Ohio State University.
Chair of Software Engineering From Program slicing to Abstract Interpretation Dr. Manuel Oriol.
Resurrector: A Tunable Object Lifetime Profiling Technique Guoqing Xu University of California, Irvine OOPSLA’13 Conference Talk 1.
Praveen Yedlapalli Emre Kultursay Mahmut Kandemir The Pennsylvania State University.
Lightweight Abstraction for Mathematical Computation in Java 1 Pavel Bourdykine and Stephen M. Watt Department of Computer Science Western University London.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Precise Memory Leak Detection for Java Software Using Container Profiling.
Guoquing Xu, Atanas Rountev Ohio State University Oct 9 th, 2008 Presented by Eun Jung Park.
CSC321: Programming Languages 11-1 Programming Languages Tucker and Noonan Chapter 11: Memory Management 11.1 The Heap 11.2 Implementation of Dynamic Arrays.
The Type System1. 2.NET Type System The type system is the part of the CLR that defines all the types that programmers can use, and allows developers.
Parameterized Object Sensitivity for Points-to Analysis for Java Presented By: - Anand Bahety Dan Bucatanschi.
Recap from last time We were trying to do Common Subexpression Elimination Compute expressions that are available at each program point.
Pointer and Shape Analysis Seminar Context-sensitive points-to analysis: is it worth it? Article by Ondřej Lhoták & Laurie Hendren from McGill University.
Establishing Local Temporal Heap Safety Properties with Applications to Compile-Time Memory Management Ran Shaham Eran Yahav Elliot Kolodner Mooly Sagiv.
Memory Systems Performance Workshop 2004© David Ryan Koes MSP 2004 Programmer Specified Pointer Independence David Koes Mihai Budiu Girish Venkataramani.
Finding Low-Utility Data Structures Guoqing Xu 1, Nick Mitchell 2, Matthew Arnold 2, Atanas Rountev 1, Edith Schonberg 2, Gary Sevitsky 2 1 Ohio State.
Lecture #4 Agenda Cell phones off & name signs out Review Questions? Objects The birds-and-the-bees talk.
Scaling CFL-Reachability-Based Points- To Analysis Using Context-Sensitive Must-Not-Alias Analysis Guoqing Xu, Atanas Rountev, Manu Sridharan Ohio State.
LeakChaser: Helping Programmers Narrow Down Causes of Memory Leaks Guoqing Xu, Michael D. Bond, Feng Qin, Atanas Rountev Ohio State University.
Common Sub-expression Elim Want to compute when an expression is available in a var Domain:
Detecting Inefficiently-Used Containers to Avoid Bloat Guoqing Xu and Atanas Rountev Department of Computer Science and Engineering Ohio State University.
CACHETOR Detecting Cacheable Data to Remove Bloat Khanh Nguyen Guoqing Xu UC Irvine USA.
Expediting Programmer AWAREness of Anomalous Code Sarah E. Smith Laurie Williams Jun Xu November 11, 2005.
Deep Typechecking and Refactoring Zachary Tatlock, Chris Tucker, David Shuffleton, Ranjit Jhala, Sorin Lerner 1 University of California, San Diego.
Impact Analysis of Database Schema Changes Andy Maule, Wolfgang Emmerich and David S. Rosenblum London Software Systems Dept. of Computer Science, University.
Introduction To System Analysis and design
Language Evaluation Criteria
Chapter 1 Algorithm Analysis
Database Systems Group Department for Mathematics and Computer Science Lars Hamann, Martin Gogolla, Mirco Kuhlmann OCL-based Runtime Monitoring of JVM.
P ARALLEL P ROCESSING I NSTITUTE · F UDAN U NIVERSITY 1.
Static Control-Flow Analysis for Reverse Engineering of UML Sequence Diagrams Atanas (Nasko) Rountev Ohio State University with Olga Volgin and Miriam.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University STATIC ANALYSES FOR JAVA IN THE PRESENCE OF DISTRIBUTED COMPONENTS AND.
Putting Pointer Analysis to Work Rakesh Ghiya and Laurie J. Hendren Presented by Shey Liggett & Jason Bartkowiak.
Understanding Parallelism-Inhibiting Dependences in Sequential Java Programs Atanas (Nasko) Rountev Kevin Van Valkenburgh Dacong Yan P. Sadayappan Ohio.
Semantics-Aware Performance Optimization Harry Xu CS Departmental Seminar 01/13/2012.
Analyzing Large-Scale Object-Oriented Software to Find and Remove Runtime Bloat Guoqing Xu CSE Department Ohio State University Ph.D. Thesis Defense Aug.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to.
Rethinking Soot for Summary-Based Whole- Program Analysis PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Dacong Yan.
Mark Marron 1, Deepak Kapur 2, Manuel Hermenegildo 1 1 Imdea-Software (Spain) 2 University of New Mexico 1.
Mark Marron IMDEA-Software (Madrid, Spain) 1.
We will talking about story of JAVA language. By Kristsada Songpartom.
CBSE'051 Component-Level Dataflow Analysis Atanas (Nasko) Rountev Ohio State University.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Efficient Checkpointing of Java Software using Context-Sensitive Capture.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to.
Aspect Mining Jin Huang Huazhong University of Science & Technology, China
CoCo: Sound and Adaptive Replacement of Java Collections Guoqing (Harry) Xu Department of Computer Science University of California, Irvine.
Detecting Inefficiently-Used Containers to Avoid Bloat Guoqing Xu and Atanas Rountev Department of Computer Science and Engineering Ohio State University.
Java FilesOops - Mistake Java lingoSyntax
PROGRAMMING TESTING B MODULE 2: SOFTWARE SYSTEMS 22 NOVEMBER 2013.
Performance Problems You Can Fix: A Dynamic Analysis of Memoization Opportunities Luca Della Toffola – ETH Zurich Michael Pradel – TU Darmstadt Thomas.
5/7/03ICSE Fragment Class Analysis for Testing of Polymorphism in Java Software Atanas (Nasko) Rountev Ohio State University Ana Milanova Barbara.
Object Naming Analysis for Reverse- Engineered Sequence Diagrams Atanas (Nasko) Rountev Beth Harkness Connell Ohio State University.
Cs205: engineering software university of virginia fall 2006 Programming Exceptionally David Evans
GC Assertions: Using the Garbage Collector To Check Heap Properties Samuel Z. Guyer Tufts University Edward Aftandilian Tufts University.
ECE 750 Topic 8 Meta-programming languages, systems, and applications Automatic Program Specialization for J ava – U. P. Schultz, J. L. Lawall, C. Consel.
1 Advanced Software Architecture Muhammad Bilal Bashir PhD Scholar (Computer Science) Mohammad Ali Jinnah University.
Wrap up. Structures and views Quality attribute scenarios Achieving quality attributes via tactics Architectural pattern and styles.
CS 412/413 Spring 2005Introduction to Compilers1 CS412/CS413 Introduction to Compilers Tim Teitelbaum Lecture 30: Loop Optimizations and Pointer Analysis.
Compositional Pointer and Escape Analysis for Java programs
Static Analysis of Object References in RMI-based Java Software
Atanas (Nasko) Rountev Ohio State University
CACHETOR Detecting Cacheable Data to Remove Bloat
.NET and .NET Core 5.2 Type Operations Pan Wuming 2016.
Demand-Driven Context-Sensitive Alias Analysis for Java
Interactive Exploration of Reverse-Engineered UML Sequence Diagrams
Course Overview PART I: overview material PART II: inside a compiler
Subtype Substitution Principle
Presentation transcript:

Static Detection of Loop-Invariant Data Structures Harry Xu, Tony Yan, and Nasko Rountev University of California, Irvine Ohio State University 1

Is Today’s Software Fast Enough? Pervasive use of large-scale, enterprise-level applications – Layers of libraries and frameworks Object-orientation encourages excess – Pay attention to high-level program properties – Leave performance to the compiler and run-time system 2

Loop-Invariant Data Structures Part of a series of techniques to find and remove runtime inefficiencies – Motivated by a study using the analysis in [Xu-PLDI’10- a] Developers often create data structures that are independent of loop iterations Significant performance degradation – Increased creation and GC time – Unnecessary operations executed to initialize data structure instances 3

Example String[] date =...; for(int i = 0; i < N; i++){ SimpleDateFormat sdf = new SimpleDateFormat(); try{ Date d = sdf.parse(date[i]); … } catch(...) {...} } --- From a real-world application studied Hoistable logical data structure Computing hoistability measurements O sdf … … … … O1O1 O2O2 … … … …… ……… … … 4

Hoistable Data Structures Understand how a data structure is built up – points-to relationships Understand where the data comes from – dependence relationships Understand in which iterations objects are created – Compute iteration count abstraction (ICA) for each allocation site o : 0, 1, ┴ [Naik-POPL’07] – 0: created outside the loop (i.e., exists before the loop starts) – 1: inside the loop and does not escape the current iterations – ┴ : inside the loop and escapes to later iterations 5

ICA Annotation Annotate points-to and dependence relationships with ICAs A a = new A(); //O 1, 0 String s = null; for (int i = 0; i < 100, i++){ Triple t = new Triple(); // O 2, 1 Integer p = new Integer(i); // O 3, 1 if(i < 10) s = new String(“s”); // O 4, ┴ t.f = a; // connect O 2 and O 1 t.g = p; // connect O 2 and O 3 t.h = s; // connect O 2 and O 4 } O2O2 O1O1 f, (1, 0) O3O3 g, (1, 1) O4O4 h, (1, ┴ ) O 3. value i (1, 0) (0, 0) Annotated points-to relationships Annotated dependence relationships 6

Detecting Hoistable Data Structures Disjoint – Check: a data structure does not contain objects with ┴ ICAs O2O2 O1O1 f (1, 0) O3O3 g (1, 1) O4O4 h (1, ┴ ) Annotated points-to relationships 7

Detecting Hoistable Data Structures (Cond) Loop-invariant fields – Check: No dependence chain starting from a heap location in the data structure is annotated with ┴ – Check: Nodes whose ICA are 0 are not involved in any cycle 8 O 3. value i (1, 0) (0, 0) Annotated dependence relationships

Analysis Implementation CFL-reachability is used to compute points-to and dependence relationships – [Sridharan-PLDI’06, Xu-PLDI’10-b] A dataflow analysis is used to compute ICAs Our analysis is demand-driven – Analyze each loop object individually to discover its data structure 9

Hoistability Measurements Structure-based hoistability (SH) – How many objects in the data structure have disjoint instances Data-based hoistability (DH) – The total number of fields in a data structure: f – The number of loop-invariant fields: n – DH: f n/f Incorporate dynamic frequency info (DDH) – k * f n/f Compute DDH only for data structures whose SH is 1 10

Evaluation The technique was implemented using Soot 2.3 and evaluated on a set of 19 large Java programs – Benchmarks are from SPECjbb98, Ashes, and DaCapo Analysis running time depends on the number of loop objects – Ranges from 63s (xalan) to 21557s (eclipse) A total 981 loop data structures considered – 155 are disjoint data structures(15.8%) 11

Case Studies We studied five large applications by inspecting the top 10 data structures ranked based on hoistability measurements Large performance improvements achieved by manual hoisting – ps: 82.1% – xalan: 10% (confirmed by the DaCapo team) – bloat: 11.1% – soot-c: 2.5% – sablecc-j: 6.7% No problems that we found have been reported before 12

Conclusions A static analysis for detecting a common type of program inefficiencies – Annotate points-to and dependence relationships with iteration count abstractions Computing hoistability measurements – Show a new way to use static analysis – Combined with dynamic information to improve its real-world usefulness Five case studies 13

Thank You Q/A