Dynamic Purity Analysis for Java Programs Haiying Xu, Christopher J.F. Pickett, Clark Verbrugge School of Computer Science, McGill University PASTE ’07.


Dynamic Purity Analysis for Java Programs
Haiying Xu, Christopher J.F. Pickett, Clark Verbrugge
School of Computer Science, McGill University
PASTE ’07 Conference, San Diego, CA
Presented by Derek White, CSE 6329

Outline
- Introduction
- Approach and Contributions
- Design: Static Purity Analysis
- Kinds of Dynamic Purity
- Design: Dynamic Purity Analysis
- Memoization
- Experimental Evaluation
- Conclusions

Introduction
- Functional programming emphasizes application of functions and avoids mutable data (side effects)
- Popular functional languages include Scheme, Haskell, F#, OCaml, Scala, etc.
- But you can program in a functional style using other languages
- "Pure" methods are methods that have functional (side effect free) behavior
  - Several definitions for purity: either no externally visible side effects, or the extent of side effects is limited
  - Constraints may also be placed on the level of dependency on previously available state

Introduction (2)
- Why do we care if a method is pure?
- Helpful in program understanding: allows us to isolate side effect free parts
- Verification in model checking
- Can be used to guide compiler optimization
  - Better method purity info allows for less conservative assumptions
  - Caching (memoization) of function calls

Introduction (3)
- Static analyses have classified large numbers of methods as pure, but the precise definition of purity varies between them
- Static analysis is conservative with respect to runtime behavior
- It is unclear whether some classes of pure methods have any practical value
- So, the authors present a detailed examination of method purity for Java
  - Considering several definitions of purity
  - Investigating both static and dynamic properties

Approach and Contributions
- Extends previous work on static analysis, showing that different forms of purity occur at different frequencies in a dynamic environment
- Design and implementation of a dynamic purity analysis, both online and offline
  - Scalable: handles SPECjvm98 at size 100 "with acceptable overhead"
- Support for multiple purity definitions in order to compare against static purity analysis; also identifies forms of purity that are only observable dynamically

Approach and Contributions (2)
- Three metrics for evaluating the extent of dynamic purity
  - Method, invocation, bytecode
  - These are applied to a static analysis as well as to the dynamic purity definitions
- Implementation of memoization in the JVM, a traditional consumer of purity information
  - Does not achieve any speedup; it is just a functional test module

Design: Static Analysis
- Previous work has found that a large number of methods have weak purity properties; stronger purity properties result in fewer pure methods
- The static work done here considers strong purity
  - A method is "strongly pure" iff it does not depend on OR change initial state beyond primitive input values
  - It must always return the same result for the same input
- Specifically, the method may not:
  - Read/write heap or static data
  - Synchronize
  - Allocate objects
  - Invoke native methods
  - Throw exceptions
  - Invoke any non-pure methods
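To make the strong purity rules above concrete, here is a small Java sketch. The class and method names are hypothetical and do not come from the paper; the comments only map each method onto the restrictions listed on the slide.

```java
// Hypothetical examples; none of these names come from the paper.
final class StrongPurityExamples {
    static int counter = 0; // static state, used only to show an impure access

    // Strongly pure under the slide's definition: depends only on primitive
    // arguments, touches no heap or static data, allocates nothing,
    // and calls no other method.
    static int gcd(int a, int b) {
        while (b != 0) {
            int t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    // Not strongly pure: reads and writes static data (GETSTATIC, PUTSTATIC).
    static int nextId() {
        counter = counter + 1;
        return counter;
    }

    // Not strongly pure: reads the heap through an array load (IALOAD).
    static int first(int[] data) {
        return data[0];
    }
}
```

Calling `gcd(12, 18)` twice gives the same answer with no observable effect, whereas two calls to `nextId()` return different values.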

Design: Static Analysis (2)
- Java class files used as input
- Flow-insensitive analysis done using Soot
[Figure 1. Static analysis framework. Components shown: within Soot, class files are converted to Jimple, then passed through static analysis and attribute generation; the resulting class files + attributes are read by SableVM's attribute parser, which feeds the dynamic metrics and output.]

Design: Static Analysis (3)
- Instructions within a method are scanned; any instruction found to be impure marks the method as impure
- Interprocedural analysis is done next, propagating impurity up from the leaves of a CHA-based call graph
- Assumption is made that exceptions do not propagate up the call stack unchecked

Impurity            Instructions
Native code exec    native INVOKE*
Heap access         NEW, NEWARRAY, ANEWARRAY, MULTIANEWARRAY, GETFIELD, PUTFIELD, *ALOAD, *ASTORE
Static access       GETSTATIC, PUTSTATIC
Synchronization     synchronized INVOKE*, synchronized *RETURN, MONITORENTER, MONITOREXIT
Exceptions          ATHROW
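The interprocedural step described above can be sketched as a worklist over reversed call graph edges. This is a minimal illustration, not Soot's API: the class `ImpurityPropagation`, the string method names, and the map-based call graph are all assumptions, and the set of locally impure methods is taken as given (in the real analysis it would come from the instruction scan).

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: propagate impurity from callees to callers.
final class ImpurityPropagation {
    // callGraph maps each method to the methods it may call.
    // locallyImpure holds methods whose own instructions are impure.
    static Set<String> impureMethods(Map<String, Set<String>> callGraph,
                                     Set<String> locallyImpure) {
        // Build reverse edges: callee -> callers.
        Map<String, Set<String>> callers = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : callGraph.entrySet()) {
            for (String callee : e.getValue()) {
                callers.computeIfAbsent(callee, k -> new HashSet<>()).add(e.getKey());
            }
        }
        // Any method that can reach an impure method is itself impure.
        Set<String> impure = new HashSet<>(locallyImpure);
        Deque<String> worklist = new ArrayDeque<>(locallyImpure);
        while (!worklist.isEmpty()) {
            String m = worklist.pop();
            for (String caller : callers.getOrDefault(m, Collections.emptySet())) {
                if (impure.add(caller)) {
                    worklist.push(caller);
                }
            }
        }
        return impure;
    }
}
```

Methods left out of the returned set are the ones a conservative static analysis of this shape would report as pure.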

Design: Static Analysis (4)
- Easily extended for dynamic evaluation of the strong static purity analysis
- Soot writes purity information to class file attributes
- SableVM reads the attributes and records:
  - Pure methods reached at runtime
  - Frequency of pure method invocations
  - Percentage of pure bytecode executed by pure methods
- Provides indications about how static results correlate with dynamic runtime behavior

Design: Dynamic Analysis
- Under the static analysis, a method is either pure for all possible executions or impure otherwise, which may be too conservative
- Methods that were flagged impure by the static analysis may execute only pure control flow at runtime
- The goal of the dynamic analysis is to identify pure methods based on runtime behavior, increasing the number of pure methods found

Design: Dynamic Analysis (2)
[Figure 2. Dynamic purity analysis framework]

Design: Dynamic Analysis (3)
- Class files are read into SableVM and the instruction stream is examined for purity
- The purity analysis module uses an online escape analysis that tracks writes to locally allocated objects
- Purity information can be used immediately by the VM, or written to a file as an offline analysis for a later execution
- Offline analysis removes the execution overhead
- Clients of the analysis are memoization and the metrics used in the static analysis
- Four kinds of purity: strong, moderate, weak, once-impure

Kinds of Dynamic Purity: Strong
- Same criteria as strong static purity
- Only executed instructions are considered
- All methods start with unknown status
- Impure method information propagates up the call stack
- As with static, once a method is identified as impure it is conservatively always considered impure
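The status rules above can be sketched as a tiny event-driven tracker. This is a simplified model and not SableVM code: the class `DynamicStrongPurity`, its event methods, and the idea of reporting impure instructions through a single callback are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of strong dynamic purity tracking.
final class DynamicStrongPurity {
    enum Status { UNKNOWN, PURE, IMPURE }

    private final Map<String, Status> status = new HashMap<>();
    private final Deque<String> stack = new ArrayDeque<>();

    // Called when a method is invoked; unseen methods start as UNKNOWN.
    void enter(String method) {
        status.putIfAbsent(method, Status.UNKNOWN);
        stack.push(method);
    }

    // Called when an impure instruction executes: impurity propagates
    // to every method currently on the call stack.
    void impureInstruction() {
        for (String m : stack) {
            status.put(m, Status.IMPURE);
        }
    }

    // Called on return: a method that finished without impurity becomes
    // PURE, but an IMPURE method never reverts.
    void exit() {
        String m = stack.pop();
        if (status.get(m) == Status.UNKNOWN) {
            status.put(m, Status.PURE);
        }
    }

    Status statusOf(String method) {
        return status.getOrDefault(method, Status.UNKNOWN);
    }
}
```

Only executed instructions ever reach `impureInstruction()`, which is what lets a method with an impure but unexecuted branch stay pure in this model.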

Kinds of Dynamic Purity: Moderate
- Objects can be created and altered as long as they do not escape the method execution context
- A method may call an impure method as long as the impurity is contained
- Behavior must not depend on heap or global state; it is determined completely by primitive input arguments
- Methods still cannot:
  - Invoke native methods
  - Read/write existing heap or static objects
  - Perform monitor operations
  - Throw exceptions
  - Call moderately impure methods, unless the modified data belongs to and is contained in the caller
- Native System.arraycopy() and Object.clone() are treated as heap access and allocation instructions
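As an illustration of the moderate definition, the hypothetical method below allocates and mutates an array, which rules out strong purity, but the array never escapes and the result depends only on the primitive argument. The example is not taken from the paper.

```java
// Hypothetical example of a method that would fit the moderate definition.
final class ModeratePurityExample {
    // Counts primes up to n with a sieve. The boolean array is allocated
    // (NEWARRAY) and written (BASTORE), but it stays local to this call.
    static int countPrimes(int n) {
        if (n < 2) {
            return 0;
        }
        boolean[] composite = new boolean[n + 1];
        int count = 0;
        for (int i = 2; i <= n; i++) {
            if (!composite[i]) {
                count++;
                // long arithmetic avoids overflow of i * i for large i
                for (long j = (long) i * i; j <= n; j += i) {
                    composite[(int) j] = true;
                }
            }
        }
        return count;
    }
}
```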

Kinds of Dynamic Purity: Moderate (2)
- Analysis needs to take a closer look at *NEW*, GETFIELD, PUTFIELD, *ALOAD, *ASTORE
- *NEW* instructions are used to determine object locality
  - Objects of a method are local if they do not escape the method, or if they escape from a callee
  - Frames in the call stack have an object table storing all currently local objects
- PUTFIELD can allow objects local to the callee to escape to the caller (requires an update to the object table)
- GETFIELD, PUTFIELD, *ALOAD, *ASTORE can be classified depending on a frame's object table
- Moderately pure methods can only use object parameters for reference comparisons
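The per-frame object table can be sketched roughly as follows. This is a loose model under several assumptions: the class `ObjectTableSketch` and its methods are invented for illustration, only stores are modeled (reads would be classified the same way), and objects reachable through a moved object are not tracked.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical model of frames with object tables.
final class ObjectTableSketch {
    static final class Frame {
        final String method;
        // Identity-based set: the table tracks object references, not equality.
        final Set<Object> local =
            Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        boolean impure = false;
        Frame(String method) { this.method = method; }
    }

    private final List<Frame> stack = new ArrayList<>(); // last element is the top frame

    Frame call(String method) {
        Frame frame = new Frame(method);
        stack.add(frame);
        return frame;
    }

    // Models *NEW*: the new object is local to the current frame.
    void allocate(Object obj) {
        stack.get(stack.size() - 1).local.add(obj);
    }

    // Models PUTFIELD / *ASTORE of value into target from the top frame.
    void store(Object target, Object value) {
        int owner = ownerIndex(target);
        // Every frame above the owner wrote to an object it does not own.
        for (int i = stack.size() - 1; i > owner; i--) {
            Frame f = stack.get(i);
            f.impure = true;
            // A callee-local value stored into the owner's object escapes to the owner.
            if (owner >= 0 && value != null && f.local.remove(value)) {
                stack.get(owner).local.add(value);
            }
        }
    }

    // Models *RETURN: a callee-local result becomes local to the caller.
    void ret(Object result) {
        Frame callee = stack.remove(stack.size() - 1);
        if (result != null && !stack.isEmpty() && callee.local.contains(result)) {
            stack.get(stack.size() - 1).local.add(result);
        }
    }

    // Index of the innermost frame owning target, or -1 if no frame owns it.
    private int ownerIndex(Object target) {
        for (int i = stack.size() - 1; i >= 0; i--) {
            if (stack.get(i).local.contains(target)) {
                return i;
            }
        }
        return -1;
    }
}
```

In this model a callee that writes into its caller's local object is marked impure while the caller is not, which mirrors the "contained in the caller" exception on the previous slide.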

Kinds of Dynamic Purity: Weak
- Allows heap reads so a method can inspect object parameters
- Maintains the property that the method is a function of its input
- GETFIELD is always safe
- PUTFIELD is still considered in the context of the escape analysis
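A hypothetical method that would fit the weak definition but not the moderate one: it reads fields of its object parameters (GETFIELD) and writes nothing. The `Point` class and `manhattan` method are invented for this sketch, and non-null arguments are assumed.

```java
// Hypothetical example; Point and manhattan are not from the paper.
final class WeakPurityExample {
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Reads parameter fields only; no writes, allocation, or calls.
    static int manhattan(Point a, Point b) {
        int dx = a.x - b.x;
        int dy = a.y - b.y;
        return (dx < 0 ? -dx : dx) + (dy < 0 ? -dy : dy);
    }
}
```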

Kinds of Dynamic Purity: Once-Impure
- Observed that some impure methods became weakly pure after a first invocation
- A once-impure method is a weakly pure method that was impure during its first execution
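One pattern that would plausibly produce this behavior is lazy caching in a field, sketched below. The slides do not say which methods the authors actually observed, so this is only an assumed illustration; the names `Name` and `hashOf` are hypothetical, and the sketch considers repeated calls on the same object.

```java
// Hypothetical example of a method that is impure only on its first call
// for a given object, because it caches its result in a field.
final class OnceImpureExample {
    static final class Name {
        final char[] chars;
        int cachedHash;
        boolean hashComputed;
        Name(String s) {
            this.chars = s.toCharArray();
        }
    }

    static int hashOf(Name n) {
        if (!n.hashComputed) {
            int h = 0;
            for (int i = 0; i < n.chars.length; i++) {
                h = 31 * h + n.chars[i];
            }
            n.cachedHash = h;       // PUTFIELD on a non-local object: impure
            n.hashComputed = true;  // second PUTFIELD, first call only
        }
        return n.cachedHash;        // later calls only read fields
    }
}
```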

Memoization: Optimization with Purity
- All forms of purity mentioned previously ensure that there is a unique result for any given input
- All are candidates for memoization
- Memoization caches the mapping from arguments to return value, allowing the VM to bypass repeated execution of a method with the same arguments
- The benefit from jumping past execution must outweigh the cost of looking up the return value in the cache

Memoization (2)
- Method must be long enough to be worth optimizing
- After the first invocation, arguments are hashed together, looked up in a hash table, and the stored return value is substituted for the invocation
- Primitive args are stored directly; reference args are flattened (gathering type and primitive fields)
  - Done so that garbage collection doesn't invalidate memo tables
- Direct object reference comparisons cannot be safely memoized, so ACMP_* bytecodes must be considered impure
- Upper bounds on memory consumption limit the number of method invocations that can be cached
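The lookup scheme above can be sketched at the Java level as a bounded hash table keyed on the method name plus flattened arguments. This is an assumption-laden illustration, not the VM-internal implementation: `MemoTable`, `key`, and `invoke` are hypothetical names, only int results are handled, and the "after the first invocation" condition is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntSupplier;

// Hypothetical memo table with an upper bound on cached entries.
final class MemoTable {
    private final int maxEntries;
    private final Map<List<Object>, Integer> cache = new HashMap<>();
    int hits = 0;
    int misses = 0;

    MemoTable(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    // Builds a key from the method name and its flattened arguments.
    // Primitive args are passed directly (boxed); a reference arg is
    // expected to be flattened by the caller into its type name followed
    // by its primitive fields, so no object reference is held by the table.
    static List<Object> key(String method, Object... flattenedArgs) {
        List<Object> parts = new ArrayList<>();
        parts.add(method);
        parts.addAll(Arrays.asList(flattenedArgs));
        return Collections.unmodifiableList(parts);
    }

    // Returns the cached result if present; otherwise runs the body and
    // caches the result while the table is below its size bound.
    int invoke(List<Object> key, IntSupplier body) {
        Integer cached = cache.get(key);
        if (cached != null) {
            hits++;
            return cached;
        }
        misses++;
        int result = body.getAsInt();
        if (cache.size() < maxEntries) {
            cache.put(key, result);
        }
        return result;
    }
}
```

A call with a flattened reference argument might look like `table.invoke(MemoTable.key("norm", "Point", 3, 4), body)`, where `"Point", 3, 4` stands in for an object of type Point with fields 3 and 4.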

Experimental Evaluation
- Experiments conducted using programs from the SPEC JVM98 benchmark
- Metrics
  - Static method purity: percentage of all methods in the call graph that are pure
  - Dynamic method purity: percentage of methods reached at runtime that are pure
  - Dynamic invocation purity: percentage of method invocations that are pure
  - Dynamic bytecode purity: percentage of the executed bytecode stream belonging to pure methods
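The three dynamic metrics above can be computed from per-method profile records, as in the sketch below. The record layout (`MethodProfile` with a purity flag, an invocation count, and an executed bytecode count) is an assumption for illustration and not the paper's data format.

```java
import java.util.List;

// Hypothetical computation of the dynamic purity metrics.
final class PurityMetrics {
    static final class MethodProfile {
        final boolean pure;      // final purity status of the method
        final long invocations;  // number of times the method was invoked
        final long bytecodes;    // bytecodes executed inside the method
        MethodProfile(boolean pure, long invocations, long bytecodes) {
            this.pure = pure;
            this.invocations = invocations;
            this.bytecodes = bytecodes;
        }
    }

    // Percentage of reached methods that are pure.
    static double methodPurity(List<MethodProfile> reached) {
        long pure = 0;
        for (MethodProfile m : reached) {
            if (m.pure) pure++;
        }
        return percent(pure, reached.size());
    }

    // Percentage of method invocations that target pure methods.
    static double invocationPurity(List<MethodProfile> reached) {
        long pure = 0;
        long total = 0;
        for (MethodProfile m : reached) {
            total += m.invocations;
            if (m.pure) pure += m.invocations;
        }
        return percent(pure, total);
    }

    // Percentage of the executed bytecode stream belonging to pure methods.
    static double bytecodePurity(List<MethodProfile> reached) {
        long pure = 0;
        long total = 0;
        for (MethodProfile m : reached) {
            total += m.bytecodes;
            if (m.pure) pure += m.bytecodes;
        }
        return percent(pure, total);
    }

    private static double percent(long part, long whole) {
        return whole == 0 ? 0.0 : 100.0 * part / whole;
    }
}
```

The three numbers can diverge sharply: a small pure method that is called very often scores high on invocation purity while contributing little to bytecode purity.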

Experimental Evaluation: Static
- Experimental analysis includes both the application and the class library code used
- On average, 13% of methods are found to be strongly pure
- Not all methods are invoked at runtime; dynamically, 5-6% of reached methods are statically identified as pure
- Many of these methods are small (20 instructions or fewer) or are executed infrequently
Table 2. Strong Static Purity: Static methods row shows percentage of all methods in the call graph identified as statically pure. Dynamic methods row shows percentage of all dynamic method invocations that execute a statically pure method. Bytecode row shows the percentage of the bytecode stream that is executed by a statically pure method.

Experimental Evaluation: Dynamic
- Strong dynamic purity is weaker than the static equivalent
- The first rows of Tables 3, 4, and 5 show an improvement over the runtime use of strong static purity in rows 2-4 of Table 2
- Table 3 shows up to 4% more pure methods reached with strong dynamic purity
- Some methods are invoked with significant frequency: Table 4 shows 13% more pure invocations for db

Experimental Evaluation: Dynamic (2)
Table 3. Dynamic method purity: All reached methods
Table 4. Dynamic invocation purity: Invoked methods that are pure for dynamic purity definitions
Table 5. Dynamic bytecode purity: Bytecode instruction streams that are pure for dynamic purity definitions

Experimental Evaluation: Dynamic (3)
- Reasons for impurity
Table 8. Reasons for dynamic impurity

Experimental Evaluation: Memoization
- Once-impure dynamic purity analysis is used, so a method is always invoked once prior to memoization
- Only applied to methods meeting the cost-effectiveness criteria
Table 11. Memoized/memoizable methods: Minimum method size setting shown in far left column

Experimental Evaluation: Execution
[Figure 3. Execution times: Minimum method size for memoization is set to 50]

Conclusions
- Dynamic purity analyses identify considerable amounts of purity
- Actual program behavior is not predictable based only on static observations
- Little variation in purity over the benchmark suite
- It may be the case that memoization is of limited use for non-functional languages

Questions