Improving the Static Resolution of Dynamic Java Features Jason Sawin Ohio State University.

Presentation transcript:

Explaining My Research To Dad

Code is Like a Recipe

Dynamic Recipe?

Dynamic Features of Java
Dynamic class loading
– Ability to install classes at run time
Reflection
– Ability to examine or modify the run-time behavior of a running application
JVM
– Implicitly calls certain code elements
Native methods
– Ability to interface with libraries written in non-Java languages

Common Uses of Dynamic Features
Many large applications support plug-ins
– Eclipse, Columba, Apache Tomcat
On startup they read specification files or look for class files in specific directories
Dynamic class loading loads the found plug-ins
Reflection can be used to access the members of the loaded classes

Dynamic Features In Action
Class c; String className; Method m; Object h;
.....
c = Class.forName(className);
m = c.getMethod("handle", …);
h = c.newInstance();
m.invoke(h, …);
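The slide's fragment can be turned into a small runnable sketch. The class `EchoHandler`, its `handle(String)` signature, and the helper `loadAndInvoke` are hypothetical names chosen for illustration; the slide elides the real parameter types. The sketch uses `getDeclaredConstructor().newInstance()` because `Class.newInstance()` has been deprecated since Java 9.

```java
import java.lang.reflect.Method;

// Hypothetical plug-in class standing in for one discovered at run time.
class EchoHandler {
    public String handle(String request) {
        return "handled:" + request;
    }
}

final class PluginDemo {
    // Loads a class by name, instantiates it, and reflectively calls handle(String).
    static Object loadAndInvoke(String className, String request) {
        try {
            Class<?> c = Class.forName(className);                // dynamic class loading
            Method m = c.getMethod("handle", String.class);       // reflective method lookup
            Object h = c.getDeclaredConstructor().newInstance();  // reflective instantiation
            return m.invoke(h, request);                          // reflective invocation
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not load plug-in " + className, e);
        }
    }

    public static void main(String[] args) {
        // The class name is an ordinary string, so it could equally come from a file or property.
        System.out.println(loadAndInvoke(EchoHandler.class.getName(), "ping"));
    }
}
```

A static analysis that only looks at the call sites above sees no direct reference from `PluginDemo` to `EchoHandler.handle`, which is the difficulty the rest of the talk addresses.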

Static Analysis: Not Just Counting Eggs
Analyzes static representations of programs
– No execution
Key component of tools for: program understanding, program transformation, testing, debugging, performance optimizations …
Dynamic features pose a significant challenge to static analyses

Traditional Approaches
Ignore it
– Do not model the implicit actions of dynamic features
– Results will be unsound for any application that uses dynamic features
Overly conservative
– Assume that every applicable entity can be a potential target of dynamic feature constructs
– All relevant information is obfuscated by infeasible interactions
Query the user

Modern Approaches
String analysis
– Class.forName( )
Utilize casting information
– z = (Foo)clazz.newInstance()
Aid the analysis user through specification points that identify where string values flow from
Livshits APLAS

Problem 1
Problem: string analysis only considers purely static string values
– Many uses of dynamic features rely on string values which are not hard-coded into the application
Insight: hybrid or “semi-static” analysis can increase the information being considered
– Increases precision
– Loss of soundness

Problem 2
Problem: state-of-the-art string analysis is not powerful enough to precisely model all language features
Insight: extensions can be made that improve the precision of string analysis

Problem 3
Problem: precise string analysis is costly in both time and memory
Insight 1: parallelizing string analysis allows it to leverage modern multi-core architectures
Insight 2: extensions can be made that increase the efficiency of string analysis
– Identify and remove irrelevant information

Problem 4
Problem: the CHA call graph construction algorithm must provide treatment for dynamic features
– Essential component of many static analyses
Insight: treatments require consideration of the assumptions
– Incorporate string analysis to aid in precise resolution of dynamic features
– Less precise treatments for unresolved instances

Outline
Background
Extensions to String Analysis
Increasing the Scalability of String Analysis
CHA and Dynamic Features

Dynamic Class Loading In Java Libraries
Extended the state-of-the-art string analysis for Java: Java String Analyzer (JSA)
– Publicly available implementation
– Semi-static extension
– Precision-improving extensions
Investigated dynamic class loading sites in the Java 1.4 standard libraries
– Used by all Java applications

Semi-Static Extension
At analysis time, dynamically gather information about the values of system environment variables
– Typically they inform applications about the environment in which they are executing
– Inject this information at environment variable entry points
– System.getProperty( )
– Use JSA to resolve
Challenges: default values and multiple key values
– System.getProperty(, )
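A minimal sketch of the idea, assuming the analysis takes a snapshot of the system properties when it runs and answers property lookups from that snapshot. The class `SemiStaticProperties` and its `resolve` methods are hypothetical names, not part of JSA; how the actual extension handles defaults and multiple keys is not shown in the slides.

```java
import java.util.Optional;
import java.util.Properties;

final class SemiStaticProperties {
    private final Properties snapshot = new Properties();

    // Copies the property values visible at analysis time.
    SemiStaticProperties(Properties atAnalysisTime) {
        for (String name : atAnalysisTime.stringPropertyNames()) {
            snapshot.setProperty(name, atAnalysisTime.getProperty(name));
        }
    }

    // Models System.getProperty(key): empty when the key is unset, so the site stays unresolved.
    Optional<String> resolve(String key) {
        return Optional.ofNullable(snapshot.getProperty(key));
    }

    // Models System.getProperty(key, def): the default applies only when the key is unset.
    String resolve(String key, String def) {
        return snapshot.getProperty(key, def);
    }

    public static void main(String[] args) {
        SemiStaticProperties env = new SemiStaticProperties(System.getProperties());
        System.out.println(env.resolve("java.version"));
        System.out.println(env.resolve("sun.awt.exception.handler", "<unset>"));
    }
}
```

The resolved string would then be injected into the string analysis as if it were a constant, which is sound only if the property has the same value at run time.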

Example
private static final String handlerPropName = "sun.awt.exception.handler";
private static String handlerClassName = null;
private boolean handleException(Throwable thrown) {
    handlerClassName = ((String) AccessController.doPrivileged(
        new GetPropertyAction(handlerPropName)));
    ....
    Class c = Class.forName(handlerClassName);
    ....
}

Increasing the Precision of JSA
Field extension: JSA provides no treatment for fields
– We provide a context-insensitive treatment for fields
– Currently only considering private fields of type String and specific instances of array fields of type String
Type extension: String is a subclass of Object
– Use a type inference analysis to determine objects of static type Object which are actually objects of type String
– Needed for calls to doPrivileged

Overview of Experiment Evaluation
Part 1: manual investigation
– Establish a “perfect” baseline
– Gain insights into the validity of our approach
Part 2: evaluation of semi-static JSA
– Compare to the perfect baseline
– Compare to the current state-of-the-art

Part 1: Manual Investigation Results
Dynamic class loading sites were categorized into 3 groups
– Static dependent (SD): the target string depended only on static string values
– Environment dependent (EVD): the target string depended on values returned from an environment variable entry point
– Other dependent (OD): the target string flowed from sources such as files or directory structures
[Table: number of sites per category (SD, EVD, OD, TOTAL) from the manual investigation]

Part 2: Experimental Results

Summary of Initial Study
Hybrid and modeling extensions greatly increase JSA’s ability to resolve instances of dynamic class loading
– Very practical relaxation of assumptions
– Java library: 40% of client-independent dynamic class loading sites depend upon the values of environment variables
– Resolved 2.6 times more sites than unenhanced JSA
JSA scaled well to package-sized inputs, but performance degraded for larger inputs

Outline
Background
Extensions to String Analysis
Increasing the Scalability of String Analysis
CHA and Dynamic Features

JSA Design
Front-end: creates a graph that abstractly represents the flow of string values through the input classes
– Nodes: string operations
– Edges: def/use relationships
Back-end: builds a context-free grammar from the flow graph and ultimately generates a finite state automaton for each hotspot
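To make the flow-graph idea concrete, here is a toy sketch in which nodes are string operations and each node reads the values defined by its operand nodes. The node types (`InitNode`, `MergeNode`, `ConcatNode`) are hypothetical and much simpler than JSA's. The sketch also replaces the grammar and automaton back-end with plain enumeration of finite string sets, which only works for acyclic graphs.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical node types for illustration; JSA's own classes differ.
interface StringNode {
    Set<String> values();
}

// A string literal appearing in the program.
final class InitNode implements StringNode {
    private final String literal;
    InitNode(String literal) { this.literal = literal; }
    public Set<String> values() {
        Set<String> out = new LinkedHashSet<>();
        out.add(literal);
        return out;
    }
}

// A variable that may receive its value from either of two definitions.
final class MergeNode implements StringNode {
    private final StringNode first;
    private final StringNode second;
    MergeNode(StringNode first, StringNode second) { this.first = first; this.second = second; }
    public Set<String> values() {
        Set<String> out = new LinkedHashSet<>(first.values());
        out.addAll(second.values());
        return out;
    }
}

// String concatenation: every left value paired with every right value.
final class ConcatNode implements StringNode {
    private final StringNode left;
    private final StringNode right;
    ConcatNode(StringNode left, StringNode right) { this.left = left; this.right = right; }
    public Set<String> values() {
        Set<String> out = new LinkedHashSet<>();
        for (String l : left.values()) {
            for (String r : right.values()) {
                out.add(l + r);
            }
        }
        return out;
    }
}

final class FlowGraphDemo {
    // Models: prefix is "com.a." or "com.b."; the hotspot is Class.forName(prefix + "Handler").
    static StringNode example() {
        StringNode prefix = new MergeNode(new InitNode("com.a."), new InitNode("com.b."));
        return new ConcatNode(prefix, new InitNode("Handler"));
    }

    public static void main(String[] args) {
        System.out.println(example().values()); // prints [com.a.Handler, com.b.Handler]
    }
}
```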

Baseline Testing
Executed JSA on 25 benchmark applications
For 18 benchmarks the front-end required more resources than the back-end
For 3 benchmarks JSA exhausted a 6GB heap and failed to complete

Front-end Design

Parallel Front-end Design

Parallel Front-end Time Results

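Concat Simplification
[Figure: flow graph example using the strings "Pass" and "Jason"]

Remove Superfluous Information
[Figure: flow graph with nodes "Anger", "is", "brief", "madness" and a hotspot]

Propagating “anystring” Value
[Figure: flow graph with nodes "Don’t", "think", "just", "do" and "anystring"]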

Total Time With New Simplifications

Time Results Cont.

Missing 3 Benchmarks
New simplifications allowed JSA to complete an analysis of the previously unanalyzed benchmarks
[Table: columns Benchmark, Front-end (4 threads), Back-end, Total Time (ms); rows JEdit, Jess, JGap]

Summary
For 22 benchmarks, the parallel version of JSA’s front-end achieved an average speedup of 1.54 and reduced the average memory used by 43%
New graph simplifications achieved a speedup of over 180 for 2 benchmarks and allowed JSA to complete an analysis of 3 benchmarks that previously exhausted the heap memory
These improvements make it more practical to incorporate JSA into analysis frameworks

Outline
Background
Extensions to String Analysis
Increasing the Scalability of String Analysis
CHA and Dynamic Features

CHA Call Graph Construction Algorithm
Class Hierarchy Analysis (CHA)
– For every virtual call site e.m(…) where T is the static type of e, it examines all subtypes of T for methods which override m(…)
– The set of overriding methods is considered the possible targets of the call
Used by a wide range of interprocedural static analyses
Problems with dynamic features
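The resolution step can be sketched over a small, fixed universe of classes. This is an illustrative simplification: it uses reflection on loaded classes instead of a program representation, handles only parameterless methods, and the names (`ChaSketch`, `possibleTargets`, the `Shape` hierarchy) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class Shape { public double area() { return 0; } }
class Circle extends Shape { @Override public double area() { return Math.PI; } }
class Square extends Shape { @Override public double area() { return 1; } }
class UnitSquare extends Square { }                      // inherits Square.area
class Label { public double area() { return -1; } }      // same method name, unrelated type

final class ChaSketch {
    // For a call e.name() with static type T, collect every declaration of
    // name() found in T or in a subtype of T within the given universe.
    static Set<String> possibleTargets(Class<?> staticType, String name, List<Class<?>> universe) {
        Set<String> targets = new TreeSet<>();
        for (Class<?> c : universe) {
            if (!staticType.isAssignableFrom(c)) {
                continue;                                // not a subtype of T
            }
            try {
                c.getDeclaredMethod(name);               // throws if c only inherits the method
                targets.add(c.getSimpleName() + "." + name);
            } catch (NoSuchMethodException e) {
                // c inherits the method; the inherited declaration is added via its own class
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        List<Class<?>> universe = Arrays.asList(
                Shape.class, Circle.class, Square.class, UnitSquare.class, Label.class);
        // prints [Circle.area, Shape.area, Square.area]
        System.out.println(possibleTargets(Shape.class, "area", universe));
    }
}
```

The universe is exactly what dynamic class loading undermines: a class loaded by name at run time may not be in the list the analysis examined.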

CHA and Dynamic Features
Every implementation of CHA makes assumptions about dynamic features
– Wide range of possible assumptions, from very conservative to unsound
Different assumptions allow for different resolution techniques
– String analysis
– Cast information
First investigation of different sources of unsoundness

Assumption Hierarchy Overview
Fully Conservative
Encapsulation Safe
Correct Casting
Correct Strings
Semi-static Strings

Conservative Treatment
Fully Conservative
– Every class could be loaded at dynamic class loading sites
– Every concrete class could be instantiated at calls of the form Class.newInstance and Constructor.newInstance
– All methods could be implicit targets of Method.invoke calls
– All methods could be called by native methods

Encapsulation Safe
Assume that dynamic features will respect encapsulation
– Every class could be loaded at dynamic class loading sites
– Calls to Method.invoke and newInstance will respect encapsulation
– All public methods could be called by native methods
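The difference between the two levels can be illustrated for a single `Method.invoke` site. This sketch assumes the reflective caller lives outside the target's package, so "respecting encapsulation" is approximated as "public methods only"; the slides do not spell out the exact rule. `Account` and `InvokeTargets` are hypothetical names.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

class Account {
    public void deposit() { }
    void audit() { }
    private void reset() { }
}

final class InvokeTargets {
    // Fully conservative: any declared method may be the target of Method.invoke.
    static Set<String> fullyConservative(Class<?> c) {
        Set<String> names = new TreeSet<>();
        for (Method m : c.getDeclaredMethods()) {
            if (!m.isSynthetic()) {
                names.add(m.getName());
            }
        }
        return names;
    }

    // Encapsulation safe (approximation): only methods an outside caller could access.
    static Set<String> encapsulationSafe(Class<?> c) {
        Set<String> names = new TreeSet<>();
        for (Method m : c.getDeclaredMethods()) {
            if (!m.isSynthetic() && Modifier.isPublic(m.getModifiers())) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(fullyConservative(Account.class)); // prints [audit, deposit, reset]
        System.out.println(encapsulationSafe(Account.class)); // prints [deposit]
    }
}
```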

Correct Cast Information
New assumption: casts will not generate run-time exceptions
Can use cast information to resolve instances of dynamic class loading, reflective invocation and instantiation
– Requires a post-dominance analysis and a reaching definitions analysis
Unresolved dynamic features are treated in an encapsulation safe manner
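A sketch of the type-filtering step only, assuming the post-dominance and reaching definitions analyses have already established that the result of `newInstance()` is always cast to a given type. Under the no-failing-cast assumption, the instantiated class must be a concrete subtype of the cast type. All names here (`CastResolution`, `RequestHandler` and its implementations) are hypothetical.

```java
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

interface RequestHandler { }
class FileRequestHandler implements RequestHandler { }
class NetRequestHandler implements RequestHandler { }
class Unrelated { }

final class CastResolution {
    // For z = (castType) clazz.newInstance(), keep only the classes that could
    // be instantiated without making the cast fail.
    static List<Class<?>> resolveByCast(Class<?> castType, List<Class<?>> universe) {
        List<Class<?>> targets = new ArrayList<>();
        for (Class<?> c : universe) {
            boolean concrete = !c.isInterface() && !Modifier.isAbstract(c.getModifiers());
            if (concrete && castType.isAssignableFrom(c)) {
                targets.add(c);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        List<Class<?>> universe = Arrays.asList(
                RequestHandler.class, FileRequestHandler.class,
                NetRequestHandler.class, Unrelated.class);
        System.out.println(resolveByCast(RequestHandler.class, universe));
    }
}
```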

Correct String Information
New assumption: dynamic features will not affect the value of private String fields and formal String type parameters of private and package-private methods whose values flow to dynamic class loading sites
– Allows JSA to be used to resolve instances of dynamic class loading
– The type information from resolved instances is propagated to reflective calls

Semi-static String Values
New assumption: values of environment variables which can affect dynamic class loading sites will be the same at analysis time and run time
– Allows JSA to consider semi-static string values

Experimental Evaluation
Implemented a version of the CHA call graph construction algorithm for each level of the assumption hierarchy
Applied each implementation to 10 benchmark applications
Compared the number of nodes and edges

Experimental Results: Nodes
On average Semi contains 10% fewer nodes than Cons

Experimental Results: Edges
On average Semi contains 54% fewer edges than Cons

Summary of Results
The semi-static version of CHA created graphs that contained, on average, 10% fewer nodes and 54% fewer edges than the fully conservative version
The semi-static version was able to resolve an average of 6% of the reflective invocation calls, 50% of the dynamic class loading sites, and 61% of the reflective instantiation sites
Under very reasonable assumptions a much more precise call graph can be created

Summary of Results Cont.
We also compared Semi to the CHA in Soot
– Soot provides “conservative” treatment for calls to Class.forName and Class.newInstance
Soot’s graphs contained an average of 37% fewer nodes and 62% fewer edges than those created by Semi
Significant portions of a call graph could be missing if dynamic features are treated unsoundly

Conclusion
Increased the modeling capabilities of string analysis
– Semi-static and modeling extensions
Decreased the cost of precise string analysis
– Parallel design and removing irrelevant information
Increased a CHA call graph construction algorithm’s ability to precisely treat instances of dynamic features
This work is a step toward making static analysis tools better equipped to handle dynamic features of Java

Future Work
Explore other semi-static sources of string values
– Configuration files
Further increase JSA scalability
– Library summaries
– Object pools
Refine resolution techniques
– String analysis to resolve Class.getMethod calls
Apply resolution techniques to other analyses
– RTA