Pick Your Contexts Well: Understanding Object-Sensitivity The Making of a Precise and Scalable Pointer Analysis Yannis Smaragdakis University of Massachusetts,

Slides:



Advertisements
Similar presentations
Yannis Smaragdakis / 11-Jun-14 General Adaptive Replacement Policies Yannis Smaragdakis Georgia Tech.
Advertisements

Context-Sensitive Interprocedural Points-to Analysis in the Presence of Function Pointers Presentation by Patrick Kaleem Justin.
Pointer Analysis – Part I Mayur Naik Intel Research, Berkeley CS294 Lecture March 17, 2009.
Tail Recursion. Problems with Recursion Recursion is generally favored over iteration in Scheme and many other languages – It’s elegant, minimal, can.
File Processing - Indirect Address Translation MVNC1 Hashing Indirect Address Translation Chapter 11.
CSC 213 – Large Scale Programming or. Today’s Goals  Begin by discussing basic approach of quick sort  Divide-and-conquer used, but how does this help?
Practice Quiz Question
1 Practical Object-sensitive Points-to Analysis for Java Ana Milanova Atanas Rountev Barbara Ryder Rutgers University.
Parameterized Object Sensitivity for Points-to Analysis for Java Presented By: - Anand Bahety Dan Bucatanschi.
The Ant and The Grasshopper Fast and Accurate Pointer Analysis for Millions of Lines of Code Ben Hardekopf and Calvin Lin PLDI 2007 (Best Paper & Best.
Interprocedural analysis © Marcelo d’Amorim 2010.
Pointer and Shape Analysis Seminar Context-sensitive points-to analysis: is it worth it? Article by Ondřej Lhoták & Laurie Hendren from McGill University.
Next Section: Pointer Analysis Outline: –What is pointer analysis –Intraprocedural pointer analysis –Interprocedural pointer analysis (Wilson & Lam) –Unification.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts, Amherst Advanced Compilers CMPSCI 710.
DAST, Spring © L. Joskowicz 1 Data Structures – LECTURE 1 Introduction Motivation: algorithms and abstract data types Easy problems, hard problems.
Mayur Naik Alex Aiken John Whaley Stanford University Effective Static Race Detection for Java.
Scaling CFL-Reachability-Based Points- To Analysis Using Context-Sensitive Must-Not-Alias Analysis Guoqing Xu, Atanas Rountev, Manu Sridharan Ohio State.
Range Analysis. Intraprocedural Points-to Analysis Want to compute may-points-to information Lattice:
Run time vs. Compile time
1 Refinement-Based Context-Sensitive Points-To Analysis for Java Manu Sridharan, Rastislav Bodík UC Berkeley PLDI 2006.
2-1 Sample Spaces and Events Conducting an experiment, in day-to-day repetitions of the measurement the results can differ slightly because of small.
Intraprocedural Points-to Analysis Flow functions:
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts, Amherst Advanced Compilers CMPSCI 710.
B + -Trees (Part 1). Motivation AVL tree with N nodes is an excellent data structure for searching, indexing, etc. –The Big-Oh analysis shows most operations.
Swerve: Semester in Review. Topics  Symbolic pointer analysis  Model checking –C programs –Abstract counterexamples  Symbolic simulation and execution.
Composing Dataflow Analyses and Transformations Sorin Lerner (University of Washington) David Grove (IBM T.J. Watson) Craig Chambers (University of Washington)
DAST, Spring © L. Joskowicz 1 Data Structures – LECTURE 1 Introduction Motivation: algorithms and abstract data types Easy problems, hard problems.
Pointer analysis. Pointer Analysis Outline: –What is pointer analysis –Intraprocedural pointer analysis –Interprocedural pointer analysis Andersen and.
DATA STRUCTURE Subject Code -14B11CI211.
Tail Recursion. Problems with Recursion Recursion is generally favored over iteration in Scheme and many other languages – It’s elegant, minimal, can.
Lesson 3 Variables – How our Brains Work - Variables in Python.
Impact Analysis of Database Schema Changes Andy Maule, Wolfgang Emmerich and David S. Rosenblum London Software Systems Dept. of Computer Science, University.
Marshall University School of Medicine Department of Biochemistry and Microbiology BMS 617 Lecture 6 – Multiple comparisons, non-normality, outliers Marshall.
MA/CSSE 473 Day 22 Binary Heaps Heapsort Answers to student questions Exam 2 Tuesday, Nov 4 in class.
Panagiotis Antonopoulos Microsoft Corp Ioannis Konstantinou National Technical University of Athens Dimitrios Tsoumakos.
1 Time Analysis Analyzing an algorithm = estimating the resources it requires. Time How long will it take to execute? Impossible to find exact value Depends.
Unit III : Introduction To Data Structures and Analysis Of Algorithm 10/8/ Objective : 1.To understand primitive storage structures and types 2.To.
Sorting HKOI Training Team (Advanced)
Analysis of Algorithms These slides are a modified version of the slides used by Prof. Eltabakh in his offering of CS2223 in D term 2013.
Resource Identity and Semantic Extensions: Making Sense of Ambiguity David Booth, Ph.D. Cleveland Clinic (contractor) Semantic Technology Conference 25-June-2010.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to.
CMPT 438 Algorithms. Why Study Algorithms? Necessary in any computer programming problem ▫Improve algorithm efficiency: run faster, process more data,
MA/CSSE 473 Day 27 Hash table review Intro to string searching.
COP4020 Programming Languages Subroutines and Parameter Passing Prof. Xin Yuan.
Breadcrumbs: Efficient Context Sensitivity for Dynamic Bug Detection Analyses Michael D. Bond University of Texas at Austin Graham Z. Baker Tufts / MIT.
CSC 212 – Data Structures Lecture 37: Course Review.
MA/CSSE 473 Day 23 Student questions Space-time tradeoffs Hash tables review String search algorithms intro.
Mark Marron IMDEA-Software (Madrid, Spain) 1.
CSE373: Data Structures & Algorithms Lecture 11: Implementing Union-Find Nicki Dell Spring 2014.
Review 1 Selection Sort Selection Sort Algorithm Time Complexity Best case Average case Worst case Examples.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to.
Pointer Analysis Survey. Rupesh Nasre. Aug 24, 2007.
Pointer and Escape Analysis for Multithreaded Programs Alexandru Salcianu Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology.
Pointer Analysis – Part I CS Pointer Analysis Answers which pointers can point to which memory locations at run-time Central to many program optimization.
5/7/03ICSE Fragment Class Analysis for Testing of Polymorphism in Java Software Atanas (Nasko) Rountev Ohio State University Ana Milanova Barbara.
Sept 12ICSM'041 Precise Identification of Side-Effect-Free Methods in Java Atanas (Nasko) Rountev Ohio State University.
Marshall University School of Medicine Department of Biochemistry and Microbiology BMS 617 Lecture 6 –Multiple hypothesis testing Marshall University Genomics.
CS 241 Discussion Section (12/1/2011). Tradeoffs When do you: – Expand Increase total memory usage – Split Make smaller chunks (avoid internal fragmentation)
Data Structures and Algorithms Instructor: Tesfaye Guta [M.Sc.] Haramaya University.
CMPT 238 Data Structures More on Sorting: Merge Sort and Quicksort.
LECTURE 19 Subroutines and Parameter Passing. ABSTRACTION Recall: Abstraction is the process by which we can hide larger or more complex code fragments.
1 On Writing Research Papers Kathryn McKinley The University of Texas at Austin Style: Toward Clarity and Grace by Joseph Williams thanks to: William Cook.
Making k-Object-Sensitive Pointer Analysis More Precise with Still k-Limiting Tian Tan, Yue Li and Jingling Xue SAS 2016 September,
CMPT 438 Algorithms.
Pick Your Contexts Well: Understanding Object-Sensitivity The Making of a Precise and Scalable Pointer Analysis Yannis Smaragdakis University of Massachusetts,
Pointer Analysis Lecture 2
Declarative Static Program Analysis with Doop
Pointer analysis.
CS703 - Advanced Operating Systems
Presentation transcript:

Pick Your Contexts Well: Understanding Object-Sensitivity The Making of a Precise and Scalable Pointer Analysis Yannis Smaragdakis University of Massachusetts, Amherst and University of Athens Martin Bravenboer LogicBlox Inc. Ondřej Lhoták University of Waterloo

Context Object-sensitivity: an abstraction already behind the most precise and scalable points-to analyses Introduced by Milanova, Rountev and Ryder in 2002, quickly adopted in many practical settings mostly for OO languages Still not completely understood: the design space yields algorithms with very different precision not clear how context affects precision and scalability Yannis Smaragdakis2

What is this paper about? We offer a clearer understanding of object-sensitivity design space, tradeoffs We exploit it to produce better points-to analysis: type- sensitive analysis like object sensitive, but with some contexts replaced by types choice matters a lot! Why do you care? because there are some really cool insights easy to follow because the result is practical: currently the best tradeoff of precision and performance Yannis Smaragdakis3

First: what is points-to analysis? Static analysis: what objects can a variable point to? Highly recursive Yannis Smaragdakis4 class A { void foo(Object o) {…} } class Client { void bar(A a1, A a2) { … a1.foo(someobj1); … a2.foo(someobj2); } } class A { void foo(Object o) {…} } class Client { void bar(A a1, A a2) { … a1.foo(someobj1); … a2.foo(someobj2); } } foo’s o can point to whatever someobj1 can point to foo’s o can point to whatever someobj2 can point to

Call-site-sensitive points-to analysis / kCFA Typically made precise using “context”: e.g., call-sites Yannis Smaragdakis5 class A { void foo(Object o) {…} } class Client { void bar(A a1, A a2) { … a1.foo(someobj1); … a2.foo(someobj2); } } class A { void foo(Object o) {…} } class Client { void bar(A a1, A a2) { … a1.foo(someobj1); … a2.foo(someobj2); } } foo analyzed separately: once for o pointing to whatever someobj1 can point to once for o pointing to whatever someobj2 can point to Important because of further analysis inside foo foo analyzed separately: once for o pointing to whatever someobj1 can point to once for o pointing to whatever someobj2 can point to Important because of further analysis inside foo

In this talk: different context abstraction! Object-Sensitivity Object-sensitivity: information on objects used as context Yannis Smaragdakis6 class A { void foo(Object o) {…} } class Client { void bar(A a1, A a2) { … a1.foo(someobj1); … a2.foo(someobj2); } } class A { void foo(Object o) {…} } class Client { void bar(A a1, A a2) { … a1.foo(someobj1); … a2.foo(someobj2); } } foo analyzed separately for each object pointed to by a1 for each object pointed to by a2 How many cases in total? 0? 1? 2?... 1million? The number of contexts depends on the analysis so far! foo analyzed separately for each object pointed to by a1 for each object pointed to by a2 How many cases in total? 0? 1? 2?... 1million? The number of contexts depends on the analysis so far!

A large design space What “information on objects used as context” ? Yannis Smaragdakis7 class A { void foo(Object o) {…} } class Client { void bar(A a1, A a2) { … a1.foo(someobj1); … a2.foo(someobj2); } } class A { void foo(Object o) {…} } class Client { void bar(A a1, A a2) { … a1.foo(someobj1); … a2.foo(someobj2); } } Information available on an object: its creation site (instruction) the context for its creation site No matter what “context” is! Context for a call has to be created out of: information for receiver object current context at call-site (information for caller object) Need to at least collapse two contexts into one Information available on an object: its creation site (instruction) the context for its creation site No matter what “context” is! Context for a call has to be created out of: information for receiver object current context at call-site (information for caller object) Need to at least collapse two contexts into one

Design Space This choice (practically options) has not been acknowledged before The choices made by standard published algorithms and implementations vary widely mostly without realizing The result is completely different precision and performance Yannis Smaragdakis8

Example: P ADDLE vs. Milanova For a 2-object-sensitive analysis: context is 2 allocation sites Yannis Smaragdakis9 class A { void foo(Object o) {…} } class Client { void bar(A a1, A a2) { … a1.foo(someobj1); … a2.foo(someobj2); } } class A { void foo(Object o) {…} } class Client { void bar(A a1, A a2) { … a1.foo(someobj1); … a2.foo(someobj2); } } Original object-sensitivity (Milanova) uses: receiver ( a1 or a2 ) allocation site allocation site of receiver’s allocator PADDLE framework uses: receiver ( a1 or a2 ) allocation site caller allocation site i.e., of a Client object, not an A object Quiz: which one do we think wins? Original object-sensitivity (Milanova) uses: receiver ( a1 or a2 ) allocation site allocation site of receiver’s allocator PADDLE framework uses: receiver ( a1 or a2 ) allocation site caller allocation site i.e., of a Client object, not an A object Quiz: which one do we think wins?

General formal framework for context-sensitive analyses Keep context-sensitive variables, a store, sets Context, HContext, abstr. interpretation over A-Normal FJ formalism [Might, Smaragdakis, and Van Functions: record : Instr x Context  HContext merge : Instr x HContext x Context  Context Key analysis logic: i : [ v = new C(); ] with context c  store heap context record(i,c) with v i : [ v.m(…) ; ] with context c  analyze m with context merge(i,hc,c) where hc is the context stored with v Yannis Smaragdakis10

We can now express past analyses nicely Original Milanova et al.-style object-sensitivity: Context = HContext = Instr n Functions: record(i,c) = cons(i, first n-1 (c)) merge(i, hc, c) = hc Yannis Smaragdakis11 record : Instr x Context  HContext merge : Instr x HContext x Context  Context i : [ v = new C(); ] with context c  store heap context record(i,c) with v i : [ v.m(…) ; ] with context c  analyze m with context merge(i,hc,c) where hc is the context stored with v

We can now express past analyses nicely P ADDLE -style object-sensitivity: Context = HContext = Instr n Functions: record(i,c) = cons(i, first n-1 (c)) merge(i, hc, c) = cons(car(hc), first n-1 (c)) Yannis Smaragdakis12 record : Instr x Context  HContext merge : Instr x HContext x Context  Context i : [ v = new C(); ] with context c  store heap context record(i,c) with v i : [ v.m(…) ; ] with context c  analyze m with context merge(i,hc,c) where hc is the context stored with v

We can now express past analyses nicely Most commonly called “object-sensitivity”: HContext = Instr, Context = Instr n Functions: record(i,c) = i merge(i, hc, c) = cons(hc, first n-1 (c)) Yannis Smaragdakis13 record : Instr x Context  HContext merge : Instr x HContext x Context  Context i : [ v = new C(); ] with context c  store heap context record(i,c) with v i : [ v.m(…) ; ] with context c  analyze m with context merge(i,hc,c) where hc is the context stored with v

We can now express past analyses nicely object-sensitive+H analyses (heap cloning): HContext = Instr n+1, Context = Instr n Functions: record(i,c) = cons(i, c) merge(i, hc, c) = [any of the previous options] Yannis Smaragdakis14 record : Instr x Context  HContext merge : Instr x HContext x Context  Context i : [ v = new C(); ] with context c  store heap context record(i,c) with v i : [ v.m(…) ; ] with context c  analyze m with context merge(i,hc,c) where hc is the context stored with v

Some insights on context When context consists of n elements with K possibilities for each, we analyze each method up to nK times e.g., K = #allocation sites Relative to a shallower context (e.g., n-1) we may replicate same points-to data K times Ideal for precision: extra context elements partition space into small sets, i.e., evenly I.e., context elements are uncorrelated otherwise combinations uneven Yannis Smaragdakis15

Revisit Example: P ADDLE vs. Milanova For a 2-object-sensitive analysis: context is 2 allocation sites Yannis Smaragdakis16 class A { void foo(Object o) {…} } class Client { void bar(A a1, A a2) { … a1.foo(someobj1); … a2.foo(someobj2); } } class A { void foo(Object o) {…} } class Client { void bar(A a1, A a2) { … a1.foo(someobj1); … a2.foo(someobj2); } } Original obj.-sens. (Milanova) uses: receiver ( a1 or a2 ) allocation site allocation site of receiver’s allocator PADDLE framework uses: receiver ( a1 or a2 ) allocation site caller allocation site Quiz: which one do we think wins? Original. Receiver and caller are highly correlated! e.g., same object, wrapper object, design patterns Original obj.-sens. (Milanova) uses: receiver ( a1 or a2 ) allocation site allocation site of receiver’s allocator PADDLE framework uses: receiver ( a1 or a2 ) allocation site caller allocation site Quiz: which one do we think wins? Original. Receiver and caller are highly correlated! e.g., same object, wrapper object, design patterns

A significant difference Good choice of context is more precise: smaller points-to sets better results for client analyses: static cast elimination, de-virtualization, reachable methods often difference on 2-object-sensitive analyses (good vs. bad context) as great as from 1-object-sensitive Good choice of context yields much faster implementation! often 2x or more using our framework Yannis Smaragdakis17

A significant difference Good choice of context is more precise: smaller points-to sets better results for client analyses: static cast elimination, de-virtualization, reachable methods often difference on 2-object-sensitive analyses (good vs. bad context) as great as from 1-object-sensitive Good choice of context yields much faster implementation! often 2x or more Yannis Smaragdakis18

Some more understanding of contexts The problem with precise, deep-context analyses is that they may explode in complexity when deeper context yields precision, it is great even better performance when imprecision creeps in, scalability wall: extra level of context, O(K) multiplicative factor in complexity plain combinatorial explosion Result: some programs are fast(er), some completely hopeless Yannis Smaragdakis19

Idea: type-sensitivity Why not alleviate the combinatorial explosion by reducing combinations Instead of allocation sites, keep types Otherwise precisely isomorphic to object- sensitivity just some elements of context are transformed by a function T : Instr  ClassName Yannis Smaragdakis20

Example type-sensitive analyses 2type+1H: HContext = Instr x ClassName Context = ClassName 2 Functions: record(i, [C 1,C 2 ]) = [i,C 1 ] merge(i, [i’,C], c) = [T(i’),C] Yannis Smaragdakis21 record : Instr x Context  HContext merge : Instr x HContext x Context  Context i : [ v = new C(); ] with context c  store heap context record(i,c) with v i : [ v.m(…) ; ] with context c  analyze m with context merge(i,hc,c) where hc is the context stored with v

Example type-sensitive analyses 1type1obj+1H: HContext = Instr 2 Context = Instr x ClassName Functions: record(i, [i’,C]) = [i,i’] merge(i, [i 1,i 2 ], c) = [i 1, T(i 2 )] Yannis Smaragdakis22 record : Instr x Context  HContext merge : Instr x HContext x Context  Context i : [ v = new C(); ] with context c  store heap context record(i,c) with v i : [ v.m(…) ; ] with context c  analyze m with context merge(i,hc,c) where hc is the context stored with v

What function T to choose? Yannis Smaragdakis23 class A { … i: B b = new B(); … b.foo(…); } class A { … i: B b = new B(); … b.foo(…); } Which type gives more information about i ? A or B ? i used in representing receiver object when analyzing specific implementation of method foo B offers little info: we already know good upper bound for B when analyzing foo : either B::foo or C::foo for some close superclass C Which type gives more information about i ? A or B ? i used in representing receiver object when analyzing specific implementation of method foo B offers little info: we already know good upper bound for B when analyzing foo : either B::foo or C::foo for some close superclass C

Type-sensitivity in practice Type-sensitive analyses work great in practice! Very fast, very few scalability issues 2type+1H at least 2x (and up to 8x) faster than 1obj+H for 9 out of 10 DaCapo benchmarks while almost always much more precise an excellent approximation of full object-sensitive analyses 2type+1H is probably the new sweet spot for a practical precise analysis Yannis Smaragdakis24

Conclusions We offered a clearer understanding of object-sensitivity design space, tradeoffs We exploited it to produce better points-to analysis: type- sensitive analysis like object sensitive, but with some contexts replaced by types choice matters a lot! Why do you care? because there are some really cool insights easy to follow because the result is practical: currently the best tradeoff of precision and performance Yannis Smaragdakis25