5/2/2008Prof. P. N. Hilfinger CS164 Lecture 401 Pointer Analysis Lecture 40 (adapted from notes by R. Bodik)

Slides:



Advertisements
Similar presentations
P3 / 2004 Register Allocation. Kostis Sagonas 2 Spring 2004 Outline What is register allocation Webs Interference Graphs Graph coloring Spilling Live-Range.
Advertisements

Context-Sensitive Interprocedural Points-to Analysis in the Presence of Function Pointers Presentation by Patrick Kaleem Justin.
Compilation 2011 Static Analysis Johnni Winther Michael I. Schwartzbach Aarhus University.
Pointer Analysis – Part I Mayur Naik Intel Research, Berkeley CS294 Lecture March 17, 2009.
Control-Flow Graphs & Dataflow Analysis CS153: Compilers Greg Morrisett.
CS412/413 Introduction to Compilers Radu Rugina Lecture 37: DU Chains and SSA Form 29 Apr 02.
Prof. Necula CS 164 Lecture 141 Run-time Environments Lecture 8.
Program Representations. Representing programs Goals.
Parameterized Object Sensitivity for Points-to Analysis for Java Presented By: - Anand Bahety Dan Bucatanschi.
COMP171 Data Structure & Algorithm Tutorial 1 TA: M.Y.Chan.
CS 536 Spring Run-time organization Lecture 19.
CS 536 Spring Intermediate Code. Local Optimizations. Lecture 22.
3/17/2008Prof. Hilfinger CS 164 Lecture 231 Run-time organization Lecture 23.
4/23/09Prof. Hilfinger CS 164 Lecture 261 IL for Arrays & Local Optimizations Lecture 26 (Adapted from notes by R. Bodik and G. Necula)
Interfaces. In this class, we will cover: What an interface is Why you would use an interface Creating an interface Using an interface Cloning an object.
4/25/08Prof. Hilfinger CS164 Lecture 371 Global Optimization Lecture 37 (From notes by R. Bodik & G. Necula)
Range Analysis. Intraprocedural Points-to Analysis Want to compute may-points-to information Lattice:
Run time vs. Compile time
1 Refinement-Based Context-Sensitive Points-To Analysis for Java Manu Sridharan, Rastislav Bodík UC Berkeley PLDI 2006.
Data Flow Analysis Compiler Design October 5, 2004 These slides live on the Web. I obtained them from Jeff Foster and he said that he obtained.
Intraprocedural Points-to Analysis Flow functions:
Java Alias Analysis for Online Environments Manu Sridharan 2004 OSQ Retreat Joint work with Rastislav Bodik, Denis Gopan, Jong-Deok Choi.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts, Amherst Advanced Compilers CMPSCI 710.
Prof. Fateman CS 164 Lecture 221 Global Optimization Lecture 22.
Intermediate Code. Local Optimizations
1 Run time vs. Compile time The compiler must generate code to handle issues that arise at run time Representation of various data types Procedure linkage.
Pointer analysis. Pointer Analysis Outline: –What is pointer analysis –Intraprocedural pointer analysis –Interprocedural pointer analysis Andersen and.
Prof. Bodik CS 164 Lecture 16, Fall Global Optimization Lecture 16.
1/25 Pointer Logic Changki PSWLAB Pointer Logic Daniel Kroening and Ofer Strichman Decision Procedure.
C++ / G4MICE Course Session 3 Introduction to Classes Pointers and References Makefiles Standard Template Library.
11 Values and References Chapter Objectives You will be able to: Describe and compare value types and reference types. Write programs that use variables.
Lecture 22 Miscellaneous Topics 4 + Memory Allocation.
REFACTORING Lecture 4. Definition Refactoring is a process of changing the internal structure of the program, not affecting its external behavior and.
Topic #10: Optimization EE 456 – Compiling Techniques Prof. Carl Sable Fall 2003.
“is a”  Define a new class DerivedClass which extends BaseClass class BaseClass { // class contents } class DerivedClass : BaseClass { // class.
Topic 3 The Stack ADT.
CMSC 202 Exceptions. Aug 7, Error Handling In the ideal world, all errors would occur when your code is compiled. That won’t happen. Errors which.
Program Analysis with Dynamic Change of Precision Dirk Beyer Tom Henzinger Grégory Théoduloz Presented by: Pashootan Vaezipoor Directed Reading ASE 2008.
CSE 425: Object-Oriented Programming I Object-Oriented Programming A design method as well as a programming paradigm –For example, CRC cards, noun-verb.
Elements of Computing Systems, Nisan & Schocken, MIT Press, 2005, Chapter 11: Compiler II: Code Generation slide 1www.idc.ac.il/tecs.
1 Review of Static Program Analysis and Co/Contra-variance Ras Bodik, Thibaud Hottelier, James Ide UC Berkeley CS164: Introduction to Programming Languages.
Type Systems CS Definitions Program analysis Discovering facts about programs. Dynamic analysis Program analysis by using program executions.
Major objective of this course is: Design and analysis of modern algorithms Different variants Accuracy Efficiency Comparing efficiencies Motivation thinking.
1 Lecture 19 Flow Analysis flow analysis in prolog; applications of flow analysis Ras Bodik Ali and Mangpo Hack Your Language! CS164: Introduction to Programming.
Static Program Analyses of DSP Software Systems Ramakrishnan Venkitaraman and Gopal Gupta.
Compiler Principles Fall Compiler Principles Lecture 0: Local Optimizations Roman Manevich Ben-Gurion University.
CS 376b Introduction to Computer Vision 01 / 23 / 2008 Instructor: Michael Eckmann.
C# Classes and Inheritance CNS 3260 C#.NET Software Development.
RUN-Time Organization Compiler phase— Before writing a code generator, we must decide how to marshal the resources of the target machine (instructions,
12/9/20151 Programming Languages and Compilers (CS 421) Elsa L Gunter 2112 SC, UIUC Based in part on slides by Mattox.
Pointer Analysis Survey. Rupesh Nasre. Aug 24, 2007.
CS-1030 Dr. Mark L. Hornick 1 Basic C++ State the difference between a function/class declaration and a function/class definition. Explain the purpose.
CSE 374 Programming Concepts & Tools Hal Perkins Fall 2015 Lecture 10 – C: the heap and manual memory management.
ICS3U_FileIO.ppt File Input/Output (I/O)‏ ICS3U_FileIO.ppt File I/O Declare a file object File myFile = new File("billy.txt"); a file object whose name.
Interfaces F What is an Interface? F Creating an Interface F Implementing an Interface F What is Marker Interface?
1 Lecture 17 Flow Analysis flow analysis in prolog; applications of flow analysis Ras Bodik Shaon Barman Thibaud Hottelier Hack Your Language! CS164: Introduction.
Pointer Analysis – Part I CS Pointer Analysis Answers which pointers can point to which memory locations at run-time Central to many program optimization.
LECTURE 3 Compiler Phases. COMPILER PHASES Compilation of a program proceeds through a fixed series of phases.  Each phase uses an (intermediate) form.
Terms and Rules II Professor Evan Korth New York University (All rights reserved)
CMPSC 16 Problem Solving with Computers I Spring 2014 Instructor: Lucas Bang Lecture 11: Pointers.
Prof. Necula CS 164 Lecture 171 Operational Semantics of Cool ICOM 4029 Lecture 10.
Programming Languages and Compilers (CS 421)
CSE 374 Programming Concepts & Tools
Run-time organization
Java Programming Language
Data Flow Analysis Compiler Design
Advanced Compiler Design
Data Structures & Algorithms
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Procedure Linkages Standard procedure linkage Procedure has
Presentation transcript:

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 401 Pointer Analysis Lecture 40 (adapted from notes by R. Bodik)

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 402 Today Points-to analysis –an instance of static analysis for understanding pointers Andersen’s algorithm –via deduction Implementation of Andersen’s algorithm in Prolog Implementation of Andersen’s algorithm via CYK parsing –CFL-reachability 2

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 403 General Goals of Static Analysis Determine run-time properties at compilation –sample property: “is variable x a constant?” Statically: without running the program –Don’t know the inputs and thus must consider all possible program executions Err on the side of caution (soundness): allowed to say x is not a constant when it is –but not that x is a constant when it is not Many clients: optimization, verification, compilation 3

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 404 Client 1: Optimizing virtual calls in Java Motivation: virtual calls are costly, due to method dispatch Idea: –determine the target of the call statically –if we can prove call has a single target, call the target directly How to analyze? –declared (static) types of pointer variables not precise enough for this, –so, analyze their run-time (dynamic) types. 4

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 405 Client 1: example class A { void foo() {…} } class B extends A { void foo() {…} } void bar(A a) { a.foo() } OK to just call B.foo? B myB = new B(); A myA = myB; bar(myA); Declared type of a permits a.foo() to target both A.foo and B.foo. Yet we know only B.foo is the target. What program property would reveal this fact? 5

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 406 Client 2: Verification of casts In Java, casts are checked at run time –Java generics help readability, but still cast. A cast check: (Foo) e translates to –if ( ! (e instanceof Foo)) throw new ClassCastException() Goal: prove that no exception will happen at runtime –The exception prevents any security holes, but is expensive. –Static verification useful to catch bugs. 6

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 407 Client 2: Example class SimpleContainer { Object a; void put (Object o) { a=o; } Object get() { return a; } } SimpleContainer c1 = new SimpleContainer(); SimpleContainer c2 = new SimpleContainer(); c1.put(new Foo()); c2.put(“Hello”); Foo myFoo = (Foo) c1.get(); // Check not needed What property will lead to desired verification? 7

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 408 Client 3: Non-overlapping Fields in Heap We assign to E.len, but we don’t have to fetch from D.len every time; can save in register. In the IL, all D. fetches and E. fetches look like *t k. Can’t distinguish *t 1 from *t 2 unless we know t 1  t 2 E = new Thing (42); for (j = 0; j < D.len; j += 1) { if (E.len >= E.max) throw new OverflowExcept(); E.data[E.len] = D.data[j]; E.len += 1; }

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 409 Is there an analysis that can serve these three clients? We want to understand how pointers “flow” –that is, how are pointer values copied from variable to variable? Interested in flow from creation of an object to its uses –that is, flow from new Foo to myFoo.f Complication: pointers may flow via the heap –that is, a pointer may be stored in an object’s field –... and later read from this field

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4010 Common Analysis The necessary flow analysis involves –producers (creators of pointer values: new Foo) –consumers (uses of the pointer value, eg, p.f()) For simplicity, assume we are analyzing Java –we know all fields of an object at compile time –they must be introduced in the class definition –(also, let’s assume the Java program does not use reflection)

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4011 Continued.. Client 1: virtual call optimization: –which expressions new T() produced the values that may flow to receiver p in call for any program input? –Knowing producers will tells us possible dynamic types of p, and and thus also the set of target methods Client 2: cast verification: –Same, but producers include expressions (Type) p. Client 3: non-overlapping fields: –Again, same question 11

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4012 Flow analysis as a constant propagation Initially, consider only new and assignments p=r: if (…) p = new T1(); else p = new T2(); r = p; r.f(); // what are possible dynamic types of r? We (conceptually) translate the program to if (…) p = o 1 ; else p = o 2 ; r = p; r.f(); // what are possible symbolic constant values r? 12

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4013 Abstract objects The o i constants are called abstract objects –an abstract object o i stands for any and all concrete objects allocated at the allocation site with number i –allocation site = a `new’ expression When the analysis says a variable p may have value o 7 –we know p may point to any object allocated at new 7 Foo() 13

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4014 Flow analysis: Add Pointer Dereferences x = new Obj(); // o 1 z = new Obj(); // o 2 w = x; y = x; y.f = z; v = w.f; To propagate the abstract objects through p.f, must keep track of the heap state: –y and w point to same object –z and y.f point to same object, etc

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4015 Keep track of the heap state Heap state: where pointers point to The heap state may change at each statement, so ideally, track the heap state separately at each program point as in dataflow analysis. But scalable (i.e. practical) analyses typically don’t do it –to save space, they collapse all program points into one –consequently, they keep a single heap state…

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4016 Flow-Insensitive Analysis Disregards the control flow of the program –assumes that statements can execute in any order, and any number of times So, flow-insensitive analysis transforms this if (…) p = new T1(); else p = new T2(); r = p; p = r.f; into this CFG: p = new T1() p = new T2() p = r.f r = p

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4017 Flow-Insensitive Analysis Motivation: –Just “version” of program state, hence less space Flow-insensitive analysis is sound, assuming we mean that at least all possible values of pointer from all possible executions found But it is generally imprecise –it adds many executions not present in the original program –also, does not distinguish value of p at various program points

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4018 Canonical Statements Java pointers can be manipulated in complex statements: –e.g.: p.f().g.arr[i] = r.f.g(new Foo()).h Need a small set of canonical statements that accounts for everything our analysis needs to serve as intermediate representation: –p = new T()new –p = rassign –p = r.fgetfield –p.f = rputfield

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4019 Using Canonical Statements Complex statements can be canonicalized –p.f.g = r.f  t1 = p.f; t2 = r.f; t1.g = t2 –can be done with a syntax-directed translation

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4020 Handling of method calls Issue 1: Arguments and return values: –these are translated into assignments of the form p=r Example: Object foo(T x) { return x.f } r = new T; s = foo(r.g) could translate to foo_retval = x.f; r = new T; x = r.g; s = foo_retval; (have used flow-insensitivity: order irrelevant)

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4021 Handling of method calls Issue 2: targets of virtual calls –call p.f() may call many possible methods –to do the translation shown on previous slide, must determine what these targets are Suggest two simple methods: –Use declared type of p. – Check whole program to see which types are actually instantiated.

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4022 Handling of arrays We collapse all array elements into one –this single element will be represented by a field named arr –so p.g[i] = r becomes p.g.arr = r

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4023 Andersen’s Algorithm For flow-insensitive points-to analysis: Goal: compute two binary relations of interest: x pointsTo o: holds when x may point to abstract object o o flowsTo x: holds when abstract object o may flow to (be assigned to) x. (same thing, really: x pointsTo o  o flowsTo x)

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4024 Andersen’s algorithm Deduces the flowsTo relation from program statements –statements are facts –analysis is a set of inference rules –flowsTo relation is a set of facts inferred with analysis rules Statement facts: we’ll write them as x predicate y –p = new i T()o i new p –p = rr assign p –p = r.fr gf(f) p –p.f = rr pf(f) p

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4025 Inference rules 1) p = new i T() o i new p o i new p  o i flowsTo p Rule 1 2)p = r r assign p o i flowsTo r  r assign p  o i flowsTo p Rule 2 3)p.f = a a pf(f) p b = r.f r gf(f) b o i flowsTo a  a pf(f) p  p alias r  r gf(f) b  o i flowsTo b Rule 3

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4026 Inference rules (part 2) 4) Finally, x alias y (x and y may point to same object): o i flowsTo x  o i flowsTo y  x alias yRule 4

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4027 Notes When the analysis infers o flowsTo y, what did we prove? –nothing useful, usually, since o flowsTo y does not imply that there is a program input for which o will definitely flow to y. The useful result is when the analysis cannot infer that o flowsTo y –then we have proved that o cannot flow to y for any input –we obtained a useful piece of information, which may lead to better optimization, verification, compilation Same arguments apply to alias, pointsTo relations –and other static analyses in general

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4028 Inference Example (1) The program: x = new Foo(); // o1 z = new Bar(); // o2 w = x; y = x; y.f = z; v = w.f; The six facts: o 1 new x o 2 new z x assign w x assign y z pf(f) y w gf(f) v

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4029 Inference Example (2): 1. o 1 new x 2. o 2 new z 3. x assign w4. x assign y 5. z pf(f) y6. w gf(f) v The inferences: o 1 new x  o 1 flowsTo x o 2 new z  o 2 flowsTo z o 1 flowsTo x  x assign w  o 1 flowsTo w o 1 flowsTo x  x assign y  o 1 flowsTo y o 1 flowsTo y  o 1 flowsTo w  y alias w o 2 flowsTo z  z pf(f) y  y alias w  w gf(f) v  o 2 flowsTo v etc.

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4030 Example (3): Notes: –the inference must continue until no more facts can be derived –only then do we know we have performed sound analysis Conclusions from our example inference: –we have inferred o 2 flowsTo v –we have NOT inferred o 1 flowsTo v –hence we know v will point only to instances of Bar –(assuming the example contains the whole program) –thus casts (Bar) v will succeed –similarly, calls v.f() are optimizable

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4031 Prolog program for Andersen algorithm new(o1,x). % x=new_1 Foo() new(o2,z). % z=new_2 Bar() assign(x,y). % y=x assign(x,w). % w=x pf(z,y,f). % y.f=z gf(w,v,f). % v=w.f flowsTo(O,X) :- new(O,X). flowsTo(O,X) :- assign(Y,X), flowsTo(O,Y). flowsTo(O,X) :- pf(Y,P,F), gf(R,X,F), aliasP,R), flowsTo(O,Y). alias(X,Y) :- flowsTo(O,X), flowsTo(O,Y). 31

5/2/2008Prof. P. N. Hilfinger CS164 Lecture 4032 Inference via graph reachability Prolog’s search is too general and potentially expensive. Prolog program may in general backtrack (exponential time) Fortunately, there are better algorithms as well that operate in polynomial time.