Intraprocedural Points-to Analysis Flow functions:

Slides:



Advertisements
Similar presentations
Linked Lists Geletaw S..
Advertisements

Data-Flow Analysis II CS 671 March 13, CS 671 – Spring Data-Flow Analysis Gather conservative, approximate information about what a program.
P3 / 2004 Register Allocation. Kostis Sagonas 2 Spring 2004 Outline What is register allocation Webs Interference Graphs Graph coloring Spilling Live-Range.
MATH 224 – Discrete Mathematics
Advanced Compiler Techniques LIU Xianhua School of EECS, Peking University Pointer Analysis.
Context-Sensitive Interprocedural Points-to Analysis in the Presence of Function Pointers Presentation by Patrick Kaleem Justin.
Pointer Analysis – Part I Mayur Naik Intel Research, Berkeley CS294 Lecture March 17, 2009.
Flow insensitive pointer analysis: fixed S1: l := new Cons p := l S2: t := new Cons *p := t p := t l p S1 l p tS2 l p S1 t S2 l t S1 p S2 l t S1 p S2 l.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) Pointer Analysis.
Worst case complexity of Andersen *x = y x abc y def x abc y def Worst case: N 2 per statement, so at least N 3 for the whole program. Andersen is in fact.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) Dataflow Analysis Introduction Guo, Yao Part of the slides are adapted from.
Optimizing Compilers for Modern Architectures More Interprocedural Analysis Chapter 11, Sections to end.
Parameterized Object Sensitivity for Points-to Analysis for Java Presented By: - Anand Bahety Dan Bucatanschi.
Semi-Sparse Flow-Sensitive Pointer Analysis Ben Hardekopf Calvin Lin The University of Texas at Austin POPL ’09 Simplified by Eric Villasenor.
Interprocedural analysis © Marcelo d’Amorim 2010.
Common Sub-expression Elim Want to compute when an expression is available in a var Domain:
Recap from last time We were trying to do Common Subexpression Elimination Compute expressions that are available at each program point.
Next Section: Pointer Analysis Outline: –What is pointer analysis –Intraprocedural pointer analysis –Interprocedural pointer analysis (Wilson & Lam) –Unification.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts, Amherst Advanced Compilers CMPSCI 710.
From last time: live variables Set D = 2 Vars Lattice: (D, v, ?, >, t, u ) = (2 Vars, µ, ;,Vars, [, Å ) x := y op z in out F x := y op z (out) = out –
Speeding Up Dataflow Analysis Using Flow- Insensitive Pointer Analysis Stephen Adams, Tom Ball, Manuvir Das Sorin Lerner, Mark Seigle Westley Weimer Microsoft.
Approach #1 to context-sensitivity Keep information for different call sites separate In this case: context is the call site from which the procedure is.
Interprocedural pointer analysis for C We’ll look at Wilson & Lam PLDI 95, and focus on two problems solved by this paper: –how to represent pointer information.
From last time: reaching definitions For each use of a variable, determine what assignments could have set the value being read from the variable Information.
Flow insensitivity and imprecision If you ignore flow, then you lose precision. main() { x := &y;... x := &z; } Flow insensitive analysis tells us that.
Previous finals up on the web page use them as practice problems look at them early.
Another example p := &x; *p := 5 y := x + 1;. Another example p := &x; *p := 5 y := x + 1; x := 5; *p := 3 y := x + 1; ???
Range Analysis. Intraprocedural Points-to Analysis Want to compute may-points-to information Lattice:
From last time S1: l := new Cons p := l S2: t := new Cons *p := t p := t l p S1 l p tS2 l p S1 t S2 l t S1 p S2 l t S1 p S2 l t S1 p L2 l t S1 p S2 l t.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts, Amherst Advanced Compilers CMPSCI 710.
Direction of analysis Although constraints are not directional, flow functions are All flow functions we have seen so far are in the forward direction.
Recap from last time g() { lock; } h() { unlock; } f() { h(); if (...) { main(); } } main() { g(); f(); lock; unlock; } mainfgh ;;;;;;; u u ” ”””” ” ”
Comparison Caller precisionCallee precisionCode bloat Inlining context-insensitive interproc Context sensitive interproc Specialization.
Pointer and Shape Analysis Seminar Mooly Sagiv Schriber 317 Office Hours Thursday
Reps Horwitz and Sagiv 95 (RHS) Another approach to context-sensitive interprocedural analysis Express the problem as a graph reachability query Works.
From last time: reaching definitions For each use of a variable, determine what assignments could have set the value being read from the variable Information.
Schedule Midterm out tomorrow, due by next Monday Final during finals week Project updates next week.
Direction of analysis Although constraints are not directional, flow functions are All flow functions we have seen so far are in the forward direction.
Pointer analysis. Flow insensitive loss of precision S1: l := new Cons p := l S2: t := new Cons *p := t p := t l t S1 p S2 l t S1 p S2 l t S1 p S2 l t.
Pointer analysis. Pointer Analysis Outline: –What is pointer analysis –Intraprocedural pointer analysis –Interprocedural pointer analysis Andersen and.
Symbolic Path Simulation in Path-Sensitive Dataflow Analysis Hari Hampapuram Jason Yue Yang Manuvir Das Center for Software Excellence (CSE) Microsoft.
Procedure Optimizations and Interprocedural Analysis Chapter 15, 19 Mooly Sagiv.
Precision Going back to constant prop, in what cases would we lose precision?
PRESTO Research Group, Ohio State University Interprocedural Dataflow Analysis in the Presence of Large Libraries Atanas (Nasko) Rountev Scott Kagan Ohio.
Program Efficiency & Complexity Analysis. Algorithm Review An algorithm is a definite procedure for solving a problem in finite number of steps Algorithm.
Featherweight X10: A Core Calculus for Async-Finish Parallelism Jonathan K. Lee, Jens Palsberg Presented By- Vasvi Kakkad.
ESEC/FSE-99 1 Data-Flow Analysis of Program Fragments Atanas Rountev 1 Barbara G. Ryder 1 William Landi 2 1 Department of Computer Science, Rutgers University.
Using Types to Analyze and Optimize Object-Oriented Programs By: Amer Diwan Presented By: Jess Martin, Noah Wallace, and Will von Rosenberg.
Pointer Analysis Survey. Rupesh Nasre. Aug 24, 2007.
Pointer Analysis Lecture 2 G. Ramalingam Microsoft Research, India & K. V. Raghavan.
Pointer Analysis for Multithreaded Programs Radu Rugina and Martin Rinard M I T Laboratory for Computer Science.
Points-To Analysis in Almost Linear Time Josh Bauman Jason Bartkowiak CSCI 3294 OCTOBER 9, 2001.
Escape Analysis for Java Will von Rosenberg Noah Wallace.
1 Ch. 2: Getting Started. 2 About this lecture Study a few simple algorithms for sorting – Insertion Sort – Selection Sort (Exercise) – Merge Sort Show.
Pointer Analysis – Part I CS Pointer Analysis Answers which pointers can point to which memory locations at run-time Central to many program optimization.
Searching CS 110: Data Structures and Algorithms First Semester,
1PLDI 2000 Off-line Variable Substitution for Scaling Points-to Analysis Atanas (Nasko) Rountev PROLANGS Group Rutgers University Satish Chandra Bell Labs.
Pointer Analysis CS Alias Analysis  Aliases: two expressions that denote the same memory location.  Aliases are introduced by:  pointers  call-by-reference.
Inter-procedural analysis
DFA foundations Simone Campanoni
Manuel Fahndrich Jakob Rehof Manuvir Das
Chapter 8: Recursion Data Structures in Java: From Abstract Data Types to the Java Collections Framework by Simon Gray.
Data Flow Analysis Suman Jana
Pointer Analysis Lecture 2
Interprocedural Analysis Chapter 19
University Of Virginia
Pointer Analysis Lecture 2
Pointer analysis.
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Pointer analysis John Rollinson & Kaiyuan Li
Presentation transcript:

Intraprocedural Points-to Analysis Flow functions:

Pointers to dynamically-allocated memory Handle statements of the form: x := new T One idea: generate a new variable each time the new statement is analyzed to stand for the new location:

Example l := new Cons p := l t := new Cons *p := t p := t

Example solved l := new Cons p := l t := new Cons *p := t p := t l p V1 l p tV2 l p V1 t V2 l t V1 p V2 l t V1 p V2 l t V1 p V2V3 l t V1 p V2V3 l t V1 p V2V3

What went wrong? Lattice was infinitely tall! Instead, we need to summarize the infinitely many allocated objects in a finite way. –introduce summary nodes, which will stand for a whole class of allocated objects. For example: For each new statement with label L, introduce a summary node loc L, which stands for the memory allocated by statement L. Summary nodes can use other criterion for merging.

Example revisited & solved S1: l := new Cons p := l S2: t := new Cons *p := t p := t l p S1 l p tS2 l p S1 t S2 l t S1 p S2 l t S1 p S2 Iter 1Iter 2Iter 3

Example revisited & solved S1: l := new Cons p := l S2: t := new Cons *p := t p := t l p S1 l p tS2 l p S1 t S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 Iter 1Iter 2Iter 3

Array aliasing, and pointers to arrays Array indexing can cause aliasing: –a[i] aliases b[j] if: a aliases b and i = j a and b overlap, and i = j + k, where k is the amount of overlap. Can have pointers to elements of an array –p := &a[i];...; p++; How can arrays be modeled? –Could treat the whole array as one location. –Could try to reason about the array index expressions: array dependence analysis.

Fields Can summarize fields using per field summary –for each field F, keep a points-to node called F that summarizes all possible values that can ever be stored in F Can also use allocation sites –for each field F, and each allocation site S, keep a points-to node called (F, S) that summarizes all possible values that can ever be stored in the field F of objects allocated at site S.

Summary We just saw: –intraprocedural points-to analysis –handling dynamically allocated memory –handling pointers to arrays But, intraprocedural pointer analysis is not enough. –Sharing data structures across multiple procedures is one the big benefits of pointers: instead of passing the whole data structures around, just pass pointers to them (eg C pass by reference). –So pointers end up pointing to structures shared across procedures. –If you don’t do an interproc analysis, you’ll have to make conservative assumptions functions entries and function calls.

Conservative approximation on entry Say we don’t have interprocedural pointer analysis. What should the information be at the input of the following procedure: global g; void p(x,y) {... } xyg

Conservative approximation on entry Here are a few solutions: xyg locations from alloc sites prior to this invocation global g; void p(x,y) {... } They are all very conservative! We can try to do better. x,y,g & locations from alloc sites prior to this invocation

Interprocedural pointer analysis Main difficulty in performing interprocedural pointer analysis is scaling One can use a top-down summary based approach (Wilson & Lam 95), but even these are hard to scale

Cost: –space: store one fact at each prog point –time: iteration S1: l := new Cons p := l S2: t := new Cons *p := t p := t l p S1 l p tS2 l p S1 t S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 Iter 1Iter 2Iter 3 Example revisited

New idea: store one dataflow fact Store one dataflow fact for the whole program Each statement updates this one dataflow fact –use the previous flow functions, but now they take the whole program dataflow fact, and return an updated version of it. Process each statement once, ignoring the order of the statements This is called a flow-insensitive analysis.

Flow insensitive pointer analysis S1: l := new Cons p := l S2: t := new Cons *p := t p := t

Flow insensitive pointer analysis S1: l := new Cons p := l S2: t := new Cons *p := t p := t l p S1 l p tS2 l p S1 t S2 l t S1 p S2

Flow sensitive vs. insensitive S1: l := new Cons p := l S2: t := new Cons *p := t p := t l t S1 p S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 Flow-sensitive SolnFlow-insensitive Soln l t S1 p S2

What went wrong? What happened to the link between p and S1? –Can’t do strong updates anymore! –Need to remove all the kill sets from the flow functions. What happened to the self loop on S2? –We still have to iterate!

Flow insensitive pointer analysis: fixed S1: l := new Cons p := l S2: t := new Cons *p := t p := t l p S1 l p tS2 l p S1 t S2 l t S1 p S2 l t S1 p S2 l t S1 p L2 l t S1 p S2 l t S1 p S2 l t S1 p S2 l t L1 p L2 l t S1 p S2 Iter 1Iter 2Iter 3 l t S1 p S2 Final result This is Andersen’s algorithm ’94

Flow insensitive loss of precision S1: l := new Cons p := l S2: t := new Cons *p := t p := t l t S1 p S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 Flow-sensitive SolnFlow-insensitive Soln l t S1 p S2

Flow insensitive loss of precision Flow insensitive analysis leads to loss of precision! main() { x := &y;... x := &z; } Flow insensitive analysis tells us that x may point to z here! However: –uses less memory (memory can be a big bottleneck to running on large programs) –runs faster

Worst case complexity of Andersen *x = y x abc y def x abc y def Worst case: N 2 per statement, so at least N 3 for the whole program. Andersen is in fact O(N 3 )

New idea: one successor per node Make each node have only one successor. This is an invariant that we want to maintain. x a,b,c y d,e,f *x = y x a,b,c y d,e,f

x *x = y y More general case for *x = y

x *x = y yxyxy More general case for *x = y

x x = *y y Handling: x = *y

x x = *y yxyxy Handling: x = *y

x x = y y x = &y xy Handling: x = y (what about y = x?) Handling: x = &y

x x = y yxyxy x = &y xyx y,… xy Handling: x = y (what about y = x?) Handling: x = &y get the same for y = x

Our favorite example, once more! S1: l := new Cons p := l S2: t := new Cons *p := t p := t

Our favorite example, once more! S1: l := new Cons p := l S2: t := new Cons *p := t p := t l S1 t S2 p l S1 l p l t S2 p l S1,S2 tp l S1 t S2 p 4 5

Flow insensitive loss of precision S1: l := new Cons p := l S2: t := new Cons *p := t p := t l t S1 p S2 l t S1 p S2 l t S1 p S2 l t S1 p S2 Flow-sensitive Subset-based Flow-insensitive Subset-based l t S1 p S2 l S1,S2 tp Flow-insensitive Unification- based

bar() { i := &a; j := &b; foo(&i); foo(&j); // i pnts to what? *i :=...; } void foo(int* p) { printf(“%d”,*p); } Another example

bar() { i := &a; j := &b; foo(&i); foo(&j); // i pnts to what? *i :=...; } void foo(int* p) { printf(“%d”,*p); } i a j b p i a i a j b i a j b p i,j a,b p Another example 4 3

Steensgaard & beyond A well engineered implementation of Steensgaard ran on Word97 (2.1 MLOC) in 1 minute. One Level Flow (Das PLDI 00) is an extension to Steensgaard that gets more precision and runs in 2 minutes on Word97.