Pointer Analysis Lecture 2 G. Ramalingam Microsoft Research, India.

Andersen’s Analysis
A flow-insensitive analysis
– computes a single points-to solution valid at all program points
– ignores control flow: treats the program as a set of statements
– equivalent to merging all vertices into one (and applying Algorithm A)
– equivalent to adding an edge between every pair of vertices (and applying Algorithm A)
– computes a solution R such that R ⊇ IdealMayPT(u) for every vertex u

Example (Flow-Sensitive Analysis)
x = &a; y = x; x = &b; z = x;
[Figure: control-flow graph with edges x = &a, y = x, x = &b, z = x]
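The flow-sensitive example can be traced with a minimal Python sketch. The tuple encoding of statements (`"addr"` for `x = &a`, `"copy"` for `x = y`) and the function name are assumptions made for illustration. The sketch handles straight-line code only, so no joins are needed.

```python
# Flow-sensitive may-points-to analysis of straight-line code (a sketch).
# Hypothetical encoding: ("addr", x, a) means x = &a; ("copy", x, y) means x = y.

def analyze_flow_sensitive(stmts):
    state = {}      # variable -> set of pointees at the current program point
    states = []     # one snapshot after each statement
    for kind, lhs, rhs in stmts:
        state = dict(state)                  # each point keeps its own solution
        if kind == "addr":
            state[lhs] = {rhs}               # strong update: old targets are killed
        elif kind == "copy":
            state[lhs] = set(state.get(rhs, set()))
        else:
            raise ValueError(f"unsupported statement kind: {kind}")
        states.append(state)
    return states

program = [("addr", "x", "a"), ("copy", "y", "x"),
           ("addr", "x", "b"), ("copy", "z", "x")]
points = analyze_flow_sensitive(program)
for stmt, st in zip(program, points):
    print(stmt, {v: sorted(t) for v, t in sorted(st.items())})
```

Because each program point has its own solution, `y` keeps only `a` and `z` gets only `b` at the end of the sequence.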

Example: Andersen’s Analysis
x = &a; y = x; x = &b; z = x;
[Figure: control-flow graph with edges x = &a, y = x, x = &b, z = x]

Andersen’s Analysis
– Strong updates?
– Initial state?

Why Flow-Insensitive Analysis?
Reduced space requirements
– a single points-to solution
Reduced time complexity
– no copying: individual updates are more efficient
– no need for joins
– number of iterations?
– a cubic-time algorithm
Scales to millions of lines of code
– most popular points-to analysis

Andersen’s Analysis: A Set-Constraints Formulation
Compute PT_x for every variable x
Statement    Constraint
x = null
x = &y
x = y
x = *y
*x = y
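A set-constraints formulation can be solved by iterating to a fixed point. The sketch below uses the usual textbook inclusion constraints for the four pointer statements, which is an assumption here and not necessarily the exact table from the slide. `x = null` is left out because it adds no pointee in this encoding. The statement tuples and the name `andersen` are hypothetical.

```python
from collections import defaultdict

def andersen(stmts):
    """Naive fixed-point solver for inclusion constraints over a set of statements."""
    pt = defaultdict(set)                 # PT(x) for every variable x
    changed = True
    while changed:
        changed = False
        for kind, lhs, rhs in stmts:
            if kind == "addr":            # x = &y : PT(x) includes {y}
                updates = [(lhs, {rhs})]
            elif kind == "copy":          # x = y  : PT(x) includes PT(y)
                updates = [(lhs, set(pt[rhs]))]
            elif kind == "load":          # x = *y : PT(x) includes PT(z), z in PT(y)
                updates = [(lhs, set(pt[z])) for z in list(pt[rhs])]
            elif kind == "store":         # *x = y : PT(z) includes PT(y), z in PT(x)
                updates = [(z, set(pt[rhs])) for z in list(pt[lhs])]
            else:
                raise ValueError(f"unsupported statement kind: {kind}")
            for var, new in updates:
                if not new <= pt[var]:
                    pt[var] |= new
                    changed = True
    return {v: s for v, s in pt.items() if s}

# x = &a; y = x; x = &b; z = x;  (statement order is irrelevant)
first = andersen([("addr", "x", "a"), ("copy", "y", "x"),
                  ("addr", "x", "b"), ("copy", "z", "x")])
# x = &a; y = x; y = &b; b = &c;
second = andersen([("addr", "x", "a"), ("copy", "y", "x"),
                   ("addr", "y", "b"), ("addr", "b", "c")])
print(first)
print(second)
```

This naive loop re-scans every statement until nothing changes. The cubic-time bound refers to worklist-based solvers over a constraint graph, which this sketch does not implement.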

Steensgaard’s Analysis
Unification-based analysis
Inspired by type inference
– an assignment “lhs := rhs” is interpreted as a constraint that lhs and rhs have the same type
– the type of a pointer variable is the set of variables it can point to
“Assignment-direction-insensitive”
– treats “lhs := rhs” as if it were both “lhs := rhs” and “rhs := lhs”
An almost-linear-time algorithm
– single-pass algorithm; no iteration required

Example: Andersen’s Analysis
x = &a; y = x; y = &b; b = &c;
[Figure: control-flow graph with edges x = &a, y = x, y = &b, b = &c]

Example: Steensgaard’s Analysis
x = &a; y = x; y = &b; b = &c;
[Figure: control-flow graph with edges x = &a, y = x, y = &b, b = &c]

Steensgaard’s Analysis
– Can be implemented using a union-find data structure
– Leads to an almost-linear-time algorithm
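The union-find idea can be sketched as follows. This is a simplified illustration, not Steensgaard's published algorithm: the class and method names are hypothetical, placeholder nodes such as `$t0` are invented for pointees that have not been named yet, and union by rank is omitted (it would be needed for the almost-linear bound).

```python
import itertools

class Steensgaard:
    def __init__(self):
        self.parent = {}                  # union-find parent links
        self.target = {}                  # class representative -> node it points to
        self._fresh = itertools.count()

    def find(self, n):
        self.parent.setdefault(n, n)
        while self.parent[n] != n:
            self.parent[n] = self.parent[self.parent[n]]   # path halving
            n = self.parent[n]
        return n

    def pointee(self, n):
        r = self.find(n)
        if r not in self.target:
            self.target[r] = f"$t{next(self._fresh)}"      # placeholder location
        return self.find(self.target[r])

    def join(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        self.parent[b] = a
        tb = self.target.pop(b, None)
        if tb is not None:
            if a in self.target:
                self.join(self.target[a], tb)   # unify the pointees as well
            else:
                self.target[a] = tb

    def process(self, kind, lhs, rhs):
        if kind == "addr":       # x = &y
            self.join(self.pointee(lhs), rhs)
        elif kind == "copy":     # x = y
            self.join(self.pointee(lhs), self.pointee(rhs))
        elif kind == "load":     # x = *y
            self.join(self.pointee(lhs), self.pointee(self.pointee(rhs)))
        elif kind == "store":    # *x = y
            self.join(self.pointee(self.pointee(lhs)), self.pointee(rhs))
        else:
            raise ValueError(f"unsupported statement kind: {kind}")

    def points_to(self, var, names):
        r = self.find(var)
        if r not in self.target:
            return set()
        t = self.find(self.target[r])
        return {n for n in names if self.find(n) == t}

solver = Steensgaard()
for stmt in [("addr", "x", "a"), ("copy", "y", "x"),
             ("addr", "y", "b"), ("addr", "b", "c")]:
    solver.process(*stmt)                 # single pass, no iteration
names = ["a", "b", "c", "x", "y"]
result = {v: solver.points_to(v, names) for v in names}
print(result)
```

On this program the unification merges `a` and `b` into one class, so `x` is reported as possibly pointing to `b` and `a` as possibly pointing to `c`. An inclusion-based analysis would report neither.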

Exercise x = &a; y = x; y = &b; b = &c; *x = &d;

May-Point-To Analyses
Ideal-May-Point-To
  ???
Algorithm A
  more efficient / less precise
Andersen’s
  more efficient / less precise
Steensgaard’s

Ideal Points-To Analysis: Definition Recap
A sequence of states s_1 s_2 … s_n is said to be an execution (of the program) iff
– s_1 is the Initial-State
– s_i → s_{i+1} for 1 <= i < n
A state s is said to be a reachable state iff there exists some execution s_1 s_2 … s_n such that s_n = s.
RS(u) = { s | (u,s) is reachable }
IdealMayPT(u) = { (p,x) | ∃ s ∈ RS(u). s(p) == x }
IdealMustPT(u) = { (p,x) | ∀ s ∈ RS(u). s(p) == x }

Does Algorithm A Compute The Most Precise Solution?

Ideal vs. Algorithm A
Abstract away correlations between variables
– relational analysis vs.
– independent attribute
Concrete states: (x: &b, y: &x), (x: &y, y: &z)
α: x: {&y,&b}, y: {&x,&z}
γ: (x: &y, y: &x), (x: &b, y: &z), (x: &y, y: &z), (x: &b, y: &x)
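The loss of correlation can be reproduced with a small Python sketch of the independent-attribute abstraction. The function names `alpha` and `gamma` follow the usual abstract-interpretation convention and are assumptions for illustration. Pointees are written without the `&` prefix.

```python
from itertools import product

def alpha(states):
    """Abstraction: forget which values occur together; keep one set per variable."""
    abstract = {}
    for s in states:
        for var, target in s.items():
            abstract.setdefault(var, set()).add(target)
    return abstract

def gamma(abstract):
    """Concretization: every combination of per-variable choices."""
    names = sorted(abstract)
    return [dict(zip(names, combo))
            for combo in product(*(sorted(abstract[v]) for v in names))]

concrete = [{"x": "b", "y": "x"}, {"x": "y", "y": "z"}]
a = alpha(concrete)
back = gamma(a)
print(a)
print(back)
```

Two concrete states go in, four come out. The extra states, such as `x: &y, y: &x`, are exactly the combinations a relational analysis would have ruled out.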

Does Algorithm A Compute The Most Precise Solution?

Is The Precise Solution Computable? Claim: The set RS(u) of reachable concrete states (for our language) is computable. Note: This is true for any collecting semantics with a finite state space.
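Since the state space is finite, RS(u) can be enumerated with a plain worklist search. The sketch below assumes a hypothetical encoding of the program as control-flow edges `(source, statement, destination)` and assumes that every variable starts out null (`None`). Executions that dereference null are simply dropped.

```python
def step(stmt, state):
    """Concrete transformer on one state (dict: variable -> pointee or None)."""
    kind, lhs, rhs = stmt
    s = dict(state)
    if kind == "addr":                     # x = &y
        s[lhs] = rhs
    elif kind == "copy":                   # x = y
        s[lhs] = s[rhs]
    elif kind == "load":                   # x = *y
        if s[rhs] is None:
            return None                    # null dereference: no successor state
        s[lhs] = s[s[rhs]]
    elif kind == "store":                  # *x = y
        if s[lhs] is None:
            return None
        s[s[lhs]] = s[rhs]
    else:
        raise ValueError(f"unsupported statement kind: {kind}")
    return s

def reachable_states(edges, entry, variables):
    init = tuple((v, None) for v in sorted(variables))   # assumed initial state
    seen = {(entry, init)}
    work = [(entry, init)]
    while work:
        u, frozen = work.pop()
        for src, stmt, dst in edges:
            if src != u:
                continue
            nxt = step(stmt, dict(frozen))
            if nxt is None:
                continue
            item = (dst, tuple(sorted(nxt.items())))
            if item not in seen:
                seen.add(item)
                work.append(item)
    return seen

def rs(reach, u):
    return [dict(st) for v, st in reach if v == u]

def ideal_may_pt(reach, u):
    return {(p, t) for v, st in reach if v == u for p, t in st if t is not None}

# Two branches from vertex 1 to vertex 2, then y = x from vertex 2 to vertex 3.
edges = [(1, ("addr", "x", "a"), 2), (1, ("addr", "x", "b"), 2),
         (2, ("copy", "y", "x"), 3)]
reach = reachable_states(edges, 1, {"x", "y", "a", "b"})
print(rs(reach, 3))
print(sorted(ideal_may_pt(reach, 3)))
```

The set of reachable states at vertex 3 keeps `x` and `y` correlated, which the may-point-to pairs alone cannot express.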

Precise Points-To Analysis: Decidability
Corollary: Precise may-point-to analysis is computable.
Corollary: Precise (demand) may-alias analysis is computable.
– Given ptr-exp1, ptr-exp2, and a program point u, determine whether there exists some reachable state at u where ptr-exp1 and ptr-exp2 are aliases.
Ditto for must-point-to and must-alias
… for our restricted language!

Precise Points-To Analysis: Computational Complexity
What is the complexity of the least-fixed-point computation using the collecting semantics?
The worst-case complexity of computing reachable states is exponential in the number of variables.
– Can we do better?
Theorem: Computing precise may-point-to is PSPACE-hard even if we have only two-level pointers.

May-Point-To Analyses
Ideal-May-Point-To
Algorithm A
Andersen’s
Steensgaard’s
(each step down: more efficient / less precise)

Precise Points-To Analysis: Caveats
Theorem: Precise may-alias analysis is undecidable in the presence of dynamic memory allocation.
– Add “x = new/malloc ()” to the language
– State space becomes infinite
Digression: Integer variables plus conditional branching also make any precise analysis undecidable.

May-Point-To Analyses
Ideal (no Int, no Malloc)
Algorithm A
Andersen’s
Steensgaard’s
Ideal (with Int, with Malloc)
Ideal (with Int)
Ideal (with Malloc)

Dynamic Memory Allocation
s: x = new () / malloc ()
Assume, for now, that an allocated object stores one pointer
– s: x = malloc ( sizeof(void*) )
Introduce a pseudo-variable V_s to represent objects allocated at statement s, and use the previous algorithm
– treat s as if it were “x = &V_s”
– also track possible values of V_s
– allocation-site-based approach
Key aspect: V_s represents a set of objects (locations), not a single object
– referred to as a summary object (node)

Dynamic Memory Allocation: Example
x = new; y = x; *y = &b; *y = &a;
[Figure: control-flow graph with edges x = new, y = x, *y = &b, *y = &a]
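The allocation-site approach can be sketched by rewriting each `x = new` into `x = &V_s` and solving flow-insensitively. In this hypothetical encoding the site name is derived from the statement's position (`V1` for the first statement), and `"store_addr"` stands for `*x = &y`. Both conventions are assumptions for illustration.

```python
from collections import defaultdict

def solve_with_allocation_sites(stmts):
    pt = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for site, (kind, lhs, rhs) in enumerate(stmts, start=1):
            if kind == "new":             # x = new, treated as x = &V_site
                updates = [(lhs, {f"V{site}"})]
            elif kind == "addr":          # x = &y
                updates = [(lhs, {rhs})]
            elif kind == "copy":          # x = y
                updates = [(lhs, set(pt[rhs]))]
            elif kind == "store_addr":    # *x = &y, always accumulates (weak update)
                updates = [(z, {rhs}) for z in list(pt[lhs])]
            else:
                raise ValueError(f"unsupported statement kind: {kind}")
            for var, new in updates:
                if not new <= pt[var]:
                    pt[var] |= new
                    changed = True
    return {v: s for v, s in pt.items() if s}

# x = new; y = x; *y = &b; *y = &a;
heap = solve_with_allocation_sites([("new", "x", None), ("copy", "y", "x"),
                                    ("store_addr", "y", "b"),
                                    ("store_addr", "y", "a")])
print(heap)
```

The summary object `V1` ends up pointing to both `a` and `b`, since it stands for every object allocated at that site and its contents are only ever accumulated.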

Dynamic Memory Allocation: Object Fields
Field-sensitive analysis
class Foo { A* f; B* g; }
s: x = new Foo()
x->f = &b;
x->g = &a;

Dynamic Memory Allocation: Object Fields
Field-insensitive analysis
class Foo { A* f; B* g; }
s: x = new Foo()
x->f = &b;
x->g = &a;
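The difference between the two treatments of fields can be seen in a short sketch. Abstract heap locations are modelled as `(object, field)` pairs, and the field-insensitive variant collapses every field into a single slot named `"*"`. This encoding, like the function name, is an assumption for illustration.

```python
from collections import defaultdict

def analyze_fields(stmts, field_sensitive):
    pt = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for kind, *args in stmts:
            if kind == "new":                  # site: x = new Foo()
                site, x = args
                updates = [(x, {f"V_{site}"})]
            elif kind == "store_field":        # x->field = &y
                x, field, y = args
                slot = field if field_sensitive else "*"
                updates = [((obj, slot), {y}) for obj in list(pt[x])]
            else:
                raise ValueError(f"unsupported statement kind: {kind}")
            for loc, new in updates:
                if not new <= pt[loc]:
                    pt[loc] |= new
                    changed = True
    return {loc: s for loc, s in pt.items() if s}

# s: x = new Foo(); x->f = &b; x->g = &a;
program = [("new", "s", "x"),
           ("store_field", "x", "f", "b"),
           ("store_field", "x", "g", "a")]
sensitive = analyze_fields(program, field_sensitive=True)
insensitive = analyze_fields(program, field_sensitive=False)
print(sensitive)
print(insensitive)
```

With field sensitivity, `V_s.f` and `V_s.g` keep separate targets. Without it, a read of either field may yield `a` or `b`.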

Other Aspects
– Context-sensitivity
– Indirect (virtual) function calls and call-graph construction
– Pointer arithmetic
– Object-sensitivity

Andersen’s Analysis: Further Optimizations and Extensions
– Fahndrich et al., Partial online cycle elimination in inclusion constraint graphs, PLDI.
– Rountev and Chandra, Offline variable substitution for scaling points-to analysis.
– Heintze and Tardieu, Ultra-fast aliasing analysis using CLA: a million lines of C code in a second, PLDI.
– M. Hind, Pointer analysis: Haven’t we solved this problem yet?, PASTE.
– Hardekopf and Lin, The ant and the grasshopper: fast and accurate pointer analysis for millions of lines of code, PLDI.
– Hardekopf and Lin, Exploiting pointer and location equivalence to optimize pointer analysis, SAS.
– Hardekopf and Lin, Semi-sparse flow-sensitive pointer analysis, POPL 2009.

Context-Sensitivity Etc.
– Liang & Harrold, Efficient computation of parameterized pointer information for interprocedural analyses, SAS.
– Lattner et al., Making context-sensitive points-to analysis with heap cloning practical for the real world, PLDI.
– Zhu & Calman, Symbolic pointer analysis revisited, PLDI.
– Whaley & Lam, Cloning-based context-sensitive pointer alias analysis using BDD, PLDI.
– Rountev et al., Points-to analysis for Java using annotated constraints, OOPSLA.
– Milanova et al., Parameterized object sensitivity for points-to and side-effect analyses for Java, ISSTA 2002.

Applications
Compiler optimizations
Verification & Bug Finding
– use in preliminary phases
– use in verification itself

Dynamic Memory Allocation: Summary Object Update
*y = &a

Abstract Transformers: Weak/Strong Update
AS[stmt] : AbsDataState -> AbsDataState
AS[ *x = y ] s =

Correctness & Precision How can we formally reason about the correctness & precision of abstract transformers? Can we systematically derive a correct abstract transformer?

Enter: The French Recipe (Abstract Interpretation)
Concrete domain: 2^Data-State
Abstract domain: 2^(Var x Var’)
The two domains are related by abstraction α and concretization γ
Concrete states: C
Semantics: for every statement st, CS[st] : C -> C

Points-To Analysis (Abstract Interpretation)
α(Y) = { (p,x) | exists s in Y. s(p) == x }
RS(u) is an element of 2^Data-State
IdealMayPT(u) and MayPT(u) are elements of 2^(Var x Var’)
IdealMayPT(u) = α( RS(u) )
IdealMayPT(u) ⊆ MayPT(u)

Approximating Transformers: Correctness Criterion
Concrete domain C: c1 -f-> c2
Abstract domain A: a1 -f#-> a2
c1 is correctly approximated by a1; c2 is correctly approximated by a2
c is said to be correctly approximated by a iff α(c) ≤ a

Approximating Transformers: Correctness Criterion
Concrete domain C: c1 -f-> c2
Abstract domain A: a1 -f#-> a2
concretization γ (from A to C), abstraction α (from C to A)
requirement: f#(a1) ≥ α(f(γ(a1)))
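The requirement can be checked mechanically on small inputs. The sketch below uses an independent-attribute abstraction for `alpha` and `gamma` and compares two candidate abstract transformers for the statement `*x = &y`. All names are hypothetical, and the comparison `≥` is taken to be pointwise set inclusion, which is an assumption about the abstract ordering.

```python
from itertools import product

def gamma(a):
    names = sorted(a)
    return [dict(zip(names, c)) for c in product(*(sorted(a[v]) for v in names))]

def alpha(states):
    out = {}
    for s in states:
        for v, t in s.items():
            out.setdefault(v, set()).add(t)
    return out

def f(s, x, y):
    """Concrete transformer for *x = &y on a single state."""
    s = dict(s)
    s[s[x]] = y
    return s

def f_sharp_weak(a, x, y):
    """Candidate abstract transformer: add y to every possible target of x."""
    new = {v: set(t) for v, t in a.items()}
    for p in a[x]:
        new[p] = new.get(p, set()) | {y}
    return new

def f_sharp_overwrite(a, x, y):
    """Unsound candidate: overwrite every possible target of x."""
    new = {v: set(t) for v, t in a.items()}
    for p in a[x]:
        new[p] = {y}
    return new

def approximates(abstract, best):
    """True iff abstract >= best, i.e. best is pointwise included in abstract."""
    return all(best[v] <= abstract.get(v, set()) for v in best)

a1 = {"x": {"y"}, "y": {"x", "z"}, "z": {"a"}}
best = alpha([f(s, "y", "b") for s in gamma(a1)])     # alpha(f(gamma(a1)))
weak_ok = approximates(f_sharp_weak(a1, "y", "b"), best)
overwrite_ok = approximates(f_sharp_overwrite(a1, "y", "b"), best)
print(best, weak_ok, overwrite_ok)
```

The accumulating transformer satisfies the requirement on this input, while the overwriting one loses the fact that `x` may still point to `y`.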

Concrete Transformers
CS[stmt] : Data-State -> Data-State
CS[ x = y ] s = s[x ↦ s(y)]
CS[ x = *y ] s = s[x ↦ s(s(y))]
CS[ *x = y ] s = s[s(x) ↦ s(y)]
CS[ x = null ] s = s[x ↦ null]
CS*[stmt] : 2^Data-State -> 2^Data-State
CS*[st] X = { CS[st] s | s ∈ X }

Abstract Transformers
AS[stmt] : AbsDataState -> AbsDataState
AS[ x = y ] s = s[x ↦ s(y)]
AS[ x = null ] s = s[x ↦ {null}]
AS[ x = *y ] s = s[x ↦ s*(s(y))]
  where s*({v_1,…,v_n}) = s(v_1) ∪ … ∪ s(v_n)
AS[ *x = y ] s = ???

Algorithm A: Transformers, Weak/Strong Update
Statement: *y = &b;
a1: x: {&y}, y: {&x,&z}, z: {&a}
γ(a1): (x: &y, y: &x, z: &a), (x: &y, y: &z, z: &a)
f(γ(a1)): (x: &b, y: &x, z: &a), (x: &y, y: &z, z: &b)
f#(a1) = α(f(γ(a1))): x: {&y,&b}, y: {&x,&z}, z: {&a,&b}

Algorithm A: Transformers, Weak/Strong Update
Statement: *x = &b;
a1: x: {&y}, y: {&x,&z}, z: {&a}
γ(a1): (x: &y, y: &x, z: &a), (x: &y, y: &z, z: &a)
f(γ(a1)): (x: &y, y: &b, z: &a), (x: &y, y: &b, z: &a)
f#(a1) = α(f(γ(a1))): x: {&y}, y: {&b}, z: {&a}
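The two cases suggest one way to fill in the abstract transformer for an indirect store: overwrite when the dereferenced pointer has exactly one possible target, accumulate otherwise. The sketch below is an illustration under that reading, not the definition from the slides. It also assumes the single target is one concrete location, not a summary object or null. The function name is hypothetical.

```python
def abstract_store(state, x, targets):
    """Abstract transformer for *x = <value>, where `targets` is the abstract value stored."""
    new = {v: set(t) for v, t in state.items()}
    pointees = state[x]
    if len(pointees) == 1:
        (p,) = pointees
        new[p] = set(targets)                          # strong update: overwrite
    else:
        for p in pointees:
            new[p] = new.get(p, set()) | set(targets)  # weak update: accumulate
    return new

a1 = {"x": {"y"}, "y": {"x", "z"}, "z": {"a"}}
weak = abstract_store(a1, "y", {"b"})     # *y = &b, y has two possible targets
strong = abstract_store(a1, "x", {"b"})   # *x = &b, x has exactly one target
print(weak)
print(strong)
```

Both results agree with the abstract states shown for `*y = &b` and `*x = &b`.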
