G. Ramalingam Microsoft Research, India & K. V. Raghavan

Slides:



Advertisements
Similar presentations
Dataflow Analysis for Datarace-Free Programs (ESOP 11) Arnab De Joint work with Deepak DSouza and Rupesh Nasre Indian Institute of Science, Bangalore.
Advertisements

Automated Theorem Proving Lecture 1. Program verification is undecidable! Given program P and specification S, does P satisfy S?
Continuing Abstract Interpretation We have seen: 1.How to compile abstract syntax trees into control-flow graphs 2.Lattices, as structures that describe.
Data-Flow Analysis II CS 671 March 13, CS 671 – Spring Data-Flow Analysis Gather conservative, approximate information about what a program.
Context-Sensitive Interprocedural Points-to Analysis in the Presence of Function Pointers Presentation by Patrick Kaleem Justin.
Pointer Analysis – Part I Mayur Naik Intel Research, Berkeley CS294 Lecture March 17, 2009.
Analysis of programs with pointers. Simple example What are the dependences in this program? Problem: just looking at variable names will not give you.
3-Valued Logic Analyzer (TVP) Tal Lev-Ami and Mooly Sagiv.
1.6 Behavioral Equivalence. 2 Two very important concepts in the study and analysis of programs –Equivalence between programs –Congruence between statements.
Pointer Analysis Lecture 2 G. Ramalingam Microsoft Research, India.
Lecture 15 – Dataflow Analysis Eran Yahav 1
Lecture 02 – Structural Operational Semantics (SOS) Eran Yahav 1.
Common Sub-expression Elim Want to compute when an expression is available in a var Domain:
Representing programs Goals. Representing programs Primary goals –analysis is easy and effective just a few cases to handle directly link related things.
Program analysis Mooly Sagiv html://
Correctness. Until now We’ve seen how to define dataflow analyses How do we know our analyses are correct? We could reason about each individual analysis.
From last time: live variables Set D = 2 Vars Lattice: (D, v, ?, >, t, u ) = (2 Vars, µ, ;,Vars, [, Å ) x := y op z in out F x := y op z (out) = out –
1 Iterative Program Analysis Part I Mooly Sagiv Tel Aviv University Textbook: Principles of Program.
Program analysis Mooly Sagiv html://
1 Iterative Program Analysis Mooly Sagiv Tel Aviv University Textbook: Principles of Program Analysis.
Data Flow Analysis Compiler Design October 5, 2004 These slides live on the Web. I obtained them from Jeff Foster and he said that he obtained.
Abstract Interpretation Part I Mooly Sagiv Textbook: Chapter 4.
Data Flow Analysis Compiler Design Nov. 8, 2005.
Direction of analysis Although constraints are not directional, flow functions are All flow functions we have seen so far are in the forward direction.
Overview of program analysis Mooly Sagiv html://
Role Analysis Victor Kunkac, Patric Lam, Martin Rinard Laboratory for Computer Science, MIT Presentation by George Caragea CMSC631,
Recap from last time: live variables x := 5 y := x + 2 x := x + 1 y := x y...
Machine-Independent Optimizations Ⅰ CS308 Compiler Theory1.
Direction of analysis Although constraints are not directional, flow functions are All flow functions we have seen so far are in the forward direction.
Program Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
Overview of program analysis Mooly Sagiv html://
1 Program Analysis Mooly Sagiv Tel Aviv University Textbook: Principles of Program Analysis.
Pointer analysis. Pointer Analysis Outline: –What is pointer analysis –Intraprocedural pointer analysis –Interprocedural pointer analysis Andersen and.
Precision Going back to constant prop, in what cases would we lose precision?
Abstract Interpretation (Cousot, Cousot 1977) also known as Data-Flow Analysis.
Example x := read() v := a + b x := x + 1 w := x + 1 a := w v := a + b z := x + 1 t := a + b.
Compiler Construction Lecture 16 Data-Flow Analysis.
1 Iterative Program Analysis Abstract Interpretation Mooly Sagiv Tel Aviv University Textbook:
Pointer Analysis Lecture 2 G. Ramalingam Microsoft Research, India.
Convergence of Model Checking & Program Analysis Philippe Giabbanelli CMPT 894 – Spring 2008.
Program Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
1 Shape Analysis via 3-Valued Logic Mooly Sagiv Tel Aviv University Shape analysis with applications Chapter 4.6
Pointer Analysis Lecture 2 G. Ramalingam Microsoft Research, India & K. V. Raghavan.
Pointer Analysis for Multithreaded Programs Radu Rugina and Martin Rinard M I T Laboratory for Computer Science.
1 Iterative Program Analysis Mooly Sagiv Tel Aviv University Textbook: Principles of Program.
1 Iterative Program Analysis Abstract Interpretation Mooly Sagiv Tel Aviv University Textbook:
CS 598 Scripting Languages Design and Implementation 9. Constant propagation and Type Inference.
Data Flow Analysis II AModel Checking and Abstract Interpretation Feb. 2, 2011.
Operational Semantics Mooly Sagiv Reference: Semantics with Applications Chapter 2 H. Nielson and F. Nielson
Program Analysis and Verification Spring 2015 Program Analysis and Verification Lecture 8: Static Analysis II Roman Manevich Ben-Gurion University.
CS 412/413 Spring 2005Introduction to Compilers1 CS412/CS413 Introduction to Compilers Tim Teitelbaum Lecture 30: Loop Optimizations and Pointer Analysis.
Program Analysis and Verification Spring 2014 Program Analysis and Verification Lecture 15: Alias Analysis Roman Manevich Ben-Gurion University.
Spring 2017 Program Analysis and Verification
Data Flow Analysis Suman Jana
Textbook: Principles of Program Analysis
Simone Campanoni Dependences Simone Campanoni
Pointer Analysis Lecture 2
Graph-Based Operational Semantics
Declarative Computation Model Kernel language semantics (Non-)Suspendable statements (VRH ) Carlos Varela RPI October 11, 2007 Adapted with.
Iterative Program Analysis Abstract Interpretation
Topic 10: Dataflow Analysis
University Of Virginia
Another example: constant prop
Denotational Semantics (Denotational Semantics)
Pointer Analysis Lecture 2
Fall Compiler Principles Lecture 10: Global Optimizations
Data Flow Analysis Compiler Design
Pointer analysis.
Spring 2016 Program Analysis and Verification
Presentation transcript:

G. Ramalingam Microsoft Research, India & K. V. Raghavan Pointer Analysis G. Ramalingam Microsoft Research, India & K. V. Raghavan

Goals Points-to Analysis: Determine the set of possible values of a pointer-variable (at different points in a program) what locations can a pointer point-to? Alias Analysis: Determine if two pointer-variables may point to the same location Compute conservative approximation A fundamental analysis, required by most other static analyses

A Constant Propagation Example y = 4; z = x + 5; x is always 3 here can replace x by 3 and replace x+5 by 8 and so on

A Constant Propagation Example With Pointers z = x + 5; Is x always 3 here?

A Constant Propagation Example With Pointers p = &y; x = 3; *p = 4; z = x + 5; if (?) p = &x; else p = &y; x = 3; *p = 4; z = x + 5; p = &x; x = 3; *p = 4; z = x + 5; pointers affect most program analyses x is always 3 x is always 4 x may be 3 or 4 (i.e., x is unknown in our lattice)

A Constant Propagation Example With Pointers p = &y; x = 3; *p = 4; z = x + 5; if (?) p = &x; else p = &y; x = 3; *p = 4; z = x + 5; p = &x; x = 3; *p = 4; z = x + 5; p always points-to y p always points-to x p may point-to x or y

Points-to Analysis Determine the set of targets a pointer variable could point-to (at different points in the program) “p points-to x” “p stores the value &x” “*p denotes the location x” targets could be variables or locations in the heap (dynamic memory allocation) p = &x; p = new Foo(); or p = malloc (…);

Algorithm A (may points-to analysis) A Simple Example p = &x; q = &y; if (?) { q = p; } x = &a; y = &b; z = *q;

Algorithm A (may points-to analysis) A Simple Example q x y z null p q x y z null p q x y z null p q x y z p q x y z null p q x y z null {x,y} p q x y z null {x,y} a b {a,b} p q x y z null p q x y z null {x,y} a b p q x y z null {x,y} a p = &x; q = &y; if (?) { q = p; } x = &a; y = &b; z = *q;

Algorithm A (may points-to analysis) A Simple Example y = &b; if (?) { p = &x; } else { p = &y; } *x = &c; *p = &c; How should we handle this statement? (Try it!) Weak update Weak update Strong update x: a y: b p: {x,y} a: null x: a y: b p: {x,y} a: c x: {a,c} y: {b,c} p: {x,y} a: c

Questions When is it correct to use a strong update? A weak update? Is this points-to analysis precise? We must formally define what we want to compute before we can answer many such questions

Points-To Analysis: An Informal Definition Let u denote a program-point Define IdealMayPT (u) to be { (p,x) | p points-to x in some state at u in some run } Algorithm should compute a set MayPT(u) that over-approximates above set

Static Program Analysis A static program analysis computes approximate information about the runtime behavior of a given program The set of valid programs is defined by the programming language syntax The runtime behavior of a given program is defined by the programming language semantics The analysis problem defines what information is desired The analysis algorithm determines what approximation to make

Programming Language: Syntax A program consists of a set of variables Var a directed graph (V,E,entry) with a distinguished entry vertex, with every edge labelled by a primitive statement A primitive statement is of the form x = null x = y x = *y x = &y; *x = y skip (where x and y are variables in Var) Omitted (for now) Dynamic memory allocation Pointer arithmetic Structures and fields Procedures

Example Program x = &a; y = &b; if (?) { p = &x; } else { p = &y; } 1 2 3 4 5 x = &a y = &b 6 7 8 *x = &c *p = &c p = &y p = &x skip x = &a; y = &b; if (?) { p = &x; } else { p = &y; } *x = &c; *p = &c; Vars = {x,y,p,a,b,c}

Programming Language: Operational Semantics Operational semantics == an interpreter (defined mathematically) State DataState ::= Var -> (Var U {null}) PC ::= V (the vertex set of the CFG) ProgramState ::= PC x DataState Initial-state: (entry, \x. null)

Example States Vars = {x,y,p,a,b,c} x: N, y:N, p:N, a:N, b:N, c:N Initial data-state 1 2 3 4 5 x = &a y = &b 6 7 8 *x = &c *p = &c p = &y p = &x skip Initial program-state x: N, y:N, p:N, a:N, b:N, c:N <1, > Next program-state x: a, y:N, p:N, a:N, b:N, c:N <2, >

Programming Language: Operational Semantics Meaning of primitive statements CS[stmt] : DataState -> 2DataState CS[ x = null ] s = {s[x  null]} CS[ x = &y ] s = {s[x  y]} CS[ x = y ] s = {s[x  s(y)]} CS[ x = *y ] s = … … CS[*x = y ] s = … = …

Programming Language: Operational Semantics Meaning of primitive statements CS[stmt] : DataState -> 2DataState CS[ x = null ] s = {s[x  null]} CS[ x = &y ] s = {s[x  y]} CS[ x = y ] s = {s[x  s(y)]} CS[ x = *y ] s = {s[x  s(s(y))]}, if s(y) is not null = {}, otherwise CS[*x = y ] s = … …

Programming Language: Operational Semantics Meaning of primitive statements CS[stmt] : DataState -> 2DataState CS[ x = null ] s = {s[x  null]} CS[ x = &y ] s = {s[x  y]} CS[ x = y ] s = {s[x  s(y)]} CS[ x = *y ] s = {s[x  s(s(y))]}, if s(y) is not null = {}, otherwise CS[*x = y ] s = {s[s(x)  s(y)]}, if s(x) is not null

Programming Language: Operational Semantics Meaning of program a transition relation  on program-states   ProgramState X ProgramState state1  state2 means that the execution of some edge in the program can transform state1 into state2 Defining  (u,s)  (v,s’) iff the program contains a control-flow edge u->v labelled with a statement stmt such that CS[stmt]s = s’

Programming Language: Operational Semantics A sequence of states s1s2 … sn is said to be an execution (of the program) iff s1 is the Initial-State si  si+1 for 1 <= I < n A state s is said to be a reachable state iff there exists some execution s1s2 … sn is such that sn = s. Define RS(u) = { s | (u,s) is reachable }

Programming Language: Operational Semantics A sequence of states s1s2 … sn is said to be an execution (of the program) iff s1 is the Initial-State si  si+1 for 1 <= I < n A state s is said to be a reachable state iff there exists some execution s1s2 … sn is such that sn = s. Define RS(u) = { s | (u,s) is reachable } This is the collecting semantics at point u

Ideal Points-To Analysis: Formal Definition Let u denote a vertex in the CFG Define IdealMayPT (u) to be \p. { x | exists s in RS(u). s(p) == x }

May-Point-To Analysis: Problem statement Compute MayPT: V -> 2Var’ such that for every vertex u MayPT(u)  IdealMayPT(u) (where Var’ = Var U {null})

May-Point-To Algorithms Compute MayPT: V -> 2Vars’ such that MayPT(u)  IdealMayPT(u) An algorithm is said to be correct if the solution MayPT it computes satisfies "uÎV. MayPT(u)  IdealMayPT(u) An algorithm is said to be precise if the solution MayPT it computes satisfies "uÎV. MayPT(u) = IdealMayPT(u) An algorithm that computes a solution MayPT1 is said to be more precise than one that computes a solution MayPT2 if "uÎV. MayPT1(u) Í MayPT2(u)

Algorithm A: A Formal Definition The “Data Flow Analysis” Recipe Define semi-lattice of abstract-values AbsDataState ::= (Var -> (2Var’ – {})) ∪{bot} f1  f2 = \x. (f1 (x) È f2 (x)) Define initial abstract-value InitialAbsState = \x. {null} Define transformers for primitive statements AS[stmt] : AbsDataState -> AbsDataState

Algorithm A: A Formal Definition The “Data Flow Analysis” Recipe Let st(v,u) denote stmt on edge v->u Compute the least-fixed-point of the following “dataflow equations” x(entry) = InitialAbsState x(u) = v->u AS(st(v,u)) x(v) v w u st(w,u) st(v,u) x(v) x(w) x(u)

Algorithm A: The Transformers Abstract transformers for primitive statements AS[stmt] : AbsDataState -> AbsDataState AS[ x = y ] s = s[x  s(y)] AS[ x = null ] s = s[x  {null}] AS[ x = &y ] s = s[x  {y}] AS[ x = *y ] s = s[x  s*(s(y)-{null})], if s(y) is not = {null} = bot, otherwise where s*({v1,…,vn}) = s(v1) È … È s(vn),

Algorithm A AS[ *x = y ] s = After fix-point solution is obtained, AbsDataState(u) is emitted as MayPT(u), for each program point u bot if s(x) = {null} s[z  s(y)] if s(x)-{null} = {z} s[z1  s(z1) ∪ s(y)] if s(x)-{null} = {z1, …, zk} [z2  s(z2) ∪ s(y)] (where k > 1) … [zk  s(zk) ∪ s(y)]

An alternative algorithm: must points-to analysis AbsDataState is modified, as follows: Each var is mapped to {} or to a singleton set join is point-wise intersection Let MustPT(u) be fix-point at u Guarantee: Υ(MustPT(u)) ⊇ MayPT(u) ⊇ IdealMayPT(u) where Υ(S) = S, if S is a singleton set = Var’, if S = {}

Must points-to analysis algorithm AS transfer functions same as in Algorithm A for x = y, x = null, and x = &y AS[ x = *y ] s = bot, if s(y) = {null} = s[x  {}], if s(y) = {} = s[x  s(z)], if s(y) = {z}

Must points-to analysis algorithm AS[ *x = y ] s = bot, if s(x) = {null} = s[z  s(y)] if s(x) = {z} = \v. {}, otherwise This analysis is less precise than the may-points-to analysis (Algorithm A), but is more efficient