Interprocedural Analysis Noam Rinetzky Mooly Sagiv Tel Aviv University 640-6706 Textbook Chapter 2.5.

Outline
- Challenges in interprocedural analysis
- The trivial solution
- Why it isn't adequate
- Simplifying assumptions
- A naive solution
- Join over valid paths
- The call-string approach
- The functional approach
  - A case study: linear constant propagation
  - Context-free reachability
- Modularity issues
- Other solutions

Challenges in Interprocedural Analysis
- Respect the call-return mechanism
- Handling recursion
- Local variables
- Parameter-passing mechanisms: value, value-result, reference, by name
- Procedure nesting
- The called procedure is not always known
- The source code of the called procedure is not always available
  - separate compilation
  - vendor code
  - ...

Extended Syntax of While
    a ::= x | n | a1 op_a a2
    b ::= true | false | not b | b1 op_b b2 | a1 op_r a2
    S ::= [x := a]^l | [call p(a, z)]^lc_lr | [skip]^l | S1 ; S2
        | if [b]^l then S1 else S2 | while [b]^l do S
    P ::= begin D S end
    D ::= proc id(val id*, res id*) is^ln S end^lx | D D

A Trivial Treatment of Procedures
- Analyze a single procedure at a time
- After every call, continue with conservative information
  - Global variables and local variables which "may be modified by the call" are mapped to ⊤
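As a minimal sketch (the helper name and the `may_modify` set are illustrative, not from the slides), the trivial treatment is just a transformer that maps every variable the callee may write to ⊤:

```python
# Hypothetical sketch of the trivial treatment of calls: after a call site,
# every variable the callee "may modify" is mapped to TOP (unknown value).
TOP = "T"  # lattice top: "any value"

def apply_trivial_call(env, may_modify):
    """Abstract post-state of a call, assuming nothing about the callee
    beyond a conservative set of variables it may modify."""
    return {v: (TOP if v in may_modify else val) for v, val in env.items()}

# Mirrors the example below: x is known before the call to p,
# but the call may write x, so it becomes TOP afterwards.
after = apply_trivial_call({"a": TOP, "x": 0}, {"x"})
```

A side-effect analysis that shrinks `may_modify` directly improves this transformer, which is why the slides list it as an advantage.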

A Trivial Treatment of Procedures

begin
  proc p() is^1
    [x := 1]^2
  end^3
  [call p()]^4_5
  [print x]^6
end

Before the call x ↦ 0; inside p the analysis computes [a ↦ ⊤, x ↦ 1]; but after the call site x "may be modified by the call" and is mapped to ⊤, so the print sees [a ↦ ⊤, x ↦ ⊤].

Advantages of the Trivial Solution
- Can be easily implemented
- Procedures can be written in different languages
- Procedure inlining can help
- Side-effect analysis can help

Disadvantages of the Trivial Solution
- Modular (object-oriented and functional) programming encourages small, frequently called procedures
- Optimization
  - Modern machines allow the compiler to schedule many instructions in parallel
  - Need to optimize many instructions
  - Inlining can be a bad solution
- Software engineering
  - Many bugs result from interface misuse
  - Procedures define partial functions

Simplifying Assumptions
- All the code is available
- Simple parameter passing
- The called procedure is syntactically known
- No nesting
- Procedure names are syntactically different from variables
- Procedures are uniquely defined
- Recursion is supported

Constant Example

begin
  proc p() is^1
    if [b]^2 then (
      [a := a - 1]^3
      [call p()]^4_5
      [a := a + 1]^6
    )
    [x := -2*a + 5]^7
  end^8
  [a = 7]^9 ;
  [call p()]^10_11 ;
  [print(x)]^12
end

A Naive Interprocedural Solution
- Treat procedure calls as gotos
- Obtain a conservative solution
- Find the least fixed point of the system:
    DF_entry(s) = ι
    DF_entry(v) = ⊔ { f(e)(DF_entry(u)) : e = (u, v) ∈ E }
- Use chaotic iterations
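The naive scheme can be sketched as a small worklist (chaotic-iteration) solver in which call and return edges are treated like ordinary goto edges; all names below are illustrative, and the lattice is a toy one-variable constant lattice (None = ⊥, an int, or "T" = ⊤):

```python
# A minimal chaotic-iteration solver; calls/returns are just ordinary edges.
def chaotic_iteration(nodes, edges, entry, init, bottom, join, transfer):
    """edges: list of (u, v, f); transfer(f, value) applies the edge effect."""
    df = {n: bottom for n in nodes}
    df[entry] = init
    worklist = [entry]
    while worklist:
        u = worklist.pop()
        for (src, v, f) in edges:
            if src != u:
                continue
            new = join(df[v], transfer(f, df[u]))
            if new != df[v]:          # re-process only on an actual change
                df[v] = new
                worklist.append(v)
    return df

def join(a, b):                       # one-variable constant lattice
    if a is None: return b
    if b is None: return a
    return a if a == b else "T"

def transfer(f, v):
    if v is None:
        return None                   # nothing reaches this edge yet
    return f[1] if f[0] == "set" else v   # ("set", n) assigns; ("id",) skips

sol = chaotic_iteration(
    nodes=[0, 1, 2], entry=0, init="T", bottom=None,
    edges=[(0, 1, ("set", 7)), (0, 2, ("set", 9)), (1, 2, ("id",))],
    join=join, transfer=transfer)
# node 2 joins the constants 9 and 7 from its two predecessors, giving "T"
```

The loss of precision at node 2 is exactly the problem the rest of the lecture addresses: joining over all paths, including invalid ones, is too coarse.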

Simple Example

begin
  proc p() is^1
    [x := a + 1]^2
  end^3
  [a = 7]^4
  [call p()]^5_6
  [print x]^7
  [a = 9]^8
  [call p()]^9_10
  [print a]^11
end

First iteration: [x ↦ 0, a ↦ 0] at entry; [x ↦ 0, a ↦ 7] before the first call; [x ↦ 8, a ↦ 7] after it; [x ↦ 8, a ↦ 9] before the second call; but the second call also flows into p's entry, which becomes [x ↦ ⊤, a ↦ ⊤].

Simple Example (continued)

With calls treated as gotos, p's entry joins both calling contexts (a = 7 and a = 9), so a ↦ ⊤ inside p, and the single return edge merges back into both call sites: after the first call the information degrades to [x ↦ ⊤, a ↦ ⊤], before the second call it is [x ↦ ⊤, a ↦ 9], and the final result at the last print is [x ↦ ⊤, a ↦ ⊤].

We want something better…
- Let paths(v) denote the potentially infinite set of paths from start to v (written as sequences of labels)
- For a sequence of edges [e1, e2, …, en] define f[e1, e2, …, en] : L → L by composing the effects of basic blocks:
    f[e1, e2, …, en](l) = f(en)(… (f(e2)(f(e1)(l))) …)
- JOP[v] = ⊔ { f[e1, e2, …, en](ι) : [e1, e2, …, en] ∈ paths(v) }
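The path composition f[e1, …, en] and a two-path JOP can be made concrete with a toy one-variable constant lattice (None = ⊥, "T" = ⊤; the transfer functions here are invented for illustration):

```python
from functools import reduce

def compose_path(fs):
    """f[e1, ..., en](l) = f(en)(... f(e2)(f(e1)(l)) ...): apply in path order."""
    return lambda l: reduce(lambda acc, f: f(acc), fs, l)

def join(a, b):
    if a is None: return b
    if b is None: return a
    return a if a == b else "T"

set7 = lambda _l: 7                          # effect of x := 7
inc  = lambda v: "T" if v == "T" else v + 1  # effect of x := x + 1

paths = [[set7, inc], [set7, inc, inc]]      # two paths reaching one node
jop = reduce(join, (compose_path(p)(0) for p in paths), None)
# the two paths yield 8 and 9, so the join over paths is "T"
```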

Valid Paths

(Figure: a valid interprocedural path composes transfer functions f1 … fk along the path, where each "call q" edge into "enter q" is matched by the corresponding "exit q" edge back to its own return site "ret".)

Invalid Path

int x;

void main() {
  x = 5;
  p();
  return;
}

void p() {
  if (...) {
    x = x + 1;
    p(); // p_calls_p1
    x = x - 1;
  }
  return;
}

An invalid path enters p from the call in main but returns to the recursive call site (or vice versa), pairing a call with the wrong return.

A More Precise Solution
- Only consider matching calls and returns (valid paths)
- Can be defined via a context-free grammar
- Every call site is a different letter
- Matching calls and returns:
    Matched → ε | Matched Matched | (c Matched )c    for all [call p()]^lc_lr in P
    Valid → Matched | (c Valid                       for all [call p()]^lc_lr in P

A More Precise Solution
- Only consider matching calls and returns (valid paths)
- Can be defined via a context-free grammar over CFG edges
- Every call site is a different letter
- Let Lab* be all the labels in the program, and Lab_IP = { lc, lr : [call p()]^lc_lr in the program }
- Matching calls and returns:
    Intra → ε | (li, lj) Intra                for all li, lj ∈ Lab* \ Lab_IP
    Matched → ε | Intra | Matched Matched
            | (lc, ln) Matched (lx, lr)       for all [call p()]^lc_lr where p is proc … is^ln S end^lx
    Valid → Matched | (lc, ln) Valid          for all [call p()]^lc_lr
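A word is in the Valid language exactly when every return matches the most recent unmatched call, with unmatched calls allowed to remain open (the path may end mid-procedure). A hypothetical stack-based recognizer, not from the slides:

```python
# Recognizer for the Valid language: returns must match the innermost
# pending call; pending calls may remain at the end of the word.
def is_valid_path(word):
    """word: a sequence of ('call', site) / ('ret', site) letters."""
    stack = []
    for kind, site in word:
        if kind == 'call':
            stack.append(site)
        elif not stack or stack.pop() != site:
            return False   # a return with no, or the wrong, pending call
    return True            # leftover calls are fine: the path is unfinished

ok  = is_valid_path([('call', 10), ('call', 4), ('ret', 4)])  # valid
bad = is_valid_path([('call', 10), ('ret', 4)])               # invalid
```

This is the essence of why the language is context-free but not regular: the stack depth is unbounded under recursion.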

The Join-Over-Valid-Paths (JVP)
- For a sequence of edges [e1, e2, …, en] define f[e1, e2, …, en] : L → L by composing the effects of basic statements:
    f[](s) = s
    f[e, p](s) = f[p](f(e)(s))
- JVP[l] = ⊔ { f[e1, e2, …, e](ι) : [e1, e2, …, e] ∈ vpaths(l), e = (*, l) }
- Compute a safe approximation to the JVP
- In some cases the JVP can be computed exactly:
  - distributivity of f
  - functional representation

The Call-String Approach for Approximating the JVP
- No assumptions on the transfer functions
- Record at every node a pair (l, c) where l ∈ L is the dataflow information and c is a suffix of unmatched calls
- Use chaotic iterations
- To guarantee termination, limit the size of c (typically 1 or 2)
- Emulates inlining (but with no code growth)
- Exponential in the bound on the call-string length
- For a finite lattice there exists a bound which yields the join over all valid paths
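Maintaining the suffix of unmatched calls can be sketched with two small helpers (hypothetical names; the truncation bound k is what makes the set of contexts finite):

```python
# Call-string maintenance with truncation bound k: a call appends its site
# label, a return pops, and only the k most recent calls are kept.
def push_call(call_string, site, k=1):
    return (call_string + (site,))[-k:]   # keep only the k most recent calls

def pop_call(call_string):
    # After truncation, popping over-approximates: the dropped prefix is
    # unknown, so distinct callers may be conflated at the return.
    return call_string[:-1]

cs = push_call((), 10)    # main calls p at site 10 -> (10,)
cs = push_call(cs, 4)     # p calls itself at site 4; with k=1 -> (4,)
cs = push_call(cs, 4)     # deeper recursion folds into the same context
```

All recursive activations collapse into the context (4,), which is exactly the loss of precision seen in the recursive example below.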

Simple Example (call strings)

begin
  proc p() is^1
    [x := a + 1]^2
  end^3
  [a = 7]^4
  [call p()]^5_6
  [print x]^7
  [a = 9]^8
  [call p()]^9_10
  [print a]^11
end

Call strings keep the two contexts apart: p analyzed under ⟨5⟩ sees [x ↦ 0, a ↦ 7] and produces [x ↦ 8, a ↦ 7]; under ⟨9⟩ it sees [x ↦ 8, a ↦ 9] and produces [x ↦ 10, a ↦ 9]. Each return site restores only its matching context, so both prints see constants.

Recursive Example

begin^0
  proc p() is^1
    if [b]^2 then (
      [a := a - 1]^3
      [call p()]^4_5
      [a := a + 1]^6
    )
    [x := -2*a + 5]^7
  end^8
  [a = 7]^9 ;
  [call p()]^10_11 ;
  [print(x)]^12
end^13

With call strings of length 1: context ⟨10⟩ enters p with [x ↦ 0, a ↦ 7]; the recursive call creates context ⟨4⟩ with [x ↦ 0, a ↦ 6]; deeper recursive calls fold back into ⟨4⟩, joining successive values of a, so the ⟨4⟩ context degrades to [x ↦ ⊤, a ↦ ⊤].

The Functional Approach
- The meaning of a function is a mapping from states to states
- The abstract meaning of a function is a function from abstract states to abstract states

Motivating Example

begin
  proc p() is^1
    if [b]^2 then (
      [a := a - 1]^3
      [call p()]^4_5
      [a := a + 1]^6
    )
    [x := -2*a + 5]^7
  end^8
  [a = 7]^9 ;
  [call p()]^10_11 ;
  [print(x)]^12
end

The abstract meaning of p is the function λe.[x ↦ -2·e(a) + 5, a ↦ e(a)]; applied to [x ↦ 0, a ↦ 7] at the call, it yields [x ↦ -9, a ↦ 7] at the print.

Motivating Example (unknown input)

The same program with [read(a)]^9 in place of [a = 7]^9: the summary λe.[x ↦ -2·e(a) + 5, a ↦ e(a)] is unchanged, but applied to [x ↦ 0, a ↦ ⊤] it yields [x ↦ ⊤, a ↦ ⊤] at the print.

The Functional Approach
- Main idea: iterate on the abstract domain of functions from L to L
- Two-phase algorithm:
  - Compute the dataflow solution at the exit of a procedure as a function of the initial values at the procedure entry (functional values)
  - Compute the dataflow values at every point using the functional values
- Can compute the JVP
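Phase 2 can be sketched concretely: once a procedure's summary (its abstract input/output function) is known, each call site simply applies it. Below the summary the slides derive for p, λe.[x ↦ -2·e(a) + 5, a ↦ e(a)], is hard-coded as a Python function over a toy constant lattice with "T" as ⊤:

```python
# Phase 2 of the functional approach, sketched: apply p's known summary
# at each call site instead of re-analyzing the body.
def summary_p(env):
    a = env["a"]
    x = "T" if a == "T" else -2 * a + 5
    return {"x": x, "a": a}

after_known   = summary_p({"x": 0, "a": 7})    # a is the constant 7
after_unknown = summary_p({"x": 0, "a": "T"})  # a was read at run time
# after_known == {"x": -9, "a": 7}; after_unknown == {"x": "T", "a": "T"}
```

This reproduces both motivating examples: a constant result when a = 7, and ⊤ when a is unknown.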

Example: Constant Propagation
- L = Var → ℤ ∪ {⊤, ⊥}
- Domain: F : L → L
  - (f1 ⊔ f2)(x) = f1(x) ⊔ f2(x)

For the sequence  x := 7 ; y := x + 1 ; x := y, the statement transformers are λenv.env[x ↦ 7], λenv.env[y ↦ env(x) + 1], and λenv.env[x ↦ env(y)]. Starting from Id = λenv ∈ L.env, the functional values accumulate by composition:
    λenv.env[x ↦ 7] ∘ Id
    λenv.env[y ↦ env(x) + 1] ∘ λenv.env[x ↦ 7] ∘ Id

Example: Constant Propagation (join)
- L = Var → ℤ ∪ {⊤, ⊥}
- Domain: F : L → L
  - (f1 ⊔ f2)(x) = f1(x) ⊔ f2(x)

When the branches  x := 7  and  y := x + 1  merge before  x := y, the functional value at the merge point is the join of the branch transformers:
    λenv.env[y ↦ env(x) + 1] ⊔ λenv.env[x ↦ 7]
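Both operations on the functional domain can be modeled concretely; this is a toy sketch (not the slides' implementation) where abstract environments are dicts and "T" stands for ⊤:

```python
# Toy model of F : L -> L for constant propagation: composition chains
# statement effects, join compares the two results pointwise.
def compose(f2, f1):
    """Sequential composition: first f1, then f2."""
    return lambda env: f2(f1(env))

def join_fn(f1, f2):
    """(f1 ⊔ f2)(env)(v) = f1(env)(v) ⊔ f2(env)(v), pointwise."""
    def joined(env):
        r1, r2 = f1(env), f2(env)
        return {v: (r1[v] if r1[v] == r2[v] else "T") for v in r1}
    return joined

set_x7 = lambda env: {**env, "x": 7}                                   # x := 7
inc_y  = lambda env: {**env, "y": "T" if env["x"] == "T" else env["x"] + 1}  # y := x + 1

seq = compose(inc_y, set_x7)   # x := 7 ; y := x + 1
out = seq({"x": 0, "y": 0})    # {"x": 7, "y": 8}
```

Joining `set_x7` with a transformer that sets x to a different constant yields "T" for x, mirroring the pointwise join on the slide.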

Running Example: Phase 1 (Initialization)

(Control-flow graph as in the recursive example above.)

Node | Function
  1  | λe.[x ↦ e(x), a ↦ e(a)] = id
all other nodes | λe.⊥

Running Example: Phase 1 (Functional Values)

Node | Function
  0  | λe.[x ↦ e(x), a ↦ e(a)] = id
  1  | λe.[x ↦ e(x), a ↦ e(a)] = id
  2  | id
  3  | id
  4  | λe.[x ↦ e(x), a ↦ e(a) - 1]
  5  | f8 ∘ λe.[x ↦ e(x), a ↦ e(a) - 1] = λe.[x ↦ -2(e(a) - 1) + 5, a ↦ e(a) - 1]
  6  | λe.[x ↦ -2(e(a) - 1) + 5, a ↦ e(a)]
  7  | λe.[x ↦ -2(e(a) - 1) + 5, a ↦ e(a)] ⊔ λe.[x ↦ e(x), a ↦ e(a)]
  8  | λe.[x ↦ -2e(a) + 5, a ↦ e(a)]
 10  | λe.[x ↦ e(x), a ↦ 7]
 11  | λe.[x ↦ -2e(a) + 5, a ↦ e(a)] ∘ f10

Running Example: Phase 2 (Dataflow Values)

Node | Value
  0  | [x ↦ 0, a ↦ 0]
  1  | [x ↦ ⊤, a ↦ ⊤]  (join of all calling contexts)
  3  | [x ↦ 0, a ↦ 7]
  4  | [x ↦ 0, a ↦ 6]
  5  | [x ↦ -7, a ↦ 6]
  6  | [x ↦ -7, a ↦ 7]
  7  | [x ↦ ⊤, a ↦ 7]
  8  | [x ↦ -9, a ↦ 7]
 10  | [x ↦ 0, a ↦ 7]
 11  | [x ↦ -9, a ↦ 7]

Issues in the Functional Approach
- How can we guarantee finite height for the functional lattice?
  - L may have finite height and yet the lattice of monotone functions from L to L may not
- Efficiently representing functions:
  - functional join
  - functional composition
  - testing equality
  - usually non-trivial, but possible for distributive functions

Example: Linear Constant Propagation
- Consider the constant-propagation lattice
- The value of every variable y at the program exit can be represented as:
    y = ⊔ { a_x · x + b_x : x ∈ Var* } ⊔ c
  where a_x, c ∈ ℤ ∪ {⊤, ⊥} and b_x ∈ ℤ
- Supports efficient composition and "functional" join
  - [z := a * y + b]
  - What about [z := x + y]?
- Computes the JVP
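For a single variable, a transfer function is just a pair (a, b) denoting a·x + b, and both composition and join take constant time; a toy sketch under that assumption ("T" marks an unknown/non-linear map):

```python
# One-variable sketch of linear constant propagation: a transfer function
# is (a, b), meaning a*x + b; "T" marks an unknown slope.
TOP = "T"

def compose_linear(g, f):
    """g after f: a2*(a1*x + b1) + b2 = (a2*a1)*x + (a2*b1 + b2)."""
    (a1, b1), (a2, b2) = f, g
    if TOP in (a1, a2):
        return (TOP, 0)
    return (a2 * a1, a2 * b1 + b2)

def join_linear(f, g):
    """Equal linear maps stay; distinct ones join to the unknown map."""
    return f if f == g else (TOP, 0)

# z := 2*y + 3 composed after y := 5*x - 1 gives z = 10*x + 1
composed = compose_linear((2, 3), (5, -1))
```

The slide's question about [z := x + y] is exactly what this representation cannot express: a value depending linearly on two variables falls outside the a·x + b form.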

Functional Approach via Context-Free Reachability
- The problem of computing reachability in a graph restricted by a context-free grammar can be solved in cubic time
- Can be used to compute the JVP for arbitrary finite distributive dataflow problems (not just bitvector problems)
- Nodes in the graph correspond to individual dataflow facts
- Efficient implementations exist (e.g., MOPED)

Conclusion
- Handling functions is crucial for abstract interpretation
- Virtual functions and exceptions complicate things
- Scalability is an issue
- Assume-guarantee reasoning helps, but relies on specifications