1 Iterative Program Analysis Part I Mooly Sagiv Tel Aviv University 640-6706 Textbook: Principles of Program.

Presentation transcript:

Iterative Program Analysis, Part I. Mooly Sagiv, Tel Aviv University. Textbook: Principles of Program Analysis Chapter (modified) Appendix A

Outline
- A gentle introduction: constant propagation
- Mathematical background
- Chaotic iterations
- Abstract interpretation
- More examples
  - Kill/Gen problems
  - Garbage variables
  - Pointer analysis
    - Stack
    - Heap (later)
  - Array bound (next week)
- Precision/Completeness

A Simple Example Program

    z = 3
    x = 1
    while (x > 0) (
      if (x = 1) then y = 7
      else y = z + 4
      x = 3
      print y
    )

The slide animates program states next to the code. States shown in each animation step, in the order they appear:
1. [x ↦ 0, y ↦ 0, z ↦ 0], [x ↦ 1, y ↦ 0, z ↦ 3], [x ↦ 1, y ↦ 7, z ↦ 3], [x ↦ 0, y ↦ 0, z ↦ 3], [x ↦ 3, y ↦ 7, z ↦ 3], [x ↦ 1, y ↦ 7, z ↦ 3]
2. [x ↦ 0, y ↦ 0, z ↦ 0], [x ↦ 1, y ↦ 0, z ↦ 3], [x ↦ ⊤, y ↦ 0, z ↦ 3], [x ↦ 1, y ↦ 7, z ↦ 3], [x ↦ 0, y ↦ 0, z ↦ 3], [x ↦ 3, y ↦ 7, z ↦ 3]
3. [x ↦ 0, y ↦ 0, z ↦ 0], [x ↦ 1, y ↦ 0, z ↦ 3], [x ↦ ⊤, y ↦ 0, z ↦ 3], [x ↦ ⊤, y ↦ 7, z ↦ 3], [x ↦ 1, y ↦ 7, z ↦ 3], [x ↦ 0, y ↦ 0, z ↦ 3], [x ↦ 3, y ↦ 7, z ↦ 3]
4. [x ↦ 0, y ↦ 0, z ↦ 0], [x ↦ 1, y ↦ 0, z ↦ 3], [x ↦ ⊤, y ↦ 0, z ↦ 3], [x ↦ ⊤, y ↦ 7, z ↦ 3], [x ↦ 0, y ↦ 0, z ↦ 3], [x ↦ 3, y ↦ 7, z ↦ 3], [x ↦ 1, y ↦ 7, z ↦ 3]
5. [x ↦ 0, y ↦ 0, z ↦ 0], [x ↦ 1, y ↦ 0, z ↦ 3], [x ↦ ⊤, y ↦ 0, z ↦ 3], [x ↦ 1, y ↦ 7, z ↦ 3], [x ↦ ⊤, y ↦ 7, z ↦ 3], [x ↦ 0, y ↦ 0, z ↦ 3], [x ↦ 3, y ↦ 7, z ↦ 3], [x ↦ ⊤, y ↦ 7, z ↦ 3], [x ↦ 1, y ↦ 7, z ↦ 3]
6. [x ↦ 0, y ↦ 0, z ↦ 0], [x ↦ 1, y ↦ 0, z ↦ 3], [x ↦ ⊤, y ↦ 0, z ↦ 3], [x ↦ 1, y ↦ 7, z ↦ 3], [x ↦ 0, y ↦ 0, z ↦ 3], [x ↦ 3, y ↦ 7, z ↦ 3], [x ↦ ⊤, y ↦ 7, z ↦ 3]

Computing Constants
- Construct a control flow graph (CFG)
- Associate transfer functions with control flow graph edges
- Iterate until a solution is found
- The solution is unique
  - But the order of evaluation may affect the number of iterations

Constructing CFG

    z = 3
    x = 1
    while (x > 0) (
      if (x = 1) then y = 7
      else y = z + 4
      x = 3
      print y
    )

[Figure: control flow graph with the nodes z = 3, x = 1, while (x > 0), if (x = 1), y = 7, y = z + 4, x = 3, print y]

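Associating Transfer Functions

[Figure: the CFG with nodes z = 3, x = 1, while (x > 0), if (x = 1), y = 7, y = z + 4, x = 3, print y; each edge is labelled with a transfer function]

Transfer functions on the edges:
- z = 3: λe. e[z ↦ 3]
- x = 1: λe. e[x ↦ 1]
- while (x > 0), true branch: λe. if x > 0 then e else ⊥
- while (x > 0), false branch: λe. if x ≤ 0 then e else ⊥
- if (x = 1), true branch: λe. e ⊓ [x ↦ 1, y ↦ ⊤, z ↦ ⊤]
- if (x = 1), false branch: λe. if x ≠ 1 then e else ⊥
- y = 7: λe. e[y ↦ 7]
- y = z + 4: λe. e[y ↦ e(z) + 4]
- x = 3: λe. e[x ↦ 3]
- print y: λe. e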

Iterative Computation

The slide animates the iteration over the CFG and the transfer functions of the previous slide. Every frame shows the initial state [x ↦ 0, y ↦ 0, z ↦ 0] at the entry; program points that are not listed are still ⊥. Further states shown in each frame:
1. none (all other program points are ⊥)
2. [x ↦ 1, y ↦ 0, z ↦ 0]
3. [x ↦ 0, y ↦ 0, z ↦ 3]
4. [x ↦ 0, y ↦ 0, z ↦ 3], [x ↦ 1, y ↦ 0, z ↦ 3]
5. as in frame 4, with one fewer program point at ⊥
6. as in frame 5, with one fewer program point at ⊥
7. [x ↦ 0, y ↦ 0, z ↦ 3], [x ↦ 1, y ↦ 0, z ↦ 3], [x ↦ 1, y ↦ 7, z ↦ 3]
8. [x ↦ 0, y ↦ 0, z ↦ 3], [x ↦ 1, y ↦ 0, z ↦ 3], [x ↦ 1, y ↦ 7, z ↦ 3], [x ↦ 3, y ↦ 7, z ↦ 3]
9. [x ↦ 0, y ↦ 0, z ↦ 3], [x ↦ ⊤, y ↦ 0, z ↦ 3], [x ↦ 1, y ↦ 0, z ↦ 3], [x ↦ 1, y ↦ 7, z ↦ 3], [x ↦ 3, y ↦ 7, z ↦ 3]
10. as in frame 9
11. [x ↦ 0, y ↦ 0, z ↦ 3], [x ↦ ⊤, y ↦ 0, z ↦ 3], [x ↦ 1, y ↦ 7, z ↦ 3], [x ↦ 3, y ↦ 7, z ↦ 3]
12. [x ↦ 0, y ↦ 0, z ↦ 3], [x ↦ ⊤, y ↦ 0, z ↦ 3], [x ↦ 1, y ↦ 7, z ↦ 3], [x ↦ 3, y ↦ 7, z ↦ 3], [x ↦ ⊤, y ↦ 0, z ↦ 3]
13. [x ↦ 0, y ↦ 0, z ↦ 3], [x ↦ ⊤, y ↦ 0, z ↦ 3], [x ↦ 1, y ↦ 0, z ↦ 3], [x ↦ ⊤, y ↦ 7, z ↦ 3], [x ↦ 3, y ↦ 7, z ↦ 3], [x ↦ ⊤, y ↦ 0, z ↦ 3]

Mathematical Background
- Declaratively define
  - The result of the analysis
  - The exact solution
  - Allow comparison

Posets
- A partial ordering is a binary relation ⊑ : L × L → {false, true} such that
  - For all l ∈ L: l ⊑ l (reflexive)
  - For all l1, l2, l3 ∈ L: l1 ⊑ l2, l2 ⊑ l3 ⇒ l1 ⊑ l3 (transitive)
  - For all l1, l2 ∈ L: l1 ⊑ l2, l2 ⊑ l1 ⇒ l1 = l2 (anti-symmetric)
- Denoted by (L, ⊑)
- In program analysis
  - l1 ⊑ l2 ⇔ l1 is more precise than l2 ⇔ l1 represents fewer concrete states than l2
- Examples
  - Total orders (N, ≤)
  - Powersets (P(S), ⊆)
  - Powersets (P(S), ⊇)
  - Constant propagation

Posets
- More notations
  - l1 ⊒ l2 ⇔ l2 ⊑ l1
  - l1 ⊏ l2 ⇔ l1 ⊑ l2 ∧ l1 ≠ l2
  - l1 ⊐ l2 ⇔ l2 ⊏ l1

Upper and Lower Bounds
- Consider a poset (L, ⊑)
- A subset L' ⊆ L has a lower bound l ∈ L if for all l' ∈ L': l ⊑ l'
- A subset L' ⊆ L has an upper bound u ∈ L if for all l' ∈ L': l' ⊑ u
- A greatest lower bound of a subset L' ⊆ L is a lower bound l0 ∈ L such that l ⊑ l0 for any lower bound l of L'
- A least upper bound of a subset L' ⊆ L is an upper bound u0 ∈ L such that u0 ⊑ u for any upper bound u of L'
- For every subset L' ⊆ L:
  - The greatest lower bound of L' is unique if it exists
    - Written ⊓L' (meet); for two elements, a ⊓ b
  - The least upper bound of L' is unique if it exists
    - Written ⊔L' (join); for two elements, a ⊔ b

Complete Lattices
- A poset (L, ⊑) is a complete lattice if every subset has a least upper bound and a greatest lower bound
- L = (L, ⊑) = (L, ⊑, ⊔, ⊓, ⊥, ⊤)
  - ⊥ = ⊔∅ = ⊓L
  - ⊤ = ⊔L = ⊓∅
- Examples
  - Total orders (N, ≤)
  - Powersets (P(S), ⊆)
  - Powersets (P(S), ⊇)
  - Constant propagation

Complete Lattices
- Lemma: for every poset (L, ⊑) the following conditions are equivalent
  - L is a complete lattice
  - Every subset of L has a least upper bound
  - Every subset of L has a greatest lower bound

Cartesian Products
- A complete lattice (L1, ⊑1) = (L1, ⊑1, ⊔1, ⊓1, ⊥1, ⊤1)
- A complete lattice (L2, ⊑2) = (L2, ⊑2, ⊔2, ⊓2, ⊥2, ⊤2)
- Define a poset L = (L1 × L2, ⊑) where
  - (x1, x2) ⊑ (y1, y2) if
    - x1 ⊑1 y1 and
    - x2 ⊑2 y2
- L is a complete lattice

Finite Maps
- A complete lattice (L1, ⊑1) = (L1, ⊑1, ⊔1, ⊓1, ⊥1, ⊤1)
- A finite set V
- Define a poset L = (V → L1, ⊑) where
  - e1 ⊑ e2 if for all v ∈ V
    - e1 v ⊑1 e2 v
- L is a complete lattice
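
The finite-map construction is what constant propagation uses: a flat lattice of values lifted pointwise to maps from variables to values. The following is a minimal sketch of that idea, not code from the lecture. The names `BOT`, `TOP`, `leq`, `join`, `leq_env` and `join_env` are assumptions made for illustration, and the maps are assumed to share the same finite variable set.

```python
# Flat lattice over the integers: BOT below every value, TOP above every value.
# BOT and TOP are illustrative sentinel names, not notation from the slides.
BOT, TOP = "BOT", "TOP"

def leq(a, b):
    """a ⊑ b in the flat lattice."""
    return a == BOT or b == TOP or a == b

def join(a, b):
    """Least upper bound of two flat-lattice elements."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP

def leq_env(e1, e2):
    """Pointwise order on maps over the same finite variable set."""
    return all(leq(e1[v], e2[v]) for v in e1)

def join_env(e1, e2):
    """Pointwise join on maps over the same finite variable set."""
    return {v: join(e1[v], e2[v]) for v in e1}

e1 = {"x": 1, "y": 7, "z": 3}
e2 = {"x": 3, "y": 7, "z": 3}
joined = join_env(e1, e2)
print(joined)  # {'x': 'TOP', 'y': 7, 'z': 3}
```

Joining the two maps keeps `y` and `z` constant and sends `x` to the top element, which is the loss of information seen at the loop head of the example program.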

Chains
- A subset Y ⊆ L in a poset (L, ⊑) is a chain if every two elements in Y are ordered
  - For all l1, l2 ∈ Y: l1 ⊑ l2 or l2 ⊑ l1
- An ascending chain is a sequence of values
  - l1 ⊑ l2 ⊑ l3 ⊑ …
- A strictly ascending chain is a sequence of values
  - l1 ⊏ l2 ⊏ l3 ⊏ …
- A descending chain is a sequence of values
  - l1 ⊒ l2 ⊒ l3 ⊒ …
- A strictly descending chain is a sequence of values
  - l1 ⊐ l2 ⊐ l3 ⊐ …
- L has finite height if every chain in L is finite
- Lemma: a poset (L, ⊑) has finite height if and only if every strictly descending chain and every strictly ascending chain is finite

Monotone Functions
- A poset (L, ⊑)
- A function f: L → L is monotone if for every l1, l2 ∈ L:
  - l1 ⊑ l2 ⇒ f(l1) ⊑ f(l2)

Galois Connections
- Lattices C and A and functions α: C → A and γ: A → C
- The pair of functions (α, γ) forms a Galois connection if
  - α and γ are monotone
  - ∀a ∈ A: α(γ(a)) ⊑ a
  - ∀c ∈ C: c ⊑ γ(α(c))
- Alternatively, if ∀c ∈ C ∀a ∈ A: α(c) ⊑ a iff c ⊑ γ(a)
- α and γ uniquely determine each other
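
The alternative characterization can be checked exhaustively on a small finite example. The sketch below uses a toy parity abstraction over the universe {0, 1, 2, 3}, which is not taken from the lecture. The names `UNIVERSE`, `GAMMA`, `alpha`, `gamma` and `leq` are assumptions, and the abstract order is simply defined through γ for this toy domain.

```python
from itertools import combinations

# Toy domains (hypothetical): concrete = subsets of UNIVERSE, abstract = parity.
UNIVERSE = (0, 1, 2, 3)
GAMMA = {
    "BOT": frozenset(),
    "EVEN": frozenset({0, 2}),
    "ODD": frozenset({1, 3}),
    "TOP": frozenset(UNIVERSE),
}

def gamma(a):
    """Concretization: the set of numbers an abstract element stands for."""
    return GAMMA[a]

def leq(a, b):
    """Abstract order, defined via gamma for this toy domain."""
    return gamma(a) <= gamma(b)

def alpha(c):
    """Abstraction: the smallest abstract element whose concretization covers c."""
    candidates = (a for a in GAMMA if c <= gamma(a))
    return min(candidates, key=lambda a: len(gamma(a)))

# Enumerate every concrete set and check: alpha(c) ⊑ a  iff  c ⊆ gamma(a).
subsets = [frozenset(s) for r in range(len(UNIVERSE) + 1)
           for s in combinations(UNIVERSE, r)]
adjunction_holds = all(
    leq(alpha(c), a) == (c <= gamma(a)) for c in subsets for a in GAMMA
)
print(adjunction_holds)  # True
```

The same enumeration can be reused to check the two inequalities α(γ(a)) ⊑ a and c ⊆ γ(α(c)) separately.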

Fixed Points
- A monotone function f: L → L where (L, ⊑, ⊔, ⊓, ⊥, ⊤) is a complete lattice
- Fix(f) = {l : l ∈ L, f(l) = l}
- Red(f) = {l : l ∈ L, f(l) ⊑ l}
- Ext(f) = {l : l ∈ L, l ⊑ f(l)}
  - l1 ⊑ l2 ⇒ f(l1) ⊑ f(l2)
- Tarski's Theorem (1955): if f is monotone then
  - lfp(f) = ⊓Fix(f) = ⊓Red(f) ∈ Fix(f)
  - gfp(f) = ⊔Fix(f) = ⊔Ext(f) ∈ Fix(f)

[Figure: the lattice between ⊥ and ⊤, showing the iterates f(⊥), f²(⊥) and f(⊤), f²(⊤), the regions Red(f), Ext(f) and Fix(f), and the points lfp(f) and gfp(f)]
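
When the lattice has finite height, the least fixed point can be reached by iterating f from ⊥. The sketch below illustrates this on a powerset lattice with a hypothetical reachability problem. The graph `edges`, the function `step` and the helper name `lfp` are assumptions for illustration only.

```python
def lfp(f, bottom):
    """Iterate bottom, f(bottom), f(f(bottom)), ... until the value stabilizes.

    This terminates when f is monotone and the lattice has finite height.
    """
    x = bottom
    while True:
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt

# Hypothetical example: nodes reachable from node 1 in a small graph.
edges = {1: {2}, 2: {3}, 3: {2}, 4: {1}}

def step(reached):
    """Monotone function on (P(nodes), ⊆): add node 1 and all successors."""
    out = {1} | set(reached)
    for n in reached:
        out |= edges.get(n, set())
    return frozenset(out)

reachable = lfp(step, frozenset())
print(sorted(reachable))  # [1, 2, 3]
```

Node 4 is not in the result because it is not reachable from node 1; the result is a fixed point of `step` and is contained in every other fixed point.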

Chaotic Iterations
- A lattice L = (L, ⊑, ⊔, ⊓, ⊥, ⊤) with finite strictly increasing chains
- L^n = L × L × … × L
- A monotone function f: L^n → L^n
- Compute lfp(f)
- The simultaneous least fixed point of the system {x[i] = f_i(x) : 1 ≤ i ≤ n}

Naive iteration:

    x := (⊥, ⊥, …, ⊥)
    while (f(x) ≠ x) do
      x := f(x)

Worklist iteration:

    for i := 1 to n do
      x[i] := ⊥
    WL := {1, 2, …, n}
    while (WL ≠ ∅) do
      select and remove an element i ∈ WL
      new := f_i(x)
      if (new ≠ x[i]) then
        x[i] := new
        add all the indexes that directly depend on i to WL
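
The worklist variant can be sketched in a few lines. This is an illustrative version under stated assumptions rather than the lecture's implementation: the function name `chaotic`, the explicit dependency map `deps`, and the three-equation system over a powerset lattice are all made up for the example.

```python
def chaotic(fs, deps, bottom):
    """Worklist iteration for the system x[i] = fs[i](x).

    fs[i](x) computes component i from the current vector x.
    deps[i] lists the indexes whose equations read x[i].
    """
    n = len(fs)
    x = [bottom] * n
    worklist = set(range(n))
    while worklist:
        i = worklist.pop()          # any selection order is allowed
        new = fs[i](x)
        if new != x[i]:
            x[i] = new
            worklist.update(deps[i])
    return x

# Hypothetical system over the powerset lattice of {"a", "b"}:
#   x0 = {a}      x1 = x0 ∪ x2      x2 = x1 ∪ {b}
fs = [
    lambda x: frozenset({"a"}),
    lambda x: x[0] | x[2],
    lambda x: x[1] | frozenset({"b"}),
]
deps = {0: [1], 1: [2], 2: [1]}
solution = chaotic(fs, deps, frozenset())
print([sorted(s) for s in solution])  # [['a'], ['a', 'b'], ['a', 'b']]
```

The order in which elements are removed from the worklist affects only the number of steps, not the result, as long as every f_i is monotone.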

Chaotic Iterations
- L^n = L × L × … × L
- A monotone function f: L^n → L^n
- Compute lfp(f)
- The simultaneous least fixed point of the system {x[i] = f_i(x) : 1 ≤ i ≤ n}
- Minimum number of non-constants
- Maximum number of ⊥

    for i := 1 to n do
      x[i] := ⊥
    WL := {1, 2, …, n}
    while (WL ≠ ∅) do
      select and remove an element i ∈ WL
      new := f_i(x)
      if (new ≠ x[i]) then
        x[i] := new
        add all the indexes that directly depend on i to WL

Specialized Chaotic Iterations

    Chaotic(G(V, E): Graph, s: Node, L: Lattice, ι: L, f: E → (L → L)) {
      for each v in V do
        df_entry[v] := ⊥
      df_entry[s] := ι
      WL := {s}
      while (WL ≠ ∅) do
        select and remove an element u ∈ WL
        for each v such that (u, v) ∈ E do
          temp := f(u, v)(df_entry[u])
          new := df_entry[v] ⊔ temp
          if (new ≠ df_entry[v]) then
            df_entry[v] := new
            WL := WL ∪ {v}
    }
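
The graph-based algorithm can be tried on the running example. The sketch below is one possible encoding, not the lecture's code. It assumes nodes numbered 1 to 8 in program order (z = 3, x = 1, while, if, y = 7, y = z + 4, x = 3, print y) plus a hypothetical exit node 9, uses `None` for the bottom environment and the string `"TOP"` for a non-constant, and evaluates guards on the abstract value of x.

```python
TOP = "TOP"  # non-constant; None stands for the bottom environment

def join(e1, e2):
    """Join of two environments; None is the bottom element."""
    if e1 is None:
        return e2
    if e2 is None:
        return e1
    return {v: (e1[v] if e1[v] == e2[v] else TOP) for v in e1}

def assign(var, rhs):
    """Transfer function λe. e[var ↦ rhs(e)]."""
    return lambda e: None if e is None else {**e, var: rhs(e)}

def guard(var, test):
    """λe. if test may hold for e(var) then e else ⊥."""
    def f(e):
        if e is None:
            return None
        return e if e[var] == TOP or test(e[var]) else None
    return f

def refine_eq(var, c):
    """λe. e ⊓ [var ↦ c, every other variable ↦ ⊤]."""
    def f(e):
        if e is None or (e[var] != TOP and e[var] != c):
            return None
        return {**e, var: c}
    return f

def const(c):
    return lambda e: c

def z_plus_4(e):
    return TOP if e["z"] == TOP else e["z"] + 4

edges = {
    (1, 2): assign("z", const(3)),
    (2, 3): assign("x", const(1)),
    (3, 4): guard("x", lambda c: c > 0),
    (3, 9): guard("x", lambda c: c <= 0),   # loop exit (node 9 is assumed)
    (4, 5): refine_eq("x", 1),
    (4, 6): guard("x", lambda c: c != 1),
    (5, 7): assign("y", const(7)),
    (6, 7): assign("y", z_plus_4),
    (7, 8): assign("x", const(3)),
    (8, 3): lambda e: e,
}

def chaotic(edges, start, init):
    """Worklist iteration computing df_entry for every node."""
    df = {n: None for edge in edges for n in edge}
    df[start] = init
    worklist = [start]
    while worklist:
        u = worklist.pop()
        for (a, v), f in edges.items():
            if a != u:
                continue
            new = join(df[v], f(df[u]))
            if new != df[v]:
                df[v] = new
                worklist.append(v)
    return df

df = chaotic(edges, 1, {"x": 0, "y": 0, "z": 0})
print(df[8])  # {'x': 3, 'y': 7, 'z': 3}
```

Under this encoding y is the constant 7 at the entry of `print y`, while x is non-constant at the loop head.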

Worklist trace on the example CFG (nodes and transfer functions as on the earlier slides), starting from [x ↦ 0, y ↦ 0, z ↦ 0]:

    WL       df_entry[v]
    {1}
    {2}      df[2] := [x ↦ 0, y ↦ 0, z ↦ 3]
    {3}      df[3] := [x ↦ 1, y ↦ 0, z ↦ 3]
    {4}      df[4] := [x ↦ 1, y ↦ 0, z ↦ 3]
    {5}      df[5] := [x ↦ 1, y ↦ 0, z ↦ 3]
    {7}      df[7] := [x ↦ 1, y ↦ 7, z ↦ 3]
    {8}      df[8] := [x ↦ 3, y ↦ 7, z ↦ 3]
    {3}      df[3] := [x ↦ ⊤, y ↦ ⊤, z ↦ 3]
    {4}      df[4] := [x ↦ ⊤, y ↦ ⊤, z ↦ 3]
    {5,6}    df[5] := [x ↦ 1, y ↦ ⊤, z ↦ 3]
    {6,7}    df[6] := [x ↦ ⊤, y ↦ ⊤, z ↦ 3]
    {7}      df[7] := [x ↦ ⊤, y ↦ 7, z ↦ 3]

Specialized Chaotic Iterations: System of Equations
- S =
  - df_entry[s] = ι
  - df_entry[v] = ⊔{f(u, v)(df_entry[u]) | (u, v) ∈ E}
- F_S: L^n → L^n
  - F_S(X)[s] = ι
  - F_S(X)[v] = ⊔{f(u, v)(X[u]) | (u, v) ∈ E}
- lfp(S) = lfp(F_S)

Complexity of Chaotic Iterations
- Parameters:
  - n is the number of CFG nodes
  - k is the maximum outdegree of a node
  - h is the height of the lattice
  - c is the maximum cost of
    - applying f(e)
    - ⊔
    - comparisons in L
- Complexity: O(n * h * c * k)

Soundness
- Every detected constant is indeed such
- Every error will be detected
- The least fixed point represents all occurring runtime states

Completeness
- Every constant is indeed detected as such
- Every detected error is real
- Every state represented by the least fixed point is reachable for some input

The Abstract Interpretation Technique
- The foundation of program analysis
- Goals
  - Establish soundness of (find faults in) a given program analysis algorithm
  - Design new program analysis algorithms
- The main ideas:
  - Relate each step in the algorithm to a step in a structural operational semantics
  - Establish global correctness using a general theorem
- Not limited to a particular form of analysis

Soundness in Constant Propagation
- Every detected constant is indeed such
- May include fewer constants
- May miss ⊥
- At every CFG node l, all constants in df_entry(l) are indeed constants
- At every CFG node l, df_entry(l) "represents" all the possible concrete states arising when the structural operational semantics reaches l

Proof of Soundness
- Define an "appropriate" structural operational semantics
- Define a "collecting" structural operational semantics
- Establish a Galois connection between collecting states and reaching definitions
- (Local correctness) Show that the abstract interpretation of every atomic statement is sound w.r.t. the collecting semantics
- (Global correctness) Conclude that the analysis is sound [CC1976]
  - α(lfp(S)[v, entry]) ⊑ REACH[v, entry]

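Structural Semantics for While

Axioms:
- [ass_sos] ⟨x := a, s⟩ ⇒ s[x ↦ A⟦a⟧s]
- [skip_sos] ⟨skip, s⟩ ⇒ s

Rules:
- [comp1_sos] if ⟨S1, s⟩ ⇒ ⟨S1', s'⟩ then ⟨S1; S2, s⟩ ⇒ ⟨S1'; S2, s'⟩
- [comp2_sos] if ⟨S1, s⟩ ⇒ s' then ⟨S1; S2, s⟩ ⇒ ⟨S2, s'⟩

Structural Semantics for While: the if construct
- [if_tt_sos] ⟨if b then S1 else S2, s⟩ ⇒ ⟨S1, s⟩ if B⟦b⟧s = tt
- [if_ff_sos] ⟨if b then S1 else S2, s⟩ ⇒ ⟨S2, s⟩ if B⟦b⟧s = ff

Structural Semantics for While: the while construct
- [while_sos] ⟨while b do S, s⟩ ⇒ ⟨if b then (S; while b do S) else skip, s⟩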

Collecting Semantics
- The input state is not known at compile-time
- "Collect" all the states for all possible inputs to the program
- No loss of precision

A Simple Example Program

    z = 3
    x = 1
    while (x > 0) (
      if (x = 1) then y = 7
      else y = z + 4
      x = 3
      print y
    )

Sets of states shown next to the code, in the order they appear:
- {[x ↦ 0, y ↦ 0, z ↦ 0]}
- {[x ↦ 1, y ↦ 0, z ↦ 3]}
- {[x ↦ 1, y ↦ 0, z ↦ 3], [x ↦ 3, y ↦ 0, z ↦ 3]}
- {[x ↦ 0, y ↦ 0, z ↦ 3]}
- {[x ↦ 1, y ↦ 7, z ↦ 3], [x ↦ 3, y ↦ 7, z ↦ 3]}
- {[x ↦ 3, y ↦ 7, z ↦ 3]}

Another Example x= 0 while (true) do x = x +1

An "Iterative" Definition
- Generate a system of monotone equations
- The least solution is well-defined
- The least solution is the collecting interpretation
- But it may not be computable

Equations Generated for Collecting Interpretation
- Equations for elementary statements
  - [skip]: CS_exit(l) = CS_entry(l)
  - [b]: CS_exit(l) = {σ : σ ∈ CS_entry(l), ⟦b⟧σ = tt}
  - [x := a]: CS_exit(l) = {s[x ↦ A⟦a⟧s] | s ∈ CS_entry(l)}
- Equations for control flow constructs
  - CS_entry(l) = ∪ CS_exit(l'), where l' immediately precedes l in the control flow graph
- An equation for the entry
  - CS_entry(1) = {σ | σ ∈ Var* → Z}

Specialized Chaotic Iterations: System of Equations (Collecting Semantics)
- S =
  - CS_entry[s] = {σ0}
  - CS_entry[v] = ∪{f(e)(CS_entry[u]) | e = (u, v) ∈ E}
- where
  - f(e) = λX. {⟦st(e)⟧σ | σ ∈ X} for atomic statements
  - f(e) = λX. {σ | σ ∈ X, ⟦b(e)⟧σ = tt} for conditions
- F_S: L^n → L^n
  - F_S(X)[v] = ∪{f(e)(X[u]) | e = (u, v) ∈ E}
- lfp(S) = lfp(F_S)

The Least Solution
- 2n sets of equations: CS_entry(1), …, CS_entry(n), CS_exit(1), …, CS_exit(n)
- Can be written in vectorial form
- The least solution lfp(F_cs) is well-defined
- Every component is minimal
- Since F_cs is monotone such a solution always exists
- CS_entry(v) = {s | ∃s0: ⟨P, s0⟩ ⇒* ⟨S', s⟩, init(S') = v}
- Simplify the soundness criteria

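Abstract (Conservative) interpretation

[Figure: diagram relating the two semantics via concretization. An abstract representation is concretized to a set of states; the abstract semantics of statement s maps it to a new abstract representation, the operational semantics of statement s maps the set of states to a new set of states, and the concretization of the new abstract representation covers that set.]

Abstract (Conservative) interpretation

[Figure: the same diagram via abstraction. A set of states is abstracted to an abstract representation; the operational semantics of statement s followed by abstraction is compared with the abstract semantics of statement s applied to the abstract representation.]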

The Abstraction Function
- Map collecting states into constants
- The abstraction of an individual state
  - β_CP: [Var* → Z] → [Var* → Z ∪ {⊥, ⊤}]
  - β_CP(σ) = σ
- The abstraction of a set of states
  - α_CP: P([Var* → Z]) → [Var* → Z ∪ {⊥, ⊤}]
  - α_CP(CS) = ⊔{β_CP(σ) | σ ∈ CS} = ⊔{σ | σ ∈ CS}
- Soundness
  - α_CP(CS_entry(v)) ⊑ df_entry(v)
- Completeness

The Concretization Function
- Map constants into collecting states
- The formal meaning of constants
- The concretization
  - γ_CP: [Var* → Z ∪ {⊥, ⊤}] → P([Var* → Z])
  - γ_CP(df) = {σ | β_CP(σ) ⊑ df} = {σ | σ ⊑ df}
- Soundness
  - CS_entry(v) ⊆ γ_CP(df_entry(v))
- Optimality
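
The abstraction and concretization for constant propagation can be sketched directly from these definitions. The code below is illustrative only: `BOT` and `TOP` are assumed sentinel names, `alpha` works on a finite collection of states, and since γ_CP(df) is in general an infinite set, only a membership test `in_gamma` is provided.

```python
BOT, TOP = "BOT", "TOP"  # assumed names for the bottom and top elements

def join_val(a, b):
    """Join in the flat lattice Z ∪ {BOT, TOP}."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP

def alpha(states, variables):
    """α_CP(CS): pointwise join of a finite collection of concrete states."""
    out = {v: BOT for v in variables}
    for s in states:
        out = {v: join_val(out[v], s[v]) for v in variables}
    return out

def in_gamma(state, df):
    """Membership test σ ∈ γ_CP(df), i.e. σ ⊑ df pointwise."""
    return all(df[v] == TOP or df[v] == state[v] for v in df)

# Two concrete states from the example program.
cs = [{"x": 1, "y": 7, "z": 3}, {"x": 3, "y": 7, "z": 3}]
df = alpha(cs, ["x", "y", "z"])
print(df)                                   # {'x': 'TOP', 'y': 7, 'z': 3}
print(all(in_gamma(s, df) for s in cs))     # True
```

The second print checks c ⊆ γ_CP(α_CP(c)) for this particular set of states; a state with y different from 7 is not in the concretization.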

Galois Connection
- α_CP is monotone
- γ_CP is monotone
- ∀df ∈ [Var* → Z ∪ {⊥, ⊤}]
  - α_CP(γ_CP(df)) ⊑ df
- ∀c ∈ P([Var* → Z])
  - c ⊆ γ_CP(α_CP(c))

Local Concrete Semantics
- For every atomic statement S
  - ⟦S⟧: [Var* → Z] → [Var* → Z]
  - ⟦x := a⟧s = s[x ↦ A⟦a⟧s]
  - ⟦skip⟧s = s
- For Boolean conditions …

Local Abstract Semantics
- For every atomic statement S
  - ⟦S⟧#: (Var* → L) → (Var* → L)
  - ⟦x := a⟧#(e) = e[x ↦ ⟦a⟧#(e)]
  - ⟦skip⟧#(e) = e
- For Booleans …

Local Soundness
- For every atomic statement S show one of the following
  - α_CP({⟦S⟧σ | σ ∈ CS}) ⊑ ⟦S⟧#(α_CP(CS))
  - {⟦S⟧σ | σ ∈ γ_CP(df)} ⊆ γ_CP(⟦S⟧#(df))
  - α({⟦S⟧σ | σ ∈ γ_CP(df)}) ⊑ ⟦S⟧#(df)
- The above condition implies global soundness [Cousot & Cousot 1976]
  - α(CS_entry(l)) ⊑ df_entry(l)
  - CS_entry(l) ⊆ γ(df_entry(l))

Lemma 1
Consider a lattice L. f: L → L is monotone iff for all X ⊆ L: ⊔{f(z) | z ∈ X} ⊑ f(⊔{z | z ∈ X})

Assignments in constant propagation
- Monotone
  - df1 ⊑ df2 ⇒ ⟦x := e⟧#(df1) ⊑ ⟦x := e⟧#(df2)
- Local soundness
  - α({⟦x := e⟧σ | σ ∈ CS}) ⊑ ⟦x := e⟧#(α(CS))

Soundness Theorem (1)
1. Let (α, γ) form a Galois connection from C to A
2. f: C → C be a monotone function
3. f#: A → A be a monotone function
4. ∀a ∈ A: f(γ(a)) ⊑ γ(f#(a))
Then lfp(f) ⊑ γ(lfp(f#)) and α(lfp(f)) ⊑ lfp(f#)

Soundness Theorem (2)
1. Let (α, γ) form a Galois connection from C to A
2. f: C → C be a monotone function
3. f#: A → A be a monotone function
4. ∀c ∈ C: α(f(c)) ⊑ f#(α(c))
Then α(lfp(f)) ⊑ lfp(f#) and lfp(f) ⊑ γ(lfp(f#))

Soundness Theorem (3)
1. Let (α, γ) form a Galois connection from C to A
2. f: C → C be a monotone function
3. f#: A → A be a monotone function
4. ∀a ∈ A: α(f(γ(a))) ⊑ f#(a)
Then α(lfp(f)) ⊑ lfp(f#) and lfp(f) ⊑ γ(lfp(f#))

Proof of Soundness (Summary)
- Define an "appropriate" structural operational semantics
- Define a "collecting" structural operational semantics
- Establish a Galois connection between collecting states and reaching definitions
- (Local correctness) Show that the abstract interpretation of every atomic statement is sound w.r.t. the collecting semantics
- (Global correctness) Conclude that the analysis is sound

Best (Conservative) interpretation

[Figure: diagram of the best abstract transformer. An abstract representation is concretized to a set of states, the operational semantics of statement s is applied to that set, and the resulting set of states is abstracted again; this composition is compared with the abstract semantics of statement s.]

Induced Analysis (Relatively Optimal)
- It is sometimes possible to show that a given analysis is not only sound but optimal w.r.t. the chosen abstraction
  - but not necessarily optimal!
- Define ⟦S⟧#(df) = α({⟦S⟧σ | σ ∈ γ(df)})
- But this ⟦S⟧# may not be computable
- Derive (at compiler-generation time) an alternative form for ⟦S⟧#
- A useful measure to decide if the abstraction must lead to overly imprecise results

Example Dataflow Problem
- Formal available expression analysis
- Find out which expressions are available at a given program point
- Example program

      x = y + t
      z = y + r
      while (…) {
        t = t + (y + r)
      }

- Lattice
- Galois connection
- Basic statements
- Soundness

Example: May-Be-Garbage
- A variable x may-be-garbage at a program point v if there exists an execution path leading to v in which x's value is unpredictable:
  - Was not assigned
  - Was assigned using an unpredictable expression
- Lattice
- Galois connection
- Basic statements
- Soundness

Points-To Analysis
- Determine if a pointer variable p may point to q on some path leading to a program point
- "Adapt" other optimizations
  - Constant propagation: x = 5; *p = 7; … x …
- Pointer aliases
  - Variables p and q are may-aliases at v if the points-to set at v contains entries (p, x) and (q, x)
- Side-effect analysis: *p = *q + * * t

The PWhile Programming Language: Abstract Syntax

    a := x | *x | &x | n | a1 op_a a2
    b := true | false | not b | b1 op_b b2 | a1 op_r a2
    S := x := a | *x := a | skip | S1 ; S2 | if b then S1 else S2 | while b do S

Concrete Semantics 1 for PWhile
- State1 = [Loc → Loc ∪ Z]
- For every atomic statement S: ⟦S⟧: States1 → States1
  - ⟦x := a⟧(σ) = σ[loc(x) ↦ A⟦a⟧σ]
  - ⟦x := &y⟧(σ)
  - ⟦x := *y⟧(σ)
  - ⟦x := y⟧(σ)
  - ⟦*x := y⟧(σ)

Points-To Analysis
- Lattice L_pt =
- Galois connection

Abstract Semantics for PWhile
- For every atomic statement S: ⟦S⟧#: P(Var* × Var*) → P(Var* × Var*)
  - ⟦x := &y⟧#
  - ⟦x := *y⟧#
  - ⟦x := y⟧#
  - ⟦*x := y⟧#

    t := &a;
    y := &b;
    z := &c;
    if x > 0
      then p := &y;
      else p := &z;
    *p := t;

    /* ∅ */
    t := &a;
    /* {(t, a)} */
    y := &b;
    /* {(t, a), (y, b)} */
    z := &c;
    /* {(t, a), (y, b), (z, c)} */
    if x > 0
      then p := &y;   /* {(t, a), (y, b), (z, c), (p, y)} */
      else p := &z;   /* {(t, a), (y, b), (z, c), (p, z)} */
    /* {(t, a), (y, b), (z, c), (p, y), (p, z)} */
    *p := t;
    /* {(t, a), (y, b), (z, c), (p, y), (p, z), (y, a), (z, a)} */
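
The annotations above can be reproduced with two abstract transfer functions over sets of (pointer, target) pairs. The slides leave the abstract semantics unspecified, so the definitions below are an assumption: a direct assignment `x := &y` replaces the targets of x, and an indirect assignment `*x := y` only adds pairs (a weak update). The function names `addr_of` and `store` are hypothetical.

```python
def addr_of(x, y):
    """Assumed abstract effect of x := &y: x now points exactly to y."""
    return lambda pt: {(p, q) for (p, q) in pt if p != x} | {(x, y)}

def store(x, y):
    """Assumed abstract effect of *x := y (weak update).

    Every possible target w of x may now point to whatever y may point to;
    existing pairs are kept because x may point to several variables.
    """
    return lambda pt: pt | {
        (w, z)
        for (p, w) in pt if p == x
        for (q, z) in pt if q == y
    }

# Straight-line prefix: t := &a; y := &b; z := &c;
pt = set()
for stmt in (addr_of("t", "a"), addr_of("y", "b"), addr_of("z", "c")):
    pt = stmt(pt)

# The two branches of the conditional, joined by set union.
then_branch = addr_of("p", "y")(pt)
else_branch = addr_of("p", "z")(pt)
merged = then_branch | else_branch

# *p := t
result = store("p", "t")(merged)
print(sorted(result))
```

The final set contains the pairs (y, a) and (z, a) introduced by the indirect assignment, in addition to all pairs that held before it.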

Flow-insensitive points-to analysis (Steensgaard 1996)
- Ignore control flow
- One set of points-to pairs per program
- Can be represented as a directed graph
- Conservative approximation
  - Accumulate pointers
- Can be computed in almost linear time

    t := &a;
    y := &b;
    z := &c;
    if x > 0
      then p := &y;
      else p := &z;
    *p := t;

Precision
- We cannot usually have
  - α(CS) = DF on all programs
- But can we say something about precision in all programs?

The Join-Over-All-Paths (JOP)
- Let paths(v) denote the potentially infinite set of paths from start to v (written as sequences of labels)
- For a sequence of edges [e1, e2, …, en] define f[e1, e2, …, en]: L → L by composing the effects of basic blocks
  - f[e1, e2, …, en](l) = f(en)(… (f(e2)(f(e1)(l))) …)
- JOP[v] = ⊔{f[e1, e2, …, en](ι) | [e1, e2, …, en] ∈ paths(v)}

JOP vs. Least Solution
- The DF solution obtained by chaotic iteration satisfies for every v:
  - JOP[v] ⊑ DF_entry(v)
- A function f is additive (distributive) if
  - f(⊔{x | x ∈ X}) = ⊔{f(x) | x ∈ X}
- If every f(e) is additive (distributive) for all the edges e
  - JOP[v] = DF_entry(v)
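
Constant propagation is a standard case where the inequality is strict, because its transfer functions are not distributive. The sketch below uses a hypothetical two-branch program that is not from the slides (one branch sets a = 1, b = 2, the other sets a = 2, b = 1, and both are followed by c := a + b), and compares joining after the assignment with joining before it.

```python
TOP = "TOP"  # assumed name for the non-constant element

def join(e1, e2):
    """Pointwise join of two environments over the same variables."""
    return {v: (e1[v] if e1[v] == e2[v] else TOP) for v in e1}

def add_ab(e):
    """Abstract effect of c := a + b."""
    total = TOP if TOP in (e["a"], e["b"]) else e["a"] + e["b"]
    return {**e, "c": total}

# Abstract states at the end of the two branches (hypothetical program).
path1 = {"a": 1, "b": 2, "c": 0}
path2 = {"a": 2, "b": 1, "c": 0}

# Join over all paths: apply the transfer function per path, then join.
jop = join(add_ab(path1), add_ab(path2))

# Iterative solution: join at the merge point first, then apply the function.
df = add_ab(join(path1, path2))

print(jop["c"], df["c"])  # 3 TOP
```

Along each path c is 3, so the join over paths keeps the constant, while the iterative solution has already lost a and b at the merge point and reports c as non-constant.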

Conclusions
- Chaotic iterations is a powerful technique
- Easy to implement
- Rather precise
- But expensive
  - More efficient methods exist for structured programs
- Abstract interpretation relates runtime semantics and static information
- The concrete semantics serves as a tool in designing abstractions
  - More intuition will be given in the sequel