Lecture 03 – Background + intro to abstract interpretation (AI) + dataflow and its connection to AI
Eran Yahav


Previously…
- structural operational semantics
- trace semantics: the meaning of a program as the set of its traces
- abstract interpretation: abstractions of the trace semantics
Examples: Cormac's PLDI'08, Kuperstein PLDI'11

Today
- abstract interpretation
  - some basic math background: lattices, functions, Galois connections
- a little more dataflow analysis
  - the monotone framework

Trace Semantics
- note that the input (x) is unknown
- what about the while loop?
- the trace semantics is not computable

[y := x]1; [z := 1]2;
while [y > 0]3 (
  [z := z * y]4;
  [y := y − 1]5;
)
[y := 0]6

(Traces: σ0 → [y := x]1 → σ1 → [z := 1]2 → σ2 → [y > 0]3 → …, one trace for every possible initial value of x.)

Abstract Interpretation

(Diagram: a concrete set of states C is mapped by the concrete semantics ⟦S⟧ to C'; its abstraction is an abstract state, mapped by the abstract semantics ⟦S⟧# to an abstract state that over-approximates C'.)

Dataflow Analysis

Reaching Definitions: the assignment [var := exp]^lab reaches lab' if there is an execution where var was last assigned at lab.

1: y := x;
2: z := 1;
3: while y > 0 {
4:   z := z * y;
5:   y := y − 1
}
6: y := 0

(adapted from Nielson, Nielson & Hankin)

(Per-point annotations on the slide, e.g. { (x,?), (y,?), (z,?) } at entry, { (x,?), (y,1), (z,?) } after label 1, { (x,?), (y,1), (z,2) } after label 2, { (x,?), (y,5), (z,4) } after label 5, …)

Dataflow Analysis

Reaching Definitions, once the loop is taken into account: the sets at the loop head also include the definitions from the loop body, e.g.:

in(3) = { (x,?), (y,1), (z,2), (y,5), (z,4) }
out(5) = { (x,?), (y,5), (z,4) }
out(6) = { (x,?), (y,6), (z,2), (z,4) }

Dataflow Analysis
- build the control-flow graph
- assign transfer functions
- compute a fixed point

Control-Flow Graph

1: y := x;
2: z := 1;
3: while y > 0 {
4:   z := z * y;
5:   y := y − 1
}
6: y := 0

CFG: [1: y:=x] → [2: z:=1] → [3: y>0];  [3] → [4: z:=z*y] → [5: y:=y-1] → [3];  [3] → [6: y:=0]

Transfer Functions

in(1) = { (x,?), (y,?), (z,?) }
in(2) = out(1)
in(3) = out(2) ∪ out(5)
in(4) = out(3)
in(5) = out(4)
in(6) = out(3)

out(1) = in(1) \ { (y,l) | l ∈ Lab } ∪ { (y,1) }
out(2) = in(2) \ { (z,l) | l ∈ Lab } ∪ { (z,2) }
out(3) = in(3)
out(4) = in(4) \ { (z,l) | l ∈ Lab } ∪ { (z,4) }
out(5) = in(5) \ { (y,l) | l ∈ Lab } ∪ { (y,5) }
out(6) = in(6) \ { (y,l) | l ∈ Lab } ∪ { (y,6) }

System of Equations

in(1) = { (x,?), (y,?), (z,?) }     out(1) = in(1) \ { (y,l) | l ∈ Lab } ∪ { (y,1) }
in(2) = out(1)                      out(2) = in(2) \ { (z,l) | l ∈ Lab } ∪ { (z,2) }
in(3) = out(2) ∪ out(5)             out(3) = in(3)
in(4) = out(3)                      out(4) = in(4) \ { (z,l) | l ∈ Lab } ∪ { (z,4) }
in(5) = out(4)                      out(5) = in(5) \ { (y,l) | l ∈ Lab } ∪ { (y,5) }
in(6) = out(3)                      out(6) = in(6) \ { (y,l) | l ∈ Lab } ∪ { (y,6) }

The whole system is one function F: (P(Var × Lab))^12 → (P(Var × Lab))^12.

System of Equations: Iterating F

(Equation system as on the previous slide.) Starting from RD⊥ = (∅, …, ∅) and applying F repeatedly, each component grows until it stabilizes ("=" marks no further change):

in(1):  ∅ → {(x,?),(y,?),(z,?)} = =
in(2):  ∅ → {(y,1)} → {(x,?),(y,1),(z,?)} = =
in(3):  ∅ → {(z,2),(y,5)} → {(z,2),(z,4),(y,5)} → …
in(4):  ∅ → {(z,2),(y,5)} → …
in(5):  ∅ → {(z,4)} → …
in(6):  ∅ → {(z,2),(y,5)} → …
out(1): ∅ → {(y,1)} → {(x,?),(y,1),(z,?)} = =
out(2): ∅ → {(z,2)} → {(z,2),(y,1)} = =
out(3): ∅ → {(z,2),(y,5)} → …
out(4): ∅ → {(z,4)} → …
out(5): ∅ → {(y,5)} → {(z,4),(y,5)} → …
out(6): ∅ → {(y,6)} → …

System of Equations: The Iterates Form a Chain

F: (P(Var × Lab))^12 → (P(Var × Lab))^12
Order tuples pointwise: RD ⊑ RD' when ∀i: RD(i) ⊆ RD'(i).
Then the iterates RD⊥, F(RD⊥), F(F(RD⊥)), F(F(F(RD⊥))), F(F(F(F(RD⊥)))), … form an ascending chain, component by component (same table as on the previous slide).

Monotonicity

F: (P(Var × Lab))^12 → (P(Var × Lab))^12 must be monotone for the iterates RD⊥, F(RD⊥), F(F(RD⊥)), … to converge. A silly example, just for illustration, of a non-monotone equation:

out(1) = {(y,1)}                                    when (x,?) ∈ out(1)
out(1) = in(1) \ { (y,l) | l ∈ Lab } ∪ { (y,1) }    otherwise

With it, out(1) oscillates forever: {(y,1)}, {(x,?),(y,1),(z,?)}, {(y,1)}, …

Convergence

(Diagram: ascending chains 0, 1, 2, …, n of growing length, illustrating how the iteration climbs a chain until it stabilizes.)

Least Fixed Point
- we will see later why it exists
- for now, mostly informally…

F: (P(Var × Lab))^12 → (P(Var × Lab))^12
RD ⊑ RD' when ∀i: RD(i) ⊆ RD'(i)
F is monotone: RD ⊑ RD' implies F(RD) ⊑ F(RD')

RD⊥ = (∅, ∅, …, ∅)
Iterates F(RD⊥), F(F(RD⊥)), F(F(F(RD⊥))), … until F^{n+1}(RD⊥) = F^n(RD⊥):
RD⊥ ⊑ F(RD⊥) ⊑ F(F(RD⊥)) ⊑ … ⊑ F^n(RD⊥)

Partially Ordered Sets (posets)
- a partial ordering is a relation ⊑ : L × L → { true, false } that is:
  - reflexive (∀l: l ⊑ l)
  - transitive (∀l1,l2,l3: l1 ⊑ l2 ∧ l2 ⊑ l3 ⟹ l1 ⊑ l3)
  - anti-symmetric (l1 ⊑ l2 ∧ l2 ⊑ l1 ⟹ l1 = l2)
- a partially ordered set (L, ⊑) is a set L equipped with a partial ordering ⊑
- for example: (P(S), ⊆)
- captures the intuition of implication between facts: l1 ⊑ l2 will intuitively mean l1 ⟹ l2
- later, in abstract domains, when l1 ⊑ l2 we will say that l1 is "more precise" than l2 (it represents fewer concrete states)

Bounds in Posets

Given a poset (L, ⊑):
- a subset Y of L has l ∈ L as an upper bound if ∀l' ∈ Y: l' ⊑ l
- a subset Y of L has l ∈ L as a lower bound if ∀l' ∈ Y: l' ⊒ l (we write l' ⊒ l when l ⊑ l')
- a least upper bound (lub) l of Y is an upper bound of Y that satisfies l ⊑ l' whenever l' is another upper bound of Y
- a greatest lower bound (glb) l of Y is a lower bound of Y that satisfies l' ⊑ l whenever l' is another lower bound of Y

For any subset Y ⊆ L:
- if the lub exists then it is unique, denoted ⊔Y (join); we often write l1 ⊔ l2 for ⊔{ l1, l2 }
- if the glb exists then it is unique, denoted ⊓Y (meet); we often write l1 ⊓ l2 for ⊓{ l1, l2 }
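In the powerset lattice (P(S), ⊆) these bounds always exist: the lub of a family is its union and the glb is its intersection. A minimal sketch in Python (function names and encoding are illustrative, not the lecture's notation):

    # lub/glb in the powerset lattice (P(S), subset-of): join is union, meet is intersection
    from functools import reduce

    def lub(subsets):
        """Least upper bound of a family of sets: their union (empty family gives bottom)."""
        return reduce(frozenset.union, subsets, frozenset())

    def glb(subsets, universe):
        """Greatest lower bound: intersection (empty family gives the top element S)."""
        return reduce(frozenset.intersection, subsets, frozenset(universe))

    S = frozenset({1, 2, 3})
    Y = [frozenset({1}), frozenset({1, 2})]
    assert lub(Y) == frozenset({1, 2})   # {1} join {1,2} = {1,2}
    assert glb(Y, S) == frozenset({1})   # {1} meet {1,2} = {1}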

Complete Lattices
- a complete lattice (L, ⊑) is a poset such that all subsets have least upper bounds as well as greatest lower bounds
- in particular:
  - ⊥ = ⊔∅ = ⊓L is the least element (bottom)
  - ⊤ = ⊓∅ = ⊔L is the greatest element (top)

Complete Lattices: Examples

(Diagrams: the powerset of {1,2,3} ordered by ⊆, from ∅ up through {1}, {2}, {3}, {1,2}, {1,3}, {2,3} to {1,2,3}; and a sign lattice with ⊥ below -, 0, + and ⊤ on top.)

Complete Lattices: Examples

(Diagram: the interval lattice, from ⊥ up through the singletons [0,0], [-1,-1], [-2,-2], [1,1], [2,2], …, wider intervals such as [-2,-1], [-1,0], [0,1], [1,2], [-2,0], [-1,1], [0,2], …, unbounded intervals [0,∞], [-∞,0], …, up to [-∞,∞].)

Moore Families
- a Moore family is a subset Y of a complete lattice (L, ⊑) that is closed under greatest lower bounds: ∀Y' ⊆ Y: ⊓Y' ∈ Y
- hence a Moore family is never empty: it always contains a least element ⊓Y and a greatest element ⊤ (= ⊓∅)
- why do we care?
  - we will see that Moore families correspond to choices of abstractions (as targets of upper closure operators)
  - more on this… later…

Moore Families: Examples

(Diagrams: inside the powerset of {1,2,3}, the subset { {2}, {1,2}, {2,3}, {1,2,3} } is a Moore family, since it is closed under greatest lower bounds; a subset that is not closed under ⊓ is not.)

Chains
- in program analysis we will often look at sequences of properties of the form l0 ⊑ l1 ⊑ … ⊑ ln
- given a poset (L, ⊑), a subset Y ⊆ L is a chain if every two elements in Y are ordered: ∀l1,l2 ∈ Y: (l1 ⊑ l2) or (l2 ⊑ l1)
- when Y is a finite subset we say that the chain is a finite chain
- a sequence of elements l0 ⊑ l1 ⊑ … in L is an ascending chain
- a sequence of elements l0 ⊒ l1 ⊒ … in L is a descending chain
- L has finite height when all chains in L are finite
- a poset (L, ⊑) satisfies the ascending chain condition (ACC) if every ascending chain eventually stabilizes, that is: ∃s ∈ ℕ: ∀n ∈ ℕ: n ≥ s ⟹ l_n = l_s

Chains: Example Posets

(Diagrams (a), (b), (c): example orderings over the integers with ⊥ and ⊤, e.g. the flat lattice with …, -2, -1, 0, 1, 2, … pairwise incomparable, illustrating finite and infinite chains.)

Properties of Functions

A function f: L1 → L2 between two posets (L1, ⊑1) and (L2, ⊑2) is:
- monotone (order-preserving): ∀l,l' ∈ L1: l ⊑1 l' ⟹ f(l) ⊑2 f(l')
  - special case f: L → L: ∀l,l' ∈ L: l ⊑ l' ⟹ f(l) ⊑ f(l')
- distributive (a join morphism): ∀l,l' ∈ L1: f(l ⊔ l') = f(l) ⊔ f(l')

Function Properties: Examples

(Diagram: a function drawn as arrows on the powerset lattice of {1,2,3}.)

Function Properties: Examples {2} {1,2} {2,3} {1,2,3} {2} {1,2} {2,3} {1,2,3} 28

Function Properties: Examples {2} {1,2} {2,3} {1,2,3} {2} {1,2} {2,3} {1,2,3} 29

Function Properties: Examples  {1} {2} {3} {1,2} {1,3} {2,3} {1,2,3}  {1} {2} {3} {1,2} {1,3} {2,3} {1,2,3} 30

Fixed Points

Consider a complete lattice L = (L, ⊑, ⊔, ⊓, ⊥, ⊤) and a monotone function f: L → L. An element l ∈ L is:
- a fixed point iff f(l) = l
- a pre-fixedpoint iff l ⊑ f(l)
- a post-fixedpoint iff f(l) ⊑ l

Fix(f) = the set of all fixed points
Ext(f) = the set of all pre-fixedpoints
Red(f) = the set of all post-fixedpoints

(Diagram: ⊥ and ⊤ at the extremes; the iterates f^n(⊥) climb toward lfp and the iterates f^n(⊤) descend toward gfp, with Fix(f) lying between Ext(f) and Red(f).)

Tarski's Theorem

Consider a complete lattice L = (L, ⊑, ⊔, ⊓, ⊥, ⊤) and a monotone function f: L → L. Then lfp(f) and gfp(f) exist, and:
- lfp(f) = ⊓Red(f) ∈ Fix(f)
- gfp(f) = ⊔Ext(f) ∈ Fix(f)

Kleene's Fixedpoint Theorem

Consider a complete lattice L = (L, ⊑, ⊔, ⊓, ⊥, ⊤) and a continuous function f: L → L. Then:
lfp(f) = ⊔_{n ∈ ℕ} f^n(⊥)
- a function f is continuous if for every increasing chain Y ⊆ L, f(⊔Y) = ⊔{ f(y) | y ∈ Y }
- monotone functions on posets satisfying the ACC are continuous:
  - every ascending chain eventually stabilizes: l0 ⊑ l1 ⊑ … ⊑ ln = ln+1 = …, hence ln is the least upper bound of { l0, l1, …, ln }, thus f(⊔Y) = f(ln)
  - from monotonicity of f: f(l0) ⊑ f(l1) ⊑ … ⊑ f(ln) = f(ln+1) = …, hence f(ln) is the least upper bound of { f(l0), f(l1), …, f(ln) }, thus ⊔{ f(y) | y ∈ Y } = f(ln)

An Algorithm for Computing lfp

Kleene's fixed point theorem gives a constructive method for computing the lfp: lfp(f) = ⊔_{n ∈ ℕ} f^n(⊥)

l := ⊥
while f(l) ≠ l do
  l := f(l)

In our case, this can be computed efficiently using a worklist algorithm for "chaotic iteration" (a sketch of the naive loop follows below).
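A minimal sketch of the naive loop in Python, assuming the lattice satisfies the ACC so the iteration terminates; the toy grow function is an illustrative stand-in for f:

    def lfp(f, bottom):
        """Naive Kleene iteration: l = bottom; while f(l) != l: l = f(l)."""
        l = bottom
        while True:
            nxt = f(l)
            if nxt == l:
                return l
            l = nxt

    # Tiny usage: the least set containing 0 and closed under n -> n+1 for n < 3.
    grow = lambda s: s | {0} | {n + 1 for n in s if n < 3}
    print(lfp(grow, frozenset()))   # frozenset({0, 1, 2, 3})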

Finally…

Reaching Definitions…

in(1) = { (x,?), (y,?), (z,?) }     out(1) = in(1) \ { (y,l) | l ∈ Lab } ∪ { (y,1) }
in(2) = out(1)                      out(2) = in(2) \ { (z,l) | l ∈ Lab } ∪ { (z,2) }
in(3) = out(2) ∪ out(5)             out(3) = in(3)
in(4) = out(3)                      out(4) = in(4) \ { (z,l) | l ∈ Lab } ∪ { (z,4) }
in(5) = out(4)                      out(5) = in(5) \ { (y,l) | l ∈ Lab } ∪ { (y,5) }
in(6) = out(3)                      out(6) = in(6) \ { (y,l) | l ∈ Lab } ∪ { (y,6) }

F: (P(Var × Lab))^12 → (P(Var × Lab))^12

l := ⊥
while F(l) ≠ l do
  l := F(l)

Algorithm Revisited
- input: the equation system for RD
- output: lfp(F)
- algorithm:
  RD1 = ∅, …, RD12 = ∅
  while RDj ≠ Fj(RD1, …, RD12) for some j:
    RDj = Fj(RD1, …, RD12)
- idea: non-deterministically select which component of RD should make a step
- the selection can be made efficient based on dependencies: we know which equations depend on the RDj that was just updated, and only those need to be recomputed on every step (see the worklist sketch below)
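A sketch of chaotic iteration with a worklist, under the assumption that the equation system is given as one function per component together with a map saying which components each equation reads; all names here are illustrative, not the lecture's notation:

    from collections import deque

    def chaotic(eqs, deps, bottom):
        """eqs[j](env) recomputes component j; deps[j] lists the components j reads."""
        env = {j: set(bottom) for j in eqs}
        # users[i] = components whose equation mentions component i
        users = {i: [j for j in eqs if i in deps[j]] for i in eqs}
        work = deque(eqs)
        while work:
            j = work.popleft()
            new = eqs[j](env)
            if new != env[j]:
                env[j] = new
                work.extend(u for u in users[j] if u not in work)
        return env

    # Usage on a tiny two-component system: x = {1} union y, y = x
    eqs = {"x": lambda e: {1} | e["y"], "y": lambda e: set(e["x"])}
    deps = {"x": ["y"], "y": ["x"]}
    print(chaotic(eqs, deps, set()))   # {'x': {1}, 'y': {1}}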

But…
- where did the transfer functions come from?
  e.g., out(1) = in(1) \ { (y,l) | l ∈ Lab } ∪ { (y,1) }
- how do we know the transfer functions really over-approximate the concrete operations?

Trace Semantics
- note that the input (x) is unknown
- what about the while loop?
- the trace semantics is not computable

[y := x]1; [z := 1]2;
while [y > 0]3 (
  [z := z * y]4;
  [y := y − 1]5;
)
[y := 0]6

(Traces: σ0 → [y := x]1 → σ1 → [z := 1]2 → σ2 → [y > 0]3 → …, one trace for every possible initial value of x.)

Instrumented Trace Semantics
- on every assignment, record the label at which the assignment was performed

Abstract Interpretation

Instrumented trace semantics of the running example:

1: y := x;
2: z := 1;
3: while y > 0 {
4:   z := z * y;
5:   y := y − 1
}
6: y := 0

(x,?)(y,?)(z,?)(y,1)(z,2)(y,6)
(x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(y,6)
(x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(z,4)(y,5)(y,6)
(x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(z,4)(y,5)(z,4)(y,5)(y,6)
… one trace per number of loop iterations.

Collecting Semantics

(Diagram: the CFG annotated, at each program point, with the set of instrumented traces reaching it. For example, at the loop head the set contains (x,?)(y,?)(z,?)(y,1)(z,2), (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5), (x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(z,4)(y,5), …; after label 6 every trace ends with (y,6).)

System of Equations (over traces)

in(1) = { (x,?)(y,?)(z,?) }
in(2) = out(1)
in(3) = out(2) ∪ out(5)
in(4) = out(3)
in(5) = out(4)
in(6) = out(3)
out(1) = { σ·(y,1) | σ ∈ in(1) }
out(2) = { σ·(z,2) | σ ∈ in(2) }
out(3) = in(3)
out(4) = { σ·(z,4) | σ ∈ in(4) }
out(5) = { σ·(y,5) | σ ∈ in(5) }
out(6) = { σ·(y,6) | σ ∈ in(6) }

G: (P(Trace))^12 → (P(Trace))^12, where Trace = (Var × Lab)*
TR⊥ ⊑ G(TR⊥) ⊑ G(G(TR⊥)) ⊑ … ⊑ G^n(TR⊥) ⊑ … ⊑ lfp(G)

Galois Connection

(Diagram: concrete sets C on one side, abstract elements A on the other, with α mapping C up to the abstract domain and γ mapping A back to a concrete set.)

α(C) ⊑ A  ⟺  C ⊆ γ(A)

Reaching Definitions: Abstraction
- DOM(σ) = the variables appearing in σ
- SRD(σ)(v) = l iff the rightmost pair for v in σ is (v,l)
- α(C) = { (v, SRD(σ)(v)) | v ∈ DOM(σ), σ ∈ C }
- γ(A) = { σ | ∀v ∈ DOM(σ): (v, SRD(σ)(v)) ∈ A }

Example (set of traces → set of (var,lab) pairs; a small sketch of α follows below):
(x,?)(y,?)(z,?)(y,1)(z,2)(y,6)                       ↦ { (x,?), (z,2), (y,6) }
(x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(y,6)             ↦ { (x,?), (z,4), (y,6) }
(x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(z,4)(y,5)(y,6)   ↦ { (x,?), (z,4), (y,6) }
α of the whole set: { (x,?), (z,2), (z,4), (y,6) }
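A sketch of this abstraction in Python: SRD(σ)(v) is the label of the rightmost pair (v,l) in the trace σ, and α collects these pairs over a set of traces. Modeling traces as tuples of (var, label) pairs is an illustrative assumption:

    def srd(trace):
        """Map each variable to the label of its last assignment in the trace."""
        last = {}
        for (v, l) in trace:
            last[v] = l
        return last

    def alpha(traces):
        """alpha(C) = { (v, SRD(sigma)(v)) | sigma in C, v in DOM(sigma) }."""
        return {(v, l) for t in traces for (v, l) in srd(t).items()}

    t1 = (("x", "?"), ("y", "?"), ("z", "?"), ("y", 1), ("z", 2), ("y", 6))
    print(alpha({t1}))   # {('x', '?'), ('z', 2), ('y', 6)}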

Reaching Definitions: Concretization
- DOM(σ) = the variables appearing in σ
- SRD(σ)(v) = l iff the rightmost pair for v in σ is (v,l)
- α(C) = { (v, SRD(σ)(v)) | v ∈ DOM(σ), σ ∈ C }
- γ(A) = { σ | ∀v ∈ DOM(σ): (v, SRD(σ)(v)) ∈ A }

Example: γ({ (x,?), (z,4), (y,6) }) contains, among infinitely many others:
(x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(y,6)
(x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)(z,4)(y,5)(y,6)
(x,?)(y,?)(z,?)(y,1)(z,2)(z,4)(y,5)…(z,4)(y,5)(y,6)
…

Best (Induced) Transformer

How do we figure out the abstract effect ⟦S⟧# of a statement S?

(Diagram: C maps to C' by the concrete transformer ⟦S⟧; A should map to A' by the abstract transformer ⟦S⟧#, with α and γ connecting the two domains.)

Sound Abstract Transformer

(Same diagram.) An abstract transformer ⟦S⟧# is sound when

α(⟦S⟧(γ(A))) ⊑ ⟦S⟧#(A)

The left-hand side, Aopt = α ∘ ⟦S⟧ ∘ γ, is the best (most precise) sound transformer.

Soundness of Induced Analysis

Concrete: TR⊥ ⊑ G(TR⊥) ⊑ G(G(TR⊥)) ⊑ … ⊑ G^n(TR⊥) ⊑ … ⊑ lfp(G)
Abstract:  RD⊥ ⊑ F(RD⊥) ⊑ F(F(RD⊥)) ⊑ … ⊑ lfp(F)

α(lfp(G)) ⊑ lfp(α ∘ G ∘ γ) ⊑ lfp(F)

Back to a bit of dataflow analysis…

Recap
- represent properties of a program using a lattice (L, ⊑, ⊔, ⊓, ⊥, ⊤)
- a continuous function f: L → L
  - a monotone function on L satisfying the ACC is continuous
- Kleene's fixedpoint theorem: lfp(f) = ⊔_{n ∈ ℕ} f^n(⊥)
  - a constructive method for computing the lfp

Some required notation

blocks : Stmt → P(Blocks)
blocks([x := a]^lab) = { [x := a]^lab }
blocks([skip]^lab) = { [skip]^lab }
blocks(S1; S2) = blocks(S1) ∪ blocks(S2)
blocks(if [b]^lab then S1 else S2) = { [b]^lab } ∪ blocks(S1) ∪ blocks(S2)
blocks(while [b]^lab do S) = { [b]^lab } ∪ blocks(S)

FV : (BExp ∪ AExp) → P(Var), the variables used in an expression
AExp(a) = all non-unit subexpressions of the arithmetic expression a; similarly AExp(b) for a boolean expression b
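A sketch of FV and AExp over a tiny expression AST in Python; representing expressions as variable-name strings, integer literals, or ("op", e1, e2) tuples is an illustrative assumption, not the lecture's notation:

    def fv(e):
        """Free variables of an expression."""
        if isinstance(e, str):
            return {e}
        if isinstance(e, int):
            return set()
        _, e1, e2 = e
        return fv(e1) | fv(e2)

    def aexp(e):
        """All non-unit (composite) subexpressions of e."""
        if isinstance(e, (str, int)):
            return set()
        _, e1, e2 = e
        return {e} | aexp(e1) | aexp(e2)

    e = ("+", "a", "b")           # a + b
    assert fv(e) == {"a", "b"}
    assert aexp(e) == {("+", "a", "b")}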

Available Expressions Analysis

For each program point: which expressions must have already been computed, and not later modified, on all paths to the program point.

[x := a+b]1;
[y := a*b]2;
while [y > a+b]3 (
  [a := a + 1]4;
  [x := a + b]5
)

(a+b) is always available at label 3.

Available Expressions Analysis
- property space: in_AE, out_AE : Lab → P(AExp), mapping a label to the set of arithmetic expressions available at that label
- dataflow equations:
  - flow equations: how to join incoming dataflow facts
  - effect equations: given an input set of expressions S, what is the effect of a statement

Available Expressions Analysis

in_AE(lab) = ∅                                      when lab is the initial label
in_AE(lab) = ⋂ { out_AE(lab') | lab' ∈ pred(lab) }  otherwise

Block            out(lab)
[x := a]^lab     in(lab) \ { a' ∈ AExp | x ∈ FV(a') } ∪ { a' ∈ AExp(a) | x ∉ FV(a') }
[skip]^lab       in(lab)
[b]^lab          in(lab) ∪ AExp(b)

From now on we drop the AE subscript when clear from context.

Transfer Functions

[x := a+b]1; [y := a*b]2; while [y > a+b]3 ( [a := a + 1]4; [x := a + b]5 )

in(1) = ∅
in(2) = out(1)
in(3) = out(2) ∩ out(5)
in(4) = out(3)
in(5) = out(4)

out(1) = in(1) ∪ { a+b }
out(2) = in(2) ∪ { a*b }
out(3) = in(3) ∪ { a+b }
out(4) = in(4) \ { a+b, a*b, a+1 }
out(5) = in(5) ∪ { a+b }

Solution

in(1) = ∅
in(2) = out(1) = { a+b }
out(2) = { a+b, a*b }
in(3) = { a+b }
in(4) = out(3) = { a+b }
out(4) = ∅
out(5) = { a+b }

Kill/Gen

out(lab) = in(lab) \ kill(B_lab) ∪ gen(B_lab), where B_lab is the block at label lab

Block            kill                           gen
[x := a]^lab     { a' ∈ AExp | x ∈ FV(a') }     { a' ∈ AExp(a) | x ∉ FV(a') }
[skip]^lab       ∅                              ∅
[b]^lab          ∅                              AExp(b)

(A sketch of this form follows below.)
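A sketch of the kill/gen form for available expressions, reusing the fv and aexp helpers from the earlier sketch; the transfer for [x := a]^lab kills every expression mentioning x and generates the subexpressions of a that do not themselves use x:

    def ae_assign(in_set, x, a):
        """Transfer for [x := a]: out = (in \ kill) | gen."""
        kill = {e for e in in_set if x in fv(e)}
        gen = {e for e in aexp(a) if x not in fv(e)}
        return (in_set - kill) | gen

    def ae_test(in_set, b):
        """Transfer for [b]: all subexpressions of b become available."""
        return in_set | aexp(b)

    # [x := a+b] starting from no available expressions: a+b becomes available.
    print(ae_assign(set(), "x", ("+", "a", "b")))   # {('+', 'a', 'b')}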

Why the solution with the largest sets?

[z := x+y]1;
while [true]2 (
  [skip]3;
)

in(1) = ∅
in(2) = out(1) ∩ out(3)
in(3) = out(2)
out(1) = in(1) ∪ { x+y }
out(2) = in(2)
out(3) = in(3)

After simplification: in(2) = in(2) ∩ { x+y }
Solutions: { x+y } or ∅

Reaching Definitions Revisited

For each program point: which assignments may have been made and not overwritten, when program execution reaches this point along some path.

Block            out(lab)                                        kill                     gen
[x := a]^lab     in(lab) \ { (x,l) | l ∈ Lab } ∪ { (x,lab) }     { (x,l) | l ∈ Lab }      { (x,lab) }
[skip]^lab       in(lab)                                         ∅                        ∅
[b]^lab          in(lab)                                         ∅                        ∅

Why the solution with the smallest sets?

[z := x+y]1;
while [true]2 (
  [skip]3;
)

in(1) = { (x,?), (y,?), (z,?) }
in(2) = out(1) ∪ out(3)
in(3) = out(2)
out(1) = (in(1) \ { (z,?) }) ∪ { (z,1) }
out(2) = in(2)
out(3) = in(3)

After simplification: in(2) = in(2) ∪ { (x,?), (y,?), (z,1) }
Many solutions: any superset of { (x,?), (y,?), (z,1) }

Live Variables

For each program point: which variables may be live at the exit from the point.

[x := 2]1; [y := 4]2; [x := 1]3;
(if [y > x]4 then [z := y]5 else [z := y*y]6);
[x := z]7

Live Variables

[x := 2]1; [y := 4]2; [x := 1]3;
(if [y > x]4 then [z := y]5 else [z := y*y]6);
[x := z]7

CFG: [1: x:=2] → [2: y:=4] → [3: x:=1] → [4: y>x];  [4] → [5: z:=y] and [4] → [6: z:=y*y];  [5],[6] → [7: x:=z]

Live Variables

Block            kill      gen
[x := a]^lab     { x }     FV(a)
[skip]^lab       ∅         ∅
[b]^lab          ∅         FV(b)

(Same program and CFG as on the previous slide.)

Live Variables: Solution

in(1) = ∅          out(1) = ∅
in(2) = ∅          out(2) = { y }
in(3) = { y }      out(3) = { x,y }
in(4) = { x,y }    out(4) = { y }
in(5) = { y }      out(5) = { z }
in(6) = { y }      out(6) = { z }
in(7) = { z }      out(7) = ∅

Why the solution with the smallest sets?

while [x>1]1 (
  [skip]2;
)
[x := x+1]3;

in(1) = out(1) ∪ { x }
in(2) = out(2)
in(3) = { x }
out(1) = in(2) ∪ in(3)
out(2) = in(1)
out(3) = ∅

After simplification: in(1) = in(1) ∪ { x }
Many solutions: any superset of { x }

Monotone Frameworks

in(lab) = Initial                                        when lab ∈ Entry labels
in(lab) = ⊔ { out(lab') | (lab', lab) ∈ CFG edges }      otherwise
out(lab) = f_lab(in(lab))

- ⊔ is ∪ or ∩
- CFG edges go either forward or backward
- Entry labels are either initial program labels or final program labels (when going backward)
- Initial is an initial state (or final, when going backward)
- f_lab is the transfer function associated with the block B_lab

(A generic solver sketch follows below.)
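A sketch of this scheme as a generic solver in Python: the instance supplies the labels, the edges already oriented for the analysis direction, the entry labels with their Initial value, the combine operation (union for may, intersection for must), and one transfer function per label; the encoding is illustrative:

    def solve(labels, edges, entries, initial, bottom, combine, transfer):
        """Round-robin solver: in(lab) = combine of out over incoming edges,
        out(lab) = transfer(lab, in(lab)). `bottom` is the first approximation
        of every out-set (use the full set for a must analysis)."""
        preds = {l: [s for (s, t) in edges if t == l] for l in labels}
        out = {l: set(bottom) for l in labels}
        changed = True
        while changed:
            changed = False
            for l in labels:
                in_l = set(initial) if l in entries else combine([out[p] for p in preds[l]], bottom)
                new = transfer(l, in_l)
                if new != out[l]:
                    out[l] = new
                    changed = True
        return out

    def may_join(sets, bottom):
        """Combine for a may analysis: union (empty predecessor list gives the empty set)."""
        acc = set()
        for s in sets:
            acc |= s
        return acc

    # Reaching definitions for "1: z := x+y; while 2: true { 3: skip }":
    labels, edges = [1, 2, 3], [(1, 2), (2, 3), (3, 2)]
    def rd(l, s):
        return (s - {("z", k) for k in ["?", 1, 2, 3]}) | {("z", 1)} if l == 1 else s
    print(solve(labels, edges, {1}, {("x", "?"), ("y", "?"), ("z", "?")}, set(), may_join, rd))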

Forward vs. Backward Analyses

(Diagram: the same CFG analyzed twice. Forward, for reaching definitions, facts such as { (x,?), (y,?), (z,?) }, { (x,1), (y,?), (z,?) }, { (x,1), (y,2), (z,?) } flow down the edges; backward, for live variables, facts such as { z } and { y } flow up the edges.)

Must vs. May Analyses
- when ⊔ is ∩, we have a must analysis
  - we want the largest sets that solve the equation system
  - properties hold on all paths reaching a label (exiting a label, for backward analyses)
- when ⊔ is ∪, we have a may analysis
  - we want the smallest sets that solve the equation system
  - properties hold on at least one path reaching a label (exiting a label, for backward analyses)

Example: Reaching Definitions
- L = P(Var × Lab), partially ordered by ⊆
- ⊔ is ∪
- L satisfies the ascending chain condition because Var × Lab is finite (for a given program)

Example: Available Expressions
- L = P(AExp), partially ordered by ⊇
- ⊔ is ∩
- L satisfies the ascending chain condition because AExp is finite (for a given program)

Analyses Summary

                  Reaching Definitions      Available Expressions    Live Variables
L                 P(Var × Lab)              P(AExp)                  P(Var)
⊔                 ∪                         ∩                        ∪
⊥                 ∅                         AExp                     ∅
Initial           { (x,?) | x ∈ Var }       ∅                        ∅
Entry labels      { init }                  { init }                 final
Direction         forward                   forward                  backward
F                 { f: L → L | ∃k,g: f(val) = (val \ k) ∪ g }   (all three)
f_lab             f_lab(val) = (val \ kill) ∪ gen                (all three)

Analyses as Monotone Frameworks
- property space: a powerset, clearly a complete lattice
- transformers: kill/gen form, monotone functions (let's show it)

Monotonicity of Kill/Gen Transformers
- we have to show that x ⊆ x' implies f(x) ⊆ f(x')
- assume x ⊆ x'; then for kill set k and gen set g: (x \ k) ∪ g ⊆ (x' \ k) ∪ g
- technically, since we want to show it for all functions in F, we also have to show that the set is closed under function composition

Distributivity of Kill/Gen Transformers
- we have to show that f(x ⊔ y) = f(x) ⊔ f(y):
  f(x ∪ y) = ((x ∪ y) \ k) ∪ g
           = ((x \ k) ∪ (y \ k)) ∪ g
           = ((x \ k) ∪ g) ∪ ((y \ k) ∪ g)
           = f(x) ∪ f(y)
- uses distributivity of \ and ∪
- works regardless of whether ⊔ is ∪ or ∩
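A quick randomized sanity check, not from the lecture, that a kill/gen transformer f(x) = (x \ k) ∪ g distributes over both ∪ and ∩:

    import random

    def f(x, k, g):
        return (x - k) | g

    universe = range(8)
    for _ in range(1000):
        x, y, k, g = ({v for v in universe if random.random() < 0.4} for _ in range(4))
        assert f(x | y, k, g) == f(x, k, g) | f(y, k, g)
        assert f(x & y, k, g) == f(x, k, g) & f(y, k, g)
    print("kill/gen transformers distribute over union and intersection")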

Constant Propagation

(Diagram: the flat lattice on ℤ, with ⊥ at the bottom, the integers …, -2, -1, 0, 1, 2, … pairwise incomparable in the middle, and ⊤ at the top.)

⊤ means no information: the variable is not a constant.
State = (Var → ℤ ∪ {⊥, ⊤})

Constant Propagation
- L = ((Var → ℤ ∪ {⊥, ⊤}), ⊑)
- σ1 ⊑ σ2 iff ∀x: σ1(x) ⊑ σ2(x), using the ordering of the flat ℤ lattice
- examples:
  [x ↦ ⊤, y ↦ 42, z ↦ ⊥] ⊑ [x ↦ ⊤, y ↦ 42, z ↦ 73]
  [x ↦ ⊤, y ↦ 42, z ↦ 73] ⊑ [x ↦ ⊤, y ↦ 42, z ↦ ⊤]

Constant Propagation

Abstract evaluation A: AExp → (State → ℤ ∪ {⊥, ⊤}):
A⟦x⟧σ = ⊥ if σ = ⊥, and σ(x) otherwise
A⟦n⟧σ = ⊥ if σ = ⊥, and n otherwise
A⟦a1 op a2⟧σ = A⟦a1⟧σ op# A⟦a2⟧σ

Block            out(σ)
[x := a]^lab     ⊥ if σ = ⊥, and σ[x ↦ A⟦a⟧σ] otherwise
[skip]^lab       σ
[b]^lab          σ

Example

[x := 42]1;
[y := 73]2;
(if [?]3 then
  [z := x+y]4
else
  [z := 12]5);
[w := z]6

States along the analysis (⊤ where nothing is known):
[x ↦ ⊤, y ↦ ⊤, z ↦ ⊤, w ↦ ⊤]       entry
[x ↦ 42, y ↦ ⊤, z ↦ ⊤, w ↦ ⊤]      after 1
[x ↦ 42, y ↦ 73, z ↦ ⊤, w ↦ ⊤]     after 2
[x ↦ 42, y ↦ 73, z ↦ 115, w ↦ ⊤]   after 4 (then branch)
[x ↦ 42, y ↦ 73, z ↦ 12, w ↦ ⊤]    after 5 (else branch)
[x ↦ 42, y ↦ 73, z ↦ 12, w ↦ 12]
[x ↦ 42, y ↦ 73, z ↦ ⊤, w ↦ 12]
At the join of the two branches, z ↦ 115 ⊔ 12 = ⊤.

Constant Propagation is Non-Distributive
- consider the transformer f = ⟦y := x*x⟧#
- consider two states σ1, σ2 with σ1(x) = 1 and σ2(x) = -1, so (σ1 ⊔ σ2)(x) = ⊤
- f(σ1 ⊔ σ2) maps y to ⊤
- f(σ1) maps y to 1, f(σ2) maps y to 1, so f(σ1) ⊔ f(σ2) maps y to 1
- hence f(σ1 ⊔ σ2) ≠ f(σ1) ⊔ f(σ2): joining before the transformer loses the fact that y = 1 on both branches (see the sketch below)
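A sketch reproducing this example in Python; the flat-value encoding with the strings "top"/"bot" and the eval function are illustrative assumptions:

    TOP, BOT = "top", "bot"

    def join_val(a, b):
        """Join in the flat lattice: bot is neutral, unequal constants go to top."""
        if a == BOT:
            return b
        if b == BOT:
            return a
        return a if a == b else TOP

    def join_state(s1, s2):
        return {v: join_val(s1[v], s2[v]) for v in s1}

    def eval_mul(a, b):
        if BOT in (a, b):
            return BOT
        if TOP in (a, b):
            return TOP
        return a * b

    def f(state):
        """Abstract transformer for y := x * x."""
        out = dict(state)
        out["y"] = eval_mul(state["x"], state["x"])
        return out

    s1 = {"x": 1, "y": TOP}
    s2 = {"x": -1, "y": TOP}
    print(f(join_state(s1, s2))["y"])         # top: x is already top after the join
    print(join_val(f(s1)["y"], f(s2)["y"]))   # 1: each branch gives y = 1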