CSE 231 : Advanced Compilers Building Program Analyzers.



dataflow analysis

us/um/people/simonpj/papers/c--/dfopt.pdf


Now where were we…

Started with sets. Termination worries.

    for edge e in CFG:
        m[e] = EMPTY
    for node n in CFG:
        q.push(n)
    while not q.empty():
        n = q.pop()
        info_in = m[n.in_edges]
        info_out = F(n, info_in)
        for i from 0 to len(info_out):
            e = n.out_edges[i]
            new_info = m[e] UNION info_out[i]
            if m[e] != new_info:
                m[e] = new_info
                q.push(e.dest)

Foundations : Lattices
A lattice is (S, ⊑, ⊥, ⊤, ⊔, ⊓) where:
– (S, ⊑) is a poset
– ⊥ is the smallest thing in S
– ⊤ is the biggest thing in S
– lub(a, b) and glb(a, b) always exist
– a ⊔ b = lub(a, b)
– a ⊓ b = glb(a, b)

Foundations : Lattices (Formally)
A lattice is (S, ⊑, ⊥, ⊤, ⊔, ⊓) where:
– (S, ⊑) is a poset
– ∀ a ∈ S. ⊥ ⊑ a
– ∀ a ∈ S. a ⊑ ⊤
– ∀ a, b ∈ S. ∃ c. c = lub(a, b) ∧ a ⊔ b = c
– ∀ a, b ∈ S. ∃ c. c = glb(a, b) ∧ a ⊓ b = c

Foundations : Fancy Lattice Names
– ⊥ is “bottom”
– ⊤ is “top”
– ⊔ is “join”
– ⊓ is “meet”

Port to lattices. Small patch set.

    for edge e in CFG:
        m[e] = BOTTOM
    for node n in CFG:
        q.push(n)
    while not q.empty():
        n = q.pop()
        info_in = m[n.in_edges]
        info_out = F(n, info_in)
        for i from 0 to len(info_out):
            e = n.out_edges[i]
            new_info = m[e] JOIN info_out[i]
            if m[e] != new_info:
                m[e] = new_info
                q.push(e.dest)
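The lattice-based worklist loop can be sketched as runnable Python. This is one possible rendering, not the course's implementation: the CFG is assumed to be a list of node ids plus a list of `(src, dst)` edges, the flow function is assumed to take the number of outgoing edges as a third argument, and the four-node loop with reaching definitions is a made-up example.

```python
from collections import deque


def worklist_solve(nodes, edges, flow, bottom, join):
    """Worklist algorithm over a lattice given by `bottom` and `join`.

    flow(n, info_in, num_out) returns one value per outgoing edge of n.
    """
    in_edges = {n: [e for e in edges if e[1] == n] for n in nodes}
    out_edges = {n: [e for e in edges if e[0] == n] for n in nodes}
    m = {e: bottom for e in edges}
    q = deque(nodes)
    while q:
        n = q.popleft()
        info_in = [m[e] for e in in_edges[n]]
        info_out = flow(n, info_in, len(out_edges[n]))
        for e, out in zip(out_edges[n], info_out):
            new_info = join(m[e], out)   # the join in the middle of the loop
            if new_info != m[e]:
                m[e] = new_info
                q.append(e[1])           # revisit the edge's destination
    return m


# Hypothetical example: node 1 is `x := 1`, node 3 is `x := x + 1`,
# node 2 is a loop head and node 4 is the exit.
DEFS = {1: "x", 3: "x"}
NODES = [1, 2, 3, 4]
EDGES = [(1, 2), (2, 3), (3, 2), (2, 4)]


def reaching_defs_flow(n, info_in, num_out):
    """Facts are (variable, defining node) pairs."""
    merged = frozenset().union(*info_in)
    if n in DEFS:
        var = DEFS[n]
        merged = frozenset(d for d in merged if d[0] != var) | {(var, n)}
    return [merged] * num_out


result = worklist_solve(NODES, EDGES, reaching_defs_flow,
                        frozenset(), frozenset.union)
```

Here `bottom` is the empty set and `join` is set union, which recovers the original set-based version of the algorithm.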

Termination. Finite lattice height implies termination.

    while not q.empty():
        n = q.pop()
        info_in = m[n.in_edges]
        info_out = F(n, info_in)
        for i from 0 to len(info_out):
            e = n.out_edges[i]
            new_info = m[e] JOIN info_out[i]
            if m[e] != new_info:
                m[e] = new_info
                q.push(e.dest)

But can we do better? Get rid of that JOIN right in the middle? Yes. The trick is in the flow functions.

Safely Getting Rid Of The Join
In general, we can’t remove the join and still have termination. So, when is it OK? Port our algorithm to math to figure it out. Build it in terms of fixpoints.

Fixpoints Are Easy
Fixpoint: an input equal to the output. A fixpoint of F is any X such that F(X) = X. “Best” answer, since repeating yields the same result.
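As a toy illustration of the definition, the sketch below repeatedly applies a function until the output equals the input. The graph and the `step` function are invented for the example and are unrelated to any particular dataflow analysis.

```python
def iterate_to_fixpoint(f, start, max_steps=1000):
    """Apply f repeatedly until f(x) == x, i.e. until x is a fixpoint of f."""
    x = start
    for _ in range(max_steps):
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt
    raise RuntimeError("no fixpoint reached within max_steps")


# Hypothetical example: grow the set of nodes reachable from "a".
GRAPH = {"a": ["b"], "b": ["c"], "c": ["a"], "d": []}


def step(reached):
    """Add every successor of an already-reached node."""
    return reached | frozenset(s for n in reached for s in GRAPH[n])


reachable = iterate_to_fixpoint(step, frozenset({"a"}))
```

Once `step(reachable) == reachable`, applying `step` again cannot change anything, which is the sense in which repeating yields the same result.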

Using Fixpoints In Analysis
Goal: compute map m from CFG edges to dataflow information.
Strategy: define a global flow function F as follows: F takes a map m as a parameter and returns a new map m’, in which individual local flow functions have been applied.
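One way to picture the global flow function is as a wrapper that runs every node's local flow function once against the current map. The sketch below assumes a CFG given as node ids and `(src, dst)` edges, and a local flow function with the hypothetical signature `local_flow(n, info_in, num_out)`; the reaching-definitions example is made up.

```python
def make_global_flow(nodes, edges, local_flow):
    """Build F: (edge -> info) map  ->  new (edge -> info) map."""
    in_edges = {n: [e for e in edges if e[1] == n] for n in nodes}
    out_edges = {n: [e for e in edges if e[0] == n] for n in nodes}

    def F(m):
        m_prime = {}
        for n in nodes:
            info_in = [m[e] for e in in_edges[n]]
            info_out = local_flow(n, info_in, len(out_edges[n]))
            # Every edge has exactly one source node, so each edge of
            # m_prime is written exactly once.
            for e, out in zip(out_edges[n], info_out):
                m_prime[e] = out
        return m_prime

    return F


# Hypothetical example: node 1 is `x := 1`, node 3 is `x := x + 1`.
DEFS = {1: "x", 3: "x"}
NODES = [1, 2, 3, 4]
EDGES = [(1, 2), (2, 3), (3, 2), (2, 4)]


def reaching_defs_flow(n, info_in, num_out):
    merged = frozenset().union(*info_in)
    if n in DEFS:
        var = DEFS[n]
        merged = frozenset(d for d in merged if d[0] != var) | {(var, n)}
    return [merged] * num_out


F = make_global_flow(NODES, EDGES, reaching_defs_flow)
bottom_map = {e: frozenset() for e in EDGES}
after_one_step = F(bottom_map)
```

Note that F reads only from `m` and writes only to `m_prime`, so one application of F corresponds to one simultaneous round of all local flow functions.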

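The Big F [Figure: F takes map m to map m’]

Just Regular Flow Funcs Inside [Figure: inside F, local flow functions f1, f2, f3 take m to m’]

Goal: Find Fixpoint of F [Figure: F applied repeatedly to m, then to m’, and so on]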

Fixpoint of F
Goal: a fixed point of F, i.e. m where m = F(m).
How should we do this?
Let $\tilde{\bot}$ be ⊥ lifted to a map: $\tilde{\bot} = \lambda e.\ \bot$
Compute $F(\tilde{\bot})$, then $F(F(\tilde{\bot}))$, then $F(F(F(\tilde{\bot})))$, ... until the result doesn’t change
… but how long could that take ???
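The iteration from the bottom map can be sketched directly. The example below hard-codes a small hypothetical CFG and a reaching-definitions F; `EDGES`, `DEFS`, and `lfp_by_iteration` are illustrative names, not part of the slides.

```python
# Hypothetical CFG: node 1 is `x := 1`, node 3 is `x := x + 1`,
# node 2 is a loop head and node 4 is the exit.
EDGES = [(1, 2), (2, 3), (3, 2), (2, 4)]
DEFS = {1: "x", 3: "x"}


def F(m):
    """Global flow function: apply each node's local flow function once."""
    new_m = {}
    for (src, dst) in EDGES:
        # Information entering src is the union over src's incoming edges.
        info = frozenset().union(*(m[e] for e in EDGES if e[1] == src))
        if src in DEFS:
            var = DEFS[src]
            info = frozenset(d for d in info if d[0] != var) | {(var, src)}
        new_m[(src, dst)] = info
    return new_m


def lfp_by_iteration(F, bottom_map):
    """Compute F(bottom), F(F(bottom)), ... until the result stops changing."""
    m, steps = bottom_map, 0
    while True:
        nxt = F(m)
        if nxt == m:
            return m, steps
        m, steps = nxt, steps + 1


bottom_map = {e: frozenset() for e in EDGES}   # bottom lifted to a map
solution, steps = lfp_by_iteration(F, bottom_map)
```

On this example the map stops changing after two applications of F; in general the number of rounds depends on the lattice height and the shape of the CFG.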

Fixpoint of F : Formal Solution
Solution: $\bigsqcup_{i=0}^{\infty} F^i(\tilde{\bot})$
We want $F^1(\tilde{\bot}) \sqsubseteq F^2(\tilde{\bot}) \sqsubseteq F^3(\tilde{\bot}) \sqsubseteq \dots \sqsubseteq F^k(\tilde{\bot})$. This allows us to eliminate the big join.
Just require F to be monotonic: ∀ a, b. a ⊑ b ⇒ F(a) ⊑ F(b)

Back To Termination
Solution: if F is monotonic, we have it. Finite lattice height ⇒ termination w/out joins!
OK. But how do we know F is monotonic? F is monotonic if the flow functions are monotonic.
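On a small finite lattice, monotonicity of a flow function can be checked by brute force. The sketch below uses the powerset of three made-up facts; `gen_kill` is a typical gen/kill-style function and `flip` is a deliberately non-monotonic counterexample, both invented for illustration.

```python
from itertools import combinations


def powerset(universe):
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def subset(a, b):
    return a <= b


def is_monotonic(f, elements, leq):
    """Check: for all a, b with a <= b, f(a) <= f(b)."""
    return all(leq(f(a), f(b))
               for a in elements for b in elements if leq(a, b))


ELEMS = powerset({"d1", "d2", "d3"})


def gen_kill(info):
    """Kill d1, generate d2: monotonic."""
    return (info - {"d1"}) | {"d2"}


def flip(info):
    """Toggle d3: not monotonic, and iterating it oscillates forever."""
    return info - {"d3"} if "d3" in info else info | {"d3"}
```

With `flip`, repeated application alternates between two values and never reaches a fixpoint, which is the kind of behavior the monotonicity requirement rules out.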

Another benefit of monotonicity
Suppose Martians came to Earth and miraculously gave you a fixed point of F, call it fp. Then the fixed point we compute lies below fp: we are computing the least fixed point.
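The formulas on the original slides did not survive extraction. A plausible reconstruction of the standard argument, assuming F is monotonic and writing $\tilde{\bot}$ for ⊥ lifted to a map, runs as follows:

```latex
\begin{align*}
\tilde{\bot} &\sqsubseteq fp
  && \text{($\tilde{\bot}$ is below every map)} \\
F(\tilde{\bot}) &\sqsubseteq F(fp) = fp
  && \text{(monotonicity, then $fp$ is a fixed point)} \\
F^{i}(\tilde{\bot}) &\sqsubseteq fp \quad \text{for all } i \ge 0
  && \text{(induction on $i$)} \\
\bigsqcup_{i \ge 0} F^{i}(\tilde{\bot}) &\sqsubseteq fp
  && \text{(the join is the least upper bound)}
\end{align*}
```

Since fp was arbitrary, the result of the iteration is below every fixed point of F.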

Recap
Let’s do a recap of what we’ve seen so far.
Started with the worklist algorithm for reaching definitions.

Worklist algorithm for reaching defns

    let m: map from edge to computed value at edge
    let worklist: work list of nodes
    for each edge e in CFG do
        m(e) := ∅
    for each node n do
        worklist.add(n)
    while (worklist.empty.not) do
        let n := worklist.remove_any;
        let info_in := m(n.incoming_edges);
        let info_out := F(n, info_in);
        for i := 0 .. info_out.length do
            let new_info := m(n.outgoing_edges[i]) ∪ info_out[i];
            if (m(n.outgoing_edges[i]) ≠ new_info)
                m(n.outgoing_edges[i]) := new_info;
                worklist.add(n.outgoing_edges[i].dst);

Generalized algorithm using lattices

    let m: map from edge to computed value at edge
    let worklist: work list of nodes
    for each edge e in CFG do
        m(e) := ⊥
    for each node n do
        worklist.add(n)
    while (worklist.empty.not) do
        let n := worklist.remove_any;
        let info_in := m(n.incoming_edges);
        let info_out := F(n, info_in);
        for i := 0 .. info_out.length do
            let new_info := m(n.outgoing_edges[i]) ⊔ info_out[i];
            if (m(n.outgoing_edges[i]) ≠ new_info)
                m(n.outgoing_edges[i]) := new_info;
                worklist.add(n.outgoing_edges[i].dst);

Next step: removed outer join
Wanted to remove the outer join, while still providing a termination guarantee.
To do this, we re-expressed our algorithm more formally.
We first defined a “global” flow function F, and then expressed our algorithm as a fixed point computation.

Guarantees
– If F is monotonic, don’t need outer join
– If F is monotonic and height of lattice is finite: iterative algorithm terminates
– If F is monotonic, the fixed point we find is the least fixed point
Any questions?

What about if we start at top?
What if we start with ⊤, F(⊤), F(F(⊤)), F(F(F(⊤))), ...?
We get the greatest fixed point.
Why do we prefer the least fixed point?
– More precise
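The difference between the two fixed points shows up as soon as the CFG has a loop. In the sketch below, a hypothetical definition `("y", 9)` that never actually flows into the loop keeps circulating around it when the iteration starts from top. The CFG, `UNIVERSE`, and function names are all made up for the illustration.

```python
# Hypothetical CFG: node 1 is `x := 1`; nodes 2 and 3 form a loop with no
# assignments; node 4 is the exit.
EDGES = [(1, 2), (2, 3), (3, 2), (2, 4)]
DEFS = {1: "x"}
# ("y", 9) stands for a definition that exists somewhere in the program
# but never reaches this part of the CFG.
UNIVERSE = frozenset({("x", 1), ("y", 9)})


def F(m):
    """Global reaching-definitions flow function for the CFG above."""
    new_m = {}
    for (src, dst) in EDGES:
        info = frozenset().union(*(m[e] for e in EDGES if e[1] == src))
        if src in DEFS:
            var = DEFS[src]
            info = frozenset(d for d in info if d[0] != var) | {(var, src)}
        new_m[(src, dst)] = info
    return new_m


def iterate(F, start):
    m = start
    while True:
        nxt = F(m)
        if nxt == m:
            return m
        m = nxt


least = iterate(F, {e: frozenset() for e in EDGES})      # start at bottom
greatest = iterate(F, {e: UNIVERSE for e in EDGES})      # start at top
```

Both results are fixed points of F, but the one computed from top reports `("y", 9)` as reaching the exit only because the loop keeps re-justifying it, so the least fixed point is the more precise answer.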

Graphically [Figure: plot with axes x and y; the only surviving tick label is 10]

Another example: constant prop
Set $D = 2^{\{x \to N \;\mid\; x \in \mathit{Vars} \,\land\, N \in \mathbb{Z}\}}$
For a statement x := N (incoming edge in, outgoing edge out):
$F_{x := N}(\mathit{in}) = \mathit{in} - \{x \to *\} \cup \{x \to N\}$
For a statement x := y op z (incoming edge in, outgoing edge out):
$F_{x := y\ \mathit{op}\ z}(\mathit{in}) = \mathit{in} - \{x \to *\} \cup \{x \to N \mid (y \to N_1) \in \mathit{in} \,\land\, (z \to N_2) \in \mathit{in} \,\land\, N = N_1\ \mathit{op}\ N_2\}$
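The two flow functions translate almost directly into set operations. The following is a sketch under the assumption that dataflow facts are `(variable, constant)` pairs stored in a `frozenset`; the function names and the `OPS` table are illustrative.

```python
import operator

# Assumed set of supported binary operators.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def flow_assign_const(info, x, n):
    """x := N  -- remove every fact about x, then add x -> N."""
    return frozenset(f for f in info if f[0] != x) | {(x, n)}


def flow_assign_op(info, x, y, op, z):
    """x := y op z  -- x is constant only if both y and z are constant."""
    killed = frozenset(f for f in info if f[0] != x)
    # The generated facts are computed from the incoming set `info`,
    # so `x := x + 1` still sees the old value of x.
    gen = {(x, OPS[op](n1, n2))
           for (v1, n1) in info if v1 == y
           for (v2, n2) in info if v2 == z}
    return killed | gen


s1 = flow_assign_const(frozenset(), "a", 2)   # a := 2
s2 = flow_assign_const(s1, "b", 3)            # b := 3
s3 = flow_assign_op(s2, "c", "a", "+", "b")   # c := a + b
```

If either operand has no fact in the incoming set, the generated set is empty and x simply becomes unknown.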

Another example: constant prop
For a statement *x := y (incoming edge in, outgoing edge out):
$F_{*x := y}(\mathit{in}) = \mathit{in} - \{z \to * \mid z \in \text{may-point-to}(x)\} \cup \{z \to N \mid z \in \text{must-point-to}(x) \,\land\, (y \to N) \in \mathit{in}\} \cup \{z \to N \mid (y \to N) \in \mathit{in} \,\land\, (z \to N) \in \mathit{in}\}$
For a statement x := *y (incoming edge in, outgoing edge out):
$F_{x := *y}(\mathit{in}) = \mathit{in} - \{x \to *\} \cup \{x \to N \mid \forall z \in \text{may-point-to}(y).\ (z \to N) \in \mathit{in}\}$
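The pointer cases can be sketched the same way. This version assumes facts are `(variable, constant)` pairs and that pointer information arrives as two hypothetical dictionaries, `may_pt` and `must_pt`, mapping a pointer to a set of variable names. The load case quantifies over the variables that y may point to, which is an assumption about the intended formula, and an empty target set is treated as "nothing known".

```python
def flow_store(info, x, y, may_pt, must_pt):
    """*x := y"""
    y_vals = {n for (v, n) in info if v == y}
    # Drop facts about anything x may point to.
    kept = frozenset(f for f in info if f[0] not in may_pt[x])
    # Strong update: a variable x must point to receives y's value.
    strong = {(z, n) for z in must_pt[x] for n in y_vals}
    # A variable that already held y's value keeps it either way.
    unchanged = {(z, n) for (z, n) in info if n in y_vals}
    return kept | strong | unchanged


def flow_load(info, x, y, may_pt):
    """x := *y"""
    kept = frozenset(f for f in info if f[0] != x)
    targets = may_pt[y]
    if not targets:
        return kept   # assumption: no known targets means x is unknown
    candidates = {n for (_, n) in info}
    # x is constant only if every possible target holds the same constant.
    gen = {(x, n) for n in candidates
           if all((z, n) in info for z in targets)}
    return kept | gen


facts = frozenset({("a", 1), ("b", 1), ("c", 7), ("y", 1)})
weak = flow_store(facts, "p", "y", {"p": {"a", "c"}}, {"p": set()})
strong = flow_store(frozenset({("a", 3), ("y", 1)}), "p", "y",
                    {"p": {"a"}}, {"p": {"a"}})
loaded = flow_load(facts, "x", "q", {"q": {"a", "b"}})
```

In the weak-update case, `c` loses its constant because the store might overwrite it, while `a` keeps the value 1 because the store would write the same value.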

Another example: constant prop
For a statement x := f(...) (incoming edge in, outgoing edge out):
$F_{x := f(\ldots)}(\mathit{in}) = \emptyset$
For a statement *x := *y + *z (incoming edge in, outgoing edge out):
$F_{*x := *y + *z}(\mathit{in}) = F_{a := *y;\ b := *z;\ c := a + b;\ *x := c}(\mathit{in})$

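Another example: constant prop
[Figure: a branch node s: if (...) with incoming edge in and outgoing edges out[0], out[1]; a merge node with incoming edges in[0], in[1] and outgoing edge out]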


Lattice (D, ⊑, ⊥, ⊤, ⊔, ⊓) =