Data Flow Analysis 4 15-411 Compiler Design Nov. 8, 2005.


Data Flow Analysis Compiler Design Nov. 8, 2005

Example from last time. Nodes represent sequences of instructions with no branches. Edges represent control flow between nodes.

Constant Propagation. It is convenient to associate a pool of propagated constants with each node in the graph. A pool is a set of ordered pairs indicating which variables have constant values when the node is encountered. The pool at node B, denoted $P_B$, consists of the single element $(a, 1)$, since the assignment a := 1 must occur before B.

Constant Pools. Let $V$ be a set of variables, $C$ be a set of constants, and $N$ be the set of nodes in the graph. The set $U = V \times C$ represents the ordered pairs which may appear in any constant pool. All constant pools are elements of the power set of $U$, denoted $\mathcal{P}(U)$.

Constant Propagation (cont.). The fundamental problem of constant propagation is to determine the pool of constants for each node in a program graph. By inspection of the program graph for the example, the pool of constants at each node is $P_A = \emptyset$, $P_B = \{(a, 1)\}$, $P_C = \{(a, 1)\}$, $P_D = \{(a, 1), (b, 2)\}$, $P_E = \{(a, 1), (b, 2), (d, 3)\}$, $P_F = \{(a, 1), (b, 2), (d, 3)\}$.

Constant Propagation (cont.). $P_N$ may be determined for each node $N$ in the graph as follows. First, consider each path $(A, p_1, p_2, \ldots, p_n, N)$ and apply constant propagation along the path to obtain a set of constants at node $N$. Second, take the intersection of these sets over all paths to $N$; this is the set of constants which can be assumed for optimization. (It is unknown which path will be taken at execution time, so the intersection is the conservative choice.)

Constant Propagation (cont.). Successively longer paths from A to D can be evaluated, resulting in $P_{D,3}, P_{D,4}, \ldots, P_{D,n}$ for arbitrarily large $n$. The pool of constants that can be assumed no matter what flow of control occurs is the set of constants common to all $P_{D,i}$, i.e. $\bigcap_i P_{D,i}$. This procedure is not effective: the number of such paths may have no finite bound, so the procedure would not halt.

Optimizing Function for Constant Propagation. It is useful to define an optimizing function $f$ which maps an input pool together with a particular node to a new output pool. The optimizing function for constant propagation is $f: N \times \mathcal{P}(U) \to \mathcal{P}(U)$, where $(v, c) \in f(N, P)$ if and only if either (1) $(v, c) \in P$ and the operation at node $N$ does not assign a new value to the variable $v$, or (2) the operation at $N$ assigns an expression to the variable $v$, and the expression evaluates to the constant $c$.
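The two rules above can be sketched in a few lines of Python. This is only an illustration under assumptions: the node encoding (a target variable paired with an evaluator over the known constants) and the names `transfer`, `node_a` and `node_b` are hypothetical and do not come from the slides. The two sample nodes follow the running example, where A assigns a := 1 and B assigns c := 0.

```python
from typing import Callable, Dict, FrozenSet, Optional, Tuple

Pool = FrozenSet[Tuple[str, int]]
# Hypothetical node encoding: (assigned variable, evaluator over known constants).
Node = Tuple[str, Callable[[Dict[str, int]], Optional[int]]]


def transfer(node: Node, pool: Pool) -> Pool:
    """Apply the optimizing function f(N, P) to one assignment node."""
    target, evaluate = node
    known = dict(pool)
    # Rule 1: keep pairs whose variable is not reassigned at this node.
    result = {(v, c) for (v, c) in pool if v != target}
    # Rule 2: add (target, c) when the right-hand side evaluates to a constant c.
    value = evaluate(known)
    if value is not None:
        result.add((target, value))
    return frozenset(result)


# Nodes A and B of the running example: a := 1 and c := 0.
node_a: Node = ("a", lambda env: 1)
node_b: Node = ("c", lambda env: 0)

pool_after_a = transfer(node_a, frozenset())
pool_after_b = transfer(node_b, pool_after_a)
print(sorted(pool_after_a))  # [('a', 1)]
print(sorted(pool_after_b))  # [('a', 1), ('c', 0)]
```

Representing a pool as a frozenset of pairs keeps it close to the definition as an element of the power set of U, and lets set intersection serve directly as the meet.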

Optimization Function for Example. The optimizing function can be applied to node A with an empty constant pool, resulting in $f(A, \emptyset) = \{(a, 1)\}$. The function can then be applied to B with $\{(a, 1)\}$ as the constant pool, yielding $f(B, \{(a, 1)\}) = \{(a, 1), (c, 0)\}$.

Extending f to Paths in the Graph. Given a path from the entry node A to an arbitrary node N, the optimizing pool for the path is determined by composing the function $f$. For example, $f(C, f(B, f(A, \emptyset))) = \{(a, 1), (c, 0), (b, 2)\}$ is the constant pool for D along this path. It is sometimes cleaner to write $f_C(f_B(f_A(\emptyset)))$ or $(f_C \circ f_B \circ f_A)(\emptyset)$.

Computing the Pool of Optimizing Information. The pool of optimizing information which can be assumed at node $N$ in the graph, independent of the path taken at execution time, is $P_N = \bigcap \{x \mid x \in F_N\}$. Here $F_N = \{ f(p_n, f(p_{n-1}, \ldots, f(p_1, P)) \ldots) \mid (p_1, p_2, \ldots, p_n, N)$ is a path from an entry node $p_1$ with corresponding entry pool $P$ to node $N \}$.
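A rough sketch of this definition follows, with paths enumerated only up to a length bound, since the full set of paths may be infinite. The program graph is an assumption: the figure is not part of the transcript, so the statements at D, E and F (d := a + b, e := b + c, c := 4) and the back edge from F to C are hypothetical choices that happen to reproduce the pools listed earlier. The helper names are likewise invented for illustration.

```python
from functools import reduce


def transfer(node, pool):
    """Optimizing function f(N, P); a node is (target variable, evaluator)."""
    target, evaluate = node
    result = {(v, c) for (v, c) in pool if v != target}
    value = evaluate(dict(pool))
    if value is not None:
        result.add((target, value))
    return frozenset(result)


def const(k):
    return lambda env: k


def add(x, y):
    # Constant only when both operands are known constants.
    return lambda env: env[x] + env[y] if x in env and y in env else None


# ASSUMED example graph: statements at D, E, F and the edge F -> C are hypothetical.
NODES = {"A": ("a", const(1)), "B": ("c", const(0)), "C": ("b", const(2)),
         "D": ("d", add("a", "b")), "E": ("e", add("b", "c")), "F": ("c", const(4))}
SUCCESSORS = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["E"], "E": ["F"], "F": ["C"]}


def prefixes_reaching(target, max_nodes, entry="A"):
    """Yield (p_1, ..., p_n) for each path (p_1, ..., p_n, target), n <= max_nodes."""
    stack = [(entry,)]
    while stack:
        prefix = stack.pop()
        for succ in SUCCESSORS[prefix[-1]]:
            if succ == target:
                yield prefix
            if len(prefix) < max_nodes:
                stack.append(prefix + (succ,))


def pool_along(prefix, entry_pool=frozenset()):
    """Compose f along the path, starting from the entry pool."""
    return reduce(lambda pool, name: transfer(NODES[name], pool), prefix, entry_pool)


def bounded_pool(target, max_nodes):
    """Intersect the pools of all enumerated paths (an element-wise view of F_N)."""
    pools = [pool_along(p) for p in prefixes_reaching(target, max_nodes)]
    return frozenset.intersection(*pools)


print(sorted(bounded_pool("D", max_nodes=12)))  # [('a', 1), ('b', 2)]
```

The bound makes the enumeration terminate, but it also means the result is only an approximation from above of the true intersection; nothing in this sketch proves that longer paths would not remove further pairs.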

Lattices. A partial order is a lattice if $\sqcap$ and $\sqcup$ are defined so that: (1) $\sqcap$ is the meet or greatest lower bound operation, that is, $x \sqcap y \le x$ and $x \sqcap y \le y$, and if $z \le x$ and $z \le y$ then $z \le x \sqcap y$; (2) $\sqcup$ is the join or least upper bound operation, that is, $x \le x \sqcup y$ and $y \le x \sqcup y$, and if $x \le z$ and $y \le z$ then $x \sqcup y \le z$.

Lattices (cont.). A finite partial order is a lattice if meets and joins exist for every pair of elements. Such a lattice has unique elements $\bot$ (bottom) and $\top$ (top) such that $x \sqcap \bot = \bot$, $x \sqcup \bot = x$, $x \sqcap \top = x$, and $x \sqcup \top = \top$. In a lattice, $x \le y$ iff $x \sqcap y = x$, and $x \le y$ iff $x \sqcup y = y$.
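These identities can be checked by brute force on a small concrete lattice. The sketch below assumes the power set of a three-element ground set, ordered by inclusion, with intersection as meet and union as join; the ground set and all names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

UNIVERSE = frozenset({"p", "q", "r"})  # hypothetical three-element ground set
BOTTOM, TOP = frozenset(), UNIVERSE


def powerset(items):
    """All subsets of items, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def meet(x, y):
    return x & y  # greatest lower bound under inclusion


def join(x, y):
    return x | y  # least upper bound under inclusion


def leq(x, y):
    return x <= y  # subset ordering


elements = powerset(UNIVERSE)
for x in elements:
    # Bottom and top identities.
    assert meet(x, BOTTOM) == BOTTOM and join(x, BOTTOM) == x
    assert meet(x, TOP) == x and join(x, TOP) == TOP
    for y in elements:
        # The order can be recovered from either meet or join.
        assert leq(x, y) == (meet(x, y) == x)
        assert leq(x, y) == (join(x, y) == y)

print(len(elements), "elements checked")  # 8 elements checked
```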

Monotonicity. A function $f$ on a partial order is monotonic if and only if $x \le y$ implies $f(x) \le f(y)$. One can show that the optimizing function $f: N \times \mathcal{P}(U) \to \mathcal{P}(U)$ for constant propagation is monotonic in its second argument.
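As a sanity check rather than a proof, monotonicity in the second argument can be tested exhaustively on a tiny universe. The sketch assumes pools are ordered by set inclusion, uses a hypothetical node encoding (target variable plus evaluator), and restricts the universe to one constant per variable so that every pool is consistent.

```python
from itertools import combinations


def transfer(node, pool):
    """Optimizing function f(N, P); a node is (target variable, evaluator)."""
    target, evaluate = node
    result = {(v, c) for (v, c) in pool if v != target}
    value = evaluate(dict(pool))
    if value is not None:
        result.add((target, value))
    return frozenset(result)


def add(x, y):
    return lambda env: env[x] + env[y] if x in env and y in env else None


# Hypothetical universe: at most one constant per variable.
FACTS = [("a", 1), ("b", 2), ("c", 0)]
POOLS = [frozenset(c) for r in range(len(FACTS) + 1)
         for c in combinations(FACTS, r)]

# Two sample nodes: one that reads constants, one that overwrites a variable.
SAMPLE_NODES = {
    "d := a + b": ("d", add("a", "b")),
    "a := 5": ("a", lambda env: 5),
}

# Collect every pair P <= Q for which f(N, P) <= f(N, Q) fails.
violations = [
    (name, p, q)
    for name, node in SAMPLE_NODES.items()
    for p in POOLS
    for q in POOLS
    if p <= q and not transfer(node, p) <= transfer(node, q)
]
print(violations)  # []
```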

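Distributive Data Flow Problems. By monotonicity, we also have $f(x \sqcap y) \le f(x) \sqcap f(y)$. A function $f$ is distributive, or has the homomorphism property, if $f(x \sqcap y) = f(x) \sqcap f(y)$. (Stated here for meets; the dual statement holds for joins.)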

Benefit of Distributivity. Joins lose no information.

Accuracy of Data Flow Analysis. Ideally, we would like to compute the meet over all paths (MOP) solution. Let $f_i$ be the transfer function for statement $s_i$. If $p$ is a path $\{s_1, \ldots, s_n\}$, let $f_p = f_n \circ f_{n-1} \circ \cdots \circ f_1$. Let $\mathit{path}(s)$ be the set of paths from the entry to $s$. If a data flow problem is distributive, then solving the data flow equations in the standard way yields the MOP solution.
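The slide's formula for the MOP solution itself did not survive in the transcript. Under the definitions just given, the usual formulation is the following, where $x_0$ is a name assumed here for the data flow value at program entry and $\sqcap$ is the meet. The second line records the standard relationship to the solution of the data flow equations, written MFP as later in these slides.

```latex
\mathrm{MOP}(s) \;=\; \bigsqcap_{p \,\in\, \mathit{path}(s)} f_p(x_0)
\\[4pt]
\mathrm{MFP}(s) \;\le\; \mathrm{MOP}(s)
\quad\text{for monotone } f_i,
\qquad
\mathrm{MFP}(s) \;=\; \mathrm{MOP}(s)
\quad\text{for distributive } f_i
```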

What Problems are Distributive? Analyses of how the program computes: live variables, available expressions, reaching definitions, very busy expressions. All Gen/Kill problems are distributive.

A Non-Distributive Example. Constant propagation. In general, analysis of what the program computes is not distributive.

Constant Propagation and the Homomorphism Prop. Does the optimizing function for constant propagation satisfy the homomorphism property? $F(1, \emptyset) = \{(x, 1)\}$; $P_1 = F(2, F(1, \emptyset)) = \{(x, 1), (y, 2)\}$; $F(3, \emptyset) = \{(x, 2)\}$; $P_2 = F(4, F(3, \emptyset)) = \{(x, 2), (y, 1)\}$; $F(5, P_1 \cap P_2) = F(5, \emptyset) = \emptyset$; $F(5, P_1) = \{(x, 1), (y, 2), (z, 3)\}$; $F(5, P_2) = \{(x, 2), (y, 1), (z, 3)\}$. Hence $F(5, P_1) \cap F(5, P_2) = \{(z, 3)\} \ne \emptyset = F(5, P_1 \cap P_2)$, so the property does not hold.
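The counterexample can be replayed mechanically. In the sketch below, the statements at nodes 1 to 4 are read off the pools shown above, while node 5 is assumed to compute z := x + y, which is consistent with the pair (z, 3) but is not stated in the transcript. The node encoding and helper names are hypothetical.

```python
def transfer(node, pool):
    """Optimizing function F(N, P); a node is (target variable, evaluator)."""
    target, evaluate = node
    result = {(v, c) for (v, c) in pool if v != target}
    value = evaluate(dict(pool))
    if value is not None:
        result.add((target, value))
    return frozenset(result)


def const(k):
    return lambda env: k


def add(x, y):
    return lambda env: env[x] + env[y] if x in env and y in env else None


NODES = {
    1: ("x", const(1)),
    2: ("y", const(2)),
    3: ("x", const(2)),
    4: ("y", const(1)),
    5: ("z", add("x", "y")),  # ASSUMED statement: z := x + y
}
EMPTY = frozenset()

p1 = transfer(NODES[2], transfer(NODES[1], EMPTY))  # pool along path 1, 2
p2 = transfer(NODES[4], transfer(NODES[3], EMPTY))  # pool along path 3, 4

meet_then_apply = transfer(NODES[5], p1 & p2)
apply_then_meet = transfer(NODES[5], p1) & transfer(NODES[5], p2)

print(sorted(meet_then_apply))  # []
print(sorted(apply_then_meet))  # [('z', 3)]
```

Meeting before applying node 5 discards both x and y, so the fact that z is 3 on every path is lost; this is exactly the precision gap between the two sides of the homomorphism equation.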

Undecidability of MOP Problem. Surprise! The MOP problem is undecidable for constant propagation: no algorithm exists to compute the MOP solution for all instances of the constant propagation problem.

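Meet-Semilattices. Let the finite set $L$ be the set of all possible optimizing pools for a given application. Let $\sqcap$ be a meet operation $\sqcap : L \times L \to L$ with the properties $x \sqcap y = y \sqcap x$ and $x \sqcap (y \sqcap z) = (x \sqcap y) \sqcap z$, where $x, y, z \in L$. A meet operation is also idempotent, $x \sqcap x = x$, which makes the ordering derived from it reflexive. The set $L$ and the $\sqcap$ operation define a finite meet-semilattice.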

Ordering on Meet-Semilattices. The $\sqcap$ operation defines a partial ordering on $L$ by $x \le y$ if and only if $x \sqcap y = x$. Similarly, $x < y$ if and only if $x \le y$ and $x \ne y$.
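To see the induced ordering on something other than sets, the sketch below assumes a different finite meet-semilattice: the divisors of 12 with the greatest common divisor as meet. This example is not from the slides; in it the derived ordering turns out to be divisibility, and the loop checks that it is a genuine partial order.

```python
from math import gcd

L = [1, 2, 3, 4, 6, 12]  # hypothetical carrier set: the divisors of 12
meet = gcd               # gcd is commutative, associative and idempotent


def leq(x, y):
    """x <= y if and only if meet(x, y) == x."""
    return meet(x, y) == x


def lt(x, y):
    """x < y if and only if x <= y and x != y."""
    return leq(x, y) and x != y


for x in L:
    assert meet(x, x) == x                      # idempotent
    assert leq(x, x)                            # reflexive
    for y in L:
        assert meet(x, y) in L                  # closed under meet
        assert meet(x, y) == meet(y, x)         # commutative
        assert leq(x, y) == (y % x == 0)        # the ordering is divisibility
        if leq(x, y) and leq(y, x):
            assert x == y                       # antisymmetric
        for z in L:
            assert meet(x, meet(y, z)) == meet(meet(x, y), z)  # associative
            if leq(x, y) and leq(y, z):
                assert leq(x, z)                # transitive

print(leq(2, 12), leq(4, 6), lt(3, 3))  # True False False
```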

Monotone Data Flow Analysis Frameworks. A monotone data flow analysis framework is a triple $D = (L, \sqcap, F)$ where (1) $(L, \sqcap)$ is a semilattice of finite length with zero element 0, and (2) $F$ is a monotone operation space associated with $L$.

Monotone Operation Spaces. Let $(L, \sqcap)$ be a semilattice of finite length with a zero element 0. A set of operations $F$ on $L$ is said to be a monotone operation space associated with $L$ iff the following conditions hold: (1) each $f \in F$ is monotonic, i.e. $(\forall x, y \in L)[x \le y \text{ implies } f(x) \le f(y)]$; (2) there exists an identity operation $e$ in $F$, i.e. $(\exists e \in F)(\forall x \in L)[e(x) = x]$; (3) $F$ is closed under composition, i.e. $(\forall f, g \in F)[f \circ g \in F]$; (4) for each $x \in L$ there exists an $f \in F$ such that $x = f(0)$.

Distributive Frameworks. A monotone framework $D = (L, \sqcap, F)$ is called a distributive framework if and only if $(\forall f \in F)(\forall x, y \in L)[f(x \sqcap y) = f(x) \sqcap f(y)]$. This is really just the homomorphism property mentioned earlier.

Maximum Fixpoint Solution (MFP). Let $I = (G, M)$ be an instance of a monotone framework $D = (L, \sqcap, F)$. Let $f_i$ be the operation associated with node $i$, and let the nodes of $G$ be numbered from 1 to $n$ in reverse postorder (rPostorder). The maximum fixed point (MFP) solution of $I$ is defined as the maximum fixed point of the following equations: $x_1 = 0$, and $x_i = \bigsqcap_{j \in \mathrm{pred}(i)} f_j(x_j)$ for $2 \le i \le n$.

Conditions for Algorithm. If $X \subseteq L$, the generalized meet operation $\bigsqcap X$ is defined as the pairwise application of $\sqcap$ to the elements of $X$. $L$ is assumed to have a “zero element” 0 such that $0 \le x$ for all $x \in L$. An augmented set $L'$ is constructed from $L$ by adding a “unit element” 1 such that 1 is not in $L$ and $1 \wedge x = x$ for all $x$ in $L$ (here $\wedge$ denotes the same meet operation as $\sqcap$). Thus $L' = L \cup \{1\}$, and it follows that $x < 1$ for all $x$ in $L$.

Conditions for Algorithm (cont.). The algorithm needs an “optimizing function” $f: N \times L \to L$. It should have the distributive or homomorphism property: $f(N, x \wedge y) = f(N, x) \wedge f(N, y)$ for all nodes $N$ and all $x, y \in L$. Note that $f(N, x) < 1$ for all nodes $N$ and all $x \in L$.

Global Analysis Algorithm. Global analysis starts with an entry pool set $EP \subseteq I \times L$, where $(e, x) \in EP$ if $e \in I$ is an entry node with optimizing pool $x \in L$. In the steps below the work list is also named $L$, as on the slide; it is distinct from the lattice $L$. $I(N)$ denotes the set of immediate successors of node $N$.
A1 [initialize] $L := EP$.
A2 [terminate?] If $L = \emptyset$ then halt.
A3 [select node] Let $L' \in L$, $L' = (N, P_i)$ for some node $N$ and pool $P_i$. Then $L := L - \{L'\}$.
A4 [traverse] Let $P_N$ be the current approximate pool for node $N$ (initially $P_N = 1$). If $P_N \le P_i$ then go to step A2.
A5 [set pool] $P_N := P_N \wedge P_i$, $L := L \cup \{(N', f(N, P_N)) \mid N' \in I(N)\}$.
A6 [loop] Go to step A2.
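Steps A1 to A6 can be sketched as a short worklist loop for constant propagation, with set intersection as the meet and a sentinel standing in for the unit element 1. The example graph is an assumption: the figure is not in the transcript, so the statements at D, E and F (d := a + b, e := b + c, c := 4) and the back edge from F to C are hypothetical, chosen because they reproduce the pools listed on the earlier slide. The function and variable names are also invented here.

```python
TOP = None  # stands for the unit element 1: 1 ∧ x = x, "nothing seen yet"


def meet(x, y):
    if x is TOP:
        return y
    if y is TOP:
        return x
    return x & y


def leq(x, y):
    """x <= y if and only if x ∧ y = x."""
    return meet(x, y) == x


def transfer(node, pool):
    """Optimizing function f(N, P); a node is (target variable, evaluator)."""
    target, evaluate = node
    result = {(v, c) for (v, c) in pool if v != target}
    value = evaluate(dict(pool))
    if value is not None:
        result.add((target, value))
    return frozenset(result)


def const(k):
    return lambda env: k


def add(x, y):
    return lambda env: env[x] + env[y] if x in env and y in env else None


def global_analysis(entry_pools, nodes, successors):
    pools = {name: TOP for name in nodes}       # initially P_N = 1
    worklist = list(entry_pools)                # A1
    while worklist:                             # A2
        name, incoming = worklist.pop()         # A3
        if leq(pools[name], incoming):          # A4: nothing new to learn
            continue
        pools[name] = meet(pools[name], incoming)           # A5
        outgoing = transfer(nodes[name], pools[name])
        worklist.extend((succ, outgoing) for succ in successors[name])
    return pools                                # A6 is the loop itself


# ASSUMED example graph: statements at D, E, F and the edge F -> C are hypothetical.
NODES = {"A": ("a", const(1)), "B": ("c", const(0)), "C": ("b", const(2)),
         "D": ("d", add("a", "b")), "E": ("e", add("b", "c")), "F": ("c", const(4))}
SUCCESSORS = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["E"], "E": ["F"], "F": ["C"]}

result = global_analysis([("A", frozenset())], NODES, SUCCESSORS)
for name in sorted(result):
    print(name, sorted(result[name]))
```

Unlike enumerating paths, this loop always terminates on a finite graph with finite pools, because a node's pool can only shrink and a worklist entry is expanded only when it actually shrinks the pool.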

MOP vs. MFP Solutions. Kildall provided a general iterative algorithm for distributive frameworks (see previous slide). He proved that his algorithm converges to the MFP solution of a distributive framework and that for distributive frameworks the MFP solution is equal to the MOP solution. Kam and Ullman proved that when Kildall’s iterative algorithm is applied to a monotone framework, it converges to the MFP solution of that framework.