

Foundations of Data-Flow Analysis

Basic Questions
Under what circumstances is the iterative algorithm used in data-flow analysis correct?
How precise is the solution obtained by the iterative algorithm?
Will the iterative algorithm converge?
What is the meaning of the solution to the equations?

Data-Flow Analysis Framework
A direction of the data flow D, which is either forwards or backwards.
A semilattice, which includes a domain of values V and a meet operator ∧.
A family F of transfer functions from V to V. This family must include functions suitable for the boundary conditions, which are constant transfer functions for the special nodes ENTRY and EXIT in any control flow graph.

Example: Reaching Definitions
The direction: forwards.
The domain of values: the set of all subsets of the set of definitions in the program.
The meet operator: set union.
The family of transfer functions: the set of transfer functions for various statements.

Semilattices
A semilattice is a set V and a binary meet operator ∧ such that for all x, y, and z in V:
  x ∧ x = x (meet is idempotent)
  x ∧ y = y ∧ x (meet is commutative)
  x ∧ (y ∧ z) = (x ∧ y) ∧ z (meet is associative)
A semilattice has a top element, denoted ⊤, such that for all x in V, ⊤ ∧ x = x.
Optionally, a semilattice may have a bottom element, denoted ⊥, such that for all x in V, ⊥ ∧ x = ⊥.

Example: Reaching Definitions
The domain of values is the set of all subsets of the universal set U, or the power set of U, denoted 2^U.
The meet operator is the set union ∪.
The set union is idempotent, commutative, and associative.
The top element is the empty set ∅.
The bottom element is the universal set U.
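The semilattice laws for this example can be checked by brute force on a small universe. The following is a minimal sketch, assuming a hypothetical universal set of three definitions named d1, d2, and d3; the helper names `powerset` and `is_semilattice` are illustrative and not part of the slides.

```python
from itertools import combinations

def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Hypothetical universal set of three definitions (an assumption).
U = frozenset({"d1", "d2", "d3"})
V = powerset(U)      # domain of values: the power set of U
TOP = frozenset()    # top element: the empty set
BOTTOM = U           # bottom element: the universal set

def meet(x, y):
    """Meet operator for reaching definitions: set union."""
    return x | y

def is_semilattice(values, meet, top):
    """Brute-force check of the three meet laws and the top-element law."""
    idempotent = all(meet(x, x) == x for x in values)
    commutative = all(meet(x, y) == meet(y, x)
                      for x in values for y in values)
    associative = all(meet(x, meet(y, z)) == meet(meet(x, y), z)
                      for x in values for y in values for z in values)
    has_top = all(meet(top, x) == x for x in values)
    return idempotent and commutative and associative and has_top

print(is_semilattice(V, meet, TOP))
```

The same check with set intersection as the meet would need the universal set as the top element, which is why the choice of top depends on the meet operator.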

Partial Orders
A relation ≤ is a partial order on a set V if for all x, y, and z in V:
  x ≤ x (the partial order is reflexive)
  If x ≤ y and y ≤ x, then x = y (the partial order is antisymmetric)
  If x ≤ y and y ≤ z, then x ≤ z (the partial order is transitive)
The pair (V, ≤) is called a poset, or partially ordered set.
We define x < y if and only if x ≤ y and x ≠ y.

The Partial Order for a Semilattice
It is useful to define a partial order ≤ for a semilattice (V, ∧). For all x and y in V, we define x ≤ y if and only if x ∧ y = x.
  ≤ is reflexive: x ∧ x = x ⇒ x ≤ x
  ≤ is antisymmetric: x ≤ y ⇒ x ∧ y = x, and y ≤ x ⇒ y ∧ x = y, so x = (x ∧ y) = (y ∧ x) = y
  ≤ is transitive: x ≤ y ⇒ x ∧ y = x, and y ≤ z ⇒ y ∧ z = y, so (x ∧ z) = ((x ∧ y) ∧ z) = (x ∧ (y ∧ z)) = (x ∧ y) = x ⇒ x ≤ z

Example: Reaching Definitions
The relation ≤ is the set inclusion ⊇: x ≤ y ⇔ x ∪ y = x ⇔ x ⊇ y.
This says that a larger set (a superset) is smaller in the partial order.
The set inclusion is reflexive, antisymmetric, and transitive.
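The order induced by the meet can be written directly from its definition. This is a small sketch, assuming sets of definitions are represented as Python frozensets; the function name `leq` is illustrative.

```python
def meet(x, y):
    """Meet operator for reaching definitions: set union."""
    return x | y

def leq(x, y):
    """x is below y in the semilattice order when x meet y equals x."""
    return meet(x, y) == x

a = frozenset({"d1"})
b = frozenset({"d1", "d2"})

print(leq(b, a))               # True: the superset b is the smaller element
print(leq(a, b))               # False
print(leq(b, a) == (b >= a))   # True: the order coincides with "is a superset of"
```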

Greatest Lower Bounds
A greatest lower bound (or glb) of domain elements x and y is an element g such that:
  g ≤ x,
  g ≤ y, and
  if z is any element such that z ≤ x and z ≤ y, then z ≤ g.

Meet and Greatest Lower Bound
The meet of x and y is the greatest lower bound of x and y. Let g = x ∧ y.
  g ≤ x: g ∧ x = (x ∧ y) ∧ x = x ∧ (y ∧ x) = x ∧ (x ∧ y) = (x ∧ x) ∧ y = x ∧ y = g
  g ≤ y: by the symmetric argument
  z ≤ x and z ≤ y ⇒ z ≤ g: z ∧ g = z ∧ (x ∧ y) = (z ∧ x) ∧ y = z ∧ y = z
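The claim that the meet is the greatest lower bound can also be confirmed exhaustively on a small domain. The sketch below assumes the reaching-definitions semilattice (set union as the meet) over a hypothetical three-definition universe; `is_glb` is an illustrative helper.

```python
from itertools import combinations

def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def leq(x, y):
    """Semilattice order for set union as the meet: x meet y == x."""
    return (x | y) == x

def is_glb(g, x, y, values):
    """True if g is a lower bound of x and y and every lower bound is below g."""
    is_lower_bound = leq(g, x) and leq(g, y)
    is_greatest = all(leq(z, g)
                      for z in values
                      if leq(z, x) and leq(z, y))
    return is_lower_bound and is_greatest

V = powerset({"d1", "d2", "d3"})   # hypothetical universe (an assumption)
print(all(is_glb(x | y, x, y, V) for x in V for y in V))
```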

Lattice Diagrams
(Figure: lattice diagram for the subsets of {d1, d2, d3} under set union, listed level by level from top to bottom.)
  ∅ (⊤)
  {d1}   {d2}   {d3}
  {d1, d2}   {d1, d3}   {d2, d3}
  {d1, d2, d3} (⊥)

Product Lattices
The product lattice for lattices (A, ∧_A) and (B, ∧_B) is defined as follows:
  The domain of the product lattice is A × B.
  The meet ∧ for the product lattice: (a, b) ∧ (a', b') = (a ∧_A a', b ∧_B b').
  The partial order ≤ for the product lattice: (a, b) ≤ (a', b') iff a ≤_A a' and b ≤_B b'.
This definition can be extended to the product of any number of lattices.

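Example
(Figure: product of three component lattices, one per definition d1, d2, and d3, listed level by level from top to bottom.)
  ({}, {}, {}) (⊤)
  ({d1}, {}, {})   ({}, {d2}, {})   ({}, {}, {d3})
  ({d1}, {d2}, {})   ({d1}, {}, {d3})   ({}, {d2}, {d3})
  ({d1}, {d2}, {d3}) (⊥)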
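A componentwise meet and order for a product of lattices might be sketched as follows. The helpers `product_meet` and `product_leq` are hypothetical names, and each component is assumed to be a reaching-definitions lattice with set union as its meet.

```python
def union_meet(x, y):
    """Component meet: set union."""
    return x | y

def union_leq(x, y):
    """Component order: x meet y == x."""
    return (x | y) == x

def product_meet(meets):
    """Build the meet of a product lattice from the component meets."""
    def meet(a, b):
        return tuple(m(x, y) for m, x, y in zip(meets, a, b))
    return meet

def product_leq(leqs):
    """Build the order of a product lattice from the component orders."""
    def leq(a, b):
        return all(l(x, y) for l, x, y in zip(leqs, a, b))
    return leq

meet3 = product_meet([union_meet] * 3)
leq3 = product_leq([union_leq] * 3)

E = frozenset()
a = (frozenset({"d1"}), E, E)
b = (E, frozenset({"d2"}), E)

print(meet3(a, b))             # ({d1}, {d2}, {}) as a tuple of frozensets
print(leq3(meet3(a, b), a))    # True: the meet is below each argument
```

As in a single semilattice, the product order agrees with the product meet: `leq3(p, q)` holds exactly when `meet3(p, q) == p`.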

Height of a Semilattice
An ascending chain in a poset (V, ≤) is a sequence x1 < x2 < … < xn.
The height of a semilattice is the largest number of < relations in any ascending chain.
An iterative data-flow analysis algorithm for a monotone framework is guaranteed to converge if the corresponding semilattice has finite height.
A lattice consisting of a finite set of values will have a finite height.
It is also possible for a lattice with an infinite number of values to have a finite height.
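For a finite poset the height can be computed by searching for the longest strictly ascending chain. The sketch below is an illustration only, assuming the reaching-definitions order (x < y when x is a proper superset of y) over a hypothetical three-definition universe; `height` is not a function from the slides.

```python
from functools import lru_cache
from itertools import combinations

def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def height(values, less_than):
    """Largest number of < relations in any ascending chain of a finite poset."""
    values = tuple(values)

    @lru_cache(maxsize=None)
    def longest_from(x):
        # Longest chain starting at x, counted in < steps.
        return max((1 + longest_from(y) for y in values if less_than(x, y)),
                   default=0)

    return max(longest_from(x) for x in values)

def proper_superset(x, y):
    """Strict order for reaching definitions: x < y iff x strictly contains y."""
    return x > y

V = powerset({"d1", "d2", "d3"})   # hypothetical universe (an assumption)
print(height(V, proper_superset))  # 3
```

One longest chain here is {d1, d2, d3} < {d1, d2} < {d1} < ∅, which has three < relations.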

Transfer Functions
The family of transfer functions F: V → V in a data-flow framework has the following properties:
  F has an identity function I, such that I(x) = x for all x in V.
  F is closed under composition; that is, for any two functions f and g in F, the function h defined by h(x) = g(f(x)) is in F.

Example: Reaching Definitions
The identity function: gen[B] = kill[B] = ∅.
Closure under composition:
  f1(x) = G1 ∪ (x - K1), f2(x) = G2 ∪ (x - K2)
  f2(f1(x)) = G2 ∪ ((G1 ∪ (x - K1)) - K2) = (G2 ∪ (G1 - K2)) ∪ (x - (K1 ∪ K2))
  Let G = G2 ∪ (G1 - K2) and K = K1 ∪ K2. Then f(x) = f2(f1(x)) = G ∪ (x - K).
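The closure-under-composition identity can be checked numerically. This is a minimal sketch, assuming gen and kill sets are represented as frozensets; `make_transfer` and `compose_gen_kill` are hypothetical helper names, and the particular gen and kill sets are made up for the check.

```python
from itertools import combinations

def make_transfer(gen, kill):
    """Build the transfer function f(x) = gen ∪ (x - kill)."""
    gen, kill = frozenset(gen), frozenset(kill)
    def f(x):
        return gen | (frozenset(x) - kill)
    return f

def compose_gen_kill(g1, k1, g2, k2):
    """Return (G, K) for the composition f2(f1(x)), following the slide."""
    g1, k1, g2, k2 = map(frozenset, (g1, k1, g2, k2))
    return g2 | (g1 - k2), k1 | k2

# Made-up gen/kill sets over a hypothetical universe {d1, d2, d3}.
G1, K1 = {"d1"}, {"d2"}
G2, K2 = {"d2"}, {"d1", "d3"}

f1 = make_transfer(G1, K1)
f2 = make_transfer(G2, K2)
G, K = compose_gen_kill(G1, K1, G2, K2)
f = make_transfer(G, K)
identity = make_transfer(set(), set())   # gen = kill = empty set

universe = ["d1", "d2", "d3"]
subsets = [frozenset(c)
           for r in range(len(universe) + 1)
           for c in combinations(universe, r)]

print(all(f2(f1(x)) == f(x) for x in subsets))
print(all(identity(x) == x for x in subsets))
```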

Monotone Frameworks
A framework (D, F, V, ∧) is monotone if x ≤ y implies f(x) ≤ f(y), for all x and y in V, and f in F.
Equivalently, a framework (D, F, V, ∧) is monotone if f(x ∧ y) ≤ f(x) ∧ f(y), for all x and y in V, and f in F.

Proof of Equivalence
(⇒) x ∧ y ≤ x and x ∧ y ≤ y ⇒ f(x ∧ y) ≤ f(x) and f(x ∧ y) ≤ f(y). Since f(x) ∧ f(y) is the glb of f(x) and f(y), it follows that f(x ∧ y) ≤ f(x) ∧ f(y).
(⇐) x ≤ y ⇒ x ∧ y = x ⇒ f(x ∧ y) = f(x) ≤ f(x) ∧ f(y) ≤ f(y) ⇒ f(x) ≤ f(y).
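The two formulations of monotonicity can be compared by exhaustive testing on a small domain. The sketch below assumes the reaching-definitions semilattice over a hypothetical three-definition universe; the `complement` function is a deliberately non-monotone counterexample invented for this check.

```python
from itertools import combinations

def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def leq(x, y):
    """Semilattice order for set union as the meet: x meet y == x."""
    return (x | y) == x

def monotone_by_order(f, values):
    """First formulation: x <= y implies f(x) <= f(y)."""
    return all(leq(f(x), f(y))
               for x in values for y in values if leq(x, y))

def monotone_by_meet(f, values):
    """Second formulation: f(x meet y) <= f(x) meet f(y)."""
    return all(leq(f(x | y), f(x) | f(y))
               for x in values for y in values)

U = frozenset({"d1", "d2", "d3"})   # hypothetical universe (an assumption)
V = powerset(U)

def gen_kill(x):
    """A gen/kill transfer function with gen = {d1}, kill = {d2}."""
    return frozenset({"d1"}) | (x - frozenset({"d2"}))

def complement(x):
    """Hypothetical non-monotone function, used only as a counterexample."""
    return U - x

print(monotone_by_order(gen_kill, V), monotone_by_meet(gen_kill, V))      # True True
print(monotone_by_order(complement, V), monotone_by_meet(complement, V))  # False False
```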

Distributive Frameworks
A framework (D, F, V, ∧) is distributive if f(x ∧ y) = f(x) ∧ f(y) for all x and y in V, and f in F.
Distributivity implies monotonicity.

Example: Reaching Definitions
Let y and z be sets of definitions, and f(x) = G ∪ (x - K).
Then G ∪ ((y ∪ z) - K) = (G ∪ (y - K)) ∪ (G ∪ (z - K)).
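Distributivity of a gen/kill function can be confirmed the same way. This sketch also includes a hypothetical `threshold` function, which is not from the slides, to illustrate that a function can be monotone without being distributive; all names and sets here are assumptions made for the check.

```python
from itertools import combinations

def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def leq(x, y):
    """Semilattice order for set union as the meet: x meet y == x."""
    return (x | y) == x

def is_distributive(f, values):
    """f(x meet y) == f(x) meet f(y), with set union as the meet."""
    return all(f(x | y) == (f(x) | f(y)) for x in values for y in values)

def is_monotone(f, values):
    """x <= y implies f(x) <= f(y)."""
    return all(leq(f(x), f(y))
               for x in values for y in values if leq(x, y))

U = frozenset({"d1", "d2", "d3"})   # hypothetical universe (an assumption)
V = powerset(U)

def gen_kill(x):
    """f(x) = G ∪ (x - K) with G = {d1}, K = {d2}."""
    return frozenset({"d1"}) | (x - frozenset({"d2"}))

def threshold(x):
    """Hypothetical function: everything once two definitions reach, else nothing."""
    return U if len(x) >= 2 else frozenset()

print(is_distributive(gen_kill, V))                            # True
print(is_monotone(threshold, V), is_distributive(threshold, V))  # True False
```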

The Iterative Algorithm for General Frameworks: Input
A control flow graph, with specially labeled ENTRY and EXIT nodes.
A direction of the data flow D.
A set of values V.
A meet operator ∧.
A set of functions F, where f_B in F is the transfer function for basic block B.
A constant value v_ENTRY or v_EXIT in V, representing the boundary condition for forward and backward frameworks, respectively.

The Iterative Algorithm for General Frameworks: Output
Values in V for IN[B] and OUT[B] for each basic block B in the control flow graph.

The Iterative Algorithm for General Frameworks: Forward
  OUT[ENTRY] := v_ENTRY;
  for (each basic block B other than ENTRY) OUT[B] := ⊤;
  while (changes to any OUT occur)
    for (each basic block B other than ENTRY) {
      IN[B] := ∧_{p ∈ pred(B)} OUT[p];
      OUT[B] := f_B(IN[B]);
    }
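The forward algorithm translates almost line for line into executable code. The following is a minimal sketch rather than a reference implementation: `solve_forward` is a hypothetical name, and the three-block flow graph with a loop between B2 and B3, together with its gen and kill sets, is invented for illustration.

```python
def solve_forward(blocks, preds, transfer, meet, top, v_entry, entry="ENTRY"):
    """Round-robin forward iteration; returns the (IN, OUT) dictionaries."""
    out = {b: top for b in blocks}
    out[entry] = v_entry
    in_ = {}
    changed = True
    while changed:                      # repeat while any OUT changes
        changed = False
        for b in blocks:
            if b == entry:
                continue
            acc = top                   # top is the identity of the meet
            for p in preds[b]:
                acc = meet(acc, out[p])
            in_[b] = acc
            new_out = transfer[b](acc)
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return in_, out

def gen_kill(gen, kill):
    """Reaching-definitions transfer function f(x) = gen ∪ (x - kill)."""
    gen, kill = frozenset(gen), frozenset(kill)
    return lambda x: gen | (x - kill)

# Hypothetical flow graph: ENTRY -> B1 -> B2 -> B3 -> B2 (loop back edge).
blocks = ["ENTRY", "B1", "B2", "B3"]
preds = {"B1": ["ENTRY"], "B2": ["B1", "B3"], "B3": ["B2"]}
transfer = {
    "B1": gen_kill({"d1"}, {"d3"}),   # d1 and d3 define the same variable
    "B2": gen_kill({"d2"}, set()),
    "B3": gen_kill({"d3"}, {"d1"}),
}

IN, OUT = solve_forward(blocks, preds, transfer,
                        meet=lambda x, y: x | y,
                        top=frozenset(), v_entry=frozenset())
print(sorted(IN["B2"]))    # ['d1', 'd2', 'd3']
print(sorted(OUT["B3"]))   # ['d2', 'd3']
```

The round-robin visiting order is one choice among several; a worklist would reach the same fixed point for a monotone framework of finite height.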

The Iterative Algorithm for General Frameworks: Backward
  IN[EXIT] := v_EXIT;
  for (each basic block B other than EXIT) IN[B] := ⊤;
  while (changes to any IN occur)
    for (each basic block B other than EXIT) {
      OUT[B] := ∧_{s ∈ succ(B)} IN[s];
      IN[B] := f_B(OUT[B]);
    }
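The backward algorithm mirrors the forward one, with successors in place of predecessors. The sketch below instantiates it with live-variable analysis, which is an assumption made here for illustration and is not discussed in the slides; `solve_backward`, the straight-line flow graph, and the use/def sets are all hypothetical.

```python
def solve_backward(blocks, succs, transfer, meet, top, v_exit, exit_node="EXIT"):
    """Round-robin backward iteration; returns the (IN, OUT) dictionaries."""
    in_ = {b: top for b in blocks}
    in_[exit_node] = v_exit
    out = {}
    changed = True
    while changed:                      # repeat while any IN changes
        changed = False
        for b in blocks:
            if b == exit_node:
                continue
            acc = top                   # top is the identity of the meet
            for s in succs[b]:
                acc = meet(acc, in_[s])
            out[b] = acc
            new_in = transfer[b](acc)
            if new_in != in_[b]:
                in_[b] = new_in
                changed = True
    return in_, out

def use_def(use, define):
    """Liveness transfer function f(x) = use ∪ (x - def)."""
    use, define = frozenset(use), frozenset(define)
    return lambda x: use | (x - define)

# Hypothetical straight-line program:
#   B1: a = 1      B2: b = a + 1      B3: return b
blocks = ["B1", "B2", "B3", "EXIT"]
succs = {"B1": ["B2"], "B2": ["B3"], "B3": ["EXIT"]}
transfer = {
    "B1": use_def(set(), {"a"}),
    "B2": use_def({"a"}, {"b"}),
    "B3": use_def({"b"}, set()),
}

IN, OUT = solve_backward(blocks, succs, transfer,
                         meet=lambda x, y: x | y,
                         top=frozenset(), v_exit=frozenset())
print(sorted(IN["B2"]), sorted(OUT["B2"]))   # ['a'] ['b']
```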

Properties of the Iterative Algorithm
If the algorithm converges, the result is a solution to the data-flow equations.
If the framework is monotone, then the solution found is the maximum fixedpoint (MFP) of the data-flow equations. The maximum fixedpoint is a solution with the property that in any other solution, the values of IN[B] and OUT[B] are ≤ the corresponding values of the MFP.
If the framework is monotone and its semilattice has finite height, then the algorithm is guaranteed to converge.

The Ideal Solution
Consider any path P = ENTRY → B1 → … → B_{k-1} → B_k.
The transfer function for P is the composition f_P = f_{B_{k-1}} ∘ f_{B_{k-2}} ∘ … ∘ f_{B_1}.
The ideal solution is IDEAL[B] = ∧_{P ∈ possible paths from ENTRY to B} f_P(v_ENTRY).
Any answer that is greater than IDEAL is incorrect.
Any value smaller than or equal to IDEAL is conservative, i.e., safe.

The Meet-Over-Paths Solution
Determining exactly which paths can possibly be executed is undecidable.
The meet-over-paths solution is MOP[B] = ∧_{P ∈ paths from ENTRY to B} f_P(v_ENTRY).
The paths considered in the MOP solution are a superset of all the paths that are possibly executed.
Hence MOP[B] ≤ IDEAL[B].

MFP Solution versus MOP Solution
The iterative algorithm visits basic blocks, not necessarily in the order of execution.
At each confluence point, the algorithm applies the meet operator to the data-flow values obtained so far. Some of the values used were introduced artificially in the initialization process and do not represent the result of any execution from the beginning of the program.

Early Meet over Paths
(Figure: flow graph with nodes ENTRY, B1, B2, B3, and B4. The formulas below follow the two paths ENTRY → B1 → B3 → B4 and ENTRY → B2 → B3 → B4.)
MOP[B4] = ((f_B3 ∘ f_B1) ∧ (f_B3 ∘ f_B2))(v_ENTRY)
IN[B4] = f_B3(f_B1(v_ENTRY) ∧ f_B2(v_ENTRY))
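The difference between meeting late (MOP) and meeting early (the iterative algorithm) can be made concrete on this diamond-shaped graph. The sketch below assumes that B1 and B2 both flow into B3, which flows into B4, as the two formulas suggest, and uses invented reaching-definitions gen/kill sets. Because gen/kill functions are distributive, the two results coincide here.

```python
def make_transfer(gen, kill):
    """Reaching-definitions transfer function f(x) = gen ∪ (x - kill)."""
    gen, kill = frozenset(gen), frozenset(kill)
    return lambda x: gen | (x - kill)

# Hypothetical gen/kill sets; d1 and d2 define the same variable.
f = {
    "B1": make_transfer({"d1"}, {"d2"}),
    "B2": make_transfer({"d2"}, {"d1"}),
    "B3": make_transfer({"d3"}, set()),
}
v_entry = frozenset()

def path_value(path, value):
    """Apply the transfer functions of the blocks on `path` in order."""
    for block in path:
        value = f[block](value)
    return value

# Blocks whose transfer functions apply before control reaches B4.
paths_to_b4 = [["B1", "B3"], ["B2", "B3"]]

# MOP: evaluate each path separately, then take the meet (set union).
mop_b4 = frozenset().union(*(path_value(p, v_entry) for p in paths_to_b4))

# Iterative algorithm: meet at the join before B3, then apply f_B3 once.
in_b4 = f["B3"](f["B1"](v_entry) | f["B2"](v_entry))

print(sorted(mop_b4))      # ['d1', 'd2', 'd3']
print(in_b4 == mop_b4)     # True, since the framework is distributive
```

In a framework that is monotone but not distributive, the early meet can lose information, so the value computed for IN[B4] may be strictly smaller than MOP[B4].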

Comparison of Solutions
Using the iterative algorithm, we have IN[B] ≤ MOP[B] for monotone frameworks and IN[B] = MOP[B] for distributive frameworks.
MFP ≤ MOP ≤ IDEAL