Data Flow Analysis 1 15-411 Compiler Design October 5, 2004 These slides live on the Web. I obtained them from Jeff Foster and he said that he obtained.

Slides:



Advertisements
Similar presentations
Continuing Abstract Interpretation We have seen: 1.How to compile abstract syntax trees into control-flow graphs 2.Lattices, as structures that describe.
Advertisements

SSA and CPS CS153: Compilers Greg Morrisett. Monadic Form vs CFGs Consider CFG available exp. analysis: statement gen's kill's x:=v 1 p v 2 x:=v 1 p v.
Data-Flow Analysis II CS 671 March 13, CS 671 – Spring Data-Flow Analysis Gather conservative, approximate information about what a program.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.
Lecture 11: Code Optimization CS 540 George Mason University.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
Course Outline Traditional Static Program Analysis –Theory Compiler Optimizations; Control Flow Graphs Data-flow Analysis – today’s class –Classic analyses.
Control-Flow Graphs & Dataflow Analysis CS153: Compilers Greg Morrisett.
Data-Flow Analysis Framework Domain – What kind of solution is the analysis looking for? Ex. Variables have not yet been defined – Algorithm assigns a.
Components of representation Control dependencies: sequencing of operations –evaluation of if & then –side-effects of statements occur in right order Data.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) Dataflow Analysis Introduction Guo, Yao Part of the slides are adapted from.
Program Representations. Representing programs Goals.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) Partial Redundancy Elimination Guo, Yao.
Partial Redundancy Elimination. Partial-Redundancy Elimination Minimize the number of expression evaluations By moving around the places where an expression.
Foundations of Data-Flow Analysis. Basic Questions Under what circumstances is the iterative algorithm used in the data-flow analysis correct? How precise.
Common Sub-expression Elim Want to compute when an expression is available in a var Domain:
Representing programs Goals. Representing programs Primary goals –analysis is easy and effective just a few cases to handle directly link related things.
CS 536 Spring Global Optimizations Lecture 23.
CS 536 Spring Intermediate Code. Local Optimizations. Lecture 22.
1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.
Global optimization. Data flow analysis To generate better code, need to examine definitions and uses of variables beyond basic blocks. With use- definition.
Chair of Software Engineering Fundamentals of Program Analysis Dr. Manuel Oriol.
Data Flow Analysis Compiler Design Nov. 3, 2005.
From last time: reaching definitions For each use of a variable, determine what assignments could have set the value being read from the variable Information.
4/25/08Prof. Hilfinger CS164 Lecture 371 Global Optimization Lecture 37 (From notes by R. Bodik & G. Necula)
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
CS 412/413 Spring 2007Introduction to Compilers1 Lecture 29: Control Flow Analysis 9 Apr 07 CS412/413 Introduction to Compilers Tim Teitelbaum.
Data Flow Analysis Compiler Design Nov. 8, 2005.
Course Outline Traditional Static Program Analysis –Theory Compiler Optimizations; Control Flow Graphs, Data-flow Analysis Data-flow Frameworks --- today’s.
Prof. Fateman CS 164 Lecture 221 Global Optimization Lecture 22.
Recap from last time: live variables x := 5 y := x + 2 x := x + 1 y := x y...
U NIVERSITY OF M ASSACHUSETTS, A MHERST D EPARTMENT OF C OMPUTER S CIENCE Advanced Compilers CMPSCI 710 Spring 2003 Data flow analysis Emery Berger University.
Machine-Independent Optimizations Ⅰ CS308 Compiler Theory1.
Global optimization. Data flow analysis To generate better code, need to examine definitions and uses of variables beyond basic blocks. With use- definition.
Data Flow Analysis Compiler Design Nov. 8, 2005.
Direction of analysis Although constraints are not directional, flow functions are All flow functions we have seen so far are in the forward direction.
Prof. Aiken CS 294 Lecture 11 Program Analysis. Prof. Aiken CS 294 Lecture 12 The Purpose of this Course How are the following related? –Program analysis.
Prof. Bodik CS 164 Lecture 16, Fall Global Optimization Lecture 16.
Precision Going back to constant prop, in what cases would we lose precision?
Abstract Interpretation (Cousot, Cousot 1977) also known as Data-Flow Analysis.
1 CS 201 Compiler Construction Data Flow Analysis.
Data Flow Analysis. 2 Source code parsed to produce AST AST transformed to CFG Data flow analysis operates on control flow graph (and other intermediate.
MIT Foundations of Dataflow Analysis Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology.
Dataflow Analysis Topic today Data flow analysis: Section 3 of Representation and Analysis Paper (Section 3) NOTE we finished through slide 30 on Friday.
MIT Introduction to Program Analysis and Optimization Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology.
Λλ Fernando Magno Quintão Pereira P ROGRAMMING L ANGUAGES L ABORATORY Universidade Federal de Minas Gerais - Department of Computer Science P ROGRAM A.
1 CS 201 Compiler Construction Introduction. 2 Instructor Information Rajiv Gupta Office: WCH Room Tel: (951) Office.
Compiler Principles Fall Compiler Principles Lecture 0: Local Optimizations Roman Manevich Ben-Gurion University.
1 Data Flow Analysis Data flow analysis is used to collect information about the flow of data values across basic blocks. Dominator analysis collected.
CS412/413 Introduction to Compilers Radu Rugina Lecture 18: Control Flow Graphs 29 Feb 02.
1 Control Flow Graphs. 2 Optimizations Code transformations to improve program –Mainly: improve execution time –Also: reduce program size Can be done.
Compiler Principles Fall Compiler Principles Lecture 11: Loop Optimizations Roman Manevich Ben-Gurion University.
1 CS 201 Compiler Construction Lecture 2 Control Flow Analysis.
Data Flow Analysis II AModel Checking and Abstract Interpretation Feb. 2, 2011.
Iterative Dataflow Problems Taken largely from notes of Alex Aiken (UC Berkeley) and Martin Rinard (MIT) Dataflow information used in optimization Several.
Optimization Simone Campanoni
Dataflow Analysis CS What I s Dataflow Analysis? Static analysis reasoning about flow of data in program Different kinds of data: constants, variables,
DFA foundations Simone Campanoni
Data Flow Analysis Suman Jana
Iterative Dataflow Problems
Fall Compiler Principles Lecture 8: Loop Optimizations
Topic 10: Dataflow Analysis
University Of Virginia
1. Reaching Definitions Definition d of variable v: a statement d that assigns a value to v. Use of variable v: reference to value of v in an expression.
Control Flow Analysis (Chapter 7)
Fall Compiler Principles Lecture 10: Loop Optimizations
Data Flow Analysis Compiler Design
Static Single Assignment
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Live Variables – Basic Block
Presentation transcript:

Data Flow Analysis Compiler Design October 5, 2004 These slides live on the Web. I obtained them from Jeff Foster and he said that he obtained them from Alex Aiken

 ´ ´

£ £ 

Compiler Structure Source code parsed to produce abstract syntax tree. Abstract syntax tree transformed to control flow graph. Data flow analysis operates on the control flow graph (and other intermediate representations). Source code Abstract Syntax tree Control Flow Graph Object code

Abstract Syntax Tree (AST) Programs are written in text as sequences of characters may be awkward to work with. First step: Convert to structured representation. Use lexer (like lex) to recognize tokens Use parser (like yacc) to group tokens structurally  often produce to produce AST

Abstract Syntax Tree Example x := a + b; y := a * b While (y > a){ a := a +1; x := a + b } program while block = x+ ab … … > = a + a1 ya

ASTs ASTs are abstract don’t contain all information in the program  e.g., spacing, comments, brackets, parenthesis. Any ambiguity has been resolved  e.g., a + b + c produces the same AST as (a +b) + c.

Disadvantages of ASTs ASTs have many similar forms e.g., for while, repeat, until, etc e.g., if, ?, switch Expressions in AST may be complex, nested (42 * y) + ( z > 5 ? 12 * z : z +20) Want simpler representation for analysis … at least for dataflow analysis.

Control-Flow Graph (CFG) A directed graph where Each node represents a statement Edges represent control flow Statements may be Assignments x = y op z or x = op z Copy statements x = y Branches goto L or if relop y goto L etc

Control-flow Graph Example x := a + b; y := a * b While (y > a){ a := a +1; x := a + b }

Variations on CFGs Usually don’t include declarations (e.g. int x;). May want a unique entry and exit point. May group statements into basic blocks. A basic block is a sequence of instructions with no branches into or out of the block.

Control-Flow Graph with Basic Blocks X := a + b; Y := a * b While (y > a){ a := a +1; x := a + b } Can lead to more efficient implementations But more complicated to explain so… We will use single-statement blocks in lecture

CFG vs. AST CFGs are much simpler than ASTs Fewer forms, less redundancy, only simple expressions But, ASTs are a more faithful representation CFGs introduce temporaries Lose block structure of program So for AST,  Easier to report error + other messages  Easier to explain to programmer  Easier to unparse to produce readable code

Data Flow Analysis A framework for proving facts about program Reasons about lots of little facts Little or no interaction between facts Works best on properties about how program computes Based on all paths through program  including infeasible paths

Available Expressions An expression e = x op y is available at a program point p, if on every path from the entry node of the graph to node p, e is computed at least once, and And there are no definitions of x or y since the most recent occurance of e on the path Optimization If an expression is available, it need not be recomputed At least, if it is in a register somewhere

Data Flow Facts Is expression e available? Facts: a + b is available a * b is available a + 1 is available

Gen and Kill What is the effect of each x = a + b stmt gen kill y = a * b a = a + 1 a + b a * b a + b a * b a + 1 statement on the set of facts?

∅ {a + b} {a + b, a * b} Ø {a + b} Computing Available Expressions

Terminology A join point is a program point where two branches meet Available expressions is a forward, must problem Forward = Data Flow from in to out Must = At joint point, property must hold on all paths that are joined.

Data Flow Equations Let s be a statement succ(s) = {immediate successor statements of s} Pred(s) = {immediate predecessor statements of s} In(s) program point just before executing s Out(s) = program point just after executing s In(s) =  s’ 2 pred(s) Out(s’) Out(s) = Gen(s) [ (In(s) – Kill(s)) Note these are also called transfer functions

Liveness Analysis A variable v is live at a program point p if v will be used on some execution path originating from p before v is overwritten Optimization If a variable is not live, no need to keep it in a register If a variable is dead at assignment, can eliminate assignment.

Data Flow Equations Available expressions is a forward must analysis Data flow propagate in same direction as CFG edges Expression is available if available on all paths Liveness is a backward may problem to kow if variable is live, need to look at future uses Variable is live if available on some path In(s) = Gen(s) [ (Out(s) – Kill(s)) Out(s) =  s’ 2 succ(s) In(s’)

Gen and Kill What is the effect of each x = a + b y = a * b a = a + 1 stmt gen kill y > a a, b a, y a x y a statement on the set of facts?

{x} Computing Live Variables

{x, y, a} {x} Computing Live Variables

{x, y, a} {x} {x, y, a} Computing Live Variables

{x, y, a} {x} {x, y, a} {y, a, b} Computing Live Variables

{x, y, a} {x} {x, y, a} {y, a, b} Computing Live Variables

{x, y, a, b} {x} {x, y, a} {y, a, b} Computing Live Variables

{x, y, a, b} {x} {x, y, a, b} {y, a, b} Computing Live Variables

{x, y, a, b} {x} {x, y, a, b} {y, a, b} Computing Live Variables {x, a, b}

{x, y, a, b} {x} {x, y, a, b} {y, a, b} Computing Live Variables {x, a, b} {a, b}

Very Busy Expressions An expression e is very busy at point p if On every path from p, e is evaluated before the value of e is changed Optimization Can hoist very busy expression computation What kind of problem? Forward or backward? Backward May or must? Must

Code Hoisting Code hoisting finds expressions that are always evaluated following some point in a program, regardless of the execution path and moves them to the latest point beyond which they would always be evaluated. It is a transformation that almost always reduces the space occupied but that may affect its execution time positively or not at all.

Reaching Definitions A definition of a variable v is an assignment to v A definition of variable v reaches point p if There is no intervening assignment to v Also called def-use information What kind of problem? Forward or backward? Forward May or must? may

Space of Data Flow Analyses Most data flow analyses can be classified this way A few don’t fit: bidirectional Lots of literature on data flow analysis MayMust Forward Backward Reaching definitions Available expressions Live Variables Very busy expressions

Data Flow Facts and lattices Typically, data flow facts form a lattice Example, Available expressions “top” “bottom”

Partial Orders A partial order is a pair (P, · ) such that  · µ P £ P  · is reflexive: x · x  · is anti-symmetric: x · y and y · x implies x = y  · is transitive: x · y and y · z implies x · z

Lattices A partial order is a lattice if u and t are defined so that  u is the meet or greatest lower bound operation  x u y · x and x u y · y  If z · x and z · y then z · x u y  t is the join or least upper bound operation  x · x t y and y · x t y  If x · z and y · z, then x t y · z

Lattices (cont.) A finite partial order is a lattice if meet and join exist for every pair of elements A lattice has unique elements bot and top such that x u ? = ? x t ? =x x u > = x x t > = > In a lattice x · y iff x u y = x x · y iff x t y = y

Useful Lattices (2 S, µ ) forms a lattice for any set S. 2 S is the powerset of S (set of all subsets) If (S, · ) is a lattice, so is (S, ¸ ) i.e., lattices can be flipped The lattice for constant propagation 123 … > ?

Forward Must Data Flow Algorithm Out(s) = Gen(s) for all statements s W = {all statements} (worklist) Repeat Take s from W In(s) =  s’ 2 pred(s) Out(s’) Temp = Gen(s) [ (In(s) – Kill(s)) If (temp != Out (s)) { Out(s) = temp W = W [ succ(s) } Until W = 

Monotonicity A function f on a partial order is monotonic if x · y implies f(x) · f(y) Easy to check that operations to compute In and Out are monotonic In(s) =  s’ 2 pred(s) Out(s’) Temp = Gen(s) [ (In(s) – Kill(s)) Putting the two together Temp = f s (  s’ 2 pred(s) Out(s’))

Termination We know algorithm terminates because The lattice has finite height The operations to compute In and Out are monotonic On every iteration we remove a statement from the worklist and/or move down the lattice.