Iterative Data-flow Analysis. COMP 512, Rice University, Houston, Texas, Fall 2003. Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved.

Presentation transcript:

Iterative Data-flow Analysis
COMP 512, Rice University, Houston, Texas, Fall 2003
Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use.
New lecture, 2003

Review
Last class:
- Looked at Global Common Subexpression Elimination (Cocke 70)
- Defined the available expressions problem as the key to finding opportunities and proving safety
  $\text{AVAIL}(n_0) = \emptyset$
  $\text{AVAIL}(b) = \bigcap_{x \in pred(b)} \left( \text{DEEXPR}(x) \cup \left( \text{AVAIL}(x) \cap \overline{\text{EXPRKILL}(x)} \right) \right)$
- Looked at an algorithm to solve these equations
  - Compute initial information: DEEXPR & EXPRKILL
  - Apply an iterative solver to find a fixed-point solution
Today: Why does the iterative solver work?

Data-flow Analysis
Definition
Data-flow analysis is a collection of techniques for compile-time reasoning about the run-time flow of values.
Almost always involves building a graph
- Problems are trivial on a basic block
- Global problems ⇒ control-flow graph (or derivative)
- Whole program problems ⇒ call graph (or derivative)
Usually formulated as a set of simultaneous equations
- Sets attached to nodes and edges
- Semilattice to describe values
- We solved AVAIL with an iterative fixed-point algorithm
Desired result is usually the meet-over-all-paths solution
- "What is true on every path from the entry?"
- "Can this happen on any path from the entry?"
- Related to the safety of optimization (how we use the results)

Data-flow Analysis
Limitations
1. Precision: "up to symbolic execution"
   - Assume all paths are taken
2. Solution: cannot afford to compute the MOP solution
   - Large class of problems where MOP = MFP = LFP
   - Not all problems of interest are in this class
3. Arrays: treated naively in classical analysis
   - Represent whole array with a single fact
4. Pointers: difficult (and expensive) to analyze
   - Imprecision rapidly adds up
   - Need to ask the right questions
Summary
For scalar values, we can quickly solve simple problems
Good news: simple problems can carry us pretty far

Data-flow Analysis
Semilattice
A semilattice is a set $L$ and a meet operation $\wedge$ such that, $\forall a, b, c \in L$:
1. $a \wedge a = a$
2. $a \wedge b = b \wedge a$
3. $a \wedge (b \wedge c) = (a \wedge b) \wedge c$
$\wedge$ imposes an order on $L$, $\forall a, b, c \in L$:
1. $a \geq b \Leftrightarrow a \wedge b = b$
2. $a > b \Leftrightarrow a \geq b$ and $a \neq b$
A semilattice has a bottom element, denoted $\bot$:
1. $\forall a \in L$, $\bot \wedge a = \bot$
2. $\forall a \in L$, $a \geq \bot$
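
The semilattice laws above can be checked mechanically on a small instance. The sketch below assumes the powerset of a three-expression universe with set intersection as the meet; the names `UNIVERSE`, `powerset`, `meet`, `geq`, and `check_semilattice_laws` are illustrative and do not come from the slides.

```python
from itertools import combinations

# Hypothetical universe of facts; any small finite set works for this check.
UNIVERSE = frozenset({"a+b", "c+d", "e+f"})


def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def meet(x, y):
    """Meet for this semilattice: set intersection."""
    return x & y


def geq(x, y):
    """Order induced by meet: x >= y exactly when meet(x, y) == y."""
    return meet(x, y) == y


def check_semilattice_laws(elements, bottom):
    """Exhaustively check the three meet laws and the two bottom properties."""
    for a in elements:
        if meet(a, a) != a:                      # idempotent
            return False
        if meet(bottom, a) != bottom or not geq(a, bottom):
            return False                         # bottom properties
        for b in elements:
            if meet(a, b) != meet(b, a):         # commutative
                return False
            for c in elements:
                if meet(a, meet(b, c)) != meet(meet(a, b), c):  # associative
                    return False
    return True


L = powerset(UNIVERSE)
BOTTOM = frozenset()
LAWS_HOLD = check_semilattice_laws(L, BOTTOM)
```

Under this order a larger set sits above its subsets, and two sets with neither containing the other are incomparable.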

Data-flow Analysis
How does this relate to data-flow analysis?
- Choose a semilattice to represent the facts
- Attach a meaning to each $a \in L$
- Each $a \in L$ is a distinct set of known facts
- With each node $n$, associate a function $f_n : L \rightarrow L$
- $f_n$ models the behavior of the code in the block corresponding to $n$
- Let $F$ be the set of all functions that the code might generate
Example: AVAIL
- Semilattice is $(2^E, \wedge)$, where $E$ is the set of all expressions & $\wedge$ is $\cap$
  - Sets are bigger than |variables|, $\bot$ is $\emptyset$
- For a node $n$, $f_n$ has the form $f_n(x) = a_n \cup (x \cap b_n)$
  - Where $a_n$ is $\text{DEEXPR}(n)$ and $b_n$ is $\overline{\text{EXPRKILL}(n)}$
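
A block's transfer function of the form f_n(x) = a_n ∪ (x ∩ b_n) can be sketched as a closure. The helper name `make_transfer` is hypothetical; the sets used for block D are the ones that appear in the worked AVAIL computation in the extra slides, and b_n is taken to be the set of expressions the block does not kill.

```python
def make_transfer(de_expr, not_killed):
    """Build f_n(x) = a_n | (x & b_n) for one block.

    a_n (de_expr): expressions computed in the block and still valid at its end.
    b_n (not_killed): expressions whose operands the block does not redefine.
    """
    a_n, b_n = frozenset(de_expr), frozenset(not_killed)

    def f_n(x):
        return a_n | (frozenset(x) & b_n)

    return f_n


ALL_EXPRS = frozenset({"a+b", "c+d", "e+f", "a+17", "b+18"})

# Block D of the example: e <- b+18; s <- a+b; u <- e+f
# Redefining e kills e+f, so b_n is everything except e+f.
f_D = make_transfer({"b+18", "a+b", "e+f"}, ALL_EXPRS - {"e+f"})

# Facts flowing out of D when {a+b, c+d} is available on entry to D.
out_D = f_D({"a+b", "c+d"})
```

Here `out_D` is the contribution of D to AVAIL of its successor: everything D computes, plus whatever was available on entry and survives D.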

Concrete Example: Available Expressions
Blocks of the example CFG:
  A: m ← a + b; n ← a + b
  B: p ← c + d; r ← c + d
  C: q ← a + b; r ← c + d
  D: e ← b + 18; s ← a + b; u ← e + f
  E: e ← a + 17; t ← c + d; u ← e + f
  F: v ← a + b; w ← c + d; x ← e + f
  G: y ← a + b; z ← c + d
E = { a+b, c+d, e+f, a+17, b+18 }
$2^E$ is the set of all subsets of E
$2^E$ = [
  {a+b, c+d, e+f, a+17, b+18},
  {a+b, c+d, e+f, a+17}, {a+b, c+d, e+f, b+18}, {a+b, c+d, a+17, b+18}, {a+b, e+f, a+17, b+18}, {c+d, e+f, a+17, b+18},
  {a+b, c+d, e+f}, {a+b, c+d, b+18}, {a+b, c+d, a+17}, {a+b, e+f, a+17}, {a+b, e+f, b+18}, {a+b, a+17, b+18}, {c+d, e+f, a+17}, {c+d, e+f, b+18}, {c+d, a+17, b+18}, {e+f, a+17, b+18},
  {a+b, c+d}, {a+b, e+f}, {a+b, a+17}, {a+b, b+18}, {c+d, e+f}, {c+d, a+17}, {c+d, b+18}, {e+f, a+17}, {e+f, b+18}, {a+17, b+18},
  {a+b}, {c+d}, {e+f}, {a+17}, {b+18},
  {}
]

Concrete Example: Available Expressions
The Lattice (diagram, listed here one level per line)
  { }
  {a+b} {c+d} {e+f} {a+17} {b+18}
  {a+b, c+d} {a+b, a+17} {c+d, e+f} {c+d, b+18} {e+f, b+18} {a+b, e+f} {a+b, b+18} {c+d, a+17} {e+f, a+17} {a+17, b+18}
  {a+b, c+d, e+f} {a+b, c+d, b+18} {a+b, c+d, a+17} {a+b, e+f, a+17} {a+b, e+f, b+18} {a+b, a+17, b+18} {c+d, e+f, a+17} {c+d, e+f, b+18} {c+d, a+17, b+18} {e+f, a+17, b+18}
  {a+b, c+d, e+f, a+17} {a+b, c+d, e+f, b+18} {a+b, c+d, a+17, b+18} {a+b, e+f, a+17, b+18} {c+d, e+f, a+17, b+18}
  {a+b, c+d, e+f, a+17, b+18}
Annotation: comparability (transitive)

Concrete Example: Available Expressions
The Lattice (same diagram as on the previous slide)
Annotation: meet

Round-robin Iterative Algorithm
  AVAIL(b_0) ← Ø
  for i ← 1 to N
      AVAIL(b_i) ← { all expressions }
  change ← true
  while (change)
      change ← false
      for i ← 0 to N
          TEMP ← ∩ over x ∈ pred(b_i) of (DEF(x) ∪ (AVAIL(x) ∩ NKILL(x)))
          if AVAIL(b_i) ≠ TEMP then
              change ← true
              AVAIL(b_i) ← TEMP
Three questions:
- Termination: does it halt?
- Correctness: what answer does it produce?
- Speed: how quickly does it find that answer?
The round-robin solver is easier to analyze than the worklist solver.
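
The round-robin algorithm can be sketched in Python and run on the lecture's example. This is a minimal sketch under stated assumptions: the function name `solve_avail` and the dictionary layout are illustrative, the CFG edges are inferred from the worked AVAIL computation in the extra slides, and the entry block is skipped in each sweep so that its set stays empty.

```python
def solve_avail(blocks, preds, de_expr, not_killed, all_exprs, entry):
    """Round-robin iterative solver for AVAIL. Returns (avail, passes)."""
    avail = {b: frozenset(all_exprs) for b in blocks}
    avail[entry] = frozenset()          # nothing is computed before the entry
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        for b in blocks:
            if b == entry:
                continue                # assumption: entry keeps its initial value
            temp = frozenset(all_exprs)
            for x in preds[b]:          # meet (intersection) over predecessors
                temp &= de_expr[x] | (avail[x] & not_killed[x])
            if temp != avail[b]:
                avail[b] = temp
                changed = True
    return avail, passes


ALL = frozenset({"a+b", "c+d", "e+f", "a+17", "b+18"})
BLOCKS = ["A", "B", "C", "D", "E", "F", "G"]
# Edges inferred from the worked example: A->B, A->C, B->G, C->D, C->E, D->F, E->F, F->G
PREDS = {"A": [], "B": ["A"], "C": ["A"], "D": ["C"], "E": ["C"],
         "F": ["D", "E"], "G": ["B", "F"]}
DE_EXPR = {
    "A": frozenset({"a+b"}),
    "B": frozenset({"c+d"}),
    "C": frozenset({"a+b", "c+d"}),
    "D": frozenset({"b+18", "a+b", "e+f"}),
    "E": frozenset({"a+17", "c+d", "e+f"}),
    "F": frozenset({"a+b", "c+d", "e+f"}),
    "G": frozenset({"a+b", "c+d"}),
}
# Only D and E redefine an operand (e), which kills e+f.
NOT_KILLED = {b: ALL for b in BLOCKS}
NOT_KILLED["D"] = ALL - {"e+f"}
NOT_KILLED["E"] = ALL - {"e+f"}

AVAIL, PASSES = solve_avail(BLOCKS, PREDS, DE_EXPR, NOT_KILLED, ALL, "A")
```

Because this example graph is acyclic and the blocks are visited with predecessors first, one sweep reaches the answer and a second sweep confirms that nothing changes.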

Round-robin Iterative Algorithm
Termination
- Makes sweeps over the nodes
- Halts when some sweep produces no change
(The algorithm is the same as on the previous slide.)

Iterative Data-flow Analysis
Termination
If every $f_n \in F$ is monotone, i.e., $x \leq y \Rightarrow f(x) \leq f(y)$, and
if the lattice is bounded, i.e., every descending chain is finite
- A chain is a sequence $x_1, x_2, \ldots, x_n$ where $x_i \in L$, $1 \leq i \leq n$
- $x_i > x_{i+1}$, $1 \leq i < n$ ⇒ the chain is descending
then
- The set at each node can only change a finite number of times
- The iterative algorithm must halt on an instance of the problem
Any finite semilattice is bounded
Some infinite semilattices are bounded
[Figure: lattice of real constants, with elements …, 0, .001, .002, …]
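
For AVAIL the termination argument gives a concrete, if pessimistic, bound. The derivation below is a sketch that assumes every non-entry set starts at the full set $E$ and that each update can only remove expressions; $N$ counts the non-entry blocks.

```latex
% Longest strictly descending chain in (2^E, \cap): remove one element at a time.
E = x_1 \supset x_2 \supset \cdots \supset x_m = \emptyset
  \quad\Longrightarrow\quad m \le |E| + 1
% So each AVAIL(b_i) changes at most |E| times, and all N sets together change
\text{total changes} \;\le\; N \cdot |E|
% Every sweep except the last changes at least one set, hence
\text{sweeps} \;\le\; N \cdot |E| + 1
```

For the seven-block example with $|E| = 5$ and $N = 6$, this bound allows up to 31 sweeps, far more than a well-ordered solver typically needs.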

Iterative Data-flow Analysis
Correctness (What does it compute?)
If every $f_n \in F$ is monotone, i.e., $x \leq y \Rightarrow f(x) \leq f(y)$, and
if the semilattice is bounded, i.e., every descending chain is finite
- A chain is a sequence $x_1, x_2, \ldots, x_n$ where $x_i \in L$, $1 \leq i \leq n$
- $x_i > x_{i+1}$, $1 \leq i < n$ ⇒ the chain is descending
Given a bounded semilattice $S$ and a monotone function space $F$:
- $\exists k$ such that $f^k(\bot) = f^j(\bot)$ $\forall j > k$
- $f^k(\bot)$ is called the least fixed-point of $f$ over $S$
- If $L$ has a $\top$, then $\exists k$ such that $f^k(\top) = f^j(\top)$ $\forall j > k$, and $f^k(\top)$ is called the maximal fixed-point of $f$ over $S$ (optimism)
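
The two fixed points can differ, which a tiny experiment makes visible. This sketch iterates a single AVAIL-style function from both ends of a three-expression lattice; the function `f`, the constants `TOP` and `BOTTOM`, and the helper `iterate_to_fixed_point` are made up for illustration.

```python
def iterate_to_fixed_point(f, start, max_steps=1000):
    """Apply f repeatedly from `start`.

    Returns (x, k) where x = f^k(start) and f(x) == x.
    """
    x = start
    for k in range(max_steps):
        nxt = f(x)
        if nxt == x:
            return x, k
        x = nxt
    raise RuntimeError("no fixed point within max_steps")


TOP = frozenset({"a+b", "c+d", "e+f"})
BOTTOM = frozenset()


def f(x):
    """Hypothetical monotone function of the form a | (x & b)."""
    return frozenset({"a+b"}) | (x & frozenset({"a+b", "c+d"}))


FROM_TOP, K_TOP = iterate_to_fixed_point(f, TOP)
FROM_BOTTOM, K_BOTTOM = iterate_to_fixed_point(f, BOTTOM)
```

Starting from the full set keeps `c+d`, while starting from the empty set never discovers it, so the choice of initial value decides which fixed point the iteration reaches.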

Iterative Data-flow Analysis
Correctness
If every $f_n \in F$ is monotone, i.e., $f(x \wedge y) \leq f(x) \wedge f(y)$, and
if the lattice is bounded, i.e., every descending chain is finite
- A chain is a sequence $x_1, x_2, \ldots, x_n$ where $x_i \in L$, $1 \leq i \leq n$
- $x_i > x_{i+1}$, $1 \leq i < n$ ⇒ the chain is descending
then
- The round-robin algorithm computes a least fixed-point (LFP)
- The uniqueness of the solution depends on other properties of $F$
- Unique solution ⇒ it finds the one we want
- Multiple solutions ⇒ we need to know which one it finds

Iterative Data-flow Analysis
Correctness
Does the iterative algorithm compute the desired answer?
Admissible Function Spaces
1. $\forall f \in F$, $\forall x, y \in L$, $f(x \wedge y) = f(x) \wedge f(y)$
2. $\exists f_i \in F$ such that $\forall x \in L$, $f_i(x) = x$
3. $f, g \in F \Rightarrow \exists h \in F$ such that $h(x) = f(g(x))$
4. $\forall x \in L$, $\exists$ a finite subset $H \subseteq F$ such that $x = \bigwedge_{f \in H} f(\bot)$
If $F$ meets these four conditions, then an instance of the problem will have a unique fixed-point solution (instance ⇒ graph + initial values)
⇒ LFP = MFP = MOP
⇒ order of evaluation does not matter
Not distributive ⇒ fixed-point solution may not be unique
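
Condition 1 (distributivity) can be verified directly for the AVAIL function form $f(x) = a \cup (x \cap b)$, where $\wedge$ is $\cap$. The short derivation below relies only on standard set identities.

```latex
\begin{aligned}
f(x \wedge y) &= a \cup \bigl((x \cap y) \cap b\bigr) \\
              &= a \cup \bigl((x \cap b) \cap (y \cap b)\bigr)
                 && \text{idempotence of } \cap \\
              &= \bigl(a \cup (x \cap b)\bigr) \cap \bigl(a \cup (y \cap b)\bigr)
                 && \cup \text{ distributes over } \cap \\
              &= f(x) \wedge f(y)
\end{aligned}
```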

Iterative Data-flow Analysis
If a data-flow framework meets those admissibility conditions, then it has a unique fixed-point solution
- The iterative algorithm finds the (best) answer
- The solution does not depend on order of computation
- Algorithm can choose an order that converges quickly
Intuition
- Choose an order so that changes propagate as far as possible on each "sweep"
  - Process a node's predecessors before the node
- Cycles pose problems, of course
  - Ignore back edges when computing the order?
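
The effect of visit order on the number of sweeps shows up even on a straight-line chain of blocks. The sketch below assumes a hypothetical six-node chain in which node i generates one expression `e<i>` and kills nothing; `passes_until_stable` is an illustrative helper, not a routine from the lecture.

```python
def passes_until_stable(order, preds, gen, all_facts, entry):
    """Count round-robin sweeps (including the final no-change sweep)."""
    avail = {n: frozenset(all_facts) for n in order}
    avail[entry] = frozenset()
    passes, changed = 0, True
    while changed:
        changed = False
        passes += 1
        for n in order:
            if n == entry:
                continue
            temp = frozenset(all_facts)
            for p in preds[n]:
                temp &= gen[p] | avail[p]   # nothing is killed in this example
            if temp != avail[n]:
                avail[n] = temp
                changed = True
    return passes


N_NODES = 6
NODES = list(range(N_NODES))                      # chain 0 -> 1 -> ... -> 5
PREDS = {n: ([n - 1] if n > 0 else []) for n in NODES}
GEN = {n: frozenset({f"e{n}"}) for n in NODES}
FACTS = frozenset(f"e{n}" for n in NODES)

FORWARD_PASSES = passes_until_stable(NODES, PREDS, GEN, FACTS, 0)
BACKWARD_PASSES = passes_until_stable(NODES[::-1], PREDS, GEN, FACTS, 0)
```

Visiting predecessors first lets one sweep carry facts down the whole chain, while the opposite order advances the correct information by only one node per sweep. Both orders reach the same sets, as the uniqueness result predicts.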

Ordering the Nodes to Maximize Propagation
[Figure: a CFG numbered in postorder and in reverse postorder]
- Reverse postorder visits predecessors before visiting a node
- Reverse postorder number = N + 1 - postorder number
- For backward problems, use reverse postorder computed on the reverse CFG (in general this is not the same order as reverse preorder on the CFG)
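
Reverse postorder falls out of a depth-first search. The sketch below numbers the example CFG, using successor lists inferred from the worked AVAIL example; the function names are illustrative, and the recursive DFS is fine for small graphs but would need an explicit stack for very deep ones.

```python
def postorder_numbers(succs, entry):
    """Number nodes 1..N in the order a DFS from `entry` finishes them."""
    visited, number = set(), {}

    def dfs(n):
        visited.add(n)
        for s in succs.get(n, []):
            if s not in visited:
                dfs(s)
        number[n] = len(number) + 1     # assigned when the node is finished

    dfs(entry)
    return number


def reverse_postorder(succs, entry):
    """Return (visit order, rpo numbers) with rpo = N + 1 - postorder number."""
    po = postorder_numbers(succs, entry)
    n_nodes = len(po)
    rpo_number = {n: n_nodes + 1 - k for n, k in po.items()}
    return sorted(rpo_number, key=rpo_number.get), rpo_number


# Successor lists for the lecture's example CFG (edges inferred, see lead-in).
SUCCS = {"A": ["B", "C"], "B": ["G"], "C": ["D", "E"],
         "D": ["F"], "E": ["F"], "F": ["G"], "G": []}

ORDER, RPO_NUMBER = reverse_postorder(SUCCS, "A")
```

The exact order depends on the order of the successor lists, but on an acyclic graph every valid reverse postorder places each node after all of its predecessors.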

Iterative Data-flow Analysis
Speed
For a problem with an admissible function space & a bounded semilattice,
if the functions all meet the rapid condition, i.e.,
  $\forall f, g \in F$, $\forall x \in L$, $f(g(\bot)) \geq g(\bot) \wedge f(x) \wedge x$
then a round-robin, reverse-postorder iterative algorithm will halt in d(G)+3 passes over a graph G
d(G) is the loop-connectedness of the graph w.r.t. a DFST
- Maximal number of back edges in an acyclic path
- Several studies suggest that, in practice, d(G) is small (<3)
- For most CFGs, d(G) is independent of the specific DFST
Sets stabilize in two passes around a loop
Each pass does O(E) meets & O(N) other operations

Iterative Data-flow Analysis
What does this mean?
Reverse postorder
- Easily computed order that increases propagation per pass
Round-robin iterative algorithm
- Visit all the nodes in a consistent order (RPO)
- Do it again until the sets stop changing
Rapid condition
- Most classic global data-flow problems meet this condition
These conditions are easily met
- Admissible framework, rapid function space
- Round-robin, reverse-postorder, iterative algorithm
⇒ The analysis runs in (effectively) linear time

Some problems are not admissible
Global constant propagation
First condition in admissibility: $\forall f \in F$, $\forall x, y \in L$, $f(x \wedge y) = f(x) \wedge f(y)$
Constant propagation is not admissible
- Kam & Ullman time bound does not hold
- There are tight time bounds, however, based on lattice height
- Require a variable-by-variable formulation
Example: a block containing … a ← b + c; the function f models the block's effects
  S1: {b=3, c=4}    f(S1) = {a=7, b=3, c=4}
  S2: {b=1, c=6}    f(S2) = {a=7, b=1, c=6}
  f(S1 ∧ S2) = Ø, but f(S1) ∧ f(S2) = {a=7}, so f(S1 ∧ S2) ≠ f(S1) ∧ f(S2)
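
The failure of distributivity for constant propagation can be reproduced in a few lines. This sketch assumes a simple representation in which a fact set is a dictionary from variable to known constant and the meet keeps only the facts on which both sides agree; `meet` and `f` are illustrative names.

```python
def meet(s1, s2):
    """Keep only the variable=constant facts on which both sides agree."""
    return {v: c for v, c in s1.items() if v in s2 and s2[v] == c}


def f(s):
    """Transfer function for the block `a <- b + c`."""
    out = {v: c for v, c in s.items() if v != "a"}   # old value of a is overwritten
    if "b" in s and "c" in s:
        out["a"] = s["b"] + s["c"]                   # a is constant only if b and c are
    return out


S1 = {"b": 3, "c": 4}
S2 = {"b": 1, "c": 6}

MEET_THEN_F = f(meet(S1, S2))        # apply f to the merged facts
F_THEN_MEET = meet(f(S1), f(S2))     # merge the results of f on each path
```

Merging before applying `f` loses the values of `b` and `c`, so `a` is unknown; applying `f` on each path first discovers that `a` is 7 either way.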

Some admissible problems are not rapid
Interprocedural May Modify sets
  shift(a,b,c,d,e,f) {
      local t;
      …
      call shift(t,a,b,c,d,e);
      f = 1;
      …
  }
Assume call-by-reference
Compute the set of variables (in shift) that can be modified by a call to shift
How long does it take?
- Iterations proportional to number of parameters
  - Not a function of the call graph
  - Can make example arbitrarily bad
- Proportional to length of chain of bindings…
- Nothing to do with d(G)
[Figure: shift and its parameters a, b, c, d, e, f]
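
The growth of the May Modify set for shift can be simulated directly. This is a rough sketch, not the lecture's algorithm: it assumes a single recursive call site, call-by-reference binding of each actual to the formal in the same position, and a simultaneous update in which each round sees only the previous round's set; `may_modify` is a hypothetical helper.

```python
def may_modify(formals, actuals, directly_modified):
    """Iterate MOD for one self-recursive procedure; return (mod, rounds)."""
    formal_set = set(formals)
    mod = set(directly_modified)
    rounds = 0
    while True:
        rounds += 1
        new_mod = set(mod)
        for formal, actual in zip(formals, actuals):
            # If the callee may modify `formal`, the call may modify `actual`.
            # Locals such as t are not visible to callers, so they are skipped.
            if formal in mod and actual in formal_set:
                new_mod.add(actual)
        if new_mod == mod:
            return mod, rounds
        mod = new_mod


# shift(a,b,c,d,e,f) { local t; call shift(t,a,b,c,d,e); f = 1; }
MOD6, ROUNDS6 = may_modify("abcdef", "tabcde", {"f"})

# The same pattern with three parameters, to show the dependence on chain length.
MOD3, ROUNDS3 = may_modify("abc", "tab", {"c"})
```

Each round adds exactly one more parameter to the set, so the round count tracks the length of the binding chain even though the call graph has a single node. A different evaluation order could shorten the count for this particular example.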

Extra Slides Start Here

Computing Available Expressions
$\text{AVAIL}(b) = \bigcap_{x \in pred(b)} \left( \text{DEEXPR}(x) \cup \left( \text{AVAIL}(x) \cap \overline{\text{EXPRKILL}(x)} \right) \right)$
where
- EXPRKILL(b) is the set of expressions killed in b, and
- DEEXPR(b) is the set of expressions defined in b and not subsequently killed in b
Initial condition
- $\text{AVAIL}(n_0) = \emptyset$, because nothing is computed before $n_0$
- The other nodes' AVAIL sets will be computed over their predecessors
- $n_0$ has no predecessor
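
The local sets can be gathered in one forward walk over a block. The sketch below assumes each statement is a (target, expression) pair and that a table maps every expression to its variable operands; `local_sets` and `OPERANDS` are illustrative names, and the blocks are taken from the lecture's example.

```python
def local_sets(block, operands):
    """Return (DEExpr, ExprKill) for one basic block.

    block: list of (target, expr) pairs in execution order.
    operands: dict mapping each expression to the set of variables it reads.
    """
    de_expr, expr_kill = set(), set()
    for target, expr in block:
        de_expr.add(expr)                           # expr is evaluated first
        killed = {e for e, ops in operands.items() if target in ops}
        de_expr -= killed                           # then the assignment kills
        expr_kill |= killed
    return de_expr, expr_kill


OPERANDS = {
    "a+b": {"a", "b"},
    "c+d": {"c", "d"},
    "e+f": {"e", "f"},
    "a+17": {"a"},
    "b+18": {"b"},
}

BLOCK_A = [("m", "a+b"), ("n", "a+b")]
BLOCK_D = [("e", "b+18"), ("s", "a+b"), ("u", "e+f")]
BLOCK_E = [("e", "a+17"), ("t", "c+d"), ("u", "e+f")]

DE_A, KILL_A = local_sets(BLOCK_A, OPERANDS)
DE_D, KILL_D = local_sets(BLOCK_D, OPERANDS)
DE_E, KILL_E = local_sets(BLOCK_E, OPERANDS)
```

In block D the assignment to `e` kills `e+f`, but the later evaluation of `e+f` puts it back into DEExpr, which is why it appears in both sets.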

Making Theory Concrete
Computing AVAIL for the example
$\text{AVAIL}(A) = \emptyset$
$\text{AVAIL}(B) = \{a+b\} \cup (\emptyset \cap \text{all}) = \{a+b\}$
$\text{AVAIL}(C) = \{a+b\}$
$\text{AVAIL}(D) = \{a+b, c+d\} \cup (\{a+b\} \cap \text{all}) = \{a+b, c+d\}$
$\text{AVAIL}(E) = \{a+b, c+d\}$
$\text{AVAIL}(F) = [\{b+18, a+b, e+f\} \cup (\{a+b, c+d\} \cap \{\text{all} - e+f\})] \cap [\{a+17, c+d, e+f\} \cup (\{a+b, c+d\} \cap \{\text{all} - e+f\})] = \{a+b, c+d, e+f\}$
$\text{AVAIL}(G) = [\{c+d\} \cup (\{a+b\} \cap \text{all})] \cap [\{a+b, c+d, e+f\} \cup (\{a+b, c+d, e+f\} \cap \text{all})] = \{a+b, c+d\}$
Blocks of the example CFG:
  A: m ← a + b; n ← a + b
  B: p ← c + d; r ← c + d
  C: q ← a + b; r ← c + d
  D: e ← b + 18; s ← a + b; u ← e + f
  E: e ← a + 17; t ← c + d; u ← e + f
  F: v ← a + b; w ← c + d; x ← e + f
  G: y ← a + b; z ← c + d

Redundancy Elimination Wrap-up
  Algorithm                       Acronym      Credits
  Local Value Numbering           LVN          Balke, 1967
  Superlocal Value Numbering      SVN          Many
  Dominator-based Value Num'g     DVNT         Simpson, 1996
  Global CSE (with AVAIL)         GCSE         Cocke, 1970
  SCC-based Value Numbering †     SCCVN/VDCM   Simpson, 1996
  Partitioning Algorithm †        AWZ          Alpern et al, 1988
  … and there are many others …
† We have not seen these ones (yet).
Three general approaches
- Hash-based, bottom-up techniques
- Data-flow techniques
- Partitioning
Each has strengths & weaknesses

Making Theory Concrete
Comparing the techniques
[Figure: the example CFG (blocks A through G) annotated with the labels LVN, SVN, DVN, and GRE]
The VN methods are ordered
- LVN ≤ SVN ≤ DVN (≤ SCCVN)
GRE is different
- Based on names, not value
- Two-phase algorithm
  - Analysis
  - Replacement

Redundancy Elimination Wrap-up
Comparisons
Better results in loops

Redundancy Elimination Wrap-up
Generalizations
- Hash-based methods are fastest
- AWZ* (& SCCVN) find the most cases
- Expect better results with larger scope
Experimental data
- Ran LVN, SVN, DVNT, AWZ
- Used global name space for DVNT
  - Requires offline replacement
  - Exposes more opportunities
- Code was compiled with lots of optimization
How did they do?
- DVNT beat AWZ
- Improvements grew with scope
- DVNT vs. SCCVN was ± 1%
- DVNT 6x faster than SCCVN
- SCCVN 2.5x faster than AWZ
* The partitioning method based on DFA minimization

Redundancy Elimination Wrap-up
Conclusions
- Redundancy elimination has some depth & subtlety
- Variations on names, algorithms & analysis matter
- Compile-time speed does not have to sacrifice code quality
DVNT is probably the method of choice
- Results quite close to the global methods (± 1%)
- Much lower costs than SCCVN or AWZ

Lattice Theory
This stuff is somewhat dry
Everybody stand up and stretch