Data Flow Analysis 3, 15-411 Compiler Design, Nov. 8, 2005.



Key Reference on Global Optimization

Gary A. Kildall, "A Unified Approach to Global Program Optimization," ACM Symposium on Principles of Programming Languages (POPL), 1973. From the abstract: "A technique is presented for global analysis of object code generated for expressions. The global expression optimization presented includes constant propagation, common sub-expression elimination, elimination of redundant register load operations and live expression analysis. A general purpose program flow analysis algorithm is developed which depends on an optimizing function. The algorithm is defined formally using a directed graph model of program flow structure and is shown to be correct. …"

Kildall’s Contribution

A number of techniques had been developed for compile-time optimization to
- locate redundant computations,
- perform constant computations,
- reduce the number of store-load sequences, etc.

Some provided analysis of only straight-line sequences of instructions; others tried to take program branching into account. Kildall gave a single unified flow-analysis algorithm which extended all the straight-line techniques to include branching. He stated the algorithm formally and proved it correct in his POPL paper.

Constant Propagation – Example program

begin
  integer i, a, b, c, d, e;
  a := 1;
  c := 0;
  …
  for i := 1 step 1 until 10 do
    begin
      b := 2;
      …
      d := a + b;
      …
      e := b + c;
      …
      c := 4;
      …
    end
end

Directed Graph Representation

Nodes represent sequences of instructions with no branches. Edges represent control flow between nodes.

Constant Propagation

It is convenient to associate a pool of propagated constants with each node in the graph. The pool is a set of ordered pairs which indicate the variables that have constant values when the node is encountered. The pool at node B, denoted P_B, consists of the single element (a, 1), since the assignment a := 1 must occur before B.

Constant Propagation (cont.)

The fundamental problem of constant propagation is to determine the pool of constants for each node in an arbitrary program graph. By inspection of the program graph for the example, the pool of constants at each node is:

P_A = ∅
P_B = {(a, 1)}
P_C = {(a, 1)}
P_D = {(a, 1), (b, 2)}
P_E = {(a, 1), (b, 2), (d, 3)}
P_F = {(a, 1), (b, 2), (d, 3)}

Constant Propagation (cont.)

P_N may be determined for each node N in the graph as follows:
- Consider each path (A, p_1, p_2, …, p_n, N). Apply constant propagation along the path to obtain the set of constants at node N.
- The intersection over every path to N is the set of constants which can be assumed for optimization. (It is unknown which path will be taken at execution time, so the intersection is the conservative choice.)
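Representing a constant pool as a Python dict from variables to values, the per-path intersection can be sketched as follows (the function name and representation are illustrative, not Kildall's notation):

```python
def intersect_pools(pools):
    """Intersect constant pools: keep only (variable, constant) pairs
    present in every pool, i.e. constants that hold on all paths."""
    result = dict(pools[0])
    for pool in pools[1:]:
        result = {v: c for v, c in result.items()
                  if v in pool and pool[v] == c}
    return result

# Two paths reach node D with different pools (from the example):
p_d1 = {"a": 1, "b": 2, "c": 0}
p_d2 = {"a": 1, "b": 2, "c": 4, "d": 3, "e": 2}
print(intersect_pools([p_d1, p_d2]))  # {'a': 1, 'b': 2}
```

The pair (c, 0) from the first path and (c, 4) from the second disagree, so c drops out of the intersection, matching P_D in the example.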

Global Analysis Algorithm – Informal

Start with an entry node in the program graph, along with a given entry pool corresponding to this entry node. Process the entry node and produce optimization information for all immediate successors of the entry node. Intersect the incoming optimizing pools with the already established pools at the successor nodes. (The first time a node is encountered, take the incoming pool as the first approximation and continue processing.) For each successor, if the amount of optimizing information is reduced by this intersection, then process the successor like the initial entry node.

Global Analysis Algorithm (cont.)

It is useful to define an optimizing function f which maps an input pool, together with a particular node, to a new output pool. Given a set of propagated constants, it is possible to examine the operation of a particular node and determine the set of constants that can be assumed after the node is executed. In the case of constant propagation, let V be a set of variables, C be a set of constants, and N be the set of nodes in the graph. The set U = V × C represents the ordered pairs which may appear in any constant pool. In fact, all constant pools are elements of the power set of U, denoted P(U). Thus f: N × P(U) → P(U), where (v, c) ∈ f(N, P) if and only if (cont.)

Global Analysis Algorithm (cont.)

1. (v, c) ∈ P and the operation at node N does not assign a new value to the variable v, or
2. the operation at node N assigns an expression to the variable v, and the expression evaluates to the constant c.
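A minimal sketch of this optimizing function for constant propagation, assuming each node carries a single assignment var := expr, where expr is either a literal, a variable, or an addition (a simplification of the general case; all names here are illustrative):

```python
def f(node, pool):
    """Transfer function: given a node ('var', expr) and an input
    constant pool (dict var -> const), return the output pool."""
    var, expr = node
    # Rule 1: pairs for other variables survive; the old binding of var is killed.
    out = {v: c for v, c in pool.items() if v != var}
    # Rule 2: if the assigned expression folds to a constant, record it.
    val = eval_const(expr, pool)
    if val is not None:
        out[var] = val
    return out

def eval_const(expr, pool):
    """Evaluate expr to a constant under pool, or None if unknown."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return pool.get(expr)
    op, lhs, rhs = expr  # e.g. ('+', 'a', 'b')
    l, r = eval_const(lhs, pool), eval_const(rhs, pool)
    if op == '+' and l is not None and r is not None:
        return l + r
    return None

# Node D from the example: d := a + b, with pool {(a, 1), (b, 2)}
print(f(("d", ("+", "a", "b")), {"a": 1, "b": 2}))  # {'a': 1, 'b': 2, 'd': 3}
```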

Constant Propagation (cont.)

Successively longer paths from A to D can be evaluated, resulting in P_D,3, P_D,4, …, P_D,n for arbitrarily large n. The pool of constants that can be assumed no matter what flow of control occurs is the set of constants common to all of the P_D,i, i.e. ∩_i P_D,i. This procedure is not effective, since the number of such paths may have no finite bound, and the procedure would not halt.

Optimizing Function for the Example

The optimizing function can be applied to node A with an empty constant pool, resulting in f(A, ∅) = {(a, 1)}. The function can then be applied to B with {(a, 1)} as the constant pool, yielding f(B, {(a, 1)}) = {(a, 1), (c, 0)}.

Extending f to Paths in the Graph

Given a path from entry node A to an arbitrary node N, the optimizing pool for the path is determined by composing the function f. For example, f(C, f(B, f(A, ∅))) = {(a, 1), (c, 0), (b, 2)} is the constant pool for D along the path (A, B, C, D).
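Extending f to a path is just left-to-right composition of per-node transfer functions; a sketch, where the three lambdas are hard-coded stand-ins for the example's assignments (a := 1 at A, c := 0 at B, b := 2 at C), not Kildall's notation:

```python
from functools import reduce

# Illustrative per-node transfer functions for nodes A, B, C.
transfer = {
    "A": lambda p: {**p, "a": 1},
    "B": lambda p: {**p, "c": 0},
    "C": lambda p: {**p, "b": 2},
}

def pool_along_path(path, entry_pool):
    """Compose transfer functions along a path: f(p_n, ... f(p_1, P) ...)."""
    return reduce(lambda pool, node: transfer[node](pool), path, entry_pool)

# Pool entering D along the path (A, B, C, D):
print(pool_along_path(["A", "B", "C"], {}))  # {'a': 1, 'c': 0, 'b': 2}
```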

Constant Propagation (cont.)

The pool of propagated constants at node D can be determined as follows. One path from entry node A to node D is (A, B, C, D). For this path the first approximation to the pool for D is P_D,1 = {(a, 1), (b, 2), (c, 0)}. A longer path from A to D is (A, B, C, D, E, F, C, D), which results in the pool P_D,2 = {(a, 1), (b, 2), (c, 4), (d, 3), (e, 2)}.

Computing the Pool of Optimizing Information

The pool of optimizing information which can be assumed at node N in the graph, independent of the path taken at execution time, is

P_N = ∩ { x | x ∈ F_N },

where F_N = { f(p_n, f(p_{n-1}, …, f(p_1, P)…)) | (p_1, p_2, …, p_n, N) is a path from an entry node p_1, with corresponding entry pool P, to node N }.

Directed Graphs and Paths

A finite directed graph G = (N, E) consists of a finite set of nodes N and a set of edges E ⊆ N × N. A path from node A to node B in G is a sequence (p_1, p_2, …, p_k) such that p_1 = A, p_k = B, and (p_i, p_{i+1}) ∈ E for 1 ≤ i < k. The length of the path is k − 1.

Program Graphs

A program graph is a finite directed graph G with a non-empty set of entry nodes I ⊆ N. For every N ∈ N we assume there exists a path (p_1, p_2, …, p_n) such that p_1 ∈ I and p_n = N (i.e., there is a path to every node in the graph from an entry node).

Successors and Predecessors of a Node

The set of immediate successors of a node N is given by I(N) = { N′ ∈ N | (N, N′) ∈ E }. The set of immediate predecessors of N is given by I⁻¹(N) = { N′ ∈ N | (N′, N) ∈ E }.

Meet-Semilattices

Let the finite set L be the set of all possible optimizing pools for a given application. Let ∧ be a meet operation with the properties:

∧ : L × L → L
x ∧ x = x
x ∧ y = y ∧ x
x ∧ (y ∧ z) = (x ∧ y) ∧ z

where x, y, z ∈ L. The set L and the ∧ operation define a finite meet-semilattice.

Ordering on Meet-Semilattices

The ∧ operation defines a partial ordering on L by: x ≤ y if and only if x ∧ y = x. Similarly, x < y if and only if x ≤ y and x ≠ y.
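For constant-propagation pools the meet is set intersection of (variable, constant) pairs, and the induced ordering is set inclusion. A small sketch checking x ≤ y iff x ∧ y = x (helper names are illustrative):

```python
def meet(x, y):
    """Meet of two constant pools, as frozensets of (var, const) pairs."""
    return x & y

def leq(x, y):
    """Induced partial order: x <= y iff meet(x, y) == x."""
    return meet(x, y) == x

x = frozenset({("a", 1), ("b", 2)})
y = frozenset({("a", 1), ("b", 2), ("d", 3)})
print(leq(x, y))                  # True: x carries less (or equal) information
print(leq(y, x))                  # False
print(meet(x, y) == meet(y, x))   # True: meet is commutative
```

Note the orientation: a smaller pool means less optimizing information, so the meet moves down the lattice toward the conservative answer.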

Generalized Meet Operation

If X ⊆ L, the generalized meet operation ∧X is defined as the pairwise application of ∧ to the elements of X. L is assumed to have a "zero element" 0 such that 0 ≤ x for all x ∈ L. An augmented set L′ is constructed from L by adding a "unit element" 1 such that 1 is not in L and 1 ∧ x = x for all x in L. That is, L′ = L ∪ {1}. It follows that x < 1 for all x in L.

Optimizing Function

An "optimizing function" f is defined as f: N × L → L. It must have the homomorphism property: f(N, x ∧ y) = f(N, x) ∧ f(N, y) for all N ∈ N and x, y ∈ L. Note that f(N, x) < 1 for all N ∈ N and x ∈ L.

Global Analysis Algorithm

Global analysis starts with an entry pool set EP ⊆ I × L, where (e, x) ∈ EP if e ∈ I is an entry node with optimizing pool x ∈ L.

A1 [initialize] L := EP.
A2 [terminate?] If L = ∅, then halt.
A3 [select node] Let L′ ∈ L, L′ = (N, P_i) for some N ∈ N and P_i ∈ L. Then L := L − {L′}.
A4 [traverse] Let P_N be the current approximate pool for node N (initially P_N = 1). If P_N ≤ P_i, then go to step A2.
A5 [set pool] P_N := P_N ∧ P_i, L := L ∪ { (N′, f(N, P_N)) | N′ ∈ I(N) }.
A6 [loop] Go to step A2.
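Steps A1–A6 can be sketched directly as a worklist loop. In this sketch (an illustration, not Kildall's own code) pools are frozensets of (variable, constant) pairs, the meet is intersection, the unit element 1 is modeled by an absent entry, and the graph and transfer functions are stand-ins:

```python
def kildall(successors, transfer, entry_pools):
    """Kildall's global analysis algorithm (A1-A6).
    successors:  dict node -> list of successor nodes
    transfer:    dict node -> (pool -> pool), the optimizing function f
    entry_pools: dict entry node -> entry pool (frozenset of (var, const))
    Returns the final approximate pool P_N for each reached node."""
    approx = {}                             # P_N; a missing key models the unit element 1
    worklist = list(entry_pools.items())    # A1: L := EP
    while worklist:                         # A2: halt when L is empty
        node, pool = worklist.pop()         # A3: select and remove an element
        current = approx.get(node)
        if current is not None and current <= pool:
            continue                        # A4: P_N <= P_i, no new information
        # A5: P_N := P_N meet P_i; send f(N, P_N) to all immediate successors
        approx[node] = pool if current is None else current & pool
        out = transfer[node](approx[node])
        for succ in successors.get(node, []):
            worklist.append((succ, out))    # A6: loop back to A2
    return approx

# Tiny two-node example: A performs a := 1, B performs b := a + 1.
succ = {"A": ["B"], "B": []}
f = {
    "A": lambda p: p | {("a", 1)},
    "B": lambda p: p | {("b", 2)} if ("a", 1) in p else p,
}
result = kildall(succ, f, {"A": frozenset()})
print(result["B"])  # frozenset({('a', 1)})
```

On a graph with loops the A4 test is what guarantees termination: a node is reprocessed only when the meet strictly shrinks its pool, and a finite semilattice admits only finitely many strict descents.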