Topic 3: Flow Analysis José Nelson Amaral


CMPUT680 Topic 3: Flow Analysis José Nelson Amaral

Reading List
Slides
Tiger book: section 8.2, chapter 10 (page 218), chapter 18 (pp. 408-418)
Dragon book: chapter 10

Flow Analysis
Flow analysis comprises control flow analysis and data flow analysis, and can be applied at three scopes:
- Interprocedural: program
- Intraprocedural: procedure
- Local: basic block
Control Flow Analysis: determine the control structure of a program and build a Control Flow Graph.
Data Flow Analysis: determine the flow of scalar values and build Data Flow Graphs.
Solution to the Flow Analysis Problem: propagate data flow information along a flow graph.

Front End of a Compiler
Lexical Analyzer (Scanner) + Syntax Analyzer (Parser) + Semantic Analyzer → Abstract Syntax Tree with attributes → Intermediate-code Generator → Non-enhanced Intermediate Code
The front end also emits error messages.

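Component-Based Approach to Building Compilers
Source program in Language-1 → Language-1 Front End → Non-optimized Intermediate Code
Source program in Language-2 → Language-2 Front End → Non-optimized Intermediate Code
Non-optimized Intermediate Code → Intermediate-code Enhancer → Optimized Intermediate Code
Optimized Intermediate Code → Target-1 Code Generator → Target-1 machine code
Optimized Intermediate Code → Target-2 Code Generator → Target-2 machine code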

Advantages of Using an Intermediate Language 1. Retargeting - Build a compiler for a new machine by attaching a new code generator to an existing front-end. 2. Code Improvements - reuse intermediate code improvements in compilers for different languages and different machines. Note: the terms “intermediate code”, “intermediate language”, and “intermediate representation” are all used interchangeably.

The Phases of a Compiler
Source statement:
  position := initial + rate * 60
After the lexical analyzer:
  id1 := id2 + id3 * 60
After the syntax analyzer (syntax tree, written in prefix form):
  := (id1, + (id2, * (id3, 60)))
After the semantic analyzer:
  := (id1, + (id2, * (id3, inttoreal (60))))
After the intermediate code generator:
  temp1 := inttoreal (60)
  temp2 := id3 * temp1
  temp3 := id2 + temp2
  id1 := temp3
After the code enhancer:
  temp1 := id3 * 60.0
  id1 := id2 + temp1
After the code generator:
  MOVF id3, R2
  MULF #60.0, R2
  MOVF id2, R1
  ADDF R2, R1
  MOVF R1, id1

Flow Analysis Motivation: Constant Propagation
  S1: A ← 2 (def of A)
  S2: B ← 10 (def of B)
  Sk: C ← A + B
  Sk+1: Do I = 1, C
  .
Is C a constant?

Introduction to Code Transformations
Code transformation: a program transformation that preserves correctness and attempts to improve the performance (e.g., response time, throughput, space, power dissipation) of the input program.
Code transformations may be performed at multiple levels of program representation:
1. Source code
2. Intermediate code
3. Target machine code

Basic Blocks A basic block is a sequence of consecutive intermediate language statements in which flow of control can only enter at the beginning and leave at the end. Only the last statement of a basic block can be a branch statement and only the first statement of a basic block can be a target of a branch. In some frameworks, procedure calls may occur within a basic block. (AhoSethiUllman, pp. 529)

Basic Block Partitioning Algorithm
1. Identify leader statements (i.e. the first statements of basic blocks) by using the following rules:
(i) The first statement in the program is a leader.
(ii) Any statement that is the target of a branch statement is a leader (for most intermediate languages these are statements with an associated label).
(iii) Any statement that immediately follows a branch or return statement is a leader.
(AhoSethiUllman, pp. 529)

Example: Finding Leaders
The following code computes the inner product of two vectors.
Source code:
  begin
    prod := 0;
    i := 1;
    do begin
      prod := prod + a[i] * b[i];
      i := i + 1;
    end
    while i <= 20
Three-address code:
  (1) prod := 0
  (2) i := 1
  (3) t1 := 4 * i
  (4) t2 := a[t1]
  (5) t3 := 4 * i
  (6) t4 := b[t3]
  (7) t5 := t2 * t4
  (8) t6 := prod + t5
  (9) prod := t6
  (10) t7 := i + 1
  (11) i := t7
  (12) if i <= 20 goto (3)
(AhoSethiUllman, pp. 529)

Example: Finding Leaders
Applying the rules to the three-address code of the inner-product example, where statement (13) … stands for the code that follows the loop:
Rule (i): statement (1) is a leader because it is the first statement in the program.
Rule (ii): statement (3) is a leader because it is the target of the branch in statement (12).
Rule (iii): statement (13) is a leader because it immediately follows the branch in statement (12).

Forming the Basic Blocks Now that we know the leaders, how do we form the basic blocks associated with each leader? 2. The basic block corresponding to a leader consists of the leader, plus all statements up to but not including the next leader or up to the end of the program.
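The two steps of the partitioning algorithm can be sketched in Python. This is a minimal illustration and not part of the original slides: the function names `find_leaders` and `basic_blocks`, the list-of-strings encoding of the three-address code, and the `goto (n)` pattern used to detect branch targets are all assumptions made for the example. Statement (13) is a placeholder for the rest of the program.

```python
import re

def find_leaders(code):
    """Return the sorted 1-based statement numbers that are leaders.

    `code` is a list of three-address statements; statement k is code[k-1].
    Branch targets are assumed to be written as 'goto (n)'.
    """
    leaders = {1}                                   # rule (i): first statement
    for k, stmt in enumerate(code, start=1):
        branch = re.search(r"goto \((\d+)\)", stmt)
        if branch:
            leaders.add(int(branch.group(1)))       # rule (ii): branch target
        if (branch or stmt.startswith("return")) and k < len(code):
            leaders.add(k + 1)                      # rule (iii): follows a branch/return
    return sorted(leaders)

def basic_blocks(code):
    """Step 2: each block runs from a leader up to (not including) the next leader."""
    leaders = find_leaders(code)
    next_starts = leaders[1:] + [len(code) + 1]
    return [(start, nxt - 1) for start, nxt in zip(leaders, next_starts)]

CODE = [
    "prod := 0", "i := 1", "t1 := 4 * i", "t2 := a[t1]", "t3 := 4 * i",
    "t4 := b[t3]", "t5 := t2 * t4", "t6 := prod + t5", "prod := t6",
    "t7 := i + 1", "i := t7", "if i <= 20 goto (3)",
    "...",  # placeholder for statement (13), the code after the loop
]

print(find_leaders(CODE))   # [1, 3, 13]
print(basic_blocks(CODE))   # [(1, 2), (3, 12), (13, 13)]
```

Each pair is the (first, last) statement number of a block, so the three pairs correspond to B1, B2, and B3 of the example.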

Example: Forming the Basic Blocks
Basic Blocks:
B1:
  (1) prod := 0
  (2) i := 1
B2:
  (3) t1 := 4 * i
  (4) t2 := a[t1]
  (5) t3 := 4 * i
  (6) t4 := b[t3]
  (7) t5 := t2 * t4
  (8) t6 := prod + t5
  (9) prod := t6
  (10) t7 := i + 1
  (11) i := t7
  (12) if i <= 20 goto (3)
B3:
  (13) …

Control Flow Graph (CFG) A control flow graph (CFG), or simply a flow graph, is a directed multigraph in which: (i) the nodes are basic blocks; and (ii) the edges represent flow of control (branches or fall-through execution). The basic block whose leader is the first intermediate language statement is called the start node. In a CFG we have no information about the data. Therefore an edge in the CFG means that the program may take that path.

Control Flow Graph (CFG) There is a directed edge from basic block B1 to basic block B2 in the CFG if: (1) There is a branch from the last statement of B1 to the first statement of B2, or (2) Control flow can fall through from B1 to B2 because: (i) B2 immediately follows B1, and (ii) B1 does not end with an unconditional branch
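The two edge rules can be sketched in Python as follows. This is an illustrative sketch, not the slides' implementation: the function name `cfg_edges`, the `(first, last)` block encoding, and the convention that an unconditional branch starts with `goto` are assumptions. Edges are returned as a list rather than a set because a CFG is in general a multigraph.

```python
import re

def cfg_edges(code, blocks):
    """Return CFG edges as (from_block, to_block) pairs, blocks numbered from 1.

    `code` is a list of statements (statement k is code[k-1]) and `blocks`
    is a list of (first, last) statement ranges in program order.
    """
    block_of_leader = {first: idx for idx, (first, _) in enumerate(blocks, start=1)}
    edges = []
    for idx, (_, last) in enumerate(blocks, start=1):
        stmt = code[last - 1]
        branch = re.search(r"goto \((\d+)\)", stmt)
        if branch:
            # Rule (1): branch from the last statement of this block
            # to the first statement of the target block.
            edges.append((idx, block_of_leader[int(branch.group(1))]))
        unconditional = stmt.startswith("goto") or stmt.startswith("return")
        if not unconditional and idx < len(blocks):
            # Rule (2): fall through to the block that immediately follows.
            edges.append((idx, idx + 1))
    return edges

CODE = [
    "prod := 0", "i := 1", "t1 := 4 * i", "t2 := a[t1]", "t3 := 4 * i",
    "t4 := b[t3]", "t5 := t2 * t4", "t6 := prod + t5", "prod := t6",
    "t7 := i + 1", "i := t7", "if i <= 20 goto (3)",
    "...",  # placeholder for statement (13)
]
BLOCKS = [(1, 2), (3, 12), (13, 13)]  # B1, B2, B3 from the example

print(cfg_edges(CODE, BLOCKS))  # [(1, 2), (2, 2), (2, 3)]
```

The edge (2, 2) is the loop back to the start of B2; the other two edges are fall-through edges.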

Example: Control Flow Graph Formation
Basic blocks: B1 = statements (1) to (2); B2 = statements (3) to (12); B3 = statement (13) onward.
Edges:
B1 → B2 by Rule (2): control falls through from statement (2) to statement (3).
B2 → B2 by Rule (1): the branch in statement (12) targets statement (3), the first statement of B2.
B2 → B3 by Rule (2): B2 ends with a conditional branch, so control can fall through to statement (13).

CFGs are Multigraphs
Note: there may be multiple edges from one basic block to another in a CFG. Therefore, in general the CFG is a multigraph. The edges are distinguished by their condition labels. A trivial example is given below:
Basic Block B1:
  [101] . . .
  [102] if i > n goto L1
Basic Block B2:
  [103] label L1:
  [104] . . .
There are two edges from B1 to B2, one labeled True and one labeled False.

Identifying loops Question: Given the control flow graph of a procedure, how can we identify loops? Answer: We use the concept of dominance.

Dominators
A node a in a CFG dominates a node b if every path from the start node to node b goes through a. We say that node a is a dominator of node b.
The dominator set of node b, dom(b), is formed by all nodes that dominate b.
Note: by definition, each node dominates itself, therefore, b ∈ dom(b).

Domination Relation
Definition: Let G = (N, E, s) denote a flowgraph, where N is the set of vertices, E is the set of edges, and s is the starting node, and let a ∈ N, b ∈ N.
1. a dominates b, written a ≤ b, if every path from s to b contains a.
2. a properly dominates b, written a < b, if a ≤ b and a ≠ b.
3. a directly (immediately) dominates b, written a <d b, if a < b and there is no c ∈ N such that a < c < b.
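One common way to compute the dominator sets is an iterative fixed-point computation: start by assuming every node dominates every other node, then repeatedly intersect over predecessors. The sketch below is an illustration under assumptions, not the slides' algorithm: the function name `dominators` and the five-node flowgraph are hypothetical.

```python
def dominators(nodes, edges, start):
    """Return a dict mapping each node b to dom(b), the set of nodes that dominate b."""
    preds = {n: set() for n in nodes}
    for a, b in edges:
        preds[b].add(a)

    dom = {n: set(nodes) for n in nodes}   # optimistic initial guess
    dom[start] = {start}                   # only the start node dominates itself
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == start:
                continue
            incoming = [dom[p] for p in preds[n]]
            # n is dominated by itself plus whatever dominates all of its predecessors.
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

# Hypothetical flowgraph: 1 -> 2, a diamond 2 -> {3, 4} -> 5, and an edge 5 -> 2.
NODES = [1, 2, 3, 4, 5]
EDGES = [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5), (5, 2)]
DOM = dominators(NODES, EDGES, start=1)

print(DOM[5])                      # {1, 2, 5}
print(2 in DOM[5] and 2 != 5)      # True: 2 properly dominates 5
print(3 in DOM[5])                 # False: 5 can be reached through 4 instead
```

In this hypothetical graph node 2 is the immediate dominator of node 5 even though 2 is not a predecessor of 5.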

An Example
[Figure: flowgraph with nodes 1 to 10; S marks the start]
Domination relation: { (1, 1), (1, 2), (1, 3), (1, 4), …, (2, 3), (2, 4), …, (2, 10) }
Direct domination: 1 <d 2, 2 <d 3, …
Dominator sets:
  DOM(1) = {1}
  DOM(2) = {1, 2}
  DOM(3) = {1, 2, 3}
  DOM(10) = {1, 2, 10}

Question Assume that node a is an immediate dominator of a node b. Is a necessarily an immediate predecessor of b in the flow graph?

Example
Answer: NO!
Example: consider nodes 5 and 8.
[Figure: flowgraph with nodes 1 to 10; S marks the start]

Dominance Intuition
Imagine a source of light at the start node, and that the edges are optical fibers. To find which nodes are dominated by a given node a, place an opaque barrier at a and observe which nodes become dark.
[Figure: flowgraph with nodes 1 to 10; S marks the start]
The start node dominates all nodes in the flowgraph.
Which nodes are dominated by node 3? Node 3 dominates nodes 3, 4, 5, 6, 7, 8, and 9.
Which nodes are dominated by node 7? Node 7 only dominates itself.

Finding Loops
Motivation: Programs spend most of the execution time in loops, therefore there is a larger payoff for optimizations that exploit loop structure.
How do we identify loops in a flow graph? The goal is to create a uniform treatment for program loops written using different loop structures (e.g. while, for) and loops constructed out of goto’s.
Basic idea: Use a general approach based on analyzing graph-theoretical properties of the CFG.

Definition A strongly-connected component (SCC) of a flowgraph G = (N, E, s) is a subgraph G’ = (N’, E’, s’) in which there is a path from each node in N’ to every node in N’. A strongly-connected component G’ = (N’, E’, s’) of a flowgraph G = (N, E, s) is a loop with entry s’ if s’ dominates all nodes in N’.

Example
[Figure: flow graph with nodes 1, 2, and 3]
In the flow graph, do nodes 2 and 3 form a loop?
Nodes 2 and 3 form a strongly connected component, but they are not a loop. Why? No node in the subgraph dominates all the other nodes, therefore this subgraph is not a loop.

How to Find Loops?
Look for “back edges”. An edge (b,a) of a flowgraph G is a back edge if a dominates b, a ≤ b.
[Figure: flowgraph with a start node and an edge from node b back to node a]

Natural Loops
Given a back edge (b,a), a natural loop associated with (b,a) with entry in node a is the subgraph formed by a plus all nodes that can reach b without going through a.
[Figure: flowgraph with a start node and a back edge from node b to node a]

Natural Loops
One way to find natural loops is:
1) find a back edge (b,a)
2) find the nodes that are dominated by a.
3) look for nodes that can reach b, without going through a, among the nodes dominated by a.

An Example
Find all back edges in this graph and the natural loop associated with each back edge.
[Figure: flowgraph with nodes 1 to 10]
Back edge    Natural loop
(9,1)        Entire graph
(10,7)       {7,8,10}
(7,4)        {4,5,6,7,8,10}
(8,3)        {3,4,5,6,7,8,10}
(4,3)        {3,4,5,6,7,8,10}
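The search for back edges and natural loops can be sketched in Python. The transcript does not preserve the edges of the figure, so the edge list `EDGES` below is an assumed reconstruction, chosen so that it reproduces the back edges and loops listed in the example; the function names are also hypothetical. The loop body is collected by walking predecessors backward from b and stopping at a.

```python
def dominators(nodes, edges, start):
    """Iterative computation of dom(n) for every node n."""
    preds = {n: set() for n in nodes}
    for a, b in edges:
        preds[b].add(a)
    dom = {n: set(nodes) for n in nodes}
    dom[start] = {start}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == start:
                continue
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

def back_edges(nodes, edges, start):
    """An edge (b, a) is a back edge if a dominates b."""
    dom = dominators(nodes, edges, start)
    return sorted((b, a) for (b, a) in edges if a in dom[b])

def natural_loop(edges, back_edge):
    """Node a plus every node that can reach b without going through a."""
    b, a = back_edge
    preds = {}
    for x, y in edges:
        preds.setdefault(y, set()).add(x)
    loop = {a, b}
    worklist = [b] if b != a else []
    while worklist:
        n = worklist.pop()
        for p in preds.get(n, ()):
            if p not in loop:        # never expands past a, which is already in the loop
                loop.add(p)
                worklist.append(p)
    return loop

# Assumed edge list (not given in the transcript).
NODES = list(range(1, 11))
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 3), (4, 5), (4, 6), (5, 7),
         (6, 7), (7, 4), (7, 8), (8, 3), (8, 9), (8, 10), (9, 1), (10, 7)]

for edge in back_edges(NODES, EDGES, start=1):
    print(edge, sorted(natural_loop(EDGES, edge)))
# (4, 3) [3, 4, 5, 6, 7, 8, 10]
# (7, 4) [4, 5, 6, 7, 8, 10]
# (8, 3) [3, 4, 5, 6, 7, 8, 10]
# (9, 1) [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# (10, 7) [7, 8, 10]
```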

A Dominator Tree
A dominator tree is a useful way to represent the dominance relation. In a dominator tree the start node s is the root, and each node d dominates only its descendants in the tree.
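A dominator tree can be derived from the dominator sets: the parent of each node is its immediate dominator, which is the strict dominator with the largest dominator set. The sketch below assumes every node is reachable from the start node; the function names and the dominator sets in `DOM` (for a hypothetical five-node flowgraph with edges 1→2, 2→3, 2→4, 3→5, 4→5, 5→2) are illustrative assumptions.

```python
def immediate_dominators(dom, start):
    """Map each node other than start to its immediate dominator.

    `dom[n]` is the set of nodes that dominate n. The dominators of a node
    form a chain, so the closest strict dominator is the one whose own
    dominator set is largest.
    """
    idom = {}
    for n, dominators_of_n in dom.items():
        if n == start:
            continue
        strict = dominators_of_n - {n}
        idom[n] = max(strict, key=lambda d: len(dom[d]))
    return idom

def dominator_tree(dom, start):
    """Return the tree as a dict from each parent to its sorted list of children."""
    children = {}
    for node, parent in immediate_dominators(dom, start).items():
        children.setdefault(parent, []).append(node)
    return {parent: sorted(kids) for parent, kids in children.items()}

# Dominator sets of the hypothetical flowgraph described above.
DOM = {1: {1}, 2: {1, 2}, 3: {1, 2, 3}, 4: {1, 2, 4}, 5: {1, 2, 5}}

print(immediate_dominators(DOM, start=1))  # {2: 1, 3: 2, 4: 2, 5: 2}
print(dominator_tree(DOM, start=1))        # {1: [2], 2: [3, 4, 5]}
```

Node 5 hangs directly under node 2 in the tree, although in the hypothetical flowgraph its predecessors are nodes 3 and 4.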

A Dominator Tree (Example)
[Figure: a flowgraph with nodes 1 to 10 and the corresponding dominator tree]

Generating CFG from a WHIRL Tree
Source code:
  void MatrixVectorMultiply(double *C, double *A, double *B, int dimension)
  {
    int i, j;
    for(i=0 ; i<dimension ; i++) {
      C[i] = 0.0;
      for(j=0 ; j<dimension ; j++)
        C[i] = C[i] + A[i*dimension+j]*B[j];
    }
  }
[Figure: Highest WHIRL Representation of the function. The tree nodes include FUNC_ENTRY (MatrixVectorMultiply), IDNAME (C), IDNAME (A), IDNAME (B), IDNAME (dimension), BLOCK, STID (i), STID (j), CV(0), WHILE, GT, LDID (i), LDID (dimension), F8ISTORE, F8CONST(0.0), U8ADD, U8MPY, LDID (C), U8I8CVT, CV(8), and RETURN.]
The CFG is built by walking the tree. The nodes generated, in order, are:
  FuncEntry
  i = 0;
  L1: test (i<dimension), body, merge
  body of L1: C[i] = 0.0; j = 0;
  L2: test, body, merge
See file opt_cfg.cxx at: ORC2.0/osprey1.0/be/opt/

Regions
A region is a set of nodes N that includes a header with the following properties:
(i) the header must dominate all the nodes in the region;
(ii) all the edges between nodes in N are in the region.
A single-entry-single-exit (SESE) region has an entry node that dominates all nodes in the region and an exit node that post-dominates all nodes in the region.
A loop is a special region that forms a strongly connected component. A loop must have a single entry but may have multiple exits.
Typically we are interested in studying the data flow into and out of regions, for instance, which definitions reach a region.

Points and Paths
Points in a basic block:
- between statements
- before the first statement
- after the last statement
[Figure: flowgraph with basic blocks B1 to B6. B1 contains d1: i := m-1, d2: j := n, d3: a := u1; B2 contains d4: i := i+1; B3 contains d5: j := j+1; B5 contains d6: a := u2]
In the example, how many points do basic blocks B1, B2, B3, and B5 have?
B1 has four; B2, B3, and B5 have two points each.
(AhoSethiUllman, pp. 609)

Points and Paths
A path is a sequence of points p1, p2, …, pn such that, for each i, either:
(i) pi immediately precedes a statement S and pi+1 immediately follows S, or
(ii) pi is the end of a basic block and pi+1 is the beginning of a successor block.
In the example, is there a path from the beginning of block B5 to the beginning of block B6?
Yes, it travels through the end point of B5 and then through all the points in B2, B3, and B4.

Global Dataflow Analysis
Motivation: We need to know variable def and use information between basic blocks for:
- constant folding
- dead-code elimination
- redundant-computation elimination
- code motion
- induction-variable elimination
- data-dependence-graph (DDG) construction

Definition and Use
1. Definition & Use
Sk: V1 = V2 + V3
Sk is a definition of V1.
Sk is a use of V2 and V3.

Reach and Kill
Kill: a definition d1 of a variable v is killed between p1 and p2 if in every path from p1 to p2 there is another definition of v.
Reach: a definition di reaches a point pj if there exists a path di → pj, and di is not killed along the path.
[Figure: flowgraph containing d1: x := … and d2: x := …, with two marked points]
In the example, do d1 and d2 reach the two marked points? Both d1 and d2 reach the first marked point, but only d1 reaches the second.

Problem Formulation: Example 1
  d1: x := exp1
  s1: if p > 0
  s2: x := x + 1
  s3: a = b + c
  s4: e = x + 1
[Figure: flowgraph of the code above, with point p1 marked]
Can d1 reach point p1? It depends on what point p1 represents!

Problem Formulation: Example 2
  d1: x := exp1
  s2: while y > 0 do
  s3:   a := b + 2
  d4:   x := exp2
  s5:   c := a + 1
      end while
[Figure: flowgraph of the code above, with the loop test shown as "if y > 0" and point p3 marked]
Can d1 and d4 reach point p3?

Available Expressions
An expression x+y is available at a point p if:
(1) Every path from the start node to p evaluates x+y.
(2) In each path, after the last evaluation prior to reaching p, there are no subsequent assignments to x or to y.
We say that a basic block kills expression x+y if it may assign x or y, and does not subsequently recompute x+y.

Available Expression: Example 3
  B1: S1: X = A * B + C
  B2: S2: Y = A * B + C
  B3: S3: C = 1
  B4: S4: Z = A * B + C - D * E
Is expression A * B available at the beginning of basic block B4?

Redundant Expressions: Example 3
  B1: S1: TEMP = A * B
          X = TEMP + C
  B2: S2: TEMP = A * B
          Y = TEMP + C
  B3: S3: C = 1
  B4: S4: Z = TEMP + C - D * E
Yes, because it is generated in all paths leading to B4 and it is not killed after its generation in any path. Thus the redundant expression can be eliminated.

D-U and U-D Chains (Motivation)
Many dataflow analyses need to find the use-sites of each defined variable or the definition-sites of each variable used in an expression. Def-Use (D-U) and Use-Def (U-D) chains are efficient data structures that keep this information.
Notice that when code is represented in Static Single-Assignment (SSA) form (as in most modern compilers) there is no need to maintain D-U and U-D chains.

UD chain
A UD chain is a list of all definitions that can reach a given use of a variable.
  S1’: v = ...
  ...
  Sm’: v = ...
  ...
  Sn: ... = … v …
A UD chain: UD(Sn, v) = (S1’, …, Sm’).

DU chain
A DU chain is a list of all uses that can be reached by a given definition of a variable.
  Sn’: v = …
  ...
  S1: … = … v …
  ...
  Sk: … = … v …
A DU chain: DU(Sn’, v) = (S1, …, Sk).
(AhoSethiUllman, pp. 632)
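For straight-line code inside one basic block, both kinds of chain can be built in a single pass by remembering the most recent definition of each variable. The sketch below covers only that simple case: with branches or loops a use may be reached by several definitions, and the chains would have to come from a reaching-definitions analysis. The function name `build_chains` and the statement encoding are assumptions made for the example.

```python
def build_chains(stmts):
    """Build UD and DU chains for straight-line code.

    `stmts` is a list of (label, defined_variable_or_None, used_variables).
    Returns (ud, du) where ud[(use_label, v)] lists the definitions reaching
    that use and du[(def_label, v)] lists the uses reached by that definition.
    """
    last_def = {}   # variable -> label of its most recent definition
    ud, du = {}, {}
    for label, target, uses in stmts:
        # Uses are processed before the definition, so 'v = v + 1'
        # reads the previous definition of v.
        for v in uses:
            if v in last_def:
                d = last_def[v]
                ud[(label, v)] = [d]
                du.setdefault((d, v), []).append(label)
        if target is not None:
            last_def[target] = label
            du.setdefault((label, target), [])
    return ud, du

# S1: v = ...   S2: x = v + 1   S3: v = x   S4: y = v + x
STMTS = [
    ("S1", "v", []),
    ("S2", "x", ["v"]),
    ("S3", "v", ["x"]),
    ("S4", "y", ["v", "x"]),
]
UD, DU = build_chains(STMTS)

print(UD[("S4", "v")])   # ['S3']  (S1 is killed by S3 before reaching S4)
print(DU[("S2", "x")])   # ['S3', 'S4']
print(DU[("S1", "v")])   # ['S2']
```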

Reaching Definitions Problem Statement: Determine the set of definitions reaching a point in a program. To solve this problem we must take into consideration the data flow and the control flow in the program. A common method to solve such a problem is to create a set of data-flow equations.

Global Data-Flow Analysis
Set up dataflow equations for each basic block. For reaching definitions the equation is:
  out[S] = gen[S] ∪ (in[S] - kill[S])
Note: the dataflow equations depend on the problem statement.
(AhoSethiUllman, pp. 608)

Data-Flow Analysis of Structured Programs
Structured programs have a useful property: there is a single point of entrance and a single exit point for each statement.
We will consider program statements that can be described by the following syntax:
  Statement → id := Expression
            | Statement ; Statement
            | if Expression then Statement else Statement
            | do Statement while Expression
  Expression → id + id
             | id
(AhoSethiUllman, pp. 611)

Data-Flow Analysis of Structured Programs
S ::= id := E | S ; S | if E then S else S | do S while E
E ::= id + id | id
This restricted syntax results in the forms depicted below for flow graphs.
[Figure: flow graphs for S1; S2, for if E then S1 else S2, and for do S1 while E]
(AhoSethiUllman, pp. 611)

Dataflow Equations for Reaching Definition Data-flow equations for reaching definitions (gen and kill):
S → d: a := b + c
  gen[S] = {d}
  kill[S] = Da - {d}   (Da - {d} represents all other definitions of a in the program)
S → S1 ; S2
  gen[S] = gen[S2] ∪ (gen[S1] - kill[S2])
  kill[S] = kill[S2] ∪ (kill[S1] - gen[S2])
S → if E then S1 else S2
  gen[S] = gen[S1] ∪ gen[S2]
  kill[S] = kill[S1] ∩ kill[S2]
S → do S1 while E
  gen[S] = gen[S1]
  kill[S] = kill[S1]
(AhoSethiUllman, pp. 612)
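The gen and kill equations can be evaluated bottom-up over the syntax tree. The following Python sketch assumes a small hypothetical tree representation (`Assign`, `Seq`, `If`, `DoWhile`) that is not part of the slides; expressions are omitted because they do not affect gen or kill.

```python
from dataclasses import dataclass

@dataclass
class Assign:          # d: a := ...
    label: str
    target: str

@dataclass
class Seq:             # S1 ; S2
    first: object
    second: object

@dataclass
class If:              # if E then S1 else S2
    then_branch: object
    else_branch: object

@dataclass
class DoWhile:         # do S1 while E
    body: object

def collect_defs(stmt, defs_of=None):
    """Build D_a for every variable a: variable -> labels of its definitions."""
    defs_of = {} if defs_of is None else defs_of
    if isinstance(stmt, Assign):
        defs_of.setdefault(stmt.target, set()).add(stmt.label)
    elif isinstance(stmt, Seq):
        collect_defs(stmt.first, defs_of)
        collect_defs(stmt.second, defs_of)
    elif isinstance(stmt, If):
        collect_defs(stmt.then_branch, defs_of)
        collect_defs(stmt.else_branch, defs_of)
    elif isinstance(stmt, DoWhile):
        collect_defs(stmt.body, defs_of)
    return defs_of

def gen_kill(stmt, defs_of):
    """Return (gen[S], kill[S]) using the four equations above."""
    if isinstance(stmt, Assign):
        return {stmt.label}, defs_of[stmt.target] - {stmt.label}
    if isinstance(stmt, Seq):
        g1, k1 = gen_kill(stmt.first, defs_of)
        g2, k2 = gen_kill(stmt.second, defs_of)
        return g2 | (g1 - k2), k2 | (k1 - g2)
    if isinstance(stmt, If):
        g1, k1 = gen_kill(stmt.then_branch, defs_of)
        g2, k2 = gen_kill(stmt.else_branch, defs_of)
        return g1 | g2, k1 & k2
    if isinstance(stmt, DoWhile):
        return gen_kill(stmt.body, defs_of)
    raise TypeError(f"unknown statement: {stmt!r}")

# Hypothetical program: d1: i := ...; if E then d2: i := ... else d3: j := ...
prog = Seq(Assign("d1", "i"), If(Assign("d2", "i"), Assign("d3", "j")))
gen_s, kill_s = gen_kill(prog, collect_defs(prog))

# Hypothetical program: d1: i := ...; d2: i := ...
straight = Seq(Assign("d1", "i"), Assign("d2", "i"))
gen_t, kill_t = gen_kill(straight, collect_defs(straight))
```

In the first program d1 survives because only one branch of the conditional redefines i, so the intersection in the kill equation is empty. In the second program d2 always overwrites d1, so d1 is killed.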

Dataflow Equations for Reaching Definition Data-flow equations for reaching definitions (in and out):
S → d: a := b + c
  out[S] = gen[S] ∪ (in[S] - kill[S])
S → S1 ; S2
  in[S1] = in[S]
  in[S2] = out[S1]
  out[S] = out[S2]
S → if E then S1 else S2
  in[S1] = in[S]
  in[S2] = in[S]
  out[S] = out[S1] ∪ out[S2]
S → do S1 while E
  in[S1] = in[S] ∪ out[S1]
  out[S] = out[S1]
(AhoSethiUllman, pp. 612)

Dataflow Analysis: An Example Using RD (reaching definition) as an example:
d1: i = 0   (before the loop)
loop L: d2: i = i + 1
Question: What is the set of reaching definitions at the exit of the loop L?
in[L] = {d1} ∪ out[L]
gen[L] = {d2}
kill[L] = {d1}
out[L] = gen[L] ∪ (in[L] - kill[L])
in[L] depends on out[L], and out[L] depends on in[L]!

Solution: Iterative flow propagation For the loop L (d1: i = 0 before the loop, d2: i = i + 1 inside it) the equations are:
in[L] = {d1} ∪ out[L]
gen[L] = {d2}
kill[L] = {d1}
out[L] = gen[L] ∪ (in[L] - kill[L])
Initialization:
out[L] = ∅
First iteration:
in[L] = {d1} ∪ out[L] = {d1}
out[L] = gen[L] ∪ (in[L] - kill[L]) = {d2} ∪ ({d1} - {d1}) = {d2}
Second iteration:
in[L] = {d1} ∪ out[L] = {d1, d2}
out[L] = gen[L] ∪ (in[L] - kill[L]) = {d2} ∪ ({d1, d2} - {d1}) = {d2} ∪ {d2} = {d2}
We reached the fixed point!
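The two iterations above can be reproduced with a short Python sketch. The function name `solve_loop` and its parameters are hypothetical; the sets are the ones given for loop L.

```python
def solve_loop(entry_defs, gen, kill):
    """Iterate in[L] = entry_defs ∪ out[L] and
    out[L] = gen ∪ (in[L] - kill) until out[L] stops changing."""
    out = set()              # initialization: out[L] = ∅
    passes = 0
    while True:
        passes += 1
        in_ = entry_defs | out
        new_out = gen | (in_ - kill)
        if new_out == out:   # fixed point: this pass changed nothing
            return in_, out, passes
        out = new_out

in_L, out_L, passes = solve_loop(entry_defs={"d1"}, gen={"d2"}, kill={"d1"})
```

The second pass recomputes out[L] = {d2} without change, which is how the sketch detects the fixed point.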


Iterative Algorithm for Reaching Definitions Step 1: Compute gen and kill for each basic block.
B1: d1: i := m-1; d2: j := n; d3: a := u1
B2: d4: i := i+1; d5: j := j-1
B3: d6: a := u2
B4: d7: i := u3
gen[B1] = {d1, d2, d3}   kill[B1] = {d4, d5, d6, d7}
gen[B2] = {d4, d5}       kill[B2] = {d1, d2, d7}
gen[B3] = {d6}           kill[B3] = {d3}
gen[B4] = {d7}           kill[B4] = {d1, d4}
(AhoSethiUllman, pp. 626)

Iterative Algorithm for Reaching Definitions Step 2: For every basic block B, set in[B] = ∅ and out[B] = gen[B].
Initialization:
in[B1] = ∅   out[B1] = {d1, d2, d3}
in[B2] = ∅   out[B2] = {d4, d5}
in[B3] = ∅   out[B3] = {d6}
in[B4] = ∅   out[B4] = {d7}

Iterative Algorithm for Reaching Definitions To simplify the representation, the in[B] and out[B] sets are represented by bit strings. Assuming the representation d1d2d3 d4d5d6d7 we obtain:
Initialization:
in[B1] = ∅ = 000 0000   out[B1] = {d1, d2, d3} = 111 0000
in[B2] = ∅ = 000 0000   out[B2] = {d4, d5} = 000 1100
in[B3] = ∅ = 000 0000   out[B3] = {d6} = 000 0010
in[B4] = ∅ = 000 0000   out[B4] = {d7} = 000 0001
(AhoSethiUllman, pp. 627)
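A minimal Python sketch of the bit-string encoding, assuming plain integers as bit vectors with d1 as the leftmost of 7 bits. The helper names `to_bits`, `show`, and `transfer` are hypothetical. With this encoding, set union becomes bitwise OR and set difference becomes AND with the complement.

```python
N = 7  # number of definitions d1..d7

def to_bits(defs):
    """Encode a set of definition numbers (1..7); d1 is the leftmost bit."""
    mask = 0
    for d in defs:
        mask |= 1 << (N - d)
    return mask

def show(mask):
    """Format as 'd1d2d3 d4d5d6d7', matching the notation on the slide."""
    s = format(mask, "07b")
    return s[:3] + " " + s[3:]

gen = {"B1": to_bits({1, 2, 3}), "B2": to_bits({4, 5}),
       "B3": to_bits({6}), "B4": to_bits({7})}
kill = {"B1": to_bits({4, 5, 6, 7}), "B2": to_bits({1, 2, 7}),
        "B3": to_bits({3}), "B4": to_bits({1, 4})}

def transfer(block, in_mask):
    """out[B] = gen[B] ∪ (in[B] - kill[B]) on bit vectors."""
    return gen[block] | (in_mask & ~kill[block])

initial_out = {b: show(gen[b]) for b in gen}
# Illustration only: apply B2's transfer function to in = out[B1].
out_b2 = show(transfer("B2", gen["B1"]))
```

Each transfer function then costs a constant number of word operations when the number of definitions fits in a machine word.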

Iterative Algorithm for Reaching Definitions
gen[B1] = {d1, d2, d3}   kill[B1] = {d4, d5, d6, d7}
gen[B2] = {d4, d5}       kill[B2] = {d1, d2, d7}
gen[B3] = {d6}           kill[B3] = {d3}
gen[B4] = {d7}           kill[B4] = {d1, d4}
while a fixed point is not found:
  in[B] = ∪ out[P], where P is a predecessor of B
  out[B] = gen[B] ∪ (in[B] - kill[B])
Notation: d1d2d3 d4d5d6d7
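The loop above can be sketched in Python as follows. The gen and kill sets are the ones listed on the slide. The predecessor lists are an assumption: the flow-graph edges did not survive in the text, so the sketch assumes the edges B1→B2, B2→B3, B2→B4, B3→B4, and B4→B2. The function name `reaching_definitions` is hypothetical.

```python
def reaching_definitions(blocks, preds, gen, kill):
    """Round-robin iteration until no out[B] changes."""
    in_ = {b: set() for b in blocks}
    out = {b: set(gen[b]) for b in blocks}    # Step 2: out[B] = gen[B]
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        for b in blocks:
            # in[B] is the union of out[P] over all predecessors P of B.
            in_[b] = set().union(*(out[p] for p in preds[b]))
            new_out = gen[b] | (in_[b] - kill[b])
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return in_, out, passes

blocks = ["B1", "B2", "B3", "B4"]
# ASSUMED edges (not recoverable from the slide text).
preds = {"B1": [], "B2": ["B1", "B4"], "B3": ["B2"], "B4": ["B2", "B3"]}
gen = {"B1": {"d1", "d2", "d3"}, "B2": {"d4", "d5"},
       "B3": {"d6"}, "B4": {"d7"}}
kill = {"B1": {"d4", "d5", "d6", "d7"}, "B2": {"d1", "d2", "d7"},
        "B3": {"d3"}, "B4": {"d1", "d4"}}

in_sets, out_sets, passes = reaching_definitions(blocks, preds, gen, kill)
```

Under the assumed edges, the sketch stops after three passes over the blocks, the last of which only confirms that nothing changed.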

Algorithm Convergence Intuitively, we can observe that the algorithm converges to a fixed point because the out[B] sets never decrease in size and are bounded by the finite set of definitions in the program. It can be shown that an upper bound on the number of iterations required to reach a fixed point is the number of nodes in the flow graph. Intuitively, if a definition reaches a point, it can reach that point through a cycle-free path, and no cycle-free path can be longer than the number of nodes in the graph. Empirical evidence suggests that for real programs the number of iterations required to reach a fixed point is fewer than five. (AhoSethiUllman, pp. 626)