Control Flow Analysis (Chapter 7)


Outline
- What is Control Flow Analysis?
- Structure of an optimizing compiler
- Constructing basic blocks
- Depth first search
- Finding dominators
- Reducibility
- Interval and Structural Analysis
- Conclusions

Control Flow Analysis
Input: a sequence of IR instructions
Output:
- a partition of the IR into basic blocks
- a control flow graph
- the loop structure

Compiler Structure
String of characters → Scanner → tokens → Parser → AST → Semantic analyzer → IR → Code Generator → Object code
Supporting components: symbol table and access routines, OS interface

Optimizing Compiler Structure
String of characters → Front-End → IR → Control Flow Analysis → CFG → Data Flow Analysis → CFG + information → Program Transformations → IR → Instruction selection → Object code

An Example: Reaching Definitions
- A definition is an assignment to a variable.
- A definition d reaches a program point if there exists an execution path to that point along which the value assigned at d is still active, that is, the variable has not been reassigned on the way.

Running Example: MIR intermediate code for a C routine

C source:
    unsigned int fib(unsigned int m)
    {
        unsigned int f0 = 0, f1 = 1, f2, i;
        if (m <= 1) {
            return m;
        } else {
            for (i = 2; i <= m; i++) {
                f2 = f0 + f1; f0 = f1; f1 = f2;
            }
            return f2;
        }
    }

MIR:
     1:      receive m(val)
     2:      f0 ← 0
     3:      f1 ← 1
     4:      if m <= 1 goto L3
     5:      i ← 2
     6: L1:  if i <= m goto L2
     7:      return f2
     8: L2:  f2 ← f0 + f1
     9:      f0 ← f1
    10:      f1 ← f2
    11:      i ← i + 1
    12:      goto L1
    13: L3:  return m

Sets of reaching definitions shown beside the listing, in slide order (definitions are named by instruction number):
{1}; {1, 2}; {1, 2, 3}; {1, 2, 3, 5}; {1, 2, 3, 5, 8, 9, 10, 11}; {1, 2, 3, 5, 8, 9, 10, 11}; {1, 3, 5, 8, 9, 10, 11}; {1, 5, 8, 9, 10, 11}; {1, 8, 9, 10, 11}

Our first task in analyzing a program is to discover its control structure. The control structure is obvious in source code but hard to see in intermediate code, so we build a visual representation of it, a flowchart, as shown in the figure.

Identify basic blocks. Informally, a basic block is a straight-line sequence of code that can be entered only at the beginning and exited only at the end. The flowchart nodes are grouped as follows:
- nodes 1-4 form block B1
- node 12 forms block B2
- node 5 forms block B3
- node 6 forms block B4
- node 7 forms block B5
- nodes 8-11 form block B6

Approaches for Control Flow Analysis
- Iterative: compute natural loops and iterate on the CFG
- Interval based: reduce the CFG to a single node and inductively define the data flow solution
- Structural: identify control flow structures in the CFG

[Figures: a sequence of slides showing the running example's MIR listing as a flow graph between entry and exit nodes. Two of them annotate the graph with the definition sets {2, 3} and {2, 3, 5, 8, 9, 10, 11}. The others collapse the graph step by step and label the regions with pairs of sets: ({9, 10}, {1, 2, 3}), ({11}, {5}) and ({2, 3, 5}, {8, 9, 10, 11}); then pairs ending in {8, 9, 10, 11}, {8, 9, 10, 11, 5} and finally {1, 2, 3, 8, 9, 10, 11, 5}. The first member of these last pairs did not survive extraction.]

Finding Basic Blocks
A basic block is a maximal sequence of straight-line IR instructions, with no fork or join inside it.
A leader is an IR instruction that is one of:
- the entry of a routine
- a target of a branch
- an instruction immediately following a branch

Constructing basic blocks
Input: a sequence of MIR instructions
Output: a list of basic blocks where each MIR instruction occurs in exactly one block
Method:
Determine the leaders of the basic blocks:
- the first instruction in the procedure is a leader
- any instruction that is the target of a jump is a leader
- any instruction after a branch is a leader
For each leader, its basic block consists of:
- the leader and
- all instructions up to but not including the next leader or the end of the program
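The method above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the MIR instructions are kept as plain strings, labels are recognised by a leading `name:` prefix and branches by the word `goto`, and the helper names `find_leaders` and `basic_blocks` are made up for this sketch.

```python
import re

# MIR listing of the running example; list index i holds instruction i + 1.
MIR = [
    "receive m(val)",
    "f0 <- 0",
    "f1 <- 1",
    "if m <= 1 goto L3",
    "i <- 2",
    "L1: if i <= m goto L2",
    "return f2",
    "L2: f2 <- f0 + f1",
    "f0 <- f1",
    "f1 <- f2",
    "i <- i + 1",
    "goto L1",
    "L3: return m",
]


def find_leaders(code):
    """Return the sorted 0-based indices of the leader instructions."""
    # Map each label to the index of the instruction that carries it.
    labels = {}
    for idx, ins in enumerate(code):
        m = re.match(r"(\w+):", ins)
        if m:
            labels[m.group(1)] = idx

    leaders = {0}  # rule 1: the first instruction of the procedure
    for idx, ins in enumerate(code):
        m = re.search(r"goto (\w+)", ins)
        if m:
            leaders.add(labels[m.group(1)])  # rule 2: target of a jump
            if idx + 1 < len(code):
                leaders.add(idx + 1)         # rule 3: instruction after a branch
    return sorted(leaders)


def basic_blocks(code):
    """Partition the code into blocks, given as lists of 1-based instruction numbers."""
    leaders = find_leaders(code)
    bounds = leaders + [len(code)]
    return [list(range(start + 1, end + 1))
            for start, end in zip(bounds, bounds[1:])]


for block in basic_blocks(MIR):
    print(block)
```

On the running example this yields six blocks, covering instructions 1-4, 5, 6, 7, 8-12 and 13. Each instruction lands in exactly one block because consecutive leaders delimit disjoint, contiguous ranges.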

Running Example
[Three consecutive slides repeat the C routine fib and its MIR listing from the running example above. The last one marks the basic blocks B1 to B6 on the listing.]

Constructing Control Flow Graph (CFG)
- A special entry block r without predecessors
- A special exit block without successors
- There is an edge m → n if one of the following holds:
  - m = entry and the first instruction in n begins the procedure
  - n = exit and the last instruction in m is a return or the last instruction in the procedure
  - there is a branch from the last instruction in m to the first instruction in n
  - the first instruction in n immediately follows the last instruction in m, and that last instruction is not an unconditional branch
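The edge rules above can be tried out on the running example. The following Python sketch makes several assumptions that are not in the slides: blocks are named by the MIR instruction numbers they cover rather than B1 to B6, instructions are plain strings, and `build_cfg` is a hypothetical helper name.

```python
import re

# Basic blocks of the running example in program order. Each block is named
# by the instruction numbers it covers (a convention chosen for this sketch).
BLOCKS = [
    ("1-4", ["receive m(val)", "f0 <- 0", "f1 <- 1", "if m <= 1 goto L3"]),
    ("5", ["i <- 2"]),
    ("6", ["L1: if i <= m goto L2"]),
    ("7", ["return f2"]),
    ("8-12", ["L2: f2 <- f0 + f1", "f0 <- f1", "f1 <- f2",
              "i <- i + 1", "goto L1"]),
    ("13", ["L3: return m"]),
]


def build_cfg(blocks):
    """Return a successor map that includes the special entry and exit nodes."""
    # A label can only sit on the first instruction of a block.
    label_to_block = {}
    for name, code in blocks:
        m = re.match(r"(\w+):", code[0])
        if m:
            label_to_block[m.group(1)] = name

    succ = {"entry": [blocks[0][0]], "exit": []}
    for pos, (name, code) in enumerate(blocks):
        last = re.sub(r"^\w+:\s*", "", code[-1])  # drop a leading label
        out = []
        if last.startswith("return"):
            out.append("exit")                    # return leads to exit
        else:
            jump = re.search(r"goto (\w+)", last)
            if jump:
                out.append(label_to_block[jump.group(1)])  # branch edge
            if not last.startswith("goto"):
                # Fall-through edge: last instruction is not an
                # unconditional branch.
                if pos + 1 < len(blocks):
                    out.append(blocks[pos + 1][0])
                else:
                    out.append("exit")
        succ[name] = out
    return succ


cfg = build_cfg(BLOCKS)
for node, targets in cfg.items():
    print(node, "->", targets)
```

A conditional branch produces two edges (target and fall-through), an unconditional `goto` produces one, and both `return` instructions lead to the exit node.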

Running Example
[Figures: the MIR listing of the running example partitioned into the basic blocks B1 to B6, then drawn as a control flow graph between the entry and exit nodes.]

How to treat call instructions?
- A call is an atomic instruction
- A call ends a basic block
- Replace the call by the procedure body (inline)
- A call is a "goto" into the procedure
- A call is handled in a special way

Potential Difficulties
- Gotos outside procedure boundaries
- Exit/Trap calls
- Exception handling
- Computed gotos
- setjmp(), longjmp() calls

DFS, Preorder Traversal and Postorder Traversal
All of these, together with breadth-first search, apply to rooted directed graphs and therefore to flow graphs.
DFS visits the descendants of a node in the graph before visiting any of its siblings that are not also its descendants.

Rooted directed graph with Depth-First presentation of it.

DFS algorithm

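The depth-first presentation includes all the graph's nodes and the edges that make up the depth-first order, displayed as a tree. The edges that are part of the depth-first spanning tree are called tree edges. The edges that are not part of it are divided into three classes:
1. Forward edges (F): from a node to a descendant that is not its child in the tree
2. Back edges (B): from a node to one of its ancestors in the tree (or to itself)
3. Cross edges (C): between nodes where neither is an ancestor of the other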

A single instance of DFS can compute both a depth-first spanning tree and the preorder and postorder traversals of a graph G = <N, E> with root r. For the graph in the figure, the DFS order is 1, 2, 3, 4, 5, 6, 7, 8 and the BFS order is 1, 2, 6, 3, 4, 5, 7, 8.
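As a rough illustration of how one DFS pass can produce the preorder, the postorder and the edge classes at once, here is a small recursive sketch in Python. The four-node graph `GRAPH` is made up for this example (it is not the graph from the slide), and `depth_first_search` is a hypothetical helper name.

```python
# Hypothetical rooted directed graph, chosen so that every edge class occurs.
GRAPH = {
    1: [2, 3, 4],
    2: [4],
    3: [2],
    4: [1],
}


def depth_first_search(succ, root):
    """Return (preorder, postorder, kinds); kinds maps each edge to its class."""
    pre, post = {}, {}            # visit and finish numbers per node
    preorder, postorder = [], []
    kinds = {}

    def visit(n):
        pre[n] = len(pre)
        preorder.append(n)
        for s in succ.get(n, []):
            if s not in pre:
                kinds[(n, s)] = "tree"      # first time we reach s
                visit(s)
            elif s not in post:
                kinds[(n, s)] = "back"      # s is still open: an ancestor of n
            elif pre[n] < pre[s]:
                kinds[(n, s)] = "forward"   # s is a finished descendant of n
            else:
                kinds[(n, s)] = "cross"     # s finished before n was reached
        post[n] = len(post)
        postorder.append(n)

    visit(root)
    return preorder, postorder, kinds


preorder, postorder, kinds = depth_first_search(GRAPH, 1)
print(preorder)
print(postorder)
for edge, kind in kinds.items():
    print(edge, kind)
```

The classification depends on the order in which successors are visited, so a different successor order can turn a tree edge into a forward or cross edge. Recursion keeps the sketch short; very deep graphs would need an explicit stack.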

Dominators and Postdominators
To determine the loops in a flow graph, we first define a binary relation called dominance on flow graph nodes. Node d dominates node i, written d dom i, if every possible execution path from entry to i includes d.
dom is:
- reflexive (every node dominates itself)
- transitive (if a dom b and b dom c, then a dom c)
- antisymmetric (if a dom b and b dom a, then b = a)

Immediate dominance, idom: for a ≠ b, a idom b iff a dom b and there does not exist a node c such that c ≠ a and c ≠ b for which a dom c and c dom b. We write idom(b) to denote the immediate dominator of b. Clearly, the immediate dominator of a node is unique.

The immediate dominance relation forms a tree of the nodes of a flow graph whose root is the entry node, whose edges are the immediate dominances, and whose paths display all the dominance relationships.
d strictly dominates i, written d sdom i, if d dominates i and d ≠ i.
Node p postdominates node i, written p pdom i, if every possible execution path from i to exit includes p, i.e. p dom i in the flow graph with all the edges reversed and entry and exit interchanged.
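The definitions of dom, idom and pdom can be checked on the running example with a simple iterative set computation. The slides do not say which algorithm they use, so the sketch below is just one straightforward choice: it starts every dominator set at "all nodes" and shrinks the sets until nothing changes. The block names (MIR instruction ranges) and all function names are assumptions of this sketch.

```python
# CFG of the running example; blocks are named by their MIR instruction
# ranges (a naming convention assumed for this sketch).
SUCC = {
    "entry": ["1-4"],
    "1-4": ["13", "5"],
    "5": ["6"],
    "6": ["8-12", "7"],
    "7": ["exit"],
    "8-12": ["6"],
    "13": ["exit"],
    "exit": [],
}


def predecessors(succ):
    """Reverse the edges: map every node to the list of its predecessors."""
    pred = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            pred[t].append(n)
    return pred


def dominators(succ, root):
    """Map each node to its set of dominators.

    Assumes every node is reachable from root.
    """
    pred = predecessors(succ)
    nodes = set(succ)
    dom = {n: set(nodes) for n in succ}
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for n in succ:
            if n == root:
                continue
            new = set(nodes)
            for p in pred[n]:
                new &= dom[p]     # d dominates n only if it dominates all preds
            new.add(n)            # dom is reflexive
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def immediate_dominators(dom, root):
    """The strict dominators of n form a chain; idom(n) is the one closest to n,
    which is the one with the largest dominator set of its own."""
    return {n: max(ds - {n}, key=lambda d: len(dom[d]))
            for n, ds in dom.items() if n != root}


def postdominators(succ, exit_node):
    """Dominators of the reversed graph, with exit playing the role of entry."""
    return dominators(predecessors(succ), exit_node)


dom = dominators(SUCC, "entry")
idom = immediate_dominators(dom, "entry")
pdom = postdominators(SUCC, "exit")
print(idom)
print(pdom["6"])
```

In this graph the loop header (block 6) dominates both the loop body 8-12 and the block 7 that returns f2, while the exit node is immediately dominated by block 1-4 because two different paths reach it.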

Loops and strongly connected components
A back edge indicates a loop: in the figure, the back edge from d to c identifies a loop.

Given a back edge m → n, the natural loop of m → n is the subgraph consisting of the set of nodes containing n and all the nodes from which m can be reached in the flow graph without passing through n, together with the edge set connecting all the nodes in its node set. Node n is the loop header. An algorithm computes the set of nodes of the loop and, if needed, its set of edges.
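One common way to compute a natural loop is a backward walk from m that never goes past n. The Python sketch below follows that idea; it is not taken from the slides, and the block names (MIR instruction ranges of the running example) and the function name `natural_loop` are assumptions.

```python
# CFG of the running example; blocks are named by their MIR instruction
# ranges (a naming convention assumed for this sketch).
SUCC = {
    "entry": ["1-4"],
    "1-4": ["13", "5"],
    "5": ["6"],
    "6": ["8-12", "7"],
    "7": ["exit"],
    "8-12": ["6"],
    "13": ["exit"],
    "exit": [],
}


def natural_loop(succ, m, n):
    """Return (nodes, edges) of the natural loop of the back edge m -> n."""
    pred = {v: [] for v in succ}
    for v, targets in succ.items():
        for t in targets:
            pred[t].append(v)

    loop = {n, m}
    stack = [m] if m != n else []
    while stack:
        p = stack.pop()
        for q in pred[p]:
            # The header n is already in the set, so the walk stops there.
            if q not in loop:
                loop.add(q)
                stack.append(q)

    # Keep only the edges whose endpoints both lie inside the loop.
    edges = {(a, b) for a in loop for b in succ[a] if b in loop}
    return loop, edges


nodes, edges = natural_loop(SUCC, "8-12", "6")
print(sorted(nodes))
print(sorted(edges))
```

For the back edge from block 8-12 to block 6, the loop consists of exactly those two blocks, with block 6 as the header.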

Many optimizations require moving code from inside a loop to just before its header. To make this possible, we introduce the concept of a preheader. A preheader is an initially empty block placed just before the header of a loop, such that all the edges that previously went to the header from outside the loop now go to the preheader, and there is a single new edge from the preheader to the header.
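The edge rewiring described here is small enough to sketch directly. In the Python example below, the CFG is a successor map for the running example with blocks named by MIR instruction ranges, and both the helper name `insert_preheader` and the preheader name `pre6` are made up for illustration.

```python
# CFG of the running example; blocks are named by their MIR instruction
# ranges (a naming convention assumed for this sketch).
SUCC = {
    "entry": ["1-4"],
    "1-4": ["13", "5"],
    "5": ["6"],
    "6": ["8-12", "7"],
    "7": ["exit"],
    "8-12": ["6"],
    "13": ["exit"],
    "exit": [],
}


def insert_preheader(succ, loop, header, name):
    """Return a new successor map with an empty preheader block called name."""
    new_succ = {}
    for n, targets in succ.items():
        if n in loop:
            # Edges from inside the loop (such as the back edge) are untouched.
            new_succ[n] = list(targets)
        else:
            # Edges entering the header from outside now go to the preheader.
            new_succ[n] = [name if t == header else t for t in targets]
    new_succ[name] = [header]  # the single new edge preheader -> header
    return new_succ


with_pre = insert_preheader(SUCC, {"6", "8-12"}, "6", "pre6")
for node, targets in with_pre.items():
    print(node, "->", targets)
```

After the transformation, block 5 branches to `pre6`, `pre6` falls into the header 6, and the back edge from 8-12 still targets the header directly, so code placed in `pre6` runs once before the loop rather than on every iteration.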

Loop without and with pre header

Unless two natural loops have the same header, they are either disjoint or one is nested within the other. The most general looping structure that may occur is a strongly connected component (SCC) of a flow graph, which is a subgraph Gs = <Ns, Es> such that every node in Ns is reachable from every other node in Ns by a path that includes only edges in Es. A strongly connected component is maximal if every strongly connected component containing it is the component itself.

Reducibility
The term comes from several kinds of transformations that can be applied to flow graphs: they collapse subgraphs into single nodes and hence reduce the flow graph. The patterns that make flow graphs irreducible are called improper regions. They are multiple-entry strongly connected components of a flow graph.

The simplest improper regions are the two-entry loop and the three-entry loop shown in the figure. It is easy to show how to produce an infinite sequence of distinct improper regions beginning with these two.

Practical approaches to deal with irreducibility
1. Perform iterative data flow analysis on the irreducible regions and plug the results into the data flow equations for the rest of the flow graph.
2. Use a technique called node splitting, which transforms irreducible regions into reducible ones.
3. Perform an induced iteration on the lattice of monotone functions from the lattice to itself (chap 8).

The result of node splitting to B3

Interval analysis and control trees
Interval analysis is a name given to several approaches to both control flow and data flow analysis. In control flow analysis, interval analysis refers to dividing up the flow graph into regions of various sorts and consolidating each region into a new node. A flow graph resulting from one or more such transformations is called an abstract flow graph.

Applying a sequence of such transformations produces a control tree, defined as follows:
1. The root of the control tree is an abstract graph representing the original flow graph.
2. The leaves of the control tree are individual basic blocks.
3. The nodes between the root and the leaves are abstract nodes representing regions of the flow graph.
4. The edges represent the relationship between each abstract node and the regions that are its descendants.

Example: T1-T2 Analysis
- T1: collapses a one-node self loop to a single node
- T2: collapses a sequence of two nodes, such that the first is the only predecessor of the second, to a single node
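A flow graph is reducible exactly when repeated application of T1 and T2 shrinks it to a single node, which gives a simple (if not efficient) reducibility test. The Python sketch below applies the two transformations until neither fires. The graph encoding, the function names and the three-node irreducible example are assumptions made for illustration. The reducible example is the running example's CFG with blocks named by MIR instruction ranges.

```python
def t1_t2_reduce(succ, entry):
    """Apply T1 and T2 until neither applies; return the remaining graph."""
    g = {n: set(targets) for n, targets in succ.items()}
    changed = True
    while changed:
        changed = False
        # T1: remove self loops.
        for n in g:
            if n in g[n]:
                g[n].discard(n)
                changed = True
        # T2: merge a node into its unique predecessor.
        for n in list(g):
            if n == entry:
                continue
            preds = [p for p in g if n in g[p]]
            if len(preds) == 1:
                p = preds[0]
                g[p].discard(n)
                g[p] |= g[n]      # p inherits the successors of n
                del g[n]
                changed = True
                break             # predecessor lists changed; rescan
    return g


def is_reducible(succ, entry):
    return len(t1_t2_reduce(succ, entry)) == 1


# CFG of the running example (block names are MIR instruction ranges).
FIB_CFG = {
    "entry": ["1-4"], "1-4": ["13", "5"], "5": ["6"], "6": ["8-12", "7"],
    "7": ["exit"], "8-12": ["6"], "13": ["exit"], "exit": [],
}

# Hypothetical two-entry loop: b and c form a cycle that a can enter at either node.
TWO_ENTRY_LOOP = {"a": ["b", "c"], "b": ["c"], "c": ["b"]}

print(is_reducible(FIB_CFG, "entry"))
print(is_reducible(TWO_ENTRY_LOOP, "a"))
```

The running example collapses completely, while the two-entry loop gets stuck with three nodes because both b and c keep two predecessors, which matches the description of improper regions as multiple-entry strongly connected components.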

Structural Analysis
- Identify "common" structures in the control flow graph (even irreducible ones)
- Reduce the CFG into "simple regions"
- Shift some data flow analysis from compile time to compiler-generation time
- Can be efficiently implemented via DFS

Examples of acyclic regions used in structural analysis

Examples of cyclic regions used in structural analysis

Structural analysis of a flow graph