Advanced Compiler Techniques LIU Xianhua School of EECS, Peking University Static Single Assignment.

Presentation transcript:

Advanced Compiler Techniques LIU Xianhua School of EECS, Peking University Static Single Assignment

Content
- SSA introduction
- Converting to SSA
- SSA example
- SSAPRE
- Reading: Tiger Book 19.1, 19.3; related papers

Prelude
SSA (Static Single Assignment): a program is said to be in SSA form iff
- each variable is statically defined exactly once, and
- each use of a variable is dominated by that variable's definition.
So, is straight-line code in SSA form?

Example
- In general, how do we transform an arbitrary program into SSA form?
- Does the definition of X2 dominate its use in the example?
[Figure: two definitions of X reach a join point, where X3 ← φ(X1, X2)]

SSA: Motivation
- Provides a uniform IR basis for solving a wide range of classical dataflow problems
- Encodes both dataflow and control-flow information
- SSA form can be constructed and maintained efficiently
- Many SSA dataflow analysis algorithms are more efficient (have lower complexity) than their CFG counterparts

Static Single-Assignment Form
- Each variable has only one definition in the program text.
- This single static definition can be in a loop and may be executed many times. Thus, even in a program expressed in SSA form, a variable can be dynamically defined many times.

Advantages of SSA
- Simpler dataflow analysis
- No need for use-def / def-use chains, which require N × M space for N uses and M definitions
- SSA form relates in a useful way to dominance structures
- Differentiates unrelated uses of the same variable, e.g. loop induction variables

SSA Is Used in Modern Compilers
[Figure: compiler phases: front end; interprocedural analysis and optimization; loop nest optimization and parallelization; global (scalar) optimization; backend code generation. Labels: "Good IR", "Middle-End"]


Def-Use Chain
- A chain is a tuple that connects two data-flow events
  - Chains express data-flow relationships directly
  - Chains provide a graphical representation
  - Chains jump across unrelated code, simplifying search
- We can build chains efficiently
  - There are four interesting types of chains
  - Def-use chains are the most common

Def-Use Chains Example

Building Def-Use Chains
- To construct def-use chains, we solve reaching definitions
- A definition d of some variable v reaches an operation i if and only if i reads v and there is a v-clear path from d to i
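As a rough illustration of the two steps above, the following Python sketch first solves reaching definitions with a simple iterative pass and then links each definition to the uses it reaches. The statement encoding (one `(target, uses)` pair per operation), the small diamond-shaped CFG, and the name `def_use_chains` are assumptions made for this example.

```python
from collections import defaultdict

def def_use_chains(blocks, succs):
    """blocks: {name: [(target or None, [used vars]), ...]}
    succs:  {name: [successor block names]}
    Returns {(def_block, def_index): {(use_block, use_index), ...}}."""
    preds = defaultdict(list)
    for b, ss in succs.items():
        for s in ss:
            preds[s].append(b)

    def transfer(reach, b, i, target):
        # A new definition of `target` kills every other definition of it.
        if target is None:
            return reach
        return {d for d in reach if d[0] != target} | {(target, b, i)}

    # Reaching definitions at block exit, as (var, block, index) triples.
    out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b, stmts in blocks.items():
            reach = set().union(*(out[p] for p in preds[b]))
            for i, (target, _) in enumerate(stmts):
                reach = transfer(reach, b, i, target)
            if reach != out[b]:
                out[b], changed = reach, True

    # Link every use to the definitions that reach it.
    chains = defaultdict(set)
    for b, stmts in blocks.items():
        reach = set().union(*(out[p] for p in preds[b]))
        for i, (target, uses) in enumerate(stmts):
            for var, db, di in reach:
                if var in uses:
                    chains[(db, di)].add((b, i))
            reach = transfer(reach, b, i, target)
    return dict(chains)

# Hypothetical CFG: B1 -> B2 -> {B3, B4}, B3 -> B4.
blocks = {
    "B1": [("b", []), ("a", [])],      # b <- ..., a <- 0
    "B2": [(None, ["b"])],             # branch on b
    "B3": [("a", ["b"])],              # a <- b
    "B4": [("c", ["a", "b"])],         # c <- a + b
}
succs = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B4"], "B4": []}
chains = def_use_chains(blocks, succs)
```

Here both definitions of `a` end up with a chain to the single use in `B4`, which is exactly the many-defs-to-one-use situation that SSA later factors through a φ-function.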

SSA Form: An Example
SSA form
- Each name is defined exactly once
- Each use refers to exactly one name
What's hard
- Straight-line code is trivial
- Splits in the CFG are trivial
- Joins in the CFG are hard
Building SSA form
- Insert φ-functions at birth points
- Rename all values for uniqueness
[Figure: CFG with the definitions x ← ..., x ← a + b, x ← y - z, x ← 13 and the uses z ← x * q, s ← w - x. Which definition of x does each use see?]

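Birth Points
[Figure: the same CFG (definitions x ← ..., x ← a + b, x ← y - z, x ← 13; uses z ← x * q, s ← w - x), annotated at its three joins]
- First join: a value is born here, merging x ← ... and x ← y - z
- Second join: a value is born here, additionally merging x ← 13; z ← x * q computes the meet of three values
- Third join: a value is born here, additionally merging x ← a + b; s ← w - x computes the meet of four values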

Birth Points
- We should be able to compute the values that we need with fewer meet operations, if only we can find these birth points
- Need to identify the birth points
- Need to insert some artifact to force the evaluation to follow the birth points
- Enter Static Single Assignment form

Making Birth Points Explicit
[Figure: the same CFG with the values of x numbered: x0 ← ..., x5 ← a + b, x1 ← y - z, x3 ← 13, z ← x4 * q, s ← w - x6; the three joins are marked]
- These are all birth points for values
- All birth points are join points
- Not all join points are birth points
- Birth points are value-specific

Making Birth Points Explicit
[Figure: the same CFG in SSA form]
  Definitions: x0 ← ..., x5 ← a + b, x1 ← y - z, x3 ← 13
  φ-functions at the birth points: x2 ← φ(x1, x0), x4 ← φ(x3, x2), x6 ← φ(x5, x4)
  Uses: z ← x4 * q, s ← w - x6
- Each birth point needs a definition to reconcile the values of x
- Insert a φ-function at each birth point
- Rename values so each name is defined once
- Now each use refers to one definition ⇒ Static Single Assignment form

Review
SSA form
- Each name is defined exactly once
- Each use refers to exactly one name
What's hard
- Straight-line code is trivial
- Splits in the CFG are trivial
- Joins in the CFG are hard
Building SSA form
- Insert φ-functions at birth points
- Rename all values for uniqueness
A φ-function is a special kind of copy that selects one of its parameters. The choice of parameter is governed by the CFG edge along which control reached the current block. Real machines do not implement a φ-function directly in hardware (not yet!).
Example: y1 ← ... and y2 ← ... on the two incoming paths, then y3 ← φ(y1, y2) at the join.

Use-Def Dependencies in Non-Straight-Line Code
[Figure: three definitions "a = ..." reach three uses of a across a join point]
- Many uses to many defs
- Overhead in representation
- Hard to manage
- Complicated: M * N

Factoring Operator φ
[Figure: the three definitions "a = ..." now feed a single a = φ(a, a, a), which the three uses of a refer to]
- Factoring: when multiple edges cross a join point, create a common node φ that all edges must pass through
- Number of edges reduced from 9 to 6
- A φ is regarded as a def (its parameters are uses)
- Many uses to 1 def
- Each def dominates all its uses

Rename to Represent U-D Edges
[Figure: definitions a1 = ..., a2 = ..., a3 = ...; at the join, a4 = φ(a1, a2, a3); every use below the join refers to a4]
- It is no longer necessary to represent the use-def edges explicitly

SSA Form in Control-Flow Path Merges
  B1: b ← M[x]
      a ← 0
  B2: if b < 4
  B3: a ← b
  B4: c ← a + b
- Is this code in SSA form? No: there are two definitions of a in the code (in B1 and B3), and both can reach B4.
- How can we transform this code into SSA form? We can create two versions of a, one for B1 and another for B3.

SSA Form in Control-Flow Path Merges
  B1: b ← M[x]
      a1 ← 0
  B2: if b < 4
  B3: a2 ← b
  B4: c ← a? + b
- But which version should we use in B4 now?
- We define a fictional function that "knows" which control path was taken to reach the basic block B4:
    φ(a2, a1) = a1   if we arrive at B4 from B2
    φ(a2, a1) = a2   if we arrive at B4 from B3

SSA Form in Control-Flow Path Merges
  B1: b ← M[x]
      a1 ← 0
  B2: if b < 4
  B3: a2 ← b
  B4: a3 ← φ(a2, a1)
      c ← a3 + b
- The φ-function answers the question of which version to use in B4: a3 is a1 if we arrive at B4 from B2, and a2 if we arrive at B4 from B3.
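To make the "fictional function" concrete, here is a small Python sketch that simulates the merge example and selects the φ operand by the CFG edge taken. The helper names `phi` and `run_example` are made up for illustration, and the assumption that the true edge of `b < 4` leads to B3 is not stated on the slide.

```python
def phi(came_from, operand_for_pred, env):
    """Return the value of the operand associated with the incoming CFG edge."""
    return env[operand_for_pred[came_from]]

def run_example(mem_x):
    env = {"b": mem_x, "a1": 0}        # B1: b <- M[x]; a1 <- 0
    came_from = "B2"                   # B2 only tests b < 4
    if env["b"] < 4:                   # assumed: the true edge leads to B3
        env["a2"] = env["b"]           # B3: a2 <- b
        came_from = "B3"
    # B4: a3 <- phi(a2, a1); c <- a3 + b
    env["a3"] = phi(came_from, {"B3": "a2", "B2": "a1"}, env)
    env["c"] = env["a3"] + env["b"]
    return env["c"]
```

With `mem_x = 2` control passes through B3, so the φ selects `a2`; with `mem_x = 10` it selects `a1`. A real compiler never executes a φ this way; the simulation only spells out its meaning.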

A Loop Example
Before SSA:
  B1: X ← 1
  B2: X ← X + 1
After SSA:
  B1: X0 ← 1
  B2: X1 ← φ(X2, X0)
      X2 ← X1 + 1

A Loop Example
Before SSA:
  a ← 0
  loop: b ← a + 1
        c ← c + b
        a ← b * 2
        if a < N goto loop
  return
After SSA:
  a1 ← 0
  loop: a3 ← φ(a1, a2)
        b1 ← φ(b0, b2)
        c2 ← φ(c0, c1)
        b2 ← a3 + 1
        c1 ← c2 + b2
        a2 ← b2 * 2
        if a2 < N goto loop
  return
- φ(b0, b2) is not necessary because b1 is never used. But the phase that generates φ functions does not know this. Unnecessary φ functions are eliminated by dead code elimination.
- Note: only a and c are used in the loop body before they are redefined. b is redefined right at the beginning.

The φ Function
How can we implement a φ function that "knows" which control path was taken?
- Answer 1: We don't! The φ function is used only to connect uses to definitions during optimization, and is never implemented.
- Answer 2: If we must execute the φ function, we can implement it by inserting MOVE instructions in all control paths.

A Naive Method
- Simply introduce a φ-function at each join point in the CFG
- But, as already pointed out, this is inefficient: too many useless φ-functions may be introduced
- What is a good algorithm to introduce only the right number of φ-functions?

Criteria for Inserting φ Functions
- We could insert one φ function for each variable at every join point (a point in the CFG with more than one predecessor). But that would be wasteful.
- What should be our criterion for inserting a φ function for a variable a at node z of the CFG?
- Intuitively, we should add a φ function if there are two definitions of a that can reach the point z through distinct paths.

Path Convergence Criterion
Insert a φ function for a variable a at a node z if all the following conditions are true:
1. There is a block x that defines a
2. There is a block y ≠ x that defines a
3. There is a non-empty path Pxz from x to z
4. There is a non-empty path Pyz from y to z
5. Paths Pxz and Pyz don't have any nodes in common other than z
6. The node z does not appear within both Pxz and Pyz prior to the end, but it might appear in one or the other.
The start node contains an implicit definition of every variable.

Iterated Path-Convergence Criterion
The φ function itself is a definition of a. Therefore the path-convergence criterion is a set of equations that must be satisfied.
  while there are nodes x, y, z satisfying conditions 1-6
        and z does not contain a φ function for a
  do insert a ← φ(a, a, …, a) at node z
This algorithm is extremely costly, because it requires the examination of every triple of nodes x, y, z and every path leading from x and y to z. Can we do better?

Concept of Dominance Frontiers: An Intuitive View
[Figure: the blocks dominated by bb1 (bb1 ... bbn) and the border between dominated and non-dominated blocks, which is the dominance frontier]

Dominance Frontier
- The dominance frontier DF(x) of a node x is the set of all nodes z such that x dominates a predecessor of z, without strictly dominating z.
- Recall: if x dominates y and x ≠ y, then x strictly dominates y
- Intuitively: the fringe just beyond the region x dominates
  DF(X) = { Y | (∃ P ∈ PRED(Y): X Dom P) and not (X SDom Y) }

Dominance Relation
- If X appears on every path from ENTRY to Y, then X dominates Y.
- The dominance relation is both reflexive and transitive.
- idom(Y): the immediate dominator of Y
- Dominator tree
  - ENTRY is the root
  - Any node Y other than ENTRY has idom(Y) as its parent
  - Notation: parent, child, ancestor, descendant

Dominator Tree Example
[Figure: a CFG over the nodes ENTRY, a, b, c, d, EXIT, next to its dominator tree]

Dominator Tree Example

Calculate the Dominance Frontier
How do we determine the dominance frontier of node 5? An intuitive way:
1. Determine the dominance region of node 5: {5, 6, 7, 8}
2. Determine the targets of edges crossing from the dominance region of node 5
These targets are the dominance frontier of node 5: DF(5) = {4, 5, 12, 13}
NOTE: node 5 is in DF(5) in this case. Why?
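The intuitive recipe can be written down almost literally. The sketch below is only an illustration: the figure's 13-node graph is not reproduced in this transcript, so it uses a hypothetical 8-node CFG with its dominator sets given as literals, and `df_by_region` is a made-up name.

```python
def df_by_region(succs, dom, x):
    """Dominance frontier of x from its dominance region.
    succs: {node: [successors]}, dom: {node: set of dominators of node}."""
    # Step 1: the dominance region is every node that x dominates (x included).
    region = {n for n in succs if x in dom[n]}
    # Step 2: collect the targets of edges leaving nodes of the region.
    targets = {s for n in region for s in succs[n]}
    # A target is on the frontier unless x strictly dominates it.
    return {t for t in targets if t == x or x not in dom[t]}

# Hypothetical CFG: 0->1, 1->{2,3}, 2->7, 3->{4,5}, 4->6, 5->6, 6->7, 7->1.
succs = {0: [1], 1: [2, 3], 2: [7], 3: [4, 5], 4: [6], 5: [6], 6: [7], 7: [1]}
dom = {0: {0}, 1: {0, 1}, 2: {0, 1, 2}, 3: {0, 1, 3}, 4: {0, 1, 3, 4},
       5: {0, 1, 3, 5}, 6: {0, 1, 3, 6}, 7: {0, 1, 7}}
```

In this graph node 1 is in its own frontier because the back edge 7→1 starts inside the region dominated by 1 and ends at 1, which a node never strictly dominates. A back edge into the region's header is the usual reason a node appears in its own dominance frontier.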

Algorithm: Converting to SSA
Big picture: translation to SSA form is done in 3 steps
1. The dominance frontier mapping is constructed from the control flow graph.
2. Using the dominance frontiers, the locations of the φ-functions for each variable in the original program are determined.
3. The variables are renamed by replacing each mention of an original variable V by an appropriate mention of a new variable Vi.

SSA Construction Algorithm
1. Insert φ-functions (moderately complex)
  a.) calculate dominance frontiers
  b.) find global names; for each name, build a list of blocks that define it
      (compute the list of blocks where each name is assigned and use it as a worklist)
  c.) insert φ-functions:
        ∀ global name n
          ∀ block b in which n is assigned
            ∀ block d in b's dominance frontier
              insert a φ-function for n in d
              add d to n's list of defining blocks   (this adds to the worklist!)
      This creates the iterated dominance frontier.
Use a checklist to avoid putting blocks on the worklist twice; keep another checklist to avoid inserting the same φ-function twice.
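Step 1c can be sketched in a few lines of Python. This is a minimal illustration under assumptions: dominance frontiers are passed in as a dictionary, the function name `insert_phis` is invented here, and the example data (an 8-block CFG with the variables a, b, c, d, i) is a hypothetical encoding chosen to resemble the lecture's running example.

```python
def insert_phis(def_blocks, df):
    """def_blocks: {name: set of blocks that assign it}
    df:         {block: dominance frontier of that block}
    Returns {block: set of names that need a phi-function there}."""
    phis = {b: set() for b in df}
    for name, blocks in def_blocks.items():
        worklist = list(blocks)
        ever_on_worklist = set(blocks)       # checklist 1: no block queued twice
        while worklist:
            b = worklist.pop()
            for d in df[b]:
                if name not in phis[d]:      # checklist 2: no duplicate phi
                    phis[d].add(name)
                    # The phi is itself a definition of `name` in block d.
                    if d not in ever_on_worklist:
                        ever_on_worklist.add(d)
                        worklist.append(d)
    return phis

# Assumed dominance frontiers and defining blocks for a hypothetical CFG.
df = {0: set(), 1: {1}, 2: {7}, 3: {7}, 4: {6}, 5: {6}, 6: {7}, 7: {1}}
def_blocks = {"a": {1, 3}, "b": {2, 6}, "c": {1, 2, 5}, "d": {2, 3, 4}, "i": {0, 7}}
phis = insert_phis(def_blocks, df)
```

Treating each inserted φ as a new definition is what turns the plain dominance frontier into the iterated dominance frontier: `d` only gets a φ in block 1 because the φ placed in block 7 is pushed back onto the worklist.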

SSA Construction Algorithm
2. Rename variables in a pre-order walk over the dominator tree (use an array of stacks, one stack per global name)
Starting with the root block, b:
  a.) generate unique names for each φ-function and push them on the appropriate stacks
  b.) rewrite each operation in the block
      i. rewrite uses of global names with the current version (from the stack)
      ii. rewrite the definition by inventing and pushing a new name
  c.) fill in φ-function parameters of successor blocks
  d.) recurse on b's children in the dominator tree
  e.) pop names generated in b from the stacks
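The renaming walk, steps a through e, might look like the following in Python. It is a sketch rather than the lecture's code: the data layout (statements as `(target, uses)` pairs, φ-functions kept in side tables) and the names `rename`, `phi_dest`, and `phi_args` are assumptions for this example.

```python
from collections import defaultdict

def rename(blocks, phi_vars, succs, children, entry):
    """blocks: {b: [(target or None, [uses])]}, phi_vars: {b: set of names with a phi},
    succs: CFG successors, children: dominator-tree children."""
    counter = defaultdict(int)            # next subscript per name
    stack = defaultdict(list)             # current version per name
    phi_dest = {b: {} for b in blocks}    # block -> name -> new phi target
    phi_args = {b: {v: {} for v in phi_vars[b]} for b in blocks}
    renamed = {b: [] for b in blocks}

    def new_name(v):
        name = f"{v}{counter[v]}"
        counter[v] += 1
        stack[v].append(name)
        return name

    def walk(b):
        pushed = []
        for v in sorted(phi_vars[b]):                     # (a) phi targets
            phi_dest[b][v] = new_name(v)
            pushed.append(v)
        for target, uses in blocks[b]:                    # (b) ordinary operations
            new_uses = [stack[u][-1] if stack[u] else u for u in uses]
            new_target = None
            if target is not None:
                new_target = new_name(target)
                pushed.append(target)
            renamed[b].append((new_target, new_uses))
        for s in succs[b]:                                # (c) successor phi parameters
            for v in phi_vars[s]:
                if stack[v]:
                    phi_args[s][v][b] = stack[v][-1]
        for c in children[b]:                             # (d) dominator-tree children
            walk(c)
        for v in pushed:                                  # (e) restore the stacks
            stack[v].pop()

    walk(entry)
    return renamed, phi_dest, phi_args

# Tiny loop: B1: X <- 1;  B2: X <- X + 1, with B2 looping back to itself.
blocks = {"B1": [("X", [])], "B2": [("X", ["X"])]}
renamed, phi_dest, phi_args = rename(
    blocks, {"B1": set(), "B2": {"X"}},
    {"B1": ["B2"], "B2": ["B2"]}, {"B1": ["B2"], "B2": []}, "B1")
```

On this loop the walk yields X0 in B1, and in B2 a φ target X1 whose operands are X0 (from B1) and X2 (from the back edge), followed by X2 ← X1 + 1.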

SSA Construction Algorithm: Computing Dominance
- The first step in φ-function insertion computes dominance
- Recall that n dominates m iff n is on every path from n0 to m
  - Every node dominates itself
  - n's immediate dominator, IDOM(n), is its closest strict dominator
- Data-flow equations:
    DOM(n0) = { n0 }
    DOM(n) = { n } ∪ ( ∩ p ∈ preds(n) DOM(p) )
  Initially, DOM(n) = N, ∀ n ≠ n0
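The equations can be solved by straightforward iteration to a fixed point. The Python below is a minimal sketch that assumes every node other than the entry is reachable and has at least one predecessor; the function names and the 8-node example CFG are illustrative choices, not taken from the slides.

```python
def dominators(succs, entry):
    """Iteratively solve DOM(n) = {n} | intersection of DOM(p) over preds p."""
    nodes = list(succs)
    preds = {n: [] for n in nodes}
    for n, ss in succs.items():
        for s in ss:
            preds[s].append(n)
    dom = {n: set(nodes) for n in nodes}     # initially DOM(n) = N
    dom[entry] = {entry}                     # DOM(n0) = {n0}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

def immediate_dominators(dom, entry):
    """IDOM(n) is the strict dominator of n with the largest dominator set."""
    return {n: None if n == entry
            else max(dom[n] - {n}, key=lambda d: len(dom[d]))
            for n in dom}

# Hypothetical CFG: 0->1, 1->{2,3}, 2->7, 3->{4,5}, 4->6, 5->6, 6->7, 7->1.
succs = {0: [1], 1: [2, 3], 2: [7], 3: [4, 5], 4: [6], 5: [6], 6: [7], 7: [1]}
dom = dominators(succs, 0)
idom = immediate_dominators(dom, 0)
```

The strict dominators of a node form a chain, so the one with the most dominators of its own is the closest; that is why a simple `max` over set sizes recovers IDOM here.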

Why Dominance Frontiers
- Dominance frontier criterion: if node x contains a def of a, then any node z in DF(x) needs a φ function for a
- Intuition: at least two non-intersecting paths converge to z, and one path must contain a node strictly dominated by x

Dominance Frontiers (visually)

Dominance Frontiers
[Figure: flow graph over blocks B0 to B7, next to its dominance tree]

  n        B0  B1   B2     B3     B4       B5       B6       B7
  DOM(n)   0   0,1  0,1,2  0,1,3  0,1,3,4  0,1,3,5  0,1,3,6  0,1,7
  IDOM(n)
  DF(n)

Computing DF:
  For each join point x
    For each CFG predecessor of x
      Run up to IDOM(x) in the dominator tree, adding x to DF(n) for each n between the predecessor and IDOM(x)
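The join-point walk described above translates directly into code. The following Python sketch takes predecessor lists and immediate dominators as inputs; the edge list is an assumption, chosen so that the resulting dominator sets agree with the DOM row of the table.

```python
def dominance_frontiers(preds, idom):
    """preds: {node: [CFG predecessors]}, idom: {node: immediate dominator or None}."""
    df = {n: set() for n in preds}
    for x, ps in preds.items():
        if len(ps) < 2:                  # only join points contribute
            continue
        for p in ps:
            runner = p
            # Walk up the dominator tree until reaching IDOM(x).
            while runner != idom[x]:
                df[runner].add(x)
                runner = idom[runner]
    return df

# Assumed edges: 0->1, 1->{2,3}, 2->7, 3->{4,5}, 4->6, 5->6, 6->7, 7->1.
preds = {0: [], 1: [0, 7], 2: [1], 3: [1], 4: [3], 5: [3], 6: [4, 5], 7: [2, 6]}
idom = {0: None, 1: 0, 2: 1, 3: 1, 4: 3, 5: 3, 6: 3, 7: 1}
df = dominance_frontiers(preds, idom)
```

Each runner visits only nodes that dominate a predecessor of x but do not strictly dominate x, which is the definition of x being in their dominance frontier.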

Dominance Frontiers & φ-Function Insertion
Dominance frontiers
- A definition at n forces a φ-function at m iff n ∉ DOM(m) but n ∈ DOM(p) for some p ∈ preds(m)
- DF(n) is the fringe just beyond the region n dominates
[Figure: flow graph over blocks B0 to B7 with an assignment x ← ... in B4]
- DF(4) is {6}, so ← in 4 forces a φ-function x ← φ(...) in 6
- ← in 6 forces a φ-function x ← φ(...) in DF(6) = {7}
- ← in 7 forces a φ-function x ← φ(...) in DF(7) = {1}
- ← in 1 forces a φ-function in DF(1) = {1} (halt)
For each assignment, we insert the φ-functions.

φ-Function and Dominance Frontier
Intuition behind the dominance frontier: Y ∈ DF(X) means
- Y has multiple predecessors
- X dominates one of them, say U; U inherits everything defined in X
- The definitions reaching Y come from U and from the other predecessors
- So Y is exactly the place where a φ-function is needed

With all the φ-functions
  B0: i ← ...
  B1: a ← φ(a,a); b ← φ(b,b); c ← φ(c,c); d ← φ(d,d); i ← φ(i,i); a ← ...; c ← ...
  B2: b ← ...; c ← ...; d ← ...
  B3: a ← ...; d ← ...
  B4: d ← ...
  B5: c ← ...
  B6: d ← φ(d,d); c ← φ(c,c); b ← ...
  B7: a ← φ(a,a); b ← φ(b,b); c ← φ(c,c); d ← φ(d,d); y ← a+b; z ← c+d; i ← i+1; branch on i > 100
- Lots of new ops
- Renaming is next
- Assume a, b, c, & d are defined before B0
- Excluding local names avoids φ's for y & z

SSA Construction Algorithm (reminder)
2. Rename variables in a pre-order walk over the dominator tree (use an array of stacks, one stack per global name)
Starting with the root block, b:
  a.) generate unique names for each φ-function and push them on the appropriate stacks
  b.) rewrite each operation in the block
      i. rewrite uses of global names with the current version (from the stack)
      ii. rewrite the definition by inventing and pushing a new name
  c.) fill in φ-function parameters of successor blocks
  d.) recurse on b's children in the dominator tree
  e.) pop names generated in b from the stacks

Before processing B0
  B0: i ← ...
  B1: a ← φ(a,a); b ← φ(b,b); c ← φ(c,c); d ← φ(d,d); i ← φ(i,i); a ← ...; c ← ...
  B2: b ← ...; c ← ...; d ← ...
  B3: a ← ...; d ← ...
  B4: d ← ...
  B5: c ← ...
  B6: d ← φ(d,d); c ← φ(c,c); b ← ...
  B7: a ← φ(a,a); b ← φ(b,b); c ← φ(c,c); d ← φ(d,d); y ← a+b; z ← c+d; i ← i+1; branch on i > 100
Stacks (top last): a: a0 | b: b0 | c: c0 | d: d0 | i: (empty)
- Assume a, b, c, & d are defined before B0
- i has not been defined

End of B0
  B0: i0 ← ...
  B1: a ← φ(a0,a); b ← φ(b0,b); c ← φ(c0,c); d ← φ(d0,d); i ← φ(i0,i); a ← ...; c ← ...
  B2: b ← ...; c ← ...; d ← ...
  B3: a ← ...; d ← ...
  B4: d ← ...
  B5: c ← ...
  B6: d ← φ(d,d); c ← φ(c,c); b ← ...
  B7: a ← φ(a,a); b ← φ(b,b); c ← φ(c,c); d ← φ(d,d); y ← a+b; z ← c+d; i ← i+1; branch on i > 100
Stacks (top last): a: a0 | b: b0 | c: c0 | d: d0 | i: i0

End of B1
  B0: i0 ← ...
  B1: a1 ← φ(a0,a); b1 ← φ(b0,b); c1 ← φ(c0,c); d1 ← φ(d0,d); i1 ← φ(i0,i); a2 ← ...; c2 ← ...
  B2: b ← ...; c ← ...; d ← ...
  B3: a ← ...; d ← ...
  B4: d ← ...
  B5: c ← ...
  B6: d ← φ(d,d); c ← φ(c,c); b ← ...
  B7: a ← φ(a,a); b ← φ(b,b); c ← φ(c,c); d ← φ(d,d); y ← a+b; z ← c+d; i ← i+1; branch on i > 100
Stacks (top last): a: a0 a1 a2 | b: b0 b1 | c: c0 c1 c2 | d: d0 d1 | i: i0 i1

End of B2
  B0: i0 ← ...
  B1: a1 ← φ(a0,a); b1 ← φ(b0,b); c1 ← φ(c0,c); d1 ← φ(d0,d); i1 ← φ(i0,i); a2 ← ...; c2 ← ...
  B2: b2 ← ...; c3 ← ...; d2 ← ...
  B3: a ← ...; d ← ...
  B4: d ← ...
  B5: c ← ...
  B6: d ← φ(d,d); c ← φ(c,c); b ← ...
  B7: a ← φ(a2,a); b ← φ(b2,b); c ← φ(c3,c); d ← φ(d2,d); y ← a+b; z ← c+d; i ← i+1; branch on i > 100
Stacks (top last): a: a0 a1 a2 | b: b0 b1 b2 | c: c0 c1 c2 c3 | d: d0 d1 d2 | i: i0 i1

Before starting B3
  B0: i0 ← ...
  B1: a1 ← φ(a0,a); b1 ← φ(b0,b); c1 ← φ(c0,c); d1 ← φ(d0,d); i1 ← φ(i0,i); a2 ← ...; c2 ← ...
  B2: b2 ← ...; c3 ← ...; d2 ← ...
  B3: a ← ...; d ← ...
  B4: d ← ...
  B5: c ← ...
  B6: d ← φ(d,d); c ← φ(c,c); b ← ...
  B7: a ← φ(a2,a); b ← φ(b2,b); c ← φ(c3,c); d ← φ(d2,d); y ← a+b; z ← c+d; i ← i+1; branch on i > 100 / i ≤ 100
Stacks (top last): a: a0 a1 a2 | b: b0 b1 | c: c0 c1 c2 | d: d0 d1 | i: i0 i1

End of B3
  B0: i0 ← ...
  B1: a1 ← φ(a0,a); b1 ← φ(b0,b); c1 ← φ(c0,c); d1 ← φ(d0,d); i1 ← φ(i0,i); a2 ← ...; c2 ← ...
  B2: b2 ← ...; c3 ← ...; d2 ← ...
  B3: a3 ← ...; d3 ← ...
  B4: d ← ...
  B5: c ← ...
  B6: d ← φ(d,d); c ← φ(c,c); b ← ...
  B7: a ← φ(a2,a); b ← φ(b2,b); c ← φ(c3,c); d ← φ(d2,d); y ← a+b; z ← c+d; i ← i+1; branch on i > 100
Stacks (top last): a: a0 a1 a2 a3 | b: b0 b1 | c: c0 c1 c2 | d: d0 d1 d3 | i: i0 i1

End of B4
  B0: i0 ← ...
  B1: a1 ← φ(a0,a); b1 ← φ(b0,b); c1 ← φ(c0,c); d1 ← φ(d0,d); i1 ← φ(i0,i); a2 ← ...; c2 ← ...
  B2: b2 ← ...; c3 ← ...; d2 ← ...
  B3: a3 ← ...; d3 ← ...
  B4: d4 ← ...
  B5: c ← ...
  B6: d ← φ(d4,d); c ← φ(c2,c); b ← ...
  B7: a ← φ(a2,a); b ← φ(b2,b); c ← φ(c3,c); d ← φ(d2,d); y ← a+b; z ← c+d; i ← i+1; branch on i > 100
Stacks (top last): a: a0 a1 a2 a3 | b: b0 b1 | c: c0 c1 c2 | d: d0 d1 d3 d4 | i: i0 i1

End of B5
  B0: i0 ← ...
  B1: a1 ← φ(a0,a); b1 ← φ(b0,b); c1 ← φ(c0,c); d1 ← φ(d0,d); i1 ← φ(i0,i); a2 ← ...; c2 ← ...
  B2: b2 ← ...; c3 ← ...; d2 ← ...
  B3: a3 ← ...; d3 ← ...
  B4: d4 ← ...
  B5: c4 ← ...
  B6: d ← φ(d4,d3); c ← φ(c2,c4); b ← ...
  B7: a ← φ(a2,a); b ← φ(b2,b); c ← φ(c3,c); d ← φ(d2,d); y ← a+b; z ← c+d; i ← i+1; branch on i > 100
Stacks (top last): a: a0 a1 a2 a3 | b: b0 b1 | c: c0 c1 c2 c4 | d: d0 d1 d3 | i: i0 i1

End of B6
  B0: i0 ← ...
  B1: a1 ← φ(a0,a); b1 ← φ(b0,b); c1 ← φ(c0,c); d1 ← φ(d0,d); i1 ← φ(i0,i); a2 ← ...; c2 ← ...
  B2: b2 ← ...; c3 ← ...; d2 ← ...
  B3: a3 ← ...; d3 ← ...
  B4: d4 ← ...
  B5: c4 ← ...
  B6: d5 ← φ(d4,d3); c5 ← φ(c2,c4); b3 ← ...
  B7: a ← φ(a2,a3); b ← φ(b2,b3); c ← φ(c3,c5); d ← φ(d2,d5); y ← a+b; z ← c+d; i ← i+1; branch on i > 100
Stacks (top last): a: a0 a1 a2 a3 | b: b0 b1 b3 | c: c0 c1 c2 c5 | d: d0 d1 d3 d5 | i: i0 i1

Before B7
  B0: i0 ← ...
  B1: a1 ← φ(a0,a); b1 ← φ(b0,b); c1 ← φ(c0,c); d1 ← φ(d0,d); i1 ← φ(i0,i); a2 ← ...; c2 ← ...
  B2: b2 ← ...; c3 ← ...; d2 ← ...
  B3: a3 ← ...; d3 ← ...
  B4: d4 ← ...
  B5: c4 ← ...
  B6: d5 ← φ(d4,d3); c5 ← φ(c2,c4); b3 ← ...
  B7: a ← φ(a2,a3); b ← φ(b2,b3); c ← φ(c3,c5); d ← φ(d2,d5); y ← a+b; z ← c+d; i ← i+1; branch on i > 100
Stacks (top last): a: a0 a1 a2 | b: b0 b1 | c: c0 c1 c2 | d: d0 d1 | i: i0 i1

End of B7
  B0: i0 ← ...
  B1: a1 ← φ(a0,a4); b1 ← φ(b0,b4); c1 ← φ(c0,c6); d1 ← φ(d0,d6); i1 ← φ(i0,i2); a2 ← ...; c2 ← ...
  B2: b2 ← ...; c3 ← ...; d2 ← ...
  B3: a3 ← ...; d3 ← ...
  B4: d4 ← ...
  B5: c4 ← ...
  B6: d5 ← φ(d4,d3); c5 ← φ(c2,c4); b3 ← ...
  B7: a4 ← φ(a2,a3); b4 ← φ(b2,b3); c6 ← φ(c3,c5); d6 ← φ(d2,d5); y ← a4+b4; z ← c6+d6; i2 ← i1+1; branch on i > 100
Stacks (top last): a: a0 a1 a2 a4 | b: b0 b1 b4 | c: c0 c1 c2 c6 | d: d0 d1 d6 | i: i0 i1 i2

After renaming
  B0: i0 ← ...
  B1: a1 ← φ(a0,a4); b1 ← φ(b0,b4); c1 ← φ(c0,c6); d1 ← φ(d0,d6); i1 ← φ(i0,i2); a2 ← ...; c2 ← ...
  B2: b2 ← ...; c3 ← ...; d2 ← ...
  B3: a3 ← ...; d3 ← ...
  B4: d4 ← ...
  B5: c4 ← ...
  B6: d5 ← φ(d4,d3); c5 ← φ(c2,c4); b3 ← ...
  B7: a4 ← φ(a2,a3); b4 ← φ(b2,b3); c6 ← φ(c3,c5); d6 ← φ(d2,d5); y ← a4+b4; z ← c6+d6; i2 ← i1+1; branch on i > 100
- This is semi-pruned SSA form. We're done.
- Semi-pruned ⇒ only names live in 2 or more blocks are "global names".

SSA Construction
What's this "pruned SSA" stuff?
- Minimal SSA still contains extraneous φ-functions
  - It inserts some φ-functions where they are dead
  - We would like to avoid inserting them
- Two ideas
  - Semi-pruned SSA: discard names used in only one block
    - Significant reduction in the total number of φ-functions
    - Needs only local liveness information (cheap to compute)
  - Pruned SSA: only insert φ-functions where the value is live
    - Inserts even fewer φ-functions, but costs more to do
    - Requires global live-variable analysis (more expensive)
In practice, both are simple modifications to step 1.
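For the semi-pruned variant, finding the "global names" needs only one pass over each block. The Python sketch below is an illustration under assumptions: statements are `(target, uses)` pairs and `global_names` is a name invented here. A name counts as global when some block reads it before (re)defining it, meaning its value is live on entry to that block.

```python
def global_names(blocks):
    """blocks: {b: [(target or None, [uses])]}.
    Returns (set of global names, {name: set of blocks that define it})."""
    globals_, def_blocks = set(), {}
    for b, stmts in blocks.items():
        killed = set()                    # names already defined in this block
        for target, uses in stmts:
            # A use not preceded by a local definition is upward-exposed.
            globals_.update(u for u in uses if u not in killed)
            if target is not None:
                killed.add(target)
                def_blocks.setdefault(target, set()).add(b)
    return globals_, def_blocks

# One block resembling B7 of the running example: y <- a+b; z <- c+d; i <- i+1
globals_, def_blocks = global_names(
    {"B7": [("y", ["a", "b"]), ("z", ["c", "d"]), ("i", ["i"])]})
```

Here `y` and `z` are defined but never read across a block boundary, so they are not global names and receive no φ-functions, matching the note that excluding local names avoids φ's for y and z.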

Converting Out of SSA
- Eventually, a program must be executed.
- The φ-functions have precise semantics, but they are generally not implemented by existing target machines.
[Figure: x17 ← φ(x10, x11) followed by a use ... ← x17; the φ is replaced by the copies x17 ← x10 and x17 ← x11, one on each incoming path]

Converting Out of SSA
- Naively, a k-input φ-function at the entrance to node X can be replaced by k ordinary assignments, one at the end of each control predecessor of X.
- This can generate inefficient object code.
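The naive translation is short enough to sketch. In the Python below, blocks are lists of statement strings and each φ is a `(dest, {pred: source})` pair; this encoding, the predecessor names `P1` and `P2`, and the function name `remove_phis` are assumptions for illustration. The sketch simply appends the copy to the predecessor, so it assumes the predecessor's statement list excludes its final branch, and it ignores the ordering hazards that arise when several φ-functions in one block read each other's targets.

```python
def remove_phis(blocks, phis):
    """blocks: {b: [statement strings]}
    phis:   {b: [(dest, {pred_block: source_name}), ...]}
    Returns a copy of blocks with one copy per phi operand appended to each predecessor."""
    out = {b: list(stmts) for b, stmts in blocks.items()}
    for b, phi_list in phis.items():
        for dest, operands in phi_list:
            for pred, src in operands.items():
                out[pred].append(f"{dest} = {src}")
    return out

# x17 <- phi(x10, x11) at a join with hypothetical predecessors P1 and P2.
blocks = {"P1": ["x10 = ..."], "P2": ["x11 = ..."], "J": ["... = x17"]}
phis = {"J": [("x17", {"P1": "x10", "P2": "x11"})]}
lowered = remove_phis(blocks, phis)
```

Every copy is emitted even when source and destination could share a register, which is one source of the inefficiency mentioned above; later coalescing would normally clean these up.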

Dead Code Elimination
- Where does dead code come from? Assignments without any use
- Dead code elimination method
  - Initially, all statements are marked dead
  - Some statements need to be marked live because of certain conditions
  - Marking these statements can cause others to be marked live
  - After the worklist is empty, truly dead code can be removed

SSA Example: Dead Code Elimination
Intuition:
- Because there is only one definition for each variable, if the list of uses of the variable is empty, the definition is dead.
- When a statement v ← x ⊕ y is eliminated because v is dead, this statement must be removed from the list of uses of x and y. This might cause those definitions to become dead.
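This intuition leads to a small worklist algorithm. The Python sketch below assumes statements without side effects, represents the program as a map from each SSA name to its operand list, and takes a set `live_out` of names that must be kept (for instance values that are returned or tested by a branch). The function name and the example encoding of a loop with three φ-functions are illustrative assumptions.

```python
def eliminate_dead(stmts, live_out):
    """stmts: {ssa_name: [operand names]}, one definition per name.
    live_out: names that must be kept. Returns the surviving statements."""
    uses = {v: set() for v in stmts}          # name -> statements that use it
    for v, operands in stmts.items():
        for o in operands:
            if o in uses:
                uses[o].add(v)
    work = [v for v in stmts if not uses[v] and v not in live_out]
    removed = set()
    while work:
        v = work.pop()
        if v in removed:
            continue
        removed.add(v)
        # Deleting v removes one use from each of its operands.
        for o in stmts[v]:
            if o in uses:
                uses[o].discard(v)
                if not uses[o] and o not in live_out and o not in removed:
                    work.append(o)
    return {v: ops for v, ops in stmts.items() if v not in removed}

# Loop in SSA form; c1 and a2 are assumed to be needed after the loop / by the branch.
stmts = {"a1": [], "a3": ["a1", "a2"], "b1": ["b0", "b2"], "c2": ["c0", "c1"],
         "b2": ["a3"], "c1": ["c2", "b2"], "a2": ["b2"]}
kept = eliminate_dead(stmts, live_out={"c1", "a2"})
```

Only `b1`, the φ-function whose result is never used, is deleted; `b2` survives because `c1` and `a2` still use it.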

SSA Example: Simple Constant Propagation
Intuition:
- If there is a statement v ← c, where c is a constant, then all uses of v can be replaced by c.
- A φ function of the form v ← φ(c1, c2, …, cn) where all ci are identical can be replaced by v ← c.
- Using a worklist algorithm on a program in SSA form, we can perform constant propagation in linear time.

Problems with Use-Def Chains
- What happens if we statically know the direction of a branch?
  - We do not need to propagate information along the path that is not taken
  - This is easy to do with CFGs
- With U-D chains, it is hard to tell which definitions to ignore

Use-Def with SSA
- SSA form shortens u-d chains
- Chains terminate at merge points, rather than crossing them
- Can simply ignore information merged from untaken branches
- Much easier to account for irrelevant information

VN Example Review
(superscripts, written ^k, are value numbers)

  Original code     With VNs                Rewritten
  a ← x + y         a^3 ← x^1 + y^2         a0^3 ← x0^1 + y0^2
  b ← x + y         b^3 ← x^1 + y^2         b0^3 ← a0^3
  a ← 17            a^4 ← 17^4              a1^4 ← 17^4
  c ← x + y         c^3 ← x^1 + y^2         c0^3 ← a0^3

- Give each value a unique name
- No value is ever killed
- These are SSA names (static single assignment)
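The rewriting in the table can be reproduced with a small hash-based pass over one basic block. The Python below is a sketch under assumptions: statements are `(target, expr)` pairs where `expr` is either an `(op, lhs, rhs)` tuple or a single constant or name, and `value_number` is a name chosen for this example. It gives every definition a fresh subscripted name, so a value is never overwritten and a repeated expression can reuse the name that first held it.

```python
def value_number(block):
    version = {}      # variable -> current subscript
    table = {}        # (op, lhs name, rhs name) -> name holding that value
    out = []

    def use(v):
        # Constants pass through; a name read before any definition is version 0.
        return f"{v}{version.setdefault(v, 0)}" if isinstance(v, str) else str(v)

    def define(v):
        version[v] = version[v] + 1 if v in version else 0
        return f"{v}{version[v]}"

    for target, expr in block:
        if isinstance(expr, tuple):
            op, lhs, rhs = expr
            key = (op, use(lhs), use(rhs))     # operands are looked up first
            name = define(target)
            if key in table:
                out.append(f"{name} = {table[key]}")   # redundant: reuse the value
            else:
                table[key] = name
                out.append(f"{name} = {key[1]} {op} {key[2]}")
        else:
            out.append(f"{define(target)} = {use(expr)}")
    return out

rewritten = value_number([
    ("a", ("+", "x", "y")),
    ("b", ("+", "x", "y")),
    ("a", 17),
    ("c", ("+", "x", "y")),
])
```

Because `a0` is never killed by the later `a1 = 17`, the final computation of `x + y` can still be replaced by `a0`, which is the point of the example.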

Example
[Figure: two branches, one assigning X = 2, Y = 3 and the other X = 3, Y = 2, merge before Z = X + Y; the sets of facts at the merge are not recoverable from the transcript]

Constant Propagation w/ SSA
- For statements x_i := C, for some constant C, replace all uses of x_i with C
- For x_i := φ(C, C, ..., C), for some constant C, replace the statement with x_i := C
- Iterate

Example: SSA
Original: a := 3; d := 2; f := a + d; g := 5; a := g - d; f <= g; f := g + 1; g < a; d := 2 (branch edges labelled T/F)
SSA form: a1 := 3; d1 := 2; d3 = φ(d2,d1); a3 = φ(a2,a1); f1 := a3 + d3; g1 := 5; a2 := g1 - d3; f1 <= g1; f2 := g1 + 1; g1 < a2; f3 := φ(f2,f1); d2 := 2

Example: SSA
After propagating the constants a1, d1, d2, g1: a1 := 3; d1 := 2; d3 = φ(2,2); a3 = φ(a2,3); f1 := a3 + d3; g1 := 5; a2 := 5 - d3; f1 <= 5; f2 := 5 + 1; 5 < a2; f3 := φ(f2,f1); d2 := 2

Example: SSA
After another round: a1 := 3; d1 := 2; d3 = 2; a3 = φ(a2,3); f1 := a3 + 2; g1 := 5; a2 := 5 - 2; f1 <= 5; f2 := 6; 5 < a2; f3 := φ(6,f1); d2 := 2
This may continue for a few steps...

Loop Invariants
Source:
do i = 1, 100
  k = i * (n+2)
  do j = i, 100
    a[i,j] = 100 * n + 10*k + j
  end
end
CFG:
entry: i = 1
outer loop test: i <= 100
outer loop body: t1 = n + 2; k = i * t1; j = i
inner loop test: j <= 100
inner loop body: t2 = 100*n; t3 = 10 * k; t4 = t2 + t3; t5 = t4 + j; j = j + 1
inner loop exit: i = i + 1

Loop Invariants: SSA
entry: i_1 = 1
outer loop test: i_2 = φ(i_1, i_3); i_2 <= 100
outer loop body: t1_0 = n_1 + 2; k_1 = i_2 * t1_0; j_1 = i_2
inner loop test: j_2 = φ(j_1, j_3); j_2 <= 100
inner loop body: t2_0 = 100 * n_1; t3_0 = 10 * k_1; t4_0 = t2_0 + t3_0; t5_0 = t4_0 + j_2; j_3 = j_2 + 1
inner loop exit: i_3 = i_2 + 1
A statement is invariant when all of its operands are invariant.
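A minimal sketch of how invariant statements can be detected on the SSA form. The encoding is hypothetical: `loop_defs` holds only the definitions inside the loop being examined, and any operand that is not a key of `loop_defs` is taken to be defined outside the loop. All φ functions are conservatively treated as variant. Being invariant does not by itself make a statement safe to hoist; the movement conditions the slides refer to still have to be checked separately.

```python
def find_invariants(loop_defs):
    """Return the loop-invariant definitions, in discovery order.

    loop_defs: {var: (op, operands)} for definitions inside the loop;
    operands are integer constants or SSA names.
    """
    invariant = []
    changed = True
    while changed:
        changed = False
        for var, (op, operands) in loop_defs.items():
            if var in invariant or op == "phi":
                continue
            # Invariant if every operand is a constant, is defined outside
            # the loop, or is itself already known to be invariant.
            if all(isinstance(a, int) or a not in loop_defs or a in invariant
                   for a in operands):
                invariant.append(var)
                changed = True
    return invariant

# Inner loop of the example; n_1, k_1 and j_1 are defined outside it.
inner_loop = {
    "j_2": ("phi", ["j_1", "j_3"]),
    "t2_0": ("mul", [100, "n_1"]),
    "t3_0": ("mul", [10, "k_1"]),
    "t4_0": ("add", ["t2_0", "t3_0"]),
    "t5_0": ("add", ["t4_0", "j_2"]),
    "j_3": ("add", ["j_2", 1]),
}
print(find_invariants(inner_loop))
```

For the inner loop this reports `t2_0`, `t3_0` and `t4_0`, while `t5_0` and `j_3` depend on the loop-carried `j_2`.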

Code Motion Example
Assuming t1 is not live outside the loop nest, t1_0 = n_1 + 2 is invariant and all three conditions are met, so it moves in front of the outer loop:
entry: i_1 = 1; t1_0 = n_1 + 2
outer loop test: i_2 = φ(i_1, i_3); i_2 <= 100
outer loop body: k_1 = i_2 * t1_0; j_1 = i_2
inner loop test: j_2 = φ(j_1, j_3); j_2 <= 100
inner loop body: t2_0 = 100 * n_1; t3_0 = 10 * k_1; t4_0 = t2_0 + t3_0; t5_0 = t4_0 + j_2; j_3 = j_2 + 1
inner loop exit: i_3 = i_2 + 1

Code Motion Example
t2_0, t3_0 and t4_0 are invariant in the inner loop and all conditions are met, assuming t2, t3, t4 are not live outside the loop nest, so they move in front of the inner loop:
entry: i_1 = 1; t1_0 = n_1 + 2
outer loop test: i_2 = φ(i_1, i_3); i_2 <= 100
outer loop body: k_1 = i_2 * t1_0; j_1 = i_2; t2_0 = 100 * n_1; t3_0 = 10 * k_1; t4_0 = t2_0 + t3_0
inner loop test: j_2 = φ(j_1, j_3); j_2 <= 100
inner loop body: t5_0 = t4_0 + j_2; j_3 = j_2 + 1
inner loop exit: i_3 = i_2 + 1

Code Motion Example
t2_0 = 100 * n_1 is also invariant in the outer loop and all conditions are met, so it moves in front of the outer loop:
entry: i_1 = 1; t1_0 = n_1 + 2; t2_0 = 100 * n_1
outer loop test: i_2 = φ(i_1, i_3); i_2 <= 100
outer loop body: k_1 = i_2 * t1_0; j_1 = i_2; t3_0 = 10 * k_1; t4_0 = t2_0 + t3_0
inner loop test: j_2 = φ(j_1, j_3); j_2 <= 100
inner loop body: t5_0 = t4_0 + j_2; j_3 = j_2 + 1
inner loop exit: i_3 = i_2 + 1

SSA Example
Source: i=1; j=1; k=0; while(k<100) { if(j<20) { j=i; k=k+1; } else { j=k; k=k+2; } } return j;
B1: i ← 1; j ← 1; k ← 0
B2: if k<100
B3: if j<20
B4: return j
B5: j ← i; k ← k+1
B6: j ← k; k ← k+2
B7: (join of B5 and B6)

SSA Example
Each definition of k gets its own name:
B1: i ← 1; j ← 1; k1 ← 0
B2: if k<100
B3: if j<20
B4: return j
B5: j ← i; k3 ← k+1
B6: j ← k; k5 ← k+2
B7: (join of B5 and B6)

SSA Example
A φ function for k is placed at the join B7:
B1: i ← 1; j ← 1; k1 ← 0
B2: if k<100
B3: if j<20
B4: return j
B5: j ← i; k3 ← k+1
B6: j ← k; k5 ← k+2
B7: k4 ← φ(k3,k5)

SSA Example
A φ function for k is placed at the loop header B2:
B1: i ← 1; j ← 1; k1 ← 0
B2: k2 ← φ(k4,k1); if k<100
B3: if j<20
B4: return j
B5: j ← i; k3 ← k+1
B6: j ← k; k5 ← k+2
B7: k4 ← φ(k3,k5)

SSA Example
Uses of k are renamed:
B1: i ← 1; j ← 1; k1 ← 0
B2: k2 ← φ(k4,k1); if k2<100
B3: if j<20
B4: return j
B5: j ← i; k3 ← k2+1
B6: j ← k; k5 ← k2+2
B7: k4 ← φ(k3,k5)

SSA Example
The same steps applied to i and j give the full SSA form:
B1: i1 ← 1; j1 ← 1; k1 ← 0
B2: j2 ← φ(j4,j1); k2 ← φ(k4,k1); if k2<100
B3: if j2<20
B4: return j2
B5: j3 ← i1; k3 ← k2+1
B6: j5 ← k2; k5 ← k2+2
B7: j4 ← φ(j3,j5); k4 ← φ(k3,k5)

SSA Example: Const. Propagation
The constants i1, j1, k1 are propagated into their uses:
B1: i1 ← 1; j1 ← 1; k1 ← 0
B2: j2 ← φ(j4,1); k2 ← φ(k4,0); if k2<100
B3: if j2<20
B4: return j2
B5: j3 ← 1; k3 ← k2+1
B6: j5 ← k2; k5 ← k2+2
B7: j4 ← φ(j3,j5); k4 ← φ(k3,k5)

SSA Example: Dead Code Elimination
i1, j1 and k1 have no remaining uses, so B1 is removed:
B2: j2 ← φ(j4,1); k2 ← φ(k4,0); if k2<100
B3: if j2<20
B4: return j2
B5: j3 ← 1; k3 ← k2+1
B6: j5 ← k2; k5 ← k2+2
B7: j4 ← φ(j3,j5); k4 ← φ(k3,k5)

SSA Example: Constant Propagation and Dead Code Elimination
j3 is propagated into the φ function in B7, after which j3 ← 1 has no remaining uses:
B2: j2 ← φ(j4,1); k2 ← φ(k4,0); if k2<100
B3: if j2<20
B4: return j2
B5: j3 ← 1; k3 ← k2+1
B6: j5 ← k2; k5 ← k2+2
B7: j4 ← φ(1,j5); k4 ← φ(k3,k5)

SSA Example: One Step Further
But block 6 is never executed! How can we find this out and simplify the program? SSA conditional constant propagation finds the least fixed point for the program and allows further elimination of dead code.
B2: j2 ← φ(j4,1); k2 ← φ(k4,0); if k2<100
B3: if j2<20
B4: return j2
B5: k3 ← k2+1
B6: j5 ← k2; k5 ← k2+2
B7: j4 ← φ(1,j5); k4 ← φ(k3,k5)

SSA Example: Dead Code Elimination
B6 and the test in B3 are removed, leaving single-argument φ functions in B7:
B2: j2 ← φ(j4,1); k2 ← φ(k4,0); if k2<100
B5: k3 ← k2+1
B7: j4 ← φ(1); k4 ← φ(k3)
B4: return j2

SSA Example: Single Argument φ Function Elimination
B2: j2 ← φ(j4,1); k2 ← φ(k4,0); if k2<100
B5: k3 ← k2+1
B7: j4 ← 1; k4 ← k3
B4: return j2
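The single-argument φ step is a purely local rewrite. A tiny sketch, assuming a hypothetical encoding in which each SSA name maps to an `(op, args)` pair and a copy or constant assignment is written with the op `"copy"`:

```python
def eliminate_single_argument_phis(defs):
    """Rewrite v <- phi(x) as the plain assignment v <- x."""
    out = {}
    for var, (op, args) in defs.items():
        if op == "phi" and len(args) == 1:
            out[var] = ("copy", list(args))
        else:
            out[var] = (op, list(args))
    return out

# State of the running example after B6 has been removed.
defs = {
    "j2": ("phi", ["j4", 1]),
    "k2": ("phi", ["k4", 0]),
    "k3": ("add", ["k2", 1]),
    "j4": ("phi", [1]),
    "k4": ("phi", ["k3"]),
}
simplified = eliminate_single_argument_phis(defs)
print(simplified["j4"], simplified["k4"])
```

The resulting copies are what the following constant and copy propagation step consumes.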

SSA Example: Constant and Copy Propagation
j4 ← 1 and k4 ← k3 are propagated into B2:
B2: j2 ← φ(1,1); k2 ← φ(k3,0); if k2<100
B5: k3 ← k2+1
B7: j4 ← 1; k4 ← k3
B4: return j2

SSA Example: More Dead Code
j4 and k4 have no remaining uses, so B7 is removed:
B2: j2 ← φ(1,1); k2 ← φ(k3,0); if k2<100
B5: k3 ← k2+1
B4: return j2

SSA Example: More φ Function Simplification
B2: j2 ← 1; k2 ← φ(k3,0); if k2<100
B5: k3 ← k2+1
B4: return j2

SSA Example: More Constant Propagation
B2: j2 ← 1; k2 ← φ(k3,0); if k2<100
B5: k3 ← k2+1
B4: return 1

SSA Example: Ultimate Dead Code Elimination
B4: return 1

Advantages of SSA-based optimizations
- Dependency information is built in
  - No separate phase required to compute dependency information
- Transformed output preserves SSA form
  - Little overhead to update dependencies
- Efficient algorithms due to:
  - Sparse occurrence of nodes
  - Complexity dependent only on problem size (independent of program size)
  - Linear data flow propagation along use-def edges
  - Can customize treatment according to candidate
- Can re-apply algorithms as often as needed
- No separation of local optimizations from global optimizations

SSA in the Real World
- Invented at the end of the 1980s; a lot of research in the 1990s
- Used in many modern compilers
  - LLVM
  - GNU GCC 4
  - ETH Oberon 2
  - IBM Jikes Java VM
  - Java HotSpot VM
  - Mono
  - Many more…

SSAPRE Kennedy et al., Partial redundancy elimination in SSA form, ACM TOPLAS

SSAPRE: Motivation
- Traditional data flow analysis based on bit-vectors does not interface well with the SSA representation

Traditional Partial Redundancy Elimination
Before PRE (SSA form): a ← x*y; b ← x*y (blocks B1, B2, B3)
After PRE (not SSA form!): t ← x*y; a ← t; t ← x*y; b ← t (blocks B1, B2, B3)

SSAPRE: Motivation (Cont.)
- Traditional PRE needs a postpass transform to turn the results into SSA form again.
- SSA is based on variables, not on expressions

SSAPRE: Motivation (Cont.)
- Representative occurrence
  - Given a partially redundant occurrence E0, we define the representative occurrence for E0 as the nearest to E0 among all occurrences that dominate E0.
  - Unique and well-defined
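One way to picture the definition is a walk up the dominator tree. The sketch below makes two simplifying assumptions that are not from the slides: the immediate-dominator map `idom` is already available, and each basic block contains at most one occurrence of the expression, so an occurrence can be identified by its block.

```python
def representative_occurrence(idom, occurrence_blocks, block):
    """Return the nearest strictly dominating block that holds an occurrence.

    idom: {block: immediate dominator}; the entry block has no entry
    occurrence_blocks: set of blocks containing an occurrence of E
    block: block of the partially redundant occurrence E0
    """
    b = idom.get(block)
    while b is not None:
        if b in occurrence_blocks:
            return b      # first hit on the way up is the nearest dominator
        b = idom.get(b)
    return None           # no dominating occurrence exists

# Diamond CFG: B1 -> B2, B1 -> B3, B2 -> B4, B3 -> B4, B4 -> B5
idom = {"B2": "B1", "B3": "B1", "B4": "B1", "B5": "B4"}
occurrences = {"B1", "B2", "B5"}
print(representative_occurrence(idom, occurrences, "B5"))
```

The occurrence in B2 is skipped for B5 because B2 does not dominate B5. Since the dominators of a block form a single chain, the first hit on the way up is unique, which matches the "unique and well-defined" remark.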

FRG (Factored Redundancy Graph)
Figure: occurrences of E connected by control flow edges and redundancy edges, shown without factoring and factored. In the factored graph the redundancy edges meet at E = Φ(E, E, ⊥).

FRG (Factored Redundancy Graph)
Assume E = x*y. The factored graph E = Φ(E, E, ⊥) corresponds to t0 ← x*y; t1 ← x*y; t2 ← φ(t0, t1, t3), and the later occurrences of E become uses ← t2.
Note: This is in SSA form

Observation
- Every use-def edge of the temporary corresponds directly to a redundancy edge for the expression.

Intuition for SSAPRE
- Construct the FRG for each expression E
- Refine redundant edges to form the use-def relation for each expression E's temporary
- Transform the program
  - For a definition, replace with t ← E
  - For a use, replace with ← t
  - Sometimes an expression also needs to be inserted

Main Steps
- Initialization: scan the whole program, collect all the occurrences, and build a worklist of expressions.
- Then, for each entry in the worklist:
  - Phi placement: find where the Phi nodes of the FRG must be placed.
  - Renaming: assign appropriate "redundancy classes" to all occurrences.
  - Analyze: compute various flags in the Phi nodes. Conceptually this consists of two data flow analysis passes, but in practice they only scan the Phi nodes in the FRG, not the whole code, so they are not that heavy.
  - Finalize: make the FRG exactly match the use-def graph for the introduced temporary.
  - Code motion: actually update the code using the FRG.

An Example 109 “Advanced Compiler Techniques”

Next Time
- Interprocedural Analysis
  - Reading: Dragon chapter