1 Code Optimization Chapter 9 (1 st ed. Ch.10) COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University,
2 The Code Optimizer Control flow analysis: control flow graph Data-flow analysis Transformations Front end Code generator Code optimizer Control- flow analysis Data- flow analysis Transfor- mations
3 Determining Loops in Flow Graphs: Dominators Dominators: d dom n –Node d of a CFG dominates node n if every path from the initial node of the CFG to n goes through d –The loop entry dominates all nodes in the loop The immediate dominator m of a node n is the last dominator on the path from the initial node to n –If d n and d dom n then d dom m
4 Dominator Trees CFGDominator tree
5 Natural Loops A back edge is is an edge a b whose head b dominates its tail a Given a back edge n d –The natural loop consists of d plus the nodes that can reach n without going through d –The loop header is node d In other words –A natural loop must have a single-entry node d –There must be a back edge that enters node d
6 Natural Inner/Outer Loops Unless two loops have the same header, they are disjoint or one is nested within the other A nested loop is an inner loop if it contains no other loops A loop is an outer loop if it is not contained within another loop
7 Natural Inner Loops Example CFGDominator tree Natural loop for 7 dom 10 Natural loop for 3 dom 4 Natural loop for 4 dom 7
8 Natural Outer Loops Example CFGDominator tree Natural loop for 1 dom 9 Natural loop for 3 dom 8
9 Pre-Headers To facilitate loop transformations, a compiler often adds a preheader to a loop Code motion, strength reduction, and other loop transformations populate the preheader Header Preheader
10 Reducible Flow Graphs Example of a reducible CFG 1 23 Example of a nonreducible CFG Reducible graph = disjoint partition in forward and back edges such that the forward edges form an acyclic (sub)graph
11 Global Data-Flow Analysis To apply global optimizations on basic blocks, data-flow information is collected by solving systems of data-flow equations Suppose we need to determine the live-variables for a sequence of statements S in[S] = use[S] (out[S] - def[S]) i := m-1 j := n j := j-1 := i+j B1: B2: B3: Solution: in[B1] = { m, n } ({ i, j }–{ i, j }) = { m, n } out[B1] = in[B2] = { i, j } in[B2] = { j } ({ i, j } – { j }) = { j } out[B2] = in[B3] = { i, j }
12 Live-Variable Analysis S d: a:=b+c Then, the data-flow equations for S are: use[S]= { b, c } def[S]= { a } in[S]= use[S] (out[S] – def[S]) is of the form out[S] in[S] applies transfer function: f [S] (x) = use [S] (x - def [S] ) backwards
13 Live-Variable Analysis S in[S]= in[S 1 ] in[S 1 ]= use[S 1 ] (out[S 1 ] – def[S 1 ]) out[S 1 ]= in[S 2 ] in[S 2 ]= use[S 2 ] (out[S 2 ] – def[S 2 ]) out[S 2 ]= out[S] is of the form S2S2 S1S1
14 Live-Variable Analysis S in[S]= in[S 1 ] in[S 2 ] in[S 1 ]= use[S 1 ] (out[S 1 ] – def[S 1 ]) in[S 2 ]= use[S 2 ] (out[S 2 ] – def[S 2 ]) out[S 1 ]= out[S] out[S 2 ]= out[S] is of the form S2S2 S1S1
15 Live-Variable Analysis S in[S]= in[S 1 ] in[S 1 ]= use[S 1 ] (out[S 1 ] – def[S 1 ]) out[S 1 ]= out[S] use[S 1 ] is of the form S1S1 non-iterative solution
16 Global Data-Flow Analysis Suppose we need to determine the reaching definitions for a sequence of statements S out[S] = gen[S] (in[S] - kill[S]) d1: i := m-1 d2: j := n d3: j := j-1 B1: B2: B3: out[B1] = gen[B1] = {d1, d2} out[B2] = gen[B2] {d1} = {d1, d3} d1 reaches B2 and B3 and d2 reaches B2, but not B3 because d2 is killed in B2
17 Reaching Definitions S d: a:=b+c Then, the data-flow equations for S are: gen[S]= {d} kill[S]= D a - {d} out[S]= gen[S] (in[S] - kill[S]) where D a = all definitions of a in the region of code is of the form applies transfer function: f [S] (x) = gen [S] (x - kill [S] ) out[S] in[S] forwards
18 Reaching Definitions S gen[S]= gen[S 2 ] (gen[S 1 ] - kill[S 2 ]) kill[S]= kill[S 2 ] (kill[S 1 ] - gen[S 2 ]) in[S 1 ]= in[S] in[S 2 ]= out[S 1 ] out[S]= out[S 2 ] is of the form S2S2 S1S1
19 Reaching Definitions S gen[S]= gen[S 1 ] gen[S 2 ] kill[S]= kill[S 1 ] kill[S 2 ] in[S 1 ]= in[S] in[S 2 ]= in[S] out[S]= out[S 1 ] out[S 2 ] is of the form S2S2 S1S1
20 Reaching Definitions S gen[S]= gen[S 1 ] kill[S]= kill[S 1 ] in[S 1 ]= in[S] gen[S 1 ] out[S]= out[S 1 ] is of the form S1S1 non-iterative solution (see later)
21 Reaching Definitions: Computing Gen/Kill d 1 : i := m-1; d 2 : j := n; d 3 : a := u1; do d 4 : i := i+1; d 5 : j := j-1; if e1 then d 6 : a := u2 else d 7 : i := u3 while e2 ; gen={d 1 } kill={d 4, d 7 } d1d1 gen={d 2 } kill={d 5 } d2d2 gen={d 1,d 2 } kill={d 4,d 5,d 7 } ; d3d3 gen={d 3 } kill={d 6 } gen={d 1,d 2,d 3 } kill={d 4,d 5,d 6,d 7 } ; gen={d 3,d 4,d 5,d 6,d 7 } kill={d 1,d 2 } do ; gen={d 4 } kill={d 1, d 7 } d4d4 ; gen={d 5 } kill={d 2 } d5d5 if e2 d6d6 d7d7 e1 gen={d 6 } kill={d 3 } gen={d 7 } kill={d 1,d 4 } gen={d 4,d 5 } kill={d 1,d 2,d 7 } gen={d 4,d 5,d 6,d 7 } kill={d 1,d 2 } gen={d 6,d 7 } kill={}
22 Using Bit-Vectors to Compute Reaching Definitions d 1 : i := m-1; d 2 : j := n; d 3 : a := u1; do d 4 : i := i+1; d 5 : j := j-1; if e1 then d 6 : a := u2 else d 7 : i := u3 while e2 ; d1d1 d2d2 ; d3d3 ; do ; d4d4 ; d5d5 if e2 d6d6 d7d7 e d 1 d 2 d 3 d 4 d 5 d 6 d 7 gen= kill=
23 Reaching Definitions: Computing In/Out (non-iterative) d 1 : i := m-1; d 2 : j := n; d 3 : a := u1; do d 4 : i := i+1; d 5 : j := j-1; if e1 then d 6 : a := u2 else d 7 : i := u3 while e2 ; in={} out={d 1 } d1d1 in={d 1 } out={d 1,d 2 } d2d2 in={} out={d 1,d 2 } ; d3d3 in={d 1,d 2 } out={d 1,d 2,d 3 } in={} out={d 1,d 2,d 3 } ; in={} out={d 3,d 4,d 5,d 6,d 7 } do ; in={d 1,d 2,d 3,d 4,d 5,d 6,d 7 } out={d 2,d 3,d 4,d 5,d 6 } d4d4 ; in={d 2,d 3, d 4,d 5,d 6 } out={d 3,d 4,d 5,d 6 } d5d5 if e2 d6d6 d7d7 e1 in={d 3,d 4, d 5,d 6 } out={d 4,d 5,d 6 } in={d 3,d 4,d 5,d 6 } out={d 3,d 5, d 6,d 7 } in={d 1,d 2,d 3,d 4,d 5,d 6,d 7 } out={d 3,d 4,d 5,d 6 } in={d 1,d 2,d 3, d 4,d 5,d 6,d 7 } out={d 3,d 5,d 6,d 7 } in={d 1,d 2,d 3 } out={d 3,d 4,d 5, d 6,d 7 } in={d 3,d 4,d 5,d 6 } out={d 3,d 4,d 5,d 6,d 7 }
24 Accuracy, Safeness, and Conservative Estimations Conservative: refers to making safe assumptions when insufficient information is available at compile time, i.e. the compiler has to guarantee not to change the meaning of the optimized code Safe: refers to the fact that a superset of reaching definitions is safe (some may have been killed) Accuracy: more and better information enables more code optimizations
25 Reaching Definitions are a Conservative (Safe) Estimation S2S2 S1S1 Suppose this branch is never taken Estimation: gen[S]= gen[S 1 ] gen[S 2 ] kill[S]= kill[S 1 ] kill[S 2 ] Accurate: gen’[S]= gen[S 1 ] ⊆ gen[S] kill’[S]= kill[S 1 ] ⊇ kill[S]
26 Reaching Definitions are a Conservative (Safe) Estimation in[S 1 ] = in[S] gen[S 1 ] S1S1 Why gen? S is of the form The problem is that in[S 1 ] = in[S] out[S 1 ] makes more sense, but we cannot solve this directly (only iteratively), because out[S 1 ] depends on in[S 1 ]
27 Reaching Definitions are a Conservative (Safe) Estimation d: a:=b+c We have: (1) in[S 1 ] = in[S] out[S 1 ] (2) out[S 1 ] = gen[S 1 ] (in[S 1 ] - kill[S 1 ]) Solve in[S 1 ] and out[S 1 ] by estimating in 1 [S 1 ] using safe but approximate out[S 1 ]= , then re-compute out 1 [S 1 ] using (2) to estimate in 2 [S 1 ], etc. in 1 [S 1 ]= (1) in[S] out[S 1 ] = in[S] out 1 [S 1 ]= (2) gen[S 1 ] (in 1 [S 1 ] - kill[S 1 ]) = gen[S 1 ] (in[S] - kill[S 1 ]) in 2 [S 1 ]= (1) in[S] out 1 [S 1 ] = in[S] gen[S 1 ] (in[S] - kill[S 1 ]) = in[S] gen[S 1 ] out 2 [S 1 ]= (2) gen[S 1 ] (in 2 [S 1 ] - kill[S 1 ]) = gen[S 1 ] (in[S] gen[S 1 ] - kill[S 1 ]) = gen[S 1 ] (in[S] - kill[S 1 ]) Because out 1 [S 1 ] = out 2 [S 1 ], and therefore in 3 [S 1 ] = in 2 [S 1 ], we conclude that in[S 1 ] = in[S] gen[S 1 ]