Code Optimization More Optimization Techniques
More Optimization Techniques
- Loop optimization
  - Code motion
  - Strength reduction for induction variables
  - Loop unrolling
- Function call optimization
  - Function inlining
  - Tail recursion elimination
- Alias analysis
  - Data flow analysis for pointers

Code Motion
Concept
- Find loop invariants: code whose result does not change from one iteration to the next
- Move them outside the loop whenever possible
Method
- First identify the loop
- Then perform reachability analysis
- Find the define-use links: for each variable used, find where it is defined
- If all of x's definitions are outside the loop, then x is loop invariant
- Identify the loop-invariant statements
- Examine the movability of those statements
- Repeat until no more code can be moved
Code Motion
First identify the loop and the entry node, then compute the In and Out sets.

     1: read m
     2: i := 1
     3: n := 2
     4: if (i>m) goto 10
     5: k := m + 2
     6: j := 3 * k
     7: n := j + n
     8: i := i + 1
     9: goto 4
    10: print n

Blocks: B1 = statements 1-3, B2 = statement 4, B3 = statements 5-9, B4 = statement 10. Definition dk is the assignment made by statement k.

    In[B1]  = {}
    Out[B1] = {d1, d2, d3}
    In[B2]  = {d1, d2, d3, d5, d6, d7, d8}
    Out[B2] = {d1, d2, d3, d5, d6, d7, d8}
    In[B3]  = {d1, d2, d3, d5, d6, d7, d8}
    Out[B3] = {d1, d5, d6, d7, d8}
    In[B4]  = {d1, d2, d3, d5, d6, d7, d8}
    Out[B4] = {d1, d2, d3, d5, d6, d7, d8}
Code Motion
Now compute the define-use links
- For each use, find the corresponding definitions in the In set
- If a variable has more than one possible source for its definition, all links should be given

     1: read m
     2: i := 1
     3: n := 2
     4: if (i>m) goto 10
     5: k := m + 2
     6: j := 3 * k
     7: n := j + n
     8: i := i + 1
     9: goto 4
    10: print n

    In[B2]: {d1, d2, d3, d5, d6, d7, d8}
    In[B3]: {d1, d2, d3, d5, d6, d7, d8}

Define-use links:
    m: d1
    i: d2, d8
    n: d3, d7
    k: d5
    j: d6
Code Motion
Now find invariant code
- Statements whose operands are not defined within the loop
- The computation does not depend on any value computed in the loop

     1: read m
     2: i := 1
     3: n := 2
     4: if (i>m) goto 10
     5: k := m + 2
     6: j := 3 * k
     7: n := j + n
     8: i := i + 1
     9: goto 4
    10: print n

Statement 5 (k := m + 2)
- m: defined outside the loop
- 2: constant
- Statement 5 is loop invariant: move it out of the loop
- Dependencies on definitions outside the loop are not a concern
Statement 6 (j := 3 * k)
- k: defined inside the loop
- 3: constant
- Statement 6 is not loop invariant
Code Motion
- Create a block before the entry block (if this has not been done yet)
- Move the invariant code into the new block
- If other back edges point to the same entry block:
  - If they belong to an outer loop, redirect them to the new block
  - Code motion may have to be performed for the other loops as well

     1: read m
     2: read n
     3: i := 1
     4: if (i>m) goto 10
     5: k := m + 2
     6: j := 3 * k
     7: n := j + n
     8: i := i + 1
     9: goto 4
    10: print n

Move 5 out of the loop. The new block before the loop entry contains:

     5: k := m + 2

Redirect the dependency link.
Code Motion
Continue to find new invariant code

     1: read m
     2: read n
     3: i := 1
     4: if (i>m) goto 10
     5: k := m + 2
     6: j := 3 * k
     7: n := j + n
     8: i := i + 1
     9: goto 4
    10: print n

With statement 5 already in the new block, statement 6 no longer has any operand that is defined within the loop. Move 6 out of the loop. The new block now contains:

     5: k := m + 2
     6: j := 3 * k

Statements 7 and 8 use n and i, which are still defined in the loop. No more statements can be moved out. Done!
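The repeated search for invariant statements can be sketched as a small fixed-point loop. This is an illustrative sketch only: the `loop` table and the function name `find_invariants` are hypothetical, and the sketch checks invariance alone, not whether a statement is safe to move.

```python
# Loop body of the running example: statement number -> (target, operands).
# Integer operands are constants; string operands are variable names.
# Statement 4 is the loop test and defines nothing.
loop = {
    4: (None, ["i", "m"]),
    5: ("k", ["m", 2]),
    6: ("j", [3, "k"]),
    7: ("n", ["j", "n"]),
    8: ("i", ["i", 1]),
}


def find_invariants(loop):
    """Return the invariant statements in the order they are discovered."""
    invariant = []
    changed = True
    while changed:
        changed = False
        # Variables still defined by a statement that remains in the loop.
        defined_in_loop = {
            target
            for stmt, (target, _) in loop.items()
            if target is not None and stmt not in invariant
        }
        for stmt, (target, operands) in loop.items():
            if stmt in invariant or target is None:
                continue
            if all(not isinstance(o, str) or o not in defined_in_loop
                   for o in operands):
                invariant.append(stmt)
                changed = True
                break  # recompute defined_in_loop before continuing
    return invariant


moved = find_invariants(loop)
```

Statement 5 is found first; once it is treated as moved, k is no longer defined in the loop and statement 6 qualifies on the next pass.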
Code Motion
Problem: moving code causes the execution order to change

    B1: i := 1
    B2: if u > v goto B4
    B3: i := 2
        u := u + 1
    B4: v := v - 1
        if v > 0 goto B2
    B5: j := i
        …

- i := 2 is loop invariant. Move it out of the loop?
- What would i be? In the original code, both i := 1 and i := 2 reach B5
- Moving i := 2 out of the loop would make i := 1 "dead"
- Move a statement only if the block containing it dominates all exit blocks
Code Motion
Problem: moving code causes the execution order to change

    B1: i := 1
    B2: i := 3
        if u > v goto B4
    B3: i := 2
        u := u + 1
    B4: v := v - 1
        if v > 0 goto B2
    B5: j := i
        …

- Should i := 3 be moved out of the loop? B2 does dominate the only exit block
- What would i be? Moving i := 3 out would let i = 2 (assigned in B3 on an earlier iteration) reach the end of B2, where the original code always has i = 3
- Move a statement only if there is no other definition of i (the variable being defined) in the loop
Code Motion
Problem: moving code causes the execution order to change

    B1: i := 1
    B2: k := i
    B3: i := 2
        u := u + 1
    B4: v := v - 1
        if v > 0 goto B2
    B5: j := i
        …

- Should i := 2 be moved out of the loop? Its block dominates the exit, and it is the only definition of i in the loop
- What would i be? B2 uses i, and both i := 1 and i := 2 reach B2
- Move a statement only if every use of i (the variable being defined) in the loop can be reached only by i := 2 (the statement to be moved out)
Code Motion
Problem: moving code causes the execution order to change
Algorithm
- Detect loop-invariant code
  - s: an invariant statement, a candidate to be moved out
  - B: the block containing s
  - d: the definition produced by s
  - x: the variable defined in d
- Check:
  1. B dominates all exits
  2. There is no other definition of x in the loop
  3. All uses of x in the loop are from d
- Move the code only if all three conditions are met
  - Move it to the block immediately before the loop entry
Code Motion
Check that B dominates all exit nodes
- What is an exit node? The entry node? The source node of the back edge?
- Answer: any node in the loop with an outgoing edge to a node that is not in the loop
- Check the dominator tree for this condition
Check that there is no other definition of x in the loop
- Simply check all the definitions
Check that all uses of x in the loop are from d
- Check the In set of each block that uses x: d must be the only definition of x that reaches the use
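The first condition can be checked by computing dominator sets and testing them against the loop's exit nodes. The sketch below assumes a five-block CFG with edges B1→B2, B2→B3, B2→B4, B3→B4, B4→B2, and B4→B5, which is one reading of the problem example; the names `succ`, `dominators`, and `dominates_all_exits` are hypothetical.

```python
# Assumed CFG: successor lists per block. The loop is {B2, B3, B4}.
succ = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B4"], "B4": ["B2", "B5"], "B5": []}
loop = {"B2", "B3", "B4"}


def dominators(succ, entry):
    """Iterative dominator sets: dom[n] = {n} | intersection of dom[p] over preds p."""
    nodes = set(succ)
    preds = {n: [p for p in succ if n in succ[p]] for n in nodes}
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            # Every non-entry node in this CFG has at least one predecessor.
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


dom = dominators(succ, "B1")

# An exit node is a loop node with an edge to a node outside the loop.
exits = sorted(b for b in loop if any(s not in loop for s in succ[b]))


def dominates_all_exits(block):
    return all(block in dom[e] for e in exits)
```

Under this CFG the only exit node is B4. B2 dominates it, but B3 does not, because B4 can be reached from B2 without passing through B3.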
Induction Variables
Finding basic induction variables in loop L
- Scan the statements in L; a variable x is a basic induction variable if
  - within L, x is defined only once
  - the definition of x is of the form x := x ± b, where b is a constant
    - it could also be written x := b + x
- Perform constant propagation and constant folding first to allow better recognition of b
Induction Variables
Finding other induction variables y, where y is defined as a linear function of x
- Scan the statements in L and find all variables y such that
  - y is defined only once in L
  - the definition of y is equivalent to y := c * x + d
    - x is a basic induction variable
    - c and d are constants
- Then y is in the family of x, expressed as (x, c, d)
  - x itself is expressed as (x, 1, 0)
  - if y := 4 * x, then y is (x, 4, 0)
Strength Reduction for Induction Variables
x is the basic IV
- The definition of x is x := x + b
For each y in the family of x, expressed as (x, c, d)
- Create a new variable sy (if not created yet)
- Initialize sy outside the loop, immediately before the loop entry
  - sy := c * x + d
- Update sy in the loop
  - Add sy := sy + c * b immediately after x's definition
- Replace the definition of y
  - It becomes y := sy
Strength Reduction for Induction Variables

Original code:
    (before the loop)
    i := m - 1
    t1 := 4 * n
    v := a[t1]
    (loop)
    i := i + 1
    t2 := 4 * i
    t3 := a[t2]
    if t3 < v …

After strength reduction:
    (before the loop)
    i := m - 1
    t1 := 4 * n
    v := a[t1]
    s2 := 4 * i
    (loop)
    i := i + 1
    s2 := s2 + 4
    t2 := s2
    t3 := a[t2]
    if t3 < v …

- i is a basic IV
- t2 is an IV in i's family: (i, 4, 0)
- Add s2 and initialize it before the loop: s2 := 4 * i
- Add s2 := s2 + 4 after i's definition
- Replace t2 := 4 * i by t2 := s2
Why not just replace t2 := 4 * i by t2 := t2 + 4 and initialize t2 := 4 * i before the loop?
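Strength Reduction for Induction Variables

Original code (t2 has another definition before the loop and a use at the top of the loop):
    (before the loop)
    i := m - 1
    t1 := 4 * n
    v := a[t1]
    t2 := t1 + v
    (loop)
    use t2
    i := i + 1
    t2 := 4 * i
    t3 := a[t2]
    if t3 < v …

Incorrect: reusing t2 itself
    (before the loop)
    i := m - 1
    t1 := 4 * n
    v := a[t1]
    t2 := t1 + v
    t2 := 4 * i
    (loop)
    use t2
    i := i + 1
    t2 := t2 + 4
    t3 := a[t2]
    if t3 < v …
- The first use of t2 now gets its value from an incorrect definition (t2 := 4 * i instead of t2 := t1 + v)

Correct: using the new variable s2
    (before the loop)
    i := m - 1
    t1 := 4 * n
    v := a[t1]
    t2 := t1 + v
    s2 := 4 * i
    (loop)
    use t2
    i := i + 1
    s2 := s2 + 4
    t2 := s2
    t3 := a[t2]
    if t3 < v …

Why can't we just replace t2 := 4 * i by s2 := s2 + 4 and t2 := s2 (in the same spot)?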
Strength Reduction for Induction Variables

Original code (t2 is defined before i is updated):
    (before the loop)
    i := m - 1
    t1 := 4 * n
    v := a[t1]
    (loop)
    t2 := 4 * i
    use t2
    i := i + 1
    t3 := a[t2]
    if t3 < v …

Incorrect: both new statements placed at t2's original location
    (before the loop)
    i := m - 1
    t1 := 4 * n
    v := a[t1]
    s2 := 4 * i
    (loop)
    s2 := s2 + 4
    t2 := s2
    use t2
    i := i + 1
    t3 := a[t2]
    if t3 < v …
- This changes s2, and therefore t2, at an incorrect time
- The update has to be done after i's definition
- t2 := s2 has to be where t2's definition is

Correct:
    (before the loop)
    i := m - 1
    t1 := 4 * n
    v := a[t1]
    s2 := 4 * i
    (loop)
    t2 := s2
    use t2
    i := i + 1
    s2 := s2 + 4
    t3 := a[t2]
    if t3 < v …
- t2 := s2 is in t2's original location
- The update of s2 is right after i's definition
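The placement rules for s2 can be checked with a small simulation of the loop in which t2 is defined before i is updated. This is a sketch under simplifying assumptions: only the sequence of t2 values seen at "use t2" is recorded, the array accesses are omitted, and the function names are hypothetical.

```python
def original(m, iterations):
    """t2 := 4 * i; use t2; i := i + 1"""
    i = m - 1
    seen = []
    for _ in range(iterations):
        t2 = 4 * i
        seen.append(t2)  # use t2
        i = i + 1
    return seen


def reduced_correct(m, iterations):
    """t2 := s2 stays at t2's original location; s2 is updated right after i."""
    i = m - 1
    s2 = 4 * i  # initialized immediately before the loop entry
    seen = []
    for _ in range(iterations):
        t2 = s2
        seen.append(t2)  # use t2
        i = i + 1
        s2 = s2 + 4
    return seen


def reduced_wrong(m, iterations):
    """Both new statements placed at t2's original location (incorrect)."""
    i = m - 1
    s2 = 4 * i
    seen = []
    for _ in range(iterations):
        s2 = s2 + 4  # updated too early, before i changes
        t2 = s2
        seen.append(t2)  # use t2
        i = i + 1
    return seen
```

For m = 3 and three iterations, the original and the correctly reduced loop both see 8, 12, 16, while the incorrect placement sees 12, 16, 20.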
Finding More Induction Variables
Consider y in the family of a basic IV x, and z defined as a linear function of y
- Scan the statements in L and find all variables z such that
  - z is defined only once in L
  - the definition of z is equivalent to z := c' * y + d'
    - y is an induction variable in the family of x, expressed as (x, c, d)
    - c' and d' are constants
  - there is no assignment to x between the assignments to y and z
  - no assignment to y outside L reaches the use of y in z's definition
- Then z is in the family of x, expressed as (x, c'*c, c'*d+d')
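The composition rule follows from substituting y = c*x + d into z = c'*y + d'. A minimal sketch of the triple arithmetic, with the hypothetical names `IV`, `derive`, and `value`:

```python
from typing import NamedTuple


class IV(NamedTuple):
    """An induction variable expressed as (base, c, d), meaning c * base + d."""
    base: str
    c: int
    d: int


def derive(y: IV, c2: int, d2: int) -> IV:
    """z := c2 * y + d2 with y = (x, c, d) gives z = (x, c2*c, c2*d + d2)."""
    return IV(y.base, c2 * y.c, c2 * y.d + d2)


def value(iv: IV, x: int) -> int:
    """Evaluate the induction variable for a given value of its base."""
    return iv.c * x + iv.d


y = IV("x", 4, 0)     # y := 4 * x
z = derive(y, 3, 5)   # z := 3 * y + 5
```

Here z comes out as (x, 12, 5), which agrees with computing 3 * y + 5 directly for any value of x.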
Loop Unrolling
Advantages
- Saves loop tests
- Facilitates the application of other optimization techniques
- Increases instruction-level parallelism

Original loop:
    fac := 1;
    for (int i:=5; i>1; i--) {
      fac := (fac * i) % Max;
    }

Unrolled:
    fac := 1;
    fac := (fac * 5) % Max;
    fac := (fac * 4) % Max;
    fac := (fac * 3) % Max;
    fac := (fac * 2) % Max;

Applying constant propagation to the unrolled code can then achieve a large further optimization.
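The effect of unrolling followed by constant propagation and folding can be seen by writing out the three stages by hand. This sketch assumes Max = 10**6, as in the inlining example later in these slides; the function names are hypothetical.

```python
MAX = 10**6  # assumed value of Max


def fac_loop():
    fac = 1
    for i in range(5, 1, -1):  # i = 5, 4, 3, 2
        fac = (fac * i) % MAX
    return fac


def fac_unrolled():
    # The loop body repeated once per iteration, with i replaced by its value.
    fac = 1
    fac = (fac * 5) % MAX
    fac = (fac * 4) % MAX
    fac = (fac * 3) % MAX
    fac = (fac * 2) % MAX
    return fac


def fac_folded():
    # After constant propagation and folding, every step is known at compile time.
    return 120
```

All three functions return the same value; the last one does no work at run time.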
Optimization for Function Calls
Tail recursion elimination
- A function is tail-recursive if its last statement is a call to itself

    list_type last (lst: list_type) {
      if (lst.next == null) then return lst;
      else return last (lst.next);
    }

- The tail-recursive call can be replaced with an assignment and a jump:

    list_type last (lst: list_type) {
      start:
      if (lst.next == null) then return lst;
      else { lst := lst.next; go to start }
    }
Optimization for Function Calls
Tail recursion elimination
Replacement rule
- Formal parameter of the function := actual parameter in the tail-recursive call
  - Applies to multiple parameters, as long as they have a one-to-one correspondence
- Go back to the start of the function
Advantages of tail recursion elimination
- The assignments and a jump were needed by the function call anyway
  - A function call requires additional copies and jumps
- Saves stack space
  - Recursive call: requires linear stack usage
  - Converted code: only constant stack usage
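The replacement rule can be illustrated with the `last` function from the example. Python does not perform this transformation itself, so the two versions below are written by hand; the `Node` class and the `build` helper are assumptions added to make the sketch self-contained.

```python
class Node:
    """Singly linked list cell (hypothetical stand-in for list_type)."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def build(n):
    """Build the list 0 -> 1 -> ... -> n-1 and return its head."""
    head = None
    for v in reversed(range(n)):
        head = Node(v, head)
    return head


def last_recursive(lst):
    if lst.next is None:
        return lst
    return last_recursive(lst.next)  # tail call: one stack frame per element


def last_iterative(lst):
    while True:              # plays the role of the "start" label
        if lst.next is None:
            return lst
        lst = lst.next       # formal parameter := actual parameter, then jump back
```

Both versions return the same node on short lists. On a list of several thousand nodes, the iterative version still runs in constant stack space, whereas the recursive version may exhaust the interpreter's recursion limit.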
Optimization for Function Calls
Function Inlining
Advantages
- Enables additional optimizations
  - Because the actual parameters are revealed, many optimizations may become possible
- Saves the copying and jumps required by a function call
Disadvantages
- Code size increases
Function Inlining – Return and Exit
Simple case
- Replace the return statement by an assignment of the returned value to the target variable
  - Assumes that there is a returned value and that it is assigned to a target variable in the caller
Some issues
- The function may have multiple exits
  - Use a label to serve as a single exit point
- The function may not assign return values on all paths
  - The inlined code has to be truthful to the original code
  - Create a temporary variable to keep the returned result
Function Inlining – Return and Exit
General rules
- Replace each return statement by
  - an assignment of the returned value to the temporary variable (if there is a returned value)
  - followed by a goto to the exit label
- Assign the temporary variable to the target variable at the exit label
Optimizations
- If the function has a single return statement and the calling statement is an assignment statement
  - Use the simple-case rules
- If the function has no return statement
  - Simply use the exit label
- Apply all data flow analysis and optimization techniques
Function Inlining – Parameter Passing
Simple case
- Replace formal parameters by actual parameters
If an actual parameter is an expression, evaluate it once
- Assign the evaluation result to a temporary variable
- Replace the formal parameter with the temporary variable
Actual parameter is A[exp]
- A mix of the above two cases
- Problems: array size mismatch, subtyping, etc.
Global variables and static variables
- Recognize global variables and preserve their scope
  - Rename where names clash
- Retain the static status of static variables
Function Inlining

Original code:
    int Max := 10**6;
    …
    int factorial (int x) {
      int fac := 1;
      for (int i:=x; i>1; i--) {
        fac := (fac * i) % Max;
      }
      return fac;
    }
    …
    int val1 := factorial(10);
    …
    int val2 := factorial(5);
    …

After inlining:
    int Max := 10**6;
    …
    int val1;
    { int fac := 1;
      for (int i:=10; i>1; i--) {
        fac := (fac * i) % Max;
      }
      val1 := fac;
    }
    …
    int val2;
    { int fac := 1;
      for (int i:=5; i>1; i--) {
        fac := (fac * i) % Max;
      }
      val2 := fac;
    }
    …

- The function has a single return point
- Max is a global variable: preserve its scope
Function Inlining

Original code:
    int Max := 10**6;
    int overflow := -2;
    …
    int factorial (int x) {
      if (x >= 1) {
        int fac := 1;
        for (int i:=x; i>1; i--) {
          fac := (fac * i);
          if (fac > Max) return overflow;
        }
        return fac;
      }
    }
    …
    int val := factorial (x*y+2);
    …

- Multiple return points
- Undefined return path for x < 1
- The actual parameter is an expression

After inlining:
    int Max := 10**6;
    int overflow := -2;
    …
    int t1 := x*y+2;
    int ret;
    { if (t1 >= 1) {
        int fac := 1;
        for (int i:=t1; i>1; i--) {
          fac := (fac * i);
          if (fac > Max) { ret := overflow; goto L1; }
        }
        { ret := fac; goto L1; }
      }
    }
    L1: int val := ret
    …

- t1: define a temporary variable and evaluate the expression once
- ret: define a temporary variable for the returned value
- L1: define the exit label
- Same as the original code: ret is undefined on some paths
A Headache in Data Flow Analysis - Alias
Pointer aliases
- E.g., *p := y may write to any memory location
- E.g., x := *p may read from any memory location
- This is the case if p is dynamically computed
Problems
- Example: live variable analysis
  - …, x := *p: all variables in all prior blocks may be alive
- Example: constant folding
  - a := 1; b := 2; *p := 0; c := a + b;
  - c := 3 only if *p is not an alias for a or b!
- All data flow analysis techniques can be invalidated by aliases
- Need to perform alias analysis when possible
Alias Analysis
Ptr(v)
- For each variable v that may hold an address, compute Ptr(v)
- Ptr(v) includes all variables that v may point to
- Ptr(v) may include variables allocated on the stack and on the heap
- Can be represented as a graph G, G ⊆ V × V
  - V: the set of all variables in the program
- E.g., Ptr(x) = {y}, Ptr(y) = {u, v}
Andersen's graphical representation
[Figure: points-to graph with edges x → y, y → u, y → v]
Alias Flow Analysis
Data flow analysis
- For p := &q
  - Add (p,q) to Gen[B]
  - Add (p,x) to Kill[B], for all x in V (the entire set of variables), besides the ones in Gen[B]
- For p := q
  - Add (p,x) to Gen[B], for all x in Ptr(q)
  - Add (p,x) to Kill[B], for all x in V (the entire set of variables), besides the ones in Gen[B]
- For p := *q
  - Add (p,x) to Gen[B], for all x in Ptr(Ptr(q))
  - Add (p,x) to Kill[B], for all x in V (the entire set of variables), besides the ones in Gen[B]
Alias Flow Analysis
Data flow analysis
- For *p := q
  - Add (x,y) to Gen[B], for all x in Ptr(p) and all y in Ptr(q)
  - Add (x,y) to Kill[B], for all x in Ptr(p) and all y in V, besides those in Gen[B]
- For *p := *q
  - Add (x,y) to Gen[B], for all x in Ptr(p) and all y in Ptr(Ptr(q))
  - Add (x,y) to Kill[B], for all x in Ptr(p) and all y in V, besides those in Gen[B]
- When you have *…* or &…&
  - Resolve layer by layer
Alias Flow Analysis
Data flow analysis
- Out[B] = Gen[B] ∪ (In[B] − Kill[B])
  - If (x,y) is in the In set and also in the Kill set, then it is killed
- In[B] = ∪ Out[P], over all predecessors P of B
  - As long as a pointer pair (x,y) reaches B, the pointer x could point to y
Assumption
- No pointer arithmetic
- If there is, offsets need to be used for the data flow analysis instead of actual variables
Alias Flow Analysis
Example

    x := &a
    y := &b
    c := &i
    if (i) x := y
    *x := c

After x := &a; y := &b; c := &i
- Out: (x,a), (y,b), (c,i)
For x := y
- In: (x,a), (y,b), (c,i)
- Killed: (x,a)
- Out: (x,b), (y,b), (c,i)
For *x := c
- In: (x,a), (x,b), (y,b), (c,i)
- Killed: (a,*), (b,*) (none are present)
- Out: (x,a), (x,b), (y,b), (c,i), (a,i), (b,i)
- x points to a and b
- a and b may now point to whatever c points to
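The example can be replayed with a small transfer function that follows the Gen and Kill rules, applied one statement at a time rather than per block. This is a sketch only: the tuple encoding of statements and the names `ptr`, `transfer`, and `run` are assumptions, and only the statement forms needed by the example are handled.

```python
def ptr(pairs, v):
    """Ptr(v): everything v may point to according to the pair set."""
    return {y for (x, y) in pairs if x == v}


def transfer(stmt, in_set):
    """Out = Gen | (In - Kill) for a single statement."""
    kind, p, q = stmt
    if kind == "addr":        # p := &q
        targets = {p}
        gen = {(p, q)}
    elif kind == "copy":      # p := q
        targets = {p}
        gen = {(p, x) for x in ptr(in_set, q)}
    elif kind == "store":     # *p := q
        targets = ptr(in_set, p)
        gen = {(x, y) for x in targets for y in ptr(in_set, q)}
    else:
        raise ValueError(f"unknown statement kind: {kind}")
    # Kill every pair whose left side is assigned, besides those in Gen.
    kill = {(x, y) for (x, y) in in_set if x in targets} - gen
    return gen | (in_set - kill)


def run(stmts, in_set=frozenset()):
    out = set(in_set)
    for s in stmts:
        out = transfer(s, out)
    return out


out_b1 = run([("addr", "x", "a"), ("addr", "y", "b"), ("addr", "c", "i")])
out_b2 = run([("copy", "x", "y")], out_b1)   # taken branch: x := y
in_b3 = out_b1 | out_b2                       # join of both paths
out_b3 = run([("store", "x", "c")], in_b3)    # *x := c
```

The final set contains (a,i) and (b,i) in addition to everything in `in_b3`, matching the Out set given for `*x := c`.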
Data Flow Analysis with Alias
Apply the reachability information to perform the analysis
- E.g., a := 1; b := 2; *p := 0; c := a + b;
- *p := 0 defines x := 0, for every pair (p,x) that reaches *p := 0
- If (p,a) or (p,b) reaches *p := 0, then c is no longer a constant
- Even if only (p,a) reaches *p := 0, this does not imply that p points to a; p could also be undefined
- Use the reachable Ptr set to help with data flow analysis
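The constant-folding example can be made concrete by simulating memory as a dictionary and letting p point to different variables. This is an illustrative sketch; the names `run_example` and `c_is_constant`, and the extra variable x, are hypothetical.

```python
def run_example(p_target):
    """Execute a := 1; b := 2; *p := 0; c := a + b with p pointing to p_target."""
    mem = {"a": 0, "b": 0, "c": 0, "x": 0}
    mem["a"] = 1
    mem["b"] = 2
    mem[p_target] = 0               # *p := 0
    mem["c"] = mem["a"] + mem["b"]
    return mem["c"]


def c_is_constant(ptr_p):
    """c can be folded to 3 only if neither (p,a) nor (p,b) reaches *p := 0."""
    return not ({"a", "b"} & set(ptr_p))
```

When p points to an unrelated variable, c is 3 as constant folding would predict. When p may point to a or b, the result differs, so the fold must be rejected.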
Data Flow Analysis with Alias
Define-use
- For c := *p op b
  - Only p is of pointer type
  - All x in Ptr(p) are added to the use set
    - Or: all x such that (p,x) reaches here are added to the use set
    - In the example, for c := *x op b, a and b are added to the use set
  - p is also added to the use set (p is needed to get *p)
  - Of course, b is in the use set and c is in the define set
- For *p := b op c
  - Only p is of pointer type
  - All x in Ptr(p) are added to the def set
    - Or: all x such that (p,x) reaches here are added to the def set
    - In the example, for *x := b op c, a := b op c and b := b op c are added to the def set
  - p is also in the use set (p is needed to get *p)
  - Of course, b and c are in the use set
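These two rules can be written as small helper functions that take the Ptr set of the pointer as input. The sketch below is illustrative, with hypothetical function names, and uses Ptr(x) = {a, b} as in the example.

```python
def use_def_load(c, p, b, ptr_p):
    """Use and def sets for c := *p op b."""
    return {"use": set(ptr_p) | {p, b}, "def": {c}}


def use_def_store(p, b, c, ptr_p):
    """Use and def sets for *p := b op c. The defs are 'may' definitions."""
    return {"use": {p, b, c}, "def": set(ptr_p)}


ptr_x = {"a", "b"}                         # x may point to a or b
load = use_def_load("c", "x", "b", ptr_x)  # c := *x op b
store = use_def_store("x", "b", "c", ptr_x)  # *x := b op c
```

For the load, a, b, and x end up in the use set and c in the def set. For the store, a and b are the possible definitions, while x, b, and c are used.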
Data Flow Analysis with Alias
Define-use
- For p := …
  - Only p is defined
- For p := …q…
  - Only p is defined
  - Only q (and any other variables that appear on the right side) is used, not the objects q points to
Code Optimization -- Summary
Read Chapter 9, Sections 9.2, 9.4, 9.5, 9.6
Optimization of intermediate code
- Overview of optimization techniques
- Construct CFG and loop recognition
- Data flow analysis based on CFG
- Constant propagation and constant folding
- Copy propagation
- Code motion
- Induction variable identification and strength reduction
- Dead code elimination

Code Optimization -- Summary
Optimization of intermediate code
- Loop optimization (9.1)
- Function call optimization (2.5.4)
- Alias analysis (12.4)