

Code Optimization More Optimization Techniques

More Optimization Techniques
- Loop optimization
  - Code motion
  - Strength reduction for induction variables
  - Loop unrolling
- Function call optimization
  - Function inlining
  - Tail recursion elimination
- Alias analysis
  - Data flow analysis for pointers

Code Motion
- Concept
  - Find loop invariants: code whose result does not change from one iteration to the next
  - Move them outside the loop whenever possible
- Method
  - First identify the loop
  - Then perform reachability (reaching-definitions) analysis
  - Find the define-use links: for each variable used, find where it is defined
  - If all of x's definitions are outside the loop, then x is loop invariant
  - Identify the loop-invariant statements
  - Examine the movability of the statements
  - Repeat until no more code can be moved

Code Motion
- First identify the loop and the entry node
- Then compute the In and Out sets (d<k> is the definition made by statement k)

    B1:   1: read m
          2: i := 1
          3: n := 2
    B2:   4: if (i>m) goto 10
    B3:   5: k := m + 2
          6: j := 3 * k
          7: n := j + n
          8: i := i + 1
          9: goto 4
    B4:  10: print n

    In[B1]  = {}
    Out[B1] = {d1, d2, d3}
    In[B2]  = {d1, d2, d3, d5, d6, d7, d8}
    Out[B2] = {d1, d2, d3, d5, d6, d7, d8}
    In[B3]  = {d1, d2, d3, d5, d6, d7, d8}
    Out[B3] = {d1, d5, d6, d7, d8}
    In[B4]  = {d1, d2, d3, d5, d6, d7, d8}
    Out[B4] = {d1, d2, d3, d5, d6, d7, d8}
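The In and Out sets on this slide can be reproduced with a small iterative solver. The following is a minimal sketch, not the course's implementation: the block layout (B1 = statements 1-3, B2 = 4, B3 = 5-9, B4 = 10) is inferred from the sets shown, and the `GEN`, `KILL`, and `PRED` tables are written out by hand as assumptions.

```python
# Reaching definitions for the four-block example.
# Assumed layout: B1 = statements 1-3, B2 = 4, B3 = 5-9, B4 = 10.
# d<k> names the definition made by statement k.
GEN = {
    "B1": {"d1", "d2", "d3"},
    "B2": set(),
    "B3": {"d5", "d6", "d7", "d8"},
    "B4": set(),
}
# A block kills the other definitions of the variables it defines.
KILL = {
    "B1": {"d7", "d8"},   # i and n are redefined in B3
    "B2": set(),
    "B3": {"d2", "d3"},   # i and n are first defined in B1
    "B4": set(),
}
PRED = {"B1": [], "B2": ["B1", "B3"], "B3": ["B2"], "B4": ["B2"]}


def reaching_definitions(gen, kill, pred):
    """Iterate In/Out to a fixed point: Out = Gen | (In - Kill)."""
    in_sets = {b: set() for b in gen}
    out_sets = {b: set(gen[b]) for b in gen}
    changed = True
    while changed:
        changed = False
        for b in gen:
            new_in = set().union(*(out_sets[p] for p in pred[b]))
            new_out = gen[b] | (new_in - kill[b])
            if new_in != in_sets[b] or new_out != out_sets[b]:
                in_sets[b], out_sets[b] = new_in, new_out
                changed = True
    return in_sets, out_sets


IN, OUT = reaching_definitions(GEN, KILL, PRED)
```

Under these assumptions, `IN["B2"]` contains all seven definitions and `OUT["B3"]` drops d2 and d3, which agrees with the sets listed on the slide.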

Code Motion
- Now compute define-use links
  - For each use, find the corresponding definitions in the In set
  - If a variable has more than one possible source for its definition, then all links should be given

    B1:   1: read m
          2: i := 1
          3: n := 2
    B2:   4: if (i>m) goto 10
    B3:   5: k := m + 2
          6: j := 3 * k
          7: n := j + n
          8: i := i + 1
          9: goto 4
    B4:  10: print n

    In[B2]: {d1, d2, d3, d5, d6, d7, d8}
    In[B3]: {d1, d2, d3, d5, d6, d7, d8}

    m: d1
    i: d2, d8
    n: d3, d7
    k: d5
    j: d6

Code Motion
- Now find invariant code
  - Statements whose operands are not defined within the loop
  - The computation does not depend on any value computed in the loop

    B1:   1: read m
          2: i := 1
          3: n := 2
    B2:   4: if (i>m) goto 10
    B3:   5: k := m + 2
          6: j := 3 * k
          7: n := j + n
          8: i := i + 1
          9: goto 4
    B4:  10: print n

- Statement 5
  - m: defined outside the loop
  - 2: constant
  - => Is a loop invariant; move 5 out of the loop
  - No need to worry about the dependency on external definitions
- Statement 6
  - k: defined inside the loop
  - 3: constant
  - => Not a loop invariant

Code Motion
- Create a block before the entry block (if not done so yet)
- Move the invariant code into the new block
- If there are other back edges pointing to the same entry block
  - If the back edge belongs to an outer loop, then point it to the new block
  - May have to perform code motion for the other loops
- Move 5 out of the loop and redirect the dependency link

    B1:          1: read m
                 2: read n
                 3: i := 1
    New block:   5: k := m + 2      (moved out of the loop)
    B2:          4: if (i>m) goto 10
    B3:          6: j := 3 * k
                 7: n := j + n
                 8: i := i + 1
                 9: goto 4
    B4:         10: print n

Code Motion
- Continue to find new invariant code
  - Now, 6 has no operands that are defined within the loop => move 6 out of the loop
  - No more statements can be moved out. Done!

    B1:          1: read m
                 2: read n
                 3: i := 1
    New block:   5: k := m + 2
                 6: j := 3 * k      (moved out of the loop)
    B2:          4: if (i>m) goto 10
    B3:          7: n := j + n
                 8: i := i + 1
                 9: goto 4
    B4:         10: print n
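The repeated search for invariant statements can be sketched as a small fixed-point loop. This is an illustrative sketch only: the `LOOP` table and the `find_invariants` helper are hypothetical names, each variable is assumed to be defined at most once in the loop, and only the operand test is modeled (not the movability checks on dominance and other definitions).

```python
# Loop body of the example: statement number -> (target, operands).
# Integer operands are constants; string operands are variables.
LOOP = {
    5: ("k", ["m", 2]),
    6: ("j", [3, "k"]),
    7: ("n", ["j", "n"]),
    8: ("i", ["i", 1]),
}


def find_invariants(loop):
    """Return statement numbers in the order they become loop invariant."""
    invariant = []
    changed = True
    while changed:
        changed = False
        # Variables still defined by a statement that stays inside the loop.
        defined_inside = {
            target for stmt, (target, _) in loop.items() if stmt not in invariant
        }
        for stmt, (_, operands) in loop.items():
            if stmt in invariant:
                continue
            uses_loop_value = any(
                isinstance(op, str) and op in defined_inside for op in operands
            )
            if not uses_loop_value:
                invariant.append(stmt)
                changed = True
                break  # recompute defined_inside before looking further
    return invariant


MOVED = find_invariants(LOOP)
```

For this loop body, `MOVED` is `[5, 6]`: statement 5 qualifies first, and once k counts as defined outside the loop, statement 6 qualifies as well. Statements 7 and 8 each use a variable they also define, so they stay.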

Code Motion
- Problem
  - Moving code causes the execution order to change

    B1:  i := 1
    B2:  if u>v goto B4
    B3:  i := 2
         u := u + 1
    B4:  v := v - 1
         if v > 0 goto B2
    B5:  j := i
         …

- i := 2 is a loop invariant. Move it out of the loop?
  - What would i be? Both i := 1 and i := 2 reach B5
  - Moving i := 2 out would make i := 1 "dead"
- Move only if the block containing the statement to be moved out dominates all exit blocks

Code Motion
- Problem
  - Moving code causes the execution order to change

    B1:  i := 1
    B2:  i := 3
         if u>v goto B4
    B3:  i := 2
         u := u + 1
    B4:  v := v - 1
         if v > 0 goto B2
    B5:  j := i
         …

- Should i := 3 be moved out of the loop?
  - B2 does dominate the only exit block
  - What would i be? Moving i := 3 out would let i = 2 reach the end of B2
- Move only if there is no other definition of i (the variable being defined) in the loop

Code Motion
- Problem
  - Moving code causes the execution order to change

    B1:  i := 1
    B2:  k := i
    B3:  i := 2
         u := u + 1
    B4:  v := v - 1
         if v > 0 goto B2
    B5:  j := i
         …

- Should i := 2 be moved out of the loop?
  - It does dominate the exit
  - It is the only definition of i in the loop
  - What would i be? B2 uses i, and both i := 2 and i := 1 reach B2
- Move only if all the uses of i (the variable being defined) in the loop can be reached only by i := 2 (the statement to be moved out)

Code Motion
- Problem
  - Moving code causes the execution order to change
- Algorithm
  - Detect loop-invariant code
    - s: invariant statement, a candidate to be moved out
    - B: the block containing s
    - d: the definition produced by s
    - x: the variable defined in d
  - Check
    - B dominates all exits
    - No other definition of x in the loop
    - All uses of x in the loop are from d
  - Only if all three conditions are met, move the code
    - To the block immediately before the loop entry

Code Motion
- Check
  - B dominates all exit nodes
    - What is an exit node? Entry node? Node of the back edge?
    - Any node in the loop with an outgoing edge to a node that is not in the loop
    - Check the dominator tree for this condition
  - No other definition of x in the loop
    - Simply check all the definitions
  - All uses of x in the loop are from d
    - Check the In set of each block after the exit block
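The dominance check can be sketched with the classic iterative dominator computation. This is a minimal sketch under stated assumptions: the `SUCC` edges are a guess at the flow graph of the first problem example (B1 falls into B2, B2 branches to B3 or B4, B3 falls into B4, B4 loops back to B2 or exits to B5), and every node is assumed reachable from the entry.

```python
# Assumed flow graph of the first "Problem" example.
SUCC = {
    "B1": ["B2"],
    "B2": ["B3", "B4"],
    "B3": ["B4"],
    "B4": ["B2", "B5"],
    "B5": [],
}


def dominators(succ, entry):
    """Return dom[n] = set of nodes that dominate n (iterative algorithm)."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            # A node is dominated by itself and by whatever dominates
            # all of its predecessors.
            new = {n} | set.intersection(*(dom[p] for p in pred[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


DOM = dominators(SUCC, "B1")
LOOP_NODES = {"B2", "B3", "B4"}
# Exit nodes: loop nodes with an edge to a node outside the loop.
EXITS = {n for n in LOOP_NODES if any(t not in LOOP_NODES for t in SUCC[n])}
B3_DOMINATES_EXITS = all("B3" in DOM[e] for e in EXITS)
```

With this graph the only exit node is B4, and B3 is not among its dominators, so a statement in B3 such as i := 2 fails the first check.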

Induction Variables
- Finding basic induction variables in loop L
  - Scan the statements in L; for each variable, say x
    - Within L, x is defined only once
    - The definition of x is of the form x := x ± b, where b is a constant
      - Could also be x := b + x
  - Perform constant propagation and constant folding to allow better recognition of b

Induction Variables
- Finding other induction variables y, where y is defined as a linear function of x
  - Scan the statements in L and find all variables, say y
    - y is defined only once in L
    - The definition of y is equivalent to y := c * x + d
      - x is a basic induction variable
      - c and d are constants
  - Then: y is in the family of x, expressed as (x, c, d)
    - x is expressed as (x, 1, 0)
    - If y := 4 * x, then y is (x, 4, 0)

Strength Reduction for Induction Variables
- x is the basic IV
  - Definition of x is: x := x + b
- For each y in the family of x, expressed as (x, c, d)
  - Create a new variable sy outside the loop
    - Immediately before the loop entry
    - If not created yet
  - Initialize sy outside the loop
    - sy := c * x + d
  - Assignment to sy in the loop
    - Add sy := sy + c * b immediately after x's definition
  - Replacing the definition of y
    - Replaced by y := sy
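A quick way to convince yourself that the transformation preserves the values of y is to run both versions side by side. The sketch below is illustrative only; `original` and `reduced` are hypothetical helper names, and the loop is simplified to a fixed number of iterations.

```python
def original(x0, b, c, d, iterations):
    """Loop before strength reduction: y is recomputed with a multiply."""
    x, ys = x0, []
    for _ in range(iterations):
        x = x + b
        y = c * x + d
        ys.append(y)
    return ys


def reduced(x0, b, c, d, iterations):
    """Loop after strength reduction: sy tracks c * x + d with additions."""
    x, ys = x0, []
    sy = c * x + d      # initialized immediately before the loop entry
    step = c * b        # a constant, computed once
    for _ in range(iterations):
        x = x + b
        sy = sy + step  # placed immediately after x's definition
        y = sy          # replaces y := c * x + d
        ys.append(y)
    return ys
```

For example, with x starting at 0, b = 1, c = 4, and d = 0, both functions produce 4, 8, 12, 16, 20 over five iterations, and the reduced loop contains no multiplication.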

Strength Reduction for Induction Variables

    Original
      before the loop:  i := m - 1
                        t1 := 4 * n
                        v := a[t1]
      loop:             i := i + 1
                        t2 := 4 * i
                        t3 := a[t2]
                        if t3 < v …

    After strength reduction
      before the loop:  i := m - 1
                        t1 := 4 * n
                        v := a[t1]
                        s2 := 4 * i
      loop:             i := i + 1
                        s2 := s2 + 4
                        t2 := s2
                        t3 := a[t2]
                        if t3 < v …

- i is a basic IV
- t2 is an IV in i's family: (i, 4, 0)
- Add s2 and initialize s2
- Add: s2 := s2 + 4
- Replace: t2 := 4 * i by t2 := s2
- Why not just replace t2 := 4 * i by t2 := t2 + 4 and initialize t2 := 4 * i?

Strength Reduction for Induction Variables

    Original
      before the loop:  i := m - 1
                        t1 := 4 * n
                        v := a[t1]
                        t2 := t1 + v
      loop:             use t2
                        i := i + 1
                        t2 := 4 * i
                        t3 := a[t2]
                        if t3 < v …

    Incorrect: reuse t2 itself
      before the loop:  i := m - 1
                        t1 := 4 * n
                        v := a[t1]
                        t2 := t1 + v
                        t2 := 4 * i
      loop:             use t2
                        i := i + 1
                        t2 := t2 + 4
                        t3 := a[t2]
                        if t3 < v …

    Correct: use the new variable s2
      before the loop:  i := m - 1
                        t1 := 4 * n
                        v := a[t1]
                        t2 := t1 + v
                        s2 := 4 * i
      loop:             use t2
                        i := i + 1
                        s2 := s2 + 4
                        t2 := s2
                        t3 := a[t2]
                        if t3 < v …

- In the incorrect version, t2's use is now from an incorrect def
- Why can't we just replace t2 := 4 * i by s2 := s2 + 4 and t2 := s2 (on the same spot)?

Strength Reduction for Induction Variables

    Original
      before the loop:  i := m - 1
                        t1 := 4 * n
                        v := a[t1]
      loop:             t2 := 4 * i
                        use t2
                        i := i + 1
                        t3 := a[t2]
                        if t3 < v …

    Incorrect: both statements on t2's original spot
      before the loop:  i := m - 1
                        t1 := 4 * n
                        v := a[t1]
                        s2 := 4 * i
      loop:             s2 := s2 + 4
                        t2 := s2
                        use t2
                        i := i + 1
                        t3 := a[t2]
                        if t3 < v …

    Correct
      before the loop:  i := m - 1
                        t1 := 4 * n
                        v := a[t1]
                        s2 := 4 * i
      loop:             t2 := s2
                        use t2
                        i := i + 1
                        s2 := s2 + 4
                        t3 := a[t2]
                        if t3 < v …

- The update has to be done after i's def
- t2 := s2 has to be where t2's def is
- The incorrect version changes s2, and so t2, at an incorrect time
- In the correct version, t2 := s2 is in t2's original location and the def of s2 is right after i's def

Finding More Induction Variables
- Consider y being in the family of a basic IV x, and z defined as a linear function of y
  - Scan the statements in L and find all variables, say z
    - z is defined only once in L
    - The definition of z is equivalent to z := c' * y + d'
      - y is an induction variable, and y is in the family of x
      - c' and d' are constants
    - No assignment to x between the assignments to y and z
    - No assignment to y outside L reaches the use of z
  - Then: z is in the family of x, expressed as (x, c'*c, c'*d+d')
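The triple for z follows from substituting y = c * x + d into z = c' * y + d'. A small sketch of that composition, with a numeric spot check, is given here; `compose` and `evaluate` are hypothetical helper names, not part of the slides.

```python
def compose(family_y, c_prime, d_prime):
    """Family of z := c' * y + d', given y's family (x, c, d)."""
    x, c, d = family_y
    # z = c' * (c * x + d) + d' = (c' * c) * x + (c' * d + d')
    return (x, c_prime * c, c_prime * d + d_prime)


def evaluate(family, x_value):
    """Value of an induction variable for a given value of its basic IV."""
    _, c, d = family
    return c * x_value + d


Y = ("i", 4, 0)        # y := 4 * i
Z = compose(Y, 2, 3)   # z := 2 * y + 3
```

Here `Z` is `("i", 8, 3)`, so z can be maintained directly from i without going through y.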

Loop Unrolling
- Advantages
  - Saves loop tests
  - Facilitates the application of optimization techniques
  - Increases instruction-level parallelism

    Before unrolling:

    fac := 1;
    for (int i:=5; i>1; i--) {
        fac := (fac * i) % Max;
    }

    After unrolling:

    fac := 1;
    fac := (fac * 5) % Max;
    fac := (fac * 4) % Max;
    fac := (fac * 3) % Max;
    fac := (fac * 2) % Max;

- Applying constant propagation to the unrolled code can achieve great optimization
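The two versions can be compared directly. In this sketch the value `MAX = 10**6` is an assumption borrowed from the inlining example later in the deck, and the function names are hypothetical.

```python
MAX = 10**6  # assumed value; this slide does not define Max


def fac_loop():
    fac = 1
    for i in range(5, 1, -1):   # i = 5, 4, 3, 2
        fac = (fac * i) % MAX
    return fac


def fac_unrolled():
    # Same computation with the loop test and counter removed.
    fac = 1
    fac = (fac * 5) % MAX
    fac = (fac * 4) % MAX
    fac = (fac * 3) % MAX
    fac = (fac * 2) % MAX
    return fac
```

Both functions return 120. Because every operand in the unrolled version is a constant, constant propagation and folding could reduce it to a single assignment.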

Optimization for Function Calls
- Tail recursion elimination
  - A function is tail-recursive if the last stmt is a call to itself

    list_type last (lst: list_type) {
        if (lst.next == null) then return lst;
        else return last (lst.next);
    }

  - The tail-recursive call can be replaced with

    list_type last (lst: list_type) {
      start:
        if (lst.next == null) then return lst;
        else { lst := lst.next; go to start }
    }

Optimization for Function Calls
- Tail recursion elimination
  - Replacement rule
    - Formal param of the function := actual param in the tail-recursive call
    - Applies to multiple parameters, as long as they have one-to-one correspondence
    - Go back to the start of the function
  - Advantages of tail recursion elimination
    - The assignments and a jump were needed by the function call anyway
    - A function call requires additional copies and jumps
    - Saves stack space
      - Recursive call: requires linear stack usage
      - Converted code: only constant stack usage
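The replacement rule can be illustrated in Python, which does not eliminate tail calls itself, so the stack-space difference is directly observable. This is a sketch with hypothetical names (`Node`, `last_recursive`, `last_iterative`, `build`); the `while True` loop stands in for the `start` label and the jump back to it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    next: Optional["Node"] = None


def last_recursive(lst: Node) -> Node:
    """Tail-recursive form: one stack frame per list element."""
    if lst.next is None:
        return lst
    return last_recursive(lst.next)


def last_iterative(lst: Node) -> Node:
    """After tail recursion elimination: constant stack usage."""
    while True:                 # plays the role of the 'start' label
        if lst.next is None:
            return lst
        lst = lst.next          # formal param := actual param, then jump back


def build(n: int) -> Node:
    """Build the list 1 -> 2 -> ... -> n without recursion."""
    head = Node(1)
    tail = head
    for v in range(2, n + 1):
        tail.next = Node(v)
        tail = tail.next
    return head
```

On a short list both functions return the same node. On a list of 100000 elements the iterative version still works, while the recursive one would typically exceed CPython's default recursion limit.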

Optimization for Function Calls
- Function Inlining
  - Advantages
    - Enables additional optimizations
      - Because the actual parameters are revealed, many optimizations may become possible
    - Saves the copying and jump required by a function call
  - Disadvantages
    - Code size increases

Function Inlining - Return and Exit
- Simple case
  - Replace the return stmt by an assignment of the returned value to the target variable
  - Assumes that there is a returned value and that it is assigned to a target variable in the caller
- Some issues
  - The function may have multiple exits
    - Use a label to serve as a single exit point
  - The function may not assign return values on all paths
    - Have to be faithful to the original code
    - Create a temporary variable to keep the returned result

Function Inlining - Return and Exit
- General rules
  - Replace return statements by
    - Assigning the returned value to the temporary variable (if there is a returned value)
    - Followed by a goto to the exit label
  - Assign the temporary variable to the target variable at the exit label
- Optimizations
  - If the function has a single return statement and the caller stmt is an assignment stmt
    - Use the simple case rules
  - If the function has no return statement
    - Simply use the exit label
  - Apply all data flow analysis and optimization techniques

Function Inlining - Parameter Passing
- Simple case
  - Replace formal parameters by actual parameters
- If an actual parameter is an expression, evaluate it once
  - Assign the evaluation result to a temporary variable
  - Replace the formal parameter with the temporary variable
- Actual parameter is A[exp]
  - Mix of the above two cases
- Problems
  - Array size mismatch, subtyping, etc.
- Global variables and static variables
  - Recognize global variables and preserve their scope
  - Rename for matching names
  - Retain the static status of static variables

Function Inlining

    Before inlining:

    int Max := 10**6;
    …
    int factorial (int x) {
        int fac := 1;
        for (int i:=x; i>1; i--) {
            fac := (fac * i) % Max;
        }
        return fac;
    }
    …
    int val1 := factorial(10);
    …
    int val2 := factorial(5);
    …

    After inlining:

    int Max := 10**6;
    …
    int val1;
    {
        int fac := 1;
        for (int i:=10; i>1; i--) {
            fac := (fac * i) % Max;
        }
        val1 := fac;
    }
    …
    int val2;
    {
        int fac := 1;
        for (int i:=5; i>1; i--) {
            fac := (fac * i) % Max;
        }
        val2 := fac;
    }
    …

- The function has a single return point
- Max is a global variable: preserve its scope

Function Inlining

    Before inlining:

    int Max := 10**6;
    int overflow := -2;
    …
    int factorial (int x) {
        if (x >= 1) {
            int fac := 1;
            for (int i:=x; i>1; i--) {
                fac := (fac * i);
                if (fac > Max) return overflow;
            }
            return fac;
        }
    }
    …
    int val := factorial (x*y+2);
    …

- Multiple return points
- Undefined return path for x < 1
- The actual parameter is an expression

    After inlining:

    int Max := 10**6;
    int overflow := -2;
    …
    int t1 := x*y+2;
    int ret;
    {
        if (t1 >= 1) {
            int fac := 1;
            for (int i:=t1; i>1; i--) {
                fac := (fac * i);
                if (fac > Max) { ret := overflow; goto L1; }
            }
            { ret := fac; goto L1; }
        }
    }
    L1: int val := ret
    …

- t1: define a temp var and evaluate the expression once
- ret: define a temp var for the returned value
- L1: define the exit label
- Same as the original code: ret is undefined on some paths

A Headache in Data Flow Analysis - Alias
- Pointer aliases
  - E.g., *p := y may write to any memory location
  - E.g., x := *p may read from any memory location
  - If p is dynamically computed
- Problems
  - Example: Live variable analysis
    - …, x := *p => all variables in all prior blocks may be alive
  - Example: Constant folding
    - a := 1; b := 2; *p := 0; c := a + b;
    - c := 3 only if *p is not an alias for a or b!
- All data flow analysis techniques can be messed up by aliases
- Need to perform alias analysis when possible

Alias Analysis
- Ptr(v)
  - For each variable v that may hold an address, compute Ptr(v)
  - Ptr(v) includes all variables v may point to
  - Ptr(v) may include variables allocated on the stack and in the heap
- Can be represented as a graph G, G ∈ 2^(V × V)
  - V: the set of all variables in the program
  - E.g., Ptr(x) = {y}, Ptr(y) = {u, v}
- Andersen's graphical representation
  - [Figure: points-to graphs drawn over the nodes x, y, u, v and x, y, u, v, a, b]

Alias Flow Analysis
- Data flow analysis
  - For p := &q
    - Add (p,q) to Gen[B]
    - Add (p,x) to Kill[B], for all x in V (the entire set of variables), besides the ones in Gen[B]
  - For p := q
    - Add (p,x) to Gen[B], for all x in Ptr(q)
    - Add (p,x) to Kill[B], for all x in V (the entire set of variables), besides the ones in Gen[B]
  - For p := *q
    - Add (p,x) to Gen[B], for all x in Ptr(Ptr(q))
    - Add (p,x) to Kill[B], for all x in V (the entire set of variables)
  - [Figure: points-to diagrams of p and q for each of the three cases]

Alias Flow Analysis
- Data flow analysis
  - For *p := q
    - Add (x,y) to Gen[B], for all x in Ptr(p) and all y in Ptr(q)
    - Add (x,y) to Kill[B], for all x in Ptr(p) and all y in V, besides those in Gen[B]
  - For *p := *q
    - Add (x,y) to Gen[B], for all x in Ptr(p) and all y in Ptr(Ptr(q))
    - Add (x,y) to Kill[B], for all x in Ptr(p) and all y in V, besides those in Gen[B]
  - When you have *…* or &…&
    - Resolve layer by layer
  - [Figure: points-to diagrams of p and q for the two cases]

Alias Flow Analysis
- Data flow analysis
  - Out[B] = Gen[B] ∪ (In[B] - Kill[B])
    - If (x,y) is in the In set and also in the Kill set, then it is killed
  - In[B] = ∪ Out[P], over all predecessors P of B
    - As long as a pointer pair (x,y) reaches B, the pointer x could point to y
- Assumption
  - No pointer arithmetic
  - If there is, then offsets need to be used, instead of actual variables, for data flow analysis

Alias Flow Analysis
- Example

    x := &a
    y := &b
    c := &i
        Out:    (x,a), (y,b), (c,i)
    if (i)
        x := y
        In:     (x,a), (y,b), (c,i)
        Killed: (x,a)
        Out:    (x,b), (y,b), (c,i)
    *x := c
        In:     (x,a), (x,b), (y,b), (c,i)
        Killed: (a,*), (b,*), of which none is in the In set
        Out:    (x,a), (x,b), (y,b), (c,i), (a,i), (b,i)

- x points to a and b
- a and b may now point to whatever c points to
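The example can be replayed with the Gen/Kill rules for p := &q, p := q, and *p := q. The following is a minimal sketch, not a full analysis: points-to facts are plain (pointer, target) pairs, Ptr(v) is read off the current In set, Kill is restricted to pairs actually present in In (which gives the same Out), and all function names are hypothetical.

```python
def ptr(facts, v):
    """Ptr(v): everything v may point to according to the current facts."""
    return {target for (source, target) in facts if source == v}


def transfer(in_set, gen, kill):
    """Out = Gen | (In - Kill)."""
    return gen | (in_set - kill)


def assign_addr(in_set, p, q):
    """p := &q"""
    gen = {(p, q)}
    kill = {(s, t) for (s, t) in in_set if s == p} - gen
    return transfer(in_set, gen, kill)


def assign_copy(in_set, p, q):
    """p := q"""
    gen = {(p, x) for x in ptr(in_set, q)}
    kill = {(s, t) for (s, t) in in_set if s == p} - gen
    return transfer(in_set, gen, kill)


def store(in_set, p, q):
    """*p := q"""
    targets = ptr(in_set, p)
    gen = {(x, y) for x in targets for y in ptr(in_set, q)}
    kill = {(s, t) for (s, t) in in_set if s in targets} - gen
    return transfer(in_set, gen, kill)


# x := &a; y := &b; c := &i
entry = assign_addr(assign_addr(assign_addr(set(), "x", "a"), "y", "b"), "c", "i")
# if (i) x := y
branch = assign_copy(entry, "x", "y")
# Merge point: In = union of the Out sets of both predecessors.
merged = entry | branch
# *x := c
final = store(merged, "x", "c")
```

Under these assumptions `final` contains (x,a), (x,b), (y,b), (c,i), (a,i), and (b,i), matching the last Out set of the example.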

Data Flow Analysis with Alias
- Apply the reachability information to perform analysis
- E.g.
  - a := 1; b := 2; *p := 0; c := a + b;
  - *p := 0 defines x := 0, for all (p,x) that reach *p := 0
  - If (p,a) or (p,b) reaches *p := 0, then c is no longer a constant
  - Even if only (p,a) reaches *p := 0, that does not imply p = &a; p could also be undefined
- Use the reachable Ptr set to help with data flow analysis

Data Flow Analysis with Alias
- Define-use
  - For c := *p op b
    - Only p is of pointer type
    - All x in Ptr(p) are added to the use set
      - Or: all x such that (p,x) reaches here are added to the use set
      - In the example, for c := *x op b, a and b are added to the use set
    - p is also added to the use set (need to use p to get *p)
    - Of course, b is in the use set and c is in the def set
  - For *p := b op c
    - Only p is of pointer type
    - All x in Ptr(p) are added to the def set
      - Or: all x such that (p,x) reaches here are added to the def set
      - In the example, a := b op c and b := b op c are added to the def set
    - p is also in the use set (need to use p to get *p)
    - Of course, b and c are in the use set

Data Flow Analysis with Alias
- Define-use
  - For p := …
    - Only p is defined
  - For p := …q…
    - Only p is defined
    - Only q is used (along with the other variables that appear on the right side), not the objects q points to

Code Optimization: Summary
- Read Chapter 9
  - Sections 9.2, 9.4, 9.5, 9.6
- Optimization of intermediate code
  - Overview of optimization techniques
  - Construct CFG and loop recognition
  - Data flow analysis based on CFG
  - Constant propagation and constant folding
  - Copy propagation
  - Code motion
  - Induction variable identification and strength reduction
  - Dead code elimination

Code Optimization: Summary
- Optimization of intermediate code
  - Loop optimization (9.1)
  - Function call optimization (2.5.4)
  - Alias analysis (12.4)