CPSC 388 – Compiler Design and Construction: Optimization

Optimization Goal
 Produce better code: fewer instructions, faster execution
 Do not change the behavior of the program!

Optimization Techniques
 Peephole optimization: done after code generation; makes small local changes to the assembly
 Moving loop invariants: done before code generation; finds computations in loops that can be moved outside
 Strength reduction in for loops: done before code generation; replaces multiplications with additions
 Copy propagation: done before code generation; replaces a use of a variable with a literal or another variable

Peephole Optimization
 Look through a small window at the assembly code for common cases that can be improved:
1. Redundant load
2. Redundant push/pop
3. Replace a jump to a jump
4. Remove a jump to the next instruction
5. Replace a jump around a jump
6. Remove useless operations
7. Reduction in strength
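The cases above can be sketched as pattern matches over a sliding window of instructions. The following Python sketch is illustrative only: the tuple instruction format and the two rules shown (redundant load, redundant push/pop) are invented for the example, not the course's actual implementation.

```python
# Minimal peephole optimizer sketch over a hypothetical instruction format:
# ("store", "Rx", "M"), ("load", "M", "Rx"), ("push", "Rx"), ("pop", "Rx"), ...
def peephole(instrs):
    out = []
    i = 0
    while i < len(instrs):
        a = instrs[i]
        b = instrs[i + 1] if i + 1 < len(instrs) else None
        # Redundant load: store Rx, M ; load M, Rx  ->  store Rx, M
        if b and a[0] == "store" and b[0] == "load" and a[1] == b[2] and a[2] == b[1]:
            out.append(a)
            i += 2
        # Redundant push/pop: push Rx ; pop Rx  ->  (nothing)
        elif b and a[0] == "push" and b[0] == "pop" and a[1] == b[1]:
            i += 2
        else:
            out.append(a)
            i += 1
    return out

code = [("store", "R1", "M"), ("load", "M", "R1"),
        ("push", "R2"), ("pop", "R2"),
        ("add", "R1", "R2")]
print(peephole(code))  # [('store', 'R1', 'M'), ('add', 'R1', 'R2')]
```

A real peephole pass would carry many more rules, but each one has this same shape: match a short instruction sequence, emit a shorter equivalent.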

Redundant Load
 Before:
store Rx, M
load M, Rx
 After:
store Rx, M

Redundant Push/Pop
 Before:
push Rx
pop Rx
 After:
… nothing …

Replace a Jump to a Jump
 Before:
goto L1
…
L1: goto L2
 After:
goto L2
…
L1: goto L2

Remove a Jump to the Next Instruction
 Before:
goto L1
L1: …
 After:
L1: …

Replace a Jump around a Jump
 Before:
if T0 = 0 goto L1
goto L2
L1: …
 After:
if T0 != 0 goto L2
L1: …

Remove Useless Operations
 Before:
add T0, T0, 0
mul T0, T0, 1
 After:
… nothing …

Reduction in Strength
 Before:
mul T0, T0, 2
add T0, T0, 1
 After:
shift-left T0
inc T0

One Optimization May Lead to Another
load Tx, M
add Tx, 0
store Tx, M
 After one optimization:
load Tx, M
store Tx, M
 After another optimization:
load Tx, M
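Because one rewrite can expose another, a peephole driver typically repeats its passes until nothing changes. A minimal sketch of that fixed-point loop, using the same hypothetical tuple format as before (pass names and rules invented for the example):

```python
# Removing a useless "add Tx, 0" exposes a redundant load/store pair,
# which a second pass can then eliminate (hypothetical instruction tuples).
def remove_useless(instrs):
    # drop "add X, 0" (adding zero changes nothing)
    return [ins for ins in instrs if not (ins[0] == "add" and ins[2] == 0)]

def remove_redundant_store(instrs):
    out, i = [], 0
    while i < len(instrs):
        a = instrs[i]
        b = instrs[i + 1] if i + 1 < len(instrs) else None
        # load Tx, M ; store Tx, M  ->  load Tx, M
        if b and a[0] == "load" and b[0] == "store" and a[1] == b[1] and a[2] == b[2]:
            out.append(a)
            i += 2
        else:
            out.append(a)
            i += 1
    return out

def optimize(instrs):
    # iterate to a fixed point: stop when a full round changes nothing
    while True:
        new = remove_redundant_store(remove_useless(instrs))
        if new == instrs:
            return new
        instrs = new

code = [("load", "Tx", "M"), ("add", "Tx", 0), ("store", "Tx", "M")]
print(optimize(code))  # [('load', 'Tx', 'M')]
```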

You Try It
 The code generated from this program contains opportunities for the first two kinds of peephole optimization (redundant load, jump to a jump). Can you explain how, just by looking at the source code?

public class Opt {
    public static void main() {
        int a;
        int b;
        if (true) {
            b = 0;
        } else {
            b = 1;
        }
        a = 1;
        b = a;
        return;
    }
}

Moving Loop-Invariant Computations Out of the Loop
 For the greatest gain, optimize "hot spots", i.e., inner loops.
 An expression is loop invariant if the same value is computed on every iteration of the loop.
 Compute the value once outside the loop and reuse it inside the loop.

Example

for (int i=0; i<100; i++) {
    for (int j=0; j<100; j++) {
        for (int k=0; k<100; k++) {
            A[i][j][k] = i*j*k;
        }
    }
}

Example

for (int i=0; i<100; i++) {
    for (int j=0; j<100; j++) {
        for (int k=0; k<100; k++) {
            T0 = i*j*k;
            T1 = FP + offsetA - i*40000 - j*400 - k*4;
            store T0, 0(T1)
        }
    }
}

 FP + offsetA is invariant to the i loop; i*40000 is invariant to the j loop; j*400 and i*j are invariant to the k loop.

Example

tmp0 = FP + offsetA;
for (int i=0; i<100; i++) {
    tmp1 = tmp0 - i*40000;
    for (int j=0; j<100; j++) {
        tmp2 = tmp1 - j*400;
        tmp3 = i*j;
        for (int k=0; k<100; k++) {
            T0 = tmp3*k;
            T1 = tmp2 - k*4;
            store T0, 0(T1)
        }
    }
}

Comparison of the innermost loop before and after (executed 1 million times):

Original code:
 5 multiplications (3 for the lvalue, 2 for the rvalue)
 3 subtractions (for the lvalue)
 1 indexed store

New code:
 2 multiplications (1 for the lvalue, 1 for the rvalue)
 1 subtraction (for the lvalue)
 1 indexed store
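The transformation is only valid if the hoisted loop nest computes exactly the same values as the original. A quick Python check (scaled down from 100×100×100 to 4×4×4 for speed; the array size is the only change) can confirm this:

```python
N = 4  # scaled-down from the slides' 100 for a quick check

# Original: recompute i*j*k from scratch in the innermost loop.
original = [[[i * j * k for k in range(N)] for j in range(N)] for i in range(N)]

# Hoisted: move the computation that doesn't depend on k out of the k loop.
hoisted = [[[0] * N for _ in range(N)] for _ in range(N)]
for i in range(N):
    for j in range(N):
        tmp3 = i * j              # invariant with respect to the k loop
        for k in range(N):
            hoisted[i][j][k] = tmp3 * k

assert hoisted == original
print("hoisted version matches")
```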

Questions
 How do you recognize loop-invariant expressions?
 When and where do we move the computations of those expressions?

Recognizing Loop Invariants
An expression is invariant with respect to a loop if, for every operand, one of the following holds:
 It is a literal, or
 It is a variable that gets its value only from outside the loop.

When and Where to Move Invariant Expressions
 Must consider the safety of the move
 Must consider the profitability of the move

Safety of Moving Invariants
 Evaluating the expression might cause an error when the loop might not execute at all:

b = a;
while (a != 0) {
    x = 1/b;    // possible divide-by-zero if moved
    a--;
}
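The hazard in the slide's loop can be demonstrated directly. In the sketch below (function names invented for the example), `1/b` is loop invariant, but hoisting it makes the program fail on an input where the original runs fine:

```python
# If the loop body never runs (a == 0), hoisting 1/b out of the loop
# introduces a division by zero the original program never performs.
def original(a):
    b = a
    results = []
    while a != 0:
        results.append(1 / b)   # only evaluated when the loop body runs
        a -= 1
    return results

def unsafely_hoisted(a):
    b = a
    inv = 1 / b                 # hoisted: evaluated even if the loop never runs
    results = []
    while a != 0:
        results.append(inv)
        a -= 1
    return results

print(original(0))              # [] -- no error
try:
    unsafely_hoisted(0)
except ZeroDivisionError:
    print("hoisted version crashes when a == 0")
```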

Safety of Moving Invariants
 What about preserving the order of events? If the unoptimized code performed output and then had a runtime error, is it valid for the optimized code to simply have the runtime error?
 Changing the order of computations may change the result for floating-point computations, due to differing precisions.
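The floating-point point is easy to verify: reassociating an addition chain, which a code-motion transformation can implicitly do, changes the rounded result:

```python
# Floating-point addition is not associative: the grouping changes
# the rounding, so the two results differ.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)   # False
print(left, right)     # 0.6000000000000001 0.6
```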

Profitability of Moving Invariants
If the computation might NOT execute in the original program, then moving the computation might actually slow down the program!

Moving Is Safe and Profitable If
 The loop will execute at least once
 The code will execute if the loop does: it isn't inside any condition, and it is on all paths through the loop (both the if and else portions)
 The expression is in a non-short-circuited part of the loop test, e.g. while (x < i+j*100)

You Try It
 What are some examples of loops for which the compiler can be sure that the loop will execute at least once?

Strength Reduction
 Concentrate on "hot spots"
 Replace expensive operations (*) with cheaper ones (+)

Example: Strength Reduction

for i from low to high do
    … i*k1 + k2 …

where i is the loop index, and k1 and k2 are constant with respect to the loop. Consider the sequence of values taken by i and by the expression.

Example: Strength Reduction

Iteration #   i          i*k1 + k2
1             low        low*k1 + k2
2             low+1      (low+1)*k1 + k2   = low*k1 + k2 + k1
3             low+1+1    (low+1+1)*k1 + k2 = low*k1 + k2 + k1 + k1

Example: Strength Reduction
 Compute low*k1 + k2 once before the loop
 Store the value in a temporary
 Use the temporary instead of the expression inside the loop
 Increment the temporary by k1 at the end of the loop

Example: Strength Reduction

temp = low*k1 + k2
for i from low to high do
    … temp …
    temp = temp + k1
end
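The four steps above can be checked directly in Python: the running temporary produces the same sequence of values as recomputing i*k1 + k2 each iteration (the constants low, high, k1, k2 below are chosen arbitrarily):

```python
# Strength reduction: replace i*k1 + k2 inside the loop with a running
# temporary that is incremented by k1 each iteration.
low, high, k1, k2 = 3, 10, 7, 5

# Reference: multiply on every iteration.
with_multiply = [i * k1 + k2 for i in range(low, high + 1)]

# Reduced: one multiply before the loop, only additions inside it.
reduced = []
temp = low * k1 + k2          # computed once before the loop
for i in range(low, high + 1):
    reduced.append(temp)      # use the temporary instead of i*k1 + k2
    temp += k1                # increment at the end of the iteration

assert reduced == with_multiply
print("strength-reduced loop matches")
```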

Another Example

tmp0 = FP + offsetA
for (i=0; i<100; i++) {
    tmp1 = tmp0 - i*40000        // i * -40000 + tmp0
    for (j=0; j<100; j++) {
        tmp2 = tmp1 - j*400      // j * -400 + tmp1
        tmp3 = i*j               // j * i + 0
        for (k=0; k<100; k++) {
            T0 = tmp3 * k        // k * tmp3 + 0
            T1 = tmp2 - k*4      // k * -4 + tmp2
            store T0, 0(T1)
        }
    }
}

Now perform strength reduction.

tmp0 = FP + offsetA
temp1 = tmp0                     // temp1 = 0*-40000 + tmp0
for (i=0; i<100; i++) {
    tmp1 = temp1
    temp2 = tmp1                 // temp2 = 0*-400 + tmp1
    temp3 = 0                    // temp3 = 0*i + 0
    for (j=0; j<100; j++) {
        tmp2 = temp2
        tmp3 = temp3
        temp4 = 0                // temp4 = 0*tmp3 + 0
        temp5 = tmp2             // temp5 = 0*-4 + tmp2
        for (k=0; k<100; k++) {
            T0 = temp4
            T1 = temp5
            store T0, 0(T1)
            temp4 = temp4 + tmp3
            temp5 = temp5 - 4
        }
        temp2 = temp2 - 400
        temp3 = temp3 + i
    }
    temp1 = temp1 - 40000
}

You Try It
 Suppose that the index variable is incremented by something other than one each time around the loop. For example, consider a loop of the form:

for (i=low; i<=high; i+=2) ...

 Can strength reduction still be performed? If yes, what changes must be made to the proposed algorithm?

Copy Propagation
 Statements of the form x = y (call such a statement d) are copy statements. For a use u of variable x reached by copy statement d such that:
No other definition of x reaches u, and
y cannot change between d and u,
 you can replace the use of x at u with a use of y.
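For straight-line code the rule above can be sketched in a few lines. This is a simplified illustration (hypothetical three-address tuples, no branches or dataflow analysis), so the "no other definition reaches u" condition reduces to killing a copy whenever either of its variables is reassigned:

```python
# Copy propagation over straight-line code: after "x = y", later uses of x
# become uses of y, as long as neither x nor y is reassigned in between.
def propagate_copies(instrs):
    copies = {}   # x -> y for currently live copy statements
    out = []
    for op, dest, *srcs in instrs:
        srcs = [copies.get(s, s) for s in srcs]     # replace propagated uses
        # any assignment to a variable kills copies involving it
        copies = {x: y for x, y in copies.items() if dest not in (x, y)}
        if op == "copy":
            copies[dest] = srcs[0]
        out.append((op, dest, *srcs))
    return out

code = [("copy", "x", "y"), ("add", "a", "x", "z")]
print(propagate_copies(code))  # [('copy', 'x', 'y'), ('add', 'a', 'y', 'z')]
```

Across basic blocks, the same idea requires the reaching-definitions analysis discussed in the data-flow material, not just this linear scan.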

Examples of Copy Propagation

x = y
a = x+z          // Yes: x can be replaced with y

x = y
if (…) x = 2
a = x+z          // No: another definition of x may reach the use

x = y
if (…) y = 3
a = x+z          // No: y may change between the copy and the use

Question
 Why is this a useful transformation?
 If ALL uses of x reached by definition d are replaced, then definition d is useless and can be removed.

tmp0 = FP + offsetA
temp1 = tmp0                     // cannot be propagated
for (i=0; i<100; i++) {
    tmp1 = temp1
    temp2 = tmp1                 // cannot be propagated
    temp3 = 0                    // cannot be propagated
    for (j=0; j<100; j++) {
        tmp2 = temp2
        tmp3 = temp3
        temp4 = 0                // cannot be propagated
        temp5 = tmp2             // cannot be propagated
        for (k=0; k<100; k++) {
            T0 = temp4
            T1 = temp5
            store T0, 0(T1)
            temp4 = temp4 + tmp3
            temp5 = temp5 - 4
        }
        temp2 = temp2 - 400
        temp3 = temp3 + i
    }
    temp1 = temp1 - 40000
}

tmp0 = FP + offsetA
temp1 = tmp0
for (i=0; i<100; i++) {
    temp2 = temp1
    temp3 = 0
    for (j=0; j<100; j++) {
        temp4 = 0
        temp5 = temp2
        for (k=0; k<100; k++) {
            store temp4, 0(temp5)
            temp4 = temp4 + temp3
            temp5 = temp5 - 4
        }
        temp2 = temp2 - 400
        temp3 = temp3 + i
    }
    temp1 = temp1 - 40000
}
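As an end-to-end check, the fully optimized loop nest must store the same values at the same addresses as the original naive version. The Python sketch below simulates both (scaled down to 4×4×4, with a dictionary standing in for memory and 0 standing in for FP + offsetA); the strides follow the slides' 4-byte-int layout:

```python
# End-to-end check: the strength-reduced, copy-propagated loop nest must
# produce the same address -> value mapping as the naive version.
N = 4
STRIDE_I, STRIDE_J, STRIDE_K = N * N * 4, N * 4, 4
tmp0 = 0  # stands in for FP + offsetA

naive = {}
for i in range(N):
    for j in range(N):
        for k in range(N):
            addr = tmp0 - i * STRIDE_I - j * STRIDE_J - k * STRIDE_K
            naive[addr] = i * j * k

optimized = {}
temp1 = tmp0
for i in range(N):
    temp2 = temp1
    temp3 = 0                    # will hold i*j
    for j in range(N):
        temp4 = 0                # will hold i*j*k
        temp5 = temp2            # will hold the store address
        for k in range(N):
            optimized[temp5] = temp4
            temp4 += temp3
            temp5 -= STRIDE_K
        temp2 -= STRIDE_J
        temp3 += i
    temp1 -= STRIDE_I

assert optimized == naive
print("optimized loop nest matches")
```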

Comparison Before and After

Before:
 5 multiplications, 3 additions/subtractions, 1 indexed store in the innermost loop

After:
 2 additions/subtractions in the innermost loop
 2 additions/subtractions, 2 copy statements in the middle loop
 1 addition/subtraction, 1 copy in the outer loop