9. Optimization Marcus Denker

Roadmap
- Introduction
- Optimizations in the Back-end
- The Optimizer
- SSA Optimizations
- Advanced Optimizations
© Marcus Denker

Optimization: The Idea
Transform the program to improve efficiency.
- Performance: faster execution
- Size: smaller executable, smaller memory footprint
Tradeoffs: 1) performance vs. size, 2) compilation speed and memory use.
© Marcus Denker

No Magic Bullet!
There is no perfect optimizer.
Example: optimize for simplicity.
- Opt(P): the smallest program equivalent to P
- Q: a program that produces no output and does not stop
What is Opt(Q)? The smallest program with that behaviour is:
  L1: goto L1
But to rewrite Q to 'L1: goto L1', the optimizer would have to prove that Q never produces output and never halts: the halting problem!
© Marcus Denker

Another way to look at it...
Rice (1953): for every compiler there is a modified compiler that generates shorter code.
Proof: Assume there is a compiler U that generates the shortest optimized program Opt(P) for every P. Take P to be a program that does not stop and has no output; then Opt(P) must be 'L1: goto L1'. Producing that answer requires solving the halting problem, so U does not exist.
There will always be a better optimizer! Job guarantee for compiler architects :-)
© Marcus Denker

Optimization on Many Levels
Optimizations happen both in the optimizer and in the back-end.
© Marcus Denker

Roadmap
- Introduction
- Optimizations in the Back-end
- The Optimizer
- SSA Optimizations
- Advanced Optimizations
© Marcus Denker

Optimizations in the Back-end
- Register Allocation
- Instruction Selection
- Peephole Optimization
© Marcus Denker

Register Allocation
The processor has only a finite number of registers, and the register set can be very small (x86).
- Temporary variables: non-overlapping temporaries can share one register
- Arguments can be passed via registers
Optimizing register allocation is very important for good performance, especially on x86.
© Marcus Denker
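
A minimal sketch (not from the slides) of the classic approach, register allocation as graph coloring: temporaries are nodes, an edge means two temporaries are live at the same time, and registers are the colors. The greedy strategy, the variable names, and the register set are all made up for illustration.

  # Greedy coloring of an interference graph: an edge means two
  # temporaries are live at the same time and cannot share a register.
  def allocate(interference, registers):
      allocation = {}
      # handle the most constrained (highest-degree) temporaries first
      for temp in sorted(interference, key=lambda t: -len(interference[t])):
          taken = {allocation[n] for n in interference[temp] if n in allocation}
          free = [r for r in registers if r not in taken]
          allocation[temp] = free[0] if free else 'spill'  # no register left
      return allocation

  # t1 interferes with t2 and t3; t2 and t3 never overlap,
  # so they can share a register.
  graph = {'t1': {'t2', 't3'}, 't2': {'t1'}, 't3': {'t1'}}
  print(allocate(graph, ['eax', 'ebx']))
  # {'t1': 'eax', 't2': 'ebx', 't3': 'ebx'}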

Instruction Selection
For every expression there are many ways to realize it on a given processor.
Example: multiplication by 2 can be done with a bit shift.
Instruction selection is itself a form of optimization.
© Marcus Denker

Peephole Optimization
A simple local optimization: look at the code 'through a hole' and replace known instruction sequences with shorter ones, using a pre-computed table.
Examples:
  store R,a; load a,R  =>  store R,a
  imul 2,R             =>  ashl 1,R    (ashl = arithmetic shift left)
A peephole typically considers 2-3 instructions, which is good for simple compilers; for longer instruction sequences, graph matching is used. Peephole optimization is important when using simple instruction selection!
© Marcus Denker
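
A minimal sketch of such a peephole pass over a list of textual instructions, using a pre-computed replacement table; the toy instruction syntax follows the slide and is not a real assembler.

  # Slide a small window over the code and replace known
  # sequences with shorter equivalents from a pre-computed table.
  PATTERNS = [
      # a load right after a store to the same address is redundant
      (['store R,a', 'load a,R'], ['store R,a']),
      # strength reduction: multiply by 2 becomes shift left by 1
      (['imul 2,R'], ['ashl 1,R']),
  ]

  def peephole(code):
      out, i = [], 0
      while i < len(code):
          for pattern, replacement in PATTERNS:
              if code[i:i + len(pattern)] == pattern:
                  out.extend(replacement)
                  i += len(pattern)
                  break
          else:
              out.append(code[i])  # no pattern matched at this position
              i += 1
      return out

  print(peephole(['store R,a', 'load a,R', 'imul 2,R']))
  # ['store R,a', 'ashl 1,R']

In practice the pass is rerun until no pattern matches any more, since one replacement can expose another.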

Optimization on Many Levels
The major work of optimization is done in a special phase, the optimizer: the focus of this lecture.
© Marcus Denker

Different Levels of IR
Different levels of IR suit different optimizations.
Example: lowering array access to direct memory manipulation generates many integer expressions that are simple to optimize.
We focus on high-level optimizations.
© Marcus Denker

Roadmap
- Introduction
- Optimizations in the Back-end
- The Optimizer
- SSA Optimizations
- Advanced Optimizations
© Marcus Denker

Examples for Optimizations
- Constant Folding / Propagation
- Copy Propagation
- Algebraic Simplifications
- Strength Reduction
- Dead Code Elimination
- Structure Simplifications
- Loop Optimizations
- Partial Redundancy Elimination
- Code Inlining
© Marcus Denker

Constant Folding
Evaluate constant expressions at compile time. Only possible when side-effect freeness is guaranteed.
  c := 1 + 3  =>  c := 4
  not false   =>  true
A form of partial evaluation. Some of this can be done early, while generating IR from the AST.
Caveat: floats! The result could differ between machines.
© Marcus Denker
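
A minimal sketch of constant folding over a tiny expression tree; the tuple-based AST is made up for illustration.

  # Nodes are ('const', value), ('var', name), or (op, left, right).
  import operator

  OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

  def fold(node):
      if node[0] in OPS:
          left, right = fold(node[1]), fold(node[2])
          if left[0] == 'const' and right[0] == 'const':
              # both operands known, operation side-effect free:
              # evaluate at compile time
              return ('const', OPS[node[0]](left[1], right[1]))
          return (node[0], left, right)
      return node

  print(fold(('+', ('const', 1), ('const', 3))))  # ('const', 4)
  # x + (2 * 3) folds only the constant subtree:
  print(fold(('+', ('var', 'x'), ('*', ('const', 2), ('const', 3)))))
  # ('+', ('var', 'x'), ('const', 6))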

Constant Propagation
Variables that have a constant value, e.g. c := 3: later uses of c can be replaced by the constant, if there is no change of c in between!
  b := 3          b := 3
  c := 1 + b  =>  c := 1 + 3
  d := b + c      d := 3 + c
Analysis is needed, as b can be assigned more than once! Later we will see that SSA is ideal for this analysis.
© Marcus Denker

Copy Propagation
For a statement x := y, replace later uses of x with y, if x and y have not been changed in between.
  x := y          x := y
  c := 1 + x  =>  c := 1 + y
  d := x + c      d := y + c
Analysis is needed, as x and y can be assigned more than once! Again we will use SSA.
© Marcus Denker

Algebraic Simplifications
Use algebraic properties to simplify expressions:
  -(-i)      =>  i
  b or true  =>  true
Important for simplifying code for later optimizations.
© Marcus Denker

Strength Reduction
Replace expensive operations with simpler ones.
Example: multiplications replaced by additions.
  y := x * 2  =>  y := x + x
Actually, here a bit shift would be even better. Peephole optimizations are often strength reductions.
© Marcus Denker
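
A minimal sketch on a made-up three-address tuple IR: multiplication by 2 becomes an addition, and multiplication by any other power of two becomes a shift.

  def reduce_strength(instr):
      op, dest, left, right = instr
      if op == 'mul' and right == 2:
          return ('add', dest, left, left)      # y := x * 2  =>  y := x + x
      if op == 'mul' and isinstance(right, int) and right > 0 \
              and right & (right - 1) == 0:     # power of two?
          return ('shl', dest, left, right.bit_length() - 1)
      return instr

  print(reduce_strength(('mul', 'y', 'x', 2)))  # ('add', 'y', 'x', 'x')
  print(reduce_strength(('mul', 'y', 'x', 8)))  # ('shl', 'y', 'x', 3)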

Dead Code
Remove unnecessary code, e.g. variables assigned but never read:
  b := 3
  c := 1 + 3    =>    c := 1 + 3
  d := 3 + c          d := 3 + c
Remove code that is never reached:
  if (false) {a := 5}  =>  if (false) {}
© Marcus Denker

Simplify Structure
Similar to dead code: simplify the CFG structure. Optimizations degenerate the CFG, which needs to be cleaned up to enable further optimization, e.g. simplifying jumps to jumps (see the next slides).
© Marcus Denker

Delete Empty Basic Blocks
© Marcus Denker

Fuse Basic Blocks
Here we have 'conditional' jumps between basic blocks where the conditions are always true, so we can fuse these basic blocks together and eliminate the jumps.
© Marcus Denker

Common Subexpression Elimination (CSE)
A common subexpression: there is another occurrence of the expression whose evaluation always precedes this one, and the operands remain unchanged.
- Local (inside one basic block): done while building the IR
- Global (over the complete flow graph)
© Marcus Denker

Example CSE
  Before:         After:
  b := a + 2      t1 := a + 2
  c := 4 * b      b := t1
  b < c?          c := 4 * b
  b := 1          b < c?
  d := a + 2      b := 1
                  d := t1
Need to verify that a has not changed in between!
© Marcus Denker
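
A minimal sketch of local CSE over one basic block of made-up three-address tuples. Like the slide, it computes each new expression into a fresh temporary (t1, t2, ...) so the value survives even if the original destination is overwritten; the extra copies would be cleaned up by a later copy-propagation pass.

  import itertools

  def local_cse(block):
      available = {}                        # (op, left, right) -> temp
      fresh = ('t%d' % i for i in itertools.count(1))
      out = []
      for op, dest, left, right in block:
          key = (op, left, right)
          if key in available:
              out.append(('copy', dest, available[key]))  # reuse the temp
          else:
              tmp = next(fresh)
              out.append((op, tmp, left, right))  # compute into a fresh temp
              out.append(('copy', dest, tmp))
              available[key] = tmp
          # dest was overwritten: forget expressions that read dest
          available = {k: v for k, v in available.items() if dest not in k[1:]}
      return out

  block = [('add', 'b', 'a', 2), ('mul', 'c', 4, 'b'),
           ('const', 'b', 1, None), ('add', 'd', 'a', 2)]
  for instr in local_cse(block):
      print(instr)
  # the second 'a + 2' becomes ('copy', 'd', 't1')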

Loop Optimizations
Optimizing code in loops is important: loops are executed often, so the payoff is large. All optimizations help when applied to loop bodies, and some optimizations are loop-specific. Cf. 'hot-spot optimization'.
© Marcus Denker

Loop Invariant Code Motion Optimization Loop Invariant Code Motion Move expressions that are constant over all iterations out of the loop Does not generally work for expressions with side effects. © Marcus Denker

Induction Variable Optimizations
The values of variables form an arithmetic progression. FORTRAN example:

  integer a(100)              integer a(100)
  do i = 1, 100               t1 := 202
  a(i) = 202 - 2 * i    =>    do i = 1, 100
  enddo                       t1 := t1 - 2
                              a(i) = t1
                              enddo

The value assigned to a decreases by 2 on each iteration; the transformation uses strength reduction. Finding such optimizations is rather complicated (see the chapter in Muchnick).
© Marcus Denker

Partial Redundancy Elimination (PRE)
Combines multiple optimizations: global common-subexpression elimination and loop-invariant code motion.
Partial redundancy: a computation done more than once on some path through the flow graph. PRE inserts and deletes code to minimize redundancy.
© Marcus Denker

Code Inlining
All optimizations up to now were local to one procedure. Problem: procedures or functions can be very short, especially in good OO code!
Solution: copy the code of small procedures into the caller.
OO: polymorphic calls. Which method is called? Good OO code has small methods, which are great to inline when possible (e.g. when there is only one implementor).
© Marcus Denker

Example: Inlining
  power2(x) { return x*x }
  a := power2(b)   =>   a := b * b
NB: inlining can bloat the generated code. C++ added the 'inline' keyword as a hint, but modern compilers largely ignore it.
© Marcus Denker
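
The same idea in Python terms, with hypothetical code: the call overhead disappears and the inlined body becomes visible to further optimizations in the caller.

  def power2(x):
      return x * x

  def caller_before(b):
      return power2(b)      # call overhead, body hidden from the optimizer

  def caller_after(b):
      return b * b          # body copied into the caller

  assert caller_before(7) == caller_after(7) == 49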

Roadmap
- Introduction
- Optimizations in the Back-end
- The Optimizer
- SSA Optimizations
- Advanced Optimizations
© Marcus Denker

Repeat: SSA
SSA: Static Single Assignment Form.
Definition: every variable is only assigned once.
© Marcus Denker

Properties of SSA
- Definitions of variables (assignments) carry a list of all their uses
- Variable uses (reads) point to their one definition
- CFG of basic blocks
© Marcus Denker

Examples: Optimization on SSA
We take three simple ones:
- Constant Propagation
- Copy Propagation
- Simple Dead Code Elimination
© Marcus Denker

Repeat: Constant Propagation
Variables that have a constant value, e.g. c := 3: later uses of c can be replaced by the constant, if there is no change of c in between!
  b := 3          b := 3
  c := 1 + b  =>  c := 1 + 3
  d := b + c      d := 3 + c
Analysis is needed, as b can be assigned more than once!
© Marcus Denker

Constant Propagation and SSA
Variables are assigned once, so we know we can replace all uses by the constant!
  b1 := 3           b1 := 3
  c1 := 1 + b1  =>  c1 := 1 + 3
  d1 := b1 + c1     d1 := 3 + c1
Note that this example shows you must optimize iteratively: now c1 is also a constant. We also get dead code (b1 := 3).
© Marcus Denker
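
A minimal sketch of this on a made-up SSA tuple IR: because every variable has exactly one definition, a use can be replaced as soon as its definition is a constant. The pass iterates to a fixpoint, which is exactly the point the slide makes about c1.

  def propagate(block):
      changed = True
      while changed:
          changed = False
          # single assignment: one definition per name, so this map is safe
          consts = {dest: left for op, dest, left, right in block
                    if op == 'const'}
          subst = lambda v: consts.get(v, v)
          new = []
          for op, dest, left, right in block:
              if op == 'add' and isinstance(subst(left), int) \
                      and isinstance(subst(right), int):
                  # all operands constant: fold the addition away
                  new.append(('const', dest, subst(left) + subst(right), None))
                  changed = True
              else:
                  new.append((op, dest, subst(left), subst(right)))
          block = new
      return block

  ssa = [('const', 'b1', 3, None),
         ('add', 'c1', 1, 'b1'),
         ('add', 'd1', 'b1', 'c1')]
  for instr in propagate(ssa):
      print(instr)
  # b1, c1 and d1 all become constants; b1 and c1 are now dead definitions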

Repeat: Copy Propagation
For a statement x := y, replace later uses of x with y, if x and y have not been changed in between.
  x := y          x := y
  c := 1 + x  =>  c := 1 + y
  d := x + c      d := y + c
Analysis is needed, as x and y can be assigned more than once!
© Marcus Denker

Copy Propagation and SSA
For a statement x1 := y1, replace later uses of x1 with y1.
  x1 := y1           x1 := y1
  c1 := 1 + x1   =>  c1 := 1 + y1
  d1 := x1 + c1      d1 := y1 + c1
© Marcus Denker

Dead Code Elimination and SSA
A variable is live if its list of uses is not empty. Dead definitions can be deleted (if there is no side effect).
© Marcus Denker
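
A minimal sketch on the same made-up SSA tuple IR: count the uses of every definition and delete definitions that are never used; since deleting one definition can empty the use list of another, repeat until nothing changes. The live_out parameter (names still needed after the block) is an assumption added to keep the example self-contained, and the pass assumes the deleted definitions have no side effects.

  def eliminate_dead(block, live_out):
      changed = True
      while changed:
          uses = {}
          for op, dest, left, right in block:
              for v in (left, right):
                  uses[v] = uses.get(v, 0) + 1
          # keep a definition only if something still reads it
          keep = [instr for instr in block
                  if instr[1] in live_out or uses.get(instr[1], 0) > 0]
          changed = len(keep) != len(block)
          block = keep
      return block

  ssa = [('const', 'b1', 3, None),
         ('add', 'c1', 1, 'b1'),
         ('add', 'd1', 'b1', 'c1')]
  print(eliminate_dead(ssa, live_out={'d1'}))  # everything stays live
  print(eliminate_dead(ssa, live_out=set()))   # all dead: prints []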

Roadmap
- Introduction
- Optimizations in the Back-end
- The Optimizer
- SSA Optimizations
- Advanced Optimizations
© Marcus Denker

Advanced Optimizations
- Optimizing for multiple processors: auto-parallelization, a very active area of research (again)
- Inter-procedural optimizations: a global view, not just one procedure
- Profile-guided optimization
- Vectorization
- Dynamic optimization: used in virtual machines (both hardware and language VMs)
Profile-guided means you generate code, profile it in a typical scenario, and then use that information to optimize. Problem: usage scenarios can change in deployment, and there is no way to react, since the profile is generated at compile time. Dynamic optimization goes a step further and uses profile information gathered at run time to re-optimize in the VM; it is a good way to exploit unused CPU cycles or unused CPUs (multi-core).
gcc uses many of these techniques (not dynamic optimization).
One problem with optimizations in current compilers is that useful information may be lost, e.g. when matrix operations are translated to loops. This makes it very hard to come up with suitable optimizations; for example, we cannot use any mathematical domain knowledge.
© Marcus Denker

Iterative Process
There is no general 'right' order of optimizations: one optimization generates new opportunities for a preceding one. Optimization is an iterative process. Trade-off: compile time vs. code quality.
© Marcus Denker

What we have seen...
- Introduction
- Optimizations in the Back-end
- The Optimizer
- SSA Optimizations
- Advanced Optimizations
© Marcus Denker

Literature
- Muchnick: Advanced Compiler Design and Implementation (more than 600 pages on optimizations)
- Appel: Modern Compiler Implementation in Java (the basics)
© Marcus Denker

What you should know!
- Why do we optimize programs?
- Is there an optimal optimizer?
- Where in a compiler does optimization happen?
- Can you explain constant propagation?
© Marcus Denker

Can you answer these questions?
- What makes SSA suitable for optimization?
- When is a definition of a variable live in SSA form?
- Why don't we just optimize on the AST?
- Why do we need to optimize IR on different levels?
- In which order do we run the different optimizations?
© Marcus Denker

License
http://creativecommons.org/licenses/by-sa/2.5/
Attribution-ShareAlike 2.5
You are free:
- to copy, distribute, display, and perform the work
- to make derivative works
- to make commercial use of the work
Under the following conditions:
- Attribution. You must attribute the work in the manner specified by the author or licensor.
- Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.
For any reuse or distribution, you must make clear to others the license terms of this work. Any of these conditions can be waived if you get permission from the copyright holder. Your fair use and other rights are in no way affected by the above.
© Marcus Denker