Code Generation Part III

Slides:



Advertisements
Similar presentations
Example of Constructing the DAG (1)t 1 := 4 * iStep (1):create node 4 and i 0 Step (2):create node Step (3):attach identifier t 1 (2)t 2 := a[t 1 ]Step.
Advertisements

CSC 4181 Compiler Construction Code Generation & Optimization.
Synopsys University Courseware Copyright © 2012 Synopsys, Inc. All rights reserved. Compiler Optimization and Code Generation Lecture - 3 Developed By:
7. Optimization Prof. O. Nierstrasz Lecture notes by Marcus Denker.
Course Outline Traditional Static Program Analysis Software Testing
Lecture 11: Code Optimization CS 540 George Mason University.
Chapter 9 Code optimization Section 0 overview 1.Position of code optimizer 2.Purpose of code optimizer to get better efficiency –Run faster –Take less.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
Loops or Lather, Rinse, Repeat… CS153: Compilers Greg Morrisett.
19 Classic Examples of Local and Global Code Optimizations Local Constant folding Constant combining Strength reduction.
Course Outline Traditional Static Program Analysis –Theory Compiler Optimizations; Control Flow Graphs Data-flow Analysis – today’s class –Classic analyses.
Chapter 10 Code Optimization. A main goal is to achieve a better performance Front End Code Gen Intermediate Code source Code target Code user Machine-
C Chuen-Liang Chen, NTUCS&IE / 321 OPTIMIZATION Chuen-Liang Chen Department of Computer Science and Information Engineering National Taiwan University.
1 Code Optimization Code produced by compilation algorithms can often be improved (ideally optimized) in terms of run-time speed and the amount of memory.
1 Classical Optimization Types of classical optimizations  Operation level: one operation in isolation  Local: optimize pairs of operations in same basic.
1 CS 201 Compiler Construction Lecture 7 Code Optimizations: Partial Redundancy Elimination.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) Partial Redundancy Elimination Guo, Yao.
1 Data flow analysis Goal : collect information about how a procedure manipulates its data This information is used in various optimizations For example,
6/9/2015© Hal Perkins & UW CSEU-1 CSE P 501 – Compilers SSA Hal Perkins Winter 2008.
1 Copy Propagation What does it mean? Given an assignment x = y, replace later uses of x with uses of y, provided there are no intervening assignments.
1 CS 201 Compiler Construction Lecture 5 Code Optimizations: Copy Propagation & Elimination.
Global optimization. Data flow analysis To generate better code, need to examine definitions and uses of variables beyond basic blocks. With use- definition.
9. Optimization Marcus Denker. 2 © Marcus Denker Optimization Roadmap  Introduction  Optimizations in the Back-end  The Optimizer  SSA Optimizations.
Code Generation Professor Yihjia Tsai Tamkang University.
Introduction to Program Optimizations Chapter 11 Mooly Sagiv.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
CS 412/413 Spring 2007Introduction to Compilers1 Lecture 29: Control Flow Analysis 9 Apr 07 CS412/413 Introduction to Compilers Tim Teitelbaum.
U NIVERSITY OF M ASSACHUSETTS, A MHERST D EPARTMENT OF C OMPUTER S CIENCE Emery Berger University of Massachusetts, Amherst Advanced Compilers CMPSCI 710.
1 Copy Propagation What does it mean? – Given an assignment x = y, replace later uses of x with uses of y, provided there are no intervening assignments.
Machine-Independent Optimizations Ⅰ CS308 Compiler Theory1.
Global optimization. Data flow analysis To generate better code, need to examine definitions and uses of variables beyond basic blocks. With use- definition.
1 CS 201 Compiler Construction Data Flow Analysis.
Topic #10: Optimization EE 456 – Compiling Techniques Prof. Carl Sable Fall 2003.
1 Code Optimization Chapter 9 (1 st ed. Ch.10) COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University,
Dataflow IV: Loop Optimizations, Exam 2 Review EECS 483 – Lecture 26 University of Michigan Wednesday, December 6, 2006.
U NIVERSITY OF M ASSACHUSETTS, A MHERST D EPARTMENT OF C OMPUTER S CIENCE Emery Berger University of Massachusetts, Amherst Advanced Compilers CMPSCI 710.
What’s in an optimizing compiler?
1 Code Generation Part II Chapter 8 (1 st ed. Ch.9) COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University,
1 Code Generation Part II Chapter 9 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2005.
1 Code optimization “Code optimization refers to the techniques used by the compiler to improve the execution efficiency of the generated object code”
Dataflow II: Finish Dataflow Analysis, Start on Classical Optimizations EECS 483 – Lecture 24 University of Michigan Wednesday, November 29, 2006.
EECS 583 – Class 8 Classic Optimization University of Michigan October 3, 2011.
CS412/413 Introduction to Compilers and Translators April 2, 1999 Lecture 24: Introduction to Optimization.
1 Code Optimization Chapter 9 (1 st ed. Ch.10) COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University,
CS 412/413 Spring 2005Introduction to Compilers1 CS412/CS413 Introduction to Compilers Tim Teitelbaum Lecture 30: Loop Optimizations and Pointer Analysis.
Code Optimization Code produced by compilation algorithms can often be improved (ideally optimized) in terms of run-time speed and the amount of memory.
Code Optimization Overview and Examples
High-level optimization Jakub Yaghob
Code Optimization.
© Seth Copen Goldstein & Todd C. Mowry
Princeton University Spring 2016
Fall Compiler Principles Lecture 8: Loop Optimizations
Machine-Independent Optimization
Topic 10: Dataflow Analysis
Optimizing Transformations Hal Perkins Autumn 2011
Code Optimization Chapter 10
Code Optimization Chapter 9 (1st ed. Ch.10)
Compiler Code Optimizations
Code Optimization Overview and Examples Control Flow Graph
Code Generation Part III
EECS 583 – Class 8 Classic Optimization
Fall Compiler Principles Lecture 10: Loop Optimizations
EECS 583 – Class 8 Finish SSA, Start on Classic Optimization
Local Optimizations.
Lecture 19: Code Optimisation
EECS 583 – Class 9 Classic and ILP Optimization
Code Generation Part II
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Live Variables – Basic Block
Code Optimization.
Presentation transcript:

Code Generation Part III Chapters 8 and 9.1 (1st ed. Ch.9) COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007-2013

Classic Examples of Local and Global Code Optimizations Constant folding Constant combining Strength reduction Constant propagation Common subexpression elimination Backward copy propagation Global Dead code elimination Constant propagation Forward copy propagation Common subexpression elimination Code motion Loop strength reduction Induction variable elimination

Local: Constant Folding Goal: eliminate unnecessary operations Rules: X is an arithmetic operation If src1(X) and src2(X) are constant, then change X by applying the operation r7 = 4 + 1 src2(X) = 1 r5 = 2 * r4 r6 = r5 * 2 src1(X) = 4

Local: Constant Combining Goal: eliminate unnecessary operations First operation often becomes dead after constant combining Rules: Operations X and Y in same basic block X and Y have at least one literal src Y uses dest(X) None of the srcs of X have defs between X and Y (excluding Y) r7 = 5 r5 = 2 * r4 r6 = r5 * 2 r6 = r4 * 4

Local: Strength Reduction Goal: replace expensive operations with cheaper ones Rules (common): X is an multiplication operation where src1(X) or src2(X) is a const 2k integer literal Change X by using shift operation For k=1 can use add r7 = 5 r5 = 2 * r4 r6 = r4 * 4 r5 = r4 + r4 r6 = r4 << 2

Local: Constant Propagation Goal: replace register uses with literals (constants) in a single basic block Rules: Operation X is a move to register with src1(X) literal Operation Y uses dest(X) There is no def of dest(X) between X and Y (excluding defs at X and Y) Replace dest(X) in Y with src1(X) r1 = 5 r2 = _x r3 = 7 r4 = r4 + r1 r1 = r1 + r2 r1 = r1 + 1 r3 = 12 r8 = r1 - r2 r9 = r3 + r5 r3 = r2 + 1 r7 = r3 - r1 M[r7] = 0 r4 = r4 + 5 r1 = 5 + _x r1 = 5 + _x + 1 r8 = 5 + _x + 1 - _x r9 = 12 + r5 r3 = _x + 1 r7 = _x + 1 - 5 - _x - 1

Local: Common Subexpression Elimination (CSE) Goal: eliminate re-computations of an expression More efficient code Resulting moves can get copy propagated (see later) Rules: Operations X and Y have the same opcode and Y follows X src(X) = src(Y) for all srcs For all srcs, no def of a src between X and Y (excluding Y) No def of dest(X) between X and Y (excluding X and Y) Replace Y with move dest(Y) = dest(X) r1 = r2 + r3 r4 = r4 + 1 r1 = 6 r6 = r2 + r3 r2 = r1 - 1 r5 = r4 + 1 r7 = r2 + r3 r5 = r1 - 1 r5 = r2

Local: Backward Copy Propagation Goal: propagate LHS of moves backward Eliminates useless moves Rules (dataflow required) X and Y in same block Y is a move to register dest(X) is a register that is not live out of the block Y uses dest(X) dest(Y) not used or defined between X and Y (excluding X and Y) No uses of dest(X) after the first redef of dest(Y) Replace src(Y) on path from X to Y with dest(X) and remove Y r1 = r8 + r9 r2 = r9 + r1 r4 = r2 r6 = r2 + 1 r9 = r1 r7 = r6 r5 = r7 + 1 r4 = 0 r8 = r2 + r7 r7 = r2 + 1 remove r7 = r6 r6 not live

Global: Dead Code Elimination r1 = 3 r2 = 10 Goal: eliminate any operation who’s result is never used Rules (dataflow required) X is an operation with no use in def-use (DU) chain, i.e. dest(X) is not live Delete X if removable (not a mem store or branch) Rules too simple! Misses deletion of r4, even after deleting r7, since r4 is live in loop Better is to trace UD chains backwards from “critical” operations r4 = r4 + 1 r7 = r1 * r4 r7 not live r3 = r3 + 1 r2 = 0 r3 = r2 + r1 M[r1] = r3

Global: Constant Propagation r1 = 4 r2 = 10 Goal: globally replace register uses with literals Rules (dataflow required) X is a move to a register with src1(X) literal Y uses dest(X) dest(X) has only one def at X for use-def (UD) chains to Y Replace dest(X) in Y with src1(X) r5 = 2 r7 = r1 * r5 r7 = 8 r3 = r3 + r5 r2 = 0 r3 = r2 + r1 r6 = r7 * r4 r3 = r2 + 4 r6 = 8 * r4 r3 = r3 + 2 M[r1] = r3 M[4] = r3

Global: Forward Copy Propagation Goal: globally propagate RHS of moves forward Reduces dependence chain May be possible to eliminate moves Rules (dataflow required) X is a move with src1(X) register Y uses dest(X) dest(X) has only one def at X for UD chains to Y src1(X) has no def on any path from X to Y Replace dest(X) in Y with src1(X) r1 = r2 r3 = r4 r6 = r3 + 1 r2 = 0 r5 = r2 + r3 r5 = r2 + r4 r6 = r4 + 1

Global: Common Subexpression Elimination (CSE) r1 = r2 * r6 Goal: eliminate recomputations of an expression Rules: X and Y have the same opcode and X dominates Y src(X) = src(Y) for all srcs For all srcs, no def of a src on any path between X and Y (excluding Y) Insert rx = dest(X) immediately after X for new register rx Replace Y with move dest(Y) = rx r3 = r4 / r7 r10 = r3 r2 = r2 + 1 r3 = r3 + 1 r1 = r3 * 7 r5 = r2 * r6 r8 = r4 / r7 r8 = r10 r9 = r3 * 7

Global: Code Motion preheader r1 = 0 Goal: move loop-invariant computations to preheader Rules: Operation X in block that dominates all exit blocks X is the only operation to modify dest(X) in loop body All srcs of X have no defs in any of the basic blocks in the loop body Move X to end of preheader Note 1: if one src of X is a memory load, need to check for stores in loop body Note 2: X must be movable and not cause exceptions r4 = M[r5] header r4 = M[r5] r7 = r4 * 3 r8 = r2 + 1 r7 = r8 * r4 r3 = r2 + 1 r1 = r1 + r7 M[r1] = r3

Global: Loop Strength Reduction i := 0 t1 := n-2 i := 0 t1 := n-2 t2 := 4*i B2: t2 := 4*i A[t2] := 0 i := i+1 B2: A[t2] := 0 i := i+1 t2 := t2+4 B3: B3: if i < t1 goto B2 if i < t1 goto B2 Replace expensive computations with induction variables

Global: Induction Variable Elimination i := 0 t1 := n-2 t2 := 4*i t1 := 4*n t1 := t1-8 t2 := 4*i B2: A[t2] := 0 i := i+1 t2 := t2+4 B2: A[t2] := 0 t2 := t2+4 B3: B3: if i<t1 goto B2 if t2<t1 goto B2 Replace induction variable in expressions with another