Topic 9: Control Flow. COS 320 Compiling Techniques, Princeton University, Spring 2016. Lennart Beringer.

The Front End

The Back End (e.g. the Intel-HP “Itanium” architecture, which relies on the compiler to identify parallelism)

Optimization  r1 holds loop index i  load address of array a  load constant 4  calculate offset for index i  calculate address of a[i]  load content of x  store x in a[i] increment loop counter i  repeat, unless exit condition holds How can we optimize this code for code size/speed/resource usage/…?

Optimization  r1 holds loop index i  load address of array a  load constant 4  calculate offset for index i  calculate address of a[i]  load content of x  store x in a[i] increment loop counter i  repeat, unless exit condition holds Instructions not dependent of iteration count… …can be moved outside the loop!

Register Allocation Q: can any of the registers be shared/reused? – analyze liveness/def-use

Register Allocation Q: can any of the registers be shared/reused? – analyze liveness/def-use A: registers r4 and r5 don’t overlap – can map to same register, say r5! Then, rename r6 to r4.
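A minimal sketch of the liveness/def-use reasoning, using a hypothetical straight-line sequence with the slide's register names: compute each register's live range as (index of its definition, index of its last use) and let two registers share a physical register only if their ranges do not overlap.

```python
# Hypothetical straight-line code: each entry is (destination, sources).
code = [
    ("r4", []),            # 0: r4 := ...
    ("r5", ["r4"]),        # 1: r5 := f(r4)      last use of r4
    ("r6", []),            # 2: r6 := ...
    (None, ["r5", "r6"]),  # 3: use r5 and r6    both still live here
]

def live_range(reg):
    """Crude live range: (index of the definition, index of the last use)."""
    defs = [i for i, (d, _) in enumerate(code) if d == reg]
    uses = [i for i, (_, srcs) in enumerate(code) if reg in srcs]
    return min(defs), max(uses + defs)

def overlap(a, b):
    (s1, e1), (s2, e2) = live_range(a), live_range(b)
    # Sharing a single point (one value dies exactly where the other is defined)
    # does not count as overlap, so the two can share a register.
    return s1 < e2 and s2 < e1

print("r4/r5 overlap:", overlap("r4", "r5"))   # False -> can share a register
print("r5/r6 overlap:", overlap("r5", "r6"))   # True  -> must stay separate
```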

Scheduling Q: can we exploit this?

Scheduling A: can use the “empty slot” to execute some other instruction A, as long as A is independent (does not consume the value in r5)

Scheduling Can use the “empty slot” to execute some other instruction A, as long as A is independent (does not consume the value in r5)
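The legality check can be sketched as follows, for an assumed instruction sequence in which index 0 is the load into r5 whose result is not ready for one cycle: a candidate may be moved into the empty slot only if it does not read r5 and has no dependence on the instructions it would be hoisted over. This is a simplification of real list scheduling.

```python
# Hypothetical instruction sequence: (dest, sources).
code = [
    ("r5", ["r2"]),        # 0: r5 := load [r2]   result ready one cycle later
    ("r6", ["r5", "r3"]),  # 1: r6 := r5 + r3     would stall: needs r5
    ("r7", ["r3", "r4"]),  # 2: r7 := r3 * r4     independent of r5
    (None, ["r6", "r7"]),  # 3: use r6 and r7
]

def can_fill_slot(j, load_idx=0):
    """The candidate must not read the loaded value, and must not depend on
    (or be needed by) any instruction it would be moved across."""
    cand_dest, cand_srcs = code[j]
    if code[load_idx][0] in cand_srcs:                       # consumes r5
        return False
    for k in range(load_idx + 1, j):
        d_k, s_k = code[k]
        if d_k in cand_srcs:                                 # true dependence
            return False
        if cand_dest is not None and cand_dest in s_k:       # anti-dependence
            return False
        if cand_dest is not None and cand_dest == d_k:       # output dependence
            return False
    return True

slot_filler = next((j for j in range(1, len(code)) if can_fill_slot(j)), None)
print("move instruction", slot_filler, "into the slot after the load")   # -> 2
```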

Backend analyses and transformations, e.g. loop-invariant code removal

Control Flow Analysis

Control Flow Analysis Example (figure: a program fragment with statements numbered 1–8)

Control Flow Analysis Example (figure: statements 1–8 with the control-flow edges between them)

Basic Blocks (extra labels and jumps mentioned in previous lecture now omitted for simplicity)

CFG of Basic Blocks (figure: Start, basic blocks BB1–BB5, End, connected by control-flow edges)
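A small Python sketch of how such a CFG of basic blocks can be built: find the leaders (the first instruction, every jump target, every instruction following a jump), cut the instruction stream at the leaders, and add fall-through and jump edges. The instruction stream and labels below are invented for illustration.

```python
# Hypothetical instruction stream: each entry is (label_or_None, opcode, jump_targets).
prog = [
    ("L1", "i := 0",     []),
    ("L2", "if i >= n",  ["L4"]),   # conditional jump to L4, else fall through
    (None, "a[i] := x",  []),
    (None, "i := i + 1", []),
    (None, "goto",       ["L2"]),   # unconditional jump
    ("L4", "return",     []),
]

labels = {lab: i for i, (lab, _, _) in enumerate(prog) if lab}

# Leaders: first instruction, every jump target, every instruction after a jump.
leaders = {0}
for i, (_, op, targets) in enumerate(prog):
    for t in targets:
        leaders.add(labels[t])
    if targets and i + 1 < len(prog):
        leaders.add(i + 1)

starts = sorted(leaders)
blocks = [list(range(s, e)) for s, e in zip(starts, starts[1:] + [len(prog)])]
print("basic blocks (instruction indices):", blocks)

# CFG edges: jump targets, plus fall-through unless the block ends in goto/return.
edges = []
for b, insns in enumerate(blocks):
    _, op, targets = prog[insns[-1]]
    for t in targets:
        edges.append((b, starts.index(labels[t])))
    if op not in ("goto", "return") and b + 1 < len(blocks):
        edges.append((b, b + 1))
print("CFG edges between blocks:", edges)
```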

Domination Motivation (figure annotations): propagate r1=4; exploit 4+5=9 (“constant folding”); assume r1 dead here.

Domination Motivation (figure annotations): propagate r1=4; exploit 4+5=9 (“constant folding”); assume r1 dead here. What about this:

Domination Motivation (figure annotations): propagate r1=4; exploit 4+5=9. What about this: illegal if r1=4 does not hold on the other incoming arc! – We need to analyze which basic blocks are guaranteed to have been executed prior to the join.

Dominator Analysis

Dominator Analysis

Dominator Analysis. Starting point (initialization): every node n other than the entry s0 is dominated by all nodes. Then iterate: a node dominates n if it is n itself, or if it dominates all predecessors of n, i.e. Dom[s0] = {s0} and Dom[n] = {n} ∪ ⋂ { Dom[p] | p ∈ pred(n) }.
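These equations can be solved by straightforward fixed-point iteration. A sketch on a hypothetical 12-node CFG (the node names and edges are assumptions, not necessarily the slides' exact example):

```python
# Hypothetical CFG as a successor map; node 1 is the entry node s0.
succ = {1: [2], 2: [3, 4], 3: [2], 4: [2, 5], 5: [6, 8], 6: [7], 7: [11],
        8: [9], 9: [8, 10], 10: [5, 11], 11: [12], 12: []}
nodes = list(succ)
pred = {n: [m for m in nodes if n in succ[m]] for n in nodes}
s0 = 1

dom = {n: set(nodes) for n in nodes}   # starting point: dominated by all nodes
dom[s0] = {s0}

changed = True
while changed:                         # iterate to a fixed point
    changed = False
    for n in nodes:
        if n == s0:
            continue
        # Dom[n] = {n} union (intersection of Dom[p] over all predecessors p)
        new = {n} | set.intersection(*(dom[p] for p in pred[n]))
        if new != dom[n]:
            dom[n] = new
            changed = True

for n in nodes:
    print(n, sorted(dom[n]))
```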

Dominator Analysis Example Task: fill in column Dom[n]

Dominator Analysis Example More concise information: immediate dominators/dominator tree.

Dominator Analysis Example Hence: last dominator of n on any path from s0 to n is IDom[n] Task: fill in column IDom[n]

Dominator Analysis Example. Hence: the last dominator of n on any path from s0 to n is IDom[n]. (Table: the IDom[n] column filled in; values shown: –, 1, 2, 4, 5, 8, 9, 7.)

Use of Immediate Dominators: Dominator Tree
Immediate dominators can be arranged in a tree
 root: s0
 children of node n: the nodes m such that n = IDom[m]
 hence: each node dominates only its tree descendants
(figure: dominator tree over nodes 1–12 with s0 = 1; note: some tree arcs are CFG edges, some are not)
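Once Dom[n] is known, IDom[n] is the strict dominator of n that every other strict dominator of n also dominates; arranging these links gives the dominator tree. A sketch on the same hypothetical CFG as in the dominator example above:

```python
# Same hypothetical CFG as in the dominator sketch above.
succ = {1: [2], 2: [3, 4], 3: [2], 4: [2, 5], 5: [6, 8], 6: [7], 7: [11],
        8: [9], 9: [8, 10], 10: [5, 11], 11: [12], 12: []}
nodes = list(succ)
pred = {n: [m for m in nodes if n in succ[m]] for n in nodes}
s0 = 1

dom = {n: (set(nodes) if n != s0 else {s0}) for n in nodes}
changed = True
while changed:
    changed = False
    for n in nodes:
        if n == s0:
            continue
        new = {n} | set.intersection(*(dom[p] for p in pred[n]))
        if new != dom[n]:
            dom[n], changed = new, True

# IDom[n]: the strict dominator of n that all other strict dominators of n dominate.
idom = {}
for n in nodes:
    strict = dom[n] - {n}
    for d in strict:
        if all(e in dom[d] for e in strict):
            idom[n] = d

print("IDom:", idom)

# Dominator tree: the children of n are the nodes whose immediate dominator is n.
tree = {n: sorted(m for m in nodes if idom.get(m) == n) for n in nodes}
print("dominator tree:", {n: kids for n, kids in tree.items() if kids})
```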

Post Dominator

Loop Optimization

Examples of Loops (figures: several small example CFGs, each containing a loop)

Examples of Loops (figures): two loops with identical header node 1; further example loops with header node 2 and with header node 1.

Back Edges. Back-edges: 3 → 2, 4 → 2, 9 → 8, 10 → 5.

Natural Loops (table to fill in)
Back-edge | Header of nat. loop | Nodes
3 → 2     |                     |
4 → 2     |                     |
9 → 8     |                     |
10 → 5    |                     |

Natural Loops
Back-edge | Header of nat. loop | Nodes
3 → 2     | 2                   | 2, 3
4 → 2     | 2                   | 2, 4
9 → 8     | 8                   | 8, 9
10 → 5    | 5                   | 5, 8, 9, 10
Q: Suppose we had an additional edge 5 → 3 – is this a back-edge?

Natural Loops (table as above)
Q: Suppose we had an additional edge 5 → 3 – is this a back-edge?
A: No! 3 does not dominate 5!
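A sketch that reproduces the table: detect back edges as edges t → h whose target h dominates the source t, then collect each natural loop by walking predecessors backwards from t until the header h blocks the walk. The CFG below is reconstructed so that it has exactly the four back edges of the table; it is an assumption, not necessarily the slides' exact graph.

```python
# CFG reconstructed to have exactly the back edges 3->2, 4->2, 9->8, 10->5.
succ = {1: [2], 2: [3, 4], 3: [2], 4: [2, 5], 5: [6, 8], 6: [7], 7: [11],
        8: [9], 9: [8, 10], 10: [5, 11], 11: [12], 12: []}
nodes = list(succ)
pred = {n: [m for m in nodes if n in succ[m]] for n in nodes}
s0 = 1

dom = {n: (set(nodes) if n != s0 else {s0}) for n in nodes}   # dominators, as before
changed = True
while changed:
    changed = False
    for n in nodes:
        if n == s0:
            continue
        new = {n} | set.intersection(*(dom[p] for p in pred[n]))
        if new != dom[n]:
            dom[n], changed = new, True

# A back edge is an edge t -> h whose target h dominates its source t.
back_edges = [(t, h) for t in nodes for h in succ[t] if h in dom[t]]

def natural_loop(t, h):
    """Natural loop of back edge t -> h: h plus every node that can reach t
    without passing through h (found by walking predecessors backwards)."""
    loop, work = {h, t}, [t]
    while work:
        n = work.pop()
        for p in pred[n]:
            if p not in loop:
                loop.add(p)
                work.append(p)
    return loop

for t, h in back_edges:
    print(f"back edge {t} -> {h}: header {h}, nodes {sorted(natural_loop(t, h))}")
```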

Loop Optimization

Loop Optimization
This CFG does not contain a loop! (figure: nodes 1, 2, 3)
{1,2,3} is not a loop:
 1 is not a header: there are no paths/arcs back to 1
 2 is not a header: it does not dominate 1 or 3
 similarly for 3
{2,3} is not a loop:
 2 is not a header: it does not dominate 3
 similarly for 3

Loop Optimization: hoist loop-invariant code “into the edge” leading to the header, i.e. into a new preheader block inserted on that edge.
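A minimal sketch of inserting such a preheader, assuming the CFG is given as a successor map: every edge into the header that comes from outside the loop is redirected through a fresh block, which is where hoisted loop-invariant code can then be placed. The node names and helper are made up for illustration.

```python
def insert_preheader(succ, header, loop_nodes, new_name):
    """Redirect every edge into `header` that comes from outside the loop so it
    goes through a fresh preheader block instead (a sketch)."""
    succ = {n: list(s) for n, s in succ.items()}     # copy the CFG
    succ[new_name] = [header]                        # preheader falls into the header
    for n in list(succ):
        if n != new_name and n not in loop_nodes:
            succ[n] = [new_name if t == header else t for t in succ[n]]
    return succ

# Tiny example: loop {2, 3} with header 2, entered from node 1.
succ = {1: [2], 2: [3, 4], 3: [2], 4: []}
print(insert_preheader(succ, header=2, loop_nodes={2, 3}, new_name="pre2"))
# -> {1: ['pre2'], 2: [3, 4], 3: [2], 4: [], 'pre2': [2]}
```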

Loop Optimization. Induction variables: variables that the loop increments/decrements by a loop-independent value. Q: is there an induction variable here?

Loop Optimization. Induction variables: variables that the loop increments/decrements by a loop-independent value. r1 is an induction variable! (figure annotations: “exploit here?!”, “holds here”; observed values r1 = 1, 2, 3, 4 and r4 = 8, 12, …)

Loop Optimization. Induction variables: variables that the loop increments/decrements by a loop-independent value. Strength reduction: initialize r4 = -4 before the loop and update r4 = r4 + 4 inside it (replace * by +); rewrite the exit test as r4 <= 40. (figure: values r1 = 1, 2, 3, 4 and r4 = 8, 12, …) Eliminated r1 and r3, cut 2 instructions; made 1 instruction 1 cycle faster!
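At source level the transformation looks roughly as follows (a hedged illustration; the concrete loop is an assumption, not the slide's code): the derived induction variable r4 = 4*r1 is kept up to date by an addition instead of being recomputed with a multiply, and the exit test is rewritten in terms of r4 so that r1 itself can be eliminated.

```python
# Before: r4 is recomputed with a multiply from the basic induction variable r1.
def sum_offsets_original(n=10):
    total = 0
    r1 = 1
    while r1 <= n:           # loop test on r1
        r4 = 4 * r1          # derived induction variable: multiply every iteration
        total += r4
        r1 += 1
    return total

# After strength reduction: r4 is updated by addition, and the exit test is
# rewritten in terms of r4, so r1 can be eliminated entirely.
def sum_offsets_reduced(n=10):
    total = 0
    r4 = 0                   # r4 = 4 * (r1 - 1) before the loop
    limit = 4 * n
    while r4 < limit:        # r1 <= n  is equivalent to  r4 < 4*n
        r4 += 4              # replace * by +
        total += r4
    return total

assert sum_offsets_original() == sum_offsets_reduced()
print(sum_offsets_original(), sum_offsets_reduced())
```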

Non-Loop Cycles. Remember: this CFG does not contain a loop! (figure: nodes 1, 2, 3) Reduction: collapse nodes, eliminate edges.

Node Splitting (figure: the cycle over nodes 1, 2, 3): duplicate a node of the cycle, say 2 (creating a copy 2’).

Node Splitting: duplicate a node of the cycle, say 2; connect the copy 2’ to its successor and predecessor.

Node Splitting: duplicate a node of the cycle, say 2; connect the copy 2’ to its successor and predecessor; the successor of the copy is the loop header!
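A sketch of the splitting step on the irreducible example, with the CFG as a successor map (the helper name and representation are assumptions): node 2 is duplicated as 2’, the edge from 3 is redirected to the copy, and the copy keeps node 2's successors; afterwards the copy's successor, node 3, dominates the remaining cycle and is a proper loop header.

```python
def split_node(succ, node, split_pred):
    """Duplicate `node`; the duplicate takes over the edge coming from
    `split_pred` and keeps the same successors as the original (a sketch)."""
    succ = {n: list(s) for n, s in succ.items()}          # copy the CFG
    dup = node + "'"
    succ[dup] = list(succ[node])                          # same successors as node
    succ[split_pred] = [dup if t == node else t for t in succ[split_pred]]
    return succ

# Irreducible example from the slides: 1 branches to both 2 and 3, and 2 and 3
# form a cycle in which neither node dominates the other.
succ = {"1": ["2", "3"], "2": ["3"], "3": ["2"]}
print(split_node(succ, node="2", split_pred="3"))
# -> {'1': ['2', '3'], '2': ['3'], '3': ["2'"], "2'": ['3']}
# The remaining cycle is 3 -> 2' -> 3; node 3 now dominates 2', so the cycle is
# a natural loop whose header is 3, the successor of the copy, as the slide notes.
```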

Reduction: collapse nodes, eliminate edges (figure: a CFG over nodes 1–9 progressively collapsed into regions A, B, C).