Loops Simone Campanoni

Slides:



Advertisements
Similar presentations
SSA and CPS CS153: Compilers Greg Morrisett. Monadic Form vs CFGs Consider CFG available exp. analysis: statement gen's kill's x:=v 1 p v 2 x:=v 1 p v.
Advertisements

1 SSA review Each definition has a unique name Each use refers to a single definition The compiler inserts  -functions at points where different control.
Data-Flow Analysis II CS 671 March 13, CS 671 – Spring Data-Flow Analysis Gather conservative, approximate information about what a program.
8. Static Single Assignment Form Marcus Denker. © Marcus Denker SSA Roadmap  Static Single Assignment Form (SSA)  Converting to SSA Form  Examples.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.
Chapter 9 Code optimization Section 0 overview 1.Position of code optimizer 2.Purpose of code optimizer to get better efficiency –Run faster –Take less.
Control-Flow Graphs & Dataflow Analysis CS153: Compilers Greg Morrisett.
CS412/413 Introduction to Compilers Radu Rugina Lecture 37: DU Chains and SSA Form 29 Apr 02.
U NIVERSITY OF D ELAWARE C OMPUTER & I NFORMATION S CIENCES D EPARTMENT Optimizing Compilers CISC 673 Spring 2011 More Control Flow John Cavazos University.
1 Code Optimization. 2 The Code Optimizer Control flow analysis: control flow graph Data-flow analysis Transformations Front end Code generator Code optimizer.
Components of representation Control dependencies: sequencing of operations –evaluation of if & then –side-effects of statements occur in right order Data.
Program Representations. Representing programs Goals.
1 Constant Propagation for Loops with Factored Use-Def Chains Reporter : Lai, Yen-Chang.
1 CS 201 Compiler Construction Lecture 2 Control Flow Analysis.
Jeffrey D. Ullman Stanford University Flow Graph Theory.
Example in SSA X := Y op Z in out F X := Y op Z (in) = in [ { X ! Y op Z } X :=  (Y,Z) in 0 out F X :=   (in 0, in 1 ) = (in 0 Å in 1 ) [ { X ! E |
1 CS 201 Compiler Construction Lecture 2 Control Flow Analysis.
Recap from last time We were trying to do Common Subexpression Elimination Compute expressions that are available at each program point.
U NIVERSITY OF M ASSACHUSETTS, A MHERST D EPARTMENT OF C OMPUTER S CIENCE Advanced Compilers CMPSCI 710 Spring 2003 Lecture 2 Emery Berger University of.
Representing programs Goals. Representing programs Primary goals –analysis is easy and effective just a few cases to handle directly link related things.
PSUCS322 HM 1 Languages and Compiler Design II Basic Blocks Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU Spring.
Global optimization. Data flow analysis To generate better code, need to examine definitions and uses of variables beyond basic blocks. With use- definition.
ECE1724F Compiler Primer Sept. 18, 2002.
Lecture 6 Program Flow Analysis Forrest Brewer Ryan Kastner Jose Amaral.
2015/6/24\course\cpeg421-10F\Topic1-b.ppt1 Topic 1b: Flow Analysis Some slides come from Prof. J. N. Amaral
Data Flow Analysis Compiler Design October 5, 2004 These slides live on the Web. I obtained them from Jeff Foster and he said that he obtained.
Loop invariant detection using SSA An expression is invariant in a loop L iff: (base cases) –it’s a constant –it’s a variable use, all of whose single.
CS 412/413 Spring 2007Introduction to Compilers1 Lecture 29: Control Flow Analysis 9 Apr 07 CS412/413 Introduction to Compilers Tim Teitelbaum.
1 Copy Propagation What does it mean? – Given an assignment x = y, replace later uses of x with uses of y, provided there are no intervening assignments.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) Loops Guo, Yao.
Global optimization. Data flow analysis To generate better code, need to examine definitions and uses of variables beyond basic blocks. With use- definition.
Efficiency of Iterative Algorithms
1 Region-Based Data Flow Analysis. 2 Loops Loops in programs deserve special treatment Because programs spend most of their time executing loops, improving.
CSE P501 – Compiler Construction
1 Code Optimization Chapter 9 (1 st ed. Ch.10) COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University,
Advanced Compiler Techniques LIU Xianhua School of EECS, Peking University Loops.
Program Representations. Representing programs Goals.
Announcements & Reading Material
1 CS 201 Compiler Construction Lecture 2 Control Flow Analysis.
Machine-Independent Optimizations Ⅳ CS308 Compiler Theory1.
1 Code Optimization Chapter 9 (1 st ed. Ch.10) COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University,
Simone Campanoni SSA Simone Campanoni
Simone Campanoni CFA Simone Campanoni
Basic Program Analysis
Simone Campanoni Dependences Simone Campanoni
CS 201 Compiler Construction
EECS 583 – Class 2 Control Flow Analysis
Simone Campanoni Loop transformations Simone Campanoni
Static Single Assignment
Program Representations
CSC D70: Compiler Optimization Dataflow-2 and Loops
CSC D70: Compiler Optimization LICM: Loop Invariant Code Motion
CS 201 Compiler Construction
Topic 10: Dataflow Analysis
Factored Use-Def Chains and Static Single Assignment Forms
Control Flow Analysis CS 4501 Baishakhi Ray.
Code Optimization Chapter 10
Code Optimization Chapter 9 (1st ed. Ch.10)
Optimizing Compilers CISC 673 Spring 2009 More Control Flow
Graphs Chapter 11 Objectives Upon completion you will be able to:
Topic 4: Flow Analysis Some slides come from Prof. J. N. Amaral
Code Optimization Overview and Examples Control Flow Graph
Static Single Assignment Form (SSA)
Control Flow Analysis (Chapter 7)
Interval Partitioning of a Flow Graph
Data Flow Analysis Compiler Design
EECS 583 – Class 7 Static Single Assignment Form
CSC D70: Compiler Optimization LICM: Loop Invariant Code Motion
CSC D70: Compiler Optimization Static Single Assignment (SSA)
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Presentation transcript:

Loops Simone Campanoni

… points … H0, H1, H2 … … up to 30 points … … (excluding extra points) … … out of 100 GradePoints A95 – 100 A -90 – 94 B +80 – 89 B61 – 79 C D25 – 49 F0 - 24

H1 solution: data structure

H1: finding CAT functions

H1

Homework H3 is due tomorrow!

Outline Loops Identify loops Loop normalization

Impact of optimized code to program Code transformation 10 seconds 1 second How much did we optimize the overall program? Coverage of optimized code 10% coverage: Speedup=~1.10x (100->91 seconds) 20% coverage: Speedup=~1.22x (100->82 seconds) 90% coverage: Speedup=~5.26x (100->19 seconds)

90% of time is spent in 10% of code Hot code Loop Cold code Identify hot code to succeed!!!

Loops …... but where are they?... How can we find them?

Loops in source code i=0; while (i < 10){ … i++; } for (i=0; i < 10; i++){ … } i=0; do { … i++; } while (i < 10); S={0,1,…,10} foreach (i in S){ … } Is there a LLVM IR instruction “for”? There is no IR instruction for “loop”

Target optimization: we need to identify loops There is no IR instruction for “loop” How to identify an IR loop?

Loops in IR Loop identification: Input: Control-Flow-Graph Output: loops in CFG Not sensitive to input syntax: a uniform treatment for all loops Define a loop in graphic-theoretic terms Intuitive properties of a loop Single entry point Edges must form at least a cycle in CFG How to check these properties automatically?

Dominators Definition: Node d dominates node n in a graph (d dom n) if every path from the start node to n goes through d What is the relation between instructions within a basic block? What is the relation between basic blocks 1 and 2? What is the relation between basic blocks 1, 2, and 3?

Immediate dominators Definition: the immediate dominator of a node n is the unique node that strictly dominates n but does not strictly dominate another node that strictly dominates n CFG Immediate dominators Dominator tree

Natural loops in CFG Header: node that dominates all other nodes in a loop Single entry point of a loop Back edge: arc (tail -> head) whose head dominates its tail Natural loop of a back edge: smallest set of nodes that includes the head and tail of the back edge, and has no predecessors outside the set, except for the predecessors of the header.

Identify natural loops ①Find the dominator relations in a flow graph ②Identify the back edges ③Find the natural loop associated with the back edge

Finding dominators Definition: Node d dominates node n in a graph (d dom n) if every path from the start node to n goes through d. Data-flow analysis: Direction? Values? GEN? KILL? IN? OUT? Meet operator? Top? Bottom?

Finding back-edges Definition: a back-edge is an arc (tail -> head) whose head dominates its tail (A) Depth-first spanning tree Edges traversed in a depth-first search of the flow graph form a depth-first spanning tree Compute retreating edges in CFG: Advancing edges: from ancestor to proper descendant Retreating edges: from descendant to ancestor (B) For each retreating edge t->h, check if h is in t’s dominator list

Finding natural loops Definition: the natural loop of a back edge is smallest set of nodes that includes the head and tail of the back edge, and has no predecessors outside the set, except for the predecessors of the header Let t->h be the back-edge A.Delete h from the flow graph B.Find those nodes that can reach t (those nodes plus h form the natural loop of t->h)

Identify inner loops If two loops do not have the same header they are either disjoint, or one is entirely contained (nested within) the other inner loop, one that contains no other loop Loop nesting tree What about if two loops share the same header? while (i < 10){ if (i == 5) continue; }

Loop-nest tree Loop-nest tree: each node represents the blocks of a loop, and parent nodes are enclosing loops. The leaves of the tree are the inner-most loops ,3 1,2,3,4 How to compute the loop-nest tree?

Loops in LLVM Rely on other passes to identify loops Fetch the result of the LoopInfoWrapperPass analysis Iterate over loops

Loops in LLVM: sub-loops Iterate over sub-loops of a loop

Defining loops in graphic-theoretic terms Is it good? Bad? Implications? L1: … if (X < 10) goto L2; goto L1; L2:... if (…) goto L1; … do { … L1: … } while (X < 10); The good The bad The ugly?

Outline Loops Identify loops Loop normalization

First normalization: adding a pre-header Optimizations often require code to be executed once before the loop Create a pre-header basic block for every loop

Common loop normalization Pre- header Body Header

Loop normalization in LLVM The loop-simplify pass normalize natural loops Output of loop-simplify : Pre-header: the only predecessor of the header Latch: node executed just before starting a new loop iteration Exit node: ensures it is dominated by the header Header Body n1 n2 n3 exit nX

Loop normalization in LLVM The loop-simplify pass normalize natural loops Output of loop-simplify : Pre-header: the only predecessor of the header Latch: node executed just before starting a new loop iteration Exit node: ensures it is dominated by the header Pre- header Body n1 n2 n3 exit nX Header

Loop normalization in LLVM The loop-simplify pass normalize natural loops Output of loop-simplify : Pre-header: the only predecessor of the header Latch: node executed just before starting a new loop iteration Exit node: ensures it is dominated by the header Pre- header Body n1 n2 n3 exit nX Header Latch

Loop normalization in LLVM The loop-simplify pass normalize natural loops Output of loop-simplify : Pre-header: the only predecessor of the header Latch: node executed just before starting a new loop iteration Exit node: ensures it is dominated by the header Pre- header Body n1 n2 n3 Exit node nX Header Latch exit

Loop normalization in LLVM Pre-header llvm::Loop:getLoopPreheader() Header llvm::Loop::getHeader() Latch llvm::Loop::getLoopLatch() Exit llvm::Loop::getExitBlocks() Pre- header Body Exit node Header Latch opt -loop-simplify bitcode.bc -o normalized.bc

Further normalizations in LLVM Loop representation can be further normalized: loop-simplify normalize the shape of the loop What about definitions in a loop? Problem: updating code in loop might require to update code outside loops for keeping SSA Loop-closed SSA form: no var is used outside of the loop in that it is defined Keeping SSA form is expensive with loops lcssa insert phi instruction at loop boundaries for variables defined in a loop body and used outside Isolation between optimization performed in and out the loop Faster keeping the SSA form Propagation of code changes outside the loop blocked by phi instructions

Loop pass example in LLVM while (){ d = … } …... = d op... while (){ d = …... if (...){ d =... } …... = d op... while (){ d = …... if (...){ d2 =... } …... = d op... while (){ d = …... if (...){ d2 =... } d3=phi(d,d2) } …... = d op... while (){ d = …... if (...){ d2 =... } d3=phi(d,d2) } …... = d3 op... Changes to code outside our loop

Further normalizations in LLVM Loop representation can be further normalized: loop-simplify normalize the shape of the loop What about definitions in a loop? Problem: updating code in loop might require to update code outside loops for keeping SSA Loop-closed SSA form: no var is used outside of the loop in that it is defined Keeping SSA form is expensive with loops lcssa insert phi instruction at loop boundaries for variables defined in a loop body and used outside Isolation between optimization performed in and out the loop Faster keeping the SSA form Propagation of code changes outside the loop blocked by phi instructions

Loop-closed SSA form in LLVM opt -lcssa bitcode.bc -o transformed.bc llvm::Loop::isLCSSAForm(DT) formLCSSA(…)

Further normalizations in LLVM Last loop-related normalization: Induction variable normalization

Coding time