Fall 2016-2017 Compiler Principles Lecture 8: Loop Optimizations


Fall 2016-2017 Compiler Principles Lecture 8: Loop Optimizations Roman Manevich Ben-Gurion University of the Negev

Tentative syllabus: Front End (Scanning, Top-down Parsing (LL), Bottom-up Parsing (LR)); Intermediate Representation (Operational Semantics, Lowering); [mid-term exam]; Optimizations (Dataflow Analysis, Loop Optimizations); Code Generation (Register Allocation, Energy Optimization)

Previously Dataflow analysis framework: join semilattices, chaotic iteration, monotone transformers. More optimizations. Reminder: Available expressions → common sub-expression elimination + copy propagation; Constant propagation → constant folding

Agenda Loop optimizations: Reaching definitions → loop-invariant code motion + strength reduction via induction variables. Dataflow summary

Loop optimizations Most of a program's computations are done inside loops, so we focus optimization effort on loops. The optimizations we've seen so far are independent of the control structure. Some optimizations are specialized to loops: Loop-invariant code motion; Strength reduction via induction variables. These require another type of analysis to find out where expressions get their values from: Reaching definitions

Loop invariant computation y := … t := … z := … start y := t * 4 x < y + z x := x + 1 end

Loop invariant computation y := … t := … z := … start t*4 and y+z have same value on each iteration y := t * 4 x < y + z x := x + 1 end

Code hoisting Only if y is not used before defined inside the loop y := … t := … z := … y := t * 4 w := y + z start x < w x := x + 1 end

What reasoning did we use? Both t and z are defined only outside of loop y := … t := … z := … start constants are trivially loop-invariant y := t * 4 x < y + z y is defined inside loop but it is loop invariant since t*4 is loop-invariant x := x + 1 end

What about now? y := … t := … z := … start Now t is not loop-invariant, and neither are t*4 and y y := t * 4 x < y + z x := x + 1 t := t + 1 end

Loop-invariant code motion d: t := a1 op a2 a1 op a2 loop-invariant (for a loop L) if computes the same value in each iteration Hard to know in general Conservative approximation Each ai is either: A constant, or All definitions of ai that reach d are outside L, or Only one definition of ai reaches d, and is loop-invariant itself Transformation: hoist the loop-invariant code outside of the loop

Reaching definitions analysis

Reaching definitions analysis Definition: An assignment d: t := R reaches a program location if there is a path from the definition to the program location, along which the defined variable is never redefined Let’s define the corresponding dataflow analysis

Reaching definitions analysis Definition: An assignment d: t := R reaches a program location if there is a path from the definition to the program location, along which the defined variable is never redefined Direction: ? Domain: ? Join operator: ? Transfer function: fd: x:=R(RD) = ? fd: not-a-def(RD) = ? Where defs(a) is the set of locations defining a (statements of the form a=...) Initial value: ?

Reaching definitions analysis Definition: An assignment d: t := R reaches a program location if there is a path from the definition to the program location, along which the defined variable is never redefined Direction: Forward Domain: sets of program locations that are definitions Join operator: union Transfer function: F[d: x:=R](RD) = (RD \ defs(x, P)) ∪ {d} F[d: not-a-def](RD) = RD Where defs(x, P) is the set of labels defining x (statements of the form x:=R) in P Initial value: {}
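A minimal Python sketch of this analysis (the CFG encoding and names are mine, not the slides'): chaotic iteration with union as join and the transfer function above, run on the running example d1..d5:

```python
def reaching_definitions(nodes, edges, defs_of, var_of):
    """Chaotic iteration for reaching definitions.
    nodes: CFG labels; edges: (src, dst) pairs; defs_of[v]: labels defining v;
    var_of[n]: variable defined at label n, or None for non-definitions."""
    IN = {n: set() for n in nodes}
    OUT = {n: set() for n in nodes}
    changed = True
    while changed:                      # iterate until the least fixed point
        changed = False
        for n in nodes:
            new_in = set()
            for src, dst in edges:      # join = union over predecessors
                if dst == n:
                    new_in |= OUT[src]
            v = var_of[n]
            if v is None:               # not-a-def: identity transfer
                new_out = set(new_in)
            else:                       # kill other defs of v, gen this one
                new_out = (new_in - defs_of[v]) | {n}
            if new_in != IN[n] or new_out != OUT[n]:
                IN[n], OUT[n] = new_in, new_out
                changed = True
    return IN, OUT

# The running example: d1: y:=…, d2: t:=…, d3: z:=…, loop head d4: y:=t*4,
# condition "cond": x < y+z, and a back edge from d5: x:=x+1.
nodes = ['d1', 'd2', 'd3', 'd4', 'cond', 'd5']
edges = [('d1', 'd2'), ('d2', 'd3'), ('d3', 'd4'),
         ('d4', 'cond'), ('cond', 'd5'), ('d5', 'd4')]
var_of = {'d1': 'y', 'd2': 't', 'd3': 'z', 'd4': 'y', 'cond': None, 'd5': 'x'}
defs_of = {'y': {'d1', 'd4'}, 't': {'d2'}, 'z': {'d3'}, 'x': {'d5'}}
IN, OUT = reaching_definitions(nodes, edges, defs_of, var_of)
```

This reproduces the sets from the iteration slides that follow, e.g. IN[d4] = {d1, d2, d3, d4, d5} and OUT[d4] = {d2, d3, d4, d5}: d1's definition of y is killed by d4.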

Reaching definitions analysis d1: y := … d2: t := … d3: z := … start d4: y := t * 4 x < y + z end {} d5: x := x + 1

Initialization d1: y := … d2: t := … d3: z := … start {} {} d4: y := t * 4 x < y + z end {} {} d5: x := x + 1 {}

Iteration 1 d1: y := … d2: t := … d3: z := … {} start {} {} d4: y := t * 4 x < y + z end {} {} d5: x := x + 1 {}

Iteration 1 d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} d4: y := t * 4 x < y + z end {} {} d5: x := x + 1 {}

Iteration 2 d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} d4: y := t * 4 x < y + z end {} {} d5: x := x + 1 {}

Iteration 2 d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} {d1, d2, d3} d4: y := t * 4 x < y + z end {} {} d5: x := x + 1 {}

Iteration 2 d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} {d1, d2, d3} d4: y := t * 4 x < y + z end {} {d2, d3, d4} {} d5: x := x + 1 {}

Iteration 2 d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} {d1, d2, d3} d4: y := t * 4 x < y + z end {} {d2, d3, d4} {d2, d3, d4} d5: x := x + 1 {}

Iteration 3 d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} {d1, d2, d3} d4: y := t * 4 x < y + z end {} {d2, d3, d4} {d2, d3, d4} d5: x := x + 1 {d2, d3, d4} {}

Iteration 3 d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} {d1, d2, d3} d4: y := t * 4 x < y + z end {} {d2, d3, d4} {d2, d3, d4} d5: x := x + 1 {d2, d3, d4} {d2, d3, d4, d5}

Iteration 4 d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} {d1, d2, d3} d4: y := t * 4 x < y + z end {} {d2, d3, d4} {d2, d3, d4} d5: x := x + 1 {d2, d3, d4} {d2, d3, d4, d5}

Iteration 4 d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} {d1, d2, d3, d4, d5} d4: y := t * 4 x < y + z end {} {d2, d3, d4} {d2, d3, d4} d5: x := x + 1 {d2, d3, d4} {d2, d3, d4, d5}

Iteration 4 d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} {d1, d2, d3, d4, d5} d4: y := t * 4 x < y + z end {} {d2, d3, d4, d5} {d2, d3, d4, d5} d5: x := x + 1 {d2, d3, d4} {d2, d3, d4, d5}

Iteration 5 d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} {d1, d2, d3, d4, d5} d4: y := t * 4 x < y + z {d2, d3, d4, d5} end {d2, d3, d4, d5} {d2, d3, d4, d5} d5: x := x + 1 {d2, d3, d4} {d2, d3, d4, d5}

Iteration 6 d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} {d1, d2, d3, d4, d5} d4: y := t * 4 x < y + z {d2, d3, d4, d5} end {d2, d3, d4, d5} {d2, d3, d4, d5} {d2, d3, d4, d5} d5: x := x + 1 {d2, d3, d4, d5}

Which expressions are loop invariant? d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} t is defined only in d2 – outside of loop {d1, d2, d3} y is defined only in d4 – inside of loop, but depends on t and 4, both loop-invariant {d1, d2, d3, d4, d5} d4: y := t * 4 x < y + z {d2, d3, d4, d5} end {d2, d3, d4, d5} x is defined only in d5 – inside of loop, so it is not loop-invariant {d2, d3, d4, d5} z is defined only in d3 – outside of loop {d2, d3, d4, d5} d5: x := x + 1 {d2, d3, d4, d5}

When is a transformation allowed? Hoisting an assignment out of the loop body: allowed as long as it doesn't change the set of reaching definitions inside the loop body. Adding an assignment to a temporary: allowed only if all the reaching definitions of the variables involved are outside the loop body

Inferring loop invariants

Inferring loop-invariant expressions For a command c of the form t := a1 op a2: A variable ai is immediately loop-invariant if all reaching definitions IN[c]={d1,…,dk} for ai are outside of the loop. LOOP-INV = immediately loop-invariant variables and constants. LOOP-INV = LOOP-INV ∪ {x | d: x := a1 op a2, d is in the loop, and both a1 and a2 are in LOOP-INV}. Iterate until a fixed point. An expression is loop-invariant if all of its operands are loop-invariant
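The fixed-point computation above can be sketched in Python (the data encoding and names are mine); operands are variable names or integer constants:

```python
def compute_loop_inv(loop_cmds, rd_in, defs_of, loop_labels):
    """Variables defined inside the loop by loop-invariant commands.
    loop_cmds: {label: (target, operands)} for commands inside the loop;
    rd_in[label]: definitions reaching the command; defs_of[v]: labels defining v."""
    def immediately_invariant(lbl, a):
        if isinstance(a, int):                   # constants are trivially invariant
            return True
        reaching = rd_in[lbl] & defs_of.get(a, set())
        return reaching.isdisjoint(loop_labels)  # all reaching defs outside loop

    loop_inv = set()
    changed = True
    while changed:                               # iterate until fixed point
        changed = False
        for lbl, (target, operands) in loop_cmds.items():
            if target not in loop_inv and all(
                    immediately_invariant(lbl, a) or a in loop_inv
                    for a in operands):
                loop_inv.add(target)
                changed = True
    return loop_inv

# Running example: d4: y := t * 4 and d5: x := x + 1 are inside the loop.
loop_labels = {'d4', 'd5'}
loop_cmds = {'d4': ('y', ['t', 4]), 'd5': ('x', ['x', 1])}
rd_in = {'d4': {'d1', 'd2', 'd3', 'd4', 'd5'}, 'd5': {'d2', 'd3', 'd4', 'd5'}}
defs_of = {'y': {'d1', 'd4'}, 't': {'d2'}, 'z': {'d3'}, 'x': {'d5'}}
inv = compute_loop_inv(loop_cmds, rd_in, defs_of, loop_labels)
```

Here y is found loop-invariant (t is defined only at d2, outside the loop, and 4 is a constant), while x is not, since its operand x has an in-loop reaching definition (d5). This sketch covers the first two operand cases only; a fuller implementation would also handle the third case (a single in-loop reaching definition that is itself loop-invariant).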

Computing LOOP-INV d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} {d1, d2, d3, d4, d5} d4: y := t * 4 x < y + z {d2, d3, d4} end {d2, d3, d4, d5} {d2, d3, d4, d5} {d2, d3, d4, d5} d5: x := x + 1 {d2, d3, d4, d5}

Computing LOOP-INV d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} (immediately) LOOP-INV = {t} {d1, d2, d3, d4, d5} d4: y := t * 4 x < y + z {d2, d3, d4} end {d2, d3, d4, d5} {d2, d3, d4, d5} {d2, d3, d4, d5} d5: x := x + 1 {d2, d3, d4, d5}

Computing LOOP-INV d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} (immediately) LOOP-INV = {t, z} {d1, d2, d3, d4, d5} d4: y := t * 4 x < y + z {d2, d3, d4} end {d2, d3, d4, d5} {d2, d3, d4, d5} {d2, d3, d4, d5} d5: x := x + 1 {d2, d3, d4, d5}

Computing LOOP-INV d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} LOOP-INV = {t, z, 4} {d1, d2, d3, d4, d5} d4: y := t * 4 x < y + z {d2, d3, d4} end {d2, d3, d4, d5} {d2, d3, d4, d5} {d2, d3, d4, d5} d5: x := x + 1 {d2, d3, d4, d5}

Computing LOOP-INV d1: y := … d2: t := … d3: z := … {} {d1} start {} {d1, d2} {d1, d2, d3} LOOP-INV = {t, z, 4, y} {d1, d2, d3, d4, d5} d4: y := t * 4 x < y + z Can move outside of loop only if every command in the loop sees only this definition (and not d1’s) {d2, d3, d4} end {d2, d3, d4, d5} {d2, d3, d4, d5} {d2, d3, d4, d5} d5: x := x + 1 {d2, d3, d4, d5}

Strength reduction via induction variables

Induction variables

while (i < x) {
  j := a + b * i
  a[j] := j
  i := i + 1
}

i is incremented by a loop-invariant expression on each iteration – this is called an induction variable (i must receive its definition from the increment assignment). j is a linear function of the induction variable i, with base a and multiplier b.

Strength reduction

j := a + b * (i-1)    // prepare initial value
while (i < x) {
  j := j + b          // increment by the multiplier
  a[j] := j
  i := i + 1
}
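The transformation can be checked concretely; this Python pair (mine, not from the slides) models the array as a dict and compares the two loops:

```python
def original(a, b, i, x):
    arr = {}
    while i < x:
        j = a + b * i        # multiplication on every iteration
        arr[j] = j
        i = i + 1
    return arr

def strength_reduced(a, b, i, x):
    arr = {}
    j = a + b * (i - 1)      # prepare initial value before the loop
    while i < x:
        j = j + b            # increment by the multiplier instead of multiplying
        arr[j] = j
        i = i + 1
    return arr
```

Both loops store the same values at the same indices; the reduced version has replaced the per-iteration multiplication with an addition.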

Dataflow summary

Summary of optimizations All of the analyses below use monotone transformers.

Analysis               | May/Must | Enabled optimizations
Available Expressions  | Must     | common-subexpression elimination, copy propagation
Constant Propagation   | Must     | constant folding
Live Variables         | May      | dead code elimination
Reaching Definitions   | May      | loop-invariant code motion, induction variable strength reduction

Analysis direction Forward: in the CFG, information is propagated from the predecessors of a statement to the statement itself; properties depend on past computations. Examples: available expressions, constant propagation, reaching definitions. Backward: information is propagated from the successors of a statement; properties depend on future computations. Example: liveness analysis

Types of dataflow analysis May vs. Must Must analysis – properties hold on all paths (join is set intersection) Examples: available expressions, constant propagation May analysis – properties hold on some path (join is set union) Examples: liveness, reaching definitions

Strong liveness

The fixed-point of DCE DCE = dead code elimination. Observation: DCE(P) does not always equal DCE(DCE(P)); that is, DCE is not an idempotent transformation
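To see the non-idempotence concretely, here is a Python sketch (the statement encoding is mine) of liveness-based DCE over straight-line code, iterated to a fixed point; each pass computes liveness on the current code and then drops assignments to dead targets:

```python
def liveness(code, live_end):
    """Backward liveness over straight-line code.
    code: list of (target, operands); a None target models a call like Foo(b, d).
    Returns the live-out set of each statement."""
    live = set(live_end)
    outs = [None] * len(code)
    for k in range(len(code) - 1, -1, -1):
        target, operands = code[k]
        outs[k] = set(live)
        if target is not None:
            live.discard(target)     # kill the defined variable
        live |= set(operands)        # gen the used variables
    return outs

def dce_once(code, live_end):
    """One DCE pass: analyze first, then drop assignments to dead targets."""
    outs = liveness(code, live_end)
    return [stmt for stmt, out in zip(code, outs)
            if stmt[0] is None or stmt[0] in out]

def dce_fixed_point(code, live_end):
    while True:                      # DCE is not idempotent: repeat until stable
        new = dce_once(code, live_end)
        if new == code:
            return code
        code = new

# The example from the liveness slides: Foo(b, d) is encoded as (None, ['b', 'd']).
prog = [('a', ['b']), ('c', ['a']), ('d', ['a', 'b']), ('e', ['d']),
        ('d', ['a']), ('f', ['e']), (None, ['b', 'd'])]
result = dce_fixed_point(prog, set())
```

One pass removes only c := a and f := e; repeated passes then remove e := d and d := a + b, converging to a := b; d := a; Foo(b, d), matching the rounds shown on the following slides.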

Liveness example Which statements are dead? { b } a := b; { a, b } c := a; { a, b } d := a + b; { a, b, d } e := d; { a, b, e } d := a; { b, d, e } f := e; { b, d } Foo(b, d) { }

Dead code elimination Can we find more dead assignments now? { b } a := b; { a, b } c := a; { a, b } d := a + b; { a, b, d } e := d; { a, b, e } d := a; { b, d, e } f := e; { b, d } Foo(b, d) { }

Second liveness analysis { b } a := b; { a, b } Which statements are dead? d := a + b; { a, b, d } e := d; { a, b } d := a; { b, d } Foo(b, d) { }

Second dead code elimination { b } a := b; { a, b } Can we find more dead assignments now? d := a + b; { a, b, d } e := d; { a, b } d := a; { b, d } Foo(b, d) { }

Third liveness analysis { b } a := b; { a, b } Which statements are dead? d := a + b; { a, b } d := a; { b, d } Foo(b, d) { }

Third dead code elimination { b } a := b; { a, b } Can we find more dead assignments now? d := a + b; { a, b } d := a; { b, d } Foo(b, d) { }

Fourth liveness analysis { b } a := b; No more dead statements { a, b } d := a; { b, d } Foo(b, d) { }

Loop example { } IfZ x Goto L1 { } Start { } x := x - 1; { } d := d + 1; { } { } L1: Ret x End

Strong liveness Definition: a variable v is strongly live at label L if … What are the analysis parameters? SDCE = Strong Dead Code Elimination = DCE(StrongLiveness(P))

Strong liveness transformer Reminder: the liveness analysis transfer function for an assignment x:=y+z is F[x:=y+z](LVout) = (LVout \ {x}) ∪ {y, z} Definition: a variable v is strongly live at label L if … The strong liveness analysis transfer function for an assignment x:=y+z is F[x:=y+z](LVout) = (LVout \ {x}) ∪ {y, z | x ∈ LVout} Is it monotone?

Strong liveness transformer Reminder: the liveness analysis transfer function for an assignment x:=y+z is F[x:=y+z](LVout) = (LVout \ {x}) ∪ {y, z} Definition: a variable v is strongly live at label L if it is used as an argument to a procedure, in a return command, or in a condition, or it is used, before being reassigned, in an assignment to a strongly live variable. The strong liveness analysis transfer function for an assignment x:=y+z is F[x:=y+z](LVout) = (LVout \ {x}) ∪ {y, z | x ∈ LVout} Is it monotone?
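A Python sketch of the strong-liveness transformer (the encoding is mine), which reaches in a single elimination pass the result that plain liveness-based DCE needed several rounds for:

```python
def strong_liveness(code, live_end):
    """Backward strong liveness over straight-line code.
    code: list of (target, operands); a None target models calls, returns,
    and conditions, whose operands are always strongly live."""
    live = set(live_end)
    outs = [None] * len(code)
    for k in range(len(code) - 1, -1, -1):
        target, operands = code[k]
        outs[k] = set(live)
        if target is None:            # procedure argument / return / condition
            live |= set(operands)
        elif target in live:          # F(LV) = (LV \ {x}) ∪ {y, z | x ∈ LV}
            live.discard(target)
            live |= set(operands)
        # otherwise the target is dead and the operands contribute nothing
    return outs

def sdce(code, live_end):
    """Strong dead code elimination: one analysis, one sweep."""
    outs = strong_liveness(code, live_end)
    return [stmt for stmt, out in zip(code, outs)
            if stmt[0] is None or stmt[0] in out]

# The running example, with Foo(b, d) encoded as (None, ['b', 'd']).
prog = [('a', ['b']), ('c', ['a']), ('d', ['a', 'b']), ('e', ['d']),
        ('d', ['a']), ('f', ['e']), (None, ['b', 'd'])]
pruned = sdce(prog, set())
```

On this example the live-out sets match the next slide ({a, b} after a := b, {b, d} after d := a), and a single sweep leaves a := b; d := a; Foo(b, d).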

Compute strong liveness { b } a := b; { a, b } c := a; { a, b } Which statements are dead? d := a + b; { a, b } e := d; { a, b } d := a; { b, d } f := e; { b, d } Foo(b, d) { }

Eliminate dead code { b } a := b; { a, b } c := a; { a, b } d := a + b; { a, b } e := d; { a, b } d := a; { b, d } f := e; { b, d } Foo(b, d) { }

Eliminate dead code { b } a := b; { a, b } d := a; { b, d } Foo(b, d) { }

Apply SDCE { } IfZ x Goto L1 { } Start { } x := x - 1; { } d := d + 1; { } { } L1: Ret x End

Composing analyses

Accelerating copy propagation Can we compute CP* – the fixed point of applying copy propagation? Yes: see the question in the exam of 2016-B. What about the fixed point of CF+CP+CSE+DCE? There is an algorithm by Sethi and Aho, but it is limited to basic blocks. Is it extensible to CFGs? An open problem

Next lecture: Register allocation