Lazy Code Motion, COMP 512, Rice University, Houston, Texas, Fall 2003. Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in COMP 512 at Rice University have explicit permission to make copies of these materials for their personal use.

This lecture draws on two papers:
"Lazy Code Motion," J. Knoop, O. Rüthing, & B. Steffen, in PLDI 1992
"A Variation of Knoop, Rüthing, and Steffen's Lazy Code Motion," K. Drechsler & M. Stadel, SIGPLAN Notices, 28(5), May 1993

Where are we? Machine-independent transformations:
Redundancy: redundancy elimination, partial redundancy elimination, consolidation
Code motion: loop-invariant code motion, consolidation, global scheduling [Click], constant propagation
Useless code: dead code elimination, partial dead-code elimination, constant propagation, algebraic identities
Create opportunities: reassociation, replication
Specialization: replication, strength reduction, constant propagation, method caching, heap → stack allocation, tail recursion elimination
* Today: lazy code motion

Redundant Expression
An expression is redundant at point p if, on every path to p,
1. it is evaluated before reaching p, and
2. none of its constituent values is redefined before p.
Example (some occurrences of b + c are redundant):
a ← b + c
b ← b + 1
a ← b + c

Partially Redundant Expression
An expression is partially redundant at p if it is redundant along some, but not all, paths reaching p.
Example: two paths join; one evaluates b + c, the other redefines b:
b ← b + 1        a ← b + c
        a ← b + c
Inserting a copy of "a ← b + c" after the definition of b can make the evaluation at the join fully redundant.

Loop-Invariant Expressions
Another example: loop-invariant expressions are partially redundant. They are redundant along the loop's back edge, but not on entry to the loop, so partial redundancy elimination performs code motion. The major part of the work is figuring out where to insert operations.
Example: the loop-invariant evaluations
x ← y * z
a ← b * c
inside a loop body are partially redundant at the top of that body.

Lazy Code Motion
The concept
Solve data-flow problems that reveal the limits of code motion
Compute INSERT & DELETE sets from the solutions
Make a linear pass over the code to rewrite it (using INSERT & DELETE)
The history
Partial redundancy elimination (Morel & Renvoise, CACM, 1979)
Improvements by Drechsler & Stadel; Joshi & Dhamdhere; Chow; Knoop, Rüthing & Steffen; Dhamdhere; Sorkin; …
All versions of PRE optimize placement and guarantee that no path is lengthened
LCM was invented by Knoop et al. in PLDI, 1992
We will look at a variation by Drechsler & Stadel, SIGPLAN Notices, 28(5), May 1993

Lazy Code Motion
The intuitions
Compute available expressions
Compute anticipable expressions
These lead to an earliest placement for each expression
Push expressions down the CFG until pushing them further would change the code's behavior
Assumptions
Uses a lexical notion of identity (not value identity)
ILOC-style code with an unlimited name space
Consistent, disciplined use of names:
- identical expressions define the same name
- no other expression defines that name
These two rules avoid copies; the result name serves as a proxy for the expression.

Lazy Code Motion
The name space
ri + rj → rk, always (hash to find k)
We can then refer to ri + rj by rk (bit-vector sets)
Variables must be set by copies:
- otherwise there is no consistent definition for a variable
- break the rule for this case, but require rsource < rdestination
- to achieve this, assign register names to variables first
Without this name space:
- LCM must insert copies to preserve redundant values
- LCM must compute its own map of expressions to unique ids
Digression in Chapter 5 of EaC: "The impact of naming"

Lazy Code Motion
Local predicates
DEF(b) contains expressions defined in b that survive to the end of b.
  e ∈ DEF(b) ⇒ evaluating e at the end of b produces the same value for e.
ANTLOC(b) contains expressions defined in b whose arguments are both upward exposed.
  e ∈ ANTLOC(b) ⇒ evaluating e at the start of b produces the same value for e.
NOTKILLED(b) contains those expressions whose arguments are not defined in b.
  e ∈ NOTKILLED(b) ⇒ evaluating e produces the same result at the start and end of b.
DEF and NOTKILLED are the same predicates used in AVAIL.
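The three local predicates can be computed with one forward and one backward scan of a block. A minimal Python sketch, assuming ops are (dest, arg1, op, arg2) tuples and expressions are keyed by (arg1, op, arg2); this encoding is illustrative, not the lecture's ILOC:

```python
def local_predicates(block, all_exprs):
    """Return (DEF, ANTLOC, NOTKILLED) for one basic block.

    block: list of (dest, arg1, op, arg2) three-address ops.
    all_exprs: universe of expression keys (arg1, op, arg2)."""
    antloc, def_b = set(), set()
    assigned = set()
    for dest, a1, op, a2 in block:            # forward: upward-exposed evals
        if a1 not in assigned and a2 not in assigned:
            antloc.add((a1, op, a2))          # both args unchanged since entry
        assigned.add(dest)
    redefined = set()
    for dest, a1, op, a2 in reversed(block):  # backward: evals surviving to exit
        redefined.add(dest)                   # this op kills dest from here down
        if a1 not in redefined and a2 not in redefined:
            def_b.add((a1, op, a2))           # args unchanged until block end
    killed = {dest for dest, _, _, _ in block}
    notkilled = {e for e in all_exprs
                 if e[0] not in killed and e[2] not in killed}
    return def_b, antloc, notkilled
```

On the earlier b + c example, ANTLOC picks up the first b + c and b + 1, while only the final b + c survives into DEF.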

Lazy Code Motion
Availability
AVAILIN(n) = ∩_{m ∈ preds(n)} AVAILOUT(m), n ≠ n0
AVAILOUT(m) = DEF(m) ∪ (AVAILIN(m) ∩ NOTKILLED(m))
Initialize AVAILIN(n) to the set of all names, except set AVAILIN(n0) to Ø.
Interpreting AVAIL:
e ∈ AVAILOUT(b) ⇒ evaluating e at the end of b produces the same value for e.
AVAILOUT tells the compiler how far forward e can move.
This differs from the way we talk about AVAIL in global redundancy elimination.
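The availability equations can be solved with a straightforward round-robin iterative solver. A sketch, assuming the CFG is a successor map and expressions are plain string keys (the names here are illustrative):

```python
def compute_avail(succs, entry, def_b, notkilled, all_exprs):
    """Solve AVAILIN/AVAILOUT to a fixed point (forward, meet = intersection)."""
    nodes = list(succs)
    preds = {n: [m for m in nodes if n in succs[m]] for n in nodes}
    avail_in = {n: set() for n in nodes}
    avail_out = {n: set(all_exprs) for n in nodes}   # optimistic initialization
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry or not preds[n]:
                avail_in[n] = set()                  # nothing available at entry
            else:
                acc = set(all_exprs)
                for m in preds[n]:                   # intersect over predecessors
                    acc &= avail_out[m]
                avail_in[n] = acc
            new_out = def_b[n] | (avail_in[n] & notkilled[n])
            if new_out != avail_out[n]:
                avail_out[n], changed = new_out, True
    return avail_in, avail_out
```

In a diamond where only one side kills the expression, it is available along the clean path but not at the join.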

Lazy Code Motion
Anticipability
ANTOUT(n) = ∩_{m ∈ succs(n)} ANTIN(m), n not an exit block
ANTIN(m) = ANTLOC(m) ∪ (ANTOUT(m) ∩ NOTKILLED(m))
Initialize ANTOUT(n) to the set of all names, except set ANTOUT(n) to Ø for each exit block n.
Interpreting ANTOUT:
e ∈ ANTIN(b) ⇒ evaluating e at the start of b produces the same value for e.
ANTIN tells the compiler how far backward e can move.
This view shows that anticipability is, in some sense, the inverse of availability (and explains the new interpretation of AVAIL).
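Anticipability is the backward analogue and can be solved the same way, intersecting over successors instead of predecessors. A sketch under the same illustrative encoding:

```python
def compute_ant(succs, exits, antloc, notkilled, all_exprs):
    """Solve ANTIN/ANTOUT to a fixed point (backward, meet = intersection)."""
    nodes = list(succs)
    ant_in = {n: set(all_exprs) for n in nodes}      # optimistic initialization
    ant_out = {n: set() for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n in exits or not succs[n]:
                ant_out[n] = set()                   # nothing anticipated at exit
            else:
                acc = set(all_exprs)
                for m in succs[n]:                   # intersect over successors
                    acc &= ant_in[m]
                ant_out[n] = acc
            new_in = antloc[n] | (ant_out[n] & notkilled[n])
            if new_in != ant_in[n]:
                ant_in[n], changed = new_in, True
    return ant_in, ant_out
```

If the join block of a diamond evaluates the expression and no block kills it, the expression is anticipable all the way back to the entry.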

Lazy Code Motion
Earliest placement
EARLIEST is a predicate, computed for edges rather than nodes (it is a placement).
e ∈ EARLIEST(i,j) if
- it can move to the head of j,
- it is not available at the end of i, and
- either it cannot move to the head of i (¬NOTKILLED(i)), or another edge leaving i prevents its placement in i (¬ANTOUT(i)).
EARLIEST(i,j) = ANTIN(j) ∩ ¬AVAILOUT(i) ∩ (¬NOTKILLED(i) ∪ ¬ANTOUT(i))
EARLIEST(n0,j) = ANTIN(j) ∩ ¬AVAILOUT(n0)
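In set form the complements become set differences against the universe of expressions. A sketch of EARLIEST for a single edge, under the same illustrative encoding:

```python
def earliest(i, j, entry, all_exprs, ant_in, avail_out, notkilled, ant_out):
    """EARLIEST(i, j): anticipated at the head of j, not available out of i,
    and blocked from moving into i (killed in i, or not anticipated out of i)."""
    e = ant_in[j] - avail_out[i]                 # ANTIN(j) minus AVAILOUT(i)
    if i == entry:
        return e                                 # no block above n0 to move into
    blocked = (all_exprs - notkilled[i]) | (all_exprs - ant_out[i])
    return e & blocked
```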

Lazy Code Motion
Later (than earliest) placement
LATERIN(j) = ∩_{i ∈ preds(j)} LATER(i,j), j ≠ n0
LATER(i,j) = EARLIEST(i,j) ∪ (LATERIN(i) ∩ ¬ANTLOC(i))
Initialize LATERIN(n0) to Ø.
x ∈ LATERIN(k) ⇒ every path that reaches k has x ∈ EARLIEST(m) for some block m, and the path from m to k is x-clear and does not evaluate x. Thus the compiler can move x through k without losing any benefit.
x ∈ LATER(i,j) ⇒ <i,j> is its earliest placement, or it can be moved forward from i (LATERIN(i)) and placement at the entry to i does not anticipate a use in i (moving it across the edge exposes that use).
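LATER and LATERIN form another forward fixed-point problem, this time over edges. A sketch, assuming EARLIEST has already been computed per edge (encoding illustrative, as before):

```python
def compute_later(nodes, edges, entry, earliest, antloc, all_exprs):
    """Solve LATER (per edge) and LATERIN (per node) to a fixed point."""
    preds = {n: [i for (i, j) in edges if j == n] for n in nodes}
    later_in = {n: set() if n == entry else set(all_exprs) for n in nodes}
    later = {edge: set(earliest[edge]) for edge in edges}
    changed = True
    while changed:
        changed = False
        for (i, j) in edges:
            new = earliest[(i, j)] | (later_in[i] - antloc[i])
            if new != later[(i, j)]:
                later[(i, j)], changed = new, True
        for n in nodes:
            if n == entry or not preds[n]:
                continue                          # LATERIN(n0) stays empty
            acc = set(all_exprs)
            for i in preds[n]:                    # intersect over incoming edges
                acc &= later[(i, n)]
            if acc != later_in[n]:
                later_in[n], changed = acc, True
    return later, later_in
```

In a simple chain, an evaluation whose earliest placement is the first edge can be pushed down to the last edge as long as the intermediate block neither uses nor kills it.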

Lazy Code Motion
Rewriting the code
INSERT(i,j) = LATER(i,j) ∩ ¬LATERIN(j)
DELETE(k) = ANTLOC(k) ∩ ¬LATERIN(k), k ≠ n0
INSERT & DELETE are predicates; the compiler uses them to guide the rewrite step.
x ∈ INSERT(i,j) ⇒ insert x at the end of i, at the start of j, or in a new block
x ∈ DELETE(k) ⇒ delete the first evaluation of x in k
If local redundancy elimination has already been performed, only one copy of x exists. Otherwise, remove all upward-exposed copies of x.
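Given LATER and LATERIN, the INSERT and DELETE sets fall out directly as set differences. A sketch under the same illustrative encoding:

```python
def rewrite_sets(edges, entry, later, later_in, antloc):
    """INSERT(i,j) = LATER(i,j) - LATERIN(j);  DELETE(k) = ANTLOC(k) - LATERIN(k)."""
    insert = {(i, j): later[(i, j)] - later_in[j] for (i, j) in edges}
    delete = {k: antloc[k] - later_in[k] for k in later_in if k != entry}
    return insert, delete
```

This mirrors the lecture's closing example: insert the loop-invariant evaluation on edge <1,2> and delete its upward-exposed copy in block 2.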

Lazy Code Motion
Edge placement
x ∈ INSERT(i,j) has three cases:
1. |succs(i)| = 1 ⇒ insert x at the end of i.
2. |succs(i)| > 1, but |preds(j)| = 1 ⇒ insert x at the start of j.
3. |succs(i)| > 1 and |preds(j)| > 1 ⇒ create a new block along the edge <i,j> for x.
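The three cases can be encoded as a small decision function; when both endpoints are shared (a critical edge), the edge must be split with a new block. A sketch (names illustrative):

```python
def placement_site(i, j, succs, preds):
    """Decide where an insertion on edge (i, j) physically lands."""
    if len(succs[i]) == 1:
        return ("end", i)            # i has one successor: append to i
    if len(preds[j]) == 1:
        return ("start", j)          # j has one predecessor: prepend to j
    return ("new-block", i, j)       # critical edge: split it with a new block
```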

Lazy Code Motion
Example
B1: r1 ← 1
    r2 ← r0 + @m
    if r1 < r2 → B2, B3
B2: …
    r20 ← r17 * r18
    r4 ← r1 + 1
    r1 ← r4
    if r1 < r2 → B2, B3
B3: …
The loop-invariant evaluation moves out of the loop: INSERT(1,2) = { r20 }, DELETE(2) = { r20 }. (The example is too small to show off LATER.)