Using SSA: Dead Code Elimination & Constant Propagation (COMP 512, Rice University, Fall 2003)

Presentation transcript:

Using SSA: Dead Code Elimination & Constant Propagation. COMP 512, Rice University, Houston, Texas, Fall 2003. Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use.

COMP 512, Fall 2003 – Using SSA: Dead Code Elimination

Dead code elimination is conceptually similar to mark-sweep garbage collection:
- Mark useful operations
- Everything not marked is useless

We need an efficient way to find and to mark useful operations:
- Start with critical operations
- Work back up SSA edges to find their antecedents

Define "critical": I/O statements, linkage code (entry & exit blocks), return values, and calls to other procedures.

The algorithm will use post-dominators & reverse dominance frontiers.

COMP 512, Fall 2003 – Using SSA: Dead Code Elimination

Mark
  for each op i
    clear i's mark
    if i is critical then
      mark i
      add i to WorkList
  while (WorkList ≠ Ø)
    remove i from WorkList    (i has the form "x ← y op z")
    if def(y) is not marked then
      mark def(y)
      add def(y) to WorkList
    if def(z) is not marked then
      mark def(z)
      add def(z) to WorkList
    for each b ∈ RDF(block(i))
      mark the block-ending branch in b
      add it to WorkList

Sweep
  for each op i
    if i is not marked then
      if i is a branch then
        rewrite i with a jump to i's nearest useful post-dominator
      if i is not a jump then
        delete i

Notes:
- Sweep eliminates some branches and reconnects the dead branches to the remaining live code
- Find the useful post-dominator by walking up the post-dominator tree
- Entry & exit nodes are always useful
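The Mark/Sweep scheme above can be sketched in Python. This is a minimal illustration, not the full algorithm: the `Op` class and the example program are invented for the sketch, and the branch/RDF handling is omitted, so only the backward walk over SSA edges from critical operations is shown.

```python
from collections import deque

# Toy SSA operation: result name, argument names, and a flag for "critical"
# (I/O, linkage code, return values, calls). The class and the example
# program below are invented for this sketch; branch/RDF handling from the
# full algorithm is omitted.
class Op:
    def __init__(self, result, args, critical=False):
        self.result, self.args, self.critical = result, args, critical

def dead_code_elim(ops):
    defs = {op.result: op for op in ops}          # SSA: one def per name
    marked = {id(op) for op in ops if op.critical}
    worklist = deque(op for op in ops if op.critical)
    # Mark: walk back up SSA edges from the critical operations
    while worklist:
        op = worklist.popleft()
        for arg in op.args:
            d = defs.get(arg)
            if d is not None and id(d) not in marked:
                marked.add(id(d))
                worklist.append(d)
    # Sweep: everything left unmarked is useless
    return [op for op in ops if id(op) in marked]

ops = [
    Op("a", []),                     # a <- ...
    Op("b", ["a"]),                  # b <- ... a ...
    Op("t", ["a"]),                  # t <- ... a ...   (never used: dead)
    Op("r", ["b"], critical=True),   # return b         (critical)
]
live = dead_code_elim(ops)
print([op.result for op in live])    # -> ['a', 'b', 'r']
```

The definition of `t` is swept away even though its operand is live, because no chain of SSA edges connects it to a critical operation.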

COMP 512, Fall 2003 – Using SSA: Dead Code Elimination

Handling branches: when is a branch useful? When a useful operation depends on its existence.

j control dependent on i ⇒ one path from i leads to j, and one doesn't. The set of such branch points is the reverse dominance frontier of j, RDF(j). The algorithm uses RDF(n) to mark branches as live.

In the CFG, j is control dependent on i if:
1. there exists a non-null path p from i to j such that j post-dominates every node on p after i, and
2. j does not strictly post-dominate i.
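The two-part definition above can be checked directly on a small graph. The sketch below is a brute-force illustration on a hypothetical diamond-shaped CFG (all names invented here); it relies on the standard observation that condition 1 holds exactly when j post-dominates some successor of i.

```python
# A hypothetical diamond CFG (all names invented for this sketch):
#   0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3; node 3 is the exit.
succ = {0: [1, 2], 1: [3], 2: [3], 3: []}
exit_node = 3

def paths_from(n):
    """All paths from n to the exit (fine for a tiny acyclic CFG)."""
    if n == exit_node:
        return [[n]]
    return [[n] + p for s in succ[n] for p in paths_from(s)]

def postdom(j, i):
    """j post-dominates i iff j lies on every path from i to the exit."""
    return all(j in p for p in paths_from(i))

def control_dependent(j, i):
    """The slide's definition: a path from i to j on which j post-dominates
    everything after i (equivalently, j post-dominates some successor of i),
    and j does not strictly post-dominate i."""
    if j != i and postdom(j, i):
        return False
    return any(postdom(j, s) for s in succ[i])

print(control_dependent(1, 0))  # True: the branch at 0 decides whether 1 runs
print(control_dependent(3, 0))  # False: 3 runs no matter which way 0 goes
```

Node 0 is exactly the reverse dominance frontier of nodes 1 and 2, which is why Mark treats the branch at 0 as live whenever either arm contains a useful operation.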

COMP 512, Fall 2003 – Using SSA: Dead Code Elimination

What's left? The algorithm eliminates useless definitions & some useless branches, but it leaves behind empty blocks & extraneous control flow.

Two more issues:
- Simplifying control flow
- Eliminating unreachable blocks
Both are CFG transformations (no need for SSA).

Algorithm from: Cytron, Ferrante, Rosen, Wegman, & Zadeck, "Efficiently Computing Static Single Assignment Form and the Control Dependence Graph," ACM TOPLAS 13(4), October 1991, with a correction due to Rob Shillner.

COMP 512, Fall 2003 – Constant Propagation

Safety
- Proves that a name always has a known value
- Specializes the code around that value
  - Moves some computations to compile time (⇒ code motion)
  - Exposes some unreachable blocks (⇒ dead code)

Opportunity
- A value ≠ ⊥ signifies an opportunity

Profitability
- Compile-time evaluation is cheaper than run-time evaluation
- Branch removal may lead to block coalescing (CLEAN)
  - If not, it still avoids the test & makes the branch predictable

COMP 512, Fall 2003 – Using SSA: Sparse Constant Propagation

∀ expression e:
  Value(e) ← TOP if its value is unknown,
             c_i if its value is known,
             BOT if its value is known to vary
WorkList ← Ø
∀ SSA edge s = <u,v>:
  if Value(u) ≠ TOP then add s to WorkList

while (WorkList ≠ Ø)
  remove s = <u,v> from WorkList
  let o be the operation that uses v    (i.e., o is "a ← b op v" or "a ← v op b")
  if Value(o) ≠ BOT then
    t ← result of evaluating o
    if t ≠ Value(o) then
      Value(o) ← t
      ∀ SSA edge <o,x>: add <o,x> to WorkList

Evaluating a φ-node: φ(x1, x2, x3, …, xn) is Value(x1) ∧ Value(x2) ∧ Value(x3) ∧ … ∧ Value(xn), where
  TOP ∧ x = x, ∀ x
  c_i ∧ c_j = c_i if c_i = c_j
  c_i ∧ c_j = BOT if c_i ≠ c_j
  BOT ∧ x = BOT, ∀ x

Performing ∧ only at φ-nodes gives the same result with fewer ∧ operations.
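The meet operator (∧) and φ-node evaluation above translate directly into code. A minimal sketch, with TOP and BOT as sentinels and an int standing for a known constant (the representation is chosen here for illustration):

```python
# Lattice values for the sketch: TOP = unknown, an int = known constant,
# BOT = known to vary. The sentinel names are chosen for illustration.
TOP, BOT = "TOP", "BOT"

def meet(a, b):
    """The slide's meet: TOP ∧ x = x; ci ∧ cj = ci if equal, else BOT;
    BOT ∧ x = BOT."""
    if a == TOP: return b
    if b == TOP: return a
    if a == BOT or b == BOT: return BOT
    return a if a == b else BOT

def eval_phi(args):
    """phi(x1, ..., xn) is the meet of the arguments' lattice values."""
    v = TOP
    for x in args:
        v = meet(v, x)
    return v

print(eval_phi([TOP, 12]))   # 12: an unknown path contributes nothing
print(eval_phi([12, 12]))    # 12: agreeing constants stay constant
print(eval_phi([12, 7]))     # BOT: disagreeing constants must vary
```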

COMP 512, Fall 2003 – Using SSA: Sparse Constant Propagation

How long does this algorithm take to halt?
- Initialization is two passes: |ops|, plus 2 × |ops| edges
- Value(x) can take on 3 kinds of values (TOP, c_i, BOT), so each use can be on the WorkList at most twice
  - ≤ 2 × |args| = 4 × |ops| evaluations, WorkList pushes & pops

As discussed in Lecture 7, this is an optimistic algorithm:
- Initialize all values to TOP, unless they are known constants
- Every value becomes BOT or c_i, unless its use is uninitialized

(The lattice: TOP above; the constants c_i, c_j, c_k, c_l, c_m, c_n, … in the middle, pairwise incomparable; BOT below.)

COMP 512, Fall 2003 – Sparse Conditional Constant: Optimism

This version of the algorithm is an optimistic formulation: it initializes all values to TOP. The prior version used ⊥ (implicitly).

Example:
  i0 ← 12
  while ( … )
    i1 ← φ(i0, i3)
    x  ← i1 * 17
    j  ← i1
    i2 ← …
    …
    i3 ← j

It is clear that i is always 12 at the definition of x.

Pessimistic initialization (to ⊥) leads to:
  i1 ← 12 ∧ ⊥ = ⊥
  x  ← ⊥ * 17 = ⊥
  j  ← ⊥
  i3 ← ⊥

Optimistic initialization (to TOP) leads to:
  i1 ← 12 ∧ TOP = 12
  x  ← 12 * 17 = 204
  j  ← 12
  i3 ← 12
  i1 ← 12 ∧ 12 = 12    (on the next visit, confirming the result)

In general, optimism helps inside loops; it is largely a matter of the initial value.

M.N. Wegman & F.K. Zadeck, "Constant Propagation with Conditional Branches," ACM TOPLAS 13(2), April 1991, pages 181–210.
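The effect of the initial value can be reproduced in a few lines: the sketch below (representation invented here) iterates the slide's φ-function i1 ← φ(i0, i3) to a fixed point from both starting assumptions.

```python
# The slide's loop reduced to its phi-function: i1 <- phi(i0, i3) with
# i0 = 12 and i3 <- j <- i1 around the back edge. Representation invented
# for this sketch.
TOP, BOT = "TOP", "BOT"

def meet(a, b):
    if a == TOP: return b
    if b == TOP: return a
    return a if a == b else BOT

def run(initial):
    i3 = initial                 # assumed back-edge value before iterating
    while True:                  # iterate the phi to a fixed point
        i1 = meet(12, i3)        # i1 <- phi(i0, i3)
        if i1 == i3:
            return i1
        i3 = i1                  # i3 <- j <- i1 around the loop

print(run(TOP))  # optimistic start: 12, so x <- i1 * 17 folds to 204
print(run(BOT))  # pessimistic start: BOT, so nothing can be folded
```

Both runs reach a fixed point; only the optimistic one reaches the best one.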

COMP 512, Fall 2003 – Sparse Constant Propagation

What happens when it propagates a value into a branch?
- TOP ⇒ we gain no knowledge
- BOT ⇒ either path can execute
- TRUE or FALSE ⇒ only one path can execute
But the algorithm, as given, does not use this...

Working it into the algorithm:
- Use two worklists: SSAWorkList & CFGWorkList
  - SSAWorkList determines values
  - CFGWorkList governs reachability
- Don't propagate into an operation until its block is reachable

COMP 512, Fall 2003 – Sparse Conditional Constant Propagation

Initialization step:
  SSAWorkList ← Ø
  CFGWorkList ← { n0 }
  ∀ block b
    clear b's mark
    ∀ expression e in b
      Value(e) ← TOP

Propagation step:
  while ((CFGWorkList ∪ SSAWorkList) ≠ Ø)
    while (CFGWorkList ≠ Ø)
      remove b from CFGWorkList
      mark b
      evaluate each φ-function in b
      evaluate each op in b, in order
    while (SSAWorkList ≠ Ø)
      remove s = <u,v> from SSAWorkList
      let o be the operation that contains v
      t ← result of evaluating o
      if t ≠ Value(o) then
        Value(o) ← t
        ∀ SSA edge <o,x>: if x's block is marked, then add <o,x> to SSAWorkList

To evaluate a branch:
  if arg is BOT then put both targets on CFGWorkList
  else if arg is TRUE then put the TRUE target on CFGWorkList
  else if arg is FALSE then put the FALSE target on CFGWorkList

To evaluate a jump, place its target on CFGWorkList.
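As a rough illustration of how block evaluation and the branch rules interact, here is a compact sketch of SCCP on the diamond from the unreachable-code slide (i ← 17; if i > 0 then j1 ← 10 else j2 ← 20; j3 ← φ(j1,j2); k ← j3 * 17). The toy IR is invented for this sketch, and because the example is acyclic with each block visited once, the SSA worklist is omitted; only the CFG worklist and the evaluation rules are shown.

```python
from collections import deque

TOP, BOT = "TOP", "BOT"

def meet(a, b):
    if a == TOP: return b
    if b == TOP: return a
    return a if a == b else BOT

# Toy IR (invented here). Ops: ("const", x, c), ("phi", x, [args]),
# ("mul", x, a, c). Terminators: ("br_gt0", x, t, f), ("jmp", t), ("exit",).
blocks = {
    "entry": ([("const", "i", 17)], ("br_gt0", "i", "then", "else")),
    "then":  ([("const", "j1", 10)], ("jmp", "join")),
    "else":  ([("const", "j2", 20)], ("jmp", "join")),
    "join":  ([("phi", "j3", ["j1", "j2"]), ("mul", "k", "j3", 17)], ("exit",)),
}

def sccp(blocks, entry="entry"):
    value = {}
    val = lambda x: value.get(x, TOP)       # everything starts at TOP
    reachable, cfg_wl = set(), deque([entry])
    while cfg_wl:                           # CFG worklist alone suffices on
        b = cfg_wl.popleft()                # this acyclic, one-visit example
        if b in reachable:
            continue
        reachable.add(b)
        ops, term = blocks[b]
        for op in ops:
            if op[0] == "const":
                value[op[1]] = op[2]
            elif op[0] == "phi":            # unreachable args stay TOP
                v = TOP
                for a in op[2]:
                    v = meet(v, val(a))
                value[op[1]] = v
            elif op[0] == "mul":
                a = val(op[2])
                value[op[1]] = a * op[3] if a not in (TOP, BOT) else a
        if term[0] == "br_gt0":             # propagate into the branch
            c = val(term[1])
            if c == BOT:                    # either path can execute
                cfg_wl.extend([term[2], term[3]])
            elif c != TOP:                  # only one path can execute
                cfg_wl.append(term[2] if c > 0 else term[3])
        elif term[0] == "jmp":
            cfg_wl.append(term[1])
    return value, reachable

value, reachable = sccp(blocks)
print(value["j3"], value["k"])   # -> 10 170
print("else" in reachable)       # -> False
```

Because the else block is never placed on the CFG worklist, j2 keeps the value TOP, and the φ-function evaluates to 10 rather than BOT.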

COMP 512, Fall 2003 – Sparse Conditional Constant Propagation

There are some subtle points:
- Branch conditions should not be TOP when evaluated
  - That would indicate an upwards-exposed use (no initial value)
  - It is hard to envision the compiler producing such code
- Initialize all operations to TOP
  - Block processing will fill in the non-TOP initial values
  - Unreachable paths contribute TOP to φ-functions
- The code shows CFG edges first, then SSA edges
  - They can be intermixed in arbitrary order (for correctness)
  - Taking CFG edges first may help with speed (a minor effect)

COMP 512, Fall 2003 – Sparse Conditional Constant Propagation

More subtle points:
- TOP * BOT → TOP
  - If the TOP later becomes 0, then 0 * BOT → 0
  - This prevents non-monotonic behavior for the result value; uses of the result value might otherwise go irretrievably to ⊥
  - Similar effects occur with any operation that has a "zero" element
- Some values reveal simplifications, rather than constants
  - BOT * c_i → BOT, but might turn into shifts & adds (c_i = 2, BOT ≥ 0)
  - Doing so removes commutativity (reassociation)
  - BOT ** 2 → BOT * BOT (vs. a series expansion or a call to a library routine)
- cbr TRUE → L1, L2 becomes br → L1
  - The method discovers this; it must rewrite the code, too!
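The first subtle point can be captured in a small evaluation rule for multiply. A sketch (names invented here) that checks the operation's "zero" before the lattice cases, so TOP * BOT stays TOP while 0 * BOT folds to 0:

```python
# Sketch of the multiply rule from this slide: keep TOP * BOT at TOP so
# that, if the TOP side later resolves to 0, the result can still become
# 0 * BOT = 0 rather than being stuck at BOT (preserving monotonicity).
TOP, BOT = "TOP", "BOT"

def lattice_mul(a, b):
    if a == 0 or b == 0:          # 0 is a "zero" of *: 0 * anything = 0
        return 0
    if a == TOP or b == TOP:      # includes TOP * BOT -> TOP
        return TOP
    if a == BOT or b == BOT:
        return BOT
    return a * b                  # two known constants fold

print(lattice_mul(TOP, BOT))  # TOP: may yet resolve to a constant
print(lattice_mul(0, BOT))    # 0: resolved without dropping to BOT
print(lattice_mul(6, 7))      # 42
```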

COMP 512, Fall 2003 – Sparse Conditional Constant: Unreachable Code

Example:
  i  ← 17
  if (i > 0) then
    j1 ← 10
  else
    j2 ← 20
  j3 ← φ(j1, j2)
  k  ← j3 * 17

DEAD alone assumes all paths execute: it cannot test (i > 0), so it marks j2 as useful. With SCC marking blocks, the else branch is unreachable; j2 contributes TOP to the φ-function, so j3 = 10. We cannot get this result any other way.

Optimism matters here, too: initialization to TOP is still important, because unreachable code keeps TOP, and ∧ with TOP has the desired result.

In general, combining two optimizations can lead to answers that cannot be produced by any combination of running them separately. This algorithm is one example of that general principle. Combining allocation & scheduling is another...

COMP 512, Fall 2003 – Using SSA

In general, using SSA leads to:
- Cleaner formulations
- Better results
- Faster algorithms

We've seen two SSA-based algorithms:
- Dead-code elimination
- Sparse conditional constant propagation

We'll see more … but not next time …