Global Redundancy Elimination: Computing Available Expressions COMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use.

Review
So far, we have seen
Local Value Numbering
  Finds redundancy, constants, & identities in a block
Superlocal Value Numbering
  Extends local value numbering to EBBs
  Used SSA-like name space to simplify bookkeeping
Dominator Value Numbering
  Extends scope to “almost” global (no back edges)
  Uses dominance information to handle join points in CFG
Today’s Lecture
Global Common Subexpression Elimination (GCSE)
  Applying global data-flow analysis to the problem
  Today’s lecture: computing AVAIL

The Idea
The evaluation of an expression e at point p is redundant if and only if every path from the procedure’s entry to p contains an evaluation of e, and the value(s) of e’s constituent subexpressions do not change between those earlier evaluations and p
  Evaluating e at p always produces the same value as those earlier evaluations
From the example in the last lecture:
  [Figure: blocks D, E, and F. D: u ← e + f. F: u2 ← φ(u0,u1); x ← e + f. The evaluation of e+f in F is redundant.]
The trick lies in finding these redundant subexpressions

Using Available Expressions for GCSE
The goal
  Find common subexpressions whose range spans basic blocks, and eliminate unnecessary re-evaluations
The mechanism
  Pose the problem as a system of simultaneous equations over the CFG of the code
  Solve the equations to produce a set for each CFG node that contains the names of every expression available on entry
  Use these sets, AVAIL(n), as the basis for redundancy elimination

The goal
  Find common subexpressions whose range spans basic blocks, and eliminate unnecessary re-evaluations
Safety
  x+y ∈ AVAIL(n) proves that the earlier value of x+y is the same
  Transformation must provide a name for each such value
  Several schemes for this mapping
Profitability
  Don’t add any evaluations
  Add some copy operations
  Copies are inexpensive
  Many copies coalesce away
  Copies can shrink or stretch live ranges

Computing Available Expressions
For each block b
  Let AVAIL(b) be the set of expressions available on entry to b
  Let EXPRKILL(b) be the set of expressions killed in b
  Let DEEXPR(b) be the set of downward exposed expressions
    x ∈ DEEXPR(b) ⇔ x is defined in b & not subsequently killed in b
Now, AVAIL(b) can be defined as:
  AVAIL(b) = ∩ x∈preds(b) ( DEEXPR(x) ∪ ( AVAIL(x) − EXPRKILL(x) ) )
  AVAIL(n0) = Ø
where preds(b) is the set of b’s predecessors in the CFG and n0 is the entry node of the CFG
This system of simultaneous equations forms a data-flow problem
  Solve it with a data-flow algorithm

The Big Picture
1. ∀ block b, compute AVAIL(b)
2. Assign unique global names to expressions in AVAIL(b)
3. ∀ block b, value number b starting with AVAIL(b)
To compute AVAIL(b):
  ∀ block b, compute DEEXPR(b) and EXPRKILL(b)

First step is to compute DEEXPR & EXPRKILL

assume a block b with operations o1, o2, …, ok

VARKILL ← Ø
DEEXPR(b) ← Ø
for i = k to 1                          (backward through block; O(k) steps)
    assume oi is “x ← y + z”
    add x to VARKILL
    if (y ∉ VARKILL) and (z ∉ VARKILL) then
        add “y + z” to DEEXPR(b)

EXPRKILL(b) ← Ø
for each expression e                   (O(N) steps; N is # of operations)
    for each variable v ∈ e
        if v ∈ VARKILL(b) then
            EXPRKILL(b) ← EXPRKILL(b) ∪ {e}

Many data-flow problems have initial information that costs less to compute
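The backward pass over a block can be sketched in Python as follows. This is a minimal illustration, not the course's implementation: the `Op` tuple, the `local_sets` function, and the `all_exprs` dictionary (expression name mapped to the set of operand names it uses) are hypothetical names introduced here, and every operation is assumed to have the form x ← y + z.

```python
from typing import NamedTuple


class Op(NamedTuple):
    """One operation 'target <- left + right' (hypothetical IR for this sketch)."""
    target: str
    left: str
    right: str

    @property
    def expr(self) -> str:
        # Expressions are identified by their spelling, e.g. "a+b".
        return f"{self.left}+{self.right}"


def local_sets(block, all_exprs):
    """Return (DEEXPR, EXPRKILL) for one block.

    all_exprs maps each expression name to the set of operand names it uses.
    """
    var_kill = set()
    de_expr = set()
    # Walk the block backward, as in the pseudocode above.
    for op in reversed(block):
        var_kill.add(op.target)
        # The expression is downward exposed only if neither operand is
        # redefined by this operation or by any later one.
        if op.left not in var_kill and op.right not in var_kill:
            de_expr.add(op.expr)
    # An expression is killed if the block defines any of its operands.
    expr_kill = {e for e, operands in all_exprs.items() if operands & var_kill}
    return de_expr, expr_kill
```

For block D of the running example (e ← b + 18; s ← a + b; u ← e + f), this yields DEEXPR(D) = {a+b, b+18, e+f} and EXPRKILL(D) = {e+f}. The expression e+f lands in both sets because e is redefined before e+f is evaluated.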

The worklist iterative algorithm

Worklist ← { all blocks, bi }
while (Worklist ≠ Ø)
    remove a block b from Worklist
    recompute AVAIL(b) as
        AVAIL(b) = ∩ x∈preds(b) ( DEEXPR(x) ∪ ( AVAIL(x) − EXPRKILL(x) ) )
    if AVAIL(b) changed then
        Worklist ← Worklist ∪ successors(b)

Finds fixed point solution to equation for AVAIL
That solution is unique
Identical to “meet over all paths” solution
How do we know these things? Today, trust me
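A runnable sketch of the worklist scheme is given below. The dictionary-based CFG encoding and the name `solve_avail` are assumptions made for illustration. The slide does not state how the sets are initialized; the sketch starts every non-entry block at the full expression set, which is the usual choice for an intersection-based problem because it lets the meet only shrink the sets.

```python
from collections import deque


def solve_avail(preds, succs, de_expr, expr_kill, entry, all_exprs):
    """Worklist solver for AVAIL over a CFG given as predecessor/successor dicts."""
    # Assumption: non-entry blocks start at "everything available".
    avail = {b: set(all_exprs) for b in preds}
    avail[entry] = set()                      # AVAIL(n0) = Ø

    worklist = deque(preds)                   # initially, all blocks
    queued = set(preds)
    while worklist:
        b = worklist.popleft()
        queued.discard(b)
        if b == entry:
            continue                          # the entry set is fixed at Ø
        # Meet (intersection) over the predecessors of b.
        new = set(all_exprs)
        for x in preds[b]:
            new &= de_expr[x] | (avail[x] - expr_kill[x])
        if new != avail[b]:
            avail[b] = new
            for s in succs[b]:
                if s not in queued:           # avoid duplicate worklist entries
                    worklist.append(s)
                    queued.add(s)
    return avail
```

On a hypothetical diamond CFG (n0 branches to n1 and n2, which join at n3) where n0 computes a+b, both n1 and n2 compute c+d, and n2 also redefines a, the solver reports that only c+d is available on entry to n3.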

Back to Our Example
[Figure: CFG with blocks A, B, C, D, E, F, G; AVAIL sets in blue]
  A: m ← a + b; n ← a + b
  B: p ← c + d; r ← c + d
  C: q ← a + b
  D: e ← b + 18; s ← a + b; u ← e + f
  E: e ← a + 17; t ← c + d
  F: v ← a + b; w ← c + d; x ← e + f
  G: y ← a + b; z ← c + d
AVAIL sets shown in the figure: { a+b }, { a+b }, { a+b,c+d }, { a+b,c+d }, { a+b,c+d,e+f }, { a+b,c+d }

Remember the Big Picture
1. ∀ block b, compute AVAIL(b)
2. Assign unique global names to expressions in AVAIL(b)
3. ∀ block b, value number b starting with AVAIL(b)
We’ve done step 1.

Global CSE (replacement step)
Managing the name space
Need a unique name ∀ e ∈ AVAIL(b)
1. Can generate them as replacements are done (Fortran H)
     Works well, but requires 2 passes (or a lot of walking around IR)
2. Can compute a static mapping (Common strategy)
     Fast, but limits replacement to textually identical expressions
3. Can encode value numbers into names (Briggs 94)
     Requires more analysis (VN), but yields more CSEs
Assume, w.l.o.g., solution 2

Compute a static mapping from expression to name
  After analysis & before transformation
  ∀ b, ∀ e ∈ AVAIL(b), assign e a global name by hashing on e
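One way to build such a mapping is a hash table keyed on the spelling of the expression. The sketch below uses a Python dictionary for that table; the function name `assign_global_names` and the t1, t2, … naming scheme are illustrative assumptions, and blocks and expressions are visited in sorted order only to make the numbering deterministic.

```python
def assign_global_names(avail):
    """Give every expression that appears in some AVAIL set one global name.

    avail maps a block label to its AVAIL set (expressions as strings).
    """
    names = {}                                # hash table: expression -> name
    for block in sorted(avail):
        for expr in sorted(avail[block]):
            if expr not in names:             # first time this spelling is seen
                names[expr] = f"t{len(names) + 1}"
    return names
```

Because the key is the text of the expression, two expressions receive the same name only when they are spelled identically, which is the limitation noted for strategy 2.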

Back to Our Example
Assigning unique names to global CSEs
  a+b ⇒ t1
  c+d ⇒ t2
  e+f ⇒ t3
[Figure: CFG with blocks A through G and their AVAIL sets]
  A: m ← a + b; n ← a + b
  B: p ← c + d; r ← c + d
  C: q ← a + b
  D: e ← b + 18; s ← a + b; u ← e + f
  E: e ← a + 17; t ← c + d
  F: v ← a + b; w ← c + d; x ← e + f
  G: y ← a + b; z ← c + d
AVAIL sets shown in the figure: { a+b }, { a+b }, { a+b,c+d }, { a+b,c+d }, { a+b,c+d,e+f }, { a+b,c+d }

Remember the Big Picture
1. ∀ block b, compute AVAIL(b)
2. Assign unique global names to expressions in AVAIL(b)
3. ∀ block b, value number b starting with AVAIL(b)
We’ve done steps 1 & 2.

Compute a static mapping from expression to name
  After analysis & before transformation
  ∀ b, ∀ e ∈ AVAIL(b), assign e a global name by hashing on e
During transformation step
  Evaluation of e ⇒ insert copy name(e) ← e
  Reference to e ⇒ replace e with name(e)
The major problem with this approach
  Inserts extraneous copies
  At all definitions and uses of any e ∈ AVAIL(b), ∀ b
  Those extra copies are dead and easy to remove
  The useful ones often coalesce away
Common strategy:
  Insert copies that might be useful
  Let DCE sort them out
  Simplifies design & implementation
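The transformation step for a single block might look like the sketch below. It is a simplified, purely lexical version (no value numbering), and the tuple encoding of operations, the `rewrite_block` name, and the textual output format are all assumptions made for illustration.

```python
def rewrite_block(ops, avail_in, names):
    """Rewrite one block using its AVAIL set and the global name mapping.

    ops is a list of (target, left, right) tuples meaning target <- left + right.
    Returns the rewritten operations as strings.
    """
    available = set(avail_in)      # named expressions whose temporary is current
    out = []
    for target, left, right in ops:
        expr = f"{left}+{right}"
        name = names.get(expr)     # None when expr has no global name
        if name is not None and expr in available:
            # Reference to an available expression: copy from its name.
            out.append(f"{target} <- {name}")
        else:
            # Evaluation: keep the operation, then copy the value to the name.
            out.append(f"{target} <- {left} + {right}")
            if name is not None:
                out.append(f"{name} <- {target}")
        # Redefining target kills every expression that uses it.
        available = {e for e in available if target not in e.split("+")}
        if name is not None and target not in (left, right):
            available.add(expr)
    return out
```

With the names a+b ⇒ t1, c+d ⇒ t2, e+f ⇒ t3 and an empty incoming set, a block containing m ← a + b; n ← a + b is rewritten to m ← a + b; t1 ← m; n ← t1. Copies that turn out to be unnecessary are left for a later dead-code or coalescing pass.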

An Aside on Dead Code Elimination
What does “dead” mean?
  Useless code: result is never used
  Unreachable code: code that cannot execute
  Both are lumped together as “dead”
To perform DCE
  Must have a global mechanism to recognize usefulness
  Must have a global mechanism to eliminate unneeded stores
  Must have a global mechanism to simplify control-flow predicates
All of these will come later in the course

Value Numbering
To perform replacement, we can value number each block b
  Initialize hash table with AVAIL(b)
  Replacing an expression in AVAIL(b) means copying from its name
  At each evaluation of a global name, copy new value to its name
  Otherwise, value number as in last two lectures
Net Result
  Catches local redundancies with value numbering
  Catches nonlocal redundancies because of AVAIL sets
  Not quite same effect, but close
    Local redundancies found by value
    Global redundancies found by spelling

Back to Our Example
After replacement & local value numbering
  A: m ← a + b; t1 ← m; n ← t1
  B: p ← c + d; t2 ← p; r ← t2
  C: q ← t1; r ← c + d; t2 ← r
  D: e ← b + 18; s ← t1; u ← e + f; t3 ← u
  E: e ← a + 17; t ← t2
  F: v ← t1; w ← t2; x ← t3
  G: y ← t1; z ← t2

Back to Our Example
  A: m ← a + b; t1 ← m; n ← t1
  B: p ← c + d; t2 ← p; r ← t2
  C: q ← t1; r ← c + d; t2 ← r
  D: e ← b + 18; s ← t1; u ← e + f; t3 ← u
  E: e ← a + 17; t ← t2
  F: v ← t1; w ← t2; x ← t3
  G: y ← t1; z ← t2
In practice, most of these copies will be folded into subsequent uses…
We leave copy folding to another pass where it can be done with appropriate tools (interference graph)

Some Copies Serve a Critical Purpose
In the example, all the copies coalesce away. Sometimes, the copies are needed.
[Figure: original code: w ← a + b | x ← a + b | y ← a + b. After replacement: w ← a + b; t1 ← w | x ← a + b; t1 ← x | y ← t1]
Copies into t1 create a common name along two paths
  Makes the replacement possible
  Cannot write “w or x”
Later uses of w or x may preclude their sharing storage

Back to Our Example
  A: m ← a + b; n ← a + b
  B: p ← c + d; r ← c + d
  C: q ← a + b
  D: e ← b + 18; s ← a + b; u ← e + f
  E: e ← a + 17; t ← c + d
  F: v ← a + b; w ← c + d; x ← e + f
  G: y ← a + b; z ← c + d
Labels in the figure mark which techniques find each redundancy: LVN; GRE, SVN; LVN; GRE, SVN; GRE, SVN; GRE, DVN; GRE, DVN; GRE; GRE, DVN; GRE
N.B.:
  SVN subsumes LVN
  DVN subsumes SVN
  GRE & xVN are not directly comparable
Example does not highlight value identity versus lexical identity

Next Class
Iterative data-flow analysis
  Does it halt?
  Does it produce the desired answer?
  How fast does it converge?
Implementation strategies

And that’s the end of my story ….
Extra Slides

Data-flow Analysis Definition
Data-flow analysis is a collection of techniques for compile-time reasoning about the run-time flow of values
Almost always involves building a graph
  Problems are trivial on a basic block
  Global problems ⇒ control-flow graph (or derivative)
  Whole program problems ⇒ call graph (or derivative)
Usually formulated as a set of simultaneous equations
  Sets attached to nodes and edges
  Lattice (or semilattice) to describe values
  We solved AVAIL with an iterative fixed-point algorithm
Desired result is usually meet over all paths solution
  “What is true on every path from the entry?”
  “Can this happen on any path from the entry?”
  Related to the safety of optimization

Data-flow Analysis Limitations
1. Precision – “up to symbolic execution”
     Assume all paths are taken
2. Solution – cannot afford to compute MOP solution
     Large class of problems where MOP = MFP = LFP
     Not all problems of interest are in this class
3. Arrays – treated naively in classical analysis
     Represent whole array with a single fact
4. Pointers – difficult (and expensive) to analyze
     Imprecision rapidly adds up
     Need to ask the right questions
Summary
  For scalar values, we can quickly solve simple problems
  Good news: Simple problems can carry us pretty far

Data-flow Analysis Semilattice
A semilattice is a set L and a meet operation ∧ such that, ∀ a, b, & c ∈ L :
  1. a ∧ a = a
  2. a ∧ b = b ∧ a
  3. a ∧ (b ∧ c) = (a ∧ b) ∧ c
∧ imposes an order on L, ∀ a, b, & c ∈ L :
  1. a ≥ b ⇔ a ∧ b = b
  2. a > b ⇔ a ≥ b and a ≠ b
A semilattice has a bottom element, denoted ⊥
  1. ∀ a ∈ L, ⊥ ∧ a = ⊥
  2. ∀ a ∈ L, a ≥ ⊥
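These axioms can be checked by brute force on a small finite example. The sketch below takes L to be the power set of a three-expression universe and the meet to be set intersection (the choice used for AVAIL); the helper names `powerset`, `is_meet_semilattice`, and `geq` are hypothetical.

```python
from itertools import chain, combinations


def powerset(universe):
    """All subsets of universe, as frozensets."""
    items = sorted(universe)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def is_meet_semilattice(L, meet, bottom):
    """Check idempotence, commutativity, associativity, and the bottom law."""
    idempotent = all(meet(a, a) == a for a in L)
    commutative = all(meet(a, b) == meet(b, a) for a in L for b in L)
    associative = all(
        meet(a, meet(b, c)) == meet(meet(a, b), c)
        for a in L for b in L for c in L
    )
    has_bottom = all(meet(bottom, a) == bottom for a in L)
    return idempotent and commutative and associative and has_bottom


def geq(a, b, meet):
    """The induced order: a >= b exactly when a meet b equals b."""
    return meet(a, b) == b
```

With intersection as the meet, the empty set is the bottom element and a ≥ b holds exactly when b is a subset of a. Swapping in union while keeping Ø as the bottom fails the bottom law, which shows the check is not vacuous.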

Data-flow Analysis How does this relate to data-flow analysis?
Choose a semilattice to represent the facts
  Attach a meaning to each a ∈ L
  Each a ∈ L is a distinct set of known facts
With each node n, associate a function fn : L → L
  fn models behavior of code in block corresponding to n
  Let F be the set of all functions that the code might generate
Example: AVAIL
  Semilattice is (2^E, ∧), where E is the set of all expressions & ∧ is ∩
  Sets are bigger than |variables|, ⊥ is Ø
  For a node n, fn has the form fn(x) = Dn ∪ (x ∩ Nn)
  where Dn is DEF(n) and Nn is NKILL(n)

Concrete Example: Available Expressions
E = {a+b,c+d,e+f,a+17,b+18}
2^E is the set of all subsets of E
[Figure: the example CFG]
  A: m ← a + b; n ← a + b
  B: p ← c + d; r ← c + d
  C: q ← a + b
  D: e ← b + 18; s ← a + b; u ← e + f
  E: e ← a + 17; t ← c + d
  F: v ← a + b; w ← c + d; x ← e + f
  G: y ← a + b; z ← c + d
2^E = [
  {a+b,c+d,e+f,a+17,b+18},
  {a+b,c+d,e+f,a+17}, {a+b,c+d,e+f,b+18}, {a+b,c+d,a+17,b+18}, {a+b,e+f,a+17,b+18}, {c+d,e+f,a+17,b+18},
  {a+b,c+d,e+f}, {a+b,c+d,b+18}, {a+b,c+d,a+17}, {a+b,e+f,a+17}, {a+b,e+f,b+18}, {a+b,a+17,b+18}, {c+d,e+f,a+17}, {c+d,e+f,b+18}, {c+d,a+17,b+18}, {e+f,a+17,b+18},
  {a+b,c+d}, {a+b,e+f}, {a+b,a+17}, {a+b,b+18}, {c+d,e+f}, {c+d,a+17}, {c+d,b+18}, {e+f,a+17}, {e+f,b+18}, {a+17,b+18},
  {a+b}, {c+d}, {e+f}, {a+17}, {b+18},
  {}
]

Concrete Example: Available Expressions
The Lattice
[Figure: the lattice over 2^E, drawn in levels by set size; legend: comparability (transitive), meet]
  { }
  {a+b} {c+d} {e+f} {a+17} {b+18}
  {a+b,c+d} {a+b,a+17} {c+d,e+f} {c+d,b+18} {e+f,b+18} {a+b,e+f} {a+b,b+18} {c+d,a+17} {e+f,a+17} {a+17,b+18}
  {a+b,c+d,e+f} {a+b,c+d,b+18} {a+b,c+d,a+17} {a+b,e+f,a+17} {a+b,e+f,b+18} {a+b,a+17,b+18} {c+d,e+f,a+17} {c+d,e+f,b+18} {c+d,a+17,b+18} {e+f,a+17,b+18}
  {a+b,c+d,e+f,a+17} {a+b,c+d,e+f,b+18} {a+b,c+d,a+17,b+18} {a+b,e+f,a+17,b+18} {c+d,e+f,a+17,b+18}
  {a+b,c+d,e+f,a+17,b+18}

 Lattice Theory This stuff is somewhat dry
Everybody stand up and stretch  COMP 512, Fall 2003

Data-flow Analysis
What does this have to do with the iterative algorithm?
We can use a lattice-theoretic formulation to prove
  Termination – it halts on an instance of AVAIL
  Correctness – it produces the desired result for AVAIL
  Complexity – it runs pretty quickly (d(CFG)+3 passes)

Worklist ← { all blocks, bi }
while (Worklist ≠ Ø)
    remove a block bi from Worklist
    recompute AVAIL(bi) as
        AVAIL(bi) = ∩ x∈preds(bi) ( DEF(x) ∪ ( AVAIL(x) ∩ NKILL(x) ) )
    if AVAIL(bi) changed then
        Worklist ← Worklist ∪ successors(bi)

Data-flow Analysis Termination
If every fn ∈ F is monotone, i.e., f(x∧y) ≤ f(x) ∧ f(y), and
If the lattice is bounded, i.e., every descending chain is finite
  Chain is a sequence x1, x2, …, xn where xi ∈ L, 1 ≤ i ≤ n
  xi > xi+1, 1 ≤ i < n ⇒ chain is descending
Then
  The iterative algorithm must halt on an instance of the problem
  Set at each block can only change a finite number of times
Any finite semilattice is bounded
Some infinite semilattices are bounded

Data-flow Analysis Correctness
Does the iterative algorithm compute the desired answer?
Admissible Function Spaces
  1. ∀ f ∈ F, ∀ x,y ∈ L, f(x∧y) = f(x) ∧ f(y)
  2. ∃ fi ∈ F such that ∀ x ∈ L, fi(x) = x
  3. f,g ∈ F ⇒ ∃ h ∈ F such that h(x) = f(g(x))
  4. ∀ x ∈ L, ∃ a finite subset H ⊆ F such that x = ∧ f∈H f(⊥)
If F meets these four conditions, then the problem (L,F,∧) has a unique fixed point solution
  ⇒ LFP = MFP = MOP
  ⇒ order of evaluation does not matter
Not distributive ⇒ answer may not be unique
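As a worked check, conditions 1 and 3 can be verified for AVAIL, assuming transfer functions of the form f(x) = D ∪ (x ∩ N) with meet ∧ = ∩. The first derivation uses only the distributivity of ∪ over ∩. The second shows that a composition again has the form "constant set ∪ (x ∩ constant set)", so it stays within the same family of functions.

```latex
\begin{align*}
f(x \wedge y) &= D \cup \bigl((x \cap y) \cap N\bigr) \\
              &= D \cup \bigl((x \cap N) \cap (y \cap N)\bigr) \\
              &= \bigl(D \cup (x \cap N)\bigr) \cap \bigl(D \cup (y \cap N)\bigr) \\
              &= f(x) \wedge f(y) \\[4pt]
f(g(x)) &= D_f \cup \Bigl(\bigl(D_g \cup (x \cap N_g)\bigr) \cap N_f\Bigr) \\
        &= \bigl(D_f \cup (D_g \cap N_f)\bigr) \cup \bigl(x \cap (N_g \cap N_f)\bigr)
\end{align*}
```

Condition 2 holds as well: taking D = Ø and N = E gives f(x) = x.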

Data-flow Analysis Complexity
For a problem with an admissible function space & a bounded semilattice,
If the functions all meet the rapid condition, i.e.,
  ∀ f,g ∈ F, ∀ x ∈ L, f(g(⊥)) ≥ g(⊥) ∧ f(x) ∧ x
then, a round-robin, reverse-postorder iterative algorithm will halt in d(G)+3 passes over a graph G
  Sets stabilize in two passes around a loop
  Each pass does O(E) meets & O(N) other operations
d(G) is the loop-connectedness of the graph w.r.t. a DFST
  Maximal number of back edges in an acyclic path
  Several studies suggest that, in practice, d(G) is small (<3)
  For most CFGs, d(G) is independent of the specific DFST

Data-flow Analysis
What does this mean?
Reverse postorder
  Number the nodes in a postorder traversal
  Reverse the order
Round-robin iterative algorithm
  Visit all the nodes in a consistent order (RPO)
  Do it again until the sets stop changing
So, these conditions are easily met
  Admissible framework, rapid function space
  Round-robin, reverse-postorder, iterative algorithm
The analysis runs in (effectively) linear time
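The round-robin scheme can be sketched as follows, again for AVAIL. The function names, the dictionary-based CFG encoding, and the choice to start non-entry blocks at the full expression set are assumptions for illustration. The depth-first walk is recursive, so this form suits small graphs only.

```python
def reverse_postorder(succs, entry):
    """Return the blocks reachable from entry in reverse postorder."""
    seen, post = set(), []

    def visit(n):
        seen.add(n)
        for s in succs[n]:
            if s not in seen:
                visit(s)
        post.append(n)            # a node is numbered after all its successors

    visit(entry)
    return post[::-1]


def round_robin_avail(preds, succs, de_expr, expr_kill, entry, all_exprs):
    """Sweep the blocks in RPO until no AVAIL set changes.

    Returns (avail, passes), where passes counts the sweeps performed.
    """
    order = reverse_postorder(succs, entry)
    # Assumption: non-entry blocks start at the full set of expressions.
    avail = {b: set(all_exprs) for b in order}
    avail[entry] = set()
    passes, changed = 0, True
    while changed:
        changed = False
        passes += 1
        for b in order:
            if b == entry:
                continue
            new = set(all_exprs)
            for x in preds[b]:
                if x in avail:    # ignore predecessors unreachable from entry
                    new &= de_expr[x] | (avail[x] - expr_kill[x])
            if new != avail[b]:
                avail[b] = new
                changed = True
    return avail, passes
```

On an acyclic graph every predecessor is visited before its successors, so the first sweep computes the final sets and a second sweep confirms that nothing changed.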

Data-flow Analysis How do we use these results?
Prove that data-flow framework is admissible & rapid
  It’s just algebra
  Most (but not all) global data-flow problems are rapid
  This is a property of F
Code up the iterative algorithm
  World’s simplest data-flow algorithm
  Other versions (worklist) have similar behavior
This lets us ignore most of the other data-flow algorithms in 512

EXTRA SLIDES START HERE

Managing the name space
Need a unique name ∀ e ∈ AVAIL(b)
1. Can generate them as replacements are done (Fortran H)
2. Can compute a static mapping
3. Can encode value numbers into names (Briggs 94)
Strategy
1. This works; it is the classic method
2. Fast, but limits replacement to textually identical expressions
3. Requires more analysis (VN), but yields more CSEs
Assume, w.l.o.g., solution 2

The Big Picture
1. Build a control-flow graph
2. Gather the initial (local) data: DEF(b) & NKILL(b)
3. Propagate information around the graph, evaluating the equation
4. Post-process the information to make it useful (if needed)
All data-flow problems are solved, essentially, this way
Next lecture: Iterative computation of AVAIL information
From Chapter 8 of EaC

Example
[Figure: CFG with blocks A, B, C, D, E, F, G]
  A: m ← a + b; n ← a + b
  B: p ← c + d; r ← c + d
  C: q ← a + b
  D: e ← b + 18; s ← a + b; u ← e + f
  E: e ← a + 17; t ← c + d
  F: v ← a + b; w ← c + d; x ← e + f
  G: y ← a + b; z ← c + d

Back to Our Example
[Figure: the example CFG, AVAIL sets in blue]
  A: m ← a + b; n ← a + b
  B: p ← c + d; r ← c + d
  C: q ← a + b
  D: e ← b + 18; s ← a + b; u ← e + f
  E: e ← a + 17; t ← c + d
  F: v ← a + b; w ← c + d; x ← e + f
  G: y ← a + b; z ← c + d
AVAIL sets shown in the figure: { a+b }, { a+b }, { a+b,c+d }, { a+b,c+d }, { a+b,c+d,e+f }, { a+b,c+d }
Labels in the figure mark which techniques find each redundancy: LVN; GRE, SVN; LVN; GRE, SVN; GRE, SVN; GRE, DVN; GRE, DVN; GRE; GRE, DVN; GRE
Example does not highlight value identity versus lexical identity
