Download presentation
Presentation is loading. Please wait.
Published byNiko Haavisto Modified over 6 years ago
1
Optimization through Redundancy Elimination: Value Numbering at Different Scopes COMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use.
2
(need stronger methods)
Value Numbering m a + b n a + b A p c + d r c + d B y a + b z c + d G q a + b C e b + 18 s a + b u e + f D e a + 17 t c + d E v a + b w c + d x e + f F Missed opportunities (need stronger methods) Local Value Numbering 1 block at a time Strong local results No cross-block effects COMP 512, Fall 2003 *
3
An Aside on Terminology
Control-flow graph (CFG) Nodes for basic blocks Edges for branches Basis for much of program analysis & transformation m a + b n a + b A p c + d r c + d B y a + b z c + d G q a + b C e b + 18 s a + b u e + f D e a + 17 t c + d E v a + b w c + d x e + f F This CFG, G = (N,E) N = {A,B,C,D,E,F,G} E = {(A,B),(A,C),(B,G),(C,D), (C,E),(D,F),(E,F),(F,E)} |N| = 7, |E| = 8 COMP 512, Fall 2003
4
Extended Basic Blocks A B G C D E F An Extended Basic Block (EBB)
m a + b n a + b A p c + d r c + d B y a + b z c + d G q a + b C e b + 18 s a + b u e + f D e a + 17 t c + d E v a + b w c + d x e + f F An Extended Basic Block (EBB) Set of blocks b1, b2, …, bn b1 has > 1 predecessor All other bi have 1 predecessor EBBs provide more context for optimization COMP 512, Fall 2003 *
5
Extended Basic Blocks A B G C D E F An EBB contains 1 or more path
m a + b n a + b A p c + d r c + d B y a + b z c + d G q a + b C e b + 18 s a + b u e + f D e a + 17 t c + d E v a + b w c + d x e + f F An EBB contains 1 or more path If b1, b2, …, bn is a path b1 has > 1 predecessor bi has 1 predecessor, bi-1 {A, B, C, D, E} is an EBB {A,B}, {A,C,D}, and {A,C,E} are paths in {A, B, C, D, E} {F} and {G} are also EBBs They have only trivial paths COMP 512, Fall 2003 *
6
Superlocal Value Numbering
m a + b n a + b A p c + d r c + d B y a + b z c + d G q a + b C e b + 18 s a + b u e + f D e a + 17 t c + d E v a + b w c + d x e + f F The Concept Apply local method to each path in the EBB Do {A,B}, {A,C,D}, & {A,C,E} Obtain reuse from ancestors Does not help with F or G Key: avoid re-analyzing A & C COMP 512, Fall 2003 *
7
Superlocal Value Numbering
Efficiency Use A’s table to initialize tables for B & C To avoid duplication, use a scoped hash table A, AB, A, AC, ACD, AC, ACE, F, G Need a VN name mapping to handle kills Must restore map with scope Adds complication, not cost To simplify matters Unique name for each definition Makes name VN Use the SSA name space EaC: § & App. B m a + b n a + b A p c + d r c + d B y a + b z c + d G q a + b C e b + 18 s a + b u e + f D e a + 17 t c + d E v a + b w c + d x e + f F Subscripted names from example in last lecture COMP 512, Fall 2003
8
SSA Name Space (locally)
Example (from lecture 3) These are SSA names … Original Code a0 x0 + y0 b0 x0 + y0 a1 17 c0 x0 + y0 Renaming: Give each value a unique name Makes it clear With VNs a03 x01 + y02 b03 x01 + y02 a14 17 c03 x01 + y02 Notation: While complex, the meaning is clear Rewritten a03 x01 + y02 b03 a03 a14 17 c03 a03 Result: a03 is available rewriting works COMP 512, Fall 2003
9
SSA Name Space (in general)
Two principles Each name is defined by exactly one operation Each operand refers to exactly one definition To reconcile these principles with real code Insert -functions at merge points to reconcile name space Add subscripts to variable names for uniqueness We’ll look at how to construct SSA form later in the course x ... ... x + ... x0 ... x1 ... x2 (x0,x1) x becomes COMP 512, Fall 2003
10
Superlocal Value Numbering
m0 a + b n0 a + b A p0 c + d r0 c + d B r2 (r0,r1) y0 a + b z0 c + d G q0 a + b r1 c + d C e0 b + 18 s0 a + b u0 e + f D e1 a + 17 t0 c + d u1 e + f E e3 (e0,e1) u2 (u0,u1) v0 a + b w0 c + d x0 e + f F With all the bells & whistles Find more redundancy Pay little additional cost Still does nothing for F & G This is in SSA Form Superlocal techniques Some local methods extend cleanly to superlocal scopes VN does not back up If C adds to A, it’s a problem COMP 512, Fall 2003
11
What About Larger Scopes?
We have not helped with F or G Multiple predecessors Must decide what facts hold in F and in G For G, combine B & F? Merging state is expensive Fall back on what’s known G m0 a + b n0 a + b A p0 c + d r0 c + d B r2 (r0,r1) y0 a + b z0 c + d q0 a + b r1 c + d C e0 b + 18 s0 a + b u0 e + f D e1 a + 17 t0 c + d u1 e + f E e3 (e0,e1) u2 (u0,u1) v0 a + b w0 c + d x0 e + f F COMP 512, Fall 2003
12
Dominators Definitions
x dominates y if and only if every path from the entry of the control-flow graph to the node for y includes x By definition, x dominates x We associate a Dom set with each node |Dom(x )| ≥ 1 Immediate dominators For any node x, there must be a y in Dom(x ) closest to x We call this y the immediate dominator of x As a matter of notation, we write this as IDom(x ) COMP 512, Fall 2003
13
Dominators Dominators have many uses in analysis & transformation
Finding loops Building SSA form Making code motion decisions We’ll look at how to compute dominators later m0 a + b n0 a + b A p0 c + d r0 c + d B r2 (r0,r1) y0 a + b z0 c + d G q0 a + b r1 c + d C e0 b + 18 s0 a + b u0 e + f D e1 a + 17 t0 c + d u1 e + f E e3 (e0,e1) u2 (u0,u1) v0 a + b w0 c + d x0 e + f F Dominator sets A B C G F E D Dominator tree Back to the discussion of value numbering over larger scopes ... COMP 512, Fall 2003 *
14
What About Larger Scopes?
We have not helped with F or G Multiple predecessors Must decide what facts hold in F and in G For G, combine B & F? Merging state is expensive Fall back on what’s known Can use table from IDom(x ) to start x Use C for F and A for G Imposes a Dom-based application order Leads to Dominator VN Technique (DVNT) m0 a + b n0 a + b A p0 c + d r0 c + d B r2 (r0,r1) y0 a + b z0 c + d G q0 a + b r1 c + d C e0 b + 18 s0 a + b u0 e + f D e1 a + 17 t0 c + d u1 e + f E e3 (e0,e1) u2 (u0,u1) v0 a + b w0 c + d x0 e + f F COMP 512, Fall 2003 *
15
Dominator Value Numbering
The DVNT Algorithm Use superlocal algorithm on extended basic blocks Retain use of scoped hash tables & SSA name space Start each node with table from its IDom DVNT generalizes the superlocal algorithm No values flow along back edges (i.e., around loops) Constant folding, algebraic identities as before Larger scope leads to (potentially) better results Local + Superlocal + good start for new EBBs COMP 512, Fall 2003
16
Dominator Value Numbering
DVNT advantages Find more redundancy Little additional cost Retains online character m a + b n a + b A p c + d r c + d B r2 (r0,r1) y a + b z c + d G q a + b C e b + 18 s a + b u e + f D e a + 17 t c + d E e3 (e1,e2) u2 (u0,u1) v a + b w c + d x e + f F DVNT shortcomings Misses some opportunities No loop-carried CSEs or constants COMP 512, Fall 2003
17
End of Lecture … COMP 512, Fall 2003
18
The Story So Far, … Local algorithm (Balke, 1967)
Superlocal extension of Balke (many) Dominator VN technique (Simpson, 1996) All these propagate along forward edges None are global methods Global Methods Classic CSE (Cocke 1970) Partitioning algorithms (Alpern et al. 1988, Click 1995) Partial Redundancy Elimination (Morel & Renvoise 1979) SCC/VDCM (Simpson 1996) We will look at several global methods Future lectures Next lecture COMP 512, Fall 2003 *
19
Roadmap To recap … We have seen value numbering
Local, superlocal, dominator scopes We may look at global value numbering later We will look at global redundancy elimination We have used dominators We will look at how to calculate Dom later We have used SSA We will look at how to build SSA later Next: Global redundancy elimination from available expressions Iterative data-flow analysis COMP 512, Fall 2003
20
Using Available Expressions for GCSE
The goal Find common subexpressions whose range spans basic blocks, and eliminate unnecessary re-evaluations Safety Available expressions proves that the replacement value is current Transformation must ensure right namevalue mapping Profitability Don’t add any evaluations Add some copy operations Copies are inexpensive Many copies coalesce away Copies can shrink or stretch live ranges COMP 512, Fall 2003 *
21
Using Available Expressions for GCSE
Expressions defined in b Using Available Expressions for GCSE The Method 1. block b, compute DEF(b) and NKILL(b) 2. block b, compute AVAIL(b) 3. block b, value number the block starting from AVAIL(b) 4. Replace expressions in AVAIL(b) with references Two key issues Computing AVAIL(b) Managing the replacement process We’ll look at the replacement issue first Expressions not killed in b COMP 512, Fall 2003
22
Computing Available Expressions
For each block b Let AVAIL(b) be the set of expressions available on entry to b Let NKILL(b) be the set of expression not killed in b Let DEF(b) be the set of expressions defined in b and not subsequently killed in b Now, AVAIL(b) can be defined as: AVAIL(b) = xpred(b) (DEF(x) (AVAIL(x) NKILL(x) )) preds(b) is the set of b’s predecessors in the control-flow graph This system of simultaneous equations forms a data-flow problem Solve it with a data-flow algorithm COMP 512, Fall 2003
23
Global CSE (replacement step)
Managing the name space Need a unique name e AVAIL(b) 1. Can generate them as replacements are done (Fortran H) 2. Can compute a static mapping 3. Can encode value numbers into names (Briggs 94) Strategy 1. This works; it is the classic method 2. Fast, but limits replacement to textually identical expressions 3. Requires more analysis (VN), but yields more CSEs Assume, w.l.o.g., solution 2 COMP 512, Fall 2003
24
Global CSE (replacement step)
Compute a static mapping from expression to name After analysis & before transformation b, e AVAIL(b), assign e a global name by hashing on e During transformation step Evaluation of e insert copy name(e) e Reference to e replace e with name(e) The major problem with this approach Inserts extraneous copies At all definitions and uses of any e AVAIL(b), b Those extra copies are dead and easy to remove The useful ones often coalesce away Common strategy: Insert copies that might be useful Let DCE sort them out Simplifies design & implementation COMP 512, Fall 2003 *
25
An Aside on Dead Code Elimination
What does “dead” mean? Useless code — result is never used Unreachable code — code that cannot execute Both are lumped together as “dead” To perform DCE Must have a global mechanism to recognize usefulness Must have a global mechanism to eliminate unneeded stores Must have a global mechanism to simplify control-flow predicates All of these will come later in the course COMP 512, Fall 2003
26
Global CSE Now a three step process Compute AVAIL(b), block b
Assign unique global names to expressions in AVAIL(b) Perform replacement with local value numbering Earlier in the lecture, we said Now, we need to make good on the assumption Assume, without loss of generality, that we can compute available expressions for a procedure. This annotates each basic block, b, with a set AVAIL(b) that contains all expressions that are available on entry to b. COMP 512, Fall 2003
27
Computing Available Expressions
The Big Picture 1. Build a control-flow graph 2. Gather the initial (local) data — DEF(b) & NKILL(b) 3. Propagate information around the graph, evaluating the equation 4. Post-process the information to make it useful (if needed) All data-flow problems are solved, essentially, this way Next lecture: Iterative computation of AVAIL information From Chapter 8 of EaC COMP 512, Fall 2003
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.