PSUCS322 HM 1 Languages and Compiler Design II IR Code Optimization Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU.


1 PSUCS322 HM 1 Languages and Compiler Design II: IR Code Optimization
Material provided by Prof. Jingke Li. Stolen with pride and modified by Herb Mayer, PSU
Spring 2010, rev.: 4/16/2010

2 Agenda
- IR Optimization
- Redundancy Elimination
- Sample: CSE
- Partial Redundancy Elimination (PRE)
- Copy Propagation
- Value Numbering
- Loop Invariant Code Motion
- Counter Examples
- Strength Reduction
- Induction Variable (IV) Elimination

3 IR Optimization
Definition: Optimization is the translation of an original program P1 into a semantically equivalent program P2 with better properties.
“Better” depends on the project. Possibilities include code compactness, execution speed, numeric precision, and others.

4 IR Optimization
Optimizations transform a program into a functionally equivalent program with better performance. Transformations can be implemented at various stages and levels.
Advantages of IR-level optimization:
- IR operations are explicit, so cost estimates can be accurate
- IR optimizations are machine-independent, hence the results are portable across different target machines
Scopes of optimization:
- Local: transforming code by analyzing a single basic block
- Global: transforming code by analyzing a whole subroutine
- Inter-procedural: transforming code by analyzing the whole program
Concepts and techniques:
- Basic blocks and flow graphs
- Control-flow analysis and data-flow analysis

5 Redundancy Elimination
IR code optimization removes redundant computations. Specific examples:
- Common Subexpression Elimination (CSE): based on lexical representation, applicable to global scope
- Partial Redundancy Elimination (PRE): more powerful than CSE
- Copy Propagation: companion optimization to CSE
- Value Numbering (VN): value based, single basic block
- Super-local Value Numbering: extends VN to multiple blocks
- Loop Invariant Code Motion: moves code from a frequently executed part of the program to a rarely executed part

6 Common Subexpression Elimination (CSE)
E is a common subexpression if it occurs at L1 and L2, was computed at L1, and none of its components received new values along the path to L2.
To achieve CSE, introduce a temp to hold the subexpression when it is first evaluated.
Example from Quicksort(): the second occurrence of 4*i in the basic block is a common subexpression; so is the second occurrence of 4*j.

    BB before CSE      BB’ after CSE      BB’’ after total CSE
    t11 := 4*i         t11 := 4*i         t11 := 4*i
    x := a[t11]        x := a[t11]        x := a[t11]
    t12 := 4*i         t12 := t11
    t13 := 4*j         t13 := 4*j         t13 := 4*j
    t14 := a[t13]      t14 := a[t13]      t14 := a[t13]
    a[t12] := t14      a[t12] := t14      a[t11] := t14
    t15 := 4*j         t15 := t13
    a[t15] := x        a[t15] := x        a[t13] := x
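The step from BB to BB’ can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Instr` tuple and the `local_cse` function are hypothetical names, not part of the slides, and the array loads and stores of the Quicksort block are left out so that only the arithmetic statements are modeled.

```python
from typing import NamedTuple, Optional


class Instr(NamedTuple):
    """Hypothetical three-address statement: dest := a op b, or dest := a for "copy"."""
    dest: str
    op: str
    a: str
    b: Optional[str] = None


def local_cse(block):
    """Replace repeated expressions inside one basic block by copies."""
    available = {}  # (op, a, b) -> variable that currently holds the value
    out = []
    for ins in block:
        key = (ins.op, ins.a, ins.b)
        if ins.op != "copy" and key in available:
            out.append(Instr(ins.dest, "copy", available[key]))
        else:
            out.append(ins)
        # Defining dest invalidates expressions that read dest or are held in dest.
        available = {k: v for k, v in available.items()
                     if ins.dest not in (k[1], k[2]) and v != ins.dest}
        # Record the expression unless the statement overwrote one of its operands.
        if ins.op != "copy" and ins.dest not in (ins.a, ins.b):
            available.setdefault(key, ins.dest)
    return out


quicksort_block = [
    Instr("t11", "*", "4", "i"),
    Instr("t12", "*", "4", "i"),
    Instr("t13", "*", "4", "j"),
    Instr("t15", "*", "4", "j"),
]
for ins in local_cse(quicksort_block):
    print(ins)
```

The second 4*i and the second 4*j come out as copies (t12 := t11, t15 := t13), which corresponds to BB’. Getting from BB’ to BB’’ additionally needs copy propagation.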

7 CSE Across BBs
CSE can eliminate redundant computation across basic blocks:

    Before CSE                     After CSE
    BB1:  i := j                   BB1’: i := j
          a := 4 * i                     temp := 4 * i
          if … goto BB3                  a := temp
                                         if … goto BB3
    BB2:  i := j                   BB2’: i := j
          b := 4 * i                     b := temp
    BB3:  i := j                   BB3’: i := j
          c := 4 * i                     c := temp

8 Global CSE
- Both occurrences of 4*i in BB5 (and in BB6) are CSEs ⇒ eliminate t6, t7, t11, and t12; replace with t2
- 4*j in BB5 and BB6 are CSEs ⇒ eliminate t10 and t15; replace with t8 and t13
- Now a[t2] in BB5 and BB6 becomes a CSE ⇒ replace with t3

Flow graph before global CSE:

    BB1:  i := m-1
          j := n
          t1 := 4*n
          v := a[t1]
    BB2:  i := i+1
          t2 := 4*i
          t3 := a[t2]
          if t3 < v goto BB2
    BB3:  j := j-1
          t4 := 4*j
          t5 := a[t4]
          if t5 > v goto BB3
    BB4:  if i >= j goto BB6
    BB5:  t6 := 4*i
          x := a[t6]
          t7 := 4*i
          t8 := 4*j
          t9 := a[t8]
          a[t7] := t9
          t10 := 4*j
          a[t10] := x
          goto BB2
    BB6:  t11 := 4*i
          x := a[t11]
          t12 := 4*i
          t13 := 4*j
          t14 := a[t13]
          a[t12] := t14
          t15 := 4*j
          a[t15] := x

9 Global CSE
Flow graph after global CSE:

    BB1:  i := m-1
          j := n
          t1 := 4*n
          v := a[t1]
    BB2:  i := i+1
          t2 := 4*i
          t3 := a[t2]
          if t3 < v goto BB2
    BB3:  j := j-1
          t4 := 4*j
          t5 := a[t4]
          if t5 > v goto BB3
    BB4:  if i >= j goto BB6
    BB5:  x := t3
          a[t2] := t5
          a[t4] := x
          goto BB2
    BB6:  x := t3
          t14 := a[t1]
          a[t2] := t14
          a[t1] := x

10 CSE Algorithm
Available expressions: an expression x ⊕ y is available at node n if every path from the entry node to n evaluates the expression, and there are no definitions of x or y after the last evaluation.
Algorithm:
1. Compute available expressions for all expressions.
2. At each node n : w := x ⊕ y where the expression x ⊕ y is available, search backwards for the evaluations of x ⊕ y that reach n.
3. Replace each evaluation v := x ⊕ y found in the search by t := x ⊕ y; v := t.
4. Replace n by w := t.
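Step 1 is a forward data-flow problem in which the sets coming from all predecessors are intersected. The following Python sketch shows one way to compute it; it assumes one statement per node, an entry node without predecessors, and a hypothetical encoding of statements as (dest, (op, x, y)). None of these names come from the slides.

```python
def available_expressions(nodes, preds, entry):
    """Return {node: set of expressions available on entry to that node}.

    nodes maps a node name to (dest, expr); expr is (op, x, y) or None.
    preds maps a node name to the list of its predecessor names.
    """
    all_exprs = {expr for _, expr in nodes.values() if expr is not None}

    def transfer(name, in_set):
        dest, expr = nodes[name]
        out = set(in_set)
        if expr is not None:
            out.add(expr)
        # Defining dest kills every expression that reads dest.
        return {e for e in out if dest not in (e[1], e[2])}

    # Optimistic start: everything is available except at the entry node.
    in_sets = {n: set() if n == entry else set(all_exprs) for n in nodes}
    out_sets = {n: transfer(n, in_sets[n]) for n in nodes}

    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            ps = preds.get(n, [])
            new_in = set.intersection(*(out_sets[p] for p in ps)) if ps else set()
            new_out = transfer(n, new_in)
            if new_in != in_sets[n] or new_out != out_sets[n]:
                in_sets[n], out_sets[n] = new_in, new_out
                changed = True
    return in_sets


# n1 -> n2 -> n3 plus a direct edge n1 -> n3; every node evaluates 4*i.
straight = {"n1": ("a", ("*", "4", "i")),
            "n2": ("b", ("*", "4", "i")),
            "n3": ("c", ("*", "4", "i"))}
edges = {"n2": ["n1"], "n3": ["n1", "n2"]}
print(available_expressions(straight, edges, "n1")["n3"])  # {('*', '4', 'i')}
```

If n2 is changed to i := i + 1, the expression 4*i is no longer available at n3, because one of the two paths redefines i after the last evaluation.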

11 An Improved CSE Algorithm
The previous CSE algorithm performs an expensive backward search and inserts a new temp for every use of a common subexpression. The following ideas can improve the algorithm:
- Reduce the number of new temps by assigning a unique name to each unique expression
- Avoid the backward search by a separate traversal of the CFG
Algorithm:
1. Compute available expressions for all expressions.
2. Initialize an array Name[ e ] = ø for all expressions.
3. At each node n : w := x ⊕ y where the expression x ⊕ y (denoted e below) is available:
   if Name[ e ] = ø, allocate a new name t and set Name[ e ] = t; else let t = Name[ e ];
   replace n by w := t.
4. In a subsequent traversal of the CFG, at each node v := e, if Name[ e ] != ø, let t = Name[ e ]; replace the node by t := e; v := t.

12 Yet Another CSE Algorithm
Ideas: create one temp for each unique expression; let a subsequent pass eliminate unnecessary temps.
Algorithm:
1. Compute available expressions for all expressions.
2. At each evaluation of e:
   - Hash e to a name t in a table
   - Insert the assignment t = e
3. At a use of e where e is available:
   - Look up e’s name t in the hash table
   - Replace e with t

13 Partial Redundancy Elimination (PRE)
An expression x ⊕ y is partially redundant at node n if some path from the entry node to n evaluates x ⊕ y, and there are no definitions of x or y after the last evaluation.
PRE optimization (it subsumes CSE):
- Discover partially redundant expressions
- Convert them to fully redundant expressions
- Remove the redundancy, to reduce the overall number of computations at runtime
[Figure: sequence of three flow graphs around node n, each containing "... = x ⊕ y", joined by ⇒]

14 Copy Propagation
A copy statement has the form f := g.
A large number of copy statements may be generated by CSE optimizations. Copy propagation eliminates copy statements by using g for f wherever possible.

    BB5 before           ⇒    BB5’ after
    t6 := 4*i                 t6 := 4*i
    x := a[t6]                x := a[t6]
    t7 := t6
    t8 := 4*j                 t8 := 4*j
    t9 := a[t8]               t9 := a[t8]
    a[t7] := t9               a[t6] := t9
    t10 := t8
    a[t10] := x               a[t8] := x
    goto BB2                  goto BB2
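A minimal Python sketch of the BB5 transformation follows. It is an illustration, not the course's implementation: statements are encoded as hypothetical (dest, op, args) tuples, stores have dest None, and the sketch assumes that the temps t7 and t10 are not needed after the block.

```python
def copy_propagate(block):
    """Rewrite uses of f to g while a copy f := g is still valid."""
    copies = {}  # f -> g
    out = []
    for dest, op, args in block:
        args = tuple(copies.get(a, a) for a in args)
        if dest is not None:
            # Redefining dest invalidates copies to or from dest.
            copies = {f: g for f, g in copies.items() if dest not in (f, g)}
        if op == "copy" and args[0] != dest:
            copies[dest] = args[0]
        out.append((dest, op, args))
    return out


def drop_dead_copies(block, live_out=()):
    """Backward pass: delete copies whose target is never read afterwards."""
    live = set(live_out)
    kept = []
    for dest, op, args in reversed(block):
        if op == "copy" and dest not in live:
            continue
        live.discard(dest)
        live.update(args)
        kept.append((dest, op, args))
    return kept[::-1]


bb5 = [
    ("t6", "*", ("4", "i")),
    ("x", "load", ("a", "t6")),
    ("t7", "copy", ("t6",)),
    ("t8", "*", ("4", "j")),
    ("t9", "load", ("a", "t8")),
    (None, "store", ("a", "t7", "t9")),   # a[t7] := t9
    ("t10", "copy", ("t8",)),
    (None, "store", ("a", "t10", "x")),   # a[t10] := x
]
for stmt in drop_dead_copies(copy_propagate(bb5)):
    print(stmt)
```

The two stores end up indexed by t6 and t8, and the copies into t7 and t10 disappear, matching BB5’.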

15 Cascading Problem
CSE transformations may have a cascading effect: more rounds of CSE and copy propagation may be needed before reaching the final form.

    x := b + c    ⇒    x := b + c    ⇒    x := b + c    ⇒    x := b + c
    y := a + x         y := a + x         y := a + x         y := a + x
    u := b + c         u := x
    v := a + u         v := a + u         v := a + x         v := y

16 Value Numbering
- Each variable is assumed to have a unique initial value
- Each unique value is assigned a unique number
- An expression’s value is represented by a corresponding symbolic expression based on the operands’ numbers, e.g. the value of x + y is 1+2 if 1 and 2 are the value numbers of x and y, respectively
- Each unique expression value is also assigned a unique number
- When a new variable or expression is encountered, check whether it has already been assigned a number; if so, use that number, otherwise assign it a new one
- Use a hash table for efficient number lookup

17 Sample: Value Numbering
Value numbering uses a single round to calculate the effect of cascaded optimizations.

    x := b + c
    y := a + x
    u := b + c
    v := a + u

    statement      var or expr     assigned #
    x := b + c     b               1
                   c               2
                   b+c (1+2)       3
                   x               3
    y := a + x     a               4
                   a+x (4+3)       5
                   y               5
    u := b + c     u (1+2)         3
    v := a + u     v (4+3)         5
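The table can be reproduced with a small hash-based sketch in Python. The function name `value_number` and the (dest, op, x, y) statement encoding are assumptions made for illustration; the sketch does not treat commutative operators specially, so b + c and c + b would receive different numbers.

```python
import itertools


def value_number(block):
    """block: list of (dest, op, x, y). Returns name or expression -> value number."""
    fresh = itertools.count(1)
    numbers = {}

    def number_of(key):
        # Reuse the number if the variable or expression was seen before.
        if key not in numbers:
            numbers[key] = next(fresh)
        return numbers[key]

    for dest, op, x, y in block:
        expr = (op, number_of(x), number_of(y))
        numbers[dest] = number_of(expr)
    return numbers


sample = [
    ("x", "+", "b", "c"),
    ("y", "+", "a", "x"),
    ("u", "+", "b", "c"),
    ("v", "+", "a", "u"),
]
numbers = value_number(sample)
print({name: numbers[name] for name in "bcxayuv"})
```

Because u receives the same number as x, the expression a + u hashes to the same key as a + x, so v gets the number of y in a single pass.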

18 Loop Invariant Code Motion
If a loop contains a statement t ← a ⊕ b such that a and b have the same values each time around the loop, then t also has the same value each time. Hoist such loop-invariant statements out of the loop.

    Before                        ⇒    After
    BB1:  t1 := 0                      BB1’: t1 := 0
                                             t2 := a * b
    BB2:  i := i+1                     BB2’: i := i+1
          t2 := a * b                        M[i] := t2
          M[i] := t2                         if a < N goto BB3’
          if a < N goto BB3
    BB3:  x := t2                      BB3’: x := t2

19 Loop Invariant Criteria
A statement S : t ← a1 ⊕ a2 is loop-invariant within loop L if, for each operand ai:
1. ai is a constant, or
2. all definitions of ai that reach S are outside the loop, or
3. only one definition of ai reaches S, and that definition is loop-invariant
An iterative algorithm can be used to find all loop-invariant statements.
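The iterative algorithm can be sketched in Python for the simplest case. The sketch assumes a straight-line loop body that is executed completely on every iteration, so "only one definition reaches S" reduces to "the operand has exactly one definition in the body, and it appears before S". The function names and the (dest, op, a, b) encoding are hypothetical, and the third statement of the example is added purely to exercise criterion 3.

```python
def is_constant(operand):
    return operand.lstrip("-").isdigit()


def loop_invariant_statements(body):
    """body: list of (dest, op, a, b). Returns indices of loop-invariant statements."""
    defs = {}  # variable -> indices of its definitions inside the loop
    for idx, (dest, _, _, _) in enumerate(body):
        defs.setdefault(dest, []).append(idx)

    invariant = set()

    def operand_ok(operand, use_idx):
        if is_constant(operand):                      # criterion 1
            return True
        inside = defs.get(operand, [])
        if not inside:                                # criterion 2
            return True
        # Criterion 3: one earlier definition that is itself invariant.
        return len(inside) == 1 and inside[0] < use_idx and inside[0] in invariant

    changed = True
    while changed:
        changed = False
        for idx, (_, _, a, b) in enumerate(body):
            if idx not in invariant and operand_ok(a, idx) and operand_ok(b, idx):
                invariant.add(idx)
                changed = True
    return sorted(invariant)


loop_body = [
    ("i", "+", "i", "1"),     # not invariant: depends on its own previous value
    ("t2", "*", "a", "b"),    # invariant: a and b are defined outside the loop
    ("t3", "+", "t2", "4"),   # hypothetical: invariant through t2 and a constant
]
print(loop_invariant_statements(loop_body))  # [1, 2]
```

Detecting an invariant statement is only the first half; whether it may actually be hoisted needs further safety checks that this sketch does not attempt.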

20 Strength Reduction (SR)
Definition: reduction in strength is the replacement of an operation by a cheaper one, e.g. replace * by + if feasible.
Do not make such changes in the source, e.g. do not replace j=2*k; with j=k+k; let the optimizer do this.

    Before                           ⇒    After
    BB1:  if i >= y goto BB3              BB1:  if i >= y goto BB3
    BB2:  Call func1                      BB2:  Call func1
          j := 2 * k                            j := k + k
          i := i + 1                            i++
          goto BB1                              goto BB1
    BB3:  x := ...                        BB3:  x := ...
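As a minimal illustration, the rewrite of j := 2 * k into j := k + k can be expressed as a peephole rule in Python. The (dest, op, a, b) encoding and the function name are assumptions; a real optimizer would cover many more patterns and would consult a cost model.

```python
def reduce_strength(stmt):
    """Rewrite dest := 2 * v (or v * 2) into dest := v + v; leave the rest alone."""
    dest, op, a, b = stmt
    if op == "*":
        if a == "2":
            return (dest, "+", b, b)
        if b == "2":
            return (dest, "+", a, a)
    return stmt


bb2 = [("j", "*", "2", "k"), ("i", "+", "i", "1")]
print([reduce_strength(s) for s in bb2])
```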

21 Induction Variable Elimination (IVE)
Definition: an Induction Variable (IV) is a variable iterating through a linear progression of values in a program section.
- The program section is frequently a proper loop
- IVs are either fundamental or dependent on other IVs
- IV elimination reduces multiple IVs to fewer, thus saving operations
- Since these operations are inside inner loops, the savings can be significant
- After IVE, other optimizations can be applied too, e.g. SR

22 Induction Variable Elimination, Cont’d
Source loop:

    integer a(100)      -- low bound is 1, not 0 as in C++ or Java: subtract!
    do i = 1, 100       -- OK for i to be undefined after loop
      a(i) = 2 * i      -- rhs deliberately not 4 * i, which would be easy: = IV
    enddo

Original IR:

    BB0:  t1 = 1                          // i
    BB1:  if t1 > 100 goto BB3
    BB2:  t2 = 2 * t1
          t3 = 4 * t1
          t4 = t3 - 4
          t5 = A(a) + t4
          *t5 = t2
          t1 = t1 + 1
          goto BB1
    BB3:  (after loop, i undefined)

⇒ After introducing the IV t0:

    BB0’: t0 = 0                          // IV
          t1 = 1                          // i
    BB1’: if t0 >= 400 goto BB3’
    BB2’: t2 = 2 * t1
          t5 = A(a) + t0
          *t5 = t2
          t0 = t0 + 4
          goto BB1’
    BB3’: (after loop, i undefined)

⇒ After folding A(a) into the IV:

    BB0’’: t0 = A(a)                      // IV
           t1 = 1                         // i
    BB1’’: if t0 >= A(a)+400 goto BB3’’
    BB2’’: t2 = 2 * t1
           *t0 = t2
           t0 = t0 + 4
           goto BB1’’
    BB3’’: (after loop, i undefined)
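One way to check that the first and the last form store the same values is to simulate both in Python. This is a sketch under explicit assumptions: memory is modeled as a dictionary from byte address to value, `BASE` is a made-up number standing in for A(a), and the transformed loop keeps the statement t1 = t1 + 1, which the transformed flow graphs in the transcript do not show but which is needed as long as t2 is computed from t1.

```python
BASE = 1000  # hypothetical stand-in for A(a)


def original_loop():
    mem = {}
    t1 = 1
    while not t1 > 100:
        t2 = 2 * t1
        t3 = 4 * t1
        t4 = t3 - 4
        t5 = BASE + t4
        mem[t5] = t2          # *t5 = t2
        t1 = t1 + 1
    return mem


def after_ive():
    mem = {}
    t0 = BASE                 # IV now walks the element addresses directly
    t1 = 1
    while not t0 >= BASE + 400:
        t2 = 2 * t1
        mem[t0] = t2          # *t0 = t2
        t0 = t0 + 4
        t1 = t1 + 1           # assumption: kept so that t2 still advances
    return mem


print(original_loop() == after_ive())  # True
```

The per-iteration multiply, subtract, and add that computed the element address are replaced by a single add on t0.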

