1 Loop Induction Variable Canonicalization

2 Motivation
  Background: Open64 Compilation Scheme
  Loop Induction Variable Canonicalization
  Project
    – Tracing and WHIRL Specification
    – Loops
  References
  3/27/2008 Copyright © 2008 - Juergen Ributzka. All rights reserved.

3 How to copy one array to another array?

4 One simple problem – many different solutions

    /* for loop with an index */
    for (int i = 0; i < SIZE; i++) {
        p[i] = q[i];
    }

    /* while loop with an index */
    int i = 0;
    while (i < SIZE) {
        p[i] = q[i];
        i = i + 1;
    }

    /* moving pointers; end is one past the last element of p */
    int *end = p + SIZE;
    while (p < end) {
        *p++ = *q++;
    }

    /* guarded do-while loop with a 1-based index */
    int i = 1;
    if (i <= SIZE) {
        do {
            p[i-1] = q[i-1];
        } while (++i <= SIZE);
    }

5 Compilers prefer code that is easy to analyze:

    int i = 0;
    while (i < SIZE) {
        p[i] = q[i];
        i = i + 1;
    }

  Users want high-performance code:

    /* end is one past the last element of p */
    int *end = p + SIZE;
    while (p < end) {
        *p++ = *q++;
    }

  Diagram labels: Compiler Optimization, Compiler Transformation

6 Unified loop representation
  Just one induction variable
    – starting at 0
    – stride of 1

    int iv = 0;
    while (iv <= SIZE-1) {
        p[iv] = q[iv];
        iv = iv + 1;
    }
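The claim that differently written loops share one unified form can be checked with a small sketch. The function names, the `end` helper pointer, and the test data below are assumptions made for illustration; only the loop shapes come from the slides.

```c
#include <assert.h>
#include <string.h>

#define SIZE 8

/* Variant written as a for loop with an index. */
static void copy_for(int *p, const int *q) {
    for (int i = 0; i < SIZE; i++) {
        p[i] = q[i];
    }
}

/* Variant written with moving pointers; `end` is a helper introduced here. */
static void copy_ptr(int *p, const int *q) {
    const int *end = q + SIZE;
    while (q < end) {
        *p++ = *q++;
    }
}

/* Unified form: one induction variable, starting at 0, stride of 1. */
static void copy_canonical(int *p, const int *q) {
    int iv = 0;
    while (iv <= SIZE - 1) {
        p[iv] = q[iv];
        iv = iv + 1;
    }
}

/* Returns 1 if all three variants produce a copy of the source array. */
int copy_variants_agree(void) {
    int q[SIZE], a[SIZE], b[SIZE], c[SIZE];
    for (int i = 0; i < SIZE; i++) {
        q[i] = 10 * i + 1;
    }
    copy_for(a, q);
    copy_ptr(b, q);
    copy_canonical(c, q);
    return memcmp(a, q, sizeof q) == 0 &&
           memcmp(b, q, sizeof q) == 0 &&
           memcmp(c, q, sizeof q) == 0;
}
```

All three functions compute the same result, so a compiler that rewrites the first two into the third loses nothing and gains a single loop shape to analyze.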

7 Front End → Loop Nest Optimizer (optional) → Global Optimizer → Code Generation
  IVR (Induction Variable Recognition)

8 Step 1: Induction Variable Injection
  Step 2: Inserting φ’s and Identity Assignments
  Step 3: Renaming
  Step 4: Induction Variable Analysis and Processing
  Step 5: Copy Propagation and Expression Simplification
  Step 6: Dead Store Elimination

9 At this point we only have DO and WHILE loops
    – GOTO statements have been transformed to WHILE loops
  Loops are annotated with details of the high-level loop construct
  Inject a unit-stride induction variable into
    – Non-unit-stride DO loops
    – All WHILE loops

  Before:
    p = &a[0];
    while (p <= &a[99]) {
        *p = 0;
        p = p + 1;
    }

  After:
    p = &a[0];
    iv = 0;
    while (p <= &a[99]) {
        *p = 0;
        p = p + 1;
        iv = iv + 1;
    }
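A minimal sketch of what the injected variable buys: after injection, `p` always equals `&a[iv]`, which is the relation the later steps exploit. The function names and the `ok` flag are assumptions made for this illustration.

```c
#include <assert.h>

/* The pointer loop from the slide with the injected unit-stride variable.
   Returns the final value of iv; *ok is cleared if p ever differs from &a[iv]. */
int zero_with_injected_iv(int a[100], int *ok) {
    int *p = &a[0];
    int iv = 0;                      /* injected: starts at 0 */
    *ok = 1;
    while (p <= &a[99]) {
        if (p != &a[iv]) {
            *ok = 0;                 /* p should stay in step with iv */
        }
        *p = 0;
        p = p + 1;
        iv = iv + 1;                 /* injected: stride of 1 */
    }
    return iv;
}

/* Self-check helper (hypothetical name): 1 if iv counted 100 iterations,
   p stayed in step with iv, and every element was cleared. */
int injected_iv_check(void) {
    int a[100];
    int ok;
    for (int i = 0; i < 100; i++) {
        a[i] = i + 1;
    }
    int trips = zero_with_injected_iv(a, &ok);
    for (int i = 0; i < 100; i++) {
        if (a[i] != 0) {
            ok = 0;
        }
    }
    return ok && trips == 100;
}
```

The injected variable does not change what the loop computes; it only adds a counter that the analysis can later pick as the primary induction variable.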

10 Insert φ’s (one basic block per line)

   Before:
       p ← &a[0]; iv ← 0
       p ≤ &a[99] ?
       *p ← 0; p ← p + 4; iv ← iv + 1
       …

   After:
       p ← &a[0]; iv ← 0
       iv ← φ(iv, iv); p ← φ(p, p); p ≤ &a[99] ?
       *p ← 0; p ← p + 4; iv ← iv + 1
       iv ← iv; p ← p

11 Rename variables (one basic block per line)

   Before:
       p ← &a[0]; iv ← 0
       iv ← φ(iv, iv); p ← φ(p, p); p ≤ &a[99] ?
       *p ← 0; p ← p + 4; iv ← iv + 1
       iv ← iv; p ← p

   After:
       p1 ← &a[0]; iv1 ← 0
       iv2 ← φ(iv1, iv3); p2 ← φ(p1, p3); p2 ≤ &a[99] ?
       *p2 ← 0; p3 ← p2 + 4; iv3 ← iv2 + 1
       iv4 ← iv2; p4 ← p2

12 Process the φ list at the beginning of the loop
     – One operand must correspond to the initial value
     – The other must be defined in the loop
   Initialize the symbolic expression tree with the operand defined in the loop
   Recursively resolve variables in the expression tree that are not defined by a φ node; a φ node is resolved only if both of its operands are the same
   All variables in the symbolic expression tree must now be loop invariant or the result of a φ
   i2 is an induction variable if the expression tree is of the form i2 ± <loop-invariant expression>, where i2 is a φ result

13 Example (one basic block per line):
       i2 ← φ(i1, i3); j2 ← φ(j1, j3); i2 ≤ 100 ?
       j3 ← i2 + 3; …; i3 ← j3 + 2
       …

   i1 and j1 are initial values

   Expression Tree:
       i2 ← i3
       i2 ← j3 + 2
       i2 ← i2 + 5    (found IV)

       j2 ← j3
       j2 ← i2 + 3
       j2 ← i2 + 3    (can’t resolve i2)
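The resolution steps above can be sketched on a toy SSA table. The encoding is an assumption made for this illustration, not Open64's data structures: every definition is an initial value, `variable + constant`, a loop-header φ, or a merge φ, and the second operand of a header φ is taken to be the one defined in the loop.

```c
#include <assert.h>

/* Toy SSA definitions; this encoding is an assumption of the sketch. */
enum kind { INIT, ADD, PHI_HEADER, PHI_MERGE };

struct def {
    enum kind kind;
    int a;   /* ADD: source variable;  PHI: first operand  */
    int b;   /* ADD: constant addend;  PHI: second operand */
};

/* Rewrites variable v as base + off, where base is an initial value or a
   phi result that cannot be resolved any further. */
static void resolve(const struct def *d, int v, int *base, int *off) {
    switch (d[v].kind) {
    case ADD:
        resolve(d, d[v].a, base, off);
        *off += d[v].b;
        return;
    case PHI_MERGE: {
        int b1, o1, b2, o2;
        resolve(d, d[v].a, &b1, &o1);
        resolve(d, d[v].b, &b2, &o2);
        if (b1 == b2 && o1 == o2) {   /* both operands are the same */
            *base = b1;
            *off = o1;
            return;
        }
        break;
    }
    default:
        break;
    }
    *base = v;
    *off = 0;
}

/* Returns the stride if header phi `phi` is an induction variable, that is,
   if its loop operand resolves to phi + constant; returns 0 otherwise. */
int iv_step(const struct def *d, int phi) {
    int base, off;
    resolve(d, d[phi].b, &base, &off);
    return base == phi ? off : 0;
}

/* i2 = phi(i1, i3), j2 = phi(j1, j3), j3 = i2 + 3, i3 = j3 + 2 */
enum { I1, J1, I2, J2, J3, I3 };
const struct def example1[] = {
    [I1] = { INIT, 0, 0 },          [J1] = { INIT, 0, 0 },
    [I2] = { PHI_HEADER, I1, I3 },  [J2] = { PHI_HEADER, J1, J3 },
    [J3] = { ADD, I2, 3 },          [I3] = { ADD, J3, 2 },
};

/* A merge phi whose operands are identical increments of the header phi. */
enum { K1, K2, K3, K4, K5 };
const struct def example2[] = {
    [K1] = { INIT, 0, 0 },  [K2] = { PHI_HEADER, K1, K5 },
    [K3] = { ADD, K2, 1 },  [K4] = { ADD, K2, 1 },
    [K5] = { PHI_MERGE, K3, K4 },
};
```

With this table, `iv_step(example1, I2)` yields 5, matching `i2 ← i2 + 5`, while `iv_step(example1, J2)` yields 0 because the chain for `j2` stops at the other φ result `i2`. The second table exercises the rule that a φ is looked through only when both operands agree.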

14 Example (one basic block per line):
       i2 ← φ(i1, i5); i2 ≤ 100 ?
       i2 < x ?
       …; i3 ← i2 + 1        (one branch)
       …; i4 ← i2 + 1        (other branch)
       i5 ← φ(i3, i4); …
       …

   i1 is the initial value
   i3 ?= i4 (are both operands of the φ the same?)

   Expression Tree:
       i2 ← i5
       i2 ← φ(i3, i4)
       i2 ← i3
       i2 ← i2 + 1    (found IV)

15 Select Primary Induction Variable
   Compute Trip Count
   Exit Values
       s_exit ← s_init + <trip count> × s_step
   Define Secondary Induction Variables (s) with Primary Induction Variables (p)
       s ← s_init + (p – p_init) × s_step
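As a worked instance of these formulas, the sketch below plugs in the running example, with addresses modelled as byte offsets from `&a[0]` and a 4-byte element as on the slides. The function names are hypothetical, and the closed form used for the trip count is a standard identity for `x <= last` loops with positive step, not something stated on the slides.

```c
#include <assert.h>

/* Trip count of a loop `x = init; while (x <= last) x += step;`, step > 0. */
long trip_count(long init, long last, long step) {
    if (init > last) {
        return 0;
    }
    return (last - init) / step + 1;
}

/* Exit value: s_exit = s_init + trip_count * s_step. */
long exit_value(long s_init, long s_step, long trips) {
    return s_init + trips * s_step;
}

/* Secondary IV expressed through the primary IV:
   s = s_init + (p - p_init) * s_step. */
long secondary_value(long s_init, long s_step, long p, long p_init) {
    return s_init + (p - p_init) * s_step;
}
```

For the pointer running from offset 0 to offset 396 (that is, `&a[99]`) in steps of 4, the trip count is 100, so the exit value of `iv` is 100 and the exit offset of the pointer is 400, the offset of `&a[100]`.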

16 Add exit values and replace φ’s (one basic block per line)

   Before:
       p1 ← &a[0]; iv1 ← 0
       iv2 ← φ(iv1, iv3); p2 ← φ(p1, p3); p2 ≤ &a[99] ?
       *p2 ← 0; p3 ← p2 + 4; iv3 ← iv2 + 1
       iv4 ← iv2; p4 ← p2

   After:
       p1 ← &a[0]; iv1 ← 0
       iv2 ← φ(iv1, iv3); p2 ← &a[0] + (iv2 - 0) × 4; p2 ≤ &a[99] ?
       *p2 ← 0; p3 ← p2 + 4; iv3 ← iv2 + 1
       iv4 ← 100; p4 ← &a[100]

17 Preorder Traversal of the Dominator Tree
   If a use of x1 is defined by an assignment of the form x1 ← <expression>, then substitute the use by <expression>

   Example:
       Before:              After:
       x1 ← i1 + j1         x1 ← i1 + j1
       y2 ← x1 – y1         y2 ← i1 + j1 – y1
       x2 ← y2 + z3         x2 ← i1 + j1 – y1 + z3

18 Copy Propagation (one basic block per line)

   Before:
       p1 ← &a[0]; iv1 ← 0
       iv2 ← φ(iv1, iv3); p2 ← &a[0] + (iv2 - 0) × 4; p2 ≤ &a[99] ?
       *p2 ← 0; p3 ← p2 + 4; iv3 ← iv2 + 1
       iv4 ← 100; p4 ← &a[100]

   After:
       p1 ← &a[0]; iv1 ← 0
       iv2 ← φ(iv1, iv3); p2 ← &a[0] + (iv2 - 0) × 4; (&a[0] + (iv2 - 0) × 4) ≤ &a[99] ?
       *(&a[0] + (iv2 - 0) × 4) ← 0; p3 ← &a[0] + (iv2 - 0) × 4 + 4; iv3 ← iv2 + 1
       iv4 ← 100; p4 ← &a[100]

19 Simplification (one basic block per line)

   Before:
       p1 ← &a[0]; iv1 ← 0
       iv2 ← φ(iv1, iv3); p2 ← &a[0] + (iv2 - 0) × 4; (&a[0] + (iv2 - 0) × 4) ≤ &a[99] ?
       *(&a[0] + (iv2 - 0) × 4) ← 0; p3 ← &a[0] + (iv2 - 0) × 4 + 4; iv3 ← iv2 + 1
       iv4 ← 100; p4 ← &a[100]

   After:
       p1 ← &a[0]; iv1 ← 0
       iv2 ← φ(iv1, iv3); p2 ← &a[iv2]; iv2 ≤ 99 ?
       a[iv2] ← 0; p3 ← &a[iv2] + 4; iv3 ← iv2 + 1
       iv4 ← 100; p4 ← &a[100]
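The two rewrites performed by simplification, `&a[0] + (iv2 - 0) × 4` into `&a[iv2]` and the pointer comparison into `iv2 ≤ 99`, can be sanity-checked with a short sketch. The function name is hypothetical, and `sizeof(int)` stands in for the constant 4 used on the slides.

```c
#include <assert.h>

/* Returns 1 if, for every iv in [0, 100], the unsimplified address and loop
   test agree with their simplified counterparts. */
int simplification_holds(void) {
    static int a[100];
    for (int iv = 0; iv <= 100; iv++) {
        /* Unsimplified address: &a[0] + (iv - 0) * element size, in bytes. */
        char *addr = (char *)&a[0] + (iv - 0) * (int)sizeof(int);
        if (addr != (char *)(a + iv)) {
            return 0;                           /* should equal &a[iv] */
        }
        int before = addr <= (char *)&a[99];    /* pointer comparison  */
        int after = iv <= 99;                   /* index comparison    */
        if (before != after) {
            return 0;
        }
    }
    return 1;
}
```

The range stops at 100 because `&a[100]` is the one-past-the-end address, which may be formed and compared but not dereferenced.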

20 Mark all statements dead, except
     – I/O statements
     – return statements
     – procedure calls
     – statements with side effects (e.g. changes to memory)
   Propagate liveness to the rest of the program
     – for each variable used in a live statement, mark its defining statement alive
     – mark alive the conditional branch on which the statement depends
   Remove statements that have not been marked alive
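The mark-and-remove procedure above can be sketched as a fixed-point iteration over a small statement table. The record layout and names are assumptions made for this illustration, and the table encodes the running example after simplification under the additional assumption that neither `p` nor `iv` is used after the loop.

```c
#include <assert.h>

#define NONE (-1)
#define NSTMT 10

/* One SSA statement; this record layout is an assumption of the sketch. */
struct stmt {
    int def;          /* variable defined, or NONE                 */
    int use[2];       /* variables used, NONE for unused slots     */
    int ctrl;         /* index of the controlling branch, or NONE  */
    int side_effect;  /* 1 for stores, calls, I/O and returns      */
};

/* Sets live[i] = 1 for every statement that has to be kept. */
void mark_live(const struct stmt *s, int n, int *live) {
    for (int i = 0; i < n; i++) {
        live[i] = s[i].side_effect;
    }
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 0; i < n; i++) {
            if (!live[i]) continue;
            /* The branch this statement depends on becomes live. */
            if (s[i].ctrl != NONE && !live[s[i].ctrl]) {
                live[s[i].ctrl] = 1;
                changed = 1;
            }
            /* The defining statement of each used variable becomes live. */
            for (int u = 0; u < 2; u++) {
                if (s[i].use[u] == NONE) continue;
                for (int j = 0; j < n; j++) {
                    if (s[j].def == s[i].use[u] && !live[j]) {
                        live[j] = 1;
                        changed = 1;
                    }
                }
            }
        }
    }
}

enum { P1, IV1, IV2, P2, P3, IV3, IV4, P4 };

const struct stmt slide_loop[NSTMT] = {
    { P1,   { NONE, NONE }, NONE, 0 },  /* 0: p1  <- &a[0]          */
    { IV1,  { NONE, NONE }, NONE, 0 },  /* 1: iv1 <- 0              */
    { IV2,  { IV1,  IV3  }, NONE, 0 },  /* 2: iv2 <- phi(iv1, iv3)  */
    { P2,   { IV2,  NONE }, NONE, 0 },  /* 3: p2  <- &a[iv2]        */
    { NONE, { IV2,  NONE }, NONE, 0 },  /* 4: iv2 <= 99 ?           */
    { NONE, { IV2,  NONE }, 4,    1 },  /* 5: a[iv2] <- 0  (store)  */
    { P3,   { IV2,  NONE }, 4,    0 },  /* 6: p3  <- &a[iv2] + 4    */
    { IV3,  { IV2,  NONE }, 4,    0 },  /* 7: iv3 <- iv2 + 1        */
    { IV4,  { NONE, NONE }, NONE, 0 },  /* 8: iv4 <- 100            */
    { P4,   { NONE, NONE }, NONE, 0 },  /* 9: p4  <- &a[100]        */
};

/* Helper (hypothetical name): liveness of statement i in the table above. */
int slide_stmt_is_live(int i) {
    int live[NSTMT];
    mark_live(slide_loop, NSTMT, live);
    return live[i];
}
```

Under these assumptions only the store, the loop test, and the three `iv` definitions feeding them stay alive; every assignment to a `p` variable and the exit value `iv4` would be removed, leaving a loop driven by the single induction variable.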

21 Dead Store Elimination (one basic block per line)

   Before:
       p1 ← &a[0]; iv1 ← 0
       iv2 ← φ(iv1, iv3); p2 ← &a[iv2]; iv2 ≤ 99 ?
       a[iv2] ← 0; p3 ← &a[iv2] + 4; iv3 ← iv2 + 1
       iv4 ← 100; p4 ← &a[100]

   After:
       p1 ← &a[0]; iv1 ← 0
       iv2 ← φ(iv1, iv3); p2 ← &a[iv2]; iv2 ≤ 99 ?
       a[iv2] ← 0; p3 ← &a[iv2] + 4; iv3 ← iv2 + 1
       iv4 ← 100; p4 ← &a[100]

22 Given a loop, trace the intermediate representation (WHIRL) of the Open64 compiler as explained on the next slides.
   Create a CFG for each trace and explain what changed from one trace to the next.
   The behavior exposed by your trace will differ in certain aspects from the one shown in this presentation, since Open64 has evolved over time.
   Is the result optimal? What could be improved?
   Extra Credit: Explain how the behavior has changed.

23 After the Front End
       opencc -c -O3 -show -keep loop1.c
       ir_b2a loop1.B > loop1.t
   After HSSA creation
       opencc -c -O3 -Wb,-tt25:0x0100 -PHASE:w=off filename.c
       (this will give you the trace before and after IVR)
   After Induction Variable Recognition
       opencc -c -O3 -Wb,-tt25:0x0100 -PHASE:w=off filename.c
       (this will give you the trace before and after IVR)

24 After Copy Propagation
       opencc -c -O3 -Wb,-tt25:0x0020 -PHASE:w=off filename.c
   After Boolean Simplification
       opencc -c -O3 -Wb,-tt26:0x0004 -PHASE:w=off filename.c
   After Dead Code Elimination
       opencc -c -O3 -Wb,-tt25:0x0080 -PHASE:w=off filename.c
   After each step you will find the trace in filename.t

25 Example C-Code:

    int foo (int *p, int size) {
        int sum = 0;
        int i;
        for (i = 0; i < size; i++) {
            sum += p[i];
        }
        return sum;
    }

26 WHIRL:
    FUNC_ENTRY
    IDNAME 0
    BODY
    BLOCK
    END_BLOCK
    BLOCK
    END_BLOCK
    BLOCK
    PRAGMA 0 120 0 (0x0) # PREAMBLE_END

27  LOC 1 4 int sum = 0;
    I4INTCONST 0 (0x0)
    I4STID 0 T
    LOC 1 5 int i;
    LOC 1 6
    LOC 1 7 for (i=0; i<size; i++) {
    I4INTCONST 0 (0x0)
    I4STID 0 T
    WHILE_DO
    I4I4LDID 0 T
    I4I4GT

28  BODY
    BLOCK
    LOC 1 8 sum += p[i];
    U8U8LDID 0 T
    I8I4LDID 0 T
    U8I8CVT
    U8INTCONST 4 (0x4)
    U8MPY
    U8ADD
    I4I4ILOAD 0 T T
    I4I4LDID 0 T
    I4ADD
    I4STID 0 T
    LOC 1 7
    LABEL L1 0
    I4I4LDID 0 T
    I4INTCONST 1 (0x1)
    I4ADD
    I4STID 0 T
    END_BLOCK

    Expression tree: sum = load(p + convert(i) * 4) + sum
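To make the expression tree concrete, the sketch below spells out `sum += p[i]` in C the way the dump reads: widen `i`, multiply by 4, add to the base address, load, then add `sum`. The symbol names are missing from the transcript, so the mapping of each LDID to `p`, `i`, or `sum` is inferred from the operand types and the tree; the function and array names are hypothetical, and a 4-byte `int` is assumed as in the dump.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Loads p[i] as load(p + convert(i) * 4). Assumes sizeof(int) == 4. */
int load_like_whirl(const int *p, int i) {
    uintptr_t base = (uintptr_t)p;          /* U8U8LDID: the pointer       */
    int64_t   idx  = (int64_t)i;            /* I8I4LDID: index widened     */
    uintptr_t off  = (uintptr_t)idx * 4u;   /* U8I8CVT, U8INTCONST, U8MPY  */
    int value;
    /* U8ADD forms the address, I4I4ILOAD reads four bytes from it. */
    memcpy(&value, (const void *)(base + off), sizeof value);
    return value;
}

/* The example function with its for loop lowered to a while loop. */
int foo_like_whirl(const int *p, int size) {
    int sum = 0;
    int i = 0;
    while (i < size) {                       /* WHILE_DO test             */
        sum = load_like_whirl(p, i) + sum;   /* I4ADD, I4STID             */
        i = i + 1;                           /* increment after LABEL L1  */
    }
    return sum;
}

/* Sample data for trying the functions out. */
const int whirl_sample[4] = { 1, 2, 3, 4 };
```

Calling `foo_like_whirl(whirl_sample, 4)` sums the four sample elements through the explicit address arithmetic instead of the subscript operator.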

29  LOC 1 9 }
    LOC 1 10
    LOC 1 11 return sum;
    I4I4LDID 0 T
    I4RETURN_VAL
    END_BLOCK

30  int loop1 (int *p, int size) {
        int i = 0;
        while (i < size) {
            i = i + 3;
            p[i] = 0;
            i = i + 1;
        }
        return 0;
    }

31  int loop2 (int *p, int *q, int size) {
        int i;
        for (i=0; i != size; i++) {
            *p = *q;
            p = p + 2;
            q = q + 3;
        }
        return 0;
    }

32  int loop3 (int *p, int *q, int size) {
        int i = 0;
        while (i < size) {
            int j = i + 1;
            p[j] = 0;
            i = j + 3;
            q[i] = 1;
        }
        return 0;
    }

33  int loop4 (int *a, int size) {
        int *p = a;
        int *q = &a[size];
        while (p != q) {
            *(++p) = 0;
        }
        return 0;
    }

34  int loop5 (int *a, int size) {
        int i = 0;
        while (i++ < size) {
            a[i] = 0;
        }
        return 0;
    }

35  int loop6 (int *a, int size, int t) {
        int i = 0;
        int sum = 0;
        while (i < size) {
            if (a[i] < t) {
                i = i + 1;
                continue;
            }
            sum += a[i];
            i = i + 1;
        }
        return sum;
    }

36  int loop7 (int *a, int size) {
        int i,j;
        int sum = 0;
        int k = 0;
        for (i = 0; i < size; i++) {
            for (j = 0; j < size; j++) {
                sum += a[k];
                k = k + 1;
            }
        }
        return sum;
    }

37 Dr. Fred Chow (PathScale, LLC)
   Dr. Handong Ye (CAPSL)

38 Shin-Ming Liu, Raymond Lo and Fred Chow, “Loop Induction Variable Canonicalization in Parallelizing Compilers”
   WHIRL Intermediate Language Specification (http://www.open64.net/documentation/manuals.html)
   How to Debug Open64 (Open64/doc/HOW-TO-DEBUG-OPEN64)

