CMPUT429/CMPE382 Winter 2001, Topic 9: Software Pipelining (some slides from David A. Patterson’s CS252, Spring 2001 Lecture Slides). Amaral, 1/17/01.


1 CMPUT429/CMPE382 Amaral 1/17/01: CMPUT429/CMPE382 Winter 2001, Topic 9: Software Pipelining (some slides from David A. Patterson’s CS252, Spring 2001 Lecture Slides)

2 Another possibility: Software Pipelining

Observation: if the iterations of a loop are independent, we can get more ILP by scheduling instructions from different iterations together.

Software pipelining reorganizes a loop so that each iteration of the new loop is made from instructions chosen from different iterations of the original loop.

3 Software Pipelining Example

Before: Unrolled 3 times

    LOOP: L.D     F0,0(R1)
          ADD.D   F4,F0,F2
          S.D     0(R1),F4
          L.D     F6,-8(R1)
          ADD.D   F8,F6,F2
          S.D     -8(R1),F8
          L.D     F10,-16(R1)
          ADD.D   F12,F10,F2
          S.D     -16(R1),F12
          DSUBUI  R1,R1,#24
          BNEZ    R1,LOOP

After: Software Pipelined

    LOOP: S.D     0(R1),F4      ; Stores M[i]
          ADD.D   F4,F0,F2      ; Adds to M[i-1]
          L.D     F0,-16(R1)    ; Loads M[i-2]
          DSUBUI  R1,R1,#8
          BNEZ    R1,LOOP

5 cycles per iteration.

Symbolic Loop Unrolling
– Maximize result-use distance
– Less code space than unrolling
– Fill & drain pipe only once per loop vs. once per each unrolled iteration in loop unrolling

[Figure: overlapped ops vs. time for the SW pipelined loop and the unrolled loop]
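The comments on the pipelined loop (Stores M[i], Adds to M[i-1], Loads M[i-2]) say that one pass through the new loop body touches three different original iterations. The sketch below is only an illustration of that idea, not part of the original slides: the function name `pipelined_schedule` and the three-stage tuple are assumptions made for the example. Iterations are numbered upward in issue order, so the store always belongs to the oldest iteration in flight.

```python
# Hypothetical helper: lists which original iteration each stage works on
# at every time step of a three-stage software pipeline.
STAGES = ("load", "add", "store")  # stage order within one original iteration


def pipelined_schedule(n_iters, stages=STAGES):
    """Return a list of time steps; each step lists (stage, iteration) pairs."""
    steps = []
    for t in range(n_iters + len(stages) - 1):
        # Stage k of iteration (t - k) is issued at time step t.
        ops = [(stage, t - k) for k, stage in enumerate(stages)
               if 0 <= t - k < n_iters]
        steps.append(ops)
    return steps


if __name__ == "__main__":
    for t, ops in enumerate(pipelined_schedule(4)):
        print(t, ops)
```

The first and last steps, where fewer than three stages are active, correspond to filling and draining the pipe; the steps in between are the steady state, where all three stages are busy.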

4 Software Pipelining Example

Before: Unrolled 3 times (same code as on slide 3)

After: Software Pipelined

          L.D     F0,0(R1)
          ADD.D   F4,F0,F2
          L.D     F0,-8(R1)
    ------------------------------------
    L:    S.D     0(R1),F4      ; Stores M[i]
          ADD.D   F4,F0,F2      ; Adds to M[i-1]
          L.D     F0,-16(R1)    ; Loads M[i-2]
          DSUBUI  R1,R1,#8
          BNEZ    R1,L
    ------------------------------------
          S.D     -8(R1),F4
          ADD.D   F4,F0,F2
          S.D     -16(R1),F4
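The three-part shape on slide 4 (start-up code, loop body, clean-up code) can be mimicked in a high-level language. The following is a minimal sketch under stated assumptions: the function names, the use of list indices in place of R1 byte offsets, and the loop bound are all choices made for this illustration and are not taken from the slides. The variables `f0`, `f2`, and `f4` only echo the register names.

```python
def add_scalar_plain(x, f2):
    """Reference loop: add the scalar f2 to every element, last to first."""
    out = list(x)
    for i in range(len(out) - 1, -1, -1):
        out[i] = out[i] + f2
    return out


def add_scalar_pipelined(x, f2):
    """Same result, restructured as prologue / kernel / epilogue."""
    out = list(x)
    n = len(out)
    if n < 2:                    # too short to pipeline; fall back
        return add_scalar_plain(x, f2)
    i = n - 1
    # Prologue: start the first two iterations.
    f0 = out[i]                  # load M[i]
    f4 = f0 + f2                 # add for M[i]
    f0 = out[i - 1]              # load M[i-1]
    # Kernel: each pass mixes three original iterations.
    while i >= 2:
        out[i] = f4              # store M[i]
        f4 = f0 + f2             # add for M[i-1]
        f0 = out[i - 2]          # load M[i-2]
        i -= 1
    # Epilogue: drain the two iterations still in flight (i == 1 here).
    out[i] = f4
    f4 = f0 + f2
    out[i - 1] = f4
    return out
```

The kernel runs n - 2 times for n elements, because the prologue and epilogue between them account for the two iterations that are only partly in flight at the start and at the end.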

5 Software Pipelining Example. After: Software Pipelined (prologue, loop L, and epilogue as on slide 4).
[Figure: memory X[1000], X[999], X[998], X[997], ... at addresses 0xFF00, 0xFEE8, 0xFEE0, 0xFED8, ...; R1 points into the array; registers F0, F2, F4. Values shown: X[1000], s.]

6 Software Pipelining Example (same code).
[Figure: as slide 5, with an adder (+). Values shown: x[1000], s, T1.]

7 Software Pipelining Example (same code).
[Figure: values shown: x[999], s, T1.]

8 Software Pipelining Example (same code).
[Figure: T1 now appears in memory in place of X[1000]. Values shown: x[999], s, T1.]

9 Software Pipelining Example (same code).
[Figure: adder (+); memory shown as X[1000], X[999], X[998], X[997], ... Values shown: x[999], s, T2.]

10 Software Pipelining Example (same code).
[Figure: values shown: x[998], s, T2.]

11 Software Pipelining Example (same code).
[Figure: unchanged from slide 10.]

12 Software Pipelining Example (same code).
[Figure: T2 now appears in memory in place of X[999]. Values shown: x[998], s, T2.]

13 Software Pipelining Example in the IA-64

    loop: (p16) ldl r32 = [r12], 1
          (p17) add r34 = 1, r33
          (p18) stl [r13] = r35, 1
                br.ctop loop

[Figure: predicate registers p16, p17, p18 = 1, 0, 0; LC = 4; EC = 3; RRB = 0. Memory: x1 x2 x3 x4 x5. General registers r32 to r39, physical and logical rows, no values yet.]

14 Software Pipelining Example in the IA-64 (same loop).
[Figure: as slide 13, with x1 now in the general registers.]

15 Software Pipelining Example in the IA-64 (same loop).
[Figure: unchanged from slide 14.]

16 Software Pipelining Example in the IA-64 (same loop).
[Figure: unchanged from slide 14.]

17 Software Pipelining Example in the IA-64 (same loop).
[Figure: p16, p17, p18 = 1, 0, 0; LC = 4; EC = 3. A 1 is shown entering the predicate registers, and the physical register labels are rotated by one position against the logical labels. General registers hold x1.]

18 Software Pipelining Example in the IA-64 (same loop).
[Figure: p16, p17, p18 = 1, 1, 0; LC = 3; EC = 3; RRB = -1. Memory: x1 x2 x3 x4 x5. General registers hold x1.]

19 Software Pipelining Example in the IA-64 (same loop).
[Figure: as slide 18; general registers hold x1, x2.]

20 Software Pipelining Example in the IA-64 (same loop).
[Figure: as slide 18; general registers hold x1, x2, y1.]

21 Software Pipelining Example in the IA-64 (same loop).
[Figure: unchanged from slide 20.]

22 Software Pipelining Example in the IA-64 (same loop).
[Figure: unchanged from slide 20.]

23 Software Pipelining Example in the IA-64 (same loop).
[Figure: p16, p17, p18 = 1, 1, 1; LC = 2; EC = 3; RRB = -2. A 1 is shown entering the predicate registers; physical register labels rotated by two positions. Memory: x1 x2 x3 x4 x5. General registers hold x1, x2, y1.]

24 Software Pipelining Example in the IA-64 (same loop).
[Figure: as slide 23; general registers hold x1, x2, y1, x3.]

25 Software Pipelining Example in the IA-64 (same loop).
[Figure: as slide 23; y2 replaces x1, so the general registers hold y2, x2, y1, x3.]

26 Software Pipelining Example in the IA-64 (same loop).
[Figure: as slide 25; memory now also holds y1.]

27 Software Pipelining Example in the IA-64 (same loop).
[Figure: unchanged from slide 26.]

28 Software Pipelining Example in the IA-64 (same loop).
[Figure: p16, p17, p18 = 1, 1, 1; LC = 1; EC = 3; RRB = -3. A 1 is shown entering the predicate registers; physical register labels rotated by three positions. Memory: x1 x2 x3 x4 x5 and y1. General registers hold y2, x2, y1, x3.]

29 Software Pipelining Example in the IA-64 (same loop).
[Figure: as slide 28; general registers hold y2, x4, x2, y1, x3.]

30 Software Pipelining Example in the IA-64 (same loop).
[Figure: as slide 28; y3 replaces x2, so the general registers hold y2, x4, y3, y1, x3.]

31 Software Pipelining Example in the IA-64 (same loop).
[Figure: as slide 30; memory now holds y1, y2.]

32 Software Pipelining Example in the IA-64 (same loop).
[Figure: unchanged from slide 31.]

33 Software Pipelining Example in the IA-64 (same loop).
[Figure: p16, p17, p18 = 1, 1, 1; LC = 0; EC = 3; RRB = -4. A 1 is shown entering the predicate registers; physical register labels rotated by four positions. Memory: x1 x2 x3 x4 x5 and y1, y2. General registers hold y2, x4, y3, y1, x3.]

34 Software Pipelining Example in the IA-64 (same loop).
[Figure: as slide 33; general registers hold y2, x5, x4, y3, y1, x3.]

35 Software Pipelining Example in the IA-64 (same loop).
[Figure: as slide 33; y4 replaces x3, so the general registers hold y2, x5, x4, y3, y1, y4.]

36 Software Pipelining Example in the IA-64 (same loop).
[Figure: as slide 35; memory now holds y1, y2, y3.]

37 Software Pipelining Example in the IA-64 (same loop).
[Figure: unchanged from slide 36.]

38 Software Pipelining Example in the IA-64 (same loop).
[Figure: p16, p17, p18 = 0, 1, 1; LC = 0; EC = 2; RRB = -5. A 0 is shown entering the predicate registers; physical register labels rotated by five positions. Memory: x1 x2 x3 x4 x5 and y1, y2, y3. General registers hold y2, x5, x4, y3, y1, y4.]

39 Software Pipelining Example in the IA-64 (same loop).
[Figure: unchanged from slide 38.]

40 Software Pipelining Example in the IA-64 (same loop).
[Figure: as slide 38; y5 replaces x4, so the general registers hold y2, x5, y5, y3, y1, y4.]

41 Software Pipelining Example in the IA-64 (same loop).
[Figure: as slide 40; memory now holds y1, y2, y3, y4.]

42 Software Pipelining Example in the IA-64 (same loop).
[Figure: unchanged from slide 41.]

43 Software Pipelining Example in the IA-64 (same loop).
[Figure: p16, p17, p18 = 0, 0, 1; LC = 0; EC = 1; RRB = -6. A 0 is shown entering the predicate registers; physical register labels rotated by six positions. Memory: x1 x2 x3 x4 x5 and y1, y2, y3, y4. General registers hold y2, x5, y5, y3, y1, y4.]

44 Software Pipelining Example in the IA-64 (same loop).
[Figure: unchanged from slide 43.]

45 Software Pipelining Example in the IA-64 (same loop).
[Figure: unchanged from slide 43.]

46 Software Pipelining Example in the IA-64 (same loop).
[Figure: as slide 43; memory now holds y1, y2, y3, y4, y5.]

47 Software Pipelining Example in the IA-64 (same loop).
[Figure: unchanged from slide 46.]

48 Software Pipelining Example in the IA-64 (same loop).
[Figure: unchanged from slide 46.]

49 Software Pipelining Example in the IA-64 (same loop).
[Figure: p16, p17, p18 = 0, 0, 0; LC = 0; EC = 0; RRB = -7. A 0 is shown entering the predicate registers. Memory: x1 x2 x3 x4 x5 and y1, y2, y3, y4, y5. General registers hold y2, x5, y5, y3, y1, y4.]
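The animation on slides 13 to 49 can be replayed with a small simulation. This is a simplified model written for illustration, not the author's code or a full IA-64 emulator. It assumes eight rotating general registers (r32 to r39, as drawn on the slides), only three stage predicates (p16 to p18), the usual modular mapping from logical to physical register, and that the `add` computes y = x + 1. The function name `run_ctop_loop` and the use of Python lists for memory are hypothetical.

```python
def run_ctop_loop(x, n_rot=8):
    """Simplified model of the slides' loop: y[i] = x[i] + 1 with rotating registers."""
    assert len(x) >= 1
    phys = [None] * n_rot            # physical r32..r39
    pred = [True, False, False]      # p16, p17, p18
    lc, ec, rrb = len(x) - 1, 3, 0   # loop count, epilog count, rotating register base
    src = 0                          # stand-in for the r12 load pointer
    out = []                         # stand-in for memory written through r13
    passes = 0

    def reg(logical):
        # Assumed mapping from a logical register number to an index in phys.
        return (logical - 32 + rrb) % n_rot

    while True:
        passes += 1
        if pred[0]:                  # (p16) ldl r32 = [r12], 1
            phys[reg(32)] = x[src]
            src += 1
        if pred[1]:                  # (p17) add r34 = 1, r33
            phys[reg(34)] = 1 + phys[reg(33)]
        if pred[2]:                  # (p18) stl [r13] = r35, 1
            out.append(phys[reg(35)])
        # br.ctop: update LC/EC, choose the bit rotated into p16, decide the branch.
        if lc > 0:
            lc, new_p16, rotate, taken = lc - 1, True, True, True
        elif ec > 1:
            ec, new_p16, rotate, taken = ec - 1, False, True, True
        elif ec == 1:
            ec, new_p16, rotate, taken = 0, False, True, False
        else:
            new_p16, rotate, taken = False, False, False
        if rotate:
            rrb -= 1
            pred = [new_p16] + pred[:-1]
        if not taken:
            return out, lc, ec, rrb, passes
```

Called with five input values, as in the slides, the model makes seven passes through the loop body (five to issue loads plus two more to drain the add and store stages) and ends with LC = 0, EC = 0, and RRB = -7, which matches the state shown on slide 49.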

