CMPUT429/CMPE382 (Amaral, 1/17/01), Winter 2001. Topic 9: Software Pipelining (some slides from David A. Patterson's CS252, Spring 2001 lecture slides)

Another possibility: Software Pipelining
Observation: if the iterations of a loop are independent, we can get more ILP by scheduling together instructions from different iterations.
Software pipelining: reorganizes a loop so that each iteration of the new loop is made from instructions chosen from different iterations of the original loop.
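The definition above can be made concrete with a small sketch. It assumes a three-stage loop body (load, add, store) and independent iterations; the function name `pipelined_schedule` and the stage names are illustrative and do not come from the slides. Each row lists which original iteration every stage works on during one pass of the reorganized loop.

```python
# Minimal sketch: which original iteration each stage handles in each
# pass of a software-pipelined loop. Names are hypothetical.
STAGES = ("load", "add", "store")


def pipelined_schedule(n_iterations, stages=STAGES):
    """Return one row per pass of the reorganized loop.

    In pass k, the stage at depth d works on original iteration k - d,
    provided that iteration exists.
    """
    rows = []
    for k in range(n_iterations + len(stages) - 1):
        row = [
            (stage, k - depth)
            for depth, stage in enumerate(stages)
            if 0 <= k - depth < n_iterations
        ]
        rows.append(row)
    return rows


if __name__ == "__main__":
    for k, row in enumerate(pipelined_schedule(5)):
        print(k, row)
```

The first and last rows are only partially filled: they correspond to filling and draining the software pipeline, while the full rows in between form the steady-state loop body.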

Software Pipelining Example

Before: Unrolled 3 times
    LOOP: L.D     F0,0(R1)
          ADD.D   F4,F0,F2
          S.D     0(R1),F4
          L.D     F6,-8(R1)
          ADD.D   F8,F6,F2
          S.D     -8(R1),F8
          L.D     F10,-16(R1)
          ADD.D   F12,F10,F2
          S.D     -16(R1),F12
          DSUBUI  R1,R1,#24
          BNEZ    R1,LOOP

After: Software Pipelined
    LOOP: S.D     0(R1),F4      ; Stores M[i]
          ADD.D   F4,F0,F2      ; Adds to M[i-1]
          L.D     F0,-16(R1)    ; Loads M[i-2]
          DSUBUI  R1,R1,#8
          BNEZ    R1,LOOP

Symbolic Loop Unrolling
– Maximize result-use distance
– Less code space than unrolling
– Fill & drain pipe only once per loop vs. once per each unrolled iteration in loop unrolling

[Figure: overlapped ops vs. time, for the software-pipelined loop and for the unrolled loop]
5 cycles per iteration

Software Pipelining Example

Before: Unrolled 3 times (same listing as on the previous slide)

After: Software Pipelined
          L.D     F0,0(R1)      ; start-up code: fills the pipeline
          ADD.D   F4,F0,F2
          L.D     F0,-8(R1)
    L:    S.D     0(R1),F4      ; Stores M[i]
          ADD.D   F4,F0,F2      ; Adds to M[i-1]
          L.D     F0,-16(R1)    ; Loads M[i-2]
          DSUBUI  R1,R1,#8
          BNEZ    R1,L
          S.D     -8(R1),F4     ; clean-up code: drains the pipeline
          ADD.D   F4,F0,F2
          S.D     -16(R1),F4
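A rough Python model of the listing above may help check that the reorganized loop computes the same result as the original one. The variables `f0` and `f4` stand in for registers F0 and F4, and `s` for the scalar in F2. The loop bounds and the indexing of the clean-up code are assumptions of this sketch, since the slide does not show how R1 is initialized or exactly when the loop exits.

```python
# Sketch only: array indices replace the byte offsets used on the slide,
# and the loop bounds are an assumption (the slide does not show them).

def plain_loop(x, s):
    """Original loop: load, add, store for each element, highest index first."""
    m = list(x)
    for i in range(len(m) - 1, -1, -1):
        f0 = m[i]       # L.D
        f4 = f0 + s     # ADD.D
        m[i] = f4       # S.D
    return m


def software_pipelined(x, s):
    """Start-up code, steady-state loop, and clean-up code."""
    m = list(x)
    n = len(m)
    assert n >= 2, "this sketch needs at least two elements"
    i = n - 1
    # Start-up: load and add for element i, load for element i-1.
    f0 = m[i]
    f4 = f0 + s
    f0 = m[i - 1]
    # Steady state: each pass touches three different elements.
    while i >= 2:
        m[i] = f4       # stores M[i]
        f4 = f0 + s     # adds to M[i-1]
        f0 = m[i - 2]   # loads M[i-2]
        i -= 1
    # Clean-up: here i == 1; finish the two elements still in flight.
    m[i] = f4
    f4 = f0 + s
    m[i - 1] = f4
    return m


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(plain_loop(data, 10.0))
    print(software_pipelined(data, 10.0))
```

Because only two floating-point registers carry values between passes, the store at the top of the loop must run before the add overwrites `f4`, which is why the stages appear in reverse order inside the loop body.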

Software Pipelining Example (step-by-step animation, 8 frames)

After: Software Pipelined
          L.D     F0,0(R1)
          ADD.D   F4,F0,F2
          L.D     F0,-8(R1)
    L:    S.D     0(R1),F4      ; Stores M[i]
          ADD.D   F4,F0,F2      ; Adds to M[i-1]
          L.D     F0,-16(R1)    ; Loads M[i-2]
          DSUBUI  R1,R1,#8
          BNEZ    R1,L
          S.D     -8(R1),F4
          ADD.D   F4,F0,F2
          S.D     -16(R1),F4

[Animation: memory holds X[1000], X[999], X[998], X[997], ... beside the address labels 0xFF00, 0xFEE8, 0xFEE0, 0xFED8, ...; R1 points into the array; registers F0, F2, and F4 are shown, with s in F2.]
Frame 1: F0 = X[1000].
Frame 2: F4 = T1, the sum produced by the first ADD.D.
Frame 3: F0 = X[999].
Frame 4: T1 replaces X[1000] in memory.
Frame 5: F4 = T2, the sum computed from X[999].
Frame 6: F0 = X[998].
Frame 7: no change visible in the transcript.
Frame 8: T2 is stored in memory in place of X[999].

Software Pipelining Example in the IA-64 (step-by-step animation)

    loop: (p16) ldl r32 = [r12], 1
          (p17) add r34 = 1, r33
          (p18) stl [r13] = r35, 1
                br.ctop loop

[Animation: panels for General Registers (Physical), General Registers (Logical), Predicate Registers, Memory, LC, EC, and RRB.]
Initial state: LC = 4, EC = 3, RRB = 0; memory holds x1, x2, x3, x4, x5.

Pass 1: loads x1. br.ctop writes a predicate of 1; LC = 3, EC = 3, RRB = -1.
Pass 2: loads x2, computes y1. br.ctop writes a predicate of 1; LC = 2, EC = 3, RRB = -2.
Pass 3: loads x3, computes y2, stores y1. br.ctop writes a predicate of 1; LC = 1, EC = 3, RRB = -3.
Pass 4: loads x4, computes y3, stores y2. br.ctop writes a predicate of 1; LC = 0, EC = 3, RRB = -4.
Pass 5: loads x5, computes y4, stores y3. br.ctop writes a predicate of 0; LC = 0, EC = 2, RRB = -5.
Pass 6: computes y5, stores y4. br.ctop writes a predicate of 0; LC = 0, EC = 1, RRB = -6.
Pass 7: stores y5. br.ctop writes a predicate of 0; LC = 0, EC = 0, RRB = -7.

Final state: memory holds x1, x2, x3, x4, x5 and y1, y2, y3, y4, y5.
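The counter values in the animation can be reproduced with a small simulation of the loop branch. This is a simplified sketch, not a full model of IA-64: it tracks only LC, EC, RRB, and three stage predicates standing in for p16, p17, and p18, and it assumes the loop is entered with p16 = 1 and p17 = p18 = 0. The function names `br_ctop` and `run_kernel` are illustrative.

```python
# Simplified sketch of br.ctop for a counted, software-pipelined loop.
# Assumption: on entry p16 = 1 and the later stage predicates are 0.

def br_ctop(lc, ec, rrb):
    """Return (lc, ec, rrb, new_predicate, branch_taken) after one br.ctop."""
    if lc > 0:                      # more source iterations to start
        return lc - 1, ec, rrb - 1, 1, True
    if ec > 1:                      # draining: keep looping
        return lc, ec - 1, rrb - 1, 0, True
    if ec == 1:                     # last drain pass: fall through
        return lc, 0, rrb - 1, 0, False
    return lc, ec, rrb, 0, False    # EC == 0: nothing rotates


def run_kernel(lc, ec, n_stages=3):
    """Run the loop and record, per pass, the active stages and counters."""
    preds = [1] + [0] * (n_stages - 1)   # stands in for p16, p17, p18
    rrb = 0
    trace = []
    while True:
        active = list(preds)             # stages that execute in this pass
        old_rrb = rrb
        lc, ec, rrb, new_pred, taken = br_ctop(lc, ec, rrb)
        if rrb != old_rrb:
            # Rotation: each predicate moves one stage down the pipeline and
            # the newly written value appears as the first-stage predicate.
            preds = [new_pred] + preds[:-1]
        trace.append({"active": active, "lc": lc, "ec": ec, "rrb": rrb})
        if not taken:
            return trace


if __name__ == "__main__":
    for number, step in enumerate(run_kernel(lc=4, ec=3), start=1):
        print(number, step)
```

With LC = 4 and EC = 3 the sketch runs the loop body seven times, and each of the three stages is active in exactly five of those passes, one per array element.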