ECE 232 L19.Pipeline2.1 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ECE 232 Hardware Organization and Design Lecture 19 Pipelining,


1 ECE 232 Hardware Organization and Design, Lecture 19: Pipelining, cont’d
Maciej Ciesielski
www.ecs.umass.edu/ece/labs/vlsicad/ece232/spr2002/index_232.html

2 Outline
°Pipelining, review
°Design
  Control
  Datapath
  Examples

3 Designing a Pipelined Processor
°Go back and examine your datapath and control diagram
°Associate resources with states
°Ensure that flows do not conflict, or figure out how to resolve the conflict
°Assert control in the appropriate stage

4 Pipelined Datapath (as in book); hard to read

5 Pipelined Processor (almost) for slides
°What happens if we start a new instruction every cycle?
[Figure: pipelined datapath. Blocks: Next PC, PC, Inst. Mem, IR, Reg File, Exec, Mem Access, Data Mem, Reg. File; latches A, B, S, M; signals Equal, Valid; control blocks Dcd Ctrl, Ex Ctrl, Mem Ctrl, WB Ctrl with instruction registers IRex, IRmem, IRwb.]

6 Control and Datapath
Register transfers:
  Ifetch:   IR <- Mem[PC]; PC <- PC+4
  Reg/Dec:  A <- R[rs]; B <- R[rt]
  R-type:   S <- A + B; R[rd] <- S
  Load:     S <- A + SX; M <- Mem[S]; R[rd] <- M
  ori:      S <- A or ZX; R[rt] <- S
  Store:    S <- A + SX; Mem[S] <- B
  Branch:   If Cond PC <- PC+SX
[Figure: datapath. Blocks: Next PC, PC, Inst. Mem, IR, Reg File, Exec, Mem Access, Data Mem, Reg. File; latches A, B, S, D, M; signal Equal.]

7 Pipelining the Load Instruction
°The five independent functional units in the pipeline datapath are:
  Instruction Memory for the Ifetch stage
  Register File’s Read ports (bus A and bus B) for the Reg/Dec stage
  ALU for the Exec stage
  Data Memory for the Mem stage
  Register File’s Write port (bus W) for the Wr stage
Clock   Cycle 1  Cycle 2  Cycle 3  Cycle 4  Cycle 5  Cycle 6  Cycle 7
1st lw  Ifetch   Reg/Dec  Exec     Mem      Wr
2nd lw           Ifetch   Reg/Dec  Exec     Mem      Wr
3rd lw                    Ifetch   Reg/Dec  Exec     Mem      Wr
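The staggered timing on this slide can be reproduced with a few lines of Python. This is only a sketch: the function names `pipeline_chart` and `render` are made up for illustration, and it assumes one instruction is started per cycle with no stalls.

```python
STAGES = ["Ifetch", "Reg/Dec", "Exec", "Mem", "Wr"]

def pipeline_chart(labels, stages=STAGES):
    """Return (label, cells) rows; instruction i enters the pipeline in cycle i + 1."""
    total_cycles = len(labels) + len(stages) - 1
    rows = []
    for i, label in enumerate(labels):
        cells = [""] * total_cycles
        for s, stage in enumerate(stages):
            cells[i + s] = stage  # stage s of instruction i occupies cycle i + s + 1
        rows.append((label, cells))
    return rows

def render(rows):
    """Format the rows as a fixed-width timing diagram."""
    n_cycles = len(rows[0][1])
    header = "Clock".ljust(8) + "".join(f"Cycle {c}".ljust(9) for c in range(1, n_cycles + 1))
    lines = [header]
    for label, cells in rows:
        lines.append(label.ljust(8) + "".join(cell.ljust(9) for cell in cells))
    return "\n".join(line.rstrip() for line in lines)

rows = pipeline_chart(["1st lw", "2nd lw", "3rd lw"])
print(render(rows))
```

Each column of the output contains every stage name at most once, which is the point of the slide: the five functional units are independent, so three loads can overlap without contention.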

8 The Four Stages of R-type
°Ifetch: Instruction Fetch
  Fetch the instruction from the Instruction Memory
°Reg/Dec: Registers Fetch and Instruction Decode
°Exec:
  ALU operates on the two register operands
  Update PC
°Wr: Write the ALU output back to the register file
        Cycle 1  Cycle 2  Cycle 3  Cycle 4
R-type  Ifetch   Reg/Dec  Exec     Wr

9 Pipelining the R-type and Load Instruction
°We have a pipeline conflict, or structural hazard:
  Two instructions try to write to the register file at the same time!
  Only one write port
Clock   Cycle 1  Cycle 2  Cycle 3  Cycle 4  Cycle 5  Cycle 6  Cycle 7  Cycle 8  Cycle 9
R-type  Ifetch   Reg/Dec  Exec     Wr
R-type           Ifetch   Reg/Dec  Exec     Wr
Load                      Ifetch   Reg/Dec  Exec     Mem      Wr
R-type                             Ifetch   Reg/Dec  Exec     Wr
R-type                                      Ifetch   Reg/Dec  Exec     Wr
Oops! We have a problem!
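The conflict can be checked mechanically. The sketch below assumes the instruction mix from this slide (R-type, R-type, Load, R-type, R-type), one instruction started per cycle, and a single register-file write port; `write_port_conflicts` is an illustrative name, not part of the lecture.

```python
R_TYPE = ["Ifetch", "Reg/Dec", "Exec", "Wr"]        # 4-stage R-type
LOAD = ["Ifetch", "Reg/Dec", "Exec", "Mem", "Wr"]   # 5-stage load

def write_port_conflicts(program):
    """program: list of (label, stage list); instruction i is fetched in cycle i + 1.

    Returns {cycle: [labels]} for every cycle in which more than one
    instruction is in its Wr stage.
    """
    writers = {}
    for i, (label, stages) in enumerate(program):
        wr_cycle = (i + 1) + stages.index("Wr")
        writers.setdefault(wr_cycle, []).append(label)
    return {cycle: labels for cycle, labels in writers.items() if len(labels) > 1}

program = [
    ("R-type #1", R_TYPE),
    ("R-type #2", R_TYPE),
    ("Load", LOAD),
    ("R-type #3", R_TYPE),
    ("R-type #4", R_TYPE),
]
conflicts = write_port_conflicts(program)
print(conflicts)
```

With this mix the only collision is in cycle 7, where the load and the R-type fetched one cycle after it both reach Wr.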

10 Important Observation
°Each functional unit can only be used once per instruction
°Each functional unit must be used at the same stage for all instructions:
  Load uses Register File’s Write Port during its 5th stage
  R-type uses Register File’s Write Port during its 4th stage
        1        2        3        4        5
Load    Ifetch   Reg/Dec  Exec     Mem      Wr
R-type  Ifetch   Reg/Dec  Exec     Wr
°Two ways to solve this pipeline hazard.

11 Solution 1: Insert “Bubble” into the Pipeline
°Insert a “bubble” into the pipeline to prevent two writes in the same cycle
  The control logic can be complex.
  Lose instruction fetch and issue opportunity.
°No instruction is started in Cycle 6!
[Figure: pipeline timing diagram over Cycles 1 to 9 for a mix of R-type and Load instructions, with a pipeline bubble inserted.]

12 Solution 2: Delay R-type’s Write by One Cycle
°Delay R-type’s register write by one cycle:
  Now R-type instructions also use Reg File’s write port at Stage 5
  Mem stage is a NOOP stage: nothing is being done.
        1        2        3        4        5
R-type  Ifetch   Reg/Dec  Exec     Mem      Wr
Clock   Cycle 1  Cycle 2  Cycle 3  Cycle 4  Cycle 5  Cycle 6  Cycle 7  Cycle 8  Cycle 9
R-type  Ifetch   Reg/Dec  Exec     Mem      Wr
R-type           Ifetch   Reg/Dec  Exec     Mem      Wr
Load                      Ifetch   Reg/Dec  Exec     Mem      Wr
R-type                             Ifetch   Reg/Dec  Exec     Mem      Wr
R-type                                      Ifetch   Reg/Dec  Exec     Mem      Wr
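A short check shows why padding R-type to five stages removes the hazard: once every instruction writes in its 5th stage, write cycles are simply the start cycles shifted by four, so they can never coincide. The sketch assumes one instruction is started per cycle; `write_cycles` is an illustrative helper name.

```python
FIVE_STAGES = ["Ifetch", "Reg/Dec", "Exec", "Mem", "Wr"]  # Mem is a NOOP for R-type

def write_cycles(kinds):
    """Return (kind, Wr cycle) pairs when every instruction uses all five stages."""
    wr_offset = FIVE_STAGES.index("Wr")
    return [(kind, start + wr_offset) for start, kind in enumerate(kinds, start=1)]

schedule = write_cycles(["R-type", "R-type", "Load", "R-type", "R-type"])
cycles = [cycle for _, cycle in schedule]
print(schedule)
print("conflict-free:", len(cycles) == len(set(cycles)))
```

The price is one extra cycle of latency for each R-type instruction; throughput stays at one instruction per cycle.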

13 Modified Control & Datapath
Register transfers:
  Ifetch:   IR <- Mem[PC]; PC <- PC+4
  Reg/Dec:  A <- R[rs]; B <- R[rt]
  R-type:   S <- A + B; M <- S; R[rd] <- M
  Load:     S <- A + SX; M <- Mem[S]; R[rd] <- M
  ori:      S <- A or ZX; M <- S; R[rt] <- M
  Store:    S <- A + SX; Mem[S] <- B
  Branch:   If Cond PC <- PC+SX
[Figure: datapath. Blocks: Next PC, PC, Inst. Mem, IR, Reg File, Exec, Mem Access, Data Mem, Reg. File; latches A, B, S, D, M; signal Equal.]
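To make the register transfers concrete, here is a minimal Python sketch that walks a single instruction through the Reg/Dec, Exec, Mem, and Wr steps using the latches A, B, S, and M. Several things are assumptions made for illustration: instructions are plain dictionaries, memory is a dictionary keyed by address, values are not wrapped to 32 bits, branches are omitted, and the load writes the I-format destination field rt (the slide writes R[rd] for it).

```python
def sign_extend16(value):
    """SX: interpret the low 16 bits as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def zero_extend16(value):
    """ZX: keep the low 16 bits, upper bits zero."""
    return value & 0xFFFF

def execute(instr, R, Mem):
    """Apply one instruction's register transfers to register file R and memory Mem."""
    op = instr["op"]
    # Reg/Dec: A <- R[rs]; B <- R[rt]
    A = R[instr["rs"]]
    B = R[instr["rt"]]
    # Exec: compute S
    if op == "add":
        S = A + B
    elif op == "ori":
        S = A | zero_extend16(instr["imm"])
    elif op in ("lw", "sw"):
        S = A + sign_extend16(instr["imm"])
    else:
        raise ValueError(f"unsupported op: {op}")
    # Mem: access data memory, or just pass S along (M <- S)
    if op == "lw":
        M = Mem[S]
    elif op == "sw":
        Mem[S] = B
        return  # a store has no write-back
    else:
        M = S
    # Wr: the register file is always written from M
    dest = instr["rd"] if op == "add" else instr["rt"]
    R[dest] = M

R = [0] * 32
R[2], R[4], R[5] = 100, 7, 5
Mem = {135: 42}
execute({"op": "lw", "rs": 2, "rt": 1, "imm": 35}, R, Mem)
execute({"op": "add", "rs": 4, "rt": 5, "rd": 3}, R, Mem)
execute({"op": "sw", "rs": 2, "rt": 3, "imm": -4}, R, Mem)
print(R[1], R[3], Mem[96])
```

Because every register write now comes from M, the write-back logic is the same for R-type, ori, and load, which is what allows them to share stage 5.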

14 The Four Stages of Store
°Ifetch: Instruction Fetch
  Fetch the instruction from the Instruction Memory
°Reg/Dec: Registers Fetch and Instruction Decode
°Exec: Calculate the memory address
°Mem: Write the data into the Data Memory
        Cycle 1  Cycle 2  Cycle 3  Cycle 4
Store   Ifetch   Reg/Dec  Exec     Mem      (Wr unused)

15 The Three Stages of Beq
°Ifetch: Instruction Fetch
  Fetch the instruction from the Instruction Memory
°Reg/Dec: Registers Fetch and Instruction Decode
°Exec:
  Compare the two register operands
  Select the correct branch target address
  Latch it into the PC
        Cycle 1  Cycle 2  Cycle 3  Cycle 4
Beq     Ifetch   Reg/Dec  Exec     (Mem, Wr unused)

16 Control Diagram
Register transfers:
  IR <- Mem[PC]; PC <- PC+4
  A <- R[rs]; B <- R[rt]
  S <- A + B; R[rd] <- S
  S <- A + SX; M <- Mem[S]; R[rd] <- M
  S <- A or ZX; R[rt] <- S
  S <- A + SX; Mem[S] <- B
  If Cond PC <- PC+SX
  M <- S
[Figure: datapath. Blocks: Next PC, PC, Inst. Mem, IR, Reg File, Exec, Mem Access, Data Mem, Reg. File; latches A, B, S, D, M; signal Equal.]

17 Datapath + Data Stationary Control
[Figure: datapath with data-stationary control. Decode takes the IR fields op, rs, rt, fun, im and produces the control fields ex, me, wb, rw, v; the fields me, wb, rw, v and then wb, rw, v move down the pipeline with the instruction to Mem Ctrl and WB Ctrl.]

18 Let’s Try it Out
°Execute these instructions on the pipelined machine (these addresses are octal)
 10  lw   r1, r2(35)
 14  addI r2, r2, 3
 20  sub  r3, r4, r5
 24  beq  r6, r7, 100
 30  ori  r8, r9, 17
 34  add  r10, r11, r12
 ...
100  and  r13, r14, r15

19 Start: Fetch 10
(Program as listed on slide 18.)
[Datapath snapshot: fetch address 10; no instruction yet in the Decode, Exec, Mem, or WB stages (control fields n).]

20 Fetch 14, Decode 10
(Program as listed on slide 18.)
[Datapath snapshot: fetch address 14; IR = lw r1, r2(35), reading register 2; later stages empty (n).]

21 Fetch 20, Decode 14, Exec 10
(Program as listed on slide 18.)
[Datapath snapshot: fetch address 20; IR = addI r2, r2, 3, reading register 2; Exec: lw r1 with A = r2 and immediate 35; later stages empty (n).]

22 Fetch 24, Decode 20, Exec 14, Mem 10
(Program as listed on slide 18.)
[Datapath snapshot: fetch address 24; IR = sub r3, r4, r5, reading registers 4 and 5; Exec: addI r2 with A = r2 and immediate 3; Mem: lw r1 with S = r2+35; WB empty (n).]

23 Fetch 30, Dcd 24, Ex 20, Mem 14, WB 10
(Program as listed on slide 18.)
[Datapath snapshot: fetch address 30; IR = beq r6, r7, 100, reading registers 6 and 7; Exec: sub r3 with A = r4, B = r5; Mem: addI r2 with S = r2+3; WB: lw r1 with M = M[r2+35].]

24 Fetch 34, Dcd 30, Ex 24, Mem 20, WB 14
(Program as listed on slide 18.)
Register state so far: r1 = M[r2+35]
°Assume branch taken: address 100 will be loaded into PC
°30: first delayed instruction
[Datapath snapshot: fetch address 34; IR = ori r8, r9, 17, reading register 9; Exec: beq with A = r6, B = r7 compared (=?), target 100; Mem: sub r3 with S = r4-r5; WB: addI r2 with r2+3.]

25 Fetch 100, Dcd 34, Ex 30, Mem 24, WB 20
(Program as listed on slide 18.)
Register state so far: r1 = M[r2+35]; r2 = r2+3
°34: second delayed instruction
[Datapath snapshot: fetch address 100; IR = add r10, r11, r12, reading registers 11 and 12; Exec: ori r8 with A = r9 and immediate 17; Mem: beq; WB: sub r3 with r4-r5.]

26 Fetch 104, Dcd 100, Ex 34, Mem 30, WB 24
(Program as listed on slide 18.)
Register state so far: r1 = M[r2+35]; r2 = r2+3; r3 = r4-r5
°Two instructions squashed in the branch shadow!
[Datapath snapshot: fetch address 104; IR = and r13, r14, r15, reading registers 14 and 15; Exec: add r10 with A = r11, B = r12; Mem: ori r8 with S = r9 | 17; WB: beq (no register write, n).]

27 Fetch 110, Dcd 104, Ex 100, Mem 34, WB 30
(Program as listed on slide 18.)
Register state so far: r1 = M[r2+35]; r2 = r2+3; r3 = r4-r5
[Datapath snapshot: fetch address 110; Exec: and r13 with A = r14, B = r15; Mem: add r10 with S = r11+r12; WB: ori r8 with r9 | 17.]

28 Fetch 114, Dcd 110, Ex 104, Mem 100, WB 34
(Program as listed on slide 18.)
Register state so far: r1 = M[r2+35]; r2 = r2+3; r3 = r4-r5; r8 = r9 | 17
[Datapath snapshot: fetch address 114; Mem: and r13 with S = r14 & r15; WB: add r10 with r11+r12, annotated “NO WB” and “NO Overflow”.]
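The stage occupancy in the titles of slides 19 to 28 follows from one rule: the PC advances by 4 every cycle unless a taken branch is in Exec, in which case the target is loaded for the next fetch. The sketch below reproduces that address sequence. It is an illustration only: the function names are invented, addresses are written as Python octal literals to match the slides, and it tracks which address sits in which stage without modelling squashing or register contents.

```python
STAGE_NAMES = ["Fetch", "Dcd", "Ex", "Mem", "WB"]

def fetch_sequence(start, n_cycles, branch_addr, target, taken=True):
    """Return the address fetched in each cycle.

    The branch reaches Exec two cycles after it is fetched; if it is taken,
    the target is loaded into the PC for the following cycle.
    """
    seq = []
    pc = start
    for _ in range(n_cycles):
        seq.append(pc)
        in_exec = seq[-3] if len(seq) >= 3 else None  # address currently in Exec
        if taken and in_exec == branch_addr:
            pc = target
        else:
            pc += 4
    return seq

def occupancy(seq, cycle_index):
    """Addresses in Fetch, Dcd, Ex, Mem, WB for a 0-based cycle index (None = empty)."""
    return [seq[cycle_index - k] if cycle_index - k >= 0 else None
            for k in range(len(STAGE_NAMES))]

def title(seq, cycle_index):
    """Format one cycle the way the slide titles do, with octal addresses."""
    parts = [f"{name} {addr:o}"
             for name, addr in zip(STAGE_NAMES, occupancy(seq, cycle_index))
             if addr is not None]
    return ", ".join(parts)

seq = fetch_sequence(start=0o10, n_cycles=10, branch_addr=0o24, target=0o100)
for i in range(len(seq)):
    print(title(seq, i))
```

The two addresses fetched after the branch (30 and 34) enter the pipeline before address 100 does, which is why the slides call them the first and second delayed instructions.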

29 Summary: Pipelining
°What makes it easy?
  all instructions are the same length
  just a few instruction formats
  memory operands appear only in loads and stores
°What makes it hard?
  structural hazards: suppose we had only one memory
  control hazards: need to worry about branch instructions
  data hazards: an instruction depends on a previous instruction
°We’ll build a simple pipeline and look at these issues
°We’ll talk about modern processors and what really makes it hard:
  exception handling
  trying to improve performance with out-of-order execution, etc.

30 Summary
°Pipelining is a fundamental concept
  multiple steps using distinct resources
°Utilize capabilities of the datapath by pipelined instruction processing
  start next instruction while working on the current one
  limited by length of longest stage (plus fill/flush)
  detect and resolve hazards
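The "limited by length of longest stage (plus fill/flush)" point can be put in numbers. In the sketch below the clock period is set by the slowest stage, and n instructions on a k-stage pipeline need k + n - 1 cycles when there are no stalls. The stage latencies are hypothetical values chosen for illustration, not figures from the lecture.

```python
def pipelined_cycles(n_instructions, n_stages):
    """Cycles to finish n instructions with no stalls: fill the pipe, then one per cycle."""
    return n_stages + n_instructions - 1

def pipelined_time(stage_latencies, n_instructions):
    """Total time when the clock period equals the longest stage latency."""
    cycle_time = max(stage_latencies)
    return pipelined_cycles(n_instructions, len(stage_latencies)) * cycle_time

def unpipelined_time(stage_latencies, n_instructions):
    """Total time when each instruction runs all stages before the next one starts."""
    return n_instructions * sum(stage_latencies)

latencies_ns = [2, 1, 2, 2, 1]  # hypothetical Ifetch, Reg/Dec, Exec, Mem, Wr latencies

for n in (3, 1000):
    print(n, pipelined_time(latencies_ns, n), unpipelined_time(latencies_ns, n))
```

For long instruction streams the fill cost becomes negligible and the speedup approaches the sum of the stage latencies divided by the longest one, so balancing the stages matters as much as adding them.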

