Presentation is loading. Please wait.

Presentation is loading. Please wait.

Pipeline Hazards Hakim Weatherspoon CS 3410, Spring 2013

Similar presentations


Presentation on theme: "Pipeline Hazards Hakim Weatherspoon CS 3410, Spring 2013"— Presentation transcript:

1 Pipeline Hazards Hakim Weatherspoon CS 3410, Spring 2013
Computer Science Cornell University See P&H Chapter 4.7

2 Goals for Today Data Hazards Next time Revisit Pipelined Processors
Data dependencies Problem, detection, and solutions (delaying, stalling, forwarding, bypass, etc) Hazard detection unit Forwarding unit Next time Control Hazards What is the next instruction to execute if a branch is taken? Not taken?

3 MIPS Design Principles
Simplicity favors regularity 32 bit instructions Smaller is faster Small register file Make the common case fast Include support for constants Good design demands good compromises Support for different type of interpretations/classes

4 Recall: MIPS instruction formats
All MIPS instructions are 32 bits long, has 3 formats R-type I-type J-type op rs rt rd shamt func 6 bits 5 bits op rs rt immediate 6 bits 5 bits 16 bits op immediate (target address) 6 bits 26 bits

5 Recall: MIPS Instruction Types
Arithmetic/Logical R-type: result and two source registers, shift amount I-type: 16-bit immediate with sign/zero extension Memory Access load/store between registers and memory word, half-word and byte operations Control flow conditional branches: pc-relative addresses jumps: fixed offsets, register absolute

6 Recall: MIPS Instruction Types
Arithmetic/Logical ADD, ADDU, SUB, SUBU, AND, OR, XOR, NOR, SLT, SLTU ADDI, ADDIU, ANDI, ORI, XORI, LUI, SLL, SRL, SLLV, SRLV, SRAV, SLTI, SLTIU MULT, DIV, MFLO, MTLO, MFHI, MTHI Memory Access LW, LH, LB, LHU, LBU, LWL, LWR SW, SH, SB, SWL, SWR Control flow BEQ, BNE, BLEZ, BLTZ, BGEZ, BGTZ J, JR, JAL, JALR, BEQL, BNEL, BLEZL, BGTZL Special LL, SC, SYSCALL, BREAK, SYNC, COPROC

7 compute jump/branch targets
Pipelined Processor memory register file alu +4 addr PC din dout control memory compute jump/branch targets new pc extend Fetch Decode Execute Memory WB

8 compute jump/branch targets
Pipelined Processor Write- Back Instruction Fetch Instruction Decode Execute Memory memory register file A alu D D B +4 addr PC din dout inst B M control memory compute jump/branch targets new pc extend imm ctrl ctrl ctrl IF/ID ID/EX EX/MEM MEM/WB

9 Example: : Sample Code (Simple)
add r3, r1, r2; nand r6, r4, r5; lw r4, 20(r2); add r5, r2, r5; sw r7, 12(r3);

10 Example: Sample Code (Simple)
Assume eight-register machine Run the following code on a pipelined datapath add r3 r1 r2 ; reg 3 = reg 1 + reg 2 nand r6 r4 r5 ; reg 6 = ~(reg 4 & reg 5) lw r (r2) ; reg 4 = Mem[reg2+20] add r5 r2 r5 ; reg 5 = reg 2 + reg 5 sw r (r3) ; Mem[reg3+12] = reg 7 Slides thanks to Sally McKee

11 Time Graphs Clock cycle IF ID EX WB IF ID EX WB IF ID EX WB IF ID EX
1 2 3 4 5 6 7 8 9 add nand lw sw IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB CPI = 1 (inverse of throughput) Latency: Throughput: Concurrency: CPI =

12 Time Graphs Clock cycle IF ID EX WB IF ID EX WB IF ID EX WB IF ID EX
1 2 3 4 5 6 7 8 9 add nand lw sw IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB CPI = 1 (inverse of throughput) Latency: 5 cycles Throughput: 1 instr/cycle Concurrency: 5 Latency: Throughput: Concurrency: CPI = 1

13 IF/ID ID/EX EX/MEM MEM/WB + 4 valA instruction valB valB Rd dest dest
target PC+4 PC+4 R0 regA R1 ALU result A L U Register file regB R2 valA PC Inst mem Data mem M U X instruction R3 ALU result mdata R4 R5 valB R6 M U X data R7 imm dest extend valB Bits 11-15 Rd dest dest Bits 16-20 Rt M U X Bits 26-31 op op op IF/ID ID/EX EX/MEM MEM/WB

14 IF/ID ID/EX EX/MEM MEM/WB add r3 r1 r2 At time 1, Fetch + Initial
U X 4 + R0 R1 36 A L U Register file R2 9 PC Inst mem Data mem M U X nop R3 12 R4 18 R5 7 R6 41 M U X data R7 22 dest extend Initial State Bits 11-15 Bits 16-20 M U X Bits 26-31 nop nop nop IF/ID ID/EX EX/MEM MEM/WB

15 IF/ID ID/EX EX/MEM MEM/WB add 3 1 2 + / 4 / 36 / 9 Fetch: add 3 1 2
U X 4 + 4 / 4 R0 R1 36 A L U Register file R2 9 PC Inst mem / 36 Data mem M U X add R3 12 R4 18 R5 7 / 9 R6 41 M U X data R7 22 dest extend Fetch: add 3 1 2 Bits 11-15 / 3 Bits 16-20 / 2 M U X Bits 26-31 nop / add nop nop IF/ID ID/EX EX/MEM MEM/WB Time: 1 / 2

16 IF/ID ID/EX EX/MEM MEM/WB nand 6 4 5 add 3 1 2 / 4 + / 8 / 18 / 45 / 7
U X 4 + / 4 8 4 / 8 R0 R1 36 1 36 A L U Register file R2 9 2 36 PC Inst mem M U X / 18 Data mem nand R3 12 / 45 R4 18 9 R5 7 9 / 7 R6 41 M U X data R7 22 3 dest extend / 9 Fetch: nand 6 4 5 Bits 11-15 / 6 3 3 / 3 Bits 16-20 / 5 2 M U X Bits 26-31 add / nand nop / add nop IF/ID ID/EX EX/MEM MEM/WB Time: 2 / 3

17 IF/ID ID/EX EX/MEM MEM/WB lw 4 20(2) nand 6 4 5 add 3 1 2
U X nand ( 18∙7 ) 18 = 7 = -3 = 4 + 4 / 8 12 8 R0 R1 36 4 / 45 / 18 A L U Register file R2 9 36 5 18 PC Inst mem Data mem M U X lw (2) R3 12 45 / -3 R4 18 / 7 9 R5 7 7 R6 41 M U X data R7 22 6 dest extend 9 / 7 0x 0x And = 10 Nand = Fetch: lw 4 20(2) Bits 11-15 3 6 3 3 / 6 / 3 Bits 16-20 2 5 M U X Bits 26-31 nand add / nand nop / add IF/ID ID/EX EX/MEM MEM/WB Time: 3 / 4

18 IF/ID ID/EX EX/MEM MEM/WB add 5 2 5 lw 4 20(2) nand 6 4 5 add 3 1 2 +
U X 4 + 8 16 12 R0 R1 36 2 45 18 A L U Register file R2 9 4 9 PC Inst mem Data mem M U X add R3 12 -3 R4 18 45 7 R5 7 18 R6 41 M U X data R7 22 20 dest extend 7 Fetch: add 5 2 5 Bits 11-15 6 6 3 6 3 Bits 16-20 5 4 M U X Bits 26-31 lw nand add IF/ID ID/EX EX/MEM MEM/WB Time: 4

19 IF/ID ID/EX EX/MEM MEM/WB
sw 7 12(3) add lw 4 20 (2) nand add M U X 4 + 12 20 16 R0 R1 36 45 2 -3 9 A L U Register file R2 9 5 9 PC Inst mem Data mem M U X sw 7 12(3) R3 45 29 R4 18 -3 R5 7 7 R6 41 M U X data R7 22 20 5 dest extend 18 Fetch: sw 7 12(3) Bits 11-15 5 4 6 3 4 6 Bits 16-20 4 5 M U X Bits 26-31 add lw nand IF/ID ID/EX EX/MEM MEM/WB Time: 5

20 IF/ID ID/EX EX/MEM MEM/WB sw 7 12(3) add 5 2 5 lw 4 20(2) nand 6 4 5 +
U X 4 + 16 20 R0 R1 36 -3 3 29 9 A L U Register file R2 9 7 45 PC Inst mem Data mem M U X R3 45 16 99 R4 18 29 7 R5 7 22 R6 -3 M U X data R7 22 12 dest extend 7 No more instructions Bits 11-15 5 5 4 6 5 4 Bits 16-20 5 7 M U X Bits 26-31 sw add lw IF/ID ID/EX EX/MEM MEM/WB Time: 6

21 IF/ID ID/EX EX/MEM MEM/WB nop nop sw 7 12(3) add 5 2 5 lw 4 20(2) +
U X 4 + 20 R0 R1 36 16 45 A L U Register file R2 9 PC Inst mem Data mem M U X R3 45 99 57 R4 99 16 R5 7 R6 -3 M U X data R7 22 12 dest extend 22 No more instructions Bits 11-15 7 5 4 7 5 Bits 16-20 7 M U X Bits 26-31 sw add IF/ID ID/EX EX/MEM MEM/WB Time: 7

22 IF/ID ID/EX EX/MEM MEM/WB nop nop nop sw 7 12(3) add 5 2 5 + No more
U X 4 + R0 R1 36 16 57 A L U Register file R2 9 PC Inst mem Data mem M U X R3 45 R4 99 57 22 R5 16 R6 -3 M U X data R7 22 22 dest extend No more instructions Bits 11-15 5 7 Bits 16-20 M U X Bits 26-31 sw IF/ID ID/EX EX/MEM MEM/WB Time: 8 Slides thanks to Sally McKee

23 IF/ID ID/EX EX/MEM MEM/WB nop nop nop nop sw 7 12(3) + No more
U X 4 + R0 R1 36 A L U Register file R2 9 PC Inst mem Data mem M U X R3 45 R4 99 R5 16 R6 -3 M U X data R7 22 dest extend No more instructions Bits 11-15 Bits 16-20 M U X Bits 21-23 IF/ID ID/EX EX/MEM MEM/WB Time: 9

24 Takeaway Pipelining is a powerful technique to mask latencies and increase throughput Logically, instructions execute one at a time Physically, instructions execute in parallel Instruction level parallelism Abstraction promotes decoupling Interface (ISA) vs. implementation (Pipeline)

25 Next Goal What about data dependencies (also known as a data hazard in a pipelined processor)? i.e. add r3, r1, r2 sub r5, r3, r4

26 “earlier” = started earlier
Data Hazards Data Hazards register file reads occur in stage 2 (ID) register file writes occur in stage 5 (WB) next instructions may read values about to be written destination reg of earlier instruction == source reg of current stage left “earlier” = started earlier = stage right assumes (WE=0 implies rD=0) everywhere, similar for rA needed, rB needed (can ensure this in ID stage)

27 Data Hazards: Broken Example
Clock cycle r3 = 10 r3 = 20 time r3 = 10 IF ID MEM WB add r3, r1, r2 r3 = 20 r3 = 10 sub r5, r3, r4 IF ID MEM WB r3 = 10 lw r6, 4(r3) IF ID MEM WB r6 = Mem[r3 + 4] r3 = 10 or r5, r3, r5 IF ID MEM WB r3 = 20 OK sw r6, 12(r3) IF ID MEM WB

28 “earlier” = started earlier
Data Hazards Data Hazards register file reads occur in stage 2 (ID) register file writes occur in stage 5 (WB) next instructions may read values about to be written How to detect? destination reg of earlier instruction == source reg of current stage left “earlier” = started earlier = stage right assumes (WE=0 implies rD=0) everywhere, similar for rA needed, rB needed (can ensure this in ID stage)

29 Data Hazards inst mem inst OP B A D M Rd mem PC IF/ID ID/EX EX/MEM
Ra Rb D B A inst PC+4 OP B A Rt D M imm Rd addr din dout +4 mem PC Rd IF/ID.Ra ≠ 0 && (IF/ID.Ra==ID/Ex.Rd IF/ID.Ra==Ex/M.Rd IF/ID.Ra==M/W.Rd) IF/ID ID/EX EX/MEM MEM/WB

30 “earlier” = started earlier
Data Hazards Data Hazards register file reads occur in stage 2 (ID) register file writes occur in stage 5 (WB) next instructions may read values about to be written How to detect? Logic in ID stage: stall = (IF/ID.Ra != 0 && (IF/ID.Ra == ID/EX.Rd || IF/ID.Ra == EX/M.Rd || IF/ID.Ra == M/WB.Rd)) || (same for Rb) destination reg of earlier instruction == source reg of current stage left “earlier” = started earlier = stage right assumes (WE=0 implies rD=0) everywhere, similar for rA needed, rB needed (can ensure this in ID stage)

31 Detecting Data Hazards
inst mem add r3, r1, r2 sub r5, r3, r5 or r6, r3, r4 add r6, r3, r8 Rd Ra Rb D B A inst PC+4 OP B A Rt D M imm Rd addr din dout +4 mem detect hazard PC Rd IF/ID ID/EX EX/MEM MEM/WB

32 Takeaway Data hazards occur when a operand (register) depends on the result of a previous instruction that may not be computed yet. A pipelined processor needs to detect data hazards.

33 Next Goal What to do if data hazard detected?
Nothing: Modify ISA to match implementation Stall: Pause current and all subsequent instruction Forward/Bypass: Steal correct value from elsewhere in pipeline

34 Resolving Data Hazards
What to do if data hazard detected? Wait/Stall Reorder in Software (SW) Forward/Bypass All the above None. We will use some other method Nothing: Modify ISA to match implementation Stall: Pause current and all subsequent instruction Forward/Bypass: Steal correct value from elsewhere in pipeline

35 Resolving Data Hazards
What to do if data hazard detected? Wait/Stall Reorder in Software (SW) Forward/Bypass All the above None. We will use some other method Discuss today Nothing: Modify ISA to match implementation Stall: Pause current and all subsequent instruction Forward/Bypass: Steal correct value from elsewhere in pipeline

36 Stalling How to stall an instruction in ID stage
prevent IF/ID pipeline register update stalls the ID stage instruction convert ID stage instr into nop for later stages innocuous “bubble” passes through pipeline prevent PC update stalls the next (IF stage) instruction prev slide: Q: what happens if branch in EX? A: bad stuff: we miss the branch So: lets try not to stall

37 Detecting Data Hazards
add r3, r1, r2 sub r5, r3, r5 or r6, r3, r4 add r6, r3, r8 inst mem Rd Ra Rb D B A inst PC+4 OP B A Rt D M imm Rd addr din dout +4 mem detect hazard PC Rd If detect hazard MemWr=0 RegWr=0 WE=0 IF/ID ID/EX EX/MEM MEM/WB

38 Stalling time add r3, r1, r2 sub r5, r3, r5 or r6, r3, r4
Clock cycle time 1 2 3 4 5 6 7 8 add r3, r1, r2 sub r5, r3, r5 or r6, r3, r4 add r6, r3, r8 IF ID 11=r1 22=r2 EX D=33 MEM D=33 WB r3=33 ID ?=r3 33=r3 EX MEM M find hazards first, then draw

39 Stalling IF ID Ex M W IF ID ID ID ID Ex M W IF IF IF IF ID Ex M IF ID
Clock cycle time 1 2 3 4 5 6 7 8 add r3, r1, r2 sub r5, r3, r5 or r6, r3, r4 add r6, r3, r8 IF ID 11=r1 22=r2 EX D=33 MEM D=33 WB r3=33 ID ?=r3 33=r3 EX MEM M r3 = 10 IF ID Ex M W r3 = 20 3 Stalls Stall IF ID ID ID ID Ex M W IF IF IF IF ID Ex M find hazards first, then draw IF ID Ex

40 Stalling nop /stall A A D D inst mem D rD B B data mem rA rB B M
+4 Rd Rd Rd (MemWr=0 RegWr=0) WE PC WE WE nop Op Op Op sub r5,r3,r5 add r3,r1,r2 draw control signals for /stall Q: what happens if branch in EX? A: bad stuff: we miss the branch So: lets try not to stall (WE=0) or r6,r3,r4 /stall NOP = If(IF/ID.rA ≠ 0 && (IF/ID.rA==ID/Ex.Rd IF/ID.rA==Ex/M.Rd IF/ID.rA==M/W.Rd))

41 Stalling nop nop /stall A A D D inst mem D rD B B data mem rA rB B M
+4 Rd Rd Rd (MemWr=0 RegWr=0) WE PC WE WE nop (MemWr=0 RegWr=0) Op Op Op nop sub r5,r3,r5 add r3,r1,r2 draw control signals for /stall Q: what happens if branch in EX? A: bad stuff: we miss the branch So: lets try not to stall (WE=0) or r6,r3,r4 /stall NOP = If(IF/ID.rA ≠ 0 && (IF/ID.rA==ID/Ex.Rd IF/ID.rA==Ex/M.Rd IF/ID.rA==M/W.Rd))

42 Stalling nop nop nop /stall A A D D inst mem D rD B B data mem rA rB B
+4 Rd Rd Rd (MemWr=0 RegWr=0) WE PC WE WE nop (MemWr=0 RegWr=0) (MemWr=0 RegWr=0) Op Op Op nop nop sub r5,r3,r5 add r3,r1,r2 draw control signals for /stall Q: what happens if branch in EX? A: bad stuff: we miss the branch So: lets try not to stall (WE=0) or r6,r3,r4 /stall NOP = If(IF/ID.rA ≠ 0 && (IF/ID.rA==ID/Ex.Rd IF/ID.rA==Ex/M.Rd IF/ID.rA==M/W.Rd))

43 Stalling How to stall an instruction in ID stage
prevent IF/ID pipeline register update stalls the ID stage instruction convert ID stage instr into nop for later stages innocuous “bubble” passes through pipeline prevent PC update stalls the next (IF stage) instruction prev slide: Q: what happens if branch in EX? A: bad stuff: we miss the branch So: lets try not to stall

44 Takeaway Data hazards occur when a operand (register) depends on the result of a previous instruction that may not be computed yet. A pipelined processor needs to detect data hazards. Stalling, preventing a dependent instruction from advancing, is one way to resolve data hazards. Stalling introduces NOPs (“bubbles”) into a pipeline. Introduce NOPs by (1) preventing the PC from updating, (2) preventing writes to IF/ID registers from changing, and (3) preventing writes to memory and register file. Bubbles in pipeline significantly decrease performance.

45 Next Goal: Resolving Data Hazards via Forwarding
What to do if data hazard detected? Wait/Stall Reorder in Software (SW) Forward/Bypass Nothing: Modify ISA to match implementation Stall: Pause current and all subsequent instruction Forward/Bypass: Steal correct value from elsewhere in pipeline

46 Forwarding Forwarding bypasses some pipelined stages forwarding a result to a dependent instruction operand (register). Three types of forwarding/bypass Forwarding from Ex/Mem registers to Ex stage (MEx) Forwarding from Mem/WB register to Ex stage (WEx) RegisterFile Bypass

47 Forwarding Datapath D B A A D D inst mem B data mem B M detect hazard
imm Rd Rd detect hazard Rb forward unit WE WE Ra draw control signals MC MC IF/ID ID/Ex Ex/Mem Mem/WB Three types of forwarding/bypass Forwarding from Ex/Mem registers to Ex stage (MEx) Forwarding from Mem/WB register to Ex stage (W  Ex) RegisterFile Bypass

48 Forwarding Datapath D B A A D D inst mem B data mem B M detect hazard
imm Rd Rd detect hazard Rb forward unit WE WE Ra draw control signals MC MC IF/ID ID/Ex Ex/Mem Mem/WB Three types of forwarding/bypass Forwarding from Ex/Mem registers to Ex stage (MEx) Forwarding from Mem/WB register to Ex stage (W  Ex) RegisterFile Bypass

49 “earlier” = started earlier
Forwarding Datapath 1 Ex/MEM to EX Bypass EX needs ALU result that is still in MEM stage Resolve: Add a bypass from EX/MEM.D to start of EX How to detect? Logic in Ex Stage: forward = (Ex/M.WE && EX/M.Rd != 0 && ID/Ex.Ra == Ex/M.Rd) || (same for Rb) destination reg of earlier instruction == source reg of current stage left “earlier” = started earlier = stage right assumes (WE=0 implies rD=0) everywhere, similar for rA needed, rB needed (can ensure this in ID stage)

50 Forwarding Datapath D B A A D D inst mem B data mem B M detect hazard
imm Rd Rd detect hazard Rb forward unit WE WE Ra draw control signals MC MC IF/ID ID/Ex Ex/Mem Mem/WB Three types of forwarding/bypass Forwarding from Ex/Mem registers to Ex stage (MEx) Forwarding from Mem/WB register to Ex stage (W  Ex) RegisterFile Bypass

51 Forwarding Datapath 1 IF ID Ex M W IF ID Ex M W A D add r3, r1, r2
B A inst mem data mem add r3, r1, r2 sub r5, r3, r1 IF ID Ex M W IF ID Ex M W

52 “earlier” = started earlier
Forwarding Datapath 2 Mem/WB to EX Bypass EX needs value being written by WB Resolve: Add bypass from WB final value to start of EX How to detect? Logic in Ex Stage: forward = (M/WB.WE && M/WB.Rd != 0 && ID/Ex.Ra == M/WB.Rd && not (ID/Ex.WE && Ex/M.Rd != 0 && ID/Ex.Ra == Ex/M.Rd) || (same for Rb) destination reg of earlier instruction == source reg of current stage left “earlier” = started earlier = stage right assumes (WE=0 implies rD=0) everywhere, similar for rA needed, rB needed (can ensure this in ID stage) Check pg. 369

53 Forwarding Datapath D B A A D D inst mem B data mem B M detect hazard
imm Rd Rd detect hazard Rb forward unit WE WE Ra draw control signals MC MC IF/ID ID/Ex Ex/Mem Mem/WB Three types of forwarding/bypass Forwarding from Ex/Mem registers to Ex stage (MEx) Forwarding from Mem/WB register to Ex stage (W  Ex) RegisterFile Bypass

54 Forwarding Datapath 2 IF ID Ex M W IF ID Ex M W IF ID Ex M W A D
B A inst mem data mem add r3, r1, r2 sub r5, r3, r1 or r6, r3, r4 IF ID Ex M W IF ID Ex M W IF ID Ex M W

55 Register File Bypass Register File Bypass Detect:
Reading a value that is currently being written Detect: ((Ra == MEM/WB.Rd) or (Rb == MEM/WB.Rd)) and (WB is writing a register) Resolve: Add a bypass around register file (WB to ID) Better: (Hack) just negate register file clock writes happen at end of first half of each clock cycle reads happen during second half of each clock cycle

56 Forwarding Datapath D B A A D D inst mem B data mem B M detect hazard
imm Rd Rd detect hazard Rb forward unit WE WE Ra draw control signals MC MC IF/ID ID/Ex Ex/Mem Mem/WB Three types of forwarding/bypass Forwarding from Ex/Mem registers to Ex stage (MEx) Forwarding from Mem/WB register to Ex stage (W  Ex) RegisterFile Bypass

57 Register File Bypass IF ID Ex M W IF ID Ex M W IF ID Ex M W IF ID Ex M
inst mem data mem add r3, r1, r2 sub r5, r3, r1 or r6, r3, r4 add r6, r3, r8 IF ID Ex M W IF ID Ex M W IF ID Ex M W IF ID Ex M W

58 Forwarding Example IF ID Ex M W IF ID Ex M W IF ID Ex M W IF ID Ex M W
Clock cycle time 1 2 3 4 5 6 7 8 add r3, r1, r2 sub r5, r3, r5 or r6, r3, r4 add r6, r3, r8 IF ID 11=r1 22=r2 EX D=33 MEM D=33 WB r3=33 ID 33=r3 EX MEM ID 33=r3 r3 = 10 IF ID Ex M W r3 = 20 IF ID Ex M W IF ID Ex M W IF ID Ex M W

59 Forwarding Example 2 IF ID Ex M W IF ID Ex M W IF ID Ex M W IF ID Ex M
time Clock cycle 1 2 3 4 5 6 7 8 add r3, r1, r2 sub r5, r3, r4 lw r6, 4(r3) or r5, r3, r5 sw r6, 12(r3) IF ID 11=r1 22=r2 EX D=33 MEM D=33 WB r3=33 ID 33=r3 EX D=55 MEM r5=55 ID 33=r3 D=? MEM D=66 r6=66 ID 33=r3 55=r5 66=r6 IF ID Ex M W IF ID Ex M W IF ID Ex M W IF ID Ex M W identify hazards then draw; if “or” had used r6, would have needed to stall IF ID Ex M W

60 Forwarding Datapath D B A A D D inst mem B data mem B M detect hazard
imm Rd Rd detect hazard Rb forward unit WE WE Ra draw control signals MC MC IF/ID ID/Ex Ex/Mem Mem/WB Three types of forwarding/bypass Forwarding from Ex/Mem registers to Ex stage (MEx) Forwarding from Mem/WB register to Ex stage (W  Ex) Register File Bypass

61 Takeaway Data hazards occur when a operand (register) depends on the result of a previous instruction that may not be computed yet. A pipelined processor needs to detect data hazards. Stalling, preventing a dependent instruction from advancing, is one way to resolve data hazards. Stalling introduces NOPs (“bubbles”) into a pipeline. Introduce NOPs by (1) preventing the PC from updating, (2) preventing writes to IF/ID registers from changing, and (3) preventing writes to memory and register file. Bubbles (nops) in pipeline significantly decrease performance. Forwarding bypasses some pipelined stages forwarding a result to a dependent instruction operand (register). Better performance than stalling.

62 Administrivia Prelim1: next Tuesday, February 26th in evening
Time: We will start at 7:30pm sharp, so come early Prelim Review: This Thur 6-8pm in Upson B14 and Fri, 5-7pm in Phillips 203 Closed Book Cannot use electronic device or outside material Practice prelims are online in CMS Material covered everything up to end of this week Appendix C (logic, gates, FSMs, memory, ALUs) Chapter 4 (pipelined [and non-pipeline] MIPS processor with hazards) Chapters 2 (Numbers / Arithmetic, simple MIPS instructions) Chapter 1 (Performance) HW1, HW2, Lab0, Lab1, Lab2

63 Administrivia HW2 is due tomorrow Project1 (PA1) due week after prelim
Fill out Survey online. Receive credit/points on homework for survey: Should have received from Kathryn Dimiduk Survey is anonymous Project1 (PA1) due week after prelim Continue working diligently. Use design doc momentum Save your work! Save often. Verify file is non-zero. Periodically save to Dropbox, . Beware of MacOSX 10.5 (leopard) and 10.6 (snow-leopard) Use your resources Lab Section, Piazza.com, Office Hours, Homework Help Session, Class notes, book, Sections, CSUGLab

64 Administrivia Check online syllabus/schedule
Slides and Reading for lectures Office Hours Homework and Programming Assignments Prelims (in evenings): Tuesday, February 26th Thursday, March 28th Thursday, April 25th Schedule is subject to change

65 Collaboration, Late, Re-grading Policies
“Black Board” Collaboration Policy Can discuss approach together on a “black board” Leave and write up solution independently Do not copy solutions Late Policy Each person has a total of four “slip days” Max of two slip days for any individual assignment Slip days deducted first for any late assignment, cannot selectively apply slip days For projects, slip days are deducted from all partners 25% deducted per day late after slip days are exhausted Regrade policy Submit written request to lead TA, and lead TA will pick a different grader Submit another written request, lead TA will regrade directly Submit yet another written request for professor to regrade.

66 Quiz Find all hazards, and say how they are resolved: add r3, r1, r2 sub r3, r2, r1 nand r4, r3, r1 or r0, r3, r4 xor r1, r4, r3 sb r4, 1(r0) 6 hazards be careful of store (no rD) be careful or r0 (never a hazard)

67 Quiz Find all hazards, and say how they are resolved: add r3, r1, r2 sub r3, r2, r1 nand r4, r3, r1 or r0, r3, r4 xor r1, r4, r3 sb r4, 1(r0) 6 hazards be careful of store (no rD) be careful or r0 (never a hazard) Hours and hours of debugging!

68 Data Hazard Recap Stall Forward/Bypass Tradeoffs?
Pause current and all subsequent instructions Forward/Bypass Try to steal correct value from elsewhere in pipeline Otherwise, fall back to stalling or require a delay slot Tradeoffs? 1: ISA tied to impl; messy asm; impl easy & cheap; perf depends on asm. 2: ISA correctness not tied to impl; clean+slow asm, or messy+fast asm; impl easy and cheap; same perf as 1. 3: ISA perf not tied to impl; clean+fast asm; impl is tricky; best perf


Download ppt "Pipeline Hazards Hakim Weatherspoon CS 3410, Spring 2013"

Similar presentations


Ads by Google