1
Prof. John Nestor
ECE Department, Lafayette College
Easton, Pennsylvania 18042
nestorj@lafayette.edu
ECE 313 - Computer Organization
Lecture 18 - Pipelined Processor Design 2
Fall 2004
Reading: 6.3-6.6, 6.8
Portions of these slides are derived from:
  Textbook figures © 1998 Morgan Kaufmann Publishers, all rights reserved
  Tod Amon's COD2e Slides © 1998 Morgan Kaufmann Publishers, all rights reserved
  Dave Patterson’s CS 152 Slides - Fall 1997 © UCB
  Rob Rutenbar’s 18-347 Slides - Fall 1999 CMU
  Other sources as noted
2
ECE 313 Fall 2004, Lecture 18 - Pipelining 2
Pipelining Outline
  Introduction
  Pipelined Processor Design
    Datapath
    Control
    Dealing with Hazards & Forwarding
    Branch Prediction
    Exceptions
    Performance
  Advanced Pipelining
    Superscalar
    Dynamic Pipelining
    Examples
3
Pipelining in MIPS
MIPS architecture was designed to be pipelined
  Simple instruction format (makes IF, ID easy)
    Single-word instructions
    Small number of instruction formats
    Common fields in same place (e.g., rs, rt) in different formats
  Memory operations only in lw, sw instructions (simplifies EX)
  Memory operands aligned in memory (simplifies MEM)
  Single value for writeback (limits forwarding)
Pipelining is harder in CISC architectures
4
Pipelined Datapath with Control Signals
5
Next Step: Adding Control
Basic approach: build on single-cycle control
  Place control unit in ID stage
  Pass control signals to following stages
Later: extra features to deal with:
  Data forwarding
  Stalls
  Exceptions
6
Control for Pipelined Datapath
Source: Book Fig. 6.29, p 469
Control signals shown: RegDst, ALUOp[1:0], ALUSrc, MemRead, MemWrite, Branch, RegWrite, MemtoReg
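The idea of generating control in ID and passing it down the pipeline can be sketched as follows. This is a minimal illustration, not the book's circuit: the grouping of signals by consuming stage (EX, MEM, WB) follows the usual textbook split and is an assumption here, as are the `lw` control values and the helper names `split_control` and `advance`.

```python
# Control-signal groups, split by the stage that consumes them.
# The grouping is an assumption; the slide only lists the signal names.
EX_SIGNALS = ("RegDst", "ALUOp", "ALUSrc")
MEM_SIGNALS = ("Branch", "MemRead", "MemWrite")
WB_SIGNALS = ("RegWrite", "MemtoReg")

# Illustrative control values for lw; ALUOp stands for the field ALUOp[1:0].
LW_CONTROL = {
    "RegDst": 0, "ALUOp": 0b00, "ALUSrc": 1,
    "Branch": 0, "MemRead": 1, "MemWrite": 0,
    "RegWrite": 1, "MemtoReg": 1,
}


def split_control(control):
    """Split the control word produced in ID into per-stage groups."""
    return {
        "EX": {s: control[s] for s in EX_SIGNALS},
        "MEM": {s: control[s] for s in MEM_SIGNALS},
        "WB": {s: control[s] for s in WB_SIGNALS},
    }


def advance(stage_groups):
    """Drop the group used by the current stage and pass the rest on."""
    remaining = dict(stage_groups)
    first = next(iter(remaining))  # dicts keep insertion order: EX, MEM, WB
    del remaining[first]
    return remaining


id_ex = split_control(LW_CONTROL)  # ID/EX carries the EX, MEM, and WB groups
ex_mem = advance(id_ex)            # EX/MEM carries the MEM and WB groups
mem_wb = advance(ex_mem)           # MEM/WB carries only the WB group
```

Each pipeline register carries only the signals that later stages still need, which is why the control fields get narrower from ID/EX to MEM/WB.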
7
Control for Pipelined Datapath
Source: Book Fig. 6.25, p 401
8
Datapath and Control Unit
9
Tracking Control Signals - Cycle 1
Instructions shown in the figure: LW
10
Tracking Control Signals - Cycle 2
Instructions shown in the figure: SW, LW
11
Tracking Control Signals - Cycle 3
Instructions shown in the figure: ADD, SW, LW
12
Tracking Control Signals - Cycle 4
Instructions shown in the figure: SUB, ADD, SW, LW
13
Tracking Control Signals - Cycle 5
Instructions shown in the figure: ADD, SUB, SW, LW
15
Data Hazards Revisited…
Data hazards occur when an instruction needs a value before an earlier instruction has written it to the register file (Fig. 6.28)
16
Data Hazard Solution: Forwarding
Key idea: route a result internally to the instruction that needs it before the result is written to the register file (Fig. 6.29)
17
Data Hazard Solution: Forwarding
Add hardware to feed back ALU and MEM results to both ALU inputs (Fig. 6.32)
18
Controlling Forwarding
Need to test when register numbers match in the rs, rt, and rd fields stored in pipeline registers
"EX" hazard:
  EX/MEM - test whether instruction writes register file and examine rd register
  ID/EX - test whether instruction reads rs or rt register and matches rd register in EX/MEM
"MEM" hazard:
  MEM/WB - test whether instruction writes register file and examine rd (rt) register
  ID/EX - test whether instruction reads rs or rt register and matches rd (rt) register in MEM/WB
19
Forwarding Unit Detail - EX Hazard
if (EX/MEM.RegWrite
    and (EX/MEM.RegisterRd ≠ 0)
    and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA = 10
if (EX/MEM.RegWrite
    and (EX/MEM.RegisterRd ≠ 0)
    and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) ForwardB = 10
20
Forwarding Unit Detail - MEM Hazard
if (MEM/WB.RegWrite
    and (MEM/WB.RegisterRd ≠ 0)
    and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01
if (MEM/WB.RegWrite
    and (MEM/WB.RegisterRd ≠ 0)
    and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01
21
EX Hazard Complication
What if a register is changed more than once?
  add $1, $1, $2
  add $1, $1, $3
  add $1, $1, $4
Answer: forward the most recent result (the one from the instruction in the MEM stage, held in EX/MEM)
22
Forwarding Unit Detail - MEM Hazard Revised
if (MEM/WB.RegWrite
    and (MEM/WB.RegisterRd ≠ 0)
    and (EX/MEM.RegisterRd ≠ ID/EX.RegisterRs)
    and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01
if (MEM/WB.RegWrite
    and (MEM/WB.RegisterRd ≠ 0)
    and (EX/MEM.RegisterRd ≠ ID/EX.RegisterRt)
    and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01
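The EX-hazard test and the revised MEM-hazard test can be combined into one selector per ALU input. The sketch below transcribes the conditions from the slides into Python; the class `ForwardingInputs`, the function `forward_select`, and the field names are hypothetical, and the select value 00 is assumed to mean "use the register-file value".

```python
from dataclasses import dataclass


@dataclass
class ForwardingInputs:
    """Hypothetical snapshot of the pipeline-register fields that are tested."""
    ex_mem_reg_write: bool
    ex_mem_rd: int
    mem_wb_reg_write: bool
    mem_wb_rd: int


def forward_select(regs: ForwardingInputs, src: int) -> int:
    """Return the 2-bit mux select for an ALU input that reads register src."""
    # EX hazard: the newest result sits in EX/MEM.
    if regs.ex_mem_reg_write and regs.ex_mem_rd != 0 and regs.ex_mem_rd == src:
        return 0b10
    # MEM hazard (revised): forward from MEM/WB only when EX/MEM does not
    # name the same register, so the most recent result wins.
    if (regs.mem_wb_reg_write and regs.mem_wb_rd != 0
            and regs.ex_mem_rd != src and regs.mem_wb_rd == src):
        return 0b01
    # No hazard: assumed to select the value read from the register file.
    return 0b00


# Third instruction of: add $1,$1,$2 ; add $1,$1,$3 ; add $1,$1,$4 is in EX.
# EX/MEM holds the second add (rd = 1), MEM/WB holds the first add (rd = 1).
regs = ForwardingInputs(ex_mem_reg_write=True, ex_mem_rd=1,
                        mem_wb_reg_write=True, mem_wb_rd=1)
forward_a = forward_select(regs, src=1)  # rs = $1
forward_b = forward_select(regs, src=4)  # rt = $4
```

Calling the same function once with ID/EX.RegisterRs and once with ID/EX.RegisterRt gives ForwardA and ForwardB.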
23
Forwarding Elaboration (Fig. 6.33)
Extra 2-to-1 mux needed for immediate instructions
Figure label: added mux
24
Data Hazards and Stalls
We still have to stall when a register is loaded from memory and used in the following instruction (Fig. 6.34)
25
Data Hazards and Stalls
Add a hazard detection unit to detect this condition and stall (Fig. 6.35)
Note: the figure contains a typo; the marked label should read AND
26
Pipelined Processor with Hazard Detection (Fig. 6.36)
27
Data Transfer Instructions - Binary Representation
Used for load, store instructions
  op: basic operation of the instruction (opcode)
  rs: first register source operand (holds the base address)
  rt: second register operand (source for sw, destination for lw)
  offset: 16-bit signed address offset (-32,768 to +32,767)
Also called “I-Format” or “I-Type” instructions
  Field   op       rs       rt       offset
  Width   6 bits   5 bits   5 bits   16 bits
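The field layout above can be checked by packing and unpacking a 32-bit word. This is a small sketch: the helper names `encode_i_type` and `decode_i_type` are made up for illustration, and the opcodes used are the standard MIPS values for `lw` (35) and `sw` (43).

```python
LW_OPCODE = 0x23  # 35, standard MIPS opcode for lw
SW_OPCODE = 0x2B  # 43, standard MIPS opcode for sw


def encode_i_type(op: int, rs: int, rt: int, offset: int) -> int:
    """Pack op (6 bits), rs (5), rt (5), and a signed 16-bit offset."""
    if not -32768 <= offset <= 32767:
        raise ValueError("offset does not fit in 16 signed bits")
    return (op << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


def decode_i_type(word: int):
    """Unpack a 32-bit I-format word into (op, rs, rt, signed offset)."""
    offset = word & 0xFFFF
    if offset & 0x8000:        # sign bit set: convert to a negative value
        offset -= 0x10000
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, offset


# lw $8, 32($19): base address in rs = $19, destination in rt = $8
word = encode_i_type(LW_OPCODE, rs=19, rt=8, offset=32)
```

For a store, the same fields are used, but rt names the register whose value is written to memory.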
28
Hazard Detection Unit - Control Detail
if (ID/EX.MemRead
    and ((ID/EX.RegisterRt = IF/ID.RegisterRs)
      or (ID/EX.RegisterRt = IF/ID.RegisterRt)))
  stall the pipeline
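The load-use test is a single Boolean expression, which a short sketch makes concrete. The function name `must_stall` and its parameter names are hypothetical; the condition itself is the one on the slide.

```python
def must_stall(id_ex_mem_read: bool, id_ex_rt: int,
               if_id_rs: int, if_id_rt: int) -> bool:
    """True when the instruction in ID reads the register a load in EX will write."""
    return bool(id_ex_mem_read
                and (id_ex_rt == if_id_rs or id_ex_rt == if_id_rt))


# lw $2, 20($1) is in EX; and $4, $2, $5 is in ID and reads $2 as rs.
stall_load_use = must_stall(True, id_ex_rt=2, if_id_rs=2, if_id_rt=5)

# Same register numbers, but the instruction in EX is not a load.
stall_not_load = must_stall(False, id_ex_rt=2, if_id_rs=2, if_id_rt=5)
```

Only loads need this check, because every other result is available for forwarding at the end of EX.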
29
Hazard detection unit - what happens
MUX zeros out control signals for instruction in ID
  “Squashes” the instruction
  “No-op” propagates through following stages
IF/ID holds stalled instruction until next clock cycle
PC holds current value until next clock cycle (so the same instruction is fetched again)
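The three effects of a stall (zeroed control, IF/ID held, PC held) can be sketched as one clock-edge update. The function `clock_front_end` and its arguments are hypothetical, and instructions are represented as plain strings for illustration only.

```python
CONTROL_SIGNALS = ("RegDst", "ALUOp", "ALUSrc", "MemRead",
                   "MemWrite", "Branch", "RegWrite", "MemtoReg")


def clock_front_end(stall, pc, if_id, decoded_control, fetched):
    """Return (next_pc, next_if_id, control_written_to_id_ex) for one cycle."""
    if stall:
        # Zero every control signal so a no-op (bubble) enters ID/EX,
        # and hold both the PC and IF/ID so the instruction is retried.
        bubble = {name: 0 for name in CONTROL_SIGNALS}
        return pc, if_id, bubble
    # Normal operation: fetch advances and the decoded control moves on.
    return pc + 4, fetched, decoded_control


and_control = {name: 0 for name in CONTROL_SIGNALS}
and_control.update({"RegDst": 1, "ALUOp": 0b10, "RegWrite": 1})

stalled = clock_front_end(True, 0x40, "and $4,$2,$5", and_control, "or $8,$2,$6")
normal = clock_front_end(False, 0x40, "and $4,$2,$5", and_control, "or $8,$2,$6")
```

In the stalled case, the instruction in ID is decoded again on the next cycle, by which time the loaded value can be forwarded.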
30
Branch Hazards
Just stalling for each branch is not practical
Common assumption: branch not taken
When assumption fails: flush three instructions (Fig. 6.37)
31
Reducing Branch Delay
Key idea: move branch logic to ID stage of pipeline
  New adder calculates branch target: PC + 4 + (sign-extend(IMM) << 2)
  New hardware tests rs == rt after register read
  Add flush signal to squash instruction in IF/ID register
Reduced penalty (1 cycle) when branch taken
Example: Figure 6.38, p. 420
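The branch-target arithmetic and the equality test done in ID can be written out directly. This is a sketch for a `beq`-style branch under the assumption of 32-bit addresses; the function names `sign_extend16`, `branch_target`, and `resolve_branch_in_id` are hypothetical.

```python
def sign_extend16(imm: int) -> int:
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def branch_target(pc: int, imm16: int) -> int:
    """PC + 4 + (sign-extended immediate shifted left by 2), as 32 bits."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


def resolve_branch_in_id(pc: int, rs_value: int, rt_value: int, imm16: int):
    """Return (next_pc, flush) for a beq resolved in the ID stage."""
    taken = rs_value == rt_value
    if taken:
        # The instruction already fetched into IF/ID must be squashed.
        return branch_target(pc, imm16), True
    return pc + 4, False


forward_target = branch_target(0x1000, 3)        # 3 words past PC + 4
backward_target = branch_target(0x1000, 0xFFFF)  # offset -1 word
```

Because only one instruction has been fetched by the time the branch resolves in ID, a taken branch costs one flushed instruction instead of three.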
32
Pipelined Processor - Branch Hardware in ID (Old Fig. 6.51)
34
Branch Prediction
Key idea: instead of always assuming branch not taken, use a prediction based on previous history
Branch history table: a small memory
  Indexed by the lower bits of the instruction address
  Saves “what happened” on the last execution: branch taken OR branch not taken
Use history to make prediction
35
More about Branch Prediction
Consider nested loops (C source, with the corresponding labels and branches):
  for (i=1; i<M; i++) {            oloop: ...
    for (j=1; j<N; j++) {          iloop: ...
      ...
    }                              bne $1, $2, iloop
  }                                bne $3, $4, oloop
With one bit of history, prediction of the inner-loop branch fails on the first and last pass through the inner loop
More history can improve performance
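The cost of one-bit history on the inner-loop branch can be counted with a short simulation. This is a sketch under stated assumptions: the branch at the bottom of the inner loop is taken on every pass except the last, the history bit starts at "not taken", and the helper names are hypothetical.

```python
def inner_branch_outcomes(outer_iters: int, inner_passes: int):
    """Outcomes of the inner loop's closing branch: taken until the last pass."""
    outcomes = []
    for _ in range(outer_iters):
        outcomes.extend([True] * (inner_passes - 1) + [False])
    return outcomes


def one_bit_mispredictions(outcomes, initial_prediction=False) -> int:
    """Count misses for a predictor that repeats the branch's last outcome."""
    prediction = initial_prediction
    misses = 0
    for taken in outcomes:
        if prediction != taken:
            misses += 1
        prediction = taken  # one bit of history: remember only the last outcome
    return misses


outcomes = inner_branch_outcomes(outer_iters=3, inner_passes=5)
misses = one_bit_mispredictions(outcomes)
```

With these assumptions, every trip through the inner loop costs two mispredictions: one when the loop is re-entered and one when it exits.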
36
Branch Prediction w/2-Bit History
Key idea: the predictor must be wrong twice before it changes its prediction
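One common way to realize two-bit history is a saturating counter, sketched below. This is an assumption and may differ in detail from the state diagram in the book's figure; the function name and the counter encoding (0 and 1 predict not taken, 2 and 3 predict taken) are illustrative.

```python
def two_bit_mispredictions(outcomes, counter: int = 0) -> int:
    """Count misses for a 2-bit saturating counter (0..3, predict taken if >= 2)."""
    misses = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken != taken:
            misses += 1
        # Move one step toward the actual outcome, saturating at 0 and 3.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return misses


# Inner-loop branch: taken four times, then not taken, for three outer passes.
outcomes = ([True] * 4 + [False]) * 3
misses_cold = two_bit_mispredictions(outcomes, counter=0)  # starts strongly not taken
misses_warm = two_bit_mispredictions(outcomes, counter=3)  # starts strongly taken
```

Once the counter has warmed up, the single not-taken outcome at loop exit only weakens the prediction, so re-entering the loop is predicted correctly and each trip costs one misprediction instead of two.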
38
Pipelining and Exceptions
Exceptions require suspension of execution
Complicating factors
  Several instructions are in pipeline
  Exception may occur before instruction is complete
  Must flush pipeline to suspend execution, but may lose information about the exception
39
Pipelining and Exceptions (cont’d)
(Fig. 6.42, old 6.55)
40
Pipelining and Exceptions (cont’d)
Operation: Figure 6.43 (p. 508)
Exceptions make life difficult - take a computer architecture course to learn more.
42
Performance of the Pipelined Implementation
Use the “gcc” instruction mix to calculate CPI
  Instruction   Frequency   Cycles
  lw            25%         1 (2 when there is a load-use hazard)
  sw            10%         1
  R-type        52%         1
  branch        11%         1 (2 when the prediction is wrong)
  jump          2%          2
Assumptions:
  50% of load instructions are immediately followed by a use of the loaded value
  25% of branch predictions are wrong
Calculating CPI
  Average lw cycles = (0.50 * 1) + (0.50 * 2) = 1.5
  Average branch cycles = (0.75 * 1) + (0.25 * 2) = 1.25
  CPI = (1.5 cycles * 0.25) + (1 cycle * 0.10) + (1 cycle * 0.52) + (1.25 cycles * 0.11) + (2 cycles * 0.02)
  CPI = 1.17 cycles per instruction
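The CPI figure is a weighted average, which is easy to reproduce. The sketch below uses the frequencies and assumptions stated on the slide; the variable names are illustrative.

```python
LOAD_USE_FRACTION = 0.50           # loads followed immediately by a use
BRANCH_MISPREDICT_FRACTION = 0.25  # branch predictions that are wrong

# instruction class -> (fraction of the gcc mix, average cycles)
mix = {
    "lw": (0.25, 1 + LOAD_USE_FRACTION * 1),               # 1 extra cycle on a hazard
    "sw": (0.10, 1),
    "R-type": (0.52, 1),
    "branch": (0.11, 1 + BRANCH_MISPREDICT_FRACTION * 1),  # 1 extra cycle on a miss
    "jump": (0.02, 2),
}

cpi = sum(fraction * cycles for fraction, cycles in mix.values())
print(f"CPI = {cpi:.2f}")
```

Changing either fraction at the top shows how sensitive the result is to load-use hazards and to prediction accuracy.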
43
Performance of the Pipelined Implementation (cont’d)
Calculate the average execution time per instruction:
  Pipelined      1.17 CPI * 200 ps/clock = 234 ps
  Single-cycle   1 CPI    * 600 ps/clock = 600 ps
  Multicycle     4.12 CPI * 200 ps/clock = 824 ps
Speedup of pipelined implementation
  2.56X faster than single cycle (600 ps / 234 ps)
  About 3.5X faster than multicycle (824 ps / 234 ps)
CPI may differ as the instruction mix changes, i.e., it depends on the benchmark programs used
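The execution times and speedups follow from CPI times clock period. The sketch below recomputes them from the three CPI and clock values listed; the dictionary layout and variable names are illustrative.

```python
# design -> (CPI, clock period in picoseconds)
designs = {
    "pipelined": (1.17, 200),
    "single-cycle": (1.0, 600),
    "multicycle": (4.12, 200),
}

# Average time per instruction in picoseconds.
time_ps = {name: cpi * clock for name, (cpi, clock) in designs.items()}

speedup_vs_single = time_ps["single-cycle"] / time_ps["pipelined"]
speedup_vs_multi = time_ps["multicycle"] / time_ps["pipelined"]

for name, t in time_ps.items():
    print(f"{name:13s} {t:6.0f} ps")
print(f"speedup vs single-cycle: {speedup_vs_single:.2f}x")
print(f"speedup vs multicycle:   {speedup_vs_multi:.2f}x")
```

The ratio of the listed times is 600 / 234 for the single-cycle design and 824 / 234 for the multicycle design.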