Pipelining (II).


1 Pipelining (II)

2 Single-Cycle Datapath
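[Figure: single-cycle datapath divided into five stages: IF (instruction fetch), ID (instruction decode / register file read), EX (execute / address calculation), MEM (memory access), WB (write back)]
What do we need to add to actually split the datapath into stages?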

3 Pipelined Datapath
[Figure: pipelined datapath with pipelined registers IF/ID, ID/EX, EX/MEM, and MEM/WB inserted between the stages]

4 Corrected Datapath
[Figure: corrected pipelined datapath]
The write register index is passed through the pipelined registers

5 Control in Pipelined Implementation
[Figure: pipelined datapath with control signals]
Borrow control logic from the single-cycle implementation
* The taken-branch target is updated at the 4th stage (MEM)

6 Pipelined Control
Divide control lines into five groups according to pipeline stages
What needs to be controlled in each stage?
- Instruction Fetch and PC Increment
- Instruction Decode / Register Fetch
- Execution
- Memory Stage
- Write Back

7 Pipeline Control
Pass control signals along just like the data
[Figure: control fields EX, M, and WB generated from the instruction in IF/ID and passed along the pipeline registers]

8 Datapath with Control
[Figure: pipelined datapath with control lines]
Control signals for the last three stages are created in ID
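The idea of creating the control word in ID and handing it down the pipeline can be sketched in Python. This is a minimal illustration, not the slide author's design: the `decode` and `clock` helpers are hypothetical, and the signal names `ALUSrc` and `MemtoReg` are assumed from the usual textbook MIPS datapath (only `RegWrite`, `MemRead`, and `MemWrite` appear in these slides).

```python
from typing import Dict, Optional

Fields = Dict[str, Dict[str, int]]  # group name ("EX", "M", "WB") -> signals


def decode(opcode: str) -> Fields:
    """Hypothetical ID-stage decoder covering two opcodes, for illustration."""
    if opcode == "lw":
        return {
            "EX": {"ALUSrc": 1},
            "M": {"MemRead": 1, "MemWrite": 0},
            "WB": {"RegWrite": 1, "MemtoReg": 1},
        }
    if opcode == "add":
        return {
            "EX": {"ALUSrc": 0},
            "M": {"MemRead": 0, "MemWrite": 0},
            "WB": {"RegWrite": 1, "MemtoReg": 0},
        }
    raise ValueError(f"opcode not covered by this sketch: {opcode}")


def clock(regs: Dict[str, Optional[Fields]],
          decoded: Optional[Fields]) -> Dict[str, Optional[Fields]]:
    """One clock edge: each pipeline register keeps only the control
    groups that later stages still need."""
    id_ex, ex_mem = regs["ID/EX"], regs["EX/MEM"]
    return {
        "ID/EX": decoded,  # EX, M, and WB groups
        "EX/MEM": None if id_ex is None else {g: id_ex[g] for g in ("M", "WB")},
        "MEM/WB": None if ex_mem is None else {"WB": ex_mem["WB"]},
    }


# Empty pipeline: no instruction has reached any register yet.
EMPTY = {"ID/EX": None, "EX/MEM": None, "MEM/WB": None}
```

Calling `clock(EMPTY, decode("lw"))` three times in a row (feeding `None` afterwards) moves the `lw` control word from ID/EX to MEM/WB, shedding the EX group and then the M group along the way.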

9 Hazard Detection & Forwarding Units
[Figure: pipelined datapath with a hazard detection unit and a forwarding unit; labelled signals include IF/IDWrite]
- Hazard detection unit: stall by inserting a bubble instead of letting the instruction proceed
- Forwarding unit: send control signals to the MUXes in front of the ALU

10 Forwarding Example
[Figure: pipeline timing diagram, time in clock cycles CC1 to CC9; each instruction passes through IM, Reg, ALU, DM, Reg, and one bubble is shown]
  add $t1,$t2,$t3
  sub $t4,$t1,$t3
  lw $t6,20($t1)
  or $t7,$t6,$t3
  xor $t5,$t2,$t6

11 Forwarding

12 Hazard Detection & Forwarding
1. EX Hazard (Forward from EX/MEM to ID/EX)
  add $1, $1, $2
  add $1, $1, $3
If (EX/MEM.RegWrite
    and (EX/MEM.RegisterRd ≠ 0)
    and (EX/MEM.RegisterRd = ID/EX.RegisterRs))
  ForwardA = /* from prior ALU result */
Similar forwarding logic for RegisterRt and ForwardB
Notes: RegWrite is asserted for R-type or lw; $0 is hardwired zero; register indexes come from the pipeline registers
[Figure: pipeline diagram (IF, ID, EX, MEM, WB)]

13 Hazard Detection & Forwarding (cont’d)
2. MEM Hazard (Forward from MEM/WB to ID/EX)
  add $1, $1, $2
  add $3, $3, $2
  add $5, $1, $4
If (MEM/WB.RegWrite
    and (MEM/WB.RegisterRd ≠ 0)
    and (MEM/WB.RegisterRd = ID/EX.RegisterRs))
  ForwardA = /* from memory or earlier ALU result */
Similar forwarding logic for RegisterRt and ForwardB
[Figure: pipeline diagram (IF, ID, EX, MEM, WB)]

14 Hazard Detection & Forwarding (cont’d)
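One complication: when both earlier instructions write the same register, the more recent result (in EX/MEM) must be the one forwarded
  add $1, $1, $2
  add $1, $1, $3
  add $1, $1, $4
If (MEM/WB.RegWrite
    and (MEM/WB.RegisterRd ≠ 0)
    and (EX/MEM.RegisterRd ≠ ID/EX.RegisterRs)
    and (MEM/WB.RegisterRd = ID/EX.RegisterRs))
  ForwardA = 01
A similar modification for RegisterRt in the MEM hazard is needed
[Figure: pipeline diagram (IF, ID, EX, MEM, WB) with one forwarding path marked X]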
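The EX-hazard and MEM-hazard conditions from the last three slides can be combined into one forwarding unit. The Python sketch below is illustrative only. Two points are assumptions rather than slide content: the encoding `10` for forwarding from EX/MEM (the slides give only `01` for the MEM hazard), and the priority rule, which suppresses the MEM hazard whenever a full EX hazard is detected (this also checks `EX/MEM.RegWrite`, which is slightly stricter than the register comparison shown on the slide).

```python
from typing import Tuple

FROM_REGFILE = 0b00   # no forwarding: use the value read in ID
FROM_MEM_WB = 0b01    # MEM hazard: forward from MEM/WB
FROM_EX_MEM = 0b10    # EX hazard: forward the prior ALU result (assumed encoding)


def forwarding_unit(ex_mem_reg_write: bool, ex_mem_rd: int,
                    mem_wb_reg_write: bool, mem_wb_rd: int,
                    id_ex_rs: int, id_ex_rt: int) -> Tuple[int, int]:
    """Return (ForwardA, ForwardB) for the instruction currently in EX."""

    def select(src: int) -> int:
        # EX hazard: the instruction one ahead writes the register we read.
        if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == src:
            return FROM_EX_MEM
        # MEM hazard: only considered when there is no EX hazard, so the
        # most recent result takes priority.
        if mem_wb_reg_write and mem_wb_rd != 0 and mem_wb_rd == src:
            return FROM_MEM_WB
        return FROM_REGFILE

    return select(id_ex_rs), select(id_ex_rt)
```

For the three-instruction sequence on the "one complication" slide, the third `add` sees `$1` as the destination in both EX/MEM and MEM/WB, and `forwarding_unit(True, 1, True, 1, 1, 4)` selects the EX/MEM value for the first ALU operand.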

15 Hazard Detection Logic
Load-Use Hazard needs one stall
If ( ID/EX.MemRead and
     ( (ID/EX.RegisterRt = IF/ID.RegisterRs) or
       (ID/EX.RegisterRt = IF/ID.RegisterRt) ) )
  Stall the pipeline (by inserting a nop into EX)
RegisterRt is the destination for lw
[Figure: pipeline diagram (IF, ID, EX, MEM, WB); a bubble is inserted between lw $s0,20($t1) and the dependent sub $t2,$s0,$t3]
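The load-use condition translates directly into a small predicate. The following Python sketch is a hypothetical helper, not part of the slides; the register numbers in the usage note follow the standard MIPS convention ($t1 = 9, $t2 = 10, $t3 = 11, $s0 = 16).

```python
def load_use_stall(id_ex_mem_read: bool, id_ex_rt: int,
                   if_id_rs: int, if_id_rt: int) -> bool:
    """True when the instruction in ID reads the register that a load
    in EX is about to write (RegisterRt is the destination of lw)."""
    return bool(id_ex_mem_read and
                (id_ex_rt == if_id_rs or id_ex_rt == if_id_rt))
```

With `lw $s0,20($t1)` in EX and `sub $t2,$s0,$t3` in ID, `load_use_stall(True, 16, 16, 11)` reports that a stall is needed, while the same `sub` behind a non-load instruction does not stall.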

16 Inserting Bubble for Stall
If the instruction in the ID stage is stalled, the instruction in the IF stage must be stalled as well
- Stalling means preventing any progress
- Prevent PC and IF/ID from changing (deassert PCWrite, IF/IDWrite)
  - Keep the old values
Insert a nop in the EX stage
- Set the EX, MEM and WB control fields to 0
- At least, deassert RegWrite and MemWrite so that no state changes
[Figure: pipeline diagram (IF, ID, EX, MEM, WB)]
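The effect of a stall on the front of the pipeline can be sketched as a single clock-edge function. This is a simplified model under stated assumptions: the state dictionary, its keys, and `clock_front_end` are hypothetical names, and the control word is reduced to the three signals the slides mention.

```python
from typing import Any, Dict

# Zeroed control word: with RegWrite and MemWrite at 0, the bubble
# cannot change register or memory state.
NOP_CONTROL = {"RegWrite": 0, "MemRead": 0, "MemWrite": 0}


def clock_front_end(state: Dict[str, Any], fetched: Any,
                    decoded_control: Dict[str, int],
                    stall: bool) -> Dict[str, Any]:
    """Advance PC, IF/ID, and the ID/EX control fields by one cycle."""
    if stall:
        # PCWrite and IF/IDWrite deasserted: keep the old values,
        # and send a nop into EX.
        return {"PC": state["PC"],
                "IF/ID": state["IF/ID"],
                "ID/EX.control": dict(NOP_CONTROL)}
    return {"PC": state["PC"] + 4,
            "IF/ID": fetched,
            "ID/EX.control": dict(decoded_control)}
```

On a stalled cycle the same instruction is decoded again in the next cycle, which is why the IF stage must hold as well.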

17 Branch Prediction and Flush
Static branch prediction
- Assume all branches will be not-taken (fall-through)
- Pipeline stall only when prediction is incorrect
[Figure: two timing diagrams with stages IF, ID, EXE, MEM, WB]
- First diagram: add $4, $5, $6; beq $1, $2, 40 starts 200 ps later; lw $3, 300($0) starts 200 ps after that
- Second diagram: add $4, $5, $6; beq $1, $2, 40 starts 200 ps later; a bubble follows; or $7, $8, $9 starts 400 ps after beq

18 Flushing Instructions
[Figure: pipelined datapath with the branch comparison in the ID stage]
- Flush the instruction in the IF stage if the prediction fails
- Branch decision is moved to the ID stage
  - This requires forwarding and stall logic for the comparison in ID

19 Branch Prediction
Static “not-taken” prediction
- A penalty of one cycle for each taken branch
- With deeper pipelines, the penalty increases and drastically hurts performance
Need dynamic branch prediction
- Branch prediction buffer or branch history table (BHT)
  - Indexed by the last k bits from the PC: 2^k entries
- 2-bit predictors
  - States: Predict Taken (11), Predict Taken (10), Predict Not-taken (01), Predict Not-taken (00)
  - Transitions between states on Taken / Not-taken outcomes
[Figure: a 2-bit prediction scheme for the BHT]
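A branch history table with 2-bit predictors can be sketched in a few lines of Python. The slide residue names the four states but not the exact transitions, so this sketch assumes a saturating counter (one common choice). The class name, the initial state, and the decision to drop the two byte-offset bits of a word-aligned PC before taking the last k bits are also assumptions.

```python
from typing import List


class BranchHistoryTable:
    """2^k entries, each a 2-bit state: 11/10 predict taken, 01/00 not taken."""

    def __init__(self, k: int, initial: int = 0b01) -> None:
        self.mask = (1 << k) - 1
        self.counters: List[int] = [initial] * (1 << k)

    def _index(self, pc: int) -> int:
        # Assumption: instructions are word aligned, so skip the low 2 bits.
        return (pc >> 2) & self.mask

    def predict(self, pc: int) -> bool:
        """True means 'predict taken'."""
        return self.counters[self._index(pc)] >= 0b10

    def update(self, pc: int, taken: bool) -> None:
        """Move one step toward 11 on taken, toward 00 on not-taken."""
        i = self._index(pc)
        c = self.counters[i]
        self.counters[i] = min(c + 1, 0b11) if taken else max(c - 1, 0b00)
```

With this scheme a branch in state 11 needs two consecutive not-taken outcomes before the prediction flips, so a single loop exit does not disturb the prediction for the next run of the loop.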

20 Branch Target Buffer
Branch prediction still needs one cycle to calculate the target address even after the prediction (taken or not-taken)
- Need to fill in the IF stage at every cycle
[Figure: branch target buffer; each entry holds a branch PC, a predicted PC, and prediction state bits]
Is the PC of the instruction to fetch equal to a stored branch PC?
- Yes: the instruction is a branch; use the predicted PC as the next PC
- No: the instruction is not a branch or is a predicted not-taken branch; next PC = PC + 4
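The lookup described on this slide can be modelled as a map from branch PC to predicted PC. The Python sketch below is deliberately simplified and makes several assumptions: it uses an unbounded dictionary instead of a fixed-size indexed table, it omits the prediction state bits, and it keeps an entry only while the branch is predicted taken. The class and method names are hypothetical.

```python
from typing import Dict


class BranchTargetBuffer:
    """Maps the PC of a predicted-taken branch to its predicted next PC."""

    def __init__(self) -> None:
        self.entries: Dict[int, int] = {}

    def next_pc(self, pc: int) -> int:
        # Hit: use the predicted PC. Miss (not a branch, or a branch
        # predicted not-taken): fall through to PC + 4.
        return self.entries.get(pc, pc + 4)

    def update(self, pc: int, taken: bool, target: int) -> None:
        """Record the outcome once the branch has been resolved."""
        if taken:
            self.entries[pc] = target
        else:
            self.entries.pop(pc, None)
```

Because `next_pc` needs only the fetch address, the next PC is available in the same cycle as the fetch, which is what lets the IF stage be filled every cycle.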

21 Advanced Pipelining
Increase the depth of the pipeline
- Branch prediction is 95% accurate
- Correlated prediction, tournament prediction
Start more than one instruction each cycle (multiple issue)
- Superscalar processors (multiple functional units)
- VLIW: very long instruction word, static multiple issue
- Loop unrolling to expose more ILP (instruction-level parallelism)
Modern processors often implement
- Superscalar
- Multiple issue
- Out-of-order execution

22 Summary
Pipelined processors need
- Pipelined registers
  - To pass intermediate data and control for later stages
- Forwarding logic & stall logic
  - To reduce stalls due to data hazards
  - To correctly delay the pipeline for load-use data hazards
- Flush logic
  - To flush out instructions from incorrect execution paths
- Branch history table or branch target buffer
  - To reduce the stalls from control hazards

