PROCESSOR PIPELINING YASSER MOHAMMAD
SINGLE DATAPATH DESIGN
ELEMENT USAGE DURING EXECUTION
ENTER STAGE REGISTERS
SEQUENCE OF OPERATION OF A LW INSTRUCTION
CORRECTED WB IN LW
LW
THIRD STAGE OF SW
KEY POINTS
- Each component can be used in at most one stage, to avoid structural hazards.
- Data needed in later stages can be passed through the stage registers from the stage that generates it.
- Always check where current addresses are coming from.
DRAWING PIPELINES
DESIGNING A CONTROLLER FOR THE PIPELINE
- Take as much as you can from the single-cycle design
- See the world through rose-colored glasses
- Steps:
  1. Label the lines
  2. Divide the control lines per stage (remember: one component, one stage)
  3. Design a control circuit for each stage
THE LINES
ALU CONTROL
CONTROL LINES
CONTROL SIGNAL PASSING
THE COMPLETE BEAST
HAZARDS
    sub $2, $1, $3
    and $12, $2, $5
    or  $13, $6, $2
    add $14, $2, $2
    sw  $15, 100($2)
IN THE PIPELINE
1a. EX/MEM.RegisterRd = ID/EX.RegisterRs
1b. EX/MEM.RegisterRd = ID/EX.RegisterRt
2a. MEM/WB.RegisterRd = ID/EX.RegisterRs
2b. MEM/WB.RegisterRd = ID/EX.RegisterRt
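The four conditions can be checked mechanically. The following is a minimal Python sketch, assuming each pipeline register is modeled as a plain dict. The function name `data_hazards` and the dict keys are illustrative, and the RegWrite and "destination is not $0" guards are the usual textbook refinements rather than something stated on this slide.

```python
def data_hazards(id_ex, ex_mem, mem_wb):
    """Return the labels (1a, 1b, 2a, 2b) of the hazard conditions that hold.

    Each argument is a dict standing in for one pipeline register.
    Assumption: a hazard only counts if the older instruction writes a
    register (RegWrite) and its destination is not $0.
    """
    found = []
    ex_writes = ex_mem["RegWrite"] and ex_mem["Rd"] != 0
    wb_writes = mem_wb["RegWrite"] and mem_wb["Rd"] != 0
    if ex_writes and ex_mem["Rd"] == id_ex["Rs"]:
        found.append("1a")
    if ex_writes and ex_mem["Rd"] == id_ex["Rt"]:
        found.append("1b")
    if wb_writes and mem_wb["Rd"] == id_ex["Rs"]:
        found.append("2a")
    if wb_writes and mem_wb["Rd"] == id_ex["Rt"]:
        found.append("2b")
    return found


# sub $2,$1,$3 is in MEM while and $12,$2,$5 is in EX: condition 1a holds.
sub_in_mem = {"RegWrite": True, "Rd": 2}
and_in_ex = {"Rs": 2, "Rt": 5}
idle = {"RegWrite": False, "Rd": 0}
print(data_hazards(and_in_ex, sub_in_mem, idle))  # ['1a']
```

One cycle later the sub is in WB and the or $13,$6,$2 is in EX, so the same function reports condition 2b.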
FORWARDING HARDWARE Ignores forwarding to a store
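As a rough illustration of what the forwarding hardware decides, here is a small Python sketch of the selection logic for one ALU operand. It assumes pipeline registers modeled as dicts; the function name `forward_select` and the string labels are hypothetical, and the priority rule (the EX/MEM result wins over MEM/WB) is the standard textbook choice, not taken from this slide. Like the hardware on the slide, it ignores forwarding to a store.

```python
def forward_select(src, ex_mem, mem_wb):
    """Pick where one ALU operand should come from.

    src is the source register number (Rs or Rt) of the instruction in EX.
    Returns "EX/MEM" (forward the previous ALU result), "MEM/WB" (forward
    the value about to be written back) or "REG" (use the register file).
    """
    if ex_mem["RegWrite"] and ex_mem["Rd"] != 0 and ex_mem["Rd"] == src:
        return "EX/MEM"  # the most recent result takes priority
    if mem_wb["RegWrite"] and mem_wb["Rd"] != 0 and mem_wb["Rd"] == src:
        return "MEM/WB"
    return "REG"


# Both older instructions write $2: the newer value (in EX/MEM) is forwarded.
newer = {"RegWrite": True, "Rd": 2}
older = {"RegWrite": True, "Rd": 2}
print(forward_select(2, newer, older))  # EX/MEM
print(forward_select(5, newer, older))  # REG
```

The function is called once per operand, which mirrors the two multiplexers placed in front of the ALU inputs.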
LOOK CAREFULLY. ANY HAZARDS?
THE DATAPATH WITH FORWARDING AND ITS CONTROL
    lw $s0, 0($s1)
    sw $s0, 4($s1)
DEALING WITH THE IMMEDIATE
WHEN STALLING IS A MUST ….
How many stalls will we need:
- Without forwarding?
- With forwarding?
NOTES ON STALLING
- When stalling in ID, we must also stall in IF. Why?
- How? Freeze the PC and the IF/ID register.
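The freeze can be sketched in a few lines of Python. This is only an illustration under stated assumptions: pipeline registers are dicts, the names `must_stall` and `step_front_end` are hypothetical, and the stall condition used is the usual load-use check (a load in EX whose destination Rt is a source of the instruction in ID).

```python
def must_stall(if_id, id_ex):
    """Load-use check done in ID: the instruction in EX is a load whose
    destination (Rt) is a source register of the instruction being decoded."""
    return bool(id_ex["MemRead"]) and id_ex["Rt"] in (if_id["Rs"], if_id["Rt"])


def step_front_end(pc, if_id, id_ex, fetched):
    """Advance IF and ID by one cycle and return (pc, if_id, bubble).

    fetched is the instruction IF has just read at pc. On a stall, the PC
    and IF/ID keep their old values, so the same instruction is decoded
    again next cycle, and bubble=True says ID/EX receives all-zero controls.
    """
    if must_stall(if_id, id_ex):
        return pc, if_id, True
    return pc + 4, fetched, False


# lw $2, 20($1) is in EX while and $4, $2, $5 is in ID: one stall cycle.
lw_in_ex = {"MemRead": True, "Rt": 2}
and_in_id = {"Rs": 2, "Rt": 5}
next_instr = {"Rs": 4, "Rt": 6}
print(step_front_end(8, and_in_id, lw_in_ex, next_instr))
```

If IF were allowed to advance while ID is held, the fetched instruction would overwrite the one still waiting in IF/ID, which is why both are frozen together.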
STALLING + FORWARDING HARDWARE
CONTROL HAZARDS
RESOLUTION 1: PREDICT BRANCH NOT TAKEN
- Continue fetching sequentially
- If the branch is taken (known in the MEM stage of the branch):
  - Set all control signals in IF/ID, ID/EX, and EX/MEM to unasserted (0)
- DONE
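The flush step can be sketched as follows. This is a simplified Python model, not the actual hardware: each stage register is a dict, the control-signal names in `CONTROL_SIGNALS` are hypothetical, and the three arguments stand for the values about to be latched into IF/ID, ID/EX, and EX/MEM while the branch is in MEM.

```python
# Hypothetical control-signal names; the real set depends on the datapath.
CONTROL_SIGNALS = ("RegWrite", "MemRead", "MemWrite", "Branch")


def flush(stage_register):
    """Return a copy of a stage register with every control signal set to 0."""
    flushed = dict(stage_register)
    for name in CONTROL_SIGNALS:
        flushed[name] = 0
    return flushed


def resolve_branch(taken, next_if_id, next_id_ex, next_ex_mem):
    """Called when the branch is in MEM and its outcome is known.

    If the branch is taken, the three younger instructions were fetched
    from the wrong path, so their control signals are cleared and they
    move down the pipeline without changing any state.
    """
    if not taken:
        return next_if_id, next_id_ex, next_ex_mem
    return flush(next_if_id), flush(next_id_ex), flush(next_ex_mem)


wrong_path = {"RegWrite": 1, "MemRead": 0, "MemWrite": 1, "Branch": 0, "Rd": 7}
print(resolve_branch(True, wrong_path, wrong_path, wrong_path)[0])
```

Data fields such as the destination register number are left alone: with RegWrite and MemWrite at 0 they can no longer have any effect.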
RESOLUTION 2: DECIDE EARLIER
- Most branches use simple tests that do not require a complete ALU
- What do we need?
  - Calculate the address of the branch early
    - Add an adder in the ID stage
  - Compute the decision to branch early
    - More involved, but can be done in ID (for equality at least)
- Another source of data hazards. What is it?
  - New logic is needed to forward to ID from EX/MEM or MEM/WB
- Extra sources of stalling, even when forwarding to the branch in ID:
  - R-instruction followed by branch: one stall
  - Load followed by branch: two stalls
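The one-stall and two-stall figures follow from when each producer's value becomes available relative to the ID stage of the branch. A minimal Python sketch, assuming a five-stage pipeline with forwarding into ID, an ALU result ready at the end of EX, and load data ready at the end of MEM; the names `READY_AFTER_ID` and `branch_stalls` are illustrative.

```python
# Number of stages after ID at whose end the value is available (assumption:
# ALU results are ready after EX, load data after MEM).
READY_AFTER_ID = {"alu": 1, "load": 2}


def branch_stalls(producer, distance):
    """Stall cycles for a branch resolved in ID that reads a register
    written by a producer instruction `distance` slots earlier
    (distance=1 means the producer is immediately before the branch)."""
    if distance < 1:
        raise ValueError("distance must be at least 1")
    return max(0, READY_AFTER_ID[producer] - distance + 1)


print(branch_stalls("alu", 1))   # 1: R-instruction followed by branch
print(branch_stalls("load", 1))  # 2: load followed by branch
print(branch_stalls("load", 2))  # 1: one independent instruction in between
```

Putting unrelated instructions between the producer and the branch reduces the count, which is what a scheduling compiler tries to do.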
DYNAMIC BRANCH PREDICTION
- Branch prediction buffer (branch history table)
  - Indexed by the LSBs of the branch address
- Prediction helps with the decision calculation but not with the address calculation
  - Use delayed branch
  - Branch target buffer
- Global branch behavior
- Tournament predictors
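A branch history table indexed by the low bits of the branch address can be sketched in Python as below. The slide does not say how many bits each entry holds; this sketch assumes the common 2-bit saturating counter, and the class name, table size, and initial counter value are all illustrative choices.

```python
class BranchHistoryTable:
    """Table of 2-bit saturating counters (0..3); values 2 and 3 predict taken."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        self.counters = [1] * (1 << index_bits)  # start at "weakly not taken"

    def _index(self, branch_addr):
        # Instructions are word aligned, so drop the two always-zero bits
        # before taking the least significant bits as the index.
        return (branch_addr >> 2) & self.mask

    def predict(self, branch_addr):
        return self.counters[self._index(branch_addr)] >= 2

    def update(self, branch_addr, taken):
        i = self._index(branch_addr)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


# A loop branch at a made-up address: taken 9 times, then falls through, twice.
bht = BranchHistoryTable()
outcomes = ([True] * 9 + [False]) * 2
misses = 0
for taken in outcomes:
    if bht.predict(0x40) != taken:
        misses += 1
    bht.update(0x40, taken)
print(misses)
```

Because only the low address bits are used, two different branches can share an entry and disturb each other's history. The table also gives only the direction, not the target, which is the gap a branch target buffer fills.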
DELAYED BRANCH
- Only effective for short pipelines
- The compiler/assembler is responsible for rescheduling.
PUTTING IT ALL TOGETHER
EXCEPTIONS AND INTERRUPTS
- Arithmetic overflow
- Undefined instruction
- Basic action:
  - Save the PC in EPC and the cause in the Cause register
  - Call the OS (by jumping to 0x )
- How does the OS know the reason for the exception?
  - Cause register (MIPS)
  - Vectored interrupts (x86)
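The basic action amounts to three register updates. A minimal Python sketch, assuming the processor state is a dict with PC, EPC, and Cause entries. The function name, the cause codes, and the handler address used in the example are hypothetical placeholders, not the real MIPS values.

```python
# Hypothetical cause codes, chosen only for this illustration.
CAUSE_UNDEFINED_INSTRUCTION = 0
CAUSE_ARITHMETIC_OVERFLOW = 1


def take_exception(state, cause_code, handler_addr):
    """Return the processor state after an exception is taken.

    EPC remembers the PC so the OS can find the offending instruction,
    Cause records why, and the PC is redirected to the OS handler.
    """
    new_state = dict(state)
    new_state["EPC"] = state["PC"]
    new_state["Cause"] = cause_code
    new_state["PC"] = handler_addr
    return new_state


HANDLER = 0x1000  # placeholder address, not the actual OS entry point
before = {"PC": 0x4C, "EPC": 0, "Cause": 0}
after = take_exception(before, CAUSE_ARITHMETIC_OVERFLOW, HANDLER)
print(after)
```

With a single handler address, as here, the OS reads Cause to find the reason. In a vectored scheme the reason would instead select among several handler addresses.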
PIPELINED IMPLEMENTATION
- Exceptions are control hazards