Pipelining III Andreas Klappenecker CPSC321 Computer Architecture
Administrative Issues Talk by Laszlo Kish Quantum Computing Seminar, Thursday 10:00am-11:00am, HRBB 302 Projects: Get started!!!
Pipelined Datapath Pipeline separation registers, width varies
Control Lines Instruction fetch: control signal to read instruction memory and to write PC are always asserted - nothing special here Instruction decode/register file read: same thing happens every clock cycle, so no optional control lines to set Execution/address calculation RegDst selects the result register, ALUOp selects the ALU operation ALUSrc selects Read data 2 or sign-extd. immediate
Pipelined Datapath w/ Controls Signals
Control Lines Memory access Branch set by branch equal MemRead set by load instructions MemWrite set by store instructions Write back MemtoReg send ALU result or memory value RegWrite selects register
Pipelined Datapath w/ Controls Signals
Pass control signals along just like the data Pipeline Control
Datapath with Control
Assume that the compiler has to guarantee that no hazards occur Where do we insert the “nops” ? sub$2, $1, $3 and $12, $2, $5 or$13, $6, $2 add$14, $2, $2 sw$15, 100($2) Data Hazards
Data hazard: a dependency that “goes backward in time” Dependencies
Solution sub$2, $1, $3 nop and $12, $2, $5 or$13, $6, $2 add$14, $2, $2 sw$15, 100($2) Problem: this slows us down! Resolution of Data Hazards
Forwarding Do not wait until result have been written Use temporary results! Use register file forwarding to handle read/write to same register ALU forwarding
Forwarding
Load word can still cause a hazard: an instruction trying to read a register following a load instruction writing to the same register. Need a hazard detection unit to “stall” pipeline Obstructions to Forwarding
Stalling We can stall the pipeline by keeping an instruction in the same stage
Hazard Detection Unit Stall by letting an instruction that won’t write anything go forward
When we decide to branch, other instructions are in the pipeline! We are predicting “branch not taken” need to add hardware for flushing instructions if we are wrong Branch Hazards
Flushing Move branch decision from 4th pipeline stage to the second only one instruction following the branch will be in the pipeline IF.Flush turns fetched instruction into a nop by zeroing the IF/ID pipeline register
Flushing Instructions
Improving Performance Try and avoid stalls by reordering instructions Add a “branch delay slot” the next instruction after a branch is always executed rely on compiler to “fill” the slot with something useful Superscalar: start more than one instruction in the same cycle
Dynamic Scheduling The hardware performs the “scheduling” hardware tries to find instructions to execute out of order execution is possible speculative execution and dynamic branch prediction All modern processors are very complicated DEC Alpha 21264: 9 stage pipeline, 6 instruction issue PowerPC and Pentium: branch history table Compiler technology important This class has given you the background you need to learn more - read Chapter 6! More material will be posted!