Published by Annabel Anderson. Modified over 9 years ago.
CSCE 212 Chapter 6: Enhancing Performance with Pipelining
Instructor: Jason D. Bakos
CSCE 212 2 Pipelining
CSCE 212 3 MIPS Pipeline
Basic idea:
– Execute multiple instructions in parallel
– Split instruction execution into 5 stages
– Instructions execute in “assembly-line” fashion
[Figure: five-stage datapath (fetch, decode, execute, memory, write back). Labels include PC, RegFile, control, ALU, instruction register, op/func, rs/rt, SE/imm, SE/imm*4, SHAMT, A, B, R, MDR, MemRead, MemWrite, Address, DataIn, MemoryIn, MemoryOut. Pipeline register annotations: A, B registers, control for execute/memory/wb, rs/rt/rd; R register, control for memory/wb, rs/rt/rd, ctrl/NOOP; MDR register, control for wb, rs/rt/rd.]
CSCE 212 4 Pipelined MIPS
CSCE 212 5 Pipelined MIPS
CSCE 212 6 Pipelined Control
CSCE 212 7 Pipelined Control
CSCE 212 8 Pipelined Control
CSCE 212 9 MIPS ISA
MIPS pipeline stages
– Fetch (F)
    read next instruction from memory, increment address counter
    assume 1 cycle to access memory
– Decode (D)
    read register operands, resolve instruction into control signals, compute branch target
– Execute (E)
    execute arithmetic/resolve branches
– Memory (M)
    perform load/store accesses to memory, take branches
    assume 1 cycle to access memory
– Write back (W)
    write arithmetic and load results to register file
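As a rough illustration of the five stages listed above, the following sketch prints which stage each instruction occupies in every clock cycle of an ideal pipeline with no hazards. The function name `ideal_pipeline_diagram` and the three-instruction sample program are assumptions made for illustration; they do not come from the slides.

```python
# Stage letters follow the slide: Fetch, Decode, Execute, Memory, Write back.
STAGES = ["F", "D", "E", "M", "W"]


def ideal_pipeline_diagram(instructions):
    """Return one row per instruction for a hazard-free pipeline.

    Column c of a row holds the stage occupied in clock cycle c + 1,
    or "." when the instruction is not in the pipeline.
    """
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for index in range(len(instructions)):
        row = ["."] * total_cycles
        # Instruction i enters fetch in cycle i + 1 and advances one stage per cycle.
        for offset, stage in enumerate(STAGES):
            row[index + offset] = stage
        rows.append("".join(row))
    return rows


# Hypothetical sample program, chosen only to show the overlap.
program = ["add $6,$5,$2", "lw $7,0($6)", "addi $2,$2,4"]
for text, row in zip(program, ideal_pipeline_diagram(program)):
    print(f"{text:<16}{row}")
```

With n instructions and no stalls, the last instruction leaves write back in cycle n + 4, which is where the 4 "fill" cycles of a five-stage pipeline come from.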
CSCE 212 10 Hazards
Hazards are data flow problems that arise as a result of pipelining
– They limit the amount of parallelism and sometimes induce “penalties” that prevent the pipeline from completing one instruction per clock cycle
– Structural hazards
    Two operations require a single piece of hardware
    Structural hazards can be overcome by adding hardware
– Control hazards
    Conditional control instructions are not resolved until late in the pipeline, requiring subsequent instruction fetches to be predicted
    – Fetched instructions are flushed if the prediction does not hold (make sure they cause no state change)
    Branch hazards can be handled with dynamic prediction/speculation or a branch delay slot
– Data hazards
    An instruction in one pipeline stage is “dependent” on data computed in another pipeline stage
CSCE 212 11 Hazards
Data hazards
– Register values are “read” in decode and written during write-back
    A RAW hazard occurs when dependent instructions are separated by fewer than 2 slots
    Examples (each column is a separate case, showing the stage of the producing ADD when the dependent ADD is in decode):
      ADD $2,$X,$X (E)    ADD $2,$X,$X (M)    ADD $2,$3,$4 (W)
      ADD $X,$2,$X (D)    …                   …
      …                   ADD $X,$2,$X (D)    …
      …                   …                   ADD $X,$2,$3 (D)
– In most cases, data is generated in the same stage in which it is required (EX)
    Data forwarding
      ADD $2,$X,$X (M)    ADD $2,$X,$X (W)    ADD $2,$3,$4 (out-of-pipe)
      ADD $X,$2,$X (E)    …                   …
      …                   ADD $X,$2,$X (E)    …
      …                   …                   ADD $X,$2,$3 (E)
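The RAW cases above can be sketched as a function of the distance between the producing and the dependent instruction (1 means the dependent instruction comes immediately after). This is a minimal sketch under two assumptions that the slides only imply: the producer is an arithmetic instruction whose result is ready at the end of E, and the register file can be written and read in the same cycle, so a producer in W and a consumer in D do not conflict. The function names are hypothetical.

```python
def stalls_without_forwarding(distance):
    """Stall cycles needed when the consumer must read the register file in D.

    distance = 1 means the consumer immediately follows the producer.
    The producer writes in W, three stages after D, so the consumer has to
    trail it by at least three instructions to read the new value.
    """
    if distance < 1:
        raise ValueError("distance must be at least 1")
    return max(0, 3 - distance)


def forwarding_source(distance):
    """Where a consumer in E obtains the operand when forwarding is available."""
    if distance < 1:
        raise ValueError("distance must be at least 1")
    if distance == 1:
        return "M"  # producer is one stage ahead, in memory
    if distance == 2:
        return "W"  # producer is two stages ahead, in write back
    return "register file"  # producer is out of the pipe; value was read in D


for d in (1, 2, 3):
    print(d, stalls_without_forwarding(d), forwarding_source(d))
```

With forwarding, none of the three arithmetic cases needs a stall, because the value is taken from the M or W pipeline register instead of waiting for the register file write.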
CSCE 212 12 “Load” Hazards
Stalls are required when data is not produced in the same stage in which a subsequent instruction needs it
– Example:
    LW $2, 0($X)   (M)
    ADD $X, $2     (E)
When this occurs, insert a “bubble” into the EX stage and stall F and D
    LW $2, 0($X)   (W)
    NOOP           (M)
    ADD $X, $2     (E)
– Forward from W to E
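The load hazard check described above can be sketched as a scan over adjacent instruction pairs. The `Instr` tuple, the `load_use_stalls` function, and the concrete register numbers are hypothetical names introduced for illustration. The sketch also assumes, for simplicity, that every source operand is needed in E; a real design could treat the data operand of a store differently.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical minimal instruction record: opcode, destination, sources."""
    op: str
    dest: Optional[str]
    srcs: Tuple[str, ...]


def load_use_stalls(program):
    """Count one bubble for each instruction that uses a register loaded by
    the instruction immediately before it (load data arrives only after M)."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        if prev.op == "lw" and prev.dest is not None and prev.dest in cur.srcs:
            stalls += 1
    return stalls


# Illustrative registers only; the slide leaves most of them as $X.
program = [
    Instr("lw", "$2", ("$1",)),
    Instr("add", "$3", ("$2", "$2")),
]
print(load_use_stalls(program))  # 1
```

If an independent instruction sits between the load and its use, the loaded value can be forwarded from W to E and no bubble is counted.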
CSCE 212 13 Data Hazards: Forwarding
CSCE 212 14 Data Hazards: Stalling for Load Hazard
CSCE 212 15 Control Hazards
Need to make a branch decision based on data that has yet to be produced:
– add $2,$3,$4
– beqz $2,loop
In which stage is the branch resolved?
Approaches:
– stall
    insert bubbles after all branches
– always predict untaken
    if taken, instructions entering DEC and EX (and MEM?) transfer as NOOPs
– branch delay slot
    instruction following branch is always executed
– dynamic branch predictors
CSCE 212 16 Control Hazards
Instructions are fetched every clock cycle
Branch decisions happen in the EX stage
Solutions:
– Assume branch not taken (if the branch is taken, flush IF, ID, and EX by inserting a nop into the pipeline registers on the clock edge)
– Reduce the delay by moving the branch decision up
    Requires additional hardware (comparators, etc.)
    – Might increase cycle time, since register read and branch resolution are now in series and must be performed in half a cycle to allow for parallel register writes
    Requires forwarding and stall hardware for new data hazards
CSCE 212 17 Example
add $6,$5,$2
lw $7,0($6)
addi $7,$7,10
add $6,$4,$2
sw $7,0($6)
addi $2,$2,4
blt $2,$3,loop
add $6,$5,$2
[Figure: pipeline timing diagram over clock cycles 1 to 15. Each instruction passes through F D E M W; one row reads FD EMW, with a stall between D and E.]
8 instructions, 15 - 4 = 11 cycles, CPI = 11/8
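The cycle count in this example can be checked with a short calculation. The sketch below assumes that the 3 extra cycles beyond the ideal count split into 1 load-use stall (the `addi` that follows the `lw`) and 2 bubbles after the taken `blt`; that split is a reading of the example, not something the slide states. The function names are hypothetical.

```python
from fractions import Fraction

PIPELINE_DEPTH = 5  # F, D, E, M, W


def total_cycles(n_instructions, stall_cycles):
    """Cycles until the last instruction leaves write back."""
    fill_cycles = PIPELINE_DEPTH - 1
    return n_instructions + fill_cycles + stall_cycles


def cpi_excluding_fill(n_instructions, stall_cycles):
    """CPI with the pipeline fill cycles subtracted, as done on the slide."""
    fill_cycles = PIPELINE_DEPTH - 1
    cycles = total_cycles(n_instructions, stall_cycles) - fill_cycles
    return Fraction(cycles, n_instructions)


# Assumed split of the extra cycles: 1 load-use stall + 2 taken-branch bubbles.
stalls = 1 + 2
print(total_cycles(8, stalls))        # 15
print(cpi_excluding_fill(8, stalls))  # 11/8
```

Without any stalls the same 8 instructions would take 12 cycles, giving a CPI of exactly 1 once the fill cycles are excluded.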
CSCE 212 18 Moving up Branch Resolution
CSCE 212 19 Moving up Branch Resolution
CSCE 212 20 Scheduling the Branch Delay Slot
CSCE 212 21 Dynamic Branch Prediction
Assume taken/not-taken (static)
– Loops have branches that are usually taken
When wrong, we flush pipeline stages
Deeper pipelines have higher branch penalties (misprediction penalty)
Solution:
– Look up address of branch to check if branch was previously taken
– One-bit schemes
– Two-bit schemes (must be wrong twice to change prediction)
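To make the difference between the one-bit and two-bit schemes concrete, the sketch below counts mispredictions for a single loop branch that is taken 9 times and then falls through, repeated for 10 runs of the loop. The two-bit predictor is modeled as a saturating counter, which is one common realization and an assumption here; the class names, the initial states, and the branch pattern are also illustrative. A real predictor would keep one such entry per branch address in a table.

```python
class OneBitPredictor:
    """Predicts whatever the branch did last time."""

    def __init__(self):
        self.last_taken = False  # assumed initial state: not taken

    def predict(self):
        return self.last_taken

    def update(self, taken):
        self.last_taken = taken


class TwoBitPredictor:
    """Saturating counter: states 0 and 1 predict not taken, 2 and 3 predict taken."""

    def __init__(self):
        self.state = 0  # assumed initial state: strongly not taken

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(predictor, outcomes):
    """Run the predictor over a sequence of actual branch outcomes."""
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


# Loop branch: taken 9 times, then not taken on exit; the loop runs 10 times.
pattern = ([True] * 9 + [False]) * 10
print(count_mispredictions(OneBitPredictor(), pattern))  # 20
print(count_mispredictions(TwoBitPredictor(), pattern))  # 12
```

The one-bit scheme misses twice per run of the loop (on the exit and again on re-entry), while the two-bit counter, once warmed up, misses only on the exit because a single wrong guess from the strongly-taken state does not flip its prediction.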