Pipelining
Processor Data Path A single-cycle processor makes poor use of its units:
Processor Data Path ADD r1, r2, r3 running
Assembly Lines Single cycle laundry:
Assembly Lines Assembly line laundry:
Segmented Data Path IF: Instruction Fetch 200 ps, ID: Instruction Decode 100 ps, EX: Execute 200 ps, MEM: Memory Access 200 ps, WB: Write Back 100 ps
Segmented Data Path Registers to hold values between stages
Pipelined Each stage can work on a different instruction:
Pipeline vs Not: Pipeline: 4 ins / 8 cycles
No Pipeline: 2 ins / 10 cycles
Throughput N stage pipeline: n - 1 cycles to "prime it"
Then one instruction per cycle
Throughput N stage pipeline:
Time for i instructions in an n-stage pipeline: i + (n - 1) cycles. Time for i instructions without pipelining: n x i cycles.
Max Speedup: (n x i) / (i + (n - 1)) = n / (1 + (n - 1)/i) → n / 1 = n as i → ∞
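A minimal sketch in Python of this speedup arithmetic, assuming perfectly balanced stages and no stalls; the 5-stage pipeline and the instruction counts below are just example numbers:

def pipeline_speedup(n_stages, n_instructions):
    """Speedup of an n-stage pipeline over running each instruction end to end,
    assuming perfectly balanced stages and no stalls."""
    pipelined = n_instructions + (n_stages - 1)   # fill the pipeline, then 1 per cycle
    unpipelined = n_stages * n_instructions       # n cycles for every instruction
    return unpipelined / pipelined

# The speedup approaches the stage count n as the instruction count grows.
for i in (4, 100, 1_000_000):
    print(i, round(pipeline_speedup(5, i), 3))    # 2.5, 4.808, ~5.0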
Pipelining Limits In theory: n times speedup for an n-stage pipeline. But only if all stages are balanced, and only if the pipeline can be kept full.
Weak Link & Latency Total data path = 800 ps: IF: Instruction Fetch 200 ps, ID: Instruction Decode 100 ps, EX: Execute 200 ps, MEM: Memory Access 200 ps, WB: Write Back 100 ps
Weak Link & Latency Pipelined: can't run faster than the slowest step.
Weak Link & Latency Pipelined: can't run faster than the slowest step, so every stage effectively takes 200 ps: 5 x 200 ps = 1000 ps, plus the delay of the registers between stages.
Pipeline vs Not Clock time: 800 ps with no pipeline, 200 ps with the pipeline.
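A quick sketch of where these two clock times come from, using the stage latencies from the earlier slides (Python; any extra pipeline-register delay is ignored here):

# Stage latencies in picoseconds, taken from the slides.
stage_ps = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

single_cycle_clock = sum(stage_ps.values())   # 800 ps: the whole data path each cycle
pipelined_clock = max(stage_ps.values())      # 200 ps: set by the slowest stage

print(single_cycle_clock, pipelined_clock)    # 800 200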
Weak Link & Latency First Instruction
No pipeline: 800 ps / 1 instruction. Pipeline: 1000 ps / 1 instruction. "Speedup" on the first instruction: 0.8x (25% slower). Increased latency.
Weak Link & Latency Full Pipeline
No pipeline: 800 ps / 1 instruction. Pipeline: 1000 ps / 5 instructions = 200 ps per instruction. Speedup with a full pipeline = 800 / 200 = 4x. Increased throughput.
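The latency/throughput trade-off in the two slides above, as a small Python sketch (same numbers as the slides; the steady-state figure assumes the pipeline stays full):

n_stages = 5
single_cycle_ps = 800    # one instruction, no pipeline
pipe_clock_ps = 200      # one stage, pipelined

first_instruction_ps = n_stages * pipe_clock_ps   # 1000 ps to finish instruction 1
steady_state_ps = pipe_clock_ps                   # 200 ps per instruction once full

print(single_cycle_ps / first_instruction_ps)     # 0.8 -> 0.8x "speedup", i.e. 25% slower
print(single_cycle_ps / steady_state_ps)          # 4.0 -> 4x throughput speedup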
Designed for Pipelining
Consistent instruction length. Simple decode logic. No feeding data from memory directly to the ALU.
Hazards Hazard: a situation preventing the next instruction from continuing in the pipeline. Structural: resource (shared hardware) conflict. Data: needed data not ready. Control: correct action depends on an earlier instruction.
Structural Hazards What if there is only one memory? IF and MEM access the same unit.
Structural Hazards Conflict between MEM and IF
Dealing with Conflict Bubble: an unused pipeline stage. (Pipeline diagram labels: MOV, Bubble, LDR, SUB, ADD.)
Dealing with Conflict Bubbles to handle shared memory
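A toy cycle-by-cycle sketch of such a bubble, in Python. The 5-stage model, the single shared memory, and the instruction mix (LDR, SUB, ADD, MOV) are assumptions for illustration: MOV's fetch would need the memory in the same cycle as LDR's MEM access, so it slips one cycle.

# Toy 5-stage pipeline (IF ID EX MEM WB) with ONE memory shared by IF and MEM.
# Only LDR actually uses the memory in its MEM stage in this example.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def schedule(program):
    """Pick each instruction's IF cycle, stalling (a bubble) whenever an older
    instruction's MEM access would use the single memory in that same cycle."""
    mem_busy = set()      # cycles in which the memory serves a data access
    if_cycle = -1
    for name, uses_mem in program:
        if_cycle += 1
        while if_cycle in mem_busy:     # memory taken by an older MEM stage: bubble
            if_cycle += 1
        if uses_mem:
            mem_busy.add(if_cycle + 3)  # MEM is 3 cycles after IF in this model
        timeline = ["   "] * if_cycle + [f"{s:<3}" for s in STAGES]
        print(f"{name:>4}: " + " ".join(timeline))

schedule([("LDR", True), ("SUB", False), ("ADD", False), ("MOV", False)])
# MOV's fetch collides with LDR's MEM access and is delayed by one bubble.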
Avoiding Structural Hazards
Separate instruction/data caches. Can't send memory data directly to the ALU.
Data Hazards Sequence of instructions to be executed:
Data Hazards RAW : Read After Write
A later instruction depends on the result of an earlier one: ADD writes r1 at time 5, but SUB wants r1 at time 3.
Dealing with Data Hazards
Option 1: NOP = no-op = bubble. Assuming the new value of r1 can be read in the same cycle it is being written: 2 cycles of bubble (otherwise 3).
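A small sketch of that bubble count in Python, assuming the classic 5-stage pipeline where the producer writes the register in WB and the consumer reads it in ID; the split-cycle register file flag models the slide's "can read the new value as it is being written" assumption.

def raw_bubbles(distance, split_cycle_regfile=True):
    """Bubbles needed so a consumer's register read (ID) is no earlier than the
    producer's write (WB). distance = 1 means the consumer is the very next instruction."""
    producer_wb = 4              # WB happens 4 cycles after the producer's IF
    consumer_id = distance + 1   # ID happens 1 cycle after the consumer's IF
    if split_cycle_regfile:
        # Write in the first half of the cycle, read in the second half:
        # reading in the WB cycle itself is allowed.
        return max(producer_wb - consumer_id, 0)
    return max(producer_wb - consumer_id + 1, 0)   # read must come strictly after WB

print(raw_bubbles(1))                              # 2 bubbles (slide's assumption)
print(raw_bubbles(1, split_cycle_regfile=False))   # 3 bubbles otherwise
print(raw_bubbles(4))                              # 0: enough independent work in between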
Dealing with Data Hazards
Option 2: a clever compiler/programmer reorders instructions: 1 bubble eliminated by moving the LDR before the SUB.
Reorder = New Problems While reordering, need to maintain critical ordering:
RAW: Read after Write, e.g. ADD r1, r3, r4 then ADD r2, r1, r0
WAR: Write after Read, e.g. ADD r2, r1, r0 then ADD r1, r3, r4
WAW: Write after Write, e.g. ADD r1, r4, r0 then ADD r1, r3, r4
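A sketch of how a reordering compiler checks these constraints, in Python: each instruction is reduced to the sets of registers it writes and reads (the tuples below encode the slide's ADD examples).

def dependences(earlier, later):
    """Ordering constraints between two instructions, each given as
    (registers_written, registers_read). A reorder must preserve all three."""
    w_early, r_early = earlier
    w_late, r_late = later
    deps = []
    if w_early & r_late:
        deps.append("RAW")   # later reads what earlier writes (true dependence)
    if r_early & w_late:
        deps.append("WAR")   # later overwrites a register earlier still reads
    if w_early & w_late:
        deps.append("WAW")   # both write the same register; order decides the final value
    return deps

# ADD r1, r3, r4  then  ADD r2, r1, r0   -> ['RAW']
print(dependences(({"r1"}, {"r3", "r4"}), ({"r2"}, {"r1", "r0"})))
# ADD r2, r1, r0  then  ADD r1, r3, r4   -> ['WAR']
print(dependences(({"r2"}, {"r1", "r0"}), ({"r1"}, {"r3", "r4"})))
# ADD r1, r4, r0  then  ADD r1, r3, r4   -> ['WAW']
print(dependences(({"r1"}, {"r4", "r0"}), ({"r1"}, {"r3", "r4"})))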
Dealing with Data Hazards
Option 3: Forwarding. A shortcut to send results back to earlier pipeline stages.
Dealing with Data Hazards
r1’s value forwarded to ALU
Dealing with Data Hazards
Forwarding may not eliminate all bubbles
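A sketch of which bubbles survive forwarding, in Python, assuming the usual 5-stage model: an ALU result can be forwarded from the end of EX, but a load's value only exists at the end of MEM, so a load followed immediately by a use of its result still costs one bubble.

def bubbles_with_forwarding(producer_is_load, distance):
    """Bubbles still needed with full forwarding into the ALU.
    distance = 1 means the consumer is the very next instruction."""
    # Cycle (relative to the producer's IF) at whose END the value exists:
    value_ready = 3 if producer_is_load else 2   # end of MEM for loads, end of EX otherwise
    # The consumer's EX starts at cycle distance + 2, so without stalls the value
    # must already exist by the end of cycle distance + 1.
    return max(value_ready - (distance + 1), 0)

print(bubbles_with_forwarding(False, 1))   # ADD then dependent SUB: 0 bubbles
print(bubbles_with_forwarding(True, 1))    # LDR then dependent SUB: 1 bubble remains
print(bubbles_with_forwarding(True, 2))    # LDR, an unrelated op, then the use: 0 bubbles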
Dealing with Data Hazards
Requires complex hardware. Potentially slows down the pipeline.
Pipeline History Pipelines: