Pipelining.

Presentation on theme: "Pipelining."— Presentation transcript:

1 Pipelining

2 Processor Data Path Single cycle processor makes poor use of units:

3 Processor Data Path ADD r1, r2, r3 running

4 Processor Data Path ADD r1, r2, r3 running

5 Processor Data Path ADD r1, r2, r3 running

6 Processor Data Path ADD r1, r2, r3 running

7 Processor Data Path ADD r1, r2, r3 running

8 Assembly Lines Single cycle laundry:

9 Assembly Lines Assembly line laundry:

10 Segmented Data Path
IF : Instruction Fetch 200ps
ID : Instruction Decode 100ps
EX : Execute 200ps
MEM : Memory Access 200ps
WB : Write Back 100ps

11 Segmented Data Path Registers to hold values between stages

12 Pipelined Each stage can work on different instruction:

13 Pipeline vs Not: Pipeline: 4 ins / 8 cycles
No Pipeline: 2 ins / 10 cycles
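The cycle counts on this slide can be reproduced with a small scheduling sketch. It assumes the five stages named earlier (IF, ID, EX, MEM, WB) and counts one cycle per stage in both cases, as the slide does; the function names are hypothetical and chosen only for illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def schedule(num_instructions, pipelined=True, stages=STAGES):
    """Return one row per instruction; each row lists the stage held in each cycle."""
    # Pipelined: a new instruction starts every cycle.
    # Not pipelined: the next instruction waits for all stages to finish.
    gap = 1 if pipelined else len(stages)
    total_cycles = (num_instructions - 1) * gap + len(stages)
    rows = []
    for i in range(num_instructions):
        row = [None] * total_cycles
        for s, name in enumerate(stages):
            row[i * gap + s] = name
        rows.append(row)
    return rows


def show(rows):
    """Print the schedule as a simple text diagram, '.' marking idle cycles."""
    for row in rows:
        print(" ".join(f"{(cell or '.'):>3}" for cell in row))


if __name__ == "__main__":
    show(schedule(4, pipelined=True))    # 4 instructions in 8 cycles
    print()
    show(schedule(2, pipelined=False))   # 2 instructions in 10 cycles
```

Each row is one instruction and each column one cycle, so the row length is the total cycle count.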

14 Throughput n-stage pipeline: n - 1 cycles to "prime it"
Then one instruction per cycle

15 Throughput n-stage pipeline:
Time for i instructions in an n-stage pipeline: $i + (n - 1)$ cycles
Time for i instructions without pipelining: $n \cdot i$ cycles

16 Throughput n-stage pipeline:
Time for i instructions in an n-stage pipeline: $i + (n - 1)$ cycles
Time for i instructions without pipelining: $n \cdot i$ cycles
Max speedup: $\frac{n \cdot i}{i + (n - 1)} = \frac{n}{1 + \frac{n - 1}{i}} \to \frac{n}{1} = n$ as $i \to \infty$
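A quick numerical check of the limit, as a minimal sketch: the function names are illustrative, and the model assumes perfectly balanced stages and no stalls, exactly as the formulas do.

```python
def pipelined_cycles(i, n):
    """Cycles for i instructions in an n-stage pipeline: n - 1 to prime, then one per cycle."""
    return i + (n - 1)


def unpipelined_cycles(i, n):
    """Cycles for i instructions when each one runs all n stages before the next starts."""
    return n * i


def speedup(i, n):
    """Ratio of unpipelined to pipelined cycle counts."""
    return unpipelined_cycles(i, n) / pipelined_cycles(i, n)


if __name__ == "__main__":
    for i in (1, 5, 100, 10_000, 1_000_000):
        print(f"i = {i:>9}: speedup = {speedup(i, 5):.4f}")
```

For a single instruction the speedup is exactly 1, and it approaches n only as the instruction count grows large relative to the number of stages.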

17 Pipelining Limits In theory: n times speedup for an n-stage pipeline. But:
Only if all stages are balanced
Only if the pipeline can be kept full

18 Weak Link & Latency Total data path = 800ps
IF : Instruction Fetch 200ps
ID : Instruction Decode 100ps
EX : Execute 200ps
MEM : Memory Access 200ps
WB : Write Back 100ps

19 Weak Link & Latency Pipelined : can't run faster than slowest step
IF : Instruction Fetch 200ps
ID : Instruction Decode 100ps
EX : Execute 200ps
MEM : Memory Access 200ps
WB : Write Back 100ps

20 Weak Link & Latency Pipelined : can't run faster than slowest step
5 x 200ps = 1000ps, plus the delay of the memory (registers) between stages
IF : Instruction Fetch 200ps
ID : Instruction Decode 200ps
EX : Execute 200ps
MEM : Memory Access 200ps
WB : Write Back 200ps

21 Pipeline vs Not Clock time: 800ps no pipeline, 200ps pipeline

22 Weak Link & Latency First Instruction
No-pipeline: 800ps / 1 instruction
Pipeline: 1000ps / 1 instruction
"Speedup" on first instruction : 800ps / 1000ps = 0.8x (takes 25% longer)
Increased Latency

23 Weak Link & Latency Full Pipeline
No-pipeline: 800ps / 1 instruction
Pipeline: 1000ps / 5 instructions = 200ps / instruction
Speedup with full pipeline = 800ps / 200ps = 4x
Increased Throughput
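The latency and throughput numbers above follow directly from the stage times. The sketch below recomputes them; it uses the stage delays from the slides and, as a simplifying assumption, ignores the extra delay of the registers between stages. Function names are hypothetical.

```python
# Stage delays in picoseconds, taken from the slides.
STAGE_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}


def unpipelined_latency(stage_ps):
    """One instruction passes through every stage back to back."""
    return sum(stage_ps.values())


def pipelined_clock(stage_ps):
    """The clock cannot be faster than the slowest stage."""
    return max(stage_ps.values())


def pipelined_latency(stage_ps):
    """Every stage is stretched to the clock period, so latency grows."""
    return len(stage_ps) * pipelined_clock(stage_ps)


def first_instruction_speedup(stage_ps):
    return unpipelined_latency(stage_ps) / pipelined_latency(stage_ps)


def full_pipeline_speedup(stage_ps):
    # With the pipeline full, one instruction completes per clock period.
    return unpipelined_latency(stage_ps) / pipelined_clock(stage_ps)


if __name__ == "__main__":
    print(unpipelined_latency(STAGE_PS), "ps without pipelining")
    print(pipelined_latency(STAGE_PS), "ps latency with pipelining")
    print(first_instruction_speedup(STAGE_PS), "x on the first instruction")
    print(full_pipeline_speedup(STAGE_PS), "x with a full pipeline")
```

The speedup is 4x rather than 5x because the ID and WB stages need only 100ps but are clocked at 200ps, so the stages are not balanced.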

24 Designed for Pipelining
Consistent instruction length Simple decode logic No feeding data from memory to ALU

25 Hazards Hazard : Situation preventing next instruction from continuing in pipeline
Structural : Resource (shared hardware) conflict
Data : Needed data not ready
Control : Correct action depends on earlier instruction

26 Structural Hazards What if one memory? IF and MEM access same unit Mem

27 Structural Hazards Conflict between MEM and IF

28 Dealing with Conflict Bubble : Unused pipeline stage
MOV Bubble LDR SUB ADD

29 Dealing with Conflict Bubbles to handle shared memory
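To see where such bubbles come from, here is a rough sketch of a five-stage pipeline with a single memory shared by IF and MEM. The assumptions are loud ones: only LDR and STR use the MEM stage, the MEM access always wins the conflict, and the instruction sequence in the example is hypothetical rather than the one pictured on the slides.

```python
MEM_OPS = {"LDR", "STR"}   # assumed: the only operations that access memory in MEM
MEM_STAGE_OFFSET = 3       # IF = 0, ID = 1, EX = 2, MEM = 3, WB = 4


def fetch_cycles(ops):
    """Return the cycle in which each instruction is fetched with one shared memory."""
    busy = set()       # cycles in which the memory is taken by a MEM-stage access
    cycles = []
    next_free = 0
    for op in ops:
        cycle = next_free
        while cycle in busy:          # memory in use: the fetch waits (a bubble)
            cycle += 1
        cycles.append(cycle)
        if op in MEM_OPS:
            busy.add(cycle + MEM_STAGE_OFFSET)
        next_free = cycle + 1
    return cycles


def bubbles(ops):
    """Number of fetch slots lost to the shared-memory conflict."""
    if not ops:
        return 0
    return fetch_cycles(ops)[-1] - (len(ops) - 1)


if __name__ == "__main__":
    program = ["LDR", "ADD", "SUB", "MOV"]   # hypothetical sequence
    print(fetch_cycles(program), "bubbles:", bubbles(program))
```

In this model the LDR reaches MEM in the same cycle the fourth instruction would be fetched, so that fetch slips by one cycle.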

30 Avoiding Structural Hazards
Separate Inst/Data cache Can’t send memory data to ALU

31 Data Hazards Sequence of instructions to be executed:

32 Data Hazards RAW : Read After Write
Later instruction depends on result from earlier
ADD writes r1 at time 5
SUB wants r1 at time 3

33 Dealing with Data Hazards
Option 1 : NOP = No op = Bubble
Assuming the new value of r1 can be read in the same cycle it is written : 2 cycles of bubble (otherwise 3)
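The bubble counts can be derived mechanically. The sketch below inserts NOPs into a short program under the assumptions that registers are read in ID (one cycle after fetch), written in WB (four cycles after fetch), and that there is no forwarding. The `Instr` type, the helper name, and the SUB operands are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    dest: Optional[str]
    srcs: Tuple[str, ...]


NOP = Instr("NOP", None, ())


def insert_nops(program, same_cycle_read=True):
    """Pad a program with NOPs so every RAW dependency is satisfied without forwarding."""
    # Producer WB happens 4 cycles after its fetch; consumer ID happens 1 cycle
    # after its fetch. The consumer must therefore be fetched at least 3 slots
    # later, or 4 if a value cannot be read in the cycle it is written.
    min_gap = 3 if same_cycle_read else 4
    out = []
    last_write = {}                    # register -> slot of its most recent writer
    for ins in program:
        slot = len(out)
        earliest = slot
        for reg in ins.srcs:
            if reg in last_write:
                earliest = max(earliest, last_write[reg] + min_gap)
        out.extend([NOP] * (earliest - slot))
        if ins.dest is not None:
            last_write[ins.dest] = len(out)
        out.append(ins)
    return out


if __name__ == "__main__":
    program = [
        Instr("ADD", "r1", ("r2", "r3")),
        Instr("SUB", "r4", ("r1", "r5")),   # hypothetical operands; reads r1
    ]
    for ins in insert_nops(program):
        print(ins.op, ins.dest or "", *ins.srcs)
```

With `same_cycle_read=True` the SUB is delayed by two NOPs; with `same_cycle_read=False` it is delayed by three.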

34 Dealing with Data Hazards
Option 2 : Clever compiler/programmer reorders instructions: 1 bubble eliminated by moving LDR before SUB

35 Reorder = New Problems While reordering, need to maintain critical ordering:
RAW : Read after Write
ADD r1, r3, r4
ADD r2, r1, r0
WAR : Write after Read
ADD r2, r1, r0
ADD r1, r3, r4
WAW : Write after Write
ADD r1, r4, r0
ADD r1, r3, r4
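A reordering pass has to check each pair of instructions for these three dependencies before swapping them. A minimal sketch, assuming three-operand instructions written as (dest, src1, src2) tuples; the function names are illustrative only:

```python
def dependencies(first, second):
    """Return the set of orderings that must be kept between two instructions.

    Each instruction is a (dest, src1, src2) tuple of register names, and
    `first` comes before `second` in program order.
    """
    dest1, *srcs1 = first
    dest2, *srcs2 = second
    found = set()
    if dest1 in srcs2:
        found.add("RAW")   # second reads what first writes
    if dest2 in srcs1:
        found.add("WAR")   # second overwrites what first still has to read
    if dest1 == dest2:
        found.add("WAW")   # both write the same register; the last write must win
    return found


def can_swap(first, second):
    """Two instructions may be reordered only if no dependency links them."""
    return not dependencies(first, second)


if __name__ == "__main__":
    # The three pairs from the slide.
    print(dependencies(("r1", "r3", "r4"), ("r2", "r1", "r0")))
    print(dependencies(("r2", "r1", "r0"), ("r1", "r3", "r4")))
    print(dependencies(("r1", "r4", "r0"), ("r1", "r3", "r4")))
```

Only RAW is a true data dependency; WAR and WAW arise from reusing register names, but a reordering that ignores them still changes the program's result.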

36 Dealing with Data Hazards
Option 3 : Forwarding Shortcut to send results back to earlier stages

37 Dealing with Data Hazards
r1’s value forwarded to ALU

38 Dealing with Data Hazards
Forwarding may not eliminate all bubbles
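One common case where a bubble survives is a load followed immediately by an instruction that uses the loaded value. The sketch below assumes the textbook five-stage arrangement, which the slides do not spell out: an ALU result is available at the end of EX, a loaded value at the end of MEM, and the consumer needs its operand at the start of its own EX. The names and the model are illustrative assumptions.

```python
# Index of the stage whose end makes the result available for forwarding.
# IF = 0, ID = 1, EX = 2, MEM = 3, WB = 4
READY_STAGE = {"alu": 2, "load": 3}   # assumed textbook timings


def stalls_with_forwarding(producer_kind, distance):
    """Bubbles needed when results are forwarded straight to the ALU input.

    `distance` is how many instructions after the producer the consumer
    appears (1 means the very next instruction).
    """
    # The consumer's EX happens in cycle distance + 2 (relative to the
    # producer's fetch) and must come after the producer's ready stage.
    return max(0, READY_STAGE[producer_kind] - 1 - distance)


def stalls_without_forwarding(distance, same_cycle_read=True):
    """Bubbles needed when the consumer must wait for WB and read in ID."""
    min_gap = 3 if same_cycle_read else 4
    return max(0, min_gap - distance)


if __name__ == "__main__":
    for kind in ("alu", "load"):
        for distance in (1, 2, 3):
            print(kind, distance,
                  stalls_without_forwarding(distance),
                  stalls_with_forwarding(kind, distance))
```

Under these assumptions forwarding removes every bubble after an ALU instruction, but a load followed directly by its consumer still costs one.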

39 Dealing with Data Hazards
Requires complex hardware Potentially slows down pipeline

40 Pipeline History Pipelines:

