Download presentation
Presentation is loading. Please wait.
1
Designing a Pipelined CPU
Read 4.7 for next class Designing a Pipelined CPU Peer Instruction Lecture Materials for Computer Architecture by Dr. Leo Porter is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
2
Review -- Single Cycle CPU
3
Review -- Multiple Cycle CPU
4
Review -- Instruction Latencies
Single-Cycle CPU Load Ifetch Reg/Dec Exec Mem Wr Multiple Cycle CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Load Ifetch Reg/Dec Exec Mem Wr Add Ifetch Reg/Dec Exec Wr
5
Instruction Latencies and Throughput
Single-Cycle CPU Ifetch Reg/Dec Exec Mem Wr Load Multiple Cycle CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Ifetch Reg/Dec Exec Mem Wr Load Pipelined CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8
6
Instruction Latencies and Throughput
Which of the statements below is true about a pipelined processor? Selection Statement A Instruction latency remains essentially unchanged from single-cycle (minus some overheads); Instruction throughput increases B Instruction latency remains essentially unchanged from multi-cycle (minus some overheads); Instruction throughput increases C Instruction latency improves by a factor of 5 over single-cycle (minus some overheads); Instruction throughput increases D Instruction latency improves by a factor of 5 over multi-cycle (minus some overheads); Instruction throughput increases E None of the above Tails - flipped Correct Answer - C
7
Instruction Latencies and Throughput
Single-Cycle CPU Ifetch Reg/Dec Exec Mem Wr Load Multiple Cycle CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Ifetch Reg/Dec Exec Mem Wr Load Pipelined CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Ifetch Reg/Dec Exec Mem Wr Load
8
Instruction Latencies and Throughput
Single-Cycle CPU Ifetch Reg/Dec Exec Mem Wr Load Multiple Cycle CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Ifetch Reg/Dec Exec Mem Wr Load Pipelined CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Ifetch Reg/Dec Exec Mem Wr Load Ifetch Reg/Dec Exec Mem Wr Load
9
Instruction Latencies and Throughput
Single-Cycle CPU Ifetch Reg/Dec Exec Mem Wr Load Multiple Cycle CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Ifetch Reg/Dec Exec Mem Wr Load Pipelined CPU Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Ifetch Reg/Dec Exec Mem Wr Load Ifetch Reg/Dec Exec Mem Wr Load Ifetch Reg/Dec Exec Mem Wr Load Ifetch Reg/Dec Exec Mem Wr Load
10
Pipelining Advantages
PI throughput Vs. latency Higher maximum throughput Higher utilization of CPU resources But, more complicated datapath, more complex control(?)
11
Pipelining Throughput
PI Matching Peek Throughput 1. Longest instruction 2. Average instruction 3. Cycle time Selection SC MC Pipeline A 1 2 3 B C D E None of the above
12
A Pipelined Datapath IF: Instruction fetch
ID: Instruction decode and register fetch EX: Execution and effective address calculation MEM: Memory access WB: Write back
13
Idea for a Pipelined Datapath
14
Execution in a Pipelined Datapath
CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 CC9 IF ID EX MEM WB IM Reg ALU DM lw IF ID EX MEM WB IM Reg ALU DM lw IM Reg ALU DM lw IM Reg ALU DM lw IM Reg ALU DM lw
15
Execution in a Pipelined Datapath
CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 CC9 Make sure to point Out Steady State CPI = 1 IF ID EX MEM WB IM Reg ALU DM lw IF ID EX MEM WB IM Reg ALU DM lw IM Reg ALU DM lw IM Reg ALU DM lw IM Reg ALU DM lw steady state
16
Reason (Choose BEST answer)
Pipeline Stages Should we force every instruction to go through all 5 stages? Can we break it up like we did for multi-cycle, with R-type taking 4 cycles instead of 5? Selection Yes/No Reason (Choose BEST answer) A Yes Decreasing R-type to 4 cycles improves instruction throughput B Decreasing R-type to 4 cycles improves instruction latency C No Decreasing R-type to 4 cycles causes hazards D Decreasing R-type to 4 cycles causes hazards and doesn’t impact throughput E Decreasing R-type to 4 cycles causes hazards and doesn’t impact latency Tails - flipped Correct Answer - C
17
Mixed Instructions in the Pipeline
CC1 CC2 CC3 CC4 CC5 CC6 IM Reg ALU DM lw add
18
Pipeline Principles All instructions that share a pipeline must have the same stages in the same order. therefore, add does nothing during Mem stage sw does nothing during WB stage All intermediate values must be latched each cycle. There is no functional block reuse IF ID EX MEM WB IM Reg ALU DM
19
Pipelined Datapath Instruction Fetch Instruction Decode/
Register Fetch Execute/ Address Calculation Memory Access Write Back registers!
20
The Pipeline in Execution
add $10, $1, $2 Instruction Decode/ Register Fetch Execute/ Address Calculation Memory Access Write Back Draw active datapath through example
21
The Pipeline in Execution
lw $12, 1000($4) add $10, $1, $2 Execute/ Address Calculation Memory Access Write Back
22
The Pipeline in Execution
sub $15, $4, $1 lw $12, 1000($4) add $10, $1, $2 Memory Access Write Back
23
The Pipeline in Execution
Instruction Fetch sub $15, $4, $1 lw $12, 1000($4) add $10, $1, $2 Write Back
24
The Pipeline in Execution
Instruction Fetch Instruction Decode/ Register Fetch sub $15, $4, $1 lw $12, 1000($4) add $10, $1, $2
25
The Pipeline in Execution
Instruction Fetch Instruction Decode/ Register Fetch sub $15, $4, $1 lw $12, 1000($4) add $10, $1, $2 Write Register is wrong Did someone spot the error??
26
The Pipeline in Execution
Instruction Fetch Instruction Decode/ Register Fetch Execute/ Address Calculation sub $15, $4, $1 lw $12, 1000($4)
27
The Pipeline, with controls But….
28
Pipeline Control X Y None of the above Selection SC MC Pipeline A B C
Control Logic X. Combinational Logic Y. FSM or Microprogram Selection SC MC Pipeline A X Y B C D E None of the above
29
Pipelined Control IF/ID ID/EX EX/MEM MEM/WB control instruction
30
Pipelined Control Signals
31
The Pipeline with Control Logic
32
Is it really this simple?
33
Neg edge! What just happened here which is problematic (BEST ANSWER)?
CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 IM Reg ALU DM sub $2, $1, $3 IM Reg ALU DM and $12, $2, $5 IM Reg ALU DM or $13, $6, $2 IM Reg ALU DM add $14, $2, $2 ALU IM Reg DM sw $15, 100($2) What just happened here which is problematic (BEST ANSWER)? The register file is trying to read and write the same register The ALU and data memory are both active in the same cycle A value is used before it is produced Both A and B Both A and C Neg edge!
34
Data Hazards When a result is needed in the pipeline before it is available, a “data hazard” occurs. R2 Available CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 IM Reg ALU DM sub $2, $1, $3 IM Reg ALU DM and $12, $2, $5 IM Reg ALU DM or $13, $6, $2 R2 Needed IM Reg ALU DM add $14, $2, $2 Use red and black lines! ALU IM Reg DM sw $15, 100($2)
35
Hazards continued What happens when... Draw dependencies
add $3, $10, $11 lw $8, 1000($3) sub $11, $8, $7 Draw dependencies
36
The Pipeline in Execution
lw $8, 1000($3) add $3, $10, $11 Execute/ Address Calculation Memory Access Write Back
37
The Pipeline in Execution
sub $11, $8, $7 lw $8, 1000($3) add $3, $10, $11 Memory Access Write Back
38
The Pipeline in Execution
add $10, $1, $2 sub $11, $8, $7 lw $8, 1000($3) add $3, $10, $11 Write Back
39
Pipelining Key Points ET = IC * CPI * CT
We achieve high throughput without reducing instruction latency. Pipelining exploits a special kind of parallelism (parallelism between functionality required in different cycles). Pipelining uses combinational logic to generate (and registers to propagate) control signals. Pipelining creates potential hazards.
40
Summary comparison (MIPS designs)
CPI CT Single Cycle Multi-cycle Pipeline
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.