Download presentation
Presentation is loading. Please wait.
Published byRosalind Hunter Modified over 9 years ago
1
Pipelined Datapath and Control (Lecture #13) ECE 445 – Computer Organization The slides included herein were taken from the materials accompanying Computer Organization and Design, 4 th Edition, by Patterson and Hennessey, and were used with permission from Morgan Kaufmann Publishers.
2
Fall 2010ECE 445 - Computer Organization2 Material to be covered... Chapter 4: Sections 5 – 9, 13 – 14
3
Fall 2010ECE 445 - Computer Organization3 Performance of the Single-Cycle MIPS
4
Fall 2010ECE 445 - Computer Organization4
5
Fall 2010ECE 445 - Computer Organization5 Example: MIPS Clock Rate Determine the clock rate for the MIPS architecture, assuming the following: The MIPS is a Single Cycle Machine 1 clock cycle per instruction CPI = 1 Access time for memory units = 200 ps Operation time for ALU and adders = 100 ps Access time for register file = 50 ps
6
Fall 2010ECE 445 - Computer Organization6 Example: MIPS Clock Rate Instruction ClassFunctional Units used by the Instruction Class ALU InstructionInst. FetchRegisterALURegister Load WordInst. FetchRegisterALUMemoryRegister Store WordInst. FetchRegisterALUMemory BranchInst. FetchRegisterALU JumpInst. Fetch
7
Fall 2010ECE 445 - Computer Organization7 Example: MIPS Clock Rate Instruction ClassInstr Memory Register read ALU operation Data Memory Register write Total ALU Instruction20050100050400 ps Load Word2005010020050600 ps Store Word200501002000550 ps Branch2005010000350 ps Jump2000000200 ps
8
Fall 2010ECE 445 - Computer Organization8 Example: MIPS Clock Rate The clock cycle time for a machine with a single clock cycle per instruction will be determined by the longest instruction. In this example, the load word instruction requires 600 ps. The clock rate is then Clock rate = 1 / Clock Cycle Time Clock rate = 1 / 600 ps = 1.67 GHz
9
Fall 2010ECE 445 - Computer Organization9 Performance Issues Longest delay determines clock period Critical path: load word (lw) instruction Instruction memory register file ALU data memory register file Not feasible to vary clock period for different instructions Violates design principle Making the common case fast Improve performance by pipelining
10
Fall 2010ECE 445 - Computer Organization10 How does pipelining work?
11
Fall 2010ECE 445 - Computer Organization11 Pipelining Analogy Pipelined laundry: overlapping execution Parallelism improves performance §4.5 An Overview of Pipelining Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup = 2n/0.5n + 1.5 ≈ 4 = number of stages
12
Fall 2010ECE 445 - Computer Organization12 Objective: Keep all stages of the pipeline busy at all times.
13
Fall 2010ECE 445 - Computer Organization13 Pipelining: Improving Performance LatencyMax. Throughput Non-Pipelined2 hours0.5 Pipelined2 hours2 Latency = time from start of one load to the end of same load. Maximum Throughput = # of loads completed per hour. Assuming all stages of pipeline are busy at all times. Length of time for each load does not change.
14
Fall 2010ECE 445 - Computer Organization14 Pipelining: Improving Performance Pipelining improves performance by increasing instruction throughput, rather than decreasing execution time of an individual instruction.
15
Fall 2010ECE 445 - Computer Organization15 The MIPS Pipeline
16
Fall 2010ECE 445 - Computer Organization16 MIPS Pipeline Five stages, one step per stage – IF: Instruction fetch from memory – ID: Instruction decode & register read – EX: Execute operation or calculate address – MEM: Access memory operand – WB: Write result back to register
17
Fall 2010ECE 445 - Computer Organization17 MIPS Pipeline
18
Fall 2010ECE 445 - Computer Organization18 Pipeline Performance Assume time for stages is 100ps for register read or write 200ps for other stages Compare pipelined datapath with single-cycle datapath InstrInstr fetchRegister read ALU opMemory access Register write Total time lw200ps100 ps200ps 100 ps800ps sw200ps100 ps200ps 700ps R-format200ps100 ps200ps100 ps600ps beq200ps100 ps200ps500ps
19
Fall 2010ECE 445 - Computer Organization19 Pipeline Performance Single-cycle (T c = 800ps) Pipelined (T c = 200ps) Why is the clock period 800ps? Why is the clock period 200ps?
20
Fall 2010ECE 445 - Computer Organization20 Pipeline Speedup If all stages are balanced i.e., all take the same time Time between instructions pipelined = Time between instructions nonpipelined Number of stages If not balanced, speedup is less Speedup due to increased throughput Latency (time for each instruction) does not decrease
21
Fall 2010ECE 445 - Computer Organization21 Pipelining and ISA Design MIPS ISA designed for pipelining All instructions are 32-bits Easier to fetch and decode in one cycle c.f. x86: 1- to 17-byte instructions Few and regular instruction formats Can decode and read registers in one step Load/store addressing Can calculate address in 3 rd stage, access memory in 4 th stage Alignment of memory operands i.e. on word boundaries Memory access takes only one cycle
22
Fall 2010ECE 445 - Computer Organization22 Pipeline Summary Pipelining improves performance by increasing instruction throughput Executes multiple instructions in parallel Each instruction has the same latency Subject to hazards Structure, data, control Instruction set design affects complexity of pipeline implementation The BIG Picture hazards will be discussed in upcoming lectures
23
Fall 2010ECE 445 - Computer Organization23 MIPS Pipelined Datapath §4.6 Pipelined Datapath and Control
24
Fall 2010ECE 445 - Computer Organization24 Pipeline registers Need registers between stages To hold information produced in previous cycle Why?
25
Fall 2010ECE 445 - Computer Organization25 Pipeline Operation Cycle-by-cycle flow of instructions through the pipelined datapath “Single-clock-cycle” pipeline diagram Shows pipeline usage in a single cycle Highlight resources used “Multi-clock-cycle” diagram Graph of operation over time We’ll look at “single-clock-cycle” diagrams for load word and store word.
26
Fall 2010ECE 445 - Computer Organization26 IF for Load, Store, …
27
Fall 2010ECE 445 - Computer Organization27 ID for Load, Store, …
28
Fall 2010ECE 445 - Computer Organization28 EX for Load
29
Fall 2010ECE 445 - Computer Organization29 MEM for Load
30
Fall 2010ECE 445 - Computer Organization30 WB for Load Wrong register number Why?
31
Fall 2010ECE 445 - Computer Organization31 Corrected Datapath for Load
32
Fall 2010ECE 445 - Computer Organization32 EX for Store
33
Fall 2010ECE 445 - Computer Organization33 MEM for Store
34
Fall 2010ECE 445 - Computer Organization34 WB for Store
35
Fall 2010ECE 445 - Computer Organization35 Multi-Cycle Pipeline Diagram Form showing resource usage
36
Fall 2010ECE 445 - Computer Organization36 Multi-Cycle Pipeline Diagram Traditional form
37
Fall 2010ECE 445 - Computer Organization37 Single-Cycle Pipeline Diagram State of pipeline in a given cycle
38
Fall 2010ECE 445 - Computer Organization38 Questions?
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.