Pipelined Datapath and Control (Lecture #13) ECE 445 – Computer Organization The slides included herein were taken from the materials accompanying Computer.

Pipelined Datapath and Control (Lecture #13) ECE 445 – Computer Organization The slides included herein were taken from the materials accompanying Computer Organization and Design, 4 th Edition, by Patterson and Hennessey, and were used with permission from Morgan Kaufmann Publishers.

Fall 2010ECE 445 - Computer Organization2 Material to be covered... Chapter 4: Sections 5 – 9, 13 – 14

Fall 2010ECE 445 - Computer Organization3 Performance of the Single-Cycle MIPS

Fall 2010ECE 445 - Computer Organization4

Fall 2010ECE 445 - Computer Organization5 Example: MIPS Clock Rate Determine the clock rate for the MIPS architecture, assuming the following:  The MIPS is a Single Cycle Machine 1 clock cycle per instruction CPI = 1  Access time for memory units = 200 ps  Operation time for ALU and adders = 100 ps  Access time for register file = 50 ps

Fall 2010ECE 445 - Computer Organization6 Example: MIPS Clock Rate Instruction ClassFunctional Units used by the Instruction Class ALU InstructionInst. FetchRegisterALURegister Load WordInst. FetchRegisterALUMemoryRegister Store WordInst. FetchRegisterALUMemory BranchInst. FetchRegisterALU JumpInst. Fetch

Fall 2010ECE 445 - Computer Organization7 Example: MIPS Clock Rate Instruction ClassInstr Memory Register read ALU operation Data Memory Register write Total ALU Instruction20050100050400 ps Load Word2005010020050600 ps Store Word200501002000550 ps Branch2005010000350 ps Jump2000000200 ps

Fall 2010ECE 445 - Computer Organization8 Example: MIPS Clock Rate The clock cycle time for a machine with a single clock cycle per instruction will be determined by the longest instruction.  In this example, the load word instruction requires 600 ps. The clock rate is then Clock rate = 1 / Clock Cycle Time Clock rate = 1 / 600 ps = 1.67 GHz

Fall 2010ECE 445 - Computer Organization9 Performance Issues Longest delay determines clock period  Critical path: load word (lw) instruction  Instruction memory  register file  ALU  data memory  register file Not feasible to vary clock period for different instructions Violates design principle  Making the common case fast Improve performance by pipelining

Fall 2010ECE 445 - Computer Organization10 How does pipelining work?

Fall 2010ECE 445 - Computer Organization11 Pipelining Analogy Pipelined laundry: overlapping execution  Parallelism improves performance §4.5 An Overview of Pipelining Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup = 2n/0.5n + 1.5 ≈ 4 = number of stages

Fall 2010ECE 445 - Computer Organization12 Objective: Keep all stages of the pipeline busy at all times.

Fall 2010ECE 445 - Computer Organization13 Pipelining: Improving Performance LatencyMax. Throughput Non-Pipelined2 hours0.5 Pipelined2 hours2 Latency = time from start of one load to the end of same load. Maximum Throughput = # of loads completed per hour. Assuming all stages of pipeline are busy at all times. Length of time for each load does not change.

Fall 2010ECE 445 - Computer Organization14 Pipelining: Improving Performance Pipelining improves performance by increasing instruction throughput, rather than decreasing execution time of an individual instruction.

Fall 2010ECE 445 - Computer Organization15 The MIPS Pipeline

Fall 2010ECE 445 - Computer Organization16 MIPS Pipeline Five stages, one step per stage – IF: Instruction fetch from memory – ID: Instruction decode & register read – EX: Execute operation or calculate address – MEM: Access memory operand – WB: Write result back to register

Fall 2010ECE 445 - Computer Organization17 MIPS Pipeline

Fall 2010ECE 445 - Computer Organization18 Pipeline Performance Assume time for stages is  100ps for register read or write  200ps for other stages Compare pipelined datapath with single-cycle datapath InstrInstr fetchRegister read ALU opMemory access Register write Total time lw200ps100 ps200ps 100 ps800ps sw200ps100 ps200ps 700ps R-format200ps100 ps200ps100 ps600ps beq200ps100 ps200ps500ps

Fall 2010ECE 445 - Computer Organization19 Pipeline Performance Single-cycle (T c = 800ps) Pipelined (T c = 200ps) Why is the clock period 800ps? Why is the clock period 200ps?

Fall 2010ECE 445 - Computer Organization20 Pipeline Speedup If all stages are balanced  i.e., all take the same time  Time between instructions pipelined = Time between instructions nonpipelined Number of stages If not balanced, speedup is less Speedup due to increased throughput  Latency (time for each instruction) does not decrease

Fall 2010ECE 445 - Computer Organization21 Pipelining and ISA Design MIPS ISA designed for pipelining  All instructions are 32-bits Easier to fetch and decode in one cycle c.f. x86: 1- to 17-byte instructions  Few and regular instruction formats Can decode and read registers in one step  Load/store addressing Can calculate address in 3 rd stage, access memory in 4 th stage  Alignment of memory operands i.e. on word boundaries Memory access takes only one cycle

Fall 2010ECE 445 - Computer Organization22 Pipeline Summary Pipelining improves performance by increasing instruction throughput  Executes multiple instructions in parallel  Each instruction has the same latency Subject to hazards  Structure, data, control Instruction set design affects complexity of pipeline implementation The BIG Picture hazards will be discussed in upcoming lectures

Fall 2010ECE 445 - Computer Organization23 MIPS Pipelined Datapath §4.6 Pipelined Datapath and Control

Fall 2010ECE 445 - Computer Organization24 Pipeline registers Need registers between stages  To hold information produced in previous cycle Why?

Fall 2010ECE 445 - Computer Organization25 Pipeline Operation Cycle-by-cycle flow of instructions through the pipelined datapath  “Single-clock-cycle” pipeline diagram Shows pipeline usage in a single cycle Highlight resources used  “Multi-clock-cycle” diagram Graph of operation over time We’ll look at “single-clock-cycle” diagrams for load word and store word.

Fall 2010ECE 445 - Computer Organization26 IF for Load, Store, …

Fall 2010ECE 445 - Computer Organization27 ID for Load, Store, …

Fall 2010ECE 445 - Computer Organization28 EX for Load

Fall 2010ECE 445 - Computer Organization29 MEM for Load

Fall 2010ECE 445 - Computer Organization30 WB for Load Wrong register number Why?

Fall 2010ECE 445 - Computer Organization31 Corrected Datapath for Load

Fall 2010ECE 445 - Computer Organization32 EX for Store

Fall 2010ECE 445 - Computer Organization33 MEM for Store

Fall 2010ECE 445 - Computer Organization34 WB for Store

Fall 2010ECE 445 - Computer Organization35 Multi-Cycle Pipeline Diagram Form showing resource usage

Fall 2010ECE 445 - Computer Organization36 Multi-Cycle Pipeline Diagram Traditional form

Fall 2010ECE 445 - Computer Organization37 Single-Cycle Pipeline Diagram State of pipeline in a given cycle

Fall 2010ECE 445 - Computer Organization38 Questions?

Pipelined Datapath and Control (Lecture #13) ECE 445 – Computer Organization The slides included herein were taken from the materials accompanying Computer.

Similar presentations

Presentation on theme: "Pipelined Datapath and Control (Lecture #13) ECE 445 – Computer Organization The slides included herein were taken from the materials accompanying Computer."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Pipelined Datapath and Control (Lecture #13) ECE 445 – Computer Organization The slides included herein were taken from the materials accompanying Computer.

Similar presentations

Presentation on theme: "Pipelined Datapath and Control (Lecture #13) ECE 445 – Computer Organization The slides included herein were taken from the materials accompanying Computer."— Presentation transcript:

Similar presentations

About project

Feedback