CSE378 Pipelining1 Pipelining Basic concept of assembly line –Split a job A into n sequential subjobs (A 1,A 2,…,A n ) with each A i taking approximately.

Slides:



Advertisements
Similar presentations
CS252/Patterson Lec 1.1 1/17/01 Pipelining: Its Natural! Laundry Example Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer.
Advertisements

Instruction-Level Parallelism (ILP)
MIPS Pipelined Datapath
1 RISC Pipeline Han Wang CS3410, Spring 2010 Computer Science Cornell University See: P&H Chapter 4.6.
Kevin Walsh CS 3410, Spring 2010 Computer Science Cornell University RISC Pipeline See: P&H Chapter 4.6.
Instructor: Senior Lecturer SOE Dan Garcia CS 61C: Great Ideas in Computer Architecture Pipelining Hazards 1.
Review: MIPS Pipeline Data and Control Paths
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
CSE378 Multicycle impl,.1 Drawbacks of single cycle implementation All instructions take the same time although –some instructions are longer than others;
©UCB CS 162 Computer Architecture Lecture 3: Pipelining Contd. Instructor: L.N. Bhuyan
Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University.
DLX Instruction Format
Lecture 15: Pipelining and Hazards CS 2011 Fall 2014, Dr. Rozier.
Pipeline Hazard CT101 – Computing Systems. Content Introduction to pipeline hazard Structural Hazard Data Hazard Control Hazard.
Pipeline Data Hazards: Detection and Circumvention Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from slides kindly.
Pipelined Datapath and Control
CSE 340 Computer Architecture Summer 2014 Basic MIPS Pipelining Review.
CS.305 Computer Architecture Enhancing Performance with Pipelining Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from.
1 Designing a Pipelined Processor In this Chapter, we will study 1. Pipelined datapath 2. Pipelined control 3. Data Hazards 4. Forwarding 5. Branch Hazards.
CMPE 421 Parallel Computer Architecture Part 2: Hardware Solution: Forwarding.
CECS 440 Pipelining.1(c) 2014 – R. W. Allison [slides adapted from D. Patterson slides with additional credits to M.J. Irwin]
CSIE30300 Computer Architecture Unit 04: Basic MIPS Pipelining Hsin-Chou Chi [Adapted from material by and
Instructor: Senior Lecturer SOE Dan Garcia CS 61C: Great Ideas in Computer Architecture Pipelining Hazards 1.
LECTURE 7 Pipelining. DATAPATH AND CONTROL We started with the single-cycle implementation, in which a single instruction is executed over a single cycle.
CSIE30300 Computer Architecture Unit 05: Overcoming Data Hazards Hsin-Chou Chi [Adapted from material by and
Pipelining: Implementation CPSC 252 Computer Organization Ellen Walker, Hiram College.
CSE 340 Computer Architecture Spring 2016 Overcoming Data Hazards.
Lecture 18: Pipelining I.
Pipeline Timing Issues
Computer Organization
Stalling delays the entire pipeline
CMSC 611: Advanced Computer Architecture
Morgan Kaufmann Publishers The Processor
Single Clock Datapath With Control
ECS 154B Computer Architecture II Spring 2009
Forwarding Now, we’ll introduce some problems that data hazards can cause for our pipelined processor, and show how to handle them with forwarding.
Chapter 4 The Processor Part 3
Review: MIPS Pipeline Data and Control Paths
Morgan Kaufmann Publishers The Processor
Chapter 4 The Processor Part 2
Pipelining review.
Single-cycle datapath, slightly rearranged
Current Design.
A pipeline diagram Clock cycle lw $t0, 4($sp) IF ID
Pipelining in more detail
Drawbacks of single cycle implementation
Pipelining Basic concept of assembly line
Pipeline control unit (highly abstracted)
The Processor Lecture 3.6: Control Hazards
The Processor Lecture 3.4: Pipelining Datapath and Control
Control unit extension for data hazards
An Introduction to pipelining
The Processor Lecture 3.5: Data Hazards
Instruction Execution Cycle
Pipeline control unit (highly abstracted)
Pipeline Control unit (highly abstracted)
Pipelining Basic concept of assembly line
Control unit extension for data hazards
Control Unit for Multiple Cycle Implementation
Pipelining Basic concept of assembly line
Pipelining Appendix A and Chapter 3.
Drawbacks of single cycle implementation
Morgan Kaufmann Publishers The Processor
Drawbacks of single cycle implementation
Control unit extension for data hazards
Control Unit for Multiple Cycle Implementation
FloorPlan for Multicycle MIPS
MIPS Pipelined Datapath
Pipelined datapath and control
Presentation transcript:

CSE378 Pipelining1 Pipelining Basic concept of assembly line –Split a job A into n sequential subjobs (A 1,A 2,…,A n ) with each A i taking approximately the same time –Each subjob is processed by a different substation (or resource) or, equivalently, passes through a series of stages –Subjobs of different jobs overlap their execution, i.e., when subjob A 1 of job A is finished in stage 1, subjob A 2 of job A will start executing in stage 2 while subjob B 1 of job B will start executing in stage 1

CSE378 Pipelining2 Pipeline performance Latency of a single job can be longer –because each stage takes as long as the longest one, say t max –because all jobs have to go through all stages even if they don’t do anything in one of the stages Throughput is enhanced –Ideally, in steady state, one job completes every t max rather than after (t 1 + t t n ) In the ideal case (all t i are the same) throughput is n times better if there are n stages. But : –Execution time of a job could take less than n stages –We assume that the pipeline can be kept full all the time

CSE378 Pipelining3 Pipeline applied to instruction execution Multiple cycle implementation In pipeline mode Ifetch Dec ALU Mem Wb Instr. i Instr. i+1 Idle time Ifetch Dec ALU Mem Wb Instr. i Instr. i+1 Instr. i+2

CSE378 Pipelining4 Pipeline parallelism Note that at any given time after the pipeline is full, we have 5 instructions in progress; however each individual instruction has a longer latency Ifetch Dec ALU Mem Wb 5 instructions in progress t t+1 t +2 t+3 t+4 t+5 t +6 t+7 t+8 t+9 Instr. i Instr. i+1 Instr. i+2 Instr. i+3 Instr. i+4 Instr. i+5

CSE378 Pipelining5 Pipeline implementation requirements 5 stages are active on 5 different instructions Thus all the resources needed for single cycle implementation will be required (at least) All stages are independent and isolated from each other. This implies that all information gathered at stage i needed in stages i+1, i+2 etc…must be kept and passed from stage to stage. For that purpose we will use “pipeline registers” between each stage

CSE378 Pipelining6 Pipeline data path (highly abstracted) IFIDEXMemWB IF/IDID/EXEX/MemMem/WB

CSE378 Pipelining7 Examples of what to store in pipeline registers The following is not an exhaustive list (just a sample) The register number where the result will be stored –Known at stage 2; needed at stage 5 The register number of the data containing the contents of a “store” as well as the contents of that register –Known at stage 2; needed at stage 4 The immediate value –Known at stage 2; needed at stage 3 The updated PC (we’ll see why later) etc..

CSE378 Pipelining8 Pipeline data path (a little less abstracted) Let’s look back at Fig (Single cycle implementation) and see where the pipeline registers should be put and what additional resources, if any, are needed.

CSE378 Pipelining9 Hazards Hazards are what prevents the pipeline to be ideal Three types of hazards: –Structural where two instructions at different stages want to use the same resource. Solved by using more resources (e.g., instruction and data memory; several ALU’s) –Data hazards when an instruction in the pipeline is dependent on another instruction still in the pipeline. Some of these hazards will be resolved by bypassing and some will result in the pipeline being stalled –Control hazards which happens on a successful branch (or procedure call/return). We’ll investigate several techniques to take care of these hazards.

CSE378 Pipelining10 Notation (cf. Figure 6.11) Pipeline registers keeping information between stages labeled by the names of the stages they connect (IF/ID, ID/EX, etc.) Information flows from left to right except for –writing the result register (potential for data hazard) –modifying the PC (potential for control hazard)

CSE378 Pipelining11 Tracing one instruction through all 5 stages (St 1) Stage 1: Instruction fetch and increment PC (same for all instructions) –Instruction fetched from instruction memory. The instruction will be needed in subsequent stages. Hence stored in IF/ID recall (i) the IR register in multiple cycle implementation, and (ii) that in single cycle implementation we had two “memories”, one for instruction and one for data –PC  PC + 4 The incremented PC will be needed to fetch the next instruction at the next cycle but it will also be needed if we have to do branch target computation. Hence incremented PC saved in IF/ID –Resources needed: Instruction memory and ALU –IF/ID contains instruction and incremented PC

CSE378 Pipelining12 Tracing one instruction through all 5 stages (St 2) Stage 2: Instruction decode and register read (same for all instructions) –We save in ID/EX everything that can be needed in next stages: The instruction and the incremented PC (from IF/ID) (e.g., function bits can be needed, the name of the register to be written etc.) The contents of registers rs and rt (recall registers A and B) The sign extended immediate field (for imm. Inst., load/store, branch) Control lines set-ups. We’ll deal with control later Note that we do not compute the branch target address here (why?). –Resources needed: register file, control unit –ID/EX contains PC, instruction, contents of rs and rt, extended imm. Field, control lines set-ups

CSE378 Pipelining13 Tracing one instruction through all 5 stages (St 3) Stage 3: depends on the type of instruction being executed. Let’s assume a Load. Hence this stage is “address computation” –ALU sources come from ID/EX (rs and immediate field) –ALU result stored in EX/Mem –Contents of rs, rt, and imm. field not needed any longer –Looks like PC not needed either but we’ll keep it because of possible exceptions (see later in the course) –Need to keep in EX/Mem part of the instruction (name of the rt register) and the indication we have a load

CSE378 Pipelining14 Stage 3 (ct’d) –Resource needed: ALU –But if we had a branch we need two ALU’s, one for the branch target computation and one for comparing registers. –EX/Mem needs to keep result of ALU (for load/store and arith. instructions) name of result register (for load and arith. instructions) contents of rs (for store) PC (in case of exceptions)

CSE378 Pipelining15 Tracing one instruction through all 5 stages (St 4) Stage 4: Access data memory. Still assuming we have a load –Access data memory with address kept in EX/Mem –Keep result of load in Mem/WB –Resource needed: Data memory –Mem/WB needs to keep Result of load if load instruction Result of arith. Instr. if arith instr (from EX/Mem) Name of result register No need to keep PC any longer (no exception occurs in last stage)

CSE378 Pipelining16 Tracing one instruction through all 5 stages (St 5) Stage 5: Write back register (for load and arith) –Contents of what to write and where to write it in Mem/WB –Nothing to be kept –Resource needed: register file –Ah! But registers were also needed in Stage 2! We allow writing registers in first part of cycle and reading registers in second part of cycle

CSE378 Pipelining17 Summary of requirements of ideal pipeline data path Stages 1 and 2: info to be kept is similar for all instructions The width of the pipeline registers must be such that it fits the “maximum” amount of info that will be needed Instructions (except branches, see soon) must pass through all stages even if nothing is done in that stage The “state of the machine” (ISA visible registers) is modified only in the last stage (except for PC and when branches occur)