ECS 154B Computer Architecture II Spring 2009

Slides:



Advertisements
Similar presentations
Pipeline Example: cycle 1 lw R10,9(R1) sub R11,R2, R3 and R12,R4, R5 or R13,R6, R7.
Advertisements

Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania ECE Computer Organization Pipelined Processor.
Review: MIPS Pipeline Data and Control Paths
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
The Processor: Datapath & Control
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania ECE Computer Organization Lecture 18 - Pipelined.
©UCB CS 162 Computer Architecture Lecture 3: Pipelining Contd. Instructor: L.N. Bhuyan
1 Stalling  The easiest solution is to stall the pipeline  We could delay the AND instruction by introducing a one-cycle delay into the pipeline, sometimes.
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
 The actual result $1 - $3 is computed in clock cycle 3, before it’s needed in cycles 4 and 5  We forward that value to later instructions, to prevent.
Chapter 4 Sections 4.1 – 4.4 Appendix D.1 and D.2 Dr. Iyad F. Jafar Basic MIPS Architecture: Single-Cycle Datapath and Control.
1 Stalls and flushes  So far, we have discussed data hazards that can occur in pipelined CPUs if some instructions depend upon others that are still executing.
Supplementary notes for pipelining LW ____,____ SUB ____,____,____ BEQ ____,____,____ ; assume that, condition for branch is not satisfied OR ____,____,____.
COSC 3430 L08 Basic MIPS Architecture.1 COSC 3430 Computer Architecture Lecture 08 Processors Single cycle Datapath PH 3: Sections
Pipeline Data Hazards: Detection and Circumvention Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from slides kindly.
Pipelined Datapath and Control
CPE432 Chapter 4B.1Dr. W. Abu-Sufah, UJ Chapter 4B: The Processor, Part B-2 Read Section 4.7 Adapted from Slides by Prof. Mary Jane Irwin, Penn State University.
Basic Pipelining & MIPS Pipelining Chapter 6 [Computer Organization and Design, © 2007 Patterson (UCB) & Hennessy (Stanford), & Slides Adapted from: Mary.
Computer Architecture and Design – ECEN 350 Part 6 [Some slides adapted from A. Sprintson, M. Irwin, D. Paterson and others]
CMPE 421 Parallel Computer Architecture Part 2: Hardware Solution: Forwarding.
CSE431 L07 Overcoming Data Hazards.1Irwin, PSU, 2005 CSE 431 Computer Architecture Fall 2005 Lecture 07: Overcoming Data Hazards Mary Jane Irwin (
ECE-C355 Computer Structures Winter 2008 The MIPS Datapath Slides have been adapted from Prof. Mary Jane Irwin ( )
CSIE30300 Computer Architecture Unit 05: Overcoming Data Hazards Hsin-Chou Chi [Adapted from material by and
PC Instruction Memory Address Instr. [31-0] 4 Fig 4.6 p 309 Instruction Fetch.
Datapath and Control AddressInstruction Memory Write Data Reg Addr Register File ALU Data Memory Address Write Data Read Data PC Read Data Read Data.
CPE432 Chapter 4B.1Dr. W. Abu-Sufah, UJ Chapter 4B: The Processor, Part B-1 Read Sections 4.7 Adapted from Slides by Prof. Mary Jane Irwin, Penn State.
Pipelining: Implementation CPSC 252 Computer Organization Ellen Walker, Hiram College.
CSE 340 Computer Architecture Spring 2016 Overcoming Data Hazards.
Computer Architecture Lecture 6.  Our implementation of the MIPS is simplified memory-reference instructions: lw, sw arithmetic-logical instructions:
Access the Instruction from Memory
Stalling delays the entire pipeline
Note how everything goes left to right, except …
CDA 3101 Spring 2016 Introduction to Computer Organization
Computer Architecture
Single Cycle Processor
ECS 154B Computer Architecture II Spring 2009
\course\cpeg323-08F\Topic6b-323
ECE232: Hardware Organization and Design
CS/COE0447 Computer Organization & Assembly Language
Forwarding Now, we’ll introduce some problems that data hazards can cause for our pipelined processor, and show how to handle them with forwarding.
Chapter 4 The Processor Part 3
Review: MIPS Pipeline Data and Control Paths
Morgan Kaufmann Publishers The Processor
Chapter 4 The Processor Part 2
SOLUTIONS CHAPTER 4.
CS/COE0447 Computer Organization & Assembly Language
Single-cycle datapath, slightly rearranged
A pipeline diagram Clock cycle lw $t0, 4($sp) IF ID
CSCI206 - Computer Organization & Programming
CS/COE0447 Computer Organization & Assembly Language
CS/COE0447 Computer Organization & Assembly Language
Systems Architecture II
\course\cpeg323-05F\Topic6b-323
The Processor Lecture 3.6: Control Hazards
The Processor Lecture 3.3: Single-cycle Implementation
The Processor Lecture 3.2: Building a Datapath with Control
The Processor Lecture 3.5: Data Hazards
Instruction Execution Cycle
Introduction to Computer Organization and Architecture
CS/COE0447 Computer Organization & Assembly Language
Pipelining - 1.
Stalls and flushes Last time, we discussed data hazards that can occur in pipelined CPUs if some instructions depend upon others that are still executing.
The Processor: Datapath & Control.
COMS 361 Computer Organization
©2003 Craig Zilles (derived from slides by Howard Huang)
Processor: Datapath and Control
Pipelined datapath and control
ELEC / Computer Architecture and Design Spring 2015 Pipeline Control and Performance (Chapter 6) Vishwani D. Agrawal James J. Danaher.
CS/COE0447 Computer Organization & Assembly Language
Presentation transcript:

ECS 154B Computer Architecture II Spring 2009 Pipelining Datapath and Control §6.2-6.3 Partially adapted from slides by Mary Jane Irwin, Penn State And Kurtis Kredo, UCD

Pipelined CPU Break execution into five stages Corresponds to the five instruction cycles Fetch from instruction memory (IM) Decode and fetch registers (Reg) Execute the operation in the ALU Access data memory (DM) Write result back to register (Reg) ALU IM Reg DM

State Registers How do we store values across pipeline stages? Multi-cycle MIPS introduced state registers IR, MDR, B, A introduced Sufficient for a pipelined CPU? IR Address Memory Read Addr 1 Read Data 1 PC A Register File Read Data (Instr. or Data) Read Addr 2 ALUOut ALU Write Data Write Addr Read Data 2 B Write Data MDR

Pipelined State Registers Each instruction must maintain its own state Any information needed by a later stage must be passed along Consider PC + 4 Computed during Fetch stage (why?) Needed by Execute stage (why?)

Pipelined State Registers More registers necessary for a pipelined CPU Add IF/ID ID/EX EX/MEM Add 4 Shift left 2 MEM/WB Addr 1 Instruction Memory Data Memory Register File Read Data 1 ALU PC Addr 2 Address Read Data Address Write Addr Read Data 2 Write Data Write Data Sign Extend 16 32

Load Word Example (Fetch) Add IF/ID ID/EX EX/MEM Add 4 Shift left 2 MEM/WB Addr 1 Instruction Memory Data Memory Register File Read Data 1 ALU PC Addr 2 Address Read Data Address Write Addr Read Data 2 Write Data Write Data Sign Extend 16 32

Load Word Example (Decode) Add IF/ID ID/EX EX/MEM Add 4 Shift left 2 MEM/WB Addr 1 Instruction Memory Data Memory Register File Read Data 1 ALU PC Addr 2 Address Read Data Address Write Addr Read Data 2 Write Data Write Data Sign Extend 16 32

Load Word Example (Execute) Add IF/ID ID/EX EX/MEM Add 4 Shift left 2 MEM/WB Addr 1 Instruction Memory Data Memory Register File Read Data 1 ALU PC Addr 2 Address Read Data Address Write Addr Read Data 2 Write Data Write Data Sign Extend 16 32

Load Word Example (Memory) Add IF/ID ID/EX EX/MEM Add 4 Shift left 2 MEM/WB Addr 1 Instruction Memory Data Memory Register File Read Data 1 ALU PC Addr 2 Address Read Data Address Write Addr Read Data 2 Write Data Write Data Sign Extend 16 32

Load Word Example (Write Back) Add IF/ID ID/EX EX/MEM Add 4 Shift left 2 MEM/WB Addr 1 Instruction Memory Data Memory Register File Read Data 1 ALU PC Addr 2 Address Read Data Address Write Addr Read Data 2 Write Data Write Data Sign Extend 16 32

Load Word Example (Write Back) All required values must pass through registers Add IF/ID ID/EX EX/MEM Add 4 Shift left 2 MEM/WB Addr 1 Instruction Memory Data Memory Register File Read Data 1 ALU PC Addr 2 Address Read Data Address Write Addr Read Data 2 Write Data Write Data Sign Extend 16 32

Pipelined Instructions With the multi-cycle CPU, instructions could take a different number of cycles to complete Can we do this with a pipelined CPU? R-type sw Jump

Pipelined Instructions Add IF/ID ID/EX EX/MEM Add 4 Shift left 2 MEM/WB Addr 1 Instruction Memory Data Memory Register File Read Data 1 ALU PC Addr 2 Address Read Data Address Write Addr Read Data 2 Write Data Write Data Sign Extend 16 32

Pipeline Control What about control signals? Different for each instruction Required at different stages Where does the control unit go?

Pipeline Control Pass control signals using the state registers No need to pass after they are used Control IF/ID ID/EX EX/MEM MEM/WB

Pipeline Control Signals Execute signals RegDst ALUOp[1..0] ALUSrc Branch Memory access signals MemRead MemWrite Write back signals RegWrite MemtoReg

The Pipelined CPU So Far PCSrc ID/EX Control EX/MEM MEM/WB IF/ID Add Add Branch 4 RegWrite Shift left 2 Read Addr 1 DM Read Data 1 ALU Read Addr 2 MemtoReg Read Address Register File ALUSrc PC Read Data Address Read Data 2 Write Addr Write Data IM Write Data ALU Cntrl MemWrite MemRead Sign Extend ALUOp RegDst

Data Hazard Review Caused when data is needed before it is ready Read before write: Result of previous instruction needed by later instruction Load use: Value in data memory needed by later instruction ALU IM Reg DM ALU IM Reg DM ALU IM Reg DM

Read After Write Hazard Solution Stalling always an option Forwarding data improves CPI over stalling ALU IM Reg DM add $4, $5, $6 ALU IM Reg DM add $8, $4, $6 ALU IM Reg DM add $10, $9, $4

Data Forwarding Take the result from the earliest point that it exists in any of the pipeline state registers and forward it to the functional units (e.g., the ALU) that need it that cycle For ALU functional unit: the inputs can come from any pipeline register rather than just from ID/EX by adding multiplexors to the inputs of the ALU connecting the Rd write data in EX/MEM or MEM/WB to either (or both) of the EX’s stage Rs and Rt ALU mux inputs adding the proper control hardware to control the new muxes Other functional units may need similar forwarding logic With forwarding, the CPU can achieve a CPI of 1 even in the presence of data dependencies

Data Forwarding Conditions Only forward when state changes Use RegWrite control signal Don’t forward if destination is $0 Forward if previous destination current source Forwarding unnecessary in other cases Forward if either source register needs it

EX/MEM Forwarding Register value needed by next instruction Calculated by ALU this clock cycle Needed as input to ALU on next clock cycle ALU IM Reg DM add $4, $5, $6 add $8, $4, $7 or add $8, $7, $4 ALU IM Reg DM

EX/MEM Forwarding R[Rs] R[Rt] Immediate Rd Rt ID/EX EX/MEM MEM/WB RegWrite R[Rs] ALU MemWrite MemtoReg R[Rt] ALU Cntrl Immediate Rd Rt RegDst

EX/MEM Forwarding R[Rs] R[Rt] Immediate Rd Rt Rs ID/EX EX/MEM MEM/WB RegWrite ALU MemWrite MemtoReg R[Rt] ALU Cntrl Immediate Rd Rt Rs RegDst Forward Unit

MEM/WB Forwarding Register value needed two instructions later Calculated by ALU this clock cycle Needed as input to ALU in two clock cycles ALU IM Reg DM add $4, $5, $6 ALU IM Reg DM Unrelated Instruction add $8, $4, $7 or add $8, $7, $4 ALU IM Reg DM

MEM/WB Forwarding R[Rs] R[Rt] Immediate Rd Rt Rs ID/EX EX/MEM MEM/WB RegWrite ALU MemWrite MemtoReg R[Rt] ALU Cntrl Immediate Rd Rt Rs RegDst Forward Unit

MEM/WB Forwarding R[Rs] R[Rt] Immediate Rd Rt Rs ID/EX EX/MEM MEM/WB RegWrite ALU MemWrite MemtoReg R[Rt] ALU Cntrl Immediate Rd Rt Rs RegDst Forward Unit

Forwarding Complication Forward unit must forward most recent value It may appear necessary to do MEM/WB and EX/MEM forwarding simultaneously Only do EX/MEM forwarding this cycle Do EX/MEM forwarding again next cycle ALU IM Reg DM add $4, $5, $6 ALU IM Reg DM add $4, $4, $13 ALU IM Reg DM add $8, $4, $7

Complete ALU Input Forwarding ID/EX EX/MEM MEM/WB R[Rs] RegWrite ALU MemWrite MemtoReg R[Rt] ALU Cntrl Immediate Rd Rt Rs RegDst Forward Unit

Register Definition How can we specify a particular signal? Each state register has a copy May vary across stages Reference the register that contains the value RegWrite value in EX/MEM state register EX/MEM.RegWrite RegWrite value in MEM/WB state register MEM/WB.RegWrite

Forwarding Conditions We want to forward when Previous instruction updates state Previous destination used as current source Previous destination not $0 Data Hazard code add $4, $5, $6 sub $8, $4, $9 How do we do this in hardware?

Forwarding Unit R[Rs] R[Rt] Immediate Rd Rt Rs ID/EX EX/MEM MEM/WB RegWrite ALU MemWrite MemtoReg R[Rt] ALU Cntrl Immediate Rd Rt Rs RegDst Forward Unit

Forwarding Unit Details EX/MEM.RegWrite Forward EX/MEM.RegisterRd[4] ID/EX.RegisterRs[4] … EX/MEM.RegisterRd = ID/EX.RegisterRs EX/MEM.RegisterRd[0] ID/EX.RegisterRs[0] EX/MEM.RegisterRd[4] … EX/MEM.RegisterRd[0] EX/MEM.RegisterRd ≠ 0

Other Forwarding Possible Forwarding to Data Memory Data memory to data memory copy ALU IM Reg DM add $4, $5, $6 ALU IM Reg DM sw $4, 40($7) ALU IM Reg DM lw $4, 16($7) ALU IM Reg DM sw $4, 40($7)

Forwarding to Memory What happens here? add $5, $6, $7 sw $5, 8($10) Forwarding must occur, but not through ALU

Forwarding to Memory R[Rs] R[Rt] Immediate Rd Rt Rs ID/EX EX/MEM MEM/WB MemWrite R[Rs] RegWrite ALU MemtoReg R[Rt] ALU Cntrl Immediate Rd Rt Rs Forward Unit

Forwarding to Memory R[Rs] R[Rt] Immediate Rd Rt Rs ID/EX EX/MEM MEM/WB MemWrite R[Rs] RegWrite ALU MemtoReg R[Rt] ALU Cntrl Immediate Rd Rt Rs Forward Unit