Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania 18042 ECE 313 - Computer Organization Pipelined Processor.


Prof. John Nestor, ECE Department, Lafayette College, Easton, Pennsylvania
ECE 313 - Computer Organization
Pipelined Processor Design 2
Feb 2004
Reading: , 6.8

Portions of these slides are derived from:
- Textbook figures © 1998 Morgan Kaufmann Publishers, all rights reserved
- Tod Amon's COD2e Slides © 1998 Morgan Kaufmann Publishers, all rights reserved
- Dave Patterson's CS 152 Slides - Fall 1997 © UCB
- Rob Rutenbar's Slides - Fall 1999 CMU
- other sources as noted

Feb 2005 - Pipelining 2

Pipelining Outline
- Introduction
- Pipelined Processor Design  <-- current topic
- Datapath
- Control
- Dealing with Hazards & Forwarding
- Branch Prediction
- Exceptions
- Performance
- Advanced Pipelining
- Superscalar
- Dynamic Pipelining
- Examples

Pipelining in MIPS
- MIPS architecture was designed to be pipelined
  - Simple instruction format (makes IF, ID easy)
    - Single-word instructions
    - Small number of instruction formats
    - Common fields in same place (e.g., rs, rt) in different formats
  - Memory operations only in lw, sw instructions (simplifies EX)
  - Memory operands aligned in memory (simplifies MEM)
  - Single value for writeback (limits forwarding)
- Pipelining is harder in CISC architectures

Pipelined Datapath with Control Signals

Next Step: Adding Control
- Basic approach: build on single-cycle control
  - Place control unit in ID stage
  - Pass control signals to following stages
- Later: extra features to deal with:
  - Data forwarding
  - Stalls
  - Exceptions

Control for Pipelined Datapath
Source: Book Fig. 6.29, p 469
Control signals shown: RegDst, ALUOp[1:0], ALUSrc, MemRead, MemWrite, Branch, RegWrite, MemtoReg

Control for Pipelined Datapath
Source: Book Fig. 6.25, p 401
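The idea of generating all control signals in ID and passing them down the pipeline can be sketched in Python. This is a minimal illustration, not the course's implementation: the grouping of signals by consuming stage follows the order in which the slide lists them, the table values are the usual textbook settings for a small MIPS subset, and the names `CONTROL_TABLE`, `decode`, and `pass_to_next` are hypothetical.

```python
# Control signals grouped by the stage that consumes them (assumed grouping,
# following the order in which the slide lists the signals).
EX_SIGNALS = ("RegDst", "ALUOp", "ALUSrc")
MEM_SIGNALS = ("MemRead", "MemWrite", "Branch")
WB_SIGNALS = ("RegWrite", "MemtoReg")

# Hypothetical decode table. Don't-care entries (RegDst and MemtoReg for
# sw and beq) are written as 0 here.
CONTROL_TABLE = {
    "R-type": dict(RegDst=1, ALUOp=0b10, ALUSrc=0, MemRead=0, MemWrite=0,
                   Branch=0, RegWrite=1, MemtoReg=0),
    "lw": dict(RegDst=0, ALUOp=0b00, ALUSrc=1, MemRead=1, MemWrite=0,
               Branch=0, RegWrite=1, MemtoReg=1),
    "sw": dict(RegDst=0, ALUOp=0b00, ALUSrc=1, MemRead=0, MemWrite=1,
               Branch=0, RegWrite=0, MemtoReg=0),
    "beq": dict(RegDst=0, ALUOp=0b01, ALUSrc=0, MemRead=0, MemWrite=0,
                Branch=1, RegWrite=0, MemtoReg=0),
}


def decode(op):
    """ID stage: produce the full control word that is written into ID/EX."""
    return dict(CONTROL_TABLE[op])


def pass_to_next(ctrl, consumed):
    """Drop the signals used by the current stage and forward the rest."""
    return {name: value for name, value in ctrl.items() if name not in consumed}


id_ex = decode("lw")                         # all nine signal bits enter ID/EX
ex_mem = pass_to_next(id_ex, EX_SIGNALS)     # EX uses RegDst, ALUOp, ALUSrc
mem_wb = pass_to_next(ex_mem, MEM_SIGNALS)   # MEM uses MemRead, MemWrite, Branch
```

Each pipeline register carries only the signals that later stages still need, so the control word shrinks as the instruction advances.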

Datapath and Control Unit

Tracking Control Signals - Cycle 1
Instruction labels in figure: LW

Tracking Control Signals - Cycle 2
Instruction labels in figure: SW, LW

Tracking Control Signals - Cycle 3
Instruction labels in figure: ADD, SW, LW

Tracking Control Signals - Cycle 4
Instruction labels in figure: SUB, ADD, SW, LW (control values shown: 1 0 0)

Tracking Control Signals - Cycle 5
Instruction labels in figure: ADD, SUB, SW, LW

Pipelining Outline - Coming Up
- Introduction
- Pipelined Processor Design
- Datapath
- Control
- Dealing with Hazards & Forwarding  <-- current topic
- Branch Prediction
- Exceptions
- Performance
- Advanced Pipelining
- Superscalar
- Dynamic Pipelining
- Examples

Data Hazards Revisited...
- Data hazards occur when data is used before it is stored (Fig. 6.28)

Data Hazard Solution: Forwarding
- Key idea: connect data internally before it's stored (Fig. 6.29)

Data Hazard Solution: Forwarding
- Add hardware to feed back ALU and MEM results to both ALU inputs (Fig. 6.32)

Controlling Forwarding
- Need to test when register numbers match in the rs, rt, and rd fields stored in pipeline registers
- "EX" hazard:
  - EX/MEM - test whether instruction writes register file and examine rd register
  - ID/EX - test whether instruction reads rs or rt register and matches rd register in EX/MEM
- "MEM" hazard:
  - MEM/WB - test whether instruction writes register file and examine rd (rt) register
  - ID/EX - test whether instruction reads rs or rt register and matches rd (rt) register in MEM/WB

Forwarding Unit Detail - EX Hazard

    if (EX/MEM.RegWrite
        and (EX/MEM.RegisterRd ≠ 0)
        and (EX/MEM.RegisterRd = ID/EX.RegisterRs))
      ForwardA = 10

    if (EX/MEM.RegWrite
        and (EX/MEM.RegisterRd ≠ 0)
        and (EX/MEM.RegisterRd = ID/EX.RegisterRt))
      ForwardB = 10

Forwarding Unit Detail - MEM Hazard

    if (MEM/WB.RegWrite
        and (MEM/WB.RegisterRd ≠ 0)
        and (MEM/WB.RegisterRd = ID/EX.RegisterRs))
      ForwardA = 01

    if (MEM/WB.RegWrite
        and (MEM/WB.RegisterRd ≠ 0)
        and (MEM/WB.RegisterRd = ID/EX.RegisterRt))
      ForwardB = 01

EX Hazard Complication
- What if a register is changed more than once?

    add $1, $1, $2;
    add $1, $1, $3;
    add $1, $1, $4;

- Answer: forward most recent result (in MEM stage)

Forwarding Unit Detail - MEM Hazard Revised

    if (MEM/WB.RegWrite
        and (MEM/WB.RegisterRd ≠ 0)
        and (EX/MEM.RegisterRd ≠ ID/EX.RegisterRs)
        and (MEM/WB.RegisterRd = ID/EX.RegisterRs))
      ForwardA = 01

    if (MEM/WB.RegWrite
        and (MEM/WB.RegisterRd ≠ 0)
        and (EX/MEM.RegisterRd ≠ ID/EX.RegisterRt)
        and (MEM/WB.RegisterRd = ID/EX.RegisterRt))
      ForwardB = 01
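The EX-hazard and revised MEM-hazard conditions can be combined into one small function. The Python sketch below is a software model under stated assumptions, not hardware: each pipeline register is represented as a dictionary with hypothetical keys `RegWrite` and `Rd`, and the function name `forwarding_unit` is invented for illustration. The conditions themselves are copied from the slides.

```python
def forwarding_unit(ex_mem, mem_wb, id_ex_rs, id_ex_rt):
    """Return (ForwardA, ForwardB) as 2-bit mux selects.

    0b00: operand comes from the register file
    0b10: operand is forwarded from EX/MEM (EX hazard)
    0b01: operand is forwarded from MEM/WB (MEM hazard, revised condition)
    """
    def select(src):
        # EX hazard: the instruction one ahead writes the register we read.
        if ex_mem["RegWrite"] and ex_mem["Rd"] != 0 and ex_mem["Rd"] == src:
            return 0b10
        # Revised MEM hazard: only forward from MEM/WB when EX/MEM does not
        # hold a newer value for the same register.
        if (mem_wb["RegWrite"] and mem_wb["Rd"] != 0
                and ex_mem["Rd"] != src and mem_wb["Rd"] == src):
            return 0b01
        return 0b00

    return select(id_ex_rs), select(id_ex_rt)


# Third instruction of the sequence add $1,$1,$2; add $1,$1,$3; add $1,$1,$4:
# both earlier adds write $1, so the newer value in EX/MEM must win.
forward_a, forward_b = forwarding_unit(
    ex_mem={"RegWrite": True, "Rd": 1},
    mem_wb={"RegWrite": True, "Rd": 1},
    id_ex_rs=1,
    id_ex_rt=4,
)
```

For the third `add`, `forward_a` is `0b10` (take the most recent result from EX/MEM) and `forward_b` is `0b00` (register `$4` is read normally).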

Forwarding Elaboration (Fig. 6.33)
- Extra 2-1 mux needed for immediate instructions (labeled "Added Mux" in the figure)

Data Hazards and Stalls
- We still have to stall when a register is loaded from memory and used in the following instruction (Fig. 6.34)

Data Hazards and Stalls
- Add a hazard detection unit to detect this condition and stall (Fig. 6.35)
- Figure annotation: "Typo: Should read AND"

Pipelined Processor with Hazard Detection (Fig. 6.36)

Hazard Detection Unit - Control Detail

    if (ID/EX.MemRead
        and ((ID/EX.RegisterRt = IF/ID.RegisterRs)
             or (ID/EX.RegisterRt = IF/ID.RegisterRt)))
      stall

Hazard detection unit - what happens
- MUX zeros out control signals for instruction in ID
  - "squashes" the instruction
  - "no-op" propagates through following stages
- IF/ID holds stalled instruction until next clock cycle
- PC holds current value until next clock cycle (re-loads first instruction)
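The stall condition and its three effects (bubble into ID/EX, IF/ID held, PC held) can be modeled in a few lines of Python. This is a simplified sketch: the function names, the string stand-ins for instructions, and the two-signal control word are all hypothetical, chosen only to make the behavior visible.

```python
def load_use_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """Hazard detection condition: a load in EX feeds the instruction in ID."""
    return bool(id_ex_mem_read) and (id_ex_rt == if_id_rs or id_ex_rt == if_id_rt)


def front_end_step(pc, if_id, fetched, ctrl, stall):
    """Return (next_pc, next_if_id, ctrl_written_to_id_ex) for one clock edge.

    `fetched` stands in for the instruction read from memory at `pc`.
    """
    if stall:
        bubble = {name: 0 for name in ctrl}  # MUX zeros the control signals
        return pc, if_id, bubble             # PC and IF/ID hold their values
    return pc + 4, fetched, ctrl


# lw $2, 20($1) is in EX (rt = 2); and $4, $2, $5 is in ID (rs = 2, rt = 5).
stall = load_use_stall(True, 2, 2, 5)
state = front_end_step(pc=8, if_id="and $4,$2,$5", fetched="or $8,$2,$6",
                       ctrl={"RegWrite": 1, "MemRead": 0}, stall=stall)
```

With the stall asserted, `state` keeps the PC at 8 and the `and` instruction in IF/ID, while an all-zero control word (the no-op) moves into ID/EX.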

Branch Hazards
- Just stalling for each branch is not practical
- Common assumption: branch not taken
- When assumption fails: flush three instructions (Fig. 6.37)

Reducing Branch Delay
- Key idea: move branch logic to ID stage of pipeline
  - New adder calculates branch target (PC + (extend(IMM) << 2))
  - New hardware tests rs == rt after register read
  - Add flush signal to squash instruction in IF/ID register
- Reduced penalty (1 cycle) when branch taken
- Example: Figure 6.38, p. 420
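The two pieces of hardware added to the ID stage (the target adder and the equality test) can be illustrated with a short Python sketch. It assumes the usual MIPS convention that the PC value available in ID is already PC + 4 and that addresses are 32 bits wide; the function names are hypothetical.

```python
def sign_extend16(imm):
    """Sign-extend a 16-bit immediate field to a Python int."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def branch_in_id(pc_plus_4, imm16, rs_val, rt_val):
    """Resolve a beq in ID. Returns (taken, target).

    `taken` doubles as the flush signal for the IF/ID register: when it is
    True, the instruction fetched after the branch is squashed.
    """
    target = (pc_plus_4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF
    taken = rs_val == rt_val
    return taken, target


# beq with an offset of 7 words, registers equal: branch to 0x44 + 28 = 0x60.
taken, target = branch_in_id(0x44, 7, 5, 5)
```

Because the decision is made in ID, only the single instruction already in IF has to be flushed when the branch is taken, which is where the 1-cycle penalty comes from.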

Pipelined Processor - Branch Hardware in ID (Old Fig. 6.51)

Pipelining Outline
- Introduction
- Pipelined Processor Design
- Datapath
- Control
- Dealing with Hazards & Forwarding
- Branch Prediction  <-- current topic
- Exceptions
- Performance
- Advanced Pipelining
- Superscalar
- Dynamic Pipelining
- Examples

Branch Prediction
- Key idea: instead of always assuming branch not taken, use a prediction based on previous history
- Branch history table: small memory
  - indexed using the lower bits of the instruction address
  - saves "what happened" on the last execution
    - branch taken OR
    - branch not taken
- Use history to make prediction

More about Branch Prediction
- Consider nested loops:

    for (i=1; i<M; i++) {          oloop: ...
      for (j=1; j<N; j++) {        iloop:
      }                                   bne $1,$2, iloop
    }                                     bne $3,$4, oloop

- Prediction fails on the first and last execution of the inner-loop branch
- More history can improve performance
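The effect of single-bit history on the inner-loop branch can be checked with a short simulation. The Python sketch below models one history-table entry only (no aliasing between branches), assumes the entry starts as "not taken", and uses hypothetical function names; the loop is parametrized by iteration counts rather than by M and N.

```python
def inner_branch_outcomes(outer_iters, inner_iters):
    """Outcomes of the bne at the bottom of the inner loop.

    Per pass of the outer loop the branch is taken (inner_iters - 1) times
    and then falls through once when the inner loop exits.
    """
    return ([True] * (inner_iters - 1) + [False]) * outer_iters


def one_bit_mispredictions(outcomes, initial=False):
    """Count mispredictions when the prediction is simply the last outcome."""
    prediction, misses = initial, 0
    for taken in outcomes:
        if prediction != taken:
            misses += 1
        prediction = taken  # 1-bit history: remember only the last outcome
    return misses


# 3 passes of the outer loop, 10 inner iterations each.
misses = one_bit_mispredictions(inner_branch_outcomes(3, 10))
```

Here `misses` is 6: two per pass of the outer loop, one on the first execution of the branch (the table still remembers the previous loop exit) and one on the last.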

Branch Prediction w/2-Bit History
- Key idea: must be wrong twice before changing prediction
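One common way to realize 2-bit history is a saturating counter, sketched below in Python. This is an assumption for illustration: the state diagram on the slide may encode or connect its four states differently, but the counter shows the same property that a strongly held prediction needs two wrong outcomes before it flips. The function name and the loop sizes are hypothetical.

```python
def two_bit_mispredictions(outcomes, state=0):
    """Count mispredictions of a 2-bit saturating counter.

    States 0 and 1 predict not taken; states 2 and 3 predict taken.
    """
    misses = 0
    for taken in outcomes:
        if (state >= 2) != taken:
            misses += 1
        # Move one step toward the observed outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses


# Loop-closing branch: taken 9 times, then not taken, repeated for 3 passes.
outcomes = ([True] * 9 + [False]) * 3
cold_start = two_bit_mispredictions(outcomes, state=0)
warm_start = two_bit_mispredictions(outcomes, state=3)
```

Starting from state 0 the counter mispredicts 5 times (3 in the first pass while it warms up, then 1 per pass); starting from state 3 it mispredicts 3 times, once per loop exit. A 1-bit scheme on the same sequence would miss twice per pass.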

Pipelining Outline
- Introduction
- Pipelined Processor Design
- Datapath
- Control
- Dealing with Hazards & Forwarding
- Branch Prediction
- Exceptions  <-- current topic
- Performance
- Advanced Pipelining
- Superscalar
- Dynamic Pipelining
- Examples

Pipelining and Exceptions
- Exceptions require suspension of execution
- Complicating factors
  - Several instructions are in pipeline
  - Exception may occur before instruction is complete
  - Must flush pipeline to suspend execution, but may lose information about the exception

Pipelining and Exceptions (cont'd) (Fig. 6.42)

Pipelining and Exceptions (cont'd)
- Operation: Figure 6.43 (p. 508)
- Exceptions make life difficult - take a computer architecture course to learn more.

Pipelining Outline
- Introduction
- Pipelined Processor Design
- Datapath
- Control
- Dealing with Hazards & Forwarding
- Branch Prediction
- Exceptions
- Performance  <-- current topic
- Advanced Pipelining
- Superscalar
- Dynamic Pipelining
- Examples

Performance of the Pipelined Implementation
- Use "gcc" instruction mix to calculate CPI

    Instruction   Mix    Cycles
    lw            25%    1 (2 when load-use hazard)
    sw            10%    1
    R-type        52%    1
    branch        11%    1 (2 when prediction wrong)
    jump           2%    2

- Assumptions:
  - 50% of load instructions are followed by immediate use
  - 25% of branch predictions are wrong
- Calculating CPI
  - CPI = (1.5 cycles * 0.25) + (1 cycle * 0.10) + (1 cycle * 0.52) + (1.25 cycles * 0.11) + (2 cycles * 0.02)
  - CPI = 1.17 cycles per instruction
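The CPI arithmetic can be reproduced directly from the instruction mix and the two assumptions. The Python sketch below uses the figures from the slide; the variable names are hypothetical.

```python
# Instruction mix ("gcc") as fractions of all instructions.
MIX = {"lw": 0.25, "sw": 0.10, "R-type": 0.52, "branch": 0.11, "jump": 0.02}

LOAD_USE_FRACTION = 0.50    # loads followed immediately by a use (1 extra cycle)
MISPREDICT_FRACTION = 0.25  # branch predictions that are wrong (1 extra cycle)

# Average cycles per instruction class.
CYCLES = {
    "lw": 1 + LOAD_USE_FRACTION,        # 1.5
    "sw": 1,
    "R-type": 1,
    "branch": 1 + MISPREDICT_FRACTION,  # 1.25
    "jump": 2,
}

cpi = sum(MIX[name] * CYCLES[name] for name in MIX)
```

The sum is 0.375 + 0.10 + 0.52 + 0.1375 + 0.04 = 1.1725, which the slide rounds to 1.17.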

Performance of the Pipelined Implementation (cont'd)
- Calculate the average execution time:

    Pipelined      1.17 CPI * 200ps/clock = 234ps
    Single-Cycle   1 CPI * 600ps/clock    = 600ps
    Multicycle     4.12 CPI * 200ps/clock = 824ps

- Speedup of pipelined implementation
  - 2.56X faster than single cycle
  - 3.52X faster than multicycle (824ps / 234ps)
- "Your mileage may differ" as instruction mix changes
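The execution times and speedups follow from CPI times clock period. A short Python check, using the CPI and clock figures listed on the slide (the dictionary layout and names are hypothetical):

```python
# (CPI, clock period in ps) for each implementation, as given on the slide.
DESIGNS = {
    "pipelined": (1.17, 200),
    "single-cycle": (1.0, 600),
    "multicycle": (4.12, 200),
}

# Average time per instruction in picoseconds.
time_ps = {name: cpi * clock for name, (cpi, clock) in DESIGNS.items()}

speedup_vs_single = time_ps["single-cycle"] / time_ps["pipelined"]
speedup_vs_multi = time_ps["multicycle"] / time_ps["pipelined"]
```

From the listed times, 600 / 234 is about 2.56 and 824 / 234 is about 3.52.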

Pipelining Outline
- Introduction
- Pipelined Processor Design
- Datapath
- Control
- Dealing with Hazards & Forwarding
- Branch Prediction
- Exceptions
- Performance
- Advanced Pipelining  <-- current topic
- Superscalar
- Dynamic Pipelining
- Examples