Lecture 6: Pipelining MIPS R4000 and More Kai Bu


Lab 2: demo due April 15; report due April 21. Assignment 2 (Assignment-2.pdf): due April 15.

Appendix C.5-C.7

Integer Op in 1 CC: IF ID EX MEM WB

Multicycle FP Operation: floating-point (FP) operations take more time than integer operations. Completing an FP op in 1 CC would require a slow clock, a large amount of logic in the FP units, or both.

Multicycle FP Operation: the FP pipeline allows a longer latency for operations. Two changes over the integer pipeline: repeat the EX cycle as many times as the operation needs; use multiple FP functional units.

FP Pipeline

Outline: Multicycle FP Operations; Hazards and Forwarding; MIPS R4000 Pipeline

FP Pipeline: four separate functional units. 1. Main integer unit: loads and stores, integer ALU operations, branches. 2. FP adder: FP add, FP subtract, FP conversion. 3. FP and integer multiplier. 4. FP and integer divider.

FP Pipeline: EX is not pipelined. No other instruction using that functional unit may issue until the previous instruction leaves EX. If an instruction cannot proceed to EX, the entire pipeline behind that instruction is stalled.

FP Pipeline: Latency is the number of intervening cycles between an instruction that produces a result and an instruction that uses the result. Initiation (or repeat) interval is the number of cycles that must elapse between issuing two operations of a given type.

FP Pipeline: essentially, pipeline latency is 1 cycle less than the depth of the execution pipeline; e.g., FP add takes 4 stages, so its latency is 3 cycles.
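
The depth-minus-one rule can be sketched in a few lines of Python. This is only an illustration: the stage names follow the EX, A1-A4 and M1-M7 labels used in this lecture, while the dictionary and the function name `latency` are hypothetical.

```python
# Execution stages per functional unit. The stage names (EX, A1-A4, M1-M7)
# follow this lecture; the dictionary itself is a hypothetical illustration.
EXEC_STAGES = {
    "integer ALU": ["EX"],
    "FP add": ["A1", "A2", "A3", "A4"],
    "FP multiply": ["M1", "M2", "M3", "M4", "M5", "M6", "M7"],
}


def latency(unit: str) -> int:
    """Intervening cycles needed between a producer and a dependent consumer."""
    return len(EXEC_STAGES[unit]) - 1


for unit, stages in EXEC_STAGES.items():
    print(f"{unit}: depth {len(stages)}, latency {latency(unit)}")
```

Loads are left out on purpose: their result comes from MEM rather than from the end of an execution unit, so the simple rule does not apply to them directly.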

Generalized FP Pipeline: EX is pipelined (except for the FP divider). Additional pipeline registers are needed, e.g., ID/A1. FP divider: 24 CCs.

Generalized FP Pipeline Example. Italics: stage where data is needed. Bold: stage where a result is available.

Outline: Multicycle FP Operations; Hazards and Forwarding; MIPS R4000 Pipeline

Hazard: the divider is not fully pipelined, so structural hazards can occur.

Hazard: instructions have varying running times, so more than one register write may be needed in a cycle, which is a structural hazard on the write port.

Hazard: instructions no longer reach WB in order, so write-after-write (WAW) hazards are possible.

Hazard: instructions may complete in a different order than they were issued, which causes problems with exceptions.

Hazard: operations have longer latency, so stalls for RAW hazards are more frequent.

RAW Hazards

Structural Hazards

Interlock Detection Method 1: track the use of the write port in the ID stage and stall an instruction before it issues. A shift register tracks when already-issued instructions will use the register file; if the instruction in ID needs to use the register file at the same time, stall it.

Structural Hazards Interlock Detection Method 2: stall a conflicting instruction when it tries to enter MEM or WB. Either the issuing or the already-issued instruction could be stalled; give priority to the unit with the longest latency. This is more complicated, because the stall now arises at MEM or WB rather than in ID.
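
A minimal sketch of the shift-register idea, assuming one register write port and an integer used as the shift register. The class name `WritePortTracker` and the example distances from ID to WB (9 cycles for an FP multiply, 6 for an FP add) are assumptions made for illustration, not figures from the slides.

```python
class WritePortTracker:
    """Bit i of `reserved` is set when the write port is taken i cycles from now."""

    def __init__(self) -> None:
        self.reserved = 0

    def can_issue(self, cycles_until_wb: int) -> bool:
        # The instruction in ID may issue only if its WB slot is still free.
        return ((self.reserved >> cycles_until_wb) & 1) == 0

    def issue(self, cycles_until_wb: int) -> None:
        if not self.can_issue(cycles_until_wb):
            raise RuntimeError("write port conflict: stall in ID")
        self.reserved |= 1 << cycles_until_wb

    def tick(self) -> None:
        # One clock cycle passes: every reservation moves one slot closer.
        self.reserved >>= 1


tracker = WritePortTracker()
tracker.issue(9)                 # hypothetical FP multiply: WB is 9 cycles after ID
for _ in range(3):
    tracker.tick()
print(tracker.can_issue(6))      # hypothetical FP add 3 cycles later: same WB slot
tracker.tick()
print(tracker.can_issue(6))      # one cycle later the slot is free
```

Stalling simply means not calling `issue` and calling `tick` once more; the conflict then resolves itself because the reservation has shifted.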

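WAW Hazards: if L.D were issued one cycle earlier, it would write F2 one cycle earlier than ADD.D, and the later ADD.D write would overwrite the loaded value: a WAW hazard. What if another instruction used F2 between them? Then there is no WAW hazard, because the RAW stall on that instruction delays L.D until ADD.D has completed.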

Hazard Detection in ID 1. Check for structural hazards: wait until the required functional unit is not busy (only needed for divides), and make sure the register write port is available when it will be needed.

Hazard Detection in ID 2. Check for RAW data hazards: wait until the source registers are available when needed, i.e., they are not pending destinations of issued instructions.

Hazard Detection in ID 3. Check for WAW data hazards: determine whether any instruction in A1-A4, D, or M1-M7 has the same register destination as this instruction; if so, stall the issue of the instruction in ID.
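
The three checks can be sketched as one function, applied in the order given above. This is a conservative simplification: the function name `id_stall_reason` and its parameters are hypothetical, and the RAW check ignores forwarding timing, so any pending destination counts as unavailable.

```python
def id_stall_reason(srcs, dest, unit, pending_dests, divider_busy, write_port_free):
    """Return why the instruction in ID must stall, or None if it may issue.

    pending_dests: destination registers of instructions still in A1-A4, D or M1-M7.
    """
    # 1. Structural hazards: only the divider can be busy; the write port must be free.
    if unit == "divide" and divider_busy:
        return "structural: divider busy"
    if not write_port_free:
        return "structural: write port taken"
    # 2. RAW: a source register is still a pending destination.
    for reg in srcs:
        if reg in pending_dests:
            return f"RAW on {reg}"
    # 3. WAW: an in-flight instruction writes the same destination.
    if dest in pending_dests:
        return f"WAW on {dest}"
    return None


print(id_stall_reason(["F0", "F8"], "F2", "add", {"F0"}, False, True))  # RAW on F0
print(id_stall_reason(["F4", "F6"], "F2", "add", {"F2"}, False, True))  # WAW on F2
```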

Forwarding: generalized with more sources. Check whether the destination register in any of EX/MEM, A4/MEM, M7/MEM, D/MEM, or MEM/WB is one of the source registers of an FP instruction.

Out-of-order Completion: ADD and SUB complete before DIV. Out-of-order completion means instructions complete in a different order than they were issued.
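
A small sketch shows why a long divide finishes after the shorter operations issued behind it. The 24 execution cycles for divide come from this lecture; treating add and subtract as 4-cycle operations assumes both use the 4-stage FP adder, and the function name `completion_order` is hypothetical. MEM and WB are ignored because they add the same constant to every instruction.

```python
# Execution cycles per operation. 24 for divide is the figure quoted in this
# lecture; 4 for add/subtract assumes both use the 4-stage FP adder.
EXEC_CYCLES = {"DIV": 24, "ADD": 4, "SUB": 4}


def completion_order(program):
    """Issue one instruction per cycle, in order; return names sorted by finish cycle."""
    finish = []
    for issue_cycle, op in enumerate(program, start=1):
        finish.append((issue_cycle + EXEC_CYCLES[op], op))
    return [op for _, op in sorted(finish)]


print(completion_order(["DIV", "ADD", "SUB"]))  # ['ADD', 'SUB', 'DIV']
```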

Out-of-order Completion: how to deal with out-of-order completion?
1. Ignore the problem.
2. Buffer the results of an operation until all the operations issued earlier complete.
3. Track which operations were in the pipeline and their PCs.
4. Issue an instruction only if it is certain that all previous instructions will complete without exception.
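
Approach 2 can be sketched as a buffer that holds finished results and releases them strictly in issue order. This is a software illustration of the idea, not a description of any particular hardware; the class name `InOrderCommitBuffer` and the result values are hypothetical.

```python
class InOrderCommitBuffer:
    """Hold completed results until every earlier-issued operation has completed."""

    def __init__(self, program_order):
        self.order = list(program_order)   # instruction names in issue order
        self.done = {}                     # finished but not yet committed
        self.next = 0                      # index of the oldest uncommitted instruction
        self.committed = []                # (name, value) pairs in commit order

    def complete(self, name, value):
        self.done[name] = value
        # Release finished results strictly in program order.
        while self.next < len(self.order) and self.order[self.next] in self.done:
            oldest = self.order[self.next]
            self.committed.append((oldest, self.done.pop(oldest)))
            self.next += 1


buf = InOrderCommitBuffer(["DIV", "ADD", "SUB"])
buf.complete("ADD", 1.0)    # finishes first, but DIV is still executing
buf.complete("SUB", 2.0)
print(buf.committed)        # [] : nothing may commit yet
buf.complete("DIV", 3.0)
print(buf.committed)        # all three, in issue order
```

The cost is visible in the sketch: the ADD and SUB results sit in the buffer for as long as the divide runs, which is why the buffer can become large when execution times differ a lot.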

Outline: Multicycle FP Operations; Hazards and Forwarding; MIPS R4000 Pipeline

All in MIPS R4000

MIPS R4000: 5-stage -> 8-stage pipeline; higher clock rate

MIPS R4000 IF: first half of instruction fetch; PC selection; initiation of instruction cache access.

MIPS R4000 IS: second half of instruction fetch; completion of instruction cache access.

MIPS R4000 RF: instruction decode and register fetch; hazard checking; instruction cache hit detection.

MIPS R4000 EX: execution, including effective address calculation, ALU operation, and branch-target computation and condition evaluation.

MIPS R4000 DF: data fetch; first half of data cache access.

MIPS R4000 DS: second half of data fetch; completion of data cache access.

MIPS R4000 TC: tag check; determine whether the data cache access hit.

MIPS R4000 WB: write back for loads and register-register operations.
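
To see how the eight stages overlap, the sketch below computes which stage each instruction occupies in a given cycle. It assumes one instruction enters IF per cycle and no stalls; the function name `stage_at` is hypothetical.

```python
R4000_STAGES = ["IF", "IS", "RF", "EX", "DF", "DS", "TC", "WB"]


def stage_at(instr_index: int, cycle: int):
    """Stage occupied in `cycle` (1-based) by instruction `instr_index` (0-based).

    Returns None when the instruction has not started yet or has already left WB.
    """
    position = cycle - 1 - instr_index
    if 0 <= position < len(R4000_STAGES):
        return R4000_STAGES[position]
    return None


# Print a small pipeline diagram: three instructions over ten cycles.
for i in range(3):
    row = [stage_at(i, c) or "--" for c in range(1, 11)]
    print(f"instr {i}: " + " ".join(f"{s:>2}" for s in row))
```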

MIPS R4000: 2-cycle load delay
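
A short sketch of what a load delay costs a dependent instruction. It assumes a two-cycle load delay (the loaded value is available at the end of DS and is forwarded to the consumer's EX stage); the constant `LOAD_DELAY` and the function name `load_use_stalls` are hypothetical.

```python
LOAD_DELAY = 2  # assumed: data is ready at the end of DS, two stages after EX


def load_use_stalls(distance: int) -> int:
    """Stall cycles for a consumer issued `distance` instructions after the load.

    distance == 1 means the consumer immediately follows the load.
    """
    return max(0, LOAD_DELAY + 1 - distance)


for d in range(1, 5):
    print(f"consumer {d} instruction(s) after the load: {load_use_stalls(d)} stall(s)")
```

With these assumptions, a consumer needs at least two independent instructions between itself and the load to avoid stalling.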

MIPS R4000: 3-cycle branch delay; predicted-not-taken

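MIPS R4000 Forwarding: instead of the two bypass sources of the 5-stage pipeline (EX/MEM or MEM/WB), there are four: EX/DF, DF/DS, DS/TC, TC/WB.

MIPS R4000 FP Pipeline: the FP unit has three functional units (FP divider, FP multiplier, FP adder); operations take from 2 cycles to 112 cycles.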

MIPS R4000 FP unit with eight different stages

MIPS R4000 FP operations: latency and initiation interval

MIPS R4000 FP operations, Example 1: FP multiply + FP add

MIPS R4000 FP operations, Example 2: FP add + FP multiply

MIPS R4000 FP operations, Example 3: FP divide + FP add

MIPS R4000 FP operations, Example 4: FP add + FP divide
