King Fahd University of Petroleum and Minerals, Computer Engineering Department.

Presentation transcript:

King Fahd University of Petroleum and Minerals, College of Computer Science and Engineering, Computer Engineering Department. COE 308, Pipeline.
Slide 1: Enhancing Performance with Pipelining

Slide 2: Laundry Example
A student doing laundry processes one load in four steps:
– Washing a single load of laundry
– Drying a single load of laundry
– Folding a single load
– Putting the load in the closet

Slide 3: Sequential Laundry
[Figure: timeline starting at 6 PM and running into the AM hours; task order A, B, C, D]
Sequential laundry takes 8 hours for four loads of wash …

Slide 4: Pipelined Laundry
… while pipelined laundry takes just 3.5 hours.
[Figure: timeline starting at 6 PM; task order A, B, C, D]
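The 8-hour and 3.5-hour figures can be checked with a short calculation. The sketch below assumes that each of the four laundry steps takes 30 minutes; that duration is not stated on the slides, but it is consistent with both totals. The function names are illustrative only.

```python
def sequential_minutes(loads, steps, step_minutes):
    # Each load runs all of its steps before the next load starts.
    return loads * steps * step_minutes


def pipelined_minutes(loads, steps, step_minutes):
    # The first load needs `steps` time slots; every later load
    # finishes one slot after the load in front of it.
    return (steps + loads - 1) * step_minutes


# Assumed: 4 loads (A, B, C, D), 4 steps, 30 minutes per step.
print(sequential_minutes(4, 4, 30) / 60)  # 8.0 hours
print(pipelined_minutes(4, 4, 30) / 60)   # 3.5 hours
```

The pipelined total grows by only one step time per extra load, which is why the gain increases with the number of loads.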

Slide 5: Pipelining Analysis
Pipelining is possible because:
All four laundry steps use independent stations
– Washing uses the washer, which is independent from the dryer used in the drying step and from the table used in the folding step.
– This means that once the washing step is done, the washer can be used for another load while the current load is drying in the dryer.
All steps are always used in the same order
– Washing always occurs before drying, as it is not correct to dry clothes that haven't been washed yet
– Drying always occurs before folding
– …

Slide 6: Pipelining Processor Execution
The processor executes instructions. Can the instruction execution process be pipelined?
– Yes, because it can be divided into steps
– And because the order of the execution steps is the same (most of the time)
Instruction execution steps:
– Fetch the instruction from memory
– Read registers while decoding the instruction
– Execute the operation
– Access an operand in data memory
– Write the result into a register

Slide 7: Pipeline Stages
The instruction execution steps are called pipeline stages:
– Instruction Fetch (IF)
– Instruction Decode (ID)
– EXecute operation (EX)
– MEMory access (MEM)
– Write Back the result (WB)
IF → ID → EX → MEM → WB

Slide 8: Processor Pipeline
The pipeline is well represented as a timing diagram (as in the laundry example). The following sequence is represented:
  Clock cycle         1    2    3    4    5    6    7    8    9
  add  $1, $3, $5     IF   ID   EX   MEM  WB
  sub  $3, $1, $4          IF   ID   EX   MEM  WB
  and  $2, $5, $1               IF   ID   EX   MEM  WB
  or   $7, $1, $9                    IF   ID   EX   MEM  WB
  addi $10, $6, $3                        IF   ID   EX   MEM  WB
Five instructions are executed in 9 cycles.
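The 9-cycle count follows from the pipeline depth: with 5 stages and no stalls, n instructions need 5 + n - 1 cycles. The sketch below computes that count and prints a staggered timing diagram; the helper names and the column layout are choices made for this illustration, not part of the slides.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def total_cycles(num_instructions, num_stages=len(STAGES)):
    # The first instruction fills the pipeline; each later one adds a cycle.
    return num_stages + num_instructions - 1


def timing_diagram(names):
    # One text row per instruction, each starting one cycle after the previous.
    width = total_cycles(len(names))
    rows = []
    for i, name in enumerate(names):
        cells = [""] * width
        for s, stage in enumerate(STAGES):
            cells[i + s] = stage
        rows.append(f"{name:<5}" + "".join(f"{c:<4}" for c in cells).rstrip())
    return rows


for row in timing_diagram(["add", "sub", "and", "or", "addi"]):
    print(row)
print(total_cycles(5), "cycles")
```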

Slide 9: Data Dependency Hazard
Examine the following instructions:
  add  $1, $3, $5
  sub  $3, $1, $4
  and  $2, $5, $1
  or   $7, $1, $9
There is a dependency between add and sub on register $1, as it is used by sub after it is modified by add.
  Clock cycle   1    2    3    4    5    6
  add           IF   ID   EX   MEM  WB
  sub                IF   ID   EX   MEM  WB
The result of the add instruction is written into register $1 NOT BEFORE the WB stage. However, the sub instruction fetches the value of register $1 during the ID stage.
Problem: The sub instruction will fetch the wrong value of register $1 because the correct value has not been written there yet.

Slide 10: Types of Dependencies
All cases of data dependencies should be analyzed to see whether they cause any malfunction in the pipeline context.
Data dependency cases:
– Read After Write (RAW)
– Read After Read (RAR)
– Write After Write (WAW)
– Write After Read (WAR)

Slide 11: RAW Dependency
  add  $1, $3, $5
  sub  $3, $1, $4
  and  $2, $5, $1
  or   $7, $1, $9
Read After Write (RAW) dependencies: an instruction uses as a source a register that is the destination of a previous instruction, so it needs to read the register while the previous instruction has yet to write it.
Problem: The next instruction(s) will fetch the wrong values of the dependent registers because the correct values have not been written back yet.

Slide 12: RAR Dependency
  add  $1, $3, $5
  sub  $3, $5, $4
  and  $2, $4, $1
  or   $7, $1, $9
Read After Read (RAR) dependencies: two consecutive instructions use the same register as a source operand.
No Problem: As long as the registers are not modified, pipelining does not affect the normal execution process in this case.

Slide 13: WAW Dependency
  add  $1, $3, $5
  sub  $1, $5, $4
  and  $4, $4, $1
  or   $4, $1, $9
Write After Write (WAW) dependencies: two consecutive instructions use the same register as a destination operand.
No Problem: Writes occur during the last pipeline stage, and no inconsistency results from this situation because the instruction execution order is maintained.

Slide 14: WAR Dependency
  add  $1, $3, $5
  sub  $3, $5, $2
  and  $2, $4, $1
  or   $7, $1, $9
Write After Read (WAR) dependencies: an instruction uses as its destination a register that a previous instruction uses as a source operand.
No Problem: The read occurs in the ID stage and the write occurs in the WB stage, which means that the order of operations is not altered by the pipeline structure.
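The four dependency cases can be told apart by comparing the destination and source registers of two instructions. The sketch below is a minimal illustration: it assumes each instruction is described as a hypothetical `(dest, sources)` tuple of register numbers, which is a representation chosen here and not taken from the slides.

```python
def classify(first, second):
    """Return the dependency kinds between two instructions in program order.

    Each instruction is a (dest, sources) tuple of register numbers.
    """
    d1, s1 = first
    d2, s2 = second
    kinds = set()
    if d1 in s2:
        kinds.add("RAW")  # second reads what first writes
    if set(s1) & set(s2):
        kinds.add("RAR")  # both read the same register
    if d1 == d2:
        kinds.add("WAW")  # both write the same register
    if d2 in s1:
        kinds.add("WAR")  # second writes what first reads
    return kinds


add = (1, (3, 5))  # add $1, $3, $5
sub = (3, (1, 4))  # sub $3, $1, $4
print(sorted(classify(add, sub)))  # ['RAW', 'WAR']
```

Note that a single pair of instructions can show several kinds at once: here sub reads $1 (RAW) and also overwrites $3, which add reads (WAR). Per the slides, only the RAW case causes a problem in this pipeline.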

Slide 15: RAW Dependency Cases
  i:    add  $1, $3, $5
  i+1:  sub  $3, $1, $4
  i+2:  and  $2, $5, $1
  i+3:  or   $7, $1, $9
Case 1: dependency between instruction i and instruction i+1
Case 2: dependency between instruction i and instruction i+2
Case 3: dependency between instruction i and instruction i+3
Every case needs to be checked in order to determine whether it poses a real problem or not.

Slide 16: RAW Dependency Case 1
  i:    add  $1, $3, $5
  i+1:  sub  $3, $1, $4
  i+2:  and  $2, $5, $1
  i+3:  or   $7, $1, $9
Case 1: dependency between instruction i and instruction i+1
  Clock cycle   1    2    3    4    5    6
  add           IF   ID   EX   MEM  WB
  sub                IF   ID   EX   MEM  WB
The operand is fetched BEFORE it is written back.

Slide 17: RAW Dependency Case 2
  i:    add  $1, $3, $5
  i+1:  sub  $3, $1, $4
  i+2:  and  $2, $5, $1
  i+3:  or   $7, $1, $9
Case 2: dependency between instruction i and instruction i+2
  Clock cycle   1    2    3    4    5    6    7
  add           IF   ID   EX   MEM  WB
  sub                IF   ID   EX   MEM  WB
  and                     IF   ID   EX   MEM  WB
The operand is fetched BEFORE it is written back.

Slide 18: RAW Dependency Case 3
  i:    add  $1, $3, $5
  i+1:  sub  $3, $1, $4
  i+2:  and  $2, $5, $1
  i+3:  or   $7, $1, $9
Case 3: dependency between instruction i and instruction i+3
  Clock cycle   1    2    3    4    5    6    7    8
  add           IF   ID   EX   MEM  WB
  sub                IF   ID   EX   MEM  WB
  and                     IF   ID   EX   MEM  WB
  or                           IF   ID   EX   MEM  WB
The operand is fetched AT THE SAME TIME it is written back.

Slide 19: Register File Model
Case 3 does not pose a problem because we assume that, in the register file, writes occur BEFORE reads within the same clock cycle.
This is only true if we use the falling edge of the clock to write.
[Figure: clock waveform over the ID stage, annotated "Write is prepared here", "Write occurs here", "Read occurs here"]
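The three RAW cases reduce to comparing two cycle numbers: the cycle in which the consumer reads the register (its ID stage) and the cycle in which the producer writes it (its WB stage). The sketch below assumes the five-stage pipeline of the slides with the producer entering IF in cycle 1; the function name and the `write_before_read` flag are introduced here for illustration.

```python
ID_STAGE = 2  # position of ID in IF, ID, EX, MEM, WB (1-based)
WB_STAGE = 5  # position of WB


def raw_hazard(distance, write_before_read=True):
    """True if an instruction `distance` slots after the producer reads a stale value."""
    write_cycle = 1 + (WB_STAGE - 1)            # producer writes in cycle 5
    read_cycle = 1 + distance + (ID_STAGE - 1)  # consumer reads in its ID cycle
    if write_before_read:
        # Register file model of the slides: a write and a read in the
        # same cycle are safe because the write happens first.
        return read_cycle < write_cycle
    return read_cycle <= write_cycle


for d in (1, 2, 3):
    print(f"case {d}: hazard={raw_hazard(d)}")
```

With the write-before-read assumption, cases 1 and 2 are hazards and case 3 is not; without it, case 3 would also need attention.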

Slide 20: Data Dependency Solutions
A data dependency between instructions causes operands to be fetched at the wrong time.
The obvious remedy is to DELAY the fetch of operands until after the correct value is written in the register file:
– In software, by inserting NOP instructions
– In hardware, by stalling the pipeline

Slide 21: NOP Insertion
Inserting two NOP instructions solves the data dependency problem:
  Clock cycle        1    2    3    4    5    6    7    8    9
  add  $1, $3, $5    IF   ID   EX   MEM  WB
  nop                     IF   ID   EX   MEM  WB
  nop                          IF   ID   EX   MEM  WB
  sub  $3, $1, $4                   IF   ID   EX   MEM  WB
  and  $2, $5, $1                        IF   ID   EX   MEM  WB
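The software remedy can be sketched as a small pass over the instruction list. This is only an illustration, not a real assembler or compiler pass: instructions are hypothetical `(text, dest, sources)` tuples, and the pass assumes the register file model of the slides, in which a consumer three slots after its producer is safe.

```python
NOP = ("nop", None, ())


def insert_nops(program):
    """Insert nops so that no instruction reads a register written
    by either of the two instructions directly in front of it."""
    out = []
    for text, dest, srcs in program:
        needed = 0
        for k in (1, 2):  # look one and two slots back
            if len(out) >= k:
                prev_dest = out[-k][1]
                if prev_dest is not None and prev_dest in srcs:
                    needed = max(needed, 3 - k)
        out.extend([NOP] * needed)
        out.append((text, dest, srcs))
    return out


program = [
    ("add $1, $3, $5", 1, (3, 5)),
    ("sub $3, $1, $4", 3, (1, 4)),
    ("and $2, $5, $1", 2, (5, 1)),
]
for text, _, _ in insert_nops(program):
    print(text)
```

For the three instructions above, the pass places two nops between add and sub; and then sits four slots after add, so it needs no further padding.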

Slide 22: Pipeline Stall
Delaying the fetch of the operands can also be implemented in hardware, by stalling the pipeline.
  add  $1, $3, $5
  sub  $3, $1, $4
  and  $2, $5, $1
  or   $7, $1, $9
  addi $10, $6, $3
[Timing diagram: add, sub, and, or, addi entering the pipeline]
The instruction sub is maintained in the IF stage for two extra clock cycles.
It is equivalent to …

Slide 23: Pipeline Stall
… inserting bubbles in the pipeline.
[Timing diagram: add, sub, or, with bubbles between add and sub]
The instruction sub is maintained in the IF stage for two extra clock cycles, while virtual nop instructions are inserted in the pipeline (as bubbles).
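The effect of hardware stalling can be sketched by computing, for each instruction, the cycle in which its register read (ID stage) is allowed to happen. The model below is a simplification made for illustration: instructions issue in order, results become readable in the producer's WB cycle (writes before reads), and instructions are hypothetical `(dest, sources)` tuples.

```python
def id_cycles(program):
    """Cycle of the ID stage for each instruction, with stalls applied."""
    ready = {}   # register -> cycle in which its WB write happens
    cycles = []
    prev_id = 1  # so that the first instruction decodes in cycle 2
    for dest, srcs in program:
        earliest = prev_id + 1  # in-order: one cycle after the previous ID
        for reg in srcs:
            earliest = max(earliest, ready.get(reg, 0))
        cycles.append(earliest)
        ready[dest] = earliest + 3  # ID -> EX -> MEM -> WB
        prev_id = earliest
    return cycles


program = [
    (1, (3, 5)),  # add $1, $3, $5
    (3, (1, 4)),  # sub $3, $1, $4
    (2, (5, 1)),  # and $2, $5, $1
    (7, (1, 9)),  # or  $7, $1, $9
]
print(id_cycles(program))  # [2, 5, 6, 7]
```

Without the dependency the ID cycles would be 2, 3, 4, 5; sub is delayed by two cycles, and every instruction behind it inherits that delay.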

Slide 24: Branch Hazard
Examine the following instructions:
          beq  $1, $3, Target
          sub  $3, $1, $4
          and  $2, $5, $1
          ...
  Target: or   $3, $5, $9
When the branch is taken, the instructions sub and and are wrongfully executed because they are fetched BEFORE the branch decision is made.
  Clock cycle   1    2    3    4    5    6    7    8
  beq           IF   ID   EX   MEM  WB
  sub                IF   ID   EX   MEM  WB
  and                     IF   ID   EX   MEM  WB
  or                           IF   ID   EX   MEM  WB
The branch decision is taken and Target is fetched.
Problem: Modification of the program logic: unacceptable behavior.

Slide 25: Branch Hazard Solution
The solution is either to:
Not let the instructions after the branch finish execution when the branch is taken
– Transformation of the instructions into nops (in hardware)
Put instructions that do not disturb the logic of the program after the branch instruction, so that their execution does not modify the logic of the program
– Insertion of nop instructions after each branch instruction (by the compiler)

Slide 26: NOP Forcing
  Clock cycle   1    2    3    4    5    6    7    8
  beq           IF   ID   EX   MEM  WB
  sub                IF   ID   EX   MEM  WB     (transformed into NOP after branch taken)
  and                     IF   ID   EX   MEM  WB     (transformed into NOP after branch taken)
  or                           IF   ID   EX   MEM  WB
The branch decision is taken and Target is fetched.
After the branch is taken, the following instructions are forced to NOP instructions for the subsequent pipeline stages until the branch target instruction is fetched. A NOP has no effect. It is also said that the instruction execution is killed.
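NOP forcing can be sketched as a fetch trace in which the instructions fetched behind a taken branch are marked as killed. The sketch assumes that two instructions are fetched before the branch decision is known, which matches the diagram above; the function name, its parameters, and the list-of-strings program format are illustrative choices.

```python
def fetch_trace(program, branch_index, target_index, taken, delay=2):
    """Return (instruction, killed) pairs in fetch order around a branch."""
    trace = [(program[branch_index], False)]
    # Instructions fetched before the branch decision is known.
    for instr in program[branch_index + 1 : branch_index + 1 + delay]:
        trace.append((instr, taken))  # killed only if the branch is taken
    if taken:
        trace.append((program[target_index], False))
    return trace


program = [
    "beq $1, $3, Target",
    "sub $3, $1, $4",
    "and $2, $5, $1",
    "or $3, $5, $9",  # Target
]
for instr, killed in fetch_trace(program, 0, 3, taken=True):
    print(("killed  " if killed else "executes") + "  " + instr)
```

When the branch is not taken, the same two instructions simply continue through the pipeline and nothing is lost.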

Slide 27: NOP Insertion
          beq  $1, $3, Target
          sub  $3, $1, $4
          and  $2, $5, $1
          ...
  Target: or   $3, $5, $9
[Timing diagram: the instruction stream with nop instructions ahead of the target instruction or]
Insertion of NOP instructions, by the compiler, after each branch instruction does not disturb the logic of the program.

Slide 28: Delayed Branch
Insertion of NOP instructions introduces a substantial overhead that increases the instruction count significantly.
The idea is to move actual instructions from the area before the branch to the slots after the branch, to fill in the nop slots without modifying the logic of the program.
Original code:
  xor  $2, $2, $5        (no dependency)
  and  $1, $7, $8        (register $1 used by beq)
  sub  $10, $6, $4       (no dependency)
  add  $3, $6, $7        (register $3 used by beq)
  beq  $1, $3, Target
  sub  $3, $1, $4
  and  $2, $5, $1
Transformed code:
  and  $1, $7, $8
  add  $3, $6, $7
  beq  $1, $3, Target
  xor  $2, $2, $5
  sub  $10, $6, $4
  sub  $3, $1, $4
  and  $2, $5, $1

Slide 29: Delayed Branch
Consider the transformed code obtained after moving the xor and sub instructions after the beq instruction:
  and  $1, $7, $8
  add  $3, $6, $7
  beq  $1, $3, Target    <- a programmer who reads the code without any idea about the execution will think that the branch occurs here
  xor  $2, $2, $5
  sub  $10, $6, $4       <- the execution will actually make the branch take effect after this instruction
  sub  $3, $1, $4
  and  $2, $5, $1
So while the instructions xor and sub are executed, the second sub and the and instructions are not.
The branch instruction and the branch execution are separated by a two-instruction delay; that is why it is called a Delayed Branch.
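Whether an instruction may be moved into a slot after the branch can be checked by comparing its registers with those of every instruction it jumps over, the branch included. The sketch below is a simplified register-only check, not a real compiler scheduler; instructions are hypothetical `(dest, sources)` tuples, with `dest` set to `None` for the branch.

```python
def can_move_past(candidate, skipped):
    """True if `candidate` can be moved after all instructions in `skipped`
    without changing any value they read or write."""
    dest, srcs = candidate
    for other_dest, other_srcs in skipped:
        # The candidate must not write a register the others read or write.
        if dest is not None and (dest in other_srcs or dest == other_dest):
            return False
        # The candidate must not read a register the others write.
        if other_dest is not None and other_dest in srcs:
            return False
    return True


xor_ = (2, (2, 5))      # xor $2, $2, $5
and_ = (1, (7, 8))      # and $1, $7, $8
sub10 = (10, (6, 4))    # sub $10, $6, $4
add_ = (3, (6, 7))      # add $3, $6, $7
beq = (None, (1, 3))    # beq $1, $3, Target

print(can_move_past(xor_, [and_, sub10, add_, beq]))  # True
print(can_move_past(sub10, [add_, beq]))              # True
print(can_move_past(and_, [sub10, add_, beq]))        # False: beq reads $1
print(can_move_past(add_, [beq]))                     # False: beq reads $3
```

This reproduces the annotations of the original code: xor and the first sub have no dependency and can fill the two delay slots, while and and add must stay in front of beq.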

Slide 30: Pipelined Datapath

Slide 31: Inserting Pipeline Registers

Slide 32: Writing Back the Result

Slide 33: Destination Register Specifier?

Slide 34: Branch Logic

Slide 35: Pipelined Control

Slide 36: Data Hazards and Forwarding

Slide 37: Forwarding Unit