ECE369 Pipelining

Presentation transcript:

1 ECE369 Pipelining

2 ECE369 addm (rs), rt # Memory[R[rs]] = R[rt] + Memory[R[rs]]. Assume that memory can be read and written in the same cycle (as with the register file, although this is unlikely to be efficient in a real machine). All instructions use the same format (shown below), but not all instructions use all of the fields. Assume that each unused field is set to 0.
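The register-transfer description of addm can be checked with a small sketch. This is only an illustration of the semantics stated on the slide; the register names, register contents, and memory contents below are hypothetical values chosen for the example.

```python
def addm(regs, mem, rs, rt):
    """Memory[R[rs]] = R[rt] + Memory[R[rs]]; no register is written."""
    addr = regs[rs]                    # R[rs] supplies the memory address
    mem[addr] = regs[rt] + mem[addr]   # read and write the same memory word


regs = {"rs": 100, "rt": 7}   # hypothetical register contents
mem = {100: 35}               # hypothetical memory word at address 100
addm(regs, mem, "rs", "rt")
print(mem[100])  # 42
```

Note that the instruction reads memory, uses the ALU, and writes memory, in that order, which is why the slide assumes a same-cycle memory read and write.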

3 ECE369 Control signal settings for addm:
Instr RegDst RegWrite MemRead MemWrite ALUsrc MemToALU DataSrc PCSrc ALUOp
addm x Add

4 ECE369 Pipelining. One CPU manufacturer has proposed the 10-stage pipeline shown below. Its stages correspond to the MIPS pipeline as follows:
-Instructions are fetched in the FET stage.
-Register reading is performed in the REG stage.
-ALU operations and memory accesses are both done in the EXE stage.
-Branches are resolved in the DET stage.
-WRB is the writeback stage.
-A write and a read of memory or the register file can occur in the same cycle.
Without forwarding, how many stall cycles are needed for the following code? Show your work to get credit.
lw $t0, 0($a0)
add $v1, $t0, $t0
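The counting rule behind this kind of question can be sketched as a small helper. The pipeline figure is not part of this transcript, so the stage order of the 10-stage design is unknown here; the example instead uses the classic five-stage MIPS pipeline. The function name and its parameters are hypothetical, and the rule assumes that a register written in a cycle can be read in that same cycle.

```python
def stalls_without_forwarding(stages, read_stage, write_stage, distance=1):
    """Stall cycles for a consumer issued `distance` instructions after its producer.

    Without stalls the consumer reaches the read stage `distance` cycles after
    the producer did; it must wait until the producer is in the write stage
    (same-cycle write then read is allowed).
    """
    read_pos = stages.index(read_stage)
    write_pos = stages.index(write_stage)
    return max(0, write_pos - read_pos - distance)


five_stage = ["IF", "ID", "EX", "MEM", "WB"]   # classic MIPS pipeline, not the 10-stage one
lw_add_stalls = stalls_without_forwarding(five_stage, "ID", "WB")
print(lw_add_stalls)  # 2
```

For the 10-stage pipeline the same rule would apply with REG as the read stage and WRB as the write stage, once their positions are taken from the figure.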

5 ECE369 Solution

6 ECE369 Assume that the initial value of R3 is R2+396. How many cycles does this loop take to execute?
Loop: LW R1, 0(R2)
ADDI R1, R1, #1
SW R1, 0(R2)
ADDI R2, R2, #4
SUB R4, R3, R2
BNEZ R4, Loop
-No forwarding or bypassing hardware.
-All memory and register writes occur during the first half of the clock cycle and reads occur during the second half (a register read and a register write in the same cycle forward through the register file).
-Branching is handled by flushing the pipeline, and branches are resolved in the Memory stage.

7 ECE369 Branches are resolved in MEM. The second iteration starts 17 clock cycles after the first iteration starts. The last iteration takes 18 cycles. The loop executes 396/4 = 99 times. => 98*17 + 18 = 1684 cycles.
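The arithmetic on this slide follows one pattern: every iteration except the last contributes its start-to-start interval, and the last contributes its full length. A minimal sketch, with a hypothetical function name and parameter names:

```python
def loop_cycles(byte_span, stride, start_interval, last_iteration_cycles):
    """Return (iterations, total cycles) for a loop that walks byte_span in steps of stride."""
    iterations = byte_span // stride
    total = (iterations - 1) * start_interval + last_iteration_cycles
    return iterations, total


iterations, total = loop_cycles(396, 4, 17, 18)
print(iterations, total)  # 99 1684
```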

8 ECE369 Assume that the initial value of R3 is R2+396. How many cycles does this loop take to execute?
Loop: LW R1, 0(R2)
ADDI R1, R1, #1
SW R1, 0(R2)
ADDI R2, R2, #4
SUB R4, R3, R2
BNEZ R4, Loop
-With forwarding and bypassing hardware.
-All memory and register writes occur during the first half of the clock cycle and reads occur during the second half (a register read and a register write in the same cycle forward through the register file).
-Assume that the branch is resolved in the Memory stage and handled by predicting it as not taken. {Use (m) for branch misprediction in the table}

9 ECE369 Branches are resolved in MEM. The second iteration starts 10 clock cycles after the first iteration starts. The last iteration takes 11 cycles. The loop executes 99 times. => 98*10 + 11 = 991 cycles.
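The per-iteration figures (17 and 18 cycles without forwarding, 10 and 11 with forwarding) can be reproduced with a simplified timing model. This is a sketch under stated assumptions, not the slides' pipeline diagram: a five-stage IF, ID, EX, MEM, WB pipeline, one instruction leaving ID per cycle at most, all source operands needed at the start of EX, values produced by the previous iteration already available, and the next iteration fetched in the cycle after the branch's MEM stage. The tuple encoding and function name are hypothetical.

```python
# Each instruction: (opcode, destination register or None, source registers).
LOOP = [
    ("LW",   "R1", ["R2"]),
    ("ADDI", "R1", ["R1"]),
    ("SW",   None, ["R1", "R2"]),
    ("ADDI", "R2", ["R2"]),
    ("SUB",  "R4", ["R3", "R2"]),
    ("BNEZ", None, ["R4"]),
]


def iteration_timing(loop, forwarding):
    """Return (start-to-start interval, cycles for one full iteration).

    An instruction that completes ID in cycle c is in EX, MEM, WB in
    cycles c+1, c+2, c+3. The first instruction is fetched in cycle 1.
    """
    id_done = []     # cycle in which each instruction completes ID
    producer = {}    # register -> index of its latest writer in this iteration
    for i, (op, dest, sources) in enumerate(loop):
        cycle = 2 if i == 0 else id_done[-1] + 1
        for reg in sources:
            if reg in producer:
                p = producer[reg]
                if not forwarding:
                    gap = 3   # ID may complete no earlier than the producer's WB cycle
                elif loop[p][0] == "LW":
                    gap = 2   # load data is forwarded after MEM: one bubble if adjacent
                else:
                    gap = 1   # ALU result is forwarded after EX: no bubble
                cycle = max(cycle, id_done[p] + gap)
        id_done.append(cycle)
        if dest is not None:
            producer[dest] = i
    branch_mem = id_done[-1] + 2   # next iteration is fetched in the following cycle
    branch_wb = id_done[-1] + 3    # last cycle of the iteration
    return branch_mem, branch_wb


print(iteration_timing(LOOP, forwarding=False))  # (17, 18)
print(iteration_timing(LOOP, forwarding=True))   # (10, 11)
```

Without forwarding, the three dependent pairs (LW to ADDI, ADDI to SW, and the ADDI, SUB, BNEZ chain) each wait for a writeback; with forwarding only the load-use pair costs a bubble, and the remaining overhead is the three cycles lost to the branch resolved in MEM.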

10 ECE369 Assume that the initial value of R3 is R2+396. How many cycles does this loop take to execute?
Loop: LW R1, 0(R2)
ADDI R1, R1, #1
SW R1, 0(R2)
ADDI R2, R2, #4
SUB R4, R3, R2
BNEZ R4, Loop
Assuming the MIPS pipeline with a single-cycle delayed branch and normal forwarding and bypassing hardware, schedule the instructions in the loop, including the branch delay slot. You may reorder the instructions and modify the individual instruction operands, but do not undertake other loop transformations that change the number or opcode of the instructions in the loop. Show a pipeline timing diagram and compute the number of cycles needed to execute the entire loop.

11 ECE369 98*6 + 10 = 598 clocks
Loop: LW R1, 0(R2)
ADDI R1, R1, #1
SW R1, 0(R2)
ADDI R2, R2, #4
SUB R4, R3, R2
BNEZ R4, Loop
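The scheduled loop itself is not included in this transcript. The sketch below checks one candidate schedule, which is an assumption rather than the slide's answer: the pointer update and the SUB are moved up, and the store goes into the branch delay slot with its offset changed to -4. The tiny interpreter, the tuple encoding, the base address, and the initial memory contents are all hypothetical; the check only confirms that the candidate leaves memory in the same state as the original loop.

```python
ORIGINAL = [
    ("LW", "R1", "R2", 0), ("ADDI", "R1", "R1", 1), ("SW", "R1", "R2", 0),
    ("ADDI", "R2", "R2", 4), ("SUB", "R4", "R3", "R2"), ("BNEZ", "R4", None, None),
]
# Candidate schedule (an assumption): SW sits in the delay slot, offset adjusted to -4.
SCHEDULED = [
    ("LW", "R1", "R2", 0), ("ADDI", "R2", "R2", 4), ("SUB", "R4", "R3", "R2"),
    ("ADDI", "R1", "R1", 1), ("BNEZ", "R4", None, None), ("SW", "R1", "R2", -4),
]


def step(instr, regs, mem):
    """Execute one non-branch instruction."""
    op, a, b, c = instr
    if op == "LW":
        regs[a] = mem[regs[b] + c]
    elif op == "SW":
        mem[regs[b] + c] = regs[a]
    elif op == "ADDI":
        regs[a] = regs[b] + c
    elif op == "SUB":
        regs[a] = regs[b] - regs[c]


def run(loop, base, words, delayed):
    """Run the loop over `words` memory words and return the final memory."""
    regs = {"R1": 0, "R2": base, "R3": base + 4 * words, "R4": 0}
    mem = {base + 4 * i: i for i in range(words)}
    pc = 0
    while pc < len(loop):
        instr = loop[pc]
        if instr[0] == "BNEZ":
            taken = regs[instr[1]] != 0
            if delayed:
                step(loop[pc + 1], regs, mem)   # the delay slot always executes
                pc = 0 if taken else pc + 2
            else:
                pc = 0 if taken else pc + 1
        else:
            step(instr, regs, mem)
            pc += 1
    return mem


same = run(ORIGINAL, 0x1000, 99, delayed=False) == run(SCHEDULED, 0x1000, 99, delayed=True)
print(same)  # True
```

In this candidate no instruction reads a loaded value in the cycle right after the load, so with forwarding each iteration would issue its six instructions back to back, which is consistent with the 6 cycles per iteration used in the total above.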