Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania 18042 ECE 313 - Computer Organization Lecture 17 - Pipelined.


Prof. John Nestor
ECE Department, Lafayette College
Easton, Pennsylvania 18042

ECE 313 - Computer Organization
Lecture 17 - Pipelined Processor Design 1
Fall 2004

Reading:

Portions of these slides are derived from:
- Textbook figures © 1998 Morgan Kaufmann Publishers, all rights reserved
- Tod Amon's COD2e Slides © 1998 Morgan Kaufmann Publishers, all rights reserved
- Dave Patterson's CS 152 Slides - Fall 1997 © UCB
- Rob Rutenbar's Slides - Fall 1999 CMU
- Other sources as noted

ECE 313 Fall 2004, Lecture 17 - Pipelining 1

Roadmap for the term: major topics
- Overview / Abstractions and Technology
- Performance
- Instruction sets
- Logic & arithmetic
- Processor Implementation
  - Single-cycle implementation
  - Multicycle implementation
  - Pipelined implementation (current topic)
- Memory systems
- Input/Output

Pipelining Outline
- Introduction
  - Defining Pipelining (current topic)
  - Pipelining Instructions
  - Hazards
- Pipelined Processor Design
  - Datapath
  - Control
- Advanced Pipelining
  - Superscalar
  - Dynamic Pipelining
- Examples

What is Pipelining?
- A way of speeding up execution of instructions
- Key idea: overlap execution of multiple instructions
- Analogy: doing your laundry
  1. Run load through washer
  2. Run load through dryer
  3. Fold clothes
  4. Put away clothes
  5. Go to 1
- Observation: we can start another load as soon as we finish step 1!

The Laundry Analogy
- Ann, Brian, Cathy, and Dave each have one load of clothes to wash, dry, fold, and put away
- Washer takes 30 minutes
- Dryer takes 30 minutes
- "Folder" takes 30 minutes
- "Stasher" takes 30 minutes to put clothes into drawers
[Figure: the four loads, labeled A, B, C, D]

If we do laundry sequentially
[Figure: timeline from 6 PM into the AM hours with tasks A, B, C, D run one after another in 30-minute steps; axes: Time, Task Order]
- Time required: 8 hours for 4 loads

To Pipeline, We Overlap Tasks
[Figure: timeline starting at 6 PM with tasks A, B, C, D overlapped in 30-minute steps; axes: Time, Task Order]
- Time required: 3.5 hours for 4 loads
- Latency of each load remains 2 hours
- Throughput improves by a factor of about 2.3 (8 / 3.5) for 4 loads; the factor grows toward 4 as more loads are added
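The laundry timings can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are made up for this example, and it assumes every stage takes exactly 30 minutes and that a new load can start as soon as the washer is free.

```python
def sequential_minutes(loads, stages=4, stage_minutes=30):
    """Each load runs through every stage before the next load starts."""
    return loads * stages * stage_minutes


def pipelined_minutes(loads, stages=4, stage_minutes=30):
    """The first load needs all stages; each later load finishes one stage time later."""
    return (stages + loads - 1) * stage_minutes


def speedup(loads, stages=4, stage_minutes=30):
    """Ratio of sequential time to pipelined time for the same number of loads."""
    return (sequential_minutes(loads, stages, stage_minutes)
            / pipelined_minutes(loads, stages, stage_minutes))


if __name__ == "__main__":
    for loads in (4, 40, 400):
        print(loads,
              sequential_minutes(loads) / 60,
              pipelined_minutes(loads) / 60,
              round(speedup(loads), 2))
```

For 4 loads this gives 8 hours sequentially and 3.5 hours pipelined. Under this model the speedup moves toward the number of stages (4) as the number of loads grows, because the time spent filling and draining the pipeline matters less and less.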

Pipelining a Digital System
- Key idea: break a big computation up into pieces
- Separate each piece with a pipeline register
[Figure: a 1 ns block of logic split into 200 ps stages separated by pipeline registers]

Pipelining a Digital System
- Why do this? Because it's faster for repeated computations
- Non-pipelined: 1 operation finishes every 1 ns
- Pipelined: 1 operation finishes every 200 ps

Comments about pipelining
- Pipelining increases throughput, but does not reduce latency
  - Answer available every 200 ps, BUT
  - A single computation still takes 1 ns
- Limitations:
  - Computations must be divisible into stages of roughly equal delay
  - Pipeline registers add overhead
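A small model makes the throughput/latency distinction and the register overhead concrete. This is a sketch under stated assumptions: `pipeline_timing` is a hypothetical helper, the logic is assumed to split into perfectly equal stages, and the 20 ps register overhead in the second call is an invented figure, not one from the slides.

```python
def pipeline_timing(total_delay_ps, stages, register_overhead_ps=0):
    """Return (clock_period_ps, latency_ps) for an evenly divided pipeline.

    clock_period_ps: time between finished operations once the pipeline is full.
    latency_ps: time for one operation to pass through all stages.
    """
    stage_delay = total_delay_ps / stages
    clock_period = stage_delay + register_overhead_ps
    latency = clock_period * stages
    return clock_period, latency


if __name__ == "__main__":
    # Ideal case from the slides: 1 ns of logic cut into five 200 ps stages.
    print(pipeline_timing(1000, 5))
    # Same logic with an assumed 20 ps of overhead per pipeline register.
    print(pipeline_timing(1000, 5, register_overhead_ps=20))
```

With zero overhead the period is 200 ps and the latency stays at 1 ns. With any nonzero overhead the period is still far shorter than 1 ns, but the latency of a single computation becomes slightly worse than in the non-pipelined design.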

Pipelining Outline
- Introduction
  - Defining Pipelining
  - Pipelining Instructions (current topic)
  - Hazards
- Pipelined Processor Design
  - Datapath
  - Control
- Advanced Pipelining
  - Superscalar
  - Dynamic Pipelining
- Examples

Pipelining a Processor
- Recall the 5 steps in instruction execution:
  1. Instruction Fetch
  2. Instruction Decode and Register Read
  3. Execute operation or calculate address
  4. Memory access
  5. Write result into register
- Review: Single-Cycle Processor
  - All 5 steps done in a single clock cycle
  - Dedicated hardware required for each step
- What happens if we break execution into multiple cycles, but keep the extra hardware?

Review - Single-Cycle Processor
[Figure: single-cycle datapath divided into five sections]
- IF: Instruction Fetch
- ID: Instruction Decode
- EX: Execute / Address Calc.
- MEM: Memory Access
- WB: Write Back

Pipelining - Key Idea
- Question: What happens if we break execution into multiple cycles, but keep the extra hardware?
- Answer: in the best case, we can start executing a new instruction on each clock cycle - this is pipelining
- Pipelining stages:
  - IF - Instruction Fetch
  - ID - Instruction Decode
  - EX - Execute / Address Calculation
  - MEM - Memory Access (read / write)
  - WB - Write Back (results into register file)

Basic Pipelined Processor
[Figure: pipelined datapath with pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB]

Single-Cycle vs. Pipelined Execution
[Figure: two timing diagrams, Non-Pipelined and Pipelined]

Comments about Pipelining
- The good news
  - Multiple instructions are being processed at the same time
  - This works because stages are isolated by registers
  - Best-case speedup of N (for N pipeline stages)
- The bad news
  - Instructions interfere with each other - hazards
    - Example: different instructions may need the same piece of hardware (e.g., memory) in the same clock cycle
    - Example: an instruction may require a result produced by an earlier instruction that is not yet complete
  - Worst case: must suspend execution - stall

Pipelined Example - Executing Multiple Instructions
- Consider the following instruction sequence:
    lw  $r0, 10($r1)
    sw  $r3, 20($r4)
    add $r5, $r6, $r7
    sub $r8, $r9, $r10

Executing Multiple Instructions
[Figures: one datapath snapshot per clock cycle, labeled with the instructions in the pipeline]
- Clock Cycle 1: LW
- Clock Cycle 2: LW, SW
- Clock Cycle 3: LW, SW, ADD
- Clock Cycle 4: LW, SW, ADD, SUB
- Clock Cycle 5: LW, SW, ADD, SUB
- Clock Cycle 6: SW, ADD, SUB
- Clock Cycle 7: ADD, SUB
- Clock Cycle 8: SUB

Alternative View - Multicycle Diagram
[Figure: multicycle pipeline diagram]
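The cycle-by-cycle occupancy for the four-instruction example can be reproduced with a short script. This is a minimal sketch, assuming an ideal five-stage pipeline with no stalls; `pipeline_schedule` and `STAGES` are names chosen for this illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_schedule(instructions):
    """Return {cycle: {stage: instruction}} for an ideal pipeline with no stalls.

    Instruction i (0-based) enters IF in cycle i + 1 and advances one stage per cycle.
    """
    schedule = {}
    for i, instr in enumerate(instructions):
        for s, stage in enumerate(STAGES):
            cycle = i + s + 1
            schedule.setdefault(cycle, {})[stage] = instr
    return schedule


if __name__ == "__main__":
    program = ["lw", "sw", "add", "sub"]
    schedule = pipeline_schedule(program)
    for cycle in sorted(schedule):
        cells = []
        for stage in STAGES:
            occupant = schedule[cycle].get(stage, "-")
            cells.append(stage + ":" + occupant.ljust(3))
        print("cycle", cycle, " ".join(cells))
```

Each printed row corresponds to one clock-cycle snapshot, and reading the rows top to bottom gives the multicycle view. Four instructions on five stages finish in 5 + 4 - 1 = 8 cycles, matching the last snapshot in the example.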

Pipelining Outline
- Introduction
  - Defining Pipelining
  - Pipelining Instructions
  - Hazards (current topic)
- Pipelined Processor Design
  - Datapath
  - Control
- Advanced Pipelining
  - Superscalar
  - Dynamic Pipelining
- Examples

Pipeline Hazards
- Situations where one instruction cannot immediately follow another
- Types of hazards
  - Structural hazards - attempt to use the same resource twice
  - Control hazards - attempt to make a decision before the condition is evaluated
  - Data hazards - attempt to use data before it is ready
- Can always resolve hazards by waiting

Structural Hazards
- Attempt to use the same resource twice at the same time
- Example: single memory for instructions and data
  - Accessed by IF stage
  - Accessed at same time by MEM stage
- Solutions
  - Delay second access by one clock cycle, OR
  - Provide separate memories for instructions and data
    - This is what the book does
    - This is called a "Harvard Architecture"
    - Real pipelined processors have separate caches

Example Structural Hazard - Single Memory
[Figure: pipeline diagram with the memory conflict marked]
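The single-memory conflict can be located mechanically. The sketch below is an illustration only: `memory_conflicts` is a hypothetical helper, it assumes an ideal five-stage pipeline with no stalls, and it assumes that only `lw` and `sw` use memory in the MEM stage.

```python
def memory_conflicts(instructions):
    """List (cycle, instr_in_MEM, instr_in_IF) where one shared memory is needed twice.

    Instruction i (0-based) is in IF during cycle i + 1 and in MEM during cycle i + 4.
    """
    uses_data_memory = {"lw", "sw"}  # assumption: only loads and stores access memory in MEM
    conflicts = []
    for i, instr in enumerate(instructions):
        if instr not in uses_data_memory:
            continue
        mem_cycle = i + 4
        fetch_index = mem_cycle - 1  # index of the instruction being fetched in that cycle
        if fetch_index < len(instructions):
            conflicts.append((mem_cycle, instr, instructions[fetch_index]))
    return conflicts


if __name__ == "__main__":
    print(memory_conflicts(["lw", "sw", "add", "sub"]))
    print(memory_conflicts(["lw", "sw", "add", "sub", "add"]))
```

For the four-instruction example the only conflict is in cycle 4, where `lw` reads data memory while `sub` is being fetched. With separate instruction and data memories the two accesses no longer compete, so the list would be empty by construction.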

Control Hazards
- Attempt to make a decision before the condition is evaluated
- Example: beq $s0, $s1, offset
- Assume we add hardware to the second stage to:
  - Compare fetched registers for equality
  - Compute branch target
- This allows the branch to be taken at the end of the second clock cycle
- But this still means the result is not ready when we want to load the next instruction!

Control Hazard Solutions
- Stall - stop loading instructions until result is available
- Predict - assume an outcome and continue fetching (undo if prediction is wrong)
- Delayed branch - specify in architecture that the following instruction is always executed

Control Hazard - Stall
[Figure annotations: "beq writes PC here", "new PC used here"]

Control Hazard - Correct Prediction
[Figure annotation: "Fetch assuming branch taken"]

Control Hazard - Incorrect Prediction
[Figure annotation: "Squashed" instruction]

Control Hazard - Delayed Branch
[Figure annotations: "always executes", "correct PC avail. here"]

Summary - Control Hazard Solutions
- Stall - stop fetching instructions until result is available
  - Significant performance penalty
  - Hardware required to stall
- Predict - assume an outcome and continue fetching (undo if prediction is wrong)
  - Performance penalty only when guess is wrong
  - Hardware required to "squash" instructions
- Delayed branch - specify in architecture that the following instruction is always executed
  - Compiler re-orders instructions into delay slot
  - Insert "NOP" (no-op) operations when the delay slot can't be used (~50%)
  - This is how original MIPS worked
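A rough way to compare the three approaches is to estimate cycles per instruction (CPI). The sketch below is a back-of-the-envelope model, not a measurement: `effective_cpi` is a hypothetical helper, the 20% branch frequency and 30% misprediction rate are invented for illustration, the one-cycle penalty assumes the branch is resolved in the second stage, and the 50% NOP rate is taken from the "~50%" figure above.

```python
def effective_cpi(branch_fraction, penalty_cycles, penalized_fraction, base_cpi=1.0):
    """Base CPI plus the average number of wasted cycles per instruction.

    branch_fraction: share of instructions that are branches.
    penalty_cycles: cycles lost each time a branch is penalized.
    penalized_fraction: share of branches that actually pay the penalty.
    """
    return base_cpi + branch_fraction * penalized_fraction * penalty_cycles


if __name__ == "__main__":
    branches = 0.20  # assumed branch frequency
    penalty = 1      # assumed one-cycle bubble per penalized branch

    stall = effective_cpi(branches, penalty, penalized_fraction=1.0)    # every branch waits
    predict = effective_cpi(branches, penalty, penalized_fraction=0.3)  # assumed 30% wrong guesses
    delayed = effective_cpi(branches, penalty, penalized_fraction=0.5)  # delay slot holds a NOP half the time

    print("stall  ", stall)
    print("predict", predict)
    print("delayed", delayed)
```

Under these assumptions stalling is the most expensive option, prediction costs only as much as its misprediction rate, and the delayed branch costs whatever fraction of delay slots the compiler has to fill with NOPs. Different branch frequencies or deeper pipelines change the numbers but not the form of the comparison.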

Data Hazards
- Attempt to use data before it is ready
- Solutions
  - Stalling - wait until result is available
  - Forwarding - make data available inside datapath
  - Reordering instructions - use compiler to avoid hazards
- Examples:
    add $s0, $t0, $t1   ; $s0 = $t0+$t1
    sub $t2, $s0, $t3   ; $t2 = $s0-$t3

    lw  $s0, 0($t0)     ; $s0 = MEM[$t0]
    sub $t2, $s0, $t3   ; $t2 = $s0-$t3

Data Hazard - Stalling

Data Hazards - Forwarding
- Key idea: connect new value directly to next stage
- Still read $s0 from the register file, but ignore it in favor of the new result
- Problem: what about load instructions?

Data Hazards - Forwarding the load result
- STALL still required for load - data is available only after MEM
- The MIPS architecture calls this a delayed load; initial implementations required the compiler to deal with it

Data Hazards - Reordering Instructions
- Assuming we have data forwarding, what are the hazards in this code?
    lw $t0, 0($t1)
    lw $t2, 4($t1)
    sw $t2, 0($t1)
    sw $t0, 4($t1)
- Reorder instructions to remove hazard:
    lw $t0, 0($t1)
    lw $t2, 4($t1)
    sw $t0, 4($t1)
    sw $t2, 0($t1)
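The effect of the reordering can be checked by counting load-use stalls. This is a simplified sketch: `load_use_stalls` and the tuple encoding `(opcode, destination, sources)` are choices made for this illustration, and the model assumes forwarding handles everything except an instruction that uses a loaded register immediately after the `lw`, which costs one stall.

```python
def load_use_stalls(program):
    """Count stalls caused by using a loaded register in the very next instruction.

    Each instruction is (opcode, destination_register_or_None, list_of_source_registers).
    """
    stalls = 0
    for previous, current in zip(program, program[1:]):
        opcode, destination, _ = previous
        if opcode == "lw" and destination in current[2]:
            stalls += 1
    return stalls


# Encoding of the example: a store has no destination and reads both of its registers.
original = [
    ("lw", "$t0", ["$t1"]),
    ("lw", "$t2", ["$t1"]),
    ("sw", None, ["$t2", "$t1"]),
    ("sw", None, ["$t0", "$t1"]),
]
reordered = [original[0], original[1], original[3], original[2]]

if __name__ == "__main__":
    print("original stalls: ", load_use_stalls(original))
    print("reordered stalls:", load_use_stalls(reordered))
```

In the original order, `sw $t2` immediately follows the `lw` that produces `$t2`, so one stall is needed. After the swap, `sw $t0` sits between them and the count drops to zero. The two stores write different addresses, so exchanging them does not change the result in memory.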

Summary - Pipelining Overview
- Pipelining increases throughput (but does not reduce latency)
- Hazards limit performance
  - Structural hazards
  - Control hazards
  - Data hazards

Pipelining Outline - Coming Up
- Introduction
  - Defining Pipelining
  - Pipelining Instructions
  - Hazards
- Pipelined Processor Design (coming up)
  - Datapath
  - Control
- Advanced Pipelining
  - Superscalar
  - Dynamic Pipelining
- Examples