Chap 6.1 Computer Architecture Chapter 6 Enhancing Performance with Pipelining.

Chap 6.2 Contents
• Pipelining Overview
• Pipelined Datapath
• Pipelined Control
• Hazards
  - Structural Hazards
  - Data Hazards
  - Branch Hazards
• Dynamic Scheduling
• Examples of Pipelining
• Summary

Chap 6.3 The Big Picture: Where are We Now?
• The Five Classic Components of a Computer: Processor (Control and Datapath), Memory, Input, Output
• Back to the datapath again to try to speed it up some more
  - Single cycle datapath: great CPI but terrible cycle time (long critical path)
  - Multiple cycle datapath: good cycle time but poor CPI
  - Pipelining: get the best of both!

Chap 6.4 Sequential Laundry
• Sequential laundry takes 8 hours for 4 loads
• If they learned pipelining, how long would laundry take?
[Figure: loads A, B, C, D in task order against a time axis starting at 6 PM, in 30-minute steps]

Chap 6.5 Pipelined Laundry: Start work ASAP
• Pipelined laundry takes 3.5 hours for 4 loads!
[Figure: loads A, B, C, D in task order, overlapped in 30-minute steps on a time axis from 6 PM to 2 AM]
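The two laundry totals can be checked with a short calculation. This is a minimal sketch that assumes four 30-minute steps per load (an inference from the 8-hour sequential total; the slides do not list the steps). The function names are illustrative only.

```python
def sequential_minutes(loads: int, stages: int, stage_minutes: int) -> int:
    """Each load finishes every step before the next load starts."""
    return loads * stages * stage_minutes


def pipelined_minutes(loads: int, stages: int, stage_minutes: int) -> int:
    """The first load needs `stages` steps; every later load finishes one step after it."""
    return (stages + loads - 1) * stage_minutes


if __name__ == "__main__":
    # Assumed workload: 4 loads (A-D), 4 steps of 30 minutes each.
    print(sequential_minutes(4, 4, 30) / 60)  # 8.0 (hours)
    print(pipelined_minutes(4, 4, 30) / 60)   # 3.5 (hours)
```

Note that a single load still takes 2 hours under this assumption; only the total for all four loads shrinks.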

Chap 6.6 One way to break up the datapath operations
• Ifetch: Instruction Fetch. Fetch the instruction from the Instruction Memory
• Reg/Dec: Register Fetch and Instruction Decode
• Exec: Calculate the memory address
• Mem: Read the data from the Data Memory
• Wr: Write the data back to the register file

        Cycle 1   Cycle 2   Cycle 3   Cycle 4   Cycle 5
Load    Ifetch    Reg/Dec   Exec      Mem       Wr

Chap 6.7 Apply pipelining
[Figure: Inst 0 to Inst 4 in instruction order against time (clock cycles); each instruction passes through Im, Reg, ALU, Dm, Reg]

Chap 6.8 Conventional Pipelined Execution Representation
Time runs left to right; program flow runs top to bottom.

IFetch Reg    Exec   Mem    WB
       IFetch Reg    Exec   Mem    WB
              IFetch Reg    Exec   Mem    WB
                     IFetch Reg    Exec   Mem    WB
                            IFetch Reg    Exec   Mem    WB
                                   IFetch Reg    Exec   Mem    WB

Chap 6.9 Pipelining
• Improve performance by increasing instruction throughput
• Ideal speedup is number of stages in the pipeline. Do we achieve this?
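The question on this slide can be made concrete with a standard idealized model. The model is an assumption, not something stated on the slides: k equal-length stages, n instructions, no hazards, and an unpipelined baseline that spends k stage-times on every instruction.

```latex
\[
\text{Speedup}(n) \;=\; \frac{n\,k}{k + (n - 1)},
\qquad
\lim_{n \to \infty} \text{Speedup}(n) \;=\; k .
\]
```

Under this model the speedup only approaches the number of stages for long instruction streams. Fill and drain cycles, unbalanced stages, and hazards keep a real pipeline below that bound.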

Chap 6.10 Single Cycle, Multiple Cycle, vs. Pipeline
[Figure: timing of Load, Store, and R-type under three implementations.
 Single Cycle Implementation: Load in Cycle 1, Store in Cycle 2, with the unused part of the Store cycle marked "Waste".
 Multiple Cycle Implementation (Cycles 1 to 10): Load takes Ifetch, Reg, Exec, Mem, Wr; Store takes Ifetch, Reg, Exec, Mem; R-type then begins with Ifetch.
 Pipeline Implementation: Load, Store, and R-type each pass through Ifetch, Reg, Exec, Mem, Wr, overlapped in time.]

Chap 6.11 How much improvement can pipelining give us?
• Suppose we execute 100 instructions
• Single Cycle Machine: 45 ns/cycle x 1 CPI x 100 inst = 4500 ns
• Multicycle Machine: 10 ns/cycle x 4.6 CPI (due to inst mix) x 100 inst = 4600 ns
• Ideal pipelined machine: 10 ns/cycle x (1 CPI x 100 inst + 4 cycle drain) = 1040 ns
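The three totals on this slide can be reproduced with a few lines of arithmetic. This is a sketch using the slide's own numbers; the function names and the 5-stage default (which yields the 4-cycle drain) are illustrative assumptions.

```python
def single_cycle_ns(instructions: int, cycle_ns: float = 45.0) -> float:
    """One long cycle per instruction (CPI = 1)."""
    return cycle_ns * 1 * instructions


def multicycle_ns(instructions: int, cycle_ns: float = 10.0, cpi: float = 4.6) -> float:
    """Short cycles, but several of them per instruction on average."""
    return cycle_ns * cpi * instructions


def pipelined_ns(instructions: int, cycle_ns: float = 10.0, stages: int = 5) -> float:
    """One instruction completes per cycle, plus (stages - 1) cycles to drain."""
    return cycle_ns * (instructions + stages - 1)


if __name__ == "__main__":
    n = 100
    print(f"single cycle: {single_cycle_ns(n):.0f} ns")  # single cycle: 4500 ns
    print(f"multicycle:   {multicycle_ns(n):.0f} ns")    # multicycle:   4600 ns
    print(f"pipelined:    {pipelined_ns(n):.0f} ns")     # pipelined:    1040 ns
    print(f"speedup vs single cycle: {single_cycle_ns(n) / pipelined_ns(n):.2f}")  # 4.33
```

With only 100 instructions the pipelined machine is about 4.3 times faster than the single-cycle one, not the 5 times an ideal five-stage pipeline would suggest.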

Chap 6.12 Pipelining
• What makes it easy?
  - all instructions are the same length
  - just a few instruction formats
  - memory operands appear only in loads and stores
• What makes it hard?
  - structural hazards: suppose we had only one memory
  - control hazards: need to worry about branch instructions
  - data hazards: an instruction depends on a previous instruction
• We'll build a simple pipeline and look at these issues
• We'll talk about modern processors and what really makes it hard:
  - exception handling
  - trying to improve performance with out-of-order execution, etc.

Chap 6.13 Basic Idea
• What do we need to add to actually split the datapath into stages?

Chap 6.14 Pipelined Datapath
• Can you find a problem even if there are no dependencies?
• What instructions can we execute to manifest the problem?

Chap 6.15 Corrected Datapath

Chap 6.16 Graphically Representing Pipelines
• Can help with answering questions like:
  - how many cycles does it take to execute this code?
  - what is the ALU doing during cycle 4?
  - use this representation to help understand datapaths
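Both questions on this slide reduce to index arithmetic when nothing stalls. The sketch below assumes a five-stage pipeline with no hazards and one instruction entering per cycle; the stage labels follow the earlier slides, and the helper names are hypothetical.

```python
STAGES = ["IFetch", "Reg", "Exec", "Mem", "WB"]


def total_cycles(num_instructions: int, num_stages: int = len(STAGES)) -> int:
    """Cycles needed to run the instructions through a stall-free pipeline."""
    if num_instructions <= 0:
        return 0
    return num_stages + num_instructions - 1


def instruction_in_stage(stage: str, cycle: int, num_instructions: int):
    """Return the 1-based instruction occupying `stage` in 1-based `cycle`, or None.

    Instruction i enters IFetch in cycle i, so it reaches stage number s
    (0-based) in cycle i + s.
    """
    s = STAGES.index(stage)
    i = cycle - s
    return i if 1 <= i <= num_instructions else None


if __name__ == "__main__":
    print(total_cycles(5))                     # 9
    print(instruction_in_stage("Exec", 4, 5))  # 2: the ALU works on the 2nd instruction
```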

Chap 6.17 Pipeline Control

Chap 6.18 Pipeline control
• We have 5 stages. What needs to be controlled in each stage?
  - Instruction Fetch and PC Increment
  - Instruction Decode / Register Fetch
  - Execution
  - Memory Stage
  - Write Back
• How would control be handled in an automobile plant?
  - a fancy control center telling everyone what to do?
  - should we use a finite state machine?

Chap 6.19 Pipeline Control
• Pass control signals along just like the data

Chap 6.20 Datapath with Control

Chap 6.21 Can pipelining get us into trouble?
• Yes: Pipeline Hazards
  - structural hazards: attempt to use the same resource two different ways at the same time
      E.g., a combined washer/dryer would be a structural hazard, or the folder being busy doing something else (watching TV)
  - data hazards: attempt to use an item before it is ready
      E.g., one sock of a pair in the dryer and one in the washer; can't fold until the sock in the washer has gone through the dryer
      An instruction depends on the result of a prior instruction still in the pipeline
  - control hazards: attempt to make a decision before the condition is evaluated
      E.g., washing football uniforms and needing the proper detergent level; need to see the result after the dryer before putting the next load in
      Branch instructions
• Can always resolve hazards by waiting
  - pipeline control must detect the hazard
  - take action (or delay action) to resolve hazards

Chap 6.22 Single Memory is a Structural Hazard
[Figure: Load followed by Instr 1 to Instr 4 in instruction order against time (clock cycles); each instruction passes through Mem, Reg, ALU, Mem, Reg, all sharing a single memory]
• Detection is easy in this case! (right half highlight means read, left half write)

Chap 6.23 Dependencies
• Problem with starting next instruction before first is finished
  - dependencies that go backward in time are data hazards

Chap 6.24 Software Solution
• Have compiler guarantee no hazards
• Where do we insert the nops?
    sub $2, $1, $3
    and $12, $2, $5
    or  $13, $6, $2
    add $14, $2, $2
    sw  $15, 100($2)
• Problem: this really slows us down!
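One way to answer the nop question mechanically is sketched below. It assumes a five-stage pipeline with no forwarding and a register file that writes in the first half of a cycle and reads in the second half, so a consumer must issue at least three slots after its producer. That distance, the tiny parser, and all names are assumptions made for illustration, not part of the slides.

```python
import re

PROGRAM = [
    "sub $2, $1, $3",
    "and $12, $2, $5",
    "or $13, $6, $2",
    "add $14, $2, $2",
    "sw $15, 100($2)",
]


def parse(instr):
    """Return (destination, sources) for a very small MIPS subset."""
    op, rest = instr.split(None, 1)
    regs = re.findall(r"\$\d+", rest)
    if op == "sw":                 # sw rt, offset(rs): reads both, writes nothing
        return None, set(regs)
    return regs[0], set(regs[1:])  # otherwise the first register is the destination


def insert_nops(program, min_distance=3):
    """Pad with nops so each reader issues >= min_distance slots after its writer."""
    scheduled = []
    last_write = {}  # register -> slot index of its most recent writer
    for instr in program:
        dest, sources = parse(instr)
        earliest = max(
            (last_write[r] + min_distance for r in sources if r in last_write),
            default=0,
        )
        while len(scheduled) < earliest:
            scheduled.append("nop")
        if dest is not None:
            last_write[dest] = len(scheduled)
        scheduled.append(instr)
    return scheduled


if __name__ == "__main__":
    for line in insert_nops(PROGRAM):
        print(line)
```

Under these assumptions the sketch places two nops between `sub` and `and`; the later readers of `$2` are then far enough away to need none.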

Chap 6.25 Forwarding
• Use temporary results, don't wait for them to be written
  - register file forwarding to handle read/write to same register
  - ALU forwarding
[Figure annotation: what if this $2 was $13?]
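The selection made by ALU forwarding can be sketched as a small decision function. The slide does not give the conditions, so the logic below follows the usual textbook forwarding-unit design as an assumption: prefer the newest result (in the EX/MEM pipeline register), fall back to the older one (MEM/WB), and never forward register $0. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PipeReg:
    """The fields of a pipeline register that matter for forwarding."""
    reg_write: bool  # will the instruction held here write a register?
    rd: int          # its destination register number


def forward_select(src: int, ex_mem: PipeReg, mem_wb: PipeReg) -> str:
    """Choose where an ALU operand that reads register `src` should come from."""
    if ex_mem.reg_write and ex_mem.rd != 0 and ex_mem.rd == src:
        return "EX/MEM"  # result produced by the immediately preceding instruction
    if mem_wb.reg_write and mem_wb.rd != 0 and mem_wb.rd == src:
        return "MEM/WB"  # result produced two instructions earlier
    return "REG"         # no hazard: use the value read from the register file


if __name__ == "__main__":
    idle = PipeReg(reg_write=False, rd=0)
    sub = PipeReg(reg_write=True, rd=2)     # sub $2, $1, $3
    # "and $12, $2, $5" is in Exec while sub sits in EX/MEM:
    print(forward_select(2, sub, idle))     # EX/MEM
    # "or $13, $6, $2" is in Exec one cycle later; sub is now in MEM/WB
    # and the and instruction (writes $12) is in EX/MEM:
    print(forward_select(2, PipeReg(True, 12), sub))  # MEM/WB
```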

Chap 6.26 Forwarding

Chap 6.27 Can't always forward
• Load word can still cause a hazard: an instruction tries to read a register following a load instruction that writes to the same register.
• Thus, we need a hazard detection unit to stall the instruction that follows the load

Chap 6.28 Stalling
• We can stall the pipeline by keeping an instruction in the same stage

Chap 6.29 Hazard Detection Unit
• Stall by letting an instruction that won't write anything go forward
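The load-use check and the stall described on the last three slides can be sketched as follows. The condition is the common textbook formulation and is an assumption here, since the slides show the unit only as a figure; the field and signal names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IdEx:
    mem_read: bool  # True when the instruction now in Exec is a load
    rt: int         # register that load will write


@dataclass
class IfId:
    rs: int         # first source register of the instruction being decoded
    rt: int         # second source register of the instruction being decoded


def load_use_hazard(id_ex: IdEx, if_id: IfId) -> bool:
    """True when the decoding instruction needs a value a load has not fetched yet."""
    return id_ex.mem_read and id_ex.rt in (if_id.rs, if_id.rt)


def control_for_next_cycle(id_ex: IdEx, if_id: IfId) -> dict:
    """On a hazard, hold PC and IF/ID in place and send a bubble (nop) forward."""
    stall = load_use_hazard(id_ex, if_id)
    return {"pc_write": not stall, "if_id_write": not stall, "insert_bubble": stall}


if __name__ == "__main__":
    # lw $2, 20($1) in Exec, and $4, $2, $5 being decoded: one stall cycle is needed.
    print(control_for_next_cycle(IdEx(mem_read=True, rt=2), IfId(rs=2, rt=5)))
```

Holding PC and IF/ID keeps the dependent instruction in the same stage, and the bubble is the "instruction that won't write anything" that goes forward in its place.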

Chap 6.30 Control (Branch) Hazards
• When we decide to branch, other instructions are in the pipeline!
• We are predicting branch not taken
  - need to add hardware for flushing instructions if we are wrong

Chap 6.31 Flushing Instructions

Chap 6.32 Example: Control Hazard Solutions
• Stall: wait until decision is clear
  - It's possible to move the decision up to the 2nd stage by adding hardware to check registers as they are being read
• Impact: 2 clock cycles per branch instruction => slow
[Figure: Add, Beq, Load in instruction order against time (clock cycles); each instruction passes through Mem, Reg, ALU, Mem, Reg]

Chap 6.33 Example: Control Hazard Solutions
• Predict: guess one direction then back up if wrong
  - Predict not taken
• Impact: 1 clock cycle per branch instruction if right, 2 if wrong (right about 50% of the time)
• More dynamic scheme: history of 1 branch (about 90%)
[Figure: Add, Beq, Load in instruction order against time (clock cycles); each instruction passes through Mem, Reg, ALU, Mem, Reg]
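The cost figures on this slide can be turned into an average, and the "history of 1 branch" scheme into a few lines of simulation. This is a sketch: the loop-shaped branch pattern is made up for illustration (it is chosen so the result lands on 90%), and the function names are hypothetical.

```python
def average_branch_cycles(accuracy: float, hit_cycles: int = 1, miss_cycles: int = 2) -> float:
    """Expected cycles per branch for a given prediction accuracy."""
    return accuracy * hit_cycles + (1 - accuracy) * miss_cycles


def one_bit_accuracy(outcomes, initial_prediction: bool = False) -> float:
    """Predict that each branch repeats its previous outcome (1 bit of history)."""
    prediction = initial_prediction
    correct = 0
    for taken in outcomes:
        if prediction == taken:
            correct += 1
        prediction = taken  # remember only the most recent outcome
    return correct / len(outcomes)


if __name__ == "__main__":
    print(f"{average_branch_cycles(0.5):.2f}")  # 1.50 cycles per branch
    print(f"{average_branch_cycles(0.9):.2f}")  # 1.10 cycles per branch

    # Hypothetical loop branch: taken 19 times, then falls through; run 5 times.
    pattern = ([True] * 19 + [False]) * 5
    print(one_bit_accuracy(pattern))            # 0.9
```

The one-bit scheme mispredicts twice per loop execution in this pattern: once on entry and once on exit.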

Chap 6.34 Example: Control Hazard Solutions
• Redefine branch behavior (takes place after next instruction): "delayed branch"
• Impact: 1 clock cycle per branch instruction if we can find an instruction to put in the "slot" (about 50% of the time)
[Figure: Add, Beq, Misc, Load in instruction order against time (clock cycles); each instruction passes through Mem, Reg, ALU, Mem, Reg]

Chap 6.35 Dynamic Scheduling
• The hardware performs the scheduling
  - hardware tries to find instructions to execute
  - out of order execution is possible
  - speculative execution and dynamic branch prediction
• All modern processors are very complicated
  - DEC Alpha 21264: 9 stage pipeline, 6 instruction issue
  - PowerPC and Pentium: branch history table
  - Compiler technology important

Chap 6.36 Summary
• Pipelining is a fundamental concept
  - multiple steps using distinct resources
• Utilize capabilities of the Datapath by pipelined instruction processing
  - start next instruction while working on the current one
  - limited by length of longest stage (plus fill/flush)
  - detect and resolve hazards