ECE 232 L18.Pipeline.1 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ECE 232 Hardware Organization and Design Lecture 18 Pipelining.

Slides:



Advertisements
Similar presentations
PipelineCSCE430/830 Pipeline: Introduction CSCE430/830 Computer Architecture Lecturer: Prof. Hong Jiang Courtesy of Prof. Yifeng Zhu, U of Maine Fall,
Advertisements

1 IKI20210 Pengantar Organisasi Komputer Kuliah no. 25: Pipeline 10 Januari 2003 Bobby Nazief Johny Moningka
Chapter 8. Pipelining.
Review: Pipelining. Pipelining Laundry Example Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes Dryer.
Pipelining I (1) Fall 2005 Lecture 18: Pipelining I.
Pipelining Hwanmo Sung CS147 Presentation Professor Sin-Min Lee.
Computer Architecture
CS252/Patterson Lec 1.1 1/17/01 Pipelining: Its Natural! Laundry Example Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer.
Chapter Six 1.
Mary Jane Irwin ( ) [Adapted from Computer Organization and Design,
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania Computer Organization Pipelined Processor Design 1.
Computer ArchitectureFall 2007 © October 24nd, 2007 Majd F. Sakr CS-447– Computer Architecture.
ECE 232 L19.Pipeline2.1 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ECE 232 Hardware Organization and Design Lecture 19 Pipelining,
Computer ArchitectureFall 2007 © October 22nd, 2007 Majd F. Sakr CS-447– Computer Architecture.
1 CSE SUNY New Paltz Chapter Six Enhancing Performance with Pipelining.
Pipelining Datapath Adapted from the lecture notes of Dr. John Kubiatowicz (UC Berkeley) and Hank Walker (TAMU)
CS430 – Computer Architecture Introduction to Pipelined Execution
1 COMP 206: Computer Architecture and Implementation Montek Singh Mon., Sep 9, 2002 Topic: Pipelining Basics.
1 Atanasoff–Berry Computer, built by Professor John Vincent Atanasoff and grad student Clifford Berry in the basement of the physics building at Iowa State.
Pipelining - II Adapted from CS 152C (UC Berkeley) lectures notes of Spring 2002.
Computer ArchitectureFall 2008 © October 6th, 2008 Majd F. Sakr CS-447– Computer Architecture.
Ceg3420 L13.1 DAP Fa97,  U.CB CEG3420 Computer Design Introduction to Pipelining.
CS152 / Kubiatowicz Lec /17/01©UCB Fall 2001 CS152 Computer Architecture and Engineering Lecture 13 Introduction to Pipelining: Datapath and Control.
Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania ECE Computer Organization Lecture 17 - Pipelined.
Pipelining - II Rabi Mahapatra Adapted from CS 152C (UC Berkeley) lectures notes of Spring 2002.
Introduction to Pipelining Rabi Mahapatra Adapted from the lecture notes of Dr. John Kubiatowicz (UC Berkeley)
Enhancing Performance with Pipelining Slides developed by Rami Abielmona and modified by Miodrag Bolic High-Level Computer Systems Design.
CS1104: Computer Organisation School of Computing National University of Singapore.
Computer Science Education
B 0000 Pipelining ENGR xD52 Eric VanWyk Fall
EEL5708 Lotzi Bölöni EEL 5708 High Performance Computer Architecture Pipelining.
Pipelining (I). Pipelining Example  Laundry Example  Four students have one load of clothes each to wash, dry, fold, and put away  Washer takes 30.
Analogy: Gotta Do Laundry
CSE 340 Computer Architecture Summer 2014 Basic MIPS Pipelining Review.
CS.305 Computer Architecture Enhancing Performance with Pipelining Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from.
Computer Organization CS224 Chapter 4 Part b The Processor Spring 2010 With thanks to M.J. Irwin, T. Fountain, D. Patterson, and J. Hennessy for some lecture.
1 Designing a Pipelined Processor In this Chapter, we will study 1. Pipelined datapath 2. Pipelined control 3. Data Hazards 4. Forwarding 5. Branch Hazards.
CS 152 Lec 12.1 CS 152: Computer Architecture and Engineering Lecture 12 Multicycle Controller Design Pipelining Randy H. Katz, Instructor Satrajit Chatterjee,
CSCI-365 Computer Organization Lecture Note: Some slides and/or pictures in the following are adapted from: Computer Organization and Design, Patterson.
Orange Coast College Business Division Computer Science Department CS 116- Computer Architecture Pipelining.
EECS 322 March 27, 2000 Based on Dave Patterson slides Instructor: Francis G. Wolff Case Western Reserve University This presentation.

Cs 152 L1 3.1 DAP Fa97,  U.CB Pipelining Lessons °Pipelining doesn’t help latency of single task, it helps throughput of entire workload °Multiple tasks.
Chap 6.1 Computer Architecture Chapter 6 Enhancing Performance with Pipelining.
CSIE30300 Computer Architecture Unit 04: Basic MIPS Pipelining Hsin-Chou Chi [Adapted from material by and
Pipelining Example Laundry Example: Three Stages
CMSC 611: Advanced Computer Architecture Pipelining Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
CPE 442 hazards.1 Introduction to Computer Architecture CpE 442 Designing a Pipeline Processor (lect. II)
Pipelining CS365 Lecture 9. D. Barbara Pipeline CS465 2 Outline  Today’s topic  Pipelining is an implementation technique in which multiple instructions.
CS252/Patterson Lec 1.1 1/17/01 معماري کامپيوتر - درس نهم pipeline برگرفته از درس : Prof. David A. Patterson.
CSE431 L06 Basic MIPS Pipelining.1Irwin, PSU, 2005 MIPS Pipeline Datapath Modifications  What do we need to add/modify in our MIPS datapath? l State registers.
Lecture 9. MIPS Processor Design – Pipelined Processor Design #1 Prof. Taeweon Suh Computer Science Education Korea University 2010 R&E Computer System.
CMSC 611: Advanced Computer Architecture Pipelining Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
CSCI-365 Computer Organization Lecture Note: Some slides and/or pictures in the following are adapted from: Computer Organization and Design, Patterson.
Advanced Computer Architecture CS 704 Advanced Computer Architecture Lecture 10 Computer Hardware Design (Pipeline Datapath and Control Design) Prof. Dr.
Lecture 18: Pipelining I.
Pipelines An overview of pipelining
Review: Instruction Set Evolution
CMSC 611: Advanced Computer Architecture
ECE232: Hardware Organization and Design
Pipelining Lessons 6 PM T a s k O r d e B C D A 30
Dave Patterson (http.cs.berkeley.edu/~patterson)
Chapter 4 The Processor Part 2
Lecturer: Alan Christopher
Pipelining Lessons 6 PM T a s k O r d e B C D A 30
An Introduction to pipelining
Pipelining Appendix A and Chapter 3.
Recall: Performance Evaluation
Presentation transcript:

ECE 232 L18.Pipeline.1 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ECE 232 Hardware Organization and Design Lecture 18 Pipelining Maciej Ciesielski

ECE 232 L18.Pipeline.2 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Outline °Review Microprogramming °Pipelining, basic concept °Hazards Structural Data hazards Control hazards

ECE 232 L18.Pipeline.3 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Pipelining is Natural! Laundry Example °Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold °Washer takes 30 minutes °Dryer takes 40 minutes °“Folder” takes 20 minutes ABCD

ECE 232 L18.Pipeline.4 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Sequential Laundry °Sequential laundry takes 6 hours for 4 loads °If they learned pipelining, how long would laundry take? A B C D TaskOrderTaskOrder PM Midnight Time

ECE 232 L18.Pipeline.5 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Pipelined Laundry: Start work ASAP °Pipelined laundry takes 3.5 hours for 4 loads ABCD TaskOrderTaskOrder 6 PM Midnight Time

ECE 232 L18.Pipeline.6 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Pipelining Lessons °Pipelining doesn’t help latency of single task, it helps throughput of entire workload °Pipeline rate limited by slowest pipeline stage °Multiple tasks operating simultaneously using different resources °Potential speedup = Number pipe stages °Unbalanced lengths of pipe stages reduces speedup °Time to “fill” pipeline and time to “drain” it reduces speedup °Stall for Dependences ABCD TaskOrderTaskOrder 6 PM 789 Time

ECE 232 L18.Pipeline.7 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers The Five Stages of Load °IFetch: Instruction Fetch Fetch the instruction from the Instruction Memory °Reg/Dec: Registers Fetch and Instruction Decode °Exec: Calculate the memory address °Mem: Read the data from the Data Memory °Wr: Write the data back to the register file Cycle 1Cycle 2Cycle 3Cycle 4Cycle 5 IfetchReg/DecExecMemWrLoad

ECE 232 L18.Pipeline.8 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Pipelined Execution °Utilization? °Now we just have to make it work IFetchDcdExecMemWB IFetchDcdExecMemWB IFetchDcdExecMemWB IFetchDcdExecMemWB IFetchDcdExecMemWB IFetchDcdExecMemWB Program Flow Time

ECE 232 L18.Pipeline.9 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Single Cycle, Multiple Cycle, vs. Pipeline Clk Cycle 1 Multiple Cycle Implementation : IfetchRegExecMemWr Cycle 2Cycle 3Cycle 4Cycle 5Cycle 6Cycle 7Cycle 8Cycle 9Cycle 10 LoadIfetchRegExecMemWr IfetchRegExecMem LoadStore Pipeline Implementation : IfetchRegExecMemWrStore Clk Single Cycle Implementation : LoadStoreWaste Ifetch R-type IfetchRegExecMemWrR-type Cycle 1Cycle 2

ECE 232 L18.Pipeline.10 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Why Pipeline? °Suppose we execute 100 instructions °Single Cycle Machine 45 ns/cycle x 1 CPI x 100 inst = 4500 ns °Multicycle Machine 10 ns/cycle x 4.6 CPI (due to inst mix) x 100 inst = 4600 ns °Ideal pipelined machine 10 ns/cycle x (1 CPI x 100 inst + 4 cycle drain ) = 1040 ns

ECE 232 L18.Pipeline.11 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Why Pipeline? Because the resources are there! Time (clock cycles) I n s t r. O r d e r Inst 0 Inst 1 Inst 2 Inst 4 Inst 3 ALU Im Reg DmReg ALU Im Reg DmReg ALU Im Reg DmReg ALU Im Reg DmReg ALU Im Reg DmReg

ECE 232 L18.Pipeline.12 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Can pipelining get us into trouble? °Yes: Pipeline Hazards structural hazards: attempt to use the same resource two different ways at the same time -E.g., combined washer/dryer would be a structural hazard or folder busy doing something else (watching TV) data hazards: attempt to use item before it is ready -E.g., one sock of pair in dryer and one in washer; can’t fold until get sock from washer through dryer -instruction depends on result of prior instruction still in the pipeline control hazards: attempt to make a decision before condition is evaulated -E.g., washing football uniforms and need to get proper detergent level; need to see after dryer before next load in -branch instructions °Can always resolve hazards by waiting pipeline control must detect the hazard take action (or delay action) to resolve hazards

ECE 232 L18.Pipeline.13 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Single Memory is a Structural Hazard Time (clock cycles) Mem I n s t r. O r d e r Load Instr 1 Instr 2 Instr 3 Instr 4 ALU Mem Reg MemReg ALU Mem Reg MemReg ALU Mem Reg MemReg ALU Reg MemReg ALU Mem Reg MemReg Detection is easy in this case! (right half highlight means read, left half write)

ECE 232 L18.Pipeline.14 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers °Stall: wait until decision is clear It is possible to move up decision to 2nd stage by adding hardware to check registers as being read °Impact: 2 clock cycles per branch instruction => slow Control Hazard Solutions I n s t r. O r d e r Time (clock cycles) Add Beq Load ALU Mem Reg MemReg ALU Mem Reg MemReg ALU Reg MemReg Mem

ECE 232 L18.Pipeline.15 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers °Predict: guess one direction then back up if wrong Predict not taken °Impact: 1 clock cycles per branch instruction if right, 2 if wrong (right ­ 50% of time) °More dynamic scheme: history of 1 branch (­ 90%) Control Hazard Solutions I n s t r. O r d e r Time (clock cycles) Add Beq Load ALU Mem Reg MemReg ALU Mem Reg MemReg Mem ALU Reg MemReg

ECE 232 L18.Pipeline.16 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers °Redefine branch behavior (takes place after next instruction) “delayed branch” °Impact: 0 clock cycles per branch instruction if can find instruction to put in “slot” (­ 50% of time) °As launch more instruction per clock cycle, less useful Control Hazard Solutions I n s t r. O r d e r Time (clock cycles) Add Beq Misc ALU Mem Reg MemReg ALU Mem Reg MemReg Mem ALU Reg MemReg Load Mem ALU Reg MemReg

ECE 232 L18.Pipeline.17 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Dependencies backwards in time are hazards Data Hazard on r1: I n s t r. O r d e r Time (clock cycles) add r1,r2,r3 sub r4,r1,r3 and r6,r1,r7 or r8,r1,r9 xor r10,r1,r11 IFIF ID/R F EXEX ME M WBWB ALU Im Reg Dm Reg ALU Im Reg DmReg ALU Im Reg DmReg Im ALU Reg DmReg ALU Im Reg DmReg

ECE 232 L18.Pipeline.18 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers “Forward” result from one stage to another “or” OK if define read/write properly Data Hazard Solution: I n s t r. O r d e r Time (clock cycles) add r1,r2,r3 sub r4,r1,r3 and r6,r1,r7 or r8,r1,r9 xor r10,r1,r11 ALU Im Reg IFIF ID/R F EXEX ME M WBWB Dm Reg ALU Im Reg DmReg ALU Im Reg DmReg Im ALU Reg DmReg ALU Im Reg DmReg

ECE 232 L18.Pipeline.19 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Dependencies backwards in time are hazards Can’t solve with forwarding: Must delay/stall instruction dependent on loads Forwarding (or Bypassing): What about Loads Time (clock cycles) IFIF ID/R F EXEX ME M WBWB lw r1,0(r2) sub r4,r1,r3 ALU Im Reg Dm Reg ALU Im Reg DmReg

ECE 232 L18.Pipeline.20 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Designing a Pipelined Processor °Go back and examine your datapath and control diagrams °Associated resources with states °Ensure that flows do not conflict, or figure out how to resolve °Assert control in appropriate stage

ECE 232 L18.Pipeline.21 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers Summary °Microprogramming is a fundamental concept implement an instruction set by building a very simple processor and interpreting the instructions essential for very complex instructions and when few register transfers are possible °Pipelining is a fundamental concept multiple steps using distinct resources °Utilize capabilities of the Datapath by pipelined instruction processing start next instruction while working on the current one limited by length of longest stage (plus fill/flush) detect and resolve hazards