B 0000 Pipelining ENGR xD52 Eric VanWyk Fall 2012 1.

Slides:



Advertisements
Similar presentations
PipelineCSCE430/830 Pipeline: Introduction CSCE430/830 Computer Architecture Lecturer: Prof. Hong Jiang Courtesy of Prof. Yifeng Zhu, U of Maine Fall,
Advertisements

OMSE 510: Computing Foundations 4: The CPU!
1 IKI20210 Pengantar Organisasi Komputer Kuliah no. 25: Pipeline 10 Januari 2003 Bobby Nazief Johny Moningka
CMSC 611: Advanced Computer Architecture Pipelining Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
Chapter 8. Pipelining.
Review: Pipelining. Pipelining Laundry Example Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes Dryer.
Pipelining I (1) Fall 2005 Lecture 18: Pipelining I.
Pipelining Hwanmo Sung CS147 Presentation Professor Sin-Min Lee.
EECS 318 CAD Computer Aided Design LECTURE 2: DSP Architectures Instructor: Francis G. Wolff Case Western Reserve University This presentation.
Pipeline Computer Architecture Lecture 12: Designing a Pipeline Processor.
Computer Architecture
CS252/Patterson Lec 1.1 1/17/01 Pipelining: Its Natural! Laundry Example Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer.
Mary Jane Irwin ( ) [Adapted from Computer Organization and Design,
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
Computer ArchitectureFall 2007 © October 24nd, 2007 Majd F. Sakr CS-447– Computer Architecture.
Computer ArchitectureFall 2007 © October 22nd, 2007 Majd F. Sakr CS-447– Computer Architecture.
1 CSE SUNY New Paltz Chapter Six Enhancing Performance with Pipelining.
Pipelining Datapath Adapted from the lecture notes of Dr. John Kubiatowicz (UC Berkeley) and Hank Walker (TAMU)
1 COMP 206: Computer Architecture and Implementation Montek Singh Mon., Sep 9, 2002 Topic: Pipelining Basics.
1 Atanasoff–Berry Computer, built by Professor John Vincent Atanasoff and grad student Clifford Berry in the basement of the physics building at Iowa State.
Pipelining - II Adapted from CS 152C (UC Berkeley) lectures notes of Spring 2002.
Computer ArchitectureFall 2008 © October 6th, 2008 Majd F. Sakr CS-447– Computer Architecture.
Ceg3420 L13.1 DAP Fa97,  U.CB CEG3420 Computer Design Introduction to Pipelining.
Pipelining - II Rabi Mahapatra Adapted from CS 152C (UC Berkeley) lectures notes of Spring 2002.
Spring W :332:331 Computer Architecture and Assembly Language Spring 2005 Week 11 Introduction to Pipelined Datapath [Adapted from Dave Patterson’s.
Introduction to Pipelining Rabi Mahapatra Adapted from the lecture notes of Dr. John Kubiatowicz (UC Berkeley)
CPE 442 pipeline.1 Intro to Computer Architecture CpE 242 Computer Architecture and Engineering Designing a Pipeline Processor.
CS1104: Computer Organisation School of Computing National University of Singapore.
Computer Science Education
Integrated Circuits Costs
EEL5708 Lotzi Bölöni EEL 5708 High Performance Computer Architecture Pipelining.
Analogy: Gotta Do Laundry
CSE 340 Computer Architecture Summer 2014 Basic MIPS Pipelining Review.
CS.305 Computer Architecture Enhancing Performance with Pipelining Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from.
1 Designing a Pipelined Processor In this Chapter, we will study 1. Pipelined datapath 2. Pipelined control 3. Data Hazards 4. Forwarding 5. Branch Hazards.
CSCI-365 Computer Organization Lecture Note: Some slides and/or pictures in the following are adapted from: Computer Organization and Design, Patterson.
ECE 232 L18.Pipeline.1 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ECE 232 Hardware Organization and Design Lecture 18 Pipelining.
EECS 322 March 27, 2000 Based on Dave Patterson slides Instructor: Francis G. Wolff Case Western Reserve University This presentation.

Cs 152 L1 3.1 DAP Fa97,  U.CB Pipelining Lessons °Pipelining doesn’t help latency of single task, it helps throughput of entire workload °Multiple tasks.
Chap 6.1 Computer Architecture Chapter 6 Enhancing Performance with Pipelining.
CSIE30300 Computer Architecture Unit 04: Basic MIPS Pipelining Hsin-Chou Chi [Adapted from material by and
CMSC 611: Advanced Computer Architecture Pipelining Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
CPE 442 hazards.1 Introduction to Computer Architecture CpE 442 Designing a Pipeline Processor (lect. II)
Pipelining CS365 Lecture 9. D. Barbara Pipeline CS465 2 Outline  Today’s topic  Pipelining is an implementation technique in which multiple instructions.
CS252/Patterson Lec 1.1 1/17/01 معماري کامپيوتر - درس نهم pipeline برگرفته از درس : Prof. David A. Patterson.
Lecture 9. MIPS Processor Design – Pipelined Processor Design #1 Prof. Taeweon Suh Computer Science Education Korea University 2010 R&E Computer System.
CMSC 611: Advanced Computer Architecture Pipelining Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
CSCI-365 Computer Organization Lecture Note: Some slides and/or pictures in the following are adapted from: Computer Organization and Design, Patterson.
EEL-4713 Ann Gordon-Ross 1 EEL-4713C Computer Architecture Designing a Pipelined Processor.
Advanced Computer Architecture CS 704 Advanced Computer Architecture Lecture 10 Computer Hardware Design (Pipeline Datapath and Control Design) Prof. Dr.
B 0001 Pipelining ENGR xD52 Eric VanWyk Fall
Lecture 5. MIPS Processor Design Pipelined MIPS #1 Prof. Taeweon Suh Computer Science & Engineering Korea University COSE222, COMP212 Computer Architecture.
B1111 Pipelining ENGR xD52 Eric VanWyk Fall 2014.
Lecture 18: Pipelining I.
Computer Organization
Pipelines An overview of pipelining
Review: Instruction Set Evolution
CMSC 611: Advanced Computer Architecture
ECE232: Hardware Organization and Design
Chapter 3: Pipelining 순천향대학교 컴퓨터학부 이 상 정 Adapted from
Chapter 4 The Processor Part 2
Lecturer: Alan Christopher
Serial versus Pipelined Execution
CS-447– Computer Architecture Lecture 14 Pipelining (2)
An Introduction to pipelining
Pipelining Appendix A and Chapter 3.
A relevant question Assuming you’ve got: One washer (takes 30 minutes)
Recall: Performance Evaluation
Presentation transcript:

b 0000 Pipelining ENGR xD52 Eric VanWyk Fall

Acknowledgements Mark L. Chang lecture notes for Computer Architecture (Olin ENGR3410)

Today The Best Computer Architecture Slide Ever Introduction to Pipelining Reasons to Pipeline DIY Pipelined CPUs Hazards

Single Cycle, Multiple Cycle, vs. Pipeline Clk Cycle 1 Multiple Cycle Implementation: IfetchRegExecMemWr Cycle 2Cycle 3Cycle 4Cycle 5Cycle 6Cycle 7Cycle 8Cycle 9Cycle 10 LoadIfetchRegExecMemWr IfetchRegExecMem LoadStore Pipeline Implementation: IfetchRegExecMemWrStore Clk Single Cycle Implementation: LoadStoreWaste Ifetch R-type IfetchRegExecMemWrR-type Cycle 1Cycle 2

Recap Single Cycle CPUs – Speed limited by the slowest instruction Multi Cycle divides instr into multiple chunks – as slow as slowest chunk – Uses variable CPI for speed up Which is faster on average? When is the other option faster? Why?

6 Pipelining Example: Doing the laundry Ann, Brian, Cathy, & Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes Dryer takes 40 minutes “Folder” takes 20 minutes ABCD

ABCD PM Midnight TaskOrderTaskOrder Time Sequential Laundry Sequential laundry takes 6 hours for 4 loads If they learned pipelining, how long would laundry take?

8 Pipelined Laundry: Start work ASAP Pipelined laundry takes 3.5 hours for 4 loads ABCD 6 PM Midnight TaskOrderTaskOrder Time

ABCD 6 PM 789 TaskOrderTaskOrder Time Pipelining doesn’t help latency of single task, it helps throughput of entire workload Pipeline rate limited by slowest pipeline stage Multiple tasks operating simultaneously using different resources Potential speedup = Number pipe stages Unbalanced lengths of pipe stages reduces speedup Time to “fill” pipeline and time to “drain” it reduces speedup Pipelining Lessons

Pipelined Execution IFetchDcdExecMemWB IFetchDcdExecMemWB IFetchDcdExecMemWB IFetchDcdExecMemWB IFetchDcdExecMemWB IFetchDcdExecMemWB Program Flow Time CPI in pipelined processor is “issue rate” Ignore fill/drain, ignore latency Pipelining maximizes throughput

Single Cycle, Multiple Cycle, vs. Pipeline Clk Cycle 1 Multiple Cycle Implementation: IfetchRegExecMemWr Cycle 2Cycle 3Cycle 4Cycle 5Cycle 6Cycle 7Cycle 8Cycle 9Cycle 10 LoadIfetchRegExecMemWr IfetchRegExecMem LoadStore Pipeline Implementation: IfetchRegExecMemWrStore Clk Single Cycle Implementation: LoadStoreWaste Ifetch R-type IfetchRegExecMemWrR-type Cycle 1Cycle 2

Why Pipeline? Suppose we execute 100 instructions Single Cycle Machine – 45 ns/cycle x 1 CPI x 100 inst = ____ ns Multicycle Machine – 10 ns/cycle x 4.0 CPI x 100 inst = ____ ns Ideal pipelined machine – 10 ns/cycle x (1 CPI x 100 inst + 4 cycle drain) = ____ ns

Rules of Pipelining Divide CPU / Instruction in to stages Each functional unit belongs to one stage – Can’t reuse the e.g. ALU like a multi-cycle design – Reg File ports are different functional units – For now, split memory into instruction & data Stage order is the same for all instructions – How to “skip” a stage? – This is true for “purely” pipelined processors

Registers Divide datapath into multiple pipeline stages Pipelined Datapath PC Data Memory Instr. Memory Register File Register File IF Instruction Fetch RF Register Fetch EX Execute MEM Data Memory WB Writeback

15 Pipelined Control IF/ID Register ID/Ex Register Ex/Mem Register Mem/Wr Register Reg/DecExecMem ALUSrcA ALUOp RegDst ALUSrcB IorD MemWE Mem2Reg RegWE Main Control ALUSrcA ALUOp RegDst ALUSrcB Mem2Reg RegWE Mem2Reg RegWE Mem2Reg RegWE IorD MemWE IorD MemWE Wr The Main Control generates the control signals during Reg/Dec – Control signals for Exec (ALUOp, ALUSrcA, …) are used 1 cycle later – Control signals for Mem (MemWE, IorD, …) are used 2 cycles later – Control signals for Wr (Mem2Reg, RegWE, …) are used 3 cycles later

Pipelining the Load Instruction The five independent functional units in the pipeline datapath are: – Instruction Memory for the Ifetch stage – Register File’s Read ports (bus A and busB) for the Reg/Dec stage – ALU for the Exec stage – Data Memory for the Mem stage – Register File’s Write port (bus W) for the Wr stage Clock Cycle 1Cycle 2Cycle 3Cycle 4Cycle 5Cycle 6Cycle 7 IfetchReg/DecExecMemWr1st lw IfetchReg/DecExecMemWr2nd lw IfetchReg/DecExecMemWr3rd lw

The Four Stages of R-type Ifetch: Fetch the instruction from the Instruction Memory Reg/Dec: Register Fetch and Instruction Decode Exec: ALU operates on the two register operands Wr: Write the ALU output back to the register file Cycle 1Cycle 2Cycle 3Cycle 4 IfetchReg/DecExecWrR-type

Structural Hazard Interaction between R-type and loads causes structural hazard on writeback Clock Cycle 1Cycle 2Cycle 3Cycle 4Cycle 5Cycle 6Cycle 7Cycle 8Cycle 9 IfetchReg/DecExecWrR-type IfetchReg/DecExecWrR-type IfetchReg/DecExecMemWrLoad IfetchReg/DecExecWrR-type IfetchReg/DecExecWrR-type

Structural Hazard Interaction between R-type and loads causes structural hazard on writeback Clock Cycle 1Cycle 2Cycle 3Cycle 4Cycle 5Cycle 6Cycle 7Cycle 8Cycle 9 IfetchReg/DecExecWrR-type IfetchReg/DecExecWrR-type IfetchReg/DecExecMemWrLoad IfetchReg/DecExecWrR-type IfetchReg/DecExecWrR-type

Structural Hazard: Solved Add a do-nothing (NO OP or NOP) MEM stage Clock Cycle 1Cycle 2Cycle 3Cycle 4Cycle 5Cycle 6Cycle 7Cycle 8Cycle 9 IfetchReg/DecMemWrR-type IfetchReg/DecMemWrR-type IfetchReg/DecExecMemWrLoad IfetchReg/DecMemWrR-type IfetchReg/DecMemWrR-type Exec

DIY Pipelining Design your own Pipelined MIPS CPU – ADD, LW, SW, – J, JAL if you have time. BEQ if you are rocking it Start with a 5 stage pipeline: – Instruction Fetch – Instruction Decode / Register Fetch – Execute – Data Memory – Register Write Back You can merge / split cycles if you’d like

DIY Pipelining Deliverables: – Schematic of Data path (stub out control signals) – Control signals chart for instructions – Pipeline chart for instructions Pretend to execute a simple program – Use a homework or lecture program? – Make a pipeline chart for this program

DIY Pipelining Record hazards you encounter in your design – Trying to double use a functional unit? – Trying to use information before it is ready? – Can you rewrite your program to avoid them? Specifically when is this possible? These will be the basis of Monday’s class – Come back with one specific hazard