9.2 Pipelining Suppose we want to perform the combined multiply and add operations with a stream of numbers: A_i * B_i + C_i for i = 1, 2, 3, …, 7.



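9.2 Pipelining The suboperations performed in each segment of the pipeline are as follows:
Segment 1: R1 ← A_i, R2 ← B_i (input A_i and B_i)
Segment 2: R3 ← R1 * R2, R4 ← C_i (multiply and input C_i)
Segment 3: R5 ← R3 + R4 (add C_i to the product)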
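The register transfers above can be traced with a small simulation. This is a minimal sketch, not part of the original slides: the function name `run_pipeline` and the use of `None` to mark an empty register are assumptions made for illustration.

```python
def run_pipeline(a, b, c):
    """Simulate the three-segment pipeline computing a[i] * b[i] + c[i]."""
    n = len(a)
    r1 = r2 = r3 = r4 = r5 = None  # all registers start empty
    results = []
    # n inputs need n + 2 clock pulses to pass through 3 segments.
    for clock in range(n + 2):
        # Update the last segment first so that every segment reads
        # the values latched on the previous clock pulse.
        # Segment 3: R5 <- R3 + R4
        r5 = r3 + r4 if r3 is not None else None
        # Segment 2: R3 <- R1 * R2, R4 <- C_i
        if r1 is not None:
            r3, r4 = r1 * r2, c[clock - 1]
        else:
            r3, r4 = None, None
        # Segment 1: R1 <- A_i, R2 <- B_i
        if clock < n:
            r1, r2 = a[clock], b[clock]
        else:
            r1, r2 = None, None
        if r5 is not None:
            results.append(r5)
    return results


print(run_pipeline([1, 2, 3], [4, 5, 6], [7, 8, 9]))
```

For the seven-element stream in the example, the loop runs for nine clock pulses: two to fill the pipeline, then one result per pulse.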

Pipeline Performance
n: number of instructions
k: number of stages in the pipeline
τ: clock cycle time
T_k: total time to execute n instructions on a k-stage pipeline
In the laundry example, n corresponds to the number of loads and k to the number of stages (washing, drying, and folding). The clock cycle is the time of the slowest stage.
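The formula that relates these symbols did not survive in the transcript. The sketch below uses the relation commonly given in textbooks, T_k = (k + n - 1)τ, which is an assumption here rather than a quote from the slide. It also assumes every stage takes exactly one clock cycle. The function names are hypothetical.

```python
def pipeline_time(n, k, tau):
    """Total time T_k: k cycles for the first instruction, 1 for each later one."""
    return (k + n - 1) * tau


def sequential_time(n, k, tau):
    """Time without pipelining: every instruction takes all k stages in turn."""
    return n * k * tau


def speedup(n, k, tau=1):
    """Ratio of non-pipelined time to pipelined time."""
    return sequential_time(n, k, tau) / pipeline_time(n, k, tau)


# The speedup approaches k as n grows, but never reaches it.
for n in (1, 10, 100, 1000):
    print(n, round(speedup(n, k=5), 3))
```

With n = 1 the speedup is exactly 1, which matches the point that pipelining does not shorten a single task.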

Pipelining: Laundry Example A small laundry has one washer, one dryer, and one operator. It takes 90 minutes to finish one load: the washer takes 30 minutes, the dryer takes 40 minutes, and folding by the operator takes 20 minutes. There are four loads: A, B, C, and D.

Sequential Laundry The operator schedules loads to be delivered to the laundry every 90 minutes, which is the time required to finish one load. In other words, he does not start a new load until he is done with the previous one. The process is sequential. Sequential laundry takes 6 hours for 4 loads. [Figure: task order (loads A to D) against time, 90 minutes per load]

Efficiently scheduled laundry: Pipelined Laundry The operator starts each task as soon as possible. Another operator asks for loads to be delivered to the laundry every 40 minutes. Pipelined laundry takes 3.5 hours for 4 loads. [Figure: task order (loads A to D) against time, starting at 6 PM]
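The 6-hour and 3.5-hour figures can be checked with a short calculation. This is a sketch under stated assumptions: all four loads are available at time zero, each machine handles one load at a time, and a load may wait between stages. The function names are hypothetical.

```python
def sequential_minutes(stage_minutes, loads):
    """Each load runs through every stage before the next load starts."""
    return loads * sum(stage_minutes)


def pipelined_minutes(stage_minutes, loads):
    """Each stage starts a load as soon as the load and the stage are both free."""
    # finish[s] is the time at which stage s finished its most recent load.
    finish = [0] * len(stage_minutes)
    for _ in range(loads):
        ready = 0  # time at which this load leaves the previous stage
        for s, minutes in enumerate(stage_minutes):
            start = max(ready, finish[s])
            finish[s] = start + minutes
            ready = finish[s]
    return finish[-1]


stages = [30, 40, 20]  # washer, dryer, folding
print(sequential_minutes(stages, 4) / 60)  # hours for 4 loads, sequential
print(pipelined_minutes(stages, 4) / 60)   # hours for 4 loads, pipelined
```

The pipelined total is 30 + 4 × 40 + 20 = 210 minutes: the 40-minute dryer is the slowest stage, so it sets the rate at which loads finish.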

Pipelining Facts Multiple tasks operate simultaneously. Pipelining doesn’t help the latency of a single task; it helps the throughput of the entire workload. The pipeline rate is limited by the slowest pipeline stage. Potential speedup = number of pipe stages. Unbalanced lengths of pipe stages reduce speedup. Time to “fill” the pipeline and time to “drain” it reduce speedup. [Figure: task order (loads A to D) against time from 6 PM; the washer waits for the dryer for 10 minutes]

Some definitions Pipeline: an implementation technique in which multiple instructions are overlapped in execution. Pipeline stage: a computer pipeline divides instruction processing into stages. Each stage completes a part of an instruction and loads a new part in parallel. The stages are connected one to the next to form a pipe: instructions enter at one end, progress through the stages, and exit at the other end.

Some definitions Throughput: the throughput of the instruction pipeline is determined by how often an instruction exits the pipeline. Pipelining does not decrease the time for individual instruction execution. Instead, it increases instruction throughput. Machine cycle: the time required to move an instruction one step further in the pipeline. The length of the machine cycle is determined by the time required for the slowest pipe stage.

Instruction pipeline versus sequential processing [Figure: timing of sequential processing compared with an instruction pipeline]

Instruction pipeline (Contd.) Sequential processing is faster for few instructions.

Two Stage Instruction Pipeline

Difficulties... If a complicated memory access occurs in stage 1, stage 2 is delayed and the rest of the pipe is stalled. If there is a branch, such as an if or a jump, then some of the instructions that have already entered the pipeline should not be processed. We need to deal with these difficulties to keep the pipeline moving.

5-Stage Pipelining The five stages S1 to S5 are, in order: Fetch Instruction (FI), Decode Instruction (DI), Fetch Operand (FO), Execute Instruction (EI), and Write Operand (WO). [Figure: space-time diagram of stages S1 to S5 against time]

Five Stage Instruction Pipeline
1. Fetch instruction
2. Decode instruction
3. Fetch operands
4. Execute instructions
5. Write result
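A space-time diagram for these five stages can be generated as follows. This is a sketch of an ideal pipeline with no stalls or branches, where one instruction enters per cycle. The stage abbreviations FI, DI, FO, EI, and WO follow the 5-stage slide; the function name `space_time` is hypothetical.

```python
STAGES = ["FI", "DI", "FO", "EI", "WO"]


def space_time(num_instructions, stages=STAGES):
    """Return one row per instruction; each column is one clock cycle."""
    k = len(stages)
    total_cycles = k + num_instructions - 1
    table = []
    for i in range(num_instructions):
        row = ["--"] * total_cycles
        # Instruction i enters the first stage at cycle i and then
        # advances one stage per cycle.
        for s, name in enumerate(stages):
            row[i + s] = name
        table.append(row)
    return table


for i, row in enumerate(space_time(3), start=1):
    print(f"I{i}: " + " ".join(row))
```

Reading a column top to bottom shows which instruction occupies each stage in that cycle.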

6-Stage Pipelining The six stages S1 to S6 are, in order: Instruction Fetch, Decode, Calculate Operand, Fetch Operand, Execution, and Write Operand. [Figure: space-time diagram of stages S1 to S6 against time]

Six Stage Instruction Pipeline
1. Fetch instruction
2. Decode instruction
3. Calculate operands (find effective address)
4. Fetch operands
5. Execute instructions
6. Write result

Flow chart for four segment pipeline

Two major difficulties: branch difficulties and data dependency.

Prefetch target instruction Prefetch the target instruction in addition to the instruction following the branch. If the branch condition is successful, the pipeline continues from the branch target instruction.

Branch target buffer (BTB) The BTB is an associative memory. Each entry in the BTB consists of the address of a previously executed branch instruction and the target instruction for the branch.
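The lookup behaviour of a BTB can be sketched in software with a dictionary keyed by branch address. This is an illustration only: the class name, the fixed capacity, and the oldest-entry-first replacement policy are assumptions, and real hardware performs the associative search in parallel.

```python
class BranchTargetBuffer:
    """Toy BTB: maps a branch instruction's address to its target instruction."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = {}  # branch address -> target instruction

    def lookup(self, branch_address):
        """Return the stored target instruction, or None on a miss."""
        return self.entries.get(branch_address)

    def record(self, branch_address, target_instruction):
        """Store an executed branch, evicting the oldest entry when full."""
        if branch_address not in self.entries and len(self.entries) >= self.capacity:
            oldest = next(iter(self.entries))  # dicts keep insertion order
            del self.entries[oldest]
        self.entries[branch_address] = target_instruction


btb = BranchTargetBuffer(capacity=2)
btb.record(0x10, "ADD R1, R2")
print(btb.lookup(0x10))  # hit: the target instruction is available at once
print(btb.lookup(0x20))  # miss: this branch has not been executed before
```

On a hit the pipeline can continue with the stored target instruction instead of waiting for it to be fetched.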

Branch Prediction A pipeline with branch prediction uses some additional logic to guess the outcome of a conditional branch instruction before it is executed.
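The slides do not say which prediction logic is meant. One widely used scheme is a two-bit saturating counter, sketched here as an assumption rather than as the method the slide refers to. The class and function names are hypothetical.

```python
class TwoBitPredictor:
    """Two-bit saturating counter: 0 and 1 predict not taken, 2 and 3 predict taken."""

    def __init__(self):
        self.counter = 1  # start in the weak "not taken" state

    def predict(self):
        return self.counter >= 2

    def update(self, taken):
        """Move one step toward the actual outcome, saturating at 0 and 3."""
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)


def accuracy(outcomes):
    """Fraction of branch outcomes that the predictor guesses correctly."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct / len(outcomes)


# A loop branch that is taken nine times and then falls through once.
print(accuracy([True] * 9 + [False]))
```

Because the counter needs two wrong guesses in a row to flip its prediction, a single loop exit does not disturb the prediction for the next run of the loop.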

Delayed Branch In this procedure, the compiler detects the branch instruction and rearranges the machine language code sequence by inserting useful instructions that keep the pipeline operating without interruptions. An example of delayed branch is presented in the next section.