July 2005. Computer Architecture, Data Path and Control. Slide 1: Part IV Data Path and Control.

Presentation transcript:

Slide 1: Part IV Data Path and Control

Slide 2: IV Data Path and Control
Design a simple computer (MicroMIPS) to learn about:
Data path: the part of the CPU where data signals flow
Control unit: guides data signals through the data path
Pipelining: a way of achieving greater performance
Topics in This Part:
Chapter 13 Instruction Execution Steps
Chapter 14 Control Unit Synthesis
Chapter 15 Pipelined Data Paths
Chapter 16 Pipeline Performance Limits

Slide 3: 15 Pipelined Data Paths
Pipelining is now used in even the simplest of processors.
It follows the same principles as assembly lines in manufacturing.
Unlike items on an assembly line, instructions are not independent of one another.
Topics in This Chapter:
15.1 Pipelining Concepts
15.2 Pipeline Stalls or Bubbles
15.3 Pipeline Timing and Performance
15.4 Pipelined Data Path Design
15.5 Pipelined Control
15.6 Optimal Pipelining

Slide 4: Single-Cycle Data Path of Chapter 13
Figure: Key elements of the single-cycle MicroMIPS data path.
Clock rate = 125 MHz, CPI = 1 (125 MIPS)

Slide 5: Multicycle Data Path of Chapter 14
Figure: Key elements of the multicycle MicroMIPS data path.
Clock rate = 500 MHz, CPI ≈ 4 (≈ 125 MIPS)

Slide 6: Getting the Best of Both Worlds
Single-cycle: Clock rate = 125 MHz, CPI = 1
Multicycle: Clock rate = 500 MHz, CPI ≈ 4
Pipelined: Clock rate = 500 MHz, CPI ≈ 1
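The comparison on this slide follows from dividing clock rate by CPI. A minimal Python sketch of that arithmetic is given here; the function name `mips` and the dictionary layout are illustrative choices, and the multicycle and pipelined CPI values are the approximate figures quoted on the slide (the pipelined value ignores stalls).

```python
def mips(clock_mhz: float, cpi: float) -> float:
    """Millions of instructions per second = clock rate in MHz / cycles per instruction."""
    return clock_mhz / cpi


# (clock rate in MHz, CPI) for the three MicroMIPS designs quoted on the slide.
designs = {
    "single-cycle": (125.0, 1.0),
    "multicycle": (500.0, 4.0),  # CPI is approximate
    "pipelined": (500.0, 1.0),   # CPI is approximate (ideal case, no bubbles)
}

for name, (clock_mhz, cpi) in designs.items():
    print(f"{name:>12}: {mips(clock_mhz, cpi):.0f} MIPS")
```

Under these figures the pipelined design keeps the multicycle clock rate while approaching the single-cycle CPI, which is where the roughly four-fold gain comes from.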

Slide 7: 15.1 Pipelining Concepts
Figure: Pipelining in the student registration process.
Strategies for improving performance:
1. Use multiple independent data paths that accept several instructions read out at once: multiple-instruction-issue, or superscalar, execution.
2. Overlap the execution of several instructions, starting the next instruction before the previous one has run to completion: (super)pipelined execution.

Slide 8: Pipelined Instruction Execution
Figure: Pipelining in the MicroMIPS instruction execution process.

Slide 9: Alternate Representations of a Pipeline
Figure: Two abstract graphical representations of a 5-stage pipeline executing 7 tasks (instructions).
Except for start-up and drainage overheads, a pipeline can execute one instruction per clock tick, so the number of instructions per second (IPS) is dictated by the clock frequency.
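The start-up and drainage overhead can be made concrete with a small Python sketch. It assumes an ideal pipeline with no stalls, in which task i enters stage 1 at tick i; the function names are hypothetical and the text chart is only a rough stand-in for the task-time diagram in the figure.

```python
def pipelined_cycles(num_tasks: int, num_stages: int) -> int:
    """Clock ticks needed to finish num_tasks on an ideal num_stages pipeline (no stalls)."""
    if num_tasks <= 0:
        return 0
    # The first task needs num_stages ticks; each later task finishes one tick after it.
    return num_stages + (num_tasks - 1)


def task_time_chart(num_tasks: int, num_stages: int) -> list:
    """One text row per task; each column is a clock tick showing the stage occupied."""
    total = pipelined_cycles(num_tasks, num_stages)
    rows = []
    for task in range(num_tasks):
        cells = []
        for cycle in range(total):
            stage = cycle - task  # task i enters stage 1 at tick i
            cells.append(str(stage + 1) if 0 <= stage < num_stages else ".")
        rows.append(" ".join(cells))
    return rows


for row in task_time_chart(num_tasks=7, num_stages=5):
    print(row)
print("total ticks:", pipelined_cycles(7, 5))
```

For the 5-stage, 7-task case in the figure this gives 11 ticks in total, against 35 ticks if each task had to finish before the next one started.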

Slide 10: Pipelining Example in a Photocopier
Example 15.1
A photocopier with an x-sheet document feeder copies the first sheet in 4 s and each subsequent sheet in 1 s. The copier's paper path is a 4-stage pipeline with each stage having a latency of 1 s. The first sheet goes through all 4 pipeline stages and emerges after 4 s. Each subsequent sheet emerges 1 s after the previous sheet. How does the throughput of this photocopier vary with x, assuming that loading the document feeder and removing the copies takes 15 s?
Solution
Each batch of x sheets is copied in 15 + 4 + (x - 1) = 18 + x seconds, so the throughput is x / (18 + x) sheets per second. A nonpipelined copier would require 4x seconds to copy x sheets. For x > 6, the pipelined version has a performance edge. When x = 50, the pipelining speedup is (4 × 50) / (18 + 50) = 2.94.
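The arithmetic of Example 15.1 can be checked with a short Python sketch. The constant and function names are illustrative; as in the solution, the 15 s loading time is charged only to the pipelined copier and the nonpipelined copier is taken to need 4x seconds.

```python
LOAD_UNLOAD_S = 15    # loading the feeder and removing the copies
NUM_STAGES = 4        # stages in the paper path
STAGE_LATENCY_S = 1   # latency of each stage


def pipelined_time(x: int) -> int:
    """Seconds to copy x sheets: overhead + first sheet + one stage latency per later sheet."""
    return LOAD_UNLOAD_S + NUM_STAGES * STAGE_LATENCY_S + (x - 1) * STAGE_LATENCY_S


def nonpipelined_time(x: int) -> int:
    """Seconds to copy x sheets when each sheet occupies the whole paper path."""
    return NUM_STAGES * STAGE_LATENCY_S * x


def throughput(x: int) -> float:
    """Sheets per second for a batch of x sheets on the pipelined copier."""
    return x / pipelined_time(x)


def speedup(x: int) -> float:
    """Nonpipelined time divided by pipelined time for a batch of x sheets."""
    return nonpipelined_time(x) / pipelined_time(x)


for x in (1, 6, 7, 50, 1000):
    print(f"x={x:5d}  throughput={throughput(x):.3f} sheets/s  speedup={speedup(x):.2f}")
```

As x grows, the throughput approaches one sheet per second and the speedup approaches 4, the number of stages.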

Slide 11: 15.2 Pipeline Stalls or Bubbles
First type of data dependency
Figure: Read-after-write data dependency and its possible resolution through data forwarding.

Slide 12: Second Type of Data Dependency
Figure: Read-after-load data dependency and its possible resolution through bubble insertion and data forwarding.
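The two kinds of data dependency can be told apart mechanically. The Python sketch below is a simplified illustration, not the MicroMIPS hazard logic: it assumes that forwarding resolves every ordinary read-after-write dependency, and that exactly one bubble is needed when the instruction immediately after a load reads the loaded register. The `Instr` type and the example program are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str                # mnemonic, e.g. "lw" or "add"
    dest: Optional[str]    # register written, or None if nothing is written
    srcs: Tuple[str, ...]  # registers read


def count_load_use_bubbles(program: List[Instr]) -> int:
    """Count bubbles, assuming only a load followed at once by a reader of its result stalls."""
    bubbles = 0
    for prev, curr in zip(program, program[1:]):
        if prev.op == "lw" and prev.dest is not None and prev.dest in curr.srcs:
            bubbles += 1
    return bubbles


program = [
    Instr("lw", "$t0", ("$s0",)),
    Instr("add", "$t1", ("$t0", "$s1")),  # reads $t0 right after the load: one bubble
    Instr("sub", "$t2", ("$t1", "$s2")),  # read-after-write on $t1: assumed forwarded
]
print("bubbles needed:", count_load_use_bubbles(program))
```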

Slide 13: Control Dependency in a Pipeline
Figure: Control dependency due to conditional branch.

Slide 14: 15.3 Pipeline Timing and Performance
Figure: Pipelined form of a function unit with latching overhead.

Slide 15: Throughput Increase in a q-Stage Pipeline
Figure: Throughput improvement due to pipelining as a function of the number of pipeline stages for different pipelining overheads.
Throughput improvement = $\dfrac{t}{t/q + \tau}$, or equivalently $\dfrac{q}{1 + q\tau/t}$, where t is the latency of the unpipelined function unit and $\tau$ is the latching overhead per stage.
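A short Python sketch can tabulate how the throughput improvement behaves as the stage count grows. It writes τ for the per-stage latching overhead and t for the unpipelined latency (the symbol names are assumed here), and the overhead ratios used are arbitrary illustrative values, not figures from the slides.

```python
def throughput_gain(q: int, overhead_ratio: float) -> float:
    """Throughput improvement q / (1 + q * tau / t), with overhead_ratio standing for tau / t."""
    return q / (1.0 + q * overhead_ratio)


# Illustrative overhead ratios tau / t; these values are assumptions.
for ratio in (0.0, 0.05, 0.1):
    gains = ", ".join(f"q={q}: {throughput_gain(q, ratio):.2f}" for q in (1, 2, 4, 8, 16))
    print(f"tau/t = {ratio:<4}  {gains}")
```

With zero overhead the gain equals q. With any nonzero overhead the gain stays below t/τ no matter how many stages are used, which is why the curves in the figure flatten out.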

Slide 16: Pipeline Throughput with Dependencies
Assume that one bubble must be inserted due to a read-after-load dependency, and after a branch when its delay slot cannot be filled. Let $\beta$ be the fraction of all instructions that are followed by a bubble.
Pipeline speedup = $\dfrac{q}{(1 + q\tau/t)(1 + \beta)}$
Instruction mix:
R-type  44%
Load    24%
Store   12%
Branch  18%
Jump     2%
Example 15.3
Calculate the effective CPI for MicroMIPS, assuming that a quarter of branch and load instructions are followed by bubbles.
Solution
Fraction of bubbles $\beta$ = 0.25(0.24 + 0.18) = 0.105
CPI = 1 + $\beta$ = 1.105 (which is very close to the ideal value of 1)
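The calculation in Example 15.3 can be reproduced with the Python sketch below, using the instruction mix listed on the slide. The names `bubble_fraction`, `effective_cpi`, and `pipeline_speedup` are illustrative, β denotes the fraction of instructions followed by a bubble, and the arguments `q` and `overhead_ratio` (standing for τ/t) are parameters of the sketch, not values from the slides.

```python
# Instruction mix from the slide, as fractions of all instructions.
INSTRUCTION_MIX = {"R-type": 0.44, "Load": 0.24, "Store": 0.12, "Branch": 0.18, "Jump": 0.02}


def bubble_fraction(mix: dict, stalled_share: float = 0.25) -> float:
    """beta: the share of load and branch instructions followed by one bubble."""
    return stalled_share * (mix["Load"] + mix["Branch"])


def effective_cpi(mix: dict, stalled_share: float = 0.25) -> float:
    """Each bubble adds one cycle, so CPI = 1 + beta."""
    return 1.0 + bubble_fraction(mix, stalled_share)


def pipeline_speedup(q: int, overhead_ratio: float, beta: float) -> float:
    """q / ((1 + q * tau / t) * (1 + beta)), with overhead_ratio standing for tau / t."""
    return q / ((1.0 + q * overhead_ratio) * (1.0 + beta))


beta = bubble_fraction(INSTRUCTION_MIX)
print(f"beta = {beta:.3f}, effective CPI = {effective_cpi(INSTRUCTION_MIX):.3f}")
```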

Slide 17: 15.4 Pipelined Data Path Design
Figure: Key elements of the pipelined MicroMIPS data path.

Slide 18: 15.5 Pipelined Control
Figure: Pipelined control signals.

Slide 19: 15.6 Optimal Pipelining
MicroMIPS pipeline with more than four-fold improvement
Figure: Higher-throughput pipelined data path for MicroMIPS and the execution of consecutive instructions in it.

Slide 20: Optimal Number of Pipeline Stages
Figure: Pipelined form of a function unit with latching overhead.
Assumptions:
The pipeline is sliced into q stages.
The stage overhead is $\tau$.
Each taken branch costs q/2 bubbles (the decision is made midway through the pipeline).
A fraction b of all instructions are taken branches.
Derivation of $q_{\text{opt}}$:
Average CPI = $1 + bq/2$
Throughput = Clock rate / CPI = $[(t/q + \tau)(1 + bq/2)]^{-1}$
Differentiate the throughput expression with respect to q and set the result equal to 0:
$q_{\text{opt}} = [2(t/\tau)/b]^{1/2}$
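The differentiation step can be spelled out as follows. This is a sketch of the intermediate algebra, writing τ for the stage overhead (symbol assumed) and using the fact that maximizing throughput is the same as minimizing the average time per instruction, T(q), the product of the clock period and the average CPI.

```latex
\begin{align*}
T(q) &= \left(\frac{t}{q} + \tau\right)\left(1 + \frac{bq}{2}\right)
      = \frac{t}{q} + \frac{bt}{2} + \tau + \frac{b\tau q}{2}, \\
\frac{dT}{dq} &= -\frac{t}{q^{2}} + \frac{b\tau}{2} = 0
  \;\Longrightarrow\;
  q_{\text{opt}}^{2} = \frac{2t}{b\tau}
  \;\Longrightarrow\;
  q_{\text{opt}} = \sqrt{\frac{2\,(t/\tau)}{b}}.
\end{align*}
```

The second derivative, $2t/q^{3}$, is positive for q > 0, so this stationary point is a minimum of T(q) and therefore a maximum of the throughput.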