ENGS 116 Lecture 4: Instruction Set Design Part II, Introduction to Pipelining. Vincent H. Berk, September 28, 2005.

Presentation transcript:

ENGS 116 Lecture 4, slide 1: Instruction Set Design Part II, Introduction to Pipelining
Vincent H. Berk, September 28, 2005
- Reading for today: Chapter 2.1-2.12, Wulf article
- Reading for Friday: Chapter A.1-A.3, Patterson & Ditzel
- Homework #1 tomorrow

ENGS 116 Lecture 4, slide 2: Projects
- Teams of 2
- Two options:
  - Research
  - Programming
- Proposal due Wednesday, 12th October:
  - 2 pages
  - Introduction to the problem, objectives
  - Approach for solving the problem
  - Expected working plan, hypothesis
  - References to literature

ENGS 116 Lecture 4, slide 3: Projects
- Research project:
  - Exhaustive overview study of a particular topic
  - Research paper with a thesis and an argument (15-20 pages)
  - Future vision
- Programming project:
  - Produce a simulator or a benchmark
  - Use the produced software to test a thesis
  - Present experimental results and analysis (report)

ENGS 116 Lecture 4, slide 4: Review: Instruction Set Design Parameters
- Operand storage in the CPU: Where are operands kept other than in memory?
- Number of explicit operands named per instruction: How many operands are named explicitly in a typical instruction?
- Operand location: Can any ALU operand be located in memory, or must some or all of the operands be in internal storage in the CPU? If an operand is located in memory, how is the memory location specified?
- Operations: What operations are provided in the instruction set?
- Type and size of operands: What is the type and size of each operand, and how is it specified?

ENGS 116 Lecture 4, slide 5: Intel 8086
- Not a truly general-purpose register machine, because nearly every register has a dedicated use
- 16-bit architecture: internal registers are 16 bits
- 20-bit address space, broken into 64-KB fragments (segments)
- Variable-length instructions
- The 8086 has 14 registers divided into 4 groups: data registers, address registers, segment registers, and control registers
- Addressing modes: absolute (16-bit absolute address), register indirect, based, indexed, and based indexed with displacement
- Operations: data movement, arithmetic and logic, control flow, string
- 80386: 32-bit architecture with 32-bit registers and a 32-bit address space, additional addressing modes, and additional operations
- The 80x86 is the most successful instruction set architecture of all time
- The awkward, old architecture is a barrier to improvements
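The 20-bit address space built from 16-bit registers can be illustrated with a short sketch. It relies on the well-known 8086 real-mode rule that a physical address is the segment value shifted left by 4 bits plus a 16-bit offset; the function name `physical_address` and the sample values are illustrative and do not come from the slides.

```python
def physical_address(segment: int, offset: int) -> int:
    """Form a 20-bit 8086 real-mode address from a 16-bit segment and offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment left by 4 bits (multiply by 16), add the offset,
    # and wrap to 20 bits because the 8086 has only 20 address lines.
    return ((segment << 4) + offset) & 0xFFFFF


SEGMENT_SIZE = 1 << 16    # 64 KB reachable from one segment base
ADDRESS_SPACE = 1 << 20   # 1 MB total with 20 address bits

if __name__ == "__main__":
    # Illustrative values only: segment 0x1234, offset 0x0010.
    print(hex(physical_address(0x1234, 0x0010)))
    print(SEGMENT_SIZE, ADDRESS_SPACE)
```

Because segment bases are only 16 bytes apart, many segment:offset pairs name the same physical byte, which is part of what makes the scheme awkward.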

ENGS 116 Lecture 4, slide 6: Intel 80x86 Integer Registers
Register   80386, 80486, Pentium   8086          Use
GPR 0      EAX                     AX (AH, AL)   Accumulator
GPR 1      ECX                     CX (CH, CL)   Count reg: string, loop
GPR 2      EDX                     DX (DH, DL)   Data reg: multiply, divide
GPR 3      EBX                     BX (BH, BL)   Base addr. reg
GPR 4      ESP                     SP            Stack ptr.
GPR 5      EBP                     BP            Base ptr. (for base of stack seg.)
GPR 6      ESI                     SI            Index reg, string source ptr.
GPR 7      EDI                     DI            Index reg, string dest. ptr.
           CS                                    Code segment ptr.
           SS                                    Stack segment ptr. (top of stack)
           DS                                    Data segment ptr.
           ES                                    Extra data segment ptr.
           FS                                    Data segment ptr. 2
           GS                                    Data segment ptr. 3
PC         EIP                     IP            Instruction ptr. (PC)
           FLAGS                                 Condition codes
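The EAX, AX, AH, and AL names in the register figure refer to overlapping parts of one 32-bit register: AX is the low 16 bits of EAX, AH is the high byte of AX, and AL is the low byte. A minimal sketch of that aliasing, with hypothetical helper names `sub_registers` and `write_al`:

```python
def sub_registers(eax: int) -> dict:
    """Split a 32-bit EAX value into the overlapping AX, AH, and AL views."""
    eax &= 0xFFFFFFFF            # keep only 32 bits
    ax = eax & 0xFFFF            # low 16 bits (the original 8086 register)
    return {
        "EAX": eax,
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,  # high byte of AX
        "AL": ax & 0xFF,         # low byte of AX
    }


def write_al(eax: int, value: int) -> int:
    """Writing AL changes only the lowest byte of EAX."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)


if __name__ == "__main__":
    # Illustrative value only.
    for name, value in sub_registers(0x12345678).items():
        print(f"{name}: {value:#x}")
    print(hex(write_al(0x12345678, 0xAB)))
```

The same layout applies to ECX, EDX, and EBX; the pointer and index registers (ESP, EBP, ESI, EDI) have 16-bit views but no byte-sized halves in this figure.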

ENGS 116 Lecture 4, slide 7: Intel 80x86 Floating Point Registers
[Figure: floating-point register stack FPR 0 through FPR 7, each spanning bits 79 to 0, plus a status register holding the top of FP stack and the FP condition codes]

ENGS 116 Lecture 4, slide 8: 80x86 Length Distribution

ENGS 116 Lecture 4, slide 9: Current Design Guidelines
- Use general-purpose registers with a load-store architecture
- Support these addressing modes: displacement, immediate, and register deferred
- Use a minimalist instruction set
- Support simple, most commonly used instructions
- Support standard data sizes and types: 8-, 16-, and 32-bit integers and 64-bit IEEE 754 floating-point numbers
- Use fixed instruction encoding if interested in performance and variable instruction encoding if interested in code size
- Provide at least 16 general-purpose registers plus separate floating-point registers; 32 registers of each are highly desirable

ENGS 116 Lecture 4, slide 10: The Big Picture: The Performance Perspective
- Performance of a machine is determined by:
  - Instruction count
  - Clock cycle time
  - Clock cycles per instruction
- Processor design (datapath and control) will determine:
  - Clock cycle time
  - Clock cycles per instruction
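The three factors above combine multiplicatively in the standard CPU performance equation: CPU time = instruction count × clock cycles per instruction × clock cycle time. A minimal sketch, where the function name and every number are made up purely for illustration:

```python
def cpu_time(instruction_count: int, cpi: float, clock_cycle_time_s: float) -> float:
    """CPU time in seconds = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * clock_cycle_time_s


if __name__ == "__main__":
    # Hypothetical program: 1 million instructions on a 1 ns clock.
    baseline = cpu_time(1_000_000, cpi=2.0, clock_cycle_time_s=1e-9)
    # A design change that lowers CPI but lengthens the clock cycle
    # only helps if the product of the two goes down.
    candidate = cpu_time(1_000_000, cpi=1.5, clock_cycle_time_s=1.2e-9)
    print(f"baseline:  {baseline:.6f} s")
    print(f"candidate: {candidate:.6f} s")
```

Since the processor design affects both CPI and cycle time, a change has to be judged on their product rather than on either factor alone.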

ENGS 116 Lecture 4, slide 11: Pipelining: It's Natural!
- Laundry example
- Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold
- Washer takes 30 minutes
- Dryer takes 40 minutes
- "Folder" takes 20 minutes

ENGS 116 Lecture 4, slide 12: Sequential Laundry
- Sequential laundry takes 6 hours for 4 loads
- If they learned pipelining, how long would laundry take?
[Figure: tasks A, B, C, D in task order against a time axis running from 6 PM to midnight, one load at a time]

ENGS 116 Lecture 4, slide 13: Pipelined Laundry
- Start work ASAP
- Pipelined laundry takes 3.5 hours for 4 loads
[Figure: tasks A, B, C, D in task order against a time axis starting at 6 PM, with overlapping 30-, 40-, and 20-minute stages]
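The 6-hour and 3.5-hour totals can be checked with a small scheduling sketch. It assumes each machine handles one load at a time and that a finished load can wait between machines; the function names are illustrative.

```python
STAGE_MINUTES = [30, 40, 20]  # washer, dryer, "folder" from the laundry example


def sequential_minutes(stages, loads):
    """Each load runs through every stage before the next load starts."""
    return sum(stages) * loads


def pipelined_minutes(stages, loads):
    """Each load enters a stage as soon as both it and the stage are free."""
    stage_free_at = [0] * len(stages)  # when each stage finishes its latest load
    for _ in range(loads):
        ready_at = 0                   # when this load has finished the prior stage
        for s, duration in enumerate(stages):
            start = max(ready_at, stage_free_at[s])
            stage_free_at[s] = start + duration
            ready_at = stage_free_at[s]
    return stage_free_at[-1]


if __name__ == "__main__":
    print(sequential_minutes(STAGE_MINUTES, 4) / 60)  # hours for 4 loads
    print(pipelined_minutes(STAGE_MINUTES, 4) / 60)   # hours for 4 loads
```

In the pipelined schedule the 40-minute dryer is busy continuously once the first wash finishes, so the total is 30 + 4 × 40 + 20 = 210 minutes.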

ENGS 116 Lecture 4, slide 14: Pipelining Lessons
- Pipelining doesn't help the latency of a single task; it helps the throughput of the entire workload
- Pipeline rate is limited by the slowest pipeline stage
- Multiple tasks operate simultaneously
- Potential speedup = number of pipe stages
- Unbalanced lengths of pipe stages reduce speedup
- Time to "fill" the pipeline and time to "drain" it reduce speedup
[Figure: tasks A, B, C, D in task order against a time axis from 6 PM to 9 PM, with overlapping 30-, 40-, and 20-minute stages]
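The fill and drain effect can be quantified for the idealized case of perfectly balanced stages. With k stages of equal length and n tasks, unpipelined time is n × k stage-times and pipelined time is k + n − 1 stage-times, so speedup is n × k / (k + n − 1). This is a sketch under that balanced-stage assumption, and the function name is illustrative.

```python
def balanced_speedup(stages: int, tasks: int) -> float:
    """Speedup of a pipeline with equal-length stages, including fill and drain."""
    unpipelined = tasks * stages        # stage-times without overlap
    pipelined = stages + tasks - 1      # fill the pipe once, then one task per stage-time
    return unpipelined / pipelined


if __name__ == "__main__":
    for tasks in (1, 4, 100, 10_000):
        print(tasks, round(balanced_speedup(3, tasks), 3))
```

With 3 balanced stages and only 4 tasks the speedup is 2, and it approaches 3 only as the number of tasks grows. The laundry example does worse than the balanced case, 6 h / 3.5 h ≈ 1.71, because its stages are unbalanced and the 40-minute dryer sets the rate.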

ENGS 116 Lecture 4, slide 15: Basic MIPS RISC Instruction Set
- All operations on data apply to data in registers
- The only operations that affect memory are load and store operations, which move data from memory to a register or to memory from a register
- Instruction formats are few in number, with all instructions typically being one size
- 32 registers
- 3 classes of instructions: ALU, load and store, branches and jumps

ENGS 116 Lecture 4, slide 16: Simple Implementation of the MIPS RISC Instruction Set
- Instruction fetch cycle (IF)
  - Send PC to memory
  - Fetch current instruction from memory
  - Update PC
- Instruction decode/register fetch cycle (ID)
  - Decode instruction
  - Read registers corresponding to register source specifiers from the register file (in parallel with decoding)
  - Look for branch conditions, act accordingly

ENGS 116 Lecture 4, slide 17: Simple Implementation of the MIPS RISC Instruction Set
- Execution/effective address cycle (EX)
  - ALU operates on the operands prepared in the prior cycle and performs one of three functions:
  - Memory reference: ALU adds base register and offset to form the effective address
  - Register-register ALU instruction: ALU performs the operation specified by the ALU opcode on the values read from the register file
  - Register-immediate ALU instruction: ALU performs the operation specified by the ALU opcode on the first value read from the register file and the sign-extended immediate

ENGS 116 Lecture 4, slide 18: Simple Implementation of the MIPS RISC Instruction Set
- Memory access (MEM)
  - Performs a read using the effective address if the instruction is a load
  - Performs a write of the data from the second register read from the register file, using the effective address, if the instruction is a store
- Write-back cycle (WB)
  - Writes to the register file for either a register-register ALU instruction or a load instruction
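The five cycles (IF, ID, EX, MEM, WB) can be overlapped so that a new instruction starts every clock cycle. The sketch below prints the usual stage-by-cycle chart for an idealized pipeline with no stalls. The sample instructions and the helper names are hypothetical and chosen to have no dependences.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(instructions):
    """Return (instruction, cells) rows; cells[c] is the stage occupied in cycle c."""
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for i, text in enumerate(instructions):
        cells = []
        for cycle in range(total_cycles):
            stage = cycle - i  # instruction i enters IF in cycle i
            cells.append(STAGES[stage] if 0 <= stage < len(STAGES) else "")
        rows.append((text, cells))
    return rows


def format_diagram(rows):
    """Render the rows as fixed-width text, one instruction per line."""
    lines = []
    for text, cells in rows:
        lines.append(f"{text:<18}" + "".join(f"{c:<4}" for c in cells).rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical, dependence-free instruction sequence.
    program = ["lw  r1, 0(r2)", "add r3, r4, r5", "sw  r6, 4(r7)"]
    print(format_diagram(pipeline_diagram(program)))
```

Three instructions finish in 3 + 5 − 1 = 7 cycles instead of the 15 cycles needed if each one ran all five steps before the next began.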

ENGS 116 Lecture 4, slide 19

ENGS 116 Lecture 4, slide 20: Example
Consider a nonpipelined machine with 5 execution steps of lengths 50 ns, 50 ns, 60 ns, 50 ns, and 50 ns. Due to clock skew and setup, pipelining adds 5 ns of overhead to each instruction stage. Ignoring latency impact, how much speedup in the instruction execution rate will we gain from a pipeline?
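One way to work this example: without pipelining, an instruction takes the sum of the five steps; with pipelining, the clock must fit the slowest step plus the overhead, and in steady state one instruction completes per clock. The sketch below follows that reasoning; the function name is illustrative.

```python
def pipeline_speedup(step_times_ns, overhead_ns):
    """Ratio of unpipelined instruction time to pipelined clock period."""
    unpipelined_ns = sum(step_times_ns)                    # 50+50+60+50+50 = 260 ns
    pipelined_cycle_ns = max(step_times_ns) + overhead_ns  # 60 + 5 = 65 ns
    return unpipelined_ns / pipelined_cycle_ns


if __name__ == "__main__":
    print(pipeline_speedup([50, 50, 60, 50, 50], 5))  # 260 / 65
```

The result is a speedup of 4 rather than the ideal 5, because the 60 ns step and the 5 ns overhead both lengthen the pipelined clock.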

ENGS 116 Lecture 4, slide 21
[Figure: sequential execution compared with pipelined execution]

ENGS 116 Lecture 4, slide 22: It's Not That Easy for Computers
- Limits to pipelining: hazards prevent the next instruction from executing during its designated clock cycle
  - Structural hazards: hardware cannot support this combination of instructions
  - Data hazards: instruction depends on the result of a prior instruction still in the pipeline
  - Control hazards: caused by pipelining branches and other instructions that change the PC
- Common solution is to stall the pipeline until the hazard "bubbles" through the pipeline
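A data hazard of the kind described above can be spotted by checking whether an instruction reads a register that a recent, still in-flight instruction writes. The sketch below is a simplified illustration, not the hardware's actual interlock logic: the `Instr` record, the sample program, and the `window` parameter (how many earlier instructions count as "still in the pipeline") are all assumptions made for the example.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    text: str                 # assembly text, for display only
    dest: Optional[str]       # register written, or None (e.g. a store)
    sources: Tuple[str, ...]  # registers read


def raw_hazards(program, window=2):
    """List (producer index, consumer index, register) read-after-write pairs.

    `window` is an assumed number of earlier instructions whose results are
    not yet available; it depends on the pipeline design.
    """
    hazards = []
    for i, later in enumerate(program):
        for j in range(max(0, i - window), i):
            earlier = program[j]
            if earlier.dest is not None and earlier.dest in later.sources:
                hazards.append((j, i, earlier.dest))
    return hazards


if __name__ == "__main__":
    # Hypothetical program: three instructions read r1 right after it is written.
    program = [
        Instr("add r1, r2, r3", "r1", ("r2", "r3")),
        Instr("sub r4, r1, r5", "r4", ("r1", "r5")),
        Instr("and r6, r1, r7", "r6", ("r1", "r7")),
        Instr("or  r8, r1, r9", "r8", ("r1", "r9")),
    ]
    for producer, consumer, reg in raw_hazards(program):
        print(f"{program[consumer].text!r} needs {reg} from {program[producer].text!r}")
```

Each reported pair marks a place where the pipeline would have to stall, or use some other mechanism, to deliver the correct value.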

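ENGS 116 Lecture 4, slide 23: Speed Up Equation for Pipelining
\[
\text{Speedup from pipelining}
= \frac{\text{Average instruction time}_{\text{unpipelined}}}{\text{Average instruction time}_{\text{pipelined}}}
= \frac{\text{CPI}_{\text{unpipelined}} \times \text{Clock cycle}_{\text{unpipelined}}}{\text{CPI}_{\text{pipelined}} \times \text{Clock cycle}_{\text{pipelined}}}
\]
\[
\text{Ideal CPI} = \frac{\text{CPI}_{\text{unpipelined}}}{\text{Pipeline depth}}
\]
\[
\text{Speedup} = \frac{\text{Ideal CPI} \times \text{Pipeline depth}}{\text{CPI}_{\text{pipelined}}} \times \frac{\text{Clock cycle}_{\text{unpipelined}}}{\text{Clock cycle}_{\text{pipelined}}}
\]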
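The speedup relation can be made concrete once stalls are included. The sketch below assumes the common textbook decomposition in which pipelined CPI equals the ideal CPI plus the average stall cycles per instruction, so that speedup = (ideal CPI × pipeline depth) / (ideal CPI + stall cycles per instruction) × (unpipelined clock cycle / pipelined clock cycle). The function and parameter names are illustrative.

```python
def speedup_with_stalls(pipeline_depth: int,
                        stall_cycles_per_instr: float,
                        ideal_cpi: float = 1.0,
                        clock_ratio: float = 1.0) -> float:
    """Pipelining speedup when hazards add stall cycles.

    clock_ratio is unpipelined clock cycle / pipelined clock cycle;
    1.0 means pipelining did not change the cycle time.
    """
    # Assumed decomposition: stalls add directly to the ideal CPI.
    cpi_pipelined = ideal_cpi + stall_cycles_per_instr
    return (ideal_cpi * pipeline_depth / cpi_pipelined) * clock_ratio


if __name__ == "__main__":
    print(speedup_with_stalls(5, 0.0))   # no stalls: speedup equals the depth
    print(speedup_with_stalls(5, 0.25))  # one stall every four instructions
```

With an ideal CPI of 1 and an unchanged clock, a 5-stage pipeline that stalls a quarter of a cycle per instruction on average delivers a speedup of 4 rather than 5.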

ENGS 116 Lecture 4, slide 24: Speed Up Equation for Pipelining