10/11: Lecture Topics
Slides on starting a program from last time
Where we are, where we’re going
RISC vs. CISC reprise
Execution cycle
Pipelining
Hazards


Where we’ve been:
–Architecture vs. implementation
–MIPS assembly
–Addressing modes, instruction encoding
–Assembly, linking, and loading
–Chapters 1 & 3

Where we’re going
Make it fast
–pipelining (chapter 6)
–caching (chapter 7)
Make it useful
–Input/Output (chapter 8)
Current research, future trends
Midterm October 27th

Where we’re not going
Performance: chapter 2
Bit twiddling: chapter 4
Datapath and control: chapter 5
–important, but depends on a background in digital logic
Multiprocessors: chapter 9

RISC vs. CISC
Reduced Instruction Set Computer
–MIPS: about 100 instructions
–Basic idea: compose simple instructions to get complex results
Complex Instruction Set Computer
–VAX: about 325 instructions
–Basic idea: give programmers powerful instructions; fewer instructions are needed to complete the work

The VAX
Digital Equipment Corp, 1977
Advances in microcode technology made complex instructions possible
Memory was expensive
–small program = good
Compilers had a long way to go
–ease of translation from high-level language to assembly = good

VAX Instructions
Queue manipulation instructions:
–INSQUE: insert into queue
Stack manipulation instructions:
–POPR, PUSHR: pop, push registers
Procedure call instructions
Binary-coded decimal instructions
–ADDP, SUBP, MULP, DIVP
–CVTPL, CVTLP (conversion)

The RISC Backlash
Complex instructions:
–take longer to execute
–take more hardware to implement
Idea: compose simple, fast instructions
–less hardware is required
–execution speed may actually increase
PUSHR vs. sw + sw + sw

How many instructions?
How many instructions do you really need?
Potentially only one: subtract and branch if negative (sbn)
See p. 206 of your book
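To make the one-instruction idea concrete, here is a minimal sketch of an sbn machine in Python. The instruction format (a triple of two memory addresses and a branch target) and the copy example are assumptions for illustration, not from the book.

```python
# Toy interpreter for a one-instruction machine: sbn (subtract and
# branch if negative). Each instruction is a triple (a, b, target):
#   mem[a] = mem[a] - mem[b]; if mem[a] < 0 jump to target,
#   otherwise fall through to the next instruction.

def run_sbn(program, mem, max_steps=10_000):
    """Run a list of (a, b, target) triples until the PC falls off the end."""
    pc = 0
    for _ in range(max_steps):
        if pc < 0 or pc >= len(program):
            return mem
        a, b, target = program[pc]
        mem[a] -= mem[b]
        pc = target if mem[a] < 0 else pc + 1
    raise RuntimeError("step limit exceeded")

# Example: copy mem[1] into mem[0] using only sbn.
# mem[2] is a scratch cell, assumed initialized to 0.
mem = {0: 0, 1: 7, 2: 0}
program = [
    (2, 1, 1),   # scratch = 0 - mem[1]; branch target equals fall-through
    (0, 0, 2),   # mem[0] = mem[0] - mem[0] = 0
    (0, 2, 3),   # mem[0] = 0 - (-mem[1]) = mem[1]
]
run_sbn(program, mem)
print(mem[0])  # -> 7
```

Even a register copy takes three sbn instructions, which is the usual argument for why real machines stop well short of this extreme.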

Execution Cycle
Five steps to executing an instruction:
1. Fetch: get the next instruction to execute from memory onto the chip
2. Decode: figure out what the instruction says to do; get values from registers
3. Execute: do what the instruction says; for example,
–on a memory reference, add up base and offset
–on an arithmetic instruction, do the math

More Execution Cycle
4. Memory access: if it’s a load or store, access memory; if it’s a branch, replace the PC with the destination address; otherwise do nothing
5. Write back: place the result of the operation in the appropriate register
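The five steps above can be sketched as a toy single-cycle interpreter. This is an illustration only: the instruction tuples, register names, and state dictionary are invented here, not real MIPS encodings.

```python
# One pass through fetch / decode / execute / memory / writeback
# for a toy machine with add, lw, and beq (hypothetical encodings).

def step(state):
    # 1. Fetch: get the next instruction from instruction memory
    op, *args = state["imem"][state["pc"]]
    next_pc = state["pc"] + 1
    regs, mem = state["regs"], state["dmem"]
    # 2. Decode happens implicitly as we unpack the operands below
    if op == "add":
        rd, rs, rt = args
        regs[rd] = regs[rs] + regs[rt]        # 3. Execute, then 5. Write back
    elif op == "lw":
        rd, rs, offset = args
        addr = regs[rs] + offset              # 3. Execute: base + offset
        regs[rd] = mem[addr]                  # 4. Memory access, 5. Write back
    elif op == "beq":
        rs, rt, target = args
        if regs[rs] == regs[rt]:              # 4. Replace the PC on a taken branch
            next_pc = target
    state["pc"] = next_pc

state = {"pc": 0,
         "regs": {"$t0": 5, "$t1": 7, "$t2": 0},
         "dmem": {12: 99},
         "imem": [("add", "$t2", "$t0", "$t1"),   # $t2 = 5 + 7 = 12
                  ("lw", "$t0", "$t2", 0)]}       # $t0 = mem[12] = 99
step(state); step(state)
print(state["regs"]["$t2"], state["regs"]["$t0"])  # -> 12 99
```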

Laundry
Four steps to doing the laundry:
–wash, dry, fold, put away
If each step = 30 min., 4 loads = _____

Pipelined Laundry
Allow laundry stages to operate concurrently
Now four loads take _____

Latency vs. Throughput
The latency of a load of laundry is 2 hours
–does not change with pipelining
The throughput of the laundry system is
–1 load/2 hours = 0.5 LPH without pipelining
–1 load/0.5 hours = 2 LPH with pipelining
The speedup is 4, the same as the number of stages (when the stages are balanced)

Balancing the Stages
What if the dryer takes an hour, while the other stages take 30 minutes?
1 load/1 hour = 1 LPH
Speedup = 2 (relative to the original 0.5 LPH), not 4: the slowest stage limits throughput
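The laundry arithmetic on the last two slides can be checked with a short calculation. The helper names are invented for this sketch; the stage times are the ones from the slides.

```python
# Latency is the sum of the stage times; once the pipeline is full,
# throughput is one load per slowest stage. Times in minutes.

def latency(stages):
    return sum(stages)

def pipelined_throughput(stages):
    """Loads per hour with pipelining: limited by the slowest stage."""
    return 60 / max(stages)

balanced   = [30, 30, 30, 30]   # wash, dry, fold, put away
unbalanced = [30, 60, 30, 30]   # dryer takes an hour

print(latency(balanced) / 60)          # -> 2.0 hours per load
print(60 / latency(balanced))          # -> 0.5 LPH without pipelining
print(pipelined_throughput(balanced))  # -> 2.0 LPH with pipelining

# Unbalanced stages: throughput drops to 1 LPH, so the speedup over
# the original 0.5 LPH is 2, not 4.
print(pipelined_throughput(unbalanced))        # -> 1.0
print(pipelined_throughput(unbalanced) / 0.5)  # -> 2.0
```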

Pipelining Instructions
We can overlap the five stages of the execution cycle
Five different instructions can be executing simultaneously, if:
–they are all in different stages
–the stages are nearly balanced
–nothing else goes wrong
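A short script can draw the classic pipeline diagram for an ideal (hazard-free) pipeline: instruction i occupies stage s during cycle i + s, so by the fifth cycle all five instructions are in flight at once. The stage abbreviations follow the usual MIPS convention.

```python
# Print the stage each instruction occupies in each cycle of an
# ideal 5-stage pipeline (no stalls).

STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_diagram(n_instructions):
    """Return one row per instruction; row[c] is its stage in cycle c."""
    total_cycles = n_instructions + len(STAGES) - 1
    rows = []
    for i in range(n_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(STAGES):
            row[i + s] = name          # instruction i reaches stage s in cycle i+s
        rows.append(row)
    return rows

for i, row in enumerate(pipeline_diagram(5)):
    print(f"i{i}: " + " ".join(f"{c:>3}" for c in row))
```

In cycle 5 of the printed diagram, the five instructions occupy IF, ID, EX, MEM, and WB simultaneously, which is exactly the overlap the slide describes.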

What could go wrong?
Structural hazards
–two instructions need the same hardware resource at the same time
Control hazards
–we need to make a decision, but not all of the information is available yet
Data hazards
–we need the result of a previous computation for this computation

Structural Hazards
Suppose a lw instruction is in stage four (memory access)
Meanwhile, an add instruction is in stage one (instruction fetch)
Both of these actions require access to memory; they could collide
In practice they don’t, because the caching system serves instruction fetches and data accesses separately

Control Hazards
Suppose we have an slt/bne combination
slt stores its result to a register in stage five
bne needs that result at the beginning of stage four; it can’t proceed
We can stall, waiting for the result
We can do speculative execution and guess the result
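The cost of the stall-vs-speculate choice can be estimated with a back-of-the-envelope CPI calculation. The branch frequency, stall penalty, and prediction accuracy below are hypothetical numbers chosen for illustration, not figures from the lecture.

```python
# Effective CPI when branches stall the pipeline. With speculation,
# only mispredicted branches pay the stall penalty.

def effective_cpi(base_cpi, branch_fraction, stall_cycles, predict_accuracy=0.0):
    """predict_accuracy=0.0 models 'always stall' (every branch pays)."""
    mispredict_rate = 1.0 - predict_accuracy
    return base_cpi + branch_fraction * mispredict_rate * stall_cycles

# Assume 20% of instructions are branches and each unresolved branch
# costs 3 bubble cycles:
print(round(effective_cpi(1.0, 0.20, 3), 2))       # -> 1.6  (always stall)
print(round(effective_cpi(1.0, 0.20, 3, 0.9), 2))  # -> 1.06 (90%-accurate guessing)
```

The gap between 1.6 and 1.06 is why real pipelines guess rather than wait.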

Data Hazards
Suppose we want to execute:
add $t2, $t0, $t1
add $t4, $t2, $t3
The first addition doesn’t store its result until the end of stage five
The second addition wants to load its operands in stage two

Handling Data Hazards
Again, you can stall
You can use data forwarding
–pass the result directly from stage 3 of the first add to stage 3 of the second add
Sometimes you can do out-of-order execution
–reorder the instructions so as to maintain correctness while avoiding or reducing stalls
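A small helper makes the payoff of forwarding countable. This is a sketch under assumed timings: results forwardable one cycle after issue for ALU ops and two for loads, and (without forwarding) a register written in the first half of a cycle and readable in the second half, so a consumer three slots later needs no bubbles. Different pipelines shift these numbers.

```python
# Bubble (stall) cycles a dependent instruction needs in a 5-stage
# MIPS-style pipeline, with and without forwarding.

def bubbles(producer, distance, forwarding=True):
    """producer: 'alu' or 'load'; distance: 1 = immediately preceding."""
    if forwarding:
        ready = 1 if producer == "alu" else 2  # cycles until the result can be forwarded to EX
        return max(0, ready - distance)
    # Without forwarding, wait for writeback (write-then-read register file)
    return max(0, 3 - distance)

# add $t2, $t0, $t1 followed immediately by add $t4, $t2, $t3:
print(bubbles("alu", 1, forwarding=False))  # -> 2 bubbles without forwarding
print(bubbles("alu", 1))                    # -> 0 with EX-to-EX forwarding
print(bubbles("load", 1))                   # -> 1: the classic load-use stall
```

The load-use case is the one forwarding alone cannot fix, which is exactly where reordering (filling the slot with an independent instruction) earns its keep.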