
1 November: Exam Postmortem
11 Classes to go!
Read Sections 7.1 and 7.2. You read 6.1 for this time. Right?
Pipelining, then on to the memory hierarchy.

[6] Give at least 3 different formulas for execution time using the following variables. Each equation should be minimal; that is, it should not contain any variable that is not needed. CPI, R=clock rate, T=cycle time, M=MIPS, I=number of instructions in program, C=number of cycles in program.
Answer: CPI*I*T, CPI*I/R, C*T, C/R, I/(MIPS*10^6)
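A quick check that these answers are equivalent, starting from the definitions of the variables:
\[
C = \mathrm{CPI}\times I,\qquad T = \frac{1}{R},\qquad
\text{time} = C\times T = \frac{C}{R} = \mathrm{CPI}\times I\times T = \frac{\mathrm{CPI}\times I}{R},\qquad
\mathrm{MIPS} = \frac{I}{\text{time}\times 10^{6}}\;\Rightarrow\;\text{time} = \frac{I}{\mathrm{MIPS}\times 10^{6}}.
\]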

[12] Soon we will all have computers with multiple processors. This will allow us to run some kinds of programs faster. Suppose your computer has N processors and that a program you want to make faster has some fraction F of its computation that is inherently sequential (either all the processors have to do the same work, thus duplicating effort, or the work has to be done by a single processor while the other processors are idle). What speedup can you expect to achieve on your new computer?
Answer: 1/(F + (1-F)/N)
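A worked example (numbers chosen here just for illustration): with F = 0.2 and N = 4,
\[
\text{speedup} = \frac{1}{F + \frac{1-F}{N}} = \frac{1}{0.2 + \frac{0.8}{4}} = \frac{1}{0.4} = 2.5,
\qquad\text{and}\qquad
\lim_{N\to\infty}\text{speedup} = \frac{1}{F} = 5.
\]
No matter how many processors you add, the sequential fraction caps the speedup at 1/F.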

[10] Show the minimum sequence of instructions required to sum up an array of 5 integers, stored as consecutive words in memory, given the address of the first word in $a0 and leaving the sum in register $v0.
Answer: 5 loads + 4 adds = 9 instructions
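One sequence that fits the 9-instruction count (the temporary register $t0 is my choice; the question only fixes $a0 and $v0):
  lw  $v0, 0($a0)      # first element starts the sum
  lw  $t0, 4($a0)
  add $v0, $v0, $t0
  lw  $t0, 8($a0)
  add $v0, $v0, $t0
  lw  $t0, 12($a0)
  add $v0, $v0, $t0
  lw  $t0, 16($a0)
  add $v0, $v0, $t0    # $v0 now holds the sum of all 5 words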

[10] Which of the following cannot be EXACTLY represented by an IEEE double-precision floating-point number? (a) 0, (b) 10.2, (c) 1.625, (d) 11.5
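One way to check the options is to write each in binary:
\[
0 = 0,\qquad 1.625 = 1.101_2,\qquad 11.5 = 1011.1_2,\qquad 10.2 = 1010.\overline{0011}_2 .
\]
The first three terminate, so they are exact in double precision; 10.2 has a repeating binary fraction, so (b) is the value that cannot be represented exactly.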

[12] All of the following equations are true for the idealized numbers you studied in algebra. Which ones are true for IEEE floating-point numbers? Assume that all of the numbers are well within the range of the largest and smallest possible numbers (that is, underflow and overflow are not a problem).
–A+B=A if and only if B=0
–(A+B)+C = A+(B+C)
–A+B = B+A
–A*(B+C) = A*B+A*C
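A concrete double-precision illustration for the first two (round-to-nearest; the numbers are chosen here for the example): take A = 10^16 and B = C = 0.75. Adjacent doubles near 10^16 are spaced 2 apart, so
\[
A + B = 10^{16} + 0.75 \;\rightarrow\; 10^{16}, \qquad\text{hence}\qquad (A+B)+C \;\rightarrow\; 10^{16},
\]
\[
B + C = 1.5 \;(\text{exact}), \qquad A + (B+C) = 10^{16} + 1.5 \;\rightarrow\; 10^{16} + 2 .
\]
So A+B = A even though B is not 0, and (A+B)+C differs from A+(B+C). Commutativity (A+B = B+A) does hold for IEEE addition, while the distributive law fails for much the same reason associativity does.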


[8] Draw a block diagram showing how to implement 2's complement subtraction of 3-bit numbers (A minus B) using 1-bit full-adder blocks with inputs A, B, and Cin and outputs Sum and Cout, and any other AND, OR, or INVERT blocks you may need.
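The standard construction follows from the 2's complement identity
\[
A - B = A + \overline{B} + 1,
\]
so one answer is a 3-bit ripple-carry chain of the full-adder blocks, with each bit of B passed through an INVERT block before it enters its adder and the Cin of the low-order adder tied to 1.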

[10] Assume that multiply instructions take 12 cycles and account for 10% of the instructions in a typical program, and that the other 90% of the instructions require an average of 4 cycles each. What percentage of time does the CPU spend doing multiplication?
Answer: 12*0.1 / (12*0.1 + 4*0.9) = 1.2/4.8 = 25%

[10] In a certain set of benchmark programs about every 4th instruction is a load instruction that fetches data from main memory. The time required for a load is 50 ns. The CPI for all other instructions is 4. Assuming the ISAs are the same, how much faster will the benchmarks run with a 1 GHz clock than with a 500 MHz clock?
Answer, per group of 4 instructions:
For 1 GHz: the load takes 50 cycles, so 3*4 + 50 = 62 cycles, i.e. 62 nanoseconds
For 500 MHz: the load takes 25 cycles, so 3*4 + 25 = 37 cycles, i.e. 74 nanoseconds
Speedup is 74 / 62 = 1.19

[12] Explain why floating-point addition and subtraction may result in loss of precision but multiplication and division do not.
Before we can add, we have to shift one operand so that the binary points line up and the exponents are equal. That shift can push many bits of the smaller operand off the end of the significand if the exponents are very different. Multiplication doesn't have this problem: the significands are multiplied and the exponents added, so no alignment is needed and all the bits play equally.
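A toy illustration with a 4-bit significand (the same effect occurs with IEEE's 53 bits, just at a larger exponent difference):
\[
1.000_2\times 2^{0} \;+\; 1.000_2\times 2^{-6} \;\rightarrow\; 1.000_2\times 2^{0}
\]
because the addend is shifted entirely out of the 4-bit significand before the add, while
\[
(1.000_2\times 2^{0}) \times (1.000_2\times 2^{-6}) \;=\; 1.000_2\times 2^{-6}
\]
is exact.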

Results
Average: 72.5
Median: 81

Chapter 6: Pipelining

Doing Laundry

Pipelining
Improve performance by increasing instruction throughput.
Ideal speedup is the number of stages in the pipeline. Do we achieve this?
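The usual back-of-the-envelope argument, assuming perfectly balanced stages, no hazards, and an unpipelined baseline that spends the same k short cycles per instruction: with a k-stage pipeline, cycle time t, and n instructions,
\[
\text{speedup} \;=\; \frac{n\,k\,t}{(k + n - 1)\,t} \;=\; \frac{nk}{k+n-1} \;\longrightarrow\; k \quad\text{as } n\to\infty,
\]
so the ideal speedup equals the number of stages; hazards and unbalanced stages keep real pipelines below that.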

Pipeline control
We have 5 stages. What needs to be controlled in each stage?
–Instruction Fetch and PC Increment
–Instruction Decode / Register Fetch
–Execution
–Memory Stage
–Register Write Back
How would control be handled in an automobile plant?
–a fancy control center telling everyone what to do?
–should we use a finite state machine?

Pipelining
What makes it easy?
–all instructions are the same length
–just a few instruction formats
–memory operands appear only in loads and stores
What makes it hard?
–structural hazards: suppose we had only one memory
–control hazards: need to worry about branch instructions
–data hazards: an instruction depends on a previous instruction
Individual instructions still take the same number of cycles, but we've improved the throughput by increasing the number of simultaneously executing instructions.

Structural Hazards
[Pipeline diagram: four overlapping instructions, each passing through the Inst Fetch, Reg Read, ALU, Data Access, and Reg Write stages.]

Data Hazards
Problem with starting the next instruction before the first is finished:
–dependencies that “go backward in time” are data hazards

Software Solution
Have the compiler guarantee no hazards. Where do we insert the “nops”?
  sub $2, $1, $3
  and $12, $2, $5
  or  $13, $6, $2
  add $14, $2, $2
  sw  $15, 100($2)
Problem: this really slows us down!
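One placement that works for the classic 5-stage MIPS pipeline, assuming (as in the text) that the register file is written in the first half of a cycle and read in the second half: two nops after the sub cover the reads of $2 by the and and the or; the add and sw are already far enough behind.
  sub $2, $1, $3
  nop              # $2 has not been written back yet
  nop              # after this cycle the new $2 can be read
  and $12, $2, $5
  or  $13, $6, $2
  add $14, $2, $2
  sw  $15, 100($2)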

Forwarding
Use temporary results, don't wait for them to be written:
–register file forwarding to handle read/write to the same register
–ALU forwarding

Can't always forward
Load word can still cause a hazard:
–an instruction tries to read a register following a load instruction that writes to the same register.
Thus, we need a hazard detection unit to “stall” the instruction.
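A minimal example of such a load-use pair (registers chosen here just for illustration): the loaded value isn't available until the end of the memory stage, one cycle too late to forward into the and's execute stage, so the and must be stalled for one cycle.
  lw  $2, 0($1)     # $2 is available only at the end of MEM
  and $4, $2, $5    # wants $2 at the start of EX -- needs a one-cycle stall even with forwarding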

Stalling
We can stall the pipeline by keeping an instruction in the same stage.

Branch Hazards
When we decide to branch, other instructions are in the pipeline!
We are predicting “branch not taken”:
–need to add hardware for flushing instructions if we are wrong
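A small illustration (the label and registers are mine): under the not-taken prediction the add is fetched right behind the branch; if the branch turns out to be taken, that add must be flushed (turned into a bubble) before it changes any state. More instructions are flushed if the branch is resolved later in the pipeline.
  beq $1, $2, Skip   # suppose this branch is taken
  add $4, $5, $6     # fetched on the not-taken guess; flushed when the branch resolves
Skip:
  sub $4, $5, $6     # execution continues at the branch target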

Improving Performance
Try to avoid stalls! E.g., reorder these instructions (a reordered version is sketched after this slide):
  lw $t0, 0($t1)
  lw $t2, 4($t1)
  sw $t2, 0($t1)
  sw $t0, 4($t1)
Add a “branch delay slot”:
–the next instruction after a branch is always executed
–rely on compiler to “fill” the slot with something useful
Superscalar: start more than one instruction in the same cycle.
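The reordering suggested above, swapping the two stores (legal here because they address different words): the instruction right after lw $t2 no longer uses $t2, so the load-use stall disappears.
  lw $t0, 0($t1)
  lw $t2, 4($t1)
  sw $t0, 4($t1)    # independent of the lw just above, so it fills the load delay
  sw $t2, 0($t1)    # by now $t2 can be forwarded without stalling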

Dynamic Scheduling
The hardware performs the “scheduling”:
–hardware tries to find instructions to execute
–out-of-order execution is possible
–speculative execution and dynamic branch prediction
All modern processors are very complicated:
–Pentium 4: 20-stage pipeline, 6 simultaneous instructions
–PowerPC and Pentium: branch history table
–Compiler technology important

Chapter 7 Preview: Memory Hierarchy

Memory Hierarchy
Memory devices come in several different flavors:
–SRAM – Static RAM: fast (1 to 10 ns), expensive (>10 times DRAM), small capacity (< ¼ DRAM)
–DRAM – Dynamic RAM: 16 times slower than SRAM (50 ns – 100 ns), access time varies with address, affordable ($160 / gigabyte), 1 GB considered big
–DISK: slow! (10 ms access time), cheap! (< $1 / gigabyte), big! (1 TB is no problem)

Memory Hierarchy
Users want large and fast memories! Try to give it to them:
–build a memory hierarchy

Locality
A principle that makes having a memory hierarchy a good idea.
If an item is referenced:
–temporal locality: it will tend to be referenced again soon
–spatial locality: nearby items will tend to be referenced soon
Why does code have locality?
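One way to see it (a sketch; the loop and register choices are mine): a loop re-fetches the same handful of instructions every iteration, which is temporal locality in the instruction stream, and it walks through consecutive words of its array, which is spatial locality in the data stream.
        li   $v0, 0              # sum = 0
        li   $t1, 5              # element count (5 chosen just for the example)
Loop:   lw   $t0, 0($a0)         # consecutive words of the array: spatial locality
        add  $v0, $v0, $t0
        addi $a0, $a0, 4         # step to the next word
        addi $t1, $t1, -1
        bne  $t1, $zero, Loop    # same few instructions fetched every time: temporal locality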
