Manchester Mark I, 1949. This was the second machine built at Manchester (the first was a small-scale prototype). A production version of this computer was sold by Ferranti.


1 Manchester Mark I, 1949. This was the second machine built at Manchester (the first was a small-scale prototype). A production version of this computer was sold by Ferranti. The logic was implemented with 4200 vacuum tubes, which resulted in a great deal of downtime. A transistorized prototype was built in 1953.

2 COMP 206: Computer Architecture and Implementation Montek Singh Tue, Feb 10, 2009 Topic: Pipelining III (Control Hazards)

3 Overview
1. Data hazards that require stalls
2. Control hazards
   - Branch delay
3. Dealing with exceptions
4. Multiple functional units
   - Floating point unit, for example

4 Data Hazard Example
- Consider:
    LD    R1, 0(R2)     ; Load R1
    DSUB  R4, R1, R5    ; Use R1
    AND   R6, R1, R7
    OR    R8, R1, R9
- Let’s look at the pipeline diagram

5 Data Needed from the Future! A problem for even the most advanced hardware.

6 Stall is Necessary
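
The load-use case above can be sketched in a few lines of Python. This is only an illustration, not the hardware logic from the lecture: the `Instr` tuple and `load_use_stalls` function are hypothetical names, and the sketch assumes a 5-stage in-order pipe with full forwarding, so the only stall left is a load immediately followed by an instruction that reads the loaded register.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record: opcode, destination, source registers."""
    op: str
    dest: Optional[str]
    srcs: Tuple[str, ...]


def load_use_stalls(program):
    """Count one stall for each load whose result is read by the very next instruction.

    Assumes full forwarding, so consumers two or more slots after the load
    receive the value without stalling.
    """
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        if prev.op == "LD" and prev.dest in cur.srcs:
            stalls += 1
    return stalls


# The sequence from the Data Hazard Example slide.
program = [
    Instr("LD", "R1", ("R2",)),
    Instr("DSUB", "R4", ("R1", "R5")),
    Instr("AND", "R6", ("R1", "R7")),
    Instr("OR", "R8", ("R1", "R9")),
]
print(load_use_stalls(program))  # 1: only DSUB has to wait for the load
```

Under the same assumptions, placing an independent instruction between the LD and the DSUB removes the stall, which is the effect compiler scheduling aims for.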

7 Summary: Types of Data Hazards
- RAW: j tries to read before i writes (most common)
- WAW: j tries to write before i writes
  - Would leave the value of i rather than j
  - Not a problem with our simple 5-stage MIPS because there’s only one place to write
- WAR: j tries to write before i reads
  - Not common because reads occur early
- RAR: not a hazard
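
As a rough illustration of the definitions above, the following Python sketch labels the potential hazards between an earlier instruction i and a later instruction j from their register sets. The function name and the `(dest, srcs)` tuple layout are assumptions made for this example. It reports dependences only; whether one actually causes a stall or a wrong value depends on the pipeline.

```python
def classify_hazards(i, j):
    """Return the potential hazards between i (earlier) and j (later).

    Each instruction is a (dest, srcs) pair: dest is a register name or None,
    srcs is a tuple of register names.
    """
    i_dest, i_srcs = i
    j_dest, j_srcs = j
    hazards = set()
    if i_dest is not None and i_dest in j_srcs:
        hazards.add("RAW")  # j reads what i writes
    if i_dest is not None and i_dest == j_dest:
        hazards.add("WAW")  # both write the same register
    if j_dest is not None and j_dest in i_srcs:
        hazards.add("WAR")  # j writes what i reads
    return hazards


# LD R1,0(R2) followed by DSUB R4,R1,R5
print(classify_hazards(("R1", ("R2",)), ("R4", ("R1", "R5"))))  # {'RAW'}
```

Two instructions that only share a source register (RAR) produce an empty set, matching the last bullet.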

8 On to Control Hazards
- Pipeline hazards caused by branches
- With multiple instructions in flight, what happens if you branch?

9 Solution 1: Stall
- This is a fairly simple strategy
- Needs control to disable instructions in the pipeline
- Simple implementation: stall even if not taken
- 10% - 30% penalty
  Branch      IF   ID   EX   MEM  WB
  Branch +1        IF   IF   ID   EX   MEM  WB
  Branch +2                  IF   ID   EX   MEM  WB
  Branch +3                       IF   ID   EX   MEM
  Branch +4                            IF   ID   EX

10 Solution 2: Predict Not Taken
- If wrong (taken), make sure that non-branch instructions change no state
- Predict taken is no help for our pipeline
  - We know the destination and the outcome at the same time
  Branch      IF   ID   EX   MEM  WB
  i+1              IF   idle idle idle idle
  Target                IF   ID   EX   MEM  WB
  Target+1                   IF   ID   EX   MEM  WB
  Target+2                        IF   ID   EX   MEM
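
To make the cost of the two schemes concrete, here is a small Python sketch using the usual relation CPI = 1 + (branch frequency) x (stall cycles per branch). The function name is hypothetical. The one-cycle penalty is read off the diagrams (one repeated IF, or one idle slot), and the 20% branch frequency and 60% taken fraction are illustrative assumptions, not figures from the slides.

```python
def cpi_with_branches(branch_freq, taken_frac, penalty_cycles, scheme):
    """Effective CPI of an otherwise ideal (CPI = 1) pipeline.

    scheme "stall": every branch pays the penalty.
    scheme "predict_not_taken": only taken branches pay it.
    """
    if scheme == "stall":
        stalls_per_branch = penalty_cycles
    elif scheme == "predict_not_taken":
        stalls_per_branch = taken_frac * penalty_cycles
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return 1.0 + branch_freq * stalls_per_branch


for scheme in ("stall", "predict_not_taken"):
    cpi = cpi_with_branches(branch_freq=0.20, taken_frac=0.60,
                            penalty_cycles=1, scheme=scheme)
    print(f"{scheme}: CPI = {cpi:.2f}")
```

With a one-cycle penalty, a branch frequency between 10% and 30% gives a CPI between 1.1 and 1.3 for the always-stall scheme, which is one way to read the 10% - 30% penalty quoted for Solution 1.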

11 Solution 3: Delayed Branch
- As in MIPS
- Sequence is:
  - Branch instruction
  - Sequential successor
  - Branch target or next in line
- Works well with our 5-stage pipe (next)
  - No stalls

12 Branch Delay Pipeline

13 Compiler Impl. – Option (a)
- Easiest: just move the previous instruction to the delay slot

14 Compiler Impl. – Option (b)
- Can’t move the DADD
  - Dependency
  - Note branch condition
- So put target in slot
  - Will that be OK?

15 Dependencies
- Compiler computes dependencies
- If target will not be used when branch not taken, then OK to write it
  - Here’s where the condition style of MIPS helps: no condition codes to worry about
- If register will be used, then put nop in delay slot

16 Branch Delay – Option (c)
- To use this option, must be OK to execute the OR
- In other words, result of R7 from before not needed by code after branch
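
The three compiler options can be summarized as a decision procedure. The Python sketch below is a simplified assumption of how a scheduler might choose, not the algorithm of any particular compiler: `Instr`, `pick_delay_slot`, and the liveness sets are hypothetical, and the register names in the example are illustrative. It tries option (a) first, then (b), then (c), and falls back to a nop.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record: text, destination, source registers."""
    text: str
    dest: Optional[str]
    srcs: Tuple[str, ...]


def pick_delay_slot(before, branch, target, fall_through,
                    live_not_taken, live_taken):
    """Choose what to place in the branch delay slot.

    live_not_taken / live_taken: registers whose old values are still needed
    on the fall-through path / the taken path.
    """
    # (a) Move the preceding instruction, unless the branch condition reads its result.
    if before.dest not in branch.srcs:
        return "a", before
    # (b) Use the target instruction; it also runs when the branch is not taken,
    #     so its destination must not be needed on the fall-through path.
    if target.dest not in live_not_taken:
        return "b", target
    # (c) Use the fall-through instruction; it also runs when the branch is taken,
    #     so its destination must not be needed on the taken path.
    if fall_through.dest not in live_taken:
        return "c", fall_through
    return "nop", None


before = Instr("DADD R1,R2,R3", "R1", ("R2", "R3"))
branch = Instr("BEQZ R1,L", None, ("R1",))
target = Instr("DSUB R4,R5,R6", "R4", ("R5", "R6"))
fall_through = Instr("OR R7,R8,R9", "R7", ("R8", "R9"))

option, chosen = pick_delay_slot(before, branch, target, fall_through,
                                 live_not_taken={"R7"}, live_taken={"R7"})
print(option, chosen.text)  # b DSUB R4,R5,R6
```

Here the DADD cannot move because the branch tests R1, so the sketch falls through to option (b), as on the Option (b) slide. A real scheduler must also respect memory dependences and other side effects, which this sketch ignores.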

17 Summary: Control Hazards
- So far have looked at simple pipe
- Delay slot
- Compiler options

18 Exceptions
- Problem when multiple instructions are in flight is dealing with exceptions
- Need to (perhaps) stop execution of some instructions
- Avoid changing state
- Possibly restart instructions after dealing with exception

19 Examples of Exceptions
- I/O interrupt
- System call
- Tracing (breakpoint, single step)
- Arithmetic problem, integer and float
- Page fault
- Memory protection error
- Illegal instruction
  - Non-existent or protected
- Power failure
- Hardware fault (machine check)

20 Types of Exceptions
- Synchronous or asynchronous
  - Caused by external action?
- Some exceptions happen between instructions, others within
- Resume vs. terminate
  - Will we need to restart instructions, or just stop?
  - More on next slide

21 Restarting Instructions
- Example given in HP is a page fault due to a load or store
  - Occurs at MEM stage
  - We have instructions in pipe after faulting instruction that must be restarted after page fault
- Possible sequence after a fault
  - Insert trap at IF
  - Disallow writes for all instructions in pipe
  - Save PC
- Harder than it seems
- What if we are in the middle of a delayed branch?

22 Precise Exceptions
- A machine that supports precise exceptions will
  - Not allow faulting instruction to write
  - Restart it (perhaps) and subsequent instructions as if exception had not happened
- Sometimes too expensive to guarantee this, so some machines also support imprecise exceptions
  - Enable processing at higher performance

23 Faults on Multiple Instructions
- Example in HP:
    LD …
    DADD …
- Scenarios
  - Data fault on LD and arithmetic fault on DADD
    - Will happen at same time (MEM, EX)
  - Data fault on LD, instruction fault on DADD
    - So fault on 2nd instruction will happen 1st!
- To handle this, will need to store all faults in a vector and deal with them in order, meanwhile making sure that inappropriate state changes aren’t allowed
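
The ordering problem in the second scenario can be sketched in Python. This is an illustration under stated assumptions rather than a hardware description: instruction k is assumed to be fetched in cycle k with no stalls, and the names `STAGE_OFFSET`, `raise_cycle`, `raised_first`, and `handled_first` are hypothetical. Faults are recorded as they occur but serviced in program order.

```python
# Cycle offset of each stage relative to the cycle in which an instruction is fetched.
STAGE_OFFSET = {"IF": 0, "ID": 1, "EX": 2, "MEM": 3, "WB": 4}


def raise_cycle(fault):
    """Cycle in which a fault is detected; fault = (program_index, stage, cause)."""
    index, stage, _cause = fault
    return index + STAGE_OFFSET[stage]


def raised_first(faults):
    """The fault that the hardware sees first in time."""
    return min(faults, key=raise_cycle)


def handled_first(faults):
    """The fault that must be serviced first: the earliest instruction in program order."""
    return min(faults, key=lambda fault: fault[0])


scenario = [
    (0, "MEM", "data page fault on LD"),
    (1, "IF", "instruction page fault on DADD"),
]
print(raised_first(scenario)[2])   # instruction page fault on DADD
print(handled_first(scenario)[2])  # data page fault on LD
```

In the first scenario (data fault in MEM for LD, arithmetic fault in EX for DADD) both faults land in the same cycle under these assumptions, and program order still selects the LD.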

24 Complications
- Won’t go into this, but
- Problems with ISAs more complicated than MIPS
  - State change in middle of pipeline rather than at end
  - Instructions that take variable number of cycles
- Will look at pipe for this, but not exceptions

25 Multi Cycle Operations
- Refers to multiple cycles in a state
- Why?
  - Floating point operations (and also integer divide) can be made more efficient by dividing into multiple cycles
  - Otherwise, the clock rate will suffer
- What are the implications for the pipeline?

26 Multiple Functional Units
- In this simple version, one instruction enters the EX stage at a time
- Simple ones finish in 1 cycle
- Complicated ones take multiple cycles

27 Pipelined vs. Not Pipelined Units
- All types of inst. except DIV can be issued once per clock
- Potential problems with ordering, hazards
- (Figure label) Not pipelined

28 RAW Hazard (2 of them here)
- MUL (double precision) must wait for LD
- ADD stalls to wait for F0
- Store waits because of structural hazard

29 Potential Structural Hazard
- Three instructions end up in MEM stage at same time
- Note that there’s no deep structural hazard
  - Only the load uses the memory

30 Could also have WAW Hazard
- Imagine L.D one cycle earlier
  - Then the 2nd write of F2 would happen 1st
  - So we’d have the wrong value in F2
  - Note: If F2 was used after the ADD.D and before L.D, then this would be caught by RAW hazard circuit
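
A WAW check for a pipe with different EX latencies can be sketched as follows. The timing model is an assumption for illustration: an instruction decoded in cycle `id_cycle` spends `ex_cycles` cycles in EX, then one cycle in MEM, then writes back. The 4-cycle FP add, the 1-cycle load, and the ID cycles are hypothetical values, not taken from the slide's diagram.

```python
def write_back_cycle(id_cycle, ex_cycles):
    """WB cycle for an instruction decoded in id_cycle: EX stages, then MEM, then WB."""
    return id_cycle + ex_cycles + 2


def find_waw(instrs):
    """Report (earlier, later, register) triples where the later instruction
    writes the shared destination no later than the earlier one does.

    instrs: list of (name, dest, id_cycle, ex_cycles) in program order.
    """
    problems = []
    for a in range(len(instrs)):
        for b in range(a + 1, len(instrs)):
            name_i, dest_i, id_i, ex_i = instrs[a]
            name_j, dest_j, id_j, ex_j = instrs[b]
            if dest_i == dest_j and (
                write_back_cycle(id_j, ex_j) <= write_back_cycle(id_i, ex_i)
            ):
                problems.append((name_i, name_j, dest_i))
    return problems


# Assumed timings: ADD.D has a 4-cycle EX, L.D a 1-cycle EX.
early_load = [("ADD.D", "F2", 2, 4), ("L.D", "F2", 3, 1)]
late_load = [("ADD.D", "F2", 2, 4), ("L.D", "F2", 6, 1)]
print(find_waw(early_load))  # [('ADD.D', 'L.D', 'F2')]
print(find_waw(late_load))   # []
```

One simple remedy, in the spirit of the in-order pipe discussed here, is to hold the later instruction in ID until its write can no longer overtake the earlier one.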

31 Summary
- Things start getting complicated when instructions complete in different numbers of cycles
- We’ll be looking more at this

32 Stalls
- Divide structural hazards are shown separately (and are rare)
- Stalls from RAW hazards roughly proportional to latency
  - 0 for integer
  - 3 for FP add/sub
  - 6 for multiply
  - 24 for divide
  - About 50% of the latency (not always)

33 Next Time
- Look at an implementation
- On to Scoreboarding
  - Out of order execution
- Then move on to more complex ILP
- Read App. A, Sec. 4-7