Lecture 4: Advanced Pipelines

Slides:



Advertisements
Similar presentations
1 Lecture: Pipelining Hazards Topics: Basic pipelining implementation, hazards, bypassing HW2 posted, due Wednesday.
Advertisements

1 Lecture: Out-of-order Processors Topics: out-of-order implementations with issue queue, register renaming, and reorder buffer, timing, LSQ.
Instruction Set Issues MIPS easy –Instructions are only committed at MEM  WB transition Other architectures are more difficult –Instructions may update.
Lecture 6: Pipelining MIPS R4000 and More Kai Bu
Instruction-Level Parallelism (ILP)
1 Lecture: Pipelining Extensions Topics: control hazards, multi-cycle instructions, pipelining equations.
1 Lecture: Pipeline Wrap-Up and Static ILP Topics: multi-cycle instructions, precise exceptions, deep pipelines, compiler scheduling, loop unrolling, software.
1 Lecture 17: Basic Pipelining Today’s topics:  5-stage pipeline  Hazards and instruction scheduling Mid-term exam stats:  Highest: 90, Mean: 58.
1 Lecture 3: Pipelining Basics Biggest contributors to performance: clock speed, parallelism Today: basic pipelining implementation (Sections A.1-A.3)
1 Lecture 4: Advanced Pipelines Data hazards, control hazards, multi-cycle in-order pipelines (Appendix A.4-A.10)
1 Lecture 7: Out-of-Order Processors Today: out-of-order pipeline, memory disambiguation, basic branch prediction (Sections 3.4, 3.5, 3.7)
1 Lecture 5: Pipeline Wrap-up, Static ILP Basics Topics: loop unrolling, VLIW (Sections 2.1 – 2.2) Assignment 1 due at the start of class on Thursday.
1 Lecture 18: Pipelining Today’s topics:  Hazards and instruction scheduling  Branch prediction  Out-of-order execution Reminder:  Assignment 7 will.
1 Lecture 4: Advanced Pipelines Data hazards, control hazards, multi-cycle in-order pipelines (Appendix A.4-A.10)
1 Lecture 4: Advanced Pipelines Control hazards, multi-cycle in-order pipelines, static ILP (Appendix A.4-A.10, Sections )
Appendix A Pipelining: Basic and Intermediate Concepts
ENGS 116 Lecture 51 Pipelining and Hazards Vincent H. Berk September 30, 2005 Reading for today: Chapter A.1 – A.3, article: Patterson&Ditzel Reading for.
Pipeline Hazard CT101 – Computing Systems. Content Introduction to pipeline hazard Structural Hazard Data Hazard Control Hazard.
Chapter 2 Summary Classification of architectures Features that are relatively independent of instruction sets “Different” Processors –DSP and media processors.
1 Appendix A Pipeline implementation Pipeline hazards, detection and forwarding Multiple-cycle operations MIPS R4000 CDA5155 Spring, 2007, Peir / University.
Comp Sci pipelining 1 Ch. 13 Pipelining. Comp Sci pipelining 2 Pipelining.
Spring 2003CSE P5481 Precise Interrupts Precise interrupts preserve the model that instructions execute in program-generated order, one at a time If an.
Lecture 16: Basic Pipelining
11 Pipelining Kosarev Nikolay MIPT Oct, Pipelining Implementation technique whereby multiple instructions are overlapped in execution Each pipeline.
LECTURE 10 Pipelining: Advanced ILP. EXCEPTIONS An exception, or interrupt, is an event other than regular transfers of control (branches, jumps, calls,
1 Lecture 3: Pipelining Basics Today: chapter 1 wrap-up, basic pipelining implementation (Sections C.1 - C.4) Reminders:  Sign up for the class mailing.
1 Lecture: Pipelining Extensions Topics: control hazards, multi-cycle instructions, pipelining equations.
1 Lecture: Pipeline Wrap-Up and Static ILP Topics: multi-cycle instructions, precise exceptions, deep pipelines, compiler scheduling, loop unrolling, software.
CDA3101 Recitation Section 8
Lecture 16: Basic Pipelining
Lecture 07: Pipelining Multicycle, MIPS R4000, and More
Lecture 17: Pipelining Today’s topics: 5-stage pipeline Hazards
Lecture: Pipelining Extensions
Appendix C Pipeline implementation
Exceptions & Multi-cycle Operations
Pipelining: Advanced ILP
Lecture 6: Advanced Pipelines
Pipelining Multicycle, MIPS R4000, and More
Pipelining review.
Lecture 16: Basic Pipelining
Pipelining Chapter 6.
Lecture 19: Branches, OOO Today’s topics: Instruction scheduling
Lecture: Pipelining Hazards
Lecture 18: Pipelining Today’s topics:
Lecture: Pipelining Hazards
Lecture 5: Pipelining Basics
Lecture 18: Pipelining Today’s topics:
Pipelining in more detail
CSC 4250 Computer Architectures
Lecture: Pipelining Hazards
Lecture: Out-of-order Processors
\course\cpeg323-05F\Topic6b-323
Lecture 17: Pipelining Today’s topics: 5-stage pipeline Hazards.
Lecture 19: Branches, OOO Today’s topics: Instruction scheduling
Pipeline control unit (highly abstracted)
Lecture: Pipelining Extensions
Lecture 20: OOO, Memory Hierarchy
Lecture: Pipelining Extensions
Lecture 20: OOO, Memory Hierarchy
Lecture 18: Pipelining Today’s topics:
Lecture 17: Pipelining Today’s topics: 5-stage pipeline Hazards.
Project Instruction Scheduler Assembler for DLX
Pipeline control unit (highly abstracted)
CS203 – Advanced Computer Architecture
Pipeline Control unit (highly abstracted)
Pipelining.
Lecture: Pipelining Hazards
Lecture 5: Pipeline Wrap-up, Static ILP
Guest Lecturer: Justin Hsia
Pipelining Hazards.
Presentation transcript:

Lecture 4: Advanced Pipelines Data hazards, control hazards, multi-cycle in-order pipelines (Appendix C.4-C.8)

A 5-Stage Pipeline Source: H&P textbook

Hazards Structural hazards: different instructions in different stages (or the same stage) conflicting for the same resource Data hazards: an instruction cannot continue because it needs a value that has not yet been generated by an earlier instruction Control hazard: fetch cannot continue because it does not know the outcome of an earlier branch – special case of a data hazard – separate category because they are treated in different ways

Data Hazards SUB R2  R1, R3 Uses R2 Uses R2 Uses R2 Uses R2

Bypassing Some data hazard stalls can be eliminated: bypassing

Bypassing

Pipeline Implementation Signals for the muxes have to be generated – some of this can happen during ID Need look-up tables to identify situations that merit bypassing/stalling – the number of inputs to the muxes goes up

Detecting Control Signals Situation Example code Action No dependence LD R1, 45(R2) DADD R5, R6, R7 DSUB R8, R6, R7 OR R9, R6, R7 No hazards Dependence requiring stall LD R1, 45(R2) DADD R5, R1, R7 Detect use of R1 during ID of DADD and stall Dependence overcome by forwarding DSUB R8, R1, R7 Detect use of R1 during ID of DSUB and set mux control signal that accepts result from bypass path Dependence with accesses in order OR R9, R1, R7 No action required

Example add R1, R2, R3 lw R4, 8(R1)

Example lw R1, 8(R2) lw R4, 8(R1)

Example lw R1, 8(R2) sw R1, 8(R3)

Summary For the 5-stage pipeline, bypassing can eliminate delays between the following example pairs of instructions: add/sub R1, R2, R3 add/sub/lw/sw R4, R1, R5 lw R1, 8(R2) sw R1, 4(R3) The following pairs of instructions will have intermediate stalls: lw R1, 8(R2) add/sub/lw R3, R1, R4 or sw R3, 8(R1) fmul F1, F2, F3 fadd F5, F1, F4

Control Hazards Simple techniques to handle control hazard stalls: for every branch, introduce a stall cycle (note: every 6th instruction is a branch!) assume the branch is not taken and start fetching the next instruction – if the branch is taken, need hardware to cancel the effect of the wrong-path instruction fetch the next instruction (branch delay slot) and execute it anyway – if the instruction turns out to be on the correct path, useful work was done – if the instruction turns out to be on the wrong path, hopefully program state is not lost

Branch Delay Slots

Slowdowns from Stalls Perfect pipelining with no hazards  an instruction completes every cycle (total cycles ~ num instructions)  speedup = increase in clock speed = num pipeline stages With hazards and stalls, some cycles (= stall time) go by during which no instruction completes, and then the stalled instruction completes Total cycles = number of instructions + stall cycles Slowdown because of stalls = 1/ (1 + stall cycles per instr)

Pipelining Limits Gap between indep instrs: T + Tovh Gap between dep instrs: T + Tovh Gap between indep instrs: T/3 + Tovh Gap between dep instrs: T + 3Tovh A B C A B C Gap between indep instrs: T/6 + Tovh Gap between dep instrs: T + 6Tovh A B C D E F A B C D E F Assume that there is a dependence where the final result of the first instruction is required before starting the second instruction

Multicycle Instructions Functional unit Latency Initiation interval Integer ALU 1 Data memory 2 FP add 4 FP multiply 7 FP divide 25

Effects of Multicycle Instructions Structural hazards if the unit is not fully pipelined (divider) Frequent RAW hazard stalls Potentially multiple writes to the register file in a cycle WAW hazards because of out-of-order instr completion Imprecise exceptions because of o-o-o instr completion Note: Can also increase the “width” of the processor: handle multiple instructions at the same time: for example, fetch two instructions, read registers for both, execute both, etc.

Precise Exceptions On an exception: must save PC of instruction where program must resume all instructions after that PC that might be in the pipeline must be converted to NOPs (other instructions continue to execute and may raise exceptions of their own) temporary program state not in memory (in other words, registers) has to be stored in memory potential problems if a later instruction has already modified memory or registers A processor that fulfils all the above conditions is said to provide precise exceptions (useful for debugging and of course, correctness)

Dealing with these Effects Multiple writes to the register file: increase the number of ports, stall one of the writers during ID, stall one of the writers during WB (the stall will propagate) WAW hazards: detect the hazard during ID and stall the later instruction Imprecise exceptions: buffer the results if they complete early or save more pipeline state so that you can return to exactly the same state that you left at

Title Bullet