CIS629 Fall 2002 Pipelining 2- 1 Control Hazards Created by branch statements BEQZLOC ADDR1,R2,R3. LOCSUBR1,R2,R3 PC needs to be computed but it happens.

Slides:



Advertisements
Similar presentations
SE-292 High Performance Computing
Advertisements

Morgan Kaufmann Publishers The Processor
1 Pipelining Part 2 CS Data Hazards Data hazards occur when the pipeline changes the order of read/write accesses to operands that differs from.
CMSC 611: Advanced Computer Architecture Pipelining Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
CMSC 611: Advanced Computer Architecture Pipelining Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
Review: Pipelining. Pipelining Laundry Example Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold Washer takes 30 minutes Dryer.
Instruction-Level Parallelism (ILP)
CIS429/529 Winter 2007 Pipelining-1 1 Pipeling RISC/MIPS64 five stage pipeline Basic pipeline performance Pipeline hazards Branch hazards More pipeline.
CIS429.S00: Lec10- 1 Control Hazards Created by branch statements BEQZLOC ADDR1,R2,R3. LOCSUBR1,R2,R3 PC needs to be computed but it happens too late in.
©UCB CS 162 Computer Architecture Lecture 3: Pipelining Contd. Instructor: L.N. Bhuyan
Chapter Six Enhancing Performance with Pipelining
1 Stalling  The easiest solution is to stall the pipeline  We could delay the AND instruction by introducing a one-cycle delay into the pipeline, sometimes.
EENG449b/Savvides Lec 4.1 1/22/04 January 22, 2004 Prof. Andreas Savvides Spring EENG 449bG/CPSC 439bG Computer.
DLX Instruction Format
Appendix A Pipelining: Basic and Intermediate Concepts
ENGS 116 Lecture 51 Pipelining and Hazards Vincent H. Berk September 30, 2005 Reading for today: Chapter A.1 – A.3, article: Patterson&Ditzel Reading for.
Pipelining Basics Assembly line concept An instruction is executed in multiple steps Multiple instructions overlap in execution A step in a pipeline is.
-1.1- PIPELINING 2 nd week. -2- Khoa Coâng Ngheä Thoâng Tin – Ñaïi Hoïc Baùch Khoa Tp.HCM PIPELINING 2 nd week References Pipelining concepts The DLX.
Pipeline Hazard CT101 – Computing Systems. Content Introduction to pipeline hazard Structural Hazard Data Hazard Control Hazard.
CSC 4250 Computer Architectures September 15, 2006 Appendix A. Pipelining.
Lecture 7: Pipelining Review Kai Bu
Pipelining. 10/19/ Outline 5 stage pipelining Structural and Data Hazards Forwarding Branch Schemes Exceptions and Interrupts Conclusion.
Memory/Storage Architecture Lab Computer Architecture Pipelining Basics.
Lecture 5: Pipelining Implementation Kai Bu
Chapter 2 Summary Classification of architectures Features that are relatively independent of instruction sets “Different” Processors –DSP and media processors.
1 Appendix A Pipeline implementation Pipeline hazards, detection and forwarding Multiple-cycle operations MIPS R4000 CDA5155 Spring, 2007, Peir / University.
EEL5708 Lotzi Bölöni EEL 5708 High Performance Computer Architecture Pipelining.
CMPE 421 Parallel Computer Architecture
1 Pipelining Part I CS What is Pipelining? Like an Automobile Assembly Line for Instructions –Each step does a little job of processing the instruction.
CS 1104 Help Session IV Five Issues in Pipelining Colin Tan, S
Winter 2002CSE Topic Branch Hazards in the Pipelined Processor.
Processor Design CT101 – Computing Systems. Content GPR processor – non pipeline implementation Pipeline GPR processor – pipeline implementation Performance.
Branch Hazards and Static Branch Prediction Techniques
Oct. 18, 2000Machine Organization1 Machine Organization (CS 570) Lecture 4: Pipelining * Jeremy R. Johnson Wed. Oct. 18, 2000 *This lecture was derived.
CMSC 611: Advanced Computer Architecture Pipelining Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
HazardsCS510 Computer Architectures Lecture Lecture 7 Pipeline Hazards.
CS252/Patterson Lec 1.1 1/17/01 معماري کامپيوتر - درس نهم pipeline برگرفته از درس : Prof. David A. Patterson.
EE524/CptS561 Jose G. Delgado-Frias 1 Processor Basic steps to process an instruction IFID/OFEXMEMWB Instruction Fetch Instruction Decode / Operand Fetch.
CMSC 611: Advanced Computer Architecture Pipelining Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
CSC 4250 Computer Architectures September 22, 2006 Appendix A. Pipelining.
Pipelining: Implementation CPSC 252 Computer Organization Ellen Walker, Hiram College.
Stalling delays the entire pipeline
Pipelining Chapter 6.
5 Steps of MIPS Datapath Figure A.2, Page A-8
Single Clock Datapath With Control
Appendix C Pipeline implementation
Chapter 4 The Processor Part 4
\course\cpeg323-08F\Topic6b-323
Pipelining: Implementation
Chapter 3: Pipelining 순천향대학교 컴퓨터학부 이 상 정 Adapted from
Morgan Kaufmann Publishers The Processor
CMSC 611: Advanced Computer Architecture
Pipelining review.
Pipelining Chapter 6.
Pipelining in more detail
\course\cpeg323-05F\Topic6b-323
Pipeline control unit (highly abstracted)
The Processor Lecture 3.6: Control Hazards
Control unit extension for data hazards
An Introduction to pipelining
Instruction Execution Cycle
Overview What are pipeline hazards? Types of hazards
Pipeline control unit (highly abstracted)
Pipeline Control unit (highly abstracted)
Control unit extension for data hazards
Pipelining Appendix A and Chapter 3.
Control unit extension for data hazards
Lecture 06: Pipelining Implementation
MIPS Pipelined Datapath
Pipelining Hazards.
Presentation transcript:

CIS629 Fall 2002 Pipelining 2- 1 Control Hazards Created by branch statements BEQZLOC ADDR1,R2,R3. LOCSUBR1,R2,R3 PC needs to be computed but it happens too late (at the end of ID, the 2nd stage)

CIS629 Fall 2002 Pipelining 2- 2 Four Branch Hazard Alternatives #1 STALL: until branch direction is known #2 Predict Branch Not Taken: guess that the branch will not be taken and execute successor instruction. Cancel out instructions in the pipeline if wrong guess. #3 Predict Branch Taken: opposite of above. #4 Delayed Branch: insert useful instructions into the pipeline until the branch direction is known.

CIS629 Fall 2002 Pipelining 2- 3 Solution #1: Stall PC is computed in ID stage Thus only a 1 cycle stall is incurred for branch hazards Solution #1 (STALL) incurs a penalty of 1 cycle.

CIS629 Fall 2002 Pipelining 2- 4 #2 Predict Not Taken (Fig. A.12)

CIS629 Fall 2002 Pipelining 2- 5 #2 Predict Not Taken Penalty: 0 if not taken, 1 if taken Assume 30% conditional branches of which 60% are not taken: CPI = (.70) ( 1) + (.30) ((.60)*(1) + (.40)(1+1))

CIS629 Fall 2002 Pipelining 2- 6 #3 Predict Taken Similar situation and analysis as Predict Not Taken

CIS629 Fall 2002 Pipelining 2- 7 #4 Delayed Branch Delay Slot = the slots in the pipeline that would be stalls (we are going to fill them with instructions and try to avoid stalls) (a) Non-cancelling Delayed Branch Useful instructions are inserted into the delay slots (b) Cancelling Delayed Branch Some instructions are inserted into the delay slots and cancelled if wrong guess (as in Predict Not Taken and Predict Taken)

CIS629 Fall 2002 Pipelining 2- 8 #4a: Delayed Branch - non-cancelling Fill the slot with a useful instruction penalty = 0 Sometimes no useful instruction can be found by the compiler penalty = 1

CIS629 Fall 2002 Pipelining 2- 9 #4b: Delayed Branch - cancelling with predict taken Try to fill with a useful instruction: since we predict taken, it can be chosen from instructions at the taken location If the prediction was right DLX penalty = 0 If the prediction was wrong, the instruction must be cancelled DLX penalty = 1

CIS629 Fall 2002 Pipelining #4b: Delayed Branch - cancelling with predict taken

CIS629 Fall 2002 Pipelining Where to get instructions to fill the branch delay slot(s)? From before the branch instruction From the target address (only OK if branch taken) From the fall through (only OK if branch not taken) Compiler effectiveness for single branch delay slot: –about 60% of slots are filled –about 80% of instructions in delay slots are useful (not cancelled)

CIS629 Fall 2002 Pipelining Where to get instructions for the delay slots (Fig. A.14)

CIS629 Fall 2002 Pipelining Predict (un)taken vs. Cancelling branch Predict taken/untaken –is all done in hardware –when branch is in ID stage, PC is modified to taken (or untaken) address –special hardware to detect whether the right choice was made (comparator) and hardware to cancel (change to NO-OP –no special branch instructions needed

CIS629 Fall 2002 Pipelining Predict (un)taken vs. Cancelling branch (cont) –a combination of hardware and software –compiler analyzes code and selects either cancelling or non-cancelling branch instructions (the programmer just uses the generic one) –compiler rearranges the machine instructions before execution –at execution time, hardware cancels (converts to NO-OP) if the wrong prediction is made

CIS629 Fall 2002 Pipelining Delayed branch options BEST:find a useful instruction that is independent of the branch NEXT BEST: find a useful instruction in the target code (predict taken) that has no negative impact if not taken OK: find a useful instruction in the target code (predict taken) ; if guessed wrong, it will be cancelled at runtime WORST: empty slot

CIS629 Fall 2002 Pipelining Performance Analysis Caution: you must be clear on the assumptions! Assume: equal pipestages, no change in IC or clock, instructions all require all pipestages in the unpiplined version. The penalty due to stalls depends on frequency analysis.

CIS629 Fall 2002 Pipelining Details of the MIPS datapath (Fig. A.17 - does not include pipeline HW)

CIS629 Fall 2002 Pipelining MIPS datapath details (p. A27-A28) Instruction Fetch (IF) IR <-- Mem[PC] NPC <-- PC + 4 Instruction Decode/Register Fetch (ID) A <-- Regs[rs] B <-- Regs[rt] Imm <-- sign extended immediate field of IR Execution/Effective address (EX) Memory access/Branch completion (MEM) Write-back (WB)

CIS629 Fall 2002 Pipelining MIPS pipeline details (p. A27-A28) Execution/Effective address (EX) If memory reference: ALUOutput <-- A + Imm If Reg-Reg ALU: ALUOutput <-- A func B If Reg-Imm ALU: ALUOutput <-- A op Imm If Branch: ALUOutput <-- NPC + (Imm << 2) Memory access/Branch completion (MEM) If memory reference: LMD <-- Mem[ALUOUtput] or Mem[ALUOutput] <-- B If Branch: if (cond) PC <-- ALUOutput Write-back (WB)

CIS629 Fall 2002 Pipelining MIPS pipeline details (p. A27-A28) Write-back (WB) If Reg-Reg ALU: Regs[rd] <-- ALUOutput If Reg-Imm ALU: Regs[rt] <-- ALUOutput If Load: Regs[rt] <-- LMD

CIS629 Fall 2002 Pipelining Details of the MIPS pipeline HW

CIS629 Fall 2002 Pipelining MIPS pipeline events compared to MIPS (non-pipelined) datapath (Fig. A.19) Non-pipelined Instruction Fetch (IF) IR <-- Mem[PC] NPC <-- PC + 4 Pipelined Instruction Fetch (IF) IF/EX.IR <-- Mem[PC] IF/ID.NPC,PC <-- ( if ((EX/MEM.opcode == branch) & EX/MEM) {EX/MEM.ALUOutput} else {PC + 4} (

CIS629 Fall 2002 Pipelining How HW detects data hazards (See Fig. A.21) LW R1, 0(R2) in EX stage ADD R3, R1, R4 in ID stage Check opcode field of ID/EX (bits 0-5): LOAD Check opcode field of IF/ID (bits 0-5): ALU Check operands for RAW: ID/EX.IR[rt] == IF/ID.IR[rs]

CIS629 Fall 2002 Pipelining How HW accomplishes forwarding (See Fig. A.22 which show all 10 needed comparisons) Example of first comparison: PipelineReg. source: EX/MEM Opcode of source: Reg-Reg ALU PipelineReg. destination: ID/EX Opcode of destination: Reg-Reg ALU, ALU imm, load,store, branch Destination of forward: Top ALU input Comparison test for forward: EX/MEM.IR[rd] == ID/EX.IR[rs] etc. etc. etc.

CIS629 Fall 2002 Pipelining Forwarding hardware