Reducing pipeline hazards – three techniques


Department of Computer Science, Southern Illinois University Edwardsville
Fall 2018
Dr. Hiroshi Fujinoki
E-mail: hfujino@siue.edu

Three techniques for different types of pipeline hazards:
1. Forwarding: reduces stalls caused by RAW (read-after-write) data dependencies
2. Instruction scheduling: reduces stalls caused by RAW, WAR (write-after-read) and WAW (write-after-write) dependencies
3. Delayed branch: reduces stalls caused by control hazards
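The three dependency types named above can be told apart by comparing the registers two instructions read and write. The following is a minimal sketch, not part of the slides: the `Instr` record and the `hazards` function are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    """Hypothetical record: the registers an instruction writes and reads."""
    text: str
    writes: frozenset
    reads: frozenset


def hazards(first, second):
    """Return the dependency types between `first` and a later `second`."""
    found = set()
    if first.writes & second.reads:
        found.add("RAW")   # second reads what first writes
    if first.reads & second.writes:
        found.add("WAR")   # second overwrites what first still has to read
    if first.writes & second.writes:
        found.add("WAW")   # both write the same register
    return found


add = Instr("ADD R1, R2, R3", frozenset({"R1"}), frozenset({"R2", "R3"}))
lw = Instr("LW R4, 0(R1)", frozenset({"R4"}), frozenset({"R1"}))
print(hazards(add, lw))   # {'RAW'}
```

Here LW reads R1 right after ADD writes it, which is the RAW case that forwarding targets.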

Technique 1: Forwarding
Forwarding is an internal pipeline circuit that feeds the outputs of a stage back into the pipeline.
[Figure: the five pipeline stages IF, ID, EX, ME, WB, separated by latches, with feedback wires.]
Outputs from a pipeline stage can be fed to the same stage or a different stage of another instruction.
Forwarding needs hardware support.

Example
ADD R1, R2, R3    // R1 <- R2 + R3
LW  R4, 0(R1)     // R4 <- MEM[R1 + 0]
SW  12(R1), R4    // MEM[R1 + 12] <- R4
Pipeline time chart for an ordinary pipeline processor (no forwarding):
[Time chart, cycles 1 to 13: ADD R1, R2, R3 runs IF, ID, EX, ME, WB without stalling. LW R4, 0(R1) and SW 12(R1), R4 each stall after IF before continuing with ID, EX, ME, WB. The sequence finishes in cycle 13.]

[Figure: the pipeline stages IF, ID, EX, ME, WB with latches and a feedback wire, shown for ADD R1, R2, R3 followed by LW R4, 0(R1).]

Pipeline time chart with forwarding:
[Time chart, cycles 1 to 8: ADD R1, R2, R3, LW R4, 0(R1) and SW 12(R1), R4 each run IF, ID, EX, ME, WB in consecutive cycles without stalling, so the sequence finishes in cycle 7.]
Speed-up = 13/7 ≈ 1.86
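The 13-cycle and 7-cycle totals can be reproduced with a small timing model. This is a sketch under stated assumptions rather than the slide's exact chart: without forwarding, an instruction cannot enter ID until the cycle after its producer's WB, and with forwarding no stall is needed at all. The function name `total_cycles` and the `(dest, sources)` tuple encoding are hypothetical.

```python
def total_cycles(program, forwarding):
    """Cycle in which the last instruction finishes WB.

    program: list of (dest_register_or_None, source_registers) tuples.
    Assumed model: stages IF, ID, EX, ME, WB, one cycle each, in order.
    """
    last_wb = {}    # register -> cycle in which its latest writer does WB
    prev_id = 1     # so that the first instruction does ID in cycle 2
    wb = 0
    for dest, sources in program:
        id_cycle = prev_id + 1
        if not forwarding:
            # Assumption: operands are readable only after the producer's WB.
            for reg in sources:
                if reg in last_wb:
                    id_cycle = max(id_cycle, last_wb[reg] + 1)
        wb = id_cycle + 3   # EX, ME and WB follow ID
        if dest is not None:
            last_wb[dest] = wb
        prev_id = id_cycle
    return wb


program = [
    ("R1", ("R2", "R3")),   # ADD R1, R2, R3
    ("R4", ("R1",)),        # LW  R4, 0(R1)
    (None, ("R1", "R4")),   # SW  12(R1), R4
]
slow = total_cycles(program, forwarding=False)
fast = total_cycles(program, forwarding=True)
print(slow, fast, round(slow / fast, 2))   # 13 7 1.86
```

The model ignores structural hazards and treats forwarding as removing every stall, which matches the example above but would not hold for every instruction sequence.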

Technique 2: Instruction scheduling by a compiler
a = b + c    (in a high-level language, such as C++)
LOAD  R1, b        // R1 <- MEM[address of b]
LOAD  R2, c        // R2 <- MEM[address of c]
ADD   R3, R1, R2   // R3 <- R1 + R2
STORE a, R3        // MEM[address of a] <- R3

Pipeline time chart for the four instructions of a = b + c (LOAD R1, b; LOAD R2, c; ADD R3, R1, R2; STORE a, R3):
[Time chart: LOAD R1, b and LOAD R2, c run IF, ID, EX, ME, WB without stalling. ADD R3, R1, R2 is marked STALL (3) and STORE a, R3 is marked STALL (6).]

Issue time of each instruction for a = b + c (X = stall):
Time  Instruction
1     LOAD R1, b
2     LOAD R2, c
3     X
4     X
5     X
6     ADD R3, R1, R2
7     X
8     X
9     X
10    STORE a, R3

Now we are going to execute two statements:
a = b + c
d = e + f

a = b + c followed by d = e + f, without scheduling (X = stall):
Time  Instruction
1     LOAD R1, b
2     LOAD R2, c
3     X
4     X
5     X
6     ADD R3, R1, R2
7     X
8     X
9     X
10    STORE a, R3
11    LOAD R4, e
12    LOAD R5, f
13    X
14    X
15    X
16    ADD R6, R4, R5
17    X
18    X
19    X
20    STORE d, R6

Delay the second statement, then merge the two sequences (X = stall):
Time  a = b + c          d = e + f
1     LOAD R1, b
2     LOAD R2, c
3     X                  LOAD R4, e
4     X                  LOAD R5, f
5     X                  X
6     ADD R3, R1, R2     X
7     X                  X
8     X                  ADD R6, R4, R5
9     X                  X
10    STORE a, R3        X
11                       X
12                       STORE d, R6

a = b + c and d = e + f after scheduling (X = stall):
Time  Instruction
1     LOAD R1, b
2     LOAD R2, c
3     LOAD R4, e
4     LOAD R5, f
5     X
6     ADD R3, R1, R2
7     X
8     ADD R6, R4, R5
9     X
10    STORE a, R3
11    X
12    STORE d, R6
Speed-Up = 21/12 = 1.75
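The issue times in the schedules above can be checked with a short model. This is a sketch under an assumed rule that is consistent with the tables: instructions issue in order, one per time slot, and a dependent instruction issues no earlier than 4 slots after the instruction that produces its operand. The function `issue_slots` and the tuple encoding are hypothetical, and the code only evaluates a given instruction order; it does not perform the reordering itself.

```python
def issue_slots(program, latency=4):
    """Return the issue time slot of each instruction, in program order.

    program: list of (dest, sources) tuples of register or variable names.
    latency: assumed distance in slots between a producer and its consumer.
    """
    ready = {}    # name -> first slot in which a consumer may issue
    slot = 0
    slots = []
    for dest, sources in program:
        slot += 1                       # in-order issue, one per slot
        for name in sources:
            if name in ready:
                slot = max(slot, ready[name])
        slots.append(slot)
        ready[dest] = slot + latency
    return slots


load_b = ("R1", ("b",))
load_c = ("R2", ("c",))
add_a = ("R3", ("R1", "R2"))
store_a = ("a", ("R3",))
load_e = ("R4", ("e",))
load_f = ("R5", ("f",))
add_d = ("R6", ("R4", "R5"))
store_d = ("d", ("R6",))

unscheduled = [load_b, load_c, add_a, store_a, load_e, load_f, add_d, store_d]
scheduled = [load_b, load_c, load_e, load_f, add_a, add_d, store_a, store_d]

print(issue_slots(unscheduled))   # [1, 2, 6, 10, 11, 12, 16, 20]
print(issue_slots(scheduled))     # [1, 2, 3, 4, 6, 8, 10, 12]
```

Moving the two loads of the second statement into the stall slots of the first brings the last issue slot from 20 down to 12 under these assumptions.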

Technique 3: Delayed branch
A delayed branch fills the clock cycles that would otherwise be flushed by a branch instruction.
If the branch is NOT taken:
[Time chart, cycles 1 to 8, showing the branch instruction (i) with stages IF, ID, EX, WB, followed by Instruction(i+1) and Instruction(i+2) with stages IF, ID, EX, ME, WB.]

If the branch is taken, the new destination address is set in the PC:
[Time chart, cycles 1 to 11, showing the branch instruction (i) with stages IF, ID, EX, WB, followed by Instruction(i+1) and Instruction(i+2) with stages IF, ID, EX, ME, WB.]

Before the delayed branch is applied:
[Time chart: Instruction(i-3), Instruction(i-2) and Instruction(i-1) precede the branch instruction (i); Instruction(i+1) and Instruction(i+2) follow it.]
We are going to lose 3 cycles.

After the delayed branch is applied (delayed-branch slot = 3):
[Time chart: the branch instruction (i) comes first; Instruction(i-1), Instruction(i-2) and Instruction(i-3) occupy the three delayed-branch slots after it, followed by Instruction(i+1) and Instruction(i+2).]
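The reordering behind the before and after charts can be sketched as a list transformation. This is only an illustration: the function `apply_delayed_branch` is a hypothetical name, it assumes the moved instructions keep their original relative order, and it assumes the branch does not depend on any of them.

```python
def apply_delayed_branch(instructions, branch_index, slots=3):
    """Move up to `slots` instructions from before the branch to after it.

    Assumption: none of the moved instructions produces a value that the
    branch needs, so no dependency check is done here.
    """
    start = max(0, branch_index - slots)
    moved = instructions[start:branch_index]
    return (instructions[:start]
            + [instructions[branch_index]]
            + moved
            + instructions[branch_index + 1:])


before = ["i-4", "i-3", "i-2", "i-1", "branch i", "i+1", "i+2"]
after = apply_delayed_branch(before, branch_index=4)
print(after)   # ['i-4', 'branch i', 'i-3', 'i-2', 'i-1', 'i+1', 'i+2']
```

The three slots that would have been flushed now hold instructions that must execute whether or not the branch is taken.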

Problem with the delayed branch: data dependency on the branch instruction
Example:
SUB  R1, R2, R3
JPEZ R1            // conditional branch (jump if R1 = 0)
LW   R8, 0(R4)
We can't do this:
[Time chart, cycles 1 to 8: JPEZ R1, 0(R5) is issued first, followed by SUB R1, R2, R3 and LW R8, 0(R4).]
SUB produces the R1 value that JPEZ tests, so SUB cannot be moved into the branch slot.
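The restriction in this example amounts to a register check before an instruction is moved behind the branch. The following is a minimal sketch; `can_fill_delay_slot` is a hypothetical helper, and the second candidate (an ADD writing R6) is an invented instruction used only for contrast.

```python
def can_fill_delay_slot(candidate_writes, branch_reads):
    """True if the candidate writes no register that the branch reads."""
    return not (set(candidate_writes) & set(branch_reads))


branch_reads = {"R1"}   # JPEZ R1 tests R1

# SUB R1, R2, R3 writes R1, which the branch tests: it must stay in front.
print(can_fill_delay_slot({"R1"}, branch_reads))   # False

# Hypothetical ADD R6, R7, R8 writes R6 only: it may move behind the branch.
print(can_fill_delay_slot({"R6"}, branch_reads))   # True
```

A real compiler would also have to check dependencies among the moved instructions themselves; this sketch covers only the dependency on the branch.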

Advantages
Machine cycles that would be wasted in the branch slot can be used whether or not the branch is taken.

Disadvantages
It does not work if there is a data dependency.
Improvement is possible only for branch instructions that have no data dependency.
The probability of improvement is only 50% (assuming a branch is taken or not taken with 50:50 probability).

Summary for delayed branch
It reduces the machine cycles wasted by pipeline flushes.
It applies to pipeline flushes caused by control dependencies.
The results of instructions in the branch slot are not thrown away.
The improvement is rather limited.


Static & Dynamic Code Optimizations
Static optimizations
No overhead for program execution -> complex (time-consuming) code optimization algorithms can be applied without slowing down programs.
No additional cost for processor manufacturing -> cheaper, more reliable processors.
Less complex internal processor design -> less heat generation (higher clock rate).

Dynamic optimizations
Code that was not optimized can be optimized.
Performance is optimized for each processor.
"Backward compatibility"