Pipelining review.

Slides:



Advertisements
Similar presentations
Morgan Kaufmann Publishers The Processor
Advertisements

1 Pipelining Part 2 CS Data Hazards Data hazards occur when the pipeline changes the order of read/write accesses to operands that differs from.
CMSC 611: Advanced Computer Architecture Pipelining Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
Instruction-Level Parallelism (ILP)
MIPS Pipelined Datapath
Kevin Walsh CS 3410, Spring 2010 Computer Science Cornell University Pipeline Hazards See: P&H Chapter 4.7.
Review: MIPS Pipeline Data and Control Paths
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
1 Stalling  The easiest solution is to stall the pipeline  We could delay the AND instruction by introducing a one-cycle delay into the pipeline, sometimes.
CSCE 212 Quiz 9 – 3/30/11 1.What is the clock cycle time based on for single-cycle and for pipelining? 2.What two actions can be done to resolve data hazards?
ENGS 116 Lecture 51 Pipelining and Hazards Vincent H. Berk September 30, 2005 Reading for today: Chapter A.1 – A.3, article: Patterson&Ditzel Reading for.
1 Stalls and flushes  So far, we have discussed data hazards that can occur in pipelined CPUs if some instructions depend upon others that are still executing.
Lecture 15: Pipelining and Hazards CS 2011 Fall 2014, Dr. Rozier.
Pipeline Data Hazards: Detection and Circumvention Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from slides kindly.
CPE432 Chapter 4B.1Dr. W. Abu-Sufah, UJ Chapter 4B: The Processor, Part B-2 Read Section 4.7 Adapted from Slides by Prof. Mary Jane Irwin, Penn State University.
Chapter 4 CSF 2009 The processor: Pipelining. Performance Issues Longest delay determines clock period – Critical path: load instruction – Instruction.
CS.305 Computer Architecture Enhancing Performance with Pipelining Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from.
CMPE 421 Parallel Computer Architecture
1 Designing a Pipelined Processor In this Chapter, we will study 1. Pipelined datapath 2. Pipelined control 3. Data Hazards 4. Forwarding 5. Branch Hazards.
CMPE 421 Parallel Computer Architecture Part 2: Hardware Solution: Forwarding.
CECS 440 Pipelining.1(c) 2014 – R. W. Allison [slides adapted from D. Patterson slides with additional credits to M.J. Irwin]
Winter 2002CSE Topic Branch Hazards in the Pipelined Processor.
5/13/99 Ashish Sabharwal1 Pipelining and Hazards n Hazards occur because –Don’t have enough resources (ALU’s, memory,…) Structural Hazard –Need a value.
2/15/02CSE Data Hazzards Data Hazards in the Pipelined Implementation.
Branch Hazards and Static Branch Prediction Techniques
CSIE30300 Computer Architecture Unit 05: Overcoming Data Hazards Hsin-Chou Chi [Adapted from material by and
CSE431 L06 Basic MIPS Pipelining.1Irwin, PSU, 2005 MIPS Pipeline Datapath Modifications  What do we need to add/modify in our MIPS datapath? l State registers.
Adapted from Computer Organization and Design, Patterson & Hennessy, UCB ECE232: Hardware Organization and Design Part 13: Branch prediction (Chapter 4/6)
Lecture 9. MIPS Processor Design – Pipelined Processor Design #1 Prof. Taeweon Suh Computer Science Education Korea University 2010 R&E Computer System.
Pipelining: Implementation CPSC 252 Computer Organization Ellen Walker, Hiram College.
CSE 340 Computer Architecture Spring 2016 Overcoming Data Hazards.
Interstage Buffers 1 Computer Organization II © McQuain Pipeline Timing Issues Consider executing: add $t2, $t1, $t0 sub $t3, $t1, $t0 or.
Pipeline Timing Issues
Exceptions Another form of control hazard Could be caused by…
Stalling delays the entire pipeline
CDA 3101 Spring 2016 Introduction to Computer Organization
Pipelining Chapter 6.
Morgan Kaufmann Publishers
Test 2 review Lectures 5-10.
Single Clock Datapath With Control
Pipeline Implementation (4.6)
Appendix C Pipeline implementation
Chapter 4 The Processor Part 4
ECS 154B Computer Architecture II Spring 2009
ECE232: Hardware Organization and Design
Chapter 4 The Processor Part 3
Review: MIPS Pipeline Data and Control Paths
Morgan Kaufmann Publishers The Processor
Pipelining Chapter 6.
Current Design.
Pipelining in more detail
CSCI206 - Computer Organization & Programming
Data Hazards Data Hazard
Pipeline control unit (highly abstracted)
The Processor Lecture 3.6: Control Hazards
Control unit extension for data hazards
The Processor Lecture 3.5: Data Hazards
Instruction Execution Cycle
Pipeline control unit (highly abstracted)
Pipeline Control unit (highly abstracted)
Pipelining (II).
Control unit extension for data hazards
Morgan Kaufmann Publishers The Processor
Introduction to Computer Organization and Architecture
Control unit extension for data hazards
Pipelining - 1.
Stalls and flushes Last time, we discussed data hazards that can occur in pipelined CPUs if some instructions depend upon others that are still executing.
MIPS Pipelined Datapath
ELEC / Computer Architecture and Design Spring 2015 Pipeline Control and Performance (Chapter 6) Vishwani D. Agrawal James J. Danaher.
Pipelining Hazards.
Presentation transcript:

Pipelining review

Pipelining review Time IM Reg DM IM Reg DM IM Reg DM IM Reg DM Clock Cycle 1 Clock Cycle 2 Clock Cycle 3 Clock Cycle 4 Clock Cycle 5 Clock Cycle 6 Clock Cycle 7 Time instr1 IM Reg DM instr2 IM Reg DM IM Reg DM instr3 IM Reg DM instr4 Instructions

Pipeline Control – big picture Control Unit ID/EX EX/Mem Mem/WB IF/ID IF ID EX Mem WB

Pipeline in action IF ID EX MEM WB Ctrl Instruction memory Register file clk ALU CLK SE PC Data Memory 0x1234 Let’s execute a program at address 0x1234: lw $1, 4($3) add $2, $3, $4 or $3, $1, $2 bne $1, $3, L Suppose: $1 = 1, $2 = 2, $3 = 1000, $4 = 6

Pipeline in action 1 IF ID EX MEM WB lw $1, 4($3) Instruction memory Register file ALU CLK SE PC Data Memory 0x1234 Clock: 1 $1 = 1, $2 = 2, $3 = 1000, $4 = 4

Pipeline in action 2 IF ID EX MEM WB add $2, $3, $4 Instruction memory Register file ALU CLK SE PC Data Memory 0x1238 lw $1, 4($3) Clock: 1 2 $1 = 1, $2 =2, $3 =1000, $4 = 6

Pipeline in action 3 IF ID EX MEM WB or $3, $1, $2 Instruction memory Register file ALU CLK SE PC Data Memory 0x123C add $2, $3, $4 lw $1, 4($3) Clock: 1 2 3 $1 = 1, $2 =2, $3 =1000, $4 = 6

Pipeline in action 4 (anything wrong?) IF ID EX MEM WB bne $1, $3, L Instruction memory Register file ALU CLK SE PC Data Memory 0x1240 or $3, $1, $2 add $2, $3, $4 lw $1, 4($3) Clock: Let’s assume memory returns 42 for lw. 1 2 3 4 $1 = 1, $2 =2, $3 =1000, $4 = 6

Data Hazards! IF ID EX MEM WB bne $1, $3, L Instruction memory Register file ALU CLK SE PC Data Memory 0x1240 or $3, $1, $2 add $2, $3, $4 lw $1, 4($3) Clock: 1 2 3 4 $1 = 1, $2 = 2, $3 = 1000, $4 = 6

Pipeline in action 5 IF ID EX MEM WB sll $4, $3, 5 Instruction memory Register file ALU CLK SE PC Data Memory 0x1244 bne $1, $3, L or $3, $1, $2 add $2, $3, $4 Clock: $1 and $2 both wrong for or because we got old values! lw $1, 4($3) 1 2 3 4 5 $1 = 42, $2 = 2, $3 = 1000, $4 = 6

Data Hazards Data dependence Data dependence (RAW) occurs when: result of an operation needed before it is stored back in reg. file: lw $1, 4($3) add $2, $3, $4 or $3, $1, $2 The data hazard above is read after write (RAW) Data dependence (RAW) occurs when: An instruction wants to read a register in stage 2, and One instruction in stage 3 or stage 4 is going to write that register Note: if the instruction writing the register is in stage 5, this is fine since we can write a register and read it in the same cycle Hazard detection unit compares register fields across stages.

Resolving data dependencies Have the compiler generate no-ops Don’t have to deal with data hazards in hardware. Stall the pipeline, i.e., create bubbles the resulting delays are the same as for no-ops Send the result generated in stage 3 or stage 4 to the appropriate input of the ALU. This is forwarding or bypassing. More performance at the cost of more hardware For one simple pipeline, cost is slightly more control and extra buses For several pipelines, say n, communication grows as O(n2)

Forwarding from WB to EX IF ID EX MEM WB sll $4, $3, 5 Instruction memory Register file ALU CLK SE PC Data Memory 0x1244 bne $1, $3, L or $3, $1, $2 add $2, $3, $4 Clock: $1 fixed, $2 still wrong lw $1, 4($3) 1 2 3 4 5 $1 = 42, $2 = 2, $3 = 1000, $4 = 6

Forwarding from MEM to EX IF ID EX MEM WB sll $4, $3, 5 Instruction memory Register file ALU CLK SE PC Data Memory 0x1244 bne $1, $3, L or $3, $1, $2 add $2, $3, $4 Clock: lw $1, 4($3) 1 2 3 4 5 $1 = 42, $2 = 2, $3 = 1000, $4 = 6

Generally, need to forward to both inputs! IF ID EX MEM WB sll $4, $3, 5 Instruction memory Register file ALU CLK SE PC Data Memory 0x1244 or $3, $1, $2 beq $1, $3, L add $2, $3, $4 What if we had: or $3, $1, $1 Clock: lw $1, 4($3) Need to forward $1 to both ALU inputs 1 2 3 4 5 $1 = 42, $2 = 2, $3 = 1000, $4 = 6

Forwarding a register twice? What if had: add $1, $2, $2 or $1, $3, $4 and $5, $1, $1 Where do you forward $1 from?

Another look at forwarding… Clock Cycle 1 Clock Cycle 2 Clock Cycle 3 Clock Cycle 4 Clock Cycle 5 Clock Cycle 6 Clock Cycle 7 Time lw $1, 4($3) IM Reg DM Forward $1 add $2, $3, $4 IM Reg DM Forward $2 IM Reg DM or $3, $1, $2 fwd $3 IM Reg DM bne $1, $3, L Instructions

Data hazards and loads Time Clock Cycle 1 Clock Cycle 2 Clock Cycle 3 Clock Cycle 4 Clock Cycle 5 Clock Cycle 6 Clock Cycle 7 Time Can’t do forwarding in this case: can’t go back in time. Have to insert a bubble between lw and or lw $1, 4($3) IM Reg DM ? or $3, $1, $2 IM Reg DM What if we switch these? IM Reg DM add $2, $3, $4 IM Reg DM bne $1, $3, L Instructions

Inserting a bubble Have to do it when you load into a register which is used in the next instruction Clock Cycle 1 Clock Cycle 2 Clock Cycle 3 Clock Cycle 4 Clock Cycle 5 Clock Cycle 6 Clock Cycle 7 Time lw $1, 4($3) IM Reg DM Now we can forward $1 or $3, $1, $2 IM Reg DM do nothing IM Reg DM add $2, $3, $4 Insert bubble here IM Reg DM bne $1, $3, L Instructions …

Other hazards We’ve seen data hazards Other types of hazards: when an instruction in the pipeline is dependent on another instruction still in the pipeline. Resolved by: Bypassing/forwarding Stalling Other types of hazards: Structural hazards where two instructions at different stages want to use the same resource. Solved by using more resources (e.g., instruction and data memory; several ALU’s). Won’t happen in our pipeline. Control hazards happens on a taken branch. Evaluation of branch condition and calculation of branch target is not completed before next instruction is fetched.

Branch path IF ID EX MEM WB instr at L Instruction memory Register file ALU CLK SE PC sll $4, $3, 5 Data Memory L: This is wrong! (Should be instr at L) =? result of comparison or $3, $1, $2 Determine new PC add $2, $3, $4 Clock: bne $1, $3, L 1 2 3 4 5 6 $1 = 42, $2 = 1006, $3 = 1000, $4 = 6

Resolving control hazards Stall until result of the condition & target are known too slow Reduce penalty by redesigning the pipeline: move branch calculation to ID stage move branch comparison to ID stage how to do quick compare? use both edges of the clock Delay slots specify in ISA that instruction after branch is always executed. Branch prediction in IF stage, guess branch outcome correct it if turns out to be wrong.

Lw & sw IF ID EX MEM WB Ctrl Instruction memory Register file ALU CLK SE PC Data Memory Consider a memory copy program… loop: lw $1, 0($3) sw $1, 0($4) addi $3, $3, 4 addi $4, $4, 4 bne $4, $5, Loop Too many stalls! How do we fix this?