Pipelining in more detail


Pipelining in more detail: Control, Hazards

Single Cycle MIPS Implementation
All instructions take the same amount of time.
Signals propagate along the longest path.
Lots of operations happen in parallel:
  Increment PC
  ALU
  Branch target computation

Multicycle MIPS Implementation
Instructions take different numbers of cycles.
Cycles are identical in length.
Resources are shared across cycles (e.g., the ALU).
Cycles are consecutive.
R-type and memory-reference instructions do different things in their 4th cycles.

Pipelined MIPS Implementation
Like single cycle:
  All instructions take the same number of cycles.
  Lots of things go on in parallel, but they belong to different instructions!
  Each pipeline stage has associated state.
Unlike multicycle:
  Stages are fixed.
  An instruction may do nothing in one stage (e.g., the MEM stage for Reg-Reg instructions).

Pipelining review
[Figure: pipeline diagram with time (clock cycles 1 to 7) on the horizontal axis and instructions (instr1 to instr4) on the vertical axis; each instruction passes through IM, Reg, DM, starting one cycle after the previous instruction.]
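The overlap in the review diagram can be sketched in a few lines of Python. This is only an illustration, assuming an ideal five-stage pipeline (IF, ID, EX, MEM, WB) with no stalls; the helper names `stage_at`, `total_cycles` and `diagram` are made up for this sketch.

```python
# Hypothetical helpers: an ideal 5-stage pipeline with no stalls or flushes.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def stage_at(instr_index, cycle):
    """Stage held by instruction instr_index (0-based) in clock cycle (1-based), else None."""
    s = cycle - 1 - instr_index  # instruction i enters IF in cycle i + 1
    return STAGES[s] if 0 <= s < len(STAGES) else None

def total_cycles(n_instructions):
    """Cycles needed to drain n instructions: fill the pipeline once, then one per cycle."""
    if n_instructions == 0:
        return 0
    return len(STAGES) + n_instructions - 1

def diagram(instructions):
    """Render one text row per instruction, one 4-character column per clock cycle."""
    cycles = total_cycles(len(instructions))
    rows = []
    for i, text in enumerate(instructions):
        cells = [(stage_at(i, c) or "").ljust(4) for c in range(1, cycles + 1)]
        rows.append(f"{text:<16}" + "".join(cells).rstrip())
    return "\n".join(rows)

if __name__ == "__main__":
    print(diagram(["lw $1, 4($3)", "add $2, $3, $4", "or $3, $1, $2", "bne $1, $3, L"]))
```

Under these assumptions, each new instruction adds only one cycle once the pipeline is full, which is where the throughput gain over the multicycle design comes from.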

Pipeline Control – big picture
[Figure: the control unit and the pipeline registers IF/ID, ID/EX, EX/Mem, Mem/WB between the stages IF, ID, EX, Mem, WB.]

Pipeline control
Many control signals are the same as in the multicycle datapath.
No write signal for the PC or the pipeline registers: these are written on every cycle.
Need to set control at each stage: extend the pipeline registers to include control.
  IF, ID: no controls to set (all instructions behave identically)
  EX: set ALU sources and op, select between rt/rd for the result register
  Mem: memory read/write, PC select for branches
  WB: select result from memory or ALU, write enable control for the register file

Pipeline in action
Let's execute a program at address 0x1234:
  lw  $1, 4($3)
  add $2, $3, $4
  or  $3, $1, $2
  bne $1, $3, L
Suppose: $1 = 1, $2 = 2, $3 = 1000, $4 = 6
[Figure: five-stage datapath (IF, ID, EX, MEM, WB) with PC, instruction memory, control, register file, sign extend (SE), ALU and data memory; PC = 0x1234.]

Pipeline in action 1
Clock 1, PC = 0x1234
  IF: lw $1, 4($3)
$1 = 1, $2 = 2, $3 = 1000, $4 = 6

Pipeline in action 2
Clock 2, PC = 0x1238
  IF: add $2, $3, $4
  ID: lw $1, 4($3)
$1 = 1, $2 = 2, $3 = 1000, $4 = 6

Pipeline in action 3
Clock 3, PC = 0x123C
  IF: or $3, $1, $2
  ID: add $2, $3, $4
  EX: lw $1, 4($3)
$1 = 1, $2 = 2, $3 = 1000, $4 = 6

Pipeline in action 4 (anything wrong?)
Clock 4, PC = 0x1240
  IF: bne $1, $3, L
  ID: or $3, $1, $2
  EX: add $2, $3, $4
  MEM: lw $1, 4($3)
Let's assume memory returns 42 for lw.
$1 = 1, $2 = 2, $3 = 1000, $4 = 6

Data Hazards!
Clock 4, PC = 0x1240
  IF: bne $1, $3, L
  ID: or $3, $1, $2
  EX: add $2, $3, $4
  MEM: lw $1, 4($3)
$1 = 1, $2 = 2, $3 = 1000, $4 = 6

Pipeline in action 5
Clock 5, PC = 0x1244
  IF: sll $4, $3, 5
  ID: bne $1, $3, L
  EX: or $3, $1, $2
  MEM: add $2, $3, $4
  WB: lw $1, 4($3)
$1 and $2 are both wrong for or because it read the old values!
$1 = 42, $2 = 2, $3 = 1000, $4 = 6

Data Hazards
Data dependence: the result of an operation is needed before it is stored back in the register file:
  lw  $1, 4($3)
  add $2, $3, $4
  or  $3, $1, $2
  nop
The data hazard above is called read after write (RAW).
A RAW hazard occurs when:
  an instruction wants to read a register in stage 2 (ID), and
  an instruction in stage 3 (EX) or stage 4 (MEM) is going to write that register.
Note: if the instruction writing the register is in stage 5 (WB), this is fine, since we can write a register and read it in the same cycle.
The hazard detection unit compares register fields across stages.
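The comparison done by the hazard detection unit can be sketched as follows. This is a simplified software model, not the actual hardware: the `Instr` record and the `raw_hazards` function are hypothetical names, and register $0 is skipped because it always reads as zero in MIPS.

```python
from typing import NamedTuple, Optional, Tuple

class Instr(NamedTuple):
    """Hypothetical instruction record: text, destination register, source registers."""
    text: str
    dest: Optional[int]
    srcs: Tuple[int, ...]

def raw_hazards(in_id, in_ex, in_mem):
    """Return (register, producer_stage) pairs where the ID-stage instruction
    reads a register that an instruction in EX or MEM has not yet written back."""
    hazards = []
    for stage_name, producer in (("EX", in_ex), ("MEM", in_mem)):
        # Skip empty stages, instructions that write nothing, and writes to $0.
        if producer is None or producer.dest is None or producer.dest == 0:
            continue
        for reg in in_id.srcs:
            if reg == producer.dest:
                hazards.append((reg, stage_name))
    return hazards

if __name__ == "__main__":
    # Clock 4 of the walkthrough: or is in ID, add in EX, lw in MEM.
    lw = Instr("lw $1, 4($3)", 1, (3,))
    add = Instr("add $2, $3, $4", 2, (3, 4))
    or_ = Instr("or $3, $1, $2", 3, (1, 2))
    print(raw_hazards(or_, add, lw))
```

For clock 4 of the walkthrough this reports both source registers of the or as hazards, matching the "both wrong" observation on the clock 5 slide.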

Resolving data dependencies
Have the compiler generate no-ops.
  Don't have to deal with data hazards in hardware.
Stall the pipeline, i.e., create bubbles.
  The resulting delays are the same as for no-ops.
Send the result generated in stage 3 or stage 4 to the appropriate input of the ALU. This is called forwarding or bypassing.
  More performance at the cost of more hardware.
  For one simple pipeline, the cost is slightly more control and extra buses.
  For several pipelines, say n, communication grows as O(n^2).
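The compiler-side option can be illustrated with a small sketch. It assumes no forwarding, register reads in ID, and a register file that can be written and read in the same cycle, so a consumer must sit at least three instruction slots after its producer. The function name `insert_nops` and the tuple layout are made up for this example.

```python
def insert_nops(program, min_distance=3):
    """Pad a straight-line program with nops so that every reader is at least
    min_distance slots after the most recent writer of each source register.

    program: list of (text, dest_register_or_None, tuple_of_source_registers).
    """
    out = []
    last_write = {}  # register -> index in `out` of its most recent writer
    for text, dest, srcs in program:
        needed = 0
        for reg in srcs:
            if reg in last_write:
                gap = len(out) - last_write[reg]  # distance if emitted right now
                needed = max(needed, min_distance - gap)
        out.extend(["nop"] * needed)
        out.append(text)
        if dest is not None and dest != 0:  # $0 is never really written
            last_write[dest] = len(out) - 1
    return out

if __name__ == "__main__":
    program = [
        ("lw $1, 4($3)", 1, (3,)),
        ("add $2, $3, $4", 2, (3, 4)),
        ("or $3, $1, $2", 3, (1, 2)),
        ("bne $1, $3, L", None, (1, 3)),
    ]
    for line in insert_nops(program):
        print(line)
```

Under these assumptions the four-instruction example grows to eight slots, which gives a feel for why forwarding is worth the extra hardware.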

Forwarding from WB to EX
Clock 5, PC = 0x1244
  IF: sll $4, $3, 5
  ID: bne $1, $3, L
  EX: or $3, $1, $2
  MEM: add $2, $3, $4
  WB: lw $1, 4($3)
$1 fixed, $2 still wrong.
$1 = 42, $2 = 2, $3 = 1000, $4 = 6

Forwarding from MEM to EX
Clock 5, PC = 0x1244
  IF: sll $4, $3, 5
  ID: bne $1, $3, L
  EX: or $3, $1, $2
  MEM: add $2, $3, $4
  WB: lw $1, 4($3)
$1 = 42, $2 = 2, $3 = 1000, $4 = 6

Generally, need to forward to both inputs!
Clock 5, PC = 0x1244
  IF: sll $4, $3, 5
  ID: bne $1, $3, L
  EX: or $3, $1, $2
  MEM: add $2, $3, $4
  WB: lw $1, 4($3)
What if we had: or $3, $1, $1
Need to forward $1 to both ALU inputs.
$1 = 42, $2 = 2, $3 = 1000, $4 = 6

Forwarding a register twice?
What if we had:
  add $1, $2, $2
  or  $1, $3, $4
  and $5, $1, $1
Where do you forward $1 from?
Answer: from the most recent $1 (i.e., the or).
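The "most recent value wins" rule amounts to a priority check in the forwarding logic. A minimal sketch, assuming the instruction in MEM is younger than the one in WB and that each ALU input is decided independently; `forward_source` and its return labels are hypothetical names for this illustration.

```python
def forward_source(src_reg, mem_dest, wb_dest):
    """Decide where one ALU operand of the EX-stage instruction comes from.

    mem_dest / wb_dest: destination register of the instruction currently in
    MEM / WB, or None if that instruction writes no register.
    Returns "MEM", "WB" or "REG" (the value already read from the register file).
    """
    if src_reg != 0 and src_reg == mem_dest:
        return "MEM"  # youngest producer takes priority
    if src_reg != 0 and src_reg == wb_dest:
        return "WB"
    return "REG"

if __name__ == "__main__":
    # and $5, $1, $1 is in EX; or $1, $3, $4 is in MEM; add $1, $2, $2 is in WB.
    print(forward_source(1, mem_dest=1, wb_dest=1), forward_source(1, mem_dest=1, wb_dest=1))
```

Calling the function once per ALU input covers the "forward to both inputs" case, since each operand gets its own decision.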

Another look at forwarding…
[Figure: pipeline diagram over clock cycles 1 to 7 for lw $1, 4($3); add $2, $3, $4; or $3, $1, $2; bne $1, $3, L. The lw forwards $1, the add forwards $2, and the or forwards $3 to the later instructions that read them.]

Data hazards and loads
[Figure: pipeline diagram over clock cycles 1 to 7 for lw $1, 4($3); or $3, $1, $2; add $2, $3, $4; bne $1, $3, L.]
Can't do forwarding in this case: can't go back in time.
Have to insert a bubble between lw and or.
What if we switch these (the or and the add)?

Inserting a bubble
Have to do it when you load into a register which is used in the next instruction.
[Figure: pipeline diagram over clock cycles 1 to 7 for lw $1, 4($3); or $3, $1, $2 (delayed by one "do nothing" cycle); add $2, $3, $4; bne $1, $3, L. Now we can forward $1.]
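The load-use condition can be checked with a short sketch. It assumes full forwarding, so the only remaining data stall is a load immediately followed by an instruction that reads the loaded register; the `Instr` record and `count_load_use_bubbles` are hypothetical names.

```python
from typing import NamedTuple, Optional, Tuple

class Instr(NamedTuple):
    """Hypothetical instruction record for this sketch."""
    text: str
    dest: Optional[int]
    srcs: Tuple[int, ...]
    is_load: bool = False

def count_load_use_bubbles(program):
    """Count one bubble for every load whose result is read by the very next instruction."""
    bubbles = 0
    for prev, cur in zip(program, program[1:]):
        if prev.is_load and prev.dest is not None and prev.dest in cur.srcs:
            bubbles += 1
    return bubbles

if __name__ == "__main__":
    lw = Instr("lw $1, 4($3)", 1, (3,), is_load=True)
    or_ = Instr("or $3, $1, $2", 3, (1, 2))
    add = Instr("add $2, $3, $4", 2, (3, 4))
    bne = Instr("bne $1, $3, L", None, (1, 3))
    print(count_load_use_bubbles([lw, or_, add, bne]))  # or directly after lw
    print(count_load_use_bubbles([lw, add, or_, bne]))  # add placed in between
```

Placing an independent instruction between the load and its consumer removes the bubble, which is the point of the "what if we switch these?" question on the previous slide.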

Other hazards
We've seen data hazards: when an instruction in the pipeline is dependent on another instruction still in the pipeline.
  Resolved by: bypassing/forwarding, stalling.
Other types of hazards:
  Structural hazards: two instructions at different stages want to use the same resource. Solved by using more resources (e.g., separate instruction and data memories; several ALUs). Won't happen in our pipeline.
  Control hazards: happen on a taken branch. Evaluation of the branch condition and calculation of the branch target are not completed before the next instruction is fetched.

Branch path
[Figure: datapath at clock 6 showing the branch path for bne $1, $3, L, with the labels "=?", "result of comparison" and "Determine new PC". In flight: sll $4, $3, 5; bne $1, $3, L; or $3, $1, $2; add $2, $3, $4. Annotation on the instruction fetched after the branch: "This is wrong! (Should be instr at L)".]
$1 = 42, $2 = 1006, $3 = 1000, $4 = 6

Resolving control hazards
Stall until the result of the condition and the target are known.
  Too slow.
Reduce the penalty by redesigning the pipeline:
  move the branch target calculation to the ID stage
  move the branch comparison to the ID stage
  how to do a quick compare? use both edges of the clock
Delay slots: specify in the ISA that the instruction after a branch is always executed.
Branch prediction: in the IF stage, guess the branch outcome; correct it if it turns out to be wrong.
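The effect of moving the branch decision earlier can be estimated with a rough model. It assumes the pipeline keeps fetching sequentially until the branch resolves, so only taken branches pay a penalty. The function names and the example fractions are illustrative assumptions, not measurements from the slides.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def wrong_path_fetches(resolve_stage):
    """Instructions fetched after a taken branch before its outcome is known.

    A branch in IF at cycle c reaches stage number s at cycle c + s - 1, so the
    fetches in cycles c + 1 .. c + s - 1 (s - 1 of them) are on the wrong path.
    """
    return STAGES.index(resolve_stage)  # 0-based index equals s - 1

def effective_cpi(base_cpi, branch_fraction, taken_fraction, penalty):
    """Average cycles per instruction when each taken branch wastes `penalty` cycles."""
    return base_cpi + branch_fraction * taken_fraction * penalty

if __name__ == "__main__":
    # Hypothetical workload: 20% branches, half of them taken.
    for stage in ("MEM", "ID"):
        p = wrong_path_fetches(stage)
        print(stage, p, round(effective_cpi(1.0, 0.2, 0.5, p), 2))
```

With the decision in ID, a single wasted slot remains, which is the slot that a delay slot defines as always executed.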