CML CML CS 230: Computer Organization and Assembly Language Aviral Shrivastava Department of Computer Science and Engineering School of Computing and Informatics.

Slides:



Advertisements
Similar presentations
Morgan Kaufmann Publishers The Processor
Advertisements

1 Pipelining Part 2 CS Data Hazards Data hazards occur when the pipeline changes the order of read/write accesses to operands that differs from.
CMSC 611: Advanced Computer Architecture Pipelining Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
CML CML CS 230: Computer Organization and Assembly Language Aviral Shrivastava Department of Computer Science and Engineering School of Computing and Informatics.
CML CML CS 230: Computer Organization and Assembly Language Aviral Shrivastava Department of Computer Science and Engineering School of Computing and Informatics.
Lecture Objectives: 1)Define branch prediction. 2)Draw a state machine for a 2 bit branch prediction scheme 3)Explain the impact on the compiler of branch.
Lecture Objectives: 1)Define pipelining 2)Calculate the speedup achieved by pipelining for a given number of instructions. 3)Define how pipelining improves.
CML CML CS 230: Computer Organization and Assembly Language Aviral Shrivastava Department of Computer Science and Engineering School of Computing and Informatics.
The Light at the End of the Tunnel – Doesn’t Mean You Died ! Part 3 - Branch Hazards – Easier! – 3/24/04 This is a control hazard – as opposed to data.
Kevin Walsh CS 3410, Spring 2010 Computer Science Cornell University Pipeline Hazards See: P&H Chapter 4.7.
Pipeline Hazards Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University See P&H Appendix 4.7.
Pipeline Hazards CS365 Lecture 10. D. Barbara Pipeline Hazards CS465 2 Review  Pipelined CPU  Overlapped execution of multiple instructions  Each on.
Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania ECE Computer Organization Pipelined Processor.
Part 2 - Data Hazards and Forwarding 3/24/04++
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
Mary Jane Irwin ( ) [Adapted from Computer Organization and Design,
Part 4 - Exception Hazards – one last kicker 3/24/04 Similar problem as the conditional branch An exception is an involuntary branch from a non-branching.
1  2004 Morgan Kaufmann Publishers Chapter Six. 2  2004 Morgan Kaufmann Publishers Pipelining The laundry analogy.
1 Stalling  The easiest solution is to stall the pipeline  We could delay the AND instruction by introducing a one-cycle delay into the pipeline, sometimes.
Goal: Reduce the Penalty of Control Hazards
CSCE 212 Quiz 9 – 3/30/11 1.What is the clock cycle time based on for single-cycle and for pipelining? 2.What two actions can be done to resolve data hazards?
Computer Organization Lecture Set – 06 Chapter 6 Huei-Yung Lin.
Pipelined Datapath and Control (Lecture #15) ECE 445 – Computer Organization The slides included herein were taken from the materials accompanying Computer.
Control Hazards.1 Review: Datapath with Data Hazard Control Read Address Instruction Memory Add PC 4 Write Data Read Addr 1 Read Addr 2 Write Addr Register.
ENGS 116 Lecture 51 Pipelining and Hazards Vincent H. Berk September 30, 2005 Reading for today: Chapter A.1 – A.3, article: Patterson&Ditzel Reading for.
1 Stalls and flushes  So far, we have discussed data hazards that can occur in pipelined CPUs if some instructions depend upon others that are still executing.
Abstraction Question General purpose processors have an abstraction layer fixed at the ISA and have little control over the compilers or code run on the.
Pipelining. 10/19/ Outline 5 stage pipelining Structural and Data Hazards Forwarding Branch Schemes Exceptions and Interrupts Conclusion.
CPE432 Chapter 4B.1Dr. W. Abu-Sufah, UJ Chapter 4B: The Processor, Part B-2 Read Section 4.7 Adapted from Slides by Prof. Mary Jane Irwin, Penn State University.
Comp Sci pipelining 1 Ch. 13 Pipelining. Comp Sci pipelining 2 Pipelining.
11/13/2015 8:57 AM 1 of 86 Pipelining Chapter 6. 11/13/2015 8:57 AM 2 of 86 Overview of Pipelining Pipelining is an implementation technique in which.
CS.305 Computer Architecture Enhancing Performance with Pipelining Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from.
CMPE 421 Parallel Computer Architecture
CECS 440 Pipelining.1(c) 2014 – R. W. Allison [slides adapted from D. Patterson slides with additional credits to M.J. Irwin]
Winter 2002CSE Topic Branch Hazards in the Pipelined Processor.
1 (Based on text: David A. Patterson & John L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 3 rd Ed., Morgan Kaufmann,
CML CML CS 230: Computer Organization and Assembly Language Aviral Shrivastava Department of Computer Science and Engineering School of Computing and Informatics.
Branch Hazards and Static Branch Prediction Techniques
Oct. 18, 2000Machine Organization1 Machine Organization (CS 570) Lecture 4: Pipelining * Jeremy R. Johnson Wed. Oct. 18, 2000 *This lecture was derived.
1/24/ :00 PM 1 of 86 Pipelining Chapter 6. 1/24/ :00 PM 2 of 86 Overview of Pipelining Pipelining is an implementation technique in which.
CML CML CS 230: Computer Organization and Assembly Language Aviral Shrivastava Department of Computer Science and Engineering School of Computing and Informatics.
Adapted from Computer Organization and Design, Patterson & Hennessy, UCB ECE232: Hardware Organization and Design Part 13: Branch prediction (Chapter 4/6)
CMPE 421 Parallel Computer Architecture Part 3: Hardware Solution: Control Hazard and Prediction.
CSIE30300 Computer Architecture Unit 06: Containing Control Hazards
PROCESSOR PIPELINING YASSER MOHAMMAD. SINGLE DATAPATH DESIGN.
Pipelining: Implementation CPSC 252 Computer Organization Ellen Walker, Hiram College.
Computer Organization
CS 230: Computer Organization and Assembly Language
Computer Organization CS224
Stalling delays the entire pipeline
5 Steps of MIPS Datapath Figure A.2, Page A-8
Single Clock Datapath With Control
Morgan Kaufmann Publishers The Processor
Chapter 4 The Processor Part 4
ECS 154B Computer Architecture II Spring 2009
Morgan Kaufmann Publishers The Processor
Pipelining review.
The processor: Pipelining and Branching
Current Design.
Pipelining in more detail
Data Hazards Data Hazard
The Processor Lecture 3.6: Control Hazards
Control unit extension for data hazards
Instruction Execution Cycle
Pipelining (II).
Control unit extension for data hazards
Control unit extension for data hazards
Pipelining - 1.
ELEC / Computer Architecture and Design Spring 2015 Pipeline Control and Performance (Chapter 6) Vishwani D. Agrawal James J. Danaher.
Presentation transcript:

CML CML CS 230: Computer Organization and Assembly Language Aviral Shrivastava Department of Computer Science and Engineering School of Computing and Informatics Arizona State University Slides courtesy: Prof. Yann Hang Lee, ASU, Prof. Mary Jane Irwin, PSU, Ande Carle, UCB

CML CMLAnnouncements Alternate Project –Submit Nov 24 Quiz 5 –Thursday, Nov 19, 2009 –Pipelining Finals –Tuesday, Dec 08, 2009 –Please come on time (You’ll need all the time) –Open book, notes, and internet –No communication with any other human

CML CML Benefits of Pipelining Pipeline latches: pass the status and result of the current instruction to next stage Comparison: Clock Cycle 1Cycle 2Cycle 3Cycle 4Cycle 5Cycle 6Cycle 7Cycle 8Cycle 9Cycle 10 Ifetch lw sw Dec/Reg Exec Mem Wr Dec/Reg Exec Mem Ifetch Single- cycle inst. IfetchDec/Reg Exec Mem Wr IfetchDec/Reg Exec Mem Wr IfetchDec/Reg Exec Mem Wr pipelined

CML CML Branch Hazards So far, we’ve limited discussion of hazards to: –Arithmetic/logic operations –Data transfers Also need to consider hazards involving branches: –Example: 40:beq$1, $3, 28 44:and$12, $2, $5 48:or$13, $6, $2 52:add$14, $2, $2 72:lw$4, 50($7) How long will it take before the branch decision takes effect? –What happens in the meantime?

CML CML Branch signal determined in MEM stage Registers

CML CML Pipeline impact on branch If branch condition true, must skip 44, 48, 52 –But, these have already started down the pipeline –They will complete unless we do something about it How do we deal with this? –We’ll consider 2 possibilities

CML CML Dealing w/branch hazards: always stall Branch taken –Wait 3 cycles –No proper instructions in the pipeline –Same delay as without stalls (no time lost)

CML CML Dealing w/branch hazards: always stall Branch not taken –Still must wait 3 cycles –Time lost –Could have spent cycles fetching and decoding next instructions

CML CML Assume branch not taken On average, branches are taken ½ the time –If branch not taken… Continue normal processing –Else, if branch is taken… Need to flush improper instruction from pipeline Cuts overall time for branch processing in ½

CML CML Flushing unwanted instructions from pipeline Useful to compare w/stalling pipeline: –Simple stall: inject bubble into pipe at ID stage only Change control to 0 in the ID stage Let “bubbles” percolate to the right –Flushing pipe: must change inst. In IF, ID, and EX IF Stage: –Zero instruction field of IF/ID pipeline register –Use new control signal IF.Flush ID Stage: –Use existing “bubble injection” mux that zeros control for stalls –Signal ID.Flush is ORed w/stall signal from hazard detection unit EX Stage: –Add new muxes to zero EX pipeline register control lines –Both muxes controlled by single EX.Flush signal Control determines when to flush: –Depends on Opcode and value of branch condition

CML CML Flushing Pipeline PC IF/ID EX/MEM ID/EX MEM/WB WB M EX WB M M u x 0 M u x 0 M u x 0 Hazard Detection Unit Control IF.Flush ID.Flush EX.Flush Branch Decision Flush Pipeline

CML CML Assume “branch not taken”…and branch is not taken… Execution proceeds normally – no penalty

CML CML Assume “branch not taken”…and branch is taken… Bubbles injected into 3 stages during cycle 5

CML CML Reservation Table Picture Another way of looking at it… 40: beq $1, $3, 72 44: and $12, $2, $5 48: or $13, $6, $2 52: add $14, $2, $2 … 72: lw $4, 50($7) Assume Branch Not Taken and Correct Assume Branch Not Taken and NOT Correct IFBeqAndOrAdd56 IDBeqAndOrAdd56 EXBeqAndOrAdd56 Me m BeqAndOrAdd56 WBBeqAndOrAdd IFBeqAndOrAddSw IDBeqAndOrAddSw EXBeqAndOrAddSw Me m Beq WBBeq No penalty 3 cycle penalty (FYI, branch Freq ~ 20%; & 3 cycle penalty 50% of time)

CML CML Branch Penalty Impact Assume 16% of all instructions are branches –4% unconditional branches: 3 cycle penalty –12% conditional: 50% taken For a sequence of N instructions (assume N is large) N cycles to initiate each 3 * 0.04 * N delays due to unconditional branches 0.5 * 3 * 0.12 * N delays due to conditional taken Also, an extra 4 cycles for pipeline to empty Total: –1.3*N + 4 total cycles (or 1.3 cycles/instruction) (CPI) 30% Performance Hit!!! (Bad thing)

CML CML Branch Penalty Impact Some solutions: –In ISA: branches always execute next 1 or 2 instructions Instruction so executed said to be in delay slot See SPARC ISA (example – loop counter update) –In organization: move comparator to ID stage and decide in the ID stage Reduces branch delay by 2 cycles Increases the cycle time

CML CML Branch Prediction Prior solutions are “ugly” Better (& more common): guess in IF stage –Technique is called “branch predicting”; needs 2 parts: “Predictor” to guess where/if instruction will branch (and to where) “Recovery Mechanism”: i.e. a way to fix your mistake –Prior strategy: Predictor: always guess branch never taken Recovery: flush instructions if branch taken –Alternative: accumulate info. in IF stage as to… Whether or not for any particular PC value a branch was taken next To where it is taken How to update with information from later stages

CML A Branch Predictor

CML CML Branch History Table

CML CML Branch Prediction Information One bit predictor: –Use result from last time we saw this instruction Problem: –Even if branch is almost always taken, we will be wrong at least twice 1 st time we the instruction 1 st time the branch is not taken Also, 1 st time branch is taken again after than And if branch alternates b/t taken, not taken… –We get 0% accuracy Can we do better? Yep.

CML CML Branch Prediction Information How to do better? –Keep a “counter” in each entry of the number of times taken in the last N times executed –Keep information about the “pattern” of previous branches Book’s scheme: a “2-bit saturating counter” –Increment when branch is taken –Decrement when branch is not taken –Don’t increment or decrement above or below a max/min count Use sign of count as predictor

CML CML Book’s 2 Bit Branch Counter

CML CML Computing Performance Program assumptions: –23% loads and in ½ of cases, next instruction uses load value –13% stores –19% conditional branches –2% unconditional branches –43% other Machine Assumptions: –5 stage pipe with all forwarding Only penalty is 1 cycle on use of load value immediately after a load) Jumps are totally resolved in ID stage for a 1 cycle branch penalty 75% branch prediction accuracy 1 cycle delay on misprediction

CML CML The Answer: CPI penalty calculation: –Loads: 50% of the 23% of loads have 1 cycle penalty:.5*.23=0.115 –Jumps: All of the 2% of jumps have 1 cycle penalty: 0.02*1 = 0.02 –Conditional Branches: 25% of the 19% are mispredicted for a 1 cycle penalty: 0.25*0.19*1 = Total Penalty: = Average CPI: =

CML CML Yoda says… Death is a natural part of life. Rejoice for those around you who transform into the Force. Mourn them do not. Miss them do not