Pipelining V Topics Branch prediction State machine design Systems I.

Slides:



Advertisements
Similar presentations
Branch prediction Titov Alexander MDSP November, 2009.
Advertisements

Dynamic History-Length Fitting: A third level of adaptivity for branch prediction Toni Juan Sanji Sanjeevan Juan J. Navarro Department of Computer Architecture.
Lecture 8 Dynamic Branch Prediction, Superscalar and VLIW Advanced Computer Architecture COE 501.
Pipelining and Control Hazards Oct
Lecture Objectives: 1)Define branch prediction. 2)Draw a state machine for a 2 bit branch prediction scheme 3)Explain the impact on the compiler of branch.
Pipeline Computer Organization II 1 Hazards Situations that prevent starting the next instruction in the next cycle Structural hazards – A required resource.
CPE 631: Branch Prediction Electrical and Computer Engineering University of Alabama in Huntsville Aleksandar Milenkovic,
Dynamic Branch Prediction
Chapter 4 The Processor CprE 381 Computer Organization and Assembly Level Programming, Fall 2013 Zhao Zhang Iowa State University Revised from original.
CPE 731 Advanced Computer Architecture ILP: Part II – Branch Prediction Dr. Gheith Abandah Adapted from the slides of Prof. David Patterson, University.
1 Lecture: Branch Prediction Topics: branch prediction, bimodal/global/local/tournament predictors, branch target buffer (Section 3.3, notes on class webpage)
W04S1 COMP s1 Seminar 4: Branch Prediction Slides due to David A. Patterson, 2001.
1 Lecture 7: Static ILP, Branch prediction Topics: static ILP wrap-up, bimodal, global, local branch prediction (Sections )
EECS 470 Branch Prediction Lecture 6 Coverage: Chapter 3.
1 COMP 206: Computer Architecture and Implementation Montek Singh Wed., Oct. 8, 2003 Topic: Instruction-Level Parallelism (Dynamic Branch Prediction)
1 COMP 206: Computer Architecture and Implementation Montek Singh Mon., Oct. 7, 2002 Topic: Instruction-Level Parallelism (Dynamic Branch Prediction)
1 Lecture 7: Out-of-Order Processors Today: out-of-order pipeline, memory disambiguation, basic branch prediction (Sections 3.4, 3.5, 3.7)
1 Lecture 8: Branch Prediction, Dynamic ILP Topics: branch prediction, out-of-order processors (Sections )
EECC551 - Shaaban #1 lec # 5 Fall Reduction of Control Hazards (Branch) Stalls with Dynamic Branch Prediction So far we have dealt with.
1 Lecture 8: Branch Prediction, Dynamic ILP Topics: branch prediction, out-of-order processors (Sections )
EECC551 - Shaaban #1 lec # 7 Fall Hardware Dynamic Branch Prediction Simplest method: –A branch prediction buffer or Branch History Table.
Goal: Reduce the Penalty of Control Hazards
Computer Architecture Instruction Level Parallelism Dr. Esam Al-Qaralleh.
1 Lecture 8: Branch Prediction, Dynamic ILP Topics: static speculation and branch prediction (Sections )
1 COMP 740: Computer Architecture and Implementation Montek Singh Thu, Feb 19, 2009 Topic: Instruction-Level Parallelism III (Dynamic Branch Prediction)
Dynamic Branch Prediction
EENG449b/Savvides Lec /25/05 March 24, 2005 Prof. Andreas Savvides Spring g449b EENG 449bG/CPSC 439bG.
CIS 429/529 Winter 2007 Branch Prediction.1 Branch Prediction, Multiple Issue.
Pipelined Datapath and Control (Lecture #15) ECE 445 – Computer Organization The slides included herein were taken from the materials accompanying Computer.
1 Lecture 7: Branch prediction Topics: bimodal, global, local branch prediction (Sections )
Arvind and Joel Emer Computer Science and Artificial Intelligence Laboratory M.I.T. Branch Prediction.
CSCI 6461: Computer Architecture Branch Prediction Instructor: M. Lancaster Corresponding to Hennessey and Patterson Fifth Edition Section 3.3 and Part.
CSCI-365 Computer Organization Lecture Note: Some slides and/or pictures in the following are adapted from: Computer Organization and Design, Patterson.
Branch.1 10/14 Branch Prediction Static, Dynamic Branch prediction techniques.
CPE 631 Session 17 Branch Prediction Electrical and Computer Engineering University of Alabama in Huntsville.
CS 6290 Branch Prediction. Control Dependencies Branches are very frequent –Approx. 20% of all instructions Can not wait until we know where it goes –Long.
Adapted from Computer Organization and Design, Patterson & Hennessy, UCB ECE232: Hardware Organization and Design Part 13: Branch prediction (Chapter 4/6)
Dynamic Branch Prediction
CSL718 : Pipelined Processors
Computer Organization CS224
CS203 – Advanced Computer Architecture
Computer Structure Advanced Branch Prediction
Dynamic Branch Prediction
CS5100 Advanced Computer Architecture Advanced Branch Prediction
COSC3330 Computer Architecture Lecture 15. Branch Prediction
Samira Khan University of Virginia Nov 13, 2017
Chapter 4 The Processor Part 4
CMSC 611: Advanced Computer Architecture
The processor: Pipelining and Branching
So far we have dealt with control hazards in instruction pipelines by:
CPE 631: Branch Prediction
Branch statistics Branches occur every 4-6 instructions (16-25%) in integer programs; somewhat less frequently in scientific ones Unconditional branches.
Lecture: Branch Prediction
Dynamic Branch Prediction
So far we have dealt with control hazards in instruction pipelines by:
Lecture 10: Branch Prediction and Instruction Delivery
So far we have dealt with control hazards in instruction pipelines by:
So far we have dealt with control hazards in instruction pipelines by:
pipelining: static branch prediction Prof. Eric Rotenberg
Pipelining: dynamic branch prediction Prof. Eric Rotenberg
Adapted from the slides of Prof
Pipelining (II).
Dynamic Hardware Prediction
So far we have dealt with control hazards in instruction pipelines by:
So far we have dealt with control hazards in instruction pipelines by:
Wackiness Algorithm A: Algorithm B:
So far we have dealt with control hazards in instruction pipelines by:
So far we have dealt with control hazards in instruction pipelines by:
So far we have dealt with control hazards in instruction pipelines by:
Lecture 7: Branch Prediction, Dynamic ILP
Presentation transcript:

Pipelining V Topics Branch prediction State machine design Systems I

2 Branch Prediction Until now - we have assumed a “predict taken” strategy for conditional branches Compute new branch target and begin fetching from there If prediction is incorrect, flush pipeline and begin refetching However, there are other strategies Predict not-taken Combination (quasi-static) Predict taken if branch backward (like a loop) Predict not taken if branch forward

3 Branching Structures Predict not taken works well for “top of the loop” branching structures Loop: cmpl %eax, %edx je Out 1 nd loop instr. last loop instr jmp Loop Out: fall out instr But such loops have jumps at the bottom of the loop to return to the top of the loop – and incur the jump stall overhead Predict not taken doesn’t work well for “bottom of the loop” branching structures Loop: 1 st loop instr 2 nd loop instr. last loop instr cmpl %eax, %edx jne Loop fall out instr

4 Branch Prediction Algorithms Static Branch Prediction Prediction (taken/not-taken) either assumed or encoded into program Dynamic Branch Prediction Uses forms of machine learning (in hardware) to predict branches Track branch behavior Past history of individual branches Learn branch biases Learn patterns and correlations between different branches Can be very accurate (95% plus) as compared to less than 90% for static

5 IM PC BHT IR Prediction update Simple Dynamic Predictor Predict branch based on past history of branch Branch history table Indexed by PC (or fraction of it) Each entry stores last direction that indexed branch went (1 bit to encode taken/not-taken) Table is a cache of recent branches Buffer size of 4096 entries are common (track 4K different branches)

6 Multi-bit predictors A ‘predict same as last’ strategy gets two mispredicts on each loop Predict NTTT…TTT Actual TTTT…TTN Can do much better by adding inertia to the predictor e.g., two-bit saturating counter Predict TTTT…TTT Use two bits to encode: Strongly taken (T2) Weakly taken (T1) Weakly not-taken (N1) Strongly not-taken (N2) for(j=0;j<30;j++) { …} N2N1T1T2 TTTT NNNN State diagram to representing states and transitions

7 How do we build this in Hardware? This is a sequential logic circuit that can be formulated as a state machine 4 states (N2, N1, T1, T2) Transitions between the states based on action “b” General form of state machine: N2N1T1T2 TTTT NNNN State Variables (Flip-flops) Comb. Logic inputsoutputs

8 State Machine for Branch Predictor 4 states - can encode in two state bits 4 states - can encode in two state bits N2 = 00, N1 = 01, T1 = 10, T2 = 11 Thus we only need 2 storage bits (flip-flops in last slide) Input: b = 1 if last branch was taken, 0 if not taken Output: p = 1 if predict taken, 0 if predict not taken Now - we just need combinational logic equations for: p, S1 new, S0 new, based on b, S1, S0

9 Combinational logic for state machine p =1 if state is T2 or T1 thus p = S1 (according to encodings) The state variables S1, S0 are governed by the truth table that implements the state diagram S1 new = S1*S0 + S1*b + S0*b S0 new = S1*S0’ + S0’*S1’*b + S0*S1*b S1S0b S1 new S0 new p

10 IM PC BHT IR Prediction update Enhanced Dynamic Predictor Replace simple table of 1 bit histories with table of 2 bit state bits State transition logic can be shared across all entries in table Read entry out Apply combinational logic Write updated state bits back into table

11 YMSBP Yet more sophisticated branch predictors Predictors that recognize patterns eg. if last three instances of a given branches were NTN, then predict taken Predictors that correlate between multiple branches eg. if the last three instances of any branch were NTN, then predict taken Predictors that correlate weight different past branches differently e.g. if the branches 1, 4, and 8 ago were NTN, then predict taken Hybrid predictors that are composed of multiple different predictors e.g. two different predictors run in parallel and a third predictor predicts which one to use More sophisticated learning algorithms

12 Branch target buffers Predictor tells us taken/not-taken Actual target address still must be calculated Branch target buffer contains the predicted target address Allows speculative fetch to occur earlier in pipeline Requires more storage (PC, not just prediction state)

13 Summary Today Branch mispredictions cost a lot in performance CPU Designers willing to go to great lengths to improve prediction accuracy Predictors are just state machines that can be designed using combinational logic and flip-flops