The Inner Most Loop Iteration counter: a new dimension in branch history. André Seznec, Joshua San Miguel, Jorge Albericio.

Presentation transcript:

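2 For 25 years, branch predictors have exploited two kinds of history.
Local history predictors, for a branch whose outcome follows its own past behavior:
    while (..) { if ((X % 3) || (X % 5)) {..} X++; }
Global history predictors, for a branch correlated with other recent branches:
    if (X < -2) {..}
    if (X > 1) {..}
    if (X == 0) {..}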

3 In practice, on real hardware: just global history predictors, plus a loop predictor (sometimes). Local history is not very efficient: on CBP4, about 5% misprediction reduction, and it is a mess to implement.

4 The messy management of speculative local history. [Figure: the Local History Table is updated at commit time and feeds the prediction tables; the speculative history for the most recent occurrence of branch B (B h1 to B h4) has to be found in the window of inflight branches.] Several (many) instances of the same branch inflight: wrong history → wrong prediction.

5 State-of-the-art global history predictors. Neural predictors: piecewise linear, hashed perceptron, SNAP, GEHL. TAGE-GSC: TAGE + a neural predictor, that is, TAGE-GSC = TAGE-SC-L without the local history and loop predictor components.

6 Neural predictors. [Figure: tables indexed with PC + global history; the counters read are summed.] Prediction = sign of the sum.
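The "prediction = sign of a sum" idea on this slide can be sketched as follows. This is a minimal illustration in the style of a hashed perceptron, not the predictor evaluated in the talk: the table size, the history lengths, the hash in `index`, the counter range and the update threshold are all assumptions chosen for the example.

```python
TABLE_BITS = 10                 # assumed: 1024 counters per table
HIST_LENGTHS = [0, 4, 8, 16]    # assumed global history length for each table


def index(pc: int, ghist: int, length: int) -> int:
    """Hash the PC with the `length` most recent history bits (illustrative hash)."""
    h = ghist & ((1 << length) - 1)
    return (pc ^ h ^ (h >> TABLE_BITS)) & ((1 << TABLE_BITS) - 1)


class NeuralPredictor:
    def __init__(self) -> None:
        self.tables = [[0] * (1 << TABLE_BITS) for _ in HIST_LENGTHS]

    def total(self, pc: int, ghist: int) -> int:
        """Sum one signed counter from each table."""
        return sum(t[index(pc, ghist, n)] for t, n in zip(self.tables, HIST_LENGTHS))

    def predict(self, pc: int, ghist: int) -> bool:
        """Predict taken when the sum is non-negative."""
        return self.total(pc, ghist) >= 0

    def update(self, pc: int, ghist: int, taken: bool, threshold: int = 8) -> None:
        """Train on a misprediction or when the sum is below the threshold."""
        s = self.total(pc, ghist)
        if (s >= 0) != taken or abs(s) < threshold:
            step = 1 if taken else -1
            for t, n in zip(self.tables, HIST_LENGTHS):
                i = index(pc, ghist, n)
                t[i] = max(-32, min(31, t[i] + step))   # 6-bit saturating counter
```

An extra component such as the IMLI tables introduced later only needs to contribute one more counter to `total`.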

7 TAGE-GSC. [Figure: PC + global history feed the (main) TAGE predictor; its prediction and confidence feed the statistical corrector.] The statistical corrector is just a neural predictor, with the TAGE prediction as an input.

8 How predictors work. Evers98: branch B is correlated with a few past branches. There are not so many paths from the correlators to B, so try to capture every path to B. A kind of brute-force approach.

9 How to identify correlator branches. The loop predictor does it smoothly for loops. Albericio et al. 2014: correlation in multidimensional loops.

10 Wormhole branch prediction (Albericio et al., MICRO 2014). Correlation in multidimensional loops:
    for (i = 0; i < Nmax; i++)
      for (j = 0; j < Mmax; j++) {
        if (A[j+i] > 0) {..}                 /* j+i = const: same outcome */
        if ((B[i][j] - B[i-1][j]) > 0) {..}  /* j = const: weak correlation */
        if (C[j] > 0) {..}                   /* j = const: strong correlation */
      }
The correlation is with neighboring iterations, but in the previous outer iteration.
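The correlations claimed for the A and C branches can be checked on a small model of the loop. This is only a sketch: the random array contents, the seed and the helper name `run_loop` are assumptions for illustration, and the B branch is left out because its correlation is only described as weak.

```python
import random


def run_loop(nmax: int, mmax: int, seed: int = 0):
    """Record the outcomes Out[i][j] of the A and C branches of the nested loop."""
    rng = random.Random(seed)
    a = [rng.choice([-1, 1]) for _ in range(nmax + mmax)]
    c = [rng.choice([-1, 1]) for _ in range(mmax)]
    out_a = [[a[j + i] > 0 for j in range(mmax)] for i in range(nmax)]
    out_c = [[c[j] > 0 for j in range(mmax)] for i in range(nmax)]
    return out_a, out_c


def correlations_hold(nmax: int, mmax: int) -> bool:
    out_a, out_c = run_loop(nmax, mmax)
    # A[j+i]: same outcome whenever j+i is constant, so Out[i][j] == Out[i-1][j+1].
    a_ok = all(out_a[i][j] == out_a[i - 1][j + 1]
               for i in range(1, nmax) for j in range(mmax - 1))
    # C[j]: same outcome whenever j is constant, so Out[i][j] == Out[i-1][j].
    c_ok = all(out_c[i][j] == out_c[i - 1][j]
               for i in range(1, nmax) for j in range(mmax))
    return a_ok and c_ok
```

In both cases the useful bit sits in the previous outer iteration, roughly one full inner loop back in the branch's local history, which is why a very long local history is needed to reach it.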

11 Wormhole predictor: a side predictor. Monitor hard-to-predict branches in a loop with a constant iteration count N (use the loop predictor). Monitor the local history for this branch: a very long local history. Predict with a few bits in the local history (from the previous outer iteration). [Figure: local history of the branch across outer iterations i-1 and i, N bits apart, with positions J-1, J and J+1 marked.]

12 Wormhole predictor + state-of-the-art global history predictor. Captures correlation with a small number of entries, but only on a few branches, on a few benchmarks: CBP4 traces: 2 benchmarks / 40; CBP3 traces: 2 benchmarks / 40. But quite efficient on those traces.

13 Wormhole predictor: not worth the implementation. Requires a loop predictor. Requires the branch to be executed on each iteration. Unresolved issue of speculative local history management. But let us keep the seminal observation.

14 Let us analyze the problem. The correlation to be captured is: for branches in the inner most loop, with neighboring iterations, but in previous outer iteration(s). It would be nice to determine the iteration number!
    for (i = 0; i < Nmax; i++)
      for (j = 0; j < Mmax; j++) {
        if (A[j+i] > 0) {..}
        if ((B[i][j] - B[i-1][j]) > 0) {..}
        if (C[j] > 0) {..}
      }

15 The Inner Most Loop Iteration counter. Most loops end with a conditional backward branch. [Figure: branch sequence B0 ... B1 ... B3 ... B4 ... B5 ... B6.]
    if (backward) { if (taken) IMLIcount++; else IMLIcount = 0; }
Perfectly counts the iteration number for the inner most loop.
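The counter update rule on this slide can be sketched as below. The class name, the 10-bit saturating width, the "target below PC means backward" test and all branch addresses in `trace_nested_loop` are assumptions made for the illustration.

```python
class ImliCounter:
    def __init__(self, bits: int = 10) -> None:
        self.max_count = (1 << bits) - 1   # assumed counter width
        self.count = 0

    def update(self, pc: int, target: int, taken: bool) -> None:
        """Apply the slide's rule; only backward conditional branches matter."""
        if target >= pc:
            return                         # forward branch: counter unchanged
        if taken:
            self.count = min(self.count + 1, self.max_count)
        else:
            self.count = 0                 # loop exit


def trace_nested_loop(outer: int, inner: int):
    """Replay a two-level loop nest and record the count seen by a body branch."""
    imli = ImliCounter()
    seen = []
    for i in range(outer):
        row = []
        for j in range(inner):
            row.append(imli.count)                              # value at prediction time
            imli.update(0x110, 0x118, True)                     # forward body branch
            imli.update(0x120, 0x100, taken=(j < inner - 1))    # inner back-edge
        imli.update(0x130, 0x0F0, taken=(i < outer - 1))        # outer back-edge
        seen.append(row)
    return seen, imli.count
```

In this simplified trace, `trace_nested_loop(3, 4)` records `[0, 1, 2, 3]` for the first outer iteration and `[1, 2, 3, 4]` for the following ones, because the taken outer back-edge also increments the counter. The property that matters for the later slides still holds: from the second outer iteration on, the same inner iteration always sees the same counter value.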

16 Same Iteration Correlation: the IMLI-SIC component. A predictor table indexed with IMLIcount and PC, just added to the neural part of the predictor.
    for (i = 0; i < Nmax; i++)
      for (j = 0; j < Mmax; j++) {
        if (A[j+i] > 0) {..}
        if ((B[i][j] - B[i-1][j]) > 0) {..}
        if (C[j] > 0) {..}
      }
Captures correlation with Out[..][j].
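A table indexed with IMLIcount and PC can be sketched as follows. This is an illustrative model only: the 512-entry size, the hash in `_index`, the counter range and the helper `train_on_pattern` are assumptions, and the counter is read on its own here, whereas the slide adds it to the neural sum.

```python
SIC_BITS = 9   # assumed: 512-entry table


class ImliSic:
    def __init__(self) -> None:
        self.table = [0] * (1 << SIC_BITS)

    def _index(self, pc: int, imli_count: int) -> int:
        """Illustrative hash of PC and IMLIcount."""
        return ((pc >> 2) ^ (imli_count << 4)) & ((1 << SIC_BITS) - 1)

    def read(self, pc: int, imli_count: int) -> int:
        """Signed counter that would be added to the neural sum."""
        return self.table[self._index(pc, imli_count)]

    def update(self, pc: int, imli_count: int, taken: bool) -> None:
        i = self._index(pc, imli_count)
        step = 1 if taken else -1
        self.table[i] = max(-32, min(31, self.table[i] + step))


def train_on_pattern(pattern, outer_iterations: int):
    """Train on a branch like `if (C[j] > 0)` whose outcome depends only on j."""
    sic = ImliSic()
    pc = 0x110                             # hypothetical branch address
    for _ in range(outer_iterations):
        for j, taken in enumerate(pattern):
            sic.update(pc, j, taken)       # IMLIcount is assumed equal to j here
    return [sic.read(pc, j) >= 0 for j in range(len(pattern))]
```

After two outer iterations each counter already carries the sign of its iteration's outcome, so the values returned by `train_on_pattern` reproduce the pattern.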

17 IMLI-SIC component. A simple add-on to TAGE-GSC or GEHL: brings higher accuracy than WH, and also captures most of the (small) benefit of the loop predictor, so get rid of the loop predictor! The speculative IMLI counter is easy to manage! Works on different benchmarks than WH!

18 What remains from WH?
    for (i = 0; i < Mmax; i++)
      for (j = 0; j < Nmax; j++) {
        if (B[i-j] > 0) {..}                 /* branch 1 */
        if (A[j] > 0) { A[j] = -A[j]; .. }   /* branch 2 */
      }
Branch 1: correlation with Out[i-1][j-1]. Branch 2: correlation Out[i][j] = 1 - Out[i-1][j]. Not the exact correlations, but their forms.

19 IMLI-OH component. [Figure labels: IMLI history table indexed with (PC<<6) + IMLI; PIPE; PC; prediction counter; the IMLI OH output is added to the neural sum alongside IMLI SIC.] Provides Out[i-1][j] and Out[i-1][j-1].
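One way to read this slide is that the history table holds the last outcome for each (PC, IMLIcount) pair, while PIPE keeps the bit that is about to be overwritten so that the next iteration can still see it. The sketch below follows that reading. It is an assumption-laden model, not the authors' design: the 16 x 64 sizing, the PC hash in `_slot`, the use of j as the IMLI count and the index offset in `check_branch1` are all chosen for illustration.

```python
import random


class ImliOuterHistory:
    SLOTS = 16   # assumed number of branch slots
    ITERS = 64   # iterations per slot, matching the (PC << 6) shift

    def __init__(self) -> None:
        self.hist = [0] * (self.SLOTS * self.ITERS)
        self.pipe = [0] * self.SLOTS

    def _slot(self, pc: int) -> int:
        return (pc >> 2) % self.SLOTS          # illustrative PC hash

    def _index(self, pc: int, imli_count: int) -> int:
        return (self._slot(pc) << 6) + (imli_count % self.ITERS)

    def read(self, pc: int, imli_count: int):
        """Return the stored bits standing for (Out[i-1][j], Out[i-1][j-1])."""
        return self.hist[self._index(pc, imli_count)], self.pipe[self._slot(pc)]

    def update(self, pc: int, imli_count: int, taken: bool) -> None:
        idx = self._index(pc, imli_count)
        self.pipe[self._slot(pc)] = self.hist[idx]   # keep Out[i-1][j] for iteration j+1
        self.hist[idx] = int(taken)                  # becomes Out[i][j]


def check_branch1(outer: int, inner: int, seed: int = 0) -> bool:
    """Replay `if (B[i-j] > 0)` and check the bits the component provides."""
    rng = random.Random(seed)
    b = [rng.choice([-1, 1]) for _ in range(outer + inner)]
    oh, pc, ok = ImliOuterHistory(), 0x110, True
    out = [[0] * inner for _ in range(outer)]
    for i in range(outer):
        for j in range(inner):
            same_j, prev_j = oh.read(pc, j)
            taken = int(b[i - j + inner] > 0)    # offset keeps the index non-negative
            if i >= 1:
                ok = ok and same_j == out[i - 1][j]
            if i >= 1 and j >= 1:
                ok = ok and prev_j == out[i - 1][j - 1] and taken == prev_j
            out[i][j] = taken
            oh.update(pc, j, bool(taken))
    return ok
```

Under these assumptions the two bits returned by `read`, combined with the PC, could index the prediction counter table shown on the slide. For branch 1 the PIPE bit alone already equals the outcome from the second inner iteration of the second outer iteration onward.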

20 Yes, but IMLI-OH uses local history? Several (many) instances of the same branch inflight: wrong history → wrong prediction. Here the risk is instances of the branch with an equal IMLI counter inflight: wrong history → wrong IMLI OH entries read. But the targeted branches feature large iteration numbers, so the OH history used is effective: same (PC, IMLIcount) = already committed. The other branches don't suffer: the beauty of neural predictors.

21 Accuracy improvement on TAGE-GSC. 80 benchmarks (CBP3 + CBP4): 6-7% misprediction reduction on average.

22 Shrinking the potential benefit of local history. Adding local history + loop predictor: over TAGE-GSC, 5-6% misprediction reduction; over TAGE-GSC-IMLI, 3-4% misprediction reduction. Loop predictor alone? Less than 0.5% misprediction reduction.

23 Summary. Fundamental observation by Albericio et al.: correlation in multidimensional loops. IMLI-based components for TAGE-based and neural predictors: simple implementation, simple management of speculative states, directly suitable for hardware implementation.