Looking for limits in branch prediction with the GTL predictor
André Seznec, Caps Team, IRISA/INRIA/HIPEAC

Presentation transcript:

Slide 1: Looking for limits in branch prediction with the GTL predictor
André Seznec, Caps Team, IRISA/INRIA/HIPEAC

Slide 2: Motivations
- Geometric history length predictors were introduced in O-GEHL (CBP-1, Dec.) and TAGE (JILP 06, Feb.).
- They are storage effective.
- They exploit very long global histories.
- They were defined with a possible implementation in mind.
- What are the limits of accuracy that can be captured with these schemes?
- How do they compare with unconstrained prediction schemes?

Slide 3: Geometric history length predictors: global history + multiple lengths
[Figure: predictor tables T0 to T4, indexed with global history lengths L(0) to L(4)]

Slide 4: GEometric History Length predictor
- The set of history lengths forms a geometric series: {0, 2, 4, 8, 16, 32, 64, 128}.
- What is important: L(i) - L(i-1) increases drastically with i, so most of the storage goes to short histories.
- Correlation on very long histories is still captured.
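As a rough illustration of the geometric series on this slide, the sketch below generates history lengths with L(0) = 0 and L(i) = round(alpha^(i-1) * L(1)). The function name and the parameters `n_tables`, `l1` and `alpha` are assumptions made for this example. With `l1 = 2` and `alpha = 2` it reproduces the set shown on the slide.

```python
def geometric_history_lengths(n_tables, l1, alpha):
    """Return [L(0), ..., L(n_tables - 1)] with L(0) = 0 and
    L(i) = round(alpha ** (i - 1) * l1) for i >= 1."""
    lengths = [0]
    for i in range(1, n_tables):
        lengths.append(int(alpha ** (i - 1) * l1 + 0.5))
    return lengths


# Hypothetical parameters chosen to match the slide's example set.
lengths = geometric_history_lengths(n_tables=8, l1=2, alpha=2.0)

# Gap between consecutive lengths, i.e. L(i) - L(i-1).
gaps = [b - a for a, b in zip(lengths, lengths[1:])]
```

The `gaps` list grows with each step, which is the property the slide singles out: the few tables with long histories span most of the history range, while most tables serve short histories.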

Slide 5: Combining multiple predictions
- Neural-inspired predictors use a (multiply-)add tree: O-GEHL, CBP-1.
- Partial matching uses tagged tables and the longest matching history: TAGE, JILP 06.

Slide 6: CBP-1 (2004): O-GEHL
[Figure: predictor tables T0 to T4, indexed with global history lengths L(0) to L(4)]
- Final computation through a sum; prediction = sign of the sum.
- 256 Kbits: 12 components misp/KI
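The "sum, then sign" computation can be sketched as follows. This is a toy model under stated assumptions, not the O-GEHL implementation: the XOR-folding hash in `fold_history`, the table sizes, and the rule "taken when the sum is non-negative" are all choices made for this example.

```python
def fold_history(history, length, bits):
    """XOR-fold the `length` most recent outcomes (0/1, newest last) into `bits` bits.
    Hypothetical hash, used only to derive a table index."""
    recent = history[-length:] if length > 0 else []
    folded = 0
    for pos, outcome in enumerate(recent):
        folded ^= outcome << (pos % bits)
    return folded


def gehl_predict(tables, lengths, pc, history, index_bits):
    """Return (taken, total): read one signed counter per table and
    predict from the sign of their sum."""
    mask = (1 << index_bits) - 1
    total = 0
    for table, length in zip(tables, lengths):
        index = (pc ^ fold_history(history, length, index_bits)) & mask
        total += table[index]
    return total >= 0, total


# Toy setup: three 16-entry tables of signed counters, history lengths 0, 2 and 4.
lengths = [0, 2, 4]
tables = [[0] * 16 for _ in lengths]
tables[0][5] = 3
tables[1][6] = -1
tables[2][8] = -4
taken, total = gehl_predict(tables, lengths, pc=5, history=[1, 0, 1, 1], index_bits=4)
```

In this toy run the three selected counters are 3, -1 and -4, so the sum is negative and the branch is predicted not taken.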

Slide 7: JILP 06: TAGE
[Figure: tagged tables with tag comparison ("=?")]
- The prediction comes from the longest matching history.
- 256 Kbits: misp/KI
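A minimal sketch of "longest matching history" selection, assuming each tagged component is modeled as a dictionary from index to `(tag, counter)` and that the per-component `(index, tag)` pairs have already been computed from the PC and history. These data structures and the name `tage_predict` are hypothetical and only serve to show the selection rule.

```python
def tage_predict(base_prediction, components, lookups):
    """Return (prediction, provider).

    components: list of dicts {index: (tag, counter)}, ordered from the
                shortest to the longest history length.
    lookups:    list of (index, tag) pairs, one per component.
    The matching component with the longest history provides the prediction;
    without any match, the base prediction is used (provider is None).
    """
    prediction, provider = base_prediction, None
    for i, (table, (index, tag)) in enumerate(zip(components, lookups)):
        entry = table.get(index)
        if entry is not None and entry[0] == tag:
            # A hit on a longer history overrides any shorter-history hit.
            prediction, provider = entry[1] >= 0, i
    return prediction, provider


components = [{3: (0x1A, 2)}, {7: (0x2B, -1)}, {1: (0x3C, 1)}]
hit = tage_predict(True, components, [(3, 0x1A), (7, 0x2B), (1, 0x99)])
miss = tage_predict(True, components, [(0, 0x00), (0, 0x00), (0, 0x00)])
```

In `hit`, components 0 and 1 both match, and component 1 wins because it uses the longer history. In `miss`, no tag matches and the base prediction is returned.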

Slide 8: What is global history?
- Conditional branch history: path confusion on short histories.
- Path history: direct hashing leads to path confusion.
1. Represent all branches in the branch history.
2. Use path AND direction history.

Slide 9: Using a kernel history and a user history
- Traces mix user and kernel activities: kernel activity after an exception pollutes the global history.
- Solution: use two separate global histories.
- The user history is updated only in user mode.
- The kernel history is updated in both modes.
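The update rule for the two histories can be sketched as below. The class name `DualHistory` and the list-based history are assumptions for illustration. The `select` method, which picks the kernel history in kernel mode and the user history otherwise, is also an assumption, since the slide only states the update rule.

```python
class DualHistory:
    """Two global histories, modeled as lists of 0/1 outcomes (newest last)."""

    def __init__(self):
        self.user = []
        self.kernel = []

    def update(self, taken, in_kernel):
        # The kernel history is updated in both modes.
        self.kernel.append(int(taken))
        # The user history is updated only in user mode, so kernel
        # activity after an exception does not pollute it.
        if not in_kernel:
            self.user.append(int(taken))

    def select(self, in_kernel):
        # Assumption: predict with the history that matches the current mode.
        return self.kernel if in_kernel else self.user


h = DualHistory()
for taken, in_kernel in [(True, False), (False, True), (True, True), (False, False)]:
    h.update(taken, in_kernel)
```

After this sequence the user history holds only the two user-mode outcomes, while the kernel history holds all four.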

Slide 10: Accuracy limits for TAGE
- Varying the predictor size, the number of components, the tag width and the history length.
- Allowing multiple allocations.
- The best accuracy on the distributed traces: misp/KI
- History length around 1, components
- No need for tags wider than 16 bits.

Slide 11: Accuracy limits for GEHL
- Varying the predictor size, the number of components, the history length and the counter width.
- (Slightly) improving the update policy.
- Fitting in the two-hour simulation rule.
- On the distributed traces: misp/KI
- 97 components, 8-bit counters, 2,000 bits of global history.

Slide 12: GEHL vs TAGE
- Realistic implementation parameters (storage budget, number of components): TAGE is more accurate than (O-)GEHL.
- Unlimited budget, huge number of components: GEHL is more accurate than TAGE.

Slide 13: Will it be sufficient to win the Championship?
- GEHL history length: 2, components misp/KI

Slide 14: A step further: hybrid GEHL-TAGE
- On a few benchmarks, TAGE is more accurate than GEHL.
- Let us try a hybrid GEHL-TAGE predictor.

Slide 15: Hybrid GEHL-TAGE
[Figure labels: branch/path history + PC; GEHL; TAGE; Meta = egskew; mux]
- Inherits from: agree/bimode, YAGS, 2bcgskew

Slide 16: GEHL+TAGE
- GEHL provides the main prediction; it is also used as the base predictor for TAGE (YAGS inspired).
- TAGE records when GEHL fails: {prediction, address, history} (agree/bimode, YAGS inspired).
- Meta selects between GEHL and TAGE (2bcgskew inspired).

Slide 17: Let us have fun!
- GEHL history length: 400
- TAGE history length: 100, misp/KI

Slide 18: Might still be insufficient
- GEHL history length: 400
- TAGE history length: 100, misp/KI

Slide 19: Adding a loop predictor
- The loop predictor captures the number of iterations of a loop.
- When it encounters the same number of iterations 8 times in a row, the loop predictor provides the prediction.
- Advantage: very reliable.
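The confidence rule can be sketched for a single loop branch as follows. This is a simplified model under assumptions not stated on the slide: the branch is taken on every iteration that continues the loop and not taken on exit, and the names `LoopEntry` and `run_loop` are hypothetical.

```python
class LoopEntry:
    """Toy loop predictor entry for one branch."""

    CONFIDENCE_THRESHOLD = 8  # same iteration count seen 8 times in a row

    def __init__(self):
        self.trip_count = None  # taken outcomes per execution of the loop
        self.current = 0        # taken outcomes seen in the current execution
        self.confidence = 0     # consecutive executions with the same count

    def predict(self):
        """Return (valid, taken); the prediction is used only when valid."""
        if self.confidence < self.CONFIDENCE_THRESHOLD:
            return False, None
        return True, self.current < self.trip_count

    def update(self, taken):
        if taken:
            self.current += 1
            return
        # Loop exit: compare this execution's count with the recorded one.
        if self.current == self.trip_count:
            self.confidence = min(self.confidence + 1, self.CONFIDENCE_THRESHOLD)
        else:
            self.trip_count = self.current
            self.confidence = 1
        self.current = 0


def run_loop(entry, iterations):
    """Feed one full loop execution: `iterations` taken, then one not taken."""
    for _ in range(iterations):
        entry.update(True)
    entry.update(False)


entry = LoopEntry()
for _ in range(7):
    run_loop(entry, 3)
valid_after_7 = entry.predict()[0]
run_loop(entry, 3)
valid_after_8 = entry.predict()[0]
```

In this model the entry stays silent for the first seven identical executions and starts predicting after the eighth. A single execution with a different iteration count resets the confidence.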

Slide 20: GTL predictor
[Figure labels: branch/path history + PC; GEHL; TAGE; Meta = egskew; mux; loop predictor; confidence; mux]
- Plus static prediction on first occurrence.
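One plausible reading of the selection chain on this slide is sketched below. The priority order (static prediction on the first occurrence, then a confident loop prediction, then the meta choice between GEHL and TAGE) is an assumption, as are the function and parameter names.

```python
def gtl_predict(gehl_pred, tage_pred, meta_prefers_tage,
                loop_valid, loop_pred,
                first_occurrence, static_pred):
    """Combine component predictions (booleans, True = taken).

    Assumed priority: static prediction for a branch seen for the first
    time, then the loop predictor when it is confident, then the
    meta-selected GEHL or TAGE prediction.
    """
    if first_occurrence:
        return static_pred
    if loop_valid:
        return loop_pred
    return tage_pred if meta_prefers_tage else gehl_pred


# Meta picks TAGE, no confident loop prediction, branch already seen.
p_meta = gtl_predict(True, False, True, False, None, False, True)
# A confident loop prediction overrides the GEHL/TAGE choice.
p_loop = gtl_predict(True, True, False, True, False, False, True)
# First occurrence falls back to the static prediction.
p_static = gtl_predict(False, False, False, False, None, True, True)
```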

Slide 21: Hope this will be sufficient to win the Championship!
- GTL: GEHL (97 components, 400 history) + TAGE (19 components, 100,000 history) + loop predictor
- misp/KI

Slide 22: Geometric History Length predictors and limits on branch prediction
- Unlimited budget, huge number of components: GEHL is more accurate than TAGE.
- Very old correlation can be captured: on two benchmarks, using a history length of 10,000 really helps.
- There does not seem to be much potential extra benefit from local history.
- We did not find any interesting extra scheme apart from loop prediction.
- Loop prediction is very marginal, apart from gzip.

Slide 23: The End