H-Pattern: A Hybrid Pattern Based Dynamic Branch Predictor with Performance Based Adaptation
Samir Otiv, Second Year Undergraduate
Kaushik Garikipati, Second Year Undergraduate
Milan Patnaik, MTech
Dr. V Kamakoti, Professor
Indian Institute of Technology Madras, Department of Computer Science & Engineering

Approach
Conditional branch instructions often follow patterns that repeat periodically. If a branch instruction is found to follow a repeating pattern, a predictor must be able to predict its outcome accurately for as long as the pattern persists.
Predicting ALL patterns with periods of ANY length is impossible given a fixed storage budget.

Approach
Strategy: Restrict ourselves to capturing patterns whose period is at most a predetermined length.
Objective: Create a predictor that captures patterns with periods of up to n bits.
Challenges:
1. Using minimum space
2. The patterns followed can change, so the predictor must relearn dynamically

Solution
For every branch, store a local history of 2n bits.
If a branch instruction follows a pattern of execution with a period p, where p is at most n, then the most recent n bits must be identical to the n bits that occurred p executions earlier:
outcome(h_i) = outcome(h_{i+p}), where h_i is the i-th most recent execution.
To predict, compare the most recent n bits to successively older history patterns (n-bit substrings of the local history) and stop at the first match. The bit just after this matching substring, that is, the outcome that followed it in time, is the prediction for the next execution. (The picture on the next slide should clarify.)

Illustration
Here, with n = 8, we store a local history of 16 bits. The branch instruction follows the repeating pattern (110), which has a period of 3. The bit string h_0 to h_7 (the current pattern) matches exactly the bit string h_3 to h_10 (the matched pattern). The prediction returned is the bit just after the matched pattern: h_2.
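The illustration can be reproduced with a short sketch. This is only a minimal model of the matching step, not the authors' implementation: the function name `first_match` and the list representation (index 0 holds h_0, the most recent outcome) are assumptions made for the example.

```python
def first_match(history, n):
    """Find the first older n-bit pattern equal to the current pattern.

    history[0] is h_0 (most recent outcome); len(history) must be 2 * n.
    Returns (offset, predicted_bit), or None when no pattern matches.
    """
    if len(history) != 2 * n:
        raise ValueError("local history must hold exactly 2n bits")
    current = history[:n]                   # h_0 .. h_{n-1}
    for i in range(1, n + 1):               # successively older patterns
        if history[i:i + n] == current:     # h_i .. h_{i+n-1}
            return i, history[i - 1]        # prediction is h_{i-1}
    return None


# The last 16 outcomes of a branch repeating (110), most recent first.
example_history = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
print(first_match(example_history, 8))      # (3, 1)
```

The match is found at offset 3, which equals the period, and the returned bit is h_2, as in the illustration.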

H-Pattern: nBPAT + AltPred
nBPAT: n-Bit Pattern Predictor
AltPred: Any other alternate branch predictor
When no pattern is detected (i.e. no pattern match occurs), AltPred is used. When a pattern is detected, the better performing predictor is used.

The nBPAT Predictor
Every entry of the predictor comprises:
- A 2n-bit shift register for local history
- A saturating counter that tracks the better performing predictor (as described in ‘Combining Branch Predictors’ by Scott McFarling)
Storage: Various configurations are possible (tagged/tagless, direct-mapped/associative)

The nBPAT Algorithm
To Predict:
1. Match the current pattern (h_0 to h_{n-1}) with successively older history patterns.
2. If the first match is found at h_i, then h_{i-1} is the predicted outcome. If the most significant bit of the saturating selection counter is 1, return h_{i-1}.
3. If there is no match, or if the most significant bit is 0, use AltPred.
To Update:
1. If AltPred mispredicted and nBPAT predicted correctly, increment the saturating selection counter.
2. If AltPred predicted correctly and nBPAT mispredicted, decrement the saturating selection counter.
3. If nBPAT was not ready, don’t change the saturating counter.
4. Update the local history by inserting the outcome of the branch into the local history shift register.
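The predict and update steps above can be sketched as a single table entry. This is a hedged illustration rather than the authors' code. The class name `NBPATEntry`, the 2-bit default counter width, the zero initial state, and the reading of "not ready" as "no pattern match" are all assumptions. AltPred is modelled as a bit passed in by the caller.

```python
class NBPATEntry:
    """One nBPAT entry: 2n-bit local history plus a selection counter."""

    def __init__(self, n, counter_bits=2):
        self.n = n
        self.history = [0] * (2 * n)    # history[0] is h_0 (assumed zero start)
        self.counter_bits = counter_bits
        self.counter = 0                # assumed to start in favour of AltPred

    def pattern_prediction(self):
        """Return h_{i-1} for the first match at offset i, else None."""
        current = self.history[:self.n]
        for i in range(1, self.n + 1):
            if self.history[i:i + self.n] == current:
                return self.history[i - 1]
        return None

    def predict(self, alt_pred):
        pat = self.pattern_prediction()
        msb_set = (self.counter >> (self.counter_bits - 1)) == 1
        if pat is not None and msb_set:
            return pat
        return alt_pred                 # no match, or counter favours AltPred

    def update(self, outcome, alt_pred):
        pat = self.pattern_prediction()
        # The counter moves only when nBPAT was ready and the two disagree,
        # because then exactly one of them predicted correctly.
        if pat is not None and pat != alt_pred:
            if pat == outcome:
                top = (1 << self.counter_bits) - 1
                self.counter = min(self.counter + 1, top)
            else:
                self.counter = max(self.counter - 1, 0)
        # Shift the outcome in as the new h_0 and drop the oldest bit.
        self.history = [outcome] + self.history[:-1]
```

For a branch repeating (110) with an AltPred that always predicts taken, the selection counter in this sketch saturates upward after a few periods, and the entry then predicts the 0 outcomes that AltPred misses.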

Combinations of H-Pattern
H-Pattern: Various configuration decisions
AltPred component: Several possible options, for instance:
- Gshare
- TAGE
- ISL-TAGE
nBPAT storage structure:
- Tagged/Tagless
- Associative/Direct Mapped

H-Pattern with Gshare
Configuration:
- Tagless, direct-mapped table used for nBPAT, indexed by a few of the least significant bits of the PC
- 50% of the storage budget assigned to nBPAT
Outcome: Distinct improvement in accuracy observed, as will be shown soon.
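As a rough sizing sketch for this configuration: the slides do not give n, the selection counter width, or the exact table size for the Gshare variant, so the values below (n = 8, a 2-bit counter, a power-of-two entry count) are assumptions chosen only to show the arithmetic. The function names are hypothetical.

```python
def nbpat_table_geometry(budget_bytes, n, counter_bits=2, share=0.5):
    """Size a tagless, direct-mapped nBPAT table inside a storage budget.

    Returns (entry_bits, index_bits, num_entries). The entry count is
    rounded down to a power of two so PC bits can index it directly.
    """
    entry_bits = 2 * n + counter_bits           # history + selection counter
    budget_bits = int(budget_bytes * 8 * share)
    max_entries = budget_bits // entry_bits
    if max_entries < 1:
        raise ValueError("budget too small for a single entry")
    index_bits = max_entries.bit_length() - 1   # largest power of two that fits
    return entry_bits, index_bits, 1 << index_bits


def nbpat_index(pc, index_bits):
    """Index the table with the least significant bits of the PC."""
    return pc & ((1 << index_bits) - 1)


print(nbpat_table_geometry(4096, 8))    # (18, 9, 512)
print(hex(nbpat_index(0x4005A7, 9)))    # 0x1a7
```

Under these assumed parameters, half of a 4KB budget holds 512 entries of 18 bits each, indexed by the low 9 bits of the PC.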

H-Pattern with Gshare

H-Pattern with TAGE/ISL-TAGE
- A minimal portion of storage is allocated to nBPAT.
- The storage structure must let nBPAT achieve maximum accuracy in very small storage spaces.
- The proportion of the storage budget allocated to nBPAT differed across budgets.
- The improvement in accuracy was smaller than that achieved with Gshare.

H-Pattern with TAGE/ISL-TAGE
Configuration: nBPAT storage
Partially tagged, 2-way set-associative.
Selection counter: 4 bits
Useful counter: Included in every entry. Serves as a measure of the effectiveness of an entry in the table.
Decremented if:
1. No pattern match found
2. Misprediction by nBPAT and correct prediction by AltPred
Incremented if AltPred mispredicts and nBPAT predicts correctly.
All useful counters are reset periodically using a global reset counter.
This captures the notion of an entry in the table being effective or ineffective, and aids the entry replacement policy.
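The useful-counter rules can be sketched as follows. The slides do not give the reset period or state what value the counters are reset to, so the `reset_period` parameter and the reset-to-zero behaviour are assumptions, as is the class name `UsefulTracker`.

```python
class UsefulTracker:
    """Saturating useful counters with a periodic global reset."""

    def __init__(self, num_entries, useful_bits=3, reset_period=1024):
        self.max_useful = (1 << useful_bits) - 1
        self.useful = [0] * num_entries
        self.reset_period = reset_period    # assumed value, not from the slides
        self.ticks = 0                      # global reset counter

    def record(self, entry, pattern_pred, alt_pred, outcome):
        """pattern_pred is None when no pattern match was found."""
        nbpat_wrong_alt_right = (pattern_pred != outcome
                                 and alt_pred == outcome)
        if pattern_pred is None or nbpat_wrong_alt_right:
            self.useful[entry] = max(self.useful[entry] - 1, 0)
        elif pattern_pred == outcome and alt_pred != outcome:
            self.useful[entry] = min(self.useful[entry] + 1, self.max_useful)
        # Count every update and clear all counters once per period
        # (clearing to zero is an assumption).
        self.ticks += 1
        if self.ticks == self.reset_period:
            self.useful = [0] * len(self.useful)
            self.ticks = 0
```

When both predictors agree on a correct or incorrect outcome, the counter is left unchanged, since neither rule on the slide applies.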

H-Pattern with TAGE/ISL-TAGE
Update algorithm:
1. If the TAGE predictor mispredicted, there is no tag match in the nBPAT 2-way associative table, and either of the 2 potential entry locations has Useful = 0, then set Tag = [BranchTag] and Useful = [Maximum].
2. If the entry already exists in the nBPAT 2-way associative table, then:
   1. If nBPAT was not ready, or nBPAT mispredicted and TAGE predicted correctly, decrease useful.
   2. If nBPAT predicted correctly and TAGE mispredicted, increase useful.
3. Update the nBPAT entry as described earlier in the nBPAT algorithm.
4. Update the TAGE/ISL-TAGE predictor.
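Step 1 of the update algorithm, allocation on a TAGE misprediction, might look like the sketch below. The names `Way`, `find_way` and `allocate_on_mispredict` are hypothetical, the use of -1 for an empty tag is an assumption, and the local history and selection counter are left out for brevity. The slides do not say whether they are cleared on allocation.

```python
from dataclasses import dataclass


@dataclass
class Way:
    tag: int = -1       # -1 marks an empty way (assumption)
    useful: int = 0


def find_way(ways, branch_tag):
    """Return the way whose tag matches, or None on a miss."""
    for way in ways:
        if way.tag == branch_tag:
            return way
    return None


def allocate_on_mispredict(ways, branch_tag, alt_mispredicted, max_useful=7):
    """Claim a way with useful == 0 for a branch that TAGE mispredicted.

    ways holds the two candidate ways of the set selected by the branch.
    Returns the allocated way, or None if nothing was allocated.
    """
    if not alt_mispredicted or find_way(ways, branch_tag) is not None:
        return None
    for way in ways:
        if way.useful == 0:
            way.tag = branch_tag
            way.useful = max_useful     # start at the maximum, per step 1
            return way
    return None                         # both ways still useful: keep them
```

Starting a new entry at the maximum useful value gives it time to detect a pattern before it becomes a replacement candidate again.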

Reference TAGE Configurations
The optimized configuration for an 8-table TAGE predictor, as specified in the paper “A case for (partially) Tagged Geometric history length branch prediction” by André Seznec and Pierre Michaud, was used.
4KB: History lengths = 5 to [...]
32KB: History lengths = 5 to 450
For the unlimited case, 18 tagged tables were used. History lengths = 3 to 2000

H-Pattern with TAGE Configurations
4KB: Tag length was reduced by 1 in every alternate table starting from T2. A 4-BPAT predictor was used with 7-bit tagged entries and 3-bit useful counters.
32KB: Table T6 of TAGE was halved in size. An 8-BPAT predictor was used with 8-bit tagged entries and 4-bit useful counters.
Unlimited: An 8-BPAT predictor was used with a 16-bit tag width.

H-Pattern with TAGE Mispredictions per Kilo Instructions – CBP 2014 Framework

Reference ISL-TAGE Configurations
4KB: The configuration was the same as the 8-component predictor specified in the paper “A case for (partially) Tagged Geometric history length branch prediction” by André Seznec and Pierre Michaud, with space freed from the base bimodal predictor (only 2K prediction entries and 1K hysteresis entries) to accommodate the statistical corrector and loop predictor. History lengths = 5 to [...]
32KB: The configuration (including history lengths) was identical to the one specified in the paper “A 64 Kbytes ISL-TAGE branch predictor” by André Seznec, with all storage tables halved.
Unlimited: 18 tagged tables were used. History lengths = 3 to 2000

H-Pattern with ISL-TAGE Configurations
4KB: From the reference 4KB ISL-TAGE, one tag bit was freed from every alternate table starting from T2. A 4-BPAT predictor was used with 7-bit tagged entries and 3-bit useful counters.
32KB: From the reference 32KB ISL-TAGE, the last shared table was halved and the sizes of the statistical corrector and loop predictor were reduced. A 4-BPAT predictor was used with 6-bit tagged entries and 3-bit useful counters.
Unlimited: In combination with the reference unlimited ISL-TAGE predictor, an 8-BPAT predictor was used with 16-bit tagged entries and 4-bit useful counters.

H-Pattern with ISL-TAGE Mispredictions per Kilo Instructions – CBP 2014 Framework

Further Statistics: Success rates
H-Pattern with   Component   Unlimited   32KB     4KB
Gshare           nBPAT       99.30%      99.10%   98.60%
Gshare           AltPred     94.50%      93.40%   91.30%
TAGE             nBPAT       99.50%      99.59%   98.98%
TAGE             AltPred     97.93%      97.15%   96.86%
ISL-TAGE         nBPAT       99.62%      99.56%   98.94%
ISL-TAGE         AltPred     98.09%      97.37%   96.93%

Thank You