Exploring Correlation for Indirect Branch Prediction. Nikunj Bhansali, Chintan Panirwala, Huiyang Zhou, Department of Electrical and Computer Engineering.

1 Exploring Correlation for Indirect Branch Prediction
Nikunj Bhansali, Chintan Panirwala, Huiyang Zhou
Department of Electrical and Computer Engineering, North Carolina State University

2 Baseline: ITTAGE Indirect Branch Predictor [A. Seznec and P. Michaud, JILP 2006]
A PPM-based predictor contains multiple Markov predictors, each capturing a different history length; the one with the longest match is used to make the prediction.
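The longest-match rule on this slide can be sketched as follows. This is a minimal illustration, not the ITTAGE implementation: the history lengths, the use of plain dictionaries as tables, and the names `table_key` and `predict_longest_match` are all assumptions made for the example.

```python
# Hypothetical history lengths for tables T1..T3 (shortest to longest).
HISTORY_LENGTHS = [2, 4, 8]


def table_key(pc, history, length):
    """Key a table by the branch PC plus the most recent `length` history items."""
    return (pc, tuple(history[-length:]))


def predict_longest_match(tables, pc, history):
    """Return (history_length, target) from the longest-history table that hits.

    `tables` is a list of dicts ordered like HISTORY_LENGTHS.
    Returns (None, None) when no table has a matching entry.
    """
    for length, table in reversed(list(zip(HISTORY_LENGTHS, tables))):
        target = table.get(table_key(pc, history, length))
        if target is not None:
            return length, target
    return None, None


# Usage: two tables hit, and the longer history wins.
history = [1, 2, 3, 4, 5, 6, 7, 8]
tables = [{}, {}, {}]
tables[0][table_key(0x400, history, 2)] = 0x100
tables[2][table_key(0x400, history, 8)] = 0x200
result = predict_longest_match(tables, 0x400, history)
```

Here `result` comes from the 8-item table because it is the longest one with a match; removing that entry would make the 2-item table the provider.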

3 Our Main Idea
- Longest history length vs. adaptive history lengths.
- Address-target correlation.

4 Predictor Structure – Main Predictor
[Figure: tagged tables T1 ... Tn; each entry holds Tag, u and Target fields, plus an Alt field in every table except T1. The per-table tag-match signals (T1_Match ... Tn_Match), the HBT hit signal and hlen select the final Target Prediction.]

5 Main Predictor at Fetch stage
ITTAGE as the baseline predictor (no T0).
Two ways to adaptively select the proper table (or history length):
1. Alt bit in each entry (except T1)
2. A separate table for hard-to-predict branches
Entry layout: tag | u | alt | target

6 Using Alt bits to select a table
- Alt = 0: the target from the current entry is preferred for the prediction.
- Alt = 1: a table with a shorter history is used to make the final prediction.
- There is no alt bit in table T1.
- Initially the alt field is set to zero.
- Update mechanism: if the table with the longest match fails to make the correct prediction while another table does, the alt field is set for those entries with longer history lengths.
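A minimal sketch of the alt-bit selection and update described on this slide. The `Entry` class and the function names are hypothetical. Two details are assumptions the slide does not settle: when every hit defers and no T1 hit exists, the sketch falls back to the longest match, and alt bits are set only on entries longer than the longest correct match.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    target: int
    alt: bool = False  # T1 entries carry no alt bit; it is simply ignored for T1


def select_with_alt(matches):
    """Pick (table_number, target) from tag-matching entries.

    `matches` is a list of (table_number, Entry); a larger table number
    means a longer history. Returns None when nothing matched.
    """
    ordered = sorted(matches, key=lambda m: m[0], reverse=True)
    if not ordered:
        return None
    for table_no, entry in ordered:
        if table_no == 1 or not entry.alt:
            return table_no, entry.target
    # Assumption: every hit deferred and no T1 hit, so use the longest match.
    return ordered[0][0], ordered[0][1].target


def update_alt(matches, actual_target):
    """Set alt on entries longer than the longest match that was correct."""
    ordered = sorted(matches, key=lambda m: m[0], reverse=True)
    if not ordered or ordered[0][1].target == actual_target:
        return  # the longest match was right (or nothing matched)
    for i, (_, entry) in enumerate(ordered):
        if entry.target == actual_target:
            for _, longer in ordered[:i]:
                longer.alt = True
            return


# Usage: T5 mispredicts while T4 is correct, so T5 defers from then on.
e2, e4, e5 = Entry(0xA), Entry(0xB), Entry(0xC)
matches = [(2, e2), (4, e4), (5, e5)]
before = select_with_alt(matches)
update_alt(matches, actual_target=0xB)
after = select_with_alt(matches)
```

Because T1 is the shortest table, it can never appear among the "longer" entries, which matches the slide's statement that T1 needs no alt bit.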

7 Hard-to-predict Branch Table (HBT)
- A cache-like set-associative structure; each entry contains a tag, a misprediction counter (mc) and a history length (hlen).
- The HBT is updated based on the prediction provided by the longest history.
- The mc field is used for replacement so that hard-to-predict branches are captured by the HBT.
- hlen is used to select the hlen-th longest history.
Entry layout: tag | mc | hlen

8 Hard-to-predict Branch Table (HBT)
- For example, if hlen = 2 and T2, T4 and T5 have tag matches and their corresponding alt fields are false, then T2 will be selected for prediction.
- The main predictor provides its prediction at the fetch stage.
- The main predictor is updated at the retire stage of an indirect branch.
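The hlen example above can be reproduced with a short sketch. It rests on one reading of the slide: hlen counts from zero, so hlen = 0 picks the longest eligible match and hlen = 2 skips two longer ones. That reading is an assumption, chosen because it is consistent with the slide's own example; the clamp for an hlen that exceeds the number of matches and the name `select_with_hlen` are also assumptions.

```python
def select_with_hlen(eligible_tables, hlen):
    """Return the table number chosen by the HBT's hlen field.

    `eligible_tables` lists the table numbers that have a tag match and
    whose alt field is false. Assumed reading: hlen is a zero-based rank
    among those tables, ordered from longest to shortest history.
    """
    ordered = sorted(eligible_tables, reverse=True)
    if not ordered:
        return None
    # Assumption: clamp to the shortest eligible table if hlen is too large.
    return ordered[min(hlen, len(ordered) - 1)]


# Usage: the slide's example, T2, T4 and T5 match and hlen = 2.
chosen = select_with_hlen([2, 4, 5], hlen=2)
print(chosen)  # 2, i.e. table T2
```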

9 Auxiliary Predictor at AGEN stage
Correlation between the producer load address and the consumer branch target, e.g.,
    Load R19 = Mem[R3]   // Address: 0x60848100, 0x60846ec8
    Br R19               // Target:  0x60751a64, 0x607691c9
The producer load accesses two addresses, and each address provides a different branch target. As long as the data structures at these addresses do not change frequently, the load addresses are sufficient to predict the target of the consumer indirect branch.

10 Auxiliary Predictor Design
[Figure: ATT entry, tagged by branch PC and looked up with the hashed load address]

11 Auxiliary Predictor Design
- Address Target Correlation (ATC) is captured using an Address Target Table (ATT).
- Accessed at the AGEN stage of the load instruction.
- The PC of the indirect branch is used for the tag match.
- The hashed load address is used to find the matching address-target pair.
- Updated at the EXE stage of an indirect branch.
- LRU replacement policy.
- Reduces the misprediction penalty when the prediction differs from the one provided at the fetch stage.
[Figure: ATT entry, tagged by branch PC and looked up with the hashed load address]
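A minimal software sketch of an ATT-like structure, assuming a fully associative table with LRU replacement across branch entries. The class name, the 10-bit XOR-folding hash and the unbounded number of address-target pairs per entry are illustrative assumptions; the slides do not specify the hash or the pair count, and pipeline timing (AGEN lookup, EXE update) is not modeled.

```python
from collections import OrderedDict


class AddressTargetTable:
    def __init__(self, num_entries=26):  # 26 entries, as listed under Storage Cost
        self.num_entries = num_entries
        # branch PC -> {hashed load address: target}, kept in LRU order
        self.entries = OrderedDict()

    @staticmethod
    def hash_address(addr):
        """Hypothetical hash: XOR-fold the address down to 10 bits."""
        return (addr ^ (addr >> 10) ^ (addr >> 20)) & 0x3FF

    def predict(self, branch_pc, load_addr):
        """Return the predicted target, or None if there is no matching pair."""
        pairs = self.entries.get(branch_pc)
        if pairs is None:
            return None
        self.entries.move_to_end(branch_pc)  # mark as most recently used
        return pairs.get(self.hash_address(load_addr))

    def update(self, branch_pc, load_addr, target):
        """Record the resolved target for this (branch, load address) pair."""
        if branch_pc not in self.entries and len(self.entries) >= self.num_entries:
            self.entries.popitem(last=False)  # evict the least recently used entry
        pairs = self.entries.setdefault(branch_pc, {})
        pairs[self.hash_address(load_addr)] = target
        self.entries.move_to_end(branch_pc)


# Usage with the address/target values from slide 9 (the branch PC is made up).
att = AddressTargetTable()
att.update(0x1000, 0x60848100, 0x60751A64)
att.update(0x1000, 0x60846EC8, 0x607691C9)
first = att.predict(0x1000, 0x60848100)
second = att.predict(0x1000, 0x60846EC8)
```

Each load address recalls its own target, which is the address-target correlation the slides describe; an unseen branch PC or load address yields no prediction.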

12 Storage Cost (1/2)
Tagged table entry
- U ctr: 2 bits
- Target: 32 bits
- Alt: 1 bit (except T1)
- Tag: partial tag
HBT (1,216 bits)
- 32 entries
- Tag: 32 bits
- mc: 2 bits
- hlen: 4 bits
ATT (11,882 bits)
- 26 entries
- Tag: 32 bits
- Lru: 5 bits
- : bits
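The HBT figure can be checked directly from the listed field widths. The helper name `table_bits` is made up for this check; the ATT total is not recomputed here because one of its fields is missing from the slide text.

```python
# Field widths for one HBT entry, taken from the slide.
HBT_ENTRIES = 32
HBT_FIELD_BITS = {"tag": 32, "mc": 2, "hlen": 4}


def table_bits(entries, field_bits):
    """Total storage in bits: number of entries times bits per entry."""
    return entries * sum(field_bits.values())


hbt_bits = table_bits(HBT_ENTRIES, HBT_FIELD_BITS)
print(hbt_bits)  # 1216, matching the 1,216 bits quoted for the HBT
```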

13 Storage Cost (2/2)
- Global history: 640 * 2 bits
- Path history: 16 bits
- Other counters: 39 bits
- Total: 64.97 KB

14 Experimental Results
- Overall performance improvement (ATT of 11,882 bits): 15.6%
- Performance improvement with a small ATT (1,624 bits): 14.8%

15 Discussion: Why we may not win
1. Other contestants are doing superbly!
2. Our baseline ITTAGE is not well tuned. The code and the predictor structure are modified from L-TAGE.

16 Discussion: Why we can win
Our main ideas, adaptive history length and address-target correlation, can further improve well-tuned predictors.

17 Conclusions
- Although control flow history carries correlation to targets, the strength of the correlation may either increase or decrease for different indirect branches when the history length is increased.
- There is strong correlation between producer load addresses and consumer branch targets.

18 Thank You

