Presentation is loading. Please wait.

Presentation is loading. Please wait.

André Seznec Caps Team IRISA/INRIA 1 Analysis of the O-GEHL branch predictor Optimized GEometric History Length André Seznec IRISA/INRIA/HIPEAC.

Similar presentations


Presentation on theme: "André Seznec Caps Team IRISA/INRIA 1 Analysis of the O-GEHL branch predictor Optimized GEometric History Length André Seznec IRISA/INRIA/HIPEAC."— Presentation transcript:

1 André Seznec Caps Team IRISA/INRIA 1 Analysis of the O-GEHL branch predictor Optimized GEometric History Length André Seznec IRISA/INRIA/HIPEAC

2 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 2 Objectives  State of the art accuracy: Any gain in branch prediction accuracy results in performance gain and power consumption gain  Keep the design implementable:  Rely on a single prediction scheme  Use only global history information: Designers hate maintaining speculative local history

3 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 3 L(0) ? L(4) L(3) L(2) L(1) TO T1 T2 T3 T4 The basis : A Multiple history length predictor

4 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 4 Selecting between multiple predictions  Classic solution:  Use of a meta predictor “wasting” storage !?! chosing among 5 or 10 predictions ??  Neural inspired predictors:  Use an adder tree instead of a meta-predictor Let’s use the adder tree

5 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 5 L(0) ∑ L(4) L(3) L(2) L(1) TO T1 T2 T3 T4 Final computation through a sum Prediction=Sign

6 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 6 From “old experience” on 2bcgskew  Some applications benefit from 100+ bits histories  Some don’t !!

7 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 7 GEometric History Length predictor The set of history lengths forms a geometric series {0, 2, 4, 8, 16, 32, 64, 128} What is important: L(i)-L(i-1) is drastically increasing Spends most of the storage for short history !!

8 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 8 Update policy  Perceptron inspired threshold based  Perceptron-like threshold did not work  Reasonable fixed threshold= Number of tables

9 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 9 Dynamic update threshold fitting On an O-GEHL predictor, best threshold depends on  the application   the predictor size   the counter width  By chance, on most applications, for the best fixed threshold, updates on mispredictions ≈ updates on correct predictions Monitor the difference and adapt the update threshold

10 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 10 Adaptative history length fitting (inspired by Juan et al 98) (½ applications: L(7) < 50) ≠ (½ applications: L(7) > 150 ) Let us adapt some history lengths to the behavior of each application  8 tables:  T2: L(2) and L(8)  T4: L(4) and L(9)  T6: L(6) and L(10)

11 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 11 Adaptative history length fitting (2) Intuition:  if high degree of conflicts on T7, stick with short history Implementation:  monitoring of aliasing on updates on T7 through a tag bit and a counter Simple is sufficient:  Flipping from short to long histories and vice-versa

12 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 12 Evaluation framework 1st Championship Branch Prediction traces: 20 traces including system activity Floating point apps : loop dominated Integer apps: usual SPECINT Multimedia apps Server workload apps: very large footprint

13 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 13 Reference configuration presented for Championship Branch Prediction  8 tables:  2 Kentries except T1, 1Kentries  5 bit counters for T0 and T1, 4 bit counters otherwise  1 Kbits of one bit tags associated with T7 10K + 5K + 6x8K + 1K = 64K  L(1) =3 and L(10)= 200  {0,3,5,8,12,19,31,49,75,125,200}

14 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 14 A case for the OGEHL predictor  2nd at CBP: 2.82 misp/KI  Best practice award:  The predictor the closest to a possible hardware implementation  Does not use exotic features: Various prime numbers, etc Strange initial state  Very short warming intervals:  Chaining all simulations: 2.84 misp/KI

15 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 15 A case for the OGEHL predictor (2)  High accuracy  32Kbits (3,150): 3.41 misp/KI better than any implementable 128 Kbits predictor before CBP 128 Kbits 2bcgkew (6,6,24,48): 3.55 misp/KI 176 Kbits PBNP (43) : 3.67 misp/KI  1Mbits (5,300): 2.27 misp/KI 1Mbit 2bcgskew (9,9,36,72): 3.19 misp/KI 1888 Kbits PBNP (58): 3.23 misp/KI

16 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 16 A case for the OGEHL predictor (3)  Robustness to variations of history lengths choices:  L(1) in [2,6], L(10) in [125,300]  misp. rate < 2.96 misp/KI  Geometric series: not a bad formula !!  best geometric L(1)=3, L(10)=223, 2.80 misp/KI  best overall {0, 2, 4, 9, 12, 18, 31, 54, 114, 145, 266} 2.78 misp/KI

17 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 17 Impact of the number of components  4 components — 8 components 64 Kbits: 3.02 -- 2.84 misp/KI 256Kbits: 2.59 -- 2.44 misp/KI 1Mbit: 2.40 -- 2.27 misp/KI  6 components — 12 components 48 Kbits: 3.02 – 3.03 misp/KI 768Kbits: 2.35 – 2.25 misp/KI 4 to 12 components bring high accuracy

18 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 18 Impact of counter width  Robustness to counter width variations:  3-bit counter, 49 Kbits: 3.09 misp/KI Dynamic update threshold fitting helps a lot  5-bit counter 79 Kbits: 2.79 misp/KI 4-bit is the best tradeoff

19 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 19 Prediction computation time  3 successive steps:  Index computation: a 3-entry XOR gate  Table read  Adder tree  May not fit on a single cycle:  But can be ahead pipelined !

20 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 20 Ahead pipelining a global history branch predictor (principle)  Initiate branch prediction X+1 cycles in advance to provide the prediction in time  Use information available: X-block ahead instruction address X-block ahead history  To ensure accuracy:  Use intermediate path information

21 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 21 Practice Ahead OGEHL: 8 // prediction computations bcd Ha A ABCD

22 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 22 Ahead Pipelined 64 Kbits OGEHL  3-block: 2.94 misp/KI  4-block: 2.99 misp/KI  5-block: 3.04 misp/KI Not such a huge accuracy loss

23 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 23 A final case for the O-GEHL predictor  delivers state-of-the-art accuracy  uses only global information:  Very long history: 200+ bits !!  can be ahead pipelined  many effective design points  Nb of tables, counter width, history lengths  prediction computation logic complexity is low (compared with concurrent predictors )

24 Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 24 The End


Download ppt "André Seznec Caps Team IRISA/INRIA 1 Analysis of the O-GEHL branch predictor Optimized GEometric History Length André Seznec IRISA/INRIA/HIPEAC."

Similar presentations


Ads by Google