Download presentation
Presentation is loading. Please wait.
Published byJames Kelly Modified over 9 years ago
1
André Seznec Caps Team IRISA/INRIA 1 Analysis of the O-GEHL branch predictor Optimized GEometric History Length André Seznec IRISA/INRIA/HIPEAC
2
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 2 Objectives State of the art accuracy: Any gain in branch prediction accuracy results in performance gain and power consumption gain Keep the design implementable: Rely on a single prediction scheme Use only global history information: Designers hate maintaining speculative local history
3
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 3 L(0) ? L(4) L(3) L(2) L(1) TO T1 T2 T3 T4 The basis : A Multiple history length predictor
4
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 4 Selecting between multiple predictions Classic solution: Use of a meta predictor “wasting” storage !?! chosing among 5 or 10 predictions ?? Neural inspired predictors: Use an adder tree instead of a meta-predictor Let’s use the adder tree
5
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 5 L(0) ∑ L(4) L(3) L(2) L(1) TO T1 T2 T3 T4 Final computation through a sum Prediction=Sign
6
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 6 From “old experience” on 2bcgskew Some applications benefit from 100+ bits histories Some don’t !!
7
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 7 GEometric History Length predictor The set of history lengths forms a geometric series {0, 2, 4, 8, 16, 32, 64, 128} What is important: L(i)-L(i-1) is drastically increasing Spends most of the storage for short history !!
8
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 8 Update policy Perceptron inspired threshold based Perceptron-like threshold did not work Reasonable fixed threshold= Number of tables
9
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 9 Dynamic update threshold fitting On an O-GEHL predictor, best threshold depends on the application the predictor size the counter width By chance, on most applications, for the best fixed threshold, updates on mispredictions ≈ updates on correct predictions Monitor the difference and adapt the update threshold
10
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 10 Adaptative history length fitting (inspired by Juan et al 98) (½ applications: L(7) < 50) ≠ (½ applications: L(7) > 150 ) Let us adapt some history lengths to the behavior of each application 8 tables: T2: L(2) and L(8) T4: L(4) and L(9) T6: L(6) and L(10)
11
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 11 Adaptative history length fitting (2) Intuition: if high degree of conflicts on T7, stick with short history Implementation: monitoring of aliasing on updates on T7 through a tag bit and a counter Simple is sufficient: Flipping from short to long histories and vice-versa
12
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 12 Evaluation framework 1st Championship Branch Prediction traces: 20 traces including system activity Floating point apps : loop dominated Integer apps: usual SPECINT Multimedia apps Server workload apps: very large footprint
13
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 13 Reference configuration presented for Championship Branch Prediction 8 tables: 2 Kentries except T1, 1Kentries 5 bit counters for T0 and T1, 4 bit counters otherwise 1 Kbits of one bit tags associated with T7 10K + 5K + 6x8K + 1K = 64K L(1) =3 and L(10)= 200 {0,3,5,8,12,19,31,49,75,125,200}
14
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 14 A case for the OGEHL predictor 2nd at CBP: 2.82 misp/KI Best practice award: The predictor the closest to a possible hardware implementation Does not use exotic features: Various prime numbers, etc Strange initial state Very short warming intervals: Chaining all simulations: 2.84 misp/KI
15
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 15 A case for the OGEHL predictor (2) High accuracy 32Kbits (3,150): 3.41 misp/KI better than any implementable 128 Kbits predictor before CBP 128 Kbits 2bcgkew (6,6,24,48): 3.55 misp/KI 176 Kbits PBNP (43) : 3.67 misp/KI 1Mbits (5,300): 2.27 misp/KI 1Mbit 2bcgskew (9,9,36,72): 3.19 misp/KI 1888 Kbits PBNP (58): 3.23 misp/KI
16
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 16 A case for the OGEHL predictor (3) Robustness to variations of history lengths choices: L(1) in [2,6], L(10) in [125,300] misp. rate < 2.96 misp/KI Geometric series: not a bad formula !! best geometric L(1)=3, L(10)=223, 2.80 misp/KI best overall {0, 2, 4, 9, 12, 18, 31, 54, 114, 145, 266} 2.78 misp/KI
17
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 17 Impact of the number of components 4 components — 8 components 64 Kbits: 3.02 -- 2.84 misp/KI 256Kbits: 2.59 -- 2.44 misp/KI 1Mbit: 2.40 -- 2.27 misp/KI 6 components — 12 components 48 Kbits: 3.02 – 3.03 misp/KI 768Kbits: 2.35 – 2.25 misp/KI 4 to 12 components bring high accuracy
18
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 18 Impact of counter width Robustness to counter width variations: 3-bit counter, 49 Kbits: 3.09 misp/KI Dynamic update threshold fitting helps a lot 5-bit counter 79 Kbits: 2.79 misp/KI 4-bit is the best tradeoff
19
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 19 Prediction computation time 3 successive steps: Index computation: a 3-entry XOR gate Table read Adder tree May not fit on a single cycle: But can be ahead pipelined !
20
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 20 Ahead pipelining a global history branch predictor (principle) Initiate branch prediction X+1 cycles in advance to provide the prediction in time Use information available: X-block ahead instruction address X-block ahead history To ensure accuracy: Use intermediate path information
21
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 21 Practice Ahead OGEHL: 8 // prediction computations bcd Ha A ABCD
22
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 22 Ahead Pipelined 64 Kbits OGEHL 3-block: 2.94 misp/KI 4-block: 2.99 misp/KI 5-block: 3.04 misp/KI Not such a huge accuracy loss
23
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 23 A final case for the O-GEHL predictor delivers state-of-the-art accuracy uses only global information: Very long history: 200+ bits !! can be ahead pipelined many effective design points Nb of tables, counter width, history lengths prediction computation logic complexity is low (compared with concurrent predictors )
24
Analysis of the O-GEHL branch predictor André Seznec Caps Team Irisa 24 The End
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.