Multiperspective Perceptron Predictor
Daniel A. Jiménez
Department of Computer Science & Engineering
Texas A&M University
Branch-Predicting Perceptron
• Inputs (x's) are from branch history
• n + 1 small integer weights (w's) learned by on-line training
• Output (y) is dot product of x's and w's; predict taken if y ≥ 0
• Training finds correlations between history and outcome
• Keep a table of perceptron weight vectors selected by hash of PC
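The bullets above can be sketched in a few lines of Python. This is a minimal illustration, not the predictor from the talk: the class name, the table size, the modulo "hash", the weight saturation limit, and the threshold value `theta` are all assumptions, and the update shown is the standard threshold-based perceptron rule (train on a misprediction or when |y| is small).

```python
class PerceptronPredictor:
    """Table of perceptrons selected by PC; inputs are +1/-1 history bits."""

    def __init__(self, history_length=8, num_rows=64, theta=32, wmax=127):
        self.theta = theta   # training threshold (illustrative value)
        self.wmax = wmax     # weights saturate at +/- wmax (assumed width)
        # Each row holds n + 1 weights; index 0 is the bias weight.
        self.table = [[0] * (history_length + 1) for _ in range(num_rows)]
        # Most recent outcome first; +1 = taken, -1 = not taken.
        self.history = [-1] * history_length

    def _row(self, pc):
        return self.table[pc % len(self.table)]  # stand-in for a real hash

    def output(self, pc):
        w = self._row(pc)
        return w[0] + sum(wi * xi for wi, xi in zip(w[1:], self.history))

    def predict(self, pc):
        return self.output(pc) >= 0   # predict taken if y >= 0

    def update(self, pc, taken):
        y = self.output(pc)
        t = 1 if taken else -1
        # Train on a misprediction or when the output is not confident.
        if (y >= 0) != taken or abs(y) <= self.theta:
            w = self._row(pc)
            inputs = [1] + self.history   # the bias input is always 1
            for i, xi in enumerate(inputs):
                w[i] = max(-self.wmax, min(self.wmax, w[i] + t * xi))
        self.history = [t] + self.history[:-1]
```

A caller would invoke `predict(pc)` at fetch and `update(pc, taken)` once the branch resolves.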
Neural Prediction in Current Processors
• We introduced the perceptron predictor [Jiménez & Lin 2001]
• I and others improved it considerably through 2011
• Today, Oracle SPARC T4 contains S3 core with
  • “perceptron branch prediction”
  • “branch prediction using a simple neural net algorithm”
  • Their IEEE Micro paper cites our HPCA 2001 paper
  • You can buy one today
• Today, AMD “Bobcat,” “Jaguar” and probably other cores
  • Have a “neural net logic branch predictor”
  • You can buy one today
Hashed Perceptron
• Introduced by Tarjan and Skadron 2005
• Breaks the 1-1 correspondence between history bits and weights
• Basic idea:
  • Hash segments of branch history into different tables
  • Sum weights selected by hash functions, apply threshold to predict
  • Update the weights using perceptron learning
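A rough sketch of the basic idea, under stated assumptions: the history segments, the XOR-fold hash, the table size, `theta`, and the weight range below are illustrative choices, not values from Tarjan and Skadron or from this talk.

```python
def fold(value, bits):
    """XOR-fold an integer down to `bits` bits (illustrative hash)."""
    mask = (1 << bits) - 1
    out = 0
    while value:
        out ^= value & mask
        value >>= bits
    return out


class HashedPerceptron:
    def __init__(self, segments=((0, 4), (4, 12), (12, 24)),
                 table_bits=10, theta=20, wmax=31):
        self.segments = segments      # (lo, hi) bit ranges of global history
        self.table_bits = table_bits
        self.theta = theta            # training threshold (assumed)
        self.wmax = wmax              # weight saturation limit (assumed)
        self.tables = [[0] * (1 << table_bits) for _ in segments]
        self.ghist = 0                # bit 0 is the most recent outcome

    def _indices(self, pc):
        mask = (1 << self.table_bits) - 1
        idx = []
        for lo, hi in self.segments:
            segment = (self.ghist >> lo) & ((1 << (hi - lo)) - 1)
            idx.append((fold(segment, self.table_bits) ^ pc) & mask)
        return idx

    def _sum(self, pc):
        return sum(t[i] for t, i in zip(self.tables, self._indices(pc)))

    def predict(self, pc):
        return self._sum(pc) >= 0

    def update(self, pc, taken):
        y = self._sum(pc)
        # Perceptron learning: train on a mispredict or a low-magnitude sum.
        if (y >= 0) != taken or abs(y) <= self.theta:
            step = 1 if taken else -1
            for table, i in zip(self.tables, self._indices(pc)):
                table[i] = max(-self.wmax, min(self.wmax, table[i] + step))
        history_bits = self.segments[-1][1]
        self.ghist = ((self.ghist << 1) | int(taken)) & ((1 << history_bits) - 1)
```

Each table contributes one weight per prediction, so a weight corresponds to a whole history segment rather than to a single history bit.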
Multiperspective Idea
• Rather than just global/local history, use many features
• Multiple perspectives on branch history
• Multiperspective Perceptron Predictor
  • Hashed Perceptron
  • Sum weights indexed by hashes of features
  • Update weights using perceptron training
• Contribution is a wide range of features
Traditional Features
• GHIST(a,b) – hash of the a-th to b-th most recent branch outcomes
• PATH(a,b) – hash of the a most recent PCs, shifted by b
• LOCAL – 11-bit local history
  • I DO ADVOCATE FOR LOCAL HISTORY IN REAL BRANCH PREDICTORS!
• GHISTPATH – combination of GHIST and PATH
• SGHISTPATH – alternate formulation allowing a range
• BIAS – bias of the branch to be taken regardless of history
Novel Features
• IMLI – from Seznec’s innermost loop iteration counter work:
  • When a backward branch is taken, count up
  • When a backward branch is not taken, reset counter
• I propose an alternate IMLI
  • When a forward branch is not taken, count up
  • When a forward branch is taken, reset counter
  • This represents loops where the decision to continue is at the top
  • Typical in code compiled for size or by JIT compilers
• Forward IMLI works better than backward IMLI on these traces
• I use both forward and backward in the predictor
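The two counting rules above might look like this in Python. The class name, the saturation limit `max_count`, and the use of `target < pc` to classify a branch as backward are assumptions for illustration.

```python
class ImliCounters:
    """Backward and forward innermost-loop iteration counters (sketch)."""

    def __init__(self, max_count=255):
        self.max_count = max_count   # saturation limit (assumed)
        self.backward = 0
        self.forward = 0

    def update(self, pc, target, taken):
        if target < pc:
            # Backward branch: count up when taken, reset when not taken.
            if taken:
                self.backward = min(self.backward + 1, self.max_count)
            else:
                self.backward = 0
        else:
            # Forward branch: count up when not taken, reset when taken.
            if taken:
                self.forward = 0
            else:
                self.forward = min(self.forward + 1, self.max_count)
```

Both counter values can then be used as features when indexing weight tables.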
Novel Features cont.
• MODHIST – modulo history
  • Branch histories become misaligned when some branches are skipped
  • MODHIST records only branches where PC ≡ 0 (mod n) for some n
  • Hopefully branches responsible for misalignment will not be recorded
  • Try many values of n to come up with a good MODHIST feature
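A minimal sketch of a modulo history register, assuming the history is kept as a fixed-length bit vector. The class name and the `length` parameter are hypothetical, and a real implementation may first drop PC alignment bits before taking the modulus.

```python
class ModHist:
    """Outcome history that records only branches with pc % modulus == 0."""

    def __init__(self, modulus, length):
        self.modulus = modulus
        self.mask = (1 << length) - 1
        self.bits = 0   # bit 0 is the most recently recorded outcome

    def update(self, pc, taken):
        if pc % self.modulus == 0:
            self.bits = ((self.bits << 1) | int(taken)) & self.mask
        # Branches at any other PC leave the history untouched.

    def value(self):
        return self.bits
```

For example, with `modulus=3` a branch at PC 7 is ignored, while branches at PCs 6, 9, and 12 are recorded.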
Novel Features cont.
• MODPATH – same idea with path of branch PCs
• GHISTMODPATH – combine two previous ideas
• RECENCY
  • Keep a recency stack of n branch PCs managed with LRU replacement
  • Hash the stack to get the feature
• RECENCYPOS
  • Position (0..n-1) of current branch in recency stack, or n if no match
  • Works surprisingly well
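RECENCY and RECENCYPOS can both be read off one small LRU stack. The sketch below is an illustration only: the class and method names and the multiply-and-add hash are assumptions.

```python
class RecencyStack:
    """LRU stack of recent branch PCs; index 0 is the most recently used."""

    def __init__(self, depth):
        self.depth = depth
        self.stack = []

    def position(self, pc):
        """RECENCYPOS: 0..depth-1 if pc is in the stack, else depth."""
        try:
            return self.stack.index(pc)
        except ValueError:
            return self.depth

    def hash(self, bits=10):
        """RECENCY: hash of the stack contents (illustrative hash)."""
        mask = (1 << bits) - 1
        h = 0
        for pc in self.stack:
            h = (h * 31 + pc) & mask
        return h

    def update(self, pc):
        if pc in self.stack:
            self.stack.remove(pc)   # hit: will be moved to the top
        elif len(self.stack) == self.depth:
            self.stack.pop()        # miss on a full stack: evict the LRU entry
        self.stack.insert(0, pc)
```

The position is read before the stack is updated for the current branch, so a branch that executes back to back reports position 0.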
Novel Features cont.
• BLURRYPATH
  • Shift higher-order bits of branch PC into an array
  • Only record the bits if they don’t match the current bits
  • Parameters are depth of array, number of bits to truncate
  • Indicates region a branch came from rather than the precise location
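One possible reading of BLURRYPATH, sketched in Python. It assumes that "the current bits" means the most recently recorded entry in the array; the class name and the hash are also assumptions.

```python
class BlurryPath:
    """Path of code regions: PCs with their low-order bits truncated."""

    def __init__(self, depth, truncate_bits):
        self.truncate_bits = truncate_bits
        self.regions = [0] * depth   # index 0 is the most recent region

    def update(self, pc):
        region = pc >> self.truncate_bits
        # Record the region only if it differs from the latest entry
        # (assumed interpretation of "don't match the current bits").
        if region != self.regions[0]:
            self.regions = [region] + self.regions[:-1]

    def hash(self, bits=10):
        mask = (1 << bits) - 1
        h = 0
        for region in self.regions:
            h = (h * 31 + region) & mask   # illustrative hash
        return h
```

Consecutive branches in the same region collapse into one entry, so the array tracks which regions control flow passed through rather than each branch address.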
Novel Features cont.
• ACYCLIC
  • Current PC indexes a small array, recording the branch outcome there
  • The array always has the latest outcome for a given bin of branches
  • Acyclic – loop or repetition behavior is not recorded
  • Parameter is number of bits in the array
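A small sketch of the ACYCLIC array, assuming the bin is chosen by PC modulo the array size (the slide does not give the indexing function). The class and method names are hypothetical.

```python
class AcyclicHistory:
    """One outcome bit per bin of branches; a bin keeps only its latest outcome."""

    def __init__(self, size):
        self.bits = [False] * size

    def update(self, pc, taken):
        # Overwrite in place, so repeated executions of the same branch
        # never lengthen the history.
        self.bits[pc % len(self.bits)] = taken

    def value(self):
        """Pack the array into an integer, bin i at bit i."""
        v = 0
        for i, bit in enumerate(self.bits):
            v |= int(bit) << i
        return v
```

Because a loop branch keeps overwriting the same bin, iterating a loop many times leaves the same footprint as iterating it once.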
Putting it Together
• Each feature computed, hashed, and XORed with current PC
• Resulting index selects weight from a table
• Weights are summed, thresholded to make prediction
• Weights are updated with perceptron learning
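The four steps above can be sketched as one small class that takes arbitrary feature functions. Everything concrete here is an assumption for illustration: the class name, one table per feature, the XOR-shift hash, the table size, `theta`, and the weight range.

```python
class FeatureSumPredictor:
    """One weight table per feature; prediction is the sign of the summed weights."""

    def __init__(self, features, table_bits=10, theta=10, wmax=31):
        self.features = features     # zero-argument callables returning ints
        self.table_bits = table_bits
        self.theta = theta           # training threshold (assumed)
        self.wmax = wmax             # weight saturation limit (assumed)
        self.tables = [[0] * (1 << table_bits) for _ in features]

    def _indices(self, pc):
        mask = (1 << self.table_bits) - 1
        indices = []
        for feature in self.features:
            v = feature()
            # Hash the feature value, then XOR with the current PC.
            indices.append((v ^ (v >> self.table_bits) ^ pc) & mask)
        return indices

    def _sum(self, pc):
        return sum(t[i] for t, i in zip(self.tables, self._indices(pc)))

    def predict(self, pc):
        return self._sum(pc) >= 0

    def train(self, pc, taken):
        y = self._sum(pc)
        if (y >= 0) != taken or abs(y) <= self.theta:
            step = 1 if taken else -1
            for table, i in zip(self.tables, self._indices(pc)):
                table[i] = max(-self.wmax, min(self.wmax, table[i] + step))


# Usage sketch: a 4-bit global history feature plus a constant BIAS-like feature.
state = {"ghist": 0}
predictor = FeatureSumPredictor([lambda: state["ghist"] & 0xF, lambda: 0])
guess = predictor.predict(0x40)
predictor.train(0x40, True)
state["ghist"] = (state["ghist"] << 1) | 1   # caller maintains the feature state
```

Keeping feature state outside the predictor means new features can be added without touching the prediction or training code.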
Optimizations
• Filter always/never taken branches
• Apply sigmoidal transfer function to weights before summing
• Coefficients for features to emphasize relative accuracy
• Bit width optimization for tables
• Shared magnitudes – two signs share one magnitude
• Alternate prediction on low confidence (see paper)
• Adaptive threshold training
• Hashing some tables together with IMLI and RECENCYPOS
Contribution of Features (8KB)
Results
• 8KB – MPKI
• 64KB – MPKI
• Unlimited – MPKI
Submit to HPCA 2017!
Note: Deadline is August 1, 2016!