Multiperspective Perceptron Predictor Daniel A. Jiménez Department of Computer Science & Engineering Texas A&M University.


Branch-Predicting Perceptron
- Inputs (x's) are from branch history
- n + 1 small integer weights (w's) learned by on-line training
- Output (y) is dot product of x's and w's; predict taken if y ≥ 0
  - Training finds correlations between history and outcome
  - Keep a table of perceptron weight vectors selected by hash of PC
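The perceptron described on this slide can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the class name, the PC-modulo hash, the table size, the threshold `theta`, and the weight saturation limit are all assumptions chosen for readability.

```python
class PerceptronPredictor:
    """Table of perceptron weight vectors selected by a hash of the branch PC."""

    def __init__(self, history_length=16, num_rows=256, theta=30, weight_limit=127):
        self.num_rows = num_rows
        self.theta = theta                # training threshold (illustrative value)
        self.weight_limit = weight_limit  # keeps the weights small integers
        # n + 1 weights per row: index 0 is the bias weight.
        self.weights = [[0] * (history_length + 1) for _ in range(num_rows)]
        # Inputs x_i: +1 for taken, -1 for not taken, newest outcome first.
        self.history = [-1] * history_length

    def _row(self, pc):
        return self.weights[pc % self.num_rows]  # assumed hash: PC modulo table size

    def output(self, pc):
        w = self._row(pc)
        return w[0] + sum(wi * xi for wi, xi in zip(w[1:], self.history))

    def predict(self, pc):
        return self.output(pc) >= 0  # predict taken if y >= 0

    def update(self, pc, taken):
        y = self.output(pc)
        t = 1 if taken else -1
        # Train on a misprediction or when the output magnitude is within theta.
        if (y >= 0) != taken or abs(y) <= self.theta:
            w = self._row(pc)
            w[0] = self._clamp(w[0] + t)
            for i, xi in enumerate(self.history, start=1):
                w[i] = self._clamp(w[i] + t * xi)
        # Shift the new outcome into the history, dropping the oldest one.
        self.history = [t] + self.history[:-1]

    def _clamp(self, value):
        return max(-self.weight_limit, min(self.weight_limit, value))
```

Calling `predict(pc)` before `update(pc, taken)` for each branch mirrors the predict-then-train order of a hardware predictor.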

Neural Prediction in Current Processors
- We introduced the perceptron predictor [Jiménez & Lin 2001]
- I and others improved it considerably through 2011
- Today, Oracle SPARC T4 contains S3 core with
  - “perceptron branch prediction”
  - “branch prediction using a simple neural net algorithm”
  - Their IEEE Micro paper cites our HPCA 2001 paper
  - You can buy one today
- Today, AMD “Bobcat,” “Jaguar” and probably other cores
  - Have a “neural net logic branch predictor”
  - You can buy one today

Hashed Perceptron
- Introduced by Tarjan and Skadron 2005
- Breaks the 1-1 correspondence between history bits and weights
- Basic idea:
  - Hash segments of branch history into different tables
  - Sum weights selected by hash functions, apply threshold to predict
  - Update the weights using perceptron learning
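A rough sketch of the basic idea, assuming three history segments and a multiplicative hash. The segment boundaries, table size, threshold, and hash constant are hypothetical choices for illustration and are not taken from Tarjan and Skadron's design.

```python
class HashedPerceptron:
    """One weight table per history segment; prediction is the sign of the sum."""

    def __init__(self, segments=((0, 4), (4, 12), (12, 24)), table_bits=10,
                 theta=12, weight_limit=31):
        self.segments = segments          # (low, high) bit ranges of global history
        self.size = 1 << table_bits
        self.theta = theta
        self.weight_limit = weight_limit
        self.tables = [[0] * self.size for _ in segments]
        self.ghist = 0                    # global history, newest outcome in bit 0

    def _indices(self, pc):
        indices = []
        for low, high in self.segments:
            segment = (self.ghist >> low) & ((1 << (high - low)) - 1)
            # Hypothetical hash: multiply by an odd constant, then mix in the PC.
            indices.append(((segment * 0x9E3779B1) ^ pc) % self.size)
        return indices

    def _sum(self, indices):
        return sum(table[i] for table, i in zip(self.tables, indices))

    def predict(self, pc):
        return self._sum(self._indices(pc)) >= 0

    def update(self, pc, taken):
        indices = self._indices(pc)
        y = self._sum(indices)
        # Perceptron learning: train on a misprediction or a low-magnitude sum.
        if (y >= 0) != taken or abs(y) <= self.theta:
            step = 1 if taken else -1
            for table, i in zip(self.tables, indices):
                table[i] = max(-self.weight_limit,
                               min(self.weight_limit, table[i] + step))
        self.ghist = ((self.ghist << 1) | int(taken)) & ((1 << 64) - 1)
```

Because each table holds one weight per hashed segment rather than one weight per history bit, long histories can be covered with a handful of table lookups.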

Multiperspective Idea
- Rather than just global/local history, use many features
  - Multiple perspectives on branch history
- Multiperspective Perceptron Predictor
  - Hashed Perceptron
  - Sum weights indexed by hashes of features
  - Update weights using perceptron training
  - Contribution is a wide range of features

Traditional Features
- GHIST(a,b) – hash of the a-th through b-th most recent branch outcomes
- PATH(a,b) – hash of the a most recent PCs, shifted by b
- LOCAL – 11-bit local history
  - I DO ADVOCATE FOR LOCAL HISTORY IN REAL BRANCH PREDICTORS!
- GHISTPATH – combination of GHIST and PATH
- SGHISTPATH – alternate formulation allowing range
- BIAS – bias of the branch to be taken regardless of history

Novel Features
- IMLI – from Seznec’s innermost loop iteration counter work:
  - When a backward branch is taken, count up
  - When a backward branch is not taken, reset counter
- I propose an alternate IMLI
  - When a forward branch is not taken, count up
  - When a forward branch is taken, reset counter
  - This represents loops where the decision to continue is at the top
  - Typical in code compiled for size or by JIT compilers
  - Forward IMLI works better than backward IMLI on these traces
  - I use both forward and backward in the predictor
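The two counters can be sketched as below. The class name, the `target < pc` test for a backward branch, and the saturating `max_count` are assumptions made for this illustration; the slide specifies only the count-up and reset rules.

```python
class IMLICounters:
    """Backward (Seznec-style) and forward (alternate) IMLI counters."""

    def __init__(self, max_count=255):
        self.max_count = max_count  # assumed saturation limit
        self.backward = 0
        self.forward = 0

    def update(self, pc, target, taken):
        if target < pc:
            # Backward branch: count up when taken, reset when not taken.
            if taken:
                self.backward = min(self.backward + 1, self.max_count)
            else:
                self.backward = 0
        else:
            # Forward branch: count up when not taken, reset when taken.
            if not taken:
                self.forward = min(self.forward + 1, self.max_count)
            else:
                self.forward = 0
```

Either counter value can then be hashed with the PC to index a weight table, in the same way as any other feature.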

Novel Features (cont.)
- MODHIST – modulo history
  - Branch histories become misaligned when some branches are skipped
  - MODHIST records only branches where PC ≡ 0 (mod n) for some n
  - Hopefully branches responsible for misalignment will not be recorded
  - Try many values of n to come up with a good MODHIST feature
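A minimal sketch of a modulo history register, assuming the history is kept as a fixed-length bit vector. The class name and the `length` parameter are hypothetical; a real implementation may also shift the PC before taking the modulus to account for instruction alignment.

```python
class ModHist:
    """History of outcomes for branches whose PC is congruent to 0 mod n."""

    def __init__(self, modulus, length):
        self.modulus = modulus  # the n in PC ≡ 0 (mod n)
        self.length = length    # number of outcomes retained (assumed parameter)
        self.bits = 0           # newest recorded outcome in bit 0

    def update(self, pc, taken):
        # Branches that fail the modulo test leave the history untouched.
        if pc % self.modulus == 0:
            mask = (1 << self.length) - 1
            self.bits = ((self.bits << 1) | int(taken)) & mask
```

Trying many values of n then amounts to instantiating several `ModHist` objects and keeping the ones that reduce mispredictions.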

Novel Features (cont.)
- MODPATH – same idea with path of branch PCs
- GHISTMODPATH – combine two previous ideas
- RECENCY
  - Keep a recency stack of n branch PCs managed with LRU replacement
  - Hash the stack to get the feature
- RECENCYPOS
  - Position (0..n-1) of current branch in recency stack, or n if no match
  - Works surprisingly well
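RECENCY and RECENCYPOS can both be derived from one LRU-managed stack, sketched here. The class and method names and the mixing function in `hashed` are assumptions for illustration only.

```python
class RecencyStack:
    """LRU stack of the n most recently seen branch PCs."""

    def __init__(self, depth):
        self.depth = depth
        self.stack = []  # most recently used PC first

    def position(self):
        raise NotImplementedError  # placeholder overwritten below for clarity

    def position(self, pc):
        """RECENCYPOS: index 0..n-1 of pc in the stack, or n if no match."""
        try:
            return self.stack.index(pc)
        except ValueError:
            return self.depth

    def hashed(self):
        """RECENCY: hash of the stack contents (hypothetical mixing function)."""
        h = 0
        for pc in self.stack:
            h = ((h * 31) ^ pc) & 0xFFFFFFFF
        return h

    def touch(self, pc):
        """Move pc to the top, evicting the least recently used entry if full."""
        if pc in self.stack:
            self.stack.remove(pc)
        self.stack.insert(0, pc)
        del self.stack[self.depth:]
```

In use, `position(pc)` and `hashed()` would be read to form the features before `touch(pc)` records the current branch.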

Novel Features (cont.)
- BLURRYPATH
  - Shift higher-order bits of branch PC into an array
  - Only record the bits if they don’t match the current bits
  - Parameters are depth of array, number of bits to truncate
  - Indicates region a branch came from rather than the precise location
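One possible reading of BLURRYPATH is sketched below. It assumes that "the current bits" means the most recently recorded region, and the names `BlurryPath`, `depth`, and `shift` are hypothetical.

```python
class BlurryPath:
    """Path of coarse code regions rather than exact branch addresses."""

    def __init__(self, depth, shift):
        self.depth = depth            # number of regions remembered
        self.shift = shift            # low-order PC bits to truncate
        self.regions = [0] * depth    # most recent region first

    def update(self, pc):
        region = pc >> self.shift
        # Record the region only when it differs from the most recent entry
        # (assumed interpretation of "don't match the current bits").
        if region != self.regions[0]:
            self.regions = [region] + self.regions[:-1]
```

Under this reading, a run of branches inside one region occupies a single array entry, so the array reflects the sequence of regions visited.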

Novel Features (cont.)
- ACYCLIC
  - Current PC indexes a small array, recording the branch outcome there
  - The array always has the latest outcome for a given bin of branches
  - Acyclic – loop or repetition behavior is not recorded
  - Parameter is number of bits in the array
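A small sketch of the acyclic history, assuming the bin is chosen by PC modulo the array size. The class name and the `value` helper that packs the array into an integer are illustrative additions.

```python
class AcyclicHistory:
    """Latest outcome per bin of branches, indexed by PC."""

    def __init__(self, size):
        self.size = size                 # number of bits in the array
        self.outcomes = [False] * size

    def update(self, pc, taken):
        # Overwrite the bin: repeated executions of one branch leave one bit.
        self.outcomes[pc % self.size] = taken

    def value(self):
        """Pack the array into an integer, bin 0 as the most significant bit."""
        v = 0
        for bit in self.outcomes:
            v = (v << 1) | int(bit)
        return v
```

Because each bin is overwritten rather than shifted, a loop that executes the same branch many times does not push older information out of the array.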

Putting it Together
- Each feature computed, hashed, and XORed with current PC
- Resulting index selects weight from a table
- Weights are summed, thresholded to make prediction
- Weights are updated with perceptron learning
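The four steps above can be sketched with one weight table per feature. This is an assumption-laden illustration rather than the submitted predictor: `feature_index`, the hash constant, the table size, and the threshold are hypothetical, and the caller is expected to supply feature values computed before the branch resolves.

```python
def feature_index(feature_value, pc, table_size):
    # Hypothetical hash of the feature value, XORed with the current PC.
    return ((feature_value * 0x9E3779B1) ^ pc) % table_size


class MultiFeaturePredictor:
    """One weight table per feature; prediction is the thresholded sum."""

    def __init__(self, num_features, table_size=1024, theta=10, weight_limit=31):
        self.table_size = table_size
        self.theta = theta
        self.weight_limit = weight_limit
        self.tables = [[0] * table_size for _ in range(num_features)]

    def _indices(self, pc, feature_values):
        return [feature_index(v, pc, self.table_size) for v in feature_values]

    def output(self, pc, feature_values):
        indices = self._indices(pc, feature_values)
        return sum(table[i] for table, i in zip(self.tables, indices))

    def predict(self, pc, feature_values):
        return self.output(pc, feature_values) >= 0

    def train(self, pc, feature_values, taken):
        y = self.output(pc, feature_values)
        # Perceptron learning: adjust on a misprediction or low-magnitude sum.
        if (y >= 0) != taken or abs(y) <= self.theta:
            step = 1 if taken else -1
            for table, i in zip(self.tables, self._indices(pc, feature_values)):
                table[i] = max(-self.weight_limit,
                               min(self.weight_limit, table[i] + step))
```

For example, passing `[last_outcome, 0]` as the feature values gives a one-bit history feature plus a constant feature that behaves like a per-branch bias weight.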

Optimizations
- Filter always/never taken branches
- Apply sigmoidal transfer function to weights before summing
- Coefficients for features to emphasize relative accuracy
- Bit width optimization for tables
- Shared magnitudes – two signs share one magnitude
- Alternate prediction on low confidence (see paper)
- Adaptive threshold training
- Hashing some tables together with IMLI and RECENCYPOS

Contribution of Features (8KB)

Results
- 8KB – MPKI
- 64KB – MPKI
- Unlimited – MPKI

Submit to HPCA 2017!
Note: Deadline is August 1, 2016!