Published by Ezra Blankenship. Modified over 8 years ago.
1
FAT predictor Sabareesh Ganapathy, Prasanna Venkatesh Srinivasan, Maribel Monica
2
UNIVERSITY OF WISCONSIN-MADISON
What is FAT? :)
FAT is a Frequency-Analysis-based branch predictor integrated with TAGE.
Frequency analysis studies the frequency-domain characteristics of a branch's outcome history to predict the branch as Taken or Not taken.
Historical context: frequency analysis using the Fourier Transform was explored in FAB [1], which statically profiled branch frequency characteristics across different workloads. FAT, in contrast, is a dynamic branch predictor.
[1] M. Kampe, P. Stenstrom and M. Dubois, “The FAB predictor: Using Fourier Analysis to Predict the Outcome of Conditional Branches”, Proceedings of the Eighth International Symposium on High Performance Computer Architecture (HPCA), 2002.
3
Underlying philosophy
History repeats itself!
[Figure: timeline labeled May '15, December '15, May '16]
4
Frequency Analysis
Local history table (LHR table): 256 entries, 128 bits each, PC-based address.
Frequency table (Ftable): 256 entries, 4-way set associative, PC- and global-history-based address.
FAB entry fields: TAG1, TAG2, IFFT, Time Count, TH, Confid.
[Diagram: the pc selects an LHR table entry holding history bits b0, b1, ..., bn; a hash of pc and ghist selects an Ftable set holding FAB entries 0 to 3, each with a TAG and an IFFT field]
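The local history table described above can be sketched as an array of shift registers. This is only an illustration: the slide gives the sizes (256 entries, 128 bits) and says the address is PC-based, but the class name, the modulo indexing, and the bit ordering are assumptions.

```python
class LocalHistoryTable:
    """Per-branch outcome history, one shift register per entry (illustrative)."""

    def __init__(self, entries=256, history_bits=128):
        self.entries = entries
        self.history_bits = history_bits
        self.mask = (1 << history_bits) - 1
        self.table = [0] * entries

    def index(self, pc):
        # Assumption: plain modulo indexing; the slides only say "PC-based".
        return pc % self.entries

    def update(self, pc, taken):
        # Shift the newest outcome into the low bit and drop the oldest bit.
        i = self.index(pc)
        self.table[i] = ((self.table[i] << 1) | int(taken)) & self.mask

    def history(self, pc):
        # Return the history as a list of 0/1 values, oldest outcome first.
        value = self.table[self.index(pc)]
        return [(value >> s) & 1 for s in range(self.history_bits - 1, -1, -1)]
```

The list returned by `history` is the kind of 0/1 sequence a frequency transform would take as input.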
5
TAGE predictor
TAGE: 1 bimodal predictor and 12 TAGE tables are used.
Minimum history length = 4; maximum global history length = 640.
The folded history and the PC are hashed to compute the index and the tag for each TAGE entry.
[Diagram: TAGE tables up to T12]
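The slide states only the shortest and longest history lengths. A minimal sketch of how the intermediate lengths could be filled in, assuming the geometric series described in the TAGE paper [2]; the function name and the rounding rule are illustrative, and the lengths actually used in FAT may differ.

```python
def geometric_history_lengths(min_len=4, max_len=640, num_tables=12):
    """History length for each tagged table, assuming a geometric series."""
    # Common ratio chosen so the first table uses min_len and the last max_len.
    ratio = (max_len / min_len) ** (1.0 / (num_tables - 1))
    # Round to the nearest integer (assumed rounding rule).
    return [int(min_len * ratio ** i + 0.5) for i in range(num_tables)]


if __name__ == "__main__":
    for table, length in enumerate(geometric_history_lengths(), start=1):
        print(f"T{table}: {length} bits of global history")
```

A geometric spacing spends most tables on short histories while still letting a few tables capture correlations hundreds of branches back.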
6
FAB + TAGE algorithm
[Flowchart with Yes/No decision branches; box contents not recoverable]
7
Update FAB
[Flowchart: a decision on Time_count == 3 with Yes/No branches; one branch increments time_count, the other runs the steps below]
Calculate the FFT from the LHR.
Remove the DC component.
Normalize the FFT.
Filter the top N_f frequency components.
Compute the IFFT from the filtered array and store it.
Compute the threshold (TH) and store it.
Update threshold: Threshold = (coefficient missing in source) * Freq_sum / N_f, where N_f is the number of frequency components and Freq_sum is the sum of the absolute values of the filtered array.
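The update steps can be sketched in a few lines of Python. This is not the authors' code (they used FFTW from C++): a naive DFT stands in for the FFT, "normalize" is assumed to mean dividing by the history length, and `scale` is a hypothetical stand-in for the coefficient that is missing from the threshold formula on the slide.

```python
import cmath


def dft(samples):
    """Naive O(n^2) discrete Fourier transform of a real sequence."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def filter_history(lhr_bits, num_components, scale=1.0):
    """Return (filtered waveform, threshold) for one local history.

    `scale` is hypothetical; the slide's coefficient was lost in extraction.
    """
    n = len(lhr_bits)
    spectrum = dft(lhr_bits)
    spectrum[0] = 0                        # remove the DC component
    spectrum = [c / n for c in spectrum]   # normalize (assumed: divide by n)

    # Keep only the num_components bins with the largest magnitude.
    ranked = sorted(range(n), key=lambda k: abs(spectrum[k]), reverse=True)
    keep = set(ranked[:num_components])
    filtered = [spectrum[k] if k in keep else 0 for k in range(n)]

    # Threshold = scale * Freq_sum / N_f, as on the slide.
    freq_sum = sum(abs(c) for c in filtered)
    threshold = scale * freq_sum / num_components

    # Inverse transform of the already-normalized spectrum; only the real
    # part is kept, since one bin of a conjugate pair may have been dropped.
    waveform = [sum(filtered[k] * cmath.exp(2j * cmath.pi * k * t / n)
                    for k in range(n)).real
                for t in range(n)]
    return waveform, threshold
```

For an alternating history such as `[1, 0] * 8`, a single retained component already reproduces the taken/not-taken rhythm as a waveform that is positive on taken slots and negative on the others.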
8
Infrastructure
The CBP-2016 infrastructure was used.
Branch traces for server and mobile benchmarks were provided.
The main program decodes the instructions and passes only conditional branches to the predictor.
The predictor function was rewritten to implement our custom predictor.
The FFTW library was used to compute the FFT and DCT transforms in the C++ program.
9
MATLAB Analysis
A Perl script was written to scan the edge sequence in the trace files.
Dependencies between edges were determined, and the local history was extracted for all branches allocated in the FAB table.
The local history extracted by the script was then used in MATLAB.
The FAB predictor was modelled in MATLAB, and analysis was performed to determine the threshold and the number of frequency components required for correct prediction.
10
Regression analysis in CBP infrastructure
Parameters such as the number of local history register bits and the number of frequency components were varied to observe their effect on the misprediction rate.
The FAB predictor was modified to use the Discrete Cosine Transform instead of the Discrete Fourier Transform.
The DC component was used in prediction when the local history register was all 1's.
11
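[Table: configurations 1 to 10 (the X axis labels of the results plot). Rows: Technique (FFT, DCT (no opt), DCT (opt)); Time count (values fused in extraction: 4644408326444); LHR length (512, 256, 128); LHR table entries (2^14, 2^20, 2^14, 2^10). The assignment of values to individual configurations is not recoverable.]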
12
[Figure: plot with axis label "# Frequency Components"; data not recoverable]
13
HW Budget and Implementation
Configuration                                      Storage
1 Bimodal + 12 TAGE tables                         250 Kbits
1 Bimodal + 11 TAGE + 1 LHR table + 1 FAB table    370 Kbits
FAB ENTRY: TAG1 | TAG2 | IFFT/IDCT | Time Count | Threshold | Confidence
Implementing a frequency transform (DCT/FFT) in HW is complex.
A stochastic implementation of the DCT was explored.
14
Stochastic Logic
A real value x in the range 0 to 1 is represented by a sequence of random bits.
a = 6/8: 1,1,0,1,0,1,1,1
b = 4/8: 1,1,0,0,1,0,1,0
c = 3/8: 1,1,0,0,0,0,1,0
Simple logic and fault-tolerant characteristics.
Applicable to frequency transforms and image processing.
Slide content derived from Marc Riedel's circuits course at UMinn.
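The three bit streams on this slide are consistent with stochastic multiplication: the 3/8 stream is exactly the bitwise AND of the 6/8 and 4/8 streams, and 6/8 × 4/8 = 3/8. A short check, with hypothetical helper names (the slide text itself does not name the gate):

```python
def stream_value(bits):
    """Value encoded by a stochastic bit stream: the fraction of 1s."""
    return sum(bits) / len(bits)


def stochastic_multiply(a_bits, b_bits):
    """Multiply two stochastic streams with one AND gate per bit position."""
    return [x & y for x, y in zip(a_bits, b_bits)]


a = [1, 1, 0, 1, 0, 1, 1, 1]   # 6/8
b = [1, 1, 0, 0, 1, 0, 1, 0]   # 4/8
c = stochastic_multiply(a, b)  # matches the 3/8 stream from the slide

if __name__ == "__main__":
    print(c, stream_value(c))
```

The product is exact here only because of how these particular streams line up; in general an AND gate approximates the product, and the approximation relies on the two input streams being statistically independent.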
15
Stochastic DCT
$X_c(k) = \frac{1}{N} \sum_{n=0}^{N-1} x(n) \cos\left(\frac{2\pi k n}{N}\right), \quad k = 0, \dots, N-1$
Steps were also implemented for finding the top frequency components, thresholding, and the IDCT.
Results are comparable to the DCT computed with the FFTW library (the result degrades by 5%).
[Block diagram components: Branch History, Angle Mapper (0 to pi/4), SNG, Cos(x), Cos(2x), Cos(4x), Cos(8x), Multiplier, Adder, DCT]
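As a floating-point reference for the transform on this slide, the sum can be evaluated directly. The second helper illustrates one plausible reading of the Cos(x), Cos(2x), Cos(4x), Cos(8x) blocks, namely repeated use of the double-angle identity cos(2x) = 2cos²(x) − 1; that reading is an assumption, and neither function models the stochastic hardware itself.

```python
import math


def cosine_transform(x):
    """X_c(k) = (1/N) * sum_n x(n) * cos(2*pi*k*n/N), for k = 0..N-1."""
    n = len(x)
    return [sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n)) / n
            for k in range(n)]


def double_angle_chain(angle, steps=3):
    """Return [cos(x), cos(2x), cos(4x), ...] using cos(2x) = 2*cos(x)**2 - 1."""
    values = [math.cos(angle)]
    for _ in range(steps):
        values.append(2 * values[-1] ** 2 - 1)
    return values


if __name__ == "__main__":
    history = [1, 0] * 4                 # alternating taken / not taken
    print(cosine_transform(history))
    print(double_angle_chain(0.3))
```

For the alternating history, the only non-zero coefficients are the DC term and the k = N/2 term, which is what makes such a branch easy to describe with very few frequency components.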
16
FTAGE
The filtered local history is used as the tag and index for a pattern history table.
FILTER: Local History -> FFT -> Choose top 10 -> IFFT + Thresholding -> Filtered History.
FTABLE: the 128-bit filtered history and the PC go through history folding to address the FTAGE-TABLE.
FTAGE-TABLE entry fields: TAG, CTR, Confid. The selected entry is used for prediction.
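A minimal sketch of the folding step, assuming a simple XOR fold of the 128-bit filtered history combined with the PC. The index and tag widths, the function names, and the hash itself are hypothetical; the slide does not specify them.

```python
def fold_bits(bits, width):
    """XOR-fold a list of 0/1 values into an integer of at most `width` bits."""
    folded = 0
    for start in range(0, len(bits), width):
        chunk = 0
        for bit in bits[start:start + width]:
            chunk = (chunk << 1) | bit
        folded ^= chunk
    return folded


def ftage_index_and_tag(pc, filtered_history, index_bits=10, tag_bits=12):
    """Derive a table index and a tag from the PC and the filtered history.

    index_bits and tag_bits are illustrative widths, not values from the slides.
    """
    index = (pc ^ fold_bits(filtered_history, index_bits)) & ((1 << index_bits) - 1)
    tag = (pc ^ fold_bits(filtered_history, tag_bits)) & ((1 << tag_bits) - 1)
    return index, tag
```

Folding with two different widths keeps the index and the tag from being trivially correlated, so two histories that collide on the index still tend to differ in the tag.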
17
Results
"The best way to predict the future is to invent it." (Alan Kay)
18
FTAGE: Future Work
An IIR filter can be used for filtering.
The top ten frequency components were measured for a number of traces.
19
References
[1] M. Kampe, P. Stenstrom and M. Dubois, “The FAB predictor: Using Fourier Analysis to Predict the Outcome of Conditional Branches”, Proceedings of the Eighth International Symposium on High Performance Computer Architecture (HPCA), 2002.
[2] A. Seznec and P. Michaud, “A case for (partially) Tagged Geometric history length branch prediction”, Journal of Instruction Level Parallelism, Feb. 2006.
[3] Weikang Qian, Xin Li, Marc D. Riedel, Kia Bazargan, and David J. Lilja, “An Architecture for Fault-Tolerant Computation with Stochastic Logic”, IEEE Transactions on Computers, Vol. 60, pp. 93-105.
[4] Xiaowei Qin, Shenglong Shang, Adong Fan, “Low-complexity FPGA Implementation of Sine/Cosine Generator Based on Stochastic Computation”.