Value Prediction Kyaw Kyaw, Min Pan Final Project.



What is Value Prediction?
- Predict the result values of instructions before they are executed
- Branch Prediction: eliminates the control dependences
- Value Prediction: eliminates the data dependences

Why Value Prediction?
- Results of many instructions can be accurately predicted before they are issued or executed.
- Dependent instructions are no longer bound by the serialization constraints imposed by data dependences.
- More parallelism can be exploited.

Why is Value Prediction possible?
- Value Locality


Causes of Value Locality
- Data redundancy
- Error checking
- Program constants
- Computed branches
- ……
- Virtual function calls
- Glue code
- Addressability
- Memory alias resolution

Value Prediction Units
Three factors determine the efficacy:
- Accuracy: ability to avoid mispredictions
- Coverage: ability to predict as many instruction outcomes as possible
- Scope: the set of instructions that the predictor targets
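To make the accuracy and coverage factors concrete, here is a minimal sketch. It assumes accuracy is measured as correct predictions over predictions made, and coverage as predictions made over all in-scope instructions; the slides do not give formulas, so these definitions and the function name `accuracy_and_coverage` are illustrative only.

```python
def accuracy_and_coverage(outcomes):
    """outcomes: list of (prediction_made, prediction_correct) booleans,
    one pair per dynamic instruction inside the predictor's scope."""
    total = len(outcomes)
    made = sum(1 for was_made, _ in outcomes if was_made)
    correct = sum(1 for was_made, ok in outcomes if was_made and ok)
    # Assumed definitions (not taken from the slides):
    accuracy = correct / made if made else 0.0   # how often a prediction is right
    coverage = made / total if total else 0.0    # how often a prediction is offered
    return accuracy, coverage


# Four in-scope instructions: three predicted, two of those correct.
acc, cov = accuracy_and_coverage(
    [(True, True), (True, False), (False, False), (True, True)]
)
```

Declining to predict the third instruction lowers coverage but leaves accuracy untouched, which is the trade-off between the two factors.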

Relationships between factors
- Accuracy ↔ Coverage trade-off
- Scope
  - Low implementation cost
  - Achieve better accuracy and coverage
  - Mispredictions for useless predictions are eliminated

Value Prediction Units
Three types:
- History-Based Predictors
- Computational Predictors
- Hybrid Predictors

Value Prediction Units

Example for Value Prediction Implementation

Value Prediction Techniques
- Last Value Predictor
- Register Value Predictor
- Stride 2-delta Predictor
- Last Four Value Predictor
- Finite Context Method Predictor
- Confidence Estimation
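Two of the listed techniques can be sketched in a few lines. This is a simplified illustration, not the project's implementation: the tables are unbounded dictionaries keyed by instruction address, the stride predictor is a plain single-stride version rather than the 2-delta variant, and the class names and the `hit_rate` helper are hypothetical.

```python
class LastValuePredictor:
    """Predicts that an instruction produces the same value as last time."""

    def __init__(self):
        self.table = {}  # pc -> last value produced

    def predict(self, pc):
        return self.table.get(pc)  # None means "no prediction"

    def update(self, pc, actual):
        self.table[pc] = actual


class StridePredictor:
    """Predicts last value + last observed stride (simple stride, not 2-delta)."""

    def __init__(self):
        self.table = {}  # pc -> (last value, stride)

    def predict(self, pc):
        entry = self.table.get(pc)
        if entry is None:
            return None
        last, stride = entry
        return last + stride

    def update(self, pc, actual):
        entry = self.table.get(pc)
        stride = actual - entry[0] if entry else 0
        self.table[pc] = (actual, stride)


def hit_rate(predictor, pc, values):
    """Fraction of values the predictor gets right for one static instruction."""
    hits = 0
    for value in values:
        if predictor.predict(pc) == value:
            hits += 1
        predictor.update(pc, value)
    return hits / len(values)
```

On a sequence such as 10, 14, 18, 22, 26 the last-value predictor never hits, while the stride predictor hits once the stride of 4 has been observed. On a constant sequence both behave identically.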

Sample Research Works
- M. H. Lipasti, C. B. Wilkerson, J. P. Shen, "Value Locality and Load Value Prediction," ASPLOS-VII, October 1996
- B. Calder, G. Reinman, D. M. Tullsen, "Selective Value Prediction," Proceedings of the 26th International Symposium on Computer Architecture, May 1999

Value Locality & Prediction
- Value locality: the likelihood of a previously seen value recurring repeatedly within a storage location
- Exists primarily due to an effective compile-time optimization
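One way to estimate value locality from a trace is to count how often a load returns a value that is among the last few values the same static load returned. The sketch below assumes a trace of `(pc, value)` pairs and a configurable history depth; the trace format and the function name `value_locality` are assumptions made for illustration.

```python
from collections import defaultdict, deque


def value_locality(trace, depth=1):
    """Fraction of trace entries whose value matches one of the last
    `depth` values produced by the same static instruction (pc)."""
    history = defaultdict(lambda: deque(maxlen=depth))  # pc -> recent values
    hits = 0
    for pc, value in trace:
        if value in history[pc]:
            hits += 1
        history[pc].append(value)  # oldest value falls out automatically
    return hits / len(trace) if trace else 0.0


trace = [(0x40, 5), (0x40, 5), (0x40, 9), (0x40, 5), (0x44, 1), (0x44, 1)]
shallow = value_locality(trace, depth=1)
deeper = value_locality(trace, depth=2)
```

A deeper history can only raise the measured locality, since every value matched at depth 1 is also matched at depth 2.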

Load Value Prediction
- Based on the branch prediction idea, but tries to predict a full 32-bit or 64-bit value
- Load Value Prediction Table (LVPT): analogous to a branch target buffer
- Load Classification Table (LCT): analogous to a branch history table
- Constant Verification Unit (CVU): avoids accessing memory and keeps LVPT entries coherent with main memory
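The three structures can be sketched together as follows. This is a loose illustration under several assumptions that the slides do not state: the LCT is a 2-bit saturating counter (0 and 1 mean do not predict, 2 means predict, 3 means constant), the CVU remembers one verified address per LVPT entry, and any store to that address invalidates it. The class name, method names, and table size are hypothetical.

```python
class LoadValuePredictionUnit:
    def __init__(self, entries=1024):
        self.entries = entries
        self.lvpt = [None] * entries  # last loaded value per entry
        self.lct = [0] * entries      # assumed 2-bit classification counter
        self.cvu = {}                 # LVPT index -> address verified as coherent

    def _index(self, pc):
        return (pc >> 2) % self.entries  # drop byte offset, then direct-map

    def predict(self, pc, address):
        """Return (predicted value or None, may_skip_memory_access)."""
        i = self._index(pc)
        if self.lvpt[i] is None or self.lct[i] < 2:
            return None, False
        is_constant = self.lct[i] == 3 and self.cvu.get(i) == address
        return self.lvpt[i], is_constant

    def update(self, pc, address, actual):
        """Train with the value actually loaded from memory."""
        i = self._index(pc)
        if self.lvpt[i] == actual:
            self.lct[i] = min(3, self.lct[i] + 1)
            if self.lct[i] == 3:
                self.cvu[i] = address  # entry now matches memory at this address
        else:
            self.lct[i] = max(0, self.lct[i] - 1)
            self.cvu.pop(i, None)      # entry changed, so drop its verification
        self.lvpt[i] = actual

    def store(self, address):
        """A store may change memory, so matching CVU entries are invalidated."""
        self.cvu = {i: a for i, a in self.cvu.items() if a != address}
```

In this sketch a load counts as constant only while its CVU entry survives, so a store to the same address forces the next load back to a normal predicted-and-verified access.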

Load Value Prediction Unit

Results
- PowerPC 620: avg. 3% (max 21%) performance gain
- Alpha: avg. 6% (max 17%) performance gain

Selective Value Prediction
- Applies not only to load instructions but to all important instructions
- Speculate on operations with large gains and small losses even when confidence is low, and on operations with small gains and large losses only when confidence is high
- Choose intelligently:
  - when to use value prediction
  - which instructions to use value prediction on

Confidence Prediction
- A predicted value is used only if the confidence associated with that value is above a given threshold.
- Confidence Saturating Counter: (low, med, high)
- Confidence History Counter: similar to local branch history
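A confidence saturating counter can gate a last-value predictor as sketched below. The three levels follow the slide, but the update rule (up by one on a match, down by one on a mismatch) and the choice to use a prediction once confidence reaches the threshold are assumptions. The names `ConfidenceGatedPredictor`, `LOW`, `MED`, and `HIGH` are illustrative.

```python
LOW, MED, HIGH = 0, 1, 2  # the three confidence levels named on the slide


class ConfidenceGatedPredictor:
    def __init__(self, threshold=HIGH):
        self.threshold = threshold
        self.value = {}       # pc -> last value seen
        self.confidence = {}  # pc -> saturating counter in [LOW, HIGH]

    def predict(self, pc):
        """Return a value only when confidence has reached the threshold."""
        if pc in self.value and self.confidence[pc] >= self.threshold:
            return self.value[pc]
        return None

    def update(self, pc, actual):
        if pc in self.value and self.value[pc] == actual:
            # Assumed rule: a match raises confidence, saturating at HIGH.
            self.confidence[pc] = min(HIGH, self.confidence[pc] + 1)
        else:
            # Assumed rule: a mismatch or first sighting lowers it, floor at LOW.
            self.confidence[pc] = max(LOW, self.confidence.get(pc, LOW) - 1)
        self.value[pc] = actual
```

With the default threshold the predictor stays silent until the same value has been seen three times in a row, and a single mismatch silences it again. A lower threshold trades accuracy for coverage.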

Minimizing Capacity Misses
- Goal: prevent unnecessary replacement in the value prediction table (VPT)
- Replacement Counter: incremented on a correct prediction, decremented otherwise; also decremented when another instruction attempts to use that entry
- Warm-up Counter: incremented each time an instruction hits in the value table; reset to 0 on a replacement
- Only after the warm-up counter has reached a certain threshold are predictions made, the confidence counter updated, and the VPT allowed to be modified
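The interaction of the two counters can be sketched for a single VPT entry. The slide leaves several details open, so this is one possible reading: an entry is replaced only when a conflicting instruction finds the replacement counter at zero, and the stored value is trained even before warm-up completes. The class name, the counter limit, and the warm-up threshold are hypothetical.

```python
class VPTEntry:
    def __init__(self, tag, replace_max=3, warmup_threshold=2):
        self.tag = tag                # pc of the instruction owning the entry
        self.value = None             # last value produced by that instruction
        self.replace_ctr = 0          # resistance to replacement
        self.warmup = 0               # hits since the entry was (re)allocated
        self.replace_max = replace_max
        self.warmup_threshold = warmup_threshold

    def access(self, pc):
        """Look up the entry; return a predicted value or None."""
        if pc != self.tag:
            if self.replace_ctr > 0:
                self.replace_ctr -= 1  # another instruction wants this entry
                return None
            # Assumed policy: replace only when the counter is exhausted.
            self.tag, self.value, self.warmup = pc, None, 0
            return None
        self.warmup = min(self.warmup + 1, self.warmup_threshold)
        if self.warmup >= self.warmup_threshold and self.value is not None:
            return self.value
        return None

    def resolve(self, pc, actual):
        """Train the entry with the real result of its owning instruction."""
        if pc != self.tag:
            return
        if self.value == actual:
            self.replace_ctr = min(self.replace_max, self.replace_ctr + 1)
        else:
            self.replace_ctr = max(0, self.replace_ctr - 1)
        self.value = actual
```

An entry that keeps predicting correctly builds up resistance, so a conflicting instruction has to miss on it several times before taking it over, and the new owner must then warm up before it is predicted.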

Filtering Producers of Predicted Values
- To reduce pressure on the prediction table, predict fewer instructions
- Only allow entries to be allocated to instructions that define registers which are actually used by another instruction in the current instruction window
- Allow only instructions that are on the critical path to be inserted into the VPT
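The first filter, allocating entries only to producers whose result is consumed within the current window, can be sketched as a scan over the window. The window representation, a list of `(pc, dest_reg, src_regs)` tuples in program order, and the function name `producers_with_consumers` are assumptions for illustration. Critical-path filtering is not modeled.

```python
def producers_with_consumers(window):
    """Return the pcs of instructions whose destination register is read
    by a later instruction in the window before being overwritten."""
    selected = []
    for i, (pc, dest, _) in enumerate(window):
        if dest is None:  # e.g. stores and branches define no register
            continue
        for _, later_dest, later_srcs in window[i + 1:]:
            if dest in later_srcs:
                selected.append(pc)  # value is consumed inside the window
                break
            if later_dest == dest:
                break                # overwritten before any use
    return selected


window = [
    (0, "r1", ()),
    (4, "r2", ("r1",)),
    (8, "r3", ()),        # r3 is overwritten at pc 12 before being read
    (12, "r3", ("r2",)),
    (16, None, ("r3",)),
]
eligible = producers_with_consumers(window)
```

Only the producers at pcs 0, 4, and 12 would be allowed a table entry here. The instruction at pc 8 is filtered out because nothing in the window reads its result.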


Finding the Important Consumers
- Filter which instructions use a predicted value
- Use confidence bits (low, med, high)
- Path Heuristic: use predicted values for instructions that have low confidence but are on the longest path