Data value prediction, by Bas van der Tol. Limits to ILP: instruction-level parallelism is limited by control flow and by data flow (true dependencies).

Presentation transcript:

Data value prediction Bas van der Tol

Limits to ILP: instruction-level parallelism is limited by control flow and by data flow (true dependencies)

Types of Speculative Execution

Sources of predictable data: data redundancy, error checking, program constants, virtual function calls, glue code, call-subgraph identities, register spill code

Register value locality

Value Prediction Unit

VPT Hit rate sensitivity to Size

Example use of Value Prediction

Penalties: misprediction penalty; structural hazards, on both correct predictions and mispredictions

Configurations used for experiments

PowerPC 620 Speedup

PowerPC 620+ Speedup

Infinite Machine Model Speedup

Data cache vs. Value Prediction

Improving Prediction Accuracy: last value prediction, stride prediction, finite context method (fcm) predictors
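
The first two predictor types named on this slide can be sketched in a few lines. This is a minimal illustration, not the presenter's implementation: the class names, the `accuracy` helper, and the unbounded dictionary keyed by instruction address (`pc`) are assumptions made for readability, whereas a hardware table would be finite and indexed by a few address bits.

```python
class LastValuePredictor:
    """Predicts that an instruction produces the same value as last time."""

    def __init__(self):
        self.table = {}  # pc -> last value seen (assumed unbounded table)

    def predict(self, pc):
        return self.table.get(pc)  # None means "no prediction yet"

    def update(self, pc, value):
        self.table[pc] = value


class StridePredictor:
    """Predicts the last value plus the difference of the last two values."""

    def __init__(self):
        self.table = {}  # pc -> (last value, stride)

    def predict(self, pc):
        if pc not in self.table:
            return None
        last, stride = self.table[pc]
        return last + stride

    def update(self, pc, value):
        if pc in self.table:
            last, _ = self.table[pc]
            self.table[pc] = (value, value - last)
        else:
            self.table[pc] = (value, 0)  # first sighting: assume stride 0


def accuracy(predictor, pc, values):
    """Fraction of values predicted correctly when fed in program order."""
    hits = 0
    for v in values:
        if predictor.predict(pc) == v:
            hits += 1
        predictor.update(pc, v)
    return hits / len(values)


if __name__ == "__main__":
    pc = 0x400  # hypothetical instruction address
    constant = [7] * 10
    counting = list(range(0, 40, 4))  # 0, 4, 8, ..., 36
    print(accuracy(LastValuePredictor(), pc, constant))  # 0.9
    print(accuracy(StridePredictor(), pc, constant))     # 0.9
    print(accuracy(LastValuePredictor(), pc, counting))  # 0.0
    print(accuracy(StridePredictor(), pc, counting))     # 0.8
```

On a constant stream both predictors miss only the first value. On a stream that grows by a fixed step, the last-value predictor never hits, while the stride predictor hits once it has seen two values and learned the step.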

Finite Context Models
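
As a rough sketch of the idea behind a finite context model: an order-k predictor uses the k most recent values as a context and predicts the value that most often followed that context in the past. The code below assumes a single value stream and unbounded software tables; `FCMPredictor` and `count_hits` are illustrative names, not taken from the slides.

```python
from collections import Counter, defaultdict, deque


class FCMPredictor:
    """Order-k finite context method predictor for one value stream."""

    def __init__(self, order=2):
        self.order = order
        self.history = deque(maxlen=order)   # the last `order` values
        self.counts = defaultdict(Counter)   # context -> follower frequencies

    def predict(self):
        if len(self.history) < self.order:
            return None                      # not enough history yet
        seen = self.counts.get(tuple(self.history))
        if not seen:
            return None                      # context never observed
        return seen.most_common(1)[0][0]     # most frequent follower

    def update(self, value):
        # Record which value followed the current context, then shift history.
        if len(self.history) == self.order:
            self.counts[tuple(self.history)][value] += 1
        self.history.append(value)


def count_hits(predictor, values):
    """Number of correct predictions when values arrive in order."""
    hits = 0
    for v in values:
        if predictor.predict() == v:
            hits += 1
        predictor.update(v)
    return hits


if __name__ == "__main__":
    pattern = [1, 2, 3] * 4  # repeating, non-constant sequence
    print(count_hits(FCMPredictor(order=2), pattern))  # 7
```

The first five values are spent filling the history and observing each two-value context once; after that, every value in the repeating pattern is predicted correctly, giving 7 hits out of 12.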

Prediction Success

Conclusions: data value prediction increases performance by 5% on the PowerPC 620; a performance gain of 23% is possible. Future developments: more parallel execution units; better prediction models.