Prophet/Critic Hybrid Branch Prediction
Falcon, Stark, Ramirez, Lai, Valero
Presenter: Christian Wanamaker

Outline
• Overview & Motivation
• Hybrid Branch Prediction
• The Prophet/Critic Branch Predictor
• Results
• Conclusions

Overview
• Better branch prediction is highly desirable because it does not require trade-offs between performance, power, and energy
• Despite much research on branch prediction, it is by no means solved
• Branch prediction is liable to become even more important as pipelines deepen and issue widths increase

Overview (continued)
• Issue width = number of uOps (micro-ops) issued per clock cycle
• The Branch Target Buffer is used to look ahead for branches
• Perceptrons: a simple neural network that can look at a longer history than simpler counters
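To make the perceptron bullet concrete, here is a minimal sketch of a perceptron-style branch predictor in Python. It is an illustration only, not the configuration used in the paper: the class name, table size, history length, and the training threshold (a heuristic commonly quoted for perceptron predictors) are all assumptions.

```python
class PerceptronPredictor:
    """Toy perceptron branch predictor (illustrative; all sizes are assumed)."""

    def __init__(self, history_length=16, table_size=64):
        self.history_length = history_length
        self.table_size = table_size
        # weights[i][0] is the bias weight; weights[i][1:] pair with history bits.
        self.weights = [[0] * (history_length + 1) for _ in range(table_size)]
        # Global history stored as +1 (taken) / -1 (not taken), oldest first.
        self.history = [-1] * history_length
        # Assumed training threshold; a commonly quoted heuristic.
        self.threshold = int(1.93 * history_length + 14)

    def _output(self, pc):
        w = self.weights[pc % self.table_size]
        return w[0] + sum(wi * hi for wi, hi in zip(w[1:], self.history))

    def predict(self, pc):
        # A non-negative dot product means "predict taken".
        return self._output(pc) >= 0

    def update(self, pc, taken):
        y = self._output(pc)
        t = 1 if taken else -1
        w = self.weights[pc % self.table_size]
        # Train on a misprediction or when the output is not yet confident.
        if (y >= 0) != taken or abs(y) <= self.threshold:
            w[0] += t
            for i, hi in enumerate(self.history):
                w[i + 1] += t * hi
        # Shift the actual outcome into the global history.
        self.history = self.history[1:] + [t]
```

Because each history bit has its own weight, storage grows linearly with history length, which is why a perceptron can afford a longer history than a table of counters indexed by the history pattern.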

Hybrid Branch Predictors
• Use two or more different branch prediction techniques
• One may override another (either chosen by a third selector, or one always overrides)
• The predictions may be combined, for instance by majority vote
• Example: tournament predictors (often a branch prediction buffer and a correlating branch predictor, with a third predictor choosing which of the two is used in a given situation)
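The two combination styles above can be sketched in a few lines of Python. This is a toy model under assumed parameters: the names CounterTable, TournamentPredictor, and majority_vote, the table sizes, and the choice of a bimodal table plus a gshare-style table as the two components are illustrative, not taken from the slides.

```python
class CounterTable:
    """Table of 2-bit saturating counters (0..3); a value >= 2 means 'yes/taken'."""

    def __init__(self, size=1024):
        self.size = size
        self.counters = [1] * size  # start weakly 'no'

    def predict(self, index):
        return self.counters[index % self.size] >= 2

    def update(self, index, outcome):
        i = index % self.size
        if outcome:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


class TournamentPredictor:
    """Two component predictors plus a chooser that selects between them."""

    def __init__(self, history_bits=8, size=1024):
        self.bimodal = CounterTable(size)   # indexed by branch address only
        self.gshare = CounterTable(size)    # indexed by address XOR global history
        self.chooser = CounterTable(size)   # 'yes' means trust the gshare component
        self.history = 0
        self.mask = (1 << history_bits) - 1

    def predict(self, pc):
        p_bimodal = self.bimodal.predict(pc)
        p_gshare = self.gshare.predict(pc ^ self.history)
        return p_gshare if self.chooser.predict(pc) else p_bimodal

    def update(self, pc, taken):
        p_bimodal = self.bimodal.predict(pc)
        p_gshare = self.gshare.predict(pc ^ self.history)
        # Train the chooser only when the components disagree.
        if p_bimodal != p_gshare:
            self.chooser.update(pc, p_gshare == taken)
        self.bimodal.update(pc, taken)
        self.gshare.update(pc ^ self.history, taken)
        self.history = ((self.history << 1) | int(taken)) & self.mask


def majority_vote(predictions):
    """Combine an odd number of taken/not-taken predictions by majority."""
    return sum(predictions) * 2 > len(predictions)
```

On a strictly alternating branch, the bimodal counter keeps flipping while the history-indexed table learns the pattern, so the chooser drifts toward the history-based component for that branch.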

Prophet/Critic Branch Predictor
• Basic idea: the Prophet makes a series of predictions for future branches; the Critic critiques them and, if necessary, alters them.
• The Prophet makes predictions based on the history of the branch.
• The Critic looks at the Prophet's branch predictions after the Prophet has predicted a certain number of steps ahead, then critiques the prophecy.

Prophet Critic Basics
• The BTB looks ahead; the Prophet predicts whether the branches found by the BTB are taken or not taken
• Branch Outcome Register: holds the predictions that are in the Critic's "future"; the number of future bits sets how many are allowed
• More future bits allow for more accurate viewing of the future

Prophet Critic Basics
• The branch predictions are kept in the Fetch Target Queue (FTQ)
• Once the future bits are received, the Critic makes its pronouncement
• The Critic overrides the Prophet if it comes to a different conclusion
• If so, the FTQ is purged of un-critiqued predictions, and the Prophet is redirected to the path shown by the Critic
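The queueing and override steps can be sketched as follows. This is a simplified software model, not the hardware described in the paper: FetchTargetQueue, bor_value, the critic callable's signature, and the choice to count the Prophet's prediction for the critiqued branch itself as the first future bit are all assumptions made for illustration.

```python
from collections import deque


def bor_value(history, future_predictions):
    """Pack outcome-history bits followed by future-prediction bits into one int."""
    value = 0
    for bit in list(history) + list(future_predictions):
        value = (value << 1) | int(bit)
    return value


class FetchTargetQueue:
    """Holds Prophet predictions that the Critic has not yet critiqued."""

    def __init__(self, future_bits):
        self.future_bits = future_bits
        self.pending = deque()  # (pc, prophet_prediction), oldest first

    def push(self, pc, prophet_prediction):
        self.pending.append((pc, prophet_prediction))

    def critique_ready(self):
        # The oldest entry and the entries behind it supply the future bits.
        return len(self.pending) >= self.future_bits

    def critique_oldest(self, critic, history):
        """Critique the oldest pending prediction.

        critic is any callable (pc, bor, prophet_prediction) -> bool.
        Returns (pc, final_prediction, overridden).
        """
        pc, prophet_prediction = self.pending[0]
        future = [pred for _, pred in list(self.pending)[: self.future_bits]]
        final = critic(pc, bor_value(history, future), prophet_prediction)
        overridden = final != prophet_prediction
        if overridden:
            # Younger predictions were made down the rejected path: purge them.
            # The caller would then redirect the Prophet to the Critic's path.
            self.pending.clear()
        else:
            self.pending.popleft()
        return pc, final, overridden
```

A caller would push one entry per predicted branch, check critique_ready(), and on an override restart the Prophet at the target implied by the Critic's decision.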

Prophet/Critic Architecture (cont)

Prophet/Critic: How it works

Prophet Critic: How it works
• The Prophet will mispredict A
• The first time the misprediction occurs, the Critic will note that the Prophet mispredicted in this case.
• In the future, when the Critic sees the misprediction, it will correct it
• More future bits increase the accuracy of prediction but reduce the history, so there is an important tradeoff here.

Prophet/Critic Filtering
• The Critic can be limited by multiple branches contending for the same resources
• In addition, the Critic is not always correct
• So, easy-to-predict branches should be filtered out
• This is achieved with tags that are set when a mispredict occurs
• If not tagged, the Critic's critique is ignored.
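A minimal sketch of the tag-based filter, assuming a direct-mapped table with full (pc, BOR) tags and 2-bit counters. FilteredCritic and its layout are hypothetical; the paper's actual table organization may differ.

```python
class FilteredCritic:
    """Critic whose entries are allocated only after a Prophet mispredict."""

    def __init__(self, size=1024):
        self.size = size
        self.tags = [None] * size    # None means the slot is untagged
        self.counters = [0] * size   # 2-bit counters; >= 2 means 'taken'

    def _slot(self, pc, bor):
        return (pc ^ bor) % self.size

    def critique(self, pc, bor, prophet_prediction):
        slot = self._slot(pc, bor)
        if self.tags[slot] != (pc, bor):
            # Untagged: the Critic has no say, so the Prophet's prediction stands.
            return prophet_prediction
        return self.counters[slot] >= 2

    def update(self, pc, bor, prophet_prediction, taken):
        slot = self._slot(pc, bor)
        if self.tags[slot] == (pc, bor):
            # Tagged entry: train the counter toward the actual outcome.
            if taken:
                self.counters[slot] = min(3, self.counters[slot] + 1)
            else:
                self.counters[slot] = max(0, self.counters[slot] - 1)
        elif prophet_prediction != taken:
            # Allocate (set the tag) only when the Prophet mispredicted.
            self.tags[slot] = (pc, bor)
            self.counters[slot] = 2 if taken else 1
```

Branches the Prophet always gets right never allocate an entry, so they do not compete for Critic table space with the hard-to-predict branches.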

Prophet/Critic Filtering

Testing and Results
• Testing was done on a cycle-accurate IA32 simulator with Long Instruction Traces
• The simulator had to follow bad branches, as otherwise the Critic would not learn.
• Branch predictors: gshare, 2Bc-gskew, perceptron
• uPC is uOps per cycle
• misp/kuops is mispredicts per thousand uops
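The two metrics are simple ratios. A small sketch, with hypothetical function names and made-up counts used purely to show the arithmetic:

```python
def upc(uops, cycles):
    """uOps retired per clock cycle."""
    return uops / cycles


def misp_per_kuops(mispredicts, uops):
    """Branch mispredicts per thousand uOps."""
    return mispredicts * 1000 / uops


# Hypothetical counts, not results from the paper.
example_uops = 2_000_000
example_cycles = 1_000_000
example_mispredicts = 5_000
example_upc = upc(example_uops, example_cycles)                    # 2.0
example_rate = misp_per_kuops(example_mispredicts, example_uops)   # 2.5
```

Normalizing mispredicts by uOps rather than by branches lets predictors be compared by how often the pipeline is disrupted per unit of work.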

Hardware Budgets and predictor types

Results of varying Future Bits

Results

Conclusions
• Speedup of up to 8% with 12 future bits (using the same amount of branch prediction space)
• The mispredict rate can be reduced up to %
• Adding future bits helps, but more is not always better
• Research suggests that the best number of future bits can be chosen dynamically

Thank you. Any questions?