ECE 721 Alternatives to ROB-based Retirement

ECE 721 Alternatives to ROB-based Retirement Spring 2019 Prof. Eric Rotenberg

ROB-based Retirement
(ROB = Reorder Buffer = Active List)

Benefit
- Can roll back to any offending instruction (exception or mispredicted branch)
- Don’t lose any work prior to the offending instruction

At what price?
- The values of all instructions in the instruction window must be retained so that state can be rolled back to any instruction
- Since these values are held in the Physical Register File (PRF), the instruction window size (scope for parallelism) is limited by the size of the PRF
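The PRF limit described above can be illustrated with a toy renamer. This is only a sketch under simplifying assumptions: every instruction writes one register, nothing retires while the window fills, and the names (RobRenamer, max_window) are hypothetical rather than taken from the slides.

```python
from collections import deque

class RobRenamer:
    """Toy ROB-based renamer (illustrative names, not from the slides)."""

    def __init__(self, num_arch, num_phys):
        self.rmt = list(range(num_arch))               # rename map: arch -> phys
        self.free_list = deque(range(num_arch, num_phys))
        self.rob = deque()                             # (dest, new phys, previous phys)

    def rename(self, dest):
        """Rename one instruction writing arch register `dest`.
        Returns False when no physical register is free (rename stalls)."""
        if not self.free_list:
            return False
        new = self.free_list.popleft()
        prev = self.rmt[dest]
        self.rmt[dest] = new
        self.rob.append((dest, new, prev))
        return True

    def retire(self):
        """Retire the oldest instruction. Only now is the previous mapping
        freed, because until retirement a rollback might still need it."""
        _dest, _new, prev = self.rob.popleft()
        self.free_list.append(prev)

def max_window(num_arch, num_phys):
    """Count how many instructions can be renamed before the PRF runs out,
    assuming (pessimistically) that nothing retires in the meantime."""
    r = RobRenamer(num_arch, num_phys)
    n = 0
    while r.rename(n % num_arch):
        n += 1
    return n
```

In this model, max_window(32, 128) is 96: the window cannot hold more register-writing instructions than the PRF has registers beyond the architectural ones.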

Checkpoint-based Retirement

Idea: Relax requirement that state can be rolled back to any instruction
- Create periodic checkpoints
- A checkpoint “pins” physical registers corresponding to the precise architectural state at that point in the dynamic instruction stream
- Physical registers of intermediate values (not pinned in any checkpoint) are aggressively freed
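One way to picture pinning and aggressive freeing is with per-register reference counts. The slides do not say how freeing is implemented, so the scheme below is an assumption made for illustration: a register holds one reference while it is mapped in the Rename Map Table and one more for each checkpoint that pins it. The sketch also assumes consumers have already read a value by the time it is overwritten; a real design would have to track pending readers as well. All names are hypothetical.

```python
class CheckpointRenamer:
    """Toy checkpoint-based renamer using reference counts (an assumed scheme)."""

    def __init__(self, num_arch, num_phys):
        self.rmt = list(range(num_arch))      # rename map: arch -> phys
        self.refs = [0] * num_phys
        for p in self.rmt:
            self.refs[p] = 1                  # one reference for being mapped
        self.free_list = list(range(num_arch, num_phys))
        self.checkpoints = []                 # oldest first; each is an RMT copy

    def _release(self, p):
        self.refs[p] -= 1
        if self.refs[p] == 0:
            self.free_list.append(p)

    def take_checkpoint(self):
        """Pin every register currently holding architectural state."""
        snap = list(self.rmt)
        for p in snap:
            self.refs[p] += 1
        self.checkpoints.append(snap)

    def release_oldest_checkpoint(self):
        """Unpin the oldest checkpoint once it is no longer needed."""
        for p in self.checkpoints.pop(0):
            self._release(p)

    def rename(self, dest):
        """Returns False when no physical register is free."""
        if not self.free_list:
            return False
        new = self.free_list.pop()
        self.refs[new] = 1
        old = self.rmt[dest]
        self.rmt[dest] = new
        self._release(old)                    # freed at once unless pinned
        return True

def renames_before_stall(num_arch, num_phys, limit):
    """Take one checkpoint, then rename until a stall or `limit` instructions."""
    r = CheckpointRenamer(num_arch, num_phys)
    r.take_checkpoint()
    n = 0
    while n < limit and r.rename(n % num_arch):
        n += 1
    return n
```

With 4 architectural and 12 physical registers, renames_before_stall(4, 12, 1000) reaches the limit of 1000 without stalling, because each overwritten intermediate value is recycled immediately while the four pinned registers stay allocated.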

Checkpoint-based Retirement (cont.)

Benefit
- Do not need to buffer results of all instructions in the instruction window
- Logically, can have a very large instruction window (scope for exposing parallelism) with a disproportionately smaller physical register file
- Example:
  Virtual instruction window: 1000s of instructions
  Physical Register File: few 100s of registers
- Summary: Decouple perceived size of instruction window from size of PRF. Extract high ILP with efficient resources.

At what price?
- Can’t roll back to any offending instruction (exception or mispredicted branch)
- May lose work prior to the offending instruction
- Must judiciously select checkpoint placement
  Exceptions are rare anyway
  Confidence estimation for branches
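The cost of coarse-grain recovery, and why checkpoint placement matters, can be sketched in a few lines. The model is an assumption for illustration: a checkpoint at index c captures the state just before instruction c, recovery restarts from the nearest checkpoint at or before the offending instruction, and a branch gets a checkpoint when a hypothetical confidence estimate falls below a threshold. The function names and the confidence values are made up for the example.

```python
from bisect import bisect_right

def place_checkpoints(branches, threshold):
    """branches: list of (instruction index, confidence in [0, 1]), sorted by index.
    Always checkpoint at index 0, plus at every low-confidence branch."""
    return [0] + [idx for idx, conf in branches if conf < threshold and idx != 0]

def rollback_point(checkpoints, offender):
    """Nearest checkpoint at or before the offending instruction."""
    i = bisect_right(checkpoints, offender)
    if i == 0:
        raise ValueError("no checkpoint precedes the offending instruction")
    return checkpoints[i - 1]

def lost_work(checkpoints, offender):
    """Number of instructions before the offender that must be redone."""
    return offender - rollback_point(checkpoints, offender)

# Hypothetical branch stream: (index, confidence).
branches = [(10, 0.99), (25, 0.60), (40, 0.95)]
checkpoints = place_checkpoints(branches, threshold=0.9)   # [0, 25]
```

Here a misprediction of the low-confidence branch at index 25 loses no prior work, since a checkpoint sits exactly there, while a misprediction of the high-confidence branch at index 40 rolls back to 25 and redoes 15 instructions.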

Aside: ROB + Checkpoints

But didn’t we use checkpoints with ROB?
- Shadow Map Tables are checkpoints of Rename Map Table

However:
- Shadow Map Tables aren’t involved in the algorithm that determines when physical registers can be freed
- The ROB controls freeing of physical registers, and ensures the integrity of physical registers pointed to by a Shadow Map Table
- The only purpose of the Shadow Map Table is to accelerate restoring the Rename Map Table, providing a 1-cycle alternative to forward-walking or backward-walking the ROB
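The equivalence between a Shadow Map Table restore and a backward walk of the ROB can be checked on a small example. This is a sketch with assumed data structures: each ROB entry records (destination, new mapping, previous mapping), a branch is stored as an entry with no destination, and the helper names are hypothetical.

```python
def rename(rmt, rob, free_list, dest):
    """Rename one instruction writing `dest`, logging the previous mapping."""
    new = free_list.pop(0)
    rob.append((dest, new, rmt[dest]))
    rmt[dest] = new

def backward_walk(rmt, rob, branch_pos):
    """Undo every rename younger than the branch, youngest first.
    Takes one step per squashed instruction."""
    restored = list(rmt)
    for dest, _new, prev in reversed(rob[branch_pos + 1:]):
        if dest is not None:
            restored[dest] = prev
    return restored

rmt = [0, 1, 2, 3]
free_list = list(range(4, 12))
rob = []

rename(rmt, rob, free_list, 1)        # older than the branch
rob.append((None, None, None))        # the branch itself (no destination)
branch_pos = len(rob) - 1
shadow = list(rmt)                    # Shadow Map Table taken at the branch

rename(rmt, rob, free_list, 2)        # wrong-path instructions
rename(rmt, rob, free_list, 1)
rename(rmt, rob, free_list, 2)

walked = backward_walk(rmt, rob, branch_pos)
```

Both paths arrive at the same map: walked equals shadow, which is [0, 4, 2, 3]. The walk needed three undo steps, whereas the shadow copy is restored in a single step, matching its role as a pure accelerator that plays no part in deciding when registers are freed.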

Summary

ROB-based processor
- Fine-grain recovery (roll back to any instruction)
- Instruction window size = PRF size

Checkpoint-based processor
- Coarse-grain recovery (roll back to selected points)
- Instruction window size >> PRF size