
EECS 470 Register Renaming Lecture 8 Coverage: Chapter 3.


1 EECS 470 Register Renaming Lecture 8 Coverage: Chapter 3

2 Reorder Buffer Recap
[Figure: pipeline stages IF, ID, Alloc, Sched, EX, MEM, CT with the ROB and ARF; ROB has Head and Tail pointers and entry fields PC, Dst regID, Dst value, Except?; stage labels: In-order, Any order]
@ Alloc
–Allocate result storage at Tail
@ Sched
–Get inputs (ROB T-to-H then ARF)
–Wait until all inputs ready
@ WB
–Write results/fault to ROB
–Indicate result is ready
@ CT
–Wait until inst @ Head is done
–If fault, initiate handler
–Else, write results to ARF
–Deallocate entry from ROB
Reorder Buffer (ROB)
–Circular queue of spec state
–May contain multiple definitions of same register

3 A “High Complexity” Reorder Buffer
[Figure: ROB with Head and Tail pointers; per-entry regID comparators and values, labeled "Serial scan!"]
@ Sched we must access the nearest-previous definition
–Requires a serial scan of ROB
–Tail (newest) to head (oldest)
–Implemented with daisy-chain
What is the latency of ROB access?
–O(N), with N ROB entries
–Due to more wire and logic
What does this mean wrt. ILP?
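The serial scan can be sketched in a few lines of Python. This is only an illustration, not the lecture's hardware design: the names `RobEntry` and `scan_for_definition` are hypothetical, the ROB is modeled as a plain list with `rob[0]` as head and `rob[-1]` as tail, and circular wraparound is ignored. The `examined` counter stands in for the O(N) daisy-chain latency.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RobEntry:
    dst_reg: int           # destination logical register ID
    value: Optional[int]   # None until the result has been written back


def scan_for_definition(rob: List[RobEntry], reg_id: int) -> Tuple[Optional[int], int]:
    """Scan from tail (newest) to head (oldest) for the nearest-previous
    definition of reg_id. Returns (ROB index or None, entries examined)."""
    examined = 0
    for idx in range(len(rob) - 1, -1, -1):
        examined += 1
        if rob[idx].dst_reg == reg_id:
            return idx, examined
    # No in-flight definition: the value would come from the ARF instead.
    return None, examined


# Hypothetical ROB contents: two in-flight definitions of register 6.
rob = [RobEntry(6, 10), RobEntry(8, None), RobEntry(6, None)]
newest_r6 = scan_for_definition(rob, 6)
only_r8 = scan_for_definition(rob, 8)
missing_r3 = scan_for_definition(rob, 3)
```

In the worst case (`missing_r3`) every entry is examined before falling back to the ARF, which is the cost that grows with the number of ROB entries.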

4 Factors that Determine t_CLK
Recall: t_CPU = N_inst * CPI * t_CLK
What defines t_CLK?
–Critical path latency (= logic + wire latency)
–Latch latency
–Clock skew
–Clock period design margins
In current and future generation designs
–Wire latency becoming dominant latency of critical path
–Due to growing side-wall capacitance
–Brings a spatial dimension to architecture optimization
E.g., How long are the wires that will connect these two devices?
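A small worked example of the t_CPU relation, as a sketch only. The numbers are hypothetical and chosen purely for illustration (they are not measurements from the lecture), and `cpu_time_ps` is an assumed helper name. It shows how a design that lowers CPI can still lose overall if it lengthens the clock period.

```python
def cpu_time_ps(n_inst: int, cpi: float, t_clk_ps: float) -> float:
    """t_CPU = N_inst * CPI * t_CLK, with t_CLK in picoseconds."""
    return n_inst * cpi * t_clk_ps


# Hypothetical baseline: CPI of 1.0 at a 500 ps clock period.
baseline = cpu_time_ps(1_000_000, 1.0, 500.0)

# Hypothetical larger ROB: more ILP (CPI 0.8), but the longer ROB access
# path stretches the clock period to 700 ps.
large_rob = cpu_time_ps(1_000_000, 0.8, 700.0)

slower = large_rob > baseline  # the CPI gain is outweighed by the slower clock
```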

5 Determining the Latency of a Wire
[Figure: wire latency under scaling; labels: scale, shrinks, grows]

6 Reducing Complexity with Register Renaming
Key observation
–The definition we want is the last one written
Register Renaming
–Implement a table (indexed by regID) that returns the ROB entry that holds the last definition of the register
–Translate the program from register identifiers to one that accesses reorder buffer entries directly
–Then, access ROB entry directly, no scanning for nearest-previous register definitions!

7 Logical vs. Physical Registers
Logical registers (aka “architected” registers)
–Register names used by programmer/compiler to identify program values
–How many do we need?
Physical registers
–Storage names implemented in the microarchitecture used to hold actual register values
–ROB entries in our microarchitecture
  Other implementations possible, e.g., P4 physical register file
  What is the advantage/disadvantage of P4’s physical register file?
–How many do we need?

8 Register Translation Example
Register renaming translates program from logical register accesses to physical storage accesses

Logical Program   --rename-->   Physical Program
r6 = r5 + r2                    p52 = p45 + p42
r8 = r6 + r3                    p53 = p52 + r3
r6 = r9 + r10                   p54 = r9 + r10
r12 = r8 + r6                   p55 = p53 + p54

Note: program semantics have not changed
–Only storage names have changed
–Storage names are unimportant to program semantics
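The translation above can be reproduced with a short sketch. The function name `rename_program` is hypothetical, and the sketch assumes what the example implies: r5 and r2 already map to p45 and p42, new physical names are handed out in order starting at p52, and a source with no mapping keeps its logical (ARF) name.

```python
from typing import Dict, List, Tuple

Inst = Tuple[str, str, str]  # (dst, src1, src2)


def rename_program(program: List[Inst], initial_map: Dict[str, str],
                   first_free: int) -> List[Inst]:
    """Rewrite logical register names to physical storage names."""
    table = dict(initial_map)   # logical name -> physical name of last definition
    next_phys = first_free
    renamed = []
    for dst, src1, src2 in program:
        # Sources read the latest mapping; unmapped sources stay in the ARF.
        p1 = table.get(src1, src1)
        p2 = table.get(src2, src2)
        # The destination gets a fresh physical name, recorded as the
        # newest definition of dst.
        pd = f"p{next_phys}"
        next_phys += 1
        table[dst] = pd
        renamed.append((pd, p1, p2))
    return renamed


logical = [("r6", "r5", "r2"), ("r8", "r6", "r3"),
           ("r6", "r9", "r10"), ("r12", "r8", "r6")]
physical = rename_program(logical, {"r5": "p45", "r2": "p42"}, first_free=52)
for dst, a, b in physical:
    print(f"{dst} = {a} + {b}")
```

Note how the second definition of r6 gets its own name (p54), so the final instruction reads p54 while the second instruction still reads p52.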

9 Pipeline with Register Renaming
[Figure: pipeline stages IF, ID, Alloc, REN, Sched, EX, MEM, CT with the ROB and ARF; Rename Table with columns regID, robIDX, v; stage labels: In-order, Any order]
@ REN
–Index table with source operand regID to locate ROB/ARF entry
@ Sched
–Get inputs from ROB/ARF entry specified by REN
–Wait until all inputs ready
@ CT
–Wait until inst @ Head is done
–If fault, initiate handler
–Else, write results to ROB/ARF entry specified by REN
–Deallocate entry from ROB
–Invalidate rename table entry @ dest regID iff the entry still points to ROB entry being deallocated
Rename Table
–Indexed with regID
–Returns (valid, robIDX)
–If valid, ROB does/will contain value of register
–If invalid, ARF holds value (no instruction in flight defines this register)
Why?
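A minimal sketch of the rename table and its commit-time rule follows. The class and method names (`RenameTable`, `define`, `lookup`, `commit`) are hypothetical, not taken from the lecture. One way to read the iff condition: if a younger instruction has already redefined the register, the table points at that younger ROB entry, and the mapping has to survive the older instruction's commit.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RenameEntry:
    valid: bool = False   # False: the ARF holds the register's value
    rob_idx: int = 0      # meaningful only when valid is True


class RenameTable:
    def __init__(self, num_regs: int) -> None:
        self.entries: List[RenameEntry] = [RenameEntry() for _ in range(num_regs)]

    def lookup(self, reg_id: int) -> Tuple[bool, int]:
        """@ REN: return (valid, robIDX) for a source operand."""
        e = self.entries[reg_id]
        return e.valid, e.rob_idx

    def define(self, reg_id: int, rob_idx: int) -> None:
        """Record rob_idx as the newest definition of reg_id."""
        self.entries[reg_id] = RenameEntry(True, rob_idx)

    def commit(self, reg_id: int, rob_idx: int) -> None:
        """@ CT: invalidate only if the entry still points at the ROB
        entry being deallocated."""
        e = self.entries[reg_id]
        if e.valid and e.rob_idx == rob_idx:
            e.valid = False


table = RenameTable(16)
table.define(6, 52)               # older definition of r6
table.define(6, 54)               # younger definition of r6
table.commit(6, 52)               # older one commits; mapping must survive
after_old_commit = table.lookup(6)
table.commit(6, 54)               # last definition commits; ARF now holds r6
after_new_commit = table.lookup(6)
```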

10 Register Renaming Example

Before renaming:
Logical Program      Physical Program
r6 = r5 + r2
r8 = r6 + r3
r6 = r9 + r10
r12 = r8 + r6
Rename table (regIDs 1-12), valid entries: r2 -> p42, r5 -> p45

After renaming the first instruction:
Logical Program      Physical Program
r6 = r5 + r2         p52 = p45 + p42
r8 = r6 + r3
r6 = r9 + r10
r12 = r8 + r6
Rename table (regIDs 1-12), valid entries: r2 -> p42, r5 -> p45, r6 -> p52

11 Register Renaming Example

After renaming the second instruction:
Logical Program      Physical Program
r6 = r5 + r2         p52 = p45 + p42
r8 = r6 + r3         p53 = p52 + r3
r6 = r9 + r10
r12 = r8 + r6
Rename table (regIDs 1-12), valid entries: r2 -> p42, r5 -> p45, r6 -> p52, r8 -> p53

After renaming the third instruction:
Logical Program      Physical Program
r6 = r5 + r2         p52 = p45 + p42
r8 = r6 + r3         p53 = p52 + r3
r6 = r9 + r10        p54 = r9 + r10
r12 = r8 + r6
Rename table (regIDs 1-12), valid entries: r2 -> p42, r5 -> p45, r6 -> p54, r8 -> p53

12 Register Renaming Example

After renaming the fourth instruction:
Logical Program      Physical Program
r6 = r5 + r2         p52 = p45 + p42
r8 = r6 + r3         p53 = p52 + r3
r6 = r9 + r10        p54 = r9 + r10
r12 = r8 + r6        p55 = p53 + p54
Rename table (regIDs 1-12), valid entries: r2 -> p42, r5 -> p45, r6 -> p54, r8 -> p53, r12 -> p55

13 Cross-cutting Issue: Misspeculation
What are the impacts of misspeculation or exceptions?
–When instructions are flushed from the pipeline, rename mappings must be restored to point-of-restart
–Otherwise, new instructions will see stale definitions
Two recovery approaches
–Simple/slow
  1. Wait until the faulting/mispredicting instruction reaches retirement
  2. Flush ALL speculative register definitions by clearing all rename table valid bits
–Complex/fast
  1. Checkpoint ENTIRE rename table anywhere recovery may be needed
  2. As soon as misspeculation is detected, recover the table associated with the PC
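Both recovery approaches can be sketched on a toy rename table. This is a simplified illustration under stated assumptions: `RecoverableRenameTable` and its methods are hypothetical names, an invalid entry is modeled as `None`, checkpoints are keyed by PC, and housekeeping such as discarding checkpoints of squashed younger instructions is left out.

```python
from typing import Dict, List, Optional


class RecoverableRenameTable:
    def __init__(self, num_regs: int) -> None:
        # None models a cleared valid bit (value lives in the ARF).
        self.map: List[Optional[int]] = [None] * num_regs
        self.checkpoints: Dict[int, List[Optional[int]]] = {}

    def define(self, reg_id: int, rob_idx: int) -> None:
        self.map[reg_id] = rob_idx

    def flush_all(self) -> None:
        """Simple/slow: once the offending instruction reaches retirement,
        clear every valid bit so all reads fall back to the ARF."""
        self.map = [None] * len(self.map)

    def checkpoint(self, pc: int) -> None:
        """Complex/fast, step 1: copy the ENTIRE table at a recovery point."""
        self.checkpoints[pc] = list(self.map)

    def restore(self, pc: int) -> None:
        """Complex/fast, step 2: recover the table saved for this PC."""
        self.map = list(self.checkpoints[pc])


t = RecoverableRenameTable(16)
t.define(6, 52)
t.checkpoint(0x40)        # hypothetical PC of a predicted branch
t.define(6, 54)           # speculative definitions past the branch
t.define(8, 53)
t.restore(0x40)           # misprediction detected: roll the table back
restored = (t.map[6], t.map[8])
t.flush_all()
flushed = all(entry is None for entry in t.map)
```

The sketch makes the cost difference visible: `checkpoint` copies the whole table at every recovery point, while `flush_all` needs no storage but can only run once the offending instruction retires.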

14 Discussion Points
What are the trade-offs between rename table flush recovery and checkpointing?
What if another instruction (being renamed) needs to access a physical storage entry after it has been overwritten?
Can I rename memory?

