Published by Braden Cousens. Modified over 9 years ago.
1
© A. Moshovos (ECE, Toronto) ECE1773 – Spring 2002
ILP, cont.
Maintaining Sequential Appearance
–Precise Interrupts
–RUU approach to OoO Scheduling
2
Superscalar Processors: The Big Picture
Program Form:
–Static program
–Dynamic inst. stream (trace)
–Execution window
–Completed instructions
Processing Phase:
–Fetch and CT prediction
–Dispatch / dataflow
–Inst. issue
–Inst. execution
–Inst. reorder & commit
3
A Generic Superscalar OOO Processor
[Block diagram. Components: Pre-decode, I-CACHE, buffer, Rename, Dispatch, scheduler, Reorder buffer, RF, FUs, Memory Interface]
4
Maintaining Sequential Semantics
What if execution gets interrupted at an arbitrary point?
–All insts. before that point commit
–None thereafter
We’ll focus on interrupts
Same mechanisms used today to support SPECULATIVE EXECUTION
“Definition”: An instr. executes speculatively up to complete. We don’t know yet if we should have executed this instr. Verification happens at commit (if ever).
5
Interrupts
Examples
–Power Failing, Arithmetic Overflow
–I/O Device Request, OS Call, Page Fault
–Invalid Opcode, Breakpoint, Protection Viol.
Aka Faults, Exceptions, or Traps
Requirements
–Surprise Jump (to vectored Address)
–Linking Return Address
–Saving State
–Changing State (e.g., kernel mode)
6
Classifying Interrupts
1a. Synchronous
–Function of program state
–overflow, page fault, etc.
1b. Asynchronous
–e.g., External device or malfunction
2a. User Request
–OS Call
2b. Coerced
–From OS or hardware
–page fault, protection violation
3a. User Maskable
–User can disable processing
3b. Non-Maskable
–Guess!!!
7
Classifying Interrupts, contd.
4a. Between Instructions
–Usually Asynchronous
4b. Within an Instruction
–Usually Synchronous
–Harder to deal with, why???
5a. Resume
–As if nothing happened as far as the program is concerned
5b. Catastrophic
–Say, bye bye, program is leaving us
8
Restartable Pipelines
Interrupts within an instruction are not catastrophic
Most machines support this
–Needed for virtual memory
Some machines did not support this
–Cost & Slowdown
PRECISE INTERRUPTS is key
–As if the interrupt happened at a well defined point in the original sequential order
–First let’s consider a simple DLX-style pipeline
9
Precise Interrupts
Sequential Semantics
Complete instructions before the offending instruction
Squash (effects of) instructions after
Save PC
Force trap instruction into FETCH stage
–divert execution to interrupt handler
10
Precise Interrupts
Jim Smith and Andrew Pleszkun Paper
Original work was for a “simple” pipeline
Today the same principles are used in virtually all modern microprocessors
–Support for SPECULATIVE EXECUTION
  –executing an instruction without knowing whether we should
  –more on this later
–and of course, precise interrupts
We’ll stick to precise interrupts for the time being
11
Do the Simple Thing First
Modify State only when all preceding insts. are KNOWN to be exception free.
Mechanism: Result Shift Register
–Stage = cycle
–At FETCH: Reserve all stages for the duration of the instruction
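The reservation idea on this slide can be sketched in a few lines. This is a minimal illustration, not the mechanism as specified in any paper: the class name `ResultShiftRegister`, the methods `try_issue` and `tick`, and the `"-"` placeholder are all hypothetical, and the sketch assumes an instruction of latency n reserves stage n plus every earlier stage that is still free, so that no younger, shorter instruction can write back ahead of it.

```python
class ResultShiftRegister:
    """Toy result shift register: stages[k] holds whatever writes back in k+1 cycles."""

    HOLD = "-"  # placeholder reservation (assumption of this sketch)

    def __init__(self, depth):
        self.stages = [None] * depth

    def try_issue(self, name, latency):
        """Reserve the write-back stage; return False if the instruction must stall."""
        idx = latency - 1
        if self.stages[idx] is not None:
            return False  # stage already reserved by an older instruction
        self.stages[idx] = name
        # Block every earlier free stage so younger, shorter instructions
        # cannot complete before this one.
        for k in range(idx):
            if self.stages[k] is None:
                self.stages[k] = self.HOLD
        return True

    def tick(self):
        """Advance one cycle; return the instruction writing back now, if any."""
        done = self.stages[0]
        self.stages = self.stages[1:] + [None]
        return None if done in (None, self.HOLD) else done


rsr = ResultShiftRegister(depth=6)
issued_div = rsr.try_issue("DIV", latency=4)
first_tick = rsr.tick()
issued_add = rsr.try_issue("ADD", latency=2)   # would finish before DIV: stalls
issued_mul = rsr.try_issue("MUL", latency=4)   # finishes after DIV: overlaps
writebacks = [rsr.tick() for _ in range(4)]
```

In this run the short ADD stalls behind the DIV while the equally long MUL overlaps with it, so writebacks stay in program order.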
12
Simple Solution Discussion
Essentially In-Order Completion
–Simple
  –Easy to implement
–Performance?
  –Execution overlap still possible
  –Writebacks in order
  –Amplifies latencies
  –Dependent Instructions wait longer
13
Allowing out-of-order completes
Add one more state for instruction execution:
–COMPLETE & COMMIT
COMPLETE:
–Result calculated
–Dependent instructions can use
–BUT, don’t know if preceding instructions are all OK
–I.e., don’t know if this instruction should have executed now based on the original program order
COMMIT:
–All preceding instructions executed with no problems
–Can safely commit state changes
14
OOO Complete & IO Commit
Want: Out-of-Order Completion
–Allow OOO completion
–Maintain in-order COMMIT
–Allow maximum overlap
–Guarantee precise state if needed
How does this improve performance?
[Timing diagram: In-Order Complete vs. OOO Complete over Time, with in-order commits, for the sequence below]
DIV R3, _, _
ADD R1, _, _
ADD _, R1, _
15
Reorder Buffer
Result Shift Register:
–Reserve Result Bus
–Out-of-Order Completion
Reorder Buffer
–Defer Commits and do them in-order
–Allow OOO Completes by buffering state motion
[Figure legend: res = result, v = valid, e = result NYA. Annotations: When to complete, When to commit]
16
Reorder Buffer Complications
State is kept in the reorder buffer
Have to bypass from every entry
–Need to determine latest write w/ respect to the consuming instruction
[Figure: RF and RB]
Essentially:
1. In-Order Commits
2. Buffer speculative state till commit
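The two essentials above (in-order commits, speculative state buffered until commit) plus the bypass lookup can be sketched as follows. This is an illustrative model under stated assumptions, not the slide's hardware: `ReorderBuffer`, its method names, and the `exc` flag are hypothetical, and the bypass assumes the reader is the youngest instruction, so the youngest matching entry is its latest producer.

```python
from collections import deque


class ReorderBuffer:
    """Toy reorder buffer: out-of-order completion, in-order commit."""

    def __init__(self):
        self.entries = deque()   # oldest instruction at the left
        self.regfile = {}        # architectural state, updated only at commit

    def allocate(self, dest):
        """Called in program order; returns the new entry."""
        entry = {"dest": dest, "res": None, "v": False, "exc": False}
        self.entries.append(entry)
        return entry

    def complete(self, entry, value, exc=False):
        """May be called in any order."""
        entry["res"], entry["v"], entry["exc"] = value, True, exc

    def read(self, reg):
        """Bypass: the youngest buffered write wins, else fall back to the RF."""
        for e in reversed(self.entries):
            if e["dest"] == reg:
                return e["res"] if e["v"] else None  # None: not yet available
        return self.regfile.get(reg, 0)

    def commit(self):
        """Retire completed entries from the head; squash everything on an exception."""
        committed = []
        while self.entries and self.entries[0]["v"]:
            head = self.entries[0]
            if head["exc"]:
                self.entries.clear()  # precise state: nothing younger survives
                break
            self.entries.popleft()
            self.regfile[head["dest"]] = head["res"]
            committed.append(head["dest"])
        return committed


rob = ReorderBuffer()
div = rob.allocate("R3")
add = rob.allocate("R1")
rob.complete(add, 5)             # ADD completes before the older DIV
early = rob.commit()             # nothing commits: DIV is still pending
bypassed = rob.read("R1")        # consumers can already see the ADD result
rob.complete(div, 7)
late = rob.commit()
```

The ADD result is visible to consumers through the bypass as soon as it completes, but the register file changes only when the older DIV has committed first.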
17
Speculative State Updates
Two fundamental approaches
–HISTORY BUFFER: Do changes but keep a record of old state
  –Everything OK? Just discard record of changes
–FUTURE FILE: Keep two states: Architectural and Speculative
  –On COMPLETE write state to Speculative
  –On ISSUE read from Speculative
  –On COMMIT write to Architectural
  –On Error, throw out Speculative state
18
History Buffer
Allow out-of-order register file updates
At decode record current value of target register in the HB
–notice that this is the previous value the register had
On Commit?
–Do nothing, state is fine
On Exception?
–Use History to UNDO changes made
[Figure: RF and HB, with results, source operands, destination registers, and exception paths]
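A minimal sketch of the undo mechanism described on this slide. The names `HistoryBufferMachine`, `decode`, `write_result`, and `exception` are hypothetical. The sketch assumes at most one in-flight writer per register (for example via a WAW interlock), so the value read at decode really is the previous architectural value, and it assumes the faulting instruction's own write is undone as well.

```python
class HistoryBufferMachine:
    """Toy register file with a history buffer of overwritten values."""

    def __init__(self, regs):
        self.rf = dict(regs)
        self.hb = []  # oldest first: (sequence number, dest register, old value)

    def decode(self, seq, dest):
        """In program order: remember the value this instruction will overwrite."""
        self.hb.append((seq, dest, self.rf[dest]))

    def write_result(self, dest, value):
        """Out-of-order register file update."""
        self.rf[dest] = value

    def commit(self):
        """Common case: just drop the oldest record; the RF is already correct."""
        self.hb.pop(0)

    def exception(self, faulting_seq):
        """Scan from the youngest record back, undoing the faulting inst. and all younger ones."""
        while self.hb and self.hb[-1][0] >= faulting_seq:
            _, dest, old = self.hb.pop()
            self.rf[dest] = old


m = HistoryBufferMachine({"R1": 1, "R3": 3})
m.decode(0, "R3")            # DIV R3, _, _
m.decode(1, "R1")            # ADD R1, _, _
m.write_result("R1", 10)     # ADD finishes first and updates the RF
value_before_fault = m.rf["R1"]
m.exception(faulting_seq=0)  # DIV faults: undo ADD's write
```

The backward scan in `exception` is the cost the next slide calls a slow response to interrupts; `commit` does almost nothing.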
19
History Buffer Discussion
Simple Mechanism
Additional Register File Port
Single Source for Input Operands
Normal Instruction processing not changed by much
–Control mostly unchanged
–Nothing to do on Commit for the common case
Slow response to Interrupts
–Need to scan through HB
–Complex?
20
Future File: The Optimist’s View
Two Register Files:
–One updated Out-of-Order (FUTURE)
  –assume no exception will occur
–One updated in Order (ARCHITECTURAL)
Advantage: No delay to restore state on exception
[Figure: RF, RB, and FF, with source operands and results paths]
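The two-register-file arrangement can be sketched as below. All names (`FutureFileMachine`, `dispatch`, `complete`, `commit`, `exception`) are hypothetical, and the sketch assumes one in-flight writer per register so that an older, slower instruction never overwrites a younger value in the future file. Recovery is modeled as copying the architectural file over the future file; real hardware could do this differently.

```python
from collections import deque


class FutureFileMachine:
    """Toy model: FUTURE file updated at complete, ARCHITECTURAL file at commit."""

    def __init__(self, regs):
        self.arch = dict(regs)     # precise state, updated in order
        self.future = dict(regs)   # optimistic state, source of operands
        self.rob = deque()         # in-order record of pending writes

    def dispatch(self, dest):
        entry = {"dest": dest, "value": None, "done": False}
        self.rob.append(entry)
        return entry

    def complete(self, entry, value):
        """Out of order: the result goes to the future file immediately."""
        entry["value"], entry["done"] = value, True
        self.future[entry["dest"]] = value

    def commit(self):
        """In order: move finished results from the head into the architectural file."""
        count = 0
        while self.rob and self.rob[0]["done"]:
            head = self.rob.popleft()
            self.arch[head["dest"]] = head["value"]
            count += 1
        return count

    def exception(self):
        """Architectural file is already precise; discard the speculative copy."""
        self.rob.clear()
        self.future = dict(self.arch)


ff = FutureFileMachine({"R1": 1, "R3": 3})
div = ff.dispatch("R3")
add = ff.dispatch("R1")
ff.complete(add, 10)
speculative_r1, precise_r1 = ff.future["R1"], ff.arch["R1"]
committed = ff.commit()   # 0: DIV has not completed yet
ff.exception()            # DIV faults: speculative ADD result vanishes
```

Unlike the history buffer, there is no record to scan on an exception: the architectural file never held the speculative value in the first place.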
21
How Do These Relate to Register Renaming?
Physical Registers provide sufficient storage for both speculative and architectural state
It’s the register map table that determines what is the current state
On interrupt we have to restore the map table
–Values are there in the physical register file
History and Future approaches still valid
–History: keep track of changes to register map table
  –On interrupt undo them one by one
–Future: keep two tables
  –Speculative: updated at decode
  –Architectural: updated at commit
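The "Future" variant for renaming (two map tables) might look like the sketch below. Everything here is illustrative: `RenameMaps`, the free-list policy, and the initial identity mapping are assumptions of the sketch, not details from the slide.

```python
from collections import deque


class RenameMaps:
    """Toy two-table renamer: speculative map (decode) and architectural map (commit)."""

    def __init__(self, num_arch, num_phys):
        # Assumption: architectural register r starts in physical register r.
        self.spec = {r: r for r in range(num_arch)}
        self.arch = dict(self.spec)
        self.free = list(range(num_arch, num_phys))
        self.inflight = deque()   # (arch reg, new phys reg) in program order

    def decode(self, dest):
        """Allocate a fresh physical register and update the speculative map."""
        phys = self.free.pop(0)
        self.spec[dest] = phys
        self.inflight.append((dest, phys))
        return phys

    def commit(self):
        """Oldest mapping becomes architectural; the register it replaces is freed."""
        dest, phys = self.inflight.popleft()
        self.free.append(self.arch[dest])
        self.arch[dest] = phys

    def interrupt(self):
        """Restore the map only; values are still in the physical register file."""
        while self.inflight:
            _, phys = self.inflight.pop()
            self.free.append(phys)
        self.spec = dict(self.arch)


rm = RenameMaps(num_arch=4, num_phys=8)
p_first = rm.decode(1)     # first write to r1
p_second = rm.decode(1)    # second write to r1
rm.commit()                # first write becomes architectural
rm.interrupt()             # second write is squashed
```

Recovery never copies register values; it only replaces the speculative map with the architectural one and returns the squashed physical registers to the free list.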
22
RUU
Sohi’s Paper
Common Mechanism for Precise Interrupts and OOO Execution
Register Update Unit
–A collection of Reservation stations
–Organized as a FIFO queue
–Instructions Enter In-order at FETCH
–They Exit In-Order at COMMIT
  –Register File updates happen at this point.
SimpleScalar follows this model
–Well, mostly
–Cuts corners on when Completes become visible
23
RUU: OOO Execution
Decode:
–Check RUU for most recent write to register
–If none found, read value from RF
  –Do it in parallel really
–If found, link to producer with a TAG
  –RUU number is the TAG
Issue
–Wait till all input operands are ready
Complete
–Broadcast value and RUU ID
  –Waiting instructions will pick value up
Commit
–Head and Tail pointer for FIFO operation
–Only when everyone before has committed
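The decode, issue, complete, and commit steps above can be sketched as a small model. This is a simplification under stated assumptions, not Sohi's design or SimpleScalar's code: the class `RUU` and its methods are hypothetical, the tag is a monotonically increasing counter rather than a circular-buffer index, and a consumer decoded after its producer has completed reads the value straight from the producer's entry.

```python
from collections import deque


class RUU:
    """Toy Register Update Unit: FIFO of reservation-station-like entries."""

    def __init__(self, regs):
        self.rf = dict(regs)
        self.fifo = deque()   # head = oldest instruction
        self.next_tag = 0     # assumption: counter instead of entry index

    def decode(self, dest, srcs):
        operands = []
        for reg in srcs:
            # Most recent in-flight write to this register, if any.
            producer = next((e for e in reversed(self.fifo) if e["dest"] == reg), None)
            if producer is None:
                operands.append({"tag": None, "value": self.rf[reg]})
            elif producer["done"]:
                operands.append({"tag": None, "value": producer["value"]})
            else:
                operands.append({"tag": producer["tag"], "value": None})
        entry = {"tag": self.next_tag, "dest": dest, "operands": operands,
                 "value": None, "done": False}
        self.next_tag += 1
        self.fifo.append(entry)
        return entry

    def ready(self, entry):
        """Issue condition: no operand is still waiting on a tag."""
        return all(op["tag"] is None for op in entry["operands"])

    def complete(self, entry, value):
        """Broadcast value and tag; waiting operands pick the value up."""
        entry["value"], entry["done"] = value, True
        for e in self.fifo:
            for op in e["operands"]:
                if op["tag"] == entry["tag"]:
                    op["tag"], op["value"] = None, value

    def commit(self):
        """Retire from the head in order; the register file is updated here."""
        count = 0
        while self.fifo and self.fifo[0]["done"]:
            head = self.fifo.popleft()
            self.rf[head["dest"]] = head["value"]
            count += 1
        return count


ruu = RUU({"R1": 1, "R2": 2, "R3": 3, "R4": 4})
div = ruu.decode("R3", ["R4"])     # DIV R3, ...
add1 = ruu.decode("R1", ["R2"])    # ADD R1, ...
add2 = ruu.decode("R2", ["R1"])    # ADD ..., R1 (waits on add1's tag)
waiting = not ruu.ready(add2)
ruu.complete(add1, 9)              # broadcast wakes add2
woken = ruu.ready(add2)
committed_early = ruu.commit()     # 0: DIV at the head is not done
ruu.complete(div, 0)
committed_late = ruu.commit()      # DIV and add1 retire in order
```

The dependent ADD wakes up as soon as its producer broadcasts, yet nothing reaches the register file until the DIV at the head has finished.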
24
Where is the Rename Table?
It’s the RUU
–@ decode insts scan for the most recent update to register
–If none found, then register in register file
–Otherwise, get RUU entry # as tag
Interrupts?
–Simply flush RUU
Pros/Cons:
–Associative lookup for decode
–RUU ports limit when consumers can read a value