Download presentation
Presentation is loading. Please wait.
Published byBrendan French Modified over 9 years ago
1
ReSlice: Selective Re-execution of Long-retired Misspeculated Instructions Using Forward Slicing Smruti R. Sarangi, Wei Liu, Josep Torrellas, Yuanyuan Zhou University of Illinois at Urbana-Champaign http://iacoma.cs.uiuc.edu Presented at CARG, March 8 th, 2006 By Jason Zebchuk
2
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 2 Data Value Speculation Predict the value and proceed speculatively When correct value is known, if mispediction, squash and re-execute Initial proposals L1 data value Store/Load Data dependences Aggressive novel proposals Retire speculative instructions Values of L2 misses Thread independence in Thread-Level Speculation (TLS) Speculative synchronization
3
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 3 Long-latency Speculation Misprediction recovery is very wasteful Most discarded instructions are still useful Checkpoint Prediction Resolution 210 Instructions 6.6 Instructions Speculatively retired instructions
4
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 4 Contributions (I) ReSlice: Architecture to buffer forward slice and re- execute it on misprediction Checkpoint Prediction Resolution Buffered Forward Slice If succeeds If fails
5
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 5 Contributions (II) A sufficient condition for correct slice re-execution and merge Guarantee to correctly repair the program state Application to Thread-Level Speculation (TLS) on SpecInt Prediction: task live-ins Removed 61% task squashes Speedup: up to 33% over TLS (avg 12%) E×D 2 reduction: avg 20%
6
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 6 Outline Motivation and Contributions Idea of ReSlice Architecture Design Experimental Results Conclusions
7
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 7 Idea of ReSlice Initial execution of the task Predict value of “risky” load and continue e.g L2 missing loads, exposed loads in TLS … Buffer in HW the forward slice of the load If a misprediction happens Re-execute the slice with the correct value If succeed: merge the register and memory state and continue If fail: roll back and re-execute (conventional recovery)
8
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 8 Why is this Challenging? … #1: LD R1 mem_A … #2: ST R2 (R1) … #3: LD R5 mem[0x20] … “Risky” Load R1 0x10 ST R2 mem[0x10] R1 0x20 ST R2 mem[0x20] LD R5 mem[0x20] Problem: Instruction #3 is not buffered! Buffered SliceCorrect Execution Initial Execution Initial execution is speculative Buffered slices may not be correct
9
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 9 Solution: Run-time Address Check … #1: LD R1 mem_A … #2: ST R2 (R1) … #3: LD R5 mem[0x20] … “Inhibiting” Store R1 0x10 ST R2 mem[0x10] R1 0x20 ST R2 mem[0x20] Buffered SliceSlice Re-ExecutionInitial Execution Different store addresses; And mem[0x20] was loaded in initial execution “Risky” Load
10
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 10 Problems Buffered Slice Missing Insts Extra Insts Data Corruption Slice Re-execution Initial Speculative Execution (Example just shown) “Dangling” Load“Inhibiting” Load
11
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 11 A Sufficient Condition Guarantee correct Slice re-execution State merge with program Two sub-conditions Branch takes same direction in both initial execution and slice re-execution No presence of any previous three problems Easy for HW to check at run-time Details please see our paper (Briefly: compare addresses accessed in old and new run, also look for references to same address in first run)
12
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 12 How does ReSlice Work? Timeline Checkpoint Prediction Resolution Slice Collection Compare the correct and predicted value Correct prediction Slice Re-Execution Misprediction Incorrect State Merging Correct
13
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 13 Architecture Design CPU Live Ins Buffered Instr. Re-Execution Unit (REU) Slice Descriptor Slice Descriptor Slice Descriptors Slice Info Slice Buffer Cache Register State Memory State
14
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 14 Other Architectural Requirements Tag physical registers with “SliceTag” Tag Cache for storing addresses accessed by slice instructions Undo Log records first store to each memory location in each Slice High-coverage Predictor to predict which instructions to use as seeds. Uses Dependence and Value Predictor used by TLS Uses TLS information to identify Speculatively Read/Written locations in cache
15
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 15 Step 1: Slice Collection Fill up the Slice Buffer when a prediction is made Track both register and memory dependences Save live-in operands and slice instructions Slices are buffered when instructions are retired CPU Live Ins Buffered Instr. Re-Execution Unit (REU) Slice Descriptor Slice Info Slice Buffer Cache Register State Memory State
16
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 16 An Example Slice Desc. Live InsBuffered Insts ST R3, mem[R4] R4 = 0x10 ADD R3, R1, R2 R4 = 0x10 ST R3, mem[R4]
17
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 17 Step 2: Slice Re-Execution REU takes over after a misprediction In-order execution Sufficient condition is checked during slice re-execution If success, merge the register and memory state; otherwise, squash the task and restart CPU Live Ins Buffered Instr. Re-Execution Unit (REU) Slice Descriptor Slice Info Slice Buffer Cache Register State Memory State
18
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 18 Step 3: State Merging Copy live registers back to CPU register file Merge memory state (details in paper) CPU Live Ins Buffered Instr. Re-Execution Unit (REU) Slice Descriptor Slice Info Slice Buffer Cache Register State Memory State
19
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 19 Multiple Overlapping Slices One slice may change live-ins of the other slice Detect overlapping slices during slice collection Combine the instructions of the overlapping slices and re- execute them in program order misprediction
20
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 20 Outline Motivation and Contributions Idea of ReSlice Architecture Design Experimental Results Conclusions
21
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 21 Methodology Simulated CMP architecture 5 GHz Private 16k L1 per core; 1 MB Shared L2 SpecInt 2000 applications TLS+ReSlice Four 3-issue cores with TLS and ReSlice Baseline: TLS Four 3-issue cores with TLS Serial One 3-issue core
22
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 22 Slice Characterization (Averages) 10.4 instructions 4.5 Reg live-ins 1.0 Mem live-ins 2.2 Reg live-outs 1.9 Mem live-outs 1.6 slices per task 15% tasks with overlapping slices
23
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 23 Number of Task Squashes 61% squashes are avoided because of ReSlice
24
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 24 Performance Speedup TLS+ReSlice Over TLS: up to 33% (avg 12%) Over Serial: avg 45%
25
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 25 Energy × Delay 2 E×D 2 reduction: 20% over TLS
26
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing Wei Liu, University of Illinois 26 Conclusions Generic architecture for forward slice re-execution Sufficient condition for correct re-execution and merge Improve TLS on SpecInt Speedups: up to 33% over TLS (avg 12%) E×D 2 reduction: avg 20% over TLS Recovering wasted work is a promising approach Boost performance Energy efficiency
27
ReSlice: Selective Re-execution of Long-retired Misspeculated Instructions Using Forward Slicing Smruti R. Sarangi, Wei Liu, Josep Torrellas, Yuanyuan Zhou University of Illinois at Urbana-Champaign http://iacoma.cs.uiuc.edu
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.