Download presentation
Presentation is loading. Please wait.
Published byNickolas Burns Modified over 9 years ago
1
NC STATE UNIVERSITY DSN 2007 © Vimal Reddy 2007 1 Inherent Time Redundancy (ITR): Using Program Repetition for Low-Overhead Fault Tolerance Vimal Reddy Eric Rotenberg Center for Efficient, Secure and Reliable Computing (CESR) Dept. of Electrical & Computer Engineering North Carolina State University Raleigh, NC
2
NC STATE UNIVERSITY DSN 2007 © Vimal Reddy 2007 2 Processor Fault Tolerance Main approaches: Time or space redundancy –Explicitly duplicate in time or space –Match values/signals –E.g., AR-SMT (time), IBM S/390 G5 (space) High overheads –Area/power/performance –Unsuitable for commodity high-performance processors
3
NC STATE UNIVERSITY DSN 2007 New Idea: Inherent Time Redundancy © Vimal Reddy 2007 3 == program program duplicateprogram Conventional time redundancyInherent time redundancy
4
NC STATE UNIVERSITY DSN 2007 Outline Exploiting ITR for Fault Tolerance ITR Cache for Checking Decode Signals ITR Cache Hit/Miss Scenarios Microarchitecture Extensions to Superscalar Results –Repetition –Fault Coverage Based on Cache Hits/Misses –Fault Coverage Based on Fault Injection Area and Power Overhead Conclusion © Vimal Reddy 2007 4
5
NC STATE UNIVERSITY DSN 2007 Exploiting ITR for Fault Tolerance Key ideas: –Record input-independent signals for instructions –Confirm their correctness upon repetition This paper focuses on decode signals –Provides protection for fetch and decode stages Longer term goal –Apply ITR to other input-independent stages –My thesis combines ITR with other microarch.- level checks for a comprehensive fault regimen © Vimal Reddy 2007 5
6
NC STATE UNIVERSITY DSN 2007 Outline Exploiting ITR for Fault Tolerance ITR Cache for Checking Decode Signals ITR Cache Hit/Miss Scenarios Microarchitecture Extensions to Superscalar Results –Repetition –Fault Coverage Based on Cache Hits/Misses –Fault Coverage Based on Fault Injection Area and Power Overhead Conclusion © Vimal Reddy 2007 6
7
NC STATE UNIVERSITY DSN 2007 ITR Cache for Checking Decode Signals Create decode signatures at trace granularity –Trace ends at branch or 16 instructions (arbitrary) –Combine decode signals of instructions in a trace Record decode signatures in an ITR cache –Indexed by start PC (program counter) of a trace For each new trace, compare with ITR cache –Fault if new signature mismatches ITR cache © Vimal Reddy 2007 7
8
NC STATE UNIVERSITY DSN 2007 Outline Exploiting ITR for Fault Tolerance ITR Cache for Checking Decode Signals ITR Cache Hit/Miss Scenarios Microarchitecture Extensions to Superscalar Results –Repetition –Fault Coverage Based on Cache Hits/Misses –Fault Coverage Based on Fault Injection Area and Power Overhead Conclusion © Vimal Reddy 2007 8
9
NC STATE UNIVERSITY DSN 2007 pc(I) Scenario 1: ITR Cache Hit © Vimal Reddy 2007 9 Signature Generation IJK S2 (I, J, K) Tags Signatures S1 (I, J, K)pc(I) Hit I ≠ Fault Detected Notes: -- Faults on traces that hit (S2) are detectable -- These faults are recoverable by flushing the processor pipeline and restarting from the faulting trace ITR Cache
10
NC STATE UNIVERSITY DSN 2007 Scenario 2: ITR Cache Miss Followed by Hit © Vimal Reddy 2007 10 Signature Generation IJK S1 (I, J, K) Tags Signatures S1 (I, J, K)pc(I) Miss pc(I) I ITR Cache
11
NC STATE UNIVERSITY DSN 2007 pc(I) © Vimal Reddy 2007 11 Signature Generation IJK S2 (I, J, K) Tags Signatures S1 (I, J, K)pc(I) Hit ≠ Scenario 2: ITR Cache Miss Followed by Hit Fault Detected Notes: -- Faults on traces that miss but later hit (faulty S1) are detectable -- Faults on S1 need checkpoint for recovery or must abort ITR Cache
12
NC STATE UNIVERSITY DSN 2007 Scenario 3: ITR Cache Miss Followed by Eviction © Vimal Reddy 2007 12 Signature Generation IJK S1 (I, J, K) Tags Signatures S1 (I, J, K)pc(I) Miss pc(I) I ITR Cache
13
NC STATE UNIVERSITY DSN 2007 pc(M) © Vimal Reddy 2007 13 Signature Generation MNO S2 (M, N, O) Tags Signatures Evict Notes: -- Faults on missed & evicted traces (S1) are not detected Scenario 3: ITR Cache Miss Followed by Eviction S2 (M, N, O)S1 (I, J, K)pc(M)pc(I) Fault not detected ITR Cache
14
NC STATE UNIVERSITY DSN 2007 Outline Exploiting ITR for Fault Tolerance ITR Cache for Checking Decode Signals ITR Cache Hit/Miss Scenarios Microarchitecture Extensions to Superscalar Results –Repetition –Fault Coverage Based on Cache Hits/Misses –Fault Coverage Based on Fault Injection Area and Power Overhead Conclusion © Vimal Reddy 2007 14
15
NC STATE UNIVERSITY DSN 2007 Microarchitecture Extensions to Superscalar © Vimal Reddy 2007 15 Decode signals redirected for signature generation and ITR cache access
16
NC STATE UNIVERSITY DSN 2007 Microarchitecture Extensions to Superscalar © Vimal Reddy 2007 16 Signature generation: Bitwise XOR decode signals of instr. from a trace
17
NC STATE UNIVERSITY DSN 2007 Microarchitecture Extensions to Superscalar © Vimal Reddy 2007 17 Signatures dispatched into ITR ROB (ITR cache more effective without wrong-path trace signatures)
18
NC STATE UNIVERSITY DSN 2007 Microarchitecture Extensions to Superscalar © Vimal Reddy 2007 18 Access ITR cache with trace start PC
19
NC STATE UNIVERSITY DSN 2007 Microarchitecture Extensions to Superscalar © Vimal Reddy 2007 19 Signature mismatch, set retry bit -- Flush pipeline and retry from faulting trace
20
NC STATE UNIVERSITY DSN 2007 Microarchitecture Extensions to Superscalar © Vimal Reddy 2007 20 Fault re-detected, old signature is faulty -- Abort or rollback to a safe checkpoint Fault disappears, successful recovery
21
NC STATE UNIVERSITY DSN 2007 Microarchitecture Extensions to Superscalar © Vimal Reddy 2007 21 ITR cache miss -- Write new signature to ITR cache
22
NC STATE UNIVERSITY DSN 2007 Microarchitecture Extensions to Superscalar © Vimal Reddy 2007 22 No fault detected (or possible loss of detection due to miss-followed-by- eviction case)
23
NC STATE UNIVERSITY DSN 2007 Outline Exploiting ITR for Fault Tolerance ITR Cache for Checking Decode Signals ITR Cache Hit/Miss Scenarios Microarchitecture Extensions to Superscalar Results –Repetition –Fault Coverage Based on Cache Hits/Misses –Fault Coverage Based on Fault Injection Area and Power Overhead Conclusion © Vimal Reddy 2007 23
24
NC STATE UNIVERSITY DSN 2007 Outline Exploiting ITR for Fault Tolerance ITR Cache for Checking Decode Signals ITR Cache Hit/Miss Scenarios Microarchitecture Extensions to Superscalar Results –Repetition –Fault Coverage Based on Cache Hits/Misses –Fault Coverage Based on Fault Injection Area and Power Overhead Conclusion © Vimal Reddy 2007 24
25
NC STATE UNIVERSITY DSN 2007 Results: Repetition in SPEC2K INT © Vimal Reddy 2007 25 % of total dynamic instructions
26
NC STATE UNIVERSITY DSN 2007 © Vimal Reddy 2007 26 Results: Repetition in SPEC2K FP % of total dynamic instructions
27
NC STATE UNIVERSITY DSN 2007 © Vimal Reddy 2007 27 Results: Proximity in SPEC2K INT % of total dynamic instructions # dynamic instructions separating repetitive traces
28
NC STATE UNIVERSITY DSN 2007 © Vimal Reddy 2007 28 Results: Proximity in SPEC2K FP # dynamic instructions separating repetitive traces % of total dynamic instructions
29
NC STATE UNIVERSITY DSN 2007 Outline Exploiting ITR for Fault Tolerance ITR Cache for Checking Decode Signals ITR Cache Hit/Miss Scenarios Microarchitecture Extensions to Superscalar Results –Repetition –Fault Coverage Based on Cache Hits/Misses –Fault Coverage Based on Fault Injection Area and Power Overhead Conclusion © Vimal Reddy 2007 29
30
NC STATE UNIVERSITY DSN 2007 Fault Coverage Based on ITR Cache Misses Loss in fault detection coverage –Miss-followed-by-eviction causes loss in fault detection –(# of missed & evicted traces) x instructions/trace Loss in fault recovery coverage (w/o checkpoints) –Misses cause loss in fault recovery –(# of missed traces) x instructions/trace Tried various ITR cache configurations –Direct-mapped (DM), 2,4,8,16-way, fully-associative (FA) –256, 512 and 1024 signatures –Bzip, gzip, art, mgrid and wupwise had <1% loss in coverage (results not shown) © Vimal Reddy 2007 30
31
NC STATE UNIVERSITY DSN 2007 Results: Loss in Fault Detection Coverage © Vimal Reddy 2007 31 Coverage loss correlates with proximity of repetition e.g., perl and vortex Capacity is important for poor performers For 2-way, 1024 signatures: 8% max. (vortex) and 1.3% avg. loss in detection coverage % of total dynamic instructions
32
NC STATE UNIVERSITY DSN 2007 Results: Loss in Fault Recovery Coverage © Vimal Reddy 2007 32 Recovery coverage loss is bigger than loss in detection coverage For 2-way, 1024 signatures: 15% max. (vortex) and 2.5% avg. loss in recovery coverage % of total dynamic instructions
33
NC STATE UNIVERSITY DSN 2007 Outline Exploiting ITR for Fault Tolerance ITR Cache for Checking Decode Signals ITR Cache Hit/Miss Scenarios Microarchitecture Extensions to Superscalar Results –Repetition –Fault Coverage Based on Cache Hits/Misses –Fault Coverage Based on Fault Injection Area and Power Overhead Conclusion © Vimal Reddy 2007 33
34
NC STATE UNIVERSITY DSN 2007 Fault Injection Experiments © Vimal Reddy 2007 34 Injected faults on decode signals of a cycle- accurate simulator Based on MIPS R10K processor 1000 random faults per benchmark
35
NC STATE UNIVERSITY DSN 2007 Decode Signals Modeled © Vimal Reddy 2007 35
36
NC STATE UNIVERSITY DSN 2007 Fault Injection Outcomes © Vimal Reddy 2007 36 Fault SDCMask ITR+SDC MayITR+SDC Undet+SDC ITR+Mask Undet+Mask MayITR+Mask 92% 97% 7% 1% 37% 3% 0% 63% Refer paper for other interesting fault injection results
37
NC STATE UNIVERSITY DSN 2007 Outline Exploiting ITR for Fault Tolerance ITR Cache for Checking Decode Signals ITR Cache Hit/Miss Scenarios Microarchitecture Extensions to Superscalar Results –Repetition –Fault Coverage Based on Cache Hits/Misses –Fault Coverage Based on Fault Injection Area and Power Overhead Conclusion © Vimal Reddy 2007 37
38
NC STATE UNIVERSITY DSN 2007 Vimal Reddy © 2007 38 Area Overhead The IBM S/390 G5 replicates the fetch and decode units (I-unit) ITR cache configuration similar to BTB in the picture (2K entries, 2-way assoc., 35 bits per entry) ITR cache is 1/7 th the I-unit *From [slegel-IEEE-Micro-1999]
39
NC STATE UNIVERSITY DSN 2007 Vimal Reddy © 2007 39 Power Overhead Redundant fetching is a major power overhead Compare I-cache of IBM power4 (64KB, dm, 128B line, 1 rd./wr. port) with ITR cache (16KB, 2-way, 4B line, 1 rd. and 1wr. port) Excluded redundant decoding energy (more savings in reality)
40
NC STATE UNIVERSITY DSN 2007 Outline Exploiting ITR for Fault Tolerance ITR Cache for Checking Decode Signals ITR Cache Hit/Miss Scenarios Microarchitecture Extensions to Superscalar Results –Repetition –Fault Coverage Based on Cache Hits/Misses –Fault Coverage Based on Fault Injection Area and Power Overhead Conclusion © Vimal Reddy 2007 40
41
NC STATE UNIVERSITY DSN 2007 Conclusion Inherent Time Redundancy is a new opportunity for efficient fault tolerance 96% of faults on decode signals detected by small ITR cache (~16 KB) Future work: –Handling benchmarks with less repetition –Exploiting ITR for other input-independent parts of the processor –Evaluating a comprehensive fault regimen that includes ITR © Vimal Reddy 2007 41
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.