Published by Cameron Watkins. Modified over 9 years ago.
1
A Fast and Accurate Multi-Cycle Soft Error Rate Estimation Approach to Resilient Embedded Systems Design
Mahdi Fazeli, Seyed Ghassem Miremadi, Hossein Asadi, Seyed Nematollah Ahmadian
Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
Presenter: Saman Aliari, University of Illinois at Urbana-Champaign
2
Speech Outline
Soft Errors
SER Modeling in Multi-Cycle Operation
SER Modeling in Single Cycle Operation
Proposed SER Modeling in Multi-Cycle Operation
Tool Overview
Experimental Results and Discussions
Conclusions
3
What Is a Soft Error?
Transient faults due to radiation events
Bit flips: 1 → 0 or 0 → 1
Sources: alpha particles or neutrons
Affected elements: memory, flip-flops, combinational logic
[Figure: energetic particle strike flipping stored bits]
4
Evidence of Particle Strikes
2000 [Forbes Magazine'00]: SUN Enterprise servers crash due to a cache problem
2001 [ITRS'01]: Soft errors identified as a major issue in chip design
2003 [EE Times'04]: Cisco router failures due to soft errors
2004 [Xilinx.com]: Xilinx FPGAs highly sensitive to soft errors
2005 [Selse.org]: Soft error workshop (70% industry attendees)
2011 [ZeroSoft'06]: 70% of chips expected to fail in a year
5
Multi-Cycle Soft Error Propagation
First cycle: the single event transient (SET) does not propagate to the primary output (PO)
Second cycle: the error propagates to the primary output (PO)
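The two-cycle behavior can be illustrated with a toy simulation. This is only a sketch on a hypothetical circuit (not the one drawn on the slide): node `n` buffers input `x`, the primary output is `n AND q`, and the flip-flop `q` latches `n` each cycle. It assumes the transient is wide and strong enough to be latched, so timing and electrical masking are ignored.

```python
def simulate(inputs, flip_cycle=None, q0=0):
    """Return the primary-output value per cycle for a toy sequential circuit.

    Hypothetical circuit: n = x, PO = n AND q, next q = n.
    If flip_cycle is given, node n is inverted during that cycle only.
    """
    q = q0
    outputs = []
    for cycle, x in enumerate(inputs):
        n = x
        if cycle == flip_cycle:
            n ^= 1              # transient error at the error site
        outputs.append(n & q)   # primary output sampled this cycle
        q = n                   # flip-flop latches n for the next cycle
    return outputs


golden = simulate([1, 1])
faulty = simulate([1, 1], flip_cycle=0)
mismatch = [g != f for g, f in zip(golden, faulty)]
print(golden, faulty, mismatch)
```

In cycle 1 the flip-flop holds 0, so the AND gate masks the error at the output, but the wrong value is latched. In cycle 2 the corrupted flip-flop value reaches the output, which is the case a single-cycle analysis would miss.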
6
SER Modeling in Single Cycle
SER = Nominal FIT × Logic Derating × Timing Derating × Electrical Derating
Nominal FIT: occurrence rate of cosmic rays at the error site; computed once during library characterization
[Figure: example circuit with nodes A, B, C, D, E and a flip-flop (FF), annotated with logical, timing, and electrical derating]
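As a rough illustration, the sketch below combines the four factors named on this slide, assuming they multiply (the usual derating formulation; the operators themselves are not preserved in the transcript). The function name and all numeric values are hypothetical.

```python
def estimate_ser(nominal_fit, logic_derating, timing_derating, electrical_derating):
    """Scale a nominal FIT rate by three derating factors, each in [0, 1]."""
    factors = {
        "logic_derating": logic_derating,
        "timing_derating": timing_derating,
        "electrical_derating": electrical_derating,
    }
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a probability, got {value}")
    return nominal_fit * logic_derating * timing_derating * electrical_derating


# Hypothetical numbers, chosen only to show the arithmetic.
ser = estimate_ser(
    nominal_fit=100.0,
    logic_derating=0.12,
    timing_derating=0.5,
    electrical_derating=0.8,
)
print(f"derated SER = {ser:.2f} FIT")
```

Because each derating factor is at most 1, the derated rate can never exceed the nominal FIT of the error site.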
7
Logical Derating Modeling
The main idea:
  Traverse structural paths from the SEU site to POs and FFs
  Use signal probabilities (SP) for off-path signals
  SP_A: probability of gate "A" having logic value "1"
  Effective techniques are available for SP computation
EPP: Error Propagation Probability
Off-path signals: SP_B = 0.2, SP_C = 0.4
EPP(A → D) = SP_B = 0.2
EPP(A → E) = EPP(A → D) × (1 − SP_C) = 0.2 × 0.6 = 0.12
[Figure: example circuit with error site A, on-path gates D and E, and off-path signals B and C]
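The arithmetic on this slide can be reproduced with a short sketch. It assumes a single path with no reconvergence and independent signals, and it assumes gate D is AND-type (sensitized when B = 1) and gate E is OR-type (sensitized when C = 0), which is inferred from the formulas rather than stated in the transcript. The function name is hypothetical.

```python
def path_epp(gates):
    """Error propagation probability along one path.

    gates: list of (gate_type, off_path_sps), where off_path_sps holds the
    signal probability (probability of logic 1) of each off-path input.
    """
    epp = 1.0
    for gate_type, off_path_sps in gates:
        for sp in off_path_sps:
            if gate_type in ("AND", "NAND"):
                epp *= sp          # off-path input must be 1 to pass the error
            elif gate_type in ("OR", "NOR"):
                epp *= 1.0 - sp    # off-path input must be 0 to pass the error
            else:
                raise ValueError(f"unsupported gate type: {gate_type}")
    return epp


SP_B, SP_C = 0.2, 0.4
epp_a_to_d = path_epp([("AND", [SP_B])])
epp_a_to_e = path_epp([("AND", [SP_B]), ("OR", [SP_C])])
print(round(epp_a_to_d, 2), round(epp_a_to_e, 2))
```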
8
Propagation Rules: On-Path Gates
Reconvergent paths: the error propagates to two or more inputs of a gate
The polarity of the propagated error matters
Four logic values are needed to represent the state of each line:
  0, 1: no error propagation (error masked)
  a: error propagated with the same polarity as at the error site
  ā: error propagated with the opposite polarity to the error site
Each line U_i carries four probabilities: P_a(U_i), P_ā(U_i), P_1(U_i), P_0(U_i)
Error Propagation Probability (EPP) rules were developed for all logic gates
9
Propagation Rules
On-path gates: P_a(U_i) + P_ā(U_i) + P_1(U_i) + P_0(U_i) = 1
Off-path gates: P_1(U_i) + P_0(U_i) = 1
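The transcript does not list the gate rules themselves. As a sketch of what such a rule can look like, the code below derives the four probabilities at the output of an AND gate and an inverter directly from the four-valued logic (0, 1, a, ā), assuming statistically independent inputs. All names are hypothetical, and this is not claimed to be the authors' exact rule set.

```python
from dataclasses import dataclass
from math import isclose, prod


@dataclass(frozen=True)
class LineState:
    p_a: float     # error present, same polarity as the error site
    p_abar: float  # error present, opposite polarity
    p_1: float     # error-free logic 1
    p_0: float     # error-free logic 0

    def __post_init__(self):
        total = self.p_a + self.p_abar + self.p_1 + self.p_0
        if not isclose(total, 1.0, abs_tol=1e-9):
            raise ValueError("the four probabilities must sum to 1")


ERROR_SITE = LineState(1.0, 0.0, 0.0, 0.0)


def off_path(sp):
    """Off-path line: only P_1 and P_0 are non-zero (P_1 + P_0 = 1)."""
    return LineState(0.0, 0.0, sp, 1.0 - sp)


def and_gate(inputs):
    """Output is 1 if all inputs are 1; a if all are in {1, a} and not all 1."""
    p_1 = prod(x.p_1 for x in inputs)
    p_a = prod(x.p_1 + x.p_a for x in inputs) - p_1
    p_abar = prod(x.p_1 + x.p_abar for x in inputs) - p_1
    return LineState(p_a, p_abar, p_1, 1.0 - p_a - p_abar - p_1)


def not_gate(x):
    """Inversion swaps the polarities and the error-free values."""
    return LineState(p_a=x.p_abar, p_abar=x.p_a, p_1=x.p_0, p_0=x.p_1)


single_path = and_gate([ERROR_SITE, off_path(0.2)])
reconvergent = and_gate([ERROR_SITE, not_gate(ERROR_SITE)])
print(single_path)
print(reconvergent)
```

The second call shows why polarity has to be tracked: when an error reaches an AND gate with both polarities (a and ā), the output is an error-free 0, so the error is masked even though every input carries it.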
10
Timing Derating Modeling
Find all possible propagated waveforms
  Enhanced static timing analysis
  Record all possible transitions at each reachable gate due to a glitch at the error site
How?
  Create a glitch of width w, represented by two events: (a, t), (ā, t+w)
  Do this for both positive and negative glitches
  Inject the two events (a, t), (ā, t+w) at the error site
  Find all events at the outputs of all on-path gates
  Calculate the error propagation probabilities P_a, P_ā for each event
Propagation continues until a PO or FF is reached
Error propagation probabilities are computed for all possible waveforms
For each waveform, the latching probability is computed as follows:
[Equation not preserved in this transcript]
S: setup time, H: hold time, W: glitch width, T: clock period
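The two-event representation of a glitch can be sketched as follows. This is a simplified illustration only: it follows one path, treats each gate as a fixed delay plus an optional inversion, and ignores attenuation and off-path masking. The function names, delays, and times are hypothetical.

```python
from typing import List, Tuple

# (value, time in ps); value is "a" (same polarity) or "abar" (opposite polarity)
Event = Tuple[str, float]

FLIP = {"a": "abar", "abar": "a"}


def inject_glitch(t: float, w: float) -> List[Event]:
    """A glitch of width w starting at time t is the event pair (a, t), (abar, t+w)."""
    return [("a", t), ("abar", t + w)]


def through_gate(events: List[Event], delay: float, inverting: bool) -> List[Event]:
    """Shift every event by the gate delay; an inverting gate swaps polarity."""
    return [(FLIP[v] if inverting else v, time + delay) for v, time in events]


events = inject_glitch(t=100.0, w=50.0)
events = through_gate(events, delay=20.0, inverting=True)   # e.g. an inverter
events = through_gate(events, delay=30.0, inverting=False)  # e.g. a buffer
width = events[1][1] - events[0][1]
print(events, width)
```

With reconvergent fan-out, several such event lists arrive at the same gate with different delays, which is why more than one waveform can reach a PO or FF.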
11
Different glitches may propagate to the POs or FFs due to re-convergent fan-out
12
Electrical Derating Modeling
Algorithm: computing electrical masking while propagating events
Vomin(Gj, inputk): minimum voltage of input k of Gj
Vomax(Gj, inputk): maximum voltage of input k of Gj
Vomin(Gj): minimum voltage of Gj output
Vomax(Gj): maximum voltage of Gj output
PWo: output pulse width
For each gate Gj in List(Gi) do
    For each valid waveform (Wl) in Event List(Gj) do
        Vomin(inputs) = Max(Vomin of gate inputs on waveform Wl);
        Vomax(inputs) = Min(Vomax of gate inputs on waveform Wl);
        Compute Vomin(Gj)
        Compute Vomax(Gj)
        Compute PWo using computed Vomin(Gj) and Vomax(Gj)
    end
end
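The loop structure of this algorithm can be sketched in Python. The slide does not say how a gate's output voltages or the output pulse width are computed, so both are passed in as callbacks, and the two models used in the example are placeholders invented purely so the sketch runs. All names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# One waveform: the (v_min, v_max) pair seen at each gate input it arrives on.
Waveform = List[Tuple[float, float]]


def electrical_masking(
    gates: List[str],
    event_list: Dict[str, List[Waveform]],
    gate_model: Callable[[str, float, float], Tuple[float, float]],
    pulse_width_model: Callable[[float, float], float],
) -> Dict[str, List[Tuple[float, float, float]]]:
    """For each gate and waveform, return (Vomin, Vomax, PWo) at the gate output."""
    results: Dict[str, List[Tuple[float, float, float]]] = {}
    for gate in gates:
        results[gate] = []
        for waveform in event_list[gate]:
            # Narrowest swing over the inputs: largest minimum, smallest maximum.
            vin_min = max(v_min for v_min, _ in waveform)
            vin_max = min(v_max for _, v_max in waveform)
            vo_min, vo_max = gate_model(gate, vin_min, vin_max)
            pw_o = pulse_width_model(vo_min, vo_max)
            results[gate].append((vo_min, vo_max, pw_o))
    return results


# Placeholder models (assumptions, not from the slides): the gate passes the
# input swing unchanged, and pulse width scales linearly with the swing.
VDD, NOMINAL_WIDTH_PS = 1.0, 50.0
identity_gate = lambda gate, lo, hi: (lo, hi)
linear_width = lambda lo, hi: NOMINAL_WIDTH_PS * max(hi - lo, 0.0) / VDD

out = electrical_masking(
    gates=["G1"],
    event_list={"G1": [[(0.0, 1.0), (0.1, 0.9)]]},
    gate_model=identity_gate,
    pulse_width_model=linear_width,
)
print(out)
```

A real implementation would replace the two placeholder callbacks with characterized cell models; only the nested loop and the Max/Min reduction over gate inputs come from the slide.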
13
A Case Study: Error Propagation for Two Clock Cycles
[Figure annotations:]
Only logical derating may occur
All three deratings may occur
14
The Tool: MLET
Multi-Cycle Logical-Electrical-Timing Derating
15
Experimental Results: Run Time
[Table: execution times for the Monte Carlo (MC) simulation approach, SP computation, and the MLET approach]
On average, MLET is 4 orders of magnitude faster than MC-based simulation
The time required to compute SPs is also 5 orders of magnitude less than that of MC-based simulation
16
Experimental Results: Accuracy
[Figure: difference between derating factors obtained by MLET using various SP variances and those from MC simulations, for an injected pulse width of 50 ps]
MLET has an accuracy of about 97% compared with the MC fault injection approach
17
Multi-Cycle SERs
[Figure: multi-cycle SER estimation of the s820 and s832 ISCAS'89 circuits using MLET]
18
Conclusions & Future Work
SER estimation is very challenging because it requires dynamic analysis of transients.
Existing SER estimation approaches investigate error propagation probabilities for a single cycle only, which results in an inaccurate system failure rate.
We have proposed a very fast and accurate analytical approach, called MLET, which has four main features:
1. It runs very fast.
2. All three masking factors are considered.
3. The effects of error propagation in re-convergent fan-outs are modeled.
4. The effect of multi-cycle error propagation on overall circuit SER is considered.
19
Conclusions & Future Work (Cont'd)
Experimental results for several ISCAS'89 benchmark circuits show that MLET:
  Is 4 orders of magnitude faster than the MC simulation-based fault injection method
  Has an accuracy of about 97%
Future work: estimating the SER of a circuit in the presence of Multiple Event Transients (METs), a reliability concern in ultra-deep sub-micron technologies
20
Thank You for Your Attention
21
Related Work: SER Modeling
Circuit/Logic-Level Approach
Fault injection
SERA by Zhang et al. [ICCAD'04]
SEAT-LA by Rajaraman et al. [VLSID'06]
Mohanram et al. [ITC'03]
Maheshwari et al. [DFT'03]
Asadi et al. [DSN'03] [PRDC'04]
Seifert et al. [TDMR'04]
Probabilistic Transfer Matrices (PTM)
Krishnaswamy et al. [DATE'05]
Binary Decision Diagram (BDD)
FASER by Zhang et al. [ISQED'06] [SELSE'05]