1
A Unified Online Fault Detection Scheme via Checking of Stability Violation
Guihai Yan, Yinhe Han, and Xiaowei Li
Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences
Apr. 22, 2009
2
Outline
Introduction
What’s Stability Violation
Fault Detection via Checking Stability Violation
Design Considerations
Hspice Simulation Results
Conclusion
3
Introduction
Two in-field reliability challenges:
Soft errors (SET, SEU). Detection scheme: redundancy (temporal, spatial, or both).
Aging failures, induced by NBTI, TDDB, etc. Detection scheme: aging sensors.
Can one fault model handle all of the above in-field faults? The question matters because a unified detection scheme is possible only under a unified fault model!
4
What’s “Stability Violation”
Stable Period vs. Variable Period
Stability Violation (SV): a signal transition occurs within the Stable Period.
5
In what situations would an SV occur?
When the circuit encounters delay faults resulting from:
Delay defects (introduced in manufacturing processes)
Aging (wearout) induced performance degradation
[Figure: clock period T with its setup time, and a setup time violation due to a delay fault]
Thus, a stability violation caused by a delay fault does not differ much from a “setup time violation”.
But can soft errors be modeled by SVs? YES!
6
Soft Errors can also cause SVs
SEU (Single Event Upset): unintentional bit-flip in storage cells
SET (Single Event Transient): transient voltage pulse propagating in combinational logic
[Figure: SEU and SET]
7
How Soft Errors cause SV
SEU: Si violates the stability requirement!
SET: So violates the stability requirement!
Notice: ONLY the SVs occurring in the “vulnerable window”, within which the flip-flops are updated, could cause failures.
8
Now we conclude that…
Delay faults and soft errors can be modeled as Stability Violations.
The next problem: how to detect stability violations? Using a Stability Checker.
9
Stability Checker
Basic operating principle:
Step 1: Precharge S1 and S2 to “HIGH”
Step 2: Monitor state (evaluation)
No stability violation: S1 OR S2 = 1
Otherwise: S1 OR S2 = 0 (so the NOR of S1 and S2 goes high)
Because the checker cannot monitor any signal during precharge, when to precharge is an essential design consideration!
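The two-step principle above can be sketched as a small behavioral model. This is only an illustration, not the authors' circuit: it assumes that a HIGH on the monitored signal discharges S1 and a LOW discharges S2, so both nodes end up LOW only if the signal changed during evaluation. The function name and the sampled-waveform input are hypothetical.

```python
def stability_checker(samples):
    """Behavioral sketch of one precharge/evaluation cycle.

    samples: 0/1 values of the monitored signal observed during evaluation.
    Returns (s1, s2, violation), where violation = NOR(s1, s2).
    """
    # Step 1: precharge both nodes to HIGH.
    s1, s2 = 1, 1

    # Step 2: evaluation. Assumption (not from the slides): a HIGH input
    # discharges S1 and a LOW input discharges S2; a discharged node
    # stays LOW until the next precharge.
    for value in samples:
        if value == 1:
            s1 = 0
        else:
            s2 = 0

    # No stability violation: S1 OR S2 = 1. Otherwise: S1 OR S2 = 0.
    violation = int(not (s1 or s2))
    return s1, s2, violation


if __name__ == "__main__":
    print(stability_checker([1, 1, 1, 1]))  # stable HIGH
    print(stability_checker([0, 0, 1, 1]))  # transition during evaluation
```

A signal that holds one value leaves at least one node HIGH, while any transition inside the evaluation phase discharges both nodes and raises the NOR output.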
10
Objective of Manipulating Precharge (or Evaluation)
(1) Distinguish faulty transitions that cause SVs from normal signal transitions
(2) Keep the vulnerable window under monitoring (evaluation)
(3) The evaluation period should be longer than the width of an SET
[Figure: evaluation period, SET, and flip-flop update]
Where should precharge go: at the end, or at the beginning?
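Objectives (2) and (3) can be read as simple interval checks. The following is a minimal sketch, assuming all times lie on one axis in the same unit; the function name, parameter names, and the numbers in the usage lines are hypothetical. Objective (1) depends on the actual waveforms and is not modeled here.

```python
def check_evaluation_window(eval_start, eval_end, vuln_start, vuln_end, set_width):
    """Check a candidate evaluation window against objectives (2) and (3).

    Returns (covers_vulnerable, longer_than_set).
    """
    # Objective (2): the vulnerable window must lie inside the evaluation window.
    covers_vulnerable = eval_start <= vuln_start and vuln_end <= eval_end

    # Objective (3): the evaluation period must be longer than the SET width.
    longer_than_set = (eval_end - eval_start) > set_width

    return covers_vulnerable, longer_than_set


if __name__ == "__main__":
    # Made-up numbers: evaluation window 2..10, vulnerable window 8..9, SET width 1.
    print(check_evaluation_window(2, 10, 8, 9, 1))
    # Evaluation ends before the vulnerable window closes: objective (2) fails.
    print(check_evaluation_window(2, 8, 7, 9, 1))
```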
11
Precharge at the end?
[Figure: evaluation followed by precharge]
NOT GOOD! Even a setup violation would escape undetected!
12
At the beginning?
[Figure: Si, combinational logic (Comb.), and So, with precharge followed by evaluation]
What if a normal transition happens here, far from the setup requirement? It is likely to be caught as a violation: False Alarm!
What if an SEU occurs here and the corresponding “So”
1) is masked (logic or latch window),
2) causes an SV (Propagation Detectable), or
3) is stabilized before the start of evaluation?
[Figure: precharge, mask, and evaluation phases]
Still NOT good!
13
At the Beginning (2)
[Figure: precharge, PDP, and evaluation phases]
PDP: Propagation Detectable Period
Benign Period: precharge can be scheduled here
Still open (XOR protection)
14
A Comprehensive Solution
tpd: propagation delay of the combinational logic
tcd: contamination delay (a.k.a. short-path delay)
tcq: flip-flop’s clock-to-q time
TGB: “conservative” setup time requirement
TDS: expected maximum width of SET
15
Experiments
Using 65nm PTM
Hspice Simulation
Overhead Analysis: Area, Power, Performance, Design complexity
16
Simulation Signal States
[Figure: Hspice waveforms (voltage vs. time) of CLK, CLKS, XOR, So, S1, S2, A1, and B1, with the Guard Band and Detection Slack marked. Normal transitions are contrasted with fault transitions (aging delay, SEU fault), which are marked “Fault detected”.]
17
Conclusion
A unified fault model, Stability Violation, can facilitate implementing a unified fault detection scheme.
Thank You!