1
Online Timing Analysis for Wearout Detection
Jason Blome, Shuguang Feng, Shantanu Gupta, Scott Mahlke
University of Michigan, Electrical Engineering and Computer Science
2
Wearout Mechanisms
There are a lot of them:
► Electromigration (EM)
► Time-dependent dielectric breakdown (TDDB)
► Negative-bias temperature instability (NBTI)
► Hot carrier injection (HCI)
► …
All highly dependent on temperature and current density
► Both increasing fast!
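To make the temperature and current-density dependence concrete, here is a minimal sketch for one mechanism, electromigration, using Black's equation (MTTF proportional to J^-n * exp(Ea / kT)). The slides do not give a model; the function name, the exponent n = 2, and the activation energy 0.9 eV are illustrative assumptions only.

```python
import math

BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def em_relative_mttf(j_ratio, temp_k, ref_temp_k, n=2.0, ea_ev=0.9):
    """Electromigration MTTF relative to a reference operating point.

    Uses Black's equation, MTTF ~ J^(-n) * exp(Ea / (k*T)).
    j_ratio is current density divided by the reference current density.
    n and ea_ev are assumed, illustrative values (not from the slides).
    """
    current_term = j_ratio ** (-n)
    thermal_term = math.exp(
        (ea_ev / BOLTZMANN_EV) * (1.0 / temp_k - 1.0 / ref_temp_k)
    )
    return current_term * thermal_term


# Same current density, 10 K hotter than the 350 K reference.
hotter = em_relative_mttf(1.0, 360.0, 350.0)
# Same temperature, twice the current density.
denser = em_relative_mttf(2.0, 350.0, 350.0)
print(f"10 K hotter: {hotter:.2f}x MTTF, 2x current density: {denser:.2f}x MTTF")
```

Under these assumed parameters, both a hotter die and a higher current density shorten the expected lifetime, which is the trend the slide points at.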
3
Goals of this Research
Low-cost reliable system design
► How do physical wearout mechanisms progress?
► How do we determine that a device has failed?
► How do we maintain operation given failed components?
4
Traditional and Recent Approaches
Traditional detection techniques are expensive
► Redundant checking structures
Predictive techniques
► Canary circuits
► RAMP
5
Proposed Technique
Key Insight:
► Degradation in silicon leads to a decrease in performance
► Long incubation time followed by rapid deterioration
Examples:
► TDDB: increases leakage, shifting voltage curves
► EM: increases resistance
► NBTI: shifts threshold voltage
6
Outline
► Microprocessor model
► Wearout simulation methodology
► Wearout simulation results
► The wearout detection unit (WDU)
► WDU analysis
► Conclusion
7
Simulation Setup
OpenRISC 1200
Area               1.28 mm²
Power              92.2 mW
Clock Frequency    200 MHz
Data Cache         8 KB
Instruction Cache  8 KB
8
Simulation Flow, Step 1: Temperature and Activity Analysis
[Flow diagram. Inputs: Benchmark, Netlist, Timing, Parasitics. Tools: Synopsys VCS, PrimePower, HotSpot. Outputs: Activity Trace, Power Trace, Temperature Trace.]
9
Simulation Flow, Step 2: Wearout Simulation
[Flow diagram. Labels: Benchmark, Netlist, Timing, Temperature, Activity, Synopsys VCS, MTTF Calculation, Relative Wearout Factors, Age Index, Wearout Simulation, Signal Latency Data.]
Device Delay = Original Delay * RWF * AI * RV
► RWF: Relative amount of wearout for a device
► AI: Performance degradation parameterized by age
► RV: Random variable
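The delay model above can be sketched in a few lines. The slides do not say how RV is distributed, so the Gaussian centered on 1.0, the `sigma` parameter, and the function name are assumptions made only for illustration.

```python
import random


def aged_delay(original_delay_ps, rwf, age_index, rng, sigma=0.02):
    """Device Delay = Original Delay * RWF * AI * RV.

    rwf:       relative wearout factor for this device
    age_index: performance degradation factor for the current age
    rng:       a random.Random instance, passed in for reproducibility
    sigma:     spread of the random variable RV (assumed Gaussian around 1.0)
    """
    rv = rng.gauss(1.0, sigma)
    return original_delay_ps * rwf * age_index * rv


rng = random.Random(42)
# A 100 ps gate that wears 20% faster than average, at an age index of 1.5.
samples = [aged_delay(100.0, 1.2, 1.5, rng) for _ in range(5)]
print([round(s, 1) for s in samples])
```

Passing the generator in explicitly keeps repeated simulation runs reproducible, which matters when comparing latency traces across ages.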
10
Simulation Flow, Step 2: Wearout Simulation
[Figure not preserved in the transcript.]
11
Wearout Simulation Results
[Plots: Signal Latency (ps) and Sample Mean Latency (ps) versus Time (years).]
12
Exploiting Performance Degradation
Exponential moving average:
► EMA = α(sample - EMA_previous) + EMA_previous
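A minimal sketch of the update rule above, applied to a stream of latency samples. Seeding the average with the first sample and the value of α are assumptions; the slide only gives the recurrence.

```python
def ema_update(ema_previous, sample, alpha):
    """One step of EMA = alpha * (sample - EMA_previous) + EMA_previous."""
    return alpha * (sample - ema_previous) + ema_previous


def ema_series(samples, alpha):
    """Running EMA over a list of samples, seeded with the first sample (assumption)."""
    if not samples:
        return []
    ema = samples[0]
    out = []
    for sample in samples:
        ema = ema_update(ema, sample, alpha)
        out.append(ema)
    return out


# Noisy latency samples (ps) drifting upward; a small alpha smooths the noise.
latencies = [100.0, 102.0, 99.0, 103.0, 104.0, 101.0, 106.0]
print([round(v, 2) for v in ema_series(latencies, alpha=0.25)])
```

A smaller α weights history more heavily, so the average reacts slowly and rejects noise; a larger α tracks recent samples more closely.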
13
Trend Analysis
TRIX can be used to accurately track both local and long-term latency trends
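As a rough illustration, TRIX is commonly defined as the one-step rate of change of a triple-smoothed exponential moving average. The sketch below follows that common definition; the two smoothing factors used for the "local" and "long-term" trends are hypothetical values, not taken from the slides.

```python
def ema_series(samples, alpha):
    """Running EMA, seeded with the first sample (assumption)."""
    out = []
    ema = samples[0] if samples else 0.0
    for sample in samples:
        ema = alpha * (sample - ema) + ema
        out.append(ema)
    return out


def trix(samples, alpha):
    """Rate of change of the triple-smoothed EMA of the samples."""
    triple = ema_series(ema_series(ema_series(samples, alpha), alpha), alpha)
    return [(cur - prev) / prev for prev, cur in zip(triple, triple[1:])]


# Latency (ps) that is flat for a while and then starts to climb.
latencies = [100.0] * 10 + [100.0 + 0.5 * i for i in range(1, 11)]
local_trend = trix(latencies, alpha=0.5)    # reacts quickly (assumed factor)
global_trend = trix(latencies, alpha=0.05)  # reacts slowly (assumed factor)
print(round(local_trend[-1], 5), round(global_trend[-1], 5))
```

Triple smoothing suppresses short-lived fluctuations, so a sustained positive TRIX value suggests a real upward drift in latency rather than noise.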
14
Wearout Analysis
[Circuit diagram. Labels: input signal, Latency Sampling, TRIX_l Calculation, TRIX_g Calculation, Prediction. The figure also shows sampled bit values.]
15
System Integration
[Block diagram. Labels: Latency Sampling, TRIX_l Calculation, TRIX_g Calculation, Prediction.]
16
Dynamic Variation
Temperature
► 50 °C: ~4% increase in latency at 130 nm
Clock jitter
► Impact on latency varies
► Mean jitter typically modeled as 0
Worst-case variation would need to be sampled 12 times over 4 days
17
WDU Implementation
              WDU (1 Signal)   WDU (8 Signals)   OR1200 Core
Area (mm²)    0.014            0.057             1.28
Power (mW)    1.15             8.02              92.22
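The overhead implied by these figures can be worked out directly. This sketch only divides the WDU numbers by the OR1200 core numbers from the table; the dictionary layout and function name are illustrative.

```python
CORE = {"area_mm2": 1.28, "power_mw": 92.22}
WDU = {
    "1 signal": {"area_mm2": 0.014, "power_mw": 1.15},
    "8 signals": {"area_mm2": 0.057, "power_mw": 8.02},
}


def overhead_percent(wdu_value, core_value):
    """WDU cost expressed as a percentage of the core's cost."""
    return 100.0 * wdu_value / core_value


for name, cost in WDU.items():
    area = overhead_percent(cost["area_mm2"], CORE["area_mm2"])
    power = overhead_percent(cost["power_mw"], CORE["power_mw"])
    print(f"WDU ({name}): {area:.1f}% area, {power:.1f}% power")
```

Relative to the core, a single-signal WDU costs roughly 1.1% area and 1.2% power, and the eight-signal version roughly 4.5% area and 8.7% power.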
18
WDU Prediction Results
Each unit calibrated for a 30-year MTTF
The WDU flagged at least one output from each module prior to the MTTF
19
Lifetime Enhancement
20
Conclusion
Low-cost reliable system design
► Physical wearout mechanisms affect timing
► Failure prediction can be much cheaper than detection
Wearout detection unit:
► Online timing analysis is a good detector of wearout and predictor of failure
► Generic/self-calibrating
21
Simulation Results: Temperature and MTTF
22
Technology Scaling
Quickly shrinking feature sizes
Sharp increase in frequency
Slow decrease in supply voltage
[Figure: OR1200 Power Densities]