University of Michigan Electrical Engineering and Computer Science 1 Online Timing Analysis for Wearout Detection Jason Blome, Shuguang Feng, Shantanu.

Presentation transcript:

University of Michigan Electrical Engineering and Computer Science 1 Online Timing Analysis for Wearout Detection Jason Blome, Shuguang Feng, Shantanu Gupta, Scott Mahlke University of Michigan

2 Wearout Mechanisms
There are a lot of them:
► Electromigration (EM)
► Time-dependent dielectric breakdown (TDDB)
► Negative-bias temperature instability (NBTI)
► Hot carrier injection (HCI)
► …
All highly dependent on temperature and current density
► Both increasing fast!
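The temperature and current-density dependence can be illustrated with Black's equation for electromigration, MTTF = A * J^(-n) * exp(Ea / (k * T)). This is a minimal sketch and is not taken from the slides: the function name, the exponent n = 2, the activation energy of 0.9 eV, and the unit prefactor are illustrative assumptions, so only the ratios between results are meaningful.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def black_mttf(current_density, temp_kelvin, prefactor=1.0, n=2.0, activation_ev=0.9):
    """Relative electromigration MTTF from Black's equation.

    prefactor, n and activation_ev are illustrative assumptions,
    not values from the presentation.
    """
    thermal_term = math.exp(activation_ev / (BOLTZMANN_EV_PER_K * temp_kelvin))
    return prefactor * current_density ** (-n) * thermal_term


baseline = black_mttf(current_density=1.0, temp_kelvin=350.0)
hotter = black_mttf(current_density=1.0, temp_kelvin=370.0)
denser = black_mttf(current_density=1.5, temp_kelvin=350.0)

# Ratios below 1.0 mean a shorter expected lifetime than the baseline.
print("MTTF ratio at +20 K:", hotter / baseline)
print("MTTF ratio at 1.5x current density:", denser / baseline)
```

Both ratios come out below 1, matching the slide's point that rising temperature and current density shorten lifetime.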

3 Goals of this Research
Low-cost reliable system design
► How do physical wearout mechanisms progress?
► How do we determine that a device has failed?
► How do we maintain operation given failed components?

4 Traditional and Recent Approaches
Traditional detection techniques are expensive
► Redundant checking structures
Predictive techniques
► Canary circuits
► RAMP

5 Proposed Technique
Key Insight:
► Degradation in silicon → decrease in performance
► Long incubation time followed by rapid deterioration
Examples:
► TDDB: increases leakage, shifting voltage curves
► EM: increases resistance
► NBTI: shifts threshold voltage

6 Outline
Microprocessor model
Wearout simulation methodology
Wearout simulation results
The wearout detection unit (WDU)
WDU analysis
Conclusion

7 Simulation Setup
OpenRISC 1200
Area: 1.28 mm²
Power: 92.2 mW
Clock Frequency: 200 MHz
Data Cache: 8 KB
Instruction Cache: 8 KB

8 Simulation Flow Step 1: Temperature and Activity Analysis
[Flow diagram; block labels: Benchmark, Netlist, Timing, Parasitics, Synopsys VCS, Activity Trace, PrimePower, Power Trace, HotSpot, Temperature Trace]

9 Simulation Flow Step 2: Wearout Simulation
[Flow diagram; block labels: Benchmark, Netlist, Timing, Synopsys VCS, Temperature, Activity, MTTF Calculation, Relative Wearout Factors, Age Index, Wearout Simulation, Signal Latency Data]
Device Delay = Original Delay * RWF * AI * RV
► RWF: Relative amount of wearout for a device
► AI: Performance degradation parameterized by age
► RV: Random variable
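The delay model above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' simulator: the function name `aged_delay` is hypothetical, and because the slide does not say how RV is distributed, it is drawn here from a Gaussian centered on 1.0 with an assumed spread `sigma`.

```python
import random


def aged_delay(original_delay_ps, rwf, age_index, rng, sigma=0.02):
    """Device Delay = Original Delay * RWF * AI * RV.

    The Gaussian form of RV and the default sigma are assumptions;
    the presentation only calls RV a random variable.
    """
    rv = rng.gauss(1.0, sigma)
    return original_delay_ps * rwf * age_index * rv


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical device: 100 ps nominal delay, 20% more wearout than average,
# evaluated at three made-up age index values.
for age_index in (1.0, 1.1, 1.5):
    print(age_index, aged_delay(100.0, rwf=1.2, age_index=age_index, rng=rng))
```

Setting `sigma=0.0` removes the random term, which makes it easy to check the deterministic part of the product.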

10 Simulation Flow Step 2: Wearout Simulation

11 Wearout Simulation Results
[Plot; axis labels: Time (years), Signal Latency (ps), Sample Mean Latency (ps)]

12 Exploiting Performance Degradation
Exponential moving average:
► EMA = α(sample - EMA_previous) + EMA_previous
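The EMA update above translates directly into code. A minimal sketch: the function names are hypothetical, and seeding the average with the first sample is an assumption, since the slide does not say how the EMA is initialized.

```python
def ema_update(ema_previous, sample, alpha):
    """One step of EMA = alpha * (sample - EMA_previous) + EMA_previous."""
    return alpha * (sample - ema_previous) + ema_previous


def ema_series(samples, alpha):
    """Run the update over a list of latency samples.

    Seeding with samples[0] is an assumption made for this sketch.
    """
    ema = samples[0]
    smoothed = []
    for sample in samples:
        ema = ema_update(ema, sample, alpha)
        smoothed.append(ema)
    return smoothed


# Made-up latency samples (ps) with one noisy spike.
latencies_ps = [100.0, 100.0, 120.0, 100.0, 100.0]
print(ema_series(latencies_ps, alpha=0.25))
```

A smaller α makes the average react more slowly, so a single noisy sample moves it less.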

13 Trend Analysis
TRIX can be used to accurately track both local and long-term latency trends
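The slide does not define TRIX. As commonly defined, it is the one-step relative change of a triple-smoothed exponential moving average, and the sketch below follows that definition. The function names, the seeding choice, and the two smoothing constants used for a "local" and a "long-term" view are assumptions for illustration, not values from the presentation.

```python
def ema(values, alpha):
    """Exponential moving average, seeded with the first value (an assumption)."""
    current = values[0]
    smoothed = []
    for value in values:
        current = alpha * (value - current) + current
        smoothed.append(current)
    return smoothed


def trix(samples, alpha):
    """One-step relative change of a triple-smoothed EMA."""
    triple = ema(ema(ema(samples, alpha), alpha), alpha)
    return [(cur - prev) / prev for prev, cur in zip(triple, triple[1:])]


# Made-up latency samples (ps): flat, then steadily rising.
latencies_ps = [100.0] * 5 + [100.0 + 2.0 * step for step in range(1, 11)]

local_trend = trix(latencies_ps, alpha=0.5)       # assumed fast smoothing
long_term_trend = trix(latencies_ps, alpha=0.05)  # assumed slow smoothing

print("local:", local_trend[-1])
print("long-term:", long_term_trend[-1])
```

With these inputs both trends are zero while latency is flat and turn positive once it starts to rise, with the faster smoothing constant reacting sooner.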

14 Wearout Analysis
[Block diagram; labels: Circuit input signal, Latency Sampling, TRIX_l Calculation, TRIX_g Calculation, Prediction]

15 System Integration
[Block diagram; labels: Latency Sampling, TRIX_l Calculation, TRIX_g Calculation, Prediction]

16 Dynamic Variation
Temperature
► 50 °C → ~4% increase in latency at 130 nm
Clock jitter
► Impact on latency varies
► Mean jitter typically modeled as 0
Worst-case variation would need to be sampled 12 times over 4 days

17 WDU Implementation
[Table; columns: WDU (1 Signal), WDU (8 Signals), OR1200 Core; rows: Area (mm²), Power (mW)]

18 WDU Prediction Results
Each unit calibrated for a 30-year MTTF
The WDU flagged at least one output from each module prior to the MTTF

19 Lifetime Enhancement

20 Conclusion
Low-cost reliable system design
► Physical wearout mechanisms affect timing
► Failure prediction can be much cheaper than detection
Wearout detection unit:
► Online timing analysis is a good detector of wearout and predictor of failure
► Generic and self-calibrating

21 Simulation Results: Temperature and MTTF

22 Technology Scaling
Quickly shrinking feature sizes
Sharp increase in frequency
Slow decrease in supply voltage
[Figure: OR1200 Power Densities]
