Published by Eddy Brann. Modified over 9 years ago.
1
Using Hardware Vulnerability Factors to Enhance AVF Analysis
Vilas Sridharan, RAS Architecture and Strategy, AMD, Inc.
David R. Kaeli, ECE Department, Northeastern University
International Symposium on Computer Architecture, June 23, 2010
2
What is this talk about?
Transient faults
  Cause data corruption without damage to the underlying device
  Modeled as a bit flip in the microarchitecture (0 → 1 or vice versa)
Vulnerability analysis
  Determines which faults matter and which do not
  Allows us to make informed decisions about which structures to protect
  We do this today using the Architectural Vulnerability Factor (AVF)
This talk focuses primarily on the techniques
  For results, please refer to the paper
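The bit-flip fault model described on this slide can be sketched in a few lines of Python. The function name `flip_bit` and the 8-bit example value are illustrative assumptions, not part of the talk.

```python
def flip_bit(value: int, bit: int) -> int:
    """Return `value` with the given bit position inverted (0 -> 1 or 1 -> 0)."""
    if bit < 0:
        raise ValueError("bit position must be non-negative")
    return value ^ (1 << bit)


# Hypothetical 8-bit register contents, chosen only for illustration.
stored = 0b1010_0000
faulty = flip_bit(stored, 5)    # bit 5 is 1 in `stored`; the fault clears it
restored = flip_bit(faulty, 5)  # a second flip of the same bit undoes the fault
```

Whether such a flip matters depends on whether the corrupted bit is ever consumed, which is the question vulnerability analysis tries to answer.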
3
Architectural Vulnerability Factor (AVF)
The fraction of bits in a hardware structure H that, when corrupted, will result in incorrect program output (an error)
  These bits are required for Architecturally Correct Execution (ACE bits)
  Other bits are unACE bits
B_H: Size of H in bits
N: Number of cycles
S. S. Mukherjee et al., Int’l Symposium on Microarchitecture, Dec 2003
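With the symbols defined on this slide, the AVF of structure H is commonly written as the time-averaged fraction of ACE bits. The form below is a reconstruction from the slide's definition and the cited Mukherjee et al. paper. The wording of the per-cycle term is notation assumed here.

```latex
\mathrm{AVF}_H = \frac{\sum_{n=1}^{N} \left(\text{ACE bits in } H \text{ at cycle } n\right)}{B_H \times N}
```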
4
Motivating Example
[Figure panels: Constant workload / Variable microarchitecture; Variable workload / Constant microarchitecture]
AVF depends on hardware and on software
This talk focuses on quantifying hardware vulnerability
V. Sridharan and D. R. Kaeli, Int’l Symposium on High Performance Computer Architecture, Feb 2009
5
Outline
  Introduction
  Quantifying Hardware Vulnerability
  Using HVF for Microarchitectural Exploration
  Estimating AVF at Runtime
  Conclusions
6
A Typical System
[Figure: system layers User Program, Operating System, Virtual Machine, Microarchitecture, Devices; vulnerability labels AVF and TVF]
7
The System Vulnerability Stack
[Figure: system layers User Program, Operating System, Virtual Machine, Microarchitecture, Devices; vulnerability factors Timing VF, Program VF, Operating System VF, Virtual Machine VF, Hardware VF; interface labels ABI and ISA; label Functional]
VF = Vulnerability Factor
ISA = Instruction Set Architecture
ABI = Application Binary Interface
8
Fault Visibility
[Figure: structures shown are Physical Registers, Physical Memory, Reorder Buffer, Issue Queue, Load Buffer, Store Buffer, Architected Registers, Process (Virtual) Memory; labels are Hardware-visible state, Program-visible state, Hardware-visible fault, Program-visible fault]
9
Consequences of a Visible Fault
[Figure: structures shown are Physical Registers, Physical Memory, Reorder Buffer, Issue Queue, Load Buffer, Store Buffer, Architected Registers, Process (Virtual) Memory; fault labels are Hardware-visible fault, Program-visible fault, Masked fault, Exposed fault, Activated fault]
10
Hardware Vulnerability Factor
The fraction of activated and exposed hardware-visible faults in hardware structure H
  These are the faults that cause a perturbation of the ISA
  Masked hardware-visible faults do not contribute to HVF
B_H: Size of H in bits
N: Number of cycles
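By analogy with the AVF definition, HVF can be read as a time-averaged fraction over B_H bits and N cycles. What follows is a minimal sketch, assuming a simulator can report, for each cycle, how many bits of H would produce an activated or exposed fault if flipped. The function name `hvf` and the sample trace are hypothetical.

```python
from typing import Sequence


def hvf(vulnerable_bits_per_cycle: Sequence[int], size_bits: int) -> float:
    """Time-averaged fraction of vulnerable bits in a structure of `size_bits` bits."""
    n_cycles = len(vulnerable_bits_per_cycle)
    if n_cycles == 0 or size_bits <= 0:
        raise ValueError("need at least one cycle and a positive structure size")
    if any(b < 0 or b > size_bits for b in vulnerable_bits_per_cycle):
        raise ValueError("per-cycle count must lie between 0 and size_bits")
    # Sum of vulnerable bit-cycles divided by total bit-cycles (B_H * N).
    return sum(vulnerable_bits_per_cycle) / (size_bits * n_cycles)


# Hypothetical 4-cycle trace for a 64-bit structure (values invented for illustration).
trace = [0, 32, 32, 64]
result = hvf(trace, 64)
```

Masked faults never enter the per-cycle counts, which matches the slide's statement that they do not contribute to HVF.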
11
Outline
  Introduction
  Quantifying Hardware Vulnerability
  Using HVF for Microarchitectural Exploration
  Estimating AVF at Runtime
  Conclusions
12
Using HVF for Microarchitectural Exploration
Full AVF analysis is possible at hardware design time
  Software workloads are available at design time
Can HVF help?
  Provides additional insight to hardware designers
  Accelerates AVF simulation
13
Additional Insight Generated by HVF
[Figure: timeline over cycles 0 to 10 showing Write and Read events on P1 to P4, with one read marked Live and three reads marked Dead; AVF = 10%]
14
Additional Insight Generated by HVF
[Figure: timeline over cycles 0 to 10 showing Write and Read events on P1 to P4; annotated HVF = 40% and HVF = 70%]
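The gap between a low AVF and a higher HVF can be illustrated with a small lifetime-analysis sketch. It assumes one common convention: an entry counts toward the AVF-style figure from its write until its last live read, and toward the HVF-style figure from its write until its last read of any kind, because hardware alone cannot tell that a consumer is dead. The events below are invented and do not reproduce the slide's figure or its percentages. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RegisterLifetime:
    write_cycle: int
    reads: List[Tuple[int, bool]]  # (cycle, is_live); is_live=False marks a dead read


def vulnerable_cycles(life: RegisterLifetime, count_dead_reads: bool) -> int:
    """Cycles between the write and the last read that counts under the chosen view."""
    cycles = [c for c, is_live in life.reads if is_live or count_dead_reads]
    return max(0, max(cycles) - life.write_cycle) if cycles else 0


def vulnerability(lifetimes: List[RegisterLifetime], total_cycles: int,
                  count_dead_reads: bool) -> float:
    """Vulnerable entry-cycles divided by all entry-cycles."""
    busy = sum(vulnerable_cycles(l, count_dead_reads) for l in lifetimes)
    return busy / (len(lifetimes) * total_cycles)


# Hypothetical events for four entries over a 10-cycle window.
regs = [
    RegisterLifetime(write_cycle=1, reads=[(3, True)]),   # live read
    RegisterLifetime(write_cycle=2, reads=[(7, False)]),  # dead read
    RegisterLifetime(write_cycle=0, reads=[(4, False)]),  # dead read
    RegisterLifetime(write_cycle=5, reads=[(8, False)]),  # dead read
]
avf_like = vulnerability(regs, 10, count_dead_reads=False)  # only live reads count
hvf_like = vulnerability(regs, 10, count_dead_reads=True)   # every read counts
```

With these invented events the AVF-style figure is 2/40 and the HVF-style figure is 14/40, so the hardware view exposes occupancy that the workload's dead reads hide.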
15
Insight from HVF: Real-World Example
[Figure: equake and mgrid; regions of similar register usage; AVF ≈ 8% and AVF ≈ 15%]
16
Outline
  Introduction
  Quantifying Hardware Vulnerability
  Using HVF for Microarchitectural Exploration
  Estimating AVF at Runtime
  Conclusions
17
Estimating AVF at Runtime
Allows a system to adapt to changing vulnerability environment
  Enable redundancy when AVF is high
  Increase performance when AVF is low
Prior predictors don’t let software designers influence AVF estimate
  Predictors are entirely encoded in hardware
  Rely on training benchmarks or invariants (e.g., stored data is vulnerable)
  Assumptions fall apart in atypical programs (e.g., SW redundancy, games)
We split AVF estimation into HVF and PVF components
  Allow software designers to measure PVF using a profiling step
  Estimate HVF in hardware at runtime using an HVF Monitor Unit
  < 3% error between measured and estimated AVF (see paper for details)
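The adaptation policy on this slide can be sketched as follows. The slide does not say how the HVF and PVF components are combined, so the product used here is a simplifying assumption. The threshold, function names, and sample readings are likewise hypothetical. See the paper for the actual estimator.

```python
def estimate_avf(hvf_sample: float, profiled_pvf: float) -> float:
    """Assumed combination rule: AVF estimate = HVF x PVF (illustrative only)."""
    for name, v in (("hvf_sample", hvf_sample), ("profiled_pvf", profiled_pvf)):
        if not 0.0 <= v <= 1.0:
            raise ValueError(f"{name} must be in [0, 1]")
    return hvf_sample * profiled_pvf


def choose_mode(avf_estimate: float, threshold: float = 0.2) -> str:
    """Enable redundancy when the estimated AVF is high, otherwise favor performance."""
    return "redundancy" if avf_estimate >= threshold else "performance"


# Hypothetical (HVF from a hardware monitor, PVF from a software profiling step) pairs.
samples = [(0.70, 0.50), (0.40, 0.25)]
modes = [choose_mode(estimate_avf(h, p)) for h, p in samples]
```

Keeping the PVF input separate from the hardware-measured HVF is what lets a software designer influence the estimate, for example by supplying a low PVF for a program that already uses software redundancy.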
18
Summary
Transient faults are a challenge for all processor manufacturers
AVF analysis is a key part of understanding transient fault behavior
HVF quantifies hardware vulnerability to transient faults
HVF provides additional insight to hardware designers
HVF simulation can accelerate AVF modeling during hardware design
Runtime AVF estimation can be split into HVF and PVF components
Software designers can influence runtime AVF estimates
HVF generates meaningful insight into system vulnerability to transient faults
19
Using Hardware Vulnerability Factors to Enhance AVF Analysis
Questions?
20
References
V. Sridharan and D. R. Kaeli, Using Hardware Vulnerability Factors to Enhance AVF Analysis, Int’l Symp. on Computer Architecture (ISCA-37), June 2010.
V. Sridharan and D. R. Kaeli, Eliminating Microarchitectural Dependency from Architectural Vulnerability, Int’l Symp. on High-Performance Computer Architecture (HPCA-15), February 2009.
A. Dixit et al., Trends from Ten Years of Soft Error Experimentation, Workshop on Silicon Errors in Logic – System Effects, March 2009.
V. Sridharan and D. R. Kaeli, The Effect of Input Data on Program Vulnerability, Workshop on Silicon Errors in Logic – System Effects (SELSE-5), March 2009.
V. Sridharan and D. R. Kaeli, Reliability in the Shadow of Long-Stall Instructions, Workshop on Silicon Errors in Logic – System Effects (SELSE-3), April 2007.
R. Baumann, Radiation-Induced Soft Errors in Advanced Semiconductor Technologies, IEEE Trans. on Device and Materials Reliability, September 2005.
S. S. Mukherjee et al., A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor, Int’l Symp. on Microarchitecture (MICRO-36), December 2003.
P. Roche et al., Comparisons of Soft Error Rate for SRAMs in Commercial SOI and Bulk Below the 130-nm Technology Node, IEEE Trans. on Nuclear Science, December 2003.
J. D. Dirk et al., Terrestrial Thermal Neutrons, IEEE Trans. on Nuclear Science, December 2003.