Architectural Vulnerability Factor (AVF) Computation for Address-Based Structures

Presentation transcript:

Architectural Vulnerability Factor (AVF) Computation for Address-Based Structures
Arijit Biswas, Paul Racunas, Shubu Mukherjee (FACT Group, DEG, Intel)
Joel Emer (VSSAD, Intel)
Razvan Cheveresan (Sun Microsystems; Intern, FACT Group)
Ram Rangan (Princeton University; Intern, FACT Group)

FACT Group, Intel
Soft errors are a serious problem
– Assuming a certain error rate, failure rate of whole chip increases 12x
[Figure: Moore's Law graph with a "GAP" annotation. Chart based on 200,000 latches as used in the Fujitsu SPARC Processor (2003)]

All bits are not created equal!
[Figure: a stored bit (1, 0). Particle Strike Causes Bit Flip!]

All bits are not created equal!
Particle Strike Causes Bit Flip!
[Flowchart]
Bit Read?
– no: benign fault, no error
– yes: Bit has error protection?
   none: Does bit matter?
      yes: Silent Data Corruption
      no: benign fault, no error
   Detection only: Does bit matter?
      yes: True Detected Unrecoverable Error
      no: False Detected Unrecoverable Error
   Detection & Correction: benign fault, no error
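
The decision flow on this slide can be written down as a small function. This is only an illustrative sketch: the names `Protection` and `classify_bit_flip` are made up for the example, and the branch order is reconstructed from the flattened flowchart labels rather than taken from any tool the authors used.

```python
from enum import Enum


class Protection(Enum):
    # Hypothetical labels for the three protection cases on the slide.
    NONE = "none"
    DETECTION_ONLY = "detection only"
    DETECTION_AND_CORRECTION = "detection and correction"


def classify_bit_flip(bit_is_read: bool, protection: Protection, bit_matters: bool) -> str:
    """Return the outcome of a single bit flip, following the slide's flowchart."""
    if not bit_is_read:
        # A flipped bit that is never read cannot affect anything.
        return "benign fault, no error"
    if protection is Protection.DETECTION_AND_CORRECTION:
        # The fault is detected and corrected before it can propagate.
        return "benign fault, no error"
    if protection is Protection.NONE:
        return "silent data corruption" if bit_matters else "benign fault, no error"
    # Detection only: the error is always flagged, whether or not the bit mattered.
    if bit_matters:
        return "true detected unrecoverable error"
    return "false detected unrecoverable error"


if __name__ == "__main__":
    print(classify_bit_flip(True, Protection.NONE, True))
    print(classify_bit_flip(True, Protection.DETECTION_ONLY, False))
```

The "Does bit matter?" question is the part AVF quantifies: it is the input `bit_matters` here, and the rest of the talk is about estimating how often it is true.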

Does bit matter?
Architectural Vulnerability Factor (AVF)
– Probability that a bit flip will cause user-visible error
Soft Error Rate of a Structure = (AVF_bit) x (# Bits) x (Intrinsic Error Rate_bit)
Reducing AVF reduces SER
– High AVF indicates need for protection
– Low AVF can help remove protection hardware
SER Protection can be Expensive
– Impacts Area, Power, Performance, Design Time
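
As a worked instance of the SER product on this slide, here is a minimal sketch. The function name `structure_ser` and the numbers in the usage lines are hypothetical, chosen only to show the arithmetic; the intrinsic per-bit rate is whatever unit the designer uses (for example FIT per bit), and the result carries the same unit.

```python
def structure_ser(avf_bit: float, num_bits: int, intrinsic_rate_per_bit: float) -> float:
    """Soft error rate of a structure = AVF_bit x (# bits) x (intrinsic error rate per bit)."""
    if not 0.0 <= avf_bit <= 1.0:
        raise ValueError("AVF is a probability and must lie in [0, 1]")
    if num_bits < 0 or intrinsic_rate_per_bit < 0:
        raise ValueError("bit count and intrinsic rate must be non-negative")
    return avf_bit * num_bits * intrinsic_rate_per_bit


if __name__ == "__main__":
    # Hypothetical inputs: a 1,000-bit structure with a made-up per-bit rate of 0.001.
    high = structure_ser(0.29, 1000, 0.001)
    low = structure_ser(0.09, 1000, 0.001)
    print(high, low)  # the lower-AVF structure has the proportionally lower SER
```

Because the relation is a plain product, halving the AVF of a structure halves its SER contribution, which is why a low measured AVF can justify leaving a structure unprotected.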

Simple Examples
Committed Program Counter: AVF ~ 100%
Branch Predictor: AVF = 0%

Complex Examples
Instruction Queue: AVF = 29%
Execution Units: AVF = 9%
Used a new concept
– Architecturally Correct Execution (ACE)

Architecturally Correct Execution (ACE)
ACE path requires only a subset of values to flow correctly through the program's data flow graph (and the machine)
Anything else (un-ACE path) can be derated away
[Figure: data flow graph from Program Input to Program Outputs]

Example of un-ACE instruction: Dynamically Dead Instruction
[Figure label: Dynamically Dead Instruction]
Most bits of an un-ACE instruction do not affect program output

ACE Breakdown of Instruction Queue
Average across all of Spec2K slices for an IA64-like processor
ACE % = AVF = 29%

A New AVF Analysis – Address-Based Structures
Caches, data translation buffers, store buffers
– Make up large portions of a modern chip
Simple ACE analysis is no longer enough
Data & Tag structures need new concepts
– Extended Lifetime Analysis
– Hamming-Distance-1 Analysis
– Cooldown
– AVF Reduction - Flushing

Lifetime Analysis
Idle is unACE
– Assuming all time intervals are equal
– For 3/5 of the lifetime the bit is valid
– Gives a measure of the structure's utilization
   Number of useful bits
   Amount of time useful bits are resident in structure
Valid for a particular trace
[Timeline labels: Idle, Valid; events: Fill, Read, Evict]

Lifetime Analysis of Write-through Data Cache
Valid is not necessarily ACE
ACE % = AVF = 2/5 = 40%
Example Lifetime Components
– ACE: fill-to-read, read-to-read
– unACE: idle, read-to-evict, write-to-evict
[Timeline labels: Idle; events: Fill, Read, Evict; Write-through Data Cache]

Lifetime Analysis of Write-through Data Cache
Data ACEness is a function of instruction ACEness
Second Read is by an unACE instruction
AVF = 1/5 = 20%
[Timeline labels: Idle; events: Fill, Read, Evict; Write-through DCache]
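
The 2/5 and 1/5 figures from the two write-through slides can be reproduced with a short lifetime-analysis sketch. It assumes a five-unit timeline with a fill at t=1, reads at t=2 and t=3, and an evict at t=4, which is one reading of the slide's timeline and not something the transcript states outright. The event encoding and the function name `write_through_data_avf` are hypothetical, and treating write-to-read as ACE is an added assumption (the slide lists only fill-to-read and read-to-read).

```python
from typing import List, Optional, Tuple

# (time, kind, by_ace_instruction); kind is "fill", "read", "write" or "evict".
Event = Tuple[int, str, bool]


def write_through_data_avf(events: List[Event], total_time: int) -> float:
    """Fraction of the lifetime that is ACE for one entry of a write-through cache."""
    ace_time = 0
    prev_time: Optional[int] = None
    prev_kind: Optional[str] = None
    for time, kind, by_ace_instruction in events:
        if kind == "read" and not by_ace_instruction:
            # A read by an unACE instruction does not make the data ACE,
            # so it is ignored and the current interval stays open.
            continue
        if kind == "read" and prev_kind in ("fill", "read", "write"):
            # Interval ending in an ACE read: the data had to be correct throughout.
            ace_time += time - prev_time
        # Intervals ending in a write or an evict are unACE for write-through data.
        prev_time, prev_kind = time, kind
    return ace_time / total_time


if __name__ == "__main__":
    both_reads_ace = [(1, "fill", True), (2, "read", True), (3, "read", True), (4, "evict", True)]
    second_read_unace = [(1, "fill", True), (2, "read", True), (3, "read", False), (4, "evict", True)]
    print(write_through_data_avf(both_reads_ace, 5))     # 0.4
    print(write_through_data_avf(second_read_unace, 5))  # 0.2
```

Skipping unACE reads, instead of treating them as interval boundaries, keeps a later ACE read correct: the data must then have been intact all the way back to the previous fill, write, or ACE read.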

Tags are Hard
A fault associated with a tag that is nominally associated with a particular instruction can impact the correct execution of a different, independent instruction
False Negatives are an error only if writeback is necessary
– Uses standard lifetime analysis
False Positives always result in error
– Need bit-level analysis

False Positive
Expected Tag Miss, but got Hit: Error
How do you compute the AVF? Fault injection?
[Figure: Expect: Incoming Address vs. Tag Address, MISS. Acquire: Incoming Address vs. Tag Address, HIT.]

Hamming-Distance-1 Analysis
Assuming a single-bit error model
Now we can use lifetime analysis on the identified bit(s)
[Figure: Incoming Address compared against the Tag Array; Hamming-Distance-1 Match]
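
Under the single-bit error model, a false positive is only possible for a stored tag that differs from the incoming address tag in exactly one bit. The sketch below finds those tags and the one bit position involved; the function name `distance_one_matches` and the 4-bit example tags are hypothetical, and a real tag array would restrict the search to the set being looked up.

```python
from typing import List, Tuple


def distance_one_matches(tag_array: List[int], incoming_tag: int) -> List[Tuple[int, int]]:
    """Return (entry index, bit position) pairs where one flipped tag bit would cause a false hit."""
    matches = []
    for index, tag in enumerate(tag_array):
        diff = tag ^ incoming_tag
        # diff is a power of two exactly when the tags differ in a single bit.
        if diff != 0 and diff & (diff - 1) == 0:
            matches.append((index, diff.bit_length() - 1))
    return matches


if __name__ == "__main__":
    tags = [0b1010, 0b1011, 0b0110]
    print(distance_one_matches(tags, 0b1110))  # [(0, 2), (2, 3)]
```

Only the identified bit of each matching entry then accumulates ACE time through lifetime analysis, which is consistent with the later observation that tag AVFs come out lower than expected.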

Edge Effects
Simulation introduces unknown component
– Simulation not run to completion
– Only execute small segment of code
Worst Case AVF = Known AVF + Unknown AVF
How do we reduce/eliminate unknown?
[Timeline labels: Idle, Unknown; events: Fill, Read, Evict; Sim End; Not Simulated]

Cooldown
Run simulation beyond end interval
– Any bits that were already valid (the unknown bits) are resolved
Trend: unknown AVF primarily resolves to unACE
Best Estimate AVF = Known AVF after Cooldown
[Figure: 10 Million Instructions Simulation followed by 10 Million Instructions Cooldown; chart compares No Cooldown and Cooldown]
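
A rough sketch of how cooldown narrows the gap between the known AVF and the worst-case AVF. It assumes each entry still live at the end of simulation has one unknown interval, and that the first cooldown event touching that entry decides it: a read makes the interval ACE, an evict makes it unACE, and no event leaves it unknown. The function name, the dictionaries, and the numbers are hypothetical.

```python
from typing import Dict, Tuple


def avf_after_cooldown(
    known_ace_time: int,
    unknown_time_by_entry: Dict[int, int],
    first_cooldown_event: Dict[int, str],
    total_time: int,
) -> Tuple[float, float]:
    """Return (known AVF, worst-case AVF) once cooldown events are taken into account."""
    ace_time = known_ace_time
    still_unknown = 0
    for entry, unknown_time in unknown_time_by_entry.items():
        event = first_cooldown_event.get(entry)
        if event == "read":
            ace_time += unknown_time       # resolved to ACE
        elif event is None:
            still_unknown += unknown_time  # cooldown never touched this entry
        # An "evict" resolves the interval to unACE, so nothing is added.
    return ace_time / total_time, (ace_time + still_unknown) / total_time


if __name__ == "__main__":
    unknown = {0: 10, 1: 20, 2: 5}
    # Without cooldown, nothing is resolved: worst case = known + unknown.
    print(avf_after_cooldown(30, unknown, {}, 100))
    # With cooldown, entry 0 is evicted and entry 1 is read.
    print(avf_after_cooldown(30, unknown, {0: "evict", 1: "read"}, 100))
```

In the second call the gap between the two values shrinks because only entry 2 remains unresolved, which mirrors the slide's point that the known AVF after cooldown is the best estimate.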

Data AVFs (Average)
STB AVF lower due to large idle component and bytemasks
DTB AVF higher due to high average utilization
DCache (WB) AVF higher than DCache (WT) since dirty bytes still ACE after last read

Data AVF of DTB
Large variability in AVF
Ranges from ~0% to 80%
Based on structure utilization by benchmark

Tag AVFs (Average)
Tag AVFs lower than expected for DTB and DCache (WT)
– Only Hamming-Distance-1 matches contribute ACE time
Tag AVFs higher than data for STB and DCache (WB)
– Dynamically dead tags are still ACE for dirty bytes

Tag AVF of DTB
AVFs surprisingly small, little variation
Protection added to DTB CAMs prior to AVF calculation (large # bits)
AVF calculation shows NO protection was needed in this case

AVF Observations
DTB and Write-through Data Cache
– Typically Tag AVF < Data AVF
   only Hamming-distance-1 hits contribute to Tag AVF
   dynamically dead data are unACE
STB and Write-back Data Cache
– Typically Tag AVF ≥ Data AVF
   Tag AVF ACE till eviction if line is dirty
   dynamically dead data can be ACE
   Bytemasks and writes may make certain bytes of data unACE while all bits of tag are always ACE

AVF Reduction: Flushing
Flushing (emulates a context switch)
– Also eliminates unknowns by flushing all live entries at end of simulation
Main concept: Transform part of ACE time into unACE at the Expense of some Performance
[Timeline labels: Idle, ACE; events: Fill, Read, Evict, Flush, Fill]
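
One way to see why flushing turns ACE time into unACE time is the simplified model below. It is an illustration under stated assumptions, not the authors' simulator: entries hold clean (write-through) data, a flush takes zero time, flushes happen at every multiple of a hypothetical `flush_period`, and a flushed entry is refilled only when it is next read. Under those assumptions any ACE interval that contains a flush point stops being ACE, because the part before the flush now ends in an eviction and the part after it is idle until the refill. Dirty write-back lines, which must still be written back at the flush, are not modeled.

```python
from typing import List, Tuple


def ace_time_with_flushing(ace_intervals: List[Tuple[int, int]], flush_period: int) -> int:
    """Total ACE time left after periodic flushing, for clean data only."""
    if flush_period <= 0:
        raise ValueError("flush_period must be positive")
    remaining = 0
    for start, end in ace_intervals:
        # First flush point strictly after the start of the interval.
        next_flush = (start // flush_period + 1) * flush_period
        if next_flush >= end:
            remaining += end - start  # no flush falls inside the interval
        # Otherwise a flush interrupts the interval and all of it becomes unACE.
    return remaining


if __name__ == "__main__":
    intervals = [(10, 40), (90, 130), (150, 180)]
    print(ace_time_with_flushing(intervals, 10**9))  # 100: effectively no flushing
    print(ace_time_with_flushing(intervals, 100))    # 60: the (90, 130) interval is lost
```

The performance cost mentioned on the slide comes from the extra misses after each flush, which this sketch does not attempt to estimate.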

AVF Reduction: Flushing
– >50% AVF reduction for 100K cycle Flush (Flush takes 0 time)
Max IPC reduction: 1.77% DTB, 1.25% WT/WB DCache
Avg IPC reduction: 0.56% DTB, 0.19% WT/WB DCache
[Chart panels: Data, Tags; series: No Flushing, 5M cycle Flush, 1M cycle Flush, 100K cycle Flush]

Summary
SER is an ever-increasing problem
– Need standard, quantitative way to evaluate design cost of adding protection/recovery to structures
AVF Gives us a Quantitative way to Measure the cost of adding Protection
Presented a Methodology to Compute the AVF of Address-Based Structures
– Lifetime Analysis
– False Negatives and False Positives
   Hamming-Distance-1 Analysis for False Positives
– Edge Effects and Cooldown
   Analogous to Warmup
– AVF Reduction - Flushing

Backup Slides

Simulation Setup (Backup)
Simulated regions of several Spec2000 benchmarks for 10 million instructions
Simulated AVFs for 3 address-based structures on an IA64-like processor using ASIM
– Data Translation Buffer (DTB)
   RAM and CAM arrays
   128 entry, 92 bits/entry
– Store Buffer (STB)
   Data and Address arrays
   32 entry, 16 bytes/entry
– Level 1 Data Cache (DCache)
   Data and Tags
   Simulated both Write-Through and Write-Back modes
   16 KB, 4-way set associative, 32 byte lines
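
The geometries above translate into bit counts, the "# Bits" term of the SER product, as in the following sketch. The `ArrayGeometry` class is a hypothetical helper. Two readings are assumed rather than stated in the slide: that the 92 bits per DTB entry cover the RAM and CAM arrays together, and that the 16 bytes per STB entry refer to the data array only. Tag and address array widths are not given, so they are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayGeometry:
    """Hypothetical description of one storage array: entry count and entry width."""
    name: str
    entries: int
    bits_per_entry: int

    @property
    def total_bits(self) -> int:
        return self.entries * self.bits_per_entry


# DTB: 128 entries, 92 bits per entry (assumed to be RAM plus CAM).
dtb = ArrayGeometry("DTB", entries=128, bits_per_entry=92)

# STB data array: 32 entries of 16 bytes each.
stb_data = ArrayGeometry("STB data", entries=32, bits_per_entry=16 * 8)

# L1 data cache: 16 KB of data in 32-byte lines, 4-way set associative.
cache_bytes, line_bytes, ways = 16 * 1024, 32, 4
num_lines = cache_bytes // line_bytes
num_sets = num_lines // ways
dcache_data = ArrayGeometry("DCache data", entries=num_lines, bits_per_entry=line_bytes * 8)

if __name__ == "__main__":
    for array in (dtb, stb_data, dcache_data):
        print(array.name, array.total_bits)
    print("DCache sets:", num_sets)
```

The data cache holds far more bits than the other two structures, so for a given AVF and per-bit error rate it contributes the largest share of SER.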

Lifetime Analysis – DCache (WT)
Lifetime Breakdown Tracks Data AVFs
Red components – unACE
Black components – ACE
FP Benchmarks show a low utilization of the cache (large Idle component)

Lifetime Analysis – STB
STB Lifetime shows a low utilization
STB Address AVF always higher than Data AVF due to Bytemasks
On Average, 6 bytes out of every 16 byte valid entry are used (valid)
Red components – unACE
Black components – ACE

Data AVF for Write-through Data Cache (Backup)

Data AVF for Write-back Data Cache (Backup)

Data AVF for STB (Backup)

Data AVF for DTB (Backup)

Tag AVF for Write-through Data Cache (Backup)

Tag AVF for Write-back Data Cache (Backup)

Tag AVF for STB (Backup)

Tag AVF for DTB (Backup)