Using Loop Invariants to Detect Transient Faults in the Data Caches Seung Woo Son, Sri Hari Krishna Narayanan and Mahmut T. Kandemir Microsystems Design.

Presentation transcript:

Using Loop Invariants to Detect Transient Faults in the Data Caches
Seung Woo Son, Sri Hari Krishna Narayanan, and Mahmut T. Kandemir
Microsystems Design Lab, The Pennsylvania State University

LCTES04-Student Forum
Related Work – H/W
Hardware fault detection usually involves a combination of:
- Information redundancy (e.g., ECC, parity)
- Temporal redundancy (e.g., executing the same instruction twice in the same functional unit, but at different times)
- Spatial redundancy (e.g., executing the same instruction in different functional units)

Related Work – S/W
- Replicating the program execution and checking the results
  - Recovery Blocks
  - N-Version Programming
- Introducing control code into the program
  - ABFT (algorithm-based fault tolerance)
  - Assertions
  - Code Flow Checking

Our Approach
Use loop invariants to detect transient faults.
- What is a loop invariant? It is data or a property that does not change during the execution of a loop.
- Why target loops? Loops make up 90% of the execution time.
- Our usage of invariants: if the value of an invariant has changed, a soft error has occurred and modified the data in the cache. In that case, re-execute the affected iterations.
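
The detect-and-re-execute idea can be sketched as follows. This is a minimal illustration, not the authors' checker code: the function names, the `inject_fault` hook, and the retry limit are all hypothetical, and the expected value of the invariant is assumed to be known when the loop starts.

```python
def make_single_fault(at_index, bit):
    """Hypothetical fault hook: flips one bit of the value once, at iteration `at_index`."""
    state = {"fired": False}

    def hook(index, value):
        if index == at_index and not state["fired"]:
            state["fired"] = True
            return value ^ (1 << bit)
        return value

    return hook


def scale_with_invariant_check(values, factor, inject_fault=None, max_retries=3):
    """Multiply each element by `factor`, which must stay constant across the loop.

    If `factor` is observed to differ from its expected value, the iteration
    is treated as corrupted: the value is restored and the iteration is re-run.
    """
    expected_factor = factor  # assumed-known value of the loop invariant
    result = []
    retries = 0
    i = 0
    while i < len(values):
        if inject_fault is not None:
            factor = inject_fault(i, factor)  # may simulate a soft error
        product = values[i] * factor
        if factor != expected_factor:
            # Invariant violated: discard this iteration's result and redo it.
            factor = expected_factor
            retries += 1
            if retries > max_retries:
                raise RuntimeError("too many invariant violations")
            continue
        result.append(product)
        i += 1
    return result, retries
```

Calling `scale_with_invariant_check([1, 2, 3, 4], 3, make_single_fault(2, 4))` corrupts the factor once at the third iteration; the check catches the change and that iteration is repeated with the restored value.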

Bubble Sort Algorithm
Bubblesort(sequence):
  Input: sequence, a sequence of integers
  Post-condition: sequence is sorted and contains the same integers as the original sequence
  length = length of sequence
  for i = 0 to length - 1 do
    for j = 0 to length - i - 2 do
      if jth element of sequence > (j+1)th element of sequence then
        swap jth and (j+1)th elements of sequence
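
For reference, the pseudocode above translates directly into a short runnable version. Python is used here only for illustration; the slides do not state which language the benchmark was written in.

```python
def bubble_sort(sequence):
    """Sort a list of integers in place and return it, following the slide's pseudocode."""
    length = len(sequence)
    for i in range(length):                  # i = 0 .. length - 1
        for j in range(length - i - 1):      # j = 0 .. length - i - 2
            if sequence[j] > sequence[j + 1]:
                # Swap the jth and (j+1)th elements.
                sequence[j], sequence[j + 1] = sequence[j + 1], sequence[j]
    return sequence
```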

Bubble Sort – Sample Invariants
- Loop Invariant – Outer Loop: The last i elements of sequence are sorted and are all greater than or equal to the other elements of the sequence.
- Loop Invariant – Inner Loop: Same as the outer loop, and the jth element of sequence is greater than or equal to the first j elements of sequence.
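
These two invariants can be written as executable predicates and checked at the top of each loop. The sketch below is an illustration under assumptions: the function names and the `InvariantViolation` exception are hypothetical, and the checks are written for clarity rather than speed.

```python
class InvariantViolation(Exception):
    """Raised when a loop invariant does not hold (hypothetical name)."""


def outer_invariant_holds(seq, i):
    """Last i elements are sorted and each is >= every other element."""
    n = len(seq)
    head, tail = seq[:n - i], seq[n - i:]
    tail_sorted = all(tail[k] <= tail[k + 1] for k in range(len(tail) - 1))
    tail_dominates = not head or not tail or max(head) <= min(tail)
    return tail_sorted and tail_dominates


def inner_invariant_holds(seq, i, j):
    """Outer invariant, plus: the jth element is >= each of the first j elements."""
    return outer_invariant_holds(seq, i) and all(seq[k] <= seq[j] for k in range(j))


def checked_bubble_sort(seq):
    """Bubble sort that verifies both invariants at the start of every iteration."""
    n = len(seq)
    for i in range(n):
        if not outer_invariant_holds(seq, i):
            raise InvariantViolation(f"outer loop, i={i}")
        for j in range(n - i - 1):
            if not inner_invariant_holds(seq, i, j):
                raise InvariantViolation(f"inner loop, i={i}, j={j}")
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
    return seq
```

Each check scans part of the array, so checking on every iteration is costly. A practical checker would likely select cheaper invariants or check less often.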

The Big Picture: How Does the Project Work?
1. Use an invariant detector to detect invariants in the code.
2. Incorporate 'checker code' into the source code to produce a hardened version.
3. Run the hardened code in a modified version of SimpleScalar that injects errors into memory and cache.
4. Measure how our method performs.

The Daikon Invariant Detector
- Daikon was developed by the Program Analysis Group at MIT.
- It dynamically detects invariants about a program's data structures.
- Limitations:
  - It only targets procedural invariants
  - It does not target loops
  - It does not consider local variables

Daikon Usage
1. Modify our source code so that Daikon can detect loop invariants.
2. Instrument the source code.
3. Run test suites on the instrumented code to create trace files.
4. Detect invariants in the trace files.
5. Incorporate invariant checker code into the source code.
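
To make the "detect invariants in trace files" step concrete, here is a toy sketch of dynamic invariant inference. It is not Daikon's algorithm or trace format: the function name and the in-memory list of samples are assumptions, and only two candidate invariant forms (constant value and lower bound) are tried.

```python
def infer_likely_invariants(samples):
    """Infer simple likely invariants from recorded variable values.

    `samples` is a list of dicts, one per visit to a program point,
    mapping variable names to integer values (a stand-in for a trace file).
    """
    invariants = []
    if not samples:
        return invariants
    for name in sorted(samples[0]):
        observed = [sample[name] for sample in samples]
        low, high = min(observed), max(observed)
        if low == high:
            invariants.append(f"{name} == {low}")   # value never changed
        else:
            invariants.append(f"{name} >= {low}")   # smallest value ever seen
    return invariants
```

Such invariants are only "likely": they hold on the test runs that produced the trace, so a thin test suite can yield invariants that fail on valid inputs.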

Fault Injection Mechanism
- Modified SimpleScalar v3.0d to inject faults
- Randomly generate faults on memory access operations
- Memory corruption routines from Angshuman's
- In our study, we randomly flip one bit of memory data during memory and cache read/write operations
- Generate statistics for faults injected
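
The single-bit-flip fault model can be illustrated with a few lines of code. This sketch is independent of SimpleScalar and does not reproduce the modified simulator; the function names, the 32-bit word width, and the per-access fault probability are assumptions.

```python
import random


def flip_random_bit(word, width=32, rng=random):
    """Return (corrupted_word, bit_index) with exactly one randomly chosen bit flipped."""
    bit = rng.randrange(width)
    return word ^ (1 << bit), bit


def maybe_inject_fault(word, probability, width=32, rng=random):
    """On a memory access, corrupt `word` with the given probability.

    Returns (word_after_access, fault_injected) so callers can keep statistics.
    """
    if rng.random() < probability:
        corrupted, _ = flip_random_bit(word, width, rng)
        return corrupted, True
    return word, False
```

Passing a seeded `random.Random` instance as `rng` makes an injection run reproducible, which helps when comparing detection rates across experiments.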

Detected loop invariant – bubble sort
…
a != null -> easy
a[i] > i -> easy
i >= 1 -> easy
n == 100 -> easy
a[i..] sorted by medium
a[0..i] sorted by > -> medium
i easy
…

Detected loop invariant – matrix multiplication
…
size(a[]) == size(b[]) -> easy
i >= 0 -> easy
b[] contains no duplicates -> difficult
b[] elements != null -> medium
b[] == a[] -> useless
…
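
A checker for three of the invariants listed above might look like the sketch below. The function name and the flat-list representation of `a` and `b` are hypothetical. The slide does not say what the easy/medium/difficult labels measure; the code only shows that the size and index checks are single comparisons, while the no-duplicates check has to examine every element.

```python
def check_matmul_invariants(a, b, i):
    """Return the list of violated invariants (empty if all hold).

    `a` and `b` are flat lists of numbers; `i` is the current loop index.
    """
    violations = []
    if len(a) != len(b):                 # one comparison
        violations.append("size(a[]) == size(b[])")
    if i < 0:                            # one comparison
        violations.append("i >= 0")
    if len(set(b)) != len(b):            # visits every element of b
        violations.append("b[] contains no duplicates")
    return violations
```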

Experiment Setup
Bubble Sort
- Array size = 100
- Number of iterations = 150
- Total number of instructions simulated: 68,019,732
Matrix Multiplication
- Array size = 100 x 100
- Number of iterations = 1
- Total number of instructions simulated: 65,547,520

Detection Rate

Detection overhead - code increase

Conclusion
- Developed a soft error detection technique using loop invariants
- Detection rate varies according to the characteristics of the applications
  - Different programs have different loop invariants
- Performance degradation due to fault-hardened code

Future Work
- Better invariant generation mechanism that generates loop invariants directly
- Automatic classification of invariants
- Automatic assertion generator for source code
- More accurate fault injection