725/ASP-DAC 2005 Using Loop Invariants to Fight Soft Errors in Data Caches Sri Hari Krishna N., Seung Woo Son, Mahmut Kandemir, Feihui Li Department of.


Using Loop Invariants to Fight Soft Errors in Data Caches
Sri Hari Krishna N., Seung Woo Son, Mahmut Kandemir, Feihui Li
Department of Computer Science & Engineering, The Pennsylvania State University, USA

Introduction
Ever-scaling process technology makes embedded systems more vulnerable to soft errors.
Fully duplicating instructions is effective, but costly in terms of performance, power, and memory space.
Our approach: check loop invariants to fight soft errors.

Related Work
Traditional approaches to protecting against soft errors use spatial redundancy or temporal redundancy [1,2,3].
Pure software-based techniques rely on code replication by the compiler or on the insertion of control code.
Control-code-based methods include assertions [6], algorithm-based fault tolerance [7], and control flow checking [8].

Detecting Loop Invariants
Definition: a loop invariant is a value or property that does not change during the execution of a loop.
Example: in a for loop, the loop index stays between the lower and upper bounds of the loop, assuming the loop body does not explicitly change it.
Many publicly available tools can extract loop invariants.
We use Daikon, a dynamic invariant detector that reports likely invariants over the program's data structures [9].
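The loop-index example can be made concrete with a small sketch. This is an illustration only, not code from the talk: the names `InvariantViolation`, `check_index`, `scaled_copy`, and the `corrupt_at` test hook are all hypothetical, and the bit flip is simulated in software rather than occurring in a data cache.

```python
class InvariantViolation(Exception):
    """Raised when a checked loop invariant does not hold."""


def check_index(i, lower, upper):
    # Invariant from the slide: lower <= i <= upper for the whole loop.
    if not (lower <= i <= upper):
        raise InvariantViolation(f"index {i} outside [{lower}, {upper}]")


def scaled_copy(src, factor, corrupt_at=None):
    """Return [x * factor for x in src], checking the index invariant.

    corrupt_at is a hypothetical test hook: when the index reaches that
    value, one of its bits is flipped to mimic a soft error.
    """
    out = []
    lower, upper = 0, len(src) - 1
    i = lower
    while i <= upper:
        if i == corrupt_at:
            i ^= 1 << 20  # simulated single-bit upset in the stored index
        check_index(i, lower, upper)
        out.append(src[i] * factor)
        i += 1
    return out
```

Without the check, the corrupted index would surface only as an out-of-range access or, in a language without bounds checking, as a silent read of unrelated memory.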

Approach Overview
Use Daikon to detect the loop invariants, then filter them.
Generate the "checker code" from the selected invariants and embed it into the original loop code.
Flow: original loop (for(…) { loop body }) -> Invariant Detector -> invariants -> Invariant Filter -> selected invariants -> Code Modifier -> modified loop (for(…) { loop body + checker code })

A Simple Example: Bubble-Sort Algorithm
These loop invariants must be satisfied during each loop iteration:
  Outer loop: the last i elements are sorted and are all greater than or equal to the other elements.
  Inner loop: same as the outer loop, and the jth element is greater than or equal to the first j-1 elements.
Algorithm bubble-sort(sequence):
  Input: sequence of integers stored in sequence
  Post-condition: sequence is sorted and contains the same integers as the original sequence
  length = length of sequence
  for i = 0 to length - 1 do
    for j = 0 to length - i - 2 do
      if jth element of sequence > (j+1)th element of sequence then
        swap jth and (j+1)th element of sequence
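As a rough illustration, the two invariants can be written as executable checks around a 0-based Python version of the pseudocode. The function names are hypothetical, and the checks are deliberately exhaustive (and therefore expensive); they show what the invariants say, not how the talk's checker code tests them.

```python
def tail_is_sorted_and_max(seq, tail_len):
    """Outer-loop invariant: the last tail_len elements are in ascending
    order and each is >= every element before them."""
    start = len(seq) - tail_len
    head, tail = seq[:start], seq[start:]
    tail_sorted = all(tail[k] <= tail[k + 1] for k in range(len(tail) - 1))
    tail_dominates = not head or not tail or max(head) <= tail[0]
    return tail_sorted and tail_dominates


def bubble_sort_with_invariants(seq):
    """Bubble sort that raises RuntimeError if an invariant is violated."""
    n = len(seq)
    for i in range(n):
        # At the start of outer iteration i, the last i elements are final.
        if not tail_is_sorted_and_max(seq, i):
            raise RuntimeError(f"outer invariant violated at i={i}")
        for j in range(n - i - 1):
            # Inner invariant: element j is >= all elements before it.
            if not all(seq[m] <= seq[j] for m in range(j)):
                raise RuntimeError(f"inner invariant violated at j={j}")
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
    return seq
```

Checking every invariant in full on every iteration costs far more than the sort itself, which is why a cheaper subset has to be selected.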

A Simple Example: Invariants Detected by Daikon for Bubble-Sort
On entering the bubble-sort routine:
  sequence[] != null ⇒ easy
  length of sequence[] = 10 ⇒ easy
At the beginning of the kth iteration of the outer loop:
  ∀ i : 0 ≤ i ≤ k sequence[] sorted by > ⇒ medium
  k ≥ 1 ⇒ easy
  k ≤ (length of sequence[] - 1) ⇒ easy
At the end of the kth iteration of the outer loop:
  ∀ i : length-k+1 ≤ i ≤ length sequence[] sorted by < ⇒ medium
Invariant classification:
  Easy: cheap to compute, but touches little data
  Difficult: useful for catching soft errors, but expensive to compute
  Medium: a tradeoff between error coverage and the associated overhead
Only a subset of the medium invariants is used in the checker code.
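The filtering step can be sketched as a simple selection over classified invariants. The slide does not say how the subset of medium invariants is chosen, so the `max_medium` budget below is purely an assumption, as are the `Invariant` record and the `select_for_checker` name; the invariant texts are copied from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invariant:
    program_point: str  # where the invariant was reported
    text: str           # invariant as listed on the slide
    cost_class: str     # "easy", "medium", or "difficult"


DETECTED = [
    Invariant("routine entry", "sequence[] != null", "easy"),
    Invariant("routine entry", "length of sequence[] = 10", "easy"),
    Invariant("outer loop, start of iteration k",
              "forall i: 0 <= i <= k sequence[] sorted by >", "medium"),
    Invariant("outer loop, start of iteration k", "k >= 1", "easy"),
    Invariant("outer loop, start of iteration k",
              "k <= (length of sequence[] - 1)", "easy"),
    Invariant("outer loop, end of iteration k",
              "forall i: length-k+1 <= i <= length sequence[] sorted by <",
              "medium"),
]


def select_for_checker(invariants, max_medium):
    """Keep at most max_medium invariants of class "medium".

    The budget is an assumed selection policy, not one from the talk.
    """
    medium = [inv for inv in invariants if inv.cost_class == "medium"]
    return medium[:max_medium]
```

For example, `select_for_checker(DETECTED, max_medium=2)` returns both sortedness invariants, which are the two that the checker code targets.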

Bubble-Sort Code with "Checker Code"
Two medium invariants of bubble-sort are checked using a single additional loop.
Algorithm new-bubble-sort(sequence):
  Post-condition: sequence is sorted and contains the same integers as the original sequence
  length = length of sequence
  for i = 0 to length - 1 do
    for j = 0 to length - i - 2 do
      if jth element of sequence > (j+1)th element of sequence then
        swap jth and (j+1)th element of sequence
    for k = length - i - 1 to length - 2 do
      if kth element of sequence > (k+1)th element of sequence then
        error()
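A runnable Python reading of the checker-code idea might look like the sketch below. It assumes 0-based indexing and assumes that the extra loop walks the already-sorted tail (positions n-i-1 through n-1) after each outer pass; `SoftErrorDetected` and `check_sorted_tail` are hypothetical names standing in for the slide's `error()` call.

```python
class SoftErrorDetected(Exception):
    """Stands in for the error() call in the slide's pseudocode."""


def check_sorted_tail(seq, start):
    # Checker code: every adjacent pair from position start onward must be
    # in ascending order; a violation suggests corrupted data.
    for k in range(start, len(seq) - 1):
        if seq[k] > seq[k + 1]:
            raise SoftErrorDetected(f"order violated at positions {k}, {k + 1}")


def new_bubble_sort(seq):
    """Bubble sort with one extra checking loop per outer iteration."""
    n = len(seq)
    for i in range(n):
        for j in range(n - i - 1):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
        # After pass i, positions n-i-1 .. n-1 hold their final values.
        check_sorted_tail(seq, n - i - 1)
    return seq
```

The check re-reads only the elements that are already in their final positions, so its cost grows with the sorted tail rather than with the full set of invariants.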

Experimental Setup
SimpleScalar v3.0d is modified to collect fault-injection statistics.
Five benchmarks: iter-merge (an iterative merge program), adi (an alternate direction integration application), heap-sort, bubble-sort, and mxm.
Our loop-invariant-based approach is compared against a full-duplication-based scheme.
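To give a feel for what a fault-injection experiment measures, here is a toy, pure-Python stand-in. It is not the modified SimpleScalar setup: it flips one bit of one array element between sorting passes and counts how often a sorted-tail check notices. Every name and parameter (`flip_bit`, `sort_with_checker`, `detection_rate`, 16-bit values, the seed) is an assumption made for illustration.

```python
import random


def flip_bit(value, bit):
    """Flip a single bit of an integer, mimicking a soft error."""
    return value ^ (1 << bit)


def sort_with_checker(seq, fault=None):
    """Bubble-sort seq; return True if the tail check detects an error.

    fault is None or a tuple (pass_index, position, bit): after that
    outer pass, the chosen bit of seq[position] is flipped.
    """
    n = len(seq)
    for i in range(n):
        for j in range(n - i - 1):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
        if fault is not None and fault[0] == i:
            seq[fault[1]] = flip_bit(seq[fault[1]], fault[2])
        # Checker: the sorted tail must still be in ascending order.
        for k in range(n - i - 1, n - 1):
            if seq[k] > seq[k + 1]:
                return True
    return False


def detection_rate(trials, length=10, seed=0):
    """Fraction of single-bit faults caught by the tail check."""
    rng = random.Random(seed)
    detected = 0
    for _ in range(trials):
        data = [rng.randrange(1 << 16) for _ in range(length)]
        fault = (rng.randrange(length), rng.randrange(length),
                 rng.randrange(16))
        if sort_with_checker(data, fault):
            detected += 1
    return detected / trials
```

Faults that leave the sorted tail in order go unnoticed by this check, so the measured rate depends on which invariants are checked. The numbers this toy produces say nothing about the results reported in the talk.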

Result (1): Error Detection Rate
The error detection rate of our approach varies widely across benchmarks, because the approach depends heavily on the availability of invariants.

Result (2): Code Size Increase
The average increase in executable code size is 3% for our approach and 8% for full duplication.

Result (3): Execution Cycles Increase
Our approach incurs a much smaller increase in execution time than the full duplication scheme.

Result (4): Sensitivity to Error Rate
The error detection rate remains generally stable as the injection rate changes.

Conclusion
Traditional soft error checking methods use either spatial redundancy or temporal redundancy, and are expensive in terms of power, performance, and memory space.
Our experimental analysis demonstrates that checking loop invariants can catch a large fraction of the errors at a lower cost.

References
[1] M. Gomaa, C. Scarbrough, T. N. Vijaykumar, and I. Pomeranz, Transient-fault recovery for chip multiprocessors. In Proceedings of the International Symposium on Computer Architecture, June
[2] N. Oh, S. Mitra, and E. J. McCluskey, Error detection by duplicated instructions in super-scalar processors. IEEE Transactions on Reliability, Vol. 51, Issue 1, Mar
[3] B. Nicolescu, R. Velazco, Detecting soft errors by a purely software approach: method, tools and experimental results. In Proceedings of DATE,
[4] A. V. Aho, J. D. Ullman, Principles of Compiler Design, Addison-Wesley Publishing Company Inc, USA
[5] M. Rebaudengo, et al., Soft-error detection through software fault-tolerance techniques. In Proceedings of the International Symposium on Defect and Fault Tolerance in VLSI Systems, Nov. 1999, pp
[6] M. Zenha Rela, H. Madeira, J. G. Silva, Experimental Evaluation of the Fail-Silent Behavior in Programs with Consistency Checks, Proc. FTCS-26, 1996, pp
[7] K. H. Huang, J. A. Abraham, Algorithm-Based Fault Tolerance for Matrix Operations. In IEEE Trans. Computers, vol. 33, Dec 1984, pp
[8] S. Yau, F. Chen, An Approach to Concurrent Control Flow Checking, IEEE Transactions on Software Engineering, Vol. SE-6, No. 2, March 1980, pp
[9] MIT, The Daikon Invariant Detector User Manual, Apr
[10] D. Burger and T. M. Austin, The SimpleScalar Tool Set, Version 2.0. University of Wisconsin-Madison CS Department Technical Report #1342, Jun