Error Detection in Software (Fehlererkennung in SW), David Rigler. Overview: types of error detection, fault/error classification, description of certain SW error detection techniques.

Presentation transcript:

Error Detection in Software (Fehlererkennung in SW)
David Rigler

Overview
- Types of error detection
- Fault/Error classification
- Description of certain SW error detection techniques
- Evaluation (Coverage / Overhead)
- Conclusion

Failure Runtime Detection (in Software)
- Software Diversity / N-Version Programming
- Defensive Programming
  - Assertions
  - Bound/Range checking
- Control Flow checking
  - Block Entry Exit Checking
  - Error Capturing Instructions
  - Advanced Techniques …
- Redundant Data/Code
- HW failures
- SW failures

Transient Hardware Error Classification
- Data Errors
- Code Errors
  - Type S1: statements affecting data only
  - Type S2: statements affecting the execution flow
  - Type E1: errors changing the operation (not the control flow)
  - Type E2: errors changing the statement type (S1 <-> S2)

Data Errors (Executable Assertions)
- Generic
  - Bound
  - Integrity
- For SW and HW errors
- Non-Generic
  - Value range
  - Approximate (false alarms possible)
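The assertion kinds listed on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the exception type and the limits are hypothetical and do not come from the presentation.

```python
class AssertionViolation(Exception):
    """Raised when an executable assertion sees an implausible value."""


def check_bound(index, length):
    # Generic bound check: an index must address an existing element.
    if not 0 <= index < length:
        raise AssertionViolation(f"index {index} outside [0, {length})")
    return index


def check_range(value, low, high):
    # Non-generic value-range check: the limits encode application knowledge.
    if not low <= value <= high:
        raise AssertionViolation(f"value {value} outside [{low}, {high}]")
    return value


def check_rate(previous, current, max_step):
    # Approximate check: a plausible signal moves at most max_step per sample.
    # A genuine but abrupt change trips it too, hence the false-alarm risk.
    if abs(current - previous) > max_step:
        raise AssertionViolation(f"step {previous} -> {current} exceeds {max_step}")
    return current
```

The bound check needs no knowledge of the application, while the range and rate checks only work if the designer can state plausible limits in advance.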

Data Errors (Systematic Data Redundancy)
- Rules
  - Duplicate every variable: x -> (x1 and x2)
  - Perform write operations on both x1 and x2
  - Read operation on x -> check for consistency of x1 and x2

Data Errors (Systematic Data Redundancy)
- Generic approach
  - Use a pre-processor on the high-level language
- Compiler optimisations may be a problem
- All (visible) single bit-flip errors in data memory can be detected
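The duplication rules can be illustrated with a small wrapper that keeps two copies of a value. This is a minimal sketch, assuming a hypothetical `Duplicated` class and a `flip_bit` helper that stands in for a transient fault. Python serves only as a model here; the presentation applies the rules through a pre-processor on the high-level source.

```python
class DataMismatch(Exception):
    """Raised when the two copies of a variable disagree."""


class Duplicated:
    """Holds a variable x as two copies, x1 and x2."""

    def __init__(self, value):
        self.x1 = value
        self.x2 = value

    def write(self, value):
        # Rule: every write goes to both copies.
        self.x1 = value
        self.x2 = value

    def read(self):
        # Rule: every read checks the copies for consistency first.
        if self.x1 != self.x2:
            raise DataMismatch(f"copies differ: {self.x1} != {self.x2}")
        return self.x1


def flip_bit(value, bit):
    # Simulated single bit-flip in an integer.
    return value ^ (1 << bit)


counter = Duplicated(40)
counter.write(counter.read() + 2)        # both copies are updated together
counter.x1 = flip_bit(counter.x1, 3)     # simulated fault hits one copy only
try:
    counter.read()
except DataMismatch as err:
    print("detected:", err)
```

A single bit-flip alters only one copy, so the next read sees the mismatch. A compiler that recognises the two copies as redundant might merge them, which is one way the compiler-optimisation problem on the slide can arise.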

Control Flow Errors
- Block Entry Exit Checking
  - Unique signatures for basic blocks
  - Assign at entry
  - Compare at exit
- Problems
  - Jumps within a block
  - Granularity
  - Jumps to unused area
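Block entry/exit checking can be modelled as follows. This is a sketch under assumptions: the `SignatureMonitor` class, the signature values and the `skip_entry_of` parameter (which imitates an erroneous jump past a block's entry code) are hypothetical.

```python
class ControlFlowError(Exception):
    """Raised when a block exit sees a signature it did not expect."""


class SignatureMonitor:
    def __init__(self):
        self.current = None

    def enter(self, signature):
        # Assign the block's unique signature at entry.
        self.current = signature

    def exit(self, signature):
        # Compare at exit; a mismatch means the block was not entered properly.
        if self.current != signature:
            raise ControlFlowError(
                f"expected signature {signature:#x}, found {self.current}"
            )
        self.current = None


def run(monitor, skip_entry_of=None):
    """Runs two basic blocks; skip_entry_of imitates a jump past an entry."""
    total = 0
    for signature, increment in ((0xA5, 1), (0x3C, 10)):
        if signature != skip_entry_of:
            monitor.enter(signature)
        total += increment          # body of the basic block
        monitor.exit(signature)
    return total
```

Calling `run(SignatureMonitor())` completes normally, while `run(SignatureMonitor(), skip_entry_of=0x3C)` raises `ControlFlowError` at the exit of the second block. A jump that starts and ends inside the same block leaves the signature untouched, which is the "jumps within a block" problem listed above.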

Control Flow Errors
- Duplicate Condition Checks
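The slide gives only the name of this technique. The sketch below assumes it means re-testing each branch condition inside the branch that was actually taken. The function name and the `corrupt_first_evaluation` flag, which imitates a fault during the first test, are hypothetical.

```python
class ControlFlowError(Exception):
    """Raised when the branch taken contradicts the re-evaluated condition."""


def guarded_branch(condition, corrupt_first_evaluation=False):
    """condition is a zero-argument callable returning True or False."""
    outcome = condition()
    if corrupt_first_evaluation:
        outcome = not outcome       # simulated fault in the first test
    if outcome:
        # Duplicate check at the start of the "then" branch.
        if not condition():
            raise ControlFlowError("then-branch taken, but condition is false")
        return "then"
    else:
        # Duplicate (negated) check at the start of the "else" branch.
        if condition():
            raise ControlFlowError("else-branch taken, but condition is true")
        return "else"
```

With an uncorrupted first test the function returns `"then"` or `"else"` as usual. With `corrupt_first_evaluation=True` the second evaluation contradicts the branch taken and the error is raised.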

Control Flow Errors
- Error Capturing Instructions
  - Special or unused instructions (Trap, SWI, …)
  - Spread over unused memory
    - Program memory
    - Data memory
  - Call error handling function
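A toy machine makes the idea concrete. This is a sketch under assumptions: the instruction set (`ADD`, `HALT`, `TRAP`), the memory size and the function names are all hypothetical. On a real target the unused memory would typically be filled at link time.

```python
TRAP = ("TRAP",)
HALT = ("HALT",)


def build_memory(program, size):
    # Fill every unused cell behind the program with the trap instruction.
    if len(program) >= size:
        raise ValueError("need at least one unused cell to hold a trap")
    return list(program) + [TRAP] * (size - len(program))


def execute(memory, start=0):
    pc, acc = start, 0
    while True:
        op = memory[pc]
        if op == TRAP:
            # Execution strayed into unused memory: call the error handler.
            return ("error_handler", pc)
        if op == HALT:
            return ("ok", acc)
        # The only other instruction in this toy machine is ("ADD", n).
        acc += op[1]
        pc += 1


program = [("ADD", 2), ("ADD", 3), HALT]
memory = build_memory(program, size=8)
```

Starting at address 0 runs the program to its `HALT`. Starting at an address in the unused area, as an erroneous jump would, lands on a trap immediately and reports the faulty address.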

Control Flow Errors
- Watchdog Timer
  - Periodically reset the timer
  - Take action at a specific timer value
  - Needs hardware support (common in embedded controllers)
  - Detects infinite-loop errors
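The watchdog behaviour can be simulated in software, although a real watchdog counts in hardware, independently of the program. The sketch below assumes a hypothetical `Watchdog` class and a `hang_at` parameter marking the step from which the task stops resetting the timer, as it would in an infinite loop.

```python
class Watchdog:
    def __init__(self, timeout_ticks):
        self.timeout = timeout_ticks
        self.counter = 0

    def kick(self):
        # The healthy program resets the timer periodically.
        self.counter = 0

    def tick(self):
        # One timer period elapses; True means the watchdog fires.
        self.counter += 1
        return self.counter >= self.timeout


def run_task(watchdog, steps, hang_at=None):
    """Returns the step at which the watchdog fired, or None if it never did."""
    for step in range(steps):
        if hang_at is None or step < hang_at:
            watchdog.kick()
        if watchdog.tick():
            return step             # a real system would reset or recover here
    return None
```

With a timeout of 3 ticks a healthy run never lets the counter reach the limit, so `run_task` returns `None`. Once the kicks stop, the counter keeps rising and the watchdog fires a couple of ticks later.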

Coverage Example 1
- BEEC, Duplicate Condition Checks, Systematic Data Redundancy
- Simulated bit-flip errors in memory
- ~5x performance slow-down
- ~2x size
- No silent violations (data)
- High coverage even for errors in the code area

Coverage Example 2
- Physical fault injection
  - Heavy-ion radiation
  - Power-supply disturbances
- Hardware WDT
- Effect of additional SW
  - 60% -> 85%

Improving Coverage
- Separate BB for redundant variables
- Separated in memory
  - No single bit-flip jumps
- Use cumulative signatures
  - Detect jumps within a block
- Avoid signature aliasing
  - Hamming distance
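The Hamming-distance point can be illustrated by choosing block signatures that differ pairwise in several bit positions. This is a sketch under assumptions: the greedy selection and both function names are illustrative and are not taken from the presentation.

```python
def hamming_distance(a, b):
    # Number of bit positions in which a and b differ.
    return bin(a ^ b).count("1")


def pick_signatures(count, bits, min_distance):
    """Greedily picks `count` signatures of `bits` width whose pairwise
    Hamming distance is at least `min_distance`."""
    chosen = []
    for candidate in range(1 << bits):
        if all(hamming_distance(candidate, s) >= min_distance for s in chosen):
            chosen.append(candidate)
            if len(chosen) == count:
                return chosen
    raise ValueError("not enough signatures with the requested distance")


signatures = pick_signatures(count=4, bits=8, min_distance=4)
```

With a minimum distance of 4, no flip of fewer than four bits can turn one valid signature into another, so such a corrupted signature still fails the exit comparison instead of aliasing to a different block.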

100% Coverage
- For a simple failure model
  - Single bit-flip
  - Data and code memory / registers
  - Hidden registers not included (branch buffer, cache tags, etc.)
- High overhead
  - ~4x memory usage
  - >3x time

Conclusion: Error Detection in SW
- Pure SW: high coverage only for simple failure models
- Addition to HW error detection
- Trade-off: Overhead <-> Coverage
  - Fine tuning possible
  - Use available resources (time, memory)

Miremadi G., J. Karlsson, U. Gunneflo, and J. Torin, Two Software Techniques for On-Line Error Detection, Proc. of the 22nd International Symposium on Fault-Tolerant Computing (FTCS-22), July 1992, pp.
Miremadi G. and J. Torin, Evaluating Processor-Behavior and Three Error-Detection Mechanisms Using Physical Fault-Injection, IEEE Trans. on Reliability, Vol. 44, No. 3, Sept. 1995, pp.
Rabejac C., J.-P. Blanquart, and J.-P. Queille, Lab. for Dependability Eng., CNRS, Toulouse, France, Executable Assertions and Timed Traces for On-Line Software Error Detection, Proc. of the 26th International Symposium on Fault-Tolerant Computing (FTCS-26).
Alkhalifa Z., V. S. S. Nair, N. Krishnamurthy, and J. A. Abraham, Design and Evaluation of System-Level Checks for On-Line Control Flow Error Detection, IEEE Trans. on Parallel and Distributed Systems, Vol. 10, No. 6, Jun. 1999, pp.
Fazeli M., R. Farivar, and S. G. Miremadi, A Software-Based Concurrent Error Detection Technique for PowerPC Processor-Based Embedded Systems, Proc. of the 20th IEEE Symposium on Defect and Fault Tolerance in VLSI Systems (DFT), Monterey, California.
Nicolescu B., Y. Savaria, and R. Velazco, Software Detection Mechanisms Providing Full Coverage Against Single Bit-Flip Faults.
Rebaudengo M., M. Sonza Reorda, M. Torchiano, and M. Violante, Soft-Error Detection through Software Fault-Tolerance Techniques.