Tolerating Timing Faults
TSW, November 2009
Anders P. Ravn, Aalborg University

FT basis: Redundancy (BW 2.5, p. 41)
  Time:  Try, Retry, ...
  Space: Try ... Try ...
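The two forms of redundancy on this slide can be sketched in code. This is a minimal illustration, not taken from the slides: the names `retry` and `vote` are hypothetical, and the sketch assumes that a fault shows up either as a raised exception (time redundancy) or as a minority result among replicas (space redundancy).

```python
from collections import Counter


def retry(operation, attempts=3):
    """Time redundancy: repeat the same operation after a failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except Exception as error:  # a real system would catch narrower types
            last_error = error
    raise last_error


def vote(replicas, argument):
    """Space redundancy: run several replicas and accept the majority result."""
    results = [replica(argument) for replica in replicas]
    value, count = Counter(results).most_common(1)[0]
    if count <= len(results) // 2:
        raise RuntimeError("no majority among replicas")
    return value
```

Retrying only helps against transient faults, and voting only helps when the replicas fail independently; a design fault shared by all replicas is masked by neither.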

Dynamic Redundancy (BW p. 41)
  1. Error detection
  2. Damage confinement and assessment
  3. Error recovery
  4. Fault treatment and continued service

Error Detection (BW Ch. 13)
  f: State × Input → State × Output
  Detected by: Environment (exception), Application
  Assertion:
    precondition(input, state)
    postcondition(input, state, state', output)
    invariant(state, state')
  Timing:
    WCET(f, input)
    Deadline(f, input), D
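The application-level checks listed on this slide can be sketched as a wrapper around f. This is an illustration under assumptions: `run_checked` and `DetectedError` are hypothetical names, the state is assumed to be an immutable value so the old and new state can be compared, and the timing check measures elapsed wall-clock time after the call returns rather than enforcing a WCET bound.

```python
import time


class DetectedError(Exception):
    """Raised when one of the checks detects an error."""


def run_checked(f, state, inp, pre, post, inv, deadline_s):
    """Run f: State x Input -> State x Output under assertion and timing checks."""
    if not pre(inp, state):
        raise DetectedError("precondition violated")

    start = time.monotonic()
    new_state, output = f(state, inp)
    elapsed = time.monotonic() - start

    # Detected only after the fact; a running computation is not pre-empted.
    if elapsed > deadline_s:
        raise DetectedError("deadline missed")
    if not post(inp, state, new_state, output):
        raise DetectedError("postcondition violated")
    if not inv(state, new_state):
        raise DetectedError("invariant violated")
    return new_state, output
```

For example, with an accumulator `f = lambda s, i: (s + i, s + i)`, a precondition `i >= 0`, and an invariant `new_state >= state`, a negative input is rejected before f runs.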

Fault Tree (diagram; node labels only)
  Missed D_i
  EC_i > C_i
  ET_i < T_i
  EI_i > I_i
  ET_k < T_k
  EC_k > C_k
  EB_i < B_i
  Platform fails

Error Detection
  Deadline D missed (Platform Error)
  Overrun of C
  Min. interarrival time T too small
  Blocking time B too small
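A run-time monitor for these four conditions might look like the sketch below. The record types and field names are hypothetical, and the sketch makes two interpretive assumptions: "T too small" is read as an observed interarrival time below the assumed minimum T, and "B too small" is read as the assumed bound B underestimating the observed blocking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskParams:
    C: float  # assumed worst-case execution time
    T: float  # assumed minimum interarrival time
    B: float  # assumed maximum blocking time
    D: float  # relative deadline


@dataclass(frozen=True)
class Observation:
    previous_release: float
    release: float
    execution: float  # measured execution time of this job
    blocking: float   # measured blocking time of this job
    finish: float


def detect_timing_errors(params, obs):
    """Return the timing errors observed for one job of a task."""
    errors = []
    if obs.execution > params.C:
        errors.append("overrun of C")
    if obs.release - obs.previous_release < params.T:
        errors.append("interarrival below T")
    if obs.blocking > params.B:  # assumption: the bound B was set too low
        errors.append("blocking exceeds B")
    if obs.finish - obs.release > params.D:
        errors.append("deadline D missed")
    return errors
```

Checking C, T, and B separately lets the monitor flag a violated assumption before it has caused a missed deadline, which is the last and most expensive point of detection.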

Damage Confinement (BW p. 457)
  Static structure
    one task
    lower priority tasks?
  Dynamic structure

Error Recovery
  Forward
    Repair the state, if you can!
  Backward
    define recovery points
    checkpoint state at recovery point
    roll back
    retry
  Domino effect
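The backward recovery steps (recovery point, checkpoint, roll back, retry) can be sketched for a single task as follows. The function name `with_backward_recovery` is hypothetical, the state is assumed to be a dictionary that can be deep-copied, and an acceptance test is assumed as the error detector.

```python
import copy


def with_backward_recovery(state, attempts, acceptance_test):
    """Try each attempt in turn, rolling back to the checkpoint on failure."""
    checkpoint = copy.deepcopy(state)  # recovery point: save the state
    for attempt in attempts:
        try:
            attempt(state)  # may modify state in place
            if acceptance_test(state):
                return state
        except Exception:
            pass  # treat an exception like a failed acceptance test
        # Roll back: restore the checkpointed state before the next try.
        state.clear()
        state.update(copy.deepcopy(checkpoint))
    raise RuntimeError("all attempts failed; state restored to checkpoint")
```

This sketch involves one task only, so it cannot show the domino effect. That effect arises when tasks communicate: rolling one task back past a message exchange forces its partner back as well, possibly cascading to earlier and earlier recovery points.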