Software Fault Tolerance – The Big Picture
mMIC-SFT, September 2003
Anders P. Ravn, Aalborg University


Fault Tolerance
- Means to isolate component faults
- Prevents system failures
- May increase system dependability
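
As a minimal sketch of what isolating a component fault can look like in code, the wrapper below catches a component's exception and substitutes a fallback value, so the caller does not observe a system failure. The names `tolerate`, `faulty_sensor`, and `last_known_good` are hypothetical and not taken from the slides.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def tolerate(primary: Callable[[], T], fallback: Callable[[], T]) -> T:
    """Run `primary`; if it raises, contain the fault and use `fallback`."""
    try:
        return primary()
    except Exception:
        # The component fault is isolated here instead of propagating
        # to the caller as a system failure.
        return fallback()


def faulty_sensor() -> float:
    # Hypothetical component that always fails.
    raise RuntimeError("sensor component fault")


def last_known_good() -> float:
    # Hypothetical safe default used when the sensor is unavailable.
    return 21.5


reading = tolerate(faulty_sensor, last_known_good)
```

Whether a fallback value is acceptable depends on the application; this only illustrates the isolation idea, not a recommended recovery policy.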

Dependability - attributes (BW p. 139)
- Availability
- Reliability
- Safety
- Confidentiality
- Integrity
- Maintainability

Dependability - means (BW p. 106, ...)
- Fault prevention
- Fault tolerance
- Error removal
- Failure forecasting

Dependability - impediments (BW p. 103, ...)
- Faults
- Errors
- Failures
Fault → Error → Failure → ... → Fault
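
The chain from fault to error to failure can be illustrated with a small sketch. It assumes the usual reading that a dormant fault, once activated, produces an erroneous state, which becomes a failure when it reaches the component's interface, and that this failure is in turn a fault for the enclosing system. The classes `Component` and `System` are hypothetical.

```python
class Component:
    def __init__(self, has_fault: bool) -> None:
        self.has_fault = has_fault  # fault: a dormant defect
        self.state_ok = True        # becomes False once an error exists

    def activate(self) -> None:
        # Activating the fault corrupts the internal state (an error).
        if self.has_fault:
            self.state_ok = False

    def serve(self) -> str:
        # The error reaches the interface: the component fails.
        if not self.state_ok:
            raise RuntimeError("component failure")
        return "ok"


class System:
    def __init__(self, component: Component) -> None:
        self.component = component
        self.faults_seen = 0

    def serve(self) -> str:
        try:
            return self.component.serve()
        except RuntimeError:
            # The component's failure is a fault from the system's viewpoint.
            self.faults_seen += 1
            return "degraded"


component = Component(has_fault=True)
before = component.serve()   # fault is still dormant, service is correct
component.activate()
system = System(component)
after = system.serve()       # system tolerates the component failure
```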

System and Component

Fault classification
Origin:
- physical (internal/external)
- logical (design/interaction)
Kind:
- omission
- value
- timing
- byzantine
Property:
- duration (permanent, transient)
- consistency (determinate, nondeterminate)
- autonomy (spontaneous, event-dependent)
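
One way to make this classification concrete is to encode it as a small data structure. The sketch below is an assumption about how the categories might be modelled; the type names and the example fault are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    PHYSICAL_INTERNAL = "physical/internal"
    PHYSICAL_EXTERNAL = "physical/external"
    LOGICAL_DESIGN = "logical/design"
    LOGICAL_INTERACTION = "logical/interaction"


class Kind(Enum):
    OMISSION = "omission"
    VALUE = "value"
    TIMING = "timing"
    BYZANTINE = "byzantine"


class Duration(Enum):
    PERMANENT = "permanent"
    TRANSIENT = "transient"


class Consistency(Enum):
    DETERMINATE = "determinate"
    NONDETERMINATE = "nondeterminate"


class Autonomy(Enum):
    SPONTANEOUS = "spontaneous"
    EVENT_DEPENDENT = "event-dependent"


@dataclass(frozen=True)
class Fault:
    origin: Origin
    kind: Kind
    duration: Duration
    consistency: Consistency
    autonomy: Autonomy


# Hypothetical example: a radiation-induced bit flip in memory.
bit_flip = Fault(
    origin=Origin.PHYSICAL_EXTERNAL,
    kind=Kind.VALUE,
    duration=Duration.TRANSIENT,
    consistency=Consistency.NONDETERMINATE,
    autonomy=Autonomy.SPONTANEOUS,
)
```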

Error Classification (Fault → Error)
Effect:
- latent
- effective
Extent:
- local
- distributed

Failure Classification (Fault → Failure)
Consequence:
- benign
- malign (a mishap)
BW (Failure modes) p. 105

Fault Avoidance
Careful Design:
- process (procedures)
- notations
- tools
Conservative Design:
- robust functionality
- testability
- traceability

Error Removal
- Verification (analysis of design)
- Test (analysis of implementation)

Failure Forecasting
- Calculation – analysis of design
- Simulation – measurement on design
- Test – measurement on implementation
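
As a hedged illustration of forecasting by calculation, the sketch below applies the standard reliability formulas for components in series and in parallel, assuming component failures are independent. The slides do not give these formulas; the function names and the figures used are hypothetical.

```python
import math
from typing import Sequence


def series_reliability(rs: Sequence[float]) -> float:
    """All components must work: R = product of R_i."""
    return math.prod(rs)


def parallel_reliability(rs: Sequence[float]) -> float:
    """At least one component must work: R = 1 - product of (1 - R_i)."""
    return 1.0 - math.prod(1.0 - r for r in rs)


# Hypothetical design: two components, each with reliability 0.9.
in_series = series_reliability([0.9, 0.9])      # lower than either component
in_parallel = parallel_reliability([0.9, 0.9])  # higher than either component
```

The comparison shows why redundancy (the parallel case) is a common fault tolerance technique, subject to the independence assumption holding in the actual design.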