Software Fault Protection. Allen Goldberg, Kestrel Technology.

Presentation transcript:


Workshop on Aviation Software, Oct
System Engineering
System engineers build reliable systems from less reliable components. Redundancy is a primary means of achieving reliability. Systems are monitored for anomalies. Fault containment mechanisms (e.g. firewalls) limit damage.
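
As a rough illustration of why redundancy is a primary means of achieving reliability, the sketch below computes the reliability of a set of redundant components. It assumes the copies fail independently, which the slide does not state and which is often untrue for software; the function name and parameters are hypothetical.

```python
def parallel_reliability(component_reliability: float, copies: int) -> float:
    """Probability that at least one of `copies` components works.

    Assumption (not from the slides): failures are independent.
    """
    if not 0.0 <= component_reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if copies < 1:
        raise ValueError("need at least one copy")
    # The redundant set fails only if every single copy fails.
    return 1.0 - (1.0 - component_reliability) ** copies
```

Under that assumption, two components that each work 90% of the time give roughly 99%, and three give roughly 99.9%.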

What About Software?
Software practice tends to assume perfection, with little accommodation for failure, even though perfection is rarely achievable. Can we make reliable software systems from less reliable software components?

IVHM Fault Protection Systems
[Diagram labels: System under control; Fault Protection System; model; monitoring; fault response.]

Software Fault Protection (SFP)
The SUT is software.
[Diagram labels: Software Fault Protection System; model of software; monitoring; fault response.]

Software Redundancy
Redundancy: different representations of software behavior (code, test case, model, …). Redundancy is expensive. How should you invest your “redundancy” dollars?

Effective Redundancy at Runtime
Software “model”. “1.2” version programming: 1 full-featured, efficient, complex version, plus a 0.2 backup version that performs essential functions.
[Diagram labels: software; Software Fault Protection System; model of software; monitoring; fault response.]
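
One way to picture “1.2” version programming is a wrapper that runs the full-featured version and falls back to the small backup when the full version crashes or its result looks unreasonable. This is a minimal sketch under that reading of the slide; every name in it (`run_with_backup`, `full_version`, `backup_version`, `looks_reasonable`) is hypothetical, and the defect is simulated.

```python
from typing import Callable, Tuple


def run_with_backup(
    primary: Callable[[float], float],
    backup: Callable[[float], float],
    is_acceptable: Callable[[float], bool],
    request: float,
) -> Tuple[float, str]:
    """Return (result, which_version_produced_it)."""
    try:
        result = primary(request)
        if is_acceptable(result):
            return result, "primary"
    except Exception:
        # A crash in the complex version is treated like a rejected result.
        pass
    return backup(request), "backup"


def full_version(reading: float) -> float:
    # Stand-in for the complex "1" version, with a simulated defect.
    if reading > 100.0:
        raise RuntimeError("simulated defect in full-featured version")
    return reading * 0.5 + 1.0


def backup_version(reading: float) -> float:
    # Stand-in for the "0.2" version: essential function only.
    return 0.0


def looks_reasonable(value: float) -> bool:
    # Hypothetical acceptance check on the output.
    return -10.0 <= value <= 40.0
```

For example, `run_with_backup(full_version, backup_version, looks_reasonable, 20.0)` is served by the primary, while a reading of 150.0 (crash) or 90.0 (out-of-range output) is served by the backup.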

Software Model
When software fails it is usually “obviously” wrong. Simple models can detect errors in interface behavior, data reasonableness, and resource usage. Our model extends the ARINC 653 configuration file.
[Diagram labels: software; Software Fault Protection System; model of software; monitoring; fault response.]
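
The three kinds of check named on this slide can be sketched as a small declarative model plus a monitor function. The field names and limits below are illustrative assumptions only; they do not reflect the ARINC 653 configuration file format or the author's actual model.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class ComponentModel:
    allowed_calls: FrozenSet[str]        # interface behavior
    output_range: Tuple[float, float]    # data reasonableness
    max_memory_bytes: int                # resource usage
    max_step_seconds: float              # resource usage


@dataclass(frozen=True)
class Observation:
    call: str
    output: float
    memory_bytes: int
    step_seconds: float


def check(model: ComponentModel, obs: Observation) -> List[str]:
    """Return the list of violated checks; an empty list means no anomaly."""
    violations = []
    if obs.call not in model.allowed_calls:
        violations.append("interface behavior")
    low, high = model.output_range
    if not low <= obs.output <= high:
        violations.append("data reasonableness")
    if (obs.memory_bytes > model.max_memory_bytes
            or obs.step_seconds > model.max_step_seconds):
        violations.append("resource usage")
    return violations
```

The model stays far simpler than the monitored software, which fits the slide's point that failures are usually “obviously” wrong.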

Failure Responses
Safe modes: terminate non-essential activities. Component reset (supported by 653): transient errors lead to bad state. Component replacement (supported by 653): “1.2” version programming.
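
The slide lists three responses but does not say how to choose among them. The sketch below assumes one possible escalation order (reset first, then replacement with the backup version, then safe mode); that ordering, the `max_resets` parameter, and the class names are assumptions made for illustration.

```python
from enum import Enum
from typing import Dict, Set


class Response(Enum):
    RESET = "component reset"
    REPLACE = "component replacement"
    SAFE_MODE = "safe mode"


class ResponsePolicy:
    """Hypothetical per-component escalation policy."""

    def __init__(self, max_resets: int = 1) -> None:
        self.max_resets = max_resets
        self.resets: Dict[str, int] = {}
        self.replaced: Set[str] = set()

    def on_fault(self, component: str) -> Response:
        if component in self.replaced:
            # Even the backup version is failing: shed non-essential work.
            return Response.SAFE_MODE
        count = self.resets.get(component, 0)
        if count < self.max_resets:
            # A transient error may have left bad state; a reset clears it.
            self.resets[component] = count + 1
            return Response.RESET
        # Resets did not help, so swap in the backup version.
        self.replaced.add(component)
        return Response.REPLACE
```

With `max_resets=1`, three successive faults in the same component yield a reset, then a replacement, then safe mode, while a fault in a different component starts again at reset.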

Fault Containment
Eliminate “non-logical” software dependencies: error propagation (crash) and resource contention. ARINC 653. Fault containment is essential to fault isolation.
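
ARINC 653 contains faults by partitioning components in time and space. As a loose desktop analogy only, and not the ARINC 653 API, the sketch below runs a component in a separate operating-system process with a time limit, so that a crash or a hang is reported to the caller instead of propagating. The function name and return labels are hypothetical.

```python
import subprocess
import sys


def run_contained(source: str, timeout_seconds: float = 5.0) -> str:
    """Run Python `source` in a child process and report how it ended."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,       # keep the child's output out of ours
            timeout=timeout_seconds,   # bound the time the child may consume
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return "timeout"
    return "ok" if completed.returncode == 0 else "crashed"
```

Because the caller can tell which component failed and how, containment of this kind also supports the slide's point about fault isolation.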

Future Work
Relate SFP to multi-string flight computers and to system fault protection. Relate SFP to the treatment of radiation-induced SEUs. Generate SFP models from software design artifacts. Generate SFP implementations from SFP models.