Software Fault Protection
Allen Goldberg, Kestrel Technology

Workshop on Aviation Software, Oct

System Engineering
- System engineers build reliable systems from less reliable components.
- Redundancy is a primary means of achieving reliability.
- Systems are monitored for anomalies.
- Fault containment mechanisms (e.g. firewalls) limit damage.

What About Software?
- Software engineering assumes perfection and makes little accommodation for failure, even though perfection is rarely achievable.
- Can we make reliable software systems from less reliable software components?

IVHM Fault Protection Systems
[Diagram labels: system under control; fault protection system (model, monitoring, fault response)]

Software Fault Protection (SFP)
- SUT is software
[Diagram labels: software; Software Fault Protection System (model of software, monitoring, fault response)]

Software Redundancy
- Redundancy: different representations of software behavior
  - code
  - test case
  - model
  - …
- Redundancy is expensive.
- How should you invest your “redundancy” dollars?

Effective Redundancy at Runtime
- Software “model”
- “1.2” version programming
  - 1 full-featured, efficient, complex version
  - 0.2 backup version that performs essential functions
[Diagram labels: software; Software Fault Protection System (model of software, monitoring, fault response)]
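
The “1.2” version idea above can be illustrated with a minimal sketch. It assumes that a failure of the full-featured version shows up either as an exception or as a result rejected by an acceptance check. The names `run_with_backup`, `primary_gain`, `backup_gain`, and `in_range` are hypothetical and do not come from the talk.

```python
from typing import Callable, Tuple


def run_with_backup(primary: Callable[[float], float],
                    backup: Callable[[float], float],
                    acceptable: Callable[[float], bool],
                    x: float) -> Tuple[float, str]:
    """Run the full-featured version; fall back to the backup on failure."""
    try:
        result = primary(x)
        if acceptable(result):
            return result, "primary"
    except Exception:
        pass  # a crash in the primary is handled like a rejected result
    return backup(x), "backup"


# Hypothetical "1" version: full-featured, but with a latent fault near x == 3.
def primary_gain(x: float) -> float:
    return 100.0 / (x - 3.0)


# Hypothetical "0.2" version: crude, but always bounded.
def backup_gain(x: float) -> float:
    return max(-10.0, min(10.0, x))


# Hypothetical acceptance check on the primary's output.
def in_range(y: float) -> bool:
    return -50.0 <= y <= 50.0
```

The backup is deliberately simple enough that its correctness is easy to argue. That is the point of spending only "0.2" of a version on it.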

Software Model
- When software fails it is usually “obviously” wrong.
- Simple models can detect errors in:
  - interface behavior
  - data reasonableness
  - resource usage
- Our model extends the ARINC 653 configuration file.
[Diagram labels: software; Software Fault Protection System (model of software, monitoring, fault response)]
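
As a rough illustration of how a simple model can flag the three kinds of error listed above, the sketch below checks one observation of a component against a small model. The `SoftwareModel` and `Observation` structures and their fields are assumptions made for this example. They do not reproduce the ARINC 653 configuration file format or the authors' extension of it.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class SoftwareModel:
    output_range: Tuple[float, float]      # data reasonableness bounds
    max_cpu_ms: float                      # resource usage budget
    allowed_calls: Dict[str, Set[str]]     # call -> calls that may follow it


@dataclass
class Observation:
    call: str        # interface operation that was invoked
    output: float    # value the component produced
    cpu_ms: float    # CPU time the invocation consumed


def check(model: SoftwareModel,
          previous_call: Optional[str],
          obs: Observation) -> List[str]:
    """Return the categories of anomaly found in one observation."""
    anomalies = []
    # Interface behavior: is this call allowed to follow the previous one?
    if previous_call is not None and \
            obs.call not in model.allowed_calls.get(previous_call, set()):
        anomalies.append("interface")
    # Data reasonableness: written so that NaN is also rejected.
    lo, hi = model.output_range
    if not (lo <= obs.output <= hi):
        anomalies.append("data")
    # Resource usage: did the call exceed its CPU budget?
    if obs.cpu_ms > model.max_cpu_ms:
        anomalies.append("resource")
    return anomalies


# Hypothetical model of a file-like component.
model = SoftwareModel(
    output_range=(0.0, 100.0),
    max_cpu_ms=5.0,
    allowed_calls={"open": {"read", "close"},
                   "read": {"read", "close"},
                   "close": {"open"}},
)
```

None of these checks needs to know what the correct output is. They only need to know what a plausible one looks like, which keeps the model far simpler than the software it watches.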

Failure Responses
- Safe modes: terminate non-essential activities
- Component reset (supported by 653)
  - transient errors lead to bad state
- Component replacement (supported by 653)
  - “1.2” version programming
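
One way to combine the responses above is an escalation policy: try a reset first, then swap in the backup component, then drop to a safe mode. The ordering, the `max_resets` threshold, and the `FaultResponder` class are assumptions made for illustration. The talk lists the responses but does not prescribe an order.

```python
from enum import Enum


class Response(Enum):
    NONE = "none"
    RESET = "reset"          # clear bad state left by a transient error
    REPLACE = "replace"      # switch to the simpler backup version
    SAFE_MODE = "safe_mode"  # terminate non-essential activities


class FaultResponder:
    """Escalate from reset, to replacement, to safe mode (assumed policy)."""

    def __init__(self, max_resets: int = 2):
        self.max_resets = max_resets
        self.resets = 0
        self.on_backup = False

    def respond(self, anomaly: bool) -> Response:
        if not anomaly:
            return Response.NONE
        if not self.on_backup and self.resets < self.max_resets:
            self.resets += 1
            return Response.RESET
        if not self.on_backup:
            self.on_backup = True
            return Response.REPLACE
        # The backup itself is misbehaving: fall back to a safe mode.
        return Response.SAFE_MODE
```

A real policy would probably also decay the reset counter after a period of healthy operation. That is omitted here to keep the sketch short.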

Fault Containment
- Eliminate “non-logical” software dependencies
  - error propagation (crash)
  - resource contention
- ARINC 653
- Fault containment is essential to fault isolation.

Future Work
- Relate SFP to multi-string flight computers and system fault protection
- Relate SFP to the treatment of radiation-induced SEUs
- Generate SFP models from software design artifacts
- Generate SFP implementations from SFP models