Software Fault Tolerance: The Big Picture
mMIC-SFT, September 2003
Anders P. Ravn, Aalborg University
Fault Tolerance
- Means to isolate component faults
- Prevents system failures
- May increase system dependability
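A minimal sketch of isolating a component fault, assuming the fault shows up as an exception and that a fixed fallback value is acceptable to the rest of the system. The names `isolate`, `faulty_sensor` and `healthy_sensor` are hypothetical and do not come from the slides.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def isolate(component: Callable[[], T], fallback: T) -> T:
    """Run a component and contain any exception it raises."""
    try:
        return component()
    except Exception:
        # The component fault is confined here, so the caller keeps running.
        return fallback


def healthy_sensor() -> float:
    # Hypothetical component that works as intended.
    return 21.5


def faulty_sensor() -> float:
    # Hypothetical component whose fault surfaces as an exception.
    raise RuntimeError("sensor read failed")


readings = [
    isolate(healthy_sensor, fallback=0.0),
    isolate(faulty_sensor, fallback=0.0),
]
# readings == [21.5, 0.0]
```

The system keeps delivering a reading in both cases. Whether dependability actually increases depends on whether the fallback is safe for the application, which is one reason the slide says "may".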
Dependability - attributes
- Availability
- Reliability
- Safety
- Confidentiality
- Integrity
- Maintainability
BW p. 139
Dependability - means
- Fault prevention
- Fault tolerance
- Error removal
- Failure forecasting
BW p. 106,...
Dependability - impediments
- Faults
- Errors
- Failures
BW p. 103,...
Fault → Error → Failure → ... → Fault
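The chain on this slide can be illustrated with a small sketch. It assumes a hypothetical speed sensor component with a design fault (a wrong scale factor) inside a hypothetical cruise-control system; none of these names or numbers come from the slides.

```python
def read_speed_kmh(raw_counts: int) -> float:
    # Fault: hypothetical design fault, the scale factor should have been 0.1.
    return raw_counts * 0.01


def cruise_command(target_kmh: float, raw_counts: int) -> str:
    # The component's wrong result is a component failure, and at the same
    # time a fault as seen by the enclosing system.
    speed = read_speed_kmh(raw_counts)  # Error: the system state is now wrong.
    # Failure: the delivered command deviates from the intended service.
    return "accelerate" if speed < target_kmh else "hold"


# With 1000 counts the intended speed is 100 km/h, so "hold" was expected.
command = cruise_command(target_kmh=80.0, raw_counts=1000)
# command == "accelerate"
```

The failure of the inner component plays the role of the final "Fault" in the chain: it is the cause of the error in the system one level up.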
System and Component
Fault classification
- Origin: physical (internal/external); logical (design/interaction)
- Kind: omission; value; timing; byzantine
- Property: duration (permanent, transient); consistency (determinate, nondeterminate); autonomy (spontaneous, event-dependent)
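The classification above can be written down as a small data structure, for example to tag entries in a fault log. This is only a sketch: the enum and class names are assumptions, and the example fault and its classification are illustrative rather than taken from the slides.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    PHYSICAL_INTERNAL = "physical, internal"
    PHYSICAL_EXTERNAL = "physical, external"
    LOGICAL_DESIGN = "logical, design"
    LOGICAL_INTERACTION = "logical, interaction"


class Kind(Enum):
    OMISSION = "omission"
    VALUE = "value"
    TIMING = "timing"
    BYZANTINE = "byzantine"


class Duration(Enum):
    PERMANENT = "permanent"
    TRANSIENT = "transient"


class Consistency(Enum):
    DETERMINATE = "determinate"
    NONDETERMINATE = "nondeterminate"


class Autonomy(Enum):
    SPONTANEOUS = "spontaneous"
    EVENT_DEPENDENT = "event-dependent"


@dataclass(frozen=True)
class Fault:
    """One fault, classified along the three axes of the slide."""
    description: str
    origin: Origin
    kind: Kind
    duration: Duration
    consistency: Consistency
    autonomy: Autonomy


# Illustrative classification of a hypothetical fault.
bit_flip = Fault(
    description="radiation-induced bit flip in RAM",
    origin=Origin.PHYSICAL_EXTERNAL,
    kind=Kind.VALUE,
    duration=Duration.TRANSIENT,
    consistency=Consistency.NONDETERMINATE,
    autonomy=Autonomy.SPONTANEOUS,
)
```

Splitting "Property" into three separate enums keeps the axes independent, so a fault can be permanent yet nondeterminate, for instance.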
Error Classification (Fault → Error)
- Effect: latent; effective
- Extent: local; distributed
Failure Classification (Fault → Failure)
- Consequence: benign; malign (a mishap)
BW (Failure modes) p. 105
Fault Avoidance
- Careful Design: process (procedures); notations; tools
- Conservative Design: robust functionality; testability; traceability
Error Removal
- Verification (analysis of design)
- Test (analysis of implementation)
Failure Forecasting
- Calculation: analysis of design
- Simulation: measurement on design
- Test: measurement on implementation
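As an example of forecasting by calculation, the sketch below applies the standard textbook formulas for steady-state availability and for the reliability of series and parallel structures. The formulas are not given on the slides, they assume independent component failures, and all numbers are hypothetical.

```python
import math
from typing import Iterable


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def series_reliability(reliabilities: Iterable[float]) -> float:
    """All components must work, assuming independent failures."""
    return math.prod(reliabilities)


def parallel_reliability(reliabilities: Iterable[float]) -> float:
    """At least one component must work, assuming independent failures."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


# Hypothetical design: two components, each with reliability 0.9.
single_chain = series_reliability([0.9, 0.9])     # about 0.81
redundant_pair = parallel_reliability([0.9, 0.9]) # about 0.99
uptime = availability(mttf_hours=999.0, mttr_hours=1.0)  # about 0.999
```

The comparison between `single_chain` and `redundant_pair` is the kind of design-level result such a calculation gives before anything is built. Simulation and test then measure what the calculation only predicts.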