Published by Erika Anthony. Modified over 9 years ago.
1
Fault Tolerance Mechanisms
ITV Model-based Analysis and Design of Embedded Software
Techniques and methods for Critical Software
Anders P. Ravn, Aalborg University, August 2011
2
Fault Tolerance
Means to isolate component faults and mask them
Prevents system failures
May increase system dependability
3
Fault Tolerance
4
FT - levels
Full tolerance
Graceful degradation
Fail safe
BW p. 107
5
FT basis: Redundancy
Time: Try, Retry, ...
Space: Try, ...
6
Fault Tolerance
7
Basic Strategies
8
Dynamic Redundancy
1. Error detection
2. Damage confinement and assessment
3. Error recovery
4. Fault treatment and continued service
BW p. 114
9
Error Detection
f: State × Input → State × Output
Environment (exception)
Application
  Assertion: precondition(input), postcondition(input, output), invariant(state, state')
  Timing: WCET(f, input), Deadline(f, input) D
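The application-level checks listed above can be sketched in Java roughly as follows. This is only an illustration, not code from the course: the class `CheckedCall`, the exception `ErrorDetected`, and the choice of an integer square root as the function f are assumptions made for the example. The state invariant is left out because the example function is stateless, and the timing check only detects a missed deadline after the fact; it does not establish a WCET.

```java
import java.util.function.IntUnaryOperator;

/** Hypothetical sketch: wraps a computation f with application-level error detection. */
final class CheckedCall {
    /** Raised when a check fails (name is an assumption of this sketch). */
    static final class ErrorDetected extends RuntimeException {
        ErrorDetected(String what) { super(what); }
    }

    /**
     * Runs f(input), where f is expected to compute an integer square root.
     * Checks precondition(input), postcondition(input, output) and a deadline.
     */
    static int call(IntUnaryOperator f, int input, long deadlineNanos) {
        // precondition(input)
        if (input < 0) throw new ErrorDetected("precondition: input >= 0");

        long start = System.nanoTime();
        int output = f.applyAsInt(input);
        long elapsed = System.nanoTime() - start;

        // postcondition(input, output): output^2 <= input < (output+1)^2
        long lo = (long) output * output;
        long hi = (long) (output + 1) * (output + 1);
        if (output < 0 || lo > input || hi <= input) throw new ErrorDetected("postcondition");

        // timing: the call must have finished within the deadline
        if (elapsed > deadlineNanos) throw new ErrorDetected("deadline");
        return output;
    }

    public static void main(String[] args) {
        System.out.println(call(x -> (int) Math.sqrt(x), 17, 1_000_000_000L)); // prints 4
    }
}
```

A faulty f, for instance `x -> x / 2`, is rejected by the postcondition, so the error is detected at the component boundary before the result is used.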
10
Damage Confinement
Static structure
Dynamic structure (transaction)
[Figure labels: object, I, I]
11
Error Recovery
Forward: repair the state, if you can!
Backward:
  define recovery points
  checkpoint state at recovery point
  roll back
  retry
Domino effect
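Backward recovery as outlined above (checkpoint, roll back, retry) might look like this for a single object. The class `Checkpointed` and its method names are hypothetical, and the sketch assumes the state is a plain `int[]` that can be copied cheaply. It covers one sequential component only, so the domino effect between communicating processes does not arise here.

```java
import java.util.Arrays;
import java.util.function.Consumer;

/** Hypothetical sketch of backward error recovery over an int[] state. */
final class Checkpointed {
    private int[] state;
    private int[] recoveryPoint;   // checkpointed copy of the state

    Checkpointed(int[] initial) { state = initial.clone(); }

    /** Defines a recovery point by saving a copy of the current state. */
    void checkpoint() { recoveryPoint = state.clone(); }

    /** Discards the current state and returns to the recovery point. */
    void rollBack() { state = recoveryPoint.clone(); }

    int[] state() { return state.clone(); }

    /**
     * Checkpoints, then applies op to the state. If op throws, the state is
     * rolled back and op is retried, at most maxTries times in total.
     * @return true if some try completed without an exception
     */
    boolean attempt(Consumer<int[]> op, int maxTries) {
        checkpoint();
        for (int t = 0; t < maxTries; t++) {
            try {
                op.accept(state);
                return true;
            } catch (RuntimeException e) {
                rollBack();   // undo any partial update before the retry
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        Checkpointed c = new Checkpointed(new int[] {1, 2, 3});
        boolean ok = c.attempt(s -> {
            s[0] = 99;                                  // partial update
            if (calls[0]++ == 0) throw new IllegalStateException("transient fault");
            s[0] = 10;
        }, 3);
        System.out.println(ok + " " + Arrays.toString(c.state())); // prints true [10, 2, 3]
    }
}
```

Retrying the same operation only helps against transient faults; a permanent design fault needs a different alternate, which is what recovery blocks provide.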
12
Recovery blocks
ENSURE acceptance_test
BY { module_1 }
ELSE BY { module_2 }
...
ELSE BY { module_m }
ELSE ERROR
BW p. 120
13
Implementation of Recovery Blocks
14
Abstract class RecoveryBlock
public abstract class RecoveryBlock implements Cloneable { // Cloneable is required for clone() in save()
    abstract boolean acceptanceTest();

    /** Method to produce the result; it must be implemented by the application.
     * @param module 0, ..., MaxBlocks-1 */
    abstract void block(int module);

    /* MaxBlocks must be set by the application to the number of blocks */
    protected int MaxBlocks;
15
RecoveryBlock execution
    /** Executes recovery modules 0, 1, ..., MaxBlocks-1 until one succeeds.
     * @throws NoAccept if no module passes acceptanceTest. */
    public final void do_it() throws NoAccept, CloneNotSupportedException {
        save();
        int i = 0;
        do {
            try {
                block(i++);
                if (acceptanceTest()) return;
            } catch (Exception e) {
                /* if the block fails, we continue: not accepted */
            }
            restore(copy);
        } while (i < MaxBlocks);
        throw new NoAccept();
    }
16
RecoveryBlock cache
public abstract class RecoveryBlock implements Cloneable { // Cloneable is required for clone() below
    /** The recovery cache is implemented by a clone of the original object. */
    RecoveryBlock copy;

    /** Saves the object to the recovery cache; uses Java clone, which must be a deep clone. */
    private final void save() throws CloneNotSupportedException {
        copy = (RecoveryBlock) this.clone();
    }

    /** Restores data from the recovery cache; it must be implemented by the application.
     * @param copy the saved object to restore from */
    abstract void restore(RecoveryBlock copy);
17
Application
/** Extends the basic abstract RecoveryBlock with faulty sorting
 * algorithms, and logs calls, returns, etc. to a TextArea. */
public class RecoveringSort extends RecoveryBlock {
    /** checksum for acceptance test */
    private int checksum;
    /** data to be saved in recovery cache */
    private int[] argument;
    /** log area; this declaration is assumed, it is not shown on the slide */
    private TextArea log;

    public RecoveringSort(TextArea t) {
        MaxBlocks = 3;
        log = t;
    }
18
Acceptance criteria
    /* Acceptance test for sorting; it shall verify:
     * 1) the return value is an ordered list,
     * 2) the return value is a permutation of the initial values */
    boolean acceptanceTest() {
        boolean result = true;
        // check ordering
        int i = argument.length - 1;
        while (i > 0)
            if (argument[i] < argument[--i]) { result = false; break; }
        // check permutation; this is a partial check through a checksum.
        // A full check is as expensive computationally as sorting,
        // thus we use a partial check.
        i = argument.length;
        int sum = 0;
        while (i > 0) sum += argument[--i];
        return result && (sum == checksum);
    }
19
Application - modules
    /** Starts sorting using the recovery block mechanism.
     * @param data integer array containing elements to be sorted. */
    public int[] sort(int[] data) {
        argument = (int[]) data.clone(); // copy needed for recovery to work
        checksum = 0;
        int i = argument.length;
        while (i > 0) checksum += argument[--i];
        try {
            do_it();
        } catch (NoAccept e) {
            log.append("All blocks failed\n");
        } catch (CloneNotSupportedException e) {
            // do_it() declares this checked exception, so it must be handled here
            log.append("Recovery cache could not be saved\n");
        }
        return argument;
    }

    void block(int i) {
        switch (i) {
            case 0: BucketSort(argument); break;
            case 1: BadSort(argument); break;
            case 2: AlmostGoodSort(argument); break;
            default:
        }
    }
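For readers who want to run the scheme without the GUI and the sorting modules that the slides leave out, here is a compact, self-contained variant. It is a sketch under assumptions rather than the course code: `RecoveryBlockDemo`, `ensure`, and `accept` are hypothetical names, the alternates are passed as lambdas instead of being selected in a `switch`, and the recovery cache is simply a fresh copy of the input array for every alternate.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical, self-contained variant of the recovery-block scheme for sorting. */
final class RecoveryBlockDemo {
    /** Thrown when no alternate passes the acceptance test (mirrors NoAccept on the slides). */
    static final class NoAccept extends Exception {}

    /** Acceptance test: ordered, and same checksum as the input (a partial permutation check). */
    static boolean accept(int[] a, long checksum) {
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            if (i > 0 && a[i - 1] > a[i]) return false;
            sum += a[i];
        }
        return sum == checksum;
    }

    /** ENSURE accept BY alternates[0] ELSE BY alternates[1] ... ELSE ERROR. */
    static int[] ensure(int[] data, List<Consumer<int[]>> alternates) throws NoAccept {
        long checksum = 0;
        for (int v : data) checksum += v;
        for (Consumer<int[]> alternate : alternates) {
            int[] work = data.clone();   // start every alternate from the recovery point
            try {
                alternate.accept(work);
                if (accept(work, checksum)) return work;
            } catch (RuntimeException e) {
                // an alternate that throws counts as a rejected result
            }
        }
        throw new NoAccept();
    }

    public static void main(String[] args) throws NoAccept {
        List<Consumer<int[]>> alternates = List.of(
            a -> { throw new IllegalStateException("crash"); },  // fails with an exception
            a -> Arrays.fill(a, 0),                              // ordered, but wrong values
            a -> Arrays.sort(a));                                // correct alternate
        System.out.println(Arrays.toString(ensure(new int[] {3, 1, 2}, alternates)));
        // prints [1, 2, 3]
    }
}
```

The second alternate shows why the checksum is needed: its output is ordered, so the ordering check alone would accept it.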
20
Fault classes (scope of R-B)
Origin: physical (internal/external), logical (design/interaction)
Kind: omission, value, timing, byzantine
Property: duration (permanent, transient), consistency (determinate, nondeterminate), autonomy (spontaneous, event-dependent)
Marks (in slide order): + (+) ++ (-) + / (+) + / +
21
The ideal FT-component
[Figure: component with Normal mode and Exception handler; labels: Request/response, Interface exception, Failure exception]
22
N-version programming
[Figure: versions V1, V2, V3 and a Driver (comparator); labels: Comparison vectors (votes), Comparison status indicators, Comparison points]
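The driver's role can be illustrated with a small majority voter. This is a simplified sketch with assumed names (`NVersionDriver`, `vote`): the versions run sequentially rather than concurrently, there is a single comparison point at the end, and votes are compared for exact equality, which would not suit inexact (for example floating-point) results.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.IntUnaryOperator;

/** Hypothetical sketch of an N-version driver with exact majority voting. */
final class NVersionDriver {
    /**
     * Runs every version on the same input, collects the votes, and returns
     * the value that a strict majority of the N versions agree on, if any.
     */
    static Optional<Integer> vote(List<IntUnaryOperator> versions, int input) {
        int n = versions.size();
        Integer[] votes = new Integer[n];   // null marks a version that failed
        for (int i = 0; i < n; i++) {
            try {
                votes[i] = versions.get(i).applyAsInt(input);
            } catch (RuntimeException e) {
                votes[i] = null;            // a crashed version casts no vote
            }
        }
        for (Integer candidate : votes) {
            if (candidate == null) continue;
            int agree = 0;
            for (Integer v : votes) {
                if (candidate.equals(v)) agree++;
            }
            if (2 * agree > n) return Optional.of(candidate);
        }
        return Optional.empty();            // no majority: report failure to the caller
    }

    public static void main(String[] args) {
        List<IntUnaryOperator> versions = List.of(
            x -> x * x,                     // V1
            x -> x * x,                     // V2
            x -> x + 1);                    // V3, faulty
        System.out.println(vote(versions, 3)); // prints Optional[9]
    }
}
```

Unlike a recovery block, no acceptance test is needed: the faulty version is masked because the other two outvote it.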
23
Fault classes (scope of N-VP)
Origin: physical (internal/external), logical (design/interaction)
Kind: omission, value, timing, byzantine
Property: duration (permanent, transient), consistency (determinate, nondeterminate), autonomy (spontaneous, event-dependent)
Marks (in slide order): + (+) ++ + + / (+) + / +