
1 Response to Undesired Events in Software Systems
Presented by Joe Piccioni, Kim Ushe Mupfumira, Senthil Ramanathan, Smitha Chunduri

2 Overview
- Definition of Undesired Events (UEs)
- How to handle UEs
- Effects of UEs on code complexity
- Impossible Abstractions
- Direction of Propagation of UEs
- Common Error Indications
- Suggestions for a proper UE handling mechanism
- Degrees of UE
- Factors which determine the degree of UE
- Conclusions

3 What are Undesired Events?
- Deviation from normal behavior
- Errors should not be handled but corrected
- Even with programs proven to be correct, UEs at run time will continue to be a problem
- Routines to respond to UEs must be provided in reliable systems

4 Why should we expect UEs?
- Programs written to demonstrate structured programming are written with the assumption that they will always perform correctly
- Incorrect or inconsistent data may be supplied to the system
- Programs are changed from time to time, and new errors may appear

5 What should we do about UEs?
- Programs can be defined to take corrective action when UEs occur
- Often such programs can only be added after a period of use
- The structure of the system should allow for such a likely change or addition to the program, to enhance overall system reliability

6 A program’s response to UEs
- Attempt self-diagnosis
- Print diagnostic information
- Save partial results
- Retry
- Use alternative resources
- Send a message to the user

7 Leveled structure
- A UE will be detected by a lower level
- Information available elsewhere (usually at higher levels) determines the appropriate action
- The UE should be communicated to higher levels, where diagnosis and recovery are attempted

8 Effects of UEs on code complexity
- The probability of UEs is higher in I/O modules
- Straightforward machine-language code to write on a tape is usually simple
- The code needed for error detection and correction makes the program quite complex
- As a result, changing the normal case is difficult

9 Solution
- Parnas proposes the use of a software analog of the trap used in hardware systems
- Traps simplify code and decrease the probability of UEs going undetected
- The code concerned with recovery from a UE is called by means of a trap
- This organization achieves a lexical separation of normal use, detection, and correction procedures, thereby easing changes
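
The trap idea can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: `WriteTrap` and `TapeDrive` are hypothetical names that do not come from the presentation, and a callback interface stands in for the trap mechanism. The point is the lexical separation: normal use, detection, and recovery each sit in their own place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical trap interface: recovery code lives behind it, apart from the normal case.
interface WriteTrap {
    void onCapacityExceeded(int requested, int remaining);
}

// Hypothetical tape module. The normal path stays short; detection is one guarded call.
class TapeDrive {
    private final List<String> blocks = new ArrayList<>();
    private final int capacity;
    private final WriteTrap trap;

    TapeDrive(int capacity, WriteTrap trap) {
        this.capacity = capacity;
        this.trap = trap;
    }

    void write(String block) {
        int remaining = capacity - blocks.size();
        if (remaining < 1) {                        // detection
            trap.onCapacityExceeded(1, remaining);  // control passes to the trap routine
            return;
        }
        blocks.add(block);                          // normal use
    }

    int blocksWritten() {
        return blocks.size();
    }
}
```

Because the recovery routine is supplied from outside, it can start as a trivial stub and be replaced later without touching `write`.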

10 Separating Error handling code from “Regular” code
- In traditional programming, error detection, reporting, and handling often lead to confusing spaghetti code.
- For example, pseudocode for a function that reads an entire file into memory might look like this:

    readFile {
        open the file;
        determine its size;
        allocate that much memory;
        read the file into memory;
        close the file;
    }

11 This function looks simple enough, but it ignores all of the following errors:
  – What happens if the file can’t be opened?
  – What happens if the length of the file can’t be determined?
  – What happens if enough memory can’t be allocated?
  – What happens if the read fails?
  – What happens if the file can’t be closed?

12 To answer these questions within your readFile function, your code would end up looking like this:

    errorCodeType readFile {
        initialize errorCode = 0;
        open the file;
        if (theFileIsOpen) {
            determine the length of the file;
            if (gotTheFileLength) {
                allocate that much memory;
                if (gotEnoughMemory) {
                    read the file into memory;
                    if (readFailed) errorCode = -1;
                } else errorCode = -2;
            } else errorCode = -3;
            close the file;
            if (theFileDidntClose) errorCode = (errorCode == 0) ? -4 : errorCode and -4;  // keep any earlier error
        } else errorCode = -5;
        return errorCode;
    }

13
- With error detection built in, your original 7 lines (shown in red on the slide) have been inflated to 17 lines of code
- Worse, there is so much error detection, reporting, and returning that the original 7 lines of code are lost in the clutter
- Java provides an easy solution to the problem of error management
- Exceptions enable you to write the main flow of your code and deal with the, well, exceptional cases elsewhere

14 If the readFile function used exceptions instead of traditional error management techniques, it would look like this:

    readFile {
        try {
            open the file;
            determine its size;
            allocate that much memory;
            read the file into memory;
            close the file;
        } catch (fileOpenFailed) {
            doSomething;
        } catch (sizeDeterminationFailed) {
            doSomething;
        } catch (memoryAllocationFailed) {
            doSomething;
        } catch (readFailed) {
            doSomething;
        } catch (fileCloseFailed) {
            doSomething;
        }
    }

15
- Note that exceptions do not spare you the work of detecting, reporting, and handling errors
- What exceptions do is separate all the details of what to do when a UE happens from the normal case
- The code also becomes smaller and its structure simpler
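
A runnable counterpart to the readFile pseudocode might look like the sketch below. It is an assumption-laden illustration rather than the presenters' code: `FileReaderSketch` is a hypothetical name, the standard `Files.readAllBytes` call stands in for the five pseudocode steps, and returning `null` on failure is just one possible policy.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

class FileReaderSketch {
    // Returns the file contents, or null when the file could not be read.
    static byte[] readFile(Path path) {
        try {
            // Normal case: open, size, allocate, read, and close in one call.
            return Files.readAllBytes(path);
        } catch (NoSuchFileException e) {
            System.err.println("no such file: " + path);
        } catch (IOException e) {
            System.err.println("read failed: " + e.getMessage());
        } catch (OutOfMemoryError e) {
            System.err.println("file too large to hold in memory");
        }
        return null;
    }
}
```

The normal case is a single line inside `try`; each kind of UE gets its own `catch` clause, which can be changed without touching the main flow.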

16 Impossible Abstractions
- The need to make an appropriate response often severely limits the abstractions we set up.
- Programs become less clear when users can’t write all of their code in terms of the abstract model.
  – For practical reasons, one must compromise the abstraction and include a set of degraded designs.
- Parnas’s second suggestion is not to specify a module as having properties which UEs frequently violate.
- Interfaces must include the operations needed to communicate the occurrence of a UE.

17 The Direction of Propagation of Undesired Events
- Downward: violates the specified restrictions on the virtual machine. Represents an “Error of Usage”.
- Upward: failure of a properly used mechanism, or reflection of an Undesired Event which was previously sent downward. Represents an “Error of Mechanism”.
  – Job abortion occurs as a last resort.
- A program should:
  – Recover, or
  – Adjust its external state and report the UE upwards.
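
The two directions can be illustrated with a two-level Java sketch. All names (`UsageError`, `MechanismFailure`, `BlockStore`, `Journal`) are hypothetical, and mapping usage errors to unchecked exceptions and mechanism failures to checked ones is a design assumption made here, not something the slides prescribe.

```java
// Hypothetical exception types; the names are not from the presentation.
class UsageError extends RuntimeException {
    UsageError(String message) { super(message); }
}

class MechanismFailure extends Exception {
    MechanismFailure(String message) { super(message); }
}

// Lower level: a fixed-size store that may be misused or may fail internally.
class BlockStore {
    private final String[] slots = new String[4];
    boolean deviceHealthy = true;   // toggled by the caller to simulate a fault

    void put(int index, String value) throws MechanismFailure {
        if (index < 0 || index >= slots.length) {
            // The caller violated the interface: reflect the UE back upward.
            throw new UsageError("index out of range: " + index);
        }
        if (!deviceHealthy) {
            // Correct usage, but the mechanism itself failed.
            throw new MechanismFailure("device not responding");
        }
        slots[index] = value;
    }
}

// Upper level: recovers when it can, otherwise reports in its own terms.
class Journal {
    private final BlockStore store;

    Journal(BlockStore store) { this.store = store; }

    String record(int page, String text) {
        try {
            store.put(page, text);
            return "stored";
        } catch (UsageError e) {
            return "rejected: " + e.getMessage();
        } catch (MechanismFailure e) {
            return "deferred: " + e.getMessage();
        }
    }
}
```

The upper level sees both kinds of UE travelling upward, but it can tell its own misuse apart from a failure of the mechanism below it.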

18 Continuation After UE “Handling”
The meta-structure previously described has four advantages:
- It doesn’t violate the principles of information hiding.
- The Uses definition remains valid.
- It allows evolution in a direction of increased reliability.
- Trivial trap routines generally simplify debugging as the system is integrated.
  – These routines may only print their own name, but they can also indicate which module is at fault.
  – This information in turn will designate who should study the problem.

19 Football Example: Responsibility Hierarchy
- Upper Level: Head Coach
  – Responsible overall.
- Middle Level: Other Coaches and Staff
  – Responsible for certain groups, preparing certain teams.
- Lower Level: Players
  – Responsible for own performance.

20 Modular Design
- Head Coach: he or she acts as an interface between the modules.
- Offense
  – O-Line
  – Backs
  – Receivers
- Defense
  – D-Line
  – Linebackers
  – Corners, Safeties
- Special Teams
  – Punters
  – Kickers
  – Rest

21 Systematic Approach = Game Plan
- Basic Operations
  – Offense:
    » Running, Throwing, Catching, Blocking
  – Defense:
    » Tackling, Batting, Cover, Pursuit
- Types of UEs
  – Injury, Performance, Penalties, Equipment, Time Management, Drastic Game Situations, etc.

22 Responsibility Levels: Event Types and Handlers
- Player Level (lower):
  – Minor injury = Tough it out
  – Poor performance = Try harder
  – Few penalties = Play smarter
- Staff Level (middle):
    » These errors may be detected at player level, but staff is responsible for taking the corrective action.
  – Severe injury = Substitute player
  – Continued poor performance = Switch formation
  – Equipment = Replacement

23 Head Coach Level
- Problems which Staff cannot solve
- Has information from each module and game situation information
- Time Management
  – Run out the clock
  – Stop the clock
    » Call time out
    » Spike the football
    » Run out of bounds
- Drastic Game Situations
  – Onside kick attempt
    » Special Teams coach can relay the costs of such an attempt.

24 Points applied to the Example
- Impossible Abstractions
  – Quarterbacks can run (injury and complexity risk)
  – Trick plays
- Interfaces contain operations to communicate UEs
  – Head Coach can call plays through a microphone directly to the quarterback.
  – Staff has complete field snapshots where they can detect important information and bring it to the head coach.
- Information Hiding Principles still Valid
  – Coaches are concerned only with the team they manage.
  – Players concentrate on their own position.

25 Common error indications
- A list of general conditions where a UE could occur.
- Aids in constructing a list which specifies the limitations of the program and the UEs which are bound to occur in case of a violation.
- Aimed at improving one’s anticipation of the types of UEs.
- It is not a comprehensive list of UEs.

26 Common error indications (contd)
- Limitations on the values of parameters
  Examples:
  1. Entering the value of speed on a stationary bike.
  2. Entering your address in a web form.
- Capacity limitations
  Examples:
  1. Exceeding the maximum weight an elevator can carry.
  2. Uploading attachments to your e-mail.

27 Common error indications (contd)
- Requests for undefined information
  Example: Trying to open a file which doesn’t exist.
- Restrictions on the order of operations
  Examples:
  1. A banking module which provides functions such as inserting, deleting, and displaying a customer account (an account presumably has to be inserted before it can be displayed or deleted).
  2. Trying to access a file before opening it.

28 Common error indications (contd)
- Detection of actions which are likely to be unintended
  Examples:
  1. A door in a car is not locked properly.
  2. The door of an elevator is not locked properly (meaning elevators with manual doors).
  3. Trying to open a file which is already open.
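
The categories of common error indications can be seen side by side in one small module. The sketch below loosely follows the banking and file examples; `AccountFile`, its methods, and the limit of two records are hypothetical choices made purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical account module; every name and limit here is illustrative.
class AccountFile {
    private static final int MAX_RECORDS = 2;
    private final Map<String, Integer> balances = new HashMap<>();
    private boolean open = false;

    void open() {
        // Action likely to be unintended: opening a file that is already open.
        if (open) throw new IllegalStateException("already open");
        open = true;
    }

    void insert(String id, int balance) {
        // Restriction on the order of operations: open before use.
        if (!open) throw new IllegalStateException("not open");
        // Limitation on the value of a parameter.
        if (balance < 0) throw new IllegalArgumentException("negative balance");
        // Capacity limitation.
        if (balances.size() >= MAX_RECORDS && !balances.containsKey(id)) {
            throw new IllegalStateException("file full");
        }
        balances.put(id, balance);
    }

    int display(String id) {
        if (!open) throw new IllegalStateException("not open");
        Integer balance = balances.get(id);
        // Request for undefined information.
        if (balance == null) throw new NoSuchElementException("unknown account: " + id);
        return balance;
    }
}
```

Each check corresponds to one entry in the list, which makes the module's limitations easy to read off its interface.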

29 Suggestions on building a proper UE handling mechanism
- Sufficiency
- Priority of traps
  – A single erroneous call may violate several of the applicability conditions.
  – Trapping to several UE routines is not efficient.
  – Traps should be prioritized.
  Example: Entering a credit card number when making a purchase on the web.
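
Prioritized traps for the credit card example could be sketched as below. The violation names, their order, and the 16-digit length are assumptions chosen for the example, not rules taken from the presentation or from any card scheme.

```java
import java.util.Optional;

// Hypothetical violations, declared from highest to lowest priority.
enum CardViolation { EMPTY, NON_DIGIT, WRONG_LENGTH }

class CardCheck {
    // Returns only the highest-priority violation, so one bad call raises one trap.
    static Optional<CardViolation> firstViolation(String number) {
        if (number == null || number.isEmpty()) {
            return Optional.of(CardViolation.EMPTY);
        }
        if (!number.chars().allMatch(Character::isDigit)) {
            return Optional.of(CardViolation.NON_DIGIT);
        }
        if (number.length() != 16) {   // assumed length, for illustration only
            return Optional.of(CardViolation.WRONG_LENGTH);
        }
        return Optional.empty();
    }
}
```

An input such as "12ab" violates two conditions at once, yet only `NON_DIGIT` is reported because it ranks higher than `WRONG_LENGTH`.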

30 Suggestions (contd)
- Size of the “trap vector”
- Influence of the state of a function on the occurrence of a trap
  Example: A doctor who diagnoses a patient before providing any treatment.

31 Suggestions (contd)
- Providing accurate information about the UE to the user
  – This is difficult because accurate information about the UE lies in design details that are hidden from the user.
- Two extreme approaches
  1. Use of a single trap to report failure.
     Disadvantage: it is very hard for the user to diagnose the failure.

32 Suggestions (contd)
  2. Fully detailed, where a predicate is associated with each function.
     – The predicate is set to true if the associated function is affected by the failure.
     – A master predicate is set to true in case of a catastrophic failure.
     Disadvantages:
     – Would return true or false for each function call.
     – Highly redundant.

33 Suggestions (contd)
- An optimized approach
  – Failure trap routines pass a parameter which classifies the type of error.
  Example: errno and strerror(errno) in the C language.
- Redundancy and efficiency
  – The fully detailed extreme provides a highly insulated module.
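
The optimized approach, a single failure trap whose parameter classifies the error, might be sketched in Java as follows. `FailureKind`, `FailureTrap`, and `Spooler` are hypothetical names; `describe()` plays a role loosely similar to `strerror` in C.

```java
// Hypothetical error classes, in the spirit of errno values.
enum FailureKind {
    NOT_FOUND("no such resource"),
    NO_SPACE("no space left"),
    IO_ERROR("input/output error");

    private final String description;

    FailureKind(String description) { this.description = description; }

    String describe() { return description; }
}

// One failure trap for the module; the parameter says what went wrong.
interface FailureTrap {
    void failed(FailureKind kind, String operation);
}

// Hypothetical module with a fixed number of free slots.
class Spooler {
    private final FailureTrap trap;
    private int free;

    Spooler(int free, FailureTrap trap) {
        this.free = free;
        this.trap = trap;
    }

    boolean enqueue(String job) {
        if (free == 0) {
            trap.failed(FailureKind.NO_SPACE, "enqueue");
            return false;
        }
        free--;
        return true;
    }
}
```

This sits between the two extremes: the user gets more than a bare "something failed", without a separate predicate per function.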

34 Suggestions (contd)
- Redundancy of checks has to be eliminated when UEs are rare.
- Retaining the upper level checks
  – Can detect UEs before any irreversible change
- Retaining the lower level checks
  – Usually preferred except when it is not difficult to back up

35 Incidents vs Crashes
- Incidents are events that, although undesired, were expected and for which recovery attempts were successful.
- All other errors are CRASHES!
- This distinction is required to allow several degrees of undesired events.
- Recovery is considered successful if each degree satisfies a set of predicates.
- If the requirements of degree “i” can’t be met, the system attempts to satisfy degree “i+1”.
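
The rule "if degree i can't be met, attempt degree i+1" amounts to walking down a list of recovery attempts. A minimal Java sketch follows, assuming each degree can be modeled as a `BooleanSupplier` that reports whether its predicates were satisfied; `RecoveryLadder` is a hypothetical name.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

class RecoveryLadder {
    // Tries each degree in order and returns the 1-based degree that succeeded
    // (an incident), or 0 when every attempt failed (a crash).
    static int recover(List<BooleanSupplier> degrees) {
        for (int i = 0; i < degrees.size(); i++) {
            if (degrees.get(i).getAsBoolean()) {
                return i + 1;
            }
        }
        return 0;
    }
}
```

Ordering the list from cheapest to most drastic attempt means the system settles for a less desirable outcome only after the better ones have failed.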

36 Example
An error playing a CD:
  1. Check whether the case is properly closed.
  2. Check whether the power cord is properly connected.
  3. Look for an internal problem which can be repaired.
  4. Look for an internal part which needs replacement.
  5. Serious damage which can’t be repaired or replaced.

37 Degrees of UE
Allows a programmer to define:
- What he expects his program to do.
- What he wants to treat as an incident and how he is prepared to handle it.
- What he means by correct UE handling.

38 Factors which determine the degree of a UE
- Basic Cause
  – Find the cause by trying recovery actions. Start with the simplest or cheapest, and when it fails try the next one.
- Situation
  – The degree of an undesired event depends on the situation at the time the UE occurred.

39 Order of Degrees
The ordering of degrees can be determined by two criteria:
- Order of Aims
- Order of Actions

40 Order of Aims
- The situation achieved by degree “i” is less desirable than the aims of lower degrees.
- “Less desirable” depends on the goal and purpose of the user, and may differ from user to user.

41 Order of Actions
- The order of degrees may differ even if all degrees lead to the same situation, because they use different methods and have different costs.
- The decision as to which degree should be tried must be left to the user.
- Recovery from a UE requires the cooperation of both levels.

42 Solutions
- Provide different versions of the system (the difference lies in their preparation for and recovery from UEs).
- Provide recovery actions as operations of the abstract machine.

43 Dependable, Feel-Good Software
- Systematic approach throughout the system.
- Abstract interfaces not excessively restrictive.
- Pass failures upward, reflect downward-traveling UEs.
- UE consideration requires half (or more) of the programmer’s effort.
- The TRAP function should be a separate module containing the details of inter-level communication.
  – This communication is hidden from each level.
- Information about UEs is defined in the level’s abstract terms.
- The Uses hierarchy is maintained.
- Costs are low as long as no UE occurs.

