Object-Oriented and Classical Software Engineering Eighth Edition, WCB/McGraw-Hill, 2011 Stephen R. Schach
CHAPTER 6 TESTING
Overview Quality issues Non-execution-based testing What should be tested? Testing versus correctness proofs Who should perform execution-based testing? When testing stops
Revise Terminology: When humans make a MISTAKE, a FAULT is injected into the system. FAILURE is the observed incorrect behavior of a product as a consequence of a FAULT. ERROR is the amount by which a result is incorrect. DEFECT is a generic term for fault. QUALITY is the extent to which a product satisfies its specifications.
ERROR (IEEE Glossary of terms 610.12, 1990) (1) The difference between a computed, observed, or measured value or condition and the true, specified, or theoretically correct value or condition. For example, a difference of 30 meters between a computed result and the correct result. (2) An incorrect step, process, or data definition. For example, an incorrect instruction in a computer program.
FAULT, MISTAKE, FAILURE (IEEE Glossary of terms 610.12, 1990) FAULT: A defect in a hardware device or component; for example, a short circuit or broken wire. MISTAKE: A human action that produces an incorrect result. FAILURE: The inability of a system or component to perform its required functions within specified performance requirements.
IEEE Glossary of terms 610.12, 1990 The fault tolerance discipline distinguishes between a human action (a mistake), its manifestation (a hardware or software fault), the result of the fault (a failure), and the amount by which the result is incorrect (the error).
Remember the phases of Software Engineering
Phases of Classical Software Engineering Requirements phase Explore the concept Elicit the client’s requirements Analysis (specification) phase Analyze the client’s requirements Draw up the specification document Draw up the software project management plan “What the product is supposed to do” Design phase Architectural design, followed by Detailed design “How the product does it” Implementation phase Coding Unit testing Integration Acceptance testing Postdelivery maintenance Corrective maintenance Perfective maintenance Adaptive maintenance Retirement
Testing There are two basic types of testing Execution-based testing Non-execution-based testing
V & V: Revise Terminology Verification: Process of determining whether a workflow has been correctly carried out. Takes place at the end of each workflow. Validation: Intensive evaluation that determines if the product as a whole satisfies its requirements. Takes place just before the product is to be delivered to the client.
Testing (contd) Warning The term “verify” is also used for all non-execution-based testing
6.1 Software Quality Not “excellence” The extent to which software satisfies its specifications Every software professional is responsible for ensuring that his or her work is correct Quality must be built in from the beginning
6.1.1 Software Quality Assurance The members of the SQA group must ensure that the developers are doing high-quality work At the end of each workflow When the product is complete In addition, quality assurance must be applied to The process itself Example: Standards
Responsibilities of SQA Group Development of standards to which software must conform Establishment of monitoring procedures for ensuring compliance with standards Ensure the quality of the software process Ensure the quality of the product
6.1.2 Managerial Independence There must be managerial independence between The development team The SQA group Neither group should have power over the other Problem: SQA team finds serious defects as the deadline approaches! What to do? Who decides?
Managerial Independence (contd) More senior management must decide whether to Deliver the product on time but with faults, or Test further and deliver the product late The decision must take into account the interests of the client and the development organization
6.2 Non-Execution-Based Testing Testing software without running test cases Reviewing software, carefully reading through it Analyzing software mathematically Underlying principles We should not review our own work Other people should do the review. We cannot see our own mistakes. Group synergy Review using a team of software professionals with a broad range of skills
6.2.1 Walkthroughs A walkthrough team consists of four to six members It includes representatives of The team responsible for the current workflow The team responsible for the next workflow The SQA group The client representative Members should be experienced senior technical staff The walkthrough is preceded by preparation Lists of items Items not understood Items that appear to be incorrect
6.2.2 Managing Walkthroughs The walkthrough team is chaired by the SQA representative In a walkthrough we detect faults, not correct them A correction produced by a committee is likely to be of low quality The cost of a committee correction is too high Not all items flagged are actually incorrect A walkthrough should not last longer than 2 hours There is no time to correct faults as well
Managing Walkthroughs (contd) A walkthrough can be Participant-driven: reviewers present their lists Document-driven: the person/team responsible for the document walks through it A walkthrough must be document-driven, rather than participant-driven Walkthrough leader elicits questions and facilitates discussion Verbalization leads to fault finding A walkthrough should never be used for performance appraisal
6.2.3 Inspections An inspection has five formal steps (1) Overview: Given by a person responsible for producing the document. Document distributed at the end of the session. (2) Preparation: Understand the document in detail. Use statistics of fault types. (3) Inspection: One participant walks through the document. Every item must be covered. Every branch must be taken at least once. Fault finding begins. Within 1 day the moderator (team leader) produces a written report. (4) Rework: The person responsible resolves all faults and problems in the written document. (5) Follow-up: The moderator ensures that all issues mentioned in the report are resolved. List faults that were fixed. Clarify incorrectly flagged items.
Inspections (contd) An inspection team has four members Moderator A member of the team performing the current workflow A member of the team performing the next workflow A member of the SQA group Special roles are played by the Reader Recorder
Fault Statistics Faults are recorded by severity Example: Major (a fault that causes termination of the program) or minor Faults are recorded by fault type Examples of design faults: Not all specification items have been addressed Actual and formal arguments do not correspond In general, interface and logic errors
Fault Statistics (contd) For a given workflow, we compare current fault rates with those of previous products Early warning! If a disproportionate number of a certain fault type is discovered in 2 or 3 code artifacts Check other artifacts for the same fault type We take action if there are a disproportionate number of faults in an artifact Redesigning from scratch is a good alternative We carry forward fault statistics to the next workflow We may not detect all faults of a particular type in the current inspection
Statistics on Inspections IBM inspections showed up 82% of all detected faults (1976) 70% of all detected faults (1978) 93% of all detected faults (1986) Switching system 90% decrease in the cost of detecting faults (1986) JPL Four major faults, 14 minor faults per 2 hours (1990) Savings of $25,000 per inspection The number of faults decreased exponentially by phase (1992)
Statistics on Inspections (contd) Warning Fault statistics should never be used for performance appraisal “Killing the goose that lays the golden eggs” Another problem:
6.2.4 Comparison of Inspections and Walkthroughs Walkthrough: Two-step, informal process (Preparation, Analysis) Inspection: Five-step, formal process (Overview, Preparation, Inspection, Rework, Follow-up)
6.2.5 Strengths and Weaknesses of Reviews Reviews can be effective Faults are detected early in the process Reviews are less effective if the process is inadequate Large-scale software should consist of smaller, largely independent pieces The documentation of the previous workflows has to be complete and available online
6.2.6 Metrics for Inspections Inspection rate (e.g., design pages inspected per hour) Fault density (e.g., faults per KLOC inspected) Fault detection rate (e.g., faults detected per hour) Fault detection efficiency (e.g., number of major, minor faults detected per hour)
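The four metrics above are simple ratios, so a small worked example may help. The sketch below is illustrative only: the record type, the function name, and the page and KLOC figures are assumptions; only the fault counts (4 major and 14 minor faults in 2 hours) are taken from the JPL figures quoted earlier.

```python
from dataclasses import dataclass


@dataclass
class InspectionRecord:
    """Raw data from one inspection (hypothetical record layout)."""
    pages_inspected: int
    kloc_inspected: float
    hours: float
    major_faults: int
    minor_faults: int


def inspection_metrics(rec):
    """Compute the inspection metrics listed on the slide as plain ratios."""
    total_faults = rec.major_faults + rec.minor_faults
    return {
        "inspection_rate_pages_per_hour": rec.pages_inspected / rec.hours,
        "fault_density_per_kloc": total_faults / rec.kloc_inspected,
        "fault_detection_rate_per_hour": total_faults / rec.hours,
        # Fault detection efficiency, split by severity.
        "major_faults_per_hour": rec.major_faults / rec.hours,
        "minor_faults_per_hour": rec.minor_faults / rec.hours,
    }


# 20 pages and 1.5 KLOC are made-up values for illustration.
record = InspectionRecord(pages_inspected=20, kloc_inspected=1.5,
                          hours=2.0, major_faults=4, minor_faults=14)
metrics = inspection_metrics(record)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

None of these numbers is meaningful on its own; they become useful only when compared with the same metrics from earlier inspections.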
Metrics for Inspections (contd) Does a 50% increase in the fault detection rate mean that Quality has decreased? Or The inspection process is more efficient?
6.3 Execution-Based Testing Organizations spend up to 50% of their software budget on testing But delivered software is frequently unreliable Dijkstra (1972) “Program testing can be a very effective way to show the presence of bugs, but it is hopelessly inadequate for showing their absence”
6.4 What Should Be Tested? Definition of execution-based testing “The process of inferring certain behavioral properties of the product based, in part, on the results of executing the product in a known environment with selected inputs” This definition has troubling implications
6.4 What Should Be Tested? (contd) “Inference” We have a fault report, the source code, and, often, nothing else Input data => Desirable output “Known environment” We never can really know our environment Is the problem due to memory? OS? “Selected inputs” Sometimes we cannot provide the inputs we want Simulation is needed Ex: avionics software: how well can you simulate the physical behavior outside the aircraft?
6.4 What Should Be Tested? (contd) We need to test correctness (of course), and also Utility Reliability Robustness, and Performance
6.4.1 Utility The extent to which the product meets the user’s needs Examples: Ease of use Useful functions Cost effectiveness Compare with competitors
6.4.2 Reliability A measure of the frequency and criticality of failure Mean time between failures Mean time to repair Time (and cost) to repair the results of a failure Recover gracefully and quickly
6.4.3 Robustness A function of The range of operating conditions The possibility of unacceptable results with valid input The effect of invalid input A robust product should NOT crash when it is NOT used under valid conditions. Ex: enter ?@#$?@#$ as student id!
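The student id example can be turned into a small robustness check. This is a minimal sketch under stated assumptions: the function name parse_student_id and the rule that an id is exactly 8 ASCII digits are hypothetical and do not come from the slides. The point is only that invalid input is rejected cleanly instead of crashing the program.

```python
ID_LENGTH = 8  # assumed id format: exactly 8 ASCII digits (hypothetical)


def parse_student_id(raw):
    """Return the id as an int, or None when the input is not a valid id.

    The function never raises for bad input; the caller decides how to
    report the problem to the user.
    """
    if not isinstance(raw, str):
        return None
    text = raw.strip()
    if len(text) != ID_LENGTH or not text.isascii() or not text.isdigit():
        return None
    return int(text)


# Invalid inputs are rejected, valid input is accepted.
for candidate in ["?@#$?@#$", "", "1234", "20231234"]:
    print(repr(candidate), "->", parse_student_id(candidate))
```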
6.4.4 Performance The extent to which space and time constraints are met Response time, main memory requirements Real-time software is characterized by hard real-time constraints Ex: nuclear reactor control system If data are lost because the system is too slow, there is no way to recover those data
6.4.5 Correctness A product is correct if it satisfies its output specifications, independent of its use of computer resources and when operated under permitted conditions
Correctness of specifications Incorrect specification for a sort: Figure 6.1 Function trickSort, which satisfies this specification: Figure 6.2
Correctness of specifications (contd) Incorrect specification for a sort: Figure 6.1 (again) Corrected specification for the sort: Figure 6.3
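Figures 6.1 to 6.3 are not reproduced in this text, so the following is a sketch under assumptions rather than the book's code. It assumes the incorrect specification requires only that the output array be in nondecreasing order, and that the corrected specification additionally requires the output to be a permutation of the input. The function names are hypothetical.

```python
from collections import Counter


def trick_sort(p):
    """A 'sort' that ignores its input: every output element is 0."""
    return [0] * len(p)


def satisfies_incorrect_spec(p, q):
    """Assumed incorrect spec: q has the same length as p and is nondecreasing."""
    return len(q) == len(p) and all(q[i] <= q[i + 1] for i in range(len(q) - 1))


def satisfies_corrected_spec(p, q):
    """Assumed corrected spec: nondecreasing AND a permutation of the input."""
    return satisfies_incorrect_spec(p, q) and Counter(p) == Counter(q)


p = [3, 1, 2]
q = trick_sort(p)
print("trick_sort meets incorrect spec:", satisfies_incorrect_spec(p, q))
print("trick_sort meets corrected spec:", satisfies_corrected_spec(p, q))
print("real sort meets corrected spec:", satisfies_corrected_spec(p, sorted(p)))
```

A product can therefore satisfy its specification and still be useless when the specification itself is wrong.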
Correctness (contd) Technically, correctness is Not necessary Example: C++ compiler Not sufficient Example: trickSort
6.5 Testing versus Correctness Proofs A correctness proof is an alternative to execution-based testing A correctness proof is a mathematical technique for showing that a product is correct Correct = satisfies its specification
6.5.1 Example of a Correctness Proof The code segment to be proven correct Figure 6.4 Correct: After the code is executed the variable s will contain the sum of the n elements of the array y.
Example of a Correctness Proof (contd) A flowchart equivalent of the code segment Figure 6.5
Example of a Correctness Proof (contd) Input specification Output specification Loop invariant Assertions Add an assertion before and after each statement.
Example of a Correctness Proof (contd) Figure 6.6
Example of a Correctness Proof (contd) An informal proof (using induction) appears in Section 6.5.1
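The idea of placing assertions around the statements can be sketched in Python. Figure 6.4 is not reproduced in this text, so the loop below is an assumed rendering of a segment that leaves the sum of the n elements of array y in s. The input specification (n >= 1) and the loop invariant are likewise assumptions chosen to match the stated output specification. Runtime assertions only check the runs that are actually executed; they are not a proof.

```python
def sum_array(y):
    """Sum the elements of y, checking assumed assertions at run time."""
    n = len(y)
    # Input specification (assumed): the array has at least one element.
    assert n >= 1

    k = 0
    s = 0
    while k < n:
        # Loop invariant (assumed): s holds the sum of the first k elements.
        assert k <= n and s == sum(y[:k])
        s = s + y[k]
        k = k + 1

    # The invariant still holds on exit, and the loop condition is false.
    assert k <= n and s == sum(y[:k])
    # Output specification: s contains the sum of the n elements of y.
    assert k == n and s == sum(y)
    return s


print(sum_array([1, 2, 3]))
```

A proof would show by induction on k that the invariant holds for every possible input, which is what the assertions here can only spot-check.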
6.5.2 Correctness Proof Mini Case Study Dijkstra (1972): “The programmer should let the program proof and program grow hand in hand” “Naur text-processing problem” (1969)
Naur Text-Processing Problem Given a text consisting of words separated by a blank or by newline characters, convert it to line-by-line form in accordance with the following rules: Line breaks must be made only where the given text contains a blank or newline Each line is filled as far as possible, as long as No line will contain more than maxpos characters
Episode 1 Naur constructed a 25-line procedure He informally proved its correctness
Episode 2 1970 — Reviewer in Computing Reviews The first word of the first line is preceded by a blank unless the first word is exactly maxpos characters long
Episode 3 1971 — London finds 3 more faults Including: The procedure does not terminate unless a word longer than maxpos characters is encountered
Episode 4 1975 — Goodenough and Gerhart find three further faults Including: The last word will not be output unless it is followed by a blank or newline
Correctness Proof Mini Case Study (contd) Lesson: Even if a product has been proven correct, it must still be tested
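Each of the published faults can be phrased as an executable test. The sketch below is not Naur's procedure; format_text is a hypothetical implementation written for illustration. Raising ValueError for a word longer than maxpos is an assumption, since the problem statement does not say what should happen in that case.

```python
import re


def format_text(text, maxpos):
    """Break text into lines of at most maxpos characters.

    Words are separated by blanks or newlines; each line is filled as far
    as possible. Returns the list of output lines.
    """
    words = [w for w in re.split(r"[ \n]+", text) if w]
    lines = []
    current = ""
    for word in words:
        if len(word) > maxpos:
            # Assumed behavior: the rules cannot be met, so report it.
            raise ValueError(f"word longer than maxpos: {word!r}")
        if not current:
            current = word
        elif len(current) + 1 + len(word) <= maxpos:
            current = current + " " + word
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)  # the last word is output even with no trailing blank
    return lines


# Tests aimed at the faults reported in Episodes 2 to 4.
assert not format_text("hello world", 20)[0].startswith(" ")  # no leading blank
assert format_text("a bb ccc", 4) == ["a bb", "ccc"]          # terminates, last word kept
assert all(len(line) <= 5 for line in format_text("aa bb cc dd ee", 5))
print(format_text("one\ntwo  three", 7))
```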
6.5.3 Correctness Proofs and Software Engineering Three myths of correctness proving (see over)
Three Myths of Correctness Proving Software engineers do not have enough mathematics for proofs Most computer science majors either know or can learn the mathematics needed for proofs Proving is too expensive to be practical Economic viability is determined from cost–benefit analysis Proving is too hard Many nontrivial products have been successfully proven Tools like theorem provers can assist us
Difficulties with Correctness Proving Can we trust a theorem prover ? Figure 6.7
Difficulties with Correctness Proving (contd) How do we find input–output specifications, loop invariants? What if the specifications are wrong? We can never be sure that specifications or a verification system are correct
Correctness Proofs and Software Engineering (contd) Correctness proofs are a vital software engineering tool, where appropriate: When human lives are at stake When indicated by cost–benefit analysis When the risk of not proving is too great Also, informal proofs can improve software quality Use the assert statement Model checking is a new technology that may eventually take the place of correctness proving (Section 18.11)
6.6 Who Should Perform Execution-Based Testing? Programming is constructive Testing is destructive A successful test finds a fault So, programmers should not test their own code artifacts
Who Should Perform Execution-Based Testing? (contd) Solution: The programmer does informal testing The SQA group then does systematic testing The programmer debugs the module All test cases must be Planned beforehand, including the expected output, and Retained afterwards
6.7 When Testing Stops Only when the product has been irrevocably discarded
Software Testing - objective Execute a program to find errors A good test case has a high probability of finding errors A successful test finds a new error [Figure: testing process diagram with the labels Software specs., Test specs., Testing, Results, Check, Test Reports]
There are two main types of Software Testing Black Box White Box
Black Box Black box testing . . . You know the functionality Given that you know what it is supposed to do, you design tests that make it do what you think that it should do From the outside, you are testing its functionality against the specs/requirements For software this is testing the interface What is input to the system? What you can do from the outside to change the system? What is output from the system? Tests the functionality of the system by observing its external behavior No knowledge of how it goes about meeting the goals
White Box White box testing . . . You know the code Given knowledge of the internal workings, you thoroughly test what is happening on the inside Close examination of procedural level of detail Logical paths through code are tested Conditionals Loops Branches Status is examined in terms of expected values Impossible to thoroughly exercise all paths Exhaustive testing grows without bound Can be practical if a limited number of “important” paths are evaluated Can be practical to examine and test important data structures
When & What to Test? The V-model is a variation of the waterfall model that makes explicit the dependency between development activities and verification activities. The difference between the waterfall model and the V-model is that the latter makes explicit the notion of level of abstraction. All activities from requirements to implementation focus on building more and more detailed representations of the system, whereas all activities from implementation to operation focus on validating the system. [Figure: V-model. Axes: Level of Detail (low to high) and Project Time. Left side, top to bottom: Requirements Specifications, Analysis, Design, Object Design. Right side, bottom to top: Unit Testing, Integration Testing, System Testing, Acceptance Testing.]
Types of Testing Unit Testing: Done by programmer(s). Generally all white box. Integration Testing: Done by programmers as they integrate their code into the code base. Generally white box, maybe some black box. Functional/System Testing: It is recommended that this be done by an external test group. Mostly black box so that testing is not ‘corrupted’ by too much knowledge. Acceptance Testing: Generally done by customer/customer representative in their environment through the GUI. Definitely black box.
The Testing Process
Planning a Black Box Test Case Look at requirements/problem statement to generate. Said another way: test cases should be traceable to requirements. The “Test Case Grid” Contains: ID of test case Describe test input conditions Expected/Predicted results Actual Results
Test Case Grid For your analysis reports, please use the following format:
Id | Input | Expected Result | Actual Result
Black Box Test Planning The inputs must be very specific The expected results must be very specific You must write the test case so anyone on the team can run the exact test case and get the exact same result/sequence of events Example: “Passing grade?” Input field: Correct input: Grade = 90; Grade =20 Incorrect input: “a passing grade” “a failing grade”
Example Test Case Grid Your test case grid (last section of your analysis document) should identify at least 15 test cases. Example: “Passing grade?”
Id | Input | Expected Result | Actual Result
1 | Grade < 70% | Fail the class with less than a C | (Leave blank until tested)
2 | Grade > 70% | Pass the class with at least a C |
Bad Test Case Example
Id | Input | Expected Result | Actual Result
1 | A failing grade | Fail the class with less than a C | (Leave blank until tested)
2 | A passing grade | Pass the class with at least a C |
Problem: The “input” value is too vague. What is a failing and passing grade?
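The “Passing grade?” grid can be written as data and run automatically, which forces every input to be a specific value. This is a sketch under assumptions: grade_result is a hypothetical function under test, and the grid in the slides lists only "< 70%" and "> 70%", so treating a grade of exactly 70 as a pass (reading "at least a C" as including the boundary) is an assumption.

```python
PASS_MARK = 70  # assumed boundary: 70 itself counts as a pass


def grade_result(grade):
    """Hypothetical function under test."""
    return "Pass" if grade >= PASS_MARK else "Fail"


# Each row: (id, specific input value, expected result).
TEST_CASES = [
    (1, 20, "Fail"),
    (2, 69, "Fail"),   # just below the boundary
    (3, 70, "Pass"),   # the boundary itself (assumed to pass)
    (4, 90, "Pass"),
]


def run_grid(cases):
    """Fill in the Actual Result column for every test case."""
    return [(case_id, grade, expected, grade_result(grade))
            for case_id, grade, expected in cases]


for case_id, grade, expected, actual in run_grid(TEST_CASES):
    print(f"{case_id} | Grade = {grade} | {expected} | {actual}")
```

Because each input is a concrete number, anyone on the team can rerun the grid and get the same result, which a row such as "a failing grade" cannot guarantee.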
Failure Test Cases What if the input type is wrong (You’re expecting an integer, they input a float. You’re expecting a character, you get an integer.)? What if the customer takes an illogical path through your functionality? What if mandatory fields are not entered? What if the program is aborted abruptly or input or output devices are unplugged?
Using a Flow Chart Mapping functionality in a flow chart makes the test case generation process much easier. [Figure: flow chart with the decisions "if x >= 0" and "if x <= 100" and the statements "check = true" and "check = false"; a key distinguishes decisions from statements]
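One way to read the flow chart is that check becomes true only when both decisions succeed. The sketch below assumes that reading (the figure itself is not reproduced here) and uses a hypothetical function name. Each path through the chart, plus each boundary value, then becomes one test case.

```python
def check_range(x):
    """Assumed reading of the flow chart: check is true only for 0 <= x <= 100."""
    if x >= 0:
        if x <= 100:
            check = True
        else:
            check = False
    else:
        check = False
    return check


# One test case per path through the chart, plus the two boundaries.
PATH_TESTS = [
    (-1, False),   # first decision fails
    (0, True),     # lower boundary
    (50, True),    # both decisions succeed
    (100, True),   # upper boundary
    (101, False),  # second decision fails
]

for value, expected in PATH_TESTS:
    print(value, expected, check_range(value))
```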
One input leads to One output A piece of code with inputs a, b, and c. It produces the outputs x, y, and z.
One-to-One Testing Each input only has one valid expected result. To check for a valid ATM Card the following is NOT correct.
Id | Input | Expected Result | Actual Result
1 | Read ATM Card | If card is valid, accept card and ask for pin. If card is invalid, “No ATM Card” exception is thrown and card is returned to the user. |
The correct way… Test for ATM card Input: Read ATM Card Expected result: Accept card and ask for pin # Input: Read invalid card Expected result: “No ATM Card” exception is thrown and card is returned to the user.
Id | Input | Expected Result | Actual Result
1 | Read ATM Card | Accept card and ask for pin # |
2 | Read Invalid Card | “No ATM Card” exception is thrown and card is returned to the user. |
Another test… Test for “get PIN” Input: 4 digit entry of a stolen card Expected result: “Stolen Card” exception is thrown
Id | Input | Expected Result | Actual Result
1 | Read ATM Card | Accept card and ask for pin # |
2 | Read Invalid Card | “No ATM Card” exception is thrown and card is returned to the user. |
3 | Invalid PIN Entered | “Stolen Card” exception is thrown and card is destroyed. |
What’s Next? After tests are performed, the results are recorded.
Id | Input | Expected Result | Actual Result
1 | Read ATM Card | Accept card and ask for pin # | Accepted the card and asked for a pin.
2 | Read Invalid Card | “No ATM Card” exception is thrown and card is returned to the user. | Accepted the card and asked for a pin.
3 | Invalid PIN Entered | “Stolen Card” exception is thrown and card is destroyed. | Accepted the card and asked for a pin.
What’s Next? After results are recorded, the testing report is created.
Id | Input | Expected Result | Actual Result | Status
1 | Read ATM Card | Accept card and ask for pin # | Accepted the card and asked for a pin. | Passed
2 | Read Invalid Card | “No ATM Card” exception is thrown and card is returned to the user. | Accepted the card and asked for a pin. | Failed
3 | Invalid PIN Entered | “Stolen Card” exception is thrown and card is destroyed. | Accepted the card and asked for a pin. | Failed
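The Status column is just a comparison of the expected and actual results, so the report can be generated mechanically. In the sketch below everything is hypothetical: faulty_atm stands in for a system under test that accepts every card, and the outcome constants are invented labels for the behaviors named in the grid.

```python
# Hypothetical outcome labels for the behaviors named in the test grid.
ACCEPT = "accept card and ask for PIN"
NO_CARD = "'No ATM Card' exception, card returned"
STOLEN = "'Stolen Card' exception, card destroyed"


def faulty_atm(test_input):
    """Hypothetical faulty system under test: it accepts every input."""
    return ACCEPT


# One expected result per input (one-to-one testing).
TEST_GRID = [
    (1, "Read ATM Card", ACCEPT),
    (2, "Read Invalid Card", NO_CARD),
    (3, "Invalid PIN Entered", STOLEN),
]


def build_report(system, grid):
    """Run every test case and derive the Status column."""
    report = []
    for case_id, test_input, expected in grid:
        actual = system(test_input)
        status = "Passed" if actual == expected else "Failed"
        report.append((case_id, test_input, expected, actual, status))
    return report


report = build_report(faulty_atm, TEST_GRID)
for row in report:
    print(" | ".join(str(field) for field in row))
```

A status can be derived this way only because each test case has exactly one expected result; the combined "if valid ... if invalid ..." row shown earlier could not be checked mechanically.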
Verification & Validation Testing is performed during the system implementation stage and the results are delivered in the Final Report. The Test Report provides validation and verification for the software program. Verification: "Are we building the product right?" The software should conform to its specification Validation: "Are we building the right product?" The software should do what the user really requires
Project Work Your Analysis Document is due soon! Next: Begin discussion of the Design Specification Requirements Review and prepare for the midterm exam Midterm Exam, coming soon!