Software Quality: Testing and Verification I
Software flaws are identified at three levels:
1. A failure is an unacceptable behaviour exhibited by a system.
— The frequency of failures measures software reliability: a low failure rate means high reliability.
— Failures result from the violation of a requirement.
2. A defect is a flaw that contributes to a failure.
— It might take several defects to cause one failure.
3. An error is a mistake or inappropriate decision by a developer that leads to a defect.
Eliminating Failures: Testing vs Verification

Testing = running the program with a set of inputs to gain confidence that the software has few defects.
— Goal: reduce the frequency of failures.
— When done: after the programming is complete.
— Methodology: develop test cases; run the program with each test case.

Verification = formally proving that the software has no defects.
— Goal: eliminate failures.
— When done: before and after the programming is complete.
— Methodology: write separate specifications for the code; prove that the code and the specifications are mathematically equivalent.
Effective and Efficient Testing

Effective testing uncovers as many defects as possible.
Efficient testing finds defects using the fewest possible tests.
Good testing is like detective work:
— The tester must try to understand how programmers and designers think, so as to better find defects.
— The tester must cover all the use case scenarios and options.
— The tester must be suspicious of everything.
— The tester must not take a lot of time.
The tester should not be the programmer.
Testing Methods

1. Black box: testers run the software with a collection of inputs and observe the outputs.
— None of the source code or design documentation is available.
2. Glass box (also called 'white box' or 'structural' testing): testers watch all the steps taken by the software during a run.
— Testers have access to the source code and documentation.
— Individual programmers often use glass-box testing to verify their own code.
Equivalence Classes

It is impossible to test a software product by brute force, using every possible input value. So a tester divides all the inputs into groups that will be treated similarly by the software.
— These groups are called equivalence classes.
— A representative from each group is called a test case.
— The assumption is that if the software has no defects for the test case, then it will have no defects for the entire equivalence class.
This approach is practical, but also flawed: it will not find all defects.
Examples of Equivalence Classes

1. Valid input is a month number (1-12). The equivalence classes could be [-∞..0], [1..12], and [13..∞].
— E.g., the three test cases could be -1, 5, and 45 (see the sketch after this list).
2. Valid input is a course id, consisting of a department name (e.g., CSCI), a 3-digit number in the range 001-499 (e.g., 260), and an optional section (e.g., A, B, C, D, or E). The equivalence classes (test cases) could be:
— A valid course id from each of the 25 departments, each having a 3-digit number in the range 001-499.
— A valid course id with a section.
— A course id with an invalid department name.
— A course id with an invalid number.
— A course id with an invalid section.
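To make the month example concrete, here is a minimal Java sketch. The isValidMonth validator is hypothetical, written only so the test cases have something to run against:

    public class MonthEquivalenceTest {
        // Hypothetical validator: accepts month numbers 1..12.
        static boolean isValidMonth(int month) {
            return month >= 1 && month <= 12;
        }

        public static void main(String[] args) {
            // One representative test case per equivalence class.
            check(!isValidMonth(-1), "[-inf..0] rejected");
            check( isValidMonth(5),  "[1..12] accepted");
            check(!isValidMonth(45), "[13..inf] rejected");
        }

        static void check(boolean ok, String label) {
            System.out.println((ok ? "PASS: " : "FAIL: ") + label);
        }
    }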
Fighting Combinatorial Explosion

Combinatorial explosion means that you cannot realistically use a test case from every combination of equivalence classes across the system.
— E.g., with just 5 inputs and 10 possible values each, the system has 10^5 = 100,000 combinations.
So:
— Make sure that at least one test case represents an equivalence class of every different input.
— Include test cases just inside the boundaries of the input values.
— Include test cases just outside the boundaries.
— Include a few random test cases.
A sketch applying these guidelines to the month example appears below.
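Applying the guidelines to the month example, a hedged sketch of one concrete input list (the particular values and the fixed random seed are just one reasonable choice):

    import java.util.Random;

    public class BoundaryInputs {
        public static void main(String[] args) {
            // Just inside the valid boundaries of [1..12]:
            int[] inside  = { 1, 12 };
            // Just outside the boundaries:
            int[] outside = { 0, 13 };
            // A few random test cases (fixed seed for repeatability):
            Random r = new Random(42);
            int[] random = { r.nextInt(100) - 50, r.nextInt(100) - 50 };

            for (int v : inside)  System.out.println("inside:  " + v);
            for (int v : outside) System.out.println("outside: " + v);
            for (int v : random)  System.out.println("random:  " + v);
        }
    }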
Common Programming Errors

1. Incorrect logical conditions on loops and conditionals
E.g., "The landing gear must be deployed whenever the plane is within 2 minutes from landing or takeoff, or within 2000 feet from the ground. If visibility is less than 1000 feet, then the landing gear must be deployed whenever the plane is within 3 minutes from landing or lower than 2500 feet."

    if (!landingGearDeployed
            && (Math.min(now - takeoffTime, estLandTime - now)
                    < (visibility < 1000 ? 180 : 120)     // 3 min vs 2 min, in seconds
                || relativeAltitude
                    < (visibility < 1000 ? 2500 : 2000))) // feet
    {
        throw new LandingGearException();
    }
2. Performing a calculation in the wrong part of a control construct
E.g., here the check of k belongs inside the loop; as written, only the final iteration's result is examined:

    while (j < maximum) {
        k = someOperation(j);
        j++;
    }
    if (k == -1) signalAnError();

3. Not terminating a loop or recursive method properly
E.g., here i is never incremented, so the loop never terminates:

    while (i < courses.size())
        if (id.equals(courses.getElement(i)))
            ... ;

4. Not enforcing the preconditions (correctly) in a use case
E.g., failing to check that a courseOffering is not full before adding a student to its class list.

Corrected versions of errors 2 and 3 are sketched below.
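For contrast, a minimal self-contained sketch of corrected versions of errors 2 and 3. The bodies of someOperation and signalAnError are invented for illustration, and the slide's hypothetical courses.getElement(i) is replaced by the standard List.get(i):

    import java.util.Arrays;
    import java.util.List;

    public class LoopFixes {
        // Hypothetical stand-ins for the slide's someOperation / signalAnError.
        static int someOperation(int j) { return (j == 3) ? -1 : j; }
        static void signalAnError()     { System.out.println("error signalled"); }

        public static void main(String[] args) {
            int maximum = 5;

            // Error 2 fixed: the k == -1 check moves inside the loop,
            // so every iteration's result is examined, not just the last.
            int j = 0;
            while (j < maximum) {
                int k = someOperation(j);
                if (k == -1) signalAnError();
                j++;
            }

            // Error 3 fixed: i is incremented, so the loop terminates.
            List<String> courses = Arrays.asList("CSCI260", "MATH101");
            String id = "MATH101";
            int i = 0;
            while (i < courses.size()) {
                if (id.equals(courses.get(i)))
                    System.out.println("found at index " + i);
                i++;
            }
        }
    }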
5. Not handling null conditions (null references) properly
E.g., a Student with no schedule.
6. Not handling singleton conditions (one or zero of something that is normally more than one)
E.g., a schedule with 0 courses in it.
7. Off-by-one errors
E.g., the loop below starts at 1 and therefore skips the first array element:

    for (i = 1; i < arrayname.length; i++) { /* do something */ }

8. Operator precedence errors
E.g., x*y+z instead of x*(y+z).
9. Use of inappropriate standard algorithms
E.g., a non-stable sort when a stable sort is required.
Defects in Numerical Algorithms

1. Not enough bits or digits (magnitude/overflow).
2. Not enough decimal places (precision).
3. Ordering operations poorly, allowing errors to propagate.
4. Assuming exact equality between two floating point values.
— E.g., use abs(v1-v2) < epsilon instead of v1 == v2 (see the sketch below).
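A small runnable illustration of point 4. The epsilon value here is an assumption; a suitable tolerance depends on the scale of the values being compared:

    public class FloatEquality {
        public static void main(String[] args) {
            double v1 = 0.1 + 0.2;  // accumulates binary rounding error
            double v2 = 0.3;

            System.out.println(v1 == v2);                     // false: exact equality fails

            double epsilon = 1e-9;                            // assumed tolerance
            System.out.println(Math.abs(v1 - v2) < epsilon);  // true: tolerant comparison
        }
    }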
Defects in Timing and Co-ordination

Critical race
— One thread fails because another thread interferes with the 'normal' sequence of events.
— Critical races can be prevented by locking data so that it cannot be accessed by another thread simultaneously. In Java, synchronized can be used to lock an object until the method terminates.
— E.g., consider two students wanting to add the same courseOffering to their schedules at the same time. These two threads must be synchronized to prevent a critical race (see the sketch below).

Deadlock and livelock
— Deadlock is a situation where two or more threads are stopped, each waiting for the other to do something. The system hangs and the threads cannot do anything.
— Livelock is similar, except that the threads can still do some computation even though the system as a whole is hanging.
— E.g., consider a student wanting to access a course that another student is adding to her schedule, when the other student suspends this action and goes to lunch. How can this kind of deadlock be prevented in StressFree?
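A minimal sketch of the courseOffering example; the class list, capacity, and method names are assumptions, not the book's actual code. The synchronized keyword locks the CourseOffering for the whole method, so the size check and the add cannot be interleaved by two student threads:

    import java.util.ArrayList;
    import java.util.List;

    class CourseOffering {
        private final List<String> classList = new ArrayList<>();
        private final int capacity = 30;  // assumed maximum class size

        // Locks this CourseOffering until the method returns, so two
        // threads cannot both pass the size check and overfill the list.
        public synchronized boolean addStudent(String studentId) {
            if (classList.size() >= capacity)
                return false;  // precondition: the offering must not be full
            classList.add(studentId);
            return true;
        }
    }

    public class RaceDemo {
        public static void main(String[] args) throws InterruptedException {
            CourseOffering offering = new CourseOffering();
            // Two students try to add the same offering at the same time.
            Thread alice = new Thread(() -> offering.addStudent("alice"));
            Thread bob   = new Thread(() -> offering.addStudent("bob"));
            alice.start(); bob.start();
            alice.join();  bob.join();
        }
    }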
Defects in Handling Other Unusual Situations

1. Insufficient throughput or response time.
2. Incompatibility with specific hardware/software configurations.
3. Inability to handle peak loads or missing resources.
4. Inappropriate management of resources.
5. Inability to recover from a crash.
6. Ineffective documentation (user manual, reference manual, or on-line help).
Strategies for Testing Large Systems

Big bang vs incremental testing:
— In big bang testing, you test the entire system as a unit.
— A better strategy is incremental testing: first test each individual subsystem alone (unit testing), then add more and more subsystems and test them one at a time (integration testing).
— Integration can proceed horizontally or vertically, depending on the architecture (e.g., a client-server architecture allows horizontal testing: server side first, client side second).
Top-down vs Bottom-up Testing

Top-down:
1. Start by testing the user interface (GUI).
— Simulate the underlying functionality using stubs (code with the same interface as the real code, but no functionality).
2. Then work downward, integrating lower and lower layers one at a time.

Bottom-up:
1. Start by testing the very lowest levels of the software.
— Use drivers to test these modules (drivers are simple programs that call the modules in the lower layers).
2. Then work upward, replacing the drivers with the actual modules that call the lower-level modules.

A minimal sketch of a stub and a driver follows.
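A minimal sketch of the two supporting artifacts; ScheduleService and its method are invented names for illustration, not an API from the book:

    // Hypothetical interface between an upper (GUI) layer and a lower layer.
    interface ScheduleService {
        int courseCount(String studentId);
    }

    // Top-down: a stub shares the interface but returns canned answers,
    // letting the upper layer be tested before the real service exists.
    class ScheduleServiceStub implements ScheduleService {
        public int courseCount(String studentId) {
            return 3;  // fixed, fake answer
        }
    }

    // Bottom-up: a driver is a small program that calls a lower-level
    // module directly, before any real caller has been written.
    public class ScheduleServiceDriver {
        public static void main(String[] args) {
            ScheduleService service = new ScheduleServiceStub();
            // In real bottom-up testing this would be the actual module;
            // the stub stands in here so the sketch runs on its own.
            System.out.println("Courses for s42: " + service.courseCount("s42"));
        }
    }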
Strategies for incremental testing (figure from the original slide; not reproduced here)
The Test-Fix-Test Cycle

When testing exposes a failure:
1. A failure report is entered into a failure tracking system.
2. The failure is screened and assigned a priority.
3. Low-priority failures might be put on a known-bugs list included with the software's release notes.
4. Some failure reports might be merged if they seem to expose the same defects.
5. The failure is investigated.
6. The defect causing the failure is tracked down and fixed.
7. A new version of the software is created, and the cycle repeats.
The Ripple Effect

Efforts to remove one defect will likely add new ones:
— The maintainer tries to fix problems without fully understanding the ramifications.
— The maintainer makes ordinary human errors.
— The system can regress into a more and more failure-prone state.

Regression testing guards against this. Because it is too expensive to re-run every test case every time the software is updated, regression testing re-runs only a subset of the previously successful test cases at each iteration (focusing on the trouble spots). Regression test cases are carefully selected to cover as much of the system as possible.
So When Do We Stop Testing?

Stop testing when:
1. All the level 1 test cases have been executed successfully,
2. A certain predefined percentage of level 2 and level 3 test cases have been executed successfully, and
3. The targets have been achieved and are maintained for at least two build cycles, where a build involves compiling and integrating all the system's components.

Failure rates fluctuate between builds because:
— different sets of regression tests are used, and
— new defects are introduced as old ones are fixed.
Who Is Involved in Testing?

1. The original developers conduct the first pass of unit and integration testing.
2. A separate group of developers conducts independent testing:
— they have no vested interest, and
— they have specific expertise in test case design and test tool use.
3. Users and clients:
— Alpha testing: performed under the supervision of the software development team.
— Beta testing: performed in a normal work environment. (An open beta release is the release of low-quality software to the general population.)
— Acceptance testing: performed by customers on their own initiative.
Inspections

An inspection is an activity in which one or more people critically examine source code or documentation, looking for defects.
— Inspections are normally team activities, with defined roles: the author, the moderator, the secretary, and the paraphrasers (who try to explain the code in their own words).
— Inspection is a peer review process.
— Inspect only completed documents.
— Inspection is complementary to testing: it is better at finding maintainability and efficiency defects.
— Inspect before testing.
Quality Assurance: When Things Go Wrong…

Perform root cause analysis: determine whether problems are caused by
— lack of training,
— schedules that are too tight, or
— poor designs or poor choices of reusable components.

Measure:
— the number of failures encountered by users,
— the number of failures found when testing,
— the number of failures found when inspecting,
— the percentage of code that is reused, and
— the number of questions asked by users at the help desk (as a measure of usability and documentation quality).

Strive for continual improvement.
Software Process Standards

The Personal Software Process (PSP): a disciplined approach that a developer can use to improve the quality and efficiency of his or her personal work. One of its key tenets is personally inspecting your own work.
The Team Software Process (TSP): describes how teams of software engineers can work together effectively.
The Software Capability Maturity Model (CMM): contains five levels; organizations start at level 1, and as their processes improve they can move up toward level 5.
ISO 9000-3: an international standard describing how an organization can improve its overall software process.
Difficulties and Risks in Quality Assurance

It is easy to forget to test some aspects of a software system:
— 'running the code a few times' is not enough, and
— forgetting certain types of tests hurts quality.

There is a natural conflict between quality and meeting deadlines. So:
— Create a separate department to oversee QA.
— Publish statistics about quality.
— Plan adequate time for all activities.

People have different skills, knowledge, and preferences when it comes to quality. So:
— Assign people tasks that fit their strengths.
— Train people in testing and inspection techniques.
— Give feedback about performance with respect to software quality.
— Have developers and maintainers take turns working on a testing team.