CS240: Advanced Programming Concepts

Week 15 Tuesday

Software Testing Testing is the process of detecting errors by running the actual software and verifying that it works as it should Test cases, Expected results, Actual results Testing is by far the most popular QA activity (but not the most effective) Technical reviews (design reviews, code reviews, etc.) are cheaper and more effective than testing, but are often not done Research has shown that all forms of testing combined usually find less than 60% of the errors present

Software Testing
- Testing is unlike other software development activities because the goal is to break the software rather than to create it
- Effective testing requires an assumption that defects actually exist, and a desire to find them
  - If you think you won't find defects, or you don't want to, you won't be effective in your testing
- Testing by both developers and an independent testing group is essential
  - They have different perspectives and motivations
  - They do different kinds of tests (developers do white box, test team does black box), which tend to discover different types of defects

Software Testing
- There are many different types of testing. Three of the most important are:
- Unit Testing: testing individual modules (e.g., classes) to make sure they work in isolation before combining them with the rest of the system
- Integration Testing: testing the combination of multiple modules after they have been integrated together
  - If the individual modules work in isolation, can there possibly be defects in their combination? YES! The interactions between the modules can contain defects
- System Testing: testing done on the entire program, after it is completely integrated

Software Testing: The Bad News
Exhaustively testing software is not feasible The number of possible input combinations is effectively infinite The number of unique paths through the code is effectively infinite You might not live long enough to exhaustively test a non-trivial software system We must do partial testing because we only have enough resources (time and money) to run relatively few test cases Partial testing can never prove the absence of defects: If the system passes all your test cases, there could still be defects, you just need more or better test cases to find them

Software Testing: The Good News
Defects are not evenly distributed (i.e., they tend to cluster) Research has shown that: 80% of a system's defects are found in 20% of its code 50% of a system's defects are found in 5% of its code There is a high correlation between bugs and complex code. Use tools to measure code complexity, and focus testing on those modules with the most complex code One goal of testing is to identify the most problematic modules Redesign may be needed if there is an inherent design flaw Or, replace buggy module with a third-party library/product

Software Testing Effective testing lies in intelligently choosing the relatively few test cases that will actually be executed Test all requirements and features defined in the requirements spec. and functional spec. Test cases should not be redundant (i.e., each one should follow a different path through the code) Focus on scenarios that users are likely to encounter in practice Analyze the program’s design and code to find potential weak areas Analyze all points at which data enters the system and look for ways to attack it

Software Testing
- Approaches to test case design are generally divided into two broad categories: Black Box Testing and White Box Testing
- Black Box Testing
  - The tester has limited knowledge of the inner workings of the item being tested
  - Test cases are based on the specification of the item's external behavior
  - Can be done at the Unit, Integration, and System levels
- White Box Testing
  - The tester has knowledge of the inner workings of the item being tested
  - Test cases are based on the specification of the item's external behavior AND knowledge of its internal implementation
  - Most commonly done at the Unit level

White Box Testing Sources: Code Complete, 2nd Ed., Steve McConnell
Software Engineering, 5th Ed., Roger Pressman

White Box Testing From a testing perspective, looking at the class's internal implementation, in addition to its inputs and expected outputs, enables you to test it more thoroughly Testing that is based both on expected external behavior and knowledge of internal implementation is called "white box testing"

White Box Testing White box testing is primarily used during unit testing Unit testing is usually performed by the engineer who wrote the code In some cases an independent tester might do unit testing

Complete Path Coverage
- Test ALL possible paths through a subroutine
  - Some paths may be impossible to achieve. Skip those paths
- Often there are too many paths to test them all, especially if there are loops in the code. In this case, we use less complete approaches:
  - Line coverage
  - Branch coverage
  - Condition testing
  - Loop testing

Line coverage At a minimum, every line of code should be executed by at least one test case Developers tend to significantly overestimate the level of line coverage achieved by their tests Coverage tools (like Cobertura) are important for getting a realistic sense of how completely your tests cover the code Complete line coverage is necessary, but not sufficient

Branch coverage
- Similar to line coverage, but stronger
- Test every branch in all possible directions
  - If statements: test both positive and negative directions
  - Switch statements: test every branch
    - If there is no default case, test a value that doesn't match any case
  - Loop statements: test for both 0 and > 0 iterations

Branch coverage
- Why isn't branch coverage the same thing as line coverage?
  - Consider an if with no else, or a switch with no default case
  - Line coverage can be achieved without achieving branch coverage
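A minimal sketch of the gap between the two measures. The function `clampToZero` is a hypothetical example, not from the course material: a single test with a negative input executes every line, yet never exercises the direction in which the if body is skipped.

```cpp
#include <cassert>

// Hypothetical routine with an if that has no else.
int clampToZero(int value) {
    int result = value;
    if (value < 0) {
        result = 0;   // the only statement inside the branch
    }
    return result;
}

// This single test executes every line of clampToZero,
// but only the true direction of the if.
void lineCoverageOnly() {
    assert(clampToZero(-5) == 0);
}

// Branch coverage also requires the false direction,
// where the body of the if is skipped.
void branchCoverage() {
    assert(clampToZero(-5) == 0);  // true direction
    assert(clampToZero(7) == 7);   // false direction
}
```

If the skipped-body path contained a defect (for example, a variable left uninitialized), only the second set of tests could reveal it.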

Complete Condition testing
- For each compound condition, C
  - Find the simple sub-expressions that make up C
    - Simple pieces with no ANDs or ORs
    - Suppose there are n of them
  - Create a test case for all 2^n T/F combinations of the simple sub-expressions
- Example:
      if (!done && (value < 100 || c == 'X')) …
  - Simple sub-expressions: !done, value < 100, c == 'X'
  - n = 3
  - Need 2^3 = 8 test cases to test all possibilities

Complete Condition testing
- Use a "truth table" to make sure that all possible combinations are covered by your test cases
- Doing this kind of exhaustive condition testing everywhere is usually not feasible
- Some combinations might be impossible to achieve (omit these cases, since they are impossible)

            !done    value < 100    c == 'X'
  Case 1:   False    False          False
  Case 2:   False    False          True
  Case 3:   False    True           False
  Case 4:   False    True           True
  Case 5:   True     False          False
  Case 6:   True     False          True
  Case 7:   True     True           False
  Case 8:   True     True           True
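The truth table can also be generated mechanically. The sketch below is one possible way to do it, assuming the condition from the slide; the `Case` struct, the helper names, and the concrete input values (50/150, 'X'/'Y') are illustrative choices, not part of the course material.

```cpp
#include <cassert>
#include <vector>

// The compound condition from the slide.
bool condition(bool done, int value, char c) {
    return !done && (value < 100 || c == 'X');
}

// One concrete test input (hypothetical type for this sketch).
struct Case { bool done; int value; char c; };

// Build one input for each of the 2^3 truth-table rows.
std::vector<Case> allCombinations() {
    std::vector<Case> cases;
    for (int bits = 0; bits < 8; ++bits) {
        bool notDone  = (bits & 4) != 0;  // desired truth value of !done
        bool lessThan = (bits & 2) != 0;  // desired truth value of value < 100
        bool isX      = (bits & 1) != 0;  // desired truth value of c == 'X'
        cases.push_back({ !notDone, lessThan ? 50 : 150, isX ? 'X' : 'Y' });
    }
    return cases;
}

// Count how many rows make the whole condition true.
int countTrue() {
    int n = 0;
    for (const Case & t : allCombinations()) {
        if (condition(t.done, t.value, t.c)) ++n;
    }
    return n;
}
```

Three of the eight rows satisfy the condition: !done must be true, and at least one of the other two sub-expressions must be true.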

Partial Condition Testing
- A partial, more feasible approach
- For each condition, C, test the True and False branches of C and every sub-expression (simple or not) within C, but not all possible combinations
      if (!done && (value < 100 || c == 'X')) …
  - !done, both T and F
  - value < 100, both T and F
  - c == 'X', both T and F
  - (value < 100 || c == 'X'), both T and F
  - (!done && (value < 100 || c == 'X')), both T and F
- One test case may cover several of these, thus reducing the number of required test cases

Partial Condition testing
- This is similar to what Cobertura calls branch coverage, except that they only consider the True and False cases of simple sub-expressions
- The test cases for a particular sub-expression must actually execute that sub-expression
      if (!done && (value < 100 || c == 'X')) …
  - Think about short-circuiting
  - Above, if done is T, the rest of the expression doesn't matter anyway
  - The test cases for value < 100 would need to set done to F
  - The test cases for c == 'X' would need to set done to F and value >= 100
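The short-circuiting point can be made visible by instrumenting the condition. This is only a sketch: the `Trace` struct and the lambda wrappers are assumptions introduced here to record which simple sub-expressions actually run for a given input.

```cpp
#include <cassert>

// Records which simple sub-expressions were actually evaluated.
struct Trace {
    bool sawNotDone = false;
    bool sawLess = false;
    bool sawIsX = false;
};

// Same condition as on the slide, with each sub-expression wrapped
// so that evaluating it leaves a mark in the trace.
bool condition(bool done, int value, char c, Trace & t) {
    auto notDone = [&]() -> bool { t.sawNotDone = true; return !done; };
    auto less    = [&]() -> bool { t.sawLess = true;    return value < 100; };
    auto isX     = [&]() -> bool { t.sawIsX = true;     return c == 'X'; };
    return notDone() && (less() || isX());
}

Trace run(bool done, int value, char c) {
    Trace t;
    condition(done, value, c, t);
    return t;
}

void shortCircuitTests() {
    // done is T: && short-circuits, nothing else is evaluated.
    Trace a = run(true, 50, 'X');
    assert(a.sawNotDone && !a.sawLess && !a.sawIsX);

    // done is F and value < 100: || short-circuits, c == 'X' is skipped.
    Trace b = run(false, 50, 'X');
    assert(b.sawNotDone && b.sawLess && !b.sawIsX);

    // done is F and value >= 100: only now is c == 'X' executed.
    Trace c = run(false, 150, 'X');
    assert(c.sawNotDone && c.sawLess && c.sawIsX);
}
```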

What test cases do we need to achieve
- Line coverage?
- Branch coverage?
- Complete condition testing?
- Partial condition testing?

    // Compute Net Pay
    totalWithholdings = 0;
    for ( id = 0; id < numEmployees; ++id ) {
        // compute social security withholding, if below the maximum
        if ( m_employee[ id ].governmentRetirementWithheld < MAX_GOVT_RETIREMENT ) {
            governmentRetirement = ComputeGovernmentRetirement( m_employee[ id ] );
        }

        // set default to no retirement contribution
        companyRetirement = 0;

        // determine discretionary employee retirement contribution
        if ( m_employee[ id ].WantsRetirement && EligibleForRetirement( m_employee[ id ] ) ) {
            companyRetirement = GetRetirement( m_employee[ id ] );
        }

        grossPay = ComputeGrossPay( m_employee[ id ] );

        // determine IRA contribution
        personalRetirement = 0;
        if ( EligibleForPersonalRetirement( m_employee[ id ] ) ) {
            personalRetirement = PersonalRetirementContribution( m_employee[ id ], companyRetirement, grossPay );
        }

        // make weekly paycheck
        withholding = ComputeWithholding( m_employee[ id ] );
        netPay = grossPay - withholding - companyRetirement - governmentRetirement - personalRetirement;
        PayEmployee( m_employee[ id ], netPay );

        // add this employee's paycheck to total for accounting
        totalWithholdings += withholding;
        totalGovernmentRetirement += governmentRetirement;
        totalRetirement += companyRetirement;
    }
    SavePayRecords( totalWithholdings, totalGovernmentRetirement, totalRetirement );

Loop Testing
- Design test cases based on looping structure of the routine
- Testing loops
  - Skip loop entirely
  - One pass
  - Two passes
  - N-1, N, and N+1 passes [N is the maximum number of passes]
  - M passes, where 2 < M < N-1

Loop Testing
What test cases do we need?

    int ReadLine(istream & is, char buf[], int bufLen) {
        int count = 0;
        while (count < bufLen) {
            int c = is.get();
            if (c != -1 && c != '\n')
                buf[count++] = (char)c;
            else
                break;
        }
        return count;
    }

- Skip loop entirely: bufLen == 0
- Exactly one pass: line of length 1 (including the '\n') OR bufLen == 1
- Exactly two passes: line of length 2 OR bufLen == 2
- N-1, N, and N+1 passes: lines of length bufLen-1, bufLen, and bufLen+1
- M passes, where 2 < M < N-1: line of length bufLen / 2
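The loop test cases listed for ReadLine could be written as assertions along the following lines. This is a sketch: `readFrom` and `loopTests` are hypothetical helpers, the buffer length of 8 is an arbitrary choice for N, and lengths in the comments count characters before the '\n'.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// ReadLine as shown on the slide.
int ReadLine(std::istream & is, char buf[], int bufLen) {
    int count = 0;
    while (count < bufLen) {
        int c = is.get();
        if (c != -1 && c != '\n')
            buf[count++] = (char)c;
        else
            break;
    }
    return count;
}

// Hypothetical helper: run ReadLine on `text` with the given bufLen
// (bufLen must not exceed 16) and return the number of chars stored.
int readFrom(const std::string & text, int bufLen) {
    std::istringstream in(text);
    char buf[16];
    return ReadLine(in, buf, bufLen);
}

void loopTests() {
    assert(readFrom("abc\n", 0) == 0);        // skip loop entirely
    assert(readFrom("\n", 8) == 0);           // one pass: reads '\n', then breaks
    assert(readFrom("abc\n", 1) == 1);        // one pass: bufLen == 1
    assert(readFrom("a\n", 8) == 1);          // two passes
    assert(readFrom("abcd\n", 8) == 4);       // M passes, roughly bufLen / 2
    assert(readFrom("abcdefg\n", 8) == 7);    // bufLen - 1 characters
    assert(readFrom("abcdefgh\n", 8) == 8);   // bufLen characters
    assert(readFrom("abcdefghi\n", 8) == 8);  // bufLen + 1: input is truncated
}
```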

Data Flow Testing
- The techniques discussed so far have all been based on "control flow"
- You can also design test cases based on "data flow" (i.e., how data flows through the code)
- Some statements "define" a variable's value (i.e., a "variable definition")
  - Variable declarations with initial values
  - Assignments
  - Incoming parameter values
- Some statements "use" a variable's value (i.e., a "variable use")
  - Expressions on right side of assignment
  - Boolean condition expressions
  - Parameter expressions

Data Flow Testing For every "use" of a variable
Determine all possible places in the program where the variable could have been defined (i.e., given its most recent value) Create a test case for each possible (Definition, Use) pair

Data Flow Testing
What test cases do we need?

    if ( Condition 1 ) {
        x = a;
    }
    else {
        x = b;
    }
    if ( Condition 2 ) {
        y = x + 1;
    }
    else {
        y = x - 1;
    }

- Definitions: 1) x = a; 2) x = b;
- Uses: 1) y = x + 1; 2) y = x - 1;
- Test cases, one per (Definition, Use) pair:
  1. (x = a, y = x + 1)
  2. (x = b, y = x + 1)
  3. (x = a, y = x - 1)
  4. (x = b, y = x - 1)
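As a sketch, the four (Definition, Use) pairs map onto four concrete tests once the fragment is wrapped in a function. The function name `compute`, the boolean parameters standing in for Condition 1 and Condition 2, and the values 10 and 20 are all assumptions made for illustration.

```cpp
#include <cassert>

// The slide's fragment wrapped in a function. cond1 and cond2 stand in
// for "Condition 1" and "Condition 2".
int compute(bool cond1, bool cond2, int a, int b) {
    int x;
    if (cond1) { x = a; } else { x = b; }   // two definitions of x
    int y;
    if (cond2) { y = x + 1; } else { y = x - 1; }   // two uses of x
    return y;
}

void defUseTests() {
    assert(compute(true,  true,  10, 20) == 11);  // (x = a, y = x + 1)
    assert(compute(false, true,  10, 20) == 21);  // (x = b, y = x + 1)
    assert(compute(true,  false, 10, 20) == 9);   // (x = a, y = x - 1)
    assert(compute(false, false, 10, 20) == 19);  // (x = b, y = x - 1)
}
```

Branch coverage alone could be satisfied by just the first and last tests; data flow testing asks for all four.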

Data Flow Testing Example Use data flow testing to design a set of test cases for this subroutine.

Relational condition testing
- Testing relational sub-expressions (E1 op E2)
  - op is one of ==, !=, <, <=, >, >=
- Three test cases to try:
  - Test E1 == E2
  - Test E1 slightly bigger than E2
  - Test E1 slightly smaller than E2
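A small illustration of the three cases, using a hypothetical eligibility check (the names `MIN_AGE` and `isEligible` are invented for this sketch). The E1 == E2 case is the one that distinguishes `>=` from a mistaken `>`.

```cpp
#include <cassert>

const int MIN_AGE = 18;  // hypothetical threshold (plays the role of E2)

// Relational sub-expression under test: age >= MIN_AGE
bool isEligible(int age) {
    return age >= MIN_AGE;
}

void relationalTests() {
    assert(isEligible(MIN_AGE));        // E1 == E2: fails if >= was written as >
    assert(isEligible(MIN_AGE + 1));    // E1 slightly bigger than E2
    assert(!isEligible(MIN_AGE - 1));   // E1 slightly smaller than E2
}
```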

Internal Boundary Testing

    void sort(int[] data) {
        if (data.length < 30)
            insertionSort(data);
        else
            quickSort(data);
    }

- Look for boundary conditions in the code, and create test cases for boundary - 1, boundary, boundary + 1

Internal Boundary Testing

    const int CHUNK_SIZE = 100;

    char * ReadLine(istream & is) {
        int c = is.get();
        if (c == -1) {
            return 0;
        }
        char * buf = new char[CHUNK_SIZE];
        int bufSize = CHUNK_SIZE;
        int strSize = 0;
        while (c != '\n' && c != -1) {
            if (strSize == bufSize - 1) {
                buf = Grow(buf, bufSize);
                bufSize += CHUNK_SIZE;
            }
            buf[strSize++] = (char)c;
            c = is.get();
        }
        buf[strSize] = '\0';
        return buf;
    }

- What test cases do we need?
  - Lines of length 99, 100, 101
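A sketch of how the 99/100/101 cases might be automated. The slide does not show `Grow`, so the version here is an assumption (it copies the buffer into one that is CHUNK_SIZE larger); `roundTrip` is likewise a hypothetical helper.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>

const int CHUNK_SIZE = 100;

// Assumed helper, not shown in the original: copy buf into a new
// array that is CHUNK_SIZE larger and free the old one.
char * Grow(char * buf, int bufSize) {
    char * bigger = new char[bufSize + CHUNK_SIZE];
    std::memcpy(bigger, buf, bufSize);
    delete [] buf;
    return bigger;
}

// ReadLine as shown on the slide.
char * ReadLine(std::istream & is) {
    int c = is.get();
    if (c == -1) {
        return 0;
    }
    char * buf = new char[CHUNK_SIZE];
    int bufSize = CHUNK_SIZE;
    int strSize = 0;
    while (c != '\n' && c != -1) {
        if (strSize == bufSize - 1) {
            buf = Grow(buf, bufSize);
            bufSize += CHUNK_SIZE;
        }
        buf[strSize++] = (char)c;
        c = is.get();
    }
    buf[strSize] = '\0';
    return buf;
}

// Feed a line of n 'a' characters to ReadLine and return the
// length of the string that comes back.
int roundTrip(int n) {
    std::istringstream in(std::string(n, 'a') + "\n");
    char * line = ReadLine(in);
    int len = static_cast<int>(std::strlen(line));
    delete [] line;
    return len;
}

void internalBoundaryTests() {
    assert(roundTrip(99) == 99);    // fills the first chunk exactly, no Grow
    assert(roundTrip(100) == 100);  // first call to Grow
    assert(roundTrip(101) == 101);  // just past the boundary
}
```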

Data Type Errors
- Scan the code for data type-related errors such as:
  - Arithmetic overflow
    - If two numbers are multiplied together, what happens if they're both large positive values? Large negative values?
    - Is divide-by-zero possible?
  - Other kinds of overflow
    - If two strings are concatenated together, what happens if they're both unusually long?
  - Casting a larger numeric data type to a smaller one
        short s = (short)x; // x is an int
  - Combined signed/unsigned arithmetic
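Two of these checks can be sketched as small predicates that tests (or the code itself) can call before casting or multiplying. The function names are hypothetical, and the multiplication check assumes that `long long` is wider than `int`, which holds on common platforms but is not guaranteed by the language.

```cpp
#include <cassert>
#include <limits>

// True if x can be stored in a short without changing its value,
// i.e. the cast (short)x is safe.
bool fitsInShort(int x) {
    return x >= std::numeric_limits<short>::min()
        && x <= std::numeric_limits<short>::max();
}

// True if a * b would not fit in an int.
// Assumption: long long is wider than int on this platform.
bool productOverflowsInt(int a, int b) {
    long long product = static_cast<long long>(a) * b;
    return product > std::numeric_limits<int>::max()
        || product < std::numeric_limits<int>::min();
}

void dataTypeTests() {
    // Casting: values right at and just beyond the range of short.
    assert(fitsInShort(std::numeric_limits<short>::max()));
    assert(!fitsInShort(std::numeric_limits<short>::max() + 1));
    assert(!fitsInShort(std::numeric_limits<short>::min() - 1));

    // Multiplication: large positive and large negative operands.
    assert(productOverflowsInt(std::numeric_limits<int>::max(), 2));
    assert(productOverflowsInt(std::numeric_limits<int>::min(), 2));
    assert(!productOverflowsInt(1000, 1000));
}
```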

Built-in Assumptions Scan the code for built-in assumptions that may be incorrect Year begins with 19 Age is less than 100 String is non-empty Protocol in URL is all lower-case What about " or FTP://...?

Limitations of white box testing
Whatever blind spots you had when writing the code will carry over into your white box testing Testing by independent test group is also necessary Developers often test with the intent to prove that the code works rather than proving that it doesn't work Developers tend to skip the more sophisticated types of white box tests (e.g., condition testing, data flow testing, loop testing, etc.), relying mostly on line coverage White box testing focuses on testing the code that's there. If something is missing (e.g., you forgot to handle a particular case), white box testing might not help you. There are many kinds of errors that white box testing won't find Timing and concurrency bugs Performance problems Usability problems Etc.

Black Box Testing Sources: Code Complete, 2nd Ed., Steve McConnell
Software Engineering, 5th Ed., Roger Pressman
Testing Computer Software, 2nd Ed., Cem Kaner, et al.

Black Box Testing Testing software against a specification of its external behavior without knowledge of internal implementation details Can be applied to software “units” (e.g., classes) or to entire programs External behavior is defined in API docs, Functional specs, Requirements specs, etc. Because black box testing purposely disregards the program's control structure, attention is focused primarily on the information domain (i.e., data that goes in, data that comes out) The Goal: Derive sets of input conditions (test cases) that fully exercise the external functionality

Black Box Testing Black box testing tends to find different kinds of errors than white box testing Missing functions Usability problems Performance problems Concurrency and timing errors Initialization and termination errors Etc. Unlike white box testing, black box testing tends to be applied later in the development process Black Box Testing Example

The Information Domain: inputs and outputs
Individual input values Try many different values for each individual input Combinations of inputs Individual inputs are not independent from each other Programs process multiple input values together, not just one at a time Try many different combinations of inputs in order to achieve good coverage of the input domain Ordering and Timing of inputs In addition to the particular combination of input values chosen, the ordering and timing of the inputs can also make a difference

The Information Domain: inputs and outputs
- Defining the input domain
  - Boolean value
    - T or F
  - Numeric value in a particular range
    - e.g., -99 <= N <= 99
    - Integer, Floating point
  - One of a fixed set of enumerated values
    - {Jan, Feb, Mar, …}
    - {Visa, MasterCard, Discover, …}
  - Formatted strings
    - Phone numbers
    - File names
    - URLs
    - Credit card numbers
    - Regular expressions

Equivalence Partitioning
Typically the universe of all possible test cases is so large that you cannot try them all You have to select a relatively small number of test cases to actually run Which test cases should you choose? Equivalence partitioning helps answer this question

Partition the test cases into "equivalence classes" Each equivalence class contains a set of "equivalent" test cases Two test cases are considered to be equivalent if we expect the program to process them both in the same way (i.e., follow the same path through the code) If you expect the program to process two test cases in the same way, only test one of them, thus reducing the number of test cases you have to run

First-level partitioning: Valid vs. Invalid test cases Valid Invalid

Partition valid and invalid test cases into equivalence classes

Create a test case for at least one value from each equivalence class

- When designing test cases, you may use different definitions of "equivalence", each of which will partition the test case space differently
- Example: int Add(n1, n2, n3, …)
  - Equivalence Definition 1: partition test cases by the number of inputs (1, 2, 3, etc.)
  - Equivalence Definition 2: partition test cases by the signs of the numbers they contain (positive, negative, both)
  - Equivalence Definition 3: partition test cases by the magnitude of operands (large numbers, small numbers, both)
  - Etc.

When designing test cases, you may use different definitions of “equivalence”, each of which will partition the test case space differently Example: string Fetch(URL) Equivalence Definition 1: partition test cases by URL protocol (“http”, “https”, “ftp”, “file”, etc.) Equivalence Definition 2: partition test cases by type of file being retrieved (HTML, GIF, JPEG, Plain Text, etc.) Equivalence Definition 3: partition test cases by length of URL (very short, short, medium, long, very long, etc.) Etc.

If an oracle is available, the test values in each equivalence class can be randomly generated. This is more useful than always testing the same static values. Oracle: something that can tell you whether a test passed or failed Test multiple values in each equivalence class. Often you’re not sure if you have defined the equivalence classes correctly or completely, and testing multiple values in each class is more thorough than relying on a single value.

Equivalence Partitioning - examples
Input: An integer N such that -99 <= N <= 99
  Valid equivalence classes:
    [-99, -10]
    [-9, -1]
    0
    [1, 9]
    [10, 99]
  Invalid equivalence classes:
    < -99
    > 99
    Malformed numbers {12-, 1-2-3, …}
    Non-numeric strings {junk, 1E2, $13}
    Empty value

Input: Phone Number (Area code: [200, 999]; Prefix: (200, 999]; Suffix: any 4 digits)
  Valid equivalence classes:
    (555)
    200 <= Area code <= 999
    200 < Prefix <= 999
  Invalid equivalence classes:
    Invalid format (e.g., (555)(555)5555)
    Area code < 200 or > 999
    Area code with non-numeric characters
    Similar for Prefix and Suffix
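One way to turn the classes for N into tests is to pick a representative from each class and check it against a validator. The validator below, `isValidN`, is a hypothetical stand-in for whatever code parses the input; it accepts an optional '-' followed by one or two digits.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical validator for "an integer N such that -99 <= N <= 99",
// given as text: an optional '-' followed by one or two digits.
bool isValidN(const std::string & s) {
    std::size_t i = 0;
    if (i < s.size() && s[i] == '-') ++i;
    std::size_t digits = s.size() - i;
    if (digits < 1 || digits > 2) return false;
    for (; i < s.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(s[i]))) return false;
    }
    return true;
}

void equivalenceClassTests() {
    // One representative per valid class.
    assert(isValidN("-42"));   // [-99, -10]
    assert(isValidN("-5"));    // [-9, -1]
    assert(isValidN("0"));     // 0
    assert(isValidN("7"));     // [1, 9]
    assert(isValidN("42"));    // [10, 99]

    // One representative per invalid class.
    assert(!isValidN("-100")); // < -99
    assert(!isValidN("100"));  // > 99
    assert(!isValidN("12-"));  // malformed number
    assert(!isValidN("1-"));   // malformed number
    assert(!isValidN("junk")); // non-numeric string
    assert(!isValidN("$13"));  // non-numeric string
    assert(!isValidN(""));     // empty value
}
```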

Boundary Value Analysis
- When choosing values from an equivalence class to test, use the values that are most likely to cause the program to fail
- Errors tend to occur at the boundaries of equivalence classes rather than at the "center"

  Wrong:
      if (200 < areaCode && areaCode < 999) {
          // valid area code
      }
  Right:
      if (200 <= areaCode && areaCode <= 999) {
          // valid area code
      }

  - Testing area codes 200 and 999 would catch this error, but a center value like 770 would not
- In addition to testing center values, we should also test boundary values
  - Right on a boundary
  - Very close to a boundary on either side
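The two versions of the area code check can be compared directly. In this sketch the function names are invented; the point is that the center value cannot tell the two apart, while the boundary values can.

```cpp
#include <cassert>

// The faulty check from the slide: excludes 200 and 999.
bool validAreaCodeBuggy(int areaCode) {
    return 200 < areaCode && areaCode < 999;
}

// The corrected check: area codes in [200, 999] are valid.
bool validAreaCode(int areaCode) {
    return 200 <= areaCode && areaCode <= 999;
}

void boundaryTests() {
    // A center value gives the same answer for both versions.
    assert(validAreaCode(770) == validAreaCodeBuggy(770));

    // Values right on the boundaries expose the defect.
    assert(validAreaCode(200) && !validAreaCodeBuggy(200));
    assert(validAreaCode(999) && !validAreaCodeBuggy(999));

    // Values just outside the boundaries must be rejected.
    assert(!validAreaCode(199));
    assert(!validAreaCode(1000));
}
```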

Boundary Value Analysis
Create test cases to test boundaries of equivalence classes

Boundary Value Analysis - examples
Input: A number N such that -99 <= N <= 99
  Boundary cases:
    -100, -99, -98
    -10, -9
    -1, 0, 1
    9, 10
    98, 99, 100

Input: Phone Number (Area code: [200, 999]; Prefix: (200, 999]; Suffix: any 4 digits)
  Boundary cases:
    Area code: 199, 200, 201
    Area code: 998, 999, 1000
    Prefix: 199, 200, 201
    Prefix: 998, 999, 1000
    Suffix: 3 digits, 5 digits

- Numeric values are often entered as strings which are then converted to numbers internally [int x = atoi(str);]
- This conversion requires the program to distinguish between digits and non-digits
- A boundary case to consider: Will the program accept / and : as digits?

  Char:    /    0    1    2    3    4    5    6    7    8    9    :
  ASCII:   47   48   49   50   51   52   53   54   55   56   57   58
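A sketch of the digit boundary test, assuming an ASCII-compatible character set. `isDigitBuggy` is a deliberately faulty, hypothetical classifier that is off by one on both ends; tests using only '0' through '9' would not distinguish it from the correct version.

```cpp
#include <cassert>

// Hypothetical faulty classifier: off by one at both ends,
// so it accepts '/' (47) and ':' (58) as digits.
bool isDigitBuggy(char ch) {
    return ch >= '/' && ch <= ':';
}

// Correct classifier: only '0' (48) through '9' (57).
bool isDigit(char ch) {
    return ch >= '0' && ch <= '9';
}

void digitBoundaryTests() {
    // Both versions agree on every real digit.
    for (char ch = '0'; ch <= '9'; ++ch) {
        assert(isDigit(ch) && isDigitBuggy(ch));
    }
    // Only the characters adjacent to the digit range expose the bug.
    assert(!isDigit('/') && isDigitBuggy('/'));
    assert(!isDigit(':') && isDigitBuggy(':'));
}
```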

Mainstream usage testing
Don't get so wrapped up in testing boundary cases that you neglect to test "normal" input values Values that users would typically enter during mainstream usage

Black Box Testing Examples
Triangle Classification Next Date Substring Search

Ad Hoc Exploratory Testing (Error Guessing)
Based on intuition, guess what kinds of inputs might cause the program to fail Create some test cases based on your guesses Intuition will often lead you toward boundary cases, but not always Some special cases aren't boundary values, but are mishandled by many programs Try exiting the program while it's still starting up Try loading a corrupted file Try strange but legal URLs:

Comparison Testing Also called Back-to-Back testing
If you have multiple implementations of the same functionality, you can run test inputs through both implementations, and compare the results for equality Why would you have access to multiple implementations? Safety-critical systems sometimes use multiple, independent implementations of critical modules to ensure the accuracy of results You might use a competitor's product, or an earlier version of your own, as the second implementation You might write a software simulation of a new chip that serves as the specification to the hardware designers. After building the chip, you could compare the results computed by the chip hardware with the results computed by the software simulator Inputs may be randomly generated or designed manually EX: Nvidia graphics chips specified with software simulation. Real chip compared back-to-back with simulation to verify implementation. EX: Database drivers EX: Web Crawler in 240
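A minimal back-to-back sketch with randomly generated inputs. The implementation under test here is a hand-written insertion sort (a hypothetical stand-in), and `std::sort` plays the role of the second, trusted implementation; the seed, trial count, and value ranges are arbitrary choices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Implementation under test (hypothetical): a simple insertion sort.
void insertionSort(std::vector<int> & data) {
    for (std::size_t i = 1; i < data.size(); ++i) {
        int key = data[i];
        std::size_t j = i;
        while (j > 0 && data[j - 1] > key) {
            data[j] = data[j - 1];
            --j;
        }
        data[j] = key;
    }
}

// Run both implementations on the same random inputs and compare
// the results for equality. Returns false on the first mismatch.
bool backToBack(int trials, unsigned seed) {
    std::mt19937 rng(seed);  // fixed seed keeps failures reproducible
    std::uniform_int_distribution<int> lengthDist(0, 50);
    std::uniform_int_distribution<int> valueDist(-1000, 1000);
    for (int t = 0; t < trials; ++t) {
        std::vector<int> input(lengthDist(rng));
        for (int & v : input) v = valueDist(rng);

        std::vector<int> expected = input;
        std::sort(expected.begin(), expected.end());  // reference implementation

        std::vector<int> actual = input;
        insertionSort(actual);

        if (actual != expected) return false;
    }
    return true;
}
```

Here the reference implementation acts as the oracle, so no expected results need to be worked out by hand.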

Testing for race conditions and other timing dependencies
Many systems perform multiple concurrent activities Operating systems manage concurrent programs, interrupts, etc. Servers service many clients simultaneously Applications let users perform multiple concurrent actions Test a variety of different concurrency scenarios, focusing on activities that are likely to share resources (and therefore conflict) "Race conditions" are bugs that occur only when concurrent activities interleave in particular ways, thus making them difficult to reproduce Test on hardware of various speeds to ensure that your system works well on both slower and faster machines

Performance Testing Measure the system's performance
Running times of various tasks Memory usage, including memory leaks Network usage (Does it consume too much bandwidth? Does it open too many connections?) Disk usage (Is the disk footprint reasonable? Does it clean up temporary files properly?) Process/thread priorities (Does it play well with other applications, or does it hog the whole machine?)

Limit Testing Test the system at the limits of normal use
Test every limit on the program's behavior defined in the requirements Maximum number of concurrent users or connections Maximum number of open files Maximum request size Maximum file size Etc. What happens when you go slightly beyond the specified limits? Does the system's performance degrade dramatically, or gracefully?

Stress Testing Test the system under extreme conditions (i.e., beyond the limits of normal use) Create test cases that demand resources in abnormal quantity, frequency, or volume Low memory conditions Disk faults (read/write failures, full disk, file corruption, etc.) Network faults Unusually high number of requests Unusually large requests or files Unusually high data rates (what happens if the network suddenly becomes ten times faster?) Even if the system doesn't need to work in such extreme conditions, stress testing is an excellent way to find bugs

Random Testing Randomly generate test inputs
Could be based on some statistical model How do you tell if the test case succeeded? Where do the expected results come from? Some type of “oracle” is needed Expected results could be calculated manually Possible, but lots of work Automated oracles can often be created to measure characteristics of the system Performance (memory usage, bandwidth, running times, etc.) Did the system crash? Maximum and average user response time under simulated user load

Security Testing Any system that manages sensitive information or performs sensitive functions may become a target for intrusion (i.e., hackers) How feasible is it to break into the system? Learn the techniques used by hackers Try whatever attacks you can think of Hire a security expert to break into the system If somebody broke in, what damage could they do? If an authorized user became disgruntled, what damage could they do?

Usability Testing Is the user interface intuitive, easy to use, organized, logical? Does it frustrate users? Are common tasks simple to do? Does it conform to platform-specific conventions? Get real users to sit down and use the software to perform some tasks Watch them performing the tasks, noting things that seem to give them trouble Get their feedback on the user interface and any suggested improvements Report bugs for any problems encountered

Recovery Testing Try turning the power off or otherwise crashing the program at arbitrary points during its execution Does the program come back up correctly when you restart it? Was the program’s persistent data corrupted (files, databases, etc.)? Was the extent of user data loss within acceptable limits? Can the program recover if its configuration files have been corrupted or deleted? What about hardware failures? Does the system need to keep working when its hardware fails? If so, verify that it does so.

Configuration Testing
Test on all required hardware configurations CPU, memory, disk, graphics card, network card, etc. Test on all required operating systems and versions thereof Virtualization technologies such as VMWare and Virtual PC are very helpful for this Test as many Hardware/OS combinations as you can Test installation programs and procedures on all relevant configurations

Compatibility Testing
Test to make sure the program is compatible with other programs it is supposed to work with Ex: Can Word 12.0 load files created with Word 11.0? Ex: "Save As… Word, Word Perfect, PDF, HTML, Plain Text" Ex: "This program is compatible with Internet Explorer and Firefox" Test all compatibility requirements

Documentation Testing
Test all instructions given in the documentation to ensure their completeness and accuracy For example, “How To ...” instructions are sometimes not updated to reflect changes in the user interface Test user documentation on real users to ensure it is clear and complete
