FOR0383 Software Quality Assurance

Lectures 22 and 23: What is the expected result? 20/09/2018 Dr Andy Brooks

The problem
Source: A Taxonomy for Test Oracles, Douglas Hoffman, Software Quality Week (QW’98), 1998, 8 pp.
Vocabulary: oracle (Icelandic: véfrétt)
Test case identifier   Test case values   Expected results   Actual results
1                      [2,7,0]
2                      [0,7,2]
How do we calculate the expected results? (And how do we know that they are correct?)

The oracle
The oracle is the mechanism used to generate expected results.
High machine speeds, cheap memory, and test automation have made it very easy to generate very large amounts of test data.
Correspondingly, the oracle being used has to be capable of generating all the expected results.
How do we know the oracle is correct?

Oracle methods used by Hoffman
A human calculates the expected results.
A separate program implementing the same algorithm is used to calculate the expected results.
A simulation is used to calculate the expected results. The simulation is typically at a higher level of abstraction: a queuing system written in a high-level simulation language can be used to test a queuing system written in Java or C++. There are many high-level simulation languages.

A debugged hardware simulator is used to calculate the expected results, e.g. a wave tank simulator or an earthquake simulator.

An earlier version of the software is used to calculate the expected results (“back-to-back testing”), e.g. v1.1.1 is used to check v1.1.2, assuming v1.1.1 is OK.
The same version of the software on a different hardware platform is used to calculate the expected results, e.g. Intel and IBM microprocessors.
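Back-to-back testing can be sketched in a few lines. The two functions below are hypothetical stand-ins for an earlier release (assumed to be correct) and a new release of the same software; the names and the pricing rule are illustrative only.

```python
def price_v1_1_1(quantity: int) -> int:
    """Hypothetical earlier release, assumed to be correct; acts as the oracle."""
    return quantity * 100


def price_v1_1_2(quantity: int) -> int:
    """Hypothetical new release under test; should behave identically."""
    return sum(100 for _ in range(quantity))


def back_to_back(old, new, inputs):
    """Return the inputs for which the two versions disagree."""
    return [x for x in inputs if old(x) != new(x)]


mismatches = back_to_back(price_v1_1_1, price_v1_1_2, range(0, 50))
print("inputs with differing results:", mismatches)
```

An empty list only shows that the two versions agree; a fault present in both versions stays hidden.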

A check is made on the consistency of generated values and end points.
  e.g. the output value should be monotonically increasing as we monotonically increase the input value
  e.g. we can easily calculate the output if the input value is 0°, 90°, 180°, 270°, or 360°
A sample of values can be checked against independently generated expected results from existing commercial or open-source software.

Background
Many organizations rely on a human oracle, but the volume of data from automated tests is overwhelming.
It is not enough to regard program termination as a successfully executed test case. Output results must be verified.
“very few errors cause noticeable abnormal termination”

Background
Verifying mathematical subroutines is relatively straightforward by using a different algorithm, language, compiler, etc.
Verifying the interrupt handling of an operating system kernel is far more difficult.
Generating complete sets of expected results is often not practical.
“It is particularly difficult to generate expected information for file directories, machine registers, system tables, memory, etc.”

Input-Process-Output testing model
Test Inputs -> SUT (System Under Test) -> Test Results
SUTs very rarely fit this simple model. There are often multiple, complex inputs and outputs.
Outputs can include values left in memory, the program state for the SUT (and other software), database values, ...

Expanded Testing Model
Vocabulary: precondition (Icelandic: forskilyrði)
Inputs to the SUT (System Under Test): Test Inputs, Precondition Data, Precondition Program State, Environmental Inputs
Outputs from the SUT: Test Results, Postcondition Data, Postcondition Program State, Environmental Results
Test designers select what they regard as the most relevant test inputs and results and choose a subset to use in verifying program behaviour.
Environmental inputs are rarely specified, and often only some preconditions are specified.
Ammann and Offutt use the terms Prefix Values and Postfix Values.
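One way to make the expanded model concrete is to record every part of it in the test case itself, so that unspecified parts are visible. The sketch below is only an illustration: the class name, field names, and example values are assumptions, not taken from Hoffman's paper.

```python
from dataclasses import dataclass, field


@dataclass
class ExpandedTestCase:
    """One test case in the expanded model; all field names are illustrative."""
    test_inputs: dict
    precondition_data: dict = field(default_factory=dict)
    precondition_state: dict = field(default_factory=dict)
    environmental_inputs: dict = field(default_factory=dict)
    expected_results: dict = field(default_factory=dict)
    expected_postcondition_data: dict = field(default_factory=dict)
    expected_postcondition_state: dict = field(default_factory=dict)
    expected_environmental_results: dict = field(default_factory=dict)

    def unspecified(self) -> list:
        """Names of the parts of the model that the test designer left empty."""
        return [name for name, value in vars(self).items() if not value]


case = ExpandedTestCase(
    test_inputs={"angle_degrees": 90},
    precondition_data={"units": "degrees"},
    expected_results={"sine": 1.0},
)
print("not specified:", case.unspecified())
```

Listing the empty parts makes explicit which subset of the model the test designer chose to verify.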

Patient monitoring (extra)
Vocabulary: vital signs (Icelandic: lífsmörk), intubation (Icelandic: pípusetning)
Suppose the SUT is software used to monitor a patient's vital signs and provide intelligent advice.
Suppose the SUT has access to the electronic patient record.

Patient monitoring (extra)
What data would you use in the electronic patient record for the purposes of testing?
  age, weight, sex, allergies, smoker/non-smoker, ...
Should the electronic patient record be updated after executing a test?
What data would you use to represent the current state of the patient?
  drug dosage, heart rate, level of consciousness, intubated or not, blood alcohol level, ...
Should the current state of the patient be updated after executing a test?
  if the advice is to intubate, change drug dosage, ...

Patient monitoring (extra)
A likely scenario is that the electronic patient record is accessed over a local or a wide-area network.
  Does the SUT download a complete copy or send SQL queries to a third party?
Should network load be an input to the testing?
  Network load is an environment variable.
  Temperature is an environment variable, but it is usually assumed that computer equipment is not exposed to too much cold or too much heat.
Should the size of the electronic patient record be an input to the testing?
  A patient with lifelong health problems will have a very large patient record.

Patient monitoring (extra)
Just prior to connecting the patient to the monitoring system, what should the patient data be?
  A blank patient record? An average patient record?
Just prior to connecting the patient to the monitoring system, what should the state of intelligent advice be?
  No advice?
Some monitoring devices may provide data at intervals rather than in real time, so initial values at t=0 may be important.

“Observations”
Several oracles may be needed for one program:
  to verify correct functional behaviour
  to verify correct screen layout and navigation
  to verify correct memory use
“... Only the SUT running in the target environment will process all the inputs and provide all the results. No matter how meticulous we are in creating an oracle, we will not achieve both independence and completeness.”

Characteristics of oracles
Completeness of information
  A complete oracle that duplicates all the results can be regarded as a second implementation of the SUT.
Accuracy of information
  A complete oracle that is as accurate as the SUT is at least as complex as the SUT.
  Differences detected by a complex oracle may be the result of faults in the oracle rather than the SUT.
  Faults will be missed if both the SUT and a complex oracle contain the same fault.

Independence of the oracle from the SUT
  Can be achieved by using different algorithms, different libraries, a different system platform (hardware), or a different operating environment (operating system).
Speed of predictions
  Is the oracle slow compared to the SUT?
Time of execution
  Is the oracle run in parallel with the SUT?

Usability of results
  An expected result calculated using pencil and paper will have to be manually copied to the test environment (adding to the expense of testing).
  Scripts may have to be written to transfer expected results from a simulation to the test environment (adding to the expense of testing).
  If there are changes to the SUT, can the oracle be easily changed? (Poor maintainability of the oracle will add to the expense of testing.)

Manual testing
To calculate expected results, the human oracle, in addition to paper and pencil, might use:
  books
  tables
  desk calculators
  mobile phone calculators
  ...

Automated testing
“Automated testing does not mean mechanical reproduction of manual tests.”
Some kind of oracle is used.

Types of oracles
Categories are based on oracle outputs, not the methods used.
True oracle
  A true oracle “faithfully reproduces all relevant results for a SUT using independent platform, algorithms, processes, compilers, code, etc”.
  Building a true oracle for a common mathematical function can be relatively straightforward. The sin() function can be implemented using different hardware and software (the CORDIC algorithm, ...).
  A true oracle can accept any inputs that the SUT can, and there is a good chance of finding faults.

Types of oracles: True oracle
  The less an oracle has in common with the SUT, the more confident we are that the results of testing are correct.
  A true oracle is as complicated as the SUT and may have its own faults...
  Common hardware and software (operating system, compiler, ...) may inject errors that affect the oracle and the SUT in the same way. Remember the Pentium bug...
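As a small illustration of a true oracle for sin(), the sketch below compares a stand-in SUT against an independently coded algorithm (a Taylor series after range reduction). This is an assumption-laden toy: both functions run on the same hardware, interpreter, and floating-point unit, so it is independent in algorithm only, which is exactly the limitation noted above. The function names and the tolerance are illustrative.

```python
import math


def sin_sut(x: float) -> float:
    """Stand-in for the system under test (here simply the library routine)."""
    return math.sin(x)


def sin_oracle(x: float, terms: int = 20) -> float:
    """Independent algorithm: Taylor series after reducing x to [-pi, pi]."""
    x = math.remainder(x, 2 * math.pi)
    term, total = x, x
    for n in range(1, terms):
        # Each term is the previous one times -x^2 / ((2n)(2n+1)).
        term *= -x * x / ((2 * n) * (2 * n + 1))
        total += term
    return total


def disagreements(inputs, tolerance=1e-9):
    """Inputs (in radians) where SUT and oracle differ by more than tolerance."""
    return [x for x in inputs if abs(sin_sut(x) - sin_oracle(x)) > tolerance]


inputs = [i / 100 for i in range(-1000, 1001)]
print("disagreements:", disagreements(inputs))
```

Because the oracle accepts any input the SUT accepts, the input list can be made as large as time allows.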

The Pentium Division Bug, 1994 (extra)
There was a fault in a lookup table used to perform division.
Example 1
  Inputs: x = , y =
  Calculation: z = x - (x/y)*y
  Expected result: 0
  Pentium answer: 256
Example 2
  Inputs: x = , y =
  Calculation: x/y
  Expected result:
  Pentium answer:
Human oracle: try calculating...
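The first calculation can be checked with an oracle that avoids floating-point division altogether. The sketch below uses exact rational arithmetic from Python's fractions module as the oracle. The inputs x = 4195835 and y = 3145727 are the values widely reported for this bug; they are an assumption here and are not taken from the slide.

```python
from fractions import Fraction

# Widely reported inputs for the 1994 bug (assumed, not from the slide).
x, y = 4195835, 3145727

# Oracle: exact rational arithmetic, no floating-point division involved.
expected = Fraction(x) - Fraction(x, y) * y

# The same calculation in floating point, as the hardware would perform it.
actual = x - (x / y) * y

print("expected:", expected)
print("actual:  ", actual)
```

On a correct floating-point unit the second value is zero or within rounding error of zero; the flawed Pentium reportedly produced 256.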

Types of oracles: Stochastic oracle
Vocabulary: stochastic (Icelandic: slembi-)
When resources are limited, a relatively small sample of inputs is used in testing.
A pseudo-random number generator may be used to select the input values which are fed to the oracle and the SUT. A statistically random sample suffers no bias.
The oracle must be capable of accepting any randomly generated value or it must be specially built for the values selected.
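A stochastic oracle can be sketched as follows. The SUT here is a hypothetical integer square root written with Newton's method, and Python's built-in math.isqrt plays the role of the oracle; the seed, sample size, and input range are arbitrary choices for illustration.

```python
import math
import random


def isqrt_sut(n: int) -> int:
    """Hypothetical SUT: integer square root by Newton's method."""
    if n < 2:
        return n
    guess = n
    while True:
        better = (guess + n // guess) // 2
        if better >= guess:
            return guess
        guess = better


def stochastic_test(sample_size: int, seed: int = 2018) -> list:
    """Feed the same pseudo-random inputs to SUT and oracle; return failures."""
    rng = random.Random(seed)  # fixed seed makes the sample reproducible
    failures = []
    for _ in range(sample_size):
        n = rng.randrange(0, 10**12)
        if isqrt_sut(n) != math.isqrt(n):  # math.isqrt serves as the oracle
            failures.append(n)
    return failures


print("failures:", stochastic_test(1000))
```

Recording the seed matters: without it, a failing sample cannot be regenerated for debugging.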

Types of oracles: Heuristic oracle
Vocabulary: heuristic (Icelandic: brjóstvitsaðferð)
Selected results are reproduced. Remaining values are checked for consistency using a heuristic.
The sin() function can be checked for selected values such as 90° (1), 180° (0), 270° (-1), and 360° (0).
Using small increments of the inputs, the outputs from the SUT can be checked for consistency: progressively greater or progressively less.
A heuristic oracle is very easy to implement and runs much faster than a true oracle.

Types of oracles: Heuristic oracle (extra)
[Figure: plot of sin() from 0° to 360°, vertical axis from -1 to 1]
The sin() function increases between 0 and 90 degrees, decreases from 90 to 270 degrees, and increases again to 360 degrees.
The heuristic oracle for sin() can detect various kinds of faults.

Types of oracles: Heuristic oracle (extra)
[Figure: plot from 0° to 360°, vertical axis from -1 to 1]
The values are increasing and decreasing as appropriate.
The heuristic oracle for sin() would not detect that the function is actually a triangle wave.
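The heuristic oracle for sin() and its blind spot can both be sketched in a few lines. The code below checks the known points and the rising/falling trend in one-degree steps; the function names, the step size, and the tolerance are assumptions made for illustration.

```python
import math


def heuristic_oracle(f, step=1):
    """Check f (argument in degrees) at known points and for the expected trend."""
    known = {0: 0.0, 90: 1.0, 180: 0.0, 270: -1.0, 360: 0.0}
    for deg, expected in known.items():
        if abs(f(deg) - expected) > 1e-9:
            return False
    for deg in range(0, 360, step):
        rising = deg < 90 or deg >= 270  # sin() falls only between 90 and 270
        delta = f(deg + step) - f(deg)
        if (rising and delta <= 0) or (not rising and delta >= 0):
            return False
    return True


def sin_deg(deg):
    """Correct implementation: sine of an angle given in degrees."""
    return math.sin(math.radians(deg))


def triangle(deg):
    """Faulty 'sine': straight lines through the same five known points."""
    if deg <= 90:
        return deg / 90
    if deg <= 270:
        return (180 - deg) / 90
    return (deg - 360) / 90


def constant_zero(deg):
    """Faulty 'sine' that always returns 0."""
    return 0.0


print("sin passes:          ", heuristic_oracle(sin_deg))
print("triangle passes:     ", heuristic_oracle(triangle))
print("constant zero passes:", heuristic_oracle(constant_zero))
```

The triangle wave passes every check, so this heuristic alone cannot distinguish it from a correct sin(); adding a few more known points (for example 30°, where sin is 0.5) would catch it.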

Types of oracles: Sampling oracle
A sampling oracle involves using a selected (not random) set of values.
Boundary values, midpoints, minima, and maxima are typically chosen.
A sampling oracle is created to provide the expected results for the selected values.
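A sampling oracle is often nothing more than a small table of hand-prepared expected results. The sketch below uses a hypothetical grading function as the SUT; the function, the pass mark of 50, and the selected values are all assumptions made for illustration.

```python
def grade_sut(score: int) -> str:
    """Hypothetical SUT: scores 0-100, pass mark 50."""
    if not 0 <= score <= 100:
        raise ValueError("score out of range")
    return "pass" if score >= 50 else "fail"


# Sampling oracle: expected results prepared by hand for selected values only
# (minimum, midpoints, both sides of the boundary, maximum).
SAMPLING_ORACLE = {0: "fail", 25: "fail", 49: "fail", 50: "pass", 75: "pass", 100: "pass"}


def run_sampling_oracle(sut, oracle):
    """Return {input: actual result} for every selected input that fails."""
    return {x: sut(x) for x in oracle if sut(x) != oracle[x]}


print("failures:", run_sampling_oracle(grade_sut, SAMPLING_ORACLE))
```

A faulty SUT that uses `>` instead of `>=` is caught by the boundary value 50, but a fault that only shows at, say, 63 would go unnoticed.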

Types of oracles: Consistent oracle
A consistent oracle uses the results from a previous test run as the oracle for the next test run.
A useful way of evaluating changes.
Historic faults may remain undetected.
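A consistent oracle reduces to comparing two sets of recorded results. In the sketch below the results of run n-1 and run n are plain dictionaries keyed by test case identifier; the identifiers and values are made up for illustration.

```python
def compare_runs(previous: dict, current: dict) -> dict:
    """Report test cases whose result changed, appeared, or disappeared."""
    report = {}
    for case_id in sorted(previous.keys() | current.keys()):
        old = previous.get(case_id, "<missing>")
        new = current.get(case_id, "<missing>")
        if old != new:
            report[case_id] = (old, new)
    return report


# Hypothetical recorded results from two consecutive test runs.
run_n_minus_1 = {"tc1": [0, 2, 7], "tc2": [0, 2, 7], "tc3": 42}
run_n = {"tc1": [0, 2, 7], "tc2": [2, 0, 7], "tc3": 42}

print(compare_runs(run_n_minus_1, run_n))
```

The comparison flags tc2 as changed but says nothing about whether 42 was ever the right answer for tc3, which is how historic faults survive.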

Five types of oracles (Table 2)
True oracle
  Definition: Independent generation of expected results
  Example of use: Algorithm validation
  Advantages: Possibility for exhaustive testing
  Disadvantages: Expensive implementation. Possibly long execution times
Stochastic
  Definition: Verify a randomly selected sample
  Example of use: Operational verification
  Advantages: Can automate tests with a simple oracle
  Disadvantages: May miss systematic and specific errors. Can be time consuming to verify
Heuristic
  Definition: Verify selected points, use a heuristic for remainder
  Example of use: Algorithm verification
  Advantages: Easier than true oracle
  Disadvantages: Can miss systematic errors and incorrect algorithms
Sampling
  Definition: Verify a specially selected sample
  Example of use: Boundary testing
  Advantages: Very fast verification possible with simple oracle
  Disadvantages: May miss systematic or specific errors
Consistent
  Definition: Compare run n results with n-1
  Example of use: Regression test
  Advantages: Fastest: can generate and verify large amounts of data
  Disadvantages: Original run may include unknown errors

Other remarks on oracles
Oracle data can be generated before, in parallel with, or after a test case is run.
If generated before, inputs to the test case must be known beforehand.
If test case execution performs comparisons with the expected results, the oracle must run before or in parallel with the test case.
Parallel running of the oracle assumes the oracle is as fast as the software under test.

Manually comparing outputs with expected results is limited by human processing capabilities.
Plan the method of comparing results that will be used.
“I've checked 500 test cases, I just can't do any more.”

Just about anybody is capable of detecting certain kinds of incorrect result without knowing the correct result.
When you log on to home banking and find you have kr, you know a mistake has occurred.
But if your account is showing a balance of kr, you might not be able to tell if that is the correct amount. Maybe kr is the correct amount.
Experts can be good at setting limits to plausible amounts.
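Plausibility limits of this kind are easy to automate. The sketch below flags balances outside a range set by a domain expert; the limits shown are hypothetical placeholders, not figures from the lecture.

```python
def plausible_balance(balance_kr: int, low: int = -1_000_000, high: int = 50_000_000) -> bool:
    """True if the balance lies within expert-set limits (hypothetical values)."""
    return low <= balance_kr <= high


for balance in (250_000, 10**12):
    verdict = "plausible" if plausible_balance(balance) else "implausible"
    print(balance, "kr:", verdict)
```

Such a check catches absurd results cheaply, but a wrong balance that falls inside the limits still passes.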

Example: Fisher's Exact Test
              Depression   No-depression   Total
Icelandic     a            b               a+b
Foreigner     c            d               c+d
Total         a+c          b+d             n=a+b+c+d
For fixed row and column totals, R. A. Fisher worked out the exact probability of obtaining the set of values {a,b,c,d} as:
$$p = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!}$$

Example: Fisher's Exact Test
Table probabilities (columns a, b, c, d of the table are missing): 0,004525; 0,061086; 0,244344; 0,380090; 0,004524
Are the values for probability correct?
Human oracle: try calculating...
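A human oracle can be backed by a short independent calculation. The sketch below computes the probability of a single 2x2 table from binomial coefficients, which is the standard hypergeometric form of Fisher's probability for fixed row and column totals. The margins used (row totals 9 and 9, column totals 6 and 12) are an assumption chosen for illustration; with these margins the first four computed values agree with 0,004525, 0,061086, 0,244344, and 0,380090 to six decimals.

```python
from math import comb


def fisher_table_probability(a: int, b: int, c: int, d: int) -> float:
    """Probability of one 2x2 table {a,b,c,d} given fixed row and column totals."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


# Assumed margins: row totals 9 and 9, column totals 6 and 12.
probabilities = []
for a in range(0, 7):
    b, c = 9 - a, 6 - a
    d = 9 - c
    p = fisher_table_probability(a, b, c, d)
    probabilities.append(p)
    print(f"a={a} b={b} c={c} d={d}  p={p:.6f}")

print("sum of probabilities:", round(sum(probabilities), 12))
```

Checking that the probabilities over all possible tables sum to 1 is itself a cheap consistency check on the calculation.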
