Random Testing: Theoretical Results and Practical Implications
IEEE Transactions on Software Engineering, 2012
Andrea Arcuri, Member, IEEE, Muhammad Zohaib Iqbal, Member, IEEE, and Lionel Briand, Fellow, IEEE
Ou, Bao-Lin 2013/2/25
Outline
Introduction
Background
Random testing
Comparison with partition testing
Novel theoretical results on random testing
Conclusion
Introduction
Review the empirical and theoretical results on the comparisons between random and partition testing.
Prove nontrivial lower bounds for the expected number of test cases needed to cover all targets.
Study the scalability of random testing.
Derive a nontrivial upper bound on the lack of predictability, valid for any SUT (system under test).
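Background - Asymptotic Notations
Upper bound: $f(n) = O(g(n))$ if there are constants $c > 0$ and $n_0$ such that $f(n) \le c\,g(n)$ for all $n \ge n_0$.
Lower bound: $f(n) = \Omega(g(n))$ if there are constants $c > 0$ and $n_0$ such that $f(n) \ge c\,g(n)$ for all $n \ge n_0$.
Tight bound: $f(n) = \Theta(g(n))$ if both $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$ hold.
Dominated: $f(n) = o(g(n))$ if and only if $\lim_{n \to \infty} f(n)/g(n) = 0$.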
Background - Expectation of a Random Variable
A random variable $X$ can assume different values from a domain $D$, each one with a specific probability $P(X = x)$. Its expected value is $E[X] = \sum_{x \in D} x \cdot P(X = x)$.
Background - Expectation of a Random Variable
We can run $k$ experiments and collect the resulting outputs $x_1, \dots, x_k$. We can estimate $E[X]$ with the following formula: $\hat{E}[X] = \frac{1}{k} \sum_{i=1}^{k} x_i$.
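The sample-mean estimate can be sketched in a few lines of Python. The fair six-sided die, the function names, and the seed are illustrative assumptions, not taken from the paper.

```python
import random

def exact_expectation(dist):
    """E[X] as the sum of x * P(X = x); dist maps each value to its probability."""
    return sum(x * p for x, p in dist.items())

def estimated_expectation(dist, k, rng):
    """Run k experiments and average the observed outputs."""
    values = list(dist)
    weights = [dist[v] for v in values]
    samples = rng.choices(values, weights=weights, k=k)
    return sum(samples) / k

# Hypothetical example: a fair six-sided die.
die = {face: 1 / 6 for face in range(1, 7)}
rng = random.Random(0)
print(exact_expectation(die))
print(estimated_expectation(die, 100_000, rng))
```

The estimate approaches the exact value as the number of experiments grows.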
Background - Geometric Distribution
Given $p$, the probability of an event $e$ in a trial, we might want to know how many trials $x$ we need on average to obtain $e$. The number of trials follows a geometric distribution, so $E[x] = 1/p$.
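As a rough check of the geometric distribution, the sketch below counts trials up to and including the first occurrence of the event and compares the average with $1/p$. The value $p = 0.2$, the helper names, and the seed are assumptions made for illustration.

```python
import random

def trials_until_event(p, rng):
    """Count independent trials up to and including the first success."""
    trials = 1
    while rng.random() >= p:  # a trial succeeds with probability p
        trials += 1
    return trials

def mean_trials(p, runs, rng):
    """Average the trial count over many independent repetitions."""
    return sum(trials_until_event(p, rng) for _ in range(runs)) / runs

p = 0.2  # hypothetical event probability
rng = random.Random(0)
print(mean_trials(p, 50_000, rng), 1 / p)
```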
Random testing
A simple way to sample test cases is to consider their binary representation and choose each bit with uniform probability, so that every test case of length $k$ bits is sampled with probability $2^{-k}$.
Problems:
Elements in the binary space might not map to the test case domain.
The binary length $k$ of the test cases can be variable.
Even when bounds on the length are used, using a pure uniform distribution is unwise.
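The first problem can be illustrated with a small sketch. It assumes a hypothetical test case domain of 200 integers encoded in 8 bits, so some bit patterns correspond to no test case; the decoder and all names are illustrative only.

```python
import random

def random_bits(k, rng):
    """Draw k bits, each 0 or 1 with probability 1/2, packed into one integer."""
    return rng.getrandbits(k)

def decode(bits, domain_size):
    """Hypothetical decoder: patterns at or above domain_size map to no test case."""
    return bits if bits < domain_size else None

k, domain_size = 8, 200  # assumed sizes, chosen so that 56 of 256 patterns are unmapped
rng = random.Random(0)
draws = [decode(random_bits(k, rng), domain_size) for _ in range(10_000)]
unmapped = sum(d is None for d in draws) / len(draws)
print(unmapped, 1 - domain_size / 2**k)
```

The observed fraction of unmapped draws should be close to the share of bit patterns that lie outside the domain.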
Random testing
Consider a set of testing targets $T = \{t_1, \dots, t_n\}$. The probability that a random test case covers $t_i$ is $p_i$. We will simply assume $\sum_{i=1}^{n} p_i = 1$. When all the probabilities are equal, we have $p_i = 1/n$.
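Random testing - Definition 1 (RT)
Given a vector of probabilities $p_i$, one for each target $t_i$, we use $RT$ for the random variable counting the test cases that random testing samples to cover all the targets. A separate symbol represents the special case when all the $p_i$ are equal.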
Comparison with partition testing - partition testing
In partition testing, the input domain $D$ is divided into $k \ge 2$ subdomains $D_1, \dots, D_k$ that act as equivalence classes.
For a partition $D_i$, the failure rate $\theta_i$ is the ratio of the number of failing test cases in that partition over its cardinality $|D_i|$.
Comparison with partition testing - partition testing
Given $n$ test cases, of which $n_i$ are sampled from partition $D_i$, the sampling rate for each partition is $n_i / |D_i|$.
Once a partition strategy is chosen, sampling test cases for each partition is, in general, a difficult task.
Comparison with partition testing - Metrics for Comparisons
E-measure: expected number of triggered failures.
P-measure: probability of finding at least one failing test case.
F-measure: expected number of test cases required to trigger at least one failure.
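Comparison with partition testing - First Empirical Comparisons
The probability that random testing finds at least one failure with $n$ test cases is $P_r = 1 - (1 - \theta)^n$, where $\theta$ is the failure rate of the whole input domain.
In the case of partition testing, with $n_i$ test cases sampled from partition $D_i$ and failure rate $\theta_i$ in that partition, the probability of triggering at least one failure is $P_p = 1 - \prod_{i=1}^{k} (1 - \theta_i)^{n_i}$.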
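The two probabilities can be computed directly. This sketch assumes the usual forms $P_r = 1 - (1 - \theta)^n$ and $P_p = 1 - \prod_i (1 - \theta_i)^{n_i}$, with $\theta$ the overall failure rate and $\theta_i$, $n_i$ the failure rate and number of test cases of partition $i$. The two-partition example and the function names are hypothetical.

```python
import math

def p_measure_random(theta, n):
    """Probability that at least one of n random test cases fails."""
    return 1 - (1 - theta) ** n

def p_measure_partition(thetas, ns):
    """Probability of at least one failure with ns[i] tests drawn from partition i."""
    miss_all = math.prod((1 - t) ** m for t, m in zip(thetas, ns))
    return 1 - miss_all

# Hypothetical domain: two equally sized partitions, failures only in the second.
thetas = [0.0, 0.1]
theta = 0.5 * thetas[0] + 0.5 * thetas[1]  # overall failure rate
ns = [5, 5]                                # five test cases per partition

print(p_measure_random(theta, sum(ns)))
print(p_measure_partition(thetas, ns))
```

In this particular setup partition testing has the slightly higher P-measure; other failure rates or allocations can reverse the ordering.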
Comparison with partition testing - First Empirical Comparisons
The comparison considered the probabilities $p_i$ that a random test belongs to the partition $D_i$, therefore $\theta = \sum_{i=1}^{k} p_i \theta_i$.
Comparison with partition testing- First Empirical Comparisons Experiment And Different values of were considered. The values for were chosen with uniform distribution. They found that in 14 out of 50 trials .
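Comparison with partition testing - Analytical Analyses
The expected number of failures detected for random testing with $n$ test cases and overall failure rate $\theta$ is $E_r = n\theta$, whereas for partition testing, with $n_i$ test cases in partition $D_i$ and failure rate $\theta_i$, it is $E_p = \sum_{i=1}^{k} n_i \theta_i$.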
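A small numeric sketch of the E-measure, assuming the forms $E_r = n\theta$ and $E_p = \sum_i n_i \theta_i$. The partition sizes, failing-input counts, and allocations below are made up for illustration.

```python
def e_measure_random(theta, n):
    """Expected number of failures when n tests are drawn from the whole domain."""
    return n * theta

def e_measure_partition(thetas, ns):
    """Expected number of failures with ns[i] tests drawn from partition i."""
    return sum(t * m for t, m in zip(thetas, ns))

# Hypothetical domain: partitions of 100 and 300 inputs, 10 failing inputs in the first.
sizes = [100, 300]
failing = [10, 0]
thetas = [f / d for f, d in zip(failing, sizes)]
theta = sum(failing) / sum(sizes)

n = 8
proportional = [2, 6]  # test cases allocated in proportion to partition size
even = [4, 4]          # same number of test cases in every partition

print(e_measure_random(theta, n))
print(e_measure_partition(thetas, proportional))
print(e_measure_partition(thetas, even))
```

With an allocation proportional to partition size, the two E-measures coincide, since $\sum_i n_i \theta_i$ then reduces to $n\theta$. The even allocation does better here only because it happens to oversample the failing partition.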
Comparison with partition testing - Analytical Analyses When the subdomains are not actual partitions, we use for the P-measure and for the E-measure. when when for all partition
Comparison with partition testing- Analytical Analyses P-measure & E-measure For any given program, For any given program and partition testing strategy, If , then Equality holds only if for all partitions we have If , then However, the converse is not necessarily true.
Effectiveness of Random Testing Theorem 1
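Theorem 1 - Proof
Let $X_i$ be the random variable representing the number of trials in the epoch in which we have covered $i$ different targets so far. With equally likely targets ($p_i = 1/n$), the probability of drawing one of the remaining $n - i$ targets is $(n - i)/n$.
Theorem 1 - Proof
Because $X_i$ is geometrically distributed with success probability $(n - i)/n$, we have $E[X_i] = n/(n - i)$. Therefore the expected total number of test cases is $\sum_{i=0}^{n-1} E[X_i] = \sum_{i=0}^{n-1} \frac{n}{n - i} = n \sum_{j=1}^{n} \frac{1}{j} = n H_n$, where $H_n$ is the $n$-th harmonic number.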
Effectiveness of Random Testing Each target is sought in a specific order (RT0). Definition 2 (RT0) .
Effectiveness of Random Testing
Theorem 2
Proof: Covering a particular target $t_i$ follows a geometric distribution with parameter $p_i$.
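Theorem 2 - Proof
Expected number of test cases when the targets are sought one after another: $\sum_{i=1}^{n} 1/p_i$.
When all the probabilities are equal ($p_i = 1/n$), this is $\sum_{i=1}^{n} n = n^2$.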
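The sum of the per-target expectations is easy to evaluate exactly. The sketch below assumes the targets are sought one at a time, each taking $1/p_i$ test cases on average, and compares a uniform vector with a skewed one. By the arithmetic-harmonic mean inequality, $\sum_i 1/p_i \ge n^2$ whenever $\sum_i p_i = 1$, so the uniform vector gives the smallest value. Both vectors and the function name are hypothetical.

```python
from fractions import Fraction

def expected_tests_in_order(probs):
    """Sum of 1/p_i: expected test cases when targets are sought one after another."""
    return sum(1 / p for p in probs)

# Two hypothetical probability vectors over n = 4 targets, each summing to 1.
uniform = [Fraction(1, 4)] * 4
skewed = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]

print(expected_tests_in_order(uniform))  # prints 16, which equals n^2
print(expected_tests_in_order(skewed))   # prints 22
```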
Theorem 2 - Proof define any generic represents the positive values is the negative ones represents
Theorem 2 - Proof In a similar way, we can write
Theorem 2 - Proof To prove
Scalability
Assume a testing technique called V.
For each target $t_i$, we assume that V is faster than random testing by a constant factor.
Assuming $n$ feasible targets, we consider the case of $z = kn$ feasible targets, where $k \ge 1$ is the scalability factor.
Scalability Given ,then Given , the probability of the most difficult target for k =1, then, for all values of k, the most difficult target for the case k > 1 should have value for some function .
Scalability Theorem 3
Theorem 3 - Proof Proof An upper bound for can be constructed by
Theorem 3 - Proof construct a lower bound for
Theorem 3 - Proof Finally, if ,then
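Predictability of Two Runs
Let us first define the difference $D$ between two runs of random testing. In each run, a binary value per target $t_i$ represents whether the target has been covered or not; $D$ counts the targets whose binary values differ between the two runs.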
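Predictability of Two Runs
Theorem 4: When each of the two runs samples $k$ test cases, the expected value of the random variable $D$ is equal to $E[D] = 2 \sum_{i=1}^{n} (1 - p_i)^k \left(1 - (1 - p_i)^k\right)$.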
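The expected difference between two runs can be compared against a simulation. The sketch assumes that each test case covers exactly one target, chosen with probability $p_i$, that both runs sample $k$ test cases independently, and that $E[D] = 2 \sum_i q_i (1 - q_i)$ with $q_i = (1 - p_i)^k$ the probability that target $t_i$ stays uncovered in a run. The probability vector, $k$, and all names are illustrative.

```python
import random

def expected_difference(probs, k):
    """2 * sum of q_i * (1 - q_i), where q_i = (1 - p_i)^k."""
    total = 0.0
    for p in probs:
        q = (1 - p) ** k  # target stays uncovered after k test cases
        total += 2 * q * (1 - q)
    return total

def run_coverage(probs, k, rng):
    """One run: sample k test cases and record which targets were covered."""
    targets = range(len(probs))
    hit = set(rng.choices(targets, weights=probs, k=k))
    return [i in hit for i in targets]

def simulated_difference(probs, k, runs, rng):
    """Average number of targets covered in exactly one of two independent runs."""
    total = 0
    for _ in range(runs):
        a = run_coverage(probs, k, rng)
        b = run_coverage(probs, k, rng)
        total += sum(x != y for x, y in zip(a, b))
    return total / runs

probs = [0.5, 0.3, 0.2]  # hypothetical target probabilities, summing to 1
k = 3
rng = random.Random(0)
print(expected_difference(probs, k))
print(simulated_difference(probs, k, 20_000, rng))
```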
Theorem 4 - Proof Consider the expected value of
Predictability of Two Runs - Experiment Three different vectors, and . In , The remaining probabilities are chosen such that their total sum for is equal to 1. In , and Sample sizes will cover the following range: where n=32
Experiment
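Predictability of Two Runs
Theorem 5: $E[D] \le n/2$.
Proof: Given the function $f(x) = x(1 - x)$, we first prove that $f$ is maximized for $x = 1/2$, where $f(1/2) = 1/4$.
Theorem 5 - Proof
We can write $E[D] = 2 \sum_{i=1}^{n} f(q_i)$, where $q_i = (1 - p_i)^k$ is the probability that target $t_i$ is not covered by the $k$ test cases of a run. Therefore, $E[D]$ is maximized when for all the $i$ the equality $q_i = 1/2$ holds. In this case, $E[D] = 2 \cdot n \cdot \frac{1}{4} = n/2$.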
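The upper bound can be checked numerically. The sketch assumes $E[D] = 2 \sum_i q_i (1 - q_i)$ with $q_i = (1 - p_i)^k$, and evaluates it for $n = 32$ equally likely targets over several sample sizes $k$. The chosen values of $k$ and the function name are illustrative.

```python
def expected_difference(probs, k):
    """2 * sum of q_i * (1 - q_i), where q_i = (1 - p_i)^k."""
    total = 0.0
    for p in probs:
        q = (1 - p) ** k  # target stays uncovered after k test cases
        total += 2 * q * (1 - q)
    return total

n = 32
uniform = [1 / n] * n  # hypothetical vector: all targets equally likely

# Evaluate E[D] for a few sample sizes; every value should stay at or below n/2.
values = {k: expected_difference(uniform, k) for k in (1, 8, 22, 64, 256)}
for k, value in values.items():
    print(k, round(value, 4), value <= n / 2)
```

For this vector the value at $k = 22$ comes very close to $n/2 = 16$, because $(31/32)^{22}$ is close to $1/2$. For very small or very large $k$ the two runs tend to agree and $E[D]$ drops.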
Conclusion
Provided novel formal results regarding the effectiveness, scalability, and predictability of random testing.
Proved nontrivial, tight lower bounds for the expected number of test cases sampled by random testing to cover predefined targets.
Formally proved the mathematical formula that describes the predictability of random testing, and proved a general upper bound for it.