Methods for Estimating Defects
Catherine V. Stringfellow
Mathematics and Computer Science Department, New Mexico Highlands University
October 20, 2000
Problem and Questions
To improve the efficiency of system testing by helping testers determine when to stop testing and release software.
  When should testers stop testing without sacrificing effectiveness?
  Can defect estimates be used to make release decisions?
Data available in defect database
  release name
  phase defect reported
  component name
  date defect reported
  and much more
Approach
Estimate number of remaining components with defects using:
  Capture-Recapture Methods
  Curve-Fitting Methods
  Experience-Based Methods
Estimate number of remaining failures using:
  SRGM Selection Method
Compare estimates to acceptability thresholds to make release decisions
Estimating Components with Defects in Post-Release, but not in Test
  Capture/Recapture Methods
  Curve-Fitting Models
  Experience-Based Estimation Method (uses historical data in the estimation)
Capture/Recapture Methods
Derived from wildlife biology models
Software defect estimation models
  Use multiple independent reviewers to count faults (instead of animals)
  A reviewer “captures” a fault; other reviewers who identify the same fault are said to “recapture” it
Capture/Recapture
Model requires an overlap
  If most overlap, few remaining (unless MANY found)
  If few overlap, many remaining
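The overlap intuition can be illustrated with a two-sample estimator. The sketch below uses Chapman's bias-corrected form of the Lincoln-Petersen estimator, which applies to exactly two reviewers (or two test sites). The function names and the sample component sets are hypothetical and only serve the illustration; this is not necessarily the exact computation used in the study.

```python
def chapman_total(n1, n2, overlap):
    """Chapman's two-sample estimate of the total population size.

    n1, n2  -- number of items found by reviewer 1 and reviewer 2
    overlap -- number of items found by both
    """
    return (n1 + 1) * (n2 + 1) / (overlap + 1) - 1


def estimated_remaining(found_a, found_b):
    """Estimated number of items that neither reviewer has found yet."""
    a, b = set(found_a), set(found_b)
    total = chapman_total(len(a), len(b), len(a & b))
    return total - len(a | b)


# Hypothetical data: components in which each test site found defects.
site_a = {"c1", "c2", "c3", "c4", "c5", "c6"}
high_overlap_b = {"c1", "c2", "c3", "c4", "c5"}
low_overlap_b = {"c6", "c7", "c8", "c9", "c10", "c11"}

# Most overlap: few remaining. Few overlap: many remaining.
print(estimated_remaining(site_a, high_overlap_b))
print(estimated_remaining(site_a, low_overlap_b))
```

With no overlap at all the estimate is driven entirely by the +1 corrections, which is one reason the model is said to require an overlap.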
Five CRC Estimation Methods
  M0ML
  MTML
  MTChpm (for two reviewers)
  MHJK
  MthChao
Models’ Assumptions
[Figure: three panels, each plotting detection probabilities against defect number]
CRC with MtML Example
  Estimated remaining = 5
  Correct remaining = 8
  MLE method tends to underestimate
Curve-Fitting Models
Detection Profile Method (fit with decreasing exponential curve)
  Estimated number of total defects is the smallest x-value with y-value <= 0.5
Cumulative Method (fit with increasing exponential curve)
  Estimate is the x-value at which the exponential function is asymptotic
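A minimal sketch of the Detection Profile Method rule, under stated assumptions: the profile lists, in decreasing order, how many test sites detected each defective component; the curve y = a * exp(-b * x) is fitted by ordinary least squares on log(y); and the estimate is never allowed to drop below the number already observed. The function names, the fitting procedure, and the sample profile are illustrative choices, not taken from the study.

```python
import math


def fit_decreasing_exponential(profile):
    """Fit y = a * exp(-b * x) to profile[0..n-1] at x = 1..n.

    Uses least squares on log(y), so every y must be positive.
    Returns (a, b).
    """
    n = len(profile)
    xs = list(range(1, n + 1))
    logs = [math.log(y) for y in profile]
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs))
    slope = sxy / sxx
    intercept = mean_log - slope * mean_x
    return math.exp(intercept), -slope


def dpm_total(profile, cutoff=0.5):
    """Smallest integer x at which the fitted curve is <= cutoff."""
    a, b = fit_decreasing_exponential(profile)
    if b <= 0:
        raise ValueError("profile is not decreasing; the fit is not usable")
    x_star = math.log(a / cutoff) / b
    # Assumption: the total cannot be smaller than what was observed.
    return max(len(profile), math.ceil(x_star))


# Hypothetical profile: detections per defective component, sorted decreasing.
profile = [3, 3, 2, 2, 1, 1]
total = dpm_total(profile)
remaining = total - len(profile)
print(total, remaining)
```

Fitting on log(y) keeps the sketch dependency-free; a nonlinear least-squares fit would weight the points differently and could give a different estimate.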
Sample Data
DPM Curve-Fitting Example
  DPM estimate: approximately 8 remaining
  Correct answer: 8
Results (3 sites) Ranks:
SRGMs (software reliability growth models)
Some Models’ Assumptions
  Test according to operational profile.
  Failure rate proportional to number of failures remaining.
  Defect repair is immediate.
  Defect repair is perfect.
  No new code is introduced during the test period.
SRGMs in practice
  Assumptions are violated.
  Previous studies show SRGMs are robust: they perform well in predicting failure rates.
SRGMs
  Exponential Model (Musa’s)
  Delayed S-shaped
  Gompertz curve
  Yamada Exponential
  Yamada Rayleigh
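For orientation, the sketch below writes out commonly cited textbook forms of the mean value function mu(t) (expected cumulative failures by time t) for the first three models in the list. The parameter names a, b, c are assumptions for illustration, and the exact parameterizations used in the study may differ. The two Yamada models depend on a testing-effort function and are left out.

```python
import math


def exponential(t, a, b):
    """Exponential model: mu(t) = a * (1 - exp(-b * t))."""
    return a * (1.0 - math.exp(-b * t))


def delayed_s_shaped(t, a, b):
    """Delayed S-shaped model: mu(t) = a * (1 - (1 + b*t) * exp(-b * t))."""
    return a * (1.0 - (1.0 + b * t) * math.exp(-b * t))


def gompertz(t, a, b, c):
    """Gompertz curve: mu(t) = a * b ** (c ** t), with 0 < b < 1 and 0 < c < 1."""
    return a * b ** (c ** t)


def remaining_failures(a, observed):
    """In all three forms, a is the expected total, so a - observed remains."""
    return a - observed


# Hypothetical fitted parameters: a = 100 total failures expected.
week = 10
print(exponential(week, 100, 0.1))
print(delayed_s_shaped(week, 100, 0.1))
print(gompertz(week, 100, 0.1, 0.5))
print(remaining_failures(100, 80))
```

Each curve approaches a as t grows, which is why the fitted a can be compared with the failures observed so far to estimate how many remain.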
Release 1
Release 2
Release 3
SRGM Selection Method
Integrating Estimation Methods
Logical AND: all methods must say stop
Sequential:
  If mhjk says stop, stop.
  Else if the SRGM selection method and one other method say stop, stop.
  Else if at least one of m0ml, mtml, dpm(linear) AND the experience-based method say stop, stop.
  Else continue.
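The Logical AND and Sequential rules can be sketched as two small functions. The dictionary keys ("mhjk", "srgm", "m0ml", "mtml", "dpm_linear", "experience") are hypothetical labels for each method's stop/continue verdict, and "one other method" is read here as any method other than the SRGM selection method.

```python
def logical_and_stop(says_stop):
    """Stop only if every method says stop."""
    return all(says_stop.values())


def sequential_stop(says_stop):
    """Apply the sequential rule; says_stop maps method name -> bool."""
    if says_stop["mhjk"]:
        return True
    # Assumption: "one other method" means any method except srgm.
    others = [name for name in says_stop if name != "srgm"]
    if says_stop["srgm"] and any(says_stop[name] for name in others):
        return True
    grouped = any(says_stop[name] for name in ("m0ml", "mtml", "dpm_linear"))
    if grouped and says_stop["experience"]:
        return True
    return False


# Hypothetical verdicts for one week of testing.
verdicts = {
    "mhjk": False,
    "srgm": True,
    "m0ml": False,
    "mtml": False,
    "dpm_linear": False,
    "experience": True,
}
print(logical_and_stop(verdicts), sequential_stop(verdicts))
```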
Integrating Estimation Methods
Majority
  Group m0ml, mtml, dpm(linear) together. If one says stop, stop.
  In case of tie:
    a) continue test for another week; or
    b) compare number of defects in last week to acceptability threshold (5)
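One possible reading of the Majority rule is sketched below. It assumes that the grouped methods cast a single combined vote (stop if any of them says stop), that every other method casts one vote each, and that tie-break option (b) means "stop if the number of defects in the last week is at most the threshold". All three are assumptions, as are the method labels.

```python
GROUP = ("m0ml", "mtml", "dpm_linear")


def majority_stop(says_stop, defects_last_week, threshold=5):
    """Majority vote with the grouped methods counted as one voter."""
    group_vote = any(says_stop[name] for name in GROUP)
    votes = [group_vote] + [
        verdict for name, verdict in says_stop.items() if name not in GROUP
    ]
    stops = sum(votes)
    continues = len(votes) - stops
    if stops != continues:
        return stops > continues
    # Tie: option (b), compare last week's defects to the threshold.
    # Option (a) would instead return False (test for another week).
    return defects_last_week <= threshold


# Hypothetical verdicts: group and mhjk say stop, the other two say continue.
verdicts = {
    "m0ml": True,
    "mtml": False,
    "dpm_linear": False,
    "mhjk": True,
    "srgm": False,
    "experience": False,
}
print(majority_stop(verdicts, defects_last_week=3))
print(majority_stop(verdicts, defects_last_week=9))
```

With four voters a 2-2 tie is possible, which is consistent with the rule needing a tie-break at all.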
Contributions
CRC/Curve-fitting methods applied in a new way
  test sites vs reviewers
  test vs inspection
  defective components vs defects
  case study vs simulation or experiment
Simple experience-based method
Contributions
SRGM selection method
Using defect estimations to make release decisions
Combining methods to make decisions
  CRC/Curve-fitting/Experience-based/SRGM Selection methods -> release decisions based on defect estimations
Future Work
Evaluate methods on future releases of same project
Validate integrated method on other projects and in other environments to improve external validity of case study
Evaluate other methods to make release decisions
  Stopping rules
Determination of parameters based on defect data