EEL5881 Software Engineering

Kenneth McCoig

The Distribution of Faults in a Large Industrial Software System
Thomas J. Ostrand, AT&T Labs – Research, 180 Park Avenue, Florham Park, NJ 07932
Elaine J. Weyuker, AT&T Labs – Research, 180 Park Avenue, Florham Park, NJ 07932

Abstract
Four questions addressed in the study:
1. How are faults distributed over different files?
2. How does the size of a module affect fault density?
3. How does faultiness persist from release to release?
4. Are newly written files more fault-prone than ones written for earlier releases?
Ultimate goal: the authors' work aims to help determine a way to identify particularly fault-prone files.

Introduction
There have been relatively few studies that investigate the dependability of software in large industrial systems, because:
1. It is difficult to locate and gain access to large systems.
2. It is very time-consuming and expensive to collect and analyze data.
3. It is difficult to find personnel with the skills to perform empirical studies.
The authors are quick to point out that many more case studies will be needed to reach their ultimate goal (as stated in the abstract).

Related Work
Fenton and Ohlsson
Adams
Basili and Perricone, Hatton, Moller and Paulish
Fenton and Ohlsson studied two releases of a large commercial system. The general belief is that modules found to have a higher concentration of faults before release will also have a higher concentration after release. They found evidence that this was not the case. Adams, in a study at IBM, determined that only 10% of the faults found post-release are even worth fixing, because field downtime is too costly. Basili and Perricone, Hatton, and Moller and Paulish note that some argue that smaller modules are easier to comprehend, so fewer errors are introduced into them. They all found, however, that as the size of a module increases, the number of faults per unit size generally decreases.

System Description
Inventory tracking system
Total of 13 releases
Current version (R13): 500,000 LOC / 1,600 files
Mostly Java, a few other file types
Life cycle phases: Requirements, Design, Development, Unit Testing, Integration Testing, System Testing, Beta Release, Controlled Release, General Release
It is important to point out this particular system's life cycle phases, because each fault was recorded against the phase in which it was detected.

System Description (cont.)
Beta and controlled release phases combined.

System Description (cont.)
Inventory tracking system
Fault recording/tracking system
4,743 faults over all 13 releases
Fault severity levels
97% of faults detected prior to beta release
The authors include information about a fault recording system already in place at AT&T. Note that 97% of faults were detected before beta release.

Question 1: Fault Distribution
Pareto-like distribution of faults?
Findings are similar to those of Fenton and Ohlsson
This question aims to determine whether faults follow a Pareto-like distribution, that is, whether a small fraction of the files contains most of the faults.

Overall Pareto Distribution By Release
This table shows a very uneven distribution of faults among files from release 1 to 13.

Fault Distribution for Releases 1, 6, 8, 10, 12
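To make the idea of a Pareto-like fault distribution concrete, here is a minimal sketch of how the share of faults held by the most faulty files could be computed. The function name and the per-file fault counts are hypothetical illustrations, not the authors' tooling or the study's data.

```python
def fault_share_of_top_files(fault_counts, top_fraction):
    """Return the fraction of all faults found in the most faulty files.

    fault_counts: list of fault counts, one entry per file (hypothetical data).
    top_fraction: fraction of files to treat as the "top" group, e.g. 0.2.
    """
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    total = sum(fault_counts)
    if total == 0:
        return 0.0
    # Rank files from most to least faulty and keep the top group.
    ranked = sorted(fault_counts, reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:n_top]) / total


# Hypothetical release: 10 files, most of them fault-free.
faults_per_file = [40, 25, 10, 5, 0, 0, 0, 0, 0, 0]
share = fault_share_of_top_files(faults_per_file, 0.2)
print(f"Top 20% of files contain {share:.1%} of the faults")
```

A share far above the chosen file fraction (here, 20% of the files holding most of the faults) is what an uneven, Pareto-like distribution looks like in this kind of summary.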

Question 2: Effects of Module Size on Fault-Proneness
Results show the opposite of the common belief
However, the data from other studies are inconclusive
This question aims to answer whether file size has anything to do with fault-proneness. It has been argued that large files are more fault-prone than small ones. The authors believe that this question can only be answered if other variables are known, such as the experience of the programmer, programming languages, development environments, and the applications developed.

Fault Density vs. File Size
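As a rough illustration of what "fault density versus file size" means, the sketch below computes faults per thousand lines of code for a few files. The unit (faults per KLOC), the file names, and all numbers are assumptions made for the example, not values from the study.

```python
def fault_density(faults, loc):
    """Return faults per thousand lines of code (KLOC) for one file."""
    if loc <= 0:
        raise ValueError("loc must be positive")
    return faults * 1000 / loc


# Hypothetical files: (name, lines of code, faults found).
files = [
    ("Small.java", 50, 1),
    ("Medium.java", 500, 4),
    ("Large.java", 2000, 8),
]

densities = {name: fault_density(faults, loc) for name, loc, faults in files}
for name, density in densities.items():
    print(f"{name}: {density:.1f} faults/KLOC")
```

In this made-up data the largest file has the most faults in absolute terms but the lowest density, which is the pattern the related work describes: faults per unit size decreasing as size increases.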

Question 3: Persistence of Faults
Test results: not enough data
However, the small amount of data available appears to support the hypothesis
Data also agree with the Fenton and Ohlsson study
This question aims to determine whether files with a high concentration of faults detected pre-release also tend to have a high concentration of faults detected post-release, and whether faultiness persists between releases.

Distribution of Post-Release Faults
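One simple way to think about persistence of faultiness is to ask how many of the most faulty files in one release are again among the most faulty in the next. The sketch below is only an illustration under that assumption; the overlap measure, the function names, and the data are hypothetical and are not taken from the paper.

```python
def top_faulty_files(faults_by_file, top_fraction):
    """Return the set of file names in the most faulty top_fraction of files."""
    # Sort by descending fault count, breaking ties by name for determinism.
    ranked = sorted(faults_by_file, key=lambda name: (-faults_by_file[name], name))
    n_top = max(1, round(len(ranked) * top_fraction))
    return set(ranked[:n_top])


def persistence(prev_release, next_release, top_fraction):
    """Fraction of the previous release's top files that stay in the top group."""
    prev_top = top_faulty_files(prev_release, top_fraction)
    next_top = top_faulty_files(next_release, top_fraction)
    return len(prev_top & next_top) / len(prev_top)


# Hypothetical fault counts per file for two consecutive releases.
release_n = {"A": 10, "B": 8, "C": 1, "D": 0, "E": 0}
release_n_plus_1 = {"A": 7, "B": 0, "C": 5, "D": 1, "E": 0}

overlap = persistence(release_n, release_n_plus_1, 0.4)
print(f"{overlap:.0%} of the top files remained in the top group")
```

The same overlap calculation could be applied to pre-release versus post-release fault counts for a single release, which is the other half of this question.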

Question 4: Old Files vs. New
Results only confirm intuition
Nothing new can be derived from this
This question aims to determine whether newly written files are more fault-prone than ones written for earlier releases.

Conclusions
Question 1 (How are faults distributed over different files?): hypothesis proven true, opposite of intuition
Question 2 (How does the size of a module affect fault density?): hypothesis proven true; however, other studies' data are inconclusive
Question 3 (How does faultiness persist from release to release?): nothing proven, data insufficient; however, the little data available seems to support the hypothesis
Question 4 (Are newly written files more fault-prone than ones written for earlier releases?): hypothesis proven true, consistent with intuition
Personal thoughts: this study provides only a small basis for the ultimate goal. Several more case studies are needed before any form of algorithm for identifying fault-prone files can be determined.

Questions?
