Prediction of fault-proneness at early phase in object-oriented development Toshihiro Kamiya †, Shinji Kusumoto † and Katsuro Inoue †‡ † Osaka University.

Prediction of fault-proneness at early phase in object-oriented development Toshihiro Kamiya †, Shinji Kusumoto † and Katsuro Inoue †‡ † Osaka University ‡Nara Institute of Science and Technology

1 Background Complexity metrics are used to estimate the fault-proneness of software components. According to the values of the metrics, we can allocate review and testing effort to the fault-prone components. Chidamber and Kemerer’s metrics are the representative complexity metrics for object-oriented software.

2 Chidamber and Kemerer’s metrics [1] C&K metrics evaluate the complexity of classes from the following three viewpoints: Inheritance complexity ・ DIT (Depth of inheritance tree) ・ NOC (Number of children) Coupling complexity ・ RFC (Response for a class) ・ CBO (Coupling between object classes) Class internal complexity ・ WMC (Weighted methods per class) ・ LCOM (Lack of cohesion in methods) [1] S.R. Chidamber and C.F. Kemerer, A Metrics Suite for Object Oriented Design, IEEE Trans. on Software Eng., Vol. 20, No. 6 (1994) 476-493.
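As an illustration of the two inheritance metrics, the following is a minimal sketch, assuming single inheritance and the convention that a root class has DIT 0. The class names and the `PARENT` map are hypothetical and are not taken from the experimental project.

```python
# Hypothetical single-inheritance hierarchy: child -> parent (None marks a root).
PARENT = {
    "Message": None,
    "MailMessage": "Message",
    "Notice": "Message",
    "EncryptedMail": "MailMessage",
}


def dit(cls, parent=PARENT):
    """Depth of inheritance tree: number of ancestors, so a root has DIT 0."""
    depth = 0
    while parent[cls] is not None:
        cls = parent[cls]
        depth += 1
    return depth


def noc(cls, parent=PARENT):
    """Number of children: counts immediate subclasses only."""
    return sum(1 for p in parent.values() if p == cls)


for name in PARENT:
    print(name, "DIT =", dit(name), "NOC =", noc(name))
```

Both values depend only on derivation relationships, so they can be computed before any method body exists.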

3 Evaluation of C&K metrics Several research studies evaluate the usefulness of C&K metrics. ・ Chidamber and Kemerer confirmed that C&K metrics satisfy Weyuker’s properties [1]. ・ Basili et al. showed empirically that the C&K metrics suite is a better predictor of the fault-proneness of classes than traditional code metrics [2]. ・ Briand et al. discussed several design metrics, including C&K metrics [3]. [2] Basili, V. R., Briand, L. C., and Mélo, W. L., A validation of object-oriented design metrics as quality indicators, IEEE Trans. on Software Eng., Vol. 22, No. 10 (1996) 751-761. [3] Briand, L. C., Daly, J. W., and Wüst, J. K., A Unified Framework for Coupling Measurement in Object-Oriented Systems, IEEE Trans. on Software Eng., Vol. 25, No. 1 (1999) 91-121.

4 Difficulty in applying C&K metrics to design In previous research, C&K metrics were applied to source code, because some of the C&K metrics need information, such as algorithms or call relationships, that is determined only late in the design phase. To allocate review and testing effort efficiently, however, early estimation of the fault-prone classes (components) is preferable.

5 Proposed method We propose a method to predict fault-proneness at an early phase in object-oriented development. 1. Introduce four checkpoints into the design / implementation phases. 2. Determine the metric set available at each checkpoint. 3. Using multivariate logistic regression analysis, estimate the fault-proneness of the classes (components) at each checkpoint. We empirically evaluate how well the metric sets predict fault-prone classes at each checkpoint.
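Step 3 can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes plain batch gradient ascent on the log-likelihood (statistical packages typically use Newton-style solvers), and the rows, labels, and function names are made up for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(coef, x):
    """Estimated fault-proneness for one class with metric values x."""
    return sigmoid(coef[0] + sum(c * v for c, v in zip(coef[1:], x)))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Maximum likelihood fit by batch gradient ascent.

    Returns [c0, c1, ..., ck], where c0 is the intercept.
    """
    k = len(rows[0])
    coef = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for x, y in zip(rows, labels):
            err = y - predict(coef, x)  # gradient of the log-likelihood
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        coef = [c + lr * g / len(rows) for c, g in zip(coef, grad)]
    return coef


# Made-up (CBON, NIV) values for eight classes; 1 = a fault was found.
ROWS = [(0, 1), (1, 2), (1, 1), (2, 3), (4, 5), (5, 6), (6, 4), (5, 8)]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]

coef = fit_logistic(ROWS, LABELS)
for x in [(0, 1), (6, 7)]:
    print(x, "faulty" if predict(coef, x) > 0.5 else "not faulty")
```

The same fit would be repeated once per checkpoint, each time using only the metrics available at that checkpoint as the columns of `ROWS`.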

6 Introduced checkpoints CP1: Associations and attributes of classes are determined. CP2: Derivation, interface (methods), and reused classes are determined. CP3: The algorithm of each method is developed. CP4: Source code is written. [Figure: checkpoints placed along the time axis t across the phases Analysis, System Design, Object Design, and Implementation]

7 Metrics We use the following metrics in this study. C&K metrics ・ DIT, NOC, RFC, CBO, WMC, and LCOM ・ CBON (Coupling to newly developed classes) ・ CBOR (Coupling to reused classes), where CBO = CBON + CBOR Other metrics ・ NIV (Number of instance variables) ・ SLOC (Source lines of code)

8 Checkpoints and metric sets CP1: Associations and attributes of classes are determined. { CBON, NIV } CP2: Derivation, interface (methods), and reused classes are determined. { CBON, NIV, CBOR, CBO, WMC, DIT, NOC } CP3: The algorithm of each method is developed. { CBON, NIV, CBOR, CBO, WMC, DIT, NOC, RFC, LCOM } CP4: Source code is written. { CBON, NIV, CBOR, CBO, WMC, DIT, NOC, RFC, LCOM, SLOC } [Figure: checkpoints placed along the time axis t across the phases Analysis, System Design, Object Design, and Implementation]
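The metric sets above can be written down as a small lookup table. The sketch below is illustrative only; the helper names `metrics_at` and `newly_available` are hypothetical.

```python
# Metric sets per checkpoint, as listed on the slide (each set extends the previous one).
METRIC_SETS = {
    "CP1": ["CBON", "NIV"],
    "CP2": ["CBON", "NIV", "CBOR", "CBO", "WMC", "DIT", "NOC"],
    "CP3": ["CBON", "NIV", "CBOR", "CBO", "WMC", "DIT", "NOC", "RFC", "LCOM"],
    "CP4": ["CBON", "NIV", "CBOR", "CBO", "WMC", "DIT", "NOC", "RFC", "LCOM", "SLOC"],
}


def metrics_at(checkpoint, measured):
    """Keep only the measurements that are available at the given checkpoint."""
    return {m: measured[m] for m in METRIC_SETS[checkpoint] if m in measured}


def newly_available(checkpoint):
    """Metrics that first become available at this checkpoint."""
    order = list(METRIC_SETS)
    i = order.index(checkpoint)
    earlier = set(METRIC_SETS[order[i - 1]]) if i > 0 else set()
    return [m for m in METRIC_SETS[checkpoint] if m not in earlier]


for cp in METRIC_SETS:
    print(cp, "adds", newly_available(cp))
```

Filtering a class's measurements through `metrics_at` gives the predictor variables that a model for that checkpoint is allowed to use.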

9 Estimation of fault-proneness of classes “Multivariate logistic regression is a standard technique based on maximum likelihood estimation, to analyze the relationships between measures and fault-proneness of classes.” P1: fault-proneness (probability that a fault is detected) CBO, NIV: metric values C0, C1, C2: coefficients If P1 of the target class > 0.5, then the class is predicted as faulty.
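The symbols listed above fit the standard logistic model. As a hedged reconstruction for the two-metric case (the exact form is an assumption based on the usual definition of logistic regression, not a formula recovered from the slide):

```latex
P_1 = \frac{1}{1 + e^{-\left(C_0 + C_1 \cdot \mathit{CBO} + C_2 \cdot \mathit{NIV}\right)}}
```

Under this form, the decision rule P1 > 0.5 is equivalent to C0 + C1·CBO + C2·NIV > 0, and each additional metric in a checkpoint's set would contribute one more coefficient term to the exponent.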

10 Outline of the experiment We empirically evaluate the proposed method using data collected from an experimental project. ・ The experimental project was performed at a computer company for five days in August 1997. ・ The developers were new employees who had finished on-the-job training in object-oriented design and C++ programming. ・ Each developer team developed the same e-mail delivery system using C++.

11 Experimental data Fault tracking data ・ Location ・ Type ・ Effort to fix Metric data ・ Metric values of developed classes As a result, data on 80 faults in 141 classes were collected.

12 Statistics of empirical data

13 Prediction by metrics (1/2) With the collected data, we estimate fault-prone classes at each checkpoint. Prediction at CP1

14 Prediction by metrics (2/2)

15 Indicators for evaluation To illustrate the precision of the estimation, two indicators are used [2]. Completeness: the percentage of actually faulty classes that are correctly predicted as faulty. Correctness: the percentage of classes predicted as faulty that are actually faulty.
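The two indicators can be computed from sets of class names. This is a minimal sketch following the definitions on this slide; the function name and the example classes are hypothetical.

```python
def completeness_and_correctness(actual_faulty, predicted_faulty):
    """Return (completeness, correctness) in percent.

    completeness: share of actually faulty classes that were predicted faulty.
    correctness:  share of predicted-faulty classes that are actually faulty.
    An indicator is None when its denominator is empty.
    """
    hits = len(actual_faulty & predicted_faulty)
    completeness = 100.0 * hits / len(actual_faulty) if actual_faulty else None
    correctness = 100.0 * hits / len(predicted_faulty) if predicted_faulty else None
    return completeness, correctness


# Made-up example: four faulty classes, three predictions, two of them right.
actual = {"A", "B", "C", "D"}
predicted = {"A", "B", "E"}
print(completeness_and_correctness(actual, predicted))
```

In information-retrieval terms, completeness as defined here corresponds to recall and correctness to precision.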

16 Precision of estimation On the whole, the precision of the estimation improves as the process progresses. ・ Correctness is relatively high at all checkpoints, so the estimation can be used to ‘seed’ the faulty classes. ・ Completeness becomes better at later checkpoints. The estimation at CP2 already does well.

17 Conclusion We have proposed a method to predict fault-proneness at an early phase in object-oriented development, and evaluated the method empirically. As further work, we are going to: ・ Use other metrics in the proposed method. ・ Develop a tool that supports the proposed method.

18 Weyuker’s properties [4] Let $\mu(c)$ denote the measurement of metric $\mu$ for class $c$, and let $p + q$ denote the class obtained by combining classes $p$ and $q$.
W1: $\exists p\, \exists q,\ \mu(p) \neq \mu(q)$.
W2: $\exists p\, \exists q,\ \mu(p) = \mu(q)$, and $p$ differs from $q$.
W3: $\exists p\, \exists q,\ \mu(p) \neq \mu(q)$, and $p$'s functionality is equal to $q$'s (but $p$'s design differs from $q$'s).
W4: $\forall p\, \forall q,\ \mu(p) \leq \mu(p + q)$ and $\mu(q) \leq \mu(p + q)$.
W5: $\exists p\, \exists q\, \exists r,\ \mu(p) = \mu(q)$ and $\mu(p + r) \neq \mu(q + r)$.
W6: $\exists p\, \exists q,\ \mu(p) + \mu(q) < \mu(p + q)$.
Chidamber and Kemerer proved that each of the metrics WMC, DIT, NOC, CBO, RFC, and LCOM satisfies W1, ..., W6, except for NOC and LCOM, which do not satisfy W4. [4] Weyuker, E. J., Evaluating software complexity measures, IEEE Trans. on Software Eng., Vol. 14, No. 9 (1988) 1357-1365.

19 Coefficients at each checkpoint
