CLUSTERING SUPPORT FOR FAULT PREDICTION IN SOFTWARE
Maria La Becca, Dipartimento di Matematica e Informatica, University of Basilicata, Potenza, Italy


INTRODUCTION
Fault Prediction Approaches: Testing, Refactoring, SW Quality
Fault Predictors: Process Metrics, Product Metrics
Granularity: Component || Package Level
GOAL: a new SW clustering approach based on lexical & structural information, and fault prediction models with a cluster level predictor

SOFTWARE CLUSTERING
Software Clustering Approach - Steps:
1 - CORPUS CREATION: OO SW System; Identifiers & Comment Terms; Corpus D = {D1, D2, .., Dn}
2 - CORPUS NORMALIZATION: Splitting Identifiers, Special Token Elimination, Stop Word Removal, Stemming
3 - CORPUS INDEXING: Vector Space Model (VSM); Term by Document Matrix (Terms, Di)
4 - COMPUTING SIMILARITIES: Lexical
5 - EXTRACTING DEPENDENCIES: Structural (JRipples)
6 - CLUSTERING: G' = (V, E, ω); BorderFlow Algorithm; clusters are Lexically Similar and Structurally Dependent
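The normalization, indexing, and similarity steps (2 to 4) can be illustrated with a minimal sketch. It is not the tooling used in the study: the stop-word list, the suffix-stripping `crude_stem` helper (a stand-in for a real stemmer), and the raw term-frequency weighting are all assumptions made for illustration, since the slides do not state which stemmer or weighting scheme was applied.

```python
import math
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list, not the one used in the study.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "for", "this"}


def split_identifier(token):
    """Split camelCase and snake_case identifiers into lowercase words."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", token).replace("_", " ")
    return [part.lower() for part in spaced.split()]


def crude_stem(word):
    """Rough suffix stripping; a hypothetical stand-in for a real stemmer."""
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        word = word[:-1]
    for suffix in ("ing", "er"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text):
    """Split identifiers, drop special tokens and stop words, then stem."""
    # The pattern keeps identifier-like tokens only, so punctuation is dropped.
    raw_tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text)
    words = [w for tok in raw_tokens for w in split_identifier(tok)]
    return [crude_stem(w) for w in words if w not in STOP_WORDS and len(w) > 1]


def term_vector(text):
    """One column of a term-by-document matrix, using raw term frequency."""
    return Counter(normalize(text))


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


if __name__ == "__main__":
    # Hypothetical class "documents" built from identifiers and comments.
    doc_a = term_vector("class FileReader: reads the files")
    doc_b = term_vector("class NetworkSocket: opens a socket")
    print(cosine(doc_a, doc_b))
```

In this sketch each class is treated as one document, and the pairwise cosine values would play the role of lexical edge weights in the graph that is later clustered.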

FAULT PREDICTION MODELS
Input: Classes that are Lexically Similar and Structurally Dependent
Predictors: Product Metrics
Models: Multivariate Linear Regression, Logistic Regression
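As a rough illustration of the logistic regression side of these models, the sketch below fits a binary fault predictor on product metrics with plain gradient descent. The training rows are invented toy values (two metrics scaled to the range 0 to 1), and the function names, learning rate, and epoch count are assumptions; the study itself presumably used a statistical package rather than hand-written optimization.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, row):
    """Probability that the unit described by `row` contains a fault."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


def fit_logistic(rows, labels, learning_rate=0.5, epochs=5000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, label in zip(rows, labels):
            error = predict_proba(weights, bias, row) - label
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


if __name__ == "__main__":
    # Toy data (assumption): each row is [scaled CBO, scaled WMC] for one cluster,
    # each label is 1 if the cluster had at least one fault, else 0.
    metrics = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.7, 0.8], [0.8, 0.6], [0.9, 0.9]]
    faulty = [0, 0, 0, 1, 1, 1]
    w, b = fit_logistic(metrics, faulty)
    print(predict_proba(w, b, [0.85, 0.85]), predict_proba(w, b, [0.15, 0.15]))
```

The same inputs could feed a linear regression when the dependent variable is a fault count rather than a yes/no flag.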

CASE STUDY
Definition and Context
RQ: Does the cluster level approach improve fault prediction as compared with the baseline (i.e., class and package level)?
SW Clustering Approach: Fault Prediction Models at Cluster Granularity Level
Baseline Approach (Class & Package): Fault Prediction Models at Class & Package Granularity Level
Metrics; SWLR - LGR
SW System; Releases (– 1.5 – – – 2.5 –); per Release: SW Metrics & Fault, Source Code &

CASE STUDY
Planning
Fault Prediction: Previous Knowledge, OO SW System
INTER-release: X.0, X.1
INTRA-release
Empirical Evaluation: Training Set, Test Set

Selected Variables

Dependent Variables (Name: Definition)
ClassFault: The number of faults in the classes.
BinaryClassFault: Indicates whether or not faults are present in a class.
ClusterFault: The number of faults in the clusters.
BinaryClusterFault: Indicates whether or not faults are present in a cluster.
PackageFault: The number of faults in the packages.
BinaryPackageFault: Indicates whether or not faults are present in a package.

Independent Variables (Name: Definition)
WMC (Weighted Methods per Class): It indicates the number of methods (assuming unity weights for all methods).
DIT (Depth of Inheritance Tree): It provides a measure of the inheritance levels from the object hierarchy top.
NOC (Number Of Children): It measures the number of immediate descendants of the class.
CBO (Coupling Between Object classes): It represents the number of classes coupled to a given class.
RFC (Response For Class): It measures the number of methods that can be executed when an object of that class receives a message.
LCOM (Lack of Cohesion in Methods): It counts the methods in a class that are not related through the sharing of some of the class fields.
NPM (Number of Public Methods): It counts all the methods in a class that are declared as public.
LOC (Lines Of Code): It is the number of instructions in each method of the class.
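The cluster-level dependent variables can be derived from class-level fault counts once each class has been assigned to a cluster. The sketch below assumes that ClusterFault is the sum of the faults of the member classes, which matches the definition given here but is not spelled out in the slides; the class and cluster names are hypothetical.

```python
from collections import defaultdict


def cluster_fault_variables(class_faults, class_to_cluster):
    """Aggregate class-level fault counts into cluster-level variables.

    class_faults maps a class name to its number of faults (ClassFault).
    class_to_cluster maps a class name to the id of the cluster it belongs to.
    """
    totals = defaultdict(int)
    for class_name, cluster_id in class_to_cluster.items():
        # Classes without a recorded fault count are treated as fault-free.
        totals[cluster_id] += class_faults.get(class_name, 0)
    return {
        cluster_id: {"ClusterFault": count, "BinaryClusterFault": count > 0}
        for cluster_id, count in totals.items()
    }


if __name__ == "__main__":
    # Hypothetical example data.
    faults = {"Parser": 2, "Lexer": 1, "Logger": 0}
    clusters = {"Parser": "c1", "Lexer": "c1", "Logger": "c2"}
    print(cluster_fault_variables(faults, clusters))
```

The package-level variables follow the same pattern with a class-to-package mapping in place of the cluster assignment.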

CASE STUDY
Validation and Evaluation - Intra- & Inter-Release Analysis
Intra-Release (Version X.0): K-Fold Cross Validation; Training Set and Test Set over K rounds; results averaged over the rounds
Inter-Release: Training Set, Test Set
To assess and compare predictors: SWLR and LGR
SWLR Models: SAR
LGR Models: AIC & RD (lower values indicate better goodness of fit)
SWLR Predictors: Kendall τ & Spearman ρ, in [-1;+1]
LGR Predictors: Precision, Recall, F-measure
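A small sketch of the evaluation machinery named here: index-based k-fold splits, the sum of absolute residuals (SAR) for count predictions, and precision, recall, and F-measure for binary predictions. The round-robin fold assignment and all function names are assumptions made for illustration; the slides do not say how the folds were drawn.

```python
def k_fold_splits(n_items, k):
    """Return k (train_indices, test_indices) pairs; item i goes to fold i % k."""
    splits = []
    for fold in range(k):
        test = [i for i in range(n_items) if i % k == fold]
        train = [i for i in range(n_items) if i % k != fold]
        splits.append((train, test))
    return splits


def sum_absolute_residuals(observed, predicted):
    """SAR: sum of |observed - predicted| over all units; lower is better."""
    return sum(abs(o - p) for o, p in zip(observed, predicted))


def precision_recall_f(actual, predicted):
    """Precision, recall, and F-measure, treating 1 (faulty) as the positive class."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    for train, test in k_fold_splits(6, 3):
        print("train", train, "test", test)
    print(sum_absolute_residuals([3, 0, 2], [2.5, 0.5, 2]))
    print(precision_recall_f([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

In an intra-release analysis the per-fold scores would then be averaged over the k rounds; in an inter-release analysis a single split (one release for training, the next for testing) replaces the folds.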

RESULTS
OO Software System: 6 PREDICTORS (3 SWLR: CLUSTER + BASELINE; 3 LGR: CLUSTER + BASELINE), INTRA- & INTER-RELEASE

Results | Cluster Level Models | Baseline Class Level Models | Baseline Package Level Models | No Prevalence
SWLR: Sum of Absolute Residual (SAR) | < | > | > | -
SWLR: Correlation FP-FO (K. & S.), INTRA | 8 of 15 | 0 of 15 | 2 of 15 | 5 of 15
SWLR: Correlation FP-FO (K. & S.), INTER | 7 of 11 | 0 of 11 | 1 of 11 | 3 of 11
LGR: Goodness of the fit (AIC - RD) | < | > | > | -
LGR: Correlation FP-FO (Precision - Recall - FMeasure), INTRA | 9 of 15 | 0 of 15 | 1 of 15 | 5 of 15
LGR: Correlation FP-FO (Precision - Recall - FMeasure), INTER | 7 of 11 | 0 of 11 | 3 of 11 | 1 of 11

Legend: Best Values, Worst Values, No Prevalence

CONCLUSION
Thanks
Acknowledgements: Carmine Gravino, Andrian Marcus, Tim Menzies, Giuseppe Scanniello