A Novel Method for Early Software Quality Prediction Based on Support Vector Machine
Fei Xing1, Ping Guo1,2 and Michael R. Lyu2
1 Department of Computer Science, Beijing Normal University
2 Department of Computer Science & Engineering, The Chinese University of Hong Kong

Outline
- Background
- Support vector machine
  - Basic theory
  - SVM with Risk Feature
  - Transductive SVM
- Experiments
- Conclusions
- Further work

Background
- Modern society is fast becoming dependent on software products and systems.
- Achieving high reliability is one of the most important challenges facing the software industry.
- Software quality models are therefore urgently needed.

Background: Software quality model
- A software quality model is a tool for focusing software enhancement efforts.
- Such a model yields timely predictions on a module-by-module basis, enabling one to target high-risk modules.

Background: Software complexity metrics
- A quantitative description of program attributes.
- Closely related to the distribution of faults in program modules.
- Play a critical role in predicting the quality of the resulting software.

Background: Software quality prediction
- Aims to evaluate the software quality level periodically and to indicate software quality problems early.
- Investigates the relationship between the number of faults in a program and its software complexity metrics.

Background: Related work
Several different techniques have been proposed to develop predictive software metrics for the classification of software program modules into fault-prone and non fault-prone categories:
- Discriminant analysis [7]
- Factor analysis [8]
- Classification trees [9]
- Pattern recognition [10]
- EM algorithm [11]
- Feedforward neural networks [12]
- Random forests [13]

Background: Classification problem
There are two classes, the fault-prone class and the non fault-prone class, and two types of errors:
- A Type I error is the case where we conclude that a program module is fault-prone when in fact it is not.
- A Type II error is the case where we believe that a program module is non fault-prone when in fact it is fault-prone.

Background: Which error type is more serious in practice?
Type II errors have more serious implications, since a product would seem better than it actually is, and testing effort would not be directed where it is needed the most.

Research Objectives
- Search for a well-accepted mathematical model for software quality prediction.
- Lay out the application procedure for the selected software quality prediction model.
- Perform an experimental comparison to assess the proposed model.
- Select a proven model for investigation: the Support Vector Machine.

Support Vector Machine
- Introduced by Vapnik on the foundation of statistical learning theory, whose origins trace back to the late 1960s.
- Built on the structural risk minimization (SRM) principle, which determines the classification decision function by minimizing a bound on the generalization risk rather than the empirical risk alone.

Support Vector Machine (SVM)
- A comparatively new technique for data classification that has been used successfully in many object recognition applications.
- Known to generalize well even in high-dimensional spaces under small training sample conditions.
- Excels among linear classifiers.

Linear Binary Classifier
Given two classes of data sampled from x and y, we try to find a linear decision hyperplane $w^T z + b = 0$ that correctly discriminates x from y:
- if $w^T z + b > 0$, z is classified as x;
- if $w^T z + b < 0$, z is classified as y.
The linear binary classifier is one of the most important classifiers in machine learning. It is not only very simple; more importantly, many linear classifiers can be extended into nonlinear ones by a standard technique (kernelization). As the slide's figure demonstrates, many hyperplanes can correctly classify the data, but which one is the best? This question motivates many kinds of methods.
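As an illustrative sketch only (the weight vector and bias below are hand-picked, not learned from data), the decision rule can be written in a few lines of Python:

```python
import numpy as np

def linear_classify(z, w, b):
    """Return +1 (class x) if w^T z + b > 0, else -1 (class y)."""
    return 1 if np.dot(w, z) + b > 0 else -1

# Hand-picked 2-D hyperplane: w is its normal vector, b its offset.
w = np.array([1.0, -1.0])
b = 0.5
print(linear_classify(np.array([2.0, 0.5]), w, b))   # +1: classified as x
print(linear_classify(np.array([-1.0, 2.0]), w, b))  # -1: classified as y
```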

Support Vector Machine
The current state-of-the-art classifier. (Figure labels: local learning, decision plane, margin, support vectors.)
In this figure, the decision hyperplane given by the SVM maximizes the margin between the two classes of data. Moreover, the decision boundary is determined only by several critical points called support vectors; all other points are irrelevant to it.
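Scikit-learn's SVC exposes exactly these quantities after fitting; a small sketch on synthetic blobs (the data and settings here are illustrative assumptions, not the paper's):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two linearly separable blobs of points.
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

clf = SVC(kernel="linear", C=1.0).fit(X, y)
print(clf.support_vectors_)           # the critical points that fix the plane
print(clf.coef_, clf.intercept_)      # w and b of the maximum-margin hyperplane
print(2 / np.linalg.norm(clf.coef_))  # margin width 2 / ||w||
```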

Support Vector Machine: Dual problem
Using standard Lagrangian duality techniques, one arrives at the following dual quadratic programming (QP) problem:

$$\max_{\alpha}\;\sum_{i=1}^{n}\alpha_i-\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j y_i y_j\,x_i^T x_j$$

$$\text{s.t.}\quad \sum_{i=1}^{n}\alpha_i y_i=0,\qquad 0\le\alpha_i\le C,\;i=1,\dots,n.$$
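One way to solve this dual is to hand it to an off-the-shelf QP solver. A minimal sketch using cvxopt with a linear kernel, assuming labels y_i in {-1, +1} (the solver choice and helper name are assumptions; any QP package would do):

```python
import numpy as np
from cvxopt import matrix, solvers

def svm_dual(X, y, C=1.0):
    """Solve max sum(a) - 0.5 a^T (y y^T * K) a, s.t. 0 <= a_i <= C, y^T a = 0."""
    n = X.shape[0]
    K = X @ X.T                                     # linear-kernel Gram matrix
    P = matrix(np.outer(y, y) * K)                  # quadratic term of the QP
    q = matrix(-np.ones(n))                         # maximizing sum(a) = minimizing -sum(a)
    G = matrix(np.vstack([-np.eye(n), np.eye(n)]))  # stacked: -a <= 0 and a <= C
    h = matrix(np.hstack([np.zeros(n), C * np.ones(n)]))
    A = matrix(y.reshape(1, -1).astype(float))      # equality constraint y^T a = 0
    b = matrix(0.0)
    sol = solvers.qp(P, q, G, h, A, b)
    return np.array(sol["x"]).ravel()               # the Lagrange multipliers alpha
```

Support vectors are the training points with nonzero alpha; for the linear kernel, w = sum_i alpha_i y_i x_i recovers the primal solution, and b follows from any support vector lying on the margin.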

Support Vector Machine: The Optimal Separating Hyperplane
Place a linear boundary between the two different classes, and orient it in such a way that the margin is maximized. The optimal hyperplane is required to satisfy the following constrained minimization:

$$\min_{w,b}\;\frac{1}{2}\|w\|^2\quad\text{s.t.}\quad y_i(w^T x_i+b)\ge 1,\;i=1,\dots,n.$$

Support Vector Machine: The Generalized Optimal Separating Hyperplane
For the linearly non-separable case, positive slack variables $\xi_i$ are introduced:

$$\min_{w,b,\xi}\;\frac{1}{2}\|w\|^2+C\sum_{i=1}^{n}\xi_i\quad\text{s.t.}\quad y_i(w^T x_i+b)\ge 1-\xi_i,\;\xi_i\ge 0.$$

C is used to weight the penalizing variables $\xi_i$, and a larger C corresponds to assigning a higher penalty to errors.
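The role of C is easy to observe in any soft-margin implementation; a hedged sketch with scikit-learn on synthetic overlapping data (the values of C are illustrative):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Two overlapping blobs, so some slack is unavoidable.
X = np.vstack([rng.normal(-1, 1.0, (50, 2)), rng.normal(1, 1.0, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # A larger C penalizes slack harder: typically fewer support vectors
    # and a narrower margin.
    print(C, clf.n_support_, 2 / np.linalg.norm(clf.coef_))
```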

Support Vector Machine: SVM with Risk Feature
Take into account the cost of the different types of errors by adjusting the error penalty parameter C to control the risk: C1 is the error penalty parameter of class 1, and C2 is the error penalty parameter of class 2.
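In scikit-learn, for example, this per-class penalty can be expressed through the class_weight argument of SVC, which multiplies C for each class; a minimal sketch of the C1/C2 idea (the weights are illustrative, not the paper's values):

```python
from sklearn.svm import SVC

# class 1 = fault-prone, class 0 = non fault-prone (illustrative weights).
# class_weight multiplies C per class, so with C = 1000 the effective
# penalties are C1 = 2000 for class 1 and C2 = 1000 for class 0: mistakes
# on fault-prone modules (Type II errors) are penalized twice as hard.
clf = SVC(kernel="rbf", C=1000.0, class_weight={1: 2.0, 0: 1.0})
```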

Optimal Separating Hyperplane
(Figure: three separating hyperplanes, for C1=20000, C2=20000; C1=10000, C2=20000; and C1=20000, C2=10000.)
C1 is the error penalty parameter of class 1 and C2 that of class 2. Different C1:C2 ratios generate different separating hyperplanes and, consequently, different classification results.

Support Vector Machine: Transductive SVM
- A kind of semi-supervised learning; TSVM is a new method derived from SVM.
- It takes a particular test set into account as well as the training set, and tries to minimize misclassifications of exactly those test examples.
- When we applied TSVM to build the classifier, it achieved the best CCR, 90.03%, but TSVM training is more time-consuming than the other methods.
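TSVM itself is not available in scikit-learn; as a loose, hedged illustration of the same semi-supervised idea (self-training, a different technique, not the paper's TSVM), a wrapper around an SVM can exploit unlabeled examples, which are marked with -1:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-2, 1.0, (60, 2)), rng.normal(2, 1.0, (60, 2))])
y = np.array([0] * 60 + [1] * 60)

# Hide most labels: -1 marks the unlabeled examples the learner may exploit.
y_semi = y.copy()
y_semi[rng.random(120) < 0.8] = -1

base = SVC(kernel="rbf", probability=True)  # self-training needs predict_proba
model = SelfTrainingClassifier(base).fit(X, y_semi)
print(model.score(X, y))
```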

Experiments: Data description
- Medical Imaging System (MIS) data set.
- 11 software complexity metrics were measured for each of the modules.
- Change Reports (CRs) represent detected faults.
- Modules with 0 or 1 CRs are treated as non fault-prone (114 in total), and modules with 10 to 98 CRs as fault-prone (89 in total).
- The total of 203 samples is divided into two parts: half for training and the remaining half for testing (see the sketch below).
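A sketch of this labeling and split, using randomly generated stand-in data in place of the real MIS measurements (the CR thresholds follow the slide; everything else is illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
# Stand-ins for the real MIS data: modules with 11 metrics and CR counts 0..98.
X = rng.normal(size=(400, 11))
crs = rng.integers(0, 99, size=400)

# 0-1 CRs -> non fault-prone (0); 10-98 CRs -> fault-prone (1); drop the rest.
keep = (crs <= 1) | (crs >= 10)
y = (crs[keep] >= 10).astype(int)
X_kept = X[keep]

# Half for training, the remaining half for testing.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_kept, y, test_size=0.5, stratify=y, random_state=0)
```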

Experiments: Metrics of the MIS data
- Total lines of code including comments (LOC)
- Total code lines (CL)
- Total character count (TChar)
- Total comments (TComm)
- Number of comment characters (MChar)
- Number of code characters (DChar)
- Halstead's program length (N)
- Halstead's estimated program length ($\hat{N}$)
- Jensen's estimator of program length (NF)
- McCabe's cyclomatic complexity (v(G))
- Belady's bandwidth metric (BW), ...

Distribution in the first three principal components space
(Figure: MIS modules plotted in the space of the first three principal components.)
The MIS data set has 11 metrics (11 dimensions); PCA is used to reduce the 11 dimensions to 3. This gives a visual impression of how difficult it is to separate fault-prone from non fault-prone modules in the lower-dimensional PCA space.
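A sketch of that projection with scikit-learn (stand-in random data in place of the MIS metrics):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Stand-in for the 203 x 11 MIS metric matrix.
X_kept = np.random.default_rng(4).normal(size=(203, 11))

# Standardize the 11 metrics, then project onto the first 3 principal components.
Z = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(X_kept))
print(Z.shape)  # (203, 3): the coordinates used for the 3-D scatter plot
```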

Methods for Comparison
Applied models:
- QDA: Quadratic Discriminant Analysis
- PCA: Principal Component Analysis
- CART: Classification and Regression Tree
- SVM: Support Vector Machine
- TSVM: Transductive SVM
Evaluation criteria (computed as in the sketch below):
- CCR: Correct Classification Rate
- T1ERR: Type I error
- T2ERR: Type II error
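The three criteria can be computed from a confusion matrix. A hedged sketch, assuming the label convention 1 = fault-prone and 0 = non fault-prone, with all three rates expressed as fractions of all samples (so they sum to 1, as in the tables below):

```python
from sklearn.metrics import confusion_matrix

def evaluate(y_true, y_pred):
    """Return CCR plus Type I / Type II error rates over all samples."""
    # Rows = true class, columns = predicted class, labels ordered [0, 1].
    (tn, fp), (fn, tp) = confusion_matrix(y_true, y_pred, labels=[0, 1])
    n = tn + fp + fn + tp
    ccr = (tn + tp) / n       # correct classification rate
    t1err = fp / n            # non fault-prone called fault-prone (Type I)
    t2err = fn / n            # fault-prone called non fault-prone (Type II)
    return ccr, t1err, t2err  # note: ccr + t1err + t2err == 1
```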

The Comparison Results

Methods     CCR      Std      T1ERR    T2ERR
QDA         85.49%   0.0288   7.37%    7.14%
PCA+QDA     86.53%   0.0275   4.90%    7.52%
PCA+CART    83.02%   0.0454   9.59%    6.41%
SVM         89.00%   0.0189   2.33%    8.67%
PCA+SVM     89.07%   0.0209   2.06%    8.87%
TSVM        90.03%   0.0326   2.11%    7.86%

Note: CCR + T1ERR + T2ERR = 100%. QDA: quadratic discriminant analysis; PCA: principal component analysis; CART: classification and regression trees; TSVM: transductive SVM.

The Comparison of the Three Kernels of SVM

Kernel function   CCR      Std      T1ERR    T2ERR
Polynomial        70.74%   0.0208   0.45%    28.81%
Radial basis      88.68%   0.0220   2.65%    8.67%
Sigmoid           88.75%   0.0223   2.29%    8.96%

Note: CCR + T1ERR + T2ERR = 100%. The sigmoid kernel has the highest correct classification rate (CCR), while the radial basis kernel has the lowest T2ERR. Both the radial basis and sigmoid kernels achieve excellent performance, significantly better than the polynomial kernel.
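A sketch of such a kernel comparison (the data, hyperparameters, and cross-validation setup are illustrative assumptions; the paper's exact settings are not reproduced here):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(5)
# Stand-in data; the real study uses the 11 MIS metrics and CR-based labels.
X = np.vstack([rng.normal(-1, 1, (100, 11)), rng.normal(1, 1, (100, 11))])
y = np.array([0] * 100 + [1] * 100)

for kernel in ("poly", "rbf", "sigmoid"):
    clf = make_pipeline(StandardScaler(), SVC(kernel=kernel, C=1.0))
    scores = cross_val_score(clf, X, y, cv=5)
    print(kernel, round(scores.mean(), 4), round(scores.std(), 4))
```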

Experiments with the Minimum Risk
Two ways of taking the costs of the different error types into account are compared:
- SVM with the risk feature: take the cost of the different types of errors into account by adjusting the error penalty parameter C of the SVM to control the risk.
- The Bayesian decision with the minimum risk: the classical approach, which typically classifies using the posterior probability density functions (sketched below).
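A hedged sketch of the minimum-risk rule: estimate class posteriors (here with a Gaussian class-conditional model, as in QDA), then choose the class minimizing the expected risk under a loss matrix (all data, costs, and helper names below are illustrative assumptions):

```python
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

def min_risk_predict(model, X, loss):
    """loss[i][j] = cost of deciding class i when the true class is j."""
    post = model.predict_proba(X)      # columns: P(class j | x)
    risk = post @ np.asarray(loss).T   # R(decide i | x) = sum_j loss[i][j] P(j | x)
    return risk.argmin(axis=1)         # pick the decision with minimum expected risk

rng = np.random.default_rng(6)
# Stand-in training data for the two module classes.
X = np.vstack([rng.normal(-1, 1, (100, 11)), rng.normal(1, 1, (100, 11))])
y = np.array([0] * 100 + [1] * 100)
qda = QuadraticDiscriminantAnalysis().fit(X, y)

# Risk ratio 1:1.2 -- a Type II mistake (deciding "non fault-prone" for a
# truly fault-prone module) costs 1.2 times a Type I mistake.
loss = [[0.0, 1.2],
        [1.0, 0.0]]
print(min_risk_predict(qda, X[:5], loss))
```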

SVM with the Risk Feature

C1      C2      CCR      Std      T1ERR    T2ERR
5000    20000   86.53%   0.0393   11.43%   5.10%
8000    20000            0.0275   4.90%    7.52%
10000   20000   89.00%   0.0189   2.33%    8.67%
15000   20000   89.07%   0.0209   2.06%    8.87%

We can control the Type II error by adjusting the error penalty parameter C of the SVM, which achieves better classification performance than the minimum-risk-based Bayesian decision when both CCR and the Type II error are considered.

The Bayesian Decision with the Minimum Risk

Risk ratio   CCR      Std      T1ERR    T2ERR
1:1          85.94%   0.0387   7.59%    6.47%
1:1.1        83.80%   0.0326   11.06%   5.14%
1:1.2        78.73%   0.0321   17.90%   3.37%
1:1.3        71.31%   0.0436   27.12%   1.57%
1:1.4        59.98%   0.0516   39.75%   0.27%

The first column is the risk ratio of the Type I error versus the Type II error. We can observe that SVM with the risk feature is superior in classification performance to the Bayesian decision based on the minimum risk. Nevertheless, the Bayesian decision can push the Type II error to a very low value, at the cost of a lower CCR and a higher Type I error.

Discussions
Features of this work:
- Models nonlinear functional relationships.
- Shows good generalization ability even in high-dimensional spaces under small training sample conditions.
- The SVM-based software quality prediction model achieves relatively good performance.
- The Type II error is easily controlled by adjusting the error penalty parameter C of the SVM.

Conclusions
- SVM provides a new approach that has not yet been fully explored in software reliability engineering.
- SVM offers a promising technique for software quality prediction.
- SVM is suitable for real-world applications in software quality prediction and other software engineering fields.