1
Will Fault Localization Work For These Failures?
An Automated Approach to Predict Effectiveness of Fault Localization Tools
Tien-Duy B. Le and David Lo
School of Information Systems, Singapore Management University
29th IEEE International Conference on Software Maintenance
2
Fault Localization Tool: A Primer
Developer: "My program failed." The tool takes the failing program, runs it, and computes suspiciousness scores: "I have calculated the most suspicious locations of bugs," returning a ranked list of locations (1, 2, 3, 4, …). Developer: "OK! I will check your suggestion," and starts debugging from the top of the list.
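To make the primer concrete, here is a minimal sketch of how a spectrum-based tool scores statements, using Tarantula's standard suspiciousness formula (Tarantula is the tool evaluated later in this deck); the toy coverage data is invented for illustration.

```python
# Minimal sketch of spectrum-based fault localization (Tarantula's formula).
# The toy coverage data below is invented for illustration.

def tarantula(fail_cov, pass_cov, total_fail, total_pass):
    """Suspiciousness of one statement from its coverage counts."""
    f = fail_cov / total_fail if total_fail else 0.0
    p = pass_cov / total_pass if total_pass else 0.0
    return f / (f + p) if (f + p) > 0 else 0.0

# statement -> (# failing traces covering it, # passing traces covering it)
spectra = {"s1": (3, 1), "s2": (1, 4), "s3": (3, 0)}
scores = {s: tarantula(f, p, total_fail=3, total_pass=5)
          for s, (f, p) in spectra.items()}

# Rank statements from most to least suspicious, as the tool in the primer does.
for stmt, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(stmt, round(score, 3))
```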
3
Will Fault Localization Tools Really Work?
In the ideal case:
–Faulty statements appear within the first few suspicious statements of the ranked list, e.g., the top 10, 20, or 30
–The developer inspects the list, quickly finds the bug, and debugging is effective
4
Will Fault Localization Tools Really Work?
In the worst case:
–Faulty statements do not appear early in the ranked list of statements
–Inspecting the list is time consuming: debugging (seemingly) takes forever
5
Will Fault Localization Tools Really Work?
We build an oracle that predicts whether an output of a fault localization tool (i.e., an instance) can be trusted or not. If an output is not trusted:
–Developers do not have to spend time on it
–Developers can revert to manual debugging
6
Overall Framework: Training Stage
[Diagram: program spectra feed a fault localization tool that outputs suspiciousness scores; (1) feature extraction turns spectra and scores into features; (2) model learning combines the features with effectiveness labels to produce a model]
7
Overall Framework
Major components:
–Feature Extraction: 50 features in 6 categories
–Model Learning: we extend the Support Vector Machine (SVM) to handle imbalanced training data
8
Feature Extraction
Traces (5 features):
–T1: # traces
–T2: # failing traces
–T3: # successful traces
–…
Program Elements (10 features):
–PE1: # program elements in failing traces
–PE2: # program elements in correct traces
–PE3: PE2 − PE1
–…
9
Feature Extraction
Raw Scores (10 features):
–R1: highest suspiciousness score
–R2: second highest suspiciousness score
–Ri: i-th highest suspiciousness score
–…
Simple Statistics (6 features):
–SS1: number of distinct scores among the top-10 scores
–SS2: mean of the top-10 suspiciousness scores
–SS3: median of the top-10 suspiciousness scores
–…
10
Feature Extraction
Gaps (11 features):
–G1: R1 − R2
–G2: R2 − R3
–Gi: Ri − R(i+1), where …
Relative Difference (8 features):
–RD1, …, RDi: relative differences between suspiciousness scores (formulas shown on the original slide)
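A sketch of how the raw-score, gap, and one simple-statistics feature could be derived from a tool's output; the relative-difference formulas are not shown on the slide, so they are omitted here, and the score list is invented for illustration.

```python
# R_i is the i-th highest suspiciousness score; G_i = R_i - R_(i+1), per the slide.
# The score list below is an invented example.

suspiciousness = [0.91, 0.91, 0.60, 0.58, 0.30, 0.30,
                  0.29, 0.10, 0.05, 0.05, 0.01, 0.0]

ranked = sorted(suspiciousness, reverse=True)
R = ranked[:10]                                     # raw-score features R1..R10
G = [ranked[i] - ranked[i + 1] for i in range(11)]  # gap features G1..G11
SS1 = len(set(R))                                   # distinct scores among the top 10

print("R:", R)
print("G:", [round(g, 2) for g in G])
print("SS1:", SS1)
```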
11
Model Learning
We extend an off-the-shelf Support Vector Machine (SVM).
The training data is imbalanced: # ineffective instances > # effective instances.
The result is the Extended Support Vector Machine (SVM_EXT).
[Diagram: maximum-margin hyperplane separating effective instances from ineffective instances]
12
SVM_EXT
For each effective instance:
–We calculate its similarities to the ineffective instances
–Each instance is represented by a feature vector
–We use cosine similarity: sim(u, v) = (u · v) / (‖u‖ ‖v‖)
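A minimal sketch of the cosine similarity computation between two feature vectors:

```python
import math

# Cosine similarity between two feature vectors u and v:
# sim(u, v) = (u . v) / (||u|| * ||v||). Returns 0 for a zero vector.
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```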
13
SVM_EXT
Sort the effective instances by their highest similarity to any ineffective instance (descending).
Duplicate effective instances from the top of the list until the training data is balanced (see the sketch below).
[Diagram: the top-ranked effective instances are selected and duplicated to match the number of ineffective instances]
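A sketch of this balancing step, reusing the cosine() helper above; the slide only says to duplicate from the top of the list, so the detail that duplication cycles through the ranking when more copies are needed is my assumption.

```python
# Oversample effective instances until the classes are balanced:
# rank them by their highest cosine similarity to any ineffective
# instance, then duplicate from the top of that ranking.
def balance_effective(effective, ineffective):
    ranked = sorted(
        effective,
        key=lambda e: max(cosine(e, i) for i in ineffective),
        reverse=True,  # effective instances most similar to ineffective ones first
    )
    balanced = list(effective)
    k = 0
    while len(balanced) < len(ineffective):
        # Assumption: cycle through the ranked list if more copies are needed.
        balanced.append(ranked[k % len(ranked)])
        k += 1
    return balanced  # same size as the ineffective class (when it was smaller)
```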
14
Overall Framework: Deployment Stage
[Diagram: spectra of a new failing program go through fault localization and (1) feature extraction; (3) the learned model performs effectiveness prediction on the extracted features and outputs a prediction]
15
Experiments
We use 10-fold cross validation. We compute precision, recall, and F-measure.
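For concreteness, a sketch of this evaluation protocol in scikit-learn terms; X (feature matrix) and y (effectiveness labels) are assumed to come from the feature-extraction and labeling steps, and a plain linear SVC stands in for the paper's SVM_EXT here.

```python
# 10-fold cross validation with precision, recall, and F-measure on the
# "effective" class (label 1). A plain linear SVC stands in for SVM_EXT.
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import precision_recall_fscore_support

def evaluate(X, y):
    pred = cross_val_predict(SVC(kernel="linear"), X, y, cv=10)
    p, r, f, _ = precision_recall_fscore_support(
        y, pred, average="binary", pos_label=1
    )
    return p, r, f
```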
16
Effectiveness Labeling
A fault localization instance is deemed effective if:
–The root cause is among the top-10 most suspicious program elements
–If a root cause spans more than one program element, at least one of them is in the top-10
Otherwise, it is ineffective.
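A minimal sketch of this labeling rule; the function and variable names are illustrative.

```python
# Label a fault localization instance: "effective" iff at least one
# root-cause program element is among the k most suspicious elements.
def label_instance(scores, root_cause, k=10):
    """scores: {program element -> suspiciousness}; root_cause: set of elements."""
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    return "effective" if root_cause & set(top_k) else "ineffective"
```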
17
Dataset
10 different programs:
–NanoXML, XML-Security, and Space
–7 programs from the Siemens test suites
In total, 200 faulty versions.
For Tarantula, among the 200 instances:
–85 are effective
–115 are ineffective
18
Research Question 1
How effective is our approach in predicting the effectiveness of a state-of-the-art spectrum-based fault localization tool?
Experimental setting:
–Tarantula
–Using the Extended SVM (SVM_EXT)
19
Research Question 1
Precision of 54.36%
–Correctly identify 47 out of 115 ineffective fault localization instances; the remaining 68 ineffective instances are misclassified as effective, so precision on the effective class is 81 / (81 + 68) = 54.36%
Recall of 95.29%
–Correctly identify 81 out of 85 effective fault localization instances

Precision   Recall    F-Measure
54.36%      95.29%    69.23%
20
Research Question 2
How effective is our Extended Support Vector Machine (SVM_EXT) compared with an off-the-shelf Support Vector Machine (SVM)?
Experimental setting:
–Tarantula
–Using the Extended SVM (SVM_EXT) and an off-the-shelf SVM
21
Research Question 2
Result: SVM_EXT outperforms the off-the-shelf SVM.

            SVM_EXT   SVM       Relative Improvement
Precision   54.36%    51.04%    6.50%
Recall      95.29%    57.65%    65.29%
F-Measure   69.23%    54.14%    27.87%
22
Research Question 3
What are the most important features?
–The Fisher score is used to measure how dominant and discriminative a feature is.
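One common two-class form of the Fisher score is (μ+ − μ−)² / (σ+² + σ−²), i.e., squared separation of the class means over the summed class variances; a sketch under the assumption that the paper uses this or a close variant:

```python
# Fisher score of a single feature over binary-labeled instances:
# squared separation of the class means over the sum of class variances.
def fisher_score(values, labels):
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    mean = lambda xs: sum(xs) / len(xs)
    var = lambda xs, m: sum((x - m) ** 2 for x in xs) / len(xs)
    mp, mn = mean(pos), mean(neg)
    denom = var(pos, mp) + var(neg, mn)
    return (mp - mn) ** 2 / denom if denom else float("inf")
```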
23
Top-10 Most Discriminative Features
1. RD7
2. RD8
3. RD6
4. PE1
5. PE2
6. SS1
7. RD5
8. RD1
9. PE4
10. R1
24
Most Important Features
Relative Difference features:
–RD7 (rank 1), RD8 (rank 2), RD6 (rank 3), RD5 (rank 7), and RD1 (rank 8)
25
Most Important Features
Program Elements:
–PE1 (rank 4), PE2 (rank 5), and PE4 (rank 9)
[Diagram: # program elements in failing traces (PE1) vs. correct traces (PE2)]
26
Most Important Features
Simple Statistics:
–SS1 (rank 6): number of distinct suspiciousness scores in {R1, …, R10}
Raw Scores:
–R1 (rank 10): highest suspiciousness score
27
Research Question 4
Could our approach be used to predict the effectiveness of different types of spectrum-based fault localization tools?
Experimental setting:
–Tarantula, Ochiai, and Information Gain
–Using the Extended SVM (SVM_EXT)
28
Research Question 4 ToolPrecisionRecallF-Measure Tarantula54.36%95.29%69.23% Ochiai63.23%97.03%76.56% Information Gain64.47%93.33%76.26% 28 F-Measure for Ochiai and Information Gain –Greater than 75% –Our approach can better predict the effectiveness of Ochiai and Information Gain
29
Research Question 5
How sensitive is our approach to the amount of training data?
Experimental setting:
–Vary the amount of training data from 10% to 90%
–Random sampling
30
Research Question 5
[Chart: prediction performance as the proportion of training data varies from 10% to 90%]
31
Conclusion
We build an oracle to predict the effectiveness of fault localization tools.
–Propose 50 features capturing interesting dimensions of traces and suspiciousness scores
–Propose the Extended Support Vector Machine (SVM_EXT)
Experiments:
–Achieve a good F-measure: 69.23% (Tarantula)
–SVM_EXT outperforms the off-the-shelf SVM
–Relative difference features are the best features
32
Future Work
Improve the F-measure further.
Extend the approach to work for other fault localization techniques.
Extract more features from source code and textual descriptions, e.g., bug reports.
33
Thank you!
Questions? Comments? Advice?
{btdle.2012, davidlo}@smu.edu.sg