An Automated Approach to Predict Effectiveness of Fault Localization Tools. Tien-Duy B. Le and David Lo, School of Information Systems, Singapore Management University.

Presentation transcript:

An Automated Approach to Predict Effectiveness of Fault Localization Tools
Tien-Duy B. Le and David Lo
School of Information Systems, Singapore Management University
"Will Fault Localization Work For These Failures?"
29th IEEE International Conference on Software Maintenance

Fault Localization Tool: A Primer
(Cartoon: a developer reports "My program failed" and gives the failing program to the tool; after running, the tool answers "I have calculated the most suspicious locations of bugs: 1) 2) 3) 4)", and the developer replies "OK! I will check your suggestion" and starts debugging.)

Will Fault Localization Tools Really Work?
In the ideal case:
– Faulty statements are within the first few suspicious statements, e.g. top 10, 20, or 30
– The developer inspects the ranked list, finds the bug, and debugging is effective
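For concreteness, the ranked list in this scenario is produced by a suspiciousness formula. Below is a minimal sketch of Tarantula-style scoring (the tool evaluated later in the talk); the input representation — per-test coverage sets plus pass/fail outcomes — is an illustrative assumption, not the paper's exact implementation.

```python
def tarantula_scores(coverage, outcomes):
    """Sketch of Tarantula suspiciousness scoring.

    coverage: list of sets, each holding the statements executed by one test run.
    outcomes: parallel list of booleans, True if that run failed.
    Returns statements sorted from most to least suspicious.
    """
    total_fail = sum(outcomes)
    total_pass = len(outcomes) - total_fail
    statements = set().union(*coverage)
    scores = {}
    for s in statements:
        fail = sum(1 for cov, bad in zip(coverage, outcomes) if bad and s in cov)
        passed = sum(1 for cov, bad in zip(coverage, outcomes) if not bad and s in cov)
        fail_ratio = fail / total_fail if total_fail else 0.0
        pass_ratio = passed / total_pass if total_pass else 0.0
        # Tarantula: share of failing runs covering s, normalized by both shares
        scores[s] = fail_ratio / (fail_ratio + pass_ratio) if fail_ratio + pass_ratio else 0.0
    return sorted(scores.items(), key=lambda kv: -kv[1])
```

A statement covered only by failing runs gets score 1.0 and tops the list; one covered only by passing runs gets 0.0.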

Will Fault Localization Tools Really Work?
In the worst case:
– Faulty statements cannot be found early in the ranked list of statements
– Time consuming: the developer seems to debug forever, and the tool's output is not effective

Will Fault Localization Tools Really Work?
We build an oracle to predict whether an output of a fault localization tool (i.e., an instance) can be trusted or not. If not trusted:
– Developers do not have to spend time using the output
– Developers can revert to manual debugging

Overall Framework (Training Stage)
(Diagram: spectra are fed to the fault localization tool, producing suspiciousness scores; ① feature extraction turns these into feature vectors; ② model learning combines the vectors with effectiveness labels to produce a model.)

Overall Framework
Major components:
– Feature Extraction: 50 features in 6 categories
– Model Learning: we extend Support Vector Machine (SVM) to handle imbalanced training data

Feature Extraction
Traces (5 features):
– T1: # traces
– T2: # failing traces
– T3: # successful traces
– …
Program Elements (10 features):
– PE1: # program elements in failing traces
– PE2: # program elements in correct traces
– PE3: PE2 – PE1
– …

Feature Extraction
Raw Scores (10 features):
– R1: highest suspiciousness score
– R2: second highest suspiciousness score
– Ri: i-th highest suspiciousness score
– …
Simple Statistics (6 features):
– SS1: number of distinct scores among the top-10 scores
– SS2: mean of the top-10 suspiciousness scores
– SS3: median of the top-10 suspiciousness scores
– …

Feature Extraction
Gaps (11 features):
– G1: R1 – R2
– G2: R2 – R3
– Gi: Ri – R(i+1), where …
Relative Difference (8 features):
– RD1, …, RDi
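A minimal sketch of how the score-based feature groups above (raw scores, simple statistics, gaps) can be derived from a tool's output; the function name and the exact handling of the top-10 scores are illustrative assumptions.

```python
import statistics

def score_features(suspiciousness):
    """Sketch of score-based feature extraction from one fault
    localization instance: the raw top-10 scores (R1..R10), the
    consecutive gaps (Gi = Ri - R(i+1)), and simple statistics."""
    top10 = sorted(suspiciousness, reverse=True)[:10]
    raw = list(top10)                                   # R1, R2, ...
    gaps = [a - b for a, b in zip(top10, top10[1:])]    # G1, G2, ...
    simple_stats = {
        "distinct": len(set(top10)),          # SS1: distinct top-10 scores
        "mean": statistics.mean(top10),       # SS2
        "median": statistics.median(top10),   # SS3
    }
    return raw, gaps, simple_stats
```

Together with the trace and program-element counts, vectors like these form the 50-feature input to the classifier.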

Model Learning
Extend off-the-shelf Support Vector Machine
Imbalanced training data:
– #ineffective instances > #effective instances
⇒ Extended Support Vector Machine (SVM_EXT)
(Diagram: a maximum-margin hyperplane separating effective from ineffective instances.)

SVM_EXT
For each effective instance:
– We calculate its similarities to ineffective instances
– Each instance is represented by a feature vector
– We use cosine similarity: sim(a, b) = (a · b) / (|a| |b|)

SVM_EXT
– Sort effective instances by their highest similarity to an ineffective instance (descending)
– Duplicate effective instances from the top of the list until the training data is balanced
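The two steps above can be sketched as follows. Feature vectors are plain tuples here, and the function names (`cosine`, `balance`) are illustrative; the paper's implementation details may differ.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def balance(effective, ineffective):
    """Sketch of the SVM_EXT resampling step: rank effective instances
    by their highest cosine similarity to any ineffective instance,
    then duplicate from the top of that ranking until both classes
    have the same number of instances."""
    ranked = sorted(effective,
                    key=lambda e: max(cosine(e, i) for i in ineffective),
                    reverse=True)
    extra = []
    k = 0
    while len(effective) + len(extra) < len(ineffective):
        extra.append(ranked[k % len(ranked)])
        k += 1
    return effective + extra, ineffective
```

The intuition: effective instances that look most like ineffective ones sit near the decision boundary, so duplicating them shifts the learned hyperplane where it matters.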

Overall Framework (Deployment Stage)
(Diagram: ① spectra are fed to the fault localization tool, producing suspiciousness scores; feature extraction turns these into feature vectors; ③ the learned model performs effectiveness prediction and outputs a prediction.)

Experiments
– We use 10-fold cross validation
– We compute precision, recall, and F-measure

Effectiveness Labeling
A fault localization instance is deemed effective if:
– The root cause is among the top-10 most suspicious program elements
– If a root cause spans more than one program element, at least one of them is in the top-10
Otherwise, it is ineffective.
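The labeling rule is simple enough to state in a few lines; this sketch assumes the ranked list and root-cause elements are given, with k = 10 as in the slides.

```python
def is_effective(ranked_elements, root_cause_elements, k=10):
    """Effectiveness label: an instance is effective iff at least one
    root-cause element appears in the top-k of the ranked list."""
    return any(e in root_cause_elements for e in ranked_elements[:k])
```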

Dataset
10 different programs:
– NanoXML, XML-Security, and Space
– 7 programs from the Siemens test suites
In total, 200 faulty versions. For Tarantula, among the 200 instances:
– 85 are effective
– 115 are ineffective

Research Question 1
How effective is our approach in predicting the effectiveness of a state-of-the-art spectrum-based fault localization tool?
Experimental setting:
– Tarantula
– Using extended SVM (SVM_EXT)

Research Question 1
Precision of 54.36%
– Correctly identify 47 out of 115 ineffective fault localization instances
Recall of 95.29%
– Correctly identify 81 out of 85 effective fault localization instances

Precision 54.36% | Recall 95.29% | F-Measure 69.23%
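Treating "effective" as the positive class, these numbers are mutually consistent: 81 true positives, 115 − 47 = 68 ineffective instances misclassified as effective (false positives), and 85 − 81 = 4 false negatives. A quick check with the standard definitions:

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall, and F-measure (harmonic mean of P and R)."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return p, r, 2 * p * r / (p + r)

p, r, f = precision_recall_f1(tp=81, fp=68, fn=4)
# p ≈ 0.5436, r ≈ 0.9529, f ≈ 0.6923 — matching the slide.
```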

Research Question 2
How effective is our extended Support Vector Machine (SVM_EXT) compared with off-the-shelf Support Vector Machine (SVM)?
Experimental setting:
– Tarantula
– Using extended SVM (SVM_EXT) and off-the-shelf SVM

Research Question 2
Result: SVM_EXT outperforms off-the-shelf SVM

            SVM_EXT   SVM      Improvement
Precision   54.36%    51.04%   6.50%
Recall      95.29%    57.65%   65.29%
F-Measure   69.23%    54.14%   27.87%

Research Question 3
What are the most important features?
– The Fisher score is used to measure how dominant and discriminative a feature is.
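A minimal sketch of the Fisher score for a single feature over binary labels (between-class scatter divided by within-class scatter); the exact variant used in the paper may differ in normalization.

```python
def fisher_score(values, labels):
    """Sketch of the Fisher score for one feature: how far apart the
    per-class means are, relative to the spread within each class.
    Larger scores mean a more discriminative feature."""
    overall_mean = sum(values) / len(values)
    between = within = 0.0
    for cls in set(labels):
        vals = [v for v, l in zip(values, labels) if l == cls]
        mu = sum(vals) / len(vals)
        var = sum((v - mu) ** 2 for v in vals) / len(vals)
        between += len(vals) * (mu - overall_mean) ** 2
        within += len(vals) * var
    return between / within if within else float("inf")
```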

Top-10 Most Discriminative Features
1. RD7
2. RD8
3. RD6
4. PE1
5. PE2
6. SS1
7. RD5
8. RD1
9. PE4
10. R1

Most Important Features
Relative Difference features:
– RD7 (1st), RD8 (2nd), RD6 (3rd), RD5 (7th), and RD1 (8th)

Most Important Features
Program Elements:
– PE1 (4th), PE2 (5th), and PE4 (9th)
(Diagram: PE1 counts program elements in failing traces, PE2 in correct traces.)

Most Important Features
Simple Statistics:
– SS1 (6th): number of distinct suspiciousness scores in {R1, …, R10}
Raw Scores:
– R1 (10th): highest suspiciousness score

Research Question 4
Could our approach be used to predict the effectiveness of different types of spectrum-based fault localization tools?
Experimental setting:
– Tarantula, Ochiai, and Information Gain
– Using extended SVM (SVM_EXT)
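Ochiai differs from Tarantula only in its suspiciousness formula; a sketch, assuming per-statement failing/passing execution counts as input:

```python
import math

def ochiai(n_fail_exec, n_pass_exec, total_fail):
    """Sketch of the Ochiai suspiciousness formula: failing executions
    of a statement, normalized by the geometric mean of the total
    number of failing runs and the statement's total executions."""
    denom = math.sqrt(total_fail * (n_fail_exec + n_pass_exec))
    return n_fail_exec / denom if denom else 0.0
```

Because the oracle's features are computed from the score distribution rather than from any one formula, the same prediction pipeline applies to each technique.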

Research Question 4 ToolPrecisionRecallF-Measure Tarantula54.36%95.29%69.23% Ochiai63.23%97.03%76.56% Information Gain64.47%93.33%76.26% 28 F-Measure for Ochiai and Information Gain –Greater than 75% –Our approach can better predict the effectiveness of Ochiai and Information Gain

Research Question 5
How sensitive is our approach to the amount of training data?
Experimental setting:
– Vary the amount of training data from 10% to 90%
– Random sampling

Research Question 5
(Figure: prediction performance across the varied amounts of training data.)

Conclusion
We build an oracle to predict the effectiveness of fault localization tools:
– Propose 50 features capturing interesting dimensions of traces and suspiciousness scores
– Propose an Extended Support Vector Machine (SVM_EXT)
Experiments:
– Achieve a good F-Measure: 69.23% (Tarantula)
– SVM_EXT outperforms off-the-shelf SVM
– Relative difference features are the best features

Future Work
– Improve F-Measure further
– Extend the approach to work for other fault localization techniques
– Extract more features from source code and textual descriptions, e.g., bug reports

Thank you! Questions? Comments? Advice?
{btdle.2012,