1 Indexing Noncrashing Failures: A Dynamic Program Slicing-Based Approach
Chao Liu, Xiangyu Zhang, Jiawei Han, Yu Zhang, Bharat K. Bhargava
University of Illinois at Urbana-Champaign / Purdue University
02/13/2007
Supported by NSF.

2 Overview
Problem: Automatically cluster program failures that are due to the same bug.
Solution: Measure the similarity between the dynamic slices of the failing executions.

3 Outline
Motivation
Failure Indexing in Formulation
Dynamic Slicing-Based Failure Indexing
Experiments
Conclusion

4 Automated Failure Reporting
End-users as beta testers
- Valuable information about failure occurrences in the real world
- 24.5 million reports/day in Redmond (if all users sent them) – John Dvorak, PC Magazine
Widely adopted because of its usefulness
- Microsoft Windows, Linux Gentoo, Mozilla applications, …
- Any application can implement this functionality

5 Failure Report
Automatic reports (Windows/Mozilla)
- Application name and version (e.g., winword.exe)
- Module name and version (e.g., mso.dll)
- Offset into the module (e.g., 00003cbb)
- Calling context
Manual reports (Bugzilla)
- Textual description of the symptoms
- Failure-inducing input

6 After Failures Are Collected …
Failure triage
- Failure prioritization: which are the most severe bugs? (The worst 1% of bugs account for 50% of failures.)
- Duplicate failure removal: the same failure can be reported multiple times
- Patch suggestion: automatically locate a patch by querying the patch database with the reported failure

7 A Solution: Failure Indexing
Cluster failure reports that may correspond to the same fault.
[Figure: failure reports plotted as clusters, ranked from most severe to least severe]

8 Current Status of Failure Indexing
Great success in indexing crashing failures
- The same crashing venue likely implies the same failure
- E.g., Microsoft Dr. Watson System, Mozilla Quality Feedback Agent, …
Elusive: how to index noncrashing failures
- Noncrashing failures are mainly due to semantic bugs
- Hard to index because crash contexts are no longer available
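To illustrate why crashing failures are comparatively easy to index, here is a hypothetical sketch of venue-based bucketing in the spirit of Dr. Watson-style systems. The first venue reuses the example values from slide 5; the second venue and all field names are made up for this sketch.

```python
from collections import defaultdict

def bucket_crashes(reports):
    """Group crash reports by their crash venue (module + offset)."""
    buckets = defaultdict(list)
    for r in reports:
        venue = (r["module"], r["offset"])  # e.g., ("mso.dll", "00003cbb")
        buckets[venue].append(r)
    return buckets

# Hypothetical reports: two crashes at the same venue, one elsewhere.
reports = [
    {"module": "mso.dll", "offset": "00003cbb"},
    {"module": "mso.dll", "offset": "00003cbb"},
    {"module": "gdi32.dll", "offset": "0000a1f0"},
]
print({venue: len(rs) for venue, rs in bucket_crashes(reports).items()})
```

No comparable "venue" exists for a noncrashing failure, which is exactly the gap the rest of the talk addresses.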

9 Noncrashing Failures
Examples:
- Unwanted dialogs
- Undesired visual outputs, e.g., colors, layouts
- Periodic loss of focus
- Periodic loss of connection
- Abnormal memory consumption
- Abnormal performance
Caused by semantic bugs.

10 Semantic Bugs Dominate
Bug distribution [Li et al., ICSE'07]: 264 bugs in Mozilla and 98 bugs in Apache manually checked; 29,000 bugs in Bugzilla automatically checked.
- Semantic bugs: application-specific; only a few are detectable, and mostly require annotations or specifications
- Memory-related bugs: many are detectable
- Others: e.g., concurrency bugs
(Chart courtesy of Zhenmin Li.)

11 Existing Approaches to Indexing Noncrashing Failures
T-Proximity [Podgurski et al., ICSE 2003]
- Failures exhibiting similar behaviors (e.g., similar branchings) are indexed together
- The entire execution is considered
R-Proximity [Liu and Han, FSE 2006]
- Failures likely due to the same bug are indexed together
- The bug location for each failure is found automatically by the statistical debugging tool SOBER [Liu et al., FSE 2005]

12 Comments on Existing Approaches
Ideal solution (possible through manual effort)
- Index by root cause (i.e., the exact fault location)
- But finding the root cause of every failure is exactly what failure indexing is meant to circumvent
T-Proximity
- Indexes based on the entire execution
- But usually only a small part of an execution is failure-relevant
R-Proximity
- Indexes by likely fault location, which is failure-relevant
- Better quality than T-Proximity, but requires a set of passing executions to find the likely fault location
Theme of this paper
- Can we index noncrashing failures as effectively as R-Proximity without any passing executions?

13 Outline
Motivation
Failure Indexing in Formulation
Dynamic Slicing-Based Failure Indexing
Experiments
Conclusion

14 Failure Indexing in Formulation
A failure indexing technique is a pair of functions (σ, δ):
- σ: a signature function that represents a failing execution in a certain way
- δ: a distance function that calculates the dissimilarity between two failure signatures
Indexing result
- A proximity matrix M whose (i, j) cell is the dissimilarity between failures f_i and f_j, i.e., M(i, j) = δ(σ(f_i), σ(f_j))
- Failures f_i and f_j are indexed together if M(i, j) is small
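A minimal sketch of this formulation in Python; the names (`Signature`, `Distance`, `proximity_matrix`) are illustrative, not the paper's API:

```python
from typing import Callable, FrozenSet, List

# A signature maps a failing run (here, some identifier of its trace)
# to a set of statement IDs, e.g., a dynamic slice.
Signature = Callable[[str], FrozenSet[int]]
# A distance maps two signatures to a dissimilarity in [0, 1].
Distance = Callable[[FrozenSet[int], FrozenSet[int]], float]

def proximity_matrix(failures: List[str], sig: Signature, dist: Distance):
    """M[i][j] = dissimilarity between failures i and j; small values
    suggest the two failures should be indexed together."""
    sigs = [sig(f) for f in failures]
    n = len(sigs)
    return [[dist(sigs[i], sigs[j]) for j in range(n)] for i in range(n)]
```

The decomposition matters: swapping in a different signature (branch profile, predicted fault location, dynamic slice) or distance yields T-Proximity, R-Proximity, or the slicing-based variants below.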

15 Metrics for Indexing Effectiveness
No quantitative metric for indexing effectiveness existed previously.
Indexing effectiveness
- Cohesion: to what extent failures due to the same bug are close to each other
- Separation: to what extent failures due to different bugs are separated from each other
Silhouette coefficient
- A measure adapted from data mining
- Ranges from -1 to 1; the higher the better
- More details in the paper (Section 2.2)
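For reference, the standard silhouette coefficient from the clustering literature, which the slide says the paper adapts (this is the textbook form, not necessarily the paper's exact variant):

```latex
s(i) = \frac{b(i) - a(i)}{\max\{\, a(i),\; b(i) \,\}}
```

where a(i) is the average distance from failure i to the other failures in its own group (cohesion), b(i) is the smallest average distance from i to the failures of any other group (separation), and the overall coefficient is the mean of s(i) over all failures.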

16 Outline
Motivation
Failure Indexing in Formulation
Dynamic Slicing-Based Failure Indexing
Experiments
Conclusion

17 Dynamic Slicing-Based Failure Indexing Dynamic slicing as the failure signature function

18 Dynamic Slicing
A full dynamic slice (FS) is the set of statements that DID affect the value of a variable at a program point for ONE specific execution [Korel and Laski, 1988].

    ......
    10. A = ...
        B = ...
    ......
    30. P = ...
    31. if (P < 0) {
    35.     A = A ...
        }
    37. B = B + 1
    ......
    40. Error(A)

FS = {10, 30, 35, 40}

19 Data Slicing
A data slice (DS) considers only data dependences, in contrast to the full dynamic slice (FS), which follows both data and control dependences [Korel and Laski, 1988].
For the same example program as the previous slide:

DS = {10, 35, 40}
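Neither slide shows how the slices are computed. Here is a minimal sketch, assuming a profiler has already recorded a per-execution dependence graph; the graph below is hypothetical and only loosely mirrors the slide's example (in particular, which predicate lines end up in the full slice depends on how lines 30/31 are numbered).

```python
DATA, CTRL = "data", "ctrl"

# deps[s] = (kind, source) pairs observed at runtime for statement s.
# Hypothetical graph roughly following the example on slide 18.
deps = {
    40: [(DATA, 35)],               # Error(A) reads A defined at 35
    35: [(DATA, 10), (CTRL, 31)],   # A = A ... reads A from 10, guarded by 31
    31: [(DATA, 30)],               # if (P < 0) reads P defined at 30
    37: [(DATA, 20)],               # B = B + 1; irrelevant to slicing on A
}

def backward_slice(criterion, kinds):
    """Statements reachable backward from `criterion` along edges in `kinds`."""
    slice_, work = {criterion}, [criterion]
    while work:
        s = work.pop()
        for kind, src in deps.get(s, []):
            if kind in kinds and src not in slice_:
                slice_.add(src)
                work.append(src)
    return slice_

full_slice = backward_slice(40, {DATA, CTRL})  # both dependence kinds
data_slice = backward_slice(40, {DATA})        # data dependences only
print(sorted(full_slice))  # [10, 30, 31, 35, 40] under this hypothetical graph
print(sorted(data_slice))  # [10, 35, 40]
```

The data slice drops the predicate machinery and keeps only the value-flow into the slicing criterion, which is why it is smaller and, per the experiments, often more discriminative.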

20 Distance between Dynamic Slices
For any two non-empty dynamic slices S1 and S2 of the same program, the distance between them is a normalized measure of the dissimilarity of the two statement sets.
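The transcript drops the formula from this slide. As an assumption, and not necessarily the paper's exact definition, a natural choice for set-valued signatures is the Jaccard distance:

```python
def slice_distance(s1: frozenset, s2: frozenset) -> float:
    """Jaccard distance: 0 for identical slices, 1 for disjoint ones.
    Slices are non-empty per the slide, so the union is never empty."""
    return 1.0 - len(s1 & s2) / len(s1 | s2)

# E.g., two slices sharing three of four statements are distance 0.25 apart.
print(slice_distance(frozenset({10, 30, 35, 40}), frozenset({10, 35, 40})))
```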

21 Outline
Motivation
Failure Indexing in Formulation
Dynamic Slicing-Based Failure Indexing
Experiments
Conclusion

22 Experiment Result
Experiment setup
- Benchmark (gzip 1.2.3) obtained from the Software-artifact Infrastructure Repository (SIR, University of Nebraska-Lincoln), together with a test suite
- 6,184 lines of C code
- Ground-truth determination
[Figure: failing runs partitioned into group 1, group 2, and group 1 & 2]

23 Two Semantic Bugs in Gzip
Ground truth:
- 217 input test cases (executions) in total
- 82 cases fail due to the two faults; none crash
- 65 fail due to Fault 1; 17 fail due to Fault 2
[Code excerpt from deflate.c marking /*Fault 1*/ and /*Fault 2*/]

24 Indexing Result
R-Proximity is the most effective
- Expected, because it uses information from both passing and failing executions
T-Proximity is the worst
- Expected, because it essentially indexes the entire execution rather than the failure-relevant part
FS-Proximity and DS-Proximity
- More effective than T-Proximity because they index on failure-relevant information
- Less effective than R-Proximity because they have no access to passing executions
[Proximity graphs (PG): red crosses are failures due to Fault 1; blue circles are failures due to Fault 2. The axes are meaningless; if two objects are distant in the PG, they are distant in their original space.]
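To actually extract groups from a proximity matrix like the ones visualized here, any off-the-shelf clustering over precomputed distances works. A hypothetical example with SciPy's hierarchical clustering (the matrix values are made up for illustration):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Toy 4-failure proximity matrix: failures 0,1 are near each other,
# as are failures 2,3 (symmetric, zero diagonal).
M = np.array([
    [0.00, 0.10, 0.90, 0.80],
    [0.10, 0.00, 0.85, 0.90],
    [0.90, 0.85, 0.00, 0.15],
    [0.80, 0.90, 0.15, 0.00],
])
Z = linkage(squareform(M), method="average")      # condensed distances in, dendrogram out
print(fcluster(Z, t=0.5, criterion="distance"))   # e.g., [1 1 2 2]: two failure groups
```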

25 Indexing Result: A Closer Look (1)
Data slices precisely capture the error propagation mechanism of Fault 2.
[Proximity graph: red crosses are failures due to Fault 1; blue circles are failures due to Fault 2.]

26 Indexing Result: A Closer Look (2)
Data slices precisely capture the two different error propagation mechanisms of Fault 1.
[Proximity graph: red crosses are failures due to Fault 1; blue circles are failures due to Fault 2.]

27 Observations
- Dynamic slicing-based failure proximity is more effective than T-Proximity
- DS-Proximity is more accurate than FS-Proximity
- DS-Proximity produces more cohesive individual clusters; however, clusters belonging to the same bug may be distant due to different error propagations
- Not as good as R-Proximity, but does not require reports of passing runs

28 Outline
Motivation
Failure Indexing in Formulation
Dynamic Slicing-Based Failure Indexing
Experiments
Conclusion

29 Conclusions
Indexing noncrashing failures
- An increasingly important problem as crashing failures are handled better and better
- Not yet intensively studied
Dynamic slicing-based failure indexing
- Effective, and does not rely on passing executions
A framework to develop and evaluate more indexing techniques
- Decomposes an indexing technique into a signature function and a distance function, admitting many instantiations
- Quantitative evaluation metrics for scientific study

30 Further discussion, contact