
Scalable Statistical Bug Isolation
Ben Liblit, Mayur Naik, Alice Zheng, Alex Aiken, and Michael Jordan, 2005
University of Wisconsin, Stanford University, and UC Berkeley
Presented by Mustafa Dajani, 27 Nov 2006, CMSC 838P

Overview of the Paper
- Explains a statistical debugging algorithm that can isolate bugs in programs containing multiple undiagnosed bugs
- Shows a practical, scalable algorithm for isolating multiple bugs in many software systems
- Outline: Introduction, Background, Cause Isolation Algorithm, Experiments

Objective of the Study:
- Develop a statistical algorithm to hunt for the causes of failures
- Crash reporting systems are useful for collecting data
- Actual executions are a vast resource
- Use the feedback data to isolate the causes of failures

Introduction
Statistical debugging
- a dynamic analysis for detecting the causes of run failures
- an instrumented program monitors its own behavior by sampling information during execution
- this involves testing predicates at particular events during the run
Predicates, P
- candidate bug predictors; large programs may contain thousands of predicates
Feedback report, R
- records whether a run succeeded or failed

Introduction
The study’s model of behavior: “If P is observed to be true at least once during run R, then R(P) = 1; otherwise R(P) = 0.”
- In other words, the instrumentation counts how often “P observed true” and “P observed”, using random sampling
- Previous work used regularized logistic regression, which tries to select predicates that determine the outcome of every run
- That approach produces redundant predicates and has difficulty isolating multiple bugs
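To make the counting model concrete, here is a minimal sketch of sampled predicate counting. The counter names, the observe() helper, and the flat 1-in-100 sampling rate are illustrative assumptions; the actual instrumentation uses a more efficient countdown-based sampler and tracks many predicates per site.

#include <stdio.h>
#include <stdlib.h>

/* Per-predicate counters reported back in the feedback report R. */
static unsigned long p_observed = 0;  /* times the site was sampled            */
static unsigned long p_true = 0;      /* times the predicate held when sampled */

/* Record one (possibly sampled) visit to the instrumentation site. */
static void observe(int predicate_holds)
{
    if (rand() % 100 == 0) {          /* sample roughly 1 visit in 100 */
        p_observed++;
        if (predicate_holds)
            p_true++;
    }
}

int main(void)
{
    srand(12345);
    for (int i = 0; i < 1000000; i++) {
        int x = rand() % 10;
        observe(x == 0);              /* e.g. the branch predicate "x == 0" */
    }
    /* R(P) = 1 if P was observed true at least once during this run. */
    printf("P observed %lu times, true %lu times, R(P) = %d\n",
           p_observed, p_true, p_true > 0 ? 1 : 0);
    return 0;
}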

Introduction
Study design:
- gather the set of candidate predicates
- eliminate predicates that have no predictive power
- loop:
  - rank the surviving predicates by Importance
  - remove the top-ranked predicate P
  - discard all runs R, passing and failing, in which P was observed true (R(P) = 1)
  - repeat until the set of runs or the set of predicates is empty
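A minimal sketch of this elimination loop on toy data follows. The run/predicate matrix, the failure labels, and the placeholder score() function are assumptions for illustration; the paper ranks surviving predicates with the Importance metric discussed later.

#include <stdio.h>

#define NRUNS  6
#define NPREDS 4

/* R[r][p] = 1 if predicate p was observed true during run r (toy data). */
static const int R[NRUNS][NPREDS] = {
    {1, 0, 1, 0},
    {1, 1, 0, 0},
    {0, 1, 0, 1},
    {0, 0, 1, 0},
    {1, 0, 0, 0},
    {0, 0, 0, 1},
};
static const int failed[NRUNS] = {1, 1, 1, 0, 0, 0};

/* Placeholder ranking score: fraction of still-live failing runs that
   predicate p predicts.  Any reasonable importance measure fits here. */
static double score(int p, const int live_run[])
{
    int predicted = 0, failing = 0;
    for (int r = 0; r < NRUNS; r++) {
        if (!live_run[r] || !failed[r])
            continue;
        failing++;
        if (R[r][p])
            predicted++;
    }
    return failing ? (double)predicted / failing : 0.0;
}

int main(void)
{
    int live_run[NRUNS], live_pred[NPREDS];
    int runs = NRUNS, preds = NPREDS;
    for (int r = 0; r < NRUNS; r++) live_run[r] = 1;
    for (int p = 0; p < NPREDS; p++) live_pred[p] = 1;

    while (runs > 0 && preds > 0) {
        /* Rank the surviving predicates and take the top one. */
        int best = -1;
        double best_score = -1.0;
        for (int p = 0; p < NPREDS; p++) {
            if (!live_pred[p])
                continue;
            double s = score(p, live_run);
            if (s > best_score) {
                best = p;
                best_score = s;
            }
        }

        printf("next bug predictor: P%d (score %.2f)\n", best, best_score);

        /* Remove the top-ranked predicate ... */
        live_pred[best] = 0;
        preds--;

        /* ... and discard every run, passing or failing, in which it was
           observed true, so redundant predictors of the same bug fade away. */
        for (int r = 0; r < NRUNS; r++) {
            if (live_run[r] && R[r][best]) {
                live_run[r] = 0;
                runs--;
            }
        }
    }
    return 0;
}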

Bug Isolation Architecture
[Diagram: Program Source → Compiler + Sampler → Predicates → Shipping Application → Counts and pass/fail outcomes → Statistical Debugging → Top bugs with likely causes]

Depicting failures through P

Failure(P) = F(P) / (F(P) + S(P))

where
F(P) = number of failing runs in which P was observed true
S(P) = number of successful runs in which P was observed true

Consider this code fragment:

if (f == NULL) {
    x = 0;
    *f;
}

When does a program fail?

Valid pointer assignment:

if (…)
    f = …some valid pointer…;
*f;
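Putting the two fragments together, a minimal complete program makes the failure condition concrete. The main function, the command-line switch, and the read through f are assumptions added so the example compiles and crashes deterministically.

#include <stdio.h>

static int x;
static int *f;           /* starts out NULL */

int main(int argc, char **argv)
{
    (void)argv;
    if (argc > 1)        /* "if (…)" from the second fragment       */
        f = &x;          /* …some valid pointer…                    */

    if (f == NULL) {     /* predicate P: "f == NULL" at this branch */
        x = 0;
        x = *f;          /* dereference crashes whenever P is true  */
    }

    printf("run succeeded\n");
    return 0;
}

Runs started with an argument take the valid-pointer path and succeed; runs started without one observe P true and crash. R(P) = 1 exactly for the failing runs, so P is a perfect predictor of this bug (Failure(P) = 1).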

Predicting P’s truth or falsehood

Context(P) = F(P observed) / (F(P observed) + S(P observed))

where
F(P observed) = number of failing runs in which P was observed (sampled at all, whether true or false)
S(P observed) = number of successful runs in which P was observed

The score that matters is Increase(P) = Failure(P) - Context(P): how much more likely failure is when P is actually true than when P is merely reached.
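A small sketch of these scores computed from the four per-predicate counters. The struct and function names are assumptions and the example numbers are invented; the real analysis additionally attaches a statistical confidence interval to Increase(P) before pruning.

#include <stdio.h>

/* Per-predicate counts aggregated over all feedback reports
   (field names are illustrative). */
struct counts {
    unsigned f_true;   /* F(P): failing runs where P observed true       */
    unsigned s_true;   /* S(P): successful runs where P observed true    */
    unsigned f_obs;    /* F(P observed): failing runs where P sampled    */
    unsigned s_obs;    /* S(P observed): successful runs where P sampled */
};

static double failure(const struct counts *c)
{
    return (double)c->f_true / (c->f_true + c->s_true);
}

static double context(const struct counts *c)
{
    return (double)c->f_obs / (c->f_obs + c->s_obs);
}

static double increase(const struct counts *c)
{
    /* Predicates with Increase(P) <= 0 have no predictive power
       and are eliminated before ranking. */
    return failure(c) - context(c);
}

int main(void)
{
    struct counts p = { .f_true = 40, .s_true = 2,
                        .f_obs = 50, .s_obs = 950 };
    printf("Failure = %.3f  Context = %.3f  Increase = %.3f\n",
           failure(&p), context(&p), increase(&p));
    return 0;
}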

Notes
- Two predicates are redundant if they predict the same or nearly the same set of failing runs
- Because elimination is iterative, Importance only needs to select a good predictor at each step, not necessarily the best one

Guide to Visualization
[Legend for the predictor visualization: Increase(P), S(P), error bound, log(F(P) + S(P)), and Context(P)]

Rank by Increase(P)
- High Increase(P) but very few failing runs!
- These are all sub-bug predictors: each covers one special case of a larger bug
- Redundancy is clearly a problem

Rank by F(P)
- Many failing runs but low Increase(P)!
- These tend to be super-bug predictors: each covers several bugs, plus lots of junk

Notes
- In the language of information retrieval:
  - Increase(P) has high precision but low recall
  - F(P) has high recall but low precision
- Standard solution: take the harmonic mean of both, which rewards high scores in both dimensions
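A sketch of that combination in the spirit of the paper's Importance score: the harmonic mean of Increase(P) and a log-scaled, normalized count of failing runs. The exact normalization below (log F(P) divided by log of the total number of failing runs) follows my reading of the paper, and the example values are invented to show how the two failure modes above score low while a balanced predictor scores high.

#include <math.h>
#include <stdio.h>

/* Harmonic mean of two scores: high only when both are high. */
static double harmonic_mean(double a, double b)
{
    if (a <= 0.0 || b <= 0.0)
        return 0.0;
    return 2.0 / (1.0 / a + 1.0 / b);
}

/* Importance(P): balance "how predictive" (Increase) against
   "how many failures it accounts for" (a log-scaled recall term,
   normalized by the total number of failing runs). */
static double importance(double increase, unsigned f_p, unsigned num_failing)
{
    double recall = log((double)f_p) / log((double)num_failing);
    return harmonic_mean(increase, recall);
}

int main(void)
{
    /* A precise but narrow predictor, a broad but noisy one,
       and a balanced one. */
    printf("sub-bug predictor:   %.3f\n", importance(0.90,   3, 1000));
    printf("super-bug predictor: %.3f\n", importance(0.05, 800, 1000));
    printf("good predictor:      %.3f\n", importance(0.70, 400, 1000));
    return 0;
}

With these numbers the sub-bug and super-bug predictors both score well below the balanced predictor, which is exactly the behavior the harmonic mean is chosen for.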

Rank by Harmonic Mean
- It works! Top-ranked predictors have a large Increase, many failing runs, and few or no successful runs
- But redundancy is still a problem, which is what the iterative elimination loop addresses

Lessons Learned
- We can learn a lot from actual executions:
  - Users are running buggy code anyway
  - We should capture some of that information
- Crash reporting is a good start, but…
  - Pre-crash behavior can be important
  - Successful runs reveal correct behavior
  - The stack alone is not enough for 50% of bugs