Scalable Statistical Bug Isolation
Authors: B. Liblit, M. Naik, A. X. Zheng, A. Aiken, M. I. Jordan
Presented by S. Li

Outline
Background
– Definitions
– Motivations
Contributions
– Algorithm
– Visualization of the algorithm
– Experiments
Personal Comments

Definitions
A bug predicate is denoted by P
– Each P is associated with a particular program point, for instance if (ptr == NULL) or int a = 10;
– …, if P1 and P2 are associated with the same program point.
– If P is observed to be true at least once during a run R, we write R(P) = 1; otherwise R(P) = 0
A bug is denoted by B (or b)
A bug profile is denoted by …

Definitions (cont.)
A bug profile includes a set of failing runs that share b as the cause.
– More than one bug may occur in a failing run
– R(P) = 1 indicates that P is a bug predictor; likely, …

Motivations
A traditional technique in statistical bug prediction
– Regularized logistic regression, which selects the predicates that best predict the outcome of every run
Scalability problems arise in large-scale programs

Motivations (cont.)
Scalability problems of regularized logistic regression
– The set of predicates P is logically redundant
– It is difficult to identify the truly important predicates associated with the specific bugs that cause different failures.

Contributions
Highlighted contributions
– To propose a statistical debugging algorithm that isolates bugs in programs containing multiple undiagnosed bugs
– To perform better than earlier comparable algorithms
– To validate the algorithm by experiments
– To reveal the circumstances under which bugs occur, as well as the frequencies of failing runs

Statistical Debugging algorithm
To automatically isolate multiple bugs
To select a set of predictors S
To rank the predictors in S from most to least important
To make the predictors in S and their associated metrics available to help fix the most serious bugs

Statistical Debugging algorithm (cont.)
Steps:
– Identify the most important bug B
  Not bug B itself, but a predicate P closely correlated with its bug profile
– Fix B, and repeat
  Simulating the program's behavior without bug b

Statistical Debugging algorithm (cont.)
To identify the bug
To select the predicates that are most likely to correspond to its bug profile
– P1, P2, P3, …, P, ranked in order of importance
– R(P) = 1
– Bug profiles have unknown size and membership

Statistical Debugging algorithm (cont.)
To simulate fixing bug B and repeat
– Discard any run such that R(P) = 1
– Recursively apply the algorithm to the remaining runs
– Prune P = {P1, P2, P3, …, PB} by:
  Reducing the importance of the predictors of B
  Re-ranking the predictors P, allowing other predictors to rise to the top in subsequent iterations.
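The discard-and-repeat loop on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Run` type, the `isolate` function, and the stand-in score `failing_minus_successful` are hypothetical names, and the score is only a placeholder for whatever importance metric is used to rank predicates.

```python
from typing import Callable, FrozenSet, Iterable, List, NamedTuple


class Run(NamedTuple):
    failed: bool            # outcome of the run
    true: FrozenSet[str]    # predicates P with R(P) = 1 in this run


def failing_minus_successful(runs: List[Run], p: str) -> float:
    """Placeholder score (an assumption): failing minus successful runs with R(P) = 1."""
    f = sum(1 for r in runs if r.failed and p in r.true)
    s = sum(1 for r in runs if not r.failed and p in r.true)
    return f - s


def isolate(runs: Iterable[Run], predicates: Iterable[str],
            score: Callable[[List[Run], str], float]) -> List[str]:
    """Repeatedly pick the top-ranked predicate, then discard every run where it was true."""
    remaining = list(runs)
    candidates = set(predicates)
    ranked: List[str] = []
    while candidates and any(r.failed for r in remaining):
        # sorted() makes tie-breaking deterministic
        best = max(sorted(candidates), key=lambda p: score(remaining, p))
        if score(remaining, best) <= 0:
            break  # nothing left that predicts failure
        ranked.append(best)
        candidates.remove(best)
        # Simulate fixing the bug: drop all runs with R(best) = 1
        remaining = [r for r in remaining if best not in r.true]
    return ranked


runs = (
    [Run(True, frozenset({"A", "C"}))]
    + [Run(True, frozenset({"A"}))] * 2
    + [Run(True, frozenset({"B"}))] * 2
    + [Run(False, frozenset({"C"}))] * 4
    + [Run(False, frozenset())] * 2
)
print(isolate(runs, ["A", "B", "C"], failing_minus_successful))
```

Because scores are recomputed on the remaining runs in each round, a predictor of a second, rarer bug can rise to the top once the runs explained by the first predictor are gone.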

Statistical Debugging algorithm (cont.)
A simple piece of code is analyzed to introduce the equations used in the algorithm. Consider the following C code:
  f = …;              /* line (a) */
  if (f == NULL) {    /* line (b) */
    x = 0;            /* line (c) */
    *f;               /* line (d) */
  }
The bug in this example is deterministic, because…

Statistical Debugging algorithm (cont.)
For a non-deterministic bug, consider the following code:
  f = …;              /* line (a) */
  if (f == NULL) {    /* line (b) */
    x = 0;            /* line (c) */
    if (…) f = …;     /* some valid pointer… */
    *f;               /* line (d) */
  }
The bug in this example is non-deterministic with respect to (b)

Statistical Debugging algorithm (cont.)
Failure(P): the probability that P being true implies failure.
Failure(P) = F(P) / (S(P) + F(P))
F(P): the number of failing runs in which P is observed to be true.
S(P): the number of successful runs in which P is observed to be true.

Statistical Debugging algorithm (cont.)
Failure(P) = 1.0: the bug is deterministic with respect to P; equivalently, P is never observed to be true in a successful run, S(P) = 0
Failure(P) < 1.0: the bug is non-deterministic with respect to P
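A small sketch of the score, taking Failure(P) as the ratio F(P) / (S(P) + F(P)), which is consistent with the statement that Failure(P) = 1.0 exactly when S(P) = 0. The function names are illustrative assumptions.

```python
def failure(f_true: int, s_true: int) -> float:
    """Fraction of the runs with P observed true that failed.

    f_true is F(P), s_true is S(P).
    """
    total = f_true + s_true
    if total == 0:
        raise ValueError("P was never observed to be true in any run")
    return f_true / total


def is_deterministic(f_true: int, s_true: int) -> bool:
    """The bug is deterministic with respect to P when Failure(P) = 1.0, i.e. S(P) = 0."""
    return f_true > 0 and s_true == 0


print(failure(5, 0), is_deterministic(5, 0))
print(failure(3, 1), is_deterministic(3, 1))
```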

Statistical Debugging algorithm (cont.)
Failure(P) alone is not enough. Consider again:
  f = …;              /* line (a) */
  if (f == NULL) {    /* line (b) */
    x = 0;            /* line (c) */
    *f;               /* line (d) */
  }
Failure(f == NULL) = 1.0, good
But Failure(x == 0) = 1.0 as well. Why? x == 0 is always true at line (c), and only failing runs reach it

Statistical Debugging algorithm (cont.)
Thus a high Failure(P) does not mean that P is the cause of a bug; it only means that the predicate is checked on a path taken by failing runs.
In the case of (x == 0), the decision that causes the failure is made earlier, at (f == NULL)

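Statistical Debugging algorithm (cont.)
Increase(P) = Failure(P) − Context(P) is introduced to address this issue, where Context(P) is the probability of failure given only that the point where P is checked is reached.
A predicate is judged not only by the chance that it implies failure, but also by how much that chance rises when P is observed to be true compared with simply reaching the point where P is checked.
This eliminates predicates irrelevant to the bug, like (x == 0) in the above example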

Statistical Debugging algorithm (cont.)
In the above example, Failure(x == 0) = Context(x == 0) = 1.0, and so Increase(x == 0) = 0.
Conclusion: a predicate P whose Increase(P) is not positive is of no use for prediction and can be discarded.
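The x == 0 example can be reproduced with a few made-up runs. This sketch assumes Context(P) is the fraction of failing runs among those that reach P's program point, and Increase(P) = Failure(P) − Context(P), which matches the values quoted on the slide. The `Run` type and the run data are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Run:
    failed: bool
    observed: FrozenSet[str]   # predicates whose program point was reached
    true: FrozenSet[str]       # predicates observed true, R(P) = 1


def _failing_fraction(selected: Iterable[Run]) -> float:
    selected = list(selected)
    if not selected:
        return 0.0
    return sum(r.failed for r in selected) / len(selected)


def failure(runs: List[Run], p: str) -> float:
    """Failing fraction among runs where P was observed true."""
    return _failing_fraction(r for r in runs if p in r.true)


def context(runs: List[Run], p: str) -> float:
    """Failing fraction among runs that merely reached P's program point."""
    return _failing_fraction(r for r in runs if p in r.observed)


def increase(runs: List[Run], p: str) -> float:
    return failure(runs, p) - context(runs, p)


# Illustrative data: two runs take the f == NULL branch and crash,
# three runs skip the branch and succeed.
both = frozenset({"f == NULL", "x == 0"})
runs = (
    [Run(True, both, both)] * 2
    + [Run(False, frozenset({"f == NULL"}), frozenset())] * 3
)

for p in ("f == NULL", "x == 0"):
    print(p, failure(runs, p), context(runs, p), increase(runs, p))
```

Both predicates have Failure = 1.0, but only f == NULL keeps a positive Increase, because x == 0 is reached solely by runs that were already going to fail.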

Visualization of Algorithm
A thermometer is used to visualize the experimental results
The length of the thermometer reflects the number of runs in which the predicate is observed
Black band on the left: Context(P)
Red band: Increase(P)
White band: number of successful runs, S(P)

Visualization of Algorithm (cont.)
The figure shows predicates ranked by F(P) after discarding those with negative Increase(P)
The large white bands reveal that these predicates are non-deterministic.
The very narrow red bands indicate that the Increase scores are small.
With high Increase scores
Super-bug predicate, combining multiple bugs

Visualization of Algorithm cont.

The following suggestions for predicate metrics are drawn from the above observations

Experiments
To validate the statistical debugging algorithm in five case studies.
To determine how many runs are needed, let Importance_N(P) be the importance of P computed using N runs. The number of runs needed is the smallest N such that Importance_32,000(P) − Importance_N(P) < 0.2
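The run-count criterion can be sketched as a search over sample sizes. The function name `min_runs_needed` is hypothetical and the importance values below are made-up placeholders, not results from the paper. How Importance_N(P) itself is computed is outside this sketch.

```python
from typing import Dict


def min_runs_needed(importance_by_n: Dict[int, float],
                    full_n: int = 32000,
                    threshold: float = 0.2) -> int:
    """Smallest N with Importance_full_n(P) - Importance_N(P) < threshold.

    importance_by_n maps a sample size N to Importance_N(P) and must
    contain an entry for full_n.
    """
    full = importance_by_n[full_n]
    for n in sorted(importance_by_n):
        if full - importance_by_n[n] < threshold:
            return n
    return full_n  # only reached if threshold <= 0


# Placeholder scores for a single predicate at increasing sample sizes.
scores = {100: 0.10, 1000: 0.45, 5000: 0.62, 32000: 0.70}
print(min_runs_needed(scores))
```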

Experiments

Personal Comments
Likes
– Well structured: problems are stated first, then the proposed solutions, step by step
– A real and simple example is used to explain the problems and difficulties in the research
– Statistical interpretation is given through visualization, using the observations to explain the abstract mathematical equations

Personal Comments
Dislikes
– The authors do not mention whether their research could be extended to isolate potential bugs, e.g. less important bugs that will probably cause failures in the future
– The dark gray (red) and light gray (pink) bands in the figures are hard to tell apart if the paper is printed in black and white.
