Automated Debugging with Error Invariants TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AAA A A A AA A A Chanseok Oh.

Slides:



Advertisements
Similar presentations
1 Verification by Model Checking. 2 Part 1 : Motivation.
Advertisements

DATAFLOW TESTING DONE BY A.PRIYA, 08CSEE17, II- M.s.c [C.S].
Delta Debugging and Model Checkers for fault localization
SPEED: Precise & Efficient Static Estimation of Symbolic Computational Complexity Sumit Gulwani MSR Redmond TexPoint fonts used in EMF. Read the TexPoint.
Automatic Verification Book: Chapter 6. What is verification? Traditionally, verification means proof of correctness automatic: model checking deductive:
50.530: Software Engineering Sun Jun SUTD. Week 10: Invariant Generation.
Data-Flow Analysis Framework Domain – What kind of solution is the analysis looking for? Ex. Variables have not yet been defined – Algorithm assigns a.
Mahadevan Subramaniam and Bo Guo University of Nebraska at Omaha An Approach for Selecting Tests with Provable Guarantees.
Chair of Software Engineering From Program slicing to Abstract Interpretation Dr. Manuel Oriol.
1/20 Generalized Symbolic Execution for Model Checking and Testing Charngki PSWLAB Generalized Symbolic Execution for Model Checking and Testing.
Timed Automata.
Rigorous Software Development CSCI-GA Instructor: Thomas Wies Spring 2012 Lecture 13.
Discussion #33 Adjacency Matrices. Topics Adjacency matrix for a directed graph Reachability Algorithmic Complexity and Correctness –Big Oh –Proofs of.
Pushdown Systems Koushik Sen EECS, UC Berkeley Slide Source: Sanjit A. Seshia.
Programming Logic and Design Sixth Edition
FIT FIT1002 Computer Programming Unit 19 Testing and Debugging.
The Software Model Checker BLAST by Dirk Beyer, Thomas A. Henzinger, Ranjit Jhala and Rupak Majumdar Presented by Yunho Kim Provable Software Lab, KAIST.
An Optimization Approach to Improving Collections of Shape Maps Andy Nguyen, Mirela Ben-Chen, Katarzyna Welnicka, Yinyu Ye, Leonidas Guibas Computer Science.
1 Formal Methods in SE Qaisar Javaid Assistant Professor Lecture # 11.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu.
What Went Wrong? Alex Groce Carnegie Mellon University Willem Visser NASA Ames Research Center.
Michael Ernst, page 1 Improving Test Suites via Operational Abstraction Michael Ernst MIT Lab for Computer Science Joint.
Overview of program analysis Mooly Sagiv html://
DAST 2005 Tirgul 6 Heaps Induction. DAST 2005 Heaps A binary heap is a nearly complete binary tree stored in an array object In a max heap, the value.
Searching in a Graph CS 5010 Program Design Paradigms “Bootcamp” Lesson 8.4 TexPoint fonts used in EMF. Read the TexPoint manual before you delete this.
Automated Diagnosis of Software Configuration Errors
Describing algorithms in pseudo code To describe algorithms we need a language which is: – less formal than programming languages (implementation details.
Regular Model Checking Ahmed Bouajjani,Benget Jonsson, Marcus Nillson and Tayssir Touili Moran Ben Tulila
Topic #10: Optimization EE 456 – Compiling Techniques Prof. Carl Sable Fall 2003.
CUTE: A Concolic Unit Testing Engine for C Technical Report Koushik SenDarko MarinovGul Agha University of Illinois Urbana-Champaign.
Software (Program) Analysis. Automated Static Analysis Static analyzers are software tools for source text processing They parse the program text and.
Locating Causes of Program Failures Texas State University CS 5393 Software Quality Project Yin Deng.
Do … while ( continue_cond ) Syntax: do { stuff you want to happen in the loop } while (continue_condition);
Bug Localization with Machine Learning Techniques Wujie Zheng
1 Inference Rules and Proofs (Z); Program Specification and Verification Inference Rules and Proofs (Z); Program Specification and Verification.
CEN 4072 Software Testing Chapter 1: How Failures Come To Be.
Extended Static Checking for Java  ESC/Java finds common errors in Java programs: null dereferences, array index bounds errors, type cast errors, race.
Mr. Dave Clausen1 La Cañada High School Chapter 6: Repetition Statements.
Reasoning about programs March CSE 403, Winter 2011, Brun.
References: “Pruning Dynamic Slices With Confidence’’, by X. Zhang, N. Gupta and R. Gupta (PLDI 2006). “Locating Faults Through Automated Predicate Switching’’,
Java Basics Hussein Suleman March 2007 UCT Department of Computer Science Computer Science 1015F.
Learning Symbolic Interfaces of Software Components Zvonimir Rakamarić.
Using Loop Invariants to Detect Transient Faults in the Data Caches Seung Woo Son, Sri Hari Krishna Narayanan and Mahmut T. Kandemir Microsystems Design.
1 CEN 4072 Software Testing PPT12: Fixing the defect.
Searching in a Graph CS 5010 Program Design Paradigms “Bootcamp” Lesson TexPoint fonts used in EMF. Read the TexPoint manual before you delete this.
SAT-Based Model Checking Without Unrolling Aaron R. Bradley.
Use of SMT Solvers in Verification Thomas Wies New York University.
CUTE: A Concolic Unit Testing Engine for C Koushik SenDarko MarinovGul Agha University of Illinois Urbana-Champaign.
Error Explanation with Distance Metrics Authors: Alex Groce, Sagar Chaki, Daniel Kroening, and Ofer Strichman International Journal on Software Tools for.
/ PSWLAB Evidence-Based Analysis and Inferring Preconditions for Bug Detection By D. Brand, M. Buss, V. C. Sreedhar published in ICSM 2007.
1 Section 8.2 Program Correctness (for imperative programs) A theory of program correctness needs wffs, axioms, and inference rules. Wffs (called Hoare.
Programming Logic and Design Fifth Edition, Comprehensive Chapter 6 Arrays.
CS Class 04 Topics  Selection statement – IF  Expressions  More practice writing simple C++ programs Announcements  Read pages for next.
Finding bugs with a constraint solver daniel jackson. mandana vaziri mit laboratory for computer science issta 2000.
Introduction to Programming G50PRO University of Nottingham Unit 6 : Control Flow Statements 2 Paul Tennent
Abstraction and Abstract Interpretation. Abstraction (a simplified view) Abstraction is an effective tool in verification Given a transition system, we.
FILES AND EXCEPTIONS Topics Introduction to File Input and Output Using Loops to Process Files Processing Records Exceptions.
Debugging and Testing Hussein Suleman March 2007 UCT Department of Computer Science Computer Science 1015F.
The software model checker BLAST Dirk Beyer, Thomas A. Henzinger, Ranjit Jhala, Rupak Majumdar Presented by Yunho Kim TexPoint fonts used in EMF. Read.
Testing and Debugging PPT By :Dr. R. Mall.
Localizing Errors in Counterexample Traces
Reasoning About Code.
Reasoning about code CSE 331 University of Washington.
Automatic Verification
Program Slicing Baishakhi Ray University of Virginia
White-Box Testing.
Test Case Test case Describes an input Description and an expected output Description. Test case ID Section 1: Before execution Section 2: After execution.
50.530: Software Engineering
Predicate Abstraction
Presentation transcript:

Automated Debugging with Error Invariants TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AAA A A A AA A A Chanseok Oh (NYU), Martin Schäf (SRI), Daniel Schwartz-Narbonne (NYU), Thomas Wies (NYU)

Program takes a sequence of integers as input returns the sorted sequence. On the input sequence 14, 11 the program returns 0, 11 instead of 11,14. Faulty Shell Sort [Cleve, Zeller ICSE‘05]

Program takes a sequence of integers as input returns the sorted sequence. On the input sequence 11, 14 the program returns 0, 11 instead of 11,14. Faulty Shell Sort [Cleve, Zeller ICSE‘05] 32: shell_sort(a, argc); Fault Localization: given a faulty program execution, automatically identify root cause of the error.

Landscape of Fault Localization Techniques dynamic static delta debugging distance metrics nearest neighbors cause-effect chains... slicing max-sat Compare failing/successful executions  needs good test suites Analyze program/failing execution  no test suites needed

Program takes a sequence of integers as input returns the sorted sequence. On the input sequence 11, 14 the program returns 0, 11 instead of 11,14. Faulty Shell Sort [Cleve, Zeller ICSE‘05] State-of-the-art static fault localization tool identifies 18 statements as potential locations of the root cause.

New Static Approach: Fault Abstraction

a[2]==0 ∧ size==3 a[2]==0 ∧ h==1 ∧ i==h v==0 ∧ h==1 ∧ j==1 v==0 ∧ h==1 ∧ j==0 a[0]==0 … 13: i++; 14: v = a[i]; 15: j -= h; 18: a[j] = v; a[2]==0 ∧ h==1 ∧ i==2 a[2]==0 32: size = argc;

a[2]==0 ∧ size==3 a[2]==0 ∧ h==1 ∧ i==h v==0 ∧ h==1 ∧ j==1 v==0 ∧ h==1 ∧ j==0 a[0]==0 13: i++; 14: v = a[i]; 15: j -= h; 18: a[j] = v; a[2]==0 ∧ h==1 ∧ i==2 a[2]==0 32: size = argc; 11: do { 12: h /= 3; 13: for (i = h; i < size; i++) { 14: int v = a[i]; 15: for (j = i; j >= h && a[j–h] > v; j -= h) 16: a[j] = a[j-h]; 17: if (i != j) 18: a[j] = v; 19: } 20: } while (h != 1); Things happen in the last iteration of the do-while loop. …

a[2]==0 ∧ size==3 a[2]==0 ∧ h==1 ∧ i==h v==0 ∧ h==1 ∧ j==1 v==0 ∧ h==1 ∧ j==0 a[0]==0 13: i++; 14: v = a[i]; 15: j -= h; 18: a[j] = v; a[2]==0 ∧ h==1 ∧ i==2 a[2]==0 32: size = argc; 11: do { 12: h /= 3; 13: for (i = h; i < size; i++) { 14: int v = a[i]; 15: for (j = i; j >= h && a[j–h] > v; j -= h) 16: a[j] = a[j-h]; 17: if (i != j) 18: a[j] = v; 19: } 20: } while (h != 1);... in the last iteration of the outer for loop. …

a[2]==0 ∧ size==3 a[2]==0 ∧ h==1 ∧ i==h v==0 ∧ h==1 ∧ j==1 v==0 ∧ h==1 ∧ j==0 a[0]==0 13: i++; 14: v = a[i]; 15: j -= h; 18: a[j] = v; a[2]==0 ∧ h==1 ∧ i==2 a[2]==0 32: size = argc; 11: do { 12: h /= 3; 13: for (i = h; i < size; i++) { 14: int v = a[i]; 15: for (j = i; j >= h && a[j–h] > v; j -= h) 16: a[j] = a[j-h]; 17: if (i != j) 18: a[j] = v; 19: } 20: } while (h != 1); The swap in the insertion sort loop is where the array gets its wrong entry. …

a[2]==0 ∧ size==3 a[2]==0 ∧ h==1 ∧ i==h v==0 ∧ h==1 ∧ j==1 v==0 ∧ h==1 ∧ j==0 a[0]==0 13: i++; 14: v = a[i]; 15: j -= h; 18: a[j] = v; a[2]==0 ∧ h==1 ∧ i==2 a[2]==0 32: size = argc; 11: do { 12: h /= 3; 13: for (i = h; i < size; i++) { 14: int v = a[i]; 15: for (j = i; j >= h && a[j–h] > v; j -= h) 16: a[j] = a[j-h]; 17: if (i != j) 18: a[j] = v; 19: } 20: } while (h != 1);... because the array read is outside the array bounds. …

a[2]==0 ∧ size==3 a[2]==0 ∧ h==1 ∧ i==h v==0 ∧ h==1 ∧ j==1 v==0 ∧ h==1 ∧ j==0 a[0]==0 13: i++; 14: v = a[i]; 15: j -= h; 18: a[j] = v; a[2]==0 ∧ h==1 ∧ i==2 a[2]==0 32: size = argc;...because argc should have been decremented by 1. Ouch! 32: shell_sort(a, argc); …

Fault Abstraction 1 size = argc; 2 h = 1; 3 h = h*3+1; 4 assume !(h <= size); 5 h /= 3; 6 i = h; 7 assume (i < size); 8 v = a[i]; 9 j = i; 10 assume !(j >= h && a[j-h] > v); 11 i++; 12 assume (i < size); 13 v = a[i]; 14 j = i; 15 assume (j >= h && a[j-h] > v); 16 a[j] = a[j-h]; 17 j -= h; 18 assume (j >= h && a[j-h] > v); 19 a[j] = a[j-h]; 20 j -= h; 21 assume !(j >= h && a[j-h] > v); 22 assume (i != j); 23 a[j] = v; 24 i++; 25 assume !(i < size); 26 assume (h == 1); Input: error path = program path + input state + expected output state Output: abstract slice of error path

Basic Idea of Fault Abstraction Consider a finite automaton A word that is accepted by the automaton: a b c d a b b b a Eliminate the loops to obtain a simpler accepted word: a a a q0q0 q1q1 q4q4 qfqf a a b a q2q2 q3q3 b c d

Programs as Automata [Heizmann, Hoenicke, Podelski SAS’09] l1l1 l1l1 l2l2 l2l2 l3l3 l3l3 l err l exit y ≤ z x ≥ z x ≥ y x < y x := x + 1 x < z 1: assume y ≤ z; 2: while x < y do x := x + 1; 3: assert x ≤ z; Nodes are program states Edges are program transitions

l1l1 l1l1 l2l2 l2l2 l3l3 l3l3 l err l exit y ≤ z x ≥ z x ≥ y x < y x := x + 1 x < z Programs as Automata [Heizmann, Hoenicke, Podelski SAS’09] But there are no repeating states in executions of error paths? 1: assume y ≤ z; 2: while x < y do x := x + 1; 3: assert x ≤ z; error path = finite word of program statements program = automaton that accepts error paths fault local. = eliminate loops in the error path

Executions are runs of the automaton program states Reachable Program States Error States

Executions are runs of the automaton program states Reachable Program States Error States

Abstractions are OK as long as they do not change whether errors occur program states Abstraction of Reachable Program States Error States

Abstractions are OK as long as they do not change whether errors occur program states Abstraction of Reachable Program States Error States

Our idea: when possible, replace program states with invariants program states Abstraction of Reachable Program States Error States At every program point P, define an invariant I P that: Includes every state possible at P Can only reach an error if P could

An error invariant I for a position p in an error trace τ is a formula over program variables such that 1.all states reachable by executing the prefix of τ up to position p satisfy I 2.all executions of the suffix of τ that start from p in a state that satisfies I, still lead to the same error. Error Invariants s 0 ⊨ Pres err ⊭ Post sp ⊨ I sp ⊨ I

Fault Abstraction program states a b c d a b a error states

Fault Abstraction program states b,c,d a a a b a b c d a b a error states

Fault Abstraction program states b,c,d a a a b a b c d a b a error states Need a suitable notion of state equivalence: Two states are equivalent if, from both states, error states can be reached “for the same reason”.

Restoring Invariants is sufficient program states b,c,d a a a b a b c d a b a error states

Restoring Invariants is sufficient program states a a a a b c d a b a error states How can we compute error invariants, given an error path?

Each statement becomes an SMT assertion Add the violated assertion We now have an UNSAT formula Calculate an invariant following each statement Technically a Craig interpolant Note that they are not always minimal. This forms our automaton states are the invariants edges are the statements Step 1: Encode in SMT a[2]==0 ∧ h==1 ∧ i==h v==0 ∧ h==1 ∧ j==1 v==0 ∧ h==1 ∧ j==0 a[2]==0 ∧ h==1 ∧ i==2 18: a[j] = v 15: j -= h 14: v = a[i] 13: i = i + 1 a[0] != 0 a[0] = 0 …

a[2]==0 ∧ size==3 a[2]==0 ∧ h==1 ∧ i==h v==0 ∧ h==1 ∧ j==1 v==0 ∧ h==1 ∧ j==0 a[0]==0 13: i = h; 13: i++; 14: v = a[i]; 15: j -= h; 18: a[j] = v; a[2]==0 ∧ h==1 ∧ i==2 a[2]==0 32: size = argc;

Real world example: SED Full Program: LOC Error Trace: 985 LOC Full Program: LOC Error Trace: 985 LOC more LOC

Reduced SED trace //I00: true 3: last_regex = 0; //I01: last_regex == 0 && addr = 0x123 42: addr->addr_regex = last_regex; //I02: mem[0x123]==0 && addr=0x123 65: rxb = addr->addr_regex; //I03: rxb==0; num_regs = rxb->re_nsub + 1 //I04 : false

Results sedgzip LOC Full Program LOC Error Trace LOC address- sensitive slice 1151 LOC address- insensitive slice 449

enull e must not be null Human study: with abstract slices, programmers need 60% less time to understand inconsistencies.

In conclusion: 18: a[j] = v 15: j -= h 14: v = a[i] 13: i = i + 1 a[0] != 0 … a[2]==0 ∧ h==1 ∧ i==h v==0 ∧ h==1 ∧ j==1 v==0 ∧ h==1 ∧ j==0 a[2]==0 ∧ h==1 ∧ i==2 a[0] = 0

In conclusion: l1l1 l1l1 l2l2 l2l2 l3l3 l3l3 l err l exit y ≤ z x ≥ z x ≥ y x < y x := x + 1 x < z

In conclusion: l1l1 l1l1 l2l2 l2l2 l3l3 l3l3 l err y ≤ z x ≥ y x < z

a[2]==0 ∧ size==3 a[2]==0 ∧ h==1 ∧ i==h v==0 ∧ h==1 ∧ j==1 v==0 ∧ h==1 ∧ j==0 a[0]==0 … 13: i++; 14: v = a[i]; 15: j -= h; 18: a[j] = v; a[2]==0 ∧ h==1 ∧ i==2 a[2]==0 32: size = argc; In conclusion: Thanks for your attention Thanks for your attention