Specification-Based Error Localization Brian Demsky Cristian Cadar Daniel Roy Martin Rinard Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology

Problem
- Timeline: error introduced -> execution with broken data structure -> crash or unexpected result
- The developer has to trace the symptom back to its cause
- The error may be present but not visible in the test suite

Problem
- Goal is to discover bugs when they corrupt data, not when the effect becomes visible
- Perform frequent consistency checks
- The bug is localized between the first unsuccessful check and the last successful check
- Timeline: error introduced -> execution with broken data structure -> crash or unexpected result

Our Approach
- Inputs: a specification of data structure consistency properties + the program
- The Archie compiler turns the specification into an efficient consistency checker
- Output: an instrumented program with early data structure corruption detection

Architecture
- Model definition rules translate the concrete data structure into an abstract model
- Model consistency constraints are evaluated over the abstract model

Architecture Rationale
Why use the abstract model?
- Model construction separates objects into sets
  - Reachability properties
  - Field values
- Different constraints for objects in different sets
- Appropriate division of complexity
  - Data structure representation complexity encapsulated in the model definition rules
  - Consistency property complexity encapsulated in a (clean, uniform) model constraint language

List Example

structure node {
  node *next;
  value *data;
}
structure value {
  int data;
}
node *head;

Sets and Relations in Model
- Sets of objects:
  set NODE of node
  set VALUE of value
- Relations between objects (values of object fields, referencing relationships between objects):
  relation NEXT : NODE -> NODE
  relation DATA : NODE -> VALUE

Model Translation
Bits are translated to sets and relations in the abstract model using statements of the form:
  Quantifiers, Condition => Inclusion Constraint

  true => head in NODE
  for n in NODE, !n.next = NULL => n.next in NODE
  for n in NODE, !n.next = NULL => <n, n.next> in NEXT
  for n in NODE, !n.data = NULL => n.data in VALUE
  for n in NODE, !n.data = NULL => <n, n.data> in DATA
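The rule evaluation above can be sketched in C. This is our own minimal sketch, not Archie's generated code: it assumes an acyclic list and distinct value objects per node, so a single traversal reaches the fixed point and counting stands in for set membership.

```c
#include <stddef.h>

/* Concrete structures from the list example. */
struct value { int data; };
struct node { struct node *next; struct value *data; };

/* Tuple counts for the abstract model; a real checker stores the sets
   and relations themselves, a detail omitted here for brevity. */
struct model { int nodes; int values; int next_tuples; int data_tuples; };

struct model build_model(struct node *head) {
    struct model m = {0, 0, 0, 0};
    /* true => head in NODE seeds the traversal; each rule below fires
       once per reachable node. */
    for (struct node *n = head; n != NULL; n = n->next) {
        m.nodes++;                                 /* n in NODE */
        if (n->next != NULL) m.next_tuples++;      /* <n, n.next> in NEXT */
        if (n->data != NULL) {
            m.values++;                            /* n.data in VALUE */
            m.data_tuples++;                       /* <n, n.data> in DATA */
        }
    }
    return m;
}
```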

Generated Model
(diagram: NODE objects connected by NEXT edges, with DATA edges to VALUE objects)

Consistency Properties
Form: Quantifiers, Body
- Body is a first-order property of basic propositions:
  - Inequality constraints on numeric fields
  - Cardinality constraints on sizes of sets
  - Referencing relationships for each object
  - Set and relation inclusion constraints
- Example:
  for n in NODE, size(NEXT.n) <= 1
  for v in VALUE, size(DATA.v) = 1
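The cardinality property `for v in VALUE, size(DATA.v) = 1` (every value object is referenced by exactly one DATA edge) can be sketched as a direct check on the list. The function name is ours; the quadratic rescan is for clarity, and an acyclic list is assumed.

```c
#include <stddef.h>

struct value { int data; };
struct node { struct node *next; struct value *data; };

/* Returns 1 if every value reachable through a DATA edge is referenced
   by exactly one node, 0 on the first violation. */
int check_data_unique(struct node *head) {
    for (struct node *n = head; n != NULL; n = n->next) {
        if (n->data == NULL) continue;
        int refs = 0;
        for (struct node *m = head; m != NULL; m = m->next)
            if (m->data == n->data) refs++;     /* count DATA edges into n->data */
        if (refs != 1) return 0;                /* size(DATA.v) != 1: inconsistency */
    }
    return 1;
}
```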

Consistency Violations
Evaluate the consistency properties over the generated model, e.g.
  for v in VALUE, size(DATA.v) = 1
(diagram: a model in which this constraint fails for some VALUE object)
Inconsistency found!

Default Instrumentation

void copynode(struct node *n) {
  struct node *newnode = malloc(sizeof(struct node));
  newnode->data = n->data;
  newnode->next = n->next;
  n->next = newnode;
}  /* <- insert check here */

Instrumentation

void copynode(struct node *n) {
  struct node *newnode = malloc(sizeof(struct node));
  newnode->data = n->data;
  newnode->next = n->next;
  n->next = newnode;
}  /* <- inserted check runs here and reports Pass or Failed */

Performance is a Key Issue
- We would like to perform checks as often as possible
- The cost of consistency checking limits how frequently the program can check
- We have developed compiler optimizations:
  - Fixed point elimination
  - Relation elimination
  - Set elimination
- Key idea: perform checks directly on the data structures, eliminating the model when possible

Fixed Point Elimination
- Evaluation of model definition rules requires a fixed point computation
- Replace the fixed point computation with a more efficient traversal when possible:
  - Compute the dependence graph for the model definition rules
  - Compute its strongly connected components (SCCs)
  - Topologically sort the SCCs
  - Eliminate the fixed point computation for SCCs with no cyclic dependences
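The core of this analysis can be sketched with Kahn's algorithm over the rule dependence graph. This is our own illustration, not Archie's implementation: every rule that receives a topological position depends on no cycle, directly or transitively, and can be evaluated once in that order; the rules left over sit in (or behind) cyclic SCCs and still need fixed point iteration.

```c
#define MAXR 16   /* illustrative bound on the number of rules */

/* Returns how many rules escape the cycles; a return value equal to
   nrules means the whole rule set needs no fixed point computation. */
int acyclic_rule_count(int nrules, int dep[MAXR][MAXR]) {
    int indeg[MAXR] = {0}, order[MAXR], head = 0, tail = 0;
    for (int i = 0; i < nrules; i++)
        for (int j = 0; j < nrules; j++)
            if (dep[i][j]) indeg[j]++;   /* edge i -> j: rule j reads what rule i produces */
    for (int i = 0; i < nrules; i++)
        if (indeg[i] == 0) order[tail++] = i;
    while (head < tail) {                /* standard Kahn worklist */
        int i = order[head++];
        for (int j = 0; j < nrules; j++)
            if (dep[i][j] && --indeg[j] == 0) order[tail++] = j;
    }
    return tail;
}
```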

Relation Elimination
(diagram-only slide: the relation is not constructed; the check reads the underlying fields directly)

Model Definition Rule Inlining

Model Definition Rules:
  for i in 0..C, true => f[i] in S
  for s in S, !s.q = NULL => <s, s.q> in Q
  for s in S, !s.q = NULL => s.q in T

Model Definition Rule Inlining

Model Definition Rules:
  for i in 0..C,
    true => f[i] in S
    !f[i].q = NULL => <f[i], f[i].q> in Q
    !f[i].q = NULL => f[i].q in T

Constraint Inlining

Model Definition Rules:
  for i in 0..C,
    true => f[i] in S
    !f[i].q = NULL => <f[i], f[i].q> in Q
    !f[i].q = NULL => f[i].q in T
Model Constraints:
  for s in S, MIN <= s.r and s.r <= MAX
  for t in T, (Q.t).r != K

Constraint Inlining

Model Definition Rules:
  for i in 0..C,
    true => f[i] in S
    !f[i].q = NULL => <f[i], f[i].q> in Q
    !f[i].q = NULL => f[i].q in T
    MIN <= f[i].r and f[i].r <= MAX
Model Constraints:
  for t in T, (Q.t).r != K

Set Elimination

Model Definition Rules:
  for i in 0..C,
    true => f[i] in S
    !f[i].q = NULL => <f[i], f[i].q> in Q
    !f[i].q = NULL => f[i].q in T
    MIN <= f[i].r and f[i].r <= MAX
Model Constraints:
  for t in T, (Q.t).r != K

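After all three transformations the running example reduces to one loop over the concrete array. The sketch below is ours, with illustrative concrete values standing in for the symbolic constants C, MIN, MAX, and K; no S, T, or Q is ever materialized.

```c
#include <stddef.h>

#define C_SIZE 4     /* stands in for C */
#define MIN 0
#define MAX 100
#define K   13

struct s_obj { struct s_obj *q; int r; };

/* Returns 1 if both model constraints hold, 0 on the first violation. */
int check_inlined(struct s_obj f[C_SIZE]) {
    for (int i = 0; i < C_SIZE; i++) {
        /* from: for s in S, MIN <= s.r and s.r <= MAX */
        if (!(MIN <= f[i].r && f[i].r <= MAX)) return 0;
        /* from: for t in T, (Q.t).r != K, where t = f[i].q when non-NULL
           and f[i] is then an element of Q.t */
        if (f[i].q != NULL && f[i].r == K) return 0;
    }
    return 1;
}
```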
Freeciv Benchmark
- Multiplayer client/server online game
- We looked at the server: 73,000 lines of code
- Added 750 instrumentation sites
- 20,000 consistency checks performed in our sample execution

Performance Evaluation
- Fixed point elimination: 47x speedup
- Relation construction elimination: 110x speedup
- Set construction elimination: 820x speedup
- Bottom line:
  - Baseline compiled version is 5,100 times slower than uninstrumented
  - Optimized version is 6 times slower than uninstrumented
  - The optimized version can be used interactively

User Study
Designed to answer the following question: does inconsistency detection help developers more quickly localize and correct detected data structure corruption errors?

User Study
- Created three buggy versions of Freeciv
- Two groups of three developers:
  - One group used conventional tools
  - One group used specification-based consistency checking
- Each participant was asked to spend at least one hour on each version
- Both populations were given an instrumented version of Freeciv

Results

Extension: Data Structure Repair
- Do not stop the program on inconsistent data
- Instead, use the consistency specification to repair the data structure and keep executing!
- Input: inconsistent data structure; output: consistent data structure
- "Automatic Detection and Repair of Errors in Data Structures" (Demsky and Rinard, OOPSLA 2003)
- Repair enables continued execution; all programs executed successfully after repair
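The flavor of repair, on one property, can be sketched as follows. This is not the OOPSLA 2003 repair algorithm itself: when a check finds a value outside a legal range (T_MIN and T_MAX are illustrative constants of ours, not Freeciv's), the field is mutated minimally so the constraint holds again, and execution continues.

```c
#define T_MIN 0
#define T_MAX 10

/* Returns 1 if a repair was performed, 0 if the value was already
   consistent with T_MIN <= *terrain <= T_MAX. */
int repair_terrain(int *terrain) {
    if (*terrain < T_MIN) { *terrain = T_MIN; return 1; }  /* clamp up */
    if (*terrain > T_MAX) { *terrain = T_MAX; return 1; }  /* clamp down */
    return 0;
}
```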

Related Work
- Specification languages such as UML or Alloy
- Specification-based testing:
  - Korat (Boyapati et al., ISSTA 2002)
  - TestEra (Marinov and Khurshid, ASE 2001)
  - Eiffel (Meyer, 1992)
- Invariant inference and checking:
  - Daikon (Ernst et al., ICSE 1999)
  - DIDUCE (Hangal and Lam, ICSE 2002)
  - Carrot (Pytlik et al., 2003)
- Debugging tools:
  - AskIgor (Zeller, FSE 2002)
  - Debugging Backwards in Time (Lewis, AADEBUG 2003)

Conclusion
- Consistency checking to localize data structure corruption bugs
- Optimizations for good performance
- Experimental confirmation that consistency checking can be useful
- Data structure repair

Results
The case study shows benefit from the approach:
- With the tool:
  - All developers found and fixed all bugs
  - Mean of 9 minutes required
- Without the tool:
  - The three developers found a total of one bug (out of nine developer/bug combinations)
  - The others spent the hour debugging, unsuccessfully

Bugs Introduced
Actual errors in the buggy versions:
- The first error creates invalid terrain values (violates the valid terrain property)
- The second causes two tiles to refer to the same city (violates the single reference property)
- The third causes a city to be placed on ocean (violates the cities-not-in-ocean property)

Consistency Properties
- The map exists: size(MAP) = 1
- The grid of tiles exists: size(GRID) = 1
- Tiles have valid terrain values: for t in TILE, MIN <= t.TERRAIN and t.TERRAIN <= MAX
- Each city has exactly one reference from the grid: for c in CITY, size(CITYMAP.c) = 1
- Cities are not in the ocean: for c in CITY, !(CITYMAP.c).TERRAIN = OCEAN