Data Structure Repair Brian Demsky Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology.

Motivation
[Figure: a broken data structure whose nodes hold the fields F, G, I, and J]
Errors
  Missing elements
  Inappropriate sharing
  Dangling references
  Out-of-bounds array indices
  Inconsistent values

Goal
[Figure: the repair algorithm turns a broken data structure into a consistent data structure]

Goal
[Figure: the repair algorithm takes the broken data structure and the consistency properties from the developer, and produces a consistent data structure]

What Does the Repair Algorithm Produce?
A data structure that
  satisfies the consistency properties, and
  is heuristically close to the broken data structure
Not necessarily the same data structure that a (hypothetical) correct program would produce
But enough to keep the program operating successfully

Where Is This Likely To Be Useful?
Less useful when it is acceptable to reboot
  Must be OK to lose volatile state
  Must be OK to wait for the reboot
  Cause of the error must go away after the reboot
Persistent data structures (file systems, application files)
Autonomous and/or safety-critical systems
  Monitor/control unstable physical phenomena
  Largely independent subcomputations
  Moving time window

Basic Approach
[Figure: model translation maps the broken bits to a broken abstract model; abstract repair produces a repaired abstract model; an automatically generated concrete repair produces the repaired bits]

Developer and System Responsibilities
[Figure: the Basic Approach pipeline with numbered steps 1 to 6, adding the labels Model Definition Rules, Consistency Constraints, and Consistency Check]

Architecture Rationale
Why use the abstract model?
Model construction separates objects into sets
  Reachability properties
  Field values
Different constraints for objects in different sets
Appropriate division of complexity
  Data structure representation complexity is encapsulated in the model definition rules
  Consistency property complexity is encapsulated in (clean, uniform) model constraints

Talk Outline
File System Example
Model Definition Rules
Consistency Constraints
Abstract Repairs
Concrete Repairs
Benchmarks
Specification Inference
Related Work
Future Directions
Conclusion

File System Example
struct disk {
  int blockbitmap;
  entry dir[numentries];
  block block[numblocks];
}
struct entry {
  byte name[Length];
  int firstblock;
}
struct block {
  int nextblock;
  byte data[blocksize];
}
struct blockbitmap subtype block {
  int nextblock;
  bit bitmap[numblocks];
}
[Figure: the example disk, showing its directory entries (one named "intro") and its disk blocks]

File System Model
Sets of objects
  set Block of block : Used | Free
  set Used of block : Bitmap
Relations between objects
  relation Next : Used, Used
  relation BlockStatus : Block, boolean
[Figure: the Block set partitioned into Used and Free, with Bitmap inside Used; Next relates Used blocks and BlockStatus relates blocks to booleans]

Model Translation
Bits are translated to sets and relations in the abstract model using statements of the form:
  Quantifiers, Condition ⇒ Inclusion Constraint
1. ∀ i ∈ [0..numentries-1], 0 ≤ d.dir[i].firstblock ⇒ d.block[d.dir[i].firstblock] ∈ Used
2. ∀ b ∈ Used, 0 ≤ b.nextblock ⇒ ⟨b, d.block[b.nextblock]⟩ ∈ Next
3. ∀ b ∈ Used, 0 ≤ b.nextblock ⇒ d.block[b.nextblock] ∈ Used
4. ∀ b ∈ [0..numblocks-1], d.block[b] ∉ Used ⇒ d.block[b] ∈ Free
5. true ⇒ d.block[d.blockbitmap] ∈ Bitmap
6. ∀ j ∈ [0..numblocks-1], ∀ b ∈ Bitmap, true ⇒ ⟨d.block[j], b.bitmap[j]⟩ ∈ BlockStatus
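The model definition rules can be read as a small fixed-point computation over the concrete structure. The Python sketch below is only an illustration under stated assumptions: the `Entry`, `Block`, and `Disk` classes and the `build_model` function are hypothetical names, blocks are identified by their index, a negative index is assumed to mean "no block", membership in Bitmap is assumed to imply membership in Used (as the set declaration suggests), and the BlockStatus pairing of block j with bit j is inferred from the relation's declared type.

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    name: str
    firstblock: int              # assumed: a negative value means "no block"

@dataclass
class Block:
    nextblock: int = -1
    bitmap: list = field(default_factory=list)   # used only by the bitmap block

@dataclass
class Disk:
    blockbitmap: int
    dir: list
    block: list

def build_model(d):
    """Apply the six model definition rules; blocks are named by their index."""
    n = len(d.block)

    def valid(i):
        return 0 <= i < n        # the upper bound is an added safety check

    used, nxt = set(), set()
    # Rule 1: the first block of every directory entry is in Used.
    work = [e.firstblock for e in d.dir if valid(e.firstblock)]
    # Rule 5: the block named by d.blockbitmap is in Bitmap (assumed to imply Used).
    bitmap = {d.blockbitmap} if valid(d.blockbitmap) else set()
    work.extend(bitmap)
    # Rules 2 and 3: follow nextblock links until Used stops growing.
    while work:
        b = work.pop()
        if b in used:
            continue
        used.add(b)
        nb = d.block[b].nextblock
        if valid(nb):
            nxt.add((b, nb))
            work.append(nb)
    # Rule 4: every block that is not in Used is in Free.
    free = set(range(n)) - used
    # Rule 6: BlockStatus pairs block j with bit j of the bitmap block.
    status = {(j, d.block[b].bitmap[j]) for b in bitmap
              for j in range(min(n, len(d.block[b].bitmap)))}
    return {"Used": used, "Free": free, "Bitmap": bitmap,
            "Next": nxt, "BlockStatus": status}
```

Rule 4 contains a negation, so the sketch computes Used completely before deriving Free.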

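Model for File System Example
[Figure: the example disk (directory entries and disk blocks) next to its abstract model, with blocks placed in the Used, Free, and Bitmap sets and linked by the Next relation]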

Consistency Constraints in Example
|Bitmap| = 1
∀ u ∈ Used, u.BlockStatus = true
∀ f ∈ Free, f.BlockStatus = false
∀ b ∈ Used, |Next.b| ≤ 1
[Figure: the abstract model of the example, with the Used, Free, and Bitmap sets and the Next relation]

Detecting Inconsistencies
Evaluate the consistency properties and find violations
|Bitmap| = 1 is violated: the Bitmap set is empty
[Figure: the abstract model of the example, in which the Bitmap set is empty]
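Consistency checking amounts to evaluating each constraint over the abstract model and recording the bindings that violate it. The sketch below assumes a model stored as a dictionary of Python sets (a hypothetical representation) and reads the constraint on Next as "each used block has at most one predecessor", which is an assumption about the comparison on the slide.

```python
def find_violations(model):
    """Return (constraint, offending block) pairs for the four example constraints."""
    violations = []
    if len(model["Bitmap"]) != 1:
        violations.append(("|Bitmap|=1", None))

    def status_of(block):
        # All boolean values that BlockStatus relates to this block.
        return {value for (b, value) in model["BlockStatus"] if b == block}

    for u in sorted(model["Used"]):
        if status_of(u) != {True}:
            violations.append(("u.BlockStatus=true", u))
    for f in sorted(model["Free"]):
        if status_of(f) != {False}:
            violations.append(("f.BlockStatus=false", f))
    for b in sorted(model["Used"]):
        predecessors = [a for (a, target) in model["Next"] if target == b]
        if len(predecessors) > 1:
            violations.append(("|Next.b|<=1", b))
    return violations
```

Each violation carries the block that triggered it, which is the binding a later repair step would need.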

Repairing Violations of Model Consistency Properties
The violation provides a binding for the quantified variables
Convert the body to disjunctive normal form
  (p₁ ∧ … ∧ pₙ) ∨ … ∨ (q₁ ∧ … ∧ qₘ)
  p₁ … pₙ, q₁ … qₘ are basic propositions
Choose a conjunction to satisfy
Repair the violated basic propositions in that conjunction
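A minimal sketch of this step, assuming each basic proposition is given as a pair of hypothetical callables (`holds` to test it and `fix` to repair it) over a bound environment. The slides do not say how the conjunction is chosen; picking the one with the fewest violated propositions is an assumption made here for illustration.

```python
def repair_body(dnf, env):
    """dnf is a list of conjunctions; a conjunction is a list of (holds, fix) pairs."""
    def violated(conj):
        return [fix for holds, fix in conj if not holds(env)]

    # Assumed policy: choose the conjunction with the fewest violated propositions.
    chosen = min(dnf, key=lambda conj: len(violated(conj)))
    # Repair only the propositions of the chosen conjunction that do not hold.
    for fix in violated(chosen):
        fix(env)
    return dnf.index(chosen)
```

The function returns the index of the conjunction it satisfied, so a caller can see which alternative was taken.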

Repairing Violations of Basic Propositions
Inequality constraints on values of numeric fields: V.R = E, or V.R related to E by an inequality
  Compute the value of the expression and assign it to the relation
Presence of the required number of objects: |S| = C, |S| ≤ C, |S| ≥ C
  Remove objects from, or insert objects into, the set
Topology of the region surrounding each object: |V.R| = C, |V.R| ≤ C, |V.R| ≥ C, |R.V| = C, |R.V| ≤ C, |R.V| ≥ C
  Remove tuples from, or insert tuples into, the relation
Inclusion constraints: V in S, V₁ in V₂.R, ⟨V₁, V₂⟩ in R
  Remove the object or tuple from, or add it to, the set or relation

Repairing Inconsistencies
Repair the violation of |Bitmap| = 1 by adding a block to the Bitmap set
[Figure: the abstract model of the example after a block has been added to the Bitmap set]

Goal-Directed Reasoning Translates Abstract Repairs Into Concrete Repairs
Abstract repairs add objects (or tuples) to, or remove them from, sets (or relations)
Goal: find concrete data structure updates with the same effect
1) Find the model definition rules that construct the relevant set or relation
2) Basic strategy:
   For removals, appropriately falsify the guards of all of these model definition rules
   For additions, appropriately satisfy the guard of one of these model definition rules

Goal-Directed Reasoning in Example Abstract Repair: add block 0 to the Bitmap set

Goal-Directed Reasoning in Example
Abstract Repair: add block 0 to the Bitmap set
Model Definition Rules:
1. ∀ i ∈ [0..numentries-1], 0 ≤ d.dir[i].firstblock ⇒ d.block[d.dir[i].firstblock] ∈ Used
2. ∀ b ∈ Used, 0 ≤ b.nextblock ⇒ ⟨b, d.block[b.nextblock]⟩ ∈ Next
3. ∀ b ∈ Used, 0 ≤ b.nextblock ⇒ d.block[b.nextblock] ∈ Used
4. ∀ b ∈ [0..numblocks-1], d.block[b] ∉ Used ⇒ d.block[b] ∈ Free
5. true ⇒ d.block[d.blockbitmap] ∈ Bitmap
6. ∀ j ∈ [0..numblocks-1], ∀ b ∈ Bitmap, true ⇒ ⟨d.block[j], b.bitmap[j]⟩ ∈ BlockStatus

Goal-Directed Reasoning in Example
Abstract Repair: add block 0 to the Bitmap set
Relevant Model Definition Rule: true ⇒ d.block[d.blockbitmap] ∈ Bitmap
  d.block[d.blockbitmap] = block 0

Goal-Directed Reasoning in Example
Abstract Repair: add block 0 to the Bitmap set
Relevant Model Definition Rule: true ⇒ d.block[d.blockbitmap] ∈ Bitmap
  d.block[d.blockbitmap] = block 0
Data Structure Update: d.blockbitmap = index of block 0 in d.block array
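The update derived above can be sketched in a few lines. This is an illustration only: the dictionary layout and the function names `bitmap_set` and `add_block_to_bitmap` are hypothetical, blocks are identified by their index, and the initial out-of-range value of `blockbitmap` is an illustrative corruption.

```python
def bitmap_set(disk):
    """Model definition rule: true => d.block[d.blockbitmap] in Bitmap."""
    i = disk["blockbitmap"]
    # An out-of-range index names no block, so Bitmap is empty in that case.
    return {i} if 0 <= i < len(disk["block"]) else set()

def add_block_to_bitmap(disk, index):
    """Concrete update that makes the rule place block `index` in Bitmap."""
    if not 0 <= index < len(disk["block"]):
        raise ValueError("block is not an element of the d.block array")
    disk["blockbitmap"] = index

disk = {"blockbitmap": -5, "block": ["b0", "b1", "b2", "b3"]}
before = bitmap_set(disk)          # empty: the corrupted index names no block
add_block_to_bitmap(disk, 0)
after = bitmap_set(disk)           # now contains block 0
```

The rule's guard is `true`, so the only way to satisfy it for block 0 is to make the expression `d.block[d.blockbitmap]` evaluate to block 0, which determines the field to write.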

Repair in Example
[Figure: the original file system next to the updated file system, in which one disk block is now labeled as the block bitmap]

Reasoning at Compile Time
Compile specifications into repair algorithms
Goal-directed reasoning takes place at compile time
Consider the possibility that |Bitmap| = 0
Abstract repair
  Choose a block in the Free set
  Add the block to the Bitmap set
Concrete repair
  Find the relevant model definition rule: true ⇒ d.block[d.blockbitmap] ∈ Bitmap
  Goal-directed reasoning finds the following update: d.blockbitmap = index of block in d.block array
  Check that the block is an element of the d.block array: ∀ b ∈ [0..numblocks-1], d.block[b] ∉ Used ⇒ d.block[b] ∈ Free

Multiple Repairs
Some broken data structures may require multiple repairs
  Reconstruct the model
  Reevaluate the consistency constraints
  Perform any required additional repairs
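The reconstruct, reevaluate, repair cycle can be sketched as a loop. Everything named here is hypothetical: the disk is a dictionary with a single file starting at `first`, each block carries a `bits` list that is meaningful only for the bitmap block, the lowest free block is chosen as the new bitmap block, and `max_rounds` stands in for a resource limit.

```python
def build_model(d):
    n = len(d["block"])
    used, b = set(), d["first"]
    while 0 <= b < n and b not in used:      # follow the file's block chain
        used.add(b)
        b = d["block"][b]["next"]
    bitmap = {d["blockbitmap"]} if 0 <= d["blockbitmap"] < n else set()
    used |= bitmap                           # assumed: Bitmap is a subset of Used
    status = {(j, d["block"][k]["bits"][j]) for k in bitmap for j in range(n)}
    return {"Used": used, "Free": set(range(n)) - used,
            "Bitmap": bitmap, "BlockStatus": status}

def bitmap_ok(m):
    return len(m["Bitmap"]) == 1

def fix_bitmap(d, m):
    d["blockbitmap"] = min(m["Free"])        # assumed choice: lowest free block

def status_ok(m):
    expected = {(u, True) for u in m["Used"]} | {(f, False) for f in m["Free"]}
    return m["BlockStatus"] == expected

def fix_status(d, m):
    for k in m["Bitmap"]:
        d["block"][k]["bits"] = [j in m["Used"] for j in range(len(d["block"]))]

CONSTRAINTS = [(bitmap_ok, fix_bitmap), (status_ok, fix_status)]

def repair(d, max_rounds=10):
    """Rebuild the model, re-evaluate the constraints, repair one violation; repeat."""
    for _ in range(max_rounds):
        m = build_model(d)
        violated = [fix for ok, fix in CONSTRAINTS if not ok(m)]
        if not violated:
            return True
        violated[0](d, m)
    return False
```

On a disk whose `blockbitmap` field is corrupted, the first round repairs |Bitmap| = 1, the second round finds the BlockStatus violations that this repair exposed and rewrites the bits, and the third round finds nothing left to do.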

Architecture
[Figure: the pipeline from broken bits through model translation, abstract repair, and automatically generated concrete repair to repaired bits]

Model Recomputation
[Figure: the recomputed abstract model, showing the Used, Free, and Bitmap sets and the Next and BlockStatus relations]

Model Recomputation
Re-evaluate the constraints and find violations of
  ∀ u ∈ Used, u.BlockStatus = true
  ∀ f ∈ Free, f.BlockStatus = false
[Figure: the recomputed abstract model, showing the Used, Free, and Bitmap sets and the Next and BlockStatus relations]

Model Recomputation
Repair the violations of
  ∀ u ∈ Used, u.BlockStatus = true
  ∀ f ∈ Free, f.BlockStatus = false
by modifying the BlockStatus relation
[Figure: the abstract model with the repaired BlockStatus relation]

Repaired File System
[Figure: the repaired file system, with its directory entries, disk blocks, and block bitmap]

Acyclic Repair Dependences
Questions
  Isn't it possible for the repair of one constraint to invalidate another constraint?
  What about infinite repair loops?
  What about unsatisfiable specifications?
Answer
  We require specifications to have no cyclic repair dependences between constraints
  So all repair sequences terminate
  Repair can fail only because of resource limitations
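The acyclicity requirement can be checked with an ordinary depth-first search. The sketch below is a simplification: it rejects any cycle at all, the graph encoding as a dictionary of successor lists is hypothetical, and the slides do not give the exact construction of the dependence edges.

```python
def has_cycle(graph):
    """Depth-first search for a cycle in a directed graph {node: [successors]}."""
    WHITE, GREY, BLACK = 0, 1, 2       # unvisited, on the current path, finished
    color = {}

    def visit(node):
        color[node] = GREY
        for succ in graph.get(node, ()):
            state = color.get(succ, WHITE)
            # A GREY successor is on the current path, so the edge closes a cycle.
            if state == GREY or (state == WHITE and visit(succ)):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in graph)
```

A specification whose graph passes this check cannot send the repair loop around the same sequence of constraints forever.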

Repair Dependence Graph
[Figure: dependence graph over the following nodes]
1. |Bitmap| = 1
2. Add block to Bitmap
3. d.blockbitmap = indexof(b_free)
4. Satisfy Rule 6 (BlockStatus)
5. f.BlockStatus = false
6. Replace ⟨f, true⟩ with ⟨f, false⟩ in BlockStatus
7. b.bitmap[j] = false for j = indexof(f)
8. Remove ⟨f, true⟩ from BlockStatus by removing Bitmap

Repair Dependence Graph
[Figure: dependence graph over the following nodes, with node 8 removed]
1. |Bitmap| = 1
2. Add block to Bitmap
3. d.blockbitmap = indexof(b_free)
4. Satisfy Rule 6 (BlockStatus)
5. f.BlockStatus = false
6. Replace ⟨f, true⟩ with ⟨f, false⟩ in BlockStatus
7. b.bitmap[j] = false for j = indexof(f)

When to Test for Consistency and Repair
Persistent data structures
  Repair can be an independent activity, or
  Repair when data is written out or read in
Volatile data structures in a running program
  Under programmer control
  Transaction-based approach
    Identify transaction start and end
    Repair at start, end, or both
  Failure-based approach
    Wait until the program fails
    Repair and restart from the latest safe point

Experience
We acquired five benchmarks (written in C/C++)
  AbiWord
  x86 emulator
  CTAS (air-traffic control tool)
  Simplified Linux file system
  Freeciv interactive game
We developed specifications for all five
  Little development time (days, not weeks)
  Most of the time was spent figuring out Freeciv and CTAS
Each benchmark has
  Workload
  Bug or fault insertion methodology
We ran the benchmarks with and without repair

AbiWord
Open-source word processing program
Approximately 360,000 lines of C++ code
AbiWord represents documents using a piece table
Consistency properties:
  Piece table has a section fragment
  Piece table has a paragraph fragment
  Doubly-linked list of fragments is well formed
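As an illustration of the third property, the sketch below checks and repairs a doubly-linked list by treating the `next` pointers as authoritative. This is not AbiWord's code or the repair the tool generated: the `Fragment` class, the `repair_links` function, and the policy of cutting the link that closes a cycle are all assumptions.

```python
class Fragment:
    def __init__(self, kind):
        self.kind, self.prev, self.next = kind, None, None

def repair_links(head):
    """Make prev pointers agree with next pointers; cut the list if it cycles."""
    seen, prev, node, fixes = set(), None, head, 0
    while node is not None:
        if id(node) in seen:         # revisiting a node: prev.next closes a cycle
            prev.next = None
            fixes += 1
            break
        seen.add(id(node))
        if node.prev is not prev:    # back pointer disagrees with the forward walk
            node.prev = prev
            fixes += 1
        prev, node = node, node.next
    return fixes
```

The return value counts the pointers that were rewritten, so a result of zero means the list was already well formed.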

AbiWord Screen Shot

Results
Workload: import a (valid) Microsoft Word document that crashes AbiWord
Bug that creates inconsistent documents with text before the section fragment
Without repair
  AbiWord crashes when loading the document
With repair
  AbiWord is able to open and successfully process the document

Parallel x86 emulator
Parallel x86 emulator for the RAW machine
  Multi-tile architecture
  Emulator runs x86 binaries on RAW
Contains an L2 cache of translated x86 assembly instructions
Maintains a constant L2 cache size
Consistency property: the computed size of the L2 cache is consistent with its actual size

Results
Workload: gzip benchmark on the x86 emulator
Bug that (sometimes) adds the size of a cache item twice when it is inserted
Without repair
  Actual cache size goes to zero
  x86 emulator crashes
With repair
  Actual cache size is the same as the computed size
  Program runs correctly
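The consistency property for this benchmark is simple enough to sketch directly. The cache layout below (an `items` list and a `computed_size` field) is hypothetical and is not the emulator's real data structure; the repair shown, recomputing the recorded size from the items, is one plausible reading of the result above.

```python
def repair_cache_size(cache):
    """Reset the recorded size to the sum of the item sizes; report whether it changed."""
    actual = sum(item["size"] for item in cache["items"])
    repaired = cache["computed_size"] != actual
    cache["computed_size"] = actual
    return repaired

# An item of size 10 was counted twice on insertion, so the recorded size is 40.
cache = {"items": [{"size": 10}, {"size": 20}], "computed_size": 40}
first = repair_cache_size(cache)     # fixes the recorded size
second = repair_cache_size(cache)    # nothing left to fix
```

With an inflated recorded size, an emulator that evicts items to hold that number constant would shrink the real cache, which matches the failure described above.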

CTAS
Set of air-traffic control tools
  Traffic management
  Arrival planning
  Flow visualization
  Shortcut planning
Deployed in centers around the country (Dallas/Ft. Worth, Los Angeles, Denver, Miami, Minneapolis/St. Paul, Atlanta, Oakland)
Approximately 1 million lines of C/C++ code

CTAS Screen Shot

Results
Workload: recorded radar feed from DFW
Fault insertion
  Simulate an error in flight plan processing
  Bad airport index in the flight plan data structure
Without repair
  System crashes with a segmentation fault
With repair
  Aircraft has a different origin or destination
  System continues to execute
  Anomaly is eventually flushed from the system

Aspects of CTAS
Lots of independent subcomputations
  System processes hundreds of aircraft: a problem with one should not affect the others
  Multipurpose system (visualization, arrival planning, shortcuts, …): a problem in one purpose should not affect the others
Sliding time window: anomalies are eventually flushed
Rebooting is ineffective: the system will crash again as soon as it sees the problematic flight plan

Simplified Linux File System
[Figure: disk layout with a super block, group block, inode bitmap block, block bitmap block, inode block, directory block, and disk blocks]
Some Consistency Properties
  Inode bitmap is consistent with inode usage
  Block bitmap is consistent with block usage
  Directory entries refer to valid inodes
  Files contain valid blocks only
  Files do not share blocks

Results
Workload: write and verify several files
Simulated power failure
  Inode and block bitmap errors
  Partially initialized directory and inode entries
Without repair
  Incorrect file contents because of inode and disk block sharing
With repair
  Bitmaps are repaired, preventing illegal sharing; file contents are correct

Freeciv
[Figure: a terrain grid with rows POMM, OOMP, POMM, PPMP (O = Ocean, P = Plain, M = Mountain) whose tiles refer to city structures]
Consistency Properties
  Tiles have valid terrain values
  Cities are not in the ocean
  Each city has exactly one reference from the grid

Freeciv Screen Shot

Results
Workload: Freeciv software plays against itself
Fault insertion: randomly corrupt terrain values
Without repair: program crashes (seg fault)
With repair
  Game runs just fine
  But the game plays out differently because of the different terrain values

Experience Developing Specifications
Specifications are small compared to system size
Specifications are straightforward to develop once you understand the consistency properties
Potential to omit properties
Overhead of understanding the data structures

Specification Inference
Automatically infer specifications using the dynamic invariant detection tool Daikon
Developer simply reviews the generated specification
Successfully inferred specifications for two of our benchmarks
  CTAS
  Freeciv

CTAS Specification
Inferred specification contained
  All constraints in the hand-coded specification
  Additional constraints on the arrival and departure runways
Different abstractions in the manually developed and inferred specifications

Freeciv Specification
Inferred specification is missing properties about city placement (Daikon limitation)
Inferred specification contains previously overlooked properties about
  The continents field of a tile
  The initial position of the players
Similar abstractions in the manually developed and inferred specifications

Related Work
Hand-coded repair
  Lucent 5ESS switch
  IBM MVS operating system
Integrity maintenance in databases
  Deriving Production Rules for Constraint Maintenance (Ceri, Widom)
  Automatic Generation of Production Rules for Integrity Maintenance (Ceri et al.)
  Constraint analysis: A design process for specifying operations on objects (Urban et al.)
Consistency management with repair actions (Nentwich et al.)

Related Work
Constraint mechanisms in programming languages
  Kaleidoscope (Lopez)
  Alphonse (Hoover)
Self-stabilizing algorithms (Dijkstra)
Log-based recovery for database systems
Recovery-oriented computing
  Microrecovery & Microreboot (Candea, Fox)
  Undo framework (Brown, Patterson)
Specification languages
  Alloy (Jackson)
  UML

Future Directions
Explore other mechanisms to decouple software systems
  Data dependences
  Control dependences
More frequent consistency checking
  Use page protection mechanisms in hardware to incrementally check specifications
  Static analysis to eliminate unnecessary checks

Conclusion
Data structure repair is an exciting way to (potentially) improve reliability
The specification-based approach promises to make the technique more widely applicable
Automatic inference of specifications promises to make developing data structure consistency specifications even easier
Moving towards a more robust, probabilistic, continuous concept of system behavior

Implementation
Size of system: 26,200 lines
Compiler
  20,400 lines of Java code
  2,500 lines of parser definitions
Runtime
  3,200 lines of C code

Time to Check Consistency & Perform Repairs
[Table: columns Application, Time to Check Consistency (ms), and Time to Check and Repair (ms); rows AbiWord, CTAS, Freeciv, and File system]

Lines of Code
Application      Lines of Code
AbiWord          360,000
x86 emulator     65,000
CTAS             >1 million
Freeciv          73,000
File system      700

Formalizing Repair Dependences: Repair Dependence Graph
Absence of certain classes of cycles implies a valid repair schedule
Node removal for cycle elimination: may remove conjunction nodes and update nodes
  Must leave at least one conjunction per constraint
  Must leave at least one update per abstract repair