Serdar Tasiran, Tayfun Elmas Koç University, Istanbul, Turkey

Presentation transcript:

VYRD: VerifYing Concurrent Programs by Runtime Refinement-violation Detection
Serdar Tasiran, Tayfun Elmas (Koç University, Istanbul, Turkey)
Shaz Qadeer (Microsoft Research, Redmond, WA)
Thanks: Chandu Thekkath, Lidong Zhou (Boxwood Team, MSR SVC); G. Bolukbasi, M. Erkan Keremoglu (Koç University)
30/11/18 PLDI 2005, June 12-15, Chicago, U.S.

Hi all. I'm Tayfun Elmas from Koç University. In this talk I'll present a technique for detecting concurrency errors. In this technique we watch for refinement violations at runtime. This is joint work with my advisor Serdar Tasiran and with Shaz Qadeer from Microsoft Research.

Overview
Vyrd: low-overhead runtime tool, handles industrial-scale programs
  Caught previously undetected (but easily triggered) errors
  Novel correctness criteria: I/O-refinement and view-refinement
  Improved observability, more extensive checking than testing
VyrdMC: use an execution-based model checker to drive Vyrd
  Improve coverage, reduce instrumentation burden
  More extensive validation from small test programs
The LP metric
  "What else should we try to test if we are after concurrency errors?"
  Inspired by atomicity and refinement violations in real examples

Outline
Example
Refinement
  Testing
  I/O-refinement
  View-refinement
The VYRD tool
Experience
VYRD + model checking
The LP coverage metric

Here is the outline of my talk. First I'll give a motivating data structure example and explain how our technique applies to it. Then I'll talk about two different notions of refinement called ... I'll introduce our runtime verification tool Vyrd and the experience we had applying Vyrd to industrial-scale software.

Multiset Implementation: LookUp
Multiset data structure: M = { 2, 3, 3, 3, 9, 8, 8, 5 }
Why this example?
  Mimics patterns we encountered in real systems
  Linearizability and atomicity (by reduction) declare this example incorrect
  But it is not buggy
  This was the case with Boxwood and the Scan file system

    LookUp (x)
      for i = 1 to n
        acquire(A[i])
        if (A[i].content==x && A[i].valid)
          release(A[i])
          return true
        else
          release(A[i])
      return false

[Figure: array A with a content field and a valid field per element, showing one representation of M]

Our motivating data structure is a multiset. Here is an example of a multiset. Notice that several copies of the same integer can be in the multiset, like 3 and 8 in this example. The implementation represents the multiset by an array A whose elements have two fields. The content field stores the integer element, and the Boolean valid field tells us whether the element is to be included in the multiset or not. For example, one representation of the multiset above could be the one shown at the bottom. On the right you see the implementation of the LookUp method. LookUp queries whether a given integer x is in the multiset. It traverses the array A linearly, locking elements one by one and checking whether the content is x and the valid field is set.

Multiset Implementation: FindSlot
FindSlot: helper routine for InsertPair
  For space allocation
  Does not set the valid field, so x is not in the multiset yet

    FindSlot (x)
      for i = 1 to n
        acquire(A[i])
        if (A[i].content==null)
          A[i].content = x
          release(A[i])
          return i
        else
          release(A[i])
      return 0

FindSlot is a helper method for an insertion method I will tell you about in the next slide. Given an integer x, it looks for an empty slot to put x in. If it finds one, it allocates the slot for x by setting its content field to x and returns the index; otherwise it returns 0. Notice that it doesn't set the valid field, so x is not in the multiset yet. Thus a LookUp method that checks this slot will not treat x as being in the set.

Multiset Implementation: InsertPair
Refinement violation if only one of x, y is inserted
Not atomic by reduction
Allows exceptional termination
  Example: multiset array of size 2, 2 concurrent InsertPair's
    Both find slots for their x's
    Both fail to find slots for their y's
  Not possible in an atomic execution: the first InsertPair must succeed
Linearizability: each execution must be equivalent to an atomic, linear one that satisfies the sequential specification
  Execution not buggy, but violates linearizability

    InsertPair (x, y)
      i = FindSlot (x)
      if (i == 0)
        return failure
      j = FindSlot (y)
      if (j == 0)
        A[i].content = null
        return failure
      acquire(A[i])
      acquire(A[j])
      A[i].valid = true
      A[j].valid = true
      release(A[i])
      release(A[j])
      return success

Keywords: InsertPair has an interesting exceptional termination that is not possible in the sequential case. Using a separate spec, we do not rule out this exceptional execution.

The multiset has an InsertPair method that inserts a pair of integers x, y into the contents. The implementation of InsertPair is given on the right. InsertPair makes the multiset example interesting because it mimics methods in real concurrent systems that first reserve several resources and then complete their operation on all of them atomically. It is considered an error if one of x or y is inserted but not the other. To prevent this error, InsertPair makes two calls to FindSlot to first allocate slots for x and y. If both FindSlot calls succeed, then in a protected block it includes x and y in the multiset atomically by setting their valid bits to true, and returns success. InsertPair returns failure if either of the FindSlot calls fails. This can happen because of resource contention with other concurrent InsertPair routines. For example, imagine we have an empty multiset of size n. n concurrent InsertPair's running on this multiset can all find free slots for their x's, but then none of them may be able to find a slot for its y because there are no more empty slots. This causes all the InsertPair's to return failure even though at the beginning there was space for some of them to succeed.
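The three implementation methods shown so far can be put together as a small runnable sketch. This is only an illustration in Python, not the code used in the talk: the class names `Slot` and `Multiset`, the snake_case method names, and the use of `None` for an empty slot are assumptions. Indices returned by `find_slot` are 1-based so that 0 can still mean "no slot", as on the slides.

```python
import threading

SUCCESS, FAILURE = "success", "failure"

class Slot:
    """One array element: content, valid bit, and a per-element lock."""
    def __init__(self):
        self.content = None      # None plays the role of null (empty slot)
        self.valid = False
        self.lock = threading.Lock()

class Multiset:
    def __init__(self, n):
        self.A = [Slot() for _ in range(n)]

    def look_up(self, x):
        # Lock elements one by one; x counts only if its valid bit is set.
        for slot in self.A:
            with slot.lock:
                if slot.content == x and slot.valid:
                    return True
        return False

    def find_slot(self, x):
        # Reserve an empty slot for x WITHOUT setting valid.
        # Returns a 1-based index, or 0 if no slot is free.
        for i, slot in enumerate(self.A, start=1):
            with slot.lock:
                if slot.content is None:
                    slot.content = x
                    return i
        return 0

    def insert_pair(self, x, y):
        i = self.find_slot(x)
        if i == 0:
            return FAILURE
        j = self.find_slot(y)
        if j == 0:
            self.A[i - 1].content = None   # give back the slot reserved for x
            return FAILURE
        a, b = self.A[i - 1], self.A[j - 1]
        with a.lock, b.lock:               # x and y become visible together
            a.valid = True
            b.valid = True
        return SUCCESS
```

The exceptional termination can be reproduced by hand on an array of size 2: after two `find_slot` calls for two different x's, a third `find_slot` call returns 0, so both InsertPair's would return failure.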

Multiset Specification
Spec state M: set of integers
Each method: atomic, deterministic state update/observation
  Given the current state, the arguments and the method return value (if one exists), specifies the new Spec state

    INSERTPAIR (x, y, retval)
      if (retval == success)
        M = M U {x, y}
      return retval

    LOOKUP (x)
      if (x ∈ M)
        return true
      else
        return false

    DELETE (x)
      M = M \ {x}

NOTE: Coordinate the bullet "Given..." with the InsertPair method.

Here we give the specification for the multiset. The state of the spec is represented by a set M of integers. Each method of the specification specifies an atomic, deterministic update or observation of the specification state. A mutator method, given the current state and the arguments, specifies what the next state will be. Notice that some methods also take a return value that affects the behavior of the method. For example, InsertPair takes two integers and a return value. If the return value is success, it specifies a new state with x and y included. For any other return value it leaves the current state unchanged. We let the return value affect the state transition in order to model the InsertPair's in the impl that fail due to concurrency. There are also the Delete method, which removes an integer from the multiset, and the LookUp method, which queries the set for a given integer.
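As an illustration, the specification above can be written as a small executable class. This is a sketch in Python under stated assumptions: the class name `MultisetSpec` and the method names are hypothetical, and M is modelled as a plain Python `set` because the slide describes the state as a set of integers (tracking multiplicities would need a different container).

```python
SUCCESS, FAILURE = "success", "failure"

class MultisetSpec:
    """Atomic, deterministic spec: each method is one state update or observation."""
    def __init__(self):
        self.M = set()

    def insert_pair(self, x, y, retval):
        # The return value observed in the Impl is an input to the spec:
        # only a successful InsertPair changes the state.
        if retval == SUCCESS:
            self.M |= {x, y}
        return retval

    def look_up(self, x):
        return x in self.M

    def delete(self, x):
        self.M.discard(x)
```

Passing `retval` in, rather than computing it, is what lets the spec accept InsertPair's that fail only because of contention.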


Multiset Testing
[Figure: interleaved implementation trace with the calls and returns of Insert(3), LookUp(3), Insert(4) and Delete(3), and the actions A[0].elt=3, Unlock A[0], A[1].elt=4, Unlock A[1], read A[0], A[0].elt=null]
Don't know which happened first: Insert(3) or Delete(3)?
  Should 3 be in the multiset at the end?
  Must accept both possibilities as correct
Common practice:
  Run a long multi-threaded test
  Perform sanity checks on the final state

Multiset I/O Refinement
[Figure: the implementation trace (calls and returns of Insert(3), LookUp(3), Insert(4), Delete(3); actions A[0].elt=3, Unlock A[0], A[1].elt=4, Unlock A[1], read A[0], A[0].elt=null) next to the witness ordering and the spec trace it induces]

    Witness ordering      Spec trace        Spec state M
    (initial)                               Ø
    Commit Insert(3)      M = M U {3}       {3}
    Commit LookUp(3)      Check 3 ∈ M       {3}
    Commit Insert(4)      M = M U {4}       {3, 4}
    Commit Delete(3)      M = M \ {3}       {4}

Return values: Insert(3) returns "success", LookUp(3) returns "true".

In this slide we'll explain how we check I/O-refinement. Again we use the Insert operation instead of InsertPair to simplify the picture. On the right you see ...
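The check pictured on this slide can be sketched as a replay loop: run the spec methods in the witness ordering and compare what the spec returns with what the implementation returned. The Python below is a minimal sketch, not Vyrd's actual checker; the function name `check_io_refinement` and the tuple layout of the witness records are assumptions.

```python
def check_io_refinement(witness):
    """witness: list of (method, arg, impl_return) in commit order.

    Returns the index of the first step whose Impl return value the spec
    cannot reproduce, or None if the trace passes.
    """
    M = set()                                 # spec state
    for step, (method, arg, impl_ret) in enumerate(witness):
        if method == "Insert":
            if impl_ret == "success":
                M.add(arg)
            spec_ret = impl_ret               # mutator: return value is an input
        elif method == "Delete":
            M.discard(arg)
            spec_ret = impl_ret
        elif method == "LookUp":
            spec_ret = arg in M               # observer: spec computes the answer
        else:
            raise ValueError("unknown method: " + method)
        if spec_ret != impl_ret:
            return step
    return None

# The witness ordering from the slide.
slide_witness = [
    ("Insert", 3, "success"),
    ("LookUp", 3, True),
    ("Insert", 4, "success"),
    ("Delete", 3, "success"),
]
```

Only observer methods can fail this check, which is why I/O-refinement depends on observers being interleaved at the right places.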

The Vyrd Approach: Refinement as Correctness Criterion
Refinement
  For each execution of the implementation (Impl) there exists an "equivalent" atomic execution of Spec
Linearizability, atomicity/reducibility
  For each execution of Impl there exists an "equivalent" atomic execution of Impl
Refinement was used as the correctness criterion for verifying Boxwood and the Scan file system
  Atomicity/reducibility and linearizability declare these systems incorrect
  Reducibility and linearizability are sometimes too restrictive
Refinement as correctness criterion
  More extensive checking than assertions
  More observability than pure testing

Keywords: Linearizability and atomicity are more restrictive. The flexibility in the spec gives us a more powerful method to prove the correctness of some tricky implementations.

In our approach to verifying concurrent data structures we use refinement as the correctness criterion. The benefits of this choice are that refinement is a more thorough condition than method-local assertions and that it provides more observability than pure testing. Correctness conditions like linearizability and atomicity require that for each execution of the impl in a concurrent environment there exists an equivalent atomic execution of the same impl. Refinement, in contrast, uses a separate specification and requires that for each execution of the impl there exists an equivalent atomic execution of this spec. The specification we use is more permissive than the impl. For example, the spec allows methods to terminate exceptionally, to model failure due to resource contention in a concurrent environment, whereas an atomic execution of the impl would not allow some of these method executions to fail. We check refinement at runtime using execution traces of the implementation. We do this in order to be able to handle industrial-scale programs. With respect to how much of the execution space is explored, our approach can be regarded as intermediate between testing and exhaustive verification.

I/O-refinement: Selecting Commit Actions
Commit points: determine the witness ordering, drive Spec
  Hints to the refinement checking tool
No formal procedure
  Intuitively, where the "new data structure state" becomes visible to other threads
  Straightforward for the programmer to select
For each method
  Designate lines in the source code
    Multiple lines may be annotated as commit
    May be conditional on program state, return value
  For each method execution, only one action should be marked as the commit action

    InsertPair (x, y)
      i = FindSlot (x)
      if (i == 0)
        return failure
      j = FindSlot (y)
      if (j == 0)
        A[i].content = null
        return failure
      acquire(A[i])
      acquire(A[j])
      A[i].valid = true
      A[j].valid = true
      release(A[i])    // commit
      release(A[j])
      return success

Note: put the I/O-refinement slide with commit points beforehand.

Commit points are really hints that the user gives to the refinement checking tool; they help in determining the witness ordering in which the spec trace is constructed. For each public method of the impl, we designate lines in the source code so that their execution corresponds to commit actions. There may be multiple lines annotated as commit. However, for each execution of a method there must be a single action executed as the commit action, and its execution brings the method execution to its commit point. There is no formal procedure for deciding on the commit points. Intuitively, the point where the modified state of the data structure becomes visible to other threads should be the primary candidate for a commit point. For example, the commit point for InsertPair is where the lock of A[i] is released after inserting both x and y into the set. Even though InsertPair changes some shared state beforehand by calling FindSlot, the elements in the allocated slots are not observed as being in the set by other threads, so it is only at the commit point that methods of other threads can see x in the set.


View-refinement: Need for more observability

    FINDSLOT (x)    // Buggy
      for i = 1 to n
        if (A[i].content == null)
          acquire(A[i])
          A[i].content = x
          release(A[i])
          return i
      return 0

T1: InsertPair(5,7)    T2: InsertPair(6,8)

    Step                                           elt [0..3]    valid [0..3]
    Initial state                                  -  -  -  -    F  F  F  F
    T1: Read A[0].elt = null
    T2: Read A[0].elt = null
    T1: Read A[1].elt = null; writes 5 and 7       5  7  -  -    F  F  F  F
    T1: sets valid bits                            5  7  -  -    T  T  F  F
      LookUp(5)=true, LookUp(7)=true
    T2: writes 6 to A[0] (overwrites 5!)           6  7  -  -    T  T  F  F
    T2: Read A[2].elt = null; writes 8             6  7  8  -    T  T  F  F
    T2: sets valid bits                            6  7  8  -    T  T  T  F
      LookUp(6)=true, LookUp(8)=true

A LookUp(5) interleaved after the overwrite would return false, and the bug would be caught.

I/O-refinement is still not sufficient to catch some errors that do not show up in method calls and return values. Consider the buggy impl of FindSlot on the left. It does not lock the elements before reading their content fields; it acquires the lock only just before the modification. As a result, as you see on the right, two concurrently executed InsertPair's can be interleaved so that both read the first slot in the array as empty and think it is available for an insertion, but only one of them succeeds in inserting its x into this slot. In this case, the integer 5 inserted by the first thread is overwritten by the second thread. After its InsertPair finishes, each thread checks it by calling LookUp with the same arguments the InsertPair had. If the LookUps are scheduled in this way, they all return true although the final state is inconsistent with the termination status of the methods. The error is there and is observable from the state, but I/O-refinement is unable to detect it in this scenario because the LookUps are not scheduled in the right places.

View-refinement: I/O-refinement may miss errors
If observer methods don't get interleaved in the right place, I/O-refinement may miss errors
Source of the bug may be too far in the past when the I/O-refinement violation happens
[Figure: the same interleaving of T1: InsertPair(5,7) and T2: InsertPair(6,8) as on the previous slide, in which T2 overwrites 5 in A[0] and LookUp(5), LookUp(7), LookUp(6), LookUp(8) all return true]

Note: do not talk about the first bullet.

View-refinement: More Observability
I/O-refinement may miss errors
View-refinement: I/O-refinement + "correspondence" between states of Impl and Spec at commit points
  Catches a state discrepancy at the next commit point
  Early warnings for possible I/O-refinement violations

As we saw in the previous example with two InsertPair's, I/O-refinement alone can miss refinement errors. The problem is that I/O-refinement relies on observer methods, and if the observer methods do not get interleaved in the right places along the trace, I/O-refinement may miss errors. In the extreme case, if there are no observer methods, I/O-refinement trivially passes any execution. In other cases, when a refinement violation is finally detected, the source of the bug may be far in the past, so the trace may have to be analyzed a long way back. Our solution is view-refinement. View-refinement augments I/O-refinement with a new condition that requires correspondence between the states of the impl and the spec at commit points. To accomplish this, we add commit actions to the set lambda and label them with state information when they are executed. View-refinement catches a state discrepancy as soon as the next commit point is reached. In fact, these state discrepancies are early warnings for future I/O-refinement violations.

View-refinement: View Variables
State correspondence
  Hypothetical "view" variables must match at commit points
"view" variable:
  Its value is the abstract data structure state
  Updated atomically once by each method
viewImpl: state information for Impl
  For A[1..n], extract content if valid=true
viewSpec: state information for Spec
  Elements of the multiset
  viewSpec = M (nothing to abstract)
  Other Specs may have state to be abstracted
[Figure: array A with content and valid fields, for which viewImpl={3, 3, 5, 5, 8, 8, 9}]

The state correspondence is obtained by matching view variables from the impl and the spec at commit points. A view variable is a hypothetical variable that extracts an abstract state of the data structure. This abstract state is updated or observed atomically by each method. The view variables that carry the state information of the impl and the spec are denoted viewImpl and viewSpec, respectively. The view for the multiset data structure is the set of integers stored in the multiset. The view variable for the multiset impl extracts the elements in content fields whose corresponding valid fields are true. Thus the view variable for the multiset in the figure does not contain the first 5 and the 6. The view variable for the spec takes its elements from the set M.

View-refinement: View Variables for Multiset
viewImpl: computed using the abstraction function
View is a canonical representation
  Canonizes state for the view: exact state match not required
Abstraction function (for checking view-refinement)
  An extra method of the data structure
  For the current data structure state, computes the current state of the view variable

    AbstractionFunction (A)
      view = Ø
      for i = 1 to n
        if (A[i].content != null && A[i].valid == true)
          view = view U {A[i].content}
      return view

[Figure: two arrays A with content and valid fields that store the same valid elements in different orders, with extra allocated slots; viewImpl={1, 3, 5, 6}]

Keywords: There may be state variables to be abstracted away in the spec. Later we will see an example in which the spec is also a program. The abstraction function is given by the user.

The view is a canonical representation of the abstract state. View computation canonizes the state, so even when the internal representations of two multiset instances differ, their view variables may be identical, and an exact match between the data structure states is not required. For example, the views for the two representations in the figure are the same although the order of the elements is different and there are extra allocated slots. Since the spec already has an abstract representation, its state has nothing to abstract, so the view variable for a spec instance is the canonized version of the spec state. However, this does not mean a spec cannot have details to abstract. Our method accepts specs at different levels of detail, so, as we will see later, the spec can be a program that requires an abstraction function.

For the multiset example, the view variable carries information about which integers are stored in the multiset, and its canonical representation discards the order of the elements. For the multiset spec no abstraction is needed: the view is the entire spec state. For the multiset impl the view variable must be extracted from the data structure state. The abstraction function for the multiset impl is given on the right. It traverses the array A and abstracts away the valid fields of the elements: it includes in the view only the content fields whose valid field is true. For example, if the abstraction function traversed the multiset whose state is given below on the right, it would not include the element with content 5 and the element with content 6 whose valid fields are not set.
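A runnable version of the abstraction function might look like the sketch below. It is written in Python as an illustration only. Representing each array element as a `(content, valid)` tuple and returning the view as a `collections.Counter`, so that repeated elements keep their multiplicity, are both assumptions; the slide's pseudocode writes the view with set union.

```python
from collections import Counter

def abstraction_function(A):
    """Compute the view of the Impl state.

    A is a list of (content, valid) pairs; None stands for an empty slot.
    Order and unused slots are abstracted away, so two different array
    layouts holding the same valid elements yield equal views.
    """
    view = Counter()
    for content, valid in A:
        if content is not None and valid:
            view[content] += 1
    return view

# Two hypothetical layouts of the same abstract state {3, 5}.
layout_a = [(3, True), (None, False), (5, True)]
layout_b = [(5, True), (3, True), (9, False)]
same_view = abstraction_function(layout_a) == abstraction_function(layout_b)
```

Comparing canonical views rather than raw arrays is what makes an exact match between the Impl and Spec states unnecessary.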

View-refinement: Checking Refinement
[Figure: the implementation trace of Insert(3), LookUp(3), Insert(4) and Delete(3) (actions A[0].elt=3, Unlock A[0], A[1].elt=4, Unlock A[1], read A[0], A[0].elt=null) next to the witness ordering and the spec trace, with the view variables at the commit points]

    Witness ordering      Spec trace        viewSpec       viewImpl
    (initial)                               Ø
    Commit Insert(3)      M = M U {3}       {3}            {3}
    Commit LookUp(3)      Check 3 ∈ M       (unchanged)    (unchanged)
    Commit Insert(4)      M = M U {4}       {3, 4}         {3, 4}
    Commit Delete(3)      M = M \ {3}       {4}            {4}

Note: say that the views are computed by running the abstraction function at this point. The checking procedure is similar to checking I/O-refinement.

View-refinement: Catching FindSlot Bug
[Figure: the buggy FINDSLOT and the interleaving of T1: InsertPair(5,7) and T2: InsertPair(6,8) shown earlier, in which T2 overwrites 5 in A[0] and all LookUps return true, so I/O-refinement does not detect the error]

View-refinement: Catching FindSlot Bug

    Commit                                viewSpec        viewImpl
    (initial, M = Ø)                      Ø               Ø
    InsertPair(5,7) returns "success"     {5, 7}          {5, 7}
    InsertPair(6,8) returns "success"     {5, 6, 7, 8}    {6, 7, 8}

Implementation trace: Call InsertP(5,7), Read A[0].elt, Call InsertP(6,8), Read A[0].elt, A[0].content=5, A[1].content=7, A[0].valid=true, A[1].valid=true, InsertP(5,7) returns "success", A[0].content=6, Read A[2].elt, A[2].content=8, A[0].valid=true, A[2].valid=true, InsertP(6,8) returns "success"

Suppose we are checking the execution trace of the buggy FindSlot implementation. First we drive the spec according to the witness ordering of the commit points. Then we track the values of the view variables for the impl and the spec at commit points and compare them for equivalence. Here, at the commit point of the second InsertPair, 5 disappears from the view variable of the impl but is still there in the view variable of the spec. A refinement error is then signalled, saying that an element was overwritten between the last two commit actions.
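The comparison on this slide can be replayed by hand. The Python sketch below hard-codes the buggy interleaving as a sequence of array writes and records (viewImpl, viewSpec) at each commit point; it is an illustration only, and the names `view_of`, `replay_buggy_trace` and `first_view_violation` are hypothetical. Views are kept as sorted lists so they can be compared directly.

```python
def view_of(A):
    """Abstraction function: sorted contents of slots whose valid bit is set."""
    return sorted(c for c, valid in A if c is not None and valid)

def replay_buggy_trace():
    A = [[None, False] for _ in range(4)]    # [content, valid] per slot
    M = []                                    # spec state
    views = [(view_of(A), sorted(M))]         # initial views
    # T1 and T2 have both read A[0].content == null before either writes.
    A[0][0] = 5                               # T1: FindSlot(5)
    A[1][0] = 7                               # T1: FindSlot(7)
    A[0][1] = True
    A[1][1] = True                            # T1 commits InsertPair(5,7)
    M += [5, 7]
    views.append((view_of(A), sorted(M)))
    A[0][0] = 6                               # T2: FindSlot(6) overwrites 5
    A[2][0] = 8                               # T2: FindSlot(8)
    A[0][1] = True
    A[2][1] = True                            # T2 commits InsertPair(6,8)
    M += [6, 8]
    views.append((view_of(A), sorted(M)))
    return views

def first_view_violation(views):
    """Index of the first commit point where viewImpl != viewSpec, else None."""
    for k, (view_impl, view_spec) in enumerate(views):
        if view_impl != view_spec:
            return k
    return None
```

The mismatch appears at the second InsertPair's commit point, even though no LookUp in the trace returns a wrong answer.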


The VYRD Tool ... Impl Implreplay Spec Instrument Impl Test harness in order to log actions in the order they happen Commit actions annotated by user Write abstraction function Test harness Impl Write to log Enables online/offline checking ... Call LookUp(3) Call Insert(3) A[0].elt=3 Unlock A[0] Call Delete(3) Call Insert(4) read A[0] A[1].elt=4 Return “true” Unlock A[1] Return“success” Return“success” A[0].elt=null Unlock A[0] Return“success” Read from log Abstraction function (for checking view-refinement) An extra method of the data structure For the current data structure state, computes the current state of the view variable Vyrd analyzes execution traces of the impl generated by test programs. Vyrd uses two separate threads for the process. The testing thread runs a test harness that generates test programs. A test program makes concurrent method calls to the impl. During the run of the test program, the corresponding execution trace is recorded in a shared sequential log. The verification thread reads the execution trace from the log. Since the verification thread follows the testing thread from behind, it can not access the instantaneous state of the impl. Thus the replaying module re-executes actions from the log on a separate instance of the impl called impl-replay and executes atomic methods on the spec at commit points. During replaying, the replaying mechanism also computes the view variables when it reaches a commit point and annotates the commit actions along the traces with view variables. The refinement checker module checks the resulting lambda traces of impl and the spec for IO and view refinement. These threads can run in online or offline setting. In online checking both threads simultaneously while in offline checking the verification thread runs after the whole test program finishes its work. 
[Figure, continued: the replay mechanism executes logged actions on Impl-replay and runs methods on Spec in witness ordering; the refinement checker compares the resulting traces, traceImpl and traceSpec.]
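The replay idea described above can be illustrated with a minimal sketch. It is not Vyrd's actual code: the log entry format `(kind, call_id, payload)`, the toy `Multiset` spec, and the function name `check_io_refinement` are all assumptions made for illustration. The checker runs each method atomically on the spec at its logged commit point and compares the spec's return value with the logged one.

```python
from collections import Counter


class Multiset:
    """Toy sequential specification (hypothetical): a multiset of integers."""

    def __init__(self):
        self.items = Counter()

    def insert(self, x):
        self.items[x] += 1
        return "success"

    def delete(self, x):
        if self.items[x] > 0:
            self.items[x] -= 1
            return "success"
        return "failure"

    def lookup(self, x):
        return self.items[x] > 0


def check_io_refinement(log):
    """Replay a log of (kind, call_id, payload) entries against the spec.

    'call' entries carry (method name, argument), 'commit' entries mark the
    point where the method takes effect, and 'return' entries carry the value
    the implementation returned. Returns the call_id of the first call whose
    logged return value differs from the spec's, or None if all agree.
    """
    spec = Multiset()
    pending = {}      # call_id -> (method name, argument)
    spec_result = {}  # call_id -> value computed by the spec at the commit point
    for kind, call_id, payload in log:
        if kind == "call":
            pending[call_id] = payload
        elif kind == "commit":
            name, arg = pending[call_id]
            spec_result[call_id] = getattr(spec, name)(arg)
        elif kind == "return":
            if spec_result.get(call_id) != payload:
                return call_id
    return None


# Insert(3) commits before LookUp(3), so LookUp may return True.
good_log = [
    ("call", 1, ("lookup", 3)),
    ("call", 2, ("insert", 3)),
    ("commit", 2, None),
    ("commit", 1, None),
    ("return", 1, True),
    ("return", 2, "success"),
]
# LookUp(3) commits first but still claims True: flagged as a violation.
bad_log = [
    ("call", 1, ("lookup", 3)),
    ("call", 2, ("insert", 3)),
    ("commit", 1, None),
    ("commit", 2, None),
    ("return", 1, True),
    ("return", 2, "success"),
]
```

In this sketch the log is a plain list, which corresponds to offline checking; an online variant would feed the same loop from a queue shared with the testing thread.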

The VYRD Tool: Atomized Impl as Spec
Spec: atomized version of Impl
  Fully synchronized methods
  Use single global lock
  Separates checking concurrency errors from sequential verification
Slight modification: return value from the Impl method is an additional argument to Spec methods
  More permissive than Impl
  Can handle failure return values
Exact state match at commit points not required
  Match view variables only
  Different from “commit atomicity”
INSERTPAIR (x, y, retval)
  acquire(global_lock)
  if (retval == failure)
    release(global_lock)
    return failure
  i = FindSlot (x)
  ..........
  j = FindSlot (y)
  acquire(A[i])
  acquire(A[j])
  A[i].valid = true
  A[j].valid = true
  release(A[i])
  release(A[j])
  return success
The global lock serializes the methods; this separates checking for concurrency errors from sequential verification. The approach is easy to use, since there is no need to write a separate spec. Our approach can employ specifications in different forms and at different levels of detail. However, it is straightforward to obtain an executable specification from the atomized version of the impl. To accomplish this, we use a single global lock to fully synchronize the method bodies. You can see the modified version of the InsertPair method for the spec. A slight modification is needed so that the spec methods can model nondeterministic failures of the impl due to concurrency. The spec methods accept, as an input parameter, the return value from the impl trace. When a method is given the return value “failure”, it does nothing, even though without this check it could complete its job with “success” as the return value. Although they use the same method implementation, the impl and the spec can have different states at a commit point. But we use the canonicalized forms of the view, so an exact match between the impl and the spec states is not required.
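A minimal sketch of such an atomized spec follows, assuming an array-based multiset with `content` and `valid` fields as in the slides. The class name, the `view` method, and the handling of a full array (which the slide elides with dots) are assumptions for illustration only.

```python
import threading


class AtomizedMultisetSpec:
    """Hypothetical atomized spec: every method body runs under one global lock."""

    def __init__(self, n):
        self.global_lock = threading.Lock()
        self.content = [None] * n
        self.valid = [False] * n

    def _find_slot(self, x):
        # Reserve the first empty slot for x; -1 if the array is full.
        for i, c in enumerate(self.content):
            if c is None:
                self.content[i] = x
                return i
        return -1

    def insert_pair(self, x, y, retval):
        # retval is the value the implementation returned for this call.
        with self.global_lock:
            if retval == "failure":
                # Model a nondeterministic failure of the impl: do nothing.
                return "failure"
            i = self._find_slot(x)
            j = self._find_slot(y)
            if i < 0 or j < 0:
                # Assumption: roll back any reserved slot when the array is full.
                for k in (i, j):
                    if k >= 0:
                        self.content[k] = None
                return "failure"
            self.valid[i] = self.valid[j] = True
            return "success"

    def view(self):
        # Canonical view: the sorted multiset of valid elements.
        return sorted(c for c, v in zip(self.content, self.valid) if v)
```

Because the view is a sorted list of valid elements, two states that place the same elements in different slots compare equal, which is the sense in which an exact state match is not required.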


Experience: The Boxwood Project
[Figure: Boxwood architecture. BLINKTREE MODULE (root pointer node at the root level, internal pointer nodes at levels n+1 and n, leaf pointer nodes at level 0, data nodes); CACHE MODULE (dirty cache entries, clean cache entries); CHUNK MANAGER MODULE (GlobalDiskAllocator, Replicated Disk Manager); the modules are connected by Read and Write operations.]
We verified all modules of this system. We caught an interesting, tricky error that had gone undetected. We ran Vyrd on the Boxwood project from Microsoft. The goal of the Boxwood project is to build a distributed abstract storage infrastructure for applications with high data storage and retrieval requirements. Here, you see a high-level picture of Boxwood. Boxwood has a concurrent B-link tree implementation in the BLinkTree module. The BLinkTree module uses a cache module to store and retrieve the data at its tree nodes quickly. The cache module makes its data persistent using a chunk manager module. The chunk manager implements a distributed storage system that abstracts the storage from the upper layers.

Experience: Experimental Results
Scalable method: caught bugs in industrial-scale designs
  Boxwood (30K LOC)
  Scan Filesystem (Windows NT)
  Java libraries with known bugs
Moderate annotation effort
  Several lines for each method
I/O-refinement
  Low logging and verification overhead
  BLinkTree: logging 17% over testing, refinement check 27%
View-refinement
  BLinkTree: logging 20% over testing, refinement check 137%
  More effective in catching errors
  Boxwood Cache, # of random method calls until error caught: view-refinement 26, I/O-refinement 539
Remove tables. Explain why cache bug is tricky.
Here are the experimental results from applying Vyrd to the BLinkTree and cache modules. The overall results show that Vyrd can handle industrial-scale designs with modest logging and verification costs. I/O refinement requires only method call, commit and return actions to be logged, so its logging overhead is much lower than that of view refinement. Note that the logging overhead for view refinement includes the logging for I/O refinement. But view refinement is more effective in catching bugs: the first table shows the large difference in the number of method calls made before the error is detected. Logging for view refinement can become expensive as the granularity of the logged actions gets finer.

Experience: Concurrency Bug in Cache
Very similar to bug found in Scan file system
Had not been caught by developers
  Current version does not contain bug
Bug manifestation:
  Cache entry is correct
  Permanent storage has corrupted data
Cause of bug: concurrent execution of Write and Flush on the same entry
  Write to a dirty entry not locked properly
  Flush writes corrupted data to Chunk Manager
  Marks entry clean
Hard to catch through testing
  As long as Reads hit in Cache, return value is correct
  Caught through testing only if
    Cache fills, clean entry in Cache is evicted
    No Writes to entry in the meantime
    Entry read after eviction
  Very unlikely
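The scenario above can be replayed deterministically in a small sketch. This is not Boxwood code: the dictionary-based cache entry, the `storage` dictionary standing in for the Chunk Manager, and the four-byte buffers are assumptions chosen to make the interleaving visible.

```python
def simulate_write_flush_race():
    """Replay one bad interleaving of Write and Flush on the same dirty entry."""
    old = [1, 1, 1, 1]
    new = [2, 2, 2, 2]
    entry = {"data": list(old), "dirty": True}
    storage = {}  # stands in for the Chunk Manager (hypothetical)

    # Write starts copying the new buffer into the entry without proper locking.
    entry["data"][0] = new[0]
    entry["data"][1] = new[1]

    # Flush runs in the middle of the copy: it persists the half-written
    # buffer and marks the entry clean.
    storage["chunk"] = list(entry["data"])
    entry["dirty"] = False

    # Write finishes the copy; the cache entry now holds the correct data.
    entry["data"][2] = new[2]
    entry["data"][3] = new[3]
    return entry, storage


entry, storage = simulate_write_flush_race()
print(entry["data"])     # [2, 2, 2, 2]  cache entry is correct
print(storage["chunk"])  # [2, 2, 1, 1]  persistent copy mixes old and new data
print(entry["dirty"])    # False: nothing will trigger another flush
```

Every read that hits the cache sees `[2, 2, 2, 2]`, so a test observes the corruption only after the clean entry is evicted and re-read from storage, which is why the slide calls this hard to catch through testing.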


View-refinement
FINDSLOT (x) // Buggy
  for i ← 1 to n
    if (A[i].content == null)
      acquire(A[i])
      A[i].content = x
      release(A[i])
      return i
  return 0
[Figure: interleaved execution of T1: InsertPair(5,7) and T2: InsertPair(6,8). Both threads read A[0].elt = null, and T1 also reads A[1].elt = null. T1 stores 5 and 7 (elt = 5, 7; valid = F, F, F, F), then sets the valid bits (valid = T, T, F, F); LookUp(5)=true, LookUp(7)=true. T2 then stores 6 in A[0], which overwrites 5 (elt = 6, 7), reads A[2].elt = null and stores 8 (elt = 6, 7, 8); LookUp(6)=true, LookUp(8)=true.]
Refinement no better than testing if interleaving required to trigger bug does not happen
Long tests with many threads may trigger bug
I/O refinement is still not sufficient to catch some errors that do not show up in method calls and return values. Consider the buggy impl of FindSlot on the left. It does not lock the elements before reading their content fields. It acquires the lock only just before starting the modification. As a result, as you see on the right, two InsertPairs can be interleaved in such a way that both read the first slot in the array as empty, but only one of them succeeds in inserting its x into this slot. In this case, the integer 5 inserted by the first thread is overwritten by the second thread. Each thread checks the InsertPair it runs by calling LookUp with the same arguments that the InsertPair had. If the LookUps are scheduled in this way, they all return true, although the final state is inconsistent with the termination status of the methods. The error is there and is observable from the state, but I/O refinement is unable to detect it in this scenario because the LookUps are not scheduled at the right places.
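The interleaving described in the notes can be replayed step by step. The sketch below is illustrative only: it hard-codes one schedule instead of using real threads, uses a four-slot array, and takes the view to be the sorted multiset of valid elements, all of which are assumptions.

```python
def run_buggy_interleaving():
    """Replay the schedule in which T2's stale read makes it overwrite T1's 5."""
    content = [None] * 4
    valid = [False] * 4

    def lookup(x):
        return any(c == x and v for c, v in zip(content, valid))

    results = {}

    # Both threads read slot 0 as empty before either one writes it
    # (the buggy FindSlot reads without holding the lock).
    t1_slot = 0 if content[0] is None else None
    t2_slot = 0 if content[0] is None else None

    # T1: InsertPair(5, 7) stores its elements, then commits.
    content[t1_slot] = 5
    content[1] = 7
    valid[0] = valid[1] = True
    results[5] = lookup(5)
    results[7] = lookup(7)

    # T2: InsertPair(6, 8) acts on its stale read of slot 0, then commits.
    content[t2_slot] = 6   # overwrites 5
    content[2] = 8
    valid[0] = valid[2] = True
    results[6] = lookup(6)
    results[8] = lookup(8)

    impl_view = sorted(c for c, v in zip(content, valid) if v)
    spec_view = [5, 6, 7, 8]  # what two successful InsertPairs should yield
    return results, impl_view, spec_view


results, impl_view, spec_view = run_buggy_interleaving()
print(results)    # {5: True, 7: True, 6: True, 8: True}
print(impl_view)  # [6, 7, 8]
print(spec_view)  # [5, 6, 7, 8]
```

All four LookUps return true, so the call and return values alone reveal nothing, while comparing views at T2's commit point exposes the missing 5 immediately.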

VyrdMC: The Idea
[Figure: Thread 1 calls LookUp(3), Thread 2 calls Insert(3), Thread 3 calls Delete(3).]
For fixed test program, model checker explores all distinct thread interleavings
Rationale: explore “concurrency component” of state space
Make fixed program interesting
  Lead program to non-trivial “anchor” state
    Issue a long sequence of random method calls from initial state
  Start multi-threaded test from anchor state:
    Each thread issues a small number of method calls
    Pick method arguments so threads contend for access to same portion of program state
The goal of the test harness is to generate concurrent execution traces of the impl. To accomplish this, it forks a number of threads. Each thread issues a number of method calls consecutively. The arguments to these methods are picked so that the threads contend over the same region of the data structure, in order to reveal concurrency errors. The runtime information regarding each action that is run atomically is recorded in the shared log, so that at the end we have a linear sequence of actions in the log.
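To give a feel for what "all distinct thread interleavings" means for a fixed test program, here is a naive enumeration sketch. It is only an illustration under simplifying assumptions: each thread is a fixed list of atomic actions, and the function name `interleavings` is hypothetical. A real execution-based model checker explores schedules of the running program rather than permuting action lists.

```python
def interleavings(threads):
    """Yield every interleaving of the per-thread action lists.

    Each thread's own program order is preserved; only the relative order
    of actions from different threads varies.
    """
    if all(len(t) == 0 for t in threads):
        yield []
        return
    for i, t in enumerate(threads):
        if t:
            # Let thread i take its next action, then interleave the rest.
            rest = threads[:i] + [t[1:]] + threads[i + 1:]
            for tail in interleavings(rest):
                yield [t[0]] + tail


# The three-thread test from the slide, one method call per thread.
test_program = [["LookUp(3)"], ["Insert(3)"], ["Delete(3)"]]
for schedule in interleavings(test_program):
    print(schedule)
```

The number of schedules grows very quickly with the number of actions per thread, which is one reason the slides keep the multi-threaded part of the test short and start it from an interesting anchor state.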

Guiding Vyrd with a Model Checker
[Figure: two alternative interleavings of InsertP(5,7) and InsertP(6,8) explored by the model checker, shown as sequences of logged actions: calls, reads of A[i].elt, writes to A[i].content, ATOMIC blocks that set A[i].valid=true, and “success” returns. In one interleaving the second call writes 6 to A[0] and 8 to A[2]; in the other it writes 6 to A[2] and 8 to A[3].]

Vyrd + ExplicitMC (like Java PathFinder)
Important side benefit: automates, streamlines logging and replaying of implementation
  Makes it program independent
Logged actions: each bytecode instruction
  Model checker’s VM has hooks for calling methods of refinement checker, logger after each bytecode instruction
Semantics of replay for VM’s actions defined, already implemented in model checker’s VM
  Refinement checker simply uses it
Automates most labor-intensive part of runtime refinement checking
Programmer only
  provides commit point annotations,
  writes abstraction function (for view refinement only)

Handling Backtracking
Special log entries indicate
  parent-child relationships in tree,
  jumps to ancestors
Vyrd
  resets to initial state,
  rewinds to beginning of log and replays actions from root to back-tracked ancestor
[Figure: the two interleavings of InsertP(5,7) and InsertP(6,8) from the previous slide, shown as branches of a tree of logged actions (calls, reads of A[i].elt, writes to A[i].content, ATOMIC blocks that set A[i].valid=true, and “success” returns).]
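The reset-and-replay step can be sketched as follows. The log entry format `(node_id, parent_id, action)`, the representation of an action as a `(slot, value)` write, and the function names are assumptions for illustration; the slides do not specify Vyrd's actual log format.

```python
def path_from_root(parent, node):
    """Follow parent links up to the root and return the path root-first."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()
    return path


def replay_to(log, target, initial_state):
    """Reset to the initial state and replay actions from the root to target.

    log is a list of (node_id, parent_id, action) entries, where action is a
    (slot, value) write. Entries on abandoned branches are skipped.
    """
    parent = {node: par for node, par, _ in log}
    action = {node: act for node, _, act in log}
    state = dict(initial_state)  # reset to the initial state
    for node in path_from_root(parent, target):
        slot, value = action[node]
        state[slot] = value
    return state


# Node 2 is explored first; the model checker then backtracks to node 1
# and explores node 3 instead.
log = [
    (0, None, ("A[0]", 5)),
    (1, 0, ("A[1]", 7)),
    (2, 1, ("A[0]", 6)),
    (3, 1, ("A[2]", 8)),
]
print(replay_to(log, 2, {}))  # {'A[0]': 6, 'A[1]': 7}
print(replay_to(log, 3, {}))  # {'A[0]': 5, 'A[1]': 7, 'A[2]': 8}
```

Replaying from the root on every backtrack is simple but repeats work; the sketch makes no attempt to cache intermediate states.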

Vyrd + Stateless MC (like Verisoft)
No state caching, just store path from root to node
  Terminates because state-transition diagram is a DAG/tree
Benefits: possibly more efficient
  Vyrd is re-started anyway
  At re-starts, Vyrd resets state
  No need to rewind to beginning of log
Drawbacks:
  Must manually implement logging, replay
  Must instrument implementation source or byte code

Partial order reductions
Pure blocks, annotations
  If pure block terminates normally, no net modification of program state
  Example:
    Lock node
    Check if data sought is in the node
    Release lock if it isn’t
Many industrial-scale programs make extensive use of pure blocks
Enumerating all distinct interleavings of even two concurrent methods may be infeasible if purity not exploited
Solution: use non-deterministic abstraction of pure block
  Makes pure execution of pure block independent of any other action


The Location Pairs Metric
If I am after concurrency errors only, what unexamined scenario should I try to trigger?
Coverage metrics: link between testing and model checking
  Communicate partial results, testing goals between tools
  Direct tools toward unexplored, distinct new executions
The “location pairs” (LP) metric
  Directed at concurrency errors ONLY
  Focus: “high-level” data races
    Atomicity violations
    Refinement violations
    All variables may be lock-protected, but operations not implemented atomically
Many widely used software applications are built on concurrent data structures. Examples are file systems, databases, internet services and some standard Java and C# class libraries. These systems frequently use intricate synchronization mechanisms to get better performance in a concurrent environment. This makes them prone to concurrency errors. Concurrency errors can have serious consequences, such as data loss or corruption. Unfortunately, these errors are typically hard to detect and reproduce through purely testing-based techniques.

Idea
Observation: bug occurs whenever
  Method1 executes up to line X, context switch occurs
  Method2 starts execution from line Y
  Provided there is a data dependency between
    Method1’s code “right before” line X: BlockX
    Method2’s code “right after” line Y: BlockY
Bug description follows pattern above
Only requirement on program state, other threads, etc.: make the interleaving above possible
  May require many other threads, complicated program state, ...
A “one-bit” data abstraction captures error scenario
  Depdt: is there a data dependency between BlockX and BlockY?

public synchronized StringBuffer append(StringBuffer sb) {
    int len = sb.length();
    int newCount = count + len;
    if (newCount > value.length)
        ensureCapacity(newCount);
    sb.getChars(0, len, value, count);
    count = newCount;
    return this;
}

public synchronized void setLength(int newLength) {
    ...
    if (count < newLength) {
        ...
    } else {
        count = newLength;
        ...
    }
}

private static void CpToCache(byte[] buf, CacheEntry te, int lsn, Handle h) {
    ...
    for (int i = 0; i < buf.length; i++) {
        te.data[i] = buf[i];
    }
    ...
    te.lsn = lsn;
}

public static void Flush(int lsn) {
    ...
    lock (clean) {
        BoxMain.alloc.Write(h, te.data, te.data.length, 0, 0, WRITE_TYPE_RAW);
    }
    ...
}

public synchronized StringBuffer append(StringBuffer sb) {
1   int len = sb.length();
2   int newCount = count + len;
3   if (newCount > value.length)
4       ensureCapacity(newCount);
5   sb.getChars(0, len, value, count);
6   count = newCount;
7   return this;
8 }
Actions of append, with the locations L1 and L2 marked between them:
    acquire(this)
    invoke sb.length()
  L1
    int len = sb.length()
  L2
    int newCount = count + len
    if (newCount > value.length)
    expandCapacity(newCount);
    invoke sb.getChars()
    sb.getChars(0, len, value, count)
    count = newCount
    return this
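The atomicity problem in append can be replayed deterministically with a toy model. The sketch below is not the Java class: `ToyBuffer`, `append_with_interleaving`, and the `between` callback that stands in for a context switch are hypothetical names, and `set_length` models only shrinking. It shows that a length read at one location can be stale by the time the characters are copied, because append holds the destination's lock but not the source's.

```python
class ToyBuffer:
    """Toy stand-in for a string buffer; each method is atomic on its own."""

    def __init__(self, text=""):
        self.value = list(text)

    def length(self):
        return len(self.value)

    def set_length(self, n):
        # Only shrinking is modeled in this sketch.
        del self.value[n:]

    def get_chars(self, begin, end):
        if end > len(self.value):
            raise IndexError("source range exceeds current length")
        return self.value[begin:end]


def append_with_interleaving(dst, src, between):
    """Append src to dst, running `between` after the length of src is read."""
    n = src.length()   # length read; may be stale after a context switch
    between()          # another thread may run here and modify src
    dst.value.extend(src.get_chars(0, n))
    return dst


src = ToyBuffer("abc")
dst = ToyBuffer("x")
append_with_interleaving(dst, src, lambda: None)
print("".join(dst.value))  # xabc

try:
    append_with_interleaving(ToyBuffer("x"), src, lambda: src.set_length(0))
except IndexError as err:
    print("stale length:", err)
```

In terms of the location pairs idea, the interesting pair is the point in append just after the length is read and the start of setLength on the source buffer.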

Coverage FSM State
State for a pair (Method 1, Method 2): (LX, pend1, LY, pend2, depdt)
  LX: location in the CFG of Method 1
  pend1: is an “interesting” action in Method 1 expected next?
  LY: location in the CFG of Method 2
  pend2: is an “interesting” action in Method 2 expected next?
  depdt: do actions following LX and LY have a data dependency?

Coverage FSM
State: (L1, !pend1, L3, !pend2, depdt)
Transitions: t1: L1 → L2, t2: L3 → L4

Coverage Goal
The “pend1” bit gets set when
  the depdt bit is TRUE, and
  Method2 takes an action
  Intuition: Method1’s dependent action must follow
Must cover all (reachable) transitions of the form
  p = (LX_p, TRUE, LY, pend2_p, depdt_p) → q = (LX_q, pend1_q, LY, pend2_q, depdt_q)
  p = (LX, pend1_p, LY_p, TRUE, depdt_p) → q = (LX, pend1_q, LY_q, pend2_q, depdt_q)
Separate coverage FSM for each method pair: FSM(Method1, Method2)
  Cover required transitions in each FSM
If Method1 calls Method3: considered when FSM(Method3, Method2) is covered
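One way to read the two transition forms above is as a predicate on pairs of coverage FSM states. The sketch below is a hedged interpretation, not the authors' tool: the `State` tuple, `is_required`, and `coverage` are hypothetical names, and the set of reachable transitions is assumed to be given.

```python
from collections import namedtuple

# (LX, pend1, LY, pend2, depdt) as on the Coverage FSM State slide.
State = namedtuple("State", "lx pend1 ly pend2 depdt")


def is_required(p, q):
    """True if the transition p -> q matches one of the two required forms."""
    # Form 1: pend1 is TRUE in p and LY is unchanged (Method1 takes the step).
    method1_step = p.pend1 and p.ly == q.ly
    # Form 2: pend2 is TRUE in p and LX is unchanged (Method2 takes the step).
    method2_step = p.pend2 and p.lx == q.lx
    return method1_step or method2_step


def coverage(reachable, observed):
    """Fraction of required reachable transitions that were observed."""
    required = {t for t in reachable if is_required(*t)}
    if not required:
        return 1.0
    return len(required & set(observed)) / len(required)


s0 = State("L1", False, "L3", False, True)
s1 = State("L1", True, "L4", False, True)    # Method2 moved L3 -> L4; pend1 set
s2 = State("L2", False, "L4", False, False)  # Method1's pending action, L1 -> L2
reachable = [(s0, s1), (s1, s2)]
print(coverage(reachable, [(s0, s1)]))  # 0.0
print(coverage(reachable, [(s1, s2)]))  # 1.0
```

Here only the transition from s1 to s2 counts toward the goal, matching the intuition that once Method2 has acted on dependent data, Method1's dependent action must be seen to follow.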

Reducing the Coverage FSM
Method-local actions: basic block consisting of method-local actions considered a single atomic action
Normal executions of pure blocks: considered a “no-op”
  Modeled by “bypass transition” in coverage FSM

Empirical evidence
Errors captured by metric
  100% metric → bug guaranteed to be triggered
Preliminary study
  Bugs in Java class libraries
  Bug found in Boxwood cache
  Bug found in Scan file system
  Bug categories reported in E. Farchi, Y. Nir, S. Ur, “Concurrent Bug Patterns and How to Test Them”, 17th Intl. Parallel and Distributed Processing Symposium (IPDPS ’03)
# of locations per method in Boxwood: ~10, after factoring out atomic and pure blocks
How many are covered by random testing? How does coverage change over time?
  Don’t know yet. Implementing coverage measurement tool.

Future Work: Approximating Reachable LP Set
LP reachability undecidable
Metric only intended as aid to programmer
  What have I tested?
  What should I try to test?
  If an unreached LP can’t be ruled out, make sure it does not lead to error
Future work: better approximate reachable LP set
  Do conservative reachability analysis of coverage FSM using predicate abstraction
  Programmer can add predicates for better FSM reduction
In this part of the talk I’ll tell you about our experience using the Vyrd tool.

Re-cap
Vyrd:
  Novel correctness criteria: I/O and view refinement
  Improved observability, more extensive checking than testing
  Low-overhead, applicable to industrial-scale programs
  Caught previously undetected errors
VyrdMC:
  More extensive validation from small test programs
  Use execution-based model checker to explore all distinct thread interleavings for simple test
  Improve coverage, reduce annotation burden
The LP metric
  Inspired by atomicity, refinement violations in real examples
  Code-based, practical metric that captures concurrency errors well