Runtime Refinement Checking of Concurrent Data Structures (the VYRD project) Serdar Tasiran Koç University, Istanbul, Turkey Shaz Qadeer Microsoft Research, Redmond, USA

The Problem
Given
- a specification for a data structure
  - high-level, executable
  - sequential, with atomic operations
- an implementation
  - concurrent
  - not atomic
  - not necessarily linearizable
  - may have race conditions
Verify that (all) executions of the implementation are consistent with the specification.

Why runtime verification?
Static techniques
- Must reason about the entire state space of the implementation
- Theorem proving techniques: need a representation invariant for the implementation
  - Difficult for multi-threaded implementations
- Compositional methods
  - Better suited for correctness proofs, whereas we want to catch concurrency bugs
  - Proofs difficult in practice
    - Need an abstract model for each component's environment
    - Coordination of proof sub-tasks difficult for large programs
Runtime verification
- No need for component environment abstractions
- Gives up exhaustiveness, but can handle practical programs

Outline
- Preliminaries
- Atomicity
- Refinement
  - I/O refinement
- The multiset example
- Checking I/O refinement
- Improving I/O refinement
- Comparison with other correctness criteria

Semantics: State Transition Systems
- A set of program variables
- The set of program states
- A transition function: given the current state and the action performed, specifies the next state
- Actions
  - Visible: e.g. method calls, returns
  - Invisible: e.g. updates to method-local variables

More preliminaries
- Call action: (thread, “Call”, methodName, args)
- Return action: (thread, “Return”, methodName, rets)
- A run: $s_0 \xrightarrow{\alpha_1} s_1 \xrightarrow{\alpha_2} s_2 \cdots s_{n-1} \xrightarrow{\alpha_n} s_n$
- Corresponding trace: the sequence of visible actions along the run, here $\alpha_2\ \alpha_3 \ldots \alpha_{n-1}$

Well-formed atomic traces
- No visible actions by any other thread or by another method between matching “call” and “return” actions
[Figure: the trace Call0 a0 a1 Return0, Call1 b0 Return1, Call2 c0 c1 Return2, with a matching call and return marked]

Well-formed atomic traces
[Figure: the trace Call0 a0 a1 Return0, Call1 b0 Return1, Call2 c0 c1 Return2, with one method execution marked as an atomic fragment and the pair (Call0, Return0) marked as a fragment signature]
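The atomicity condition and the fragment signature can be made concrete with a short sketch. The tuple encoding of actions, the function names, and the choice to raise `ValueError` are assumptions made for illustration; they are not taken from the VYRD tool.

```python
from typing import List, Tuple

# Assumed encoding: (thread, kind, method, payload). kind is "Call", "Return",
# or any other label for a visible action inside a method execution.
Action = Tuple[str, str, str, tuple]


def atomic_fragments(trace: List[Action]) -> List[List[Action]]:
    """Split a trace into atomic fragments.

    Raises ValueError if another thread or another method performs a visible
    action between a call and its matching return.
    """
    fragments: List[List[Action]] = []
    current = None  # the fragment being built, or None between fragments
    for action in trace:
        thread, kind, method, _ = action
        if current is None:
            if kind != "Call":
                raise ValueError(f"action outside any method: {action}")
            current = [action]
            continue
        owner_thread, _, owner_method, _ = current[0]
        if (thread, method) != (owner_thread, owner_method) or kind == "Call":
            raise ValueError(f"fragment interrupted by {action}")
        current.append(action)
        if kind == "Return":
            fragments.append(current)
            current = None
    if current is not None:
        raise ValueError("trace ends inside a method execution")
    return fragments


def signature(fragment: List[Action]) -> Tuple[Action, Action]:
    """Fragment signature: the call action paired with the return action."""
    return (fragment[0], fragment[-1])
```

Because the return action carries the return value, two fragments with the same signature agree on both the arguments and the result of the method.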

Determinism
- Recall: the return action contains the return value
- An atomic state transition system is deterministic iff the same initial state and the same fragment signature imply the same final state
- We require specs to be atomic and deterministic
[Figure: the fragment signature (Call0, Return0)]

Refinement
- “Projections of traces (visible actions) to application threads match”
- S: state transition system for the specification
- I: state transition system for the implementation
- I refines S iff for every trace $\tau_I$ of I there is a trace $\tau_S$ of S such that, for every application thread t, $\tau_I|_t = \tau_S|_t$
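As a rough illustration of the definition, the sketch below checks the projection condition over finite, explicitly listed sets of traces. The action encoding and the function names are assumptions; a real implementation has infinitely many traces in general, which is why the talk moves to runtime checking.

```python
from typing import Iterable, List, Tuple

# Assumed encoding of a visible action: (thread, kind, method, payload).
Action = Tuple[str, str, str, tuple]
Trace = List[Action]


def project(trace: Trace, thread: str) -> Trace:
    """Projection of a trace onto the actions of one thread."""
    return [action for action in trace if action[0] == thread]


def matches(impl_trace: Trace, spec_trace: Trace) -> bool:
    """True iff the two traces have equal projections for every thread."""
    threads = {a[0] for a in impl_trace} | {a[0] for a in spec_trace}
    return all(project(impl_trace, t) == project(spec_trace, t) for t in threads)


def refines_on(impl_traces: Iterable[Trace], spec_traces: List[Trace]) -> bool:
    """Refinement restricted to the given finite sets of traces."""
    return all(any(matches(ti, ts) for ts in spec_traces) for ti in impl_traces)
```

Note that the condition compares each thread's own view only; the relative order of actions of different threads may differ between the two traces.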

Refinement
[Figure: an implementation trace $\tau_I$ in which the actions of different method executions interleave, shown above a specification trace $\tau_S$ in which each execution is contiguous: Call0 a0 a1 Return0, Call1 b0 Return1, Call2 c0 c1 Return2]

I/O Refinement
- Refinement: $\forall \tau_I\ \exists \tau_S\ \forall t:\ \tau_I|_t = \tau_S|_t$
- Notion of refinement: choice of what actions are visible
- I/O refinement: define only “call” and “return” actions as visible
  - For every sequence of calls and returns in the implementation, there exists a spec run with matching calls and return values
- Spec is atomic and deterministic
  - Spec run gives a linear order of method executions, called the “witness interleaving”
- Practical issue: too many possible interleavings
- Idea: infer the witness interleaving from runtime analysis

Commit actions and witness interleavings
[Figure: the implementation trace $\tau_I$ with a commit action marked in each method execution, shown above the specification trace $\tau_S$ (Call0 a0 a1 Return0, Call1 b0 Return1, Call2 c0 c1 Return2)]

Commit action, intuition:
- The first line of code that makes the modified view of the data structure state visible to other threads
- Ordering of commit actions: the application's view of the ordering of operations
- Can be seen as a simplified way to construct an abstraction map
- The user specifies the line of code that corresponds to the commit action

Example: Multiset
Multiset data structure with two operations
- INSERTPAIR(x,y)
  - If it succeeds, inserts integers x and y into the multiset
  - If it fails, leaves the multiset unmodified
- LOOKUP(x)
  - Returns true iff x is in the multiset

Multiset Specification
INSERTPAIR(x,y)
  status ← success or failure
  if (status = success)
    M ← M ∪ {x,y}
  return status
LOOKUP(x)
  return (x ∈ M)
- INSERTPAIR allows non-deterministic failure or success
- Makes the choice visible via the return value
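One way to make this specification executable and deterministic is to pass the observed return value into the spec method, so that the non-deterministic choice is resolved by the signature. The class below is a minimal sketch under that assumption; `MultisetSpec` and its method names are hypothetical.

```python
from collections import Counter


class MultisetSpec:
    """Sequential, atomic multiset specification (illustrative sketch)."""

    def __init__(self):
        self.M = Counter()  # element -> multiplicity

    def insert_pair(self, x, y, status):
        # `status` is the return value chosen (or observed); it resolves the
        # non-deterministic choice between success and failure.
        if status not in ("success", "failure"):
            raise ValueError(f"unknown status: {status}")
        if status == "success":
            self.M.update([x, y])
        return status

    def lookup(self, x):
        return self.M[x] > 0
```

With the status fixed by the caller, the same initial state and the same call and return pair always lead to the same final state, which is the determinism requirement stated earlier.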

Multiset Implementation
- Implementation uses an array to represent the multiset
- Only elements with the valid bit set to “true” are in the multiset
[Figure: array A with slots 1 to 4, each with a content field and a valid bit]

Multiset Implementation
LOOKUP(x)
  for i ← 1 to n
    ACQUIRE(L[i])
    if (A[i].valid and A[i].content = x)
      RELEASE(L[i])
      return true
    else
      RELEASE(L[i])
  return false
[Figure: array A holding 5 and 7, with content and valid fields]

Implementation helper method: FINDSLOT(x)
- Finds a free slot in the array to insert x
- Returns the index if a slot is found
- Returns 0 otherwise
FINDSLOT(x)
  for i ← 1 to n
    ACQUIRE(L[i])
    if (A[i].content = null)
      A[i].content ← x
      RELEASE(L[i])
      return i
    else
      RELEASE(L[i])
  return 0

Multiset Implementation
INSERTPAIR(x,y)
  i ← FINDSLOT(x)
  if (i = 0)
    return failure
  j ← FINDSLOT(y)
  if (j = 0)
    A[i].content ← null
    return failure
  ACQUIRE(L[i])
  ACQUIRE(L[j])
  A[i].valid ← true
  A[j].valid ← true
  RELEASE(L[i])
  RELEASE(L[j])
  return success
[Figure: four snapshots of array A (content and valid fields) as 5 and 7 are inserted]
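The three methods can be transcribed into runnable form roughly as follows. This is a sketch that mirrors the pseudocode, including its unlocked write on the failure path; the class and method names are assumptions, and index 0 of the arrays is left unused so that 0 can keep its meaning of "no slot found".

```python
import threading


class Slot:
    def __init__(self):
        self.content = None
        self.valid = False


class ArrayMultiset:
    """Array-based multiset mirroring the slide pseudocode (sketch)."""

    def __init__(self, n):
        self.n = n
        self.A = [Slot() for _ in range(n + 1)]            # A[1..n]
        self.L = [threading.Lock() for _ in range(n + 1)]  # L[1..n]

    def find_slot(self, x):
        # Reserve the first free slot for x; its valid bit stays false.
        for i in range(1, self.n + 1):
            with self.L[i]:
                if self.A[i].content is None:
                    self.A[i].content = x
                    return i
        return 0

    def insert_pair(self, x, y):
        i = self.find_slot(x)
        if i == 0:
            return "failure"
        j = self.find_slot(y)
        if j == 0:
            self.A[i].content = None  # written without holding L[i], as on the slide
            return "failure"
        with self.L[i], self.L[j]:
            self.A[i].valid = True
            self.A[j].valid = True
        return "success"

    def lookup(self, x):
        for i in range(1, self.n + 1):
            with self.L[i]:
                if self.A[i].valid and self.A[i].content == x:
                    return True
        return False
```

A single-threaded call such as `ArrayMultiset(3).insert_pair(5, 7)` succeeds; the interesting behaviours only appear when several threads interleave their `find_slot` calls.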

INSERTPAIR(x,y)
  i ← FINDSLOT(x)
  if (i = 0)
    return failure
  j ← FINDSLOT(y)
  if (j = 0)
    A[i].content ← null
    return failure
  ACQUIRE(L[i])
  ACQUIRE(L[j])
  A[i].valid ← true
  A[j].valid ← true
  RELEASE(L[i])
  RELEASE(L[j])
  return success
[Figure: the INSERTPAIR code annotated with its commit action and with the results of LOOKUP(x) and LOOKUP(y)]

Runtime Checking of I/O Refinement
- Spec is atomic and deterministic
  - Given a sequence of method calls and return values, there is at most one run
- Checking procedure:
  - Execute the implementation
  - Record
    - order of commit points
    - method calls and return values
  - Execute spec methods in the order they committed
  - Check fails if the spec is not able to execute a method with the given return value
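The checking procedure can be sketched as a replay loop over the operations in commit order. The log format (a list of `(method, args, observed_return)` tuples already sorted by commit point), the embedded spec class, and the convention of returning the index of the first violation are all assumptions made for this illustration.

```python
from collections import Counter


class MultisetSpec:
    """Minimal executable spec used by the checker (illustrative)."""

    def __init__(self):
        self.M = Counter()

    def insert_pair(self, x, y, status):
        if status == "success":
            self.M.update([x, y])
        return status

    def lookup(self, x):
        return self.M[x] > 0


def check_io_refinement(committed_ops):
    """Replay operations on the spec in commit order.

    Returns None if every observed return value is one the spec can produce,
    otherwise the index of the first operation the spec cannot reproduce.
    """
    spec = MultisetSpec()
    for index, (method, args, observed) in enumerate(committed_ops):
        if method == "InsertPair":
            if observed not in ("success", "failure"):
                return index
            spec.insert_pair(*args, status=observed)
        elif method == "LookUp":
            if spec.lookup(*args) != observed:
                return index
        else:
            raise ValueError(f"unknown method: {method}")
    return None
```

Since the spec is deterministic, a single replay suffices: there is no search over alternative interleavings once the commit order is fixed.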

Runtime Checking of I/O Refinement
- I/O refinement check may fail because
  - the implementation is wrong, or
  - the selection of commit points is wrong
- Can tell which is the case by comparing the witness interleaving with the implementation trace
- Improves testing: in multi-threaded tests without a witness interleaving, it is difficult to decide the expected return value of a method or the final state at the end of a test
  - Must consider all interleavings, or
  - Forced to be too permissive

Off-line checking using a log
- Avoid overhead and concurrency impact
  - Write actions of the implementation into a log
  - Verification: separate thread that only reads from the log
- Log: sequence of visible actions and commit actions
  - Actions appear in the log in the order they happen in the implementation
- Logged actions (not operations) are serialized by the log
  - Initial part of the log is in memory, so contention for the log has low impact
  - Must perform each action atomically with its log entry insertion
  - Optimizations possible
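The requirement that an action and its log entry happen as one atomic step can be sketched with a single lock around both, plus a separate thread that only reads the log. `ActionLog`, `perform`, and `run_verifier` are hypothetical names, and an in-memory queue stands in for whatever log storage the real tool uses.

```python
import queue
import threading


class ActionLog:
    """In-memory log that serializes logged actions (illustrative sketch)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = queue.Queue()

    def perform(self, action):
        # Run the action and append the entry it returns as one atomic step,
        # so that log order equals the order in which actions took effect.
        with self._lock:
            entry = action()
            self._entries.put(entry)
        return entry

    def close(self):
        self._entries.put(None)  # sentinel: no more entries

    def drain(self):
        # Verification side: only reads from the log.
        while True:
            entry = self._entries.get()
            if entry is None:
                return
            yield entry


def run_verifier(log, seen):
    """Stand-in for the checker thread: collects entries in log order."""
    for entry in log.drain():
        seen.append(entry)
```

Only the individual logged actions are serialized here, not whole operations, so method executions of different threads can still overlap between their logged steps.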

Improving I/O refinement
- Why: the test program may not perform “observations” frequently enough
  - Inserting them may affect concurrency too much
  - Observations may not get interleaved at the most interesting places
- Example: if a multiset test program does no LOOKUPs, all I/O refinement tests will pass

Improving I/O refinement: The “view”
- Include a state-based condition in the definition of refinement
- Define an auxiliary, hypothetical variable “view”
  - Both spec and implementation have their own copy of view
  - User indicates how view gets updated in both
- view represents only the “abstract state” of the data structure
  - Abstracts away information not relevant to data structure state, from both spec and implementation
  - Method return values are determined uniquely by view
- Initialized to the same value in spec and implementation
  - In spec: updated once, atomically, between call and return
  - In implementation: updated once, atomically with the commit action
- During runtime checking, check that the views match for each method

Definition of view for multiset
- Spec's definition of view: contents of the multiset in the spec
- Implementation's definition of view: computed atomically with the commit action
  view ← ∅
  for i ← 1 to n
    lockOK = (L[i] not held by any thread) or
             (L[i] held by thread currently committing) or
             (L[i] held by thread executing LOOKUP)
    if (A[i].valid and lockOK)
      view ← view ∪ {A[i].content}
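The implementation-side view can be sketched as a function over a snapshot of the array taken at the commit action. Two things here are assumptions rather than statements from the slides: the snapshot representation (`SlotSnapshot`, with the lock holder recorded as a thread name), and the decision to count a slot only when it is valid and its lock passes the lockOK test, since counting every unlocked slot would also pull in empty or merely reserved slots.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class SlotSnapshot:
    content: Optional[int]
    valid: bool
    lock_holder: Optional[str]  # name of the thread holding L[i], or None


def compute_view(slots: List[SlotSnapshot],
                 committing_thread: str,
                 lookup_threads: Set[str]) -> Counter:
    """Multiset of contents that count as inserted at this commit action."""
    view = Counter()
    for slot in slots:
        lock_ok = (
            slot.lock_holder is None
            or slot.lock_holder == committing_thread
            or slot.lock_holder in lookup_threads
        )
        # Assumption: both conditions are required (see lead-in).
        if slot.valid and lock_ok:
            view[slot.content] += 1
    return view
```

In this sketch a valid slot whose lock is held by some other inserting thread is left out, on the reasoning that that thread has not reached its own commit action yet.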

Off-line checking for view
- Implementation's version of view is constructed from the log, off-line
  - Must log all variable updates that affect view
  - Can look forward in the log, past the commit action, to compute view for the implementation
  - May have to log certain lock acquisitions and releases
  - Compute and compare view incrementally for efficiency
- Overhead and impact on concurrency increase
  - But the programs we are working on keep similar logs for recovery purposes anyway
  - Performance overhead is tolerable for these

What does view buy?
- Examining program and spec state provides more observability
- Imagine a multiset with a REMOVE operation
  - Suppose the application executes INSERTPAIR(a, a)
  - But the implementation erroneously inserts a only once
- To expose this error through I/O refinement or testing, the error must show up in the return value of LOOKUP(a)
  - Need an execution that
    - inserts a pair of a's when there are no a's in the multiset
    - removes a
    - looks a up before more a's get inserted
- Checking view catches the bug right after INSERTPAIR(a, a)
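The INSERTPAIR(a, a) scenario can be replayed with a small view-comparison sketch. The log format, in which each committed operation carries the implementation's view at its commit action, is an assumption for illustration, as is the function name `check_views`.

```python
from collections import Counter


def check_views(committed_ops):
    """Compare spec and implementation views after each committed operation.

    Each entry is (method, args, status, impl_view), where impl_view is the
    implementation's view (a Counter) computed at the commit action.
    Returns None if all views match, otherwise the index of the first mismatch.
    """
    spec_view = Counter()
    for index, (method, args, status, impl_view) in enumerate(committed_ops):
        if method == "InsertPair" and status == "success":
            spec_view.update(args)  # spec updates its view once per method
        if spec_view != impl_view:
            return index
    return None


# A buggy implementation that inserts "a" only once is flagged at the
# INSERTPAIR itself, with no LOOKUP or REMOVE needed to expose it.
buggy_log = [("InsertPair", ("a", "a"), "success", Counter({"a": 1}))]
correct_log = [("InsertPair", ("a", "a"), "success", Counter({"a": 2}))]
```

Here `check_views(buggy_log)` reports a mismatch at index 0, while a check on return values alone would see nothing wrong with this log because it contains no LOOKUP.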

Non-trivial spec view
- Imagine the multiset spec given as a binary search tree
- Executable spec contains detail that is not part of the abstract data structure state
- Must abstract away the structure of the search tree
- view just takes the union of tree node contents
  - Abstracts away parent-child, left-right relationships

Multiset is not atomic
INSERTPAIR(x,y)
  i ← FINDSLOT(x)
  if (i = 0)
    return failure
  j ← FINDSLOT(y)
  if (j = 0)
    A[i].content ← null
    return failure
  ACQUIRE(L[i])
  ACQUIRE(L[j])
  A[i].valid ← true
  A[j].valid ← true
  RELEASE(L[i])
  RELEASE(L[j])
  return success
- Not possible to re-order different threads' executions of FINDSLOT

Multiset has a race condition
INSERTPAIR(x,y)
  i ← FINDSLOT(x)
  if (i = 0)
    return failure
  j ← FINDSLOT(y)
  if (j = 0)
    A[i].content ← null
    return failure
  ACQUIRE(L[i])
  ACQUIRE(L[j])
  A[i].valid ← true
  A[j].valid ← true
  RELEASE(L[i])
  RELEASE(L[j])
  return success
LOOKUP(x)
  for i ← 1 to n
    ACQUIRE(L[i])
    if (A[i].valid and A[i].content = x)
      RELEASE(L[i])
      return true
    else
      RELEASE(L[i])
  return false

Multiset is not linearizable
Thread t1 executes INSERTPAIR(1,2) concurrently with thread t2 executing INSERTPAIR(5,6):
- The array is initially empty
- FINDSLOT(1) by t1 succeeds
- FINDSLOT(5) by t2 succeeds
- FINDSLOT(2) and INSERTPAIR(1,2) by t1 fail
- FINDSLOT(6) and INSERTPAIR(5,6) by t2 fail
[Figure: snapshots of the content and valid fields of the array after each step]
No linearized execution of the implementation fails the first call to INSERTPAIR(5,6).
The multiset implementation is not linearizable.
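The schedule above can be replayed deterministically in a single thread by calling the FINDSLOT steps in the stated order. The sketch assumes a two-slot array (the slide's figure is not recoverable, so the size is a guess that makes the scenario work) and omits locks because nothing runs concurrently in the replay; all function names are hypothetical.

```python
def new_array(n):
    return [{"content": None, "valid": False} for _ in range(n)]


def find_slot(A, x):
    # 1-based index of the slot reserved for x, or 0 if the array is full.
    for i, slot in enumerate(A, start=1):
        if slot["content"] is None:
            slot["content"] = x
            return i
    return 0


def finish_insert(A, i, j):
    """Remainder of INSERTPAIR after both FINDSLOT calls have returned."""
    if j == 0:
        A[i - 1]["content"] = None
        return "failure"
    A[i - 1]["valid"] = True
    A[j - 1]["valid"] = True
    return "success"


def insert_pair(A, x, y):
    i = find_slot(A, x)
    if i == 0:
        return "failure"
    return finish_insert(A, i, find_slot(A, y))


def interleaved():
    """t1 runs INSERTPAIR(1,2) and t2 runs INSERTPAIR(5,6), interleaved."""
    A = new_array(2)       # assumed size
    i1 = find_slot(A, 1)   # t1 reserves a slot
    i2 = find_slot(A, 5)   # t2 reserves the other slot
    j1 = find_slot(A, 2)   # t1 finds no free slot
    j2 = find_slot(A, 6)   # t2 finds no free slot
    return finish_insert(A, i1, j1), finish_insert(A, i2, j2)


def sequential(first, second):
    """Run the two INSERTPAIR calls one after the other on a fresh array."""
    A = new_array(2)
    return insert_pair(A, *first), insert_pair(A, *second)
```

Under these assumptions both calls fail in the interleaved run, whereas in either sequential order whichever call runs first succeeds, so no sequential ordering reproduces the interleaved result.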

Conclusions, future work
- Run-time refinement checking is a promising approach
  - Caught artificial bugs in Multiset
  - Caught real bugs in Scan file system for Windows NT
  - In the process of applying it to Boxwood
    - A concurrent implementation of a B-link tree
- When no formal, executable spec exists, we use an “atomic version” of the implementation as the spec
  - Lowers the barrier to application of refinement-based methods
  - Concentrates on concurrency bugs