1 CheckFence: Checking Consistency of Concurrent Data Types on Relaxed Memory Models Sebastian Burckhardt Rajeev Alur Milo M. K. Martin Department of Computer.

1 CheckFence: Checking Consistency of Concurrent Data Types on Relaxed Memory Models Sebastian Burckhardt Rajeev Alur Milo M. K. Martin Department of Computer and Information Science University of Pennsylvania June 6, 2007

2 General Motivation
Multi-threaded Software
Shared-memory Multiprocessor
Concurrent Executions
Bugs

3 Specific Motivation
Multi-threaded Software: lock-free synchronization (intentional races)
Shared-memory Multiprocessor: with relaxed memory model
Concurrent Executions: not always sequentially consistent
Bugs

4 Specific Motivation
Concurrency libraries with lock-free synchronization
... are simple, fast, and safe to use
– concurrent versions of queues, sets, maps, etc.
– more concurrency, less waiting
– fewer deadlocks
... are notoriously hard to design and verify
– tricky interleavings routinely escape reasoning and testing
– exposed to relaxed memory models
  code needs to contain memory fences for correct operation!

5 Specific Motivation
Concurrency libraries with lock-free synchronization
... are simple, fast, and safe to use
– concurrent versions of queues, sets, vectors, etc.
– more concurrency, less waiting
– fewer deadlocks
... are notoriously hard to design and verify
– tricky interleavings
– exposed to relaxed memory models
  code needs memory fences for correct operation
DATA TYPE IMPLEMENTATIONS ARE TINY: tens to hundreds of lines of code
CLIENT PROGRAMS MAY BE LARGE: thousands or millions of lines of code

6 Bridging the Gap
Architecture: + Multiprocessors + Relaxed memory models
Concurrent Algorithms: + Lock-free queues, sets, deques
Computer-Aided Verification: + Model check C code + Sound counterexamples
CheckFence Tool: methodology described in our papers in [CAV 2006], [PLDI 2007]

7 Example: Nonblocking Queue
The implementation
– optimized: no locks
– not race-free
– exposed to memory model
    void enqueue(int val) {... }
    int dequeue() {... }
The client program
– runs on multiple processors
– calls operations
    Processor 1: ... enqueue(1) ... enqueue(2) ...
    Processor 2: ... a = dequeue() b = dequeue()

8 Michael & Scott’s Nonblocking Queue [Principles of Distributed Computing (PODC) 1996]
    boolean_t dequeue(queue_t *queue, value_t *pvalue)
    {
        node_t *head;
        node_t *tail;
        node_t *next;

        while (true) {
            head = queue->head;
            tail = queue->tail;
            next = head->next;
            if (head == queue->head) {      /* snapshot still consistent? */
                if (head == tail) {
                    if (next == 0)
                        return false;       /* queue is empty */
                    /* tail is lagging behind: help advance it */
                    cas(&queue->tail, (uint32) tail, (uint32) next);
                } else {
                    *pvalue = next->value;
                    if (cas(&queue->head, (uint32) head, (uint32) next))
                        break;              /* dequeue succeeded */
                }
            }
        }
        delete_node(head);
        return true;
    }
[Figure: linked list with nodes 1, 2, 3 and the head and tail pointers]

9 Correctness Condition
Data type implementations must appear sequentially consistent to the client program: the observed argument and return values must be consistent with some interleaved, atomic execution of the operations.
Observation:
    Thread 1: enqueue(1); dequeue() -> 2
    Thread 2: enqueue(2); dequeue() -> 1
Witness Interleaving:
    enqueue(1); enqueue(2); dequeue() -> 1; dequeue() -> 2
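The correctness condition can be made concrete with a small brute-force check. The following is only a sketch under stated assumptions, not how CheckFence itself works: it takes the operations observed per thread, tries every interleaving that respects each thread's program order, and replays it against a sequential FIFO queue. The names `op_t`, `thread_obs_t`, `has_witness`, `EMPTY`, and the size limits are hypothetical and introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OPS     16    /* assumed bound on the total number of operations */
#define MAX_THREADS 4     /* assumed bound on the number of threads */
#define EMPTY       (-1)  /* hypothetical marker: dequeue() reported an empty queue */

typedef enum { ENQ, DEQ } kind_t;

/* One observed operation: enqueue(value), or dequeue() -> value. */
typedef struct { kind_t kind; int value; } op_t;

/* The operations one thread performed, in program order. */
typedef struct { const op_t *ops; size_t len; } thread_obs_t;

/* Replay op on the reference FIFO q[head..tail); false if the observed
   return value contradicts what a sequential queue would return. */
static bool apply(const op_t *op, int *q, size_t *head, size_t *tail) {
    if (op->kind == ENQ) { q[(*tail)++] = op->value; return true; }
    if (*head == *tail) return op->value == EMPTY;
    if (q[*head] != op->value) return false;
    (*head)++;
    return true;
}

/* Depth-first search over all interleavings that keep each thread's order. */
static bool search(const thread_obs_t *t, size_t n, size_t *pos,
                   int *q, size_t head, size_t tail) {
    bool all_done = true;
    for (size_t i = 0; i < n; i++) {
        if (pos[i] == t[i].len) continue;
        all_done = false;
        size_t h = head, tl = tail;          /* local copies allow backtracking */
        if (apply(&t[i].ops[pos[i]], q, &h, &tl)) {
            pos[i]++;
            bool ok = search(t, n, pos, q, h, tl);
            pos[i]--;
            if (ok) return true;
        }
    }
    return all_done;  /* true only if every operation has been placed */
}

/* True if some atomic interleaving explains the observation. */
bool has_witness(const thread_obs_t *t, size_t n) {
    int q[MAX_OPS];
    size_t pos[MAX_THREADS] = {0};
    size_t total = 0;
    assert(n <= MAX_THREADS);
    for (size_t i = 0; i < n; i++) total += t[i].len;
    assert(total <= MAX_OPS);
    return search(t, n, pos, q, 0, 0);
}
```

With thread 1 observing `enqueue(1); dequeue() -> 2` and thread 2 observing `enqueue(2); dequeue() -> 1`, the search finds a witness. If instead one thread performs both enqueues and the other observes `dequeue() -> 2` before `dequeue() -> 1`, no interleaving of a FIFO queue explains the result and the function returns false.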

10 Part II: Solution

11 Bounded Model Checker
[Diagram labels: CheckFence, Memory Model Axioms]
Pass: all executions of the test are observationally equivalent to a serial execution
Fail:
Inconclusive: runs out of time or memory

12 Demo: CheckFence Tool

13 Example: Memory Model Bug
Processor 1 links new node into list:
    ...
    (3) node->value = 2;
    ...
    (1) head = node;
    ...
Processor 2 reads value at head of list:
    ...
    (2) value = head->value;
    ...
Processor 1 reorders the stores! The memory accesses happen in the order (1), (2), (3).
--> Processor 2 loads uninitialized value
Adding a fence between the lines on the left side (Processor 1) prevents the reordering.
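As an illustration of the fix, the sketch below expresses the same publish pattern with C11 atomics. This is an assumption made for the example: the slides do not say which fence primitives the studied implementations use, and `publish` and `read_head` are hypothetical names. The release fence keeps the initialization of the node ahead of the store to `head`; the acquire fence on the reader side is what the C11 rules require for the pairing to take effect.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct node {
    int value;
    struct node *next;
} node_t;

/* Shared list head; NULL until a node has been published. */
static _Atomic(node_t *) head = NULL;

/* Processor 1: initialize the node, then link it into the list. */
void publish(node_t *node, int value) {
    node->value = value;
    node->next = NULL;
    /* Fence between the two stores: the initialization above may not be
       reordered after the store to head below. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&head, node, memory_order_relaxed);
}

/* Processor 2: read the value at the head of the list.
   Returns false if no node has been published yet. */
bool read_head(int *out) {
    node_t *n = atomic_load_explicit(&head, memory_order_relaxed);
    if (n == NULL) return false;
    /* Pairs with the release fence in publish(): the load of n->value
       below observes the initialized value. */
    atomic_thread_fence(memory_order_acquire);
    *out = n->value;
    return true;
}
```

Without the two fences, the relaxed store to `head` and the plain store to `node->value` carry no ordering between them, which corresponds to the reordering shown on the slide.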

14 Tool Architecture
[Diagram labels: C code, Symbolic Test, Trace]
Symbolic test is nondeterministic, has exponentially many executions (due to symbolic inputs, dyn. memory allocation, interleaving/reordering of instructions).
CheckFence solves for “bad” executions.

15 [Diagram labels: C code, Symbolic Test, Trace]
automatic, lazy loop unrolling
automatic specification mining (enumerate correct observations)
construct CNF formula whose solutions correspond precisely to the concurrent executions

16 Which Memory Model?
Memory models are platform dependent & ridden with details
We use a conservative abstract approximation “Relaxed” to capture common effects
Once code is correct for Relaxed, it is correct for stronger models
Finding simple, general abstraction is hard (work in progress)
[Figure: memory models SC, TSO, PSO, IA-32, Alpha, RMO, z6, and Relaxed]

17 Part VI: Results

18 Studied Algorithms
Type      Description          LOC  Source
Queue     Two-lock queue        80  M. Michael and L. Scott (PODC 1996)
Queue     Non-blocking queue    98  M. Michael and L. Scott (PODC 1996)
Set       Lazy list-based set  141  Heller et al. (OPODIS 2005)
Set       Nonblocking list     174  T. Harris (DISC 2001)
Deque     “snark” algorithm    159  D. Detlefs et al. (DISC 2000)
LL/VL/SC  CAS-based             74  M. Moir (PODC 1997)
LL/VL/SC  Bounded Tags         198

19 Results
Type      Description          Regular bugs
Queue     Two-lock queue
Queue     Non-blocking queue
Set       Lazy list-based set  1 unknown
Set       Nonblocking list
Deque     original “snark”     2 known
Deque     fixed “snark”
LL/VL/SC  CAS-based
LL/VL/SC  Bounded Tags
– snark algorithm has 2 known bugs; we found them
– lazy list-based set had a previously unknown bug (missing initialization; missed by formal correctness proof [CAV 2006] because of hand-translation of pseudocode)

20 Results
[Table: the implementations of slide 19 with their regular bugs (original “snark”: 2 known; lazy list-based set: 1 unknown), extended with “# Fences inserted” columns whose surviving labels are Store, Load, Dependent Loads, and Aliased Loads; of the fence counts only the values 2 and 4 survive.]
– snark algorithm has 2 known bugs; we found them
– lazy list-based set had a previously unknown bug (missing initialization; missed by formal correctness proof [CAV 2006] because of hand-translation of pseudocode)
– Many failures on relaxed memory model
  – inserted fences by hand to fix them
  – small testcases sufficient for this purpose

21 Typical Tool Performance
Very efficient on small testcases (< 100 memory accesses)
Example (nonblocking queue): T0 = i (e | d), T1 = i (e | e | d | d)
– find counterexamples within a few seconds
– verify within a few minutes
– enough to cover all 9 fences in nonblocking queue
Slows down with increasing number of memory accesses in test
Example (snark deque): Dq = ( pop_l | pop_l | pop_r | pop_r | push_l | push_l | push_r | push_r )
– has 134 memory accesses (77 loads, 57 stores)
– Dq finds second snark bug within ~1 hour
Does not scale past ~300 memory accesses

22 Related Work
Bounded Software Model Checking: Clarke, Kroening, Lerda (TACAS'04); Rabinovitz, Grumberg (CAV'05)
Correctness Conditions for Concurrent Data Types: Herlihy, Wing (TOPLAS'90); Alur, McMillan, Peled (LICS'96)
Operational Memory Models & Explicit Model Checking: Park, Dill (SPAA'95); Huynh, Roychoudhury (FM'06)
Axiomatic Memory Models & SAT solvers: Yang, Gopalakrishnan, Lindstrom, Slind (IPDPS'04)

23 Contribution
First model checker for C code on relaxed memory models.
Handles a “reasonable” subset of C (conditionals, loops, pointers, arrays, structures, function calls, dynamic memory allocation)
No formal specifications or annotations required
Requires manually written test suite
Soundly verifies & falsifies individual tests, produces counterexamples
[Diagram labels: Relaxed Memory Models, Lock-free implementations, Software Verification]

24 END

26 Future Work
Make CheckFence publicly available
Experiment with more memory models
– hardware (PPC, Itanium), language (Java, C++ volatiles)
Improve solver component
– enhance SAT solver support for total/partial orders
Develop reasoning techniques for relaxed memory models
Develop scalable methods for finding specific, common bugs
Build concurrent library

27 Axioms for Relaxed
$A$: set of addresses
$V$: set of values
$X$: set of memory accesses
$S \subseteq X$: subset of stores
$L \subseteq X$: subset of loads
$a(x)$: memory address of $x$
$v(x)$: value loaded or stored by $x$
$<_p$: a partial order over $X$ (program order)
$<_m$: a total order over $X$ (memory order)
For a load $l \in L$, define the following set of stores that are “visible to $l$”:
$S(l) = \{\, s \in S \mid a(s) = a(l) \text{ and } (s <_m l \text{ or } s <_p l) \,\}$
Executions for the model Relaxed are defined by the following axioms:
1. If $x <_p y$ and $a(x) = a(y)$ and $y \in S$, then $x <_m y$
2. For $l \in L$ and $s \in S(l)$, always either $v(l) = v(s)$ or there exists another store $s' \in S(l)$ such that $s <_m s'$
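The two axioms can be checked mechanically on a small, explicitly given execution. The following is a sketch under assumptions, not CheckFence's SAT encoding: the array order stands for the memory order, program order is modeled as the per-thread issue index, and initial memory contents are modeled as stores by an extra initializing thread placed first in memory order. The names `access_t` and `satisfies_relaxed` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One memory access x. The position of an access in the array is its
   position in the memory order: x[i] <m x[j] iff i < j. */
typedef struct {
    int  thread;    /* issuing thread */
    int  index;     /* position in that thread's program order */
    bool is_store;  /* true: x is in S; false: x is in L */
    int  addr;      /* a(x) */
    int  value;     /* v(x) */
} access_t;

/* x <p y: same thread, and x is issued earlier (assumed model of <p). */
static bool po_before(const access_t *x, const access_t *y) {
    return x->thread == y->thread && x->index < y->index;
}

/* s in S(l): s is a store to the address of l, and s <m l or s <p l. */
static bool visible(const access_t *x, size_t s, size_t l) {
    return x[s].is_store && x[s].addr == x[l].addr &&
           (s < l || po_before(&x[s], &x[l]));
}

bool satisfies_relaxed(const access_t *x, size_t n) {
    /* Axiom 1: x <p y, a(x) = a(y), y in S  implies  x <m y. */
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            if (po_before(&x[i], &x[j]) && x[i].addr == x[j].addr &&
                x[j].is_store && !(i < j))
                return false;

    /* Axiom 2: every visible store either supplies the loaded value or is
       followed in memory order by another visible store. */
    for (size_t l = 0; l < n; l++) {
        if (x[l].is_store) continue;
        for (size_t s = 0; s < n; s++) {
            if (!visible(x, s, l) || x[s].value == x[l].value) continue;
            bool overwritten = false;
            for (size_t t = s + 1; t < n; t++)
                if (visible(x, t, l)) { overwritten = true; break; }
            if (!overwritten) return false;
        }
    }
    return true;
}
```

For instance, with `x = 1; y = 2` on one thread and a load of `y` followed by a load of `x` on another, the memory order "store y, load y, load x, store x" passes both axioms while yielding 2 for `y` and 0 for `x`, because Axiom 1 only constrains accesses to the same address.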

28 Relaxed Memory Model Example
    thread 1: x = 1; y = 2
    thread 2: print y → 2; print x → 0
Example: output not sequentially consistent (that is, not consistent with any interleaved execution)!
– processor 1 may perform stores out of order
– processor 2 may perform loads out of order
– relaxed ordering guarantees improve processor performance
Q: Why doesn’t everything break?
A: Relaxations are designed in a way to guarantee that
– uniprocessor programs are safe
– race-free programs are safe
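That this output is not sequentially consistent can be confirmed by enumerating all interleavings of the two threads. The sketch below does so, assuming that `x` and `y` are both initially 0 (the slide does not state the initial values, although the printed 0 suggests it); `sc_allows` and `outcome_t` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* The values printed by thread 2. */
typedef struct { int printed_y; int printed_x; } outcome_t;

/* Execute one interleaving. Bit k of schedule selects the thread that
   performs step k (0: thread 1, 1: thread 2). Returns false if the
   schedule asks a thread for more than its two statements. */
static bool run(unsigned schedule, outcome_t *out) {
    int x = 0, y = 0;        /* assumed initial values */
    int pc1 = 0, pc2 = 0;    /* next statement of each thread */
    for (int step = 0; step < 4; step++) {
        if ((schedule >> step) & 1u) {
            if (pc2 == 2) return false;
            if (pc2 == 0) out->printed_y = y;   /* print y */
            else          out->printed_x = x;   /* print x */
            pc2++;
        } else {
            if (pc1 == 2) return false;
            if (pc1 == 0) x = 1;                /* x = 1 */
            else          y = 2;                /* y = 2 */
            pc1++;
        }
    }
    return true;
}

/* True if some interleaved (sequentially consistent) execution prints
   the given values for y and x. */
bool sc_allows(int printed_y, int printed_x) {
    for (unsigned schedule = 0; schedule < 16; schedule++) {
        outcome_t o = {0, 0};
        if (run(schedule, &o) &&
            o.printed_y == printed_y && o.printed_x == printed_x)
            return true;
    }
    return false;
}
```

Under these assumptions `sc_allows(2, 0)` is false: once thread 2 has read `y == 2`, the store `x = 1` has already happened in every interleaving, so the later `print x` cannot yield 0. The outcomes (0, 0), (0, 1), and (2, 1) are all reachable.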