Inherent limitations facilitate design and verification of concurrent programs Hagit Attiya Technion.

Concurrent Programs: Core challenge is synchronization. Correct synchronization is hard to get right; efficient synchronization is even harder. Ad-hoc vs. principled; manual vs. automatic.

EXAMPLE I: VERIFYING LOCKING PROTOCOLS Work with Ramalingam and Rinetzky (POPL 2010)

The Goal: Sequential Reductions. Verify concurrent data structures with pre-execution static analysis, e.g., for a linked list with hand-over-hand locking: no memory leaks, shape (it's a list), serializability. Find sequential reductions: consider only sequential executions, but conclude that properties hold in all executions.

Back-of-envelope estimate of gain: Static analysis of a linked-list algorithm [Amit, Rinetzky, Reps, Sagiv, Yahav, CAV 2007] verifies, e.g., memory safety, sortedness, pointed-to by a variable, heap sharing.
Configuration                 Time    Memory
One thread (sequential)       10s     3.6MB
Two threads (interleaved)     ~4h     886MB
Three threads (interleaved)   > 8h    ----

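Serializability [Papadimitriou 79]: every interleaved execution of operations is equivalent (~) to some complete non-interleaved execution, i.e., it looks the same to each thread locally. [Figure: an interleaved execution and an equivalent complete non-interleaved execution]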

Serializability assists verification. Concurrent code M; Π = all executions of M; φ = a property local to the threads; cni-Π = complete non-interleaved executions of M (a small subset of Π). If M is serializable, then Π ⊨ φ ⇔ cni-Π ⊨ φ. Easily derived from [Papadimitriou 79].

How do we know that M is serializable without considering all executions, e.g., from only complete non-interleaved executions? If M is serializable, then Π ⊨ φ ⇔ cni-Π ⊨ φ.

Special (and common) case: disciplined programming with locks. Guard access to data with locks: Lock() acquires the lock, Unlock() releases the lock. Only one process holds the lock at any time. Follow a locking protocol that guarantees conflict serializability, e.g., two-phase locking (2PL) or tree locking (TL).
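Conflict serializability can be illustrated with the classic precedence-graph test. The sketch below is not from the talk: the schedule encoding (thread, "r"/"w", variable) and the function name are assumptions made for illustration. It adds an edge t1 -> t2 whenever an operation of t1 conflicts with a later operation of t2, and reports whether the graph is acyclic.

```python
from itertools import combinations

def conflict_serializable(schedule):
    """schedule: list of (thread, op, var) with op in {"r", "w"} (hypothetical encoding).
    Returns True if the precedence (conflict) graph of the schedule is acyclic."""
    edges = set()
    for i, j in combinations(range(len(schedule)), 2):  # i < j: i happens first
        t1, op1, v1 = schedule[i]
        t2, op2, v2 = schedule[j]
        # Two operations conflict if they come from different threads,
        # touch the same variable, and at least one of them is a write.
        if t1 != t2 and v1 == v2 and "w" in (op1, op2):
            edges.add((t1, t2))
    # Acyclicity check: repeatedly remove threads with no incoming edge.
    nodes = {t for t, _, _ in schedule}
    while nodes:
        free = {n for n in nodes
                if not any(b == n and a in nodes for a, b in edges)}
        if not free:
            return False  # every remaining thread lies on a cycle
        nodes -= free
    return True

serial = [(1, "r", "x"), (1, "w", "x"), (2, "r", "x"), (2, "w", "x")]
lost_update = [(1, "r", "x"), (2, "r", "x"), (1, "w", "x"), (2, "w", "x")]
print(conflict_serializable(serial), conflict_serializable(lost_update))
```

A locking protocol such as 2PL or TL is meant to rule out schedules like `lost_update` by construction, so that this check never has to be run on all interleavings.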

Two-phase locking [Papadimitriou 79]: a lock-acquire (grow) phase is followed by a lock-release (shrink) phase. No lock is acquired after some lock is released. [Figure: locks held by threads t1 and t2 on a list with head H]
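The two-phase rule can be checked on the lock operations of a single thread. This is a minimal sketch; the trace format ("acquire"/"release", lock name) is an assumption for illustration, not the representation used in the talk.

```python
def follows_two_phase_locking(ops):
    """ops: one thread's lock operations, in order, as ("acquire" | "release", lock).
    Returns True if no lock is acquired after some lock has been released."""
    released_any = False
    for kind, _lock in ops:
        if kind == "release":
            released_any = True       # the shrink phase has started
        elif kind == "acquire" and released_any:
            return False              # an acquire in the shrink phase violates 2PL
    return True

good = [("acquire", "A"), ("acquire", "B"), ("release", "A"), ("release", "B")]
bad = [("acquire", "B"), ("release", "B"), ("acquire", "A")]
print(follows_two_phase_locking(good), follows_two_phase_locking(bad))
```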

Tree (hand-over-hand) locking [Kedem & Silberschatz 76] [Samadi 76] [Bayer & Schkolnick 77]: except for the first lock, acquire a lock only when holding the lock on its parent. No lock is acquired after being released. [Figure: threads t1 and t2 moving hand-over-hand down a list with head H]
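Hand-over-hand locking on a sorted linked list can be sketched as follows. The class and method names are hypothetical, and the code only illustrates the locking discipline described above (lock a node only while holding its predecessor's lock); it is not the implementation analyzed in the talk.

```python
import threading

class Node:
    def __init__(self, key, nxt=None):
        self.key = key
        self.next = nxt
        self.lock = threading.Lock()

class HandOverHandList:
    """Sorted singly linked list with a sentinel head (H in the figure)."""
    def __init__(self):
        self.head = Node(float("-inf"))

    def insert(self, key):
        pred = self.head
        pred.lock.acquire()              # first lock: the head
        curr = pred.next
        while curr is not None:
            curr.lock.acquire()          # lock the child while holding its parent
            if curr.key >= key:
                break                    # insertion point found: pred < key <= curr
            pred.lock.release()          # release the parent only afterwards
            pred = curr
            curr = pred.next
        pred.next = Node(key, curr)
        if curr is not None:
            curr.lock.release()
        pred.lock.release()

    def to_list(self):
        # Unsynchronized traversal, intended for use after all threads finished.
        out, node = [], self.head.next
        while node is not None:
            out.append(node.key)
            node = node.next
        return out

lst = HandOverHandList()

def worker(keys):
    for k in keys:
        lst.insert(k)

threads = [threading.Thread(target=worker, args=(range(i, 200, 4),)) for i in range(4)]
for th in threads:
    th.start()
for th in threads:
    th.join()
print(lst.to_list()[:5])
```

Because every thread acquires locks in list order, threads cannot deadlock on each other, and no thread reacquires a lock it has released within one operation.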

void p() {
  acquire(B)
  B = 0
  release(B)
  int b = B            // read without holding the lock on B
  if (b) acquire(A)    // an acquire after a release
}
void q() {
  acquire(B)
  B = 1
  release(B)
}
Yes! For databases: a concurrency control monitor ensures that M follows the locking policy at run-time, so M is serializable.
No! For static analysis: there is no central monitor.
Not two-phase locked, but only in interleaved executions.
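The claim that p() breaks two-phase locking only in interleaved executions can be checked by brute force. The sketch below encodes p and q as step lists (an encoding chosen here for illustration, with hypothetical step names), enumerates every complete execution in which a blocked acquire cannot run, and records whether p acquires A after releasing B.

```python
# Step encodings of p() and q() from the slide; the tuple format is an assumption.
P = [("acquire", "B"), ("write", 0), ("release", "B"), ("read",), ("acquire_if_b", "A")]
Q = [("acquire", "B"), ("write", 1), ("release", "B")]
PROGRAMS = {"p": P, "q": Q}

def executions(initial_B=0):
    """Yield (schedule, violates_2pl) for every complete execution of p and q."""
    def run(pc, B, b, held, schedule, violated):
        if all(pc[t] == len(PROGRAMS[t]) for t in PROGRAMS):
            yield tuple(schedule), violated
            return
        for t in PROGRAMS:
            if pc[t] == len(PROGRAMS[t]):
                continue
            step = PROGRAMS[t][pc[t]]
            new_B, new_b, new_held, new_violated = B, b, dict(held), violated
            if step[0] == "acquire":
                if step[1] in held:
                    continue                  # lock is taken: this thread is blocked
                new_held[step[1]] = t
            elif step[0] == "release":
                del new_held[step[1]]
            elif step[0] == "write":
                new_B = step[1]
            elif step[0] == "read":
                new_b = B                     # p copies B into its local b
            elif step[0] == "acquire_if_b" and b:
                new_held[step[1]] = t
                new_violated = True           # p acquires A after releasing B
            yield from run({**pc, t: pc[t] + 1}, new_B, new_b, new_held,
                           schedule + [t], new_violated)
    yield from run({"p": 0, "q": 0}, initial_B, 0, {}, [], False)

def is_non_interleaved(schedule):
    # One thread runs to completion, then the other: at most one switch.
    return sum(a != b for a, b in zip(schedule, schedule[1:])) <= 1

results = list(executions())
print(sum(v for _, v in results), "violating executions out of", len(results))
```

In this model the violating executions are exactly those in which q's write to B lands between p's release(B) and p's unlocked read of B, which requires switching threads at least twice.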

Our Goal: statically verify that M follows a locking policy. Applies to local conflict-serializable locking protocols, i.e., those depending only on a thread's local variables and the global variables locked by it. E.g., two-phase locking, tree locking, (dynamic) DAG locking… But not protocols that rely on a concurrency control monitor!

Thread-local properties: A thread-owned view contains the values of a thread's local variables and of the global variables locked by it. A property φ is thread-local if it can be expressed in terms of thread-owned views and is prefix-closed. A thread-local property of an execution holds in every execution indistinguishable from it.

Our contribution: Easy step. ni-Π: non-interleaved executions of M. For any local conflict-serializable locking policy LP (two-phase locking, tree locking, dynamic tree locking, dynamic DAG locking): Π ⊨ LP ⇔ ni-Π ⊨ LP. For any thread-local property φ: Π ⊨ φ ⇔ ni-Π ⊨ φ.

Reduction to non-interleaved executions: Proof idea. Let σ(t,e) be the shortest execution that does not follow LP. Its prefix σ follows LP, which guarantees conflict-serializability, so there is a non-interleaved execution σ_ni equivalent to σ. Extending σ_ni with (t,e) gives a non-interleaved execution, similar to σ(t,e), where LP is violated.

Ni-reduction: Proof sketch. There is a ni-execution σ_ni that is equivalent to σ; extended with (t,e), it is a ni-execution where LP is violated.

Ni-reduction: Proof sketch. There is a ni-execution σ_ni with the same conflicts as in σ. t can execute e also after σ_ni. Write σ_ni = σ_1 σ_t σ_2, where σ_t is the sub-execution by thread t. t can execute e also after σ_1 σ_t. σ_1 σ_t (t,e) is a ni-execution and it follows the locking protocol. Since σ_1 σ_t (t,e) and σ(t,e) are conflict equivalent, σ(t,e) follows the locking protocol.

Further reduction. acni-Π: almost-complete non-interleaved executions of M. For any LCS (local conflict-serializable) locking policy LP: Π ⊨ LP ⇔ acni-Π ⊨ LP.

Reduction to non-interleaved executions: A complication. Need to argue about termination.
int X=0, Y=0
void p() {
  acquire(Y)
  y = Y
  release(Y)
  if (y != 0) acquire(X)
  X = 3
  release(X)
}
void q() {
  if (random(5) == 3) {
    acquire(Y)
    Y = 1
    release(Y)
    while (true) nop
  }
}
q: Y is set to 1 and the method enters an infinite loop.
p: observes Y == 1 and violates 2PL.

Reduction to non-interleaved executions: Termination. Can use sequential reduction to verify termination. For any terminating local conflict-serializable locking policy LP: Π ⊨ LP ⇔ acni-Π ⊨ LP.

Acni-reduction: Proof ideas. Start from a ni-execution (rely on the previous ni-reduction to get there). Create its equivalent completion, if possible. Not always possible; e.g., the completion must not access variables accessed by later threads: t_1:lock(v), t_1:lock(u), t_2:lock(u).

Implications for static analysis. Pessimistic analysis (over-approximate): analyze a module from every possible state. Semi-optimistic analysis (ni-reduction): analyze a module only from states that occur after a sequence of modules ran one after the other (not to completion). Optimistic analysis (precise; acni-reduction): analyze a module only from states that occur after a sequence of modules ran to completion (one after the other).

Initial analysis results
Shape analysis of hand-over-hand lists:
Method               Time      Memory
Our method           3.5s      4.0MB
TVLA prior           596.1s    90.3MB
Separation logic*    0.4s      0.2MB
*Does not verify sortedness of list and fails to verify linearizability in some cases
Shape analysis of hand-over-hand trees (for the first time):
Our method           124.6s    90.6MB

What's next? Extend to shared (read) locks. Extend to software transactional memory: aborted transactions; non-locking, non-conflict-based serializability (e.g., using timestamps). Combine with other reductions [Guerraoui, Henzinger, Jobstmann, Singh].

EXAMPLE II: REQUIRED MEMORY ORDERINGS Work with Guerraoui, Hendler, Kuznetsov, Michael and Vechev (POPL 2011)

Relaxed memory models: Out-of-order execution of memory accesses, to compensate for slow writes. Optimize by issuing reads before the writes they follow in program order, if they access different locations. Reordering may lead to inconsistency.

Read-after-write (RAW) Reordering. Process P: Write(X,1); Read(Y). Process Q: Write(Y,1); Read(X). [Figure: timeline of P and Q showing W(X,1), R(Y), W(Y,1), R(X) taking effect out of program order]

Avoiding out-of-order: Read-after-write (RAW) Fence. Process P: Write(X,1); FENCE; Read(Y). Process Q: Write(Y,1); FENCE; Read(X). [Figure: timeline of P and Q with W(X,1), R(Y), W(Y,1), R(X) in program order]
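The effect of RAW reordering on the P/Q example, and of the fence that forbids it, can be explored with a small enumeration. This is a simplified model assumed for illustration: without a fence, each process may perform its read before its own earlier write (real hardware achieves this through store buffers), and memory itself is sequentially consistent. All names below are hypothetical.

```python
from itertools import product

def interleavings(a, b):
    """All merges of sequences a and b that preserve the order within each."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def outcomes(fence):
    """Set of (value P reads from Y, value Q reads from X); X and Y start at 0."""
    p = [("P", "write", "X"), ("P", "read", "Y")]
    q = [("Q", "write", "Y"), ("Q", "read", "X")]

    def orders(prog):
        # With a fence only program order is allowed; without it, the read
        # may also take effect before the earlier write (RAW reordering).
        return [prog] if fence else [prog, prog[::-1]]

    seen = set()
    for p_order, q_order in product(orders(p), orders(q)):
        for run in interleavings(p_order, q_order):
            mem, reads = {"X": 0, "Y": 0}, {}
            for proc, kind, var in run:
                if kind == "write":
                    mem[var] = 1
                else:
                    reads[proc] = mem[var]
            seen.add((reads["P"], reads["Q"]))
    return seen

print(sorted(outcomes(fence=True)))
print(sorted(outcomes(fence=False)))
```

In this model the outcome (0, 0), where neither process sees the other's write, appears only when the fence is absent.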

Avoiding out-of-order: Atomic Operations. Atomic operations: atomic-write-after-read (AWAR), e.g., CAS, TAS, Fetch&Add,… atomic{ read(Y) … write(X,1) } RAW fences / AWAR are ~60 slower than (remote) memory accesses.

Our result: Any concurrent program in a certain class must use RAW/AWARs.

Which programs? Concurrent data types: –queues, counters, hash tables, trees,… –Non-commutative operations –Linearizable solo-terminating implementations Mutual exclusion

Non-commutative operations Operation A is non-commutative if there is operation B where: A influences B and B influences A

Example: Queue. enq(v) adds v to the end of the queue; deq() dequeues the item at the head of the queue. Q.deq():1; Q.deq():2 vs. Q.deq():2; Q.deq():1: deq() operations influence each other. Q.enq(3):ok; Q.deq():1 vs. Q.deq():1; Q.enq(3):ok: enq() is not non-commutative.
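The queue example can be replayed directly. In the sketch below, "influences" is simplified to mean that an operation's response changes when another operation runs before it from the same initial queue; this simplification, the operation encoding, and the function names are assumptions for illustration rather than the paper's formal definition.

```python
from collections import deque

def responses(initial, ops):
    """Run ops, given as (name, "enq", value) or (name, "deq"), on a fresh
    queue holding `initial`; return a dict mapping each name to its response."""
    q, out = deque(initial), {}
    for name, kind, *args in ops:
        if kind == "enq":
            q.append(args[0])
            out[name] = "ok"
        else:
            out[name] = q.popleft()
    return out

def influences(initial, b, a):
    """True if running b first changes the response of a."""
    alone = responses(initial, [a])[a[0]]
    after_b = responses(initial, [b, a])[a[0]]
    return alone != after_b

d1, d2, e3 = ("d1", "deq"), ("d2", "deq"), ("e3", "enq", 3)
queue = [1, 2]
print(influences(queue, d1, d2), influences(queue, d2, d1))  # two deq() operations
print(influences(queue, e3, d1), influences(queue, d1, e3))  # enq(3) and deq()
```

On the queue [1, 2], each deq() changes what the other returns, while enq(3) and deq() return the same responses in either order.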

Proof Intuition: Writing. If an operation does not write, it does not influence anyone; it would be commutative. [Figure: two deq operations with no shared write both return 1 and do not influence each other]

Proof Intuition: Read. If an operation does not read, it is not influenced by anyone; it would be commutative. [Figure: two deq operations with no shared read both return 1 and do not influence each other]

Proof Intuition: RAW. [Figure: deq operations returning 1, with a write (W) and no RAW; linearization]

Mutual exclusion (Mutex): Two processes do not hold the lock at the same time. (Deadlock-freedom) If a process calls Lock() then some process acquires the lock. Two Lock() operations influence each other! Every successful lock acquire incurs a RAW/AWAR fence.

Who should care? Concurrent programmers: when is it futile to avoid expensive synchronization. Hardware designers: motivation to lower cost of specific synchronization constructs. API designers: choice of API affects synchronization. Verification engineers: declare incorrect when synchronization is missing. "…although I hope that these shortcomings will be addressed, I hasten to add that they are insignificant compared to the huge step forward that this paper represents…." -- Linux Weekly News, Jan 26, 2011

What else? Weaker operations? E.g., idempotent Work Stealing. Tight lower bounds? Other patterns: read-after-read, write-after-write, barriers.

And beyond… The cost of verifying adherence to a locking policy (Semi-) Automatic insertion of lock acquire / release commands or fences

Thank you!