Gal Milman Based on Chapter 10 (Concurrent Queues and the ABA Problem) in The Art of Multiprocessor Programming by Herlihy and Shavit Seminar 2 (236802)

Presentation transcript:

Concurrent Queues
Gal Milman
Based on Chapter 10 (Concurrent Queues and the ABA Problem) in The Art of Multiprocessor Programming by Herlihy and Shavit
Seminar 2 (236802) in Advanced Topics in Concurrent Programming, Winter 15/16
Instructor: Erez Petrank

Background: Pool
Provides put and get methods.
The same item can appear more than once.
Often acts as a producer-consumer buffer.

Pool properties:
Capacity: bounded / unbounded
Methods: total (never waits for a condition) / partial (may wait for a condition) / synchronous (waits for another method call to overlap)
Fairness: queue (FIFO) / stack (LIFO) / …

[Figure: Queue. Enqueue (Put) adds at the tail; Dequeue (Get) removes at the head; nodes are linked by next.]

Queue
We will review:
Blocking Queue
Lock-Free Queue
  o ABA Problem
Synchronous Dual Queue

Blocking Queue
Initialization:
    tail = head = Node(value=null)    (dummy node)
    capacity = input_capacity
    size = AtomicInteger(0)
    enqLock, deqLock
    notFullCondition, notEmptyCondition
[Figure: head and tail both reference the dummy node; every later node holds a value and a next reference.]

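Blocking Queue - enq
    public void addNode(T x) {
        Node newNode = new Node(x);
        tail.next = newNode;   // step 1: link the new node after the current tail
        tail = newNode;        // step 2: advance tail to the new node
    }
[Figure: the list before and after adding newNode (value x), with the two steps marked 1 and 2.]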

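Blocking Queue - deq
    public T removeNodeAndGetValue() {
        T result = head.next.value;   // the first real item follows the dummy node
        head = head.next;             // the removed node becomes the new dummy node
        return result;
    }
[Figure: the list before and after head advances past the node holding x.]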

Blocking Queue - enq
    public void enq(T x) {
        boolean mustWakeDequeuers = false;
        enqLock.lock();
        while (size.get() == capacity)
            notFullCondition.await();
        addNode(x);
        if (size.getAndIncrement() == 0)   // the queue was empty before this enq
            mustWakeDequeuers = true;
        enqLock.unlock();
        if (mustWakeDequeuers) {
            deqLock.lock();
            notEmptyCondition.signalAll();
            deqLock.unlock();
        }
    }
deq is symmetrical.

Why should we lock before signaling? To avoid the lost wakeup problem. Without taking deqLock, the following interleaving is possible:
    deq: deqLock.lock()
    deq: while (queue is empty)             // sees an empty queue
    enq: addNode()
    enq: notEmptyCondition.signalAll()      // nobody is waiting yet, so the signal is lost
    deq: notEmptyCondition.await()          // waits although the queue is no longer empty
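The enq and deq described above can be put together as a small runnable sketch. The names follow the slides; the class name BoundedQueue, the nested Node class, and the use of awaitUninterruptibly() in place of await() (to avoid checked exceptions in a short example) are assumptions made here, not part of the slides.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of a bounded, partial, two-lock queue (class name is hypothetical).
class BoundedQueue<T> {
    private static final class Node<T> {
        final T value;
        volatile Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final ReentrantLock enqLock = new ReentrantLock();
    private final ReentrantLock deqLock = new ReentrantLock();
    private final Condition notFullCondition = enqLock.newCondition();
    private final Condition notEmptyCondition = deqLock.newCondition();
    private final AtomicInteger size = new AtomicInteger(0);
    private final int capacity;
    private Node<T> head = new Node<>(null); // dummy node
    private Node<T> tail = head;

    BoundedQueue(int capacity) { this.capacity = capacity; }

    public void enq(T x) {
        boolean mustWakeDequeuers = false;
        enqLock.lock();
        try {
            // Assumption: awaitUninterruptibly() stands in for the slides' await().
            while (size.get() == capacity) notFullCondition.awaitUninterruptibly();
            Node<T> newNode = new Node<>(x);
            tail.next = newNode;
            tail = newNode;
            if (size.getAndIncrement() == 0) mustWakeDequeuers = true;
        } finally {
            enqLock.unlock();
        }
        if (mustWakeDequeuers) {
            deqLock.lock(); // lock before signaling to avoid a lost wakeup
            try { notEmptyCondition.signalAll(); } finally { deqLock.unlock(); }
        }
    }

    public T deq() {
        T result;
        boolean mustWakeEnqueuers = false;
        deqLock.lock();
        try {
            while (size.get() == 0) notEmptyCondition.awaitUninterruptibly();
            result = head.next.value;
            head = head.next;
            if (size.getAndDecrement() == capacity) mustWakeEnqueuers = true;
        } finally {
            deqLock.unlock();
        }
        if (mustWakeEnqueuers) {
            enqLock.lock();
            try { notFullCondition.signalAll(); } finally { enqLock.unlock(); }
        }
        return result;
    }
}
```

Enqueuers and dequeuers contend on different locks, so one enq and one deq can proceed in parallel; the shared AtomicInteger size is the only state both sides update.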

Unbounded Blocking Queue
Unbounded => size and capacity are unnecessary
Total => conditions are unnecessary
Initialization:
    tail = head = Node(value=null)
    enqLock, deqLock
[Figure: head and tail both reference the dummy node.]

Unbounded Blocking Queue - enq, deq
    public void enq(T x) {
        enqLock.lock();
        try {
            addNode(x);
        } finally {
            enqLock.unlock();
        }
    }

    public T deq() throws EmptyException {
        deqLock.lock();
        try {
            if (head.next == null)
                throw new EmptyException();
            T result = removeNodeAndGetValue();
            return result;
        } finally {
            deqLock.unlock();   // released even when EmptyException is thrown
        }
    }
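A self-contained version of the unbounded total queue above might look as follows. This is a sketch: the class name UnboundedQueue, the nested Node class, and declaring EmptyException as an unchecked nested exception are assumptions made for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch of an unbounded, total, two-lock queue (class name is hypothetical).
class UnboundedQueue<T> {
    // Assumption: EmptyException is modeled as an unchecked exception here.
    static final class EmptyException extends RuntimeException {}

    private static final class Node<T> {
        final T value;
        volatile Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final ReentrantLock enqLock = new ReentrantLock();
    private final ReentrantLock deqLock = new ReentrantLock();
    private Node<T> head = new Node<>(null); // dummy node
    private Node<T> tail = head;

    public void enq(T x) {
        enqLock.lock();
        try {
            Node<T> newNode = new Node<>(x);
            tail.next = newNode; // link after the current tail
            tail = newNode;      // advance tail
        } finally {
            enqLock.unlock();
        }
    }

    public T deq() throws EmptyException {
        deqLock.lock();
        try {
            if (head.next == null) throw new EmptyException();
            T result = head.next.value;
            head = head.next; // the removed node becomes the new dummy
            return result;
        } finally {
            deqLock.unlock();
        }
    }
}
```

Because the queue is total, deq never waits: on an empty queue it reports failure immediately by throwing.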

Unbounded Blocking Queue: Abstract vs. Concrete
The queue's actual (abstract) head and tail are not necessarily the items referenced by head and tail.
The actual head is the successor of the node referenced by head.
The actual tail is the last item reachable from the head.

Queue
We will review:
Blocking Queue
Lock-Free Queue (motivation: progress, performance)
  o ABA Problem
Synchronous Dual Queue

Blocking Algorithms: Cons
Performance on a multi-core processor suffers.
Using several locks exposes the code to error conditions such as deadlock.
Contention instead of progress:
  o A blocked thread does not make any progress.
  o If the thread that holds the lock gets stuck, the threads that try to acquire the lock get stuck as well.
A data structure protected by a lock cannot safely be accessed in an interrupt handler, as the preempted thread may be the one holding the lock.
Sources:

Non-Blocking Algorithms
Non-blocking: failure or suspension of any thread cannot cause failure or suspension of another thread.
Lock-free: there is guaranteed system-wide progress. Some thread finishes in a finite number of steps.
Wait-free: there is guaranteed per-thread progress. Every thread finishes in a finite number of steps.
Source:

Lock-Free Algorithms
In the absence of locks:
Use atomic read-modify-write hardware primitives, such as CAS (compare-and-swap).
But what about an action that is composed of several atomic primitive calls?
  o It should mark the data structure, so that other threads may finish the action.
(Figure labels: "lock", "unlock")
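The basic CAS pattern mentioned above is a read, compute, compare-and-set loop that retries when another thread got in first. A minimal sketch with java.util.concurrent.atomic.AtomicInteger follows; the class name CasCounter is hypothetical and not taken from the slides.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a lock-free counter built on a CAS retry loop (name is hypothetical).
class CasCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    public int increment() {
        while (true) {
            int current = value.get();  // read the current state
            int next = current + 1;     // compute the new state locally
            // Publish only if nobody changed the value in between; otherwise retry.
            if (value.compareAndSet(current, next)) return next;
        }
    }

    public int get() { return value.get(); }
}
```

A failed compareAndSet means some other thread's update succeeded, which is why the loop as a whole is lock-free: the system makes progress even when an individual thread has to retry.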

Lock-Free Queue
Unbounded ⇒ size and capacity are unnecessary
Lock-free ⇒ locks and conditions are unnecessary
Initialization:
    tail = head = Node(value=null)
    * head, tail and Node.next are AtomicReference
[Figure: head and tail both reference the dummy node.]

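Lock-Free Queue - enq
    public void enq(T x) {
        Node node = new Node(x);
        while (true) {
            Node last = tail.get();
            Node next = last.next.get();
            if (next == null) {
                // tail really is the last node: try to link the new node after it
                if (last.next.compareAndSet(next, node)) {
                    // advance tail; if this fails, another thread already did it
                    tail.compareAndSet(last, node);
                    return;
                }
            } else {
                // tail is lagging: help the other enqueuer by advancing it
                tail.compareAndSet(last, next);
            }
        }
    }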

Lock-Free Queue - deq
    public T deq() throws EmptyException {
        while (true) {
            Node first = head.get();
            Node last = tail.get();
            Node next = first.next.get();
            if (first != last) {
                T value = next.value;
                if (head.compareAndSet(first, next))
                    return value;
            } else {
                if (next == null)
                    throw new EmptyException();
                // tail is lagging behind a half-finished enq: help advance it
                tail.compareAndSet(last, next);
            }
        }
    }
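The enq and deq above can be combined into one compilable class. The following is a sketch that relies on Java's garbage collector, as the slides do at this point; the class name LockFreeQueue, the nested Node class, and the unchecked nested EmptyException are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch of an unbounded lock-free queue (class name is hypothetical).
class LockFreeQueue<T> {
    // Assumption: EmptyException is modeled as an unchecked exception here.
    static final class EmptyException extends RuntimeException {}

    private static final class Node<T> {
        final T value;
        final AtomicReference<Node<T>> next = new AtomicReference<>(null);
        Node(T value) { this.value = value; }
    }

    private final AtomicReference<Node<T>> head;
    private final AtomicReference<Node<T>> tail;

    LockFreeQueue() {
        Node<T> dummy = new Node<>(null);
        head = new AtomicReference<>(dummy);
        tail = new AtomicReference<>(dummy);
    }

    public void enq(T x) {
        Node<T> node = new Node<>(x);
        while (true) {
            Node<T> last = tail.get();
            Node<T> next = last.next.get();
            if (next == null) {
                if (last.next.compareAndSet(null, node)) { // step 1: link the node
                    tail.compareAndSet(last, node);        // step 2: advance tail (may be done by a helper)
                    return;
                }
            } else {
                tail.compareAndSet(last, next);            // help a half-finished enq
            }
        }
    }

    public T deq() throws EmptyException {
        while (true) {
            Node<T> first = head.get();
            Node<T> last = tail.get();
            Node<T> next = first.next.get();
            if (first != last) {
                T value = next.value;
                if (head.compareAndSet(first, next)) return value;
            } else {
                if (next == null) throw new EmptyException();
                tail.compareAndSet(last, next);            // help a half-finished enq
            }
        }
    }
}
```

The linked but not yet advanced tail is the "mark" on the data structure: any thread that observes last.next != null can finish the pending enq on behalf of its owner.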

ABA Problem
Consider our lock-free queue.
But stop relying on Java's garbage collector.
Each thread will maintain a private free list.
Now we encounter the ABA problem.

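[Figure: ABA scenario in the lock-free queue, in two steps (1, 2). The panels show head and tail over nodes a and b, nodes a and b moving to a thread's free list after deq calls, and node a being reused by enq(y). A stalled deq that read head = a and next = b can then succeed in its compareAndSet on head, even though b has already been recycled.]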

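ABA Solution: A Stamp
AtomicReference is replaced by AtomicStampedReference.
AtomicStampedReference encapsulates:
  o a reference to Node
  o an integer stamp
The stamp is incremented on each successful compareAndSet (the caller passes both the expected and the new stamp), so a reference that went from A to B and back to A no longer matches.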
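The effect of the stamp can be seen in a single-threaded sketch that simulates the interfering updates inline. The class and method names (StampedAbaDemo, plainCasSucceeds, stampedCasSucceeds) are hypothetical; the AtomicReference and AtomicStampedReference calls are the standard java.util.concurrent.atomic ones.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicStampedReference;

// Sketch: a stale CAS after an A -> B -> A change, with and without a stamp.
class StampedAbaDemo {
    // With a plain AtomicReference the stale CAS cannot tell that anything happened.
    static boolean plainCasSucceeds() {
        String a = "A", b = "B";
        AtomicReference<String> ref = new AtomicReference<>(a);
        String seen = ref.get();          // "slow thread" reads A
        ref.compareAndSet(a, b);          // interference: A -> B
        ref.compareAndSet(b, a);          // interference: B -> A
        return ref.compareAndSet(seen, b);
    }

    // With a stamp, the same history leaves the stamp at 2, so the stale CAS fails.
    static boolean stampedCasSucceeds() {
        String a = "A", b = "B";
        AtomicStampedReference<String> ref = new AtomicStampedReference<>(a, 0);
        int[] stampHolder = new int[1];
        String seen = ref.get(stampHolder);   // reads reference A and stamp 0
        int seenStamp = stampHolder[0];
        ref.compareAndSet(a, b, 0, 1);        // interference: A -> B, stamp 0 -> 1
        ref.compareAndSet(b, a, 1, 2);        // interference: B -> A, stamp 1 -> 2
        return ref.compareAndSet(seen, b, seenStamp, seenStamp + 1);
    }
}
```

In the queue, head, tail and Node.next would each carry such a stamp, so a recycled node with the same address is still distinguishable from the one a stalled thread read earlier.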

Synchronous Dual Queue: reservation, fulfilment

Blocking Queue States
[Figure: either items (y, x) between head and tail, or an empty queue with dequeuers (deq1, deq2, …) waiting on the NotEmpty condition.]

Synchronous Dual Queue States
[Figure: the queue holds enq reservations (items y, x), or is empty (head = tail, null), or holds deq reservations. Figure label: "fair vs."]
In the dual queue, enq and deq are symmetrical.
Both check the queue's state to figure out whether they should be a reserver or a fulfiller.

Synchronous Dual Queue
Reserver:
  o adds an item to the queue (lock-free: other threads might finish the addition),
  o then spins on the item.
Fulfiller:
  o changes the item (the rendezvous).
To end the rendezvous, the item is removed from the queue (lock-free: other threads might execute the removal).
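The reserver and fulfiller roles above can be sketched in code. The following is modeled on the synchronous dual queue in Chapter 10 of Herlihy and Shavit rather than taken from the slides; the Node layout, the NodeType enum, and the use of Thread.yield() while spinning are assumptions, and items are assumed to be non-null.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch of a synchronous dual queue (details are assumptions, see lead-in).
class SynchronousDualQueue<T> {
    private enum NodeType { ITEM, RESERVATION }

    private static final class Node<T> {
        final NodeType type;
        final AtomicReference<T> item;
        final AtomicReference<Node<T>> next = new AtomicReference<>(null);
        Node(T item, NodeType type) {
            this.item = new AtomicReference<>(item);
            this.type = type;
        }
    }

    private final AtomicReference<Node<T>> head;
    private final AtomicReference<Node<T>> tail;

    SynchronousDualQueue() {
        Node<T> sentinel = new Node<>(null, NodeType.ITEM);
        head = new AtomicReference<>(sentinel);
        tail = new AtomicReference<>(sentinel);
    }

    public void enq(T e) {
        Node<T> offer = new Node<>(e, NodeType.ITEM);
        while (true) {
            Node<T> t = tail.get(), h = head.get();
            if (h == t || t.type == NodeType.ITEM) {
                // Empty or holding items: act as reserver and append the offer.
                Node<T> n = t.next.get();
                if (t != tail.get()) continue;
                if (n != null) {
                    tail.compareAndSet(t, n);               // help finish an addition
                } else if (t.next.compareAndSet(null, offer)) {
                    tail.compareAndSet(t, offer);
                    while (offer.item.get() == e) Thread.yield(); // spin until taken
                    h = head.get();
                    if (offer == h.next.get()) head.compareAndSet(h, offer); // remove
                    return;
                }
            } else {
                // Holding deq reservations: act as fulfiller for the oldest one.
                Node<T> n = h.next.get();
                if (t != tail.get() || h != head.get() || n == null) continue;
                boolean success = n.item.compareAndSet(null, e); // the rendezvous
                head.compareAndSet(h, n);                        // end the rendezvous
                if (success) return;
            }
        }
    }

    public T deq() {
        Node<T> offer = new Node<>(null, NodeType.RESERVATION);
        while (true) {
            Node<T> t = tail.get(), h = head.get();
            if (h == t || t.type == NodeType.RESERVATION) {
                // Empty or holding reservations: append our own reservation.
                Node<T> n = t.next.get();
                if (t != tail.get()) continue;
                if (n != null) {
                    tail.compareAndSet(t, n);
                } else if (t.next.compareAndSet(null, offer)) {
                    tail.compareAndSet(t, offer);
                    T v;
                    while ((v = offer.item.get()) == null) Thread.yield(); // spin until filled
                    h = head.get();
                    if (offer == h.next.get()) head.compareAndSet(h, offer);
                    return v;
                }
            } else {
                // Holding items: take the oldest one.
                Node<T> n = h.next.get();
                if (t != tail.get() || h != head.get() || n == null) continue;
                T v = n.item.get();
                boolean success = v != null && n.item.compareAndSet(v, null);
                head.compareAndSet(h, n);
                if (success) return v;
            }
        }
    }
}
```

Waiting threads are queued in arrival order and each spins on its own node, in contrast to the blocking queue, where all waiting dequeuers share one condition.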