Practical and Lock-Free Doubly Linked Lists Håkan Sundell Philippas Tsigas.

Slides:



Advertisements
Similar presentations
Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems Håkan Sundell Philippas Tsigas.
Advertisements

Hazard Pointers: Safe Memory Reclamation of Lock-Free Objects
Håkan Sundell, Chalmers University of Technology 1 Evaluating the performance of wait-free snapshots in real-time systems Björn Allvin.
Chapter 6 Concurrency: Deadlock and Starvation Operating Systems: Internals and Design Principles, 6/E William Stallings Patricia Roy Manatee Community.
Architecture-aware Analysis of Concurrent Software Rajeev Alur University of Pennsylvania Amir Pnueli Memorial Symposium New York University, May 2010.
Maged M. Michael, “Hazard Pointers: Safe Memory Reclamation for Lock- Free Objects” Presentation Robert T. Bauer.
Scalable and Lock-Free Concurrent Dictionaries
(C) Ph. Tsigas © Ph. Tsigas Algorithm Engineering of Parallel Algorithms and Parallel Data Structures Philippas Tsigas.
Wait-Free Reference Counting and Memory Management Håkan Sundell, Ph.D.
Dynamic-Sized Nonblocking Hash Tables Yujie Liu Kunlong Zhang Michael Spear Lehigh Univ. Tianjin Univ. Lehigh Univ.
Håkan Sundell, Chalmers University of Technology 1 Space Efficient Wait-free Buffer Sharing in Multiprocessor Real-time Systems Based.
A Lock-Free Multiprocessor OS Kernel1 Henry Massalin and Calton Pu Columbia University June 1991 Presented by: Kenny Graunke.
Highly-Concurrent Data Structures Hagit Attiya and Eshcar Hillel Computer Science Department Technion.
E81 CSE 532S: Advanced Multi-Paradigm Software Development Chris Gill Department of Computer Science and Engineering Washington University in St. Louis.
Locality-Conscious Lock-Free Linked Lists Anastasia Braginsky & Erez Petrank 1.
Concurrent Data Structures in Architectures with Limited Shared Memory Support Ivan Walulya Yiannis Nikolakopoulos Marina Papatriantafilou Philippas Tsigas.
TOWARDS A SOFTWARE TRANSACTIONAL MEMORY FOR GRAPHICS PROCESSORS Daniel Cederman, Philippas Tsigas and Muhammad Tayyab Chaudhry.
Lock-free Cuckoo Hashing Nhan Nguyen & Philippas Tsigas ICDCS 2014 Distributed Computing and Systems Chalmers University of Technology Gothenburg, Sweden.
Computer Laboratory Practical non-blocking data structures Tim Harris Computer Laboratory.
CS510 Concurrent Systems Class 2 A Lock-Free Multiprocessor OS Kernel.
Locality in Concurrent Data Structures Hagit Attiya Technion.
1 Concurrency: Deadlock and Starvation Chapter 6.
CS533 - Concepts of Operating Systems 1 Class Discussion.
Simple, Fast and Practical Non- Blocking and Blocking Concurrent Queue Algorithms Maged M. Michael & Michael L. Scott Presented by Ahmed Badran.
Computer Laboratory Practical non-blocking linked lists Tim Harris Computer Laboratory.
SUPPORTING LOCK-FREE COMPOSITION OF CONCURRENT DATA OBJECTS Daniel Cederman and Philippas Tsigas.
Synchronization Methods for Multicore Programming Brendan Lynch.
Maria-Cristina Marinescu Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology A Synthesis Algorithm for Modular Design of.
Maria-Cristina Marinescu Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology High-level Specification and Efficient Implementation.
Algorithms for Synchronization and Consistency in Concurrent System Services Anders Gidenstam Distributed Computing and Systems group, Department of Computer.
1 Lock-Free Linked Lists Using Compare-and-Swap by John Valois Speaker’s Name: Talk Title: Larry Bush.
Design and Implementation of the Recursive Virtual Address Space Model for Small Scale Multiprocessor Systems Diploma Thesis Marcus Völp Supervisor: Volkmar.
Parallel Programming Philippas Tsigas Chalmers University of Technology Computer Science and Engineering Department © Philippas Tsigas.
Simple Wait-Free Snapshots for Real-Time Systems with Sporadic Tasks Håkan Sundell Philippas Tsigas.
CS510 Concurrent Systems Jonathan Walpole. A Lock-Free Multiprocessor OS Kernel.
Håkan Sundell, Chalmers University of Technology 1 Using Timing Information on Wait-Free Algorithms in Real-Time Systems (2 papers)
Understanding Performance of Concurrent Data Structures on Graphics Processors Daniel Cederman, Bapi Chatterjee, Philippas Tsigas Distributed Computing.
Håkan Sundell, Chalmers University of Technology 1 NOBLE: A Non-Blocking Inter-Process Communication Library Håkan Sundell Philippas.
November 15, 2007 A Java Implementation of a Lock- Free Concurrent Priority Queue Bart Verzijlenberg.
Håkan Sundell, Chalmers University of Technology 1 Applications of Non-Blocking Data Structures to Real-Time Systems Seminar for the.
Håkan Sundell, Chalmers University of Technology 1 Simple and Fast Wait-Free Snapshots for Real-Time Systems Håkan Sundell Philippas.
A Consistency Framework for Iteration Operations in Concurrent Data Structures Yiannis Nikolakopoulos A. Gidenstam M. Papatriantafilou P. Tsigas Distributed.
1 Announcements The fixing the bug part of Lab 4’s assignment 2 is now considered extra credit. Comments for the code should be on the parts you wrote.
Challenges in Non-Blocking Synchronization Håkan Sundell, Ph.D. Guest seminar at Department of Computer Science, University of Tromsö, Norway, 8 Dec 2005.
Non-blocking Data Structures for High- Performance Computing Håkan Sundell, PhD.
Maged M.Michael Michael L.Scott Department of Computer Science Univeristy of Rochester Presented by: Jun Miao.
1 Contention Management and Obstruction-free Algorithms Niloufar Shafiei.
CSE 425: Concurrency II Semaphores and Mutexes Can avoid bad inter-leavings by acquiring locks –Guard access to a shared resource to take turns using it.
Wait-Free Multi-Word Compare- And-Swap using Greedy Helping and Grabbing Håkan Sundell PDPTA 2009.
Practical concurrent algorithms Mihai Letia Concurrent Algorithms 2012 Distributed Programming Laboratory Slides by Aleksandar Dragojevic.
CS510 Concurrent Systems Why the Grass May Not Be Greener on the Other Side: A Comparison of Locking and Transactional Memory.
Hazard Pointers: Safe Memory Reclamation for Lock-Free Objects Maged M. Michael Presented by Abdulai Sei.
CS510 Concurrent Systems Jonathan Walpole. A Methodology for Implementing Highly Concurrent Data Objects.
Software Transactional Memory Should Not Be Obstruction-Free Robert Ennals Presented by Abdulai Sei.
November 27, 2007 Verification of a Concurrent Priority Queue Bart Verzijlenberg.
Lists (2). Circular Doubly-Linked Lists with Sentry Node Head.
A Calculus of Atomic Actions Tayfun Elmas, Shaz Qadeer and Serdar Tasiran POPL ‘ – Seminar in Distributed Algorithms Cynthia Disenfeld 27/05/2013.
Onlinedeeneislam.blogspot.com1 Design and Analysis of Algorithms Slide # 1 Download From
An algorithm of Lock-free extensible hash table Yi Feng.
Hazard Pointers: Safe Memory Reclamation for Lock-Free Objects MAGED M. MICHAEL PRESENTED BY NURIT MOSCOVICI ADVANCED TOPICS IN CONCURRENT PROGRAMMING,
CS510 Concurrent Systems Tyler Fetters. A Methodology for Implementing Highly Concurrent Data Objects.
NB-FEB: A Universal Scalable Easy- to-Use Synchronization Primitive for Manycore Architectures Phuong H. Ha (Univ. of Tromsø, Norway) Philippas Tsigas.
Combining HTM and RCU to Implement Highly Efficient Balanced Binary Search Trees Dimitrios Siakavaras, Konstantinos Nikas, Georgios Goumas and Nectarios.
Håkan Sundell Philippas Tsigas
Lock-Free Linked Lists Using Compare-and-Swap
A Lock-Free Algorithm for Concurrent Bags
Practical Non-blocking Unordered Lists
Anders Gidenstam Håkan Sundell Philippas Tsigas
NOBLE: A Non-Blocking Inter-Process Communication Library
A Concurrent Lock-Free Priority Queue for Multi-Thread Systems
Presentation transcript:

Practical and Lock-Free Doubly Linked Lists Håkan Sundell Philippas Tsigas

2 Outline  Synchronization Methods  Doubly Linked Lists  Concurrent Doubly Linked Lists –Previous results –New Lock-Free Algorithm  Experiments  Conclusions

3 Synchronization  Shared data structures needs synchronization  Synchronization using Locks –Mutually exclusive access to whole or parts of the data structure P1 P2 P3 P1 P2 P3

4 Blocking Synchronization  Drawbacks –Blocking –Priority Inversion –Risk of deadlock  Locks: Semaphores, spinning, disabling interrupts etc. –Reduced efficiency because of reduced parallelism

5 Non-blocking Synchronization  Lock-Free Synchronization –Optimistic approach (i.e. assumes no interference) 1.The operation is prepared to later take effect (unless interfered) using hardware atomic primitives 2.Possible interference is detected via the atomic primitives, and causes a retry –Can cause starvation  Wait-Free Synchronization –Always finishes in a finite number of its own steps.

6 Doubly Linked Lists  Fundamental data structure –Can be used to implement various abstract data types (e.g. deques)  Unordered List, i.e. the nodes are ordered only relatively to each other. –Supports Traversals –Supports Inserts/Deletes at arbitrary positions  Operations: InsertAfter, InsertBefore, Delete, Prev, Next, Read, First, Last HT

7 Previous Non-blocking Doubly Linked Lists  M. Greenwald, “Two-handed emulation: how to build non-blocking implementations of complex data structures using DCAS”, PODC 2002  O. Agesen et al., “DCAS-based concurrent deques”, SPAA 2000 –D. Detlefs et al., “Even better DCAS-based concurrent deques”, DISC 2000 –P. Martin et al. “DCAS-based concurrent deques supporting bulk allocation”, TR, 2002 –Errata: S. Doherty et al. “DCAS is not a silver bullet for nonblocking algorithm design”, SPAA 2004

8 Previous Non-blocking Doubly Linked Lists  H. Attiya and E. Hillel, “Built-in coloring for highly-concurrent doubly linked lists”, DISC 2006  J. Valois, Ph.D. Thesis, 1995  P. Martin (D. Lea) “A practical lock-free doubly-linked list”, 2004  Problems (unified, all prev. pub have subset): –DCAS not available in contempory sys. –No disjoint access parallelism –No traversals from deleted nodes. –No delete operation –Not consistent data structure when idle

9 New Lock-Free Concurrent Doubly Linked List ( Deque at OPODIS 2004 )  Treat the doubly linked list as a singly linked list with auxiliary information in each node about its predecessor!  Singly Linked Lists –T. Harris, “A pragmatic implementation of non-blocking linked lists”, DISC 2001  Marks pointers using spare bit  Needs only standard CAS HT

10 Lock-Free Doubly Linked Lists - INSERT

11 Lock-Free Doubly Linked Lists - DELETE

Lock-Free Doubly Linked List – Traversal and Positions  Informal: From a deleted node (i.e. earlier position) there should be a path that leads into the active list (i.e. a new position).  Definition 1 The position of a cursor that references a node that is present in the list is the referenced node. The position of a cursor that references a deleted node, is represented by the node that was directly to the next of the deleted node at the very moment of the deletion (i.e. the setting of the deletion mark). If that node is deleted as well, the position is equal to the position of a cursor referencing that node, and so on recursively. The actual position is then interpreted to be at an imaginary node directly previous of the representing node.

13 Lock-Free Doubly Linked List - Memory Management  The information about neighbor nodes should also be accessible in partially deleted nodes! –Enables helping operations to find –Enables continuous traversals  A. Gidenstam et. al, “Efficient and reliable lock-free memory reclamation based on reference counting”, ISPAN 2005 –Combines Hazard Pointers with Reference Count.  Allows fully dynamic memory management  Allows (also in deleted) in-node safe pointers  Breaks cyclic garbage  High performance

14 New Lock-Free Doubly Linked List - Techniques Summary  General Doubly Linked List Structure –Treated as singly linked lists with extra info  Uses CAS atomic primitive  Lock-Free memory management –Gidenstam et. al: (HP + Ref. Count).  Helping scheme  Back-Off strategy  All together proved to be linearizable

Experiments  Implemented in C on two high-performance computers –Sun Fire 15K, Solaris, Ultrasparc III 900 MHz, 48-way –IBM p690+ Regatta, AIX, 1.7 GHz, 32-way  random operations / thread –30% traversal, 10% insertbefore, 10% insertafter, 50% delete  1 upto 32 threads  Measured average execution time –Scales lineary with increasing number of threads

16 Conclusions  Implements a general doubly linked list, the first lock-free using CAS. –Allows disjoint-access parallelism. –Uses lock-free memory management. –Uses atomic primitives available in contemporary systems. –Allows fully dynamic list sizes. –Supports traversals of deleted nodes.

17 Questions?  Contact Information: –Address:  Håkan Sundell, School of Business and Informatics, University College of Borås, Sweden  Philippas Tsigas, Computing Science, Chalmers University of Technology, Sweden –   –Web: 