Practical Non-blocking Unordered Lists

Slides:



Advertisements
Similar presentations
Wait-Free Linked-Lists Shahar Timnat, Anastasia Braginsky, Alex Kogan, Erez Petrank Technion, Israel Presented by Shahar Timnat 469-+
Advertisements

CAFÉ: Scalable Task Pool with Adjustable Fairness and Contention Dmitry Basin, Rui Fan, Idit Keidar, Ofer Kiselov, Dmitri Perelman Technion, Israel Institute.
Wait-Free Queues with Multiple Enqueuers and Dequeuers
1 Chapter 4 Synchronization Algorithms and Concurrent Programming Gadi Taubenfeld © 2014 Synchronization Algorithms and Concurrent Programming Synchronization.
Architecture-aware Analysis of Concurrent Software Rajeev Alur University of Pennsylvania Amir Pnueli Memorial Symposium New York University, May 2010.
CS252: Systems Programming Ninghui Li Program Interview Questions.
1/20 Generalized Symbolic Execution for Model Checking and Testing Charngki PSWLAB Generalized Symbolic Execution for Model Checking and Testing.
I/O-Algorithms Lars Arge Fall 2014 September 25, 2014.
Scalable and Lock-Free Concurrent Dictionaries
Wait-Free Reference Counting and Memory Management Håkan Sundell, Ph.D.
Dynamic-Sized Nonblocking Hash Tables Yujie Liu Kunlong Zhang Michael Spear Lehigh Univ. Tianjin Univ. Lehigh Univ.
Scalable Synchronous Queues By William N. Scherer III, Doug Lea, and Michael L. Scott Presented by Ran Isenberg.
Locality-Conscious Lock-Free Linked Lists Anastasia Braginsky & Erez Petrank 1.
Lock-free Cuckoo Hashing Nhan Nguyen & Philippas Tsigas ICDCS 2014 Distributed Computing and Systems Chalmers University of Technology Gothenburg, Sweden.
© Copyright 1992–2004 by Deitel & Associates, Inc. and Pearson Education Inc. All Rights Reserved. Chapter 12 – Data Structures Outline 12.1Introduction.
Comparison Under Abstraction for Verifying Linearizability Daphna Amit Noam Rinetzky Mooly Sagiv Tom RepsEran Yahav Tel Aviv UniversityUniversity of Wisconsin.
SUPPORTING LOCK-FREE COMPOSITION OF CONCURRENT DATA OBJECTS Daniel Cederman and Philippas Tsigas.
C o n f i d e n t i a l Developed By Nitendra NextHome Subject Name: Data Structure Using C Title: Overview of Data Structure.
Comp 249 Programming Methodology Chapter 15 Linked Data Structure - Part B Dr. Aiman Hanna Department of Computer Science & Software Engineering Concordia.
A Consistency Framework for Iteration Operations in Concurrent Data Structures Yiannis Nikolakopoulos A. Gidenstam M. Papatriantafilou P. Tsigas Distributed.
Java Methods Big-O Analysis of Algorithms Object-Oriented Programming
A Methodology for Creating Fast Wait-Free Data Structures Alex Koganand Erez Petrank Computer Science Technion, Israel.
Practical concurrent algorithms Mihai Letia Concurrent Algorithms 2012 Distributed Programming Laboratory Slides by Aleksandar Dragojevic.
Collections Data structures in Java. OBJECTIVE “ WHEN TO USE WHICH DATA STRUCTURE ” D e b u g.
Priority Queues Dan Dvorin Based on ‘The Art of Multiprocessor Programming’, by Herlihy & Shavit, chapter 15.
An algorithm of Lock-free extensible hash table Yi Feng.
Hazard Pointers: Safe Memory Reclamation for Lock-Free Objects MAGED M. MICHAEL PRESENTED BY NURIT MOSCOVICI ADVANCED TOPICS IN CONCURRENT PROGRAMMING,
Memory Management Memory Areas and their use Memory Manager Tasks:
Cpt S 122 – Data Structures Abstract Data Types
11 Map ADTs Map concepts. Map applications.
Stacks.
Chapter 12 – Data Structures
Course Developer/Writer: A. J. Ikuomola
Hazard Pointers C++ Memory Ordering Issues
COMP 53 – Week Eleven Hashtables.
Combining HTM and RCU to Implement Highly Efficient Balanced Binary Search Trees Dimitrios Siakavaras, Konstantinos Nikas, Georgios Goumas and Nectarios.
Combining Data Structures
Håkan Sundell Philippas Tsigas
CSCI 210 Data Structures and Algorithms
Memory Management Memory Areas and their use Memory Manager Tasks:
Week 15 – Monday CS221.
Hashing Exercises.
A Lock-Free Algorithm for Concurrent Bags
Expander: Lock-free Cache for a Concurrent Data Structure
Efficiency add remove find unsorted array O(1) O(n) sorted array
Introduction to Data Structure
O(lg n) Search Tree Tree T is a search tree made up of n elements: x0 x1 x2 x3 … xn-1 No function (except transverse) takes more than O(lg n) in the.
CS302 Data Structures Fall 2012.
Anders Gidenstam Håkan Sundell Philippas Tsigas
Map interface Empty() - return true if the map is empty; else return false Size() - return the number of elements in the map Find(key) - if there is an.
Data Structures 2018 Quiz Answers
structures and their relationships." - Linus Torvalds
Chapter 8 – Binary Search Tree
Concurrent Data Structures Concurrent Algorithms 2017
Data Structures and Algorithms
slides adapted from Marty Stepp and Hélène Martin
Multicore programming
CSE 373 Data Structures and Algorithms
Cs212: Data Structures Computer Science Department Lecture 7: Queues.
Sorted Linked List A linked list is a data structure that consists of a sequence of data records such that in each record there is a field that contains.
Multicore programming
A Concurrent Lock-Free Priority Queue for Multi-Thread Systems
Introduction to Data Structures
Multicore programming
slides created by Marty Stepp
Memory Management Memory Areas and their use Memory Manager Tasks:
EECE.3220 Data Structures Instructor: Dr. Michael Geiger Spring 2019
structures and their relationships." - Linus Torvalds
MV-RLU: Scaling Read-Log-Update with Multi-Versioning
Presentation transcript:

Practical Non-blocking Unordered Lists Kunlong Zhang Yujiao Zhao Yajun Yang Tianjin University Yujie Liu Michael Spear Lehigh University 9/20/2018 DISC 2013

Background Linked lists are fundamental and important Building blocks of more complex data structures, i.e. hash tables Widely used in software systems, i.e. operating system kernels Non-blocking implementations Wait-free if every operation completes in bounded number of steps Lock-free if some operation completes in bounded number of steps (Individual threads may starve) Exist many practical lock-free data structures Lists, stacks, queues, skip lists, binary search trees, red black trees… Practical wait-free implementations are rare Stacks and queues by universal construction [FatourouKallimanis SPAA11] Queues and ordered lists using fast-path-slow-path [KoganPetrank PPoPP12] [Timnat et al. OPODIS12] 9/20/2018 DISC 2013

Our Work Lock-free and wait-free unordered list based set implementations The implementations are linearizable Operations happen atomically at some instant point between the invocation and response Building blocks A novel lock-free unordered list algorithm A wait-free enqueue technique [KoganPetrank PPoPP11] (Opt.) Fast-path-slow-path method [KoganPetrank PPoPP12] 9/20/2018 DISC 2013

Outline LFList WFList Performance evaluation Conclusion & discussion A lock-free unordered list based set WFList Based on the LFList algorithm Key: making “Enlist” wait-free Performance evaluation Conclusion & discussion 9/20/2018 DISC 2013

LFList: A Lock-Free List Based Set Insert(int k) : bool Add value k to the set, return true if k was not in set or false otherwise Remove(int k) : bool Remove value k from the set, return true if k was in the set or false otherwise Contains(int k) : bool Indicate whether k is in the set or not 9/20/2018 DISC 2013

LFList Primiteves Require garbage collection Compare-And-Swap (CAS) & atomic load/store of memory words Require garbage collection Key values are comparable (using equality operator) Not necessarily totally ordered Not necessarily bounded/discrete 9/20/2018 DISC 2013

Insert() Step One - Enlist head CAS CAS CAS 3 INSERT 7 INSERT 3 INSERT 9/20/2018 DISC 2013

Insert() Step Two - Traversal head 3 INSERT 7 INSERT 3 INSERT 9/20/2018 DISC 2013

Insert() Step Two - Traversal head 3 INVALID 7 INSERT 3 INSERT CAS Insert(): false 9/20/2018 DISC 2013

Insert() Step Two - Traversal head 3 INVALID 7 DATA 3 DATA CAS CAS Insert(): true Insert(): true 9/20/2018 DISC 2013

Remove() Step One - Enlist head CAS CAS 3 REMOVE 7 REMOVE 3 DATA 9/20/2018 DISC 2013

Remove() Step Two - Traversal head 3 REMOVE 7 REMOVE 3 DATA 9/20/2018 DISC 2013

Remove() Step Two - Traversal head 3 REMOVE 7 REMOVE 3 INVALID Store 9/20/2018 DISC 2013

Remove() Step Two - Traversal head 3 INVALID 7 REMOVE 3 INVALID Store Remove(): true 9/20/2018 DISC 2013

Remove() Step Two - Traversal head 3 INVALID 7 REMOVE 3 INVALID 9/20/2018 DISC 2013

Remove() Step Two - Traversal head 3 INVALID 7 INVALID 3 INVALID Store Remove(): false 9/20/2018 DISC 2013

Insert() and Remove(): Tricky Stuff During traversals INVALID nodes are ignored Return (likely) if encounter a non-INVALID node with the same key head Consider adding a picture showing early return on meeting an operation node Split the tricky stuff slide 3 REMOVE 3 REMOVE 5 DATA 9/20/2018 DISC 2013

Insert() and Remove(): Tricky Stuff Memory reclamation is subtle Only requires loads and stores INVALID nodes can “resurrect” See paper for details Linearization points Insert / Remove are at the successful CAS in Enlist; return values are determined but not yet “known” Contains? Consider adding a picture showing early return on meeting an operation node Split the tricky stuff slide 9/20/2018 DISC 2013

Contains() Search from the head for the first non-INVALID node with the key value Return true if the node is INSERT/DATA Return false otherwise (REMOVE) Behavior similar to LazyList [Heller et al. OPODIS06] Subsequent Remove() can bypass Contains() Same technique to prove linearizability [Colvin et al. CAV06] 9/20/2018 DISC 2013

Achieving Wait-freedom Idea: Make the Enlist operation wait-free Threads announce an operation by creating a descriptor in a shared array On each operation, scan the array to help others A descriptor contains sufficient information for anyone to help finish the operation Each operation is associated with a priority number to prevent unbounded helping Technique: Adapted from WF enqueue [KoganPetrank PPoPP11] 9/20/2018 DISC 2013

A Wait-free Enlist Implementation 9/20/2018 DISC 2013

1 2 3 head status phase = 0 pending = true 65 next INSERT prev tid = 2 1 2 3 head status phase = 0 pending = true 65 next INSERT prev tid = 2 -1 next REMOVE prev tid = -1 CAS sentinel 9/20/2018 DISC 2013

1 2 3 head status phase = 0 pending = true phase = 0 pending = false 1 2 3 head status CAS phase = 0 pending = true phase = 0 pending = false 65 next INSERT prev tid = 2 -1 next REMOVE prev tid = -1 sentinel 9/20/2018 DISC 2013

1 2 3 head status phase = 0 pending = false 65 next INSERT prev 1 2 3 head status phase = 0 pending = false CAS 65 next INSERT prev tid = 2 -1 next REMOVE prev tid = -1 dummy sentinel 9/20/2018 DISC 2013

An Adaptive (Wait-free) Algorithm Using the fast-path-slow-path method [KoganPetrank, PPoPP 12’] Threads start by running fast path Enlist() Fall back to slow path if succeed/fail too many times Fast path is not the Enlist in LFList! More like the wait-free Enlist, but taking out announcing and helping steps 9/20/2018 DISC 2013

Performance Evaluation HarrisAMR: Harris-Michael [Harris DISC01, Michael SPAA02] + Wait-free lookup [Heller et al. OPODIS06] Best known lock-free ordered list algorithm HarrisRTTI: Above algorithm + Java RTTI [Heller et al. OPODIS06] Best known implementation of above algorithm LazyList: An efficient lock-based algorithm [Heller et al. OPODIS06] LFList: Our lock-free unordered list algorithm WFList: Our vanilla wait-free unordered list algorithm Adaptive: Wait-free algorithm using fast-path-slow-path method FastPath: The fast-path lock-free algorithm used in the adaptive algorithm 9/20/2018 DISC 2013

Performance Evaluation Environment Xeon 5650, 6 cores (12 threads) 6GB memory Linux 2.6.37 + OpenJDK 1.6.0 Median of 5 trials (5 seconds) Variance below 5% Varied key range & R-W ratio Fix the ? 9/20/2018 DISC 2013

80% Lookup, [0,512), 256 Elems Need labels on axis Legends not overlapping with curves Note that over 6 threads triggers hyperthreading Lists are prepopulated 9/20/2018 DISC 2013

80% Lookup, [0,2K), 1K Elems Difference between unordered and ordered lists (traversal distance) 9/20/2018 DISC 2013

Conclusion & Discussion We presented practical lock-free and wait-free unordered list implementations Scalable, competitive performance for short lists (high contention) Constant factor slowdown for large lists Experience with the fast-path-slow-path adaptive method Given a lock-free algorithm, deriving its corresponding wait-free version is non-trivial (also shown in [Timnat et al., OPODIS 12’]) Given the wait-free algorithm, choosing its lock-free fast path needs care Future work Same technique (Enlist) can be used to implement WF stacks A static WF hash table is trivial to implement What other data structures can we build using unordered lists? 9/20/2018 DISC 2013