Hashing and Natural Parallism Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Slides:



Advertisements
Similar presentations
Hash Tables CS 310 – Professor Roch Weiss Chapter 20 All figures marked with a chapter and section number are copyrighted © 2006 by Pearson Addison-Wesley.
Advertisements

Hash Tables.
CS4432: Database Systems II Hash Indexing 1. Hash-Based Indexes Adaptation of main memory hash tables Support equality searches No range searches 2.
1 Hash-Based Indexes Module 4, Lecture 3. 2 Introduction As for any index, 3 alternatives for data entries k* : – Data record with key value k – –Choice.
Hash-Based Indexes The slides for this text are organized into chapters. This lecture covers Chapter 10. Chapter 1: Introduction to Database Systems Chapter.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Hash-Based Indexes Chapter 11.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Hash-Based Indexes Chapter 11.
Lecture 11 oct 6 Goals: hashing hash functions chaining closed hashing application of hashing.
Chapter 11 (3 rd Edition) Hash-Based Indexes Xuemin COMP9315: Database Systems Implementation.
What we learn with pleasure we never forget. Alfred Mercier Smitha N Pai.
Searching Kruse and Ryba Ch and 9.6. Problem: Search We are given a list of records. Each record has an associated key. Give efficient algorithm.
Lock-free Cuckoo Hashing Nhan Nguyen & Philippas Tsigas ICDCS 2014 Distributed Computing and Systems Chalmers University of Technology Gothenburg, Sweden.
Shared Counters and Parallelism Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.
Hash Table indexing and Secondary Storage Hashing.
1 Hash-Based Indexes Yanlei Diao UMass Amherst Feb 22, 2006 Slides Courtesy of R. Ramakrishnan and J. Gehrke.
1 CSE 326: Data Structures Hash Tables Autumn 2007 Lecture 14.
1 Hash-Based Indexes Chapter Introduction  Hash-based indexes are best for equality selections. Cannot support range searches.  Static and dynamic.
FALL 2004CENG 3511 Hashing Reference: Chapters: 11,12.
1 Hash-Based Indexes Chapter Introduction : Hash-based Indexes  Best for equality selections.  Cannot support range searches.  Static and dynamic.
Art of Multiprocessor Programming1 Concurrent Hashing Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit Modified.
Tirgul 7. Find an efficient implementation of a dynamic collection of elements with unique keys Supported Operations: Insert, Search and Delete. The keys.
Lecture 11 oct 7 Goals: hashing hash functions chaining closed hashing application of hashing.
Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit Concurrent Skip Lists.
Hashing. Hashing as a Data Structure Performs operations in O(c) –Insert –Delete –Find Is not suitable for –FindMin –FindMax –Sort or output as sorted.
COSC 2007 Data Structures II
Multicore Programming
CS121 Data Structures CS121 © JAS 2004 Tables An abstract table, T, contains table entries that are either empty, or pairs of the form (K, I) where K is.
Hashing and Hash-Based Index. Selection Queries Yes! Hashing  static hashing  dynamic hashing B+-tree is perfect, but.... to answer a selection query.
Hashing Sections 10.2 – 10.3 CS 302 Dr. George Bebis.
Hashing as a Dictionary Implementation Chapter 19.
Searching Given distinct keys k 1, k 2, …, k n and a collection of n records of the form »(k 1,I 1 ), (k 2,I 2 ), …, (k n, I n ) Search Problem - For key.
CSC 427: Data Structures and Algorithm Analysis
David Luebke 1 11/26/2015 Hash Tables. David Luebke 2 11/26/2015 Hash Tables ● Motivation: Dictionaries ■ Set of key/value pairs ■ We care about search,
Chapter 5: Hashing Part I - Hash Tables. Hashing  What is Hashing?  Direct Access Tables  Hash Tables 2.
Chapter 11 Hash Tables © John Urrutia 2014, All Rights Reserved1.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Hash-Based Indexes Chapter 11 Modified by Donghui Zhang Jan 30, 2006.
Hashing and Natural Parallism Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.
Introduction to Database, Fall 2004/Melikyan1 Hash-Based Indexes Chapter 10.
1.1 CS220 Database Systems Indexing: Hashing Slides courtesy G. Kollios Boston University via UC Berkeley.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Indexed Sequential Access Method.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 Hash-Based Indexes Chapter 10.
COSC 2007 Data Structures II Chapter 13 Advanced Implementation of Tables IV.
Building Java Programs Bonus Slides Hashing. 2 Recall: ADTs (11.1) abstract data type (ADT): A specification of a collection of data and the operations.
A Introduction to Computing II Lecture 11: Hashtables Fall Session 2000.
Hashtables. An Abstract data type that supports the following operations: –Insert –Find –Remove Search trees can be used for the same operations but require.
Concurrent Queues Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.
Hash Tables © Rick Mercer.  Outline  Discuss what a hash method does  translates a string key into an integer  Discuss a few strategies for implementing.
1 CSCD 326 Data Structures I Hashing. 2 Hashing Background Goal: provide a constant time complexity method of searching for stored data The best traditional.
Multiprocessor Architecture Basics Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.
Concurrent Hashing and Natural Parallelism Chapter 13 in The Art of Multiprocessor Programming Instructor: Erez Petrank Presented by Tomer Hermelin.
Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit Concurrent Skip Lists.
CS6045: Advanced Algorithms Data Structures. Hashing Tables Motivation: symbol tables –A compiler uses a symbol table to relate symbols to associated.
Hashing. Search Given: Distinct keys k 1, k 2, …, k n and collection T of n records of the form (k 1, I 1 ), (k 2, I 2 ), …, (k n, I n ) where I j is.
CMSC 341 Hashing Readings: Chapter 5. Announcements Midterm II on Nov 7 Review out Oct 29 HW 5 due Thursday CMSC 341 Hashing 2.
Hash Tables Ellen Walker CPSC 201 Data Structures Hiram College.
An algorithm of Lock-free extensible hash table Yi Feng.
Chapter 5 Record Storage and Primary File Organizations
CSC 413/513: Intro to Algorithms Hash Tables. ● Hash table: ■ Given a table T and a record x, with key (= symbol) and satellite data, we need to support:
The Relative Power of Synchronization Operations Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.
Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit Concurrent Skip Lists.
Building Java Programs Generics, hashing reading: 18.1.
Fundamental Structures of Computer Science II
Hopscotch Hashing Nir Shavit Tel Aviv U. & Sun Labs Joint work with Maurice Herlihy and Moran Tzafrir.
Dynamic Hashing (Chapter 12)
Hashing CENG 351.
Hashing and Natural Parallism
Concurrent Hashing and Natural Parallelism
CSE 373 Data Structures and Algorithms
Hashing and Natural Parallism
Presentation transcript:

Hashing and Natural Parallism Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Art of Multiprocessor Programming22 Sequential Closed Hash Map h(k) = k mod 4 2 Items buckets

Art of Multiprocessor Programming3 Add an Item h(k) = k mod 4 3 Items

Art of Multiprocessor Programming44 Add Another: Collision h(k) = k mod 4 4 Items

Art of Multiprocessor Programming55 More Collisions h(k) = k mod 4 5 Items

Art of Multiprocessor Programming66 More Collisions h(k) = k mod 4 5 Items Problem: buckets getting too long

Art of Multiprocessor Programming77 Resizing Grow the array 5 Items h(k) = k mod 8

Art of Multiprocessor Programming88 5 Items Resizing h(k) = k mod 8 Adjust hash function

Art of Multiprocessor Programming99 Resizing h(4) = 0 mod h(k) = k mod 8

Art of Multiprocessor Programming10 Resizing h(k) = k mod 8 h(4) = 4 mod 8

Art of Multiprocessor Programming11 Resizing h(k) = k mod 8 h(15) = 7 mod 8

Art of Multiprocessor Programming12 Resizing h(k) = k mod 8 h(15) = 7 mod 8 157

Art of Multiprocessor Programming13 Fields public class SimpleHashSet { protected LockFreeList[] table; public SimpleHashSet(int capacity) { table = new LockFreeList[capacity]; for (int i = 0; i < capacity; i++) table[i] = new LockFreeList(); } … Array of lock-free lists

Art of Multiprocessor Programming14 Constructor public class SimpleHashSet { protected LockFreeList[] table; public SimpleHashSet(int capacity) { table = new LockFreeList[capacity]; for (int i = 0; i < capacity; i++) table[i] = new LockFreeList(); } … Initial size

Art of Multiprocessor Programming15 Constructor public class SimpleHashSet { protected LockFreeList[] table; public SimpleHashSet(int capacity) { table = new LockFreeList[capacity]; for (int i = 0; i < capacity; i++) table[i] = new LockFreeList(); } … Allocate memory

Art of Multiprocessor Programming16 Constructor public class SimpleHashSet { protected LockFreeList[] table; public SimpleHashSet(int capacity) { table = new LockFreeList[capacity]; for (int i = 0; i < capacity; i++) table[i] = new LockFreeList(); } … Initialization

Art of Multiprocessor Programming17 Add Method public boolean add(Object key) { int hash = key.hashCode() % table.length; return table[hash].add(key); }

Art of Multiprocessor Programming18 Add Method public boolean add(Object key) { int hash = key.hashCode() % table.length; return table[hash].add(key); } Use object hash code to pick a bucket

Art of Multiprocessor Programming19 Add Method public boolean add(Object key) { int hash = key.hashCode() % table.length; return table[hash].add(key); } Call bucket’s add() method

Art of Multiprocessor Programming20 No Brainer? We just saw a –Simple –Lock-free –Concurrent hash-based set implementation What’s not to like?

Art of Multiprocessor Programming21 No Brainer? We just saw a –Simple –Lock-free –Concurrent hash-based set implementation What’s not to like? We don’t know how to resize …

Art of Multiprocessor Programming22 Is Resizing Necessary? Constant-time method calls require –Constant-length buckets –Table size proportional to set size –As set grows, must be able to resize

Art of Multiprocessor Programming23 Set Method Mix Typical load –90% contains() –9% add () –1% remove() Growing is important Shrinking not so much

Art of Multiprocessor Programming24 When to Resize? Many reasonable policies. Here’s one. Pick a threshold on num of items in a bucket Global threshold –When ≥ ¼ buckets exceed this value Bucket threshold –When any bucket exceeds this value

Art of Multiprocessor Programming25 Coarse-Grained Locking Good parts –Simple –Hard to mess up Bad parts –Sequential bottleneck

Art of Multiprocessor Programming26 Fine-grained Locking Each lock associated with one bucket

Art of Multiprocessor Programming27 Resize This Make sure table reference didn’t change between resize decision and lock acquisition Acquire locks in ascending order

Art of Multiprocessor Programming28 Resize This Allocate new super-sized table

Art of Multiprocessor Programming29 Resize This

Art of Multiprocessor Programming30 Resize This

Art of Multiprocessor Programming31 Resize This Striped Locks: each lock now associated with two buckets

Art of Multiprocessor Programming32 Observations We grow the table, but not locks –Resizing lock array is tricky … We use sequential lists –Not LockFreeList lists –If we’re locking anyway, why pay?

Art of Multiprocessor Programming33 Fine-Grained Hash Set public class FGHashSet { protected RangeLock[] lock; protected List[] table; public FGHashSet(int capacity) { table = new List[capacity]; lock = new RangeLock[capacity]; for (int i = 0; i < capacity; i++) { lock[i] = new RangeLock(); table[i] = new LinkedList(); }} …

Art of Multiprocessor Programming34 Fine-Grained Hash Set public class FGHashSet { protected RangeLock[] lock; protected List[] table; public FGHashSet(int capacity) { table = new List[capacity]; lock = new RangeLock[capacity]; for (int i = 0; i < capacity; i++) { lock[i] = new RangeLock(); table[i] = new LinkedList(); }} … Array of locks

Art of Multiprocessor Programming35 Fine-Grained Hash Set public class FGHashSet { protected RangeLock[] lock; protected List[] table; public FGHashSet(int capacity) { table = new List[capacity]; lock = new RangeLock[capacity]; for (int i = 0; i < capacity; i++) { lock[i] = new RangeLock(); table[i] = new LinkedList(); }} … Array of buckets

Art of Multiprocessor Programming36 Fine-Grained Hash Set public class FGHashSet { protected RangeLock[] lock; protected List[] table; public FGHashSet(int capacity) { table = new List[capacity]; lock = new RangeLock[capacity]; for (int i = 0; i < capacity; i++) { lock[i] = new RangeLock(); table[i] = new LinkedList(); }} … Initially same number of locks and buckets

Art of Multiprocessor Programming37 The add() method public boolean add(Object key) { int keyHash = key.hashCode() % lock.length; synchronized (lock[keyHash]) { int tabHash = key.hashCode() % table.length; return table[tabHash].add(key); }

Art of Multiprocessor Programming38 Fine-Grained Locking public boolean add(Object key) { int keyHash = key.hashCode() % lock.length; synchronized (lock[keyHash]) { int tabHash = key.hashCode() % table.length; return table[tabHash].add(key); } Which lock?

Art of Multiprocessor Programming39 The add() method public boolean add(Object key) { int keyHash = key.hashCode() % lock.length; synchronized (lock[keyHash]) { int tabHash = key.hashCode() % table.length; return table[tabHash].add(key); } Acquire the lock

Art of Multiprocessor Programming40 Fine-Grained Locking public boolean add(Object key) { int keyHash = key.hashCode() % lock.length; synchronized (lock[keyHash]) { int tabHash = key.hashCode() % table.length; return table[tabHash].add(key); } Which bucket?

Art of Multiprocessor Programming41 The add() method public boolean add(Object key) { int keyHash = key.hashCode() % lock.length; synchronized (lock[keyHash]) { int tabHash = key.hashCode() % table.length; return table[tabHash].add(key); } Call that bucket’s add() method

Art of Multiprocessor Programming42 Fine-Grained Locking private void resize(int depth, List[] oldTab) { synchronized (lock[depth]) { if (oldTab == this.table){ int next = depth + 1; if (next < lock.length) resize (next, oldTab); else sequentialResize(); }}} resize() calls resize(0,this.table)

Art of Multiprocessor Programming43 Resizing private void resize(int depth, List[] oldTab) { synchronized (lock[depth]) { if (oldTab == this.table){ int next = depth + 1; if (next < lock.length) resize (next, oldTab); else sequentialResize(); }}}

Art of Multiprocessor Programming44 Resizing private void resize(int depth, List[] oldTab) { synchronized (lock[depth]) { if (oldTab == this.table){ int next = depth + 1; if (next < lock.length) resize (next, oldTab); else sequentialResize(); }}} Acquire next lock

Art of Multiprocessor Programming45 Resizing private void resize(int depth, List[] oldTab) { synchronized (lock[depth]) { if (oldTab == this.table) { int next = depth + 1; if (next < lock.length) resize (next, oldTab); else sequentialResize(); }}} Check that no one else has resized

Art of Multiprocessor Programming46 Resizing private void resize(int depth, List[] oldTab) { synchronized (lock[depth]) { if (oldTab == this.table){ int next = depth + 1; if (next < lock.length) resize (next, oldTab); else sequentialResize(); }}} Recursively acquire next lock

Art of Multiprocessor Programming47 Resizing private void resize(int depth, List[] oldTab) { synchronized (lock[depth]) { if (oldTab == this.table){ int next = depth + 1; if (next < lock.length) resize (next, oldTab); else sequentialResize(); }}} Locks acquired, do the work

Art of Multiprocessor Programming48 Read/Write Locks public interface ReadWriteLock { Lock readLock(); Lock writeLock(); }

Art of Multiprocessor Programming49 Read/Write Locks public interface ReadWriteLock { Lock readLock(); Lock writeLock(); } Returns associated read lock

Art of Multiprocessor Programming50 Read/Write Locks public interface ReadWriteLock { Lock readLock(); Lock writeLock(); } Returns associated read lock Returns associated write lock

Art of Multiprocessor Programming51 Lock Safety Properties Read lock: –Locks out writers –Allows concurrent readers Write lock –Locks out writers –Locks out readers

Art of Multiprocessor Programming52 Read/Write Lock Safety –If readers > 0 then writer == false –If writer == true then readers == 0 Liveness? –Will a continual stream of readers … –Lock out writers?

Art of Multiprocessor Programming53 FIFO R/W Lock As soon as a writer requests a lock No more readers accepted Current readers “drain” from lock Writer gets in

Art of Multiprocessor Programming54 The Story So Far Resizing is the hard part Fine-grained locks –Striped locks cover a range (not resized) Read/Write locks –FIFO property tricky

Art of Multiprocessor Programming55 Optimistic Synchronization Let the contains() method –Scan without locking If it finds the key –OK to return true –Actually requires a proof …. What if it doesn’t find the key?

Art of Multiprocessor Programming56 Optimistic Synchronization If it doesn’t find the key –May be victim of resizing Must try again –Getting a read lock this time Makes sense if –Keys are present –Resizes are rare

Art of Multiprocessor Programming57 Stop The World Resizing Resizing stops all concurrent operations What about an incremental resize? Must avoid locking the table A lock-free table + incremental resizing?

Art of Multiprocessor Programming58 Lock-Free Resizing Problem

Art of Multiprocessor Programming Lock-Free Resizing Problem Need to extend table

Art of Multiprocessor Programming60 Lock-Free Resizing Problem

Art of Multiprocessor Programming61 Lock-Free Resizing Problem to remove and then add even a single item single location CAS not enough We need a new idea…

Art of Multiprocessor Programming62 Don’t move the items Move the buckets instead Keep all items in a single lock-free list Buckets become “shortcut pointers” into the list

Art of Multiprocessor Programming63 Recursive Split Ordering

Art of Multiprocessor Programming64 Recursive Split Ordering 0 1/

Art of Multiprocessor Programming65 Recursive Split Ordering 0 1/2 1 1/43/

Art of Multiprocessor Programming66 Recursive Split Ordering 0 1/2 1 1/43/ List entries sorted in order that allows recursive splitting. How?

Art of Multiprocessor Programming67 Recursive Split Ordering

Art of Multiprocessor Programming68 Recursive Split Ordering LSB = Least significant Bit LSB 0 LSB 1

Art of Multiprocessor Programming69 Recursive Split Ordering LSB 00LSB 10LSB 01LSB 11

Art of Multiprocessor Programming70 Split-Order If the table size is 2 i, –Bucket b contains keys k k = b (mod 2 i ) –bucket index consists of key's i LSBs

Art of Multiprocessor Programming71 When Table Splits Some keys stay –b = k mod(2 i+1 ) Some move –b+2 i = k mod(2 i+1 ) Determined by (i+1) st bit –Counting backwards Key must be accessible from both –Keys that will move must come later

Art of Multiprocessor Programming72 A Bit of Magic Real keys:

Art of Multiprocessor Programming73 A Bit of Magic Real keys: Split-order: Real key 1 is in the 4 th location

Art of Multiprocessor Programming74 A Bit of Magic Real keys: Split-order: Real key 1 is in 4 th location

Art of Multiprocessor Programming75 A Bit of Magic Real keys: Split-order:

Art of Multiprocessor Programming76 A Bit of Magic Real keys: Split-order: Just reverse the order of the key bits

Art of Multiprocessor Programming77 Split Ordered Hashing Order according to reversed bits

Art of Multiprocessor Programming78 Parent Always Provides a Short Cut search

Art of Multiprocessor Programming79 Sentinel Nodes Problem: how to remove a node pointed by 2 sources using CAS

Art of Multiprocessor Programming80 Sentinel Nodes Solution: use a Sentinel node for each bucket 0 1

Art of Multiprocessor Programming81 Sentinel vs Regular Keys Want sentinel key for i ordered –before all keys that hash to bucket i –after all keys that hash to bucket (i-1)

Art of Multiprocessor Programming82 Splitting a Bucket We can now split a bucket In a lock-free manner Using two CAS() calls... –One to add the sentinel to the list –The other to point from the bucket to the sentinel

Art of Multiprocessor Programming83 Initialization of Buckets

Art of Multiprocessor Programming84 Initialization of Buckets Need to initialize bucket 3 to split bucket 1

Art of Multiprocessor Programming85 Adding = 2 mod 4 2 Must initialize bucket 2 Before adding 10

Art of Multiprocessor Programming86 Recursive Initialization = 3 mod 4 To add 7 to the list 3 Must initialize bucket 3Must initialize bucket 1 = 1 mod 2 1 Could be log n depth But expected depth is constant

Art of Multiprocessor Programming87 Lock-Free List int makeRegularKey(int key) { return reverse(key | 0x ); } int makeSentinelKey(int key) { return reverse(key); }

Art of Multiprocessor Programming88 Lock-Free List int makeRegularKey(int key) { return reverse(key | 0x ); } int makeSentinelKey(int key) { return reverse(key); } Regular key: set high-order bit to 1 and reverse

Art of Multiprocessor Programming89 Lock-Free List int makeRegularKey(int key) { return reverse(key | 0x ); } int makeSentinelKey(int key) { return reverse(key); } Sentinel key: simply reverse (high-order bit is 0)

Art of Multiprocessor Programming90 Main List Lock-Free List from earlier class With some minor variations

Art of Multiprocessor Programming91 Lock-Free List public class LockFreeList { public boolean add(Object object, int key) {...} public boolean remove(int k) {...} public boolean contains(int k) {...} public LockFreeList(LockFreeList parent, int key) {...}; }

Art of Multiprocessor Programming92 Lock-Free List public class LockFreeList { public boolean add(Object object, int key) {...} public boolean remove(int k) {...} public boolean contains(int k) {...} public LockFreeList(LockFreeList parent, int key) {...}; } Change: add takes key argument

Art of Multiprocessor Programming93 Lock-Free List public class LockFreeList { public boolean add(Object object, int key) {...} public boolean remove(int k) {...} public boolean contains(int k) {...} public LockFreeList(LockFreeList parent, int key) {...}; } Inserts sentinel with key if not already present …

Art of Multiprocessor Programming94 Lock-Free List public class LockFreeList { public boolean add(Object object, int key) {...} public boolean remove(int k) {...} public boolean contains(int k) {...} public LockFreeList(LockFreeList parent, int key) {...}; } … returns new list starting with sentinel (shares with parent)

Art of Multiprocessor Programming95 Split-Ordered Set: Fields public class SOSet { protected LockFreeList[] table; protected AtomicInteger tableSize; protected AtomicInteger setSize; public SOSet(int capacity) { table = new LockFreeList[capacity]; table[0] = new LockFreeList(); tableSize = new AtomicInteger(2); setSize = new AtomicInteger(0); }

Art of Multiprocessor Programming96 Fields public class SOSet { protected LockFreeList[] table; protected AtomicInteger tableSize; protected AtomicInteger setSize; public SOSet(int capacity) { table = new LockFreeList[capacity]; table[0] = new LockFreeList(); tableSize = new AtomicInteger(2); setSize = new AtomicInteger(0); } For simplicity treat table as big array …

Art of Multiprocessor Programming97 Fields public class SOSet { protected LockFreeList[] table; protected AtomicInteger tableSize; protected AtomicInteger setSize; public SOSet(int capacity) { table = new LockFreeList[capacity]; table[0] = new LockFreeList(); tableSize = new AtomicInteger(2); setSize = new AtomicInteger(0); } In practice, want something that grows dynamically

Art of Multiprocessor Programming98 Fields public class SOSet { protected LockFreeList[] table; protected AtomicInteger tableSize; protected AtomicInteger setSize; public SOSet(int capacity) { table = new LockFreeList[capacity]; table[0] = new LockFreeList(); tableSize = new AtomicInteger(2); setSize = new AtomicInteger(0); } How much of table array are we actually using?

Art of Multiprocessor Programming99 Fields public class SOSet { protected LockFreeList[] table; protected AtomicInteger tableSize; protected AtomicInteger setSize; public SOSet(int capacity) { table = new LockFreeList[capacity]; table[0] = new LockFreeList(); tableSize = new AtomicInteger(2); setSize = new AtomicInteger(0); } Track set size so we know when to resize

Art of Multiprocessor Programming100 Fields public class SOSet { protected LockFreeList[] table; protected AtomicInteger tableSize; protected AtomicInteger setSize; public SOSet(int capacity) { table = new LockFreeList[capacity]; table[0] = new LockFreeList(); tableSize = new AtomicInteger(1); setSize = new AtomicInteger(0); } Initially use single bucket, and size is zero

Art of Multiprocessor Programming101 add() public boolean add(Object object) { int hash = object.hashCode(); int bucket = hash % tableSize.get(); int key = makeRegularKey(hash); LockFreeList list = getBucketList(bucket); if (!list.add(object, key)) return false; resizeCheck(); return true; }

Art of Multiprocessor Programming102 add() public boolean add(Object object) { int hash = object.hashCode(); int bucket = hash % tableSize.get(); int key = makeRegularKey(hash); LockFreeList list = getBucketList(bucket); if (!list.add(object, key)) return false; resizeCheck(); return true; } Pick a bucket

Art of Multiprocessor Programming103 add() public boolean add(Object object) { int hash = object.hashCode(); int bucket = hash % tableSize.get(); int key = makeRegularKey(hash); LockFreeList list = getBucketList(bucket); if (!list.add(object, key)) return false; resizeCheck(); return true; } Non-Sentinel split-ordered key

Art of Multiprocessor Programming104 add() public boolean add(Object object) { int hash = object.hashCode(); int bucket = hash % tableSize.get(); int key = makeRegularKey(hash); LockFreeList list = getBucketList(bucket); if (!list.add(object, key)) return false; resizeCheck(); return true; } Get reference to bucket’s sentinel, initializing if necessary

Art of Multiprocessor Programming105 add() public boolean add(Object object) { int hash = object.hashCode(); int bucket = hash % tableSize.get(); int key = makeRegularKey(hash); LockFreeList list = getBucketList(bucket); if (!list.add(object, key)) return false; resizeCheck(); return true; } Call bucket’s add() method with reversed key

Art of Multiprocessor Programming106 add() public boolean add(Object object) { int hash = object.hashCode(); int bucket = hash % tableSize.get(); int key = makeRegularKey(hash); LockFreeList list = getBucketList(bucket); if (!list.add(object, key)) return false; resizeCheck(); return true; } No change? We’re done.

Art of Multiprocessor Programming107 add() public boolean add(Object object) { int hash = object.hashCode(); int bucket = hash % tableSize.get(); int key = makeRegularKey(hash); LockFreeList list = getBucketList(bucket); if (!list.add(object, key)) return false; resizeCheck(); return true; } Time to resize?

Art of Multiprocessor Programming108 Resize Divide set size by total number of buckets If quotient exceeds threshold –Double tableSize field –Up to fixed limit

Art of Multiprocessor Programming109 Initialize Buckets Buckets originally null If you find one, initialize it Go to bucket’s parent –Earlier nearby bucket –Recursively initialize if necessary Constant expected work

Art of Multiprocessor Programming110 Recall: Recursive Initialization = 3 mod 4 To add 7 to the list 3 Must initialize bucket 3Must initialize bucket 1 = 1 mod 2 1 Could be log n depth expected depth is constant

Art of Multiprocessor Programming111 Initialize Bucket void initializeBucket(int bucket) { int parent = getParent(bucket); if (table[parent] == null) initializeBucket(parent); int key = makeSentinelKey(bucket); LockFreeList list = new LockFreeList(table[parent], key); }

Art of Multiprocessor Programming112 Initialize Bucket void initializeBucket(int bucket) { int parent = getParent(bucket); if (table[parent] == null) initializeBucket(parent); int key = makeSentinelKey(bucket); LockFreeList list = new LockFreeList(table[parent], key); } Find parent, recursively initialize if needed

Art of Multiprocessor Programming113 Initialize Bucket void initializeBucket(int bucket) { int parent = getParent(bucket); if (table[parent] == null) initializeBucket(parent); int key = makeSentinelKey(bucket); LockFreeList list = new LockFreeList(table[parent], key); } Prepare key for new sentinel

Art of Multiprocessor Programming114 Initialize Bucket void initializeBucket(int bucket) { int parent = getParent(bucket); if (table[parent] == null) initializeBucket(parent); int key = makeSentinelKey(bucket); LockFreeList list = new LockFreeList(table[parent], key); } Insert sentinel if not present, and get back reference to rest of list

Art of Multiprocessor Programming115 Correctness Linearizable concurrent set Theorem: O(1) expected time –No more than O(1) items expected between two dummy nodes on average –Lazy initialization causes at most O(1) expected recursion depth in initializeBucket()

Closed (Chained) Hashing Advantages: –with N buckets, M items, Uniform h –retains good performance as table density (M/N) increases  less resizing Disadvantages: –dynamic memory allocation –bad cache behavior (no locality) Oh, did we mention that cache behavior matters on a multicore?

Linear Probing* contains(x) – search linearly from h(x) to h(x) + H recorded in bucket. H z x h(x) =7 z *Attributed to Amdahl…

Linear Probing add(x) – put in first empty bucket, and update H. H z z h(x) =3 zzzzzz zz zzzz x =6

Linear Probing Open address means M · N Expected items in bucket same as Chaining Expected distance till open slot: ½(1+(1/(1-M/N)) 2 M/N = 0.5  search 2.5 buckets M/N = 0.9  search 50 buckets

Linear Probing Advantages: –Good locality  fewer cache misses Disadvantages: –As M/N increases more cache misses searching 10s of unrelated buckets “Clustering” of keys into neighboring buckets –As computation proceeds “Contamination” by deleted items  more cache misses

Cuckoo Hashing Add(x) – if h 1 (x) and h 2 (x) full evict y and move it to h 2 (y)  h 2 (x). Then place x in its place. z h 1 (x) zzzzz z z zzzz z h 2 (x) z z zwzz zz z zzz h 2 (y) y x But cycles can form

Cuckoo Hashing Advantages: –contains() : deterministic 2 buckets –No clustering or contamination Disadvantages: –2 tables –h i (x) are complex –As M/N increases  relocation cycles –Above M/N = 0.5 Add() does not work!

Hopscotch Hashing Single Array, Simple hash function Idea: define neighborhood of original bucket In neighborhood items found quickly Use sequences of displacements to move items into their neighborhood

Hopscotch Hashing contains(x) – search in at most H buckets (the hop-range) based on hop-info bitmap. In practice pick H to be 32. z h(x) H=4 x 101 0

Hopscotch Hashing add(x) – probe linearly to find open slot. Move the empty slot via sequence of displacements into the hop-range of h(x). z z h(x) x x r r s s v v u u w w

Hopscotch Hashing contains –wait-free, just look in neighborhood

Hopscotch Hashing contains –wait-free, just look in neighborhood add –expected distance same as in linear probing

Hopscotch Hashing contains –wait-free, just look in neighborhood add –Expected distance same as in linear probing resize –neighborhood full less likely as H  log n –one word hop-info bitmap, or use smaller H and default to linear probing

Advantages Good locality and cache behavior As table density (M/N) increases  less resizing Move cost to add() from contains() Easy to parallelize

Recall: Concurrent Chained Hashing Lock for add() and unsuccessful contains() Striped Locks

Concurrent Simple Hopscotch contains() is wait-free h(x)

Concurrent Simple Hopscotch Add(x) – lock bucket, mark empty slot using CAS, add x erasing mark z z r r v v u u ts x x

Concurrent Simple Hopscotch add(x) – lock bucket, mark empty slot using CAS, lock bucket and update timestamp of bucket being displaced before erasing old value z z v v u u ts r r s s s s

Concurrent Simple Hopscotch A z z r r v v u u ts s s x not found

Is performance dominated by cache behavior? Run algs on state of the art multicores and uniprocessors: –Sun 64 way Niagara II, and –Intel 3GHz Xeon Benchmarks pre-allocated memory to eliminate effects of memory management

with memory pre-allocated

Cuckoo stops here

with memory pre-allocated with allocation

Summary Chained hash with striped locking is simple and effective in many cases Hopscotch with striped locking great cache behavior If incremental resizing needed go for split-ordered

Art of Multiprocessor Programming144 This work is licensed under a Creative Commons Attribution- ShareAlike 2.5 License.Creative Commons Attribution- ShareAlike 2.5 License You are free: –to Share — to copy, distribute and transmit the work –to Remix — to adapt the work Under the following conditions: –Attribution. You must attribute the work to “The Art of Multiprocessor Programming” (but not in any way that suggests that the authors endorse you or your use of the work). –Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license. For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to – Any of the above conditions can be waived if you get permission from the copyright holder. Nothing in this license impairs or restricts the author's moral rights.