Download presentation
Presentation is loading. Please wait.
1
Multicore programming
Unordered sets: concurrent hash tables Week 3 - Monday Trevor Brown
2
Last time Fixed size, probing hashtables
With deletion (using tombstones)
3
How tombstones fix our problem
Insert(7) hashes to 2 Insert(3) hashes to 5 7 9 3 Insert(9) hashes to 2 Delete(7) hashes to 2 Delete(9) hashes to 2
4
Downsides of tombstones?
TOMBSTONEs are never cleaned up Cleaning them up seems hard Cannot remove a TOMBSTONE until there are no keys in the table that probed past the TOMBSTONE when they were inserted Thought: if we do table expansion, there’s no need to copy over TOMBSTONEs… 9 3 Probes when 9 was inserted
5
Efficiently Expanding a hash table Concurrent Hash Tables: Fast and General(?)! [MSD2016]
Issues to solve When to expand? Too much probing? Estimate of table size vs capacity? Which threads will perform the expansion? Enslaving user threads that access the table? Custom thread pool? How to expand? Block updates when expansion is happening? Always allow updates? Hard to always allow updates “we are fine with small amounts of (b)locking, as long as it improves the overall performance” (a healthy attitude)
6
High level idea When estimated size > 50% of table size, expand by 2x & migrate contents into the new table Naïve idea: global lock on table, copy with one thread --- inefficient! Partition old table into chunks (default size 4096) Numbered 0, 1, 2, …, floor(capacity / 4096) Threads use fetch & add to claim chunks they will migrate Migrate a chunk to the new table by performing insert() on each key
7
What if expansion is Concurrent with updates?
1 2 Thread 1: F&A to claim a chunk chunksClaimed Thread 2: F&A to claim a chunk Thread 1 Thread 2 Thread 1: see bucket 0 is empty Thread 1: copy bucket 1 (val 6) Old table 9 6 3 4 2 Thread 3: insert 9 at index 0 Threads 1&2: finish expansion Lost key 9! Must prevent updates to cells we’ve already migrated! New table 6 3 4 2
8
Solution: marking Steal a bit from each key field
Use it to store a mark Invariant: a bucket with a marked key cannot be modified In expansion, mark each bucket before migrating it currKey = bucket->key CAS(bucket->key, currKey, currKey | MARKED_MASK) In insert/delete, before performing CAS(bucket->key, currKey, newKey), must verify that old is not marked (otherwise, help with expansion) Could we use a write to mark the bucket, instead of CAS? unmarked marked 7 7 X
9
How marking helps 1 2 Thread 1: F&A to claim a chunk chunksClaimed
Thread 1: F&A to claim a chunk chunksClaimed Thread 2: F&A to claim a chunk Thread 1 Thread 2 Thread 1: mark bucket 0 Thread 1: mark & copy bucket 1 Old table 6 3 4 2 X X Thread 3: try idx0 CAS(&bucket[0]->key, <NULL, unmarked>, <9, unmarked>) CAS FAILS! New table 6 Thread 3: help expansion (if there are any unclaimed chunks) After expansion, retry insert(9)
10
Detecting table > 50% full
Approximate counter: Global data: volatile int size Per-thread data: int numInserts When a thread inserts, it increments its own private numInserts After a thread does Ω(𝑛𝑢𝑚𝑇ℎ𝑟𝑒𝑎𝑑s) increments, it does Fetch&Add(&size, numInserts) then sets its private numInserts to 0 How far off can size be from the true value? 𝐸𝑟𝑟𝑜𝑟=𝑐⋅𝑛𝑢𝑚𝑇ℎ𝑟𝑒𝑎𝑑 𝑠 2 Insignificant for large tables (for c=1, 100 threads, error is 10000: 1% of 1M keys) Could we do better? Reduces contention on size vs F&A every time
11
Implementation sketch
With expansion struct hashmap 1 volatile char padding0[64]; 2 table * volatile currentTable; 3 volatile char padding1[64]; /* code for operations ... */ Without expansion struct hashmap 1 volatile char padding0[64]; 2 volatile uint64_t * data; 3 const int capacity; 4 volatile char padding1[64]; /* code for operations ... */ struct table 1 volatile char padding0[64]; 2 volatile uint64_t * data; 3 volatile uint64_t * old; 4 volatile int chunksClaimed; 5 volatile int size; 6 const int capacity; 7 volatile char padding1[64];
12
Implementation sketch
Check if expansion is in progress, and help if so. Otherwise, create new struct table and CAS it into currentTable. int hashmap::insert(int key) table * t = currentTable; int h = hash(key); for (int i=0;i<capacity;++i) { if (isTooFull(t)) { tryExpand(); return insert(key); } int index = (h+i) % t->capacity; int found = t->data[index]; if (found & MARKED_MASK) { helpExpand(); return insert(key); } else if (found == key) return false; else if (found == NULL) { if (CAS(&t->data[index], NULL, key)) return true; else { found = t->data[index]; } } } assert(false); Check if expansion is in progress, and help if so. Check if expansion is in progress, and help if so. Could fail CAS become someone inserted, or because someone marked
13
Summary Deletion in concurrent hash tables
Expansion of concurrent hash tables Padding to avoid false sharing in the hash table At the start & end of the hash table struct/class At the start & end of the data array Not padding each bucket!
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.