Engineering a Sorted List Data Structure for 32 Bit Keys
Roman Dementiev, Lutz Kettner, Jens Mehnert, Peter Sanders
MPI für Informatik, Saarbrücken

R. Dementiev et al.: A Sorted List Data Structure for 32 Bit Keys (ALENEX'04)

Introduction
The power of integer keys helps in
– Sorting (radix MSB, LSB)
– Priority queues (radix heaps)
– Static search trees
– Dictionaries (hash tables)
Faster both in theory and practice.
What about dynamic search data structures?

Motivation
van Emde Boas (vEB) search trees [van Emde Boas77, MehlhornNaeher90]:

operation                 comparison based   van Emde Boas
insert, delete, search    O(log n)           O(log K)
range query               O(c + log n)       O(c + log K)

n – number of elements
K – bit width of keys
c – size of the output

Small K, large n → vEB are faster? NO, their direct implementations are 2-8 times slower than comparison-based trees [Wenzel92, here].
Here: a tuned vEB data structure that outperforms comparison-based implementations.
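
To make the table concrete, the following sketch evaluates both bounds for K = 32 and a few set sizes. The function names are illustrative only, and constant factors are ignored, which is exactly why the measured picture can differ from these counts.

```python
import math

K = 32  # bit width of the keys, as in the talk


def comparison_steps(n: int) -> int:
    """About log2(n): depth of a balanced comparison-based search tree."""
    return math.ceil(math.log2(n))


def veb_levels(k: int) -> int:
    """About log2(K): recursion depth when the key width halves per level."""
    return math.ceil(math.log2(k))


for n in (2**10, 2**20, 2**30):
    print(n, comparison_steps(n), veb_levels(K))
```

The comparison-based count grows with n while the vEB count stays fixed for a fixed key width.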

Direct vEB Implementation
A vEB tree maintains a set M of K-bit keys.
Recursive definition:
– |M| = 1 or K = 1: store M directly,
– otherwise let K' = K/2 and store
  – min M, max M,
  – top: a K'-bit vEB tree (top recursion),
  – bot_i: K'-bit vEB trees (bottom recursion), found via a hash table.
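
The recursive definition can be sketched in a few lines of Python. This is a plain, untuned illustration and not the authors' code: the class name `VEB`, the method names, and the choice to keep the high halves of the keys in `top` and the low halves in `bot[i]` are assumptions based on the textbook vEB layout. `locate(y)` is assumed to return the smallest stored key that is at least y, or `None`.

```python
class VEB:
    """Set of k-bit integers built by the recursive rule (no tuning)."""

    def __init__(self, k):
        self.k = k
        self.half = k // 2   # K' = K/2: width of the low part of a key
        self.min = None      # min M, None while the set is empty
        self.max = None      # max M
        self.top = None      # vEB over the high parts, created on demand
        self.bot = {}        # hash table: high part -> vEB of low parts

    def _push(self, x):
        hi, lo = x >> self.half, x & ((1 << self.half) - 1)
        if hi not in self.bot:
            self.bot[hi] = VEB(self.half)
            self.top.insert(hi)
        self.bot[hi].insert(lo)

    def insert(self, x):
        if self.min is None:             # |M| = 1: store directly
            self.min = self.max = x
            return
        if self.k > 1:
            if self.top is None:         # second element: start recursing
                if x == self.min:
                    return
                self.top = VEB(self.k - self.half)
                self._push(self.min)
            self._push(x)
        # for k = 1 the pair (min, max) already describes the whole set
        self.min, self.max = min(self.min, x), max(self.max, x)

    def locate(self, y):
        """Smallest stored key >= y, or None if there is none."""
        if self.min is None or y > self.max:
            return None
        if y <= self.min:
            return self.min
        if self.k == 1:                  # here min < y <= max, so y = max = 1
            return self.max
        hi, lo = y >> self.half, y & ((1 << self.half) - 1)
        b = self.bot.get(hi)
        if b is not None and lo <= b.max:
            return (hi << self.half) | b.locate(lo)   # answer in same bucket
        nxt = self.top.locate(hi + 1)                 # next non-empty bucket
        return (nxt << self.half) | self.bot[nxt].min


t = VEB(32)
for x in (7, 70000, 4000000000):
    t.insert(x)
print(t.locate(8))  # prints 70000
```

Each `locate` call recurses into either one bottom structure or the top structure, never both, which is where the O(log K) bound comes from. The many small objects and hash lookups also hint at why a direct implementation is slow in practice.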

Improvement 1
Replace the top data structure with a bit pattern hierarchy.
[Figure: bit pattern hierarchy in place of the K'-bit vEB top; hash table; K'-bit vEB bot_i]
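
One way to picture a bit pattern hierarchy is a stack of bitmaps in which each bit of a higher level records whether the corresponding word one level below is non-zero. The sketch below is an assumption about the general idea, not the layout used in the talk: the word size `W = 64`, the class name `BitHierarchy`, and its methods are all hypothetical.

```python
W = 64  # bits per word (an assumption; the slides do not fix a word size)


def lowest_bit(word):
    """Index of the least significant set bit of a non-zero integer."""
    return (word & -word).bit_length() - 1


class BitHierarchy:
    """Bitmaps stacked so that a set bit marks a non-zero word below it."""

    def __init__(self, universe_bits):
        self.levels = []
        n = 1 << universe_bits
        while True:
            words = (n + W - 1) // W
            self.levels.append([0] * words)
            if words == 1:
                break
            n = words        # next level has one bit per word of this level

    def set(self, i):
        for level in self.levels:
            level[i // W] |= 1 << (i % W)
            i //= W

    def locate(self, i):
        """Smallest set position >= i, or None."""
        for depth, level in enumerate(self.levels):
            if i // W < len(level):
                word = level[i // W] >> (i % W)
                if word:
                    pos = i + lowest_bit(word)
                    # walk back down, taking the first set bit at each level
                    for lower in reversed(self.levels[:depth]):
                        pos = pos * W + lowest_bit(lower[pos])
                    return pos
            i = i // W + 1   # continue with the next word, one level up
        return None


h = BitHierarchy(16)
for p in (3, 200, 65535):
    h.set(p)
print(h.locate(4))  # prints 200
```

For a 16-bit universe this gives three levels of 1024, 16, and 1 words, so a lookup touches a handful of words and needs no hashing or pointer chasing.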

Improvement 2
Break the recursion when K = 8: 3 levels max.
[Figure: three-level structure. Level 1: root, hash table. Level 2: bits 15-8, hash tables. Level 3: bits 7-0, hash tables. Single elements.]
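
With at most three levels, a 32-bit key decomposes into three fixed index fields. The helper below is a sketch; `split_key` is a hypothetical name, and the root field is assumed to be the upper 16 bits, which matches the `y[16..31]` index used in the locate pseudocode.

```python
def split_key(y: int):
    """Split a 32-bit key into (root index, level 2 index, level 3 index)."""
    assert 0 <= y < 2**32
    i = y >> 16            # bits 31..16: index at the root
    j = (y >> 8) & 0xFF    # bits 15..8:  key at level 2
    k = y & 0xFF           # bits 7..0:   key at level 3
    return i, j, k


print(split_key(0x12345678))  # prints (4660, 86, 120)
```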

Improvement 3
Replace the root hash table with an array.
[Figure: the three-level structure with the Level 1 root hash table replaced by an array]

Range Query Support
Link the elements.
[Figure: three-level structure with root array and hash tables at Level 2 (bits 15-8) and Level 3 (bits 7-0)]
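
Linking the elements means a range query only has to find the first element once and then follow links, which gives the O(c + log K) bound. A minimal sketch, assuming a singly linked list in key order: `Element`, `build_linked`, `range_query`, and the linear `slow_locate` stand-in are hypothetical names, and in the real structure the tree's locate operation would supply the starting handle.

```python
class Element:
    """List node; `next` points to the element with the next larger key."""

    def __init__(self, key):
        self.key = key
        self.next = None


def build_linked(sorted_keys):
    """Link the given keys (already sorted) and return the first node."""
    head = prev = None
    for key in sorted_keys:
        node = Element(key)
        if prev is None:
            head = node
        else:
            prev.next = node
        prev = node
    return head


def slow_locate(head, low):
    """Stand-in for the tree's locate: first node with key >= low."""
    node = head
    while node is not None and node.key < low:
        node = node.next
    return node


def range_query(start, high):
    """Collect keys from the handle `start` while they are <= high."""
    out = []
    node = start
    while node is not None and node.key <= high:
        out.append(node.key)
        node = node.next
    return out


head = build_linked([3, 8, 15, 42, 99])
print(range_query(slow_locate(head, 8), 42))  # prints [8, 15, 42]
```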

Example: Locate Operation
// return handle of the smallest element of M that is not smaller than y
Function locate(y : N) : ElementHandle
  if y > max M then return                         // no larger element
  i := y[16..31]                                   // index into root table top
  if top[i] = null or y > max M_i then
    return min M_{top.locate(i)}                   // look in the next L2 table
  if M_i = {x} then return x                       // single element case
  j := y[8..15]                                    // key for L2 table at M_i
  if r_i[j] = null or y > max M_ij then
    return min M_{i, top_i.locate(j)}              // look in the next L3 table
  if M_ij = {x} then return x                      // single element case
  return r_ij[top_ij.locate(y[0..7])]              // L3 table access

At most 9 comparisons for any input size.
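
When testing an implementation of this operation, a comparison-based reference is handy. The sketch below assumes that locate returns the smallest stored key that is at least y, and uses `None` when no such key exists. `locate_reference` is a hypothetical helper built on Python's standard `bisect` module, not part of the data structure from the talk.

```python
import bisect


def locate_reference(sorted_keys, y):
    """Smallest key >= y in an ascending list, or None if there is none."""
    pos = bisect.bisect_left(sorted_keys, y)
    return sorted_keys[pos] if pos < len(sorted_keys) else None


keys = [5, 300, 70000, 4000000000]
print(locate_reference(keys, 301))  # prints 70000
```

This baseline performs about log2(n) comparisons per query, so it also serves as the comparison-based counterpart to the bounded comparison count stated above.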

Locate Performance

Construction

Deletion

Hard Inputs

Conclusions and Future Work
Integer search trees can outperform comparison-based search data structures.
Future work:
– Support multi-set functionality
– Other key lengths (up to 38 bits)
– Reduce space consumption
– Find real inputs
– Port it to the LEDA library