Cuckoo Hashing: Hardware Implementations
Adam Kirsch, Michael Mitzenmacher

2 Motivation
Hash tables are ubiquitous.
Highly useful in router hardware.
–Measurement and monitoring tasks.
Desiderata:
–Few (parallel) memory accesses.
–High space utilization.
–Low failure probability.
–Hardware-level simplicity.
What are good hash table designs for hardware?

3 State of the Art: Multiple Choice Hashing
Each element placed in least loaded of d locations.
(If 1 element/cell, look for 1 empty cell out of d.)
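The placement rule for the 1 element/cell case can be sketched in a few lines of Python. This is a minimal illustration, not the authors' hardware design: the salted BLAKE2 hash family in `choices` and the names `choices` and `insert_multiple_choice` are assumptions made for the example.

```python
import hashlib


def choices(key, d, m):
    """Return d candidate cells for key in a table of m cells.

    The salted BLAKE2 hash used here is an assumption for illustration;
    any d independent hash functions would do.
    """
    cells = []
    for i in range(d):
        digest = hashlib.blake2b(f"{i}:{key}".encode(), digest_size=8).digest()
        cells.append(int.from_bytes(digest, "big") % m)
    return cells


def insert_multiple_choice(table, key, d):
    """Place key in the first empty cell among its d choices, never moving anything.

    Returns True if the key was placed, False if all d cells were occupied.
    """
    for pos in choices(key, d, len(table)):
        if table[pos] is None:
            table[pos] = key
            return True
    return False  # collision that this scheme cannot resolve
```

Since no element ever moves, an insertion costs at most d probes (which hardware can issue in parallel), but it fails as soon as all d cells are taken.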

4 Cuckoo Hashing and Moves
Cuckoo hashing paradigm: give each element d choices, and move elements among choices as needed.

5 Original Cuckoo Hashing
2 subtables, left and right.
Each element gets one location per subtable.
Place new element in left subtable.
–If element already there, kick it out, move to right subtable.
–If element already there, kick it out, move to left subtable…
–Until everything placed.
Works with high probability as long as load is less than ½.
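The kick-out loop above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the hash helper `_h`, the `max_moves` cutoff, and the convention of returning the element left without a home are choices made for the example, not details from the slides.

```python
import hashlib


def _h(key, side, m):
    """Hash key into subtable `side` (0 = left, 1 = right) of size m.

    The salted BLAKE2 hash is an assumption for illustration.
    """
    digest = hashlib.blake2b(f"{side}:{key}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % m


def cuckoo_insert(left, right, key, max_moves=32):
    """Insert key, alternating between subtables and evicting occupants.

    Returns None on success. If max_moves placements are not enough,
    returns the element that was left without a cell.
    """
    tables = (left, right)
    side = 0  # always start in the left subtable
    for _ in range(max_moves):
        pos = _h(key, side, len(tables[side]))
        # Put the current element in its cell and pick up whatever was there.
        key, tables[side][pos] = tables[side][pos], key
        if key is None:
            return None  # the cell was empty, so everything is placed
        side = 1 - side  # the evicted element goes to the other subtable
    return key


def cuckoo_lookup(left, right, key):
    """A lookup probes exactly one cell in each subtable."""
    return (left[_h(key, 0, len(left))] == key
            or right[_h(key, 1, len(right))] == key)
```

Lookups always cost two probes; the unbounded part is the insertion loop, which is why the sketch needs an explicit cutoff.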

6 Better Cuckoo Hashing
More choices.
More elements per bucket.
Generally kick out a random item.
Such schemes are not fully analyzed.
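A minimal sketch of the "more choices, kick out a random item" variant, assuming one element per cell in a single table. The hash family in `candidates`, the `max_moves` cutoff, and the function names are assumptions for the example; buckets holding several elements are not shown.

```python
import hashlib
import random


def candidates(key, d, m):
    """Return d candidate cells for key (salted BLAKE2; an assumed hash family)."""
    cells = []
    for i in range(d):
        digest = hashlib.blake2b(f"{i}:{key}".encode(), digest_size=8).digest()
        cells.append(int.from_bytes(digest, "big") % m)
    return cells


def random_walk_insert(table, key, d=4, max_moves=100, rng=random):
    """Insert key with d choices, evicting a random occupant when all are full.

    Returns None on success, or the element left without a cell after
    max_moves evictions.
    """
    for _ in range(max_moves):
        cells = candidates(key, d, len(table))
        for pos in cells:
            if table[pos] is None:
                table[pos] = key
                return None
        # All d cells are occupied: evict a random occupant and re-insert it.
        pos = rng.choice(cells)
        key, table[pos] = table[pos], key
    return key
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when comparing load thresholds for different values of d.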

7 What’s Wrong with Cuckoo Hashing?
Lots of moves per insert in worst case.
–Average is constant.
–But maximum is Omega(log n) with non-trivial (inverse-poly) probability.
Router hardware settings: may need bounded number of memory accesses per insert.

9 The Power of One Move
Previous work (submitted): How much gain from allowing just one move?
Framework: allow small content-addressable memory (CAM) to handle unsolvable collisions [max 0.2%].
Multiple schemes analyzed.
With 4 choices, insertions only (no deletions), factor of 2 or larger improvement in space.
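One way to picture the framework is the sketch below. The slides do not specify which one-move schemes were analyzed, so this particular rule (move the first occupant that has an empty alternative) is a hypothetical variant, and the hash family, the Python list standing in for the CAM, and all names are assumptions.

```python
import hashlib


def candidates(key, d, m):
    """Return d candidate cells for key (salted BLAKE2; an assumed hash family)."""
    cells = []
    for i in range(d):
        digest = hashlib.blake2b(f"{i}:{key}".encode(), digest_size=8).digest()
        cells.append(int.from_bytes(digest, "big") % m)
    return cells


def insert_one_move(table, cam, key, d=4):
    """Insert key using at most one move; overflow goes to the CAM.

    Returns "placed", "moved", or "cam" to show which path was taken.
    """
    cells = candidates(key, d, len(table))
    # First try: an empty cell among the d choices, no move needed.
    for pos in cells:
        if table[pos] is None:
            table[pos] = key
            return "placed"
    # Second try: relocate one occupant to an empty cell of its own.
    for pos in cells:
        occupant = table[pos]
        for alt in candidates(occupant, d, len(table)):
            if table[alt] is None:
                table[alt] = occupant
                table[pos] = key
                return "moved"
    # Unsolvable with a single move: store the element in the CAM.
    cam.append(key)
    return "cam"


def lookup(table, cam, key, d=4):
    """Probe the d cells and the CAM (in hardware, these probes run in parallel)."""
    return any(table[pos] == key for pos in candidates(key, d, len(table))) or key in cam
```

The fraction of insertions that return "cam" is the quantity a CAM budget such as the 0.2% figure above would constrain.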

19 Analysis
Currently we do not know how to analyze such systems.
–For d > 2 choices, lots of open questions in cuckoo hashing analysis.
–Analyzing d = 2 may be possible, but very low space utilization. See [Kutzelnigg], asymptotic analysis of cuckoo hashing.
Need to understand distribution of move operations/element to analyze queue.

20 Conclusions and Open Questions
Moving elements leads to much better space utilization in hash tables, at a price.
Cuckoo hashing appears implementable, with per-insert move guarantees based on de-amortization via a CAM queue.
Analysis in an idealized model?
–Even analysis for basic cuckoo hashing open.
Performance on real traffic?
–Bursty insertions/deletions?
–Distribution of element lifetimes?
Proper sizing of CAM queue?
–How does overflow probability scale?

