Slide 1: Optimal Fast Hashing
Yossi Kanizo (Technion, Israel). Joint work with Isaac Keslassy (Technion, Israel) and David Hay (Politecnico di Torino, Italy).
Slide 2: Hash Tables for Networking Devices
- Hash tables and hash-based structures are often used in high-speed devices:
  - Heavy-hitter flow identification
  - Flow state keeping
  - Flow counter management
  - Virus signature scanning
  - IP address lookup algorithms
- For hash tables, the ideal is 1 memory access per element insertion: maximize throughput and minimize power.
Slide 3: Hash Tables for Networking Devices
- Collisions are unavoidable, and they waste memory accesses.
- For load <= 1, let a and d be the average and worst-case time (number of memory accesses) per element insertion.
- Assumptions: initially empty buckets; only insertions (no deletions).
- Objective: minimize a and d.
[Figure: elements hashed into a memory array of buckets 1-9, with a collision]
Slide 4: Why We Care
- On-chip memory: memory accesses cost power.
- Off-chip memory: memory accesses cost scarce on/off-chip pin capacity.
- Datacenters: memory accesses load the network and the servers.
- Parallelism does not help reduce these costs: d serial or d parallel memory accesses have the same cost.
Slide 5: Traditional Hash Table Schemes
- Example 1: linked lists (chaining).
[Figure: elements chained off memory buckets 1-9]
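For reference, a minimal chaining table could look like the sketch below (Python; the class name ChainedHashTable and the use of Python's built-in hash are my own illustration, not code from the talk).

```python
import random

class ChainedHashTable:
    """Chaining: each of the m buckets holds an unbounded Python list."""
    def __init__(self, m):
        self.m = m
        self.buckets = [[] for _ in range(m)]

    def insert(self, key):
        # One hash picks the bucket; collisions simply extend the chain,
        # so an insertion (or later lookup) may walk arbitrarily many nodes.
        self.buckets[hash(key) % self.m].append(key)

    def contains(self, key):
        return key in self.buckets[hash(key) % self.m]

if __name__ == "__main__":
    table = ChainedHashTable(m=9)
    for key in random.sample(range(1000), 30):
        table.insert(key)
    print("longest chain:", max(len(b) for b in table.buckets))
```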
Slide 6: Traditional Hash Table Schemes
- Example 1: linked lists (chaining).
- Example 2: linear probing (open addressing).
- Problem: in both schemes, the worst-case insertion time cannot be bounded by a constant d.
[Figure: linear probing in a memory array of buckets 1-9]
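A comparably small sketch of linear probing, which makes the unbounded worst case visible by returning the number of slots touched per insertion (again my own illustrative code, not from the slides):

```python
class LinearProbingTable:
    """Open addressing: scan forward from the hashed slot until a free one."""
    def __init__(self, m):
        self.m = m
        self.slots = [None] * m

    def insert(self, key):
        """Insert key and return the number of memory accesses used;
        as the table fills up, this number is not bounded by any constant d."""
        start = hash(key) % self.m
        for i in range(self.m):
            slot = (start + i) % self.m
            if self.slots[slot] is None:
                self.slots[slot] = key
                return i + 1
        raise RuntimeError("table is full")

if __name__ == "__main__":
    table = LinearProbingTable(m=9)
    probes = [table.insert(k) for k in (12, 3, 45, 6, 7, 8)]
    print("accesses per insertion:", probes)
```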
Slide 7: High-Speed Hardware
- Allow overflows: if the insertion time exceeds d, the element goes to an overflow list.
- The overflow list can be stored in an expensive CAM; otherwise, overflow elements are lost.
- Each bucket contains h elements, e.g. a 128-bit memory word holds h = 4 elements of 32 bits.
- Assumption: one access (read & write of a memory word) costs 1 cycle.
[Figure: memory of buckets of height h, plus a CAM for overflow elements]
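A software model of this hardware setting might look as follows; BucketedMemory, the list-based CAM, and the demo's key % m stand-in for a hash function are all illustrative assumptions on my part.

```python
class BucketedMemory:
    """m buckets (memory words), each holding up to h elements; overflows go to a CAM."""
    def __init__(self, m, h):
        self.m = m                            # number of memory words
        self.h = h                            # elements per word, e.g. 4 x 32-bit keys in 128 bits
        self.buckets = [[] for _ in range(m)]
        self.cam = []                         # expensive overflow storage

    def try_insert(self, bucket, key):
        """One memory access: read the word and append the key if there is room."""
        if len(self.buckets[bucket]) < self.h:
            self.buckets[bucket].append(key)
            return True
        return False

    def overflow_rate(self, n):
        return len(self.cam) / n

if __name__ == "__main__":
    mem = BucketedMemory(m=9, h=4)
    n = 40
    for key in range(n):
        if not mem.try_insert(key % mem.m, key):  # key % m stands in for a hash
            mem.cam.append(key)
    print("overflow rate:", mem.overflow_rate(n))
```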
Slide 8: Problem Formulation
- Given the average time a and the worst-case time d, minimize the overflow rate.
[Figure: memory of buckets of height h, plus a CAM for overflow elements]
Slide 9: Example: Power of d Random Choices
- d hash functions: pick the least-loaded bucket; break ties uniformly at random [Azar et al.] or to the left [Vöcking].
- Intuition: can reach a low overflow rate, but the average time a equals the worst-case time d, so memory accesses are wasted.
[Figure: each element probes d buckets of the memory before possibly overflowing to the CAM]
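As an illustration of this cost structure, a d-random-choices insertion (uniform tie-breaking, i.e. the [Azar et al.] variant rather than d-left) could be sketched like this; the function name and the demo's parameter values are mine, chosen to mirror the plots' h=4, d=4, load 0.95 setting, and drawing fresh random buckets stands in for evaluating d hash functions on distinct keys.

```python
import random

def insert_d_choices(buckets, cam, key, d, h):
    """Probe d uniformly chosen buckets and place the key in the least loaded one.
    Every insertion reads all d buckets, so the average time a equals d."""
    candidates = [random.randrange(len(buckets)) for _ in range(d)]
    best = min(candidates, key=lambda b: len(buckets[b]))
    if len(buckets[best]) < h:
        buckets[best].append(key)
    else:
        cam.append(key)                      # all d candidate buckets were full

if __name__ == "__main__":
    m, h, d = 5000, 4, 4
    n = int(0.95 * m * h)                    # load n/(m*h) = 0.95
    buckets, cam = [[] for _ in range(m)], []
    for key in range(n):
        insert_d_choices(buckets, cam, key, d, h)
    print("overflow rate:", len(cam) / n, " average time a:", d)
```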
Slide 10: Main Results
- Lower bound on the overflow of any scheme.
- Optimality of three schemes on successively larger ranges: SIMPLE, GREEDY, and MHT (optimal when subtable sizes fall geometrically).
Slide 11: Overflow Lower Bound
- Objective: given any online scheme with average time a and worst-case time d, find a lower bound on its overflow rate.
- No scheme can operate below this bound: it delimits the capacity region.
[Plot: overflow rate vs. average time a, with h=4, load n/(mh)=0.95, fixed d]
Slide 12: Overflow Lower Bound
- Problem: the number of hashes used by each element depends on the instantaneous memory state. How can we bound the overflow?
[Figure: elements hashed into partially full buckets, plus a CAM]
Slide 13: Overflow Lower Bound: Proof Intuition
- Assume the hashes are uniform. Then relax the constraints: go offline, drop the worst-case bound d, and uncolor the hashes.
- (n elements) x (a hashes per element) = an uncolored hashes.
- This yields a lower bound on the expected number of unhashed memory bins, and hence on the overflow.
[Figure: uncolored hashes thrown into the memory buckets]
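One way to read this relaxation numerically (my own Monte Carlo sketch, not the paper's closed form): throw a*n uncolored uniform hashes into the m buckets; a bucket hit k times can hold at most min(k, h) elements, so whatever exceeds the total of these caps must overflow.

```python
import random

def relaxed_overflow_bound(n, m, h, a, trials=20):
    """Monte Carlo estimate of the relaxed (offline, uncolored) overflow lower bound."""
    total = 0.0
    for _ in range(trials):
        hits = [0] * m
        for _ in range(int(a * n)):                 # a*n uncolored uniform hashes
            hits[random.randrange(m)] += 1
        stored_cap = sum(min(k, h) for k in hits)   # a bucket hit k times stores <= min(k, h)
        total += max(n - stored_cap, 0) / n
    return total / trials

if __name__ == "__main__":
    m, h = 1000, 4                           # smaller m than the plots, for speed
    n = int(0.95 * m * h)                    # load 0.95, as in the slides
    for a in (1.0, 1.5, 2.0):
        print("a =", a, " overflow lower bound ~", round(relaxed_overflow_bound(n, m, h, a), 4))
```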
Slide 14: Overflow Lower Bound
- Result: a closed-form lower-bound formula on the overflow rate, given n elements in m buckets of height h.
- The bound is also valid for non-uniform hashes.
- It defines a capacity region for high-throughput hashing.
Slide 15: Lower-Bound Example
- For a 3% overflow rate, the throughput can be at most 1/a = 2/3 of the memory rate.
[Plot: h=4, load n/(mh)=0.95]
Slide 16: Overflow Lower Bound
- Example: the d-left scheme has a low overflow rate, but a high average memory access rate a.
[Plot: h=4, load n/(mh)=0.95, m=5,000]
Slide 17: Main Results (recap)
- Lower bound on the overflow of any scheme.
- Optimality of three schemes on successively larger ranges: SIMPLE, GREEDY, and MHT (optimal when subtable sizes fall geometrically).
Slide 18: The SIMPLE Scheme
- SIMPLE scheme: a single hash function per element.
- The resulting structure looks like a truncated linked list.
- Intuition: the final state depends only on the hashes, not on the successive states, so we can uncolor the elements.
[Figure: each element hashed once into the memory, overflowing to the CAM when its bucket is full]
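A minimal sketch of SIMPLE under the hardware model above (one uniform hash per element, bucket height h, overflow to a CAM); using random.randrange as a stand-in for hashing distinct keys is my own shortcut.

```python
import random

def simple_insert(buckets, cam, key, h):
    """SIMPLE: one uniform hash per element, so a = d = 1.
    If the hashed bucket already holds h elements, the key overflows to the CAM."""
    b = random.randrange(len(buckets))       # stands in for one uniform hash of `key`
    if len(buckets[b]) < h:
        buckets[b].append(key)
    else:
        cam.append(key)

if __name__ == "__main__":
    m, h = 5000, 4
    n = int(0.95 * m * h)                    # load 0.95, as in the plots
    buckets, cam = [[] for _ in range(m)], []
    for key in range(n):
        simple_insert(buckets, cam, key, h)
    print("SIMPLE overflow rate:", len(cam) / n)
```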
Slide 19: The SIMPLE Scheme: Proof Intuition
- Once all elements have been hashed, the same reasoning as in the offline lower bound applies.
- Result: for a = 1, SIMPLE is optimal (i.e., it achieves the minimum overflow rate).
- The formal proof relies on a mean-field analysis (differential equations with a continuous-time fluid limit).
[Figure: SIMPLE memory state after all elements have been hashed]
Slide 20: Performance of the SIMPLE Scheme
- The lower bound can actually be achieved for a = 1.
[Plot: h=4, load=0.95, m=5,000]
Slide 21: The GREEDY Scheme
- Using uniform hashes, try to insert each element greedily, one bucket at a time, until it is either inserted or d memory accesses have been used; after that it overflows to the CAM.
[Figure: greedy insertion with d=2]
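A sketch of GREEDY insertion as described here (a fresh uniform hash per attempt, at most d memory accesses, then CAM); the parameter values and names are mine.

```python
import random

def greedy_insert(buckets, cam, key, h, d):
    """GREEDY: draw uniform hashes one at a time and stop at the first non-full
    bucket; after d failed accesses the key overflows to the CAM.
    Returns the number of memory accesses spent on this key."""
    for attempt in range(1, d + 1):
        b = random.randrange(len(buckets))   # stands in for this attempt's uniform hash
        if len(buckets[b]) < h:
            buckets[b].append(key)
            return attempt
    cam.append(key)
    return d

if __name__ == "__main__":
    m, h, d = 5000, 4, 4
    n = int(0.95 * m * h)
    buckets, cam = [[] for _ in range(m)], []
    accesses = [greedy_insert(buckets, cam, key, h, d) for key in range(n)]
    print("average time a:", round(sum(accesses) / n, 3), " overflow rate:", len(cam) / n)
```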
Slide 22: The GREEDY Scheme: Proof Intuition
- Uncoloring argument: the 2nd try of a collided element behaves like a new element with 1 hash, so GREEDY with x elements (i.e., x*a(x) hashes) behaves like SIMPLE with x*a(x) elements.
- Hence GREEDY is optimal for any x <= n elements.
- Optimality holds until no more elements can be added, the cut-off point a_co = a(n).
[Figure: collided elements re-hashed as new one-hash elements]
Slide 23: Performance of the GREEDY Scheme
- The GREEDY scheme is optimal up to the cut-off point a_co.
[Plot: d=4, h=4, load=0.95, m=5,000]
Slide 24: Performance of the GREEDY Scheme
- The overflow rate is worse than 4-left, but the throughput (1/a) is better.
[Plot: d=4, h=4, load=0.95, m=5,000]
Slide 25: The MHT Scheme
- MHT (Multi-Level Hash Table) [Broder & Karlin]: d successive subtables, each with its own hash function.
[Figure: memory split into a 1st, 2nd, and 3rd subtable of decreasing size, plus a CAM]
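A sketch of an MHT built from the same ingredients: d subtables probed in order, one hash (one access) per level, CAM on failure. The geometric size ratio of 1/2 used below is an assumption for illustration, not the optimal split derived in the talk.

```python
import random

def make_mht(m, d, ratio=0.5):
    """Split m buckets into d subtables whose sizes fall geometrically."""
    weights = [ratio ** i for i in range(d)]
    total = sum(weights)
    sizes = [max(1, round(m * w / total)) for w in weights]
    return [[[] for _ in range(size)] for size in sizes]

def mht_insert(subtables, cam, key, h):
    """MHT: try the subtables in order; the number of accesses is the level reached."""
    for level, table in enumerate(subtables, start=1):
        b = random.randrange(len(table))     # stands in for this level's hash of `key`
        if len(table[b]) < h:
            table[b].append(key)
            return level
    cam.append(key)
    return len(subtables)

if __name__ == "__main__":
    m, h, d = 5000, 4, 4
    n = int(0.95 * m * h)
    subtables, cam = make_mht(m, d), []
    accesses = [mht_insert(subtables, cam, key, h) for key in range(n)]
    print("average time a:", round(sum(accesses) / n, 3), " overflow rate:", len(cam) / n)
```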
Slide 26: Performance of the MHT Scheme
- MHT is optimal up to its cut-off point a_co(MHT).
- Proof that the optimal subtable sizes fall geometrically; confirmed in simulations.
- The overflow rate is close to 4-left, with much better throughput (1/a).
[Plot: d=4, h=4, load=0.95, m=5,000]
Slide 27: Conclusion
- Established the "capacity region" of high-speed hashing.
- Showed that three schemes are optimal on different ranges.
- MHT is optimal when subtable sizes fall geometrically, a long-known rule of thumb.
- The MHT cut-off point is larger than the GREEDY one.
Slide 28: Thank you.