1 Hashing Jeff Chastine

2 Hashing
Many applications require INSERT, SEARCH and DELETE operations
Hashing can do all of these in O(1) time on average
Based on keys
Falls under two general categories:
  Direct-address tables
  Hash tables

3 Direct-Addressing
Good when the universe U of keys is small
U = {0, 1, …, m – 1}, where m is not large
All elements have unique keys
Table T[0..m – 1], where each slot corresponds to a key
All operations take only O(1) time

4 Direct Implementation
[Figure: direct-address table T with "key" and "satellite data" fields. Each key in K (actual keys), a subset of U (universe of keys), selects its own slot in T.]

5 Direct-Addressing Operations
DIRECT-ADDRESS-SEARCH (T, k)
    return T[k]
DIRECT-ADDRESS-INSERT (T, x)
    T[key[x]] ← x
DIRECT-ADDRESS-DELETE (T, x)
    T[key[x]] ← NIL
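
The three operations above translate almost line for line into Python. This is a minimal sketch, assuming integer keys in 0..m – 1 and using None for NIL; the class name DirectAddressTable and the (key, data) tuple layout are illustrative choices, not part of the slides.

```python
class DirectAddressTable:
    """Direct-address table over the key universe {0, 1, ..., m - 1}."""

    def __init__(self, m):
        # One slot per possible key; None plays the role of NIL.
        self.slots = [None] * m

    def search(self, k):
        # DIRECT-ADDRESS-SEARCH: return T[k]
        return self.slots[k]

    def insert(self, key, data):
        # DIRECT-ADDRESS-INSERT: T[key[x]] <- x
        self.slots[key] = (key, data)

    def delete(self, key):
        # DIRECT-ADDRESS-DELETE: T[key[x]] <- NIL
        self.slots[key] = None


table = DirectAddressTable(10)
table.insert(3, "three")
table.insert(8, "eight")
print(table.search(8))
table.delete(3)
print(table.search(3))
```

Each method is a single list index, which is why every operation is O(1) in the worst case. The price is one slot per possible key, whether or not that key is ever stored.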

6 Hash Tables
What are potential problems with direct addressing?
  |U| may be impractically large
  The set of actual keys may be small relative to U
  Example: SSNs
Here, hash tables require much less storage
Only catch: O(1) is the average-case time instead of the worst-case time

7 How it works
With direct addressing, an element with key k goes into slot k
With hashing, it goes into slot h(k), where h is a hash function
Hash functions try to "randomize"
A hash function maps U to the slots of T[0..m – 1]
  h : U → {0, 1, …, m – 1}
Instead of |U| slots, only m slots are needed
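
To make the mapping concrete, here is a small sketch with nine-digit integer keys (a universe of 10^9 values) squeezed into a handful of slots. The table size M = 11, the sample keys and the choice h(k) = k mod M are assumptions made for illustration only.

```python
M = 11  # number of slots (hypothetical, kept tiny for illustration)


def h(k):
    # Toy hash function: maps any non-negative integer key into 0..M-1.
    return k % M


# Nine-digit keys come from a universe of 10**9 possible values,
# yet each one lands in one of only M slots.
keys = [123456789, 987654321, 555001234]
for k in keys:
    print(k, "->", h(k))
```

The table needs M slots rather than 10^9, at the cost that distinct keys may share a slot.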

8 Hash Implementation
[Figure: hash function h maps the actual keys k1 to k5 in K, a subset of U (universe of keys), to slots of table T[0..m – 1]. Keys k2 and k5 collide: h(k2) = h(k5).]

9 Collisions
A collision occurs when two keys hash to the same slot
Because |U| > m, the pigeonhole principle applies
  Therefore, collisions must exist
We often talk of the load factor α = n/m (n stored elements, m slots)
Pick a good hash function
  Near random, yet deterministic
Can chain collisions together
  This is where the worst case comes from
Can use open addressing
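
A quick way to see the pigeonhole argument and the load factor at work is to hash more keys than there are slots. In this sketch the values m = 8, the ten sample keys and h(k) = k mod m are assumptions chosen for illustration; with n > m, at least one slot must receive two keys.

```python
from collections import Counter

m = 8                          # number of slots
keys = list(range(0, 90, 9))   # n = 10 keys, more keys than slots


def h(k):
    # Simple stand-in hash function for the demonstration.
    return k % m


# Count how many keys land in each slot.
counts = Counter(h(k) for k in keys)
load_factor = len(keys) / m
collided = {slot: c for slot, c in counts.items() if c > 1}

print("load factor:", load_factor)
print("slots holding more than one key:", collided)
```

The slide's argument is the same one applied to the whole universe: since |U| > m, some pair of possible keys must share a slot no matter which h is chosen.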

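10 Chaining
[Figure: collision resolution by chaining. Each slot of table T holds a linked list of the keys from K (actual keys), a subset of U (universe of keys), that hash to that slot.]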
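
Chaining can be sketched in a few lines of Python. This is one possible implementation, assuming Python lists stand in for the linked lists and the built-in hash() stands in for h; the class name ChainedHashTable and its method names are illustrative.

```python
class ChainedHashTable:
    def __init__(self, m=8):
        self.m = m
        self.slots = [[] for _ in range(m)]  # one chain per slot
        self.n = 0                           # number of stored keys

    def _h(self, key):
        # Assumption: Python's built-in hash() stands in for h.
        return hash(key) % self.m

    def insert(self, key, value):
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:                 # key already present: overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))
        self.n += 1

    def search(self, key):
        # Walk the chain for this slot; its length bounds the cost.
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None

    def delete(self, key):
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                self.n -= 1
                return True
        return False

    def load_factor(self):
        return self.n / self.m


t = ChainedHashTable(m=4)
for k in (1, 5, 9, 2):   # 1, 5 and 9 all hash to slot 1 when m = 4
    t.insert(k, str(k))
print(t.slots)
print(t.search(9), t.load_factor())
```

If every key lands in the same chain, a search walks all n elements, which is the O(n) worst case; with a well-spread hash function the expected chain length is the load factor α.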

11 Hash Functions
What makes a good hash function?
  Each key is equally likely to hash to any of the m slots
  If keys are random real numbers in [0, 1), take h(k) = floor(k m)
Convert strings to ASCII to hash?
Most usually involve mod
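
Both ideas on this slide are easy to try out. The sketch below assumes keys in [0, 1) for the first function and 7-bit ASCII strings for the second; reading a string as a base-128 number is one common way to turn it into an integer, not necessarily the one intended on the slide, and the function names are hypothetical.

```python
import math


def hash_unit_interval(k, m):
    # Keys assumed uniformly distributed in [0, 1): h(k) = floor(k * m).
    return math.floor(k * m)


def string_to_int(s, radix=128):
    # Interpret the string as a number written in base `radix`,
    # one ASCII code per digit (assumes 7-bit ASCII input).
    value = 0
    for ch in s:
        value = value * radix + ord(ch)
    return value


def hash_string(s, m):
    # Convert to an integer first, then reduce with mod.
    return string_to_int(s) % m


print(hash_unit_interval(0.5, 10))
print(string_to_int("pt"))
print(hash_string("hashing", 13))
```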

12 Hash Functions
Division method:
  h(k) = k mod m
Multiplication method:
  Let 0 < A < 1
  h(k) = floor(m (k A mod 1))   // k A mod 1 is the fractional part of k A
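
Both methods fit in a few lines of Python. This is a sketch for non-negative integer keys; the default A = (√5 – 1)/2 ≈ 0.618 is a commonly suggested constant and is an assumption here, since the slide only requires 0 < A < 1.

```python
import math


def hash_division(k, m):
    # Division method: h(k) = k mod m.
    return k % m


def hash_multiplication(k, m, A=(math.sqrt(5) - 1) / 2):
    # Multiplication method: h(k) = floor(m * (k*A mod 1)).
    frac = (k * A) % 1.0          # fractional part of k*A
    return math.floor(m * frac)


print(hash_division(100, 12))
print(hash_multiplication(123456, 16384))
```

With the division method the choice of m matters a great deal, whereas the multiplication method works with any m, including powers of two. The version above uses floating point, so it is only suitable for keys small enough that k·A keeps its fractional digits.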

13 Open Addressing
Systematically examine, or probe, slots until the item is found
No lists and no elements stored outside the table; thus α <= 1
Instead of following pointers, we compute the sequence of slots to probe
The probe order is not fixed; it depends on the key
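
Insertion and search under open addressing can be sketched as follows. The probe sequence here is assumed to be linear probing, with Python's built-in hash() standing in for the underlying hash function; the class name OpenAddressTable is illustrative, and deletion is deliberately left out.

```python
class OpenAddressTable:
    def __init__(self, m=8):
        self.m = m
        self.slots = [None] * m   # every element lives in the table itself

    def _probe(self, key, i):
        # Assumption: linear probing, h(k, i) = (h'(k) + i) mod m,
        # with Python's built-in hash() standing in for h'.
        return (hash(key) + i) % self.m

    def insert(self, key):
        # Probe until an empty slot (or the key itself) is found.
        for i in range(self.m):
            j = self._probe(key, i)
            if self.slots[j] is None or self.slots[j] == key:
                self.slots[j] = key
                return j
        raise OverflowError("hash table is full")

    def search(self, key):
        for i in range(self.m):
            j = self._probe(key, i)
            if self.slots[j] is None:
                return None       # empty slot reached: key is absent
            if self.slots[j] == key:
                return j
        return None               # probed every slot without finding the key


t = OpenAddressTable(m=8)
for k in (3, 11, 19):             # all three start probing at slot 3
    t.insert(k)
print(t.slots)
print(t.search(19), t.search(27))
```

Deletion is omitted because simply resetting a slot to None would cut the probe sequence of keys stored beyond it; a usual remedy is a special "deleted" marker.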

14 Kinds of Open Addressing
Linear Probing
  h(k, i) = (h'(k) + i) mod m
Quadratic Probing
  h(k, i) = (h'(k) + c1·i + c2·i²) mod m
Double Hashing
  h(k, i) = (h1(k) + i·h2(k)) mod m
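
The three formulas can be compared by printing the first few probes for the same key. In this sketch h'(k) = h1(k) = k mod m, the constants c1 = c2 = 1, and h2(k) = 1 + (k mod (m – 1)) are all assumptions picked for illustration; the slides do not fix any of them.

```python
def linear_probe(k, i, m):
    # h(k, i) = (h'(k) + i) mod m, with h'(k) = k mod m
    return (k % m + i) % m


def quadratic_probe(k, i, m, c1=1, c2=1):
    # h(k, i) = (h'(k) + c1*i + c2*i^2) mod m
    return (k % m + c1 * i + c2 * i * i) % m


def double_hash_probe(k, i, m):
    # h(k, i) = (h1(k) + i*h2(k)) mod m
    h1 = k % m
    h2 = 1 + (k % (m - 1))   # never 0, so the sequence always advances
    return (h1 + i * h2) % m


m = 13   # a prime table size
k = 79
for name, probe in (("linear", linear_probe),
                    ("quadratic", quadratic_probe),
                    ("double", double_hash_probe)):
    print(name, [probe(k, i, m) for i in range(5)])
```

With linear and quadratic probing, two keys with the same h'(k) follow the same probe sequence. Double hashing lets the step size depend on the key as well, and when m is prime and 0 < h2(k) < m the sequence visits every slot.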
