CSE 326
Hashing
Richard Anderson (instead of Martin Tompa)

Chaining review

H(k) = k mod 17

  key   k    H(k)
  A
  B
  C
  D
  E
  F
  G
  H
  I
  J
  K

Table: slots 0 to 16, all initially empty.

Notes: Collect twelve birthdays from students and hash into the table with chaining.

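
A minimal Python sketch of chaining with H(k) = k mod 17, as on the slide above. The function names and the sample days of the month are illustrative assumptions, not data from the lecture.

```python
M = 17  # table size from the slide

def h(k):
    """H(k) = k mod 17."""
    return k % M

def make_table():
    # One chain (a Python list) per slot.
    return [[] for _ in range(M)]

def insert(table, k):
    chain = table[h(k)]
    if k not in chain:      # keep each key at most once
        chain.append(k)

def lookup(table, k):
    # Only the chain in slot h(k) has to be searched.
    return k in table[h(k)]

table = make_table()
for day in [3, 20, 14, 31, 5, 22]:  # hypothetical days of the month
    insert(table, day)

# 3 and 20 share slot 3, 14 and 31 share slot 14, 5 and 22 share slot 5.
```

Colliding keys simply extend the chain in their slot, so no key is ever moved to a different slot.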
Open address hashing

Store all elements in the table itself.
If a cell is occupied, try another cell.
Linear probing: try cells H(k), (H(k) + 1) mod m, (H(k) + 2) mod m, . . .

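
The probe order for linear probing can be sketched in Python as follows. The name `linear_probe_sequence` and the example home cell are assumptions made for illustration.

```python
from itertools import islice

def linear_probe_sequence(home, m):
    """Yield the m cells tried by linear probing, starting at home = H(k)."""
    for i in range(m):
        yield (home + i) % m

# Example: a key whose home cell is 15 in a table of size m = 17.
first_five = list(islice(linear_probe_sequence(15, 17), 5))
# first_five == [15, 16, 0, 1, 2]: the sequence wraps around at the end.
```

Because the step is always 1, the sequence visits every cell exactly once before returning to the home cell.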
Open Address Hashing

H(k) = k mod 17

  key   k    H(k)
  A     53   2
  B     41   7
  C     91   6
  D     75
  E     13
  F
  G     43
  H     67   16
  I     88   3
  J     36
  K     40

  slot  key
  0
  1
  2     A
  3     I
  4     J
  5
  6     C
  7     B
  8     D
  9     F
  10    K
  11
  12
  13    E
  14
  15
  16    H

Notes: Collect twelve birthdays from students and hash into the table with chaining. Step through example. Emphasize the growing regions.

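
A short Python sketch of insertion with linear probing, run on the keys A, B, C, D, H, I and J with their values as listed on the slide. The function names are assumptions, and the other keys are left out of this sketch.

```python
M = 17  # table size from the slide

def h(k):
    return k % M

def insert(table, k):
    """Place k in the first empty cell at or after h(k), wrapping around."""
    p = h(k)
    for _ in range(M):
        if table[p] is None:
            table[p] = k
            return p
        p = (p + 1) % M
    raise RuntimeError("table is full")

table = [None] * M
keys = {"A": 53, "B": 41, "C": 91, "D": 75, "H": 67, "I": 88, "J": 36}
slots = {name: insert(table, k) for name, k in keys.items()}

# D (75) hashes to 7, which B already holds, so it lands in 8.
# J (36) hashes to 2, finds A in 2 and I in 3, and lands in 4.
```

For these seven keys the resulting slots agree with the table on the slide: A in 2, B in 7, C in 6, D in 8, H in 16, I in 3, J in 4.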
Open address hashing

Lookup(K) {
    p = H(K);
    loop {
        if (A[p] is empty) return false;
        if (A[p] == K) return true;
        p = (p + 1) mod m;
    }
}

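
The Lookup pseudocode translates to Python roughly as below. This is a sketch under assumptions: empty cells are represented by `None`, and the loop is bounded to m probes so that a completely full table cannot loop forever, a guard the slide's version does not have.

```python
M = 17

def h(k):
    return k % M

def lookup(table, k):
    p = h(k)
    for _ in range(M):          # at most M probes
        if table[p] is None:    # empty cell: k cannot be further along
            return False
        if table[p] == k:
            return True
        p = (p + 1) % M
    return False                # every cell was occupied and none held k

table = [None] * M
table[7] = 41   # 41 mod 17 = 7
table[8] = 75   # 75 mod 17 = 7, displaced to the next cell by 41

found = lookup(table, 75)     # probes 7, then 8
missing = lookup(table, 58)   # 58 mod 17 = 7; probes 7, 8, then empty cell 9
```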
Open address hashing issues

  Clumping
  Cost per operation
  Deletion

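
To make the deletion issue concrete, the Python sketch below shows how simply emptying a cell can hide a key that was probed past it. It then shows one common workaround, a "deleted" marker (tombstone). The marker approach and all names here are assumptions for illustration; the slides do not say which fix the lecture used.

```python
M = 17
EMPTY = None
DELETED = object()  # tombstone marker (an assumption, not from the slides)

def h(k):
    return k % M

def insert(table, k):
    # Sketch only: assumes k is not already in the table.
    p = h(k)
    for _ in range(M):
        if table[p] is EMPTY or table[p] is DELETED:
            table[p] = k
            return p
        p = (p + 1) % M
    raise RuntimeError("table is full")

def lookup(table, k):
    p = h(k)
    for _ in range(M):
        if table[p] is EMPTY:
            return False
        if table[p] is not DELETED and table[p] == k:
            return True
        p = (p + 1) % M
    return False

def delete(table, k, marker):
    """Find k and overwrite its cell with `marker` (EMPTY or DELETED)."""
    p = h(k)
    for _ in range(M):
        if table[p] is EMPTY:
            return False
        if table[p] is not DELETED and table[p] == k:
            table[p] = marker
            return True
        p = (p + 1) % M
    return False

# 41 and 75 both hash to 7, so 75 sits in cell 8.
naive = [EMPTY] * M
insert(naive, 41)
insert(naive, 75)
delete(naive, 41, EMPTY)          # naive deletion: just empty the cell
lost = lookup(naive, 75)          # False: the probe stops at empty cell 7

lazy = [EMPTY] * M
insert(lazy, 41)
insert(lazy, 75)
delete(lazy, 41, DELETED)         # leave a tombstone instead
still_found = lookup(lazy, 75)    # True: the probe continues past cell 7
```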
Double hashing

Use separate hash functions for the first probe and for collision resolution.
Probe sequence: H1(k), (H1(k) + H2(k)) mod m, (H1(k) + 2 H2(k)) mod m, (H1(k) + 3 H2(k)) mod m, . . .

Notes: Return to earlier slide to update the access code.

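
One way the access code might look after the update is sketched below in Python. The only change from linear probing is the step size, which becomes H2(k) instead of 1. The particular choices h1(k) = k mod m and h2(k) = 1 + (k mod (m - 1)) are assumptions made for this sketch; the slide leaves both functions unspecified.

```python
M = 17  # prime, so any step size in 1..16 visits every cell

def h1(k):
    return k % M

def h2(k):
    # Assumed second hash function; it never returns 0.
    return 1 + (k % (M - 1))

def insert(table, k):
    p, step = h1(k), h2(k)
    for _ in range(M):
        if table[p] is None:
            table[p] = k
            return p
        p = (p + step) % M
    raise RuntimeError("table is full")

def lookup(table, k):
    p, step = h1(k), h2(k)
    for _ in range(M):
        if table[p] is None:
            return False
        if table[p] == k:
            return True
        p = (p + step) % M
    return False

table = [None] * M
insert(table, 41)   # h1 = 7, cell 7 is free
insert(table, 75)   # h1 = 7 is taken; h2 = 12, so the next probe is (7 + 12) mod 17 = 2
```

Keys that share a first probe usually differ in step size, so they scatter instead of piling up in adjacent cells.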
Double hashing example

H1(k) = day mod 17
H2(k) = month

  key   month   day   H1(k)
  A
  B
  C
  D
  E
  F
  G
  H
  I
  J
  K

Table: slots 0 to 16, all initially empty.

Notes: Collect twelve birthdays (month and day) from students and hash into the table with chaining. WRITE MONTHS AS NUMBERS.

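
The exercise can be sketched in Python with H1(k) = day mod 17 and H2(k) = month, as on the slide. The four birthdays below are made up for illustration and are not classroom data.

```python
M = 17

def h1(birthday):
    month, day = birthday
    return day % M

def h2(birthday):
    month, day = birthday
    return month  # months written as numbers 1..12

def insert(table, birthday):
    p, step = h1(birthday), h2(birthday)
    for _ in range(M):
        if table[p] is None:
            table[p] = birthday
            return p
        p = (p + step) % M
    raise RuntimeError("table is full")

table = [None] * M
birthdays = [(3, 20), (7, 3), (12, 20), (5, 27)]  # hypothetical (month, day) pairs
slots = [insert(table, b) for b in birthdays]

# (3, 20):  20 mod 17 = 3, free.
# (7, 3):   3 is taken, step 7 gives 10.
# (12, 20): 3 is taken, step 12 gives 15.
# (5, 27):  27 mod 17 = 10 is taken, step 5 gives 15, then 3, then 8.
```

Since 17 is prime and every month number lies between 1 and 12, each step size is coprime to the table size, so a probe sequence reaches every slot before repeating.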
Double hashing vs. single hashing

Load factor α, cost per operation
[Plot: cost per operation against load factor α, one curve for single hashing and one for double hashing]

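
The comparison can be tabulated with the standard textbook approximations for the expected number of probes. This sketch assumes that "single hashing" means linear probing and that the idealized analyses apply; the formulas are not taken from the slide itself.

```python
import math

def linear_probing_cost(alpha):
    """Approximate expected probes (successful, unsuccessful) for linear probing."""
    successful = 0.5 * (1 + 1 / (1 - alpha))
    unsuccessful = 0.5 * (1 + 1 / (1 - alpha) ** 2)
    return successful, unsuccessful

def double_hashing_cost(alpha):
    """Approximate expected probes (successful, unsuccessful) for double hashing."""
    successful = (1 / alpha) * math.log(1 / (1 - alpha))
    unsuccessful = 1 / (1 - alpha)
    return successful, unsuccessful

for alpha in (0.25, 0.5, 0.75, 0.9):
    lin = linear_probing_cost(alpha)
    dbl = double_hashing_cost(alpha)
    print(f"alpha={alpha:.2f}  linear: {lin[0]:.2f} / {lin[1]:.2f}"
          f"  double: {dbl[0]:.2f} / {dbl[1]:.2f}")
```

Both costs grow without bound as α approaches 1, but the linear probing curve rises much faster, which is the clumping effect showing up in the numbers.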
Trade offs between chaining and open addressing

                       Chaining    Open Addressing
  Space
  Time
  Deletions
  Coding complexity
  High load factor

Hash Functions

  Function
  Efficient
  Uniform mapping to range
  Avoids systematic collisions

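
A small Python illustration of a systematic collision, using made-up keys. When every key is a multiple of 16, hashing with k mod 16 sends all of them to slot 0, while k mod 17 spreads them out.

```python
keys = list(range(0, 160, 16))  # hypothetical keys: 0, 16, 32, ..., 144

slots_mod_16 = {k % 16 for k in keys}
slots_mod_17 = {k % 17 for k in keys}

distinct_16 = len(slots_mod_16)  # every key collides in slot 0
distinct_17 = len(slots_mod_17)  # each key gets its own slot
```

The pattern in the keys lines up with the table size 16 but not with 17, which is one reason prime table sizes are a common choice.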
Hashing strings

  String
  Suppose
  Fact:
  So: