Download presentation
Presentation is loading. Please wait.
Published byBenjamin Freeman Modified over 6 years ago
1
Dictionaries Dictionaries 07/27/16 16:46 07/27/16 16:46 Hash Tables 1 2 3 4 © 2010 Goodrich, Tamassia Hash Tables 1 1
2
Maps Model searchable collections of key-value entries.
Dictionaries 07/27/16 16:46 Maps Model searchable collections of key-value entries. Operations are find, put, erase, size, and empty. Two entries cannot have the same key. Applications: address book student-record database © 2010 Goodrich, Tamassia Maps 2
3
Example Operation Output Map empty() true Ø put(5,A) [(5,A)] (5,A)
Dictionaries 07/27/16 16:46 Example Operation Output Map empty() true Ø put(5,A) [(5,A)] (5,A) put(7,B) [(7,B)] (5,A),(7,B) put(2,C) [(2,C)] (5,A),(7,B),(2,C) put(8,D) [(8,D)] (5,A),(7,B),(2,C),(8,D) put(2,E) [(2,E)] (5,A),(7,B),(2,E),(8,D) find(7) [(7,B)] (5,A),(7,B),(2,E),(8,D) find(4) end (5,A),(7,B),(2,E),(8,D) find(2) [(2,E)] (5,A),(7,B),(2,E),(8,D) size() 4 (5,A),(7,B),(2,E),(8,D) erase(5) — (7,B),(2,E),(8,D) erase(2) — (7,B),(8,D) find(2) end (7,B),(8,D) empty() false (7,B),(8,D) © 2010 Goodrich, Tamassia Maps 3
4
Implementations List: O(n) space, O(n) time per operation.
Dictionaries 07/27/16 16:46 Implementations List: O(n) space, O(n) time per operation. Balanced tree: O(n) space, O(log n) time per operation. Hash table: O(n) space, O(1) time per operation on average. © 2010 Goodrich, Tamassia Maps 4
5
Hash Functions and Hash Tables
Dictionaries 07/27/16 16:46 Hash Functions and Hash Tables A hash function h maps keys of a given type to integers in an interval [0, N1]. Example: h(x) x mod N is a hash function for integer keys. The integer h(x) is called the hash value of key x. A hash table for a given key type consists of a hash function h, and an array (table) of size N. The goal is to store item (k, o) at index i h(k). © 2010 Goodrich, Tamassia Hash Tables 5
6
Example Entries: (SSN, Name) with SSN a nine-digit positive integer. 1
Dictionaries 07/27/16 16:46 Example Entries: (SSN, Name) with SSN a nine-digit positive integer. Hash table size: N10,000. Hash function: h(x)last four digits of x. 1 2 3 4 9997 9998 9999 … © 2010 Goodrich, Tamassia Hash Tables 6
7
Dictionaries 07/27/16 16:46 Hash Functions A hash function is usually specified as the composition of two functions. Hash code: h1: keys integers Compression function: h2: integers [0, N1] The hash code is applied first, and the compression function is applied next on the result: h(x) = h2(h1(x)). The hash function should disperse the keys uniformly. © 2010 Goodrich, Tamassia Hash Tables 7
8
Hash Codes Memory address: Integer cast: Component sum:
Dictionaries 07/27/16 16:46 Hash Codes Memory address: Interpret the memory address of the key as an integer. Suitable for one-word keys. Integer cast: Interpret the bits of the key as an integer. Suitable for keys of bit length at most the number of bits of the integer type. Component sum: Partition the bits of the key into components of fixed length (32 bits) and sum the components (ignoring overflow). Suitable for keys of bit length greater than the number of bits of the integer type. © 2010 Goodrich, Tamassia Hash Tables 8
9
Compression Functions
Dictionaries 07/27/16 16:46 Compression Functions Division: h2 (y) y mod N N prime for number theoretical reasons. y values less than N are not spread out. Multiply, Add and Divide (MAD): h2 (y) (ay b) mod N Helps spread out the keys. © 2010 Goodrich, Tamassia Hash Tables 9
10
Dictionaries 07/27/16 16:46 Collision Handling Collisions occur when multiple elements map to the same cell. Separate Chaining: each cell points to a list of entries. Requires extra, non- contiguous memory. Open addressing: colliding item is placed in another cell. 1 2 3 4 © 2010 Goodrich, Tamassia Hash Tables 10
11
Linear Probing Example:
Dictionaries 07/27/16 16:46 Linear Probing Place the colliding item in the next (circularly) available table cell. Each cell inspected is referred to as a probe. Colliding items lump together, creating longer sequence of probes in later collisions. Example: h(x) x mod 13 Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order. 1 2 3 4 5 6 7 8 9 10 11 12 41 18 44 59 32 22 31 73 1 2 3 4 5 6 7 8 9 10 11 12 © 2010 Goodrich, Tamassia Hash Tables 11
12
Search with Linear Probing
Dictionaries 07/27/16 16:46 Search with Linear Probing Consider a hash table A with linear probing. find(k) Start at cell h(k). Probe consecutive locations until one of the following occurs. An item with key k is found. An empty cell is found. N cells are probed. Algorithm find(k) i h(k) p 0 repeat c A[i] if c return null else if c.key () k return c.value() else i (i 1) mod N p p 1 until p N return null © 2010 Goodrich, Tamassia Hash Tables 12
13
Dictionaries 07/27/16 16:46 Double Hashing Double hashing uses a secondary hash function d(k) and places a collision i in the first available cell of the series (i jd(k)) mod N for j 0, 1, … , N 1. The secondary hash function d(k) cannot have zero values. The table size N must be a prime to allow probing of all the cells. Common choice of compression function for the secondary hash function: d2(k) q k mod q where q N q is a prime The possible values for d2(k) are 1, 2, … , q © 2010 Goodrich, Tamassia Hash Tables 13
14
Example of Double Hashing
Dictionaries 07/27/16 16:46 Example of Double Hashing Consider a hash table with integer keys and double hashing. N13 h(k) k mod 13 d(k) 7 k mod 7 Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order. 1 2 3 4 5 6 7 8 9 10 11 12 31 41 18 32 59 73 22 44 1 2 3 4 5 6 7 8 9 10 11 12 © 2010 Goodrich, Tamassia Hash Tables 14
15
Performance of Hashing
Dictionaries 07/27/16 16:46 Performance of Hashing In the worst case, find, put, and erase take O(n) time on a table of size N with n entries. The worst case occurs when all the keys collide. The load factor nN governs the performance. If the hash values are random, the expected number of probes per operation with open addressing is 1 (1 ). The expected running time is O(1) hashing is fast if the load factor is not close to 1. Applications of hash tables: small databases compilers browser caches © 2010 Goodrich, Tamassia Hash Tables 15
16
Dynamic Hashtables What if the number of elements n is unknown?
Dictionaries 07/27/16 16:46 Dynamic Hashtables What if the number of elements n is unknown? Pick a reasonable initial table size. When the load reaches 75%, double the table. Total doubling time is O(n log n). © 2010 Goodrich, Tamassia Hash Tables 16
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.