Download presentation
Presentation is loading. Please wait.
Published byGabriella Gregory Modified over 6 years ago
1
Hash Tables banana 12 15 1 2 10 apple banana apple cantaloupe kiwi
12 15 1 2 ... 10 apple banana apple cantaloupe Hash Function kiwi kiwi mango ... Hash tables are used when speedy insertion, deletion, and lookup is the priority. In fact, for an ideally tuned hash table, insertion, deletion, and lookup can be accomplished in constant time. A hash table is an array associated with a function (the hash function). mango pear ... cantaloupe pear
2
Hash Function apple 1 2 cantaloupe banana 10 kiwi 12 mango 15 pear
apple 1 2 cantaloupe Hash Function ... banana 10 kiwi ... This function maps keys to array indices. For example, in this slide we see that the hash function has mapped the key 'banana' to index 1. But what's going on under the hood? The hash function takes a key as input and computes an array index from the intrinsic properties of that key. You’d initially use the hash function to determine where in the hash table to store a given key. Later, you’d use the same hash function to determine where in the hash table to search for a given key. For this reason, it’s crucial that a hash function behaves consistently and outputs the same index for identical inputs. 12 mango ... 15 pear
3
Hash Function Example int hash_function(char* key) {
// hash on first letter of string int hash = toupper(key[0]) - 'A'; return hash % SIZE; } Keep in mind that hash tables can be used to store data of all types, but for now, let’s consider a very simple hash function for strings. This hash function uses the first letter of a string to determine a hash table index for that string, so words that start with the letter 'a' are assigned to index 0, 'b' to index 1, and so on. Note that this hash function returns hash % SIZE, where SIZE is the size of the hash table. Modding by the size of the hash table is a good way to avoid indexing into a hash table slot that does not exist.
4
Collisions 12 15 1 2 10 apple banana cantaloupe berry kiwi mango pear
12 15 1 2 ... 10 apple banana cantaloupe Hash Function berry kiwi ... Time to throw a wrench into things. What if we want to store the word berry into the table as well? Berry hashes to index 1, just as banana did. This is an example of a collision, the result of two keys hashing to the same index. Even if your hash table is larger than your dataset and you’ve chosen a good hash function, you need a plan for dealing with collisions if and when they arise. Two common methods of dealing with collisions are linear probing and separate chaining. mango ... pear
5
Linear Probing apple 1 banana 2 cantaloupe 3 berry berry 10 kiwi mango
apple 1 banana 2 cantaloupe Hash Function 3 berry berry ... 10 kiwi With linear probing, if a key hashes to the same index as a previously stored key, it is assigned the next available slot in the table. So banana is stored at index 1, and berry is stored at index 3. As you can surmise even from this simple example, once a collision occurs, you’ve significantly increased chances that another collision will occur in the same area. This is called clustering, and it’s a serious drawback to linear probing. Worst case insertion, deletion, and lookup times have devolved to O(n), as the next available slot could potentially have been the last slot in the table. ... mango 12 ... 15 pear
6
Separate Chaining avocado apple berry banana 1 cantaloupe 2 ...
berry banana 1 cantaloupe 2 In the separate chaining model, the hash table is actually an array of pointers to linked lists. When a collision occurs, the key can be inserted in constant time at the head of the appropriate linked list. What happens now when we search for banana in the hash table? We must traverse the entire linked list at index 1. The worst case lookup time for a hash table that uses separate chaining (assuming a uniform hash function) is therefore O(n/m), where m is the size of the hash table. Since m is a constant, O(n/m) is theoretically equivalent to O(n). In the real world, however, O(n/m) is a big improvement over O(n)! A good hash function will maximize this real world improvement. ...
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.