Published by Thomas Leonard. Modified over 9 years ago.
1
Hashing Chapter 7 Section 3
2
What is hashing?
Hashing is using a 1-D array to implement a dictionary
  o This implementation is called a "hash table"
Items are stored by the key and have related values
A real dictionary example
  o The defined words are the keys
  o The definitions are the values
A more pertinent example
  o Your Ole Miss ID number would be your key
  o Your name, major, address, etc. would be your values
3
Hash Table info
Items are placed in a specific location
  o This location is determined by the hash function
The hash function can be any function, as long as it distributes the keys evenly across the slots (buckets)
The function should satisfy two principles:
  o Even distribution (a prime-sized array helps)
  o Easy computation
We can use "mod", for example: h(K) = K mod m, where m is the array size
  o A prime array size helps "mod" spread the keys evenly
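A minimal sketch of the "mod" hash function, assuming integer keys and an array of 11 slots (the size used in the problems from 7.3). The name `hash_key` and the multiples-of-ten keys are hypothetical, chosen only to show why a prime array size helps.

```python
TABLE_SIZE = 11  # assumed prime array size, matching h(K) = K mod 11


def hash_key(key: int, table_size: int = TABLE_SIZE) -> int:
    """Map an integer key to a slot index in the range 0..table_size-1."""
    return key % table_size


# Keys that share a pattern (here, multiples of 10) pile up in one slot
# when the array size shares a factor with them...
multiples_of_ten = list(range(10, 101, 10))
slots_size_10 = {hash_key(k, 10) for k in multiples_of_ten}

# ...but spread out when the array size is prime.
slots_size_11 = {hash_key(k, 11) for k in multiples_of_ten}

print(len(slots_size_10), "distinct slot(s) with size 10")
print(len(slots_size_11), "distinct slot(s) with size 11")
```

With an array of 10 slots every one of these keys lands in slot 0; with 11 slots each key gets its own slot.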
4
Collisions are imminent...
What happens if two keys get mapped to the same slot?
  o A collision occurs
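To make a collision concrete, the sketch below groups keys by the slot they hash to and reports every slot that received more than one key. It assumes h(K) = K mod 11 and reuses the key list from the problems from 7.3; `find_collisions` is a hypothetical helper name.

```python
from collections import defaultdict


def find_collisions(keys, table_size=11):
    """Return {slot: [keys...]} for every slot that two or more keys map to."""
    by_slot = defaultdict(list)
    for key in keys:
        by_slot[key % table_size].append(key)
    return {slot: ks for slot, ks in by_slot.items() if len(ks) > 1}


collisions = find_collisions([30, 20, 56, 75, 31, 19])
print(collisions)  # slot 8 holds 30 and 19; slot 9 holds 20, 75 and 31
```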
5
Collision resolution
Why was there a collision in the first place?
  o Poor hash function?
  o High load factor (too many keys for the array size)?
  o Coincidence?
Two ways of resolving collisions
  o Open hashing, also called separate chaining
  o Closed hashing, also called open addressing
6
Separate Chaining
Each slot in the array holds a linked list
When a collision occurs, the key/value pair is added to the end of the list at that slot
How do we search for items with separate chaining?
  o Calculate the address
  o Linear search the list at that address
Efficiency depends on:
  o Length of the linked lists
  o Size of the array in comparison to the number of keys
  o Quality of the hash function
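The search and insert steps for separate chaining can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: integer keys, h(K) = K mod 11, and a Python list standing in for each linked list. The class and method names are hypothetical.

```python
class ChainedHashTable:
    def __init__(self, size=11):
        self.size = size
        # One chain per slot; a Python list stands in for the linked list.
        self.slots = [[] for _ in range(size)]

    def _address(self, key):
        return key % self.size

    def put(self, key, value):
        chain = self.slots[self._address(key)]
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)  # key already present: update it
                return
        chain.append((key, value))  # otherwise add to the end of the chain

    def get(self, key):
        # Calculate the address, then linear search the chain at that slot.
        for existing_key, value in self.slots[self._address(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)


table = ChainedHashTable()
table.put(30, "thirty")
table.put(19, "nineteen")  # collides with 30: both map to slot 8
print(table.get(19))
```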
7
Open Addressing
Collisions are resolved by moving to the next open slot
How do we search in this case?
  o Compute what the address "should" be
  o Not there? Check the following slots one by one, wrapping around at the end of the array, until you find the key or reach an empty slot
This is called linear probing
What happens if we delete a key?
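A minimal sketch of linear probing, assuming integer keys, h(K) = K mod 11, and `None` as the marker for an empty slot. The class name and its methods are hypothetical, and deletion is deliberately left out.

```python
class LinearProbingTable:
    def __init__(self, size=11):
        self.size = size
        self.keys = [None] * size    # None marks an empty slot
        self.values = [None] * size

    def put(self, key, value):
        index = key % self.size      # where the key "should" be
        for _ in range(self.size):
            if self.keys[index] is None or self.keys[index] == key:
                self.keys[index] = key
                self.values[index] = value
                return index
            index = (index + 1) % self.size  # move to the next slot, wrapping
        raise RuntimeError("table is full")

    def get(self, key):
        index = key % self.size
        for _ in range(self.size):
            if self.keys[index] is None:
                break                # reached an empty slot: key is absent
            if self.keys[index] == key:
                return self.values[index]
            index = (index + 1) % self.size
        raise KeyError(key)


table = LinearProbingTable()
for k in [30, 20, 56, 75, 31, 19]:
    table.put(k, str(k))
print(table.keys)
```

Note how 19 hashes to slot 8 but ends up in slot 2: it has to probe past every occupied slot in between, wrapping around the end of the array.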
8
Issues in Open Addressing
If we delete a key, that leaves an open spot in the array
If a key AFTER the one we deleted was placed by linear probing, we will never find it
  o We compute its location, start probing, and hit the open slot before we reach the key
We can use a "book mark"
  o Place a special marker in that slot that matches no real key
  o Searches probe past the marker; it holds the location until a real item is inserted there
Problem #1 SOLVED!
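The "book mark" idea can be sketched with two sentinel objects, one for never-used slots and one for deleted slots. This assumes integer keys and h(K) = K mod (array size); the names `EMPTY`, `DELETED`, and the three `probe_*` functions are hypothetical.

```python
EMPTY = object()    # slot has never held a key
DELETED = object()  # the "book mark": a key was here and was removed


def probe_find(slots, key):
    """Return the index of key, or -1 if it is not in the table."""
    size = len(slots)
    index = key % size
    for _ in range(size):
        if slots[index] is EMPTY:
            return -1                # a truly empty slot ends the search
        if slots[index] == key:      # a book mark never equals a real key
            return index
        index = (index + 1) % size
    return -1


def probe_insert(slots, key):
    """Insert key, reusing the first book mark or empty slot on its probe path."""
    size = len(slots)
    index = key % size
    first_free = -1
    for _ in range(size):
        entry = slots[index]
        if entry is EMPTY:
            if first_free == -1:
                first_free = index
            break
        if entry is DELETED:
            if first_free == -1:
                first_free = index
        elif entry == key:
            return index             # key is already stored
        index = (index + 1) % size
    if first_free == -1:
        raise RuntimeError("table is full")
    slots[first_free] = key
    return first_free


def probe_delete(slots, key):
    """Replace key with a book mark; return True if the key was present."""
    index = probe_find(slots, key)
    if index == -1:
        return False
    slots[index] = DELETED
    return True


slots = [EMPTY] * 11
probe_insert(slots, 30)                      # lands in slot 8
probe_insert(slots, 19)                      # collides, lands in slot 9
probe_delete(slots, 30)                      # slot 8 becomes a book mark
found_after_delete = probe_find(slots, 19)   # still reachable
reused = probe_insert(slots, 41)             # 41 mod 11 = 8: reuses the book mark
```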
9
More issues in Open Addressing!
Clustering: long runs of occupied slots built up by linear probing
When clusters merge, a search walks the run slot by slot, much like a linked list, which is bad news
How can we resolve this clustering problem?
  o Double Hashing
      Use ANOTHER hash function to determine a fixed probing increment other than 1
  o Rehashing
      Resize the array and remap all the keys to new locations
Problem #2 SOLVED!
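Both remedies can be sketched together. The second hash function `h2` below, 1 + (K mod (m - 1)), is an assumption chosen for illustration (the slides do not name one); it never returns 0, and with a prime array size every step length eventually visits every slot. All function names are hypothetical.

```python
def h1(key, size):
    return key % size


def h2(key, size):
    # Assumed secondary hash: always in 1..size-1, so the step is never 0.
    return 1 + key % (size - 1)


def double_hash_insert(slots, key):
    """Insert key using a per-key probing increment instead of 1."""
    size = len(slots)
    index = h1(key, size)
    step = h2(key, size)
    for _ in range(size):
        if slots[index] is None or slots[index] == key:
            slots[index] = key
            return index
        index = (index + step) % size
    raise RuntimeError("table is full")


def rehash(slots, new_size):
    """Build a larger array and remap every key to its new location."""
    new_slots = [None] * new_size
    for key in slots:
        if key is not None:
            double_hash_insert(new_slots, key)
    return new_slots


keys = [30, 20, 56, 75, 31, 19]
slots = [None] * 11
placed = [double_hash_insert(slots, k) for k in keys]
bigger = rehash(slots, 23)  # 23 is an assumed new prime size
print(placed)
```

Compared with linear probing, the colliding keys 75, 31, and 19 jump to scattered slots instead of queuing up next to each other.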
10
Performance Analysis
11
Real Performance Analysis
Worst case:
  o All operations: O(n)
  o Table is full and we are either linear probing or linear searching to find the key
  o Our table is acting like a linked list...
Best case:
  o All operations: O(1)
  o No collisions at all, which is called "perfect hashing"
Avg case, with k keys in n slots (load factor k/n):
  o O(1 + k/n) for chaining and unsuccessful lookup
  o O(1/(1 - k/n)) for open addressing and unsuccessful lookup
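To get a feel for the average-case expressions, the sketch below evaluates them as rough probe-count estimates at a few load factors. It assumes k is the number of stored keys and n the number of slots, and it treats the big-O expressions as if they were exact, which they are not; the function names are hypothetical.

```python
def chaining_unsuccessful(k, n):
    """Estimate for an unsuccessful lookup with chaining: 1 + k/n."""
    return 1 + k / n


def open_addressing_unsuccessful(k, n):
    """Estimate for an unsuccessful lookup with open addressing: 1/(1 - k/n)."""
    if k >= n:
        raise ValueError("the estimate only applies when k < n")
    return 1 / (1 - k / n)


for k, n in [(1, 2), (6, 11), (9, 10)]:
    print(
        f"load {k}/{n}: "
        f"chaining ~{chaining_unsuccessful(k, n):.2f}, "
        f"open addressing ~{open_addressing_unsuccessful(k, n):.2f}"
    )
```

The open addressing estimate grows without bound as the array fills up, while the chaining estimate grows only linearly with the load factor.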
12
Problems
Problems from 7.3:
  o 1: Input 30, 20, 56, 75, 31, 19. Use h(K) = K mod 11
      Construct the open hash table (separate chaining)
      Find the largest # of comparisons for a successful search
      Find the avg. # of comparisons for a successful search
  o 2: Input 30, 20, 56, 75, 31, 19. Use h(K) = K mod 11
      Construct the closed hash table (open addressing)
      Find the largest # of comparisons for a successful search
      Find the avg. # of comparisons for a successful search
  o 5: How many people should be in a room so that there is a > 50% chance that two share a birthday? What implications does this have for hashing?
13
Examples
http://en.wikipedia.org/wiki/Birthday_problem
http://turing.cs.olemiss.edu/~apernell/Count.java
http://www.codinghorror.com/blog/2007/12/hashtables-pigeonholes-and-birthdays.html
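The birthday problem can be checked numerically. The sketch below is not the linked Count.java; it is an independent illustration that assumes 365 equally likely birthdays and treats each day as a "slot", with hypothetical function names.

```python
def birthday_collision_probability(people, days=365):
    """Probability that at least two of `people` share one of `days` slots."""
    p_all_distinct = 1.0
    for i in range(people):
        p_all_distinct *= (days - i) / days
    return 1 - p_all_distinct


def smallest_group(threshold=0.5, days=365):
    """Smallest group size whose collision probability exceeds threshold."""
    people = 0
    while birthday_collision_probability(people, days) <= threshold:
        people += 1
    return people


print(smallest_group())         # birthdays: 365 "slots"
print(smallest_group(days=11))  # a hash table with 11 slots
```

The same calculation applies to a hash table: with an evenly distributing hash function and 11 slots, a collision becomes more likely than not after only 5 keys, long before the table is full.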
14
First person to raise your hand AND give me these three answers:
A) What are the two types of hashing called?
B) What is it called when two keys map to the same slot?
C) What is it called when we resize the table?
gets 5 points on your next homework...