Design and Analysis of Algorithms - Chapter 7

Hashing
• A very efficient method for implementing a dictionary, i.e., a set with the operations:
  – insert
  – find
  – delete
• Applications:
  – databases
  – symbol tables

Hash tables and hash functions
• Hash table: an array whose indices correspond to buckets
• Hash function: determines the bucket for each record
• Example: student records, key = SSN. Hash function: h(k) = k mod m (k is a key and m is the number of buckets)
  – if m = 1000, where is the record with SSN= stored?
• A hash function must:
  – be easy to compute
  – distribute keys evenly throughout the table
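
As a small illustration of h(k) = k mod m, the sketch below hashes a made-up nine-digit key with m = 1000. The function name `h` and the sample key are assumptions chosen for illustration; the key is not a value from the slides.

```python
def h(k: int, m: int) -> int:
    """Division-method hash: map integer key k to a bucket index in 0..m-1."""
    return k % m


# Hypothetical key, chosen only for illustration.
sample_key = 123456789
bucket = h(sample_key, 1000)
print(bucket)  # with m = 1000 this is the last three digits of the key
```

With m = 1000, any two keys that share their last three digits land in the same bucket, which is why the choice of m affects how evenly keys are spread.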

Collisions
• If h(k1) = h(k2) for two different keys k1 and k2, there is a collision.
• Good hash functions result in fewer collisions.
• Collisions can never be completely eliminated.
• Two types of hashing handle collisions differently:
  – Open hashing (separate chaining): each bucket points to a linked list of all keys hashing to it.
  – Closed hashing (open addressing):
    – one key per bucket
    – in case of collision, find another bucket for one of the keys (needs a collision resolution strategy)
    – linear probing: use the next bucket
    – double hashing: use a second hash function to compute the increment
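
The two closed-hashing strategies differ only in the order in which buckets are tried. The sketch below lists that order for each strategy. The function names and the particular second hash, 1 + (k mod (m - 1)), are assumptions made for illustration, and m is assumed to be prime so that every step size visits all buckets.

```python
def linear_probe_sequence(k: int, m: int) -> list[int]:
    """Buckets tried by linear probing: h(k), h(k)+1, h(k)+2, ... (mod m)."""
    start = k % m
    return [(start + i) % m for i in range(m)]


def double_hash_sequence(k: int, m: int) -> list[int]:
    """Buckets tried by double hashing: the step comes from a second hash.

    Assumes m is prime, so any step in 1..m-1 reaches every bucket.
    """
    start = k % m
    step = 1 + (k % (m - 1))  # hypothetical second hash; never zero
    return [(start + i * step) % m for i in range(m)]


# Keys 10 and 17 collide in a table of m = 7 buckets (both hash to 3).
print(linear_probe_sequence(10, 7))
print(linear_probe_sequence(17, 7))
print(double_hash_sequence(10, 7))
print(double_hash_sequence(17, 7))
```

Under linear probing the two colliding keys follow the same path through the table, while under double hashing they diverge after the first bucket because their step sizes differ.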

Open hashing
• If the hash function distributes keys uniformly, the average length of a linked list will be α = n/m (the load factor), where n is the number of keys and m is the number of buckets.
• Average number of probes in a successful search ≈ 1 + α/2 (an unsuccessful search examines about α keys).
• Worst case is still linear!
• Open hashing still works if n > m.
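
A minimal sketch of open hashing, assuming integer keys and h(k) = k mod m. The class name `ChainedHashTable` and its method names are illustrative, and Python lists stand in for the linked lists.

```python
class ChainedHashTable:
    """Open hashing: every bucket holds a chain of all keys that hash to it."""

    def __init__(self, m: int = 8):
        self.m = m                              # number of buckets
        self.buckets = [[] for _ in range(m)]   # one chain per bucket
        self.n = 0                              # number of stored keys

    def _chain(self, key: int) -> list:
        return self.buckets[key % self.m]

    def insert(self, key: int) -> None:
        chain = self._chain(key)
        if key not in chain:                    # keep the table a set
            chain.append(key)
            self.n += 1

    def find(self, key: int) -> bool:
        return key in self._chain(key)          # scans only one chain

    def delete(self, key: int) -> None:
        chain = self._chain(key)
        if key in chain:
            chain.remove(key)
            self.n -= 1

    def load_factor(self) -> float:
        return self.n / self.m


# Ten keys in four buckets: n > m is fine because chains can grow.
table = ChainedHashTable(m=4)
for key in range(10):
    table.insert(key)
print(table.load_factor())
print(table.find(7), table.find(42))
```

If every key happened to hash to the same bucket, `find` would scan one chain of length n, which is the linear worst case noted above.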

Closed hashing
• Does not work if n > m.
• Avoids pointers.
• Deletions are not straightforward.
• The number of probes to insert/find/delete a key depends on the load factor α = n/m (hash table density). For linear probing:
  – successful search: ½ (1 + 1/(1 - α))
  – unsuccessful search: ½ (1 + 1/(1 - α)²)
• As the table gets filled (α approaches 1), the number of probes increases dramatically.
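
To see how quickly the probe counts grow, the two formulas can be evaluated at a few load factors. This is a sketch only: the function names and the sample values of α are assumptions, and the formulas are the approximations stated above.

```python
def successful_probes(alpha: float) -> float:
    """Approximate probes for a successful search: (1/2)(1 + 1/(1 - alpha))."""
    return 0.5 * (1 + 1 / (1 - alpha))


def unsuccessful_probes(alpha: float) -> float:
    """Approximate probes for an unsuccessful search: (1/2)(1 + 1/(1 - alpha)^2)."""
    return 0.5 * (1 + 1 / (1 - alpha) ** 2)


# Sample load factors chosen for illustration.
for alpha in (0.5, 0.75, 0.9):
    print(f"alpha={alpha:.2f}  "
          f"successful={successful_probes(alpha):.1f}  "
          f"unsuccessful={unsuccessful_probes(alpha):.1f}")
```

At α = 0.5 the estimates are about 1.5 and 2.5 probes; at α = 0.9 they rise to about 5.5 and 50.5, which is why closed-hashing tables are usually kept well below full.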