Hashing, Hash Tables Chapter 8
Class Hierarchy
Introduction Definition: –Key: a key is a field or composite of fields that uniquely identifies an entry in a table.
Example Table of students in a course, sorted by name
Name               Year  Mark
Adams, Keith       3     94
Davis, Susan       1     75
Jordan, Ann        1     86
Patterson, Lynn    4     73
Williams, George   1     65
Insert Function of ListAsArray
Find Function of ListAsArray
Insert Function of ListAsLinkedList
Find Function of ListAsLinkedList
Insert Function of SortedListAsArray
Binary Search
Hashing The implementation of hash tables is called hashing. Hashing is a technique for performing insertions and finds in constant average time. Efficient removal of items is not required.
The General Idea –Array of some fixed size, containing items.
Example
Keys and Hash Functions Each key is mapped into some number in the range 0 to TableSize-1 and placed in the appropriate cell. The mapping is called a hash function
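As a rough illustration of the mapping described above, the sketch below places integer keys into an array of TableSize cells. The names hashKey and placeKey, the table size of 11, and the use of key % TableSize as the mapping are all assumptions made for illustration; a key that lands in an occupied cell simply overwrites the earlier one.

```cpp
#include <array>
#include <cstddef>

// Hypothetical table size, chosen only for illustration.
constexpr std::size_t TableSize = 11;

// Maps a key to a cell index in the range 0 .. TableSize-1.
std::size_t hashKey(unsigned int key) {
    return key % TableSize;
}

// Stores the key in the cell selected by the hash function and returns
// that cell's index. An earlier key in the same cell is overwritten.
std::size_t placeKey(std::array<unsigned int, TableSize>& table,
                     unsigned int key) {
    std::size_t cell = hashKey(key);
    table[cell] = key;
    return cell;
}
```

For instance, placeKey(table, 25) stores 25 in cell 3, because 25 mod 11 = 3.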
Keys and Hash Functions Characteristics of a good hash function –Avoids collisions –Spreads keys evenly in the array –Is easy to compute
Avoid Collisions Ideal situation –Given a set of n<=M distinct keys {k1,k2,…,kn}, the set of hash values {h(k1),h(k2),…,h(kn)} contains no duplicates. In practice we can only try to reduce the likelihood of a collision, using knowledge about the keys. E.g. if we know the telephone numbers are all from the same district, the district number will be of little use in our hash function.
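The telephone-number remark can be made concrete with a small sketch. It assumes a hypothetical 7-digit number whose first 3 digits are the district; the function names and this layout are illustrative only. Hashing on the district digits sends every number from one district to the same value, while hashing on the remaining digits does not.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical layout: a 7-digit number whose first 3 digits are the district.
unsigned int hashByDistrict(unsigned int phone) {
    return phone / 10000;   // leading 3 digits
}

unsigned int hashBySubscriber(unsigned int phone) {
    return phone % 10000;   // trailing 4 digits
}

// Counts how many distinct hash values a set of keys produces under h.
template <typename Hash>
std::size_t distinctValues(const std::vector<unsigned int>& keys, Hash h) {
    std::set<unsigned int> seen;
    for (unsigned int key : keys) {
        seen.insert(h(key));
    }
    return seen.size();
}
```

With the keys 5551234, 5559876 and 5550042, hashByDistrict yields a single value (555) for all three, whereas hashBySubscriber yields three different values.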
Spreading Keys Evenly We need to know the distribution of the keys An equal number of keys should map into each array position
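One way to check how evenly a set of keys is spread is to count how many keys land in each array position. The sketch below does this for the mapping key % M, which is assumed here purely for illustration; bucketCounts is a hypothetical helper name.

```cpp
#include <cstddef>
#include <vector>

// Returns how many of the given keys land in each of the M array positions
// under the mapping key % M (an assumed hash function for this sketch).
std::vector<std::size_t> bucketCounts(const std::vector<unsigned int>& keys,
                                      std::size_t M) {
    std::vector<std::size_t> counts(M, 0);
    for (unsigned int key : keys) {
        ++counts[key % M];
    }
    return counts;
}
```

With M = 5, the keys 0 to 19 give four keys per position, while the keys 0, 5, 10, …, 95 all fall into position 0. The same function spreads one key distribution evenly and another not at all, which is why the distribution of the keys matters.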
Ease of Computation The running time of the hash function should be O(1) (Jumping immediately to the desired record is a direct access approach, much like direct access of data on a disk)
Hashing Methods We deal with integer keys first: the set of keys K is the set of integers Z. The value of the hash function falls between 0 and M-1.
Division Method The simplest method of hashing an integer is the division method: h(x) = x mod M.
Choice of M In principle any M works, but some choices are poor (e.g. if M is even, h(x) always has the same parity as x) –we often choose M to be a prime number
Implementation
unsigned int const M = 1031; // a prime
unsigned int h(unsigned int x) { return x % M; }
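To see why the choice of M matters for the division method, the following sketch hashes a run of regularly spaced keys and counts how many distinct array positions they occupy. The helper names hashDiv and slotsUsed are hypothetical, and the table size is passed as a parameter only so that two values of M can be compared.

```cpp
#include <cstddef>
#include <set>

// Division-method hash with the table size passed in, so that different
// choices of M can be compared.
unsigned int hashDiv(unsigned int x, unsigned int M) {
    return x % M;
}

// Hashes the keys 0, stride, 2*stride, ... (count keys in total) and
// returns the number of distinct array positions they occupy.
std::size_t slotsUsed(unsigned int stride, unsigned int count,
                      unsigned int M) {
    std::set<unsigned int> slots;
    for (unsigned int i = 0; i < count; ++i) {
        slots.insert(hashDiv(i * stride, M));
    }
    return slots.size();
}
```

For 100 keys that are multiples of 64, slotsUsed(64, 100, 1024) is 16, because only multiples of 64 below 1024 can occur, while slotsUsed(64, 100, 1031) is 100: with the prime table size every key gets its own position.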
Middle Square Method Avoids division by making use of the fact that a computer does finite-precision integer arithmetic –All arithmetic is done modulo W, where W = 2^w and w is the word size of the computer –The table size is M = 2^k Meaning: –Multiply x by itself (modulo W), then shift the result to the right by w - k bits, which keeps the top k bits: h(x) = floor((x^2 mod W) / 2^(w-k))
Implementation
#include <climits> // CHAR_BIT
unsigned int const k = 10; // M==1024
unsigned int const w = sizeof (unsigned int) * CHAR_BIT; // word size in bits
unsigned int h (unsigned int x) { return (x * x) >> (w - k); } // x * x wraps modulo W
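A few hand-checkable values may help here. The sketch below assumes a 32-bit word (w = 32) and uses fixed-width types so the arithmetic does not depend on the platform; the name middleSquare is chosen for illustration.

```cpp
#include <cstdint>

constexpr unsigned int k = 10;  // M == 2^k == 1024
constexpr unsigned int w = 32;  // assumed word size, W == 2^w

// Middle-square hash: square x modulo W, then keep the top k bits of the
// w-bit result by shifting right w - k bits.
std::uint32_t middleSquare(std::uint32_t x) {
    std::uint64_t square = static_cast<std::uint64_t>(x) * x;  // exact product
    std::uint32_t low = static_cast<std::uint32_t>(square);    // x*x mod W
    return low >> (w - k);
}
```

Under these assumptions, middleSquare(50000) is 596, since 50000² = 2500000000 fits in 32 bits and 2500000000 / 2²² rounds down to 596. middleSquare(65535) is 1023, and middleSquare(65536) is 0 because 2³² wraps to 0 modulo W. Every result lies in the range 0 to M-1.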
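Multiplication Method We multiply the key by a constant a instead of by itself, then shift the result to the right by w - k bits, as in the middle square method.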
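Implementation
#include <climits> // CHAR_BIT
unsigned int const k = 10; // M==1024
unsigned int const w = sizeof (unsigned int) * CHAR_BIT; // word size in bits
unsigned int const a = U; // constant multiplier (numeric value elided)
unsigned int h (unsigned int x) { return (x * a) >> (w - k); }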
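Since the multiplier's value is not given, the sketch below plugs in an assumed one to make the method concrete: 2654435769, roughly 2³² divided by the golden ratio, which is a commonly used multiplier for 32-bit words. This constant, the 32-bit word size, and the name multiplicativeHash are assumptions of this example, not values taken from the text.

```cpp
#include <cstdint>

constexpr unsigned int k = 10;            // M == 2^k == 1024
constexpr unsigned int w = 32;            // assumed word size, W == 2^w
constexpr std::uint32_t a = 2654435769U;  // assumed multiplier, about 2^32 / golden ratio

// Multiplication-method hash: multiply x by the constant a modulo W,
// then keep the top k bits by shifting right w - k bits.
std::uint32_t multiplicativeHash(std::uint32_t x) {
    std::uint64_t product = static_cast<std::uint64_t>(x) * a;  // exact product
    std::uint32_t low = static_cast<std::uint32_t>(product);    // x*a mod W
    return low >> (w - k);
}
```

With this multiplier, multiplicativeHash(1) is 632 and multiplicativeHash(2) is 241, so even consecutive keys are sent to positions that are far apart in the table.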
Hash Tables
HashTable Class Definition
Separate Chaining