Hash Tables Computer Science and Engineering Chapter 8 of Goodrich and Tamassia’s Text 2/24/2019 B. Ramamurthy
Topics: hashing; hash functions; hash tables; collisions; collision resolution (chaining, linear probing, quadratic probing, double hashing); Java hash tables; example.
Hashing Concept: Hashing is another approach to storing and searching for elements. It suits applications that add and delete elements in addition to searching. The worst-case time per operation is linear in the number of elements, but with few collisions an operation takes O(1) time.
Hash Function: A hash function h maps keys of a given type to integers in the interval [0, N-1]. A simple example is h(x) = x mod N. A good hash function disperses the keys uniformly over [0, N-1].
Hash Table: The hash table ADT consists of a hash function h and an array of size N. A collision occurs when two keys map to the same array index. The two major collision resolution schemes are chaining and open addressing.
Example: Each item is (name, ssn), where ssn is a 9-digit positive integer. The hash table has N = 10000 cells. The hash function takes the last four digits of the key x, that is, h(x) = x mod 10000. Chaining handles collisions.
Hash Table: Example
index  chain
0
1
2      045-34-0002
3
...
9996   045-35-9996 -> 567-34-9996
9997
9998
9999
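The SSN example can be sketched in Java roughly as follows. This is a minimal illustration, not the author's code: the class name ChainedSsnTable, the Item class, and the method names are hypothetical, the names stored are placeholders, and SSNs are written as plain int values without dashes or the leading zero (a leading 0 would make a Java literal octal).

```java
import java.util.LinkedList;

// Hypothetical sketch of the slide's example: N = 10000 cells, chaining.
class ChainedSsnTable {
    static final int N = 10000;

    // One stored item: (name, ssn).
    static final class Item {
        final String name;
        final int ssn;
        Item(String name, int ssn) { this.name = name; this.ssn = ssn; }
    }

    // Each cell holds a chain of items; null means the cell is empty.
    @SuppressWarnings("unchecked")
    private final LinkedList<Item>[] table = new LinkedList[N];

    // Hash function from the slide: the last four digits of the SSN.
    static int hash(int ssn) { return ssn % N; }

    // Appends to the chain; duplicates are not checked in this sketch.
    void insert(String name, int ssn) {
        int i = hash(ssn);
        if (table[i] == null) table[i] = new LinkedList<>();
        table[i].add(new Item(name, ssn));
    }

    // Walks the chain of the key's cell; returns null if the SSN is absent.
    String find(int ssn) {
        LinkedList<Item> chain = table[hash(ssn)];
        if (chain == null) return null;
        for (Item it : chain) {
            if (it.ssn == ssn) return it.name;
        }
        return null;
    }

    int chainLength(int index) {
        return table[index] == null ? 0 : table[index].size();
    }
}
```

Inserting 045-35-9996 and 567-34-9996 puts both items in the chain at index 9996, as in the table above.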
Hash Functions: A hash function is usually specified as two components. The first, called the hash code map, collects the parts of the data and maps them to integers: h1: keys → integers. The second, called the compression map, maps those integers to the range [0, N-1]: h2: integers → [0, N-1]. The hash function is the composition h(x) = h2(h1(x)).
Hash Code Maps: Common choices are the memory address of the data; an integer cast of non-numeric data (its bit or byte pattern); the sum of all components (for example, add the ASCII values of the letters in your last name); and polynomials over the parts of the data.
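Two of these hash code maps, the component sum and the polynomial, might look like this for strings. The class and method names are hypothetical, and int overflow is simply allowed to wrap around.

```java
// Hypothetical helpers illustrating two hash code maps for strings.
class HashCodeMaps {
    // Sum of components: add the character codes of the string.
    // Anagrams such as "ab" and "ba" get the same code.
    static int sumCode(String s) {
        int sum = 0;
        for (int i = 0; i < s.length(); i++) {
            sum += s.charAt(i);
        }
        return sum;
    }

    // Polynomial code s[0]*a^(n-1) + ... + s[n-1], evaluated with
    // Horner's rule. Character position now affects the result.
    static int polyCode(String s, int a) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = a * h + s.charAt(i);
        }
        return h;
    }
}
```

With a = 31 the polynomial code coincides with Java's own String.hashCode(), which is documented to use the same formula.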
Compression Maps: Remainder of division (mod): h(y) = y mod N. Multiply, Add and Divide (MAD): h(y) = (a*y + b) mod N, where a mod N ≠ 0 (otherwise every key maps to b).
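Both compression maps can be sketched in a few lines of Java. The class and method names are hypothetical. Math.floorMod is used instead of % on the assumption that hash codes may be negative, and the MAD computation is done in long arithmetic so that a*y + b does not overflow.

```java
// Hypothetical helpers for the two compression maps.
class CompressionMaps {
    // Division method: h(y) = y mod N, always in [0, N-1].
    static int division(int y, int n) {
        return Math.floorMod(y, n);
    }

    // MAD method: h(y) = (a*y + b) mod N, with a mod N != 0.
    static int mad(int y, int a, int b, int n) {
        return (int) Math.floorMod((long) a * y + b, (long) n);
    }
}
```

For example, with N = 13, a = 3, b = 4, the key 18 maps to (54 + 4) mod 13 = 6. With a = 13 (so a mod N = 0), every key maps to 4, which is the degenerate case the slide warns about.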
Linear Probing: Linear probing resolves a collision by placing the colliding item in the next available empty cell, wrapping around at the end of the array. Each cell inspected is referred to as a “probe”. Example: h(x) = x mod 13; insert the keys 18, 41, 22, 44, 59, 32, 31, 73, in that order.
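The insertion example can be traced with a short Java sketch. The class and method names are hypothetical, the table stores bare keys rather than (key, element) pairs, and keys are assumed to be non-negative.

```java
import java.util.Arrays;

// Hypothetical trace of linear-probing insertion with h(x) = x mod N.
class LinearProbingDemo {
    static Integer[] insertAll(int n, int... keys) {
        Integer[] table = new Integer[n];   // null marks an empty cell
        for (int k : keys) {
            int j = k % n;                  // home cell h(k)
            int probes = 0;
            // Step to the next cell (with wraparound) until one is free.
            while (table[j] != null && probes < n) {
                j = (j + 1) % n;
                probes++;
            }
            if (table[j] != null) {
                throw new IllegalStateException("table is full");
            }
            table[j] = k;
        }
        return table;
    }

    public static void main(String[] args) {
        Integer[] t = insertAll(13, 18, 41, 22, 44, 59, 32, 31, 73);
        System.out.println(Arrays.toString(t));
    }
}
```

Running main prints [null, null, 41, null, null, 18, 44, 59, 32, 22, 31, 73, null]: 44, 32, 31 and 73 all collide and slide right to cells 6, 8, 10 and 11.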
Search with linear probing: find(k)
1. j = h(k); p = 0
2. while (p < N)
   2.1 c = A[j]
   2.2 if (c == null) return NOT_FOUND
       else if (c.key() == k) return c.element()
       else j = (j + 1) mod N; p = p + 1
3. return NOT_FOUND
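A Java rendering of find(k) might look like the following. It is a sketch under simplifying assumptions: the table stores bare non-negative keys in an Integer[] (null marks an empty cell), so the method returns the index of the key, or -1 in place of NOT_FOUND, rather than an element. The class name is hypothetical.

```java
// Hypothetical Java version of the find(k) pseudocode.
class LinearProbingFind {
    static int find(Integer[] table, int k) {
        int n = table.length;
        int j = k % n;                    // j = h(k)
        for (int p = 0; p < n; p++) {     // at most N probes
            Integer c = table[j];
            if (c == null) return -1;     // empty cell: k is not present
            if (c == k) return j;         // found the key
            j = (j + 1) % n;              // probe the next cell
        }
        return -1;                        // every cell probed without a match
    }
}
```

An empty cell ends the search early because an inserted key is always placed in the first free cell after its home position, provided no deletions have left holes in the table.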
Double Hashing: h is the primary hash function. If it results in a collision, resolve the collision by applying a secondary hash function d. One example of such a function is d(k) = q - (k mod q), where q < N. Its possible values are 1, 2, ..., q; it can never be 0. When a collision occurs, probe the cells (h(k) + j*d(k)) mod N for j = 0, 1, ..., N-1.
Example: h(k) = k mod 13, d(k) = 7 - (k mod 7). Insert 18, 41, 22, 44, 59, 32, 31, 73.
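The double hashing example can be traced with a sketch like the one below. The class and method names are hypothetical, the table stores bare non-negative keys, and q is passed in as a parameter.

```java
// Hypothetical trace of double hashing with h(k) = k mod N and
// d(k) = q - (k mod q).
class DoubleHashingDemo {
    static Integer[] insertAll(int n, int q, int... keys) {
        Integer[] table = new Integer[n];     // null marks an empty cell
        for (int k : keys) {
            int h = k % n;                    // primary hash
            int d = q - (k % q);              // secondary hash, in 1..q
            boolean placed = false;
            // Probe (h + j*d) mod N for j = 0, 1, ..., N-1.
            for (int j = 0; j < n && !placed; j++) {
                int i = (h + j * d) % n;
                if (table[i] == null) {
                    table[i] = k;
                    placed = true;
                }
            }
            if (!placed) {
                throw new IllegalStateException("no free cell on probe sequence");
            }
        }
        return table;
    }
}
```

Calling insertAll(13, 7, 18, 41, 22, 44, 59, 32, 31, 73) yields 31 at cell 0, 41 at 2, 18 at 5, 32 at 6, 59 at 7, 73 at 8, 22 at 9 and 44 at 10. Only 44 (step 5, lands on 10) and 31 (step 4, skips occupied cell 9 and lands on 0) need extra probes.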
Java Support for Hashing (Java 1.4 API): Hashtable, available since JDK 1.0; HashMap, available since JDK 1.2.
Hashtable/HashMap: Both are backed by an array of Entry objects, and chaining is the collision resolution policy. Entry is an inner class that acts as a linked-list node: each Entry holds a key, a value, and a link to the next Entry. The key's class should implement the hashCode() and equals() methods. hashCode() of the key class is the hash code map that converts the key's attributes to an integer. The compression map is, conceptually, hashCode() mod array.length. The load factor (how full the table can get before it expands) and the initial capacity can be specified.
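The structure described here can be illustrated with a much-simplified chained map. This is a teaching sketch, not the JDK implementation: the class name SimpleChainedMap is hypothetical, the table never resizes (so there is no load factor), and null keys are not supported.

```java
// Hypothetical, simplified chained hash map: an array of Entry chains.
class SimpleChainedMap<K, V> {
    // One node of a chain: key, value, and a link to the next Entry.
    private static final class Entry<K, V> {
        final K key;
        V value;
        Entry<K, V> next;
        Entry(K key, V value, Entry<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Entry<K, V>[] table;

    @SuppressWarnings("unchecked")
    SimpleChainedMap(int capacity) {
        table = (Entry<K, V>[]) new Entry[capacity];
    }

    // Hash code map (hashCode) followed by compression map (mod length).
    private int indexFor(K key) {
        return Math.floorMod(key.hashCode(), table.length);
    }

    // Replaces the value if the key exists (returning the old value),
    // otherwise links a new Entry at the head of the chain.
    V put(K key, V value) {
        int i = indexFor(key);
        for (Entry<K, V> e = table[i]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        table[i] = new Entry<>(key, value, table[i]);
        return null;
    }

    // Walks the chain, using equals() to tell colliding keys apart.
    V get(K key) {
        for (Entry<K, V> e = table[indexFor(key)]; e != null; e = e.next) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }
}
```

The sketch shows why a key class needs both methods: hashCode() selects the chain, and equals() picks the right Entry within it.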
Hashtable vs HashMap: Hashtable extends Dictionary and implements Map, Cloneable, Serializable. HashMap extends AbstractMap and implements Map, Cloneable, Serializable.
Hashtable vs HashMap: Hashtable extends Dictionary, which is obsolete and has been superseded by the Map interface and AbstractMap. The Hashtable implementation is synchronized for concurrent access; HashMap leaves synchronization to the application. Hashtable is therefore slower (due to its synchronized implementation).
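Because both classes implement Map, application code can be written against the interface and the choice made in one place. The sketch below shows three options; the class and method names are hypothetical. Collections.synchronizedMap is one standard way for an application to add synchronization to a HashMap when it needs it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

// Hypothetical factory methods contrasting the three choices.
class MapChoiceDemo {
    // Legacy class: every method is synchronized.
    static Map<Integer, String> legacy() {
        return new Hashtable<>();
    }

    // Unsynchronized: suitable for single-threaded use.
    static Map<Integer, String> plain() {
        return new HashMap<>();
    }

    // HashMap with synchronization added by the application.
    static Map<Integer, String> wrapped() {
        return Collections.synchronizedMap(new HashMap<>());
    }
}
```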
HashMap
Using HashMap: Problem: consider employee data consisting of last name, first name, employee number (a 5-digit integer), and salary. We want to maintain a data structure of 100 temporary employees that supports easy add, remove, and search. Solution: HashMap.
Using HashMap: Problem: an employee record is made up of a name (first, last), an employee number (Integer), and a salary. Maintain a set of 50 employee records for a “temp agency” so that employees can be added, removed, and searched for easily. Solution: the data structure of choice is HashMap. Which key? In the first version, consider the employee number as the key. In the second version, consider the name as the key; Name is a class with two attributes, first and last.
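A first version, keyed by employee number, might be sketched as follows. This is an illustration only and not the design from the lecture: the class name TempAgencySketch, the Employee fields, and the method names are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical first version: employee number is the key.
class TempAgencySketch {
    static final class Employee {
        final String first;
        final String last;
        final int number;
        final double salary;
        Employee(String first, String last, int number, double salary) {
            this.first = first;
            this.last = last;
            this.number = number;
            this.salary = salary;
        }
    }

    // Integer already implements hashCode() and equals().
    private final Map<Integer, Employee> staff = new HashMap<>();

    // Adds the employee, replacing any record with the same number.
    void add(Employee e) {
        staff.put(e.number, e);
    }

    // Returns the employee with this number, or null if absent.
    Employee search(int number) {
        return staff.get(number);
    }

    // Removes and returns the employee, or null if absent.
    Employee remove(int number) {
        return staff.remove(number);
    }

    int size() {
        return staff.size();
    }
}
```

Using Integer as the key avoids writing hashCode() and equals() by hand; the second version, keyed by Name, has to supply both.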
TempAgency: Design
More on the hashCode() Method: hashCode() is defined in the Object class and is therefore inherited by all classes. The default implementation typically returns a value derived from the object's memory address. Where is hashCode() called in HashMap? In put(), get(), remove(), and other methods that compute the hash index through the two-phase hashing (hash code map, then compression map).
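The effect of overriding hashCode() and equals() in a key class can be seen in a small sketch. Both key classes below are hypothetical: Name overrides the two methods, while RawName keeps the defaults inherited from Object.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical demonstration of a key class with and without overrides.
class NameKeyDemo {
    // Key class that defines equality and hashing by its two attributes.
    static final class Name {
        final String first;
        final String last;
        Name(String first, String last) {
            this.first = first;
            this.last = last;
        }
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Name)) return false;
            Name n = (Name) o;
            return first.equals(n.first) && last.equals(n.last);
        }
        @Override
        public int hashCode() {
            return Objects.hash(first, last);
        }
    }

    // Same attributes, but identity-based equals() and hashCode().
    static final class RawName {
        final String first;
        final String last;
        RawName(String first, String last) {
            this.first = first;
            this.last = last;
        }
    }

    // Stores under one Name object, looks up with a second, equal one.
    static Integer lookupWithOverrides() {
        Map<Name, Integer> m = new HashMap<>();
        m.put(new Name("Ada", "Lovelace"), 1);
        return m.get(new Name("Ada", "Lovelace"));
    }

    // Same experiment with RawName: the lookup misses.
    static Integer lookupWithDefaults() {
        Map<RawName, Integer> m = new HashMap<>();
        m.put(new RawName("Ada", "Lovelace"), 1);
        return m.get(new RawName("Ada", "Lovelace"));
    }
}
```

lookupWithOverrides() returns 1, while lookupWithDefaults() returns null: with the inherited methods, two distinct objects are never equal, so get() cannot match the stored key even though the attributes are the same.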
Summary: Hashing offers a convenient way to store and search for data. The data is considered in the form of {key, value} pairs. The location of an item depends only on its key, not on its “successor” or “predecessor” key.