Thought for the Day
“Never doubt that a small group of thoughtful, committed people can change the world. Indeed, it is the only thing that ever has.” – Margaret Mead
Hash Functions
Essential ingredient for a successful hash table
Numeric keys
  – simple
Textual (alphanumeric) keys
  – e.g. 609A1234
  – Not so simple!
[Diagram: key → Hash Function → index]
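One common way to handle textual keys can be sketched as follows: treat the characters as digits of a number in some base and scale the result to the table size. The class name TextKeyHash, the method name hash and the table size 101 are assumptions made for illustration; the multiplier 31 is the one Java's own String.hashCode uses.

```java
// Hypothetical helper for illustration; not one of the lecture's classes.
final class TextKeyHash {
    // Polynomial hash: fold each character into the running value in base 31.
    static int hash(String key, int tableSize) {
        int h = 0;
        for (int i = 0; i < key.length(); i++) {
            h = 31 * h + key.charAt(i);   // int overflow simply wraps around
        }
        // Clear the sign bit, then scale to a valid index 0 .. tableSize-1.
        return (h & 0x7FFFFFFF) % tableSize;
    }
}
```

For example, TextKeyHash.hash("609A1234", 101) gives an index between 0 and 100 for the alphanumeric key shown on the slide.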
Collisions
Prevented by using a perfect hash function
But this is not always possible
Other solutions?
Two types of hash tables:
  – Internal hashing
  – External hashing
Internal Hashing with Open Addressing
Hash the key to find location
If it is already occupied: start “probing”
  – simple approach: try immediately following positions
[Diagram: “programming” hashes to index 59; “words” also hashes to 59 (collision!) and is placed by probing the following positions]
Probing
Example shows linear probing
Other possibilities
  – add constant amount to hash value
  – add variable amount to hash value
  – apply second hash function to get probe “distance”
Must be consistent: a search must follow the same probe sequence as the insertion did
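The probing options listed above can be sketched as small index calculations. This is a minimal illustration, not the lecture's code: the class ProbeSequence and its method names are hypothetical, and the table size is assumed to be prime so that every step size eventually visits every slot.

```java
// Hypothetical helper for illustration; names are not from the lecture.
final class ProbeSequence {
    // Linear probing: the i-th probe is i positions past the home slot, wrapping around.
    static int linear(int home, int i, int size) {
        return (home + i) % size;
    }

    // Double hashing: the i-th probe is i * step positions past the home slot.
    // long arithmetic avoids overflow in i * step.
    static int doubleHash(int home, int i, int step, int size) {
        return (int) ((home + (long) i * step) % size);
    }

    // Second hash function: a probe "distance" in the range 1 .. size-1 (never zero).
    static int step(int hashCode, int size) {
        return 1 + (hashCode & 0x7FFFFFFF) % (size - 1);
    }
}
```

Because each method depends only on the key's hash values and the probe number, insertion and searching generate the same sequence of positions, which is the consistency the slide asks for.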
Important Side Effect
Must be very careful when removing items
Must mark deleted entries
One of three states:
  – empty
  – occupied
  – deleted
[Diagram: table holding “programming” and “words”, with a removed entry marked X]
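The reason for the third state can be seen in a small sketch. It assumes linear probing and String keys; the class OpenAddressingSet and all of its members are hypothetical names chosen for illustration. A search stops at an empty slot but must keep probing past a deleted one, otherwise keys stored beyond the removed entry would become unreachable.

```java
import java.util.Arrays;

// Hypothetical open-addressing set for illustration; not the lecture's class.
final class OpenAddressingSet {
    private enum State { EMPTY, OCCUPIED, DELETED }

    private final String[] keys;
    private final State[] states;

    OpenAddressingSet(int capacity) {
        keys = new String[capacity];
        states = new State[capacity];
        Arrays.fill(states, State.EMPTY);
    }

    private int home(String key) {
        return (key.hashCode() & 0x7FFFFFFF) % keys.length;
    }

    // Returns the slot holding key, or -1 if it is not in the table.
    private int find(String key) {
        int start = home(key);
        for (int i = 0; i < keys.length; i++) {
            int slot = (start + i) % keys.length;
            if (states[slot] == State.EMPTY) return -1;   // probe chain ends here
            if (states[slot] == State.OCCUPIED && keys[slot].equals(key)) return slot;
            // DELETED: the chain may continue, so keep probing
        }
        return -1;
    }

    boolean contains(String key) { return find(key) >= 0; }

    boolean add(String key) {
        if (find(key) >= 0) return false;                 // already present
        int start = home(key);
        for (int i = 0; i < keys.length; i++) {
            int slot = (start + i) % keys.length;
            if (states[slot] != State.OCCUPIED) {         // reuse EMPTY or DELETED slots
                keys[slot] = key;
                states[slot] = State.OCCUPIED;
                return true;
            }
        }
        throw new IllegalStateException("table is full");
    }

    boolean remove(String key) {
        int slot = find(key);
        if (slot < 0) return false;
        keys[slot] = null;
        states[slot] = State.DELETED;                     // mark, do not reset to EMPTY
        return true;
    }
}
```

The strings "Aa" and "BB" have the same hashCode in Java, so they collide in any table. After adding both and removing "Aa", contains("BB") still succeeds only because the slot is marked deleted rather than empty.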
Clustering and Efficiency
As the hash table fills up collisions become more frequent
Clustering
  – primary clustering
  – secondary clustering
Efficiency decreases
Clustering: Solutions?
Make the table bigger than required
Use external hashing
Use more sophisticated probing techniques
  – helps decrease secondary clustering
Deletions
Decrease efficiency of searching
If deletions are frequent, need to consider rebuilding the table periodically
External Hashing
The hash function identifies a bucket into which the key should be placed
[Diagram: “programming” and “words” both hash to 59 (collision!) and are stored together in bucket 59]
External Hashing
Buckets hold several values
As the buckets fill up, efficiency decreases
  – no worse than probing
  – usually better (no “secondary clustering”)
Big advantage:
  – space is not limited by table size
Implementing the Buckets
Many possible approaches
  – linked lists
  – binary search trees
  – secondary hash tables!
We will use a simple unordered linked list
The ExternalHashTable Class
    public class ExternalHashTable<K, V> implements Dictionary<K, V>
      {
        private static final int DEF_SIZE = 101;

        private class EntryNode extends DictionaryPair<K, V>
          {
            EntryNode next;
          } // class EntryNode

        private EntryNode [] table;   // The buckets

        private int hash (K aKey)
        // Scale hash value for table size
          { ... } // hash
        ...
      } // class ExternalHashTable
Hashing Function
How do we handle the hashing function?
Java to the rescue!
  – Object class has hashCode method
    public int hashCode ()
Using the hashCode Method
Need to ensure it is non-negative and in correct range:
    private int hash (K aKey)
    // Scale hash value for table size
      {
        return ((aKey.hashCode() & 0x7FFFFFFF) % table.length);
      } // hash
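A short sketch of why the sign bit is masked off rather than using Math.abs: hashCode may return any int, and Math.abs(Integer.MIN_VALUE) is still Integer.MIN_VALUE, so the remainder could come out negative. The class name HashScaling and the method name scale are hypothetical, used only for this illustration.

```java
// Hypothetical helper for illustration; mirrors the scaling step of hash().
final class HashScaling {
    // Clearing the sign bit always yields a value in 0 .. Integer.MAX_VALUE,
    // so the remainder is always a valid index in 0 .. tableLength-1.
    static int scale(int hashCode, int tableLength) {
        return (hashCode & 0x7FFFFFFF) % tableLength;
    }

    // The tempting alternative: fails for Integer.MIN_VALUE, whose absolute
    // value cannot be represented as an int.
    static int scaleWithAbs(int hashCode, int tableLength) {
        return Math.abs(hashCode) % tableLength;
    }
}
```

With a table length of 101, scale(Integer.MIN_VALUE, 101) is a valid index, whereas scaleWithAbs(Integer.MIN_VALUE, 101) is negative and would cause an ArrayIndexOutOfBoundsException if used to index the table.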
The Buckets
[Diagram: each entry of table refers to a linked list of key/value nodes]
The insert Method
    public void insert (K aKey, V aValue)
    // Insert new element or update existing one
      {
        int index = hash(aKey);
        // Look for aKey in linked list
        EntryNode c;
        for (c = table[index]; c != null && ! c.getKey().equals(aKey); c = c.next)
          ;
        ...
      } // insert
The insert Method (cont.)
        ...
        if (c == null) // Insert new node
          {
            EntryNode n = new EntryNode (aKey, aValue);
            n.next = table[index];
            table[index] = n;
          }
        else // Update existing entry
          c.setValue(aValue);
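The two halves of insert can be put together in a self-contained sketch. Since the Dictionary interface and the DictionaryPair class are not shown in these slides, this version is a stand-in that does not use them: ChainedTable, Node, find and get are hypothetical names, and only the search-then-insert-or-update logic follows the slides.

```java
// Hypothetical stand-in for ExternalHashTable; names are illustrative only.
final class ChainedTable<K, V> {
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] table;   // the buckets

    @SuppressWarnings("unchecked")
    ChainedTable(int size) {
        table = (Node<K, V>[]) new Node[size];
    }

    private int hash(K key) {
        return (key.hashCode() & 0x7FFFFFFF) % table.length;
    }

    // Walk the bucket's list looking for key; null if it is absent.
    private Node<K, V> find(K key) {
        for (Node<K, V> c = table[hash(key)]; c != null; c = c.next) {
            if (c.key.equals(key)) return c;
        }
        return null;
    }

    // Insert new element or update existing one.
    void insert(K key, V value) {
        Node<K, V> c = find(key);
        if (c == null) {
            int index = hash(key);
            table[index] = new Node<>(key, value, table[index]);  // new node at head of list
        } else {
            c.value = value;
        }
    }

    V get(K key) {
        Node<K, V> c = find(key);
        return c == null ? null : c.value;
    }
}
```

Inserting at the head of the list keeps insertion cheap once the search has confirmed the key is absent, which suits the unordered list chosen for the buckets.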
Other Methods
Quite simple
Remember structure:
  – array of linked lists
Example:
    public boolean isEmpty ()
    // Tell whether hash table is empty
      {
        for (int k = 0; k < table.length; k++)
          if (table[k] != null)
            return false; // Found at least one entry
        return true; // Found no entries
      } // isEmpty
External Hash Table: Iterators
Slightly more complicated:
  – need to work through array
  – and work through linked lists
[Diagram: table array whose entries refer to linked lists of key/value nodes]
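One possible way to combine the two traversals is sketched below. It is not the lecture's iterator: BucketIterator and its nested Node class are hypothetical names, and the sketch iterates over keys only. The idea is to remember the current node and the next bucket index, and to skip over empty buckets whenever a list runs out.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical key iterator over an array of linked lists; for illustration only.
final class BucketIterator<K> implements Iterator<K> {
    static final class Node<K> {
        final K key;
        final Node<K> next;

        Node(K key, Node<K> next) {
            this.key = key;
            this.next = next;
        }
    }

    private final Node<K>[] table;
    private int index = 0;       // next bucket to inspect when the current list runs out
    private Node<K> current;     // node whose key next() will return, or null when done

    BucketIterator(Node<K>[] table) {
        this.table = table;
        advance();
    }

    // Move to the first node of the next non-empty bucket, if current is exhausted.
    private void advance() {
        while (current == null && index < table.length) {
            current = table[index++];
        }
    }

    @Override
    public boolean hasNext() {
        return current != null;
    }

    @Override
    public K next() {
        if (current == null) throw new NoSuchElementException();
        K key = current.key;
        current = current.next;  // work through the linked list
        advance();               // then work through the array
        return key;
    }
}
```

Keys come out in bucket order and, within a bucket, in list order, so the sequence is not sorted and depends on the hash function and table size.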