Podcast Ch21a Title: Hash Functions


1 Podcast Ch21a Title: Hash Functions
Description: Introduction to hashing; designing hash functions
Participants: Barry Kurtz (instructor); John Helfert and Tobie Williams (students)
Textbook: Data Structures for Java; William H. Ford and William R. Topp

2 Introduction to Hashing
A hash table distributes elements in a series of linked lists, referred to as buckets. A hash function maps a value to an index in the table. The function provides access to an element much like an index provides access to an array element. Like a binary search tree, a hash table provides an implementation of the Set and Map interfaces.

3 Introduction to Hashing (continued)
A binary search tree can access data stored by value with O(log₂ n) average search time. We would like to design a storage structure that yields O(1) average retrieval time. In this way, access to an item is independent of the number of other items in the collection.

4 Introduction to Hashing (continued)
A hash table is an array of references. Associated with the table is a hash function that takes a key as an argument and returns an integer value. By using the remainder after dividing the hash value by the table size, we have a mapping of the key to an index in the table.

5 Introduction to Hashing (concluded)
Hash value: hf(key) = hashValue
Hash table index: hashValue % n
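The pieces introduced so far (a table of buckets, a hash function, and the index hashValue % n) can be sketched as follows. This is only an illustrative sketch, not the textbook's implementation: the class name BucketTableSketch and its methods are hypothetical, the hash function is assumed to be the identity, keys are assumed to be nonnegative integers, and an ArrayList of linked lists stands in for the array of references.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch of a hash table whose buckets are linked lists.
final class BucketTableSketch {
    private final List<LinkedList<Integer>> table = new ArrayList<>();
    private final int n; // table size

    BucketTableSketch(int n) {
        this.n = n;
        for (int i = 0; i < n; i++) {
            table.add(new LinkedList<>()); // one empty bucket per index
        }
    }

    // Assumed hash function: the identity on nonnegative integers.
    static int hf(int key) {
        return key;
    }

    // Table index = hashValue % n (valid only for nonnegative hash values).
    int indexOf(int key) {
        return hf(key) % n;
    }

    // Adds the key to its bucket; returns false if it is already present.
    boolean add(int key) {
        LinkedList<Integer> bucket = table.get(indexOf(key));
        if (bucket.contains(key)) {
            return false;
        }
        bucket.add(key);
        return true;
    }

    // Searches only the one bucket that the key maps to.
    boolean contains(int key) {
        return table.get(indexOf(key)).contains(key);
    }
}
```

A lookup scans a single bucket rather than the whole collection, which is where the O(1) average retrieval time comes from when the buckets stay short.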

6 Hashing is an ______ algorithm.
(a) O(log₂ n) (b) O(n²) (c) O(n) (d) O(1)

7 Using a Hash Function
Consider the hash function hf(x) = x, where x is a nonnegative integer (the identity function). Assume the table is the array tableEntry with n = 7 elements.

8 Using a Hash Function (concluded)
With hash function hf() and table size n, the table index for a key is i = hf(key)%n. Collisions occur for any two keys that differ by a multiple of n.
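The collision rule above can be checked with a few keys. The sketch below assumes the identity hash function and n = 7 from the preceding slide; the class name IdentityHashDemo and the sample keys are made up for illustration.

```java
// Hypothetical demo: identity hash function with a table of size 7.
final class IdentityHashDemo {
    // hf(x) = x for nonnegative x
    static int hf(int x) {
        return x;
    }

    // i = hf(key) % n
    static int index(int key, int n) {
        return hf(key) % n;
    }

    public static void main(String[] args) {
        int n = 7;
        int[] keys = {5, 12, 19, 22};
        for (int key : keys) {
            System.out.println(key + " -> index " + index(key, n));
        }
    }
}
```

The keys 5, 12 and 19 differ by multiples of 7, so all three land at index 5, while 22 lands at index 1.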

9 Designing Hash Functions
Some general design principles guide the creation of all hash functions. Evaluating a hash function should be efficient. A hash function should produce uniformly distributed hash values. This spreads the hash table indices around the table, which helps minimize collisions. The Java programming language provides a general hashing function with the hashCode() method in the Object superclass.

10 Given a set of keys k₀, k₁, …, kₙ₋₁, a __________ is a hash function that produces no collisions.
(a) closed probe hash (b) sparse distribution (c) non-link hash (d) perfect hashing function

11 Designing Hash Functions (continued)
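Object's hashCode() typically converts the internal address of the object into an integer value. This has limited application, since two different objects will normally have different values for hashCode(), even if they store the same data.
// one and two are distinct objects that store the same data.
// Under Object's hashCode(), the integer values one.hashCode() and
// two.hashCode() would normally differ. String overrides hashCode()
// to hash its characters, so here the two values are in fact equal.
String one = new String("java"), two = new String("java");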
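To make the limitation concrete, the sketch below compares a class that inherits Object's hashCode() with one that hashes its data. Both classes (PlainPoint and ValuePoint) and the 31*x + y formula are hypothetical, chosen only for illustration.

```java
// Hypothetical demo: inherited hashCode() versus a data-based override.
final class HashCodeIdentityDemo {
    // Inherits equals() and hashCode() from Object (identity-based).
    static final class PlainPoint {
        final int x, y;
        PlainPoint(int x, int y) { this.x = x; this.y = y; }
    }

    // Overrides equals() and hashCode() so that they depend on the data.
    static final class ValuePoint {
        final int x, y;
        ValuePoint(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ValuePoint)) return false;
            ValuePoint p = (ValuePoint) o;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() {
            return 31 * x + y; // same data always gives the same hash value
        }
    }

    public static void main(String[] args) {
        // These two values will normally differ from each other.
        System.out.println(new PlainPoint(1, 2).hashCode());
        System.out.println(new PlainPoint(1, 2).hashCode());
        // These two values are always equal.
        System.out.println(new ValuePoint(1, 2).hashCode());
        System.out.println(new ValuePoint(1, 2).hashCode());
    }
}
```

A hash table can only find an element by value if objects with equal data produce equal hash values, which is why classes used as keys override hashCode() together with equals().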

12 Consider the following hashing function
Assume that s is a string variable in the class that defines hashCode().
public int hashCode() {
    int hashval = 0;
    // add up the character codes of s
    for (int i = 0; i < s.length(); i++)
        hashval += s.charAt(i);
    return hashval;
}
Give two words of length four that hash to the same value. How could you improve this hash function?

13 Designing Hash Functions (continued)
In the majority of hash-table applications, the key is a string. To create an efficient hash function, we must combine the sequence of characters in the string to form an integer. As before, s is the string being hashed.
public int hashCode() {
    int hash = 0;
    // treat the characters as coefficients of a polynomial in 31
    for (int i = 0; i < s.length(); i++)
        hash = 31*hash + s.charAt(i);
    return hash;
}
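A short comparison shows why multiplying by 31 helps. A hash that only sums character codes ignores character order, so anagrams collide, while the multiply-by-31 scheme makes position matter. The class name StringHashCompare and the sample words are assumptions made for this sketch.

```java
// Hypothetical comparison of a summing hash and the multiply-by-31 hash.
final class StringHashCompare {
    // Sums the character codes; the order of characters has no effect.
    static int additiveHash(String s) {
        int hashval = 0;
        for (int i = 0; i < s.length(); i++) {
            hashval += s.charAt(i);
        }
        return hashval;
    }

    // Multiplies the running value by 31 before adding each character.
    static int polynomialHash(String s) {
        int hash = 0;
        for (int i = 0; i < s.length(); i++) {
            hash = 31 * hash + s.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        for (String w : new String[] {"stop", "post"}) {
            System.out.println(w + ": additive = " + additiveHash(w)
                    + ", polynomial = " + polynomialHash(w));
        }
    }
}
```

The words "stop" and "post" share an additive hash value but receive different polynomial hash values. The polynomial loop is the same computation that Java's String.hashCode() performs.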

14 Designing Hash Functions (concluded)
The following are hash code values for three different strings. The value for string strB is a negative number due to integer overflow.
String strA = "and", strB = "uncharacteristically", strC = "algorithm";
hashValue = strA.hashCode(); // hashValue = 96727
hashValue = strB.hashCode(); // hashValue =
hashValue = strC.hashCode(); // hashValue =
In general, a hash function may result in integer overflow and return a negative number. The following calculation ensures that the table index is nonnegative.
tableIndex = (hashValue & Integer.MAX_VALUE) % tableSize
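The masking step can be tried in isolation. In the sketch below, the class name TableIndexDemo, the table size 11, and the sample value -7 are assumptions chosen for illustration. Masking with Integer.MAX_VALUE clears the sign bit, so the remainder is taken of a nonnegative number.

```java
// Hypothetical demo of mapping a possibly negative hash value to an index.
final class TableIndexDemo {
    // Clears the sign bit, then reduces modulo the table size.
    static int tableIndex(int hashValue, int tableSize) {
        return (hashValue & Integer.MAX_VALUE) % tableSize;
    }

    public static void main(String[] args) {
        int tableSize = 11;
        String[] keys = {"and", "uncharacteristically", "algorithm"};
        for (String key : keys) {
            int hashValue = key.hashCode();
            System.out.println(key + ": hashValue = " + hashValue
                    + ", tableIndex = " + tableIndex(hashValue, tableSize));
        }
        // In Java, % keeps the sign of the dividend, so this prints -7.
        System.out.println(-7 % tableSize);
        // After masking, the index is always in the range 0..tableSize-1.
        System.out.println(tableIndex(-7, tableSize));
    }
}
```

Without the mask, a negative hash value would give a negative remainder and an out-of-bounds array index.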

15 User-Defined Hash Functions
To create a custom hash function, a class overrides the method hashCode(). For the Time24 class, the hash value for an object is its time converted to minutes. Since hour and minute are normalized to fall within the ranges 0 to 23 and 0 to 59 respectively, each time has a unique hash value.
public int hashCode() {
    // hash value is time in minutes;
    // as normalized time, value is nonnegative
    return hour*60 + minute;
}
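The effect of this override can be seen by placing times in a HashSet. The class Time24Sketch below is a hypothetical stand-in, not the textbook's Time24: its constructor, normalization, equals() method, and the helper distinctCount are assumptions made so the example is self-contained.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a 24-hour time class with a custom hashCode().
final class Time24Sketch {
    private final int hour;   // normalized to 0..23
    private final int minute; // normalized to 0..59

    Time24Sketch(int hour, int minute) {
        // Normalize, so that for example 0:90 becomes 1:30.
        int total = Math.floorMod(hour * 60 + minute, 24 * 60);
        this.hour = total / 60;
        this.minute = total % 60;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Time24Sketch)) return false;
        Time24Sketch t = (Time24Sketch) o;
        return hour == t.hour && minute == t.minute;
    }

    @Override
    public int hashCode() {
        // Hash value is the time in minutes, in the range 0..1439.
        return hour * 60 + minute;
    }

    // Counts distinct times by inserting them into a hash-based set.
    static int distinctCount(Time24Sketch... times) {
        Set<Time24Sketch> set = new HashSet<>();
        for (Time24Sketch t : times) {
            set.add(t);
        }
        return set.size();
    }
}
```

Because equal times produce equal hash values, a call such as distinctCount(new Time24Sketch(1, 30), new Time24Sketch(0, 90)) treats the two arguments as the same element.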

16 User-Defined Hash Functions (continued)
The custom hash function for Product objects must mix the bits of the serial number to create a random-looking value.
public class Product {
    // last 4 digits record year in which the product was made.
    // identity hash function is not sufficient
    private int serialNum;
    ...
}

17 User-Defined Hash Functions
public class Product {
    private int serialNum;
    ...
    public int hashCode() {
        // assign serialNum to a long variable
        long hashValue = serialNum;
        // square to obtain a nonnegative long integer
        hashValue *= hashValue;
        // return the remainder after dividing
        // by the largest int value; its bits
        // are "jumbled up"
        return (int)(hashValue % Integer.MAX_VALUE);
    }
}

