1 Introduction to Hashing - Hash Functions Sections 5.1, 5.2, and 5.6.


1 Introduction to Hashing - Hash Functions (Sections 5.1, 5.2, and 5.6)

2 Hashing
Data items are stored in an array of some fixed size
  – Hash table
Search is performed using some part of the data item
  – Key
Used for performing insertions, deletions, and finds in constant average time
Operations requiring ordering information are not supported efficiently
  – Such as findMin, findMax

3 Hash Table Example

4 Hash Table Applications
Comparing search efficiency of different data structures:
  – Vector, list: O(N)
  – AVL search tree: O(log N)
  – Hash table: O(1) expected time
Compilers, to keep track of declared variables
  – Symbol tables
  – Mapping from name to id
On-line spelling checkers

5 Hash Functions
Map keys to integers (which represent table indices)
  – Hash(Key) = Integer
  – Index values should be evenly distributed, even if the input data is not
What happens if multiple keys are mapped to the same integer (same position)?
  – Collision management (discussed in detail later)
  – Collisions are less likely if keys are evenly distributed over the hash table

6 Simple Hash Functions
Assumptions:
  – K: an unsigned 32-bit integer
  – M: the number of buckets (the number of entries in the hash table)
Goal:
  – If one bit of K changes, every bit of Hash(K) is equally likely to change
  – So that items are evenly distributed in the hash table
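The goal above can be probed with a small experiment: flip one bit of K and count how many bits of Hash(K) change. The sketch below is only an illustration; `modHash16` and `changedOutputBits` are hypothetical names, and M = 16 is an assumed table size chosen because it makes the weakness easy to see (K % 16 depends only on the low 4 bits of K).

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical example hash with M = 16: keeps only the low 4 bits of K.
std::uint32_t modHash16(std::uint32_t k) { return k % 16u; }

// Counts how many output bits differ when input bit `bit` (0..31) of k is flipped.
int changedOutputBits(std::uint32_t (*hash)(std::uint32_t), std::uint32_t k, int bit) {
    std::uint32_t flipped = k ^ (std::uint32_t{1} << bit);
    return static_cast<int>(std::bitset<32>(hash(k) ^ hash(flipped)).count());
}
```

For example, `changedOutputBits(modHash16, 12345u, 8)` is 0: flipping bit 8 of K never affects K % 16, so this hash falls well short of the stated goal.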

7 A Simple Function
What if
  – Hash(K) = K % M
  – where M is an arbitrary integer?
What is wrong?
Values of K may not be evenly distributed
  – But Hash(K) needs to be evenly distributed
Suppose
  – M = 10
  – K = 10, 20, 30, 40
Then K % M = 0, 0, 0, 0: every key lands in the same bucket

8 Another Simple Function
If
  – Hash(K) = K % P, where P is a prime number
Suppose
  – P = 11
  – K = 10, 20, 30, 40
Then K % P = 10, 9, 8, 7
More uniform distribution
So hash tables often have a prime number of entries
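The effect of the modulus can be checked by counting how many keys fall into each bucket. This is a minimal sketch using the keys 10, 20, 30, 40 with moduli 10 and 11; the helper names `bucketCounts` and `usedBuckets` are made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Returns how many keys land in each of the m buckets under Hash(K) = K % m.
std::vector<int> bucketCounts(const std::vector<unsigned int>& keys, unsigned int m) {
    std::vector<int> counts(m, 0);
    for (unsigned int k : keys) {
        ++counts[k % m];
    }
    return counts;
}

// Returns the number of buckets that hold at least one key.
std::size_t usedBuckets(const std::vector<int>& counts) {
    std::size_t used = 0;
    for (int c : counts) {
        if (c > 0) ++used;
    }
    return used;
}
```

With `keys = {10, 20, 30, 40}`, `usedBuckets(bucketCounts(keys, 10))` is 1 (all four keys collide in bucket 0), while `usedBuckets(bucketCounts(keys, 11))` is 4.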

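9 A Simple Hash for Strings
    unsigned int Hash(const string& Key) {
        unsigned int hash = 0;
        for (size_t j = 0; j != Key.size(); ++j) {
            hash += Key[j];  // add the character code of each character
        }
        return hash;
    }
Problem: short keys produce small sums, so they may use only a small fraction of a large hash table (for example, eight ASCII characters sum to at most 8 × 127 = 1016)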
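The limited range of the additive hash is easy to observe. The sketch below assumes ASCII keys and uses the hypothetical name `sumHash`; it casts through `unsigned char` so that the result does not depend on whether `char` is signed.

```cpp
#include <cstddef>
#include <string>

// Additive hash: the sum of the character codes in the key.
unsigned int sumHash(const std::string& key) {
    unsigned int hash = 0;
    for (std::size_t j = 0; j != key.size(); ++j) {
        hash += static_cast<unsigned char>(key[j]);
    }
    return hash;
}
```

Under ASCII, `sumHash("zzzzzzzz")` is 8 × 122 = 976, so eight-letter lowercase keys never reach an index above 976 no matter how large the table is. The sum also ignores character order, so anagrams such as "listen" and "silent" always collide.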

10 Another Simple Hash Function
    // Assumes Key has at least three characters
    unsigned int Hash(const string& Key) {
        return Key[0] + 27 * Key[1] + 729 * Key[2];
    }
Problem: English does not use random strings, so the hash values are not uniformly distributed
  – Using more characters of the key can improve the hash function

11 A Better Hash Function
    unsigned int Hash(const string& Key) {
        unsigned int hash = 0;
        for (size_t j = 0; j != Key.size(); ++j)
            hash = 37 * hash + (Key[j] - 'a' + 1);
        return hash % TableSize;
    }
The for loop computes $\sum_{i=0}^{n} a_i \, 37^{n-i}$ using Horner's rule, where $a_i$ has the value 1 for 'a', 2 for 'b', etc.
  – For $n = 3$: $a_3 + 37 a_2 + 37^2 a_1 + 37^3 a_0 = 37(37(37 a_0 + a_1) + a_2) + a_3$
The for loop implicitly performs arithmetic modulo $2^k$, where k is the number of bits in an unsigned int
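To see that Horner's rule really evaluates the polynomial, the sketch below computes the same value two ways: once with the nested form and once by summing explicit powers of 37. The names `hornerHash` and `powerSumHash` are hypothetical, and the final reduction by the table size is left out so the two raw values can be compared directly. Both rely on unsigned arithmetic wrapping modulo 2^k, which is well defined in C++.

```cpp
#include <cstddef>
#include <string>

// Horner's rule: one multiplication and one addition per character.
unsigned int hornerHash(const std::string& key) {
    unsigned int hash = 0;
    for (std::size_t j = 0; j != key.size(); ++j) {
        hash = 37 * hash + static_cast<unsigned int>(key[j] - 'a' + 1);
    }
    return hash;
}

// Direct evaluation of the same polynomial: the sum of a_i * 37^(n-i).
unsigned int powerSumHash(const std::string& key) {
    unsigned int hash = 0;
    unsigned int power = 1;  // 37^0, the weight of the last character
    for (std::size_t j = key.size(); j-- > 0; ) {
        hash += static_cast<unsigned int>(key[j] - 'a' + 1) * power;
        power *= 37;
    }
    return hash;
}
```

For the key "abcd" (a_0 = 1, a_1 = 2, a_2 = 3, a_3 = 4) both functions return 37^3 + 2·37^2 + 3·37 + 4 = 53506. For longer keys the intermediate values overflow, but the two results still agree because both are computed modulo 2^k.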

12 STL Hash Tables
STL extensions
  – hash_set
  – hash_map
The key type, hash function, and equality operator may need to be provided
Available in the new standard as unordered_set and unordered_map
Example: Lec24/hashmapex.cpp
  – Reference: www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1456.html
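As a rough illustration of supplying a hash function to a standard hash table, the sketch below uses `std::unordered_map` from C++11 as a small symbol table mapping names to ids. It is not the contents of Lec24/hashmapex.cpp; `PolyHash`, `SymbolTable`, and `internSymbol` are hypothetical names, and the equality operator is left at its default (`std::equal_to`, which uses `operator==` on the strings).

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical hash functor based on the base-37 polynomial idea.
struct PolyHash {
    std::size_t operator()(const std::string& key) const {
        std::size_t hash = 0;
        for (char c : key) {
            hash = 37 * hash + static_cast<unsigned char>(c);
        }
        return hash;  // the container reduces this to a bucket index itself
    }
};

using SymbolTable = std::unordered_map<std::string, int, PolyHash>;

// Returns the id for name, assigning the next free id the first time it is seen.
int internSymbol(SymbolTable& table, const std::string& name) {
    auto it = table.find(name);
    if (it != table.end()) return it->second;
    int id = static_cast<int>(table.size());
    table.emplace(name, id);
    return id;
}
```

Calling `internSymbol` with "x", then "y", then "x" again on an empty table yields ids 0, 1, and 0. For `std::string` keys the default `std::hash<std::string>` would also work, so a custom functor is only needed for key types without a standard hash or when a specific hash is wanted.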

