Hash Tables (Chapter 13) Part 2.


Performance analysis
The complexity of the linear open addressed hash table depends on the number of buckets. With modular hashing, the number of buckets is given by the divisor L. The best case is O(1) and the worst case is O(n). The worst case happens when all n data elements are mapped to the same bucket. On average, the hash table is much more efficient than a linear list.
It has been shown that the average case performance of a linear open addressed hash table is given by
$U_n \approx \frac{1}{2}\left(1 + \frac{1}{(1-\alpha)^2}\right)$
$S_n \approx \frac{1}{2}\left(1 + \frac{1}{1-\alpha}\right)$
where $U_n$ and $S_n$ are the number of buckets examined on average during an unsuccessful and a successful search respectively, and $\alpha = n/b$ is the loading factor. The smaller the loading factor, the better the performance.
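To get a feel for how quickly the search cost grows with the loading factor, the two approximations can be evaluated numerically. The following is a minimal sketch; the function names are chosen here for illustration, and the formulas only make sense for a loading factor strictly below 1.

```python
def unsuccessful_probes(alpha: float) -> float:
    """Approximate buckets examined in an unsuccessful search (U_n)."""
    return 0.5 * (1 + 1 / (1 - alpha) ** 2)


def successful_probes(alpha: float) -> float:
    """Approximate buckets examined in a successful search (S_n)."""
    return 0.5 * (1 + 1 / (1 - alpha))


# Tabulate both estimates for a few loading factors below 1.
for alpha in (0.25, 0.5, 0.75, 0.9):
    u = unsuccessful_probes(alpha)
    s = successful_probes(alpha)
    print(f"alpha={alpha:.2f}  U_n~{u:.2f}  S_n~{s:.2f}")
```

At a loading factor of 0.5 the estimates are 2.5 and 1.5 buckets, while at 0.9 an unsuccessful search is expected to examine roughly 50 buckets.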

Other collision resolution techniques with open addressing
Rehashing
Quadratic probing
Random probing

Rehashing
One of the major drawbacks of linear probing is clustering (long sequences of records with gaps in between the sequences). The hash table becomes inefficient when the load factor is too high, because clustering leads to longer sequential searches.
Solution: resorting or rehashing!
Build another table that is twice as big and has a new hash function, and move all elements from the smaller table to the bigger table.
Use a 2nd hash function to determine the slot where the key is to be accommodated. If that slot is not empty, another function is called for, and so on.

Rehashing
Thus, rehashing makes use of at least two functions H and H', where H(X) and H'(X) map a key X to any one of the b buckets. With exactly two functions, this scheme is also commonly called double hashing.
Steps for insertion:
1. H(X) is computed to obtain the position of the home bucket.
2. Accommodate key X in that bucket if it has an empty slot.
3. Otherwise, the 2nd hash function H'(X) is computed, and the search for a slot proceeds by
$h_i = (H(X) + i \cdot H'(X)) \bmod b$, where i = 1, 2, …
Here $h_1, h_2, \ldots, h_n$ is the search sequence examined before an empty slot is found to accommodate the key.
A good choice is H'(X) = M – (X mod M), where M is a prime number smaller than the hash table size.
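The search sequence can be sketched as a small function. This is only an illustration: the defaults b = 9 and M = 7 are assumed values, the name `probe_sequence` is made up here, and the home bucket H(X) is included as the first entry (i = 0) before the rehashed positions.

```python
def probe_sequence(x: int, b: int = 9, m: int = 7) -> list:
    """Return the first b buckets examined for key x.

    Entry 0 is the home bucket H(X) = x mod b; entry i is
    (H(X) + i * H'(X)) mod b with H'(X) = m - (x mod m).
    """
    h = x % b              # primary hash H(X)
    step = m - (x % m)     # secondary hash H'(X), always in 1..m
    return [(h + i * step) % b for i in range(b)]


print(probe_sequence(49))  # home bucket 4, then 2, 0, 7, ...
print(probe_sequence(85))  # home bucket 4, then 1, 7, 4, ...
```

With these assumed values, key 49 has a step of 7 and its sequence visits all nine buckets, whereas key 85 has a step of 6, which shares the factor 3 with b = 9, so its sequence cycles through buckets 4, 1 and 7 only.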


Rehashing
Consider the set of keys {11, 55, 13, 35, 71, 52, 61, 9, 86, 31, 49, 85, 70}, with hash function H(X) = X mod 9 and rehash function H'(X) = 7 – (X mod 7). The resulting hash table (two slots per bucket) is as follows:
HT
[0]   9
[1]  55  85
[2]  11  49
[3]
[4]  13  31
[5]  86  70
[6]
[7]  52  61
[8]  35  71
NOTE:
1. For X = 49: H(49) = 49 mod 9 = 4, which is FULL. Rehashing gives H'(49) = 7 – (49 mod 7) = 7 – 0 = 7. Therefore h1 = (H(49) + 1·H'(49)) mod 9 = (4 + 7) mod 9 = 2.
2. For X = 85: H(85) = 85 mod 9 = 4, which is also FULL. Rehashing gives H'(85) = 7 – (85 mod 7) = 7 – 1 = 6. So h1 = (H(85) + 1·H'(85)) mod 9 = (4 + 6) mod 9 = 1.
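The example can be replayed with a short sketch. It assumes two slots per bucket (as the table suggests, since bucket 4 is reported FULL after 13 and 31) and gives up after b probes; the function name `build_table` is hypothetical.

```python
def build_table(keys, b=9, m=7, slots=2):
    """Insert keys using H(X) = X mod b and H'(X) = m - (X mod m)."""
    table = [[] for _ in range(b)]
    for x in keys:
        h = x % b
        step = m - (x % m)
        for i in range(b):                 # i = 0 is the home bucket
            pos = (h + i * step) % b
            if len(table[pos]) < slots:    # bucket still has a free slot
                table[pos].append(x)
                break
        else:
            raise RuntimeError(f"no free slot found for key {x}")
    return table


keys = [11, 55, 13, 35, 71, 52, 61, 9, 86, 31, 49, 85, 70]
table = build_table(keys)
for index, bucket in enumerate(table):
    print(f"[{index}]", *bucket)
```

Keys 49, 85 and 70 are the only ones that find their home bucket full; they land in buckets 2, 1 and 5 respectively.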

Quadratic Probing
Quadratic probing is another method to reduce clustering.
When a collision occurs, the bucket location increases by a quadratic function: H+1^2, H+2^2, H+3^2, ... (all taken mod the table size) until an empty slot is found.
Consider the keys {17, 9, 34, 56, 11, 4, 71, 86, 55, 10, 39, 49, 52, 82, 31, 13, 22, 35, 44, 20, 60, 28} using H(X) = X mod 9, with three slots per bucket.
NOTE:
1. For 13: 13 mod 9 = 4 is FULL. So using quadratic probing: (4 + 1^2) mod 9 = 5.
2. For 44: 44 mod 9 = 8 is FULL. So: (8 + 1^2) mod 9 = 0.
3. For 28: 28 mod 9 = 1 is FULL. So:
(1 + 1^2) mod 9 = 2
(1 + 2^2) mod 9 = 5
(1 + 3^2) mod 9 = 1
(1 + 4^2) mod 9 = 8
(1 + 5^2) mod 9 = 8
Every bucket in this sequence is FULL, so 28 is not placed in the table.
HT
[0]   9  44
[1]  55  10  82
[2]  56  11  20
[3]  39
[4]   4  49  31
[5]  86  13  22
[6]  60
[7]  34  52
[8]  17  71  35
Note: there is no guarantee of finding an empty slot if the table size is not a prime number.
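The quadratic probing example can be checked with a sketch like the one below. It assumes three slots per bucket (as the table suggests) and stops after b probes, because the offsets i^2 mod b repeat with period b; the function name is hypothetical.

```python
def quadratic_insert(table, x, slots=3):
    """Try to place x using probes (H + i^2) mod b; return the bucket or None."""
    b = len(table)
    h = x % b
    for i in range(b):                     # i = 0 is the home bucket
        pos = (h + i * i) % b
        if len(table[pos]) < slots:
            table[pos].append(x)
            return pos
    return None                            # every probed bucket was full


keys = [17, 9, 34, 56, 11, 4, 71, 86, 55, 10, 39, 49,
        52, 82, 31, 13, 22, 35, 44, 20, 60, 28]
table = [[] for _ in range(9)]
unplaced = [x for x in keys if quadratic_insert(table, x) is None]

for index, bucket in enumerate(table):
    print(f"[{index}]", *bucket)
print("unplaced:", unplaced)
```

Under these assumptions key 28 only ever probes buckets 1, 2, 5 and 8, all of which are full, so it stays unplaced even though buckets 0, 3, 6 and 7 still have free slots.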

Chaining
Keep all synonyms that are mapped to the same bucket chained to it. In other words, every bucket is maintained as a singly linked list with synonyms represented as nodes.
The buckets continue to be represented as a sequential data structure as before, to favor the hash function computation.
Such a method of handling overflows is called chaining, open hashing, or separate chaining.


Chaining
Example: Let us consider the set of keys {45, 98, 12, 55, 46, 89, 65, 88, 36, 21} to be represented as a chained hash table. The hash function H used is H(X) = X mod 11. The hash function values of the keys are as shown below.
Key X   45  98  12  55  46  89  65  88  36  21
H(X)     1  10   1   0   2   1  10   0   3  10
The chained hash table is:
[0]   55  88
[1]   12  45  89
[2]   46
[3]   36
[4]
[5]
[6]
[7]
[8]
[9]
[10]  21  65  98
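The chained table above can be rebuilt with a few lines of code. This is a simplified sketch: each chain is modelled as a Python list rather than a singly linked list, and because the chains in the example appear in ascending key order, the sketch keeps each chain sorted with `bisect.insort`. That ordering is an assumption read off the table, not something the text requires.

```python
from bisect import insort


def build_chained_table(keys, b=11):
    """Group keys into b chains using H(X) = X mod b, each chain kept sorted."""
    table = [[] for _ in range(b)]
    for x in keys:
        insort(table[x % b], x)   # insert x while keeping the chain ordered
    return table


keys = [45, 98, 12, 55, 46, 89, 65, 88, 36, 21]
table = build_chained_table(keys)
for index, chain in enumerate(table):
    print(f"[{index}]", *chain)
```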

Operations on chained hash tables
Search → Search for a key X in the chained hash table. Steps: access the bucket given by H(X), then perform a sequential search down the chain until the key is found (successful) or the end of the chain is reached (unsuccessful).
Insert → Insert a key X into the hash table. Steps: compute the hash function H(X) to determine the bucket. In the case of a collision, the new key can be inserted either at the beginning or at the end of the chain, leaving the list unordered.
Delete → The deletion of a key X is elegantly done: just search for the key X and delete its node from the chain.
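Insertion and deletion on a chain amount to relinking nodes. The sketch below is one possible way to write them, assuming H(X) = X mod b, insertion at the beginning of the chain, and hypothetical names (`Node`, `chain_insert`, `chain_delete`, `chain_keys`).

```python
class Node:
    """One node of a singly linked chain."""
    def __init__(self, data, link=None):
        self.data = data
        self.link = link


def chain_insert(table, x):
    """Insert x at the beginning of its chain (chain stays unordered)."""
    h = x % len(table)
    table[h] = Node(x, table[h])


def chain_delete(table, x):
    """Unlink the node holding x; return True if a node was removed."""
    h = x % len(table)
    prev, node = None, table[h]
    while node is not None and node.data != x:
        prev, node = node, node.link
    if node is None:
        return False              # key is not in the chain
    if prev is None:
        table[h] = node.link      # deleted node was the head
    else:
        prev.link = node.link     # bypass the deleted node
    return True


def chain_keys(table, h):
    """Collect the keys of chain h from head to tail."""
    out, node = [], table[h]
    while node is not None:
        out.append(node.data)
        node = node.link
    return out


table = [None] * 11
for key in [45, 98, 12, 55, 46, 89, 65, 88, 36, 21]:
    chain_insert(table, key)
print(chain_keys(table, 1))       # keys 45, 12, 89 in reverse insertion order
chain_delete(table, 12)
print(chain_keys(table, 1))
```

Inserting at the head keeps insertion at constant cost per key; keeping the chain ordered instead would shorten unsuccessful searches at the price of a scan on every insert.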

Algorithm 13.2: Procedure to search for a key X in a chained hash table
procedure CHAIN_HASH_SEARCH(HT, b, X)
/* HT[0:b-1] is the hash table implemented as a one dimensional array of pointers to buckets. Here b is the number of buckets. X is the key to be searched in the hash table. In case of unsuccessful search, the procedure prints the message "KEY not found", otherwise it prints "KEY found" */
    h = H(X);        /* H(X) is the hash function computed on X */
    TEMP = HT[h];    /* TEMP is the pointer to the first node in the chain */
    while (TEMP ≠ NIL and DATA(TEMP) ≠ X) do    /* test for NIL first, then search for the key down the chain */
        TEMP = LINK(TEMP);
    endwhile
    if (TEMP == NIL) then print ("KEY not found");
    else print ("KEY found");
end CHAIN_HASH_SEARCH.
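For readers who prefer runnable code, the search procedure can be transcribed roughly as follows. This is a sketch, not the textbook's implementation: it assumes H(X) = X mod b, returns a boolean instead of printing a message, and uses a hypothetical `Node` class for the chain. The end-of-chain test comes before the data comparison so that an empty chain is never dereferenced.

```python
class Node:
    """One node of a singly linked chain."""
    def __init__(self, data, link=None):
        self.data = data
        self.link = link


def chain_hash_search(table, x):
    """Return True if key x is present in the chained hash table."""
    temp = table[x % len(table)]          # first node of the chain for H(x)
    while temp is not None and temp.data != x:
        temp = temp.link                  # move down the chain
    return temp is not None


# A small table with b = 11 buckets and two non-empty chains.
table = [None] * 11
table[1] = Node(12, Node(45, Node(89)))
table[3] = Node(36)

print(chain_hash_search(table, 45))   # key in the middle of chain 1
print(chain_hash_search(table, 23))   # hashes to chain 1 but is absent
```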

APPLICATIONS
Representation of a keyword table in a compiler
Evaluation of a join operation on relational databases
Direct file organization
