Hashing and Hash Tables

1 Hashing and Hash Tables
Binhai Zhu, Computer Science Department, Montana State University. 11/12/2018

3 Motivation
What are the dictionary operations? (1) Insert (2) Delete (3) Search (most of the time, we will focus on search)

5 Objective
Searching takes Θ(n) time in the worst case when the data are unorganized. Even binary search on sorted data takes Θ(log n) time. Our objective? O(1) time on average using hashing, under a reasonable assumption.

6 Definitions
A hash table generalizes an ordinary array, whose slots can be addressed directly by index, so let's first talk about the direct-address table. The universe of keys is U = {0,1,2,…,m-1}, and no two elements have the same key. To represent a dynamic set, we use an array, or direct-address table, T[0..m-1], in which each position (slot) corresponds to a key in the universe.

7 Definitions
To represent a dynamic set, we use an array, or direct-address table, T[0..m-1], in which each position (slot) corresponds to a key in the universe.
[Figure: direct-address table T[0..9]. The actual keys K = {2, 3, 5, 8} are drawn from the universe of keys U; slots 2, 3, 5 and 8 hold a key together with its satellite data, and every other slot is empty ("/").]

12 Direct-Address-Search(T,k): return T[k]
With a direct-address table T[0..m-1], how do we search/insert/delete an element x with key k?
Direct-Address-Search(T,k): return T[k]
Direct-Address-Insert(T,x): T[key[x]] ← x
Direct-Address-Delete(T,x): T[key[x]] ← Nil
Each operation takes O(1) time!
[Figure: direct-address table T with universe of keys U, actual keys K, and satellite data stored in the occupied slots.]
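The three operations above can be sketched in Python as follows. This is a minimal illustration, not the presenter's code: the class name `DirectAddressTable`, the `(key, data)` tuple layout, and the sample satellite strings are assumptions made for the example, and `None` stands in for Nil.

```python
class DirectAddressTable:
    """Direct-address table over the key universe {0, 1, ..., m-1}."""

    def __init__(self, m):
        self.slots = [None] * m  # None plays the role of Nil

    def search(self, k):
        # Direct-Address-Search(T, k): return T[k]
        return self.slots[k]

    def insert(self, key, data):
        # Direct-Address-Insert(T, x): T[key[x]] <- x
        self.slots[key] = (key, data)

    def delete(self, key):
        # Direct-Address-Delete(T, x): T[key[x]] <- Nil
        self.slots[key] = None


table = DirectAddressTable(10)
# Keys 2, 3, 5, 8 as in the figure; the satellite strings are made up.
for key, data in [(2, "b"), (3, "c"), (5, "e"), (8, "h")]:
    table.insert(key, data)
table.delete(5)
print(table.search(3))  # (3, 'c')
print(table.search(5))  # None
```

Every operation is a single list index, which is where the O(1) bound comes from; the price is one slot per key in the universe.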

13 Direct-Address-Search(T,k): return T[k]
With a direct-address table T[0..m-1], search, insert and delete each reduce to a single array access. Problem? The table needs one slot for every key in the universe U, which is wasteful, or outright impossible, when U is much larger than the set K of keys actually stored.
[Figure: the same direct-address table T with universe of keys U and actual keys K.]

15 Hash Table
With direct addressing, an element with key k is stored in slot k. With hashing, it is stored in slot h(k), where h is called a hash function. h maps the universe U of keys into the slots of a hash table T[0..m-1]:
h : U → {0,1,…,m-1}
If h(5) = h(8), keys 5 and 8 compete for the same slot: a collision!
[Figure: h maps the actual keys K from the universe U into the slots of T; keys 5 and 8 map to the same slot, which is marked X.]

16 Collision
Two keys hash to the same slot: a collision. While collisions are hard to avoid, a carefully designed hash function can at least decrease the chance of a collision (and in some cases avoid collisions altogether).
[Figure: h(5) = h(8), so keys 5 and 8 map to the same slot of T, marked X.]

17 Collision Resolution by Chaining
In chaining, all elements that hash to the same slot are kept in a linked list, and the slot holds a pointer to the head of that list.
[Figure: with h(5) = h(8), keys 5 and 8 sit in the same linked list hanging off one slot of T; empty slots are marked "/".]

21 Collision Resolution by Chaining
Chained-Hash-Insert(T,x): insert x at the head of list T[h(key[x])]
Chained-Hash-Search(T,k): search for an element with key k in list T[h(k)]
Chained-Hash-Delete(T,x): delete x from the list T[h(key[x])]
Time?
[Figure: hash table T with chaining; keys 5 and 8, with h(5) = h(8), share one list.]

22 Collision Resolution by Chaining
Example: Let h(k) = k mod 11; insert 5, 28, 19, 15, 20, 33, 12, 17, 39, 11 into T[0..10].
T[0]:  11 → 33
T[1]:  12
T[2]:  /
T[3]:  /
T[4]:  15
T[5]:  5
T[6]:  39 → 17 → 28
T[7]:  /
T[8]:  19
T[9]:  20
T[10]: /
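The chaining example can be reproduced with a short Python sketch. The class name `ChainedHashTable` is an assumption for illustration, and Python lists stand in for the linked lists (so inserting at the head is not truly O(1) here, unlike with a real linked list).

```python
class ChainedHashTable:
    """Hash table with collision resolution by chaining, h(k) = k mod m."""

    def __init__(self, m):
        self.m = m
        # One chain per slot; a Python list stands in for a linked list.
        self.slots = [[] for _ in range(m)]

    def _h(self, k):
        return k % self.m

    def insert(self, k):
        # Chained-Hash-Insert: put the key at the head of its chain.
        self.slots[self._h(k)].insert(0, k)

    def search(self, k):
        # Chained-Hash-Search: scan only the chain for slot h(k).
        return k in self.slots[self._h(k)]

    def delete(self, k):
        # Chained-Hash-Delete: remove the key from its chain
        # (raises ValueError if the key is absent).
        self.slots[self._h(k)].remove(k)


table = ChainedHashTable(11)
for key in [5, 28, 19, 15, 20, 33, 12, 17, 39, 11]:
    table.insert(key)
print(table.slots[0])  # [11, 33]
print(table.slots[6])  # [39, 17, 28]
```

Because each key goes to the head of its chain, the most recently inserted key in a slot appears first, which is why slot 6 reads 39, 17, 28.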

23 Hash function
A hash function that causes no collisions is called a perfect hash function. A good hash function is one that satisfies simple uniform hashing: each key is equally likely to hash to any of the m slots. (It is difficult to check this condition, though.) Now let's see some examples of hash functions. Assume that all keys can be represented as natural numbers.

29 Famous Examples of Hash Functions
Division: h(k) = k mod m, where m should be a prime number, preferably not too close to a power of 2.
Multiplication: h(k) = ⌊m(kA mod 1)⌋, A = (√5 – 1)/2 = 0.6180339887…
Example. k = 123456, m = 10000.
h(k) = ⌊10000 (123456 × 0.6180339887… mod 1)⌋
     = ⌊10000 (76300.0041151… mod 1)⌋
     = ⌊10000 × 0.0041151…⌋
     = ⌊41.151…⌋
     = 41
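The division and multiplication methods can be sketched in Python as below. The function names are illustrative, and the sample key 123456 is an assumption chosen because it reproduces the intermediate value 41.151… shown in the worked example; floating-point arithmetic is used, which is adequate for keys of this size but not for arbitrarily large ones.

```python
import math

A = (math.sqrt(5) - 1) / 2  # 0.6180339887...


def hash_division(k, m):
    """Division method: h(k) = k mod m."""
    return k % m


def hash_multiplication(k, m, a=A):
    """Multiplication method: h(k) = floor(m * (k*a mod 1))."""
    frac = (k * a) % 1.0  # fractional part of k*a
    return math.floor(m * frac)


print(hash_division(28, 11))               # 6
print(hash_multiplication(123456, 10000))  # 41
```

Unlike the division method, the multiplication method does not depend critically on the choice of m, since the spreading is done by the fractional part of kA.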

31 Famous Examples of Hash Functions
Folding: The key is divided into several parts. These parts are combined or folded together and are transformed in a certain way to create the target address.
Example 1. Shift folding: 123-45-6789 (SSN) is split into 123, 456 and 789; 123 + 456 + 789 = 1368, and 1368 mod 1000 = 368.
Example 2. Boundary folding: the middle part is reversed before adding; 123 + 654 + 789 = 1566, and 1566 mod 1000 = 566.

33 Famous Examples of Hash Functions
Mid-square function: the key is squared and the middle part of the result is taken as the address.
Example. k = 3121, k² = 9740641, so h(k) = 406. You can also encode the square in binary and take the middle part.

34 Famous Examples of Hash Functions
Extraction: Only a part of the key is used to compute the address.
Example: for the key 123-45-6789, the first 4 digits are 1234 and the last 4 digits are 6789; concatenating the first 2 digits of 1234 with the last 2 digits of 6789, we have 1289.

37 Famous Examples of Hash Functions
Radix transformation: k is transformed into another number base.
Example: 345₁₀ = 423₉, then 423 mod 100 = 23. 264₁₀ = 323₉, then 323 mod 100 = 23.
Collision is hard to avoid in the worst case!

38 Famous Examples of Hash Functions
Division: h(k) = k mod m, where m should be a prime number, preferably not too close to a power of 2.
Multiplication: h(k) = ⌊m(kA mod 1)⌋, A = (√5 – 1)/2 = 0.6180339887…
Folding: The key is divided into several parts. These parts are combined or folded together and are transformed in a certain way to create the target address.
Mid-square function: the key is squared and the middle part of the result is taken as the address.
Extraction: Only a part of the key is used to compute the address.
Radix transformation: k is transformed into another number base.
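The folding, mid-square and radix-transformation methods can be sketched as small Python functions. The function names, default part length, and table sizes are illustrative assumptions, and the nine-digit sample key "123456789" is assumed because it is consistent with the sums 1368 and 1566 quoted for folding.

```python
def _parts(key_digits, part_len):
    """Split a digit string into consecutive parts of part_len digits."""
    return [key_digits[i:i + part_len]
            for i in range(0, len(key_digits), part_len)]


def shift_fold(key_digits, part_len=3, table_size=1000):
    """Shift folding: add the parts as they are."""
    return sum(int(p) for p in _parts(key_digits, part_len)) % table_size


def boundary_fold(key_digits, part_len=3, table_size=1000):
    """Boundary folding: reverse every second part before adding."""
    parts = _parts(key_digits, part_len)
    folded = [p[::-1] if i % 2 == 1 else p for i, p in enumerate(parts)]
    return sum(int(p) for p in folded) % table_size


def mid_square(k, width=3):
    """Mid-square: take the middle `width` decimal digits of k*k."""
    squared = str(k * k)
    start = (len(squared) - width) // 2
    return int(squared[start:start + width])


def radix_transform(k, base=9, table_size=100):
    """Write k in another base, read the digits as decimal, reduce."""
    digits = ""
    while k > 0:
        digits = str(k % base) + digits
        k //= base
    return int(digits or "0") % table_size


print(shift_fold("123456789"))     # 368
print(boundary_fold("123456789"))  # 566
print(mid_square(3121))            # 406
print(radix_transform(345), radix_transform(264))  # 23 23
```

The last line shows two different keys landing on address 23, the kind of collision that no choice of these simple functions rules out in the worst case.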

40 Open Addressing
In some applications, it is hard to dynamically allocate the additional space that chaining needs. So it is natural to handle collisions in a different way, in which all elements are stored in the hash table itself. Then, instead of following pointers, we simply compute the sequence of slots to be examined. Let's use insertion as an example.
To perform insertion using open addressing, we successively examine, or probe, the hash table until we find an empty slot in which to put the element. Moreover, the sequence of positions probed depends on the key being inserted, so the hash function takes the probe number as a second argument:
h: U × {0,1,…,m-1} → {0,1,…,m-1}

41 Open Addressing
For every key k, we require the probe sequence <h(k,0), h(k,1),…,h(k,m-1)> to be a permutation of <0,1,…,m-1>, so that every position in the hash table is eventually considered as a slot for a new key as the table fills up.
For now, for simplicity, assume that each element x is just its key k (no satellite data) and that there are no deletions.

42 Open Addressing
Hash-Insert(T,k)
  i ← 0
  repeat
    j ← h(k,i)
    if T[j] == Nil
      then T[j] ← k
           return j
      else i ← i + 1
  until i = m
  error “hash table overflow”

46 Open Addressing
Example. Insert keys 10,22,31,4,15,28,17,88,59 into T[0..10] (m = 11) using Hash-Insert with h(k,i) = [h’(k)+i] mod m, h’(k) = k mod m.
h(10,0) = (10+0) mod 11 = 10
h(22,0) = 0
h(31,0) = 9
h(4,0) = 4
h(15,0) = (4+0) mod 11 = 4 (occupied by 4)
h(15,1) = (4+1) mod 11 = 5
T so far: T[0] = 22, T[4] = 4, T[5] = 15, T[9] = 31, T[10] = 10; all other slots are empty.

47 Open Addressing
Example. Insert keys 10,22,31,4,15,28,17,88,59 into T. h(k,i) = [h’(k)+i] mod m, h’(k) = k mod m, m = 11. The final table:
slot   0   1   2   3   4   5   6   7   8   9  10
key   22  88   /   /   4  15  28  17  59  31  10

48 Open Addressing
Hash-Search(T,k)
  i ← 0
  repeat
    j ← h(k,i)
    if T[j] == k
      then return j
    i ← i + 1
  until T[j] == Nil or i = m
  return Nil
Example. Search 15 in T. h(k,i) = [h’(k)+i] mod m, h’(k) = k mod m.
i = 0: j ← h(15,0) = 4, T[j] != 15
i = 1: j ← h(15,1) = 5, T[j] = 15, return 5
slot   0   1   2   3   4   5   6   7   8   9  10
key   22  88   /   /   4  15  28  17  59  31  10
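Hash-Insert and Hash-Search with linear probing can be sketched in Python as follows. The function names mirror the pseudocode; using `None` for Nil and raising `OverflowError` for the overflow case are choices made for this sketch.

```python
def probe(k, i, m):
    """Linear probing: h(k, i) = (h'(k) + i) mod m with h'(k) = k mod m."""
    return (k % m + i) % m


def hash_insert(table, k):
    """Probe until an empty slot is found; return the slot used."""
    m = len(table)
    for i in range(m):
        j = probe(k, i, m)
        if table[j] is None:
            table[j] = k
            return j
    raise OverflowError("hash table overflow")


def hash_search(table, k):
    """Follow the same probe sequence; stop at k or at an empty slot."""
    m = len(table)
    for i in range(m):
        j = probe(k, i, m)
        if table[j] == k:
            return j
        if table[j] is None:
            return None
    return None


table = [None] * 11
for key in [10, 22, 31, 4, 15, 28, 17, 88, 59]:
    hash_insert(table, key)
print(table)
# [22, 88, None, None, 4, 15, 28, 17, 59, 31, 10]
print(hash_search(table, 15))  # 5
```

Key 59 needs five probes (slots 4, 5, 6, 7, 8) because the earlier collisions have already built a run of occupied slots starting at slot 4.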

50 Open Addressing
How about deletion? You can simply use Hash-Search to find the key first. Then what? Can we just set the slot back to Nil?
  i ← 0
  repeat
    j ← h(k,i)
    if T[j] != Nil and T[j] == k
      then T[j] ← Nil?
           exit
    i ← i + 1
  until T[j] == Nil or i = m
Example. Delete 4, then 15, in T. h(k,i) = [h’(k)+i] mod m, h’(k) = k mod m.
After deleting 4, T[4] = Nil.
Delete 15: i = 0, j ← h(15,0) = 4, T[j] = Nil, exit.
The search stops at the emptied slot, so 15, which is still stored in slot 5, is never found.
slot   0   1   2   3   4    5   6   7   8   9  10
key   22  88   /   /   Nil  15  28  17  59  31  10

52 Open Addressing
How about deletion? You can simply use Hash-Search to find the key first, and then mark the slot with a special value "deleted" instead of Nil.
  i ← 0
  repeat
    j ← h(k,i)
    if T[j] != Nil and T[j] == k
      then T[j] ← deleted
           exit
    i ← i + 1
  until T[j] == Nil or i = m
Example. Delete 4, then 15, in T. h(k,i) = [h’(k)+i] mod m, h’(k) = k mod m.
After deleting 4, T[4] = deleted.
Delete 15: i = 0, j ← h(15,0) = 4, T[j] = deleted, so keep probing; i = 1, j ← h(15,1) = 5, T[j] = 15. 15 is deleted!
slot   0   1   2   3   4        5        6   7   8   9  10
key   22  88   /   /   deleted  deleted  28  17  59  31  10
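Deletion with a "deleted" marker can be sketched in Python as below. The sentinel name `DELETED` and the helper names are assumptions for illustration; letting insertion reuse a slot marked deleted is the usual companion change, and the key 26 inserted at the end is a made-up value chosen because 26 mod 11 = 4.

```python
DELETED = object()  # tombstone marker, distinct from None (Nil)


def probe(k, i, m):
    """Linear probing: h(k, i) = (k mod m + i) mod m."""
    return (k % m + i) % m


def hash_search(table, k):
    m = len(table)
    for i in range(m):
        j = probe(k, i, m)
        if table[j] is None:  # truly empty: k cannot lie further along
            return None
        if table[j] == k:     # a tombstone never compares equal to a key
            return j
    return None


def hash_delete(table, k):
    """Find k and replace it with the tombstone; return its old slot."""
    j = hash_search(table, k)
    if j is not None:
        table[j] = DELETED
    return j


def hash_insert(table, k):
    """Insert k into the first slot that is empty or marked deleted."""
    m = len(table)
    for i in range(m):
        j = probe(k, i, m)
        if table[j] is None or table[j] is DELETED:
            table[j] = k
            return j
    raise OverflowError("hash table overflow")


table = [22, 88, None, None, 4, 15, 28, 17, 59, 31, 10]
hash_delete(table, 4)
print(hash_search(table, 15))  # 5: the search probes past the tombstone
hash_delete(table, 15)
print(hash_search(table, 59))  # 8
print(hash_insert(table, 26))  # 4: the tombstone slot is reused
```

One consequence of tombstones is that search time no longer depends only on the number of keys currently stored, since deleted slots still lengthen probe sequences.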

53 Linear probing
This is the scheme we have just seen. h’ is an ordinary hash function; i.e., h’: U → {0,1,2,…,m-1}, and
h(k,i) = [h’(k) + i] mod m.
The initial slot probed is exactly T[h’(k)].

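54 Quadratic probing
h’ is an ordinary hash function; i.e., h’: U → {0,1,2,…,m-1}, and
h(k,i) = [h’(k) + C₁i + C₂i²] mod m,
where C₁ and C₂ are two non-zero constants. The initial slot probed is also T[h’(k)], but for i > 0 the offset grows quadratically with i. Intuitively this works better than linear probing, because keys that start at different slots do not pile up into one long run of occupied slots.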

57 Quadratic probing
Example. Insert keys 10,22,31,4,15,28,17,88,59 into T[0..10]. h(k,i) = [h’(k) + C₁i + C₂i²] mod m, h’(k) = k mod m, m = 11, C₁ = 1, C₂ = 3.
h(10,0) = 10
h(22,0) = 0
h(31,0) = 9
h(4,0) = 4
h(15,0) = 4 (occupied)
h(15,1) = [4+1+3] mod 11 = 8
h(28,0) = 6
T so far: T[0] = 22, T[4] = 4, T[6] = 28, T[8] = 15, T[9] = 31, T[10] = 10; all other slots are empty.

58 Quadratic probing
Example (continued). Inserting 17 with h(k,i) = [h’(k) + C₁i + C₂i²] mod m, h’(k) = k mod m, m = 11, C₁ = 1, C₂ = 3:
h(17,0) = 6 (occupied)
h(17,1) = [6+1+3] mod 11 = 10 (occupied)
h(17,2) = [6+2+3×2²] mod 11 = 9 (occupied)
h(17,3) = [6+3+3×3²] mod 11 = 3
T so far: T[0] = 22, T[3] = 17, T[4] = 4, T[6] = 28, T[8] = 15, T[9] = 31, T[10] = 10; all other slots are empty.

59 Quadratic probing
Example (continued). After also inserting 88 and 59 (h(k,i) = [h’(k) + C₁i + C₂i²] mod m, h’(k) = k mod m, m = 11, C₁ = 1, C₂ = 3), the final table is:
slot   0   1   2   3   4   5   6   7   8   9  10
key   22   /  88  17   4   /  28  59  15  31  10
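The quadratic-probing example can be checked with a short Python sketch. The function names are illustrative; the constants C₁ = 1 and C₂ = 3 and the table size 11 are taken from the example.

```python
def quadratic_probe(k, i, m, c1=1, c2=3):
    """h(k, i) = (h'(k) + c1*i + c2*i^2) mod m with h'(k) = k mod m."""
    return (k % m + c1 * i + c2 * i * i) % m


def hash_insert(table, k):
    """Open-addressing insert following the quadratic probe sequence."""
    m = len(table)
    for i in range(m):
        j = quadratic_probe(k, i, m)
        if table[j] is None:
            table[j] = k
            return j
    raise OverflowError("hash table overflow")


table = [None] * 11
for key in [10, 22, 31, 4, 15, 28, 17, 88, 59]:
    hash_insert(table, key)
print(table)
# [22, None, 88, 17, 4, None, 28, 59, 15, 31, 10]

# Probe sequence for key 88, which needed nine probes:
print([quadratic_probe(88, i, 11) for i in range(9)])
# [0, 4, 3, 8, 8, 3, 4, 0, 2]
```

The sequence for 88 revisits slots 8, 3, 4 and 0, so with these particular constants the probe sequence is not a permutation of the slots; an insert could report overflow even though free slots remain.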

60 Double probing
h(k,i) = [h₁(k) + i·h₂(k)] mod m, where h₁ and h₂ are two auxiliary hash functions. (This scheme is usually called double hashing.) The initial slot probed is also T[h₁(k)]. If m = 2^j, then h₂ should always produce an odd number, so that h₂(k) is relatively prime to m.
Example. h₁(k) = k mod m, h₂(k) = 1 + [k mod m’], m’ = m-2.

62 Double probing
Example. Insert keys 10,22,31,4,15,28,17,88,59 into T[0..10]. h(k,i) = [h₁(k) + i·h₂(k)] mod m, h₁(k) = k mod m, h₂(k) = 1 + [k mod (m-1)], m = 11.
h(10,0) = 10
h(22,0) = 0
h(31,0) = 9
h(4,0) = 4
h(15,0) = 4 (occupied)
h(15,1) = [4+1×6] mod 11 = 10 (occupied)
h(15,2) = [4+2×6] mod 11 = 5
T so far: T[0] = 22, T[4] = 4, T[9] = 31, T[10] = 10, and 15 goes into slot 5.

64 Double probing
Example (continued). h(k,i) = [h₁(k) + i·h₂(k)] mod m, h₁(k) = k mod m, h₂(k) = 1 + [k mod (m-1)], m = 11.
h(28,0) = 6
h(17,0) = 6 (occupied)
h(17,1) = [6+1×8] mod 11 = 3
h(88,2) = [0+2×9] mod 11 = 7 (after slots 0 and 9 are found occupied)
h(59,2) = [4+2×10] mod 11 = 2 (after slots 4 and 3 are found occupied)
The final table:
slot   0   1   2   3   4   5   6   7   8   9  10
key   22   /  59  17   4  15  28  88   /  31  10
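The same insertion sequence under double hashing can be sketched in Python. The function names are illustrative; h₁ and h₂ follow the definitions used in the worked example, with m = 11.

```python
def double_hash_probe(k, i, m):
    """h(k, i) = (h1(k) + i*h2(k)) mod m, h1 = k mod m, h2 = 1 + k mod (m-1)."""
    h1 = k % m
    h2 = 1 + (k % (m - 1))  # never 0, so the probe always moves
    return (h1 + i * h2) % m


def hash_insert(table, k):
    """Open-addressing insert following the double-hashing probe sequence."""
    m = len(table)
    for i in range(m):
        j = double_hash_probe(k, i, m)
        if table[j] is None:
            table[j] = k
            return j
    raise OverflowError("hash table overflow")


table = [None] * 11
for key in [10, 22, 31, 4, 15, 28, 17, 88, 59]:
    hash_insert(table, key)
print(table)
# [22, None, 59, 17, 4, 15, 28, 88, None, 31, 10]
```

Because m = 11 is prime and h₂(k) lies between 1 and 10, every step size is relatively prime to m, so each key's probe sequence visits all 11 slots.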

65 Analysis of hashing (in general tough)
In a hash table of size m, we want to store n elements, with collisions resolved by chaining. Let α = n/m be the load factor.
Theorem. An unsuccessful search takes expected time O(1+α), under the assumption of simple uniform hashing.

66 Analysis of hashing (in general tough)
Theorem. An unsuccessful search takes expected time O(1+α), under the assumption of simple uniform hashing.
Sketch of proof. Under the assumption of simple uniform hashing, any new key k is equally likely to hash to any of the m slots. The expected time to search unsuccessfully for the key k is the expected time to search to the end of list T[h(k)], which has expected length α. Therefore, including the cost of computing h(k), the cost of this unsuccessful search is O(1+α). □
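The expected list length used in the proof sketch can be written out as a short calculation. The symbol n_j, denoting the length of list T[j], is notation introduced here for illustration; the lengths satisfy n_0 + n_1 + … + n_{m-1} = n.

```latex
\begin{align*}
E\bigl[\text{length of } T[h(k)]\bigr]
  &= \sum_{j=0}^{m-1} \Pr[h(k) = j]\, n_j
   = \frac{1}{m} \sum_{j=0}^{m-1} n_j
   = \frac{n}{m} = \alpha, \\
E[\text{time of an unsuccessful search}]
  &= \underbrace{O(1)}_{\text{computing } h(k)} + O(\alpha)
   = O(1 + \alpha).
\end{align*}
```

The first line uses only that the new key k is equally likely to land in each of the m slots, which is exactly the simple uniform hashing assumption.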

