1.1 Data Structure and Algorithm
Lecture 9: Hashing Topics
Reference: Introduction to Algorithms by Cormen et al., Chapter 12: Hash Tables

1.2 Introduction
A hash table is a generalization of the simpler notion of an ordinary array.
In the worst case, searching for an element in a hash table can take as long as searching an array or linked list: O(n) time for n stored elements.
Under reasonable assumptions, however, the expected time to search for an element in a hash table is O(1).

1.3 Dictionary/Table
Keys identify entries: given a student ID (the key), find the corresponding record (entry).

1.4 Direct Addressing
Direct addressing is a simple technique that works well when the universe U of keys is reasonably small.
Let the universe be U = {0, 1, ..., m-1}, where m is not too large, and assume that no two elements have the same key.
To represent the dynamic set, we use an array, or direct-address table, T[0..m-1], in which each position, or slot, corresponds to a key in the universe U.

1.5 Direct Addressing
[Figure: the universe of keys U = {0, 1, ..., 9}, the set K = {2, 3, 5, 8} of actual keys, and a direct-address table T whose occupied slots hold a key and its data.]
Slot k points to an element in the set with key k. If the set contains no element with key k, then T[k] = NIL.

1.6 Direct addressing is a simple technique that works well when the universe of keys is small, assuming each key corresponds to a unique slot.
Direct-Address-Search(T, k): return T[k]
Direct-Address-Insert(T, x): T[key[x]] ← x
Direct-Address-Delete(T, x): T[key[x]] ← NIL
[Figure: a direct-address table with slots 0..7; slots 1, 5 and 7 hold entries, and the remaining slots are NIL.]
All operations take O(1) time.
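The three direct-address operations can be sketched in Python as follows. This is a minimal illustration, not code from the lecture: the class name `DirectAddressTable`, the use of `None` for NIL, and the `record-…` strings are assumptions made for the example.

```python
class DirectAddressTable:
    """Direct-address table over the key universe {0, 1, ..., m-1}."""

    def __init__(self, m):
        # One slot per possible key; None plays the role of NIL.
        self.slots = [None] * m

    def search(self, k):
        # Direct-Address-Search: return T[k]
        return self.slots[k]

    def insert(self, key, record):
        # Direct-Address-Insert: T[key[x]] <- x
        self.slots[key] = record

    def delete(self, key):
        # Direct-Address-Delete: T[key[x]] <- NIL
        self.slots[key] = None


# Hypothetical usage: keys 1, 5 and 7 in a table with 8 slots.
table = DirectAddressTable(8)
for key in (1, 5, 7):
    table.insert(key, f"record-{key}")
table.delete(5)
```

Each operation is a single list index, which is why all three run in O(1) time; the cost is one slot per key in the universe, whether or not the key is used.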

1.7 The Problem With Direct Addressing
If the universe U is large, storing a table T of size |U| may be impractical or even impossible.
Furthermore, the set K of keys actually stored may be so small relative to U that most of the space allocated for T would be wasted.
Solution: map keys to a smaller range 0..m-1. This mapping is called a hash function.

1.8 Hash function
A hash function h maps the universe U of keys into the slots of a hash table T[0..m-1]:
h : U → {0, 1, ..., m-1}
But two keys may hash to the same slot: a collision.
[Figure: keys k1, ..., k5 from K ⊆ U mapped into T[0..m-1], with h(k2) = h(k5).]

1.9 Next Problem
But two keys may hash to the same slot: a collision.
[Figure: keys k1, ..., k5 from K ⊆ U mapped into T[0..m-1]; k2 and k5 collide because h(k2) = h(k5).]
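A small Python sketch can make a collision concrete. The hash function `h(k, m) = k mod m`, the table size 10, and the key values are all assumptions chosen for illustration; the slides do not fix a particular hash function at this point.

```python
def h(k, m):
    # Assumed hash function for illustration: remainder of k divided by m.
    return k % m


m = 10
keys = [12, 25, 47, 52, 93]

# Group the keys by the slot they hash to.
slots = {}
for k in keys:
    slots.setdefault(h(k, m), []).append(k)

# Any slot that received more than one key is a collision.
collisions = {slot: ks for slot, ks in slots.items() if len(ks) > 1}
```

Here 12 and 52 both hash to slot 2, so `collisions` contains that one slot. Because |U| is larger than m, some pair of keys must always share a slot, so collisions cannot be avoided altogether, only resolved.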

1.10 Resolving Collisions
How can we solve the problem of collisions?
Solution 1: chaining
Solution 2: open addressing

1.11 Chaining
Chaining puts elements that hash to the same slot in a linked list.
[Figure: keys k1, ..., k8 from K ⊆ U; each slot of T points to a linked list of the keys that hash to it, and empty slots are NIL.]

1.12-1.16 Chaining (insert at the head)
[Figure sequence over five slides: the keys from K are inserted into T one at a time, each new key placed at the head of the linked list in its slot, ending with the chains shown on the Chaining slide.]

1.17 Operations
Direct-Hash-Search(T, k): search for an element with key k in the list T[h(k)]. The running time is proportional to the length of the list.
Direct-Hash-Insert(T, x): insert x at the head of the list T[h(key[x])]. The worst-case running time is O(1).
Direct-Hash-Delete(T, x): delete x from the list T[h(key[x])]. The running time is the same as for searching.
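The chaining operations above can be sketched in Python with singly linked lists. This is an illustrative sketch under stated assumptions: the class names `Node` and `ChainedHashTable`, the hash function `key % m`, and deletion by key (rather than by element pointer) are choices made for the example, not taken from the slides.

```python
class Node:
    def __init__(self, key, value, next=None):
        self.key = key
        self.value = value
        self.next = next


class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        self.table = [None] * m  # each slot holds the head of a chain, or None

    def _h(self, key):
        return key % self.m  # assumed hash function for integer keys

    def insert(self, key, value):
        # O(1) worst case: the new node becomes the head of its chain.
        # No duplicate check is done, matching insert-at-the-head.
        i = self._h(key)
        self.table[i] = Node(key, value, self.table[i])

    def search(self, key):
        # Walk the chain in slot h(key); time is proportional to its length.
        node = self.table[self._h(key)]
        while node is not None and node.key != key:
            node = node.next
        return node

    def delete(self, key):
        # Find the node and its predecessor, then unlink it.
        i = self._h(key)
        prev, node = None, self.table[i]
        while node is not None and node.key != key:
            prev, node = node, node.next
        if node is None:
            return False
        if prev is None:
            self.table[i] = node.next
        else:
            prev.next = node.next
        return True


# Hypothetical usage: 3, 8 and 13 all hash to slot 3 when m = 5.
t = ChainedHashTable(5)
t.insert(3, "a")
t.insert(8, "b")
t.insert(13, "c")
head_key = t.table[3].key
found = t.search(8)
removed = t.delete(8)
```

With a singly linked list, deletion has to locate the predecessor of the node, which is why its cost matches that of a search.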

1.18 Open Addressing
Basic idea (details in Section 12.4):
To insert: if a slot is full, try another slot, and so on, until an open slot is found (probing).
To search: follow the same sequence of probes that would be used when inserting the element. If you reach an element with the correct key, return it. If you reach a NULL pointer, the element is not in the table.
Good for fixed sets (adding but no deletion).
The table need not be much bigger than n, the number of elements stored.
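The insert and search procedures can be sketched in Python as below. The slides do not say which probe sequence to use, so this sketch assumes linear probing (try h(k), h(k)+1, ... modulo m) with h(k) = k mod m; the class name `OpenAddressTable` is also hypothetical. Deletion is left out, in line with the remark that open addressing suits sets with insertions only.

```python
class OpenAddressTable:
    def __init__(self, m):
        self.m = m
        self.slots = [None] * m  # None marks an open slot

    def _probe(self, key):
        # Assumed probe sequence (linear probing): h(k), h(k)+1, ... mod m.
        start = key % self.m
        for i in range(self.m):
            yield (start + i) % self.m

    def insert(self, key, value):
        # Probe until an open slot (or the same key) is found.
        for j in self._probe(key):
            if self.slots[j] is None or self.slots[j][0] == key:
                self.slots[j] = (key, value)
                return j
        raise OverflowError("hash table is full")

    def search(self, key):
        # Follow the same probe sequence that insert would use.
        for j in self._probe(key):
            entry = self.slots[j]
            if entry is None:
                return None  # reached an open slot: the key is absent
            if entry[0] == key:
                return entry[1]
        return None


# Hypothetical usage: 10, 17 and 24 all hash to slot 3 when m = 7.
t = OpenAddressTable(7)
positions = [t.insert(10, "a"), t.insert(17, "b"), t.insert(24, "c")]
```

The three keys land in slots 3, 4 and 5, and a search for 17 retraces the probes 3, 4 to find it. All elements live in the table itself, so it can hold at most m of them.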

1.19 Choosing A Hash Function
Choosing the hash function well is crucial: a bad hash function puts all elements in the same slot.
A good hash function should distribute keys uniformly into slots and should not depend on patterns in the data.
Three popular methods: the division method, the multiplication method, and universal hashing.

1.20 The Division Method
h(k) = k mod m
Hash k into a table with m slots, using the slot given by the remainder of k divided by m.
Elements with adjacent keys are hashed to different slots: good.
If the keys bear a relation to m: bad.
In practice: pick the table size m to be a prime number not too close to a power of 2 (or 10).
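The effect of the choice of m can be seen in a short Python sketch. The key set (multiples of 8) and the two table sizes, 8 and 11, are assumptions picked to show keys that bear a relation to m; they are not from the lecture.

```python
from collections import Counter


def h(k, m):
    # Division method: the slot is the remainder of k divided by m.
    return k % m


# Keys that share a pattern: every key is a multiple of 8.
keys = [8 * i for i in range(1, 8)]  # 8, 16, ..., 56

# m = 8 is a power of 2 and divides every key; m = 11 is prime.
slots_m8 = Counter(h(k, 8) for k in keys)
slots_m11 = Counter(h(k, 11) for k in keys)
```

With m = 8 all seven keys fall into slot 0, while with m = 11 they occupy seven different slots. When m is a power of 2, k mod m keeps only the low-order bits of k, so any regularity in those bits carries straight into the slot numbers.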