Published by Kenneth Cooper. Modified over 8 years ago.
5. Abstract Data Structures & Algorithms
5.2 Static Data Structures
5.2.3 Hash Tables
Searching arrays
Linear search is slow for large arrays. Binary search is fast, but the array must be kept in order at all times. So, to allow direct access to an item, keep an index table (the extra time taken to look up an entry in this index is negligible compared to search time).
Hash: definition
A hash is a number generated from a larger set of data (e.g. a string of text) that “summarises” that data. It is also known as a message digest. Hashes are used for digitally signing documents as well as for hash (index) tables.
Hash functions
Perform some maths on the data to produce a summary that is (fairly) unique for those data. The procedure that does this is known as the hash function.
‣ e.g. add all the letters' ASCII codes together, then take the result mod 17.
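The example above can be sketched in a few lines of Python. This is only an illustration of the "sum the ASCII codes, then mod 17" idea; the function name `simple_hash` and the sample name "Ann" are assumptions made for the sketch, not part of the slides.

```python
def simple_hash(text: str, table_size: int = 17) -> int:
    """Add up the character codes of text, then reduce the sum mod table_size."""
    total = sum(ord(ch) for ch in text)  # ord() gives the ASCII code for ASCII letters
    return total % table_size


# 'A' = 65, 'n' = 110, 'n' = 110, so the sum is 285 and 285 mod 17 = 13
print(simple_hash("Ann"))  # prints 13
```

Whatever the length of the input, the result always lies between 0 and 16, which is what makes it usable as a table position.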
Hash tables
The result of the hash function can then be used as the memory location (address) in which to store data related to that item,
‣ e.g. hash a person’s name to give the memory location for storing their phone number.
To retrieve the data (phone number), the computer performs the hash function on the name and accesses that location directly in memory.
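A minimal sketch of the phone-number example, assuming a fixed table of 17 locations modelled as a Python list and the sum-of-ASCII-codes hash function. The names `store` and `lookup` and the phone number are made up for illustration, and collisions are deliberately ignored here.

```python
TABLE_SIZE = 17  # assumed table size, matching the "mod 17" example


def simple_hash(name: str) -> int:
    """Sum of character codes, reduced mod TABLE_SIZE."""
    return sum(ord(ch) for ch in name) % TABLE_SIZE


# One slot per possible hash value; None marks an empty location.
table = [None] * TABLE_SIZE


def store(name: str, phone: str) -> None:
    # The hash of the name is used directly as the storage location.
    table[simple_hash(name)] = phone


def lookup(name: str):
    # Recompute the hash and go straight to that location: no searching.
    return table[simple_hash(name)]


store("Ann", "555-0101")  # illustrative data only
print(lookup("Ann"))      # prints 555-0101
```

Both storing and retrieving take one hash calculation and one array access, regardless of how many entries the table holds.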
Collisions
Even the best hash function may produce the same hash for different data items (a collision). Taking the remainder on division by a prime number tends to reduce collisions, but they cannot be ruled out whenever there are more possible items than table locations.
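A collision is easy to provoke with the sum-of-ASCII-codes function used as the earlier example: any two strings containing the same letters in a different order add up to the same total. The sketch below assumes that function and uses the made-up keys "ab" and "ba".

```python
def simple_hash(text: str, table_size: int = 17) -> int:
    """Sum of character codes, reduced mod table_size."""
    return sum(ord(ch) for ch in text) % table_size


# 'a' = 97 and 'b' = 98, so both strings sum to 195, and 195 mod 17 = 8
print(simple_hash("ab"))  # prints 8
print(simple_hash("ba"))  # prints 8, the same location: a collision
```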
Handling collisions
Use an overflow area:
‣ divide the table in two, one half for successful hashes, the other for overflow,
‣ collisions are stored in the overflow area sequentially,
‣ if the required value is not in the main hash table, the overflow area is searched linearly.
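The overflow approach could be sketched as follows. As a simplification, the two halves of the table are modelled as two separate Python lists (`main` and `overflow`) rather than one divided array, and each entry stores the key alongside the value so that a lookup can tell whose data it has found. All names are assumptions made for the sketch.

```python
MAIN_SIZE = 17  # assumed size of the main (hashed) area


def simple_hash(key: str) -> int:
    """Sum of character codes, reduced mod MAIN_SIZE."""
    return sum(ord(ch) for ch in key) % MAIN_SIZE


main = [None] * MAIN_SIZE  # each slot holds a (key, value) pair or None
overflow = []              # collisions are appended here in arrival order


def store(key: str, value: str) -> None:
    slot = simple_hash(key)
    if main[slot] is None or main[slot][0] == key:
        main[slot] = (key, value)      # successful hash: slot free or same key
        return
    for i, (k, _) in enumerate(overflow):
        if k == key:
            overflow[i] = (key, value) # key already in overflow: update it
            return
    overflow.append((key, value))      # collision: store sequentially


def lookup(key: str):
    entry = main[simple_hash(key)]
    if entry is not None and entry[0] == key:
        return entry[1]                # found by direct access
    for k, v in overflow:              # otherwise search overflow linearly
        if k == key:
            return v
    return None


store("ab", "555-0101")  # illustrative data; "ab" and "ba" both hash to 8
store("ba", "555-0102")
print(lookup("ba"))      # prints 555-0102, found in the overflow area
```

The more collisions there are, the longer the overflow area grows and the closer lookups get to a plain linear search.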
Handling collisions
Chaining:
‣ each hash value actually points to a list or chain of values with the same hash,
‣ a two-dimensional array could be used for this (wasteful), but a dynamic linked list is more common.
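A rough sketch of chaining, assuming the same sum-of-ASCII-codes hash. A Python list stands in for each chain here; it grows dynamically like the linked list mentioned above, but it is not a linked list internally. The names `buckets`, `store` and `lookup` are hypothetical.

```python
TABLE_SIZE = 17  # assumed number of chains


def simple_hash(key: str) -> int:
    """Sum of character codes, reduced mod TABLE_SIZE."""
    return sum(ord(ch) for ch in key) % TABLE_SIZE


# One initially empty chain per hash value.
buckets = [[] for _ in range(TABLE_SIZE)]


def store(key: str, value: str) -> None:
    chain = buckets[simple_hash(key)]
    for i, (k, _) in enumerate(chain):
        if k == key:
            chain[i] = (key, value)  # key already present: replace its value
            return
    chain.append((key, value))       # new key: add it to the end of the chain


def lookup(key: str):
    # Only the one chain selected by the hash is searched.
    for k, v in buckets[simple_hash(key)]:
        if k == key:
            return v
    return None


store("ab", "555-0101")  # illustrative data; both keys hash to 8
store("ba", "555-0102")
print(buckets[8])        # both entries sit in the same chain
print(lookup("ba"))      # prints 555-0102
```

Unlike a two-dimensional array, no space is reserved for chains that never receive an entry beyond the empty list itself.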
Handling collisions
Probing:
‣ on a collision, store the value in the next available space along,
‣ retrieve it by computing the hash, then searching linearly from that location until found.
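Probing might look like the sketch below. It assumes the search wraps around to the start of the table when it reaches the end, and that entries are never deleted; neither detail is stated on the slide. The names `slots`, `store` and `lookup` are made up for the example.

```python
TABLE_SIZE = 17  # assumed table size


def simple_hash(key: str) -> int:
    """Sum of character codes, reduced mod TABLE_SIZE."""
    return sum(ord(ch) for ch in key) % TABLE_SIZE


slots = [None] * TABLE_SIZE  # each slot holds a (key, value) pair or None


def store(key: str, value: str) -> None:
    start = simple_hash(key)
    for step in range(TABLE_SIZE):
        i = (start + step) % TABLE_SIZE  # next space along, wrapping around
        if slots[i] is None or slots[i][0] == key:
            slots[i] = (key, value)
            return
    raise RuntimeError("hash table is full")


def lookup(key: str):
    start = simple_hash(key)
    for step in range(TABLE_SIZE):
        i = (start + step) % TABLE_SIZE
        if slots[i] is None:
            return None                  # reached an empty space: key absent
        if slots[i][0] == key:
            return slots[i][1]
    return None                          # whole table searched without a match


store("ab", "555-0101")  # illustrative data; both keys hash to 8
store("ba", "555-0102")  # location 8 is taken, so this lands in location 9
print(lookup("ba"))      # prints 555-0102
```

A lookup can stop at the first empty space only because nothing is ever removed; supporting deletion would need extra bookkeeping that is outside this sketch.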
Efficiency
A hash table takes more space than a sequential file: the table itself contains empty spaces. Large prime divisors make for good hashing algorithms. A hash table allows direct access, which is much faster than searching.