Data Management and File Organization
Hashing
Hashing
Motivation: In an indexed file, finding a record takes as many file accesses as the height of the index tree (for example, 3 or 4).
Hashing provides quicker access to the records (1 or 2 file accesses).
Definitions
Hash function: A function that returns the location of a record given its key value.
Example: f(25) = 1, f(1) = 3
Definition
Hash table: The data file that holds the records is called the hash table.
Each record is placed in the hash table at the location returned by the hash function.
Example Hash Table
Use key mod 10 to create the hash table.
[Figure: the data file and the resulting hash table]
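As a minimal sketch of the key mod 10 example, the C++ code below places a few integer keys into a ten-slot table. The names `hash_key` and `build_table`, the -1 marker for an empty slot, and the restriction to non-negative keys are assumptions made for illustration. Collisions are not handled here: a key whose slot is already taken is simply skipped.

```cpp
#include <array>
#include <cassert>
#include <vector>

constexpr int kTableSize = 10;  // assumed: one slot per value of key mod 10
constexpr int kEmpty = -1;      // assumed marker for an unused slot

// Hash function from the slide: the slot is the key modulo the table size.
int hash_key(int key) {
    assert(key >= 0);  // this sketch only supports non-negative keys
    return key % kTableSize;
}

// Place each key at the slot given by hash_key. A key whose slot is
// already occupied is skipped (no collision handling in this sketch).
std::array<int, kTableSize> build_table(const std::vector<int>& keys) {
    std::array<int, kTableSize> table;
    table.fill(kEmpty);
    for (int key : keys) {
        int slot = hash_key(key);
        if (table[slot] == kEmpty) {
            table[slot] = key;
        }
    }
    return table;
}
```

For example, `build_table({25, 41, 7, 90})` stores 25 at slot 5, 41 at slot 1, 7 at slot 7, and 90 at slot 0.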
Collision Problem
The hash function may generate the same value for different keys.
Example: Keys 12 and 32 give the same result with the hash function key mod 10 (both hash to 2).
This is called the collision problem.
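The collision in the example can be checked directly. The sketch below is only illustrative: the function names `hash_mod10` and `collides` are hypothetical, and keys are assumed to be non-negative.

```cpp
#include <cassert>

// Hash function from the example: key mod 10.
int hash_mod10(int key) {
    assert(key >= 0);  // this sketch only supports non-negative keys
    return key % 10;
}

// Two different keys collide when they hash to the same table location.
bool collides(int a, int b) {
    return a != b && hash_mod10(a) == hash_mod10(b);
}
```

With these definitions, `collides(12, 32)` is true because both keys hash to 2, while `collides(12, 25)` is false.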
Solutions for collision problem
Bucketing: Use buckets as large as n records at each hash table entry.
Chaining: Records with the same hash value are chained in a linked list using an overflow area.
Bucketing
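One way to picture bucketing is an in-memory table in which every entry can hold up to n keys. This is a sketch under assumptions, not the slides' own implementation: the class name `BucketTable`, the use of key mod number-of-buckets as the hash function, and the choice to reject a key when its bucket is full are all illustrative.

```cpp
#include <cassert>
#include <vector>

// A hash table whose every entry is a bucket holding up to `capacity` keys.
class BucketTable {
public:
    BucketTable(int num_buckets, int capacity)
        : capacity_(capacity), buckets_(num_buckets) {
        assert(num_buckets > 0 && capacity > 0);
    }

    // Returns false when the target bucket is already full, i.e. an
    // overflow that another technique (such as chaining) would absorb.
    bool insert(int key) {
        std::vector<int>& bucket = buckets_[index_of(key)];
        if (static_cast<int>(bucket.size()) >= capacity_) return false;
        bucket.push_back(key);
        return true;
    }

    // Only the single bucket selected by the hash function is scanned.
    bool contains(int key) const {
        for (int stored : buckets_[index_of(key)]) {
            if (stored == key) return true;
        }
        return false;
    }

private:
    // Assumed hash function: key mod number of buckets (key >= 0).
    int index_of(int key) const {
        assert(key >= 0);
        return key % static_cast<int>(buckets_.size());
    }

    int capacity_;
    std::vector<std::vector<int>> buckets_;
};
```

With `BucketTable t(10, 2)`, keys 12 and 32 share bucket 2 without conflict, but a third key such as 42 no longer fits and `insert` returns false.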
Chaining
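Chaining with an overflow area might be sketched as follows. Each home slot holds one key, and further keys with the same hash value are appended to a separate overflow area and linked by index. The names `Slot`, `ChainedTable`, `probes_to_find`, and `kNone`, as well as the key mod table-size hash function, are assumptions for illustration.

```cpp
#include <cassert>
#include <vector>

constexpr int kNone = -1;  // assumed marker: no further entry in the chain

struct Slot {
    bool used = false;
    int key = 0;
    int next = kNone;  // index into the overflow area, or kNone
};

class ChainedTable {
public:
    explicit ChainedTable(int size) : primary_(size) { assert(size > 0); }

    void insert(int key) {
        Slot* cur = &primary_[index_of(key)];
        if (!cur->used) {  // home slot is free: store the key there
            cur->used = true;
            cur->key = key;
            return;
        }
        while (cur->next != kNone) {  // walk to the end of the chain
            cur = &overflow_[cur->next];
        }
        // Link first, then append: push_back may move the overflow
        // storage, which would invalidate `cur`.
        cur->next = static_cast<int>(overflow_.size());
        Slot added;
        added.used = true;
        added.key = key;
        overflow_.push_back(added);
    }

    // Returns how many slots were examined to find the key, or 0 if absent.
    int probes_to_find(int key) const {
        const Slot* cur = &primary_[index_of(key)];
        if (!cur->used) return 0;
        int probes = 1;
        while (cur->key != key) {
            if (cur->next == kNone) return 0;
            cur = &overflow_[cur->next];
            ++probes;
        }
        return probes;
    }

private:
    int index_of(int key) const {
        assert(key >= 0);  // this sketch only supports non-negative keys
        return key % static_cast<int>(primary_.size());
    }

    std::vector<Slot> primary_;   // one home slot per hash value
    std::vector<Slot> overflow_;  // colliding keys, linked by index
};
```

In a file-based table each examined slot could cost a file access, which is why the probe count returned by `probes_to_find` grows with the chain length.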
Bucketing
The bucket size (n) affects the amount of file I/O.
A large bucket size means more I/O to find a record.
For better performance, bucketing should be combined with chaining.
Chaining
The chain length has a direct effect on I/O speed.
Chains should be kept as short as possible.
The performance of hashing depends on choosing a good hash function.
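To illustrate how the choice of hash function affects chain length, the sketch below counts how many keys land on each table location and reports the longest chain. The function name `longest_chain` and the key mod table-size hash function are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Counts how many keys map to each location under key mod table_size
// and returns the length of the longest resulting chain.
int longest_chain(const std::vector<int>& keys, int table_size) {
    assert(table_size > 0);
    std::vector<int> counts(table_size, 0);
    for (int key : keys) {
        assert(key >= 0);  // this sketch only supports non-negative keys
        ++counts[key % table_size];
    }
    return *std::max_element(counts.begin(), counts.end());
}
```

For the keys 10, 20, 30, 40, 50, a table size of 10 sends every key to location 0 (a chain of five), while a table size of 7 spreads them over five different locations (chains of length one).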
Questions?
Term Project 1 : B-Trees
Assume a file with the following record structure:
struct Record {
    int status;
    int EmployeeID;
    char Name[30];
    char position[10];
    float salary;
};
struct LeafNode;  // forward declaration so InternalNode can point to leaf nodes
struct InternalNode {
    int EmpId[4];
    InternalNode* links[5];  // Use when pointing to another internal node
    LeafNode* dlinks[5];     // Use if points to leaf nodes
};
struct LeafNode {
    int Location[4];
};
Term Project 1 : B-Tree
Write a program to create a B-tree index by reading the data file and inserting the key values and the record locations into the index.
Your program should also have a menu that lets the user choose an operation. The possible operations are:
Find a record given the Employee ID
Insert a new record (get the data of the new record from the user)
Delete a record given the Employee ID
Term Project 2 : Hashing
Assume a file with the following record structure:
struct Record {
    int status;
    int EmployeeID;
    char Name[30];
    char position[10];
    float salary;
};
Term Project 2 : Hashing
Write a program to read the file and create a hash table.
Use a bucket size of two records in your hash table.
Assume the number of records is less than 100.
The two rightmost digits of each ID can be used as the hash value.
To solve the collision problem, use an overflow area at the end of the hash table.
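The layout described above could look roughly like the following in-memory sketch. It is a starting point under stated assumptions rather than a full solution: it keeps only two fields of the record, works on a `std::vector` instead of a file, and assumes that `status` 1 means "in use" and 0 means "empty". The names `TableSlot`, `hash_id`, `make_table`, `insert_id`, and `find_id` are hypothetical.

```cpp
#include <cassert>
#include <vector>

constexpr int kBuckets = 100;   // two rightmost digits give values 0..99
constexpr int kBucketSize = 2;  // two records per bucket
constexpr int kPrimarySlots = kBuckets * kBucketSize;

// Reduced record: only the fields this sketch needs.
// Assumption: status 1 means "in use", 0 means "empty".
struct TableSlot {
    int status = 0;
    int EmployeeID = 0;
};

// Hash value = two rightmost decimal digits of the ID (IDs assumed >= 0).
int hash_id(int id) {
    assert(id >= 0);
    return id % 100;
}

// Primary buckets are allocated up front; the overflow area starts empty
// and grows at the end of the table.
std::vector<TableSlot> make_table() {
    return std::vector<TableSlot>(kPrimarySlots);
}

// Stores the ID in its home bucket, or appends it to the overflow area.
// Returns the index of the slot that was used.
int insert_id(std::vector<TableSlot>& table, int id) {
    int first = hash_id(id) * kBucketSize;
    for (int i = first; i < first + kBucketSize; ++i) {
        if (table[i].status == 0) {
            table[i].status = 1;
            table[i].EmployeeID = id;
            return i;
        }
    }
    TableSlot s;
    s.status = 1;
    s.EmployeeID = id;
    table.push_back(s);
    return static_cast<int>(table.size()) - 1;
}

// Checks the home bucket first, then scans the overflow area linearly.
// Returns the slot index, or -1 if the ID is not stored.
int find_id(const std::vector<TableSlot>& table, int id) {
    int first = hash_id(id) * kBucketSize;
    for (int i = first; i < first + kBucketSize; ++i) {
        if (table[i].status == 1 && table[i].EmployeeID == id) return i;
    }
    for (int i = kPrimarySlots; i < static_cast<int>(table.size()); ++i) {
        if (table[i].status == 1 && table[i].EmployeeID == id) return i;
    }
    return -1;
}
```

IDs 1234 and 5634 both hash to 34 and fill that bucket (slots 68 and 69), so a third ID ending in 34 goes to the first overflow slot at index 200.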
Term Project 2 : Hashing
Your program should have a menu with the following options:
Find a record given the Employee ID
Delete a record given the Employee ID
Insert a new record by reading the record data
Assignment
Write a report about choosing a hash function for different data types.
Consider integers, floating-point numbers, and character strings as possible data types.