HASHING Section 12.7 (P. 707-717)


1 HASHING Section 12.7 (P. 707-717)

2 HASHING
- have already seen binary and linear search and discussed when they might be useful (based on complexity)
  - Linear: O(n)
  - Binary: O(log n)
- this may not be fast enough in some cases:
  - ex: a web server (like Yahoo!)
  - millions of searches/sec
- sometimes need a structure that can allow very fast retrieval of records
- a hash table is one such structure

3 HASHING (Example)
- library card catalog
- suppose you only have a few books;
  - you could make a list of the books by their catalog numbers
  - if you need to find a book, you search through the numbers till you find the one you're looking for
- not efficient with large library (Library of Congress)
- solution:
  - create an array with one slot for each possible book number (each book has a slot)
  - put in each slot: T - the book is there, F - it is not
  - then, if you are looking for a book number, look in the array at that slot; if True - the book is there
- running time: O(1) - time to find 1 element
- what is the problem? - not practical: the array needs a slot for every possible catalog number, however few books are actually stored
- need a structure that can give us the efficiency of O(1) as well as the efficient use of space
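The true/false array described on this slide can be sketched as follows. This is only an illustration: the class name DirectCatalog and the bound MaxBookNumber are hypothetical and do not come from the textbook.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical upper bound on catalog numbers, chosen for illustration.
const std::size_t MaxBookNumber = 1000;

// One true/false slot per possible book number.
class DirectCatalog {
public:
    DirectCatalog() : present(MaxBookNumber, false) {}

    // Mark a book as present: a single array write, O(1).
    void Add(std::size_t number) {
        assert(number < MaxBookNumber);
        present[number] = true;
    }

    // Look in the slot for this number: a single array read, O(1).
    bool Contains(std::size_t number) const {
        return number < MaxBookNumber && present[number];
    }

    // Slots allocated, no matter how few books are stored.
    std::size_t SlotCount() const { return present.size(); }

private:
    std::vector<bool> present;
};
```

Storing a single book still allocates all MaxBookNumber slots, which is the space problem the slide points out.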

4 How Does “Hashing” Work?
- can come up with an efficient way to handle the indexes (say, last 2 digits)
- then, you can search for the slot representing the last 2 digits
- problem:
  - what if more than one book has same last 2 digits?
  - called collision
- one solution:
  - create an array at least twice as big as what we are storing
  - handle duplicates through technique called chaining
  - when duplicate happens, link them together like a linked list
- [draw picture of book list, with last 2 digits as index]
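A minimal sketch of chaining with the last two digits as the index, in place of the picture. The class name ChainedCatalog and its members are hypothetical; std::list stands in for the hand-built linked list the slide has in mind, and the table is fixed at 100 slots (one per two-digit ending) rather than sized to twice the number of books.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Table with 100 slots; each slot holds a chain of the book numbers
// that share the same last two digits.
class ChainedCatalog {
public:
    ChainedCatalog() : buckets(100) {}

    // The last two digits pick the slot.
    static std::size_t Index(unsigned number) { return number % 100; }

    // Append the number to its chain unless it is already stored.
    void Add(unsigned number) {
        if (!Contains(number))
            buckets[Index(number)].push_back(number);
    }

    // Only the one chain selected by Index() is searched.
    bool Contains(unsigned number) const {
        for (unsigned stored : buckets[Index(number)])
            if (stored == number)
                return true;
        return false;
    }

    // Number of books that collided into this number's slot.
    std::size_t ChainLength(unsigned number) const {
        return buckets[Index(number)].size();
    }

private:
    std::vector<std::list<unsigned> > buckets;
};
```

For example, books 1234 and 5634 both land in slot 34 and end up linked in the same chain.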

5 How Does Hashing Work? (cont)
- another solution: place duplicate in another location in the table
- one way: linear probing:
  - can create a way to compute indexes, called a hashing function:
    index = number % sizeoftable
  - add a mark indicating that a cell is occupied or unoccupied
  - as you add item, indicate that the cell is now occupied
  - if collision:
    - start at place where collision occurs
    - keep moving until find empty cell
    - put item in that cell and mark it occupied
- good strategy:
  - if table gets half-full, double the size and redo the hashing function to create new indexes
- problem: collisions can group items together using this method (clustering)
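The probing steps above can be sketched as a single function. ProbeInsert and its parameters are hypothetical names; the sketch assumes non-negative numbers and a table with at least one free cell (otherwise the loop would never end), and it leaves out the resizing step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Places `number` using linear probing and returns the slot it ended up in.
// Assumes `number` is non-negative and at least one cell is unoccupied.
std::size_t ProbeInsert(std::vector<int> &cells,
                        std::vector<bool> &occupied,
                        int number) {
    assert(number >= 0);
    assert(!cells.empty() && cells.size() == occupied.size());

    // hashing function: index = number % sizeoftable
    std::size_t index = static_cast<std::size_t>(number) % cells.size();

    // on a collision, keep moving (wrapping at the end) until an empty cell
    while (occupied[index])
        index = (index + 1) % cells.size();

    // put the item in that cell and mark it occupied
    cells[index] = number;
    occupied[index] = true;
    return index;
}
```

With a table of size 10, inserting 13, 23, and 33 fills slots 3, 4, and 5. A later insert of 4 then finds its own slot taken and slides to slot 6, which is the grouping problem noted on the slide.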

6 HASHING SOLUTIONS (cont.)
- another way: quadratic probing
  - uses a formula to determine what is the next available cell:
    index = (value + no_to_move_ahead^2) % tablesize
  - then, if half-full, double the size of the table as before
- Note:
  - for hashing to be a good thing, duplicates must be minimized and hashing function must be quick (have to find balance between computation of indexes and search times)
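The quadratic probing formula can be written as a small helper. The name QuadraticProbe and the parameter `attempt` (standing for no_to_move_ahead) are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Slot examined on a given attempt: attempt 0 is the home slot,
// attempt 1 is the first retry after a collision, and so on.
std::size_t QuadraticProbe(std::size_t value,
                           std::size_t attempt,
                           std::size_t tableSize) {
    assert(tableSize > 0);
    return (value + attempt * attempt) % tableSize;
}
```

For value 13 in a table of size 10, attempts 0 through 3 examine slots 3, 4, 7, and 2, so successive probes spread out instead of marching through neighboring cells.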

7 HASHING IMPLEMENTATION
- Explanation:
  - use of a class containing a struct representing the items in the table
  - three functions: Insert(), Delete(), and Find()
  - write our hashing function as a member function: HashFunction()
  - define the following constants:
    - OCCUPIED
    - UNOCCUPIED
    - DELETED

8 HASHING IMPLEMENTATION
    // status codes for a table cell
    #define Occupied 0
    #define Unoccupied 1
    #define Deleted 2

    // returned by Find() when the item is not in the table
    #define NotFound -1

    // initial number of cells
    #define DefaultSize 30

    typedef int Item;

    struct TableItem {
        Item Value;
        int Status;
    };

9 Hashing Implementation (cont)
    class HashTable {
    public:
        bool Insert(const Item &Value);
        bool Delete(const Item &Value);
        int Find(const Item &Value);
        HashTable();
        ~HashTable();
        void Clear();
        int HashFunction(Item HashValue);
    private:
        int TableSize;        // number of cells allocated
        int CurrentSize;      // number of occupied cells
        TableItem * MyTable;  // dynamically allocated array of cells
    };

10 HASHING MEMBER FUNCTIONS
- The Constructor: allocates memory and calls Clear to set the status of each cell

    HashTable :: HashTable() {
        TableSize = DefaultSize;
        MyTable = new TableItem[TableSize];
        Clear();
    }

- Clear: marks every cell unoccupied and resets the item count

    void HashTable :: Clear() {
        int TempIndex = TableSize - 1;
        while (TempIndex >= 0)
            MyTable[TempIndex--].Status = Unoccupied;
        CurrentSize = 0;   // otherwise CurrentSize is never initialized
    }

11 HASHING MEMBER FUNCTIONS (cont.)
- Destructor - deallocates all memory allocated for the table

    HashTable :: ~HashTable()
    {
        delete [] MyTable;
    }

- Hashing Function - used to calculate the indexes of the table

    int HashTable :: HashFunction(Item HashValue)
    {
        return HashValue % TableSize;
    }
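One caveat about the hashing function: in C++ the result of % takes the sign of the left operand, so a negative HashValue would produce a negative index. The slides only use non-negative values, so this may not matter here. As a hedged sketch, a free function (SafeHash is a hypothetical name, not part of the class shown) could fold the result back into range:

```cpp
#include <cassert>

// Like HashValue % TableSize, but always returns an index in
// the range 0 .. tableSize - 1, even when hashValue is negative.
int SafeHash(int hashValue, int tableSize) {
    assert(tableSize > 0);
    int index = hashValue % tableSize;   // may be negative in C++
    if (index < 0)
        index += tableSize;
    return index;
}
```

For a table of size 10, the value 13 still maps to 3, while -13 maps to 7 instead of -3.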

12 HASHING MEMBER FUNCTIONS (cont.)
- Delete:
  - finds the item using the Find() function
  - if NotFound is returned, then Delete returns false
  - if an index comes back, then status of that cell is set to Deleted and Delete returns true

    bool HashTable :: Delete(const Item &Value)
    {
        int Pos = Find(Value);
        if (Pos == NotFound)
            return false;
        MyTable[Pos].Status = Deleted;
        return true;
    }

13 HASHING MEMBER FUNCTIONS (cont.)
- Find:
  - Find calls HashFunction to find the first possible spot for the item
  - it then moves through the table until it reaches either the item or an unoccupied spot
  - a cell marked Deleted does not stop the search, but an item found in a Deleted cell counts as not found

    int HashTable :: Find(const Item &Value) {
        int Pos = HashFunction(Value);

        // linear probing, wrapping around at the end of the table
        while (MyTable[Pos].Status != Unoccupied &&
               MyTable[Pos].Value != Value)
            if (++Pos >= TableSize)
                Pos = 0;

        if (MyTable[Pos].Status == Unoccupied ||
            MyTable[Pos].Status == Deleted)
            return NotFound;
        else
            return Pos;
    }

14 Hashing Member Functions (cont.)
- Insert:
  1) find the initial spot where the value should be added
  2) use linear probing to find the first available location
  3) once found, if the value is already in that spot, return false
  4) otherwise, add the item, and increase the current size of the table
  5) then, check whether the table is now at least half full; if so, double the table and copy over the values, then delete the old array from memory

15 Hashing Member Functions (Insert)
    bool HashTable :: Insert(const Item &Value) {
        // find spot to add
        int Pos = HashFunction(Value);
        while (MyTable[Pos].Status != Unoccupied &&
               MyTable[Pos].Value != Value) {
            Pos++;
            // see if at the end of table
            if (Pos >= TableSize)
                Pos = 0;
        }

        // if value exists, return without inserting
        if (MyTable[Pos].Status == Occupied)
            return false;

16 Hashing – Insert (cont.)
        // add new item
        MyTable[Pos].Status = Occupied;
        MyTable[Pos].Value = Value;
        CurrentSize++;

        // done unless the table is now at least half full
        if (CurrentSize * 2 < TableSize)
            return true;

        // otherwise, increase size
        TableItem * OldTable = MyTable;   // points to old array
        int OldTableSize = TableSize;
        CurrentSize = 0;

        // get space for new table; TableSize is doubled first because
        // Clear() and HashFunction() both depend on it
        TableSize *= 2;
        MyTable = new TableItem[TableSize];
        Clear();

17 Hashing (Insert)
        // copy values from old table to new one
        for (int i = 0; i < OldTableSize; i++) {
            if (OldTable[i].Status == Occupied)
                Insert(OldTable[i].Value);
        }

        // delete the old table from memory
        delete [] OldTable;
        return true;
    }

18 Hashing (an Example)
- EXAMPLE: Given: hash table with initial size 10 that uses linear probing, show table after following insertions: 13, 4, 23, 99, 100, 25, 33
- (be sure to double size of table when required to)
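One way to check an answer to this exercise is a small stand-alone simulation. The sketch below is not the HashTable class from the slides: ProbingTable, EmptyCell, and Cells() are hypothetical names, std::vector replaces the hand-managed array, and -1 marks an empty cell (so only non-negative values are supported). It keeps the same rule as Insert: after adding an item, double the table if it is at least half full and re-insert the old items in index order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int EmptyCell = -1;   // marks an unoccupied cell

class ProbingTable {
public:
    explicit ProbingTable(std::size_t size) : cells(size, EmptyCell), count(0) {
        assert(size > 0);
    }

    // Returns false if the value is already stored.
    bool Insert(int value) {
        assert(value >= 0);
        std::size_t pos = static_cast<std::size_t>(value) % cells.size();
        // linear probing with wrap-around
        while (cells[pos] != EmptyCell && cells[pos] != value)
            pos = (pos + 1) % cells.size();
        if (cells[pos] == value)
            return false;
        cells[pos] = value;
        ++count;
        if (count * 2 >= cells.size())   // at least half full
            Grow();
        return true;
    }

    const std::vector<int> &Cells() const { return cells; }

private:
    // Double the table and re-insert the old items in index order.
    void Grow() {
        std::vector<int> old;
        old.swap(cells);
        cells.assign(old.size() * 2, EmptyCell);
        count = 0;
        for (std::size_t i = 0; i < old.size(); ++i)
            if (old[i] != EmptyCell)
                Insert(old[i]);
    }

    std::vector<int> cells;
    std::size_t count;
};
```

Under these assumptions, 13, 4, 23, and 99 go to slots 3, 4, 5, and 9 of the size-10 table, and 100 goes to slot 0. That fifth insert makes the table half full, so it doubles to 20 cells and rehashes: 100 at 0, 13 at 13, 4 at 4, 23 at 3, 99 at 19. Finally 25 goes to slot 5, and 33 collides with 13 at slot 13 and moves to slot 14.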

19 Questions?
- Read chapter on Sorting (Intro)
- P. 722 - 733 (Insertion, Selection, Bubble)

