HASHING Section 12.7 (P. 707-717)

Presentation transcript:

HASHING Section 12.7 (P. 707-717)

HASHING
- have already seen binary and linear search and discussed when they might be useful (based on complexity)
    Linear: O(n)
    Binary: O(log n)
- this may not be fast enough in some cases:
    ex: a web server (like Yahoo!)
    millions of searches/sec
- sometimes need a structure that can allow very fast retrieval of records
- a hash table is one such structure

HASHING (Example)
- library card catalog
- suppose you only have a few books:
    you could make a list of the books by their catalog numbers
    if you need to find a book, you search through the numbers till you find the one you're looking for
- not efficient with large library (Library of Congress)
- solution:
    create an array with one slot for each possible book number
    put in each slot: T - the book is there, F - it is not
    then, if you are looking for a book number, look in the array at that slot; if True - the book is there
- running time: O(1) - time to find 1 element
- what is the problem? - not practical: the array needs a slot for every possible book number, even when only a few books exist
- need a structure that can give us the efficiency of O(1) as well as the efficient use of space
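
The T/F array described on this slide can be sketched as follows. This is only an illustration: the class name DirectCatalog and the parameter maxBookNumber are hypothetical and do not come from the slides. The slots() function makes the space problem visible, since the array size depends on the largest possible book number and not on how many books are stored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Direct-address catalog: one true/false slot per possible book number.
// maxBookNumber is an assumed upper bound on catalog numbers.
class DirectCatalog {
public:
    explicit DirectCatalog(std::size_t maxBookNumber)
        : present_(maxBookNumber + 1, false) {}

    // Mark the book as being in the library.
    void add(std::size_t bookNumber) {
        assert(bookNumber < present_.size());
        present_[bookNumber] = true;
    }

    // O(1): a single array access, no searching.
    bool contains(std::size_t bookNumber) const {
        return bookNumber < present_.size() && present_[bookNumber];
    }

    // Number of slots allocated, regardless of how many books were added.
    std::size_t slots() const { return present_.size(); }

private:
    std::vector<bool> present_;
};
```

With six-digit book numbers, a catalog holding a single book would still allocate a million slots, which is the "not practical" part.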

How Does “Hashing” Work?
- can come up with an efficient way to handle the indexes (say, last 2 digits)
- then, you can search for the slot representing the last 2 digits
- problem:
    what if more than one book has same last 2 digits?
    called collision
- one solution:
    create an array at least twice as big as what we are storing
    handle duplicates through technique called chaining
    when duplicate happens, link them together like a linked list
    [draw picture of book list, with last 2 digits as index]
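
A minimal sketch of chaining, assuming the "last 2 digits" index from this slide. That gives 100 slots, each holding a linked list of the book numbers that collide there. The names ChainedCatalog, slotFor, and chainLength are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Chained table indexed by the last two digits of the book number.
class ChainedCatalog {
public:
    ChainedCatalog() : buckets_(100) {}

    // Last two digits of the book number select the slot.
    static std::size_t slotFor(unsigned bookNumber) { return bookNumber % 100; }

    // Returns false if the book number is already stored.
    bool insert(unsigned bookNumber) {
        if (contains(bookNumber)) return false;
        buckets_[slotFor(bookNumber)].push_back(bookNumber);
        return true;
    }

    // Only the chain in one slot is searched, not the whole table.
    bool contains(unsigned bookNumber) const {
        for (unsigned stored : buckets_[slotFor(bookNumber)])
            if (stored == bookNumber) return true;
        return false;
    }

    // Number of colliding entries linked together in one slot.
    std::size_t chainLength(std::size_t slot) const {
        assert(slot < buckets_.size());
        return buckets_[slot].size();
    }

private:
    std::vector<std::list<unsigned>> buckets_;
};
```

Inserting 1234 and 5634 puts both in slot 34, linked one after the other. A lookup for either one walks only that short chain.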

How Does Hashing Work? (cont)
- another solution: place duplicate in another location in the table
- one way: linear probing:
    can create a way to mark indexes, called a hashing function:
        index = number % sizeoftable
    add a mark indicating that a cell is occupied or unoccupied
    as you add item, indicate that the cell is now occupied
    if collision:
        start at place where collision occurs
        keep moving until find empty cell
        put item in that cell and mark it occupied
- good strategy:
    if table gets half-full, double the size and redo the hashing function to create new indexes
- problem: collision can group items together using this method
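
The collision steps above can be traced with a small stand-alone function. This is a simplified sketch, not the class developed later in these slides. It assumes non-negative numbers, uses a hypothetical kEmpty marker instead of a separate status field, and does not resize the table.

```cpp
#include <cassert>
#include <vector>

const int kEmpty = -1;  // hypothetical marker for an unoccupied cell

// Stores number in table using linear probing and returns the index used,
// or -1 if every cell is already occupied.
int linearProbeInsert(std::vector<int>& table, int number) {
    const int size = static_cast<int>(table.size());
    assert(size > 0 && number >= 0);
    int index = number % size;            // hashing function: number % sizeoftable
    for (int steps = 0; steps < size; ++steps) {
        if (table[index] == kEmpty) {     // found an empty cell
            table[index] = number;
            return index;
        }
        index = (index + 1) % size;       // collision: move to the next cell, wrapping around
    }
    return -1;                            // table is full
}
```

In a table of size 10, inserting 12, 22, and 32 fills cells 2, 3, and 4. A later insert of 3 hashes to cell 3 but lands in cell 5, because the earlier collisions have grouped items into one run. This is the grouping problem named on the slide.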

HASHING SOLUTIONS (cont.)
- another way: quadratic probing
    uses a formula to determine what is the next available cell:
        index = (value + no_to_move_ahead^2) % tablesize
    then, if half-full, double the size of the table as before
- Note:
    for hashing to be a good thing, duplicates must be minimized and hashing function must be quick (have to find balance between computation of indexes and search times)
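
A sketch of the quadratic probing formula, under the same simplifying assumptions as a plain array of non-negative integers. The kEmpty marker and the function name are hypothetical. Here i plays the role of no_to_move_ahead, so the cells tried are home, home + 1, home + 4, home + 9, and so on, each taken modulo the table size.

```cpp
#include <cassert>
#include <vector>

const int kEmpty = -1;  // hypothetical marker for an unoccupied cell

// Stores number in table using quadratic probing and returns the index used,
// or -1 if none of the probed cells was free.
int quadraticProbeInsert(std::vector<int>& table, int number) {
    const int size = static_cast<int>(table.size());
    assert(size > 0 && number >= 0);
    const int home = number % size;
    for (int i = 0; i < size; ++i) {
        // index = (value + i^2) % tablesize, computed in 64 bits to avoid overflow
        const int index =
            static_cast<int>((home + static_cast<long long>(i) * i) % size);
        if (table[index] == kEmpty) {
            table[index] = number;
            return index;
        }
    }
    return -1;
}
```

In a table of size 10, inserting 12, 22, and 32 uses cells 2, 3, and 6 instead of the run 2, 3, 4 that linear probing would produce. Quadratic probing does not visit every cell, so it can fail even when free cells remain. Keeping the table less than half full, as the slide suggests, is the usual guard against that.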

HASHING IMPLEMENTATION
Explanation:
- use of a class containing a struct representing the items in the table
- three functions: insert(), delete(), and search()
- write our hashing function as a member function: hash()
- define the following constants:
    OCCUPIED
    UNOCCUPIED
    DELETED

HASHING IMPLEMENTATION
    #define Occupied 0
    #define Unoccupied 1
    #define Deleted 2
    #define NotFound -1
    #define DefaultSize 30

    typedef int Item;

    // one cell of the table: the stored value plus its status flag
    struct TableItem {
        Item Value;
        int Status;
    };

Hashing Implementation (cont)
    class HashTable {
    public:
        bool Insert(const Item &Value);
        bool Delete(const Item &Value);
        int Find(const Item &Value);
        HashTable();
        ~HashTable();
        void Clear();
        int HashFunction(Item HashValue);
    private:
        int TableSize;        // number of cells allocated
        int CurrentSize;      // number of cells filled by Insert
        TableItem * MyTable;  // dynamically allocated array of cells
    };

HASHING MEMBER FUNCTIONS
The Constructor: allocates memory and calls Clear to set the status of each cell
    HashTable :: HashTable()
    {
        TableSize = DefaultSize;
        CurrentSize = 0;   // table starts out empty
        MyTable = new TableItem[TableSize];
        Clear();
    }
Clear:
    void HashTable :: Clear()
    {
        int TempIndex = TableSize - 1;
        while (TempIndex >= 0)
            MyTable[TempIndex--].Status = Unoccupied;
    }

HASHING MEMBER FUNCTIONS (cont.)
Destructor - deallocates all memory allocated for the table
    HashTable :: ~HashTable()
    {
        delete [] MyTable;
    }
Hashing Function - used to calculate the indexes of the table
    int HashTable :: HashFunction(Item HashValue)
    {
        return HashValue % TableSize;
    }

HASHING MEMBER FUNCTIONS (cont.)
Delete:
- finds the item using the Find() function
- if NotFound returned, then Delete returns false
- if an index comes back, then status of that cell set to Deleted and Delete returns true
    bool HashTable :: Delete(const Item &Value)
    {
        int Pos = Find(Value);
        if (Pos == NotFound)
            return false;
        MyTable[Pos].Status = Deleted;
        return true;
    }
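
Why mark a cell Deleted instead of resetting it to Unoccupied? The slides do not say. A likely reason is that a linear-probing search stops at the first unoccupied cell, so clearing a cell in the middle of a run would hide the items stored after it. The stand-alone sketch below illustrates this. Its names (Status, Cell, findIndex) are hypothetical and simplified from the class in these slides, and values are assumed to be non-negative.

```cpp
#include <cassert>
#include <vector>

enum class Status { Unoccupied, Occupied, Deleted };

struct Cell {
    int value = 0;
    Status status = Status::Unoccupied;
};

// Linear-probing search: skips Deleted cells, stops at the first Unoccupied one.
// Returns the index of value, or -1 if it is not in the table.
int findIndex(const std::vector<Cell>& table, int value) {
    const int size = static_cast<int>(table.size());
    assert(size > 0 && value >= 0);
    int index = value % size;
    for (int steps = 0; steps < size; ++steps) {
        const Cell& cell = table[index];
        if (cell.status == Status::Unoccupied)
            return -1;                    // end of the probe run: not present
        if (cell.status == Status::Occupied && cell.value == value)
            return index;
        index = (index + 1) % size;       // Deleted or different value: keep probing
    }
    return -1;
}
```

Suppose 12 sits in cell 2 and 22, which collided with it, sits in cell 3 of a 10-cell table. If cell 2 is marked Deleted, a search for 22 still reaches cell 3. If cell 2 were reset to Unoccupied, the same search would stop at cell 2 and wrongly report that 22 is missing.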

HASHING MEMBER FUNCTIONS (cont.)
Find:
- Find calls HashFunction to find the first possible spot for the item
- It then moves through the table until it finds the value or reaches an unoccupied spot
    int HashTable :: Find(const Item &Value)
    {
        int Pos = HashFunction(Value);
        while (MyTable[Pos].Status != Unoccupied &&
               MyTable[Pos].Value != Value)
            if (++Pos >= TableSize)
                Pos = 0;   // wrap around to the start of the table
        // an empty cell or a deleted copy of the value both mean "not in the table"
        if (MyTable[Pos].Status == Unoccupied ||
            MyTable[Pos].Status == Deleted)
            return NotFound;
        else
            return Pos;
    }

Hashing Member Functions (cont.)
Insert:
1) find the initial spot where the value should be added
2) use linear probing to find the first available location
3) once found, if the value is already in that spot, return false
4) otherwise, add the item, and increase the current size of the table
5) then, check whether the table is now at least half full; if so, double the table and copy over the values, then delete the old array from memory

Hashing Member Functions (Insert)
    bool HashTable :: Insert(const Item &Value)
    {
        // find spot to add
        int Pos = HashFunction(Value);
        while (MyTable[Pos].Status != Unoccupied &&
               MyTable[Pos].Value != Value) {
            Pos++;
            // see if at the end of table
            if (Pos >= TableSize)
                Pos = 0;
        }
        // if value exists, return without inserting
        if (MyTable[Pos].Status == Occupied)
            return false;

Hashing – Insert (cont.)
        // add new item
        MyTable[Pos].Status = Occupied;
        MyTable[Pos].Value = Value;
        CurrentSize++;
        // if still less than half full, we are done
        if (CurrentSize * 2 < TableSize)
            return true;
        // if it is half full or more, increase size
        TableItem * OldTable = MyTable;   // points to old array
        int OldTableSize = TableSize;
        CurrentSize = 0;
        // double TableSize first so that Clear() resets every cell of the new table
        TableSize *= 2;
        // get space for new table
        MyTable = new TableItem[TableSize];
        Clear();

Hashing (Insert)
        // copy values from old table to new one
        for (int i = 0; i < OldTableSize; i++) {
            if (OldTable[i].Status == Occupied)
                Insert(OldTable[i].Value);
        }
        // delete the old table from memory
        delete [] OldTable;
        return true;
    }

Hashing (an Example) EXAMPLE: Given: hash table with initial size 10 that uses linear probing, show table after following insertions: (be sure to double size of table when required to)

Questions?
- Read chapter on Sorting (Intro)
- P (Insertion, Selection, Bubble)