hashing1 Hashing It’s not just for breakfast anymore!


hashing2 Hashing: the facts
An approach that involves both storing and searching for values
Behavior is linear in the worst case, but it is a strong competitor with binary search in the average case
Hashing makes it easy to add and delete elements, an advantage over binary search (since the latter requires a sorted array)

hashing3 Dictionary ADT
Previously, we have seen a dictionary ADT implemented as a binary search tree
A hash table can be used to provide an array-based dictionary implementation
Abstract properties of a dictionary:
– every item has a key
– to retrieve an item, specify its key; the retrieval process fetches the associated data

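hashing4 Possible structure for single dictionary item
template <class item>
struct RecordType
{
    int key;          // signed, so a negative value such as -1 can mark an unused slot
    item datarecord;  // the data associated with the key
};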

hashing5 Setting up the array
One approach to an array-based dictionary would be to create consecutive keys, storing the records so that each key corresponds to its index (this is the method used in MS Access, for example)
An alternative would be to use an existing attribute of the data to be stored as the key value; this approach is more typical of hashing

hashing6 Setting up the array
Use of an existing key field presents challenges:
– Value may be too large for indexing: e.g. a social security number
– No guarantee that individual values will be close enough together for effective indexing: e.g. last 4 digits of social security numbers of students in a class

hashing7 Solution: hashing
Instead of using the data field directly, a function is applied to the original value to produce a valid index: this is called the hash function
The hash function maps the key to an index that can be used to insert data into the array or to retrieve data based on a given key
An array that uses hashing for indexing is called a hash table

hashing8 Operations on a hash table
Inserting an item
– calculate hash value (index) from item key
– check index to determine if space is open
    if open, insert item
    if not open, collision occurs; search through array for next open slot
– requires some mechanism for recognizing an empty space; can’t just start with an uninitialized array

hashing9 Open-address hashing
The insertion scheme just described uses open-address hashing
In open addressing, collisions are resolved by placing a new item in the next open spot in the array (stepping forward one slot at a time in this way is known as linear probing)
The scheme requires that the key field of each array element be initialized to some known value; -1, for example
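The insertion scheme can be sketched in a few lines. This is a minimal illustration rather than the dictionary class developed later in the slides: the table holds bare int keys, and the names init_table and insert_key, as well as the small CAPACITY of 7, are assumptions made only for this example.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t CAPACITY = 7;  // hypothetical small table, for illustration only
const int NEVER_USED = -1;       // known value that marks an empty slot

// Fill every slot with the sentinel so that empty slots can be recognized.
void init_table(int table[])
{
    for (std::size_t i = 0; i < CAPACITY; ++i)
        table[i] = NEVER_USED;
}

// Insert key using open addressing and return the index where it was stored.
// Returns CAPACITY if every slot is already occupied.
std::size_t insert_key(int table[], int key)
{
    assert(key >= 0);  // negative values are reserved for the sentinel
    std::size_t index = static_cast<std::size_t>(key) % CAPACITY;
    for (std::size_t probes = 0; probes < CAPACITY; ++probes)
    {
        if (table[index] == NEVER_USED)
        {
            table[index] = key;  // open slot: store the key here
            return index;
        }
        index = (index + 1) % CAPACITY;  // collision: step forward, wrapping at the end
    }
    return CAPACITY;  // table is full
}
```

With an empty table, inserting 9 and then 16 places them at indices 2 and 3: both keys hash to 2, so the second one moves forward one slot.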

hashing10 Inserting a New Record
In order to insert a new record, the key must somehow be converted to an array index. The index is called the hash value of the key.
[Figure: array with slots [0] through [700]; each record is labeled with its key, shown as "Number"]

hashing11 Inserting a New Record
Typical hash function: (Number mod 701)
– 701 is the number of slots in the array
– Number is the original key value
What is (Number mod 701) for the new record's key? 3
[Figure: array with slots [0] through [700]; each record is labeled with its key, shown as "Number"]
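The mod-based hash function can be written out directly. In this sketch the function name hash_value and the sample keys mentioned afterward are made up for illustration; only the table size of 701 comes from the slide.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t TABLE_SIZE = 701;  // slots [0] through [700], as on the slide

// Map a non-negative key to an index in the range [0, TABLE_SIZE).
std::size_t hash_value(long key)
{
    assert(key >= 0);  // this sketch only handles non-negative keys
    return static_cast<std::size_t>(key) % TABLE_SIZE;
}
```

For example, hash_value(704) is 3 because 704 = 1 × 701 + 3. The key 1405 also hashes to 3 (1405 = 2 × 701 + 3), which shows how two different keys can compete for the same slot.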

hashing12 Inserting a New Record
The hash value is used for the location of the new record.
[Figure: the new record is placed in slot [3] of the array]

hashing13 Collisions
Here is another new record to insert, with a hash value of 2.
[Figure: array with slots [0] through [700]; the new record says "My hash value is [2]."]

hashing14 Collisions
This is called a collision, because there is already another valid record at [2].
When a collision occurs, move forward until you find an empty spot.
[Figure: the new record moves forward through the occupied slots of the array]

hashing15 Collisions
The new record goes in the empty spot.
[Figure: the new record is stored in the first empty slot found]

hashing16 Operations on a hash table
Retrieving an item
– calculate hash value based on desired key
– search array, beginning at calculated index, for desired data
– search is finished when:
    item is found; successful search
    an empty index is encountered; unsuccessful search
    every slot has been examined (possible only when no empty slot remains); unsuccessful search
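The retrieval steps translate into a short probing loop. This is a sketch under simplifying assumptions: the table stores bare int keys, and the name find_key and the CAPACITY of 7 are hypothetical choices for the example, not part of the slides' dictionary class.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t CAPACITY = 7;  // hypothetical small table, for illustration only
const int NEVER_USED = -1;       // marks a slot that has never held a key

// Return the index that holds key, or CAPACITY if the key is not in the table.
std::size_t find_key(const int table[], int key)
{
    assert(key >= 0);
    std::size_t index = static_cast<std::size_t>(key) % CAPACITY;
    for (std::size_t probes = 0; probes < CAPACITY; ++probes)
    {
        if (table[index] == key)
            return index;                // item is found: successful search
        if (table[index] == NEVER_USED)
            return CAPACITY;             // empty slot reached: unsuccessful search
        index = (index + 1) % CAPACITY;  // keep moving forward, wrapping at the end
    }
    return CAPACITY;                     // every slot examined without a match
}
```

The probe counter bounds the loop, so the search terminates even when the table contains no empty slot.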

hashing17 Searching for a Key
The data that's attached to a key can be found fairly quickly.
[Figure: array with slots [0] through [700]; each record is labeled with its key, shown as "Number"]

hashing18 Searching for a Key
Calculate the hash value. Check that location of the array for the key.
[Figure: the search key says "My hash value is [2]."; the record examined first answers "Not me."]

hashing19 Searching for a Key
Keep moving forward until you find the key, or you reach an empty spot.
[Figure: the search key says "My hash value is [2]."; the first two records examined answer "Not me.", and the next one answers "Yes!"]

hashing20 Searching for a Key
When the item is found, the information can be copied to the necessary location.
[Figure: array with slots [0] through [700]; the search key says "My hash value is [2]."]

hashing21 Operations on a hash table
Deleting an item:
– find index based on hashed key, as with insertion and retrieval
– mark record at index to indicate the spot is open
    can’t use the ordinary “empty” designation: this could interfere with record retrieval
    use an alternative “open” designation: it indicates the slot is open for insertion, but won’t stop a search

hashing22 Deleting a Record
Records may also be deleted from a hash table.
But the location must not be left as an ordinary "empty spot" since that could interfere with searches.
The location must be marked in some special way so that a search can tell that the spot used to have something in it.
[Figure: array with slots [0] through [700]; one record says "Please delete me."]
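A small sketch shows why the special marker matters. It assumes a table of bare int keys; the names find_key and remove_key and the CAPACITY of 7 are hypothetical, while the two sentinel values follow the constants the slides introduce for the dictionary class.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t CAPACITY = 7;   // hypothetical small table, for illustration only
const int NEVER_USED = -1;        // slot has never held a key: a search may stop here
const int PREVIOUSLY_USED = -2;   // slot held a key that was deleted: a search must continue

// Return the index that holds key, or CAPACITY if the key is absent.
std::size_t find_key(const int table[], int key)
{
    assert(key >= 0);
    std::size_t index = static_cast<std::size_t>(key) % CAPACITY;
    for (std::size_t probes = 0; probes < CAPACITY; ++probes)
    {
        if (table[index] == key)
            return index;
        if (table[index] == NEVER_USED)
            return CAPACITY;             // only a never-used slot ends the search
        index = (index + 1) % CAPACITY;  // a PREVIOUSLY_USED slot is stepped over
    }
    return CAPACITY;
}

// Delete key by marking its slot PREVIOUSLY_USED; returns false if key is absent.
bool remove_key(int table[], int key)
{
    std::size_t index = find_key(table, key);
    if (index == CAPACITY)
        return false;
    table[index] = PREVIOUSLY_USED;
    return true;
}
```

Suppose 9 sits at index 2 and 16 (which also hashes to 2) sits at index 3. After removing 9, a search for 16 still succeeds because it steps over the marked slot. Had the slot been reset to NEVER_USED, the search would have stopped at index 2 and wrongly reported 16 as missing.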

hashing23 A class specification for a hashing dictionary
Public functions:
– constructor: creates and initializes empty dictionary
– insert: inserts a new item
– is_present: returns true if specified item is found in dictionary, false if not
– find: returns a copy of the desired item, if found
– remove: removes specified record if it exists
– size: returns total number of records in dictionary

hashing24 Invariant for dictionary class
Member variable used stores the number of records currently in the dictionary
Member variable data is an array of CAPACITY entries; actual records are stored here
Each valid record has a non-negative key value; an unused record has its key field set to the constant NEVER_USED or the constant PREVIOUSLY_USED

hashing25 Code for dictionary class
template <class RecType>
class Dictionary
{
public:
    enum {CAPACITY = 811};
    Dictionary( );
    void insert (const RecType& entry);
    void remove (int key);
    bool is_present(int key) const;
    void find (int key, bool& found, RecType& result) const;
    size_t size( ) const {return used;}

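hashing26 Code for dictionary class
…
private:
    // static: the constants belong to the class rather than to each object
    static const int NEVER_USED = -1;
    static const int PREVIOUSLY_USED = -2;
    RecType data[CAPACITY];  // the records themselves
    size_t used;             // number of records currently stored
…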

hashing27 Helper functions in dictionary class
– hash: calculates hash value for given key
– next_index: steps through array, providing wrap-around function at end of array
– find_index: finds array index of record with given key
– never_used: returns true if index has never been used
– is_vacant: returns true if index is not currently in use

hashing28 Code for dictionary class
...
    // helper functions:
    size_t hash (int key) const {return key%CAPACITY;}
    size_t next_index (size_t index) const {return (index+1)%CAPACITY;}
    void find_index (int key, bool& found, size_t& index) const;
    bool never_used (size_t index) const {return data[index].key == NEVER_USED;}
    bool is_vacant(size_t index) const {return data[index].key < 0;}
};
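The difference between never_used and is_vacant, and the wrap-around in next_index, are easy to check in isolation. The sketch below rewrites the helpers as free functions that take a key value instead of an index; that change of signature is an assumption made so the example stands alone without the class.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t CAPACITY = 811;
const int NEVER_USED = -1;
const int PREVIOUSLY_USED = -2;

// Step to the following slot, wrapping from the last index back to 0.
std::size_t next_index(std::size_t index)
{
    assert(index < CAPACITY);
    return (index + 1) % CAPACITY;
}

// True only for a slot that has never held a record.
bool never_used(int key) { return key == NEVER_USED; }

// True for any slot that holds no record now: never used or previously used.
bool is_vacant(int key) { return key < 0; }
```

Here next_index(810) yields 0, and a PREVIOUSLY_USED key counts as vacant but not as never used, which is the distinction that lets a search continue past deleted records.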

hashing29 Function implementations
// constructor
template <class RecType>
Dictionary<RecType>::Dictionary( )
{
    used = 0;
    // mark every slot as never used so empty slots can be recognized
    for (int x=0; x<CAPACITY; x++)
        data[x].key = NEVER_USED;
}

hashing30 Function implementations
// helper function find_index
template <class RecType>
void Dictionary<RecType>::find_index(int key, bool& found, size_t& index) const
{
    size_t count=0;  // number of slots examined so far
    index = hash(key);
    // stop at the key, at a never-used slot, or after examining every slot
    while ((count < CAPACITY) && (!never_used(index)) && (data[index].key != key))
    {
        count++;
        index = next_index(index);
    }
    found = (data[index].key == key);
}

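hashing31 Function implementations
template <class RecType>
void Dictionary<RecType>::insert (const RecType& entry)
{
    bool already_present; // true if entry already in table
    size_t index;         // location of new entry
    find_index(entry.key, already_present, index);
    if (!already_present)
    {
        assert (size( ) < CAPACITY);
        // find_index can stop on an occupied slot when no NEVER_USED slot remains,
        // so probe from the hash value for the first vacant slot; this also
        // reuses slots marked PREVIOUSLY_USED
        index = hash(entry.key);
        while (!is_vacant(index))
            index = next_index(index);
        used++;
        data[index] = entry;
    }
}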

hashing32 Function implementations
template <class RecType>
void Dictionary<RecType>::remove (int key)
{
    bool found;   // true if key occurs somewhere in table
    size_t index; // index of key value
    assert (key >= 0); // must be valid key
    find_index(key, found, index);
    if (found)
    {
        // mark the slot as open without stopping later searches
        data[index].key = PREVIOUSLY_USED;
        used--;
    }
}

hashing33 Function implementations
template <class RecType>
bool Dictionary<RecType>::is_present(int key) const
{
    bool found;
    size_t index;
    assert (key >= 0);
    find_index (key, found, index);
    return found;
}

hashing34 Function implementations
template <class RecType>
void Dictionary<RecType>::find(int key, bool& found, RecType& result) const
{
    size_t index;
    assert (key >= 0);
    find_index(key, found, index);
    if (found)
        result = data[index];  // copy the record out to the caller
}
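For experimentation, the class fragments from the slides can be assembled into one compilable sketch. Several details here are assumptions rather than the slides' exact code: member functions are defined inside the class for brevity, the constants are declared static, the Record struct is a made-up record type, and insert probes for the first vacant slot so that slots marked PREVIOUSLY_USED can be reused.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical record type for this sketch: a non-negative key plus a payload.
struct Record
{
    int key;
    double data;
};

template <class RecType>
class Dictionary
{
public:
    static const std::size_t CAPACITY = 811;

    Dictionary() : used(0)
    {
        for (std::size_t i = 0; i < CAPACITY; ++i)
            data[i].key = NEVER_USED;  // mark every slot as empty
    }

    void insert(const RecType& entry)
    {
        assert(entry.key >= 0);
        bool already_present;
        std::size_t index;
        find_index(entry.key, already_present, index);
        if (!already_present)
        {
            assert(size() < CAPACITY);
            index = hash(entry.key);
            while (!is_vacant(index))  // first never-used or previously-used slot
                index = next_index(index);
            ++used;
            data[index] = entry;
        }
    }

    void remove(int key)
    {
        assert(key >= 0);
        bool found;
        std::size_t index;
        find_index(key, found, index);
        if (found)
        {
            data[index].key = PREVIOUSLY_USED;  // open for insertion, transparent to searches
            --used;
        }
    }

    bool is_present(int key) const
    {
        assert(key >= 0);
        bool found;
        std::size_t index;
        find_index(key, found, index);
        return found;
    }

    void find(int key, bool& found, RecType& result) const
    {
        assert(key >= 0);
        std::size_t index;
        find_index(key, found, index);
        if (found)
            result = data[index];
    }

    std::size_t size() const { return used; }

private:
    static const int NEVER_USED = -1;
    static const int PREVIOUSLY_USED = -2;
    RecType data[CAPACITY];
    std::size_t used;

    std::size_t hash(int key) const { return static_cast<std::size_t>(key) % CAPACITY; }
    std::size_t next_index(std::size_t index) const { return (index + 1) % CAPACITY; }
    bool never_used(std::size_t index) const { return data[index].key == NEVER_USED; }
    bool is_vacant(std::size_t index) const { return data[index].key < 0; }

    void find_index(int key, bool& found, std::size_t& index) const
    {
        std::size_t count = 0;
        index = hash(key);
        while (count < CAPACITY && !never_used(index) && data[index].key != key)
        {
            ++count;
            index = next_index(index);
        }
        found = (data[index].key == key);
    }
};
```

As a usage example, keys 5 and 816 both hash to 5 in a table of 811 slots, so a Dictionary<Record> stores them in neighboring slots. Removing 5 leaves 816 findable, and a later insert of another key that hashes to 5 reuses the slot that was freed.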