Resolving Collision

Presentation transcript:

1 Resolving Collision
Although collisions should be avoided as much as possible, they are inevitable, so we need a strategy for resolving them. We look at 4 methods:
– Chaining
– Closed hashing
– Rehashing
– Extendible hashing
(We spend more time on chaining than on the other 3!)
ADS2 lecture 21, Chapter 10: Maps and hash tables contd.

2 Chaining
E.g. encode a large set as a fixed-size sequence of small sets.
– Use a hash function h(x) to determine which small set x belongs to.
– This type of hashing is called chaining.

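3 Example
Set A, of naturals, is encoded as a 10-element sequence of mini-sets. Each value x of set A is placed in the mini-set with index x mod 10.
[Tables of Index against mini-set for set A, before and after inserting 15 and 27, are not reproduced in this transcript. By the rule above, 15 goes into mini-set 5 and 27 into mini-set 7.]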
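The example can be sketched in code as follows. This is a minimal illustration rather than the lecture's own implementation: the class name `ChainedSet` and its methods are hypothetical, Python lists stand in for the linked-list mini-sets, and the value 35 is added only to show two values sharing a mini-set (the original contents of set A are not in the transcript).

```python
class ChainedSet:
    """Set of naturals stored as a fixed-size sequence of mini-sets."""

    def __init__(self, size=10):
        # One (initially empty) mini-set per index 0..size-1.
        self.buckets = [[] for _ in range(size)]

    def _index(self, x):
        # Hash function from the slide: h(x) = x mod 10 for a 10-element table.
        return x % len(self.buckets)

    def add(self, x):
        bucket = self.buckets[self._index(x)]
        if x not in bucket:  # keep each mini-set duplicate-free
            bucket.append(x)

    def contains(self, x):
        # Only the one mini-set selected by h(x) is searched.
        return x in self.buckets[self._index(x)]


a = ChainedSet()
for x in (15, 27, 35):  # 35 is an illustrative extra value that collides with 15
    a.add(x)

print(a.buckets[5])  # mini-set for index 5
print(a.buckets[7])  # mini-set for index 7
```

Both 15 and 35 land in the mini-set at index 5, which is exactly the collision that chaining absorbs.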

4 Search
To find whether x occurs, look in mini-set x mod 10. On average this reduces the searching work by a factor of 10. To make indexing quick, implement the sequence as a 10-element array.
Map ADT as hash table
Suppose we already have a set defined (using a linked list). E.g. to add a key-value pair (K,V), i.e. put(K,V):
– Use the hash function to find h(K) = i.
– Use set operations on the set stored at position i of the hash table to insert the new value.
Code for this implementation will appear in the Maps folder.
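The put steps above might look like this in code. This is a sketch under assumptions, not the code from the Maps folder: `ChainedMap` is a hypothetical name, each chain is a Python list of (key, value) pairs rather than a linked list, and Python's built-in `hash` reduced modulo the table size plays the role of h.

```python
class ChainedMap:
    """Map ADT backed by a hash table with chaining."""

    def __init__(self, size=10):
        self.table = [[] for _ in range(size)]

    def _index(self, key):
        # h(K) = i: compress the key's hash code into a table position.
        return hash(key) % len(self.table)

    def put(self, key, value):
        chain = self.table[self._index(key)]
        for pos, (k, _) in enumerate(chain):
            if k == key:
                chain[pos] = (key, value)  # key already present: replace value
                return
        chain.append((key, value))  # new key: insert into the chain at position i

    def get(self, key):
        for k, v in self.table[self._index(key)]:
            if k == key:
                return v
        return None  # key not present


m = ChainedMap()
m.put(15, "a")
m.put(25, "b")  # collides with 15 (both hash to position 5)
m.put(15, "c")  # overwrites the value stored for 15
print(m.get(15), m.get(25), m.get(5))
```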

5 Complexity issues
The larger the hash table, the better the speed-up, but the more space is needed.
If n is an upper bound on the set's cardinality (and all elements are positive):
– With an n-element hash table and hash function h(x) = x, the average time complexity of MakeEmpty, AddToSet and ElementOf is O(1).
– With an n/4-element table and hash function h(x) = x mod (n/4), the average time complexity is still O(1), but operations are slower because each mini-set holds about 4 items.
If we know that the data is well distributed and hash collisions are rare, we can assume that the complexity is O(1).
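A small experiment illustrates the n/4-element case. The values are assumptions chosen for illustration only: n = 40 and a set containing exactly the keys 0..39, which is the evenly distributed case.

```python
n = 40      # illustrative bound on the set's size
k = n // 4  # table with n/4 = 10 positions

buckets = [[] for _ in range(k)]
for x in range(n):
    buckets[x % k].append(x)  # h(x) = x mod (n/4)

sizes = [len(b) for b in buckets]
print(sizes)  # number of items in each mini-set
```

Every mini-set ends up with 4 items, so each lookup scans at most 4 entries: a constant, but a larger one than with the n-element table.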

6 Sequencing
Writing out all the elements of an injective (duplicate-free) linked list takes O(n) time, where n is the cardinality. With a k-element hash table the time becomes O(n+k).
Minimum
If the set were represented by a sorted sequence (array or linked list), finding the minimum would take O(1) time, but with a k-element hash table the time would become O(n+k).
– If deleting the minimum is a much-used operation: do not hash.
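The O(n+k) cost comes from having to visit every one of the k table positions, empty or not, as well as all n elements. A minimal sketch, assuming the table is a list of Python lists (the function names `elements` and `minimum` are hypothetical):

```python
def elements(buckets):
    """Yield every element: touches all k buckets and all n elements."""
    for bucket in buckets:  # k iterations, even for empty buckets
        for x in bucket:    # n iterations in total
            yield x


def minimum(buckets):
    """The hash table keeps no order, so the whole table must be scanned."""
    return min(elements(buckets), default=None)


buckets = [[] for _ in range(10)]
for x in (15, 27, 35):
    buckets[x % 10].append(x)

print(list(elements(buckets)))
print(minimum(buckets))
```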

7 Other types of collision resolution…
Closed hashing / open addressing: does not use linked lists to resolve hash collisions. If a collision occurs, alternative cells are tried until an empty cell is found. Requires more space.
– Various collision resolution strategies are available, e.g. linear probing, quadratic probing, double hashing.
– Method preferred when memory is limited (e.g. small handheld device or sensor network).
See next slide.

8 Linear probing
To hash k, if h(k) is closed (i.e. full), successively check the next cells along until an open (empty) cell is found (wrapping round if necessary). Provided the amount of data is less than the size of the table, a space will always be found eventually.
To search for a value k, look in cell h(k) and (if necessary) all successors until k or an open cell is found. (And delete similarly.) But will this work? See board.
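The board discussion is not part of this transcript, so what follows is an assumption about the point being raised. One well-known problem is that deleting by simply emptying a cell can break later searches. The sketch below uses hypothetical names (`LinearProbingSet`, `naive_delete`) and h(k) = k mod 10.

```python
EMPTY = None


class LinearProbingSet:
    def __init__(self, size=10):
        self.cells = [EMPTY] * size

    def _probe(self, k):
        """Yield h(k), h(k)+1, ... wrapping round, visiting each cell once."""
        n = len(self.cells)
        start = k % n
        for j in range(n):
            yield (start + j) % n

    def insert(self, k):
        for i in self._probe(k):
            if self.cells[i] is EMPTY or self.cells[i] == k:
                self.cells[i] = k
                return i
        raise OverflowError("table is full")

    def search(self, k):
        for i in self._probe(k):
            if self.cells[i] is EMPTY:
                return False  # open cell reached: k cannot be further along
            if self.cells[i] == k:
                return True
        return False

    def naive_delete(self, k):
        # "Delete similarly": find k and simply mark its cell open again.
        for i in self._probe(k):
            if self.cells[i] is EMPTY:
                return
            if self.cells[i] == k:
                self.cells[i] = EMPTY
                return


t = LinearProbingSet()
t.insert(15)  # goes to cell 5
t.insert(25)  # cell 5 is closed, so it goes to cell 6
t.naive_delete(15)
found = t.search(25)  # probing starts at cell 5, which is now open
print(found)
```

The search for 25 stops at the newly opened cell 5 even though 25 is still stored in cell 6. A common remedy is to mark deleted cells with a special "deleted" marker that searches skip over and inserts may reuse.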

9 Quadratic probing and double hashing
Quadratic probing: if a value is to be inserted into A[i] and that cell is full, then cells A[(i + j^2) mod n] are checked, for j = 0, 1, 2, …, until an empty space is found.
Double hashing: a secondary hash function h' is used. If the original hash function h maps some key k to a bucket A[i], with i = h(k), that is already occupied, then iteratively try buckets A[(i + j·h'(k)) mod n], for j = 0, 1, 2, …, until an empty space is found.
Open addressing is preferred when memory is limited, e.g. in programs for small (memory-limited) handheld devices or a node in a sensor network.
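The two probe sequences can be compared directly. The concrete choices here are assumptions for illustration: table size n = 11, primary hash h(k) = k mod 11, and secondary hash h'(k) = 7 - (k mod 7), a common textbook form that never evaluates to 0.

```python
def quadratic_probes(i, n):
    """Cells A[(i + j^2) mod n] for j = 0, 1, 2, ..."""
    return [(i + j * j) % n for j in range(n)]


def double_hash_probes(i, step, n):
    """Cells A[(i + j * h'(k)) mod n] for j = 0, 1, 2, ..."""
    return [(i + j * step) % n for j in range(n)]


n = 11
k = 15
i = k % n           # primary hash: h(15) = 4
step = 7 - (k % 7)  # illustrative secondary hash: h'(15) = 6

quad = quadratic_probes(i, n)[:4]
dbl = double_hash_probes(i, step, n)[:4]
print(quad)  # first four cells tried by quadratic probing
print(dbl)   # first four cells tried by double hashing
```

With double hashing, two keys that share the same first cell usually get different step sizes, so their probe sequences diverge instead of following each other.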

10 Other types of collision resolution cont.
Rehashing: when the table becomes too full, the running time for operations becomes prohibitively large. So build another table twice as large (with a new hash function) and insert all elements into the new table. See board.
Rehashing is expensive (O(n)) but happens infrequently: there must have been at least n/2 insertions prior to a rehash.
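A minimal sketch of rehashing on a chained table follows. The trigger used here (more elements than table positions) and the name `GrowingSet` are assumptions; the slide does not say when a table counts as "too full". The new hash function is simply x mod the new table size.

```python
class GrowingSet:
    def __init__(self, size=4):
        self.buckets = [[] for _ in range(size)]
        self.count = 0
        self.rehashes = 0

    def contains(self, x):
        return x in self.buckets[x % len(self.buckets)]

    def add(self, x):
        bucket = self.buckets[x % len(self.buckets)]
        if x in bucket:
            return
        bucket.append(x)
        self.count += 1
        # Assumed threshold: rehash once there are more elements than positions.
        if self.count > len(self.buckets):
            self._rehash()

    def _rehash(self):
        old = self.buckets
        self.buckets = [[] for _ in range(2 * len(old))]  # twice as large
        for bucket in old:
            for x in bucket:
                # Re-insert every element using the new hash function: O(n).
                self.buckets[x % len(self.buckets)].append(x)
        self.rehashes += 1


s = GrowingSet()
for x in range(20):
    s.add(x)
print(len(s.buckets), s.rehashes)
```

Because the table doubles each time, the number of insertions between consecutive rehashes grows with the table, which is why the O(n) cost is paid only infrequently.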

11 Other types of collision resolution cont.
Extendible hashing: used when the amount of data is too large to be stored in main memory. This method allows put and get to be performed using only two disk accesses. Keys to smaller sets are stored in main memory, and the size of each smaller set is at most m. When the smaller sets become full, new keys are introduced. See board.

12 When should I use hashing?
When we have a large amount of data and only need to do insert, delete and search operations.
Example applications:
– Compilers (to keep track of declared variables)
– Graph theory problems where nodes have real names instead of numbers
– Online spell-checkers
– Game-playing programs

13 Map Implementation
Hash table implementation of Map ADT. Rather complex, and involves several different files. All explained in a ReadMe file: Z:\public_html\ADS\CodeFromLectures\MapsAndHashTables\Maps\ReadMe.txt
Important points:
– Use the chaining method of collision resolution.
– Use NodeSet to implement the linked lists for each entry.
– Need a new interface, Hashable, to define classes which have a defined hash code.
– We use a hash code that is really a hash code + hash function (like the one used in OOSE?).
Wouldn't expect you to reproduce any of this code in the exam, but some of you may find it interesting. In OOSE you may have seen an implementation of a hash table; here we use a hash table to implement a Map. Different!