Double hashing Removal (open addressing) Chaining

Slides:



Advertisements
Similar presentations
Hash Tables CSC220 Winter What is strength of b-tree? Can we make an array to be as fast search and insert as B-tree and LL?
Advertisements

1 Designing Hash Tables Sections 5.3, 5.4, Designing a hash table 1.Hash function: establishing a key with an indexed location in a hash table.
Hash Tables.
Lecture 6 Hashing. Motivating Example Want to store a list whose elements are integers between 1 and 5 Will define an array of size 5, and if the list.
Hashing as a Dictionary Implementation
Dictionaries and Their Implementations Chapter 18 Data Structures and Problem Solving with C++: Walls and Mirrors, Carrano and Henry, © 2013.
Hashing21 Hashing II: The leftovers. hashing22 Hash functions Choice of hash function can be important factor in reducing the likelihood of collisions.
Hash Tables How well do hash tables support dynamic set operations? Implementations –Direct address –Hash functions Collision resolution methods –Universal.
1 Chapter 9 Maps and Dictionaries. 2 A basic problem We have to store some records and perform the following: add new record add new record delete record.
© 2006 Pearson Addison-Wesley. All rights reserved13 A-1 Chapter 13 Hash Tables.
Introduction to Hashing CS 311 Winter, Dictionary Structure A dictionary structure has the form: (Key, Data) Dictionary structures are organized.
COMP 171 Data Structures and Algorithms Tutorial 10 Hash Tables.
Data Structures Using C++ 2E Chapter 9 Searching and Hashing Algorithms.
Introducing Hashing Chapter 21 Copyright ©2012 by Pearson Education, Inc. All rights reserved.
Hash Tables. Container of elements where each element has an associated key Each key is mapped to a value that determines the table cell where element.
Hash Tables. Container of elements where each element has an associated key Each key is mapped to a value that determines the table cell where element.
Lecture 6 Hashing. Motivating Example Want to store a list whose elements are integers between 1 and 5 Will define an array of size 5, and if the list.
Hash Tables.  What can we do if we want rapid access to individual data items?  Looking up data for a flight in an air traffic control system  Looking.
1 Symbol Tables The symbol table contains information about –variables –functions –class names –type names –temporary variables –etc.
Hashing Sections 10.2 – 10.3 CS 302 Dr. George Bebis.
Hashing as a Dictionary Implementation Chapter 19.
Chapter 12 Hash Table. ● So far, the best worst-case time for searching is O(log n). ● Hash tables  average search time of O(1).  worst case search.
1 Hashing - Introduction Dictionary = a dynamic set that supports the operations INSERT, DELETE, SEARCH Dictionary = a dynamic set that supports the operations.
Hashing Chapter 7 Section 3. What is hashing? Hashing is using a 1-D array to implement a dictionary o This implementation is called a "hash table" Items.
COSC 2007 Data Structures II Chapter 13 Advanced Implementation of Tables IV.
Chapter 5: Hashing Collision Resolution: Open Addressing Extendible Hashing Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Lydia Sinapova,
Hashtables. An Abstract data type that supports the following operations: –Insert –Find –Remove Search trees can be used for the same operations but require.
Dictionaries and Their Implementations Chapter 18 Data Structures and Problem Solving with C++: Walls and Mirrors, Frank Carrano, © 2012.
Hash Tables ADT Data Dictionary, with two operations – Insert an item, – Search for (and retrieve) an item How should we implement a data dictionary? –
CSC 213 – Large Scale Programming. Today’s Goal  Review when, where, & why we use Map s  Why Sequence -based approach causes problems  How hash can.
Hashing (part 2) CSE 2011 Winter March 2018.
Chapter 27 Hashing Jung Soo (Sue) Lim Cal State LA.
Data Structures Using C++ 2E
Hashing CSE 2011 Winter July 2018.
Data Abstraction & Problem Solving with C++
Slides by Steve Armstrong LeTourneau University Longview, TX
Hashing - resolving collisions
Hashing Alexandra Stefan.
Week 8 - Wednesday CS221.
Hash Tables (Chapter 13) Part 2.
Linked List Variations
Hashing Alexandra Stefan.
Data Structures Using C++ 2E
Hashing Course: Data Structures Lecturer: Uri Zwick March 2008
Hash functions Open addressing
Quadratic probing Double hashing Removal and open addressing Chaining
Hash Table.
Hash Table.
Chapter 28 Hashing.
Hash In-Class Quiz.
CSE373: Data Structures & Algorithms Lecture 14: Hash Collisions
Hash Tables.
Chapter 21 Hashing: Implementing Dictionaries and Sets
Dictionaries and Their Implementations
Collision Resolution Neil Tang 02/18/2010
Resolving collisions: Open addressing
Hassan Khosravi / Geoffrey Tien
Hashing Alexandra Stefan.
CS202 - Fundamental Structures of Computer Science II
A Hash Table with Chaining
Hashing Sections 10.2 – 10.3 Lecture 26 CS302 Data Structures
Collision Resolution Neil Tang 02/21/2008
Ch Hash Tables Array or linked list Binary search trees
Ch. 13 Hash Tables  .
Hashing Course: Data Structures Lecturer: Uri Zwick March 2008
Data Structures and Algorithm Analysis Hashing
DATA STRUCTURES-COLLISION TECHNIQUES
Chapter 13 Hashing © 2011 Pearson Addison-Wesley. All rights reserved.
Collision Resolution: Open Addressing Extendible Hashing
Lecture-Hashing.
Presentation transcript:

Double hashing Removal (open addressing) Chaining Hash Tables Double hashing Removal (open addressing) Chaining November 29, 2017 Hassan Khosravi / Geoffrey Tien

Hassan Khosravi / Geoffrey Tien Double hashing In both linear and quadratic probing the probe sequence is independent of the key Double hashing produces key dependent probe sequences In this scheme a second hash function, ℎ 2 , determines the probe sequence The second hash function must follow these guidelines ℎ 2 key ≠0 ℎ 2 ≠ ℎ 1 A typical ℎ 2 is 𝑝− key mod 𝑝 where 𝑝 is a prime number November 29, 2017 Hassan Khosravi / Geoffrey Tien

Double hashing example Hash table is size 23 The hash function, ℎ=𝑥 mod 23, where x is the search key value The second hash function, ℎ 2 =5− key mod 5 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 32 58 21 November 29, 2017 Hassan Khosravi / Geoffrey Tien

Double hashing example Insert 81, ℎ=81 mod 23=12 Which collides with 58 so use ℎ 2 to find the probe sequence value ℎ 2 =5− 81 𝑚𝑜𝑑 5 =4, so insert at 12 + 4 = 16 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 32 58 81 21 November 29, 2017 Hassan Khosravi / Geoffrey Tien

Double hashing example Insert 35, ℎ=35 mod 23=12 Which collides with 58 so use ℎ 2 to find a free space ℎ 2 =5− 35 mod 5 =5, so insert at 12 + 5 = 17 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 32 58 81 35 21 November 29, 2017 Hassan Khosravi / Geoffrey Tien

Double hashing example Insert 60, ℎ=60 mod 23=14 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 32 58 60 81 35 21 November 29, 2017 Hassan Khosravi / Geoffrey Tien

Double hashing example Insert 83, ℎ=83 mod 23=14 ℎ 2 =5− 83 mod 5 =2, so insert at 14 + 2 = 16, which is occupied The second probe increments the insertion point by 2 again, so insert at 16 + 2 = 18 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 32 58 60 81 35 83 21 November 29, 2017 Hassan Khosravi / Geoffrey Tien

Removals and open addressing Removals add complexity to hash tables It is easy to find and remove a particular item But what happens when you want to search for some other item? The recently empty space may make a probe sequence terminate prematurely One solution is to mark a table location as either empty, occupied or removed (tombstone) Locations in the removed state can be re-used as items are inserted After confirming non-existence November 29, 2017 Hassan Khosravi / Geoffrey Tien

Tombstones and performance After many removals, the table may become clogged with tombstones which must still be scanned as part of a cluster it may be beneficial to periodically re-hash all valid items Example with linear probing and ℎ 𝑥 =𝑥 mod 23 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X 29 54 60 35 𝜆= 4 23 ~0.174 search(75) requires 15 probes! After rehashing: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 54 35 60 𝜆= 4 23 ~0.174 search(75) requires 2 probes November 29, 2017 Hassan Khosravi / Geoffrey Tien

Hassan Khosravi / Geoffrey Tien Separate chaining Separate chaining takes a different approach to collisions Each entry in the hash table is a pointer to a linked list (or other dictionary-compatible data structure) If a collision occurs the new item is added to the end of the list at the appropriate location Performance degrades less rapidly using separate chaining with uniform random distribution, separate chaining maintains good performance even at high load factors 𝜆>1 November 29, 2017 Hassan Khosravi / Geoffrey Tien

Separate chaining example ℎ 𝑥 =𝑥 mod 23 Insert 81, ℎ(𝑥)=12, add to back (or front) of list Insert 35, ℎ(𝑥)=12 Insert 60, ℎ(𝑥)=14 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 32 58 60 21 81 November 29, 2017 Hassan Khosravi / Geoffrey Tien 35

Hassan Khosravi / Geoffrey Tien Hash table discussion If 𝜆 is less than ½, open addressing and separate chaining give similar performance As 𝜆 increases, separate chaining performs better than open addressing However, separate chaining increases storage overhead for the linked list pointers It is important to note that in the worst case hash table performance can be poor That is, if the hash function does not evenly distribute data across the table November 29, 2017 Hassan Khosravi / Geoffrey Tien

Readings for this lesson Thareja Chapter 15.5.1 (Double hashing) Chapter 15.5.2 (Chaining) Next class: We're done! November 29, 2017 Hassan Khosravi / Geoffrey Tien