Quadratic probing Double hashing Removal and open addressing Chaining

Slides:



Advertisements
Similar presentations
Hash Tables.
Advertisements

Hashing as a Dictionary Implementation
What we learn with pleasure we never forget. Alfred Mercier Smitha N Pai.
Dictionaries and Their Implementations Chapter 18 Data Structures and Problem Solving with C++: Walls and Mirrors, Carrano and Henry, © 2013.
Hashing21 Hashing II: The leftovers. hashing22 Hash functions Choice of hash function can be important factor in reducing the likelihood of collisions.
Dictionaries and Their Implementations
© 2006 Pearson Addison-Wesley. All rights reserved13 A-1 Chapter 13 Hash Tables.
Introduction to Hashing CS 311 Winter, Dictionary Structure A dictionary structure has the form: (Key, Data) Dictionary structures are organized.
COMP 171 Data Structures and Algorithms Tutorial 10 Hash Tables.
Hashing General idea: Get a large array
Data Structures Using C++ 2E Chapter 9 Searching and Hashing Algorithms.
Introducing Hashing Chapter 21 Copyright ©2012 by Pearson Education, Inc. All rights reserved.
Hash Tables.  What can we do if we want rapid access to individual data items?  Looking up data for a flight in an air traffic control system  Looking.
1 Symbol Tables The symbol table contains information about –variables –functions –class names –type names –temporary variables –etc.
Hashing Sections 10.2 – 10.3 CS 302 Dr. George Bebis.
1 HASHING Course teacher: Moona Kanwal. 2 Hashing Mathematical concept –To define any number as set of numbers in given interval –To cut down part of.
Hashing as a Dictionary Implementation Chapter 19.
Hashing Chapter 7 Section 3. What is hashing? Hashing is using a 1-D array to implement a dictionary o This implementation is called a "hash table" Items.
Hash Table March COP 3502, UCF 1. Outline Hash Table: – Motivation – Direct Access Table – Hash Table Solutions for Collision Problem: – Open.
COSC 2007 Data Structures II Chapter 13 Advanced Implementation of Tables IV.
Chapter 5: Hashing Collision Resolution: Open Addressing Extendible Hashing Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Lydia Sinapova,
Hashtables. An Abstract data type that supports the following operations: –Insert –Find –Remove Search trees can be used for the same operations but require.
Chapter 13 C Advanced Implementations of Tables – Hash Tables.
Dictionaries and Their Implementations Chapter 18 Data Structures and Problem Solving with C++: Walls and Mirrors, Frank Carrano, © 2012.
Fundamental Structures of Computer Science II
Hashing (part 2) CSE 2011 Winter March 2018.
Chapter 27 Hashing Jung Soo (Sue) Lim Cal State LA.
Hashing.
Data Structures Using C++ 2E
Open Addressing: Quadratic Probing
Hashing CSE 2011 Winter July 2018.
Data Abstraction & Problem Solving with C++
Slides by Steve Armstrong LeTourneau University Longview, TX
Hashing - resolving collisions
Hashing Alexandra Stefan.
Hash Tables (Chapter 13) Part 2.
Hashing Alexandra Stefan.
Data Structures Using C++ 2E
Hash functions Open addressing
Design and Analysis of Algorithms
Hash Table.
Hash Table.
Chapter 28 Hashing.
Instructor: Lilian de Greef Quarter: Summer 2017
Hash In-Class Quiz.
CSE373: Data Structures & Algorithms Lecture 14: Hash Collisions
Hash Tables.
Chapter 21 Hashing: Implementing Dictionaries and Sets
Dictionaries and Their Implementations
Collision Resolution Neil Tang 02/18/2010
Resolving collisions: Open addressing
CSE373: Data Structures & Algorithms Lecture 14: Hash Collisions
Double hashing Removal (open addressing) Chaining
Hashing Alexandra Stefan.
CH 9.2 : Hash Tables Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and.
CS202 - Fundamental Structures of Computer Science II
A Hash Table with Chaining
Hashing Sections 10.2 – 10.3 Lecture 26 CS302 Data Structures
Collision Resolution Neil Tang 02/21/2008
Ch Hash Tables Array or linked list Binary search trees
Collision Handling Collisions occur when different elements are mapped to the same cell.
Ch. 13 Hash Tables  .
Hashing.
What we learn with pleasure we never forget. Alfred Mercier
DATA STRUCTURES-COLLISION TECHNIQUES
Chapter 13 Hashing © 2011 Pearson Addison-Wesley. All rights reserved.
Collision Resolution: Open Addressing Extendible Hashing
Lecture-Hashing.
CSE 373: Data Structures and Algorithms
Presentation transcript:

Quadratic probing Double hashing Removal and open addressing Chaining Collision Resolution Quadratic probing Double hashing Removal and open addressing Chaining March 09, 2018 Cinda Heeren / Geoffrey Tien

Cinda Heeren / Geoffrey Tien Quadratic probing Quadratic probing is a refinement of linear probing that prevents primary clustering For each probe, 𝑝, add 𝑝 2 to the original location index 1st probe: ℎ 𝑥 + 1 2 , 2nd: ℎ 𝑥 + 2 2 , 3rd: ℎ 𝑥 + 3 2 , etc. Results in secondary clustering The same sequence of probes is used when two different values hash to the same location This delays the collision resolution for those values Analysis suggests that secondary clustering is not a significant problem March 09, 2018 Cinda Heeren / Geoffrey Tien

Quadratic probing example Hash table is size 23 The hash function, ℎ=𝑥 mod 23, where 𝑥 is the search key value The search key values are shown in the table 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 32 58 21 March 09, 2018 Cinda Heeren / Geoffrey Tien

Quadratic probing example Insert 81, ℎ=81 mod 23=12 Which collides with 58 so use quadratic probing to find a free space First look at 12 + 12, which is free so insert the item at index 13 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 32 58 81 21 March 09, 2018 Cinda Heeren / Geoffrey Tien

Quadratic probing example Insert 35, ℎ=35 mod 23=12 Which collides with 58 First look at 12 + 12, which is occupied, then look at 12 + 22 = 16 and insert the item there 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 32 58 81 35 21 March 09, 2018 Cinda Heeren / Geoffrey Tien

Quadratic probing example Insert 60, ℎ=60 mod 23=14 The location is free, so insert the item 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 32 58 81 60 35 21 March 09, 2018 Cinda Heeren / Geoffrey Tien

Quadratic probing example Insert 12, ℎ=12 mod 23=12, which is occupied First check index 12 + 12, Then 12 + 22 = 16, Then 12 + 32 = 21 (which is also occupied), Then 12 + 42 = 28, wraps to index 5 which is free 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 12 29 32 58 81 60 35 21 March 09, 2018 Cinda Heeren / Geoffrey Tien

Quadratic probe sequences Note that after some time a sequence of probes repeats itself In the preceding example h(key) = key % 23 = 12, resulting in this sequence of probes (table size of 23) 12, 13, 16, 21, 28(5), 37(14), 48(2), 61(15), 76(7), 93(1), 112(20), 133(18), 156(18), 181(20), 208(1), 237(7), ... This generally does not cause problems if The data is not significantly skewed, The hash table is large enough (around 2 * the number of items), and The hash function scatters the data evenly across the table Quadratic probing may potentially fail at 𝜆> 1 2 proof is outside the scope of this course March 09, 2018 Cinda Heeren / Geoffrey Tien

Cinda Heeren / Geoffrey Tien Double hashing In both linear and quadratic probing the probe sequence is independent of the key Double hashing produces key dependent probe sequences In this scheme a second hash function, ℎ 2 , determines the probe sequence The second hash function must follow these guidelines ℎ 2 key ≠0 ℎ 2 ≠ ℎ 1 A typical ℎ 2 is 𝑝− key mod 𝑝 where 𝑝 is a prime number March 09, 2018 Cinda Heeren / Geoffrey Tien

Double hashing example Hash table is size 23 The hash function, ℎ=𝑥 mod 23, where x is the search key value The second hash function, ℎ 2 =5− key mod 5 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 32 58 21 March 09, 2018 Cinda Heeren / Geoffrey Tien

Double hashing example Insert 81, ℎ=81 mod 23=12 Which collides with 58 so use ℎ 2 to find the probe sequence value ℎ 2 =5− 81 𝑚𝑜𝑑 5 =4, so insert at 12 + 4 = 16 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 32 58 81 21 March 09, 2018 Cinda Heeren / Geoffrey Tien

Double hashing example Insert 35, ℎ=35 mod 23=12 Which collides with 58 so use ℎ 2 to find a free space ℎ 2 =5− 35 mod 5 =5, so insert at 12 + 5 = 17 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 32 58 81 35 21 March 09, 2018 Cinda Heeren / Geoffrey Tien

Double hashing example Insert 60, ℎ=60 mod 23=14 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 32 58 60 81 35 21 March 09, 2018 Cinda Heeren / Geoffrey Tien

Double hashing example Insert 83, ℎ=83 mod 23=14 ℎ 2 =5− 83 mod 5 =2, so insert at 14 + 2 = 16, which is occupied The second probe increments the insertion point by 2 again, so insert at 16 + 2 = 18 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 32 58 60 81 35 83 21 March 09, 2018 Cinda Heeren / Geoffrey Tien

Removals and open addressing Removals add complexity to hash tables It is easy to find and remove a particular item But what happens when you want to search for some other item? The recently empty space may make a probe sequence terminate prematurely One solution is to mark a table location as either empty, occupied or removed (tombstone) Locations in the removed state can be re-used as items are inserted After confirming non-existence March 09, 2018 Cinda Heeren / Geoffrey Tien

Tombstones and performance After many removals, the table may become clogged with tombstones which must still be scanned as part of a cluster it may be beneficial to periodically re-hash all valid items Example with linear probing and ℎ 𝑥 =𝑥 mod 23 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X 29 54 60 35 𝜆= 4 23 ~0.174 search(75) requires 15 probes! After rehashing: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 54 35 60 𝜆= 4 23 ~0.174 search(75) requires 2 probes March 09, 2018 Cinda Heeren / Geoffrey Tien

Cinda Heeren / Geoffrey Tien Try it! Insert the following items into a hash table of 29 elements: 61, 19, 32, 72, 3, 76, 5, 34 Using a hash function h(x) = x mod 29 and quadratic probing Using double hashing with: h(x) = x mod 29 and h2(x) = 11 – (x mod 11) March 09, 2018 Cinda Heeren / Geoffrey Tien

Cinda Heeren / Geoffrey Tien Separate chaining Separate chaining takes a different approach to collisions Each entry in the hash table is a pointer to a linked list (or other dictionary-compatible data structure) If a collision occurs the new item is added to the end of the list at the appropriate location Performance degrades less rapidly using separate chaining with uniform random distribution, separate chaining maintains good performance even at high load factors 𝜆>1 March 09, 2018 Cinda Heeren / Geoffrey Tien

Separate chaining example ℎ 𝑥 =𝑥 mod 23 Insert 81, ℎ(𝑥)=12, add to back (or front) of list Insert 35, ℎ(𝑥)=12 Insert 60, ℎ(𝑥)=14 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 29 32 58 60 21 81 March 09, 2018 Cinda Heeren / Geoffrey Tien 35

Cinda Heeren / Geoffrey Tien Hash table discussion If 𝜆 is less than ½, open addressing and separate chaining give similar performance As 𝜆 increases, separate chaining performs better than open addressing However, separate chaining increases storage overhead for the linked list pointers It is important to note that in the worst case hash table performance can be poor That is, if the hash function does not evenly distribute data across the table March 09, 2018 Cinda Heeren / Geoffrey Tien

Readings for this lesson Carrano & Henry Chapter 18.4.2 (Collision resolution) Chapter 18.4.6 (Chaining) Next class Carrano & Henry, Chapter 21.2.5 (B-Trees) March 09, 2018 Cinda Heeren / Geoffrey Tien