Overflow Handling An overflow occurs when the home bucket for a new pair (key, element) is full. We may handle overflows by: Search the hash table in some.

Slides:



Advertisements
Similar presentations
1 Designing Hash Tables Sections 5.3, 5.4, Designing a hash table 1.Hash function: establishing a key with an indexed location in a hash table.
Advertisements

Hash Tables.
Part II Chapter 8 Hashing Introduction Consider we may perform insertion, searching and deletion on a dictionary (symbol table). Array Linked list Tree.
Skip List & Hashing CSE, POSTECH.
Data Structures Using C++ 2E
Hashing as a Dictionary Implementation
Dictionaries Collection of pairs.  (key, element)  Pairs have different keys. Operations.  get(theKey)  put(theKey, theElement)  remove(theKey) 5/2/20151.
Overflow Handling An overflow occurs when the home bucket for a new pair (key, element) is full. We may handle overflows by:  Search the hash table in.
Introduction to Hashing CS 311 Winter, Dictionary Structure A dictionary structure has the form: (Key, Data) Dictionary structures are organized.
COMP 171 Data Structures and Algorithms Tutorial 10 Hash Tables.
Data Structures Using C++ 2E Chapter 9 Searching and Hashing Algorithms.
Hash Table March COP 3502, UCF.
Hashing. Concept of Hashing In CS, a hash table, or a hash map, is a data structure that associates keys (names) with values (attributes). Look-Up Table.
Hashing Chapter 20. Hash Table A hash table is a data structure that allows fast find, insert, and delete operations (most of the time). The simplest.
1 Hash table. 2 Objective To learn: Hash function Linear probing Quadratic probing Chained hash table.
Hashing as a Dictionary Implementation Chapter 19.
Overflow Handling An overflow occurs when the home bucket for a new pair (key, element) is full. We may handle overflows by:  Search the hash table in.
Data Structures and Algorithms Hashing First Year M. B. Fayek CUFE 2010.
Data Structures and Algorithms Lecture (Searching) Instructor: Quratulain Date: 4 and 8 December, 2009 Faculty of Computer Science, IBA.
1 Hashing - Introduction Dictionary = a dynamic set that supports the operations INSERT, DELETE, SEARCH Dictionary = a dynamic set that supports the operations.
Been-Chian Chien, Wei-Pang Yang, and Wen-Yang Lin 8-1 Chapter 8 Hashing Introduction to Data Structure CHAPTER 8 HASHING 8.1 Symbol Table Abstract Data.
Hashing Chapter 7 Section 3. What is hashing? Hashing is using a 1-D array to implement a dictionary o This implementation is called a "hash table" Items.
Hash Table March COP 3502, UCF 1. Outline Hash Table: – Motivation – Direct Access Table – Hash Table Solutions for Collision Problem: – Open.
Hashtables. An Abstract data type that supports the following operations: –Insert –Find –Remove Search trees can be used for the same operations but require.
Hashing Goal Perform inserts, deletes, and finds in constant average time Topics Hash table, hash function, collisions Collision handling Separate chaining.
CSC 413/513: Intro to Algorithms Hash Tables. ● Hash table: ■ Given a table T and a record x, with key (= symbol) and satellite data, we need to support:
Sets and Maps Chapter 9.
Hashing (part 2) CSE 2011 Winter March 2018.
Chapter 27 Hashing Jung Soo (Sue) Lim Cal State LA.
Data Structures Using C++ 2E
CSCI 210 Data Structures and Algorithms
Hashing CSE 2011 Winter July 2018.
Slides by Steve Armstrong LeTourneau University Longview, TX
CS 332: Algorithms Hash Tables David Luebke /19/2018.
Hashing Alexandra Stefan.
Hash Tables (Chapter 13) Part 2.
Homework will be announced soon Midterm exam date announced
Hashing CENG 351.
EEE2108: Programming for Engineers Chapter 8. Hashing
Hashing Alexandra Stefan.
Data Structures Using C++ 2E
Efficiency add remove find unsorted array O(1) O(n) sorted array
Advanced Associative Structures
Hash Table.
Hash Table.
Chapter 28 Hashing.
Hash Tables.
Data Structures and Algorithms
Chapter 21 Hashing: Implementing Dictionaries and Sets
Dictionaries Collection of unordered pairs. Operations. (key, element)
Dictionaries Collection of pairs. Operations. (key, element)
Hashing.
Resolving collisions: Open addressing
Data Structures and Algorithms
CSCE 3110 Data Structures & Algorithm Analysis
CH 9.2 : Hash Tables Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and.
Overflow Handling An overflow occurs when the home bucket for a new pair (key, element) is full. We may handle overflows by: Search the hash table in some.
Data Structures & Algorithms Hash Tables
CH 9.2 : Hash Tables Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and.
CS202 - Fundamental Structures of Computer Science II
Dictionaries Collection of pairs. Operations. (key, element)
Sets and Maps Chapter 9.
Overflow Handling An overflow occurs when the home bucket for a new pair (key, element) is full. We may handle overflows by: Search the hash table in some.
Overflow Handling An overflow occurs when the home bucket for a new pair (key, element) is full. We may handle overflows by: Search the hash table in some.
Collision Handling Collisions occur when different elements are mapped to the same cell.
What we learn with pleasure we never forget. Alfred Mercier
DATA STRUCTURES-COLLISION TECHNIQUES
17CS1102 DATA STRUCTURES © 2018 KLEF – The contents of this presentation are an intellectual and copyrighted property of KL University. ALL RIGHTS RESERVED.
Chapter 13 Hashing © 2011 Pearson Addison-Wesley. All rights reserved.
Collision Resolution: Open Addressing Extendible Hashing
Presentation transcript:

Overflow Handling An overflow occurs when the home bucket for a new pair (key, element) is full. We may handle overflows by: Search the hash table in some systematic fashion for a bucket that is not full. (probe sequence) Linear probing (linear open addressing). Quadratic probing. Random probing. Eliminate overflows by permitting each bucket to keep a list of all pairs for which it is the home bucket. Array linear list. Chain.

Linear Probing – Get And Put divisor = b (number of buckets) = 17. Home bucket = key % 17. 4 8 12 16 34 45 6 23 7 28 12 29 11 30 33 Put in pairs whose keys are 6, 12, 34, 29, 28, 11, 23, 7, 0, 33, 30, 45

Linear Probing – Erase(0) 4 8 12 16 6 29 34 28 11 23 7 33 30 45 erase(0) 4 8 12 16 6 29 34 28 11 23 7 45 33 30 Search cluster for pair (if any) to fill vacated bucket. 4 8 12 16 6 29 34 28 11 23 7 45 33 30

Linear Probing – erase(34) 4 8 12 16 6 29 34 28 11 23 7 33 30 45 4 8 12 16 6 29 28 11 23 7 33 30 45 Search cluster for pair (if any) to fill vacated bucket. 4 8 12 16 6 29 28 11 23 7 33 30 45 4 8 12 16 6 29 28 11 23 7 33 30 45

Linear Probing – erase(29) 4 8 12 16 6 29 34 28 11 23 7 33 30 45 4 8 12 16 6 34 28 11 23 7 33 30 45 Search cluster for pair (if any) to fill vacated bucket. 4 8 12 16 6 11 34 28 23 7 33 30 45 4 8 12 16 6 11 34 28 23 7 33 30 45 4 8 12 16 6 11 34 28 23 7 33 30 45

Performance Of Linear Probing 4 8 12 16 6 29 34 28 11 23 7 33 30 45 Worst-case find/insert/erase time is Q(n), where n is the number of pairs in the table. This happens when all pairs are in the same cluster.

Expected Performance alpha = loading density = (number of pairs)/b. 4 8 12 16 6 29 34 28 11 23 7 33 30 45 alpha = loading density = (number of pairs)/b. alpha = 12/17. Sn = expected number of buckets examined in a successful search when n is large Un = expected number of buckets examined in a unsuccessful search when n is large Time to put and remove governed by Un. A put that increases the number of pairs in the table involves an unsuccessful search followed by the addition of an element. An unsuccessful remove is essentially an unsuccessful search. A successful remove must also go to the end of the cluster and so is like an unsuccessful search.

Expected Performance alpha <= 0.75 is recommended. Sn ~ ½(1 + 1/(1 – alpha)) Un ~ ½(1 + 1/(1 – alpha)2) Note that 0 <= alpha <= 1. alpha <= 0.75 is recommended.

Linear probing - pros Simple to implement Open addressing avoids the time overhead of allocating each new entry record, and can be implemented even in the absence of a memory allocator Can provide high performance because of its good locality of reference

Linear probing - cons More sensitive to the quality of its hash function than some other collision resolution schemes number of stored entries cannot exceed the number of slots in the bucket array (for most applications, mandates dynamic resizing) Open addressing schemes also put more stringent requirements on the hash function: besides distributing the keys more uniformly over the buckets, the function must also minimize the clustering of hash values that are consecutive in the probe order. Using separate chaining, the only concern is that too many objects map to the same hash value; whether they are adjacent or nearby is completely irrelevant. [https://en.wikipedia.org/wiki/Hash_table#Collision_resolution]

Hash Table Design Performance requirements are given, determine maximum permissible loading density. We want a successful search to make no more than 10 compares (expected). Sn ~ ½(1 + 1/(1 – alpha)) alpha <= 18/19 We want an unsuccessful search to make no more than 13 compares (expected). Un ~ ½(1 + 1/(1 – alpha)2) alpha <= 4/5 So alpha <= min{18/19, 4/5} = 4/5.

Hash Table Design Dynamic resizing of table. Fixed table size. Whenever loading density exceeds threshold (4/5 in our example), rehash into a table of approximately twice the current size. Fixed table size. Know maximum number of pairs. No more than 1000 pairs. Loading density <= 4/5 => b >= 5/4*1000 = 1250. Pick b (equal to divisor) to be a prime number or an odd number with no prime divisors smaller than 20.

Linear List Of Synonyms Each bucket keeps a linear list of all pairs for which it is the home bucket. The linear list may or may not be sorted by key. The linear list may be an array linear list or a chain.

[0] [4] [8] [12] [16] 12 6 34 29 28 11 23 7 33 30 45 Sorted Chains Put in pairs whose keys are 6, 12, 34, 29, 28, 11, 23, 7, 0, 33, 30, 45 Home bucket = key % 17. The chains may be sorted or unsorted (the text uses sorted chains).

Expected Performance Note that alpha >= 0. Expected chain length is alpha. Sn ~ 1 + alpha/2. Un <= alpha, when alpha < 1. Un ~ 1 + alpha/2, when alpha >= 1.