Podcast Ch21b Title: Collision Resolution


Podcast Ch21b
Title: Collision Resolution
Description: Linear probing; chaining with lists; table expansion
Participants: Barry Kurtz (instructor); John Helfert and Tobie Williams (students)
Textbook: Data Structures for Java; William H. Ford and William R. Topp

Designing Hash Tables
When two or more data items hash to the same table index, they cannot occupy the same position in the table. We are left with the option of locating one of the items at another position in the table (linear probing) or of redesigning the table to store a sequence of colliding keys at each index (chaining with separate lists).

Linear Probing
The hash table is an array of elements with an associated hash function. Initially, tag each entry in the table as "empty". To add an item, apply the hash function to the key and take the remainder after dividing the value by the table size to obtain a table index. If the entry at that index is empty, insert the item. Otherwise, start at the next index and scan successive indices, wrapping around to the start of the table after probing the last table entry. The insertion occurs at the first open location.

Linear Probing (continued)
If the search returns to the original hash location without finding an open slot, the table is full, and the linear probing algorithm throws an exception.
tableIndex = x % 11

Linear Probing (continued)
// compute hash index of item for a table of size n
int index = (item.hashCode() & Integer.MAX_VALUE) % n;
int origIndex;
origIndex = index; // save the original hash index
// cycle through the table looking for an empty slot,
// a match or a table full condition
do {
    // test whether the table slot is empty or the
    // key matches the data field of the table entry
    if table[index] is empty
        insert item in table at table[index] & return
    else if table[index] matches item
        then return
    // begin a probe starting at next table location
    index = (index+1) % n;
} while (index != origIndex);
// we have gone around table without finding match
// or open slot
throw new BufferOverflowException();
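The pseudocode above can be turned into compilable Java along the following lines. This is only a sketch under stated assumptions: the class name LinearProbeTable, the use of null to mark an empty slot, and the choice to return the slot index are illustrative and are not taken from the textbook's own class.

```java
import java.nio.BufferOverflowException;

// Illustrative sketch of linear-probing insertion (names are hypothetical).
class LinearProbeTable {
    private final Object[] table;   // assumption: null marks an empty slot

    LinearProbeTable(int n) {
        table = new Object[n];
    }

    // Returns the index where item is stored, or where an equal item already resides.
    int add(Object item) {
        int n = table.length;
        // mask off the sign bit so the remainder is never negative
        int index = (item.hashCode() & Integer.MAX_VALUE) % n;
        int origIndex = index;      // save the original hash index
        do {
            if (table[index] == null) {
                table[index] = item;            // first open slot: insert here
                return index;
            } else if (table[index].equals(item)) {
                return index;                   // item is already in the table
            }
            index = (index + 1) % n;            // probe the next slot, wrapping around
        } while (index != origIndex);
        // went all the way around without finding a match or an open slot
        throw new BufferOverflowException();
    }
}
```

For example, with a table of size 11, calling add(5) and then add(38) places 38 in slot 6, because both keys hash to index 5 and the probe moves on to the next slot.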

Linear Probing (concluded)
If the size of the table is large relative to the number of items, linear probing works well, because a good hash function generates indices that are evenly distributed over the table range, and collisions will be minimal. As the ratio of table size to the number of items approaches 1, the algorithm degrades toward a sequential search.

Use the integer hash function hf(x) = x and the % 11 operator to store the integer values from array intArr in a hash table of size 11 using the open probe method. Assume that the elements are added to the table in the same order they are defined in the array.
int[] intArr = {5, 19, 43, 38, 63, 96, 45, 65}
(a) Display the elements in the following table.
Index:   0   1   2   3   4   5   6   7   8   9  10
Value:  96  45  65   -   -   5  38   -  19  63  43
(b) Which element(s) require the largest number of probes to locate in the table? 96, 65
(c) Which element(s) can be accessed with a single probe? 5, 19, 43, 45
(d) What is the average number of probes for linear probing? 16 / 8 = 2
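One way to check the answers to this exercise is to simulate the insertions and count the probes each key needs. The sketch below assumes non-negative integer keys, the identity hash function, fewer keys than slots, and -1 as an "empty" sentinel; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Illustrative probe counter for the linear probing exercise.
class ProbeCount {
    // Inserts each key with linear probing into table (which is reset first)
    // and returns the number of probes each insertion needed.
    // Assumes keys are non-negative, never equal to -1, and fit in the table.
    static int[] insertAll(int[] keys, int[] table) {
        Arrays.fill(table, -1);                 // -1 marks an empty slot
        int[] probes = new int[keys.length];
        for (int k = 0; k < keys.length; k++) {
            int index = keys[k] % table.length; // hf(x) = x, then % table size
            int count = 1;                      // the first look is probe 1
            while (table[index] != -1) {
                index = (index + 1) % table.length;
                count++;
            }
            table[index] = keys[k];
            probes[k] = count;
        }
        return probes;
    }
}
```

Calling ProbeCount.insertAll(new int[]{5, 19, 43, 38, 63, 96, 45, 65}, new int[11]) yields the probe counts 1, 1, 1, 2, 2, 4, 1, 4, which sum to 16 and give the average of 2 probes per element.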

The open probe method suffers from the phenomenon known as
(a) fatal collisions
(b) sparse distribution
(c) clustering
(d) broken chains

Chaining with Separate Lists
Chaining with separate lists defines the hash table as an indexed sequence of linked lists. Each list, called a bucket, holds a set of items that hash to the same table location.

Chaining with Separate Lists (continued)
A bucket is a singly linked list. Each entry of the array refers to the first node in a sequence of items that hash to the table index. A node has the familiar structure with two fields, one for the value and one for the reference to the next node.

Chaining with Separate Lists (continued)
To add object item, use the hash function to identify the index of the appropriate bucket in the array (table). If table[i] is null, add item as the first entry in the list. Otherwise begin with the first node, entry = table[i], and compare item with entry.nodeValue. If there is no match, continue the scan with node entry.next, and so forth. If item is not in the list, add it to the front of the list.
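The add procedure just described might look like the following in Java. It is a minimal sketch: the class name ChainedHashSet and the helper methods hashIndex and bucket are hypothetical, while nodeValue and next follow the field names used on the slide.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of chaining with separate lists (names are hypothetical).
class ChainedHashSet<T> {
    // singly linked node: one field for the value, one for the next node
    private static class Node<T> {
        T nodeValue;
        Node<T> next;
        Node(T nodeValue, Node<T> next) { this.nodeValue = nodeValue; this.next = next; }
    }

    private final Node<T>[] table;

    @SuppressWarnings("unchecked")
    ChainedHashSet(int n) {
        table = (Node<T>[]) new Node[n];   // every bucket starts out null (empty)
    }

    private int hashIndex(Object item) {
        return (item.hashCode() & Integer.MAX_VALUE) % table.length;
    }

    // Returns true if item was added, false if it was already present.
    boolean add(T item) {
        int i = hashIndex(item);
        // scan the bucket for a match
        for (Node<T> entry = table[i]; entry != null; entry = entry.next) {
            if (entry.nodeValue.equals(item)) return false;
        }
        // no match: the new node becomes the front of the list
        table[i] = new Node<>(item, table[i]);
        return true;
    }

    // Returns the contents of bucket i from front to back (for inspection only).
    List<T> bucket(int i) {
        List<T> values = new ArrayList<>();
        for (Node<T> entry = table[i]; entry != null; entry = entry.next) {
            values.add(entry.nodeValue);
        }
        return values;
    }
}
```

Because new items go to the front, adding 19, 63, and 96 to a table of size 11 leaves bucket 8 holding 96, 63, 19 in that order.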

Chaining with Separate Lists
Consider the following sequence of eight elements {54, 77, 94, 89, 14, 45, 35, 76} with the identity hash function and tableSize = 11. The figure displays the lists. Each entry in the table includes the number of probes to add the element.

Use the integer hash function hf(x) = x and table size 11. Using chaining with separate lists, show the location in the hash table for each integer value in the following array.
int[] intArr = {5, 19, 43, 38, 63, 96, 45, 65}
Table indices: 0 1 2 3 4 5 6 7 8 9 10

Keys 203, 426, and 561 hash to 5.
Keys 987 and 316 hash to 3.
Keys 736 and 97 hash to 2.
Key 124 hashes to 0.
Assume that insertions are done in the order {987, 203, 736, 316, 426, 561, 97, 124}. Show the position of each key if chaining with m = 7 is used to resolve collisions.

Chaining with Separate Lists
Chaining with separate lists is generally faster than linear probing since chaining only searches items that hash to the same table location. With linear probing, the number of table entries is limited to the table size, whereas the linked lists used in chaining grow as necessary. To delete an element, just erase it from the associated list.
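Deletion under chaining can be illustrated with a small sketch in which each bucket is a java.util.LinkedList. It assumes non-negative integer keys and the identity hash function; the class name BucketTable and its methods are hypothetical and stand in for whatever list type an implementation actually uses.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Illustrative chained table of int keys showing deletion (names are hypothetical).
class BucketTable {
    private final List<LinkedList<Integer>> buckets = new ArrayList<>();

    BucketTable(int size) {
        for (int i = 0; i < size; i++) buckets.add(new LinkedList<>());
    }

    // Assumes key is non-negative; identity hash followed by % table size.
    private LinkedList<Integer> bucketFor(int key) {
        return buckets.get(key % buckets.size());
    }

    void add(int key) {
        LinkedList<Integer> b = bucketFor(key);
        if (!b.contains(key)) b.addFirst(key);   // new keys go to the front
    }

    boolean contains(int key) {
        return bucketFor(key).contains(key);
    }

    // Deleting only touches the one list the key hashes to.
    boolean remove(int key) {
        // Integer.valueOf selects remove(Object) rather than remove(int index)
        return bucketFor(key).remove(Integer.valueOf(key));
    }
}
```

Removing a key leaves the other items in the same bucket reachable, since only that key's node is unlinked from its list.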

Rehashing
As the number of entries in the hash table increases, search performance deteriorates. Rehashing increases the hash table size when the number of entries in the table reaches a specified percentage of its size.

With rehashing, the table size is increased from size m to n. How are the elements copied from the original table to the new table? Elements are scanned in the original table and then rehashed with the new table size into the new table.
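The copy step can be sketched as follows for a chained table of integer keys. The sketch assumes non-negative keys and the identity hash function, and it represents buckets as java.util lists purely for brevity; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative rehash of a chained table from its current size m to a new size n.
class Rehash {
    // Scans every bucket of oldTable and re-inserts each key using key % n.
    // Assumes keys are non-negative and n is positive.
    static List<List<Integer>> rehash(List<List<Integer>> oldTable, int n) {
        List<List<Integer>> newTable = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            newTable.add(new ArrayList<>());
        }
        for (List<Integer> bucket : oldTable) {
            for (int key : bucket) {
                // the index must be recomputed: key % m and key % n generally differ
                newTable.get(key % n).add(key);
            }
        }
        return newTable;
    }
}
```

Keys cannot simply be copied to the same index, because a key's position depends on the table size. For instance, keys 3 and 6 share bucket 0 in a table of size 3 but land in buckets 3 and 6 after rehashing to size 7.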