Sections 10.5 – 10.6 Hashing.

Slides:



Advertisements
Similar presentations
Hash Tables.
Advertisements

Nyhoff, ADTs, Data Structures and Problem Solving with C++, Second Edition, © 2005 Pearson Education, Inc. All rights reserved Hash Tables,
Lecture 6 Hashing. Motivating Example Want to store a list whose elements are integers between 1 and 5 Will define an array of size 5, and if the list.
Skip List & Hashing CSE, POSTECH.
Data Structures Using C++ 2E
Hashing as a Dictionary Implementation
Hashing Techniques.
1 Chapter 9 Maps and Dictionaries. 2 A basic problem We have to store some records and perform the following: add new record add new record delete record.
Hashing. Searching Consider the problem of searching an array for a given value –If the array is not sorted, the search requires O(n) time If the value.
Sets and Maps Chapter 9. Chapter 9: Sets and Maps2 Chapter Objectives To understand the Java Map and Set interfaces and how to use them To learn about.
Hashing COMP171 Fall Hashing 2 Hash table * Support the following operations n Find n Insert n Delete. (deletions may be unnecessary in some applications)
Data Structures Using C++ 2E Chapter 9 Searching and Hashing Algorithms.
Introducing Hashing Chapter 21 Copyright ©2012 by Pearson Education, Inc. All rights reserved.
Lecture 6 Hashing. Motivating Example Want to store a list whose elements are integers between 1 and 5 Will define an array of size 5, and if the list.
CS 206 Introduction to Computer Science II 04 / 06 / 2009 Instructor: Michael Eckmann.
(c) University of Washingtonhashing-1 CSC 143 Java Hashing Set Implementation via Hashing.
Hash Table March COP 3502, UCF.
CHAPTER 09 Compiled by: Dr. Mohammad Omar Alhawarat Sorting & Searching.
Hashing Chapter 20. Hash Table A hash table is a data structure that allows fast find, insert, and delete operations (most of the time). The simplest.
TECH Computer Science Dynamic Sets and Searching Analysis Technique  Amortized Analysis // average cost of each operation in the worst case Dynamic Sets.
Hashing Hashing is another method for sorting and searching data.
Hashing as a Dictionary Implementation Chapter 19.
Searching Given distinct keys k 1, k 2, …, k n and a collection of n records of the form »(k 1,I 1 ), (k 2,I 2 ), …, (k n, I n ) Search Problem - For key.
Hash Table March COP 3502, UCF 1. Outline Hash Table: – Motivation – Direct Access Table – Hash Table Solutions for Collision Problem: – Open.
A Introduction to Computing II Lecture 11: Hashtables Fall Session 2000.
Hashtables. An Abstract data type that supports the following operations: –Insert –Find –Remove Search trees can be used for the same operations but require.
Hash Tables © Rick Mercer.  Outline  Discuss what a hash method does  translates a string key into an integer  Discuss a few strategies for implementing.
ISOM MIS 215 Module 5 – Binary Trees. ISOM Where are we? 2 Intro to Java, Course Java lang. basics Arrays Introduction NewbieProgrammersDevelopersProfessionalsDesigners.
Sets and Maps Chapter 9. Chapter Objectives  To understand the Java Map and Set interfaces and how to use them  To learn about hash coding and its use.
Prof. Amr Goneid, AUC1 CSCI 210 Data Structures and Algorithms Prof. Amr Goneid AUC Part 5. Dictionaries(2): Hash Tables.
Fundamental Structures of Computer Science II
Hashing: Collision Resolution Schemes
Sets and Maps Chapter 9.
Binary Search Trees Implementing Balancing Operations Hash Tables
Data Structures Using C++ 2E
CSCI 210 Data Structures and Algorithms
Hashing CSE 2011 Winter July 2018.
Slides by Steve Armstrong LeTourneau University Longview, TX
Hashing - Hash Maps and Hash Functions
Subject Name: File Structures
Data Structures Using C++ 2E
Efficiency add remove find unsorted array O(1) O(n) sorted array
Advanced Associative Structures
Hash Table.
Hashing as a Dictionary Implementation
Chapter 21 Hashing: Implementing Dictionaries and Sets
Resolving collisions: Open addressing
CSCE 3110 Data Structures & Algorithm Analysis
CH 9.2 : Hash Tables Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and.
Hash Tables Chapter 12.7 Wherein we throw all the data into random array slots and somehow obtain O(1) retrieval time Nyhoff, ADTs, Data Structures and.
CH 9.2 : Hash Tables Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and.
CS202 - Fundamental Structures of Computer Science II
Advanced Implementation of Tables
Advanced Implementation of Tables
Sets and Maps Chapter 9.
EE 312 Software Design and Implementation I
Dictionaries 4/5/2019 1:49 AM Hash Tables  
Chapter 8 The Map ADT.
slides created by Marty Stepp
Hash Maps Introduction
Podcast Ch21b Title: Collision Resolution
Data Structures and Algorithm Analysis Hashing
17CS1102 DATA STRUCTURES © 2018 KLEF – The contents of this presentation are an intellectual and copyrighted property of KL University. ALL RIGHTS RESERVED.
EE 312 Software Design and Implementation I
CSE 373 Set implementation; intro to hashing
Chapter 13 Hashing © 2011 Pearson Addison-Wesley. All rights reserved.
Hashing: Collision Resolution Schemes
Lecture-Hashing.
CSE 373: Data Structures and Algorithms
Presentation transcript:

Sections 10.5 – 10.6 Hashing

Linear Searching If we want to add elements as quickly as possible to a list, and we are not as concerned about how long it takes to find them we would put the element into the last slot in an array-based list into the first slot in a linked list To search this list for the element with a given key, we must use a simple linear (or sequential) search Beginning with the first element in the list, we search for the desired element by examining each subsequent element’s key until either the search is successful or the list is exhausted. Based on the number of comparisons this search is O(N) In the worst case we have to make N key comparisons. On the average, assuming that there is an equal probability of searching for any element in the list, we make N/2 comparisons for a successful search

High-Probability Ordering Sometimes certain list elements are in much greater demand than others. We can then improve the search: Put the most-often-desired elements at the beginning of the list Using this scheme, we are more likely to make a hit in the first few tries, and rarely do we have to search the whole list. If the elements in the list are not static or if we cannot predict their relative demand, we can move each element accessed to the front of the list as an element is found, it is swapped with the element that precedes it Lists in which the relative positions of the elements are changed in an attempt to improve search efficiency are called self-organizing or self-adjusting lists.

Sorted Lists If the list is sorted, a sequential search no longer needs to search the whole list to discover that an element does not exist. It only needs to search until it has passed the element’s logical place in the list—that is, until an element with a larger key value is encountered. Another advantage of linear searching is its simplicity. The binary search is usually faster, however, it is not guaranteed to be faster for searching very small lists. As the number of elements increases, however, the disparity between the linear search and the binary search grows very quickly. The binary search is appropriate only for list elements stored in a sequential array-based representation. However, the binary search tree allows us to perform a binary search on a linked data representation

10.6 Hashing

Definitions Hash Functions: A function used to manipulate the key of an element in a list to identify its location in the list Hashing: The technique for ordering and accessing elements in a list in a relatively constant amount of time by manipulating the key to identify its location in the list Hash table: Term used to describe the data structure used to store and retrieve elements using hashing

Collisions Collision: The condition resulting when two or more keys produce the same hash location Linear probing: Resolving a hash collision by sequentially searching a has table beginning at the location returned by the hash function

Handling collisions with linear probing

Removing an element An approach could be remove (element) Set location to element.hash( ) Set list[location] to null Collisions, however, complicate the matter. We cannot be sure that our element is in location element.hash(). We must examine every array element, starting with location element.hash(), until we find the matching element. We cannot stop looking when we reach an empty location, because that location may represent an element that was previously removed. This problem illustrates that hash tables, in the forms that we have studied thus far, are not the most effective data structure for implementing lists whose elements may be deleted.

More Considerations Clustering: The tendency of elements to become unevenly distributed in the hash table, with many elements clustering around a single hash location Rehashing: Resolving a collision by computing a new hash location from a hash function that manipulates the original location rather than the element’s key Quadratic probing  Resolving a hash collision by using the rehashing formula (HashValue +/- I2) % array-size Random probing  Resolving a hash collision by generating pseudo-random hash values in successive applications of the rehash function Chaining  Using a linked list of elements that share the same hash location

Handling collisions by hashing with chaining

Java’s Support for Hashing The Java Library includes a HashTable class that uses hash techniques to support storing objects in a table. The library includes several other collection classes, such as HashSet, that provide an ADT whose underlying implementation uses the approaches described in this section. The Java Object class exports a hashCode method that returns an int hash code. Therefore all Java objects have an associated hash code. The standard Java hash code for an object is a function of the object’s memory location. For most applications, hash codes based on memory locations are not usable. Therefore, many of the Java classes that define commonly used objects (such as String and Integer), override the Object class’s hashCode method with one that is based on the contents of the object.