hashing1 Hashing It’s not just for breakfast anymore!

Slides:



Advertisements
Similar presentations
Chapter 12 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common approach to the.
Advertisements

Nyhoff, ADTs, Data Structures and Problem Solving with C++, Second Edition, © 2005 Pearson Education, Inc. All rights reserved Hash Tables,
Hashing as a Dictionary Implementation
Searching “It is better to search, than to be searched” --anonymous.
CSC212 Data Structure - Section AB Lecture 20 Hashing Instructor: Edgardo Molina Department of Computer Science City College of New York.
© 2004 Goodrich, Tamassia Hash Tables1  
Hashing Chapters What is Hashing? A technique that determines an index or location for storage of an item in a data structure The hash function.
Hashing21 Hashing II: The leftovers. hashing22 Hash functions Choice of hash function can be important factor in reducing the likelihood of collisions.
Searching Kruse and Ryba Ch and 9.6. Problem: Search We are given a list of records. Each record has an associated key. Give efficient algorithm.
Implementation of Linear Probing (continued) Helping method for locating index: private int findIndex(long key) // return -1 if the item with key 'key'
Hashing Techniques.
1 Chapter 9 Maps and Dictionaries. 2 A basic problem We have to store some records and perform the following: add new record add new record delete record.
hashing1 Hashing It’s not just for breakfast anymore!
Sets and Maps Chapter 9. Chapter 9: Sets and Maps2 Chapter Objectives To understand the Java Map and Set interfaces and how to use them To learn about.
1 CSE 326: Data Structures Hash Tables Autumn 2007 Lecture 14.
L l Chapter 11 discusses several ways of storing information in an array, and later searching for the information. l l Hash tables are a common approach.
CSE 373 Data Structures and Algorithms Lecture 18: Hashing III.
Aree Teeraparbseree, Ph.D
Introducing Hashing Chapter 21 Copyright ©2012 by Pearson Education, Inc. All rights reserved.
Maps A map is an object that maps keys to values Each key can map to at most one value, and a map cannot contain duplicate keys KeyValue Map Examples Dictionaries:
CS2110 Recitation Week 8. Hashing Hashing: An implementation of a set. It provides O(1) expected time for set operations Set operations Make the set empty.
COSC 2007 Data Structures II
(c) University of Washingtonhashing-1 CSC 143 Java Hashing Set Implementation via Hashing.
HASHING Section 12.7 (P ). HASHING - have already seen binary and linear search and discussed when they might be useful (based on complexity)
Hashing CS 105. Hashing Slide 2 Hashing - Introduction In a dictionary, if it can be arranged such that the key is also the index to the array that stores.
Hashing Chapter 20. Hash Table A hash table is a data structure that allows fast find, insert, and delete operations (most of the time). The simplest.
Copyright © 2002, Systems and Computer Engineering, Carleton University Hashtable.ppt * Object-Oriented Software Development Unit 8.
TECH Computer Science Dynamic Sets and Searching Analysis Technique  Amortized Analysis // average cost of each operation in the worst case Dynamic Sets.
CS121 Data Structures CS121 © JAS 2004 Tables An abstract table, T, contains table entries that are either empty, or pairs of the form (K, I) where K is.
90-723: Data Structures and Algorithms for Information Processing Copyright © 1999, Carnegie Mellon. All Rights Reserved. 1 Lecture 9: Searching Data Structures.
Hashing Sections 10.2 – 10.3 CS 302 Dr. George Bebis.
CSC211 Data Structures Lecture 20 Hashing Instructor: Prof. Xiaoyan Li Department of Computer Science Mount Holyoke College.
P p Chapter 11 discusses several ways of storing information in an array, and later searching for the information. p p Hash tables are a common approach.
P p Chapter 11 discusses several ways of storing information in an array, and later searching for the information. p p Hash tables are a common approach.
Hashing Hashing is another method for sorting and searching data.
Hashing as a Dictionary Implementation Chapter 19.
The Map ADT and Hash Tables. 2 The Map ADT  Map: An abstract data type where a value is "mapped" to a unique key  Need a key and a value to insert new.
CS201: Data Structures and Discrete Mathematics I Hash Table.
Chapter 11 Hash Anshuman Razdan Div of Computing Studies
Hashing Chapter 7 Section 3. What is hashing? Hashing is using a 1-D array to implement a dictionary o This implementation is called a "hash table" Items.
Hash Table March COP 3502, UCF 1. Outline Hash Table: – Motivation – Direct Access Table – Hash Table Solutions for Collision Problem: – Open.
COSC 2007 Data Structures II Chapter 13 Advanced Implementation of Tables IV.
Hashing CS 110: Data Structures and Algorithms First Semester,
A Introduction to Computing II Lecture 11: Hashtables Fall Session 2000.
Hashtables. An Abstract data type that supports the following operations: –Insert –Find –Remove Search trees can be used for the same operations but require.
CPSC 252 Hashing Page 1 Hashing We have already seen that we can search for a key item in an array using either linear or binary search. It would be better.
Hash Tables © Rick Mercer.  Outline  Discuss what a hash method does  translates a string key into an integer  Discuss a few strategies for implementing.
Chapter 13 C Advanced Implementations of Tables – Hash Tables.
Department of Computer Engineering Faculty of Engineering, Prince of Songkla University 1 9 – Hash Tables Presentation copyright 2010 Addison Wesley Longman,
Hashing O(1) data access (almost) -access, insertion, deletion, updating in constant time (on average) but at a price… references: Weiss, Goodrich & Tamassia,
Hash Tables ADT Data Dictionary, with two operations – Insert an item, – Search for (and retrieve) an item How should we implement a data dictionary? –
Hash Tables Ellen Walker CPSC 201 Data Structures Hiram College.
Sets and Maps Chapter 9. Chapter Objectives  To understand the Java Map and Set interfaces and how to use them  To learn about hash coding and its use.
CSE 373: Data Structures and Algorithms Lecture 16: Hashing III 1.
CSC 212 – Data Structures Lecture 28: More Hash and Dictionaries.
Sections 10.5 – 10.6 Hashing.
Hashtables.
Hashing CSE 2011 Winter July 2018.
Data Structures and Algorithms for Information Processing
Search by Hashing.
Hash Tables Chapter 11 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common approach.
Hash Tables Chapter 12 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common.
CS202 - Fundamental Structures of Computer Science II
Hash Tables Chapter 12 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common.
Hash Tables Chapter 11 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common approach.
Hash Tables Chapter 12 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common.
Hash Tables Chapter 11 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common approach.
CSC212 Data Structure - Section KL
Collision Handling Collisions occur when different elements are mapped to the same cell.
Hash Tables Chapter 11 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common approach.
Presentation transcript:

hashing1 Hashing It’s not just for breakfast anymore!

hashing2 Hashing: the facts Approach that involves both storing and searching for values Behavior is linear in the worst case, but strong competitor with binary searching in the average case Hashing makes it easy to add and delete elements, an advantage over binary search (since the latter requires sorted array)

hashing3 Dictionary ADT Previously, we have seen a dictionary ADT implemented as a binary search tree A hash table can be used to provide an array-based dictionary implementation Abstract properties of dictionary: –every item has a key –to retrieve an item, specify key and retrieval process fetches associated data

hashing4 Setting up the array One approach to an array-based dictionary would be to create consecutive keys, storing the records so that each key corresponds to its index -- this is the method used in MS Access, for example An alternative would be to use an existing attribute of the data to be stored as the key value; this approach is more typical of hashing

hashing5 Setting up the array Use of existing key field presents challenges: –Value may be too large for indexing: e.g. social security number –No guarantee that individual values will be close enough together for effective indexing: e.g. last 4 digits of social security numbers of students in a class

hashing6 Solution: hashing Instead of direct use of data field, a function is applied to the original value to produce a valid index: this is called the hash function The hash function maps the key to an index that can be used to insert data into the array or to retrieve data based on a given key An array that uses hashing for indexing is called a hash table

hashing7 Operations on a hash table Inserting an item –calculate hash value (index) from item key –check index to determine if space is open if open, insert item if not open, collision occurs; search through array for next open slot –requires some mechanism for recognizing an empty space; can’t just start with uninitialized array

hashing8 Open-address hashing The insertion scheme just described uses open-address hashing In open addressing, collisions are resolved by placing a new item in the next open spot in the array Scheme requires that the key field of each array element be initialized to some known value; -1, for example

hashing9 Inserting a New Record In order to insert a new record, the key must somehow be converted to an array index. The index is called the hash value of the key. [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ] [ 700] Number Number Number Number Number

hashing10 Inserting a New Record Typical hash function –701 is the number of items in the array –Number is the original key value [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ] [ 700] Number Number Number Number Number (Number mod 701) What is ( mod 701) ? 3

hashing11 Inserting a New Record The hash value is used for the location of the new record. [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ] [ 700] Number Number Number Number [3]

hashing12 CollisionsCollisions Here is another new record to insert, with a hash value of 2. [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ] [ 700] Number Number Number Number Number Number My hash value is [2].

hashing13 CollisionsCollisions This is called a collision, because there is already another valid record at [2]. [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ] [ 700] Number Number Number Number Number Number When a collision occurs, move forward until you find an empty spot. When a collision occurs, move forward until you find an empty spot.

hashing14 CollisionsCollisions [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ] [ 700] Number Number Number Number Number Number The new record goes in the empty spot. The new record goes in the empty spot.

hashing15 Operations on a hash table Retrieving an item –calculate hash value based on desired key –search array, beginning at calculated index, for desired data –search is finished when: item is found; successful search an empty index is encountered; unsuccessful search

hashing16 Searching for a Key The data that's attached to a key can be found fairly quickly. [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ] [ 700] Number Number Number Number Number Number

hashing17 Searching for a Key Calculate the hash value. Check that location of the array for the key. [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ] [ 700] Number Number Number Number Number Number Not me. Number My hash value is [2].

hashing18 Searching for a Key Keep moving forward until you find the key, or you reach an empty spot. [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ] [ 700] Number Number Number Number Number Number Not me. Number My hash value is [2]. Not me.Yes!

hashing19 Searching for a Key When the item is found, the information can be copied to the necessary location. [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ] [ 700] Number Number Number Number Number Number My hash value is [2].

hashing20 Operations on a hash table Deleting an item: –find index based on hashed key, as with insertion and retrieval –mark record at index to indicate the spot is open can’t use ordinary “empty” designation -- this could interfere with record retrieval use alternative “open” designation: indicate the slot is open for insertion, but won’t stop a search

hashing21 Deleting a Record Records may also be deleted from a hash table. But the location must not be left as an ordinary "empty spot" since that could interfere with searches. The location must be marked in some special way so that a search can tell that the spot used to have something in it. [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ] [ 700] Number Number Number Number Number Number Please delete me.

Hash tables & Java: a breakfast classic Java classes have built-in support for hash tables; all classes inherit the hashcode() method from Object The hashcode() method returns a pseudorandom number that can be used as input to a hash method One caution: if you define an equals() method for a new class, you should also override the hashcode method – this is because equal objects must have equal hashcodes hashing22

A Table class implementation hashing23 public class Table { private int manyItems; private Object[ ] data; // hash table data private Object[ ] keys; // array of keys parallel to data array private boolean[ ] hasBeenUsed; // another parallel array used to indicate the status // of each index – if not currently in use, but has // been used previously, an index should not stop // a search operation

Constructor hashing24 public Table(int capacity) { if (capacity <= 0) throw new IllegalArgumentException("Capacity is negative"); keys = new Object[capacity]; data = new Object[capacity]; hasBeenUsed = new boolean[capacity]; }

The put method (adds item to table) hashing25 public Object put(Object key, Object element) { int index = findIndex(key); Object answer; if (index != -1) { // The key is already in the table. answer = data[index]; data[index] = element; return answer; }

put method continued hashing26 else if (manyItems < data.length) { // key is not yet in Table. index = hash(key); while (keys[index] != null) index = nextIndex(index); keys[index] = key; data[index] = element; hasBeenUsed[index] = true; manyItems++; return null; } else { // The table is full. throw new IllegalStateException("Table is full."); } }

remove method hashing27 public Object remove(Object key) { int index = findIndex(key); Object answer = null; if (index != -1) { answer = data[index]; keys[index] = null; data[index] = null; manyItems--; } return answer; }

findIndex method hashing28 private int findIndex(Object key) { int count = 0; int i = hash(key); while (count < data.length && hasBeenUsed[i]) { if (key.equals(keys[i])) return i; count++; i = nextIndex(i); } return -1; }

nextIndex & containsKey methods hashing29 private int nextIndex(int i) { if (i+1 == data.length) return 0; else return i+1; } public boolean containsKey(Object key) { return findIndex(key) != -1; }

hash method hashing30 private int hash(Object key) { return Math.abs(key.hashCode( )) % data.length; }

Java’s HashTable class The Java API contains a HashTable class in the java.util package Unlike the implementation just presented, the API HashTable grows automatically when it approaches capacity This has important performance issues; when the array grows, the entire table must be rehashed hashing31