Hash Tables Chapter 12 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common.

Slides:



Advertisements
Similar presentations
Chapter 12 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common approach to the.
Advertisements

Searching “It is better to search, than to be searched” --anonymous.
CSC212 Data Structure - Section AB Lecture 20 Hashing Instructor: Edgardo Molina Department of Computer Science City College of New York.
Searching Kruse and Ryba Ch and 9.6. Problem: Search We are given a list of records. Each record has an associated key. Give efficient algorithm.
hashing1 Hashing It’s not just for breakfast anymore!
hashing1 Hashing It’s not just for breakfast anymore!
L l Chapter 11 discusses several ways of storing information in an array, and later searching for the information. l l Hash tables are a common approach.
Hashing General idea: Get a large array
Aree Teeraparbseree, Ph.D
Introducing Hashing Chapter 21 Copyright ©2012 by Pearson Education, Inc. All rights reserved.
COSC 2007 Data Structures II
CS212: DATA STRUCTURES Lecture 10:Hashing 1. Outline 2  Map Abstract Data type  Map Abstract Data type methods  What is hash  Hash tables  Bucket.
Hashing Chapter 20. Hash Table A hash table is a data structure that allows fast find, insert, and delete operations (most of the time). The simplest.
CSC211 Data Structures Lecture 20 Hashing Instructor: Prof. Xiaoyan Li Department of Computer Science Mount Holyoke College.
P p Chapter 11 discusses several ways of storing information in an array, and later searching for the information. p p Hash tables are a common approach.
P p Chapter 11 discusses several ways of storing information in an array, and later searching for the information. p p Hash tables are a common approach.
CS201: Data Structures and Discrete Mathematics I Hash Table.
Chapter 10 Hashing. The search time of each algorithm depend on the number n of elements of the collection S of the data. A searching technique called.
CHAPTER 8 SEARCHING CSEB324 DATA STRUCTURES & ALGORITHM.
Hashing Chapter 7 Section 3. What is hashing? Hashing is using a 1-D array to implement a dictionary o This implementation is called a "hash table" Items.
COSC 2007 Data Structures II Chapter 13 Advanced Implementation of Tables IV.
CS261 Data Structures Hash Tables Open Address Hashing.
Department of Computer Engineering Faculty of Engineering, Prince of Songkla University 1 9 – Hash Tables Presentation copyright 2010 Addison Wesley Longman,
Hash Tables ADT Data Dictionary, with two operations – Insert an item, – Search for (and retrieve) an item How should we implement a data dictionary? –
CSC 213 – Large Scale Programming. Today’s Goal  Review when, where, & why we use Map s  Why Sequence -based approach causes problems  How hash can.
1  H ASH TABLES ARE A COMMON APPROACH TO THE STORING / SEARCHING PROBLEM. Hashing & Hash Tables Dr. Yousef Qawqzeh CSI 312 Data Structures.
CS203 Lecture 14. Hashing An object may contain an arbitrary amount of data, and searching a data structure that contains many large objects is expensive.
Chapter 27 Hashing Jung Soo (Sue) Lim Cal State LA.
Hashtables.
Data Structures Using C++ 2E
Binary Search Trees One of the tree applications in Chapter 10 is binary search trees. In Chapter 10, binary search trees are used to implement bags.
Hashing CSE 2011 Winter July 2018.
Slides by Steve Armstrong LeTourneau University Longview, TX
Data Structures Using C++ 2E
Review Graph Directed Graph Undirected Graph Sub-Graph
Search by Hashing.
Hash functions Open addressing
The Dictionary ADT Definition A dictionary is an ordered or unordered list of key-element pairs, where keys are used to locate elements in the list. Example:
Design and Analysis of Algorithms
Hash Table.
Chapter 28 Hashing.
Hash Tables Chapter 11 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common approach.
Computer Science 2 Hashing
Hash Tables.
Binary Search Trees One of the tree applications in Chapter 10 is binary search trees. In Chapter 10, binary search trees are used to implement bags.
Chapter 21 Hashing: Implementing Dictionaries and Sets
Dictionaries and Their Implementations
Resolving collisions: Open addressing
CSE373: Data Structures & Algorithms Lecture 14: Hash Collisions
Heaps Chapter 10 has several programming projects, including a project that uses heaps. This presentation shows you what a heap is, and demonstrates two.
Heaps Chapter 10 has several programming projects, including a project that uses heaps. This presentation shows you what a heap is, and demonstrates two.
Heaps Chapter 11 has several programming projects, including a project that uses heaps. This presentation shows you what a heap is, and demonstrates.
CS202 - Fundamental Structures of Computer Science II
Sorting "There's nothing in your head the sorting hat can't see. So try me on and I will tell you where you ought to be." -The Sorting Hat, Harry Potter.
Heaps Chapter 11 has several programming projects, including a project that uses heaps. This presentation shows you what a heap is, and demonstrates.
Advanced Implementation of Tables
Hashing Sections 10.2 – 10.3 Lecture 26 CS302 Data Structures
Hash Tables Chapter 12 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common.
Hash Tables Chapter 11 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common approach.
Hash Tables Chapter 12 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common.
Hash Tables Chapter 11 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common approach.
CSC212 Data Structure - Section KL
Collision Handling Collisions occur when different elements are mapped to the same cell.
Hash Maps Introduction
17CS1102 DATA STRUCTURES © 2018 KLEF – The contents of this presentation are an intellectual and copyrighted property of KL University. ALL RIGHTS RESERVED.
Heaps Chapter 10 has several programming projects, including a project that uses heaps. This presentation shows you what a heap is, and demonstrates two.
Chapter 13 Hashing © 2011 Pearson Addison-Wesley. All rights reserved.
Lecture-Hashing.
Hash Tables Chapter 11 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common approach.
Presentation transcript:

Hash Tables Chapter 12 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common approach to the storing/searching problem. This lecture illustrates hash tables, using open addressing. Before this lecture, students should have seen other forms of a Dictionary, where a collection of data is stored, and each data item has a key associated with it. Data Structures and Other Objects Using C++

Serial Search Algorithm: step through an array, check one item at a time looking for a desired item  

Binary Search  

What is a Hash Table ? The simplest kind of hash table is an array of records. This example has 701 records. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] This lecture introduces hash tables, which are an array-based method for implementing a Dictionary. You should recall that we have seen dictionaries implemented in other ways, for example with a binary search tree. The abstract properties of a dictionary remain the same: We can insert items in the dictionary, and each item has a key associated with it. When we want to retrieve an item, we specify only the key, and the retrieval process finds the associated data. What we do now is use an array to implement the dictionary. The array is an array of records. In this example, we could store up to 701 records in the array. . . . An array of records

What is a Hash Table ? [ 4 ] Each record has a special field --- unique identification value, called its key. In this example, the key is a long integer field called Number. Number 506643548 Each record in the array contains two parts. The first part is a number that we'll use for the key of the item. We could use something else for the keys, such as a string. But for a hash table, numbers make the most convenient keys. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] . . .

What is a Hash Table ? [ 4 ] Number 506643548 The number might be a person's identification number, and the rest of the record has information about the person. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] The numbers might be identification numbers of some sort, and the rest of the record contains information about a person. So the pattern that you see here is the same pattern that you've seen in other dictionaries: Each entry in the dictionary has a key (in this case an identifying number) and some associated data. . . .

What is a Hash Table ? When a hash table is in use, some spots contain valid records, and other spots are "empty". . . . When a hash table is being used as a dictionary, some of the array locations are in use, and other spots are "empty", waiting for a new entry to come along. Oftentimes, the empty spots are identified by a special key. For example, if all our identification numbers are positive, then we could use 0 as the Number that indicates an empty spot. With this drawing, locations [0], [4], [6], and maybe some others would all have Number=0. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 506643548 Number 155778322

Inserting a New Record Number 580625685 In order to insert a new record, the key must somehow be converted to an array index. The index is called the hash value of the key. . . . In order to insert a new entry, the key of the entry must somehow be converted to an index in the array. For our example, we must convert the key number into an index between 0 and 700. The conversion process is called hashing and the index is called the hash value of the key. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 506643548 Number 155778322

Inserting a New Record Number 580625685 Typical way create a hash value through hash function: (Number % 701) What is (580625685 % 701) ? . . . There are many ways to create hash values. Here is a typical approach. a. Take the key mod 701 (which could be anywhere from 0 to 700). So, quick, what is (580,625,685 mod 701) ? [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 506643548 Number 155778322

Inserting a New Record 3 Typical way to create a hash value: Number 580625685 Typical way to create a hash value: (Number mod 701) 3 What is (580625685 mod 701) ? . . . Three. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 506643548 Number 155778322

Inserting a New Record [3] Number 580625685 The hash value is used for the location of the new record. [3] . . . So, this new item will be placed at location [3] of the array. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 506643548 Number 155778322

Inserting a New Record The hash value is used for the location of the new record. . . . The hash value is always used to find the location for the record. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 155778322

Collisions Number 701466868 Here is another new record to insert, with a hash value of 2. My hash value is [2]. . . . Sometimes, two different records might end up with the same hash value. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 155778322

When a collision occurs, find an empty spot --- open-address hashing. Collisions Number 701466868 This is called a collision, because there is already another valid record at [2]. When a collision occurs, move forward until you find an empty spot --- open-address hashing. . . . This is called a collision. When a collision occurs, the insertion process will move forward through the array until an empty spot is found. Sometimes you will have a second collision... [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 155778322

When a collision occurs, Collisions Number 701466868 This is called a collision, because there is already another valid record at [2]. When a collision occurs, move forward until you find an empty spot. . . . ...and a third collision... [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 155778322

When a collision occurs, Collisions Number 701466868 This is called a collision, because there is already another valid record at [2]. When a collision occurs, move forward until you find an empty spot. . . . But if there are any empty spots, eventually you will reach an empty spot, and the new item is inserted here. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 155778322

Collisions This is called a collision, because there is already another valid record at [2]. The new record goes in the empty spot. . . . The new record is always placed in the first available empty spot, after the hash value. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322

A Quiz Where would you be placed in this table, if there is no collision? Use your social security number or some other favorite number. . . . Time for another quiz . . . Did anyone end up with a hash value of 700? Well, I did. My ID number is 155779023, which has a hash value of 700. (No, not really, but I needed to illustrate another kind of collision. There is a collision with the last location of the array. In this case, you would circle back to the start of the array, and try location number 0, then 1, .... until you find an empty spot. In this example, I would be inserted at location 0, since that location is empty. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322

Searching for a Key Number 701466868 The data that's attached to a key can be found fairly quickly. . . . It is fairly easy to search for a particular item based on its key. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322

Searching for a Key Calculate the hash value. Number 701466868 Calculate the hash value. Check that location of the array for the key. My hash value is [2]. Not me. . . . Start by computing the hash value, which is 2 in this case. Then check location 2. If location 2 has a different key than the one you are looking for, then move forward... [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322

Searching for a Key Number 701466868 Keep moving forward until you find the key, or you reach an empty spot. My hash value is [2]. Not me. . . . ...if the next location is not the one we are looking for, then keep moving forward... [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322

Searching for a Key Number 701466868 Keep moving forward until you find the key, or you reach an empty spot. My hash value is [2]. Not me. . . . Keep moving forward until you find the sought-after key... [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322

Searching for a Key Number 701466868 Keep moving forward until you find the key, or you reach an empty spot. My hash value is [2]. Yes! . . . In this case we find the key at location [5]. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322

Searching for a Key Number 701466868 When the item is found, the information can be copied to the necessary location. My hash value is [2]. Yes! . . . The data from location [5] can then be copied to to provide the result of the search function. What happens if a search reaches an empty spot? In that case, it can halt and indicate that the key was not in the hash table. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322

Deleting a Record Records may also be deleted from a hash table. Please delete me. . . . Records can be deleted from a hash table... [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 580625685 Number 506643548 Number 701466868 Number 155778322

Deleting a Record Records may also be deleted from a hash table. But the location must not be left as an ordinary "empty spot" since that could interfere with searches. . . . But the spot of the deleted record cannot be left as an ordinary empty spot, since that would interfere with searches. (Remember that a search can stop when it reaches an empty spot.) [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 580625685 Number 701466868 Number 155778322

Deleting a Record Records may also be deleted from a hash table. But the location must not be left as an ordinary "empty spot" since that could interfere with searches. The location must be marked in some special way so that a search can tell that the spot used to have something in it. . . . Instead we must somehow mark the location as "a location that used to have something here, but no longer does." We might do this by using some other special value for the Number field of the record. In any case, a search can not stop when it reaches "a location that used to have something here". A search can only stop when it reaches a true empty spot. [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 700] Number 281942902 Number 233667136 Number 580625685 Number 701466868 Number 155778322

Double Hashing Linear probing: the insertion function move forward from the calculated index until a vacant spot is found in the hash table Double hashing: uses a second hash function to determine how to resolve a collision Example: hash1(key) = 330; hash2(key) = 7 If location 330 is occupied, then move forward 7 spots and try location 337. If 337 is also occupied, then move forward 7 spots to location 344, and so on. If the calculation of the next spot beyond the end of the array, then “wrap around” to the front of the array

Quadratic Probing If there is a collision, then move to the next array index. If there is a second collision, then move forward two more spots through the array. After a third collision, move forward three more spots, and so on. If the calculation of the next spot beyond the end of the array, then “wrap around” to the front of the array

Chained Hashing Open-address hashing: a collision is handled by probing the array for an unused position. Each array location can hold just one entry. Chained hashing (Chaining): each location of the hash table’s array can hold more than one entry

Time Analysis  

Summary Hash tables store a collection of records with keys. The location of a record depends on the hash value of the record's key. When a collision occurs, the next available location is used. Searching for a particular key is generally quick. When an item is deleted, the location must be marked in a special way, so that the searches know that the spot used to be used. A quick summary . . .