CS203 Lecture 14. Hashing An object may contain an arbitrary amount of data, and searching a data structure that contains many large objects is expensive.

Slides:



Advertisements
Similar presentations
CSE 1302 Lecture 23 Hashing and Hash Tables Richard Gesick.
Advertisements

The Dictionary ADT Definition A dictionary is an ordered or unordered list of key-element pairs, where keys are used to locate elements in the list. Example:
Hashing. CENG 3512 Motivation The primary goal is to locate the desired record in a single access of disk. – Sequential search: O(N) – B+ trees: O(log.
Skip List & Hashing CSE, POSTECH.
Hashing as a Dictionary Implementation
Searching Kruse and Ryba Ch and 9.6. Problem: Search We are given a list of records. Each record has an associated key. Give efficient algorithm.
Maps, Dictionaries, Hashtables
Liang, Introduction to Java Programming, Eighth Edition, (c) 2011 Pearson Education, Inc. All rights reserved Chapter 48 Hashing.
FALL 2004CENG 3511 Hashing Reference: Chapters: 11,12.
Hashing General idea: Get a large array
Introducing Hashing Chapter 21 Copyright ©2012 by Pearson Education, Inc. All rights reserved.
CS212: DATA STRUCTURES Lecture 10:Hashing 1. Outline 2  Map Abstract Data type  Map Abstract Data type methods  What is hash  Hash tables  Bucket.
1 Lecture 12 More on Hashing Multithreading. 2 Speakers!
Hashing Chapter 20. Hash Table A hash table is a data structure that allows fast find, insert, and delete operations (most of the time). The simplest.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
HASHING PROJECT 1. SEARCHING DATA STRUCTURES Consider a set of data with N data items stored in some data structure We must be able to insert, delete.
CS201: Data Structures and Discrete Mathematics I Hash Table.
Hashing 8 April Example Consider a situation where we want to make a list of records for students currently doing the BSU CS degree, with each.
COSC 2007 Data Structures II Chapter 13 Advanced Implementation of Tables IV.
Hash Tables ADT Data Dictionary, with two operations – Insert an item, – Search for (and retrieve) an item How should we implement a data dictionary? –
1 Resolving Collision Although collisions should be avoided as much as possible, they are inevitable Need a strategy for resolving collisions. We look.
Sections 10.5 – 10.6 Hashing.
Hashing (part 2) CSE 2011 Winter March 2018.
Chapter 27 Hashing Jung Soo (Sue) Lim Cal State LA.
Hashing.
May 3rd – Hashing & Graphs
Data Structures Using C++ 2E
Azita Keshmiri CS 157B Ch 12 indexing and hashing
LEARNING OBJECTIVES O(1), O(N) and O(LogN) access times. Hashing:
Hashing CSE 2011 Winter July 2018.
Data Abstraction & Problem Solving with C++
School of Computer Science and Engineering
Slides by Steve Armstrong LeTourneau University Longview, TX
Data Structures Using C++ 2E
Review Graph Directed Graph Undirected Graph Sub-Graph
Dictionaries 9/14/ :35 AM Hash Tables   4
Search by Hashing.
Hash functions Open addressing
The Dictionary ADT Definition A dictionary is an ordered or unordered list of key-element pairs, where keys are used to locate elements in the list. Example:
Data Structures and Database Applications Hashing and Hashtables in C#
Advanced Associative Structures
Hash Table.
Hash Table.
Chapter 28 Hashing.
Extendible Indexing Dina Said
Data Structures and Database Applications Hashing and Hashtables in C#
Hash Tables.
Chapter 21 Hashing: Implementing Dictionaries and Sets
Dictionaries and Their Implementations
TreeMap & HashMap 7/16/2009.
Hash Tables Chapter 12.7 Wherein we throw all the data into random array slots and somehow obtain O(1) retrieval time Nyhoff, ADTs, Data Structures and.
Hash Tables Chapter 12 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common.
Searching CLRS, Sections 9.1 – 9.3.
Advance Database System
CS202 - Fundamental Structures of Computer Science II
Hash Tables Computer Science and Engineering
Advanced Implementation of Tables
Data Structure and Algorithms
Hash Tables Computer Science and Engineering
Advanced Implementation of Tables
Hashing Sections 10.2 – 10.3 Lecture 26 CS302 Data Structures
Hash Tables Computer Science and Engineering
Hash Tables Chapter 12 discusses several ways of storing information in an array, and later searching for the information. Hash tables are a common.
Data Structures – Week #7
Podcast Ch21a Title: Hash Functions
CS210- Lecture 16 July 11, 2005 Agenda Maps and Dictionaries Map ADT
Podcast Ch21b Title: Collision Resolution
17CS1102 DATA STRUCTURES © 2018 KLEF – The contents of this presentation are an intellectual and copyrighted property of KL University. ALL RIGHTS RESERVED.
Chapter 13 Hashing © 2011 Pearson Addison-Wesley. All rights reserved.
Lecture-Hashing.
Presentation transcript:

CS203 Lecture 14

Hashing An object may contain an arbitrary amount of data, and searching a data structure that contains many large objects is expensive suppose your collection of Strings stores the text of various books, you are adding a book, and you need to make sure you are preserving the Set definition – ie that no book already in the Set has the same text as the one you are adding. A hash function maps each datum to a value to a fixed and manageable size. This reduces the search space and makes searching less expensive Hash functions must be deterministic, since when we search for an item we will search for its hashed value. If an identical item is in the list, it must have received the same hash value 2

Hashing Any function that maps larger data to smaller ones must map more than one possible original datum to the same mapped value Diagram from Wikipedia 3 When more than one item in a collection receives the same hash value, a collision is said to occur. There are various ways to deal with this. The simplest is to create a list of all items with the same hash code, and do a sequential or binary search of the list if necessary

Hashing Hash functions often use the data to calculate some numeric value, then perform a modulo operation. This results in a bounded-length hash value. If the calculation ends in % n, the maximum hash value is n-1, no matter how large the original data is. It also tends to produce hash values that are relatively uniformly distributed, minimizing collisions. Modulo may be used again within a hash-based data structure in order to scale the hash values to the number of keys found in the structure 4

Hashing Sparse: A AA AB AC AD AE AF AG AH AI AJ AK AL AM AN AO AP AQ AR AS AT AU AV AW AX AY AZ Not Sparse: Juror 1 Juror 2 Juror 3 Juror 4 5 The more sparse the data, the more useful hashing is

Hashing Hashing is used for many purposes in programming. The one we are interested in right now is that it makes it easier to look up data or memory addresses in a table 6

Why Hashing? 7 The preceding units introduced search trees. An element can be found in O(logn) time in a well-balanced search tree. There is a more efficient way to search for an element in a container: you can use hashing to implement a map or a set to search, insert, and delete an element in an array in O(1) time.

Map 8 Recall that a map stores entries. Each entry contains two parts: key and value. The key is also called a search key, which is used to search for the corresponding value. For example, a dictionary can be stored in a map, where the words are the keys and the definitions of the words are the values. A map is also called a dictionary, a hash table, or an associative array.

What is Hashing? 9 If you know the index of an element in the array, you can retrieve the element using the index in O(1) time. So, can we store the values in an array and use the key as the index to find the value? The answer is yes if you can map a key to an index. The array that stores the values is called a hash table. The function that maps a key to an index in the hash table is called a hash function. Hashing is a technique that retrieves the value using the index obtained from key without performing a search.

Hash Function and Hash Codes 10 A typical hash function first converts a search key to an integer value called a hash code, and then compresses the hash code into an index to the hash table.

What is Hashing? 11 A collision occurs when multiple values receive the same hash code. There are several standard ways to deal with collisions. The first approach, open addressing, involves ways of finding available addresses in the table. The simplest form of this is linear probing, which checks whether the key is found at the expected place in the array. If not, it checks the next spot, etc. There are more complex variants of open addressing, which you can read about in the textbook. The second approach, separate chaining, uses a list at each location in the hash table. When looking up a value by hash, search the list of all nodes with that hash value.

Linear Probing 12

Handling Collisions Using Separate Chaining 13 The separate chaining scheme places all entries with the same hash index into the same location, rather than finding new locations. Each location in the separate chaining scheme is called a bucket. A bucket is a container that holds multiple entries.

Implementing Map Using Hashing 14 Run TestMyHashMap MyHashMap MyMap