Chapter 10 Hashing

The search time of each searching algorithm discussed so far depends on the number n of elements in the collection S of data. Hashing, or hash addressing, is a searching technique whose search time is essentially independent of n. Hashing uses a data structure called a hash table. Although hash tables provide fast insertion, deletion, and retrieval, operations that require examining the whole table, such as finding the minimum or maximum value, are not performed quickly. Hashing is also used in many cryptographic algorithms.

A hash table is a data structure in which keys are mapped to array positions by a hash function. The table can be searched for an item in O(1) time on average, using the hash function to form an address from the key.

A hash function is a function which, when applied to a key, produces an integer that can be used as an address in the hash table. A perfect hash function maps every key to a distinct position, so no two keys share an address. A good hash function is quick to compute and spreads the keys evenly over the table.

A collision is the condition that results when two or more keys produce the same hash location, that is, when more than one element tries to occupy the same array position.

Comparison of keys was the main operation used by the previously discussed searching methods. A different way of searching calculates the position of the key in the table from the value of the key itself. The search time is then reduced to O(1), from O(n) or O(log n). We need to find a function h that can transform a key K (string, number, record, etc.) into an index in the table used for storing items of the same type as K. This function is called a hash function.

Example: Suppose we want to store a sequence of randomly generated numbers (the keys): 5, 17, 37, 20, 42, 3. The array A, the hash table in which we want to store the numbers, is initially empty:

|    |    |    |    |    |    |    |    |    |

We need a way of mapping the numbers to array indexes, a hash function, that lets us store the numbers and later recompute the index when we want to retrieve them. There is a natural choice for this.

Our hash table has 9 fields, and the mod function, which sends every integer to its remainder modulo 9, maps an integer to a number between 0 and 8:

5 mod 9 = 5
17 mod 9 = 8
37 mod 9 = 1
20 mod 9 = 2
42 mod 9 = 6
3 mod 9 = 3

We store the values at those indexes:

Index: |  0 |  1 |  2 |  3 |  4 |  5 |  6 |  7 |  8 |
Value: |    | 37 | 20 |  3 |    |  5 | 42 |    | 17 |

In this case, computing the hash value of a number n to be stored, n mod 9, costs a constant amount of time. So does the actual storage, because n is stored directly in an array field.
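The example above can be reproduced in a few lines of Python. This is only a minimal sketch: the names TABLE_SIZE, h, table, and contains are chosen here for illustration, and the code relies on the fact that these six keys happen not to collide.

```python
TABLE_SIZE = 9  # number of fields in the example table

def h(key: int) -> int:
    """Hash function from the example: remainder modulo the table size."""
    return key % TABLE_SIZE

# Store each key directly at the index computed by h.
table = [None] * TABLE_SIZE
for key in [5, 17, 37, 20, 42, 3]:
    table[h(key)] = key  # safe only because these keys do not collide

def contains(key: int) -> bool:
    """Recompute the index and inspect that single field."""
    return table[h(key)] == key

print(table)         # [None, 37, 20, 3, None, 5, 42, None, 17]
print(contains(42))  # True
```

Both storing and looking up a key touch exactly one array field, which is why the cost does not grow with the number of stored keys.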

Hash Functions

1. Division
A hash function must guarantee that the number it returns is a valid index to one of the table entries. The simplest way to do this is division modulo TSize, where TSize = sizeof(table), as in h(K) = K mod TSize. It is best if TSize is a prime number.
Advantages: the method is simple, and it is useful if we don't know much about the keys.

2. Extraction
Idea: use only part of the key to compute the hash value (address, index).
Example: the key is the SSN 123-45-6789. This method might use, for example, the first four digits (1234), the last four (6789), or the first two combined with the last two (1289) as the index.
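A short Python sketch of division and extraction follows. The table size 997 (a prime) and the function names are assumptions made for illustration, and the SSN 123-45-6789 is the key implied by the digit groups quoted in the text.

```python
T_SIZE = 997  # assumed table size; a prime, as the text recommends

def h_division(key: int, t_size: int = T_SIZE) -> int:
    """Division method: the remainder is always a valid index 0..t_size-1."""
    return key % t_size

def h_extraction(key: str, positions) -> int:
    """Extraction method: build the index from selected digit positions only."""
    digits = [c for c in key if c.isdigit()]  # drop the dashes
    return int("".join(digits[p] for p in positions))

ssn = "123-45-6789"
print(h_extraction(ssn, [0, 1, 2, 3]))  # first four digits: 1234
print(h_extraction(ssn, [5, 6, 7, 8]))  # last four digits: 6789
print(h_extraction(ssn, [0, 1, 7, 8]))  # first two and last two: 1289
print(h_division(123456789))            # always in the range 0..996
```

Extraction ignores most of the key, so it is only a reasonable choice when the digits that are kept vary enough between keys.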

Hash Functions (cont.)

3. Folding
Idea: divide the key into parts, then combine ("fold") the parts to create the index.
The key is first divided into several parts, each the same length as the desired index. These parts are then combined, or folded together, and are usually transformed in a certain way to create the index (address) into the table.
Note: if the index that results from combining the parts is greater than the desired length, apply either division (the usual choice) or extraction.

There are two types of folding.

1) Shift folding
The key is divided into several parts, and these parts are added together to create the index.
Example: the key is the SSN 123-45-6789. It can be divided into three parts, 123, 456, and 789, and these parts can be added: 123 + 456 + 789 = 1,368. The result can then be divided modulo TSize.

2) Boundary folding
Same as shift folding, except that every other part is written backwards.
Example: the key is the SSN 123-45-6789, with three parts 123, 456, and 789. The first part is taken in the same order, the second part in reverse order, and the third part in the same order. The result is 123 + 654 + 789 = 1,566, followed by division.
Example: Key is (missing in source). Boundary folding: (missing in source) = 1228.

This process is simple and fast, especially when bit patterns are used instead of numerical values; in that case, replace the addition in the previous examples with XOR.
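The two folding variants can be sketched in Python as follows. The helper names and the table size of 1,000 used in the last two lines are assumptions for illustration; the key and the part width of three digits come from the SSN example.

```python
def split_parts(key: str, width: int):
    """Strip non-digits and cut the key into consecutive parts of `width` digits."""
    digits = "".join(c for c in key if c.isdigit())
    return [digits[i:i + width] for i in range(0, len(digits), width)]

def shift_fold(key: str, width: int) -> int:
    """Shift folding: add the parts as they are."""
    return sum(int(part) for part in split_parts(key, width))

def boundary_fold(key: str, width: int) -> int:
    """Boundary folding: reverse every other part (2nd, 4th, ...) before adding."""
    parts = split_parts(key, width)
    folded = [p[::-1] if i % 2 == 1 else p for i, p in enumerate(parts)]
    return sum(int(part) for part in folded)

ssn = "123-45-6789"
print(shift_fold(ssn, 3))     # 123 + 456 + 789 = 1368
print(boundary_fold(ssn, 3))  # 123 + 654 + 789 = 1566

T_SIZE = 1000  # assumed table size, used only to reduce the sums to an index
print(shift_fold(ssn, 3) % T_SIZE)     # 368
print(boundary_fold(ssn, 3) % T_SIZE)  # 566
```

For the bit-pattern variant mentioned in the text, the `sum` would be replaced by an XOR over the parts.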

Hash Functions (cont.)

4. Mid-square function
Idea: square the key (multiply the key by itself), then use the middle part of the result as the address.
Note: extraction can be used to take the middle part.
Example: the key is 3121. Square the key: 3,121^2 = 9,740,641. Then use the middle part as the address (406). Here, for a 1,000-cell table, h(3,121) = 406.

5. Radix transformation
Idea: convert the key into another number base, so that the key is expressed in a numerical system using a different radix; division (modulo) can then be applied to the result.
Example: convert the key to base 7, then use the result as the hash value.
This method may cause collisions.
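A minimal Python sketch of both methods is given below. The function names are illustrative, and the key 345 and the table size 100 in the radix example are hypothetical values chosen here, not taken from the slides. The sketch assumes the new base is between 2 and 10, so that the converted digits can be read back as an ordinary decimal number.

```python
def mid_square(key: int, n_digits: int) -> int:
    """Square the key and extract the n_digits digits in the middle."""
    squared = str(key * key)
    start = (len(squared) - n_digits) // 2
    return int(squared[start:start + n_digits])

def to_base(key: int, base: int) -> int:
    """Write key in `base` (2..10) and read the digit string as a decimal number."""
    if key == 0:
        return 0
    digits = []
    while key > 0:
        digits.append(str(key % base))  # least significant digit first
        key //= base
    return int("".join(reversed(digits)))

def radix_hash(key: int, base: int, t_size: int) -> int:
    """Radix transformation followed by division modulo the table size."""
    return to_base(key, base) % t_size

print(mid_square(3121, 3))      # 3121^2 = 9740641, middle three digits: 406
print(to_base(345, 7))          # 345 = 1*343 + 0*49 + 0*7 + 2, written 1002
print(radix_hash(345, 7, 100))  # 1002 mod 100 = 2
```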

Detecting and resolving collisions

Even with the methods introduced previously, collisions may still occur. We cannot store two keys in the same location, so we must find a way to resolve collisions. The choice of hash function and the choice of table size may reduce collisions, but will not eliminate them.

Methods for resolving collisions:
open addressing: find another empty position in the table
chaining: keep the colliding elements in a linked list
bucket addressing: store the colliding elements at the same location
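Two of these methods can be sketched in Python. Several choices here are assumptions made for illustration: the table size of 9 and the hash function are reused from the earlier example, Python lists stand in for linked lists, and open addressing is shown with linear probing (trying the next position in turn), which is only one of several possible probing strategies. Deletion is left out.

```python
TABLE_SIZE = 9

def h(key: int) -> int:
    return key % TABLE_SIZE

# Chaining: each table position holds a list of all keys that hash there.
chains = [[] for _ in range(TABLE_SIZE)]

def chain_insert(key: int) -> None:
    chains[h(key)].append(key)

def chain_contains(key: int) -> bool:
    return key in chains[h(key)]

# Open addressing with linear probing: on a collision, try the next position.
slots = [None] * TABLE_SIZE

def probe_insert(key: int) -> int:
    for step in range(TABLE_SIZE):
        i = (h(key) + step) % TABLE_SIZE
        if slots[i] is None:
            slots[i] = key
            return i  # index where the key was finally stored
    raise RuntimeError("hash table is full")

def probe_contains(key: int) -> bool:
    for step in range(TABLE_SIZE):
        i = (h(key) + step) % TABLE_SIZE
        if slots[i] is None:
            return False  # an empty position ends the probe sequence
        if slots[i] == key:
            return True
    return False

for key in [5, 14, 23]:  # all three keys hash to position 5
    chain_insert(key)
    probe_insert(key)

print(chains[5])   # [5, 14, 23]
print(slots[5:8])  # [5, 14, 23]
```

With chaining the table never fills up but lookups walk a list; with open addressing everything stays in the array but a lookup may have to probe several positions.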