Hashing. Hashing is the transformation of a string of characters into a usually shorter fixed-length value or key that represents the original string.

Hashing

Hashing is the transformation of a string of characters into a usually shorter fixed-length value or key that represents the original string. Hashing is used to index and retrieve items in a database because it is faster to find the item using the shorter hashed key than to find it using the original value. It is also used in many encryption algorithms.

Hash Table

A data structure that associates keys with values.

[Figure: a small phone book as a hash table.]

Hash Table (1)

The primary operation a hash table supports efficiently is a lookup: given a key (a person's name), find the corresponding value (that person's telephone number). It works by transforming the key with a hash function into a hash, a number used as an index into an array to locate the slot where the value should be stored.

Hash Function

A hash function is any well-defined procedure or mathematical function that converts a large, possibly variable-sized amount of data into a small datum, usually a single integer that may serve as an index into an array. The values returned by a hash function are called hash values, hash codes, hash sums, or simply hashes.
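As a rough illustration of the definition above, the sketch below maps a variable-length string to a small integer index. The name `simple_hash`, the multiplier 31, and the table size 11 are assumptions chosen for this example; the slides do not prescribe a particular function.

```python
def simple_hash(key: str, table_size: int) -> int:
    """Map a string of any length to an index in range(table_size)."""
    h = 0
    for ch in key:
        # Polynomial accumulation; 31 is an arbitrary small multiplier (assumption).
        h = (h * 31 + ord(ch)) % table_size
    return h


# The same key always produces the same index into the array.
index = simple_hash("Elmer", 11)
```

Any function with this shape works as long as it is deterministic and spreads keys reasonably evenly over the table.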

Hash Function

1. Direct Hashing

The key is the address without any algorithmic manipulation. The data structure must therefore contain an element for every possible key. The situations where direct hashing can be used are limited, but where it applies it is very powerful because it guarantees that there are no synonyms (no two keys hash to the same address).

[Figure: direct hashing. The hash function maps each key to the address with the same value: 001 Elmer, 002 Markh, 005 Reymund, 007 Hubert, 100 Rollyn.]
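A minimal sketch of direct hashing, using the keys and names from the figure. The table size of 101 slots (keys 0 to 100) is an assumption made so that every possible key has its own element.

```python
# Direct hashing: the key itself is the address, so the table needs
# one slot for every possible key (assumed key range here: 0..100).
table = [None] * 101


def direct_hash(key: int) -> int:
    return key  # no algorithmic manipulation


for key, name in [(1, "Elmer"), (2, "Markh"), (5, "Reymund"),
                  (7, "Hubert"), (100, "Rollyn")]:
    table[direct_hash(key)] = name
```

Most slots stay empty, which is the cost of guaranteeing that no two keys share an address.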

2. Subtraction Method

Sometimes we have keys that are consecutive but do not start from one. Example: a company may have only 100 employees, but the employee numbers start from 1000. In this case, we use a very simple hashing function that subtracts 1000 from the key to determine the address.
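The subtraction method can be sketched in a couple of lines. The constant name `BASE` and the sample key 1042 are assumptions for illustration; the value 1000 comes from the example above.

```python
BASE = 1000  # value subtracted from every key, as in the example above


def subtraction_hash(key: int) -> int:
    """Address is the key minus a fixed base."""
    return key - BASE


address = subtraction_hash(1042)  # 42
```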

3. Digit Extraction

Selected digits are extracted from the key and used as the address. Example: to hash a six-digit employee number to a three-digit address, we could select the first, third, and fourth digits. For the key 379452, this selects 3, 9, and 4, giving the address 394.
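The following sketch shows digit extraction with the first, third, and fourth digits of a six-digit key. The function name, the `positions` parameter, and the zero-padding to a fixed width are assumptions added for the example.

```python
def digit_extraction_hash(key: int, positions=(1, 3, 4), width: int = 6) -> int:
    """Build the address from selected (1-based) digit positions of the key."""
    digits = str(key).zfill(width)  # pad so digit positions are stable
    return int("".join(digits[p - 1] for p in positions))


address = digit_extraction_hash(379452)  # 394
```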

4. Modulo Division

Divides the key by the array size and uses the remainder + 1 as the address.

[Figure: modulo division. Records for Elmer (key 379452), Markh, Hubert, Arno, and Rollyn are hashed into a list with addresses [001] to [307].]
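A short sketch of modulo division, assuming the list size of 307 that appears in the figure. The function name is hypothetical.

```python
def modulo_division_hash(key: int, list_size: int = 307) -> int:
    """Address is the remainder of key / list_size, plus 1 (1-based addresses)."""
    return key % list_size + 1


address = modulo_division_hash(379452)  # 1, since 379452 = 307 * 1236 exactly
```

Adding 1 shifts the remainder range 0..306 to the address range 1..307 used in the slides.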

5. Midsquare Hashing

The key is squared and the address is selected from the middle of the squared number. Example: 9452 * 9452 = 89340304: address is 3403.

As a variation, we can select a portion of the key and square that rather than the whole key. Taking the third to fifth digits of the square as the address:

379 * 379 = 143641: address is 364
378 * 378 = 142884: address is 288
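The midsquare idea can be sketched as follows. The `start` and `length` parameters (a 0-based offset into the squared number and the number of digits to keep) are assumptions introduced here to say which middle digits are taken.

```python
def midsquare_hash(key: int, start: int, length: int) -> int:
    """Square the key and take `length` digits beginning at 0-based offset `start`."""
    squared = str(key * key)
    return int(squared[start:start + length])


a = midsquare_hash(9452, 2, 4)  # 9452 * 9452 = 89340304 -> 3403
b = midsquare_hash(378, 2, 3)   # 378 * 378 = 142884 -> 288
```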

6. Folding Methods

Two folding methods are used:

- Fold shift: the key value is divided into parts whose size matches the size of the required address. The left and right parts are then shifted and added to the middle part.
- Fold boundary: the left and right numbers are folded on a fixed boundary between them and the center number, which reverses the digits of the two outside values before the parts are added.

[Figure: fold shift and fold boundary applied to a key. Labels: Key, Digits reversed, Discarded.]
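Both folding methods can be sketched as below. The key 123456789, the part size of 3, and the rule of discarding any carry beyond the address width are assumptions for illustration; the helper `_parts` is hypothetical.

```python
def _parts(key: int, size: int):
    """Split the key's digits into consecutive parts of `size` digits."""
    s = str(key)
    return [s[i:i + size] for i in range(0, len(s), size)]


def fold_shift(key: int, size: int) -> int:
    total = sum(int(p) for p in _parts(key, size))
    return total % 10 ** size  # discard any carry digit


def fold_boundary(key: int, size: int) -> int:
    parts = _parts(key, size)
    parts[0] = parts[0][::-1]    # reverse the digits of the left part
    parts[-1] = parts[-1][::-1]  # reverse the digits of the right part
    total = sum(int(p) for p in parts)
    return total % 10 ** size  # discard any carry digit


shifted = fold_shift(123456789, 3)    # 123 + 456 + 789 = 1368 -> 368
folded = fold_boundary(123456789, 3)  # 321 + 456 + 987 = 1764 -> 764
```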

Load Factor

The number of elements in the list divided by the number of physical elements allocated for the list, expressed as a percentage:

α = (k / n) × 100

where k is the number of elements in the list and n is the number of physical elements allocated.

Clustering

The tendency of data to build up unevenly across a hashed list. It is usually created by collisions.
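The load factor formula above is a one-line computation. The sample figures (75 elements in 100 allocated slots) are assumptions chosen for the example.

```python
def load_factor(k: int, n: int) -> float:
    """Percentage of allocated slots that hold an element."""
    return k / n * 100


alpha = load_factor(75, 100)  # 75.0
```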

Collision

The event that occurs when a hashing algorithm produces an address for an insertion key and that address is already occupied.

Home Address: the address produced by the hashing algorithm.

Prime Area: the memory that contains all of the home addresses.

Probe: the calculation of an address and the test for success.

[Figure: keys A, B, and C are hashed in turn (1. hash(A), 2. hash(B), 3. hash(C)) into a list with slots [1], [5], [9], and [17]. B collides with A, and C collides with B.]

Collision Resolution

The process of finding an alternate location.

Collision strategy techniques:
- Separate chaining
- Open addressing
- Coalesced hashing
- Perfect hashing
- Dynamic perfect hashing
- Probabilistic hashing
- Robin hood hashing
- Cache-conscious collision resolution

Separate Chaining

Sometimes called simply chaining or direct chaining. In its simplest form, each slot in the array is a linked list, or the head cell of a linked list, that contains the elements that hashed to the same location. Insertion requires finding the correct slot, then appending to either end of the list in that slot.
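A minimal sketch of separate chaining follows. It uses Python lists in place of linked lists and Python's built-in `hash`, both of which are simplifying assumptions; the class name and the default of 7 slots are hypothetical.

```python
class ChainedHashTable:
    def __init__(self, size: int = 7):
        # One chain (here a Python list) per slot.
        self.slots = [[] for _ in range(size)]

    def _index(self, key) -> int:
        return hash(key) % len(self.slots)

    def insert(self, key, value) -> None:
        chain = self.slots[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # key already present: update it
                return
        chain.append((key, value))       # otherwise append to the end of the chain

    def lookup(self, key):
        for k, v in self.slots[self._index(key)]:
            if k == key:
                return v
        return None  # key is not in the table


table = ChainedHashTable()
table.insert(1, "Elmer")
table.insert(8, "Markh")  # 8 % 7 == 1, so this lands in the same chain as key 1
```

Colliding keys simply share a chain, so the array itself never fills up; lookups slow down as chains grow.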

Open Addressing

Open addressing hash tables store the records directly within the array. This approach is also called closed hashing. A hash collision is resolved by probing, or searching through alternate locations in the array (following a probe sequence), until either the target record is found or an unused array slot is found, which indicates that there is no such key in the table.

Well Known Probe Sequences

Linear Probing

A collision is resolved by adding one (1) to the current address.

[Figure: linear probing. Records for Elmer (key 379452), Markh, Hubert, Arno, and Rollyn occupy a list with addresses [001] to [307], with Redjie and Reymund added.]
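Open addressing with linear probing might be sketched like this. The class name, the integer keys, the 0-based slot numbers (the slides use 1-based addresses), and the table size of 7 are all assumptions for the example; deletion is left out.

```python
class LinearProbingTable:
    def __init__(self, size: int = 7):
        self.keys = [None] * size
        self.values = [None] * size

    def _probe(self, key: int):
        """Yield the home slot, then each following slot, wrapping around."""
        size = len(self.keys)
        start = key % size
        for step in range(size):
            yield (start + step) % size

    def insert(self, key: int, value) -> int:
        for i in self._probe(key):
            if self.keys[i] is None or self.keys[i] == key:
                self.keys[i], self.values[i] = key, value
                return i  # slot where the record was stored
        raise RuntimeError("table is full")

    def lookup(self, key: int):
        for i in self._probe(key):
            if self.keys[i] is None:
                return None  # unused slot: no such key in the table
            if self.keys[i] == key:
                return self.values[i]
        return None


table = LinearProbingTable()
first = table.insert(1, "Elmer")    # home slot 1
second = table.insert(8, "Redjie")  # collides at slot 1, stored in slot 2
```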

Quadratic Probing

The increment is the collision probe number squared.

[Table: for each probe number, the collision location, the probe number squared (the increment), and the new address.]
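One common way to realize quadratic probing is to add the square of the probe number to the home address and wrap around the list. The sketch below assumes that form, along with 0-based addresses and a list size of 307; the table on the slide may compute its addresses differently.

```python
def quadratic_probe_sequence(home: int, list_size: int, max_probes: int):
    """Addresses tried after a collision at `home`: home + 1, home + 4, home + 9, ..."""
    return [(home + i * i) % list_size for i in range(1, max_probes + 1)]


sequence = quadratic_probe_sequence(1, 307, 4)  # [2, 5, 10, 17]
```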

Key Offset

A double hashing method that produces a different collision path for different keys.

Formula:
offset = key / listsize (integer quotient)
address = ((offset + old address) modulo listsize) + 1

For example, if the list size is 307 and a key has offset = (key / 307) = 543 and an old address of 002 from modulo division, then address = ((543 + 002) modulo 307) + 1 = 239.
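The key offset formula can be sketched as below. The key 166702 is an assumption: it is a value chosen to be consistent with the offset 543 and the resulting address 239 in the example, with a home address of 2 under modulo division. The function name is hypothetical.

```python
def key_offset_probe(key: int, old_address: int, list_size: int) -> int:
    """Next address to try after a collision at `old_address`."""
    offset = key // list_size  # integer quotient
    return (offset + old_address) % list_size + 1


key = 166702                                    # assumed key (see note above)
home = key % 307 + 1                            # modulo division: 2
next_address = key_offset_probe(key, home, 307) # ((543 + 2) % 307) + 1 = 239
```

Because the offset depends on the key, two keys that collide at the same home address usually follow different probe paths.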

[Figure: a list with addresses [001] to [307] holding records for Elmer (key 379452), Redjie, Markh, Hubert, Arno, Rollyn, Reymund, and Angelus.]

Hash collision resolved by linear probing (interval=1).