Hashing Lesson Plan - 8.


Hashing Lesson Plan - 8

Contents
Evocation, Objective, Introduction, General idea, Hash tables, Types of hashing, Hash function, Hashing methods, Hashing algorithm, Mind map, Summary

ANNEXURE-I Evocation

Objective
To study the basic concepts of hashing techniques and the hashing algorithm.

ANNEXURE-II Introduction: Hashing
In a hashed search, an algorithmic function applied to the key determines the location of the data. The key is transformed into the index of the location that contains the data we need to find. Hashing is a key-to-address mapping process. The implementation of hash tables is called hashing. Hashing is a technique for performing insertions, deletions, and finds in constant average time, i.e. O(1).

General idea
The ideal hash table structure is merely an array of some fixed size that contains the items. A stored item needs a data member, called the key, that is used to compute the index value for the item. The key could be an integer, a string, etc., for example a name or an ID that is part of a large employee structure. The size of the array is TableSize. The items stored in the hash table are indexed by values from 0 to TableSize - 1, so each key is mapped to some number in the range 0 to TableSize - 1. This mapping is called a hash function.
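The mapping from a key to an index in the range 0 to TableSize - 1 can be sketched as follows. This is only an illustration: the table size of 11, the function name `hash_index`, the character-sum treatment of string keys, and the employee record are all assumptions, not part of the slides.

```python
TABLE_SIZE = 11  # assumed table size, chosen only for this sketch


def hash_index(key, table_size=TABLE_SIZE):
    """Map an integer or string key to an index in 0 .. table_size - 1."""
    if isinstance(key, int):
        return key % table_size
    # For non-numeric keys, sum the character codes first (one simple choice).
    return sum(ord(ch) for ch in str(key)) % table_size


# The table itself is just a fixed-size array of slots.
table = [None] * TABLE_SIZE

# Hypothetical employee record; the "id" field serves as the key.
employee = {"id": 1234, "name": "Harry Lee"}
index = hash_index(employee["id"])
table[index] = employee
```

Whatever the key type, the result always falls inside the array bounds, which is the property the slide describes.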

Hashing

Hash Tables
There are two types of hash tables: open-addressed hash tables and separate-chained hash tables.
An open-addressed hash table is a one-dimensional array indexed by integer values that are computed by an index function called a hash function.
A separate-chained hash table is a one-dimensional array of linked lists indexed by integer values that are computed by an index function called a hash function.
Hash tables are sometimes referred to as scatter tables.
Typical hash table operations are initialization, insertion, searching, and deletion.
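A separate-chained table with the four typical operations might look like the minimal sketch below. It is not taken from the slides: the class name `ChainedHashTable`, the default size of 7, and the use of Python lists in place of linked lists are assumptions, and Python's built-in `hash()` stands in for the hash function.

```python
class ChainedHashTable:
    """Separate chaining: an array of buckets, each holding (key, value) pairs."""

    def __init__(self, size=7):
        # Initialization: one empty chain per table slot.
        self.buckets = [[] for _ in range(size)]
        self.count = 0

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def insert(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # key already present: overwrite
                return
        bucket.append((key, value))
        self.count += 1

    def search(self, key):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return None

    def delete(self, key):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                del bucket[i]
                self.count -= 1
                return True
        return False
```

Keys that hash to the same slot simply share a chain, so a collision never forces a record into a different slot.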

Types of Hashing
There are two types of hashing:
Static hashing: the hash function maps search-key values to a fixed set of locations.
Dynamic hashing: the hash table can grow to handle more items, and the associated hash function must change as the table grows.
The load factor of a hash table is the ratio of the number of keys in the table to the size of the hash table.
Note: The higher the load factor, the slower the retrieval. With open addressing, the load factor cannot exceed 1. With chaining, the load factor often exceeds 1.
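The load factor is a simple ratio, as this short sketch shows. The key counts and the table size of 307 are made-up numbers for illustration only.

```python
def load_factor(num_keys, table_size):
    """Ratio of the number of stored keys to the number of table slots."""
    return num_keys / table_size


# Open addressing: every key occupies its own slot, so the ratio is at most 1.
open_addressed = load_factor(230, 307)

# Chaining: one slot can hold several keys, so the ratio can exceed 1.
chained = load_factor(600, 307)
```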

Hash Function
Choice of hash function: it must be simple to compute, must distribute the keys evenly among the cells, and should produce a minimal number of collisions. If a hashing function groups key values together, this is called clustering of the keys. A good hashing function distributes the key values uniformly throughout the range.
Problems: keys may not be numeric; the number of possible keys is much larger than the space available in the table; different keys may map to the same location. Because the hash function is not one-to-one, collisions occur. If there are too many collisions, the performance of the hash table suffers dramatically.

Collision Resolution
A collision occurs when the hashing algorithm produces an address for an insertion key and that address is already occupied. The address produced by the hash algorithm is the home address. The memory that contains all the home addresses is the prime area. When two keys collide at a home address, the collision is resolved by placing one of the keys and its data in another location.
[Figure: a table with addresses [0], [4], [8], [16] and three steps: 1. hash(A), 2. hash(B), 3. hash(C). B and A collide at 8; C and B collide at 16.]
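The slides do not name a specific resolution method, so the sketch below swaps in linear probing, one common open-addressing technique: when the home address is occupied, the following addresses are tried in turn. The function name, the table size of 17, and the home addresses passed in are assumptions, and the alternative locations it picks differ from those in the figure.

```python
def insert_linear_probe(table, key, home_address):
    """Place key at its home address, or at the next free slot after it."""
    size = len(table)
    for step in range(size):
        address = (home_address + step) % size  # wrap around at the end
        if table[address] is None:
            table[address] = key
            return address
    raise OverflowError("hash table is full")


table = [None] * 17
address_a = insert_linear_probe(table, "A", 8)  # home address is free
address_b = insert_linear_probe(table, "B", 8)  # collides with A, moves on
```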

Hashing Methods
Direct, modulo division, midsquare, rotation, subtraction, folding, digit extraction, pseudorandom generation

Hashing Methods: Direct Method
The key is the address, without any algorithmic manipulation. The data structure must contain an element for every possible key.
Example problem: analyze total monthly sales by day of the month. For each sale we need the date and the amount of the sale. To calculate the sales record for the month, we use the day of the month as the key for the array and add the sale amount to that day's accumulator:
dailySales[sale.day] = dailySales[sale.day] + sale.amount;
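The daily sales example can be written out as a small runnable sketch. The `Sale` record type, the function name, and the sample figures are assumptions added for illustration; index 0 is left unused so that the day of the month is the address itself.

```python
from collections import namedtuple

# Hypothetical record type: the day of the month and the amount of one sale.
Sale = namedtuple("Sale", ["day", "amount"])


def total_sales_by_day(sales, days_in_month=31):
    """Direct hashing: the day of the month is used as the array index."""
    daily_sales = [0.0] * (days_in_month + 1)  # slot 0 is not used
    for sale in sales:
        daily_sales[sale.day] = daily_sales[sale.day] + sale.amount
    return daily_sales


sales = [Sale(1, 100.0), Sale(15, 40.5), Sale(1, 25.0)]
totals = total_sales_by_day(sales)
```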

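Direct Hashing of Employee Numbers
Address   Key   Name
[000]     (Not used)
[001]     001   Harry Lee
[002]     002   Sarah Trapp
[005]     005   Vu Nguymen
[007]     007   Ray Black
[100]     100   John Adams
The keys 005, 100, and 002 hash directly to addresses [005], [100], and [002]. The other addresses shown in the figure ([003], [004], [006], [008], [009], [099]) are empty.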

Hashing Methods: Subtraction Method
Direct and subtraction hash functions guarantee a search effort of one, with no collisions. They are one-to-one hashing methods: only one key hashes to each address.
Example: a company has 100 employees, with employee numbers from 1001 to 1100. The hashing function subtracts 1000 from the key to determine the address.
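The employee-number example reduces to a one-line function. The sketch below uses the numbers from the example; the function name and the range check are assumptions.

```python
OFFSET = 1000  # fixed number subtracted from every key


def subtraction_hash(employee_number):
    """Map employee numbers 1001..1100 to addresses 1..100."""
    if not 1001 <= employee_number <= 1100:
        raise ValueError("employee number out of range")
    return employee_number - OFFSET
```

Because the mapping is one-to-one, no two employee numbers in the range share an address.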

ANNEXURE-III Rapid Eye Movement Exercise
Imagine a huge clock in front of you. Direct your focus to 12 o'clock, then slowly move your eyes clockwise around the clock without stopping at the reference hours. Do three complete rotations. Focus straight ahead at the horizon and note any changes in feelings and body sensations. Repeat as many times as you feel necessary for any inner pacing.

Optical Illusion: What do you see in the image below?

Music Word search

Branches and Descriptions: Anthology, Pomology, Batrachology, Petrology, Odontology

Modulo Division Method
Divide the key by the array size and use the remainder as the address. The algorithm works with any list size, but a list size that is a prime number produces fewer collisions.
address = key MODULO listSize

Modulo Division Hashing
Key      Name           Address
379452   Mary Dodd      [000]
121267   Bryan Devaux   [002]
378845   Patrick Linn   [007]
160252   Tuan Ngo       [305]
045128   Feldman        [306]
The figure shows a table with addresses [000] through [306], i.e. a list size of 307. Each address above is the key modulo 307.
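The five keys in the figure can be checked with a short sketch. The list size of 307 is inferred from the address range [000] to [306]; the function name is an assumption.

```python
LIST_SIZE = 307  # prime list size, inferred from addresses [000]..[306]


def modulo_hash(key, list_size=LIST_SIZE):
    """address = key MODULO listSize"""
    return key % list_size


keys = [379452, 121267, 378845, 160252, 45128]  # 045128 written without the leading zero
addresses = [modulo_hash(k) for k in keys]
# addresses == [0, 2, 7, 305, 306]
```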

Digit Extraction Method
Selected digits are extracted from the key and used as the address.
Example: a six-digit employee number is hashed to a three-digit address (000-999) by selecting the first, third, and fourth digits:
379452 -> 394
121267 -> 112
378845 -> 388
160252 -> 102
045128 -> 051
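A minimal sketch of the extraction step, treating the key as a zero-padded six-digit string. The function name and parameters are assumptions; positions are counted from 1, as in the example.

```python
def digit_extraction_hash(key, positions=(1, 3, 4), width=6):
    """Build the address from the digits at the given 1-based positions."""
    digits = str(key).zfill(width)  # keep leading zeros, e.g. 45128 -> "045128"
    return int("".join(digits[p - 1] for p in positions))


# digit_extraction_hash(379452) == 394
# digit_extraction_hash(45128) == 51, i.e. address 051
```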

Midsquare Method
The key is squared and the address is selected from the middle of the squared number.
Example: given a key of 9452, the midsquare address calculation using a four-digit address (0000-9999) is
9452^2 = 89340304: address is 3403
For six-digit keys, select the first three digits and then apply the midsquare method:
379452: 379^2 = 143641 -> 364
121267: 121^2 = 014641 -> 464
378845: 378^2 = 142884 -> 288
160252: 160^2 = 025600 -> 560
045128: 045^2 = 002025 -> 202
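The calculations above can be reproduced with the sketch below. The function name and parameters are assumptions. The square is zero-padded to twice the key width, and when the middle cannot be centred exactly (three digits out of six) the slice starts at the third digit, which is the choice that matches the slide's results.

```python
def midsquare_hash(key, key_digits, address_digits):
    """Square the key and take address_digits digits from the middle."""
    squared = str(key * key).zfill(2 * key_digits)  # e.g. 121^2 -> "014641"
    start = (len(squared) - address_digits + 1) // 2
    return int(squared[start:start + address_digits])


# Four-digit key, four-digit address: 9452^2 = 89340304 -> 3403
four_digit = midsquare_hash(9452, key_digits=4, address_digits=4)

# Six-digit key: keep the first three digits, then midsquare to three digits.
six_digit = midsquare_hash(379452 // 1000, key_digits=3, address_digits=3)
```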

Folding Methods
There are two folding methods: fold shift and fold boundary.
In fold shift, the key value is divided into parts whose size matches the size of the required address. The left and right parts are then shifted and added to the middle part.
In fold boundary, the left and right numbers are folded on a fixed boundary between them and the center number, so their digits are reversed before they are added.

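Hash Fold Examples
Key: 123456789, divided into 123 | 456 | 789
Fold shift: 123 + 456 + 789 = 1368; the leading 1 is discarded, giving address 368
Fold boundary: 321 + 456 + 987 = 1764 (digits of the left and right parts reversed); the leading 1 is discarded, giving address 764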
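Both folding variants can be sketched for the key 123456789. The function names are assumptions, and the sketch assumes the key's length is a multiple of the part size (so the first and last parts are distinct). Discarding the carry digit is done with a modulo operation.

```python
def split_parts(key, part_size):
    """Cut the key's digits into consecutive parts of part_size digits."""
    digits = str(key)
    return [digits[i:i + part_size] for i in range(0, len(digits), part_size)]


def fold_shift(key, part_size=3):
    total = sum(int(part) for part in split_parts(key, part_size))
    return total % 10 ** part_size  # discard any carry beyond part_size digits


def fold_boundary(key, part_size=3):
    parts = split_parts(key, part_size)
    parts[0] = parts[0][::-1]    # reverse the digits of the left part
    parts[-1] = parts[-1][::-1]  # reverse the digits of the right part
    total = sum(int(part) for part in parts)
    return total % 10 ** part_size


# fold_shift(123456789) == 368
# fold_boundary(123456789) == 764
```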

Rotation Method
Rotation is used in combination with folding and pseudorandom hashing. It helps when hashing keys are identical except for the last character: rotating the last character to the front of the key minimizes the effect of the shared prefix.
Example: consider six-digit employee numbers used in a large company.
Original key   Rotated key
600101         160010
600102         260010
600103         360010
600104         460010
600105         560010
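The rotation step on its own is a small string operation, sketched here for the six-digit keys in the example. The function name and the fixed width of 6 are assumptions.

```python
def rotate_key(key, width=6):
    """Move the last digit of the key to the front."""
    digits = str(key).zfill(width)
    return int(digits[-1] + digits[:-1])


rotated = [rotate_key(k) for k in (600101, 600102, 600103, 600104, 600105)]
# rotated == [160010, 260010, 360010, 460010, 560010]
```

The rotated keys now differ in their leading digit, so a later folding or pseudorandom step spreads them more widely.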

Pseudorandom Hashing
The key is used as the seed of a pseudorandom number generator, and the resulting random number is scaled into the possible address range using modulo division. A pseudorandom number generator always generates the same series of numbers for a given seed. A common random number generator is y = ax + c.
Example: assume a = 17, c = 7, x = 121267, and a prime list size of 307.
y = ((17 * 121267) + 7) modulo 307
y = (2061539 + 7) modulo 307
y = 2061546 modulo 307
y = 41
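The worked example translates directly into code. The function name is an assumption; the constants are the ones from the example.

```python
def pseudorandom_hash(key, a=17, c=7, list_size=307):
    """y = (a * x + c) modulo list_size, with the key x used as the seed."""
    return (a * key + c) % list_size


address = pseudorandom_hash(121267)
# address == 41
```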

Hashing Algorithm
Algorithm hash (key, size, maxAddr, addr)
    set looper to 0
    set addr to 0
    for each character in key
        if (character not space)
            add character to addr
            rotate addr 12 bits right
        end if
    end loop
    if (addr < 0)
        addr = absolute (addr)
    end if
    addr = addr modulo maxAddr
end hash
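A runnable version of this algorithm might look like the sketch below. The slides do not state a word size, so a 32-bit signed word is assumed here; the helper `rotate_right_32` and the function name `hash_key` are also assumptions. Python integers never overflow, so the 32-bit behaviour is simulated with masks.

```python
WORD_MASK = 0xFFFFFFFF  # assumed 32-bit word


def rotate_right_32(value, bits):
    """Rotate a 32-bit value right by the given number of bits."""
    value &= WORD_MASK
    return ((value >> bits) | (value << (32 - bits))) & WORD_MASK


def hash_key(key, max_addr):
    addr = 0
    for ch in key:
        if ch != " ":
            addr = (addr + ord(ch)) & WORD_MASK  # add character to addr
            addr = rotate_right_32(addr, 12)
    # If the word would be negative as a signed 32-bit value, take its absolute value.
    if addr >= 0x80000000:
        addr = 0x100000000 - addr
    return addr % max_addr
```

For example, `hash_key("Mary Dodd", 307)` returns an address between 0 and 306, and the space in the name has no effect on the result.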

ANNEXURE-IV Mind Map
Hashing: General idea, Types, Hash function, Collision resolution, Hashing methods, Hashing algorithm

ANNEXURE-V Summary
In a hashed search, an algorithmic function applied to the key determines the location of the data.
Hashing functions include direct, subtraction, modulo division, digit extraction, midsquare, folding, rotation, and pseudorandom generation.
In direct hashing, the key is the address, without algorithmic manipulation.
In subtraction hashing, the key is transformed into an address by subtracting a fixed number from it.
In modulo division hashing, the key is divided by the list size and the remainder is used as the address.
In digit extraction hashing, selected digits are extracted from the key and used as the address.

Summary
In midsquare hashing, the key is squared and the address is selected from the middle of the result.
In fold shift hashing, the key is divided into parts whose size matches the size of the required address. The parts are then added to obtain the address.
In fold boundary hashing, the key is divided into parts whose size matches the size of the required address. The left and right parts are then reversed and added to the middle part to obtain the address.
In rotation hashing, the far-right digit of the key is rotated to the left to determine the address.
In pseudorandom generation hashing, the key is used as a seed to generate a pseudorandom number. The result is scaled to obtain the address.