

HASHING
Hashing is an alternative search technique (earlier we covered binary search trees). Motivation: try to access each possible key directly. Suggestion: enumerate the possible keys. This imposes an ordering on them.

Example: key space of 3 binary digits, so N = 2³ = 8 possible keys. [Figure: a table with cells numbered 1 to 8, one cell per possible key.]

Problem: in real life this is not possible.
1. Names: ~20 characters long, needs 26²⁰ entries (26 letters).
2. Decimal numbers: 20 digits, needs 10²⁰ entries (0,..,9: 10 symbols).
3. Binary numbers: 20 positions, needs 2²⁰ entries.
=> Too many => we cannot consider all of them.

We cannot create vectors as long as 2²⁰ or 26²⁰! But we do not need such long vectors, because not all combinations occur in practice. For example: a vocabulary contains maybe 100,000 words. Personal names (Memphis white pages): 300 pages × 300 entries ≈ 100,000, or perhaps ½ million, which is far fewer than 26²⁰.

Idea: assume that approximately N keys actually occur. Define a vector of length ~2N. Assign integers from [1,..,2N] to the keys. Look up each key according to its serial number (integer).

A dynamical-systems perspective: hashing means chopping up, granulating, coarse-graining information. This is done continuously, autonomously, and reliably in biological systems (worms, ants, humans alike).

The major challenge of life: How can delicately defined living substances exist in an infinitely complex world? How can animals survive and succeed? How do they separate the important from the useless? => There are mechanisms that perform 'hashing' very efficiently and promptly. => This course is far from that, but it indicates a few main principles.

Major issues in hashing:
1. How to assign hash codes to keys? By using a hash function.
2. Sometimes the hash function gives the same number to different keys (a collision). We have to resolve this.

[Figure: hashing maps the complete key space K (e.g. 26²⁰ elements) into the hash table L (2N cells).]

Example: dates: 1055, 1492, 1776, 1812, 1918, 1945. Q: What is the complete key space? Hash function: HashCode(x) = 5x mod 8.

hash code   keys
0           1776
1
2
3           1055
4           1492, 1812
5           1945
6           1918
7
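The hash codes in this example can be checked with a few lines of Python. This is only a sketch: the function name `hash_code` and its default parameters `a=5`, `h=8` are chosen here for illustration and simply restate HashCode(x) = 5x mod 8.

```python
def hash_code(x, a=5, h=8):
    """Hash function from the example: multiply by a, take the remainder mod h."""
    return (a * x) % h


dates = [1055, 1492, 1776, 1812, 1918, 1945]

# Map every key to its hash code.
codes = {x: hash_code(x) for x in dates}

for x, code in codes.items():
    print(x, "->", code)
```

Two of the six dates (1492 and 1812) receive the same hash code, so even this small example already contains a collision.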

Evaluation of closed-address (chained) hashing. Cost of a search:
1. Compute the hash code i: costs a.
2. Search through the linked list H[i].
[Figure: hashing maps the keys to linked lists H[1], …, H[h] of lengths L1, L2, L3, …, Lh.]
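The two search steps can be sketched as follows. This is a minimal illustration, not a reference implementation: the class name `ChainedHashTable` is hypothetical, Python lists stand in for the linked lists, and the hash function 5x mod 8 is borrowed from the dates example.

```python
class ChainedHashTable:
    """Closed-address (chained) hash table; Python lists stand in for linked lists."""

    def __init__(self, h=8, a=5):
        self.h = h                               # number of cells
        self.a = a                               # multiplier of the hash function
        self.chains = [[] for _ in range(h)]     # one chain per cell

    def hash_code(self, key):
        return (self.a * key) % self.h

    def insert(self, key):
        chain = self.chains[self.hash_code(key)]
        if key not in chain:                     # ignore duplicates
            chain.append(key)

    def search(self, key):
        """Return (found, comparisons): step 1 hashes, step 2 scans one chain."""
        chain = self.chains[self.hash_code(key)]
        for comparisons, stored in enumerate(chain, start=1):
            if stored == key:
                return True, comparisons
        return False, len(chain)


table = ChainedHashTable()
for year in [1055, 1492, 1776, 1812, 1918, 1945]:
    table.insert(year)
```

Searching for 1812 inspects only the chain of cell 4 and finds the key on the second comparison; the other cells are never touched.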

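Average total cost of a successful search for a key k:
$T(n) = a + \frac{1}{n} \sum_{i=0}^{h-1} \frac{L_i (L_i + 1)}{2}$, where L_i is the length of the linked list for hash code i.
Worst case (bad distribution): all keys are in the same bucket. A search needs about n/2 comparisons on average, the same as searching an unordered array.
Better (good distribution): keys are distributed equally among the cells. With the load factor α = n/h held constant, a search takes O(1) comparisons on average.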
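The comparison part of this cost can be worked out numerically. The sketch below assumes that a key stored at position j of its list costs j comparisons, so a list of length L contributes 1 + 2 + … + L = L(L+1)/2 comparisons in total. The function name is hypothetical, and the constant hashing cost a is left out.

```python
def average_search_comparisons(chain_lengths):
    """Average comparisons for a successful search, given the length of each chain."""
    n = sum(chain_lengths)
    if n == 0:
        raise ValueError("the table holds no keys")
    # A chain of length L costs 1 + 2 + ... + L = L(L+1)/2 comparisons in total.
    return sum(L * (L + 1) / 2 for L in chain_lengths) / n


worst = average_search_comparisons([6, 0, 0, 0, 0, 0, 0, 0])   # all keys in one bucket
dates = average_search_comparisons([1, 0, 0, 1, 2, 1, 1, 0])   # chain lengths for 5x mod 8
even = average_search_comparisons([1, 1, 1, 1, 1, 1, 0, 0])    # no collisions at all
```

With 6 keys in 8 cells, the single-bucket case averages (6+1)/2 = 3.5 comparisons, while the collision-free case needs exactly 1.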

Hashing evaluation continued: for a uniform distribution, performance is very good. But this requires a hash function that gives a uniform distribution regardless of the structure of the actual data. Randomization: use a computer pseudo-random generator, e.g. the multiplicative congruential method.

HashCode(K) = (aK) mod h. Strategy: multiply by a constant a and take the modulus (i.e. the remainder after division by h).

Open-address hashing: this is truly dynamic. Keys are stored in the table cells themselves; collisions are not resolved with linked lists (as they were in closed-address hashing). Load factor: α = n/h < 1 (if α ≥ 0.5, double the array). If there is a collision: rehashing.
1. Linear probing. Simple: Rehash(j) = (j+1) mod h, where j is the most recently probed location. Start with j = i and continue until an empty cell is found (e.g. 6.10).
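A minimal sketch of insertion with linear probing follows. It reuses the hash function 5x mod 8 and the dates from the earlier example as an assumption; the function name is hypothetical, and array doubling is left out to keep the probing step visible.

```python
def linear_probe_insert(table, key, a=5):
    """Insert key into an open-address table (a list with None for empty cells).

    Returns the index where the key ends up.
    """
    h = len(table)
    j = (a * key) % h              # start at the hash code i
    for _ in range(h):             # at most h probes
        if table[j] is None or table[j] == key:
            table[j] = key
            return j
        j = (j + 1) % h            # Rehash(j) = (j + 1) mod h
    raise RuntimeError("table is full")


table = [None] * 8
slots = [linear_probe_insert(table, year)
         for year in [1055, 1492, 1776, 1812, 1918, 1945]]
```

Here 1812 collides with 1492 in cell 4 and moves to cell 5. Later 1945 hashes to cell 5, finds cells 5 and 6 occupied, and lands in cell 7. This growth of occupied runs is one reason for keeping the load factor low.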

Rehashing:
2. Double hashing: Rehash(j, d) = (j+d) mod h. Here d is the increment of rehashing. If d = 1, this is linear probing; in general d is determined separately for each key (e.g. by a second hash function).
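Double hashing can be sketched in the same style. The slides do not say how d is computed, so the second hash function `increment` below is purely an assumption for illustration: it always returns an odd number, which guarantees that every cell is eventually probed when h is a power of two. The primary hash function 5x mod 8 is again taken from the dates example.

```python
def increment(key, h):
    """Hypothetical second hash function: an odd step d in [1, h-1] (h must be even)."""
    return 2 * (key % (h // 2)) + 1


def double_hash_insert(table, key, a=5):
    """Insert key into an open-address table using Rehash(j, d) = (j + d) mod h."""
    h = len(table)
    j = (a * key) % h              # primary hash code
    d = increment(key, h)          # step size, determined separately per key
    for _ in range(h):
        if table[j] is None or table[j] == key:
            table[j] = key
            return j
        j = (j + d) % h            # Rehash(j, d)
    raise RuntimeError("table is full")


table = [None] * 8
slots = [double_hash_insert(table, year)
         for year in [1055, 1492, 1776, 1812, 1918, 1945]]
```

For 1812 the step happens to be d = 1, so it behaves exactly like linear probing. For 1945 the step is d = 3, so the probe sequence 5, 0, 3, 6, 1 jumps around the table instead of walking through neighbouring cells.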
