Preliminaries
Advantages
–Hash tables can insert(), remove(), and find() with average complexity close to O(1).
–Relatively easy to program.
Disadvantages
–There is no convenient way to traverse a hash table in key order.
–At least double the memory is required.
–If the hash table becomes too full (load factor > 50%), the insert(), remove(), and find() operations degrade toward O(N).
–Careful design must be given to how the hash key is computed.

Design of Hash Keys
A hash table is a collection of elements that performs lookups using an appropriately selected hash function.
Definition of a Hash Function
–A function that, when applied to a key value, computes a hash key used as an index to locate the data element.
Design Issue: How do we Choose Hash Functions?
–Goal: The hash function must compute values that appear random and span the entire hash table, while always returning the same value for the same key.
–Goal: The hash function must be quick to calculate.
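
To make the two goals concrete, here is a minimal sketch of a string hash function. The name hash_key and the multiplier 31 are illustrative assumptions, not something the slides specify; any small odd multiplier would serve for the demonstration.

```python
def hash_key(key: str, table_size: int) -> int:
    """Map a string key to an index in the range [0, table_size)."""
    h = 0
    for ch in key:
        # Multiply-and-add hashing: every character influences the result,
        # and the running value is kept small by reducing modulo table_size.
        # The multiplier 31 is an assumed, illustrative choice.
        h = (h * 31 + ord(ch)) % table_size
    return h


if __name__ == "__main__":
    for word in ["ab", "ba", "hash", "table"]:
        print(word, hash_key(word, 11))
```

The function does one multiply and one add per character, so it is quick to compute, and because character order matters, "ab" and "ba" do not automatically share an index.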

Additional Design Considerations
What if the hash function produces the same index for different keys (a collision)?
–Open Addressing: probe index (h1(key) + h2(key,tries)) % tableSize
 Examples: linear probing, secondary probing, quadratic probing, double hashing
–Separate Chaining
How big should the hash table be?
–Answer: At least twice as big as the number of elements the table is to store.
–Answer: A prime length.
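
The two sizing answers can be combined into one rule: take the smallest prime that is at least twice the expected element count. The sketch below assumes that reading; the function names choose_table_size and is_prime are hypothetical.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the table sizes used in examples."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def choose_table_size(expected_items: int) -> int:
    """Smallest prime that is at least twice the expected number of items."""
    size = max(2, 2 * expected_items)
    while not is_prime(size):
        size += 1
    return size


if __name__ == "__main__":
    print(choose_table_size(50))  # 101
```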

Collision Resolution
Open Addressing
Linear Probing (h2(key,tries) = tries)
–Characteristics: primary clustering, deletions are difficult
Secondary Probing (h2(key,tries) = constant*tries)
–Characteristics: primary clustering, deletions are difficult
Quadratic Probing (h2(key,tries) = tries^2)
–Characteristics: secondary clustering (same collision resolution pattern for all keys that hash to the same index), incomplete use of the hash table, deletions are difficult
Double Hashing (h2(key,tries) = secondHash(key)*tries)
–Characteristics: largely eliminates clustering, deletions are difficult
Clustering: the tendency for sections of the table to fill up, so that newly inserted keys are increasingly likely to land in these areas.
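
The four probing schemes differ only in h2, so they can be compared with one helper that evaluates (h1 + h2(tries)) % tableSize. This is a sketch under stated assumptions: integer keys with h1(key) = key % tableSize, a constant of 3 for secondary probing, and 1 + key % (tableSize - 1) as the second hash. None of these specifics come from the slides.

```python
def probe_indices(h1, table_size, h2, tries):
    """Slots visited on the first `tries` attempts: (h1 + h2(t)) % table_size."""
    return [(h1 + h2(t)) % table_size for t in range(tries)]


def linear(t):
    return t


def secondary(t, constant=3):  # the constant 3 is an assumed example value
    return constant * t


def quadratic(t):
    return t * t


def make_double_hash(key, table_size):
    # Assumed second hash: never zero, so the probe always moves.
    step = 1 + key % (table_size - 1)
    return lambda t: step * t


if __name__ == "__main__":
    table_size, key = 11, 25
    h1 = key % table_size  # home slot 3
    print(probe_indices(h1, table_size, linear, 5))     # [3, 4, 5, 6, 7]
    print(probe_indices(h1, table_size, secondary, 5))  # [3, 6, 9, 1, 4]
    print(probe_indices(h1, table_size, quadratic, 5))  # [3, 4, 7, 1, 8]
    print(probe_indices(h1, table_size, make_double_hash(key, table_size), 5))  # [3, 9, 4, 10, 5]
```

Running the quadratic sequence for all 11 tries reaches only 6 distinct slots, which illustrates the "incomplete use of the hash table" noted above, whereas the linear sequence reaches all 11.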

Separate Chaining
Compute the hash key.
If a collision occurs, insert the key at the front of the chain (linked list) for that index.
Advantages
–Hash table grows as needed
–Performance is less sensitive to a full hash table
–Deletion is easy
–No clustering
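
A minimal sketch of separate chaining follows. It makes two simplifying assumptions: a Python list stands in for each linked-list chain, and Python's built-in hash() stands in for the hash function. The class name ChainedHashTable is hypothetical.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by chaining entries per slot."""

    def __init__(self, table_size=11):
        # One (initially empty) chain per table slot.
        self.buckets = [[] for _ in range(table_size)]

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def insert(self, key, value):
        chain = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:  # key already present: overwrite its value
                chain[i] = (key, value)
                return
        chain.insert(0, (key, value))  # new key goes at the front of the chain

    def find(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None

    def remove(self, key):
        chain = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]  # no tombstones needed, unlike open addressing
                return True
        return False


if __name__ == "__main__":
    table = ChainedHashTable()
    table.insert(1, "a")
    table.insert(12, "b")  # 12 % 11 == 1, so this collides with key 1
    print(table.buckets[1])  # [(12, 'b'), (1, 'a')]
```

Removal simply unlinks the entry from its chain, which is why deletion is listed as easy here and difficult for the open addressing schemes.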

Performance
There are charts in the text describing the performance of
–Linear, quadratic, and double hash probing
–Open addressing versus separate chaining
Let F = load factor (fraction of the table that is full).
–Probability of one collision = F
–Probability of two collisions = F^2
–Expected collisions: E = F + 2*F^2 + 3*F^3 + … = sum_{i=0}^{inf} i*F^i = F/(1-F)^2
–If F = 0.5, E = 1/2 + 2*(1/4) + 3*(1/8) + … = 1/2 + 1/2 + 3/8 + 4/16 + … = (1/2)/(1-1/2)^2 = 2
–If F = 0.75, E = F/(1-F)^2 = (3/4)/(1/16) = 12
–If F = 0.9, E = F/(1-F)^2 = (9/10)/(1/100) = 90
Hash tables are often used for file system folders. They also complement B-trees in databases: the B-tree supports sequential processing and the hash table supports rapid searching.
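
The closed form F/(1-F)^2 can be checked numerically against a partial sum of the series. The sketch below uses the slide's simplified collision model as given; the function names are hypothetical.

```python
def expected_collisions(load_factor: float) -> float:
    """Closed form E = F / (1 - F)^2 for a load factor 0 <= F < 1."""
    if not 0 <= load_factor < 1:
        raise ValueError("load factor must satisfy 0 <= F < 1")
    return load_factor / (1 - load_factor) ** 2


def partial_series(load_factor: float, terms: int) -> float:
    """Sum of i * F^i for i = 1 .. terms; approaches the closed form."""
    return sum(i * load_factor ** i for i in range(1, terms + 1))


if __name__ == "__main__":
    for f in (0.5, 0.75, 0.9):
        print(f, expected_collisions(f), partial_series(f, 2000))
```

The expected count rises from 2 at half full to 90 at 90% full, which is the reasoning behind keeping the load factor at or below about 50%.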