Hashing & Hash Tables. Sets/Dictionaries Set - Our best efforts to date:


Hashing & Hash Tables

Sets/Dictionaries Set - Our best efforts to date:

Easy Set Fast way to represent set if 0-9 only possible values:

Easy Set Fast way to represent set if 0-9 only possible values: Could apply to letters A-J via mapping char → int
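
The slide's table is not preserved in this transcript, so the following is only a minimal sketch of the idea, assuming one boolean flag per possible value. The names `DigitSet` and `letterToIndex` are hypothetical and do not come from the slides.

```cpp
#include <array>
#include <cstddef>

// Direct-address set for the values 0-9: one flag per possible value.
class DigitSet {
public:
    void insert(int v) { if (inRange(v)) present_[index(v)] = true; }
    void erase(int v) { if (inRange(v)) present_[index(v)] = false; }
    bool contains(int v) const { return inRange(v) && present_[index(v)]; }

private:
    static bool inRange(int v) { return v >= 0 && v < 10; }
    static std::size_t index(int v) { return static_cast<std::size_t>(v); }
    std::array<bool, 10> present_{};  // value-initialized: every flag starts false
};

// Hypothetical char -> int mapping for the letters 'A'..'J' (gives 0..9).
inline int letterToIndex(char c) { return c - 'A'; }
```

Insert, erase, and lookup each touch exactly one array cell, which is why this representation is fast; the cost is one cell per possible value, whether or not it is in the set.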

Easy Set Fast way to represent set if 0-9 only possible values: How could we apply same strategy to all English words? AaAbAcAdAeAfAg… ???

Hashing Hash function: maps data onto a fixed-size value

Cryptographic Hashing Desirable traits: – Output is fixed size – Easy to compute – Output varies wildly with small input change – One way

Hash Table Hash Table : – Use hash function to map values into array indexes – Constant time to find index and check

Hash Table Hash Functions Desirable qualities – Return number 0…(tablesize – 1): map values into array indexes – Efficiently computable: constant time to find index – Evenly distribute keys over table

Hash Table Functions Desirable qualities – Return number 0…(tablesize – 1) – Efficiently computable – Evenly distribute keys over table: don't waste space (mapping is onto, every index has 1+ keys) and minimize collisions

Hash Table Functions Split roles – hash function vs mapping to table: – Hash Function: Evenly distribute keys over space (unsigned ints) – Table mapping: Hash function's result % table size = index
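
The two roles can be sketched as two separate functions. This is an illustration only: it uses the standard library's `std::hash` as a stand-in hash function, and `hashKey` and `tableIndex` are hypothetical names.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Role 1, hash function: spread keys over the whole unsigned range.
// std::hash is used here as a stand-in; the slides do not name one.
inline std::size_t hashKey(const std::string& key) {
    return std::hash<std::string>{}(key);
}

// Role 2, table mapping: reduce the hash value to a valid index.
// Assumes tableSize > 0.
inline std::size_t tableIndex(std::size_t hashValue, std::size_t tableSize) {
    return hashValue % tableSize;
}
```

Keeping the roles apart means the table can be resized without touching the hash function: only the `% tableSize` step changes.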

Optimal Hash Functions If all keys and table size known, can compute optimal hash… – Rarely the case

Hash Function - Integral For integral types: – Hash(x) = x – Table size should be prime

Hash Function - Integral For integral types: – Hash(x) = x – Table size should be prime Keys often have a pattern – if it is not relatively prime to the table size, you get patterns: 0, 10, 20; 2, 12, 22; 4, 14, 24; 6, 16, 26; 8, 18, 28
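
The effect can be checked with a short experiment. The sketch below assumes the listed keys are the even numbers 0 to 28 and that the patterned table has size 10 (the slide does not state the size, but 0, 10, and 20 only share a slot when it is 10). `slotsUsed` is a hypothetical helper.

```cpp
#include <cstddef>
#include <set>

// Counts how many distinct slots are used when the keys
// 0, step, 2*step, ... (keyCount of them) are stored with Hash(x) = x.
// Assumes tableSize > 0.
inline std::size_t slotsUsed(unsigned step, unsigned keyCount, unsigned tableSize) {
    std::set<unsigned> used;
    for (unsigned i = 0; i < keyCount; ++i) {
        used.insert((i * step) % tableSize);
    }
    return used.size();
}
```

With the 15 even keys 0 to 28, `slotsUsed(2, 15, 10)` reports 5 slots in use, while `slotsUsed(2, 15, 11)` reports all 11: a prime table size shares no factor with the key stride, so the keys spread over every slot.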

Hash Function - String String approach 1 – add up characters: for (i = 0; i < key.length(); i++) hashVal += key[i]; Problem 1: What if TableSize is 10,000 and all keys are 8 or fewer characters long? Problem 2: What if keys often contain the same characters (“abc”, “bca”, etc.)?
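
Both problems are easy to reproduce. The following is a runnable version of the loop above, wrapped in a hypothetical function `additiveHash` and assuming ASCII keys.

```cpp
#include <cstddef>
#include <string>

// Approach 1: sum of the character codes.
inline unsigned additiveHash(const std::string& key) {
    unsigned hashVal = 0;
    for (std::size_t i = 0; i < key.length(); ++i) {
        hashVal += static_cast<unsigned char>(key[i]);
    }
    return hashVal;
}
```

Problem 1: an 8-character ASCII key sums to at most 8 * 127 = 1016, so most of a 10,000-slot table can never be reached. Problem 2: addition ignores order, so "abc" and "bca" both hash to 97 + 98 + 99 = 294.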

Hash Function - String String approach 2 – multiply each character by different powers of some number: – "apple" : 'a' * 31^4 + 'p' * 31^3 + 'p' * 31^2 + 'l' * 31^1 + 'e' * 31^0

Hash Function - String String approach 2 – multiply each character by different powers of some number: – "apple" : 'a' * 31^4 + 'p' * 31^3 + 'p' * 31^2 + 'l' * 31^1 + 'e' * 31^0 Efficiently do via bit shifting: for (i = 0; i < key.length(); i++) hashVal = (hashVal << 6) ^ key[i]; Here << 6 multiplies by 64 (so the base is 64 rather than 31) and ^ is binary XOR, standing in for addition.
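
Both variants can be written as short loops. This is a sketch under stated assumptions: `polyHash` evaluates the base-31 polynomial with Horner's rule (a standard way to avoid computing powers, not something the slides show), `shiftXorHash` is the shift-and-XOR loop from the slide, and both names are hypothetical. Arithmetic wraps around at 32 bits.

```cpp
#include <cstdint>
#include <string>

// Base-31 polynomial via Horner's rule:
// (((('a' * 31 + 'p') * 31 + 'p') * 31 + 'l') * 31 + 'e')
inline std::uint32_t polyHash(const std::string& key) {
    std::uint32_t hashVal = 0;
    for (unsigned char c : key) {
        hashVal = hashVal * 31u + c;
    }
    return hashVal;
}

// Shift-and-XOR variant: << 6 multiplies by 64, ^ mixes in the next character.
inline std::uint32_t shiftXorHash(const std::string& key) {
    std::uint32_t hashVal = 0;
    for (unsigned char c : key) {
        hashVal = (hashVal << 6) ^ c;
    }
    return hashVal;
}
```

Unlike the additive hash, both functions depend on character order, so "ab" and "ba" get different values. The shift-and-XOR form is not exactly a base-64 polynomial: letters need 7 bits, so each character overlaps the shifted value by one bit and XOR behaves differently from addition there.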

Collisions Collision : two keys map to same index: – 12 and

Probing Linear Probing: value goes in next available slot

Probing Linear Probing: value goes in next available slot Issue: – No longer constant access
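
A minimal sketch of insertion with linear probing, assuming non-negative integer keys, Hash(x) = x, and a sentinel value for empty slots. `EMPTY_SLOT` and `probeInsert` are hypothetical names; the slides' diagrams are not preserved here.

```cpp
#include <cstddef>
#include <vector>

const int EMPTY_SLOT = -1;  // assumption: real keys are never negative

// Inserts key using linear probing and returns the slot it ends up in,
// or -1 if the table is full (or has no slots at all).
inline int probeInsert(std::vector<int>& table, int key) {
    const std::size_t n = table.size();
    if (n == 0) return -1;
    const std::size_t home = static_cast<std::size_t>(key) % n;
    for (std::size_t step = 0; step < n; ++step) {
        const std::size_t slot = (home + step) % n;  // wrap around at the end
        if (table[slot] == EMPTY_SLOT || table[slot] == key) {
            table[slot] = key;
            return static_cast<int>(slot);
        }
    }
    return -1;  // every slot was occupied by some other key
}
```

In a 10-slot table, 12 lands in slot 2, 22 collides and moves to slot 3, and 32 needs three probes to reach slot 4. The number of probes now depends on how full the neighborhood is, which is why access is no longer constant time.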

Load Factor Load factor = number of stored keys / table size. Must be < 1 for linear probing. Performance drops rapidly past 0.5.
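
To put rough numbers on the drop-off, the sketch below uses Knuth's classical approximations for linear probing. These formulas are not in the slides and assume a hash function that spreads keys uniformly; `probesForHit` and `probesForMiss` are hypothetical names.

```cpp
// Approximate expected number of probes for linear probing at load
// factor a, where 0 <= a < 1 (Knuth's approximations).

// Successful search: (1 + 1/(1 - a)) / 2
inline double probesForHit(double a) {
    return 0.5 * (1.0 + 1.0 / (1.0 - a));
}

// Unsuccessful search or insertion: (1 + 1/(1 - a)^2) / 2
inline double probesForMiss(double a) {
    return 0.5 * (1.0 + 1.0 / ((1.0 - a) * (1.0 - a)));
}
```

At a load factor of 0.5 these give about 1.5 and 2.5 probes; at 0.9 they give about 5.5 and 50.5, which is the rapid degradation the slide warns about.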

Clustering Say we go to put in 3: Now 2-5 are blocked – Anything 2-6 will fill

Finding Probing used again to find keys: Find 32 – yep, it's there

Finding Probing used again to find keys: Find 42 – nope, reached an empty slot, so it must not be there
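
Lookup follows the same probe sequence as insertion and stops at the first empty slot. A minimal sketch, assuming non-negative keys, Hash(x) = x, and a sentinel for empty slots; `NO_KEY` and `probeContains` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

const int NO_KEY = -1;  // assumption: marks an empty slot; real keys are non-negative

// Returns true if key is stored in a table that was filled by linear probing.
inline bool probeContains(const std::vector<int>& table, int key) {
    const std::size_t n = table.size();
    if (n == 0) return false;
    const std::size_t home = static_cast<std::size_t>(key) % n;
    for (std::size_t step = 0; step < n; ++step) {
        const int stored = table[(home + step) % n];
        if (stored == key) return true;      // found it
        if (stored == NO_KEY) return false;  // an empty slot ends the probe
    }
    return false;  // scanned the whole table without finding the key
}
```

With 12, 22, and 32 in slots 2 to 4 of a 10-slot table, looking for 32 walks slots 2, 3, 4 and succeeds; looking for 42 walks slots 2 to 5 and stops at the empty slot 5.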

Deletion Say we delete 22: Find 32…

Deletion Say we delete 22: Find 32… not there!

Tombstone Special value indicating something was there – Search knows to continue – Insertion can use that slot, but needs to continue the search to avoid inserting a duplicate
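
The three rules can be sketched as a small table class. This is an illustration under assumptions, not the slides' implementation: integer keys, Hash(x) = x, linear probing, and a per-slot state instead of a special key value. `TombstoneTable` and its members are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

class TombstoneTable {
public:
    explicit TombstoneTable(std::size_t size) : slots_(size) {}

    bool contains(int key) const { return findSlot(key) != npos; }

    // Returns false if the key is already present or no slot is free.
    bool insert(int key) {
        const std::size_t n = slots_.size();
        if (n == 0) return false;
        std::size_t reuse = npos;  // first empty or tombstone slot seen
        const std::size_t home = homeSlot(key);
        for (std::size_t step = 0; step < n; ++step) {
            const std::size_t i = (home + step) % n;
            if (slots_[i].state == State::Full) {
                if (slots_[i].key == key) return false;  // duplicate found
            } else {
                if (reuse == npos) reuse = i;
                // Keep going past tombstones; only an empty slot proves
                // the key cannot appear later in the probe sequence.
                if (slots_[i].state == State::Empty) break;
            }
        }
        if (reuse == npos) return false;
        slots_[reuse].key = key;
        slots_[reuse].state = State::Full;
        return true;
    }

    // Marks the slot as a tombstone instead of emptying it.
    bool erase(int key) {
        const std::size_t i = findSlot(key);
        if (i == npos) return false;
        slots_[i].state = State::Tombstone;
        return true;
    }

private:
    enum class State { Empty, Full, Tombstone };
    struct Slot {
        int key = 0;
        State state = State::Empty;
    };
    static const std::size_t npos = static_cast<std::size_t>(-1);

    std::size_t homeSlot(int key) const {
        return static_cast<std::size_t>(key) % slots_.size();
    }

    // Search skips tombstones and stops only at a truly empty slot.
    std::size_t findSlot(int key) const {
        const std::size_t n = slots_.size();
        if (n == 0) return npos;
        const std::size_t home = homeSlot(key);
        for (std::size_t step = 0; step < n; ++step) {
            const std::size_t i = (home + step) % n;
            if (slots_[i].state == State::Empty) return npos;
            if (slots_[i].state == State::Full && slots_[i].key == key) return i;
        }
        return npos;
    }

    std::vector<Slot> slots_;
};
```

After inserting 12, 22, and 32 into a 10-slot table and erasing 22, a search for 32 still succeeds because the tombstone in slot 3 does not stop the probe. Inserting 32 again is rejected because the scan continues past the tombstone and finds the existing copy, while inserting 42 reuses the tombstone slot.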