
Presentation transcript:

1 Hash Tables Gordon College CS212

2 Hash Tables
Recall the order of magnitude of searches:
– Linear search: O(n)
– Binary search: O(log₂ n)
– Balanced binary tree search: O(log₂ n)
– Unbalanced binary tree: can degrade to O(n)

3 Hash Tables
In some situations a faster search is needed.
– Solution: use a hash function
– The value of the key field is given to the hash function
– The hash function calculates a location in a hash table
Like an array indexed directly by key, but much better: there is no need to set aside space for every possible key.

4 Hash Functions
A hash function is a mapping from key to index.
Simple function: mod (%) the key by an arbitrary integer:
    int h(int i) { return i % maxSize; }
Note: maxSize is the maximum number of locations in the table.
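The modulo mapping above can be sketched a little more defensively. This is a minimal sketch, not the course's code: the name `hashMod` and the parameter `tableSize` are hypothetical, and it assumes `tableSize` is positive. The only addition to the slide's `h` is the handling of negative keys, for which C++ `%` returns a negative remainder.

```cpp
#include <cstddef>

// Hypothetical helper: maps an int key to a slot index in [0, tableSize).
// Assumes tableSize > 0. In C++ the result of % takes the sign of the
// dividend, so a negative key yields a negative remainder; adding
// tableSize moves it back into the valid index range.
std::size_t hashMod(int key, int tableSize) {
    int r = key % tableSize;      // r lies in (-tableSize, tableSize)
    if (r < 0) r += tableSize;    // shift negative remainders into range
    return static_cast<std::size_t>(r);
}
```

With this version a call such as `hashMod(-3, 10)` lands in slot 7 instead of producing an out-of-range index.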

5 Hash Function Access
Note that the speed is paid for with wasted space:
– The table must be considerably larger than the number of items anticipated

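6 Hash Function Access
Example: 7-digit serial number
– Using the serial number directly as an array index would need 10 million records*
* Not practical to set aside this much space when in reality at most a few thousand records are stocked
Why 10 million records?
– Each of the 7 digits can take any of 10 values, so there are 10^7 = 10,000,000 possible serial numbers
– The r-permutation count n!/(n-r)! = 10!/(10-7)! = 604,800 covers only serial numbers with no repeated digit, so it undercounts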

7 Hash Function (Mapping)
Example: 7-digit serial number
Use only 10000 slots
Hashing (mapping) function: unsigned int Hf(int key)
Hf(1234567) = 1234567 % 10000
1234567 / 10000 = 123 (integer division)
1234567 - (123 * 10000) = 4567
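The arithmetic on this slide can be checked with a short sketch. The table size of 10000 matches the `% 10000` used later in the deck; the constant `kSlots` and the function `HfLongHand` are hypothetical names introduced only for illustration, and keys are assumed to be non-negative.

```cpp
// Assumed table size, taken from the "% 10000" mapping used in the deck.
const int kSlots = 10000;

// The mapping function from the slide: keep the last four digits of the key.
unsigned int Hf(int key) {
    return static_cast<unsigned int>(key) % kSlots;
}

// The same remainder written out long-hand (hypothetical helper):
// key - (key / kSlots) * kSlots, using integer division.
unsigned int HfLongHand(int key) {
    int quotient = key / kSlots;  // e.g. 1234567 / 10000 = 123
    return static_cast<unsigned int>(key - quotient * kSlots);
}
```

Because only the last four digits survive, two different serial numbers such as 1234567 and 7654567 both map to slot 4567.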

8 Hash Function (Mapping)
Design considerations:
– Efficient
– Minimize collisions
– Produce uniformly distributed mappings (helps minimize collisions)
– Must be able to deal with int, char, string, etc. as key types
– Must be able to associate a hash function with a container

9 Function Objects
A function can be passed to a function.
A function object (an object of a class that overloads operator()) can be used for this:
    template <typename T>
    class functionobject
    {
    public:
        returntype operator() (arguments) const
        { return returnvalue; }
        …….
    };

10 Function Objects
Example function class: less than
    template <typename T>
    class lessThan
    {
    public:
        bool operator() (const T& x, const T& y) const
        { return x < y; }
    };

11 Function Objects
Example function class use:
    template <typename T, typename Compare>
    void insertionSort(vector<T>& v, Compare comp)
    {
        int i, j, n = v.size();
        T temp;
        …..
    }
Called as: insertionSort(v, lessThan<T>()); where T is the element type of v
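The slide leaves out the body of insertionSort. One way it might look is sketched below; the loop body is an assumption (a standard insertion sort), not the original code, and it uses std::size_t indices where the slide declares int. The point is only to show how the comparison object `comp` is called like a function.

```cpp
#include <cstddef>
#include <vector>

// The comparison function class from the slides.
template <typename T>
class lessThan {
public:
    bool operator()(const T& x, const T& y) const { return x < y; }
};

// Assumed body: a textbook insertion sort that asks comp(a, b)
// whether a must come before b.
template <typename T, typename Compare>
void insertionSort(std::vector<T>& v, Compare comp) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        T temp = v[i];                 // element to place
        std::size_t j = i;
        // Shift elements that must come after temp one slot to the right.
        while (j > 0 && comp(temp, v[j - 1])) {
            v[j] = v[j - 1];
            --j;
        }
        v[j] = temp;                   // drop temp into the gap
    }
}
```

For a `std::vector<int> v`, the call `insertionSort(v, lessThan<int>())` sorts in ascending order; passing a different function object changes the ordering without touching the sort itself.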

12 Function Objects
Example function class use (as seen with the set container), where T stands for the element type of arr:
    template <typename T>
    class lessThan
    {
    public:
        bool operator() (const T& x, const T& y) const
        { return x < y; }
    };
    set<T, lessThan<T> > A(arr, arr+arrSize);
    for (set<T, lessThan<T> >::iterator ii = A.begin(); ii != A.end(); ii++)
        cout << *ii << " ";
    cout << endl;

13 Collisions
Hash function access problem: collisions are possible, depending on the number of slots and on how the hash function maps keys to them.

14 Collisions
Hash function access problem: the same value is returned by h(i) for different values of i
– These are called collisions
Simple solution: linear probing
– A linear search begins at the collision location
– It continues, wrapping around the end of the table if necessary, until an empty slot is found for insertion

15 Linear Probing

16 Hash Functions
Retrieving a value: linear probe until found
– If an empty slot is encountered, the value is not in the table
What if deletions are permitted?
– A vacated slot can be marked as deleted rather than empty, so that it does not cut short a later linear probe
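Insertion, retrieval, and deletion with linear probing can be sketched as one small class. This is an illustrative sketch under stated assumptions, not the course's implementation: the class name `LinearProbeTable`, the three-state slot marker, and the `key % capacity` home slot are all hypothetical choices, and the capacity is assumed to be greater than zero.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical open-addressing table of int keys using linear probing.
class LinearProbeTable {
    enum State { EMPTY, FULL, DELETED };   // DELETED is the "marked" slot

public:
    explicit LinearProbeTable(std::size_t capacity)
        : keys_(capacity, 0), state_(capacity, EMPTY) {}

    // Returns false if the key is already present or the table is full.
    bool insert(int key) {
        const std::size_t none = keys_.size();
        std::size_t firstFree = none;
        for (std::size_t n = 0; n < keys_.size(); ++n) {
            std::size_t i = (home(key) + n) % keys_.size();
            if (state_[i] == FULL) {
                if (keys_[i] == key) return false;   // duplicate
            } else {
                if (firstFree == none) firstFree = i; // reuse first free slot
                if (state_[i] == EMPTY) break;        // key cannot lie beyond
            }
        }
        if (firstFree == none) return false;
        keys_[firstFree] = key;
        state_[firstFree] = FULL;
        return true;
    }

    bool contains(int key) const { return find(key) != keys_.size(); }

    // Marks the slot as DELETED so later probes keep going past it.
    bool remove(int key) {
        std::size_t i = find(key);
        if (i == keys_.size()) return false;
        state_[i] = DELETED;
        return true;
    }

private:
    std::size_t home(int key) const {
        return static_cast<std::size_t>(key) % keys_.size();
    }

    // Probes from the home slot; stops at the key or at an EMPTY slot.
    std::size_t find(int key) const {
        for (std::size_t n = 0; n < keys_.size(); ++n) {
            std::size_t i = (home(key) + n) % keys_.size();
            if (state_[i] == EMPTY) break;
            if (state_[i] == FULL && keys_[i] == key) return i;
        }
        return keys_.size();   // not found
    }

    std::vector<int> keys_;
    std::vector<State> state_;
};
```

If `remove` reset the slot to EMPTY instead, a key that had been pushed past that slot by an earlier collision would become unreachable, which is the invalid probe the slide warns about.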

17 Hash Functions
Improved performance strategies:
– Increase table capacity (fewer collisions)
– Use a different collision resolution technique
– Devise a different hash function
Hash table capacity:
– The table should have about 1.5 to 2 times as many slots as the number of items to be stored
– Otherwise the probability of collisions is too high

18 Other Collision Strategies
Linear probing can result in primary clustering
Consider: quadratic probing
– Probe sequence from location i is i + 1, i – 1, i + 4, i – 4, i + 9, i – 9, …
– Secondary clusters can still form
Double hashing
– Use a second hash function to determine the probe sequence
    hF(key) --> index
    hF(index) --> next index
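The two probe sequences can be made concrete with a pair of helper functions that list the slots visited. Both are hypothetical sketches. `quadraticProbes` follows the sequence on the slide, taken modulo the table size. `doubleHashProbes` uses a common textbook formulation, assumed here rather than taken from the slides, in which a second hash of the key gives a fixed non-zero step; it assumes a table size of at least 2, ideally prime.

```cpp
#include <cstddef>
#include <vector>

// First `count` slots of the quadratic sequence i, i+1, i-1, i+4, i-4, ...
// with every index taken modulo tableSize (assumed > 0).
std::vector<std::size_t> quadraticProbes(std::size_t home,
                                         std::size_t tableSize,
                                         std::size_t count) {
    std::vector<std::size_t> seq;
    if (count == 0) return seq;
    home %= tableSize;
    seq.push_back(home);
    for (std::size_t k = 1; seq.size() < count; ++k) {
        std::size_t step = (k * k) % tableSize;
        seq.push_back((home + step) % tableSize);                  // i + k^2
        if (seq.size() < count)
            seq.push_back((home + tableSize - step) % tableSize);  // i - k^2
    }
    return seq;
}

// First `count` slots for double hashing (assumed formulation):
// first hash picks the start, second hash picks the step size.
std::vector<std::size_t> doubleHashProbes(unsigned int key,
                                          std::size_t tableSize,
                                          std::size_t count) {
    std::vector<std::size_t> seq;
    std::size_t index = key % tableSize;            // first hash
    std::size_t step = 1 + key % (tableSize - 1);   // second hash, never 0
    for (std::size_t n = 0; n < count; ++n) {
        seq.push_back(index);
        index = (index + step) % tableSize;
    }
    return seq;
}
```

Keys that share a home slot follow the same quadratic sequence, which is why secondary clusters can still form; with double hashing their step sizes usually differ, so their paths separate after the first probe.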

19 Collision Strategies
Chaining
– The table is a list or vector of head nodes to linked lists
– When an item hashes to a location, it is added to that location's linked list
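Chaining can be sketched with a vector of standard-library lists. The class name `ChainedTable` and its member names are hypothetical, and std::list stands in for the hand-built head nodes the slide describes; the bucket count is assumed to be greater than zero.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical chained hash table of int keys: one linked list per slot.
class ChainedTable {
public:
    explicit ChainedTable(std::size_t buckets) : buckets_(buckets) {}

    // Appends the key to its slot's chain; returns false on a duplicate.
    bool insert(int key) {
        std::list<int>& chain = buckets_[slot(key)];
        if (std::find(chain.begin(), chain.end(), key) != chain.end())
            return false;
        chain.push_back(key);
        return true;
    }

    // Searches only the chain the key hashes to.
    bool contains(int key) const {
        const std::list<int>& chain = buckets_[slot(key)];
        return std::find(chain.begin(), chain.end(), key) != chain.end();
    }

    // Unlinks the key from its chain; no deletion marker is needed.
    bool remove(int key) {
        std::list<int>& chain = buckets_[slot(key)];
        std::list<int>::iterator it =
            std::find(chain.begin(), chain.end(), key);
        if (it == chain.end()) return false;
        chain.erase(it);
        return true;
    }

    // Length of the chain that `key` hashes to.
    std::size_t chainLength(int key) const {
        return buckets_[slot(key)].size();
    }

private:
    std::size_t slot(int key) const {
        return static_cast<std::size_t>(key) % buckets_.size();
    }

    std::vector<std::list<int> > buckets_;
};
```

Colliding keys simply share a chain, so the table never fills up, but search time grows with chain length rather than staying constant.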

20 Chaining

21 Improving the Hash Function
Ideal hash function:
– Simple to evaluate (fast)
– Scatters items uniformly throughout the table
Modulo arithmetic is not so good for strings
– One possibility: manipulate the numeric (ASCII) values of the first and last characters of a name

22 Hash Function (basic mapping)
    class hFintID
    {
    public:
        unsigned int operator() (int item) const
        { return (unsigned int) item % 10000; }
    };
    hFintID hf;
hf(key) = 1234 for a key whose last four digits are 1234

23 Hash Function (better)
    class hFint
    {
    public:
        unsigned int operator() (int item) const
        {
            unsigned int value = (unsigned int) item;
            value *= value;
            value /= 256;   // discard low order 8 bits
                            // (division performs a shift right)
            return value % 65536;
        }
    };
The midsquare technique mixes up the digits in the serial number

24 String Hash Functions
    class hFstring
    {
    public:
        unsigned int operator() (const string& item) const
        {
            unsigned int prime = …;   // a large prime number
            int n = 0, i;
            for (i = 0; i < item.length(); i++)
                n = n*8 + item[i];
            return n > 0 ? (n % prime) : (-n % prime);
        }
    };
GOAL: random distribution
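A variant of the string hash that avoids signed overflow can be sketched as follows. This is not the original class: the name `hFstringSketch` is hypothetical, and the prime 1000000007 is an assumed stand-in because the slide's value did not survive. Accumulating in an unsigned int makes overflow wrap around in a well-defined way, so the sign test in the slide's return statement is no longer needed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical variant of the slide's hFstring using unsigned arithmetic.
class hFstringSketch {
public:
    unsigned int operator()(const std::string& item) const {
        // Assumed value: 1000000007 is a well-known prime, not the
        // constant from the original slide.
        const unsigned int prime = 1000000007u;
        unsigned int n = 0;
        for (std::size_t i = 0; i < item.length(); ++i) {
            // Multiply by 8 (a 3-bit left shift) and add the next character.
            // Unsigned overflow wraps modulo 2^32 on typical platforms.
            n = n * 8 + static_cast<unsigned char>(item[i]);
        }
        return n % prime;
    }
};
```

The result still has to be reduced modulo the table size before it is used as an index. Because every character position carries a different weight, strings made of the same letters in a different order, such as "ab" and "ba", hash to different values.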

25 Custom Hash Functions
    class hfCode
    {
    public:
        unsigned int operator() (const code& item) const
        { return (unsigned int) item.getNum() % NumofSlots; }
    };
FILE0000.CHK, FILE0001.CHK, FILE0002.CHK

26 Search Algorithms
Sequential Search
- search O(n) (fairly slow)
+ good when the data set is small and does not have to be sorted
Binary Search (sorted vector)
+ search O(log n) [much faster]
+ low cost when it comes to space
- however, requires the data to be sorted
- not good when the data set is very dynamic (sorting overhead)
Binary Search Tree
+ search O(log n) when the tree is balanced
+ can scan data in order
- higher cost when it comes to space (various pointers)
Hashing
+ search O(1) on average [fastest]
- higher cost when it comes to space (depends on method)