Introduction to Hashing - Hash Functions

Introduction to Hashing - Hash Functions Sections 5.1, 5.2, and 5.6

Hashing
- Data items are stored in an array of some fixed size, called the hash table.
- A search is performed using some part of the data item, called the key.
- Hash tables are used for performing insertions, deletions, and finds in constant average time.
- Operations that require ordering information, such as findMin and findMax, are not supported efficiently.

Hash Table Example

Hash Table Applications
- Comparing the search efficiency of different data structures:
  - Vector, list: O(N)
  - AVL search tree: O(log N)
  - Hash table: O(1) expected time
- Compilers, to keep track of declared variables
  - Symbol tables
- Mapping from name to id
- On-line spelling checkers

Hash Functions
- Map keys to integers, which represent table indices: Hash(Key) = Integer
- Index values should be evenly distributed, even if the input data is not evenly distributed.
- What happens if multiple keys are mapped to the same integer (the same position)?
  - Collision management (discussed in detail later)
  - Collisions are likely to be reduced if keys are evenly distributed over the hash table.

Simple Hash Functions
- Assumptions:
  - K: an unsigned 32-bit integer
  - M: the number of buckets (the number of entries in the hash table)
- Goal: if a bit is changed in K, all bits of Hash(K) are equally likely to change, so that items are evenly distributed in the hash table.

A Simple Function
- What if Hash(K) = K % M, where M is any integer value?
- What is wrong?
  - Values of K may not be evenly distributed, but Hash(K) needs to be evenly distributed.
  - Suppose M = 10 and K = 10, 20, 30, 40. Then K % M = 0, 0, 0, 0: every key lands in the same bucket.

Another Simple Function
- Hash(K) = K % P, where P is a prime number
- Suppose P = 11 and K = 10, 20, 30, 40. Then K % P = 10, 9, 8, 7.
- The distribution is more uniform, so hash tables often have a prime number of entries.
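The two modulus choices above can be compared directly. The following is a minimal sketch, not part of the original slides; the helper names bucketCounts and usedBuckets are made up for illustration. It counts how many keys fall into each bucket for a given table size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count how many keys land in each bucket when Hash(K) = K % buckets.
// (bucketCounts is a hypothetical helper name, not from the slides.)
std::vector<int> bucketCounts(const std::vector<unsigned int>& keys,
                              unsigned int buckets) {
    assert(buckets > 0);
    std::vector<int> counts(buckets, 0);
    for (unsigned int k : keys) {
        ++counts[k % buckets];
    }
    return counts;
}

// Number of buckets that received at least one key.
std::size_t usedBuckets(const std::vector<int>& counts) {
    std::size_t used = 0;
    for (int c : counts) {
        if (c > 0) {
            ++used;
        }
    }
    return used;
}
```

With the keys 10, 20, 30, 40, a table of 10 buckets uses a single bucket, while a table of 11 buckets spreads the keys over four different buckets.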

A Simple Hash for Strings
    unsigned int Hash(const string& Key) {
        unsigned int hash = 0;
        for (size_t j = 0; j != Key.size(); ++j) {
            hash += Key[j];
        }
        return hash;
    }
- Problem: short keys produce small sums, so they may reach only a small fraction of a large hash table.
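To see how little of a large table the additive hash can reach, here is a rough sketch. It assumes 7-bit ASCII keys (character values of at most 127); the function names and the table size used in the note below are illustrative choices, not values from the slides.

```cpp
#include <cassert>
#include <string>

// Additive string hash: the sum of the character codes.
unsigned int additiveHash(const std::string& key) {
    unsigned int hash = 0;
    for (unsigned char c : key) {
        hash += c;
    }
    return hash;
}

// Largest sum possible for keys of at most maxLength characters,
// assuming 7-bit ASCII (each character code is at most 127).
unsigned int maxAdditiveHash(unsigned int maxLength) {
    return 127u * maxLength;
}

// Upper bound on the fraction of a table of tableSize buckets
// that such keys can ever reach (hash values 0..max, capped at tableSize).
double reachableFraction(unsigned int maxLength, unsigned int tableSize) {
    assert(tableSize > 0);
    unsigned int reachable = maxAdditiveHash(maxLength) + 1;
    if (reachable > tableSize) {
        reachable = tableSize;
    }
    return static_cast<double>(reachable) / tableSize;
}
```

For keys of at most 8 characters the sum never exceeds 127 * 8 = 1016, so in a table with 10007 entries (an arbitrary prime chosen for this example) roughly nine tenths of the buckets can never be used.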

Another Simple Hash Function
    unsigned int Hash(const string& Key) {
        // Assumes Key has at least three characters.
        return Key[0] + 27*Key[1] + 729*Key[2];
    }
- Problem: English does not use random strings, so the hash values are not uniformly distributed.
- Using more characters of the key can improve the hash function.
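One consequence of looking at only the first three characters can be checked with a short sketch. The name prefixHash is hypothetical, and the function assumes keys of at least three characters, as the slide's code implicitly does.

```cpp
#include <cassert>
#include <string>

// Hash that looks only at the first three characters of the key.
// Precondition (assumed here): key.size() >= 3.
unsigned int prefixHash(const std::string& key) {
    assert(key.size() >= 3);
    const unsigned int c0 = static_cast<unsigned char>(key[0]);
    const unsigned int c1 = static_cast<unsigned char>(key[1]);
    const unsigned int c2 = static_cast<unsigned char>(key[2]);
    return c0 + 27u * c1 + 729u * c2;
}
```

Any two words that share their first three letters, such as "hash" and "hashing", receive the same value, so common English prefixes pile up in a few buckets regardless of the table size.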

A Better Hash Function
    unsigned int Hash(const string& Key) {
        unsigned int hash = 0;
        for (size_t j = 0; j != Key.size(); ++j)
            hash = 37*hash + (Key[j] - 'a' + 1);
        return hash % TableSize;
    }
- The for loop computes $\sum_{i=0}^{n} a_i \, 37^{n-i}$ using Horner's rule, where $a_i$ has the value 1 for 'a', 2 for 'b', etc.
- For a four-character key: $a_3 + 37 a_2 + 37^2 a_1 + 37^3 a_0 = 37(37(37 a_0 + a_1) + a_2) + a_3$
- The for loop implicitly performs arithmetic modulo $2^k$, where $k$ is the number of bits in an unsigned int.
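The equivalence between Horner's rule and the explicit power sum can be checked with a small sketch. It assumes lowercase ASCII keys, as the slide's mapping of 'a' to 1 suggests. The names letterValue, hornerHash, and powerSumHash are made up for this example, and the table size is passed as a parameter rather than read from a global TableSize.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Maps 'a' to 1, 'b' to 2, and so on (assumes lowercase ASCII letters).
unsigned int letterValue(char c) {
    return static_cast<unsigned int>(c - 'a' + 1);
}

// Horner's rule: one multiplication and one addition per character.
unsigned int hornerHash(const std::string& key, unsigned int tableSize) {
    assert(tableSize > 0);
    unsigned int hash = 0;
    for (std::size_t j = 0; j != key.size(); ++j) {
        hash = 37u * hash + letterValue(key[j]);
    }
    return hash % tableSize;
}

// The same polynomial evaluated term by term: the character at index i
// is multiplied by 37^(last - i), where last = key.size() - 1.
unsigned int powerSumHash(const std::string& key, unsigned int tableSize) {
    assert(tableSize > 0);
    unsigned int hash = 0;
    const std::size_t n = key.size();
    for (std::size_t i = 0; i < n; ++i) {
        unsigned int power = 1;
        for (std::size_t p = i + 1; p < n; ++p) {
            power *= 37u;
        }
        hash += power * letterValue(key[i]);
    }
    return hash % tableSize;
}
```

Both functions use unsigned arithmetic, which wraps modulo 2^k, so they agree even when the intermediate values overflow. For the key "abcd" the polynomial is 37^3 + 2 * 37^2 + 3 * 37 + 4 = 53506, which is 3471 modulo 10007 (an arbitrary prime table size chosen for this example).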

STL Hash Tables
- STL extensions: hash_set, hash_map
- The key type, hash function, and equality operator may need to be provided.
- Available in the new standard as unordered_set and unordered_map:
  - <tr1/unordered_map> or <unordered_map>
  - <tr1/unordered_set> or <unordered_set>
- Example: Lec24/hashmapex.cpp
- Reference: www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1456.html
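As a rough illustration of supplying a hash function to a standard container, the sketch below passes a user-defined functor as the third template argument of std::unordered_map. It is not the course's Lec24/hashmapex.cpp; the names StringHash and countWords are invented for this example, and it assumes a C++11 or later compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// A user-supplied hash functor in the style of the Horner's rule hash.
// std::unordered_map reduces the result to a bucket index on its own,
// so no "% TableSize" step is needed here.
struct StringHash {
    std::size_t operator()(const std::string& key) const {
        std::size_t hash = 0;
        for (unsigned char c : key) {
            hash = 37 * hash + c;
        }
        return hash;
    }
};

// Counts how often each word occurs, using the custom hash functor.
// Equality of keys uses the default std::equal_to<std::string>.
std::unordered_map<std::string, int, StringHash>
countWords(const std::vector<std::string>& words) {
    assert(!words.empty());
    std::unordered_map<std::string, int, StringHash> counts;
    for (const std::string& w : words) {
        ++counts[w];
    }
    return counts;
}
```

For std::string keys the default std::hash<std::string> would also work, so a custom functor is mainly needed for key types that have no standard hash.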