1 Introduction to Hashing - Hash Functions Sections 5.1, 5.2, and 5.6.

2 Hashing
Data items are stored in an array of some fixed size
  – Hash table
Search is performed using some part of the data item
  – Key
Used for performing insertions, deletions, and finds in constant average time
Operations requiring ordering information are not supported efficiently
  – Such as findMin, findMax

3 Hash Table Example

4 Hash Table Applications
Comparing search efficiency of different data structures:
  – Vector, list: O(N)
  – AVL search tree: O(log(N))
  – Hash table: O(1) expected time
Compilers, to keep track of declared variables
  – Symbol tables
  – Mapping from name to id
On-line spelling checkers

5 Hash Functions
Map keys to integers (which represent table indices)
  – Hash(Key) = Integer
  – Index values should be evenly distributed, even if the input data is not
What happens if multiple keys are mapped to the same integer (same position)?
  – Collision management (discussed in detail later)
  – Collisions are less likely if keys are evenly distributed over the hash table

6 Simple Hash Functions
Assumptions:
  – K: an unsigned 32-bit integer
  – M: the number of buckets (the number of entries in the hash table)
Goal:
  – If one bit of K is changed, every bit of Hash(K) is equally likely to change
  – So that items are evenly distributed in the hash table

7 A Simple Function
What if
  – Hash(K) = K % M
  – where M is an arbitrary integer
What is wrong?
Values of K may not be evenly distributed
  – But Hash(K) needs to be evenly distributed
Suppose
  – M = 10
  – K = 10, 20, 30, 40
Then K % M = 0, 0, 0, 0, …

8 Another Simple Function
If
  – Hash(K) = K % P, P = prime number
Suppose
  – P = 11
  – K = 10, 20, 30, 40
K % P = 10, 9, 8, 7
More uniform distribution…
So hash tables often have a prime number of entries

9 A Simple Hash for Strings
    unsigned int Hash(const string& Key)
    {
        unsigned int hash = 0;
        for (size_t j = 0; j != Key.size(); ++j) {
            hash += Key[j];   // add each character code
        }
        return hash;
    }
Problem: short keys cannot reach most of a large hash table, because the sum of a few character codes stays small

10 Another Simple Hash Function
    unsigned int Hash(const string& Key)
    {
        // Uses only the first three characters; Key must have at least three.
        return Key[0] + 27*Key[1] + 729*Key[2];
    }
Problem: English does not use random strings, so the hash values are not uniformly distributed
  – Using more characters of the key can improve the hash function

11 A Better Hash Function
    unsigned int Hash(const string& Key)
    {
        unsigned int hash = 0;
        for (size_t j = 0; j != Key.size(); ++j)
            hash = 37*hash + (Key[j] - 'a' + 1);   // one Horner step per character
        return hash % TableSize;
    }
The for loop computes $\sum_{i=0}^{n} a_i \cdot 37^{n-i}$ using Horner's rule, where $a_i$ has the value 1 for 'a', 2 for 'b', etc.
  – $37^3 a_0 + 37^2 a_1 + 37 a_2 + a_3 = 37(37(37 a_0 + a_1) + a_2) + a_3$
The for loop implicitly performs arithmetic modulo $2^k$, where k is the number of bits in an unsigned int

12 STL Hash Tables
STL extensions
  – hash_set
  – hash_map
The key type, hash function, and equality operator may need to be provided
Available in the new standard as unordered set and map
  – std::unordered_set or std::unordered_map
Example: Lec24/hashmapex.cpp
  – Reference