CS261 Data Structures Hash Tables Concepts. Goals Hash Functions Dealing with Collisions.

Slides:



Advertisements
Similar presentations
Hashing.
Advertisements

CSE 1302 Lecture 23 Hashing and Hash Tables Richard Gesick.
Hashing as a Dictionary Implementation
Appendix I Hashing. Chapter Scope Hashing, conceptually Using hashes to solve problems Hash implementations Java Foundations, 3rd Edition, Lewis/DePasquale/Chase21.
Data Types in Java Data is the information that a program has to work with. Data is of different types. The type of a piece of data tells Java what can.
Hashing Chapters What is Hashing? A technique that determines an index or location for storage of an item in a data structure The hash function.
Nov 12, 2009IAT 8001 Hash Table Bucket Sort. Nov 12, 2009IAT 8002  An array in which items are not stored consecutively - their place of storage is calculated.
Hashing Techniques.
CS 261 – Data Structures Hash Tables Part 1. Open Address Hashing.
1 Hash Tables Gordon College CS Hash Tables Recall order of magnitude of searches –Linear search O(n) –Binary search O(log 2 n) –Balanced binary.
1 Chapter 9 Maps and Dictionaries. 2 A basic problem We have to store some records and perform the following: add new record add new record delete record.
Introduction to Computers and Programming Lecture 7:
Hash Tables and Associative Containers CS-212 Dick Steflik.
CS 261 – Data Structures Hash Tables Part II: Using Buckets.
Chapter 5: Hashing Hash Tables
Hash Tables1 Part E Hash Tables  
Hash Tables1 Part E Hash Tables  
Hash Tables1 Part E Hash Tables  
CS 206 Introduction to Computer Science II 11 / 12 / 2008 Instructor: Michael Eckmann.
Dictionaries 4/17/2017 3:23 PM Hash Tables  
Introducing Hashing Chapter 21 Copyright ©2012 by Pearson Education, Inc. All rights reserved.
© 2006 Pearson Addison-Wesley. All rights reserved13 B-1 Chapter 13 (excerpts) Advanced Implementation of Tables CS102 Sections 51 and 52 Marc Smith and.
1. 2 Problem RT&T is a large phone company, and they want to provide enhanced caller ID capability: –given a phone number, return the caller’s name –phone.
Symbol Tables Symbol tables are used by compilers to keep track of information about variables functions class names type names temporary variables etc.
CS212: DATA STRUCTURES Lecture 10:Hashing 1. Outline 2  Map Abstract Data type  Map Abstract Data type methods  What is hash  Hash tables  Bucket.
Hashing Chapter 20. Hash Table A hash table is a data structure that allows fast find, insert, and delete operations (most of the time). The simplest.
© 2006 Pearson Addison-Wesley. All rights reserved13 B-1 Chapter 13 (continued) Advanced Implementation of Tables.
1 HashTable. 2 Dictionary A collection of data that is accessed by “key” values –The keys may be ordered or unordered –Multiple key values may/may-not.
Comp 335 File Structures Hashing.
CSE 501N Fall ‘09 11: Data Structures: Stacks, Queues, and Maps Nick Leidenfrost October 6, 2009.
Sets, Maps and Hash Tables. RHS – SOC 2 Sets We have learned that different data struc- tures have different advantages – and drawbacks Choosing the proper.
Hashing1 Hashing. hashing2 Observation: We can store a set very easily if we can use its keys as array indices: A: e.g. SEARCH(A,k) return A[k]
Hashing Sections 10.2 – 10.3 CS 302 Dr. George Bebis.
Hashing Hashing is another method for sorting and searching data.
Hashing as a Dictionary Implementation Chapter 19.
Hashing – Part I CS 367 – Introduction to Data Structures.
1 Introduction to Hashing - Hash Functions Sections 5.1, 5.2, and 5.6.
Hash Tables - Motivation
Chapter 12 Hash Table. ● So far, the best worst-case time for searching is O(log n). ● Hash tables  average search time of O(1).  worst case search.
Chapter 10 Hashing. The search time of each algorithm depend on the number n of elements of the collection S of the data. A searching technique called.
Hashing Basis Ideas A data structure that allows insertion, deletion and search in O(1) in average. A data structure that allows insertion, deletion and.
COSC 2007 Data Structures II Chapter 13 Advanced Implementation of Tables IV.
CS261 Data Structures Hash Tables Open Address Hashing.
Hashing Suppose we want to search for a data item in a huge data record tables How long will it take? – It depends on the data structure – (unsorted) linked.
School of Computer Science & Information Technology G6DICP - Lecture 4 Variables, data types & decision making.
CPSC 252 Hashing Page 1 Hashing We have already seen that we can search for a key item in an array using either linear or binary search. It would be better.
CHAPTER 9 HASH TABLES, MAPS, AND SKIP LISTS ACKNOWLEDGEMENT: THESE SLIDES ARE ADAPTED FROM SLIDES PROVIDED WITH DATA STRUCTURES AND ALGORITHMS IN C++,
1 Hashing by Adlane Habed School of Computer Science University of Windsor May 6, 2005.
Week 9 - Monday.  What did we talk about last time?  Practiced with red-black trees  AVL trees  Balanced add.
Hash Tables ADT Data Dictionary, with two operations – Insert an item, – Search for (and retrieve) an item How should we implement a data dictionary? –
Hashing & Hash Tables. Sets/Dictionaries Set - Our best efforts to date:
CS 261 – Data Structures Hash Tables Part II: Using Buckets.
Java Methods A & AB Object-Oriented Programming and Data Structures Maria Litvin ● Gary Litvin Copyright © 2006 by Maria Litvin, Gary Litvin, and Skylight.
Java Basics. Tokens: 1.Keywords int test12 = 10, i; int TEst12 = 20; Int keyword is used to declare integer variables All Key words are lower case java.
CSE 311 Foundations of Computing I Lecture 12 Modular Arithmetic and Applications Autumn 2012 CSE
CSE 311 Foundations of Computing I Lecture 11 Modular Exponentiation and Primes Autumn 2011 CSE 3111.
CSC 212 – Data Structures Lecture 28: More Hash and Dictionaries.
Appendix I Hashing.
Hashing & HashMaps CS-2851 Dr. Mark L. Hornick.
Slides by Steve Armstrong LeTourneau University Longview, TX
Introduction to Hashing - Hash Functions
Hash functions Open addressing
CH 9.2 : Hash Tables Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and.
Hash Tables and Associative Containers
CH 9.2 : Hash Tables Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and.
CS202 - Fundamental Structures of Computer Science II
Hash Tables Buckets/Chaining
Hash Tables Open Address Hashing
Podcast Ch21a Title: Hash Functions
Presentation transcript:

CS261 Data Structures Hash Tables Concepts

Goals Hash Functions Dealing with Collisions

Searching...Better than O(log n)? Skip lists and AVL trees reduce the time to perform operations (add, contains, remove) from O(n) to O(log n) Can we do better? Can we find a structure that will provide O(1) operations? Yes. No. Well, maybe...

Hash Tables Hash tables are similar to arrays except… – Elements can be indexed by values other than integers – Multiple values may share an index Huh??? What???

Hashing with a Hash Function Key ie. string, url,etc. Hash function integer index 0 Key y 1 Key w 2 Key z 3 4 Key x Hash Table Hash to index for storage AND retrieval!

Hashing to a Table Index Computing a hash table index is a two-step process: 1.Transform the value (or key) to an integer (using the hash function) 2.Map that integer to a valid hash table index (using the mod operator) Example App: spell checker – Compute an integer from the word – Map the integer to an index in a table (i.e., a vector, array, etc.)

Hash Function Goals FAST (constant time) Produce UNIFORMLY distributed indices REPEATABLE (ie. same key always results in same index)

Step 1: Transforming a key to an integer Mapping: Map (a part of) the key into an integer – Example: a letter to its position in the alphabet Folding: key partitioned into parts which are then combined using efficient operations (such as add, multiply, shift, XOR, etc.) – Example: summing the values of each character in a string Key Mapped chars (char in alpha) Folded (+) eat ate tea

Step 1: Transforming a key to an integer Shifting: can account for position of characters Key Mapped chars (char in alpha) Folded (+) Shifted and Folded eat = 42 ate = 49 tea = 91 Shifted by position in the word (right to left): 0th letter shifted left 0, first letter shifted left 1, etc.

Step 1: Transform key to an integer Mapping: Map (a part of) the key into an integer – Example: a letter to its position in the alphabet Folding: key partitioned into parts which are then combined using efficient operations (such as add, multiply, shift, XOR, etc.) – Example: summing the values of each character in a string Shifting: get rid of high- or low-order bits that are not random – Example: if keys are always even, shift off the low order bit Casts: converting a numeric type into an integer – Example: casting a character to an int to get its ASCII value – ie. char myChar = ‘b’; int idx = (int) myChar;

Typical Hash Functions Character: the char value cast to an int  it’s ASCII value Date: a value associated with the current time Double: a value generated by its bitwise representation Integer: the int value itself String: a folded sum of the character values URL: the hash on the host name Use the provided hash function!!! (ie. Java classes inherit a hashCode function …which you can override if desired

Step 2: Mapping to a Valid Index Use modulus operator (%) with table size: – Example: idx = hash(val) % size; Use only positive arithmetic or take absolute value To get a good distribution of indices, prime numbers make the best table sizes: – Example: if you have 1000 elements, a table size of 997 or 1009 is preferable

Hash Tables: Collisions A collision occurs when two values hash to the same index We’ll discuss how to deal with collisions in the next lecture! Minimally Perfect Hash Function: – No collisions – Table size = # of elements Perfect Hash Function: – No collisions – Table size equal or slightly larger than the number of elements

Minimally Perfect Hash Funciton Alfred f = 5 % 6 = 5 Alessia e = 4 % 6 = 4 Amina i = 8 % 6 = 2 Amy y = 24 % 6 = 0 Andy d = 3 % 6 = 3 Anne n = 13 % 6 = 1 Position of 3 rd letter (starting at left, index 0), mod 6 0 Amy 1 Anne 2 Amina 3 Andy 4 Alessia 5 Alfred

Hashing: Why do it?? Assuming – Hash function can be computed in constant time – computed indices are equally distributed over the table Allows for O(1) time bag/map operations!

Application Example Spell checker – Know all your words before hand – Need FAST lookups so you can highlight on the fly – Compute an integer index from the string Concordance – Use a HashMap in your assignment