Hashing
Nelson Padua-Perez, Chau-Wen Tseng
Department of Computer Science, University of Maryland, College Park

Presentation transcript:


Hashing
- Approach
  - Transform key into number (hash value)
  - Use hash value to index object in hash table
- Use hash function to convert key to number

Hashing
- Hash table
  - Array indexed using hash values
- Hash table A with size N
  - Indices of A range from 0 to N-1
  - Store in A[hashValue % N]
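A minimal Java sketch of the indexing step, A[hashValue % N]. The class name `TableIndex` and the method `indexFor` are hypothetical, introduced only for illustration. One Java-specific detail is assumed here: `hashCode()` can return a negative number and `%` keeps the sign of the dividend, so the sketch uses `Math.floorMod` to stay within 0 to N-1.

```java
// Sketch: mapping an arbitrary hash value to a table index in 0..N-1.
final class TableIndex {
    // Math.floorMod always returns a non-negative result for a positive
    // table size, unlike the % operator applied to a negative hash value.
    static int indexFor(Object key, int tableSize) {
        return Math.floorMod(key.hashCode(), tableSize);
    }

    public static void main(String[] args) {
        int n = 8; // table size N
        for (String key : new String[] {"apple", "kiwi", "mango"}) {
            System.out.println(key + " -> A[" + indexFor(key, n) + "]");
        }
    }
}
```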

Hash Function
- Goal
  - Scatter values uniformly across range
- Hash( ) = 0
  - Satisfies definition of hash function
  - But not very useful
- Multiplicative congruency method
  - Produces good hash values
  - Hash value = (a × int(key)) % N
  - Where
    - N is table size
    - a, N are large primes
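The multiplicative congruency formula can be sketched in Java as follows. The constants a = 7919 and N = 1009 are assumptions picked for illustration (both are prime, though small compared with what a real table might use), and the names `MultiplicativeHash`, `hash`, and `constantHash` are hypothetical.

```java
// Sketch of the multiplicative congruency method: (a * int(key)) % N.
final class MultiplicativeHash {
    // Hypothetical constants chosen for illustration; both are prime.
    static final long A = 7919; // multiplier a
    static final int N = 1009;  // table size N

    static int hash(int key) {
        // Multiply in long arithmetic to avoid int overflow, then reduce
        // with floorMod so negative keys still map into 0..N-1.
        return (int) Math.floorMod(A * key, (long) N);
    }

    // The degenerate hash from the slide: it is a valid hash function,
    // but every key lands in the same entry.
    static int constantHash(int key) {
        return 0;
    }

    public static void main(String[] args) {
        for (int key = 1; key <= 5; key++) {
            System.out.println(key + " -> " + hash(key)
                    + " (constant hash: " + constantHash(key) + ")");
        }
    }
}
```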

Hash Function Example
- hashCode("apple") = 5
- hashCode("watermelon") = 3
- hashCode("grapes") = 8
- hashCode("kiwi") = 0
- hashCode("strawberry") = 9
- hashCode("mango") = 6
- hashCode("banana") = 2
- Perfect hash function
  - Unique values for each key
- Hash table (index: key)
  - 0: kiwi
  - 2: banana
  - 3: watermelon
  - 5: apple
  - 6: mango
  - 8: grapes
  - 9: strawberry

Hash Function
- Suppose now
  - hashCode("apple") = 5
  - hashCode("watermelon") = 3
  - hashCode("grapes") = 8
  - hashCode("kiwi") = 0
  - hashCode("strawberry") = 9
  - hashCode("mango") = 6
  - hashCode("banana") = 2
  - hashCode("orange") = 3
- Collision
  - Same hash value for multiple keys
- Hash table (index: key)
  - 0: kiwi
  - 2: banana
  - 3: watermelon (orange also hashes here)
  - 5: apple
  - 6: mango
  - 8: grapes
  - 9: strawberry
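The hash codes on this slide are made up for the example, but collisions also occur with Java's real `String.hashCode()`. The sketch below uses the strings "Aa" and "BB", which produce the same value (2112) under the formula documented for `String.hashCode()`. The class name `CollisionDemo` and the helper `collide` are hypothetical.

```java
// Sketch: two different keys that hash to the same value, and therefore
// to the same table entry for any table size.
final class CollisionDemo {
    // True when both keys map to the same index in a table of this size.
    static boolean collide(Object a, Object b, int tableSize) {
        return Math.floorMod(a.hashCode(), tableSize)
                == Math.floorMod(b.hashCode(), tableSize);
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());         // 'A' * 31 + 'a' = 2112
        System.out.println("BB".hashCode());         // 'B' * 31 + 'B' = 2112
        System.out.println(collide("Aa", "BB", 10)); // true
    }
}
```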

Types of Hash Tables
- Open addressing
  - Store objects in each table entry
- Chaining (bucket hashing)
  - Store lists of objects in each table entry

Open Addressing Hashing
- Approach
  - Hash table contains objects
  - Probe: examine table entry
  - Collision
    - Move K entries past current location
    - Wrap around table if necessary
- Find location for X
  1. Examine entry at A[ key(X) ]
  2. If entry = X, found
  3. If entry = empty, X not in hash table
  4. Else increment location by K, repeat

Open Addressing Hashing
- Approach
  - Linear probing
    - K = 1
    - May form clusters of contiguous entries
  - Deletions
    - Find location for X
    - If X inside cluster, leave non-empty marker
  - Insertion
    - Find location for X
    - Insert if X not in hash table
    - Can insert X at first non-empty marker

Open Addressing Example
- Hash codes
  - H(A) = 6, H(C) = 6
  - H(B) = 7, H(D) = 7
- Hash table
  - Size = 8 elements
  - _ = empty entry
  - * = non-empty marker
- Linear probing
  - Collision: move 1 entry past current location

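Open Addressing Example
- Operations
  - Insert A, Insert B, Insert C, Insert D
- Occupied entries, in table order, after each operation
  - Insert A: A
  - Insert B: A B
  - Insert C: A B C
  - Insert D: D A B C (D wraps around to the start of the table)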

Open Addressing Example
- Operations
  - Find A, Find B, Find C, Find D
- Occupied entries, in table order, after each operation (the table is unchanged)
  - Find A: D A B C
  - Find B: D A B C
  - Find C: D A B C
  - Find D: D A B C

Open Addressing Example
- Operations
  - Delete A, Delete C, Find D, Insert C
- Occupied entries, in table order, after each operation (* = non-empty marker)
  - Delete A: D * B C
  - Delete C: D * B *
  - Find D: D * B *
  - Insert C: D C B *
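The open addressing example can be replayed with a small Java sketch of linear probing (K = 1) with deletion markers. Everything named here (`LinearProbingSet`, `exampleHash`, the `DELETED` sentinel) is hypothetical and not from the slides. The sketch assumes 0-based indices and simplifies deletion by always leaving a marker, so the exact slot positions may differ from the slide's drawing.

```java
import java.util.function.ToIntFunction;

// Sketch of open addressing with linear probing (K = 1) and deletion markers.
final class LinearProbingSet<T> {
    // Sentinel standing in for the slides' "non-empty marker" (*).
    private static final Object DELETED = new Object();

    private final Object[] table;        // null means an empty entry
    private final ToIntFunction<T> hash; // supplies the hash code for a key

    LinearProbingSet(int size, ToIntFunction<T> hash) {
        this.table = new Object[size];
        this.hash = hash;
    }

    // Hash codes from the example: H(A) = H(C) = 6, H(B) = H(D) = 7.
    static int exampleHash(String key) {
        return (key.equals("A") || key.equals("C")) ? 6 : 7;
    }

    private int home(T x) {
        return Math.floorMod(hash.applyAsInt(x), table.length);
    }

    // Returns the index holding x, or -1 if x is not in the table.
    int find(T x) {
        int i = home(x);
        for (int probes = 0; probes < table.length; probes++) {
            Object entry = table[i];
            if (entry == null) return -1;                   // empty entry: stop
            if (entry != DELETED && entry.equals(x)) return i;
            i = (i + 1) % table.length;                     // move 1 entry, wrapping
        }
        return -1;                                          // probed every entry
    }

    // Inserts x at the first empty or marked entry on its probe path.
    boolean insert(T x) {
        if (find(x) >= 0) return false;                     // already present
        int i = home(x);
        for (int probes = 0; probes < table.length; probes++) {
            if (table[i] == null || table[i] == DELETED) {
                table[i] = x;
                return true;
            }
            i = (i + 1) % table.length;
        }
        throw new IllegalStateException("hash table is full");
    }

    // Simplification: always leaves a marker, even at the end of a cluster.
    boolean delete(T x) {
        int i = find(x);
        if (i < 0) return false;
        table[i] = DELETED;
        return true;
    }

    public static void main(String[] args) {
        LinearProbingSet<String> t =
                new LinearProbingSet<>(8, LinearProbingSet::exampleHash);
        for (String k : new String[] {"A", "B", "C", "D"}) t.insert(k);
        t.delete("A");
        t.delete("C");
        System.out.println(t.find("D")); // D is still reachable past the markers
        t.insert("C");
        System.out.println(t.find("C")); // C reuses the marker at its home entry
    }
}
```

The markers matter for Find D: without them, the probe starting at H(D) = 7 would hit an empty entry where C used to be and wrongly report that D is missing.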

Efficiency of Open Addressing
- Load factor = entries / table size
- Hashing is efficient for load factor < 90%
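To give a rough sense of why a high load factor hurts, the sketch below computes the load factor and the classical approximations (due to Knuth) for the expected number of probes under linear probing. These formulas are not from the slides and assume a uniformly scattering hash function. The class and method names are hypothetical.

```java
// Sketch: load factor and approximate probe counts for linear probing.
final class LoadFactor {
    static double loadFactor(int entries, int tableSize) {
        return (double) entries / tableSize;
    }

    // Approximate probes for a successful search: (1 + 1/(1 - a)) / 2.
    static double expectedProbesHit(double alpha) {
        return 0.5 * (1 + 1 / (1 - alpha));
    }

    // Approximate probes for an unsuccessful search or an insertion:
    // (1 + 1/(1 - a)^2) / 2.
    static double expectedProbesMiss(double alpha) {
        return 0.5 * (1 + 1 / ((1 - alpha) * (1 - alpha)));
    }

    public static void main(String[] args) {
        for (double alpha : new double[] {0.5, 0.75, 0.9, 0.95}) {
            System.out.printf("load %.2f: hit ~%.1f probes, miss ~%.1f probes%n",
                    alpha, expectedProbesHit(alpha), expectedProbesMiss(alpha));
        }
    }
}
```

Under these approximations, a half-full table needs about 1.5 probes per successful search, while a 90% full table needs about 5.5, and unsuccessful searches degrade much faster.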

Chaining (Bucket Hashing)
- Approach
  - Hash table contains lists of objects
  - Find location for X
    - Find hash code key for X
    - Examine list at table entry A[ key ]
  - Collision
    - Multiple entries in list for entry

Chaining Example
- Hash codes
  - H(A) = 6, H(C) = 6
  - H(B) = 7, H(D) = 7
- Hash table
  - Size = 8 elements
  - _ = empty entry

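Chaining Example
- Operations
  - Insert A, Insert B, Insert C
- Non-empty lists after each operation
  - Insert A: entry 6 holds A
  - Insert B: entry 6 holds A; entry 7 holds B
  - Insert C: entry 6 holds C, A; entry 7 holds B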

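Chaining Example
- Operations
  - Find B, Find A
- Non-empty lists after each operation (the table is unchanged)
  - Find B: entry 6 holds C, A; entry 7 holds B
  - Find A: entry 6 holds C, A; entry 7 holds B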
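The chaining example can be sketched in Java with one linked list per table entry. The names `ChainedSet`, `exampleHash`, and `bucket` are hypothetical. Inserting new items at the head of the list is an assumption made so that C ends up ahead of A, as the slide's drawing appears to show.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.function.ToIntFunction;

// Sketch of chaining (bucket hashing): each table entry holds a list.
final class ChainedSet<T> {
    private final List<LinkedList<T>> buckets = new ArrayList<>();
    private final ToIntFunction<T> hash; // supplies the hash code for a key

    ChainedSet(int size, ToIntFunction<T> hash) {
        for (int i = 0; i < size; i++) buckets.add(new LinkedList<>());
        this.hash = hash;
    }

    // Hash codes from the example: H(A) = H(C) = 6, H(B) = H(D) = 7.
    static int exampleHash(String key) {
        return (key.equals("A") || key.equals("C")) ? 6 : 7;
    }

    private LinkedList<T> bucketFor(T x) {
        return buckets.get(Math.floorMod(hash.applyAsInt(x), buckets.size()));
    }

    // Only the one list selected by the hash code is searched.
    boolean contains(T x) {
        return bucketFor(x).contains(x);
    }

    // Assumption: new items go at the head of the list.
    boolean insert(T x) {
        LinkedList<T> bucket = bucketFor(x);
        if (bucket.contains(x)) return false;
        bucket.addFirst(x);
        return true;
    }

    boolean delete(T x) {
        return bucketFor(x).remove(x);
    }

    // Returns a copy of the list stored at table entry i.
    List<T> bucket(int i) {
        return new ArrayList<>(buckets.get(i));
    }

    public static void main(String[] args) {
        ChainedSet<String> s = new ChainedSet<>(8, ChainedSet::exampleHash);
        s.insert("A");
        s.insert("B");
        s.insert("C");
        System.out.println(s.bucket(6)); // [C, A]
        System.out.println(s.bucket(7)); // [B]
    }
}
```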

Efficiency of Chaining
- Load factor = entries / table size
- Average case
  - Evenly scattered entries
  - Operations = O( load factor )
- Worst case
  - Entries mostly have same hash value
  - Operations = O( entries )

Hashing in Java
- Collections
  - HashMap & HashSet implement hashing
- Objects
  - Built-in support for hashing
    - boolean equals(Object o)
    - int hashCode()
  - Can override with own definitions
  - Must be careful to support Java contract

Java Contract
- hashCode()
  - Must return the same value for an object within a single execution of the program, provided no information used in equals comparisons on the object is modified
- equals()
  - if a.equals(b), then a.hashCode() must be the same as b.hashCode()
  - if a.hashCode() != b.hashCode(), then !a.equals(b)
- a.hashCode() == b.hashCode()
  - Does not imply a.equals(b)
  - Though Java libraries will be more efficient if unequal objects have different hash codes
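A minimal sketch of a value class that respects this contract. The class `Point` is hypothetical. Both methods are derived from the same fields (`x` and `y`), so equal objects always produce equal hash codes.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Sketch: overriding equals and hashCode together so hash-based
// collections treat equal points as the same element.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    // Uses exactly the fields compared in equals, so a.equals(b)
    // implies a.hashCode() == b.hashCode().
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }

    public static void main(String[] args) {
        Set<Point> points = new HashSet<>();
        points.add(new Point(1, 2));
        System.out.println(points.contains(new Point(1, 2))); // true
        System.out.println(points.contains(new Point(2, 1))); // false
    }
}
```

If `hashCode()` were left at its default while `equals` was overridden, the first `contains` call would most likely return false, because the lookup would usually examine a different table entry.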