Week 8 - Wednesday CS221.



Last time
What did we talk about last time? Hash tables.

Questions?

Project 3

What are we looking for in a hash table?
We want a function that will map data to buckets in our hash table. Important characteristics:
Efficient: It must be quick to execute.
Deterministic: The same data must always map to the same bucket.
Uniform: Data should be mapped evenly across all buckets.

Student Lecture: Collisions

The real problem with hash tables
What happens when you go to put a value in a bucket and one is already there? There are a couple of basic strategies: open addressing and chaining.
Load factor is the number of items divided by the number of buckets: 0 is an empty hash table, 0.5 is a half-full hash table, and 1 is a completely full hash table.

Open Addressing
With open addressing, we look for some empty spot in the hash table to put the item. There are a few common strategies: linear probing, quadratic probing, and double hashing.

Linear probing
With linear probing, you add a step size until you reach an empty location or visit the entire hash table.
Example: Add 6 with a step size of 5.
[Figure: hash table with slots 0 through 12. Values in slot order before the insert: 104, 3, 19, 7, 89. After the insert: 104, 3, 19, 7, 6, 89.]
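The linear probing example can be sketched in code. This is a minimal sketch that assumes the table has 13 slots and uses h(k) = k mod 13; the slide does not state either, but both are consistent with the order of the values in the figure. The class name, method names, and the EMPTY sentinel are hypothetical.

```java
import java.util.Arrays;

final class LinearProbingDemo {
    static final int EMPTY = -1; // sentinel for an unused slot (assumes non-negative keys)

    // Inserts key and returns the slot used, or -1 if all n probes hit occupied slots.
    static int insert(int[] table, int key, int step) {
        int n = table.length;
        int slot = key % n; // assumed hash function: h(k) = k mod n
        for (int i = 0; i < n; i++) {
            if (table[slot] == EMPTY) {
                table[slot] = key;
                return slot;
            }
            slot = (slot + step) % n; // move ahead by the fixed step, wrapping around
        }
        return -1;
    }

    // Builds the assumed starting table: 104, 3, 19, 7, 89 placed at k mod 13.
    static int[] exampleTable() {
        int[] table = new int[13];
        Arrays.fill(table, EMPTY);
        for (int k : new int[] {104, 3, 19, 7, 89}) {
            table[k % 13] = k; // these five keys do not collide with each other
        }
        return table;
    }

    public static void main(String[] args) {
        int[] table = exampleTable();
        // Probes slots 6, 11, and 3 (all occupied) before finding slot 8 free.
        System.out.println(insert(table, 6, 5));
    }
}
```

The n probes only cover the whole table when the step size and the table size share no common factor, as 5 and 13 do here.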

Quadratic probing
For quadratic probing, use a quadratic function to try new locations: $h(k, i) = h(k) + c_1 i + c_2 i^2$, for $i = 0, 1, 2, 3, \ldots$
Example: Add 6 with $c_1 = 0$ and $c_2 = 1$.
[Figure: hash table with slots 0 through 12. Values in slot order before the insert: 104, 3, 19, 7, 89. After the insert: 104, 3, 19, 7, 6, 89.]
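A minimal sketch of the quadratic probing example, again assuming a 13-slot table with h(k) = k mod 13 (an assumption that matches the figure but is not stated on the slide). The probe position is reduced mod the table size, which the formula leaves implicit. All names are hypothetical.

```java
import java.util.Arrays;

final class QuadraticProbingDemo {
    static final int EMPTY = -1; // sentinel for an unused slot (assumes non-negative keys)

    // Tries slots h(k) + c1*i + c2*i^2 (mod n) for i = 0, 1, 2, ...
    // Returns the slot used, or -1 if n attempts all hit occupied slots.
    static int insert(int[] table, int key, int c1, int c2) {
        int n = table.length;
        int home = key % n; // assumed hash function: h(k) = k mod n
        for (int i = 0; i < n; i++) {
            int slot = (home + c1 * i + c2 * i * i) % n;
            if (table[slot] == EMPTY) {
                table[slot] = key;
                return slot;
            }
        }
        return -1;
    }

    // Builds the assumed starting table: 104, 3, 19, 7, 89 placed at k mod 13.
    static int[] exampleTable() {
        int[] table = new int[13];
        Arrays.fill(table, EMPTY);
        for (int k : new int[] {104, 3, 19, 7, 89}) {
            table[k % 13] = k;
        }
        return table;
    }

    public static void main(String[] args) {
        int[] table = exampleTable();
        // i = 0 gives slot 6 (occupied), i = 1 gives slot 7 (occupied), i = 2 gives slot 10 (free).
        System.out.println(insert(table, 6, 0, 1));
    }
}
```

Unlike a step size that is coprime to the table size, a quadratic sequence is not guaranteed to reach every slot, so an insert can fail while free slots remain.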

Double hashing
For double hashing, do linear probing, but with a step size dependent on the data: $h(k, i) = h_1(k) + i \cdot h_2(k)$, for $i = 0, 1, 2, 3, \ldots$
Example: Add 6 with $h_2(k) = (k \bmod 7) + 1$.
[Figure: hash table with slots 0 through 12. Values in slot order before the insert: 104, 3, 19, 7, 89. After the insert: 104, 6, 3, 19, 7, 89.]
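A minimal sketch of the double hashing example. The second hash h2(k) = (k mod 7) + 1 comes from the slide; the 13-slot table and h1(k) = k mod 13 are assumptions that match the figure. All names are hypothetical.

```java
import java.util.Arrays;

final class DoubleHashingDemo {
    static final int EMPTY = -1; // sentinel for an unused slot (assumes non-negative keys)

    // Tries slots h1(k) + i*h2(k) (mod n) for i = 0, 1, 2, ...
    // Returns the slot used, or -1 if n attempts all hit occupied slots.
    static int insert(int[] table, int key) {
        int n = table.length;
        int home = key % n;       // assumed first hash: h1(k) = k mod n
        int step = (key % 7) + 1; // second hash from the slide; never 0
        for (int i = 0; i < n; i++) {
            int slot = (home + i * step) % n;
            if (table[slot] == EMPTY) {
                table[slot] = key;
                return slot;
            }
        }
        return -1;
    }

    // Builds the assumed starting table: 104, 3, 19, 7, 89 placed at k mod 13.
    static int[] exampleTable() {
        int[] table = new int[13];
        Arrays.fill(table, EMPTY);
        for (int k : new int[] {104, 3, 19, 7, 89}) {
            table[k % 13] = k;
        }
        return table;
    }

    public static void main(String[] args) {
        int[] table = exampleTable();
        // The step for key 6 is 7, so the probes are slots 6, 0, 7 (all occupied), then 1 (free).
        System.out.println(insert(table, 6));
    }
}
```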

Open addressing pros and cons
Open addressing schemes are fast and relatively simple. Linear and quadratic probing can have clustering problems: one collision means more are likely to happen. Double hashing has poor data locality. It is impossible to have more items than there are buckets. Performance degrades seriously with load factors over 0.7.

Chaining
Make each hash table entry a linked list. If you want to insert something at a location, simply insert it into the linked list. This is the most common kind of hash table. Chaining can behave well even if the load factor is greater than 1. Chaining is sensitive to bad hash functions: there is no advantage if every item is hashed to the same location.
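The last two points can be illustrated with a small experiment. This is a sketch, not the lecture's code: it stores 100 keys in 10 chained buckets (load factor 10) and compares the longest chain under an even hash with the longest chain under a hash that sends every key to bucket 0. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.function.IntUnaryOperator;

final class ChainLengthDemo {
    // Inserts every key into a chained table and returns the length of the longest chain.
    static int longestChain(int[] keys, int buckets, IntUnaryOperator hash) {
        List<LinkedList<Integer>> table = new ArrayList<>();
        for (int i = 0; i < buckets; i++) {
            table.add(new LinkedList<>());
        }
        for (int key : keys) {
            int index = Math.floorMod(hash.applyAsInt(key), buckets);
            table.get(index).add(key); // a collision just extends the chain
        }
        int longest = 0;
        for (LinkedList<Integer> chain : table) {
            longest = Math.max(longest, chain.size());
        }
        return longest;
    }

    // Returns the keys 0, 1, ..., count - 1.
    static int[] firstKeys(int count) {
        int[] keys = new int[count];
        for (int i = 0; i < count; i++) {
            keys[i] = i;
        }
        return keys;
    }

    public static void main(String[] args) {
        int[] keys = firstKeys(100);
        System.out.println(longestChain(keys, 10, k -> k)); // even spread across buckets
        System.out.println(longestChain(keys, 10, k -> 0)); // every key in one chain
    }
}
```

With the even hash each lookup scans at most 10 nodes; with the constant hash the table degenerates into a single 100-node linked list.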

Deletion
Deletion can be a huge problem. It is easy for chaining but highly non-trivial for linear probing. Consider our example with a step size of 5: delete 19, then see if 6 exists.
[Figure: hash table with slots 0 through 12. Values in slot order: 104, 3, 19, 7, 6, 89.]
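The deletion problem can be reproduced in a few lines. The sketch below assumes the same 13-slot table with h(k) = k mod 13 as the linear probing example, with 6 stored in slot 8. If deleting 19 simply empties slot 6, a search for 6 starts at slot 6, sees an empty slot, and stops too early. Marking the slot with a tombstone is one common remedy; the slide does not name a fix, so treat that part as an illustration. All names are hypothetical.

```java
import java.util.Arrays;

final class ProbingDeletionDemo {
    static final int EMPTY = -1;   // slot that has never held a key
    static final int DELETED = -2; // tombstone: slot held a key that was removed

    // Follows the probe sequence until the key or a truly empty slot is found.
    static boolean contains(int[] table, int key, int step) {
        int n = table.length;
        int slot = key % n; // assumed hash function: h(k) = k mod n
        for (int i = 0; i < n; i++) {
            if (table[slot] == EMPTY) return false; // an empty slot ends the search
            if (table[slot] == key) return true;
            slot = (slot + step) % n; // tombstones and other keys are skipped
        }
        return false;
    }

    // Finds key and overwrites its slot with marker (EMPTY or DELETED).
    static boolean remove(int[] table, int key, int step, int marker) {
        int n = table.length;
        int slot = key % n;
        for (int i = 0; i < n; i++) {
            if (table[slot] == EMPTY) return false;
            if (table[slot] == key) {
                table[slot] = marker;
                return true;
            }
            slot = (slot + step) % n;
        }
        return false;
    }

    // Assumed table after the linear probing insert: 6 ended up in slot 8.
    static int[] exampleTable() {
        int[] table = new int[13];
        Arrays.fill(table, EMPTY);
        for (int k : new int[] {104, 3, 19, 7, 89}) {
            table[k % 13] = k;
        }
        table[8] = 6;
        return table;
    }

    public static void main(String[] args) {
        int[] naive = exampleTable();
        remove(naive, 19, 5, EMPTY);
        System.out.println(contains(naive, 6, 5)); // 6 is still stored but no longer found

        int[] marked = exampleTable();
        remove(marked, 19, 5, DELETED);
        System.out.println(contains(marked, 6, 5)); // the tombstone keeps the probe going
    }
}
```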

Perfect Hash Functions
If you know all the values you are going to see ahead of time, it is possible to create a minimal perfect hash function. A minimal perfect hash function will hash every value without collisions and fill your hash table. Cichelli's method and the FHCD algorithm are two ways to do it. Both are complex. Look them up if you find yourself in this situation.

Hash Table Implementation

Recall: Symbol table ADT
We can define a symbol table ADT with a few essential operations:
put(Key key, Value value): Put the key-value pair into the table
get(Key key): Retrieve the value associated with key
delete(Key key): Remove the value associated with key
contains(Key key): See if the table contains a key
isEmpty()
size()
It's also useful to be able to iterate over all keys.
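The operations listed above can be written down as a Java interface. This is a sketch under assumptions: the slide gives no return types, so the ones here are guesses, and keys() is a hypothetical name for the iteration operation. MapBackedTable is only a stand-in built on java.util.HashMap so that the interface can be exercised; it is not the lecture's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface mirroring the ADT operations on the slide.
interface SymbolTable<Key, Value> {
    void put(Key key, Value value); // add or replace the pair
    Value get(Key key);             // null if the key is absent
    void delete(Key key);           // no effect if the key is absent
    boolean contains(Key key);
    boolean isEmpty();
    int size();
    Iterable<Key> keys();           // iterate over all keys
}

// Stand-in implementation that delegates every operation to a HashMap.
final class MapBackedTable<Key, Value> implements SymbolTable<Key, Value> {
    private final Map<Key, Value> map = new HashMap<>();

    public void put(Key key, Value value) { map.put(key, value); }
    public Value get(Key key) { return map.get(key); }
    public void delete(Key key) { map.remove(key); }
    public boolean contains(Key key) { return map.containsKey(key); }
    public boolean isEmpty() { return map.isEmpty(); }
    public int size() { return map.size(); }
    public Iterable<Key> keys() { return map.keySet(); }
}
```

A caller would write, for example, `SymbolTable<String, Integer> table = new MapBackedTable<>();` and then use only the interface methods.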

Chaining hash table
public class HashTable {
    private int size = 0;
    private int power = 10;
    private Node[] table = new Node[1 << power];
    private static class Node {
        public int key;
        public Object value;
        public Node next;
    }
    …

Easy methods
Get the number of elements stored in the hash table: public int size()
Say whether or not the hash table is empty: public boolean isEmpty()

Hashing function
It's useful to have a function that finds the appropriate hash value. Take the input integer and swap the low-order 16 bits and the high-order 16 bits (in case the number is small). Square the number. Use shifting to get the middle power bits, where power is the field that sets the table size to 1 << power.
public int hash(int key)
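One way to read these three steps is sketched below. It is an interpretation, not the lecture's code: the slide does not say how wide the squared value is or exactly which bits count as the middle. This version squares in 64-bit arithmetic, because the 32-bit square of a swapped small key (a multiple of 2^16) would overflow to 0, and then takes the power bits centered in the 64-bit product. The class name and the extra power parameter are hypothetical.

```java
final class MidSquareHash {
    // Returns a bucket index in the range [0, 2^power).
    static int hash(int key, int power) {
        // Step 1: swap the low-order and high-order 16 bits.
        int swapped = (key << 16) | (key >>> 16);

        // Step 2: square, treating the swapped value as unsigned and widening to
        // 64 bits so the product's bit pattern is kept in full.
        long wide = Integer.toUnsignedLong(swapped);
        long squared = wide * wide;

        // Step 3: shift the middle `power` bits down to the bottom and mask them off.
        int shift = (64 - power) / 2;
        long mask = (1L << power) - 1;
        return (int) ((squared >>> shift) & mask);
    }

    public static void main(String[] args) {
        for (int key : new int[] {1, 2, 3, 1000, -5}) {
            System.out.println(key + " -> " + hash(key, 10));
        }
    }
}
```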

Contains (chaining)
If the hash table contains the given key, return true. Otherwise, return false.
public boolean contains(int key)

Get (chaining)
Return the object with the given key. If none is found, return null.
public Object get(int key)

Put (chaining)
If the load factor is above 0.75, double the capacity of the hash table, rehashing all current elements. Then, try to add the given key and value. If the key already exists, update its value and return false. Otherwise, add the new key and value and return true.
public boolean put(int key, Object value)
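Putting the pieces together, the following is a minimal sketch of a chaining hash table with the method signatures from the slides. The method bodies are assumptions, since the slides show only signatures. In particular, hash uses a 64-bit mid-square interpretation of the hashing steps, and find, resize, and capacity are hypothetical helpers. The class is named ChainedHashTable to keep it distinct from the lecture's HashTable.

```java
final class ChainedHashTable {
    private static class Node {
        final int key;
        Object value;
        Node next;

        Node(int key, Object value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private int size = 0;
    private int power = 10;
    private Node[] table = new Node[1 << power];

    public int size() { return size; }
    public boolean isEmpty() { return size == 0; }
    public int capacity() { return table.length; } // hypothetical helper for inspection

    // Assumed mid-square hash: swap 16-bit halves, square in 64 bits, keep the middle bits.
    private int hash(int key) {
        long swapped = Integer.toUnsignedLong((key << 16) | (key >>> 16));
        long squared = swapped * swapped;
        return (int) ((squared >>> ((64 - power) / 2)) & ((1L << power) - 1));
    }

    // Walks the chain for key's bucket and returns its node, or null if absent.
    private Node find(int key) {
        for (Node node = table[hash(key)]; node != null; node = node.next) {
            if (node.key == key) return node;
        }
        return null;
    }

    public boolean contains(int key) { return find(key) != null; }

    public Object get(int key) {
        Node node = find(key);
        return node == null ? null : node.value;
    }

    public boolean put(int key, Object value) {
        if ((double) size / table.length > 0.75) resize();
        Node existing = find(key);
        if (existing != null) {
            existing.value = value; // key already present: update and report false
            return false;
        }
        int index = hash(key);
        table[index] = new Node(key, value, table[index]); // new node goes at the chain's head
        size++;
        return true;
    }

    // Doubles the capacity and relinks every node into its new bucket.
    private void resize() {
        Node[] old = table;
        power++;
        table = new Node[1 << power];
        for (Node head : old) {
            Node node = head;
            while (node != null) {
                Node next = node.next;
                int index = hash(node.key); // uses the new power
                node.next = table[index];
                table[index] = node;
                node = next;
            }
        }
    }
}
```

Inserting at the head of the chain keeps put constant-time once the duplicate check is done, and reusing the existing nodes during resize avoids reallocating them.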

Quiz

Upcoming

Next time…
Hash table time trials. Map in the JCF: HashMap and TreeMap. Introduction to graphs.

Reminders
Keep working on Project 3. Keep reading 3.3. Read 4.1.