
Presentation transcript:

CSC 213 – Large Scale Programming

Today’s Goal

• Review when, where, & why we use Maps
• Why Sequence-based approach causes problems
• How hash can help solve these problems
• What is inappropriate and incorrect about hash jokes
• Discover hash’s problems & what must be done
• What would happen if keys hashed to same index
• Ways of handling situation so that hash still works
• To remove data, using null may not be best option
• Dark secrets of hashing, exposed at lecture’s end

Map Performance
• In many situations, speed can be a matter of life or death
• 911 operators immediately need addresses
• Google’s search performance in TB/s
• O(log n) time too slow for these uses
• Would love to use arrays
• Convert key to int with hash function
• With result of hash, have index in table to examine
• put, remove & get take only O(1) time
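As a minimal sketch of the "convert key to int, then use it as an index" idea, the Java snippet below compresses a key's hashCode() into a table index. The class name HashIndexDemo, the method indexFor, and the table size 13 are illustrative assumptions, not part of the lecture.

```java
// Hypothetical helper: the names below are illustrative, not from the lecture.
class HashIndexDemo {
    /** Compresses any key's hash code into a valid index in [0, capacity). */
    static int indexFor(Object key, int capacity) {
        // hashCode() may be negative; floorMod always gives a non-negative result
        return Math.floorMod(key.hashCode(), capacity);
    }

    public static void main(String[] args) {
        int capacity = 13; // assumed table size
        for (String name : new String[] {"Jay Doe", "Bob Doe", "Jill Roe", "Rhi Smith"}) {
            System.out.println(name + " -> index " + indexFor(name, capacity));
        }
    }
}
```

Computing the index takes time independent of how many Entrys the table holds, which is where the O(1) hope comes from.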

Hash Table
[Figure: hash table of Entrys; labels shown: “Jay Doe”, “Bob Doe”, “Jill Roe”, “Rhi Smith”, 9999]

Ideal World
• key hashed to unique index
• Hash and done, Entry is there
And then… You wake up

Collisions
• Occurs when 2 keys hash to same index
• Ideal hash spreads keys out evenly across table
• As nice side effect, this limits collisions
• Small table size important also, since RAM limited
• Unfortunately, no such thing as ideal hash
• Must handle collisions to get O(1) efficiency buzz

Bad Hash
• Perfect hash does not exist
• Cannot know all keys beforehand
• Keys may be clustered around a few indices
• Or find all keys hashed to same index
• Handling bad hash is a necessity
• Even given Entry, always check key
• Store multiple Entrys with same hash
• (Shot of adrenaline restarts heart)

Bucket Arrays
• Make hash table an array of linked list Nodes
• First node aliased by the array location
• Whenever we have collision, we “chain” Entrys
• Create new Node to store the Entry
• The linked list will have new Node at its front
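The chaining steps above could be sketched roughly as follows. This is an illustration under assumptions, not the course's implementation: the class ChainedMap, its Node type, and the method names are hypothetical, and keys are assumed to be non-null.

```java
// Sketch of a bucket array with chaining; all names here are illustrative.
class ChainedMap<K, V> {
    /** Linked-list node holding one Entry (key/value pair). */
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] buckets; // each array location aliases the first node

    @SuppressWarnings("unchecked")
    ChainedMap(int capacity) {
        buckets = (Node<K, V>[]) new Node[capacity];
    }

    private int indexFor(K key) {
        return Math.floorMod(key.hashCode(), buckets.length);
    }

    /** Adds or replaces a mapping; returns the previous value, or null if none. */
    V put(K key, V value) {
        int i = indexFor(key);
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (n.key.equals(key)) { // always check the key, not just the index
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        buckets[i] = new Node<>(key, value, buckets[i]); // new Node goes at the front
        return null;
    }

    /** Returns the value mapped to key, or null if the key is absent. */
    V get(K key) {
        for (Node<K, V> n = buckets[indexFor(key)]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }
}
```

With 13 buckets, the Integer keys 44 and 31 both land in bucket 5 and simply share one list, so neither is lost.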

Bucket Arrays
• But what if have really bad hash?
• Hashes to same index in every situation
• All Entrys now found in single linked list
• O(n) execution times would now be required
• (Also get bad case of the munchies)

Collisions

Linear Probing
• Musical chairs uses this algorithm
• At index where key hashed examine Entry
• Circle through array until empty index found
• Algorithm is very simple
• But creates clusters of Entrys

Linear Probe Example
h(x) = x mod 13
Now add:
• 44: h(44) = 5
• 20: h(20) = 7
• 22: h(22) = 9
• 31: h(31) = 5
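A small sketch of linear probing with h(x) = x mod 13 follows. It assumes the table starts empty, which may differ from the table drawn on the original slide, and the class LinearProbeTable and its add method are illustrative names.

```java
// Linear probing sketch using h(x) = x mod 13; names are illustrative.
class LinearProbeTable {
    private final Integer[] slots = new Integer[13]; // null means "empty"

    static int h(int x) {
        return Math.floorMod(x, 13);
    }

    /** Inserts key and returns the index used, or -1 if the table is full. */
    int add(int key) {
        int start = h(key);
        for (int step = 0; step < slots.length; step++) {
            int i = (start + step) % slots.length; // circle through the array
            if (slots[i] == null || slots[i] == key) {
                slots[i] = key;
                return i;
            }
        }
        return -1; // probed every index without finding room
    }

    public static void main(String[] args) {
        LinearProbeTable t = new LinearProbeTable();
        for (int key : new int[] {44, 20, 22, 31}) {
            System.out.println(key + " stored at index " + t.add(key));
        }
        // Starting from an empty table: 44 -> 5, 20 -> 7, 22 -> 9, 31 -> 6
    }
}
```

Key 31 collides with 44 at index 5 and settles in the next free index, which is how clusters begin to form.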

Probing Reaction
• Oh, ****
• Adding to hash table still O(n)

Quadratic Probe

Quadratic Probe Example
h(x) = x mod 13
Now add:
• 44: h(44) = 5
• 20: h(20) = 7
• 22: h(22) = 9
• 31: h(31) = 5
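The slide's own probe formula was not recovered, so the sketch below assumes the common form in which the i-th probe examines index (h(x) + i²) mod 13. It also assumes an empty starting table; QuadraticProbeTable and add are illustrative names.

```java
// Quadratic probing sketch; the probe formula (h + i*i) mod N is an assumption.
class QuadraticProbeTable {
    private final Integer[] slots = new Integer[13]; // null means "empty"

    static int h(int x) {
        return Math.floorMod(x, 13);
    }

    /** Inserts key and returns the index used, or -1 if no probe found room. */
    int add(int key) {
        int start = h(key);
        for (int step = 0; step < slots.length; step++) {
            int i = (start + step * step) % slots.length; // jump 0, 1, 4, 9, ... past start
            if (slots[i] == null || slots[i] == key) {
                slots[i] = key;
                return i;
            }
        }
        return -1; // the probe sequence may miss free indices entirely
    }

    public static void main(String[] args) {
        QuadraticProbeTable t = new QuadraticProbeTable();
        for (int key : new int[] {44, 20, 22, 31}) {
            System.out.println(key + " stored at index " + t.add(key));
        }
        // Starting from an empty table: 44 -> 5, 20 -> 7, 22 -> 9, 31 -> 6
    }
}
```

Unlike linear probing, the growing jumps spread colliding keys apart, but the sequence does not visit every index, so an insert can fail even when free slots remain.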

Quadratic Probing Reaction
• Darn it to heck.
• Adding to hash table still O(n)

Double Hashing
• Solve bad hash with even more hash
• Use 2nd hash function very different from first
• 2nd hash function not allowed to return zero
• Re-hash key using 2nd function after the collision
• Check index equal to sum of two hash functions
• Re-add 2nd hash to this sum to continue probing
• Still must circle around the array
• Guaranteed to work when table size is prime number

Double Hash Example
h(x) = x mod 13
h2(x) = 5 - (x mod 5)
Now add:
• 44: h(44) = 5
• 20: h(20) = 7
• 22: h(22) = 9
• 31: h(31) = 5
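The example's two functions can be tried out with a sketch like the one below. It assumes the table starts empty, which may not match the table drawn on the original slide; DoubleHashTable and add are illustrative names.

```java
// Double hashing sketch with h(x) = x mod 13 and h2(x) = 5 - (x mod 5).
class DoubleHashTable {
    private final Integer[] slots = new Integer[13]; // prime size; null means "empty"

    static int h(int x) {
        return Math.floorMod(x, 13);
    }

    static int h2(int x) {
        return 5 - Math.floorMod(x, 5); // always in 1..5, so never zero
    }

    /** Inserts key and returns the index used, or -1 if every probe was occupied. */
    int add(int key) {
        int i = h(key);
        int stride = h2(key); // computed once, then re-added after each collision
        for (int attempts = 0; attempts < slots.length; attempts++) {
            if (slots[i] == null || slots[i] == key) {
                slots[i] = key;
                return i;
            }
            i = (i + stride) % slots.length; // sum of the hashes, wrapped around
        }
        return -1;
    }

    public static void main(String[] args) {
        DoubleHashTable t = new DoubleHashTable();
        for (int key : new int[] {44, 20, 22, 31}) {
            System.out.println(key + " stored at index " + t.add(key));
        }
        // Starting from an empty table: 44 -> 5, 20 -> 7, 22 -> 9, 31 -> 0
    }
}
```

For key 31, h2(31) = 4, so the probes visit 5, then 9, then (9 + 4) mod 13 = 0. Because 13 is prime, any stride from 1 to 5 eventually reaches every index.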

Double Probing Reaction
• Sweet!
• Double hashing keeps put O(n)

Probing and Searching
• Search index where key hashed
• If cannot place Entry at index
• The array must keep being probed
• Stop only at usable index
• May need to probe every index!
• Searching takes O(n) even with hash
• May need to reallocate & rehash table
• Worst case O(n) put even with perfect hash
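One way the "reallocate & rehash" step might look is sketched below. The growth policy (keep the table at most half full, grow to 2N + 1) and every name in GrowingProbeTable are assumptions chosen for illustration; the lecture does not specify them.

```java
// Sketch of reallocating and rehashing a linear-probe table; policy is assumed.
class GrowingProbeTable {
    private Integer[] slots = new Integer[13];
    private int size = 0;

    int capacity() {
        return slots.length;
    }

    /** Probes from the hashed index until the key or an empty index is found. */
    boolean contains(int key) {
        int start = Math.floorMod(key, slots.length);
        for (int step = 0; step < slots.length; step++) {
            int i = (start + step) % slots.length;
            if (slots[i] == null) return false;
            if (slots[i] == key) return true;
        }
        return false;
    }

    void add(int key) {
        if (contains(key)) return;
        if (2 * (size + 1) > slots.length) {
            rehash(); // O(n): this is why put can be slow even with a perfect hash
        }
        place(slots, key);
        size++;
    }

    /** Every key must be re-placed, since its index depends on the table length. */
    private void rehash() {
        Integer[] bigger = new Integer[2 * slots.length + 1];
        for (Integer k : slots) {
            if (k != null) place(bigger, k);
        }
        slots = bigger;
    }

    // Safe from infinite loops because add() keeps the table at most half full.
    private static void place(Integer[] table, int key) {
        int i = Math.floorMod(key, table.length);
        while (table[i] != null) {
            i = (i + 1) % table.length;
        }
        table[i] = key;
    }
}
```

Copying the old array over unchanged would not work: a key stored at x mod 13 generally belongs at a different index once the length changes.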

Post-Removal Operations
• What happens when we remove an Entry?
• Set index to null in most structures
• Consider if we call remove(44)
• get(31) called, what would happen?
• First check index it is hashed to
• Checks first probe index… & stops at null

* Marker Value Explained
• Mark cleared indices in hash table
• Since collision could have happened, continue search
• Index can be used to store new Entry
• Ways to show that array index is clear
• Entry with null key could be used if one is careful
• Could try and make key which is never used
• Use static final field of type Entry
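The last option, a static final Entry used as the marker, might be sketched as follows on top of linear probing. The names MarkerTable, Entry, and DEFUNCT are hypothetical, and the table is assumed to start empty.

```java
// Marker-value sketch with linear probing; names are illustrative.
class MarkerTable {
    static final class Entry {
        final int key;

        Entry(int key) {
            this.key = key;
        }
    }

    // Sentinel marking a cleared index; compared by reference (==), never by key
    private static final Entry DEFUNCT = new Entry(-1);

    private final Entry[] slots = new Entry[13];

    private static int h(int x) {
        return Math.floorMod(x, 13);
    }

    /** Returns the index holding key, or -1 if the key is absent. */
    int find(int key) {
        int start = h(key);
        for (int step = 0; step < slots.length; step++) {
            int i = (start + step) % slots.length;
            if (slots[i] == null) return -1; // never used: key cannot be further along
            if (slots[i] != DEFUNCT && slots[i].key == key) return i;
            // Marker or other key: a collision may have pushed key onward, keep probing
        }
        return -1;
    }

    /** Inserts key (reusing cleared indices) and returns its index, or -1 if full. */
    int add(int key) {
        int existing = find(key);
        if (existing >= 0) return existing;
        int start = h(key);
        for (int step = 0; step < slots.length; step++) {
            int i = (start + step) % slots.length;
            if (slots[i] == null || slots[i] == DEFUNCT) {
                slots[i] = new Entry(key);
                return i;
            }
        }
        return -1;
    }

    /** Replaces the Entry with the marker instead of null so later probes continue. */
    boolean remove(int key) {
        int i = find(key);
        if (i < 0) return false;
        slots[i] = DEFUNCT;
        return true;
    }
}
```

After add(44), add(31), and remove(44), a call to find(31) steps past the marker at index 5 and still locates 31 at index 6; with null in that index the search would have stopped early.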

Why Use Hash Table & Probes?
• Hash tables can require O(n) complexity
• Provide O(1) time if you are really good
• Ultimately depends on hash function used
• Choose wisely and be rich

Before Next Lecture…
• Get updated lab project into SVN directory
• No need to , I will collect directories at 5PM
• Finish working on week #4 assignment
• Due at usual time tomorrow afternoon/evening
• Start thinking of your design for the project
• Due Friday a preliminary copy of this design
• Read sections & of the book
• What should we do if many values for 1 key?