COMP 103 Hashing (II), and exam tips 2014-T2 Lecture 33 Marcus Frean School of Engineering and Computer Science, Victoria University of Wellington  Marcus.

Presentation transcript:

COMP 103 Hashing (II), and exam tips 2014-T2 Lecture 33 Marcus Frean School of Engineering and Computer Science, Victoria University of Wellington  Marcus Frean, Lindsay Groves, Peter Andreae and Thomas Kuehne, VUW

2 RECAP-TODAY
RECAP
• Hashing with “buckets”
TODAY
• Hashing by “probing”
• the exam

3 Collisions: chaining / buckets
• Store a Set in each cell: hash value → which set
[Figure: hash table whose cells each hold a set of words: ant, fox, hen, dog, bee, kea, cow, elk, owl, pig, sow, tui, ape, bat, bug, cat, eel, gnu, jay, nit, ray, yak, cod, roe]
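The bucket idea can be sketched in a few lines of Java. This is only an illustration, not the course's implementation: the class name `BucketSet`, the fixed number of buckets, and the use of a plain list as each bucket are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a set of strings that resolves collisions with buckets (chaining). */
class BucketSet {
    // One small collection per cell; hypothetical layout chosen for the example.
    private final List<List<String>> buckets = new ArrayList<>();

    BucketSet(int numBuckets) {
        for (int i = 0; i < numBuckets; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // The hash value decides which bucket an item belongs in.
    private List<String> bucketFor(String item) {
        int index = Math.floorMod(item.hashCode(), buckets.size());
        return buckets.get(index);
    }

    /** Adds the item; returns false if it was already present. */
    boolean add(String item) {
        List<String> bucket = bucketFor(item);
        if (bucket.contains(item)) {
            return false;
        }
        bucket.add(item);
        return true;
    }

    boolean contains(String item) {
        return bucketFor(item).contains(item);
    }

    boolean remove(String item) {
        return bucketFor(item).remove(item);
    }
}
```

Items that collide simply share a bucket, so no other cell is ever disturbed; the cost of each operation is the cost of searching one bucket.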

4 Dealing with Collisions
• Two approaches:
  • Use a collection at each place (“buckets” or “chaining”)
  • Look for an empty place in the hashtable (“probing” or “open addressing”)
[Figure: the strings “2001 – A Space Odyssey” and “Gravity” are each passed through HASH to select a cell in the array]

5 Linear Probing
Hash value tells us where to start looking.
• if value.hashCode() → p, start at index p
• if cell is used, try p+1, p+2, p+3 …
• wrap round to 0 at the end of the array.
hash = (name[0]+name[1]) % …
[Figure: table holding Sam, Steve, Stig, Stu, Sven, Sun, annotated with hash values (3) (2) (5) (4) (2)]
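A minimal sketch of linear probing, assuming a fixed-size array of strings and no removal. The class and method names (`LinearProbingSet`, `findIndex`) are invented for the example, and it uses `hashCode()` rather than the slide's name-based hash, whose modulus is not shown.

```java
/** Sketch of open addressing with linear probing (fixed capacity, no remove). */
class LinearProbingSet {
    private final String[] cells;
    private int count = 0;

    LinearProbingSet(int capacity) {
        cells = new String[capacity];
    }

    // Returns the index of the cell holding item, or of the first empty cell
    // on its probe path. Returns -1 if the table is full and item is absent.
    private int findIndex(String item) {
        int p = Math.floorMod(item.hashCode(), cells.length); // where to start looking
        for (int i = 0; i < cells.length; i++) {
            int index = (p + i) % cells.length; // p, p+1, p+2, ... wrapping round to 0
            if (cells[index] == null || cells[index].equals(item)) {
                return index;
            }
        }
        return -1;
    }

    /** Adds the item; returns false if it was already present. */
    boolean add(String item) {
        int index = findIndex(item);
        if (index < 0) {
            throw new IllegalStateException("table is full");
        }
        if (cells[index] != null) {
            return false;
        }
        cells[index] = item;
        count++;
        return true;
    }

    boolean contains(String item) {
        int index = findIndex(item);
        return index >= 0 && cells[index] != null;
    }

    int size() {
        return count;
    }
}
```

Both `add` and `contains` follow the same probe path, which is what makes lookups find items that were pushed along by earlier collisions.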

6 Hash Tables and Load Factor
• When is the hashTable “full”?
• When the number of items is close to the array size: may have to probe a large number of cells to find an empty cell ⇒ performance becomes very slow. Linear probing is particularly bad!
• Should not let table get more than 70% - 80% full (maximum “load factor”)
• With a low load factor, cost is O(1); with a high load factor, cost is O(N)
[Figure: array holding “eel”, “pig”, “cat”, “bee”, “fox”, “dog”, “owl”, “hen”, “ant”, “kea”]

7 ensureCapacity
If it is full, double and copy:
• how do you copy? Index depends on…
  • hashCode and length (division method)!
  • and it depends on previous collisions...
⇒ Have to rehash everything!
[Figure: three views of an array holding “eel”, “kea”, “ant”, “cat”, “bee”, “fox”, “dog”]
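The rehashing step might look like the following sketch. It assumes linear probing, an initial capacity of 4, and a maximum load factor of 0.7 (taken from the 70% - 80% guidance); the class name `GrowingProbingSet` and its helpers are hypothetical.

```java
/** Sketch of a probing hash set that doubles and rehashes when it gets too full. */
class GrowingProbingSet {
    private static final double MAX_LOAD_FACTOR = 0.7; // assumed threshold
    private String[] cells = new String[4];            // assumed initial capacity
    private int count = 0;

    // Linear probe in the given table. The load factor limit guarantees that
    // at least one cell is empty, so this loop always terminates.
    private static int findIndex(String[] table, String item) {
        int p = Math.floorMod(item.hashCode(), table.length);
        while (table[p] != null && !table[p].equals(item)) {
            p = (p + 1) % table.length;
        }
        return p;
    }

    // Doubles the array if one more item would exceed the load factor.
    // Items cannot just be copied across: each index depends on the array
    // length and on earlier collisions, so every item is re-inserted.
    private void ensureCapacity() {
        if (count + 1 <= MAX_LOAD_FACTOR * cells.length) {
            return;
        }
        String[] bigger = new String[cells.length * 2];
        for (String item : cells) {
            if (item != null) {
                bigger[findIndex(bigger, item)] = item;
            }
        }
        cells = bigger;
    }

    boolean add(String item) {
        if (cells[findIndex(cells, item)] != null) {
            return false; // already present
        }
        ensureCapacity();
        cells[findIndex(cells, item)] = item;
        count++;
        return true;
    }

    boolean contains(String item) {
        return cells[findIndex(cells, item)] != null;
    }

    int size() {
        return count;
    }

    int capacity() {
        return cells.length;
    }
}
```

With these assumed settings, adding ten items grows the array from 4 to 8 and then to 16 cells, rehashing every stored item each time.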

8 Linear Probing: Runs and Clustering
• Linear probing is particularly bad:
  • Repeated collisions at one index create runs
  • Runs → linear performance
  • With linear probing, runs join up ⇒ they grow fast: the bigger the run, the faster it grows
This is called “clustering”.
Does it help to increase step size (p, p+d, p+2d, …)?
[Figure: array holding “eel”, “kea”, “ant”, “cat”, “bee”, “fox”, “dog”, with the words hen, owl, pig, gnu, emu, rat, tui and the numeric labels “3” and “1,254”]

9 Quadratic Probing
• Make the sequence of probes have increasing steps:
  • runs don’t join up so fast
  h, h+1, h+4, h+9, h+16, …
  p=h, p+=1, p+=3, p+=5, p+=7, p+=9, …
• In general, quadratic probing uses a quadratic formula: $\mathrm{probe}_i = \mathrm{hash} + a\,i + b\,i^2$ ($b \neq 0$)
• E.g. with a=b=½, the step sizes become 1, 2, 3, … instead of 1, 3, 5, …
[Figure: array holding “eel”, “kea”, “ant”, “cat”, “fox”, “dog”, “hen”, “bee”, “owl”]
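The probe sequence h, h+1, h+4, h+9, … can be generated incrementally with step sizes 1, 3, 5, …, as this sketch shows. The class `QuadraticProbes` and the method `squares` are names made up for the illustration; the wrap-around by `% length` is an assumption about how the sequence is mapped onto the array.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: the first few probe indices of quadratic probing with steps 1, 3, 5, ... */
class QuadraticProbes {
    // Returns the first `count` probe indices hash + i*i (mod length), for i = 0, 1, 2, ...
    static List<Integer> squares(int hash, int length, int count) {
        List<Integer> probes = new ArrayList<>();
        int p = Math.floorMod(hash, length);
        int step = 1;
        for (int i = 0; i < count; i++) {
            probes.add(p);
            p = (p + step) % length; // p += 1, p += 3, p += 5, ...
            step += 2;
        }
        return probes;
    }
}
```

For example, `QuadraticProbes.squares(3, 100, 5)` gives 3, 4, 7, 12, 19. In a small table of length 8, `QuadraticProbes.squares(0, 8, 5)` gives 0, 1, 4, 1, 0, so the sequence revisits cells before it has tried them all.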

10 Quadratic Probing
Another problem, perhaps?
• sequence might wrap back on itself before checking each cell
• If we choose a = b = ½, and length is a power of 2... ⇒ guaranteed not to wrap until it has checked every cell!
$\mathrm{probe}_i = \mathrm{hash} + \tfrac{1}{2}(i + i^2)$
⇒ probes are hash, hash+1, hash+3, hash+6, hash+10, hash+15, ...
⇒ step sizes are 1, 2, 3, 4, 5, …
[Figure: array holding “eel”, “dog”, “hen”]
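The claim about a = b = ½ can be checked empirically with a short sketch that counts how many distinct cells the first `length` probes touch. The class `TriangularProbes` and method `distinctCells` are hypothetical names; this is a check for particular table sizes, not a proof.

```java
/** Sketch: count the distinct cells visited by probes hash + (i + i*i)/2. */
class TriangularProbes {
    // Follows the first `length` probes (step sizes 1, 2, 3, ...) and
    // returns how many different cells they visit.
    static int distinctCells(int hash, int length) {
        boolean[] visited = new boolean[length];
        int distinct = 0;
        int p = Math.floorMod(hash, length);
        for (int i = 1; i <= length; i++) {
            if (!visited[p]) {
                visited[p] = true;
                distinct++;
            }
            p = (p + i) % length; // next probe: step grows by one each time
        }
        return distinct;
    }
}
```

With a power-of-two length such as 8 or 16, every cell is visited. With length 7, the probes from hash 0 are 0, 1, 3, 6, 3, 1, 0, which reach only 4 of the 7 cells.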

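11 Hash Table with Probing: remove
• Inserted: Stu (2), Sven (5), Sam (4), Steve (2), Sun (4)
• Now remove: Sam (4)
• What’s the problem?
  • contains(Sun) will return false! Sun also hashed to 4, found Sam there, and was stored further along; once Sam’s cell is null, the search for Sun stops at that empty cell.
• To remove, need to leave a marker (not null, not a value!): insert a “tombstone” key instead
public void remove() { throw new UnsupportedOperationException(); }
[Figure: table holding Sam, Steve, Stig, Stu, Sven, Sun]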
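One way to implement tombstones is sketched below, assuming linear probing over a fixed-size array. The class `TombstoneSet`, the marker object `TOMBSTONE`, and the helper names are all invented for the example; a real implementation would also resize and would clear out tombstones when rehashing.

```java
/** Sketch of a linear-probing set whose remove() leaves a tombstone marker. */
class TombstoneSet {
    // A distinct object compared by identity (==), so it can never be
    // mistaken for a real item, even one with the same characters.
    private static final String TOMBSTONE = new String("<removed>");
    private final String[] cells;

    TombstoneSet(int capacity) {
        cells = new String[capacity];
    }

    private int start(String item) {
        return Math.floorMod(item.hashCode(), cells.length);
    }

    // Index of the cell holding item, or -1 if absent.
    // Tombstones are skipped; only a truly empty (null) cell ends the search.
    private int indexOf(String item) {
        int p = start(item);
        for (int i = 0; i < cells.length; i++) {
            int index = (p + i) % cells.length;
            String cell = cells[index];
            if (cell == null) {
                return -1;
            }
            if (cell != TOMBSTONE && cell.equals(item)) {
                return index;
            }
        }
        return -1;
    }

    boolean contains(String item) {
        return indexOf(item) >= 0;
    }

    boolean add(String item) {
        if (contains(item)) {
            return false;
        }
        int p = start(item);
        for (int i = 0; i < cells.length; i++) {
            int index = (p + i) % cells.length;
            if (cells[index] == null || cells[index] == TOMBSTONE) {
                cells[index] = item; // a tombstone cell can be reused
                return true;
            }
        }
        throw new IllegalStateException("table is full");
    }

    boolean remove(String item) {
        int index = indexOf(item);
        if (index < 0) {
            return false;
        }
        cells[index] = TOMBSTONE; // leave a marker, not null
        return true;
    }
}
```

To see the effect, use strings that are known to collide: in Java, "AaBB", "BBAa" and "BBBB" all have the same `hashCode()`. After adding all three and removing "BBAa", a lookup for "BBBB" still succeeds because the search steps over the tombstone instead of stopping.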

12 Iterator?
• Iterating through hash table is not so simple!
  • there will be nulls to skip over
  • the order that items are returned appears random (and may change when the array is doubled!)
• At each call to next(), Iterator must advance the index to the next non-null cell. Could be slow!...
[Figure: array holding “eel”, “kea”, “ant”, “cat”, “bee”, “fox”, “dog”]
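An iterator over the underlying array could be sketched as follows. It assumes the table is a plain `String[]` in which empty cells are null; the class name `CellIterator` is made up for the example, and tombstones are ignored here for simplicity.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Sketch of an iterator that walks a hash table's array, skipping empty cells. */
class CellIterator implements Iterator<String> {
    private final String[] cells;
    private int index = 0; // always rests on a non-null cell, or on cells.length

    CellIterator(String[] cells) {
        this.cells = cells;
        skipNulls();
    }

    // Advance to the next non-null cell. In a sparse table this can take many steps.
    private void skipNulls() {
        while (index < cells.length && cells[index] == null) {
            index++;
        }
    }

    @Override
    public boolean hasNext() {
        return index < cells.length;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String item = cells[index++];
        skipNulls();
        return item;
    }
    // remove() is not overridden: Iterator's default throws UnsupportedOperationException.
}
```

Items come out in array order, which depends on hash values and collisions rather than on insertion order, so the sequence looks arbitrary to the caller.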

13 hashing summary
• hashing gives add/find that is crazily quick
• two ideas: buckets and probing
• with the probing method, removing requires “tombstones”
• when a hashtable is too full, you need to increase its size: this requires rehashing everything
• iterating over a HashSet can be a slow process

14 the COMP103 final exam
The 4th of November is a Tuesday
Exam is at 2:30pm, and lasts TWO hours
You will be distributed over 5 different rooms:
• ABUBAKR - BHIKHU: MYLT101
• BHULA - DEIGHTON: HMLT104
• DEL ROSARIO - LATEGAN: KKLT303
• LAWRENCE - PEREZ: HMLT205
• PHEASE - ZHU: MCLT103

15 preparing for the exam
• the 103 homepage has link to “Assessment archive”
  1. Do your best without the answers
  2. Then check against the answers
• Next week: tutor-run help sessions (Jeffrey Wu)
  1. Monday 20th, 12:30-3pm, in Cotton
  2. Wednesday 22nd, 12:30-3pm, but in AM
  ALSO, VUW Science Society runs “cram session” for ECS: Friday 24th, 10am-3pm, in the Memorial Theatre Foyer
• checklist – on the 103 homepage
• friends... assignments... textbook... notes... videos...

16

17 The Exam
• answer all questions
• manage your time
• Dumb calculators & non-electronic dictionaries are OK

18 doing your best on the day
• Read the question carefully and make sure you know what is being asked.
• Write your answer clearly
• Use extra pages for rough work or for answers
  • Cross out what you don’t want marked
  • Say where your answer is if not on same page
• For coding questions:
  • There’s more than one way to skin a cat
  • If it’s complicated, start with the pseudocode

19 best wishes!