CSC 213 – Large Scale Programming. Today’s Goal: Review when, where, & why we use Maps; Why Sequence-based approach causes problems; How hash can.

1 CSC 213 – Large Scale Programming

3 Today’s Goal
- Review when, where, & why we use Maps
- Why Sequence-based approach causes problems
- How hash can help solve these problems
- What is inappropriate and incorrect about hash jokes
- Discover hash’s problems & what must be done
- What would happen if keys hashed to same index
- Ways of handling situation so that hash still works
- To remove data, using null may not be best option
- Dark secrets of hashing, exposed at lecture’s end

4 Map Performance
- In many situations can be matter of life-or-death
- 911 operators immediately need addresses
- Google’s search performance in TB/s
- O(log n) time too slow for these uses
- Would love to use arrays
- Convert key to int with hash function
- With result of hash, have index in table to examine
- put, remove & get take only O(1) time
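The "convert key to int, then use it as an index" step could be sketched roughly as follows. This assumes Java (the slides mention a static final field, which suggests it); the class name HashIndexDemo and the method indexFor are hypothetical names for illustration, not part of the lecture.

```java
final class HashIndexDemo {
    /**
     * Compresses a key's hash code into a valid index for a table of the
     * given size. indexFor is a hypothetical helper, not a library method.
     */
    static int indexFor(Object key, int tableSize) {
        // floorMod keeps the result in [0, tableSize) even if hashCode() is negative
        return Math.floorMod(key.hashCode(), tableSize);
    }

    public static void main(String[] args) {
        int tableSize = 10000;
        // An Integer's hash code is its own value, so this prints 1
        System.out.println(indexFor(256120001, tableSize));
        // Any object works; the index depends on String.hashCode()
        System.out.println(indexFor("Jay Doe", tableSize));
    }
}
```

Computing the index takes the same time however many entries the table holds, which is where the hoped-for O(1) comes from.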

5 Hash Table
Index   Entry
0       (empty)
1       0256120001, “Jay Doe”
2       9811010002, “Bob Doe”
3       (empty)
4       4512290004, “Jill Roe”
…
9997    (empty)
9998    2007519998, “Rhi Smith”
9999    (empty)

7 Ideal World
- key hashed to unique index
- Hash and done, Entry is there
And then… You wake up

8 Collisions
- Occurs when 2 keys hash to same index
- Ideal hash spreads keys out evenly across table
- As nice side effect, this limits collisions
- Small table size important also, since RAM limited
- Unfortunately, no such thing as ideal hash
- Must handle collisions to get O(1) efficiency buzz

9 Bad Hash
- Perfect hash does not exist
- Cannot know all keys beforehand
- Keys may be clustered around a few indices
- Or find all keys hashed to same index
- Handling bad hash is necessary
- Even given Entry, always check key
- Store multiple Entries with same hash
- (Shot of adrenaline restarts heart)

10 Bucket Arrays
- Make hash table an array of linked list Nodes
- First node aliased by the array location
- Whenever we have collision, we “chain” Entries
- Create new Node to store the Entry
- The linked list will have new Node at its front
[Figure: bucket array with indices 0-5]
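A minimal sketch of chaining, assuming int keys, String values, and h(x) = x mod table size. ChainedTable and its members are hypothetical names chosen for illustration; the course's own Entry and Node types are not shown in the slides.

```java
final class ChainedTable {
    /** One link in a bucket's chain; holds a single key/value pair. */
    private static final class Node {
        final int key;
        String value;
        Node next;

        Node(int key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node[] buckets;

    ChainedTable(int capacity) {
        buckets = new Node[capacity];
    }

    private int indexFor(int key) {
        return Math.floorMod(key, buckets.length);
    }

    /** Adds or replaces the value for key; returns the old value, or null. */
    String put(int key, String value) {
        int i = indexFor(key);
        for (Node n = buckets[i]; n != null; n = n.next) {
            if (n.key == key) {           // always check the key, not just the index
                String old = n.value;
                n.value = value;
                return old;
            }
        }
        buckets[i] = new Node(key, value, buckets[i]); // new Node goes at the front
        return null;
    }

    /** Returns the value for key, or null if the key is absent. */
    String get(int key) {
        for (Node n = buckets[indexFor(key)]; n != null; n = n.next) {
            if (n.key == key) {
                return n.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ChainedTable t = new ChainedTable(13);
        t.put(18, "a");
        t.put(31, "b");   // 18, 31 and 44 all hash to index 5
        t.put(44, "c");
        System.out.println(t.get(31)); // prints b
    }
}
```

Every key that lands on the same index is walked one Node at a time, so a hash that sends everything to one index degrades both methods to O(n).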

12 Bucket Arrays
- But what if have really bad hash?
- Hashes to same index in every situation
- All Entries now found in single linked list
- O(n) execution times would now be required
- (Also get bad case of the munchies)

13 Collisions

14 Linear Probing
- Musical chairs uses this algorithm
- At index where key hashed, examine Entry
- Circle through array until empty index found
- Algorithm is very simple
- But creates clusters of Entries

15 Linear Probe Example
h(x) = x mod 13
Now add:
- 44, h(44) = 5
- 20, h(20) = 7
- 22, h(22) = 9
- 31, h(31) = 5
Table after the additions:
Index:   0   1   2   3   4   5   6   7   8   9  10  11  12
Key:     -   -  15   -   -  18  44  20  32  22  31  76   -
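The linear probe example can be replayed with a short sketch. It assumes the table starts with 15, 18, 32 and 76 at indices 2, 5, 8 and 11, which is a reconstruction from the slide's figure rather than something the transcript states outright. LinearProbeDemo and slideTable are hypothetical names.

```java
import java.util.Arrays;

final class LinearProbeDemo {
    /**
     * Inserts key using h(x) = x mod table.length with linear probing.
     * Returns the index where the key was stored.
     */
    static int put(Integer[] table, int key) {
        int n = table.length;
        int start = Math.floorMod(key, n);
        for (int step = 0; step < n; step++) {
            int i = (start + step) % n;   // circle through the array
            if (table[i] == null) {       // stop at the first empty index
                table[i] = key;
                return i;
            }
        }
        throw new IllegalStateException("table is full");
    }

    /** Starting table, ASSUMED from the slide's figure: 15, 18, 32, 76 present. */
    static Integer[] slideTable() {
        Integer[] t = new Integer[13];
        t[2] = 15;
        t[5] = 18;
        t[8] = 32;
        t[11] = 76;
        return t;
    }

    public static void main(String[] args) {
        Integer[] t = slideTable();
        for (int key : new int[] {44, 20, 22, 31}) {
            System.out.println(key + " -> index " + put(t, key));
        }
        System.out.println(Arrays.toString(t));
    }
}
```

Under that assumption 44, 20, 22 and 31 land at indices 6, 7, 9 and 10. Key 31 has to step past five occupied indices, which is the clustering the previous slide warns about.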

20 Probing Reaction Oh, **** Adding to hash table still O(n)

21 Quadratic Probe

22 Quadratic Probe Example
h(x) = x mod 13
Now add:
- 44, h(44) = 5
- 20, h(20) = 7
- 22, h(22) = 9
- 31, h(31) = 5
Table after the additions:
Index:   0   1   2   3   4   5   6   7   8   9  10  11  12
Key:     -  31  15   -   -  18  44  20  32  22   -  76   -
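A sketch of the same additions using quadratic probing, where the j-th probe examines index (h(x) + j*j) mod 13. The starting table (15, 18, 32, 76 at indices 2, 5, 8, 11) is again an assumption reconstructed from the figure, and QuadraticProbeDemo is a hypothetical name.

```java
final class QuadraticProbeDemo {
    /**
     * Inserts key using h(x) = x mod table.length and probes at offsets
     * 0, 1, 4, 9, ... from the hashed index. Returns the index used.
     */
    static int put(Integer[] table, int key) {
        int n = table.length;
        int start = Math.floorMod(key, n);
        for (int j = 0; j < n; j++) {
            int i = (start + j * j) % n;  // j-th probe is j squared past the start
            if (table[i] == null) {
                table[i] = key;
                return i;
            }
        }
        // Quadratic probing does not visit every index, so this can happen
        // even when the table still has room
        throw new IllegalStateException("no empty index found in " + n + " probes");
    }

    /** Starting table, ASSUMED from the slide's figure: 15, 18, 32, 76 present. */
    static Integer[] slideTable() {
        Integer[] t = new Integer[13];
        t[2] = 15;
        t[5] = 18;
        t[8] = 32;
        t[11] = 76;
        return t;
    }

    public static void main(String[] args) {
        Integer[] t = slideTable();
        for (int key : new int[] {44, 20, 22, 31}) {
            System.out.println(key + " -> index " + put(t, key));
        }
    }
}
```

Under that assumption the keys land at indices 6, 7, 9 and 1. Key 31 tries 5, 6 and 9 before wrapping around to (5 + 9) mod 13 = 1.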

27 Quadratic Probing Reaction Darn it to heck. Adding to hash table still O(n)

28 Double Hashing
- Solve bad hash with even more hash
- Use 2nd hash function very different from first
- 2nd hash function not allowed to return zero
- Re-hash key using 2nd function after the collision
- Check index equal to sum of two hash functions
- Re-add 2nd hash to this sum to continue probing
- Guaranteed to work when table size is prime number
- Still must get around --

29 Double Hash Example
h(x) = x mod 13
h2(x) = 5 - (x mod 5)
Now add:
- 44, h(44) = 5
- 20, h(20) = 7
- 22, h(22) = 9
- 31, h(31) = 5
Table after the additions:
Index:   0   1   2   3   4   5   6   7   8   9  10  11  12
Key:    31   -  15   -   -  18  44  20  32  22   -  76   -
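The double hash example might be sketched as below, with the j-th probe at (h(x) + j * h2(x)) mod 13. As before, the starting table (15, 18, 32, 76 at indices 2, 5, 8, 11) is an assumption reconstructed from the figure, non-negative keys are assumed, and DoubleHashDemo is a hypothetical name.

```java
final class DoubleHashDemo {
    /** Second hash from the slide: 5 - (x mod 5), always in 1..5, never zero. */
    static int h2(int key) {
        return 5 - Math.floorMod(key, 5);
    }

    /**
     * Inserts key using h(x) = x mod table.length, stepping by h2(key)
     * after each collision. Returns the index used.
     */
    static int put(Integer[] table, int key) {
        int n = table.length;
        int start = Math.floorMod(key, n);
        int step = h2(key);
        for (int j = 0; j < n; j++) {
            int i = (start + j * step) % n;  // re-add the 2nd hash on every collision
            if (table[i] == null) {
                table[i] = key;
                return i;
            }
        }
        throw new IllegalStateException("table is full");
    }

    /** Starting table, ASSUMED from the slide's figure: 15, 18, 32, 76 present. */
    static Integer[] slideTable() {
        Integer[] t = new Integer[13];
        t[2] = 15;
        t[5] = 18;
        t[8] = 32;
        t[11] = 76;
        return t;
    }

    public static void main(String[] args) {
        Integer[] t = slideTable();
        for (int key : new int[] {44, 20, 22, 31}) {
            System.out.println(key + " -> index " + put(t, key));
        }
    }
}
```

Under that assumption the keys land at indices 6, 7, 9 and 0. Key 31 has h2(31) = 4, so it tries 5, then 9, then (5 + 8) mod 13 = 0. Because 13 is prime and the step is never zero, the probe sequence reaches every index before repeating.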

34 Double Probing Reaction Sweet! Double hashing keeps put O(n)

35 Probing and Searching
- Search index where key hashed
- If cannot place Entry at index
- The array must keep being probed
- Stop only at usable index
- May need to probe every index!
- Searching takes O(n) even with hash
- May need to reallocate & rehash table
- Worst case O(n) put even with perfect hash

36 Post-Removal Operations
- What happens when we remove an Entry?
- Set index to null in most structures
- Consider if we call remove(44)
Index:   0   1   2   3   4   5   6   7   8   9  10  11  12
Key:     -   -  15   -   -  18  44  20  32  22  31  76   -

41 Post-Removal Operations
- What happens when we remove an Entry?
- Set index to null in most structures
- Consider if we call remove(44)
- get(31) called, what would happen?
- First check index it is hashed to
- Checks first probed index… & stops at null
Index:   0   1   2   3   4   5   6   7   8   9  10  11  12
Key:     -   -  15   -   -  18   -  20  32  22  31  76   -

42 * Marker Value Explained
- Mark cleared indices in hash table
- Since collision could have happened, continue search
- Index can be used to store new Entry
- Ways to show that array index is clear
- Entry with null key could be used if one is careful
- Could try to make key which is never used
- Use static final field of type Entry
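One way the "static final field of type Entry" option could look, sketched with linear probing over a 13-slot table. ProbeTableWithMarker, DEFUNCT, and the int/String Entry are hypothetical choices for illustration, not the course's actual classes.

```java
final class ProbeTableWithMarker {
    private static final class Entry {
        final int key;
        final String value;

        Entry(int key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Marker for a cleared index; recognised by reference, never by its key. */
    private static final Entry DEFUNCT = new Entry(0, null);

    private final Entry[] table = new Entry[13];

    private int hash(int key) {
        return Math.floorMod(key, table.length);
    }

    void put(int key, String value) {
        int n = table.length;
        int free = -1;                          // first reusable index seen so far
        for (int step = 0; step < n; step++) {
            int i = (hash(key) + step) % n;
            Entry e = table[i];
            if (e == null) {                    // truly empty: key is not further on
                if (free < 0) free = i;
                break;
            }
            if (e == DEFUNCT) {                 // cleared index can hold a new Entry
                if (free < 0) free = i;
            } else if (e.key == key) {          // key already present: replace it
                table[i] = new Entry(key, value);
                return;
            }
        }
        if (free < 0) throw new IllegalStateException("table is full");
        table[free] = new Entry(key, value);
    }

    String get(int key) {
        int n = table.length;
        for (int step = 0; step < n; step++) {
            Entry e = table[(hash(key) + step) % n];
            if (e == null) return null;         // only a real null ends the search
            if (e != DEFUNCT && e.key == key) return e.value;
        }
        return null;
    }

    String remove(int key) {
        int n = table.length;
        for (int step = 0; step < n; step++) {
            int i = (hash(key) + step) % n;
            Entry e = table[i];
            if (e == null) return null;
            if (e != DEFUNCT && e.key == key) {
                table[i] = DEFUNCT;             // mark, rather than null out
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ProbeTableWithMarker t = new ProbeTableWithMarker();
        t.put(18, "v18");   // hashes to 5
        t.put(44, "v44");   // hashes to 5, probes to 6
        t.put(31, "v31");   // hashes to 5, probes to 7
        t.remove(44);       // index 6 becomes DEFUNCT, not null
        System.out.println(t.get(31)); // prints v31: search continues past index 6
    }
}
```

Comparing against DEFUNCT by reference avoids having to reserve a special key value. The cost is that markers accumulate, so searches stay long until the table is rebuilt.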

43 Why Use Hash Table & Probes?
- Hash tables can require O(n) complexity
- Provide O(1) time if you are really good
- Ultimately depends on hash function used
- Choose wisely and be rich

44 Before Next Lecture…
- Get updated lab project into SVN directory
- No need to e-mail, I will collect directories at 5PM
- Finish working on week #4 assignment
- Due at usual time tomorrow afternoon/evening
- Start thinking of your design for the project
- Preliminary copy of this design due Friday
- Read sections 9.3 - 9.3.1 & 9.3.3 of the book
- What should we do if many values for 1 key?

