Published by Alexis Phillips. Modified over 8 years ago.
1
Hashing & Hash Tables
2
Sets/Dictionaries Set - Our best efforts to date:
3
Easy Set
Fast way to represent a set if 0-9 are the only possible values:
Index:   0 1 2 3 4 5 6 7 8 9
Present: 0 1 0 0 0 1 0 0 1 1
4
Easy Set
Fast way to represent a set if 0-9 are the only possible values:
Could apply to letters A-J via a char → int mapping
Index:   0 1 2 3 4 5 6 7 8 9
Present: 0 1 0 0 0 1 0 0 1 1
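The 0/1 row above can be sketched as a fixed-size bit array. This is only an illustration: the names `DigitSet` and `letterIndex` are hypothetical and do not come from the slides.

```cpp
#include <bitset>

// Hypothetical sketch: one bit per possible value 0..9.
struct DigitSet {
    std::bitset<10> present;  // present[i] is 1 when i is in the set

    void insert(int v)         { present.set(v); }    // constant time
    void erase(int v)          { present.reset(v); }  // constant time
    bool contains(int v) const { return present.test(v); }
};

// Map the letters 'A'..'J' onto the indexes 0..9 (assumes an ASCII-like encoding).
inline int letterIndex(char c) { return c - 'A'; }
```

Inserting 1, 5, 8 and 9 would give the bit pattern shown on the slide. A letter would be stored with something like `s.insert(letterIndex('C'))`.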
5
Easy Set
Fast way to represent a set if 0-9 are the only possible values:
How could we apply the same strategy to all English words?
Word:    Aa Ab Ac Ad Ae Af Ag …
Present: 1  0  0  1  0  0  0  ???
6
Hashing
Hash function: maps data onto a fixed-size value
7
Cryptographic Hashing
Desirable traits:
– Output is fixed size
– Easy to compute
– Output varies wildly with small input change
– One way
8
Hash Table
Hash Table:
– Use hash function to map values into array indexes
– Constant time to find index and check
9
Hash Table Hash Functions
Desirable qualities:
– Return a number 0…(tablesize – 1): maps values into array indexes
– Efficiently computable: constant time to find index
– Evenly distribute keys over table
10
Hash Table Functions
Desirable qualities:
– Return a number 0…(tablesize – 1)
– Efficiently computable
– Evenly distribute keys over table
  – Don't waste space: mapping is onto, so every index has 1+ keys
  – Minimize collisions
11
Hash Table Functions
Split roles – hash function vs mapping to table:
– Hash Function: evenly distribute keys over space (unsigned ints)
– Table mapping: hash function's result % table size = index
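A minimal sketch of the two roles, assuming the standard library's `std::hash` stands in for the hash function. The slides do not name a particular one, and `tableIndex` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Step 1 (hash function): spread keys over the full range of an unsigned integer.
// Step 2 (table mapping): reduce that value to an index in 0..tableSize-1.
// Assumes tableSize > 0.
std::size_t tableIndex(const std::string& key, std::size_t tableSize) {
    std::size_t h = std::hash<std::string>{}(key);  // role 1: hashing
    return h % tableSize;                           // role 2: mapping to the table
}
```

Keeping the steps separate means the table can be resized without touching the hash function. Only the `% tableSize` step changes.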
12
Optimal Hash Functions
If all keys and table size are known, can compute an optimal hash…
– Rarely the case
13
Hash Function - Integral
For integral types:
– Hash(x) = x
– Table size should be prime
14
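Hash Function - Integral
For integral types:
– Hash(x) = x
– Table size should be prime
Keys often follow a pattern. If the pattern is not relatively prime to the table size, the keys pile up in a few slots. Example with even keys and table size 10:
Slot 0: 0, 10, 20
Slot 2: 2, 12, 22
Slot 4: 4, 14, 24
Slot 6: 6, 16, 26
Slot 8: 8, 18, 28
(odd slots stay empty)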
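To see the effect of a prime table size, the sketch below counts how many distinct slots a run of evenly spaced keys occupies. `slotsUsed` is a hypothetical helper written for this illustration. It assumes Hash(x) = x and index = x % table size, as on the slide.

```cpp
#include <cstddef>
#include <set>

// Count the distinct slots used by the keys 0, step, 2*step, ...
// (count keys in total) when index = key % tableSize.
// Assumes tableSize > 0.
std::size_t slotsUsed(unsigned step, unsigned count, unsigned tableSize) {
    std::set<unsigned> used;
    for (unsigned i = 0; i < count; ++i)
        used.insert((i * step) % tableSize);
    return used.size();
}
```

With the 15 even keys 0 to 28, `slotsUsed(2, 15, 10)` gives 5 because only the even slots are hit. `slotsUsed(2, 15, 11)` gives 11, since 2 and 11 share no common factor and every slot gets used.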
15
Hash Function - String
String approach 1 – add up characters:
for (i = 0; i < key.length(); i++) hashVal += key[i];
Problem 1: What if TableSize is 10,000 and all keys are 8 or fewer characters long?
Problem 2: What if keys often contain the same characters (“abc”, “bca”, etc.)?
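Both problems are easy to reproduce. The sketch below wraps the slide's loop in a function. The name `additiveHash` and the unsigned return type are assumptions made for this example.

```cpp
#include <cstddef>
#include <string>

// Approach 1: add up the character codes.
unsigned additiveHash(const std::string& key) {
    unsigned hashVal = 0;
    for (std::size_t i = 0; i < key.length(); i++)
        hashVal += static_cast<unsigned char>(key[i]);  // avoid negative char values
    return hashVal;
}
```

Problem 1: with 7-bit ASCII, an 8-character key sums to at most 8 * 127 = 1016, so in a table of 10,000 slots everything above index 1016 stays empty. Problem 2: addition ignores order, so `additiveHash("abc")` and `additiveHash("bca")` are both 97 + 98 + 99 = 294.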
16
Hash Function - String
String approach 2 – multiply each character by a different power of some number:
– "apple": 'a' * 31^4 + 'p' * 31^3 + 'p' * 31^2 + 'l' * 31^1 + 'e' * 31^0
17
Hash Function - String
String approach 2 – multiply each character by a different power of some number:
– "apple": 'a' * 31^4 + 'p' * 31^3 + 'p' * 31^2 + 'l' * 31^1 + 'e' * 31^0
Efficiently do via bit shifting:
for (i = 0; i < key.length(); i++) hashVal = (hashVal << 6) ^ key[i];
(hashVal << 6 is the same as hashVal * 64, so this version uses 64 rather than 31 as the base)
18
Hash Function - String
String approach 2 – multiply each character by a different power of some number:
– "apple": 'a' * 31^4 + 'p' * 31^3 + 'p' * 31^2 + 'l' * 31^1 + 'e' * 31^0
Efficiently do via bit shifting:
for (i = 0; i < key.length(); i++) hashVal = (hashVal << 6) ^ key[i];
(^ is binary XOR, used here in place of +)
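Both variants of approach 2 can be sketched as below. The function names are hypothetical, and the fixed 32-bit type is an assumption made so that overflow wraps around predictably. `polyHash` evaluates the power-of-31 sum with Horner's rule, which avoids computing the powers explicitly.

```cpp
#include <cstdint>
#include <string>

// 'a'*31^4 + 'p'*31^3 + ... computed left to right:
// each step multiplies the running total by 31 and adds the next character.
std::uint32_t polyHash(const std::string& key) {
    std::uint32_t hashVal = 0;
    for (unsigned char c : key)
        hashVal = hashVal * 31 + c;  // unsigned overflow wraps, which is fine for hashing
    return hashVal;
}

// The slide's shift version: multiply by 64 via << 6, combine with XOR.
std::uint32_t shiftXorHash(const std::string& key) {
    std::uint32_t hashVal = 0;
    for (unsigned char c : key)
        hashVal = (hashVal << 6) ^ c;
    return hashVal;
}
```

Unlike the additive hash, character order now matters: `polyHash("ab")` is 97 * 31 + 98 = 3105, while `polyHash("ba")` is 98 * 31 + 97 = 3135.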
19
Collisions
Collision: two keys map to the same index:
– 12 and 22 both map to index 2 (table size 10)
Index:  0   1   2   3   4   5   6   7   8   9
Slot 2: 12, 22 (all other slots empty)
20
Probing
Linear Probing: value goes in next available slot
Index:  0   1   2   3   4   5   6   7   8   9
Value:  -   -   12  -   -   -   -   -   -   -
21
Probing
Linear Probing: value goes in next available slot
Index:  0   1   2   3   4   5   6   7   8   9
Value:  -   -   12  22  -   -   -   -   -   -
22
Probing
Linear Probing: value goes in next available slot
Index:  0   1   2   3   4   5   6   7   8   9
Value:  -   -   12  22  32  -   -   -   -   -
23
Probing
Linear Probing: value goes in next available slot
Issue:
– No longer constant access
Index:  0   1   2   3   4   5   6   7   8   9
Value:  -   -   12  22  32  -   -   -   -   -
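A minimal sketch of insertion with linear probing. It assumes Hash(x) = x, non-negative keys, a non-empty table, and a sentinel value `EMPTY` for free slots. The names `EMPTY` and `probeInsert` are hypothetical. It does not check for duplicates.

```cpp
#include <cstddef>
#include <vector>

const int EMPTY = -1;  // assumption: keys are non-negative, so -1 can mark a free slot

// Insert key at its home slot, or at the next free slot after it (wrapping around).
// Returns the slot used, or -1 if every slot is occupied.
int probeInsert(std::vector<int>& table, int key) {
    const std::size_t n = table.size();
    const std::size_t start = static_cast<std::size_t>(key) % n;
    for (std::size_t step = 0; step < n; ++step) {
        const std::size_t idx = (start + step) % n;
        if (table[idx] == EMPTY) {
            table[idx] = key;
            return static_cast<int>(idx);
        }
    }
    return -1;  // table full
}
```

On an empty table of size 10, inserting 12, 22 and 32 in that order fills slots 2, 3 and 4, matching the slide. The loop is why access is no longer constant time: each collision costs one more probe.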
24
Load Factor
Load factor = number of stored keys / table size
– Must be < 1 for linear probing
– Performance drops rapidly past 0.5
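To put rough numbers on the drop-off, the sketch below uses the standard textbook approximations for linear probing (from Knuth's analysis). They are not taken from the slides, and the function names are hypothetical. Both assume 0 <= loadFactor < 1.

```cpp
// Approximate average number of probes for a search that finds its key.
double expectedProbesHit(double loadFactor) {
    return 0.5 * (1.0 + 1.0 / (1.0 - loadFactor));
}

// Approximate average number of probes for a search that fails (or an insert).
double expectedProbesMiss(double loadFactor) {
    const double d = 1.0 - loadFactor;
    return 0.5 * (1.0 + 1.0 / (d * d));
}
```

At a load factor of 0.5 these give about 1.5 and 2.5 probes. At 0.9 they give about 5.5 and 50.5, which is the rapid drop the slide refers to.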
25
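Clustering
Say we go to put in 3: it maps to index 3, which is taken, so it lands in slot 5
Now 2-5 are blocked
– Anything that maps to 2-6 will fill slot 6
Index:  0   1   2   3   4   5   6   7   8   9
Value:  -   -   12  22  32  3   -   -   -   -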
26
Finding
Probing used again to find keys:
Find 32 – yes, it's there (check slot 2, then 3, then 4)
Index:  0   1   2   3   4   5   6   7   8   9
Value:  -   -   12  22  32  2   -   -   -   -
27
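Finding
Probing used again to find keys:
Find 42 – no: probing from slot 2 reaches the empty slot 6, so it must not be in the table
Index:  0   1   2   3   4   5   6   7   8   9
Value:  -   -   12  22  32  2   -   -   -   -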
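Lookup follows the same probe sequence as insertion and stops at the first empty slot. This is a sketch under the same assumptions as before: Hash(x) = x, non-negative keys, a non-empty table, and a hypothetical `EMPTY` sentinel.

```cpp
#include <cstddef>
#include <vector>

const int EMPTY = -1;  // assumption: keys are non-negative, so -1 can mark a free slot

// Walk forward from the key's home slot until the key or an empty slot is found.
bool probeFind(const std::vector<int>& table, int key) {
    const std::size_t n = table.size();
    const std::size_t start = static_cast<std::size_t>(key) % n;
    for (std::size_t step = 0; step < n; ++step) {
        const std::size_t idx = (start + step) % n;
        if (table[idx] == EMPTY) return false;  // gap: the key would have been placed here
        if (table[idx] == key)   return true;
    }
    return false;  // scanned the whole table without finding the key
}
```

With 12, 22, 32 and 2 in slots 2 to 5, searching for 32 succeeds on the third probe. Searching for 42 fails when it reaches the empty slot 6.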
28
Deletion
Say we delete 22:
Find 32…
Index:  0   1   2   3   4   5   6   7   8   9
Value:  -   -   12  -   32  2   -   -   -   -
29
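Deletion
Say we delete 22:
Find 32… not there! The search starts at slot 2, hits the now-empty slot 3, and stops.
Index:  0   1   2   3   4   5   6   7   8   9
Value:  -   -   12  -   32  2   -   -   -   -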
30
Tombstone
– Special value indicating something was there
– Search knows to continue
– Insertion can use that slot
  – But need to continue search to avoid duplicate
Index:  0   1   2   3   4   5   6   7   8   9
Value:  -   -   12  #   32  2   -   -   -   -
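The three tombstone rules can be sketched together in one small table type. This is an illustration only. `ProbeTable`, `EMPTY` and `TOMBSTONE` are hypothetical names, and it assumes Hash(x) = x, non-negative keys, and a table size greater than zero.

```cpp
#include <cstddef>
#include <vector>

const int EMPTY = -1;      // never used
const int TOMBSTONE = -2;  // something was here and has been deleted

struct ProbeTable {
    std::vector<int> slots;
    explicit ProbeTable(std::size_t n) : slots(n, EMPTY) {}

    // Returns the slot holding key, or -1. Tombstones do not stop the search.
    int findSlot(int key) const {
        const std::size_t n = slots.size();
        const std::size_t start = static_cast<std::size_t>(key) % n;
        for (std::size_t step = 0; step < n; ++step) {
            const std::size_t idx = (start + step) % n;
            if (slots[idx] == EMPTY) return -1;
            if (slots[idx] == key)   return static_cast<int>(idx);
        }
        return -1;
    }
    bool contains(int key) const { return findSlot(key) >= 0; }

    // Deleting leaves a tombstone so later keys in the cluster stay reachable.
    bool erase(int key) {
        const int idx = findSlot(key);
        if (idx < 0) return false;
        slots[idx] = TOMBSTONE;
        return true;
    }

    // Remember the first reusable slot, but keep probing to rule out a duplicate.
    bool insert(int key) {
        const std::size_t n = slots.size();
        const std::size_t start = static_cast<std::size_t>(key) % n;
        int firstFree = -1;
        for (std::size_t step = 0; step < n; ++step) {
            const std::size_t idx = (start + step) % n;
            if (slots[idx] == key) return false;  // already present
            if (slots[idx] == TOMBSTONE || slots[idx] == EMPTY) {
                if (firstFree < 0) firstFree = static_cast<int>(idx);
                if (slots[idx] == EMPTY) break;   // end of the cluster
            }
        }
        if (firstFree < 0) return false;          // table full
        slots[firstFree] = key;
        return true;
    }
};
```

After inserting 12, 22, 32 and 2 and then erasing 22, slot 3 holds a tombstone and 32 is still found. Inserting 32 again is rejected as a duplicate even though the search passes a reusable slot first, while inserting 42 reuses slot 3.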