Hashing Project by: Omar Benismail Comp3801
In this presentation: What is hashing? Types of hashing that I covered. Secure hashing. Min-Hashing. Best hashing algorithm.
A hash function is any function that can be used to map data of arbitrary size to data of fixed size.
A good hash function maps each value to any given hash code with equal probability (uniformly). One common problem arises when our universe of keys far exceeds the number of hash codes: by the pigeonhole principle (PHP), collisions are then unavoidable.
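The pigeonhole argument can be made concrete with a deliberately tiny hash. This is only a sketch: `tiny_hash` and `first_collision` are hypothetical names, and taking the first byte of SHA-256 is just one convenient way to get 256 possible codes.

```python
import hashlib

def tiny_hash(data: bytes) -> int:
    """Map input of any length to one of 256 codes (first byte of SHA-256)."""
    return hashlib.sha256(data).digest()[0]

def first_collision(keys):
    """Return (earlier_key, later_key, code) for the first repeated code, or None."""
    seen = {}
    for key in keys:
        code = tiny_hash(key)
        if code in seen:
            return seen[code], key, code
        seen[code] = key
    return None

# 257 distinct keys but only 256 codes: at least two keys must share a code.
keys = [str(i).encode() for i in range(257)]
collision = first_collision(keys)
```

With 257 keys a collision is guaranteed; in practice one shows up far earlier.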
What happened since the last presentation?
Types of hashing that I covered Secure hashing Min-Hashing
Secure Hashing SHA-series MD-series (e.g. MD5)
Secure Hashing (cont.) Speed isn’t an issue Could have really complex algos. Hash codes have a larger size
Secure Hashing (cont. 2)
SHA-1 and MD5 http://www.miraclesalad.com/webtools/md5.php https://crackstation.net/
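The same digests the linked web tools produce can be computed locally with Python's standard `hashlib` module. A minimal sketch; the helper name `digests` is made up for illustration.

```python
import hashlib

def digests(text: str) -> dict:
    """Return the MD5 and SHA-1 hex digests of a UTF-8 encoded string."""
    data = text.encode("utf-8")
    return {
        "md5": hashlib.md5(data).hexdigest(),    # 128-bit digest, 32 hex characters
        "sha1": hashlib.sha1(data).hexdigest(),  # 160-bit digest, 40 hex characters
    }

result = digests("hello")
```

Whatever the input length, the output length is fixed. Practical collision attacks are known for both algorithms, so they are shown here for illustration only, not as a recommendation for new security-sensitive code.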
Min-Hashing Hashes in constant time Hash codes are usually small numbers Usually converts the keys to natural numbers (for speed) Sometimes implies a hash-table
Min-Hashing & Hash-tables Let m = size of the table Let k = the key entered
Simple algorithms Division method h(k) = k mod m If m = 2^p: bad
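A small sketch of why a power-of-two table size is a poor fit for the division method: `k mod 2^p` keeps only the p lowest-order bits of k, so keys that differ only in their high bits all land in the same slot. The keys and the prime 13 below are arbitrary choices for illustration.

```python
def division_hash(k: int, m: int) -> int:
    """Division method: h(k) = k mod m."""
    return k % m

# Keys whose low 4 bits are all zero: 16, 32, 48, 64.
keys = [0b0001_0000, 0b0010_0000, 0b0011_0000, 0b0100_0000]

# m = 2^4 only looks at the low 4 bits, so every key collides in slot 0.
with_power_of_two = [division_hash(k, 16) for k in keys]

# A prime table size (13 here) mixes in the high bits and spreads the keys out.
with_prime = [division_hash(k, 13) for k in keys]
```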
Simple algorithms Division method h(k) = k mod m If m = 2^p - 1: better Multiplication method h(k) = floor(m (kA mod 1)), with 0 < A < 1
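The multiplication method can be sketched as follows. The constant A = (sqrt(5) - 1) / 2 is an assumption taken from a common textbook suggestion, not something the slides specify; any 0 < A < 1 fits the formula. Floating-point arithmetic is used for brevity, which is adequate for moderately sized keys.

```python
import math

A = (math.sqrt(5) - 1) / 2  # assumed constant, roughly 0.618

def multiplication_hash(k: int, m: int, a: float = A) -> int:
    """Multiplication method: h(k) = floor(m * (k*a mod 1))."""
    fractional = (k * a) % 1.0      # fractional part of k*a, in [0, 1)
    return math.floor(m * fractional)

# Unlike the division method, the table size m can be a power of two here.
slots = [multiplication_hash(k, 16) for k in range(1, 11)]
```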
Universal hashing Picks the hash function at random from a family, independently of the keys being hashed We set p to a prime number larger than any key.
Universal hashing h_{a,b}(k) = ((a*k + b) mod p) mod m h(k) = ((a*k) mod 2^w) div 2^(w-d)
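Both formulas on this slide can be sketched in a few lines. The assumptions here are not from the slides: the prime p = 2^31 - 1 (so keys must be smaller than that), a drawn from 1..p-1 and b from 0..p-1, an odd multiplier a for the second scheme, w as the word size in bits, and a table of 2^d slots. The factory function names are hypothetical.

```python
import random

P = 2_147_483_647  # 2^31 - 1, a prime; assumes every key is smaller than P

def make_universal_hash(m: int, p: int = P, rng=None):
    """Pick h_{a,b}(k) = ((a*k + b) mod p) mod m at random."""
    rng = rng or random.Random()
    a = rng.randrange(1, p)   # a in 1..p-1
    b = rng.randrange(0, p)   # b in 0..p-1
    def h(k: int) -> int:
        return ((a * k + b) % p) % m
    return h

def make_multiply_shift_hash(d: int, w: int = 64, rng=None):
    """Pick h(k) = ((a*k) mod 2^w) div 2^(w-d) with a random odd a."""
    rng = rng or random.Random()
    a = rng.randrange(1, 2 ** w, 2)   # random odd w-bit multiplier (assumption)
    def h(k: int) -> int:
        # Keep the low w bits of a*k, then take the top d of those bits.
        return ((a * k) % (2 ** w)) >> (w - d)
    return h

h1 = make_universal_hash(m=10)
h2 = make_multiply_shift_hash(d=4)   # table of 2^4 = 16 slots
```

Because a and b are chosen when the table is created, no fixed set of keys is bad for every run.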
Dealing with collisions Chaining: a lookup takes Θ(1 + α) expected time, where α = n/m is the load factor (n items stored in m slots)
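A minimal sketch of chaining, assuming Python's built-in `hash` as the underlying hash function and a fixed number of slots. The class and method names are illustrative only.

```python
class ChainedHashTable:
    """Hash table where each slot holds a list (chain) of (key, value) pairs."""

    def __init__(self, m: int = 8):
        self.m = m                                # number of slots
        self.buckets = [[] for _ in range(m)]
        self.n = 0                                # number of stored keys

    def _index(self, key) -> int:
        return hash(key) % self.m

    def put(self, key, value) -> None:
        chain = self.buckets[self._index(key)]
        for i, (existing, _) in enumerate(chain):
            if existing == key:                   # key already present: overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))
        self.n += 1

    def get(self, key, default=None):
        for existing, value in self.buckets[self._index(key)]:
            if existing == key:
                return value
        return default

    def load_factor(self) -> float:
        return self.n / self.m                    # alpha = n / m

table = ChainedHashTable(m=4)
for word in ["apple", "pear", "plum", "fig", "kiwi", "lime"]:
    table.put(word, len(word))
```

Six keys in four slots gives a load factor of 1.5, so at least one chain holds more than one entry, yet every lookup still only scans a single chain.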
Dealing with collisions Linear probing One problem is resizing
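A sketch of linear probing that also shows where resizing comes in: every key lives in the array itself, so the table has to grow (and rehash everything) before it fills up. The 1/2 load-factor threshold and the doubling policy are assumptions made for this example, and deletion is left out because it needs extra care (for instance tombstones).

```python
class LinearProbingTable:
    """Open-addressing hash table: on a collision, try the next slot."""

    def __init__(self, capacity: int = 8):
        self.capacity = capacity
        self.slots = [None] * capacity   # each slot is None or a (key, value) pair
        self.n = 0

    def _probe(self, key) -> int:
        """Return the index holding key, or the first empty slot on its probe path."""
        i = hash(key) % self.capacity
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % self.capacity  # step to the next slot, wrapping around
        return i

    def put(self, key, value) -> None:
        # Keep the table at most half full so a probe always finds an empty slot.
        if (self.n + 1) * 2 > self.capacity:
            self._resize(2 * self.capacity)
        i = self._probe(key)
        if self.slots[i] is None:
            self.n += 1
        self.slots[i] = (key, value)

    def get(self, key, default=None):
        slot = self.slots[self._probe(key)]
        return default if slot is None else slot[1]

    def _resize(self, new_capacity: int) -> None:
        # Every entry must be re-inserted because its index depends on capacity.
        entries = [s for s in self.slots if s is not None]
        self.capacity = new_capacity
        self.slots = [None] * new_capacity
        for key, value in entries:
            self.slots[self._probe(key)] = (key, value)

table = LinearProbingTable()
for i in range(100):
    table.put(f"key{i}", i)
```

Each resize costs time proportional to the number of stored entries, which is the price paid for keeping probe sequences short.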
Perfect hashing
Best hashing algorithm is…. Depends
Questions
1. What is the main difference between secure hashing and min-hashing / universal hashing?
2. What type of mapping function (secure or universal) would you use for these tasks:
   Converting a password to a hash code
   Storing items in a hash table
   Implementing a Bloom filter
   Implementing a cryptographic hashing algorithm
3. Describe one example from class where hashing was used.
4. Describe one of the hash table implementations that deals with collisions.
5. What is the main constraint for a perfect hashing algorithm to work?
6. From the slides: what do the values a, b and m represent?