Improving Usability Through Password-Corrective Hashing
Andrew Mehler, Steven Skiena
Stony Brook University
13 October
Password Authentication
[Figure: User Entry (mehler1979) =? Password Registry]
Password Authentication: Users Not Perfect!
Users enter the wrong password because:
o They can’t remember it.
o They make data entry errors (one every 30 keystrokes).
[Figure: User Entry (mehler1997) =? Password Registry (mehler1979)]
Should passwords with entry errors be accepted?
o Increases usability.
o Accepting ‘close enough’ strings costs little security.
o Users will choose stronger passwords.
o Users won’t write down their passwords.
Idea: Accept passwords that differ from the registered one by a single error (substitution or transposition).
  Transposition: student -> studnet
  Substitution: student -> studint
PROBLEM: How to implement this?
Solution 1: Repeated Login
For an entered password, simulate a login with every string that differs from it by a single transposition or substitution.
[Figure: User Entry ‘aba’ is expanded to aba, baa, aab, abb, … and each candidate is tested (=?) against the registry]
PROBLEMS
o Requires n-1 attempts for transpositions.
o Requires about n*m attempts for substitutions (n(m-1) distinct variants).
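
The cost of repeated login can be made concrete by enumerating the candidate strings. The sketch below is illustrative only: the function names are made up here, and it assumes "transposition" means swapping two adjacent characters and that the alphabet is the 26 lowercase letters.

```python
import string

def single_transpositions(s):
    """Strings reachable from s by swapping one pair of adjacent characters."""
    variants = set()
    for i in range(len(s) - 1):
        swapped = s[:i] + s[i + 1] + s[i] + s[i + 2:]
        if swapped != s:  # swapping two equal characters changes nothing
            variants.add(swapped)
    return variants

def single_substitutions(s, alphabet):
    """Strings reachable from s by replacing one character with a different one."""
    variants = set()
    for i, original in enumerate(s):
        for replacement in alphabet:
            if replacement != original:
                variants.add(s[:i] + replacement + s[i + 1:])
    return variants

entry = "student"  # n = 7, m = 26
transposed = single_transpositions(entry)
substituted = single_substitutions(entry, string.ascii_lowercase)
print(len(transposed), len(substituted))  # 6 175, i.e. n-1 and n(m-1)
```

Every one of these candidates would need its own simulated login, which is the overhead this slide lists as the problem.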
Solution 2: Check Equivalence
Compare the entered password with the password on file not just for equality, but also for whether it differs by a transposition or substitution.
[Figure: User Entry =? / trans? / sub? Password Registry]
PROBLEMS
o The password registry is not plain text!
o Can’t apply transpositions or substitutions to encrypted passwords.
o Equality is really encrypted equality.
Solution 3: Store All Variants
For each user, store in the encrypted file the password and all of its acceptable variations.
[Figure: User Entry ‘aba’ =? Password Registry holding aba, aab, baa]
PROBLEMS
o Registry file will be large.
o Malicious decryption is easier.
Our Solution: Corrective Hashing
Reduce the password space with a correcting hash function h.
o Solves the problems of the previous methods.
o Trade-off: loss of recall and increase of false positives.
[Figure: h is applied to both the User Entry and the Password Registry before the =? comparison; Mehler1979 and Mehler1997 both map to Meh]
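
One way to picture the scheme: the correcting function runs before the usual one-way hash, both at registration and at login, so the registry never holds plain text and only one attempt is needed. The sketch below is a minimal illustration under loud assumptions: character sorting stands in for h, unsalted SHA-256 stands in for the registry's one-way function (a real system would use a salted, deliberately slow password hash), and `registry`, `register` and `login` are hypothetical names.

```python
import hashlib

def corrective_hash(password):
    # Assumption: sorting the characters plays the role of the correcting function h.
    return "".join(sorted(password))

def stored_digest(password):
    # Assumption: unsalted SHA-256 keeps the sketch short; not suitable for production.
    corrected = corrective_hash(password)
    return hashlib.sha256(corrected.encode("utf-8")).hexdigest()

registry = {}  # hypothetical in-memory password registry

def register(user, password):
    registry[user] = stored_digest(password)

def login(user, entry):
    # A single comparison: the entry is corrected and hashed exactly like the stored value.
    return registry.get(user) == stored_digest(entry)

register("mehler", "flapjack")
print(login("mehler", "flpajack"))  # True: a transposition error is accepted
print(login("mehler", "pancake"))   # False: an unrelated string is rejected
```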
Password Corrective Hashing
o Want to accept mistakes (recall): h(flpajack) = h(flapjack)
o Don’t accept other strings (false positive rate): h(pancake) ≠ h(flapjack)
We separately consider correcting single transposition errors and single substitution errors (the most common entry error types).
Notation
o n = password (string) length
o m = alphabet size
Previous Work
o Phonetic Hashing (Soundex, Metaphone, etc.): h(Smith) = S43 = h(Smyth)
o SAMBA: repeated login to relax case and character order.
o Personal Question Answering.
o Semantic Pass-Phrase.
Correcting Transposition Errors
Idea: Sort the characters of a password.
  h(flpajack) = aacfjklp = h(flapjack)
o Sorting a string imposes its own order.
o All strings differing by a transposition are the same when sorted, so Recall = 1.
o But there are many false positives: h(erika) = aeikr = h(keira)
Theorem: No other method will have fewer false positives with perfect recall.
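
The sorting hash and both of its properties can be checked in a few lines. This is a minimal sketch: `sort_hash` and `adjacent_swaps` are names chosen here, and recall is measured over adjacent transpositions of one example password only.

```python
def sort_hash(password):
    """Corrective hash h: the characters of the password in sorted order."""
    return "".join(sorted(password))

def adjacent_swaps(s):
    """All n-1 strings obtained by swapping two neighbouring characters of s."""
    return [s[:i] + s[i + 1] + s[i] + s[i + 2:] for i in range(len(s) - 1)]

password = "flapjack"
variants = adjacent_swaps(password)
accepted = sum(sort_hash(v) == sort_hash(password) for v in variants)
recall = accepted / len(variants)

print(sort_hash("flpajack"))                    # aacfjklp
print(recall)                                   # 1.0
print(sort_hash("erika") == sort_hash("keira")) # True: an anagram is a false positive
```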
Proof
Assume some method M with recall_M = 1 and fp_M < fp_Sort.
Then there are strings S, T such that Sort(S) = Sort(T) but M(S) ≠ M(T).
Thus there exists a sequence S, s_1, s_2, …, s_j, T with each string differing from the next by a transposition
(example: keira, ekira, eikra, eirka, erika).
Since M(S) ≠ M(T), there is some i such that M(s_i) ≠ M(s_{i+1}), contradicting M’s perfect recall.
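
The sequence the proof relies on can be constructed explicitly: any two anagrams are connected by adjacent transpositions, in the same way a bubble sort moves elements into place. The sketch below is one such construction, not necessarily the one the authors had in mind; `transposition_chain` is a hypothetical helper, and the chain it finds for keira and erika differs from the slide's example while being equally valid.

```python
def transposition_chain(source, target):
    """Return a list of strings from source to target in which
    neighbouring strings differ by one adjacent transposition."""
    if sorted(source) != sorted(target):
        raise ValueError("source and target must be anagrams")
    current = list(source)
    chain = [source]
    for i, wanted in enumerate(target):
        # Find the wanted character at or after position i, then bubble it left.
        j = current.index(wanted, i)
        while j > i:
            current[j - 1], current[j] = current[j], current[j - 1]
            j -= 1
            chain.append("".join(current))
    return chain

chain = transposition_chain("keira", "erika")
print(chain)  # ['keira', 'ekira', 'ekria', 'erkia', 'erika']
```

A method with perfect recall must give every neighbouring pair in such a chain the same hash, so it must give the two endpoints the same hash as well.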
Partial Sorting
o Sorting’s high false positive rate makes it insecure.
o Can we get a lower false positive rate with almost as good recall?
We consider 2 methods that partially sort a string:
o Sorting Networks
o Block Sorting
[Figure: a sorting network and a block sort applied to example strings over the letters a to d]
Sorting Networks
[Figure: stages of a sorting network]
o Correct transpositions.
o Impose some order on the string, up to completely sorted.
o Take the output of any stage as an operating point.
Sorting Network Analysis
1-stage
o All even transpositions are corrected. Recall is
2-stage
o All even transpositions are still corrected.
o Some odd transpositions are corrected as well.
o Consider ‘abcd’ and ‘acbd’: they are hashed together if a ≤ min(b, c) and max(b, c) ≤ d.
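
The slides do not spell out the network layout, so the sketch below assumes an odd-even transposition network: even-numbered stages compare positions (0,1), (2,3), …, and odd-numbered stages compare (1,2), (3,4), …. The name `network_hash` is hypothetical. Under that assumed layout, a 1-stage network fixes the floor(n/2) swaps that fall inside a compared pair out of the n-1 possible adjacent swaps; that tally is derived here from the assumption and is not a figure taken from the slide.

```python
def network_hash(password, stages):
    """Run the first `stages` stages of an odd-even transposition network
    (an assumed layout) and return the partially sorted string."""
    chars = list(password)
    for stage in range(stages):
        start = stage % 2  # stage 0: pairs (0,1),(2,3)...; stage 1: pairs (1,2),(3,4)...
        for i in range(start, len(chars) - 1, 2):
            if chars[i] > chars[i + 1]:
                chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

# 1 stage: the swap in 'flpajack' falls inside a compared pair, so it is corrected.
print(network_hash("flpajack", 1) == network_hash("flapjack", 1))  # True

# 'abcd' vs 'acbd' is an odd transposition: missed by 1 stage, caught by 2.
print(network_hash("acbd", 1) == network_hash("abcd", 1))  # False
print(network_hash("acbd", 2) == network_hash("abcd", 2))  # True

# When the outer symbols do not bracket the middle pair, 2 stages are not enough.
print(network_hash("1503", 2) == network_hash("1053", 2))  # False
```

Running as many stages as there are characters gives the fully sorted string, so the stage count is the knob that trades recall against false positives.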
Block Sorting
o Partition the string into substrings (blocks), and sort each substring.
o Corrects all transposition errors except those occurring across substrings.
Block Sorting Analysis
o Does not correct transpositions across block boundaries.
o Recall = (n-k)/(n-1), where k is the number of blocks.
o A false positive occurs if each block is hashed together under complete sorting:
  fp = 2^(k-1) ∏(fp_sort(n_i) + tp_sort(n_i)) + ∑ fp_sort(n_i) m^(n-n_i), where n_i is the length of block i.
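
The recall formula follows from counting: of the n-1 adjacent positions, k-1 straddle a block boundary and the remaining n-k lie inside a block. The sketch below checks this on one example password. It assumes blocks of near-equal length, and `block_sort_hash` and `measured_recall` are names chosen here.

```python
from fractions import Fraction

def block_sort_hash(password, k):
    """Split the password into k near-equal blocks and sort each block separately."""
    n = len(password)
    bounds = [i * n // k for i in range(k + 1)]
    return "".join(
        "".join(sorted(password[lo:hi])) for lo, hi in zip(bounds, bounds[1:])
    )

def measured_recall(password, k):
    """Fraction of adjacent transpositions of `password` that hash to the same value."""
    n = len(password)
    target = block_sort_hash(password, k)
    corrected = 0
    for i in range(n - 1):
        swapped = password[:i] + password[i + 1] + password[i] + password[i + 2:]
        corrected += block_sort_hash(swapped, k) == target
    return Fraction(corrected, n - 1)

print(block_sort_hash("flapjack", 2))  # aflpacjk
print(measured_recall("flapjack", 2))  # 6/7, matching (n-k)/(n-1) for n=8, k=2
print(measured_recall("flapjack", 4))  # 4/7
```

The measurement matches the formula here because no two neighbouring characters of the example are equal; a swap of identical neighbours leaves the string unchanged and would count as corrected anywhere.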
Example Domains
Application    Password Length (n)    Alphabet Size (m)
Logins
WEP Key
SSN            9                      10
Credit Card    16                     10
Names          7                      26
Correcting Transposition Results
Conclusion: Block Sorting can be used to match passwords, except on small alphabets.
Correcting Substitution Errors
Hi/Low Weakening
o Partition the alphabet into two sets.
o Ex: Low = [0-4], High = [5-9]; a four-digit entry hashes to a pattern such as LHHH.
o Recall = (k(k-1) + (m-k)(m-k-1)) / (m(m-1)), where k is the size of one of the two sets.
Weak Set
o A subset of the alphabet is the weak set.
o All members of the weak set get hashed to the same symbol.
o Ex: Weak-Set = {a,e,i,o,u}: Lawrence -> L.wr.nc.
o Recall = k(k-1) / (m(m-1)), where k is the size of the weak set.
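
Both substitution hashes and their recall formulas fit in a short sketch. The function names are hypothetical, and the digit string "1589" is an input invented here to produce the LHHH pattern. Recall is read as the chance that a uniformly random single-character substitution maps to the same symbol, which is what the two formulas count.

```python
from fractions import Fraction

def hi_low_hash(entry, low):
    """Map every character to 'L' if it is in the low set, else to 'H'."""
    return "".join("L" if c in low else "H" for c in entry)

def weak_set_hash(entry, weak, symbol="."):
    """Collapse every member of the weak set to one symbol; keep other characters."""
    return "".join(symbol if c in weak else c for c in entry)

def hi_low_recall(m, k):
    # A substitution is corrected when old and new characters fall in the same set.
    return Fraction(k * (k - 1) + (m - k) * (m - k - 1), m * (m - 1))

def weak_set_recall(m, k):
    # A substitution is corrected only when both characters are in the weak set.
    return Fraction(k * (k - 1), m * (m - 1))

print(hi_low_hash("1589", set("01234")))        # LHHH
print(weak_set_hash("Lawrence", set("aeiou")))  # L.wr.nc.
print(hi_low_recall(10, 5))                     # 4/9 for digits split evenly
print(weak_set_recall(26, 5))                   # 2/65 for vowels among 26 letters
```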
Weak Set Results
Conclusion: Too insecure for usability gains.
Substitution Results
Crack Lists
o The previous analysis assumed a uniform distribution of passwords.
o Users tend to use dictionary words.
o One common way of breaking into systems is to use a ‘crack’ list of common words and names that might appear in a password.
How much smaller would the crack list be if corrective hashing were used?
Example with h = sorting: erika, keira -> aeikr; last, salt -> alst
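
The question can be answered for any word list by counting distinct hash values: words that collide under h need to be tried only once. The sketch below uses a five-word list invented for illustration; it is not the data behind the reported results, and `crack_list_reduction` is a hypothetical name.

```python
def sort_hash(word):
    return "".join(sorted(word))

def crack_list_reduction(words, h):
    """Fraction by which the crack list shrinks once words with equal hashes are merged."""
    unique_words = set(words)
    hashed = {h(w) for w in unique_words}
    return 1 - len(hashed) / len(unique_words)

# Assumption: a toy list; a real crack list would hold many thousands of entries.
toy_list = ["erika", "keira", "last", "salt", "student"]
print(sorted({sort_hash(w) for w in toy_list}))  # ['aeikr', 'alst', 'densttu']
print(crack_list_reduction(toy_list, sort_hash))  # 0.4
```

A toy list packed with anagrams exaggerates the effect; dictionary-sized lists contain proportionally far fewer collisions.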
Crack Lists
o < 13% reduction of the crack list for complete sorting.
o < 1% reduction of the crack list for 50% recall.
Conclusions
o Usability is increased with a small security trade-off when correcting transposition errors.
o Substitution errors are harder to correct.
o Crack list computational cost is not significantly decreased.
Open Problems
o Better hash functions?
o Correcting insertion/deletion errors?
o Empirical usability experiments?