Theory of Computation II Topic presented by: Alberto Aguilar Gonzalez
Problem: You are designing a banking application that will be accessed by thousands of users. Password security is a key factor: passwords must be protected from people both outside and inside the organization. How do you store passwords in the database?
One approach: Encrypt passwords using a key. When the information is needed, decrypt it using the same key! Example (very simple): given a character, encrypt it by replacing it with another. What is the idea? (Table columns: Character, ASCII code, Encrypted; surviving entries: A, B.) IDEA: “hi” = decrypt(encrypt(“hi”))
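The round-trip idea on this slide can be sketched as follows. This is only a minimal illustration, assuming the "key" is a fixed shift of the ASCII code; the slide's own mapping table did not survive, so `KEY`, `encrypt` and `decrypt` are names chosen for this sketch.

```python
KEY = 3  # hypothetical key: shift every ASCII code by 3


def encrypt(text, key=KEY):
    """Replace each character by the one `key` positions later (mod 128)."""
    return "".join(chr((ord(c) + key) % 128) for c in text)


def decrypt(text, key=KEY):
    """Undo encrypt() by shifting each character back with the same key."""
    return "".join(chr((ord(c) - key) % 128) for c in text)


print(encrypt("hi"))           # kl
print(decrypt(encrypt("hi")))  # hi
```

Anyone who holds `KEY` can run `decrypt` on every stored value, which is exactly the weakness of this approach.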
What is the problem with this approach? If someone accesses this database and knows the key (even people from IT or support), all passwords are revealed!
User      Pwd (encrypted using a key)
aagui003  bbhrt
aaoni001  jhlkhj
A better approach: one-way hash functions (the subject of this talk).
One-way function: A function y = f(x) is one-way if it is easy to compute y from x but “hard” to compute x from y. However, nobody has proved that such functions exist! A possible definition: f(x) can be computed in polynomial time, while computing $f^{-1}(y)$ is NP-hard.
An example of a one-way function: Unique factorization theorem: every integer greater than 1 has a unique factorization as a product of primes (up to the order of the factors). Factoring: given two large prime numbers u, v, consider y = f(u, v) = u * v. It is computable in polynomial time. However, given y, can we calculate u and v easily? NO: no polynomial-time algorithm for this is known.
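The asymmetry can be made concrete with a small sketch. It assumes plain trial division as the inversion method, chosen here only for illustration; real factoring algorithms are faster, but none is known to run in polynomial time. The function names are hypothetical.

```python
def multiply(u, v):
    """Forward direction: a single multiplication."""
    return u * v


def factor_semiprime(y):
    """Inverse direction by trial division: up to about sqrt(y) divisions."""
    if y % 2 == 0:
        return 2, y // 2
    d = 3
    while d * d <= y:
        if y % d == 0:
            return d, y // d
        d += 2
    raise ValueError("y is not a product of two primes")


y = multiply(1009, 1013)    # toy-sized primes
print(y)                    # 1022117
print(factor_semiprime(y))  # (1009, 1013)
```

With trial division, every extra bit in the primes roughly doubles the work of `factor_semiprime`, while `multiply` stays cheap.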
Hash function: Maps a message of variable length m to a fingerprint of fixed length n bits, with m >= n. Fundamental properties: compression; easy to compute. It can be used to detect changes, since a modification (even of a single bit) would almost certainly change the hash value.
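Both properties can be observed with Python's standard `hashlib` module. The sketch below uses SHA-1 only because the talk mentions it later; `fingerprint` is a name chosen for this example.

```python
import hashlib


def fingerprint(message: bytes) -> str:
    """Return the SHA-1 digest of `message` as 40 hex characters (160 bits)."""
    return hashlib.sha1(message).hexdigest()


short = fingerprint(b"hi")
long_ = fingerprint(b"hi" * 100_000)  # a 200,000-byte message
flipped = fingerprint(b"hh")          # 'i' (0x69) and 'h' (0x68) differ in one bit

print(len(short), len(long_))  # 40 40
print(short == flipped)        # False
```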
One-way hash functions: y = h(x), where: given x, calculating h(x) is easy; given y, calculating any x such that y = h(x) is hard; AND y has a fixed length independent of the size of x (a compression function is needed for large inputs). (Figure: Input -> Output)
Two questions: Is it easy to come up with new one-way hash functions? What do we need to build such functions? Easy to compute (in general, the algorithm is public); hard to invert ($2^n$ different outputs!); compression function; collision resistant.
Collision: Given $x_1 \neq x_2$ and a hash function h, a collision exists if $h(x_1) = h(x_2)$. Is this possible? YES. Why? It is a many-to-one function: the input domain is larger than the output domain. Therefore, good one-way hash functions should be collision resistant! Collision resistant?
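A deliberately weak toy hash makes the many-to-one argument visible. `toy_hash` below is an illustrative assumption (sum of bytes modulo 256), not a function from the talk.

```python
def toy_hash(message: bytes) -> int:
    """Toy 8-bit hash: sum of the bytes modulo 256. Not collision resistant."""
    return sum(message) % 256


x1, x2 = b"ab", b"ba"
print(toy_hash(x1) == toy_hash(x2))  # True: same bytes, different order

# Pigeonhole: 257 distinct inputs cannot all land on distinct 8-bit outputs.
inputs = [str(i).encode() for i in range(257)]
outputs = {toy_hash(x) for x in inputs}
print(len(outputs) < len(inputs))    # True
```

Collisions always exist for any hash with fixed-length output; collision resistance only means they are hard to find.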
The birthday paradox: Consider the probability $Q_1(n, d)$ that no two people out of a group of n will have matching birthdays out of d equally possible birthdays: $Q_1(n, d) = \frac{d!}{(d-n)!\, d^n}$. In general, let $Q_i(n, d)$ denote the probability that a birthday is shared by exactly i people out of a group of n people; then the probability that a birthday is shared by k or more people is $P_k(n, d) = 1 - \sum_{i=1}^{k-1} Q_i(n, d)$. Probability that two do have the same birthday: $P_2(n, d) = 1 - Q_1(n, d)$.
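The probability of at least one shared birthday can be evaluated directly as one minus the probability that all n birthdays are distinct. The following is a minimal sketch; the function names are chosen for this example.

```python
def q1(n, d=365):
    """Probability that n people all have distinct birthdays among d days."""
    p = 1.0
    for i in range(n):
        p *= (d - i) / d
    return p


def p_match(n, d=365):
    """Probability that at least two of n people share a birthday."""
    return 1.0 - q1(n, d)


def min_people(target=0.5, d=365):
    """Smallest group size whose match probability reaches `target`."""
    n = 1
    while p_match(n, d) < target:
        n += 1
    return n


print(min_people())            # 23
print(round(p_match(23), 4))   # 0.5073
```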
…birthday paradox: An approximation for the minimum number of people needed to get a 50-50 chance that two have a match within k days out of d possible is given by $n(k, d) \approx 1.2\sqrt{d/(2k+1)}$ (Sevast'yanov 1972, Diaconis and Mosteller 1989). How many people do we need in this classroom for a 50-50 chance? What about OWHFs?
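For the classroom question, the approximation can be evaluated numerically. The sketch assumes the commonly quoted form 1.2 * sqrt(d / (2k + 1)) of the Sevast'yanov / Diaconis-Mosteller estimate, since the slide's own formula did not survive; `approx_people` is a hypothetical name.

```python
import math


def approx_people(d, k=0):
    """Approximate group size for a 50-50 chance of a match within k days of d."""
    return 1.2 * math.sqrt(d / (2 * k + 1))


print(math.ceil(approx_people(365)))       # 23 (exact same-day match, k = 0)
print(math.ceil(approx_people(365, k=1)))  # 14 (birthdays within one day)
```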
Birthday attacks for OWHFs: Given y = h(x), where y has a fixed length of n bits, $2^n$ different outputs are possible. Since x is of variable length and |x| > |y| in some cases, h(x) is a many-to-one function! How many attempts are necessary to find $x_1 \neq x_2$ with $h(x_1) = h(x_2)$ (probability of success >= 0.5)? Use the formula we just explained, with $d = 2^n$ and k = 0.
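A birthday attack is easy to try when n is small. The sketch below is an illustration under stated assumptions: it truncates SHA-1 to 24 bits so that a collision shows up after a few thousand attempts, and it uses a simple counter as the source of inputs. Both choices, and the function names, are made up for this example.

```python
import hashlib


def truncated_hash(message: bytes, n_bits: int = 24) -> int:
    """Keep only the first n_bits of the 160-bit SHA-1 digest."""
    digest = hashlib.sha1(message).digest()
    return int.from_bytes(digest, "big") >> (160 - n_bits)


def birthday_attack(n_bits: int = 24):
    """Hash distinct inputs until two of them share a truncated hash."""
    seen = {}  # truncated hash -> input that produced it
    attempts = 0
    while True:
        x = str(attempts).encode()
        attempts += 1
        y = truncated_hash(x, n_bits)
        if y in seen:
            return seen[y], x, attempts
        seen[y] = x


x1, x2, attempts = birthday_attack(24)
print(x1, x2, attempts)
print(truncated_hash(x1) == truncated_hash(x2))  # True
```

The number of attempts is typically on the order of sqrt(2^24) = 4096, far below the 2^24 outputs, which is the point of the attack.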
To be collision resistant, how big should n be? 64 bits is now regarded as too small; larger output lengths are proposed.
Output length   n(d)
64 bits
128 bits
160 bits
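The attack cost for each output length can be estimated from the birthday approximation with d = 2^n and k = 0, that is, about 1.2 * 2^(n/2) hash evaluations. The numbers below are computed from that assumption and are not taken from the original slide; `birthday_bound` is a name chosen here.

```python
def birthday_bound(n_bits):
    """Approximate attempts for a 50% collision chance: 1.2 * sqrt(2**n_bits)."""
    return 1.2 * 2 ** (n_bits / 2)


for n_bits in (64, 128, 160):
    print(f"{n_bits:>3} bits: about {birthday_bound(n_bits):.2e} attempts")
# Output:
#  64 bits: about 5.15e+09 attempts
# 128 bits: about 2.21e+19 attempts
# 160 bits: about 1.45e+24 attempts
```

Roughly five billion attempts for a 64-bit output is within reach of ordinary hardware, which is consistent with the slide calling 64 bits too small.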
General structure of OWHFs: arbitrary-length input -> iterated compression function -> fixed-length output -> optional output transformation -> output.
Details:
Preprocessing: original input x -> append padding bits -> append length block -> formatted input $x = x_1 x_2 \ldots x_t$.
Iterated processing: $H_0 = IV$; $H_i = f(H_{i-1}, x_i)$ for $1 \le i \le t$, where f is the compression function.
Output: $h(x) = g(H_t)$.
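The iterated structure can be sketched in a few lines. Everything concrete here is a toy assumption made for illustration: the 8-byte block size, the 4-byte chaining value, the all-zero IV, the compression function f (borrowed from a truncated SHA-1 call) and the identity output transformation g. It is not how MD5 or SHA-1 are defined internally.

```python
import hashlib
import struct

BLOCK = 8          # toy block size in bytes
IV = b"\x00" * 4   # H_0: toy 4-byte initial chaining value


def pad(x: bytes) -> bytes:
    """Preprocessing: append padding bits, then an 8-byte length block."""
    padded = x + b"\x80"
    padded += b"\x00" * (-len(padded) % BLOCK)
    return padded + struct.pack(">Q", len(x))


def f(h_prev: bytes, block: bytes) -> bytes:
    """Toy compression function: 4 + 8 bytes in, 4 bytes out."""
    return hashlib.sha1(h_prev + block).digest()[:4]


def g(h_t: bytes) -> bytes:
    """Optional output transformation; the identity in this sketch."""
    return h_t


def md_hash(x: bytes) -> bytes:
    """h(x) = g(H_t), with H_i = f(H_{i-1}, x_i) over the formatted input."""
    formatted = pad(x)
    h = IV
    for i in range(0, len(formatted), BLOCK):
        h = f(h, formatted[i:i + BLOCK])
    return g(h)


print(md_hash(b"hi").hex(), md_hash(b"hi" * 1000).hex())
```

Whatever the input length, the loop carries only the fixed-size chaining value forward, which is how the construction produces a fixed-length output.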
Two known OWHFs: MD5: by Ronald Rivest (the R in RSA) [1992]; produces a 128-bit hash value. MD5 is widely used; however, collisions were found (Wang, 2004). SHA-1: designed by the NSA and published as a standard by the National Institute of Standards and Technology (NIST), seen as an “upgrade” from MD5; produces 160-bit hash values.
Going back to our problem: Save a pair (user, hash of password). Now, if somebody (inside or outside) accesses the passwords table, each entry has to be attacked individually! An authentication algorithm would look as follows:
if MD5(passw_typed) == hash_of_passw
    CorrectPassword = true
else
    CorrectPassword = false
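The authentication check above could look as follows in Python. This is a sketch that follows the slide and therefore uses MD5 from the standard `hashlib` module; the in-memory `passwords` table, the sample user and password, and the function names are hypothetical.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the MD5 digest of the password as a hex string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Hypothetical passwords table: user -> stored hash (never the password itself).
passwords = {"aagui003": hash_password("secret123")}


def check_password(user: str, passw_typed: str) -> bool:
    """Hash the typed password and compare it with the stored hash."""
    stored = passwords.get(user)
    if stored is None:
        return False
    # compare_digest takes the same time wherever the two strings differ
    return hmac.compare_digest(hash_password(passw_typed), stored)


print(check_password("aagui003", "secret123"))  # True
print(check_password("aagui003", "guess"))      # False
```

Given the MD5 collisions mentioned earlier, and because MD5 is very fast to compute, current practice generally prefers salted, deliberately slow functions such as `hashlib.pbkdf2_hmac`; the structure of the check stays the same.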
Other uses: digital signatures; antivirus; software validation; password storage in some Linux implementations.
Thank you. Questions?