Information Theory and Security pt. 2
Lecture Motivation
The previous lecture introduced a way to measure “information”. In this lecture, our objective is to use this concept of “information” to convey what it means to be “secure”.
Lecture Outline
Probability Review: Conditional Probability and Bayes
Secrecy and Information Theory
– Probabilistic definitions of a cryptosystem
– Perfect Secrecy
Another way to look at cryptography, pg. 1
So far in class, we have looked at the security problem from an algorithmic point of view (DES, RC4, RSA, …). But why build these algorithms? How can we say we are doing a good job? Enter information theory and its relationship to ciphers…
Suppose we have a cipher with possible plaintexts P, ciphertexts C, and keys K.
– Suppose that a plaintext P is chosen according to a probability law.
– Suppose the key K is chosen independently of P.
– The resulting ciphertexts have various probabilities depending on the probabilities for P and K.
Another way to look at cryptography, pg. 2
Now, enter Eve… She sees the ciphertext C and several security questions arise:
– Does she learn anything about P from seeing C?
– Does she learn anything about the key K from seeing C?
Thus, our questions are associated with H(P|C) and H(K|C). Ideally, we would like the uncertainty not to decrease, i.e.
H(P|C) = H(P)
H(K|C) = H(K)
Another way to look at cryptography, pg. 3
Example: Suppose we have three plaintexts {a, b, c} with probabilities {0.5, 0.3, 0.2}. Suppose we have two keys k1 and k2, each with probability 0.5. Suppose there are three ciphertexts U, V, W, and encryption is given by:
E_k1(a) = U   E_k1(b) = V   E_k1(c) = W
E_k2(a) = U   E_k2(b) = W   E_k2(c) = V
We may calculate the probabilities of the ciphertexts. Since the key is independent of the plaintext and both keys send a to U:
p_C(U) = p_K(k1) p_P(a) + p_K(k2) p_P(a) = (0.5)(0.5) + (0.5)(0.5) = 0.5
Similarly we get p_C(V) = 0.25 and p_C(W) = 0.25.
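The ciphertext probabilities in this example can be checked with a few lines of Python. This is only a sketch: the dictionary names (p_plain, p_key, encrypt) and the helper ciphertext_distribution are illustrative choices, not notation from the lecture.

```python
from collections import defaultdict

# Distributions and encryption table copied from the example on this slide.
p_plain = {"a": 0.5, "b": 0.3, "c": 0.2}
p_key = {"k1": 0.5, "k2": 0.5}
encrypt = {
    ("k1", "a"): "U", ("k1", "b"): "V", ("k1", "c"): "W",
    ("k2", "a"): "U", ("k2", "b"): "W", ("k2", "c"): "V",
}

def ciphertext_distribution(p_plain, p_key, encrypt):
    """Sum p_K(k) * p_P(x) over every (key, plaintext) pair mapping to each ciphertext."""
    p_cipher = defaultdict(float)
    for (k, x), c in encrypt.items():
        # The product is valid because the key is chosen independently of the plaintext.
        p_cipher[c] += p_key[k] * p_plain[x]
    return dict(p_cipher)

p_cipher = ciphertext_distribution(p_plain, p_key, encrypt)
for c in sorted(p_cipher):
    print(c, round(p_cipher[c], 4))
```

The same function works for any finite cipher given as a lookup table, as long as the key really is independent of the plaintext.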
Another way to look at cryptography, pg. 4
Suppose Eve observes the ciphertext U; then she knows the plaintext was “a”. We may calculate the conditional probabilities using Bayes. Only key k1 sends b to V, so:
p_P(b|V) = p_P(b) p_K(k1) / p_C(V) = (0.3)(0.5) / 0.25 = 0.6
Similarly we get p_P(c|V) = 0.4 and p_P(a|V) = 0. Also p_P(a|W) = 0, p_P(b|W) = 0.6, p_P(c|W) = 0.4.
What does this tell us? Remember, the original plaintext probabilities were 0.5, 0.3, and 0.2. So, if we see a ciphertext, then we may revise the probabilities… Something is “learned”.
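Eve's revised probabilities can be reproduced by applying Bayes to the joint distribution of plaintext and ciphertext. The sketch below assumes the same example cipher; the function name posterior and the dictionary layout are hypothetical.

```python
# Example cipher from the slides: plaintext and key distributions plus the encryption table.
p_plain = {"a": 0.5, "b": 0.3, "c": 0.2}
p_key = {"k1": 0.5, "k2": 0.5}
encrypt = {
    ("k1", "a"): "U", ("k1", "b"): "V", ("k1", "c"): "W",
    ("k2", "a"): "U", ("k2", "b"): "W", ("k2", "c"): "V",
}

def posterior(ciphertext):
    """Return p_P(x | ciphertext) for every plaintext x, via Bayes."""
    # Joint probability p(x, c): sum over the keys that encrypt x to the observed c.
    joint = {
        x: sum(p_key[k] * p_plain[x] for k in p_key if encrypt[(k, x)] == ciphertext)
        for x in p_plain
    }
    p_c = sum(joint.values())  # marginal probability of the observed ciphertext
    return {x: joint[x] / p_c for x in joint}

for c in ("U", "V", "W"):
    print(c, {x: round(p, 4) for x, p in posterior(c).items()})
```

Comparing each posterior with the prior {0.5, 0.3, 0.2} shows exactly how Eve's beliefs shift after she observes a ciphertext.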
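Another way to look at cryptography, pg. 5
We use entropy to quantify the amount of information that is learned about the plaintext when the ciphertext is observed. The conditional entropy of P given C is
H(P|C) = - Σ_c p_C(c) Σ_x p_P(x|c) log2 p_P(x|c) = 0.5(0) + 0.25(0.971) + 0.25(0.971) ≈ 0.485 bits
By comparison, H(P) = -(0.5 log2 0.5 + 0.3 log2 0.3 + 0.2 log2 0.2) ≈ 1.485 bits. Thus an entire bit of information is revealed just by observing the ciphertext!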
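The entropy figures for this example can be verified numerically. The sketch below computes H(P|C) as H(P,C) - H(C), which is the chain rule for entropy; the variable names and the entropy helper are illustrative assumptions.

```python
from collections import defaultdict
from math import log2

def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Example cipher from the slides.
p_plain = {"a": 0.5, "b": 0.3, "c": 0.2}
p_key = {"k1": 0.5, "k2": 0.5}
encrypt = {
    ("k1", "a"): "U", ("k1", "b"): "V", ("k1", "c"): "W",
    ("k2", "a"): "U", ("k2", "b"): "W", ("k2", "c"): "V",
}

# Joint distribution of (plaintext, ciphertext) and the ciphertext marginal.
p_joint = defaultdict(float)
p_cipher = defaultdict(float)
for (k, x), c in encrypt.items():
    p_joint[(x, c)] += p_key[k] * p_plain[x]
    p_cipher[c] += p_key[k] * p_plain[x]

h_p = entropy(p_plain.values())
h_p_given_c = entropy(p_joint.values()) - entropy(p_cipher.values())  # chain rule
leak = h_p - h_p_given_c  # bits Eve learns about P by observing C

print(f"H(P)   = {h_p:.4f} bits")
print(f"H(P|C) = {h_p_given_c:.4f} bits")
print(f"leak   = {leak:.4f} bits")
```

The difference H(P) - H(P|C) is the quantity the slide describes as the information revealed by the ciphertext.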
Perfect Secrecy and Entropy
The previous example gives us the motivation for the information-theoretic definition of security (or “secrecy”).
Definition: A cryptosystem has perfect secrecy if H(P|C) = H(P).
Theorem: The one-time pad has perfect secrecy.
Proof: See the book for the details. The basic idea is to show that each ciphertext occurs with equal likelihood. We then use manipulations like:
H(P,K,C) = H(P,K) = H(P) + H(K)
H(P,K,C) = H(P,C) = H(P|C) + H(C)
Equating these two and using H(K) = H(C) gives the result. Why?
The trick to information theory results
On the previous slide, the “Why?” question is where all the tricks to information theory reside. The basic idea is the following:
– H(X, f(X)) = H(X): This means that the joint uncertainty of X and some function of X is precisely the same as the uncertainty in just X.
– Another way to think of this: If you know X, then you automatically know f(X), so that doesn’t make you any more uncertain.
Back to H(P,K,C):
– C, the ciphertext, is a function of the plaintext P and the key K, i.e. C = f(P,K), for some encryption function f.
– Thus, H(P,K,C) = H(P,K, f(P,K)) = H(P,K).
– Now, P and K are independent, so H(P,K) = H(P) + H(K).
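The identity H(X, f(X)) = H(X) is easy to confirm on a small case. The distribution p_x and the function f below are arbitrary choices made up for illustration, not values from the lecture.

```python
from collections import defaultdict
from math import log2

def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Hypothetical distribution for X and a hypothetical (non-injective) function f.
p_x = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}

def f(x):
    return x % 2

# Each pair (x, f(x)) has the same probability as x alone, because f(x) is determined by x.
p_joint = {(x, f(x)): p for x, p in p_x.items()}

# Distribution of f(X) alone, for comparison.
p_fx = defaultdict(float)
for x, p in p_x.items():
    p_fx[f(x)] += p

h_x = entropy(p_x.values())
h_joint = entropy(p_joint.values())
h_fx = entropy(p_fx.values())

print(f"H(X)       = {h_x:.4f}")
print(f"H(X, f(X)) = {h_joint:.4f}")
print(f"H(f(X))    = {h_fx:.4f}")
```

The joint entropy matches H(X), while H(f(X)) can only be smaller or equal, since applying a function may merge outcomes but never splits them.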
The second part H(P,K,C) = H(P,C): Why?
– For the one-time pad, P (XOR) K = C, so K = C (XOR) P.
– Thus, P and C uniquely define K, or in other words K = g(P,C).
– Thus, H(P,K,C) = H(P,C, g(P,C)) = H(P,C).
Now, H(P,C) = H(P|C) + H(C) by the chain rule. Setting this equal to H(P) + H(K) and cancelling H(C) = H(K) leaves H(P|C) = H(P).
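The whole argument can be checked numerically on a toy one-time pad. The sketch below assumes 2-bit messages with a made-up, non-uniform plaintext distribution and a uniformly random 2-bit key; all names are illustrative.

```python
from collections import defaultdict
from math import log2

def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Assumed toy setting: 2-bit plaintexts (non-uniform) and a uniform 2-bit key.
p_plain = {0b00: 0.5, 0b01: 0.25, 0b10: 0.125, 0b11: 0.125}
p_key = {k: 0.25 for k in range(4)}

p_pkc = {}                    # joint distribution of (P, K, C)
p_pc = defaultdict(float)     # joint distribution of (P, C)
p_c = defaultdict(float)      # marginal distribution of C
for x, px in p_plain.items():
    for k, pk in p_key.items():
        c = x ^ k             # one-time pad encryption: C = P XOR K
        p_pkc[(x, k, c)] = px * pk
        p_pc[(x, c)] += px * pk
        p_c[c] += px * pk

h_p = entropy(p_plain.values())
h_k = entropy(p_key.values())
h_c = entropy(p_c.values())
h_pkc = entropy(p_pkc.values())
h_pc = entropy(p_pc.values())
h_p_given_c = h_pc - h_c      # chain rule: H(P,C) = H(P|C) + H(C)

print(f"H(P,K,C) = {h_pkc:.4f}, H(P)+H(K) = {h_p + h_k:.4f}, H(P,C) = {h_pc:.4f}")
print(f"H(K) = {h_k:.4f}, H(C) = {h_c:.4f}")
print(f"H(P) = {h_p:.4f}, H(P|C) = {h_p_given_c:.4f}")
```

Making the key distribution non-uniform in this sketch breaks the equality H(P|C) = H(P), which illustrates why the one-time pad requires a uniformly random key.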