DTTF/NB479: Dszquphsbqiz, Day 5. Announcements: Assignment 1 due tomorrow in class. Questions? Roll call. Today: Vigenere ciphers. Pronunciation?
Vigenere Ciphers: Idea: the key is a vector of shifts. Ex.: use a word like hidden (7 8 3 3 4 13). Example: "The recent development of various methods of" encrypts to "aph uiplvw giiltrsqrub ri znyqrxw zlbkrhf vn". Encryption: repeat the vector as many times as needed to get the same length as the plaintext, then add this repeated vector to the plaintext (mod 26). Demo.
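A minimal Python sketch of the encryption step above, checked against the slide's own example. The function names are made up for illustration, and the choice to skip spaces without using up a key position is an assumption that happens to reproduce the slide's ciphertext.

```python
import string

ALPHABET = string.ascii_lowercase


def key_to_shifts(key_word):
    """Turn a key word into its vector of shifts (a=0, b=1, ..., z=25)."""
    return [ALPHABET.index(ch) for ch in key_word.lower()]


def vigenere_encrypt(plaintext, key_word):
    """Add the repeated shift vector to the plaintext letters, mod 26."""
    shifts = key_to_shifts(key_word)
    out = []
    letters_seen = 0  # only letters advance the key; spaces pass through
    for ch in plaintext.lower():
        if ch in ALPHABET:
            shift = shifts[letters_seen % len(shifts)]
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
            letters_seen += 1
        else:
            out.append(ch)
    return "".join(out)


print(key_to_shifts("hidden"))
# [7, 8, 3, 3, 4, 13]
print(vigenere_encrypt("The recent development of various methods of", "hidden"))
# aph uiplvw giiltrsqrub ri znyqrxw zlbkrhf vn
```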
Source: The recent development of various methods of modulation such as PCM and PPM which exchange bandwidth for signal-to-noise ratio has intensified the interest in a general theory of communication. A basis for such a theory is contained in the important papers of Nyquist and Hartley on this subject. In the present paper we will extend the theory to include a number of new factors, in particular the effect of noise in the channel, and the savings possible due to the statistical structure of the original message and due to the nature of the final destination of the information. (Claude Shannon’s A Mathematical Theory of Communication)
Security: The vector isn’t known. Its length isn’t even known! With a shift cipher, the most frequent ciphertext letter probably corresponds to e. But here, e maps to H, I, L, … (spread out!). Consider 4 attacks: Known plaintext? Chosen plaintext? Chosen ciphertext? Ciphertext only?
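A small sketch of two of the points above, using the key hidden from the earlier example. It lists every ciphertext letter that e can become, and it shows one way a known-plaintext attack could proceed: subtracting plaintext from ciphertext exposes the repeating shifts. The helper names are hypothetical, and inputs are assumed to be lowercase letters only.

```python
def shifts_of(key_word):
    """Shift vector for a lowercase key word (a=0, ..., z=25)."""
    return [ord(ch) - ord("a") for ch in key_word]


def images_of_letter(letter, key_word):
    """All ciphertext letters that one plaintext letter can map to."""
    base = ord(letter) - ord("a")
    return sorted({chr((base + s) % 26 + ord("a")) for s in shifts_of(key_word)})


def recover_shift_stream(plain_letters, cipher_letters):
    """Known plaintext: ciphertext minus plaintext (mod 26) gives the shifts."""
    return [(ord(c) - ord(p)) % 26 for p, c in zip(plain_letters, cipher_letters)]


print(images_of_letter("e", "hidden"))
# ['h', 'i', 'l', 'm', 'r']
print(recover_shift_stream("therecentdev", "aphuiplvwgii"))
# [7, 8, 3, 3, 4, 13, 7, 8, 3, 3, 4, 13]
```

The recovered stream repeats with period 6, which gives away both the key length and the key.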
Cryptanalysis: What would you do to simplify the problem? You can even assume that you know how long the key is… (3-4 min with a partner)
English letter frequencies: [Graph: frequency of each letter, A through Z]
Exceptions: Consider Gadsby by Ernest Vincent Wright, February 1939. What do you notice about it?
Cryptanalysis: Assume we know the key length, L. (We’ll see how to find it shortly.) Method 1: Parse out the characters at positions p = 1 (mod L). These have all been shifted the same amount. Do a frequency analysis to find the shift: the most frequent letter should be e, given enough text, and you can verify by seeing how the shift affects other letters. This gives the first letter of the key. Repeat for positions p = 2, p = 3, …, p = L. Problem: involves some trial and error. For brute force to work, would need to brute force all letters of key simultaneously: _____ possibilities.
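Method 1 could be sketched in Python roughly as follows. The function names are invented for illustration. The slides count positions from 1, while the slices here count from 0. The sketch assumes the ciphertext has already been reduced to lowercase letters, and it simply trusts that the most frequent letter in each group is e, which is exactly where the trial and error comes in on short texts.

```python
from collections import Counter


def split_by_position(cipher_letters, key_length):
    """Group k holds the letters at positions k, k+L, k+2L, ... (0-indexed)."""
    return [cipher_letters[k::key_length] for k in range(key_length)]


def guess_shift_from_most_frequent(group):
    """Assume the most frequent ciphertext letter in the group is plaintext e."""
    most_common_letter, _ = Counter(group).most_common(1)[0]
    return (ord(most_common_letter) - ord("e")) % 26


def method1_key_guess(cipher_letters, key_length):
    """One key letter per group; any guess may need checking by hand."""
    groups = split_by_position(cipher_letters, key_length)
    return "".join(
        chr(guess_shift_from_most_frequent(g) + ord("a")) for g in groups
    )


# Toy check: the plaintext "eeeeeeeeeeee" under the key "hidden".
print(method1_key_guess("lmhhirlmhhir", 6))
# hidden
```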
Dot products: Consider A = ( ), the vector of English letter frequencies. A_i = A displaced i positions to the right. A_0 · A_0 = ? A_0 · A_1 = ? A_i · A_j depends on _____ only. Max occurs when _____. 3 reasons why:
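To experiment with these dot products, here is a hedged Python sketch. The frequency table is an assumption: the slide's own vector did not survive, so these are commonly cited approximate English letter frequencies, and the exact values may differ from the ones used in class. The function names are also made up.

```python
# Assumed approximate English letter frequencies for a..z (not from the slide).
ENGLISH_FREQ = [
    0.082, 0.015, 0.028, 0.043, 0.127, 0.022, 0.020, 0.061, 0.070,
    0.002, 0.008, 0.040, 0.024, 0.067, 0.075, 0.019, 0.001, 0.060,
    0.063, 0.091, 0.028, 0.010, 0.023, 0.001, 0.020, 0.001,
]


def displaced(vec, i):
    """Return vec displaced i positions to the right, wrapping around."""
    i %= len(vec)
    return list(vec) if i == 0 else vec[-i:] + vec[:-i]


def dot(u, v):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def shifted_dot(i, j):
    """A_i . A_j for the assumed frequency vector."""
    return dot(displaced(ENGLISH_FREQ, i), displaced(ENGLISH_FREQ, j))


# Print A_0 . A_i for a few displacements to compare them by eye.
for i in range(6):
    print(i, round(shifted_dot(0, i), 3))
```

Trying pairs such as shifted_dot(2, 5) and shifted_dot(0, 3) is a quick way to test a guess for the first blank.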
Finding the key length: What if the frequency of letters in the plaintext approximates A? Then for each k, the frequency of the group of letters at positions p = k (mod L) in the ciphertext approximates A displaced by that position's shift. Then loop, displacing the ciphertext by i and counting the number of coincidences. Get max when displaced by the correct key length, so just look for the max!
APHUIPLVWGIILTRSQRUBRIZNYQRXWZLBKRHFVN (shift 0)
NAPHUIPLVWGIILTRSQRUBRIZNYQRXWZLBKRHFV (shift 1)
VNAPHUIPLVWGIILTRSQRUBRIZNYQRXWZLBKRHF (shift 2)
…
KRHFVNAPHUIPLVWGIILTRSQRUBRIZNYQRXWZLB (shift 6): 5 matches
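The coincidence count can be sketched in a few lines of Python. This is an illustration, not the demo code from class. The function name is hypothetical, and the displacement wraps around the end of the text, which matches the rows shown on the slide.

```python
def count_coincidences(ciphertext, displacement):
    """Count positions where the text agrees with a rotated copy of itself."""
    n = len(ciphertext)
    d = displacement % n
    rotated = ciphertext[n - d:] + ciphertext[:n - d]  # wrap-around shift
    return sum(a == b for a, b in zip(ciphertext, rotated))


CIPHERTEXT = "APHUIPLVWGIILTRSQRUBRIZNYQRXWZLBKRHFVN"

print(count_coincidences(CIPHERTEXT, 6))
# 5

# Tabulate displacements 1..10 and look for the one that stands out.
for d in range(1, 11):
    print(d, count_coincidences(CIPHERTEXT, d))
```

With only 38 letters the counts are noisy, so a longer ciphertext gives a much clearer peak.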
Key length: an example. Take any random pair in the ciphertext: the letter in the top row is shifted by i (say 0), and the letter in the bottom row is shifted by j (say 2). Prob(both ‘A’) = P(‘a’) * P(‘y’) = ___ * ___. Prob they are the same (any letter) is ______. When i = j, get max # of coincidences. This occurs when we shift by the correct key length. Demo.
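One way to write out the reasoning behind this slide, as a sketch. It assumes the two plaintext letters are independent draws from the English distribution, and the symbol a_m (the frequency of plaintext letter m, indices taken mod 26) is notation introduced here rather than taken from the slides.

```latex
\begin{align*}
P(\text{top} = k) &= a_{k-i}, \qquad P(\text{bottom} = k) = a_{k-j} \\
P(\text{both} = k) &= a_{k-i}\, a_{k-j} \\
P(\text{same letter}) &= \sum_{k=0}^{25} a_{k-i}\, a_{k-j} = A_i \cdot A_j
\end{align*}
```

With i = 0, j = 2 and k = 0 (the letter A), the middle line reads a_0 a_{-2} = a_0 a_{24}, which is P(‘a’) * P(‘y’) as on the slide.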
Another method: Method 1. Parse out the characters at positions p = 1 (mod L). These have all been shifted the same amount. Do a frequency analysis to find the shift: the most frequent letter should be e, given enough text, and you can verify by seeing how the shift affects other letters. This gives the first letter of the key. Repeat for positions p = 2, p = 3, …, p = L.
Another method: Method 2. Parse out the characters at positions p = 1 (mod L). These have all been shifted the same amount. Get the whole freq. distribution W = (0.05, 0.002, …). W approximates A_i, where i is the unknown shift. Calculate W · A_i for 0 ≤ i ≤ 25. Max occurs when we got the shift correct. Note we use the distribution of all 26 letters… more robust than just using 1 (‘e’). This gives the first letter of the key. Repeat for positions p = 2, p = 3, …, p = L.
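Method 2 might look like this in Python. As before, this is a sketch under assumptions: the frequency table holds commonly cited approximate values rather than the ones from the slides, the function names are invented, and the ciphertext is assumed to be lowercase letters only.

```python
# Assumed approximate English letter frequencies for a..z (not from the slide).
ENGLISH_FREQ = [
    0.082, 0.015, 0.028, 0.043, 0.127, 0.022, 0.020, 0.061, 0.070,
    0.002, 0.008, 0.040, 0.024, 0.067, 0.075, 0.019, 0.001, 0.060,
    0.063, 0.091, 0.028, 0.010, 0.023, 0.001, 0.020, 0.001,
]


def letter_distribution(group):
    """W: the relative frequency of each of the 26 letters in the group."""
    counts = [0] * 26
    for ch in group:
        counts[ord(ch) - ord("a")] += 1
    return [c / len(group) for c in counts]


def best_shift(group):
    """Return the i that maximizes W . A_i, where A_i[k] = A[(k - i) mod 26]."""
    w = letter_distribution(group)
    scores = [
        sum(w[k] * ENGLISH_FREQ[(k - i) % 26] for k in range(26))
        for i in range(26)
    ]
    return max(range(26), key=lambda i: scores[i])


def method2_key_guess(cipher_letters, key_length):
    """Apply best_shift to each group of positions k, k+L, k+2L, ..."""
    return "".join(
        chr(best_shift(cipher_letters[k::key_length]) + ord("a"))
        for k in range(key_length)
    )


# Toy check: "eeeeetttaaoin" shifted by 3 is "hhhhhwwwddrlq".
print(best_shift("hhhhhwwwddrlq"))
# 3
```

Because every letter contributes to the score, a group where e happens not to be the most frequent letter can still come out right.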
Summary of both methods: Method 1: Pros? Method 2: Pros?
Demo: Of my code…
Visualization: /Vigenere.html Play with now and for homework. Thanks to Dr. Dino Schweitzer, USAFA, who I met last Friday, for pointing me to his demo! Aren’t you glad I was at SIGCSE?
Closing Thought: What if we modified the Vigenere cipher so that each individual letter was not simply shifted, but the result of an affine function?