Download presentation
Presentation is loading. Please wait.
Published byDorcas Park Modified over 9 years ago
1
1 Cryptanalysis Four kinds of attacks (recall) The objective: determine the key ( Herckhoff principle ) Assumption: English plaintext text Basic techniques: frequency analysis based on: –Probabilities of occurrences of 26 letters –Common digrams and trigrams.
2
2 Cryptanalysis -- statistical analysis Probabilities of occurrences of 26 letters –E, having probability about 0.120 (12%) –T,A,O,I,N,S,H,R, each between 0.06 and 0.09 –D,L, each around 0.04 –C,U,M,W,F,G,Y,P,B, each between 0.015 and 0.028 –V,K,J,X,Q,Z, each less than 0.01 –See table 1.1, page 26 30 common digrams (in decreasing order): –TH, HE, IN, ER, AN, RE,… 12 common trigrams (in decreasing order): –THE, ING,AND,HER,ERE,…
3
3 Cryptanalysis of Affine Cipher Suppose a attacker got the following Affine cipher –FMXVEDKAPHFERBNDKRXRSREFNORUDSDKDVSHVUFEDKAPRKDLYEVLR HHRH Cryptanalysis steps: –Compute the frequency of occurrences of letters R: 8, D:7, E,H,K:5, F,S,V: 4 (see table 1.2, page 27) –Guess the letters, solve the equations, decrypt the cipher, judge correct or not. First guess: R e, D t, i.e., e K (4)=17, e K (19)=3 –Thus, 4a+b=17 a=6, b=19, since gcd (6,26)=2, so incorrect. 19a+b=3 Next guess: R e, E t, the result will be a=13, not correct. Guess again: R e, H t, the result will be a=8, not correct again. Guess again: R e, K t, the result will be a=3, b=5. –K=(3,5), e K (x)=3x+5 mod 26, and d K (y)=9y-19 mod 26. –Decrypt the cipher: algorithmsarequitegeneraldefinitionsofarithmeticprocesses If the decrypted text is not meaningful, try another guess. Need programming: compute frequency and solve equations Since Affine cipher has 12*26=312 keys, can write a program to try all keys.
4
4 Cryptanalysis of substitution cipher Final goal is to find the corresponding plaintext letter for each ciphertext letter. Ciphertext: example 1.11, page 28 Steps: –Frequency computation, see table 1.3, page 29 Guess Z e, quite sure C,D,F,J,M,R,Y are t,a,o,i,n,s,h,r, but not exact –Look at digrams, especially –Z or Z-. Since ZW occurs 4 times, but no WZ, so guess W d (because ed is a common digram, but not de) Continue to guess –Look at the trigrams, especially THE, ING, AND,…
5
5 Cryptanalysis of Vigenere cipher In some sense, the cryptanalysis of Vigenere cipher is a systematic method and can be totally programmed. Step 1: determine the length m of the keyword –Kasiski test and index of coincidence Step 2: determine K=(k 1,k 2,…,k m ) –Determine each k i separately.
6
6 Kasiski test—determine keyword length m Observation: two identical plaintext segments will be encrypted to the same ciphertext whenever they appear positions apart in plaintext, where 0 mod m. Vice Versa. So search ciphertext for pairs of identical segments, record the distance between their starting positions, such as 1, 2,…, then m should divide all of i ’s. i.e., m divides gcd of all i ’s.
7
7 Index of coincidence Can be used to determine m as well as to confirm m, determined by Kasiski test Definition: suppose x=x 1 x 2,…,x n is a string of length n. The index of coincidence of x, denoted by I c (x), is defined to be the probability that two random elements of x are identical. –Denoted the frequencies of A,B,…,Z in x by f 0,f 1,…,f 25 --I c (x)= i=0 25 ( ) fi2fi2 ( ) n2n2 = i=0 25 f i (f i -1) n(n-1) ( Formula IC )
8
8 Suppose x is a string of English text, denote the expected probability of occurrences of A,B,…,Z by p 0,p 1,…,p 25 with values from table 1.1, then I c (x) p i 2 =0.082 2 +0.015 2 +…+0.001 2 =0.065 (since the probability that two random elements both are A is p 0 2, both are B is p 1 2,…) Index of coincidence (cont.) Question: if y is a ciphertext obtained by shift cipher, what is the I c (y)? Answer: should be 0.065, because the individual probabilities will be permuted, but the p i 2 will be unchanged. Therefore, suppose y=y 1 y 2 …y n is the ciphertext from Vigenere cipher. For any given m, divide y into m substrings: y 1 =y 1 y m+1 y 2m+1 … if m is indeed the keyword length, then y 2 =y 2 y m+2 y 2m+2 … each y i is a shift cipher, I c (y i ) is about 0.065. … y m =y m y 2m y 3m … otherwise, I c (y i ) 26(1/26) 2 = 0.038.
9
9 Index of coincidence (cont.) For purpose of verify keyword length m, divide the ciphertext into m substrings, compute the index of coincidence by formula IC for each substring. If all IC values of the substrings are around 0.065, then m is the correct keyword length. Otherwise m is not the correct keyword length. If want to use I c to determine correct keyword length m, what to do? Beginning from m=2,3, … until an m, for which all substrings have IC value around 0.065. Now, how to determine keyword K=(k 1,k 2,…,k m )? Assume m is given.
10
10 Determine keyword K=(k 1,k 2,…,k m ) 1.Determine each k i (from y i ) independently. 2.Observation: 2.1 let f 0,f 1,…,f 25 denote the frequencies of A,B,…,Z in y i and n′=n/m 2.2 then probability distribution of 26 letters in y i is: f 0 f 25 n′,, n′ 2.3 if the shift key is k i, then f 0+k i (i.e., A+k i ) is the frequency of a in the corresponding plaintext x i, …, f 25+k i (note the subscript 25+k i should be computed by modulo 26) is the frequency of z in x i. Since x i is normal English text, probability distribution of f 0+k i f 25+k i n′,, n′ should be “close to” ideal probability distribution p 0,p 1,…,p 25. p 0, …, p 25 So: f 0+k i n' p0p0 +…+ f 25+k i n' p 25 p 0 2 +…+p 25 2 =0.065
11
11 Determine keyword K=(k 1,k 2,…,k m ) (cont.) Therefore, define: f i+g M g = i=0 25 p i n′n′ When g=k i, M g will generally be around 0.065 (i.e., i=0 25 p i 2 ). Otherwise M g will be quite smaller than 0.065. So let g from 0, until 25, compute M g, and for some g, if M g is around 0.065, then k i =g. Note: the subscript i+g should be seen as modulo 26. f 0+g n' p0p0 +…+ f 25+g n' p 25 On the other hand, for any g !=k i, will not be close to 0.065.
12
12 Cryptanalysis of Vigenere cipher--example Example 1.12, page 33. –Using Kasiski test to determine the keyword length CHR appears five times at 1,166,236,276,286 the distance is 165, 235,275,285, the gcd is 5, so m=5. –Using index of coincidence to verify m=5. Divide ciphertext into y 1, y 2, y 3, y 4, y 5 Compute f 0,f 1,…,f 25 for each y i and then I c (y i ), get 0.063, 0.068,0.069,0.061,0.072, so m=5 is correct. –Determine k i for i=1,…,5. Compute M g for g=0,1,…,25 and if M g 0.065, then let k i =g. where M g = i=0 25 p i n′n′ f i+g As a result, k 1 =9,k 2 =0,k 3 =13,k 4 =4,k 5 =19, i.e., JANET
13
13 Cryptanalysis of Hill cipher Difficult to break based on ciphertext only Easily to break based on both ciphertext and plaintext. Suppose given at least m distinct plaintext- ciphertext pairs: x j =(x 1,j,x 2,j,…,x m,j ) y j =(y 1,j,y 2,j,…,y m,j ) then define two matrices X=(x i,j ) and Y=(y i,j ) Let Y=XK, if X is invertible, then K=X -1 Y.
14
14 Suppose plaintext is: friday and ciphertext is: PQCFKU and the m=2. Then e K (f,r)=(P,Q), e K (i,d)=(C,F). That is: Cryptanalysis of Hill cipher--example ( ) = ( )K 15 16 2 5 5 17 8 3 K=( ) -1 ( )=( )( )=( ) 5 17 8 3 15 16 2 5 9 1 2 15 15 16 2 5 7 19 8 3 Then using the third pair, i.e., (a,y) and (K,U) to verify K. In case m is unknown, try m=2,3, …
15
15 Cryptanalysis of LFSR stream cipher Vulnerable to known-plaintext attack. Suppose m, plaintext binary string x 1,x 2,…,x n and ciphertext binary string y 1,y 2,…,y n are known, as long as n>2m, the key can be broken: –Keystream is: z i =(x i +y i ) mod 2. (i=1,2,…,n) –Then the initialization vector of K is z 1,…, z m. –Next is to determine coefficients (c 0,c 1,…,c m-1 ) of K (recall that z i+m = m-1 j=0 c j z i+j mod 2 for all i 1) i.e, –(z m+1,z m+2,…,z 2m )=(c 0,c 1,…,c m-1 ) z 1 z 2 … z m z 2 z 3 … z m+1 …………… z m z m+1 … z 2m-1
16
16 Cryptanalysis of LFSR stream cipher (cont.) –(c 0,c 1,…,c m-1 )=(z m+1,z m+2,…,z 2m ) z 1 z 2 … z m z 2 z 3 … z m+1 …………… z m z m+1 … z 2m-1 Therefore:
17
17 Example 1.14, page 37. Suppose LFSR 5 with the following: Cryptanalysis of LFSR stream cipher --example Ciphertext string: 101101011110010 Plaintext string: 011001111111000 Then keystream: 110100100001010 Therefore initialization vector is: 11010. For next five key elements: 01000, set up equation for coefficients (c 0,c 1,c 2,c 3,c 4 ) and solve it. The result is: (c 0,c 1,c 2,c 3,c 4 ) =(1,0,0,1,0) i.e., z i+5 =(z i +z i+3 ) mod 2.
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.