A Byte-Based Guess and Determine Attack on SOSEMANUK
Xiutao Feng
Institute of Software, Chinese Academy of Sciences
Outline
1 Introduction
2 Description of SOSEMANUK
3 Basic properties of SOSEMANUK
4 Our attack
5 Further discussion on our attack
6 Conclusion
1 Introduction
1.1 On SOSEMANUK
SOSEMANUK is a software-oriented stream cipher proposed by C. Berbain et al. for the eSTREAM project, and it was selected for the final portfolio together with six other algorithms. Its design adopts ideas from both the stream cipher SNOW 2.0 and the block cipher SERPENT, and aims to improve on SNOW 2.0 in both security and efficiency.
1.2 Known cryptanalytic results on SOSEMANUK
The designers of SOSEMANUK presented a guess and determine attack, whose time complexity is [...] operations.
In 2006 H. Ahmadi et al. revised the above attack and reduced the time complexity to [...] operations.
In 2006 Y. Tsunoo et al. improved the result of Ahmadi et al. and further reduced it to [...] operations.
In 2008 Jung-Keun Lee et al. proposed a correlation attack, which needs about [...] time, [...] keystream bits and [...] bits of memory.
In 2009 Lin and Jie gave a new guess and determine attack, and claimed that their attack needs only [...] operations.
1.3 Our work
2 Description of SOSEMANUK
Figure 1 The structure of SOSEMANUK (components: LFSR, FSM, Serpent1)
2.1 The LFSR
2.2 The FSM
2.3 The Serpent1
Figure 2 The round function Serpent1 in the bit-slice mode
2.4 Generation of Keystream
3 Basic properties of SOSEMANUK
Let x be a 32-bit word. Denote by x^(i) the i-th byte of x, where i = 0, 1, 2, 3.
For example, if s_1^(3), s_4^(0), s_4^(1) and s_10^(0) are known, then we can calculate s_11^(0).
Figure 3 The feedback of the LFSR in the byte form
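The example above can be checked with a short sketch. It assumes the LFSR feedback s_11 = s_10 ⊕ α^(-1)·s_4 ⊕ α·s_1 as recalled from the SOSEMANUK specification, with multiplication and division by α written as a one-byte shift plus a table lookup, and with byte 0 taken as the least significant byte. The tables below are random stand-ins, not the real SOSEMANUK tables, and all function names are hypothetical; only the byte dependencies are illustrated.

```python
import random

MASK32 = 0xFFFFFFFF

def byte(x, i):
    """Return x^(i), the i-th byte of the 32-bit word x (byte 0 = least significant)."""
    return (x >> (8 * i)) & 0xFF

def mul_alpha(x, mul_tab):
    """Shift-and-lookup form of alpha * x: the table is indexed by the top byte."""
    return ((x << 8) & MASK32) ^ mul_tab[byte(x, 3)]

def div_alpha(x, div_tab):
    """Shift-and-lookup form of alpha^(-1) * x: the table is indexed by the low byte."""
    return (x >> 8) ^ div_tab[byte(x, 0)]

def feedback_word(s1, s4, s10, mul_tab, div_tab):
    """Full-word feedback s_11 = s_10 ^ alpha^(-1)*s_4 ^ alpha*s_1 (assumed form)."""
    return s10 ^ div_alpha(s4, div_tab) ^ mul_alpha(s1, mul_tab)

def feedback_byte0(s1_b3, s4_b0, s4_b1, s10_b0, mul_tab, div_tab):
    """Compute s_11^(0) from the four known bytes only."""
    return s10_b0 ^ s4_b1 ^ byte(div_tab[s4_b0], 0) ^ byte(mul_tab[s1_b3], 0)

# Stand-in tables (NOT the real SOSEMANUK constants); fixed seed for repeatability.
rng = random.Random(1)
MUL_TAB = [rng.getrandbits(32) for _ in range(256)]
DIV_TAB = [rng.getrandbits(32) for _ in range(256)]

s1, s4, s10 = (rng.getrandbits(32) for _ in range(3))
full = feedback_word(s1, s4, s10, MUL_TAB, DIV_TAB)
partial = feedback_byte0(byte(s1, 3), byte(s4, 0), byte(s4, 1), byte(s10, 0),
                         MUL_TAB, DIV_TAB)
print(byte(full, 0) == partial)
```

The byte-wise function never sees the other twelve bytes of s_1, s_4 and s_10, which is the point of the example: one byte of s_11 follows from four known bytes.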
4 Our attack
4.1 Basic idea of the guess and determine attack
The guess and determine attack is a common cryptanalytic method. Its basic idea is as follows:
1. Guess: first guess the values of a portion of the internal state of the target algorithm.
2. Deduce: then deduce the values of the rest of the internal state from the guessed portion and a few words of known keystream.
3. Test: finally generate a segment of keystream from the recovered state and compare it with the known keystream. If they do not match, return to Step 1.
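The three steps above can be sketched on a toy generator. The generator below is hypothetical and has nothing to do with SOSEMANUK: its state is two bytes (x, y), each step outputs x ⊞ y modulo 256, and the update rule is an arbitrary choice made for this illustration.

```python
def step(state):
    """One step of the toy generator: return (next_state, output_byte)."""
    x, y = state
    out = (x + y) & 0xFF
    return (y, (5 * x + 3 * y + 1) & 0xFF), out

def keystream(state, n):
    """Generate n output bytes starting from the given state."""
    out = []
    for _ in range(n):
        state, z = step(state)
        out.append(z)
    return out

def guess_and_determine(known):
    """Recover every initial state consistent with the known keystream."""
    candidates = []
    for x in range(256):                      # Guess: one byte of the state
        y = (known[0] - x) & 0xFF             # Deduce: the other byte from z_0 = x + y
        if keystream((x, y), len(known)) == known:   # Test against known keystream
            candidates.append((x, y))
    return candidates

secret = (0x3C, 0xA7)
known = keystream(secret, 4)
print(secret in guess_and_determine(known))
```

Guessing one byte and deducing the other costs 2^8 trials instead of the 2^16 needed to search the whole toy state, which is the saving a guess and determine attack aims for.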
4.2 The execution of our attack
Our attack is based on the following assumption: lsb(R1_1) = 1.
The guessing and deducing procedure of the attack can be subdivided into five phases:
1. Guess the values of s_1, s_2, s_3, R2_1^(0), R2_1^(1), R2_1^(2) and the remaining 31 bits of R1_1, and deduce the values of s_10^(0), R1_2^(0), R2_2, s_11^(0), s_4^(1), s_10^(1), R1_2^(1), s_11^(1), s_4^(2), s_10^(2), R1_2^(2), s_11^(2) and s_4^(3).
Figure 4 The illustration of the deduction in Phase 1 (legend: the guessed byte, the deduced byte)
2. By the assumption lsb(R1_1) = 1, which implies R1_2 = R2_1 ⊞ (s_3 ⊕ s_10), we get equation (12) in the variable s_10^(3), in which a, b, c and d are known. The variable s_10^(3) occurs three times in equation (12), and it is easy to check that the equation has exactly one solution in s_10^(3). So we can solve it and get s_10^(3). We then deduce s_11^(3), R2_1^(3) and R2_2^(3). Up to now we have obtained s_1, s_2, s_3, s_4, s_10, s_11, R1_1, R2_1, R1_2 and R2_2.
3. Further deduce R1_3, R2_3, R1_4, R2_4, R1_5, R2_5, R2_6, s_5, s_6, s_12 and s_13.
Figure 5 The illustration of the deduction in Phase 2 and 3 (legend: the known byte, the byte deduced in phase 2, the byte deduced in phase 3)
4. Further guess s_7^(0) and s_8^(0), and deduce the remaining bytes of s_7 and s_8.
5. Finally deduce s_9.
Figure 6 The illustration of the deduction in Phase 4 and 5 (legend: the known byte, the guessed byte, the deduced byte)
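One property that the byte-by-byte deductions in the phases above appear to rely on is that the low bytes of a sum modulo 2^32 depend only on the low bytes of the operands, because carries only travel upwards. A minimal sketch, using the update R1_2 = R2_1 ⊞ (s_3 ⊕ s_10) when lsb(R1_1) = 1 and R1_2 = R2_1 ⊞ s_3 otherwise; the function names are hypothetical and byte 0 is assumed to be the least significant byte.

```python
import random

MASK32 = 0xFFFFFFFF

def next_r1(r1_1, r2_1, s3, s10):
    """Full-word update: R1_2 from R1_1, R2_1, s_3 and s_10."""
    addend = (s3 ^ s10) if (r1_1 & 1) else s3
    return (r2_1 + addend) & MASK32

def low_bytes(x, k):
    """Keep only the k least significant bytes of x."""
    return x & ((1 << (8 * k)) - 1)

def next_r1_low_bytes(lsb_r1_1, r2_1_low, s3_low, s10_low, k):
    """The k low bytes of R1_2, computed from the k low bytes of the inputs only."""
    addend = (s3_low ^ s10_low) if lsb_r1_1 else s3_low
    return low_bytes(r2_1_low + addend, k)

rng = random.Random(2)
r1_1, r2_1, s3, s10 = (rng.getrandbits(32) for _ in range(4))
for k in (1, 2, 3, 4):
    full = low_bytes(next_r1(r1_1, r2_1, s3, s10), k)
    partial = next_r1_low_bytes(r1_1 & 1, low_bytes(r2_1, k),
                                low_bytes(s3, k), low_bytes(s10, k), k)
    print(k, full == partial)
```

This is why knowing byte 0 of each operand is enough to fix byte 0 of R1_2, then bytes 0 and 1 fix byte 1, and so on up the word.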
4.3 Time and data complexity
Time complexity: 2^176 operations
In Phase 1 and Phase 4 we guess a total of 175 bits of the internal state, including s_1, s_2, s_3, R2_1^(0), R2_1^(1), R2_1^(2), s_7^(0), s_8^(0) and the remaining 31 bits of R1_1.
The assumption lsb(R1_1) = 1 holds with probability 1/2.
Data complexity: about 20 keystream words
In the guessing phase: 8 words.
In the testing phase: about 8 words. (16 keystream words contain 512 bits in total, which is more than the 384 bits of the internal state, so they determine the internal state and can be used to test the correctness of the recovered state.)
For the assumption: another 4 words. (By shifting the keystream by 4 words we can test two cases.)
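The arithmetic behind these figures can be written out as a small sketch. The function and constant names are hypothetical; the inputs are the numbers given above (175 guessed bits, an assumption on a single bit, and 8 + 8 + 4 keystream words).

```python
import math

WORD_BITS = 32
STATE_BITS = 384  # size of the internal state as stated above

def attack_cost(guessed_bits, assumption_prob, guess_words, test_words, shift_words):
    """Return (log2 of expected time, number of keystream words)."""
    # Each run costs 2^guessed_bits trials and succeeds only when the
    # assumption holds, so the expected cost is 2^guessed_bits / probability.
    time_log2 = guessed_bits - math.log2(assumption_prob)
    data_words = guess_words + test_words + shift_words
    return time_log2, data_words

time_log2, data_words = attack_cost(175, 0.5, 8, 8, 4)
print(f"time = 2^{time_log2:g} operations, data = {data_words} words")

# 16 keystream words carry more bits than the internal state holds.
print(16 * WORD_BITS, ">", STATE_BITS, 16 * WORD_BITS > STATE_BITS)
```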
5 Further discussion on our attack
It should be pointed out that the assumption lsb(R1_1) = 1 is NOT necessary for our attack to work. When lsb(R1_1) = 0, which implies R1_2 = R2_1 ⊞ s_3, we similarly get an equation in s_10^(3) with known values a', b', c' and d'. This equation has either no solution or 2^k solutions for some integer k. However, when a', b', c' and d' run through all possible values, the total number of solutions is exactly [...].
We directly guess a total of 160 bits of the internal state in phase 1, and after phase 2 we get a total of [...] possible values. For each of them, we continue with phases 3, 4 and 5. So the time complexity is still 2^176 operations, but the data complexity reduces to about 16 keystream words.
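A scaled-down sketch of this kind of counting argument follows. The actual equation is not reproduced here, so toy_f is a hypothetical stand-in on 3-bit values, and the sketch assumes the equation can be put in the form F(x, a', b', c') = d', which is an assumption about its shape. Under that assumption every choice of (x, a', b', c') satisfies the equation for exactly one d', so the solution counts summed over all parameter tuples equal the number of tuples, even if individual equations have zero or several solutions.

```python
from itertools import product

N_BITS = 3                      # toy word size; the attack works on bytes
MASK = (1 << N_BITS) - 1
VALUES = range(1 << N_BITS)

def toy_f(x, a, b, c):
    """Hypothetical left-hand side in which the unknown x occurs three times."""
    return ((x + a) ^ ((x ^ b) + c) ^ x) & MASK

def solutions(a, b, c, d):
    """All x with toy_f(x, a, b, c) == d, found by exhaustive search."""
    return [x for x in VALUES if toy_f(x, a, b, c) == d]

# Count solutions for every parameter tuple (a, b, c, d).
counts = [len(solutions(a, b, c, d)) for a, b, c, d in product(VALUES, repeat=4)]

print("parameter tuples:", len(counts))
print("total solutions: ", sum(counts))
print("average per tuple:", sum(counts) / len(counts))
```

With an average of one solution per tuple, enumerating all solutions for all guesses costs no more in total than the single-solution case.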
6 Conclusion
In this work we presented a byte-based guess and determine attack on SOSEMANUK, which needs only a few words of known keystream to recover the whole internal state of SOSEMANUK with a time complexity of 2^176 operations. Since the key length of SOSEMANUK varies from 128 to 256 bits, our attack is more efficient than an exhaustive key search whenever the chosen encryption key is longer than 176 bits.
Thank you!