Stream Cipher
  - Introduction
  - Pseudorandomness
  - LFSR
  - Design
Reference: “Handbook of Applied Cryptography”, Ch. 5 and 6
Stream Cipher Introduction
Origin
  - Derived from the one-time pad
  - Bit-by-bit XOR of plaintext and keystream: c_i = m_i ⊕ z_i
  - Encryption = decryption, hence symmetric
  - Typically built from LFSRs (Linear Feedback Shift Registers)
  - (External) synchronous or self-synchronous
Properties
  - Fast, with low hardware complexity
  - Security measures: period of the keystream, linear complexity (LC), statistical properties
  - Vast amount of theoretical knowledge
  - Many designs are proprietary and confidential (military use)
Sequence
Def) s = s_0, s_1, … : infinite sequence; s^n = s_0, s_1, …, s_{n-1} : the first n terms of s
  - If s_i = s_{i+n} for all i ≥ 0, s is a periodic sequence with period n.
  - Run: a maximal subsequence of consecutive 0s (a gap) or consecutive 1s (a block)
Golomb’s postulates (I)
s^N : periodic sequence of period N
(1) Within a cycle of s^N, 0s and 1s are balanced, i.e., | #{s_i = 1} - #{s_j = 0} | ≤ 1
(2) Within a cycle of s^N, half the runs have length 1, 1/4 have length 2, …, etc.
(3) The autocorrelation* function is two-valued
* Measures the similarity between the original sequence and its t-shifted version
** A sequence satisfying all three postulates is called a pseudo-noise (PN) sequence.
Golomb’s postulates (II)
(Ex) s^15 = 0,1,1,0,0,1,0,0,0,1,1,1,1,0,1
(1) #{0} = 7, #{1} = 8 (why?)
(2) 8 runs:
  - 4 runs of length 1 (2 gaps, 2 blocks)
  - 2 runs of length 2 (1 gap, 1 block)
  - 1 run of length 3 (1 gap)
  - 1 run of length 4 (1 block)
(3) Autocorrelation function: C(0) = 1, C(t) = -1/15 for 1 ≤ t ≤ 14
Thus s^15 is a PN sequence.
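The three checks in this example can be reproduced mechanically. The following is a minimal sketch, not a general-purpose test suite: the helper names `cyclic_runs` and `autocorrelation` are illustrative, runs are counted cyclically over one period, and the autocorrelation is taken as C(t) = (1/N) Σ (2s_i - 1)(2s_{i+t} - 1) with indices mod N.

```python
from fractions import Fraction
from itertools import groupby

def cyclic_runs(seq):
    """Return (bit, length) for every run in one cycle of a periodic sequence."""
    n = len(seq)
    # Start the cycle at a run boundary so that no run is split in two
    # (assumes the sequence is not constant).
    start = next(i for i in range(n) if seq[i] != seq[i - 1])
    rotated = seq[start:] + seq[:start]
    return [(bit, len(list(group))) for bit, group in groupby(rotated)]

def autocorrelation(seq, t):
    """C(t) = (1/N) * sum over one cycle of (2 s_i - 1)(2 s_{i+t} - 1)."""
    n = len(seq)
    total = sum((2 * seq[i] - 1) * (2 * seq[(i + t) % n] - 1) for i in range(n))
    return Fraction(total, n)

s15 = [0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1]

ones = sum(s15)                      # postulate (1)
zeros = len(s15) - ones
run_lengths = sorted(length for _, length in cyclic_runs(s15))  # postulate (2)
out_of_phase = {autocorrelation(s15, t) for t in range(1, 15)}  # postulate (3)

print(ones, zeros, run_lengths, autocorrelation(s15, 0), out_of_phase)
```

Exact fractions are used so that the two-valued property can be compared without floating-point rounding.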
Statistical Randomness
Five Basic Tests
  - Frequency test (monobit)
  - Serial test (two-bit; overlapping is allowed)
  - Poker test (frequency of m-bit subsequences)
  - Runs test
  - Autocorrelation test
Others
  - Spectral test
  - Linear complexity profile
  - Quadratic complexity
  - Universal test
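Statistical Test by FIPS 140-1
For a given 20,000-bit sample sequence:
(1) Monobit test: the number of 1s, n_1, must satisfy 9,654 < n_1 < 10,346
(2) Poker test: m = 4, 1.03 < X_3 < 57.4
(3) Runs test: for each run length i, 1 ≤ i ≤ 6, the numbers of blocks and of gaps of length i must fall within specified intervals
(4) Long run test: no run of length 34 or more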
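Tests (1), (2), and (4) can be sketched as follows. This is an illustration under assumptions, not a validated FIPS implementation: the poker statistic is taken as X_3 = (2^m / k) Σ n_i^2 - k over the k = 5000 non-overlapping 4-bit blocks, the function names are illustrative, and the runs test (3) is left out because its intervals are not given on the slide.

```python
import random

def monobit_test(bits):
    """Pass if the number of 1s lies strictly between 9,654 and 10,346."""
    n1 = sum(bits)
    return 9654 < n1 < 10346

def poker_test(bits, m=4):
    """Pass if 1.03 < X_3 < 57.4, with X_3 computed over non-overlapping m-bit blocks."""
    k = len(bits) // m
    counts = [0] * (2 ** m)
    for i in range(k):
        value = 0
        for b in bits[i * m:(i + 1) * m]:
            value = (value << 1) | b
        counts[value] += 1
    x3 = (2 ** m / k) * sum(c * c for c in counts) - k
    return 1.03 < x3 < 57.4

def long_run_test(bits):
    """Pass if there is no run of length 34 or more."""
    longest = current = 0
    previous = None
    for b in bits:
        current = current + 1 if b == previous else 1
        previous = b
        longest = max(longest, current)
    return longest < 34

# Degenerate inputs make the behaviour of each test easy to see.
alternating = [0, 1] * 10000     # balanced, but only one 4-bit pattern occurs
all_zero = [0] * 20000

# A sample from Python's (non-cryptographic) generator, for illustration only.
rng = random.Random(2024)
sample = [rng.getrandbits(1) for _ in range(20000)]
print(monobit_test(sample), poker_test(sample), long_run_test(sample))
```

The alternating sequence shows why one test is not enough: it passes the monobit and long run tests but fails the poker test.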
Notation of LFSR
Notation: <L, C(D)>, where the connection polynomial is C(D) = 1 + c_1 D + c_2 D^2 + … + c_L D^L ∈ Z_2[D]
  - If c_L = 1 (i.e., deg C(D) = L), the LFSR is called nonsingular.
  - If the initial state is σ_0 = [s_{L-1}, …, s_1, s_0], s_i ∈ {0,1}, the output sequence s = s_0, s_1, … is uniquely determined by the recursion
      s_j = (c_1 s_{j-1} + c_2 s_{j-2} + … + c_L s_{j-L}) mod 2,  j ≥ L
(Ex) <4, 1 + D + D^4>, σ_0 = [0,1,1,0]: c_1 = 1, c_4 = 1, s_4 = s_3 + s_0

   t  D3 D2 D1 D0  value  |   t  D3 D2 D1 D0  value
   0   0  1  1  0     6   |   8   1  1  1  0    14
   1   0  0  1  1     3   |   9   1  1  1  1    15
   2   1  0  0  1     9   |  10   0  1  1  1     7
   3   0  1  0  0     4   |  11   1  0  1  1    11
   4   0  0  1  0     2   |  12   0  1  0  1     5
   5   0  0  0  1     1   |  13   1  0  1  0    10
   6   1  0  0  0     8   |  14   1  1  0  1    13
   7   1  1  0  0    12   |  15   0  1  1  0     6

Output seq. = 0,1,1,0,0,1,0,0,0,1,1,1,1,0,1
[Figure: 4-stage LFSR with stages D3, D2, D1, D0 (Stage 3 down to Stage 0) driven by a clock; the output is taken from D0]
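The recursion and the state table above can be replayed with a few lines of code. This is a minimal sketch; the function name `lfsr` and the argument layout (taps as [c_1, …, c_L], state as [s_0, …, s_{L-1}]) are choices made for this illustration.

```python
def lfsr(taps, state, n):
    """Return the first n output bits of the LFSR.

    taps  = [c_1, ..., c_L]      coefficients of C(D)
    state = [s_0, ..., s_{L-1}]  initial contents, s_0 is output first
    """
    L = len(taps)
    s = list(state)
    for j in range(L, n):
        # s_j = (c_1 s_{j-1} + ... + c_L s_{j-L}) mod 2
        s.append(sum(taps[i] * s[j - 1 - i] for i in range(L)) % 2)
    return s[:n]

# <4, 1 + D + D^4> with sigma_0 = [s_3, s_2, s_1, s_0] = [0, 1, 1, 0]
taps = [1, 0, 0, 1]
s = lfsr(taps, [0, 1, 1, 0], 34)
output = s[:15]

# State at time t read as the binary number [D3 D2 D1 D0] = [s_{t+3} s_{t+2} s_{t+1} s_t]
states = [8 * s[t + 3] + 4 * s[t + 2] + 2 * s[t + 1] + s[t] for t in range(16)]
print(output)
print(states)
```

The list `states` corresponds to the "value" column of the table, and the state at t = 15 equals the initial state, so the sequence has period 15.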
Properties of m-LFSR (I)
  - If the connection polynomial is irreducible, the period of the LFSR output sequence divides 2^L - 1
  - An irreducible polynomial f(x) in Z_p[x] of degree m is called a primitive polynomial if and only if f(x) divides x^k - 1 for k = p^m - 1 and for no smaller positive integer k
  - Number of monic primitive polynomials of degree m over Z_p = φ(p^m - 1)/m, where φ is the Euler phi function
  - If the connection polynomial is primitive, the period is 2^L - 1 (for any nonzero initial state)
  - Such a sequence is called a maximum-length shift register sequence (m-sequence), and the LFSR is called an m-LFSR.
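For small degrees over Z_2, the divisibility condition can be tested by brute force: find the smallest k with x^k ≡ 1 (mod f(x)) and compare it with 2^m - 1. The sketch below assumes polynomials are stored as Python integers (bit i is the coefficient of x^i); the names `order_of_x` and `is_primitive` are illustrative. It is only practical for small m.

```python
def order_of_x(f):
    """Smallest k >= 1 with x^k = 1 mod f(x) over Z_2, or None if there is none."""
    m = f.bit_length() - 1          # degree of f
    a = 1                           # the polynomial 1 = x^0
    for k in range(1, 2 ** m):
        a <<= 1                     # multiply by x
        if (a >> m) & 1:            # degree reached m: reduce modulo f
            a ^= f
        if a == 1:
            return k
    return None                     # happens when the constant term of f is 0

def is_primitive(f):
    """f divides x^k - 1 for k = 2^m - 1 and for no smaller positive k."""
    m = f.bit_length() - 1
    return order_of_x(f) == 2 ** m - 1

print(is_primitive(0b10011))   # x^4 + x + 1
print(is_primitive(0b11111))   # x^4 + x^3 + x^2 + x + 1
# Count primitive polynomials of degree 4; compare with phi(2^4 - 1)/4 = 8/4 = 2.
print(sum(is_primitive(f) for f in range(0b10000, 0b100000)))
```

No separate irreducibility test is needed here: if x has order 2^m - 1 modulo f, every nonzero residue is a power of x and hence invertible, so f is irreducible.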
Primitive Polynomials
Primitive polynomials over Z_2 of degree m:
  - x^m + x^k + 1 (trinomial), listed for the smallest k
  - x^m + x^k1 + x^k2 + x^k3 + 1 (pentanomial)
[Table: four column pairs "m | k (k1,k2,k3)" covering m = 2 to 11, 12 to 21, 22 to 31, and 32 to 41. Surviving entries, in column order: (m = 2 to 11) 1; 6,5,1. (m = 12 to 21) 7,4,3; 4,3,1; 12,11,1; 5,3,2. (m = 22 to 31) 8,7,1; 16,15,1. (m = 32 to 41) 28,27,1; 15,14,1; 12,10,2; 4; 21,19,2.]
Properties of LFSR
  - Well suited for H/W implementation
  - Produce sequences of large period
  - Good statistical properties
  - Readily analyzed using their algebraic structure
  - Breakable from 2L consecutive output bits: using the Berlekamp-Massey algorithm, any subsequence of length at least 2L suffices to find the LFSR of length L
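The last property can be tried on the 4-stage example <4, 1 + D + D^4>: its first 2L = 8 output bits are enough to recover the register. The following is a sketch of the Berlekamp-Massey algorithm over Z_2 in its textbook form; the function name and the return format (length L plus coefficient list [1, c_1, …, c_L]) are choices made for this illustration.

```python
def berlekamp_massey(s):
    """Return (L, [1, c_1, ..., c_L]) for the shortest LFSR generating the bits s."""
    n = len(s)
    c = [1] + [0] * n      # current connection polynomial C(D)
    b = [1] + [0] * n      # copy of C(D) from before the last length change
    L, m = 0, -1           # current length; position of the last length change
    for N in range(n):
        # Discrepancy between s_N and the bit predicted by the current LFSR.
        d = s[N]
        for i in range(1, L + 1):
            d ^= c[i] & s[N - i]
        if d:
            previous_c = c[:]
            shift = N - m
            for i in range(n + 1 - shift):
                c[i + shift] ^= b[i]          # C(D) <- C(D) + B(D) * D^(N-m)
            if 2 * L <= N:                     # the register must grow
                L, m, b = N + 1 - L, N, previous_c
    return L, c[:L + 1]

# First 2L = 8 output bits of <4, 1 + D + D^4> started from [0, 1, 1, 0].
observed = [0, 1, 1, 0, 0, 1, 0, 0]
length, poly = berlekamp_massey(observed)
print(length, poly)
```

Here `poly` lists the coefficients of 1, D, D^2, …, so [1, 1, 0, 0, 1] stands for 1 + D + D^4.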
Linear Complexity (I)
(Def) Given an infinite sequence s, the length of the shortest LFSR that generates s is called the linear complexity (LC) of s, written L(s).
  - LC is computed with the Berlekamp-Massey algorithm
(Properties of LC) s, t : binary sequences
  - For any n ≥ 1, 0 ≤ L(s^n) ≤ n
  - L(s^n) = 0 iff s^n is the all-zero sequence of length n
  - L(s^n) = n iff s^n = 0,0,…,0,1
  - If s is periodic with period N, L(s) ≤ N
  - L(s ⊕ t) ≤ L(s) + L(t)
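Linear Complexity (II)
s^n : a sequence chosen at random from all binary sequences of length n
  - Expected value of LC: E(L(s^n)) = n/2 + (4 + B(n))/18 - (1/2^n)(n/3 + 2/9), where B(n) = 0 if n is even and B(n) = 1 if n is odd
  - For large n, E(L(s^n)) ≈ n/2 + 2/9 (n even) or n/2 + 5/18 (n odd), and Var(L(s^n)) ≈ 86/81
(Def) LCP (Linear Complexity Profile): let L_N denote the LC of s^N = s_0, s_1, …, s_{N-1}; the sequence L_1, L_2, …, L_N is the LCP of s^N.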
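Nonlinear FSR
s_j = f(s_{j-1}, s_{j-2}, …, s_{j-L}), f() : nonlinear feedback function
[Figure: L-stage feedback shift register; stages L-1, …, 1, 0 hold s_{j-1}, …, s_{j-L+1}, s_{j-L}; the feedback f(s_{j-1}, s_{j-2}, …, s_{j-L}) is fed into stage L-1 as s_j; the output is taken from stage 0]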
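Synchronous Stream Cipher (I)
  - f : next-state function, σ_{i+1} = f(σ_i, k); σ_0 : initial state
  - g : keystream-generating function, z_i = g(σ_i, k); k : key
  - h : output function, c_i = h(z_i, m_i); m_i : plaintext, z_i : keystream, c_i : ciphertext
[Figure: encryption and decryption block diagrams. In both, f updates σ_i to σ_{i+1} under key k and g produces z_i; encryption computes c_i = h(z_i, m_i), decryption computes m_i = h^{-1}(z_i, c_i)]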
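Synchronous Stream Cipher (II)
Keystream is independent of plaintext and ciphertext
Properties
  - Synchronization requirement: sender and receiver must stay in step
  - No error propagation
Active attacks
  - Insertion, deletion, or replay of ciphertext digits causes loss of synchronization
  - An adversary can change selected ciphertext digits, and thereby the corresponding plaintext digits
  - Integrity check mechanisms are therefore needed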
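The two properties, no error propagation and loss of synchronization on deletion, can be observed on a toy binary additive cipher. The sketch below is an assumption-laden illustration, not a secure design: the keystream generator is the 4-stage LFSR <4, 1 + D + D^4>, the key is simply its initial state, and h is XOR, so the same function encrypts and decrypts.

```python
def keystream(key_state, n, taps=(1, 0, 0, 1)):
    """Toy keystream: output of the LFSR <4, 1 + D + D^4> started from key_state."""
    s = list(key_state)
    while len(s) < n:
        # s_j = c_1 s_{j-1} + ... + c_L s_{j-L} (mod 2)
        s.append(sum(c * s[-1 - i] for i, c in enumerate(taps)) % 2)
    return s[:n]

def xor_cipher(bits, key_state):
    """c_i = m_i XOR z_i; applying it twice with the same key returns the input."""
    z = keystream(key_state, len(bits))
    return [b ^ zi for b, zi in zip(bits, z)]

key = [0, 1, 1, 0]
m = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
c = xor_cipher(m, key)

# A single flipped ciphertext bit changes exactly one plaintext bit.
flipped = c[:]
flipped[5] ^= 1
after_flip = xor_cipher(flipped, key)
wrong_positions = [i for i in range(len(m)) if after_flip[i] != m[i]]

# A deleted ciphertext bit shifts everything after it against the keystream.
after_deletion = xor_cipher(c[:5] + c[6:], key)

print(wrong_positions)
print(after_deletion, m[:5] + m[6:])
```

The flipped bit shows both sides of the trade-off: errors stay local, but an adversary can also change a chosen plaintext bit without being noticed unless an integrity check is added.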
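Self-Sync. Stream Cipher (I)
  - σ_i = (c_{i-t}, c_{i-t+1}, …, c_{i-1}); σ_0 = (c_{-t}, c_{-t+1}, …, c_{-1}) : initial value
  - g : keystream-generating function, z_i = g(σ_i, k); k : key
  - h : output function, c_i = h(z_i, m_i); m_i : plaintext, z_i : keystream, c_i : ciphertext
[Figure: encryption and decryption block diagrams. In both, the state is formed from the previous t ciphertext digits and g produces z_i under key k; encryption computes c_i = h(z_i, m_i), decryption computes m_i = h^{-1}(z_i, c_i)]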
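Self-Sync. Stream Cipher (II)
Keystream depends on the key and on a fixed number t of previous ciphertext digits
Properties
  - Self-synchronization: after insertion or deletion, decryption recovers once t correct ciphertext digits have been received
  - Limited error propagation: one modified ciphertext digit affects at most the following t digits
Active attacks
  - Insertion, deletion, or replay is more difficult to detect than in a synchronous cipher
  - Modification of ciphertext digits is easier to detect, since several following digits decrypt incorrectly
Diffusion of plaintext statistics
  - More diffusion, hence more resistant against attacks based on plaintext redundancy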
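Limited error propagation and self-synchronization follow directly from σ_i depending only on the last t ciphertext digits, and can be checked on a toy cipher. In the sketch below the keyed function `g` is an arbitrary choice made purely for illustration (an inner product with the key plus one AND term) and has no security value; t = 4 and h is XOR.

```python
def g(sigma, key):
    """Toy keystream function of the last t ciphertext bits and the key (illustrative only)."""
    linear = sum(s & k for s, k in zip(sigma, key)) % 2
    return linear ^ (sigma[0] & sigma[-1])

def encrypt(m, key, iv):
    sigma, c = list(iv), []
    for mi in m:
        ci = mi ^ g(sigma, key)          # c_i = h(z_i, m_i) with h = XOR
        c.append(ci)
        sigma = sigma[1:] + [ci]         # sigma_{i+1} = (c_{i-t+1}, ..., c_i)
    return c

def decrypt(c, key, iv):
    sigma, m = list(iv), []
    for ci in c:
        m.append(ci ^ g(sigma, key))
        sigma = sigma[1:] + [ci]
    return m

t = 4
key, iv = [1, 0, 1, 1], [0, 0, 0, 0]
m = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1]
c = encrypt(m, key, iv)

# Flip ciphertext bit j: only positions j .. j+t can decrypt incorrectly.
j = 6
damaged = c[:]
damaged[j] ^= 1
errors = [i for i, (x, y) in enumerate(zip(decrypt(damaged, key, iv), m)) if x != y]

# Delete ciphertext bit j: decryption is back in step t bits later.
resynced = decrypt(c[:j] + c[j + 1:], key, iv)

print(errors)
print(resynced[j + t:] == m[j + t + 1:])
```

How many of the positions j+1 … j+t are actually wrong depends on `g` and the data; the bound of t affected digits does not.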
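Nonlinear Combiner (I)
[Figure: the outputs of LFSR 1, LFSR 2, …, LFSR n feed a nonlinear combining function f, which outputs the keystream z]
Algebraic Normal Form (ANF): the mod-2 sum of distinct m-th order products of the variables, 0 ≤ m ≤ n
Ex) f(x_1, x_2, x_3, x_4, x_5) = 1 ⊕ x_2 ⊕ x_3 ⊕ x_4 ⊕ x_4 x_5 ⊕ x_1 x_2 x_3 x_4, deg(f) = 4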
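Nonlinear Combiner (II)
Geffe generator
[Figure: LFSR 1, LFSR 2, LFSR 3 produce x_1, x_2, x_3; x_2 selects between x_1 and x_3 to form the keystream z]
  - f(x_1, x_2, x_3) = x_1 x_2 ⊕ (1 ⊕ x_2) x_3 = x_1 x_2 ⊕ x_2 x_3 ⊕ x_3
  - Period: p(z) = (2^L1 - 1)(2^L2 - 1)(2^L3 - 1), where L1, L2, and L3 are pairwise relatively prime
  - Linear complexity: L(z) = L1 L2 + L2 L3 + L3
  - Prob(z(t) = x_1(t)) = 3/4, so a correlation attack is possible!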
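The 3/4 correlation can be read off the truth table of f. The short sketch below enumerates all 8 inputs, assuming they are uniform and independent; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def geffe(x1, x2, x3):
    """f(x1, x2, x3) = x1 x2 XOR (1 XOR x2) x3: x2 selects x1 (x2 = 1) or x3 (x2 = 0)."""
    return (x1 & x2) ^ ((1 ^ x2) & x3)

def geffe_anf(x1, x2, x3):
    """The same function written in algebraic normal form."""
    return (x1 & x2) ^ (x2 & x3) ^ x3

inputs = list(product((0, 1), repeat=3))

# Probability that the output equals each individual input.
agreement = [Fraction(sum(geffe(*x) == x[i] for x in inputs), len(inputs))
             for i in range(3)]
print(agreement)
```

Under this enumeration the output agrees with x_1 and with x_3 three times out of four, while it agrees with x_2 only half of the time.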
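Nonlinear Combiner (III)
Summation generator
[Figure: LFSR 1, LFSR 2, …, LFSR n produce x_1, x_2, …, x_n, which are added as integers together with a carry; the low-order bit of the sum is the keystream z]
  - If the L_i are pairwise relatively prime, then p(z) = ∏_{i=1}^{n} (2^{L_i} - 1) and L(z) ≈ p(z)
  - But vulnerable to a correlation attack on the carry and to an attack based on the 2-adic span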
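Clock-controlled generator (I)
Alternating step generator
[Figure: a clock drives LFSR R1, whose output controls the clocking of LFSR R2 and LFSR R3; their outputs are combined into the keystream z]
  - R1 : de Bruijn sequence of period 2^L1 (de Bruijn = maximal period)
  - R2, R3 : m-sequences such that gcd(L2, L3) = 1
  - Period: p(z) = 2^L1 (2^L2 - 1)(2^L3 - 1)
  - Linear complexity: (L2 + L3) 2^{L1-1} < L(z) ≤ (L2 + L3) 2^L1
  - The best known attack is a divide-and-conquer attack on the control register R1, taking about 2^L steps when all registers have length about L; L should therefore be about 128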
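Clock-controlled generator (II)
Shrinking generator
[Figure: a common clock drives LFSR R1 (output a_i) and LFSR R2 (output b_i); if a_i = 1, output b_i; if a_i = 0, discard b_i]
  - If gcd(L1, L2) = 1, p(z) = (2^L2 - 1) 2^{L1-1}
  - Linear complexity: L2 2^{L1-2} < L(z) ≤ L2 2^{L1-1}
  - The best known attack takes O(2^L1 · L2^3) steps; each L_i should be about 64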
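The selection rule is short enough to write out directly. The sketch below uses deliberately tiny registers (L1 = 3 and L2 = 4, with connection polynomials 1 + D + D^3 and 1 + D + D^4), chosen for this illustration only; with these sizes the period formula gives (2^4 - 1) · 2^2 = 60.

```python
def lfsr(taps, state, n):
    """First n output bits; taps = [c_1..c_L], state = [s_0..s_{L-1}]."""
    L = len(taps)
    s = list(state)
    for j in range(L, n):
        s.append(sum(taps[i] * s[j - 1 - i] for i in range(L)) % 2)
    return s[:n]

def shrinking_generator(a, b):
    """Output b_i whenever a_i = 1; discard b_i whenever a_i = 0."""
    return [bi for ai, bi in zip(a, b) if ai == 1]

clocks = 210                                    # two joint periods: 2 * lcm(7, 15)
a = lfsr([1, 0, 1], [1, 0, 0], clocks)          # R1, L1 = 3, period 7
b = lfsr([1, 0, 0, 1], [0, 1, 1, 0], clocks)    # R2, L2 = 4, period 15
z = shrinking_generator(a, b)

print(len(z))
print(z[:60] == z[60:])
```

R1 outputs four 1s in each period of 7, so 105 clocks yield 60 keystream bits, after which both registers are back in their initial states and the output repeats.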
Other generators
  - Cascade generator
  - CSPRBG (Cryptographically Secure Pseudo-Random Bit Generator): RSA LSB generator, BBS generator (p. 336)
  - Pseudo-noise generator: noise diode or noise transistor
  - Feedback with Carry Shift Register (FCSR): 2-adic span
  - A5/1, A5/2, HC-256, RC4, PKZIP, Py, Rabbit, FISH, SEAL, Salsa20, SOBER, etc.
Correlation Attack
Correlation Attack (I)
Siegenthaler, 1984
  - The complexity of breaking a combining generator depends on the correlation properties of the combining function F.
  - Divide-and-conquer attack: if the output of F is correlated with the output of KSG 1, the initial vector of KSG 1 can be found independently of the other generators
[Figure: KSG 1, KSG 2, …, KSG n produce x_1, x_2, …, x_n, which the combining function F maps to the keystream z]
Correlation Attack (II)
Assume Prob(z = 0 | x_i = 0) = 1/2 - e, e > 0
Identify the initial vector of KSG i by divide and conquer (known-ciphertext attack):
  1. Guess an initial vector of KSG i
  2. Generate x_i' from KSG i
  3. Compute e' = 1/2 - Prob(z = 0 | x_i' = 0)
  4. If the guessed initial vector is correct, e' ≈ e; if not, e' ≈ 0, since x_i' has no correlation with z
This attack is very effective, so F must be chosen such that e = 0.
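The procedure can be run end to end on a toy Geffe generator. The sketch below makes several simplifying assumptions: the keystream z is taken as known directly (rather than derived from ciphertext), the three LFSRs are tiny (lengths 3, 4, 5 with connection polynomials 1 + D + D^3, 1 + D + D^4, 1 + D^2 + D^5), one full period of 7 · 15 · 31 bits is observed, and the agreement count with z stands in for the estimate e'. All names are illustrative.

```python
from itertools import product

def lfsr(taps, state, n):
    """First n output bits; taps = [c_1..c_L], state = [s_0..s_{L-1}]."""
    L = len(taps)
    s = list(state)
    for j in range(L, n):
        s.append(sum(taps[i] * s[j - 1 - i] for i in range(L)) % 2)
    return s[:n]

# Connection polynomials 1 + D + D^3, 1 + D + D^4, 1 + D^2 + D^5.
TAPS = ([1, 0, 1], [1, 0, 0, 1], [0, 1, 0, 0, 1])

def geffe_keystream(states, n):
    x1, x2, x3 = (lfsr(t, s, n) for t, s in zip(TAPS, states))
    return [(a & b) ^ ((1 ^ b) & c) for a, b, c in zip(x1, x2, x3)]

def best_correlated_state(z, taps):
    """Guess every initial state of one register; keep the one agreeing most with z."""
    def agreement(state):
        return sum(a == b for a, b in zip(lfsr(taps, state, len(z)), z))
    return list(max(product((0, 1), repeat=len(taps)), key=agreement))

secret = ([1, 0, 1], [0, 1, 1, 0], [1, 1, 0, 0, 1])
n = 7 * 15 * 31
z = geffe_keystream(secret, n)

# Registers 1 and 3 are each correlated with z (probability 3/4): attack them separately.
s1 = best_correlated_state(z, TAPS[0])
s3 = best_correlated_state(z, TAPS[2])
# Register 2 is uncorrelated with z; with s1 and s3 known, search it exhaustively.
s2 = next(list(c) for c in product((0, 1), repeat=4)
          if geffe_keystream((s1, c, s3), n) == z)

recovered = (s1, s2, s3)
print(recovered)
```

The point of divide and conquer is the work factor: about 2^3 + 2^5 + 2^4 guesses here instead of 2^(3+4+5) for a joint search over all three registers.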
Resilient Functions
  - A balanced function f : {0,1}^n → {0,1}^m : every possible output m-tuple is equally likely to occur when the n input bits are chosen independently at random
  - A k-resilient function f : {0,1}^n → {0,1}^m : every possible output m-tuple is equally likely to occur when the values of k arbitrary inputs are fixed and the remaining n-k input bits are chosen independently at random
  - A 0-resilient function is just a balanced function.
  - A k-resilient function is (k-1)-resilient.
  - E.g.) f(x_1, x_2) = x_1 + x_2 is 1-resilient.
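For small n the definition can be checked exhaustively. The sketch below covers only the single-output case m = 1, where "equally likely" means the restricted function outputs 1 on exactly half of the remaining inputs; the name `is_resilient` and the calling convention are choices made for this illustration.

```python
from itertools import combinations, product

def is_resilient(f, n, k):
    """True if f: {0,1}^n -> {0,1} stays balanced whenever any k inputs are fixed."""
    for positions in combinations(range(n), k):
        free = [i for i in range(n) if i not in positions]
        for fixed in product((0, 1), repeat=k):
            ones = 0
            for rest in product((0, 1), repeat=n - k):
                x = [0] * n
                for p, v in zip(positions, fixed):
                    x[p] = v
                for p, v in zip(free, rest):
                    x[p] = v
                ones += f(*x)
            if 2 * ones != 2 ** (n - k):      # not exactly half ones
                return False
    return True

def xor2(x1, x2):
    return x1 ^ x2

def geffe(x1, x2, x3):
    return (x1 & x2) ^ ((1 ^ x2) & x3)

print(is_resilient(xor2, 2, 0), is_resilient(xor2, 2, 1), is_resilient(xor2, 2, 2))
print(is_resilient(geffe, 3, 0), is_resilient(geffe, 3, 1))
```

The Geffe function makes a useful contrast: it is balanced (0-resilient) but not 1-resilient, since fixing x_1 = 0 leaves (1 ⊕ x_2) x_3, which is 1 on only one of four inputs.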
Multi-output Stream Ciphers
To design a multi-output stream cipher based on a combining generator, we need a resilient function which
  - is nonlinear
  - has algebraic degree as large as possible (for large LC)
  - has nonlinearity as large as possible
  - has resiliency as large as possible
[Figure: KSG 1, KSG 2, …, KSG n feed the combining function F]
Summary of a Stream Cipher
  - Period: depends on the required level of security
  - Linear complexity: length of the shortest LFSR that generates a given sequence
  - Measures against correlation attack: correlation-immune function, nonlinear function
* A5 (for GSM) crack survey: a5.htm