Session 2: Secret key cryptography – stream ciphers – part 1
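The Vernam cipher [Figure: the transmitter combines the message with the running key to produce the cryptogram; the receiver combines the cryptogram with the same running key to recover the message. A key distribution centre supplies the running key to both.]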
The Vernam cipher Advantage: unconditionally secure. Disadvantage: requires one key bit for every plaintext bit. For that reason, when the highest level of security is not required (as it is for the red phone line, etc.), a stream cipher can be used instead of the Vernam cipher.
The stream cipher procedure [Figure: at both the transmitter and the receiver, the key feeds the same deterministic algorithm, which outputs the running key $z_i$.] Transmitter: $x_i \oplus z_i = y_i$. Receiver: $y_i \oplus z_i = x_i$.
Stream ciphers The key is short – much shorter than the length of the plaintext (on average). The key determines the initial state of a deterministic algorithm. Based on the initial state, the algorithm generates the running key sequence. The running key sequence is summed modulo 2 with the bits of the plaintext.
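The procedure above can be sketched in a few lines of Python. This is only an illustration: `running_key` and `xor_bits` are hypothetical names, and Python's `random.Random` is used as a stand-in for the deterministic algorithm. It is not cryptographically secure and is not the generator discussed in these slides.

```python
import random

def running_key(key, n_bits):
    """Expand a short key into n_bits of running key.

    Assumption: random.Random seeded with the key plays the role of the
    deterministic algorithm. It is NOT suitable for real encryption.
    """
    rng = random.Random(key)
    return [rng.getrandbits(1) for _ in range(n_bits)]

def xor_bits(bits, key):
    """Sum the bits modulo 2 with the running key (enciphers and deciphers)."""
    z = running_key(key, len(bits))
    return [b ^ z_i for b, z_i in zip(bits, z)]

plaintext = [1, 0, 1, 1, 0, 0, 1, 0]
cryptogram = xor_bits(plaintext, key=2024)
recovered = xor_bits(cryptogram, key=2024)
```

Because the same key reproduces the same running key on both sides, applying `xor_bits` twice returns the plaintext.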
Stream ciphers Comparison of the running keys (labelled c1, c2, c3 in the original figure):
                 Vernam cipher      Stream cipher
Length           text               seq.
Used once        YES                YES
Nature           Randomness         Pseudorandomness
Shared secret    Running key        Algorithm + key
Stream ciphers Do not satisfy the perfect secrecy conditions (the running key is not random but pseudorandom). However, stream ciphers possess practical secrecy. The level of security depends on the design. Advantage: the secret key is short – it is the only piece of information that the transmitter and the receiver must share.
The running key: 1. What are general characteristics of these sequences? 2. What generators produce them?
Stream ciphers Enciphering bit by bit. Requirements on the generated pseudorandom sequences: long period; pseudorandomness properties; unpredictability; sufficiently large key space; etc.
Running keys The running key sequences generated by pseudorandom sequence generators are ultimately periodic (i.e. they may have an aperiodic prefix). The period must be at least as long as the length of the plaintext. In practice, this period is much longer.
Running keys Example: T = … ≈ 1.26 × 10^… bits, $V_c = 1.2 \times 10^8$ bits/sec: generating the whole period takes 3.33 × 10^… years, that is, … times the age of the universe ($1.5 \times 10^{10}$ years).
Running keys Distribution of zeros and ones. A run of length k: k consecutive equal digits bounded on both sides by the opposite digit. Runs of zeros are called gaps; runs of ones are called blocks.
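As a rough illustration of how gaps and blocks can be counted over one period, the sketch below treats the period as cyclic, so a run that wraps around the end is counted once. The function name `cyclic_runs` and the sample sequence are assumptions chosen for illustration (the sample is one period of a length-4 LFSR sequence with a hypothetical recurrence).

```python
from collections import Counter
from itertools import groupby

def cyclic_runs(bits):
    """Count runs in one period, treating the period as cyclic.

    Returns (gaps, blocks): two Counters mapping run length -> count,
    for runs of zeros and runs of ones respectively.
    """
    n = len(bits)
    # Find a position where a new run starts, so that no run is split in two.
    start = next((i for i in range(n) if bits[i] != bits[i - 1]), 0)
    rotated = bits[start:] + bits[:start]
    gaps, blocks = Counter(), Counter()
    for digit, group in groupby(rotated):
        length = len(list(group))
        (blocks if digit else gaps)[length] += 1
    return gaps, blocks

# Hypothetical sample: one period (15 bits) of a length-4 LFSR sequence.
sample = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1]
gaps, blocks = cyclic_runs(sample)
```

For this sample there are four gaps and four blocks, and half of all runs have length 1.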
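Running keys: Autocorrelation $AC(k) = (A - D)/T$, where A is the number of coincidences and D the number of non-coincidences between the original sequence and the sequence shifted by k positions, T is the period and k is the shift. In phase (k a multiple of T): $AC(k) = 1$. Out of phase (any other k): $AC(k) = (A - D)/T$.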
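A minimal sketch of the autocorrelation computation, assuming the usual definition AC(k) = (A - D)/T built from the symbols listed above. The function name and the sample period are hypothetical; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def autocorrelation(bits, k):
    """AC(k) = (A - D) / T for one period `bits` and a cyclic shift k."""
    T = len(bits)
    coincidences = sum(bits[i] == bits[(i + k) % T] for i in range(T))
    non_coincidences = T - coincidences
    return Fraction(coincidences - non_coincidences, T)

# Hypothetical sample: one period (15 bits) of a length-4 LFSR sequence.
sample = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1]
in_phase = autocorrelation(sample, 0)
out_of_phase = [autocorrelation(sample, k) for k in range(1, len(sample))]
```

For this sample the out-of-phase values are all equal, which is the behaviour required by Golomb's third postulate.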
Golomb's pseudorandomness postulates G1: In each period of the sequence, the number of 1s and the number of 0s must differ by at most 1. G2: In each period of the sequence, half of all the runs have length 1, one fourth have length 2, one eighth have length 3, etc. For each length, there are as many blocks as gaps. G3: The out-of-phase autocorrelation AC(k) must be constant for every k.
Explanation of Golomb's postulates G1: The 1s and 0s must appear along the sequence with the same probability. G2: Different n-grams (samples of n consecutive digits) must occur with the correct probability. G3: Counting the coincidences between a sequence and its shifted version must not give any information about the period of the sequence.
Golomb's postulates A finite sequence that satisfies the three Golomb postulates is called a PN (pseudo-noise) sequence. Its properties are the same as those of a random sequence with uniform distribution.
Unpredictability Given a part of a sequence of any length, a cryptanalyst cannot predict the next digit with a probability of success greater than 0.5. A measure of unpredictability: Linear complexity.
Basic structures Generators based on linear congruences. Feedback shift registers: nonlinear feedback shift registers (NLFSR) and linear feedback shift registers (LFSR).
Linear congruences A recurrence of the type $X_{i+1} = (a X_i + b) \bmod m$. The parameters a, b and m can be used as the secret key. $X_0$ is the seed that initializes the process. If the parameters a, b and m are chosen appropriately, the numbers $X_i$ do not repeat until they have covered the whole range [0, m-1].
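A small sketch of such a generator, using the standard recurrence X_{i+1} = (a·X_i + b) mod m. The parameters a = 5, b = 3, m = 16 are hypothetical values chosen so that the outputs cover the whole range [0, m-1] before repeating; they are not taken from the slides.

```python
def linear_congruential(a, b, m, seed, count):
    """Return `count` successive values X_1, X_2, ... of X_{i+1} = (a*X_i + b) mod m."""
    values = []
    x = seed
    for _ in range(count):
        x = (a * x + b) % m
        values.append(x)
    return values

# Hypothetical toy parameters with full period m = 16.
outputs = linear_congruential(a=5, b=3, m=16, seed=0, count=16)
```

With these parameters the 16 outputs are all different, and the 17th value would repeat the first one.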
Linear congruencies Example:
Linear congruences Security of the generator: bad. Given a sufficiently long portion of the sequence, it is possible to deduce the parameters m, a and b, i.e. the key.
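To give a feeling for why the security is bad, the sketch below recovers a and b from three consecutive outputs of a recurrence X_{i+1} = (a·X_i + b) mod m. It makes two simplifying assumptions that are not in the slides: the modulus m is already known, and the difference X_1 - X_0 is invertible modulo m. The parameter values are hypothetical, and `pow(x, -1, m)` requires Python 3.8 or later.

```python
def recover_lcg_parameters(x0, x1, x2, m):
    """Recover (a, b) from three consecutive outputs, assuming m is known.

    Uses x2 - x1 = a * (x1 - x0) (mod m); requires gcd(x1 - x0, m) == 1.
    """
    d1 = (x1 - x0) % m
    d2 = (x2 - x1) % m
    a = d2 * pow(d1, -1, m) % m      # modular inverse of d1
    b = (x1 - a * x0) % m
    return a, b

# Hypothetical secret parameters and three observed outputs.
m, a_secret, b_secret = 65537, 75, 74
x0 = 1
x1 = (a_secret * x0 + b_secret) % m
x2 = (a_secret * x1 + b_secret) % m
recovered = recover_lcg_parameters(x0, x1, x2, m)
```

Deducing m as well takes more outputs and is not shown here.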
Feedback shift registers A feedback shift register (FSR) consists of: n flip-flops (stages); a feedback function, which expresses each new element of the output sequence as a function of the n previous elements. The contents of the flip-flops are shifted one position at every clock pulse.
Feedback shift registers
Shift registers The state of the register: the contents of the stages between two clock pulses. The initial state: the contents of the stages at the beginning of the process. The state diagram of an FSR is cyclic if the feedback function is not singular, i.e. it has the form $g(x_1, \dots, x_n) = x_1 \oplus h(x_2, \dots, x_n)$, where $x_1$ is the stage whose content leaves the register.
Shift registers The period of the produced sequence depends on the number of stages n of the FSR and the characteristics of the function g. The maximum possible period is $2^n$. The key: the initial contents of the FSR. The feedback function can also be kept secret.
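The following sketch simulates a small FSR and checks whether its state diagram is made only of cycles. The two feedback functions are hypothetical examples for n = 3, not the ones from the slides, and the convention that the first stage is the one shifted out is an assumption.

```python
from itertools import product

def step(state, g):
    """One clock pulse: drop the first stage, append g(state) as the new last stage."""
    return state[1:] + (g(*state),)

def is_nonsingular(g, n):
    """True if every state has a different successor (state diagram is purely cyclic)."""
    states = list(product((0, 1), repeat=n))
    return len({step(s, g) for s in states}) == len(states)

def cycle_length(state, g):
    """Steps needed to return to `state`, or None if the state is never revisited."""
    current = state
    for count in range(1, 2 ** len(state) + 1):
        current = step(current, g)
        if current == state:
            return count
    return None

# Hypothetical feedback functions for n = 3.
g_nonsingular = lambda x1, x2, x3: x1 ^ (x2 & x3)   # x1 enters only as an added term
g_singular = lambda x1, x2, x3: x2 & x3             # x1 is ignored
```

With `g_singular`, two different states share the same successor, so some states lie on branches that lead into a cycle and are never revisited.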
Shift registers Example 1: n = 3 [Figure: three stages $x_1$, $x_2$, $x_3$ with feedback function g]
Shift registers Example 1 (cont.) Algebraic normal form of the function g:
Feedback shift registers Example 1 (cont.) The De Bruijn graph: singular
Feedback shift registers Example 2: n = 3 [Figure: three stages $x_1$, $x_2$, $x_3$ with feedback function g]
Feedback shift registers Example 2 (cont.) Algebraic normal form of the function g:
Feedback shift registers Example 2 (cont.) The De Bruijn graph: nonsingular
Problems with NLFSR There is no systematic method for their analysis and manipulation: the mathematical theory is not well developed. It is possible to obtain sequences whose period is $2^n$ (De Bruijn sequences). However, De Bruijn sequences do not satisfy Golomb's postulate G3.
LFSR The most important devices for generation of pseudorandom sequences. Their feedback function is a linear recurrence – linear recurrent sequences of order n.
LFSR To avoid the null sequence, the initial state must be different from the all-zero state. The largest number of different states is $2^n - 1$. A characteristic polynomial can be associated with every linear recurrence.
LFSR Example: An LFSR of length 4. [Figure: the register with its initial state, feedback polynomial, linear recurrence and generated sequence.]
LFSR The characteristics of the output sequence of the LFSR depend on the characteristics of the feedback polynomial. The feedback polynomial can be: reducible; irreducible; primitive.
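An LFSR of length 4 can be simulated as below. The recurrence s_{t+4} = s_{t+1} + s_t (mod 2), which corresponds to the characteristic polynomial x^4 + x + 1, and the initial state 1,0,0,0 are assumptions chosen for illustration; the polynomial and state of the original example are not reproduced here.

```python
def lfsr_sequence(initial_state, taps, n_bits):
    """Generate n_bits of a linear recurrent sequence over GF(2).

    initial_state: the first L bits s_0 .. s_{L-1}.
    taps: offsets j such that s_{t+L} is the XOR of the bits s_{t+j}.
    """
    seq = list(initial_state)
    L = len(initial_state)
    while len(seq) < n_bits:
        t = len(seq) - L
        new_bit = 0
        for j in taps:
            new_bit ^= seq[t + j]
        seq.append(new_bit)
    return seq[:n_bits]

# Hypothetical example: s_{t+4} = s_t XOR s_{t+1}, i.e. x^4 + x + 1.
bits = lfsr_sequence([1, 0, 0, 0], taps=(0, 1), n_bits=30)
```

The sequence repeats after 15 = 2^4 - 1 bits, the largest period a register of length 4 can reach.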
LFSR The fundamental theorem of arithmetic: Every integer greater than 1 can be represented in a unique way (up to the order of the factors) as a product of prime factors. Analogue in a Galois field (GF): Every polynomial over a GF can be represented in a unique way as a product of irreducible factors.
LFSR An irreducible polynomial has no factors except 1 and itself (up to constant multiples). Theorem: With $q = p^m$, the polynomial $x^{q^k} - x$ over the field GF($p^m$) has as factors all the irreducible polynomials whose degree divides k.
LFSR Thus, if a polynomial f(x) of degree n over GF($p^m$) has no common factors with $x^{q^k} - x$ (where $q = p^m$) for any $k = 1, \dots, \lfloor n/2 \rfloor$, then it is irreducible.
LFSR Example: GF(2)
LFSR Euclidean algorithm Used to determine the G.C.D. of two integers. The same algorithm can be used to determine the G.C.D. of two polynomials. At each step, the divisor from the previous step is divided by the remainder from the previous step, until the remainder is 0. The G.C.D. is the last nonzero remainder, i.e. the one obtained in the penultimate step of the algorithm.
LFSR Example: integers. Find (18, 12). 18 = 1 · 12 + 6; 12 = 2 · 6 + 0; hence (18, 12) = 6.
LFSR Example: polynomials in GF(2). Find $(x^5 + x^4 + x^2 + x,\ x^4 + x^3 + x^2 + x)$. $x^5 + x^4 + x^2 + x = x\,(x^4 + x^3 + x^2 + x) + (x^3 + x)$; $x^4 + x^3 + x^2 + x = (x + 1)(x^3 + x) + 0$; hence $(x^5 + x^4 + x^2 + x,\ x^4 + x^3 + x^2 + x) = x^3 + x$.
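The polynomial example above can be checked with a short sketch. The representation is an assumption made for convenience: a polynomial over GF(2) is stored as a Python integer whose bit i is the coefficient of x^i, so addition is XOR. The function names are hypothetical.

```python
def degree(p):
    """Degree of a GF(2) polynomial stored as an integer (degree(0) is -1)."""
    return p.bit_length() - 1

def poly_mod(a, b):
    """Remainder of a divided by b over GF(2)."""
    while a and degree(a) >= degree(b):
        a ^= b << (degree(a) - degree(b))   # cancel the leading term of a
    return a

def poly_gcd(a, b):
    """Euclidean algorithm: divide the previous divisor by the previous remainder."""
    while b:
        a, b = b, poly_mod(a, b)
    return a

f = 0b110110   # x^5 + x^4 + x^2 + x
g = 0b011110   # x^4 + x^3 + x^2 + x
result = poly_gcd(f, g)   # 0b1010, i.e. x^3 + x
```

The first division leaves the remainder x^3 + x and the second leaves 0, matching the two steps written above.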
LFSR Example - Determine if the polynomial is irreducible. Then, the given polynomial is not irreducible.
LFSR Example – Determine if the polynomial is irreducible. Then, the given polynomial is irreducible.
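The irreducibility check can be sketched for GF(2) using the gcd criterion: f(x) of degree n is irreducible when it shares no factor with x^(2^k) - x for k = 1, ..., n/2. As before, polynomials are stored as integers (bit i is the coefficient of x^i); this representation, the function names and the test polynomials are assumptions, not the examples from the slides.

```python
def degree(p):
    return p.bit_length() - 1

def poly_mod(a, b):
    """Remainder of a divided by b over GF(2)."""
    while a and degree(a) >= degree(b):
        a ^= b << (degree(a) - degree(b))
    return a

def poly_gcd(a, b):
    while b:
        a, b = b, poly_mod(a, b)
    return a

def poly_mulmod(a, b, f):
    """Product a*b reduced modulo f over GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a = poly_mod(a << 1, f)
    return poly_mod(result, f)

def is_irreducible(f):
    """True if f has no common factor with x^(2^k) - x for 1 <= k <= n/2."""
    n = degree(f)
    if n < 1:
        return False
    x = poly_mod(0b10, f)
    power = x                                   # x^(2^0) mod f
    for _ in range(n // 2):
        power = poly_mulmod(power, power, f)    # squaring gives x^(2^k) mod f
        if poly_gcd(f, power ^ x) != 1:         # over GF(2), minus is XOR
            return False
    return True
```

For instance, `is_irreducible(0b10011)` (x^4 + x + 1) is true, while x^4 + x^2 + 1 = (x^2 + x + 1)^2 is rejected at k = 2.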
LFSR A primitive polynomial of degree n over GF($p^m$): is irreducible; does not divide $x^k - 1$ for any positive integer $k < p^{mn} - 1$. Example: The polynomial … of degree 4 in GF(2) is irreducible and does not divide any of the polynomials $x^k - 1$ with $k < 15$. Because of that, it is primitive.
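For small degrees over GF(2), primitivity can be checked by brute force: find the smallest k such that f(x) divides x^k - 1, i.e. the order of x modulo f, and compare it with 2^n - 1. The sketch below assumes polynomials stored as integers (bit i is the coefficient of x^i) with a nonzero constant term; the names and test polynomials are hypothetical.

```python
def order_of_x(f):
    """Smallest k >= 1 with x^k = 1 modulo f over GF(2), or None if there is none.

    f is an integer whose bit i is the coefficient of x^i.
    """
    n = f.bit_length() - 1
    state = 1                       # the polynomial 1 = x^0
    for k in range(1, 2 ** n):
        state <<= 1                 # multiply by x
        if (state >> n) & 1:
            state ^= f              # reduce modulo f
        if state == 1:
            return k
    return None

def is_primitive(f):
    """f of degree n is primitive over GF(2) iff x has order 2^n - 1 modulo f."""
    n = f.bit_length() - 1
    return order_of_x(f) == 2 ** n - 1
```

As a usage example, x^4 + x + 1 (`0b10011`) gives order 15 and is primitive, whereas x^4 + x^3 + x^2 + x + 1 (`0b11111`) is irreducible but divides x^5 - 1, so it is not primitive.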
LFSR The reciprocal polynomial of the polynomial f(x) of degree n: $f^*(x) = x^n f(1/x)$. If f(x) is primitive, f*(x) is also primitive.
LFSR Example: … primitive. … primitive.
Period of the LFSR (reducible)
Generators with reducible feedback polynomials The length of the output sequence depends on the initial state. The period T satisfies …, with the possibility of secondary periods whose length divides the period T. Not adequate for use in cryptography.
Period of the LFSR (irreducible)
Generators with irreducible feedback polynomial The length of the output sequence does not depend on the initial state. The period T is a factor of $2^n - 1$. Not adequate for use in cryptography.
Period of the LFSR (primitive) PN-sequence (m-sequence): the maximum possible period for this type of generator.
Generators with primitive feedback polynomial The length of the sequence does not depend on the initial state. The period is $T = 2^n - 1$. Adequate for use in cryptography, because the output sequence satisfies all of Golomb's postulates.
How many primitive polynomials of degree L are there? Over GF(2) there are $\varphi(2^L - 1)/L$, where $\varphi$ is Euler's totient function. But not all of them are good. It is not recommended to use polynomials with very concentrated coefficients: there are attacks against LFSRs with that property. The period of the sequence should have the smallest possible number of prime factors, and these prime factors should be as large as possible.
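The count of primitive polynomials of degree L over GF(2) is given by the standard formula φ(2^L - 1)/L, with φ Euler's totient function. A small sketch for evaluating it follows; the function names are hypothetical and trial division is only practical for modest L.

```python
def euler_phi(n):
    """Euler's totient function by trial division."""
    result = n
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            result -= result // p     # multiply result by (1 - 1/p)
        p += 1
    if n > 1:                         # one remaining prime factor
        result -= result // n
    return result

def count_primitive_polynomials(L):
    """Number of primitive polynomials of degree L over GF(2)."""
    return euler_phi(2 ** L - 1) // L

counts = {L: count_primitive_polynomials(L) for L in range(2, 9)}
```

For L = 4 the result is 2, which corresponds to x^4 + x + 1 and its reciprocal x^4 + x^3 + 1.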
Mersenne primes These are prime numbers of the form $2^L - 1$. Example: … is a Mersenne prime. Example: … $= 7^2 \cdot 73 \cdot 127 \cdot 337 \cdot \ldots$ is not a Mersenne prime, so it is not recommended for LFSRs. Thus, the best strategy is to use LFSRs with a primitive polynomial of degree L such that $2^L - 1$ is a Mersenne prime. The numbers …, etc. are Mersenne primes.
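Whether 2^L - 1 is prime can be tested with the Lucas-Lehmer test, sketched below. This test is a standard technique brought in here for illustration and is not mentioned in the slides; the function names are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small exponents L."""
    if n < 2:
        return False
    p = 2
    while p * p <= n:
        if n % p == 0:
            return False
        p += 1
    return True

def is_mersenne_prime_exponent(L):
    """True if 2^L - 1 is prime (Lucas-Lehmer test)."""
    if L == 2:
        return True                   # 2^2 - 1 = 3
    if not is_prime(L):
        return False                  # a composite L gives a composite 2^L - 1
    m = (1 << L) - 1
    s = 4
    for _ in range(L - 2):
        s = (s * s - 2) % m
    return s == 0

exponents = [L for L in range(1, 64) if is_mersenne_prime_exponent(L)]
```

Register lengths L in `exponents` give a period 2^L - 1 that is itself prime, which is the strategy recommended above.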
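PN-sequences and Golomb's postulates (LFSR of length L, period $T = 2^L - 1$)
G1: each period contains $2^{L-1}$ ones and $2^{L-1} - 1$ zeros.
G2: distribution of runs in one period:
Length    Gaps          Blocks
1         2^(L-3)       2^(L-3)
2         2^(L-4)       2^(L-4)
:         :             :
r         2^(L-r-2)     2^(L-r-2)
:         :             :
L-2       1             1
L-1       1             0
L         0             1
Total     2^(L-2)       2^(L-2)
PN-sequences and Golomb's postulates G3: $AC(k) = -1/T$ for every out-of-phase shift k. PN-sequences satisfy Golomb's postulates.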
Linear complexity (unpredictability) The concept of sequence complexity: the number of sequence symbols needed to determine the rest of the sequence. General idea: associate an LFSR with every sequence. Linear complexity = the length of the shortest LFSR capable of generating the given sequence. Berlekamp-Massey algorithm (1969) Input: the binary sequence under consideration. Output: the shortest LFSR that generates it, and the initial contents.
Linear complexity Sequence 1: generated by an LFSR (primitive polynomial): VERY PREDICTABLE. Sequence 2: random: VERY UNPREDICTABLE.
Linear complexity Example: The output sequence: 1110… The initial state: $a_0, a_1, a_2, a_3$. The output bits: $y_0 = 1$, $y_1 = 1$, $y_2 = 1$, $y_3 = 0$. The equations relating the output bits to the unknowns form a linear system, which is easy to solve!
Linear complexity A random sequence of length 2L has expected linear complexity L. When a random sequence of length L is repeated periodically, the value of its linear complexity approaches the length of its period.
The Berlekamp-Massey algorithm Input to one step: n digits of a sequence. Determines the characteristics of the minimum LFSR capable of generating them. If the digit n+1 of the sequence can be generated by the current LFSR, the length of the current LFSR is preserved. Otherwise, a longer LFSR is needed, capable of generating the n+1 digits. Etc.
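The Berlekamp-Massey algorithm Theorem 1 If an LFSR of length $L$ generates the prefix $s^n$ of the intercepted sequence, but does not generate $s^{n+1}$, then $LC(s^{n+1}) \ge n + 1 - L$. Example: … generates …, but does not generate …
The Berlekamp-Massey algorithm Theorem 2 If $\langle C(D), L \rangle$ generates $s^n$, but does not generate $s^{n+1}$ (discrepancy $\delta_n \neq 0$), and $\langle C^*(D), L^* \rangle$ generates $s^m$, but does not generate $s^{m+1}$ (discrepancy $\delta_m \neq 0$), where $0 \le m < n$, then $\langle C(D) - \delta_n \delta_m^{-1} D^{n-m} C^*(D),\ \max(L, L^* + n - m) \rangle$ generates $s^{n+1}$.
The Berlekamp-Massey algorithm Theorem 3 If $\langle C(D), L \rangle$ with $L = LC(s^n)$ generates $s^n$, but does not generate $s^{n+1}$, then $LC(s^{n+1}) = \max(L, n + 1 - L)$.
Notation: $\delta = \delta_n$, $\delta^* = \delta_m$, $X = n - m$.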
The Berlekamp-Massey algorithm Example N = 7, GF(2), $s_0, \dots, s_6$ = 1,1,0,1,0,0,1. Solution: $C(D) = 1 + D + D^3$, $L = 3$.
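The example above can be reproduced with a compact sketch of the Berlekamp-Massey algorithm over GF(2). This is one common textbook formulation, not necessarily the exact variant presented in the slides; the function name and the coefficient-list representation of C(D) are assumptions.

```python
def berlekamp_massey(s):
    """Return (C, L): coefficients [c_0, ..., c_L] of C(D) and the linear complexity L.

    The LFSR satisfies s_i = c_1*s_{i-1} + ... + c_L*s_{i-L} (mod 2) for i >= L.
    """
    n = len(s)
    c = [1] + [0] * n          # current connection polynomial C(D)
    b = [1] + [0] * n          # C(D) as it was before the last length change
    L, m = 0, -1               # current length; index of the last length change
    for i in range(n):
        # Discrepancy: does the current LFSR predict s_i correctly?
        d = s[i]
        for j in range(1, L + 1):
            d ^= c[j] & s[i - j]
        if d:
            previous = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]      # C(D) <- C(D) + D^shift * B(D)
            if 2 * L <= i:                # the register must grow
                L = i + 1 - L
                m = i
                b = previous
    return c[:L + 1], L

coefficients, length = berlekamp_massey([1, 1, 0, 1, 0, 0, 1])
```

For the sequence of the example, `coefficients` is `[1, 1, 0, 1]`, which reads as C(D) = 1 + D + D^3, and `length` is 3.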