Session 2: Secret key cryptography – stream ciphers – part 2.


The Berlekamp-Massey algorithm The computational complexity of the Berlekamp-Massey algorithm is quadratic in the length of the shortest LFSR capable of generating the intercepted sequence. Thus, if the linear complexity is very high, predicting the next bits of the sequence becomes computationally infeasible.

The Berlekamp-Massey algorithm Therefore, to prevent the cryptanalysis of a pseudorandom sequence generator, we must design it so that its linear complexity is too high for the Berlekamp-Massey algorithm to be practical.
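
To make the threat concrete, here is a minimal sketch of the Berlekamp-Massey algorithm over GF(2). The function name `berlekamp_massey` and the test sequence (produced by the recurrence s_t = s_{t-1} XOR s_{t-4}) are illustrative assumptions, not taken from the slides.

```python
def berlekamp_massey(bits):
    """Return (L, c): the linear complexity L of `bits` and the connection
    polynomial c[0] + c[1]x + ... + c[L]x^L of a shortest LFSR, so that
    bits[i] = XOR of c[j] & bits[i - j] for j = 1..L."""
    n = len(bits)
    c = [1] + [0] * n      # current connection polynomial
    b = [1] + [0] * n      # polynomial before the last length change
    L, m = 0, -1           # current length, index of last length change
    for i in range(n):
        # Discrepancy between the predicted bit and the observed bit.
        d = bits[i]
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:
            t = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:             # the LFSR must grow
                L, m, b = i + 1 - L, i, t
    return L, c[:L + 1]


# Hypothetical intercepted sequence: s_t = s_{t-1} XOR s_{t-4}, start 1,0,0,0.
seq = [1, 0, 0, 0]
while len(seq) < 16:
    seq.append(seq[-1] ^ seq[-4])

length, poly = berlekamp_massey(seq)
print(length, poly)
```

Once the attacker has at least twice as many bits as the linear complexity, the recovered LFSR reproduces the rest of the sequence, which is why the designs that follow aim at a very large linear complexity.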

Pseudorandom sequence generators based on LFSRs. The goals: preserve the good characteristics of the PN-sequences; increase the linear complexity. The key is the initial state. There are different families of generators.

Combinational generators: non linear filter (one LFSR); non linear combiner (several LFSRs).

Non linear filters In general, it is difficult to calculate the value of the linear complexity of the resulting sequence. However, under some special conditions, it is possible to estimate the linear complexity of the resulting sequence.

Algebraic normal form It is the form of a Boolean function that uses only the operations ⊕ (sum modulo 2) and · (product modulo 2). In the ANF, the number of variables in the largest product is called the non linear order of the function. Example: The non linear order of the function f(x1, x2, x3) = x1 ⊕ x2x3 ⊕ x1x3 is 2.

Algebraic normal form The ANF of a function can be determined from its truth table by means of the Möbius transform: the coefficient a_u is the sum modulo 2 of f(x) over all x ≤ u, that is, over all x that have 1s only in positions where u has 1s. For example, for n=3 and u=011 the sum runs over x = 000, 001, 010, 011.

Algebraic normal form Example: n=3, truth table of f (values consistent with the coefficients computed below):
x0 x1 x2 | f
0 0 0 | 0
0 0 1 | 1
0 1 0 | 0
0 1 1 | 1
1 0 0 | 0
1 0 1 | 1
1 1 0 | 1
1 1 1 | 0

Algebraic normal form
u=000: a_000 = f(0,0,0) = 0
u=001: a_001 = f(0,0,0) + f(0,0,1) = 0+1 = 1
u=010: a_010 = f(0,0,0) + f(0,1,0) = 0+0 = 0
u=011: a_011 = f(0,0,0) + f(0,0,1) + f(0,1,0) + f(0,1,1) = 0+1+0+1 = 0
u=100: a_100 = f(0,0,0) + f(1,0,0) = 0+0 = 0
u=101: a_101 = f(0,0,0) + f(0,0,1) + f(1,0,0) + f(1,0,1) = 0+1+0+1 = 0
u=110: a_110 = f(0,0,0) + f(0,1,0) + f(1,0,0) + f(1,1,0) = 0+0+0+1 = 1
u=111: a_111 = f(0,0,0) + f(0,0,1) + f(0,1,0) + f(0,1,1) + f(1,0,0) + f(1,0,1) + f(1,1,0) + f(1,1,1) = 0

Algebraic normal form f(x0, x1, x2) = a_001 x2 + a_110 x0 x1 = x2 + x0 x1
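
The Möbius transform can be sketched in a few lines of Python. The function name `mobius_transform` is an illustrative choice, and the truth table is the one implied by the result f = x2 + x0 x1, with x0 taken as the most significant bit of the index; both are assumptions of this sketch.

```python
def mobius_transform(truth_table):
    """Return the ANF coefficients a_u from the truth table of f.

    Index u (and x) is the integer whose binary representation is the
    input vector; a_u is the XOR of f(x) over all x covered by u."""
    a = list(truth_table)
    n = len(a).bit_length() - 1          # number of variables
    for i in range(n):
        step = 1 << i
        for u in range(len(a)):
            if u & step:
                a[u] ^= a[u ^ step]      # fold in the half with bit i cleared
    return a


# Truth table of f(x0, x1, x2) = x2 + x0*x1, index = 4*x0 + 2*x1 + x2.
table = [0, 1, 0, 1, 0, 1, 1, 0]
coefficients = mobius_transform(table)
print(coefficients)
```

Only the coefficients at indices 1 (u=001, the term x2) and 6 (u=110, the term x0 x1) are nonzero, which matches the ANF on the slide.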

Non linear filters Theorem (Rueppel, 1984): Given an LFSR of length n and a filter function whose ANF has a unique term of maximum order k, that term being a product of equidistant phases, the linear complexity of the resulting sequence is bounded from below by the binomial coefficient C(n, k).

Non linear filters Design principles: The feedback polynomial must be primitive. The filter function must have several terms of each order. k ≈ n/2. Include a linear term in order to obtain good statistical properties of the resulting sequence (balanced filter function).

Non linear combiners In these generators, the keystream sequence is obtained by combining the output sequences of several LFSRs in a non linear manner. Example: a memoryless Boolean function can be used as the combiner.

Non linear combiners Two cryptographic principles by Shannon: Confusion – we must use complicated transformations – as many bits of the key as possible should be involved in obtaining a single bit of the keystream sequence (and the ciphertext). Diffusion – Every bit of the key must affect many bits of the keystream sequence (and the ciphertext).

Non linear combiners Possible flaws of non linear combiners (to be considered during the design): Bad statistical properties – e.g. too many zeros/ones in the output sequence. Correlation – The output sequence coincides too much with one or more internal sequences – this enables correlation attacks.

Non linear combiners Correlation attacks: The task of the cryptanalyst can be divided into several easier tasks (“divide and conquer”). To prevent correlation attacks, the non linear function of the combiner must have, at the same time, as high a non linear order as possible and as high a correlation immunity as possible. These two requirements conflict, so a trade-off between the two values must be found.

Non linear combiners Correlation immunity: A Boolean function is correlation immune of order m if its output sequence is not correlated with any set of m or fewer input sequences. However, the higher the correlation immunity, the lower the non linear order k. The trade-off (N is the number of variables): m + k ≤ N; 1 ≤ k ≤ N, 0 ≤ m ≤ N−1.

Non linear combiners A Boolean function is balanced if it has an equal number of 0s and 1s in its truth table. The balanced correlation immune functions of order m are called m-resilient functions.

Non linear combiners Example: The sum modulo 2 of N variables has the maximum possible value of correlation immunity, N-1, but its non linear order is 1. If the combination function contains memory, then the trade off between the correlation immunity and non linearity is not needed – it is possible to maximize both values – a single bit of memory is enough (Rueppel, 1984).
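
The XOR example can be checked by brute force. The sketch below relies on the spectral characterization of correlation immunity (Xiao and Massey): a function is correlation immune of order m exactly when its Walsh coefficients vanish for every mask of weight 1 to m. The function names and the two test functions are illustrative assumptions.

```python
from itertools import product


def walsh(f, n, w):
    """Walsh coefficient of f at mask w: sum over x of (-1)^(f(x) XOR w.x)."""
    total = 0
    for x in product((0, 1), repeat=n):
        dot = sum(wi & xi for wi, xi in zip(w, x)) & 1
        total += -1 if f(*x) ^ dot else 1
    return total


def correlation_immunity_order(f, n):
    """Largest m such that all Walsh coefficients of weight 1..m are zero."""
    order = 0
    for m in range(1, n + 1):
        masks = (w for w in product((0, 1), repeat=n) if sum(w) == m)
        if any(walsh(f, n, w) != 0 for w in masks):
            break
        order = m
    return order


def xor3(a, b, c):      # sum modulo 2 of N = 3 variables, non linear order 1
    return a ^ b ^ c


def and2(a, b):         # product of N = 2 variables, non linear order 2
    return a & b


print(correlation_immunity_order(xor3, 3))   # N - 1 for the XOR
print(correlation_immunity_order(and2, 2))   # the product leaks its inputs
```

Both results respect the trade-off m + k ≤ N: the XOR reaches m = 2 with k = 1, while the product has k = 2 and m = 0.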

Non linear combiners If F is a Boolean function of N periodic input sequences a_1(t), a_2(t), ..., a_N(t), then the output sequence b(t) = F(a_1(t), a_2(t), ..., a_N(t)) is a linear combination of various products of sequences. These products are given by the ANF of the function F.

Non linear combiners If in the ANF of the function F we replace the sum and product modulo 2 by the sum and product of integers, the resulting function is denoted by F*, and for the linear complexity and the period of the output sequence of F the following holds:

Non linear combiners Example: If the characteristic polynomials of the input sequences are: All these polynomials are primitive!

Non linear combiners Example (cont.): Then

Non linear combiners The sum of N sequences in GF(q): The equality holds if the characteristic polynomials of the input sequences are mutually prime.

Non linear combiners The sum of N sequences in GF(q): Obviously, if the periods of the input sequences are mutually prime then

Non linear combiners Example: Primitive! The periods are Mersenne primes

Non linear combiners Product of N sequences in GF(q): Theorem (Golić, 1989) If Per(a_i) are mutually prime, then Theorem (Lidl, Niederreiter) Per(a_i) are mutually prime

Non linear combiners Example: Primitive! The periods are Mersenne primes

Non linear combiners The general case: Let be the Boolean function obtained by removing all the products from the function F except those of the maximum order. Let be the corresponding integer function.

Non linear combiners Theorem (Golić, 1989) F depends on all the N input variables. Per(a i ) are mutually prime. Then

Non linear combiners Example: If the characteristic polynomials of the input sequences are: Primitive, periods Mersenne primes

Non linear combiners Example (cont.)

Geffe’s generator F balanced – good statistical properties

Geffe’s generator The equivalent scheme

Geffe’s generator Example: polynomials – primitive, with periods that are Mersenne primes.

Geffe’s generator Problem: Correlation!
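
The correlation problem can be quantified by enumerating the truth table. The combining function is not preserved in this transcript, so the sketch assumes the commonly quoted form of Geffe's function, F(x1, x2, x3) = x1 x2 + x2 x3 + x3 (x2 selects between x1 and x3); the helper name `agreement` is also an illustrative assumption.

```python
from itertools import product


def geffe(x1, x2, x3):
    """Assumed Geffe combiner: output x1 when x2 = 1, otherwise x3."""
    return (x1 & x2) ^ (x2 & x3) ^ x3


inputs = list(product((0, 1), repeat=3))


def agreement(i):
    """Fraction of inputs for which the output equals input number i."""
    return sum(geffe(*x) == x[i] for x in inputs) / len(inputs)


ones = sum(geffe(*x) for x in inputs)
print("ones in truth table:", ones, "of", len(inputs))
for i in range(3):
    print(f"P(F = x{i + 1}) =", agreement(i))
```

Under this assumption the function is balanced, yet the output agrees with x1 and with x3 three times out of four, which is what lets an attacker recover those two registers separately.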

Correlation immune functions Is there a way to find a Boolean memoryless combiner that guarantees a high level of correlation immunity? This is a difficult problem and there is no final answer. However, some Boolean combiners are known to have a high level of correlation immunity.

Correlation immune functions One of the classes of such “good” functions is based on Latin squares. A Latin square is an n×n array of n distinct symbols (here integers) in which each symbol appears exactly once in each row and exactly once in each column.

Correlation immune functions Basic property of Latin squares: If we swap two rows or two columns of a Latin square, the result is also a Latin square. This gives rise to a construction (one of the possible algorithms): We start from the addition table of the additive group with n elements. We swap some rows and columns of the table several times.
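
The construction can be sketched as follows. The function names, the number of swaps and the fixed seed are illustrative assumptions; the sketch only demonstrates that row and column swaps preserve the Latin property.

```python
import random


def latin_square(n, swaps=10, seed=0):
    """Start from the addition table of Z_n, then swap rows and columns."""
    rng = random.Random(seed)
    square = [[(r + c) % n for c in range(n)] for r in range(n)]
    for _ in range(swaps):
        i, j = rng.randrange(n), rng.randrange(n)
        square[i], square[j] = square[j], square[i]      # swap two rows
        i, j = rng.randrange(n), rng.randrange(n)
        for row in square:
            row[i], row[j] = row[j], row[i]              # swap two columns
    return square


def is_latin(square):
    """Check that every symbol 0..n-1 occurs once per row and per column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({row[c] for row in square} == symbols for c in range(n))
    return rows_ok and cols_ok


square = latin_square(4)
for row in square:
    print(row)
print(is_latin(square))
```

Used as a combiner, the row index and the column index would be supplied by the two address branches and the table entry would be the output symbol.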

Correlation immune functions Example – a Latin square of order 4:

Correlation immune functions A Latin square of dimension n can be viewed as a family of log₂ n Boolean functions (a vectorial Boolean function with log₂ n outputs): There are 2 address branches of log₂ n bits each. The output has log₂ n bits. Example (see previous slide): The address is 0110 (the two most significant bits address the row). The output is 10.

Correlation immune functions Basic correlation-related property of Latin squares: Each bit of output is correlated with a linear combination of inputs that are located in both address branches. Consequence: there is no way of analyzing the address branches individually – no divide and conquer.

Correlation immune functions

Decimation of sequences The principal characteristic: The output sequence of a subgenerator controls the clock sequence of one or more subgenerators.

Decimation of sequences Example 1: X = 1,1,0,1,0,1,0,1; Y = 0,1,0,0,1; Z = 1,0,1,0,0. Example 2: X and Y are generated by LFSRs and the BRM is applied.
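
Example 1 is consistent with the following clocking rule, which is inferred from the three sequences and is an assumption of this sketch: Y is the control sequence, and before each output bit the register producing X takes one extra step whenever the control bit is 1. The function name `brm_decimate` is hypothetical.

```python
def brm_decimate(x, y):
    """Decimate x under control of y: skip one extra bit of x when y_t = 1."""
    z = []
    pos = 0
    for control in y:
        pos += control          # extra clock pulse for a control bit of 1
        if pos >= len(x):       # ran out of data bits
            break
        z.append(x[pos])
        pos += 1                # regular clock pulse
    return z


X = [1, 1, 0, 1, 0, 1, 0, 1]
Y = [0, 1, 0, 0, 1]
Z = brm_decimate(X, Y)
print(Z)
```

With these inputs the sketch reproduces the Z of Example 1, taking the bits of X at positions 0, 2, 3, 4 and 6.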

Decimation of sequences Theorem (Chambers, Jennings, 1984): Let R1, R2 be primitive polynomials of degrees m and n, respectively, with periods M = 2^m − 1 and N = 2^n − 1. If all the prime factors of M divide N, then the output sequence has linear complexity LC = nM and period Per = NM.

Decimation of sequences The requirements of the Theorem are satisfied if the lengths of both LFSRs are equal and the feedback polynomials are primitive.

Decimation of sequences Example: n = m = 107, primitive polynomials. LC = nM = 107(2^107 − 1), Per = NM = (2^107 − 1)(2^107 − 1).

The shrinking generator (1993) A very simple binary sequence generator (Crypto’93). It consists of two LFSRs: LFSR1 and LFSR2. Based on the decimation rule P, LFSR1 (the control register) decimates the sequence generated by LFSR2. [Figure: block diagram with LFSR 1, LFSR 2, the rule P and the clock.]

The shrinking generator If a_i = 0, b_i is discarded; otherwise b_i is sent to the output. Thus the number of bits discarded from the sequence b depends on the lengths of the runs of 0s in the sequence a.

The shrinking generator (an example) LFSRs: LFSR1: L1 = 3, f1(x) = 1 + x² + x³, IS1 = (1,0,0). LFSR2: L2 = 4, f2(x) = 1 + x + x⁴, IS2 = (1,0,0,0). Decimation rule P: {a_i} = …, {b_i} = …, {c_j} = … The underlined bits (1 and 0) are eliminated.
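
The decimation rule can be sketched as below. The mapping from a feedback polynomial to a recurrence (a term x^j meaning a tap on a_{t-j}) and the order in which the initial state is read are assumptions, so the bits printed here are not claimed to match the elided sequences of the slide; the names `lfsr` and `shrink` are hypothetical.

```python
from itertools import islice


def lfsr(taps, state):
    """Yield a_0, a_1, ... where a_t is the XOR of a_(t-j) for j in taps
    and state = (a_0, ..., a_(L-1))."""
    s = list(state)
    L = len(s)
    while True:
        yield s[0]
        new = 0
        for j in taps:
            new ^= s[L - j]
        s = s[1:] + [new]


def shrink(a, b):
    """Keep b_i when a_i = 1, discard it when a_i = 0."""
    for a_i, b_i in zip(a, b):
        if a_i:
            yield b_i


# Assumed reading of the example: f1 = 1 + x^2 + x^3, f2 = 1 + x + x^4.
control = list(islice(lfsr((2, 3), (1, 0, 0)), 14))
data = list(islice(lfsr((1, 4), (1, 0, 0, 0)), 30))
output = list(shrink(control, data))
print(control)
print(output)
```

Because both polynomials are primitive, the control sequence repeats every 7 bits and the data sequence every 15 bits under either tap convention.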

Characteristics of the output sequence Period: Linear complexity: Number of 1’s: balanced sequence

Example – BRM vs. Shrinking BRM: X= … Y= … Z= … Shrinking: X= … Y= … Z=

Statistical testing of PN generators The output sequence of a generator of pseudorandom sequences looks random, but it is not. Pseudorandom generators expand a truly random sequence (the key) to a much longer sequence, such that an adversary cannot distinguish between the pseudorandom sequence and a truly random sequence.

Statistical testing of PN generators To gain confidence in the security of this type of generator, various statistical tests designed specifically for this purpose are applied. Passing a set of statistical tests should be considered a necessary condition, although not a sufficient one, for the security of the generator.

Statistical testing of PN generators If the result X of an experiment can take any real value, then X is a continuous random variable. The probability density function f(x) of a continuous random variable X can be integrated and the following holds: f(x) ≥ 0 for all x ∈ ℝ. For all a, b ∈ ℝ the following holds: $P(a < X \le b) = \int_a^b f(x)\,dx$.

Statistical testing of PN generators A continuous random variable X has a normal distribution with mean μ and variance σ² if its probability density function is: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$, $-\infty < x < \infty$. We say that X is N(μ, σ²). If X is N(0, 1), then we say that X has a standard normal distribution.

Statistical testing of PN generators If the random variable X is N(μ, σ²), then the variable Z = (X − μ)/σ is N(0, 1). Euler's gamma function: $\Gamma(t) = \int_0^\infty x^{t-1} e^{-x}\,dx$, t > 0.

Statistical testing of PN generators A continuous random variable X has a χ² distribution with ν degrees of freedom if its probability density function is $f(x) = \frac{1}{\Gamma(\nu/2)\,2^{\nu/2}}\, x^{\nu/2-1} e^{-x/2}$ for x ≥ 0, and f(x) = 0 for x < 0.

Statistical testing of PN generators A statistical hypothesis H is an assertion about the distribution of one or more random variables. A hypothesis test is a procedure based on the observed values of the random variable that leads to the acceptance or rejection of the hypothesis H.

Statistical testing of PN generators The test only provides a measure of the strength of evidence given by the data against the hypothesis. The conclusion is probabilistic. The level of significance α of the test of the hypothesis H is the probability of rejecting the hypothesis H when it is true.

Statistical testing of PN generators The hypothesis to be tested is called the null hypothesis, H0. The alternative hypothesis is denoted by H1 or Ha. In cryptography: H0 – the given generator is a random sequence generator.

Statistical testing of PN generators If α is too small, the test could accept non random sequences. If α is too high, the test could reject random sequences. In cryptography: α is between 0.001 and 0.05.

Statistical testing of PN generators A test: Determines a statistic for the sample of the output sequence. This statistic is compared with the value expected for a random sequence.

Statistical testing of PN generators How is the comparison carried out? The computed statistic, X0, follows a χ² distribution with ν degrees of freedom. It is assumed that this statistic takes large values for non random sequences. In order to achieve the significance level α, a threshold X_α is chosen (by means of the corresponding table), such that P(X0 > X_α) = α.

Statistical testing of PN generators How is the comparison carried out? (cont.) If the value of the statistic for the sample of the output sequence, X_s, satisfies X_s > X_α, then the sequence fails the test. Basic tests for cryptographic use: Frequency test, serial test, poker test, runs test, autocorrelation test, etc.

Statistical testing of PN generators Frequency test Purpose: determine if the number of zeros and ones in a sequence s is approximately the same. n0 – number of zeros, n1 – number of ones. The statistic: $X_1 = \frac{(n_0 - n_1)^2}{n}$

Statistical testing of PN generators Frequency test (cont.) The statistic follows a χ² distribution with 1 degree of freedom. The approximation is good enough if n ≥ 10.
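
A minimal sketch of the frequency test, assuming the usual form of the statistic, X1 = (n0 − n1)²/n, as given for example in the Handbook of Applied Cryptography. The 160-bit sample and the function name are illustrative assumptions, not part of the slides.

```python
def frequency_test(s):
    """Frequency (monobit) statistic X1 = (n0 - n1)^2 / n."""
    n = len(s)
    n1 = sum(s)
    n0 = n - n1
    return (n0 - n1) ** 2 / n


# Illustrative sample: a 40-bit pattern repeated four times (n = 160).
PATTERN = "11100 01100 01000 10100 11101 11100 10010 01001".replace(" ", "")
SAMPLE = [int(ch) for ch in PATTERN * 4]

x1 = frequency_test(SAMPLE)
print(x1)
```

At α = 0.05 the χ² threshold for 1 degree of freedom is 3.8415, so this sample, with 84 zeros and 76 ones, passes the frequency test.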

Statistical testing of PN generators Serial test Tries to determine if the number of occurrences of 00, 01, 10 and 11 as (overlapping) subsequences of s is approximately the same. The statistic: $X_2 = \frac{4}{n-1}\left(n_{00}^2 + n_{01}^2 + n_{10}^2 + n_{11}^2\right) - \frac{2}{n}\left(n_0^2 + n_1^2\right) + 1$ The statistic follows approximately a χ² distribution with 2 degrees of freedom. The approximation is good enough if n ≥ 21.
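
A minimal sketch of the serial test, assuming the usual statistic X2 = 4/(n−1)·(n00² + n01² + n10² + n11²) − 2/n·(n0² + n1²) + 1 with overlapping pairs. The sample sequence and the function name are illustrative assumptions.

```python
from collections import Counter
from itertools import product


def serial_test(s):
    """Serial (two-bit) statistic computed over overlapping pairs."""
    n = len(s)
    n1 = sum(s)
    n0 = n - n1
    pairs = Counter(zip(s, s[1:]))           # counts of 00, 01, 10, 11
    pair_squares = sum(pairs[p] ** 2 for p in product((0, 1), repeat=2))
    return 4 / (n - 1) * pair_squares - 2 / n * (n0 ** 2 + n1 ** 2) + 1


# Illustrative sample: a 40-bit pattern repeated four times (n = 160).
PATTERN = "11100 01100 01000 10100 11101 11100 10010 01001".replace(" ", "")
SAMPLE = [int(ch) for ch in PATTERN * 4]

x2 = serial_test(SAMPLE)
print(round(x2, 4))
```

For this sample the pair counts are n00 = 44, n01 = 40, n10 = 40 and n11 = 35; the statistic stays below the α = 0.05 threshold of 5.9915 for 2 degrees of freedom.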

Statistical testing of PN generators Poker test A positive integer m is considered such that $\lfloor n/m \rfloor \ge 5 \cdot 2^m$. The sequence s is divided into $k = \lfloor n/m \rfloor$ non-overlapping parts of size m. n_i is the number of occurrences of the type i of the sequence of length m, 1 ≤ i ≤ 2^m (that is, i is the value of the integer whose binary representation is the sequence of length m). The test determines if every sequence of length m appears approximately the same number of times.

Statistical testing of PN generators Poker test (cont.) The statistic: $X_3 = \frac{2^m}{k}\left(\sum_{i=1}^{2^m} n_i^2\right) - k$ The statistic follows approximately a χ² distribution with 2^m − 1 degrees of freedom.
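
A minimal sketch of the poker test, assuming the usual statistic X3 = (2^m / k)·Σ n_i² − k over k = ⌊n/m⌋ non-overlapping blocks. The sample sequence, the choice m = 3 and the function name are illustrative assumptions.

```python
from collections import Counter


def poker_test(s, m):
    """Poker statistic over the k = len(s) // m non-overlapping m-bit blocks."""
    k = len(s) // m
    counts = Counter(tuple(s[i * m:(i + 1) * m]) for i in range(k))
    return (2 ** m / k) * sum(c * c for c in counts.values()) - k


# Illustrative sample: a 40-bit pattern repeated four times (n = 160).
PATTERN = "11100 01100 01000 10100 11101 11100 10010 01001".replace(" ", "")
SAMPLE = [int(ch) for ch in PATTERN * 4]

x3 = poker_test(SAMPLE, 3)      # k = 53 blocks, and 53 >= 5 * 2**3
print(round(x3, 4))
```

With m = 3 there are 7 degrees of freedom, for which the α = 0.05 threshold is 14.0671, so this sample passes.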

Statistical testing of PN generators Runs test A run of length i – a subsequence of s formed by i consecutive zeros or i consecutive ones that are neither preceded nor followed by the same symbol. A run of zeros – gap A run of ones – block

Statistical testing of PN generators Runs test (cont.) Purpose: determine if the number of runs of different lengths in the sequence s is that expected in a random sequence. The expected number of gaps (or blocks) of length i in a random sequence of length n is $e_i = \frac{n - i + 3}{2^{i+2}}$. It is considered that k is equal to the largest integer i for which e_i ≥ 5. We denote by B_i and H_i the number of blocks and gaps of length i in s, for each i, 1 ≤ i ≤ k.

Statistical testing of PN generators Runs test (cont.) The statistic: $X_4 = \sum_{i=1}^{k} \frac{(B_i - e_i)^2}{e_i} + \sum_{i=1}^{k} \frac{(H_i - e_i)^2}{e_i}$ The statistic follows approximately a χ² distribution with 2k − 2 degrees of freedom.
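
A minimal sketch of the runs test, assuming the usual expected count e_i = (n − i + 3)/2^(i+2) and the statistic X4 = Σ (B_i − e_i)²/e_i + Σ (H_i − e_i)²/e_i for i = 1..k. The sample sequence and the function name are illustrative assumptions.

```python
from collections import Counter
from itertools import groupby


def runs_test(s):
    """Return (X4, k) for the runs test on the bit list s."""
    n = len(s)

    def expected(i):
        return (n - i + 3) / 2 ** (i + 2)

    k = 0
    while expected(k + 1) >= 5:          # largest i with e_i >= 5
        k += 1
    blocks, gaps = Counter(), Counter()  # runs of ones / runs of zeros
    for bit, run in groupby(s):
        length = len(list(run))
        (blocks if bit else gaps)[length] += 1
    stat = sum((blocks[i] - expected(i)) ** 2 / expected(i)
               + (gaps[i] - expected(i)) ** 2 / expected(i)
               for i in range(1, k + 1))
    return stat, k


# Illustrative sample: a 40-bit pattern repeated four times (n = 160).
PATTERN = "11100 01100 01000 10100 11101 11100 10010 01001".replace(" ", "")
SAMPLE = [int(ch) for ch in PATTERN * 4]

x4, k = runs_test(SAMPLE)
print(round(x4, 4), k)
```

Here k = 3, giving 4 degrees of freedom and an α = 0.05 threshold of 9.4877; the sample exceeds it by a wide margin and therefore fails the runs test even though it passed the frequency test.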

Statistical testing of PN generators Autocorrelation test Checks the correlation between s and shifted versions of s. An integer d, 1 ≤ d ≤ ⌊n/2⌋, is considered. The number of bits in s that are not equal to their d-shifts is $A(d) = \sum_{i=0}^{n-d-1} s_i \oplus s_{i+d}$

Statistical testing of PN generators Autocorrelation test (cont.) The statistic: $X_5 = \frac{2\left(A(d) - \frac{n-d}{2}\right)}{\sqrt{n-d}}$ The statistic follows approximately an N(0,1) distribution. The approximation is good enough if n − d ≥ 10.
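
A minimal sketch of the autocorrelation test, assuming the usual definitions A(d) = Σ s_i XOR s_(i+d) for i = 0..n−d−1 and X5 = 2(A(d) − (n−d)/2)/√(n−d). The sample sequence, the shift d = 8 and the function name are illustrative assumptions.

```python
from math import sqrt


def autocorrelation_test(s, d):
    """Return (A(d), X5) for the shift d."""
    n = len(s)
    a_d = sum(s[i] ^ s[i + d] for i in range(n - d))   # mismatches with the d-shift
    x5 = 2 * (a_d - (n - d) / 2) / sqrt(n - d)
    return a_d, x5


# Illustrative sample: a 40-bit pattern repeated four times (n = 160).
PATTERN = "11100 01100 01000 10100 11101 11100 10010 01001".replace(" ", "")
SAMPLE = [int(ch) for ch in PATTERN * 4]

a8, x5 = autocorrelation_test(SAMPLE, 8)
print(a8, round(x5, 4))
```

Since small as well as large values of A(d) are suspicious, a two-sided comparison is the natural choice for this test; at α = 0.05 the N(0,1) threshold is 1.96, which this sample exceeds.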