# Session 2: Secret key cryptography – stream ciphers – part 2

## Session 2: Secret key cryptography – stream ciphers – part 2

### The Berlekamp-Massey algorithm

• Computational complexity of the Berlekamp-Massey algorithm is quadratic in the length of the minimum LFSR capable of generating the intercepted sequence.

• Thus, if the linear complexity is very high, predicting the next bits of the sequence becomes computationally infeasible.

### The Berlekamp-Massey algorithm

• Therefore, to prevent the cryptanalysis of a pseudorandom sequence generator, we must design it so that its linear complexity is too high for the Berlekamp-Massey algorithm to be applied in practice.
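
To make the quadratic cost concrete, here is a minimal Python sketch of the Berlekamp-Massey algorithm over GF(2). It is an illustration rather than the version used in the lecture; the function name `berlekamp_massey` and the coefficient layout of the returned polynomial are assumptions of this sketch. The two nested loops over the sequence and over the current register length are where the quadratic behaviour comes from.

```python
def berlekamp_massey(bits):
    """Return (L, c) for the shortest LFSR that generates `bits` over GF(2).

    c[0..L] are the connection coefficients (c[0] = 1), meaning
    bits[i] = c[1]*bits[i-1] ^ ... ^ c[L]*bits[i-L] for every i >= L.
    """
    n = len(bits)
    c = [1] + [0] * n   # current connection polynomial
    b = [1] + [0] * n   # polynomial saved at the last length change
    L, m = 0, -1        # current length, index of the last length change
    for i in range(n):
        # Discrepancy between the bit predicted by the LFSR and bits[i].
        d = bits[i]
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:
            t = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, t
    return L, c[:L + 1]


# Two periods of a length-7 sequence obeying s[i] = s[i-2] ^ s[i-3].
sequence = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1]
print(berlekamp_massey(sequence))
```

For this sample the sketch recovers a register of length 3, so a few intercepted bits are enough to predict the rest of the sequence. This is exactly the situation the design rules on the following slides try to avoid.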

### Pseudorandom sequence generators

• Based on LFSRs

• The goals:

• Preserve good characteristics of the PN-sequences

• Increase the linear complexity

• The key is the initial state

• Different families of generators

### Combinational generators

[Figure: block diagrams built from LFSRs]

• Non linear filter

• Non linear combiner

### Non linear filters

• In general, it is difficult to calculate the value of the linear complexity of the resulting sequence.

• However, under some special conditions, it is possible to estimate the linear complexity of the resulting sequence.

### Algebraic normal form

• It is the form of a Boolean function that uses only the operations ⊕ (XOR, sum modulo 2) and · (AND, product modulo 2).

• In the ANF, the number of variables in the largest product is called the non linear order of the function.

• Example: The non linear order of the function f(x1,x2,x3) = x1 ⊕ x2 ⊕ x3 ⊕ x1x3 is 2.

### Algebraic normal form

• The ANF of a function can be determined from its truth table.

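• The Möbius transform: $a_u = \bigoplus_{x \preceq u} f(x)$, where $x \preceq u$ means $x_i \le u_i$ for every position $i$.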

### Algebraic normal form

• Example: n=3, u=001. Among the vectors x = 000, 001, …, 111, those covered by u (x_i ≤ u_i in every position) are 000 and 001.

### Algebraic normal form

• Example: n=3, u=010. Among the vectors x = 000, 001, …, 111, those covered by u (x_i ≤ u_i in every position) are 000 and 010.

• Example: n=3, with the truth table (f(000), f(001), …, f(111)) = (0, 1, 0, 1, 0, 1, 1, 0)

### Algebraic normal form

• u=000: a000=f(0,0,0)=0

• u=001: a001=f(0,0,0)+f(0,0,1)=0+1=1

• u=010: a010=f(0,0,0)+f(0,1,0)=0+0=0

### Algebraic normal form

• u=011: a011=f(0,0,0)+f(0,0,1)+f(0,1,0)+f(0,1,1)=0+1+0+1=0

• u=100: a100=f(0,0,0)+f(1,0,0)=0+0=0

• u=101: a101=f(0,0,0)+f(0,0,1)+f(1,0,0)+f(1,0,1)=0+1+0+1=0

### Algebraic normal form

• u=110: a110=f(0,0,0)+f(0,1,0)+f(1,0,0)+f(1,1,0)=0+0+0+1=1

• u=111: a111=f(0,0,0)+f(0,0,1)+f(0,1,0)+f(0,1,1)+f(1,0,0)+f(1,0,1)+f(1,1,0)+f(1,1,1)=0

### Algebraic normal form

• f(x0,x1,x2)=a001x2+a110x0x1=x2+x0x1
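
The coefficient-by-coefficient computation above can be sketched in Python with the in-place (butterfly) form of the binary Möbius transform. The function names `anf_coefficients` and `nonlinear_order` are assumptions of this sketch. The truth table used below is the one implied by the worked example, with index x read as the binary number x0x1x2 and x0 as the most significant bit.

```python
def anf_coefficients(truth_table):
    """Binary Möbius transform: a[u] is the XOR of f(x) over all x covered by u."""
    a = list(truth_table)
    n = len(a).bit_length() - 1          # number of variables
    for i in range(n):
        step = 1 << i
        for x in range(len(a)):
            if x & step:                 # fold in the half where bit i is 0
                a[x] ^= a[x ^ step]
    return a


def nonlinear_order(coefficients):
    """Largest number of variables in a product with a non-zero coefficient."""
    return max((bin(u).count("1") for u, a in enumerate(coefficients) if a), default=0)


# f(000), f(001), ..., f(111) from the example above.
f_table = [0, 1, 0, 1, 0, 1, 1, 0]
coeffs = anf_coefficients(f_table)
for u, a in enumerate(coeffs):
    print(f"a{u:03b} = {a}")
```

Only a001 and a110 come out non-zero, which matches f = x2 + x0x1 and gives a non linear order of 2.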

### Non linear filters

• Theorem (Rueppel, 1984):

• For an LFSR of length n and a filter function whose ANF has a unique term of maximum order k, that term being a product of equidistant phases, the linear complexity of the resulting sequence is bounded from below: $LC \ge \binom{n}{k}$

### Non linear filters

• Design principles:

• The feedback polynomial: primitive

• The filter function must have several terms of each order.

• k ≈ n/2

• Include a linear term in order to obtain good statistical properties of the resulting sequence (balanced filter function).

### Non linear combiners

• In these generators, the keystream sequence is obtained by combining the output sequences of various LFSRs in a non linear manner.

• Example – it is possible to use a Boolean function (without memory).

### Non linear combiners

• Two cryptographic principles by Shannon:

• Confusion – we must use complicated transformations – as many bits of the key as possible should be involved in obtaining a single bit of the keystream sequence (and the ciphertext).

• Diffusion – Every bit of the key must affect many bits of the keystream sequence (and the ciphertext).

### Non linear combiners

• Possible flaws of non linear combiners (to be considered during the design):

• Bad statistical properties – e.g. too many zeros/ones in the output sequence.

• Correlation – The output sequence coincides too much with one or more internal sequences – this enables correlation attacks.

### Non linear combiners

• Correlation attacks:

• It is possible to divide the task of the cryptanalyst into several less difficult tasks – “Divide and conquer”.

• In order to prevent the correlation attacks, the non linear function of the combiner must have, at the same time:

• as high non linear order as possible

• as high correlation immunity as possible.

• These two requirements are opposite – we must find a trade off between these two values.

### Non linear combiners

• Correlation immunity:

• A Boolean function is correlation immune of order m if its output sequence is not correlated with any set of m or fewer input sequences.

• But, the higher the correlation immunity, the lower the non linear order k.

• The trade off (N is the number of variables)

$m + k \le N$; $1 \le k \le N$, $0 \le m \le N-1$

### Non linear combiners

• A Boolean function is balanced if it has an equal number of 0s and 1s in its truth table.

• Balanced correlation immune functions of order m are called m-resilient functions.

### Non linear combiners

• Example:

• The sum modulo 2 of N variables has the maximum possible value of correlation immunity, N-1, but its non linear order is 1.

• If the combination function contains memory, then the trade off between the correlation immunity and non linearity is not needed – it is possible to maximize both values – a single bit of memory is enough (Rueppel, 1984).
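
The XOR example can be checked numerically. The sketch below measures correlation immunity through the Walsh transform: a function is correlation immune of order m exactly when its Walsh transform vanishes on every non-zero vector of Hamming weight at most m (the Xiao-Massey characterisation, which is not stated on the slides and is brought in here as an assumption). The function names are illustrative.

```python
from itertools import product


def walsh(f, n, w):
    """Walsh transform of f at w: sum over x of (-1)^(f(x) XOR w.x)."""
    total = 0
    for x in product((0, 1), repeat=n):
        dot = sum(a & b for a, b in zip(w, x)) & 1
        total += -1 if f(x) ^ dot else 1
    return total


def correlation_immunity(f, n):
    """Largest m such that the Walsh transform is zero for all weights 1..m."""
    order = 0
    for m in range(1, n + 1):
        weight_m = (w for w in product((0, 1), repeat=n) if sum(w) == m)
        if any(walsh(f, n, w) != 0 for w in weight_m):
            break
        order = m
    return order


def xor3(x):
    return x[0] ^ x[1] ^ x[2]


print(correlation_immunity(xor3, 3))
```

For the sum modulo 2 of N=3 variables the sketch reports order 2, that is N-1, while the non linear order of the function is 1, so m+k=N and the trade off is met with equality.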

### Non linear combiners

• If F is a Boolean function of N periodic input sequences a1(t), a2(t), ..., aN(t), then the output sequence b(t) = F(a1(t), a2(t), ..., aN(t)) is a linear combination of various products of sequences.

• These products are determined by determining the ANF of the function F.

### Non linear combiners

• If, in the ANF of the function F, the sum and product modulo 2 are replaced by the sum and product of integers, the resulting function is called F*, and the following holds for the linear complexity and the period of the output sequence b of F: $LC(b) \le F^*(LC(a_1), \ldots, LC(a_N))$ and $Per(b) \le \mathrm{lcm}(Per(a_1), \ldots, Per(a_N))$

### Non linear combiners

• Example:

• If the characteristic polynomials of the input sequences are:

All these polynomials are primitive!

### Non linear combiners

• Example (cont.):

• Then

### Non linear combiners

• The sum of N sequences in GF(q): $LC\left(\sum_{i=1}^{N} a_i\right) \le \sum_{i=1}^{N} LC(a_i)$

• The equality holds if the characteristic polynomials of the input sequences are mutually prime.

### Non linear combiners

• The sum of N sequences in GF(q):

• Obviously, if the periods of the input sequences are mutually prime then

### Non linear combiners

• Example:

Primitive!

The periods are Mersenne primes

### Non linear combiners

• Product of N sequences in GF(q):

• Theorem (Golić, 1989)

If Per(ai) are mutually prime, then

• Theorem (Lidl, Niederreiter)

Per(ai) are mutually prime

### Non linear combiners

• Example:

Primitive!

The periods are Mersenne primes

### Non linear combiners

• The general case:

• Let be the Boolean function obtained by removing all the products from the function F except those of the maximum order. Let be the corresponding integer function.

### Non linear combiners

• Theorem (Golić, 1989)

• F depends on all the N input variables.

• Per(ai) are mutually prime.

• Then

### Non linear combiners

• Example:

• If the characteristic polynomials of the input sequences are:

Primitive, periods Mersenne primes

### Non linear combiners

• Example (cont.)

Geffe’s generator

F balanced – good statistical properties

### Geffe’s generator

• The equivalent scheme

### Geffe’s generator

• Example: polynomials – primitive, with periods that are Mersenne primes.

### Geffe’s generator

• Problem: Correlation!
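
The correlation problem can be seen by enumerating the truth table. The combining function did not survive on the slide, so the sketch below assumes the commonly cited form of Geffe's function, F(x1,x2,x3) = x1x2 ⊕ x2x3 ⊕ x3, in which x2 selects between x1 and x3. The names `geffe` and `agreement` are illustrative.

```python
from itertools import product


def geffe(x1, x2, x3):
    # Assumed form of the combiner: output x1 when x2 = 1, otherwise x3.
    return (x1 & x2) ^ (x2 & x3) ^ x3


table = [(x, geffe(*x)) for x in product((0, 1), repeat=3)]


def agreement(index):
    """Fraction of inputs for which the output equals input number `index`."""
    return sum(out == x[index] for x, out in table) / len(table)


print("ones in truth table:", sum(out for _, out in table))
for i in range(3):
    print(f"P(F = x{i + 1}) = {agreement(i)}")
```

Under this assumption the function is balanced, yet the output agrees with x1 and with x3 three times out of four. An attacker can therefore test candidate initial states of those two LFSRs one register at a time, which is the divide and conquer attack described earlier.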

### Correlation immune functions

• Is there a way to find a Boolean memoryless combiner that guarantees a high level of correlation immunity?

• This is a difficult problem and there is no final answer.

• However, some Boolean combiners are known to have a high level of correlation immunity.

### Correlation immune functions

• One of the classes of such “good” functions – Latin squares.

• A Latin square is an n×n array of integers in which each element appears exactly once in each row and in each column.

### Correlation immune functions

• Basic property of Latin squares:

• If we exchange two rows/columns of a Latin square, the resulting array is also a Latin square.

• This gives rise to a construction (one of the possible algorithms):

• We start from the table of addition of the additive group with n elements.

• We exchange some rows and columns of the table several times.
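
A minimal sketch of this construction, assuming the additive group is the integers modulo n and that the rows and columns to exchange are picked at random (the slides do not say how they are chosen). All function names are illustrative.

```python
import random


def addition_table(n):
    """Addition table of the integers modulo n, which is a Latin square."""
    return [[(r + c) % n for c in range(n)] for r in range(n)]


def exchange_rows_and_columns(square, swaps, rng):
    """Apply `swaps` random row exchanges and `swaps` random column exchanges."""
    sq = [row[:] for row in square]
    n = len(sq)
    for _ in range(swaps):
        i, j = rng.randrange(n), rng.randrange(n)
        sq[i], sq[j] = sq[j], sq[i]              # exchange two rows
        i, j = rng.randrange(n), rng.randrange(n)
        for row in sq:
            row[i], row[j] = row[j], row[i]      # exchange two columns
    return sq


def is_latin(square):
    """Check that every row and every column contains each symbol exactly once."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


square = exchange_rows_and_columns(addition_table(4), swaps=10, rng=random.Random(1))
for row in square:
    print(row)
print(is_latin(square))
```

Whatever exchanges are drawn, the check at the end succeeds, because exchanging rows or columns never breaks the Latin property.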

### Correlation immune functions

• Example – a Latin square of order 4:

### Correlation immune functions

• A Latin square of dimension n as a family of $\log_2 n$ Boolean functions (a vectorial Boolean function with $\log_2 n$ outputs):

• There are 2 address branches, $\log_2 n$ bits each.

• The output has $\log_2 n$ bits.

• Example (see previous slide):

• The address is 0110 (the two most significant bits address the row).

• The output is 10.

### Correlation immune functions

• Basic correlation-related property of Latin squares:

• Each bit of output is correlated with a linear combination of inputs that are located in both address branches.

• Consequence: there is no way of analyzing the address branches individually – no divide and conquer.

### Decimation of sequences

• The principal characteristic:

• The output sequence of a subgenerator controls the clock sequence of one or more subgenerators.

### Decimation of sequences

• Example 1:

• X=1,1,0,1,0,1,0,1

• Y=0,1,0,0,1

• Z=1,0,1,0,0

• Example 2:

• X and Y are generated by LFSRs and the BRM is applied

### Decimation of sequences

• Theorem (Chambers, Jennings, 1984)

• R1, R2 – primitive polynomials, degrees m and n, respectively

• Periods $M = 2^m - 1$ and $N = 2^n - 1$

• All the prime factors of M divide N

• Then: $LC = nM$ and $Per = MN$

### Decimation of sequences

• The requirements of the Theorem are satisfied if the lengths of both LFSRs are equal and the feedback polynomials are primitive.

### Decimation of sequences

• Example: n=m=107, primitive polynomials

$LC = nM = 107(2^{107} - 1)$

$Per = NM = (2^{107} - 1)(2^{107} - 1)$

[Figure: block diagram with LFSR 1, LFSR 2, the clock and the decimation rule P]

### The shrinking generator (1993)

• A very simple binary sequence generator (Crypto’93)

• It consists of two LFSRs: LFSR1 and LFSR2

• Based on P, LFSR1 (the control register) decimates the sequence generated by LFSR2.

### The shrinking generator

• If ai=0, bi is discarded, otherwise bi is sent to the output.

• Thus the number of discarded bits from the sequence b depends on the lengths of runs of 0s in the sequence a.

### The shrinking generator (an example)

LFSRs:

• LFSR1: L1=3, $f_1(x) = 1 + x^2 + x^3$, IS1=(1,0,0)

• LFSR2: L2=4, $f_2(x) = 1 + x + x^4$, IS2=(1,0,0,0)

Decimation rule P:

• {ai}= 0 1 1 1 0 0 1 0 1 1 1 0 0 1 …

• {bi}= 1 1 1 0 1 0 1 1 0 0 1 0 0 0 …

• {cj}= 1 1 0 1 0 0 1 0 …

The bits of {bi} at the positions where ai=0 are eliminated (they were underlined on the slide).

Characteristics of the output sequence

• Period:

• Linear complexity:

• Number of 1’s:

balanced sequence

### Example – BRM vs. Shrinking

• BRM:

• X=000100110101111…

• Y=001110100111010…

• Z=0010100111…

• Shrinking:

• X=000100110101111…

• Y=001110100111010…

• Z=01011011
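
Both decimation rules can be replayed on the sequences given in these examples. The BRM rule is not spelled out on the slides, so the sketch assumes the reading that reproduces the listed outputs: at each step the register producing X advances 1 + y_t positions, where Y is the control sequence. For shrinking, the data bit is kept only where the control bit is 1. Function names are illustrative.

```python
def bits(text):
    """Turn a string such as '0101' into a list of integers."""
    return [int(ch) for ch in text]


def brm(data, control):
    """Binary rate multiplier: step the data sequence 1 + control[t] positions per output."""
    out, pos = [], -1
    for c in control:
        pos += 1 + c
        if pos >= len(data):
            break                      # ran out of data bits
        out.append(data[pos])
    return out


def shrink(data, control):
    """Shrinking: keep data[t] only when control[t] is 1."""
    return [d for d, c in zip(data, control) if c]


X = bits("000100110101111")
Y = bits("001110100111010")
print("BRM:      ", "".join(map(str, brm(X, Y))))
print("Shrinking:", "".join(map(str, shrink(X, Y))))
```

With Y acting as the control sequence in both cases, the sketch reproduces Z=0010100111 for the BRM and Z=01011011 for shrinking. It also reproduces Example 1 of the decimation slides and the {cj} sequence of the shrinking generator example, where {ai} is the control sequence.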

### Statistical testing of PN generators

• The output sequence of a generator of pseudorandom sequences looks random, but it is not.

• Pseudorandom generators expand a truly random sequence (the key) to a much longer sequence, such that an adversary cannot distinguish between the pseudorandom sequence and a truly random sequence.

### Statistical testing of PN generators

• To gain confidence in the security of this type of generator, various statistical tests designed specifically for this purpose are applied.

• The fact that a generator passes a set of statistical tests should be considered a necessary condition, although not a sufficient one, for the security of the generator.

### Statistical testing of PN generators

• If the result X of an experiment can take any real value, then X is a continuous random variable.

• The probability density function f(x) of a continuous random variable X can be integrated and the following holds:

• $f(x) \ge 0$ for all $x \in \mathbb{R}$

• For all $a, b \in \mathbb{R}$ the following holds: $P(a < X \le b) = \int_a^b f(x)\,dx$

### Statistical testing of PN generators

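• A continuous random variable X has a normal distribution with the mean $\mu$ and the variance $\sigma^2$ if its probability density function is: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$, $-\infty < x < \infty$

• We say that X is $N(\mu, \sigma^2)$.

• If X is $N(0,1)$, then we say that X has a standard normal distribution.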

### Statistical testing of PN generators

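• If the random variable X is $N(\mu, \sigma^2)$, then the variable $Z = (X - \mu)/\sigma$ is $N(0,1)$.

• Euler's gamma function: $\Gamma(t) = \int_0^{\infty} x^{t-1} e^{-x}\,dx$, $t > 0$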

### Statistical testing of PN generators

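• A continuous random variable X has a $\chi^2$ distribution with $\nu$ degrees of freedom if its probability density function is $f(x) = \frac{1}{\Gamma(\nu/2)\,2^{\nu/2}}\, x^{\nu/2 - 1} e^{-x/2}$ for $x \ge 0$, and $f(x) = 0$ for $x < 0$.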

### Statistical testing of PN generators

• A statistical hypothesis H is an assertion about the distribution of one or more random variables.

• A hypothesis test is a procedure based on the observed values of the random variable that leads to the acceptance or rejection of the hypothesis H.

### Statistical testing of PN generators

• The test only provides a measure of the strength of evidence given by the data against the hypothesis.

• The conclusion is probabilistic.

• The level of significance of the test of the hypothesis H is the probability of rejecting the hypothesis H when it is true.

### Statistical testing of PN generators

• The hypothesis to be tested is called the null hypothesis, H0.

• The alternative hypothesis is denoted by H1 or Ha.

• In cryptography:

• H0 – the given generator is a random sequence generator.

### Statistical testing of PN generators

• If the significance level $\alpha$ is too small, the test could accept non random sequences.

• If $\alpha$ is too high, the test could reject random sequences.

• In cryptography:

• $\alpha$ is between 0.001 and 0.05.

### Statistical testing of PN generators

• A test:

• Determines a statistic for the sample of the output sequence.

• This statistic is compared with the expected value of a random sequence.

### Statistical testing of PN generators

• How is the comparison carried out?

• The computed statistic $X_0$ follows a $\chi^2$ distribution with $\nu$ degrees of freedom.

• It is assumed that this statistic takes large values for non random sequences.

• To achieve the significance level $\alpha$, a threshold $X_\alpha$ is chosen (by means of the corresponding table), such that $P(X_0 > X_\alpha) = \alpha$.

### Statistical testing of PN generators

• How is the comparison carried out? (cont.)

• If the value of the statistic for the sample of the output sequence, $X_s$, satisfies $X_s > X_\alpha$, then the sequence fails the test.

• Basic tests for cryptographic use:

• Frequency test, serial test, poker test, runs test, autocorrelation test, etc.

### Statistical testing of PN generators

• Frequency test

• Purpose: determine if the number of zeros and ones in a sequence s is approximately the same.

• n0 – number of zeros, n1 – number of ones.

• The statistic: $X_1 = \frac{(n_0 - n_1)^2}{n}$

### Statistical testing of PN generators

• Frequency test (cont.)

• The statistic follows a $\chi^2$ distribution with 1 degree of freedom.

• The approximation is good enough if $n \ge 10$.
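
A minimal sketch of the frequency test, assuming the usual statistic $(n_0 - n_1)^2/n$. The tail probability uses the identity that a $\chi^2$ variable with 1 degree of freedom is the square of a standard normal variable, so $P(X > x) = \mathrm{erfc}(\sqrt{x/2})$. The function names and the sample sequence are illustrative.

```python
import math


def frequency_statistic(s):
    """(n0 - n1)^2 / n for a list of bits s."""
    n = len(s)
    n1 = sum(s)
    n0 = n - n1
    return (n0 - n1) ** 2 / n


def chi2_tail_1dof(x):
    """P(X > x) for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))


sample = [1, 1, 0, 1, 0, 1, 0, 1, 1, 1]     # 7 ones, 3 zeros
x = frequency_statistic(sample)
alpha = 0.05
print(x, chi2_tail_1dof(x), "fails" if chi2_tail_1dof(x) < alpha else "passes")
```

Comparing the tail probability with $\alpha$ is equivalent to comparing the statistic with the tabulated threshold, which is about 3.84 for $\alpha = 0.05$ and 1 degree of freedom.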

### Statistical testing of PN generators

• Serial test

• Tries to determine if the number of occurrences of 00, 01, 10 and 11, as subsequences of s is approximately the same.

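• The statistic: $X_2 = \frac{4}{n-1}\left(n_{00}^2 + n_{01}^2 + n_{10}^2 + n_{11}^2\right) - \frac{2}{n}\left(n_0^2 + n_1^2\right) + 1$, where $n_{00}, n_{01}, n_{10}, n_{11}$ are the numbers of (overlapping) occurrences of 00, 01, 10 and 11 in s.

• The statistic follows a $\chi^2$ distribution with 2 degrees of freedom.

• The approximation is good enough if $n \ge 21$.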
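
A minimal sketch of the serial test, assuming the usual statistic $\frac{4}{n-1}(n_{00}^2+n_{01}^2+n_{10}^2+n_{11}^2) - \frac{2}{n}(n_0^2+n_1^2) + 1$ with overlapping pairs. The function name is illustrative, and the 6-bit sample is far shorter than the length of 21 the approximation requires, so it only checks the arithmetic.

```python
def serial_statistic(s):
    """Serial test statistic for a list of bits s, counting overlapping pairs."""
    n = len(s)
    n1 = sum(s)
    n0 = n - n1
    pairs = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for a, b in zip(s, s[1:]):          # n - 1 overlapping pairs
        pairs[(a, b)] += 1
    squares = sum(v * v for v in pairs.values())
    return 4 / (n - 1) * squares - 2 / n * (n0 * n0 + n1 * n1) + 1


# Pairs in 001101: 00, 01, 11, 10, 01, so n00=1, n01=2, n10=1, n11=1.
print(serial_statistic([0, 0, 1, 1, 0, 1]))
```

For this toy input the value is 4/5·7 − 2/6·18 + 1 = 0.6, up to floating-point rounding.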

### Statistical testing of PN generators

• Poker test

• A positive integer m is considered such that $\lfloor n/m \rfloor \ge 5 \cdot 2^m$, and we set $k = \lfloor n/m \rfloor$.

• The sequence s is divided into k non-overlapping parts of size m.

• $n_i$ is the number of occurrences of the i-th type of sequence of length m, $1 \le i \le 2^m$ (that is, i indexes the integer whose binary representation is the sequence of length m).

• The test determines if every sequence of length m appears approximately the same number of times.

### Statistical testing of PN generators

• Poker test (cont.)

• The statistic: $X_3 = \frac{2^m}{k}\left(\sum_{i=1}^{2^m} n_i^2\right) - k$

• The statistic follows approximately a $\chi^2$ distribution with $2^m - 1$ degrees of freedom.
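
A minimal sketch of the poker test, assuming the usual statistic $\frac{2^m}{k}\sum_i n_i^2 - k$ over $k = \lfloor n/m \rfloor$ non-overlapping blocks. The function name is illustrative, and the 8-bit samples are much too short to satisfy $\lfloor n/m \rfloor \ge 5 \cdot 2^m$, so they only check the arithmetic.

```python
def poker_statistic(s, m):
    """Poker test statistic for a list of bits s and block size m."""
    k = len(s) // m
    counts = [0] * (1 << m)
    for j in range(k):
        block = s[j * m:(j + 1) * m]
        value = int("".join(map(str, block)), 2)   # block read as a binary number
        counts[value] += 1
    return (1 << m) / k * sum(c * c for c in counts) - k


print(poker_statistic([0, 0, 0, 1, 1, 0, 1, 1], 2))   # every block type once
print(poker_statistic([0, 0, 0, 0, 1, 1, 1, 1], 2))   # only 00 and 11 occur
```

The first sample contains each 2-bit block exactly once and gives 0, while the second gives 4, which illustrates that the statistic grows as the block counts become uneven.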

### Statistical testing of PN generators

• Runs test

• A run of length i – a subsequence of s formed by i consecutive zeros or i consecutive ones that are neither preceded nor followed by the same symbol.

• A run of zeros – gap

• A run of ones – block

### Statistical testing of PN generators

• Runs test (cont.)

• Purpose: determine if the number of runs of different lengths in the sequence s is that expected in a random sequence.

• The expected number of gaps (or blocks) of length i in a random sequence of length n is $e_i = \frac{n - i + 3}{2^{i+2}}$

• It is considered that k is equal to the largest integer i for which $e_i \ge 5$.

• We denote by $B_i$ and $H_i$ the number of blocks and gaps of length i in s, for each i, $1 \le i \le k$.

### Statistical testing of PN generators

• Runs test (cont.)

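• The statistic: $X_4 = \sum_{i=1}^{k} \frac{(B_i - e_i)^2}{e_i} + \sum_{i=1}^{k} \frac{(H_i - e_i)^2}{e_i}$

• The statistic follows approximately a $\chi^2$ distribution with $2k - 2$ degrees of freedom.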
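
A minimal sketch of the runs test, assuming the usual expected count $e_i = (n-i+3)/2^{i+2}$ and the statistic $\sum_i (B_i - e_i)^2/e_i + \sum_i (H_i - e_i)^2/e_i$. The function name is illustrative, and `itertools.groupby` is used to split the sequence into maximal runs.

```python
from itertools import groupby


def runs_statistic(s):
    """Return (k, statistic) of the runs test for a list of bits s."""
    n = len(s)

    def expected(i):
        return (n - i + 3) / 2 ** (i + 2)

    k = 0
    while expected(k + 1) >= 5:          # largest i with e_i >= 5
        k += 1
    blocks = [0] * (k + 1)               # blocks[i]: runs of ones of length i
    gaps = [0] * (k + 1)                 # gaps[i]: runs of zeros of length i
    for bit, group in groupby(s):
        length = len(list(group))
        if length <= k:
            (blocks if bit else gaps)[length] += 1
    stat = sum(((blocks[i] - expected(i)) ** 2 + (gaps[i] - expected(i)) ** 2) / expected(i)
               for i in range(1, k + 1))
    return k, stat


print(runs_statistic([0, 1] * 20))
```

The alternating sample of length 40 has k=1, with 20 blocks and 20 gaps of length 1 against an expected 5.25 of each, so the statistic is very large, as it should be for such a regular sequence.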

### Statistical testing of PN generators

• Autocorrelation test

• Checks the correlation between s and shifted versions of s.

• An integer d, $1 \le d \le \lfloor n/2 \rfloor$, is considered.

• The number of bits in s that are not equal to their d-shifts is $A(d) = \sum_{i=0}^{n-d-1} s_i \oplus s_{i+d}$

### Statistical testing of PN generators

• Autocorrelation test (cont.)

• The statistic: $X_5 = \frac{2\left(A(d) - \frac{n-d}{2}\right)}{\sqrt{n-d}}$

• The statistic follows approximately a N(0,1) distribution.

• The approximation is good enough if $n - d \ge 10$.
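
A minimal sketch of the autocorrelation test, assuming $A(d) = \sum_{i=0}^{n-d-1} s_i \oplus s_{i+d}$ and the usual statistic $2(A(d) - (n-d)/2)/\sqrt{n-d}$. The function name and the sample sequence are illustrative.

```python
import math


def autocorrelation_statistic(s, d):
    """Autocorrelation test statistic of a list of bits s for shift d."""
    n = len(s)
    a = sum(s[i] ^ s[i + d] for i in range(n - d))   # positions that differ
    return 2 * (a - (n - d) / 2) / math.sqrt(n - d)


alternating = [1, 0] * 6
print(autocorrelation_statistic(alternating, 1))   # every bit differs from its shift
print(autocorrelation_statistic(alternating, 2))   # every bit equals its shift
```

Because small and large values of A(d) are both unexpected in a random sequence, this statistic is normally used in a two-sided test. The alternating sample lands far in the positive tail for d=1 and far in the negative tail for d=2.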