Session 2: Secret key cryptography – stream ciphers – part 2

- Computational complexity of the Berlekamp-Massey algorithm is quadratic in the length of the minimum LFSR capable of generating the intercepted sequence.
- Thus, if the linear complexity is very high, then the task of predicting the next bits of the sequence is too complex.

- Then, in order to prevent the cryptanalysis of a pseudorandom sequence generator, we must design it in such a way that its linear complexity is too high for the application of the Berlekamp-Massey algorithm.
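
To make the role of linear complexity concrete, here is a minimal sketch of the Berlekamp-Massey algorithm over GF(2). The function name `berlekamp_massey` and the list-of-bits representation are choices made for this illustration only; the input bits are the sequence {ai} from the shrinking generator example later in this session.

```python
def berlekamp_massey(bits):
    """Return (L, c) for a binary sequence.

    L is the linear complexity and c = [1, c1, ..., cL] is the connection
    polynomial 1 + c1*x + ... + cL*x^L of a shortest LFSR, meaning
    bits[t] = c1*bits[t-1] ^ ... ^ cL*bits[t-L] for every t >= L.
    """
    n = len(bits)
    c = [1] + [0] * n   # current connection polynomial
    b = [1] + [0] * n   # polynomial saved at the last length change
    L, m = 0, -1        # current length, index of the last length change
    for i in range(n):
        # Discrepancy between the real bit and the LFSR prediction.
        d = bits[i]
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:
            t = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, t
    return L, c[:L + 1]


a = [0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1]
print(berlekamp_massey(a))  # (3, [1, 0, 1, 1]), i.e. 1 + x^2 + x^3
```

The two nested loops over the sequence are what make the cost quadratic; an attacker needs about 2L intercepted bits to recover an LFSR of length L.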

- Based on LFSRs
- The goals:
- Preserve good characteristics of the PN-sequences
- Increase the linear complexity

- The key is the initial state
- Different families of generators

- Non linear filter

- Non linear combiner

- In general, it is difficult to calculate the value of the linear complexity of the resulting sequence.
- However, under some special conditions, it is possible to estimate the linear complexity of the resulting sequence.

- The algebraic normal form (ANF) is the form of a Boolean function that uses only the operations $\oplus$ (sum modulo 2) and $\cdot$ (product modulo 2).
- In the ANF, the number of variables in the largest product term is called the non linear order of the function.
- Example: The non linear order of the function
$f(x_1,x_2,x_3) = x_1 \oplus x_2 \oplus x_3 \oplus x_1 x_3$ is 2.

- The ANF of a function can be determined from its truth table.

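The Möbius transform

- The ANF coefficients are $a_u = \bigoplus_{x \preceq u} f(x)$, where $x \preceq u$ means $x_i \le u_i$ in every bit position $i$.

- Example: n=3, u=001. Among the inputs x = 000, 001, 010, 011, 100, 101, 110, 111, only x = 000 and x = 001 satisfy $x \preceq u$.

- Example: n=3, u=010. Among the same eight inputs, only x = 000 and x = 010 satisfy $x \preceq u$.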

- Example: n=3, with f(0,0,0)=0, f(0,0,1)=1, f(0,1,0)=0, f(0,1,1)=1, f(1,0,0)=0, f(1,0,1)=1, f(1,1,0)=1, f(1,1,1)=0. Each coefficient a_u is the sum modulo 2 of f(x) over the inputs x = 000, ..., 111 whose 1s are contained in the 1s of u:
- a000 = f(0,0,0) = 0
- a001 = f(0,0,0) + f(0,0,1) = 0+1 = 1
- a010 = f(0,0,0) + f(0,1,0) = 0+0 = 0
- a011 = f(0,0,0) + f(0,0,1) + f(0,1,0) + f(0,1,1) = 0+1+0+1 = 0
- a100 = f(0,0,0) + f(1,0,0) = 0+0 = 0
- a101 = f(0,0,0) + f(0,0,1) + f(1,0,0) + f(1,0,1) = 0+1+0+1 = 0
- a110 = f(0,0,0) + f(0,1,0) + f(1,0,0) + f(1,1,0) = 0+0+0+1 = 1
- a111 = f(0,0,0) + f(0,0,1) + f(0,1,0) + f(0,1,1) + f(1,0,0) + f(1,0,1) + f(1,1,0) + f(1,1,1) = 0

- f(x0,x1,x2) = a001 x2 + a110 x0x1 = x2 + x0x1
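
The coefficient computations above can be automated. The following is a small sketch of the binary Möbius transform in its in-place "butterfly" form; the function names and the convention that the truth-table index is the bit string x0x1x2 read as an integer are assumptions of this illustration.

```python
def anf_coefficients(truth_table):
    """Binary Moebius transform of a truth table.

    truth_table[x] holds f(x), where the index x is the bit string
    x0 x1 ... x(n-1) read as an integer (x0 is the most significant bit).
    The result a satisfies a[u] = XOR of f(x) over all x contained in u.
    """
    a = list(truth_table)
    n = len(a).bit_length() - 1
    assert len(a) == 1 << n, "table length must be a power of two"
    for i in range(n):
        step = 1 << i
        for u in range(len(a)):
            if u & step:
                a[u] ^= a[u ^ step]
    return a


def nonlinear_order(coeffs):
    """Largest number of variables in a monomial with coefficient 1."""
    return max((bin(u).count("1") for u, a_u in enumerate(coeffs) if a_u),
               default=0)


# Truth table of the worked example: f(000), f(001), ..., f(111).
f = [0, 1, 0, 1, 0, 1, 1, 0]
coeffs = anf_coefficients(f)
print(coeffs)                   # [0, 1, 0, 0, 0, 0, 1, 0] -> a001 = a110 = 1
print(nonlinear_order(coeffs))  # 2
```

The butterfly form needs n·2^n XOR operations instead of summing over all pairs (x, u), which matters once n grows beyond a handful of variables.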

- Theorem (Rueppel, 1984):
- Given an LFSR of length n and a filter function whose ANF has a unique term of maximum order k, that term being a product of equidistant phases, the linear complexity of the resulting sequence is bounded below by $LC \ge \binom{n}{k}$.

- Design principles:
- The feedback polynomial: primitive
- The filter function must have various terms of each order.
- $k \approx n/2$
- Include a linear term in order to obtain good statistical properties of the resulting sequence (balanced filter function).

- In these generators, the keystream sequence is obtained by combining the output sequences of various LFSRs in a non linear manner.
- Example – it is possible to use a Boolean function (without memory).

- Two cryptographic principles by Shannon:
- Confusion – we must use complicated transformations – as many bits of the key as possible should be involved in obtaining a single bit of the keystream sequence (and the ciphertext).
- Diffusion – Every bit of the key must affect many bits of the keystream sequence (and the ciphertext).

- Possible flaws of non linear combiners (to be considered during the design):
- Bad statistical properties – e.g. too many zeros/ones in the output sequence.
- Correlation – The output sequence coincides too much with one or more internal sequences – this enables correlation attacks.

- Correlation attacks:
- It is possible to divide the task of the cryptanalyst into several less difficult tasks – “Divide and conquer”.
- In order to prevent the correlation attacks, the non linear function of the combiner must have, at the same time:
- as high non linear order as possible
- as high correlation immunity as possible.

- These two requirements are opposite – we must find a trade off between these two values.

- Correlation immunity:
- A Boolean function is correlation immune of order m if its output sequence is not correlated with any set of m or fewer input sequences.
- But, the higher the correlation immunity, the lower the non linear order k.
- The trade off (N is the number of variables):
$m + k \le N$; $1 \le k \le N$, $0 \le m \le N-1$

- A Boolean function is balanced if it has an equal number of 0s and 1s in its truth table.
- Balanced correlation immune functions of order m are called m-resilient functions.

- Example:
- The sum modulo 2 of N variables has the maximum possible value of correlation immunity, N-1, but its non linear order is 1.
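
The trade off can be checked by brute force for small functions. The sketch below does not test the definition directly; it uses the Walsh-transform characterization (Xiao and Massey), under which a function is correlation immune of order m exactly when all its Walsh coefficients of weight 1 to m vanish. The function names and the second test function `and_xor` are hypothetical choices for this illustration.

```python
from itertools import product


def walsh(f, n, w):
    """Walsh coefficient: sum over all x of (-1)^(f(x) XOR w.x)."""
    total = 0
    for x in product((0, 1), repeat=n):
        dot = sum(wi & xi for wi, xi in zip(w, x)) & 1
        total += -1 if f(*x) ^ dot else 1
    return total


def correlation_immunity(f, n):
    """Largest m such that every Walsh coefficient with 1 <= wt(w) <= m is 0."""
    m = 0
    for weight in range(1, n + 1):
        if any(walsh(f, n, w)
               for w in product((0, 1), repeat=n) if sum(w) == weight):
            break
        m = weight
    return m


def xor3(x1, x2, x3):
    return x1 ^ x2 ^ x3             # non linear order 1


def and_xor(x1, x2, x3):
    return (x1 & x2) ^ x3           # non linear order 2


print(correlation_immunity(xor3, 3))     # 2  (m + k = 2 + 1 = N)
print(correlation_immunity(and_xor, 3))  # 0  (m + k = 0 + 2 <= N)
```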

- If the combination function contains memory, then the trade off between the correlation immunity and non linearity is not needed – it is possible to maximize both values – a single bit of memory is enough (Rueppel, 1984).

- If F is a Boolean function of N periodic input sequences a1(t), a2(t), ..., aN(t), then the output sequence b(t) = F(a1(t), a2(t), ..., aN(t)) is a linear combination of various products of sequences.
- These products are determined by determining the ANF of the function F.

- If in the ANF of the function F the sum and product modulo 2 are replaced by the sum and product of integers, the resulting function is denoted F*, and the linear complexity and the period of the output sequence b(t) of F satisfy: $LC(b) \le F^*(LC(a_1), \ldots, LC(a_N))$ and $Per(b) \le \mathrm{lcm}(Per(a_1), \ldots, Per(a_N))$.
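
As a sketch of how the integer function F* is evaluated, the snippet below represents an ANF as a list of monomials and replaces XOR and AND by integer addition and multiplication. The combiner F and the LFSR lengths 3, 4 and 5 are hypothetical values chosen only for the illustration, not taken from the slides.

```python
from math import prod


def integer_anf_value(monomials, values):
    """Evaluate F* at integer arguments.

    Each monomial is a tuple of 1-based variable indices; the empty tuple
    stands for the constant term 1. Sums and products are over the integers.
    """
    return sum(prod(values[i - 1] for i in mono) for mono in monomials)


# Hypothetical combiner F = x1 x2 XOR x2 x3 XOR x3.
F = [(1, 2), (2, 3), (3,)]
lengths = [3, 4, 5]  # assumed linear complexities of the three inputs
print(integer_anf_value(F, lengths))  # 3*4 + 4*5 + 5 = 37
```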

- Example:
- If the characteristic polynomials of the input sequences are:

All these polynomials are primitive!

- Example (cont.):
- Then

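- The sum of N sequences in GF(q): $LC\left(\sum_{i=1}^{N} a_i\right) \le \sum_{i=1}^{N} LC(a_i)$
- The equality holds if the characteristic polynomials of the input sequences are mutually prime.

- The sum of N sequences in GF(q): $Per\left(\sum_{i=1}^{N} a_i\right) \le \mathrm{lcm}(Per(a_1), \ldots, Per(a_N))$
- Obviously, if the periods of the input sequences are mutually prime then $\mathrm{lcm}(Per(a_1), \ldots, Per(a_N)) = \prod_{i=1}^{N} Per(a_i)$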

- Example:

Primitive!

The periods are Mersenne primes

- Product of N sequences in GF(q):
- Theorem (Golić, 1989)
If Per(ai) are mutually prime, then

- Theorem (Lidl, Niederreiter)
Per(ai) are mutually prime

- Theorem (Golić, 1989)

- Example:

Primitive!

The periods are Mersenne primes

- The general case:
- Let $\hat{F}$ be the Boolean function obtained by removing from the function F all the products except those of maximum order. Let $\hat{F}^*$ be the corresponding integer function.

- Theorem (Golić, 1989)
- F depends on all the N input variables.
- Per(ai) are mutually prime.
- Then

- Example:
- If the characteristic polynomials of the input sequences are:

Primitive, periods Mersenne primes

- Example (cont.)

Geffe’s generator

F balanced – good statistical properties

- The equivalent scheme

- Example: polynomials – primitive, with periods that are Mersenne primes.

- Problem: Correlation!
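
The correlation problem can be made visible by enumerating the combining function. The sketch below assumes the usual textbook form of Geffe's function, F(x1,x2,x3) = x1x2 ⊕ x2x3 ⊕ x3, in which x2 selects between x1 and x3; the slides do not spell the formula out, so treat it as an assumption.

```python
from itertools import product


def geffe(x1, x2, x3):
    """Assumed form of Geffe's combiner: outputs x1 if x2 == 1, else x3."""
    return (x1 & x2) ^ (x2 & x3) ^ x3


inputs = list(product((0, 1), repeat=3))

# Balance: number of 1s in the truth table.
ones = sum(geffe(*x) for x in inputs)

# For each input, count how often the output equals that input.
agree = [sum(geffe(*x) == x[i] for x in inputs) for i in range(3)]

print(ones)   # 4 -> balanced (4 of 8)
print(agree)  # [6, 4, 6] -> output equals x1 (and x3) with probability 3/4
```

An agreement rate of 3/4 instead of 1/2 is what lets a cryptanalyst recover the initial states of the first and third LFSRs separately.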

- Is there a way to find a Boolean memoryless combiner that guarantees a high level of correlation immunity?
- This is a difficult problem and there is no final answer.
- However, some Boolean combiners are known to have a high level of correlation immunity.

- One of the classes of such “good” functions – Latin squares.
- A Latin square is an n×n array over n distinct integers in which each element appears exactly once in each row and in each column.

- Basic property of Latin squares:
- If we exchange two rows/columns of a Latin square, the resulting array is also a Latin square.

- This gives rise to a construction (one of the possible algorithms):
- We start from the table of addition of the additive group with n elements.
- We exchange some rows and columns of the table several times.
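
A minimal sketch of this construction follows, assuming the additive group is the integers modulo n. The function names, the number of swaps and the fixed random seed are illustrative choices.

```python
import random


def latin_square(n, swaps=10, seed=0):
    """Start from the addition table of Z_n and swap random rows/columns."""
    rng = random.Random(seed)
    square = [[(r + c) % n for c in range(n)] for r in range(n)]
    for _ in range(swaps):
        i, j = rng.sample(range(n), 2)
        if rng.random() < 0.5:
            square[i], square[j] = square[j], square[i]   # swap two rows
        else:
            for row in square:                            # swap two columns
                row[i], row[j] = row[j], row[i]
    return square


def is_latin(square):
    """Check that every row and every column contains 0..n-1 exactly once."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({row[c] for row in square} == symbols for c in range(n))
    return rows_ok and cols_ok


sq = latin_square(4)
for row in sq:
    print(row)
print(is_latin(sq))  # True
```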

- Example – a Latin square of order 4:

- A Latin square of dimension n as a family of $\log_2 n$ Boolean functions (a vectorial Boolean function with $\log_2 n$ outputs):
- There are 2 address branches, $\log_2 n$ bits each.
- The output has $\log_2 n$ bits.

- Example (see previous slide):
- The address is 0110 (the two most significant bits address the row).
- The output is 10.

- Basic correlation-related property of Latin squares:
- Each bit of output is correlated with a linear combination of inputs that are located in both address branches.
- Consequence: there is no way of analyzing the address branches individually – no divide and conquer.

- The principal characteristic:
- The output sequence of a subgenerator controls the clock sequence of one or more subgenerators.

- Example 1:
- X=1,1,0,1,0,1,0,1
- Y=0,1,0,0,1
- Z=1,0,1,0,0

- Example 2:
- X and Y are generated by LFSRs and the BRM is applied

- Theorem (Chambers, Jennings, 1984)
- R1, R2 – primitive polynomials, degrees m and n, respectively
- Periods $M = 2^m - 1$ and $N = 2^n - 1$
- All the prime factors of M divide N
- Then: $Per = MN$ and $LC = nM$

- The requirements of the Theorem are satisfied if the lengths of both LFSRs are equal and the feedback polynomials are primitive.

- Example: n=m=107, primitive polynomials
$LC = nM = 107(2^{107}-1)$

$Per = NM = (2^{107}-1)(2^{107}-1)$

Figure: LFSR 1 and LFSR 2 driven by a common clock, with decimation block P.

- A very simple binary sequence generator (Crypto’93)
- It consists of two LFSRs: LFSR1 and LFSR2
- Based on P, LFSR1 (the control register) decimates the sequence generated by LFSR2.

- If ai=0, bi is discarded, otherwise bi is sent to the output.
- Thus the number of discarded bits from the sequence b depends on the lengths of runs of 0s in the sequence a.

LFSRs:

- LFSR1: $L_1 = 3$, $f_1(x) = 1 + x^2 + x^3$, $IS_1 = (1,0,0)$
- LFSR2: $L_2 = 4$, $f_2(x) = 1 + x + x^4$, $IS_2 = (1,0,0,0)$
Decimation rule P:

- {ai}= 0 1 1 1 0 0 1 0 1 1 1 0 0 1 …
- {bi}= 1 1 1 0 1 0 1 1 0 0 1 0 0 0 …
- {cj}= 1 1 0 1 0 0 1 0 …
The bits of {bi} at positions where ai = 0 are eliminated.
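
The example can be reproduced with a few lines of code. In this sketch the two LFSR streams are extended with the recurrences implied by the feedback polynomials (a_t = a_{t-2} ⊕ a_{t-3} and b_t = b_{t-1} ⊕ b_{t-4}) and are seeded with the first output bits shown above rather than with IS1 and IS2, because the state convention of the slide is not specified. The helper names are hypothetical.

```python
def lfsr_stream(first_bits, taps, count):
    """Extend first_bits with s[t] = XOR of s[t - k] for every k in taps."""
    s = list(first_bits)
    while len(s) < count:
        bit = 0
        for k in taps:
            bit ^= s[-k]
        s.append(bit)
    return s[:count]


def shrink(a, b):
    """Keep b[i] whenever the control bit a[i] is 1; discard it otherwise."""
    return [bi for ai, bi in zip(a, b) if ai == 1]


a = lfsr_stream([0, 1, 1], taps=(2, 3), count=14)     # f1(x) = 1 + x^2 + x^3
b = lfsr_stream([1, 1, 1, 0], taps=(1, 4), count=14)  # f2(x) = 1 + x + x^4
print(a)             # [0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1]
print(b)             # [1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
print(shrink(a, b))  # [1, 1, 0, 1, 0, 0, 1, 0]
```

Note that the output rate is irregular: on average only about half of the bits of {bi} survive, which a practical design has to buffer.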

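Characteristics of the output sequence

- Period: $(2^{L_2}-1)\,2^{L_1-1}$, provided the feedback polynomials are primitive and $\gcd(L_1, L_2) = 1$
- Linear complexity: $L_2\,2^{L_1-2} < LC \le L_2\,2^{L_1-1}$
- Number of 1’s: $2^{L_1-1}\,2^{L_2-1}$ per period, approximately half of the period:
approximately balanced sequence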

- BRM:
- X=000100110101111…
- Y=001110100111010…
- Z=0010100111…

- Shrinking:
- X=000100110101111…
- Y=001110100111010…
- Z=01011011

- The output sequence of a generator of pseudorandom sequences looks random, but it is not.
- Pseudorandom generators expand a truly random sequence (the key) to a much longer sequence, such that an adversary cannot distinguish between the pseudorandom sequence and a truly random sequence.

- In order to obtain a guarantee of the security of this type of generators various statistical tests are applied, especially designed for this purpose.
- The fact that a generator passes a set of statistical tests should be considered a necessary condition, although not a sufficient one, for the security of the generator.

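- If the result X of an experiment can take any real value, then X is a continuous random variable.
- The probability density function f(x) of a continuous random variable X can be integrated and the following holds:
- $f(x) \ge 0$ for all $x \in \mathbb{R}$
- $\int_{-\infty}^{\infty} f(x)\,dx = 1$
- For all $a, b \in \mathbb{R}$ the following holds: $P(a < X \le b) = \int_a^b f(x)\,dx$

- A continuous random variable X has a normal distribution with mean $\mu$ and variance $\sigma^2$ if its probability density function is: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$, $-\infty < x < \infty$
- We say that X is $N(\mu, \sigma^2)$.
- If X is $N(0,1)$, then we say that X has a standard normal distribution.

- If the random variable X is $N(\mu, \sigma^2)$, then the variable $Z = (X-\mu)/\sigma$ is $N(0,1)$.
- Euler’s gamma function: $\Gamma(t) = \int_0^{\infty} x^{t-1} e^{-x}\,dx$, $t > 0$

- A continuous random variable X has a $\chi^2$ distribution with $\nu$ degrees of freedom if its probability density function is $f(x) = \frac{1}{\Gamma(\nu/2)\,2^{\nu/2}}\, x^{\nu/2-1} e^{-x/2}$ for $x \ge 0$, and $f(x) = 0$ for $x < 0$.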

- A statistical hypothesis H is an assertion about the distribution of one or more random variables.
- A hypothesis test is a procedure based on the observed values of the random variable that leads to the acceptance or rejection of the hypothesis H.

- The test only provides a measure of the strength of evidence given by the data against the hypothesis.
- The conclusion is probabilistic.
- The level of significance $\alpha$ of the test of the hypothesis H is the probability of rejecting the hypothesis H when it is true.

- The hypothesis to be tested is denominated the null hypothesis, H0.
- The alternative hypothesis is denoted by H1 or Ha.
- In cryptography:
- H0 – the given generator is a random sequence generator.

- If $\alpha$ is too small, the test could accept non random sequences.
- If $\alpha$ is too high, the test could reject random sequences.
- In cryptography:
- $\alpha$ is between 0.001 and 0.05.

- A test:
- Determines a statistic for the sample of the output sequence.
- This statistic is compared with the expected value of a random sequence.

- How is the comparison carried out?
- The computed statistic – $X_0$ – follows a $\chi^2$ distribution with $\nu$ degrees of freedom.
- It is assumed that this statistic takes large values for non random sequences.
- In order to achieve the significance level $\alpha$, a threshold $x_\alpha$ is chosen (by means of the corresponding table), such that $P(X_0 > x_\alpha) = \alpha$.

- How is the comparison carried out? (cont.)
- If the value of the statistic for the sample of the output sequence, $X_s$, satisfies $X_s > x_\alpha$, then the sequence fails the test.

- Basic tests for cryptographic use:
- Frequency test, serial test, poker test, runs test, autocorrelation test, etc.

- Frequency test
- Purpose: determine if the number of zeros and ones in a sequence s of length n is approximately the same.
- $n_0$ – number of zeros, $n_1$ – number of ones.
- The statistic: $X_1 = \frac{(n_0 - n_1)^2}{n}$

- Frequency test (cont.)
- The statistic follows a $\chi^2$ distribution with 1 degree of freedom.
- The approximation is good enough if $n \ge 10$.
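
A minimal sketch of the frequency test, using the statistic $(n_0 - n_1)^2 / n$. The sample is the 14-bit sequence {ai} from the shrinking generator example, and the threshold 3.8415 is the tabulated $\chi^2$ value for 1 degree of freedom at a significance level of 0.05; both are choices made for this illustration.

```python
def frequency_statistic(bits):
    """Frequency (monobit) statistic (n0 - n1)^2 / n."""
    n = len(bits)
    n1 = sum(bits)
    n0 = n - n1
    return (n0 - n1) ** 2 / n


# Tabulated chi-square threshold: 1 degree of freedom, alpha = 0.05.
CHI2_1DF_ALPHA_005 = 3.8415

s = [0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1]
x1 = frequency_statistic(s)
print(round(x1, 4))             # 0.2857  (n0 = 6, n1 = 8)
print(x1 > CHI2_1DF_ALPHA_005)  # False -> the sample passes this test
```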

- Serial test
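- Tries to determine if the number of occurrences of 00, 01, 10 and 11 as (overlapping) subsequences of s is approximately the same.
- The statistic: $X_2 = \frac{4}{n-1}\left(n_{00}^2 + n_{01}^2 + n_{10}^2 + n_{11}^2\right) - \frac{2}{n}\left(n_0^2 + n_1^2\right) + 1$
- The statistic follows a $\chi^2$ distribution with 2 degrees of freedom.
- The approximation is good enough if $n \ge 21$.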
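
A sketch of the serial test follows, counting overlapping pairs and using the standard statistic $\frac{4}{n-1}\sum n_{xy}^2 - \frac{2}{n}(n_0^2 + n_1^2) + 1$. The 24-bit periodic sample and the threshold 5.9915 ($\chi^2$, 2 degrees of freedom, significance level 0.05) are assumptions of the illustration.

```python
def serial_statistic(bits):
    """Serial (two-bit) statistic based on overlapping pairs."""
    n = len(bits)
    n1 = sum(bits)
    n0 = n - n1
    pairs = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for x, y in zip(bits, bits[1:]):
        pairs[(x, y)] += 1
    pair_squares = sum(c * c for c in pairs.values())
    return 4 / (n - 1) * pair_squares - 2 / n * (n0 * n0 + n1 * n1) + 1


# Tabulated chi-square threshold: 2 degrees of freedom, alpha = 0.05.
CHI2_2DF_ALPHA_005 = 5.9915

s = [0, 1, 1, 0] * 6   # n = 24; pair counts 00:5, 01:6, 10:6, 11:6
x2 = serial_statistic(s)
print(round(x2, 4))             # 0.1304  (= 3/23)
print(x2 > CHI2_2DF_ALPHA_005)  # False
```

The sample is obviously periodic yet passes, which illustrates why passing one test is only a necessary condition.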

- Poker test
- A positive integer m is considered such that $\lfloor n/m \rfloor \ge 5 \cdot 2^m$, and $k = \lfloor n/m \rfloor$.
- The sequence s is divided into k non-overlapping parts of size m.
- $n_i$ is the number of occurrences of the type i of the sequence of length m, $1 \le i \le 2^m$ (that is, i identifies the integer whose binary representation is the sequence of length m).
- The test determines if every sequence of length m appears approximately the same number of times.

- Poker test (cont.)
- The statistic: $X_3 = \frac{2^m}{k}\left(\sum_{i=1}^{2^m} n_i^2\right) - k$
- The statistic follows approximately a $\chi^2$ distribution with $2^m - 1$ degrees of freedom.
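
A sketch of the poker test with the statistic $\frac{2^m}{k}\sum n_i^2 - k$. The 40-bit periodic sample, the block size m = 2 and the threshold 7.8147 ($\chi^2$, 3 degrees of freedom, significance level 0.05) are assumptions chosen so that the example can be checked by hand.

```python
def poker_statistic(bits, m):
    """Poker statistic over k = len(bits) // m non-overlapping m-bit blocks."""
    k = len(bits) // m
    counts = [0] * (1 << m)
    for j in range(k):
        value = 0
        for bit in bits[j * m:(j + 1) * m]:
            value = (value << 1) | bit   # block read as a binary integer
        counts[value] += 1
    return (1 << m) * sum(c * c for c in counts) / k - k


# Tabulated chi-square threshold: 2^2 - 1 = 3 degrees of freedom, alpha = 0.05.
CHI2_3DF_ALPHA_005 = 7.8147

s = [0, 1, 1, 0] * 10   # n = 40, m = 2 -> k = 20 blocks: only 01 and 10 occur
x3 = poker_statistic(s, 2)
print(x3)                       # 20.0
print(x3 > CHI2_3DF_ALPHA_005)  # True -> the sample fails the poker test
```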

- Runs test
- A run of length i – a subsequence of s formed by i consecutive zeros or i consecutive ones that are neither preceded nor followed by the same symbol.
- A run of zeros – gap
- A run of ones – block

- Runs test (cont.)
- Purpose: determine if the number of runs of different lengths in the sequence s is that expected in a random sequence.
- The expected number of gaps (or blocks) of length i in a random sequence of length n is $e_i = (n - i + 3)/2^{i+2}$.
- It is considered that k is equal to the largest integer i for which $e_i \ge 5$.
- We denote by $B_i$ and $H_i$ the number of blocks and gaps of length i in s, for each i, $1 \le i \le k$.

- Runs test (cont.)
- The statistic: $X_4 = \sum_{i=1}^{k} \frac{(B_i - e_i)^2}{e_i} + \sum_{i=1}^{k} \frac{(H_i - e_i)^2}{e_i}$
- The statistic follows approximately a $\chi^2$ distribution with $2k - 2$ degrees of freedom.
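
A sketch of the runs test, with expected counts $e_i = (n - i + 3)/2^{i+2}$ and the usual sum of squared deviations over blocks and gaps. The 160-bit sample (a 40-bit pattern repeated four times) is an illustrative input that is not part of the slides, and 9.4877 is the tabulated $\chi^2$ threshold for 4 degrees of freedom at a significance level of 0.05.

```python
from itertools import groupby


def runs_statistic(bits):
    """Return (X4, k) for the runs test; only run lengths 1..k are counted."""
    n = len(bits)
    expected = {}
    i = 1
    while (n - i + 3) / 2 ** (i + 2) >= 5:   # e_i decreases with i
        expected[i] = (n - i + 3) / 2 ** (i + 2)
        i += 1
    blocks = dict.fromkeys(expected, 0)      # runs of ones
    gaps = dict.fromkeys(expected, 0)        # runs of zeros
    for symbol, group in groupby(bits):
        length = len(list(group))
        if length in expected:
            (blocks if symbol == 1 else gaps)[length] += 1
    x4 = sum((blocks[i] - e) ** 2 / e + (gaps[i] - e) ** 2 / e
             for i, e in expected.items())
    return x4, len(expected)


# Tabulated chi-square threshold: 2k - 2 = 4 degrees of freedom, alpha = 0.05.
CHI2_4DF_ALPHA_005 = 9.4877

pattern = "11100 01100 01000 10100 11101 11100 10010 01001".replace(" ", "")
s = [int(ch) for ch in pattern] * 4          # n = 160
x4, k = runs_statistic(s)
print(k)                        # 3   (e1 = 20.25, e2 = 10.0625, e3 = 5.0)
print(round(x4, 4))             # 31.7913
print(x4 > CHI2_4DF_ALPHA_005)  # True -> the sample fails the runs test
```

For this sample the counts are 25, 4 and 5 blocks and 8, 20 and 12 gaps of lengths 1, 2 and 3; the excess of gaps of length 2 and 3 is what drives the statistic up.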

- Autocorrelation test
- Checks the correlation between s and shifted versions of s.
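- An integer d, $1 \le d \le \lfloor n/2 \rfloor$, is considered.
- The number of bits in s that are not equal to their d-shifts is $A(d) = \sum_{i=0}^{n-d-1} s_i \oplus s_{i+d}$

- Autocorrelation test (cont.)
- The statistic: $X_5 = 2\left(A(d) - \frac{n-d}{2}\right) / \sqrt{n-d}$
- The statistic follows approximately a N(0,1) distribution.
- The approximation is good enough if $n - d \ge 10$.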
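
A sketch of the autocorrelation test, where A(d) counts the positions at which s differs from its d-shift and the statistic is $2(A(d) - (n-d)/2)/\sqrt{n-d}$. The periodic 24-bit sample and the two-sided threshold 1.96 (standard normal, significance level 0.05) are assumptions of the illustration; a two-sided comparison is used here because very small values of A(d) are as suspicious as very large ones.

```python
from math import sqrt


def autocorrelation_statistic(bits, d):
    """Autocorrelation statistic for shift d (1 <= d <= len(bits) // 2)."""
    n = len(bits)
    a_d = sum(bits[i] ^ bits[i + d] for i in range(n - d))
    return 2 * (a_d - (n - d) / 2) / sqrt(n - d)


# Two-sided threshold of the standard normal distribution for alpha = 0.05.
Z_TWO_SIDED_ALPHA_005 = 1.96

s = [0, 1, 1, 0] * 6                      # period 4, n = 24
x5 = autocorrelation_statistic(s, 4)      # shift equal to the period: A(4) = 0
print(round(x5, 4))                       # -4.4721
print(abs(x5) > Z_TWO_SIDED_ALPHA_005)    # True -> the sample fails
```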