### CPE 619 Testing Random-Number Generators

Aleksandar Milenković

The LaCASA Laboratory

Electrical and Computer Engineering Department

The University of Alabama in Huntsville

http://www.ece.uah.edu/~milenka

http://www.ece.uah.edu/~lacasa

Overview

- Chi-square test
- Kolmogorov-Smirnov Test
- Serial-correlation Test
- Two-level tests
- K-dimensional uniformity or k-distributivity
- Serial Test
- Spectral Test

Testing Random-Number Generators

Goal: To ensure that the random number generator produces a random stream

- Plot histograms
- Plot quantile-quantile plot
- Use other tests
- Passing a test is necessary but not sufficient
- Pass ≠ good; fail ⇒ bad
- New tests ⇒ old generators fail the test
- Tests can be adapted for other distributions

Chi-Square Test

- Most commonly used test
- Can be used for any distribution
- Prepare a histogram of the observed data
- Compare observed frequencies with theoretical
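- k = Number of cells
- o_i = Observed frequency for the ith cell
- e_i = Expected frequency for the ith cell
- Test statistic: $D = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}$
- D = 0 ⇒ Exact fit
- D has a chi-square distribution with k-1 degrees of freedom
- ⇒ Compare D with $\chi^2_{[1-\alpha;\,k-1]}$; the sequence passes at significance level α if D is less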
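The chi-square comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the generator from the course example: the sample comes from Python's own `random` module with an arbitrary seed, the function names are made up for this sketch, and the critical value 14.68 is the tabulated 0.90 quantile for 9 degrees of freedom (10 cells).

```python
import random


def chi_square_statistic(observed, expected):
    """Return D = sum((o_i - e_i)^2 / e_i) over the k cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def uniform_cell_counts(values, k):
    """Count how many values in [0, 1] fall into each of k equal-width cells."""
    counts = [0] * k
    for u in values:
        # min() keeps a value of exactly 1.0 inside the last cell.
        counts[min(int(u * k), k - 1)] += 1
    return counts


# Illustrative run: 1000 numbers from Python's generator (an assumption, not
# the LCG used in the course example), sorted into 10 cells.
rng = random.Random(1)
n, k = 1000, 10
sample = [rng.random() for _ in range(n)]
d = chi_square_statistic(uniform_cell_counts(sample, k), [n / k] * k)
critical = 14.68  # chi-square table value for 0.90 and k - 1 = 9 degrees of freedom
print(f"D = {d:.3f}, passes: {d < critical}")
```

With equal expected counts the expected list is simply n/k repeated k times; unequal cells only change that list.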

Example 27.1

- 1000 random numbers with x0 = 1
- Observed difference = 10.380
- Observed is less ⇒ accept IID U(0, 1)

Chi-Square for Other Distributions

- Errors in cells with a small e_i affect the chi-square statistic more
- Best when the e_i's are equal
- ⇒ Use an equi-probable histogram with variable cell sizes
- Combine adjoining cells so that the new cell probabilities are approximately equal
- The number of degrees of freedom should be reduced to k-r-1 (in place of k-1), where r is the number of parameters estimated from the sample
- Designed for discrete distributions and for large sample sizes only ⇒ lower significance for finite sample sizes and continuous distributions
- If a cell has fewer than 5 observations, combine neighboring cells
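One possible way to combine neighboring cells is a single left-to-right pass that keeps accumulating cells until the merged cell is large enough. The sketch below is an assumption about how the rule of thumb could be applied: it thresholds on the expected count (rather than the observed count), and the function name and greedy strategy are illustrative only.

```python
def merge_small_cells(observed, expected, min_expected=5.0):
    """Greedily merge adjacent cells until every merged cell has an
    expected count of at least min_expected."""
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0, 0.0
    if acc_exp > 0:
        # Leftover cells at the right edge join the last merged cell.
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


obs, exp = merge_small_cells([1, 2, 10, 3, 1], [2, 3, 10, 3, 2])
print(obs, exp)  # three merged cells remain
```

The degrees of freedom for the subsequent chi-square test should be based on the number of merged cells, not the original number.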

Kolmogorov-Smirnov Test

- Developed by A. N. Kolmogorov and N. V. Smirnov
- Designed for continuous distributions
- Difference between the observed CDF (cumulative distribution function) F_o(x) and the expected CDF F_e(x) should be small

Kolmogorov-Smirnov Test (cont’d)

- K+ = maximum observed deviation above the expected CDF: $K^+ = \sqrt{n}\,\max_x [F_o(x) - F_e(x)]$
- K- = maximum observed deviation below the expected CDF: $K^- = \sqrt{n}\,\max_x [F_e(x) - F_o(x)]$
- K+ < K[1-α; n] and K- < K[1-α; n] ⇒ pass at the α level of significance
- Don't use the max/min of F_e(x_i) - F_o(x_i)
- Use F_e(x_{i+1}) - F_o(x_i) for K-
- For U(0, 1): F_e(x) = x
- F_o(x) = j/n, where x > x_1, x_2, ..., x_{j-1}

Example 27.2

- 30 Random numbers using a seed of x0=15:
- The numbers are: 14, 11, 2, 6, 18, 23, 7, 21, 1, 3, 9, 27, 19, 26, 16, 17, 20, 29, 25, 13, 8, 24, 10, 30, 28, 22, 4, 12, 5, 15.

Example 27.2 (cont’d)

- The normalized numbers obtained by dividing the sequence by 31 are: 0.45161, 0.35484, 0.06452, 0.19355, 0.58065, 0.74194, 0.22581, 0.67742, 0.03226, 0.09677, 0.29032, 0.87097, 0.61290, 0.83871, 0.51613, 0.54839, 0.64516, 0.93548, 0.80645, 0.41935, 0.25806, 0.77419, 0.32258, 0.96774, 0.90323, 0.70968, 0.12903, 0.38710, 0.16129, 0.48387.

Example 27.2 (cont’d)

- K[0.9; n] value for n = 30 and α = 0.1 is 1.0424
- Observed < table value ⇒ pass
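The K-S computation for this example can be reproduced with a short sketch. It assumes the usual sorted-sample form of the statistics for U(0, 1), namely K+ = √n · max_j (j/n − x_j) and K− = √n · max_j (x_j − (j−1)/n); the function name is illustrative, and the 30 integers are the ones listed in the example.

```python
import math


def ks_statistics(sample):
    """Return (K+, K-) for testing a sample against U(0, 1), where Fe(x) = x."""
    xs = sorted(sample)
    n = len(xs)
    # K+: the observed CDF reaches (j + 1)/n at the (j + 1)-th smallest value.
    k_plus = math.sqrt(n) * max((j + 1) / n - x for j, x in enumerate(xs))
    # K-: just before the (j + 1)-th smallest value the observed CDF is still j/n.
    k_minus = math.sqrt(n) * max(x - j / n for j, x in enumerate(xs))
    return k_plus, k_minus


raw = [14, 11, 2, 6, 18, 23, 7, 21, 1, 3, 9, 27, 19, 26, 16,
       17, 20, 29, 25, 13, 8, 24, 10, 30, 28, 22, 4, 12, 5, 15]
sample = [x / 31 for x in raw]  # normalize as in the example
k_plus, k_minus = ks_statistics(sample)
critical = 1.0424  # K[0.9; 30] quoted in the example
print(f"K+ = {k_plus:.4f}, K- = {k_minus:.4f}, passes: {max(k_plus, k_minus) < critical}")
```

Both statistics come out at about 0.177, well below the table value of 1.0424, which agrees with the pass verdict above.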

Serial-Correlation Test

- Nonzero covariance ⇒ dependence. The inverse is not true
- R_k = autocovariance at lag k = Cov[x_n, x_{n+k}]
- For large n, R_k is normally distributed with a mean of zero and a variance of 1/[144(n-k)]
- 100(1-α)% confidence interval for the autocovariance is: $R_k \mp z_{1-\alpha/2} / (12\sqrt{n-k})$
- For k ≥ 1: check whether the CI includes zero
- For k = 0: R_0 = variance of the sequence, expected to be 1/12 for IID U(0, 1)
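A sketch of the serial-correlation check follows. It assumes the autocovariance is estimated around the known U(0, 1) mean of 1/2, i.e. R_k = (1/(n−k)) Σ (u_i − 1/2)(u_{i+k} − 1/2), and uses z = 1.645 for a 90% interval; the function names and the test sequence from Python's `random` module are illustrative choices.

```python
import math
import random


def autocovariance(u, k):
    """Estimate R_k using the known U(0, 1) mean of 1/2."""
    n = len(u)
    if not 0 <= k < n:
        raise ValueError("lag k must satisfy 0 <= k < len(u)")
    return sum((u[i] - 0.5) * (u[i + k] - 0.5) for i in range(n - k)) / (n - k)


def autocovariance_ci(u, k, z=1.645):
    """Return the interval R_k -/+ z / (12 * sqrt(n - k)); z = 1.645 gives 90%."""
    half_width = z / (12 * math.sqrt(len(u) - k))
    r_k = autocovariance(u, k)
    return r_k - half_width, r_k + half_width


rng = random.Random(1)  # arbitrary seed, not the generator from the course example
u = [rng.random() for _ in range(10000)]
for k in range(1, 4):
    low, high = autocovariance_ci(u, k)
    print(f"lag {k}: [{low:.6f}, {high:.6f}] includes zero: {low <= 0 <= high}")
```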

Example 27.3: Serial Correlation Test

10,000 random numbers with x0=1:

Example 27.3 (cont’d)

- All confidence intervals include zero ⇒ all covariances are statistically insignificant at 90% confidence.

Two-Level Tests

- If the sample size is too small, the test results may apply locally, but not globally to the complete cycle
- Similarly, a global test may not apply locally
- Use two-level tests
- ⇒ Use a chi-square test on n samples of size k each, and then use a chi-square test on the set of n chi-square statistics so obtained
- ⇒ Chi-square on chi-square test
- Similarly, K-S on K-S
- Can also use this to find a "nonrandom" segment of an otherwise random sequence
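The first level of a two-level test, and its use for locating a suspicious segment, might look like the sketch below. It only computes one chi-square statistic per segment and flags segments at or above a critical value; the second level (testing the collection of statistics against the chi-square distribution) is deliberately left out. Function names, segment size, and the deliberately bad segment are illustrative assumptions.

```python
def segment_chi_squares(values, segment_size, cells):
    """Return one chi-square uniformity statistic per non-overlapping segment."""
    stats = []
    expected = segment_size / cells
    for start in range(0, len(values) - segment_size + 1, segment_size):
        counts = [0] * cells
        for u in values[start:start + segment_size]:
            counts[min(int(u * cells), cells - 1)] += 1
        stats.append(sum((c - expected) ** 2 / expected for c in counts))
    return stats


def suspicious_segments(stats, critical):
    """Indices of segments whose statistic reaches the critical value."""
    return [i for i, d in enumerate(stats) if d >= critical]


# A perfectly spread segment followed by a stuck segment (all values 0.5).
values = [(i + 0.5) / 10 for i in range(10)] * 2 + [0.5] * 20
stats = segment_chi_squares(values, segment_size=20, cells=10)
print(stats, suspicious_segments(stats, critical=14.68))
```

The critical value 14.68 is the tabulated 0.90 quantile for 9 degrees of freedom, matching the 10 cells used here.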

k-Distributivity

- k-Dimensional Uniformity
- Chi-square ⇒ uniformity in one dimension ⇒ given two real numbers a_1 and b_1 between 0 and 1 such that b_1 > a_1: $P(a_1 \le u_n < b_1) = b_1 - a_1$
- This is known as the 1-distributivity property of u_n.
- The 2-distributivity is a generalization of this property in two dimensions: $P(a_1 \le u_n < b_1 \text{ and } a_2 \le u_{n+1} < b_2) = (b_1 - a_1)(b_2 - a_2)$ for all choices of a_1, b_1, a_2, b_2 in [0, 1] with b_1 > a_1 and b_2 > a_2

k-Distributivity (cont’d)

- k-distributed if: $P(a_1 \le u_n < b_1, \ldots, a_k \le u_{n+k-1} < b_k) = (b_1 - a_1) \cdots (b_k - a_k)$
- For all choices of a_i, b_i in [0, 1], with b_i > a_i, i = 1, 2, ..., k.
- A k-distributed sequence is always (k-1)-distributed. The inverse is not true.
- Two tests:
- Serial test
- Spectral test
- Visual test for 2-dimensions: Plot successive overlapping pairs of numbers

Example 27.4

- Tausworthe sequence generated by:
- The sequence is k-distributed for k up to d /le, that is, k=1.
- In two dimensions: Successive overlapping pairs (xn, xn+1)

Example 27.5

- Consider the polynomial:
- Better 2-distributivity than Example 27.4

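[Figure: plot of successive overlapping pairs, with x_n on the horizontal axis and x_{n+1} on the vertical axis]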

Serial Test

- Goal: To test for uniformity in two dimensions or higher.
- In two dimensions, divide the space between 0 and 1 into K² cells of equal area

Serial Test (cont’d)

- Given {x_1, x_2, …, x_n}, use n/2 non-overlapping pairs (x_1, x_2), (x_3, x_4), … and count the points in each of the K² cells
- Expected = n/(2K²) points in each cell
- Use the chi-square test to find the deviation of the actual counts from the expected counts
- The degrees of freedom in this case are K²-1
- For k dimensions: use k-tuples of non-overlapping values
- k-tuples must be non-overlapping
- Overlapping ⇒ the numbers of points in the cells are not independent ⇒ the chi-square test cannot be used
- In a visual check one can use overlapping or non-overlapping tuples
- In the spectral test overlapping tuples are used
- Given n numbers, there are n-1 overlapping pairs and n/2 non-overlapping pairs
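The two-dimensional serial test can be sketched as follows: form non-overlapping pairs, count them in a K × K grid, and compute the chi-square statistic with K² − 1 degrees of freedom. The function name and the tiny hand-made inputs are illustrative only.

```python
def serial_test_statistic(values, K):
    """Return (D, degrees_of_freedom) for the 2-D serial test on
    non-overlapping pairs (x1, x2), (x3, x4), ..."""
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values) - 1, 2)]
    counts = [[0] * K for _ in range(K)]
    for x, y in pairs:
        counts[min(int(x * K), K - 1)][min(int(y * K), K - 1)] += 1
    expected = len(pairs) / (K * K)  # n / (2 K^2) when n is even
    d = sum((c - expected) ** 2 / expected for row in counts for c in row)
    return d, K * K - 1


# Four pairs, one in each of the 2 x 2 cells: perfect fit.
print(serial_test_statistic([0.1, 0.1, 0.6, 0.6, 0.1, 0.6, 0.6, 0.1], K=2))
# Four pairs, all in the same cell: large deviation.
print(serial_test_statistic([0.1] * 8, K=2))
```

The returned statistic is then compared with the chi-square table value for the returned degrees of freedom, exactly as in the one-dimensional test.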

Spectral Test

- Goal: To determine how densely the k-tuples {x1, x2, …, xk} can fill up the k-dimensional hyperspace
- The k-tuples from an LCG fall on a finite number of parallel hyper-planes
- Successive pairs would lie on a finite number of lines
- In three dimensions, successive triplets lie on a finite number of planes

Example 27.6 (cont’d)

- In three dimensions, the points (x_n, x_{n-1}, x_{n-2}) for the above generator would lie on five planes given by:
- Obtained by adding the following to equation
- Note that k + k_1 will be an integer between 0 and 4.

Spectral Test (More)

- Marsaglia (1968): Successive k-tuples obtained from an LCG fall on, at most, (k! m)^{1/k} parallel hyper-planes, where m is the modulus used in the LCG.
- Example: m = 2^32 ⇒ fewer than 2,953 hyper-planes will contain all 3-tuples, fewer than 566 hyper-planes will contain all 4-tuples, and fewer than 41 hyper-planes will contain all 10-tuples. This is a weakness of LCGs.
- Spectral Test: Determine the maximum distance between adjacent hyper-planes.
- Larger distance ⇒ worse generator
- In some cases, it can be done by complete enumeration
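Marsaglia's bound is easy to evaluate directly. The short sketch below computes (k! m)^(1/k) for m = 2^32 and the three dimensions quoted above; the function name is illustrative.

```python
import math


def marsaglia_bound(k, m):
    """Upper bound (k! * m)^(1/k) on the number of parallel hyper-planes
    that contain all k-tuples of an LCG with modulus m."""
    return (math.factorial(k) * m) ** (1.0 / k)


for k in (3, 4, 10):
    print(k, int(marsaglia_bound(k, 2 ** 32)))
```

The integer parts are 2953, 566, and 41, matching the figures in the example.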

Example 27.7

- Compare the following two generators:
- Using a seed of x0=15, first generator:
- Using the same seed in the second generator:

Example 27.7 (cont’d)

- Every number between 1 and 30 occurs once and only once
- ⇒ Both sequences will pass the chi-square test for uniformity

Example 27.7 (cont’d)

- First Generator:

Example 27.7 (cont’d)

- Three straight lines of positive slope or ten lines of negative slope
- Since the distance between the lines of positive slope is more, consider only the lines with positive slope
- Distance between two parallel lines y = ax + c_1 and y = ax + c_2 is given by $|c_2 - c_1| / \sqrt{1 + a^2}$
- The distance between the above lines is 9.80
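The distance between two parallel lines y = ax + c1 and y = ax + c2 is |c2 − c1| / √(1 + a²), which can be checked numerically. In the sketch below the slope 3 and the intercept gap of 31 are illustrative assumptions: the transcript does not preserve the line equations, and these values were chosen only because they reproduce the quoted 9.80.

```python
import math


def parallel_line_distance(a, c1, c2):
    """Perpendicular distance between y = a*x + c1 and y = a*x + c2."""
    return abs(c2 - c1) / math.sqrt(1 + a * a)


# Assumed slope and intercepts (not taken from the transcript).
print(round(parallel_line_distance(3, 0, -31), 2))
```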

Example 27.7 (cont’d)

- Second Generator:

Example 27.7 (cont’d)

- All points fall on seven straight lines of positive slope or six straight lines of negative slope.
- Considering lines with negative slopes:
- The distance between the lines is 5.76.
- The second generator has a smaller maximum distance and, hence, the second generator has a better 2-distributivity
- The set with a larger distance may not always be the set with fewer lines

Example 27.7 (cont’d)

- Either overlapping or non-overlapping k-tuples can be used
- With overlapping k-tuples, we have k times as many points, which makes the graph visually more complete. The number of hyper-planes and the distance between them are the same with either choice.
- With the serial test, only non-overlapping k-tuples should be used.
- For generators with a large m and for higher dimensions, finding the maximum distance becomes quite complex. See Knuth (1981)

Summary

- Chi-square test is a one-dimensional test, designed for discrete distributions and large sample sizes
- K-S test is designed for continuous variables
- Serial correlation test for independence
- Two level tests find local non-uniformity
- k-dimensional uniformity = k-distributivity tested by spectral test or serial test

Overview

- Inverse transformation
- Rejection
- Composition
- Convolution
- Characterization

Random-Variate Generation

- General Techniques
- Only a few techniques may apply to a particular distribution
- Look up the distribution in Chapter 29

Inverse Transformation

- Used when F⁻¹ can be determined either analytically or empirically

[Figure: CDF F(x) plotted against x; a value u on the vertical axis is mapped back to x]

Example 28.1

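- For exponential variates with rate λ: $F(x) = 1 - e^{-\lambda x} = u$, so $x = -\frac{1}{\lambda}\ln(1-u)$
- If u is U(0,1), 1-u is also U(0,1)
- Thus, exponential variables can be generated by: $x = -\frac{1}{\lambda}\ln(u)$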
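Inverse transformation for the exponential distribution reduces to a single logarithm. The sketch below assumes the distribution is parameterized by its rate (so the mean is 1/rate); the function name and the seed are illustrative.

```python
import math
import random


def exponential_variate(u, rate):
    """Map u in (0, 1] to an exponential variate via x = -ln(u) / rate."""
    return -math.log(u) / rate


rng = random.Random(42)
# rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1] and
# the logarithm is always defined.
samples = [exponential_variate(1.0 - rng.random(), rate=2.0) for _ in range(5)]
print(samples)
```

Using 1 − u instead of u changes nothing statistically, as noted above, but it avoids taking the logarithm of zero with Python's half-open interval.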

Example 28.2

- The packet sizes (trimodal) probabilities:
- The CDF for this distribution is:

Example 28.2 (cont’d)

- The inverse function is:
- Note: The CDF is continuous from the right ⇒ the value on the right of the discontinuity is used ⇒ the inverse function is continuous from the left ⇒ u = 0.7 ⇒ x = 64
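For a discrete distribution the inverse is a table lookup on the cumulative probabilities. In the sketch below only the fact that u = 0.7 maps to x = 64 comes from the example; the remaining packet sizes and cumulative probabilities are hypothetical placeholders, as is the function name.

```python
from bisect import bisect_left

# Hypothetical trimodal distribution: only "u = 0.7 gives x = 64" is taken
# from the example; the other sizes and probabilities are placeholders.
SIZES = [64, 128, 512]
CUMULATIVE = [0.7, 0.8, 1.0]


def packet_size(u):
    """Left-continuous inverse CDF: return the first size whose cumulative
    probability is at least u, for u in [0, 1]."""
    return SIZES[bisect_left(CUMULATIVE, u)]


print(packet_size(0.7), packet_size(0.71), packet_size(0.95))
```

`bisect_left` returns the first index whose cumulative probability is greater than or equal to u, which is what makes the inverse continuous from the left.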

Rejection

- Can be used if a pdf g(x) exists such that c g(x) majorizes the pdf f(x) ⇒ c g(x) ≥ f(x) ∀x
- Steps:
- 1. Generate x with pdf g(x)
- 2. Generate y uniform on [0, c g(x)]
- 3. If y ≤ f(x), then output x and return. Otherwise, repeat from step 1 ⇒ continue rejecting the random variates x and y until y ≤ f(x)
- Efficiency = how closely c g(x) envelopes f(x). Large area between c g(x) and f(x) ⇒ a large percentage of the (x, y) pairs generated in steps 1 and 2 are rejected
- If generation of g(x) is complex, this method may not be efficient

Example 28.3

- Beta(2,4) density function: $f(x) = 20x(1-x)^3$, 0 ≤ x ≤ 1
- Bounded inside a rectangle of height 2.11 ⇒ Steps:
- Generate x uniform on [0, 1]
- Generate y uniform on [0, 2.11]
- If y ≤ 20x(1-x)³, then output x and return. Otherwise repeat from step 1
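The three steps for the Beta(2,4) example translate almost line for line into code. This is a minimal sketch: the function names and the seed are illustrative, and Python's `random` module stands in for whatever uniform generator is actually used.

```python
import random


def beta_2_4_pdf(x):
    """Beta(2, 4) density 20 x (1 - x)^3 on [0, 1]."""
    return 20.0 * x * (1.0 - x) ** 3


def beta_2_4_variate(rng, height=2.11):
    """Rejection sampling inside a rectangle of the given height."""
    while True:
        x = rng.random()               # step 1: x uniform on [0, 1)
        y = rng.uniform(0.0, height)   # step 2: y uniform on [0, height]
        if y <= beta_2_4_pdf(x):       # step 3: accept, otherwise try again
            return x


rng = random.Random(7)
samples = [beta_2_4_variate(rng) for _ in range(20000)]
print(sum(samples) / len(samples))  # should be near the Beta(2,4) mean of 1/3
```

Since the density integrates to 1 and the rectangle has area 2.11, roughly 1/2.11 of the candidate pairs are accepted.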

Composition

- Can be used if the CDF F(x) is a weighted sum of n other CDFs: $F(x) = \sum_{i=1}^{n} p_i F_i(x)$
- Here, $p_i \ge 0$, $\sum_{i=1}^{n} p_i = 1$, and the F_i's are distribution functions.
- n CDFs are composed together to form the desired CDF; hence the name of the technique.
- The desired CDF is decomposed into several other CDFs ⇒ also called decomposition
- Can also be used if the pdf f(x) is a weighted sum of n other pdfs: $f(x) = \sum_{i=1}^{n} p_i f_i(x)$
- Generate a random integer I such that: $P(I = i) = p_i$
- This can easily be done using the inverse-transformation method.
- Generate x with the ith pdf f_i(x) and return.

Example 28.4

- pdf: $f(x) = \frac{1}{2a} e^{-|x|/a}$
- Composition of two exponential pdf's
- Generate u_1 ~ U(0,1) and u_2 ~ U(0,1)
- If u_1 < 0.5, return x = -a ln u_2; otherwise return x = a ln u_2.
- Inverse transformation better for Laplace
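The composition steps for the Laplace example can be sketched as below, assuming the density is the two-sided exponential with scale a, so that each half is an exponential with mean a chosen with probability 1/2. The function names are illustrative.

```python
import math
import random


def laplace_from_uniforms(u1, u2, a):
    """Composition: u1 picks the half, u2 (in (0, 1]) drives the exponential."""
    magnitude = -a * math.log(u2)  # exponential variate with mean a
    return magnitude if u1 < 0.5 else -magnitude


def laplace_variate(rng, a):
    # 1 - rng.random() lies in (0, 1], which keeps the logarithm defined.
    return laplace_from_uniforms(rng.random(), 1.0 - rng.random(), a)


rng = random.Random(3)
print([round(laplace_variate(rng, a=2.0), 3) for _ in range(5)])
```

Composition costs two uniform numbers per variate, whereas inverse transformation needs only one, which is why the latter is preferred for this distribution.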

Convolution

- Sum of n variables: $x = y_1 + y_2 + \cdots + y_n$
- Generate n random variates y_i and sum them
- For sums of two variables, pdf of x = convolution of the pdfs of y_1 and y_2; hence the name
- Although no convolution is performed in generation
- If the pdf or CDF is a sum ⇒ composition
- If the variable x is a sum ⇒ convolution

Convolution: Examples

- Erlang-k = $\sum_{i=1}^{k}$ Exponential_i
- Binomial(n, p) = $\sum_{i=1}^{n}$ Bernoulli(p) ⇒ generate n U(0,1) numbers and return the number of them less than p
- χ²(n) = $\sum_{i=1}^{n}$ N(0,1)²
- Γ(a, b_1) + Γ(a, b_2) = Γ(a, b_1 + b_2) ⇒ non-integer value of b = integer + fraction
- $\sum_{i=1}^{n}$ Any = Normal ⇒ $\sum$ U(0,1) = Normal
- $\sum_{i=1}^{m}$ Geometric = Pascal
- $\sum_{i=1}^{2}$ Uniform = Triangular
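Two of the convolution examples above are sketched here: a binomial variate as a count of uniform numbers below p, and an Erlang-k variate as a sum of k exponentials. The function names, the mean-based parameterization of the exponential, and the seed are illustrative assumptions.

```python
import math
import random


def binomial_variate(rng, n, p):
    """Sum of n Bernoulli(p) trials: count the uniform numbers below p."""
    return sum(1 for _ in range(n) if rng.random() < p)


def erlang_variate(rng, k, mean):
    """Sum of k independent exponential variates, each with the given mean."""
    return sum(-mean * math.log(1.0 - rng.random()) for _ in range(k))


rng = random.Random(11)
print(binomial_variate(rng, n=10, p=0.3), erlang_variate(rng, k=3, mean=2.0))
```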

Characterization

- Use special characteristics of distributions ⇒ characterization
- Exponential inter-arrival times ⇒ Poisson number of arrivals ⇒ continuously generate exponential variates until their sum exceeds T and return the number of variates generated before the sum exceeded T as the Poisson variate.
- The ath smallest number in a sequence of a+b-1 U(0,1) variates has a β(a, b) distribution.
- The ratio of two unit normal variates is a Cauchy(0, 1) variate.
- A chi-square variate with even degrees of freedom χ²(n) is the same as a gamma variate γ(2, n/2).
- If x_1 and x_2 are two gamma variates γ(a, b) and γ(a, c), respectively, the ratio x_1/(x_1+x_2) is a beta variate β(b, c).
- If x is a unit normal variate, e^{μ+σx} is a lognormal(μ, σ) variate.
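The Poisson characterization can be sketched by accumulating exponential inter-arrival times and counting the arrivals that land within the interval T, which is one fewer than the number of exponential variates drawn. The function name, the rate-based parameterization, and the seed are illustrative; `expovariate` is the exponential generator in Python's `random` module.

```python
import random


def poisson_variate(rng, rate, T=1.0):
    """Count arrivals in [0, T] when inter-arrival times are exponential
    with the given rate; the count is Poisson with mean rate * T."""
    count = 0
    elapsed = rng.expovariate(rate)
    while elapsed <= T:
        count += 1
        elapsed += rng.expovariate(rate)
    return count


rng = random.Random(2008)
samples = [poisson_variate(rng, rate=3.0) for _ in range(20000)]
print(sum(samples) / len(samples))  # should be close to 3
```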

Summary

- Is the CDF invertible? Yes ⇒ Use inversion
- Is the CDF a sum of other CDFs? Yes ⇒ Use composition
- Is the pdf a sum of other pdfs? Yes ⇒ Use composition

Summary (cont’d)

- Is the variate a sum of other variates? Yes ⇒ Use convolution
- Is the variate related to other variates? Yes ⇒ Use characterization
- Does a majorizing function exist? Yes ⇒ Use rejection
- No ⇒ Use empirical inversion

Homework #6

- Submit answers to exercise 27.1
- Submit answers to exercise 27.4
- Due: Monday, April 7, 2008, 12:45 PM
- Submit a hard copy to instructor
