
Presentation Transcript


  1. Signal and Data Processing CSC 508 Lecture 2

  2. Homework Review One common method of “fixing” a bad pixel is to replace it with the average value of its neighbors. The difficulty is in figuring out which pixels are bad. In our case we have the advantage of an imperfect optical system that blurs the image. Only bad pixels will have grey-levels or colors that change abruptly. Compare the two enlargements from the Hourglass Nebula above. The point-source image of a star on the right is blurred by the HST optical system, while the bad pixel shows a sudden change in pixel color and brightness. Versions of this algorithm are used to eliminate the “clicks and pops” in phonograph recordings as well as the gamma radiation spikes in data communications from space-based platforms.
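A minimal Python sketch of the repair described above. The function name, the 8-neighbor window, and the threshold value are illustrative assumptions, not taken from the slides:

```python
import numpy as np

def repair_bad_pixels(image, threshold=50.0):
    """Replace pixels that differ sharply from their neighbors with
    the average of those neighbors (2-D grey-level image assumed)."""
    fixed = image.astype(float).copy()
    rows, cols = image.shape
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Mean of the 8-connected neighborhood, excluding the center pixel
            patch = fixed[r - 1:r + 2, c - 1:c + 2]
            neighbor_mean = (patch.sum() - fixed[r, c]) / 8.0
            # In a blurred image, only a bad pixel changes this abruptly
            if abs(fixed[r, c] - neighbor_mean) > threshold:
                fixed[r, c] = neighbor_mean
    return fixed
```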

  3. 2.0 Measurements and Noise This lesson introduces some basic definitions of signal analysis. We will investigate methods of quantifying the amount of uncertainty or randomness in a set of measurements. We will build a computer simulation of a noisy signal source and we will flip some coins. As computer scientists we are used to dealing with absolutes and deterministic processes. We compute numbers and generate strings of symbols that are exactly determined. When we solve an equation such as x + 1 = 0 for its root, we find that x = -1 exactly. We can determine that the logical expression G(x,y,z) = y'z + x'y' + y is false for exactly one assignment, x = true, y = z = false. This will be the result every time we compute it. In the world of analog signals we are dealing with values that are not precise and not exactly reproducible. Essentially every aspect of connecting computers to the “real world” involves the manipulation of noisy measurements. These data are called random or stochastic.
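The slide promises a simulation of a noisy signal source. A minimal sketch, assuming Gaussian measurement noise (the constants are illustrative):

```python
import random

TRUE_VALUE = 5.0   # the fixed quantity X being measured (assumed)
NOISE_SIGMA = 0.3  # assumed noise level of the instrument

def measure():
    """One noisy, non-reproducible measurement of a fixed quantity."""
    return TRUE_VALUE + random.gauss(0.0, NOISE_SIGMA)

samples = [measure() for _ in range(10)]
print(samples)  # a different set of values on every run
```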

  4. Probability & Statistics of Measurements Sometimes a stochastic variable can have a finite set of possible values, such as the number rolled on a pair of dice or the face of a flipped coin. These are called discrete values. Alternatively, a random variable can be a measurement from a range of possible values, such as the current temperature or the speed of a car. These are called continuous values. 2.1 Statistical Measures If we make measurements xi of a fixed (not varying with time) stochastic quantity X, we expect our observations to approximate the quantity but we do not expect each of them to be exactly equal to X. If there is no built-in error in our measuring instrument (bias), we will find that our measurements will be distributed around the true value of X in a manner such that values of xi closer to X are more likely than those farther from X. [Figure: probability distribution function, plotting the relative likelihood of xi = x against x, peaked at the true value X.]

  5. Sample Average Assuming our errors are purely random, we can obtain an estimate of the value of X by computing the average of all our measurements, the average sampled value: x̄ = (1/n) Σ xi, summed over i = 1..n. It is important to realize that when the variation in measurements is due to differences in the items being measured, the average value of the samples may not be a better estimate of a particular item than a particular measurement of that item. On the other hand, if the quantity being measured is fixed, the larger the sample size the better the average will be as an estimate of the true value. Later we will see that repeated measurements of varying quantities can also be used to improve the estimate if we have knowledge of how the quantity is changing with time. When there is only random measurement error, we can get an idea of the quality of the average of all the samples as an estimate of the true value by noting how much variation there is between samples. These sample-to-sample variations are called deviations.

  6. Average Deviation If we add together the absolute differences between each sample and the sample average and then divide the result by the number of samples, we obtain the average deviation: d = (1/n) Σ |xi - x̄|. This is also called the dispersion of the set of samples of X. A large dispersion indicates a high level of uncertainty, while a small dispersion suggests that any particular sample would be a good estimate of the true value of X.

  7. Standard Deviation Usually we are more concerned with large deviations from the true value than small ones. In this regard we prefer to give a larger weight to those values farthest from the mean in our calculation of deviation: σ = sqrt( (1/n) Σ (xi - x̄)² ). This is called the standard deviation, and its square is called the variance of the set of measurements. The average and the standard deviation as defined here are based on a sample size of n. This implies that we have collected a set of n measurements of some fixed quantity x. If we were to collect another n samples of the same quantity and use these additional data to recompute the average and the deviations, we would probably not get exactly the same values.

  8. Sample Standard Deviation The value of the average used in the standard deviation is assumed to be the population mean. In general, we may not be able to measure the entire population in order to compute the population mean. Instead we can compute an estimate of the mean using a sample of the population, and use the sample standard deviation: s = sqrt( (1/(n-1)) Σ (xi - x̄)² ). The reason we use n-1 in the denominator involves the number of degrees of freedom associated with our measurements. Since we use the same samples to estimate the population mean and to compute the standard deviation, we lose one degree of freedom.
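A small Python sketch covering slides 5 through 8: sample average, average deviation, population standard deviation, and the n-1 sample standard deviation. Names and the example data are illustrative:

```python
import math

def stats(xs):
    n = len(xs)
    avg = sum(xs) / n                                    # sample average
    avg_dev = sum(abs(x - avg) for x in xs) / n          # average deviation (dispersion)
    pop_sd = math.sqrt(sum((x - avg)**2 for x in xs) / n)        # population form
    sample_sd = math.sqrt(sum((x - avg)**2 for x in xs) / (n - 1))  # n-1 form
    return avg, avg_dev, pop_sd, sample_sd

print(stats([4.9, 5.1, 5.0, 4.8, 5.2]))
```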

  9. A Coin Flip Random Number Generator Consider an experiment in which you flip a true coin (i.e. a coin that is equally likely to turn up heads as tails) four times. For each of the four occurrences, you record a 1 for heads and a 0 for tails. You interpret these values as a four-digit binary number between 0 (0000) and 15 (1111). Since you are as likely to obtain a 1 as a 0, you can assume that each of the 16 possible four-digit binary numbers is equally likely. If you repeat this experiment 100 times you would obtain a collection of values similar to the ones below...
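A sketch of this generator in Python (function name is illustrative):

```python
import random

def four_coin_number():
    """Flip four fair coins; read heads=1, tails=0 as a 4-bit binary number."""
    bits = [random.randint(0, 1) for _ in range(4)]
    value = 0
    for b in bits:
        value = value * 2 + b  # shift in each flip as the next binary digit
    return value               # 0 (0000) .. 15 (1111)

trials = [four_coin_number() for _ in range(100)]
print(trials)
```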

  10. We can test our assumption by counting the number of occurrences of each value. The frequency of occurrence of each of the values can be displayed in a histogram as shown below. In 100 trials we should expect each of the 16 values to occur 1/16 of the time, or 100/16 = 6.25 occurrences. Our results seem quite different. Does this mean that our original assumption was wrong? Do the results shown deviate in any significant way from our expectations? First of all, we cannot obtain a fractional number of occurrences of any value. For those values with a frequency of occurrence exceeding 6.25 there must be others with a frequency of occurrence less than 6.25. Secondly, we are fairly confident that we have a good understanding of the experiment and the expected results. Finally, we note that the number of 1's (heads) = 223 and the number of 0's (tails) = 177, which seems reasonable for a fair coin and our sample size.

  11. Actually these results are well within the range of what can be expected for the number of trials and the number of possible outcomes. This illustrates the importance of having a sufficient number of samples to support meaningful analysis. There is a general rule that states, "When constructing a histogram, the number of samples should meet or exceed the square of the number of bins into which the samples are being partitioned." If we apply this rule to our experiment we should take at least 256 samples. The figure below shows an example histogram using a sample size of 256. The value 1/16 is the probability that a particular one of the 16 numbers 0 through 15 will be obtained in a given trial. A graph of the probability of occurrence for every possible outcome is called a probability distribution. Since each value is equally likely in this case we have a uniform distribution with an expected mean of 7.5.
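A sketch that applies the bins-squared rule (16² = 256 samples) and prints a crude text histogram; drawing uniform values directly stands in for the four-coin generator above:

```python
from collections import Counter
import random

N_SAMPLES = 256  # rule of thumb: at least bins**2 = 16**2 samples
counts = Counter(random.randint(0, 15) for _ in range(N_SAMPLES))
for value in range(16):
    print(f"{value:2d}: {'*' * counts[value]}")  # one row per bin
```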

  12. Now we will analyze a different coin flipping experiment. We will flip 10 coins at a time and count the number of heads as our sample, in 100 trials. We can assume that the average number of heads for the 10 coins will be 5. We know that the range of possible values for our sample is [0,10]. That is, there can be as few as 0 heads or as many as 10 heads in a trial. Is every outcome (i.e. 0 heads, 1 head, 2 heads, ..., 10 heads) equally likely? We can derive the relative probabilities for the numbers of heads from a closer look at the behavior of each coin as an independent event in an ensemble of events. Since each coin may turn up in one of two ways, the set of 10 coins may fall in any one of 2^10 or 1024 different ways. Even though we are only counting the number of heads and not worrying about which coin is a head, each of the 1024 outcomes is equally likely. Therefore each outcome will occur with a probability of 1/1024. Since we are dividing the outcomes into 11 categories, some of these categories must be more likely than others. For example, there are more ways for 10 coins to turn up 5 heads and 5 tails than there are ways for 10 coins to turn up 10 heads and 0 tails. The histogram above shows the result of 100 trials. These results are not uniform; rather, they correspond to a binomial distribution.
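A sketch of this second experiment (names are illustrative):

```python
import random
from collections import Counter

def heads_in_ten():
    """Flip 10 fair coins and count the heads (one trial)."""
    return sum(random.randint(0, 1) for _ in range(10))

counts = Counter(heads_in_ten() for _ in range(100))
for k in range(11):
    print(f"{k:2d} heads: {counts[k]}")  # clustered around 5, not uniform
```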

  13. Binomial Distribution The name of this distribution comes from the fact that it applies to an experiment consisting of repeated trials in which there are two possible outcomes in each trial. The binomial probability is given by b(m; n, p) = C(n,m) p^m (1-p)^(n-m), where p is the probability of a single success (e.g. the probability that a coin will turn up heads), p^m (1-p)^(n-m) is the probability of one particular sequence of exactly m successes in n trials, and C(n,m) = n! / (m! (n-m)!) is the number of ways of choosing m out of n items when order doesn't matter. In our coin flipping experiment, p = 1/2, n = 10, m is the number of heads ranging from 0 to 10, and b is the probability of obtaining m heads in a trial.
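The formula translated directly into Python, evaluated for the 10-coin experiment:

```python
from math import comb

def binomial(m, n, p):
    """Probability of exactly m successes in n independent trials."""
    return comb(n, m) * p**m * (1 - p)**(n - m)

# Probability of m heads when flipping 10 fair coins:
for m in range(11):
    print(m, binomial(m, 10, 0.5))  # peaks at m = 5 with 252/1024
```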

  14. Normal (Gaussian) Distribution The coin experiments deal with discrete values of either heads or tails. Many times we will make measurements of continuous quantities such as temperature, weight or speed. (It can be argued that, at a quantum scale, these also take on discretized values, but let's not.) The characteristics exhibited by the measurements of a fixed continuous quantity in a noisy environment are described by the normal (or Gaussian) distribution function. When repeated measurements of a random continuous variable with a constant mean m are made, the distribution of the values is approximated by g(x,m,s) = (1 / (s sqrt(2π))) exp( -(x-m)² / (2s²) ), where s is the standard deviation of the random variable x. While we simply add values to determine the total probability of a discrete random sample, we integrate the function g(x,m,s) over the desired range of x to obtain the corresponding probability for a continuous random variable. The proper way to use this function is to select some region around a value x, such as x1 < x < x2, and integrate the function over this range. The result is the probability that a particular sample from this distribution will fall between the limits x1 and x2.

  15. The normal probability density function (PDF) has a shape similar to the curve for the binomial distribution function. The source code on the right is an Ada function used to compute the integral of a normal PDF with zero mean and unit standard deviation. Homework: Implement this function in some software package. You may use any programming language, math package or a spreadsheet. Calling INTGAUSS(XL,XH) with values substituted for XL and XH returns the probability of sample values falling between these limits in a zero-mean, unit-sigma normal distribution. This function can be used to compute probabilities for a normal distribution with any mean and standard deviation by the proper manipulation of its input values.
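The Ada listing itself is not reproduced in this transcript. A Python sketch with the same calling convention (the name and signature come from the slide; computing the integral via the error function rather than numerical quadrature is an assumption):

```python
from math import erf, sqrt

def intgauss(xl, xh):
    """Integral of the zero-mean, unit-sigma normal PDF from xl to xh,
    i.e. the probability that a sample falls between these limits."""
    def phi(z):
        return 0.5 * (1.0 + erf(z / sqrt(2.0)))  # standard normal CDF
    return phi(xh) - phi(xl)
```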

  16. Example use of INTGAUSS Let's say you want to know the probability of obtaining a value between Xlo = 3.0 and Xhi = 4.2 from a normal distribution with a mean m = 2.5 and a standard deviation s = 1.9. Before calling INTGAUSS, we transform these limits to the equivalent limits for a zero-mean, unit-sigma normal distribution: XL = (Xlo - m)/s and XH = (Xhi - m)/s, so that XL = 0.263 and XH = 0.895. Plugging these values into INTGAUSS we obtain P(Xlo < X < Xhi) = 0.211.
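The same worked example, assuming the intgauss sketch above:

```python
m, s = 2.5, 1.9
xl = (3.0 - m) / s   # 0.263
xh = (4.2 - m) / s   # 0.895
print(intgauss(xl, xh))  # approximately 0.211
```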

  17. Poisson Distribution Finally we consider a probability distribution that applies to events occurring in time. Certain random variables are expressed in terms of the likelihood of an event occurring in a specified time interval, given that such events occur at a known average rate. For example, we record the number of vehicles passing a certain point on a highway during a known time interval and estimate that there are, on average, 8 vehicles per hour passing that point. We wish to determine the probability that at least 1 vehicle will pass during some 10-minute interval. To compute this value we will use the Poisson distribution given by P(n) = m^n e^(-m) / n!, where m is the average number of events per unit time and n is the number of such events for which we desire to know the probability. The equation above computes the probability that exactly n events will occur in the implied time interval. For our example, m = 8 events/hr × (1 hr / 60 min) × 10 min = 4/3 events per 10-minute interval. Since we want to know if at least 1 vehicle passes in 10 minutes, we can compute the probability that no (zero) vehicles pass and subtract this probability from 1.0.
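The vehicle example carried through in Python:

```python
from math import exp, factorial

def poisson(n, m):
    """Probability of exactly n events when the average count is m."""
    return (m ** n) * exp(-m) / factorial(n)

m = 8 / 60 * 10                # 4/3 expected vehicles per 10-minute interval
p_at_least_one = 1.0 - poisson(0, m)
print(p_at_least_one)          # approximately 0.736
```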

  18. Relationships between the Distributions As the number of elements in the binomial and the Poisson distributions is increased, their forms asymptotically approach the form of the normal distribution. The figure below shows the change in the form of the Poisson distribution as the mean grows from 10 to 50. For large values of n, the binomial, Poisson and normal distributions are practically equivalent (apologies to any mathematicians in the house). The standard deviation of a random variable that follows the Poisson distribution is given by s = sqrt(m). Even though m and s are independent in the general form of the normal distribution, many of the normally distributed random variables we encounter exhibit this same relationship between m and s.
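A quick empirical check of the s = sqrt(m) relationship, using the slide's means of 10 and 50 (the sample size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
for mean in (10, 50):
    samples = rng.poisson(mean, size=100_000)
    # For a Poisson variable the standard deviation should approach sqrt(mean)
    print(mean, samples.std(), np.sqrt(mean))
```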

  19. Homework - A friend with a serious excess of free time describes the following coin flipping experiment and claims to have made a rather startling breakthrough in probability theory: "I can make a better than random prediction of the outcome of a completely random event! This is how I do it. I flip four coins and look at three of them drawn at random. If all three are heads I predict that the fourth coin is a tail, because there is only one way out of 16 (2^4) that all four coins can be heads and four ways that the four coins can fall as 3 heads and 1 tail (namely THHH, HTHH, HHTH, and HHHT). When I see three tails, I predict that the 4th coin is a head by the same reasoning. When the three randomly selected coins are two heads and a tail I predict that the fourth coin is a tail, because there are only 4 ways to get 3 heads and 1 tail while there are 6 ways to get 2 heads and 2 tails. The same argument holds when I draw two tails and a head." Help save your friend from any embarrassment by fixing the above analysis and demonstrating that the probability of correctly predicting the fourth coin is exactly 50%.
