Chapter 18. Sampling Distribution Models. Demonstration. Observe the in-class SPSS demonstration related to sampling distribution models. Demonstration Summary.
First, we examined the distribution of state appropriations for education given the entire population of U.S. states.
Our findings indicated that the distribution of state spending on education was skewed to the right, with a mean (μ) of $1,272,969,120.00 and a standard deviation (σ) of $1,567,930,688.096.
Next, we randomly selected 30 states to be included in our sample. Analysis of this sample indicated that the distribution of spending on education was again skewed to the right; however, the mean of the sample (ȳ) was $1,410,710,766.67, with a standard deviation (s) of $1,941,673,134.577.
We then repeated the random sampling process to get a new sample of thirty states. We noticed that this new sample also had a distribution that was skewed to the right; however, the mean and standard deviation of this sample differed. The results were $1,126,093,266.67 and $1,781,298,838.439 respectively.
Did we do something wrong?
We then examined 100 different random samples of size thirty and determined that each sample had a slightly different mean and standard deviation due to sampling variability (i.e. different combinations of states were included in each of our samples).
When we went to create a histogram for our collection of sample means, we discovered something pretty amazing – that distribution looked very much like a normal model even though the distribution of state appropriations from our original population was skewed to the right.
A listing of all the values that a sample mean can take on and how often those values can occur is called the sampling distribution of a sample mean.
This histogram of sample means depicts the sampling distribution of the sample mean.
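The repeated-sampling step can be imitated in a few lines of Python. This is only a sketch: the real appropriations data from the SPSS demonstration are not reproduced here, so the `population` list is a hypothetical stand-in of 50 right-skewed values, and the seed is arbitrary.

```python
import random
import statistics

random.seed(18)  # fixed seed so the run is repeatable

# Hypothetical stand-in for the 50 state appropriations: right-skewed values
# drawn from a lognormal distribution (NOT the real SPSS data set).
population = [random.lognormvariate(0.0, 1.0) for _ in range(50)]
pop_mean = statistics.mean(population)

# Draw 100 random samples of 30 "states" each and record every sample mean.
sample_means = [
    statistics.mean(random.sample(population, 30))
    for _ in range(100)
]

print(f"population mean:      {pop_mean:.3f}")
print(f"mean of sample means: {statistics.mean(sample_means):.3f}")
print(f"smallest sample mean: {min(sample_means):.3f}")
print(f"largest sample mean:  {max(sample_means):.3f}")
```

Each sample gives a different mean, but the collection of sample means clusters around the population mean, which is the behavior the histogram in the demonstration showed.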
Like any other distribution, the sampling distribution of the sample mean has a shape, a center, and a spread (a measure of variability).
This distribution can be interpreted as the probability distribution of sample means.
Under certain conditions this sampling distribution will approximate the normal model regardless of the shape of the distribution for the original variable from the population.
We can use simulation to get a sense as to what the sampling distribution of the sample mean might look like…
Let’s start with a simulation of 10,000 tosses of a die. A histogram of the results is:
Looking at the average of two dice after a simulation of 10,000 tosses:
The average of 5 dice after a simulation of 10,000 tosses looks like:
As the sample size (number of dice) gets larger, each sample average is more likely to be closer to the population mean.
And it probably does not shock you that the sampling distribution of this mean looks more and more like a Normal model as the number of dice grows.
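The dice simulation described above can be sketched as follows. The function name `average_of_dice` and the seed are illustrative choices, not part of the original demonstration.

```python
import random
import statistics

random.seed(6)  # fixed seed so the run is repeatable

TRIALS = 10_000

def average_of_dice(num_dice):
    """Roll num_dice fair dice once and return the average of the faces."""
    return sum(random.randint(1, 6) for _ in range(num_dice)) / num_dice

# For 1, 2, and 5 dice, simulate 10,000 tosses and summarize the averages.
results = {}
for num_dice in (1, 2, 5):
    averages = [average_of_dice(num_dice) for _ in range(TRIALS)]
    results[num_dice] = (statistics.mean(averages), statistics.pstdev(averages))
    center, spread = results[num_dice]
    print(f"{num_dice} dice: mean of averages = {center:.3f}, SD = {spread:.3f}")
```

The center stays near 3.5 (the mean of a single fair die) while the spread shrinks as more dice are averaged.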
The Central Limit Theorem (CLT): The mean of a random sample has a sampling distribution whose shape can be approximated by a Normal model. The larger the sample, the better the approximation will be.
The CLT is surprising and a bit weird:
All we need is for the observations to be independent and collected with randomization.
Recall that normal models are described by their means and standard deviations.
The mean of all sample means is the population mean μ. That is to say, the sampling distribution of the mean has a mean μ.
The standard deviation of all sample means is σ/√n. That is to say, the sampling distribution of the mean has a standard deviation σ/√n.
When a random sample is drawn from any population with mean μ and standard deviation σ, its sample mean has a sampling distribution with the same mean μ but whose standard deviation is σ/√n (we write SD(ȳ) = σ/√n).
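A quick simulation can check the σ/√n rule. The sketch below assumes a skewed exponential population with σ = 1, chosen only for illustration; the sample sizes and the number of repetitions are likewise arbitrary.

```python
import math
import random
import statistics

random.seed(28)  # fixed seed so the run is repeatable

SIGMA = 1.0          # SD of an exponential population with rate 1
NUM_SAMPLES = 5_000  # how many sample means to simulate per sample size

def simulated_sd_of_mean(n):
    """SD of NUM_SAMPLES sample means, each from n independent draws."""
    means = [
        statistics.fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(NUM_SAMPLES)
    ]
    return statistics.pstdev(means)

# Compare the simulated SD of the sample mean with sigma / sqrt(n).
comparison = {}
for n in (10, 40):
    comparison[n] = (simulated_sd_of_mean(n), SIGMA / math.sqrt(n))
    simulated, theory = comparison[n]
    print(f"n = {n}: simulated SD = {simulated:.4f}, sigma/sqrt(n) = {theory:.4f}")
```

Quadrupling the sample size from 10 to 40 should roughly halve the standard deviation of the sample mean, as the formula predicts.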
No matter what population (whether it has a distribution that is symmetric, uniform, or skewed to the right or left) the random sample comes from, the shape of the sampling distribution is approximately Normal as long as the sample size is large enough. The larger the sample used, the more closely the Normal approximates the sampling distribution for the mean.
Provided that the sampled values are independent and the sample size is large enough, the sampling distribution of p̂ (the sample proportion) is modeled by a Normal model with mean μ(p̂) = p and standard deviation SD(p̂) = √(pq/n), where q = 1 − p.
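Check the corresponding conditions before using these models: the sample is random, it is less than 10% of the population, and it is large enough (for proportions, at least 10 expected successes and 10 expected failures).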
The standard deviations of our Normal models are as follows: SD(p̂) = √(pq/n), with q = 1 − p, for a sample proportion, and SD(ȳ) = σ/√n for a sample mean.
When we don’t know p or σ, we’re stuck, right?
Nope. We will use sample statistics to estimate these population parameters.
When we estimate the standard deviation of a sampling distribution using statistics found from the data, the estimate is called a standard error.
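As a sketch of that calculation, the helper functions below (hypothetical names) replace σ with s and p with p̂. The mean example reuses the statistics from the first sample of 30 states in the demonstration; because 30 of 50 states is more than 10% of the population, that figure is illustrative only. The poll counts in the proportion example are made up.

```python
import math

def standard_error_mean(s, n):
    """Estimate SD(y-bar) by using the sample SD s in place of sigma."""
    return s / math.sqrt(n)

def standard_error_proportion(p_hat, n):
    """Estimate SD(p-hat) by using the sample proportion in place of p."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

# First sample from the demonstration: s = $1,941,673,134.577, n = 30.
se_mean = standard_error_mean(1_941_673_134.577, 30)
print(f"SE of the sample mean: ${se_mean:,.2f}")

# Hypothetical poll (made-up numbers): 210 "yes" answers out of 400.
se_prop = standard_error_proportion(210 / 400, 400)
print(f"SE of the sample proportion: {se_prop:.4f}")
```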
In the 2001 ACT, students had a mean score of 21.3 with a standard deviation of 6.0. Assume that the scores are normally distributed.
If 60 students are randomly selected, find the probability that they have a mean score greater than 23.5.
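One way to set up this calculation, as a sketch: since the scores are assumed Normal, the mean of 60 scores follows a Normal model with mean 21.3 and standard deviation 6.0/√60. The code uses `statistics.NormalDist` from the Python standard library (3.8 or later); the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 21.3, 6.0, 60

# Sampling distribution of the mean: Normal(mu, sigma / sqrt(n)).
sd_of_mean = sigma / math.sqrt(n)
sampling_model = NormalDist(mu, sd_of_mean)

# z-score of 23.5 within the sampling distribution.
z = (23.5 - mu) / sd_of_mean

# P(sample mean > 23.5) is the upper-tail area.
prob_above = 1 - sampling_model.cdf(23.5)

print(f"SD of the sample mean: {sd_of_mean:.4f}")
print(f"z = {z:.2f}")
print(f"P(mean > 23.5) = {prob_above:.4f}")
```

A sample mean this high sits nearly three standard deviations above 21.3, so the probability is very small.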
A national study found that 44% of college students engage in binge drinking (5 drinks at a sitting for men, 4 for women). Use the 68-95-99.7 Rule to describe the sampling distribution model for the proportion of students in a randomly selected group of 200 college students who engage in binge drinking. Do you think the appropriate conditions are met?
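A sketch of the setup for this exercise: the model is centered at p = 0.44 with SD(p̂) = √(pq/n), and the 68-95-99.7 Rule gives intervals of one, two, and three standard deviations around the center. The code also checks the Success/Failure counts; the randomization and 10% conditions have to be judged from the wording of the problem.

```python
import math

p, n = 0.44, 200
q = 1 - p

# Success/Failure check: expected successes and failures should both be >= 10.
expected_successes = n * p
expected_failures = n * q
print(f"np = {expected_successes:.0f}, nq = {expected_failures:.0f}")

# Standard deviation of the sample proportion.
sd_p_hat = math.sqrt(p * q / n)
print(f"SD(p-hat) = {sd_p_hat:.4f}")

# 68-95-99.7 Rule: intervals of 1, 2, and 3 SDs around p.
intervals = {}
for label, k in (("68%", 1), ("95%", 2), ("99.7%", 3)):
    intervals[label] = (p - k * sd_p_hat, p + k * sd_p_hat)
    low, high = intervals[label]
    print(f"about {label} of samples: p-hat between {low:.3f} and {high:.3f}")
```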
Carbon monoxide emissions for a certain kind of car vary with mean 2.9 g/mi and standard deviation 0.4 g/mi. A company has 80 of these cars in its fleet.
Just before a referendum on a school budget, a local newspaper polls 400 voters in an attempt to predict whether the budget will pass. Suppose that the budget actually has the support of 52% of the voters. What’s the probability the newspaper’s sample will lead them to predict defeat? Be sure to verify that the assumptions and conditions necessary for your analysis are met.
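A sketch of the calculation, assuming the newspaper predicts defeat whenever fewer than half of the sampled voters support the budget (that reading of "predict defeat" is an assumption). With p = 0.52 and n = 400, the Normal model for p̂ has standard deviation √(pq/n).

```python
import math
from statistics import NormalDist

p, n = 0.52, 400
q = 1 - p

# Success/Failure check before using the Normal model.
assert n * p >= 10 and n * q >= 10

# Normal model for the sample proportion.
sd_p_hat = math.sqrt(p * q / n)
sampling_model = NormalDist(p, sd_p_hat)

# Assumed rule: the poll predicts defeat when p-hat falls below 0.50.
z = (0.50 - p) / sd_p_hat
prob_predict_defeat = sampling_model.cdf(0.50)

print(f"SD(p-hat) = {sd_p_hat:.4f}")
print(f"z = {z:.2f}")
print(f"P(p-hat < 0.50) = {prob_predict_defeat:.3f}")
```

Even though the budget truly has majority support, a sample of 400 can fall below 50% by chance fairly often, because 0.50 is less than one standard deviation below 0.52.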