Presentation 7. Sampling Distributions.
Statistics vs. Parameters
Statistic: a numerical value computed from a sample.
Parameter: a numerical value associated with a population.
For Variables with a Normal (Bell-Shaped) Distribution
~68% of the values fall within +/- 1 standard deviation of the mean.
~95% of the values fall within +/-2 standard deviations of the mean.
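As a quick check of these two percentages, here is a minimal sketch that uses the standard normal distribution from Python's `statistics` module. The helper name `prob_within` is only illustrative.

```python
from statistics import NormalDist

# Standard normal: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def prob_within(k):
    """Probability that a normal value falls within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

within_1 = prob_within(1)  # roughly 68%
within_2 = prob_within(2)  # roughly 95%
print(f"within 1 SD: {within_1:.4f}")
print(f"within 2 SD: {within_2:.4f}")
```

Because any normal variable can be standardized, the same two probabilities apply whatever the mean and standard deviation are.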
Situation 1: A survey is undertaken to determine the proportion of PSU students who engage in under-age drinking. The survey asks 200 random under-age students (assume no problems with bias). Suppose the true population proportion of those who drink is 60% or p=.6
p-hat is the proportion in the sample who drink.
Imagine repeating this survey many times, and each time we record the sample proportion of those who have engaged in under-age drinking. What would the sampling distribution of p-hat look like?
p-hat is a random variable, assigning a value to each sample!
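The "imagine repeating this survey" idea can be sketched as a small simulation. This is only an illustration under stated assumptions: the number of repeated surveys (`num_surveys`) and the random seed are arbitrary choices, and the comparison value sqrt(p(1-p)/n) is the standard formula for the standard deviation of a sample proportion.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

rng = random.Random(0)   # fixed seed so the run is repeatable (arbitrary choice)
n, p = 200, 0.6          # sample size and true proportion from Situation 1
num_surveys = 2000       # hypothetical number of repeated surveys

def one_survey():
    """Simulate one survey of n students and return its sample proportion (p-hat)."""
    drinkers = sum(1 for _ in range(n) if rng.random() < p)
    return drinkers / n

# Each survey yields a different p-hat: that is what makes p-hat a random variable.
p_hats = [one_survey() for _ in range(num_surveys)]

sim_mean = fmean(p_hats)            # should be close to p
sim_sd = pstdev(p_hats)             # spread of p-hat across surveys
theory_sd = sqrt(p * (1 - p) / n)   # standard formula for the SD of p-hat

print(f"mean of p-hat: {sim_mean:.4f}")
print(f"SD of p-hat:   {sim_sd:.4f} (formula gives {theory_sd:.4f})")
```

A histogram of `p_hats` would show the sampling distribution: values clustered around .6 in a roughly bell-shaped pattern.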
Let X be the number of respondents who say they engage in under-age drinking. What is the PDF of X?
X is binomial with n = 200 and p = .6, so we can calculate the probability of each possible outcome of X (0 to 200). The PDF is plotted below:
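The binomial probabilities for every outcome from 0 to 200 can be computed directly. The following is a minimal sketch using only the standard library; `binom_pmf` is an illustrative helper name, not something from the slides.

```python
from math import comb

n, p = 200, 0.6  # values from Situation 1

def binom_pmf(k, n, p):
    """P(X = k) for a binomial variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of each possible count of "yes" answers, 0 through 200.
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)                                      # probabilities sum to 1
mean_x = sum(k * pk for k, pk in enumerate(pmf))      # expected count, n * p
most_likely = max(range(n + 1), key=lambda k: pmf[k]) # outcome with the highest probability

print(f"sum of probabilities: {total:.6f}")
print(f"mean of X: {mean_x:.2f}")
print(f"most likely outcome: {most_likely}")
```

Dividing each outcome k by n turns this distribution of counts into the distribution of the sample proportion.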
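The sampling distribution of p-hat is approximately normal, with mean p and standard deviation sqrt(p(1-p)/n), if the following conditions are satisfied: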
Situation 2: The height of women aged 20 to 30 is normally distributed (bell-shaped) with a mean of 65 inches and a standard deviation of 3 inches. A random sample of 200 women was taken and the sample mean recorded.
Now IMAGINE taking MANY samples of size 200 from the population of women. For each sample we record the sample mean, X-bar. What is the sampling distribution of X-bar?
Original Population of Women: X= height of random woman
Distribution of Sample Means: X-bar = mean of random sample of size 200.
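To compare the two distributions numerically, here is a hedged sketch that draws many samples of 200 heights and records each sample mean. The number of repeated samples (`num_samples`) and the seed are arbitrary assumptions, and sigma/sqrt(n) is the standard formula for the standard deviation of a sample mean.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

rng = random.Random(1)          # fixed seed so the run is repeatable (arbitrary choice)
mu, sigma, n = 65.0, 3.0, 200   # population mean, population SD, and sample size from Situation 2
num_samples = 1000              # hypothetical number of repeated samples

def sample_mean():
    """Draw n heights from the normal population and return their mean (X-bar)."""
    return fmean(rng.gauss(mu, sigma) for _ in range(n))

x_bars = [sample_mean() for _ in range(num_samples)]

sim_mean = fmean(x_bars)        # centered near the population mean
sim_sd = pstdev(x_bars)         # much smaller than the population SD of 3
theory_sd = sigma / sqrt(n)     # standard formula for the SD of X-bar

print(f"mean of X-bar: {sim_mean:.3f}")
print(f"SD of X-bar:   {sim_sd:.3f} (formula gives {theory_sd:.3f})")
```

Individual heights vary with a standard deviation of 3 inches, while means of 200 heights vary by only about 0.2 inches around the same center.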
What about for skewed or non-normal data?
Situation 3: The CDs data set is clearly right skewed. Suppose our population looked something like this. Let us take repeated samples from this population and see what the distribution of the sample mean looks like.
Distributions of sample means for samples of size n = 4, n = 8, n = 16, and n = 32. Population: µ = 87.6, σ = 87.8.
The sample means from the previous slide had the following summary statistics:
Note that the mean remains constant and the standard deviation decreases as the sample size increases!
The above is true if the sample size is large enough; usually n greater than 30 is sufficient.
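The pattern can be reproduced with a simulation on a stand-in skewed population. The actual CDs data are not available here, so this sketch ASSUMES an exponential population with mean 87.6; that choice is hypothetical, made only because an exponential distribution is right skewed and has a standard deviation equal to its mean (87.6, close to the σ = 87.8 on the slide).

```python
import random
from statistics import fmean, pstdev

rng = random.Random(2)   # fixed seed so the run is repeatable (arbitrary choice)
pop_mean = 87.6          # mean of the assumed exponential stand-in population
num_samples = 2000       # hypothetical number of repeated samples per sample size

def simulate(n):
    """Return (mean, SD) of num_samples sample means, each from a sample of size n."""
    means = [
        fmean(rng.expovariate(1 / pop_mean) for _ in range(n))
        for _ in range(num_samples)
    ]
    return fmean(means), pstdev(means)

# Same sample sizes as on the slide.
results = {n: simulate(n) for n in (4, 8, 16, 32)}

for n, (m, s) in results.items():
    expected_sd = pop_mean / n ** 0.5   # sigma / sqrt(n) for this assumed population
    print(f"n = {n:2d}: mean = {m:6.2f}, SD = {s:6.2f} (sigma/sqrt(n) = {expected_sd:6.2f})")
```

The means stay near the population mean for every n, while the standard deviation shrinks roughly in proportion to 1/sqrt(n), so quadrupling the sample size halves the spread.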
Assuming the drug has NOT lost potency, answer the following questions…
Then the sampling distribution of p-hat is approximately normal with mean p = .85 and standard deviation = .036.
There is about a 95% probability that the sample proportion cured is between 78% and 92% (.85 +/- 2 × .036).
First calculate a z-score…
Z-score = [value-mean]/StdDev
Z-score = [.9-.85]/.036 ≈ 1.4
P(p-hat > .9) = P(Z > 1.4) = 1 - P(Z < 1.4)
= 1 - .9192 = .0808
Z-score = [.75-.85]/.036 ≈ -2.8
P(p-hat < .75) = P(Z < -2.80) = .0026
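These calculations can be checked without a z-table. The sketch below uses `statistics.NormalDist` with the mean and standard deviation given above; the variable names are illustrative only.

```python
from statistics import NormalDist

mean_p, sd_p = 0.85, 0.036                 # sampling distribution of p-hat from the slide
p_hat_dist = NormalDist(mu=mean_p, sigma=sd_p)

# About 95% of sample proportions fall within 2 standard deviations of the mean.
lower = mean_p - 2 * sd_p
upper = mean_p + 2 * sd_p

# P(p-hat > .9), using the unrounded z-score.
z_high = (0.90 - mean_p) / sd_p
p_above_90 = 1 - p_hat_dist.cdf(0.90)

# P(p-hat < .75), using the unrounded z-score.
z_low = (0.75 - mean_p) / sd_p
p_below_75 = p_hat_dist.cdf(0.75)

# Table-style value with z rounded to 1.4, as on the slide.
table_above_90 = 1 - NormalDist().cdf(1.4)

print(f"95% range: {lower:.3f} to {upper:.3f}")
print(f"z = {z_high:.2f}, P(p-hat > .90) = {p_above_90:.4f}")
print(f"z = {z_low:.2f}, P(p-hat < .75) = {p_below_75:.4f}")
print(f"P(Z > 1.4) from rounded z = {table_above_90:.4f}")
```

The unrounded z-scores are about 1.39 and -2.78, so the software answers differ slightly from the table lookups at 1.4 and -2.80; the conclusions are the same.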
We will see some examples of how to use the sampling distribution of the sample mean in class activities, but it is a similar idea.