
Chapter

Chapter 8. Sampling Distributions. Section 8.1. Distribution of the Sample Mean. Statistics such as the “mean” (x̄) are random variables. Their values vary from sample to sample, so they have probability distributions associated with them.

xena-gross

Presentation Transcript


  1. Chapter 8 Sampling Distributions

  2. Chap 2

  3. Section 8.1 Distribution of the Sample Mean

  4. Statistics such as the sample mean (x̄) are random variables. Their values vary from sample to sample, so they have probability distributions associated with them. In this chapter we focus on the shape (is it normal?), center (peak), and spread (deviation) of the distribution of x̄.

  5. The sampling distribution of the sample mean is the probability distribution of all possible values of the random variable x̄. Take many samples of size n from a population whose mean is μ and standard deviation is σ, then plot just the means (x̄) of each sample.

  6. Illustrating Sampling Distributions • Step 1: Obtain a simple random sample of size “n” • Step 2: Compute the sample mean. • Step 3: Repeat Steps 1 and 2 until all possible simple random samples of size n have been obtained (in theory). In practice, take as many n-sized samples as practicable (time & money…).

  7. Sampling Distribution of the Sample Mean: Normal Population. The weights of pennies minted after 1982 are approximately normally distributed with mean μ = 2.46 g (grams) and standard deviation σ = 0.02 g (about 28 g = 1 oz). Approximate the sampling distribution of the sample mean by taking 200 simple random samples of size n = 5 pennies from this population. In other words, find the mean weight of each 5-penny sample and then plot just the 200 x̄'s, not the weights of all 1000 pennies.

  8. The data on the following slide represent the sample means for the 200 simple random samples, each of size n = 5. For example, the first sample of n = 5 pennies had the following weights (in g): 2.493, 2.466, 2.473, 2.492, 2.471. Note: x̄ = 2.479 g for this sample.

  11. Sample Means for Samples of Size n =5

  12. The mean of the 200 sample means is 2.46g, the same as the mean of the population. The standard deviation of the sample means (“standard error”) is 0.0086g, which is smaller than the standard deviation of the population. The next slide shows the histogram of all 200 sample means.

  13. What role does “n”, the size of each sample, play in the value of the standard deviation of the sample means? As the size “n” of each sample increases, the standard deviation of the distribution of the sample mean decreases.

  14. The Impact of Sample Size n on Sampling Variation (Variance, Std Dev). Approximate the distribution of the sample means by obtaining 200 simple random samples of size n = 20 pennies (not 5) from the same population of pennies minted after 1982 (μ = 2.46 g and σ = 0.02 g).

  15. The mean of the 200 sample means for n = 20 pennies is still 2.46 g, but the standard deviation is now 0.0045 g (it was 0.0086 g for n = 5). There is less variation in the distribution of the sample means with n = 20 than with n = 5, and the curve is more “normal”.
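The penny experiment on slides 7 to 15 can be imitated with a short simulation. This is only a sketch: the function name `simulate_sample_means`, the seed, and the use of Python's standard library are assumptions for illustration, and the simulated numbers will not match the slide's 0.0086 g and 0.0045 g exactly because every set of 200 samples is different.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, num_samples=200, seed=0):
    """Return the means of num_samples simple random samples of size n
    drawn from a normal population with mean mu and std dev sigma."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]

# Population values from the slides: mu = 2.46 g, sigma = 0.02 g.
for n in (5, 20):
    means = simulate_sample_means(2.46, 0.02, n)
    print(f"n = {n:2d}: mean of means = {statistics.fmean(means):.4f} g, "
          f"std dev of means = {statistics.stdev(means):.4f} g, "
          f"sigma/sqrt(n) = {0.02 / n ** 0.5:.4f} g")
```

The last column prints σ/√n so the simulated spread can be compared with the value the standard-error formula predicts for each sample size.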

  16. The Mean & Standard Deviation of the Sampling Distribution of x̄. If a random sample of size n is drawn from a large population with mean μ and std dev σ, then the distribution of the x̄'s will have mean μ_x̄ = μ and std dev σ_x̄ = σ/√n. σ_x̄ is the “standard error of the mean” or just “standard error”.

  17. Theorem If the parent population of the random variable X is normally distributed, the distribution of the sample means from that population will also be normally distributed.

  18. Distribution of the Sample Mean. The weights of pennies minted after 1982 (the population) are normally distributed with mean = 2.46 g and std dev = 0.02 g. What is the probability that, in a random sample of 10 of these pennies, the mean weight of the sample is at least 2.465 grams?

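  19. x̄ is normally distributed with μ_x̄ = 2.46 g and σ_x̄ = 0.02/√10 ≈ 0.0063 g, so z = (2.465 − 2.46)/0.0063 ≈ 0.79 and P(Z > 0.79) = 1 − 0.7852 = 0.2148. On the calculator: normcdf(-1E99, 0.79) = 0.7852. Or: normcdf(0.79, 1E99) = 0.2148. Or: normcdf(2.465, 1E99, 2.46, 0.0063) = 0.2137 (slightly different because z is not rounded to 0.79 first).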
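For readers without the calculator, the same tail probability can be checked with Python's standard library. This is a minimal sketch that assumes `statistics.NormalDist` as a stand-in for `normcdf`; the variable names are illustrative. Because it does not round σ_x̄ or z, the result lands near 0.215, between the two values quoted on the slide.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.46, 0.02, 10          # population values from slide 18
std_error = sigma / sqrt(n)            # sigma_xbar = sigma / sqrt(n)
xbar_dist = NormalDist(mu, std_error)  # sampling distribution of the mean

# P(xbar >= 2.465) is the upper-tail area beyond 2.465 g.
p_at_least = 1 - xbar_dist.cdf(2.465)
z = (2.465 - mu) / std_error

print(f"std error = {std_error:.4f} g, z = {z:.2f}, P = {p_at_least:.4f}")
```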

  20. Now let's describe the distribution of the sample means when the parent population is non-normal or (treated the same way) when we don't know whether the parent population is normal.

  21. Sampling from a Population that is Not Normal (or Unknown). The following table and histogram give the probability distribution for rolling a fair die: μ = 3.5, σ = 1.708. Note that the population distribution is uniform, NOT normal.

  22. Probability Experiment. Roll a die 4 times (n = 4) and find the mean value. Repeat this 200 times, calculating the sample mean for each of the 200 samples. Repeat for n = 10 and n = 30. That is, estimate the sampling distribution of x̄ by obtaining 200 simple random samples of size n = 4, n = 10, and n = 30. Histograms of the sampling distribution of the sample mean for each sample size are given on the next slide.

  24. Probability Experiment. 1. Roll 4 dice (n = 4) and find the mean value of the result. 2. Repeat this 200 times, calculating the sample mean for each of the 200 rolls. Repeat for n = 10 and n = 30. 3. Estimate the sampling distribution of x̄ for each n. Histograms of the distribution of the sample means for each sample size are given on the next slide.

  25. Key Points from Dice Experiment • The mean of the sampling distribution is equal to the mean of the parent population. • The std error of the distribution of the sample means is σ_x̄ = σ/√n, regardless of n. • The Central Limit Theorem: the shape of the distribution of the sample means becomes approximately normal as the sample size n increases (n ≥ 30 is the rule of thumb used here), regardless of the shape of the parent population, so z-scores can be used for sample means.
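The dice experiment is easy to repeat in code. The sketch below assumes a fair six-sided die simulated with `random.randint`; the function name `dice_sample_means` and the seed are illustrative choices, not part of the slides. It prints the center and spread of the 200 sample means for each n next to the predicted σ/√n.

```python
import random
import statistics

def dice_sample_means(n, num_samples=200, seed=1):
    """Roll a fair die n times, record the mean, and repeat num_samples times."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.randint(1, 6) for _ in range(n))
        for _ in range(num_samples)
    ]

faces = range(1, 7)
mu = statistics.fmean(faces)       # population mean of one roll: 3.5
sigma = statistics.pstdev(faces)   # population std dev of one roll: about 1.708

for n in (4, 10, 30):
    means = dice_sample_means(n)
    print(f"n = {n:2d}: mean of means = {statistics.fmean(means):.3f}, "
          f"std dev of means = {statistics.stdev(means):.3f}, "
          f"sigma/sqrt(n) = {sigma / n ** 0.5:.3f}")
```

Plotting a histogram of `means` for each n (with any plotting tool) shows the shape moving from flat-topped toward bell-shaped as n grows.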

  26. Using the Central Limit Theorem • Suppose that the mean time for an auto oil change at Karlos’ Speedy Oil Change is 11.4 minutes with a standard deviation of 3.2 min. • (a) If a random sample of n = 35 oil changes is selected, describe the sampling distribution of the sample mean. • (b) If a sample of n = 35 oil changes at Karlos’ is selected, what is the probability that the mean of these 35 oil change times is less than 11 minutes?

  27. Using the Central Limit Theorem. Suppose that the mean time for an oil change at a “10-minute oil change joint” is 11.4 minutes with a standard deviation of 3.2 minutes, and a random sample of n = 35 oil changes is selected. What is the probability that the mean of the 35 oil change times is less than 11 minutes? Solution: since n > 30, x̄ is approximately normally distributed with mean μ_x̄ = 11.4 min and std dev σ_x̄ = 3.2/√35 ≈ 0.5409 min. Then z = (11.0 − 11.4)/0.5409 ≈ −0.74 and P(Z < −0.74) = 0.2296. Or, on the calculator (2:Distr:2:): normcdf(-1E99, 11.0, 11.4, 0.5409) = 0.2298. In other words, if you did this experiment (n = 35) 100 times, in about 23 of them (roughly 1/4) the mean oil change time would be less than 11 minutes.
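The calculator call on this slide can be mirrored in Python. The `normcdf` helper below is a hypothetical function written for this sketch and named after the calculator command; it is built on `statistics.NormalDist` from the standard library.

```python
from math import sqrt
from statistics import NormalDist

def normcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the normal curve between lower and upper
    (hypothetical helper mimicking the calculator's normcdf)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

mu, sigma, n = 11.4, 3.2, 35       # values from the oil change problem
std_error = sigma / sqrt(n)        # sigma_xbar = sigma / sqrt(n)

# P(xbar < 11.0): area to the left of 11.0 under the sampling distribution.
p_less_than_11 = normcdf(-1e99, 11.0, mu, std_error)

print(f"std error = {std_error:.4f} min, P(xbar < 11) = {p_less_than_11:.4f}")
```

Passing -1e99 as the lower bound plays the same role as -1E99 on the calculator: it stands in for negative infinity.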

  28. Section 8.2 Distribution of the Sample Proportion (not Mean)

  29. Point Estimate of a Population Proportion. Suppose a random sample of size n is obtained from a population in which each outcome is binary (Yes/No; True/False; Success/Fail). The sample proportion p̂ (“p-hat”) is p̂ = x/n, where x is the number of “successes”. The sample proportion is a statistic that estimates the population proportion, ρ (Gr: rho).

  30. Computing a Sample Proportion. In a 2003 national Harris Poll, 1745 registered voters were asked whether they approved of the way President Bush was handling the economy (wthtm). 349 responded “yes”, and the other 1396 said various other unprintable things. Obtain a point estimate (%) for the proportion of registered voters who approved of the way President Bush was handling the economy.
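The point estimate asked for here is just p̂ = x/n applied to the poll counts. A minimal sketch of the arithmetic, with variable names chosen for illustration:

```python
x = 349    # respondents who answered "yes"
n = 1745   # registered voters polled

p_hat = x / n  # sample proportion, the point estimate of the population proportion

print(f"p-hat = {p_hat:.3f} ({p_hat:.1%})")
```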

  31. Using Simulation to Describe the Distribution of the Sample Proportion. According to a Time poll in 2008, 42% of registered voters believed that gay/lesbian couples should be allowed to (civil) marry. Describe the sampling distribution of the sample proportion for samples of size n = 10, 50, and 100. Note: We are using simulations to create the histograms on the following slides.

  32. Key Points from Time Poll • Shape: As the sample size n increases, the shape of the sampling distribution of the sample proportion becomes approximately normal. • Center: The mean of the sampling distribution of the sample proportion equals the population proportion, ρ. • Spread: The std dev of the sampling distribution of the sample proportion decreases as the sample size n increases.

  33. Sampling Distribution of p̂. For a simple random sample of size n with point estimate p̂ and population proportion ρ (written p in the formulas): 1. The shape of the sampling distribution of p̂ is approximately normal provided npq ≥ 10. 2. The mean of the sampling distribution of p̂ is μ_p̂ = p. 3. The standard deviation of the sampling distribution of p̂ is σ_p̂ = √(pq/n). Note: p + q = 1, or q = 1 − p.

  34. Normality for Qualitative Variables (Proportions). In order for us to use z-scores for probabilities of proportions, the distribution of p̂ must be approximately normal. Your text uses npq ≥ 10 for this requirement, but most books use both np ≥ 5 and nq ≥ 5 to assume normality.

  35. Normality Requirement. In this civil marriage problem, p = 0.42 (42% approve), so q = 1 − p = 0.58 (58% disapprove). If n = 50, then npq = 12.18 > 10, np = 21 > 5, and nq = 29 > 5, so both rules are satisfied. But if n = 20, then npq = 4.872 < 10 even though np = 8.4 > 5 and nq = 11.6 > 5, so the two rules disagree.
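The two normality rules can be compared side by side for any n and p. The function below is a hypothetical helper written for this sketch; the rule names in the returned dictionary are labels chosen here, not standard terminology.

```python
def normality_checks(n, p):
    """Evaluate both rules of thumb for treating p-hat as approximately normal."""
    q = 1 - p
    return {
        "npq": n * p * q,
        "text_rule": n * p * q >= 10,                # npq >= 10
        "common_rule": n * p >= 5 and n * q >= 5,    # np >= 5 and nq >= 5
    }

# Values from the civil marriage problem: p = 0.42, with n = 50 and n = 20.
for n in (50, 20):
    result = normality_checks(n, 0.42)
    print(f"n = {n}: npq = {result['npq']:.3f}, "
          f"text rule met: {result['text_rule']}, "
          f"common rule met: {result['common_rule']}")
```

For n = 20 the stricter npq ≥ 10 rule fails while the np ≥ 5 and nq ≥ 5 rule passes, which is the disagreement the slide points out.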

  36. Sampling Distribution of the Sample Proportion. According to a Time poll conducted in 2008, 42% of registered voters believed that gay/lesbian couples should be allowed to civil marry. Suppose that we obtain a random sample of 50 voters and determine which of those voters believe that gay/lesbian couples should be allowed to marry (“Success” = x). Describe the sampling distribution of the sample proportion for registered voters who believe that gay and lesbian couples should be allowed to marry.

  37. Proportion Marriage Survey. np(1 − p) = npq = 50(0.42)(0.58) = 12.18 ≥ 10. The sampling distribution of the sample proportion is therefore approximately normal with mean μ_p̂ = 0.42 (42% of the 50 = Success, 58% of the 50 = Fail) and standard deviation σ_p̂ = √(0.42 × 0.58/50) ≈ 0.0698.
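A simulation gives a rough check on this description of the sampling distribution. The sketch below assumes each voter independently answers "yes" with probability 0.42; the function name `simulate_sample_proportions`, the seed, and the choice of 2000 repetitions are all illustrative.

```python
import random
import statistics
from math import sqrt

def simulate_sample_proportions(p, n, num_samples=2000, seed=2):
    """Draw num_samples samples of n yes/no outcomes and return each p-hat."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.random() < p for _ in range(n)) / n
        for _ in range(num_samples)
    ]

p, n = 0.42, 50                          # values from the marriage survey
theoretical_sd = sqrt(p * (1 - p) / n)   # sigma_p-hat = sqrt(pq/n)
p_hats = simulate_sample_proportions(p, n)

print(f"theoretical: mean = {p:.3f}, std dev = {theoretical_sd:.4f}")
print(f"simulated:   mean = {statistics.fmean(p_hats):.3f}, "
      f"std dev = {statistics.stdev(p_hats):.4f}")
```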

  38. Compute Probabilities of a Sample Proportion. According to the Centers for Disease Control and Prevention (CDC) in 2004, 18.8% of school-aged children (aged 6-11 yrs) were overweight. • (a) In a random sample of 90 school-aged children, what is the probability that at least 19% are overweight? • (b) Suppose in a random sample of 90 school-aged children you find 25 overweight children. What might you conclude? Note: we are dealing with the % of children, not the quantity of children, because this variable is a proportion.

  39. Porky Pigs: (a) In a sample of 90 children, what is the probability that at least 19% are overweight? npq = 90(0.188)(0.812) ≈ 13.739 ≥ 10, so p̂ is approximately normal with mean μ_p̂ = 0.188 and standard deviation σ_p̂ = √(0.188 × 0.812/90) ≈ 0.0412.

  40. Porky Pigs: (b) In a sample of 90, you find 25 overweight children. What might you conclude? As before, p̂ is approximately normal with mean = 0.188 and standard deviation = 0.0412. Here p̂ = 25/90 ≈ 0.2778, so z = (0.2778 − 0.188)/0.0412 ≈ 2.179 and normcdf(2.179, 1E99) = 0.0147, the probability of finding at least 25 of 90 overweight. If the true population proportion is 0.188 (18.8% overweight), then finding 25/90, or 27.8%, overweight children in a sample is “unusual” (a data point more than 2σ from the mean). If we repeated this experiment 100 times, we would expect to get this many overweight children (25/90 or more) only about 1 or 2 times.
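Both parts of the overweight-children problem can be worked with the same normal model. This is a sketch using `statistics.NormalDist` in place of the calculator; the variable names are illustrative. The slides do not show a final number for part (a), so the value printed for it (roughly 0.48) is this sketch's own calculation rather than a figure from the source.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.188, 90                         # population proportion and sample size
std_error = sqrt(p * (1 - p) / n)        # sigma_p-hat = sqrt(pq/n)
p_hat_dist = NormalDist(p, std_error)    # approximate distribution of p-hat

# (a) P(p-hat >= 0.19): at least 19% of the sample overweight.
p_a = 1 - p_hat_dist.cdf(0.19)

# (b) P(p-hat >= 25/90): at least 25 of the 90 children overweight.
p_b = 1 - p_hat_dist.cdf(25 / 90)

print(f"std error = {std_error:.4f}")
print(f"(a) P(p-hat >= 0.19)  = {p_a:.4f}")
print(f"(b) P(p-hat >= 25/90) = {p_b:.4f}")
```

A probability near 0.015 for part (b) is what supports calling 25 overweight children out of 90 an unusual result if the true proportion is 18.8%.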

  41. Chap 2
