Chapter 10: Estimating With Confidence

10.1 – Confidence Intervals: The basics

Statistical Inference: Using sample data to draw conclusions about a population.
Note: Each sample may vary, but the population parameter doesn’t!

Sampling Distribution:
• If the population is approximately normal, so is the sampling distribution
• If the population is skewed, the sampling distribution is approximately normal if n ≥ 30, by the central limit theorem
• If given sample data, look at the distribution to assess normality if needed (Normal Prob. Plot)

Confidence Interval:
• Uses the sampling distribution to estimate the population parameter
• It is an interval of numbers above and below the sample statistic

Confidence Level: The probability the interval will capture the true parameter value in repeated samples
Critical Value (Z*): The value with probability p lying to its right under the standard Normal curve

Margin of Error:
• How accurate our estimate is, based on the variability of the sampling distribution. We add and subtract this from our estimate.
estimate ± margin of error
Caution! Margin of error covers only random sampling error. It does not include errors in collecting the data!

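Most Common Critical Values
Confidence Level (C): 90% · Upper tail prob.: 0.05 · Z* Value: 1.645
(Figure: standard Normal curve with central area 0.90 and area 0.05 in each tail; the boundaries of the central area are the critical values, marked Z = ?)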

Calculator Tip: Critical Values
2nd Dist – invNorm( (1 + C)/2 )
OR: Look at the T-Tables for the most common ones! (You will learn more about them later)
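The calculator tip above can also be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `z_critical` is hypothetical, and it relies on `NormalDist.inv_cdf` from the standard library to play the role of invNorm.

```python
from statistics import NormalDist


def z_critical(confidence_level: float) -> float:
    """Return z* so that the central area under the standard Normal curve is confidence_level."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    # Same computation as invNorm((1 + C) / 2) on the calculator.
    return NormalDist().inv_cdf((1 + confidence_level) / 2)


for c in (0.90, 0.95, 0.99):
    upper_tail = (1 - c) / 2  # area to the right of z*
    print(f"C = {c:.0%}  upper tail = {upper_tail:.3f}  z* = {z_critical(c):.3f}")
```

For C = 90% this reproduces the tabled value 1.645 with upper tail probability 0.05.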

Confidence Interval for a Population Mean (σ known) (Z-Interval)
estimate ± margin of error
estimate ± critical value × standard error
$\bar{x} \pm z^* \cdot \frac{\sigma}{\sqrt{n}}$

Properties of Confidence Intervals
• The interval is always centered around the statistic
• The higher the confidence level, the wider the interval becomes
• If you increase n, then the margin of error decreases
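The last two properties can be checked numerically. The sketch below assumes the z-interval margin of error z* · σ/√n; the value σ = 10 and the sample sizes are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(confidence_level: float, sigma: float, n: int) -> float:
    """Margin of error z* * sigma / sqrt(n) for a z-interval."""
    z_star = NormalDist().inv_cdf((1 + confidence_level) / 2)
    return z_star * sigma / sqrt(n)


SIGMA = 10.0  # hypothetical population standard deviation

# Higher confidence level -> larger margin of error (wider interval).
for c in (0.90, 0.95, 0.99):
    print(f"n = 25, C = {c:.0%}: m = {margin_of_error(c, SIGMA, 25):.3f}")

# Larger n -> smaller margin of error; quadrupling n halves it.
for n in (25, 100, 400):
    print(f"C = 95%, n = {n}: m = {margin_of_error(0.95, SIGMA, n):.3f}")
```

Because n sits under a square root, cutting the margin of error in half takes four times as many observations.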

Calculator Tip: Z-Interval
Stat – Tests – ZInterval
Data: If given actual values
Stats: If given summary of values

Interpreting a Confidence Interval:
What you will say: I am C% confident that the true parameter is captured in the interval.
What it means: If we took many, many SRSs from a population and calculated a confidence interval for each sample, about C% of the confidence intervals would contain the true mean.
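The "many, many samples" idea can be simulated. The following is a rough sketch under stated assumptions: a Normal population with hypothetical values μ = 100 and σ = 15, samples of size 30, and a fixed random seed so the run is repeatable. The function name `coverage_rate` is made up for this illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage_rate(mu, sigma, n, confidence_level, trials, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf((1 + confidence_level) / 2)
    margin = z_star * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one simple random sample and build its interval.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


rate = coverage_rate(mu=100.0, sigma=15.0, n=30, confidence_level=0.95, trials=2000)
print(f"Share of intervals capturing the true mean: {rate:.3f}")
```

The printed share should land close to 0.95. Each individual interval still either contains μ or misses it; the 95% describes the method over many samples.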

CAUTION! Never say: This interval will capture the true mean C% of the time. Once it is calculated, an interval either does or does not!

Conditions for a Z-Interval:
1. SRS (the problem should say so)
2. Normality (CLT or population approx normal)
3. Independence (Population ≥ 10 × sample size)

Steps to Construct ANY Confidence Interval: PANIC
P: Parameter of Interest (what are you looking for?)
A: Assumptions (what are the conditions?)
N: Name the type of interval (what type of data do we have?)
I: Interval (Finally! You can calculate!)
C: Conclusion in context (I am ___% confident the true parameter lies between ________ and _________)

Example #1 Serum Cholesterol: Dr. Paul Oswick wants to estimate the true mean serum HDL cholesterol for all of his 20-29 year old female patients. He randomly selects 30 patients and computes the sample mean to be 50.67. Assume from past records, the population standard deviation for the serum HDL cholesterol for 20-29 year old female patients is σ = 13.4.
a. Construct a 95% confidence interval for the mean serum HDL cholesterol for all of Dr. Oswick’s 20-29 year old female patients.
P: The true mean serum HDL cholesterol for all of Dr. Oswick’s 20-29 year old female patients.

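A: SRS: Says randomly selected
Normality: Approximately normal by the CLT (n ≥ 30)
Independence: I am assuming that Dr. Oswick has 300 patients or more.
N: One-sample Z-Interval
I: $\bar{x} \pm z^* \cdot \frac{\sigma}{\sqrt{n}} = 50.67 \pm 1.96 \cdot \frac{13.4}{\sqrt{30}} = (45.875, 55.465)$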

C: I am 95% confident the true mean serum HDL cholesterol for all of Dr. Oswick’s 20-29 year old female patients is between 45.875 and 55.465
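As a check on the arithmetic, here is a small sketch of the calculator's ZInterval "Stats" option in Python. The function name `z_interval` is an assumption for this illustration; the inputs are the values given in the example.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence_level):
    """Return (lower, upper) for x_bar +/- z* * sigma / sqrt(n)."""
    z_star = NormalDist().inv_cdf((1 + confidence_level) / 2)
    margin = z_star * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


# Example #1: x-bar = 50.67, sigma = 13.4, n = 30, 95% confidence.
lower, upper = z_interval(x_bar=50.67, sigma=13.4, n=30, confidence_level=0.95)
print(f"95% z-interval: ({lower:.3f}, {upper:.3f})")
```

This agrees with the interval quoted in the conclusion, 45.875 to 55.465.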

Example #1 (continued)
b. If the US National Center for Health Statistics reports the mean serum HDL cholesterol for females between 20-29 years old to be μ = 53, do Dr. Oswick’s patients appear to have a different serum level compared to the general population? Explain.
No, 53 is contained in the interval (45.875, 55.465), so it is a plausible value for the mean of his patients.

Example #1 (continued)
c. What two things could you do to decrease your margin of error?
• Increase n
• Lower the confidence level

Example #2 Suppose your class is investigating the weights of Snickers 1-ounce Fun-Size candy bars to see if customers are getting full value for their money. Assume that the weights are Normally distributed with standard deviation σ = 0.005 ounces. Several candy bars are randomly selected and weighed with sensitive balances borrowed from the physics lab. The weights are 0.95, 1.02, 0.98, 0.97, 1.05, 1.01, 0.98, 1.00 ounces. Determine a 90% confidence interval for the true mean, µ. Can you say that the bars weigh 1 oz on average?
P: The true mean weight of Snickers 1-oz Fun-Size candy bars

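A: SRS: Says randomly selected
Normality: Approximately normal because the population is approximately normal
Independence: I am assuming that Snickers has 80 bars or more in the 1-oz size
N: One-sample Z-Interval
I: $\bar{x} \pm z^* \cdot \frac{\sigma}{\sqrt{n}} = 0.995 \pm 1.645 \cdot \frac{0.005}{\sqrt{8}} = (0.9921, 0.9979)$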

C: I am 90% confident the true mean weight of Snickers 1-oz Fun-Size candy bars is between 0.9921 and 0.9979 ounces. Because 1 oz is not in this interval, I am not confident at the 90% level that the candy bars weigh as advertised.
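When the raw values are given (the calculator's "Data" option), the sample mean has to be computed first. A minimal sketch using the eight weights and σ from the example; the variable names are chosen only for this illustration.

```python
from math import sqrt
from statistics import NormalDist, fmean

weights = [0.95, 1.02, 0.98, 0.97, 1.05, 1.01, 0.98, 1.00]  # ounces, from the example
SIGMA = 0.005       # population standard deviation stated in the example
CONFIDENCE = 0.90

n = len(weights)
x_bar = fmean(weights)
z_star = NormalDist().inv_cdf((1 + CONFIDENCE) / 2)
margin = z_star * SIGMA / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"x-bar = {x_bar:.4f}, margin = {margin:.4f}")
print(f"90% z-interval: ({lower:.4f}, {upper:.4f})")
print("1 oz inside interval:", lower <= 1.0 <= upper)
```

The interval matches the one in the conclusion, and the final line confirms that 1 oz falls outside it.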

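Choosing a Sample Size for a specific margin of error
Set $z^* \cdot \frac{\sigma}{\sqrt{n}} \le m$ and solve for n: $n \ge \left( \frac{z^* \sigma}{m} \right)^2$
Note: Always round up! You can’t have part of a person! Ex: 163.2 rounds up to 164.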

Example #3 A statistician calculates a 95% confidence interval for the mean income of the depositors at Bank of America, located in a poverty-stricken area. The confidence interval is $18,201 to $21,799.
a. What is the sample mean income?
The interval is centered on the sample mean, so $\bar{x}$ = (18,201 + 21,799)/2 = 20,000

Example #3 (continued)
b. What is the margin of error?
m = 21,799 – 20,000
m = 1,799
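Both parts follow from the interval being centered on the statistic. A short sketch, with a hypothetical helper name `center_and_margin`:

```python
def center_and_margin(lower: float, upper: float) -> tuple[float, float]:
    """Recover the point estimate and margin of error from interval endpoints."""
    estimate = (lower + upper) / 2  # the interval is centered on the statistic
    margin = (upper - lower) / 2    # half the interval width
    return estimate, margin


x_bar, m = center_and_margin(18_201, 21_799)
print(f"sample mean = {x_bar:,.0f}, margin of error = {m:,.0f}")
```

Half the interval width gives the same margin of error as subtracting the midpoint from the upper endpoint.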

Example #4 A researcher wishes to estimate the mean number of miles on four-year-old Saturn SCI’s. How many cars should be in a sample in order to estimate the mean number of miles within a margin of error of ± 1000 miles with 99% confidence, assuming σ = 19,700?
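One way to set this up, assuming the z-interval margin of error z* · σ/√n must be at most m, is to solve for n and round up. The helper name `required_sample_size` is hypothetical, and the sketch takes z* = 2.576, the rounded table value for 99% confidence.

```python
from math import ceil


def required_sample_size(z_star: float, sigma: float, margin: float) -> int:
    """Smallest whole n with z_star * sigma / sqrt(n) <= margin."""
    if margin <= 0:
        raise ValueError("margin must be positive")
    # Solve z* * sigma / sqrt(n) <= m for n, then always round up.
    return ceil((z_star * sigma / margin) ** 2)


# z* = 2.576 is the rounded table value for 99% confidence (an assumption here).
n = required_sample_size(z_star=2.576, sigma=19_700, margin=1_000)
print(f"cars needed: {n}")
```

With the table value this comes to 2576 cars. An unrounded critical value from invNorm can shift the result by one, so it is worth stating which z* was used.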

10.2 – Estimating a Population Mean

In 10.1 we made the unrealistic assumption that the population standard deviation σ was known and could be used to calculate confidence intervals.

Standard Error: The standard deviation of a statistic when it is estimated from the data. For the sample mean, the standard error is s/√n.
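For a sample mean, the estimate replaces σ with the sample standard deviation s. A brief sketch, reusing the eight candy-bar weights from Example #2 of 10.1 purely as illustrative data:

```python
from math import sqrt
from statistics import stdev

weights = [0.95, 1.02, 0.98, 0.97, 1.05, 1.01, 0.98, 1.00]  # data reused from 10.1, Example #2

s = stdev(weights)  # sample standard deviation (divides by n - 1)
standard_error = s / sqrt(len(weights))
print(f"s = {s:.4f}, standard error of the mean = {standard_error:.4f}")
```

Note that `statistics.stdev` uses the n – 1 divisor, which matches the degrees of freedom used with the t-distribution.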

When we know σ we can use the Z-table to make a confidence interval. But when we don’t know it, we have to use something else!

Properties of the t-distribution:
• Used when σ is unknown
• Degrees of Freedom = n – 1
• More variable than the normal distribution (it has fatter tails than the normal curve)
• Approaches the normal distribution when the degrees of freedom are large (sample size is large)
• The t-table gives the area to the right of the t-value
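The "fatter tails" and "approaches the normal" properties can be seen by comparing densities far from the center. This sketch writes out the standard Student's t density by hand, since the standard library has no t-distribution; the comparison point x = 3 and the chosen degrees of freedom are arbitrary.

```python
from math import exp, lgamma, log, log1p, pi
from statistics import NormalDist


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_coef - (df + 1) / 2 * log1p(x * x / df))


X = 3.0  # a point far out in the tail
print(f"standard Normal density at {X}: {NormalDist().pdf(X):.5f}")
for df in (2, 5, 30, 200):
    print(f"t density at {X} with df = {df:>3}: {t_pdf(X, df):.5f}")
```

The t densities at the tail point are all larger than the Normal density and shrink toward it as the degrees of freedom grow.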

Properties of the t-distribution:
• If n < 15: if the population is approx normal, then so is the sampling distribution. If the data are clearly non-Normal or if outliers are present, don’t use t procedures!
• If n ≥ 15: the sampling distribution is approximately normal, except if the population has outliers or strong skewness
• If n ≥ 30: the sampling distribution is approximately normal, even if the population has outliers or strong skewness

Example #1 Determine the degrees of freedom and use the t-table to find probabilities for each of the following:
df = 10, t = 1.093: upper-tail probability = 0.15

Example #1 (continued)
df = 23, t = 0.685
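A t-table lookup can be cross-checked numerically. The sketch below is only an approximation under stated assumptions: it hand-codes the Student's t density and integrates it with Simpson's rule, because the standard library has no t-distribution. The function names are hypothetical.

```python
from math import exp, lgamma, log, log1p, pi


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_coef - (df + 1) / 2 * log1p(x * x / df))


def t_upper_tail(t_value: float, df: int, steps: int = 2000) -> float:
    """Approximate P(T > t_value) for t_value >= 0 with Simpson's rule on [0, t_value]."""
    if t_value < 0:
        raise ValueError("this sketch only handles t_value >= 0")
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = t_value / steps
    total = t_pdf(0.0, df) + t_pdf(t_value, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    # The curve is symmetric about 0, so the area to the right of 0 is 0.5.
    return 0.5 - area_zero_to_t


for df, t_value in ((10, 1.093), (23, 0.685)):
    print(f"df = {df}, t = {t_value}: area to the right = {t_upper_tail(t_value, df):.3f}")
```

For df = 10 and t = 1.093 the computed area agrees with the tabled upper-tail probability of 0.15.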
