# Chapter 8


1. Introduction

2. Graphs

3. Descriptive statistics

4. Basic probability

5. Discrete distributions

6. Continuous distributions

7. Central limit theorem

8. Estimation

9. Hypothesis testing

10. Two-sample tests

13. Linear regression

14. Multivariate regression

### Chapter 8

Confidence Interval Estimation and Statistical Inference

• Statistical inference is the process by which we acquire information and draw conclusions about populations from samples.

• In order to do inference, we require the skills and knowledge of descriptive statistics, probability distributions, and sampling distributions.

Towson University - J. Jung

• There are two types of inference: estimation and hypothesis testing; estimation is introduced first.

• The objective of estimation is to determine the approximate value of a population parameter on the basis of a sample statistic.

• E.g., the sample mean ($\bar{x}$) is employed to estimate the population mean ($\mu$).

• There are two types of estimators:

• Point Estimator

• Interval Estimator

• A point estimator draws inferences about a population by estimating the value of an unknown parameter using a single value or point.

• We saw earlier that point probabilities in continuous distributions are virtually zero. Likewise, we’d expect a point estimator to get closer to the parameter value as the sample size increases, but a single point does not reflect the effect of a larger sample size. Hence we also use interval estimators.

• An interval estimator draws inferences about a population by estimating the value of an unknown parameter using an interval.

• That is, we say with some ___% confidence that the population parameter of interest lies between a lower and an upper bound.

Interval Estimator

• The interval is called a confidence interval (C.I.).

• The chosen probability is called the level of confidence.

• An interval estimate is centered on a point estimate and is reported by the endpoints of its range.

• Example: Suppose we want to estimate the mean summer income of a class of business students. For n = 25 students,

• $\bar{x}$ is calculated to be 400 \$/week.

• Point estimate: 400 \$/week; C.I.: 400 ± 20 \$/week; level of confidence: 95%.

• An alternative statement is:

• The mean income is between 380 and 420 \$/week at the 95% level of confidence.

We can calculate an interval estimator from a sampling distribution, by:

• Drawing a sample of size n from the population

• Calculating its mean, $\bar{x}$

• When X is normally (or approximately normally) distributed, it can be normalized:

$$Z = \frac{X - \mu}{\sigma}$$

• And the random variable Z will have a standard normal (or approximately standard normal) distribution!!

• What is the probability of:

• Now the other way around, what are the z-scores when:

$$P(-z < Z < z) = 0.95$$

• Again for:

$$P(-z < Z < z) = 0.90$$

• Hint: Use =norm.s.dist or =norm.s.inv appropriately.
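The hint refers to Excel functions; the same lookups can be sketched in Python with the standard library's `statistics.NormalDist`. The helper names `central_probability` and `critical_z` are illustrative and do not come from the slides.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def central_probability(z):
    """P(-z < Z < z): the cumulative-probability lookup, as with =norm.s.dist."""
    return std_normal.cdf(z) - std_normal.cdf(-z)

def critical_z(central_prob):
    """The z with P(-z < Z < z) = central_prob: the inverse lookup, as with =norm.s.inv."""
    tail = (1 - central_prob) / 2       # probability left in each tail
    return std_normal.inv_cdf(1 - tail)

print(round(central_probability(1.96), 4))  # about 0.95
print(round(critical_z(0.95), 3))           # about 1.96
print(round(critical_z(0.90), 3))           # about 1.645
```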

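• Now we know that:

$$P(-1.96 < Z < 1.96) = 0.95$$

• We know from the CLT that, at least approximately:

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$

• We can now normalize this random variable:

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

• Replace Z with the normalized expression:

$$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 0.95$$

• Now do a bunch of algebra to get:

$$P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$

• With 95% probability the estimated interval was:

$$\bar{X} \pm 1.96\frac{\sigma}{\sqrt{n}}$$

• With a 90% probability the interval is smaller:

$$\bar{X} \pm 1.645\frac{\sigma}{\sqrt{n}}$$

• In general, the formula is:

$$\bar{X} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

• $1 - \alpha$ is called the level of confidence!!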

Confidence interval

[Figure: a confidence interval shown relative to the true, but unknown, parameter.]

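Estimating $\mu$ when $\sigma$ is known

• Thus, the probability that the interval:

$$\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$$

contains the population mean $\mu$ is $1 - \alpha$.

• This is a confidence interval estimator for $\mu$.

• The confidence interval is abbreviated as: C.I.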

• …the actual location of the population mean $\mu$ may be here… or here… or possibly even here…

The population mean is a fixed but unknown quantity. It’s incorrect to interpret the confidence interval estimate as a probability statement about $\mu$. The interval gives the lower and upper limits of the interval estimate of the population mean.

$\alpha$ - the probability in the tails, i.e. the likelihood of a certain type of error or mistake.

level of confidence = $1 - \alpha$

$z_{\alpha/2}$ is called the critical value: the z-score that leaves half of alpha ($\alpha/2$) in the upper tail.

$z_{\alpha/2}\,\sigma/\sqrt{n}$ is called the margin of error, denoted by e.

Therefore, the C.I. is the interval [point estimate – e, point estimate + e].

4 Commonly used Confidence Levels…

Table 10.1: commonly used confidence levels (cut & keep handy!)

• A computer company samples demand during a sales period over 25 sales periods:

• It is known that the standard deviation of demand during a sales period is 75 computers.

• We want to estimate the mean demand of a sales period with 95% confidence in order to set inventory levels correctly.

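• In order to use our confidence interval estimator, we need the following pieces of data:

• $\bar{x} = 370.16$, calculated from the data;

• $z_{\alpha/2} = z_{0.025} = 1.96$, from Stats Tables or Excel;

• $\sigma = 75$ and $n = 25$, which are given.

• therefore:

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 370.16 \pm 1.96 \times \frac{75}{\sqrt{25}} = 370.16 \pm 29.40$$

• So the 95% C.I. is (340.76, 399.56).

• Interpretation: Intervals obtained in this way contain $\mu$ 95% of the time.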
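The calculation for this example can be sketched in Python as follows. The sample mean of 370.16 is not stated explicitly in the transcript; it is assumed here as the midpoint of the reported interval (340.76, 399.56). The function name `z_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for the mean when the population sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    margin = z * sigma / sqrt(n)             # margin of error e
    return sample_mean - margin, sample_mean + margin

# Assumed sample mean: midpoint of the interval reported on the slide.
lower, upper = z_interval(sample_mean=370.16, sigma=75, n=25, confidence=0.95)
print(round(lower, 2), round(upper, 2))  # 340.76 399.56
```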

• A confidence interval either does or does not contain $\mu$.

• The confidence level quantifies the risk.

• Out of 100 confidence intervals constructed at the 95% level, approximately 95 would contain $\mu$, while approximately 5 would not.
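This long-run reading of the confidence level can be illustrated with a small simulation. The sketch below assumes a normal population with hypothetical values (mean 370, standard deviation 75, samples of size 25) chosen only for illustration; the share of intervals that cover the true mean should land near 0.95.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Share of simulated known-sigma intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        sample_mean = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if sample_mean - margin < mu < sample_mean + margin:
            hits += 1
    return hits / trials

# Hypothetical population parameters, for illustration only.
share = coverage(mu=370, sigma=75, n=25, confidence=0.95, trials=2000)
print(share)  # expected to be close to 0.95
```

Any single interval either contains the mean or it does not; only the procedure has a 95% success rate.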

• A wide interval provides little information.

• For example, suppose we estimate with 95% confidence that an accountant’s average starting salary is between \$15,000 and \$100,000.

• Contrast this with:

• a 95% confidence interval estimate of starting salaries between \$42,000 and \$45,000.

• The second estimate is much narrower, providing accounting students more precise information about starting salaries.

• A larger confidence level produces a wider confidence interval.

• Larger values of $\sigma$ produce wider confidence intervals.

• Increasing the sample size decreases the width of the confidence interval while the confidence level can remain unchanged.

• More data provides better estimates

Selecting the Sample Size!

• We can control the width of the interval by determining the sample size necessary to produce narrow intervals.

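• Suppose we want to estimate the mean demand “to within 5 units”; i.e. we want the interval estimate to be:

$$\bar{x} \pm 5$$

• Since the interval estimator is:

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

• It follows that:

$$z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 5$$

• Solve for n to get the required sample size:

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{5}\right)^2 = \left(\frac{1.96 \times 75}{5}\right)^2 = 864.36$$

• That is, to produce a 95% confidence interval estimate of the mean (±5 units), we need to sample 865 sales periods (vs. the 25 data points we have currently).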

• In general, estimating a population mean with an interval estimate of:

$$\bar{x} \pm W$$

• requires a sample size at least this large:

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{W}\right)^2$$
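A minimal sketch of the sample-size calculation in Python, using the demand example's values (standard deviation 75, half-width 5, 95% confidence). The function name `sample_size_for_mean` is illustrative, and the result is rounded up because n must be a whole number at least as large as the formula's value.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, half_width, confidence):
    """Smallest n whose known-sigma interval is no wider than +/- half_width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    return ceil((z * sigma / half_width) ** 2)          # always round up

print(sample_size_for_mean(sigma=75, half_width=5, confidence=0.95))  # 865
```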

• A lumber company must estimate the mean diameter of trees to determine whether or not there is sufficient lumber to harvest an area of forest.

• They need to estimate this to within 1 inch at a confidence level of 99%.

• The tree diameters are normally distributed with a standard deviation of 6 inches.

• How many trees need to be sampled?

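Example

Things we know:

• Confidence level = 99%, therefore $\alpha = .01$.

• We want $\bar{x} \pm 1$, hence W = 1.

• We are given that $\sigma = 6$.

• We compute:

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{W}\right)^2 = \left(\frac{2.575 \times 6}{1}\right)^2 = 238.7$$

• That is, we will need to sample at least 239 trees to have a 99% confidence interval of $\bar{x} \pm 1$.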

Inference with unknown variance!

• Previously, we estimated the population mean when the population standard deviation $\sigma$ was known or given.

• When $\sigma$ is unknown, we use its point estimator s,

• and the z-statistic is replaced by the t-statistic, where the number of “degrees of freedom” is v = n–1.

• NOTE: To use “z” or “t”, we require that X-bar has a NORMAL distribution.

Estimating $\mu$ when $\sigma$ is unknown!

• When the population standard deviation is unknown and the population is normal, the statistic is:

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

• which is Student t distributed with v = n–1 degrees of freedom. The confidence interval estimator of $\mu$ is given by:

$$\bar{x} \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$$

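Estimating $\mu$ when $\sigma$ is unknown

• Thus, the probability that the interval:

$$\left[\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\right]$$

contains the population mean $\mu$ is $1 - \alpha$.

• This is a confidence interval estimator for $\mu$.

• Use =t.inv to get the critical t scores.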

• A random sample of n = 83 companies resulted in average sales of \$15.02 with a variance of 68.98.

• Please construct an interval estimator for average sales with 95% confidence.

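• From the data, we calculate:

$$\bar{x} = 15.02, \qquad s = \sqrt{68.98} \approx 8.31$$

• For this term, =T.INV(0.025,82) returns the negative critical value, so we use its absolute value:

$$t_{\alpha/2,\,n-1} = t_{0.025,\,82} \approx 1.989$$

• and so:

$$\bar{x} \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} = 15.02 \pm 1.989 \times \frac{8.31}{\sqrt{83}} \approx 15.02 \pm 1.81$$

• We are confident that 95% of similarly constructed confidence intervals contain the true population mean.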

To get the negative t value that has the specified probability $\alpha$ to the left:

t1 = t.inv($\alpha$, n-1)

so that P(T < t1) = $\alpha$

• When data are nominal, we count the number of occurrences of each value and calculate proportions.

• Thus, the parameter of interest in describing a population of nominal data is the population proportion π.

• This parameter is based on the binomial experiment.

• Recall the use of this statistic:

$$p = \frac{x}{n}$$

• where p is the sample proportion: x successes in a sample of n items.

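• When nπ and n(1–π) are both at least 5, the sampling distribution of p is approximately normal with:

$$E(p) = \pi, \qquad \sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$$

• Thus, the following is approximately standard normal:

$$Z = \frac{p - \pi}{\sqrt{\pi(1-\pi)/n}}$$

• The confidence interval estimator for π is given by:

$$p \pm z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$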

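• The confidence interval estimator for a population proportion is:

$$p \pm z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$

• Thus the (half) width of the interval (W) is:

$$W = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$

• Solving for n, we have:

$$n = \left(\frac{z_{\alpha/2}\sqrt{p(1-p)}}{W}\right)^2$$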

• For example, we want to know how many customers to survey in order to estimate the proportion of customers who prefer our brand to within 0.03 (with 95% confidence).

• i.e. our confidence interval after surveying will be p ± 0.03, that means W=0.03

• Substituting into the equation:

$$n = \left(\frac{1.96\sqrt{p(1-p)}}{0.03}\right)^2$$

Uh oh. Since we haven’t taken a sample yet, we don’t have this sample proportion p…

• Two methods: in each case we choose a value for p, then solve the equation for n.

• Method 1: no knowledge of even a rough value of p. This is a ‘worst case scenario’, so we substitute p = 0.50.

• Method 2: we have some idea about the value of p. This is a better scenario, and we substitute our estimated p value.

• E.g., we draw a sample and get a p; we can then use this p to solve for the n of the next sample that would give us the interval estimate with the required precision.

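• Method 1: no knowledge of the value of p, use 50%:

$$n = \left(\frac{1.96\sqrt{(0.5)(0.5)}}{0.03}\right)^2 = 1067.11, \text{ so } n = 1068$$

• Method 2: p from the last sample is, say, 20%:

$$n = \left(\frac{1.96\sqrt{(0.2)(0.8)}}{0.03}\right)^2 = 682.95, \text{ so } n = 683$$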

• Thus, we can sample fewer people if we already have a reasonable estimate of the population proportion before starting.
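Both methods can be compared with a short Python sketch. The function name `sample_size_for_proportion` is illustrative; the inputs (W = 0.03, 95% confidence, p of 0.50 or 0.20) are the ones used in this example.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p_guess, half_width, confidence):
    """Smallest n so that z * sqrt(p(1-p)/n) does not exceed half_width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    return ceil(z ** 2 * p_guess * (1 - p_guess) / half_width ** 2)

worst_case = sample_size_for_proportion(0.50, 0.03, 0.95)  # Method 1
informed = sample_size_for_proportion(0.20, 0.03, 0.95)    # Method 2
print(worst_case, informed)  # 1068 683
```

The product p(1-p) is largest at p = 0.50, which is why Method 1 gives the most conservative sample size.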

• A Gallup Poll stated with 95% confidence that the proportion of Marylanders supporting President Bush's proposal for revising Social Security was 56%, with a margin of error of 3%. The number of persons polled was 1052.

• Verify this result.

• Step One: Identify the Random Variable: p

• Center: p = 0.56

• Step Two: Determine Its Distribution

• Standard Error: SQRT(0.56*0.44/1052) = 0.0153

• Shape: 0.56*1052 = 589 > 5 and 0.44*1052 = 463 > 5, so the sampling distribution is approximately normal.

• Margin of Error: 0.56 ± NORM.S.INV(0.025)*0.0153 = 0.56 ± 0.03

• Estimate the two values between which 99.7% of similar sample proportions might lie.

• 0.56 ± NORM.S.INV(0.0015)*0.0153 = 0.56 ± 0.0454, i.e. about ±4.54 percentage points.

• So the interval increased in size, because the probability that this interval covers the true population proportion is larger.
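The poll figures can be checked with a few lines of Python, a sketch using only the numbers quoted above (p = 0.56, n = 1052). The function name `margin_of_error` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.56, 1052
standard_error = sqrt(p * (1 - p) / n)  # about 0.0153

def margin_of_error(confidence):
    """Half-width of the interval for the proportion at the given confidence."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * standard_error

print(round(standard_error, 4))          # 0.0153
print(round(margin_of_error(0.95), 2))   # 0.03, matching the reported margin
print(round(margin_of_error(0.997), 4))  # about 0.0454
```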
