Inference on a Single Mean

### Inference on a Single Mean

L. Wang, Department of Statistics

University of South Carolina

Use Calculation from Sample to Estimate Population Parameter

[Diagram: a sample is selected from the population; a statistic is calculated from the sample; the statistic is used to estimate the parameter; the parameter describes the population.]


Statistic vs. Parameter

Statistic
• Describes a sample.
• Always known.
• Changes upon repeated sampling.
• Examples: the sample mean $\bar{y}$, the sample standard deviation $s$

Parameter
• Describes a population.
• Usually unknown.
• Is fixed.
• Examples: the population mean μ, the population standard deviation σ

A Statistic is a Random Variable
• Upon repeated sampling of the same population, the value of a statistic changes.
• While we don’t know what the next value will be, we do know the overall pattern over many, many samplings.
• The distribution of possible values of a statistic for repeated samples of the same size from a population is called the sampling distribution of the statistic.


Sampling Distribution of $\bar{Y}$
• If a random sample of size n is taken from a normal population having mean $\mu_y$ and variance $\sigma_y^2$, then the sample mean $\bar{Y}$ is a random variable that is also normally distributed, with mean $\mu_y$ and variance $\sigma_y^2/n$.

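Sampling Distribution of $\bar{Y}$

[Figure: normal curves N(100, 5), N(100, 3.54), N(100, 1.58) and N(100, 1), where the second argument is the standard deviation. These are the sampling distributions of $\bar{Y}$ for n = 1, 2, 10 and 25 from a population with mean 100 and standard deviation 5.]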

Light Bulbs
• The life of a light bulb is normally distributed with a mean of 2000 hours and standard deviation of 300 hours.
• What is the probability that a randomly chosen light bulb will have a life of less than 1700 hours?
• What is the probability that the mean life of three randomly chosen light bulbs will be less than 1700 hours?
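The two light-bulb questions can be checked numerically. The following is a minimal sketch using Python's standard-library `NormalDist`; the helper name `prob_mean_below` is made up for illustration and is not part of the slides.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 2000.0, 300.0  # bulb life in hours, as given on the slide

def prob_mean_below(limit, n):
    """P(mean of n bulb lives < limit) when single lives are N(MU, SIGMA)."""
    # The mean of n independent lives has standard deviation SIGMA / sqrt(n).
    return NormalDist(MU, SIGMA / sqrt(n)).cdf(limit)

p_single = prob_mean_below(1700, 1)     # z = -1
p_mean_of_3 = prob_mean_below(1700, 3)  # z = -sqrt(3)
print(f"{p_single:.4f} {p_mean_of_3:.4f}")
```

Averaging three bulbs shrinks the standard deviation from 300 to about 173 hours, so a mean below 1700 hours is much less likely than a single life below 1700 hours.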


• Suppose we are manufacturing light bulbs. The life of these bulbs has historically followed a normal distribution with a mean of 2000 hours and standard deviation of 300 hours.
• We change the filament material and unbeknown to us the average life of the bulbs decreases to 1500 hours. (We will assume that the distribution remains normal with a standard deviation of 300 hours.)
• If we randomly sample 1 bulb, will we realize that the average life has decreased? What if we sample 3 bulbs? 9 bulbs?

[Figure: distributions of single bulb lives (n = 1, σ = 300) centered at μ = 1500 and μ = 2000. Y < 1400 would signal the shift.]

[Figure: distributions of averages of n = 3 (σ = 173) centered at μ = 1500 and μ = 2000. $\bar{Y}$ < 1650 would signal the shift.]

[Figure: distributions of averages of n = 9 (σ = 100) centered at μ = 1500 and μ = 2000. $\bar{Y}$ < 1800 would signal the shift.]
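To see how sample size affects the chance of noticing the drop to 1500 hours, one can compute the probability that the sample mean falls below each signal threshold quoted on the slides (1400, 1650 and 1800). This is a rough sketch, not the presenter's own calculation; the dictionary names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

SIGMA = 300.0      # standard deviation of a single bulb life
NEW_MEAN = 1500.0  # true mean after the filament change
# Signal thresholds read off the slides for each sample size.
thresholds = {1: 1400.0, 3: 1650.0, 9: 1800.0}

detect = {}
for n, cutoff in thresholds.items():
    sampling_dist = NormalDist(NEW_MEAN, SIGMA / sqrt(n))
    detect[n] = sampling_dist.cdf(cutoff)  # P(sample mean < cutoff)
    print(n, round(detect[n], 3))
```

The detection probability rises sharply with n because the sampling distribution of the mean narrows while the shift of 500 hours stays the same.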

What if the original distribution is not normal? Consider the roll of a fair die:


Suppose the single measurements are not normally distributed.
• Let Y = life of a light bulb in hours.
• Y is exponentially distributed with λ = 0.0005 = 1/2000.

[Figure: exponential density of Y, with 0.0005 marked on the vertical axis.]

[Figure: sampling distributions for single measurements and for averages of 2, 4 and 25 measurements.]

Source: Lawrence L. Lapin, Statistics in Modern Business Decisions, 6th ed., 1993, Dryden Press, Ft. Worth, Texas.

As n increases, what happens to the variance?
• Variance increases.
• Variance decreases.
• Variance remains the same.

[Figure: sampling distributions of the mean for n = 1, 2, 4 and 25.]

Central Limit Theorem
• If n is sufficiently large, the sample means of random samples from a population with mean μ and standard deviation σ are approximately normally distributed with mean μ and standard deviation $\sigma/\sqrt{n}$.
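A small simulation can illustrate the theorem for the exponential bulb-life model (mean 2000 hours, so σ = 2000 as well). This is only a sketch: the sample size of 25, the number of repetitions and the seed are arbitrary choices, not values from the slides.

```python
import random
from math import sqrt
from statistics import fmean, stdev

rng = random.Random(0)   # fixed seed so the run is repeatable
MEAN_LIFE = 2000.0       # exponential mean; sigma equals the mean
N, REPS = 25, 5000       # sample size and number of repeated samples (arbitrary)

# Draw REPS samples of size N and record each sample mean.
sample_means = [
    fmean(rng.expovariate(1 / MEAN_LIFE) for _ in range(N))
    for _ in range(REPS)
]
print("simulated:", round(fmean(sample_means)), round(stdev(sample_means)))
print("CLT prediction:", MEAN_LIFE, MEAN_LIFE / sqrt(N))
```

The simulated mean and standard deviation of the sample means should land near μ = 2000 and σ/√n = 400, even though single lifetimes are strongly skewed.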

Random Behavior of Means Summary
• If Y is distributed N(μ, σ), then $\bar{Y}$ is distributed N(μ, $\sigma/\sqrt{n}$).
• If Y is distributed non-N(μ, σ), then, for sufficiently large n, $\bar{Y}$ is distributed approximately N(μ, $\sigma/\sqrt{n}$).

If We Can Consider $\bar{Y}$ to be Normal …
• Recall: If Y is distributed normally with mean μ and standard deviation σ, then $Z = \dfrac{Y - \mu}{\sigma}$ follows the standard normal distribution.
• So if $\bar{Y}$ is distributed normally with mean μ and standard deviation $\sigma/\sqrt{n}$, then $Z = \dfrac{\bar{Y} - \mu}{\sigma/\sqrt{n}}$ follows the standard normal distribution.

If the time between adjacent accidents in an industrial plant follows an exponential distribution with an average of 700 days, what is the probability that the average time between 49 pairs of adjacent accidents will be greater than 900 days?
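One way to work this problem is to apply the Central Limit Theorem to the 49 gaps. The sketch below assumes the normal approximation is adequate at n = 49 and uses the fact that an exponential distribution's standard deviation equals its mean.

```python
from math import sqrt
from statistics import NormalDist

mean_gap = 700.0   # days; for an exponential, sigma equals the mean
n = 49             # number of gaps between adjacent accidents

std_error = mean_gap / sqrt(n)       # 700 / 7 = 100 days
z = (900.0 - mean_gap) / std_error   # standardized distance of 900 from 700
p_above_900 = 1 - NormalDist().cdf(z)
print(f"z = {z:.2f}, P(mean > 900) ~ {p_above_900:.4f}")
```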


XYZ Bottling Company claims that the distribution of fill on its 16 oz bottles averages 16.2 ounces with a standard deviation of 0.1 oz. We randomly sample 36 bottles and get $\bar{y}$ = 16.15. If we assume a standard deviation of 0.1 oz, do we believe XYZ’s claim of averaging 16.2 ounces?
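A quick way to judge the claim is to ask how unusual a sample mean of 16.15 would be if the true mean really were 16.2. The following sketch computes the z-value and the lower-tail probability; it is one reasonable reading of the question, not necessarily the worked answer from the lecture.

```python
from math import sqrt
from statistics import NormalDist

claimed_mean, sigma = 16.2, 0.1   # ounces, from the company's claim
n, ybar = 36, 16.15               # sample size and observed sample mean

z = (ybar - claimed_mean) / (sigma / sqrt(n))
p_this_low = NormalDist().cdf(z)  # chance of a mean this low if the claim holds
print(f"z = {z:.2f}, P(mean <= {ybar}) = {p_this_low:.5f}")
```

A sample mean three standard errors below the claimed value would be very rare if the claim were true.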

Up Until Now We Have Been Assuming That We Knew the True Standard Deviation (σ), But Let’s Face Facts …
• When we use s to estimate σ, then the calculated value $t = \dfrac{\bar{y} - \mu}{s/\sqrt{n}}$ follows a t-distribution with n-1 degrees of freedom.

Note: we must be able to assume that we are sampling from a normal population.

Let’s take another look at XYZ Bottling Company. If we assume that fill on the individual bottles follows a normal distribution, does the following data support the claim of an average fill of 16.2 oz?

16.1 16.0 16.3 16.2 16.1
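The t statistic for these five fills can be computed directly. In the sketch below, the critical value 2.776 is the standard table entry for t with 4 degrees of freedom and 0.025 in each tail; treating the question as a two-sided test at α = 0.05 is an assumption made here, since the slide does not state a significance level.

```python
from math import sqrt
from statistics import mean, stdev

fills = [16.1, 16.0, 16.3, 16.2, 16.1]   # ounces, from the slide
claimed_mean = 16.2

n = len(fills)
ybar = mean(fills)
s = stdev(fills)                          # sample standard deviation (n - 1 divisor)
t_stat = (ybar - claimed_mean) / (s / sqrt(n))

T_CRIT = 2.776   # table value t(0.025, df=4); ASSUMES a two-sided test at 0.05
print(f"ybar={ybar:.2f} s={s:.3f} t={t_stat:.3f} reject={abs(t_stat) > T_CRIT}")
```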


In Summary
• When we know σ: $Z = \dfrac{\bar{y} - \mu}{\sigma/\sqrt{n}}$
• When we estimate σ with s: $t = \dfrac{\bar{y} - \mu}{s/\sqrt{n}}$, with n-1 degrees of freedom

We assume we are sampling from a normal population.

Relationship Between Z and t Distributions

[Figure: density curves of Z and of t with df = 3 and df = 1.]

Internal Combustion Engine
• The nominal power produced by a student-designed internal combustion engine is 100 hp. The student team that designed the engine conducted 10 tests to determine the actual power. The data follow:

98, 101, 102, 97, 101, 98, 100, 92, 98, 100

Assume data came from a normal distribution.


Internal Combustion Engine

Summary Data: n = 10, $\bar{y}$ = 98.7, s ≈ 2.87

What is the probability of getting a sample mean of 98.7 hp or less if the true mean is 100 hp?
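The summary statistics and the t statistic for the ten engine tests can be reproduced from the raw data. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

power_hp = [98, 101, 102, 97, 101, 98, 100, 92, 98, 100]  # the 10 tests
nominal = 100

n = len(power_hp)
ybar = mean(power_hp)
s = stdev(power_hp)                         # sample standard deviation
t_stat = (ybar - nominal) / (s / sqrt(n))   # refer to t with n - 1 = 9 df
print(f"n={n} ybar={ybar} s={s:.4f} t={t_stat:.4f}")
```

The probability asked for is the area to the left of this t value under a t distribution with 9 degrees of freedom.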

Internal Combustion Engine

$P(\bar{Y} \le 98.7 \mid \mu = 100) = P(t_{df=9} \le -1.4327) = 0.0928$

What did we assume when doing this analysis?

Are you comfortable with the assumption?

Can We Assume Sampling from a Normal Population?
• If data are from a normal population, there is a linear relationship between the data and their corresponding Z values: $y = \mu + \sigma z$.

If we plot y on the vertical axis and z on the horizontal axis, the y-intercept estimates μ and the slope estimates σ.

How to Calculate Corresponding Z-Values
• Order the data.
• Estimate the proportion of the population below each data point: $P_i = \dfrac{i - 0.5}{n}$, where i is a data point’s position in the ordered set and n is the number of data points in the set.
• Look up the Z-value that has proportion $P_i$ of the distribution below it.

Normal Probability (QQ) Plot

Data set: 2, 4, 7, 10

| i | $y_i$ | $P_i$ | Z |
|---|---|---|---|
| 1 | 2 | 0.125 | -1.15 |
| 2 | 4 | 0.375 | -0.32 |
| 3 | 7 | 0.625 | +0.32 |
| 4 | 10 | 0.875 | +1.15 |
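The table for the data set 2, 4, 7, 10 can be regenerated in a few lines. The sketch below assumes the plotting position $P_i = (i - 0.5)/n$, which matches the tabulated values, and adds an ordinary least-squares line through the points as one simple way to read off rough estimates of μ (intercept) and σ (slope); the line fit is an illustration, not a step taken from the slides.

```python
from statistics import NormalDist, fmean

data = sorted([2, 4, 7, 10])
n = len(data)

# Plotting positions P_i = (i - 0.5) / n and their standard normal quantiles.
probs = [(i - 0.5) / n for i in range(1, n + 1)]
z_scores = [NormalDist().inv_cdf(p) for p in probs]

# Least-squares line y = intercept + slope * z through the QQ points.
z_bar, y_bar = fmean(z_scores), fmean(data)
slope = sum((z - z_bar) * (y - y_bar) for z, y in zip(z_scores, data)) / sum(
    (z - z_bar) ** 2 for z in z_scores
)
intercept = y_bar - slope * z_bar

for p, z, y in zip(probs, z_scores, data):
    print(f"P={p:.3f} z={z:+.2f} y={y}")
print(f"intercept ~ {intercept:.2f}, slope ~ {slope:.2f}")
```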

Normal Probability (QQ) Plot

This data is a random sample from a N(10,2) population.


### Estimation of the Mean


Point Estimators
• A point estimator is a statistic, a single number calculated from sample data, that is used to estimate the value of a parameter.
• Recall that statistics change value upon repeated sampling of the same population, while parameters are fixed but unknown.
• Examples: $\bar{y}$ estimates μ; s estimates σ.

In General: What Makes a “Good” Estimator?

(1) Accuracy: An unbiased estimator of a parameter is one whose expected value is equal to the parameter of interest.

(2) Precision: An estimator is more precise if its sampling distribution has a smaller standard error*.

*Standard error is the standard deviation of the sampling distribution.

Unbiased Estimators

For normal populations, both the sample mean and sample median are unbiased estimators of μ.

[Figure: sampling distributions of the sample mean and the sample median, both centered at µ.]

Most Efficient Estimators
• If you have multiple unbiased estimators, then you choose the estimator whose sampling distribution has the least variation. This is called the most efficient estimator.

[Figure: sampling distributions of the sample mean and the sample median.]

For normal populations, the sample mean is the most efficient estimator of μ.
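The claim about efficiency can be illustrated with a simulation that draws many normal samples and compares how much the sample mean and the sample median vary. Everything numeric here (population N(10, 2), samples of size 9, 4000 repetitions, the seed) is a hypothetical choice made for the sketch.

```python
import random
from statistics import fmean, median, pvariance

rng = random.Random(1)   # fixed seed for a repeatable run
MU, SIGMA = 10.0, 2.0    # hypothetical normal population
N, REPS = 9, 4000        # sample size and number of repeated samples

means, medians = [], []
for _ in range(REPS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    means.append(fmean(sample))
    medians.append(median(sample))

# Both estimators should center near MU; the mean should vary less.
print("center:", round(fmean(means), 2), round(fmean(medians), 2))
print("variance:", round(pvariance(means), 3), round(pvariance(medians), 3))
```

Both estimators center on μ, but the spread of the medians is noticeably larger, which is what "less efficient" means here.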

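Interval Estimate of the Mean

$P\left(-1.96 \le \dfrac{\bar{Y} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) = 0.95$

(with a little algebra)

$P\left(\bar{Y} - 1.96\,\dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{Y} + 1.96\,\dfrac{\sigma}{\sqrt{n}}\right) = 0.95$

So we say that we are 95% sure that μ is in the interval $\bar{y} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}}$.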

Interval Estimate of the Mean

[Figure: standard normal (Z) curve with area 0.95 between -1.96 and 1.96 and area .025 in each tail.]

Interval Estimate of the Mean
• Let’s go from 95% confidence to the general case.
• The symbol $z_\alpha$ is the z-value that has an area of α to the right of it.

Interval Estimate of the Mean

[Figure: standard normal curve with area 1 - α between $-z_{\alpha/2}$ and $+z_{\alpha/2}$ and area α/2 in each tail.]

(1 – α) 100% Confidence Interval: $\bar{y} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$

What Does (1 – α) 100% Confidence Mean?

[Figure: the sampling distribution of $\bar{y}$ centered at μ, with (1-α)100% confidence intervals from repeated samples.]
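One way to make the meaning of confidence concrete is to build many intervals from repeated samples and count how often they contain μ. The sketch below does this for a known-σ interval; the population values, sample size, repetition count and seed are hypothetical choices for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

rng = random.Random(2)         # fixed seed for a repeatable run
MU, SIGMA, N = 100.0, 8.0, 4   # hypothetical population and sample size
ALPHA, REPS = 0.05, 4000

z_half = NormalDist().inv_cdf(1 - ALPHA / 2)   # about 1.96
margin = z_half * SIGMA / sqrt(N)

covered = 0
for _ in range(REPS):
    ybar = fmean(rng.gauss(MU, SIGMA) for _ in range(N))
    if ybar - margin <= MU <= ybar + margin:
        covered += 1

coverage = covered / REPS
print(f"fraction of intervals containing mu: {coverage:.3f}")
```

In the long run about (1 – α) 100% of such intervals should contain μ; any single interval either does or does not.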

• 99%
• 95%
• 90%
• 85%


Which z-value would you use to calculate a 99% confidence interval on a mean?
• $z_{0.10}$ = 1.282
• $z_{0.01}$ = 2.326
• $z_{0.005}$ = 2.576
• $z_{0.0005}$ = 3.291

Plastic Injection Molding Process
• A plastic injection molding process for a part that has a critical width dimension historically follows a normal distribution with a standard deviation of 8.
• Periodically, clogs from one of the feeder lines cause the mean width to change. As a result, the operator periodically takes random samples of size 4.

Plastic Injection Molding
• A recent sample of four yielded a sample mean of 101.4.
• Construct a 95% confidence interval for the true mean width.
• Construct a 99% confidence interval for the true mean width.
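Both intervals follow from $\bar{y} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ with σ = 8, n = 4 and $\bar{y}$ = 101.4. A minimal sketch, with `z_interval` as an illustrative helper name:

```python
from math import sqrt
from statistics import NormalDist

sigma, n, ybar = 8.0, 4, 101.4   # values given on the slides

def z_interval(confidence):
    """Two-sided interval ybar +/- z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z_half = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_half * sigma / sqrt(n)
    return ybar - margin, ybar + margin

ci95 = z_interval(0.95)
ci99 = z_interval(0.99)
print([round(x, 2) for x in ci95], [round(x, 2) for x in ci99])
```

The 99% interval is wider than the 95% interval because a larger z-value is needed to capture more of the sampling distribution.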

When going from a 95% confidence interval to a 99% confidence interval, the width of the interval will
• Increase.
• Decrease.
• Remain the same.


Interval Width, Level of Confidence and Sample Size
• At a given sample size, as level of confidence increases, interval width __________.
• At a given level of confidence as sample size increases, interval width __________.


Calculate Sample Size Before Sampling!
• The width of the interval is determined by its margin of error: $z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$

Suppose we wish to estimate the mean to a maximum error of d: setting $z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} \le d$ gives $n \ge \left(\dfrac{z_{\alpha/2}\,\sigma}{d}\right)^2$.

Plastic Injection Molding
• A plastic injection molding process for a part that has a critical width dimension historically follows a normal distribution with a standard deviation of 8.
• What sample size is required to estimate the true mean width to within ± 2 units at 95% confidence?
• What sample size is required to estimate the true mean width to within ± 2 units at 99% confidence?
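The required sample sizes come from solving $z_{\alpha/2}\,\sigma/\sqrt{n} \le d$ for n and rounding up. The sketch below does this for σ = 8 and d = 2; `required_n` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist

def required_n(sigma, max_error, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= max_error."""
    z_half = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_half * sigma / max_error) ** 2)

n95 = required_n(sigma=8, max_error=2, confidence=0.95)
n99 = required_n(sigma=8, max_error=2, confidence=0.99)
print(n95, n99)
```

Rounding up rather than to the nearest integer keeps the margin of error at or below d.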

If we don’t have prior knowledge of the standard deviation, but can assume we are sampling from a normal population…
• Instead of using a z-value to calculate the confidence interval…


Interval Estimate of the Mean

[Figure: t curve with df = n-1, area 1 - α between $-t_{\alpha/2}$ and $+t_{\alpha/2}$, and area α/2 in each tail.]

(1 – α) 100% Confidence Interval: $\bar{y} \pm t_{\alpha/2,\,n-1}\,\dfrac{s}{\sqrt{n}}$

Plastic Injection Molding – Reworded
• A plastic injection molding process for a part that has a critical width dimension historically follows a normal distribution.
• A recent sample of four yielded a sample mean of 101.4 and sample standard deviation of 8.
• Estimate the true mean width with a 95% confidence interval.
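With σ unknown, the interval uses a t-value with n-1 = 3 degrees of freedom in place of the z-value. In the sketch below, 3.182 is the standard table entry for $t_{0.025}$ with 3 degrees of freedom; it is supplied here and does not appear on the slide.

```python
from math import sqrt

n, ybar, s = 4, 101.4, 8.0   # values given on the slide
T_HALF = 3.182               # table value t(0.025, df=3); supplied here, not on the slide

margin = T_HALF * s / sqrt(n)
ci95_t = (ybar - margin, ybar + margin)
print(round(margin, 3), [round(x, 2) for x in ci95_t])
```

The margin is considerably larger than the z-based margin for the same numbers, reflecting the extra uncertainty from estimating σ with only four observations.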


### Hypothesis Testing


Statistical Hypothesis
• A statistical hypothesis is an assertion or conjecture concerning one or more population parameters.
• Examples:
• More than 7% of the landings for a certain airline exceed the runway.
• The defective rate on a manufacturing line is less than 10%.
• The mean lifetime of the bulbs is above 2200 hours.


The Null and Alternative Hypotheses
• Null Hypothesis, H0, represents what we assume to be true. It is always stated so as to specify an exact value of the parameter.
• Alternative (Research) Hypothesis, H1 or Ha, represents the alternative to the null hypothesis and allows for the possibility of several values. It carries the burden of proof.
• In most situations, the researcher hopes to disprove or reject the null hypothesis in favor of the alternative hypothesis.

Steps to a Hypothesis Test
• Determine the null and alternative hypotheses.
• Collect data and calculate the test statistic, assuming the null hypothesis is true.
• Assuming the null hypothesis is true, calculate the p-value or use the rejection region method.
• Draw a conclusion and state it in plain English.

Two Types of Mistakes

(1) Type I error: Reject the null hypothesis when it is true.

(2) Type II error: Fail to reject the null hypothesis when the alternative hypothesis is true.

Let α = P(type I error), β = P(type II error).

Power of the test is 1-β.

Combustion Engine

The nominal power produced by a student-designed combustion engine is assumed to be at least 100 hp. We wish to test the alternative that the power is less than 100 hp.

Let µ = mean power of the engine.

A QQ plot shows it is reasonable to assume the data came from a normal distribution.

Sample Data: n = 10, $\bar{y}$ = 98.7, s ≈ 2.87

Combustion Engine

(1) State hypotheses, set alpha: H0: µ = 100, Ha: µ < 100, α = 0.05.

(2) Choose test statistic: $t = \dfrac{\bar{y} - \mu_0}{s/\sqrt{n}}$, with n-1 = 9 degrees of freedom.

(3,4) Designate the critical value for the test (if using the rejection region method) and draw a conclusion

or

calculate the p-value and draw a conclusion.

(3) Designate Rejection Region

[Figure: t distribution with df = 9 for the average horsepower $\bar{Y}$, centered at 100 and drawn assuming H0: µ = 100 is true. The rejection region is the lower tail of area 0.05, to the left of t = -1.833.]

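Draw conclusion:

[Figure: t distribution with df = 9, marking the observed t = -1.4327 and the critical value -1.833.]

Since -1.4327 is not less than -1.833, the test statistic is outside the rejection region and we fail to reject H0.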

p-value
• The p-value is the probability, computed assuming H0 is true, of getting the sample result we got or something more extreme.

[Figure: t distribution with df = 9; the area to the left of t = -1.4327 is the p-value, 0.0928.]

p-value
• $P(t_{df=9} < -1.4327) = 0.0928$
• Note: If p-value < α, reject H0. If p-value > α, fail to reject H0.

[Figure: t distribution with df = 9; the area to the left of the observed t = -1.4327 is 0.0928, which exceeds the area 0.05 to the left of the critical value -1.833.]

Average Life of a Light Bulb

Historically, a particular light bulb has had a mean life of no more than 2000 hours. We have changed the production process and believe that the life of the bulb has increased.

Let μ = mean life.

(1) Set Up Hypotheses, α = 0.05

H0: µ ≤ 2000

Ha: µ > 2000

Average Life of a Light Bulb

(2) Collect data and calculate the test statistic: t = 2.5282 with df = 14

[Figure: t distribution with df = 14; the area to the right of the critical value 1.761 is 0.05, and the area to the right of the observed t = 2.5282 is 0.0121.]

p-value = $P(t_{df=14} > 2.5282) = 0.0121$
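The p-values quoted for the engine example (t = -1.4327, df = 9) and the light-bulb example (t = 2.5282, df = 14) can be approximated without a t table. The sketch below integrates the Student's t density numerically with Simpson's rule; this is a swapped-in technique chosen to stay within Python's standard library, and `t_pdf` and `t_cdf` are illustrative helper names, not functions from any package.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # approximates P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

alpha = 0.05
p_engine = t_cdf(-1.4327, df=9)        # lower-tail p-value (Ha: mu < 100)
p_bulb = 1 - t_cdf(2.5282, df=14)      # upper-tail p-value (Ha: mu > 2000)
print(f"engine: p = {p_engine:.4f}, reject H0: {p_engine < alpha}")
print(f"bulb:   p = {p_bulb:.4f}, reject H0: {p_bulb < alpha}")
```

The decision rule is the one stated in the note on p-values: reject H0 only when the p-value is below α.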

Average Life of a Light Bulb

State Conclusion:

• At 0.05 level of significance there is insufficient evidence to conclude that µ > 2000 hours.
• At 0.05 level of significance there is sufficient evidence to conclude that µ > 2000 hours.


Mean Width of a Manufactured Part
• Test the theory that the mean width of a manufactured part differs from 100 cm.

Let µ = mean width.

(1) Set up Hypotheses, α = 0.05

H0: µ = 100

Ha: µ ≠ 100

Mean Width of a Manufactured Part

(2,3) Collect data and calculate test statistic.

(4) State conclusion.


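Given population parameter µ and value µ0:

For H0: µ = µ0

• Ha: µ ≠ µ0: reject H0 in either tail, with area α/2 in each tail.
• Ha: µ > µ0: reject H0 in the upper tail, with area α.
• Ha: µ < µ0: reject H0 in the lower tail, with area α.

[Figure: three sampling distributions under H0, each marking the H0 (fail to reject) region and the Ha (rejection) region or regions.]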

Focus on the Two Types of Errors in a Hypothesis Test
• Reject H0 when H0 is true. This is called a type I error.

P(Reject H0 | H0 is true) = α

• Fail to reject H0 when Ha is true at some value. This is called a type II error.

P(Fail to reject H0 | Ha is true at some value) = β

Avg Life of Light Bulb - Type I Error

H0: µ ≤ 2000

Ha: µ > 2000

[Figure: sampling distribution (Z scale) drawn assuming H0 is true. The upper-tail area α is the probability that we will reject H0 when H0 is true; the rest of the curve is the fail-to-reject region.]

Type I and Type II Errors

H0: µ = 2000

What if µ = 2200?

α = Probability that we will reject H0 when H0 is true.

β = Probability that we will fail to reject H0 when Ha is true at µ = 2200.

How can we control the size of β?
• The value of α.
• Location of our point of interest.
• Sample size.


Calculating β
• If µ = 2200, what is the probability of a type II error?
• Given: α = 0.05 and we are assuming µ = 2000. We will also assume we know σ = 216.

Calculating β

[Figure: sampling distributions of $\bar{Y}$ under H0: µ = 2000 and under µ = 2200, with the cutoff at 2091. Fail to reject H0 below 2091; reject H0 above 2091.]

α, β and Power
• α = P(Reject H0 | µ = 2000) = 0.05
• β = P(Fail to reject H0 | µ = 2200) = 0.0254
• We say that the power of this test at µ = 2200 is 1 – 0.0254 = 0.9746.
• Power = 1 – β
• Power = P(Reject H0 | µ is at some Ha level)
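The β and power figures can be approximately reproduced with a normal calculation. The sample size is not stated alongside these slides; the sketch below assumes n = 15, which is consistent with the 14 degrees of freedom in the light-bulb test and with the cutoff of about 2091 shown in the figure. Treat that value as an assumption.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu_alt, sigma, alpha = 2000.0, 2200.0, 216.0, 0.05
n = 15   # ASSUMPTION: not stated with the beta slides; df = 14 earlier suggests n = 15

std_error = sigma / sqrt(n)
# Upper-tail test: reject H0 when the sample mean exceeds this cutoff.
cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * std_error

# The slides show the cutoff as 2091; use that value to compare with their beta.
beta = NormalDist(mu_alt, std_error).cdf(2091.0)
power = 1 - beta
print(f"cutoff ~ {cutoff:.1f}, beta ~ {beta:.4f}, power ~ {power:.4f}")
```

Small differences from 0.0254 come from rounding the cutoff and the z-value; a larger n or a larger α would both reduce β.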

Plastic Injection Molding
• A plastic injection molding process for a part that has a critical width dimension historically follows a normal distribution.
• A recent sample of n = 4 yielded a sample mean of 101.4 and sample standard deviation of 8.
• Does this data support the statement: “The true average width is greater than 95.”?


Plastic Injection Molding: Confidence Interval Approach
• 95% confidence interval on µ:

Plastic Injection Molding: Hypothesis Test Approach

H0:

Ha:

α = 0.05

Test statistic is

p-value =

Conclusion:
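One possible way to fill in the hypothesis test approach is sketched below. It assumes H0: µ = 95 against Ha: µ > 95, which is a reading of the statement being tested rather than something spelled out on the slide, and it uses the standard table value 2.353 for $t_{0.05}$ with 3 degrees of freedom.

```python
from math import sqrt

n, ybar, s = 4, 101.4, 8.0   # values given on the slide
mu0 = 95.0                   # ASSUMPTION: H0: mu = 95 against Ha: mu > 95
T_CRIT = 2.353               # table value t(0.05, df=3); supplied here, not on the slide

t_stat = (ybar - mu0) / (s / sqrt(n))
reject_h0 = t_stat > T_CRIT  # one-sided rejection region in the upper tail
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Under these assumptions the test statistic falls short of the critical value, so the sample of four does not provide sufficient evidence at the 0.05 level that the true average width is greater than 95.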