STAT131/171 W9L1 Confidence Intervals


# STAT131/171 W9L1 Confidence Intervals

STAT131/171 W9L1 Confidence Intervals. by Anne Porter alp@uow.edu.au. Lecture Outline. Review Central Limit theorem Intuitive ideas associated with confidence intervals CI for means Using Z Using t CI for proportions Hypothesis tests. Variability of Sample Means.


### STAT131/171 W9L1 Confidence Intervals

by Anne Porter

alp@uow.edu.au

Lecture Outline
• Review Central Limit theorem
• Intuitive ideas associated with confidence intervals
• CI for means
• Using Z
• Using t
• CI for proportions
• Hypothesis tests
Variability of Sample Means
• Sample means are random quantities
• Sample means change from sample to sample
• They also have a mean and standard deviation
Even with a bimodal distribution

If we repeatedly take samples that are large enough from this distribution and plot the means of these samples, we know that...

Central Limit Theorem (Large Sample Normality)
• Given a random sample $X_1, X_2, \ldots, X_n$ from any distribution with mean $\mu$ and finite variance $\sigma^2$, then irrespective of the distribution of the parent population, the distribution of the sample mean $\bar{X}$ approaches the shape of a normal distribution when the sample size is large and, irrespective of sample size, has

mean $\mu_{\bar{X}} = \mu$

and standard deviation $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
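
The theorem can be checked by simulation. The sketch below is only an illustration: the bimodal population (an equal mixture of two normal curves centred at 0 and 6) and the names `bimodal_draw` and `sample_means` are assumptions made for this example, not part of the lecture.

```python
import random
import statistics

def bimodal_draw(rng):
    """One draw from a hypothetical bimodal population: a 50/50 mix of N(0, 1) and N(6, 1)."""
    centre = 0.0 if rng.random() < 0.5 else 6.0
    return rng.gauss(centre, 1.0)

def sample_means(n, repeats, seed=1):
    """Take `repeats` samples of size n and return the mean of each sample."""
    rng = random.Random(seed)
    return [statistics.fmean(bimodal_draw(rng) for _ in range(n))
            for _ in range(repeats)]

# For this mixture the population mean is 3 and the variance is 1 + 3**2 = 10.
POP_MEAN = 3.0
POP_SD = 10 ** 0.5

means = sample_means(n=40, repeats=5000)
print("mean of sample means:", round(statistics.fmean(means), 2))
print("sd of sample means:  ", round(statistics.stdev(means), 2))
print("sigma / sqrt(n):     ", round(POP_SD / 40 ** 0.5, 2))
```

The mean of the sample means should sit close to the population mean, and their standard deviation close to $\sigma/\sqrt{n}$, even though the parent population is far from normal.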

Activity 1: Intuitive notions of Confidence intervals
• When we take a sample and use it to estimate the population mean, we know that the point estimate will have a certain amount of error associated with it.
• So in preference to providing a point estimate of the population parameter, we generally provide an interval estimate.

Distribution of measurements

How confident would you be of each of the following statements based on the sample of data plotted?

The mean

• is exactly 1.146
• is between 1.141 and 1.151
• is between 1.10 and 1.18
• is between 1.05 and 1.24
• is between 0.0 and 2.2
• (Source, Griffiths et al, 1998, p. 290)
Confidence
• Our confidence increases as the width of the interval increases.
• We have a much greater confidence that $\mu$ is between 0.0 and 2.2 than any of the other intervals given.
• All intervals have some risk of not containing the true population mean $\mu$, but there is greater confidence attached to the wider interval
The width of the confidence interval relates to the variability in the sample and the sample size

• Does greater variability in the sample imply needing a wider or narrower confidence interval?

Wider

• As the sample size $n$ increases, does the sample mean become more or less stable?

More stable

As $n$ increases, the estimate of $\mu$ becomes more precise.
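
Both answers can be read off the usual normal-based margin of error, $z\,\sigma/\sqrt{n}$. The following sketch uses a hypothetical helper `margin_of_error` with illustrative numbers to show the half-width growing with the standard deviation and shrinking with $n$.

```python
import math

def margin_of_error(sd, n, z=1.96):
    """Half-width of a normal-based interval: z * sd / sqrt(n)."""
    return z * sd / math.sqrt(n)

# Illustrative values only: doubling the spread doubles the width,
# and quadrupling the sample size halves it.
for sd, n in [(10, 25), (20, 25), (10, 100)]:
    print(f"sd={sd:>2}, n={n:>3}: margin = {margin_of_error(sd, n):.2f}")
```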

Confidence Intervals for the mean, $\sigma$ known
• Decisions through Data Video Unit 20 Confidence Intervals

This film clip explores how confidence intervals are constructed for the mean of a normal population based on a sample drawn from that population. It assumes that the $\sigma$ of the population is known and hence uses a standard normal distribution. In this instance the two-sided confidence interval is of the form

$\bar{x} \pm z_{\alpha/2} \, \dfrac{\sigma}{\sqrt{n}}$

Confidence Intervals, $\sigma$ unknown
• Generally $\sigma$ is not known and it needs to be estimated from the sample by $s$. In this instance, provided the population is normally distributed, the t distribution with $n-1$ degrees of freedom can be used to construct a two-sided confidence interval

$\bar{x} \pm t_{n-1,\,\alpha/2} \, \dfrac{s}{\sqrt{n}}$

Other forms are sometimes used, so think rather than apply by rote

Confidence Intervals
• The two-tailed $100(1-\alpha)\%$ confidence interval is constructed using the values of $Z$ and $t$ that leave an area of $\alpha/2$ beyond $|Z|$ or beyond $|t|$, according to which distribution of the means is in use. For example

[Figure: distribution of the sample mean with an area of $\alpha/2$ shaded in each tail]

Example 1

What is the Z score corresponding to a 90% confidence interval about the mean ($\alpha = 0.1$)?

For a two-sided CI this means $\alpha/2 = 0.1/2 = 0.05$, i.e. 5% in either tail

$Z = -1.645$ in the lower tail, so $z_{0.05} = 1.645$

Example 2
• What is the Z score corresponding to a 95% confidence interval about the mean? $\alpha/2 = 0.025$

$Z = -1.96$ in the lower tail, so $z_{0.025} = 1.96$

Example 3
• What is the Z score corresponding to a 99% confidence interval about the mean? $\alpha/2 = 0.005$

$Z = -2.575$ in the lower tail, so $z_{0.005} = 2.575$ (2.576 to three decimal places)
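
These table look-ups can be reproduced with the Python standard library. In this sketch `z_critical` is a hypothetical helper name; `NormalDist().inv_cdf` is the standard normal quantile function from the `statistics` module.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Positive z value leaving alpha/2 in each tail of the standard normal."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

The 99% value comes out as 2.576 rather than the 2.575 often read from printed tables; the difference is only table rounding.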

Example 4
• What is the t score corresponding to a 90% confidence interval about the mean given a sample of size 20 from a normal population?

A 90% confidence interval means that there will be what percentage in each tail of the distribution?

$\alpha/2 = 0.05$, i.e. 5%

How many degrees of freedom will there be?

df $= \nu = n - 1 = 20 - 1 = 19$

When the variance of the population is unknown
• With $\nu = n - 1 = 20 - 1 = 19$ degrees of freedom and
• $\alpha/2 = 0.05$, i.e. 5% in the top tail, the critical value is $t_p$,
• i.e. the probability below $t_p$ is 0.95

[Figure: t distribution centred at 0 with area $p$ below $t_p$; t-table columns headed by tail area 0.1, 0.05, 0.025, 0.01, 0.005]

Then $t_{19,\,0.1/2} = 1.729$

OR $t_{0.95} = 1.729$

Example 5
• What is the t score corresponding to a 95% confidence interval about the mean given a sample of size 20 from a normal population?

A 95% confidence interval means that there will be what percentage in each tail of the distribution?

$\alpha/2 = 0.025$, or 2.5%

How many degrees of freedom will there be?

df $= \nu = n - 1 = 20 - 1 = 19$

(Think! It no longer matters if it is $\alpha/2$; just read the amount in the tail that is required.)

When the variance of the population is unknown
• With $\nu = n - 1 = 20 - 1 = 19$ degrees of freedom and
• $\alpha/2 = 0.025$

[Figure: t distribution centred at 0 with area $p$ below $t_p$; t-table columns headed by tail area 0.1, 0.05, 0.025, 0.01, 0.005]

$t_{19,\,0.05/2} = 2.093$

Example 6
• What is the t score corresponding to a 99% confidence interval about the mean given a sample of size 20 from a normal population?

When the variance of the population is unknown
• With $\nu = n - 1 = 20 - 1 = 19$ degrees of freedom and
• $\alpha/2 = 0.005$

[Figure: t distribution centred at 0 with area $p$ below $t_p$]

$t_{19,\,0.01/2} = 2.861$
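
The t-table values in Examples 4 to 6 can be checked numerically. The Python standard library has no t distribution, so this sketch integrates the t density with Simpson's rule and inverts it by bisection; the helper names `t_pdf`, `t_cdf` and `t_critical` are made up for this illustration, and a statistics package would normally do this in one call.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=1000):
    """P(T <= t) for t >= 0: 0.5 plus Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Positive t value leaving alpha/2 in each tail, found by bisection."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence, 19 df: t = {t_critical(level, 19):.3f}")
```

Each t value is larger than the matching Z value from Examples 1 to 3, which is why intervals based on an estimated standard deviation are wider.
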
Example 7
• A sample mean IQ of 105 is based on a sample of 49 students. The IQ test is standardised and is known to have a standard deviation of 15. What is the 95% confidence interval for the population mean? Will we use Z or t?

In this case we have both a large sample and a known standard deviation, so we use Z:

$\bar{x} \pm z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}} = 105 \pm 1.96 \times \dfrac{15}{\sqrt{49}} = 105 \pm 4.2 = (100.8, 109.2)$
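
As a check on the arithmetic, here is a minimal sketch of the known-$\sigma$ interval $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ applied to the IQ figures. The function name `z_interval` is an assumption for this example. Note that the standard error alone is $15/7 \approx 2.14$; the margin of error is 1.96 times that, about 4.2.

```python
import math
from statistics import NormalDist

def z_interval(mean, sigma, n, confidence=0.95):
    """Two-sided confidence interval for a mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return mean - margin, mean + margin

low, high = z_interval(mean=105, sigma=15, n=49)
print(round(low, 1), round(high, 1))
```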

Example 8
• Find the mean and standard deviation of the height of 5 students in STAT131.

Sample 1: 178, 170, 165, 154, 165 cm; $\bar{x} = 166.4$, $s = 8.7$

Sample 2: 181, 172, 190, 168, 168 cm; $\bar{x} = 175.8$, $s = 9.55$

Sample 3: 184, 185, 175, 180, 163 cm; $\bar{x} = 177.4$, $s = 8.96$
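
The summary statistics for the three samples can be reproduced with the standard library. `statistics.stdev` uses the $n-1$ divisor, which is the sample standard deviation $s$ needed for a t interval; the dictionary name `samples` is just for this sketch.

```python
import statistics

# Heights in cm, as listed in Example 8.
samples = {
    "Sample 1": [178, 170, 165, 154, 165],
    "Sample 2": [181, 172, 190, 168, 168],
    "Sample 3": [184, 185, 175, 180, 163],
}

for name, heights in samples.items():
    mean = statistics.mean(heights)
    s = statistics.stdev(heights)  # divides by n - 1
    print(f"{name}: mean = {mean:.1f}, s = {s:.2f}")
```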

Example 8
• To find a confidence interval for the mean height of students in STAT131 from each sample of 5 students.

Sample 1: 178, 170, 165, 154, 165 cm

Sample 2: 181, 172, 190, 168, 168 cm

Sample 3: 184, 185, 175, 180, 163 cm

Will we use a t or Z distribution? Why?

Although the sample size is small, height can be assumed to be normally distributed. As $\sigma$ is unknown, use t.

Example 9
• Find the 95% confidence interval for the population mean of heights

Sample 1: 178, 170, 165, 154, 165; $\bar{x} = 166.4$, $s = 8.7$

What is n?

$n = 5$

What is $t_{n-1,\,\alpha/2}$?

$t_{4,\,0.025} = 2.776$

$\bar{x} \pm t_{n-1,\,\alpha/2} \dfrac{s}{\sqrt{n}} = 166.4 \pm 2.776 \times \dfrac{8.7}{\sqrt{5}} = (155.6, 177.2)$

Example 10: Confidence interval for $\mu$
• Find the 95% confidence interval for the population mean of heights

Sample 2: 181, 172, 190, 168, 168; $\bar{x} = 175.8$, $s = 9.55$

What is n?

$n = 5$

What is $t_{n-1,\,\alpha/2}$?

$t_{4,\,0.025} = 2.776$

What is the confidence interval for $\mu$?

$175.8 \pm 2.776 \times \dfrac{9.55}{\sqrt{5}} = (163.94, 187.66)$

Example 11: CI for $\mu$
• Find the 95% confidence interval for the population mean of heights

Sample 3: 184, 185, 175, 180, 163; $\bar{x} = 177.4$, $s = 8.96$

$n = 5$

$t_{4,\,0.025} = 2.776$

$177.4 \pm 2.776 \times \dfrac{8.96}{\sqrt{5}} = (166.28, 188.52)$
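
The three intervals in Examples 9 to 11 follow the same recipe, $\bar{x} \pm t_{n-1,\,\alpha/2}\, s/\sqrt{n}$. The sketch below takes the table value 2.776 for 4 degrees of freedom as given; `t_interval` is a hypothetical helper name.

```python
import math
import statistics

T_4_025 = 2.776  # t with 4 degrees of freedom and 2.5% in each tail, from the t table

def t_interval(data, t_value):
    """Two-sided CI for the mean using the sample standard deviation."""
    mean = statistics.mean(data)
    margin = t_value * statistics.stdev(data) / math.sqrt(len(data))
    return mean - margin, mean + margin

heights = {
    "Sample 1": [178, 170, 165, 154, 165],
    "Sample 2": [181, 172, 190, 168, 168],
    "Sample 3": [184, 185, 175, 180, 163],
}

for name, data in heights.items():
    low, high = t_interval(data, T_4_025)
    print(f"{name}: ({low:.2f}, {high:.2f})")
```

The table value must match the sample size: with a different $n$ the degrees of freedom change and 2.776 no longer applies.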

Example 12: Interpretation of CI
• Interpret these confidence intervals
• Sample 1: CI (155.60, 177.20)
• Sample 2: CI (163.94, 187.66)
• Sample 3: CI (166.28, 188.52)
• These are all 95% confidence intervals. That is, if the process of sampling were repeated many times, about 95% of the resulting confidence intervals would cover the true population parameter $\mu$. We do not actually know whether any particular one of these three intervals contains the true parameter.
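
The repeated-sampling interpretation can be illustrated by simulation. In this sketch the population (normal, mean 170 cm, standard deviation 9 cm) is purely hypothetical, chosen only so that the true mean is known and coverage can be counted; `coverage` is an illustrative function name.

```python
import math
import random
import statistics

T_4_025 = 2.776  # t table value for n = 5 (4 degrees of freedom), 95% confidence

def coverage(true_mean, true_sd, n=5, repeats=10000, seed=2):
    """Fraction of 95% t intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        centre = statistics.fmean(sample)
        margin = T_4_025 * statistics.stdev(sample) / math.sqrt(n)
        if centre - margin <= true_mean <= centre + margin:
            hits += 1
    return hits / repeats

rate = coverage(true_mean=170, true_sd=9)
print(f"proportion of intervals covering the true mean: {rate:.3f}")
```

The proportion should be close to 0.95, while any single interval either covers the true mean or does not.
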
Central Limit Theorem (Large Sample Normality)
• Given a random sample $X_1, X_2, \ldots, X_n$ from any distribution with mean $\mu$ and finite variance $\sigma^2$, then irrespective of the distribution of the parent population, the distribution of $\bar{X}$ approaches the shape of a normal distribution when the sample size is large, with mean and standard deviation

$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma / \sqrt{n}$

• We will draw on this theorem when finding confidence intervals for proportions
Symbols
• Let $p$ = population proportion or binomial probability
• Let $\hat{p}$ = corresponding sample proportion
• Using the central limit theorem, if numerous samples of size $n > 30$ are taken, the distribution of all possible values of $\hat{p}$ is approximately a normal curve with mean and standard deviation

$\mu_{\hat{p}} = p$ and $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$

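Sampling distribution of proportions
• Mean: $E(\hat{p}) = p$
• Variance: $\mathrm{Var}(\hat{p}) = \dfrac{p(1-p)}{n}$
• Standard Deviation: $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$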
$100(1-\alpha)\%$ CI of the population proportion $p$
• When $n$ is large, so that we can assume a normal distribution of sample proportions, the confidence interval for a proportion is given by

$\hat{p} \pm z_{\alpha/2} \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
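
A minimal sketch of this large-sample interval, $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$, working from counts. The function name `wald_interval` and the counts (16 present out of 30) are hypothetical, chosen only to show the calculation.

```python
import math
from statistics import NormalDist

def wald_interval(successes, n, confidence=0.95):
    """Normal-approximation CI for a proportion, from a count of successes."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical class roll: 16 of 30 sampled students present.
low, high = wald_interval(successes=16, n=30)
print(f"95% CI for the proportion: ({low:.3f}, {high:.3f})")
```

With only 30 observations the interval is wide, which is the price of a small sample.
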
An example: Class Roll
• Using sample proportions to estimate a population proportion
• Take a sample of 30 students (needs to be at least 30 for proportions) and see if each is present or absent
• Use the sample to estimate the proportion of students in class
• The proportion, together with the sample size, completely summarises the data.

List of students
Questions based on our sample of data
• Class population size N is actually 200
• What proportion of the randomly selected sample of thirty students was present?
• Use this proportion to estimate how many students are in the class
• Check the number present in class
• What do you notice about the estimate?

It is likely to be good but wrong!

So we might want to create a confidence interval for our estimate.

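Sampling distribution of proportions: back to class!
• Mean: $E(\hat{p}) = p$
• Variance: $\mathrm{Var}(\hat{p}) = \dfrac{p(1-p)}{n}$
• Standard Deviation: $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$

So based on our class roll, what is the 95% confidence interval of the proportion of students attending class?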

Example: Class Roll
• The 95% CI for the proportion is given by $\hat{p} \pm z_{\alpha/2} \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
• What is $z_{\alpha/2}$?
• What is the sample size $n$ for the sample class roll?
• What is the estimate, $\hat{p}$, of the proportion of students attending lectures?
• Note this is just one sample of the many that could be taken.
Example: CI proportion in Class
• Hence the 95% confidence interval for the proportion of students attending class on Thursdays runs from the lower limit $\hat{p} - z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$ to the upper limit $\hat{p} + z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$, using the values from the class roll.
• This is simply one of a number of intervals that could be calculated. 95% of the time we would expect the many intervals to encompass the true proportion of the population attending on that day.
An example: From Griffiths p.276
• The Librarian wishes to select texts over 20 years of age from a library collection of 480,000 references and place these in an archive. The library wishes to estimate the proportion of the collection which will be placed into the archives. How will we do this?
• Let us take a sample of 100 references
• Use this sample to estimate the proportion of the population that will require archival.
• Represent the categories as 0 (not archived) or 1 (archived)
• Because this is a sample we know that the actual proportion will vary from the 25% obtained from this sample.
• The mean of the sample means is equal to the population mean, $\mu_{\bar{X}} = \mu$
• The standard deviation of the sample means is $\sigma/\sqrt{n}$, where $\sigma$ is the standard deviation of the 0/1 population.
• So for the 0/1 population, where the proportion of 1's is denoted by $p$, the mean of the population is $\mu = p$, estimated from this sample as 0.25
• Using the binomial (where $n = 1$ trial in each experiment), the variance is $\sigma^2 = p(1-p)$

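An example: From Griffiths p.276

How do we obtain the proportion?

Count the number of 1's out of the sample size of 100.

Let us say 25/100

• How do we find the mean for this sample?
Sampling distribution of proportions
• Mean: $E(\hat{p}) = p$
• Variance: $\mathrm{Var}(\hat{p}) = \dfrac{p(1-p)}{n}$
• Standard Deviation: $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$, estimated here by $\sqrt{\dfrac{0.25 \times 0.75}{100}} \approx 0.043$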
The 95% confidence interval for books is

$0.25 \pm 1.96 \sqrt{\dfrac{0.25 \times 0.75}{100}} \approx 0.25 \pm 0.085 = (0.165, 0.335)$

Given this confidence interval, the librarian could be expected to have to house between approximately 79,000 and 161,000 of the 480,000 references.

The estimate is based on one of many possible confidence intervals, but we know that the process provides an interval which encompasses the true proportion 95% of the time.
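
The library calculation can be sketched as follows, using the figures given in the example (a sample proportion of 0.25 from 100 references and a collection of 480,000). The helper `proportion_interval` is an illustrative name, and scaling the interval endpoints by the collection size is the straightforward way to turn a proportion into a count.

```python
import math
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence=0.95):
    """Normal-approximation CI for a population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

COLLECTION_SIZE = 480_000

low, high = proportion_interval(p_hat=0.25, n=100)
print(f"proportion: ({low:.3f}, {high:.3f})")
print(f"references to archive: about {low * COLLECTION_SIZE:,.0f} to {high * COLLECTION_SIZE:,.0f}")
```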

Hypothesis testing: Process manual
• To test the hypothesis that the proportion of students attending class is 75% (samples will vary). A sample of 30 randomly selected students yields a $\hat{p}$ of 0.53. Test the hypothesis at the 5% level of significance (i.e. $\alpha = 0.05$)

Step 1: Specify the hypotheses

Step 2: Set the significance level $\alpha$

Step 3: Perform the experiment, decide on the statistic, say Z or T, and calculate

Step 4: Determine the rejection region

Step 5: Retain or reject $H_0$ and draw conclusions
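
The five steps can be sketched for the attendance example. This assumes a two-sided alternative ($H_0: p = 0.75$ against $H_1: p \neq 0.75$) and a Z statistic whose standard error is computed under $H_0$; the slide does not state the alternative, so treat both choices, and the function name `one_proportion_z_test`, as assumptions.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(p_hat, p0, n, alpha=0.05):
    """Two-sided Z test of H0: p = p0 using the normal approximation."""
    se = math.sqrt(p0 * (1 - p0) / n)           # Step 3: standard error under H0
    z = (p_hat - p0) / se                       # Step 3: test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # Step 4: reject if |z| > z_crit
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, z_crit, p_value, abs(z) > z_crit  # Step 5: decision

z, z_crit, p_value, reject = one_proportion_z_test(p_hat=0.53, p0=0.75, n=30)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject else "Retain H0")
```

Here $|z|$ exceeds 1.96, so under these assumptions the sample gives evidence that attendance differs from 75%.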