Goodness Of Fit


# Goodness Of Fit - PowerPoint PPT Presentation


## PowerPoint Slideshow about 'Goodness Of Fit' - helki

Presentation Transcript

Goodness Of Fit

The purpose of a chi-square goodness-of-fit test is to compare an observed distribution to an expected distribution.

For example, suppose there are four entrances to a building. You want to know if the four entrances are equally used. You observe a random sample of 400 people entering the building:

H0: pM = pB = pS1 = pS2

H1: The proportions are not all equal.

If the entrances are equally utilized, we would expect each entrance to be used approximately 25% of the time. Is the difference shown above statistically significant?

Chi Square Test

If the observed frequencies are obtained from a random sample and each expected frequency is at least 5, the sampling distribution for the goodness-of-fit test is a chi-square distribution with k - 1 degrees of freedom, where k is the number of categories.

Test Statistic

χ² = Σ (O - E)² / E, summed over all categories

O = observed frequency in each category

E = expected frequency in each category
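The statistic is the sum, over all categories, of the squared difference between observed and expected frequency divided by the expected frequency. A minimal Python sketch follows; the function name and the three-category counts are hypothetical and are not taken from the slides.

```python
def chi_square_statistic(observed, expected):
    """Return the sum over categories of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (illustration only).
observed = [18, 22, 20]
expected = [20, 20, 20]

statistic = chi_square_statistic(observed, expected)
print(round(statistic, 2))
```

Categories whose observed count is far from the expected count contribute the most to the total, which is why a large statistic is evidence against H0.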

Goodness-of-Fit Test: Equal Expected Frequencies

Let fo and fe be the observed and expected frequencies, respectively.

H0: There is no difference between the observed and expected frequencies.

H1: There is a difference between the observed and the expected frequencies.

H0: p1 = p2 = p3 = p4

H1: The proportions are not all equal.

EXAMPLE

The following information shows the number of employees absent by day of the week at a large manufacturing plant. At the .05 level of significance, is there a difference in the absence rate by day of the week?

| Day | Frequency |
|---|---|
| Monday | 120 |
| Tuesday | 45 |
| Wednesday | 60 |
| Thursday | 90 |
| Friday | 130 |
| Total | 445 |

EXAMPLE continued

The expected frequency is:

(120+45+60+90+130)/5=89.

The degrees of freedom is (5-1)=4.

The critical value is 9.488. (Appendix B, P.495)

Example continued

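| Day | Freq. (fo) | Expec. (fe) | (fo - fe)²/fe |
|---|---|---|---|
| Monday | 120 | 89 | 10.80 |
| Tuesday | 45 | 89 | 21.75 |
| Wednesday | 60 | 89 | 9.45 |
| Thursday | 90 | 89 | 0.01 |
| Friday | 130 | 89 | 18.89 |
| Total | 445 | 445 | 60.90 |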

Because the computed value of chi-square is greater than the critical value, H0 is rejected.

We conclude that there is a difference in the number of workers absent by day of the week.
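The absence example can be reproduced with a few lines of Python. This is a sketch using only the standard library: the critical value 9.488 is the one quoted in the slides, and the helper `chi2_sf_df4` is a hypothetical name for the closed-form upper-tail probability of a chi-square distribution with exactly 4 degrees of freedom, exp(-x/2)(1 + x/2).

```python
import math

# Observed absences by day, as given in the example.
observed = {"Monday": 120, "Tuesday": 45, "Wednesday": 60,
            "Thursday": 90, "Friday": 130}

total = sum(observed.values())   # 445
k = len(observed)                # 5 categories
expected = total / k             # equal expected frequency: 89
df = k - 1                       # 4 degrees of freedom

chi_square = sum((fo - expected) ** 2 / expected for fo in observed.values())


def chi2_sf_df4(x):
    """Upper-tail probability for chi-square with 4 df (valid for df = 4 only)."""
    return math.exp(-x / 2) * (1 + x / 2)


critical_value = 9.488           # .05 level, 4 df, as quoted in the slides
p_value = chi2_sf_df4(chi_square)
reject_h0 = chi_square > critical_value

print(f"chi-square = {chi_square:.2f}, df = {df}, reject H0: {reject_h0}")
```

Comparing the statistic with the critical value and comparing the p-value with .05 lead to the same decision; the tail formula is included only to show where the table value comes from.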

Example: Goodness of Fit

A seller of baseball cards wants to know if the demand for the following 6 cards is the same.

MegaStat

Goodness Of Fit (unequal frequencies)

Example - Goodness Of Fit (unequal frequencies)

The Bank of America (BoA) credit card department knows from national US government records that 5% of all US VISA card holders have no high school diploma, 15% have a high school diploma, 25% have some college, and 55% have a college degree. Given the information below, at the 1% level of significance can we conclude that BoA card holders are significantly different from the rest of the nation?

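Expected frequencies (fe) for a sample of 500 card holders:

No high school diploma: fe = (500)(.05) = 25

High school diploma: fe = (500)(.15) = 75

Some college: fe = (500)(.25) = 125

College degree: fe = (500)(.55) = 275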

Reject H0

df = (4 - 1) = 3
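When the expected frequencies are unequal, each one is the sample size times the hypothesized proportion. The sketch below recomputes them for the card-holder example, taking 500 as the sample size from the multiplications shown. The observed counts are not reproduced in this transcript, so the statistic itself is not computed. The helper `chi2_sf_df3` is a hypothetical name for the closed-form upper-tail probability with exactly 3 degrees of freedom, and 11.345 is the standard table value for the 1% level, which the slides do not quote.

```python
import math

n = 500  # sample size implied by the (500)(.05) style calculations
national_percent = {
    "No high school diploma": 5,
    "High school diploma": 15,
    "Some college": 25,
    "College degree": 55,
}

# Expected frequency = sample size x hypothesized proportion.
expected = {level: n * pct / 100 for level, pct in national_percent.items()}
df = len(expected) - 1  # 4 categories, so 3 degrees of freedom


def chi2_sf_df3(x):
    """Upper-tail probability for chi-square with 3 df (valid for df = 3 only)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


critical_value = 11.345  # standard table value for alpha = .01, 3 df (assumption)

for level, fe in expected.items():
    print(f"{level}: fe = {fe:.0f}")
print(f"df = {df}, tail area beyond {critical_value}: {chi2_sf_df3(critical_value):.4f}")
```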

Limitations of Chi-Square

1.) If there are only 2 cells, the expected frequency in each cell should be at least 5.

2.) For more than 2 cells, chi-square should not be used if more than 20% of the cells have expected frequencies (fe) less than 5.
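The two rules of thumb above can be checked mechanically before running the test. This is a sketch under the assumption that "more than 20%" means exactly 20% is still acceptable; the function name and the sample lists are hypothetical.

```python
def expected_counts_ok(expected, minimum=5, max_small_share=0.20):
    """Check a list of expected frequencies against the two rules of thumb.

    - Exactly 2 cells: every expected frequency must be at least `minimum`.
    - More than 2 cells: at most `max_small_share` of the cells may have
      an expected frequency below `minimum`.
    """
    small = sum(1 for fe in expected if fe < minimum)
    if len(expected) == 2:
        return small == 0
    return small / len(expected) <= max_small_share


print(expected_counts_ok([25, 75, 125, 275]))   # all cells large
print(expected_counts_ok([3, 20]))              # 2 cells, one below 5
print(expected_counts_ok([4, 4, 10, 10, 10]))   # 40% of cells below 5
```

When the check fails, a common remedy is to combine adjacent categories or collect more observations so that the expected frequencies grow.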

Two-thirds of the computed chi-square value is accounted for by just two categories (outcomes). Although the expected frequency is not less than 5, too much weight may be given to these categories. More experimental trials should be conducted to increase the number of observations.

MegaStat

Independence & Contingency Tables

Contingency Table Analysis

A contingency table is used to investigate whether two traits or characteristics are related. Each observation is classified according to two criteria.

The degrees of freedom is equal to:

df = (# rows - 1)(# columns - 1).

The expected frequency is computed as:

Expected Frequency = (row total)(column total)/Grand Total
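Both formulas apply cell by cell to any table of counts. The sketch below computes the expected frequencies and the degrees of freedom for a small table; the 2 x 3 counts are hypothetical and are not the accident data from the example that follows.

```python
def expected_frequencies(observed):
    """Expected count for each cell: (row total)(column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def degrees_of_freedom(observed):
    """df = (# rows - 1)(# columns - 1)."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# Hypothetical 2 x 3 table of counts (illustration only).
observed = [
    [10, 20, 30],
    [20, 20, 20],
]

expected = expected_frequencies(observed)
df = degrees_of_freedom(observed)

for row in expected:
    print(row)
print("df =", df)
```

The chi-square statistic is then the sum of (fo - fe)²/fe over every cell, exactly as in the goodness-of-fit test.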

EXAMPLE

Is there a relationship between the location of an accident and the gender of the person involved in the accident? A sample of 150 accidents reported to the police was classified by location and gender. At the .05 level of significance, can we conclude that gender and the location of the accident are related?

EXAMPLE continued

The expected frequency for the work-male intersection is computed as (90)(80)/150=48. Similarly, you can compute the expected frequencies for the other cells.

H0: Gender and location are not related.

H1: Gender and location are related.

EXAMPLE continued

H0 is rejected if the computed value of χ² is greater than 5.991. There are (3 - 1)(2 - 1) = 2 degrees of freedom.
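With 2 degrees of freedom the critical value has a simple closed form, because the upper-tail probability of the chi-square distribution is then exp(-x/2). The sketch below recovers the table value from that fact and wraps the decision rule in a function; the function name and the two trial statistics passed to it are hypothetical, and the shortcut does not hold for other degrees of freedom.

```python
import math

alpha = 0.05
df = (3 - 1) * (2 - 1)  # 3 rows (or columns) by 2, so 2 degrees of freedom

# For df = 2 only: P(chi-square > x) = exp(-x / 2), so x = -2 ln(alpha).
critical_value = -2 * math.log(alpha)


def reject_h0(chi_square, critical=critical_value):
    """Decision rule: reject H0 when the statistic exceeds the critical value."""
    return chi_square > critical


print(f"critical value = {critical_value:.3f}")
print(reject_h0(6.0), reject_h0(5.0))  # hypothetical statistics on either side
```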

Find the value of χ².

H0 is rejected. We conclude that gender and location are related.

MegaStat Example: Contingency Tables

A crime agency wants to know if a male released from prison and returned to his hometown has an easier (or more difficult) time adjusting to civilian life.

MegaStat