13.1 The Chi-square Goodness-of-Fit test

Agenda for 3/31 and 4/1 • Extended Test Time – 20 minutes • Introduction of X² • Activity with X² • Discussion of Homework

X² – Chi (pronounced "ki") • The chi-square test is a statistical test used to decide whether two or more populations, variables, or characteristics have the same distribution. • It does not matter what the distributions of the populations are, so long as the relative frequencies are known for each population, or for the population and some standard (reference) population.

Three types: 1st – X² Goodness of Fit • Goodness of fit – used to test whether a population distribution is the same as the reference distribution stated in the null hypothesis. • (ex: is the company's claim actually true?)

2nd – X² Test of Homogeneity • Homogeneity – an overall test that tells us whether the data give good evidence that the distribution of a categorical variable is the same in multiple populations. • (ex: are the colors in a roll of Smarties evenly distributed?)

3rd – X² Test of Association/Independence • Test for Independence – used to test the association/independence between two categorical variables. • (ex: is there a relationship between stress and heart attacks?)

Problem: The defects found in a new thermometer are classified by the following defect types, with the expected percent of each type obtained from historical statistics for an older model thermometer: • State your Ho and Ha. • Ho? The distribution of defects for the new thermometer is the same as the distribution for the old thermometer. • Ha? The distribution of defects for the new thermometer is different from the distribution for the old thermometer.

You may use this test when all individual EXPECTED counts are at least 1 and no more than 20% of the expected counts are less than 5. • So how do we check? E = np (sample size × proportion for each category) • .87(1336) = 1162.32 • .09(1336) = 120.24 • .03(1336) = 40.08 • .01(1336) = 13.36 • All expected counts are greater than 5, therefore we may use a chi-square procedure.
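The condition check above can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the proportions .87, .09, .03, .01 are an assumption, chosen because they sum to 1 and reproduce the 1162.32 expected count shown on the slide.

```python
def expected_counts(n, proportions):
    """Return the expected count E = n * p for each category."""
    return [n * p for p in proportions]


def conditions_met(expected):
    """True when every expected count is at least 1 and
    no more than 20% of the expected counts are below 5."""
    share_small = sum(e < 5 for e in expected) / len(expected)
    return min(expected) >= 1 and share_small <= 0.20


n = 1336                                 # sample size from the slide
proportions = [0.87, 0.09, 0.03, 0.01]   # assumed proportions (see note above)

expected = expected_counts(n, proportions)
for p, e in zip(proportions, expected):
    print(f"{p:.2f} x {n} = {e:.2f}")
print("Conditions met:", conditions_met(expected))
```

Because the rule is written as a function, the same check can be reused for any other set of expected counts in this section.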

Step 3. Determine or compute the expected frequencies:

X² = 9.72 with df = 3 (4 categories – 1); from the table, .02 < p < .025. Conclusion or inference: • Since p < .05, Ho (the null hypothesis) is rejected. • The defect distribution of the new thermometer seems to be different from that of the old thermometer.

Color distribution of M&M’s • Blue 24% • Brown 13% • Green 16% • Orange 20% • Red 13% • Yellow 14% • HW: RTN 13.2 pp744-766 Do #’s 14,16,20,22

Homework 13.1 p. 736 • a) X² = 1.41, df = 1: the p-value is between .20 and .25, which can be written .20 < p < .25 • b) X² = 19.62, df = 9: .02 < p < .025 • c) X² = 7.04, df = 6: the statistic is smaller than the smallest table entry (off the chart to the left), therefore p > .25
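Table lookups like these can be checked by computing the upper-tail area directly. The sketch below uses only the Python standard library; `chi2_sf` is a hypothetical helper name, and it relies on the standard recurrence for the upper incomplete gamma function rather than on any particular statistics package.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X^2 >= x) for a chi-square
    distribution with df degrees of freedom (df a positive integer)."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # df = 2 starting value
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # df = 1 starting value
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1.0
    return q


# The three (statistic, df) pairs from the homework problem
for stat, df in [(1.41, 1), (19.62, 9), (7.04, 6)]:
    print(f"X^2 = {stat}, df = {df}: p = {chi2_sf(stat, df):.4f}")
```

Each printed p-value can then be compared with the bracket read from the table.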

Homework #13.2 Are you married?

Step 1 – State the Ho and Ha • Ho: The marital-status distribution of 25-29 year old males is the same as that of the population as a whole. • Ha: The marital-status distribution of 25-29 year old males is different from that of the population as a whole.

Step 2 – Choose the appropriate test and Check Conditions • We can use a goodness of fit test to measure the strength of evidence against the hypothesized distribution (marital status) provided all expected counts are greater than 5. • Expected counts (np) are 140.5, 281.5, 32, and 46, all greater than 5. • Therefore we can proceed with the test.

Step 3 – Carry out the Inference procedure

Step 4 – Interpret results in the context of the problem • Since X² = 161.77 with df = 3, our p-value is off the chart to the right and essentially 0. • With an alpha level of 5% or even 1%, this is strong evidence to reject Ho and conclude that the distribution of marital status among 25-29 year old males is different from that of the population as a whole.

Is your random number generator working?

Step 1 – State the Ho and Ha • Ho: p0 = p1 = … = p9 = .1 (one proportion for each digit 0 through 9) • Ha: At least one of the p's is not equal to .1 • You are looking for a uniform distribution, therefore all the proportions are equal.

Step 2 – Choose the appropriate test and Check Conditions • We can use a goodness of fit test to measure the strength of evidence against the hypothesized distribution (the claim is that the proportions are all = .1) provided all expected counts are greater than 5. • Expected counts (np) are .1 × 200 = 20, all > 5 • Therefore we can proceed with the test.

Step 3 – Carry out the Inference procedure • RUN SIMULATION (set the seed to 123 to get my values): 123 → rand, then randInt(0,9,200) → store in a list

Step 4 – Interpret results in the context of the problem • Since X² = 13.2 with df = 9, our p-value is .15 < p < .20. • With an alpha level of 5%, this is not sufficient evidence to reject the null hypothesis; we have no evidence that the sample data were generated from a distribution different from the uniform distribution.
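The same simulation can be sketched in Python as a rough stand-in for the calculator steps. Note the assumptions: Python's generator is not the TI generator, so seeding with 123 will not reproduce the calculator's digits or its X² value, and the 5% critical value 16.92 for df = 9 is taken from a standard chi-square table.

```python
import random
from collections import Counter

rng = random.Random(123)                           # seed echoes the slide; output differs from the TI
digits = [rng.randint(0, 9) for _ in range(200)]   # inclusive endpoints, like randInt(0,9,200)

counts = Counter(digits)
observed = [counts[d] for d in range(10)]          # one count per digit 0..9
expected = len(digits) / 10                        # uniform: 200 / 10 = 20 per digit

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi_sq = sum((o - expected) ** 2 / expected for o in observed)

CRITICAL_05_DF9 = 16.92                            # table value for alpha = .05, df = 9
print("Observed counts:", observed)
print("X^2 =", round(chi_sq, 2))
print("Reject Ho at the 5% level:", chi_sq > CRITICAL_05_DF9)
```

Changing the seed gives a different sample each time, which is a quick way to see how much X² varies even when the generator really is uniform.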

Carnival Games

Run the test

What were we thinking? • Ho: The carnival wheel is balanced and all 4 parts are equally likely. • Ha: The carnival wheel is not balanced; the 4 parts are NOT all equally likely. • Since all expected values are greater than 5 (E = 125) we can use the X² test. • Running the test gave us an X² of 24 with 3 df and p < .0005. • We have sufficient evidence to reject the null and claim that the wheel is not balanced. • Where is the most significant X²?
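One way to answer the last question is to look at each category's individual term (O – E)²/E. The sketch below is illustrative only: the observed counts are HYPOTHETICAL, made up so that they total 500 and give X² = 24, because the actual counts from the activity are not reproduced in these notes. The function name is also hypothetical.

```python
def chi_square_terms(observed, expected):
    """Return each category's contribution (O - E)^2 / E."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


observed = [165, 95, 105, 135]   # HYPOTHETICAL counts, not the class data
expected = [125] * 4             # 500 spins, 4 equally likely parts

terms = chi_square_terms(observed, expected)
chi_sq = sum(terms)
largest = max(range(len(terms)), key=terms.__getitem__)

for part, term in enumerate(terms, start=1):
    print(f"Part {part}: contribution = {term:.2f}")
print(f"X^2 = {chi_sq:.2f}; largest contribution comes from part {largest + 1}")
```

The category with the largest term is the one whose observed count is farthest from what a balanced wheel would produce.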
