Chris Morgan, MATH G160 csmorgan@purdue.edu April 13, 2012 Lecture 30

Chris Morgan, MATH G160

csmorgan@purdue.edu

April 13, 2012

Lecture 30

Chapter 2.4:

Chi-Squared (χ²) Test and Independence between two Categorical Variables

Two-Way Tables
• A two-way table organizes counts by two categorical variables (one for the rows, one for the columns), which helps you find conditional, joint, and marginal probabilities
• Expected counts: the count we would expect to see in any cell of a two-way table if the null hypothesis were true
• The null hypothesis is the default claim we test against the data; for a two-way table, it says the two variables have no association
Example 1a:
• Above is a sample of students in the College of Business. They were asked their chosen major and their sex.
• 1. What is the probability that a student is a Finance Major?
• 2. What is the probability that a student is Female?
Example 1b:
• Above is a sample of students in the College of Business. They were asked their chosen major and their sex.
• 3. What is the probability that a student is female given that the person is in Administration?
Example 1c:
• Above is a sample of students in the College of Business. They were asked their chosen major and their sex.
• 4. What is the probability that a student is an Administration major given that the student is female?
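The four questions in Example 1 can be sketched in Python. The lecture's table is not reproduced in this transcript, so every count below is made up for illustration, and the majors other than Finance and Administration are assumptions as well.

```python
# Hypothetical counts, NOT the lecture's table: rows are sex, columns are major.
counts = {
    "Female": {"Accounting": 30, "Administration": 40, "Economics": 10, "Finance": 20},
    "Male":   {"Accounting": 30, "Administration": 20, "Economics": 20, "Finance": 30},
}

def grand_total(table):
    """Total number of students in the table."""
    return sum(sum(row.values()) for row in table.values())

def row_total(table, row):
    """Marginal count for one row (one sex)."""
    return sum(table[row].values())

def col_total(table, col):
    """Marginal count for one column (one major)."""
    return sum(row[col] for row in table.values())

n = grand_total(counts)

# 1. Marginal probability: P(Finance) = column total / grand total
p_finance = col_total(counts, "Finance") / n

# 2. Marginal probability: P(Female) = row total / grand total
p_female = row_total(counts, "Female") / n

# 3. Conditional probability: P(Female | Administration) = cell / column total
p_female_given_admin = counts["Female"]["Administration"] / col_total(counts, "Administration")

# 4. Conditional probability: P(Administration | Female) = cell / row total
p_admin_given_female = counts["Female"]["Administration"] / row_total(counts, "Female")

print(p_finance, p_female, round(p_female_given_admin, 4), p_admin_given_female)
```

Questions 3 and 4 use the same cell count but divide by different totals, which is why the two conditional probabilities are generally not equal.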
Hypothesis Testing
• Our null hypothesis is what we expect to see given no interaction between variables
• Our alternative hypothesis is some improvement or change on the null hypothesis
• Never say “accept the Ho” or “accept the Ha”
• Always say “reject the Ho” or “fail to reject the Ho”
• Why?
• For the chi-square test:
• Ho: there is no association between two categorical variables, and we conclude they’re independent
• Ha: there is an association between two categorical variables, and we conclude there is a relationship
Calculating a Chi-Squared Statistic
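• Denoted χ²
• The observed count is whatever value we see in the table
• The expected count for each cell in the table can be found by taking: expected count = (row total × column total) / n, where n is the total number of observations in the table
• The statistic adds up one term for every cell in the table: χ² = Σ (observed − expected)² / expected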

Note: We can safely use the χ² test under two important conditions:

1. when no more than 20% of the expected counts are less than five

2. when all individual expected counts are one or greater
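As a minimal sketch of the calculation and the two conditions in the note, the Python below works on a table stored as a list of rows. The function names and the 2×4 counts are assumptions made for illustration; they are not the lecture's data.

```python
def expected_counts(observed):
    """Expected count per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))

def conditions_met(expected):
    """True if at most 20% of expected counts are below 5 and none is below 1."""
    cells = [e for row in expected for e in row]
    share_below_five = sum(e < 5 for e in cells) / len(cells)
    return share_below_five <= 0.20 and min(cells) >= 1

# Hypothetical observed counts (2 rows, 4 columns).
observed = [[30, 40, 10, 20],
            [30, 20, 20, 30]]

expected = expected_counts(observed)
print(expected)
print(round(chi_squared(observed, expected), 3))
print(conditions_met(expected))
```

Both conditions are checked on the expected counts, not the observed ones, so they can be verified before the statistic is computed.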

Interpreting a Chi-Squared Test
• I can compare the calculated chi-square test-statistic to a critical value to see if my variables do in fact have a relationship
• We will denote the test statistic as χ²* and the critical value as χ²α, (r-1)(c-1) where r is the number of rows, c is the number of columns, and the degrees of freedom are df = (r-1)(c-1). I can then look up the critical value in the table (see next slide) using the alpha level and df
• If: | χ²*| > χ²α, (r-1)(c-1)

…then we reject the null hypothesis: the observed counts are far enough from the expected counts that the result is significant, and we conclude there is a relationship between the two variables

• If: | χ²*| ≤ χ²α, (r-1)(c-1)

…then we fail to reject the null hypothesis: we do not have enough evidence of a relationship, so the data are consistent with the two variables being independent
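The decision rule can be written as two small Python functions. This is only a sketch: the function names are assumptions, the test statistic 9.2 is a made-up value, and 7.815 is the standard table value for alpha = 0.05 with df = 3.

```python
def degrees_of_freedom(n_rows, n_cols):
    """df = (r-1)(c-1) for a two-way table with r rows and c columns."""
    return (n_rows - 1) * (n_cols - 1)

def decision(test_statistic, critical_value):
    """Compare the test statistic to the critical value; never 'accept'."""
    if test_statistic > critical_value:
        return "reject the Ho"
    return "fail to reject the Ho"

df = degrees_of_freedom(2, 4)   # a table with 2 rows and 4 columns
critical_value = 7.815          # standard table value for alpha = 0.05, df = 3

print(df)
print(decision(9.2, critical_value))    # hypothetical statistic above the critical value
print(decision(7.815, critical_value))  # a tie counts as "fail to reject"
```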

Chi-Square (χ²) Distribution Critical Values

The first row is the alpha level

The first column is the number of df
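The table of critical values is not reproduced in this transcript. As a stand-in, the sketch below computes them with only the Python standard library, using the closed-form upper-tail probability of the chi-square distribution for whole-number df and a simple bisection search. The function names `chi2_sf` and `chi2_critical` are assumptions made for this example, not part of the lecture.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2))
    #         + sqrt(2x/pi) * exp(-x/2) * sum_{r=1}^{(df-1)/2} x^(r-1) / (1*3*...*(2r-1))
    term, total = 1.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total)

def chi2_critical(alpha, df):
    """Value c with P(X > c) = alpha, found by bisection (assumes 0 < alpha < 1)."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:   # widen the bracket until the tail is small enough
        hi *= 2
    for _ in range(100):             # the tail probability decreases as x grows
        mid = (lo + hi) / 2
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Print one row of critical values per df, mirroring the table layout:
# alpha levels across the top, df down the side.
for df in range(1, 7):
    print(df, [round(chi2_critical(a, df), 3) for a in (0.10, 0.05, 0.01)])
```

For alpha = 0.05 and df = 3 this gives approximately 7.815, which matches the usual printed tables.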

Example 2a:
• Returning to example one, is there a relationship between gender and major?
• Find expected counts
• Compare expected counts to observed counts
• Calculate χ²
• Compare chi-square test statistic (χ²*) to chi-square critical value (χ²α, (r-1)(c-1) )
Example 2b: Fill in expected counts

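Recall the equation for expected counts: expected count = (row total × column total) / n, where n is the total number of observations in the table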

Example 2c: Calculate χ²

Recall the equation for chi-square: χ² = Σ (observed − expected)² / expected, with one term for every cell in the table

Example 2d: Calculate χ²

Recall the equation for chi-square: χ² = Σ (observed − expected)² / expected, with one term for every cell in the table

Now we just have to add them all together:

and compare the chi-square value to the critical value…
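To illustrate "add them all together", the sketch below computes the contribution of each cell and then sums the contributions. The counts are hypothetical (2 rows for sex, 4 columns for major), chosen only so the layout has df = 3; they are not the lecture's table.

```python
# Hypothetical observed counts, NOT the lecture's data.
observed = [[30, 40, 10, 20],
            [30, 20, 20, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# One contribution per cell: (observed - expected)^2 / expected
contributions = []
for i, row in enumerate(observed):
    contrib_row = []
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n   # expected count for this cell
        contrib_row.append((o - e) ** 2 / e)
    contributions.append(contrib_row)

# The test statistic is the sum of all the cell contributions.
chi2_stat = sum(sum(row) for row in contributions)

for row in contributions:
    print([round(c, 3) for c in row])
print("chi-square test statistic:", round(chi2_stat, 3))
```

Looking at the individual contributions shows which cells sit farthest from their expected counts and therefore drive the statistic.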

Example 2e: Is χ² significant?

To compare the chi-square value to the critical value I look up in the table the value for the chi-squared critical value when alpha = 0.05 and df = 3:

Therefore, since the absolute value of the test statistic is less than or equal to the critical value we (circle one):

reject the Ho fail to reject the Ho accept the Ho accept the Ha

And conclude… what?

Example 3a:
• Is there a relationship between favorite soda and favorite ice cream?
• Find expected counts
• Compare expected counts to observed counts
• Calculate χ²
• Compare chi-square test statistic (χ²*) to chi-square critical value (χ²α, (r-1)(c-1) )
Example 3b: Fill in expected counts

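Recall the equation for expected counts: expected count = (row total × column total) / n, where n is the total number of observations in the table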

Example 3c: Calculate χ²

Recall the equation for chi-square: χ² = Σ (observed − expected)² / expected, with one term for every cell in the table

Example 3d: Calculate χ²

Recall the equation for chi-square: χ² = Σ (observed − expected)² / expected, with one term for every cell in the table

Now we just have to add them all together:

and compare the chi-square value to the critical value…

Example 3e: Is χ² significant?

To compare the chi-square value to the critical value I look up in the table the value for the chi-squared critical value when alpha = 0.05 and df = ____:

Therefore, since the absolute value of the test statistic is less than or equal to the critical value we (circle one):

reject the Ho fail to reject the Ho accept the Ho accept the Ha

And conclude… what?

To review:

When calculating a chi-squared value:

1. Find expected counts

2. Compare expected counts to observed counts

3. Calculate a χ² test statistic

4. Compare test statistic to critical value using table

5. Make a conclusion

If | χ²*| > χ²α, (r-1)(c-1) REJECT THE NULL: relationship exists

If | χ²*| ≤ χ²α, (r-1)(c-1) FAIL TO REJECT THE NULL: no evidence of a relationship, consistent with independence

NEVER SAY ACCEPT THE NULL!!!!
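The five review steps can be put together in one short Python function. This is a sketch under stated assumptions: the soda and ice cream counts are invented (3 sodas by 3 ice cream flavors), the function name `chi_square_test` is hypothetical, and the lookup holds the standard table values for alpha = 0.05 only.

```python
# Standard chi-square critical values at alpha = 0.05, keyed by df.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592}

def chi_square_test(observed):
    """Run the five review steps on a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Step 1: find expected counts
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Steps 2-3: compare observed to expected and sum into the test statistic
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))

    # Step 4: look up the critical value using df = (r-1)(c-1)
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    critical = CRITICAL_05[df]

    # Step 5: make a conclusion (never "accept")
    conclusion = "reject the Ho" if stat > critical else "fail to reject the Ho"
    return stat, df, critical, conclusion

# Hypothetical counts: rows are favorite soda, columns are favorite ice cream.
observed = [[20, 15, 15],
            [15, 20, 15],
            [15, 15, 20]]

stat, df, critical, conclusion = chi_square_test(observed)
print(round(stat, 3), df, critical, conclusion)
```

With these invented counts the statistic falls below the critical value for df = 4, so the wording of the conclusion is "fail to reject the Ho" and not "accept".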