Chi-Square

Chi-Square. Heibatollah Baghi and Mastee Badii. Different Scales, Different Measures of Association. Chi-Square (χ²) and Frequency Data.


Presentation Transcript


  1. Chi-Square. Heibatollah Baghi and Mastee Badii

  2. Different Scales, Different Measures of Association

  3. Chi-Square (χ²) and Frequency Data • Up to this point, inference to the population has concerned “scores” on one or more variables, such as CAT scores, mathematics achievement, and hours spent on the computer. • We used these scores to make inferences about population means. To be sure, not all research questions involve score data. • Today the data we analyze consist of frequencies, that is, the number of individuals falling into categories. In other words, the variables are measured on a nominal scale. • The test statistic for frequency data is the Pearson chi-square. Its magnitude reflects the amount of discrepancy between observed frequencies and expected frequencies.

  4. Steps in Test of Hypothesis • Determine the appropriate test • Establish the level of significance: α • Formulate the statistical hypothesis • Calculate the test statistic • Determine the degrees of freedom • Compare the computed test statistic against a tabled/critical value

  5. 1. Determine Appropriate Test • Chi-square is used when both variables are measured on a nominal scale. • It can be applied to interval or ratio data that have been categorized into a small number of groups. • It assumes that the observations are randomly sampled from the population. • All observations are independent (an individual can appear only once in a table and there are no overlapping categories). • It makes no assumptions about the shape of the distribution or about the homogeneity of variances.

  6. 2. Establish Level of Significance • α is a predetermined value • Conventional values: α = .05, α = .01, α = .001

  7. 3. Determine the Hypothesis: Whether There Is an Association or Not • Ho: The two variables are independent • Ha: The two variables are associated

  8. 4. Calculating Test Statistics • The test contrasts the observed frequencies in each cell of a contingency table with the expected frequencies. • The expected frequencies represent the number of cases that would be found in each cell if the null hypothesis were true (i.e., the nominal variables are unrelated). • The expected frequency for a cell, if the two variables are unrelated, is the product of its row frequency and column frequency divided by the number of cases: Fe = (Fr × Fc) / N
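
The expected-frequency rule on this slide can be sketched in Python. This is a minimal illustration, not the presenters' code: the function name `expected_frequencies` and the 2 by 2 counts are assumptions made up for the example.

```python
def expected_frequencies(observed):
    """Return Fe = Fr * Fc / N for every cell of a contingency table.

    `observed` is a list of rows, each row a list of cell counts.
    """
    row_totals = [sum(row) for row in observed]          # Fr for each row
    col_totals = [sum(col) for col in zip(*observed)]    # Fc for each column
    n = sum(row_totals)                                  # N, total number of cases
    return [[fr * fc / n for fc in col_totals] for fr in row_totals]


# Hypothetical 2 x 2 table (made-up counts, not from the slides).
table = [[20, 30],
         [30, 20]]
expected = expected_frequencies(table)
```

With equal row and column totals of 50 and N = 100, every expected count in this made-up table is 50 × 50 / 100 = 25.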

  9. Continued 4. Calculating Test Statistics

  10. Continued 4. Calculating Test Statistics [Formula: χ² = Σ (Fo − Fe)² / Fe, where Fo = observed frequencies and Fe = expected frequency]
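
The Pearson statistic sums (observed − expected)² / expected over all cells of the table. A minimal Python sketch follows; the function name `pearson_chi_square` and the counts are illustrative assumptions, not taken from the slides.

```python
def pearson_chi_square(observed, expected):
    """Sum (Fo - Fe)^2 / Fe over every cell of the table."""
    return sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )


# Hypothetical counts (made up for illustration only).
observed = [[20, 30],
            [30, 20]]
expected = [[25.0, 25.0],
            [25.0, 25.0]]
chi2 = pearson_chi_square(observed, expected)
```

Each cell here differs from its expected count by 5, so each contributes 25 / 25 = 1 and the statistic is 4.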

  11. 5. Determine Degrees of Freedom df = (R − 1)(C − 1), where R = number of levels in the row variable and C = number of levels in the column variable

  12. 6. Compare computed test statistic against a tabled/critical value • The computed value of the Pearson chi-square statistic is compared with the critical value to determine whether the computed value is improbable under the null hypothesis • The critical tabled values are based on the sampling distribution of the Pearson chi-square statistic • If the calculated χ² is greater than the tabled χ² value, reject Ho
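
The degrees-of-freedom rule and the comparison against a tabled value can be sketched as below. The helper names are hypothetical, and the lookup holds only the α = .05 critical values for df 1 through 4 from a standard chi-square table; a fuller table or a statistics library would be needed beyond that.

```python
# Critical chi-square values at alpha = .05 from a standard table (df 1 to 4 only).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def degrees_of_freedom(n_rows, n_cols):
    """df = (R - 1)(C - 1) for an R x C contingency table."""
    return (n_rows - 1) * (n_cols - 1)


def reject_null(chi2, n_rows, n_cols):
    """Return True when the computed statistic exceeds the tabled value."""
    df = degrees_of_freedom(n_rows, n_cols)
    return chi2 > CRITICAL_05[df]
```

For instance, `reject_null(4.0, 2, 2)` compares a hypothetical statistic of 4.0 with 3.841 (df = 1) and so rejects the null hypothesis, whereas a statistic of 3.0 would not.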

  13. Example • Suppose a researcher is interested in voting preferences on gun control issues. • A questionnaire was developed and sent to a random sample of 90 voters. • The researcher also collects information about the political party membership of the sample of 90 respondents.

  14. Bivariate Frequency Table or Contingency Table

  15. Bivariate Frequency Table or Contingency Table Observed frequencies

  16. Row frequency Bivariate Frequency Table or Contingency Table

  17. Bivariate Frequency Table or Contingency Table Column frequency

  18. 1. Determine Appropriate Test • Party membership (2 levels), nominal • Voting preference (3 levels), nominal

  19. 2. Establish Level of Significance Alpha of .05

  20. 3. Determine the Hypothesis • Ho: There is no difference between Democrats and Republicans in their opinions on the gun control issue. • Ha: There is an association between responses to the gun control survey and party membership in the population.

  21. 4. Calculating Test Statistics

  22. Continued 4. Calculating Test Statistics Expected frequency = 50 × 25 / 90 = 13.89

  23. Continued 4. Calculating Test Statistics Expected frequency = 40 × 25 / 90 = 11.11

  24. Continued 4. Calculating Test Statistics χ² = 11.03

  25. 5. Determine Degrees of Freedom df = (R-1)(C-1) =(2-1)(3-1) = 2

  26. 6. Compare computed test statistic against a tabled/critical value • α = 0.05 • df = 2 • Critical tabled value = 5.991 • Test statistic, 11.03, exceeds critical value • Null hypothesis is rejected • Democrats & Republicans differ significantly in their opinions on gun control issues
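
The six steps of the gun control example can be traced end to end in Python. The contingency table itself did not survive in this transcript, so the cell counts below are an assumption: illustrative values chosen to agree with the figures the slides do state (N = 90, row totals of 50 and 40, a column total of 25, and χ² of about 11.03). The shortcut used for the critical value and the p-value relies on the fact that, for df = 2 only, the upper-tail probability of χ² is exp(−χ²/2).

```python
import math

# ASSUMED counts: rows are the two parties, columns the three preference
# categories. Chosen only to match the margins and statistic on the slides.
observed = [[10, 10, 30],
            [15, 15, 10]]

row_totals = [sum(row) for row in observed]          # [50, 40]
col_totals = [sum(col) for col in zip(*observed)]    # [25, 25, 40]
n = sum(row_totals)                                  # 90

# Step 4: expected frequencies (Fr * Fc / N) and the Pearson statistic.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n
        chi2 += (fo - fe) ** 2 / fe

# Step 5: degrees of freedom.
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Step 6: compare with the critical value. Valid for df = 2 only.
alpha = 0.05
critical = -2 * math.log(alpha)     # about 5.991
p_value = math.exp(-chi2 / 2)
reject = chi2 > critical
```

With these assumed counts the statistic works out to 11.025, which the slides report as 11.03, and it exceeds the critical value of 5.991, so the null hypothesis is rejected.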

  27. SPSS Output for Gun Control Example

  28. Additional Information in SPSS Output • Exceptions that might distort χ² assumptions • Associations in some but not all categories • Low expected frequency per cell • Extent of association is not the same as statistical significance. Demonstrated through an example.

  29. Another Example: Heparin Lock Placement • Time: 1 = 72 hrs, 2 = 96 hrs • From Polit text, Table 8-1

  30. Continued Hypotheses in Heparin Lock Placement • Ho: There is no association between complication incidence and length of heparin lock placement. (The variables are independent). • Ha: There is an association between complication incidence and length of heparin lock placement. (The variables are related).

  31. Continued More of SPSS Output

  32. Pearson Chi-Square • Pearson Chi-Square = .250, p = .617. Since p > .05, we fail to reject the null hypothesis that the complication rate is unrelated to heparin lock placement time. • Continuity correction is used in situations in which the expected frequency for any cell in a 2 by 2 table is less than 10.
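
The reported p-value can be checked by hand for a 2 by 2 table. A chi-square variable with 1 degree of freedom is the square of a standard normal variable, so its upper-tail probability is erfc(√(χ²/2)). The helper name `p_value_df1` is hypothetical, and the sketch applies to df = 1 only.

```python
import math


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Pearson chi-square reported on the slide for the heparin lock example.
p = p_value_df1(0.250)
```

The result is about .617, in line with the SPSS output quoted on the slide.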

  33. Continued More SPSS Output

  34. Phi Coefficient • Pearson Chi-Square provides information about the existence of a relationship between two nominal variables, but not about the magnitude of the relationship • The phi coefficient is a measure of the strength of the association
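
For a 2 by 2 table, the phi coefficient is commonly computed as √(χ² / N). The sketch below uses the χ² of .250 reported for the heparin lock example; the sample size is not given in this transcript, so N = 100 is an assumption made purely for illustration.

```python
import math


def phi_coefficient(chi2, n):
    """Phi = sqrt(chi-square / N) for a 2 x 2 table."""
    return math.sqrt(chi2 / n)


# chi2 = .250 comes from the slides; n = 100 is an ASSUMED sample size.
phi = phi_coefficient(0.250, 100)
```

Under that assumed N, phi is 0.05, a very weak association, which fits the non-significant test result.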

  35. Cramer’s V • When the table is larger than 2 by 2, a different index must be used to measure the strength of the relationship between the variables. One such index is Cramer’s V. • If Cramer’s V is large, it means that there is a tendency for particular categories of the first variable to be associated with particular categories of the second variable.

  36. Cramer’s V (continued) • V = √(χ² / (N(k − 1))), where N = number of cases and k = the smaller of the number of rows or columns
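
Cramer's V is commonly computed as √(χ² / (N(k − 1))), where N is the number of cases and k is the smaller of the number of rows or columns. A minimal Python sketch, applied to the gun control figures stated earlier in the slides (χ² = 11.03, N = 90, a 2 by 3 table); the function name `cramers_v` is hypothetical.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi-square / (N * (k - 1))), k = min(rows, columns)."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (k - 1)))


# Values stated on the slides for the gun control example.
v = cramers_v(11.03, 90, 2, 3)
```

This gives V of roughly 0.35. For a 2 by 2 table k − 1 equals 1, so V reduces to the phi coefficient.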

  37. Take-Home Lesson: How to Test Association between Frequencies of Two Nominal Variables
