
Rating Scale Analysis

Rating Scale Analysis. Michael Glencross Community Agency for Social Enquiry (CASE) UK Stata Users Group Meeting 10 September 2009. Rationale. Attitudes, beliefs, opinions are often measured by means of a set of Likert items


Presentation Transcript


  1. Rating Scale Analysis Michael Glencross Community Agency for Social Enquiry (CASE) UK Stata Users Group Meeting 10 September 2009

  2. Rationale • Attitudes, beliefs, opinions are often measured by means of a set of Likert items • A Likert item is a statement which the respondent is asked to evaluate according to some subjective or objective criteria • Usually the level of agreement or disagreement is measured

  3. Rationale • The format of a typical 5-point Likert item is: • Strongly disagree • Disagree • Neither agree nor disagree • Agree • Strongly agree

  4. Likert Item Rate your level of agreement with the following statement:

  5. Rationale • It is desirable to have a measure of the amount of agreement or disagreement in the sample • This is preferable to making an arbitrary decision

  6. Example 1 • Respondents: Disagree/Undecided/Agree? (1=SD; 2=D; 3=U; 4=A; 5=SA)

  7. Example 2 • Respondents: Disagree/Undecided/Agree? (1=SD; 2=D; 3=U; 4=A; 5=SA)

  8. Example 3 • Respondents: Disagree/Undecided/Agree? (1=SD; 2=D; 3=U; 4=A; 5=SA)

  9. Cooper (1978) • N respondents, r response categories, S total score • Sampling distribution of z is approx standard normal (N large)
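The formula on this slide did not survive in the transcript. The block below is a reconstruction, not a quotation: the 5-point form is read off the Stata code on slide 21, and the general-r form assumes the null hypothesis is a uniform distribution over the r categories (an assumption that reproduces the constants 3 and 2 used in that code).

```latex
% Null: each of the r categories equally likely (assumed)
%   mean = (r+1)/2,   variance = (r^2 - 1)/12
z = \frac{S - N\,\dfrac{r+1}{2}}{\sqrt{N\,\dfrac{r^{2}-1}{12}}}
\qquad\Longrightarrow\qquad
z = \frac{S - 3N}{\sqrt{2N}} \quad \text{for } r = 5 .
```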

  10. Whitney (1978) • N respondents, r response categories, S total score • Sampling distribution of t is approx t with N−1 degrees of freedom (N small)
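The formula for t is also missing from the transcript. The expression below is reconstructed from the Stata code on slide 24 for a 5-point item (midpoint 3). The second equality is ordinary algebra showing it is the one-sample t statistic for the mean rating against 3, where x_i are the individual ratings and s is the sample standard deviation (symbols introduced here for illustration).

```latex
t = \frac{S - 3N}{\sqrt{\dfrac{N\sum_{i=1}^{N} x_i^{2} - S^{2}}{N-1}}}
  = \frac{\bar{x} - 3}{s/\sqrt{N}},
\qquad
S = \sum_{i=1}^{N} x_i,\quad
\bar{x} = \frac{S}{N},\quad
s^{2} = \frac{\sum_{i} x_i^{2} - S^{2}/N}{N-1}.
```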

  11. Hsu (1979) • Calculates the variance (s²) of the N ratings in the sample • This is compared with the variance (σ₀²) of the null distribution of ratings • The ratio (N−1)s²/σ₀² has a distribution that is approximately χ² with N−1 degrees of freedom • For approx normal dist of population ratings,

  12. Hsu • χ² significantly large → heterogeneity of ratings, i.e., disagreement

  13. Hsu • χ² significantly small → homogeneity of ratings, i.e., agreement
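As a hedged reading of how the comparison is made operational, the Stata code on slide 22 converts an upper-tail chi-squared quantile into a critical variance. In the sketch below, the symbols are introduced here for illustration, and the null variance 0.764 is the constant that appears in that code; the slides do not show how it was derived.

```latex
% \chi^{2}_{\alpha,\,N-1}: value with upper-tail probability \alpha
% (Stata: invchi2tail(N-1, \alpha)), \alpha = 0.01 or 0.05
s^{2}_{\mathrm{crit}}(\alpha) = \frac{\sigma_{0}^{2}\,\chi^{2}_{\alpha,\,N-1}}{N-1},
\qquad \sigma_{0}^{2} = 0.764 \ \text{(constant used in the code)} .
```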

  14. Likert.do • If N > 200, calculates Cooper z and displays appropriate message: • Result is significant, p<0.01, i.e., there is strong evidence that the respondents agree with the statement • Result is significant, p<0.05, i.e., there is evidence that the respondents disagree with the statement • Result is not significant, i.e., there is evidence that respondents are undecided about the statement

  15. Likert.do • If N <= 200, calculates Whitney t and displays appropriate message • Result is significant, p<0.01, i.e., there is strong evidence that the respondents disagree with the statement • Result is significant, p<0.05, i.e., there is evidence that the respondents agree with the statement • Result is not significant, i.e., there is evidence that respondents are undecided about the statement

  16. Likert.do • If z or t is not significant, calculates Hsu and displays appropriate message: • The lack of significance is associated with significant (p<0.01) heterogeneity (disagreement) of population ratings • The lack of significance is associated with significant (p<0.05) homogeneity (agreement) of population ratings • The lack of significance is not associated with any significant heterogeneity (disagreement) or homogeneity (agreement) of population ratings

  17. Example 1: Analysis • N=627 • N > 200 so use Cooper z • Mean_c = 2.8070175 • Cooper z = -3.416934 • Result is significant, p<0.01, i.e., there is strong evidence that respondents disagree with the statement

  18. Example 2: Analysis • N=468 • N > 200 so use Cooper z • Mean_c = 3.1346154 • Cooper z = 2.0592194 • Result is significant, p<0.05, i.e., there is evidence that the respondents agree with the statement

  19. Example 3: Analysis • N=542 • N > 200 so use Cooper z • Mean_c = 3.0369004 • Cooper z = .60745674 • Result is not significant, i.e., there is evidence that respondents are undecided about the statement • The lack of significance in Cooper z is not associated with any significant heterogeneity (disagreement) or homogeneity (agreement) of population ratings
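The three analyses above can be checked by hand. The following is a minimal Python sketch, not the author's Stata program. It assumes the uniform-null form of Cooper z (mean 3, variance 2 for a 5-point item, matching the code on slide 21). The total scores S are not quoted in the slides and are inferred here as N × Mean_c rounded to the nearest integer. The symmetric two-sided cut-offs 1.96 and 2.58 are taken from the code, whose "disagree" branches are elided, so treating them as mirror images is an assumption. The names `cooper_z` and `verdict` are hypothetical.

```python
import math


def cooper_z(n, total, r=5):
    """Cooper-style z for n ratings summing to `total` on an r-point item."""
    null_mean = (r + 1) / 2        # 3 for a 5-point item
    null_var = (r * r - 1) / 12    # 2 for a 5-point item
    return (total - n * null_mean) / math.sqrt(n * null_var)


def verdict(z):
    """Read z against the 1.96 / 2.58 cut-offs used in the slides' code."""
    if abs(z) > 2.58:
        level = "p<0.01"
    elif abs(z) > 1.96:
        level = "p<0.05"
    else:
        return "not significant"
    direction = "agree" if z > 0 else "disagree"
    return f"{level}, {direction}"


# (N, S) pairs; S is inferred as round(N * Mean_c), not taken from the slides.
examples = {
    "Example 1": (627, 1760),
    "Example 2": (468, 1467),
    "Example 3": (542, 1646),
}

for name, (n, total) in examples.items():
    z = cooper_z(n, total)
    print(f"{name}: z = {z:.6f} -> {verdict(z)}")
```

Under these assumptions the computed z values agree with those reported on slides 17 to 19, which supports the reading that Mean_c is simply S/N.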

  20. Stata code (1)
      capture program drop likert
      *! likert v1.1 MJ Glencross 13 August 2009
      program define likert, rclass
          version 9.2
          syntax varlist (max=1 numeric)
          quietly summarize `varlist'
          gen N=r(N)
          gen S=r(sum)

  21. Stata code (2)
          if N>200 {
              display "N > 200 so use Cooper z"
              display " Mean_c = " r(mean)
              gen z=(r(sum)-3*N)/sqrt(2*r(N))
              display "Cooper z = " z
              if z>2.58 {
                  display "Result is significant, p<0.01"
                  display "i.e., there is strong evidence that the respondents agree with the statement"
              }
              else if z>1.96 & z<2.58 {
              . . .

  22. Stata code (3)
              . . .
              else{
                  gen chisq01=invchi2tail((r(N)-1),0.01)
                  gen critvar01=(0.764*chisq01)/(r(N)-1)
                  gen chisq05=invchi2tail((r(N)-1),0.05)
                  gen critvar05=(0.764*chisq05)/(r(N)-1)
                  . . .

  23. Stata code (4)
                  . . .
                  if abs(z)<1.96 & critvar01<0.764 {
                      display "The lack of significance in Cooper z is associated with significant (p<0.01) heterogeneity (polarisation/disagreement) of population ratings"
                  }
                  else if abs(z)<1.96 & critvar01>0.764 & critvar05<0.764 {

  24. Stata code (5)
          else {
              display "N <= 200 so use Whitney t"
              display " Mean_t = " r(mean)
              gen isq= `varlist'*`varlist'
              quietly summarize isq
              gen t=(S-3*N)/sqrt((N*r(sum)-S^2)/(N-1))
              display "Whitney t = " t

  25. Stata code (6)
              gen T=ttail((r(N)-1),t)
              if t>0 & T<0.01{
                  display "Result is significant,p<0.01"
                  display "i.e., there is strong evidence that the respondents agree with the statement"
              }
              else if t>0 & T<0.05 & T>0.01 {
              . . .

  26. Stata code (7)
              if T>0.05 & critvar01<0.764 {
                  display "Lack of significance in Whitney t is associated with significant (p<0.01) heterogeneity (polarisation/disagreement) of population ratings"
              }
              . . .
              . . .
          }
      }
      end
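The Whitney t expression in the code above (slide 24) is written in terms of raw sums, which hides that it is an ordinary one-sample t statistic. The following Python sketch, offered as an illustration rather than a port of the author's program, computes it both ways on a small set of made-up ratings. The function names and the data are hypothetical, and the midpoint 3 assumes a 5-point item as in the slides.

```python
import math
import statistics


def whitney_t(ratings, midpoint=3):
    """t from raw sums, mirroring the expression in the slides' Stata code."""
    n = len(ratings)
    s = sum(ratings)
    sum_sq = sum(x * x for x in ratings)
    return (s - midpoint * n) / math.sqrt((n * sum_sq - s ** 2) / (n - 1))


def one_sample_t(ratings, midpoint=3):
    """Textbook one-sample t for the mean rating against the scale midpoint."""
    n = len(ratings)
    mean = statistics.mean(ratings)
    sd = statistics.stdev(ratings)  # sample standard deviation (n - 1 divisor)
    return (mean - midpoint) / (sd / math.sqrt(n))


# Hypothetical ratings on a 1-5 scale (not data from the presentation).
ratings = [4, 5, 3, 4, 2, 5, 4]

print("Whitney t (raw sums):", whitney_t(ratings))
print("One-sample t        :", one_sample_t(ratings))
```

The sketch stops at the statistic. Turning it into a p-value needs the t distribution with N−1 degrees of freedom, which the Stata code obtains from `ttail()` and which Python's standard library does not provide.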

  27. Other issues • Assumptions about a Likert item • Interval level data? Use parametric analysis • Ordinal (ordered categorical) data? Use non-parametric analysis • Likert scale is a summation of Likert items • Unidimensional scale is implied. How do you know? Principal component analysis? Correspondence analysis? • Assumptions about Cooper z, Whitney t and Hsu chi sq

  28. Problems of Likert Scales • Response set • tendency to give identical responses, regardless of item content • Response style • tendency to favour a particular subset of responses (SA or D) • Agreement bias • tendency to agree with statements regardless of content

  29. Problems of Likert Scales • Social desirability bias • tendency to provide responses to please interviewer • Assumed ordinality • assumption that SA > A > U > D > SD • Meaning of middle category • “Undecided” might be a genuine neutral or just a ‘safe’ option

  30. Further Research • Develop tests (z and t) for difference between two Likert items • Develop test for differences between three or more items (ANOVA, Kruskal-Wallis) • Rating scales and Item Response Theory models (1-, 2- and 3-parameter models)

  31. Further Research • Use Likert scale data as a basis for obtaining interval level estimates on a continuum by applying the polytomous Rasch model • Model allows testing of hypothesis that statements represent increasing levels of attitude • Not all Likert scaled items can be used

  32. References • Cooper, M. (1978) An exact probability test for use with Likert-type scales. Educational and Psychological Measurement, 36, 647-655. • Hsu, L. (1979) Agreement or disagreement of a set of Likert-type ratings. Educational and Psychological Measurement, 39, 291-295. • Whitney, D. R. (1978) An alternative test for use with Likert-type scales. Educational and Psychological Measurement, 38, 15-18.
