1. 1 Class 1c Classical Methods of Scale Construction
2. 2 Readings and Homework Homework as stated in syllabus is for the following week
Readings are relevant to the current week
3. 3 Overview of class Types of measurement scales
Rationale for multi-item measures
Scale construction methods
Error concepts
4. 4 Types of Measurement Scales Categorical (nominal)
Classification
Numbers are labels for categories
Continuous (along a continuum)
Ordinal
Interval
Ratio
5. 5 Classification vs. Continuous Scores CES-D continuous score
20 items summed using Likert scaling methods
Range of sum is 0-60, used as continuous score in correlational studies
CES-D classification score:
Those scoring 16 or higher are “classified” as having likely depression
Referred for further screening
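A minimal Python sketch of the two uses of the CES-D score described above. It assumes the 20 responses are already coded 0-3 with any reverse-scored items handled beforehand; the function names and the sample respondent are hypothetical.

```python
def cesd_total(items):
    """Sum 20 item responses, each coded 0-3, into a 0-60 continuous score."""
    if len(items) != 20:
        raise ValueError("CES-D has 20 items")
    if any(v not in (0, 1, 2, 3) for v in items):
        raise ValueError("each item must be coded 0, 1, 2, or 3")
    return sum(items)


def cesd_likely_depression(total, cutoff=16):
    """Classification score: True when the total is at or above the cutoff."""
    return total >= cutoff


# Hypothetical respondent: 12 items answered 1, 8 items answered 0.
responses = [1] * 12 + [0] * 8
total = cesd_total(responses)
flag = cesd_likely_depression(total)
print(total, flag)
```

The same 20 answers yield both scores; the classification simply discards everything except which side of the cutoff the sum falls on.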
6. 6 Categorical (Nominal) Scales/Measures Primary language 1 Spanish 2 English 3 Other
Can you walk without help?
1 Yes 2 No
Numbers have no inherent meaning
7. 7 Ordinal Scales: Numbers Reflect Increasing Level Change in health:
1 Better
2 No change
3 Worse
Income:
1 < $10,000
2 $10,000 - <$20,000
3 $20,000 - <$30,000
4 $30,000 or more
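A short sketch of how dollar amounts map onto the ordinal income codes above. It assumes the top category includes exactly $30,000; the names `CUTPOINTS` and `income_category` are illustrative only.

```python
from bisect import bisect_right

# Lower bounds of categories 2, 3, and 4 (assumption: category 4 starts at $30,000).
CUTPOINTS = [10_000, 20_000, 30_000]


def income_category(dollars):
    """Return the ordinal code 1-4 for an income in dollars."""
    return bisect_right(CUTPOINTS, dollars) + 1


codes = [income_category(x) for x in (9_999, 10_000, 29_999, 30_000, 250_000)]
print(codes)
```

The codes keep the order of the incomes but not the distances: $30,000 and $250,000 receive the same code.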
8. 8 Another Example of Ordinal Scale How much pain did you have this past week?
1 None
2 Very mild
3 Mild
4 Moderate
5 Severe
6 Very severe
9. 9 Feature of Ordinal Scales Distances between numbers are unknown and probably vary
some closer together in meaning than others
When ordinal response choices express extent of agreement (e.g., agree, disagree)
the item is referred to as a Likert scale
The term “Likert scale” has since taken on other meanings in health measurement
10. 10 Interval Scales Numbers have equal intervals
A unit change is constant across the scale
Example - temperature
can add and subtract scores
a 2 unit change is the same at lower temperatures as higher temperatures
11. 11 Ratio Scale Has a meaningful zero point
Change scores have specific meaning
Scores can also be multiplied and divided
e.g., one score can be 2 or 3 times another
Examples
Weight in pounds
Income in dollars
Number of visits
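A small sketch contrasting the interval and ratio properties, using temperature and weight as in the slides. The specific temperatures and weights are made-up values for illustration.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def lb_to_kg(pounds):
    """Convert pounds to kilograms."""
    return pounds * 0.45359237


# Interval scale: a 2-degree change is the same size low or high on the scale,
# and it stays equal after a change of units.
low_change_f = c_to_f(12) - c_to_f(10)
high_change_f = c_to_f(32) - c_to_f(30)

# Ratios do not survive the change of units, because zero is arbitrary.
ratio_c = 20 / 10
ratio_f = c_to_f(20) / c_to_f(10)

# Ratio scale: zero is meaningful, so "twice as heavy" holds in any unit.
ratio_lb = 100 / 50
ratio_kg = lb_to_kg(100) / lb_to_kg(50)

print(low_change_f, high_change_f)
print(ratio_c, ratio_f)
print(ratio_lb, ratio_kg)
```

Differences are meaningful on both kinds of scale; statements such as "twice as much" are meaningful only on the ratio scale.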
12. 12 Types of Measurement Scales and Their Properties
13. 13 Overview of class Types of measurement scales
Rationale for multi-item measures
Scale construction methods
Error concepts
14. 14 Single- and Multi-Item Measures Advantages of single items
Response choices are interpretable
Disadvantages
Numbers are not easily interpretable
Limited variability
Easy to get skewed distributions
Reliability is usually low
Difficult to assess a complex concept with one item
15. 15 Interpretability of “Numbers” in Single Item Ordinal Scale
16. 16 Interpretability of “Numbers” in Single Item Ordinal Scale
17. 17 Estimated Distance Between Levels in Ordinal Scale (N=2,928) (0-100 scale)
18. 18 Distance Between Levels in an Ordinal Scale (N=2,928)
19. 19 Distance Between Levels: “In general, how would you rate your health?”
20. 20 Distance Between Levels: “In general, how would you rate your health?”
21. 21 Multi-Item Measures or Scales Multi-item measures are created by combining two or more items into an overall measure or scale score
22. 22 Advantages of Multi-item measures More scale values (enhances sensitivity)
Improves score distribution (more normal)
Reduces number of variables needed to measure one concept
Improves reliability (reduces random error)
Can estimate a score if some items are missing
Enriches the concept being measured (more valid)
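A sketch of estimating a scale score when some items are missing. The rule used here (average the answered items if at least half were answered) is one common convention and an assumption, not something the slides specify; `scale_score` is a hypothetical name.

```python
def scale_score(responses, min_answered_fraction=0.5):
    """Average the answered items; None marks a missing response.

    Returns None when too few items were answered to estimate a score.
    """
    answered = [r for r in responses if r is not None]
    if not responses or len(answered) < min_answered_fraction * len(responses):
        return None
    return sum(answered) / len(answered)


complete = scale_score([4, 5, 3, 4])
one_missing = scale_score([4, None, 3, 5])
mostly_missing = scale_score([None, None, None, 2])
print(complete, one_missing, mostly_missing)
```

With a single-item measure, a missing answer means a missing score; with several items, the remaining answers can still stand in for the concept.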
23. 23 Overview of class Types of measurement scales
Rationale for multi-item measures
Scale construction methods
Error concepts
24. 24 Types of Scale Construction Summated ratings scales
Likert scaling
Utility weighting or preference-based measures (econometric scales)
Guttman scaling
Thurstone scales
Many others
25. 25 Example of a 2-item Summated Ratings Scale
How much of the time .... tired?
1 - All of the time
2 - Most of the time
3 - Some of the time
4 - A little of the time
5 - None of the time
How much of the time …. full of energy?
1 - All of the time
2 - Most of the time
3 - Some of the time
4 - A little of the time
5 - None of the time
For a fatigue scale where higher scores are more fatigue:
1. Reverse the tired item (so a 5 is tired all of the time)
2. Add them together
3. Score ranges from 2 (1 on both) to 10 (5 on both)
26. 26 Step 1: Reverse One Item So They Are All in the Same Direction
How much of the time .... tired?
1=5 All of the time
2=4 Most of the time
3=3 Some of the time
4=2 A little of the time
5=1 None of the time
How much of the time …. full of energy?
1 - All of the time
2 - Most of the time
3 - Some of the time
4 - A little of the time
5 - None of the time
For a fatigue scale where higher scores are more fatigue:
1. Reverse the tired item (so a 5 is tired all of the time)
2. Add them together
3. Score ranges from 2 (1 on both) to 10 (5 on both)
27. 27 Step 2: Sum the Two Items
How much of the time .... tired?
5 - All of the time
4 - Most of the time
3 - Some of the time
2 - A little of the time
1 - None of the time
How much of the time …. full of energy?
1 - All of the time
2 - Most of the time
3 - Some of the time
4 - A little of the time
5 - None of the time
For a fatigue scale where higher scores are more fatigue:
1. Reverse the tired item (so a 5 is tired all of the time)
2. Add them together
3. Score ranges from 2 (1 on both) to 10 (5 on both)
28. 28 Step 2: Average the Two Items
How much of the time .... tired?
5 - All of the time
4 - Most of the time
3 - Some of the time
2 - A little of the time
1 - None of the time
How much of the time …. full of energy?
1 - All of the time
2 - Most of the time
3 - Some of the time
4 - A little of the time
5 - None of the time
For a fatigue scale where higher scores are more fatigue:
1. Reverse the tired item (so a 5 is tired all of the time)
2. Average them (add them together and divide by 2)
3. Score ranges from 1 (1 on both) to 5 (5 on both)
29. 29 Summed or Averaged: Increase Number of Levels from 5 to 9
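A minimal sketch of the scoring steps for the 2-item fatigue example: reverse the tired item, then sum or average. It enumerates every possible pair of answers to count the distinct score levels; the function names are illustrative.

```python
from itertools import product

RESPONSES = range(1, 6)  # 1 = all of the time ... 5 = none of the time


def reverse(score, low=1, high=5):
    """Reverse-score an item so that 1 becomes 5, 2 becomes 4, and so on."""
    return low + high - score


def fatigue_sum(tired, energy):
    """Sum the reversed tired item and the energy item (higher = more fatigue)."""
    return reverse(tired) + energy


def fatigue_mean(tired, energy):
    """Average the same two items, which keeps the original 1-5 metric."""
    return fatigue_sum(tired, energy) / 2


sums = sorted({fatigue_sum(t, e) for t, e in product(RESPONSES, repeat=2)})
means = sorted({fatigue_mean(t, e) for t, e in product(RESPONSES, repeat=2)})
print(len(RESPONSES), len(sums), len(means))
print(sums)
print(means)
```

Summing and averaging give the same number of levels and the same ordering of respondents; averaging just keeps the score on the 1-5 metric of the response choices.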
30. 30 Summated Scales: Scaling Analyses To create a summated scale, one needs to first test whether a set of items that appear to measure the same concept can be combined
Need to test hypothesis that the items do indeed belong together to form a single concept
Five criteria need to be met to combine items into a summated scale
31. 31 Five Criteria to Meet to Qualify as a Summated Scale Item convergence
Item discrimination
No unhypothesized dimensions
Items contribute similar proportion of information to score
Items have equal variances
32. 32 First Criterion: Item Convergence Each item correlates substantially with the total score of all items
with the item taken out or “corrected for overlap”
Typical criterion is >= .30
for well-developed scales, often set at >= .40
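A sketch of the item convergence check: each item is correlated with the sum of the other items (corrected for overlap) and compared with the .30 criterion. The response data are invented for illustration, and the helper names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(data):
    """Correlate each item with the sum of the *other* items.

    `data` is a list of respondents, each a list of item scores.
    """
    n_items = len(data[0])
    results = []
    for i in range(n_items):
        item = [row[i] for row in data]
        rest = [sum(row) - row[i] for row in data]  # total with item i taken out
        results.append(pearson(item, rest))
    return results


# Hypothetical responses: 6 people x 4 items, scored 1-5.
data = [
    [1, 2, 1, 5],
    [2, 2, 3, 1],
    [3, 3, 3, 4],
    [4, 4, 3, 2],
    [4, 5, 5, 3],
    [5, 5, 4, 2],
]
correlations = corrected_item_total(data)
for i, r in enumerate(correlations, start=1):
    print(f"item {i}: r = {r:.2f}, meets .30 criterion: {r >= 0.30}")
```

Taking the item out of the total matters most for short scales, where an item would otherwise correlate with itself as part of the sum.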
33. 33 Example: Analyzing Convergent Validity for Adaptive Coping Scale
34. 34 Example: Analyzing Convergent Validity for Adaptive Coping Scale
35. 35 Example: Analyzing Convergent Validity for Adaptive Coping Scale
36. 36 SAS/SPSS Make Item Convergence Analysis Easy Reliability programs provide this
Item-scale correlations corrected for overlap
Internal consistency reliability (coefficient alpha)
Reliability with each item removed
To see effect of removing a bad item
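For readers without SAS or SPSS at hand, a rough sketch of two quantities such reliability programs report: coefficient alpha and alpha with each item removed. This is the standard alpha formula applied to invented data, not the output of either package.

```python
from statistics import variance


def cronbach_alpha(data):
    """Coefficient alpha for a list of respondents, each a list of item scores."""
    k = len(data[0])
    item_vars = [variance([row[i] for row in data]) for i in range(k)]
    total_var = variance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_removed(data):
    """Recompute alpha with each item dropped in turn."""
    k = len(data[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(row) if j != i] for row in data])
        for i in range(k)
    ]


# Hypothetical responses: 6 people x 4 items; item 4 does not fit with the rest.
data = [
    [1, 2, 1, 5],
    [2, 2, 3, 1],
    [3, 3, 3, 4],
    [4, 4, 3, 2],
    [4, 5, 5, 3],
    [5, 5, 4, 2],
]
alpha = cronbach_alpha(data)
removed = alpha_if_item_removed(data)
print(f"alpha = {alpha:.2f}")
for i, a in enumerate(removed, start=1):
    print(f"alpha without item {i} = {a:.2f}")
```

When alpha rises sharply after an item is dropped, that item is a candidate "bad item" in the sense used above.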
37. 37 Second Criterion: Item Discrimination Each item correlates significantly higher with the construct it is hypothesized to measure than with other constructs
Item discrimination
Statistical significance is determined by the standard error of the correlation
which in turn is determined by sample size
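A sketch of one way to apply the discrimination criterion. It assumes a common rule of thumb from multitrait scaling: approximate the standard error of a correlation as 1/sqrt(n) and require a difference of more than two standard errors. Both the rule and the example correlations are assumptions, not taken from the slides.

```python
from math import sqrt


def correlation_se(n):
    """Approximate standard error of a correlation: 1 / sqrt(n) (rule of thumb)."""
    return 1 / sqrt(n)


def discriminates(r_own, r_other, n, n_se=2):
    """True when the item's correlation with its own scale exceeds its
    correlation with the other scale by more than n_se standard errors."""
    return (r_own - r_other) > n_se * correlation_se(n)


# Hypothetical item: r = .55 with its own scale, r = .40 with another scale.
small_sample = discriminates(0.55, 0.40, n=100)  # two standard errors = 0.20
large_sample = discriminates(0.55, 0.40, n=400)  # two standard errors = 0.10
print(small_sample, large_sample)
```

The same .15 difference fails the test in the smaller sample and passes in the larger one, which is the sense in which significance depends on sample size.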
38. 38 Multitrait Scaling - An Approach to Constructing Multi-item Scales Confirms whether hypothesized item groupings can be summed into a scale score
Examines extent to which all five criteria are met
Examines resulting scales
39. 39 Example: Two Subscales Being Developed Depression and Anxiety subscales of MOS Psychological Distress measure
40. 40 Example of Multitrait Scaling Matrix: Hypothesized Scales
41. 41 Example of Multitrait Scaling Matrix: Item Convergence
42. 42 Example of Multitrait Scaling Matrix: Item Convergence
43. 43 Example of Multitrait Scaling Matrix: Item Discrimination
44. 44 Preference Based or Utility Measures Utilities are numeric measurements that reflect the desirability people associate with a health state or condition
Value of that health state
Preference for that health state (rather than another)
45. 45 Methods for Assigning Values? Four steps:
Identify the population of judges who will assign “preferences”
Sample and describe health states to be assigned utilities
Select a preference measurement method
Collect preference judgments, analyze the data, and assign weights to the health states
46. 46 Preference Based or Utility Measures (cont.) Advantages
Combine complex health states into a single number
Score reflects the value or preference for the overall health state
Need two absolute reference points
0 represents death
1 represents perfect health
Methods for obtaining value weights
Time tradeoff, standard gamble, rating scales
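A simplified sketch of how each of the three methods turns a judgment into a value on the 0 (death) to 1 (perfect health) scale. These are textbook forms under idealized assumptions; the function names and example judgments are hypothetical.

```python
def time_tradeoff_utility(years_perfect_health, years_in_state):
    """Time tradeoff: x years in perfect health judged equal to
    t years in the health state gives a utility of x / t."""
    return years_perfect_health / years_in_state


def standard_gamble_utility(p_perfect_health):
    """Standard gamble: the utility is the probability p of perfect health
    (versus 1 - p of death) at which the judge is indifferent between
    the gamble and staying in the health state for certain."""
    return p_perfect_health


def rating_scale_utility(rating, worst=0, best=100):
    """Rating scale: rescale a rating between the anchors to 0-1."""
    return (rating - worst) / (best - worst)


tto = time_tradeoff_utility(7, 10)   # 7 healthy years ~ 10 years in the state
sg = standard_gamble_utility(0.85)   # indifferent at an 85% chance of perfect health
rs = rating_scale_utility(60)        # rated 60 on a 0-100 scale
print(tto, sg, rs)
```

The three methods generally give different values for the same health state, which is why step 3 (selecting a preference measurement method) matters.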
47. 47 Readings on Utility Measurement A huge literature
Some readings available on request
48. 48 Overview Types of measurement scales
Rationale for multi-item measures
Scale construction methods
Error concepts
49. 49 Concepts of Error How to depict error
Distinction between random error and systematic error
50. 50 Components of an Individual’s Observed Item Score (NOTE: Simplistic view)
Observed item score = true score + error
51. 51 Components of Variability in Item Scores of a Group of Individuals
Observed score variance = true score variance + error variance
Observed score variance is the total variance
(the variation across all observed item scores in the group)
52. 52 Combining Items into Multi-Item Scales When items are combined into a scale score, error cancels out to some extent
Error variance is reduced as more items are combined
As random error is reduced, the proportion of “true score” in the scale score increases
Multi-item scale is thus more reliable than any single item
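One way to put numbers on this is the Spearman-Brown formula, which projects the reliability of a scale from the reliability of one item. The sketch below assumes parallel items with uncorrelated random errors, and the single-item reliability of .30 is a made-up value.

```python
def spearman_brown(single_item_reliability, n_items):
    """Projected reliability of a scale built from n_items parallel items."""
    r = single_item_reliability
    return n_items * r / (1 + (n_items - 1) * r)


# Hypothetical single-item reliability of .30.
projected = {k: spearman_brown(0.30, k) for k in (1, 2, 5, 10)}
for k, rel in projected.items():
    print(f"{k:2d} items: reliability = {rel:.2f}")
```

Each added item helps, but by less than the one before it, so very long scales buy little extra reliability.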
53. 53 Sources of Error Subjects
Observers or interviewers
Measure or instrument
54. 54 Measuring Weight in Pounds of Children: Weight without shoes Observed score is a linear combination of many sources of variation for an individual
55. 55 Measuring Weight in Pounds of Children: Weight without shoes
56. 56 Measuring Weight in Pounds of Children: Weight without shoes
57. 57 Sources of Error Weight of clothes
Subject source of error
Person weighing child is not precise
Observer source of error
Scale is miscalibrated
Instrument source of error
58. 58 Measuring Depressive Symptoms in Asian and Latino Men
59. 59 Measuring Depressive Symptoms in Asian and Latino Men
60. 60 Return to Components of an Individual’s Observed Item Score
Observed item score = true score + error
61. 61 Components of an Individual’s Observed Item Score
Observed item score = true score + random error + systematic error
62. 62 Sources of Error in Measuring Weight Weight of clothes
Subject source of random error
Scale is miscalibrated
Instrument source of systematic error
Person weighing child is not precise
Observer source of random error
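A small simulation sketch of the difference between the two kinds of error in the weighing example. The true weight, the size of the random error, and the 2-pound miscalibration are all invented values.

```python
import random
from statistics import mean

random.seed(0)           # fixed seed so the run is repeatable
TRUE_WEIGHT = 60.0       # hypothetical child's true weight in pounds
CALIBRATION_ERROR = 2.0  # hypothetical scale that always reads 2 lb heavy
N = 10_000               # number of repeated weighings

# Random error only: readings scatter around the true weight.
random_only = [TRUE_WEIGHT + random.gauss(0, 1.0) for _ in range(N)]

# Random plus systematic error: the same scatter, shifted by the miscalibration.
miscalibrated = [
    TRUE_WEIGHT + CALIBRATION_ERROR + random.gauss(0, 1.0) for _ in range(N)
]

print(round(mean(random_only) - TRUE_WEIGHT, 2))
print(round(mean(miscalibrated) - TRUE_WEIGHT, 2))
```

Averaging many readings pulls the random error toward zero, while the systematic error stays in the average no matter how many readings are taken.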
63. 63 Sources of Error in Measuring Depression Hard to choose one number on 1-6 response scale
Subject source of random error
Unwillingness to tell interviewer
Subject source of systematic error (underreporting true depression)
Instrument is not culturally sensitive (missing some components)
Instrument source of systematic error
64. 64 Next Week – Week 4 Variability
Reliability
Interpretability
65. 65 Homework for Week 2 Complete rows 1-12 on the matrix for each measure you want to review
Handout
On the web site for this class
Download matrix and fill it in