MEASUREMENT Goal

• To develop reliable and valid measures using state-of-the-art measurement models

• Members: Chang, Berdes, Gehlert, Gibbons, Schrauf, Weiss

Item Response Theory (IRT)

• A family of mathematical descriptions of what happens when a person meets a test or survey question

• Relates characteristics of items (item parameters) and characteristics of persons (latent traits) to the probability of a correct response or, for rating-scale items, of a response in a given category

• Models the test-taking behavior at the item level

Chang & Gehlert (2002).

Dichotomous Unidimensional IRT Models

• 1-PL: Difficulty (b)

• 2-PL: Difficulty (b), Discrimination (a)

• 3-PL: Difficulty (b), Discrimination (a), Guessing (c)
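As a rough illustration of how the difficulty (b), discrimination (a), and guessing (c) parameters combine, the sketch below computes the response probability under the standard logistic form of the 3-PL model. The function name `irf_3pl` and the parameter values are hypothetical, and no scaling constant (such as D = 1.7) is applied; that is an assumption, not something the slides specify.

```python
import math

def irf_3pl(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct (or endorsed) response under the 3-PL model.

    theta: person latent trait; a: discrimination; b: difficulty; c: guessing.
    With c=0 this reduces to the 2-PL; with c=0 and a common a, to the 1-PL.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# When theta equals b, the 1-PL and 2-PL give 0.5; the 3-PL gives c + (1 - c) / 2.
p_1pl = irf_3pl(0.0, b=0.0)                 # hypothetical item, illustrative values
p_3pl = irf_3pl(0.0, a=1.5, b=0.0, c=0.2)   # hypothetical item, illustrative values
```

A person whose trait level matches the item difficulty therefore has an even chance of a correct response unless guessing raises the lower asymptote.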

Polytomous

• 1-PL (threshold): Partial Credit, Rating Scale

• 2-PL (threshold and discrimination): Nominal, Generalized Partial Credit
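For the polytomous case, a minimal sketch of the Generalized Partial Credit model may help show how thresholds and discrimination produce category probabilities. The function name `gpcm_probs` and the example thresholds are hypothetical; fixing `a` to a common value across items gives the Partial Credit model.

```python
import math

def gpcm_probs(theta, a, thresholds):
    """Category probabilities for one item under the Generalized Partial Credit model.

    thresholds: step parameters b_1..b_m for an item with m + 1 ordered categories.
    """
    # Cumulative sums of a * (theta - b_j); category 0 has an empty sum (0.0).
    logits = [0.0]
    for b in thresholds:
        logits.append(logits[-1] + a * (theta - b))
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical four-category item with symmetric thresholds.
probs = gpcm_probs(theta=0.0, a=1.0, thresholds=[-1.0, 0.0, 1.0])
```

With symmetric thresholds and theta at their centre, the two middle categories are the most likely and the two extreme categories are equally likely.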

Item Bank (Catalogued; Hierarchically Structured)

CAT

Brief Forms

• IRT pre-calibrated item bank

• Initial item selection

• Test scoring method

• Item selection during test administration

• Stopping rules

Item Bank

• Set of carefully IRT-calibrated questions

• Items cover the entire latent trait continuum

• Items represent differing amounts of trait

• Items represent differing amounts of information

• Items can be selected to maximize precision and retain clinical relevance
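To make "differing amounts of information" concrete, the sketch below uses the standard 2-PL item information function, I(theta) = a^2 * P * (1 - P), and picks the most informative item at a given trait level. The three-item `bank` and its parameter values are invented for illustration only and do not come from any calibrated item bank.

```python
import math

def info_2pl(theta, a, b):
    """Fisher information of a 2-PL item at trait level theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

# Hypothetical bank: item name -> (discrimination a, difficulty b).
bank = {
    "item_low": (1.0, -2.0),
    "item_mid": (1.5, 0.0),
    "item_high": (1.0, 2.0),
}

def most_informative(theta, bank):
    """Name of the item in the bank with the highest information at theta."""
    return max(bank, key=lambda name: info_2pl(theta, *bank[name]))

best_at_average = most_informative(0.0, bank)
best_at_high = most_informative(2.0, bank)
```

Each item is most informative near its own difficulty, which is why a bank needs items spread across the whole continuum.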

Item Banking is Inter-disciplinary

• Psychometricians

• Information scientists

• Clinicians/healthcare providers

• Outcomes researchers

• Content experts

Approaches to Develop Item Banks

• Top-Down Approach

• Bottom-Up Approach

• How to best calibrate existing items?

• Model selection

• Whose item parameters to use?

• Standardization?

• Generic vs. disease-specific

• Item parameter drift

• Anchor or Re-calibrate?

• How to write and best test new items?

• An adaptive test is a tailored, individualized measure which involves selecting a set of test items for each individual that best measures the psychological characteristics of that person (Weiss, 1985)

• Weiss DJ. Adaptive testing by computer. J Consult Clin Psychol. Dec 1985;53(6):774-789.

• Adaptive testing selects questions based on previous responses

• Tailored item and test difficulties

• Eliminates floor and ceiling effects

• Requires fewer questions to arrive at an accurate estimate

• Automates question administration, data recording, scoring, and prompt reporting

• Allows for immediate feedback

CAT Algorithm

Administer Item of Median Difficulty (or Screening Item)

Score Item

Estimate Latent Trait (Theta)

Termination Criterion Satisfied? If yes, stop; if no, continue

Choose and Administer Next Item with Maximum Information, then return to Score Item
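The loop above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production CAT engine: it assumes 2-PL items, an EAP (posterior mean) theta estimate on a fixed grid with a standard normal prior, and a stopping rule based on posterior SD or a maximum test length. The names `run_cat`, `eap`, `answer`, and the nine-item bank are all hypothetical.

```python
import math

def p_2pl(theta, a, b):
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def info_2pl(theta, a, b):
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

GRID = [g / 10.0 for g in range(-40, 41)]  # theta grid from -4 to 4

def eap(responses, bank):
    """Return (theta estimate, posterior SD) under a standard normal prior."""
    post = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized prior density
        for name, x in responses.items():
            p = p_2pl(t, *bank[name])
            w *= p if x == 1 else 1.0 - p
        post.append(w)
    total = sum(post)
    mean = sum(t * w for t, w in zip(GRID, post)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, post)) / total
    return mean, math.sqrt(var)

def run_cat(bank, answer, se_target=0.4, max_items=10):
    """Run one adaptive test; `answer(item)` returns the scored response (0 or 1)."""
    # Start with the item of median difficulty.
    by_difficulty = sorted(bank, key=lambda n: bank[n][1])
    item = by_difficulty[len(by_difficulty) // 2]
    responses = {}
    while True:
        responses[item] = answer(item)        # administer and score the item
        theta, se = eap(responses, bank)      # estimate the latent trait
        remaining = [n for n in bank if n not in responses]
        # Termination criterion: precise enough, test long enough, or bank exhausted.
        if se <= se_target or len(responses) >= max_items or not remaining:
            return theta, se, list(responses)
        # Otherwise choose the unused item with maximum information at theta.
        item = max(remaining, key=lambda n: info_2pl(theta, *bank[n]))

# Hypothetical bank: nine items, a = 1.5, difficulties from -2.0 to 2.0.
bank = {f"item{i}": (1.5, -2.0 + 0.5 * i) for i in range(9)}
# Deterministic stand-in for a respondent: endorses every item easier than b = 1.0.
theta_hat, se_hat, administered = run_cat(bank, lambda n: 1 if bank[n][1] < 1.0 else 0)
```

Other choices are equally plausible at each step (maximum likelihood scoring, a screening item as the starting point, or exposure control in item selection); the sketch fixes one option per step only to keep the loop readable.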

• Context effects

• Unbalanced content

• Time frame

• Response categories

• Multidimensionality

What kind of short form?

• Question 1

• 0 I do not feel sad.

• 2 I am sad all the time and I can’t snap out of it.

• 3 I am so sad or unhappy that I can’t stand it.

Are you basically satisfied with your life? True/False

Item production

Item statistics

Item exposure

Maintaining a valid bank of items for test construction

Fairness

Delivery options

Cost-benefit considerations

MORE Research Still Needed for Effective CAT Implementation

Infrastructure of a National Geriatric Pain Item Bank

