What is Differential Item Functioning (DIF)?
DIF analysis answers the question, “Do item response curves differ significantly for subgroups of a population?” (e.g., subgroups defined by gender, language, or race). It is often used to provide evidence of item bias in testing.

• In most testing programs, items that demonstrate DIF are routinely excluded from the test or not scored.
• DIF is a necessary but NOT a sufficient condition for item bias (i.e., an item can show DIF and NOT be biased).
• For example, a poorly translated item can show DIF, which does not necessarily mean the item is biased.
• With regard to polytomously scored items (items worth more than one possible point), DIF analysis also serves to identify non-scoring items.
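The slides do not say which statistical procedure was used to flag DIF. As an illustration only, the sketch below uses the Mantel-Haenszel common odds ratio, one widely used screening method for dichotomous items. The function name, the record layout, and the "ref"/"focal" group labels are assumptions made for this example.

```python
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Return the Mantel-Haenszel common odds ratio for one item.

    records: iterable of (group, total_score, correct), where group is
    "ref" or "focal" (hypothetical labels), total_score is the matching
    variable, and correct is 0 or 1 for the studied item.
    """
    # One 2x2 table per total-score stratum:
    # [ref correct, ref incorrect, focal correct, focal incorrect]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, total_score, correct in records:
        idx = (0 if group == "ref" else 2) + (0 if correct else 1)
        tables[total_score][idx] += 1

    num = 0.0
    den = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        num += a * d / n
        den += b * c / n

    # The ratio is undefined when the denominator is zero.
    return num / den if den else None


# Made-up data: candidates matched on total score (1 or 2).
demo = (
    [("ref", 1, 1)] * 3 + [("ref", 1, 0)] * 1
    + [("focal", 1, 1)] * 1 + [("focal", 1, 0)] * 3
    + [("ref", 2, 1)] * 2 + [("ref", 2, 0)] * 2
    + [("focal", 2, 1)] * 2 + [("focal", 2, 0)] * 2
)
print(mantel_haenszel_odds_ratio(demo))
```

A ratio near 1 suggests that matched candidates in both groups have similar odds of answering correctly. Values well above or below 1 flag the item for review. They do not prove bias.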
What is item bias?
• Each group has its own set of cultural characteristics (values, norms, attitudes, expectancies, etc.) that sometimes reveal themselves statistically in test items.
• Tests developed and tested in one population may show evidence of bias when used in another population.
• Bias can occur through cross-cultural differences in the interpretations of the meaning of concepts and of items used to measure constructs.
Item Characteristic Curves: Examples of DIF
• The red and blue groups score differently on this item; a test candidate in the red group has a higher probability of answering this item correctly.
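To make the idea concrete, here is a minimal sketch of two item characteristic curves under a two-parameter logistic (2PL) model. The 2PL form and every parameter value below are assumptions chosen for illustration; the slides do not state which model or estimates were used.

```python
import math


def icc(theta, a, b):
    """Probability of a correct response under a 2PL model.

    theta: candidate ability, a: item discrimination, b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical parameters: same discrimination, but the item is
# effectively easier for the red group (lower difficulty b).
RED = {"a": 1.0, "b": -0.5}
BLUE = {"a": 1.0, "b": 0.5}

for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    p_red = icc(theta, **RED)
    p_blue = icc(theta, **BLUE)
    print(f"theta={theta:+.1f}  red={p_red:.3f}  blue={p_blue:.3f}")
```

With these assumed parameters, a red-group candidate has a higher probability of a correct answer than a blue-group candidate of the same ability at every ability level. This pattern is usually called uniform DIF.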
Identifying Non-scoring Items
• 2004Q2: DIF analysis identified 8 items in 6 different languages that had never scored
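The slides do not describe how the non-scoring items were found. The sketch below assumes that "never scored" means no candidate taking a given language version ever earned credit on the item. The function name and the record layout are hypothetical.

```python
def never_scored_items(responses):
    """Return sorted (language, item_id) pairs where nobody earned points.

    responses: iterable of (language, item_id, points_awarded).
    """
    best = {}
    for language, item_id, points in responses:
        key = (language, item_id)
        best[key] = max(best.get(key, 0), points)
    return sorted(key for key, top in best.items() if top == 0)


# Made-up response records for two language versions of two items.
sample = [
    ("en", "item_1", 1), ("en", "item_1", 0),
    ("en", "item_2", 2), ("en", "item_2", 0),
    ("fr", "item_1", 0), ("fr", "item_1", 0),
    ("fr", "item_2", 1), ("fr", "item_2", 0),
]
print(never_scored_items(sample))
```

An item flagged this way is only a candidate for review. A zero maximum could reflect a keying or translation problem, or simply a very hard item with few respondents.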