Empirical Assessment of Mantel-Haenszel Statistic for Item Bias Detection

An Empirical Assessment of the Mantel-Haenszel Statistic for Studying Differential Item Performance
N. S. Raju, R. K. Bode, and V. S. Larsen
Applied Measurement in Education (1989)
Introduced by Taehoon Kang (05/09/03)

- Differential item performance (DIP) is defined as existing when examinees from different demographic groups perform differentially on an item after conditioning on the ability the test is intended to measure.
- To detect biased items (DIP), the Cochran-Mantel-Haenszel (CMH) test was used (the article calls it the MH test).
- To measure the degree of DIP of an item, the common odds ratio was used (the article denotes it by a symbol other than θMH).

X: demographic group membership (0 = reference group, 1 = focal group)
Y: item score (0 = incorrect, 1 = correct)
Z: test score group (the test scores are divided into K categories)

- The null hypothesis that X and Y are conditionally independent given Z means that the conditional odds ratio θXY(k) between X and Y equals 1 in each partial table.
- If the null hypothesis for an item is rejected by the CMH test, we conclude that the item is biased. The common odds ratio is then computed to measure the extent of the bias.

Table (X, Y, and Z variables)
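The test and the common odds ratio described above can be sketched as follows. This is a minimal illustration, not the article's code: the function name `mantel_haenszel`, the cell ordering `(a, b, c, d)`, and the optional continuity correction are assumptions made for this example. Each tuple is one partial table (one score group of Z).

```python
import math

def mantel_haenszel(tables, continuity=True):
    """CMH chi-square (1 df), its p-value, and the MH common odds ratio.

    tables: one (a, b, c, d) tuple per score group, where
      a = reference correct, b = reference incorrect,
      c = focal correct,     d = focal incorrect.
    This cell layout is an assumption of the sketch.
    """
    sum_a = sum_e = sum_v = num = den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a table this small carries no information
        sum_a += a
        # Expected count and variance of cell a under conditional independence
        sum_e += (a + b) * (a + c) / n
        sum_v += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        # Numerator and denominator of the MH common odds ratio
        num += a * d / n
        den += b * c / n
    if sum_v == 0 or den == 0:
        raise ValueError("tables are too sparse for this sketch")
    diff = abs(sum_a - sum_e)
    if continuity:
        diff = max(diff - 0.5, 0.0)
    chi2 = diff ** 2 / sum_v
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 df
    return chi2, p_value, num / den
```

With this layout, a common odds ratio above 1 means the item favors the reference group at matched score levels, and a value below 1 means it favors the focal group.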

The purpose of this article
- To provide an empirical evaluation of the MH technique for detecting biased items in an achievement test.
- The empirical study was carried out (1) to determine the effect of the number of score groups, and (2) to determine the effect of including or excluding the studied item when forming score groups.

Data
- The 40-item vocabulary test from the Survey of Basic Skills (SBS).
- The item response data came from a sample of 3,795 fourth, fifth, and sixth graders who took the vocabulary test as part of the SBS standardization (2,400 White, 1,161 Black, and 234 Hispanic examinees).

Analysis
- For each racial comparison (W vs. B, and W vs. H), 10 different DIP analyses were performed with the MH technique (5 values of K, each with the studied item included or excluded) to assess the effects of the number of score groups and of the inclusion or exclusion of the studied item.
* The total raw test score was used to form K score groups (K = 2, 4, 6, 8, or 10).
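One way to build the K partial tables for a studied item is sketched below. The slides do not say how the raw score range was cut into K groups, so the equal-width bins here are an assumption, as are the function name `score_group_tables` and its parameters. The `include_item` flag switches between including and excluding the studied item in the matching score.

```python
def score_group_tables(responses, focal, item, k, include_item=True):
    """Build K 2x2 tables (a, b, c, d) for one studied item.

    responses: list of 0/1 item-score lists, one per examinee.
    focal:     list of 0/1 flags (1 = focal group), one per examinee.
    item:      index of the studied item.
    k:         number of score groups.
    Cells: a = reference correct, b = reference incorrect,
           c = focal correct,     d = focal incorrect.
    """
    n_items = len(responses[0])
    max_score = n_items if include_item else n_items - 1
    tables = [[0, 0, 0, 0] for _ in range(k)]
    for row, is_focal in zip(responses, focal):
        total = sum(row)
        if not include_item:
            total -= row[item]  # match on the remaining items only
        # Equal-width bins over 0..max_score (an assumption of this sketch)
        group = min(total * k // (max_score + 1), k - 1)
        cell = (0 if row[item] else 1) + (2 if is_focal else 0)
        tables[group][cell] += 1
    return [tuple(t) for t in tables]
```

Running the function once per value of K, with `include_item` set to `True` and then `False`, gives the 10 sets of tables for one item in one group comparison.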

Results
- On the effect of the number of score groups, the authors concluded that MH test results are stable when the number of score groups is 4 or more.
- Also, as the number of score groups increases, including and excluding the studied item yield approximately the same number of biased items.
- Finally, with 40 multiple-choice items scored 0 or 1, more than 4 score groups are needed to find DIP items in a stable way.

Critique
- The problem with the MH technique as used in this article is that there is no test of homogeneous association. In other words, when we reject the hypothesis of conditional independence for an item using the CMH test, we do not know whether the association is homogeneous across the partial tables.
- The authors nevertheless computed θMH without checking for homogeneous association. I therefore think the Breslow-Day statistic should be applied in DIP studies.
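The Breslow-Day check suggested in the critique could look roughly like the sketch below. It is an illustration under stated assumptions, not part of the article: the function name `breslow_day` and the cell ordering are hypothetical, the common odds ratio is estimated with the MH formula, no Tarone correction is applied, and every fitted cell is assumed to be strictly positive. The statistic is referred to a chi-square distribution with K - 1 degrees of freedom.

```python
import math

def breslow_day(tables):
    """Breslow-Day statistic for homogeneity of odds ratios across K tables.

    tables: one (a, b, c, d) tuple per score group, where
      a = reference correct, b = reference incorrect,
      c = focal correct,     d = focal incorrect.
    Returns (statistic, degrees_of_freedom).
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    psi = num / den  # MH estimate of the common odds ratio
    stat = 0.0
    for a, b, c, d in tables:
        r1, r2, c1 = a + b, c + d, a + c
        lo, hi = max(0, c1 - r2), min(r1, c1)
        # The fitted count A solves A(r2 - c1 + A) = psi (r1 - A)(c1 - A)
        qa = 1.0 - psi
        qb = r2 - c1 + psi * (r1 + c1)
        qc = -psi * r1 * c1
        if abs(qa) < 1e-12:
            fitted = -qc / qb  # psi == 1: the equation is linear
        else:
            root = math.sqrt(qb * qb - 4.0 * qa * qc)
            roots = [(-qb + root) / (2.0 * qa), (-qb - root) / (2.0 * qa)]
            # Keep the root that is a feasible cell count
            fitted = next(x for x in roots if lo - 1e-9 <= x <= hi + 1e-9)
        var = 1.0 / (1.0 / fitted + 1.0 / (r1 - fitted)
                     + 1.0 / (c1 - fitted) + 1.0 / (r2 - c1 + fitted))
        stat += (a - fitted) ** 2 / var
    return stat, len(tables) - 1
```

A large statistic relative to the chi-square reference suggests that the odds ratio differs across score groups, in which case a single θMH summarizes the item poorly.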
