The Impact of Missing Data on the Detection of Nonuniform Differential Item Functioning

The Impact of Missing Data on the Detection of Nonuniform Differential Item Functioning W. Holmes Finch

Outline • Introduction • DIF detection • Missing data • Types • Methods for dealing with missing data • Listwise deletion • Omitted as incorrect • Multiple imputation • Stochastic regression imputation • Objective of the present study • Method • Results • Discussion

Introduction • Researchers have examined the impact of missing data on uniform DIF analyses. • Results showed that type I error rates were inflated, so that items were mistakenly identified as displaying DIF, and that power for DIF detection in the presence of missing data was low. • This paper examines the impact of missing data on nonuniform DIF detection.

DIF Detection • Uniform DIF: • The reference group has a consistent advantage in the likelihood of responding correctly to an item at all trait levels, as compared with the focal group. • Nonuniform DIF: • The reference group has an advantage in correctly responding to an item at some trait levels, whereas at other levels, the focal group has an advantage in correctly responding to the item.
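The contrast between the two kinds of DIF can be illustrated with item characteristic curves. The sketch below assumes a 3PL response function; the parameter values are made up for illustration and are not taken from the study. Equal discriminations with shifted difficulty give a gap that keeps one sign (uniform DIF), while unequal discriminations make the curves cross (nonuniform DIF).

```python
import math

def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

thetas = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Uniform DIF (illustrative values): same discrimination, item harder for focal group.
uniform_gap = [
    p_correct_3pl(t, 1.0, 0.0, 0.2) - p_correct_3pl(t, 1.0, 0.5, 0.2)
    for t in thetas
]

# Nonuniform DIF (illustrative values): same difficulty, different discrimination,
# so the two curves cross at theta = b.
nonuniform_gap = [
    p_correct_3pl(t, 1.5, 0.0, 0.2) - p_correct_3pl(t, 0.7, 0.0, 0.2)
    for t in thetas
]

for t, u, n in zip(thetas, uniform_gap, nonuniform_gap):
    print(f"theta={t:+.1f}  reference-focal gap: uniform={u:+.3f}  nonuniform={n:+.3f}")
```

In the nonuniform case the focal group is favored below the crossing point and the reference group above it.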

Methods of Nonuniform DIF Detection • IRT likelihood ratio test (IRTLR) • Logistic regression (LR) • Crossing SIBTEST (CSIB)
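As a rough illustration of the LR approach, nonuniform DIF is commonly tested through the interaction between the matching variable and group membership in a logistic regression. The sketch below is a minimal pure-Python version under stated assumptions: the matching variable is a simulated standardized score (a stand-in for an observed total score), the data and slopes are invented, and the 1-df likelihood ratio test compares models with and without the interaction. It is not the implementation used in the study.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, iters=25):
    """Newton-Raphson logistic regression; returns (coefficients, log-likelihood)."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            mu = sigmoid(sum(b * v for b, v in zip(beta, xi)))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += w * xi[j] * xi[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    ll = 0.0
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        ll += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return beta, ll

def nonuniform_dif_test(score, group, y):
    """Likelihood ratio test (1 df) of the score-by-group interaction."""
    reduced = [[1.0, s, g] for s, g in zip(score, group)]
    full = [[1.0, s, g, s * g] for s, g in zip(score, group)]
    _, ll0 = fit_logistic(reduced, y)
    beta, ll1 = fit_logistic(full, y)
    g2 = 2.0 * (ll1 - ll0)
    # Chi-square(1) upper-tail probability: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(g2, 0.0) / 2.0))
    return beta, g2, p_value

# Invented data: the focal group (g = 1) has a steeper slope, i.e. nonuniform DIF.
rng = random.Random(1)
score, group, y = [], [], []
for i in range(2000):
    g = float(i % 2)
    s = rng.gauss(0.0, 1.0)
    y.append(1 if rng.random() < sigmoid((1.0 + g) * s) else 0)
    score.append(s)
    group.append(g)

beta, g2, p_value = nonuniform_dif_test(score, group, y)
print(f"interaction = {beta[3]:.2f}, G2 = {g2:.1f}, p = {p_value:.3g}")
```

A test of uniform DIF would instead compare the model with the group term against a model containing the matching variable only.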

Types of Missing Data • Missing completely at random (MCAR) • Some respondents leave an item unanswered in a completely random fashion, with no systematic mechanism associated with the missingness. • Missing at random (MAR) • The probability of an observation containing missing data is associated directly with a measurable variable. • Missing not at random (MNAR) • The likelihood of being missing is associated with the value of the variable itself.

Listwise Deletion (LD) • If an individual fails to respond to any item on the instrument, his or her data are excluded from DIF analyses. • It is easy to employ and is the default for many statistical software packages. • It reduces the effective sample size, which can in turn lead to a notable reduction in statistical power for hypothesis testing of DIF. • It has been associated with biased estimates in some situations, except when the data are MCAR.

Omitted as incorrect • Zero imputation (ZI) • Missing responses are assigned an incorrect value, or a zero in the case of dichotomously scored items. • This approach would lead to biased parameter estimation and hypothesis test results.
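Listwise deletion and zero imputation are simple enough to show side by side. The sketch below assumes dichotomous responses stored as lists, with `None` marking a missing response; the small data set and the function names are hypothetical.

```python
def listwise_delete(responses):
    """Drop every respondent who has at least one missing response."""
    return [row for row in responses if None not in row]

def zero_impute(responses):
    """Score every missing response as incorrect (0)."""
    return [[0 if x is None else x for x in row] for row in responses]

# Hypothetical data: 4 respondents, 3 dichotomous items, None = missing.
data = [
    [1, 0, 1],
    [1, None, 0],
    [None, 1, 1],
    [0, 0, 1],
]

ld = listwise_delete(data)
zi = zero_impute(data)
print(len(ld), "of", len(data), "respondents remain after listwise deletion")
print(zi)
```

Listwise deletion keeps only complete cases, which is where the loss of sample size comes from; zero imputation keeps every respondent but treats each omission as a wrong answer.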

Multiple imputation (MI) • MI can incorporate information from all variables in a data set to derive imputed values for those that are missing. • The MI algorithm assumes a multivariate normal probability distribution among the variables and that the data are MAR or MCAR. • MI has been associated with accurate parameter estimation and statistical power rates comparable with those obtained with complete data.

Stochastic regression imputation (SRI) • SRI involves a two-step process in which the distribution of relative frequencies for each response category for each member of the sample is first obtained from the observed data. • For each member of the sample, missing values are then replaced by random draws from the multinomial distribution with parameters equal to the distribution of relative frequencies of the categories. • In the second step of SRI, LR is conducted for the target variable for each of the M complete data sets, with the independent variables being the other variables in the data set.
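The first step described above can be sketched as follows. This is only an illustration of the per-person draw, assuming responses are stored as a list with `None` for missing values; the function name is hypothetical and the regression step is not shown.

```python
import random
from collections import Counter

def impute_from_person_frequencies(row, rng):
    """Replace missing values with draws from the respondent's own
    observed distribution of response categories."""
    observed = [x for x in row if x is not None]
    if not observed:
        # Nothing observed for this respondent: leave the row unchanged.
        return list(row)
    counts = Counter(observed)
    categories = sorted(counts)
    weights = [counts[c] / len(observed) for c in categories]
    return [
        rng.choices(categories, weights=weights)[0] if x is None else x
        for x in row
    ]

rng = random.Random(7)
person = [1, 1, 0, None, 1, None, 0, 1]   # observed: five 1s and two 0s
imputed = impute_from_person_frequencies(person, rng)
print(imputed)

all_correct = impute_from_person_frequencies([1, 1, None, 1], rng)
print(all_correct)  # the only observed category is 1, so the draw must be 1
```

Repeating the draw M times would yield M completed data sets.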

Prior research • Sedivy et al. (2006) • GRM model • LR and Poly-SIBTEST (uniform DIF) • Lowest value imputation • Type I error rates were rarely inflated and power was diminished for higher levels of missing data. • Banks and Walker (2006) • 3PL dichotomous model • LD and ZI • Type I error rates were inflated for ZI but not LD, and power for detecting DIF was higher for ZI than LD.

Prior research • Robitzsch and Rupp (2009) • MH and LR • LD, ZI, MI, and two-way imputation • ZI resulted in inflated type I error rates. • DIF method, sample size, and number of items had relatively little impact on the type I error and power rates. • Finch (2011) • MI, LD, and ZI • ZI was associated with type I error inflation and in some cases low power. • Methods of DIF detection used (SIBTEST, MH, or LR) were not affected differentially by the presence of missing data.

Method • 3PL model • 20 and 40 items • 1 DIF item • Sample size: 250/250, 500/500, 1000/1000 • Impact: (0,0), (0,-0.5), (0,0.5) • Percentage of missing data: 0, 10%, 20%, 30% • Magnitude of DIF: 0, 0.4, 0.8, and 1
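To make the design concrete, here is a minimal sketch of generating 3PL response data for a reference and a focal group under one impact condition, (0, -0.5). The item parameters (a = 1.0, c = 0.2, difficulties evenly spaced from -2 to 2) are assumptions chosen for illustration, and no DIF is introduced; the study's actual generating values are not given in these slides.

```python
import math
import random

def simulate_3pl(n, n_items, theta_mean, rng):
    """Simulate dichotomous responses for n examinees under the 3PL model."""
    # Hypothetical item parameters: (discrimination, difficulty, guessing).
    items = [(1.0, -2.0 + 4.0 * j / (n_items - 1), 0.2) for j in range(n_items)]
    data = []
    for _ in range(n):
        theta = rng.gauss(theta_mean, 1.0)
        row = []
        for a, b, c in items:
            p = c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
            row.append(1 if rng.random() < p else 0)
        data.append(row)
    return data

rng = random.Random(2024)
reference = simulate_3pl(500, 20, 0.0, rng)    # reference group, mean theta 0
focal = simulate_3pl(500, 20, -0.5, rng)       # focal group, mean theta -0.5

mean_ref = sum(map(sum, reference)) / len(reference)
mean_focal = sum(map(sum, focal)) / len(focal)
print(f"mean total score: reference={mean_ref:.2f}, focal={mean_focal:.2f}")
```

Impact here is a difference in the groups' trait distributions, which is distinct from DIF on any single item.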

Type of missing data • MCAR: responses from across both groups on the target item were randomly selected to be missing. • MAR1: only members of the focal group were randomly selected to have missing data on the target item (missing data mechanism was associated with group membership). • MAR2: examinees with total scores at or below the 30th percentile were selected to have missing data (individuals with relatively lower trait levels tend to leave target item blank). • MNAR: missing data were taken only from those who had an incorrect response to the target item (examinees who did not know the correct answer to an item left it blank).
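The four mechanisms differ only in who is eligible to have the target item set to missing. The sketch below follows the descriptions above under some assumptions: `rate` is taken as a fraction of the whole sample, the 30th percentile is computed by a simple rank rule, and the small data set is invented so that the behavior is easy to check.

```python
import math
import random

def apply_missing(target, group, total, mechanism, rate, rng):
    """Return a copy of `target` with some responses set to None."""
    n = len(target)
    if mechanism == "MCAR":
        eligible = list(range(n))
    elif mechanism == "MAR1":      # focal group members only
        eligible = [i for i in range(n) if group[i] == "focal"]
    elif mechanism == "MAR2":      # total score at or below the 30th percentile
        cutoff = sorted(total)[max(0, math.ceil(0.3 * n) - 1)]
        eligible = [i for i in range(n) if total[i] <= cutoff]
    elif mechanism == "MNAR":      # incorrect responders to the target item only
        eligible = [i for i in range(n) if target[i] == 0]
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    k = min(round(rate * n), len(eligible))
    chosen = set(rng.sample(eligible, k))
    return [None if i in chosen else x for i, x in enumerate(target)]

# Invented data for 100 examinees.
target = [i % 2 for i in range(100)]
group = ["ref"] * 50 + ["focal"] * 50
total = [i % 21 for i in range(100)]

rng = random.Random(3)
results = {
    m: apply_missing(target, group, total, m, 0.2, rng)
    for m in ("MCAR", "MAR1", "MAR2", "MNAR")
}
for m, r in results.items():
    print(m, sum(x is None for x in r), "missing")
```

Under MNAR every deleted response was an incorrect one, so the observed proportion correct on the target item rises after deletion.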

Results

LR • Power was higher for greater levels of DIF. • Impact = 0/0 • Power for the LD method was slightly lower than that of the complete data condition, except when the data were MNAR. • For ZI, power rates were relatively low in the MAR1 and MCAR conditions. • Impact = 0/-0.5 • Power for all conditions was somewhat lower than for the other two impact conditions. • Power for LD was slightly lower than for the complete data, except for MAR2. • Higher power for SRI might have resulted from inflated type I error. • Impact = 0/+0.5 • Power under most of the simulated conditions was higher than when impact = 0/-0.5. • Power for MI was typically comparable with or higher than for LD, with the exception of MAR1 data and the lowest DIF condition.


Discussion • Prior research on uniform DIF and missing data • No single approach could be identified as optimal for all conditions. • ZI can always be viewed as the least optimal missing data approach for uniform DIF detection. • The current study on nonuniform DIF and missing data • ZI did not always result in type I error inflation for nonuniform DIF detection when data were MCAR and MNAR. • LD produced results very similar to those obtained with the complete data. • Overall MI appears to be much preferable to SRI. The inflation for SRI was much more severe than that of MI.
