Detecting Item Parameter Drift
Detecting Item Parameter Drift in a CAT program using the Rasch Measurement Model

  • Mayuko Simon, David Chayer, Pam Hermann, and Yi Du

  • Data Recognition Corporation

  • April, 2012


How should banked item parameters be checked?

  • The idea for this study arose when the authors were faced with a large existing bank of CAT items, with already-estimated item parameters, that needed to be augmented.


Re-calibration of banked item parameters and item parameter drift

  • Recalibration is recommended at periodic intervals

  • CAT item data form a sparse matrix, and the range of student abilities observed for each item is limited
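
This sparsity can be sketched in a few lines; the counts below (1,000 students, a 300-item bank, 40 items per student) are assumed for illustration, and the item-selection rule is a crude stand-in for a real CAT algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes for illustration only
n_students, n_items, items_per_student = 1000, 300, 40

def rasch_prob(theta, b):
    """Rasch model: P(correct) = 1 / (1 + exp(-(theta - b)))."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

theta = rng.normal(0.0, 1.0, n_students)   # student abilities
b = rng.normal(0.0, 1.0, n_items)          # item difficulties

responses = np.full((n_students, n_items), np.nan)  # NaN = item not seen
for i in range(n_students):
    # Crude CAT stand-in: administer the items closest to the student's ability,
    # which also restricts the ability range observed for each item
    seen = np.argsort(np.abs(b - theta[i]))[:items_per_student]
    responses[i, seen] = rng.random(items_per_student) < rasch_prob(theta[i], b[seen])

print(f"{np.isnan(responses).mean():.2f}")  # → 0.87 (most cells are empty)
```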


What would be a reasonable way to recalibrate items?

  • The methods can be applied to

    • Maintenance of a CAT item bank

    • Detection of item parameter drift

    • Calibration of field-test items


How did other researchers calibrate/re-calibrate CAT data?

  • Impute missing data to avoid sparseness (Harmes, Parshall, and Kromrey, 2003)

  • Calibrate field-test (FT) items by anchoring operational items (Wang and Wiley, 2004)

  • Calibrate FT items by anchoring ability (Kingsbury, 2009)

  • Use ability estimates to calibrate item parameters and detect drift (Stocking, 1988)


Simulation study

  • 300 items in the item bank

  • Simulated responses from 20,000 students with abilities ~ N(0, 1)

  • Known item parameter drift (10% of the item bank)

  • Various drift sizes
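
The setup above can be sketched as follows; the drift size (0.5 logits) and the one-direction drift scheme are illustrative choices, since the slides only say "various drift sizes":

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulation setup from the slides
n_items, n_students = 300, 20_000
theta = rng.normal(0.0, 1.0, n_students)   # simulated abilities ~ N(0, 1)
b_banked = rng.normal(0.0, 1.0, n_items)   # banked item difficulties

# Introduce known drift into 10% of the bank
n_drift = n_items // 10
drift_items = rng.choice(n_items, size=n_drift, replace=False)
drift_size = 0.5                           # one of the "various drift sizes" (assumed)

b_true = b_banked.copy()
b_true[drift_items] += drift_size          # drift in one direction ("Condition 1" style)

print(n_drift)  # → 30
```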


Design


Four calibration methods in this study

  • Anchor person ability (AP)

  • Anchor person ability and the difficulties of 200 of the 300 items (API)

  • Use the Displacement value from the Winsteps output

  • Item by Item calibration (IBI)


IBI: Item by Item calibration

  • A vector of responses for an item

  • A vector of the abilities of the students who took that item

  • Same concept as logistic regression, but Winsteps is used for calibration

  • No sparseness involved

  • Less data are needed (especially when not all items in the bank need to be checked)
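
A minimal sketch of the IBI idea, assuming known abilities and using a direct Newton-Raphson fit in place of Winsteps:

```python
import numpy as np

def rasch_prob(theta, b):
    """Rasch model probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def calibrate_item(theta, x, n_iter=25):
    """Item-by-item (IBI) calibration sketch: with person abilities `theta`
    held fixed, find the difficulty b of ONE item that maximizes the Rasch
    likelihood of the 0/1 responses `x`. Newton-Raphson on the score
    function -- the same idea as an intercept-only logistic regression."""
    b = 0.0
    for _ in range(n_iter):
        p = rasch_prob(theta, b)
        # score dl/db = sum(p - x); information = sum(p * (1 - p))
        b += np.sum(p - x) / np.sum(p * (1.0 - p))
    return b

# Demo with simulated data (not the authors' Winsteps run)
rng = np.random.default_rng(1)
theta = rng.normal(0.0, 1.0, 2000)     # anchored abilities
true_b = 0.8                           # assumed true difficulty
x = (rng.random(2000) < rasch_prob(theta, true_b)).astype(float)

b_hat = calibrate_item(theta, x)
print(round(b_hat, 2))                 # close to 0.8
```

Because each item is fit on its own response vector, the sparse bank-wide matrix never has to be assembled, which is the "no sparseness involved" point above.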


Evaluation

  • One-sample t-test with alpha = 0.01 for AP, API, and IBI

  • Cutoff value of 0.4 for the Displacement method

  • Type I error rate

  • Type II error rate

  • Sensitivity (Type II error rate + sensitivity = 1)

  • RMSE (root mean squared difference from the banked value for flagged items)

  • BIAS (average signed difference from the banked value for flagged items)
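
These measures follow directly from the confusion counts of any flagging rule; the toy numbers below are an illustration, not the study's results:

```python
import numpy as np

def drift_flag_metrics(flagged, truly_drifted):
    """Type I error, Type II error, and sensitivity for a drift-flagging
    rule applied over an item bank (boolean arrays, one entry per item)."""
    flagged = np.asarray(flagged, bool)
    drifted = np.asarray(truly_drifted, bool)
    type_i = flagged[~drifted].mean()       # stable items wrongly flagged
    type_ii = (~flagged)[drifted].mean()    # drifted items missed
    return type_i, type_ii, 1.0 - type_ii   # sensitivity = 1 - Type II

# Toy bank of 10 items: items 0-2 truly drifted; a rule flags items 0, 1, 3
truly = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0], dtype=bool)
flag = np.array([1, 1, 0, 1, 0, 0, 0, 0, 0, 0], dtype=bool)
t1, t2, sens = drift_flag_metrics(flag, truly)
print(t1, t2, sens)   # 1/7, 1/3, 2/3

# RMSE and BIAS over the flagged items (difference from banked values)
b_banked = np.zeros(10)
b_recal = np.array([0.5, 0.4, 0.0, 0.1, 0, 0, 0, 0, 0, 0])
diff = (b_recal - b_banked)[flag]
rmse = np.sqrt(np.mean(diff ** 2))    # root mean squared difference
bias = np.mean(diff)                  # average signed difference
```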


Type I error rate

* Average over 40 replications

  • The Type I error rate for the Control condition is also inflated

  • Condition 1 had a higher Type I error rate


Type II error rate

* Average over 40 replications

  • The Type II error rate for the Displacement method is too high.

  • Condition 1 had a higher Type II error rate


Sensitivity

* Average over 40 replications

  • The sensitivity of the Displacement method is too low.

  • Condition 1 had a lower sensitivity


Items with small sample sizes and small drift are difficult to flag correctly.


Type II errors occurred for items with small sample sizes and/or small drift.

(Figure: plots annotating items with large drift, items with small N, and items with small drift; "same items" labels mark the same items recurring across panels.)


Which method re-calibrates item difficulty closer to the banked value?

  • The medians of the RMSE are similar across the three methods

  • IBI has a smaller variance in RMSE than AP


Which method shows less bias in the re-calibrated item difficulty?

  • All three methods have very small bias

  • IBI has a smaller variance in BIAS than AP


Conclusion

  • Use caution when relying on the Displacement value to identify item parameter drift.

  • AP, API, and IBI worked reasonably well.

  • Item parameter drift is difficult to detect for items with small drift or small sample sizes.

  • Compared to AP, IBI had smaller variance in RMSE and BIAS.

  • Item parameter drift in one direction (Condition 1) would cause more bias in the final ability estimates, leading to higher Type I and Type II error rates.


Limitation and Future Study

  • The proportion of items with item parameter drift was 10% of the bank.

    • How would the results change with different proportions? With different drift sizes?

  • Only the Rasch model was used

    • How about other models and software?

  • The minimum sample size was 10

    • How about different minimum sample sizes (e.g., 30, 50)?

  • No iterative procedure (no updating of the item difficulty after drift is detected)

    • Would results improve with an iterative procedure, updating the difficulty after detection?

