Explore diverse methods of outlier detection in educational data forensics and analyze their effectiveness through statistical evaluations and comparisons. Investigate erasure analysis, scale score changes, performance level changes, and more to identify irregularities in test-taking behaviors and academic outcomes.
Data Forensics: A Compare and Contrast Analysis of Multiple Methods • Christie Plackner
Outlier Score • Applied to most of the methods • Statistical probabilities were transformed into a score of 0 to 50 • 10 = statistically unusual
Erasure Analysis • Wrong-to-right (WR) erasure rate higher than expected from random events • The baseline for the erasure analysis is the state average • One sample t-test
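The one-sample t-test named on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the presenter's implementation: the function name `one_sample_t`, the per-student WR erasure counts, and the state average are all hypothetical.

```python
import math
import statistics

def one_sample_t(values, baseline):
    """Return (t statistic, degrees of freedom) for H0: mean(values) == baseline."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    t = (mean - baseline) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical wrong-to-right erasure counts for the students in one school
school_wr = [2, 3, 1, 4, 5, 3, 2, 4]
state_average = 1.5  # hypothetical state baseline

t_stat, df = one_sample_t(school_wr, state_average)
print(f"t = {t_stat:.2f}, df = {df}")
```

A large positive t suggests the school's WR erasure rate sits above the state baseline; converting t to a probability requires the Student t distribution, which is left out to keep the sketch dependency-free.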
Scale Score Changes • Scale score changes statistically higher or lower than the previous year • Cohort and Non-cohort • One sample t-test
Performance Level Changes • Large changes in proportion in performance levels across years • Cohort and Non-cohort • Log odds ratio • adjusted to accommodate small sample size • z test
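A log odds ratio with a small-sample adjustment and a z test might look like the sketch below. The slide does not say which adjustment was used; adding 0.5 to every cell (a common continuity correction) is an assumption here, and the function name and counts are hypothetical.

```python
import math
from statistics import NormalDist

def log_odds_ratio_z(prof_now, n_now, prof_prev, n_prev, correction=0.5):
    """Compare the proportion at a performance level across two years.

    Returns (log odds ratio, z statistic, two-sided p-value).
    The 0.5 cell correction is an assumed small-sample adjustment.
    """
    a = prof_now + correction                # at level, current year
    b = (n_now - prof_now) + correction      # not at level, current year
    c = prof_prev + correction               # at level, previous year
    d = (n_prev - prof_prev) + correction    # not at level, previous year
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = log_or / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return log_or, z, p

# Hypothetical school: 18 of 25 proficient this year, 9 of 24 last year
log_or, z, p = log_odds_ratio_z(18, 25, 9, 24)
print(f"log OR = {log_or:.3f}, z = {z:.2f}, p = {p:.4f}")
```

The correction keeps the statistic defined when a cell count is zero, which is the usual concern with small schools.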
Measurement Model Misfit • Performed better or worse than expected • Rasch residuals summed across operational items • Adjusted for unequal school sizes
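One way to picture the misfit statistic is sketched below. It uses the standard dichotomous Rasch model, in which the probability of a correct response depends on ability minus item difficulty. The slide does not specify how unequal school sizes were handled; dividing the summed residual by the square root of the summed response variances is an assumption, as are the function names and the data.

```python
import math

def rasch_prob(theta, b):
    """Rasch probability of a correct response for ability theta and difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def school_misfit(responses, thetas, difficulties):
    """Summed Rasch residuals for one school, standardized by their total variance.

    responses: one list of 0/1 item scores per student
    thetas: one ability estimate per student
    difficulties: one difficulty per operational item
    The standardization is an assumed adjustment for school size.
    """
    total_resid = 0.0
    total_var = 0.0
    for row, theta in zip(responses, thetas):
        for x, b in zip(row, difficulties):
            p = rasch_prob(theta, b)
            total_resid += x - p          # observed minus expected
            total_var += p * (1.0 - p)    # Bernoulli variance of the response
    return total_resid / math.sqrt(total_var)

# Hypothetical school: two students, two items, all answered correctly
stat = school_misfit([[1, 1], [1, 1]], thetas=[0.0, 0.0], difficulties=[0.0, 0.0])
print(f"misfit statistic = {stat:.2f}")
```

Positive values mean the school performed better than the model expected and negative values mean worse, matching the "better or worse than expected" framing of the slide.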
Subject Regression • Large deviations from expected scores • Within year – reading and mathematics • Across year – cohort within a subject • One sample t-test
Modified Jacob and Levitt • Only method not resulting in a school receiving a score • Combination of two indicators: • unexpected test score fluctuations across years using a cohort of students, and • unexpected patterns in student answers • Modified application of Jacob and Levitt (2003) • 2 years of data • Sample size
Principal Component Analysis • Does each method contribute to the overall explained variance? • Can the methods be reduced for a more efficient approach?
Multiple Methods • Erasure analysis (mER) • Scale score changes using non-cohort groups (mSS) • Scale score changes using cohort groups (mSC) • Performance level changes using non-cohort groups (mPL) • Performance level changes using cohort groups (mPLC) • Model misfit using Rasch residuals (mRR) • Across-subject regression using reading scores to predict mathematics scores (mRG) • Within-subject regression using a cohort’s previous-year score to predict the current score (mCR) • Index 1 of the Modified Jacob and Levitt, evaluating score changes (mMJL1) • Index 2 of the Modified Jacob and Levitt, evaluating answer sheet patterns (mMJL2)
Principal Component Analysis • Grade 4 mathematics exam • 10 methods
Simplified Loading Matrix • +/- : loading greater than ½ the maximum value in the component • (+)/(-) : loading between ¼ and ½ the maximum
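The coding rule on this slide can be sketched as a small function. The loadings below are hypothetical, not values from the presentation. Treating the "maximum" as the largest absolute loading in the component and leaving smaller loadings blank are assumptions.

```python
def simplify_component(loadings):
    """Map each loading in one component to '+', '-', '(+)', '(-)' or ''.

    loadings: dict of method name -> loading for a single component.
    """
    max_abs = max(abs(v) for v in loadings.values())
    simplified = {}
    for method, value in loadings.items():
        ratio = abs(value) / max_abs if max_abs else 0.0
        sign = "+" if value >= 0 else "-"
        if ratio > 0.5:
            simplified[method] = sign            # more than half the maximum
        elif ratio >= 0.25:
            simplified[method] = f"({sign})"     # a quarter to a half of the maximum
        else:
            simplified[method] = ""              # too small to display
    return simplified

# Hypothetical loadings on one component
component = {"mSC": 0.60, "mCR": 0.55, "mPL": -0.40, "mER": -0.20, "mRR": 0.05}
simple = simplify_component(component)
print(simple)
```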
Reducing Variable Set • Determine how many components to retain • Cumulative percentage of total variation • Eigenvalues • The scree plot
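Two of the retention criteria listed here can be sketched numerically. The eigenvalues are hypothetical (ten values summing to 10, as for a correlation matrix of ten methods). The 80% cumulative threshold and the eigenvalue cutoff of 1 are common conventions and are assumed here, not taken from the presentation.

```python
def n_by_cumulative(eigenvalues, threshold=0.80):
    """Smallest number of components whose cumulative share of variance reaches threshold."""
    total = sum(eigenvalues)
    running = 0.0
    for k, ev in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += ev
        if running / total >= threshold:
            return k
    return len(eigenvalues)

def n_by_eigenvalue(eigenvalues, cutoff=1.0):
    """Number of components whose eigenvalue exceeds the cutoff."""
    return sum(1 for ev in eigenvalues if ev > cutoff)

# Hypothetical eigenvalues for ten methods
eigs = [3.2, 1.8, 1.3, 1.0, 0.8, 0.6, 0.5, 0.4, 0.3, 0.1]
k_cum = n_by_cumulative(eigs)
k_eig = n_by_eigenvalue(eigs)
print(k_cum, k_eig)
```

The criteria can disagree, which is one reason a scree plot is usually inspected alongside them.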
Reducing Variable Set • Select one method to represent a component • Selecting methods within components • Positive selection • Retain highest loading method with components • Discarded principal components • Remove highest loading method with
Reducing Variable Set • Cohort regression* • Modified J&L, Index 1* • Non-cohort scale score change • Model misfit
Conclusion • All methods seem to account for variation in detecting test-taking irregularities • Accounting for the most variation: • Cohort regression • Cohort scale score change • Cohort performance level change • Method reduction approaches produced the same results
Discussion • Different component selection methodologies • Closer examination of variables • Remove cohort regression or cohort scale score change • Combine the J&L indexes • Remove erasures