Estimation of Ability Using Globally Optimal Scoring Weights

Shin-ichi Mayekawa
Graduate School of Decision Science and Technology, Tokyo Institute of Technology

Outline
• Review of existing methods
• Globally Optimal Weight: a set of weights that maximizes the Expected Test Information
• Intrinsic Category Weights
• Examples
• Conclusions

Background
• Estimation of the IRT ability θ on the basis of the simple or weighted summed score X.
• Conditional distribution of X given θ, as the distribution of the weighted sum of the Scored Multinomial Distribution.
• Posterior distribution of θ given X: h(θ|x) ∝ f(x|θ) h(θ)
• Posterior mean (EAP) of θ given X.
• Posterior standard deviation (PSD) of θ given X.
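The chain from item parameters to the EAP and PSD can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: binary 2PL items with scaling constant D = 1, integer item weights, a normal reference distribution approximated on an equally spaced grid, and hypothetical item parameters. It is not the author's implementation, and the function names are made up for this example.

```python
import math

def irf(theta, a, b):
    """2PL logistic item response function (assumption: D = 1)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def score_dist(theta, items, weights):
    """Distribution of X = sum_j w_j * u_j given theta, built one item at a time."""
    dist = {0: 1.0}
    for (a, b), w in zip(items, weights):
        p = irf(theta, a, b)
        new = {}
        for x, pr in dist.items():
            new[x] = new.get(x, 0.0) + pr * (1.0 - p)   # item answered incorrectly
            new[x + w] = new.get(x + w, 0.0) + pr * p   # item answered correctly
        dist = new
    return dist

def eap_psd(x, items, weights, prior_mean=0.0, prior_sd=1.0, n=61):
    """Posterior mean (EAP) and standard deviation (PSD) of theta given X = x.

    x must be an attainable score, otherwise the posterior cannot be normalized.
    """
    grid = [prior_mean + prior_sd * (-4.0 + 8.0 * i / (n - 1)) for i in range(n)]
    prior = [math.exp(-0.5 * ((t - prior_mean) / prior_sd) ** 2) for t in grid]
    post = [h * score_dist(t, items, weights).get(x, 0.0) for t, h in zip(grid, prior)]
    total = sum(post)
    post = [p / total for p in post]
    eap = sum(t * p for t, p in zip(grid, post))
    var = sum((t - eap) ** 2 * p for t, p in zip(grid, post))
    return eap, math.sqrt(var)

# Hypothetical (a, b) parameters and integer item weights, for illustration only.
items = [(1.0, -1.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0)]
weights = [1, 1, 2, 2]
eap, psd = eap_psd(4, items, weights)
```

The item-by-item recursion is the usual way to build a summed-score distribution; extending it to polytomous items would replace the two branches per item with one branch per category.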

Item Score
• We must choose the item weights w to calculate X.
• IRF: item response function

Item Score
• We must choose the item weights w and the category weights v to calculate X.
• ICRF: item category response function
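The score formulas on these two slides did not survive extraction. A plausible reading, consistent with how w and v are used later in the talk, is given below. The indicator notation u_j and u_{jk}, the item count n, and the top category index m_j are assumptions introduced here.

```latex
% Binary items: u_j = 1 if item j is answered correctly, 0 otherwise.
X = \sum_{j=1}^{n} w_j u_j ,
\qquad P_j(\theta) = \Pr(u_j = 1 \mid \theta)

% Polytomous items: u_{jk} = 1 if the response to item j falls in category k.
X = \sum_{j=1}^{n} w_j \sum_{k=0}^{m_j} v_{jk} u_{jk} ,
\qquad P_{jk}(\theta) = \Pr(u_{jk} = 1 \mid \theta)
```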

Conditional distribution of X given θ
• Binary items: conditional distribution of the summed score X.
  • Simple sum: Walsh (1955), Lord (1969)
  • Weighted sum: Mayekawa (2003)
• Polytomous items: conditional distribution of the summed score X.
  • Simple sum: Hanson (1994), Thissen et al. (1995)
  • With item weights and category weights: Mayekawa & Arai (2007)

Example
• Eight Graded Response Model (GRM) items, with 3 categories for each item.

Example (choosing weight)
• Example: Mayekawa and Arai (2008)
• Small posterior variance → good weight.
• Large Test Information (TI) → good weight.

Test Information Function
• The Test Information Function grows with the slope of the conditional expectation of X given θ (the test characteristic curve, TCC), and is inversely proportional to the squared width of the confidence interval (CI) of θ given X.
• Width of the CI
  • Proportional to the conditional standard deviation of X given θ, and inversely proportional to the slope of the TCC.
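In symbols, the relationship can be written as follows. This is the standard information function of a score X, stated here as a reading aid rather than as a copy of the slide. The normal quantile z and the linearization of the TCC around θ are assumptions of the sketch.

```latex
\mathrm{TCC}(\theta) = E[X \mid \theta],
\qquad
I_X(\theta) = \frac{\left( \dfrac{d}{d\theta}\,\mathrm{TCC}(\theta) \right)^{2}}
                   {\mathrm{Var}(X \mid \theta)}

% Inverting a locally linear TCC turns an interval for X into an interval for theta:
\mathrm{width}(\theta) \approx \frac{2 z \, \sigma_{X \mid \theta}}{\mathrm{TCC}'(\theta)},
\qquad
I_X(\theta) \approx \left( \frac{2 z}{\mathrm{width}(\theta)} \right)^{2}
```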

Confidence interval (CI) of θ given X

Test Information Function for Polytomous Items
• ICRF: item category response function

Maximization of the Test Information when the category weights are known.
• Category-weighted item score and the item response function

Maximization of the Test Information when the category weights are known.

Maximization of the Test Information when the category weights are known.
• Test Information

Maximization of the Test Information when the category weights are known.
• First Derivative

Maximization of the Test Information when the category weights are known.
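The equations on these slides were lost in extraction. The following is a reconstruction from the standard score information formula, not the author's own notation: μ_j and σ_j² denote the conditional mean and variance of the category-weighted score of item j, and N and D are shorthand introduced here.

```latex
\mu_j(\theta) = \sum_{k} v_{jk} P_{jk}(\theta),
\qquad
\sigma_j^{2}(\theta) = \sum_{k} v_{jk}^{2} P_{jk}(\theta) - \mu_j(\theta)^{2}

I(\theta; w) = \frac{N^{2}}{D},
\qquad
N = \sum_{j} w_j \mu_j'(\theta),
\qquad
D = \sum_{j} w_j^{2} \sigma_j^{2}(\theta)

\frac{\partial I}{\partial w_j}
  = \frac{2N}{D^{2}} \left( \mu_j'(\theta)\, D - N\, w_j\, \sigma_j^{2}(\theta) \right) = 0
\quad\Longrightarrow\quad
w_j \propto \frac{\mu_j'(\theta)}{\sigma_j^{2}(\theta)}
```

Because this solution depends on θ, it is only locally optimal, which is the motivation for averaging over a reference distribution.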

Globally Optimal Weight
• A set of weights that maximizes the Expected Test Information with respect to some reference distribution of θ. It does NOT depend on θ.

Example

NABCT    A     B1     B2       GO   GOINT        A   AINT
Q1     1.0   -2.0   -1.0    7.144       7    8.333      8
Q2     1.0   -1.0    0.0    7.102       7    8.333      8
Q3     1.0    0.0    1.0    7.166       7    8.333      8
Q4     1.0    1.0    2.0    7.316       7    8.333      8
Q5     2.0   -2.0   -1.0   17.720      18   16.667     17
Q6     2.0   -1.0    0.0   17.619      18   16.667     17
Q7     2.0    0.0    1.0   17.773      18   16.667     17
Q8     2.0    1.0    2.0   18.160      18   16.667     17

LOx      LO       GO       GOINT    A        AINT     CONST
7.4743   7.2993   7.2928   7.2905   7.2210   7.2564   5.9795

Maximization of the Test Information with respect to the category weights.
• Absorb the item weight in the category weights.

Maximization of the Test Information with respect to the category weights.
• Test Information
• Linear transformation of the category weights does NOT affect the information.

Maximization of the Test Information with respect to the category weights.
• First Derivative

Maximization of the Test Information with respect to the category weights.
• Locally Optimal Weight
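The formulas for this part were also lost. One way to reconstruct the argument, using the same score information formula with the item weights absorbed into v, is sketched below. The shorthand N, D, μ_j and the constants c and d_j are introduced here and are not taken from the slides.

```latex
X = \sum_{j} \sum_{k} v_{jk} u_{jk},
\qquad
\mu_j(\theta) = \sum_{k} v_{jk} P_{jk}(\theta)

I(\theta; v) = \frac{N^{2}}{D},
\qquad
N = \sum_{j} \sum_{k} v_{jk} P_{jk}'(\theta),
\qquad
D = \sum_{j} \Bigl( \sum_{k} v_{jk}^{2} P_{jk}(\theta) - \mu_j(\theta)^{2} \Bigr)

\frac{\partial I}{\partial v_{jk}}
  = \frac{2N}{D^{2}} \Bigl( P_{jk}'(\theta)\, D - N\, P_{jk}(\theta)\,\bigl( v_{jk} - \mu_j(\theta) \bigr) \Bigr) = 0
\quad\Longrightarrow\quad
v_{jk} = c\, \frac{P_{jk}'(\theta)}{P_{jk}(\theta)} + d_j , \quad c > 0

% At this solution the information reaches the usual polytomous test information:
I(\theta; v) = \sum_{j} \sum_{k} \frac{P_{jk}'(\theta)^{2}}{P_{jk}(\theta)}
```

The free constants c and d_j reflect the invariance to linear transformation of the category weights noted on the slide.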

Globally Optimal Weight
• Weights that maximize the Expected Test Information with respect to some reference distribution of θ.

Intrinsic category weight
• A set of weights which maximizes:
• Since the category weights can be linearly transformed, we set v0 = 0, ..., vmax = maximum item score.
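The criterion on this slide was lost in extraction. The sketch below assumes it is the expected information of a single item under the reference distribution h(θ), and searches for the middle-category weight v1 of a 3-category GRM item with v0 = 0 and v2 = 2 fixed. The logistic GRM with D = 1, the grid approximation of h(θ), the grid search over v1, and all function names are assumptions made for illustration.

```python
import math

def grm_probs(theta, a, b1, b2):
    """Category probabilities of a 3-category GRM item and their derivatives."""
    p1 = 1.0 / (1.0 + math.exp(-a * (theta - b1)))
    p2 = 1.0 / (1.0 + math.exp(-a * (theta - b2)))
    d1, d2 = a * p1 * (1.0 - p1), a * p2 * (1.0 - p2)
    return (1.0 - p1, p1 - p2, p2), (-d1, d1 - d2, d2)

def item_info(theta, v, a, b1, b2):
    """Information of the category-weighted item score at theta."""
    probs, dprobs = grm_probs(theta, a, b1, b2)
    mean = sum(vk * pk for vk, pk in zip(v, probs))
    slope = sum(vk * dk for vk, dk in zip(v, dprobs))
    var = sum(vk * vk * pk for vk, pk in zip(v, probs)) - mean * mean
    return slope * slope / var

def expected_item_info(v1, a, b1, b2, mean=0.0, sd=1.0, n=81):
    """Item information averaged over a normal reference distribution on a grid."""
    v = (0.0, v1, 2.0)  # v0 and v2 are fixed; only v1 is free
    grid = [mean + sd * (-4.0 + 8.0 * i / (n - 1)) for i in range(n)]
    dens = [math.exp(-0.5 * ((t - mean) / sd) ** 2) for t in grid]
    total = sum(dens)
    return sum(d / total * item_info(t, v, a, b1, b2) for t, d in zip(grid, dens))

def intrinsic_v1(a, b1, b2, mean=0.0, sd=1.0, step=0.01):
    """Grid search over v1 in [0, 2] for the largest expected item information."""
    candidates = [i * step for i in range(int(round(2.0 / step)) + 1)]
    return max(candidates, key=lambda v1: expected_item_info(v1, a, b1, b2, mean, sd))

# Hypothetical item whose middle category peaks at theta = 0.
v1_centered = intrinsic_v1(1.0, -1.0, 1.0, mean=0.0)
v1_shifted = intrinsic_v1(1.0, -1.0, 1.0, mean=1.0)
```

Under these assumptions, the centered case is symmetric around the peak of the middle category, so the search should land near v1 = 1, and moving the reference distribution upward should lower v1.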

Example of Intrinsic Weights

Example of Intrinsic Weights
• h(θ) = N(-0.5, 1): v0=0, v1=*, v2=2

Example of Intrinsic Weights
• h(θ) = N(0.5, 1): v0=0, v1=*, v2=2

Example of Intrinsic Weights
• h(θ) = N(1, 1): v0=0, v1=*, v2=2

Summary of Intrinsic Weight
• It does NOT depend on θ, but it does depend on the reference distribution of θ, h(θ), as follows.
• For the 3-category GRM, we found that:
  • For items with a high discrimination parameter, the intrinsic weights tend to become equally spaced: v0=0, v1=1, v2=2.
• The Globally Optimal Weight is not identical to the Intrinsic Weights.

Summary of Intrinsic Weight
• For the 3-category GRM, we found that:
  • The mid-category weight v1 increases according to the location of the peak of the ICRF. That is, the easier the category is, the higher the weight.
  • v1 is affected by the relative location of the other two category ICRFs.

Summary of Intrinsic Weight
• For the 3-category GRM, we found that:
  • The mid-category weight v1 decreases according to the location of the reference distribution of θ, h(θ).
  • If the location of h(θ) is high, the most difficult category gets a relatively high weight, and vice versa.
  • When the peak of the 2nd category matches the mean of h(θ), we have equally spaced category weights: v0=0, v1=1, v2=2.

Globally Optimal w given v

Test Information

LOx       LO        GO        GOINT     CONST
30.5320   30.1109   30.0948   29.5385   24.8868

Test Information

Bayesian Estimation of θ from X

Bayesian Estimation of θ from X
• (1/0.18)^2 = 30.864

Conclusions
• A polytomous item has Intrinsic Weights.
• By maximizing the Expected Test Information with respect to either the item weights or the category weights, we can calculate the Globally Optimal Weights, which do not depend on θ.
• Using the Globally Optimal Weights when evaluating the EAP of θ given X reduces the posterior variance.

References

Thank you for your kind attention.
