Presentation Transcript


  1. Enhancing the Technical Quality of the North Carolina Testing Program: An Overview of Current Research Studies Nadine McBride, NCDPI Melinda Taylor, NCDPI Carrie Perkis, NCDPI

  2. Overview • Comparability • Consequential validity • Other projects on the horizon

  3. Comparability • Previous Accountability Conference presentations provided early results • Research funded by an Enhanced Assessment Grant from the US Department of Education • Focused on the following topics: • Translations • Simplified language • Computer-based • Alternative formats

  4. What is Comparability? Not just “same score” • Same content coverage • Same decision consistency • Same reliability & validity • Same other technical properties (e.g., factor structure) • Same interpretations of test results, with the same level of confidence
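
To make “same decision consistency” concrete, here is a minimal Python sketch that compares pass/fail classifications from two test variations using raw agreement and Cohen’s kappa. The function name, score arrays, and cut scores are illustrative assumptions, not part of the NCDPI studies.

```python
# Illustrative sketch only: decision consistency between two test variations,
# expressed as raw agreement and Cohen's kappa on pass/fail classifications.
import numpy as np
from sklearn.metrics import cohen_kappa_score

def decision_consistency(scores_a, scores_b, cut_a, cut_b):
    """Compare pass/fail decisions from form A and form B for the same examinees."""
    pass_a = np.asarray(scores_a) >= cut_a
    pass_b = np.asarray(scores_b) >= cut_b
    agreement = np.mean(pass_a == pass_b)       # proportion of identical decisions
    kappa = cohen_kappa_score(pass_a, pass_b)   # chance-corrected agreement
    return agreement, kappa
```

Comparable variations should yield agreement and kappa values close to those observed between parallel forms of the general assessment.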

  5. Goal • Develop and evaluate methods for determining the comparability of scores from test variations to scores from the general assessments • It should be possible to draw the same inferences, with the same level of confidence, from variations of the same test.

  6. Research Questions • What methods can be used to evaluate score comparability? • What types of information are needed to evaluate score comparability? • How do different methods compare in the types of information about comparability they provide?

  7. Products • Comparability Handbook • Current Practice • State Test Variations • Procedures for Developing Test Variations and Evaluating Comparability • Literature Reviews • Research Reports • Recommendations • Designing Test Variations • Evaluating Comparability of Scores

  8. Results – Translations • Replication methodology is helpful when faced with small samples and widely different proficiency distributions • Gauge variability due to sampling (random) error • Gauge variability due to distribution differences • Multiple methods for evaluating structure are helpful • Effect-size criteria are helpful for flagging DIF • Congruence between structural and DIF results
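
As an illustration of effect-size criteria for DIF, the sketch below computes the Mantel-Haenszel statistic on the ETS delta scale and applies a simplified A/B/C flag based on magnitude only (the full ETS rules also require a significance test). The variable names and stratification by total score are assumptions for illustration; this is not the exact procedure used in the translation study.

```python
# Illustrative sketch only: Mantel-Haenszel DIF on the ETS delta scale
# for a single dichotomous item, stratified by a matching total score.
import numpy as np

def mh_ddif(item, group, strata):
    """item: 0/1 responses; group: 0 = reference, 1 = focal; strata: matching score."""
    item, group, strata = map(np.asarray, (item, group, strata))
    num = den = 0.0
    for s in np.unique(strata):
        m = strata == s
        a = np.sum((item == 1) & (group == 0) & m)  # reference group correct
        b = np.sum((item == 0) & (group == 0) & m)  # reference group incorrect
        c = np.sum((item == 1) & (group == 1) & m)  # focal group correct
        d = np.sum((item == 0) & (group == 1) & m)  # focal group incorrect
        n = a + b + c + d
        if n > 0:
            num += a * d / n
            den += b * c / n
    alpha = num / den                     # common odds ratio across strata
    return -2.35 * np.log(alpha)          # MH D-DIF on the ETS delta metric

def ets_flag(ddif):
    """Simplified ETS category by magnitude only (A = negligible, C = large DIF)."""
    mag = abs(ddif)
    return "A" if mag < 1.0 else ("B" if mag < 1.5 else "C")
```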

  9. Results – Simplified Language • Carefully documented and followed development procedures focused on maintaining the item construct can support comparability arguments. • Linking/equating approaches can be used to examine and/or establish comparability. • Comparing item statistics using the non-target group can provide information about comparability.
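
One simple linking approach consistent with the equating bullet above is linear (mean-sigma) equating of the simplified-language form onto the general form. The sketch below assumes the two forms were taken by randomly equivalent groups; it is an illustration, not the method NCDPI used.

```python
# Illustrative sketch only: mean-sigma linear equating of form X onto form Y,
# assuming randomly equivalent groups took the two forms.
import numpy as np

def linear_equate(x_scores, y_scores):
    """Return a function mapping form-X raw scores onto the form-Y scale."""
    mu_x, sd_x = np.mean(x_scores), np.std(x_scores, ddof=1)
    mu_y, sd_y = np.mean(y_scores), np.std(y_scores, ddof=1)
    slope = sd_y / sd_x
    intercept = mu_y - slope * mu_x
    return lambda x: slope * np.asarray(x) + intercept
```

Applying the returned function places simplified-language scores on the general-form scale before comparing distributions or cut-score decisions.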

  10. Results – Computer-based • Propensity score matching produced similar results to studies using within-subjects samples. • Propensity score method provides a viable alternative to the difficult-to-implement repeated measures study. • Propensity score method is sensitive to group differences. For instance, the method performed better when 8th and 9th grade groups were matched separately.
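
A minimal sketch of the propensity score approach described above: estimate each student’s probability of testing on computer from background covariates with logistic regression, then pair each computer-based examinee with the nearest paper-based examinee on that score. The column names and the 1:1 matching choice are assumptions for illustration.

```python
# Illustrative sketch only: 1:1 nearest-neighbor propensity score matching
# of computer-based (mode = 1) and paper-based (mode = 0) examinees.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def match_on_propensity(df: pd.DataFrame, mode_col: str, covariates: list):
    """Return index arrays of matched computer-based and paper-based examinees."""
    model = LogisticRegression(max_iter=1000).fit(df[covariates], df[mode_col])
    df = df.assign(pscore=model.predict_proba(df[covariates])[:, 1])
    treated = df[df[mode_col] == 1]
    control = df[df[mode_col] == 0]
    nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
    _, idx = nn.kneighbors(treated[["pscore"]])
    return treated.index.to_numpy(), control.index.to_numpy()[idx.ravel()]
```

Consistent with the sensitivity noted above, the matching could be run separately within each grade (e.g., 8th and 9th) rather than pooling grades.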

  11. Results – Alternative Formats • The burden of proof is much heavier for this type of test variation. • A study based on students eligible for the general test can provide some, but not solid, evidence of comparability. • Judgment-based studies combined with empirical studies are needed to evaluate comparability. • More research is needed on methods for evaluating which constructs each test format measures.

  12. Lessons Learned • It takes a village… • Cooperative effort of SBE, IT, districts and schools to implement special studies • Researchers to conduct studies, evaluate results • Cooperative effort of researchers and TILSA members to review study design and results • Assessment community to provide insight and explore new ideas

  13. Consequential Validity • What is consequential validity? • An amalgamation of evidence regarding the degree to which the use of test results has social consequences • Consequences can be positive or negative, intended or unintended

  14. Whose Responsibility? • Role of the Test Developer versus the Test User? • Responsibilities and roles are not clearly defined in the literature • A state may be designated as both a test developer and a test user

  15. Test Developer Responsibility • Generally responsible for… • Intended effects • Likely side effects • Persistent unanticipated effects • Promoted use of scores • Effects of testing

  16. Test Users’ Responsibility • Generally responsible for… • Use of scores • The further a use departs from the intended uses, the greater the user’s responsibility

  17. Role of Peer Review • Element 4.1 • For each assessment, including the alternate assessment, has the state documented the issue of validity… with respect to the following categories: • g) Has the state ascertained whether the assessment produces intended and unintended consequences?

  18. Study Methodology • Focus Groups • Conducted in five regions across the state • Led by NC State’s Urban Affairs • Completed in December 2009 and January 2010 • Input from teachers and administrative staff • Included large, small, rural, urban, and suburban schools

  19. Study Methodology • Survey Creation • Drafts currently modeled after surveys conducted in other states • However, most of those were conducted 10+ years ago • Surveys will be finalized after focus group results are reviewed

  20. Study Methodology • Survey administration • Testing Coordinators to receive survey notification • Survey to be available from late March through April

  21. Study Results • Stay tuned! • We hope to make the report publicly available on the DPI testing website

  22. Other Research Projects • Trying out different item types • Item location effects • Auditing

  23. Contact Information • Nadine McBride Psychometrician nmcbride@dpi.state.nc.us • Melinda Taylor Psychometrician mtaylor@dpi.state.nc.us • Carrie Perkis Data Analyst cperkis@dpi.state.nc.us
