
Diagnostic Measurement and Reporting on Concept Inventories


Presentation Transcript


  1. Diagnostic Measurement and Reporting on Concept Inventories Lou DiBello and Jim Pellegrino DRK-12 PI Meeting Washington, DC December 3, 2010

  2. Acknowledge NSF Support • For substantial portions of the work presented here we acknowledge NSF support under projects: • REESE-TTCI Project (NSF #0918552; Collaborative Research: Integrating Cognition and Measurement with Conceptual Knowledge: Establishing the Validity and Diagnostic Capacity) • DRK-12 Project (NSF #DRL-0732090; Evaluation of the Cognitive, Psychometric, and Instructional Affordances of Curriculum-Embedded Assessments: A Comprehensive Validity-Based Approach) • CCLI Project (NSF # 0920242; Collaborative Research: ciHUB, a Virtual Community to Support Research, Development, and Dissemination of Concept Inventories) • REESE Synthesis (NSF #0815065; Practical and Theoretical Foundations for Informative Classroom Assessment: A Synthesis of Cognitive Science, Curriculum, Instruction, and Measurement)

  3. General Features of CIs • CIs typically assess a relatively narrow domain—“the concept of force” in physics (FCI, Hestenes); the area of “statics” (CATS, Steif & Dantzler); or “heat transfer, thermodynamics, and fluid mechanics” (TTCI, Streveler, Olds, Miller, Nelson) • CIs attempt to measure deeper conceptual understanding, not just rote facts or procedures • CIs are typically used in courses in high school, college, community college, & technical schools

  4. Unresolved Issues Related to CIs & Their Applications • Rigorous empirical support for the diagnostic and formative instructional usefulness of CIs has yet to be shown • There is a general need to validate CIs’ conceptual underpinnings and to find ways to reliably extract useful diagnostic information for instructional application

  5. Diagnostic Modeling for CIs • The CI development framework claims that each item taps particular conceptual knowledge • We attempt to identify a set of concepts & skills for diagnostic reporting that simultaneously represents the CI’s conceptual framework as tapped by the full set of items—finding the “sweet spot” • Develop a hypothesized matrix of items x diagnostic skills—we assume a multivariate skill-item mapping • Apply multivariate methods to test and refine the theory, extract item- and person-level diagnostic information, and validate the skills framework & inventory

  6. Diagnostic Goals • Derive person and population information: • For each student, a “skills profile” indicating mastery or non-mastery of each skill, e.g., (0,0,1,*,1,1,0,1,1,0), where * means uncertain about skill 4 (see the sketch below) • Derive item and test information: • Estimate item parameters that represent measurement features of items and skills • Critique and evaluate the model-based analysis • Reasonability, reliability, model-data fit • Examine the classroom usefulness of the model-based information • Are student skill profiles useful for students and instructors? • Can information about skills and items improve CI use and impact?
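
A minimal sketch of how such a skills profile could be represented and summarized in code; this is a hypothetical illustration, not code from the project, and the skill names and the use of None for the uncertain (*) entry are assumptions.

```python
# Hypothetical sketch: one student's skills profile, where 1 = mastered,
# 0 = not mastered, and None stands in for an uncertain (*) classification.
# Skill names are placeholders, not the actual CATS skill labels.

SKILLS = [f"skill_{k}" for k in range(1, 11)]

# Profile from the slide: (0,0,1,*,1,1,0,1,1,0)
profile = (0, 0, 1, None, 1, 1, 0, 1, 1, 0)

def summarize(profile, skills=SKILLS):
    """Group skills by their estimated mastery status."""
    return {
        "mastered": [s for s, m in zip(skills, profile) if m == 1],
        "not_mastered": [s for s, m in zip(skills, profile) if m == 0],
        "uncertain": [s for s, m in zip(skills, profile) if m is None],
    }

print(summarize(profile))
# e.g. {'mastered': ['skill_3', 'skill_5', ...], ..., 'uncertain': ['skill_4']}
```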

  7. Example of Applying Diagnostic Analysis to a CI • CATS is a multiple-choice test with 27 questions; its distractors were developed by first asking open-ended questions and taking account of common student errors • Santiago-Román built a “skills” framework consisting of 10 skills for diagnostic reporting • Used the Fusion Model/Arpeggio to analyze CATS data (DiBello & Stout, 2007)

  8. A General Diagnostic Modeling Procedure for CIs • Begin with the conceptual framework of CATS (or any specific CI) • Develop from that framework a set of skills or conceptual understandings for diagnostic measurement and reporting • Map skills to items—Q matrix • Construct diagnostic model using skills & Q matrix • Perform the model-based statistical analysis and evaluate and critique the results • Modify skills, items or aspects of the model • Iterate the analysis process

  9. Q matrix—Strong Cognitive & Conceptual Assumptions (1 = the indicated skill is required for that item)
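
For illustration only (the labels and entries below are hypothetical, not the actual CATS Q matrix), a Q matrix is a binary items-by-skills table in which a 1 marks a skill assumed to be required by an item; because the mapping is multivariate, an item may load on several skills.

```python
# Hypothetical Q matrix sketch: rows = items, columns = skills,
# 1 = the skill is assumed to be required to answer the item correctly.
# Labels and entries are placeholders, not the actual CATS mapping.
import numpy as np

skills = ["S1", "S2", "S3", "S4"]
items = ["item_1", "item_2", "item_3"]

Q = np.array([
    [1, 0, 1, 0],   # item_1 requires S1 and S3
    [0, 1, 0, 0],   # item_2 requires S2 only
    [1, 1, 0, 1],   # item_3 requires S1, S2, and S4
])

# An item may require more than one skill (multivariate skill-item mapping)
for item, row in zip(items, Q):
    required = [s for s, q in zip(skills, row) if q == 1]
    print(f"{item}: requires {required}")
```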

  10. Clusters of Concepts for CATS (Steif & Dantzler, 2005, p. 363)

  11. Sample Item #1 from CATS (Steif)

  12. Four-Phase Procedure to investigate diagnostics in CATS (Santiago-Román, PhD thesis, Purdue, 2009) • Phase 1 – Identify “skills” for diagnostic reports • Build upon the conceptual foundation for CATS • Which cognitive attributes are required for each question in CATS? • Phase 2 – Estimate Fusion Model parameters • Estimate model parameters • Infer skills profiles for students • Evaluate reasonability, model estimation, and model-data fit • Phase 3 – Evaluate model fit and reliability • Model-data fit • Reliability of skill profile reports • Phase 4 – Consider model implications • Compute expected student skill patterns • Which cognitive attributes are estimated to be more/less difficult? • Are any modifications indicated to skills or items?

  13. Second representation (after conversations with Dr. Steif) (Santiago-Román, 2009)

  14. Initial cognitive attributes for each item (Santiago-Román, 2009)

  15. Final “Skills” and their relation to the CATS Framework

  16. Student Scores & Skill Profiles (1 = master; 0 = non-master; 9 = uncertain)

  17. Population Proportion of Masters for each skill (pk = estimated population proportion of masters of skill k)
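
A small sketch of how pk could be computed from the estimated skill profiles; the student profiles below are made up for illustration, and excluding the uncertain code 9 from the calculation is an assumption.

```python
# Hypothetical sketch: estimate pk, the population proportion of masters of
# each skill, from model-based skill profiles (1 = master, 0 = non-master,
# 9 = uncertain). Uncertain classifications are excluded here by assumption.
import numpy as np

profiles = np.array([   # made-up profiles: 5 students x 4 skills
    [1, 0, 1, 9],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 9, 1],
])

masked = np.ma.masked_equal(profiles, 9)  # drop the uncertain (9) entries
pk = masked.mean(axis=0)                  # proportion of masters per skill
for k, p in enumerate(pk, start=1):
    print(f"skill {k}: pk = {p:.2f}")
```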

  18. Model-data fit outcomes: Item Masters vs. Item Non-masters • + = proportion correct by item masters • - = proportion correct by item non-masters • pdiff = 0.6517; Mp+ = 0.8899; NMp+ = 0.2332
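
The per-item fit statistic on this slide can be read as the gap in proportion correct between item masters and item non-masters. The sketch below computes it under the assumption that an "item master" is a student classified as mastering every skill the Q matrix assigns to the item; the data and labels are hypothetical.

```python
# Hypothetical sketch of the per-item fit check:
#   pdiff = (proportion correct among item masters)
#         - (proportion correct among item non-masters)
# "Item master" is assumed here to mean: classified as mastering every skill
# the Q matrix assigns to that item. Data below are made up for illustration.
import numpy as np

def item_pdiff(responses, profiles, q_row):
    """responses: 0/1 item scores; profiles: 0/1 skill masteries (students x skills);
    q_row: 0/1 row of the Q matrix for this item."""
    required = q_row.astype(bool)
    is_master = (profiles[:, required] == 1).all(axis=1)
    mp_plus = responses[is_master].mean()     # Mp+ for this item
    nmp_plus = responses[~is_master].mean()   # NMp+ for this item
    return mp_plus - nmp_plus

# 6 students, 3 skills, one item that requires skills 1 and 3
responses = np.array([1, 1, 0, 1, 0, 0])
profiles = np.array([[1, 0, 1], [1, 1, 1], [0, 1, 0],
                     [1, 0, 1], [0, 0, 1], [1, 1, 0]])
q_row = np.array([1, 0, 1])
print(f"pdiff = {item_pdiff(responses, profiles, q_row):.2f}")
```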

  19. Diagnostic Results (Santiago-Román, 2009): • Skills were identified consonant with the CATS conceptual foundations • Student profiles were generated • Estimated parameter values were reasonable—for example, they identified easier and harder skills • The diagnostic model was successfully fit to the data; fit indicators were nearly twice as good as similar indices for retrofitted assessments: • Average across all items: pdiff = 0.6517

  20. Next Steps for CATS • Consider implications of current results • Item quality and diagnostic utility • Diagnosticity of overall instrument • Conceptual quality and coherence of the model • Engage in external validation studies • Student protocols • Validation with other student samples • Add information from the distractors to the modeling effort & diagnostic output

  21. Next Steps for other CIs and other STEM Assessments • Applicability of these procedures to other CIs and other STEM assessments • Develop a conceptual model • Collect adequate data to perform model analyses • Iteratively refine & interpret • Interpretive use of student & class information for instructional improvement • Future development of “item pools” and “testlets” for web-based administration
