Diagnostic Measurement and Reporting on Concept Inventories Lou DiBello and Jim Pellegrino DRK-12 PI Meeting Washington, DC December 3, 2010
Acknowledge NSF Support • For substantial portions of the work presented here we acknowledge NSF support under projects: • REESE-TTCI Project (NSF #0918552; Collaborative Research: Integrating Cognition and Measurement with Conceptual Knowledge: Establishing the Validity and Diagnostic Capacity) • DRK-12 Project (NSF #DRL-0732090; Evaluation of the Cognitive, Psychometric, and Instructional Affordances of Curriculum-Embedded Assessments: A Comprehensive Validity-Based Approach) • CCLI Project (NSF #0920242; Collaborative Research: ciHUB, a Virtual Community to Support Research, Development, and Dissemination of Concept Inventories) • REESE Synthesis (NSF #0815065; Practical and Theoretical Foundations for Informative Classroom Assessment: A Synthesis of Cognitive Science, Curriculum, Instruction, and Measurement)
General Features of CIs • CIs typically assess a relatively narrow domain—“the concept of force” in physics (FCI, Hestenes); the area of “statics” (CATS, Steif & Dantzler); or “heat transfer, thermodynamics, and fluid mechanics” (TTCI, Streveler, Olds, Miller, Nelson) • CIs attempt to measure deeper conceptual understanding, not just rote facts or procedures • CIs are typically used in courses in high school, college, community college, & technical schools
Unresolved Issues Related to CIs & Their Applications • Rigorous empirical support for the diagnostic and formative instructional usefulness of CIs has yet to be shown • General need to validate CIs’ conceptual underpinnings and to find ways to reliably extract useful diagnostic information for instructional application
Diagnostic Modeling for CIs • The CI development framework claims that each item taps particular conceptual knowledge • We attempt to identify a set of concepts & skills for diagnostic reporting that simultaneously represents the CI’s conceptual framework as tapped by the full set of items—finding the “sweet spot” • Develop a hypothesized matrix of items × diagnostic skills—we assume a multivariate skill-item mapping • Apply multivariate methods to test and refine the theory, extract item- and person-level diagnostic information, and validate the skills framework & inventory
Diagnostic Goals • Derive person and population information: • For each student, a “skills profile” indicating mastery or non-mastery of each skill: (0,0,1,*,1,1,0,1,1,0) {* means uncertain about skill 4} • Derive item and test information: • Estimate item parameters that represent measurement features of items and skills • Critique and evaluate the model-based analysis • Reasonability, reliability, model-data fit • Examine the classroom usefulness of the model-based information • Are student skills profiles useful for students and instructors? • Can information about skills and items improve CI use and impact?
Example of Applying Diagnostic Analysis to a CI • CATS is a multiple-choice test with 27 questions whose distractors were developed by first asking open-ended questions and taking account of common student errors • Santiago-Román built a “skills” framework consisting of 10 skills for diagnostic reporting • Used the Fusion Model/Arpeggio system to analyze CATS data (DiBello & Stout, 2007)
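As background, here is a sketch of the Fusion Model (reparameterized unified model) item response function in its usual presentation; the exact parameterization estimated by Arpeggio may differ in details such as the scaling of the residual-ability term:

$$P(X_{ij}=1 \mid \boldsymbol{\alpha}_i, \theta_i) \;=\; \pi_j^{*} \prod_{k=1}^{K} \left(r_{jk}^{*}\right)^{\,q_{jk}\,(1-\alpha_{ik})} \, P_{c_j}(\theta_i)$$

Here $q_{jk}$ is the Q-matrix entry for item $j$ and skill $k$, $\alpha_{ik}$ indicates whether student $i$ has mastered skill $k$, $\pi_j^{*}$ is the probability that a student who has mastered all required skills applies them correctly on item $j$, $r_{jk}^{*} \in (0,1)$ is the penalty for lacking skill $k$ (smaller values indicate stronger diagnostic discrimination on that skill), and $P_{c_j}(\theta_i)$ is a Rasch-type term in a residual ability $\theta_i$ with completeness parameter $c_j$, covering knowledge not represented in the Q-matrix.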
A General Diagnostic Modeling Procedure for CIs • Begin with the conceptual framework of CATS (or any specific CI) • Develop from that framework a set of skills or conceptual understandings for diagnostic measurement and reporting • Map skills to items—Q matrix • Construct diagnostic model using skills & Q matrix • Perform the model-based statistical analysis and evaluate and critique the results • Modify skills, items or aspects of the model • Iterate the analysis process
Q matrix—Strong Cognitive & Conceptual Assumptions • 1 = the indicated skill is required for that item (see the illustrative sketch below)
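A minimal sketch in Python of what such a Q matrix looks like as a data structure; the items, skill labels, and entries below are invented for illustration and are not the actual CATS Q matrix (27 items × 10 skills):

```python
import numpy as np

# Toy Q-matrix: 4 items x 3 skills (the real CATS matrix is 27 items x 10 skills).
# Q[j, k] = 1 encodes the strong assumption that skill k is required to answer item j.
skill_labels = ["free-body diagrams", "force equilibrium", "friction"]  # illustrative labels only
Q = np.array([
    [1, 0, 0],   # item 1 taps skill 1 only
    [1, 1, 0],   # item 2 taps skills 1 and 2
    [0, 1, 1],   # item 3 taps skills 2 and 3
    [1, 1, 1],   # item 4 taps all three (multivariate skill-item mapping)
])

for j, row in enumerate(Q, start=1):
    required = [skill_labels[k] for k in np.flatnonzero(row)]
    print(f"Item {j} requires: {', '.join(required)}")
```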
Clusters of Concepts for CATS (Steif & Dantzler, 2005, p. 363)
Four-Phase Procedure to investigate diagnostics in CATS (Santiago-Román, PhD thesis, Purdue, 2009) • Phase 1 – Identify “skills” for diagnostic reports • Build upon the conceptual foundation for CATS • Which cognitive attributes are required for each question in CATS? • Phase 2 – Estimate Fusion Model parameters • Estimate model parameters • Infer skills profiles for students • Evaluate reasonability, model estimation, and model-data fit • Phase 3 – Evaluate model fit and reliability • Model-data fit • Reliability of skill profile reports • Phase 4 – Consider model implications • Compute expected student skill patterns (see the sketch below) • Which cognitive attributes are estimated as more/less difficult? • Are any modifications indicated to skills or items?
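A minimal sketch, on invented data, of one way to make the Phase 4 step of examining skill patterns concrete: tabulate how often each estimated mastery pattern occurs in a class. The profiles below are hypothetical and much shorter than the 10-skill CATS profiles:

```python
from collections import Counter

# Hypothetical estimated mastery profiles (1 = master, 0 = non-master)
# for six students over four skills.
profiles = [
    (1, 1, 0, 0),
    (1, 1, 0, 0),
    (1, 0, 0, 0),
    (1, 1, 1, 0),
    (1, 1, 0, 1),
    (1, 1, 1, 1),
]

# The most common patterns suggest which combinations of skills
# students in this class tend to have mastered together.
for pattern, count in Counter(profiles).most_common():
    print(pattern, count)
```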
Second representation (after conversations with Dr. Steif) (Santiago-Román, 2009)
Initial cognitive attributes for each item (Santiago-Román, 2009)
Student Scores & Skill Profiles • 1 = master; 0 = non-master; 9 = uncertain
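The 1/0/9 coding is typically obtained by thresholding each student’s estimated posterior probability of mastery for each skill; the sketch below assumes illustrative cutoffs of 0.6 and 0.4, which are not necessarily those used in the CATS analysis:

```python
def classify_mastery(posterior, hi=0.6, lo=0.4):
    """Map a posterior probability of skill mastery to the report coding:
    1 = master, 0 = non-master, 9 = uncertain (posterior between the cutoffs).
    The cutoff values are illustrative assumptions."""
    if posterior >= hi:
        return 1
    if posterior <= lo:
        return 0
    return 9

# Invented posterior mastery probabilities for one student across five skills.
posteriors = [0.91, 0.12, 0.55, 0.78, 0.33]
profile = [classify_mastery(p) for p in posteriors]
print(profile)  # -> [1, 0, 9, 1, 0]
```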
Population Proportion of Masters for each skill • p_k = estimated population proportion of masters of skill k
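In the Fusion Model, p_k is estimated directly as a model parameter; as a hedged illustration only, the descriptive analogue can be computed from classified profiles. Whether uncertain classifications (9) are excluded or treated as non-masters is an assumption made here:

```python
def proportion_of_masters(profiles, k):
    """Descriptive estimate of p_k: share of students classified as masters (1)
    of skill k, excluding students coded 9 (uncertain) on that skill."""
    codes = [profile[k] for profile in profiles if profile[k] != 9]
    return sum(c == 1 for c in codes) / len(codes) if codes else float("nan")

# Invented classified profiles for four students over three skills.
profiles = [
    [1, 0, 1],
    [1, 9, 0],
    [0, 1, 1],
    [1, 1, 9],
]
for k in range(3):
    print(f"p_{k + 1} = {proportion_of_masters(profiles, k):.2f}")
```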
Model-data fit outcomes: Item Masters vs. Item Non-masters • + = proportion correct by item masters • − = proportion correct by item non-masters • pdiff = 0.6517 • Mp+ = 0.8899; NMp+ = 0.2332 (illustrated below)
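As a hedged sketch of what this fit check involves (the exact internal computation in Arpeggio may differ): for each item, compare the observed proportion correct among students classified as masters of all skills the Q-matrix requires for that item against the proportion among the remaining students; pdiff is the gap between the two, averaged over items.

```python
import numpy as np

def item_fit_stats(responses, is_item_master):
    """responses: 0/1 correctness of each student on one item.
    is_item_master: True if the student is classified as mastering every skill
    the Q-matrix requires for that item (a hypothetical classification here).
    Returns (Mp+, NMp+, pdiff) for the item."""
    responses = np.asarray(responses, dtype=float)
    mask = np.asarray(is_item_master, dtype=bool)
    mp = responses[mask].mean()      # proportion correct among item masters
    nmp = responses[~mask].mean()    # proportion correct among item non-masters
    return mp, nmp, mp - nmp

# Invented data for one item and eight students.
resp = [1, 1, 1, 0, 1, 0, 0, 0]
master = [True, True, True, True, False, False, False, False]
mp, nmp, pdiff = item_fit_stats(resp, master)
print(f"Mp+ = {mp:.2f}, NMp+ = {nmp:.2f}, pdiff = {pdiff:.2f}")
```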
Diagnostic Results (Santiago-Román, 2009): • Skills were identified consonant with the conceptual foundations • Student profiles were generated • Estimated parameter values were reasonable—for example, they distinguished easier and harder skills • The diagnostic model was successfully fit to the data, with fit indicators nearly twice as good as similar indices for retrofitted assessments: • Average across all items: pdiff = 0.6517
Next Steps for CATS • Consider implications of current results • Item quality and diagnostic utility • Diagnosticity of overall instrument • Conceptual quality and coherence of the model • Engage in external validation studies • Student protocols • Validation with other student samples • Add information from the distractors to the modeling effort & diagnostic output
Next Steps for Other CIs and Other STEM Assessments • Applicability of these procedures to other CIs and other STEM assessments • Develop a conceptual model • Collect adequate data to perform model analyses • Iteratively refine & interpret • Interpretive use of student & class information for instructional improvement • Future development of “item pools” and “testlets” for web-based administration