
Integrating Measurement and Sociocognitive Perspectives in Educational Assessment

Robert J. Mislevy, University of Maryland. Robert L. Linn Distinguished Address.


Presentation Transcript


  1. Integrating Measurement and Sociocognitive Perspectives in Educational Assessment Robert J. Mislevy University of Maryland Robert L. Linn Distinguished Address Sponsored by AERA Division D. Presented at the Annual Meeting of the American Educational Research Association, Denver, CO, May 1, 2010. This work was supported by a grant from the Spencer Foundation.

  2. Messick, 1994 • [W]hat complex of knowledge, skills, or other attribute should be assessed... • Next, what behaviors or performances should reveal those constructs, and • what tasks or situations should elicit those behaviors?

  3. Snow & Lohman, 1989 • Summary test scores, and factors based on them, have often been thought of as “signs” indicating the presence of underlying, latent traits. … • An alternative interpretation of test scores as samples of cognitive processes and contents … is equally justifiable and could be theoretically more useful.

  4. Roadmap • Rationale • Model-based reasoning • A sociocognitive perspective • Assessment arguments • Measurement models & concepts • Why are these issues important? • Conclusion

  5. Rationale. [Diagram: the measurement frame and the sociocognitive frame, side by side.]

  6. Rationale An articulated way to think about assessment: • Understand task & use situations in “emic” sociocognitive terms. • Identify the shift into “etic” terms in task-level assessment arguments. • Examine the synthesis of evidence across tasks in terms of model-based reasoning. • Reconceive measurement concepts. • Draw implications for assessment practice.

  7. Model-Based Reasoning

  8. [Diagram: model-based reasoning. Entities and relationships in a real-world situation are mapped onto Representational Form A (e.g., y = ax + b), which connects to Representational Form B (e.g., (y - b)/a = x) through mappings among representational systems, yielding a reconceived real-world situation. Measurement models and measurement concepts play these representational roles.]

  9. [Diagram: the same model-based reasoning scheme, layered. Entities and relationships in a lower-level model (measurement models, measurement concepts) are reconceived as entities and relationships in a higher-level model (sociocognitive concepts), again via mappings among representational systems between Representational Form A (y = ax + b) and Form B ((y - b)/a = x).]

  10. A Sociocognitive Perspective

  11. Some Foundations Themes from, e.g., cog psych, linguistics, neuroscience, anthropology: • Connectionist metaphor, associative memory, complex systems (variation, stability, attractors) • Situated cognition & information processing • E.g., Kintsch’s Construction-Integration (CI) theory of comprehension; diSessa’s “knowledge in pieces” • Intrapersonal & extrapersonal patterns

  12. Some Foundations Extrapersonal patterns: Linguistic: Grammar, conventions, constructions Cultural models: What ‘being sick’ means, restaurant script, apology situations Substantive: F=MA, genres, plumbing, etc. Intrapersonal resources: Connectionist metaphor for learning Patterns from experience at many levels

  13. [Diagram: two persons, A and B. What goes on “Inside A” and “Inside B” is not observable; only their outward behavior is observable.]

  14. Context, à la Kintsch: propositional content of text / speech, and internal and external aspects of context. [Diagram: persons A and B in a shared context.]

  15. The C in CI theory is Construction: activation of both relevant and irrelevant bits from LTM, past experience. All L/C/S levels involved. Example: chemistry problems in German. • If a pattern hasn’t been developed in past experience, it can’t be activated (although it may get constructed in the interaction). • A relevant pattern from LTM may be activated in some contexts but not others (e.g., physics models).

  16. The I in CI theory is Integration: • Situation model: synthesis of coherent / reinforced activated L/C/S patterns.

  17. The situation model is also the basis of planning and action.

  18. [Diagram: persons A and B, each embedded in multiple, partially overlapping contexts.]

  19. Ideally, activation of relevant and compatible intrapersonal patterns…

  20. …leads to (sufficiently) shared understanding; i.e., co-constructed meaning. • Persons’ capabilities, situations, and performances are intertwined. • Meaning is co-determined, through L/C/S patterns.

  21. What can we say about individuals? Use of resources in appropriate contexts in appropriate ways; i.e., attunement to targeted L/C/S patterns: • Recognize markers of externally-viewed patterns? • Construct internal meanings in their light? • Act in ways appropriate to targeted L/C/S patterns? • What are the range and circumstances of activation? (variation of performance across contexts)

  22. Assessment Arguments

  23. Messick, 1994 • [W]hat complex of knowledge, skills, or other attribute should be assessed... • Next, what behaviors or performances should reveal those constructs, and • what tasks or situations should elicit those behaviors?

  24. Toulmin’s Argument Structure. [Diagram: Data, so Claim, since Warrant, on account of Backing, unless Alternative explanation.]
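
The structure on slide 24 can be sketched as a simple data type. A minimal illustration, not from the talk; the class, field names, and example argument are all hypothetical:

```python
# A minimal sketch of Toulmin's argument structure as a Python data class.
# The class, field names, and example content are hypothetical illustrations.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ToulminArgument:
    claim: str                    # what is asserted
    data: List[str]               # evidence offered in support ("so")
    warrant: str                  # why the data support the claim ("since")
    backing: str                  # grounds for the warrant ("on account of")
    alternatives: List[str] = field(default_factory=list)  # "unless"

arg = ToulminArgument(
    claim="The student can monitor comprehension while reading",
    data=["The student's summary of an unfamiliar passage"],
    warrant="Accurate summaries of unfamiliar text require comprehension monitoring",
    backing="Research on reading comprehension",
    alternatives=["The student had seen the passage before"],
)
```

Making the "unless" slot explicit mirrors the role alternative explanations play in the assessment arguments developed on the following slides.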

  25. [Diagram: the Toulmin structure elaborated for assessment. A claim about the student is supported (so) by data concerning the student performance and data concerning the task situation, since warrants concerning the assessment, the task design, and the evaluation, on account of backing concerning the assessment situation, unless alternative explanations, in light of other information concerning the student vis-à-vis the assessment situation; at the base is the student acting in the assessment situation.] Annotations: • Data concerning the task situation concern features of the (possibly evolving) context as seen from the view of the assessor, in particular those seen as relevant to targets of inference. • Evaluation of performance seeks evidence of attunement to features of targeted L/C/S patterns; it depends on contextual features implicitly, since performance is evaluated in light of targeted L/C/S patterns. • The claim about the student is chosen in light of assessment purpose and conception of capabilities. • Note the move from the emic to the etic!

  26. [Diagram: the same assessment argument structure.] “Hidden” aspects of context, not in the test theory model but essential to the argument: What attunements to linguistic / cultural / substantive patterns can be presumed or arranged for among examinees, to condition inference re targeted L/C/S patterns? These are fundamental to the situated meaning of student variables in measurement models; both critical and implicit.

  27. [Diagram: the assessment argument unfolding over time. Data concerning the task situation comprise macro features of the situation and micro features of the situation as it evolves; data concerning student performance comprise macro and micro features of the unfolding situated performance.] • Features of context arise over time as the student acts / interacts. • Features of performance are evaluated in light of the emerging context. • Especially important in simulation, game, and extended performance contexts (e.g., Shute).

  28. [Diagram: the full assessment argument, now labeled as the Design Argument.]

  29. [Diagram: the Design Argument joined to a Use Argument (Bachman): a claim about the student in a use situation, supported by data concerning the use situation, since a warrant concerning the use situation, on account of backing concerning the use situation, unless alternative explanations, in light of other information concerning the student vis-à-vis the use situation.]

  30. [Diagram: Design Argument and Use Argument.] The claim about the student is the output of the assessment argument and the input to the use argument. How it is cast depends on psychological perspective and intended use. When measurement models are used, the claim is an etic synthesis of evidence, expressed as values of student-model variable(s).

  31. [Diagram: Design Argument and Use Argument, repeated as a build.]

  32. [Diagram: Design Argument and Use Argument, repeated as a build.]

  33. [Diagram: Design Argument and Use Argument, repeated as a build.]

  34. [Diagram: Design Argument and Use Argument.] Warrant for inference: increased likelihood of activation in the use situation if the pattern was activated in task situations. • What features do tasks and use situations share? Implicit in trait arguments; explicit in sociocognitive arguments. • Empirical question: degrees of stability, ranges and conditions of variability (Chalhoub-Deville).

  35. [Diagram: Design Argument and Use Argument.] What features do tasks and use situations not have in common? Use situation features call for other L/C/S patterns that weren’t in the task and may or may not be in the examinee’s resources. • Target patterns activated in the task but not the use context. • Target patterns activated in use but not the task context. • Issues of validity & generalizability, e.g., “method factors.” • Knowing about the relation of target examinees and use situations strengthens inferences: “bias for the best” (Swain, 1985).

  36. Multiple Tasks. [Diagram: claims about the student supported by performance data (Dp), situation data (Ds), other information (OI), and evaluation processes (A) from multiple tasks.] Synthesize evidence from multiple tasks, in terms of proficiency variables in a measurement model. • Snow & Lohman’s sampling. • What accumulates? L/C/S patterns, but with variation. • What is similar from the analyst’s perspective need not be from the examinee’s.

  37. Measurement Models & Concepts: AS IF • Tendencies for certain kinds of performance in certain kinds of situations are expressed as student model variables θ. • Individual performances (X) are modeled as probabilistic functions of θ: variability. • Probability models permit sophisticated reasoning about evidentiary relationships in complex and subtle situations, • BUT they are models, with all the limitations implied!

  38. Measurement Models & Concepts • Xs result from particular persons calling upon resources in particular contexts (or not, or how). • Mechanically, θs simply accumulate information across situations. • Our choosing situations and what to observe drives their situated meaning. • The situated meaning of θs is tendencies toward these actions in these situations that call for certain interactional resources, via L/C/S patterns.
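
The "mechanical accumulation" of information across situations can be illustrated with a toy Bayesian update of belief about θ over a discrete grid, using a Rasch item response function. A sketch only; the grid, item difficulties, and responses are invented for illustration:

```python
# Toy illustration: a measurement model mechanically accumulating evidence
# about theta across situations, via Bayes' rule on a discrete grid.
# Item difficulties and responses are hypothetical.
import math

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

grid = [g / 10.0 for g in range(-30, 31)]   # theta grid from -3.0 to 3.0
prior = [1.0 / len(grid)] * len(grid)       # flat prior

difficulties = [-1.0, 0.0, 0.5, 1.0]        # hypothetical item difficulties
responses = [1, 1, 0, 1]                    # observed right/wrong answers

posterior = prior[:]
for b, x in zip(difficulties, responses):
    like = [rasch_p(t, b) if x == 1 else 1.0 - rasch_p(t, b) for t in grid]
    posterior = [p * l for p, l in zip(posterior, like)]
    total = sum(posterior)
    posterior = [p / total for p in posterior]   # renormalize after each item

# The posterior mean is the model's etic synthesis of evidence about theta.
theta_hat = sum(t * p for t, p in zip(grid, posterior))
```

Note what the machinery does and does not do: it accumulates likelihoods item by item, but it carries no record of which resources the examinee actually called upon; that situated meaning comes from how the situations and observations were chosen.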

  39. Classical Test Theory. [Diagram: claims about the student across multiple tasks, with observed score X an indicator of true score τ.] • Probability model: “true score” = stability along an implied dimension; “error” = variation. • Situated meaning from task features & evaluation. • Can organize around traits, task features, or both, depending on task sets and performance features. • Profile differences unaddressed.
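
As a toy illustration of the CTT decomposition (observed score as true score plus error), the following sketch simulates two parallel forms and estimates reliability as their correlation. All numbers are invented for illustration:

```python
# Toy CTT simulation: observed score X = T + E, with reliability estimated
# as the correlation between two parallel forms. Values are hypothetical.
import random

random.seed(0)
n = 2000
true_scores = [random.gauss(50, 10) for _ in range(n)]   # T: stable tendency
form1 = [t + random.gauss(0, 5) for t in true_scores]    # X1 = T + E1
form2 = [t + random.gauss(0, 5) for t in true_scores]    # X2 = T + E2

def corr(xs, ys):
    """Pearson correlation of two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Approximates var(T) / var(X) = 100 / 125 = .8 in this setup.
reliability = corr(form1, form2)
```

The single true-score dimension is exactly what the slide flags: stability is summarized along one implied dimension, and profile differences across tasks are folded into "error."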

  40. Item Response Theory. [Diagram: θ governs responses X1, X2, …, Xn across tasks.] • θ = propensity to act in the targeted way; bj = typical evocation; the IRT function = typical variation. • Situated meaning from task features & evaluation. • Task features still implicit. • Profile differences / misfit highlight where the narrative doesn’t fit, for sociocognitive reasons. • Complex systems concepts: attractors & stability → regularities in response patterns, quantified in parameters; typical variation → probability model. • Will work best when most nontargeted L/C/S patterns are familiar… • Item-parameter invariance vs. population dependence (Tatsuoka, Linn, Tatsuoka, & Yamamoto, 1988).
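
A 2PL item response function makes the slide's reading concrete: bj is the θ value at which the targeted response becomes more likely than not (the "typical evocation" point), and the curve around it describes typical variation. The parameter values below are hypothetical:

```python
# Sketch of a 2PL item response function with discrimination a_j and
# difficulty b_j. Parameter values are hypothetical illustrations.
import math

def p_2pl(theta, a, b):
    """Probability of the targeted response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# At theta == b_j the probability is exactly .5: b_j as typical evocation.
p_at_b = p_2pl(theta=0.5, a=1.2, b=0.5)
p_above = p_2pl(theta=2.0, a=1.2, b=0.5)   # higher propensity, higher probability
```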

  41. Multivariate Item Response Theory (MIRT) • θs = propensities to act in targeted ways in situations with different mixes of L/C/S demands. • Good for controlled mixes of situations.
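
A compensatory MIRT response function sketches the idea: each item draws on several θs with different weights, standing in for different mixes of L/C/S demands. The loadings and example values are invented for illustration:

```python
# Sketch of a compensatory multidimensional IRT response function: the
# probability is logistic in a weighted sum of proficiencies. Loadings,
# intercept, and theta values are hypothetical.
import math

def p_mirt(thetas, loadings, d):
    """Multidimensional 2PL: logistic in a weighted combination of thetas."""
    z = sum(a * t for a, t in zip(loadings, thetas)) + d
    return 1.0 / (1.0 + math.exp(-z))

# An item demanding mostly dimension 1 (say, substantive knowledge) and a
# little of dimension 2 (say, linguistic load):
p_high = p_mirt([1.0, 0.0], loadings=[1.5, 0.4], d=0.0)
p_low = p_mirt([-1.0, 0.0], loadings=[1.5, 0.4], d=0.0)
```

Controlling the mix of loadings across tasks is what "controlled mixes of situations" buys: without it, the dimensions lose their situated interpretation.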

  42. Structured Item Response Theory. [Diagram: θ1, …, θn govern responses X1, …, Xn, with task-feature variables vi1, …, vin structuring the items.] • Explicitly model task situations in terms of L/C/S demands. Links task design with the sociocognitive view. • Work explicitly with features in controlled and evolved situations (design / agents). • Can use with MIRT; cognitive diagnosis models.
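
Modeling task situations explicitly in terms of features can be sketched in the spirit of structured IRT models such as the LLTM: item difficulty is written as a weighted sum of task-feature indicators. The features and weights below are hypothetical stand-ins for L/C/S demands:

```python
# Sketch in the spirit of the LLTM: item difficulty b_i as a linear function
# of task features, linking task design to the measurement model. The feature
# names and weights are hypothetical.
import math

feature_weights = {
    "abstract_language": 0.8,    # linguistic demand raises difficulty
    "multi_step": 0.6,           # substantive demand raises difficulty
    "familiar_context": -0.5,    # cultural familiarity lowers difficulty
}

def item_difficulty(features):
    """b_i = sum over features of weight * feature indicator."""
    return sum(feature_weights[f] * v for f, v in features.items())

def p_rasch(theta, b):
    return 1.0 / (1.0 + math.exp(-(theta - b)))

easy_item = {"abstract_language": 0, "multi_step": 0, "familiar_context": 1}
hard_item = {"abstract_language": 1, "multi_step": 1, "familiar_context": 0}
b_easy = item_difficulty(easy_item)
b_hard = item_difficulty(hard_item)
```

Because difficulty is now a function of designed features, the same machinery can generate predictions for tasks that have not yet been administered, which is the link to task design the slide names.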

  43. Mixtures of IRT Models. [Diagram: two alternative measurement models (OR), each with its own θ and response variables X1, …, Xn.] • Different IRT models for different unobserved groups of people. • Modeling different attractor states. • Can be theory driven or discovered in data.
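
A mixture of IRT models can be sketched by computing the marginal probability of a response pattern as a class-weighted mix of two Rasch models with different difficulty orderings, standing in for two unobserved groups (say, two solution strategies). All numbers are illustrative:

```python
# Sketch of a mixture of IRT models: the marginal probability of a response
# pattern is a class-weighted mix of two Rasch models, one per unobserved
# group. Class weights, difficulties, and responses are hypothetical.
import math

def p_rasch(theta, b):
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def pattern_prob(responses, theta, difficulties):
    """Probability of a right/wrong response pattern given theta and item b's."""
    prob = 1.0
    for x, b in zip(responses, difficulties):
        p = p_rasch(theta, b)
        prob *= p if x == 1 else 1.0 - p
    return prob

# Two latent classes with opposite item-difficulty orderings: the same items
# sit at different attractor states for different groups.
class_weights = [0.6, 0.4]
class_difficulties = [[-1.0, 0.0, 1.0], [1.0, 0.0, -1.0]]

responses = [1, 1, 0]
theta = 0.0
marginal = sum(w * pattern_prob(responses, theta, d)
               for w, d in zip(class_weights, class_difficulties))
```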

  44. Measurement Concepts • Validity • Soundness of model for local inferences • Breadth of scope is an empirical question • Construct representation in L/C/S terms • Construct irrelevant sources of variation in L/C/S terms • Reliability • Through model, strength of evidence for inferences about tendencies, given variabilities … or about characterizations of variability.

  45. Measurement Concepts • Method Effects • What accumulates in terms of L/C/S patterns in assessment situations but not use situations • Generalizability Theory (Cronbach) • Watershed in emphasizing evidentiary reasoning rather than simply measurement • Focus on external features of context; can be recast in L/C/S terms, & attend to correlates of variability

  46. Why are these issues important? • Connect assessment/measurement with current psychological research • Connect assessment with learning • Appropriate constraints on interpreting large scale assessments • Inference in complex assessments • Games, simulations, performances • Assessment modifications & accommodations • Individualized yet comparable assessments

  47. Conclusion. [Diagram: the measurement frame and the sociocognitive frame, with communication at the interface.] We have work we need to do, together.
