
Validity - Consequentialism


Presentation Transcript


  1. Validity - Consequentialism Assoc. Prof. Dr. Sehnaz Sahinkarakas

  2. “Effect-driven testing” (Fulcher & Davidson, 2007) • “the effect that the test is intended to have and to structure the test development to achieve that effect” (p.144) • What does this mean?

  3. Definition of VALIDITY • “Overall judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of interpretations and actions on the basis of test scores or other modes of assessment” (Messick, 1995, p. 741). • What is a score? • In general, it is “any coding or summarization of observed consistencies or performance regularities on a test, questionnaire, observation procedure, or other assessment devices such as work samples, portfolios, and realistic problem simulations” (p. 741).

  4. Then validity is about making inferences from scores; scores are reflections of a test taker’s knowledge and/or skills based on test tasks. • Different from early definitions of validity: the degree of correlation between the test and the criterion (validity coefficient) • In the early definition: • there is an upper limit for the possible correlation (see the sketch below) • it is directly related to the reliability of the test (without high reliability a test cannot be valid) • In the new definition (especially after Messick), validity became the meaning of the test scores, not a property of the test
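To make the “upper limit” bullet concrete, here is a minimal worked sketch of the classical-test-theory bound it alludes to; this is an assumed textbook relationship, not quoted from the slides, where r_{XY} is the validity coefficient and r_{XX'} the test’s reliability.

% Assumed classical-test-theory bound behind the "upper limit" bullet above
% (illustrative, not taken from the presentation):
\[
  r_{XY} \;\le\; \sqrt{r_{XX'}}
\]
% Hypothetical worked example: with reliability r_{XX'} = 0.64, the validity
% coefficient can be at most \sqrt{0.64} = 0.80, which is why a test with low
% reliability cannot show a high validity coefficient.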

  5. Final remarks for validity (and reliability, fairness…): • not based on just measurement principles; • they are social values • correlation coefficients and/or content validity analysis are not enough to assume validity (Messick). • So, “score validation is an empirical evaluation of the meaning and consequences of measurement” (Messick)

  6. Construct Validity • What is a construct? • To define a concept in such a way that • it becomes measurable (operational definition) • it can have relationships with other constructs (e.g. the more anxious, the less self-confident) • Construct validity • is the degree to which inferences can be made from the operational definitions to the theoretical constructs on which those definitions are based • What does this mean?

  7. Two things to consider in construct validation: • Theory (what goes on in our mind: ideas, theories, beliefs…) • Observation (what we see happening around us; our actual program/treatment) • i.e., we develop something (observation) to reflect what is in our mind (theory) • Construct validity is assessing how well we have translated our ideas/theories into our actual programs/measures • What does this mean in testing? How do we do it in testing?

  8. Sources of Invalidity • Two major threats: • Construct underrepresentation: assessment is too narrow: does not include important dimensions of the construct • Construct-irrelevant variance: assessment is too broad: contains variance associated with other distinct constructs

  9. Construct-Irrelevant Variance • Two kinds • Construct-irrelevant difficulty (e.g., undue reading demands in a test of subject-matter knowledge): leads to invalidly low scores • Construct-irrelevant easiness (e.g., texts highly familiar to some test takers): leads to invalidly high scores • What do you think about KPDS/YDS in terms of threats to validity?

  10. Sources of Evidence in Construct Validity (Messick, 1995) • Construct Validity = the evidential basis for score interpretation • How do we interpret scores? • This applies to any score interpretation, not just ‘theoretical constructs’ • How do we do this?

  11. Evidence-Related Validity • Two types: • Convergent validity consists of providing evidence that two tests that are believed to measure closely related skills or types of knowledge correlate strongly (i.e. the test MEASURES what it claims to measure) • Discriminant validity consists of providing evidence that two tests that do not measure closely related skills or types of knowledge do not correlate strongly (i.e. the test does NOT MEASURE irrelevant attributes) • A small illustrative sketch follows below
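As a purely illustrative sketch of what “correlate strongly” versus “do not correlate strongly” looks like, the short Python snippet below computes Pearson correlations on invented score lists; the test names and numbers are hypothetical and not taken from the presentation.

# Hypothetical convergent/discriminant check; all scores are invented for
# illustration and do not come from the presentation or any real test.
from statistics import correlation  # Pearson's r, available in Python 3.10+

reading_a = [42, 55, 61, 48, 70, 66, 53, 59]  # scores on reading test A
reading_b = [40, 57, 63, 45, 72, 64, 50, 61]  # scores on a second reading test
math_test = [70, 52, 81, 66, 58, 74, 49, 77]  # scores on an unrelated math test

# Convergent evidence: two measures of closely related constructs should
# correlate strongly (here r comes out close to 1).
print("reading A vs reading B:", round(correlation(reading_a, reading_b), 2))

# Discriminant evidence: measures of distinct constructs should not correlate
# strongly (here r comes out near 0).
print("reading A vs math:", round(correlation(reading_a, math_test), 2))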

  12. Aspects of Construct Validity • Validity is a unified concept but it can be differentiated into distinct aspects: • Content • Substantive • Structural • Generalizability • External • Consequential

  13. Content Aspect • Content relevance; representativeness; technical quality (to what extent does it represent the domain?) • It requires identifying the construct DOMAIN to be assessed • To what extent does the domain/task cover the construct? • All important parts of the construct domain should be covered

  14. Substantive Aspect • The processes underlying the construct and the degree to which these processes are reflected • It includes the content aspect but empirical evidence is also needed. • This can be done using a variety of sources, e.g. think-aloud protocols

  15. The concept bridging the content and substantive aspects is representativeness. • Representativeness has two distinct meanings: • Mental representation (cognitive psychology) • The Brunswikian sense of ecological sampling: the correlation between a cue and a property (e.g. the colour of a banana is a cue that indicates the ripeness of the fruit)

  16. Structural Aspect • Related to scoring • The scoring criteria and rubrics should be rationally developed (based on the constructs)

  17. Generalizability • Interpretations should not be limited to the task assessed • Should be generalizable to the construct domain • (degree of correlation between the task and the others)

  18. External Variables • Scores’ relationship with other measures and non-assessment behaviours • Convergent (correspondence between measures of the same construct) and discriminant evidence (distinctness from measures of other constructs) are important

  19. Consequences • Evaluating intended and unintended consequences of score interpretation, both positive and negative impact • But negative impact should NOT be due to construct underrepresentation or construct-irrelevant variance. • Two facets: (a) justification of the testing based on score meaning or consequences contributing to score valuation; (b) function or outcome of the testing, as interpretation or applied use

  20. Facets of Validity as a Progressive Matrix (Messick, 1995, p. 748) • Two facets: (a) justification of the testing based on score meaning or consequences contributing to score valuation; (b) function or outcome of the testing, as interpretation or applied use. • When they are crossed with each other, a four-fold classification is obtained

  21. Construct validity appears in every cell in the figure. • This means: • Validity issues are unified into a unitary concept • But the distinct features of construct validity should also be emphasized • What is the implication here? • Both meaning and values are intertwined in the validation process. • Thus, • ‘Validity and values are one imperative, not two, and test validation implicates both the science and the ethics of assessment, which is why validity has force as a social value’ (Messick, 1995, p. 749).

  22. Consequential Validity & Washback • The Messickian view (unified version) of Construct Validity = considering the consequences of test use (i.e., washback) • What does this mean in validity studies?

  23. Washback is a particular instance of the consequential aspect of construct validity • Investigating washback and other consequences is a crucial step in the process of test validation • i.e., washback is one (not the only) indicator of the consequential aspect of validity • It is important to investigate washback to establish the validity of a test

  24. To put it differently: • The modern paradigm of validity comes with its consequential nature • Test impact is part of a validation argument • Thus, effect-driven testing should be considered: testers should build tests with the intended effects in mind

  25. To put it all together: Value implications + Social consequences = CONSEQUENTIAL VALIDITY (the two fairness-related elements of Messick’s consequential validity)

  26. Implication

  27. But who brings about washback (positive or negative)? • People in classrooms (T / Ss)? • Test developers? • For Fulcher and Davidson, it is the people in classrooms • Thus more attention should be given to teachers’ beliefs about teaching and learning and the degree of their PROFESSIONALISM

  28. Task A9.2 • Course book (p. 143) • Select one large-scale test you are familiar with. • What is its influence, and upon whom? • Does it seem reasonable to define these tests by their influence as well?
