Chapter 7 Evaluating What a Test Really Measures. Validity. APA – Standards for Educational and Psychological Testing (1985) – Recognized three ways of deciding whether a test is sufficiently valid to be useful.
Validity: Does the test measure what it claims to measure? The appropriateness with which inferences can be made on the basis of test results.
***NOTE – Content validity does not involve statistical analysis.
Attributes that can be described in terms of specific behaviors.
e.g., ability to play piano, do math problems
Attributes that are more difficult to describe in terms of behaviors because people might disagree on what the behaviors represent.
e.g., intelligence, creativity, personality
Concurrent validity – correlating test scores with an independent, currently available measure of the same trait that the test is designed to measure.
Or being able to distinguish between groups known to be different; i.e., significantly different mean scores on the test.
E.g.1, Teachers’ ratings of reading ability validated by correlating with reading test scores.
E.g.2, An index of self-reported delinquency validated by comparing responses to official police records on the respondents.
Comparing scores with a criterion (the standard by which your measure is being judged or evaluated):
CRITERION MEASUREMENTS MUST THEMSELVES BE VALID!
BOTH PREDICTOR AND CRITERION MEASURES MUST BE RELIABLE FIRST!
-Requires that you take into account the size of the group (N) from whom we obtained our data.
-When researchers or test developers report a validity coefficient, they should also report its level of significance.
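As a rough illustration of why the size of the group matters, the sketch below converts a validity coefficient into a t statistic with the standard formula t = r * sqrt(N - 2) / sqrt(1 - r^2). The function name and the two group sizes are hypothetical, chosen only for illustration, and the critical values are approximate two-tailed .05 cutoffs taken from a standard t table.

```python
import math

def t_statistic_for_r(r, n):
    """Convert a correlation r from n paired scores into a t statistic (df = n - 2)."""
    if n < 3:
        raise ValueError("need at least 3 pairs of scores")
    if abs(r) >= 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical example: the same validity coefficient from two group sizes.
# Critical values are approximate two-tailed .05 cutoffs from a standard t table.
for n, critical_t in [(10, 2.306), (100, 1.984)]:
    t = t_statistic_for_r(0.30, n)
    verdict = "significant" if abs(t) > critical_t else "not significant"
    print(f"r = .30, N = {n}: t = {t:.2f} ({verdict} at the .05 level)")
```

Under these assumptions the same coefficient of .30 falls short of significance with 10 people but clears the cutoff with 100, which is why a validity coefficient should be reported together with its significance level.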
r² tells us how much covariation exists between predictor and criterion; e.g., if r = .70, then 49% of the variance is common to both.
E.g., if the correlation (r) is .30, then the coefficient of determination (r²) is .09. (This means that the test and criterion have 9% of their variation in common.)
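A minimal sketch of the calculation, assuming paired predictor and criterion scores are available: it computes the Pearson correlation and then squares it to get the coefficient of determination. The helper name and the five pairs of scores are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: predictor (test) scores and criterion scores for 5 people.
test_scores = [1, 2, 3, 4, 5]
criterion_scores = [2, 1, 4, 3, 5]

r = pearson_r(test_scores, criterion_scores)
r_squared = r ** 2
print(f"r = {r:.2f}, r squared = {r_squared:.2f}")
print(f"Shared variance: {r_squared:.0%}")
```

For these made-up scores r works out to .80, so the test and criterion share 64% of their variance.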
Hits: a) True positives - predicted to succeed and did.
b) True negatives - predicted to fail and did.
Misses: a) False positives - predicted to succeed and didn’t.
b) False negatives - predicted to fail and would have succeeded.
WE WANT TO MAXIMIZE TRUE HITS AND MINIMIZE MISSES!
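The four outcomes above can be tallied as in the sketch below. The function name and the six prediction/outcome pairs are hypothetical; in practice a false negative can only be counted when the actual outcome is known for people who were predicted to fail.

```python
from collections import Counter

def classify_outcome(predicted_success, actual_success):
    """Label one decision as a hit (true positive/negative) or a miss (false positive/negative)."""
    if predicted_success and actual_success:
        return "true positive"
    if not predicted_success and not actual_success:
        return "true negative"
    if predicted_success and not actual_success:
        return "false positive"
    return "false negative"

# Hypothetical data: what the test predicted vs. what actually happened.
predictions = [True, True, False, False, True, False]
outcomes = [True, False, False, True, True, False]

counts = Counter(classify_outcome(p, a) for p, a in zip(predictions, outcomes))
hits = counts["true positive"] + counts["true negative"]
misses = counts["false positive"] + counts["false negative"]
hit_rate = hits / (hits + misses)

for label in ("true positive", "true negative", "false positive", "false negative"):
    print(f"{label}: {counts[label]}")
print(f"Hit rate: {hit_rate:.0%}")
```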
a) mathematics test
b) intelligence test
c) vocational interest inventory
d) music aptitude test
a) personnel manager
b) teacher or principal
c) college admissions officer
d) prison warden
f) guidance counselor
g) veterinary dermatologist
h) professor in medical school