Test Construction and Measurement

An Experiment • Researcher gave students the Diagnostic Inventory Blank • Hobbies, reading interests, secret hopes and ambitions • Then gave students typed descriptions of their personalities • Asked students to rate how well personality sketch described them

Sample Personality Description • You have a need for other people to like and admire you • You have a tendency to be critical of yourself • While you have some personality weaknesses, you are generally able to compensate for them

Disciplined and self-controlled outside, you tend to be worrisome and insecure inside • You pride yourself as an independent thinker and do not accept others’ statements without proof • At times you are extraverted and sociable, while at other times you are introverted and reserved

Result • Almost all students were very impressed with how well DIB described them • Rated DIB as very accurate personality test

Problem • Every student was given exactly the same personality description

The Lesson • Beware of the Barnum effect • Tendency of people to see vague, universal statements as descriptive of themselves

Major Point • Real psychological measurement is a complicated and difficult process

A Preview • Correlation • Steps in constructing a psychological test • Reliability and validity • Factor analysis

Correlational Research • Focuses on relationships among variables • Changes in one variable are associated with changes in another variable

Correlation Coefficient • Number which expresses the direction and strength of the relationship between 2 variables • Ranges from -1 to 1 • Index of the degree to which scores on one measure can be used to predict scores on a 2nd measure

Direction • Indicated by + or - sign (slope) • Positive correlation • As one variable goes up, so does the other • Negative correlation • As one variable goes up, the other goes down

Strength • Indicated by absolute value • perfect positive relationship = 1 • perfect negative relationship = -1 • no relationship = 0
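A minimal Python sketch of the coefficient described above, assuming the Pearson product-moment correlation (the slides do not name a specific coefficient). The function name `pearson_r` and the sample scores are made up for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    # Sum of cross-products of deviations, and each variable's sum of squares.
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return cross / sqrt(ss_x * ss_y)

# Made-up scores for five students.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 60, 71, 80, 88]
errors_made = [9, 8, 6, 4, 2]

print(round(pearson_r(hours_studied, exam_score), 2))   # positive sign
print(round(pearson_r(hours_studied, errors_made), 2))  # negative sign
```

The sign of the result gives the direction and the absolute value gives the strength, so the two printed values differ in sign but are both close to 1 in magnitude.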

Percent of Variance • Percent of variance in “measure A” that can be accounted for by “measure B” • Square the correlation coefficient and multiply by 100 • Correlation of .50 means we can account for 25% of variance
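The arithmetic on this slide can be sketched in a few lines of Python. The function name `percent_variance_explained` is hypothetical; the calculation is just the squared correlation times 100.

```python
def percent_variance_explained(r):
    """Percent of variance in one measure accounted for by the other."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return (r ** 2) * 100

# The sign of r does not matter once it is squared.
for r in (0.50, -0.50, 0.80):
    print(f"r = {r:+.2f} -> {percent_variance_explained(r):.0f}% of variance")
```

Note that a correlation of .50 and a correlation of -.50 account for the same share of variance, because squaring removes the sign.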

Causality • Correlation just tells you that 2 variables are related • Can’t make causal interpretations

Fact: Time spent on the internet is positively correlated with depression • Possible interpretations • Spending lots of time on the internet causes depression • Being depressed causes you to spend lots of time on the internet • Some third variable, such as living alone, causes both

Steps in Psychological Test Construction

Major Point • It is difficult, but not impossible, to construct a meaningful psychological test

Steps in Test Construction 1. Decide what to measure • Identify construct • Idea that helps us make sense of the world around us • Not directly observable • Examples: intelligence, extraversion, racism, pessimism, creativity

Steps (continued) 2. Develop a set of items/questions • Search literature • Get experts or lay people to tell us what construct means to them

Steps (continued) 3. Get sample of people to answer items • From population you want to use test for

Steps (continued) 4. Evaluate each item • Correlate each item with mean of whole set • Correlate each item with an item directly assessing the construct (e.g., self-reported racism for a racism scale) • Drop bad items
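One way to sketch the first check in step 4 in Python, assuming responses are stored as one row per person and one column per item. The function names, the made-up ratings, and the 0.30 cutoff for a "bad" item are all assumptions for illustration; the slides do not give a cutoff.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return cross / sqrt(ss_x * ss_y)

def item_total_correlations(responses):
    """Correlate each item with each person's mean across the whole item set."""
    person_means = [sum(row) / len(row) for row in responses]
    n_items = len(responses[0])
    return [
        pearson_r([row[i] for row in responses], person_means)
        for i in range(n_items)
    ]

def weak_items(responses, cutoff=0.30):
    """Indices of items whose item-total correlation falls below the cutoff.

    The 0.30 cutoff is an assumption for illustration, not from the slides.
    """
    correlations = item_total_correlations(responses)
    return [i for i, r in enumerate(correlations) if r < cutoff]

# Made-up 1-5 ratings: 5 people (rows) by 4 items (columns).
# The fourth item was written so that it does not track the other three.
responses = [
    [1, 2, 1, 3],
    [2, 2, 2, 5],
    [3, 3, 3, 1],
    [4, 4, 5, 4],
    [5, 5, 4, 2],
]

print([round(r, 2) for r in item_total_correlations(responses)])
print(weak_items(responses))  # candidate items to drop
```

Because each item is included in the mean it is correlated with, these values run slightly high; some analysts correlate each item with the mean of the remaining items instead.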

Steps (continued) 5. Select a set of items for further study • Want normal distribution • Drop high YES and high NO items
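A rough Python sketch of the "drop high YES and high NO items" rule for yes/no items. The function names and the 20%/80% cutoffs are assumptions for illustration; the slides do not say how extreme an endorsement rate must be before an item is dropped.

```python
def endorsement_rates(responses):
    """Proportion of people answering YES (True) to each item."""
    n_people = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_people for i in range(n_items)]

def items_to_keep(responses, low=0.20, high=0.80):
    """Indices of items whose YES rate is neither very low nor very high.

    The low/high cutoffs are hypothetical defaults, not values from the slides.
    """
    rates = endorsement_rates(responses)
    return [i for i, p in enumerate(rates) if low <= p <= high]

# Made-up answers: 5 people (rows) by 4 yes/no items (columns).
responses = [
    [True, True, False, False],
    [True, False, False, True],
    [True, True, False, False],
    [True, False, False, True],
    [True, True, False, True],
]

print(endorsement_rates(responses))
print(items_to_keep(responses))
```

Items that nearly everyone answers the same way add almost no variance to total scores, which is why they work against the spread of scores the slide asks for.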

Steps (continued) 6. Assess reliability of entire test • Consistency of measurement • 3 major types

Reliability 1. Inter-rater: • Extent to which different people scoring same test get same result • Correlate set of tests scored by one rater with same set of tests scored by different rater

Reliability 2. Test-retest: • Extent to which people get the same results if they take the test again • Subjects take test twice. Correlate set of time 1 scores with time 2 scores

Reliability 3. Internal consistency: • Split-half: correlation between one half of test and other half • Coefficient alpha: average of all possible split-half reliabilities
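Both internal-consistency indices can be sketched in Python. This is a sketch under stated assumptions: the split-half function uses one particular split (odd-numbered versus even-numbered items), and coefficient alpha is computed with the usual variance-based formula rather than by literally averaging every possible split. All function names and the data are hypothetical.

```python
from math import sqrt
from statistics import variance

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return cross / sqrt(ss_x * ss_y)

def split_half_correlation(responses):
    """Correlate totals on odd-numbered items with totals on even-numbered items."""
    odd_totals = [sum(row[0::2]) for row in responses]
    even_totals = [sum(row[1::2]) for row in responses]
    return pearson_r(odd_totals, even_totals)

def coefficient_alpha(responses):
    """Coefficient alpha: k/(k-1) * (1 - sum of item variances / total variance)."""
    n_items = len(responses[0])
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    total_variance = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)

# Made-up 1-5 ratings: 5 people (rows) by 4 items (columns).
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [4, 4, 5, 4],
    [5, 4, 5, 5],
]

print(round(split_half_correlation(responses), 2))
print(round(coefficient_alpha(responses), 2))
```

The result of a single split depends on which items land in which half, which is one reason alpha is usually reported instead of one split-half value.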

How high should reliability correlations be? • Expect r = .80 or better

Factors that influence reliability • Clarity of items • Motivation of test taker • Number of items

Steps (continued) 7. Assess validity of entire test • Extent to which test measures what it is supposed to measure • Face validity not sufficient • Do series of validity studies

Ways to Measure Validity 1. Criterion validity • Correlation between test and concrete, directly observable criterion • Example: correlate self-report of weight with actual weight on scale

Ways to Measure Validity (continued) 2. Content validity • Adequate coverage of target domain • Example: a test on chapters 1-4 that covers only chapters 2 and 3 lacks content validity

Ways to Measure Validity (continued) 3. Convergent validity • Agreement among alternative measures of same construct • Example: correlation between ACT and SAT 4. Discriminant validity • Lack of correlation between tests that are intended to measure different constructs • Example: expect low correlation between ACT and test of aggression

Threats to Validity • Response tendency • Assigning numbers to items for reasons that have little to do with the construct the item is intended to measure

Response Tendencies • Extremity tendency • Use the ends of the scale • Acquiescence tendency • Agree with questions • Social desirability • Answer in a way that makes you look good

Factor Analysis • Statistical technique that examines pattern of correlations among multiple tests or items • Tests or items that correlate strongly with one another are considered to represent a common, underlying factor

Interpreting Factor Analysis • Each item has a factor loading: correlation between item and factor • Marker variable • item that has high factor loading (correlation) with given factor • closely related to meaning of factor • Blend • item that loads moderately high on more than one factor • not a pure measure of factor, related to two or more factors
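The marker/blend distinction can be sketched as a simple rule in Python. This does not perform a factor analysis; it only labels items from loadings that are assumed to have been computed already. The function name, the 0.60 and 0.30 thresholds, and the items, factors, and loadings are all hypothetical.

```python
def classify_item(loadings, high=0.60, moderate=0.30):
    """Label an item as a marker, a blend, or neither from its factor loadings.

    loadings: dict mapping factor name -> loading (a correlation, -1 to 1).
    The 0.60 and 0.30 thresholds are assumptions for illustration only.
    """
    # Factors on which the item loads at least moderately (sign ignored).
    salient = {name: value for name, value in loadings.items()
               if abs(value) >= moderate}
    if len(salient) >= 2:
        return "blend"
    if len(salient) == 1 and abs(next(iter(salient.values()))) >= high:
        return "marker"
    return "neither"

# Hypothetical loadings on two hypothetical factors.
items = {
    "talkative": {"extraversion": 0.82, "neuroticism": 0.05},
    "worries a lot": {"extraversion": -0.10, "neuroticism": 0.78},
    "shy": {"extraversion": -0.45, "neuroticism": 0.40},
    "likes puzzles": {"extraversion": 0.12, "neuroticism": -0.08},
}

for name, loadings in items.items():
    print(f"{name}: {classify_item(loadings)}")
```

In this made-up table, "talkative" and "worries a lot" each load highly on a single factor, "shy" loads moderately on both, and "likes puzzles" loads on neither.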
