Reliability, Validity, and Utility in Selection - PowerPoint PPT Presentation

reliability validity and utility in selection l.
Skip this Video
Loading SlideShow in 5 Seconds..
Reliability, Validity, and Utility in Selection PowerPoint Presentation
Download Presentation
Reliability, Validity, and Utility in Selection

play fullscreen
1 / 24
Download Presentation
Reliability, Validity, and Utility in Selection
Download Presentation

Reliability, Validity, and Utility in Selection

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

  1. Reliability, Validity, and Utility in Selection

  2. Requirements for Selection Systems • Reliable • Valid • Fair • Effective

  3. Reliability • the degree to which a measure is free from random error. • Stability, Consistency, Accuracy, Dependability • Represented by a correlation coefficient: rxx • A perfect positive relationship equals +1.0 • A perfect negative relationship equals - 1.0 • Should be .80 or higher for selection

  4. Factors that Affect Reliability • Test length - longer = better • Homogeneity of test items – higher r if all items measure same construct • Adherence to standardized procedures results in higher reliability

  5. Factors that Negatively Affect Reliability • Poorly constructed devices • User error • Unstable attributes • Item difficulty – too hard or too easy inflates reliability

  6. Standardized Administration • All test takers receive: • Test items presented in same order • Same time limit • Same test content • Same administration method • Same scoring method of responses

  7. Types of Reliability • Test-retest • Alternate Forms • Internal Consistency • Interrater

  8. Test-Retest Reliability • Temporal stability • Obtained by correlating pairs of scores from the same person on two different administrations of the same test • Drawbacks – maturation; learning; practice; memory

  9. Alternate Forms • Form stability; aka parallel forms, equivalent forms • Two different versions of a test that have equal means, standard deviations, item content, and item difficulties • Obtained by correlating pairs of scores from the same person on two different versions of the same test • Drawbacks: need to create 2x items (cost); practice; learning; maturation

  10. Internal Consistency - Split-half Reliability • obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once • r must be adjusted statistically to correct for test length • Spearman-Brown Prophecy formula • Advantages: efficient; eliminates some of the drawbacks seen in other methods

  11. Internal Consistency – Coefficient Alpha • Represents the degree of correlation among all the items on a scale calculated from a single administration of a single form of a test • Obtained by averaging all possible split-half reliability estimates • Drawback: test must be uni-dimensional; can be artificially inflated if test is lengthened • Advantages: same as split-half • Most commonly used method of r

  12. Interrater Reliability • Degree of agreement that exists between two or more raters or scorers • Used to determine if scores represent rater characteristics rather than what is being rated • Obtained by correlating ratings made by one rater with those of other raters for each person being rated

  13. Validity • Extent to which inferences based on test scores are justified given the evidence • Is the test measuring what it is supposed to measure? • Extent to which performance on the measure is associated with performance on the job. • Builds upon reliability, i.e. reliability is necessary but not sufficient for validity • No single best strategy

  14. Types of Validity • Content Validity • Criterion Validity • Construct Validity • Face Validity

  15. Content Validity • Degree to which test taps into domain or “content” of what it is supposed to measure • performed by demonstrating that the items, questions, or problems posed by the test are a representative sample of the kinds of situations or problems that occur on the job. • Determined through Job Analysis • Identification of essential tasks • Identification of KSAOs required to complete tasks • Relies on judgment of SMEs • Can also be done informally

  16. Criterion Validity • Degree to which a test is related (statistically) to a measure of job performance • Statistically represented by rxy • Usu. ranges from .30 to .55 for effective selection • Can be established two ways: • Concurrent Validity • Predictive Validity

  17. Concurrent Validity • Test scores and criterion measure scores are obtained at the same time & correlated with each other • Drawbacks: • Must involve current employees, which results in range restriction & non-representative sample • Current employees will not be as motivated to do well on the test as job seekers

  18. Predictive Validity • Test scores are obtained prior to hiring, and criterion measure scores are obtained after being on the job; scores are then correlated with each other • Drawbacks: • Will have range restriction unless all applicants are hired • Must wait several months for job performance (criterion) data

  19. Construct Validity • Degree to which a test measures the theoretical construct it purports to measure • Construct – unobservable, underlying, theoretical trait

  20. Construct Validity (cont.) • Often determined through judgment, but can be supported with statistical evidence: • Test homogeneity (high alpha; factor analysis) • Convergent validity evidence - test score correlates with other measures of same or similar construct • Discriminant or divergent validity evidence – test score does not correlate with measures of other theoretically dissimilar constructs

  21. Additional Representations of Validity • Face Validity – degree to which a test appears to measure what it purports to measure; i.e., do the test items appear to represent the domain being evaluated? • Physical Fidelity – do physical characteristics of test represent reality • Psychological Fidelity – do psychological demands of test reflect real-life situation

  22. Where to Obtain Reliability & Validity Information • Derive it yourself • Publications that contain information on tests • e.g., Buros’ Mental Measurements Yearbook • Test publishers – should have data available, often in the form of a technical report

  23. Selection System Utility • Taylor-Russell Tables – estimate percentage of employees selected by a test who will be successful on the job • Expectancy Charts – similar to T-R, but not as accurate • Lawshe Tables – estimate probability of job success for a single applicant

  24. Methods for Selection Decisions • Top-down – those with the highest scores are selected first • Passing or cutoff score – everyone above a certain score is hired • Banding – all scores within a statistically determined interval or band are considered equal • Multiple hurdles – several devices are used; applicants are eliminated at each step