Reliability, Validity, and Utility in Selection

Chapter 6 notes


Presentation Transcript
Requirements for Selection Systems
  • Reliable
  • Valid
  • Fair
  • Effective
Reliability
  • The degree to which a measure is free from random error.
  • Stability, Consistency, Accuracy, Dependability
  • Represented by a correlation coefficient: rxx
    • A perfect positive relationship equals +1.0
    • A perfect negative relationship equals -1.0
  • Should be .80 or higher for selection
Factors that Affect Reliability
  • Test length – longer tests are generally more reliable
  • Homogeneity of test items – reliability is higher when all items measure the same construct
  • Adherence to standardized procedures – consistent administration yields higher reliability
Factors that Negatively Affect Reliability
  • Poorly constructed devices
  • User error
  • Unstable attributes
  • Item difficulty – items that are too hard or too easy reduce score variance and lower reliability
Standardized Administration
  • All test takers receive:
    • Test items presented in same order
    • Same time limit
    • Same test content
    • Same administration method
    • Same scoring method of responses
Types of Reliability
  • Test-retest
  • Alternate Forms
  • Internal Consistency
  • Interrater
Test-Retest Reliability
  • Temporal stability
  • Obtained by correlating pairs of scores from the same person on two different administrations of the same test (see the sketch below)
  • Drawbacks: maturation; learning; practice; memory
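A minimal sketch in Python (assuming NumPy/SciPy are available; the score lists are hypothetical) of how a test-retest coefficient would be computed. The same correlation logic applies to alternate-forms and interrater estimates.

    # Sketch: test-retest reliability as the Pearson correlation between
    # two administrations of the same test. Scores are hypothetical.
    from scipy.stats import pearsonr

    time1 = [78, 85, 92, 66, 74, 88, 59, 81]   # first administration
    time2 = [75, 88, 90, 70, 72, 85, 63, 79]   # same people, retested later

    r_xx, _ = pearsonr(time1, time2)           # reliability estimate
    print(f"test-retest r_xx = {r_xx:.2f}")    # selection use generally wants >= .80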
Alternate Forms
  • Form stability; also known as parallel forms or equivalent forms
  • Two different versions of a test that have equal means, standard deviations, item content, and item difficulties
  • Obtained by correlating pairs of scores from the same person on two different versions of the same test
  • Drawbacks: twice as many items must be written (cost); practice; learning; maturation
Internal Consistency - Split-half Reliability
  • Obtained by correlating the pairs of scores from two equivalent halves of a single test administered once
  • r must be adjusted statistically to correct for the shortened test length
    • Spearman-Brown prophecy formula (see the sketch below)
  • Advantages: efficient; eliminates some of the drawbacks seen in other methods
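A minimal sketch of the split-half procedure with the Spearman-Brown correction, r' = 2r / (1 + r), which adjusts the half-test correlation up to full test length. The item matrix and the odd/even split are hypothetical.

    # Sketch: split-half reliability with the Spearman-Brown correction.
    # Rows = test takers, columns = items (hypothetical 0/1 item scores).
    import numpy as np
    from scipy.stats import pearsonr

    items = np.array([
        [1, 1, 0, 1, 1, 0, 1, 1],
        [0, 1, 0, 0, 1, 0, 0, 1],
        [1, 1, 1, 1, 1, 1, 0, 1],
        [0, 0, 0, 1, 0, 0, 1, 0],
        [1, 0, 1, 1, 1, 1, 1, 1],
        [0, 1, 1, 0, 0, 1, 0, 0],
    ])

    odd_half  = items[:, 0::2].sum(axis=1)     # score on odd-numbered items
    even_half = items[:, 1::2].sum(axis=1)     # score on even-numbered items

    r_half, _ = pearsonr(odd_half, even_half)  # correlation between the two halves
    r_full = (2 * r_half) / (1 + r_half)       # Spearman-Brown: full-length reliability
    print(f"half-test r = {r_half:.2f}, corrected r = {r_full:.2f}")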
Internal Consistency – Coefficient Alpha
  • Represents the degree of intercorrelation among all the items on a scale, calculated from a single administration of a single form of a test
  • Obtained by averaging all possible split-half reliability estimates
  • Drawbacks: test must be unidimensional; can be artificially inflated if the test is lengthened
  • Advantages: same as split-half
  • Most commonly used reliability estimate (see the sketch below)
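A minimal sketch of coefficient (Cronbach's) alpha, computed from a single administration as alpha = k/(k-1) × (1 − sum of item variances / variance of total scores); the rating data are hypothetical.

    # Sketch: coefficient alpha from a persons-by-items score matrix.
    import numpy as np

    def coefficient_alpha(items: np.ndarray) -> float:
        """alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
        k = items.shape[1]                          # number of items
        item_vars = items.var(axis=0, ddof=1)       # variance of each item
        total_var = items.sum(axis=1).var(ddof=1)   # variance of total test scores
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    # hypothetical 5-point ratings, rows = test takers, columns = items
    scores = np.array([
        [4, 5, 4, 3],
        [2, 3, 2, 3],
        [5, 5, 4, 5],
        [3, 2, 3, 2],
        [4, 4, 5, 4],
    ])
    print(f"alpha = {coefficient_alpha(scores):.2f}")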
Interrater Reliability
  • Degree of agreement that exists between two or more raters or scorers
  • Used to determine if scores represent rater characteristics rather than what is being rated
  • Obtained by correlating ratings made by one rater with those of other raters for each person being rated
Validity
  • Extent to which inferences based on test scores are justified given the evidence
  • Is the test measuring what it is supposed to measure?
  • Extent to which performance on the measure is associated with performance on the job.
  • Builds on reliability; i.e., reliability is necessary but not sufficient for validity
  • No single best strategy
Types of Validity
  • Content Validity
  • Criterion Validity
  • Construct Validity
  • Face Validity
Content Validity
  • Degree to which test taps into domain or “content” of what it is supposed to measure
  • Demonstrated by showing that the items, questions, or problems posed by the test are a representative sample of the kinds of situations or problems that occur on the job.
  • Determined through Job Analysis
    • Identification of essential tasks
    • Identification of KSAOs required to complete tasks
    • Relies on judgment of SMEs
  • Can also be done informally
Criterion Validity
  • Degree to which a test is related (statistically) to a measure of job performance
  • Statistically represented by rxy (see the sketch below)
    • Usually ranges from .30 to .55 for effective selection tests
  • Can be established two ways:
    • Concurrent Validity
    • Predictive Validity
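A minimal sketch (hypothetical data) of estimating a criterion validity coefficient rxy by correlating selection-test scores with a later measure of job performance.

    # Sketch: criterion validity as the correlation between test scores
    # (predictor) and job-performance ratings (criterion). Data are hypothetical.
    from scipy.stats import pearsonr

    test_scores = [62, 75, 88, 54, 91, 70, 66, 83]           # selection-test scores
    performance = [3.1, 3.8, 4.2, 2.9, 4.5, 3.5, 3.0, 4.0]   # later supervisor ratings

    r_xy, _ = pearsonr(test_scores, performance)
    print(f"criterion validity r_xy = {r_xy:.2f}")  # useful tests often fall around .30-.55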
Concurrent Validity
  • Test scores and criterion measure scores are obtained at the same time & correlated with each other
  • Drawbacks:
    • Must involve current employees, which results in range restriction and a non-representative sample
    • Current employees will not be as motivated to do well on the test as job seekers
Predictive Validity
  • Test scores are obtained prior to hiring, and criterion measure scores are obtained after being on the job; scores are then correlated with each other
  • Drawbacks:
    • Will have range restriction unless all applicants are hired (illustrated in the sketch below)
    • Must wait several months for job performance (criterion) data
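A minimal simulation (hypothetical parameters, assuming NumPy/SciPy) of why range restriction is a drawback: when the correlation is computed only among the high scorers who were hired, the observed validity coefficient is typically smaller than in the full applicant pool.

    # Sketch: how range restriction attenuates an observed validity coefficient.
    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(1)
    true_validity = 0.50                               # hypothetical population value
    cov = [[1.0, true_validity], [true_validity, 1.0]]
    test, perf = rng.multivariate_normal([0, 0], cov, size=5_000).T

    r_full, _ = pearsonr(test, perf)                   # full applicant pool
    hired = test >= np.quantile(test, 0.70)            # only the top 30% are hired
    r_restricted, _ = pearsonr(test[hired], perf[hired])

    print(f"r in full pool: {r_full:.2f}")             # close to the true validity
    print(f"r among hires only: {r_restricted:.2f}")   # typically noticeably smaller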
Construct Validity
  • Degree to which a test measures the theoretical construct it purports to measure
  • Construct – unobservable, underlying, theoretical trait
Construct Validity (cont.)
  • Often determined through judgment, but can be supported with statistical evidence:
    • Test homogeneity (high alpha; factor analysis)
    • Convergent validity evidence - test score correlates with other measures of same or similar construct
    • Discriminant or divergent validity evidence – test score does not correlate with measures of other theoretically dissimilar constructs
Additional Representations of Validity
  • Face Validity – degree to which a test appears to measure what it purports to measure; i.e., do the test items appear to represent the domain being evaluated?
  • Physical Fidelity – do the physical characteristics of the test represent reality?
  • Psychological Fidelity – do the psychological demands of the test reflect the real-life situation?
Where to Obtain Reliability & Validity Information
  • Derive it yourself
  • Publications that contain information on tests
    • e.g., Buros’ Mental Measurements Yearbook
  • Test publishers – should have data available, often in the form of a technical report
Selection System Utility
  • Taylor-Russell Tables – estimate the percentage of employees selected by a test who will be successful on the job (see the sketch below)
  • Expectancy Charts – similar to T-R, but not as accurate
  • Lawshe Tables – estimate probability of job success for a single applicant
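Taylor-Russell values rest on a bivariate-normal model linking test scores and job performance. A minimal Monte Carlo sketch of that logic follows; the validity coefficient, selection ratio, and base rate are hypothetical, not values from the published tables.

    # Sketch: the bivariate-normal logic behind Taylor-Russell estimates.
    # Given a validity coefficient, selection ratio, and base rate of success,
    # estimate the proportion of selected applicants who succeed on the job.
    import numpy as np

    rng = np.random.default_rng(0)
    validity, selection_ratio, base_rate = 0.40, 0.30, 0.50   # hypothetical inputs

    cov = [[1.0, validity], [validity, 1.0]]
    test, perf = rng.multivariate_normal([0, 0], cov, size=200_000).T

    test_cut = np.quantile(test, 1 - selection_ratio)  # hire the top 30% of scorers
    perf_cut = np.quantile(perf, 1 - base_rate)        # top 50% of performers count as successful

    selected = test >= test_cut
    success_rate = (perf[selected] >= perf_cut).mean()
    print(f"expected success rate among those hired: {success_rate:.0%}")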
Methods for Selection Decisions
  • Top-down – those with the highest scores are selected first (see the sketch below)
  • Passing or cutoff score – everyone above a certain score is hired
  • Banding – all scores within a statistically determined interval or band are considered equal
  • Multiple hurdles – several devices are used; applicants are eliminated at each step
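A minimal sketch (hypothetical applicants, openings, and cutoff) contrasting top-down selection with a passing-score rule.

    # Sketch: top-down selection vs. a passing (cutoff) score.
    applicants = {"Ana": 91, "Ben": 74, "Cruz": 88, "Dee": 69, "Eli": 80}

    # Top-down: rank by score and take the highest scorers until openings are filled.
    openings = 2
    top_down = sorted(applicants, key=applicants.get, reverse=True)[:openings]

    # Passing score: everyone at or above the cutoff is eligible for hire.
    cutoff = 75
    passed = [name for name, score in applicants.items() if score >= cutoff]

    print("top-down hires:", top_down)   # ['Ana', 'Cruz']
    print("above cutoff:", passed)       # ['Ana', 'Cruz', 'Eli']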