Issues in Measuring Behaviour

Overview:
Why do we want to quantify everything?
Types of psychological test.
Factors affecting test reliability.
Factors affecting test validity.

Why quantify?
1. Science involves measurement - because measurements can be objectively obtained, are publicly available, and potentially checkable by sceptical others.
2. Science often (but not invariably) involves experimentation - because the experimental method is good for identifying cause and effect.
Purpose of tests:
2. Practical applications - clinical, educational, occupational.
Psychological testing is a 20th-century phenomenon, dating back to Binet (1900s).
Problems with tests:
1. Untrained use - easy to administer, hard to interpret.
2. Spurious “precision”, because quantitative.
3. Misapplication of findings, in a deterministic way.
4. Essentially descriptions of groups; less reliable as descriptions of individuals (e.g. sex differences in reaction time).
Types of test:
1. Performance tests (e.g. IQ tests).
2. Dispositional tests (e.g. anxiety, extroversion):
(a) vulnerable to social desirability bias (hence criterion-keyed tests);
(b) need appropriate norms.
A test is reliable if it gives consistent/reproducible results.
A score = “true” score + error:
Error is due to
(a) natural performance variation;
(b) lack of precision in defining and measuring psychological constructs (e.g. what do we mean by terms like "aggression" or "intelligence"?).
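The "observed score = true score + error" idea above can be sketched numerically. This is a minimal, illustrative simulation (all numbers are invented, not real data): a stable "true" ability plus fresh random error on each testing occasion.

```python
import random
import statistics

random.seed(0)

# Classical test theory: each observed score = stable "true" score + random error.
N = 1000
true_scores = [random.gauss(100, 15) for _ in range(N)]   # underlying ability
test_1 = [t + random.gauss(0, 5) for t in true_scores]    # first administration
test_2 = [t + random.gauss(0, 5) for t in true_scores]    # retest, fresh error

def pearson(xs, ys):
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)

# The test-retest correlation estimates reliability = var(true) / var(observed),
# i.e. 15**2 / (15**2 + 5**2) = 0.9 with these invented figures:
print(round(pearson(test_1, test_2), 2))
```

Note that consistent scores emerge even though every single measurement contains error; the error simply averages out across people.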
Ways of assessing reliability:
(a) Test-retest (time to time).
(b) Alternate forms (version to version).
(c) Split-half (item to item).
(d) Inter-scorer (person to person).
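Split-half reliability can be illustrated with a toy data set (the item responses below are invented): correlate people's scores on the odd-numbered items with their scores on the even-numbered items, then apply the Spearman-Brown formula to estimate what the full-length test's reliability would be.

```python
import statistics

# Invented item responses (1 = correct, 0 = incorrect), one row per person:
responses = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 0, 1],
    [0, 0, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 1, 1, 1],
]

def pearson(xs, ys):
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)

odd_half = [sum(row[0::2]) for row in responses]   # score on items 1, 3, 5, 7
even_half = [sum(row[1::2]) for row in responses]  # score on items 2, 4, 6, 8

r_half = pearson(odd_half, even_half)
# Spearman-Brown correction: steps the half-length correlation up to the
# estimated reliability of the full-length test.
r_full = (2 * r_half) / (1 + r_half)
print(round(r_half, 2), round(r_full, 2))
```

The corrected value is always higher than the half-test correlation, which is one way of seeing why longer tests tend to be more reliable (factor 3 below).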
Factors affecting reliability:
1. The phenomenon itself (traits vs. states).
2. Precision of measurement.
3. Length of test (long > short).
4. Time between tests (short > long).
5. Variability in performance (high > low).
6. Number of response alternatives (more > fewer): with five-option multiple choice, 20% of items are correct by chance; with true/false, 50%. Multiple choice is therefore more reliable than true/false.
7. Inter-individual variability in scores (high > low).
The greater the variability between individuals in test scores, the better the reliability: when people differ widely, a given amount of measurement error obscures a smaller proportion of the real differences between them.
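The chance-guessing point above can be made concrete. Treating pure guessing as a binomial process (an assumption for illustration; real test-takers rarely guess on every item), the expected chance score and the score "noise" guessing contributes are:

```python
import math

def chance_mean_sd(n_items, n_alternatives):
    """Mean and standard deviation of the number correct if every item is
    answered by an independent random guess (binomial model)."""
    p = 1 / n_alternatives
    return n_items * p, math.sqrt(n_items * p * (1 - p))

# On a hypothetical 50-item test:
print(chance_mean_sd(50, 5))  # 5-option multiple choice: mean 10.0, sd ~2.83
print(chance_mean_sd(50, 2))  # true/false: mean 25.0, sd ~3.54
```

True/false items leave more room for lucky guessing (a higher chance mean and a wider spread of chance scores), which is why multiple-choice items tend to yield more reliable scores.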
A test is valid if it measures what it is supposed to be measuring.
Important - a test can be reliable without being valid (but not vice versa).
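The reliable-but-not-valid distinction can be shown with a toy simulation (all values invented): imagine using head circumference as an "intelligence test". Measured carefully, it is highly consistent from occasion to occasion, yet it tells us nothing about IQ.

```python
import random
import statistics

random.seed(1)

N = 500
iq = [random.gauss(100, 15) for _ in range(N)]   # the construct we care about
head = [random.gauss(57, 2) for _ in range(N)]   # unrelated attribute (cm)

# The "test": measuring head circumference twice, with small instrument error.
measure_1 = [h + random.gauss(0, 0.2) for h in head]
measure_2 = [h + random.gauss(0, 0.2) for h in head]

def pearson(xs, ys):
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)

print(round(pearson(measure_1, measure_2), 2))  # high: the measure is reliable
print(round(pearson(measure_1, iq), 2))         # near zero: not valid for IQ
```

The reverse cannot happen: a test that correlates with nothing, including itself on retest, cannot be measuring anything consistently enough to be valid.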
Types of validity:
(a) Face validity (intuitively looks plausible).
(b) Content validity (the test covers material considered relevant - e.g. a statistics exam shouldn't contain history questions!).
(c) Criterion validity - predictive or concurrent.
Problem - finding appropriate/decent criteria.
(d) Construct validity (does performance correlate well with known measures of the phenomenon?).
(e) Ecological / external validity.
Norms and standardisation.
(a) How well was the test standardised?
Stratified random sampling is ideal.
Do sub-group norms exist?
(b) Are sufficient details given to ensure correct administration?
(c) How appropriate is the standardised group as a baseline against which to compare your sample?
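Standardisation in practice: a raw score only becomes interpretable relative to norms. A minimal sketch (the norm sample below is invented) converting a raw score into an IQ-style standard score (mean 100, SD 15):

```python
import statistics

# Invented raw scores from a (tiny) standardisation sample:
norm_sample = [31, 42, 38, 35, 44, 40, 29, 37, 41, 33]
mu = statistics.mean(norm_sample)      # 37.0
sigma = statistics.stdev(norm_sample)  # ~4.94

def standard_score(raw):
    """Place a raw score on an IQ-style scale (mean 100, SD 15) relative
    to the norm group."""
    z = (raw - mu) / sigma
    return 100 + 15 * z

print(round(standard_score(44), 1))  # a raw 44 sits ~1.4 SDs above the norms
```

This is why the choice of norm group matters: the same raw score yields a different standard score against different (e.g. sub-group) norms.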
Ecological validity: to what extent are our results generalisable to the real world?
It depends: driving simulators, for example, are good for studying vehicle control, but useless for studying how riskily people are prepared to drive.
LED brake lights illuminate faster than incandescent bulbs, but do those milliseconds make any practical difference to a following driver's braking time?