
Topics: Quality of Measurements


Presentation Transcript


  1. Topics: Quality of Measurements
  • Reliability
  • Validity

  2. The Quality of Measuring Instruments: Definitions
  • Reliability: Consistency - the extent to which the data are consistent
  • Validity: Accuracy - the extent to which the instrument measures what it purports to measure

  3. Hitting the Bull’s Eye

  4. The Questions of Reliability
  • To what degree does a subject's measured performance remain consistent across repeated testings? How consistently will results be reproduced if we measure the same individuals again?
  • What is the equivalence of results from two measurement occasions using "parallel" tests?
  • To what extent do the individual items that make up a test or inventory consistently measure the same underlying characteristic?
  • How much consistency exists among the ratings provided by a group of raters?
  • When we have obtained a score, how precise is it?

  5. True and Error Scores; Parallel Tests

  6. Sources of Error: Conditions of Test Administration and Construction
  • Changes in time limits
  • Changes in directions
  • Different scoring procedures
  • Interrupted testing session
  • Qualities of test administrator
  • Time test is taken
  • Sampling of items
  • Ambiguity in wording of items/questions
  • Ambiguous directions
  • Climate of test situation (heating, light, ventilation, etc.)
  • Differences in observers

  7. Sources of Error: Conditions of the Person Taking the Test
  • Reaction to specific items
  • Health
  • Motivation
  • Mood
  • Fatigue
  • Luck
  • Memory and/or attention fluctuations
  • Attitudes
  • Test-taking skills (test-wiseness)
  • Ability to understand instructions
  • Anxiety

  8. Reliability
  • Reliability: the ratio of true-score variance to observed-score variance
  • Reliability coefficient: a numerical index that takes a value between 0 and +1.00
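Under classical test theory, an observed score is a true score plus random error, and reliability is the share of observed-score variance that is true-score variance. A minimal simulation sketch in Python (all distributions and numbers here are hypothetical, chosen so the expected reliability is 100 / (100 + 25) = 0.8):

```python
import random
import statistics

random.seed(42)

# Simulate classical test theory: observed = true + error.
n = 10_000
true_scores = [random.gauss(50, 10) for _ in range(n)]  # true-score SD = 10
errors = [random.gauss(0, 5) for _ in range(n)]         # error SD = 5
observed = [t + e for t, e in zip(true_scores, errors)]

# Reliability = true variance / observed variance.
var_true = statistics.pvariance(true_scores)
var_obs = statistics.pvariance(observed)
reliability = var_true / var_obs
print(f"estimated reliability = {reliability:.3f}")
```

With a large sample the estimate lands near the theoretical 0.8; shrinking the error SD toward 0 drives it toward 1.0.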

  9. Relation between Reliability and Error
  [Figure: observed variance partitioned into true-score variability and error; the reliable measure (A) has a larger true-score share, the unreliable measure (B) a larger error share]

  10. Methods of Estimating Reliability
  • Test-Retest: repeated measures with the same test (coefficient of stability)
  • Parallel Forms: repeated measures with equivalent forms of a test (coefficient of equivalence)
  • Internal Consistency: repeated measures using items on a single test
  • Inter-Rater: judgments by more than one rater

  11. Reliability Is the Consistency of a Measurement

  Reliable repeated measurements/observations:
  Person    X1   X2   X3   ...   Xk→∞
  Charlie   20   19   21   ...   20
  Harry     15   17   16   ...   16

  Unreliable repeated measurements/observations:
  Person    X1   X2   X3   ...   Xk→∞
  Charlie   20   10    8   ...   23
  Harry      2   11    4   ...   15

  12. Test-Retest Reliability
  • Situation: the same people take two administrations of the same test
  • Procedure: correlate scores on the two tests, which yields the coefficient of stability
  • Meaning: the extent to which scores on a test can be generalized over different occasions (temporal stability)
  • Appropriate use: information about the stability of the trait over time

  13. Parallel (Alternate) Forms Reliability
  • Situation: testing of the same people on different but comparable forms of the test
  • Procedure: correlate the scores from the two tests, which yields a coefficient of equivalence
  • Meaning: the consistency of response to different item samples (where testing is immediate) and across occasions (where testing is delayed)
  • Appropriate use: to provide information about the equivalence of forms

  14. Internal Consistency Reliability
  • Situation: a single administration of one test form
  • Procedure: divide the test into comparable halves and correlate scores from both halves
    • Split-half with Spearman-Brown adjustment
    • Kuder-Richardson #20 and #21
    • Cronbach's alpha
  • Meaning: consistency across the parts of a measuring instrument ("parts" = individual items or subgroups of items)
  • Appropriate use: where the focus is on the degree to which the same characteristic is being measured; a measure of test homogeneity

  15. Inter-Rater Reliability
  • Situation: a sample of test papers (essays) scored independently by two examiners
  • Procedure: correlate the two sets of scores
    • Kendall's coefficient of concordance
    • Cohen's kappa
    • Intraclass correlation
    • Pearson product-moment correlation
  • Meaning: a measure of scorer (rater) reliability (consistency, agreement)
  • Appropriate use: for ensuring consistency between raters
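Of the statistics listed, Cohen's kappa is easy to compute by hand because it only needs the agreement counts: it is observed agreement corrected for the agreement two raters would reach by chance alone. A sketch with hypothetical pass/fail ratings from two examiners:

```python
from collections import Counter

# Hypothetical pass/fail ratings by two independent examiners on 20 essays.
rater_a = ["P","P","F","P","F","P","P","F","P","P",
           "F","P","P","P","F","F","P","P","F","P"]
rater_b = ["P","P","F","P","P","P","P","F","P","F",
           "F","P","P","P","F","P","P","P","F","P"]

n = len(rater_a)
# Observed proportion of agreement.
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Chance agreement from each rater's marginal category proportions.
pa, pb = Counter(rater_a), Counter(rater_b)
expected = sum((pa[c] / n) * (pb[c] / n) for c in set(rater_a) | set(rater_b))

# Kappa: agreement beyond chance, scaled by the maximum possible beyond chance.
kappa = (observed - expected) / (1 - expected)
print(f"Cohen's kappa = {kappa:.3f}")
```

Here the raters agree on 17 of 20 essays (85%), but because both mostly award "P", chance agreement is already 56%, so kappa (≈ 0.66) is considerably lower than raw percent agreement.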

  16. When Is a Reliability Estimate Satisfactory?
  • Depends on the type of instrument
  • Depends on the purpose of the study
  • Depends on who is affected by the results

  17. Factors Affecting Reliability Estimates
  • Test length
  • Range of scores
  • Item similarity

  18. Standard Error of Measurement
  • All test scores contain some error
  • For any test, the higher the reliability estimate, the lower the error
  • The standard error of measurement is the standard deviation of the error component of scores: SEM = SD × √(1 − reliability)
  • Can be used to estimate a range within which a true score would likely fall
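The SEM formula and the true-score range it implies can be sketched in a few lines. The SD, reliability, and observed score below are hypothetical, and the bands assume normally distributed error:

```python
import math

# Hypothetical test statistics.
sd = 10.0          # standard deviation of observed scores
reliability = 0.91 # reliability estimate for the test

# Standard error of measurement: SEM = SD * sqrt(1 - reliability).
sem = sd * math.sqrt(1 - reliability)
print(f"SEM = {sem:.1f}")  # 3.0 score points

# Bands for the true score around a hypothetical observed score of 75,
# using the normal curve: ~68% within 1 SEM, ~95% within 2 SEM.
score = 75
print(f"68% band: {score - sem:.1f} to {score + sem:.1f}")
print(f"95% band: {score - 2*sem:.1f} to {score + 2*sem:.1f}")
```

Note how raising the reliability shrinks the SEM: at reliability 0.99 the same test would have an SEM of only 1 point.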

  19. Use of the Standard Error of Measurement
  • We never know the true score
  • By knowing the s.e.m. and understanding the normal curve, we can assess the likelihood of the true score falling within certain limits
  • The higher the reliability, the lower the standard error of measurement, and hence the more confidence we can place in the accuracy of a person's test score

  20. Normal Curve: Areas Under the Curve
  [Figure: normal distribution of test scores (X) marked at −3, −2, −1, +1, +2, and +3 s.e.m.; areas of .3413 on each side within ±1 s.e.m. (≈68%), .1359 in each band between 1 and 2 s.e.m. (≈95% within ±2), .0214 between 2 and 3 s.e.m., and .0013 in each tail (≈99% within ±3)]

  21. Warnings about Reliability
  • There is no such thing as "the" reliability; different methods assess consistency from different perspectives
  • Reliability coefficients apply to the data, NOT to the instrument
  • Any reliability coefficient is only an estimate of consistency
