
Reliability - The extent to which a test or instrument gives consistent measurement


Presentation Transcript


  1. Reliability - The extent to which a test or instrument gives consistent measurement - The strength of the relation between observed scores and true scores.
  • Test-retest reliability (coefficient of stability) - Correlate two administrations of the same test.
  • Parallel-form reliability (coefficient of equivalence) - Correlate two forms of the same test.
  • Split-half reliability (Spearman-Brown prophecy formula) - Correlate two halves of the test.
  • Internal-consistency reliability (Cronbach α) - Correlate every item with every other item.
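A minimal sketch of two of these estimates, assuming a small, purely illustrative matrix of item scores (rows are examinees, columns are items); the data, function names, and Python/NumPy choice are assumptions, not part of the original slides.

    import numpy as np

    # Hypothetical item-score matrix: 6 examinees x 4 items (illustrative data only).
    scores = np.array([
        [4, 5, 4, 5],
        [2, 3, 2, 2],
        [5, 5, 4, 4],
        [1, 2, 1, 2],
        [3, 3, 4, 3],
        [4, 4, 5, 5],
    ], dtype=float)

    def split_half_reliability(x):
        """Correlate odd-item and even-item half scores, then step the result
        up to full test length with the Spearman-Brown prophecy formula."""
        odd = x[:, 0::2].sum(axis=1)
        even = x[:, 1::2].sum(axis=1)
        r_half = np.corrcoef(odd, even)[0, 1]
        return 2 * r_half / (1 + r_half)

    def cronbach_alpha(x):
        """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
        k = x.shape[1]
        item_var = x.var(axis=0, ddof=1).sum()
        total_var = x.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_var / total_var)

    print("Split-half (Spearman-Brown):", round(split_half_reliability(scores), 3))
    print("Cronbach's alpha:", round(cronbach_alpha(scores), 3))

Test-retest and parallel-form estimates follow the same pattern: correlate two vectors of total scores (two administrations, or two forms) with np.corrcoef.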

  2. [Figure: distributions of two observed scores, X1 and X2, around the true score T] Reliability is the extent to which your observed score represents your true score; the error of measurement is E = X - T. The test yielding the score X1 is more reliable than the one yielding X2.

  3. [Figure: scatter of observed scores X against true scores T] Reliability is the extent to which individual differences, or the rank ordering of individuals, based on the observed scores represent those based on the true scores. One operationalization of this definition is the correlation between observed scores and true scores, rXT, which is called the reliability index. Another is the squared correlation between observed and true scores, rXT², i.e., the proportion of observed-score variance that is true-score variance, or the proportion of consistent rank ordering.

  4. [Figure: two tests, X and X', linked by the four estimation approaches: test-retest, parallel form, split half, internal consistency] In reality, since true scores cannot be observed, reliability is the extent to which two such tests yield similar results, or a similar rank ordering of the individuals: the correlation between them, rXX', which under classical test theory equals rXT².
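A small simulation sketch of this equivalence under classical test theory assumptions; the sample size and the true-score and error variances below are arbitrary illustrations, not values from the slides.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000                      # examinees (large, so the estimates are stable)
    true_var, error_var = 4.0, 1.0   # illustrative variances; reliability = 4 / (4 + 1) = 0.8

    T = rng.normal(50, np.sqrt(true_var), n)         # true scores
    X1 = T + rng.normal(0, np.sqrt(error_var), n)    # observed scores, form 1
    X2 = T + rng.normal(0, np.sqrt(error_var), n)    # observed scores, parallel form

    r_XT = np.corrcoef(X1, T)[0, 1]    # reliability index (observed-true correlation)
    r_XX = np.corrcoef(X1, X2)[0, 1]   # parallel-form correlation

    print("theoretical reliability:", true_var / (true_var + error_var))
    print("r_XT squared           :", r_XT ** 2)
    print("r_XX' (parallel forms) :", r_XX)

Up to sampling error all three values agree (about 0.8 here), which is why the observable correlation between two parallel tests, rXX', can stand in for the unobservable rXT².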

  5. When ρxx' = 1,
  • the measurement has been made without error (E = 0 for all examinees).
  • X = T for all examinees.
  • all observed-score variance reflects true-score variance.
  • all differences between observed scores are true-score differences.
  • the correlation between observed scores and true scores is 1.
  • the correlation between observed scores and errors is zero.

  6. When ρxx' = 0,
  • only random error is included in the measurement.
  • X = E for all examinees.
  • all observed-score variance reflects error variance.
  • all differences between observed scores are errors of measurement.
  • the correlation between observed scores and true scores is 0.
  • the correlation between observed scores and errors is 1.

  7. When ρxx' is between zero and 1,
  • the measurement includes some error and some truth.
  • X = T + E.
  • observed-score variance includes true-score variance and error variance.
  • differences between observed scores reflect both true-score differences and error.
  • the correlation between observed scores and true scores is the square root of the reliability (the reliability index).
  • the correlation between observed scores and errors is the square root of 1 - reliability.
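A quick numeric check of the last two bullets, using the same kind of simulated scores as above; the variances are again arbitrary (true-score variance 4, error variance 1, so reliability = 0.8).

    import numpy as np

    rng = np.random.default_rng(1)
    n = 100_000
    true_var, error_var = 4.0, 1.0

    T = rng.normal(0, np.sqrt(true_var), n)    # true scores
    E = rng.normal(0, np.sqrt(error_var), n)   # errors of measurement
    X = T + E                                  # observed scores

    reliability = true_var / (true_var + error_var)
    print("corr(X, T)           :", np.corrcoef(X, T)[0, 1])   # ~0.894
    print("sqrt(reliability)    :", np.sqrt(reliability))      # 0.894
    print("corr(X, E)           :", np.corrcoef(X, E)[0, 1])   # ~0.447
    print("sqrt(1 - reliability):", np.sqrt(1 - reliability))  # 0.447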

  8. Validity - The extent to which a test or instrument truly measures what it is expected to measure. Using a bathroom scale to measure weight is valid, whereas using a bathroom scale to measure height is invalid.
  • Content validity refers to the extent to which the items on a test are representative of a specified content domain. Achievement and aptitude (but not personality and attitude) tests are chiefly concerned with content validity.
  • Construct validity refers to the extent to which the items on a test are representative of the underlying construct, e.g., a personality trait or attitude. Personality and attitude tests are chiefly concerned with construct validity. The process of establishing construct validity is referred to as construct validation.
  • Criterion-related validity refers to the extent to which a test correlates with a criterion it is intended to predict or reflect; it includes predictive validity (criterion measured later) and concurrent validity (criterion measured at about the same time).

  9. Construct Validity: Internal Structure
  [Figure: factor model with standardized loadings, relating parenting subscales - Warmth, Inductive Reasoning, Easygoing Responsiveness, Democratic Participation, Physical Punishment, Non-Reasoning, Authoritarian Directiveness, Verbal Hostility - to higher-order Authoritative Parenting and Authoritarian Parenting factors.]

  10. Construct Validity: Network Relations
  [Figure: path model with standardized coefficients relating Social Withdrawal, Communication Avoidance, Assertive Leadership, Behavioral Aggression, and Verbal Aggression to Perceived Social Competence (Times 1 and 2) and Peer Acceptance (Times 1 and 2, single indicators).]

  11. Criterion-Related Validity
  [Figure: two examples. Concurrent validity: SAT scores correlated with A-Level results obtained at about the same time. Predictive validity: A-Level results correlated with later University GPA.]

  12. Restriction of Range Effect
  [Figure: scatterplot of criterion scores against test scores. A qualifying score divides rejected from selected examinees; the distribution of criterion scores for the selected group is narrower than the distribution of scores on the criterion if no examinees were excluded.]
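A small simulation sketch of this effect on the validity coefficient from the previous slide; the full-pool correlation, the cut score, and the sample size are assumed for illustration only.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 100_000
    rho = 0.6                                  # assumed test-criterion correlation in the full pool

    test = rng.normal(0, 1, n)                 # test scores (e.g., an admissions test)
    noise = rng.normal(0, 1, n)
    criterion = rho * test + np.sqrt(1 - rho ** 2) * noise   # criterion (e.g., later GPA)

    qualifying_score = 0.5                     # illustrative cut score
    selected = test >= qualifying_score

    r_full = np.corrcoef(test, criterion)[0, 1]
    r_selected = np.corrcoef(test[selected], criterion[selected])[0, 1]

    print("validity coefficient, full pool     :", round(r_full, 3))      # ~0.60
    print("validity coefficient, selected only :", round(r_selected, 3))  # noticeably smaller

The underlying test-criterion relation is unchanged; only the range of test scores entering the correlation has been narrowed, which attenuates the observed validity coefficient.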
