RELIABILITY

donald


    1. RELIABILITY A measure is reliable if it provides CONSISTENT SCORES when measuring a phenomenon. Reliable measures have little random error. Unreliability: 1. You get different measurement results when the thing measured has not changed. 2. Answers to questions forming an index are poorly related. 3. Similar versions of a measure garner very different answers. 4. Ratings by two or more trained observers are poorly correlated. Valid measures will ALWAYS be reliable…but reliable measures are not necessarily valid!!

    2. TEST-RETEST RELIABILITY The same measure yields identical (or similar) results when applied at 2 different times.
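
One common way to check test-retest reliability is to correlate the scores from the two administrations. The sketch below is only an illustration: the scores are hypothetical, and the helper name `pearson_r` is an assumption made for this example, not something from the slides.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical index scores for five respondents at two points in time.
time_1 = [10, 12, 15, 18, 20]
time_2 = [11, 12, 14, 19, 20]

test_retest_r = pearson_r(time_1, time_2)
print(round(test_retest_r, 3))
```

A correlation close to 1 suggests the measure gives consistent scores over time. The same function could be applied to ratings from two trained observers as a rough check of interobserver reliability.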

    3. INTER-ITEM RELIABILITY Multiple items measuring a concept (e.g., an index) are strongly correlated.

    4. ALTERNATE FORMS RELIABILITY Slightly different versions of the same index are strongly correlated. Split-halves reliability…a variant of alternate forms reliability.

    5. INTEROBSERVER RELIABILITY Measures or observations from two or more trained observers are correlated.

    6. CRONBACH’S ALPHA: A STATISTICAL MEASURE OF RELIABILITY Cronbach’s alpha normally varies from 0 to 1. A higher alpha value means higher reliability. Simple interpretation…alpha rises with the average correlation between the questions in an index. A second interpretation…the correlation between your index and all other possible indices measuring the concept with the same number of questions.

    7. MORE ON CRONBACH’S ALPHA Value of alpha determined by: 1. Number of items in index; 2. Correlations among items. Index with a large number of items can have a good alpha even if correlations are modest. Small index can have a good alpha only if correlations between items are high. SPSS RELIABILITIES.
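
The two determinants of alpha can be made concrete with a small calculation. This is a minimal sketch, assuming the usual variance-based formula for alpha and the standardized formula based on the average inter-item correlation; the data and the function names (`cronbach_alpha`, `standardized_alpha`) are hypothetical and chosen only for illustration. It is not a substitute for the SPSS procedure mentioned on the slide.

```python
def variance(values):
    """Sample variance (n - 1 in the denominator)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def cronbach_alpha(rows):
    """rows: one list of item answers per respondent."""
    k = len(rows[0])                      # number of items in the index
    items = list(zip(*rows))              # regroup the answers item by item
    item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_variances / total_variance)


def standardized_alpha(k, mean_r):
    """Alpha implied by k items whose average inter-item correlation is mean_r."""
    return (k * mean_r) / (1 + (k - 1) * mean_r)


# Hypothetical answers: 5 respondents x 3 items, each scored 1 to 5.
answers = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
]
alpha = cronbach_alpha(answers)
print(round(alpha, 2))

# Same modest average correlation (0.3), different index lengths.
print(round(standardized_alpha(3, 0.3), 2))   # short index
print(round(standardized_alpha(12, 0.3), 2))  # long index
```

With an average correlation of 0.3, the 3-item index gives a standardized alpha of about 0.56 while the 12-item index gives about 0.84, which illustrates why a long index can reach a good alpha even when correlations are modest.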

    8. INDEXES AND SCALES In principle, all “real” social phenomena can be empirically measured…it is only a question of how well we measure them. Complex phenomena must be measured using multiple questions…these items are combined to form indexes.

    9. MORE ON INDEXES AND SCALES... IT IS POSSIBLE TO MODIFY EXISTING INDEXES. MEASUREMENT IS CONSTANTLY BEING IMPROVED. PROVIDE ORDINAL-INTERVAL LEVEL MEASUREMENT; MORE INFORMATION ABOUT VARIABLES; RELIABILITY & VALIDITY TESTING. MORE RELIABILITY/VALIDITY THAN SINGLE INDICATORS; MORE STATISTICAL ANALYSES.

    10. SCALES AND INDEXES: ARE THEY DIFFERENT? Scales in Psychology; standardized; have established reliability & validity. Indexes in Sociology; linked to specific research goals; variable reliability & validity.

    11. Combine related items into an index score. Often, score is the sum of the answers. Measure variables at ordinal/interval/ratio level.

    12. Mutually Exclusive means a respondent fits into one and only one option. Exhaustive means every respondent can fit into at least one option. Q. Your religious affiliation is: 1. PROTESTANT 2. CATHOLIC 3. NONE 4. OTHER, specify___________

    13. Unidimensional...the index measures a single concept. Factor analysis is useful for testing dimensionality….more on FA later.

    14. INDEX CONSTRUCTION: The questionnaire items are combined into an overall measure of the concept…the index score (e.g., total correct answers to the multiple choice questions on Exam #1 provide a kind of index measuring your mastery of the assigned readings from the lectures and labs). Indexes can be created to measure all kinds of complex social concepts (e.g., Quality of Life, Fear of Crime, Political views, Attitudes, etc.).

    15. TELEPHONE SURVEY INTRODUCTIONS All questionnaire items in an index must have “face” validity. Each aspect of the concept should be measured with at least one question. An “unweighted” index gives each item equal weight. A “weighted” index assigns different weights to the items. Weighted indexes can be more precise.
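
The difference between an unweighted and a weighted index can be sketched in a few lines. The answers, the weights, and the function names below are hypothetical; they only illustrate the arithmetic.

```python
def unweighted_index(answers):
    """Every item counts equally: the index score is the plain sum."""
    return sum(answers)


def weighted_index(answers, weights):
    """Each answer is multiplied by its item weight before summing."""
    if len(answers) != len(weights):
        raise ValueError("one weight per item is required")
    return sum(a * w for a, w in zip(answers, weights))


# Hypothetical respondent: answers to three related items, each scored 1 to 5.
answers = [4, 2, 5]
# Hypothetical weights; in practice they might come from a factor analysis.
weights = [0.5, 0.2, 0.3]

print(unweighted_index(answers))
print(weighted_index(answers, weights))
```

In the unweighted version each item contributes its raw score; in the weighted version the first item counts most because it carries the largest weight.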

    16. MISSING DATA: IS GENERATED WHEN A RESPONDENT FAILS OR REFUSES TO ANSWER A QUESTION. CAN BE A PROBLEM WHEN FORMING AN INDEX….UNDERMINES RELIABILITY & VALIDITY. TWO WAYS TO DEAL WITH MISSING DATA: 1. ELIMINATE CASES WITH MISSING INFORMATION. 2. ASSIGN “AVERAGE” INDEX SCORE TO CASES WITH MISSING DATA.
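
The two options can be sketched as follows. This is an illustration under stated assumptions: the respondents are hypothetical, `None` is used here to mark an unanswered question, and "average" is read as the mean index score of the complete cases.

```python
# Hypothetical respondents; None marks a question the respondent did not answer.
responses = {
    "r1": [4, 5, 3],
    "r2": [2, None, 3],
    "r3": [5, 4, 4],
    "r4": [None, 1, 2],
}


def drop_incomplete(data):
    """Option 1: keep only cases that answered every item, then sum."""
    return {rid: sum(ans) for rid, ans in data.items() if None not in ans}


def assign_average_score(data):
    """Option 2: give incomplete cases the mean index score of complete cases."""
    complete = drop_incomplete(data)
    average = sum(complete.values()) / len(complete)
    return {rid: complete.get(rid, average) for rid in data}


print(drop_incomplete(responses))
print(assign_average_score(responses))
```

Option 1 shrinks the sample from four cases to two, while option 2 keeps all four cases but gives the incomplete ones an identical score, which reduces the variation in the index.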

    17. SCALES Developed to measure how an individual feels or thinks about something. Help to operationalize a single concept (e.g., liberalism/conservatism). Like indexes, scales provide a quantitative measure that can be used to test hypotheses.

    18. THE LOGIC OF SCALING: Based on idea of reliably and validly measuring the strength of a variable. Consider the example of a “graphic” scale: Q. On a scale from 1 to 7, with 1 indicating “very cold” and 7 indicating “very warm”, what is your perception of Prime Minister Stephen Harper? 1 (Very Cold) ----- 2 ----- 3 ----- 4 ----- 5 ----- 6 ----- 7 (Very Warm)

    19. LIKERT SCALES: Widely used in surveys to measure attitudes & opinions. Check handout for examples. Minimum of 3; maximum of 7 answer categories….4 to 5 answer categories is optimal for most Likert scale questions. Make sure answer categories are balanced. If possible, avoid a “neutral” answer category. Related Likert scale answers often summed to form an excellent index.

    20. BOGARDUS SOCIAL DISTANCE SCALE Designed to measure perceived “social distance” of social groups from each other (e.g., ethnic, age, religious groups, etc.). Measures how much social distance one group feels toward a “target” or “outgroup”. Logic of scale assumes a respondent who is uncomfortable with “socially distant” questions or scenarios will be even more uncomfortable with “socially close” questions or scenarios. Check the handout….

    21. MORE…. People from Group X: enter your country/province/city; work at your place of employment; live in your neighbourhood; marry your sibling, etc. Respondents are asked how comfortable or acceptable the statements or scenarios are. Usually 5 to 9 statements or scenarios.

    22. TWO LIMITATIONS OF SOCIAL DISTANCE SCALES. Must tailor the scale to a specific target/ outgroup and social setting. Awkward when analyzing 3+ groups.

    23. SEMANTIC DIFFERENTIAL SCALES. Developed to measure how a respondent feels about a concept, object, or person. Measures feelings or evaluations toward something with adjectives having polar opposite meanings (e.g., light/dark; hard/soft; smart/stupid; good/bad; cold/hot; fast/slow; etc.).

    24. MORE ON SEMANTIC DIFFERENTIAL SCALES. Researcher poses a question or statement and provides a list of paired opposite adjectives separated by a continuum of 7 to 11 points. Respondents mark the spot on the continuum that expresses their feelings or evaluation. Diverse & well-mixed adjectives. Three major classes of meaning: evaluation (good-bad), potency (strong-weak), and activity (active-passive).

    25. GUTTMAN SCALES Begin with a set of indicators (survey questions). Usually have dichotomous response options (e.g., YES / NO). Scales include from 3 to 20 indicators. Researcher selects items on the assumption that a logical and hierarchical relationship exists between them. In a Guttman scale, most respondents will state YES to “lower-order” indicators, while a progressively smaller number will state YES to “higher-order” indicators. Example: Have you heard about Jean Chrétien? __ YES __ NO Have you heard about Pierre Trudeau? __ YES __ NO Have you heard about John A. Macdonald? __ YES __ NO
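
The hierarchical logic can be checked with a short script. This is a rough sketch, not a standard scaling routine: the respondents are hypothetical, items are assumed to be ordered from lower-order to higher-order, and `is_scale_pattern` is a name made up for this example.

```python
def is_scale_pattern(answers):
    """True if no YES follows a NO when items run from lower to higher order."""
    seen_no = False
    for said_yes in answers:
        if not said_yes:
            seen_no = True
        elif seen_no:
            return False
    return True


# Hypothetical respondents; True = YES, items ordered lower-order to higher-order.
respondents = [
    [True, True, True],
    [True, True, False],
    [True, True, False],
    [True, False, False],
    [False, False, False],
    [True, False, True],   # breaks the hierarchy: a YES after a NO
]

consistent = [is_scale_pattern(r) for r in respondents]
share_consistent = sum(consistent) / len(respondents)

# Number of YES answers per item; it should shrink toward higher-order items.
yes_per_item = [sum(column) for column in zip(*respondents)]
print(consistent, round(share_consistent, 2), yes_per_item)
```

Here five of the six respondents follow the expected pattern, and the YES counts fall from 5 to 3 to 2 across the items, which is what the scale's hierarchical assumption predicts.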

    26. FACTOR ANALYSIS SCALING Statistical technique for testing dimensionality of an index and creating weighted indexes. Logic rests on the idea we can statistically assess correlations between indicators in a way that uncovers concepts the indicators measure. Alternatively, the underlying concept(s) discovered by FA “explain” correlations between several indicators. Begins by factor analyzing items the researcher thinks are related. FA confirms whether the items are measuring a concept(s) and produces factor scores and a weighted factor index.

    27. Overview of Focus Groups Groups are common (e.g., brainstorming, planning, learning, partying). Can be enjoyable & useful; or boring, unproductive & time-wasting. Why groups can fail: (1) Leaders have an unclear purpose. (2) Leaders have inadequate process skills. FGs are a special kind of group…and a popular social research method.

    29. Focus Groups: A Short History In the 1930s researchers developed nondirective interviewing techniques. Shift from interviewer to respondent. Open-ended. Allow respondents opportunity to comment, explain, share experiences, ideas, attitudes. People share information when in a safe environment with people similar to themselves. Market researchers embraced focus groups in the 1950s; academics rediscovered focus groups in the 1980s.

    30. Why do focus groups work? The ability (and willingness) for self-disclosure varies widely in the population. Children have a natural tendency for self-disclosure. Socialization teaches us the value of dissemblance (concealing thoughts). We learn to display an expurgated (acceptable to others), “public self”….the self we want others to see & believe! People are more likely to self-disclose (reveal their “private self”) when they feel comfortable, & when the environment is permissive & nonjudgemental!

    31. People are more likely to self-disclose in groups composed of people who are similar. Vital to select participants who have something in common & tell them the thing in common! Moderators encourage a wide range of comments, avoid making judgments, and avoid body language that communicates approval or disapproval. Locations comfortable for participants. Often called “small group discussions” to appear less intimidating & mysterious.

    32. Components of Focus Groups: Usually consist of 5-10 people, can range from 4-12. Small enough for participation; large enough for diversity in views. People who possess certain characteristics: Participants are similar to each other in a way important to research goals. Similarity is basis for recruitment; participants informed of commonality at beginning of the discussion. Who can give you the type of information you need?

    33. Components of Focus Groups: The traditional ideal in focus groups was to use participants who did not know each other. Be cautious with focus groups where people know each other…it could inhibit disclosure. Moderators should appear neutral to respondents (no identification with public issues, agendas, or social organizations).

    34. Components of Focus Groups: Focus groups mainly provide qualitative data: Goal is to find a range of opinions & ideas. Compare & contrast data from 3+ focus groups. Inductive goals (understanding based on discussion) more than deductive goals (testing hypotheses).

    35. Focus groups have a focussed discussion: Questions are carefully planned, phrased, and sequenced so they are easy to understand and logical (Question Route). Most questions open-ended. Question Route moves from general to specific. Early questions help people to get talking & thinking. Later questions provide most useful information. Emphasis is NOT on group consensus, but understanding feelings, comments, and thoughts.

    36. Applications A versatile method, focus groups are used in: academic, market & social research, evaluations, and community-based research. Focus groups have a distinctive cluster of characteristics: (1) homogeneous people in social interaction, (2) to collect data from focussed discussion, (3) in an inductive, naturalistic way.
