
Validity and reliability




Presentation Transcript


  1. Validity and reliability In Research

  2. Agenda At the end of this lesson, you should be able to: 1. Discuss validity 2. Discuss reliability 3. Discuss validity in qualitative research 4. Discuss validity in experimental design 5. Discuss how to achieve validity and reliability

  3. Reliability • The consistency of scores or answers from one administration of an instrument to another, or from one set of items to another. • A reliable instrument yields similar results if given to a similar population at different times.

  4. Validity • Appropriateness, meaningfulness, correctness, and usefulness of the inferences a researcher makes. • Validity of what? • The instrument? • The data?

  5. Validity Internal validity is the extent to which research findings are free from bias and the effects of extraneous variables. External validity is the extent to which the findings can be generalised.

  6. Validity - Content-related evidence • Content-related evidence of validity focuses on the content and format of an instrument. • Is it appropriate? • Comprehensive? • Is it logical? • How do the items or questions represent the content? Is the format appropriate?

  7. Validity - Criterion-related evidence This refers to the relationship between the scores obtained using the instrument and the scores obtained using one or more other instruments or measures. For example, are students’ scores on teacher made tests consistent with their scores on standardized tests in the same subject areas?

  8. Validity - Construct-related evidence Construct validity is defined as “establishing correct operational measures for the concepts being studied” (Yin, 1984). For example, if one is looking at problem solving in leaders, how well does a particular instrument explain the relationship between being able to solve problems and effectiveness as a leader?

  9. Attaining validity and reliability

  10. Elements of content-related evidence • Adequacy: the size and scope of the questions must be large enough to cover the topic. • Format of the instrument: clarity of printing, type size, adequacy of work area, appropriateness of language, clarity of directions, etc.

  11. How to achieve content validity • Consult other experts who rate the items. • Rate items, eliminating or changing those that do not meet the specified content. • Repeat until all raters agree on the questions and answers.

  12. Criterion-related validity To obtain criterion-related validity, researchers identify a characteristic, assess it using one instrument (e.g., IQ test) and compare the score with performance on an external measure, such as GPA or an achievement test.

  13. Validity coefficient • A validity coefficient is obtained by correlating a set of scores on one test (a predictor) with a set of scores on another (the criterion). • The degree to which the predictor and the criterion relate is the validity coefficient. A predictor that has a strong relationship to a criterion test would have a high coefficient.
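The correlation described above is an ordinary Pearson correlation between the two sets of scores. A minimal sketch in Python, with invented scores (the function name and all data are illustrative, not from the slides):

```python
# Hypothetical sketch: a validity coefficient computed as the Pearson
# correlation between predictor scores (e.g., a teacher-made test) and
# criterion scores (e.g., GPA). All numbers below are made up.
from math import sqrt

def pearson_r(predictor, criterion):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(predictor)
    mx = sum(predictor) / n
    my = sum(criterion) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(predictor, criterion))
    sx = sqrt(sum((x - mx) ** 2 for x in predictor))
    sy = sqrt(sum((y - my) ** 2 for y in criterion))
    return cov / (sx * sy)

# Invented scores for ten students: predictor = test score, criterion = GPA.
test_scores = [52, 61, 70, 74, 80, 83, 88, 90, 94, 98]
gpas        = [2.1, 2.4, 2.9, 2.8, 3.1, 3.0, 3.4, 3.5, 3.7, 3.9]
print(pearson_r(test_scores, gpas))  # a value near 1 = strong validity evidence
```

A coefficient near 1 would support criterion-related validity; a coefficient near 0 would suggest the predictor tells us little about the criterion.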

  14. Construct-related validity • This type of validity is more typically associated with research studies than testing. • It relates to psychological traits, so multiple sources are used to collect evidence. Often a combination of observation, surveys, focus groups, and other measures is used to identify how much of the trait being measured (e.g., proactive coping skills) the person observed possesses.

  15. Reliability The consistency of scores obtained from one instrument to another, or from the same instrument over different groups.

  16. Errors of measurement • Every test or instrument has errors of measurement associated with it. • These can be due to a number of things: testing conditions, student health or motivation, test anxiety, etc. • Test developers work hard to ensure that these errors are not grounded in flaws with the test itself.

  17. Reliability Methods • Test-retest: Same test to same group • Equivalent-forms: A different form of the same instrument is given to the same group of individuals • Internal consistency: Split-half procedure • Kuder-Richardson: Mathematically computes reliability from the # of items, the mean, and the standard deviation of the test.
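The Kuder-Richardson estimate described above (the KR-21 variant, which needs only the number of items, the mean, and the variance of total scores) can be sketched as follows; the formula is standard, but the data are invented:

```python
# Sketch of Kuder-Richardson formula 21 (KR-21): a reliability estimate
# computed from the number of items, the mean, and the variance of total
# scores, as the slide describes. The score data below are invented.

def kr21(k, mean, variance):
    """KR-21 reliability estimate for a test of k right/wrong items."""
    return (k / (k - 1)) * (1 - (mean * (k - mean)) / (k * variance))

# Each student's number-correct on a hypothetical 3-item test.
total_scores = [3, 2, 1, 0]
k = 3
m = sum(total_scores) / len(total_scores)                           # 1.5
var = sum((s - m) ** 2 for s in total_scores) / len(total_scores)   # 1.25
print(kr21(k, m, var))  # reliability estimate between 0 and 1
```

For these numbers the estimate works out to 0.6, a modest reliability; real instruments are usually expected to score considerably higher.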

  18. Reliability coefficient • Reliability coefficient - a number that tells us how likely one instrument is to be consistent over repeated administrations • Alpha or Cronbach’s alpha • used on instruments where answers aren’t scored “right” and “wrong”. It is often used to test the reliability of survey instruments.
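Cronbach's alpha itself is simple to compute: k/(k-1) times one minus the ratio of summed item variances to total-score variance. A sketch with made-up Likert-style responses (the data and function name are illustrative):

```python
# Illustrative sketch (invented data): Cronbach's alpha computed from a
# respondents-by-items score matrix, e.g. Likert survey answers.

def cronbach_alpha(item_scores):
    """item_scores: list of rows, one row per respondent, one column per item."""
    k = len(item_scores[0])

    def pvar(values):  # population variance
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values) / len(values)

    item_vars = [pvar([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvar([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Four respondents answering two Likert items that track each other exactly.
answers = [[1, 2], [2, 3], [3, 4], [4, 5]]
print(cronbach_alpha(answers))  # perfectly consistent items give alpha = 1.0
```

Alpha near 1 means the items hang together as measures of one underlying construct; survey researchers commonly treat values around 0.7 or above as acceptable.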

  19. Standard error of the measurement This is a calculation that shows the extent to which a measurement would vary under changed circumstances. In other words, it tells you how much of the error is due to issues related to measuring.
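One common formula for this quantity, SEM = SD × √(1 − reliability), can be illustrated with invented numbers:

```python
# Sketch of the standard error of measurement: SEM = SD * sqrt(1 - reliability).
# The SD and reliability values below are invented for illustration.
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    return sd * sqrt(1 - reliability)

# A test with SD = 10 and reliability 0.91 has SEM = 3.0, so an observed
# score of 75 suggests a "true" score roughly in the band 72-78 (+/- 1 SEM).
print(standard_error_of_measurement(10, 0.91))
```

Note how the two concepts connect: the more reliable the instrument, the smaller the SEM, and the tighter the band around any observed score.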

  20. Internal Validity

  21. Validity • Validity can be used in three ways: • instrument or measurement validity • external or generalization validity • internal validity, which means that what a researcher observes between two variables should be clear in its meaning, rather than due to something unclear (“something else”)

  22. What is “something else”? • Any one (or more) of these conditions: • Age or ability of subjects • Conditions under which the study was conducted • Type of materials used in the study • Technically, the “something else” is called a threat to internal validity.

  23. Threats to internal validity • Subject characteristics • Loss of subjects • Location • Instrumentation • Testing • History • Maturation • Attitude of subjects • Implementation

  24. Subject characteristics • Subject characteristics can pose a threat if there is selection bias, or if there are unintended factors present within or among groups selected for a study. For example, in group studies, members may differ on the basis of age, gender, ability, socioeconomic background, etc. They must be controlled for in order to ensure that the key variables in the study, not these, explain differences.

  25. Subject characteristics • Age • Intelligence • Strength • Vocabulary • Maturity • Reading ability • Gender • Fluency • Ethnicity • Manual dexterity • Coordination • Socioeconomic status • Speed • Religious/political belief

  26. Loss of subjects (mortality) • Loss of subjects limits generalizability, but it can also affect internal validity if the subjects who don’t respond or participate are overrepresented in one group.

  27. Location • The place where data collection occurs (“location”) might pose a threat. For example, hot, noisy, unpleasant conditions might affect test scores; situations where privacy is important for the results, but where people are streaming in and out of the room, might also pose a threat.

  28. Instrumentation • Decay: If the nature of the instrument or the scoring procedure is changed in some way, instrument decay occurs. • Data Collector Characteristics: The person collecting data can affect the outcome. • Data Collector Bias: The data collector might hold an opinion that is at odds with respondents and it affects the administration.

  29. Testing • In longitudinal studies, data are often collected through more than one administration of a test. • If the previous test influences subsequent ones by getting the subject to engage in learning or some other behavior that he or she might not otherwise have done, there is a testing threat.

  30. History • If an unanticipated or unplanned event occurs prior to a study or intervention, there might be a history threat.

  31. Attitude of subjects • Sometimes the very fact of being studied influences subjects. The best known example of this is the Hawthorne Effect.

  32. Implementation • This threat can be caused by various things; different data collectors, teachers, conditions in treatment, method bias, etc.

  33. Minimizing Threats • Standardize conditions of study • Obtain more information on subjects • Obtain as much information on details of the study: location, history, instrumentation, subject attitude, implementation • Choose an appropriate design • Train data collectors

  34. Qualitative Research Validity and reliability??

  35. Qualitative research • Many qualitative researchers contend that validity and reliability are irrelevant to their work because they study one phenomenon and don’t seek to generalize. • Fraenkel and Wallen argue that any instrument or design used to collect data should be credible and backed by evidence, consistent with quantitative studies. • Trustworthiness

  36. Quantitative vs. Qualitative

  37. In qualitative research Reliability pertains to the extent to which the study is replicable and to how accurate the research methods and techniques used to produce the data are. Objectivity of the researcher - the researcher must examine her bias and preconceived notions of what she will find before she begins her research. Objectivity of the interviewee

  38. In qualitative research • Triangulation • Member check • Audit trail

  39. Let’s look at one particular design Validity in experimental research

  40. Experimental Designs Should be Developed to Ensure Internal and External Validity of the Study

  41. Internal Validity: • Are the results of the study (the dependent variable) caused by the factors included in the study (the independent variables), or are they caused by other, extraneous factors which were not part of the study?

  42. Threats to Internal Validity Subject Characteristics (Selection Bias/Differential Selection) -- The groups may have been different from the start. If you were testing instructional strategies to improve reading and one group enjoyed reading more than the other group, they may improve more in their reading because they enjoy it, rather than because of the instructional strategy you used.

  43. Threats to Internal Validity Loss of Subjects (Mortality) -- All of the high- or low-scoring subjects may have dropped out or were missing from one of the groups. If we collected posttest data on a day when the debate society was on a field trip, the mean for the treatment group would probably be much lower than it really should have been.

  44. Threats to Internal Validity Location Perhaps one group was at a disadvantage because of their location.  The city may have been demolishing a building next to one of the schools in our study and there are constant distractions which interfere with our treatment.

  45. Threats to Internal Validity Instrumentation: Instrument Decay -- The testing instruments may not be scored similarly. Perhaps the person grading the posttest is fatigued and pays less attention to the last set of papers reviewed. It may be that those papers are from one of our groups and will receive different scores than the earlier group's papers.

  46. Threats to Internal Validity Data Collector Characteristics -- The subjects of one group may react differently to the data collector than the other group. A male interviewing males and females about their attitudes toward a type of math instruction may not receive the same responses from females as a female interviewing females would.

  47. Threats to Internal Validity Data Collector Bias -- The person collecting data may favor one group, or some characteristic some subjects possess, over another. A principal who favors strict classroom management may rate students' attention under different teaching conditions with a bias toward one of the teaching conditions.

  48. Threats to Internal Validity Testing -- The act of taking a pretest or posttest may influence the results of the experiment. Suppose we were conducting a unit to increase student sensitivity to racial prejudice. As a pretest we have the control and treatment groups watch a movie on racism and write a reaction essay. The pretest may have actually increased both groups' sensitivity, and we find that our treatment group didn't score any higher on a posttest given later than the control group did. If we hadn't given the pretest, we might have seen differences between the groups at the end of the study.

  49. Threats to Internal Validity History Something may happen at one site during our study that influences the results. Perhaps a classmate was injured in a car accident at the control site for a study teaching children bike safety. The control group may actually demonstrate more concern about bike safety than the treatment group.

  50. Threats to Internal Validity Maturation -- There may be natural changes in the subjects that can account for the changes found in a study. A critical thinking unit may appear more effective if it is taught during a time when children are developing abstract reasoning.
