Fairness in Testing: Introduction

Fairness in Testing: Introduction Suzanne Lane University of Pittsburgh Member, Management Committee for the JC on Revision of the 1999 Testing Standards

Organization of 1999 Standards • Part I: Foundational Chapters • Part II: Fairness in Testing • Chapter 7: Fairness in Testing and Test Use • Chapter 8: The Rights and Responsibilities of Test Takers • Chapter 9: Testing Individuals of Diverse Linguistic Backgrounds • Chapter 10: Testing Individuals with Disabilities • Part III: Testing Applications

Proposed Revision • Combine three of the chapters in Part II into a single chapter: Fairness in Testing • Chapter 7: Fairness in Testing and Test Use • Chapter 9: Testing Individuals of Diverse Linguistic Backgrounds • Chapter 10: Testing Individuals with Disabilities • Move combined chapter to Part I: Foundational Chapters

Why Reorganize the Chapters? • Fairness in testing cannot be separated from accessibility • Individuals should be able to understand and respond without performance being influenced by construct-irrelevant characteristics • All examinees for whom the test is intended should have an unobstructed opportunity to demonstrate their standing on the construct(s) being measured by the assessment

Accessibility is Essential for all Members of the Testing Population • Accessibility is a fundamental aspect of fairness and is the right of all members of the intended test taking population

Draft Fairness Chapter • Four sections: • Section I: General Views of Fairness • Section II: Threats to the Fair and Valid Interpretations of Test Scores • Section III: Minimizing Construct-Irrelevant Components Through the Use of Test Design and Testing Adaptations • Section IV: The Standards

Four Themes or Clusters • 1. Use test design, development, administration, and scoring procedures that minimize barriers to valid test interpretations for all individuals. • 2. Conduct studies to examine the validity of test score inferences for the intended examinee population. • 3. Provide appropriate accommodations to remove barriers to the accessibility of the construct measured by the assessment and to the valid interpretation of the assessment scores. • 4. Guard against inappropriate interpretations, use, and/or unintended consequences of test results for individuals or subgroups.

This Morning’s Round Table • Four members of the Joint Committee to Revise the 1999 Standards • Barbara Plake • Joan Herman • Linda Cook • Frank Worrell • Discussants • Martha Thurlow • Jamal Abedi

Fairness in Testing: Theme 1 Barbara S. Plake University of Nebraska-Lincoln Co-Chair, JC on Revision of the 1999 Testing Standards

Use test design, development, administration, and scoring procedures that minimize barriers to valid test interpretations for all individuals. • Test Design: use strategies to be as inclusive as possible for a wide range of individuals • Universal Design • Administration • Clearly delineate construct

Use test design, development, administration, and scoring procedures that minimize barriers to valid test interpretations for all individuals. • Test Design: linguistic and reading demands consistent with construct • Removes construct irrelevant variance • Enhances validity of score interpretation; clarifies interpretation of standing on intended construct • Even when language is part of construct, demand should be commensurate with needed levels for performance

Use test design, development, administration, and scoring procedures that minimize barriers to valid test interpretations for all individuals. • Test Development: remove construct irrelevant components for members of special groups • Differentially familiar words, symbols • Sensitivity reviews

Use test design, development, administration, and scoring procedures that minimize barriers to valid test interpretations for all individuals. • Test Development: evaluate appropriateness of materials/items/tasks for identifiable subgroups • Small sample methodology • Accumulate data over operational administrations • Follow-up with causal investigations and actions to diminish differential test performance

Use test design, development, administration, and scoring procedures that minimize barriers to valid test interpretations for all individuals. • Administration: Test takers receive comparable treatment during test administration and scoring • Adhere to standardized protocols in admin except where flexibility enhances valid score interpretations • Individualized administrations • Role of interpersonal dynamics

Use test design, development, administration, and scoring procedures that minimize barriers to valid test interpretations for all individuals. • Documentation: include aspects of the testing process that support valid score interpretations • Specify how construct-irrelevant variance was addressed in test design and development • Include results of technical studies to examine measurement quality for subgroups • Include studies of impact of accommodations and modifications on valid score interpretations

Fairness in Testing: Theme 2 Joan Herman CRESST/UCLA

THEME 2 • Conduct studies to examine the validity of test score inferences for the intended examinee population. • Where credible evidence indicates possibility of test bias • Where sample sizes constrain empirical evidence, use qualitative methods.

Conduct studies to examine the validity of test score inferences for the intended examinee population • The reliability and validity of score inferences for individuals from relevant subgroups should be specifically examined.

Conduct studies to examine the validity of test score inferences for the intended examinee population • When differential prediction is an issue, use regression equations computed separately for each group under consideration or an analysis in which group is entered as a moderator variable.
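The first option on this slide, separate regression equations per group, can be sketched as follows. This is a minimal illustration only: the function names, the record layout (group, test score, criterion), and the data are all hypothetical and not taken from the Standards. A real differential prediction study would also need standard errors and a test of whether the fitted lines differ.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for criterion = intercept + slope * score."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def fit_by_group(records):
    """records: iterable of (group, test_score, criterion) tuples (hypothetical layout)."""
    by_group = {}
    for group, score, criterion in records:
        xs, ys = by_group.setdefault(group, ([], []))
        xs.append(score)
        ys.append(criterion)
    # One (intercept, slope) pair per group.
    return {g: fit_line(xs, ys) for g, (xs, ys) in by_group.items()}

# Made-up data: both groups share a slope, but group A's line sits one unit higher.
records = [
    ("A", 1.0, 3.0), ("A", 2.0, 5.0), ("A", 3.0, 7.0),
    ("B", 1.0, 2.0), ("B", 2.0, 4.0), ("B", 3.0, 6.0),
]
fits = fit_by_group(records)
for group, (intercept, slope) in sorted(fits.items()):
    print(group, intercept, slope)
```

Intercepts or slopes that differ between groups mean a single common equation would systematically over- or under-predict the criterion for one group, which is the situation the slide calls differential prediction.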

Conduct studies to examine the validity of test score inferences for the intended examinee population • When tests require scoring of constructed responses, evidence of reliability and validity of inferences should be obtained for relevant subgroups.
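One simple piece of such evidence is how often two raters assign the same score to a constructed response, computed separately for each subgroup. The sketch below assumes a hypothetical record layout of (group, rater 1 score, rater 2 score) and made-up scores. Exact agreement is only one of several possible indices and is not prescribed by the Standards.

```python
def agreement_by_group(ratings):
    """Proportion of responses on which two raters agree exactly, per subgroup.

    ratings: iterable of (group, rater1_score, rater2_score) tuples (hypothetical layout).
    """
    counts = {}
    for group, r1, r2 in ratings:
        agree, total = counts.get(group, (0, 0))
        counts[group] = (agree + (r1 == r2), total + 1)
    return {g: agree / total for g, (agree, total) in counts.items()}

# Made-up double-scored responses for two subgroups.
ratings = [
    ("A", 3, 3), ("A", 2, 2), ("A", 4, 3), ("A", 1, 1),
    ("B", 3, 2), ("B", 2, 2), ("B", 4, 4), ("B", 1, 2),
]
rates = agreement_by_group(ratings)
print(rates)
```

A markedly lower agreement rate for one subgroup would be a prompt to examine rubrics and rater training for that group's responses.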

Fairness in Testing: Theme 3 Linda Cook Educational Testing Service

Provide Appropriate Accommodations to Remove Barriers to the Accessibility of the Construct Measured by the Assessment and to the Valid Interpretation of Scores • Provide test accommodations, when appropriate and feasible, to remove construct-irrelevant barriers that otherwise would interfere with an examinee’s ability to demonstrate their standing on the target construct(s).

Provide Appropriate Accommodations to Remove Barriers to the Accessibility of the Construct Measured by the Assessment and to the Valid Interpretation of Scores • When test accommodations and/or modifications are permitted, test developers and/or test users are responsible for documenting provisions for their use.

Provide Appropriate Accommodations to Remove Barriers to the Accessibility of the Construct Measured by the Assessment and to the Valid Interpretation of Scores • Whoever assigns, administers or documents the use of permissible test accommodations and/or modifications should have sufficient information available to them and sufficient expertise to carry out this role.

Provide Appropriate Accommodations to Remove Barriers to the Accessibility of the Construct Measured by the Assessment and to the Valid Interpretation of Scores • When a test is changed to remove barriers to the construct being measured, empirical evidence of the reliability, validity, and comparability of inferences made from the scores should be obtained and documented.

Provide Appropriate Accommodations to Remove Barriers to the Accessibility of the Construct Measured by the Assessment and to the Valid Interpretation of Scores • When tests are translated to a different language, empirical evidence of the reliability, validity, and comparability of inferences made from the scores from the changed test should be documented.

Provide Appropriate Accommodations to Remove Barriers to the Accessibility of the Construct Measured by the Assessment and to the Valid Interpretation of Scores • A test generally should be administered in the test taker’s most proficient language for the testing context, unless proficiency in the language of the test is part of the construct that is being measured.

Provide Appropriate Accommodations to Remove Barriers to the Accessibility of the Construct Measured by the Assessment and to the Valid Interpretation of Scores • When an interpreter is used in testing, the interpreter should be sufficiently fluent in the language and content of the test and the examinee's native language and culture to translate the test instructions and questions, and, where required, to explain the examinee’s test responses. • Procedures for administering a test when an interpreter is used should be standardized.

Fairness in Testing: Theme 4 Frank C. Worrell University of California, Berkeley

Guard against inappropriate interpretations, use, and/or unintended consequences of test results for individuals or subgroups. • Focus of this theme is on the use of test scores—interpretation and consequences. • As with the previous themes, the goal is to apply the general principles to relevant subgroups. • ELLs, cultural minorities, immigrants, older individuals

Guard against inappropriate interpretations, use, and/or unintended consequences of test results for individuals or subgroups. • Test developers and publishers need to provide information supporting claims that a test can be used with examinees from specific subgroups (e.g., individuals from different linguistic or cultural backgrounds, individuals with disabilities).

Guard against inappropriate interpretations, use, and/or unintended consequences of test results for individuals or subgroups. • Research evidence is necessary to support the comparability of scores when test scores are disaggregated and reported for subgroups (e.g., gender, ethnicity, age, language proficiency, disability).

Guard against inappropriate interpretations, use, and/or unintended consequences of test results for individuals or subgroups. • Tests should not be used with subgroups if credible evidence suggests that examinees’ scores are affected by construct-irrelevant characteristics of the test or of the examinees.

Guard against inappropriate interpretations, use, and/or unintended consequences of test results for individuals or subgroups. • It is inappropriate to use test scores as the sole indicator of an individual’s functioning, competence, attitudes and/or predisposition for the purposes of diagnosis and intervention.

Guard against inappropriate interpretations, use, and/or unintended consequences of test results for individuals or subgroups. • When alternative and equal measures of a construct exist, group differences (e.g., in mean scores or in percentages of subgroups of examinees passing) should be considered in deciding which test to use.
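The group differences named on this slide (mean scores and percentages passing) can be tabulated for each candidate test as in the sketch below. The test names, scores, and cut score are invented for illustration, and the comparison assumes the two tests have already been judged equal as measures of the construct.

```python
from statistics import mean

def group_summary(scores_by_group, cut_score):
    """Mean score and pass rate (score >= cut_score) for each subgroup."""
    return {
        g: {"mean": mean(s), "pass_rate": sum(x >= cut_score for x in s) / len(s)}
        for g, s in scores_by_group.items()
    }

def pass_rate_gap(summary):
    """Largest difference in pass rate between any two subgroups."""
    rates = [v["pass_rate"] for v in summary.values()]
    return max(rates) - min(rates)

# Hypothetical scores on two alternative tests of the same construct.
test_x = {"A": [60, 70, 80, 90], "B": [50, 60, 70, 80]}
test_y = {"A": [62, 68, 75, 85], "B": [60, 66, 74, 83]}

summary_x = group_summary(test_x, cut_score=65)
summary_y = group_summary(test_y, cut_score=65)
print(pass_rate_gap(summary_x), pass_rate_gap(summary_y))
```

In this made-up case the second test shows no gap in pass rates, which is the kind of information the slide says should be weighed when choosing between otherwise equal measures.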

Guard against inappropriate interpretations, use, and/or unintended consequences of test results for individuals or subgroups. • When a test is used as an instrument of public policy, test users and policy makers must provide evidence (e.g., reliability, validity, and comparability of scores, likely consequences for individuals from relevant subgroups) in support of the proposed use.
