Presentation Transcript


  1. An Introduction to Norm-Referenced Assessment Chapter 5

  2. How Norm-Referenced Tests Are Constructed • When it is decided that an assessment will be created for a specific domain… • An item pool is created • Items are arranged in sequence according to difficulty • A developmental version is field tested with a small sample • Professionals critique the assessment • Revisions are made • The test is field tested again with a larger sample

  3. What Is a Norm-Referenced Test? • Allows teachers to compare the performance of one student with the average performance of other students of the same age or grade. • The norm group (i.e., the standardization sample) is a diverse group of students (e.g., diverse in language, disability status, and cultural background) • The norm group sets the average performance for the assessment • The norm group should be representative of the students who will later be assessed • Students who are later assessed should be similar to the norm group in background and in age or grade.

  4. Interpolating Data • Test developers obtain average expected scores for each month by interpolating data. • Existing data are divided into smaller units to establish developmental scores • Developmental (grade-equivalent) scores are written using a decimal point (e.g., 4.5) • Scores may also be divided by age group so that each month of a chronological age is represented • A student’s age is expressed in years, months, and days • Age-equivalent scores are written using a dash (e.g., 8-6)
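
A minimal sketch of the interpolation idea, assuming a developer has empirical mean raw scores at two adjacent grade levels and fills in the months between them by linear interpolation; the anchor values (grade 3.0 → 21, grade 4.0 → 30) are invented for illustration:

```python
# Hypothetical illustration: linearly interpolate monthly expected
# scores between two empirically normed grade levels.

def interpolate_monthly(grade_lo, score_lo, grade_hi, score_hi):
    """Return (grade, expected raw score) pairs for each school month
    between two norming points one grade apart."""
    months = 10  # grade-equivalent scores run .0 through .9
    step = (score_hi - score_lo) / months
    return [(round(grade_lo + m / 10, 1), score_lo + m * step)
            for m in range(months + 1)]

for grade, expected in interpolate_monthly(3.0, 21, 4.0, 30):
    print(f"grade {grade}: expected raw score {expected:.1f}")
```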

  5. Steps in Test Administration • Be sure to read the test manual • Follow the directions established by the test developer • Practice administering the test • Establish a positive testing environment • Get familiar with the student before testing begins. • Engage in friendly conversation before testing begins. • Explain why the testing is being completed. • Provide a brief introduction to the test. • Begin testing in a calm manner.

  6. Calculating Chronological Age • Chronological age states how old a student is in years, months, and days (in that order). • Must be calculated correctly for interpreting test results. • Is calculated by writing the test date first and then subtracting the student’s date of birth. • Each column in the calculation uses a different base, so borrowing differs from ordinary subtraction: borrowing 1 year adds 12 months, and borrowing 1 month adds 30 days. • In the worked example, the chronological age (rounded) is 8-6.
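
A minimal sketch of this column-by-column subtraction, using invented dates (chosen so the rounded result matches the slide’s 8-6); individual test manuals specify their own rounding conventions:

```python
# Hypothetical illustration: compute chronological age by subtracting
# the birth date from the test date, borrowing 1 month = 30 days in
# the days column and 1 year = 12 months in the months column.

def chronological_age(test_ymd, birth_ymd):
    ty, tm, td = test_ymd
    by, bm, bd = birth_ymd
    if td < bd:   # borrow 1 month into the days column (+30 days)
        td += 30
        tm -= 1
    if tm < bm:   # borrow 1 year into the months column (+12 months)
        tm += 12
        ty -= 1
    return ty - by, tm - bm, td - bd

years, months, days = chronological_age((2007, 3, 14), (1998, 9, 20))
print(f"{years}-{months}-{days}")  # -> 8-5-24
# With the common convention of rounding 15 or more days up to the
# next month, 8-5-24 is reported as 8-6.
```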

  7. Calculating Raw Score • The first score obtained during testing is the raw score. • The number of items a student answers correctly.

  8. Basals and Ceilings • The starting and stopping points of a test must be determined so that unnecessary items are not administered. • The starting point is meant to represent a level at which the student can answer correctly. • This information is provided in the manual, in the protocol, or on the test itself. • Each student must establish a basal. • A basal is the level at which the student can correctly answer all easier items. • Typically, a manual will state that a student must get X number of items in a row correct in order to establish a basal. • Once a basal is established, the testing may proceed. • If the student does not establish a basal, the test is probably too hard, and an alternative should be given.

  9. Basals and Ceilings Continued • Starting points can be given as an age or a grade. • Student is 6-4; start with item 10 • Student is 8.3; start with item 50 • NOTE: When calculating the raw score, all items that appear before the established basal are counted as correct. • It is better to start the test with easier questions to reduce the frustration of students who may be performing below typical levels. • A ceiling is thought to represent the level above which more difficult items would not be passed. • Typically, a manual will state that a student must get X number of items in a row incorrect in order to establish a ceiling. • Once a student “hits” the ceiling, the testing stops. • NOTE: The basal and ceiling rules may not use the same number of items. In fact, the basal and ceiling rules may vary with each section of an assessment!
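
The basal and ceiling rules lend themselves to a simple scoring routine. Below is a minimal sketch that also computes the raw score from slide 7, crediting all items below the start point once a basal is established; the three-in-a-row run lengths are illustrative assumptions, since the actual values come from each test’s manual:

```python
# Hypothetical illustration of raw-score calculation with basal and
# ceiling rules. `responses` holds 1 (correct) / 0 (incorrect) for the
# administered items, in order of increasing difficulty.

def raw_score(items_below_start, responses, basal_run=3, ceiling_run=3):
    # 1. Establish the basal: the first run of basal_run consecutive
    #    correct answers.
    streak = 0
    basal_found = False
    for r in responses:
        streak = streak + 1 if r == 1 else 0
        if streak >= basal_run:
            basal_found = True
            break
    if not basal_found:
        return None  # no basal established: the test is probably too hard

    # 2. Count correct answers, stopping at the ceiling: a run of
    #    ceiling_run consecutive incorrect answers ends the test.
    score = items_below_start  # items below the start point count as correct
    misses = 0
    for r in responses:
        if r == 1:
            score += 1
            misses = 0
        else:
            misses += 1
            if misses >= ceiling_run:
                break
    return score

# Start at item 10 (items 1-9 are credited once the basal is reached):
print(raw_score(9, [1, 1, 1, 1, 0, 1, 0, 0, 0]))  # -> 14
```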

  10. Using Information on Protocols • The protocol is the form used during the test administration and for scoring. • Student answers are often scored as a series of 1s and 0s (correct and incorrect) in the protocol. • Be sure to read the manual regarding what subtests to administer and basal and ceiling information.

  11. Getting the Best Results • Students tend to respond more and perform better in testing situations with examiners who are familiar to them. • Students should not meet you for the first time on testing day! • It may also be helpful for the student to visit the testing site to become familiar with the environment. • Classroom observations and visits may aid the examiner in determining which tests to administer. • Do not over-test the student. • Make the student feel at ease. • Convey the importance of the testing without making the student feel anxious. • Reinforce the student’s attempts and efforts, not correct responses. • Young students may enjoy a tangible reinforcer upon completion of the testing session (not recommended during the assessment itself). • Follow all directions in the manual.

  12. Reducing Bias • Do sensory or communicative impairments make portions of the test inaccessible? • Do sensory or communicative impairments limit the student’s ability to respond to questions? • Do the test materials or the required method of responding limit the student’s ability to respond? • Do background experiences limit the student’s ability to respond? • Does the content of classroom instruction limit the student’s ability to respond? • Is the examiner familiar to the student? • Are instructions explained in a familiar fashion? • Is the recording technique the test requires of the student familiar?

  13. Obtaining Derived Scores • Raw scores are used to locate other derived scores. • There are advantages and disadvantages to using different types of derived scores. • It is important to understand the numerical scales that the scores are representing.
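
As a concrete illustration of one common derived score, the sketch below converts a raw score to a standard score on the familiar mean-100, SD-15 scale, assuming the norm table supplies the norm group’s mean and standard deviation for the student’s age; the numbers are invented:

```python
# Hypothetical illustration: derive a standard score from a raw score
# using the norm group's mean and standard deviation.

def standard_score(raw, norm_mean, norm_sd, scale_mean=100, scale_sd=15):
    z = (raw - norm_mean) / norm_sd           # distance from the norm mean in SD units
    return round(scale_mean + z * scale_sd)   # rescale to mean 100, SD 15

print(standard_score(raw=42, norm_mean=38.0, norm_sd=5.0))  # -> 112
```

In practice, examiners read derived scores directly from the test’s norm tables rather than computing them; the formula simply shows what those tables encode.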

  14. Types of Derived Scores

  15. Group Testing • Schools often administer group standardized tests of achievement. • These are often referred to as high-stakes tests. • Considerations in testing: • Tests should be logical and serve the purpose for which they were intended. • No test has the capability of answering all achievement questions.

  16. National Center on Educational Outcomes (NCEO) • Core principles: • All students are included in ways that hold schools accountable for their learning. • Assessments allow all students to show their knowledge and skills on the same challenging content. • High-quality decision making determines how students participate. • Public reporting includes the assessment results of all students. • Accountability determinations are affected in the same way by all students. • Continuous improvement, monitoring, and training ensure the quality of the overall system (p. v).

  17. High-Stakes Considerations • IDEA requires students with disabilities to be included in assessments. • Used as a measure of accountability. • Student progress is measured to determine if programs are effective. • These students are afforded accommodations. • Changes in format, response mode, setting, timing, or scheduling (CCSSO). • Accommodations may not alter what the test is measuring. • Accommodations should keep the test from measuring a student’s disability rather than the intended content. • How assessment requirements are met is determined during the IEP and 504 processes. • Decisions regarding assessments should focus on the standards that students are expected to master.

  18. High-Stakes Considerations Continued • Students who cannot take the assessment must be provided with an alternate assessment. • There is a 1% cap on the number of students who may take alternate assessments. • Permitted formats include portfolio, performance-based, and authentic assessments, as well as observations. • NOTE: Students who are ELL may require accommodations so that the test measures their content knowledge and not their English skills.

  19. Issues & Research in High-Stakes Testing • Concerns about the inconsistency of definitions, federal law requirements, variability among states and districts, differences in standards of expectations for students with disabilities, lack of participation of students with disabilities in test development and standardization of instruments, and lack of consistency. • Conceptual understanding of the purpose and nature of the assessment. • Mandatory statewide assessments have resulted in damaging the American education system for all students and alternate assessments may not be the best way to measure academic progress. • Some teachers reported that high-stakes assessment helped teachers target and individualize instruction, and that their students who disliked reading or had difficulties with academics felt more in control of their own learning. • Performance task-based reading alternate tests can be scaled to statewide assessments, although determining their validity and reliability may be difficult. • Development of alternate assessments is difficult and states require more time to develop appropriate measures. • Practices on some campuses might result in specific groups of students being encouraged not to attend school on the days of the assessment so that campus data might be more favorable. • Test items were not comparable across assessments when the modified assessments used for children with various disabilities were analyzed. Moreover, the items varied by disability category.

  20. Universal Design of Assessments • There has been growing interest in designing all assessments to be fairer and more usable for all learners from the outset, rather than trying to fit a test to a student’s needs (UD, NSCU).
