Topic 4: Formal assessment
Application Of Norm-referenced Tests & Criterion-referenced Tests


Informal Assessment

  • Informal assessment

    • a procedure for obtaining information that can be used to make judgments about characteristics of children or programs using means other than standardized instruments.

    • Examples: projects, presentations, experiments, demonstrations or performances, portfolios, asking questions during class, or through informal observations of interaction.

Formal Assessment

  • Formal assessment

    • "using a test that involves standardized administration and that has norms and a formal interpretive procedure" (Reading Success Lab, 2006).

    • uses formal tests or structured continuous assessment to evaluate a person's development or performance.

    • Examples: standardized tests or end-of-chapter tests. This type of assessment has specific right or wrong answers based on a set of predetermined criteria and has been used with other students (Weaver, 2007).

Formal test

  • Formal tests (psychological assessment) usually fall into the following categories:

    • Achievement tests

    • Readiness tests

    • Developmental screening tests

    • Intelligence tests

    • Diagnostic tests

Formal Assessment

Formal assessments are used "to assess overall achievement, to compare a student's performance with others at their age or grade, or to identify comparable strengths and weaknesses with peers" (Weaver, 2007, p. 1).

Weaknesses of formal assessment

Many students get nervous while taking formal assessments and may not perform as well.

They often rely on multiple-choice questions; with this format, students are never required to come up with their own answers, only to select the best one from a list.

They cannot measure the depth of a student's knowledge.

Types of Formal assessment

  • Norm-referenced tests

  • Criterion-referenced tests

Norm-referenced Tests (NRT)

A norm refers to a pattern or average regarded as typical for a specific group.

They have standardized, formal procedures for administering, timing and scoring.

They have been "normed", i.e. administered to a representative sample of students of similar age or grade level, so that final test results can be compared with those of students with similar characteristics.

Norm-referenced Tests (NRT)

  • Norm Referenced Tests are tests that compare the performance of an individual child to that of their classmates or of another group known as a norm group.

  • A norm group is defined as a group of individuals who are used to standardize the test.

  • Raw scores are the results of a standardized test that are obtained when the test is scored according to the directions.

  • An NRT is a standardized achievement test that can provide educators with important information.

    • In order to understand the results, we must become familiar with the ways scores can be represented.

Norm-referenced Tests (NRT)

  • After studying NRTs, you will be able to do the following:

    • Define norm group and norm-referenced tests

    • Differentiate between raw scores and derived scores

    • List and explain the meaning of each of the scores used in norm-referenced interpretation of standardized achievement test scores

    • Convert standard scores to percentile ranks, z-scores, and T-scores using a normal curve
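The conversion in the last objective can be sketched in a few lines of Python. This is a minimal illustration, not a scoring procedure from any particular test: it assumes scores follow a normal curve, uses the usual definitions z = (score - mean) / SD and T = 50 + 10z, and takes a scale with mean 100 and SD 15 purely as a hypothetical example. The function names are also invented for illustration.

```python
from statistics import NormalDist

# Standard normal curve (mean 0, SD 1), used to turn z-scores into percentile ranks.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def z_from_standard_score(score, mean, sd):
    """Number of standard deviations the score lies above or below the mean."""
    return (score - mean) / sd


def t_score_from_z(z):
    """T-scores rescale z-scores to a mean of 50 and an SD of 10."""
    return 50 + 10 * z


def percentile_rank_from_z(z):
    """Percent of a normally distributed norm group scoring below this z-score."""
    return 100 * STANDARD_NORMAL.cdf(z)


# Hypothetical standard score of 115 on a scale with mean 100 and SD 15.
z = z_from_standard_score(115, mean=100, sd=15)
print(z, t_score_from_z(z), round(percentile_rank_from_z(z), 1))  # 1.0 60.0 84.1
```

A score exactly at the mean gives z = 0, T = 50 and a percentile rank of 50, which is a quick way to check any conversion table.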

Norm-referenced Tests (NRT)

Percentile scores indicate a student’s position relative to the group. E.g., a student scoring in the 95th percentile scored higher than 95 percent of similar students.

Grade Equivalent Scores are used mainly in elementary schools. They convert a student’s raw scores to the grade level equivalent to the student’s score. E.g., a fifth-grade student who scores math 5.6, language arts 6.7, and science 7.4 may not necessarily be ready to skip a grade, because the test probably does not include sixth- and seventh-grade material.

Standard Scores are derived from raw scores by making use of the norming information obtained when the test was developed. Standard scores inform you about how far above or below the average, or mean, your student’s score falls.
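To make the derived scores above concrete, here is a small Python sketch that turns one raw score into a percentile rank and a z-score using a norm group. The norm group data and function names are invented for illustration. The percentile rank here counts the share of the norm group scoring strictly below the student, matching the "scored higher than" reading above; published tests may handle ties differently. The population standard deviation is used, which is also an assumption.

```python
from statistics import mean, pstdev

# Hypothetical raw scores from a small norm group (invented for illustration).
norm_group = [10, 14, 16, 18, 20, 20, 22, 24, 26, 30]


def percentile_rank(raw_score, norm_scores):
    """Percent of the norm group that scored strictly below raw_score."""
    below = sum(1 for s in norm_scores if s < raw_score)
    return 100 * below / len(norm_scores)


def z_score(raw_score, norm_scores):
    """Distance of raw_score from the norm group mean, in standard deviations."""
    return (raw_score - mean(norm_scores)) / pstdev(norm_scores)


student_raw = 26
print(percentile_rank(student_raw, norm_group))    # 80.0
print(round(z_score(student_raw, norm_group), 2))  # 1.07
```

The raw score of 26 means little by itself; the derived scores show that this student outscored 80 percent of the norm group and sits about one standard deviation above its mean.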

Strengths Of Norm-referenced Tests

They assume statistical rigor in that they are reliable (i.e., dependable and stable) and valid (i.e., measure what they are reported to measure);

The quality of test items is generally high because they are developed by test experts, pilot tested, and revised prior to publication and use.

Administration procedures are standardized, and test items are designed to rank examinees for the purpose of placing them in specific programs or instructional groups.

Strengths Of Norm-referenced Tests

They provide meaningful information regarding average performance, e.g., in a particular school or district. This can decrease the likelihood of bias in educational decision-making because a student's test performance is compared with that of other students whose demographic and background factors are similar.

Opportunity to compare data on students' educational outcomes to instructional curricula to which students have already been exposed.

Useful in facilitating decisions such as identifying the educational needs of students, determining standards for student progress, and identifying and making decisions about students' eligibility.

These norms are useful for identifying students at risk for school failure.

Weaknesses Of Norm-referenced Tests

The test items are seldom aligned with the curricular content taught in educational settings (with the exception of locally normed tests).

Ideally, the items on an NRT should correspond to the content of the curriculum taught in a classroom.

Results of an NRT devoid of content validity make it difficult to determine effective interventions that are needed for a student experiencing academic and/or behavioral challenges.

NRTs do not allow for monitoring academic progress over an extended period of time; instead, they provide an index of achievement or performance in comparison to a norm group at one specific point in time.

Weaknesses Of Norm-referenced Tests

The conclusions based on the examinee's test performance may be misleading due to the disparities that exist between examinees and the norm group in terms of skills and experiences.

“When a child's general background experiences differ from those of the children on whom a test was standardized, then the use of the norms of that test as an index for evaluating that child's current performance or for predicting future performances may be inappropriate” (Salvia & Ysseldyke, 1991, p. 18).

Criterion-referenced Tests

  • A CRT is a test that measures a specific level of performance or a specific degree of mastery.

  • It measures what the person is able to do and indicates which skills have been mastered.

  • A CRT compares a person's performance with his or her own past performance.

    • E.g., number of spelling words correct: if Molly spells 15 of 20 words correctly, that is 75% correct, higher than the previous week, when her score was 60% correct.

  • In criterion-referenced measurement, the emphasis is on assessing specific and relevant behaviors that have been mastered rather than indicating the relative standing in the group.
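The spelling example above comes down to simple arithmetic, sketched here in Python. The function name is invented for illustration; the numbers are the ones from the example.

```python
def percent_correct(num_correct, num_items):
    """Percentage of items answered correctly."""
    return 100 * num_correct / num_items


# Molly's spelling results from the example above.
this_week = percent_correct(15, 20)
last_week = 60.0

print(this_week)              # 75.0
print(this_week - last_week)  # 15.0
```

Nothing in this calculation refers to other students: the score is read against the task itself and against Molly's own earlier result.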

Criterion-referenced Tests

  • CRTs are similar to norm-referenced tests in terms of administration, scoring, and format; however, they differ in terms of interpretation.

  • CRT interpretation involves evaluating an examinee's performance in relation to a specific criterion.

    • For instance, if a criterion were “the ability to subtract single digit numbers,” the interpretation would involve indicating simply whether or not the student answered the administered subtraction problem items correctly.

    • A norm-referenced test interpretation, however, would involve whether this student correctly answered more questions compared to others in the normative group.
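The contrast between the two interpretations can be sketched in Python by reading the same set of responses in both ways. All data and function names below are hypothetical and exist only to illustrate the difference.

```python
def criterion_referenced(responses):
    """Report performance against the criterion itself (items answered correctly)."""
    return {
        "correct": sum(responses),
        "items": len(responses),
        "all_correct": all(responses),
    }


def norm_referenced(num_correct, norm_group_scores):
    """Report standing relative to others: percent of the norm group scoring lower."""
    below = sum(1 for s in norm_group_scores if s < num_correct)
    return 100 * below / len(norm_group_scores)


# Hypothetical results on ten single-digit subtraction items (True = correct).
responses = [True] * 8 + [False] * 2

# Hypothetical number-correct scores for a normative group on the same items.
norm_group_scores = [3, 4, 5, 5, 6, 6, 7, 7, 9, 10]

print(criterion_referenced(responses))
# {'correct': 8, 'items': 10, 'all_correct': False}
print(norm_referenced(sum(responses), norm_group_scores))  # 80.0
```

The first result says what the student can do on the subtraction items; the second says only that the student answered more items correctly than 80 percent of this invented comparison group.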

Criterion-referenced Tests

  • Generally, criterion-referenced performance is summarized as percentage correct or represented as a grade-equivalent score.

  • CRTs attempt to uncover a child's strengths and weaknesses in terms of what the child knows or doesn't know, understands or doesn't understand, and can or cannot do in a particular domain or area. This is also called content-referenced assessment.

  • CRTs are sometimes misunderstood.

    • Although these types of tests can involve the use of a cut-off score (e.g., the point at which the examinee passes if the score exceeds this number), the cut-off score is not the criterion.

    • Rather, the criterion refers to the content area domain that the test is intended to assess (Witt et al., 1998).
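The distinction between the criterion and the cut-off score can be illustrated with a short Python sketch. The content domains, item results, and the 70 percent cut-off are all hypothetical. Here the criterion is the content domain being assessed, while the cut-off is only a threshold applied to the score within that domain.

```python
# Hypothetical item results grouped by content domain (True = correct).
results = {
    "single-digit subtraction": [True, True, True, True, False],
    "single-digit addition": [True, True, False, False, False],
}

# Hypothetical cut-off: the examinee passes a domain if the score exceeds this value.
CUT_OFF = 70.0


def domain_report(results, cut_off):
    """Percent correct and pass/fail decision for each content domain."""
    report = {}
    for domain, items in results.items():
        pct = 100 * sum(items) / len(items)
        report[domain] = (pct, pct > cut_off)
    return report


for domain, (pct, passed) in domain_report(results, CUT_OFF).items():
    print(domain, pct, "pass" if passed else "not yet mastered")
# single-digit subtraction 80.0 pass
# single-digit addition 40.0 not yet mastered
```

Changing the cut-off changes who passes, but it does not change what is being measured; the per-domain breakdown is what shows strengths and weaknesses.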

Strengths Of Criterion Referenced Assessments

students are tested on their mastery of skills;

the tests are based upon common developmental ranges;

they help the teacher plan according to the child’s needs for classroom instruction;

the assessment could be used over a developmental age span to determine a child’s growth in acquiring skills as a response to intervention;

they measure intelligence;

the child’s response to intervention guides the teacher’s instruction.

Weaknesses Of Criterion-referenced Assessments

It does not allow comparing the performance of students in a particular location with national norms. For example, a school would be unable to compare 5th grade achievement levels in a district, and therefore be unable to measure how a school is performing against other schools.

It is time-consuming and complex to develop. Teachers will be required to find time to write a curriculum and assessments on top of an already full workload. It might require more staff to come in and help.

Weaknesses Of Criterion-referenced Assessments

It costs a lot of money, time and effort. Creating a specific curriculum incurs extra cost and time, for example to hire more staff, who will most likely need to be experienced professionals.

It needs efficient leadership and collaboration, and a lack of leadership can cause problems. E.g., if a school is creating assessments for special education students without well-trained professionals, it might not be able to create assessments that are learner-centered.

It may slow the process of curriculum change if tests are constantly changed. It is hard for curriculum developers to know what is working and what is not because tests tend to differ from one school to another. It may require years of collecting data to know what is working and what is not.