The Impact of Evaluation Criteria on Writing Performance: A Study of Pre-Service English Teachers
Lina Mukhopadhyay & Geetha Durairajan
20 February: TEC14


The objective of this presentation is to address the following points:

A. Role and types of evaluation criteria (theoretical issues)

B. Impact of evaluation criteria on writing performance (empirical evidence)

C. Creation of evaluation criteria to assess writing (practice for teachers)



  • What are issues in assessing writing?

  • What is fairness?

  • What is positive washback?


    A classroom-based study on the effect of evaluation criteria on writing performance


    A practice session on designing task-specific evaluation criteria


What are issues in assessing writing?

  • What to evaluate? Content and/or language

  • How to evaluate? Scales: holistic/analytical

  • How to ensure

    • inter-rater reliability?

    • fairness?

    • positive washback?

What is fairness?

  • Choose/create evaluation criteria according to the level of learners, task requirements and test purpose (construct validity)

  • Share evaluation criteria with learners (justice: inter-learner equity)

  • Train learners to use evaluation criteria (access: educational opportunity to learn)

How do I choose my evaluation criteria?

Evaluation criteria: types

Role of Evaluation Criteria

Evaluation Criteria: research

  • Each task needs a different scale and the criterion should reflect the writing construct (Hamp-Lyons 1991, 1995)

  • What guides raters’ ratings: educational background, interpretations of the construct of language proficiency, and task requirements (Cumming, Kantor & Powers 2001; Eckes 2005; Fahim & Baijani 2011)

  • Correlations between raters’ judgments: how to ensure inter-rater reliability (Wang 2009)

What is positive washback?

ESL/EFL learners will be able to

  • use evaluation criteria as a checklist to fulfill task requirements

  • understand the assessor’s expectations of tasks in a transparent manner and work to fulfill them

  • do self- and peer assessments using criteria (thereby learning to maintain inter-rater reliability and provide feedback to each other)

  • generalize from task-specific criteria and use this knowledge in other writing assessments


The Study

Aim: To examine the role of evaluation criteria in writing performance

To show this, we look at

  • participants’ perceptions of the criteria and their benefits (during and after the task)

  • their awareness and use of criteria in writing

    Context: The study was conducted in a PhD-level course titled ‘Language Testing and Assessment’. The course followed a formative assessment model: each assessment had task-specific evaluation criteria, which were shared with the learners before they attempted the tasks. An in-depth study was done to gather evidence of learning through writing assessment (assessment for and as learning).


positive washback

Research questions

If task-specific criteria are provided to adult ESL learners,

(i) will they benefit from this knowledge?

(ii) what kinds of benefit will they experience?

Method of data collection

Thirteen adult learners (8 female), aged 24 to 45, participated in the study. Eight participants had prior teaching experience, and two of them reported having used criteria in assessment.

Stage 1: making task-specific criteria available

Stage 2: perception of criteria

Stage 3: using criteria (implicit training)

Stage 4: talking about benefit(s)

Example of a writing assessment

Task prompt:

Look at the proficiency test. This was used as an entrance test for BA English programme at EFL-U. Does this test pass all the five principles of assessment (authenticity, reliability, validity, practicality and washback)? Justify your stance with relevant examples. Write a critical response in about 500 words.

Evaluation criteria:

  • Does the response contain an overall thesis statement and comments on all five principles? Is each principle justified with at least one example? (content)

  • Is the response written in academic language (e.g., passivization, linkers, voice), and does it include referencing details? (language)

  • Is the response presented in three parts (intro-body-conclusion) with adequate links between them? Are ideas linked at the intra- and inter-sentential levels? (organization)

Method of analysis

Qualitative analysis of perceptions from two sources:

A. participants

B. tutor as evaluator

to capture instances of learning (positive washback).

Measurement of learning (positive washback)

Do the participants

  • experience ease in planning and performing tasks?

  • understand the assessor’s expectations for each task?

  • use criteria for self- and peer assessment meaningfully?

  • reflect on strengths and weaknesses after performance?

  • generalize planning and writing techniques to write critical responses in the course and outside of it?


Were benefits experienced?

Participants reported benefit at the level of planning and post-task reflection, at 96%. This was experienced due to the availability of evaluation criteria to complete the writing assessments.

Benefits experienced (positive washback): sources of evidence

1. Participants’ responses

2. One instance of peer evaluation

3. Tutor’s assessment

1a. Examples: participants’ responses

ease in planning

I liked the idea of writing with the prompt and evaluation criteria as it helped me to produce responses that were clear and to a greater extent, up to the assessor’s expectations.

By the end of the course my response to using the evaluation criteria to plan and write my assignments improved. I think that it is a very significant and necessary aspect of writing an assignment. For the other courses, where we did not receive any evaluation criteria I tried to speculate the expectations of the assessor and create the criteria and then write the assignment. (S:VI)

understand assessor’s expectations

generalize techniques to other pieces of writing – post course application

1b. Examples: participants’ responses

I could not follow the evaluation criteria that much meaningfully for the first time. The problem was not obviously with the criteria, but with my understanding of the nature of assignment… But later on, day by day I had been trying to build a sort of familiarity or say rapport with the evaluation criteria, and started adjusting my writing into the criteria.

My later assignments would manifest how much labor I devoted to follow those criteria. And the result was satisfactory. I was happy, indeed. (S:RU)

ease in using criteria

positive reflection post performance

Summary of benefits

During task

  • crucial to finish tasks/assignments on the course in an organized manner

  • understand different levels of performance and check before submission which level their response has met

  • understand assessor’s expectation(s) and features (content-organization-language) that were part of different levels of performance

Post task

  • useful to complete peer assessment and provide feedback

  • understand strengths and weaknesses in one’s own work, especially in content development (gaps in providing evidence to support claims)

Source: Participants’ responses

2. Peer evaluation

The course included one assessment task in which the participants had to critique a test for its degree of usefulness. Evaluation criteria for the task were given to the participants before they attempted it. They reported that they had used the criteria while working on the task.

Later, the participants used the same criteria to do peer assessment on the same task. The peer assessment correlated with the tutor’s assessment at r = .79, a high positive correlation indicating a high degree of inter-rater reliability.
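For readers who want to run the same kind of check on their own class, the correlation between peer and tutor scores can be computed as a Pearson product-moment coefficient. The sketch below is only an illustration: the slides do not say which correlation coefficient or scoring scale was used, so both are assumptions here, and the scores are hypothetical values invented for the example, not the study's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one rater gives identical scores")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores (out of 10) for seven scripts; NOT the study's data.
peer_scores = [6, 7, 5, 8, 6, 9, 7]
tutor_scores = [5, 7, 6, 8, 5, 9, 8]

r = pearson_r(peer_scores, tutor_scores)
print(f"peer-tutor correlation: r = {r:.2f}")
```

A value close to 1 means that peers and the tutor ranked the scripts in much the same way. With a group as small as a single class, the coefficient is best read as a rough indicator of agreement rather than a precise estimate.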

In a one-to-one discussion (through an online discussion board), the participants said that they found peer evaluation methodical because of the use of task-specific criteria. They could understand the direction in which the writing task had to be attempted and could give appropriate scores and feedback to their peers.

2a. Example

1. Did the criteria help you in assessing the response of your peer? If yes, then why?

Yes, the criteria helped me assessing my peer because it allowed one to look for specifics in the answer and score against that.

2. If you were not given the criteria but only the prompt then would your assessment have differed? If yes, then in what way? Would you have been able to justify the scores that you would have given as a holistic score or analytical score? Which score type would you be likely to give in the absence of a criteria?

Yes, if the criteria was not given then scoring would not have easy and it would not have been based on the specifics. Also, the justification of the scores would have been difficult. The scoring without criteria would have been a holistic one.

3. When you were given back your response as evaluated by your peer, did you agree on the scoring or disagree? Explain why you agreed or disagreed.

I agreed with the scoring because it was objectively scored against the criteria given. (S:SH)

positive washback: inter-rater reliability, feedback

3. Tutor’s assessment


  • Some attempts at forming an opinion and justifying it through elaboration and examples.

  • Most of the key ideas present.

  • Argumentation is weak.


  • Macro coherence attempted (all the key ideas were presented in their proper order).

  • Signaling of ideas present (organizational details of the paper presented).

  • Micro coherence not well developed (links between paragraphs and sentences not well developed).

Source: Tutor as evaluator

Why were the benefits experienced?

  • Cognitively, the criteria made tasks easier by breaking them down into manageable bits (e.g., key ideas, text structure).

  • They drew learners’ attention to structuring content coherently and presenting ideas in an academic manner.

    (comprehensible output, Schmidt 2001, 2010)

  • They provided learners with a checklist to edit and revise work prior to submission. Making the criteria available to the participants thus yielded positive washback (Hughes 2003).


4. Noticing specific details of tasks to do peer assessment helped learners process ideas at a deeper level. Consequently, they could give each other meaningful feedback on responses. (Robinson 2009)

5. Learners felt responsible for what they had written and evaluated: they learned to focus closely on content development. For instance, learners noticed the use of appropriate examples to substantiate a claim because they used the evaluation criteria twice, once for writing and once for peer assessment. This created a democratic atmosphere of assessment that led to further instances of learning (positive washback) (Shohamy 2002).

Approaches to assessment

Nitko 1983, 1989; Earl 2003; English language arts curriculum, British Columbia 2006; Ontario Report 2010

Assessment as and for learning

Pedagogical implications

  • Assessment can and should be used to support learning.

  • Free response items should have task-specific evaluation criteria.

  • Criteria can be shared to

    raise awareness,

    notice task requirements,

    revise documents

    track growth

positive washback


We need to design and share evaluation criteria with our learners because it can:


b) give rise to instances of learning (positive washback)

Evaluation criteria: examples

TASK: You wish to subscribe to the magazine READER’S DIGEST. Write a letter in 100-150 words to the editor requesting him/her to give you the subscription details. In your letter, you can ask about the subscription rate, mode of payment, delivery and any other query that you may have.

Option 1: General criteria

You will be graded on content, language and organization.

Option 2: Task-specific criteria

Enquires about subscription details, mode(s) of payment, details of delivery, time to be taken, whom to contact in case of problems (Content)

Uses vocabulary appropriate to express each language function and a variety of sentence structures accurately (Language)

Begins with a formal address to the editor and expresses interest in the magazine; presents all enquiries about the subscription; concludes by thanking the editor and asking to receive the information at the earliest (Organization)


Being and looking fair is important. Do you agree? Discuss with reference to the following pictures. Write your answer in 100-150 words.

Picture A; Picture B

Evaluation Criteria: Template 1

Task 2

Being and looking fair is important. Do you agree? Discuss with reference to the following advertisement. Write your answer in 250-300 words.

Evaluation Criteria: Template 2

Anand, Ayesha, Barka, Clementine, Jayant, Kezo, Manish, Remya, Rukan, Shehla, Sunitha, Suraj and Vrishali. Thank you for your participation and timely responses, without which this project would have remained unfulfilled.



Brown, J. D., & Abeywickrama, P. (2011). Language assessment: Principles and classroom practices (2nd edn.). Pearson Education.

Earl, L. (2003). Assessment as learning: Using classroom assessment to maximise student learning. Thousand Oaks, CA: Corwin Press.

Hughes, A. (2003). Testing for language teachers. Cambridge: Cambridge University Press.

Kunnan, A. J. (2000). Fairness and validation in language assessment. Studies in Language Testing 9. Cambridge: Cambridge University Press.

Reid, J. M. (1993). Teaching ESL writing. New Jersey: Prentice-Hall.

Schmidt, R. (2010). Attention, awareness, and individual differences in language learning. In W. M. Chan, S. Chi, K. N. Cin, J. Istanto, M. Nagami, J. W. Sew, T. Suthiwan, & I. Walker, Proceedings of CLaSIC 2010, Singapore, December 2-4 (pp. 721-737). Singapore: National University of Singapore, Centre for Language Studies.

Shohamy, E. (2001). The power of tests: a critical perspective on the use of language tests. Pearson Education.

Upshur, J. A., & Turner, C. E. (1995). Constructing rating scales for second language tests. ELT Journal, 49(1), 3–12.


