A Critical Look Into The Online Service ‘Criterion’ In Regard To Evaluating ‘Non-Native’ Written English And How It Might Best Be Used For Cultivating Such Skills
Simon Potter (D.Phil.) Professor in Languages and Cultures Nagoya University, Japan
Opening Notes
In Japan, computer-based technology has become increasingly common at the tertiary level of education, and it seems to be used at that level far more than in the primary and secondary schools. Experiences at the tertiary level might therefore be useful for deciding how advanced technology gets used at the lower levels.
Although the paper submitted for this conference explains how an online educational service is being used at my university and how it might be better used, the paper was written with the subtle intention of warning against rushing into using ‘technology’ and possibly using it at the wrong level of education. There might be a temptation to use ‘technology’ in the classroom and for testing purposes, but it should be noted that because the machinery, programmes, services, etc. can be expensive, great care should be taken when choosing ‘technology’ and how to apply it.
The English Department at Nagoya University implemented a ‘New Curriculum’ which includes several required computer-based programmes and tests for students to do during their first two years. In some cases (notably the so-called ‘eFACE’ and ‘Gyutto-e’ programmes), they are more like courses than supplements, and the students are generally not enthusiastic about the assigned tasks or even about having to work at a computer.
One of these tasks is to take the ‘Criterion’ writing examination soon after matriculating and then again near the end of their first year. It requires the students to type an essay on a computer, and the software (somewhere in America) evaluates their compositions; the teachers are given the scores to tabulate into the grades for their ‘Middle-Level English’ courses.
The paper submitted for this conference addresses ‘Criterion’ and how it is being used, makes some suggestions, and touches on some potential social values that might be accrued from using this online service – especially at an educationally appropriate level.
The two main divisions of the paper are:
‘Criterion’ as a Device to Evaluate and to Improve Skills in Written English; and
Potential Social Values of Using ‘Criterion’ to Help ‘Non-Natives’ Cultivate Writing Skills in English.
Following these are the ‘Concluding Comments’ and some notes.
Besides highlighting important parts of the paper, this presentation also includes some information, feelings, etc. which were solicited from some students after the paper had been written up.
Since academic year 2009, ‘Criterion’ examinations for assessing English writing skills have been a requirement for all first-year undergraduates at Nagoya University.
‘Criterion’ is an ‘online writing evaluation service’ developed by the Educational Testing Service (ETS), an institution which specialises in standardised tests of value within the realm of education in the United States, and it can be used both for testing and for getting feedback on essays that are being written.
For purposes of evaluation, there are two ‘scoring’ systems, a four-point scale and a six-point scale. The latter is similar to the ‘scoring’ system used for the essay component of the SAT (originally, Scholastic Aptitude Test), a nationwide general examination which many high-school seniors in America take if they intend to apply to a university.
Because the scoring is done by a ‘scoring engine’ – a programmed mechanical device – the question ‘is there a potential bias embedded in “Criterion”?’ arose when originally looking into the validity of using ‘Criterion’ (for a presentation to the faculty in 2009). My concern was that good users of the English language who are in the habit of using ‘British’ writing practices could be subject to lower scores than they deserve.
The ‘standard English’ which ‘Criterion’ is supposed to help students learn to write, and which it evaluates, was not defined or explained at the websites consulted. Furthermore, information about the ‘scoring engine’ suggested that the programme is a work in progress, meaning that what is considered to be acceptable might be broadened over time. Still, whatever qualifies as being ‘standard’ has to reflect the linguistic values of a ‘mainstream’ culture, and this is constantly subject to debate and change just within the United States.
While considering the question about a potential bias, the investigation led to some rather intriguing findings about ‘Criterion,’ which are mentioned here.
The first is that faith has to be put into the system because, according to the website consulted, there were no ‘results from randomized controlled trials that demonstrate the Criterion service’s ability to improve student writing.’ Indirectly, this suggests that there could be problems with the ‘scoring’ as well.
Importantly, it can serve as a warning about making a premature leap into using ‘technology.’
Using ‘technology in the classroom’ was becoming a ‘for-its-own-sake’ obsession around the time that ‘Criterion’ was created, and overdoing the ‘technology in the classroom’ can make the role of the teachers confusing, and potentially turn them into technicians.
One teacher, cited at the website and perhaps aware of shortcomings within his profession, was happy that ‘Criterion’ eliminates ‘the human element’ and turns learning how to write into something like ‘a computer game.’
Another teacher valued ‘Criterion’ because it helped her students prepare for ‘standardized essay writing exams,’ which would have been linked to the federal ‘No Child Left Behind’ legislation (2001); this means that ‘Criterion’ can be used to ‘teach to the test,’ notably statewide achievement tests and the SAT.
A yin-yang type of relationship seems to have been envisioned for teachers vis-à-vis ‘Criterion,’ viz.:
(1) ‘Criterion’ detects certain errors in a student’s writing and can provide some feedback;
(2) Teachers can concentrate on ‘the higher-order features of writing’ and ‘interact with their students regarding other aspects of their writing’;
(3) Teachers have to read the essays for content (facts, correctness, relevance) and to make sure that students have not plagiarised or otherwise cheated.
‘Criterion,’ therefore, is supposed to work with the basic mechanics, while the teachers are supposed to work with the intellectual side.
Intriguing is how ‘Criterion’ evaluates an essay. It uses a ‘scoring engine,’ which has been programmed and thereby ought to raise questions about how inclusive the programme is in regard to linguistic acceptability.
According to the website consulted, the ‘scoring engine’ ‘process[es] information ... [but] cannot read and analyze’ what is written. Hence, ‘Criterion’ does not evaluate content, which ordinarily is the rationale for an essay being written.
Two related dangers are that (1) if something correct or valid is not in the database for the ‘scoring engine,’ a person’s score can be negatively affected; and (2) because the database for the ‘scoring engine’ is created by rulings from ‘trained faculty readers,’ it is not likely that everything which could be acceptable is covered, regardless of how expert the implied trainers are.
These two serious defects were acknowledged to exist: (1) the ‘scoring engine’ ‘can be fooled by an illogical, yet well-written, essay’; and (2) ‘Criterion’ cannot catch plagiarism or other forms of cheating.
If the teacher is actively engaged with the students in the process of doing writing assignments, these defects can be overcome. But, if ‘Criterion’ is being used – without any human checks – simply to evaluate essays for grading purposes, a devious examinee can benefit from academically unethical behaviour.
Tangentially, because the ‘scoring’ is done by a machine, it is reasonable to wonder how the students feel about this. The website claimed that ‘the majority of students are comfortable with ... automated testing and grades,’ but whether this is true is open to question, especially since school children are not likely to know anything other than what they are exposed to.
Problems, whether termed ‘biases’ or something else, were therefore detected to exist from consulting the main website about ‘Criterion.’ Even though the website seemed to serve as a promotional advertisement, an impression received was that ETS did feel that valid concerns needed to be addressed and, in a sense, that potential users ought to be alerted to its limitations.
Having become aware of these problems, I nevertheless felt that ‘Criterion’ is not useless, and that it ought to be applied more broadly and constructively – that is, not just as a means to evaluate the English writing skills of first-year university students. A few recommendations are given in the paper, but they can be skipped here so that more attention is paid to potential social values and some observations from the feedback given recently by some first- and second-year students at Nagoya University.
Japan places a high value on examinations as devices for evaluating and classifying people, and examinations are taken with a social competition in mind (notably, getting into an educational institution at the next higher level) or as a personal ‘challenge.’ It therefore makes sense that Japanese educators might consider using ‘Criterion’ in this context, and ‘Criterion’ could turn out to be useful as a lower-order test attached to another English test which does not assess writing skills.
‘Criterion’ could be used to help teach the fundamentals of written English. An advantage is that it can help non-native teachers cover fundamentals and check for errors in spelling and grammar. It might also help shift the focus of English learning toward practical skills (written, in particular, but also oral), and away from the longstanding skew toward grammar and vocabulary.
Cultivating good skills in written English can be useful not just for academic purposes, but also for international correspondence. Given that economic and political integration among countries and regions is likely to continue, that English has essentially become the world’s lingua franca, and that written correspondence seems to be increasing in volume through computers and multitasking handheld gadgets (i.e. ‘cellphones’), being able to correspond reasonably well in English promises to be a useful skill.
Using ‘Criterion’ to develop the fundamentals of written English would contribute to this, but it would also be useful preparation for using software such as ‘Microsoft Word’ and being aware of its merits and demerits, especially the mechanical limitations of a language programme (e.g. the spelling and grammar checking part of ‘Word’).
Working with ‘Criterion’ can familiarise students with working on a computer, something which is hardly learned in the Japanese schools and which is possibly also neglected elsewhere in East and Southeast Asia.
A direct benefit is that students learn to use the machinery, software, and programmes – that is, to be ‘technologically engaged’ – and getting used to the technology encourages further use of it, including accessing the Internet for serious purposes such as writing essays or getting information and ideas for projects in the various subjects. While doing so, they can also learn about the limitations and drawbacks of using higher technology, including how programmes can be used deviously – and hence be instructed about why honesty is better. Knowing about the weaknesses, not just the strengths, of higher technology can also offer the benefit of appreciating human qualities such as competence, attitudes, and personality.
Nagoya University is a research-oriented institution, so it is likely that data will be collected and analysed over the coming years to determine whether its ‘Criterion’ examinations have yielded any meaningful results; that information would, one hopes, be made available to academic and bureaucratic specialists within the field of education.
An impression is that the first year of university is the wrong time to have students take the ‘Criterion’ test; ‘Criterion’ seems more suitable as an evaluator and a teaching companion at the secondary level in countries where English is being taught as a ‘non-native’ language.
When a programme such as ‘Criterion’ is considered for use in a society outside the United States, it is important that it be investigated carefully to understand its merits and demerits.
Since the ‘New Curriculum’ was adopted at Nagoya University, I have informally sought information, ideas, feelings, etc. from the students about the ‘e-learning’ components and the ‘TOEFL’ and ‘Criterion’ exams, and the general sentiment is not a favourable one. Nearly all of the students claim to be against ‘e-learning.’
In mid-October, the faculty in the English Department were given the results of formal surveys conducted in 2010 and 2011 about one of the programmes, a rather involved one for second-year students called ‘eFACE’ which comprises, in essence, an entire course (to do all twelve units requires around twenty-four hours, while a semester course comprising fourteen class meetings entails twenty-one hours). A conclusion from the two surveys is that students in 2011 felt better about the programme than did those in 2010, and that this presumably reflects some of the changes made for the 2011 version.
There have not been, however, any surveys taken to assess what the students think about taking the ‘TOEFL’ and ‘Criterion’ exams, so after completing the paper for this conference, I solicited some thoughts about ‘Criterion’ from my students who have taken it.
The questions were open-ended so the students could make comments freely and not be guided by the wording of the questions or – as can happen when a series of possible (multiple-choice) answers is presented – the answers. First, they were asked for their feelings about taking the ‘Criterion’ test, what they thought were its good points and its bad points, and anything else (as ‘etc.’). Second, after sufficient time had been given for these, they were asked these two specific questions: (1) What was the purpose of taking ‘Criterion’?, and (2) What is/are the social value(s) of taking ‘Criterion’?
Some recurrent themes from their answers which are germane to this conference are presented next.
Comments Pertaining to ‘Feelings’ or Thoughts in General
A common sentiment seems to be that students do not like (taking) the test. Inadequacy in English was sometimes cited as a reason.
Some students do not like the fact that the test is taken on a PC. Some of these stated that they are not good at using a PC or a computer, while others pointed out an unfairness existing in the typing – not only is an ability to type required, but also the speed of typing plays a role in how much gets written. [Tangential note: This contrasts with my experience at the University of Washington, where some students complained about having had a limited amount of time to write – manually, with a pen or pencil – answers to questions on tests; the students in America had become more accustomed to using computers than to writing by hand.]
Recurring adjectives under ‘Feelings’ included ‘difficult,’ ‘tired’ or ‘tiring,’ ‘bored’ or ‘boring,’ and ‘troublesome’ or ‘bothersome.’ ‘Difficult,’ though, need not be problematic, if it is referring to the content of the test (i.e. an ‘easy’ test is not likely to be useful). Such negative-leaning adjectives, however, indicate that taking the test could fall into the category of ‘distractions,’ which are increasing at Nagoya University and probably elsewhere.
A few students did not like having to go to university on a ‘holiday,’ which in this case means on the weekend when classes are not held, and when students might ordinarily have other things to do.
Some students were not satisfied with the six-point scoring system. They did not know what their scores actually meant, and some wanted advice and/or information about mistakes.
From the ‘Good Points’
Students can find out about their level/skill/ability of/in (written) English. This would refer to their ability/skill in a raw (‘genuine’ as one respondent said) sense – as some students pointed out, they cannot use dictionaries or other materials, and they can write freely about the given topic.
Taking the ‘Criterion’ test can provide a reason for studying English, perhaps more so than usual. In other words, it serves as a form of motivation.
From the ‘Bad Points’
Having to use a computer: this included problems with typing and occasional mechanical malfunctions. Using a computer and having to type were cited the most under ‘bad points.’
Not being allowed to use dictionaries and other materials which might be helpful when writing the essay. (This, however, has to be accepted as an intrinsic part of the test.)
Insufficient time for writing the essay.
A computer determines the score; the content of the essay is not evaluated; there is no feedback in regard to mistakes etc.; and the meaning of the earned score is generally not understood.
Comments Touching on the ‘Purpose’ of the Test
It is a way to find out about one’s level/skill/ability of/in written English.
It is a means for placing students. (The ‘Criterion’ score was used to determine who was placed in a ‘Survival English’ course, which the students’ comments suggested was something to avoid.)
It was/is a requirement to be met in order to earn the credits for a certain course (‘Middle-Level English’).
Note: All three of these ‘purposes’ are correct.
Comments Touching on ‘Social Values’
Students might have to use English later in their lives. ‘Globalisation’ and ‘international society’ were mentioned, as were working in foreign countries and English-language documents. Relevant observations touched on the fact that occasions for communicating in English have been increasing, and this included corresponding within ‘the global society.’ Related was the idea that learning English contributes to internationalising people.
Taking the ‘Criterion’ examination makes students study more, in the sense of being educationally beneficial; included in this regard were its contribution to improving the English abilities of Japanese and its usefulness for practising for the ‘TOEIC.’
One comment which was raised suggests that taking the ‘Criterion’ examination could be linked to introducing education which uses computers and the Internet, which is to say that the test contributes to familiarising students with IT (information technology). A related comment said that the examination provides practice in writing English on a computer. [Note: As it is used at Nagoya University, ‘Criterion’ offers only limited practice, but if the ‘Criterion’ service were more fully employed, it would be useful for getting practice writing on, and otherwise using, a computer.]
The examination offers an opportunity to express one’s thoughts. This was noted as useful preparation for the work force.
One student thought that using ‘Criterion’ cuts costs. [Perhaps this is true, but it is not clear whether costs are cut, or even whether reducing interactions with human beings is good.]
The previous comments pertaining to social values might be considered as positive, but there were some negative comments, which are listed below.
One is that the ‘Criterion’ score is not used in the world at large.
Another is that (unqualified) offices do not place an emphasis on fast essay-writing, as does the ‘Criterion’ examination.
Many students wrote the likes of ‘I don’t know,’ ‘no idea,’ ‘no social value,’ and ‘nothing’ when answering the question about ‘social value(s).’ [This, of course, pertains to the examination, and the students are probably not aware of the broader use of ‘Criterion.’]
Having worked through the students’ responses to my questions, I found some significant correspondence between what they said and what I had written in the paper for this conference. The main areas of correspondence are summarised in the following four points.
(1) The examination is a means to study more, in this case English. (It might seem cynical, but other than studying for educational benefit and occupational development, studying has the added social function of keeping people busy, especially in a wealthy country such as Japan.)
(2) English is important in the ‘global society,’ and there seems to be some value placed on ability to communicate with foreigners, including through written correspondence.
(3) ‘Criterion’ can provide, or at least encourage, exposure to computer-based technology. (Even if students use ‘Criterion’ only on the examination days, it might alert some students to the need for learning how to operate a keyboard and to learn to type reasonably fast; of course, if the programme were being used with some consistency throughout the semesters, such skills would most likely be improved among all students.)
(4) Being able to express one’s thoughts fits in with the proposition that students in the secondary schools should be learning more practical English. To become competent at oral and written English requires being able to think things through and then explain them; an added benefit is that this can be applied to skills in Japanese.
Judging from what I have learned about ‘Criterion’ and how it has been used at Nagoya University, I would offer these questions for consideration by institutions wanting to purchase and/or to use higher technology for educational purposes.
(1) Is the machinery, software, programme, or such like going to be used at an appropriate level? For example, ‘Criterion’ would seem to be more suitable for secondary schools, where basic linguistic skills are supposed to be cultivated, than for a proper university.
(2) Will the machinery, software, programme, or such like be used constructively? For example, ‘Criterion’ would seem to be better used for cultivating English writing skills over the course of at least one academic year, rather than just for testing students once or twice to find out where they rank in terms of such skills.
(3) What will be the impact of the technology on the teaching staff? That is, will the technology help the teachers convey information, ideas, and insights, or will the technology as applied essentially turn the teachers into technicians? For example, although ‘Criterion’ can help reduce basic errors in written English, it cannot serve as an intellectual substitute.
(4) Are indirect benefits available from using the technology? For example, ‘Criterion’ can help students learn to use computers, to type, and to find out that technology has limitations.
(5) Will the knowledge and skills acquired by using the technology promise to be useful for life beyond education, or for society at large? For example, ‘Criterion’ can be a stepping-stone toward engaging in international correspondence.