Faculty of Education

Issues with assessment: How did we get here? Where should we go now?
Harry Torrance
Education and Social Research Institute (ESRI), Manchester Metropolitan University, UK
h.torrance@mmu.ac.uk
18/01/14

• Trends in exam results since the 1970s, including national test results since the 1990s
• Criticisms and explanations of rising test results
• Political intervention and recent exam results
• Proposals for change
• Problems with the proposals for change and the need for more investment in teacher development

Figure 1: % pupils gaining National Curriculum Assessment level 2 or above at age 7, England

Figure 2: % pupils gaining National Curriculum Assessment level 4 or above at age 11, England

Figure 3: % of pupils gaining O-level/CSE grade 1/GCSE & Equivalents 1975-2010, England

Figure 4: % A-level Passes, 1980-2010, England, age 18 (1980: n=567,027; 2010: n=784,877)

Figure 5: % Distribution of A-level grades, E-A, 1980 and 2010, England

But this is not just a UK phenomenon: Figure 6:

Background to rising test scores: educational and political aspirations over 25+ years:
• Selection, certification and qualifications
• Human resources development and education for all
• Criterion referencing, clarity of outcomes and the development of ‘content standards’
• Social justice and educational inclusion
• Summative and formative assessment

Changes in Assessment over 25+ years:
• More coursework, practical work, oral work, project work, fieldwork
• More modular and incremental assessment + re-sits
• More formative assessment and feedback

So…explanation of rising scores?
i) Some element of a genuine rise in standards, driven by better socio-economic conditions of students, higher expectations of educational outcomes by students, parents and teachers, and better teaching;
ii) an increasingly focused concentration on passing exams, by both teachers (‘teaching to the test’) and the majority of students (extrinsic motivation), because of the perceived importance of educational success in institutional accountability and individual life chances;
iii) the increased transparency of modular, criterion-referenced assessment systems, which affords teachers and students much more opportunity to improve grades through coaching, specific feedback and resubmission of work, and to practise for tests.

The problem: narrowing the curriculum and ‘teaching to the test’
• In many [primary] schools the focus of the teaching of English is on those parts of the curriculum on which there are likely to be questions in national tests…History and, more so, geography continued to be marginalized…In [secondary] schools…the experience of English had become narrower…as teachers focused on tests and examinations...There was a similar tension in mathematics… (OfSTED 2006 pp.52-56)
• In an effort to drive up national standards, too much emphasis has been placed on a single set of tests and this has been to the detriment of some aspects of the curriculum and some students (Parliamentary Select Committee, reported on BBC, 13 May 2008)
• There are considerable concerns…that the system is too ‘high stakes’, which can lead to unintended consequences such as over-rehearsal and ‘teaching to the test’ (Bew Report on KS2 Testing 2011, p. 9)

International evidence is similar: this isn’t just an issue for the UK
• All systems which try to use assessment for accountability purposes, and use test results as targets rather than measures, face the same issues
• e.g. USA: changes included a narrowing of the curriculum and instruction toward tested topics and even toward certain problem styles or formats. Teachers also reported focusing more on students near the proficient cut-score… (Hamilton et al. 2007, US National Science Foundation Evaluation of NCLB, Summary: p. xix)
• NB the same applies in other human service systems (health service, banking, etc.): when measures become used as targets, this corrupts the measure.

Also the issue of coursework and feedback:
…greater transparency of intended learning outcomes and the criteria by which they are judged, and…Clarity in assessment procedures [and] processes…has underpinned the widespread use of coaching, practice and provision of formative feedback to boost individual and institutional achievement…However…such transparency encourages instrumentalism…transparency of objectives coupled with extensive use of coaching and practice to help learners meet them is in danger of removing the challenge of learning and reducing the quality and validity of outcomes achieved. This might be characterized as a move from assessment of learning, through the currently popular idea of assessment for learning, to assessment as learning, where assessment procedures and practices come completely to dominate the learning experience, and ‘criteria compliance’ comes to replace ‘learning’ (Torrance 2007 p. 282)

Confidence in the qualifications and assessment system has been diminishing...The usefulness of the system has been eroded by the politicisation of assessment outcomes, by universities’ loss of confidence in A levels as a certificate of readiness for university-level study, by employers’ loss of confidence in GCSEs and A levels as certification of relevant knowledge and skills, and by the disproportionate burden placed by external assessment on pupils, teachers and schools. The volume of external assessment has also grown enormously….This process has undermined the credibility of teacher and school assessment, as well as limiting and undermining teaching (Sykes Review 2010, p.4)

Michael Gove, 17/09/12, on the proposed new ‘EBC’:
changes made to GCSEs, specifically the introduction of modules and the expansion of coursework, controlled assessment, undermined the credibility of exams...We believe it is time to tackle grade inflation and dumbing down. And we believe it is time to…restore rigour to our examinations…We want to ensure that modules - which encourage bite-size learning and spoon-feeding, teaching to the test and gaming of the system - go…We want to remove controlled assessment and coursework…

Elizabeth Truss (schools minister, on reform of A-level), 23/1/13:
Pupils spend too much time thinking about exams and resits of exams that encourage a 'learn and forget' approach to studying…We want to end the treadmill of repeated exams…We want questions that encourage students to think and prepare for university study. Not a satnav series of exams.

Pressure for change has produced a plateau and then downturn in results:
• GCSE summer exams (i.e. headline figures), A*-C: 2011 70%; 2012 70%; 2013 68.7%
• A-level summer exams, A* + A: 2011 27%; 2012 26.6%; 2013 26.3%
• NB a reduction of 0.7 percentage points (i.e. 0.7% of all entries) = c.6000 fewer A* and As
• Also, while KS2 results continue to edge up into the 80%+ range at level 4, results on the new grammar & punctuation test are lower (2013: 74% at level 4) and only 75% reach level 4 in reading, writing & maths combined
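The "c.6000 fewer A* and As" figure can be checked with a rough calculation. The sketch below is illustrative only: it assumes the fall is measured from 2011 to 2013, and it uses the 861,819 A-level results that JCQ issued in August 2012 (quoted later in these slides) as a stand-in for the number of entries, since the slides do not give the 2013 total. The variable names are hypothetical.

```python
# % of A-level entries graded A* or A, as quoted on the slide
a_star_a_share = {2011: 27.0, 2012: 26.6, 2013: 26.3}

# Assumption: the August 2012 JCQ A-level total is used as a proxy
# for the number of entries; the 2013 total is not given in the slides.
a_level_entries = 861_819

# Fall in the A*/A share between 2011 and 2013, in percentage points
drop_points = round(a_star_a_share[2011] - a_star_a_share[2013], 1)

# Approximate number of entries that this fall represents
fewer_top_grades = round(a_level_entries * drop_points / 100)

print(f"Drop: {drop_points} percentage points")
print(f"Roughly {fewer_top_grades} fewer A* and A grades")
```

On these assumptions the result is close to 6,000, which is consistent with the slide's rounded figure.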

Proposals for change:
• English Baccalaureate Certificate (EBC) to focus on: Maths, English, Science, Language(s), History/Geography
  Implication: narrow the curriculum
• Reform of GCSE/EBC exams & A-level: terminal exam papers at the end of 2 years
  Implication: reduce sample of work available for assessment and produce invalid and unreliable assessment
• Move to 9 grades at GCSE/EBC, numbered 9-1, initially in English Lang. & Lit. and Maths; c.20 other GCSE subjects to follow
  Implication: more distinction at top end, but can grades be so finely calibrated? Will 9-1 be understood?
• NB also new test of grammar, punctuation and spelling introduced at KS2 in 2013…and OfSTED Chief Inspector, Michael Wilshaw, has called for reintroduction of tests at 7 and 14

But the issue is not about the modes and methods of assessment per se - it is about the volume of assessment and the pressure to produce results: accountability
In August 2012, JCQ issued:
• 2,308,527 GCE results. This figure includes:
  • 861,819 results for GCE A-level
  • 1,350,345 results for GCE AS
  • 6,636 results for Applied GCE double award A-level
  • 32,447 results for the Applied GCE single award A-level
  • 7,113 results for Applied GCE double award AS
  • 50,167 results for the Applied GCE single award AS
• 5,638,240 GCSE results. This figure includes:
  • 5,225,288 GCSE Full Courses
  • 371,352 GCSE Short Courses
  • 41,600 GCSE Double Award
(JCQ website)

The ‘Educationalist case’ for a wide range of assessment methods: issues of teaching, learning and inclusion:
• Social & Economic Need/Curriculum Change: developing new skills and understandings - investigation, problem-solving, analysing, applying - open rather than closed tasks in naturally occurring or ‘authentic’ settings
• Teacher/Assessor Expertise: use of local circumstances and resources, including more flexible teaching methods and learning environments; detailed knowledge of candidate built up over time
• Responsiveness to Students: rendering academic/scholarly work more relevant/meaningful/useful; shorter term goals; formative feedback from teachers/tutors; recognising achievement other than the ‘academic’; self/peer assessment and the development of understanding
• Inclusion/Equity: combined impact of all of the above, focussing on what students know, understand and can do in different circumstances

The ‘Examiner case’: issues of fitness-for-purpose, reliability and validity
• Complementary assessment of same objectives as written papers re. subject content: i.e. increasing the size of the sample of assessed work, undertaken under more ‘authentic’ circumstances - e.g. open book exams (reliability)
• Assessment of other objectives: broadening the scope of the sample of assessed work (validity)
• Assessment of objectives where evidence is ephemeral: broadening the quality of assessed work, e.g. speaking & listening, practical work in situ (fitness for purpose/validity)
• Assessment of unanticipated outcomes: broadening the responsiveness of assessment (validity)

To reiterate:
• EBC to focus on: Maths, English, Science, Language(s), History/Geography
  Implication: narrow the curriculum
• Reform of exam methods: terminal exam papers at the end of 2 years
  Implication: reduce sample of work available for assessment and produce invalid and unreliable assessment
• The issue is not about the modes and methods of assessment per se - it is about the size of the system and the pressure to produce results: accountability
• Gove has seen there is a problem, but has come up with the wrong solution

i) just because assessment can be observed to have negative backwash effects on the curriculum and teaching, this doesn’t necessarily mean that the same mechanism is available to harness these effects to beneficial purposes;
ii) the impact of ‘assessment for learning’ on students’ knowledge and understanding will inevitably be mediated by the accountability context in which it operates;
iii) criterion-referencing enables the structure of knowledge domains and the processes of assessment associated with them to be more transparent, such that more students can achieve more success, but the very nature of that success threatens its credibility, i.e. the inferences that can be drawn from it

Assessment intersects with every aspect of an educational system: at the level of the individual student and teacher and their various experiences (positive or negative) of the assessment process; at the level of the school or similar educational institution and how it is organized and held to account; and at the level of the educational and social system with respect to what knowledge is endorsed and which people are legitimately accredited for future economic and social leadership.

Implications for policy: minimise the impact of accountability on student experience
• The greater the scale and scope of the testing system, the simpler the tests will be and hence the narrower the curriculum will become
• The more individual student achievement is tied to system accountability, the more accountability measures will dominate student experience
Therefore:
• restrict testing to a politically necessary minimum;
• attend to monitoring standards by use of small national samples;
• re-conceptualise the integration of curriculum development and assessment, i.e. put resources and support into re-thinking curriculum goals for the 21st century and developing illustrative examples of high quality assessment tasks that underpin these goals, for teachers to use as appropriate;
• include a broader range of indicators of educational experience and outcomes in accountability and inspection regimes

Education as induction into knowledge is successful to the extent that it makes the behavioural outcomes of the students unpredictable (Stenhouse 1975 p. 82)

References:
• Torrance H. (2007) ‘Assessment as Learning? How the use of explicit learning objectives, assessment criteria and feedback in post-secondary education and training can come to dominate learning’ Assessment in Education 14, 3, 281-294
• Torrance H. (2011) ‘Using Assessment to Drive the Reform of Schooling: Time to Stop Pursuing the Chimera?’ British Journal of Educational Studies 59, 4, 459-485
• Torrance H. (Ed.) (2013) Educational Assessment and Evaluation, four volume set in Routledge Major Themes in Education Series, Routledge
• Wyse D. & Torrance H. (2009) ‘The development and consequences of national curriculum assessment for primary education in England’ Educational Research 51, 2, 213-238
• See also: Joint Council for Qualifications: http://www.jcq.org.uk/examination-results
