
Test comparability across languages

Test comparability across languages. Ülle Türk University of Tartu/Estonian Defence Forces ulle.turk@ut.ee/ylle.tyrk@mil.ee. Topics to be discussed. 1. Issues in comparing examinations in different languages (on the basis of the Estonian Year 12 examinations).


Presentation Transcript


  1. Test comparability across languages Ülle Türk University of Tartu/Estonian Defence Forces ulle.turk@ut.ee/ylle.tyrk@mil.ee

  2. Topics to be discussed 1. Issues in comparing examinations in different languages (on the basis of the Estonian Year 12 examinations). 2. Some ways of achieving the comparability of examinations in different languages (on the basis of the Finnish matriculation examinations). 3. Relating language examinations to the CEFR – general principles. 4. Relating reading papers to the CEFR. 5. Relating writing papers to the CEFR.

  3. Issues in comparing examinations in different languages (on the basis of the Estonian Year 12 examinations)

  4. Why compare examinations across languages? • Needs of test users • University admissions officers • Employers • Teachers, students, parents • Increasing mobility of the population • Increasing consumer choice • A growing emphasis on accountability

  5. Equivalence of examinations • Equivalent forms = • Different versions of the same test, which are regarded as equivalent to each other in that they are based on the same specifications and measure the same competence. To meet the strict requirements of equivalence under classical test theory, different forms of a test must have the same mean difficulty, variance, and co-variance, when administered to the same persons. ALTE. 1998. Multilingual Glossary of Language Testing Terms.
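
The three conditions in the ALTE definition can be checked numerically once two forms have been taken by the same persons. The following is a minimal sketch under stated assumptions: the score lists, the external criterion used for the covariance comparison, and the helper names `covariance` and `compare_forms` are all hypothetical and do not come from the presentation.

```python
from statistics import mean, pvariance


def covariance(xs, ys):
    """Population covariance of two equally long score lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def compare_forms(form_a, form_b, criterion):
    """Differences in mean, variance and covariance with a shared criterion.

    Values close to zero are what strict equivalence would require.
    """
    return {
        "mean_diff": mean(form_a) - mean(form_b),
        "variance_diff": pvariance(form_a) - pvariance(form_b),
        "covariance_diff": covariance(form_a, criterion)
        - covariance(form_b, criterion),
    }


# Hypothetical scores of the same five persons on two forms,
# plus a hypothetical external criterion measure.
form_a = [12, 15, 9, 18, 14]
form_b = [13, 14, 10, 17, 14]
criterion = [50, 62, 41, 75, 58]

result = compare_forms(form_a, form_b, criterion)
print(result)
```

In this made-up data the two forms have the same mean but different variances, so they would not count as equivalent in the strict sense.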

  6. Year 12 examinations in Estonia = national school-leaving examinations established in 1997 • centrally developed, administered and marked • test the learning outcomes of the National Curriculum (2002) • contain tasks at different levels of difficulty • results on a 100-point scale • offered in 13 subjects • four foreign languages: English, French, German, Russian • students required to take three: mother tongue + two more

  7. Foreign language examinations • National Curriculum for foreign languages • two compulsory foreign languages • FL A: grades 1–3; 3 hrs per week/ 2 hrs per week • FL B: grades 4–6; 3 hrs per week/ 2 hrs per week • level B2 in ONE foreign language • Ministry regulations on the development, administration and grading of examinations and reporting results • five papers: listening and reading comprehension, speaking, writing, language structures • equally weighted (20 points each) • High-stakes examinations • results used for university entrance • foreign language requirement: English, French, German

  8. Questions • Does 93 in the examination of 2006 reflect the same level of competence as 93 in the examination of 2004 (or 2001)? • Is 63 in English the same as 63 in German, French or Russian? • What does ‘the same’ mean? • The same level of competence? → B2 • The level of competence reached after the same amount of work? • US Foreign Service Institute (Jackson & Kaplan, 1999: 78) • US Defense Language Institute Foreign Language Center (MacWhinney 1995: 294) • Threshold Level (Trim)

  9. Year 12 examinations in FLs

  10. German & English: reading paper • 50 minutes • Three texts with tasks • Length of texts: 1500 words • No of tasks: • German: one task per text • English: one or two tasks per text • No of items • German: 20 • English: 40

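  11. Reading: mean scores • English, 2001: Paper 61%; Text 1 71%; Text 2 72%/66%; Text 3 69%/35% • English, 2004: Paper 68%; Text 1 78%; Text 2 75%; Text 3 57% • German, 2001: Paper 61%; Text 1 67%; Text 2 67%; Text 3 55% • German, 2004: Paper 60%; Text 1 84%; Text 2 62%; Text 3 46%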

  12. Text types

  13. Task types

  14. More questions • The reading paper in English seems more difficult than that in German (B2/C1? vs. B1/B2?) • Why, then, are the mean scores for the German examination the same as or lower than the mean scores for the English examination? • Are students who take German less motivated and less bright? • Or does it take longer for Estonian students to reach the same level of competence in German than in English?

  15. Foreign languages in Estonia

  16. U.S. Government Proficiency Ratings (Jackson & Kaplan, 1999: 73)

  17. Approximate Learning Expectations at the Foreign Service Institute (Jackson & Kaplan, 1999: 78)

  18. The Defense Language Institute Foreign Language Center (MacWhinney 1995: 294)

  19. Some ways of achieving the comparability of examinations in different languages (on the basis of the Finnish matriculation examinations)

  20. Preparing examinations • Joint examination board for foreign languages (16 people) • 2–3 members representing each language • Language groups – 5–6 people • Markers (for English: 20–30 people) • Collective responsibility: the board as a whole is responsible for the quality of all language examinations

  21. Process of examination preparation • Language groups • Agree on text and task types • Agree on the schedule of work • Find the texts (2–3 times as many as will be needed) • Design the materials • The board discusses all the examination papers • Constructive criticism • Suggest changes/improvements • Language groups make the necessary changes • All the members of the language group read the test materials to make sure that they contain no mistakes

  22. Post-examination analysis • 3–4 weeks for marking the papers • Teachers mark their own students’ papers first using very detailed marking schemes, but their marks do not count. • If the central marker’s grade differs too greatly from that given by the teacher, a second marker is brought in. • The whole board analyses the examination results • If a question does not ‘work’, all students are awarded a point for it. • Item difficulty is taken into consideration when awarding the grades.
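
The two item-level steps on this slide (checking item difficulty and awarding everyone the point for a question that does not work) can be sketched as follows. This is only an illustration: the response matrix is invented, the helper names are hypothetical, and the rule used to flag a faulty item (facility below 0.1) is an assumption, since the presentation says only that the board as a whole decides.

```python
def item_facility(responses):
    """Proportion of correct answers per item.

    `responses` is a list of per-student lists of 0/1 item scores.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        sum(student[i] for student in responses) / n_students
        for i in range(n_items)
    ]


def award_flawed_items(responses, flawed):
    """Give every student the point for each item index in `flawed`."""
    return [
        [1 if i in flawed else score for i, score in enumerate(student)]
        for student in responses
    ]


# Hypothetical responses of four students to four items.
responses = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 0],
]

facility = item_facility(responses)

# Assumed flagging rule: an item almost nobody answers correctly.
flawed = {i for i, p in enumerate(facility) if p < 0.1}

adjusted = award_flawed_items(responses, flawed)
totals = [sum(student) for student in adjusted]
print(facility, flawed, totals)
```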

  23. Grading • 7 grades based on norm referencing • Laudatur (5%) • Eximia (10%) • Magna cum laude (20%) • Cum laude (30%) • Lubenter approbatur (20%) • Approbatur (10%) • Improbatur (2–5%) • reliability: 0.9–0.95
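
Norm-referenced grading of this kind assigns grades by rank rather than by fixed cut scores. The sketch below shows one way to do it, assuming the quotas listed on the slide, a fixed 5% share for Improbatur (the slide gives 2–5%), and no special handling of tied scores. The function name `assign_grades` and the sample scores are hypothetical.

```python
# Grade quotas in percent, from the highest grade to the lowest.
QUOTAS = [
    ("Laudatur", 5),
    ("Eximia", 10),
    ("Magna cum laude", 20),
    ("Cum laude", 30),
    ("Lubenter approbatur", 20),
    ("Approbatur", 10),
    ("Improbatur", 5),  # assumption: upper end of the 2-5% range
]


def assign_grades(scores):
    """Return a grade for each score, filling the quotas from the top rank."""
    n = len(scores)
    # Candidate indices ordered from the highest score to the lowest.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    grades = [None] * n
    cumulative = 0
    start = 0
    for name, share in QUOTAS:
        cumulative += share
        end = round(cumulative * n / 100)
        for i in order[start:end]:
            grades[i] = name
        start = end
    return grades


# Hypothetical cohort of 20 candidates with distinct scores 60, 62, ..., 98.
scores = list(range(60, 100, 2))
grades = assign_grades(scores)
print(list(zip(scores, grades)))
```

With 20 candidates the quotas translate into 1, 2, 4, 6, 4, 2 and 1 candidates per grade. A real scheme would also need a rule for tied scores.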

  24. Relating language examinations to the CEFR – general principles

  25. Relating an examination or test to the CEF is a complex endeavour. The existence of such a relation is not a simple observable fact, but is an assertion for which the examination provider needs to provide both theoretical and empirical evidence. The procedures by which such evidence is put forward can be summarized by the term “validation of the claim.” Relating Language Examinations to the CEF: Manual (Preliminary Pilot Version), 2003: 1

  26. Procedures (1) • Familiarisation: • A selection of activities designed to ensure that participants in the linking process have a detailed knowledge of the CEFR • Specification: • A self-audit of the coverage of the examination (content and task types) in relation to the categories presented in CEFR Chapters 4 (Language use and the language learner) and 5 (The user/learner’s competences)

  27. Procedures (2) • Standardisation: • Suggested procedures to facilitate the implementation of a common understanding of the Common Reference Levels presented in CEFR Chapter 3 • Empirical validation: • The collection and analysis of test data and ratings from assessments in order to provide evidence that both the examination itself and the linking to the CEFR are sound

  28. Specification of examination content (1) • Familiarisation with CEFR • Consideration of a selection of the question boxes printed at the end of relevant sections of CEFR chapters • Discussion of the CEFR levels as a whole • Self-assessment of own language level in a foreign language • Sorting individual CEFR descriptors into levels

  29. Specification of examination content (2) • Internal validity: Description and analysis of • general examination content • process of test development • marking, grading, results • test analysis and post-examination review • External validity: Relate • general examination description to CEFR scales • description of communicative activities tested to CEFR scales • description of aspects of communicative language competence tested to CEFR scales

  30. Standardisation of judgements (1) • Familiarisation with CEFR as in the Specification stage (2 h) • Productive skills • Training in assessing performance in relation to CEFR levels using standardised samples (3–4 h/skill) • Benchmarking local performance samples to CEFR levels (3–4 h/skill)

  31. Standardisation of judgements (2) • Receptive skills • Training in judging the difficulty of test items in relation to CEFR standardised items (3–4 h/skill) • Judging the difficulty of local items in relation to CEFR levels (3–4 h/skill)

  32. Empirical validation • Data collection • Internal validation: • Confirming the psychometric quality of the test • External validation: • Confirming the relationship to the CEFR through an independent measure
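
Both kinds of validation rest on simple statistics. The sketch below is a hedged illustration only: the presentation does not say which coefficients are used. Here Cronbach's alpha stands in for the psychometric quality check (internal validation), and a Pearson correlation with an independent measure stands in for external validation. All data and function names are hypothetical.

```python
from statistics import mean, pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-candidate lists of item scores."""
    k = len(item_scores[0])
    item_vars = [
        pvariance([candidate[i] for candidate in item_scores])
        for i in range(k)
    ]
    total_var = pvariance([sum(candidate) for candidate in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Hypothetical 0/1 item scores of five candidates on four items.
item_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

# Hypothetical scores of the same candidates on an independent measure.
external = [78, 70, 55, 41, 30]

alpha = cronbach_alpha(item_scores)
totals = [sum(candidate) for candidate in item_scores]
r = pearson_r(totals, external)
print(alpha, r)
```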

  33. Activities 1 • Familiarisation • Consideration of a selection of the question boxes printed at the end of relevant sections of CEF chapters 3, 4 and 5 • Discussion of the CEFR levels as a whole • Table 1. Common Reference Levels: global scale (p 24) • Sorting individual CEFR descriptors into levels • Spoken Fluency (p 129) • General Linguistic Range (p 110)

  34. 3 Common Reference Levels • Users of the Framework may wish to consider and where appropriate state: • to what extent their interest in levels relates to learning objectives, syllabus content, teacher guidelines and continuous assessment tasks (constructor-oriented); • to what extent their interest in levels relates to increasing consistency of assessment by providing defined criteria for degree of skill (assessor-oriented); • to what extent their interest in levels relates to reporting results to employers, other educational sectors, parents and learners themselves (user-oriented), providing defined criteria for degrees of skill (assessor-oriented); • to what extent their interest in levels relates to reporting results to employers, other educational sectors, parents and learners themselves (user-oriented).

  35. 4 Language use and the language user/learner • Users of the Framework may wish to consider and where appropriate state: • in which domains the learner will need/be equipped/be required to operate. • Users of the Framework may wish to consider and where appropriate state: • the situations which the learner will need/be equipped/be required to handle; • the locations, institutions/organisations, persons, objects, events and actions with which the learner will be concerned.

  36. 5 The user/learner’s competences • Users of the Framework may wish to consider and where appropriate state: • what prior sociocultural experience and knowledge the learner is assumed/required to have; • what new experience and knowledge of social life in his/her community as well as in the target community the learner will need to acquire in order to meet the requirements of L2 communication; • what awareness of the relation between home and target cultures the learner will need so as to develop an appropriate intercultural competence. • Users of the Framework may wish to consider and where appropriate state: • on which theory of grammar they have based their work; • which grammatical elements, categories, classes, structures, processes and relations are learners, etc. equipped/required to handle.

  37. • B1: Can understand the main points of clear standard input on familiar matters regularly encountered in work, school, leisure, etc. Can deal with most situations likely to arise whilst travelling in an area where the language is spoken. Can produce simple connected text on topics which are familiar or of personal interest. Can describe experiences and events, dreams, hopes & ambitions and briefly give reasons and explanations for opinions and plans. • A1: Can understand and use familiar everyday expressions and very basic phrases aimed at the satisfaction of needs of a concrete type. Can introduce him/herself and others and can ask and answer questions about personal details such as where he/she lives, people he/she knows and things he/she has. Can interact in a simple way provided the other person talks slowly and clearly and is prepared to help. • C2: Can understand with ease virtually everything heard or read. Can summarise information from different spoken and written sources, reconstructing arguments and accounts in a coherent presentation. Can express him/herself spontaneously, very fluently and precisely, differentiating finer shades of meaning even in more complex situations.

  38. • C2: Can express him/herself at length with a natural, effortless, unhesitating flow. • B1: Can keep going comprehensibly, even though pausing for grammatical and lexical planning and repair is very evident, especially in longer stretches of free production. • A2: Can construct phrases on familiar topics with sufficient ease to handle short exchanges, despite very noticeable hesitation and false starts. • B2: Can interact with a degree of fluency and spontaneity that makes regular interaction with native speakers quite possible without imposing strain on either party. • A1: Can manage very short, isolated, mainly pre-packaged utterances, with much pausing to search for expressions, to articulate less familiar words, and to repair communication. • B1: Can make him/herself understood in short contributions, even though pauses, false starts and reformulation are very evident. • C1: Can communicate spontaneously, often showing remarkable fluency and ease of expression in even longer complex stretches of speech. • B2: Can produce stretches of language with a fairly even tempo; although he/she can be hesitant as he/she searches for patterns and expressions, there are few noticeably long pauses.

  39. • A1: Has a very basic range of simple expressions about personal details and needs of a concrete type. • B2: Has a sufficient range of language to be able to give clear descriptions, express viewpoints and develop arguments without much conspicuous searching for words, using some complex sentence forms to do so. • C1: Can select an appropriate formulation from a broad range of language to express him/herself clearly, without having to restrict what he/she wants to say. • B1: Has enough language to get by, with sufficient vocabulary to express him/herself with some hesitation and circumlocutions on topics such as family, hobbies and interests, work, travel, and current events. • A2: Can use basic sentence patterns and communicate with memorised phrases, groups of a few words and formulae about themselves and other people, what they do, places, possessions etc. • C2: Can exploit a comprehensive and reliable mastery of a very wide range of language to formulate thoughts precisely, give emphasis, differentiate and eliminate ambiguity. • A2: Can produce brief everyday expressions in order to satisfy simple needs of a concrete type: personal details, daily routines, wants and needs, requests for information. • B2: Has a sufficient range of language to describe unpredictable situations, explain the main points in an idea or problem with reasonable precision and express thoughts on abstract or cultural topics such as music and films.

  40. Sources • Common European Framework of Reference for Modern Languages: Learning, Teaching, Assessment (CEFR): http://www.coe.int/T/DG4/Portfolio/?L=E&M=/documents_intro/common_framework.html • Hardcastle, Peter. 2004. Test Equivalence and Construct Compatibility across Languages. University of Cambridge ESOL Examinations Research Notes, 17, August, 6–11. • Jackson, Frederick H. & Kaplan, Marsha A. 1999. Lessons learned from fifty years of theory and practice in government language teaching. In: Georgetown University Round Table on Language and Linguistics. Washington, DC: Georgetown University Press, 71–87.

  41. Sources • MacWhinney, Brian. 1995. Language-Specific Prediction in Foreign Language Learning. Language Testing, 12, 292–320. • Manual for relating language examinations to the Common European Framework of Reference for Languages – a preliminary pilot version: http://www.coe.int/T/DG4/Portfolio/?L=E&M=/documents_intro/Manual.html • Taylor, Lynda. 2004. Issues of test comparability. University of Cambridge ESOL Examinations Research Notes, 15, February, 2–5.
