A learner-generated corpus to direct learner-centered courses Clara Inés López Rodríguez University of Granada (Spain) Bryan Robinson María Isabel Tercedor Sánchez Maastricht, May 2005
Research and teaching context • R&D project of the Spanish Ministry of Education: PUERTOTERM: knowledge representation and the generation of terminological resources within the domain of Coastal Engineering (Reference BF 2003-04720) • Scientific and Technical translation classroom: English <> Spanish • Teaching innovation action Localización del Texto multimedia: generación de recursos en el aula de traducción científica y técnica (importance of audiovisual and multimedia material) • Close collaboration between teachers
Social constructivism, collaborative learning Kiraly (2000: 72)
Objectives • To build a sound foundation for new classroom activities. • To make students aware of the mistakes/errors that derive from a lack of understanding of the subject field and an inability to grasp the meaning of the text. • To analyze data extracted from a corpus of student translations so as to focus on the patterns and problems regularly generated by our learners. • To see the effect that meaning-related translation problems have on the type of mistakes/errors students make.
Theoretical foundations: use of corpora in the translation classroom • DIY corpus: Zanettin (2002), López Rodríguez (2002) • Learner corpus: Uzar (2002) • Evaluation corpus: Bowker (2001) • Quality: comparable corpus • Quantity corpus • Inappropriate corpus
Theoretical foundations: evaluation • Learner error vs. learning mistake (Miller 1966; Corder 1973) • Approaches to evaluation: Waddington (2001) • Analytical: Martínez Melis and Hurtado (2001) • Analytical + holistic: González Davies (2001) • Holistic: Waddington (1999, 2001) and Robinson (1998) Adapted from: Bryan Robinson 1998. Traducción transparente: métodos cuantitativos y cualitativos en la evaluación de la traducción. Revista de Enseñanza Universitaria (Número extraordinario):577-589. ISSN 1131-5245.
DECODING Criterion descriptors (Robinson 1998)
Advantages of this scale • Formative • Holistic • Transparent, easy to understand and conducive to self-assessment • The learner acquires “editor like” training • You can adapt the scale and give different weights to different aspects of the scale, depending on the level of the course and the directionality of translation
Adapting the descriptors to final year students translating into their mother tongue • Criterion 3: Translation brief and orientation to target text type (ES > EN) becomes Fluency and orientation to target text type (EN > ES) • Criterion 4: Written expression (ES > EN) becomes Translation brief and professional aspects (EN > ES)
Implementing learner-centered courses (I) • 3rd and 4th year translation students (EN<>ES) • Scientific translation, technical translation and localisation, audiovisual translation • Subject field: coastal engineering
Implementing learner-centered courses (II) Design of learner-centered activities: • Acquisition of field knowledge • Getting familiar with corpora and annotation • Identifying translation problems • Developing translation strategies • Evaluating the solutions given by peers and by themselves: self-assessment of mistakes/errors and acquisition of “editor like” training
1. Acquisition of field knowledge and relevant cognitive and lexical structures • Analysis of visual and multimedia aids • Reading skills: skimming and scanning • Compiling a DIY corpus • Conceptual modeling: elaborating ontologies and glossaries
Analysis of visual/multimedia aids http://cil-www.coas.oregonstate.edu:8080/frames/motivate/motivate.html http://coastview.ims.plym.ac.uk/video.html http://wldelft.nl/cons/appl/argus/index.html a. Describing and comparing images using the target language (Scene-and-frame semantics, Kussmaul 1991, 1995). b. Matching images with their captions. PROCESS-ORIENTED TERMINOLOGY AND TRANSLATION
CAPTIONS • Instrument deployment during SandyDuck’97. • Diver installing pressure transducer on the sea bed. • Time-lapse movie camera on Oregon Coast.
Argus collects three basic image products for beach studies. These include the traditional snapshot, showing wave activity, ten-minute time-exposure (timex) images of the wave dissipation patterns (revealing submerged sand bars and rip channels) and variance images (separating dynamic from steady areas of the image). The greatest scientific value currently comes from the timex images.
c. Associating images with the body of the text and with animations White bands in a timex image, locations of enhanced breaker dissipation, provide a proxy for submerged sand bars. Click for an AVI or RealPlayer version of an animation demonstrating the conversion of live video like that from Duck, NC on the left, into a finished timex image, below.
DIY corpus and conceptual modeling • assessing reliability of texts • improving documentation and terminology skills: elaborating ontologies and glossaries • acquiring the basic concepts that shape knowledge • interrelation between concepts • dynamic and process-oriented conceptual structure: Barsalou 2003 (apud Faber, Márquez and Vega, in press) • PROCESS-ORIENTED TERMINOLOGY AND TRANSLATION
COASTAL ENGINEERING EVENT (Faber, Márquez and Vega, in press) MICRO-EVENTS After reading different texts on a specific topic (monitoring systems in coastal engineering-video imaging), students elaborate their own ontology: MindManager X5 Pro http://www.mindjet.com Protégé http://protege.stanford.edu/ Cmap http://cmap.ihmc.us
2. Corpus-based exercises and annotation: tagging the corpus • Initial tags referring to style, translation brief and professional aspects • <style=5><t-brief=1><professional=3> • Tags placed after the problem or mistake/error • Tags specify: • Type of problem • Type of error/mistake • Adequacy /appropriateness of translated sentences
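A minimal Python sketch of how the initial tags could be read automatically. Only the tag string comes from the slide; the regular expression and the function name `parse_header_tags` are illustrative assumptions, not part of the authors' tooling.

```python
import re

# Assumed header format, copied from the slide:
# <style=5><t-brief=1><professional=3>
HEADER_TAG = re.compile(r"<([a-z-]+)=(\d+)>")


def parse_header_tags(text):
    """Return the initial name=value tags as a dict of integer marks."""
    return {name: int(value) for name, value in HEADER_TAG.findall(text)}


header = "<style=5><t-brief=1><professional=3>"
print(parse_header_tags(header))
```

Keeping the marks in a dictionary makes it easy to compare the header of one tagged translation with another.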
3. Identifying translation problems (tagging) • <Number> referring to sentence number n • Conceptualization <CON> • Procedural <PRO> • Transfer (due to linguistic and cultural differences) <TRA> • Lack of quality of the Source text <QTO> López and Tercedor (2004)
and consists of: 3 Snap Shots<3><PRO>: These images are instantaneo 4 Time exposure images (Timex)<4a><PRO>: These images average the ver a time period of 10 minutes<4b><CON>. 5 Typically images are rec frequently over shallow sanbars<6><QTO><CON> generating foam appear a . 7 The wave-breaking patterns<7><PRO> highlighted in the timex imag the surface. 8 Variance images<8><PRO>: These images represent the v ude the waters edge (swash zone)<10><CON><PRO> where the beach is p projected on the ground plane<11a><CON>, resulting in rectified image s with real world co-ordinates<11b><CON>. 12 These rectified images ously, so-called merged images<13a><CON> can be obtained, which give a lan view of the nearshore zone<13b><PRO>. 14 The figure below shows time domain of optical signals<15a><CON>, by rapidly sampling the inte l elements of the image (pixels)<15b><TRA>. 16 It is now possible ow possible to define an array<16a><PRO> of pixels in the image which l moving oceanographic targets<16b><CON>. 17 Tests have shown pixels ampled at high frequencies (1Hz)<18><CON>, so that a time series of
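The problem tags in the tagged source text above follow a regular pattern: a segment number such as `<6>` or `<11a>`, then one or more problem types. A hedged Python sketch of how they could be collected per segment follows. The function name `extract_problems` and the pattern are assumptions made for illustration; the four problem codes are those listed by López and Tercedor (2004).

```python
import re

# Problem types from the slide (López and Tercedor 2004).
PROBLEM_TYPES = {
    "CON": "conceptualization",
    "PRO": "procedural",
    "TRA": "transfer",
    "QTO": "lack of quality of the source text",
}

# A segment number (digits plus optional letter) followed by problem tags.
SEGMENT = re.compile(r"<(\d+[a-z]?)>((?:<(?:CON|PRO|TRA|QTO)>)+)")


def extract_problems(tagged_text):
    """Map each segment number to the list of problem codes tagged on it."""
    problems = {}
    for segment_id, tags in SEGMENT.findall(tagged_text):
        problems[segment_id] = re.findall(r"<([A-Z]+)>", tags)
    return problems


sample = (
    "frequently over shallow sanbars<6><QTO><CON> generating foam. "
    "The wave-breaking patterns<7><PRO> highlighted in the timex images"
)
for segment_id, codes in extract_problems(sample).items():
    print(segment_id, [PROBLEM_TYPES[code] for code in codes])
```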
4. Translation of the text (tagging their own TT) • Students use translation strategies and produce a translation which includes tags pointing to problematic segments: • Numbers: <1>, <16a>, <16b>
5a. Evaluating translation segments of peers (tagging) • Students evaluate potentially problematic segments of texts from group assessments: • Tagging these segments according to the type of mistake/error • Tagging according to adequacy or appropriateness of translated sentences • Relating meaning problems with mistakes/errors
Tagging: type of error/mistake and adequacy of translation (1) Offer students a list of “filtered” concordances with their own rendering of the problematic segment (2) Students identify type of error/mistake and tag the segments (3) They grade the segments according to quality parameters (Lauscher 2000, López and Tercedor 2004): adequacy /appropriateness of translated sentences (4) Justify own choice of “best” segment
SOURCE TEXT Regions where waves break frequently over shallow sanbars<6><QTO><CON> generating foam appear as bright white bands in the timex images. The wave-breaking patterns<7><PRO> highlighted in the timex images can be used to infer the position and shape of sandbars, even though the bars are not visible above the surface. • FILTERED CONCORDANCES • 1 bre bancos de arena poco profundos<6><lx> generando espuma, en las imá • 2 en bancos de arena poco profundos<6><lx> generando espuma aparecen co • re barras de arenas poco profundas<6><colx>, aparecen indicadas en la • arras de arena en zonas de poca profundidad<6><AA>, generando espuma, apa • 5 las barras de arena poco profundas<6><se>, lo que genera espuma, aparecen • n barras de arenas no muy profunda<6><se>, provocando espuma, se represent • as barreras de arena superficiales<6><lx>, genera<ccsx><FF> la aparici • en una barra de arena superficial<6><lx>, lo que hace que se produzca • bancos de arena son poco profundos<6><lx>. Para inferir la posición y • superficie. Imágenes de varianza<6><lx>: estas imágenes representan • <sx><FF> sobre las barras de arena<6><mise> generando espuma en el agu • arras de arena de poca profundidad<6><se> produciendo espuma aparecen • obre las barras de arena del fondo<6><se>, generando espuma, aparecen
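A concordancer produced the filtered concordances above. The following Python sketch only illustrates the idea: it keeps a fixed window of context around every occurrence of one segment number in the students' tagged translations. The function name `filtered_concordance` and the `width` parameter are hypothetical, and the two sample strings are shortened fragments of the renderings shown on the slide.

```python
import re


def filtered_concordance(translations, segment_id, width=35):
    """Return one keyword-in-context line per occurrence of <segment_id>."""
    # The segment number plus any error or adequacy tags that follow it.
    marker = re.compile(r"<" + re.escape(segment_id) + r">(?:<[A-Za-z]+>)*")
    lines = []
    for text in translations:
        for match in marker.finditer(text):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            # Right-align the left context so the tags line up in a column.
            lines.append(f"{left:>{width}}{match.group(0)}{right}")
    return lines


student_renderings = [
    "bancos de arena poco profundos<6><lx> generando espuma",
    "las barras de arena poco profundas<6><se>, lo que genera espuma",
]
for line in filtered_concordance(student_renderings, "6"):
    print(line)
```

Aligning every rendering of the same segment in one column is what lets students compare solutions and justify their choice of the best one.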
Type of error/mistake (from corpus) 1. CONTENT ES > EN EN > ES (x 2) • <se> meaning • <chse> Lack of cohesion • <mise> less information than ST • <pluse> more info than ST • <tvse> wrong tense that causes change in meaning • <cose> change in meaning due to wrong collocation • <dtse> changes in the data
2. REGISTER AND LEXIS ES > EN EN > ES • <lx> lexis and terminology • <colx> wrong collocation • <rglx> term that is not appropriate for the register of the text • <rg> register (inconsistencies)
3. TRANSLATION BRIEF AND ORIENTATION TO TARGET TEXT TYPE ES > EN • <o> organization • <pr> pragmatic mistakes • <rtpr> Grammatically correct but it sounds unnatural. The rhetorical effect of the ST is missing. Literal translation. • <ist> inappropriate style • <ot> orthotypography • <f> layout, wrong accomplishment of style sheet or computer requirements
3. FLUENCY AND ADEQUACY TO ORTHOTYPOGRAPHICAL, TEXTUAL AND PRAGMATIC CONVENTIONS OF TARGET LANGUAGE EN > ES • <o> organization • <pr> pragmatic mistakes • <rtpr> Grammatically correct but it sounds unnatural. The rhetorical effect of the ST is missing. Literal translation. • <ist> inappropriate style • <ot> orthotypography
4. WRITTEN EXPRESSION ES > EN • <or> spelling • <ptor> punctuation • <sx> syntax • <ccsx> lack of concord
4. TRANSLATION BRIEF AND PROFESSIONAL ASPECTS EN > ES A syntactic, spelling or punctuation mistake will reduce the mark in the fourth column to the minimum. • <f> layout, wrong accomplishment of style sheet or computer requirements • <or> spelling • <ptor> punctuation • <sx> syntax • <ccsx> lack of concord
Adequacy of translated sentences • Excellent solution <AA> • Inappropriate <type of error/mistake> • Very bad-serious mistake/error <type of error/mistake><FF>
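The three adequacy levels above can be read mechanically from the tags that follow a segment. The Python sketch below is one possible reading of the slide, not the authors' procedure: `<AA>` alone marks an excellent solution, an error tag alone marks an inappropriate one, and an error tag followed by `<FF>` marks a serious mistake/error. The function name `adequacy` and the returned labels are assumptions.

```python
import re


def adequacy(tags):
    """Classify a run of tags such as '<se><FF>' into an adequacy level."""
    names = re.findall(r"<([A-Za-z]+)>", tags)
    if "FF" in names:
        return "very bad / serious"
    if names == ["AA"]:
        return "excellent"
    if names:
        return "inappropriate"
    return "untagged"


for tags in ("<AA>", "<lx>", "<se><FF>"):
    print(tags, "->", adequacy(tags))
```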
Relating problems with mistakes • Students see that when they concentrate on specific problems and do not revise, inadvertent mistakes slip in. As a result, the quality of their translation diminishes dramatically. • Students keep a record of mistakes and errors in a “learner diary”. • Conceptual problems -> serious mistakes.
Merged image (panoramic and rectified view) of Noordwijk. Five individual time-exposure images are used to compose these merged images. In the rectified image (lower panel)<12a><TRA>, the shoreline is located at the lower side, around x = 0 m <12b><CON>. The bright band of 3000 m length at about 900 m off-shore<13a><CON> indicates the location of a 1.25 Mm³<13b><TRA> shoreface nourishment<13c><CON>. En la imagen rectificada (panel inferior)<12a><lx> el litoral esta<or><FF>situado en el lado inferior, aproximadamente x = 0 m<12b><se>. La banda luminosa de 3000 metros<ot> de longitud a aproximadamente 900 m de altamar<13a><se><FF> indica la localización de una regeneración de la zona costera<se> de 1.25 Mm3<13b><ot><FF>.
5b. Evaluation of full translations: tagging and criterion descriptors • They evaluate the translation of other students and their own translation. • They tag the translation and give a mark according to the criterion descriptors
6. Additional exercises • Corpus analysis with lexical analysis software (WordSmith Tools) • Proposal of new texts to be translated • Access to DIY corpora • Access to quality corpus (Puertoterm) • Access to the learner corpus: solutions to translation problems
Conclusions and future research • In specialised translation, activities should aim to increase learner autonomy across the different skills and tasks involved in the translation process and in evaluation. • Students who understand the basic conceptual structure of the subject field (visual aids and DIY corpus) produce better translations.
Conclusions and future research • Bottom-up, top-down approach to translation • Learner corpora as input for lexical analysis software • Consistency analysis and statistical analysis of translations: false friends, most common mistakes/errors, etc.
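As a rough illustration of the statistical analysis proposed above, error tags in the learner corpus could be tallied to find the most common mistakes/errors. This Python sketch assumes, as in the tagged examples earlier in the deck, that error codes are lowercase (for example `<se>`, `<lx>`, `<ot>`) while segment numbers and the `<AA>`/`<FF>` adequacy marks are not. The function name `count_error_tags` is hypothetical, and the sample is the tagged student translation shown earlier.

```python
import re
from collections import Counter

# Lowercase-only tags: skips segment numbers (<12a>) and <AA>/<FF> marks.
ERROR_TAG = re.compile(r"<([a-z]+)>")


def count_error_tags(tagged_translations):
    """Count every error code across a list of tagged translations."""
    counts = Counter()
    for text in tagged_translations:
        counts.update(ERROR_TAG.findall(text))
    return counts


learner_corpus = [
    "En la imagen rectificada (panel inferior)<12a><lx> el litoral "
    "esta<or><FF>situado en el lado inferior, aproximadamente "
    "x = 0 m<12b><se>. La banda luminosa de 3000 metros<ot> de longitud "
    "a aproximadamente 900 m de altamar<13a><se><FF> indica la "
    "localización de una regeneración de la zona costera<se> "
    "de 1.25 Mm3<13b><ot><FF>.",
]
print(count_error_tags(learner_corpus).most_common(2))
```

Run over a whole course's translations, the same tally would show which criteria need the most classroom attention.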