Federico Zanettin, Università di Perugia

Translating into the L2: Corpus tools and resources







Federico Zanettin

Università di Perugia

Translating into the L2: Corpus tools and resources



Outline

Translation into the L2

Corpus resources and tools

Sample translation activity

Role of corpora

Role of students

Role of teacher

Conclusions



Translation into the L2

  • Is translation into L2 to be avoided?

    • standard practice in translator training

    • textbooks and manuals

    • actual professional practice

    • actual practice in L2 learning environments

  • Should the teacher be a native speaker of the L2?

    • Many are not…

  • A number of studies challenge these views:

    • e.g. Campbell 1998, Stewart 1999, 2000, forthcoming, Grosman et al. 2000, Kelly et al. 2003, Pokorn 2005, Kearns forthcoming, …



Corpus resources and tools

  • Corpora

    • The Web as corpus

    • The Web as a source for DIY corpora

    • Online corpora

      • Monolingual

      • Parallel

    • ‘Traditional’ corpora (i.e. non-native electronic texts)

      • Monolingual

      • Multilingual/Parallel

  • Tools

    • General purpose search engines (e.g. Google)

    • Online corpus analysis services

    • Stand-alone corpus analysis software (e.g. WordSmith Tools, TextSTAT, ParaConc)

    • Custom software (e.g. Xaira, ENPC Explorer, etc.)



The Web as corpus

  • Search engines' advanced options

  • Specialized “sub-webs”

    • Google Scholar

    • Google Books

  • Online concordancers

    • WebCorp

    • WebCONC

    • KwikFinder



The Web as source of DIY corpora

  • Manual DIY corpora

    • Download + corpus analysis software (e.g. WordSmith Tools, TextSTAT, etc.)

  • (Semi) automatic DIY corpora

    • Sketch Engine
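As a rough illustration of what corpus analysis software does with a manually downloaded DIY corpus, the following minimal Python sketch prints keyword-in-context (KWIC) concordance lines. It is not the implementation of any of the tools named above; the function name `kwic`, the `width` parameter and the sample text are all invented for illustration.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of `keyword`."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        # Take up to `width` characters of context on each side of the hit.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines

# Invented sample text standing in for a downloaded DIY corpus.
sample = ("The disease can spread quickly. Doctors try to prevent the disease "
          "before it spreads. Mosquitoes transmit the disease to humans.")

for line in kwic(sample, "disease"):
    print(line)
```

Aligning the keyword in a central column is what lets recurrent patterns to its left and right stand out when the lines are read vertically.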



Sketch Engine

  • Create your instant DIY web corpora

  • Add linguistic annotation to your corpora

  • Consult very large corpora for many languages

    • Word lists

    • Concordances

    • Word profiles (Word Sketch)



Word Sketch for ‘Disease’



Word Sketch

  • A Word Sketch is a corpus-based summary of a word's grammatical and collocational behaviour.

  • Each column shows the words that typically combine with “disease” in a particular grammatical relation. For example, “object_of” lists, in order of statistical significance rather than raw frequency, the verbs that most typically occupy the verb slot in cases where “disease” is the object of a verb.

  • Switching between Concordance mode and Word Sketch mode is a useful way of getting more information about a particular word combination. Thus, if you want to look at examples of the string “transmit + disease”, simply click on the number next to “transmit” in the object_of list (93) and you will be taken directly to a concordance showing all instances of this combination.

    Adapted from the Sketch Engine website
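The difference between ranking collocates by statistical association and ranking them by raw frequency can be sketched as follows. The slide does not say which measure is used, so the choice of logDice here is an assumption; apart from the 93 co-occurrences of “transmit” mentioned above, all frequencies are invented for illustration.

```python
import math

def log_dice(f_xy, f_x, f_y):
    """logDice association score: 14 + log2(2 * f_xy / (f_x + f_y))."""
    return 14 + math.log2(2 * f_xy / (f_x + f_y))

NODE_FREQ = 10_000  # hypothetical corpus frequency of "disease"

# verb: (co-occurrences with "disease" as object, total frequency of the verb)
# Only the 93 for "transmit" comes from the slide; the rest is made up.
candidates = {
    "have": (400, 1_000_000),
    "cause": (300, 50_000),
    "transmit": (93, 2_000),
}

# Ranking by raw co-occurrence frequency favours very common verbs.
ranked_by_frequency = sorted(
    candidates, key=lambda verb: candidates[verb][0], reverse=True
)

# Ranking by association score favours verbs that are typical of the node word.
ranked_by_score = sorted(
    candidates,
    key=lambda verb: log_dice(candidates[verb][0], NODE_FREQ, candidates[verb][1]),
    reverse=True,
)

print(ranked_by_frequency)
print(ranked_by_score)
```

With these made-up counts, a rare but strongly associated verb such as “transmit” moves to the top of the association ranking even though a general-purpose verb like “have” co-occurs with the node word more often.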



Online corpora

  • Monolingual

    • Leeds Internet corpora

    • The Corpus of Contemporary American English (COCA)

    • etc.

  • Bilingual

    • OPUS

    • Compara



Internet Corpora at Leeds

al-luġatul-’arabiyyatu l-fuṣḥā



OPUS (Europarl parallel corpus)

  • in modo sistematico

  • Systematically vs. in a systematic way



The Web vs. well-constructed corpora

  • Corpora = reliability, core patterns of language use

  • The Web

    • Lexical and terminological richness

    • Multi-word expressions

      • “naked eye”



“to the naked eye”

Google = 2.5 million hits

BNC = 884 hits



“visible to the naked eye”

Google = 1.2 million hits

BNC = 18 hits



“barely visible to the naked eye”

Google = 83,000 hits

BNC = none



“be barely visible to the naked eye”

Google = 49,000 hits

BNC = none



“grains that are so small as to be barely visible to the naked eye”

Google = 5 hits (2 different results, duplicated)
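The “naked eye” comparison shows that the longer a multi-word expression gets, the fewer hits a finite corpus can offer. The same effect can be reproduced on any text collection with a phrase counter like the hypothetical one below. The four-sentence corpus is invented, and the counts it yields have nothing to do with the Google or BNC figures above.

```python
import re

def count_phrase(text, phrase):
    """Count case-insensitive, whole-word occurrences of `phrase` in `text`."""
    words = [re.escape(word) for word in phrase.split()]
    # Allow any run of whitespace between the words of the phrase.
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

# Invented miniature corpus, for illustration only.
corpus = (
    "The mites are barely visible to the naked eye. "
    "Some stars are visible to the naked eye on a clear night. "
    "To the naked eye the surface looks smooth. "
    "Under a microscope the naked eye is no longer a limit."
)

phrases = [
    "naked eye",
    "to the naked eye",
    "visible to the naked eye",
    "barely visible to the naked eye",
]

counts = {phrase: count_phrase(corpus, phrase) for phrase in phrases}
for phrase, hits in counts.items():
    print(f"{hits:>3}  {phrase}")
```

Each added word can only keep or reduce the number of hits, which is why long strings may still be attested on the Web after a smaller corpus has run out of examples.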



Sample activity

  • Revise the output of an online machine translation system

  • Source text: specialized text in a curricular field (e.g. history, economics, politics)

  • Tools for revision

    • Dictionaries

    • Corpus resources and tools



An example

I puritani della Nuova Inghilterra furono i primi fra tutti i coloni inglesi d'America ad elaborare in modo sistematico una teoria originale dello Stato e della società.

The puritani of New England were the first between all coloni English of America to elaborate in systematic way a theory originate them of the State and the society.

New England Puritans were the first among all English colonists of America to elaborate systematically an original theory of State and society.



Google advanced search

New England Puritans

The New England Puritans

(The) Puritans of New England

(The) New England’s Puritans



How the students worked

  • Doubts about the MT outcome

  • Unknown words and expressions

  • Too literal renderings

    “in some cases, it was just a matter of verifying the accuracy of the MT output, whereas in others there were good reasons to improve the overall quality of the text.”



Use of corpus resources

  • Does something exist?

  • Are there better alternatives?

  • Doubts confirmed (MT wrong)

  • Doubts disconfirmed (MT right)

    • Specific terminology

  • Need to

    • ask the right questions

    • formulate queries properly

    • analyse results successfully



Example 1

  • Can globalization “exercise an effect” on income redistribution?

  • Google search = no results

  • Search for “an effect”

    • Something can “have” or “produce” an effect

  • “globalization has (a number of) effects on income redistribution”



Example 2

  • “the central theme of the debate”

  • Very literal: “il tema centrale del dibattito”

  • Google search = many results

  • EU proceedings parallel corpus = many results



Example 3

  • “processi (economici) in corso” =

  • “(economic) processes Corsican” ?

  • Search for “processes” (COCA)

  • ongoing + processes (frequent collocates)

  • “ongoing (economic) processes”

  • Attested in comparable texts (sources of concordance lines)
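The step from searching for “processes” to spotting “ongoing” as a frequent collocate amounts to counting the words that appear just before the node word. The sketch below does this on an invented four-sentence text; it does not query COCA, and the function name `left_collocates` and its `span` parameter are assumptions made for illustration.

```python
import re
from collections import Counter

def left_collocates(text, node, span=2):
    """Count the words occurring up to `span` positions before `node`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for index, token in enumerate(tokens):
        if token == node:
            # Collect the window of words immediately to the left of the node.
            counts.update(tokens[max(0, index - span):index])
    return counts

# Invented sample text, for illustration only.
sample = (
    "Ongoing processes of integration shape the region. "
    "These ongoing processes are economic. "
    "Several ongoing economic processes overlap. "
    "Natural processes take time."
)

collocates = left_collocates(sample, "processes", span=2)
print(collocates.most_common(3))
```

A window of two words to the left is enough here to catch both “ongoing processes” and “ongoing economic processes”, the two patterns relevant to the translation problem.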



Example 4

“la diffusione di nuove tecnologie” = “the spread of new technologies”?

  • Google search = attested expression

  • But: what about “diffusion”?

  • Search for “spread” vs “diffusion” (Web + COCA)

  • Search for

    • “the spread of * technologies” vs

    • “the diffusion of * technologies”

  • Spread = general English

  • Diffusion = academic English

  • “the diffusion of new technologies”
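Wildcard queries such as “the spread of * technologies” can be imitated on a local text with a regular expression in which the asterisk stands for exactly one word. The following sketch makes that assumption, handles a single wildcard per query, and runs on an invented sample; it is not how Google or COCA process such queries.

```python
import re
from collections import Counter

def wildcard_to_regex(query):
    """Turn a query with one '*' wildcard into a regex capturing the slot word."""
    parts = [r"(\w+)" if token == "*" else re.escape(token)
             for token in query.split()]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)

def slot_fillers(text, query):
    """Count the words that fill the single '*' slot of `query` in `text`."""
    matches = wildcard_to_regex(query).findall(text)
    return Counter(word.lower() for word in matches)

# Invented sample text, for illustration only.
sample = (
    "The spread of new technologies changed farming. "
    "Economists study the diffusion of new technologies across firms. "
    "The diffusion of digital technologies is uneven. "
    "The spread of mobile technologies was fast."
)

spread = slot_fillers(sample, "the spread of * technologies")
diffusion = slot_fillers(sample, "the diffusion of * technologies")
print("spread:", dict(spread))
print("diffusion:", dict(diffusion))
```

Comparing the slot fillers, and the kinds of texts the matches come from, is what supports a register judgement such as general versus academic English.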



Example 5

  • “avere i requisiti per votare” = “have the requirements to vote”?

  • Dictionary: “fulfil/satisfy/comply with/suit/match the requirements”

  • Corpora: “meet the requirements”



Role of corpora

  • "dictionary items + combinatory rules"

    VS

  • "corpora + rules for querying and analyzing them"

  • focus on language units larger than the single word

  • Multiple local grammars

  • grammars for 1, 2, 3… word combinations



  • Unanalyzed knowledge

  • Acquisition vs learning

  • Corpora used to produce generalizations

    • Gerund + “is not a duty”



Role of students

  • Serendipity/discovery learning

  • A corpus is not necessarily “expected to provide the right answers … but constantly presents new challenges and stimulates new questions, renewing the user’s curiosity and offering ample opportunity for researching aspects of language and culture” (Bernardini 2002:166).



Role of teacher

  • Guide, facilitator vs “walking dictionary”

  • Can only L2 native speakers be good translation teachers?

    • Native speakers

      • More knowledgeable about target language

    • Non-native speakers

      • More knowledgeable about source language

      • Same directionality of translation

      • Better understanding of translation difficulties

      • Better able to evaluate translation process



Risks

  • Insufficient expertise in the use of software will result in clumsy and superfluous searches

    • so enough time should be devoted to teaching search techniques, which are often specific to the corpora used

  • Insufficient expertise in the analysis of the data (concordances) will result in wrong conclusions ... and in turn in bad translations

    • so enough time should be devoted to teaching how to manipulate and interpret corpus data

  • However, something can also be learned from less successful learners, whose comments highlight areas of difficulty.



Conclusions

  • By using corpora in L2 translation, learners can heighten their awareness of contrastive aspects and of the variety of possible translations

  • Even if equipped with limited formal linguistic knowledge, learners are given the opportunity to discover language rules and conventions by themselves

  • The use of corpus resources in a translation task fosters reading and writing skills and encourages self-confidence and autonomy

  • Teachers do not necessarily have to be native speakers of the target language, but should rather be experts in using resources, formulating queries and evaluating findings

