
Introducing Machine Translation to the European Parliament

Introducing Machine Translation to the European Parliament. Alexandros Poulis alexandros.poulis@europarl.europa.eu DGTRAD.ITS TOB 02A011. Outline. Do we need MT@EP? What do we need MT for? One general MT system for all institutions: Is this possible?


Presentation Transcript


  1. Introducing Machine Translation to the European Parliament Alexandros Poulis alexandros.poulis@europarl.europa.eu DGTRAD.ITS TOB 02A011

  2. Outline • Do we need MT@EP? • What do we need MT for? • One general MT system for all institutions: Is this possible? • How to make MT work for the EP: Current state of the project, future actions

  3. Does the EP need MT technology? • MT poll 2009: out of 137 respondents, 21% used MT regularly and 40% sometimes (in DGTRAD) • Today MT is more popular than it used to be • Domain usage statistics • In early 2010, almost 40.000 hits per month on MT websites from DGTRAD alone • In December 2010, almost 500.000 hits were registered EP-wide • An additional 1.000 requests to ECMT

  4. What do we need MT for? • Working group on Machine Translation • Needs Study • Explore and identify Parliament’s needs in our given business and IT environment

  5. What do we need MT for? • Members of the EP MT working group • 5 translators with different backgrounds • PreTrad • ITS • Planning • Under the guidance of Miguel Frank

  6. What do we need MT for? • Possible use-cases (1): • MT as a CAT tool: help translators focus on non-trivial, more demanding and creative translation tasks, improve quality and increase productivity (?) • Administrative mails are often written only in French or French-English: “Dear colleagues, As not all EU staff speak French, it would be really useful if you could send your emails in English/German as well. It would ensure more support. Je ne peux pas toujours traduire tout pour mes collegues. Merci.” [The French closing reads: “I cannot always translate everything for my colleagues. Thank you.”]

  7. What do we need Machine Translation for? • Provide EP staff with access to knowledge and information they could not access before because of the language barrier • More than 1.000 pages a year need to be produced in less than one hour (short deadlines, need for synchronous MT)

  8. What do we need MT for? • Possible use-cases (2): • Provide a risk-free alternative to online MT tools where necessary and where possible • Help EP members and staff communicate faster and better in languages they do not feel 100% comfortable with • Try to reduce the cost of outsourced translation • MT for gisting purposes when there is no need for high-quality human translation. If necessary, light or heavy post-editing can be offered.

  9. Can one general-purpose MT system serve all institutions? • The more domain-specific an SMT system is, the better its output (data – languages – domain specificity) • Are we re-inventing the wheel by working on MT in the EP while the Commission has almost developed a solution?

  10. How can we make this work? • We must know which needs have to be addressed • We need appropriate hardware resources • Interinstitutional cooperation and sharing of information, data and know-how • User feedback

  11. Phase 1: building a lab-scale product • Open source Statistical MT tools • Combining EP and EC data from Euramis for various language pairs • We expected this system to be more appropriate for procedural documents • And indeed…

  12. BLEU score by document type (EN-PT)
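The chart for this slide is not part of the transcript. As a rough illustration of what a per-document-type BLEU comparison involves, here is a minimal sketch in Python. It assumes whitespace tokenisation, a single reference per sentence and no smoothing; the function names, the document-type labels and the toy sentences are hypothetical and are not taken from the EP evaluation.

```python
import math
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(pairs, max_n=4):
    """Unsmoothed corpus-level BLEU in [0, 1] for (hypothesis, reference) pairs."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in pairs:
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clipped counts: an n-gram is credited at most as often as it
            # appears in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any empty n-gram order gives zero
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


def bleu_by_doc_type(samples):
    """samples: iterable of (doc_type, hypothesis, reference) triples."""
    grouped = defaultdict(list)
    for doc_type, hyp, ref in samples:
        grouped[doc_type].append((hyp, ref))
    return {doc_type: corpus_bleu(pairs) for doc_type, pairs in grouped.items()}


# Toy data only; the doc-type labels reuse abbreviations from the slides.
samples = [
    ("TC", "the committee adopted the report", "the committee adopted the report"),
    ("QE", "what measures will the commission take", "which steps does the commission plan"),
]
for doc_type, score in bleu_by_doc_type(samples).items():
    print(doc_type, round(score, 3))
```

Scores computed this way are only comparable across document types when each test set uses the same tokenisation and the same number of references.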

  13. Phase 1: Building a lab-scale product • A general purpose SMT system based on Euramis data may provide decent translations for certain document types (e.g. TC) • What about QEs (written questions), CREs (verbatim reports of debates) and other doctypes which account for a large amount of our translation production? • In 2010 we produced 488.622 pages of AM documents and… 113.111 pages of QEs! Almost 1 QE for every 4 AMs!

  14. Phase 1: Building a lab-scale product

  15. Phase 1: building a lab-scale product • Our next steps • Provide feedback to the EP MT working group • Customise and optimise for different use-cases • Integrate into the translation production environment (CAT4TRAD, CAT-Tool) • Improve efficiency (faster updates of the models when we have new versions of the training corpora) • Enhance our corpora to create custom engines • Combine technologies: MT+TM (e.g. enhanced fuzzy matches, Philipp Koehn et al. 2010)
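The MT+TM combination mentioned in the last bullet can be pictured with a small sketch. It is a simplification for illustration, not the method of the cited work: it looks up the closest translation-memory (TM) entry with Python's standard `difflib`, then lists the source words that the TM entry does not cover and that an MT engine would have to translate. The function names, the 0.7 threshold and the toy TM entry are assumptions.

```python
from difflib import SequenceMatcher


def best_fuzzy_match(source, tm, threshold=0.7):
    """Return (score, tm_source, tm_target) for the closest TM entry, or None.

    tm is a list of (source_sentence, target_sentence) pairs; the score is
    difflib's token-level similarity ratio in [0, 1].
    """
    src = source.split()
    best = None
    for tm_source, tm_target in tm:
        score = SequenceMatcher(None, src, tm_source.split()).ratio()
        if score >= threshold and (best is None or score > best[0]):
            best = (score, tm_source, tm_target)
    return best


def mismatched_spans(source, tm_source):
    """Source word spans that the TM entry does not cover (candidates for MT)."""
    src, ref = source.split(), tm_source.split()
    opcodes = SequenceMatcher(None, src, ref).get_opcodes()
    # 'replace' and 'delete' mark words present in the new source sentence
    # but absent from the TM source sentence.
    return [" ".join(src[i1:i2]) for tag, i1, i2, _, _ in opcodes
            if tag in ("replace", "delete")]


# Hypothetical one-entry translation memory (EN-PT).
tm = [("the committee adopted the report", "a comissão aprovou o relatório")]
sentence = "the committee rejected the report"

match = best_fuzzy_match(sentence, tm)
if match is None:
    print("no fuzzy match above threshold: send the whole sentence to MT")
else:
    score, tm_source, tm_target = match
    print("TM target to reuse:", tm_target)
    print("spans left for MT:", mismatched_spans(sentence, tm_source))
```

A production setup would also need word alignment to locate the target-side words that correspond to the mismatched source spans; that step is omitted here.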

  16. Phase 2: Provide a test environment to MT users • Evaluate usability of MT • as a CAT tool (dissemination) • for assimilation and communication purposes

  17. GO RAIBH MAITH AGAT (GA) • KÖSZÖNJÜK (HU) • GRAZIE (IT) • AČIŪ (LT) • PALDIES (LV) • GRAZZI (MT) • BEDANKT (NL) • DZIĘKUJĘ (PL) • OBRIGADO (PT) • VA MULTUMIM (RO) • DĚKUJI (SK) • HVALA (SL) • TACK (SV) • БЛАГОДАРЯ ВИ (BG) • GRÀCIES (CA) • DĚKUJI (CS) • TAK (DA) • DANKE (DE) • ΕΥΧΑΡΙΣΤΟΥΜΕ (EL) • THANK YOU (EN) • GRACIAS (ES) • TÄNAME (ET) • KIITOS (FI) • MERCI (FR)
