

Using Machine Translation for fast, inexpensive, and accurate health information assimilation and dissemination: Experiences at the Pan American Health Organization

Julia Aymerich

ICLM-9

September 2005


A Brief History

1976 First contract, Spanish-English

1979 In-house development began

1980 Spanish-English on mainframe

1985 English-Spanish on mainframe

1992 PC version on PAHO LAN

2000 32-bit Windows version

2003 Two new language pairs:

Portuguese-English, English-Portuguese

2004-5 Two new language pairs:

Spanish-Portuguese, Portuguese-Spanish


PAHO HQ

Translation Services

Unit Chief

Spanish translator

English translator

Computational linguists (2)

Office assistants (3)

Dictionary coders (2 contractors)

Users of PAHO’s Machine Translation Software

Licensees

Ministries of Health

NGOs

International Organizations

Universities

Consulting firms

Private industry

Translators

Physicians

Field Offices

U.S.-Mexico Border

Regional Centers

Nutrition

Environment

Food Safety

Perinatology

PWR Offices

Subject Area Specialists

Administrative and Support Staff

Contract translators

PAHO Translation Services

  • 8 staff members

  • Two in-house contractors (linguists)

  • Over 100 freelance contractors (translators)

  • MT usage: over 90% for documents into Spanish, English, and Portuguese

  • Productivity gain: 50%-100%

  • Cost reduction: approx 33%

Direct Cost Savings at PAHO HQ

  • Professional translators: 70% of Human Translation rate

  • No cost for PAHO HQ LAN users

Survey of Translation Costs, 1998

Keys to Success at PAHO

  • Appropriate input texts

  • Conducive institutional setting

  • Robust software

  • Years of hard work

Input Texts

  • Large volume in three languages

    4 million words per year

  • Predictable subject areas

    NOT JUST MEDICINE!

  • Some style and format guidelines

    No controlled input

    No pre-editing

Input texts (cont.)

Institutional Setting

  • Management commitment

  • Translator involvement

  • Staff support

Robust Software

  • Developed and maintained by PAHO’s language professionals

  • Designed to handle standard syntax

  • Fully customizable dictionaries

  • Used in a broad domain

Hard Work

  • Translators provide feedback

  • Linguists update dictionaries with new terminology and translation rules

  • Computational linguists improve the algorithms for analysis, transfer, and synthesis

Types of installations

  • Standalone

  • Local Area Network (5, 10+ users)

  • Intranet server

  • May be licensed by non-profit institutions

Language combinations

  • English-Spanish

  • Spanish-English

  • English-Portuguese

  • Portuguese-English

  • Spanish-Portuguese

  • Portuguese-Spanish

Some Features of PAHO MTS

  • All formatting is preserved

  • Spanish/English/Portuguese user interface

  • Fully graphical translation interface, invoked from the desktop, the MS Word toolbar, or the WordPerfect (8.0+) toolbar

  • Batch translation

  • Postediting Macros for MS Word

  • ActiveX Automation objects for translation of short segments

  • On-line context-sensitive trilingual help

  • Translation speed

    • minimum of 8,000 words/minute

    • in batch mode: 30,000 words/minute
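To put these speeds in perspective, the short Python calculation below estimates the raw machine time needed for the 4 million words per year cited on the Input Texts slide. It is only a back-of-the-envelope sketch: it assumes the whole annual volume goes through MT at a constant rate and ignores postediting time, and the function name is illustrative.

```python
# Rough estimate of raw MT processing time for one year's volume.
# Figures come from the slides; the constant-rate assumption is ours.
WORDS_PER_YEAR = 4_000_000   # annual volume (Input Texts slide)
INTERACTIVE_WPM = 8_000      # minimum interactive speed, words/minute
BATCH_WPM = 30_000           # batch-mode speed, words/minute


def machine_minutes(words: int, words_per_minute: int) -> float:
    """Minutes of machine time needed to translate `words` words."""
    return words / words_per_minute


interactive = machine_minutes(WORDS_PER_YEAR, INTERACTIVE_WPM)
batch = machine_minutes(WORDS_PER_YEAR, BATCH_WPM)

print(f"interactive: {interactive:.0f} minutes")  # interactive: 500 minutes
print(f"batch: {batch:.0f} minutes")              # batch: 133 minutes
```

Under these assumptions the machine step takes a few hours per year at most, so the human postediting effort, not the translation engine, dominates turnaround time.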

Features of PAHO MTS (cont.)

  • Dictionary Merge Utility

    • Used by clients to combine their entries with PAHO master entries

    • Used at PAHO HQ when users provide us with copies of their dictionaries

  • Import Utility for Bilingual Glossaries

    • World Bank glossaries (15,000+ entries)

    • International Classification of Diseases

    • INN: official non-proprietary or generic names of pharmaceutical substances

  • Dictionary Browse Utility

    • Accessible from MS Word

    • Copy/paste capabilities

  • Export Utility

    • To create lists of words added by users or small dictionaries that can be later merged
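The merge step can be pictured with a minimal Python sketch. It is not the PAHOMTS utility itself: it assumes a dictionary is a plain mapping from source term to target term, and the `prefer_client` switch and the conflict report are hypothetical design choices added for illustration.

```python
from typing import Dict, List, Tuple

Dictionary = Dict[str, str]  # assumed shape: source term -> target term


def merge_dictionaries(
    master: Dictionary,
    client: Dictionary,
    prefer_client: bool = True,
) -> Tuple[Dictionary, List[str]]:
    """Combine client entries with master entries.

    Returns the merged dictionary and a sorted list of terms whose
    translations differ between the two sources, so that a linguist
    can review them.
    """
    merged = dict(master)
    conflicts: List[str] = []
    for term, translation in client.items():
        if term in master and master[term] != translation:
            conflicts.append(term)
            if not prefer_client:
                continue  # keep the master translation
        merged[term] = translation
    return merged, sorted(conflicts)


# Illustrative entries only, not taken from the PAHO dictionaries.
master = {"salud": "health", "brote": "outbreak"}
client = {"brote": "epidemic outbreak", "tamizaje": "screening"}

merged, conflicts = merge_dictionaries(master, client)
print(merged["brote"])  # epidemic outbreak
print(conflicts)        # ['brote']
```

Reporting conflicts separately rather than overwriting silently reflects the workflow described in the slides, where a linguist checks entries before they enter the dictionaries.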


Recent Developments

  • Compatible with Translation Memory software

    • Files pretranslated with TRADOS can be processed by PAHOMTS without modifications

    • Macro to activate the ActiveX object when processing a file with WordFast

  • Spanish-Portuguese released in March 2004

  • Portuguese-Spanish released in June 2005

  • Recently acquired MultiTrans to align a corpus of over 60 million words and automatically extract terminology

Machine Translation processing

  • Several options can be selected at run-time:

    • Type of grammar

      • Abstract, letter, manual, report, resolution, survey, speech, post description, news article, summary record

      • One type of grammar only

    • Specialized vocabulary

      • Medical research, financial, environment, equipment, agriculture, patient education, legal, computer science, United Nations, radiation, pharmaceutical, statistics, European variety

      • Several microglossaries may be activated, in order of priority
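One way to read "in order of priority" is as a lookup chain: the active microglossaries are consulted in the order chosen at run-time, and the general dictionary is the fallback. The Python sketch below illustrates that reading only. The data structures, the function name, and the sample entries are assumptions, not the PAHOMTS implementation.

```python
from typing import Dict, List, Optional

Glossary = Dict[str, str]  # assumed shape: source term -> target term


def lookup(
    term: str,
    active_microglossaries: List[Glossary],
    general: Glossary,
) -> Optional[str]:
    """Return the first translation found, honouring priority order.

    `active_microglossaries` is ordered from highest to lowest
    priority; the general dictionary is consulted last.
    """
    for glossary in active_microglossaries:
        if term in glossary:
            return glossary[term]
    return general.get(term)


# Illustrative English-Spanish entries only.
general = {"discharge": "descarga"}
medical = {"discharge": "alta"}
environment = {"discharge": "vertido"}

print(lookup("discharge", [medical, environment], general))  # alta
print(lookup("discharge", [environment, medical], general))  # vertido
print(lookup("discharge", [], general))                      # descarga
```

The same source term resolves differently depending on which microglossary is ranked first, which is why the ordering is chosen per document at run-time.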

Information Assimilation

  • Fast and inexpensive method for gisting, but with much higher quality than free web-based translators

  • Appropriate options are set at run-time

  • Used for translating segments, web pages, full documents

  • No need for postediting

  • Sample uses: Hospital in Panamá, NorthCom, NLM, Supercourse lectures, health professionals scanning the web

Information Dissemination

  • MT produces a draft – Always needs to be postedited!

  • Output must be postedited by:

    • Professional (distribution, publication)

    • Native speaker of target language (internal distribution)

  • Publication quality translations or light postediting

  • Appropriate options are set at run-time

  • Used for translating segments, web pages or full documents

  • Benefits

    • Speed

    • Lower costs

    • Terminological consistency

  • Sample uses: CEPIS, translation unit, USP

Postediting

  • Done directly on the RAW file

  • Using the side-by-side (SBS) file as a reference

  • Freelancers don’t have access to PAHOMTS, only to the output files

  • Making use of the MS Word postediting macros

    • General macros: changing display, lower/uppercase, switch words to the right or to the left, delete words, look up in PAHOMTS dictionary

    • English macros: delete definite article, create possessive, delete and switch, create NN compound, undo NN compound, serial comma

    • Spanish/Portuguese macros: singular, plural, masculine, feminine, feminine plural, masculine plural, delete definite article, -mente adverb, add diacritic
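The macros themselves run inside MS Word and are not shown in the slides. As a stand-in, the Python sketch below mimics two of the listed operations on a tokenized sentence, "switch words to the right" and "delete definite article". The function names and the word-list representation are hypothetical.

```python
from typing import List


def switch_right(words: List[str], i: int) -> List[str]:
    """Swap the word at index i with its right-hand neighbour."""
    if not 0 <= i < len(words) - 1:
        return list(words)  # nothing to swap with
    out = list(words)
    out[i], out[i + 1] = out[i + 1], out[i]
    return out


def delete_definite_article(words: List[str], i: int) -> List[str]:
    """Drop an English 'the' immediately before the word at index i."""
    if i > 0 and words[i - 1].lower() == "the":
        return words[: i - 1] + words[i:]
    return list(words)


# Typical raw-output fixes: adjective-noun order and a stray article.
raw = "the health public system".split()
print(" ".join(switch_right(raw, 1)))  # the public health system

raw = "the malaria is preventable".split()
print(" ".join(delete_definite_article(raw, 1)))  # malaria is preventable
```

Each operation is a single keystroke for the posteditor, which is where much of the reported productivity gain over retyping would come from.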

Bilingual corpus

  • Source and target documents preserved since 1990

  • Aligned using MultiTrans

  • Semi-automatic terminology extraction

  • Bilingual lists checked by linguist and imported into PAHOMTS dictionaries

  • Aligned corpus will be made available on-line to freelance posteditors

MT enhancements

  • Daily

    • Lists of not-found words

    • Lexical issues pointed out by translators (SBS file)

    • SBS files examined by computational linguists to improve parsing/synthesis

  • Occasionally

    • Incorporation of bilingual glossaries using the Import or Merge utilities

    • Research into specific linguistic issues, always using bilingual corpus

  • Automatic terminology extraction from aligned corpus
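The daily list of not-found words can be approximated as follows. This Python sketch assumes the dictionary is a simple set of known word forms and that tokens are runs of letters. A real system would also apply morphological analysis before declaring a word unknown, so treat it as an illustration rather than the PAHOMTS procedure.

```python
import re
from collections import Counter
from typing import Set


def not_found_words(text: str, dictionary: Set[str]) -> Counter:
    """Count lower-cased word forms in `text` missing from `dictionary`."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(t for t in tokens if t not in dictionary)


# Illustrative input and dictionary only.
known = {"el", "y", "dengue"}
text = "El dengue y el chikungunya: chikungunya y zika."

missing = not_found_words(text, known)
print(missing.most_common())  # [('chikungunya', 2), ('zika', 1)]
```

Sorting by frequency lets the dictionary coders handle the most common gaps first.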
