Development of an Intelligent Translation Memory

IKTA5-146/2002. Development of an Intelligent Translation Memory. MorphoLogic (http://www.morphologic.hu), SZAK Publishers (http://www.szak.hu). Balázs Kis (kis@morphologic.hu). Rome, 21 May 2003.

Presentation Transcript


  1. IKTA5-146/2002 Development of an Intelligent Translation Memory MorphoLogic http://www.morphologic.hu SZAK Publishers http://www.szak.hu Balázs Kis (kis@morphologic.hu) Rome, 21 May 2003

  2. Project Details
     • Duration: 3 March 2003 – 25 February 2005
     • Budget: total 96.8 M HUF [387 200 €]; funding 57.1 M HUF [228 400 €]
     • Consortium: MorphoLogic Ltd. (84 %), SZAK Publishers Ltd. (16 %)
     • Project leader: dr. Gábor Prószéky

  3. The Problem and Its Impact (1.)
     • Current state-of-the-art translation memories
        • store previously translated segments and their translations
        • offer look-up for similar source segments, backed by character-based fuzzy indexes
     • Advantage:
        • this approach is language-independent and inexpensive to develop and support

  4. The Problem and Its Impact (2.)
     • Disadvantages of current TM technologies
        • they ignore relationships between syntactic structures, therefore
        • long segments, or those with similar meaning or syntactic structure, often stay hidden, so
        • many segments included in the translation memory are simply lost

  5. Before the project started...
     • MorphoLogic had at hand
        • Human Language Technology modules, from morphology to every level of syntactic parsing
        • a localisation department with very specific technological needs (still pending)
     • SZAK Publishers had at hand
        • many years' experience with translation and terminology
        • a parallel corpus of technical texts of approx. 1.5 million words (being processed for project needs)

  6. Main Objective
     • Development of a Translation Memory equipped with Linguistic Intelligence
        • finding source segments based on their grammatical similarity
        • adapting stored translations to the current source segment
     • Long-term objective:
        • improved translation quality and reduced translation effort (time)

  7. Project Constraints
     • An important remark:
        • This will be a language-dependent translation memory (linguistic intelligence assumes language-specific HLT modules)
     • First phase: using English and Hungarian HLT modules

  8. Project Contents
     • The result is an integrated CAT tool (CAT = Computer Assisted Translation)
     • The tool consists of
        • a terminology management module (already available)
        • a text alignment program
        • a translation memory

  9. Project Phases
     • Planning and Specification (completed)
     • Corpus Building
     • Core Research Phase: Development of Grammatical Proximity Search and Translation Correction modules
     • Implementation of Database Engine
     • Integration and Test Translation

  10. Grammatical Proximity Search
     • Research on Non-Exact Matching of Phrases and Sentences (this is not fuzzy matching!)
     • A procedure for matching grammatical structures normalized by means of syntactic and semantic features
     • Critical evaluation of some "traditional" procedures
     • Research on Adapting Stored Translations to the current source segment

  11. A sample match
     • Stored source segment: FrontPage opens the current page in Page view.
     • Stored translation: A FrontPage az aktuális oldalt a Page nézetben nyitja meg.
     • Current source segment (recognized): Word opens the second file in Print Layout view.
     • Adapted translation: A Word a második fájlt a Print Layout nézetben nyitja meg.
     • Traditional TMs do not find a match with the default 70% threshold!

  12. Expected Results...
     • Experiments start: Autumn 2003
     • First Test Version: End of 2003

  13. Further Steps
     • Making the tool known in Hungary and abroad
     • Improvement of Services based on User Feedback
     • Addition of Further Language Pairs
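
The sample match on slide 11 can be approximated with a short Python sketch. It is not the project's method: the slides do not describe the actual algorithm, so everything here is an assumption made for illustration. `LEXICON` is a hypothetical hand-made term list, `skeleton` and `adapt` are invented names, and plain term substitution stands in for the grammatical normalization the slides mention. `fuzzy_score` uses `difflib.SequenceMatcher` as a stand-in for a character-based fuzzy index, so its value will not necessarily equal the score a commercial TM would report.

```python
import re
from difflib import SequenceMatcher

# Hypothetical toy lexicon: English term -> Hungarian form used in the target.
# Articles are stored with the noun phrases because choosing "a" vs "az"
# needs real morphology, which this sketch does not attempt.
LEXICON = {
    "FrontPage": "FrontPage",
    "Word": "Word",
    "current page": "az aktuális oldalt",
    "second file": "a második fájlt",
    "Page": "Page",
    "Print Layout": "Print Layout",
}


def _term_pattern(terms):
    """Regex matching any of the terms as whole words, longest term first."""
    ordered = sorted(terms, key=len, reverse=True)
    return re.compile("|".join(r"\b" + re.escape(t) + r"\b" for t in ordered))


def fuzzy_score(a, b):
    """Character-based similarity in [0, 1] (stand-in for a TM fuzzy index)."""
    return SequenceMatcher(None, a, b).ratio()


def skeleton(segment):
    """Replace known terms with numbered slots; return (skeleton, fillers)."""
    fillers = []

    def to_slot(match):
        fillers.append(match.group(0))
        return "<%d>" % (len(fillers) - 1)

    return _term_pattern(LEXICON).sub(to_slot, segment), fillers


def adapt(stored_source, stored_target, current_source):
    """Return the stored translation adapted to current_source, or None."""
    stored_skel, stored_fill = skeleton(stored_source)
    current_skel, current_fill = skeleton(current_source)
    if stored_skel != current_skel:
        return None  # the structures differ, so this entry is not reused
    # Map each stored target-side term to the target-side form of its new filler.
    swaps = {LEXICON[old]: LEXICON[new]
             for old, new in zip(stored_fill, current_fill)}
    if not swaps:
        return stored_target  # no slots at all: the segments are identical
    return _term_pattern(swaps).sub(lambda m: swaps[m.group(0)], stored_target)


if __name__ == "__main__":
    src = "FrontPage opens the current page in Page view."
    tgt = "A FrontPage az aktuális oldalt a Page nézetben nyitja meg."
    cur = "Word opens the second file in Print Layout view."
    print("character-based score:", round(fuzzy_score(src, cur), 2))
    print("shared skeleton:", skeleton(cur)[0])
    print("adapted translation:", adapt(src, tgt, cur))
```

The two segments share few characters but reduce to the same skeleton, which is the situation slide 11 describes. Storing the article inside the lexicon entry is a shortcut; deciding it properly is one reason the slides call for language-specific HLT modules.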
