Embedded MT Systems

Presentation Transcript


  1. Embedded MT Systems
     Definition: A computational system with one or more MT engines embedded among its components.
     These systems accept various well-formed and degraded types of multilingual and multi-modal input, including
     * hard-copy pages (original and OCR-ed images)
     * online files (web pages, word processing files, email, chat)
     * video (image and text)
     * speech (natural signal, automatic and human transcription)
     From this range of input, such systems enable users to access the original, foreign language information in their own language.
     -- end-to-end performance depends on preprocessing modules’ level of accuracy [negative: noisy input to MT], or range of user input [negative: user error]
     -- as technology for preprocessing modules and user interfaces improves, overall system performance can improve [positive]

  2. Examples of Embedded MT Systems
     Background to Special Issue
     • AMTA’98 Workshop: Diplomat (CMU), CyberTrans (Mitre), FALCon (ARL), LinguaNet (CBS)
     • NAACL/ANLP’00 Workshop: Closed-Caption MT (Simon Fraser U), CLIR (JHU et al.), Riptides (Cornell et al.)
     Papers in Special Issue - Grouped by System Designs
     • Preprocessor + MT engine + Postprocessor (Bangalore & Riccardi, Gao et al., Lee et al.)
     • User interface front end + MT engine back end (Langlais et al., Dorr et al.)
     • Informant interface + MT build module + MT engine (Levin et al., Nirenberg et al.)
     • Platforms with Plug and Play Components, multiple MT engines (Hansen & Sorenson, Voss & Fisher)

  3. Embedded MT System Design: Preprocessor + MT engine + Postprocessor
     [Diagram labels: preprocess, MT Engine(s), postprocess]
     • Preprocessing of noisy (non-character) input necessary
     • Cascading of errors through modules
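The cascade described on this slide can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in and not taken from any of the cited systems: `preprocess`, `mt_engine`, `postprocess` and `TOY_LEXICON` are toy placeholders, and `end_to_end_accuracy` assumes that stage errors are independent, which real modules need not satisfy.

```python
from typing import Callable, List

Stage = Callable[[str], str]

def run_pipeline(stages: List[Stage], source: str) -> str:
    """Pass the input through each stage in order.

    Each stage sees only the previous stage's output, so an upstream
    error is handed on unchanged to everything downstream.
    """
    text = source
    for stage in stages:
        text = stage(text)
    return text

def end_to_end_accuracy(stage_accuracies: List[float]) -> float:
    """Rough estimate: product of per-stage accuracies.

    Assumption (not from the slides): stage errors are independent.
    """
    result = 1.0
    for accuracy in stage_accuracies:
        result *= accuracy
    return result

# Hypothetical stand-ins for real modules.
TOY_LEXICON = {"bonjour": "hello", "monde": "world"}

def preprocess(text: str) -> str:
    # Toy cleanup: collapse runs of whitespace left by a recognizer.
    return " ".join(text.split())

def mt_engine(text: str) -> str:
    # Toy word-for-word lookup; unknown tokens pass through untranslated.
    return " ".join(TOY_LEXICON.get(word.lower(), word) for word in text.split())

def postprocess(text: str) -> str:
    # Toy cleanup: capitalize the first character.
    return text[:1].upper() + text[1:]

STAGES = [preprocess, mt_engine, postprocess]

clean = run_pipeline(STAGES, "  bonjour   monde ")   # "Hello world"
noisy = run_pipeline(STAGES, "  bonjour   m0nde ")   # "Hello m0nde"
```

The OCR-style error in the second input ("m0nde" with a zero) survives preprocessing, misses the lexicon, and reaches the user untranslated. Under the independence assumption, a 90% accurate recognizer feeding an 80% accurate engine gives roughly 72% end to end.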

  4. System Design: Preprocessor + MT engine + Postprocessor
     [Diagram, Bangalore & Riccardi. Labels: Ltd domain speech, preprocess, Speech Recog, Transcribed speech, MT Engine, postprocess, Call Routing, Composed FST Models, noise, N]

  5. System Design: Preprocessor + MT engine + Postprocessor
     [Diagram, Gao et al. Labels: Ltd domain speech, preprocess, Speech Recog, MT Engine, postprocess, Speech Generation, N]
     [Diagram, Lee et al. Labels: Read speech, preprocess, Speech Recog, MT Engine, postprocess, Speech Generation, noise, N]

  6. Embedded MT System Design: User interface front end + MT engine back end
     [Diagram labels: Front End, Back End, MT Engine]
     • MT engine resides on Back End, User Interface is Front End GUI
     • Room for further development of GUI that enables system developers to monitor how the user is making use of the system & how to improve it
     • feedback loop developed, task-oriented MT
     • all input typed by user (manual text entry is bottleneck, need human error correction or prediction or completion or constraints)
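One way to ease the manual text entry bottleneck named above is prefix completion in the front end. The sketch below is only an illustration: the `complete` function and the scored `lexicon` are hypothetical, and it does not represent how any of the cited systems generate their suggestions.

```python
from typing import Dict, List

def complete(prefix: str, lexicon: Dict[str, float], k: int = 3) -> List[str]:
    """Return up to k lexicon entries that start with prefix.

    Entries are ordered by descending score, with ties broken
    alphabetically so the result is deterministic.
    """
    matches = [word for word in lexicon if word.startswith(prefix)]
    return sorted(matches, key=lambda word: (-lexicon[word], word))[:k]

# Hypothetical scored lexicon; in a real system the scores might come
# from a language model or from the user's own typing history.
lexicon = {
    "translate": 0.50,
    "translation": 0.30,
    "transfer": 0.15,
    "engine": 0.05,
}

suggestions = complete("trans", lexicon, k=2)  # ['translate', 'translation']
```

The front end would call `complete` after each keystroke and let the user accept a suggestion instead of typing the rest of the word.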

  7. System Design: User interface front end + MT engine back end
     [Diagram, Langlais et al., TransType. Labels: Front End, L1, L2, Back End, Statistical Engine, Trans Model, Lang Model, User Lexicon]

  8. System Design: User interface front end + MT engine back end
     [Diagram, Dorr, Levow and Lin. Labels: Front End, L1 User query, Back End, MT Engine (L1 -> L2), MT Engine (L2 -> L1), Lexical Resources, L2 IR, L2]

  9. Embedded MT System Design: Elicitation module + MT build module + MT engine
     • informant participates in elicitation process during development time vs. MT engine build-time vs. MT engine run-time
       -- MT system is built based on elicited knowledge provided by bilingual informant
       -- pre-established sequencing of guided elicitation is critical
       -- Standalone MT engine is result
     • feedback loop allows for MT output to be viewed by bilingual informant & system developers who can modify MT engine (via rules, features, depending on engine design)
     • focus: experimental methodology
     • For low resource languages (these are research systems at early stage of development)
     • linguistically motivated choice of elicited knowledge
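The separation of development time, build time and run time can be sketched in a few lines. This is a deliberately simplified toy: `build_engine` is a hypothetical name, and an exact-match phrase table stands in for the learned transfer rules that the research systems actually aim at.

```python
from typing import Callable, Dict, List, Tuple

def build_engine(elicited: List[Tuple[str, str]]) -> Callable[[str], str]:
    """Build time: compile elicited (L1, L2) pairs into a standalone engine.

    Toy assumption: the "engine" is an exact-match lookup table.
    """
    table: Dict[str, str] = dict(elicited)

    def translate(text: str) -> str:
        # Run time: no informant involved; unknown input passes through.
        return table.get(text, text)

    return translate

# Development time: pairs supplied by a (hypothetical) bilingual informant.
elicited = [("good morning", "buenos días")]
engine = build_engine(elicited)

# Feedback loop: the informant sees an untranslated output, supplies the
# missing pair, and the engine is rebuilt.
elicited.append(("thanks", "gracias"))
rebuilt = build_engine(elicited)
```

The first `engine` is unaffected by the later correction, because the table was fixed at build time; only `rebuilt` reflects the informant's feedback.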

  10. System Design: Elicitation module + MT build module + MT engine
      [Diagram, McShane et al., Expedition. Labels: L1, L2, Elicitation Module, Interface, MT Build Module, MT Engine, Learned Transfer rules, Hand-crafted Rules, forms, Language Corpus, Dictionaries, Computational Linguist/Computer Scientist, bilingual Informant (L1, L2)]

  11. Probst et al., Avenue Project
      [Diagram. Labels: L1, L2, MT, Elicitation Module + MT Rule Learning Module + Engine, Control Process, Elicitation Interface, Elicitation Corpus, Dictionaries,…, Word-aligned, HT*, elicited minimal pairs, Learning Process, Learned Transfer rules, Handcrafted Rules, forms, Transfer Rules, Parsing, Transfer, Generation, Computational Linguist/Computer Scientist, Bilingual Informant (L1, L2)]
      HT*: Human Translation

  12. Embedded MT System Design: Platforms with Plug and Play Components, multiple MT engines
      • customized for user groups during integration time, augmented user-specific lexicons/glossaries
      • noisy input (degraded documents, human spelling errors)
      • multimodal input
      • iterative development of design, includes user feedback
      • focus: extensibility of platform via new technologies, upgraded components
      • multiple languages (these are operational systems with specific applications)
      • user-provided domain knowledge (text, images, databases)
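A plug-and-play platform of this kind can be sketched as a registry of engines keyed by language pair, with a user glossary layered on top. The `Platform` class, its methods and the word-level toy engine below are all hypothetical illustrations, not the design of any system named in these slides.

```python
from typing import Callable, Dict, Tuple

Engine = Callable[[str], str]

class Platform:
    """Toy platform: engines can be added or swapped without touching callers."""

    def __init__(self) -> None:
        self._engines: Dict[Tuple[str, str], Engine] = {}
        # User-specific glossary, filled in at integration time.
        self.glossary: Dict[str, str] = {}

    def register(self, src: str, tgt: str, engine: Engine) -> None:
        # Registering the same pair again replaces (upgrades) the engine.
        self._engines[(src, tgt)] = engine

    def translate(self, text: str, src: str, tgt: str) -> str:
        engine = self._engines.get((src, tgt))
        if engine is None:
            raise KeyError(f"no engine registered for {src}->{tgt}")
        # Toy assumption: engines work word by word, and a glossary
        # entry takes priority over the engine's output for that word.
        words = []
        for word in text.split():
            if word in self.glossary:
                words.append(self.glossary[word])
            else:
                words.append(engine(word))
        return " ".join(words)

# Hypothetical word-level engine for one language pair.
FR_EN = {"bonjour": "hello", "char": "cart"}

def fr_en_engine(word: str) -> str:
    return FR_EN.get(word, word)

platform = Platform()
platform.register("fr", "en", fr_en_engine)
platform.glossary["char"] = "tank"  # domain-specific override
result = platform.translate("bonjour char", "fr", "en")  # "hello tank"
```

Keeping the registry separate from the glossary reflects the two kinds of extension on the slide: upgraded components (a new call to `register`) and user-provided domain knowledge (new glossary entries).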

  13. Platform with Plug and Play Components, multiple MT engines
      [Diagram, Hansen & Sorenson, LinguaNet. Labels: Point-to-point communication, Front End, L1, L2, Back End, MT Engines, Table Translation, Knowledge Bases]

  14. Platform with Plug and Play Components, multiple MT engines
      [Diagram, Voss et al., FALCon. Labels: Hardcopy documents, scan, Scene or View of Text, Camera capture, preprocess, OCR, MT Engines, postprocess, DocEx tasks, N]
