
Machine Translation, Digital Libraries, and the Computing Research Laboratory

Indo-US Workshop on Digital Libraries, June 23, 2003. The Computing Research Laboratory (CRL), New Mexico State University, Las Cruces, New Mexico. http://crl.nmsu.edu. Stephen Helmreich, (505) 646-2141.


Presentation Transcript


  1. Machine Translation, Digital Libraries, and the Computing Research Laboratory Indo-US Workshop on Digital Libraries June 23, 2003

  2. The Computing Research Laboratory (CRL) New Mexico State University Las Cruces, New Mexico http://crl.nmsu.edu Stephen Helmreich (505) 646-2141 Shelmrei@crl.nmsu.edu

  3. Machine Translation (MT) • Component technologies • Comparable technologies • Composed technologies

  4. MT -- Purposes • Dissemination (high quality): sublanguages, controlled languages • Assimilation (broad coverage) • Communication (speed)

  5. MT -- Types • Direct – string-for-string • Transfer – structure-for-structure • Interlingual – to and from a meaning representation • Statistical – most probable translation given a corpus

  6. Component technologies -- I • Character encoding and representation, text editing (Unicode) • Text segmenting (OCR, sandhi?) • Morphological analysis • Lexical annotation (part of speech tagging, proper name identification, others)

  7. Component technologies -- II • Syntactic analyzers (grammars, parsers) • Bilingual/multilingual dictionaries • Ontologies (WordNet, OntoSem, Cyc)(lexical, linguistic, world-knowledge) • Generation systems

  8. Comparable technologies • Information Retrieval (IR) (URSA) • Information Extraction (IE) (MUC) • Text Summarization (DUC) • Word Sense Disambiguation (SensEval) • Cross-Document Named Entity Identification (Coreference Resolution)

  9. Composed Technologies • All of the above (IR/IE/Summarization) • multi-lingual • multi-modal • with attention to human-computer interaction (HCI)

  10. Composed technologies -- II • Personal Profiler – searches the web to find information about a particular person, translates it if appropriate, and organizes it in temporal order • Quick Ramp-up MT (Expedition) – allows a non-linguist language user and a computer expert to construct a simple MT system

  11. Question-Answering Systems • Advanced Question Answering for Intelligence (AQUAINT) • MOQA – Meaning-Oriented Question Answering • Allows the user to pose structured or natural-language queries, obtains the answer from a variety of sources, and presents it appropriately

  12. Summary • Choose an appropriate purpose and type • Look at related technologies: component, comparable, composed • Search for an appropriate research partner
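
As a rough illustration of the "direct" (string-for-string) type from slide 5, the sketch below replaces each source word with a dictionary entry and keeps the source word order. It is not taken from the presentation: the lexicon, the Spanish example sentence, and the function name `direct_translate` are all hypothetical, chosen only to show where a direct approach falls short.

```python
# Toy lexicon: hypothetical Spanish-to-English entries, for illustration only.
LEXICON = {"el": "the", "gato": "cat", "negro": "black", "duerme": "sleeps"}


def direct_translate(sentence, lexicon=LEXICON):
    """Replace each source word with its dictionary entry, keeping source word order."""
    words = sentence.lower().split()
    # Words missing from the lexicon are passed through unchanged.
    return " ".join(lexicon.get(word, word) for word in words)


print(direct_translate("El gato negro duerme"))  # the cat black sleeps
```

The output keeps the Spanish noun-adjective order ("cat black"), which is the kind of error that a transfer (structure-for-structure) or interlingual system is meant to address.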
