1 / 21

Machine Translation

Machine Translation. Jacquin Buchanan August 9, 2006. Introduction. VoxTec International develops speech-to-speech translation devices. Chief product, the Phraselator ® P2 already used by thousands of military, law enforcement customers worldwide

paul
Download Presentation

Machine Translation

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Machine Translation Jacquin Buchanan August 9, 2006

  2. Introduction • VoxTec International develops speech-to-speech translation devices. • Chief product, the Phraselator® P2 already used by thousands of military, law enforcement customers worldwide • Lead software architect of this project since early 2001. • Personal mission to create practical solutions for human-to-human interaction using speech and language technologies

  3. Discussion Topics • Machine Translation (MT) Challenges • Current MT Technologies • Several Reasonable Solutions

  4. Setting Expectations • The “Universal Translator” is harder than every other part of the Star Ship Enterprise • The universal translator is not necessary to meet most needs Impossible Technical Difficulty Possible We’re Here 0 10 Machine Translation Accuracy

  5. Communication Complexities • Artistic Expression of Language • Colloquialisms • Double Entendre • Euphemisms • Satire • Cultural Differences • “You’re a Chicken” is a compliment in some cultures and an insult in others

  6. Added Difficulties of Low Resource Languages • The military is interested in languages with very little resources • Few trained linguistic resources • No written dictionary • Very different morphology than English

  7. Discussion Topics • Machine Translation (MT) Challenges • Current MT Technologies • Several Reasonable Solutions

  8. “Speech-to-Speech” Diagram SPEECH Input Conversion to Input (Source) Language TEXT Automatic Speech Recognizer Conversion to Output (Target) Language TEXT Machine Translation Speech Synthesizer = SPEECH Output

  9. Various Machine Translation Technologies • Pre-Translated Phrases • Statistical Machine Translation • Rule Based Systems • Transfer • Interlingua

  10. Statistical Machine Translation (SMT) • Gather lots of pre-translated phrases • Train the engine based on those phrases • New translations are statistically similar to the original training data based on word order

  11. How Does SMT Define Domain? • By Choosing your training data you are defining the domain • The selection of words • The order of the words • What you train the engine with is what the engine knows

  12. Example Domains • Simpler, Tightly Restrained Topics • Military Check Point • Emergency Medical Technician • Police Traffic Violation • Harder, Complex Topics • Business Negotiation • Interview • Training

  13. Relationship Between Domain Complexity and Translation Accuracy

  14. Discussion Topics • Machine Translation (MT) Challenges • Current MT Technologies • Several Reasonable Solutions

  15. The Good News • We don’t need a universal translator • We can keep our speech simple and avoid using complex patterns • In a dialog we can dynamically recover from misunderstandings

  16. Solution One: Tightly Trained SMT • “Raise your hands” could be: • Doctor saying, “raise you hands over you head and cough, while I listen to your lungs” • Police saying, “drop the weapon and raise your hands over your head” • The three words could be translated differently in these two contexts, so don’t train with both phrases

  17. Solution Two:Phrase Based Translation • For domains with a low complexity, communications can be divided into an explicit set of phrases • Resolve syntactical differences of speech to pre-translated phrases • “Raise your hands” and “Put your hands over your head” are the same concept, therefore are stored as one translation

  18. Example Solution - Phraselator® • English phrases are pre-determined according to a domain, then translated into multiple languages • Translations are recorded by native linguists • A different “Module” is built for each domain of interest

  19. The Conclusion • It is possible to make a handheld speech-to-speech machine translation device that’s practical enough to help people perform their jobs • Primary customer is Military, Law Enforcement and Emergency Medical

  20. Phraselator® P2 Demonstration

  21. THANK YOU! Jacquin Buchanan Director of Software R&D jacquin.buchanan@voxtec.com VoxTec International, Inc 20 Ridgely Ave. Ste. 201 Annapolis, MD 21401 410.626.1110 Phraselator® Translation Products www.phraselator.com

More Related