Machine Translation at DARPA

Presentation Transcript


  1. Machine Translation at DARPA • Joseph Olive, Program Manager

  2. Agenda • Pre-GALE Programs and Studies • DARPA and the Language Community • GALE Plans • GALE MT Evaluation • GALE Accomplishments • Future Research Approved for Public Release, Distribution Unlimited

  3. Language Research at DARPA • Four Decades of Research • Continuous progress • Limited vocabulary single talker • Speaker-independent speech recognition • Large vocabulary • Machine translation • Natural language processing • TIDES and EARS • Great Accomplishments • Need for a New Program

  4. GALE Program Goal • Enable Automated Processes & English-Speaking Soldiers and Commanders to Absorb & Analyze All Incoming Information in a Timely Manner • Languages • Arabic • Chinese • ... • Genres • Newswire • Broadcast news • Newsgroups • Talk shows • ... • Topics • Unbounded

  5. Planning for GALE • The community offered: • More Data • Evaluations • Word Error Rate - WER • Bilingual Evaluation Understudy - BLEU • DARPA Questions: • What are the applications for the research? • When is a technology good enough? • What is new? • How will progress be measured?

  6. Pre-GALE Studies • Main question – how good is good enough? • New MT study • Interpolation between human and machine translation • Analysts as subjects • The birth of Human-Targeted Translation Error Rate - HTER • HTER is the GALE MT metric

  7. HTER Translation Evaluation • Accuracy = 1 – (No. of errors / No. of words) • [Diagram labels: Foreign Language Text & Speech; GALE Machine Translation Engine; Translators; GALE Machine Translation Evaluators (human editors who conduct comparison); Adjudicator (Which is right? Can it be ambiguous? Is it an idiom?); Gold Standard Translation]

  8. HTER Editing Example • Human-translated reference: The statement said that “your brothers in the military wing of the Al-Qaeda Jihad Organization in Mesopotamia carried out an assassination of one of the criminal tyrants in the city of Baquba.” • Machine translation: The statement said that the brothers in the military wing to regulate Al Jihad base in the country had carried out the assassination of one of the criminals in the city of penalty. • Corrected machine translation (deletions and insertions are marked on the slide; intermediate builds show 1, 5, and 6 errors): The statement said that your brothers in the military wing of the Al Qaeda Jihad organization in Mesopotamia had carried out the assassination of one of the criminal tyrants in the city of Baquba. • 11 errors in 33 words (67% accuracy)
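The accuracy figure on the editing example follows directly from the formula on the HTER evaluation slide. The sketch below only reproduces that arithmetic; the function name `hter_accuracy` and its argument names are illustrative choices, not part of the presentation, and the error count is taken as given rather than computed from the texts.

```python
def hter_accuracy(num_errors: int, num_words: int) -> float:
    """Accuracy = 1 - (number of errors / number of words), as on the slide."""
    if num_words <= 0:
        raise ValueError("num_words must be positive")
    if num_errors < 0:
        raise ValueError("num_errors must be non-negative")
    return 1.0 - num_errors / num_words


# Figures from the editing example: 11 errors in 33 words.
accuracy = hter_accuracy(num_errors=11, num_words=33)
print(f"{accuracy:.0%}")  # prints 67%
```

Counting the edits themselves is the hard part in practice, since it depends on human editors and an adjudicator; this snippet assumes that count is already available.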

  9. New Technologies Implemented in GALE • Topic-Dependent Language Modeling • Morphology • Extraction • Syntax Analysis • Hierarchical Classes • Long Distance Language Models • Semantic Analysis • Predicate Argument Analysis

  10. Arabic Translation Targets – Structured Language • [Chart: accuracy (%) and % of documents exceeding accuracy targets, by phase (Baseline, Φ1–Φ5), for translation from text and translation from speech, with pre-GALE and completed levels marked] • Targets include accuracy and consistency (% accuracy / % of documents)
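A target written as "% accuracy / % of documents" can be read as a two-part test: a minimum accuracy per document, and a minimum share of documents that reach it. The sketch below checks a set of per-document accuracies against such a pair. The function name `meets_target`, the use of "at least" rather than strictly "exceeding", and the sample scores are all assumptions made for illustration, not data from the presentation.

```python
from typing import Sequence


def meets_target(doc_accuracies: Sequence[float],
                 min_accuracy: float,
                 min_doc_share: float) -> bool:
    """Return True if enough documents reach the accuracy threshold.

    For a hypothetical target of 75/90, pass min_accuracy=0.75 and
    min_doc_share=0.90.
    """
    if not doc_accuracies:
        raise ValueError("doc_accuracies must not be empty")
    passing = sum(1 for acc in doc_accuracies if acc >= min_accuracy)
    return passing / len(doc_accuracies) >= min_doc_share


# Hypothetical per-document accuracies, invented for this example.
scores = [0.80, 0.90, 0.76, 0.70]
print(meets_target(scores, min_accuracy=0.75, min_doc_share=0.90))
```

Here three of the four invented documents reach 75% accuracy, which is a 75% share, so a 75/90 target would not be met even though the average accuracy is above 75%. That is the sense in which the second number measures consistency.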

  11. Arabic Translation Results – Newswire • [Chart: % accuracy versus % of documents, with the Phase 4 target marked]

  12. Arabic Progress • [Chart: % error for Formal Text, Formal Audio, Semi-Formal Text, and Semi-Formal Audio]

  13. Chinese Progress • [Chart: Formal Text, Formal Audio, Semi-Formal Text, and Semi-Formal Audio]

  14. Human vs. Machine • GALE is as good as a single human in Arabic

  15. Improving Translation of Chinese Speech • Chinese transcription error rates are extremely low, but increase along with perplexity • Improvement in translation of Chinese speech will require work in lowering perplexity

  16. Phoneme Transcription Experiment, Human Vs. Machine • Overall Goal • Assess the bounds of human phonetic recognition and compare with machines • Previous Work • Human recognition tested on artificial stimuli • Results show that human accuracy is extremely high • Artificial stimuli lack the complexity of natural speech • The Problem • Isolate phonetic recognition from language biases • Human phonetic discrimination abilities are intimately tied with language, phonotactic and prosodic processing, and lexical and semantic familiarity • Solution • Use natural speech for stimuli • Use transcribers who lack prosodic, phonotactic, lexical, and semantic information, but share a phoneme space

  17. Phoneme Transcription Experiment, Human Vs. Machine • Japanese speakers – Italian transcribers • 15 Human Subjects • 420 phonemes per subject • The difference between human and machine performance was around 10% • Result indicates that progress in STT will require improved language models

  18. Systems in Use Today • BBN Broadcast Monitoring System & Web Monitoring System: real-time translation of Arabic, Chinese, Spanish*, or Farsi* broadcasts and web text into English • IBM Translingual Automated Language Exploitation System: real-time translation of Arabic, Chinese, Spanish*, or Farsi* broadcasts and web text into English • “We are excited about the upgrades and think the program is a great asset to the Global War on Terror and beyond.” – SFC Douglas Wilderman, 10th Special Forces Group (A) (Nov. 2008) • “The Baghdad system was under extensive operation and the users were very pleased with its capability.” – LTC John Venhaus, commanding officer for Joint PSYOP Group at CENTCOM (Oct. 2007) • *Farsi and Spanish were funded by outside sources.

  19. Broadcast Monitoring System* Arabic example • 1: Real-time streaming video (~5 min delay) • 2: Automatic transcription of Arabic speech • 3: Automatic translation of Arabic transcript • Sample Fielded Arabic Translation: “Although there are no official sources, and accurate numbers of dead, many believe that the number this year is the largest since the American invasion of Iraq and the fall of Saddam Hussein’s regime two thousand three. The estimated number of civilians killed daily in Iraq at least one hundred and twenty persons as well as the wounded.”

  20. DARPA Present Status • Success • GALE – Groundbreaking improvements in machine translation of Arabic and Chinese text and speech, in some cases approaching human performance • TRANSTAC – New state of the art in two-way multilingual communication by speech for tactical use • Deployment – GALE and TRANSTAC technologies have been integrated into operational systems and transitioned to users.

  21. DARPA Present Status (Continued) • Limitations • Lack of Flexibility – No ability to communicate in or monitor informal language • Conversations, chat, messaging, etc. are mostly informal • Technology does not exist to cope with informal language models • Lack of Reliability – Error propagation in multiple dialogue turns • To perform multi-turn conversations and chat we need extremely high translation accuracies • Need human-machine dialogue to clarify and disambiguate input to reduce probability of error • Lack of Robustness – No capabilities to translate speech signals of less than 25 dB SNR • Conversing and monitoring of conversation often do not take place over a clean signal • Transcriptions of degraded signals are unusable • Lack of Generality – Costly and time-consuming methods to develop a new language • Cannot duplicate the GALE effort for each new language and dialect • Huge parallel corpora – $60M-$160M/language • Parallel corpora are insufficient • e.g. Chinese corpora already consist of 200 million words • Requires expensive and time-consuming annotations

  22. Future Language Research Areas • One-way translation – Monitoring • Improvement of translation quality in languages very different from English (e.g. Chinese) • Inclusion of informal genres – conversation, e-mail, web chat, messaging • Extension into Arabic dialects – Modern Standard Arabic is seldom used in informal genres • Fast acquisition of new language capabilities • Robustness to noise • Two-way translation – Communication • Human-machine dialogue • Human-human and human-computer verbal and text interaction • Information retrieval – linguistically enabled search • Accurate retrieval of relevant, non-redundant information • Natural language query capability • Language Understanding • Grounded language comprehension through experiential learning of objects, actions, and consequences • These four thrusts share many underlying technologies

  23. Future Algorithm Research • Rugged Syntactic, Semantic Role Labeling, and Predicate-Argument Analysis • Unconstrained topics and genres • Use semantic equivalences • Analysis of incomplete sentences and/or analysis of inconclusive acoustic output • Projection of syntax and SRL from known to unknown languages • Powerful Language Models • Modeling non-adjacent words • Utilizing syntactic and semantic information • Using wild cards for incomplete sentences and/or inconclusive acoustic output • Analysis and Translation of Longer Input • Discourse threading • Prosodic cues • Coherency of topics • Co-reference resolution • Content analysis

  24. Future Algorithm Research (Continued) • Increasing reliability of two-way communication and natural language query • Human-machine dialogue for clarification and disambiguation • Automatic error detection • Ambiguity resolution • Language generation • Multimodal input • Semantic Role Labeling and Dependency Parsing Analysis in Both Source and Target Languages • Dialects • Translation from one dialect to another (e.g. Modern Standard Arabic to dialectal Arabic) • Dialect detection and identification • New Techniques in Automatic Evaluation of Translation Quality as a Target for Optimization and Automatic Quality Assessment • Language Understanding

  26. Abstract: Defense Advanced Research Projects Agency (DARPA) Program Manager Joseph Olive will discuss the Chinese and Arabic machine translation work being carried out under DARPA's Global Autonomous Language Exploitation Program. Topics will include preparation for the program, the evaluation paradigm, the current status, and potential future research directions.
