
Machine Translation: The Translator’s Choice

Heidi Düchting

Sylke Krämer

Johann Roturier

Outline
  • Background
  • Challenges
  • Solutions
  • Benefits
  • Next steps
  • Conclusions
Commercial Imperatives
  • Effective
    • Time-critical documents in volume
  • Efficient
    • Translation process automation
    • Combining translation technologies
      • workflow
      • TM, MT, and PE tools
  • Control
    • Loose writing guidelines vs. Controlled Language rules
      • Improved machine translatability
Commercial Systems
  • Combine technologies
      • TM with previously machine-translated and post-edited segments for look-up
  • TM systems with MT component
      • Rule-based and example-based
      • Pre-translate phase
      • Towards improved post-editing efficiency?
      • Not available in all systems
  • MT systems with TM component
      • 100% match look-up
Challenges
  • Setting a threshold for TM matches
    • 100% matches only
      • suitable when the objective is to provide MT output for gisting (no post-editing)
      • suitable when the MT system is fully customized and CL environment is in place (no post-editing?)
  • Quick PE
      • New sentences in which only one character changes are sent to the MT engine
        • W32.Beagle.AB is a mass-mailing worm that neither propagates via network shares nor deletes files
        • W32.Beagle.AC is a mass-mailing worm that neither propagates via network shares nor deletes files
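
The effect of the match threshold on the two sentences above can be illustrated with a small sketch. It uses Python's standard `difflib.SequenceMatcher` as a stand-in for a TM fuzzy-match score; commercial TM tools compute their scores differently, and the names `match_score` and `needs_mt` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List

def match_score(new_segment: str, tm_segment: str) -> float:
    """Character-level similarity in [0, 1]; only a stand-in for a TM match score."""
    return SequenceMatcher(None, new_segment, tm_segment).ratio()

def needs_mt(new_segment: str, tm_segments: List[str], threshold: float = 1.0) -> bool:
    """True when no TM segment reaches the threshold, so the segment goes to MT."""
    best = max((match_score(new_segment, s) for s in tm_segments), default=0.0)
    return best < threshold

tm = ["W32.Beagle.AB is a mass-mailing worm that neither propagates "
      "via network shares nor deletes files"]
new = ("W32.Beagle.AC is a mass-mailing worm that neither propagates "
       "via network shares nor deletes files")

send_with_exact_only = needs_mt(new, tm)                  # threshold of 100%
send_with_fuzzy = needs_mt(new, tm, threshold=0.95)       # assumed lower threshold
```

With a 100% threshold the second sentence is sent to the MT engine although it differs by a single character; with the assumed 0.95 threshold it would be served from the TM as a fuzzy match.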
Solutions (1)
  • Two-tier process
      • Leverage Trados TM repository
      • Use MT system to translate unknown segments (Systran Premium 5.0)
      • Use MT output as TM input
  • Determine the export threshold
      • Existing TM segments vs. new controlled segments
        • Uncontrolled: Symantec announced a patch was available
        • CL: Symantec announced that a patch was available
Solutions (2)
  • TMX format
      • obvious choice as the exchange format
      • XLIFF not supported by all MT systems
      • source and target segments

<tu usagecount="1" creationdate="20050301T122255Z" creationid="SUPER">
  <tuv lang="EN-US">
    <seg>Then the worm searches all local and network drives for .gif, .bmp, and .wav files.</seg>
  </tuv>
  <tuv lang="DE-DE">
    <seg>Then the worm searches all local and network drives for .gif, .bmp, and .wav files.</seg>
  </tuv>
</tu>
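
A translation unit like the one above can be read with a standard XML parser. The following is a minimal sketch using Python's `xml.etree.ElementTree`; the element and attribute names are taken from the sample, and the helper `read_tu` is hypothetical. It also shows how an exported unit whose target still equals its source can be detected.

```python
import xml.etree.ElementTree as ET

# The sample unit from the slide, with the target still identical to the source.
TU = """<tu usagecount="1" creationdate="20050301T122255Z" creationid="SUPER">
<tuv lang="EN-US"><seg>Then the worm searches all local and network drives for .gif, .bmp, and .wav files.</seg></tuv>
<tuv lang="DE-DE"><seg>Then the worm searches all local and network drives for .gif, .bmp, and .wav files.</seg></tuv>
</tu>"""

def read_tu(xml_text: str) -> dict:
    """Map each tuv language code to the text of its seg element."""
    tu = ET.fromstring(xml_text)
    return {tuv.get("lang"): tuv.findtext("seg", default="")
            for tuv in tu.findall("tuv")}

segments = read_tu(TU)
# An identical source and target marks a unit that has not been translated yet.
untranslated = segments["EN-US"] == segments["DE-DE"]
```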

Processing TMX
  • Technical issues
      • TMX's various implementations can create discrepancies during the exchange process
      • Identical source and target segment
      • XML parser and TMX header
  • Pre- and post-processing with a single macro
      • Modules to remove and restore sections
      • Environment: VBA
Pre-translation Workflow

  • Step 1: Analyze new document
  • Step 2: Export unmatched segments
  • Step 3: Pre-processing module
  • Step 4: Call to MT system
  • Step 5: Post-processing module
  • Step 6: Import segments into TM
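
The six steps can be sketched as one function. This is only an illustration in Python, not the VBA macro described in the presentation: the TM is modeled as a plain dictionary, and `pre_translate`, `mt`, `pre`, and `post` are hypothetical names standing in for the real TM analysis, MT call, and processing modules.

```python
from typing import Callable, Dict, List

def pre_translate(segments: List[str],
                  tm: Dict[str, str],
                  mt: Callable[[str], str],
                  pre: Callable[[str], str] = str.strip,
                  post: Callable[[str], str] = str.strip) -> List[str]:
    """Run unmatched segments through MT and store the results in the TM."""
    # Steps 1-2: analyze the document and export segments without a TM match
    # (dict.fromkeys removes repeated segments while keeping their order).
    unmatched = list(dict.fromkeys(s for s in segments if s not in tm))
    for source in unmatched:
        prepared = pre(source)      # Step 3: pre-processing module
        raw = mt(prepared)          # Step 4: call to the MT system
        tm[source] = post(raw)      # Steps 5-6: post-process, then import into the TM
    return unmatched

def fake_mt(text: str) -> str:
    # Hypothetical stand-in for a real MT engine; it only tags the text.
    return "[MT] " + text
```

Calling `pre_translate(doc_segments, tm, fake_mt)` returns the segments that were sent to MT and leaves their output in `tm`, so a later look-up finds them as matches.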

Effective pre-translation
  • Efficiency and robustness
      • Refinable
  • Opportunity for modifications
      • Target segments
      • CL environment predictability
      • Frequent errors
  • Ideal scenario
      • Address problems that could not be fixed with CL rules
Towards Automated Post-Editing
  • Surface post-editing
      • No linguistic analysis: no second MT
      • Text processing
      • Frequent errors due to default MT settings
      • Remove drudgery from post-editing
  • Lexical
      • Capitalization (folgende vs. Folgende)
      • Incorrect spelling (neuzustarten vs. neu zu starten)
      • Missing contractions (à le vs. au)
      • Extra words (fichier de .bmp vs. fichier .bmp)
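
Three of the lexical fixes listed above can be expressed as plain text replacements. The sketch below uses Python's `re` module; the rule list and the function name `surface_post_edit` are hypothetical, and real rules would need testing against actual MT output to avoid unwanted matches.

```python
import re

# (pattern, replacement) pairs based on the examples listed above.
RULES = [
    (re.compile(r"\bneuzustarten\b"), "neu zu starten"),   # incorrect spelling
    (re.compile(r"\bà le\b"), "au"),                       # missing contraction
    (re.compile(r"\b(fichier) de (\.\w+)"), r"\1 \2"),     # extra word before an extension
]

def surface_post_edit(text: str) -> str:
    """Apply each rule in order; no linguistic analysis is involved."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text

fixed = surface_post_edit("Envoyez le fichier de .bmp à le serveur")
```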
Towards Automated Post-Editing
  • Syntactic
      • Word order: “Klicken auf Sie” vs. “Klicken Sie auf”
      • Wrong structures (transfer or generation issue): neither … nor (ni ne … ni ne)
  • Textual
      • Formatting: trailing spaces after symbols (backslashes)
      • Punctuation inconsistent with style guide: inverted commas for German
Towards Automated Post-Editing
  • Suitability of the environment
      • Regular expressions support
      • Regular expressions (RE) are a ‘way to describe text through pattern matching’ (Stubblebine 2003: 1)
      • Grouping and Capturing:
      • Match: ([Kk]licken) (auf) (Sie)
      • Replace: \1 \3 \2
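
The match and replace pair above can be tried out as follows. This sketch uses Python's `re` module rather than the VBA environment mentioned earlier; the function name `fix_word_order` is hypothetical, and the pattern is the one from the slide.

```python
import re

# Three capture groups: the verb, the preposition, and the pronoun.
PATTERN = re.compile(r"([Kk]licken) (auf) (Sie)")

def fix_word_order(text: str) -> str:
    """Swap groups 2 and 3 so that 'Klicken auf Sie' becomes 'Klicken Sie auf'."""
    return PATTERN.sub(r"\1 \3 \2", text)

corrected = fix_word_order("Klicken auf Sie OK.")
```

Because the first group captures the verb as it was written, the replacement preserves the original capitalization of "Klicken" or "klicken".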
Next steps
  • New environment
    • GMS integration
      • Centralized interface with content
      • Transport layer
      • MT as plug-in
    • XLIFF format
      • To machine translate unmatched segments
    • PE replacements
      • Fine-tune contextual replacements
Conclusions
  • Combining MT & TM is efficient
      • leverage
      • post-editing is not repeated
      • increased throughput
  • Environment for avoiding errors
      • facilitated when CL rules are introduced
      • Scope of errors is reduced
  • New opportunities for translators
      • Fine-tuning MT user dictionaries
      • Refine automated PE tasks

Thank You

johann_roturier@symantec.com