Natural Language Processing: Using Machine Translation in Creation of a German-English Translator

Jason Ji, Computer Systems Laboratory, 2004-2005

Presentation Transcript


  1. Jason Ji, Computer Systems Laboratory, 2004-2005 • Natural Language Processing: Using Machine Translation in Creation of a German-English Translator

  2. Machine Translation • a field that has been around for decades • several methods exist to solve the problem • none of them resembles human methods • this program attempts to use human methods to translate

  3. Direct Approach • the original translation strategy • translate each word directly via a one-to-one dictionary look-up • then perform some local reordering • doesn't consider semantic information
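The direct approach can be sketched in a few lines. This is a minimal illustration, not the presenter's code: the dictionary entries, the `PARTICIPLES` set, and the single reordering rule are all assumptions chosen to keep the example small.

```python
# Toy English-to-German dictionary; every entry is an illustrative assumption.
DICTIONARY = {
    "i": "ich",
    "have": "habe",
    "seen": "gesehen",
    "the": "der",
    "dog": "Hund",
}

# Hypothetical set of past participles used by the one reordering rule below.
PARTICIPLES = {"gesehen"}


def direct_translate(sentence):
    """Translate word by word, then apply one local reordering rule."""
    words = sentence.lower().rstrip(".").split()
    # Step 1: one-to-one look-up; unknown words pass through unchanged.
    translated = [DICTIONARY.get(word, word) for word in words]
    # Step 2: local reordering, moving the first past participle to the end.
    for i, word in enumerate(translated):
        if word in PARTICIPLES:
            translated.append(translated.pop(i))
            break
    return " ".join(translated)


print(direct_translate("I have seen the dog"))
```

The result keeps the nominative article "der" where German requires the accusative "den", which shows the weakness named on the slide: nothing in a word-for-word look-up knows that "the dog" is a direct object.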

  4. Indirect Approach • Interlingua and Transfer Approaches • translate from the source language into an intermediary, non-natural language that carries semantic information, etc. • translate from the intermediary into the target language • other methods • knowledge-based, etc.: more complicated, less human-like
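As a rough sketch of the two-step idea, the example below maps one fixed sentence pattern into a language-neutral frame and then generates German from that frame. The frame layout, the concept names, and both function names are hypothetical; real interlingua and transfer systems are far richer.

```python
# Hypothetical concept inventory for the intermediary representation.
CONCEPTS = {"i": "SPEAKER", "see": "PERCEIVE_VISUALLY", "dog": "DOG"}


def analyze(sentence):
    """Source -> intermediary: handles only 'subject verb article object'."""
    subject, verb, _article, obj = sentence.lower().rstrip(".").split()
    return {
        "agent": CONCEPTS[subject],
        "action": CONCEPTS[verb],
        "patient": CONCEPTS[obj],
    }


def generate_german(frame):
    """Intermediary -> target: surface forms are hard-coded for this toy case."""
    agent = {"SPEAKER": "Ich"}[frame["agent"]]
    # Already conjugated for a first-person singular agent (an assumption).
    action = {"PERCEIVE_VISUALLY": "sehe"}[frame["action"]]
    # The patient role is rendered in the accusative case.
    patient = {"DOG": "den Hund"}[frame["patient"]]
    return f"{agent} {action} {patient}."


print(generate_german(analyze("I see the dog")))
```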

  5. Theory • no current translation method is 100% effective • no current translation method closely resembles the human approach • humans can be 100% effective translators • therefore, could applying the human approach to machines yield more effective translations?

  6. Overview of Method • split the input string into individual words • first look-up: a list that maps each word to its part of speech • second look-up: each part-of-speech-specific list maps each word to its translation and semantic information • past-tense forms, irregular conjugations, etc.
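The two look-ups could be sketched as follows. In-memory dictionaries stand in for the text files, and the field layout of each entry is an assumption made for illustration; the slides do not show the actual file format.

```python
# Stage 1: word -> part of speech (stands in for the master word list).
POS_LIST = {"i": "pronoun", "see": "verb", "the": "article", "dog": "noun"}

# Stage 2: one table per part of speech, word -> (translation, extra info).
# The keys inside each info dict are hypothetical.
POS_TABLES = {
    "pronoun": {"i": ("ich", {"person": 1, "number": "sg"})},
    "verb": {"see": ("sehen", {"past": "saw"})},
    "article": {"the": ("der", {"definite": True})},
    "noun": {"dog": ("Hund", {"gender": "m", "plural": "Hunde"})},
}


def look_up(sentence):
    """Return (word, part of speech, translation, info) for every word."""
    entries = []
    for word in sentence.lower().rstrip(".").split():
        pos = POS_LIST[word]  # raises KeyError for an unknown word
        translation, info = POS_TABLES[pos][word]
        entries.append((word, pos, translation, info))
    return entries


for entry in look_up("I see the dog"):
    print(entry)
```

Like the fragility described later in the deck, this sketch simply raises an error on any word that is missing from the first list.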

  7. Development • Assumptions: • article must precede noun • preposition must be followed by a noun • verb must be preceded by a noun • look-up: • find word in list.txt, then redirect to other text files
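One way to make the three word-order assumptions explicit is to check them against the sequence of part-of-speech tags. The sketch below reads each rule as strict adjacency and lets pronouns count as nouns; both readings are assumptions, since the slides do not say how strictly the rules were applied.

```python
# Assumption: a pronoun satisfies a rule that asks for a noun.
NOUN_LIKE = {"noun", "pronoun"}


def check_assumptions(tags):
    """Return a list of violations of the three word-order assumptions."""
    problems = []
    for i, tag in enumerate(tags):
        following = tags[i + 1] if i + 1 < len(tags) else None
        preceding = tags[i - 1] if i > 0 else None
        if tag == "article" and following != "noun":
            problems.append(f"article at {i} is not followed by a noun")
        if tag == "preposition" and following not in NOUN_LIKE:
            problems.append(f"preposition at {i} is not followed by a noun")
        if tag == "verb" and preceding not in NOUN_LIKE:
            problems.append(f"verb at {i} is not preceded by a noun")
    return problems


print(check_assumptions(["pronoun", "verb", "article", "noun"]))
print(check_assumptions(["article", "verb"]))
```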

  10. Development • each line of semantic information is split into its parts • various subclass objects (Noun, Verb, etc.) are created • pronouns are created in the nominative case • articles are created with gender and case unidentified • verbs are created in infinitive form
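A minimal sketch of this step, assuming a hypothetical line format of `english pos german [extra fields]`. The class names Noun and Verb come from the slide; the base class, the field names, and `build_word` are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Word:
    english: str
    german: str


@dataclass
class Noun(Word):
    gender: str = "m"
    plural: str = ""


@dataclass
class Verb(Word):
    form: str = "infinitive"  # verbs start out in the infinitive


@dataclass
class Pronoun(Word):
    case: str = "nominative"  # pronouns start out in the nominative


@dataclass
class Article(Word):
    gender: Optional[str] = None  # unidentified until a later pass
    case: Optional[str] = None


def build_word(line):
    """Split one line of the (hypothetical) format and build an object."""
    english, pos, german, *extra = line.split()
    if pos == "noun":
        return Noun(english, german, gender=extra[0], plural=extra[1])
    if pos == "verb":
        return Verb(english, german)
    if pos == "pronoun":
        return Pronoun(english, german)
    if pos == "article":
        return Article(english, german)
    raise ValueError(f"unknown part of speech: {pos}")


print(build_word("dog noun Hund m Hunde"))
print(build_word("the article der"))
```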

  11. Development • Correct article genders and cases • for an article in position x, check the noun in position x+1 • check the case of nearestModifier() • correct verb conjugations • for a verb in position x, check the subject in position x-1 • search in verblist to find whether the verb is weak or strong • weak: follow the conjugation pattern; strong: read in the conjugations from the list
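Both corrections reduce to table look-ups, sketched below for definite articles and for weak verbs in the present tense. The function names are hypothetical, the genitive case is left out, and the weak-verb rule ignores stems ending in -t or -d (such as "arbeiten"), which need an extra "e"; strong verbs would be read from a list instead, as the slide says.

```python
# German definite articles by case and gender ("pl" = plural).
DEFINITE_ARTICLES = {
    "nominative": {"m": "der", "f": "die", "n": "das", "pl": "die"},
    "accusative": {"m": "den", "f": "die", "n": "das", "pl": "die"},
    "dative": {"m": "dem", "f": "der", "n": "dem", "pl": "den"},
}

# Present-tense endings for weak verbs, keyed by (person, number).
WEAK_ENDINGS = {
    (1, "sg"): "e", (2, "sg"): "st", (3, "sg"): "t",
    (1, "pl"): "en", (2, "pl"): "t", (3, "pl"): "en",
}


def correct_article(noun_gender, case):
    """Pick the article form that agrees with the following noun."""
    return DEFINITE_ARTICLES[case][noun_gender]


def conjugate_weak(infinitive, person, number):
    """Strip the '-en' of the infinitive and attach the regular ending."""
    stem = infinitive[:-2]
    return stem + WEAK_ENDINGS[(person, number)]


print(correct_article("m", "accusative"))   # article for a masculine direct object
print(conjugate_weak("machen", 1, "sg"))    # first person singular of "machen"
```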

  12. Development • Correct pronoun cases • for a pronoun in position x, check the verb or preposition in position x-1 • concatenate all corrected Strings and display the result in a text field
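The pronoun pass could look like the sketch below: the word before the pronoun decides the case, and a pronoun with no governing verb or preposition stays nominative. The `GOVERNED_CASE` table and the function name are assumptions, and only a few pronouns are listed.

```python
# Case forms for a few German personal pronouns.
PRONOUN_FORMS = {
    "ich": {"nominative": "ich", "accusative": "mich", "dative": "mir"},
    "du": {"nominative": "du", "accusative": "dich", "dative": "dir"},
    "er": {"nominative": "er", "accusative": "ihn", "dative": "ihm"},
    "wir": {"nominative": "wir", "accusative": "uns", "dative": "uns"},
}

# Hypothetical table: the case that a preceding verb or preposition requires.
GOVERNED_CASE = {
    "sehen": "accusative",
    "helfen": "dative",
    "mit": "dative",
    "für": "accusative",
}


def correct_pronoun(pronoun, previous_word=None):
    """Return the pronoun form required by the word in position x-1."""
    case = GOVERNED_CASE.get(previous_word, "nominative")
    return PRONOUN_FORMS[pronoun][case]


print(correct_pronoun("ich"))
print(correct_pronoun("ich", "helfen"))
print(correct_pronoun("er", "für"))
```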

  13. Results/Conclusion • I see the dog / Ich sehe den Hund. • correct for the pronoun, the present-tense verb conjugation, and the direct-object case • The cats help the dogs / Die Katzen helfen den Hunden. • correct for the nominative plural, the verb conjugation, and the dative plural required by the verb

  14. Results/Conclusion • The cats are the dogs / RUNTIME ERRORS • fails with irregular verbs in English, including “to be” • The cats ate the pie / Die Katzen essen die Torte. • fails with past-tense verbs • recognizes a past-tense verb, but does not correct it (the output keeps the present-tense “essen”)

  15. Results/Conclusion • Succeeds in limited goals • not practical or applicable to real-world use • highly fragile • runtime errors for basically anything that doesn’t follow the exact expected form • inefficient: list.txt with 53 words was 4KB; at that rate, a list of 1,000,000 words would be about 75.5MB
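The 75.5MB figure follows from a linear extrapolation of the 53-word file, assuming decimal units (1KB = 1,000 bytes, 1MB = 1,000,000 bytes). The function below is only a check of that arithmetic; its name and parameters are hypothetical.

```python
def estimate_size_mb(word_count, sample_words=53, sample_kb=4):
    """Scale the sample file size linearly to a larger word count."""
    bytes_per_word = sample_kb * 1000 / sample_words  # about 75.5 bytes per word
    return bytes_per_word * word_count / 1_000_000


print(round(estimate_size_mb(1_000_000), 1))
```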
