Development of a German-English Translator

Felix Zhang Period 5 2007-2008 Thomas Jefferson High School for Science and Technology Computer Systems Research Lab. Development of a German-English Translator. Summary of Quarter 2. NP Chunking Lemmatization Dictionary Lookup Inflection Noun-verb agreement. Scope for this quarter.

Felix Zhang

Period 5

2007-2008

Thomas Jefferson High School for Science and Technology Computer Systems Research Lab

Development of a German-English Translator
Summary of Quarter 2
  • NP Chunking
  • Lemmatization
  • Dictionary Lookup
  • Inflection
  • Noun-verb agreement
Scope for this quarter
  • Focus less on statistical methods
  • Get rudimentary grammar system working
  • Fix all the bugs I’ve made since September
New and Modified Components
  • More info stored in NP chunking
  • Better noun-verb agreement
  • Grammar
    • Element Assignment
    • Priority Number Assignment
Noun-verb agreement
  • Simple method to eliminate more ambiguities

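def eliminateother(attribs, sub, closest):
    # Drop the nominative ("nom") reading from every noun other than the
    # subject `sub`. `closest` is not used in this excerpt.
    for x in attribs:
        if x[0][1] == "nou" and x != sub:
            # Iterate over a copy: removing from x[1] while looping over it
            # directly would skip the reading after each removal.
            for y in list(x[1]):
                if y[0] == "nom":
                    x[1].remove(y)
    return attribs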

Noun phrase chunking
  • Now used for English sentences
  • Stores more info for later methods
  • “the man make the children”
  • NP Chunked English: [[['the', 'art'], ['man', 'nou', [['akk', 'mas'], ['dat', 'pl']]]], ['make', 'ver', [['3', 'pl'], 'pres']], [['the', 'art'], ['small', 'adj'], ['child', 'nou', [['nom', 'pl']]]]]
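A minimal sketch of how a chunked structure like the one above could be built from a flat list of tagged tokens. The function name `chunk_noun_phrases` and the grouping rule (any run of articles and adjectives closed by a noun forms one noun phrase) are assumptions for illustration, not the project's actual code.

```python
def chunk_noun_phrases(tokens):
    """Group runs of article/adjective tokens ending in a noun into one chunk.

    Hypothetical helper: each token is [word, tag] or [word, tag, morphology].
    """
    chunks = []
    pending = []  # articles and adjectives waiting for their noun
    for token in tokens:
        tag = token[1]
        if tag in ("art", "adj"):
            pending.append(token)
        elif tag == "nou":
            chunks.append(pending + [token])  # the noun closes the phrase
            pending = []
        else:
            # Any other tag (e.g. a verb) ends an unfinished run and stands alone.
            chunks.extend(pending)
            pending = []
            chunks.append(token)
    chunks.extend(pending)  # leftover modifiers that never met a noun
    return chunks


tokens = [
    ["the", "art"],
    ["man", "nou", [["akk", "mas"], ["dat", "pl"]]],
    ["make", "ver", [["3", "pl"], "pres"]],
    ["the", "art"],
    ["small", "adj"],
    ["child", "nou", [["nom", "pl"]]],
]
chunked = chunk_noun_phrases(tokens)
print(chunked)
```

Keeping the morphological readings attached to the noun inside each chunk is what lets later steps read the case directly from the chunk.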
Element Assignment
  • Based on linguistic information
  • If case is nominative, chunk is subject
  • If accusative, chunk is direct object
  • [[['the', 'art'], ['man', 'nou', [['akk', 'mas'], ['dat', 'pl']]], 'dobj'], ['make', 'ver', [['3', 'pl'], 'pres'], 'mverb'], [['the', 'art'], ['small', 'adj'], ['child', 'nou', [['nom', 'pl']]], 'sub']]
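The case-to-element rule above could be sketched roughly as follows. The names `CASE_TO_ELEMENT` and `assign_elements` are hypothetical, the dative-to-`iobj` mapping is a guess (the slides only state the nominative and accusative rules), and picking the first remaining reading of an ambiguous noun is an assumption chosen because it reproduces the example.

```python
# Assumed mapping from case to sentence element.
# "nom" and "akk" follow the slide; "dat" -> "iobj" is a guess.
CASE_TO_ELEMENT = {"nom": "sub", "akk": "dobj", "dat": "iobj"}


def assign_elements(chunks):
    """Append an element label to each chunk (hypothetical helper)."""
    labelled = []
    for chunk in chunks:
        if isinstance(chunk[0], list):
            # Noun phrase: the noun is the last token, its readings sit at index 2.
            readings = chunk[-1][2]
            case = readings[0][0]  # assumption: use the first remaining reading
            labelled.append(chunk + [CASE_TO_ELEMENT.get(case, "unknown")])
        elif chunk[1] == "ver":
            # This sketch labels every verb as the main verb.
            labelled.append(chunk + ["mverb"])
        else:
            labelled.append(chunk + ["unknown"])
    return labelled


chunks = [
    [["the", "art"], ["man", "nou", [["akk", "mas"], ["dat", "pl"]]]],
    ["make", "ver", [["3", "pl"], "pres"]],
    [["the", "art"], ["small", "adj"], ["child", "nou", [["nom", "pl"]]]],
]
labelled = assign_elements(chunks)
print(labelled)
```

The sketch does not try to tell auxiliary verbs from main verbs; that would need information the slides do not show.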
Priority Assignment
  • Each sentence element is assigned priority number
  • Based on position in English sentence
  • Assignments:
    • sub 1
    • mverb 2
    • auxverb 3
    • iobj 4
    • dobj 5
  • Sort by number for English grammar
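One way to sketch the numbering and sorting step, assuming each element already carries its label as the last item. The names `PRIORITY`, `assign_priority` and `rearrange` are hypothetical; the numbers are the ones listed above, stored as strings to match the program output shown in the full run.

```python
# Priority numbers as listed on the slide.
PRIORITY = {"sub": 1, "mverb": 2, "auxverb": 3, "iobj": 4, "dobj": 5}


def assign_priority(elements):
    """Prefix each labelled element with its priority number as a string."""
    return [[str(PRIORITY[element[-1]])] + element for element in elements]


def rearrange(prioritised):
    """Sort elements into English order by their numeric priority."""
    return sorted(prioritised, key=lambda element: int(element[0]))


elements = [
    [["the", "art"], ["man", "nou", [["akk", "mas"], ["dat", "pl"]]], "dobj"],
    ["make", "ver", [["3", "pl"], "pres"], "mverb"],
    [["the", "art"], ["small", "adj"], ["child", "nou", [["nom", "pl"]]], "sub"],
]
ordered = rearrange(assign_priority(elements))
print([element[-1] for element in ordered])
```

An element with a label outside the table would raise a `KeyError` in this sketch, which makes unhandled element types visible rather than silently misplacing them.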
Full run of program

input: “den Mann machen die kleinen Kinder”

intended translation: “The small children make the man”

fzhang@ltsp1 ~/research $ python proj.py

Part of speech tags: [['den', 'art'], ['Mann', 'nou'], ['machen', 'ver'], ['die', 'art'], ['kleinen', 'adj'], ['Kinder', 'nou']]

Morphological analysis: [[['Mann', 'nou'], [['akk', 'mas'], ['dat', 'pl']]], [['machen', 'ver'], [['1', 'pl'], ['3', 'pl'], 'pres']], [['kleinen', 'adj'], [['nom', 'pl'], ['akk', 'pl']]], [['Kinder', 'nou'], [['nom', 'pl'], ['akk', 'pl']]]]

Disambiguated after noun-verb agreement: [[['Mann', 'nou'], [['akk', 'mas'], ['dat', 'pl']]], [['machen', 'ver'], [['3', 'pl'], 'pres']], [['kleinen', 'adj'], [['nom', 'pl'], ['akk', 'pl']]], [['Kinder', 'nou'], [['nom', 'pl']]]]

Lemmatized: [['Mann', ['Mann', 'Man']], ['machen', ['machen']], ['kleinen', ['klein']], ['Kinder', ['Kind']]]

Root translated: [['den', 'the'], ['Mann', 'man'], ['machen', 'make'], ['die', 'the'], ['kleinen', 'small'], ['Kinder', 'child']]

NP Chunked English: [[['the', 'art'], ['man', 'nou', [['akk', 'mas'], ['dat', 'pl']]]], ['make', 'ver', [['3', 'pl'], 'pres']], [['the', 'art'], ['small', 'adj'], ['child', 'nou', [['nom', 'pl']]]]]

Inflected (only works before chunking):

['the', 'the'] ['man', ['akk', 'mas'], 'man'] ['man', ['dat', 'pl'], 'mans'] ['make', ['3', 'pl'], 'make'] ['the', 'the'] ['small', 'small'] ['child', ['nom', 'pl'], 'childs']

Assigned an element type:

[[['the', 'art'], ['man', 'nou', [['akk', 'mas'], ['dat', 'pl']]], 'dobj'], ['make', 'ver', [['3', 'pl'], 'pres'], 'mverb'], [['the', 'art'], ['small', 'adj'], ['child', 'nou', [['nom', 'pl']]], 'sub']]

Assigned priority:

[['5', ['the', 'art'], ['man', 'nou', [['akk', 'mas'], ['dat', 'pl']]], 'dobj'], ['2', 'make', 'ver', [['3', 'pl'], 'pres'], 'mverb'], ['1', ['the', 'art'], ['small', 'adj'], ['child', 'nou', [['nom', 'pl']]], 'sub']]

Rearranged to English structure:

[['1', ['the', 'art'], ['small', 'adj'], ['child', 'nou', [['nom', 'pl']]], 'sub'], ['2', 'make', 'ver', [['3', 'pl'], 'pres'], 'mverb'], ['5', ['the', 'art'], ['man', 'nou', [['akk', 'mas'], ['dat', 'pl']]], 'dobj']]

Problems
  • Ambiguities (again)
    • One ambiguity can change the entire structure of the sentence
    • “I gave a horse the hat” vs. “I gave the hat a horse”
    • Attempt all possible permutations
      • User disambiguation
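The idea of trying every combination and letting the user choose could be sketched as below. Strictly, this enumerates a Cartesian product of readings (one reading per word) rather than permutations. The function name `all_readings` is hypothetical, the input follows the morphological-analysis layout from the full run, and only nouns are included to keep the example small.

```python
from itertools import product


def all_readings(analysis):
    """Return every way of picking exactly one reading per word.

    `analysis` entries have the form [[word, tag], readings].
    """
    words = [entry[0] for entry in analysis]
    choices = [entry[1] for entry in analysis]
    return [
        [[word, [reading]] for word, reading in zip(words, combo)]
        for combo in product(*choices)
    ]


analysis = [
    [["Mann", "nou"], [["akk", "mas"], ["dat", "pl"]]],
    [["Kinder", "nou"], [["nom", "pl"], ["akk", "pl"]]],
]
candidates = all_readings(analysis)

# A numbered list like this could be shown to the user to pick from.
for number, candidate in enumerate(candidates, start=1):
    print(number, candidate)
```

The number of candidates is the product of the reading counts, so it grows quickly with sentence length; that is one reason to prune with agreement rules before asking the user.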
Problems
  • Inflexible
    • Grammar can only be rearranged in one specific way
    • Subject – Main verb – Indirect – Direct – Auxiliary Verb
    • Does not accommodate prepositions, conjunctions, etc.
Future research
  • Implement more statistical methods
    • Morphological info
    • Actual translation – bilingual corpus
  • Create better parse tree – Dependency grammar