Combining Word-Alignment Symmetrizations in Dependency Tree Projection

Combining Word-Alignment Symmetrizations in Dependency Tree Projection. David Mareček (marecek@ufal.mff.cuni.cz), Charles University in Prague, Institute of Formal and Applied Linguistics. CICLING conference, Tokyo, Japan, February 21, 2011.


Combining Word-Alignment Symmetrizations in Dependency Tree Projection

David Mareček

marecek@ufal.mff.cuni.cz

Charles University in Prague

Institute of Formal and Applied Linguistics

CICLING conference

Tokyo, Japan, February 21, 2011

Motivation
  • Let’s have a text in a language which is not very common...
  • We would like to parse it, but we do not have any parser
    • no manually annotated treebank
  • But we do have a parallel corpus with another language
    • English
Our goal – To create a parser
  • Take the parallel corpus with English
  • Compute a word alignment on it
    • GIZA++
  • Parse the English side of the corpus
    • MST dependency parser
  • Transfer the dependencies from English to the target language using the word-alignment
  • Train the parser on the resulting trees
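The transfer step can be illustrated with a minimal sketch. It is not the algorithm presented in these slides, only a generic projection rule under stated assumptions: each target token has at most one aligned English token, and a target token takes as its parent the target counterpart of the nearest aligned ancestor of its English counterpart. The function name `project_dependencies` and the index conventions are hypothetical.

```python
from typing import Dict, List, Optional


def project_dependencies(
    en_heads: List[int],
    tgt_to_en: Dict[int, int],
    tgt_len: int,
) -> List[Optional[int]]:
    """Project English dependency heads onto target-language tokens.

    en_heads[i] is the head index of English token i, or -1 for the root.
    tgt_to_en maps a target token index to its aligned English token index.
    Returns one entry per target token: the projected head index,
    -1 for a root, or None if the token is unaligned.
    """
    # Invert the alignment; if several target tokens share one English
    # token, the leftmost one represents it (an arbitrary choice).
    en_to_tgt: Dict[int, int] = {}
    for t in sorted(tgt_to_en):
        en_to_tgt.setdefault(tgt_to_en[t], t)

    heads: List[Optional[int]] = [None] * tgt_len
    for t in range(tgt_len):
        e = tgt_to_en.get(t)
        if e is None:
            continue  # unaligned target token: nothing to project
        # Climb the English tree until an ancestor has a target counterpart.
        h = en_heads[e]
        while h != -1 and h not in en_to_tgt:
            h = en_heads[h]
        heads[t] = -1 if h == -1 else en_to_tgt[h]
    return heads


# Toy example with hypothetical indices (not a real parse or alignment).
toy_en_heads = [-1, 0, 1]          # token 0 is the root, 1 -> 0, 2 -> 1
toy_alignment = {0: 0, 1: 2, 2: 1}  # target index -> English index
print(project_dependencies(toy_en_heads, toy_alignment, 3))
```

Because every projected parent is aligned to a strict ancestor in the English tree, this rule cannot create cycles, but it may leave unaligned tokens without a parent and may yield more than one root.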
Previous works
  • Rebecca Hwa (2002, 2005)
    • Simple algorithm for projecting trees from English to Spanish and Chinese
    • Only one type of alignment is used, and it is not specified which one
  • K. Ganchev, J. Gillenwater, B. Taskar (2009)
    • Unsupervised parser with posterior regularization, in which inferred dependencies should correspond to projected ones
    • English to Bulgarian
Our contribution
  • To show that using several types of alignment improves the quality of dependency projection
  • GIZA++ [Och and Ney, 2003]
    • two unidirectional asymmetric alignments
    • symmetrization methods
  • Simple algorithm for projecting dependencies using different types of alignment links
  • Training and evaluating MST parser
Word alignment
  • The GIZA++ toolkit produces asymmetric output
    • For each word in one language, just one counterpart in the other language is found

[Figure: the two unidirectional alignments, ENGLISH-to-X and X-to-ENGLISH, of the sentence pair below]

Coordination of fiscal policies indeed , can be counterproductive .

Eine Koordination finanzpolitischer Maßnahmen kann in der Tat kontraproduktiv sein .

Symmetrization methods

  • Combinations of the two previous unidirectional alignments

[Figure: the INTERSECTION and GROW-DIAG-FINAL alignments of the sentence pair below]

Coordination of fiscal policies indeed , can be counterproductive .

Eine Koordination finanzpolitischer Maßnahmen kann in der Tat kontraproduktiv sein .
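As a rough illustration of symmetrization, the sketch below represents each unidirectional alignment as a set of (English index, X index) links and computes their intersection and union. The representation and the function names are assumptions for illustration, not GIZA++ output formats. GROW-DIAG-FINAL is not implemented here: it starts from the intersection and adds selected links from the union, so its result lies between the two sets computed below.

```python
from typing import Set, Tuple

Link = Tuple[int, int]  # (english_index, x_index); hypothetical representation


def intersection_alignment(en_to_x: Set[Link], x_to_en: Set[Link]) -> Set[Link]:
    """Keep only links found in both unidirectional alignments."""
    return en_to_x & x_to_en


def union_alignment(en_to_x: Set[Link], x_to_en: Set[Link]) -> Set[Link]:
    """Keep links found in at least one unidirectional alignment."""
    return en_to_x | x_to_en


# Toy links with hypothetical indices (not taken from the slides).
toy_en_to_x = {(0, 1), (1, 1), (2, 2)}
toy_x_to_en = {(0, 1), (2, 2), (2, 3)}

print(sorted(intersection_alignment(toy_en_to_x, toy_x_to_en)))
print(sorted(union_alignment(toy_en_to_x, toy_x_to_en)))
```

The intersection tends to contain fewer but more reliable links, while the union covers more tokens at the cost of precision.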

Which alignment to use for the projection?
  • We have presented four different types of alignment
    • ENGLISH-to-X, X-to-ENGLISH, INTERSECTION, GROW-DIAG-FINAL
  • We prefer X-to-ENGLISH alignment
    • we need to find a parent for each token in the language X
    • we don’t mind English words that are not aligned
  • We recognize three types of links
    • A: links that appeared in INTERSECTION alignment (red)
    • B: links that appeared in GROW-DIAG-FINAL and also in X-to-ENGLISH alignment (orange)
    • C: links that appeared only in X-to-ENGLISH alignment (blue)

[Figure: the sentence pair below with alignment links colored by type (A red, B orange, C blue)]

Coordination of fiscal policies indeed , can be counterproductive .

Eine Koordination finanzpolitischer Maßnahmen kann in der Tat kontraproduktiv sein .
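The three link types can be computed directly from the alignments, as in this minimal sketch. It assumes that each alignment is a set of (English index, X index) links and that type B covers only links not already labeled A; `classify_links` is a hypothetical helper name.

```python
from typing import Dict, Set, Tuple

Link = Tuple[int, int]  # (english_index, x_index); hypothetical representation


def classify_links(
    x_to_en: Set[Link],
    intersection: Set[Link],
    grow_diag_final: Set[Link],
) -> Dict[Link, str]:
    """Label every X-to-ENGLISH link as type A, B, or C.

    A: the link also appears in the INTERSECTION alignment.
    B: the link is not in INTERSECTION but appears in GROW-DIAG-FINAL.
    C: the link appears only in the X-to-ENGLISH alignment.
    """
    labels: Dict[Link, str] = {}
    for link in x_to_en:
        if link in intersection:
            labels[link] = "A"
        elif link in grow_diag_final:
            labels[link] = "B"
        else:
            labels[link] = "C"
    return labels


# Toy links with hypothetical indices (not taken from the slides).
toy_x_to_en = {(0, 0), (1, 1), (2, 2)}
toy_intersection = {(0, 0)}
toy_gdf = {(0, 0), (1, 1), (3, 3)}

for link, label in sorted(classify_links(toy_x_to_en, toy_intersection, toy_gdf).items()):
    print(link, label)
```

Links that occur in GROW-DIAG-FINAL but not in X-to-ENGLISH, such as (3, 3) in the toy data, receive no label, since only X-to-ENGLISH links are considered.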

Algorithm - example

[Figure: example of the projection algorithm on the sentence pair below]

Coordination of fiscal policies indeed , can be counterproductive .

Eine Koordination finanzpolitischer Maßnahmen kann in der Tat kontraproduktiv sein .

Results
  • The best results for each of the test languages:
    • English parser trained on CoNLL-X data
    • The projection was made on the first 100,000 sentence pairs from the News-commentaries (or Acquis-communautaire) parallel corpus
    • We used McDonald’s maximum spanning tree parser
  • Why is the accuracy so low?
    • Treebanks in CoNLL differ in annotation guidelines
    • Different handling of coordination structures, auxiliary verbs, noun phrases, ...
Comparison with previous work
  • We have run our projection method on the same datasets as in the previous work by Ganchev et al. (2009)
    • Bulgarian, OpenSubtitles parallel corpus
    • English parser trained on PennTreebank
    • Tested on Bulgarian CoNLL-X training sentences of up to 10 words
  • Our results are slightly better
    • we did NOT use any unsupervised inference of dependency edges
    • we made better use of the word alignment
Conclusions
  • We showed that combining different word alignments improves dependency tree projection
  • We outperform the state-of-the-art results
  • Evaluation is complicated by the different annotation guidelines used in each treebank