

Daniel Gildea (2003): Loosely Tree-Based Alignment for Machine Translation

Linguistics 580 (Machine Translation), Scott Drellishak, 2/21/2006


Overview

  • Gildea presents an alignment model he describes as “loosely tree-based”

  • Builds on Yamada & Knight (2001), a tree-to-string model

  • Gildea extends it with a clone operation, and also into a tree-to-tree model

  • Wants to keep performance reasonable (polynomial in sentence length)


Outline

  • Background

  • Tree-to-String Model

  • Tree-to-Tree Model

  • Experiment


Background

  • Historically, two approaches to MT: transfer-based and statistical

  • More recently, though, hybrids

  • Probabilistic models of structured representations:

    • Wu (1997) Stochastic Inversion Transduction Grammars

    • Alshawi et al. (2000) Head Transducers

    • Yamada & Knight (2001) (see below)


Gildea's Proposal

  • Need to handle drastic changes to trees (real bitexts aren’t isomorphic)

  • To do this, Gildea adds a new operation to Y&K's model: subtree clone

  • This operation clones a subtree from the source tree to anywhere in the target tree.

  • Gildea also proposes a tree-to-tree model that uses parallel tree corpora.


Outline

  • Background

  • Tree-to-String Model

  • Tree-to-Tree Model

  • Experiment


Yamada and Knight (2001)

  • Y&K’s model is tree-to-string: the input is a tree and output is a string of words.

  • (Gildea compares it to an "Alexander Calder mobile". Calder invented that kind of hanging sculpture, and the analogy fits Y&K's model because each node of the tree can swivel, reversing the order of its children. Visualize!)


Y&K Tree-to-String Model

  • Three steps turn the input tree into the output string (see the sketch after this list):

    • Reorder the children of each node (a node with m children has m! possible orderings; conditioned only on the categories of the node and its children)

    • Optionally insert words at each node, either before or after all the children (conditioned only on the foreign word)

    • Translate the words at the leaves (conditioned on P(f|e); words can translate to NULL)
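
A minimal runnable sketch of these three steps, with toy probability tables. The Node class and the table names (reorder_probs, insert_probs, trans_probs) are illustrative assumptions, not Y&K's actual parameterization; insertion at leaves is skipped for brevity:

    import random

    class Node:
        def __init__(self, label, children=None, word=None):
            self.label = label
            self.children = children or []
            self.word = word  # set only on lexical leaves

    def weighted_choice(table):
        # table maps options to probabilities
        options, weights = zip(*table.items())
        return random.choices(options, weights=weights)[0]

    def translate(node, reorder_probs, insert_probs, trans_probs):
        """Turn one source subtree into a list of target words, top-down."""
        if node.word is not None:
            # Step 3: translate the leaf word; NULL drops it from the output.
            f = weighted_choice(trans_probs[node.word])
            return [] if f == "NULL" else [f]
        # Step 1: reorder children, conditioned on the node's CFG rule.
        rule = (node.label, tuple(c.label for c in node.children))
        order = weighted_choice(reorder_probs[rule])  # a permutation of indices
        out = []
        for i in order:
            out += translate(node.children[i], reorder_probs, insert_probs, trans_probs)
        # Step 2: optionally insert a word before or after all the children.
        pos, word = weighted_choice(insert_probs[node.label])
        if pos == "left":
            out = [word] + out
        elif pos == "right":
            out = out + [word]
        return out  # pos == "none": nothing inserted

For an English-to-Japanese pair, reorder_probs[("VP", ("V", "NP"))] would put most of its mass on (1, 0), and insert_probs["NP"] on inserting a postposition to the right, mirroring the suitability points below.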


Aside: Y&K Suitability

  • Recall that this model was used for translating English to Japanese.

  • Their model is well-suited to this language pair:

    • Japanese is SOV, while English is SVO. Japanese is also generally head-last where English is head-first. Reordering handles both of these.

    • Japanese marks subjects/topics and objects with postpositions. Insertion handles this.


Y&K EM Algorithm

  • EM algorithm estimates inside probabilities β bottom-up:

    for all nodes ε_i in input tree T do
      for all k, l such that 1 ≤ k ≤ l ≤ N do
        for all orderings ρ of the children ε_1 … ε_m of ε_i do
          for all partitions of span (k, l) into (k_1, l_1) … (k_m, l_m) do
            (accumulate the inside probability β(ε_i, k, l))
          end for
        end for
      end for
    end for
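
A runnable toy version of this dynamic program, reusing the Node class from the earlier sketch. Insertions and NULL translations are omitted to keep it short, and P_order and P_trans are toy dictionaries, not Y&K's estimated parameters:

    import itertools

    def inside(root, f, P_order, P_trans):
        """beta[(id(node), k, l)] = P(node yields foreign words f[k:l])."""
        beta = {}
        N = len(f)
        def visit(node):
            for c in node.children:
                visit(c)  # bottom-up: children before parents
            for k in range(N):                 # span start
                for l in range(k + 1, N + 1):  # span end (exclusive)
                    if node.word is not None:  # leaf: translates one word
                        p = P_trans.get((node.word, f[k]), 0.0) if l - k == 1 else 0.0
                        beta[(id(node), k, l)] = p
                        continue
                    m = len(node.children)
                    total = 0.0
                    for rho in itertools.permutations(range(m)):  # orderings
                        p_rho = P_order.get((node.label, rho), 0.0)
                        if p_rho == 0.0:
                            continue
                        # contiguous partitions of [k, l) into m sub-spans
                        for cuts in itertools.combinations(range(k + 1, l), m - 1):
                            bounds = (k,) + cuts + (l,)
                            p = p_rho
                            for j, ci in enumerate(rho):
                                p *= beta[(id(node.children[ci]), bounds[j], bounds[j + 1])]
                            total += p
                    beta[(id(node), k, l)] = total
        visit(root)
        return beta  # sentence probability: beta[(id(root), 0, len(f))]

All four loops of the pseudocode are visible: nodes, spans (k, l), orderings ρ, and partitions of the span; storing partially complete arcs instead of enumerating whole partitions per ordering is what yields Y&K's improved bound on the next slide.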


Y&K Performance

  • Computational complexity is O(|T| N^(m+2)), where T = tree, N = input length, m = fan-out of the grammar

  • "By storing partially complete arcs in the chart and interleaving the inner two loops", this improves to O(|T| n^3 m! 2^m)

  • Gildea says "exponential in m" (it looks factorial to me), but polynomial in n

  • If |T| is O(n), then the whole thing is O(n^4)


Y&K Drawbacks

  • No alignments with crossing brackets:

        A
       / \
      B   Z
     / \
    X   Y

  • Output orders X Z Y and Y Z X are impossible: Z can never separate X and Y without crossing brackets

  • Recall that Y&K flatten trees to avoid some of this, but don’t catch all cases


Adding Clone

  • Gildea adds clone operation to Y&K’s model

  • For each node, allow the insertion of a clone of another node as its child.

  • Probability of cloning ε_i under ε_j is computed in two steps (formulas reconstructed below):

    • Choice to insert a clone under ε_j

    • Choice of which node ε_i to clone

  • P_clone is a single estimated number, and P_makeclone is constant (all nodes equally probable, and a node can be cloned more than once)
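
A plausible reconstruction of the two factors (the formulas were images in the original slides, so treat the exact notation as an assumption consistent with the bullets above):

    \[
    P(\text{insert clone under } \varepsilon_j) = P_{\text{clone}},
    \qquad
    P(\text{clone } \varepsilon_i \mid \text{clone}) =
      \frac{P_{\text{makeclone}}(\varepsilon_i)}{\sum_k P_{\text{makeclone}}(\varepsilon_k)}
    \]

With P_makeclone constant, the second factor is uniform over source nodes, and since it is never "used up", the same node can be cloned more than once.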


Outline

  • Background

  • Tree-to-String Model

  • Tree-to-Tree Model

  • Experiment


Tree-to-Tree Model

  • Output is a tree, not a string, and it must match the tree in the target corpus

  • Add two new transformation operations:

    • one source node → two target nodes

    • two source nodes → one target node

  • “a synchronous tree substitution grammar, with probabilities parameterized to generate the target tree conditioned on the structure of the source tree.”


Calculating Probability

  • Probability is computed from the root down. At each level:

    • At most one of the node's children is grouped with it, forming an elementary tree (conditioned on the current node and the CFG rule over its children)

    • An alignment of the e-tree is chosen (conditioned as above). Like Y&K reordering, except that (1) the alignment can include insertions and deletions, and (2) two nodes grouped together are reordered together

    • Lexical leaves are translated as before


Elementary Trees?

  • Elementary trees allow the alignment of trees with different depths. Treat A, B as an e-tree and reorder their children together (a code sketch follows the diagram):

        A                A
       / \             / | \
      B   Z    →      X  Z  Y
     / \
    X   Y
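
A tiny illustration of that grouping step, reusing the Node class from the tree-to-string sketch; group_child is a hypothetical helper, not Gildea's API:

    def group_child(node, child_index):
        """Splice one child out of node's child list, replacing it with
        that child's own children (i.e., treat node+child as an e-tree)."""
        kids = node.children
        return kids[:child_index] + kids[child_index].children + kids[child_index + 1:]

    A = Node("A", [Node("B", [Node("X"), Node("Y")]), Node("Z")])
    print([c.label for c in group_child(A, 0)])  # ['X', 'Y', 'Z']

Permuting this flattened child list to X, Z, Y gives exactly the target tree on the right, which no sequence of single-node reorderings could produce.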


EM Algorithm

  • Estimates inside probabilities β bottom-up:

    for all nodes ε_a in source tree T_a in bottom-up order do
      for all elementary trees t_a rooted in ε_a do
        for all nodes ε_b in target tree T_b in bottom-up order do
          for all elementary trees t_b rooted in ε_b do
            for all alignments α of the children of t_a and t_b do
              (accumulate the inside probability β(ε_a, ε_b))
            end for
          end for
        end for
      end for
    end for


Performance

  • Outer two loops are O(|T|^2)

  • Elementary trees include at most one child, so choosing e-trees is O(m^2)

  • Alignment is O(2^(2m))

  • Which nodes to insert or clone is O(2^(2m))

  • How to reorder is O((2m)!)

  • Overall: O(|T|^2 m^2 4^(2m) (2m)!), quadratic (!) in the size of the input sentence
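
Multiplying the factors above, the two 2^(2m) terms combine into 4^(2m):

    \[
    O\bigl(|T|^2 \cdot m^2 \cdot 2^{2m} \cdot 2^{2m} \cdot (2m)!\bigr)
    = O\bigl(|T|^2 \, m^2 \, 4^{2m} \, (2m)!\bigr)
    \]

Since m is a constant of the grammar and |T| is O(n), this is O(n^2) in sentence length, versus O(n^4) for the tree-to-string model.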


Tree-to-Tree Clone

  • Allowing m-to-n matching of up to two nodes (e-trees) allows only “limited non-isomorphism”

  • So, as before, add a clone operation

  • The algorithm is unchanged, except that alignments may now include cloned subtrees, with the same (uniform) probability as in the tree-to-string model


Outline

  • Background

  • Tree-to-String Model

  • Tree-to-Tree Model

  • Experiment


The Data

  • Parallel Korean-English corpus

  • Trees annotated by hand on both sides

  • “in this paper we will be using only the Korean trees, modeling their transformation into the English text.”

  • (That can’t be right—only true for TTS?)

  • 5083 sentences: 4982 training, 101 evaluation


Aside: Suitability

  • Recall that Y&K’s model was suited to the English-to-Japanese task.

  • Gildea is going to compare their model to his, but using a Korean-English corpus. Is that fair?

  • In a word, yes. Korean and Japanese are syntactically very similar: agglutinative, head-last (so similar that syntax is the main argument that they’re related).


Results

  • Results are reported as Alignment Error Rate (AER), defined by Och & Ney (2000):
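
For reference, the AER definition from Och & Ney (2000), where A is the proposed alignment and S ⊆ P are the sure and possible gold-standard links:

    \[
    \mathrm{AER}(A; S, P) = 1 - \frac{|A \cap S| + |A \cap P|}{|A| + |S|}
    \]

Lower is better.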


Results Detailed

  • The lexical probabilities come from IBM Model 1, and node-reordering probabilities are initialized to uniform

  • Best results when P_ins is set to 0.5 rather than estimated (!)

  • “While the model learned by EM tends to overestimate the total number of aligned word pairs, fixing a higher probability for insertions results in fewer total aligned pairs and therefore a better trade-off between precision and recall”


How'd TTS and TTT Do?

  • The best results were with tree-to-string, surprisingly

  • Y&K + clone was roughly equal to the IBM models; Y&K + clone with P_ins fixed was best overall

  • Tree-to-tree + clone was also roughly equal to the IBM models, but much more efficient to train (quadratic instead of quartic in sentence length)

  • Still, disappointing results for TTT


Conclusions

  • Model allows syntactic info to be used for training without ordering constraints

  • Clone operations improve alignment results

  • Tree-to-tree + clone is better only in training speed, not alignment quality (but he's hopeful)

  • Future directions: bigger corpora, conditioning on lexicalized trees

