Daniel Gildea (2003): Loosely Tree-Based Alignment for Machine Translation

Linguistics 580 (Machine Translation), Scott Drellishak, 2/21/2006

- Gildea presents an alignment model he describes as “loosely tree-based”
- Builds on Yamada & Knight (2001), a tree-to-string model
- Gildea extends it with a clone operation, and also into a tree-to-tree model
- Wants to keep performance reasonable (polynomial in sentence length)

- Background
- Tree-to-String Model
- Tree-to-Tree Model
- Experiment

- Historically, two approaches to MT: transfer-based and statistical
- More recently, though, hybrids
- Probabilistic models of structured representations:
- Wu (1997) Stochastic Inversion Transduction Grammars
- Alshawi et al. (2000) Head Transducers
- Yamada & Knight (2001) (see below)

- Need to handle drastic changes to trees (real bitexts aren’t isomorphic)
- To do this, Gildea adds a new operation to Y&K’s model: subtree clone
- This operation clones a subtree from the source tree to anywhere in the target tree.
- Gildea also proposes a tree-to-tree model that uses parallel tree corpora.

- Background
- Tree-to-String Model
- Tree-to-Tree Model
- Experiment

- Y&K’s model is tree-to-string: the input is a tree and output is a string of words.
- (Gildea compares it to an “Alexander Calder mobile”. Calder is the guy who invented that kind of hanging sculpture, which is like Y&K’s model in that each node of the tree can turn, changing the order of whatever hangs below it. Visualize!)

- Three steps to turn input into output:
- Reorder the children of each node (a node with m children has m! possible orderings; conditioned only on the category of the node and its children)
- Optionally insert words at each node either before or after all the children (conditioned only on foreign word)
- Translate words at leaves (with probability P(f|e); words can translate to NULL)
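The three steps can be sketched on a toy tree. This is only an illustration: the tree, the node labels, the function name `transform` and the hand-picked choices are all hypothetical, and every choice is made deterministically here, whereas the model assigns each one a probability.

```python
# Toy tree-to-string transformation: reorder, insert, translate.
# Internal nodes are (label, [children]); leaves are source words.
# All names and values below are illustrative assumptions.

def transform(node, order, insert, lexicon):
    if isinstance(node, str):
        # Step 3: translate the leaf; None stands in for NULL (word dropped).
        word = lexicon[node]
        return [] if word is None else [word]
    label, children = node
    # Step 1: reorder the children (default: keep the original order).
    perm = order.get(label, range(len(children)))
    out = []
    for i in perm:
        out += transform(children[i], order, insert, lexicon)
    # Step 2: optionally insert one word before or after all the children.
    side, word = insert.get(label, (None, None))
    if side == "left":
        out = [word] + out
    elif side == "right":
        out = out + [word]
    return out

tree = ("S", [("NP-subj", ["he"]),
              ("VP", [("V", ["reads"]), ("NP-obj", ["books"])])])
order = {"VP": (1, 0)}                       # verb-final: object before verb
insert = {"NP-subj": ("right", "wa"),        # postpositions after each NP
          "NP-obj": ("right", "o")}
lexicon = {"he": "kare", "reads": "yomu", "books": "hon"}

print(" ".join(transform(tree, order, insert, lexicon)))  # kare wa hon o yomu
```

Reordering the VP gives the verb-final order, and the two insertions supply the postpositions.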

- Recall that this model was used for translating English to Japanese.
- Their model is well-suited to this language pair:
- Japanese is SOV, while English is SVO. Japanese is also generally head-last where English is head-first. Reordering handles both of these.
- Japanese marks subjects/topics and objects with postpositions. Insertion handles this.

- EM algorithm estimates inside probabilities β bottom-up:
    for all nodes εi in input tree T do
        for all k, l such that 1 < k < l < N do
            for all orderings ρ of the children ε1 … εm of εi do
                for all partitions of span k, l into k1, l1 … km, lm do
                    …
                end for
            end for
        end for
    end for
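A minimal sketch of the same loop structure, under strong simplifying assumptions: reordering probabilities are uniform (1/m! per ordering), and insertion and NULL translation are left out. The names `inside`, `partitions` and the toy table `T_PROB` are hypothetical, and spans are 0-based and half-open rather than following the indexing above.

```python
from functools import lru_cache
from itertools import permutations
from math import factorial

def partitions(k, l, m):
    """Yield every split of the span [k, l) into m non-empty contiguous spans."""
    if m == 1:
        if l > k:
            yield [(k, l)]
        return
    for mid in range(k + 1, l - m + 2):
        for rest in partitions(mid, l, m - 1):
            yield [(k, mid)] + rest

def inside(tree, target, t):
    """Probability that `tree` yields `target`: uniform reordering, no insertion, no NULL."""
    @lru_cache(maxsize=None)
    def beta(node, k, l):
        if isinstance(node, str):                 # leaf covers exactly one word
            return t.get((node, target[k]), 0.0) if l - k == 1 else 0.0
        total = 0.0
        for rho in permutations(node):            # orderings of the children
            for spans in partitions(k, l, len(node)):
                p = 1.0
                for child, (a, b) in zip(rho, spans):
                    p *= beta(child, a, b)
                total += p
        return total / factorial(len(node))       # assumed uniform ordering probability
    return beta(tree, 0, len(target))

# Toy tree A -> (B Z), B -> (X Y); internal nodes are tuples, leaves are source words.
TREE = (("X", "Y"), "Z")
T_PROB = {("X", "x"): 1.0, ("Y", "y"): 1.0, ("Z", "z"): 1.0}   # toy t(f|e), keyed (e, f)

print(inside(TREE, ("x", "y", "z"), T_PROB))  # 0.25
print(inside(TREE, ("x", "z", "y"), T_PROB))  # 0.0
```

The first string gets 1/2 at A times 1/2 at B; the second gets zero because no reordering of this tree produces it.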

- Computational complexity is O(|T| N^(m+2)), where T = tree, N = input length, m = fan-out of the grammar
- “By storing partially complete arcs in the chart and interleaving the inner two loops”, this improves to O(|T| n^3 m! 2^m)
- Gildea says “exponential in m” (looks factorial to me) but polynomial in the sentence length (N above, n here)
- If |T| is O(n) then the whole thing is O(n^4)
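The last step can be written out explicitly. It assumes the fan-out m is a constant fixed by the grammar, which the bullets leave implicit:

```latex
O\!\left(|T|\, n^{3}\, m!\, 2^{m}\right)
\;\xrightarrow{\;|T| = O(n),\ m = O(1)\;}\;
O\!\left(n \cdot n^{3}\right) = O\!\left(n^{4}\right)
```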

- No alignments with crossing brackets:
        A
       / \
      B   Z
     / \
    X   Y

- XZY and YZX are impossible
- Recall that Y&K flatten trees to avoid some of this, but don’t catch all cases
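The constraint can be checked by brute force on this three-leaf tree. The sketch below enumerates every leaf order reachable by reordering children at each node; the function name `yields` and the tuple encoding are illustrative assumptions.

```python
from itertools import permutations

def yields(node):
    """Every leaf order reachable by permuting each node's children independently."""
    if isinstance(node, str):
        return {(node,)}
    results = set()
    for order in permutations(node):
        partial = {()}
        for child in order:
            partial = {left + right for left in partial for right in yields(child)}
        results |= partial
    return results

tree = (("X", "Y"), "Z")      # A -> (B Z), B -> (X Y)
reachable = yields(tree)

print(sorted("".join(r) for r in reachable))  # ['XYZ', 'YXZ', 'ZXY', 'ZYX']
```

X and Y always stay adjacent because they hang from B, so XZY and YZX never appear.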

- Gildea adds clone operation to Y&K’s model
- For each node, allow the insertion of a clone of another node as its child.
- Probability of cloning εi under εj in two steps:
- Choice to insert:
- Node to clone:

- Pclone is one estimated number, Pmakeclone is constant (all nodes equally probable, reusable)
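One way to read the two steps together is sketched below. It assumes the steps multiply and that "all nodes equally probable" means a uniform choice over the |T| nodes of the source tree. The symbol P_ins(clone) for the insertion choice is an assumed name; the normalization is a reading of the bullets, not a formula preserved here.

```latex
P(\text{clone } \varepsilon_i \text{ under } \varepsilon_j)
  = P_{\mathrm{ins}}(\text{clone}) \cdot
    \frac{P_{\mathrm{makeclone}}(\varepsilon_i)}{\sum_{k} P_{\mathrm{makeclone}}(\varepsilon_k)}
  = P_{\mathrm{ins}}(\text{clone}) \cdot \frac{1}{|T|}
  \quad \text{(constant } P_{\mathrm{makeclone}}\text{)}
```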

- Background
- Tree-to-String Model
- Tree-to-Tree Model
- Experiment

- Output is a tree, not a string, and it must match the tree in the target corpus
- Add two new transformation operations:
- one source node → two target nodes
- two source nodes → one target node

- “a synchronous tree substitution grammar, with probabilities parameterized to generate the target tree conditioned on the structure of the source tree.”

- From the root down. At each level:
- At most one of the node’s children is grouped with it, forming an elementary tree, or e-tree (conditioned on the current node and its CFG rule children)
- An alignment of the e-tree’s children is chosen (conditioned as above). Like Y&K reordering except: (1) the alignment can include insertions and deletions; (2) two nodes grouped together are reordered together.
- Lexical leaves translated as before.

- Elementary trees allow the alignment of trees with different depths. Treat A,B as an e-tree, reorder their children together:
        A              A
       / \           / | \
      B   Z    →    X  Z  Y
     / \
    X   Y
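A small sketch of why grouping helps: once A and B form one e-tree, the children X, Y and Z are reordered as a single flat list. The helper name `group_with_child` and the tuple encoding are hypothetical, and only reordering is shown (no insertions or deletions).

```python
from itertools import permutations

def group_with_child(node, index):
    """Merge a node with its child at `index`, splicing that child's children into its place."""
    children = list(node)
    return tuple(children[:index] + list(children[index]) + children[index + 1:])

tree = (("X", "Y"), "Z")                     # A -> (B Z), B -> (X Y)
etree_children = group_with_child(tree, 0)   # A and B grouped: children X, Y, Z
orders = set(permutations(etree_children))

print(etree_children)                  # ('X', 'Y', 'Z')
print(("X", "Z", "Y") in orders)       # True
```

All six orders of X, Y, Z are now available, including the XZY order that reordering alone could not reach.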

- Estimates inside probabilities β bottom-up:
    for all nodes εa in source tree Ta in bottom-up order do
        for all elementary trees ta rooted in εa do
            for all nodes εb in target tree Tb in bottom-up order do
                for all elementary trees tb rooted in εb do
                    for all alignments α of the children of ta and tb do
                        …
                    end for
                end for
            end for
        end for
    end for

- Outer two loops are O(|T|^2)
- Elementary trees include at most one child, so choosing e-trees is O(m^2)
- Alignment is O(2^(2m))
- Which nodes to insert or clone is O(2^(2m))
- How to reorder is O((2m)!)
- Overall: O(|T|^2 m^2 4^(2m) (2m)!), quadratic (!) in the size of the input sentence.
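The overall bound is just the product of the factors listed above, with the two 2^(2m) terms combined:

```latex
\underbrace{|T|^{2}}_{\text{node pairs}}
\cdot \underbrace{m^{2}}_{\text{e-trees}}
\cdot \underbrace{2^{2m}}_{\text{alignment}}
\cdot \underbrace{2^{2m}}_{\text{insert/clone}}
\cdot \underbrace{(2m)!}_{\text{reordering}}
= |T|^{2}\, m^{2}\, 2^{4m}\, (2m)!
= |T|^{2}\, m^{2}\, 4^{2m}\, (2m)!
```

With m treated as a constant, only the |T|^2 factor grows with the sentence.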

- Allowing m-to-n matching of up to two nodes (e-trees) allows only “limited non-isomorphism”
- So, as before, add a clone operation
- Algorithm unchanged, except alignments may now include cloned subtrees, same probability as in tree-to-string (uniform)

- Background
- Tree-to-String Model
- Tree-to-Tree Model
- Experiment

- Parallel Korean-English corpus
- Trees annotated by hand on both sides
- “in this paper we will be using only the Korean trees, modeling their transformation into the English text.”
- (That can’t be right—only true for TTS?)
- 5083 sentences: 4982 training, 101 evaluation

- Recall that Y&K’s model was suited to the English-to-Japanese task.
- Gildea is going to compare their model to his, but using a Korean-English corpus. Is that fair?
- In a word, yes. Korean and Japanese are syntactically very similar: agglutinative, head-last (so similar that syntax is the main argument that they’re related).

- Evaluation metric: Alignment Error Rate (Och & Ney 2000)
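For reference, Och & Ney define the metric over a set A of predicted links, a set S of "sure" gold links and a set P of "possible" gold links (with S inside P) as AER = 1 − (|A∩S| + |A∩P|) / (|A| + |S|). The sketch below implements that definition. The function name and the toy links are illustrative, and whether the gold standard here separates sure from possible links is not stated; with a single gold set G the formula reduces to 1 − 2|A∩G| / (|A| + |G|).

```python
def alignment_error_rate(predicted, sure, possible=()):
    """AER = 1 - (|A & S| + |A & P|) / (|A| + |S|); assumes A and S are not both empty."""
    a = set(predicted)
    s = set(sure)
    p = set(possible) | s          # sure links also count as possible
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))

# Toy links as (source position, target position) pairs.
predicted = {(0, 0), (1, 2), (2, 1), (3, 3)}
gold = {(0, 0), (1, 1), (2, 1)}

aer = alignment_error_rate(predicted, gold)
print(round(aer, 4))   # 0.4286
```

Here two of the four predicted links are correct (precision 1/2) and two of the three gold links are found (recall 2/3), so AER = 1 − 4/7.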

- Lexical probabilities come from IBM Model 1; node reordering probabilities are initialized to uniform
- Best results when Pins set to 0.5 rather than estimated (!)
- “While the model learned by EM tends to overestimate the total number of aligned word pairs, fixing a higher probability for insertions results in fewer total aligned pairs and therefore a better trade-off between precision and recall”

- The best results were with tree-to-string, surprisingly
- Y&K + clone was roughly equal to IBM; fixing Pins was best overall
- Tree-to-tree + clone was also roughly equal to IBM, but it was much more efficient to train (since it’s quadratic instead of quartic)
- Still, disappointing results for TTT

- Model allows syntactic info to be used for training without ordering constraints
- Clone operations improve alignment results
- Tree-to-tree + clone is better only in computational performance, i.e. training time (but he’s hopeful)
- Future directions: bigger corpora, conditioning on lexicalized trees