
Training Tree Transducers

Training Tree Transducers. Authors: Jonathan Graehl, Kevin Knight. Presented by Zhengbo Zhou. Outline: Finite State Transducers (FSTs) and R; Trees and Regular Tree Grammars; xR and Derivation Tree; Inside-Outside algorithm and EM training; Turning tree to string (xRS).


Presentation Transcript


  1. Training Tree Transducers • Authors: Jonathan Graehl, Kevin Knight • Presented by Zhengbo Zhou

  2. Outline • Finite State Transducers (FSTs) and R • Trees and Regular Tree Grammars • xR and Derivation Tree • Inside-Outside algorithm and EM training • Turning tree to string (xRS) • Example and Related Work • My thoughts and questions

  3. Finite State Transducers (FSTs) • Finite-state transducer, as covered earlier. • [Figure: an FST with states q0 and q1 and arcs labeled a:x and b:y]
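
A minimal Python sketch of such a transducer may help as a reminder. Only the state names and the arc labels come from the slide; the arc topology (a:x from q0 to q1, b:y looping on q1, q1 final) and the names `ARCS`, `FINAL`, and `transduce` are assumptions made for illustration.

```python
# Hypothetical FST: states q0, q1; arcs a:x (q0 -> q1) and b:y (q1 -> q1).
# The topology is assumed; only the labels appear on the slide.
ARCS = {
    ("q0", "a"): ("x", "q1"),
    ("q1", "b"): ("y", "q1"),
}
START, FINAL = "q0", {"q1"}


def transduce(symbols):
    """Return the output string, or None if the input is rejected."""
    state, out = START, []
    for sym in symbols:
        if (state, sym) not in ARCS:
            return None          # no arc for this symbol: reject
        emitted, state = ARCS[(state, sym)]
        out.append(emitted)
    return "".join(out) if state in FINAL else None
```

Under these assumptions, `transduce("abb")` pairs the input string `abb` with the output string `xyy`, while an input that starts with `b` is rejected.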

  4. R transducer • An R transducer compactly represents a potentially infinite set of input/output tree pairs, • just as an FST compactly represents such a set of input/output string pairs. • R is a generalization of FST.

  5. Example of R • Input sentence: He drinks water • [Figure: parse tree S(PRO(he), VP(V(drinks), NP(water)))]

  6. Example for R (cont.) • Rule 1 and Rules 2, 3, 4 • English order: S(PRO, VP(V, NP)) • Arabic order: S(V, PRO, NP) • [Figure: rule diagrams over the states q, qpro, qleft.vp.v, and qright.vp.np applied to the tree for "he drinks water"]
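
The reordering from English order to Arabic order can be sketched in Python as a small table-driven top-down transducer. The state names are taken from the slide, but the exact rule bodies are not recoverable from the transcript, so the rules below are a plausible reconstruction rather than the slide's own. The `copy` state (an identity shortcut) and the names `RULES`, `run`, and `build` are hypothetical.

```python
# Trees are nested tuples: (label, child, child, ...); a leaf is (label,).
# An rhs leaf of the form (state, i) means "process child i in that state".
# Rule bodies are assumed; only the state names come from the slide.
RULES = {
    ("q", "S"): ("S", ("qleft.vp.v", 1), ("qpro", 0), ("qright.vp.np", 1)),
    ("qleft.vp.v", "VP"): ("copy", 0),    # keep the verb of the VP
    ("qright.vp.np", "VP"): ("copy", 1),  # keep the object NP of the VP
    ("qpro", "PRO"): ("PRO", ("copy", 0)),
}


def is_call(node):
    return len(node) == 2 and isinstance(node[1], int)


def run(state, tree):
    """Apply the rule for (state, root label) to the tree."""
    if state == "copy":                   # identity shortcut (an assumption)
        return tree
    return build(RULES[(state, tree[0])], tree[1:])


def build(rhs, kids):
    if is_call(rhs):
        state, i = rhs
        return run(state, kids[i])
    return (rhs[0],) + tuple(build(child, kids) for child in rhs[1:])


ENGLISH = ("S", ("PRO", ("he",)), ("VP", ("V", ("drinks",)), ("NP", ("water",))))
ARABIC_ORDER = run("q", ENGLISH)
```

Note that the first rule visits the VP child twice, once per state. A plain R rule sees only the root symbol and its immediate children, so this copying is how it reaches the verb and the object separately.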

  7. Trees • Definitions:

  8. Regular Tree Grammars (RTG) • A regular tree grammar is a common way of compactly representing a potentially infinite set of trees. • A weighted RTG (wRTG) is to trees what a WFSA is to strings. • wRTG G: (∑, N, S, P) • ∑: alphabet • N: nonterminals • S: start nonterminal • P: weighted productions
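
To make the four components concrete, here is a minimal Python sketch of a wRTG and an enumerator for the weighted trees it generates. The toy grammar (nonterminals `qs` and `qnp`, the words, the weights) is entirely hypothetical, and the enumerator assumes the grammar is acyclic so that the tree set is finite.

```python
from itertools import product

# Hypothetical wRTG (Sigma, N, S, P). Nonterminals are plain strings;
# nodes labeled from Sigma are tuples (label, child, ...).
N = {"qs", "qnp"}
S = "qs"
P = {
    "qs": [(("S", "qnp", ("VP", ("sleeps",))), 1.0)],
    "qnp": [(("NP", ("dogs",)), 0.6), (("NP", ("cats",)), 0.4)],
}


def expand(node):
    """Yield every (tree, weight) derivable from a node of an acyclic wRTG."""
    if isinstance(node, str):             # nonterminal: try each production
        for rhs, w in P[node]:
            for tree, w2 in expand(rhs):
                yield tree, w * w2
    else:                                 # Sigma-labeled node: expand children
        label, kids = node[0], node[1:]
        for combo in product(*(list(expand(k)) for k in kids)):
            weight = 1.0
            for _, w in combo:
                weight *= w
            yield (label,) + tuple(t for t, _ in combo), weight


TREES = dict(expand(S))
```

Here `TREES` maps each of the two generated trees to its weight, the product of the weights of the productions used to derive it.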

  9. Sample wRTG

  10. Extended-LHS Tree Transducer (xR) • Differs from R: lookahead and movement are represented explicitly by a more specific LHS. • The LHS contains a pattern that is matched against an input subtree. • Patterns are drawn from a set of finite tree patterns.
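
The idea of matching a multi-level pattern against an input subtree can be sketched in Python as follows. The rule shown, which reorders S(PRO, VP(V, NP)) in a single step, is an illustrative assumption, as are the names `Var`, `match`, and `substitute`. For brevity the right-hand side copies bound subtrees unchanged instead of pairing each variable with a state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str
    label: str = None    # optional lookahead: required root label of the subtree


def match(pattern, tree, env=None):
    """Return {variable name: subtree} if pattern matches tree, else None."""
    env = {} if env is None else env
    if isinstance(pattern, Var):
        if pattern.label is not None and tree[0] != pattern.label:
            return None
        env[pattern.name] = tree
        return env
    if pattern[0] != tree[0] or len(pattern) != len(tree):
        return None
    for p, t in zip(pattern[1:], tree[1:]):
        if match(p, t, env) is None:
            return None
    return env


def substitute(rhs, env):
    if isinstance(rhs, str):              # a variable name: copy its subtree
        return env[rhs]
    return (rhs[0],) + tuple(substitute(c, env) for c in rhs[1:])


# Hypothetical xR-style rule: S(x0:PRO, VP(x1:V, x2:NP)) -> S(x1, x0, x2)
LHS = ("S", Var("x0", "PRO"), ("VP", Var("x1", "V"), Var("x2", "NP")))
RHS = ("S", "x1", "x0", "x2")
```

Because the pattern reaches two levels into the tree, one rule does the work that an R transducer would spread over several states, and no subtree needs to be copied.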

  11. Binary Relation:

  12. Derivation Tree • With so many trees in play, note that the derivation tree represents a derivation of the transducer; it is neither the input tree nor the output tree. • A derivation tree deterministically produces a single weighted output tree.

  13. Derivation tree & derivation wRTG X X’

  14. Inside-Outside algorithm • Basic idea: use the current rule probabilities to estimate the expected frequencies of certain types of derivation steps, then compute new probabilities for those rules from these estimates. [1] • Roughly, the inside probability of a nonterminal A covers how A is expanded, either directly (A->a) or through A->BC; the outside probability of A covers the contexts in which A occurs as a child, as in C->AB or C->BA.

  15. Inside-Outside for wRTG • Inside weights using G are given by βG: • Outside weights αG:
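
The formulas themselves did not survive in the transcript. As a hedged sketch of the usual definitions, the Python below computes inside and outside weights for a small acyclic wRTG: the inside weight of a nonterminal sums, over its productions, the production weight times the inside weights of the nonterminals in its right-hand side; the outside weight is 1 for the start nonterminal and otherwise sums over every occurrence of the nonterminal in some right-hand side. The toy grammar and the names `PRODS`, `inside`, `outside`, and `expected_count` are assumptions, and each production is reduced to the nonterminals it mentions.

```python
from functools import lru_cache
from math import prod

# Hypothetical acyclic wRTG, reduced to (lhs, rhs nonterminals, weight).
START = "qs"
PRODS = [
    ("qs", ("qnp", "qvp"), 1.0),
    ("qnp", (), 0.6),
    ("qnp", (), 0.4),
    ("qvp", ("qnp",), 0.5),
    ("qvp", (), 0.5),
]


@lru_cache(maxsize=None)
def inside(n):
    """Total weight of all trees derivable from nonterminal n."""
    return sum(w * prod(inside(c) for c in rhs)
               for lhs, rhs, w in PRODS if lhs == n)


@lru_cache(maxsize=None)
def outside(n):
    """Total weight of all contexts (from START) in which n can occur."""
    if n == START:
        return 1.0
    total = 0.0
    for lhs, rhs, w in PRODS:
        for i, c in enumerate(rhs):
            if c == n:
                siblings = prod(inside(s) for j, s in enumerate(rhs) if j != i)
                total += outside(lhs) * w * siblings
    return total


def expected_count(production):
    """Expected number of uses of a production per derivation from START."""
    lhs, rhs, w = production
    return outside(lhs) * w * prod(inside(c) for c in rhs) / inside(START)
```

The recursion terminates only because this grammar is acyclic; a grammar with cycles would need an iterative or fixed-point computation instead.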

  16. EM training • EM training maximizes the corpus likelihood by repeatedly estimating the expectation of each decision, then maximizing by assigning counts to parameters and renormalizing. • Algorithm 2 implements EM xR training by repeatedly computing inside-outside weights.
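
The renormalization step can be sketched in a few lines of Python. This is not the paper's Algorithm 2, only the generic maximization step: expected counts go in, weights that sum to one within each group come out. The function name `renormalize` and the choice of grouping key are assumptions.

```python
from collections import defaultdict


def renormalize(counts):
    """M-step sketch: map expected rule counts to weights summing to 1 per group.

    `counts` maps (group, rule_id) -> expected count. What defines a group
    (for example, rules sharing a state) is an assumption left to the caller.
    """
    totals = defaultdict(float)
    for (group, _), count in counts.items():
        totals[group] += count
    return {key: count / totals[key[0]]
            for key, count in counts.items() if totals[key[0]] > 0}
```

For example, counts of 3.0 and 1.0 for two rules in the same group become weights 0.75 and 0.25. A full EM loop would alternate this step with recomputing expected counts from inside-outside weights.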

  17. From tree to string • An extended-LHS tree transducer (xR) turns an input tree (say, a parse tree) into an output tree, but the output is still a (parse) tree, not a sentence in another language as machine translation requires. • For that we have xRS, the tree-to-string transducer.

  18. Tree-to-string transducer • Weighted extended-LHS root-to-frontier tree-to-string transducer: X=(∑,Δ,Q, Qi, R) • It is similar to xR, but the right-hand sides of its rules are strings instead of trees.
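
A minimal Python sketch of string right-hand sides, under simplifying assumptions: rules are keyed on the root label only (no extended-LHS patterns), weights are omitted, and leaves are emitted as-is in place of lexical translation rules. The rule table and the names `RULES` and `to_string` are hypothetical.

```python
# Hypothetical tree-to-string rules: (state, root label) -> rhs, where the rhs
# is a sequence mixing output words (str) and (state, child index) pairs.
RULES = {
    ("q", "S"): [("qv", 1), ("q", 0), ("qnp", 1)],   # verb, subject, object
    ("qv", "VP"): [("q", 0)],
    ("qnp", "VP"): [("q", 1)],
    ("q", "PRO"): [("q", 0)],
    ("q", "V"): [("q", 0)],
    ("q", "NP"): [("q", 0)],
}


def to_string(state, tree):
    """Return the list of output words produced from tree in the given state."""
    label, kids = tree[0], tree[1:]
    if not kids:
        return [label]       # leaf: emit the word itself (stand-in for a lexical rule)
    out = []
    for item in RULES[(state, label)]:
        if isinstance(item, str):
            out.append(item)                       # literal output word
        else:
            child_state, i = item
            out.extend(to_string(child_state, kids[i]))
    return out


ENGLISH = ("S", ("PRO", ("he",)), ("VP", ("V", ("drinks",)), ("NP", ("water",))))
WORDS = to_string("q", ENGLISH)
```

The output is a flat word sequence in the target order, with no tree structure left, which is the difference from xR that the slide points out.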

  19. Example • Implemented the translation model of (Yamada and Knight 2001) • There is a trainable xRS tree-to-string transducer that embodies:

  20. Example

  21. Related Work • TSG vs. RTG (equivalent) • xR vs. weighted synchronous TSG (similar) • EM training vs. the forward-backward algorithm for finite-state (string) transducers and for HMMs

  22. Questions • Is there any future work on this tree transducer, especially for machine translation? • Precision? Recall? • I am also a little confused by the descriptions of the two relations =>x and =>G. • I am not very sure about the inside-outside algorithm. Questions?

  23. Thank you!!

  24. Reference • [1] Fernando Pereira and Yves Schabes. Inside-Outside Reestimation from Partially Bracketed Corpora. 1992.

  25. What might be useful • Kevin Knight and Jonathan Graehl. An Overview of Probabilistic Tree Transducers for Natural Language Processing.

  26. • R: Top-down transducer, introduced before. • F: Bottom-up transducer ("Frontier-to-root"), with similar rules, but transforming the leaves of the input tree first and working its way up. • L: Linear transducer, which prohibits copying subtrees. Rule 4 in Figure 4 is an example of a copying production, so this whole transducer is R but not RL. • N: Non-deleting transducer, which requires that every left-hand-side variable also appear on the right-hand side. A deleting R-transducer can simply delete a subtree (without inspecting it). The transducer in Figure 4 is the deleting kind, because of rules 34-39. It would also be deleting if it included a rule for dropping English determiners, e.g., q NP(x0, x1) → q x1. • D: Deterministic transducer, with a maximum of one production per <state, symbol> pair. • T: Total transducer, with a minimum of one production per <state, symbol> pair. • PDTT: Push-down tree transducer, the transducer analog of CFTG [36]. • subscript: Regular-lookahead transducer, which can check whether an input subtree is tree-regular, i.e., whether it belongs to a specified RTL. Productions only fire when their lookahead conditions are met.
