
The Complexity of Tree Transducer Output Languages

The Complexity of Tree Transducer Output Languages. NII Logic Seminar, 2009/04/15. Kazuhiro Inaba @ NII. Joint work with Sebastian Maneth @ NICTA & UNSW. Table of Contents: Overview; The Problem Statement; Key Lemma: “Garbage-Free Form”; Results and Applications; The Proof.


Presentation Transcript


  1. The Complexity of Tree Transducer Output Languages • NII Logic Seminar, 2009/04/15 • Kazuhiro Inaba @ NII • Joint work with Sebastian Maneth @ NICTA & UNSW

  2. Table of Contents • Overview • The Problem Statement • Key Lemma: “Garbage-Free Form” • Results and Applications • The Proof • (Other Topics in my Thesis…?)

  3. Preliminaries • Σ, Δ, Γ, …: ranked finite sets • Each symbol in Σ, Δ, … is associated with a natural number called the “rank” of the symbol • TΣ = the set of trees over Σ • TΣ ::= σ(TΣ, …, TΣ), where σ ∈ Σ has exactly k subtrees and rank(σ) = k • Example: for Σ = {a, b, c} with rank(a) = 2, rank(b) = 1, rank(c) = 0, we have a(b(c), c) ∈ TΣ

  4. Preliminaries • A relation τ ⊆ TΣ×TΔ is called a (tree-to-tree) translation • For τ1 ⊆ TΣ×TΓ and τ2 ⊆ TΓ×TΔ, the sequential composition is defined as τ1;τ2 = {(s, t) | (s, r) ∈ τ1 and (r, t) ∈ τ2 for some r ∈ TΓ}
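
As a toy illustration of sequential composition, the sketch below treats translations as finite sets of pairs. Real translations are usually infinite relations, and the string names standing in for trees are placeholders chosen for this example.

```python
def compose(tau1, tau2):
    """Sequential composition tau1 ; tau2 of two finite relations (sets of pairs)."""
    return {(s, t) for (s, r1) in tau1 for (r2, t) in tau2 if r1 == r2}


def output_range(tau):
    """range(tau): every output that is related to some input."""
    return {t for (_, t) in tau}


# Placeholder names stand in for trees.
tau1 = {("s1", "r1"), ("s1", "r2"), ("s2", "r3")}
tau2 = {("r1", "t1"), ("r2", "t1"), ("r2", "t2")}
composed = compose(tau1, tau2)
```

In this toy instance `s2` has no output under the composition, because its only intermediate result `r3` is outside the domain of `tau2`.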

  5. “Complexity of Output Languages” • Given a tree-to-tree translation τ ⊆ TΣ×TΔ, how complex is the set range(τ) ⊆ TΔ? (I.e., for a tree t ∈ TΔ, how computationally hard is it to determine whether t ∈ range(τ)?)

  6. Classic Results • τ: program of a Turing machine → membership in range(τ) is undecidable • τ: nondeterministic finite-state string transduction (FST) → range(τ) is regular! • → The membership problem of range(τ) is solved in O(n) time and O(1) space • The same holds for τ ∈ finitely many compositions of nondeterministic FSTs • [Figure: a small example FST; only the edge labels b/bb, b/b, a/, a/a survive]

  7. Our Result • τ: composition of nondeterministic Macro Tree Transducers • → The membership problem of range(τ) is NP-complete and in DSPACE(n). • What are Macro Tree Transducers? • A relatively powerful (yet terminating) model of tree translation • The formal definition will be explained soon…

  8. Actually, we’ve shown the “Garbage-Free Form” • For any composition τ = τ_1 ; τ_2 ; … ; τ_n ⊆ TΣ×TΔ of MTTs, there exists an equivalent composition τ = ρ_0 ; ρ_1 ; … ; ρ_{2n} such that • dom(ρ_0) = TΣ and range(ρ_0) is regular • For any (s_0, t) ∈ τ, there are s_1, …, s_{2n} such that (s_i, s_{i+1}) ∈ ρ_i, (s_{2n}, t) ∈ ρ_{2n}, |s_{i+1}| ≦ 2|s_{i+2}|, and |s_{2n}| ≦ 2|t|

  9. Actually, we’ve shown the “Garbage-Free Form” • In each step of the translation (except the first step), the size of the tree always becomes larger (ignoring the constant factor “2”) • [Figure: the chain s0 → s1 → s2 → … → sk-1 → t through τ1, τ2, …, τk, redrawn as the chain s0 → s1 → s2 → … → s2n → t through ρ0, ρ1, …, ρ2n]

  10. (Possible)Applications in Practice • Compositions of MTTs are known to be a good model for XML translations • MTT* can represent good subclasses of XML-QL, XSLT, XQuery, … • Query Optimization by GFF • Verification by the range(MTT*) membership

  11. Corollaries of GFF(Applications in Theory) • range(τ1;τ2; … ;τn) is in NP and in DSPACE(n) • Higher-order context free languages are context-sensitive languages • Details are in the next slide…

  12. Chomsky Hierarchy[Chomsky1959] • Systems of Finite Description of String Languages • Type-0 Grammar • G=<N, Σ, S, R> where • S ∈ N • R is a set of rewrite rules of the form • α → β with α,β∈ (N∪Σ)* • [G] = {w∈Σ* | S→*w}

  13. Chomsky Hierarchy and Complexity • Type-0 (Recursively Enumerable Language) • Type-1 (Context-Sensitive Language) • αAβ → αγβ, where A ∈ N and γ is nonempty • Context-sensitive iff recognizable in NSPACE(n) [Landweber63, Kuroda64] • Type-2 (Context-Free) • A → α • Proper subclass of PTIME [?] • Type-3 (Regular) • A → sB or A → s, where A, B ∈ N and s ∈ Σ • Regular iff recognizable in DSPACE(1) [Folklore]

  14. Context-Free Grammar= Level-0 Grammar • Example of Type-2 Grammar: • S → 0 S 0, S→1 S 1, S→0, S→1 • Palindromes over 0s and 1s of odd length • A → α • Can be regarded as a nullary (nondeterministic) function whose output is strings
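
The example grammar is small enough to check by hand. The sketch below is a hypothetical recognizer (the name `derives` is an illustration) that mirrors the four productions directly.

```python
def derives(w):
    """True iff S =>* w for the grammar S -> 0 S 0 | 1 S 1 | 0 | 1."""
    if w in ("0", "1"):
        return True  # S -> 0 or S -> 1
    # S -> 0 S 0 or S -> 1 S 1: strip one matching pair of outer symbols.
    return len(w) >= 3 and w[0] == w[-1] and w[0] in "01" and derives(w[1:-1])
```

For instance, `derives("01010")` holds, while `derives("0110")` does not, because even-length palindromes are not generated.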

  15. Natural Extension: Macro Grammar [Fischer68] = Level-1 Grammar • S → A(0), A(x) → A(xx), A(x) → x • Sequences of 0s of length 2^n • Nonterminals can be parameterized by strings, i.e., they are nondeterministic functions from strings to strings • [Aho68] Level-1 languages are context-sensitive (= NSPACE(n))
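
A minimal sketch of the macro grammar above, read as a nondeterministic function on strings. The function name and the length cut-off `max_len` are assumptions added to keep the enumeration finite.

```python
def macro_language(max_len):
    """Words derivable from S -> A(0), A(x) -> A(xx), A(x) -> x, up to max_len."""
    words, x = [], "0"      # S -> A(0)
    while len(x) <= max_len:
        words.append(x)     # A(x) -> x   (stop and output the parameter)
        x = x + x           # A(x) -> A(xx)   (double the parameter)
    return words
```

Every generated word consists of 0s only and its length is a power of two.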

  16. Natural Extension: Higher-order Grammar [Damm82] • A level-n grammar is an n-th order grammar, in which the nonterminals are parameterized by entities of order at most n-1 • B(f,x) → B(λy.f(f(y)), xx), B(f,x) → f(1x1) • S → B(λy.yy, 0) • 2^(2^n) repetitions of “1 0^(2^n) 1” • = Simply-Typed λ-Calculus + Strings as the Base Type + Nondeterminism
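
The level-2 example can be unfolded mechanically. The sketch below (names are illustrative) applies the first rule for B a given number of times and then the second rule; with n applications of the first rule it yields 2^(2^n) copies of the block 1 0^(2^n) 1.

```python
def level2_words(max_steps):
    """Words of S -> B(λy.yy, 0) using the first B-rule 0..max_steps times."""
    f = lambda y: y + y                        # initial function argument λy.yy
    x = "0"                                    # initial string argument
    words = []
    for _ in range(max_steps + 1):
        words.append(f("1" + x + "1"))         # B(f, x) -> f(1x1)
        f = (lambda g: lambda y: g(g(y)))(f)   # B(f, x) -> B(λy.f(f(y)), xx)
        x = x + x
    return words
```

The first word is `101101`, i.e. two copies of `101`; the next is four copies of `1001`.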

  17. Complexity? • Are Level-2, Level-3, … languages context-sensitive (=NSPACE(n)), or do they go beyond? • [Damm82] Decidable • [Maneth02] Call-by-Value Level-n Languages are in DSPACE(n) • [This Work] Call-by-Name Level-n Languages are in DSPACE(n)!

  18. Level-n Languages and MTTs [Damm82] • For any CbN level-n grammar G, there exists a composition of n+1 CbN MTTs such that [G] = range(τ1 ; τ2 ; … ; τn+1) • Intuition: MTTs can carry out substitution • Hence, giving a complexity upper bound for range(MTT*) also gives an upper bound for level-n languages

  19. [Figure: inclusion diagram of language classes] • Context-Free Language ⊆ PTIME [CYK, Earley, …] • Level-1 Language = Macro Language: ⊆ NSPACE(n) [Aho68], NP-complete [Rounds73] • Level-2 Language: CFG where nonterminals are parameterized by other nonterminals • Level-3 Language, …, Level-n Language [Damm82] • range(MTT*): ⊆ DSPACE(n), NP-complete [Our Work] • Context-Sensitive Language = NSPACE(n) [Kuroda64]

  20. Brief Introduction to Macro Tree Transducers (MTTs)

  21. MTT • RHS ::= F( RHS, … , RHS ) | q(xi, RHS, …, RHS) | yi • Example rules: start( A(x1) ) → double( x1, double(x1, E) ); double( A(x1), y1 ) → double( x1, double(x1, y1) ); double( B, y1 ) → F( y1, y1 ); double( B, y1 ) → G( y1, y1 ) • The two rules for double( B, y1 ) give nondeterminism! • An MTT M = (Q, start, Σ, Δ, R) is a set of first-order functions of type Tree(Σ) × Tree(Δ)^k → Tree(Δ) • Defined by mutual induction on the 1st parameter (the input tree) • Application in the right-hand side is restricted to the direct children of the current node • Can take output tree fragments via the other parameters, but is not allowed to inspect or decompose them

  22. Evaluation Strategy • Two Evaluation Strategies(analogous to the λ-calculus…) • Call-by-Value (IO : Inside-Out) • Call-by-Name (OI : Outside-In)

  23. Example (IO / call-by-value) • Rules: double( A(x1), y1 ) → double( x1, double(x1, y1) ); double( B, y1 ) → F( y1, y1 ); double( B, y1 ) → G( y1, y1 ) • start(A(B)) ⇒ double( B, double(B, E) ) • ⇒ double( B, F(E, E) ) ⇒ F( F(E,E), F(E,E) ) or G( F(E,E), F(E,E) ) • or ⇒ double( B, G(E, E) ) ⇒ F( G(E,E), G(E,E) ) or G( G(E,E), G(E,E) )

  24. Example (OI / call-by-name) • Rules: double( A(x1), y1 ) → double( x1, double(x1, y1) ); double( B, y1 ) → F( y1, y1 ); double( B, y1 ) → G( y1, y1 ) • start(A(B)) ⇒ double( B, double(B, E) ) • ⇒ F( double(B, E), double(B, E) ) • ⇒ F( F(E,E), double(B, E) ) ⇒ F( F(E,E), F(E,E) ) or F( F(E,E), G(E,E) ) !! • ⇒ F( G(E,E), double(B, E) ) ⇒ F( G(E,E), F(E,E) ) !! or F( G(E,E), G(E,E) ) • or ⇒ G( double(B, E), double(B, E) ) • ⇒ G( F(E,E), double(B, E) ) ⇒ G( F(E,E), F(E,E) ) or G( F(E,E), G(E,E) ) !! • ⇒ G( G(E,E), double(B, E) ) ⇒ G( G(E,E), F(E,E) ) !! or G( G(E,E), G(E,E) ) • (“!!” marks the outputs that the IO derivation does not produce)
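
The difference between the two strategies can be reproduced with a small sketch, hard-coded for the `start`/`double` rules of the example. The tuple encoding of trees and the function names are assumptions made for illustration: under IO a parameter is a single evaluated tree, while under OI it is modelled as the set of trees the unevaluated argument may still become, so every copy chooses independently.

```python
from itertools import product


def double_io(tree, y):
    """IO: the parameter y is one already-evaluated output tree."""
    if tree == ("B",):
        return {("F", y, y), ("G", y, y)}
    (x1,) = tree[1:]  # tree = A(x1)
    return {out for mid in double_io(x1, y) for out in double_io(x1, mid)}


def double_oi(tree, ys):
    """OI: ys is the set of trees the unevaluated parameter can produce."""
    if tree == ("B",):
        return {(sym, a, b) for sym in "FG" for a, b in product(ys, repeat=2)}
    (x1,) = tree[1:]  # tree = A(x1)
    return double_oi(x1, double_oi(x1, ys))


def start(tree, strategy):
    """start(A(x1)) -> double(x1, double(x1, E)) under the chosen strategy."""
    (x1,) = tree[1:]
    e = ("E",)
    if strategy == "IO":
        return {out for mid in double_io(x1, e) for out in double_io(x1, mid)}
    return double_oi(x1, double_oi(x1, {e}))


io_outputs = start(("A", ("B",)), "IO")
oi_outputs = start(("A", ("B",)), "OI")
```

On the input A(B) this gives four IO outputs and eight OI outputs; the four extra OI outputs are the ones that mix F(E,E) and G(E,E) in the two copies of the parameter.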

  25. Evaluation Strategy • Two strategies (analogous to the λ-calculus…) • Call-by-Value (IO: Inside-Out) • Call-by-Name (OI: Outside-In) • Today, we consider OI evaluation only. • Why? • MTT_IO* = MTT_OI* (even though MTT_IO ≠ MTT_OI) • MTT_OI has better compositionality (as shown later)

  26. Main Result: DSPACE(n) Membership of range(MTT*), or how to apply the GFF to the range complexity

  27. Approach: Generate & Test • Guess the input s0 and all the intermediate trees s1, …, sn-1 • Check whether (s0, s1) ∈ τ1, (s1, s2) ∈ τ2, …, (sn-1, t) ∈ τn • If so, then t is in the output language! • Otherwise, try another s0, s1, …, sn-1 • [Figure: the chain s0 → s1 → s2 → … → sn-1 → t through τ1, τ2, …, τn, with each membership “∈ ?” still to be tested]

  28. Approach: Generate & Test • In order to carry out the algorithm in DSPACE(|t|) … • The sizes |s0|, |s1|, |s2|, …, |sn-1| must be linearly bounded by |t| • i.e., there must be a constant c independent of t such that |si| ≦ c|t| • The “Garbage-Free Form” assures this property! • Each step to test the “translation membership” of τi must be done in DSPACE(n) • [Figure: the chain s0 → s1 → s2 → … → sn-1 → t through τ1, τ2, …, τn, with each membership “∈ ?” still to be tested]

  29. [Review] “Garbage-Free Form” • For any composition τ = τ_1 ; τ_2 ; … ; τ_n ⊆ TΣ×TΔ of MTTs, there exists an equivalent composition τ = ρ_0 ; ρ_1 ; … ; ρ_{2n} such that • dom(ρ_0) = TΣ and range(ρ_0) is regular • For any (s_0, t) ∈ τ, there are s_1, …, s_{2n} such that (s_i, s_{i+1}) ∈ ρ_i, (s_{2n}, t) ∈ ρ_{2n}, |s_{i+1}| ≦ 2|s_{i+2}|, and |s_{2n}| ≦ 2|t|
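
The size conditions of the garbage-free form chain together into the linear bound needed by the generate-and-test approach. The short derivation below is a sketch under the assumption that the condition on consecutive trees is meant for every i with 0 ≤ i ≤ 2n−2; the resulting constant c = 2^{2n} depends only on the number of transducers, not on t.

```latex
% Assumption: |s_{i+1}| \le 2|s_{i+2}| holds for all 0 \le i \le 2n-2.
\begin{align*}
|s_j| \;\le\; 2\,|s_{j+1}| \;\le\; \dots \;\le\; 2^{\,2n-j}\,|s_{2n}|
      \;\le\; 2^{\,2n-j+1}\,|t| \;\le\; 2^{\,2n}\,|t|
      \qquad (1 \le j \le 2n).
\end{align*}
```

The tree s_0 is not covered by this chain, which matches the remark that the first step is the exception.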

  30. [Review] “Garbage-Free Form” • In each step of the translation (except the first step), the size of the tree always becomes larger (ignoring the constant factor “2”) • [Figure: the chain s0 → s1 → s2 → … → sk-1 → t through τ1, τ2, …, τk, redrawn as the chain s0 → s1 → s2 → … → s2n → t through ρ0, ρ1, …, ρ2n]

  31. “Translation Membership” • Let τ be an MTT. For trees s and t, can we check whether (s,t) ∈ τ within O(|s|+|t|) space? • The answer is yes, but it is hard to prove directly. • Let us consider subclasses of MTTs…

  32. In Search of a Good Subclass… • [Engelfriet&Vogler85] MTT ⊆ T ; LMTT • Hence, range(MTT*) = range((T∪LMTT)*) • T: MTTs without accumulating parameters • LMTT: “linear” MTTs, in which each input variable occurs at most once in a right-hand side • T and LMTT are weak enough to show DSPACE(n) translation membership • But too weak to have the garbage-free form

  33. Our Idea: Path-linear MTT • “Path-linear” = on a path of nested application, each variable occurs linearly • Linear: f(x1, g(x2), h(x3)) • Path-linear (but not linear): f(x1, g(x2), h(x2)) • Not path-linear: f(x1, g(x1), h(x2)) • Introduce a new class “PLMTT” • T∪LMTT ⊆ PLMTT • Still weak enough to show DSPACE(n) translation membership • Strong enough to have the GFF, as will be shown
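
The three example right-hand sides can be checked with a small sketch. The term encoding (`"call"`, `"out"`, `"param"`) and the reading of path-linearity as "no state call on x_i is nested inside the parameters of another state call on the same x_i" are assumptions for illustration, not the talk's formal definition.

```python
# Assumed encoding of right-hand sides:
#   ("call", q, i, [args])   state q applied to input variable x_i with parameters args
#   ("out", F, [children])   output symbol F
#   ("param", j)             accumulating parameter y_j


def input_vars(rhs):
    """Indices of input variables used in state calls, with repetitions."""
    if rhs[0] == "param":
        return []
    if rhs[0] == "call":
        _, _, i, args = rhs
        return [i] + [v for a in args for v in input_vars(a)]
    _, _, children = rhs
    return [v for c in children for v in input_vars(c)]


def is_linear(rhs):
    """Each input variable occurs at most once in the whole right-hand side."""
    vs = input_vars(rhs)
    return len(vs) == len(set(vs))


def is_path_linear(rhs, seen=frozenset()):
    """No input variable occurs twice along one path of nested applications."""
    if rhs[0] == "param":
        return True
    if rhs[0] == "call":
        _, _, i, args = rhs
        if i in seen:
            return False
        return all(is_path_linear(a, seen | {i}) for a in args)
    _, _, children = rhs
    return all(is_path_linear(c, seen) for c in children)


def call(q, i, *args):
    return ("call", q, i, list(args))


linear_rhs = call("f", 1, call("g", 2), call("h", 3))       # f(x1, g(x2), h(x3))
path_linear_rhs = call("f", 1, call("g", 2), call("h", 2))  # f(x1, g(x2), h(x2))
bad_rhs = call("f", 1, call("g", 1), call("h", 2))          # f(x1, g(x1), h(x2))
```

In the second example the two occurrences of x2 sit in sibling parameters, so no single nesting path sees x2 twice; in the third, g(x1) is nested inside the call on x1.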

  34. Linear Space “Translation Membership” • Lemma: Let τ ∈ PLMTT. For trees s and t, we can check whether (s, t) ∈ τ within O(|s|+|t|) space. • Proof: Basically, try all nondeterministic computations by backtracking. (Some clever stack-sharing is required.) • Path-linearity assures that the length of the backtracking stack is O(|s|).

  35. Key Theorem:the Garbage-Free Form of (PL)MTT*

  36. Garbage-Free = No-deletion • In each step of the translation (except the first step), the size of the tree always becomes larger • [Figure: the chain s0 → s1 → s2 → … → sk-1 → t through τ1, τ2, …, τk, redrawn as the chain s0 → s1 → s2 → … → s2n → t through ρ0, ρ1, …, ρ2n]

  37. Proof Sketch of the Garbage-Free Form • “Factor out” the deletion • Decompose τ2 into a ‘deleting part’ D and a ‘nondeleting’ part τ’2 (written ρ2 in the equation) • τ1 ; τ2 = τ1 ; (D ; ρ2) = (τ1 ; D) ; ρ2 [associativity] = τ’1 ; ρ2 [compose τ1 with D, using the right-compositionality of OI MTTs]

  38. Three Types of Deletion • “Input-Deletion” • E.g., f( A(x1, x2) ) → B( f(x1) ) • Discards the “x2” subtree! • “Skipping” • E.g., f( A(x1) ) → f(x1) • No output is generated at the unary node A. • “Erasure” • E.g., f( L, y1 ) → y1 • (Mainly at leaf nodes) No new output symbol is generated at the node.

  39. Three Types of Deletion • If there is no input-deletion, skipping, or erasure during the computation, then |in| ≦ 2|out| • Intuition: • No input-deletion → visits all nodes • No skipping → outputs at least one node for each unary input node • No erasure → outputs at least one node for each input leaf node • For any tree, 2·(#unary + #leaf) ≧ #nodes (cf. the number of matches in a knockout tournament)
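
The counting fact 2·(#unary + #leaf) ≧ #nodes can be checked exhaustively on small trees. The sketch below enumerates all tree shapes with nodes of rank 0, 1, and 2 (the symbols `c`, `b`, `a` are placeholders) and counts node kinds; it is only a sanity check, not a proof.

```python
def counts(t):
    """Return (#nodes, #unary nodes, #leaves) of a tuple-encoded tree."""
    nodes, unary, leaves = 1, int(len(t) == 2), int(len(t) == 1)
    for child in t[1:]:
        n, u, l = counts(child)
        nodes, unary, leaves = nodes + n, unary + u, leaves + l
    return nodes, unary, leaves


def all_trees(n):
    """All trees with n nodes over c (rank 0), b (rank 1), a (rank 2)."""
    if n == 1:
        return [("c",)]
    trees = [("b", t) for t in all_trees(n - 1)]
    for k in range(1, n - 1):  # k nodes in the left subtree, n-1-k in the right
        trees += [("a", l, r) for l in all_trees(k) for r in all_trees(n - 1 - k)]
    return trees


def bound_holds(t):
    nodes, unary, leaves = counts(t)
    return 2 * (unary + leaves) >= nodes
```

The reason behind the inequality is that a tree whose branching nodes all have rank at least 2 has more leaves than branching nodes, just as a knockout tournament has one match fewer than players.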

  40. How to eliminate “erasing” rules • Look-ahead + Inline-Expansion • Decompose τ into τ = E ; τne • E: bottom-up translation that annotates each input tree with look-ahead information (which state applications invoke erasure) • τne: τ without erasing rules

  41. How to eliminate “erasing” rules: τ = E ; τne • Example τ: f( C, y1 ) → y1; g( B(x1,x2) ) → X( f(x1, Y) ) • [Figure: E maps an input tree over A, B, C to an annotated copy; B nodes whose 1st child is C carry the annotation “Applying f to the 1st child invokes erasure…”]

  42. How to eliminate “erasing” rules: τ = E ; τne • τne: f( C, y1 ) → y1; g( B(x1,x2) ) → X( Y ) • [Figure: E maps an input tree over A, B, C to an annotated copy; the annotated B nodes carry “Applying f to the 1st child invokes erasure…”]

  43. How to eliminate “erasing” rules: τ = E ; τne • More complex case τ: f( C, y1 ) → y1; h( B(x1,x2), y1 ) → f(x1, y1) • [Figure: E maps an input tree over A, B, C to an annotated copy; the annotations read “f(1st) → erasure”, “h(1st) → erasure”, “h(2nd) → erasure”]

  44. Problem! • “Inline-Expansion” is not always possible for nondeterministic MTTs • τ: f( C, y1, y2 ) → y1; f( C, y1, y2 ) → y2; g( B(x1,x2) ) → h(x1, f(x1, Y, Z) ); h( C, y1 ) → X( y1, y1 ) • τne: g( B(x1,x2) ) → h(x1, Y ); g( B(x1,x2) ) → h(x1, Z ); h( C, y1 ) → X( y1, y1 ) • Different translation!

  45. Solution! • Extend MTTs with “inline-nondeterminism” • τ: f( C, y1, y2 ) → y1; f( C, y1, y2 ) → y2; g( B(x1,x2) ) → h(x1, f(x1, Y, Z) ); h( C, y1 ) → X( y1, y1 ) • τne: g( B(x1,x2) ) → h(x1, +(Y,Z) ); h( C, y1 ) → X( y1, y1 ) • Same translation!
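
Why the choice operator repairs inline-expansion can be seen by enumerating outputs. The sketch below is an illustration under assumed encodings: terms are tuples, `("+", a, b)` is a choice node, and `h` models the rule h( C, y1 ) → X( y1, y1 ) under OI by copying the unevaluated parameter, so each copy resolves its choice independently.

```python
from itertools import product


def outputs(term):
    """All output trees of a term that may contain choice nodes ('+', a, b)."""
    if term[0] == "+":
        return outputs(term[1]) | outputs(term[2])
    kids = [outputs(c) for c in term[1:]]
    return {(term[0],) + combo for combo in product(*kids)}


def h(y1):
    """h(C, y1) -> X(y1, y1): OI substitutes the unevaluated y1 into both copies."""
    return ("X", y1, y1)


Y, Z = ("Y",), ("Z",)
inlined_naively = outputs(h(Y)) | outputs(h(Z))  # g -> h(x1, Y)  or  g -> h(x1, Z)
with_choice = outputs(h(("+", Y, Z)))            # g -> h(x1, +(Y, Z))
```

The naive expansion commits to Y or Z before the parameter is copied and yields only X(Y,Y) and X(Z,Z). The version with +(Y,Z) also yields the mixed outputs X(Y,Z) and X(Z,Y), which the original τ produces under OI because each copy of f(x1, Y, Z) is evaluated separately.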

  46. Solution:“MTT with Choice and Failure” • MTT + inline-nondeterminism + inline-partiality • Same expressiveness: MTT = MTTCF • Much more flexible syntax • Allows inline-expansion for free! RHS ::= F( RHS, … , RHS ) | q(xi, RHS, …, RHS) | yi | +(RHS, RHS) | θ

  47. Note on Path-linearity • Inline-Expansion • …does not preserve linearity. • …does preserve path-linearity. This is the reason for using PLMTTs. • τ: f( C, y1 ) → X( y1, y1 ); g( B(x1,x2) ) → f(x1, h(x2, Y) ) • τne: g( B(x1,x2) ) → X( h(x2, Y), h(x2, Y) )

  48. How to eliminate “Input-Deletion”: τne = I ; τnei • Exhaustively try all deletions by using nondeterminism • [Figure: I maps an input tree over A, B, C nondeterministically to trees over annotated symbols such as A0, A1, B00, B01, B10, B11]

  49. How to eliminate “Input-Deletion”: τne = I ; τnei • Example τne: f( B(x1,x2) ) → g(x1, g(x1, Y) ); f( B(x1,x2) ) → g(x1, g(x2, Y) ); f( B(x1,x2) ) → g(x2, g(x1, Y) ) • τnei: f( B10(x1) ) → g(x1, g(x1, Y) ); f( B10(x1) ) → g(x1, θ ); f( B10(x1) ) → g( x2, g(x1, Y) ); …

  50. How to eliminate “Skipping”: τnei = S ; τneis • Same as for the “input-deletion” • Try all possible deletions nondeterministically… • [Figure: S maps an input tree over A, B, C nondeterministically to trees with merged node labels such as BA, CB, CBB, CABB]
