
Developing a German grammar for analysis and generation using OpenCCG

Developing a German grammar for analysis and generation using OpenCCG. Ciprian Gerstenberger, University of Saarland. IGK Colloquium, January 13th 2005. Outline: NLP environments: a comparison; The choice: OpenCCG; The formalism: MMCCG; The German grammar; Future work.


Presentation Transcript


  1. Developing a German grammar for analysis and generation using OpenCCG • Ciprian Gerstenberger, University of Saarland • IGK Colloquium, January 13th 2005

  2. Outline • NLP environments: a comparison • The choice: OpenCCG • The formalism: MMCCG • The German grammar • Future work

  3. Dialogue systems • building dialogue systems → linguistic resources • linguistic resources → tools for developing and maintaining them • wide range of different NLP environments ⇒ Which is the most appropriate environment for our purposes?

  4. NLP environments for dialogue systems General requirements • both for analysis and generation • multi-lingual • easy domain reconfigurability Requirements for NLG • realization of contextually sensitive utterances • linguistically motivated control over flexible sentence realization

  5. NLP environments for dialogue systems Technical requirements • freely available • well documented • offering support when needed • freely available resources (for German) • efficient • platform independent

  6. NLP environments • KPML (Lisp): Systemic-Functional Grammar (SFG) • OpenCCG (Java): Multi-Modal Combinatory Categorial Grammar (MMCCG) • Babel (Prolog): Head-Driven Phrase Structure Grammar (HPSG) • LKB (Lisp): Head-Driven Phrase Structure Grammar (HPSG) • XLE (C): Lexical Functional Grammar (LFG) • XTAG (Lisp): Tree Adjoining Grammar (TAG) • XDG (Oz): Topological Dependency Grammar (TDG)

  7. NLP environments: Babel Babel-System (S. Müller) • implementing HPSG • Prolog • only analysis, no generation • multi-lingual (?) • resources for German: grammar with good coverage • freely available • documentation • support (?)

  8. NLP environments: LKB LKB • implementing HPSG • Lisp • multi-lingual • both analysis and generation • but: resources for German not usable for generation • freely available • documentation • support (?)

  9. NLP environments: XTAG XTAG • implementing TAG • Lisp • both analysis and generation • multi-lingual • resources for German (DFKI ?) • freely available • documentation • support (?)

  10. NLP environments: XDG XDG • implementing TDG • Oz • only analysis (generation as dependency parsing using TAGs) • multi-lingual (?) • resources for German (toy grammars) • freely available • documentation (?) • support

  11. NLP environments: KPML KOMET-Penman Multilingual Linguistic resource development • implementing Systemic-Functional Grammar (SFG) • Lisp • multi-lingual • flexible generation • good sentence realization control • only for generation, no parsing • resources for German: grammar with good coverage • freely available • documentation and support

  12. NLP environments: XLE Xerox Linguistic Environment • implementing LFG • C and Tcl/Tk • multi-lingual • both analysis and generation • resources for German (not freely available) • documentation • support • not freely available

  13. NLP environments: OpenCCG OpenCCG • implementing Multi-Modal Combinatory Categorial Grammar (MMCCG) • open source Java-based NLP library • both analysis and generation • multi-lingual • no resources for German, but grammars for English • freely available • documentation • support

  14. NLP environments: The Choice OpenCCG • Java-based NLP library → platform independent • analysis and generation → uniform grammar resources • multi-lingual → extendable • used in several other projects: FLIGHTS, COMIC, COSY • supports output formats for TTS (e.g. APML) • optimized sentence realization • flexible generation • sentence realization control

  15. Basic formalism: CCG Combinatory Categorial Grammar • lexicalized grammar formalism • lexical items are assigned syntactic categories • combinatory rules

  16. MMCCG Multi-Modal Combinatory Categorial Grammar • refining CCG by introducing means of controlling the application of combinatory rules • specifying modes on category forming operators (slashes) • making application of rules dependent on the slash mode • four basic modes governing different levels of associativity and permutativity
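
The idea that a slash mode decides which combinatory rules may apply can be sketched in a few lines of Python. This is a minimal illustration, not OpenCCG's API: the `Mode` enum, the ASCII mode symbols, the rule names, and `rule_allowed` are all hypothetical names chosen here. The mapping follows the usual description of the four basic modes: the most restrictive mode licenses application only, one mode adds associativity (harmonic composition), one adds permutativity (crossed composition), and the least restrictive mode allows both.

```python
from enum import Enum


class Mode(Enum):
    """Four basic slash modes (ASCII symbols are illustrative stand-ins)."""
    STAR = "*"     # application only
    DIAMOND = "^"  # associative: additionally licenses harmonic composition
    CROSS = "x"    # permutative: additionally licenses crossed composition
    DOT = "."      # associative and permutative: licenses all of the above


# Hypothetical table: which rule families each mode licenses.
ALLOWED = {
    Mode.STAR: {"application"},
    Mode.DIAMOND: {"application", "harmonic_composition"},
    Mode.CROSS: {"application", "crossed_composition"},
    Mode.DOT: {"application", "harmonic_composition", "crossed_composition"},
}


def rule_allowed(mode: Mode, rule: str) -> bool:
    """Return True if a slash carrying `mode` may take part in `rule`."""
    return rule in ALLOWED[mode]
```

A lexical entry whose slash carries the most restrictive mode can then only combine by application, which is how a grammar writer can block unwanted word orders for individual words rather than for the whole grammar.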

  17. Example Der Hund sieht die Katze. (The dog sees the cat.)

  18. Example (cont.) Der Hund sieht die Katze. (The dog sees the cat.)
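
The derivation shown on the example slides is not preserved in this transcript. As a rough sketch of how such a sentence can be derived with plain CCG application rules, the Python below assigns assumed categories to the words of "Der Hund sieht die Katze." and combines them. The category assignments (determiners as `np/n`, the verb as `(s\np)/np`) are a simplification chosen for illustration and are not taken from the grammar described in the talk; slash modes and features are left out.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    """Atomic category such as s, np, n."""
    name: str

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Slash:
    """Complex category: result/arg looks right, result\\arg looks left."""
    result: "Cat"
    direction: str  # "/" or "\\"
    arg: "Cat"

    def __str__(self) -> str:
        return f"({self.result}{self.direction}{self.arg})"


Cat = Union[Atom, Slash]


def forward_apply(left: Cat, right: Cat) -> Optional[Cat]:
    """Forward application: X/Y  Y  =>  X."""
    if isinstance(left, Slash) and left.direction == "/" and left.arg == right:
        return left.result
    return None


def backward_apply(left: Cat, right: Cat) -> Optional[Cat]:
    """Backward application: Y  X\\Y  =>  X."""
    if isinstance(right, Slash) and right.direction == "\\" and right.arg == left:
        return right.result
    return None


S, NP, N = Atom("s"), Atom("np"), Atom("n")

# Assumed toy lexicon (illustrative only).
LEXICON = {
    "der": Slash(NP, "/", N),
    "die": Slash(NP, "/", N),
    "Hund": N,
    "Katze": N,
    "sieht": Slash(Slash(S, "\\", NP), "/", NP),
}


def derive() -> Optional[Cat]:
    """Derive 'der Hund sieht die Katze' step by step."""
    subj = forward_apply(LEXICON["der"], LEXICON["Hund"])   # np/n  n  => np
    obj = forward_apply(LEXICON["die"], LEXICON["Katze"])   # np/n  n  => np
    vp = forward_apply(LEXICON["sieht"], obj)                # (s\np)/np  np => s\np
    return backward_apply(subj, vp)                          # np  s\np => s
```

Calling `derive()` yields the atomic category `s`, meaning the word sequence is accepted as a sentence under these assumed lexical entries.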

  19. Developing a German Grammar • joint work with Magdalena Wolska (DIALOG Project) Desiderata • uniform resources for analysis and generation • covering all phenomena in our domains • making the grammar more general than the phenomena encountered in our (relatively small) corpora strictly require

  20. Phenomena Some phenomena in German • agreement • position of the finite verb • Topological Fields: controlling the Vorfeld • complex sentences • ambiguity • controlling sentence realization

  21. Lexical forms

  22. Agreement

  23. Agreement (cont.)

  24. Agreement/Complex sentences
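
The agreement slides survive only as titles, so the grammar's actual encoding is not visible here. One common way to enforce agreement in a lexicalized grammar is to attach features to categories and require them to unify. The sketch below assumes that approach; the feature names, the `unify` helper, and the lexical entries are hypothetical, and only one reading of each word is listed (a real lexicon would need one entry per case/gender/number reading of "der").

```python
from typing import Dict, Optional

Features = Dict[str, str]


def unify(a: Features, b: Features) -> Optional[Features]:
    """Merge two feature sets; return None if any shared feature clashes."""
    merged = dict(a)
    for key, value in b.items():
        if key in merged and merged[key] != value:
            return None  # agreement failure
        merged[key] = value
    return merged


# Assumed entries: only the nominative masculine singular reading of "der".
DER = {"case": "nom", "gender": "masc", "number": "sg"}
HUND = {"gender": "masc", "number": "sg"}
KATZE = {"gender": "fem", "number": "sg"}

der_hund = unify(DER, HUND)    # features are compatible
der_katze = unify(DER, KATZE)  # gender clash under this single reading
```

Under these entries `der_hund` carries the combined features of determiner and noun, while `der_katze` is `None` because the gender values conflict.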

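  25. Clause types Verb-initial clauses • yes/no questions: Soll ich den Titel zu der Liste hinzufügen? (Shall I add the title to the list?) • alternative questions: Möchtest Du Mozart oder Bach hören? (Would you like to hear Mozart or Bach?) • imperatives: Wähle das Album „Californication“ von den Red Hot Chili Peppers! (Select the album "Californication" by the Red Hot Chili Peppers!)

  26. Clause types (cont.) Verb-second clauses • main declarative: Der Titel wurde hinzugefügt. (The title has been added.) • wh-question: Welcher Künstler spielt „Missunderstood“? (Which artist plays "Missunderstood"?)

  27. Clause types (cont.) Verb-final clauses • subordinate clause: Wenn Sie möchten, kann ich „We Just Can't Get Enough CCG“ abspielen. (If you like, I can play "We Just Can't Get Enough CCG".) • relative clause: Ich nehme aus den ersten vier Alben, die du hast, jeweils den ersten Song. (From each of the first four albums that you have, I take the first song.) • complement clause: Ich glaube, daß das Album „Dangerously In Love“ heißt. (I believe that the album is called "Dangerously In Love".)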

  28. Topological Fields Controlling the Vorfeld occupation using flags

  29. Topological Fields (cont.) Controlling the Vorfeld occupation using flags

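  30. Analysis: Ambiguities Der Hund von dem traurigen Mann den ich sah rennt. (The dog of the sad man whom I saw is running.)

  31. Analysis: Ambiguities (cont.) Das Kind rennt wenn der Hund rennt weil die Katze rennt. (The child runs when the dog runs because the cat runs.)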

  32. Generation Sentence realization without control

  33. Generation (cont.) Sentence realization with control: fronted subject

  34. Generation (cont.) Sentence realization with control: fronted object
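
The realizer output shown on the generation slides is not preserved in this transcript. The effect of controlling which constituent occupies the Vorfeld can still be sketched with a toy linearizer. Everything below is hypothetical: the function `realize`, its `vorfeld` parameter, and the string-based handling of constituents are illustrative stand-ins and not how OpenCCG represents logical forms or flags.

```python
def realize(subject: str, verb: str, obj: str, vorfeld: str = "subject") -> str:
    """Linearize a verb-second declarative with the chosen Vorfeld filler.

    The finite verb stays in second position; the `vorfeld` flag selects
    which constituent is placed in front of it.
    """
    if vorfeld == "subject":
        words = [subject, verb, obj]
    elif vorfeld == "object":
        words = [obj, verb, subject]
    else:
        raise ValueError(f"unknown Vorfeld filler: {vorfeld}")
    sentence = " ".join(words)
    # Capitalize the first letter and add sentence-final punctuation.
    return sentence[0].upper() + sentence[1:] + "."


fronted_subject = realize("der Hund", "sieht", "die Katze", vorfeld="subject")
fronted_object = realize("der Hund", "sieht", "die Katze", vorfeld="object")
```

With the subject fronted this gives "Der Hund sieht die Katze."; with the object fronted it gives "Die Katze sieht der Hund.", which expresses the same proposition because "der Hund" is unambiguously nominative.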

  35. Future Work (1) • extending the grammar with respect to the two domains currently modelled (MP3 and maths tutorial) • coordination (AP, NP, sentence, etc.) • complex NPs (e.g. postmodification) • control and raising verbs • particle verbs (Ich spiele den Song ab vs. Ich möchte den Song abspielen; "I play the song" vs. "I would like to play the song") • Topological Fields: scrambling in the Mittelfeld

  36. Future Work (2) • analysis: coping with partial input, ill-formed utterances • generation: realizing elliptical output • using a dynamic morphological module • development of an ontology
