
Multi-layered XML-based Annotation for Integrated NL Processing

Presentation Transcript


  1. Multi-layered XML-based Annotation for Integrated NL Processing Anette Frank Language Technology Lab DFKI GmbH Saarbrücken, Germany Japanese-German Workshop on NLP, Sapporo, Japan July 4-5, 2003

  2. Background Whiteboard – Multilevel Annotation for Dynamic Free Text Processing H. Uszkoreit, B. Crysmann, A. Frank, B. Kiefer, G. Neumann, J. Piskorski, U. Schäfer, F. Xu (M. Becker and H.-U. Krieger) Major project goals • Integration of shallow and deep linguistic processing • Processing of unrestricted free text • Variable-depth text analysis • XML-based system architecture • Uniform way of representing and combining results of various NLP components • Flexible software infrastructure for NLP-based applications • Applications • Grammar & controlled language checking • Intelligent information extraction

  3. Motivation — Annotation-based Integration of Shallow and Deep NLP — • Deep NLP (DNLP): fine-grained analysis, high precision if correctly disambiguated; but high ambiguity rates, insufficient robustness (coverage, ill-formed input), insufficient efficiency • Shallow NLP (SNLP): partial analysis, insufficient precision; but tamed ambiguity, high robustness (coverage, ill-formed input), high efficiency • Goal of integrated ‘hybrid’ processing • Robustness and efficiency of shallow analysis • Precision and fine-grainedness of deep syntactic analysis

  4. Integration of Shallow and Deep Analysis in WHAT: an XML-based Annotation Architecture • Whiteboard Annotation Machine & Transformer (Schäfer 2003) • Managing shallow and deep analyses in a multi-layer XML architecture • XSLT queries to XML standoff annotations for flexible, efficient integration • Lexical integration (Crysmann et al. 2002) • SPPC-HPSG interface: building HPSG lexicon entries “on the fly” • Named entities, open-class categories (nouns, adjectives, adverbs, ...) • HPSG-GermaNet integration • association with HPSG lexical sorts • improves coverage and robustness • Phrasal integration for ‘hybrid’ syntactic processing (Frank et al. 2003) • Integration of shallow topological field parsing and deep HPSG parsing • improves efficiency and robustness

  5. Integration of Shallow and Deep NLP — XML/XSLT-based system architecture — • Multi-layer XML standoff annotation for integration of NLP components • Standoff annotation allows for combination of overlapping hierarchies • Access to results of alternative NLP components, for flexible use in applications • XSLT-based system architecture WHAT: Whiteboard Annotation Transformer (Schäfer 2003) [Architecture diagram: shallow and deep NLP components write to the WHAM multilayer chart / XML standoff annotation, which NLP-based applications access through a programming interface]

  6. Integration of Shallow and Deep NLP — XML/XSLT-based system architecture — • Multi-layer XML standoff annotation for integration of NLP components • Standoff annotation allows for combination of overlapping hierarchies • Access to results of alternative NLP components, for flexible use in applications • XSLT-based system architecture WHAT: Whiteboard Annotation Transformer (Schäfer 2003) WHAT query • XSLT queries to XML standoff markup • Template library for 3 types of queries: V(alue), N(ode sets), D(ocument) • Flexible, efficient access for online / offline integration of NLP components • ACT: Accessing, Computing, Transforming • Portability [Diagram: a WHAT query and the component-specific XSLT template library yield a constructed XSLT query, which the XSLT processor applies to the XML standoff markup to produce the result]

  7. Integration of Shallow and Deep NLP — XSLT-based queries for annotation-based integration — • Through V(alue) and N(ode) queries: • Morphology and stemming of unknown words (unknown in HPSG lexicon) • PoS tagging • Compounds • Named entities (spans and semantic types) • Through D(ocument), V(alue) and N(ode) queries: • Chunks • Topological structure (spans, types) • Example: return Named Entity type from SPPC_XML: getValue.NE.type(I4)

<query name="getValue.NE.type">
  <!-- returns the type of named entity -->
  <xsl:param name="index"/>
  <xsl:template match="/WHITEBOARD/SPPC_XML//NE[@id=$index]">
    <xsl:value-of select="@type"/>
  </xsl:template>
</query>
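
A minimal sketch (not the WHAT implementation) of how such a V(alue) query can be executed with lxml in Python: the template above is recast as a complete XSLT stylesheet (the id test is moved from the match pattern into the select expression), and the standoff document below is a made-up two-layer example.

from lxml import etree

# made-up standoff annotation with an SPPC_XML layer containing one named entity
STANDOFF = b"""
<WHITEBOARD>
  <SPPC_XML>
    <NE id="I4" type="location"><W>Sapporo</W></NE>
  </SPPC_XML>
</WHITEBOARD>"""

# self-contained XSLT version of the getValue.NE.type query
QUERY = b"""
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:param name="index"/>
  <xsl:template match="/">
    <!-- returns the type of the named entity whose id equals $index -->
    <xsl:value-of select="/WHITEBOARD/SPPC_XML//NE[@id=$index]/@type"/>
  </xsl:template>
</xsl:stylesheet>"""

transform = etree.XSLT(etree.fromstring(QUERY))
result = transform(etree.fromstring(STANDOFF), index=etree.XSLT.strparam("I4"))
print(str(result))   # -> location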

  8. Integration of Shallow and Deep NLP — Lexical integration — • Shallow processing (SPPC) • Morphological and compound analysis • PoS tagging • Named Entity recognition • Deep syntactic processing (HPSG) • Subcategorisation • Argument structure • Lexical semantic sorts • Building HPSG lexicon entries “on the fly” • XML encoding of typed feature structures • Mapping lexical information from SPPC to HPSG typed feature structures • Lexical syntactic and semantic information • Mapping GermaNet semantic classes to HPSG sorts (Siegel et al., 2001) • Subcategorisation acquisition from parsed corpora • Increase of coverage and robustness at lexical level • Increase of fully lexically covered sentences: 43% (on NEGRA corpus) • Increase of parsed sentences due to lexical coverage: 8.9%
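
An illustrative sketch (not the actual grammar code) of what building an entry “on the fly” amounts to: shallow PoS / named-entity / GermaNet information for a word unknown to the deep lexicon is mapped to a skeletal, feature-structure-like entry. All type names, features and the GermaNet-to-sort table below are invented for the example.

# hypothetical mappings from shallow tags to HPSG lexical types and semantic sorts
POS_TO_TYPE = {"NN": "common-noun-lex", "ADJA": "adjective-lex", "ADV": "adverb-lex"}
NE_TO_TYPE = {"person": "ne-person-lex", "location": "ne-location-lex"}
GERMANET_TO_SORT = {"Mensch": "human", "Ort": "location", "Artefakt": "artifact"}

def build_entry(word, stem, pos=None, ne_type=None, germanet_class=None):
    """Return a minimal feature-structure-like lexicon entry as a nested dict."""
    lex_type = NE_TO_TYPE.get(ne_type) or POS_TO_TYPE.get(pos)
    if lex_type is None:
        return None                       # no safe default entry for this word
    return {
        "TYPE": lex_type,
        "STEM": stem,
        "ORTH": word,
        "SYNSEM": {
            "KEYREL": {"PRED": "_" + stem + "_rel"},   # lexical relation name
            "SORT": GERMANET_TO_SORT.get(germanet_class, "entity"),
        },
    }

entry = build_entry("Datenbanken", "Datenbank", pos="NN", germanet_class="Artefakt")
print(entry["TYPE"], entry["SYNSEM"]["SORT"])   # common-noun-lex artifact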

  9. Integration of Shallow and Deep NLP — Syntactic integration — • Using robust, efficient shallow parsing • to pre-partition the deep parser's search space → efficiency • to select partial analyses from the deep parser's chart → robustness • Constraining the search space of a chart-based parser • External knowledge sources deliver • subtrees to be checked for compatibility with deep parsing • additional information (categorial, featural constraints) for constituents • Prioritisation scheme: constituents (chart edges) of the deep parser are • rewarded if compatible • penalised if incompatible with external constraints • Best-first filter on ambiguous output • Challenge: shallow analysis needs to provide reliable, compatible structures

  10. The Shallow-Deep Mapping Problem — Problems and Solutions — The shallow-deep mapping problem • Chunk parsing not isomorphic to deep syntactic structure („attachments“) • Deep syntactic structure (NP, CL, CL) vs. NP & CL chunks (NP, CL, CL): [Die Programme [die [sie] benutzen, [um [ihre Ergebnisse] zu verbreiten] / [The programs [that [they] use, [in order to [their results] distribute]

  11. The Shallow-Deep Mapping Problem — Problems and Solutions — The shallow-deep mapping problem • Chunk parsing not isomorphic to deep syntactic structure („attachments“) • Deep syntactic structure (NP, CL, CL) vs. NP & CL chunks (NP, NP, CL): [Die Programme [die [sie] benutzen, [um [ihre Ergebnisse] zu verbreiten] / [The programs [that [they] use, [in order to [their results] distribute]

  12. The Shallow-Deep Mapping Problem — Problems and Solutions — The shallow-deep mapping problem • Chunk parsing not isomorphic to deep syntactic structure („attachments“) • Deep syntactic structure (NP, CL, CL) vs. NP & CL chunks (NP, NP, NP): [Die Programme [die [sie] benutzen, [um [ihre Ergebnisse] zu verbreiten] / [The programs [that [they] use, [in order to [their results] distribute]

  13. The Shallow-Deep Mapping Problem — Problems and Solutions — The shallow-deep mapping problem • Chunk parsing not isomorphic to deep syntactic structure („attachments“) • „Bottom-up“ chunk parsing not constrained by sentence macro-structure (e.g. Peter eats pizza and Mary drinks wine) • Stochastic Topological Field Parsing (Becker and Frank 2002) • High degree of compatibility with deep syntactic structure • Flat, partial macro-structure: robustness, coverage, efficiency, precision

  14. Stochastic Topological Field Parsing — Topological field model of German syntax — Theory-neutral macro-structure of complex sentences • Fields per sentence type: Vorfeld (VF), left sentence bracket (LK), Mittelfeld (MF), right sentence bracket (RK), Nachfeld (NF) • V2: Fritz kennt die Freunde seines Sohns, die zur Party kommen. / Fritz hat die Freunde seines Sohns kennengelernt, die zur Party kommen. • V1: Hat Fritz die Freunde s. Sohns kennengelernt, die zur Party kamen? / Kennt Fritz die Freunde s. Sohns, die zur Party kommen? • Vletzt: weil Fritz die Freunde s. Sohns kennt / wer die Freunde seines Sohns kennt, die zur Party kommen • [Diagram: mapping the topological structure of a clause (CL: VF LK MF RK NF) to deep syntactic structure]

  15. Stochastic Topological Field Parsing — A corpus-based approach (Becker & Frank 2002) — Non-lexicalised PCFG trained from (converted) NEGRA corpus • Flat phrasal fields VF, MF, NF: sequences of POS tags (and CL-nodes) • Parameterised categories: CL–V2/–V1/–SUBCL/–REL/–WH, ..., RB–INF/–FIN • Explicit clausal embedding structure • [Example tree: Daher (thus) wies (ordered) Souza die (the) Polizei (police) an (verb particle), den (the) Häuptling (chieftain) zu (to) fassen (capture), der (who) sich (himself) versteckt (hidden) hält (keeps); a CL-V2 clause with fields VF-TOPIC, LB-VFIN, MF, RB-PTK, NF, embedding a CL-INF and a CL-REL clause]
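
A toy illustration (not the trained NEGRA model): a tiny hand-written PCFG over POS-tag sequences with topological field categories, parsed with NLTK's Viterbi parser. The rule set and probabilities are invented; the real grammar is induced from the converted treebank.

import nltk

grammar = nltk.PCFG.fromstring("""
  CL-V2    -> VF-TOPIC LB-VFIN MF RB-PTK [1.0]
  VF-TOPIC -> 'ADV' [0.5] | 'NE' [0.5]
  LB-VFIN  -> 'VVFIN' [1.0]
  MF       -> 'NE' MF [0.3] | 'ART' MF [0.3] | 'NN' MF [0.2] | 'NN' [0.2]
  RB-PTK   -> 'PTKVZ' [1.0]
""")

parser = nltk.ViterbiParser(grammar)
# POS tags of "Daher wies Souza die Polizei an" (non-lexicalised input)
tags = "ADV VVFIN NE ART NN PTKVZ".split()
for tree in parser.parse(tags):
    tree.pretty_print()
    print("probability of best parse:", tree.prob())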

  16. Stochastic Topological Field Parsing — Performance — Best model [para+, bin+, pnct+, prun+] • High accuracy (93% / 88%) at high coverage (up to 100%) • High rate of perfect matches (fully correct): 80% / 72% • Efficiency: 0.12 secs/sentence (LoPar parser, Schmid 2000) Evaluation: ignoring parameters and punctuation (length ≤ 40 words)

  17. Integrated Shallow and Deep Parsing — TopP meets HPSG — [Example: Der Zehnkampf hätte eine andere Dimension gehabt, wenn er dabei gewesen wäre. Shown side by side: the topological parse (CL-V2 with VF-TOPIC, LB-VFIN, MF, RB-VPART and an NF containing the embedded CL-SUBCL) and the corresponding HPSG constituent structure]

  18. Integrated Shallow and Deep Parsing — Bridging structural non-isomorphisms — XSLT-based extraction of map constraints to guide deep parsing, e.g. <MAP_CONSTR id="T10" constr="extrapos_rk+nf" left="W7" right="W13"/> for the extraposed RK+NF span (words 7–13) [Diagram: the topological and HPSG trees of slide 17, with the extracted constraint bridging the non-isomorphic attachment of the extraposed clause]

  19. Flattening phrasal fields Integrated Shallow and Deep Parsing— XML/XSLT-based integration: TopP meets HPSG —

  20. chunk insertion Integrated Shallow and Deep Parsing— XML/XSLT-based integration: TopP meets HPSG —

  21. Integrated Shallow and Deep Parsing — XML/XSLT-based integration: TopP meets HPSG — bracket extraction:

<TOPO2HPSG type="root" id="5608">
  <MAP_CONSTR id="T1" constr="v2_cp" left="W1" right="W13"/>
  <MAP_CONSTR id="T2" constr="v2_vf" left="W1" right="W2"/>
  <MAP_CONSTR id="T3" constr="vfronted_vfin+rk" left="W3" right="W3"/>
  <MAP_CONSTR id="T4" constr="vfronted_vfin+vp+rk" left="W3" right="W13"/>
  <MAP_CONSTR id="T5" constr="vfronted_vp+rk" left="W4" right="W13"/>
  <MAP_CONSTR id="T6" constr="vfronted_rk-complex" left="W7" right="W7"/>
  <MAP_CONSTR id="T7" constr="vl_cpfin_compl" left="W9" right="W13"/>
  <MAP_CONSTR id="T8" constr="vl_compl_vp" left="W10" right="W13"/>
  <MAP_CONSTR id="T9" constr="vl_rk_fin+complex+f" left="W12" right="W13"/>
  <MAP_CONSTR id="T10" constr="extrapos_rk+nf" left="W7" right="W13"/>
</TOPO2HPSG>

[Pipeline: the extracted brackets, together with shallow lexical processing (SPPC), feed HPSG parsing (prioritisation)]
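
A small sketch of reading these constraints on the deep-parsing side: the TOPO2HPSG document is turned into a list of labelled bracket spans that a prioritiser can consult. Only the element and attribute names visible above are assumed; the surrounding machinery is of course richer.

import xml.etree.ElementTree as ET
from collections import namedtuple

Bracket = namedtuple("Bracket", "id constr left right")

def read_map_constraints(xml_string):
    root = ET.fromstring(xml_string)                 # the <TOPO2HPSG> element
    brackets = []
    for mc in root.findall("MAP_CONSTR"):
        left = int(mc.get("left").lstrip("W"))       # word position "W7" -> 7
        right = int(mc.get("right").lstrip("W"))
        brackets.append(Bracket(mc.get("id"), mc.get("constr"), left, right))
    return brackets

topo2hpsg = """<TOPO2HPSG type="root" id="5608">
  <MAP_CONSTR id="T1" constr="v2_cp" left="W1" right="W13"/>
  <MAP_CONSTR id="T7" constr="vl_cpfin_compl" left="W9" right="W13"/>
  <MAP_CONSTR id="T10" constr="extrapos_rk+nf" left="W7" right="W13"/>
</TOPO2HPSG>"""

for b in read_map_constraints(topo2hpsg):
    print(b.id, b.constr, b.left, b.right)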

  22. Shaping the Deep Parser's Search Space — Bracket conditions from shallow topological parsing — • Interface to shallow components: labelled brackets • Provide information about constituent start and end positions • Bracket names (types) associated with additional constraints • HPSG parser PET: agenda-based chart parser • Flexible priority heuristics for the parsing tasks (i.e. possible combinations of edges) • Matching start, connecting and end positions of new tasks against brackets • Bracket information is used to modify task priorities • Reward tasks consistent with bracket information • Penalise tasks building incompatible chart edges • No pruning, but shaping the search space!

  23. Shaping the Deep Parser's Search Space — Matching brackets and chart edges — Crossing Event (bracket x)

  24. Shaping the Deep Parser's Search Space — Matching brackets and chart edges — Match Event (bracket x)

  25. Shaping the Deep Parser's Search Space — Matching brackets and chart edges — Right (Left)-match Inside Event (bracket x)

  26. Shaping the Deep Parser's Search Space — Matching brackets and chart edges — Right (Left)-match Outside Event (bracket x)
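
A rough sketch of the span comparison behind the four event types above: a bracket and a chart edge are both (start, end) word spans, and a new edge is classified as matching, lying inside, lying outside, or crossing the bracket. The function is an illustrative reconstruction, not PET's actual prioritisation code.

def classify(edge, bracket):
    """Classify an edge span relative to a bracket span (both (start, end))."""
    (es, ee), (bs, be) = edge, bracket
    if (es, ee) == (bs, be):
        return "match"                    # spans coincide
    if es >= bs and ee <= be:
        return "inside"                   # contained; may share the left/right boundary
    if ee <= bs or es >= be:
        return "outside"                  # disjoint spans
    if (es < bs <= ee < be) or (bs < es <= be < ee):
        return "crossing"                 # straddles one bracket boundary
    return "outside"                      # edge strictly covers the bracket

print(classify((4, 13), (4, 13)))   # match
print(classify((7, 9), (4, 13)))    # inside
print(classify((2, 6), (4, 13)))    # crossing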

  27. Shaping the Deep Parser's Search Space — Conditions and Effects — Priority update: p̃(t) = p(t) · ( 1 ± conf_ent(br_x) · conf_pr(x) · θ(x) ) • Additional constraints on bracket types for prioritisation • Constituent matching conditions • „Match“ and „Cross“: brackets compatible with HPSG constituents • „Right Inside“ and „Right Outside“: partially specified constituents • HPSG grammar constraints • Allowed/disallowed HPSG grammar rules • Necessary/forbidden HPSG feature structure configurations • Positive vs. negative priority effects: rewarding vs. penalising • Priorities are changed only if both match conditions and grammar constraints are fulfilled • Confidence values can be used to modulate the strength of the effect
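
A minimal sketch of this update, assuming the final factor is a per-bracket-type heuristic weight (the weight symbol and its argument are a reconstruction; the weight values ½ and 1 used in the experiments appear on slide 32): a task's priority is scaled up when rewarded and down when penalised, by the product of the two confidence terms and the weight. Names are illustrative, not PET's internal API.

def reprioritise(p_t, conf_ent_brx, conf_pr_x, weight_x, reward):
    """Return the adjusted task priority p~(t)."""
    delta = conf_ent_brx * conf_pr_x * weight_x
    return p_t * (1 + delta) if reward else p_t * (1 - delta)

# a task compatible with a high-confidence bracket is rewarded ...
print(reprioritise(100.0, 0.9, 0.88, 1.0, reward=True))    # 179.2
# ... an incompatible (crossing) task is penalised
print(reprioritise(100.0, 0.9, 0.88, 1.0, reward=False))   # ~20.8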

  28. Confidence Measures — Accuracy of map-constraints — • Static confidence measure: precision of bracket type x: conf_pr(x) • Precision/recall of brackets extracted from the best topological parse, measured against brackets extracted from the evaluation corpus (Becker & Frank 2002): precision 88.3%, recall 87.8% • Threshold on conf_pr (= 0.7) excludes 22.8% of bracket mass, 32.35% of bracket types • includes chunk brackets (with 71.1% precision)
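
A sketch of how such a static per-type confidence could be computed: the precision of extracted brackets of type x against gold bracket spans. The data below is made up; the slide's 88.3% / 87.8% figures come from the real evaluation corpus.

def bracket_precision_by_type(extracted, gold):
    """extracted, gold: sets of (type, left, right) labelled spans."""
    correct, total = {}, {}
    for b in extracted:
        total[b[0]] = total.get(b[0], 0) + 1
        if b in gold:
            correct[b[0]] = correct.get(b[0], 0) + 1
    return {x: correct.get(x, 0) / total[x] for x in total}

extracted = {("v2_cp", 1, 13), ("vl_compl_vp", 10, 13), ("chunk_np", 4, 6)}
gold = {("v2_cp", 1, 13), ("vl_compl_vp", 10, 12), ("chunk_np", 4, 6)}
for x, p in sorted(bracket_precision_by_type(extracted, gold).items()):
    print(x, p)   # chunk_np 1.0 / v2_cp 1.0 / vl_compl_vp 0.0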

  29. Confidence Measures — Tree Entropy — • Entropy of a parse distribution delivers a measure of how certain the parser is about its best analysis for a given sentence (e.g. Hwa 2000) • Uniform distribution, high entropy → very uncertain • Spike distribution, low entropy → very certain • Conf_ent: tree entropy as a confidence measure for the quality of the best topological parse and the extracted bracket constraints • Experiment I: effect of varying entropy thresholds on precision/recall of topological parsing • precision: proportion of selected parses that are perfect matches • recall: proportion of perfect matches that are selected • coverage: perfect matches above/below the entropy threshold: in/out of coverage • Experiment II: determining the optimal entropy threshold, trading coverage for precision
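
A minimal sketch of tree entropy as a confidence score: the parser's n-best parse probabilities are normalised to a distribution and its entropy is computed, so a spiked distribution (one clearly preferred parse) yields low entropy and a uniform one yields high entropy. The example numbers are made up.

import math

def tree_entropy(parse_probs):
    """Entropy (in bits) of the normalised distribution over parse probabilities."""
    z = sum(parse_probs)
    dist = [p / z for p in parse_probs if p > 0]
    return -sum(p * math.log2(p) for p in dist)

print(tree_entropy([0.97, 0.02, 0.01]))        # spiked distribution  -> ~0.22 (certain)
print(tree_entropy([0.25, 0.25, 0.25, 0.25]))  # uniform distribution -> 2.0  (uncertain)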

  30. Confidence Measures — Tree Entropy — Experiments carried out on the (split) evaluation corpus of (Becker and Frank, 2002) • Varying entropy thresholds in [1, 0] • threshold = 1: no filtering • lowering the threshold increases precision, decreases recall and coverage • Optimal entropy threshold: 0.236, maximising the f-measure (weight 0.5) on the training set • Effect of threshold 0.236 on the test set
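
A sketch of the threshold search: a sentence's best topological parse is "selected" if its tree entropy is at or below the candidate threshold, precision and recall follow the definitions on slide 29, and the threshold maximising the f-measure on held-out data is kept. Plain F1 is used here rather than the weighted f-measure of the slide, and the data points are made up.

def f_measure(items, threshold):
    """items: list of (entropy, is_perfect_match) pairs."""
    selected = [match for ent, match in items if ent <= threshold]
    n_perfect = sum(match for _, match in items)
    if not selected or not n_perfect:
        return 0.0
    precision = sum(selected) / len(selected)      # selected parses that are perfect
    recall = sum(selected) / n_perfect             # perfect matches that are selected
    return 2 * precision * recall / (precision + recall)

held_out = [(0.05, True), (0.10, True), (0.30, False), (0.45, True), (0.90, False)]
best = max((t / 100 for t in range(101)), key=lambda t: f_measure(held_out, t))
print(best, round(f_measure(held_out, best), 3))   # 0.45 0.857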

  31. Experiments — Data and setup — Data • 5060 NEGRA sentences (24.57% of the NEGRA corpus, as covered by HPSG) • avg. length: 8.94 words (w/o punctuation); avg. lexical ambiguity: 3.05 entries/word Setup • Performance measuring (absolute run-time, no. of tasks) • Baseline: HPSG parsing w/ PoS guidance, but w/o topological information • Testing various integration parameters • topological brackets • confidence weights for topological information • bracket precision (P) (± thresholded) • tree entropy (E) (± thresholded) • chunk brackets

  32. Results Baseline: HPSG parsing w/ PoS guidance • Heuristic weights on task priorities • ½: increase / decrease by half • 1: increase to double / decrease to zero

  33. Results Baseline: HPSG parsing w/ PoS guidance Heuristic weights: with θ set high, wrong topological information can mislead the parser Confidence weights [0,1] • P(T): (thresholded) bracket precision • E(T): (thresholded) tree entropy

  34. Results Baseline: HPSG parsing w/ PoS guidance Heuristic weights: with θ set high, wrong topological information can mislead the parser • Confidence weights • PT and E work best • ET: the threshold cuts out the entire tree, while some brackets can be correct • PT with chunk constraints, w/ and w/o topological brackets

  35. Results Baseline: HPSG parsing w/ PoS guidance Heuristic weights: with θ set high, wrong topological information can mislead the parser • Confidence weights • PT and E work best • ET: the threshold cuts out the entire tree, while some brackets can be correct • No improvement by adding chunks • Chunks w/o topological brackets: almost no improvement over the baseline

  36. Observations — Monitoring efficiency gains by sentence length — Efficiency gains/losses by sentence length: baseline vs. PT–E ½ [Plot: distribution of # sentences per sentence length] • Outliers: 963 sentences (length ≥ 3, avg. length 11.09) • Observations: conflicting topological / HPSG parses; cross-validation effects

  37. Observations — Guidance from PoS, chunks, and topological brackets — Impact of guidance by PoS, chunks, or topological parsing • Baseline includes PoS prioritisation • Chunk-based constraints: rather poor • Topological constraints (span and grammar constraints): highest impact • Related work: PoS- and chunk-based prioritisation in dependency parsing (Daum et al. 2003)

  38. Conclusion and Outlook • Data-driven integration of shallow and deep parsing, mediated by XML multi-layer annotation architecture • XSLT-based integration: efficient, fine-grained dovetailing of shallow and deep constraints • Shallow macro-structural constraints yield substantial performance gains • Focus on annotation-based system architecture and efficiency • Further integration scenarios target • Robustness • Topological information for fragment recovery from deep parser’s chart • Pruning failed input sentences for reparsing (snipping adjunct clauses, ...) • Precision • Confidence-based filtering: tree entropy, decision tree learning • Fine-grainedness of analysis • Projecting robust semantic structures from shallow trees
