
Deep Processing for Restricted Domain QA

Yi Zhang

Universität des Saarlandes

yzhang@coli.uni-sb.de

Why Deep?

Is Shallow Processing Enough?

  • For TREC-like QA evaluation
    • (in most cases) YES
  • However, for restricted domain QA
    • More complicated questions
    • Less information redundancy for data-intensive approaches
    • Domain knowledge available
Deep Processing Provides
  • More fine-grained linguistic analysis
    • Long-distance dependencies
    • Agreement
  • Semantic Representation
    • MRS/RMRS
General Problems with Deep Processing
  • Robustness
    • Lexicon
    • Compound NP
  • Specificity
    • “John saw Mary”
  • Efficiency (not discussed here)
Deep Processing
  • MRS/RMRS
    • (Robust) Semantic representation with underspecification.
  • HPSG Grammars
    • LinGO ERG Grammar
    • Other grammars (German, Japanese, Modern Greek, Norwegian, Chinese, …)
  • HoG (Heart of Gold)
    • Hybrid shallow & deep processing architecture with a uniform semantic representation (RMRS).
QA in QUETAL (1)
  • Hybrid shallow & deep approach
  • Cross-lingual QA
  • QA on
    • Texts
    • Semi-structured documents
    • Database
QA in QUETAL (2)

[Architecture diagram; only the component labels survive:]
  • NLQ
  • Semantic question analysis: Q-type, A-type, Q-focus
  • Syntactic analysis: dependency parser, TAG for English/German questions
  • IR Schema, IR Query Planner, GetData, Result Merge
  • Info source: Texts, IE, Fact DB
  • Answer planning & generation
QA in QUETAL (3)

Deep processing in QUETAL

  • HPSG grammar used for question analysis.
  • Documents are processed with relatively shallow methods.
  • Answer matching with RMRS.
Restricted Domain QA
  • More complicated questions
  • Fewer documents, of better quality
  • Domain specific ontology available
Restricted Domain QA – an Example

Question: Where is the City Hall of Shanghai?

Passage: Shanghai City Planning Exhibition Hall[LOC_1] is located to the east of the City Hall[LOC_2], …, setting off with the crystal-like Grand Theatre[LOC_3] to the west.

Answer: Between Shanghai City Planning Exhibition Hall and the Grand Theatre.

Domain ontology
Open Topics
  • Grammar extension & automated lexicon acquisition
  • Robust deep processing
  • Semantic answer matching
  • Cross-lingual
Grammar Extension

Tourism Domain

  • ERG extended for
    • “RONDANE”: Norwegian mountain-area tourism
      • 1,400 sentences
      • 15 words/sentence
      • coverage > 74%
  • Shanghai tourist guide from http://www.shanghai.gov.cn
    • 1,600 sentences
    • 18 words/sentence
Grammar Extension
  • ERG lexicon
  • Lexicon acquisition is relatively easy to automate for nouns
Automated Lexicon Acquisition
  • POS tagging
  • Named entity recognition
  • Statistical models to find the best lexical type for an unknown noun.
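
The slides do not say which statistical model is meant. As a rough illustration only, the sketch below uses a small Naive Bayes classifier over three assumed features (word suffix, POS tag, named-entity tag). The lexical type labels (`count_noun`, `mass_noun`, `proper_noun`) and the training examples are made up for the example and are not actual ERG lexical types.

```python
import math
from collections import Counter, defaultdict


def features(word, pos_tag, ne_tag):
    """Assumed feature set for an unknown noun: suffix, POS tag, NE tag."""
    return [f"suffix={word[-3:].lower()}", f"pos={pos_tag}", f"ne={ne_tag}"]


class LexicalTypeGuesser:
    """Naive Bayes with add-one smoothing over the features above."""

    def __init__(self):
        self.type_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocabulary = set()

    def train(self, examples):
        for word, pos_tag, ne_tag, lexical_type in examples:
            self.type_counts[lexical_type] += 1
            for feat in features(word, pos_tag, ne_tag):
                self.feature_counts[lexical_type][feat] += 1
                self.vocabulary.add(feat)

    def predict(self, word, pos_tag, ne_tag):
        total = sum(self.type_counts.values())

        def score(lexical_type):
            # log P(type) + sum of log P(feature | type), smoothed.
            log_prob = math.log(self.type_counts[lexical_type] / total)
            denom = sum(self.feature_counts[lexical_type].values()) + len(self.vocabulary)
            for feat in features(word, pos_tag, ne_tag):
                count = self.feature_counts[lexical_type][feat]
                log_prob += math.log((count + 1) / denom)
            return log_prob

        return max(self.type_counts, key=score)


# Toy training data: (word, POS tag, NE tag, hypothetical lexical type).
TRAINING = [
    ("museum", "NN", "O", "count_noun"),
    ("temple", "NN", "O", "count_noun"),
    ("garden", "NN", "O", "count_noun"),
    ("tourism", "NN", "O", "mass_noun"),
    ("scenery", "NN", "O", "mass_noun"),
    ("Shanghai", "NNP", "LOC", "proper_noun"),
    ("Pudong", "NNP", "LOC", "proper_noun"),
]

guesser = LexicalTypeGuesser()
guesser.train(TRAINING)
print(guesser.predict("Rondane", "NNP", "LOC"))
```

In this sketch the POS tagger and named-entity recogniser supply the evidence and the model only chooses among lexical types already seen in training. A realistic setup would train on the existing grammar lexicon rather than on a handful of hand-written rows.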
Robust Deep Processing
  • Back off to RMRS produced by intermediate or shallow parsers (HoG architecture).
  • Keep partial parse charts and the corresponding MRS fragments for semantic answer matching.
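
The back-off idea can be pictured as a cascade that tries the deepest analyser first and falls through to shallower ones. The sketch below is only an illustration under that assumption: `deep_parser` and `shallow_parser` are toy stand-ins, not the HoG components, and the dictionaries they return are placeholders for real RMRS structures.

```python
# Toy lexicon standing in for the deep grammar's coverage (an assumption).
KNOWN_WORDS = {"where", "is", "the", "city", "hall"}


def tokenize(sentence):
    return sentence.lower().strip("?.").split()


def deep_parser(sentence):
    """Stand-in deep analyser: fails (returns None) on any unknown word."""
    tokens = tokenize(sentence)
    if all(tok in KNOWN_WORDS for tok in tokens):
        return {"complete": True, "preds": [f"_{tok}_rel" for tok in tokens]}
    return None


def shallow_parser(sentence):
    """Stand-in shallow analyser: always yields a flat, less informative result."""
    return {"complete": False, "preds": [f"_{tok}_rel" for tok in tokenize(sentence)]}


# Ordered from deepest to shallowest.
CASCADE = [("deep", deep_parser), ("shallow", shallow_parser)]


def analyse(sentence, analysers=CASCADE):
    """Return (analyser name, analysis) from the first analyser that succeeds."""
    for name, analyser in analysers:
        result = analyser(sentence)
        if result is not None:
            return name, result
    return None, None


print(analyse("Where is the City Hall?")[0])
print(analyse("Where is the Bund?")[0])
```

Because every stage is assumed to emit the same kind of representation, downstream answer matching does not need to know which stage produced it.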
Parse Disambiguation
  • Select the best parse with statistical models (Toutanova et al. 2002)
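
To make the idea of statistical parse selection concrete, here is a minimal sketch of a generic log-linear ranker. It is not claimed to be the model of the cited work; the feature names, weights and candidate parses are invented for illustration.

```python
import math


def score(features, weights):
    """Weighted sum of a parse's feature values; unknown features count as 0."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def rank_parses(parses, weights):
    """Return (parse_id, probability) pairs, most probable first.

    Probabilities come from a softmax over the linear scores.
    """
    scores = {pid: score(feats, weights) for pid, feats in parses.items()}
    max_score = max(scores.values())  # subtract the max for numerical stability
    exps = {pid: math.exp(s - max_score) for pid, s in scores.items()}
    z = sum(exps.values())
    ranked = [(pid, e / z) for pid, e in exps.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical candidate parses of one sentence, described by feature counts.
PARSES = {
    "verb_attach": {"pp_attaches_to_verb": 1, "depth": 4},
    "noun_attach": {"pp_attaches_to_noun": 1, "depth": 5},
}
# Hypothetical weights; in practice these would be estimated from a treebank.
WEIGHTS = {"pp_attaches_to_verb": 0.2, "pp_attaches_to_noun": 1.0, "depth": -0.1}

ranking = rank_parses(PARSES, WEIGHTS)
print(ranking[0][0])
```

The ranker only orders the analyses the grammar already licenses, so it addresses ambiguity rather than coverage.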

Answer Matching with (R)MRS
  • Semantic answer matching
    • Create semantic patterns for each question type.
      • where -> locate_v(e, x1, x2)
    • Semantic distance measurement.
      • pred1(x)&pred2(x) <-> pred1(x)&pred2(y)
  • Query expansion
    • Synonym substitution
    • Semantic structure replacement
      • give_v(e1, x1, x2, x3) => receive_v(e2, x2, x1, x3)
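
The three ingredients above (question-type patterns, a semantic distance, and structure replacement) can be sketched over a simplified representation in which each predication is a tuple of a predicate name followed by its variables. This is an assumption made for illustration; real (R)MRS structures carry more information, and the distance defined here is just one possible choice.

```python
from collections import Counter
from itertools import combinations

# Semantic pattern per question type, as on the slide: where -> locate_v(e, x1, x2).
QUESTION_PATTERNS = {"where": ("locate_v", "e", "x1", "x2")}


def shared_links(predications):
    """Pairs of argument slots that are filled by the same variable."""
    slots = [(pred, i, var) for pred, *args in predications for i, var in enumerate(args)]
    return {
        frozenset([(p1, i1), (p2, i2)])
        for (p1, i1, v1), (p2, i2, v2) in combinations(slots, 2)
        if v1 == v2
    }


def distance(a, b):
    """Assumed distance: unmatched predicates plus differences in variable sharing."""
    preds_a = Counter(p[0] for p in a)
    preds_b = Counter(p[0] for p in b)
    unmatched = sum(((preds_a - preds_b) + (preds_b - preds_a)).values())
    return unmatched + len(shared_links(a) ^ shared_links(b))


def give_to_receive(predication, new_event):
    """give_v(e1, x1, x2, x3) => receive_v(e2, x2, x1, x3), as on the slide."""
    pred, _event, x1, x2, x3 = predication
    if pred != "give_v":
        return predication
    return ("receive_v", new_event, x2, x1, x3)


# pred1(x) & pred2(x) versus pred1(x) & pred2(y): same predicates, different sharing.
same_var = [("pred1", "x"), ("pred2", "x")]
diff_var = [("pred1", "x"), ("pred2", "y")]
print(distance(same_var, diff_var))
print(give_to_receive(("give_v", "e1", "x1", "x2", "x3"), "e2"))
```

Under this definition two structures with identical predicates still differ when their variables are linked differently, which is the contrast the `pred1`/`pred2` line on the slide points at.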
Work Plan
  • Narrow my focus to one of the topics above.
  • Continue the Chinese HPSG grammar development.
References
  • Baldwin, Timothy, Emily M. Bender, Dan Flickinger, Ara Kim and Stephan Oepen (to appear). Road-testing the English Resource Grammar over the British National Corpus. In Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004), Lisbon, Portugal.
  • Ulrich Callmeier. 2002. PET – a platform for experimentation with efficient HPSG processing techniques. In Collaborative Language Engineering. CSLI Publications, Stanford, USA.
  • Hans Uszkoreit. 2002. New chances for deep linguistic processing. In Proc. of the 19th International Conference on Computational Linguistics (COLING 2002), Taipei, Taiwan.
  • Ann Copestake, Dan Flickinger, Ivan A. Sag, and Carl Pollard. 2003. Minimal recursion semantics: An introduction. Under review.
  • Timothy Baldwin and Francis Bond. 2003. Learning the countability of English nouns from corpus data. In Proc. of the 41st Annual Meeting of the ACL, pages 463–70, Sapporo, Japan.
  • Carroll, J. and Fang, A. Automatic Acquisition of Verb Subcategorisations and their Impact on the Performance of an HPSG Parser. IJCNLP 2004.
  • Oepen, Stephan, Dan Flickinger, Kristina Toutanova, Christopher D. Manning. 2002. LinGO Redwoods: A Rich and Dynamic Treebank for HPSG. In Proceedings of The First Workshop on Treebanks and Linguistic Theories (TLT2002), Sozopol, Bulgaria.
  • Toutanova, Kristina, Christopher D. Manning, Stephan Oepen. 2002. Parse Ranking for a Rich HPSG Grammar. In Proceedings of The First Workshop on Treebanks and Linguistic Theories (TLT2002), Sozopol, Bulgaria.
  • Stephan Oepen. [incr tsdb()] - Competence and Performance Laboratory. User Manual. Technical Report. Computational Linguistics. Saarland University (in preparation).
  • Robert Malouf and Gertjan van Noord. 2004. "Wide coverage parsing with stochastic attribute value grammars." In IJCNLP-04 Workshop: Beyond shallow analyses - Formalisms and statistical modeling for deep analyses.
  • Toutanova, Kristina, Christopher D. Manning, Stuart M. Shieber, Dan Flickinger, and Stephan Oepen. 2002. Parse Disambiguation for a Rich HPSG Grammar. First Workshop on Treebanks and Linguistic Theories (TLT2002), pp. 253-263. Sozopol, Bulgaria.