Finding What Matters in Questions

Xiaoqiang Luo, Hema Raghavan, Vittorio Castelli, Sameer Maskey and Radu Florian

IBM T.J. Watson Research Center

NAACL-HLT 2013


Introduction

  • E.g.: “How does one apply for a New York day care license?”

    • Top-scoring result under a bag-of-words model:

      • “New licenses for day care centers in York county, PA”

    • MMP model:

      • Searches with the three phrases “New York,” “day care,” and “license”

    • We call these important phrases mandatory matching phrases (MMPs)
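
The contrast in this example can be sketched in a few lines of Python. This is only an illustration, not the authors' system: the tokenizer, `bow_overlap`, and `contains_phrase` are hypothetical helpers, and phrase matching here is exact (no morphology), which is an assumption.

```python
import re

def tokens(text):
    """Lower-case word tokens; punctuation is dropped."""
    return re.findall(r"[a-z0-9]+", text.lower())

def bow_overlap(question, snippet):
    """Number of distinct question words that also occur in the snippet."""
    return len(set(tokens(question)) & set(tokens(snippet)))

def contains_phrase(phrase, snippet):
    """True if the phrase tokens occur contiguously and in order in the snippet."""
    p, s = tokens(phrase), tokens(snippet)
    return any(s[i:i + len(p)] == p for i in range(len(s) - len(p) + 1))

question = "How does one apply for a New York day care license?"
snippet = "New licenses for day care centers in York county, PA"
mmps = ["New York", "day care", "license"]

# Bag-of-words sees many shared words, even though the snippet is off-topic.
overlap = bow_overlap(question, snippet)

# Phrase-level matching finds only one of the three mandatory phrases.
matched = [m for m in mmps if contains_phrase(m, snippet)]
```

With exact matching, the snippet shares several individual words with the question but contains only one of the three phrases, which is the kind of mismatch MMPs are meant to expose.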

Question Corpus

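  • A subset of the DARPA BOLT corpus containing forum postings in English.

  • Four annotators selected the questions

    • The following five types of questions were excluded:

      • Questions that require reasoning or calculation to answer

      • Questions that are unclear or ambiguous

      • Questions that can be split into multiple questions

      • Multiple-choice questions

      • Factoid questions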

Question Corpus

  • Two annotators labeled each selected question with MMP types (MMP-Must, MMP-Maybe) and their spans

  • Annotated spans are non-overlapping and contiguous

Generate MMP Training Instances

  • MMP type:

    • MMP-Must: +1

    • MMP-Skip: -1

    • MMP-Maybe: -1

  • Output instances:

    • <span, MMP type>

    • E.g.: hedge funds = <(5, 6), +1>

  • Instances shown in the parse-tree figure (word positions 0 to 9, tree depths 0 to 6):

    • <(4, 4), +1>, <(4, 6), +1>, <(5, 6), +1>, <(7, 9), +1>, <(9, 9), +1>
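
One possible in-memory representation of these instances is sketched below. The label mapping comes from the slide; the `Instance` type and `make_instance` helper are hypothetical names, and treating spans as inclusive (start, end) word positions is an assumption based on the "hedge funds" example.

```python
from typing import NamedTuple, Tuple

# Label mapping as listed on the slide.
LABELS = {"MMP-Must": +1, "MMP-Skip": -1, "MMP-Maybe": -1}

class Instance(NamedTuple):
    span: Tuple[int, int]  # inclusive (start, end) word positions (assumed)
    label: int             # +1 or -1

def make_instance(start, end, mmp_type):
    """Build a <span, MMP type> training instance."""
    if start > end:
        raise ValueError("span start must not exceed span end")
    return Instance((start, end), LABELS[mmp_type])

# The slide's example: "hedge funds" covers positions 5 to 6 and is MMP-Must.
hedge_funds = make_instance(5, 6, "MMP-Must")
```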

MMP Features

Lexical Features:

  • CaseFeatures:

    • Is the first word of an MMP upper-case?

    • Is it all capital letters?

    • Does it contain digits?

    • E.g.:

      • For “(NP American)” in Figure 1, the upper-case feature fires.
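
The three case tests might be computed as in the sketch below. The function name and feature keys are hypothetical, and reading "upper-case first word" as "starts with a capital letter" is an assumption chosen so that the feature fires for "American".

```python
def case_features(phrase):
    """Case-related lexical features of a candidate MMP (illustrative only)."""
    words = phrase.split()
    first = words[0] if words else ""
    return {
        # Assumed reading: the first word starts with a capital letter.
        "first_word_upper": first[:1].isupper(),
        # True only if every cased character in the phrase is upper-case.
        "all_caps": phrase.isupper(),
        "has_digit": any(ch.isdigit() for ch in phrase),
    }

american = case_features("American")
```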

MMP Features

Lexical Features:

  • CommonQWord:

    • Does the MMP contain question words, including “What,” “When,” “Who,” etc.
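
A minimal sketch of this test follows. The set contains only the three question words named on the slide; the slide's "etc." is left open, so `Q_WORDS` would need to be extended in practice. The function name and punctuation stripping are assumptions.

```python
# Only the question words named on the slide; the full list is not given there.
Q_WORDS = {"what", "when", "who"}

def common_q_word(phrase):
    """True if any token of the phrase is a known question word."""
    return any(tok.strip(".,?!\"'").lower() in Q_WORDS for tok in phrase.split())
```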

MMP Features

Syntactic Features:

  • PhraseLabel:

    • This feature returns the phrasal label of the MMP.

    • E.g.:

      • For “(NP American)” in Figure 1, the feature value is “NP.”

MMP Features

Syntactic Features:

  • NPUnique:

    • This Boolean feature fires if a phrase is the only NP in a question

    • E.g.:

      • For “(NP American),” the feature value would be false.

MMP Features

Syntactic Features:

  • PosOfPTN:

    • (1) the position of the left-most word of the node

    • (2) whether the left-most word is the beginning of the question

    • (3) the depth of the anchoring node, defined as the length of the path to the root node.

Example of PosOfPTN

  • E.g.: For “(NP American)” in Figure 1 (word positions 1 to 10, tree depths 0 to 6):

    • It is the 5th word in the sentence

    • It is not the first word of the sentence

    • The depth of the node is 6
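
The three PosOfPTN values can be read off a parse tree as in the following sketch. The tree below is a made-up toy example, not Figure 1, and the nested-tuple encoding, `walk`, and `pos_of_ptn` are hypothetical. Word positions are 1-based and depth is the number of edges to the root, matching the definitions above.

```python
def walk(node, depth=0, start=1):
    """Return (label, words, left_pos, depth) for a node and all its descendants.

    A node is a tuple (label, child, ...); a leaf child is a plain string.
    """
    label, children = node[0], node[1:]
    words, records, pos = [], [], start
    for child in children:
        if isinstance(child, str):
            words.append(child)
            pos += 1
        else:
            sub = walk(child, depth + 1, pos)
            records.extend(sub)
            child_words = sub[0][1]  # first record describes the child itself
            words.extend(child_words)
            pos += len(child_words)
    return [(label, words, start, depth)] + records

def pos_of_ptn(tree, label, words):
    """Position features of the first node with the given label and words."""
    for lab, w, left, depth in walk(tree):
        if lab == label and w == words:
            return {"left_pos": left, "at_start": left == 1, "depth": depth}
    return None

# Hypothetical toy parse: "Banks buy American bonds".
tree = ("S",
        ("NP", ("NNS", "Banks")),
        ("VP", ("VBP", "buy"),
               ("NP", ("JJ", "American"), ("NNS", "bonds"))))

features = pos_of_ptn(tree, "NP", ["American", "bonds"])
```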

MMP Features

Syntactic Features:

  • PhrLenToQLenRatio:

    • This feature computes the number of words in an MMP and its ratio to the sentence length.
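
A minimal sketch, assuming both the phrase and the question are already tokenized into word lists (the function name is hypothetical):

```python
def phr_len_to_q_len(phrase_words, question_words):
    """Return (phrase length in words, phrase length / question length)."""
    n = len(phrase_words)
    ratio = n / len(question_words) if question_words else 0.0
    return n, ratio

question = "How does one apply for a New York day care license".split()
length, ratio = phr_len_to_q_len(["New", "York"], question)
```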

MMP Features

Semantic Features (NETypes):

  • The feature tests if a phrase is or contains a named entity and, if so, takes the entity type as its value.

    • Entities come from an information extraction (IE) pipeline consisting of syntactic parsing, mention detection and coreference resolution (Florian et al., 2004; Luo et al., 2004; Luo and Zitouni, 2005)

  • E.g.: For “(NP American)” in Figure 1, the feature value would be “GPE.”

MMP Features

Corpus-based Features (AvgCorpusIDF):

  • This group of features computes the average of the IDFs of the words in this phrase.

    • Have stop words
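
The averaging step could look like the sketch below. The slides do not state which IDF variant is used, so the unsmoothed form log(N / df) is an assumption, as are the function names and the toy corpus.

```python
import math

def build_idf(docs):
    """IDF table from tokenized documents, using log(N / df) (assumed variant)."""
    n = len(docs)
    df = {}
    for doc in docs:
        for w in set(doc):
            df[w] = df.get(w, 0) + 1
    return {w: math.log(n / c) for w, c in df.items()}

def avg_corpus_idf(phrase_words, idf, default=0.0):
    """Average IDF of the words in a phrase; unseen words get `default`."""
    if not phrase_words:
        return 0.0
    return sum(idf.get(w, default) for w in phrase_words) / len(phrase_words)

# Hypothetical toy corpus of four tokenized documents.
docs = [["day", "care", "license"], ["day", "trip"],
        ["new", "york", "license"], ["the", "day"]]
idf = build_idf(docs)
score = avg_corpus_idf(["day", "care"], idf)
```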

MMP Classification Results

Classifier:

  • logistic regression binary classifier using WEKA.

    Data set:

Performance of the MMP Classifier

Data for Relevance Model

  • From the BOLT IR task (IR, 2012)

  • Top snippets returned by the search engine are judged for relevancy by our annotators.

Relevance Prediction

  • The relevance model is a conditional distribution P(r|q, s;D)

    • where r is a binary random variable indicating if the candidate snippet s is relevant to the question q.

    • D is the document where the snippet s is found.

Relevance Prediction

Baseline system

  • (1) Text Match Features

    • Cosine scores between the query and the snippet

  • (2) Answer Type Features:

    • The top 3 predictions of a statistical classifier trained to predict answer categories were used as features.
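
For the Text Match Features, a cosine score between a query and a snippet can be sketched as follows. The slides do not say how terms are weighted, so raw term counts are an assumption here, and the function name is hypothetical.

```python
import math
from collections import Counter

def cosine(query_words, snippet_words):
    """Cosine similarity between raw term-count vectors (assumed weighting)."""
    q, s = Counter(query_words), Counter(snippet_words)
    dot = sum(q[w] * s[w] for w in q)
    norm = (math.sqrt(sum(c * c for c in q.values()))
            * math.sqrt(sum(c * c for c in s.values())))
    return dot / norm if norm else 0.0

score = cosine("day care license".split(), "day care center".split())
```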

Relevance Prediction

Baseline system

  • (3) Mention Match Features

    • whether a named entity in the query occurs in the snippet.

Relevance Prediction

Baseline system

  • (4) Event match features

    • Uses several hand-crafted dictionaries containing terms exclusive to various types of events, such as “violence,” “legal,” and “election.”

    • If both the query and the snippet contain the same event type:

      • The feature takes the value 1
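
The event match test might be implemented along these lines. The event type names come from the slide, but the dictionary entries below are invented for illustration; the actual hand-crafted dictionaries are not shown in the slides.

```python
# Hypothetical dictionary contents; only the event type names are from the slide.
EVENT_TERMS = {
    "violence": {"attack", "riot"},
    "legal": {"court", "lawsuit"},
    "election": {"vote", "ballot"},
}

def event_types(words):
    """Set of event types whose dictionary shares a term with the words."""
    lowered = {w.lower() for w in words}
    return {etype for etype, terms in EVENT_TERMS.items() if terms & lowered}

def event_match(query_words, snippet_words):
    """1 if the query and the snippet share an event type, else 0."""
    return 1 if event_types(query_words) & event_types(snippet_words) else 0
```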

Relevance Prediction

Baseline system

  • (5) Snippet Statistics:

    • Features such as snippet length, the position of the snippet in the post, etc. were created.

Relevance Prediction

Features Derived from MMP

  • HardMatch:

    • Let I(m ∈ s) be an indicator function that is 1 if the snippet s contains the MMP m and 0 otherwise

Relevance Prediction

Features Derived from MMP

  • SoftLMMatch:

    • The SoftLMMatch score is a language-model (LM) based score, similar to that used in (Bendersky and Croft, 2008), except that MMPs play the role of concepts.

Relevance Prediction

Features Derived from MMP

  • SoftLMMatch:

    • where wi is the i-th word in snippet s

    • I(wi = v) is an indicator function, taking value 1 if wi is v and 0 otherwise

    • |V | is the vocabulary size

Relevance Prediction

Features Derived from MMP

  • MMPInclScore:

    • where w ∈ m are the words in m

    • I(・) is the indicator function taking value 1 when the argument is true and 0 otherwise

    • is a constant threshold

    • l(w, s) is the similarity of word w to the snippet s, defined as:

      • l(w, s) = max_{v ∈ s} JW(w, v)

      • JW(w, v) is the Jaro-Winkler similarity score between words w and v
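
The word-to-snippet similarity l(w, s) can be sketched in plain Python. This is a textbook implementation of Jaro-Winkler with the usual prefix scale of 0.1 and a prefix cap of 4 characters; whether the authors used these exact settings is not stated in the slides. The function names are hypothetical.

```python
def jaro(a, b):
    """Standard Jaro similarity between two strings."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit, b_hit = [False] * len(a), [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_hit[j] and b[j] == ch:
                a_hit[i] = b_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    a_seq = [a[i] for i in range(len(a)) if a_hit[i]]
    b_seq = [b[j] for j in range(len(b)) if b_hit[j]]
    # Each out-of-order pair is counted twice, hence the division by 2.
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3

def jaro_winkler(a, b, p=0.1):
    """Jaro similarity boosted by the common prefix (at most 4 characters)."""
    j = jaro(a, b)
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return j + prefix * p * (1 - j)

def word_snippet_similarity(w, snippet_words):
    """l(w, s): the best Jaro-Winkler score of w against any snippet word."""
    return max((jaro_winkler(w, v) for v in snippet_words), default=0.0)

sim = word_snippet_similarity("license", ["new", "licenses", "for"])
```

Unlike exact matching, this similarity gives "license" a high score against "licenses", so near-variants of an MMP word still count toward inclusion.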

Relevance Prediction

Features Derived from MMP

  • MMPInclScore:

    • The MMP weighted inclusion score between the question q and snippet s is computed as:

Relevance Prediction

Features Derived from MMP

  • MMPRankDep:

    • This feature, RD(q, s), first tests whether there exists a matched bilexical dependency between q and s;

Relevance Prediction

Features Derived from MMP

  • MMPRankDep:

    • Let m(i) be the ith ranked MMP

    • Let <wh, wd | q> and <uh, ud | s> be bilexical dependencies from q and s, respectively

      • wh and uh are the heads

      • wd and ud are the dependents

Relevance Prediction

Features Derived from MMP

  • MMPRankDep:

    • EQ(w, u)

      • EQ(w, u) is true if w and u are exactly the same, or their morphs are the same, or they head the same entity, or their synsets in WordNet overlap

    • RD(q, s)

      • RD(q, s) is true if and only if

        • EQ(wh, uh) ∧ EQ(wd, ud) ∧ wh ∈ m(i) ∧ wd ∈ m(j) is true for some <wh, wd | q>, for some <uh, ud | s> and for some i and j.
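
The RD(q, s) test can be sketched as below. To keep the example self-contained, EQ is reduced to a case-insensitive exact match; the morph, entity, and WordNet conditions are omitted, which is a simplification and not the full definition. Dependencies are given as (head, dependent) pairs, and the dependency lists in the example are hypothetical.

```python
def eq(w, u):
    """Simplified EQ: case-insensitive exact match only (assumption)."""
    return w.lower() == u.lower()

def in_some_mmp(word, mmps):
    """True if the word belongs to at least one MMP (each MMP is a token list)."""
    return any(word in m for m in mmps)

def rank_dep(q_deps, s_deps, mmps):
    """RD(q, s): some question dependency whose head and dependent both lie in
    MMPs is matched by some snippet dependency under EQ."""
    for wh, wd in q_deps:
        if not (in_some_mmp(wh, mmps) and in_some_mmp(wd, mmps)):
            continue
        if any(eq(wh, uh) and eq(wd, ud) for uh, ud in s_deps):
            return True
    return False

# Hypothetical dependencies for the day care example.
mmps = [["New", "York"], ["day", "care"], ["license"]]
q_deps = [("York", "New"), ("care", "day"), ("apply", "license")]
```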

Relevance Prediction

Three snippet classifier models

  • noMMP model

    • a system without MMP features;

  • IDF-as-MMP model

    • a baseline with each word as an MMP and the word’s IDF as the MMP score.

  • MMP model

Relevance Prediction

Performance of the three snippet classifier systems

End-to-End System Results

  • The question-answering system was used in the 2012 BOLT IR evaluation (IR, 2012)

    • The collection contains 499K Arabic, 449K Chinese and 262K English threads.

    • The Arabic and Chinese posts were first translated into English before being processed.

End-to-End System Results

  • performance

BOLT Evaluation Results

  • The BOLT evaluation consists of 146 questions, mostly event- or topic-related

BOLT Evaluation Results

Conclusions

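  • The authors present a QA system that uses mandatory matching phrases (MMPs)

  • MMP extraction from questions reaches an F-measure of 88.6%

  • Combining the MMP model with the snippet relevance model effectively improves the relevance model's performance

  • The MMP-based QA system was the best-performing system in the 2012 BOLT IR evaluation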
