Finding What Matters in Questions

Finding What Matters in Questions • Xiaoqiang Luo, Hema Raghavan, Vittorio Castelli, Sameer Maskey and Radu Florian • IBM T.J. Watson Research Center • NAACL-HLT 2013

Introduction • E.g.: “How does one apply for a New York day care license?” • Top-scoring result under a bag-of-words model: • “New licenses for day care centers in York county, PA” • MMP model: • searches with the three phrases “New York,” “day care,” and “license” • We call these important phrases mandatory matching phrases (MMPs)

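Question Corpus • A subset of the DARPA BOLT corpus containing forum postings in English. • Four people selected the questions • The following five kinds of questions were excluded: • questions that require reasoning or calculation to answer • questions that are vaguely described or ambiguous • questions that can be split into several questions • multiple choice questions • factoid questions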

Question Corpus • Two annotators labeled each selected question with the type (MMP-Must, MMP-Maybe) and span of every MMP • E.g.: annotated spans are contiguous and do not overlap

Generate MMP Training Instances

Generate MMP Training Instances • MMP type: • MMP-Must: +1 • MMP-Skip: -1 • MMP-Maybe: -1 • Output instances: • <span, MMP type> • E.g.: hedge funds = <(5, 6), +1> • Instances marked on the parse-tree figure (word positions 0 to 9, node depths 0 to 6): <(4, 4), +1>, <(4, 6), +1>, <(5, 6), +1>, <(7, 9), +1>, <(9, 9), +1>

MMP Features • Lexical Features: • CaseFeatures: • Is the first word of an MMP upper-case? • Is it all capital letters? • Does it contain digits? • E.g.: • For “(NP American)” in Figure 1, the upper-case feature fires.

MMP Features • Lexical Features: • CommonQWord: • Does the MMP contain a question word such as “What,” “When,” or “Who”?
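The two lexical feature groups above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `lexical_features`, the whitespace tokenization, and the contents of `QUESTION_WORDS` beyond “What,” “When,” and “Who” are assumptions.

```python
# Hypothetical question-word list; the slides only name "What", "When", "Who".
QUESTION_WORDS = {"what", "when", "who", "where", "why", "which", "how"}


def lexical_features(phrase):
    """Return the case features and the CommonQWord feature for a phrase."""
    words = phrase.split()  # assumption: whitespace tokenization
    return {
        # Is the first word of the phrase capitalized?
        "first_word_upper": bool(words) and words[0][0].isupper(),
        # Is the whole phrase written in capital letters?
        "all_caps": phrase.isupper(),
        # Does the phrase contain at least one digit?
        "has_digit": any(ch.isdigit() for ch in phrase),
        # Does the phrase contain a common question word?
        "common_q_word": any(w.lower() in QUESTION_WORDS for w in words),
    }
```

For the phrase “American” this sketch fires `first_word_upper` and leaves the other three features off.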

MMP Features • Syntactic Features: • PhraseLabel: • This feature returns the phrasal label of the MMP. • E.g.: • For “(NP American)” in Figure 1, the feature value is “NP.”

MMP Features • Syntactic Features: • NPUnique: • This Boolean feature fires if a phrase is the only NP in a question. • E.g.: • For “(NP American),” the feature value would be false.

MMP Features • Syntactic Features: • PosOfPTN: • (1) the position of the left-most word of the node • (2) whether the left-most word is the beginning of the question • (3) the depth of the anchoring node, defined as the length of the path to the root node.

Example of PosOfPTN • E.g.: For “(NP American)” in Figure 1 (word positions 1 to 10, node depths 0 to 6): • it is the 5th word in the sentence • it is not the first word of the sentence • the depth of the node is 6

MMP Features • Syntactic Features: • PhrLenToQLenRatio: • This feature computes the number of words in an MMP and its ratio to the sentence length.
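A short sketch of PhrLenToQLenRatio, assuming both the phrase and the question are already tokenized into word lists. The function name is hypothetical, and whether punctuation counts toward the sentence length is not stated on the slide.

```python
def phr_len_to_q_len_ratio(phrase_words, question_words):
    """Return (number of words in the phrase, phrase length / question length)."""
    n = len(phrase_words)
    # Guard against an empty question to avoid dividing by zero.
    ratio = n / len(question_words) if question_words else 0.0
    return n, ratio


question = "How does one apply for a New York day care license".split()
length, ratio = phr_len_to_q_len_ratio(["New", "York"], question)
```

For the question from the Introduction, “New York” has 2 words out of 11, so the ratio is 2/11.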

MMP Features • Semantic Features (NETypes): • This feature tests whether a phrase is or contains a named entity; if so, its value is the entity type. • It relies on an information extraction (IE) pipeline consisting of syntactic parsing, mention detection and coreference resolution (Florian et al., 2004; Luo et al., 2004; Luo and Zitouni, 2005) • E.g.: For “(NP American)” in Figure 1, the feature value would be “GPE.”

MMP Features • Corpus-based Features (AvgCorpusIDF): • This group of features computes the average of the IDFs of the words in this phrase. • Have stop words
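The averaging step can be sketched as below. The slide does not give the IDF formula, so the smoothed form log((N + 1) / (df + 1)) is an assumption, as are the helper names `build_idf` and `avg_corpus_idf`, the whitespace tokenization, and the choice to average over every word without filtering stop words.

```python
import math
from collections import Counter


def build_idf(documents):
    """Build an IDF lookup from a list of document strings."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        # Count each word once per document.
        doc_freq.update(set(doc.lower().split()))
    # Assumed smoothing: add one to both counts so unseen words stay finite.
    return lambda w: math.log((n_docs + 1) / (doc_freq[w.lower()] + 1))


def avg_corpus_idf(phrase, idf):
    """Average the IDF values of the words in a phrase."""
    words = phrase.split()
    if not words:
        return 0.0
    return sum(idf(w) for w in words) / len(words)


corpus = [
    "new york day care",
    "new licenses in york county",
    "the day of the week",
]
idf = build_idf(corpus)
score = avg_corpus_idf("day care", idf)
```

In this toy corpus “care” occurs in fewer documents than “day”, so it contributes the larger IDF to the average.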

MMP Classification Results • Classifier: • logistic regression binary classifier using WEKA. • Data set:

Performance of the MMP classifier

Example Questions by MMP Model

Data for Relevance Model • From the BOLT-IR task (IR, 2012) • The top snippets returned by the search engine are judged for relevance by our annotators.

Relevance Prediction • The relevance model is a conditional distribution P(r | q, s; D) • where r is a binary random variable indicating whether the candidate snippet s is relevant to the question q. • D is the document where the snippet s is found.

Relevance Prediction • Baseline system • (1) Text Match Features: • cosine scores between the query and the snippet • (2) Answer Type Features: • The top 3 predictions of a statistical classifier trained to predict answer categories were used as features.
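As an illustration of the text match feature, the following sketch computes a cosine score between a query and a snippet. The slide does not say how terms are weighted, so raw term counts, lowercasing, and whitespace tokenization are assumptions, and `cosine_score` is a hypothetical name.

```python
import math
from collections import Counter


def cosine_score(query, snippet):
    """Cosine similarity between raw term-count vectors of two texts."""
    q = Counter(query.lower().split())
    s = Counter(snippet.lower().split())
    # Dot product over the query terms; missing snippet terms count as 0.
    dot = sum(count * s[term] for term, count in q.items())
    norm = math.sqrt(sum(c * c for c in q.values())) * math.sqrt(
        sum(c * c for c in s.values())
    )
    # An empty text has zero norm; define its score as 0.
    return dot / norm if norm else 0.0
```

Identical texts score 1 (up to floating-point rounding) and texts with no shared term score 0.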

Relevance Prediction • Baseline system • (3) Mention Match Features: • whether a named entity in the query occurs in the snippet.

Relevance Prediction • Baseline system • (4) Event Match Features: • use several hand-crafted dictionaries containing terms exclusive to various types of events such as “violence”, “legal”, and “election”. • If both the query and the snippet contain the same event type, the feature takes the value 1.

Relevance Prediction • Baseline system • (5) Snippet Statistics: • features such as the snippet length and the position of the snippet in the post were created.

Relevance Prediction • Features Derived from MMP • HardMatch: • Let I(m ∈ s) be a 0/1 function indicating whether a snippet s contains the MMP m
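The indicator I(m ∈ s) can be sketched as a contiguous token match. This covers only the indicator; how the indicators are combined into the HardMatch score is not recoverable from the slide and is left out. The helper names, the regex tokenization, and the case-insensitive comparison are assumptions.

```python
import re


def tokens(text):
    """Lowercase a text and split it into alphanumeric tokens."""
    return re.findall(r"\w+", text.lower())


def hard_match(mmp, snippet):
    """Return 1 if the MMP occurs in the snippet as a contiguous token run, else 0."""
    m, s = tokens(mmp), tokens(snippet)
    if not m:
        return 0
    # Slide a window of len(m) tokens across the snippet.
    return int(any(s[i:i + len(m)] == m for i in range(len(s) - len(m) + 1)))


snippet = "New licenses for day care centers in York county, PA"
hits = {m: hard_match(m, snippet) for m in ["New York", "day care", "license"]}
```

On the Introduction example, only “day care” is found: “New” and “York” are not adjacent in the snippet, and “licenses” is not an exact match for “license”.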

Relevance Prediction • Features Derived from MMP • SoftLMMatch: • The SoftLMMatch score is a language-model (LM) based score, similar to that used in (Bendersky and Croft, 2008), except that MMPs play the role of concepts.

Relevance Prediction • Features Derived from MMP • SoftLMMatch: • where w_i is the ith word in snippet s • I(w_i = v) is an indicator function, taking value 1 if w_i is v and 0 otherwise • |V| is the vocabulary size

Relevance Prediction • Features Derived from MMP • MMPInclScore: • where w ∈ m are the words in m • I(·) is the indicator function taking value 1 when the argument is true and 0 otherwise • the threshold is a constant • l(w, s) is the similarity of word w to the snippet s, defined as: • l(w, s) = max_{v ∈ s} JW(w, v) • JW(w, v) is the Jaro-Winkler similarity score between words w and v
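The word-to-snippet similarity l(w, s) can be sketched with a hand-written Jaro-Winkler function. This is a standard textbook formulation (prefix scale 0.1, prefix capped at 4 characters), not necessarily the variant the authors used. The threshold value 0.9 and the helper `soft_word_hits` are assumptions, and the final inclusion score is not reproduced because its formula is missing from the slide.

```python
def jaro(a, b):
    """Jaro similarity between two strings."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit, b_hit = [False] * len(a), [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        # Look for an unused matching character inside the window.
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_hit[j] and b[j] == ch:
                a_hit[i] = b_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    a_seq = [c for c, hit in zip(a, a_hit) if hit]
    b_seq = [c for c, hit in zip(b, b_hit) if hit]
    # Half the number of matched characters that appear in a different order.
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(a, b, prefix_scale=0.1):
    """Jaro-Winkler similarity: Jaro boosted by a shared prefix of up to 4 chars."""
    j = jaro(a, b)
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return j + prefix * prefix_scale * (1 - j)


def word_to_snippet_similarity(w, snippet_words):
    """l(w, s): the best Jaro-Winkler score of w against any snippet word."""
    return max((jaro_winkler(w, v) for v in snippet_words), default=0.0)


def soft_word_hits(mmp_words, snippet_words, threshold=0.9):
    """Per-word indicators: 1 if l(w, s) exceeds the (assumed) threshold."""
    return [int(word_to_snippet_similarity(w, snippet_words) > threshold)
            for w in mmp_words]
```

Unlike an exact match, this soft comparison lets “license” match “licenses” in the snippet from the Introduction, while “york” still fails to match “day” or “care”.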

Relevance Prediction • Features Derived from MMP • MMPInclScore: • The MMP weighted inclusion score between the question q and snippet s is computed as:

Relevance Prediction • Features Derived from MMP • MMPRankDep: • This feature, RD(q, s), first tests whether there exists a matched bilexical dependency between q and s;

Relevance Prediction • Features Derived from MMP • MMPRankDep: • Let m(i) be the ith ranked MMP • Let <w_h, w_d | q> and <u_h, u_d | s> be bilexical dependencies from q and s, respectively • w_h and u_h are the heads • w_d and u_d are the dependents

Relevance Prediction • Features Derived from MMP • MMPRankDep: • EQ(w, u) is true if w and u are exactly the same, or their morphs are the same, or they head the same entity, or their synsets in WordNet overlap • RD(q, s) is true if and only if • EQ(w_h, u_h) ∧ EQ(w_d, u_d) ∧ w_h ∈ m(i) ∧ w_d ∈ m(j) is true for some <w_h, w_d | q>, some <u_h, u_d | s>, and some i and j.
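The RD(q, s) test can be sketched as a nested loop over dependency pairs. This simplified version implements only the exact-match case of EQ (case-insensitive); the morphological, entity, and WordNet cases are omitted. Representing dependencies as `(head, dependent)` tuples and MMPs as plain strings is an assumption, as are the function names.

```python
def eq(w, u):
    """Simplified EQ: exact match only (morph, entity, WordNet cases omitted)."""
    return w.lower() == u.lower()


def rank_dep(q_deps, s_deps, ranked_mmps):
    """RD(q, s): is there a question dependency whose head and dependent both
    lie in some MMP and that is matched by a snippet dependency?"""
    # w_h may come from m(i) and w_d from m(j) for any i and j, so the union
    # of all MMP words is enough for the membership test.
    mmp_words = {w.lower() for mmp in ranked_mmps for w in mmp.split()}
    for w_head, w_dep in q_deps:
        if w_head.lower() not in mmp_words or w_dep.lower() not in mmp_words:
            continue
        for u_head, u_dep in s_deps:
            if eq(w_head, u_head) and eq(w_dep, u_dep):
                return True
    return False


mmps = ["New York", "day care", "license"]
q_deps = [("license", "care"), ("apply", "license")]
```

With these inputs, a snippet dependency `("license", "care")` makes the test true, while `("apply", "license")` does not, because “apply” belongs to no MMP.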

Relevance Prediction • Three snippet classifier models • noMMP model: • a system without MMP features • IDF-as-MMP model: • a baseline with each word as an MMP and the word’s IDF as the MMP score • MMP model

Relevance Prediction • Performance of the three snippet classifier systems

End-to-End System Results • The question-answering system was used in the 2012 BOLT IR evaluation (IR, 2012) • The corpus contains 499K Arabic, 449K Chinese and 262K English threads. • The Arabic and Chinese posts were first translated into English before being processed.

End-to-End System Results • Performance

BOLT Evaluation Results • The BOLT evaluation consists of 146 questions, mostly event- or topic-related

BOLT Evaluation Results

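Conclusions • The authors present a QA system that uses mandatory matching phrases (MMPs) • Extracting MMPs from questions reaches an F-measure of 88.6% • Combining the MMP model with the snippet relevance model effectively improves the relevance model’s performance • The MMP-based QA system was the best-performing system in the 2012 BOLT IR evaluation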
