
Question Answering

Gideon Mann

Johns Hopkins University

gsm@cs.jhu.edu


Information Retrieval Tasks

  • Retired General Wesley Clark

  • How old is General Clark?

  • How long did Clark serve in the military?

  • Will Clark run for President?


Ad-Hoc Queries

  • Prior work has been concerned mainly with answering ad-hoc queries:

    General Clark

  • Typically a few words long, not an entire question

  • What is desired is general information about the subject in question


Answering Ad-Hoc Queries

  • Main focus of Information Retrieval for the past 2-3 decades

  • Solution(s):

    • Vector-based methods

    • SVD, query expansion, language modeling

    • Return a page as an answer

  • Resulting systems are extremely useful

    • Google, AltaVista
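To make the vector-based idea concrete, here is a minimal sketch of ad-hoc retrieval with tf-idf vectors and cosine similarity, using scikit-learn; the toy documents and query are invented for illustration.

```python
# Minimal vector-space ad-hoc retrieval: documents and query live in the
# same tf-idf space, and whole pages are ranked by cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "General Wesley Clark retired from the military in 2000.",
    "Clark served 36 years in the U.S. Army.",
    "The stock market fell sharply on Monday.",
]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(docs)             # documents -> tf-idf vectors
query_vector = vectorizer.transform(["General Clark"])   # query in the same space

# Rank documents by similarity to the query; note it returns pages, not facts.
scores = cosine_similarity(query_vector, doc_vectors).ravel()
for i in scores.argsort()[::-1]:
    print(f"{scores[i]:.3f}  {docs[i]}")
```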


Traditional IR

Query + Document Collection → Document Retrieval → Document Ranking


But not all queries are Ad-Hoc!

How old is General Clark?

  • Does not fit well into an Ad-hoc paradigm

    • “How” and “is” are not relevant for appropriate retrieval

    • Potentially useful cues in the question are ignored in a traditional ad-hoc retrieval system


Documents are not Facts

  • Traditional IR systems return Pages

    • Useful when only a vague information need has been identified

  • Insufficient when a fact is desired:

    • How old is General Clark? → 58

    • How long did Clark serve in the military? → 36 years

    • Will Clark run for president? → Maybe


Question Answering as Retrieval

Given a document collection and a question:

A question answering system should retrieve a short snippet of text which exactly answers the question asked.


Question Answering

Query + Document Collection → Document Retrieval → Document Ranking → (Sentence Ranking) → Answer Extraction → Ranked Answers
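A hedged sketch of this pipeline as code; every stage is reduced to a toy stand-in (term-overlap retrieval, regex-based extraction), so it shows the flow of data rather than any real system.

```python
# Toy stand-ins for each stage of the pipeline above; all names and the
# scoring scheme are illustrative, not an actual system's.
import re

def retrieve(query_terms, docs):
    """Document retrieval: rank documents by query-term overlap."""
    return sorted(docs, key=lambda d: -len(query_terms & set(d.lower().split())))

def answer_question(question, docs, type_pattern):
    """Pipeline: document retrieval -> sentence ranking -> answer extraction."""
    terms = set(question.lower().rstrip("?").split()) - {"how", "is", "did", "the"}
    answers = []
    for doc in retrieve(terms, docs):
        for sentence in doc.split("."):
            overlap = len(terms & set(sentence.lower().split()))   # sentence ranking
            for candidate in re.findall(type_pattern, sentence):   # typed extraction
                answers.append((overlap, candidate))
    return [c for _, c in sorted(answers, reverse=True)]           # ranked answers

docs = ["General Clark, from Little Rock, turns 58 this December."]
print(answer_question("How old is General Clark?", docs, r"\b\d{1,3}\b"))  # ['58']
```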


QA as a Comprehension Task

  • For perfect recall, the system must be able to find an answer even if it appears only once in the collection.

  • In essence, this forces the QA system to function as a text understanding system

  • Thus QA may be interesting not only for retrieval, but also as a test of understanding


QA as a stepping stone

  • Current QA is focused on fact extraction

    • Answers appear verbatim in text

      How old is General Clark?

  • How can we answer questions whose answers don’t appear verbatim in the text?

    How long has Clark been in the military?

    Will Clark run for President?

  • Perhaps by building on facts extracted by low-level QA


QA Methods

Two Main Categories of Methods for Question Answering

  • Answer Preference Matching

  • Answer Context Matching


Lecture Outline

  • Answer Preferences

    • Question Analysis

    • Type identification

    • Learning Answer Typing

  • Answer Context

    • Learning Context Similarity

    • Alignment

    • Surface Text Patterns


Answer Type Identification

From the question itself, infer the likely type of the answer:

  • How old is General Clark?

    • How old → ?

  • When did Clark retire?

    • When → ?

  • Who is the NBC war correspondent?

    • Who → ?


NASSLI!

  • April 12 Deadline –

    • Could be extended….

    • Mail hale@jhu.edu to ask for more time


Answer Type Identification

From the question itself, infer the likely type of the answer:

  • How old is General Clark?

    • How old → Age

  • When did Clark retire?

    • When → Date

  • Who is the NBC war correspondent?

    • Correspondent → Person
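A minimal rule-based sketch of this inference step; the wh-phrase rules and the type inventory below are illustrative, not any particular system's.

```python
# Rule-based answer type identification from the question's wh-phrase.
def infer_answer_type(question):
    q = question.lower()
    if q.startswith("how old"):
        return "AGE"
    if q.startswith("how long"):
        return "DURATION"
    if q.startswith("when"):
        return "DATE"
    if q.startswith("who"):
        return "PERSON"   # an over-commitment: "who" can also yield organizations
    return "UNKNOWN"

print(infer_answer_type("How old is General Clark?"))  # AGE
print(infer_answer_type("When did Clark retire?"))     # DATE
```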



Difficult to Enumerate All Possibilities Though

What is the service ceiling for a PAC750?


WordNet

[WordNet hierarchy fragment: length, wingspan, diameter, radius, altitude, ceiling]

WordNet For Answer Typing

[WordNet hierarchy fragment, each node mapped to NUMBER: length, wingspan, diameter, radius, altitude, ceiling]

What is the service ceiling for a PAC750?


Lecture Outline

  • Answer Preferences

    • Question Analysis

    • Type identification

    • Learning Answer Typing

  • Answer Context

    • Learning Context Similarity

    • Alignment

    • Surface Text Patterns


Answer Typing gives the Preference…

  • From Answer Typing, we have the preferences imposed by the question

  • But in order to use those preferences, we must have a way to detect potential candidate answers


Some are Simple…

  • Number → [0-9]+

  • Date → ($month) ($day) ($year)

  • Age → 0–100
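The simple types above can be implemented directly as regular expressions; in the sketch below, the exact month list and the age bounds are assumptions.

```python
# Regex detectors for the "easy" answer types on this slide.
import re

PATTERNS = {
    "NUMBER": r"\b[0-9]+\b",
    "DATE":   r"\b(?:January|February|March|April|May|June|July|August|"
              r"September|October|November|December)\s+\d{1,2},?\s+\d{4}\b",
    "AGE":    r"\b(?:[1-9]?[0-9]|100)\b",   # integers 0-100
}

sentence = "General Clark turns 58 this December 23, 2002."
for answer_type, pattern in PATTERNS.items():
    print(answer_type, re.findall(pattern, sentence))
```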


… Others Complicated

  • Who shot Martin Luther King?

    • Person preference

    • Requires a Named Entity Identifier

  • Who saved Chrysler from bankruptcy?

    • Not just confined to people…

    • Need a Tagger to identify appropriate candidates


Use WordNet for Type Identification

“What 20th century poet wrote Howl?”

[WordNet path: communicator → writer → poet]

Candidate Set: Rilke, Ginsberg, Frost
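A sketch of this lookup with NLTK's WordNet interface (it assumes the corpus has been fetched with nltk.download("wordnet")); it checks whether a candidate word is a hyponym of the type word suggested by the question.

```python
# WordNet-based type check: does some sense of `candidate` sit below
# some sense of `target` in the hypernym hierarchy?
from nltk.corpus import wordnet as wn

def is_a(candidate, target):
    """True if a sense of `candidate` has a sense of `target` among its hypernyms."""
    targets = set(wn.synsets(target))
    for sense in wn.synsets(candidate):
        if targets & set(sense.closure(lambda s: s.hypernyms())):
            return True
    return False

print(is_a("poet", "writer"))    # True: poet -> writer -> communicator -> ...
print(is_a("poet", "vehicle"))   # False
```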


Simple Answer Extraction

How old is General Clark? → Age

General Clark, from Little Rock, Arkansas, turns 58 after serving 36 years in the service, this December 23, 2002.

Age Tagger:

General Clark, from Little Rock, Arkansas, turns [58]AGE after serving 36 years in the service, this December 23, 2002.


Lecture Outline

  • Answer Preferences

    • Question Analysis

    • Type identification

    • Learning Answer Typing

  • Answer Context

    • Learning Context Similarity

    • Alignment

    • Surface Text Patterns


Learning Answer Typing

  • What is desired is a model which predicts P(type|question)

  • Usually a variety of possible types

    • Who

      • Person (“Who shot Kennedy?” → Oswald)

      • Organization (“Who rescued Chrysler from bankruptcy?” → The Government)

      • Location (“Who won the Superbowl?” → New England)


What training data?

  • Annotated Questions

    • “Who shot Kennedy?” → [PERSON]

  • Problems :

    • Expensive to annotate

    • Must be redone every time the tag set is revised


Trivia Questions!

  • Alternatively, use unannotated trivia questions:

    • Q: “Who shot Kennedy?”

    • A: Lee Harvey Oswald

  • Run your type tagger over the answers to get tags:

    • A: Lee Harvey Oswald → [PERSON]


MI Model

  • From the tags, you can build an MI model

    • Predict from the question head word:

      MI(head word, type tag) = P(type tag | head word) / P(type tag)

    • From this you can judge the fit of a question/word pair

    • (Mann 2001)
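A sketch of estimating this statistic from co-occurrence counts of (head word, type tag) pairs; the toy counts below are invented.

```python
# Estimate MI(head word, tag) = P(tag | head word) / P(tag) from counts.
from collections import Counter

pairs = [("poet", "PERSON"), ("poet", "PERSON"), ("poet", "DATE"),
         ("year", "DATE"), ("year", "DATE"), ("ceiling", "NUMBER")]

pair_counts = Counter(pairs)
word_counts = Counter(w for w, _ in pairs)
tag_counts = Counter(t for _, t in pairs)
total = len(pairs)

def mi(head_word, tag):
    """The ratio defined above; > 1 means the head word prefers this tag."""
    p_tag_given_word = pair_counts[(head_word, tag)] / word_counts[head_word]
    p_tag = tag_counts[tag] / total
    return p_tag_given_word / p_tag

print(mi("poet", "PERSON"))  # 2.0: "poet" questions prefer PERSON answers
```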


MaxEnt Model

  • Rather than using the head word alone, train on the entire set of words and build a Maximum Entropy model that combines features suggested by the entire phrase:

    “What was the year in which Castro was born?”

    (Ittycheriah et al. 2001)
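A sketch of the idea using logistic regression (a maximum-entropy classifier) over bag-of-words features from the whole question; the training questions are a toy stand-in, not Ittycheriah et al.'s data or feature set.

```python
# MaxEnt-style answer-type classification over whole-question features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

questions = ["What was the year in which Castro was born?",
             "Who shot Kennedy?",
             "When did Clark retire?",
             "Who rescued Chrysler from bankruptcy?"]
types = ["DATE", "PERSON", "DATE", "ORGANIZATION"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(questions)                    # bag-of-words features
model = LogisticRegression(max_iter=1000).fit(X, types)

# Likely DATE on this toy data: "year" outweighs the wh-word alone.
print(model.predict(vectorizer.transform(["What year did the war end?"])))
```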


Maybe you don’t even need training data!

  • Look at occurrences of words in text, and at what types occur next to them

  • Use these co-occurrence statistics to determine appropriate type of answer

  • (Prager et al. 2002)


Lecture Outline

  • Answer Preferences

    • Question Analysis

    • Type identification

    • Learning Answer Typing

  • Answer Context

    • Learning Context Similarity

    • Alignment

    • Surface Text Patterns


Is Answer Typing Enough?

  • Even when you’ve found the correct sentence and know the type of the answer, a lot of ambiguity in the answer still remains

  • Some experiments show that a sentence which answers a question typically contains around 2-3 candidates of the appropriate type

  • For high-precision systems, this is unacceptable


Answer Context

Who shot Martin Luther King?

(“Who” carries the answer preference; “shot Martin Luther King” is the answer context)


Using Context

  • Many systems simply look for an answer of the correct type in a context which seems appropriate

    • Many matching keywords

    • Perhaps using query expansion


Another alternative

  • If the question is “Who shot Kennedy?”

  • Search for all exact phrase matches:

    • “X shot Kennedy”

  • And simple alternations:

    • “Kennedy was shot by X”

  • (Brill et al. 2001)
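A sketch of such rewrites; the single alternation rule below is illustrative, not the published rule set.

```python
# Turn "Who <verb> <object>?" into declarative search strings whose
# missing slot (*) is the answer.
import re

def rewrites(question):
    m = re.match(r"[Ww]ho (\w+) (.+)\?", question)
    if not m:
        return []
    verb, obj = m.groups()
    return [f'"* {verb} {obj}"',          # "X shot Kennedy"
            f'"{obj} was {verb} by *"']   # "Kennedy was shot by X"

print(rewrites("Who shot Kennedy?"))
```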


Beyond…

  • The first step beyond simple keyword matching is to use relative position information

  • One way of doing this is to use alignment information


Lecture Outline

  • Answer Preferences

    • Question Analysis

    • Type identification

    • Learning Answer Typing

  • Answer Context

    • Learning Context Similarity

    • Alignment

    • Surface Text Patterns


Local Alignment

Who shot Kennedy?

Jack assassinated Oswald, the man who shot Kennedy, and was Mrs. Ruby’s Husband.

(Three potential candidates, by type)


Local Alignment

Who shot Kennedy? (“Who” = question head word)

Jack assassinated Oswald, the man who shot Kennedy, and was Mrs. Ruby’s Husband. (“shot Kennedy” = matching context)


Local Alignment

Who shot Kennedy? (“shot” = anchor word)

Jack assassinated Oswald, the man who shot Kennedy, and was Mrs. Ruby’s Husband.


Local Alignment

Who shot Kennedy?

(Potential alignments)

Jack assassinated Oswald, the man who shot Kennedy, and was Mrs. Ruby’s Husband.


Local Alignment

Who shot Kennedy?

(One alignment)

Jack assassinated Oswald, the man who shot Kennedy, and was Mrs. Ruby’s Husband.

Three alignment features:


Local Alignment

Who shot Kennedy?

Jack assassinated Oswald, the man who shot Kennedy, and was Mrs. Ruby’s Husband.

Three alignment features:

1. Dws: distance between the question head word and the anchor, in the sentence (2 in this alignment)


Local Alignment

Who shot Kennedy?

Jack assassinated Oswald, the man who shot Kennedy, and was Mrs. Ruby’s Husband.

Three alignment features:

2. Dwq: distance between the question head word and the anchor, in the question (1 here)


Local Alignment

Who shot Kennedy? (head word position flipped)

Jack assassinated Oswald, the man who shot Kennedy, and was Mrs. Ruby’s Husband.

Three alignment features:

3. R: has the head word changed position?


Build a Statistical Model

  • Pr(answer | question, sentence) =
      Pr(Dws | answer, question, sentence)
    × Pr(Dwq | answer, question, sentence)
    × Pr(R | answer, question, sentence)

  • If unsure about the type preference, a term for it can be added as well
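A sketch of computing the three features for each candidate alignment; token indices stand in for the slide's word distances, and the probability tables of the factored model would have to be estimated from question/answer training pairs.

```python
# The three alignment features for one alignment of the question head
# word onto a candidate answer in the sentence.
def alignment_features(question, sentence, anchor, head_word, candidate):
    q, s = question.split(), sentence.split()
    d_ws = abs(s.index(candidate) - s.index(anchor))   # Dws: in the sentence
    d_wq = abs(q.index(head_word) - q.index(anchor))   # Dwq: in the question
    r = ((q.index(head_word) < q.index(anchor)) !=
         (s.index(candidate) < s.index(anchor)))       # R: head word flipped sides?
    return d_ws, d_wq, r

question = "Who shot Kennedy"
sentence = "Jack assassinated Oswald the man who shot Kennedy"
for candidate in ["Jack", "Oswald", "Kennedy"]:
    print(candidate, alignment_features(question, sentence, "shot", "Who", candidate))
# The factored score is then Pr(Dws|...) * Pr(Dwq|...) * Pr(R|...),
# with each table estimated from training data.
```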



Surface Text Patterns

Using the context of the question to pick out the correct answer from a given sentence containing an answer:

  • Categorize the question by what kind of data it is looking for

  • Use templates to build specialized models

  • Use the resulting “surface text patterns” for searching


Birthday Templates


Web Search to Generate Patterns

Web pages with “Mozart” and “1756”

→ Sentences with “Mozart” and “1756”

→ Substrings with “Mozart” and “1756”

How can we pick good patterns?

  • Frequent ones may be too general

  • Infrequent ones are not that useful

  • We want precise, specific ones

  • Use held-out templates to evaluate patterns, as sketched below
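A sketch of that held-out evaluation: instantiate each induced pattern with fresh (name, answer) pairs and measure how often its matches are correct. The corpus, the pairs, and the \w+ answer slot are illustrative assumptions.

```python
# Score a pattern by its precision on held-out (name, answer) pairs.
import re

def pattern_to_regex(pattern, name):
    """Instantiate a pattern for one name, capturing the answer slot."""
    head, _, tail = pattern.replace("<NAME>", name).partition("<ANSWER>")
    return re.escape(head) + r"(\w+)" + re.escape(tail)

def precision(pattern, held_out, corpus):
    correct = total = 0
    for name, answer in held_out:
        for m in re.finditer(pattern_to_regex(pattern, name), corpus):
            total += 1
            correct += (m.group(1) == answer)
    return correct / total if total else 0.0

corpus = "Bach was born in 1685. Newton was born in 1642, it is said."
held_out = [("Bach", "1685"), ("Newton", "1642")]
print(precision("<NAME> was born in <ANSWER>", held_out, corpus))  # 1.0
```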