Using Information Extraction for Question Answering

Presentation Transcript



Using Information Extraction for Question Answering

Done by

Rani Qumsiyeh



Problem

  • More information is added to the web every day.

  • Search engines exist, but they have a problem: they match keywords and return documents rather than direct answers.

  • This calls for a different kind of search engine.



History of QA

  • QA can be dated back to the 1960s.

  • Two common approaches to designing QA systems:

    • Information Extraction

    • Information Retrieval

  • Two conferences evaluate QA systems:

    • TREC (Text REtrieval Conference)

    • MUC (Message Understanding Conference)



Common Issues with QA systems

  • Information retrieval deals with keywords.

  • Information extraction learns what the question is asking.

  • A question can be phrased in multiple ways, which means:

    • Easier for IR, but broader results

    • Harder for IE, but more exact results



Message Understanding Conference (MUC)

  • Sponsored by the Defense Advanced Research Projects Agency (DARPA) 1991-1998.

  • Developed methods for formal evaluation of IE systems

  • Run as a competition in which participants compare their results with each other and against human annotators' key templates.

  • Short system preparation time to stimulate portability to new extraction problems: only one month to adapt the system to the new scenario before the formal run.



Evaluation Metrics

  • Precision and recall:

    • Precision: correct answers/answers produced

    • Recall: correct answers/total possible answers

  • F-measure: F = (β² + 1)PR / (β²P + R)

    • where β is a parameter representing the relative importance of P & R

    • E.g., β = 1 gives P & R equal weight; β = 0 gives only P

  • Current state of the art: the F = 0.60 barrier
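
As a quick illustration of these metrics (not part of the original slides), here is a minimal Python sketch that computes precision, recall, and the F-measure from raw answer counts:

    def evaluate(correct, produced, possible, beta=1.0):
        """Precision, recall, and F-measure from answer counts.

        correct  -- correct answers produced by the system
        produced -- total answers the system produced
        possible -- total answers in the answer key
        beta     -- relative importance of recall vs. precision
        """
        precision = correct / produced if produced else 0.0
        recall = correct / possible if possible else 0.0
        denom = beta ** 2 * precision + recall
        f = (beta ** 2 + 1) * precision * recall / denom if denom else 0.0
        return precision, recall, f

    # Example: 30 correct answers out of 50 produced, with 60 possible
    print(evaluate(30, 50, 60))   # (0.6, 0.5, 0.545...)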



MUC Extraction Tasks

  • Named Entity task (NE)

  • Template Element task (TE)

  • Template Relation task (TR)

  • Scenario Template task (ST)

  • Coreference task (CO)



Named Entity Task (NE)

  • Mark in the text each string that represents a person, organization, or location name, or a date, time, currency, or percentage figure
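
To make the NE task concrete, here is a minimal sketch using spaCy; spaCy is only an illustration (the presentation does not name a tool), and the example sentence and model name are assumptions:

    import spacy

    # Assumes the small English model is installed:
    #   pip install spacy && python -m spacy download en_core_web_sm
    nlp = spacy.load("en_core_web_sm")

    text = ("Dr. Big Head of We Build Rockets Inc. was paid $2 million, "
            "a 5% raise, on Tuesday.")

    # Keep only the MUC-style categories: names, dates/times, money/percentages
    wanted = {"PERSON", "ORG", "GPE", "LOC", "DATE", "TIME", "MONEY", "PERCENT"}
    for ent in nlp(text).ents:
        if ent.label_ in wanted:
            print(ent.text, ent.label_)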



Template Element Task (TE)

  • Extract basic information related to organization, person, and artifact entities, drawing evidence from everywhere in the text.



Template Relation task (TR)

  • Extract relational information on relations such as employee_of, manufacture_of, and location_of (TR expresses domain-independent relationships between entities identified by TE)



Scenario Template task (ST)

  • Extract prespecified event information and relate it to particular organization, person, or artifact entities (ST identifies domain- and task-specific entities and relations)



Coreference task (CO)

  • Capture information on coreferring expressions, i.e. all mentions of a given entity, including those marked in NE and TE (nouns, noun phrases, pronouns)



An Example

  • The shiny red rocket was fired on Tuesday. It is the brainchild of Dr. Big Head. Dr. Head is a staff scientist at We Build Rockets Inc.

  • NE: entities are rocket, Tuesday, Dr. Head and We Build Rockets

  • CO: it refers to the rocket; Dr. Head and Dr. Big Head are the same

  • TE: the rocket is shiny red and Head's brainchild

  • TR: Dr. Head works for We Build Rockets Inc.

  • ST: a rocket launching event occurred with the various participants.
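
One way to picture the output of these tasks is as filled templates; the following is a simplified sketch in Python dictionaries (not the actual MUC template format):

    # NE: the entity mentions found in the text
    ne = ["rocket", "Tuesday", "Dr. Head", "We Build Rockets Inc."]

    # CO: coreference links ("it" -> rocket, Dr. Big Head = Dr. Head)
    co = {"it": "rocket", "Dr. Big Head": "Dr. Head"}

    # TE: basic attributes of the entities
    te = {"rocket": {"description": "shiny red", "brainchild_of": "Dr. Head"}}

    # TR: domain-independent relations between entities
    tr = {"employee_of": [("Dr. Head", "We Build Rockets Inc.")]}

    # ST: the scenario-specific event with its participants
    st = {"event": "rocket launching",
          "date": "Tuesday",
          "participants": ["rocket", "Dr. Head", "We Build Rockets Inc."]}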



Scoring templates

  • Templates are compared on a slot-by-slot basis

    • Correct: response = key

    • Partial: response ≈ key

    • Incorrect: response != key

    • Spurious: key is blank

      • overgeneration = spurious / actual

    • Missing: response is blank
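
A minimal sketch of slot-by-slot scoring, assuming the usual MUC convention that a partial match counts as half a correct answer (the slide does not spell out the weighting); the partial_match test here is a hypothetical stand-in:

    def score_slots(pairs):
        """Score (response, key) slot pairs; None means the slot is blank."""
        correct = partial = incorrect = spurious = missing = 0
        for response, key in pairs:
            if response is None and key is None:
                continue                  # both blank: not scored
            if key is None:
                spurious += 1             # key is blank
            elif response is None:
                missing += 1              # response is blank
            elif response == key:
                correct += 1
            elif partial_match(response, key):
                partial += 1
            else:
                incorrect += 1

        actual = correct + partial + incorrect + spurious    # slots the system filled
        possible = correct + partial + incorrect + missing   # slots in the answer key
        precision = (correct + 0.5 * partial) / actual if actual else 0.0
        recall = (correct + 0.5 * partial) / possible if possible else 0.0
        overgeneration = spurious / actual if actual else 0.0
        return precision, recall, overgeneration

    def partial_match(response, key):
        # Hypothetical partial-match test: one string contains the other
        return response in key or key in response

    # Example: one exact match, one partial match, one missing slot
    print(score_slots([("IBM", "IBM"), ("Big Head", "Dr. Big Head"), (None, "Tuesday")]))
    # (0.75, 0.5, 0.0)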



Maximum Results Reported



KnowItAll, TextRunner, KnowItNow

  • They differ in implementation but do the same thing: large-scale extraction of facts from web text.



Using them as QA systems

  • Able to handle questions that map to a single relation:

    • Who is the president of the US? (can handle)

    • Who was the president of the US in 1998? (fails)

  • They produce a huge number of facts that the user still has to go through.



Textract

  • Aims to resolve ambiguity in text by introducing more named entities.

  • What is Julian Werver Hill's wife's telephone number?

    • equivalent to: What is Polly's telephone number?

  • Where is Werver Hill's affiliated company located?

    • equivalent to: Where is Microsoft located?



Proposed System

  • Determine what type of named entity we are looking for, using Textract.

  • Use part-of-speech (POS) tagging.

  • Use TextRunner as the basis for search.

  • Use WordNet to find synonyms.

  • Use extra entities in text as “constraints”



Example



Example

  • (WP who) (VBD was) (DT the) (JJ first) (NN man) (TO to) (VB land) (IN on) (DT the) (NN moon)

  • The verb (VB) is treated as the argument.

  • The noun (NN) is treated as the predicate.

  • We make sure that word position is maintained.

  • We keep prepositions if they connect two nouns (e.g., "president of the US").

  • Other non-stop words are treated as constraints, e.g., "first".
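
For reference, the tag sequence above can be reproduced with an off-the-shelf tagger; NLTK is used here purely as an illustration (the presentation does not name a specific tagger):

    import nltk

    # One-time setup: nltk.download() the tokenizer and POS-tagger models
    question = "Who was the first man to land on the moon"
    tags = nltk.pos_tag(nltk.word_tokenize(question))
    print(tags)
    # [('Who', 'WP'), ('was', 'VBD'), ('the', 'DT'), ('first', 'JJ'), ('man', 'NN'),
    #  ('to', 'TO'), ('land', 'VB'), ('on', 'IN'), ('the', 'DT'), ('moon', 'NN')]

    # Following the slide: VB -> argument, NN -> predicate, other
    # non-stop words (here the JJ "first") -> constraints
    argument = [w for w, t in tags if t == "VB"]
    predicate = [w for w, t in tags if t == "NN"]
    constraints = [w for w, t in tags if t == "JJ"]
    print(argument, predicate, constraints)   # ['land'] ['man', 'moon'] ['first']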



Example



Anaphora Resolution

  • Use anaphora resolution to determine which entity an extracted fact actually refers to, e.g., whether a pronoun goes with the entity that landed or the one that wrote instead.



Use Synonyms

  • We use WordNet to find possible synonyms for verbs and nouns, to produce more facts.

  • We consider only 3 synonyms, since each additional synonym means more fact retrievals and therefore more time.
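
A minimal sketch of pulling up to 3 synonyms from WordNet via NLTK's interface (an assumption; the presentation does not say which WordNet API was used):

    from nltk.corpus import wordnet as wn   # requires nltk.download('wordnet')

    def synonyms(word, pos, limit=3):
        """Return up to `limit` WordNet synonyms of `word` for the given part of speech."""
        found = []
        for synset in wn.synsets(word, pos=pos):
            for lemma in synset.lemma_names():
                candidate = lemma.replace("_", " ")
                if candidate != word and candidate not in found:
                    found.append(candidate)
                if len(found) == limit:
                    return found
        return found

    print(synonyms("land", wn.VERB))   # up to 3 verb synonyms of "land"
    print(synonyms("man", wn.NOUN))    # up to 3 noun synonyms of "man"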



Using constraints



Delimitations

  • Works well with Who, When, and Where questions, since the target named entity is easily determined.

    • Achieves about 90% accuracy on these

  • Works less well with What and How questions

    • Achieves about 70% accuracy

  • Takes about 13 seconds to answer a question.



Future Work

  • Build an ontology to determine the target named entity and parse the question (faster)

  • Handle combinations of questions.

    • When and where did the holocaust happen?

