


QA System

Maria Alexandropoulou

Max Kaufmann

Alena Hrynkevich



Bug fixes / improvements

  • Actually take into account the weights of web results produced by exact and inexact queries

  • Correctly post-process the Right and Left answer placeholders for web results of exact queries

  • Return web results for inexact queries and rank them by bigram overlap with the search string (see the sketch after this list)

  • Reformulations now use correct past-tense verb forms (including irregular verbs) and 3rd-person forms

  • Cache web search results to facilitate parameter tuning
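
A minimal sketch of the bigram-overlap ranking mentioned above; the class and method names and the whitespace tokenization are illustrative assumptions, not the system's actual code:

    import java.util.HashSet;
    import java.util.Set;

    public class BigramOverlap {

        // Collect lower-cased token bigrams of a string.
        private static Set<String> bigrams(String text) {
            String[] tokens = text.toLowerCase().split("\\s+");
            Set<String> result = new HashSet<>();
            for (int i = 0; i + 1 < tokens.length; i++) {
                result.add(tokens[i] + " " + tokens[i + 1]);
            }
            return result;
        }

        // Score a web result by the fraction of search-string bigrams it contains;
        // results for inexact queries are then ranked by this score.
        public static double overlapScore(String webResult, String searchString) {
            Set<String> queryBigrams = bigrams(searchString);
            if (queryBigrams.isEmpty()) return 0.0;
            Set<String> shared = bigrams(webResult);
            shared.retainAll(queryBigrams);
            return (double) shared.size() / queryBigrams.size();
        }
    }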



Improved Question Processing: Reformulation Improvements

Not-so-hard questions that we had no chance of answering correctly for D2:

  • Who is she married to? - Judi Dench

  • When was it founded? - American Enterprise Institute

  • When was the division formed? - 82nd Airborne Division

  • Who was the victim of the murder? - John William King convicted of murder

  • For what crime was the deposed leader found guilty? - Pakistani government overthrown in 1999

  • At what institute was this procedure done? - cloning of mammals (from adult cells)



Improved Question Processing: Question Series

A more complex scenario that we did not attempt to address (we process each question in isolation):

Question series 1

  • Who became Tufts University President in 1992? - Tufts University

  • Over which other university did he preside? - Tufts University

    Question series 2

  • What was her husband's title when she married him? - 165.6 - the Queen Mum's 100th Birthday

  • What was his title when he died? - 165.7 - the Queen Mum's 100th Birthday



Improved Question Processing

A multi-step procedure that consults the question's topic to make the question as unambiguous as possible (a sketch follows the list):

  • Resolved personal, possessive and demonstrative pronouns (e.g. "How long was it used as a defense?" became "How long was Great Wall of China used as a defense?", but "How long did it take to build the Tower of Pisa?" remained intact; "At what institute was this procedure done?" became "At what institute was cloning of mammals (from adult cells) procedure done?")

  • Expanded some concepts based on the target (e.g. "What is the division's motto?" became "What is the 82nd Airborne Division motto?")

  • Questions like "How old was Thabo Mbeki when he was elected president?" given the topic "Thabo Mbeki elected president of South Africa" remained as-is

  • When we concluded that a question does not relate to its topic well enough (e.g. "How old was the dam?" under the topic "Johnstown flood"), we simply constructed reformulations of the original question and appended the topic's text to each of them before searching
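
A rough sketch of the pronoun-to-topic substitution step; the pronoun list and the "replace the first match" heuristic are assumptions, and the real system additionally decides when to leave a question intact (as in the Tower of Pisa example above):

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Sketch: substitute the question's topic for a third-person pronoun,
    // falling back to appending the topic text when no pronoun is found.
    public static String resolveAgainstTopic(String question, String topic) {
        String[] pronouns = { "it", "she", "he", "they", "its", "her", "his", "their" };
        for (String p : pronouns) {
            Pattern pat = Pattern.compile("\\b" + p + "\\b", Pattern.CASE_INSENSITIVE);
            Matcher m = pat.matcher(question);
            if (m.find()) {
                return m.replaceFirst(Matcher.quoteReplacement(topic));
            }
        }
        // no pronoun found: append the topic text before searching
        return question + " " + topic;
    }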



    Improved Question Processing: Classification

    • Software: Mallet

    • Classification Algorithm: MaxEnt

    • Features (inspired by Li and Roth, 2005)

      • Named Entity Unigrams

      • POS tags for every word

      • Head of first NP after question word

      • Head of first VP after question word

      • Bigrams

    • E.g., question 171.1: NE_PERSON_Stephen NE_PERSON_Wynn WORDPOS_When_WRB WORDPOS_was_VBD WORDPOS_Stephen_NNP WORDPOS_Wynn_NNP WORDPOS_born_VBN WORDPOS_?_. NPHEAD_Stephen VPHEAD_born BG_When_was BG_was_Stephen BG_Stephen_Wynn BG_Wynn_born? (a sketch of this feature extraction follows)
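
    A sketch of how such a feature line could be assembled; the POS tags, NE tags and NP/VP heads are assumed to be produced elsewhere (e.g. by the Stanford NLP tools) and are simply passed in here. Lines of the form "label feature feature ..." can then be imported with Mallet's standard tooling and used to train the MaxEnt classifier.

        import java.util.ArrayList;
        import java.util.List;

        // Sketch: build one whitespace-separated feature string per question,
        // mirroring the example above. Tags and heads are supplied by the caller.
        public static String featureLine(List<String> tokens, List<String> posTags,
                                         List<String> neTags, String npHead, String vpHead) {
            List<String> feats = new ArrayList<>();
            for (int i = 0; i < tokens.size(); i++) {
                if (neTags.get(i) != null) {                      // named-entity unigrams
                    feats.add("NE_" + neTags.get(i) + "_" + tokens.get(i));
                }
                feats.add("WORDPOS_" + tokens.get(i) + "_" + posTags.get(i));
            }
            feats.add("NPHEAD_" + npHead);                        // head of first NP after the question word
            feats.add("VPHEAD_" + vpHead);                        // head of first VP after the question word
            for (int i = 0; i + 1 < tokens.size(); i++) {         // word bigrams
                feats.add("BG_" + tokens.get(i) + "_" + tokens.get(i + 1));
            }
            return String.join(" ", feats);
        }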



    Question Classification Results

    • Classified all questions using the Li & Roth taxonomy

      • Used the training and test data published by Li and Roth (http://cogcomp.cs.illinois.edu/Data/QA/QC/)

      • Accuracy: .808

      • Confusion matrix showed errors were fairly evenly distributed over all classes

    • Do we really care about all types of question?

      • Limited the taxonomy to the classes listed below (see the sketch after the class list)

      • Everything else was “OTHER”

      • Accuracy: .884

      • Errors were fairly evenly distributed among the selected classes

    Selected classes: ABBR:abb, ABBR:exp, ENTY:color, HUM:gr, HUM:ind, HUM:title, LOC:city, LOC:country, LOC:mount, LOC:other, LOC:state, NUM:code, NUM:count, NUM:date, NUM:money, NUM:perc
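
    A minimal sketch of that label collapse: any Li & Roth label outside the selected set is mapped to "OTHER" before training and prediction (the set literal simply repeats the classes listed above):

        import java.util.Arrays;
        import java.util.HashSet;
        import java.util.Set;

        // Sketch: collapse the full taxonomy to the classes of interest.
        private static final Set<String> KEPT_CLASSES = new HashSet<>(Arrays.asList(
            "ABBR:abb", "ABBR:exp", "ENTY:color", "HUM:gr", "HUM:ind", "HUM:title",
            "LOC:city", "LOC:country", "LOC:mount", "LOC:other", "LOC:state",
            "NUM:code", "NUM:count", "NUM:date", "NUM:money", "NUM:perc"));

        public static String collapseLabel(String liRothLabel) {
            return KEPT_CLASSES.contains(liRothLabel) ? liRothLabel : "OTHER";
        }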



    Future Question Classification Experiments

    • Which features were most helpful?

    • Does pre-processing (tokenization, lowercasing, stemming) help?

      We got state-of-the-art accuracy on the Li and Roth test set, which leads us to believe that performance will be similar on our TREC set for the classes we are most interested in

      We can easily re-train the classifier model to incorporate more classes such as ENTY:food, ENTY:body and ENTY:animal, and use this data during the answer extraction stage



    Improved Web Search: Web Results Filtering Based on Question Class

    Filtering of web passages for the question classes NUM:date, HUM:ind, HUM:title, NUM:money, HUM:gr, ABBR:abb, NUM:perc, LOC:other, LOC:state, LOC:country, LOC:city:

    if (className.equalsIgnoreCase("NUM:date"))
    {
        // keep the passage only if it contains a DATE_ named entity
        List<String> ll = stanfordNLP.extractNamedEntities(passage);
        for (String s : ll)
        {
            if (s.startsWith("DATE_"))
                return true;
        }
    }
    // ... (analogous checks for the HUM, ABBR and other NUM classes)
    else if (className.startsWith("LOC:"))
    {
        // keep the passage only if it contains a LOCATION_ or ORGANIZATION_ entity
        List<String> ll = stanfordNLP.extractNamedEntities(passage);
        for (String s : ll)
        {
            if (s.startsWith("LOCATION_") || s.startsWith("ORGANIZATION_"))
                return true;
        }
    }
    // etc. for the remaining classes



    Answer Extraction Improvements

    • During document projection, whenever no document is returned we repeat the search with a more relaxed version of the query.

      • The second query is treated by Indri as a bag of words

      • This leads to a slight improvement in strict MRR

    • Fixed an encoding issue in the strings we get from Bing that led to nonsensical characters appearing in some n-grams

    • Previously, we excluded n-grams containing words found in the question. Stop words are now excluded from that check, since they are expected to occur in both the question and the answer, and their presence alone should not cause an n-gram to be excluded (see the sketch below).
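
    A sketch of that stop-word-aware exclusion check; the stop-word list here is a small illustrative subset, and question words are assumed to be lower-cased:

        import java.util.Arrays;
        import java.util.HashSet;
        import java.util.Set;

        // Sketch: exclude a candidate n-gram only if it shares a *content* word
        // with the question; stop words no longer trigger the exclusion.
        private static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "the", "a", "an", "of", "in", "on", "to", "was", "is", "did", "and"));

        public static boolean excludeNgram(String ngram, Set<String> questionWords) {
            for (String w : ngram.toLowerCase().split("\\s+")) {
                if (!STOP_WORDS.contains(w) && questionWords.contains(w)) {
                    return true;   // shares a content word with the question
                }
            }
            return false;
        }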



    TODO: Filtering of Answer Candidates Based on Question Class

    • Use NER results (e.g. person)

    • Use rules (Date -> just Year or actual Date?)

    • Use lookup lists (animal, food, country, profession, etc.); see the sketch below

    • Speedup

      • Even with limited runtime performance
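
    One way the lookup-list idea could look; this is purely a sketch of the TODO above, not existing code, and the map from class name to word list is assumed to be loaded elsewhere:

        import java.util.Map;
        import java.util.Set;

        // Sketch: keep a candidate answer only if the lookup list for the question
        // class contains it; classes without a list are left unfiltered.
        public static boolean passesLookupFilter(String candidate, String className,
                                                 Map<String, Set<String>> lookupLists) {
            Set<String> list = lookupLists.get(className);   // e.g. "LOC:country" -> country names
            if (list == null) {
                return true;
            }
            return list.contains(candidate.toLowerCase());
        }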



    Results



    Issues

    • Performance issue when we incorporated more NLP features (tokenization, stemming, NER)

    • Too little value so far from running the classifier and filtering web passages based on the class label => we need to use NER data for more aggressive filtering at the answer extraction stage

    • Bing throws certificate exceptions at us, so we work with cached data

    • A bug causes a combined answer to be ranked lower than its unigrams



    Observations

    • The Internet and the TREC corpus present different views of some events, e.g.:

    • How many events are part of the LPGA tour? 43

    • ... 82 events on the LPGA Tour

    • the new tournaments means a net gain of one from 2012 to 28 official events for the LPGA tour ...

    • How many people died from the massacre (tourists massacred at Luxor in 1997)? 68

    • The Luxor Massacre refers to the killing of 62 people, mostly tourists

    • In 1997, fifty-eight innocent tourists were massacred in ...

    • the Luxor massacre ... massacred 62 people in 1997. Since then, Egypt's tourism industry



    Observations

    • The TREC corpus is frozen in time while the Internet is not:

    • Who is the Secretary-General for political affairs? Danilo Turk

    • Jeffrey Feltman

    • What is Tufts' current endowment? $600 million, but that was for 2006

    • The endowment was valued at $1.45 billion on June 30 ...

    • Tufts' own endowment increased by $110 million in fiscal year 2010, evidence…



    Observations

    • Litkowski patterns do not account for variation in answers, e.g.:

      How many times was Moon a Pro Bowler? 9

    • nine

      Where is the IMF headquartered? Washington

    • Washington D.C.

    • Washington DC

      Who was the victim of the murder? (John William King convicted of murder) James Byrd Jr.?

    • James Byrd

      Who did the Prince marry? Sophie Rhys.*Jones

    • Rhys-Jones

    • Sophie Rhys-Jones

      Who is the senior vice president of the American Enterprise Institute? John Bolton

    • BOLTON John

    • JOHN R. BOLTON



    Questions?

