Log-Based Evaluation Resources for Question Answering





Thomas Mandl, Julia Maria Schulz



Information Retrieval Logs and Question Answering

Users are not always aware that different kinds of systems exist, such as information retrieval and question answering systems.

Short queries are the preferred way of asking for information, but users sometimes also enter phrases or complete sentences.

This creates a demand for query-specific treatment (Mandl & Womser-Hacker 2005).
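As a rough illustration of what query-specific treatment could look like, the sketch below sorts raw queries into keyword, phrase, and question types and picks a processing route for each. The question word list, the length threshold, and the route names are assumptions made for this example; none of them come from the slides or from the cited paper.

```python
import re

# Assumed, illustrative list of English question words (not from the slides).
QUESTION_WORDS = {"who", "what", "when", "where", "why", "which", "how"}

# Hypothetical names for the processing routes.
ROUTES = {
    "question": "question_answering",
    "phrase": "phrase_search",
    "keyword": "keyword_search",
}


def classify_query(query: str) -> str:
    """Return 'question', 'phrase', or 'keyword' for a raw query string."""
    tokens = re.findall(r"\w+", query.lower())
    if not tokens:
        return "keyword"
    # A trailing question mark or a leading question word marks a question.
    if query.strip().endswith("?") or tokens[0] in QUESTION_WORDS:
        return "question"
    # Assumed threshold: more than three tokens counts as a phrase.
    if len(tokens) > 3:
        return "phrase"
    return "keyword"


def route(query: str) -> str:
    """Pick the (hypothetical) processing route for a query."""
    return ROUTES[classify_query(query)]
```

A heuristic like this will misclassify some queries (for example, titles that start with a question word), so it is only a starting point for analysing a log.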



Logfile resources at CLEF



Information Retrieval Evaluation Resources

GeoCLEF 2007:

investigated and provided evaluation resources for geographic information retrieval (Mandl et al. 2008)

The query identification task was based on a query set from MSN, which is no longer distributed by Microsoft

LogCLEF 2009

“Action logs” from The European Library (TEL) portal, covering the period from 1 January 2007 to 30 June 2008

Web search engine query log from the Tumba! search engine

LogCLEF 2010

Extended TEL query and action logs

DIPF query logs: a raw server log covering three months of activity on the portal is made available. The files total 5 GB in size.



TEL

The most significant columns of the table are:

A numeric id, for identifying registered users or “guest” otherwise;

User’s IP address;

An automatically generated alphanumeric id, identifying sequential actions of the same user (sessions);

Query contents;

Name of the action that a user performed;

The corresponding collection’s alphanumeric id;

Date and time of the action’s occurrence.
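The column list above can be sketched as a small record type plus a reader that groups actions by session. The field names, the comma delimiter, and the column order (taken to follow the list) are assumptions for illustration; the actual TEL file layout is not specified here, so the timestamp is kept as plain text.

```python
import csv
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class TelAction:
    """One row of the action log; field names are hypothetical."""
    user_id: str        # numeric id, or "guest" for unregistered users
    ip_address: str
    session_id: str     # generated alphanumeric id shared by one session
    query: str
    action: str         # name of the action the user performed
    collection_id: str
    timestamp: str      # kept as text because the date format is not given


def read_actions(lines: Iterable[str], delimiter: str = ",") -> List[TelAction]:
    """Parse delimited lines, skipping rows without exactly seven columns."""
    reader = csv.reader(lines, delimiter=delimiter)
    return [TelAction(*row) for row in reader if len(row) == 7]


def group_by_session(actions: Iterable[TelAction]) -> Dict[str, List[TelAction]]:
    """Collect the actions of each session in their original order."""
    sessions: Dict[str, List[TelAction]] = defaultdict(list)
    for action in actions:
        sessions[action.session_id].append(action)
    return dict(sessions)
```

Grouping by the session id rather than by the user id keeps the many “guest” users apart, which matters when looking at query reformulations within one session.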



Question Style Queries in Query Logs I

Examples of queries from the MSN query logfile.



Question Style Queries in Query Logs II

Examples of queries from the TEL logfile.



Stop Words in Query Reformulations

Over one quarter of all query reformulations in the TEL log are additions or deletions of stop words (Ghorab et al. 2009).

Question words like “where” or “when” are also common stop words in information retrieval systems.

Prepositions are typical in the reformulation set, too.

Prepositions are used frequently in the Tumba! search engine log.

Prepositions are among the most frequent terms in the MSN log.
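One way to spot such reformulations in a session is to compare two consecutive queries and check whether they differ only by stop words. This is a minimal sketch and not the method used by Ghorab et al.; the stop word list is a small assumed sample, and changes in word order are ignored.

```python
from collections import Counter
from typing import Optional, Set

# Small assumed stop word sample, including question words and prepositions.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "for", "to",
              "where", "when", "what", "who", "how"}


def stop_word_reformulation(previous: str, current: str,
                            stop_words: Set[str] = STOP_WORDS) -> Optional[str]:
    """Return 'addition' or 'deletion' if only stop words changed, else None."""
    before = Counter(previous.lower().split())
    after = Counter(current.lower().split())
    added = after - before      # terms that appear only in the new query
    removed = before - after    # terms that appear only in the old query
    if added and not removed and all(t in stop_words for t in added):
        return "addition"
    if removed and not added and all(t in stop_words for t in removed):
        return "deletion"
    return None
```

Applied to each pair of consecutive queries in a session, the share of pairs that return a value gives a rough estimate of how common stop word reformulations are in a log.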



Outlook

CLEF has created evaluation resources for logfile analysis which can be used for comparative system evaluation.

The available files do contain queries which could be interesting for question answering systems.

They contain full sentences as questions or phrases which cannot be processed appropriately by the “bag of words” approach.
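The limitation can be made concrete with a small sketch: once stop words are removed and word order is discarded, two different questions end up with the same representation. The stop word list and the example questions are invented for illustration and are not taken from any of the logs.

```python
import re
from collections import Counter
from typing import Set

# Assumed stop word sample; question words are treated as stop words here.
STOP_WORDS = {"where", "when", "was", "is", "the", "of", "in", "a"}


def bag_of_words(query: str, stop_words: Set[str] = STOP_WORDS) -> Counter:
    """Count the non-stop-word terms of a query, ignoring word order."""
    tokens = re.findall(r"\w+", query.lower())
    return Counter(t for t in tokens if t not in stop_words)


# Two invented questions that ask for different kinds of answers.
place_question = bag_of_words("Where was Mozart born?")
time_question = bag_of_words("When was Mozart born?")

# Both collapse to the same bag, so the expected answer type is lost.
same_representation = place_question == time_question
```

A question answering system needs exactly the information that is discarded here, namely whether a place or a date is being asked for.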



References

Ghorab, M.R.; Leveling, J.; Zhou, D.; Jones, G.; Wade, V.: TCD-DCU at LogCLEF 2009: An Analysis of Queries, Actions, and Interface Languages. In: Peters, C.; Di Nunzio, G.; Kurimo, M.; Mandl, T.; Mostefa, D.; Peñas, A.; Roda, G. (Eds.): Multilingual Information Access Evaluation Vol. I Text Retrieval Experiments: Proceedings 10th Workshop of the Cross-Language Evaluation Forum, CLEF 2009, Corfu, Greece. Revised Selected Papers. Berlin et al.: Springer [Lecture Notes in Computer Science] to appear. Preprint in Working Notes: http://www.clef-campaign.org/2009/working_notes/

Li, Z., Wang, C., Xie, X., Ma, W.-Y. (2008). Query Parsing Task for GeoCLEF2007 Report. In: Working Notes 8th Workshop of the Cross-Language Evaluation Forum, CLEF 2007, Budapest, Hungary, http://www.clef-campaign.org/2007/working_notes/LI_OverviewCLEF2007.pdf

Mandl, T., Gey, F., Di Nunzio, G., Ferro, N., Larson, R., Sanderson, M., Santos, D., Womser-Hacker, C., Xing, X. (2008). GeoCLEF 2007: the CLEF 2007 Cross-Language Geographic Information Retrieval Track Overview. In: Peters, C.; Jijkoun, V.; Mandl, T.; Müller, H.; Oard, D.; Peñas, A.; Petras, V.; Santos, D. (Eds.): Advances in Multilingual and Multimodal Information Retrieval: 8th Workshop of the Cross-Language Evaluation Forum. CLEF 2007, Budapest, Hungary, Revised Selected Papers. Berlin et al.: Springer [Lecture Notes in Computer Science 5152] pp. 745-772.

Mandl, T., Womser-Hacker, C. (2005). The Effect of Named Entities on Effectiveness in Cross-Language Information Retrieval Evaluation. In: Proceedings of 2005 ACM SAC Symposium on Applied Computing (SAC). Santa Fe, New Mexico, USA. March 13-17. pp. 1059-1064.

Mandl, T.; Agosti, M.; Di Nunzio, G.; Yeh, A.; Mani, I.; Doran, C.; Schulz, J.M. (2010): LogCLEF 2009: the CLEF 2009 Cross-Language Logfile Analysis Track Overview. In: Peters, C.; Di Nunzio, G.; Kurimo, M.; Mandl, T.; Mostefa, D.; Peñas, A.; Roda, G. (Eds.): Multilingual Information Access Evaluation Vol. I Text Retrieval Experiments: Proceedings 10th Workshop of the Cross-Language Evaluation Forum, CLEF 2009, Corfu, Greece. Revised Selected Papers. Berlin et al.: Springer [Lecture Notes in Computer Science] to appear. Preprint in Working Notes: http://www.clef-campaign.org/2009/working_notes/LogCLEF-2009-Overview-Working-Notes-2009-09-14.pdf

