
Question Answering for Machine Reading Evaluation: Evaluation Campaign at CLEF 2011

Anselmo Peñas (UNED, Spain), Eduard Hovy (USC-ISI, USA), Pamela Forner (CELCT, Italy), Richard Sutcliffe (U. Limerick, Ireland), Álvaro Rodrigo (UNED, Spain).


Presentation Transcript


  1. Question Answering for Machine Reading Evaluation: Evaluation Campaign at CLEF 2011 • Anselmo Peñas (UNED, Spain) • Eduard Hovy (USC-ISI, USA) • Pamela Forner (CELCT, Italy) • Richard Sutcliffe (U. Limerick, Ireland) • Álvaro Rodrigo (UNED, Spain)

  2. Knowledge-Understanding dependence • We “understand” because we “know” • Capture ‘knowledge’ expressed in texts • ‘Understand’ language

  3. Control the variable of knowledge • The ability to make inferences about texts is correlated with the amount of knowledge considered • This variable has to be taken into account during evaluation • Otherwise it is very difficult to compare methods • How to control the variable of knowledge in a reading task?

  4. Question Answering • Restricted-domain QA systems 1. On large knowledge bases • Structured QA, not aiming for language understanding 2. On a domain specific collection • Information Extraction rules • Open domain QA systems • On open domain collections • Based on retrieval and redundancy • Very limited inference • What’s next in QA?

  5. Recognizing Textual Entailment • Test: Text (evidence) – Hypothesis pair • Source of knowledge: Free • Difficult to evaluate whether the best systems have better methods, better knowledge, or both • Cheap evaluation • 100% reusable • Same framework for any level of complexity • What’s next in RTE? Control the variable of knowledge

  6. Proposal: QA4MRE Question Answering for Machine Reading Evaluation (QA4MRE) • New task of QA Track at CLEF 2011 • General Goal • Measure progress in two reading abilities • Capture knowledge from text collections • Answer questions about a single text

  7. Requirements • Don’t fix the representation formalism • Semantic representation beyond sentence level is part of the research agenda • Don't build systems tuned for specific domains • But general technologies, able to self-adapt to new contexts or topics • Evaluate reading abilities • Knowledge acquisition • Answer questions about a single document • Control the role of knowledge

  8. Sources of knowledge • Text Collection • Big and diverse enough to acquire the required knowledge • Impossible for all possible topics • Define a scalable strategy: topic by topic • Several topics • Narrow enough to limit the knowledge needed (e.g. Petroleum industry, European Football League, Disarmament of the Irish Republican Army, etc.) • Reference collection per topic (10,000-50,000 docs.) • Documents defining concepts about the topic (e.g. Wikipedia) • News about the topic • Web pages, blogs, opinions

  9. Reading test • Text: Coal seam gas drilling in Australia's Surat Basin has been halted by flooding. Australia's Easternwell, being acquired by Transfield Services, has ceased drilling because of the flooding. The company is drilling coal seam gas wells for Australia's Santos Ltd. Santos said the impact was minimal. • Multiple choice test: According to the text… What company owns wells in Surat Basin? • Australia • Coal seam gas wells • Easternwell • Transfield Services • Santos Ltd. • Ausam Energy Corporation • Queensland • Chinchilla
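
The reading test on this slide could be represented roughly as follows. This is only a minimal sketch: the `ReadingTest` class, its field names, and the `is_valid_answer` helper are hypothetical and not part of the QA4MRE specification; the text, question, and candidate answers are copied from the slide.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ReadingTest:
    """One multiple-choice question about a single text (hypothetical structure)."""
    text: str
    question: str
    options: Tuple[str, ...]

    def is_valid_answer(self, answer: str) -> bool:
        # A system may only answer with one of the listed candidates.
        return answer in self.options


test = ReadingTest(
    text=(
        "Coal seam gas drilling in Australia's Surat Basin has been halted "
        "by flooding. Australia's Easternwell, being acquired by Transfield "
        "Services, has ceased drilling because of the flooding. The company "
        "is drilling coal seam gas wells for Australia's Santos Ltd. "
        "Santos said the impact was minimal."
    ),
    question="What company owns wells in Surat Basin?",
    options=(
        "Australia",
        "Coal seam gas wells",
        "Easternwell",
        "Transfield Services",
        "Santos Ltd.",
        "Ausam Energy Corporation",
        "Queensland",
        "Chinchilla",
    ),
)
```

Keeping the text, the question, and the closed set of candidates together makes it easy to check that a system's output is one of the allowed choices before it is scored.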

  10. Knowledge gaps • [Diagram: nodes Company A, Company B, Well C, Surat Basin, Queensland, Australia; edge labels “drills for”, “owns | P”, “is part of” (twice)] • Acquire this knowledge from the reference collection
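
One way to picture how acquired knowledge fills the gap in the reading test is sketched below. It is an illustration under stated assumptions, not the method proposed in the talk: the relation names (`drills_for`, `drills_in`), the function `infer_owners`, and the background rule "if A drills for B at a location, B owns wells there" are all hypothetical, and the probability P shown in the diagram is not modelled.

```python
# Facts a system might extract from the single test document
# (relation names are hypothetical).
document_facts = {
    ("Easternwell", "drills_for", "Santos Ltd."),
    ("Easternwell", "drills_in", "Surat Basin"),
}


def infer_owners(facts, location):
    """Apply one assumed background rule, imagined as learned from the
    reference collection: if A drills in `location` and A drills for B,
    then B owns wells in `location`."""
    drillers = {s for (s, rel, o) in facts if rel == "drills_in" and o == location}
    return {o for (s, rel, o) in facts if rel == "drills_for" and s in drillers}


owners = infer_owners(document_facts, "Surat Basin")
```

The document alone never states who owns the wells; the answer only follows once the background rule is available, which is the kind of knowledge the reference collection is meant to supply.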

  11. Runs • Type I • No external sources of knowledge • Only the given reference collection • Type II • With external sources • Specify which ones.

  12. Schedule Web site: http://celct.fbk.eu/QA4MRE/

  13. Program Committee • Ken Barker, University of Texas at Austin, US • Johan Bos, Rijksuniversiteit Groningen, Netherlands • Peter Clark, Vulcan Inc., US • Ido Dagan, Bar-Ilan University, Israel • Bernardo Magnini, Fondazione Bruno Kessler, Italy • Dan Moldovan, University of Texas at Dallas, US • Emanuele Pianta, Fondazione Bruno Kessler, and CELCT, Italy • John Prager, IBM, US • Dan Tufis, RACAI, Romania • Hoa Trang Dang, NIST, US

  14. Join the organization Working group is open to collaboration • Development collections • Add new languages • Define types of questions • Write down tests about a topic • … anselmo@lsi.uned.es
