



The Multiple Language Question Answering Track at CLEF 2003

Bernardo Magnini*, Simone Romagnoli*, Alessandro Vallin*

Jesús Herrera**, Anselmo Peñas**, Víctor Peinado**, Felisa Verdejo**

Maarten de Rijke***

* ITC-irst, Centro per la Ricerca Scientifica e Tecnologica, Trento - Italy


** UNED, Spanish Distance Learning University, Madrid – Spain


*** Language and Inference Technology Group, ILLC, University of Amsterdam - The Netherlands




Overview of the Question Answering track at CLEF 2003

  • Report on the organization of QA tasks

  • Present and discuss the participants’ results

  • Perspectives for future QA campaigns



  • QA: find the answer to an open domain question in a large collection of documents

    • INPUT: questions (instead of keyword-based queries)

    • OUTPUT: answers (instead of documents)

  • QA track at TREC

    • Mostly fact-based questions

      • Question: Who invented the electric light?

      • Answer: Edison

  • Scientific Community

    • NLP and IR

    • AQUAINT program in the USA

  • QA as an application scenario



Purposes:

  • Answers may be found in languages different from the language of the question

  • Interest in QA systems for languages other than English

  • Force the QA community to design real multilingual systems

  • Check/improve the portability of the technologies implemented in current English QA systems

  • Creation of reusable resources and benchmarks for further multilingual QA evaluation


  • CLEF QA WEB SITE (http://clef-qa.itc.it)

  • CLEF QA MAILING LIST

  • GUIDELINES FOR THE TRACK (following the model of TREC 2001)


[Diagram labels: 200 questions; target corpus; exact answers; 50 bytes answers]




[Diagram labels: Question extraction; Italian questions; Translation; English questions; QA system; English text collection; English answers; Assessment. Effort annotations: 1 p/m for 200 questions; 2 p/d for 200 questions; 4 p/d for 1 run (600 answers).]


Corpora licensed by CLEF in 2002:

MONOLINGUAL TASKS

  • Dutch Algemeen Dagblad and NRC Handelsblad (years 1994 and 1995)

  • Italian La Stampa and SDA press agency (1994)

  • Spanish EFE press agency (1994)

BILINGUAL TASK

  • English Los Angeles Times (1994)


[Diagram: construction of the test sets. Labels: CLEF Topics; MONOLINGUAL TEST SETS (150 q/a Dutch, 150 q/a Italian, 150 q/a Spanish); NEW TARGET LANGUAGES; ENGLISH; QUESTIONS SHARING (ILLC, ITC-irst, UNED; 300 Ita+Spa, 300 Dut+Spa, 300 Ita+Dut); DATA MERGING (150 Dutch/English, 150 Italian/English, 150 Spanish/English); the DISEQuA corpus]



  • 200 fact-based questions for each task:

  • questions about events that occurred in 1994 and/or 1995, i.e. the years covered by the target corpora;

  • coverage of different categories of questions: date, location, measure, person, object, organization, other;

  • questions were not guaranteed to have an answer in the corpora: 10% of the test sets required the answer string “NIL”



  • - definition questions (“Who/What is X”)

  • - Yes/No questions

  • - list questions



  • Participants were allowed to submit up to three answers per question and up to two runs:

  • answers must be either exact (i.e. contain just the minimal information) or strings at most 50 bytes long

  • answers must be supported by a document

  • - answers must be ranked by confidence

  • Answers were judged by human assessors, according to four categories:

  • CORRECT (R)

  • UNSUPPORTED (U)

  • INEXACT (X)

  • INCORRECT (W)
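
The submission rules above can be checked mechanically before a run is sent for assessment. The following is a minimal sketch, not the track's official checker: the function name `validate_answers`, the tuple layout, the UTF-8 default encoding and the exemption of "NIL" answers from the supporting-document rule are all assumptions made for illustration.

```python
MAX_ANSWERS_PER_QUESTION = 3   # "up to three answers per question"
MAX_ANSWER_BYTES = 50          # limit for the 50-byte answer runs


def validate_answers(answers, fifty_byte_run=True, encoding="utf-8"):
    """Return a list of rule violations for one question.

    `answers` is a list of (answer_string, supporting_doc_id) tuples.
    The encoding used to count bytes is an assumption.
    """
    problems = []
    if len(answers) > MAX_ANSWERS_PER_QUESTION:
        problems.append("more than 3 answers submitted")
    for position, (text, doc_id) in enumerate(answers, start=1):
        # Assumption: a "NIL" answer needs no supporting document.
        if text != "NIL" and not doc_id:
            problems.append(f"answer {position} has no supporting document")
        if fifty_byte_run and len(text.encode(encoding)) > MAX_ANSWER_BYTES:
            problems.append(f"answer {position} exceeds 50 bytes")
    return problems


# "DOC-1" is a made-up document identifier.
print(validate_answers([("Edison", "DOC-1")]))
```

Whether an "exact" answer is truly minimal cannot be checked this way; that judgment was left to the human assessors.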



The score for each question was the reciprocal of the rank of the first answer to be found correct; if no correct answer was returned, the score was 0.

The total score, or Mean Reciprocal Rank (MRR), was the mean score over all questions.

In STRICT evaluation only correct (R) answers scored points.

In LENIENT evaluation the unsupported (U) answers were considered correct, as well.
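
The scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the sample judgments are invented for the example, and each question is assumed to be represented as the list of assessor judgments (R, U, X, W) for its answers, in the order they were submitted.

```python
def reciprocal_rank(judgments, lenient=False):
    """Score one question: 1/rank of the first accepted answer, else 0."""
    accepted = {"R", "U"} if lenient else {"R"}
    for rank, judgment in enumerate(judgments, start=1):
        if judgment in accepted:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(run, lenient=False):
    """Mean of the per-question scores over all questions in a run."""
    if not run:
        return 0.0
    return sum(reciprocal_rank(j, lenient) for j in run) / len(run)


# Invented judgments for a four-question run (up to three answers each).
sample_run = [
    ["R"],            # correct at rank 1
    ["W", "U", "R"],  # unsupported at rank 2, correct at rank 3
    ["X", "W"],       # no correct answer
    ["W", "R"],       # correct at rank 2
]

strict_mrr = mean_reciprocal_rank(sample_run)                 # (1 + 1/3 + 0 + 1/2) / 4
lenient_mrr = mean_reciprocal_rank(sample_run, lenient=True)  # (1 + 1/2 + 0 + 1/2) / 4
print(strict_mrr, lenient_mrr)
```

In the second question the lenient score is higher because the unsupported answer at rank 2 counts as correct, while strict evaluation has to wait for the R answer at rank 3.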



Participants in past QA tracks

Comparison between the number and place of origin of the participants in the past TREC and in this year’s CLEF QA tracks:


Performances at TREC-QA

  • Evaluation metric: Mean Reciprocal Rank (MRR) = (sum over all questions of 1 / rank of the correct answer) / 500

[Chart: best result and average over 67 runs at TREC-8, TREC-9 and TREC-10. Values shown: 67%, 23%, 66%, 25%, 58%, 24%.]



Results - EXACT ANSWERS RUNS

MONOLINGUAL TASKS





Results - EXACT ANSWERS RUNS

CROSS-LANGUAGE TASKS





Results - 50 BYTES ANSWERS RUNS

MONOLINGUAL TASKS



Results - 50 BYTES ANSWERS RUNS

CROSS-LANGUAGE TASKS



Average Results in Different Tasks




Two main approaches used in Cross-Language QA systems:

  • Approach 1 (CS-CMU, ISI, Limerick, DFKI):

    • translation of the question into the target language (i.e. the language of the document collection)

    • question processing

    • answer extraction

  • Approach 2 (ITC-irst, RALI):

    • preliminary question processing in the source language to retrieve information (such as keywords, question focus, expected answer type, etc.)

    • translation and expansion of the retrieved data

    • answer extraction
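
The difference between the two approaches is the point at which translation happens. The toy sketch below illustrates only that ordering and is not taken from any participating system: the word-by-word lexicon, the stopword lists, the answer-type tables and all function names are hypothetical, and both the expansion step and answer extraction are left out.

```python
# Hypothetical toy resources standing in for real translation and analysis tools.
IT_EN = {"chi": "who", "inventò": "invented", "la": "the",
         "luce": "light", "elettrica": "electric"}
IT_STOPWORDS = {"chi", "la"}
EN_STOPWORDS = {"who", "the"}
IT_ANSWER_TYPES = {"chi": "PERSON"}
EN_ANSWER_TYPES = {"who": "PERSON"}


def tokenize(question):
    return question.lower().rstrip("?").split()


def translate(tokens, lexicon):
    """Word-by-word lookup; unknown words are passed through unchanged."""
    return [lexicon.get(token, token) for token in tokens]


def process_question(tokens, stopwords, answer_types):
    """Return (expected answer type, keywords) for a tokenized question."""
    answer_type = answer_types.get(tokens[0], "OTHER") if tokens else "OTHER"
    keywords = [token for token in tokens if token not in stopwords]
    return answer_type, keywords


def approach_1(question_it):
    """Translate the whole question first, then process it in English."""
    tokens_en = translate(tokenize(question_it), IT_EN)
    return process_question(tokens_en, EN_STOPWORDS, EN_ANSWER_TYPES)


def approach_2(question_it):
    """Process the question in Italian, then translate only the keywords."""
    answer_type, keywords_it = process_question(
        tokenize(question_it), IT_STOPWORDS, IT_ANSWER_TYPES)
    return answer_type, translate(keywords_it, IT_EN)


question = "Chi inventò la luce elettrica?"
print(approach_1(question))
print(approach_2(question))
```

On this toy input both routes produce the same answer type and keywords. With real translation the two can diverge: a mistranslated question word in approach 1 can change the expected answer type, whereas approach 2 determines it before any translation takes place.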


  • A pilot evaluation campaign for multiple language Question Answering systems was carried out.

  • Five European languages were considered: three monolingual tasks and five bilingual tasks against an English collection were activated.

  • Given the differences between the tasks, results are comparable with those of QA at TREC.

  • A corpus of 450 questions, each in four languages, with at least one known answer in the respective text collection, was built.

  • This year's experience was very positive: we intend to continue with QA at CLEF 2004.



  • Organization issues:

    • Promote larger participation

    • Collaboration with NIST

  • Financial issues:

    • Find a sponsor: ELRA, the new CELCT center, …

  • Tasks (to be discussed)

    • Update to TREC-2003: definition questions, list questions

    • Consider just “exact answer”: the 50-byte option found little favor

    • Introduce new languages: in the cross-language task this is easy to do

    • New steps toward multilinguality: English questions against other language collections; a small set of full cross-language tasks (e.g. Italian/Spanish).



  • Find 200 questions for each language (Dutch, Italian, Spanish), based on CLEF-2002 topics, with at least one answer in the respective corpus.

  • Translate each question into English, and from English into the other two languages.

  • Find answers in the corpora of the other languages (e.g. a Dutch question was translated and processed in the Italian text collection).

  • Questions with at least one answer in all the corpora were selected for the final question set.

  • The result is a corpus of 450 questions, each in four languages, with at least one known answer in the respective text collection. More details are in the paper and in the poster.

