
The QALL-ME Benchmark: a Multilingual Resource of Annotated Spoken Requests for Question Answering

The QALL-ME Benchmark: a Multilingual Resource of Annotated Spoken Requests for Question Answering. E. Cabrio, M. Kouylekov, B. Magnini, M. Negri (FBK-Irst), L. Hasler, C. Orasan (University of Wolverhampton), D. Tomas, J.L. Vicedo (University of Alicante), G. Neumann, C. Weber (DFKI).



Presentation Transcript


  1. The QALL-ME Benchmark: a Multilingual Resource of Annotated Spoken Requests for Question Answering E. Cabrio, M. Kouylekov, B. Magnini, M. Negri (FBK-Irst) L. Hasler, C. Orasan (University of Wolverhampton) D. Tomas, J.L. Vicedo (University of Alicante) G. Neumann, C. Weber (DFKI)

  2. Outline: • Motivations and goals • QALL-ME Project • QALL-ME Benchmark • Data collection • Translation into English • Speech Acts Annotation • Question Answering Annotation • Annotation of relations • Conclusion and Future Work

  3. Context: the Qall-me project [diagram: input/output modalities of the digital assistant: SMS, MMS, voice, text, video] QALL-ME (Question Answering Learning technologies in a multiLingual and multiModal Environment): an EU-funded project aiming at the realization of a shared and distributed infrastructure for Question Answering systems on mobile devices (e.g. mobile phones).

  4. QALL-ME details • Reference: FP6 IST-033860 • Contract Type: STREP • Start date: October 1st, 2006 • Duration: 36 months • Project Funding: 2.82 M euros http://qallme.fbk.eu

  5. Motivations • Providing a dataset of requests beyond factoid questions (e.g. verification, procedural)

  6. Motivation: beyond factoid… VERIFICATION: • has Venezia hotel a restaurant • is there a toll free number for the INAIL office in via Gazzoletti in Trento PROCEDURAL: • where is the INAIL office and how can I get there • how can I get to the pharmacy De Gerloni of Trento

  7. Motivations • Providing a dataset of requests beyond factoid questions (e.g. verification, procedural) • Investigating domain-dependent vs domain-independent annotation schemas (Qall-me project domain: cultural events in a town).

  8-9. Challenges related to events • Context aware QA • What can I see tonight at cinema (in Trento) • Where is the nearest pharmacy (to piazza Duomo) • Persistent vs dynamic information • Multiple sources (database, newspaper, web)

  10. Motivations • Providing a dataset of requests beyond factoid questions (e.g. verification, procedural) • Investigating domain-dependent vs domain-independent annotation schemas (Qall-me project domain: cultural events in a town). • Experimenting with the impact of QA annotations (e.g. EAT) on spoken requests (speech vs QA).

  11. QA annotation Expected Answer Types: may I know where the ice stadium of Trento is located [LOCATION] and at what time it opens [DATE]

  12. Motivations • Providing a dataset of requests beyond factoid questions (e.g. verification, procedural) • Investigating domain-dependent vs domain-independent annotation schemas (Qall-me project domain: cultural events in a town). • Experimenting with the impact of QA annotations (e.g. EAT) on spoken requests (speech vs QA). • Investigating the portability of semantic annotations across languages.

  13. Portability of annotations Expected Answer Type: LOCATION • (DE) ich möchte wissen wo das Eisstadium von Trento ist • (IT) potrei sapere dov’è lo stadio del ghiaccio di Trento • (ES) puedo saber donde esta el estadio de hielo de Trento • (EN) may I know where the ice stadium of Trento is located

  14. Data collection • 14645 questions in four different languages: ITALIAN, ENGLISH, GERMAN, SPANISH • Domain: cultural events in a town Acquisition: every speaker produces 30 questions, based on 15 scenarios: • Using a graphical interface, a spontaneous request is generated first for each scenario, followed by a written (previously predefined) one • Questions were acquired over the telephone.

  15. Data collection

  16. Data acquisition features

  17. Transcription All the audio files acquired from a speaker were joined together and orthographically transcribed using the tool Transcriber (http://trans.sourceforge.net). Being domain-restricted, our scenarios sometimes led to the same utterance (matching word sequence). However, the number of repetitions is actually small.
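
A quick way to check the slide's point about repeated utterances is to normalize case and whitespace and count exact duplicates; this is a minimal sketch, and the example transcriptions below are invented for illustration.

```python
# Count repeated utterances (matching word sequences) in a list of
# orthographic transcriptions. The example strings are invented.
from collections import Counter

transcriptions = [
    "where is the ice stadium of Trento",
    "where is the  ice stadium of Trento",  # same word sequence, extra space
    "how can I get to the pharmacy De Gerloni of Trento",
]

# Normalize case and whitespace so identical word sequences match.
counts = Counter(" ".join(t.lower().split()) for t in transcriptions)
repeated = {utt: n for utt, n in counts.items() if n > 1}
print(repeated)
```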

  18. Translation into English Translation was made by simulating the real situation of an English speaker visiting a foreign city. E.g. • what is the address of museo dell'aeronautica Gianni Caproni Future work: all data collected for one language will be translated into the other three languages

  19-23. Annotation of speech acts from the QALL-ME benchmark • As a starting point for further analyses, it is important to separate within an utterance (each speaker’s turn) what has to be interpreted as the actual request from what does not need an answer. Example utterance, segmented by speech act: • hallo (to greet) • I am in Trento and I would like to visit a church in the centre of the town (to contextualise) • I would like to know the name and the location of one of these churches (to ask) • thanks (to thank)

  24. Annotation of speech acts Each UTTERANCE is split into REQUESTS and NON-REQUESTS. NON-REQUESTS (ASSERT, THANKS, GREETINGS, OTHER): all the utterances used by the speaker to introduce himself, to contextualize himself or his request in time and space, to thank, to greet. REQUESTS: • DIRECT: wh-questions; requests introduced by "Could you tell me…", "May I know…"; requests pronounced with ascendant intonation • INDIRECT: requests formulated in indirect or implicit ways. For our purposes, we used CLaRK, an XML-based system for corpora development (http://www.bultreebank.org/clark/index.html).
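
The DIRECT criteria above lend themselves to a simple surface check. The sketch below is illustrative only; the word and phrase lists are assumptions, not the project's actual annotation tool, and intonation is ignored since it is not visible in text.

```python
# Toy detector for DIRECT requests: wh-questions or requests opened by
# phrases like "could you tell me" / "may I know". Lists are illustrative.
WH_WORDS = ("what", "where", "when", "who", "which", "how")
INTRO_PHRASES = ("could you tell me", "may i know", "can you give me")

def is_direct_request(utterance):
    u = utterance.lower().strip()
    return u.startswith(WH_WORDS) or u.startswith(INTRO_PHRASES)

print(is_direct_request("where is the Uffizi museum"))      # True
print(is_direct_request("I would like to visit a church"))  # False
```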

  25. Agreement (speech acts) Inter-annotator agreement (calculated on 1000 randomly picked sentences) for ITALIAN: Dice coefficient = 2C/(A+B), where C = number of common annotations and A, B = number of annotations provided by the first and the second annotator
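
The Dice formula translates directly into code. The sketch below assumes, purely for illustration, that each annotator's output is a set of (segment, label) pairs; the benchmark's actual storage format is XML.

```python
# Dice agreement between two annotators. Representing annotations as
# (segment, label) pairs is an assumption made for this sketch.
def dice(annotations_a, annotations_b):
    """Dice coefficient = 2C / (A + B), where C is the number of
    annotations shared by the two annotators."""
    common = len(set(annotations_a) & set(annotations_b))
    return 2 * common / (len(annotations_a) + len(annotations_b))

a = {("hallo", "greetings"), ("chiamo da Trento", "assert"),
     ("avrei bisogno dell'indirizzo", "indirect")}
b = {("hallo", "greetings"), ("chiamo da Trento", "other"),
     ("avrei bisogno dell'indirizzo", "indirect")}

print(round(dice(a, b), 2))  # 2*2 / (3+3) = 0.67
```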

  26. Expected Answer Type For EAT annotation we propose the following scheme, extracted from Graesser’s (1988) taxonomy: EAT → FACTOID, VERIFICATION, PROCEDURAL, DEFINITION/DESCRIPTION. EAT labels are assigned at two levels: • DOMAIN-INDEPENDENT (Sekine’s ENE hierarchy) • DOMAIN-SPECIFIC (QALL-ME ontology)
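
The coarse EAT classes can be sketched with surface heuristics built from the slides' own examples. This is a toy illustration, not the project's classifier, and the patterns are assumptions.

```python
# Toy classifier for coarse EAT classes: VERIFICATION (yes/no questions),
# PROCEDURAL ("how can I get ..."), otherwise FACTOID. Heuristics only.
def coarse_eat(question):
    q = question.lower().strip()
    if q.startswith(("is there", "are there", "has ", "does ")):
        return "VERIFICATION"
    if q.startswith("how can i"):
        return "PROCEDURAL"
    return "FACTOID"

print(coarse_eat("is there a toll free number for the INAIL office"))  # VERIFICATION
print(coarse_eat("how can I get to the pharmacy De Gerloni"))          # PROCEDURAL
print(coarse_eat("where is the Uffizi museum"))                        # FACTOID
```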

  27. Sekine’s ENE vs Qall-me ont. what is the restaurant in via Brennero in Trento → EAT labelled both in Sekine’s ENE hierarchy and in the Qall-me ontology (labels shown on the original slide)

  28. Sekine’s ENE vs Qall-me ont. can you give me the name of the pharmacy in piazza Pasi 20 in Trento → EAT labelled both in Sekine’s ENE hierarchy and in the Qall-me ontology (labels shown on the original slide)

  29. Annotation of Relations At what time is the movie il grande capo beginning tomorrow afternoon at Vittoria cinema • Rel1 (MOVIE, DATE) • Rel2 (MOVIE, STARTINGHOUR) • Rel3 (MOVIE, CINEMA) • 10% of the Italian questions (referring to the Cinema/Movie domain) have been annotated with the 12 relations holding in that domain (Qall-me ontology). • Relations among entities convey and complete the context in which a specific request has to be interpreted
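
The three relations on the slide can be encoded as simple triples. The relation names and argument types follow the slide; the (name, domain, range) storage format is an assumption for this sketch.

```python
# The slide's relations for the cinema/movie question as
# (relation, domain, range) triples.
question = ("at what time is the movie il grande capo beginning "
            "tomorrow afternoon at Vittoria cinema")
relations = [
    ("Rel1", "MOVIE", "DATE"),
    ("Rel2", "MOVIE", "STARTINGHOUR"),
    ("Rel3", "MOVIE", "CINEMA"),
]

# Every relation involves the MOVIE entity, which anchors the context
# in which the request has to be interpreted.
print(all(domain == "MOVIE" for _, domain, _ in relations))  # True
```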

  30. Status of the benchmark Present situation and tentative scheduling: the QALL-ME benchmark is being made incrementally available at the project website (http://qallme.fbk.eu)

  31. Future work Additional annotation layers will be considered: • Focus of the question • Multiwords • Named Entities • Normalized Temporal Expressions • …

  32. Conclusions • QALL-ME benchmark: multilingual resource (for Italian, Spanish, English and German) of annotated spoken requests in the tourism domain. • Beyond factoid • Context aware QA and dynamic changes • QA annotation on spoken requests • Portability of semantic annotation • Reference resource, useful to train and test ML-based QA systems

  33. Thank you {cabrio, kouylekov, magnini, negri}@fbk.eu {L.Hasler, c.orasan}@wlv.ac.uk {tomas, vicedo}@disi.ua.es {neumann, cowe01}@dfki.de Project website: http://qallme.fbk.eu

  34. Acquisition scenarios Each scenario specifies: • SubDomain • DesiredOutput • MandatoryItems • OptionalItems

  35. Example from the corpus
  <question id="3118">
    <text>buongiorno chiamo da Trento avrei bisogno dell'indirizzo del teatro Auditorium per un concerto di Salvatore Accardo del 17 gennaio 2007</text>
    <analysis>
      <greetings>buongiorno</greetings>
      <assert>chiamo da Trento</assert>
      <indirect>avrei bisogno dell'indirizzo del teatro Auditorium per un concerto di Salvatore Accardo del 17 gennaio 2007</indirect>
    </analysis>
    <reference>
      <ref>
        <speaker>spk075_27mar07comd_it_sid023</speaker>
        <turn>6</turn>
        <originalString>buongiorno chiamo da Trento ho [mmm] avrei bisogno dell'indirizzo del teatro Auditorium per un [eh] concerto di Salvatore Accardo del 17 gennaio 2007 [b]</originalString>
      </ref>
    </reference>
    <translation>good morning I am calling from Trento I would like to know the address of Auditorium theatre for Salvatore Accardo's concert on 17th January 2007</translation>
  </question>
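
An entry like the one above can be read with the Python standard library. The abridged XML string below mirrors the slide's element names; treat it as a sketch of the format, not its full specification.

```python
# Reading one benchmark question entry (abridged from the slide).
import xml.etree.ElementTree as ET

xml = """<question id="3118">
  <text>buongiorno chiamo da Trento avrei bisogno dell'indirizzo del teatro Auditorium</text>
  <analysis>
    <greetings>buongiorno</greetings>
    <assert>chiamo da Trento</assert>
    <indirect>avrei bisogno dell'indirizzo del teatro Auditorium</indirect>
  </analysis>
  <translation>good morning I am calling from Trento I would like to know the address of Auditorium theatre</translation>
</question>"""

q = ET.fromstring(xml)
# Map each speech-act tag in <analysis> to its text segment.
acts = {child.tag: child.text for child in q.find("analysis")}
print(q.get("id"), sorted(acts))  # 3118 ['assert', 'greetings', 'indirect']
```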

  36. Expected Answer Type (1) The semantic category associated with the desired answer, chosen out of a predefined set of labels (e.g. PERSON, LOCATION, DATE). • How many colors are in the Italian flag → QUANTITY • Where is the Uffizi museum → LOCATION Most QA systems described in the literature heavily rely on EAT information, at least in the Answer Extraction phase, to narrow the search space of potential answer candidates.
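
How EAT narrows the answer-extraction search space can be shown in a few lines. The candidate list and its type tags below are invented for illustration; real systems would obtain them from a named-entity tagger.

```python
# Filter answer candidates by the Expected Answer Type.
# Candidates are invented (text, semantic_type) pairs.
candidates = [
    ("Trento", "LOCATION"),
    ("17th January 2007", "DATE"),
    ("three", "QUANTITY"),
    ("via Brennero", "LOCATION"),
]

def filter_by_eat(candidates, eat):
    """Keep only candidates whose semantic type matches the EAT."""
    return [text for text, etype in candidates if etype == eat]

print(filter_by_eat(candidates, "LOCATION"))  # ['Trento', 'via Brennero']
```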

  37. Example from the corpus • What are the address and the telephone number of Venezia hotel in Trento
  <eats>
    <EAT type="FACTOID" sekine="ADDRESS_OTHER" qallme="PostalAddress" eaq="one"/>
    <EAT type="FACTOID" sekine="ADDRESS_OTHER" qallme="Contact" eaq="one"/>
  </eats>

  38. Expected Answer Quantifier Attribute of the EAT that specifies the number of expected items in the answer. • I would like to know the three colors of the Italian flag • which movies are on tonight at Multisala Modena → all The possible values are: one, at least one, all, n.
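
A validator for the EAQ value set is a one-liner. Note one assumption: since "n" stands for an explicit count, the sketch also accepts digit strings; whether the benchmark writes the literal "n" or the number itself is not stated on the slide.

```python
# Validate an Expected Answer Quantifier against the slide's value set
# {one, at least one, all, n}; digit strings are accepted for "n"
# (an assumption about the notation).
VALID_EAQ = {"one", "at least one", "all"}

def valid_eaq(value):
    return value in VALID_EAQ or value.isdigit()

print([valid_eaq(v) for v in ("one", "all", "3", "some")])
# [True, True, True, False]
```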
