
SICS @ CLEF 2004 – Interactive Xling

SICS @ CLEF 2004 – Interactive Xling: Bookmarking, thesaurus, and cooperation in bilingual Q & A. Jussi Karlgren – jussi@sics.se, Preben Hansen – preben@sics.se, Magnus Sahlgren – mange@sics.se.



Presentation Transcript


1. SICS @ CLEF 2004 – Interactive Xling
Bookmarking, thesaurus, and cooperation in bilingual Q & A
Jussi Karlgren – jussi@sics.se
Preben Hansen – preben@sics.se
Magnus Sahlgren – mange@sics.se

Tasks
• Interactive Cross-Language Question & Answering task.
• Bookmarking – introduce a "save" list for a second inspection.
• Collaboration – allow users to help and support each other in accomplishing their task.
• With or without thesaurus (term expansion).

Swedish-French Bilingual Experiments
• Data-driven query translation approach: find correspondences between terms in different languages based on their mutual occurrence in aligned text regions.
• Term weights (N is the total number of documents, n(t) the number of documents containing term t; w_{t,q} = 1 in the experiments).
• All runs morphologically preprocessed at Conexor using the Functional Dependency Grammar parser.

Xling Reading
• People do collaborate when given the possibility.
• People do need help with related or expanded terms.

Search and Inspection Interface

Experiment
• Parallel corpora ("Europarl"), aligned at the sentence level.
• Lemmatization using tools from Conexor.
• Build a bilingual vector space using Random Indexing.
• Translate Swedish queries by extracting the most correlated terms in French.
• Use the "Searcher" retrieval system developed at SICS.
• The thesaurus component was also used for the interactive QA experiment.

Experiment
• 8 participants were grouped into 4 pairs.
• Each participant ran searches according to the i-CLEF setup matrix – 16 i-CLEF queries.
• Tasks were given in Swedish; users had a monolingual French retrieval system to work with.
• Subjects were allowed to communicate during the task.
• 2 systems: with and without term expansion.
• Collection: Le Monde and SDA French from 1994–1995.

Error Analysis
• Most errors are out-of-vocabulary errors.
• Proper names are problematic (e.g. "Tour de France").
• Polysemy is problematic.
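The term-weighting formula itself did not survive the transcript; what remains is that N is the total number of documents, n(t) the number of documents containing term t, and that the query-side weight w_{t,q} was fixed at 1. A minimal sketch assuming a standard tf·idf form of this shape (an assumption, not necessarily the exact formula from the slides):

```python
import math

def idf_weights(doc_freqs, n_docs):
    """Inverse document frequency: rarer terms weigh more.
    Standard idf = log(N / n(t)); assumed here, since the slide's
    exact formula did not survive extraction."""
    return {t: math.log(n_docs / n_t) for t, n_t in doc_freqs.items()}

def score(query_terms, doc_term_freqs, idf):
    """Document score with query-side weights fixed at w_{t,q} = 1,
    as the slides state was done in the experiments."""
    return sum(doc_term_freqs.get(t, 0) * idf.get(t, 0.0)
               for t in query_terms)
```

For example, a term appearing in 1 of 4 documents gets idf log(4), twice the weight of a term appearing in 2 of 4.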
Topic C229: the Swedish word "damm" means both "dust" (not relevant to the query) and "dam" (relevant to the query).

While the term expansion technology worked well for the ad-hoc retrieval translation task, it did not work as well in the interactive monolingual case. In the latter situation, failing to meet user expectations of wide coverage reduced trust, and hence the usefulness of the tools, to nothing.

Collaboration during Search Task
• Subjects actually did collaborate during the search task.
• Collaborative IR activities were observed in 5 categories: topic; search strategies; vocabularies; translation; and system functionalities.
• Collaboration seems to correlate with performance. User pairs tended to have similar results as per the task evaluation metric. This is probably not because of explicit aid given from one user to another, but due to meta-information such as: "can the system cope with this type of question?"

Conclusions
• The approach is very simple, but clearly viable.
• It is efficient, fast, and scalable.
• The quality of the parallel data is decisive for its performance.

[Figure: number of turns (utterances) per query and per user – chart not recoverable from the transcript.]
