Implementation of a QA system in a real context

Carlos Amaral (Priberam, Portugal), Dominique Laurent (Synapse Développement, France). Workshop TellMeMore, November 24, 2006.

Presentation Transcript


  1. Implementation of a QA system in a real context Carlos Amaral (Priberam, Portugal) Dominique Laurent (Synapse Développement, France) Workshop TellMeMore, November 24, 2006, C.Amaral, D.Laurent

  2. 1. The Question-Answering system
     • What is a QA system?
     • A system that extracts one or several answers to a request (a question) from a corpus
     • The issue of the "type of the question"
     • One answer or several, possibly a list, drawn from one or several documents; or an answer of the Yes/No type
     • On a corpus in one or several languages
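
The "type of the question" issue on slide 2 can be illustrated with a toy classifier. This is a minimal sketch, not the authors' method: the rule table, the labels and the function name `question_type` are all assumptions made for illustration, and a surface-pattern approach like this one is exactly the kind the next slide describes as limited.

```python
import re

# Hypothetical cue-to-type rules; the labels are illustrative, not from the slides.
RULES = [
    (re.compile(r"^(is|are|was|were|do|does|did|can|has|have)\b", re.I), "YES_NO"),
    (re.compile(r"^who\b", re.I), "PERSON"),
    (re.compile(r"^when\b", re.I), "DATE"),
    (re.compile(r"^where\b", re.I), "LOCATION"),
    (re.compile(r"^how (many|much)\b", re.I), "QUANTITY"),
    (re.compile(r"^(list|name)\b", re.I), "LIST"),
]


def question_type(question: str) -> str:
    """Return the expected answer type of a question, or "OTHER" if no rule fires."""
    stripped = question.strip()
    for pattern, label in RULES:
        if pattern.search(stripped):
            return label
    return "OTHER"


if __name__ == "__main__":
    for q in ["Who is the author of Hamlet ?", "Is Torun in Poland?"]:
        print(q, "->", question_type(q))
```

The expected answer type decides what the system looks for: a single entity, a list, or a Yes/No judgement.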

  3. 1.1. QA and Language Processing
     • A QA system appears to be a language-processing application par excellence
     • However, certain systems are based solely on pattern matching (cf. Soubotine & Soubotine, TREC 2003)
     • These systems seem to have reached their limits
     • While they can handle everything that is factual, complex questions and queries are far beyond their capabilities
     • The best systems validated at TREC and CLEF are based on automated language processing

  4. 1.2. Our QA system
     • First developed (1999-2001) within a French innovation project (Anvar)
     • Then (end of 2001 to end of 2003) within the European project TRUST (FP5)
     • Currently (2005/06) within the European project M-CAST (FP6)
     • Main features: targets B2B and B2C, multilingual, intensively NLP-based

  5. A modular design
     [Diagram: language modules (Italian, Portuguese, Polish, Czech, English, French); indexing engine; text extraction engine; documents; index; visualization of results]

  7. 1.3. Evaluations of the QA system
     • Professional benchmarking contests and campaigns such as EQueR (2004) and CLEF (2005 and 2006)
     • Evaluations of the French, English, Portuguese and Spanish language modules, in monolingual and multilingual settings

  8. CLEF 2005

  9. CLEF 2006

  10. • In CLEF 2005 and CLEF 2006, the best monolingual engines were our systems for Portuguese and French, and the best multilingual systems were ours for English-French, Portuguese-French, Spanish-Portuguese and Portuguese-Spanish
      • Synapse Développement and Priberam are now partners in the Quaero project

  11. 2. Implementation in the M-CAST project
      • Tests carried out on books in the Czech National Library and the Torun library in Poland
      • Processes several million digitized documents
      • Manages metadata and UDC classification
      • Accommodates questions and answers in English, French, Italian, Portuguese, Polish and Czech
      • Implemented on both library portals

  12. 2.1. Adaptation to digital library resources
      • Scanned texts are of poor quality: a spell checker is used to improve the quality of the documents
      • One book means many pages: multi-part documents are managed during semantic analysis
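
The idea of running a spell checker over poorly scanned text can be sketched as follows. This is a minimal illustration under stated assumptions, not the checker used in M-CAST: the tiny `LEXICON`, the similarity `cutoff` and both function names are hypothetical, and the matching relies on `difflib.get_close_matches` from the Python standard library rather than any linguistic analysis.

```python
import difflib
import re

# Hypothetical lexicon; a real system would use a full dictionary per language.
LEXICON = {"the", "library", "national", "digitized", "documents", "book"}


def correct_token(token: str, lexicon=LEXICON, cutoff: float = 0.8) -> str:
    """Replace a token by its closest lexicon entry, if one is similar enough."""
    lower = token.lower()
    if lower in lexicon or lower.isdigit():
        return token  # known word or plain number: leave untouched
    matches = difflib.get_close_matches(lower, sorted(lexicon), n=1, cutoff=cutoff)
    # Note: the replacement is lowercase, so original capitalization is lost.
    return matches[0] if matches else token


def correct_text(text: str) -> str:
    """Apply correct_token to every word-like run in the text."""
    return re.sub(r"\w+", lambda m: correct_token(m.group(0)), text)


if __name__ == "__main__":
    print(correct_text("the natlonal 1ibrary"))
```

Unknown words with no close lexicon entry, such as proper nouns, are left as they are, which matters for library catalogues full of names and titles.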

  13. 2.2. Integration of Dublin Core document attributes
      • Dublin Core attributes are stored as metadata
      • Example question: Who is the author of Hamlet?
      • The system is adapted to search in metadata
      • The metadata are also used as filters
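
How stored Dublin Core attributes could serve both for answering and for filtering can be sketched like this. It is a minimal sketch under assumptions: the `Document` class, the sample records, the single regular expression and both function names are hypothetical, and only the element names `title`, `creator` and `language` come from Dublin Core itself.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Document:
    text: str
    dc: dict = field(default_factory=dict)  # Dublin Core element -> value


# Hypothetical sample collection.
DOCS = [
    Document("To be, or not to be", {"title": "Hamlet", "creator": "William Shakespeare", "language": "en"}),
    Document("Litwo! Ojczyzno moja!", {"title": "Pan Tadeusz", "creator": "Adam Mickiewicz", "language": "pl"}),
]

AUTHOR_QUESTION = re.compile(r"who (?:is|was) the author of (.+?)\s*\?", re.I)


def answer_from_metadata(question: str, docs):
    """Answer an authorship question from the 'creator' element, if possible."""
    match = AUTHOR_QUESTION.search(question)
    if not match:
        return None
    title = match.group(1).strip().lower()
    for doc in docs:
        if doc.dc.get("title", "").lower() == title:
            return doc.dc.get("creator")
    return None


def filter_by_metadata(docs, **criteria):
    """Keep only the documents whose metadata match every given criterion."""
    return [d for d in docs if all(d.dc.get(k) == v for k, v in criteria.items())]


if __name__ == "__main__":
    print(answer_from_metadata("Who is the author of Hamlet ?", DOCS))
    print([d.dc["title"] for d in filter_by_metadata(DOCS, language="pl")])
```

The same store thus plays two roles: a source of direct answers, and a way to narrow the set of documents before full-text analysis.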

  14. 2.3. Universal Decimal Classification
      • Storage of UDC codes for each document
      • Search through UDC codes
      • Filtering through UDC codes
      • Semantic disambiguation through UDC codes
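
Because UDC numbers are hierarchical, filtering and a crude form of disambiguation can both be sketched as prefix tests. This is an illustration only, not the M-CAST implementation: the function names and the `SENSES` table are hypothetical, and the sketch handles only simple main numbers, ignoring UDC auxiliary signs such as colons and parentheses. It assumes the main classes 52 (astronomy) and 54 (chemistry).

```python
def udc_matches(code: str, prefix: str) -> bool:
    """True if a UDC main number falls under the given class prefix."""
    # Dots only group digits for readability, so compare the digit strings.
    def digits(s: str) -> str:
        return s.replace(".", "")

    return digits(code).startswith(digits(prefix))


def filter_by_udc(codes_by_doc: dict, prefix: str) -> list:
    """Return the ids of documents classified under the given prefix."""
    return [doc_id for doc_id, code in codes_by_doc.items() if udc_matches(code, prefix)]


# Hypothetical sense inventory: word -> {UDC class prefix: sense label}.
SENSES = {"mercury": {"52": "planet", "54": "chemical element"}}


def disambiguate(word: str, doc_code: str):
    """Pick the sense whose UDC class contains the document, if any."""
    for prefix, sense in SENSES.get(word.lower(), {}).items():
        if udc_matches(doc_code, prefix):
            return sense
    return None


if __name__ == "__main__":
    library = {"a": "821.111", "b": "53", "c": "523.4"}
    print(filter_by_udc(library, "82"))
    print(disambiguate("Mercury", "523.4"))
```

The subject class of a document thus acts as context: the same word is read differently in an astronomy book and in a chemistry book.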

  15. Technical architecture

  16. End of presentation. I would appreciate your questions! Thank you - Merci!
