The 3rd International Conference on Arabic Natural Language Processing. Three-level approach for Passage Retrieval in Arabic Question/Answering Systems. Lahsen Abouenour (1), Karim Bouzoubaa (1), Paolo Rosso (2). Mohammadia School of Engineers, Rabat, Morocco - May 2009.

Presentation Transcript
slide1

The 3rd International Conference on

Arabic Natural Language Processing

Three-level approach for Passage Retrieval

in Arabic Question/Answering Systems

Lahsen Abouenour1, Karim Bouzoubaa1, Paolo Rosso2

Mohammadia School of Engineers, Rabat, Morocco - May 2009

slide2

Arabic Question/Answering Systems

Classical IR

[Diagram: a numbered flow (1 to 4) linking User Query (keywords), List of documents/links, User Checking, and Answer to User Query]

slide3

Arabic Question/Answering Systems

Question/Answering

[Diagram: a numbered flow (1 to 3) linking User Query (question = keywords + structure), List of documents/links, User Checking, and Answer to User Query]

slide4

Arabic Question/Answering Systems

Existing Arabic Q/A Systems

  • QARAB (based on Al-Raya corpus)
  • AQAS (extracts answers from structured texts only)
  • ArabiQA (deals with factoid questions, embeds an NER module)
  • QASAL (semi-automatic Q/A system for factoid questions)

Three Modules

  • Question Analysis: question type, keywords, named entities
  • Passage Retrieval: candidate passages, passage ranking
  • Answer Extraction: answer identification, answer construction

slide5

Arabic Question/Answering Systems

Challenges of Arabic Q/A Systems

  • short vowels,
  • absence of capital letters,
  • complex morphology,
  • etc.
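
To make the short-vowel challenge concrete, here is a minimal sketch (not part of the presented system) showing that a vocalized and an unvocalized spelling of the same word are different strings until the vowel signs are removed. The helper name `strip_short_vowels` is hypothetical, and dropping every combining mark is an assumed simplification of real Arabic normalization.

```python
import unicodedata

def strip_short_vowels(text):
    """Drop combining marks (Unicode category Mn), which include the Arabic short-vowel signs."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

# The word for "city", written with and without short-vowel signs.
vocalized = "\u0645\u064e\u062f\u0650\u064a\u0646\u064e\u0629"
plain = "\u0645\u062f\u064a\u0646\u0629"

print(vocalized == plain)                      # False: the raw strings differ
print(strip_short_vowels(vocalized) == plain)  # True: equal after normalization
```

Without such a step, a keyword taken from the question may fail to match the same word in a passage simply because one of them carries vowel signs.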
slide6

Arabic Question/Answering Systems

Question/Answering

User Query (question = keywords+structure)

1

Natural Language (أين توجد مدينة مراكش ؟ | Where is the city of Marrakech ?)

-- Keywords : Where | is | the | city | of | Marrakech

أين | توجد | مدينة | مراكش

-- Structure : the same keywords can form different questions

أين توجد مدينة مراكش ؟ (Where is the city of Marrakech ?)

هل مراكش مدينة ؟ (Is Marrakech a city ?)
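
The difference between keywords and structure can be sketched with the English glosses of the two questions: they share keywords but no ordered word pair. This is only an illustration; the helper names `tokens` and `bigrams` are hypothetical, and ordered word pairs are an assumed stand-in for "structure".

```python
def tokens(text):
    """Lowercase, drop the question mark, split on whitespace."""
    return text.lower().replace("?", " ").split()

def bigrams(words):
    """Set of ordered pairs of adjacent words."""
    return set(zip(words, words[1:]))

q1 = tokens("Where is the city of Marrakech ?")
q2 = tokens("Is Marrakech a city ?")

shared_keywords = set(q1) & set(q2)
shared_bigrams = bigrams(q1) & bigrams(q2)

print(sorted(shared_keywords))  # ['city', 'is', 'marrakech']
print(shared_bigrams)           # set()
```

A purely keyword-based comparison treats the two questions as close, while a comparison that respects word order separates them.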

slide7

Arabic Question/Answering Systems

Question/Answering

Passage Retrieval

(أين توجد مدينة مراكش ؟ | Where is the city of Marrakech ?)

Passage 1 (no answer)

Xxxxx مراكش (Marrakech) xxxxxx xx xxx xxxx

Xx xxx xxxxx xxx xxxx xxx xxxx

Xxxxx مدينة (city) xxxxx xx xxx توجد (exists in) xxx

Passage N (the answer)

المغرب (Morocco) xxx يوجد إقليم مراكش (the region of Marrakech exists in) xxx Xx xxx xxxxx xxx xxxx xxx xxxx

Xxxxx xx xxxxx xx xxx xx xxx

slide8

Arabic Question/Answering Systems

Question/Answering

Passage Retrieval

(أين توجد مدينة مراكش ؟ | Where is the city of Marrakech ?)

Passage 1

Xxxxx مراكش (Marrakech) xxxxxx xx xxx xxxx Xx xxx xxxxx xxx xxxx xxx xxxx

Xxxxx مدينة (city) xxxxx xx xxx توجد (exists in) xxx

Keywords: توجد | مراكش | مدينة (Is in | Marrakech | city)

Passage N

المغرب (Morocco) xxx يوجد إقليم مراكش (the region of Marrakech exists in) xxx Xx xxx xxxxx xxx xxxx xxx xxxx

Xxxxx xx xxxxx xx xxx xx xxx

Keywords: يوجد | مراكش | إقليم (Is in | Marrakech | region)

Relations between the two keyword sets:

  • Morphological relation: توجد / يوجد
  • Hyponymy/semantic relation: مدينة (city) / إقليم (region)

slide9

Arabic Question/Answering Systems

Question/Answering

Passage Retrieval

(أين توجد مدينة مراكش ؟ | Where is the city of Marrakech ?)

Passage 1

Xxxxx مراكش xxxxxx xx xxx xxxx

Xx xxx xxxxx xxx xxxx xxx xxxx

Xxxxx مدينة xxxxx xx xxx توجد xxx

Vs

Passage N

المغرب xxx يوجد إقليم مراكش xxx

Xx xxx xxxxx xxx xxxx xxx xxxx

Xxxxx xx xxxxx xx xxx xx xxx

With respect to Morphological and Semantic Relations: relevance(P1) = relevance(PN)

What about the question structure ?

slide10

Arabic Question/Answering Systems

Question/Answering

Passage Retrieval

(أين توجد مدينة مراكش ؟ | Where is the city of Marrakech ?)

Question: أين توجد مدينة مراكش ؟ (Where is the city of Marrakech ?)

Expected Answer: توجد مدينة مراكش في (The city of Marrakech is in …)

[Diagram: the expected-answer structure set against the Passage 1 structures and the Passage N structures]

slide11

Arabic Question/Answering Systems

Our Passage Retrieval Approach : Presentation

Levels

Semantic Query Expansion (extending the list of keywords related to the user question)

Keyword-based level (candidate passages with related keywords)

Structure-based level (candidate passages with related structure)

Semantic reasoning level (comparing CG representations)

slide12

Arabic Question/Answering Systems

Our Passage Retrieval Approach : Presentation

Resources & Tools

Semantic Query Expansion (Arabic WordNet, Amine Platform)

Keyword-based PR (Yahoo API)

Structure-based PR (The Java Information Retrieval System - JIRS)

Semantic reasoning level (Amine Platform)

slide13

Arabic Question/Answering Systems

Our Passage Retrieval Approach : Presentation

Semantic Query Expansion

Ontology

  • AWN (Arabic WordNet) is a free lexical resource
  • AWN contains more than 20,000 Arabic words grouped into synsets
  • AWN is connected with SUMO (Suggested Upper Merged Ontology)
  • SUMO has about 2,000 general concepts
  • SUMO defines many relations between concepts (hyponymy, hypernymy, ...)
slide14

Arabic Question/Answering Systems

Our Passage Retrieval Approach : Presentation

Semantic Query Expansion

Amine Platform

  • Amine is a multi-layer platform dedicated to the development of Intelligent Systems and Multi-Agent Systems
  • Amine is an Open Source platform
  • Amine is a 100% Java implementation
  • Amine provides a set of operations related to ontologies
slide15

Arabic Question/Answering Systems

Our Passage Retrieval Approach : Presentation

Semantic Query Expansion

[Diagram: the Arabic WordNet content, structure, and link with SUMO are loaded into a temporary database (MySQL); a Java program using the Amine Platform API produces the Amine AWN ontology]

slide16

Arabic Question/Answering Systems

Our Passage Retrieval Approach : Presentation

Semantic Query Expansion

slide17

Arabic Question/Answering Systems

Our Passage Retrieval Approach : Presentation

Semantic Query Expansion

[Diagram: Concept/Term, Global Expansion, Morphological Expansion, AAWN Ontology Expansion]

AAWN Ontology Expansion:

  • 1 - By synonyms
  • 2 - By supertypes
  • 3 - By definition
  • 4 - By subtypes
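
The four ontology expansion steps can be sketched over a toy dictionary. This is a hedged illustration only: `TOY_ONTOLOGY`, `expand_term`, and `expand_query` are hypothetical names, the entries are invented sample data, and nothing here calls Arabic WordNet or the Amine Platform API. Treating "by definition" as adding terms taken from a concept's definition is also an assumption.

```python
# Invented sample data; a real system would query the AWN ontology instead.
TOY_ONTOLOGY = {
    "city": {
        "synonyms": ["town"],
        "supertypes": ["region"],
        "definition": ["large", "settlement"],
        "subtypes": ["capital"],
    },
}

# Same order as the four steps listed on the slide.
EXPANSION_ORDER = ("synonyms", "supertypes", "definition", "subtypes")

def expand_term(term, ontology=TOY_ONTOLOGY):
    """Return the term followed by its related terms, without duplicates."""
    expanded = [term]
    entry = ontology.get(term, {})
    for relation in EXPANSION_ORDER:
        for candidate in entry.get(relation, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded

def expand_query(keywords, ontology=TOY_ONTOLOGY):
    """Map each question keyword to its expanded list of terms."""
    return {kw: expand_term(kw, ontology) for kw in keywords}

print(expand_query(["city", "marrakech"]))
```

Keywords with no ontology entry, such as a proper noun, are passed through unchanged, so the expanded query never loses an original term.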

slide18

Arabic Question/Answering Systems

Our Passage Retrieval Approach : Presentation

Structure-based PR

The Java Information Retrieval System (JIRS)

  • a language-independent PR system
  • adapted for many non-agglutinative European languages (English, French, Spanish, Italian, ...)
  • adapted for the Arabic language
  • re-ranking of the retrieved passages is based on a distance density n-gram model

URL : http://sourceforge.net/projects/jirs/
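
The idea of ranking passages by question structure can be sketched as follows. This is a deliberately simplified stand-in, not the distance density n-gram model used by JIRS: it scores a passage by the longest contiguous run of question terms it contains. The function names and the sample sentences are hypothetical.

```python
def longest_shared_ngram(question_terms, passage_terms):
    """Length of the longest contiguous run of question terms found in the passage."""
    best = 0
    # prev[j]: length of the common run ending at the previous question term
    # and at passage_terms[j - 1].
    prev = [0] * (len(passage_terms) + 1)
    for q_term in question_terms:
        curr = [0] * (len(passage_terms) + 1)
        for j, p_term in enumerate(passage_terms, start=1):
            if q_term == p_term:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best

def structure_score(question, passage):
    """Longest shared run divided by the question length (0.0 for an empty question)."""
    q = question.lower().split()
    p = passage.lower().split()
    return longest_shared_ngram(q, p) / len(q) if q else 0.0

expected = "the city of marrakech is in"
passage_1 = "marrakech has markets and every city is in a region"
passage_n = "the city of marrakech is in morocco"

print(structure_score(expected, passage_1))  # 2 of 6 terms in a row
print(structure_score(expected, passage_n))  # 1.0
```

Both passages contain the question keywords, but only the second preserves their order, which is the property a structure-based level is meant to reward.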

slide19

Arabic Question/Answering Systems

Our Passage Retrieval Approach : Evaluation Process

[Diagram: CLEF and TREC questions go through (1) a manual process and (2) an automatic process. The keyword-based level uses Google and Yahoo and the structure-based level uses JIRS, each combined with Semantic QE.]

slide20

Arabic Question/Answering Systems

Our Passage Retrieval Approach : Evaluation Process

The Questions

  • a set of 82 of the CLEF and TREC questions
  • factoid questions seeking named entities (NE)
  • significant coverage : questions classified into different domains
slide21

Arabic Question/Answering Systems

Our Passage Retrieval Approach : Evaluation Process

Keyword-based evaluation

  • Accuracy and MRR improved after using semantic QE

slide22

Arabic Question/Answering Systems

Our Passage Retrieval Approach : Evaluation Process

Structure-based evaluation

  • Accuracy and MRR improved after using semantic QE
  • Compared to the keyword-based PR, the structure-based PR gives the best Accuracy and MRR
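
For readers unfamiliar with the two measures, here is a minimal sketch of how they can be computed from ranked results. The slides do not define the measures, so two assumptions are made: accuracy counts the questions whose first-ranked passage is correct, and MRR is the standard mean reciprocal rank of the first correct passage (0 when none is found). The function names and the sample ranks are hypothetical.

```python
def reciprocal_rank(rank):
    """1/rank of the first correct passage, or 0.0 when no correct passage was returned."""
    return 1.0 / rank if rank else 0.0

def evaluate(first_correct_ranks):
    """Return (accuracy, MRR) for a list of ranks; None marks an unanswered question."""
    n = len(first_correct_ranks)
    accuracy = sum(1 for r in first_correct_ranks if r == 1) / n
    mrr = sum(reciprocal_rank(r) for r in first_correct_ranks) / n
    return accuracy, mrr

# Invented ranks for four questions: correct at rank 1, rank 2, never, rank 4.
print(evaluate([1, 2, None, 4]))  # (0.25, 0.4375)
```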

slide23

Arabic Question/Answering Systems

Our Passage Retrieval Approach : Evaluation Process

Summary

Accuracy (Acc.) and MRR without and with Semantic Query Expansion (QE):

  • Keyword-based PR: without QE, Acc. 1.22%, MRR 0.99; with QE, Acc. 7.32%, MRR 3.25
  • Structure-based PR: without QE, Acc. 15.85%, MRR 5.46; with QE, Acc. 19.51%, MRR 7.85

slide24

Arabic Question/Answering Systems

Our Passage Retrieval Approach : The semantic reasoning level

Presentation

[Diagram: the Question yields an Expected Answer, represented as a conceptual graph CG-EA. Each passage Pi (P1 ... Pi) is reduced to a sub passage and represented as a conceptual graph CGi. Generalization(CG-Pi, CG-EA) produces Semantic score (pi).]

slide25

Arabic Question/Answering Systems

Our Passage Retrieval Approach : The semantic reasoning level

Example

TREC question: أين تقع أعلى نقطة على سطح الأرض؟

(Where is the highest point on the surface of the earth?)

>> Using Google Search Engine

slide26

Arabic Question/Answering Systems

Our Passage Retrieval Approach : The semantic reasoning level

Example

TREC question: أين تقع أعلى نقطة على سطح الأرض؟

(Where is the highest point on the surface of the earth?)

>> Passages Ranks after LEVEL 1 (Keyword-based) and LEVEL 2 (Structure-based)

slide27

Arabic Question/Answering Systems

Our Passage Retrieval Approach : The semantic reasoning level

Example

TREC question: أين تقع أعلى نقطة على سطح الأرض؟

(Where is the highest point on the surface of the earth?)

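The expected answer is: تقع أعلى نقطة على سطح الأرض في ... (The highest point on the surface of the earth is located in ...)

  • CG-EA : [نقطة]-
      • -attr->[أعلى],
      • -ala->[الأرض],
      • <-agnt-[تقع]-fi->[مفهوم عام]
  • Gloss: نقطة = point, أعلى = highest, الأرض = the earth, تقع = is located, مفهوم عام = general concept; ala = on, fi = in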
slide28

Arabic Question/Answering Systems

Our Passage Retrieval Approach : The semantic reasoning level

Example

TREC question: أين تقع أعلى نقطة على سطح الأرض؟

(Where is the highest point on the surface of the earth?)

Semantic Score Formula

\[
\mathrm{SemanticScore}(P) = \frac{\sum_{c_i \in C} \mathrm{weight}(c_i)\,\beta(c_i, \pi(c_i))}{\sum_{c_i \in C} \mathrm{weight}(c_i)}
\]
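
The formula is a weighted average, which the following sketch makes explicit. The slides do not define β and π, so this assumes that π(ci) is the passage-side counterpart of concept ci and that β returns a similarity between 0 and 1. The function name `semantic_score` and the weights and β values below are hypothetical.

```python
def semantic_score(concepts):
    """Weighted average of beta values.

    concepts: list of (weight, beta) pairs, one per concept c_i in C,
    where beta stands for the assumed similarity beta(c_i, pi(c_i)).
    """
    total_weight = sum(weight for weight, _ in concepts)
    if total_weight == 0:
        return 0.0
    return sum(weight * beta for weight, beta in concepts) / total_weight

# Invented values: one heavily weighted exact match, one partial match, one miss.
print(semantic_score([(2.0, 1.0), (1.0, 0.5), (1.0, 0.0)]))  # 0.625
```

Dividing by the total weight keeps the score between 0 and 1 whenever every β value is, so scores stay comparable across passages regardless of how many concepts C contains.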

slide29

Conclusion & Future Work

  • The keyword-based and structure-based levels of our Arabic PR approach have improved the Accuracy and the MRR in the context of Q/A systems
  • A semantic reasoning level on top of the first and second levels could further improve the performance achieved
  • Covering all CLEF and TREC questions
  • Automating the semantic reasoning level module
  • Conducting corresponding experiments
  • Integrating more enriched releases of Arabic WordNet