CADIAL search engine at INEX


Jure Mijić¹, Marie-Francine Moens², Bojana Dalbelo Bašić¹

¹ Faculty of Electrical Engineering and Computing

[email protected], [email protected]

² Department of Computer Science, Katholieke Universiteit Leuven

[email protected]

INEX 2008, Schloss Dagstuhl Conference Center, Wadern, Germany, 2008-12-16

Presentation overview

  • What is the CADIAL project?

  • System overview

  • Ranking model

  • Ad hoc results

  • Conclusion

  • Future work

What is the CADIAL project?

  • Bilateral project between the Government of Flanders and the Ministry of Science, Education and Sports of the Republic of Croatia

  • Aims of the CADIAL project:

    • Provide access to a collection of Croatian legislative documents

    • Enable the use of the Eurovoc thesaurus, an EU standard thesaurus for document indexing and retrieval

System overview

  • Built with expandability in mind

  • Supports multiple information retrieval models

  • Supports morphological normalization modules

  • An indexer tool is used for document indexing

    • Input documents are in XML format

    • Output is an index database (a base structure for every search engine model)

    • Index database is upgraded with additional data required by the model (various statistical information)
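
The indexing step described above can be illustrated with a minimal sketch. It is not the CADIAL indexer: the function names (`new_index`, `index_document`), the in-memory dictionaries standing in for the index database, and the whitespace tokenizer are all assumptions made for illustration. It reads an XML document, records term frequencies per element, and collects the kind of statistics a ranking model might need later.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def new_index():
    """Create an empty index and statistics store (hypothetical layout)."""
    index = defaultdict(dict)  # term -> {(doc_id, element_path): term frequency}
    stats = {"element_length": {}, "collection_tf": Counter(), "collection_length": 0}
    return index, stats


def index_document(doc_id, xml_text, index, stats):
    """Add every element of one XML document to the index."""
    root = ET.fromstring(xml_text)

    def visit(elem, path):
        # Terms of an element: all text in its subtree, lowercased, split on whitespace.
        terms = " ".join(elem.itertext()).lower().split()
        for term, tf in Counter(terms).items():
            index[term][(doc_id, path)] = tf
        stats["element_length"][(doc_id, path)] = len(terms)
        # Number children per tag so that sibling paths stay unique.
        seen = Counter()
        for child in elem:
            seen[child.tag] += 1
            visit(child, f"{path}/{child.tag}[{seen[child.tag]}]")

    visit(root, f"/{root.tag}[1]")
    # Collection statistics are counted once per document, from the root text.
    for term in " ".join(root.itertext()).lower().split():
        stats["collection_tf"][term] += 1
        stats["collection_length"] += 1
```

A morphological normalization module (for example a stemmer) would, under this sketch, be applied to `terms` before counting.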

Ranking model

  • Language model

    • Element priors based on element location and depth

    • Smoothing on document and collection level

  • Additional features

    • Support for CAS (content-and-structure) queries

    • Support for +/- keyword operators

    • Simple overlapping element removal

    • Stemming
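
Two of the additional features listed above, the +/- keyword operators and overlapping element removal, can be sketched as follows. The slides do not say how either is implemented, so the semantics here are assumptions: a `+` term must occur in a result, a `-` term must not, and overlap removal keeps the highest-scoring element and drops any of its ancestors or descendants in the same document. The function names and the path format are hypothetical.

```python
def parse_query(query):
    """Split a keyword query into required (+), excluded (-) and plain terms."""
    required, excluded, optional = [], [], []
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            optional.append(token)
    return required, excluded, optional


def passes_filters(element_terms, required, excluded):
    """Assumed semantics: every + term must occur, no - term may occur."""
    terms = set(element_terms)
    return all(t in terms for t in required) and not any(t in terms for t in excluded)


def remove_overlap(ranked):
    """Drop elements that contain, or are contained in, a better-scoring element.

    `ranked` is a list of (score, doc_id, path) tuples, with paths such as
    "/article[1]/sec[2]/p[1]" (an assumed format).
    """
    kept = []
    for score, doc_id, path in sorted(ranked, key=lambda r: r[0], reverse=True):
        overlaps = any(
            doc_id == k_doc and (
                path == k_path
                or path.startswith(k_path + "/")   # descendant of a kept element
                or k_path.startswith(path + "/")   # ancestor of a kept element
            )
            for _, k_doc, k_path in kept
        )
        if not overlaps:
            kept.append((score, doc_id, path))
    return kept
```

Comparing paths with a trailing "/" avoids treating `sec[1]` as an ancestor of `sec[10]`.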

Ad hoc results

  • Our runs:

    • Three CO (content-only) runs

      • One returning only documents

      • Two returning elements

    • Three CAS runs with various smoothing factors

Conclusion

  • Retrieving whole documents performed better than element retrieval at higher levels of recall

  • CAS queries performed slightly better than CO queries

  • Higher smoothing at the document level contributed to better performance

Future work

  • Other smoothing techniques

  • Pseudo relevance feedback

  • Incorporating link evidence

  • Information extraction methods

The End

Thank you

Language model
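
The content of this slide did not survive in the transcript. As an illustration only, the sketch below shows one common way to score an element with a language model that is smoothed at the document and collection level and multiplied by an element prior, as the ranking model slide describes. The exact formula, the weight values, and the form of the prior used in CADIAL are not given in the slides; `lm_score`, `lambda_doc`, `lambda_coll` and `prior` are hypothetical names and defaults.

```python
import math


def lm_score(query_terms, elem_tf, elem_len, doc_tf, doc_len, coll_tf, coll_len,
             lambda_doc=0.3, lambda_coll=0.2, prior=1.0):
    """Log-score of one element under a three-level mixture language model.

    Each query term probability is a weighted mix of its relative frequency in
    the element, in the containing document, and in the whole collection.
    `prior` must be positive; it stands in for a location/depth-based prior.
    """
    lambda_elem = 1.0 - lambda_doc - lambda_coll
    score = math.log(prior)
    for term in query_terms:
        p_elem = elem_tf.get(term, 0) / elem_len if elem_len else 0.0
        p_doc = doc_tf.get(term, 0) / doc_len if doc_len else 0.0
        p_coll = coll_tf.get(term, 0) / coll_len if coll_len else 0.0
        p = lambda_elem * p_elem + lambda_doc * p_doc + lambda_coll * p_coll
        if p == 0.0:
            return float("-inf")  # term does not occur anywhere in the collection
        score += math.log(p)
    return score
```

In this sketch, raising `lambda_doc` corresponds to stronger smoothing at the document level, and retrieving whole documents amounts to scoring the root element of each document.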
