CADIAL search engine at INEX

ITI 2008, Cavtat, 2008-06-25

Jure Mijić¹, Marie-Francine Moens², Bojana Dalbelo Bašić¹

¹ Faculty of Electrical Engineering and Computing

[email protected], [email protected]

² Department of Computer Science, Katholieke Universiteit Leuven

sien.moens@cs.kuleuven.be

INEX 2008, Schloss Dagstuhl Conference Center, Wadern, Germany, 2008-12-16


Presentation overview

  • What is the CADIAL project?

  • System overview

  • Ranking model

  • Ad hoc results

  • Conclusion

  • Future work


What is the CADIAL project?

  • Bilateral project between the Government of Flanders and the Ministry of Science, Education and Sports of the Republic of Croatia

  • Aims of the CADIAL project:

    • Provide access to a collection of Croatian legislative documents

    • Enable the use of the Eurovoc thesaurus, an EU standard thesaurus for document indexing and retrieval


System overview

  • Built with expandability in mind

  • Supports multiple information retrieval models

  • Supports morphological normalization modules

  • An indexer tool is used for document indexing

    • Input documents are in XML format

    • Output is an index database (a base structure for every search engine model)

    • The index database is then extended with additional data required by the model (various statistical information)
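
The two indexing stages above can be illustrated with a minimal sketch. It is not the CADIAL indexer: the function names `index_document` and `add_collection_statistics`, the in-memory dictionary standing in for the index database, and the whitespace tokenization are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def index_document(doc_id, xml_text, index):
    """Stage 1: add every element of one XML document to the index.

    `index` (a plain dict, standing in for the index database) maps
    (doc_id, element path) to a Counter of term counts.
    """
    root = ET.fromstring(xml_text)

    def visit(elem, path):
        # Term counts cover all text nested inside the element.
        text = " ".join(elem.itertext())
        index[(doc_id, path)] = Counter(text.lower().split())
        # Number children per tag so paths look like /article[1]/sec[2].
        seen = defaultdict(int)
        for child in elem:
            seen[child.tag] += 1
            visit(child, f"{path}/{child.tag}[{seen[child.tag]}]")

    visit(root, f"/{root.tag}[1]")


def add_collection_statistics(index):
    """Stage 2: derive collection-wide term counts for a ranking model.

    Only document root elements are summed, so text inside nested
    elements is not counted twice.
    """
    totals = Counter()
    for (_doc_id, path), counts in index.items():
        if path.count("/") == 1:  # root element of a document
            totals.update(counts)
    return totals


# Hypothetical two-section document used only as a demonstration.
index = {}
index_document(
    "d1",
    "<article><sec>search engine</sec><sec>language model search</sec></article>",
    index,
)
collection_counts = add_collection_statistics(index)
```

Keeping the base index separate from the model-specific statistics mirrors the slide's point that the same index can serve several retrieval models.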


Ranking model

  • Language model

    • Element priors based on element location and depth

    • Smoothing on document and collection level

  • Additional features

    • Support for CAS queries

    • Support for +/- keyword operators

    • Simple overlapping element removal

    • Stemming
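
As a rough illustration of a language model with an element prior and smoothing at the document and collection levels, the sketch below scores an element by query likelihood. The slides do not give the exact formula, so the linear interpolation, the weights `lambda_doc` and `lambda_coll`, and the depth-only prior are assumptions (the location component of the prior is omitted).

```python
import math
from collections import Counter


def term_prob(term, counts):
    """Maximum-likelihood probability of `term` under a bag of term counts."""
    total = sum(counts.values())
    return counts[term] / total if total else 0.0


def score_element(query_terms, elem_counts, doc_counts, coll_counts,
                  depth, lambda_doc=0.3, lambda_coll=0.2):
    """Log query likelihood of an element, smoothed with the language
    models of its containing document and of the whole collection."""
    lambda_elem = 1.0 - lambda_doc - lambda_coll
    # Hypothetical prior: shallower elements are preferred.
    log_score = math.log(1.0 / (1.0 + depth))
    for term in query_terms:
        p = (lambda_elem * term_prob(term, elem_counts)
             + lambda_doc * term_prob(term, doc_counts)
             + lambda_coll * term_prob(term, coll_counts))
        if p == 0.0:
            return float("-inf")  # term does not occur in the collection
        log_score += math.log(p)
    return log_score


# Toy counts: one document with two sections, which is also the collection.
doc = Counter({"search": 2, "engine": 1, "language": 1, "model": 1})
sec1 = Counter({"search": 1, "engine": 1})
sec2 = Counter({"language": 1, "model": 1, "search": 1})

s1 = score_element(["engine"], sec1, doc, doc, depth=1)
s2 = score_element(["engine"], sec2, doc, doc, depth=1)
```

In this toy case the second section does not contain "engine" at all, yet it still receives a finite score because the document and collection models contribute probability mass. Raising `lambda_doc` shifts more weight to the containing document.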


Ad hoc results

  • Our runs:

    • Three CO runs

      • One returning only documents

      • Two returning elements

    • Three CAS runs with various smoothing factors
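
In runs that return elements, an element and one of its ancestors can both appear in the ranking. One simple way to remove such overlap is sketched below, assuming elements are identified by XPath-like path strings and that the higher-scoring element wins; the function name `remove_overlap` and this greedy strategy are illustrative assumptions, not necessarily what the CADIAL engine does.

```python
def remove_overlap(ranked):
    """Drop elements that overlap a higher-ranked element.

    `ranked` is a list of (doc_id, path, score) tuples sorted by
    descending score. Two elements overlap when they belong to the
    same document and one path is an ancestor of (or equal to) the other.
    """
    kept = []
    for doc_id, path, score in ranked:
        overlaps = any(
            d == doc_id
            and (path == p
                 or path.startswith(p + "/")
                 or p.startswith(path + "/"))
            for d, p, _ in kept
        )
        if not overlaps:
            kept.append((doc_id, path, score))
    return kept


# Hypothetical ranking with made-up scores.
ranked = [
    ("d1", "/article[1]/sec[1]", -1.0),
    ("d1", "/article[1]", -1.5),         # ancestor of the kept sec[1]
    ("d1", "/article[1]/sec[2]", -2.0),  # sibling, no overlap
    ("d2", "/article[1]", -2.5),         # different document
]
result = remove_overlap(ranked)
```

Comparing against `p + "/"` rather than the bare prefix keeps `sec[1]` from being treated as an ancestor of `sec[10]`.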


Conclusion

  • Retrieving whole documents performed better than element retrieval at higher levels of recall

  • CAS queries performed slightly better than CO queries

  • Higher smoothing at the document level contributed to better performance


Future work

  • Other smoothing techniques

  • Pseudo relevance feedback

  • Incorporating link evidence

  • Information extraction methods


The End

Thank you

