Vocabulary Matching for Book Indexing

Vocabulary Matching for Book Indexing Suggestion in Linked Libraries – A Prototype Implementation & Evaluation. Antoine Isaac, Dirk Kramer, Lourens van der Meij, Shenghui Wang, Stefan Schlobach, Johan Stapel.

Presentation Transcript

Vocabulary Matching for Book Indexing Suggestion in Linked Libraries – A Prototype Implementation & Evaluation

Antoine Isaac, Dirk Kramer, Lourens van der Meij, Shenghui Wang, Stefan Schlobach, Johan Stapel

Problem: subject indexing

  • Describing subjects of books

  • Using concepts from vocabularies (e.g. thesauri)

Problem: re-indexing

  • Describing a book that has already been described

  • With a new vocabulary

    • Fitting a different context (e.g., different libraries)

Why re-indexing at KB?

  • The Dutch National Library (KB) holds many books that are also in other Dutch public libraries

  • KB deposit uses Brinkman thesaurus for indexing

  • Public Libraries use Biblion thesaurus

A wider issue

  • KB shares books with many other libraries

  • All having their own description practices

Room for improvement?

  • Libraries devote large resources to indexing

    • 20 people at KB

    • About 20,000 books per year

  • Leveraging already existing descriptions for re-indexing can be beneficial for both sides

Alignment and re-indexing

  • STITCH project

    • Tackling semantic interoperability in Cultural Heritage

    • Using ontology alignment

  • Mappings between concepts from different vocabularies can be used for re-indexing

    Basic idea: replace concepts in descriptions by conceptually equivalent concepts
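
The basic idea can be sketched in a few lines of Python. This is a minimal illustration, not the STITCH implementation: `MAPPINGS` and `reindex` are hypothetical names, the single mapping reuses the "Persian language" pair that appears later in these slides (treated here as an equivalence purely for illustration), and "Poetry" is a made-up label standing in for an unmapped concept.

```python
# Hypothetical mapping table: source (Biblion) concept -> target (Brinkman) concept.
# A real table would come from an alignment step; this one entry is illustrative.
MAPPINGS = {
    "Persian language": "Iranian language and literature",
}


def reindex(source_concepts, mappings):
    """Replace each source concept by its mapped target concept.

    Concepts without a mapping are returned separately instead of being
    dropped silently, so an indexer can see what was not covered.
    """
    suggested, unmapped = [], []
    for concept in source_concepts:
        if concept in mappings:
            suggested.append(mappings[concept])
        else:
            unmapped.append(concept)
    return suggested, unmapped


# "Poetry" is an invented label with no mapping.
suggested, unmapped = reindex(["Persian language", "Poetry"], MAPPINGS)
print(suggested)  # ['Iranian language and literature']
print(unmapped)   # ['Poetry']
```

Returning the unmapped concepts alongside the suggestions is a design choice of this sketch; it keeps the gaps in the mapping visible.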

Goal: a re-indexing prototype

  • Past: preliminary experiments with KB data

  • Now: building a prototype and

    • plugging it onto the KB production system

    • having it evaluated by its potential users (indexers)

  • Prototype case: Dutch public libraries / KB

    Suggesting Brinkman subjects based on Biblion ones

Alignment and re-indexing: requirements

Subjects can be complex

  • Mappings between groups of concepts

    "Travel guides" + "Spain" → "Spain; travel guides"

Concepts are used in descriptions

  • Mappings taking into account extensional semantics

    "Building engineering"

    → "Learning material ; building engineering"
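
A rule whose left-hand side is a group of concepts can be represented as a set that must be contained in the book's source description. The sketch below assumes this representation; `RULES` and `suggest` are hypothetical names, and the two rules are the examples from this slide, copied verbatim.

```python
# Hypothetical rule table: a set of source concepts -> one target subject.
# Both entries are the examples shown on the slide.
RULES = {
    frozenset({"Travel guides", "Spain"}): "Spain; travel guides",
    frozenset({"Building engineering"}): "Learning material ; building engineering",
}


def suggest(book_concepts, rules):
    """Return the target subject of every rule whose left-hand side
    is fully contained in the book's source concepts."""
    present = set(book_concepts)
    return sorted(target for lhs, target in rules.items() if lhs <= present)


print(suggest(["Spain", "Travel guides"], RULES))
# ['Spain; travel guides']
```

Using `frozenset` keys makes the rule independent of the order in which the source concepts were assigned to the book.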

Obtaining re-indexing rules

  • Lexical alignments are not good enough

  • Probabilistic rules are calculated

    • Using extension of concepts: existing indexing

    • Simple probabilities, with ad hoc adjustment

      "Travel guides", "Spain" → "Spain; travel guides", 0.982

  • Not only based on Biblion subjects

    • AUT – main authors of books

    • KAR – “characteristic”

    • DGP – intellectual level/target group
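
One possible reading of "using extension of concepts" is that rule probabilities are estimated from books already indexed in both vocabularies. The sketch below computes a plain relative frequency: among the books carrying a given combination of source concepts, the fraction that also carry a given target subject. It rests on several assumptions: `estimate_rules`, `max_size`, and the toy records are invented for illustration, and the ad hoc adjustment mentioned on the slide is not specified, so it is not reproduced. Fields such as AUT, KAR, or DGP could be added as extra items in each source set.

```python
from collections import Counter
from itertools import combinations


def estimate_rules(books, max_size=2):
    """Estimate P(target subject | set of source concepts) from dually
    indexed books.

    `books` is an iterable of (source_concepts, target_subjects) pairs.
    Every non-empty subset of a book's source concepts, up to `max_size`
    items, is treated as a candidate rule left-hand side.
    """
    lhs_count = Counter()    # number of books containing the left-hand side
    joint_count = Counter()  # books containing the lhs AND the target subject
    for source, targets in books:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(set(source)), size):
                lhs = frozenset(combo)
                lhs_count[lhs] += 1
                for target in set(targets):
                    joint_count[(lhs, target)] += 1
    return {
        (lhs, target): n / lhs_count[lhs]
        for (lhs, target), n in joint_count.items()
    }


# Toy data, invented for illustration only (not KB figures).
books = [
    (["Travel guides", "Spain"], ["Spain; travel guides"]),
    (["Travel guides", "Spain"], ["Spain; travel guides"]),
    (["Travel guides", "France"], ["France; travel guides"]),
]
rules = estimate_rules(books)
print(rules[(frozenset({"Travel guides", "Spain"}), "Spain; travel guides")])  # 1.0
```

In this toy data the pair rule scores 1.0 while the single-concept rule "Travel guides" → "Spain; travel guides" scores only 2/3, which shows why group rules can be more reliable than rules on individual concepts.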


Doesn't work?

User study

  • Quantitative aspect

    • How well does the tool compare to human subject indexing?

  • Qualitative aspect

    • User satisfaction

    • Suggestions for improvement

Evaluation setting

  • 6 indexers

  • 6 weeks

  • 284 books

  • Evaluation integrated in daily indexing work

  • Pre-evaluation briefing

  • Questionnaire during evaluation

  • Post-evaluation de-briefing & questionnaire

User study results

  • Top ranked mappings are indeed much better

  • Individual book satisfaction level > 70%

User study results (1)

  • But the general satisfaction is lower

    • Only two out of six would use the tool as such

  • Quality of suggestions

    • Lower-level suggestions are often not meaningful

  • Perception of suggestions' quality

    • Long lists with wrong suggestions at the end are bad

    • Ranking is appreciated, but it is not enough

User study results (2)

Suggestions were found promising

  • Bridging the indexing gap between collections

    • Different indexing strategies

      "Persian language" (Biblion)

      vs. "Iranian language and literature" (Brinkman)

Lots of suggestions for improvement

  • More re-indexing!

    • Suggesting concepts from other vocabularies

    • More context metadata as input

Conclusions

  • Shows the potential of re-using data in a library network

  • Alignment approach fitting indexing practice

  • Concrete demonstration, in KB production environment

  • Technology transfer: KB wants to continue efforts

  • Flexibility: architecture ready to exploit other vocabularies

    • Linked data & SKOS

Thank you!

  • Questions?