Vocabulary Matching for Book Indexing
Antoine Isaac, Dirk Kramer, Lourens van der Meij, Shenghui Wang, Stefan Schlobach, Johan Stapel

Vocabulary Matching for Book Indexing Suggestion in Linked Libraries – A Prototype Implementation & Evaluation

Antoine Isaac, Dirk Kramer, Lourens van der Meij, Shenghui Wang, Stefan Schlobach, Johan Stapel

Problem: subject indexing

  • Describing subjects of books

  • Using concepts from vocabularies (e.g. thesauri)

Problem: re-indexing

  • Describing a book that has already been described

  • With a new vocabulary

    • Fitting a different context (e.g., different libraries)

Why re-indexing at KB?

  • The Dutch National Library (KB) holds many books that are also in other Dutch public libraries

  • KB deposit uses Brinkman thesaurus for indexing

  • Public Libraries use Biblion thesaurus

A wider issue

  • KB shares books with many other libraries

  • All having their own description practices

Room for improvement?

  • Libraries devote large resources to indexing

    • 20 people at KB

    • About 20,000 books per year

  • Leveraging already existing descriptions for re-indexing can be beneficial for both sides

Alignment and re-indexing

  • STITCH project

    • Tackling semantic interoperability in Cultural Heritage

    • Using ontology alignment

  • Mappings between concepts from different vocabularies can be used for re-indexing

    Basic idea: replace concepts in descriptions

    by conceptually equivalent concepts
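The basic idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mapping table, the `biblion:`/`brinkman:` prefixes, and the concept labels are invented placeholders, not entries from the actual alignment, and `reindex` is a hypothetical helper name.

```python
# Hypothetical one-to-one mapping table (invented labels, not real thesaurus entries).
MAPPING = {
    "biblion:Gardening": "brinkman:Horticulture",
}

def reindex(subjects, mapping):
    """Replace each source concept by its mapped target concept, if one exists."""
    suggested = []
    for concept in subjects:
        target = mapping.get(concept)
        # Concepts without a mapping are skipped; duplicates are suggested once.
        if target is not None and target not in suggested:
            suggested.append(target)
    return suggested

suggestions = reindex(["biblion:Gardening", "biblion:Unmapped"], MAPPING)
```

Concepts without an equivalent simply yield no suggestion here, which is one reason plain one-to-one replacement may not be sufficient on its own.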

Goal: a re-indexing prototype

  • Past: preliminary experiments with KB data

  • Now: building a prototype and

    • plugging it onto the KB production system

    • having it evaluated by its potential users (indexers)

  • Prototype case: Dutch public libraries / KB

    Suggesting Brinkman subjects based on Biblion ones

Alignment and re-indexing: requirements

Subjects can be complex

  • Mappings between groups of concepts

    "Travel guides" + "Spain" → "Spain; travel guides"

Concepts are used in descriptions

  • Mappings taking into account extensional semantics

    "Building engineering"

    → "Learning material ; building engineering"
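One possible way to represent such mappings is as rules whose left-hand side is a set of source concepts. The sketch below uses the two examples from this slide; the rule representation and the `apply_rules` helper are assumptions made for illustration, not the prototype's actual data model.

```python
# Each rule maps a set of source concepts to one target subject.
# The two rules are the examples given on the slide.
RULES = [
    (frozenset({"Travel guides", "Spain"}), "Spain; travel guides"),
    (frozenset({"Building engineering"}), "Learning material ; building engineering"),
]

def apply_rules(subjects, rules):
    """Return the targets of all rules whose source concepts all occur in `subjects`."""
    present = set(subjects)
    return [target for source, target in rules if source <= present]

suggestions = apply_rules(["Spain", "Travel guides", "History"], RULES)
```

A rule fires only when every concept on its left-hand side is present, so "Spain" alone would not trigger the combined subject.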

Obtaining re-indexing rules

  • Lexical alignments are not good enough

  • Probabilistic rules are calculated

    • Using extension of concepts: existing indexing

    • Simple probabilities, with ad hoc adjustment

      "Travel guides", "Spain" → "Spain; travel guides", 0.982

  • Not only based on Biblion subjects

    • AUT – main authors of books

    • KAR – “characteristic”

    • DGP – intellectual level/target group
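A rough sketch of how such rules might be derived from books that are already indexed in both vocabularies is shown below. It assumes the confidence of a rule is the plain conditional frequency of the target subject given the source concepts; the ad hoc adjustment mentioned on the slide is not specified there and is not reproduced. The function name `learn_rules`, the `max_size` parameter, and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def learn_rules(books, max_size=2):
    """Estimate rule confidences from dually indexed books.

    books: list of (source_subjects, target_subjects) pairs.
    Returns (source_combo, target, confidence) tuples, best first.
    """
    source_count = Counter()  # books containing a given source combination
    pair_count = Counter()    # books containing the combination AND the target
    for source, target in books:
        items = sorted(set(source))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                source_count[combo] += 1
                for t in set(target):
                    pair_count[(combo, t)] += 1
    rules = [(combo, t, n / source_count[combo])
             for (combo, t), n in pair_count.items()]
    rules.sort(key=lambda rule: rule[2], reverse=True)
    return rules

# Toy data, invented for illustration only.
books = [
    (["Travel guides", "Spain"], ["Spain; travel guides"]),
    (["Travel guides", "Spain"], ["Spain; travel guides"]),
    (["Travel guides", "France"], ["France; travel guides"]),
]
rules = learn_rules(books)
```

Values from the other fields (AUT, KAR, DGP) could in principle be added to the source side as extra features, for example as prefixed strings, but how the prototype combines them is not described in the slides.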

Doesn't work?

User study

  • Quantitative aspect

    • How well does the tool compare to human subject indexing?

  • Qualitative aspect

    • User satisfaction

    • Improvement suggestions
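For the quantitative aspect, one common way to compare suggested subjects with those assigned by a human indexer is precision and recall per book. The slides do not state which measures the study used, so the sketch below is an assumption, and `precision_recall` is a hypothetical helper.

```python
def precision_recall(suggested, assigned):
    """Compare tool suggestions with the subjects a human indexer assigned to one book."""
    suggested, assigned = set(suggested), set(assigned)
    correct = suggested & assigned
    # Guard against empty sets to avoid division by zero.
    precision = len(correct) / len(suggested) if suggested else 0.0
    recall = len(correct) / len(assigned) if assigned else 0.0
    return precision, recall

# Placeholder subject labels, invented for illustration.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "b", "e"])
```

Precision drops when a long list contains many wrong suggestions, while recall drops when the indexer's subjects are missing from the list.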

Evaluation setting

  • 6 indexers

  • 6 weeks

  • 284 books

  • Evaluation integrated in daily indexing work

  • Pre-evaluation briefing

  • Questionnaire during evaluation

  • Post-evaluation de-briefing & questionnaire

User study results

  • Top ranked mappings are indeed much better

  • Individual book satisfaction level > 70%

User study results (1)

  • But the general satisfaction is lower

    • Only two out of six would use the tool as such

  • Quality of suggestions

    • Lower-level suggestions are often not meaningful

  • Perception of suggestions' quality

    • Long lists with wrong suggestions at the end are bad

    • Ranking is appreciated, but it is not enough

User study results (2)

Suggestions were found promising

  • Bridging the indexing gap between collections

    • Different indexing strategies

      "Persian language" (Biblion)

      vs. "Iranian language and literature" (Brinkman)

Lots of suggestions for improvement

  • More re-indexing!

    • Suggesting concepts from other vocabularies

    • More context metadata as input



  • Shows the potential of re-using data in a library network

  • Alignment approach fitting indexing practice

  • Concrete demonstration, in KB production environment

  • Technology transfer: KB wants to continue efforts

  • Flexibility: architecture ready to exploit other vocabularies

    • Linked data & SKOS

Prototype components

Linked libraries?

Thank you!

  • Questions?



WinIBW production tool

STITCH suggestion tool

Original metadata

Concept suggestions

Comparing with human re-indexing

Complement: lexical alignments

Adding subjects using thesaurus access

Concept suggestions

Saving and back to WinIBW


