Vocabulary Matching for Book Indexing
Vocabulary Matching for Book Indexing Suggestion in Linked Libraries – A Prototype Implementation & Evaluation

Antoine Isaac, Dirk Kramer, Lourens van der Meij, Shenghui Wang, Stefan Schlobach, Johan Stapel


Problem: subject indexing

  • Describing subjects of books

  • Using concepts from vocabularies (e.g. thesauri)


Problem: re-indexing

  • Describing a book that has already been described

  • With a new vocabulary

    • Fitting a different context (e.g., different libraries)


Why re-indexing at KB?

  • The Dutch National Library (KB) holds many books that are also in other Dutch public libraries

  • KB deposit uses Brinkman thesaurus for indexing

  • Public Libraries use Biblion thesaurus


A wider issue

  • KB shares books with many other libraries

  • All having their own description practices


Room for improvement?

  • Libraries devote large resources to indexing

    • 20 people at KB

    • About 20,000 books per year

  • Leveraging already existing descriptions for re-indexing can be beneficial for both sides


Alignment and re-indexing

  • STITCH project

    • Tackling semantic interoperability in Cultural Heritage

    • Using ontology alignment

  • Mappings between concepts from different vocabularies can be used for re-indexing

    Basic idea: replace concepts in descriptions

    by conceptually equivalent concepts
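
The basic idea can be sketched in a few lines of Python. This is only an illustration, not the STITCH prototype's code: the function name `reindex` and the mapping table are assumptions, the Persian/Iranian pair is borrowed from an example later in this presentation, and "Poetry" is a hypothetical unmapped concept.

```python
# Hypothetical one-to-one mapping table (Biblion concept -> Brinkman concept).
# The single entry reuses an example pair from the user study slides.
MAPPING = {
    "Persian language": "Iranian language and literature",
}

def reindex(subjects, mapping):
    """Replace each source concept by its mapped target concept, if one exists."""
    suggested = []
    for concept in subjects:
        target = mapping.get(concept)
        # Concepts without a mapping yield no suggestion; duplicates are skipped.
        if target is not None and target not in suggested:
            suggested.append(target)
    return suggested

# "Poetry" is a made-up concept with no mapping, so it produces nothing.
print(reindex(["Persian language", "Poetry"], MAPPING))
```

A plain one-to-one lookup like this cannot handle subjects built from several concepts, which is what the requirements below address.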


Goal: a re-indexing prototype

  • Past: preliminary experiments with KB data

  • Now: building a prototype and

    • plugging it onto the KB production system

    • having it evaluated by its potential users (indexers)

  • Prototype case: Dutch public libraries / KB

    Suggesting Brinkman subjects based on Biblion ones


Alignment and re-indexing: requirements

Subjects can be complex

  • Mappings between groups of concepts

    "Travel guides" + "Spain" → "Spain; travel guides"

Concepts are used in descriptions

  • Mappings taking into account extensional semantics

    "Building engineering"

    → "Learning material ; building engineering"
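
One possible way to represent such mappings is a rule whose left-hand side is a set of source concepts that must all be present in a book's description. The sketch below assumes this representation; the `Rule` type and `matching_rules` function are illustrative names, and "Maps" is a hypothetical extra subject.

```python
from typing import FrozenSet, List, NamedTuple, Set

class Rule(NamedTuple):
    source: FrozenSet[str]  # Biblion concepts that must all be present
    target: str             # Brinkman subject to suggest

# The two rules reuse the examples from the slide above.
RULES = [
    Rule(frozenset({"Travel guides", "Spain"}), "Spain; travel guides"),
    Rule(frozenset({"Building engineering"}),
         "Learning material ; building engineering"),
]

def matching_rules(subjects: Set[str], rules: List[Rule]) -> List[str]:
    """Return the targets of all rules whose source set is contained in subjects."""
    return [rule.target for rule in rules if rule.source <= subjects]

# A book described with both concepts (plus a hypothetical third one) fires the
# first rule; "Spain" alone is not enough.
print(matching_rules({"Travel guides", "Spain", "Maps"}, RULES))
print(matching_rules({"Spain"}, RULES))
```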


Obtaining re-indexing rules

  • Lexical alignments are not good enough

  • Probabilistic rules are calculated

    • Using extension of concepts: existing indexing

    • Simple probabilities, with ad hoc adjustment

      "Travel guides", "Spain" → "Spain; travel guides", 0.982

  • Not only based on Biblion subjects

    • AUT – main authors of books

    • KAR – “characteristic”

    • DGP – intellectual level/target group
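
A rough sketch of how rule probabilities could be estimated from books that already carry both kinds of indexing. It assumes a plain conditional probability, P(target | source concepts), over dually indexed books. The ad hoc adjustment mentioned above is not specified in the slides and is not reproduced here, and only Biblion subjects are used as input (not AUT, KAR or DGP). The function name `learn_rules`, the `max_size` limit and the toy data are all assumptions.

```python
from collections import Counter
from itertools import combinations

def learn_rules(books, max_size=2):
    """Estimate P(target | source set) from dually indexed books.

    books: iterable of (biblion_subjects, brinkman_subjects) pairs of sets.
    Returns a dict mapping (source frozenset, target) to a probability.
    """
    source_count = Counter()  # number of books containing each source set
    pair_count = Counter()    # number of those books that also carry the target
    for biblion, brinkman in books:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(biblion), size):
                source = frozenset(combo)
                source_count[source] += 1
                for target in brinkman:
                    pair_count[(source, target)] += 1
    return {
        (source, target): n / source_count[source]
        for (source, target), n in pair_count.items()
    }

# Toy data, invented for illustration only.
BOOKS = [
    ({"Travel guides", "Spain"}, {"Spain; travel guides"}),
    ({"Travel guides", "Spain"}, {"Spain; travel guides"}),
    ({"Travel guides", "Italy"}, {"Italy; travel guides"}),
    ({"Spain"}, {"Spain; history"}),
]

rules = learn_rules(BOOKS)
both = frozenset({"Travel guides", "Spain"})
print(rules[(both, "Spain; travel guides")])
print(rules[(frozenset({"Spain"}), "Spain; travel guides")])
```

In this toy data the two-concept source set predicts the target more reliably than "Spain" alone, which is the motivation for rules over groups of concepts.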


Demo

Doesn't work?


User study

  • Quantitative aspect

    • How well does the tool compare to human subject indexing?

  • Qualitative aspect

    • User satisfaction

    • Improvement suggestion
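
For the quantitative aspect, one common way to compare suggested subjects with the subjects an indexer assigned is precision and recall per book. The slides do not state which measures were used, so the choice of measures, the function name and the sample labels below are assumptions.

```python
def precision_recall(suggested, human):
    """Compare tool suggestions with human indexing for a single book."""
    suggested, human = set(suggested), set(human)
    correct = suggested & human
    # Guard against empty sets to avoid division by zero.
    precision = len(correct) / len(suggested) if suggested else 0.0
    recall = len(correct) / len(human) if human else 0.0
    return precision, recall

# Placeholder labels: four suggestions, one of which matches the two
# subjects chosen by the indexer.
print(precision_recall(["a", "b", "c", "d"], ["a", "e"]))
```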


Evaluation setting

  • 6 indexers

  • 6 weeks

  • 284 books

  • Evaluation integrated in daily indexing work

  • Pre-evaluation briefing

  • Questionnaire during evaluation

  • Post-evaluation de-briefing & questionnaire


User study results

  • Top ranked mappings are indeed much better

  • Individual book satisfaction level > 70%


User study results (1)

  • But the general satisfaction is lower

    • Only two out of six would use the tool as such

  • Quality of suggestions

    • Lower-level suggestions are often not meaningful

  • Perception of suggestions' quality

    • Long lists with wrong suggestions at the end are bad

    • Ranking is appreciated, but it is not enough


User study results (2)

Suggestions were found promising

  • Bridging the indexing gap between collections

    • Different indexing strategies

      "Persian language" (Biblion)

      vs. "Iranian language and literature" (Brinkman)

Lots of suggestions for improvement

  • More re-indexing!

    • Suggesting concepts from other vocabularies

    • More context metadata as input


Conclusions

  • Shows the potential of re-using data in a library network

  • Alignment approach fitting indexing practice

  • Concrete demonstration, in KB production environment

  • Technology transfer: KB wants to continue efforts

  • Flexibility: architecture ready to exploit other vocabularies

    • Linked data & SKOS


Prototype components


Linked libraries?


Thank you!

  • Questions?


Screenshots


WinIBW production tool


STITCH suggestion tool


Original metadata


Concept suggestions


Comparing with human re-indexing


Complement: lexical alignments


Adding subjects using thesaurus access


Concept suggestions


Saving and back to WinIBW

