Vocabulary Matching for Book Indexing

Antoine Isaac, Dirk Kramer, Lourens van der Meij, Shenghui Wang, Stefan Schlobach, Johan Stapel





Presentation Transcript


Vocabulary Matching for Book Indexing Suggestion in Linked Libraries – A Prototype Implementation & Evaluation

Antoine Isaac, Dirk Kramer, Lourens van der Meij, Shenghui Wang, Stefan Schlobach, Johan Stapel


Problem: subject indexing

  • Describing subjects of books

  • Using concepts from vocabularies (e.g. thesauri)


Problem: re-indexing

  • Describing a book that has already been described

  • With a new vocabulary

    • Fitting a different context (e.g., different libraries)


Why re-indexing at KB?

  • The Dutch National Library (KB) holds many books that are also in other Dutch public libraries

  • KB deposit uses Brinkman thesaurus for indexing

  • Public Libraries use Biblion thesaurus


A wider issue

  • KB shares books with many other libraries

  • All having their own description practices


Room for improvement?

  • Libraries devote large resources to indexing

    • 20 people at KB

    • About 20,000 books per year

  • Leveraging already existing descriptions for re-indexing can be beneficial for both sides


Alignment and re-indexing

  • STITCH project

    • Tackling semantic interoperability in Cultural Heritage

    • Using ontology alignment

  • Mappings between concepts from different vocabularies can be used for re-indexing

    Basic idea: replace concepts in descriptions by conceptually equivalent concepts


Goal: a re-indexing prototype

  • Past: preliminary experiments with KB data

  • Now: building a prototype and

    • plugging it onto the KB production system

    • having it evaluated by its potential users (indexers)

  • Prototype case: Dutch public libraries / KB

    Suggesting Brinkman subjects based on Biblion ones


Alignment and re-indexing: requirements

Subjects can be complex

  • Mappings between groups of concepts

    "Travel guides" + "Spain" → "Spain; travel guides"

Concepts are used in descriptions

  • Mappings taking into account extensional semantics

    "Building engineering" → "Learning material ; building engineering"
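
A mapping between groups of concepts can be pictured as a rule whose source side is a set of concepts that must all appear in a book's description. The sketch below is only an illustration of that idea, not the prototype's code: the `Rule` type and the `suggest` function are hypothetical names, and the two rules are simply the examples from this slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: if all `source` concepts describe a book, suggest `target`."""
    source: frozenset
    target: str


# The two example mappings from the slide, written as rules.
RULES = [
    Rule(frozenset({"Travel guides", "Spain"}), "Spain; travel guides"),
    Rule(frozenset({"Building engineering"}), "Learning material ; building engineering"),
]


def suggest(biblion_subjects, rules=RULES):
    """Return the target of every rule whose source concepts are all present."""
    present = set(biblion_subjects)
    return [rule.target for rule in rules if rule.source <= present]
```

A book described with "Spain", "Travel guides" and some unrelated subject fires only the first rule; a book described with "Spain" alone fires none, because the group is incomplete.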


Obtaining re-indexing rules

  • Lexical alignments are not good enough

  • Probabilistic rules are calculated

    • Using extension of concepts: existing indexing

    • Simple probabilities, with ad hoc adjustment

      "Travel guides", "Spain" → "Spain; travel guides", 0.982

  • Not only based on Biblion subjects

    • AUT – main authors of books

    • KAR – “characteristic”

    • DGP – intellectual level/target group
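
One plausible reading of "simple probabilities" over concept extensions is a conditional frequency: among the books already indexed with both vocabularies, the share of those carrying all source concepts that also carry the target concept. The sketch below assumes that reading. The function name `rule_confidence` and the toy data are invented for illustration, and the ad hoc adjustment mentioned on the slide is not specified there, so it is left out.

```python
def rule_confidence(books, source, target):
    """Estimate P(target | source) from books indexed with both vocabularies.

    `books` is a list of (biblion_subjects, brinkman_subjects) pairs of sets.
    Returns None when no book carries all the source concepts.
    """
    source = set(source)
    # Brinkman descriptions of the books whose Biblion description contains the source group.
    matching = [brinkman for biblion, brinkman in books if source <= set(biblion)]
    if not matching:
        return None
    hits = sum(1 for brinkman in matching if target in brinkman)
    return hits / len(matching)


# Toy data, invented for illustration (not taken from the KB catalogue).
BOOKS = [
    ({"Travel guides", "Spain"}, {"Spain; travel guides"}),
    ({"Travel guides", "Spain"}, {"Spain; travel guides"}),
    ({"Travel guides", "Spain"}, {"Spain; history"}),
    ({"Travel guides", "France"}, {"France; travel guides"}),
]
```

On this toy data, three books carry both "Travel guides" and "Spain", and two of them carry "Spain; travel guides", so the estimate is 2/3. Other metadata fields such as AUT, KAR or DGP could in principle be added to the source side in the same way, if they are encoded as extra items in the source set.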


Demo

Doesn't work?


User study

  • Quantitative aspect

    • How well does the tool compare to human subject indexing?

  • Qualitative aspect

    • User satisfaction

    • Suggestions for improvement
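
The slides do not say which measure was used for the quantitative comparison. A common choice for comparing suggested subjects with those an indexer actually assigned is precision and recall per book, sketched below under that assumption; `precision_recall` is a hypothetical helper name.

```python
def precision_recall(suggested, chosen):
    """Compare the tool's suggestions with the subjects a human indexer chose.

    Precision: share of suggestions that the indexer also chose.
    Recall: share of the indexer's subjects that the tool suggested.
    """
    suggested, chosen = set(suggested), set(chosen)
    correct = suggested & chosen
    precision = len(correct) / len(suggested) if suggested else 0.0
    recall = len(correct) / len(chosen) if chosen else 0.0
    return precision, recall
```

For instance, four suggestions of which two match the indexer's three subjects give a precision of 0.5 and a recall of 2/3.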


Evaluation setting

  • 6 indexers

  • 6 weeks

  • 284 books

  • Evaluation integrated in daily indexing work

  • Pre-evaluation briefing

  • Questionnaire during evaluation

  • Post-evaluation de-briefing & questionnaire


User study results

  • Top ranked mappings are indeed much better

  • Individual book satisfaction level > 70%


User study results (1)

  • But the general satisfaction is lower

    • Only two out of six would use the tool as such

  • Quality of suggestions

    • Lower-level suggestions are often not meaningful

  • Perception of suggestions' quality

    • Long lists with wrong suggestions at the end are bad

    • Ranking is appreciated, but it is not enough
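
One way to act on this feedback would be to cut the ranked list instead of showing every suggestion. The sketch below is an assumption about how such a cut could look, not something the prototype is stated to do: `top_suggestions`, `min_confidence` and `max_items` are hypothetical names, and the default values are arbitrary.

```python
def top_suggestions(scored, min_confidence=0.5, max_items=5):
    """Keep only confident suggestions, best first, and cap the list length.

    `scored` is a list of (concept, confidence) pairs.
    """
    kept = [(concept, score) for concept, score in scored if score >= min_confidence]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:max_items]
```

With a threshold of 0.5, a list such as `[("A", 0.2), ("B", 0.9), ("C", 0.6)]` is reduced to `[("B", 0.9), ("C", 0.6)]`, so the weak tail never reaches the indexer.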


User study results (2)

Suggestions were found promising

  • Bridging the indexing gap between collections

    • Different indexing strategies

      "Persian language" (Biblion)

      vs. "Iranian language and literature" (Brinkman)

Lots of suggestions for improvement

  • More re-indexing!

    • Suggesting concepts from other vocabularies

    • More context metadata as input


Conclusions

  • Shows the potential of re-using data in a library network

  • Alignment approach fitting indexing practice

  • Concrete demonstration, in KB production environment

  • Technology transfer: KB wants to continue efforts

  • Flexibility: architecture ready to exploit other vocabularies

    • Linked data & SKOS


Prototype components


Linked libraries?


Thank you!

  • Questions?


Screenshots


WinIBW production tool


STITCH suggestion tool


Original metadata


Concept suggestions


Comparing with human re-indexing


Complement: lexical alignments


Adding subjects using thesaurus access


Concept suggestions


Saving and back to WinIBW


Screenshots
