
Crossing the boundaries: interoperability between vocabularies

Stella G Dextre Clarke, Senior Metadata Consultant, Bridgeman Art Library; Independent Consultant. Summary: interoperability at the metadata schema level and at the vocabulary level; practicalities of vocabulary mapping.


Presentation Transcript


  1. Crossing the boundaries: interoperability between vocabularies Stella G Dextre Clarke Senior Metadata Consultant, Bridgeman Art Library; Independent Consultant

  2. Summary • Interoperability: • At the metadata schema level • At the vocabulary level • Practicalities of vocabulary mapping • Interoperability at the data exchange level • Standards to help us through the maze

  3. In a networked world, interoperability is all the rage • CIDOC-CRM • Web 2.0 • Mash-ups • Semantic Web (well, not quite with us yet, but said to be coming shortly) … and it’s not just about Museum A sharing with Gallery B

  4. How to achieve interoperability? • Step 1: apply a metadata schema consistently to all your records and export via a standard metadata format • Step 2: implement a metadata cross-walk e.g. The Getty crosswalk at http://www.getty.edu/research/conducting_research/standards/intrometadata/metadata_element_sets.html • So far so good – it’s not so difficult

  5. But interoperability needs to apply at two levels • Between metadata schemas, e.g.: Artist → Creator → Maker; Location → Place → Coverage.spatial; Keywords → Subject • Between vocabulary terms, e.g.: rowing boats → rowboats → pulling boats; gramophone records → phonograph records; garments → clothes → clothing
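The two levels can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names and term pairs are taken from the slide, while the dictionary layout and the `translate_record` helper are hypothetical and not part of any real crosswalk.

```python
# Hypothetical crosswalks built only from the examples on the slide.
# Schema level: element names in a source schema -> names in a target schema.
FIELD_CROSSWALK = {
    "Artist": "Creator",
    "Location": "Place",
    "Keywords": "Subject",
}

# Vocabulary level: terms in a source vocabulary -> terms in a target vocabulary.
TERM_CROSSWALK = {
    "rowing boats": "rowboats",
    "gramophone records": "phonograph records",
    "garments": "clothes",
}


def translate_record(record):
    """Rename fields, then translate the terms held in the subject field.

    Unmapped fields and terms pass through unchanged, so nothing is
    silently dropped.
    """
    translated = {}
    for field, value in record.items():
        target_field = FIELD_CROSSWALK.get(field, field)
        if target_field == "Subject":
            value = [TERM_CROSSWALK.get(term, term) for term in value]
        translated[target_field] = value
    return translated


# Made-up record for illustration.
source = {"Artist": "Unknown artist", "Keywords": ["rowing boats", "garments"]}
print(translate_record(source))
```

Renaming the field alone would leave "rowing boats" untouched in the target system, which is why the second dictionary is needed.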

  6. How to achieve interoperability at the vocabulary level? • Step 1: apply a controlled vocabulary consistently to all your records • Step 2: implement a vocabulary cross-walk (a.k.a. set of mappings) • But ready-made crosswalks are not so easy to find; you may have to build your own, and it can be a long job…

  7. Some practicalities of building and using crosswalks

  8. Sample entries from a crosswalk

  9. Building the mappings – an easy example • Vocabulary A: Churches • Vocabulary B: Churches

  10. Look a little closer. Is it so easy? • Vocabulary A: Churches, NT Byzantine churches, Gothic churches, Norman churches • Vocabulary B: Churches, NT Anglican church, Protestant church, Roman Catholic church
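One way to catch this kind of false friend is to compare the narrower terms (NT) on each side before accepting a label match. The sketch below uses the terms from the slide; the `needs_review` rule is a crude assumption made for illustration, not an established method.

```python
# Narrower terms (NT) for "Churches" in each vocabulary, as listed on the slide.
VOCAB_A = {"Churches": {"Byzantine churches", "Gothic churches", "Norman churches"}}
VOCAB_B = {"Churches": {"Anglican church", "Protestant church", "Roman Catholic church"}}


def needs_review(term, vocab_a, vocab_b):
    """Return True when a term exists in both vocabularies but their
    narrower terms have nothing in common, which hints that the shared
    label may stand for different concepts.

    Raises KeyError if the term is missing from either vocabulary.
    """
    if term not in vocab_a or term not in vocab_b:
        raise KeyError(term)
    return not (vocab_a[term] & vocab_b[term])


print(needs_review("Churches", VOCAB_A, VOCAB_B))
```

Here Vocabulary A subdivides by architectural style and Vocabulary B by denomination, so the identical label is flagged for a person to look at rather than mapped automatically.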

  11. Another example: compare 5 different vocabularies Look for the concept “schools” in the following: • IPSV (UK public sector) • AAT (art/architecture) • GEMET (environmental) • ERIC (education) • MeSH (medical)

  12. URLs for those vocabularies • IPSV http://www.esd.org.uk/standards/ipsv/ • AAT http://www.getty.edu/research/conducting_research/vocabularies/aat/ • GEMET http://www.eionet.europa.eu/gemet • ERIC http://www.eric.ed.gov/ • MeSH http://www.nlm.nih.gov/mesh/

  13. Typical differences between vocabularies • Different term for the same concept (and same term can signify a different concept) • Hierarchical structure around the concept • Scope note, definition, synonyms and other attributes of a term/concept • Concepts designated by terms or by codes or notation • Language of access (e.g. French, German) • Layout and format

  14. More practicalities: two-way versus one-way mappings [Diagram showing the terms Poultry, Chickens, Ducks, Geese, Birds, Parrots, Canaries and Budgies spread across Vocabulary 1, Vocabulary 2 and Vocabulary 3]
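The difference between one-way and two-way mappings can be sketched as follows. The terms come from the slide, but the original diagram layout is not recoverable here, so the assumption that the specific bird names all map to the broader term "Birds" is a guess made for illustration.

```python
from collections import defaultdict

# Assumed one-way mappings from specific terms to a broader target term.
# Which vocabulary each term belongs to is NOT known from the transcript.
ONE_WAY = {
    "Chickens": "Birds",
    "Ducks": "Birds",
    "Geese": "Birds",
    "Parrots": "Birds",
    "Canaries": "Birds",
    "Budgies": "Birds",
}


def invert(mapping):
    """Reverse a mapping, collecting every source term for each target."""
    reverse = defaultdict(list)
    for source, target in mapping.items():
        reverse[target].append(source)
    return dict(reverse)


def is_two_way(mapping):
    """A mapping is safely reversible only if each target has exactly one source."""
    return all(len(sources) == 1 for sources in invert(mapping).values())


print(is_two_way(ONE_WAY))
print(invert(ONE_WAY)["Birds"])
```

Going from "Chickens" to "Birds" loses detail but is safe for retrieval; going back from "Birds" gives six candidates and no way to choose between them, so the reverse direction has to be mapped separately.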

  15. More practicalities – planning the architecture [Diagram: vocabularies A, B, C, D, E, F, G and H]
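Why the architecture matters can be shown with simple arithmetic. The transcript does not say which layouts the diagram compared, so the two options below (every vocabulary mapped directly to every other, versus all vocabularies mapped to one hub) are assumptions chosen to illustrate the scale of the job.

```python
def pairwise_mappings(n):
    """Number of vocabulary pairs when every vocabulary is mapped
    directly to every other one."""
    return n * (n - 1) // 2


def hub_mappings(n):
    """Number of mappings when one vocabulary acts as a hub and the
    remaining n - 1 vocabularies are each mapped only to it."""
    return n - 1


for n in (8, 25):
    print(n, pairwise_mappings(n), hub_mappings(n))
```

For the eight vocabularies A to H this gives 28 pairs against 7 hub mappings, and for 25 vocabularies 300 against 24. If each pair needs separate one-way mappings in both directions, the counts double.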

  16. Or some people do chain mapping… [Diagram: vocabularies A to H, plus P, Q, R and S]

  17. But what happens with chain mapping? • buses → coaches; coaches → trainers; trainers → training shoes • Job vacancies → jobs; Jobs → posts; Posts → post; post → mail • Timber → wood; Wood → woods; Woods → forests • Firewood → logs; Logs → records; Records → archives • Any one of the mappings could be OK in one context, but not when chained. Most howlers can be avoided, but only if you check carefully
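The drift is easy to reproduce by composing the individual mappings mechanically. The sketch below uses the first chain from the slide; the `chain` helper is a hypothetical name introduced for the example.

```python
# Individual mappings from the slide, each plausible in some context.
STEPS = [
    {"buses": "coaches"},
    {"coaches": "trainers"},
    {"trainers": "training shoes"},
]


def chain(term, steps):
    """Follow a term through successive mappings and return every hop.

    Stops early if the current term has no entry in the next mapping.
    """
    path = [term]
    for mapping in steps:
        if path[-1] not in mapping:
            break
        path.append(mapping[path[-1]])
    return path


print(" -> ".join(chain("buses", STEPS)))
```

Each hop looks defensible on its own, yet the end result maps a vehicle to footwear. Nothing in the code can detect that, which is the argument for checking every chained result by hand or avoiding chains altogether.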

  18. So best avoided… [Diagram: vocabularies A to H, plus P, Q, R and S]

  19. Mapping 25 vocabularies (slide from GESIS project)

  20. A bit of practical reasoning • You can’t rely on a computer to do the matching • But it’s such a huge job, you can’t do it without a computer! • Ergo, use a computer to suggest matches, but do a human check on each one
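A minimal sketch of that division of labour, assuming simple spelling similarity is good enough to generate candidates: Python's standard `difflib` module proposes matches, and every proposal goes to a person for acceptance or rejection. The `suggest_matches` helper and the sample term lists are illustrative only.

```python
import difflib


def suggest_matches(source_terms, target_terms, cutoff=0.6):
    """Return (source, candidate, score) triples for a human to review.

    Matching is purely on spelling, so every suggestion is only a
    candidate, never a confirmed mapping.
    """
    lowered = {t.lower(): t for t in target_terms}
    suggestions = []
    for term in source_terms:
        hits = difflib.get_close_matches(term.lower(), list(lowered), n=3, cutoff=cutoff)
        for hit in hits:
            score = difflib.SequenceMatcher(None, term.lower(), hit).ratio()
            suggestions.append((term, lowered[hit], round(score, 2)))
    return suggestions


for row in suggest_matches(["rowing boats", "Churches"], ["rowboats", "clothing", "churches"]):
    print(row)
```

Note that "Churches" against "churches" gets a perfect score, although the two vocabularies may mean different things by it. A high score saves typing; it does not replace the human check.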

  21. One more practical need for interoperability • Data exchange between vocabularies and the computer applications that exploit them • Either for importing a vocabulary into an application (e.g. into a search engine or a cataloguing package) • Or to allow online interrogation of a vocabulary by a searching or indexing application • What we need are standard formats and protocols

  22. So what standards do we have? • ISO 2788, ISO 5964 and national equivalents • ANSI/NISO Z39.19 • SKOS, Zthes, ADL, MARC, SRW/SRU • BS 8723 • ISO NP 25964

  23. Vocabulary construction and management • ISO 2788-1986 Guidelines for the establishment and development of monolingual thesauri = BS 5723:1987 and other national standards • ISO 5964-1985 Guidelines for the establishment and development of multilingual thesauri = BS 6723:1985 and other national standards • ANSI/NISO Z39.19-2005 Guidelines for the construction, format and management of monolingual controlled vocabularies

  24. Vocabulary data formats only • Simple Knowledge Organization System (SKOS) format is expressed in RDF/XML and destined for the Semantic Web. http://www.w3.org/2004/02/skos/ • Zthes – an application profile of Z39.50, for exchange of thesaurus data. http://zthes.z3950.org/ • MARC has a format for “authority records”, suitable for library applications. http://www.loc.gov/marc/authority/
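To give a feel for what a SKOS record looks like, here is a small sketch that reads one concept from an RDF/XML fragment using only the Python standard library. The concept URIs under example.org are hypothetical, and the reader handles only this one simple serialisation shape; a general application would need a proper RDF parser.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}

# Hypothetical concept URIs; the labels follow the "Churches" example.
SKOS_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/vocab-a/churches">
    <skos:prefLabel xml:lang="en">Churches</skos:prefLabel>
    <skos:narrower rdf:resource="http://example.org/vocab-a/gothic-churches"/>
    <skos:closeMatch rdf:resource="http://example.org/vocab-b/churches"/>
  </skos:Concept>
</rdf:RDF>"""


def read_concepts(xml_text):
    """Return {concept URI: {"label", "narrower", "closeMatch"}} for each
    skos:Concept element found directly under the RDF root."""
    root = ET.fromstring(xml_text)
    about = "{%s}about" % NS["rdf"]
    resource = "{%s}resource" % NS["rdf"]
    concepts = {}
    for concept in root.findall("skos:Concept", NS):
        concepts[concept.get(about)] = {
            "label": concept.findtext("skos:prefLabel", namespaces=NS),
            "narrower": [e.get(resource) for e in concept.findall("skos:narrower", NS)],
            "closeMatch": [e.get(resource) for e in concept.findall("skos:closeMatch", NS)],
        }
    return concepts


print(read_concepts(SKOS_XML))
```

The `skos:closeMatch` property carries the mapping to the other vocabulary inside the same record as the hierarchy, so one format can serve both for exchanging a vocabulary and for exchanging its crosswalk.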

  25. Vocabulary data protocols only • SKOS API designed for live querying of vocabularies on the Web. http://www.w3.org/2001/sw/Europe/reports/thes/skosapi.html • ADL Thesaurus Protocol for querying and navigation around monolingual thesauri on the Web. http://www.alexandria.ucsb.edu/thesaurus/specification.html • SRW/SRU (Search and Retrieve via the Web/URLs) is for a variety of search types, not just vocabularies. http://www.loc.gov/standards/sru/
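As an illustration of the protocol side, an SRU search is an ordinary URL carrying a CQL query. The sketch below builds such a URL with the standard SRU `searchRetrieve` parameters; the base address under example.org is a placeholder, not a real service, and whether a given vocabulary server accepts this query is an assumption.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, cql_query, maximum_records=10, version="1.1"):
    """Build an SRU searchRetrieve request URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


# example.org is a placeholder host used only for illustration.
url = sru_search_url("http://example.org/sru", '"rowing boats"')
print(url)
```

Because the request is just a URL and the response is XML, any searching or indexing application that can make an HTTP request can interrogate the vocabulary without importing it first.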

  26. Vocabulary construction and management + interoperability BS 8723: Structured vocabularies for information retrieval – Guide • Part 1: Definitions, symbols and abbreviations • Part 2: Thesauri • Part 3: Vocabularies other than thesauri • Part 4: Interoperability between vocabularies • Part 5: Exchange formats and protocols for interoperability Motivation throughout is “interoperability”

  27. ISO NP 25964 (adoption of BS 8723 as an ISO standard) • The proposal to revise ISO 2788 and ISO 5964, basing the work on BS 8723, was submitted to ISO TC 46/SC 9 members in April 2007 • Project now approved • At least 9 countries participating: France, Germany, Canada, Finland, New Zealand, Sweden, UK, Ukraine, USA

  28. In conclusion • In a networked world, we need interoperability at the vocabulary level • Building the mappings is a job for people, not computers (but computer support is vital) • Mapping may not be easy, but it’s fun… for the person with the right mindset • We need to apply standards to all aspects of vocabulary work, data exchange as well as construction and maintenance
