Linked data and the implications for library cataloguing: metadata models and structures in the Sema...
Download
1 / 59

Gordon Dunsire - PowerPoint PPT Presentation


  • 208 Views
  • Uploaded on

Linked data and the implications for library cataloguing: metadata models and structures in the Semantic Web. Gordon Dunsire Presented at the Canadian Library Association Annual Conference, 26-29 May 2011, Halifax, Nova Scotia. Outline. Context: evolution of the catalogue record RDF 101

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Gordon Dunsire' - tiana


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

Linked data and the implications for library cataloguing: metadata models and structures in the Semantic Web

Gordon Dunsire

Presented at the Canadian Library Association Annual Conference, 26-29 May 2011, Halifax, Nova Scotia


Outline
Outline metadata models and structures in the Semantic Web

  • Context: evolution of the catalogue record

  • RDF 101

  • Library metadata models/schemas in RDF

    • FRBR, RDA, ISBD, DCT, BiBO, ...

  • From record to triples: worked example


A short history of the evolution of the library catalogue record

A short history metadata models and structures in the Semantic Webof the evolutionof the library catalogue record


In the beginning
In the beginning ... metadata models and structures in the Semantic Web

Lee, T. B.

Cataloguing has a future. - Audio disc

(Spoken word). - Donated by the author.

1. Metadata

... the catalogue card


From flat file record
From flat-file record ... metadata models and structures in the Semantic Web

Bibliographic description

Name authority

Author:

Lee, T. B.

Name:

Biography:

Title:

Cataloguing has a future

...

Content type:

Spoken word

Carrier type:

Audio disc

Subject authority

Subject:

Metadata

Term:

Provenance:

Donated by the author

Definition:

...

... to relational record


From flat file description
From flat-file description ... metadata models and structures in the Semantic Web

Bibliographic description

Name authority

Author:

Name:

Lee, T. B.

Biography:

Title:

Cataloguing has a future

...

Work

Content type:

Spoken word

Author:

Carrier type:

Audio disc

Subject authority

Subject:

Subject:

Term:

Metadata

Expression

Provenance:

Donated by the author

Definition:

Content type:

Spoken word

...

Manifestation

Item

... to FRBR record


From frbr record
From FRBR record ... metadata models and structures in the Semantic Web

Work

Name authority

Author:

Name:

Lee, T. B.

Subject:

Subject authority

Expression

Term:

Metadata

Content type:

Spoken word

Manifestation

RDA content type

Title:

Cataloguing has a future

Term:

Carrier type:

Audio disc

RDA carrier type

Item

Term:

Provenance:

Donor:

Donated by the author

Amazon/Publisher

Title:

... to extinction!


Where is the record
Where is the record? metadata models and structures in the Semantic Web

  • Implicit, not explicit

    • Everywhere and nowhere

  • A semantic Web will allow machines to create the record just-in-time

    • We will not have to maintain records just-in-case

  • The user will have control over the presentation

    • I want to see an archive or library or museum or Amazon or Google or Flickr or ? display

  • And by avoiding duplication, we can all get on with describing new stuff ...


The hyperdimensional tardis card
The hyperdimensional (Tardis) card metadata models and structures in the Semantic Web

W3C Library

Audio shop

Lee, T. B.

Cataloguing has a future. - Audio disc

(Spoken word). - Donated by the author.

1. Metadata

Lee Museum

Spoken word archive

“TARDIS four port USB hub, for office-bound Time Lords:

Open a time vortex on your desk” – Pocket-lint


Rdf 101

RDF 101 metadata models and structures in the Semantic Web


Semantic web
Semantic Web metadata models and structures in the Semantic Web

  • “machine-readable metadata”

    • Faster! 24/7/365! Global!

  • Metadata expressed as “atomic” statements

    • A simple, single, irreducible statement

      • The title of this book is “Treasure island”

  • In a standard machine-processable format

    • Resource Description Framework (RDF)


Resource description framework
Resource Description Framework metadata models and structures in the Semantic Web

  • Metadata statement constructed in 3 parts

    • “Triple”

  • The title of this book is “Treasure island”

    • Subject of the statement = Subject: This book

    • Nature of the statement = Predicate: has title

    • Value of the statement = Object: “Treasure island”

  • This book – has title – “Treasure island”

    • subject – predicate - object


Identifiers
Identifiers metadata models and structures in the Semantic Web

  • Need unambiguous way of identifying each part of the triple for efficient machine-processing

    • Human labels (“This book”, “has title”) no good

      • Same thing, different labels; different things, same label

  • Exploit the utility of the URL

    • Machine-readable, regular syntax, unambiguous

  • Uniform Resource Identifier (URI)


Uniform resource identifier
Uniform Resource Identifier metadata models and structures in the Semantic Web

  • Can be any unique combination of numbers and letters

    • No intrinsic meaning; it’s just an identifying label

  • Can look like a URL

    • http://iflastandards.info/ns/isbd/elements/P1001

    • But does not lead to a Web page (in principle ...)

  • RDF requires the subject and predicate of triple to be URIs

    • Object can be a URI, or a literal string (“Treasure island”)


Namespaces
Namespaces metadata models and structures in the Semantic Web

  • URI can be constructed from a base plus a unique, identifying suffix

    • http://iflastandards.info/ns/isbd/elements/

    • + P1001

  • Base is known as a namespace

    • Can be abbreviated by human programmer

      • “isbd” = http://iflastandards.info/ns/isbd/elements/

    • isbd:P1001

  • Machine expands abbreviation for processing


Everything as triples in rdf
Everything as triples in RDF metadata models and structures in the Semantic Web

  • Every aspect of the metadata must be expressed in RDF to be machine-processable

    • Metadata about real-world objects (books, people, etc.)

    • Metadata about the predicates (definition, label, scope, etc.)

      • Common predicates apply to many types of thing (human-readable label, etc.)

        • High-level RDF namespaces (rdfs, owl)

  • RDF is expressed in RDF (“bootstrap”)


Library namespaces

Library namespaces metadata models and structures in the Semantic Web


Creating namespaces and uris
Creating namespaces and URIs metadata models and structures in the Semantic Web

  • FRBR/FRAD/FRSAD, ISBD, and RDA are using the Open Metadata Registry

    • Can assign a running “number” to the base to create a new URI

  • Set of properties for creating basic triples

    • Properties = predicates

    • rdfs:label for assigning a human-readable label to the subject

    • isbd:P1001 - rdfs:label - “has content form”


Subject metadata models and structures in the Semantic Web

Predicate

Object

isbd:P1001

rdfs:label

“has content form”


Subject metadata models and structures in the Semantic Web

Predicate

Object

isbdcf:T1008

skos:prefLabel

“spoken word”


Application profile
Application profile metadata models and structures in the Semantic Web

  • Need a way to specify how a useful “record” can be constructed from RDF triples

  • Which triples are involved, and from which namespaces?

  • Sequence? Repeatable? Mandatory?

  • Sub-component aggregations

    • Publication statement = place + name + date

  • Content rules?


Mandatory metadata models and structures in the Semantic Web

Not repeatable

Aggregation of simpler elements

Syntax of aggregation (punctuation)


Getting triples from records

Getting triples from records metadata models and structures in the Semantic Web


Linking Open Data cloud (LOD) metadata models and structures in the Semantic Web

Diagram by Richard Cyganiak and AnjaJentzsch. http://lod-cloud.net/


LOD: “Library” corner metadata models and structures in the Semantic Web


Why get involved
Why get involved? metadata models and structures in the Semantic Web

  • To share our data

    • We work for “society”

  • To share our expertise and experience

    • 150 + years

  • To promote the power of libraries (and archives and museums)

  • To survive


From record to triples in 9 stages
From record to triples (in 9 stages) metadata models and structures in the Semantic Web

  • Very large numbers of records

    • Catalogue records, finding aids, etc.

    • 300 million; 1 billion?

  • High quality metadata

    • In comparison with other communities

  • Each record may generate many triples

    • 200 “raw” triples (no inferences) per MARC record?

  • Very, very large numbers of triples

    • Billions? Trillions?


1 take a record
1. Take a record metadata models and structures in the Semantic Web


2 disaggregate to single statements
2. Disaggregate to single statements metadata models and structures in the Semantic Web


3 create uri for record
3. Create URI for record metadata models and structures in the Semantic Web

  • Must be unique, so 54321 no good on its own

  • http URIs are a good thing (W3C)

  • So add record ID to a unique http domain

    • E.g. http://MyLibraryX.com (unique to the library)

    • + 54321

  • http://MyLibraryX.com/54321

    • (or http://MyLibraryX.com#54321)

  • This is not a URL!


4 replace record id with uri
4. Replace record ID with URI metadata models and structures in the Semantic Web

“mlx” = qname (xmlns) = shorthand for “http://MyLibraryX.com/”


5 find uris for attributes
5. Find URIs for attributes metadata models and structures in the Semantic Web

  • Attributes are modelled as RDF properties (predicates) in “element set” namespaces

    • E.g. Dublin Core terms (dct); ISBD (isbd); FRBR (frbrer); RDA (rdaxxx); Bibliographic Ontology (bibo); etc.

  • Choose a namespace, find property with same (or closest) “meaning” (e.g. definition) as attribute

    • Nearest property minimises loss of information

  • Get URI for property

  • If no suitable property, choose another namespace

    • Properties do not have to come from single namespace

  • Match and mix!


5 cont find uri for title
5 (cont). Find URI for title metadata models and structures in the Semantic Web

  • http://purl.org/dc/terms/title (dct:title)

  • http://iflastandards.info/ns/isbd/elements/P1014 (isbd:P1014)

    • hasTitleProper

  • http://RDVocab.info/Elements/titleProper  (rdaGR1:titleProper)


5 cont find uri for author
5 (cont). Find URI for author metadata models and structures in the Semantic Web

  • dct:creator

  • rdarole:author

  • (isbd does not cover “headings”)


5 cont find uri for date
5 (cont). Find URI for date metadata models and structures in the Semantic Web

  • dct:date

  • isbd:P1018

    • hasDateOfPublicationProductionDistribution

  • rdaGr1:dateOfPublication


5 cont find uri for lcsh
5 (cont). Find URI for LCSH metadata models and structures in the Semantic Web

  • LCSH is a subject vocabulary

    • Controlled terms

  • So attribute is really “subject”

    • And the term itself is the value

  • dct:subject


5 cont find uri for media type
5 (cont). Find URI for media type metadata models and structures in the Semantic Web

  • Assuming record uses new ISBD Area 0 ...

  • isbd:P1003

    • hasMediaType


5 cont find uri for content form
5 (cont). Find URI for content form metadata models and structures in the Semantic Web

  • Assuming record uses new ISBD Area 0 ...

  • isbd: P1001

    • hasContentForm


6 replace attributes with uris
6. Replace attributes with URIs metadata models and structures in the Semantic Web


7 find uris for values
7. Find URIs for values metadata models and structures in the Semantic Web

  • If object of a triple is a URI, it can link to the subject of another triple with the same URI

    • Linked data!

  • Values from controlled vocabularies may have URIs

    • Possible vocabularies: author, subject, ISBD Area 0

    • NOT: title, date

  • For author: Virtual International Authority File (VIAF)

  • For LCSH: Library of Congress Authorities & Vocabularies

  • For ISBD Area 0: Open Metadata Registry


7 cont find uri for author
7 (cont). Find URI for author metadata models and structures in the Semantic Web

  • Author: Wythe, Deborah

  • VIAF: http://www.viaf.org/

    • viaf:31899419/#Wythe,+Deborah


7 cont find uri for subject lcsh
7 (cont). Find URI for subject (LCSH) metadata models and structures in the Semantic Web

  • LCSH: Museum archives

  • LoC: http://id.loc.gov/authorities/

    • lcsh:/sh85088707#concept


7 cont find uris for isbd area 0
7 (cont). Find URIs for ISBD Area 0 metadata models and structures in the Semantic Web

  • Media type: Electronic

  • ISBD media type

    • isbdmt:T1002

  • Content form: Text

  • ISBD Content form

    • isbdcf:T1009


8 replace values with uris
8. Replace values with URIs metadata models and structures in the Semantic Web


9 publish triples linked data
9. Publish triples (linked data) metadata models and structures in the Semantic Web

mlx:54321 | isbd:P1014 | “Museum archives: an introduction”

mlx:54321 | rdarole:author | viaf:31899419/#Wythe,+Deborah

mlx:54321 | isbd:P1018 | “2004”

mlx:54321 | dct:subject | lcsh:/sh85088707#concept

mlx:54321 | isbd:P1003 | isbdmt:T1002

mlx:54321 | isbd:P1001 | isbdcf:T1009


Linked data chains
Linked data chains metadata models and structures in the Semantic Web

mlx:54321 | dct:subject | lcsh:/sh85088707#concept

lcsh:/sh85088707#concept | skos:related | rameau:XXX

rameau:XXX | frbrer:isSubjectOf | mly:98765

mly:98765 | rda:titleOfTheWork | “Managing archives in museums”

rameau:XXX | skos:prefLabel | “archives du musée”


Linked data cluster record
Linked data cluster = “record” metadata models and structures in the Semantic Web

mlx:54321 | isbd:P1014 | “Museum archives: an introduction”

mlx:54321 | rdarole:author | viaf:31899419/#Wythe,+Deborah

mlx:54321 | isbd:P1018 | “2004”

mlx:54321 | dct:subject | lcsh:/sh85088707#concept

mlx:54321 | isbd:P1003 | isbdmt:T1002

mlx:54321 | isbd:P1001 | isbdcf:T1009


Metadata focus metadata models and structures in the Semantic Web

Shift of focus of metadata creation, maintenance, storage, preservation (by professionals, amateurs, machines)

From Record

To Statement(s) = triple(s)

But metadata display ...

... aggregates triples (from multiple sources) to create records on the fly


Thank you
Thank you metadata models and structures in the Semantic Web


ad