said the spider to the fly identity and authority in the semantic web l.
Skip this Video
Loading SlideShow in 5 Seconds..
Said the spider to the fly: identity and authority in the Semantic Web PowerPoint Presentation
Download Presentation
Said the spider to the fly: identity and authority in the Semantic Web

Loading in 2 Seconds...

play fullscreen
1 / 24

Said the spider to the fly: identity and authority in the Semantic Web - PowerPoint PPT Presentation

  • Uploaded on

Said the spider to the fly: identity and authority in the Semantic Web. Presented to the annual conference of the Cataloguing and Indexing Group, 2008, Glasgow Gordon Dunsire. How does the spider know it’s a fly?. Musca domestica. undone. flies. Zoom. . fly. mouche. bluebottle. mosca?.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

Said the spider to the fly: identity and authority in the Semantic Web

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
said the spider to the fly identity and authority in the semantic web

Said the spider to the fly: identity and authority in the Semantic Web

Presented to the annual conference of the Cataloguing and Indexing Group, 2008, Glasgow

Gordon Dunsire

how does the spider know it s a fly
How does the spider know it’s a fly?

Musca domestica









  • Introduction to the Semantic Web
  • Importance of identifiers
  • Initiatives from library communities
    • Subject classification and indexing
  • Tag clouds
    • Learning from the users
  • Authorising
semantic web
Semantic Web
  • “… an evolving extension of the [WWW] in which the semantics of information and services on the web is defined.”
    • Wikipedia, English, 19.50 30 Aug 2008
semantic web foundations
Semantic Web foundations
  • Resource Description Framework (RDF)
    • Statements about Web resources in the form of subject-predicate-object expressions, called triples
    • E.g. “This presentation” – “has creator” – “Gordon Dunsire”
    • Each component of an RDF statement (triple) is a “resource”
  • RDF Schema (RDFS)
    • Vocabulary description language of RDF
semantic web building blocks
Semantic Web building blocks
  • RDF is about making machine-processable statements, requiring
    • A machine-processable language for representing RDF statements
      • Extensible Markup Language (XML) 
    • A system of machine-processable identifiers for resources (subjects, predicates, objects)
      • Uniform Resource Identifier (URI) 
    • For full machine-processing, an RDF statement is a set of three URIs
  • Things requiring identification:
    • Subject “This presentation”
      • e.g. its electronic location (URL):
    • Predicate “has creator”
      • e.g.
    • Object “Gordon Dunsire”
      • e.g. URI of entry in Library of Congress Name Authority File (real soon now?)
  • Declaring vocabularies/values as “namespaces” in RDF applications provides URIs
semantic web applications
Semantic Web applications
  • Simple Knowledge Organization System (SKOS)
    • Expresses the basic structure and content of concept schemes such as thesauri and other types of controlled vocabularies
    • An RDF application
  • Web Ontology Language (OWL)
    • Explicitly represents the meaning of terms in vocabularies and the relationships between them
library catalogues and rdf 1
Library catalogues and RDF (1)
  • Resource Description and Access (RDA)
    • Successor to AACR
  • Dublin Core Metadata Initiative RDA Task Group is developing namespaces
    • Working with two types of RDA vocabularies
    • RDA metadata entities (elements, attributes)
      • E.g. “Title”, “Content type”
      • Represented as an RDF Schema
    • RDA value vocabularies (terms)
      • E.g. “spoken word”, “microform” (media type)
      • Represented in SKOS
    • Using NSDL Metadata Registry tools
library catalogues and rdf 2
Library catalogues and RDF (2)
  • IFLA bibliographic control standards
    • Discussions during WLIC 2008, Québec City
  • XML namespaces for entities and relationships from Functional Requirements for Bibliographic Records (FRBR)
    • E.g. “Work”, “has Expression” / ”is Expression of”
  • Others are likely to follow:
    • Functional Requirements for Authority Data (FRAD)
    • International Standard Bibliographic Description (ISBD)
    • Functional Requirements for Subject Authority Records (FRSAR)
  • Library of Congress taking a similar approach with MARC21
rda rdf vocabulary example fake
RDA RDF vocabulary example (fake)

<?xml version="1.0" encoding="UTF-8"?>







<!-- WARNING: This is a single-concept fragment -->

<!-- Scheme: RDA Content Type -->

<skos:ConceptScheme rdf:about="">

<dc:title>RDA Content Type</dc:title>


<!-- Concept: spoken word -->

<skos:Concept rdf:about="">

<skos:inScheme rdf:resource=""/>

<skos:prefLabel>spoken word</skos:prefLabel>

<skos:definition>Content expressed through language in an audible form.

Includes recorded readings,recitations, speeches, etc., computer-generated

speech, etc.</skos:definition>



Namespaces used to declare the RDA namespace – everything must be defined explicitly to the machine!

Overall base domain

Vocabulary URI

Term URI


Term definition

rda content type spoken word
RDA content type “spoken word”

The term “spoken word” can be referenced as the value of the field “content type” in any metadata record using RDF/XML (Semantic Web):

xmlns:rdvct =”

<… rdvct:1001 …>

The field/attribute/element “content type” can be referenced in a similar way to the RDF Schema for RDA elements being developed by DCMI/RDA

the distributed catalogue record
The distributed catalogue record

RDA element record

Work record

Name authority record


Lee, T. B.


Cataloguing has a future


Biography: …

Content type:

Spoken word

Expression record

Carrier type:

Audio disc

Subject authority record



Manifestation record


Donated by the author


Definition: …

RDA content type record

Item record


Spoken word

Definition: …

RDA carrier type record

subject indexing and rdf 1
Subject indexing and RDF (1)
  • Library of Congress authority files
    • LC Subject Headings (LCSH)
    • LC Name Authority File (LCNAF)
  • SKOS representation planned
    • LCSH thesaurus relationships included
    • Namespaces will be freely available
  • Potential for development of third-party web services
    • E.g. Disambiguation of names
    • E.g. Structured browsing of subject headings
subject indexing and rdf 2
Subject indexing and RDF (2)
  • Dewey Decimal Classification (DDC)
    • RDFS/SKOS representation of WebDewey in development
    • OCLC Terminology Services pilot for web services
  • High-Level Thesaurus project (HILT)
    • Using DDC as a spine for interoperating multiple monolingual subject vocabularies
      • Web service based on SKOS
  • European DDC Users Group (EDUG)
    • Technical Issues Working Group monitors projects and develops use cases for the online environment for subject access and translation
  • MelvilSearch
    • Subject search interface of the Deutsche NationalBibliotek and other German libraries
      • Developing web services based on RDFS/SKOS.
  • DeweyBrowser
    • Experimental subject access service using tag clouds
tag clouds a web 2 0 technique
Tag clouds: a Web 2.0 technique
  • Tag: Keyword applied by a user
    • Usually subject-based
    • No controls on vocabulary or application
    • Single “words” only (no phrases)
      • E.g. “flyingInsect”
  • Cloud: set of tags applied across a collection of documents
    • Size/colour indicates relatively how many documents the tag will retrieve
      • The bigger the tag, the more it has been applied
    • Displayed horizontally, not a vertical listing
  • Cloud display is now being used with controlled tags (vocabularies)
    • E.g. DeweyBrowser
  • How “authoritative” is a subject heading?
    • Metadata created by 3 categories of “agent”:
      • Professionals (us) 
      • “Authors” (mmm …) 
      • Users (them) 
  • Quality and quantity
    • High quality, controlled terms
      • Expensive to maintain (?); legacy issues; slow to respond
    • Self-interested, uncontrolled terms
      • Dependent on “publishing” context (i.e. sex sells)
    • High quantity, uncontrolled terms
      • Cheap to maintain (?); user-oriented; fast to respond
authority in the semantic web
Authority in the Semantic Web
  • SKOS makes controlled terms available to authors and users
    • And gives more flexibility to professionals
  • Statistical analysis of folksonomies should sort the wheat from the chaff
    • Law of large numbers; regression to the mean
  • Full-text indexing and associative analyses (proximity and citation) complement traditional library approaches
    • aka Googlizing






It’s a fly!

thank you
Thank you
  • See handout for acronyms and links