Initiatives to make standard library metadata models and structures available to the Semantic Web
Gordon Dunsire, UK, g.dunsire@strath.ac.uk
Mirna Willer, HR, mwiller@unizd.hr
Presented at WLIC Session 149, Sun 15 Aug 2010, Gothenburg, Sweden
Overview
• I: Initiatives: IFLA initiatives (FRBR, ISBD, etc.) and their relation to external initiatives (RDA, linked-data vocabularies such as VIAF, LCSH, etc.).
• II: Shift of focus: potential use of these initiatives to support the Semantic Web (parsing existing legacy records to create huge quantities of high-quality instance triples, the power of inferencing to create new triples, etc.), and the shift of cataloguing focus from record to statement (triple).
WLIC 2010, Gothenburg: Sun 15 August 2010
IFLA initiatives: Background
• IFLA’s initiatives to make the standard library metadata models, structures, and vocabularies it has developed available to the Semantic Web were initially stimulated by external projects:
  • RDA: Resource Description and Access
  • Data models meeting (London) with Dublin Core Metadata Initiative (DCMI), IEEE Learning Object Metadata (IEEE LOM), W3C Simple Knowledge Organization System (SKOS)
IFLA initiatives: Standards, models
• “Functional Requirements” family or “FRBR family of models”:
  • FRBR, 1998: Bibliographic Records [data]
  • FRAD, 2009: Authority Data
  • FRSAD, [2010]: Subject Authority Data
• Preliminary work: the FRBR Namespace Project used the testing area of the National Science Digital Library Metadata Registry (NSDL)
  • Now the Open Metadata Registry
• ISBD XML in the RDF/XML environment
IFLA initiatives: Infrastructure
• 2009-2010: the IFLA Namespaces project is developing an administrative and technical infrastructure to support such initiatives and encourage uptake of standards by other agencies.
• Basic namespace: “iflastandards.info”
• FRBR: “http://iflastandards.info/ns/fr/frbr/frbrer/” as the basis of the uniform resource identifiers (URIs) of each RDF class and property [entity & relationship] in the FRBR model
  • /frbrer/ to distinguish from FRBRoo [CIDOC CRM]
• FRAD: “http://iflastandards.info/ns/fr/frad/”
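The namespace bases above can be combined with a local identifier to form the URI of a class or property. The following is a minimal sketch of that expansion, not part of the IFLA infrastructure itself: the two base URIs are quoted from the slide, while the prefix labels, the `expand` helper, and the sample identifier `1001` (used as an illustrative local part) are assumptions of this example.

```python
# Prefix table. The two namespace URIs are quoted from the slide; the
# short prefix labels are shorthand assumed for this sketch.
NAMESPACES = {
    "frbrer": "http://iflastandards.info/ns/fr/frbr/frbrer/",
    "frad": "http://iflastandards.info/ns/fr/frad/",
}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'frbrer:1001' into a full URI."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in NAMESPACES:
        raise ValueError(f"unknown or missing prefix in {prefixed_name!r}")
    return NAMESPACES[prefix] + local


# Illustrative local identifier; the helper does not check that it exists.
print(expand("frbrer:1001"))
```

Keeping the base URI in one table means every class and property URI in a model shares a single, easily changed root.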
IFLA initiatives: FR family
• Representation of the FRBRer model element set is mostly complete
• FRAD and FRSAD close behind
• Representation in Resource Description Framework (RDF) is informing work on combining and consolidating the model family
• Also supplies a “learning curve” for the Semantic Web environment
IFLA initiatives: ISBD RDF/XML
• FRBR is a conceptual model built on the E-R methodology, which is intrinsically applicable to representation in RDF, while ISBD is a data standard
• Design of the RDF representation of ISBD involves:
  • the treatment of aggregated statements in a defined number of elements within the areas;
  • the treatment of mandatory and optional elements and areas;
  • the order of areas and elements within an area;
  • the repeatability of areas and elements;
  • the treatment of punctuation and its double function.
Related standards: RDA
• DCMI RDA Task Group has three goals:
  • define RDA modelling entities as an RDF vocabulary of properties and classes;
  • identify in-line value vocabularies as candidates for publication in RDFS or SKOS [nearly completed];
  • develop a Dublin Core Application Profile for RDA based on FRBR and FRAD.
• Task Group is using the Open Metadata Registry to develop RDF representations of the RDA vocabularies
Related standards: Other
• The National Library of Sweden has developed a methodology for representing MARC21 records in RDF and implemented it for LIBRIS, the Swedish Union Catalogue
• The Vocabulary Mapping Framework (VMF) project [funded by UK Joint Information Systems Committee (JISC)]:
  • to develop a major expansion of the RDA/ONIX framework for resource categorization
  • to create a tool to support the automated mapping of vocabularies from metadata standards of use to the JISC community, which includes research, teaching, and learning environments
  • CIDOC CRM, FRAD, FRBR, MARC21 and RDA vocabularies [included] & ISBD and UNIMARC [represented]
Related standards: Vocabularies
• Instance values from terminologies (subject headings, classification captions and indexes, and thesauri) can be represented in RDF using SKOS:
  • Library of Congress Subject Headings (LCSH)
  • Faceted Application of Subject Terminology (FAST), Medical Subject Headings (MeSH), Form and genre headings for fiction and drama, and Thesaurus for Graphic Materials (TGM)
  • French RAMEAU subject headings
  • DDC Summaries
• Linked data: [set of] best practices for publishing and connecting structured data on the Web
Linked data initiatives
• UDC Consortium: published a selection of around 2,000 UDC classes in 16 languages online as the UDC summary (RDF version in development)
• Virtual International Authority File (VIAF): a set of linked controlled vocabularies (authority records of personal names) by national bibliographic agencies
• ISBD: prescribes vocabulary control for the data in Area 0 for content form and media type. Terms for the elements (content form, content qualification, and media type) are taken from closed lists
Linked data from catalogue records
• Most linked data initiatives involve vocabularies
• Linked data can also represent bibliographic descriptions
• Huge quantities of high quality bibliographic metadata are locked in catalogue records
  • UNIMARC, MARC21, EAD, etc.
• Use RDF “models” to parse the records into linked data
Disaggregating the metadata record into single statements
Record
  Record ID: 1234
  Author: Mirna Willer
  Title: “UNIMARC format for authority records”
  Date: “2004”
Statements
  1234 has Author Mirna Willer
  1234 has Title “UNIMARC format for authority records”
  1234 has Date “2004”
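The disaggregation on this slide can be sketched in a few lines of Python. This is an illustration only: the record is held as a plain dictionary, and the `disaggregate` helper and the "has <field>" predicate labels are assumptions of this example, mirroring the wording of the slide rather than any real cataloguing format.

```python
def disaggregate(record: dict, id_field: str = "Record ID") -> list:
    """Split one record into (subject, predicate, object) statements.

    Every field other than the identifier becomes its own statement
    about the record identifier.
    """
    subject = record[id_field]
    return [
        (subject, f"has {field}", value)
        for field, value in record.items()
        if field != id_field
    ]


# The record shown on the slide, as a dictionary.
record = {
    "Record ID": "1234",
    "Author": "Mirna Willer",
    "Title": "UNIMARC format for authority records",
    "Date": "2004",
}

for statement in disaggregate(record):
    print(*statement)
```

Each statement now stands alone and can be stored, corrected, or shared without reference to the other fields of the original record.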
Representing a single statement as an RDF triple
Statement
  1234 has Title “UNIMARC format for authority records”
  [subject] URI = http://natlibx/1234
  [property] URI = http://.../???
  [object] literal = “UNIMARC format for authority records”
Triple
  <http://natlibx/1234> <http://.../???> “UNIMARC ...”
  natlibx:1234 some:??? “UNIMARC ...”
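As a rough sketch, a statement with a URI subject, a URI property, and a literal object can be written out as one line in the N-Triples syntax. The subject URI is the one on the slide; the property URI is left open on the slide, so the value below is a hypothetical placeholder, and the helper names are assumptions of this example. The escaping is simplified to the most common cases.

```python
def escape_literal(text: str) -> str:
    """Escape backslash, double quote, and line breaks for an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriple(subject_uri: str, property_uri: str, literal: str) -> str:
    """Serialize a (URI, URI, plain literal) statement as one N-Triples line."""
    return f'<{subject_uri}> <{property_uri}> "{escape_literal(literal)}" .'


# Hypothetical placeholder: the slide does not yet choose a property URI.
PROPERTY_URI = "http://example.org/hypothetical/hasTitle"

line = to_ntriple(
    "http://natlibx/1234",
    PROPERTY_URI,
    "UNIMARC format for authority records",
)
print(line)
```

The choice of the real property URI is a modelling decision, which is the subject of the next slide.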
Property URIs
has Title = http://.../???
  ISBD:has Title Proper = http://iflastandards.info/ns/isbd/elements/1004
  FRBR:has Title of the Manifestation = http://iflastandards.info/ns/fr/frbr/frbrer/3020
  FRBR:has Title of the Expression = http://iflastandards.info/ns/fr/frbr/frbrer/3008
  FRBR:has Title of the Work = http://iflastandards.info/ns/fr/frbr/frbrer/3001
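One way to picture the choice on this slide is as a lookup from the model and level being described to a property URI. The four URIs are quoted from the slide; the lookup keys and the `title_triple` helper are naming assumptions made for this sketch.

```python
# Candidate property URIs for a "Title" field, as listed on the slide.
# The (model, level) keys are labels invented for this sketch.
TITLE_PROPERTIES = {
    ("ISBD", "title proper"): "http://iflastandards.info/ns/isbd/elements/1004",
    ("FRBR", "manifestation"): "http://iflastandards.info/ns/fr/frbr/frbrer/3020",
    ("FRBR", "expression"): "http://iflastandards.info/ns/fr/frbr/frbrer/3008",
    ("FRBR", "work"): "http://iflastandards.info/ns/fr/frbr/frbrer/3001",
}


def title_triple(subject_uri: str, title: str, model: str, level: str) -> tuple:
    """Build a title triple using the property chosen for the given model and level."""
    return (subject_uri, TITLE_PROPERTIES[(model, level)], title)


triple = title_triple(
    "http://natlibx/1234",
    "UNIMARC format for authority records",
    "ISBD",
    "title proper",
)
print(triple)
```

The same source field can therefore yield different triples depending on which standard's semantics the cataloguer intends.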
Inferring new triples from existing triples
An RDF property can have a domain (the type of thing the property is applied to) and a range (the type of thing that can be a value of the property).
Example: FRBR property “is created by (person)” (frbrer:2009) has domain Work (frbrer:1001) and range Person (frbrer:1005)
Triple
  natliby:456 frbrer:2009 viaf:21647077
Inferred triples
  natliby:456 rdf:type frbrer:1001
  viaf:21647077 rdf:type frbrer:1005
Therefore natliby:456 is a Work, and viaf:21647077 is a Person
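The domain and range reasoning above can be sketched as a small rule over a list of triples. This is a simplified illustration, not a full RDFS reasoner: the identifiers are the prefixed names from the slide, while the `SCHEMA` dictionary layout and the `infer_types` function are assumptions of this example.

```python
RDF_TYPE = "rdf:type"

# Domain and range declarations, using the identifiers from the slide.
SCHEMA = {
    "frbrer:2009": {"domain": "frbrer:1001", "range": "frbrer:1005"},
}


def infer_types(triples, schema):
    """Return the rdf:type triples implied by domain and range declarations."""
    inferred = set()
    for subject, prop, obj in triples:
        declaration = schema.get(prop)
        if declaration is None:
            continue
        if "domain" in declaration:
            # The subject of the property is an instance of its domain.
            inferred.add((subject, RDF_TYPE, declaration["domain"]))
        if "range" in declaration:
            # The object of the property is an instance of its range.
            inferred.add((obj, RDF_TYPE, declaration["range"]))
    # Report only triples that were not already stated.
    return inferred - set(triples)


data = [("natliby:456", "frbrer:2009", "viaf:21647077")]
new_triples = infer_types(data, SCHEMA)
for triple in sorted(new_triples):
    print(*triple)
```

One stated triple produces two new ones here, which is how inferencing multiplies the value of data parsed from legacy records.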
Linking triples
Statement
  1234 has Author Mirna Willer
  [object] URI = viaf:29776655
Triple
  natlibx:1234 some:123 viaf:29776655
Another
  viaf:29776655 [is Author of] natliby:456
  and natliby:456 frbrer:2009 viaf:21647077
  and viaf:21647077 foaf:name “Dunsire, Gordon”
Q: Who is a co-author with Mirna Willer? A: “Dunsire, Gordon”
Q: Are they persons? A: Yes
Q: Really? A: VIAF & natliby say so!
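The co-author question on this slide can be answered by following shared URIs across triples from different sources. The sketch below assumes the triples have already been collected into one list; the slide leaves the "is Author of" property unnamed, so `ex:isAuthorOf` is a hypothetical label, and the function name is likewise an assumption of this example.

```python
IS_AUTHOR_OF = "ex:isAuthorOf"  # hypothetical name for the slide's [is Author of]
CREATED_BY = "frbrer:2009"      # "is created by (person)", from the slides
NAME = "foaf:name"

# Triples from the slide, merged from several sources into one list.
triples = [
    ("natlibx:1234", "some:123", "viaf:29776655"),
    ("viaf:29776655", IS_AUTHOR_OF, "natliby:456"),
    ("natliby:456", CREATED_BY, "viaf:21647077"),
    ("viaf:21647077", NAME, "Dunsire, Gordon"),
]


def co_author_names(triples, person):
    """Names of other creators of any work the given person is author of."""
    works = {o for s, p, o in triples if s == person and p == IS_AUTHOR_OF}
    creators = {
        o for s, p, o in triples
        if s in works and p == CREATED_BY and o != person
    }
    return sorted(o for s, p, o in triples if s in creators and p == NAME)


print(co_author_names(triples, "viaf:29776655"))
```

No single source states the co-authorship directly; the answer emerges only because the sources reuse the same VIAF and work identifiers.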
Metadata focus
Shift of focus of metadata creation, maintenance, storage, preservation (by professionals, amateurs, machines)
From Record
To Statement(s) = triple(s)
But metadata display ...
... aggregates triples (from multiple sources) to create records on the fly
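Aggregating triples back into a record for display can be sketched as grouping by subject. The statements reuse the values from the earlier record slide; splitting them across two sources, the "has <field>" predicate labels, and the `build_record` helper are assumptions made for illustration.

```python
from collections import defaultdict

# The same statements as on the earlier slide, assumed here to come from
# two different sources.
source_a = [
    ("natlibx:1234", "has Author", "Mirna Willer"),
    ("natlibx:1234", "has Title", "UNIMARC format for authority records"),
]
source_b = [
    ("natlibx:1234", "has Date", "2004"),
]


def build_record(triples, subject):
    """Group all statements about one subject into a display record."""
    record = defaultdict(list)
    for s, prop, value in triples:
        if s == subject:
            # A property may repeat, so values are kept in a list.
            record[prop].append(value)
    return dict(record)


record = build_record(source_a + source_b, "natlibx:1234")
for prop, values in record.items():
    print(prop, values)
```

The record exists only at display time; what is created, maintained, and stored is the individual statement.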
Thank you
• mwiller@unizd.hr
• gordon@gordondunsire.com