Metadata Vocabularies in Subject Gateways for Japanese Regional Public Libraries Shigeo Sugimoto Research Center for Knowledge Communities Graduate School of Library, Information and Media Studies University of Tsukuba
Some Japanese Governmental Activities, Metadata Vocabularies, and a Model for Sharing Metadata Schemas Shigeo Sugimoto Research Center for Knowledge Communities Graduate School of Library, Information and Media Studies University of Tsukuba
DC in Japanese Industrial Standard • Simple Dublin Core was accepted as a Japanese national standard, JIS X 0836, in March 2005 • Translation of ISO 15836 • Key issue for translation • Choice of Japanese terms for labels, which should be domain neutral • Example: Subject element • 主題、件名 • Straightforward translations of “subject” • Library-oriented terms, with a possibility of confusion with the meaning of Title • キーワード (keyword) • Domain neutral • Chosen for JIS X 0836
Some Activities in Japan • Preservation • Web Archiving: National Diet Library (NDL) • Governmental Resources: NDL and National Archives of Japan • NDL • National Deposit Library • Appointed as the central organization for national Web archiving • Revision of Law for legal deposit • National Archives of Japan • Archiving of government resources • Born digital resources • Definition of “Governmental Resource” • Central repository for government documents
Digital Okayama Dai-Hyakka (DODH) • DODH • Regional portal by the Okayama Prefectural Library • Metadata creation by librarians and non-professionals, e.g. school teachers, students, and volunteers. • A Key Issue: Subject Classification • Choice of subject vocabularies • A small set of subject terms, designed in accordance with regional needs and usable by non-professionals.
Basic Ideas in DODH • Basic Question: Is a comprehensive/conventional subject vocabulary really useful? • Use domain-specific vocabularies in addition to a comprehensive vocabulary widely used by libraries • Three vocabularies • NDC (Nippon Decimal Classification) • Very widely used by Japanese libraries • Kid’s Vocabulary • Designed for children and child resources • Multiple labels in accordance with user ages • Prefectural Resource Vocabulary • Developed by the government of Okayama prefecture for their resources
Basic Ideas in DODH • Maintenance of subject vocabularies • Librarians’ major concern about domain-specific vocabularies • This is still an open question. • Need software tools to maintain vocabularies • Metadata Schema Registry • Semantic Web technology
Some Issues on Subject Vocabularies • Comprehensive and conventional subject vocabularies are not always useful for domain-specific resources. • Regional governments’ vocabularies for classifying resources • Comparison between the three subject vocabularies • Mappings between all pairs of these three vocabularies show the characteristics of vocabularies and requirements for classifying resources from different perspectives.
Classification Vocabularies of Four Prefectural Governments in Japan [Table; column headings: Tokyo, Okayama, Ibaraki, Kagawa, Kanagawa; cell values not recoverable] “+” means that terms expressing names of regions or organizations are excluded.
Distribution of Terms in the NDC term space - Okayama’s Case - NDC: 000=Generalities, 100=Philosophy, 200=History, 300=Social Sciences, 400=Natural Sciences, 500=Technology, 600=Industry, 700=The Arts, 800=Language, 900=Literature #NDC in KV/PV: the number of NDC terms in x00 used in the KV/PV mapping
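The count described above (the number of NDC terms in each x00 class used in a KV or PV mapping) could be computed along the following lines. This is a minimal sketch, not the method used in the study: the mapping entries, the names `kv_to_ndc`, `ndc_hundreds`, and `distribution`, and the assumption that each local term maps to a list of three-digit NDC numbers are all invented for illustration.

```python
from collections import Counter

# Hypothetical mapping from Kid's Vocabulary (KV) terms to NDC numbers.
# The terms and their targets are invented for illustration only.
kv_to_ndc = {
    "animals": ["480"],
    "space": ["440"],
    "sports": ["780"],
    "folk tales": ["388", "913"],
}


def ndc_hundreds(ndc_number: str) -> str:
    """Return the top-level NDC class (x00) of a three-digit NDC number."""
    return ndc_number[0] + "00"


def distribution(mapping: dict[str, list[str]]) -> Counter:
    """Count the distinct NDC numbers used in a mapping, grouped by x00 class."""
    used = {number for targets in mapping.values() for number in targets}
    return Counter(ndc_hundreds(number) for number in used)


kv_distribution = distribution(kv_to_ndc)
```

Running the same function over a PV mapping would give the second series for comparison; classes with no mapped terms simply have a count of zero.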
Subject Vocabularies - Okayama’s Case - [Figure: coverage of the three vocabularies] • PV: Resources for Government and Social Activities • KV: Educational and Learning Resources • NDC (comprehensive): General Resources • Subject areas shown: Social Science, Technology, Natural Science, Arts, Industries
Discussions • Interoperability vs. Domain Specificity • A reasonably small, domain-specific vocabulary is advantageous for domain-specific applications. • Domain specificity is, in general, disadvantageous for interoperability. • Maintainability of Vocabularies • Long-term maintenance of vocabularies is expensive. • Reusability and customizability • Neighboring communities would need to share a vocabulary in part. • Need a good model to solve these issues
A Model for Reusing Metadata Schemas: A structural view of application profile • Metadata Vocabulary 1 (Metadata Element Set): termA, termB, termC • Metadata Vocabulary 2 (Metadata Element Set): termX, termY, termZ • Application profile constraints • termA: Mandatory • termC: Optional, Repeatable • termZ: Mandatory if applicable • termX: Mandatory, Repeatable
A Model for Reusing Metadata Schemas: Abstract Syntax and Concrete Syntax • Metadata Vocabulary 1 (Metadata Element Set): termA, termB, termC • Metadata Vocabulary 2 (Metadata Element Set): termX, termY, termZ • Application Profile: terms used in an application and structural constraints • termA: Mandatory • termC: Optional, Repeatable • termZ: Mandatory if applicable • termX: Mandatory, Repeatable • Description in a syntax defined in an application • <rdf:Description about="foo"> <mv1:A>an example.</mv1:A> <mv2:X>bar</mv2:X> ... • <meta name="mv1:A" content="an example"> <meta name="mv2:X" content="bar"> ...
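One way to read the application profile on this slide is as a list of terms with structural constraints that a description can be checked against. The sketch below is illustrative only. The `Constraint` class, the `validate` function, and the qualified names `mv1:C` and `mv2:Z` are hypothetical. It also assumes that a term not marked Repeatable may occur at most once, and it treats "Mandatory if applicable" as optional, since applicability cannot be decided mechanically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    term: str         # qualified term name, e.g. "mv1:A"
    mandatory: bool   # must occur at least once
    repeatable: bool  # may occur more than once


# Constraints transcribed from the slide, with the assumptions stated above.
PROFILE = [
    Constraint("mv1:A", mandatory=True, repeatable=False),
    Constraint("mv1:C", mandatory=False, repeatable=True),
    Constraint("mv2:X", mandatory=True, repeatable=True),
    # "Mandatory if applicable" is modeled as optional (assumption).
    Constraint("mv2:Z", mandatory=False, repeatable=False),
]


def validate(description: list[tuple[str, str]], profile=PROFILE) -> list[str]:
    """Return a list of constraint violations for a (term, value) description."""
    allowed = {c.term for c in profile}
    counts: dict[str, int] = {}
    errors = []
    for term, _value in description:
        if term not in allowed:
            errors.append(f"{term}: not in profile")
        counts[term] = counts.get(term, 0) + 1
    for c in profile:
        n = counts.get(c.term, 0)
        if c.mandatory and n == 0:
            errors.append(f"{c.term}: mandatory term missing")
        if not c.repeatable and n > 1:
            errors.append(f"{c.term}: not repeatable, found {n}")
    return errors


# The description from the slide satisfies the profile.
slide_example = [("mv1:A", "an example."), ("mv2:X", "bar")]
slide_errors = validate(slide_example)
```

Keeping the constraints as data, separate from the vocabularies that define the terms, mirrors the slide's point that a profile selects terms from several element sets without redefining them.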
A Model for Reusing Metadata Schemas: A Layered Model (split semantics and syntax into layers) • Layered Model of Metadata Schema • Layer 3, Concrete Syntax: RDF implementation; XML implementation in an XML Schema; an Oracle implementation • Layer 2, Abstract Syntax: DCMI Library Application Profile; Open Archives Initiative Schema; IPL Asia Schema; ULIS Core Schema • Layer 1, Semantics: DCMES (Elements and Qualifiers); ULIS element extension; IEEE-LOM
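The separation between abstract and concrete syntax can be illustrated by holding one description as plain (term, value) pairs and rendering it in two different syntaxes. This is a sketch under assumptions: the function names `to_rdf_xml` and `to_html_meta` are hypothetical, the output is a fragment without namespace declarations, and the RDF rendering writes `rdf:about` rather than the unqualified `about` of the earlier example.

```python
from xml.sax.saxutils import escape, quoteattr

# Abstract description: independent of any concrete syntax.
ABOUT = "foo"
STATEMENTS = [("mv1:A", "an example."), ("mv2:X", "bar")]


def to_rdf_xml(about: str, statements: list[tuple[str, str]]) -> str:
    """Render the description as an RDF/XML fragment (namespaces not declared)."""
    lines = [f"<rdf:Description rdf:about={quoteattr(about)}>"]
    for term, value in statements:
        lines.append(f"  <{term}>{escape(value)}</{term}>")
    lines.append("</rdf:Description>")
    return "\n".join(lines)


def to_html_meta(statements: list[tuple[str, str]]) -> str:
    """Render the same statements as HTML meta elements."""
    return "\n".join(
        f"<meta name={quoteattr(term)} content={quoteattr(value)}>"
        for term, value in statements
    )


rdf_fragment = to_rdf_xml(ABOUT, STATEMENTS)
html_fragment = to_html_meta(STATEMENTS)
```

Both renderings draw on the same terms and the same statements, so adding another concrete syntax means adding one more rendering function while the vocabulary and the profile stay unchanged.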
A Model for Reusing Metadata Schemas: Layered Model and Metadata Schema Registry • Layer 3 • Layer 2: Application Profile A; Application Profile B • Layer 1: DCMI Registry (DCMES Terms: Elements and Qualifiers); Tsukuba Registry (ULIS element extension; ULIS-DL Subject Vocabulary)
A Model for Reusing Metadata Schemas: Layered Model and Metadata Schema Registry • Layer 3: XML Schema for A; XML Schema for B • Layer 2 • Layer 1: DCMI Registry (DCMES Terms: Elements and Qualifiers); Tsukuba Registry (ULIS element extension; ULIS-DL Subject Vocabulary)
A Model for Reusing Metadata Schemas: Layered Model and Metadata Schema Registry • Layer 3 • Layer 2 • Layer 1: DCMI Registry; Tsukuba Registry • Use registries to share, reuse, customize and maintain vocabularies • Develop software tools to support these functions.
Discussions • Need a good model and software tools • DCAM provides the basic architecture of metadata elements and metadata descriptions • The layered model presented is designed to enhance the reusability and customizability of metadata schemas by separating their semantic and syntactic features. • Metadata Schema Registry as a key component for sharing metadata schemas • Collaborating registries to collect community-specific schemas and to share the schemas among communities • Software tools attached to a registry, e.g. • Vocabulary maintenance tool • Metadata application software generator
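The idea of collaborating registries can be sketched as a lookup that tries several registries in turn, so that an application can reuse terms defined by different communities. This is a toy in-memory illustration, not the design of the DCMI or Tsukuba registries: the `Registry` class, the `resolve` function, the term `ulis:exampleTerm`, and both definition strings are hypothetical placeholders.

```python
from typing import Optional


class Registry:
    """A toy in-memory registry that maps qualified term names to definitions."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._terms: dict[str, str] = {}

    def register(self, term: str, definition: str) -> None:
        self._terms[term] = definition

    def lookup(self, term: str) -> Optional[str]:
        return self._terms.get(term)


def resolve(term: str, registries: list[Registry]) -> Optional[tuple[str, str]]:
    """Return (registry name, definition) from the first registry that knows the term."""
    for registry in registries:
        definition = registry.lookup(term)
        if definition is not None:
            return registry.name, definition
    return None


# Placeholder content; real registries would hold full schema records.
dcmi = Registry("DCMI Registry")
dcmi.register("dc:subject", "placeholder definition of subject")

tsukuba = Registry("Tsukuba Registry")
tsukuba.register("ulis:exampleTerm", "placeholder definition of a local extension term")

found = resolve("ulis:exampleTerm", [dcmi, tsukuba])
missing = resolve("mv9:unknown", [dcmi, tsukuba])
```

A vocabulary maintenance tool or an application generator of the kind listed above would sit on top of such a lookup, reading term definitions from whichever registry holds them.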