ALA Midwinter, ALCTS, CCS, Philadelphia, PA, January 12, 2008 Metadata Creation and Metadata Quality Control across Digital Repositories

Dr. Jung-ran Park

Caimei Lu

Drexel University

Research supported through IMLS award (2006-2009)

Research Needs

  • Rapid proliferation of digitization projects by libraries and other organizations calls for serious research on metadata quality evaluation.

  • Resource discovery and exchange across ever-growing distributed digital collections demand semantic interoperability based on accurate and consistent resource description.

Research Goals

  • Assess the current status of metadata creation and mapping between cataloger-defined field names and Dublin Core (DC) metadata elements across three digital image collections.

  • Identify the most frequently occurring inconsistent and incomplete applications of DC metadata.

  • Overarching goal: examine metadata quality in relation to semantic interoperability of concept representation

Research Questions

  • What is the current practice of metadata creation and semantic mapping across digitized image collections utilizing CONTENTdm?

  • Which field names most frequently produce inaccurate, inconsistent, and null mappings from cataloger-defined field names onto DC metadata elements?

  • What types of locally created metadata elements are added to the DC metadata scheme by these three user groups of CONTENTdm?

  • What conceptual ambiguities and semantic overlaps can be found in the DC metadata elements?

Data and Research Methods

  • The study compares and analyzes 20 digital image metadata templates (see Table 1) and 659 metadata item records (see Figure 1) for digitized image collections drawn from three repositories.

  • Each DC metadata element name and its corresponding definition are examined using linguistic semantic analysis.

Table 1: Metadata Template

Criteria for Examining Metadata Item Records

  • Completeness/unused DC elements

  • Accuracy

  • Consistency

  • Local addition
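
As a rough illustration of the completeness and local-addition criteria, the sketch below checks one item record against the 15 unqualified DC elements. The function name `examine_record`, the sample record, and the decision to treat empty values as unused are assumptions made for this example, not the procedure used in the study.

```python
# The 15 elements of the unqualified Dublin Core Metadata Element Set.
DC_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


def examine_record(record):
    """Return unused DC elements, locally added fields, and a completeness ratio.

    `record` maps an element or field name to its string value.
    Empty or whitespace-only values are treated as unused (an assumption).
    """
    used = {name.lower() for name, value in record.items()
            if value and value.strip()}
    unused_dc = sorted(DC_ELEMENTS - used)
    local_additions = sorted(used - DC_ELEMENTS)
    completeness = len(used & DC_ELEMENTS) / len(DC_ELEMENTS)
    return unused_dc, local_additions, completeness


# Hypothetical record, for illustration only.
sample = {
    "Title": "Street scene",
    "Subject": "Streets",
    "Format": "image/jpeg",
    "Language": "",
    "Scan date": "2007-03-01",
}
unused, local, ratio = examine_record(sample)
```

Here `local` lists fields that fall outside the DC scheme (such as a scan date), while `unused` lists DC elements the record leaves empty. Accuracy and consistency need human judgment or cross-record comparison and are not covered by this check.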

Concept Equivalence

(J. R. Park 2002)

Diagram 1. Source concept equivalent to several target concepts:

[Source] [Target]




Diagram 2. Two or more source concepts equivalent to one target concept:

[Source] [Target]




Diagram 3. No conceptual equivalent between the source concept and the target concept:

[Source] [Target]
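
The three patterns in the diagrams can be sketched as a simple classification over (source, target) concept pairs. This is a minimal sketch: the function name `classify_equivalence` and the sample pairs are hypothetical, and the pairs only loosely echo field names mentioned elsewhere in these slides.

```python
from collections import defaultdict


def classify_equivalence(pairs):
    """Group (source, target) concept pairs into the three diagram patterns.

    `target` is None when the source concept has no equivalent.
    """
    targets_of = defaultdict(set)
    sources_of = defaultdict(set)
    for source, target in pairs:
        targets_of[source]  # register the source even if it has no target
        if target is not None:
            targets_of[source].add(target)
            sources_of[target].add(source)
    return {
        # Diagram 1: one source concept, several target concepts.
        "one_source_to_many_targets": sorted(
            s for s, t in targets_of.items() if len(t) > 1),
        # Diagram 2: several source concepts, one target concept.
        "many_sources_to_one_target": sorted(
            t for t, s in sources_of.items() if len(s) > 1),
        # Diagram 3: no conceptual equivalent for the source concept.
        "no_equivalent": sorted(
            s for s, t in targets_of.items() if not t),
    }


# Hypothetical pairs, for illustration only.
pairs = [
    ("physical description", "description"),
    ("physical description", "format"),
    ("digital collection", "relation"),
    ("example issues", "relation"),
    ("contact information", None),
]
result = classify_equivalence(pairs)
```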




Inaccurate and Inconsistent Field Names and Metadata Elements

  • The ‘Physical description’ field is mapped onto either DC Description or DC Format.

  • The DC elements Type and Format cause considerable confusion and are used interchangeably.

  • DC elements Source and Relation are inconsistently mapped onto various cataloger-defined fields.

  • The DC element Relation is used interchangeably with cataloger-defined field names such as ‘digital collection’ and ‘example issues.’
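
One way to surface inconsistencies of this kind is to tally, across templates, which DC element each cataloger-defined field is mapped onto. The sketch below assumes each template is available as a plain dictionary; the function name `inconsistent_mappings` and the sample templates are hypothetical and do not reproduce the study's data.

```python
from collections import Counter, defaultdict


def inconsistent_mappings(templates):
    """Find field names mapped onto different DC elements across templates.

    Each template is a dict of cataloger-defined field name -> DC element.
    Returns {field: Counter of DC elements} for every field that has more
    than one distinct mapping.
    """
    seen = defaultdict(Counter)
    for template in templates:
        for field, dc_element in template.items():
            # Normalize case and whitespace so variants count as one name.
            seen[field.strip().lower()][dc_element.strip().lower()] += 1
    return {field: counts for field, counts in seen.items() if len(counts) > 1}


# Hypothetical templates, for illustration only.
templates = [
    {"Physical description": "Description", "Title": "Title"},
    {"Physical description": "Format", "Title": "Title"},
    {"Physical description": "Format"},
]
report = inconsistent_mappings(templates)
```

The counts show how the mappings split, which helps separate a one-off deviation from a genuinely divided practice.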

Most Frequent Null Mapping Fields/All Locally Added Metadata Elements

  • Accessibility and Provenance:

    - Contact information

    - Ordering information

    - Acquisition

    - Image modification

    - Full resolution, scan date, full text, note

Most and Least Used DC Metadata Elements

  • Most: subject, description, title, format, coverage (over 50%)

  • Least: language, relation, source, creator and identifier

Semantic Overlaps in DC Metadata Elements

  • The inherent conceptual ambiguities and semantic overlaps in some of the DC metadata elements affect semantic interoperability. Semantic overlap among certain DC metadata element names and their corresponding definitions creates conceptual ambiguity and consequently hinders accurate, consistent, and complete application of the DC metadata scheme.

Format vs. Type

  • Format is “physical or digital manifestation of the resource” —unqualified DC metadata (DCMI, 2005)

  • Type: “image may include both electronic and physical representations” —qualified DC metadata (DCMI, 2005) type vocabulary on image

Creator, Contributor vs. Publisher

  • Creator: “An entity primarily responsible for making the content of the resource.”

  • Contributor: “An entity responsible for making contributions to the content of the resource.”

  • Publisher: “An entity responsible for making the resource available.”

    source: unqualified DC metadata (DCMI, 2005)

Source vs. Relation

  • Source is “a reference to a resource from which the present resource is derived.”—unqualified DC metadata (DCMI, 2005)

  • Relation is “the described resource is a physical or logical part of the referenced resource.” — qualified DC metadata: Relation, is Part of

  • Relation is “the described resource is a version, edition, or adaptation of the referenced resource.” — qualified DC metadata: Relation, is Version of

    Source is a particular type of Relation.

Implications 1

  • Training and educating catalogers for metadata creation and mapping

Implications 2

  • There is a critical need to develop mediation mechanisms, such as guidelines and concept maps, that facilitate the metadata creation and mapping process.

Implications 3

  • Semantic interoperability across digital collections that use the DC metadata scheme is hindered in part by drawbacks inherent in the semantics of the scheme. The DC metadata scheme needs to evolve further to disambiguate the semantic relations among the DC metadata elements that present semantic overlaps and conceptual ambiguities.

Future Studies

  • Metadata application guidelines (i.e., content specification) and procedures for cataloging professionals to follow during the creation of descriptive metadata elements and application of controlled vocabularies

  • Identification of criteria and reasoning behind local addition and variation of metadata element values to and from selected metadata and controlled vocabulary schemes

  • Identification of measures and procedures for metadata quality control employed by cataloging professionals in describing digital resources.

  • Identification of new competencies and skill sets needed by cataloging professionals and current trends in LIS curricula designed to address such needs.

  • Survey and focus group interviews with catalogers

Questions/Comments

Thank you