
Using Metadata in CONTENTdm. Diana Brooking and Allen Maberry, Metadata Implementation Group, Univ. of Washington. Crossing Organizational Boundaries, Oct. 29, 2002.


Using Metadata in CONTENTdm

Diana Brooking and Allen Maberry

Metadata Implementation Group, Univ. of Washington

Crossing Organizational Boundaries

Oct. 29, 2002


Outline

  • The metadata “environment”: factors that influence basic decisions

  • Structure of metadata: Dublin Core, field structure in CONTENTdm

  • Content standards: what goes into the fields, formatting, controlled vocabularies

  • The data dictionary: bringing it all together


Metadata: what is it?

  • Data about data

    • “Metadata are data that describe the attributes of a resource; characterize its relationships; support its discovery, management, and effective use; and exist in an electronic environment.”(Sherry Vellucci, LRTS 44 (1), 1999)

  • Commonly known as cataloging


Metadata: how is it used?

  • For description: information for display with the image

  • For searching: users search for images by searching for text attached to the image


Basic Decisions: Description

  • How much information do you have?

  • How much information do your users need/want?

    • What is depicted in the image?

    • Who created it?

    • Why is it important? Why did you select it?

  • How much detail do you need to go into?


Basic Decisions: Searching

  • How will users find the images? What will they be looking for? What aspects are they interested in?

  • How will you find the images? What are your staff’s needs?

  • At what level do you need to distinguish images from one another?

  • At what level do you need to bring like resources together?


Decision Factors

  • Size of file

    • 50 images (small enough to browse)

    • 10,000 images (need for more precise searching)

    • 10,000 images of many different things vs. 10,000 images of trains


Decision Factors

  • Audience

    • General public vs. specialists (e.g., railroad enthusiasts)

  • Institutional mission

    • Say you are a railroad museum (audience expectations)


Decision Factors

  • Legacy data

    • Starting from scratch

    • Years of good cataloging

    • Years of inconsistent cataloging

  • Software issues

    • What kind of data can the system handle?

    • What are its search capabilities?

    • Short-term vs. long-term view


Basic Dublin Core Metadata

  • What is the Dublin Core Metadata Element Set (DCMES)?

  • Why was it developed, and how has it been developed?

  • A short history of the DC Initiative is available at http://www.dublincore.org/about/overview/


Dublin Core Metadata Element Set

  • There are 15 basic elements

  • See Dublin Core Metadata Element Set, Version 1.1: Reference Description

  • But it is adaptable and expandable to fit the needs of different users through the use of “application profiles”
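
As a rough illustration, the 15 elements of DCMES 1.1 can be written as a simple list and used to spot local fields in a record. The sample record and the helper name `unknown_elements` are made up for this sketch; they are not part of Dublin Core or CONTENTdm.

```python
# The 15 elements of the Dublin Core Metadata Element Set, version 1.1.
DC_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]


def unknown_elements(record):
    """Return the keys of `record` that are not DC elements, sorted."""
    return sorted(key for key in record if key not in DC_ELEMENTS)


# Hypothetical record: every DC element is optional and may be omitted.
record = {
    "title": "Locomotive at the depot",
    "creator": "Unknown photographer",
    "date": "1910",
    "shelfmark": "Box 12",  # a local field, not a DC element
}

print(unknown_elements(record))
```

A local field such as the invented `shelfmark` is the kind of extension an application profile would document.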


Dublin Core and CONTENTdm

  • CONTENTdm is designed around the Dublin Core

  • (Very) basic overview of how CONTENTdm works

    • CONTENTdm uses DC element names as file names

    • Because each database has constant file names, it is easy to combine them to search one or more collections


Dublin Core mapping

  • An example:

    • Collection A has a field “Photographer” mapped to DC:Creator, and Collection B has a field “Artist” mapped to DC:Creator. A search across both databases queries the CONTENTdm index “Creat*” and retrieves both “Photographer” and “Artist” data from collections A + B, or A + B + n…
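
The mapping idea in this example can be sketched in a few lines of Python. This is only a model of the concept, not how CONTENTdm is implemented: the mappings, the records, and the function name `search_dc` are all hypothetical.

```python
# Hypothetical field-to-DC mappings for two collections.
FIELD_TO_DC = {
    "A": {"Photographer": "creator", "Title": "title"},
    "B": {"Artist": "creator", "Title": "title"},
}

# Hypothetical records, keyed by each collection's local field labels.
RECORDS = {
    "A": [{"Photographer": "Smith, Jane", "Title": "Harbor at dawn"}],
    "B": [{"Artist": "Doe, John", "Title": "Untitled"}],
}


def search_dc(element, text, collections):
    """Search every local field mapped to `element` in the given collections."""
    hits = []
    for name in collections:
        for record in RECORDS[name]:
            for label, value in record.items():
                mapped = FIELD_TO_DC[name].get(label)
                if mapped == element and text.lower() in value.lower():
                    hits.append((name, label, value))
    return hits


# One "creator" search reaches both Photographer and Artist values.
print(search_dc("creator", "doe", ["A", "B"]))
```

The caller never names “Photographer” or “Artist”; the shared DC element is the only search key, which is why consistent mapping matters.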


Dublin Core and searching

  • What are the practical consequences of this?

    • In cross-database searching, one can search on specific fields. However, the names of these fields will not be “Photographer” or “Artist”, but “Creator”, because that is the common name of the index in each collection.

    • However, you can do a keyword search on all “searchable” fields in the database, whether they are mapped to a Dublin Core field or not.


Modern Book Arts field labels

  • bibliographic description = descr0

  • text production = descr1

  • image production = descr2, etc.

Cross-database search index

  • Description = descr*
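
To make the wildcard concrete, the sketch below uses Python's standard `fnmatch` module to show which of the field labels above fall under the pattern `descr*`. It only mimics the idea of a shared index name; the helper `fields_for_pattern` is hypothetical and says nothing about CONTENTdm's internals.

```python
from fnmatch import fnmatch

# Field labels and index names as listed above for Modern Book Arts.
MODERN_BOOK_ARTS = {
    "bibliographic description": "descr0",
    "text production": "descr1",
    "image production": "descr2",
}


def fields_for_pattern(labels, pattern):
    """Return the field labels whose index name matches the wildcard pattern."""
    return [label for label, index in labels.items() if fnmatch(index, pattern)]


# All three local fields are reached through the single "Description" index.
print(fields_for_pattern(MODERN_BOOK_ARTS, "descr*"))
```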


Dublin Core tips

  • Be careful about what information you put in searchable fields, even if they are not mapped to a DC element.

  • If you have multiple collections, map the same type of data to the same DC elements consistently.


Content Standards

  • Used for choosing and formatting the data that goes into the fields.

  • Increase coherence and intelligibility of description

  • Enhance reliability of retrieval

  • Enable compatibility with other collections (cross-database searching)

  • Make maintenance and possible migration of data to other software easier


Standards = Consistency

  • “Date” field: dates should always be formatted the same way

  • “Photographer” field: same person’s name should always appear in the same form

  • “Subject” field: same topic should have the same term used to describe it across images

  • If different terms or formats are used, the user may not even realize that more than one search is necessary
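
For the “Date” field, one possible way to enforce a single format is to normalize incoming values before they are entered. The sketch below assumes a target format of YYYY-MM-DD and a short list of accepted input formats; both are choices made for this example, not a rule from the presentation.

```python
from datetime import datetime

# Input formats this sketch accepts; the list is an assumption.
# Month names (%B, %b) are parsed using the current locale, assumed English here.
KNOWN_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%b. %d, %Y"]


def normalize_date(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # try the next format
    return None


for raw in ["Oct. 29, 2002", "10/29/2002", "October 29, 2002", "circa 1900"]:
    print(raw, "->", normalize_date(raw))
```

Values that return `None`, such as approximate dates, would need a documented convention of their own rather than a guess.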


Examples of Content Standards

For description:

  • Anglo-American Cataloguing Rules, 2nd ed., 2002 revision (libraries)

  • Graphic Materials: Rules for Describing Original Items and Historical Collections, 1982; revisions available electronically (libraries, also museums, historical societies, LC Prints & Photo., CORBIS)


Content Standards: Controlled Vocabularies

“Any subset of the lexicon of a natural language. A list of preferred and nonpreferred terms produced by the process of vocabulary control. Types of controlled vocabularies include subject heading lists and thesauri.” (NISO)
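
The “preferred and nonpreferred terms” in this definition can be pictured as a lookup from any variant to one preferred form. The tiny vocabulary below is invented for illustration and is not copied from any published thesaurus; `preferred_term` is a hypothetical helper.

```python
# Hypothetical mini-vocabulary: each preferred term lists its nonpreferred variants.
VOCABULARY = {
    "Railroads": ["Railways", "Rail lines"],
    "Locomotives": ["Train engines", "Engines, Railroad"],
}

# Build a lookup from any variant (lowercased) to its preferred term.
LOOKUP = {}
for preferred, variants in VOCABULARY.items():
    for term in [preferred, *variants]:
        LOOKUP[term.lower()] = preferred


def preferred_term(term):
    """Return the preferred form of `term`, or None if it is not in the list."""
    return LOOKUP.get(term.strip().lower())


print(preferred_term("railways"))
print(preferred_term("Airplanes"))
```

A cataloger would enter only the preferred term, so one search retrieves every image on that topic.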


Controlled vocabs for which fields?

  • When you need consistency across images, so that user searches find all …

    • Proper names for things (people, places, etc.)

    • Subjects depicted in the images

  • Not necessary when you have…

    • Fields that contain data more likely to be unique to the particular image (title, notes, other free text fields)


Remember…

You can have fields that don’t use controlled vocabularies, but where you still need consistency in format:

  • Dates

  • Image numbers

  • Physical description

  • You could create your own controlled vocab lists (if you really had to)


Controlled Vocabularies

For names:

  • Library of Congress/National Authority File: http://authorities.loc.gov

  • Union List of Artist Names (Getty): http://www.getty.edu/research/tools/vocabulary/ulan

  • USGS Geographic Names Information System: http://geonames.usgs.gov/gnishome.html


Controlled Vocabularies

For subjects:

  • Library of Congress Subject Headings: http://authorities.loc.gov

  • LC Thesaurus for Graphic Materials: http://www.loc.gov/rr/print/tgm1

  • Art & Architecture Thesaurus (Getty): http://www.getty.edu/research/tools/vocabulary/aat

  • Chenhall’s Nomenclature (The Revised Nomenclature for Museum Cataloging. Walnut Creek: Altamira Press, 1995)


Vocabulary conflicts?

  • DC Subject: LCSH vs. AAT

    • Church buildings vs. Churches

  • DC Coverage: LC vs. Board on Geographic Names

    • Moscow vs. Moskva

  • Challenge of meeting needs of diverse collections and users, while maintaining consistency within and between databases


Data Dictionaries

For each project a data dictionary documents:

  • Database-specific field labels

  • Mapping of fields to DC elements

  • Data formatting instructions for each field

  • Recommended controlled vocabularies

  • UW data dictionaries: http://www.lib.washington.edu/msd/mig/datadicts/default.html

  • MOHAI
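
A data dictionary of this kind could also be kept in machine-readable form and used to check records. The sketch below is one possible shape, assuming regular expressions for the formatting instructions; the field labels, patterns, and vocabulary names are illustrative and not taken from a UW data dictionary.

```python
import re

# Hypothetical data dictionary for one collection: local label, DC mapping,
# formatting pattern (if any), and recommended controlled vocabulary (if any).
DATA_DICTIONARY = [
    {"label": "Photographer", "dc": "creator", "pattern": None,
     "vocabulary": "name authority file"},
    {"label": "Date", "dc": "date", "pattern": r"\d{4}(-\d{2}(-\d{2})?)?",
     "vocabulary": None},
    {"label": "Image number", "dc": "identifier", "pattern": r"[A-Z]{3}\d{4}",
     "vocabulary": None},
]


def format_errors(record):
    """List the field labels whose values break the formatting pattern."""
    errors = []
    for field in DATA_DICTIONARY:
        value = record.get(field["label"])
        if value is None or field["pattern"] is None:
            continue  # nothing to check for this field
        if not re.fullmatch(field["pattern"], value):
            errors.append(field["label"])
    return errors


print(format_errors({"Date": "2002-10-29", "Image number": "UWA0012"}))
print(format_errors({"Date": "Oct. 29, 2002", "Image number": "12"}))
```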

