Trends in Metadata Practices: A Longitudinal Study of Collection Federation

Carole L. Palmer, Oksana Zavalina, & Megan Mustafoff

Center for Informatics Research in Science and Scholarship (CIRSS)

Graduate School of Library and Information Science

University of Illinois at Urbana-Champaign

ACM IEEE Joint Conference on Digital Libraries, Vancouver, British Columbia, Canada

22 June 2007

Institute of Museum and Library Services

Digital Collections and Content


Outline

  • Background on IMLS Digital Collections & Content (DCC) project

  • Selected results from surveys on metadata applications and supplementary data

  • Conclusions and future work

The people

UIUC Library: Tim Cole, PI

Amy Jackson, Project Coordinator

Bill Mischo & Sarah Shreeves, co-PIs

GSLIS: Carole Palmer & Mike Twidale, co-PIs

Oksana Zavalina, Research Assistant

Participants: IMLS NLG project developers and metadata librarians; usability testers

Development aim: integrated access

to digital collections funded through the IMLS National Leadership Grant (NLG) program and, more recently, some LSTA grants.

  • Collection registry

    • Collection-level schema developed based on DC and RSLP

  • Metadata repository

    • Harvested metadata aggregated in one location

    • Portal to the item-level records

  • Assistance for projects to develop shareable metadata.

Research aim: to investigate “federating”


  • Range and evolution of practices and interoperability issues among NLG projects

    Tension between local practices / needs and the more global potential of digital collections

  • How to best represent items and collections to meet the needs of service providers and diverse user communities


  • Role of the individual, “intentional” collection within a federation

Research approach

Content analysis of proposals and webpages of projects

Two surveys of metadata applications, 2003 and 2006

Interviews with resource developers for 45 NLG projects

Analysis of item description patterns

Analysis of collection description patterns / revisions

Analysis of subject vocabulary in collection search logs

Focus groups with resource developers

Case studies of selected projects

Usability testing

Previous reports

  • Collection definition and roles

  • Collection level description

  • OAI and metadata aggregation

  • Metadata quality

  • Search and discovery across collections

  • Metadata knowledge sharing

    See papers and presentations listed in the Three-Year Interim Report.

Focus of this report

  • Profile of federated resource

  • Metadata practices

    • Survey administered to 122 projects awarded between 1998 and 2003

      2003 – 109 respondents, 76% response rate

      2006 – 72 respondents, 72% response rate, with 26% panel mortality

      • Self-report by resource developers on closed questions about materials, metadata, and audience*

    • Supplemented by open survey questions and interview data.

* Significance criteria: chi-square p-value of .05 or below; for odds ratio measures, a finding is significant when its confidence interval does not contain 1.
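The two significance criteria in the note above can be illustrated with a minimal sketch, assuming a 2x2 table of counts (for example, scheme used vs. not used, by survey year). The counts in the usage lines are hypothetical placeholders, not survey data, and the function names are illustrative. The sketch uses a Pearson chi-square without continuity correction and a Woolf (log-based) 95% confidence interval for the odds ratio; the slides do not say which variants the authors applied.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval; z=1.96 gives about 95%."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, low, high


def is_significant(a, b, c, d, alpha=0.05):
    """Apply both criteria: p-value at or below alpha, and interval excluding 1."""
    _, p_value = chi_square_2x2(a, b, c, d)
    _, low, high = odds_ratio_ci(a, b, c, d)
    return p_value <= alpha, not (low <= 1 <= high)


# Hypothetical counts only (NOT survey data):
# rows = two survey years, columns = uses scheme / does not use scheme.
chi2, p = chi_square_2x2(30, 70, 15, 85)
ratio, low, high = odds_ratio_ci(30, 70, 15, 85)
print(f"chi2={chi2:.3f} p={p:.4f} OR={ratio:.2f} CI=({low:.2f}, {high:.2f})")
```

Because the 2003 and 2006 respondents overlap as a panel, a test designed for paired samples may be more appropriate in practice; this sketch treats the two groups as independent for simplicity.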

Range of materials represented

Among 169 collections:

  • Images (80%) - photographs/slides/negatives, posters, maps

  • Text (68%) - books, pamphlets, archival finding aids, newspapers, government documents

  • Physical Objects (29%) - museum artifacts, specimens

  • Sound (20%) - music, oral histories

  • Interactive Resources (10%) - learning objects

  • Moving Images (7%) - films, interviews, performances, video art

  • Data sets (4%) - field data, geospatial data, statistics

Item level subject strengths

Top ranked subject headings: one kind of view

  • United States

  • people

  • songs with piano

  • trees

  • archeology of the United States

  • Work Progress Administration

  • cities & towns

  • women

  • archaeology

  • buildings

  • photographers

  • mountains

  • men

  • archaeological site

  • insects

  • bodies of water

  • shrubs

  • flowers

Collection level subject strengths

Top ranked subject headings: better landscape view

  • Social Studies (80% of collections):

    • U.S. history

    • state history

    • world history

    • U.S. government

    • urban studies

    • anthropology

    • geography …

  • Arts (46% of collections):

    • visual arts

    • photography

    • popular culture

    • architecture

    • music

    • history of art ...

Changes in “intended” primary audience

  • Scholars – 88% (n=72) of collections

    increase from 84% in 2003 (n=94)

  • General public – 83%

  • Undergraduates – 82%

  • High school students – 79%

    increase from 59%

  • K-12 audience – 75%

    increase from 65%

But most have only anecdotal evidence of their user base. Some are beginning to study use.

New content

Similar increase across all types, though more pronounced for interactive resources (among institutions responding in both 2003 and 2006)

Multiple scheme use

[Figure: multiple scheme use, 2003 (n=94) and 2006 (n=59)]

Surprisingly, multiple scheme use is as common in non-collaborative as in collaborative projects.

Changes in Dublin Core and MARC use

  • MARC or Dublin Core, up from 79% to 85%

  • Dublin Core only, up from 11% to 30%

  • Dublin Core alone or in combination with other schemes, up from 50% to 58%

  • MARC only, up from 4% to 8%

  • MARC alone or in combination with other schemes, down from 29% to 27%

Changes in metadata sharing

Surveys indicated:

  • 20% (n=66) of projects conform to OAI-PMH in 2006, up from 16% in 2003 (n=94)

  • Another 26% (17 projects) plan to apply OAI-PMH, including:

    • 6 academic libraries

    • 2 each - state libraries, library consortia, academic museums, and archives

    • 1 each – botanical garden, public library, and academic department

  • Most current counts: 195 collections, 36% of them harvested.

  • Items up from 298,778 to 310,448 since January 2007.
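For readers unfamiliar with OAI-PMH, the harvesting that these counts refer to can be sketched roughly as follows. This is a minimal illustration, not the DCC project's harvester: the base URL, the sample response, and the function names are hypothetical. The `ListRecords` verb, the `metadataPrefix=oai_dc` argument, the `resumptionToken` mechanism, and the XML namespaces come from the OAI-PMH 2.0 specification.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"
NS = {"oai": OAI_NS}


def build_list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for a repository's OAI-PMH base URL."""
    if resumption_token:
        # resumptionToken is an exclusive argument: send it with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response body."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for rec in root.findall(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        dc_fields = {}
        for element in rec.iter():
            # Collect every simple Dublin Core element, keeping repeated values.
            if element.tag.startswith("{" + DC_NS + "}"):
                name = element.tag.split("}", 1)[1]
                dc_fields.setdefault(name, []).append((element.text or "").strip())
        records.append({"identifier": identifier, "dc": dc_fields})
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    return records, (token or None)


# Hypothetical response used only to exercise the parser offline.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:item-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample photograph</dc:title>
          <dc:subject>trees</dc:subject>
          <dc:subject>mountains</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>token-123</resumptionToken>
  </ListRecords>
</OAI-PMH>
"""

records, next_token = parse_list_records(SAMPLE_RESPONSE)
print(build_list_records_url("https://example.org/oai", resumption_token=next_token))
```

A real harvester would fetch each URL over HTTP and repeat the request until no resumption token is returned; that loop is omitted here so the sketch runs without network access.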

    Notable trends

    • Despite a broad range of institutions in size and scope of collections, and different “cultures of description” evident in the interview data, there were no important differences between university and non-university institutions.

    • Substantial decline in use of two or more schemes

    • Use of Dublin Core is increasing, especially for single scheme use, but its limitations are a consistent, strong theme in the interview data

    • Locally developed scheme use is steady overall, and up for single scheme use.

    • MODS application has remained minimal, but some projects are mapping or intend to map to MODS.

    • 74% are mapping or intend to map their metadata to one or more other schemes

    • Percentage of records using 8 core fields down sharply, with the largest reductions in “description” and “format”

    • Subject, description, format, and source fields most misused
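The mapping between schemes mentioned in the list above can be pictured with a small crosswalk sketch. This is an illustration under stated assumptions, not any project's actual mapping: the four-entry table is deliberately simplified, the record is hypothetical, and published MARC to Dublin Core crosswalks cover many more fields and differ between versions (for example, on whether MARC 100 maps to creator or contributor).

```python
# Simplified, illustrative MARC tag -> simple Dublin Core element table.
# Assumption: whole fields are mapped; subfields are ignored.
MARC_TO_DC = {
    "245": "title",        # title statement
    "100": "creator",      # main entry, personal name
    "650": "subject",      # topical subject heading
    "520": "description",  # summary note
}


def crosswalk(marc_fields):
    """Map (tag, value) pairs to Dublin Core; report tags the table cannot place."""
    dc_record = {}
    unmapped_tags = []
    for tag, value in marc_fields:
        element = MARC_TO_DC.get(tag)
        if element is None:
            # Keep track of what the crosswalk loses instead of dropping it silently.
            unmapped_tags.append(tag)
            continue
        dc_record.setdefault(element, []).append(value)
    return dc_record, unmapped_tags


# Hypothetical record, for illustration only.
sample_record = [
    ("245", "Songs with piano accompaniment"),
    ("100", "Doe, Jane"),
    ("650", "Songs with piano"),
    ("650", "Music"),
    ("999", "Local shelf note"),
]

dc_record, unmapped_tags = crosswalk(sample_record)
print(dc_record)
print("Unmapped MARC tags:", unmapped_tags)
```

Reporting unmapped tags makes visible the kind of information that is lost or changed when richer local records are flattened into a shared scheme.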

    Final observations

    Surveys provided important but crude benchmarks.

    Other data sources add important details, raise many questions.

    Influence of content management systems – deterring use of MODS and TEI encoding decisions - (interviews)

    Need models for description of newer, more complex and interactive objects, such as works of art / art “events” - (case study)

    Fundamental questions remain:

    What is lost or changed in the process of federating “collections” or items accumulated for a purpose -- to “explore” “demonstrate” “provide insight into” … ? - (collection descriptions)

    How to build and retain “contextual mass” (Palmer, 2004), in light of the 88% scholarly audience?


    • This research has been funded by IMLS, NLG Research and Demonstration grant LG-02-02-0281

  • We wish to acknowledge the important contributions of our team members on the DCC project, especially:

    • Tim Cole and Amy Jackson

    Questions and comments always welcome

    Carole L. Palmer [email protected]

    Oksana Zavalina [email protected]

    Megan Mustafoff [email protected]
