Metadata interoperability for everyone – XML tools for catalogers

Terry Reese

Digital Production Unit Head

Oregon State University

Finding our way
  • Metadata Interoperability
    • Crosswalk systems
    • Common problems
  • Metadata tools
    • Scripting Solutions
    • MarcEdit
  • MarcEdit and MODS
    • Metadata transformations
    • MODS editing
    • Automatic MODS harvesting
  • Conclusion
Why metadata interoperability?
  • Today, we have literally hundreds of different metadata schemas. In the library, we have a wide variety as well.
    • MARC (and all its flavors)
    • FGDC
    • Dublin Core
    • EAD
    • METS
    • MODS
    • ONIX
    • OAI
    • TEI
    • FRBR
    • GILS
    • etc.
If you describe it…
  • Metadata schemas are created by communities to meet the special descriptive needs of those communities.
  • One danger is that competing standards within a group produce multiple incompatible schemas, or that variations of a single schema emerge within one community.
If you describe it…


<subject source="lcsh" encodinganalog="650">College students--Iowa--Mount Vernon.</subject>

<subject encodinganalog="650" source="lcsh">Student activities--Iowa--Mount Vernon.</subject>



<subject source="lcsh">

<controlaccess encodinganalog="650a">College students</controlaccess>

<controlaccess encodinganalog="650z">Iowa</controlaccess>

<controlaccess encodinganalog="650z">Mount Vernon.</controlaccess>

</subject>

If you describe it…

Some specialized examples:

  • MARC (MAchine-Readable Cataloging)
  • EAD (Encoded Archival Description)
    • (MARC representation:
  • Dublin Core
  • FGDC
If you describe it…

Why would communities develop shared metadata schemas?

  • Shared schemas provide a structured method for sharing data within a community.
    • Example: MARC…its development paved the way for the current cooperative cataloging model and tools like:
      • OCLC
      • RLIN
      • Z39.50
  • But shared best practices?
Why use crosswalks?


  • Crosswalks are developed by examining the similarities and differences between schemas.
  • They are one of the primary mechanisms that allow different systems to interoperate with each other.
  • They break down data transfer barriers, allowing different systems to share data.
Why use crosswalks?
  • To combine metadata catalogs

e.g. Union catalogs

  • To provide cross searchability between unlike datasets

e.g. Federated search tools

  • To perform data/metadata maintenance

e.g. Updating metadata formats – moving away from obsolete standards.

  • To repurpose data from one schema to another.
Why use crosswalks?
  • Cost
    • Metadata creation costs can be prohibitive
      • Indiana University reported in 2003 that roughly one third of its total digitization cost was attributable to metadata creation. [4] This covered only the initial metadata creation and did not include estimates for ongoing metadata maintenance.
      • However, this isn’t just a digitization issue – it’s also an issue for traditional catalog workflows (books, serials, etc.):
        • Rough OSU cost estimates (including OCLC charges):
          • Books (copy cataloging): $3 /book
          • Books (original): $27 /book
          • Thesis (subject/classification): $20 /thesis
Crosswalking challenges
  • Schema granularity
    • One-to-many and many-to-one matches
    • Crosswalking from schemas with different granularity levels
      • Trying to map anything from unqualified Dublin Core.
    • Handling object relationships or hierarchies.
      • EAD=>MARC

Crosswalking challenges

  • Dealing with spare parts
    • Since data crosswalking is rarely a one-to-one mapping, the process nearly always leaves some data that cannot be mapped.
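
A minimal sketch of the "spare parts" problem, assuming a toy record stored as a dictionary of tag to values. The mapping table, the sample record, and the function name `crosswalk` are hypothetical and for illustration only; they are not taken from any real crosswalk. It shows a many-to-one match (two source tags feeding one target element) and a field left over because it has no target.

```python
# Hypothetical, heavily simplified mapping from MARC-like tags to
# unqualified Dublin Core elements. Not a real crosswalk table.
CROSSWALK = {
    "100": "creator",  # main entry
    "700": "creator",  # added entry: many-to-one with 100
    "245": "title",
    "650": "subject",
}


def crosswalk(record):
    """Return (mapped, spare_parts) for a dict of tag -> list of values."""
    mapped, spare_parts = {}, {}
    for tag, values in record.items():
        target = CROSSWALK.get(tag)
        if target is None:
            # No equivalent in the target schema: keep it as a spare part.
            spare_parts.setdefault(tag, []).extend(values)
        else:
            mapped.setdefault(target, []).extend(values)
    return mapped, spare_parts


# Made-up sample record.
record = {
    "100": ["Author, First"],
    "245": ["An example title"],
    "650": ["Metadata", "Cataloging"],
    "700": ["Author, Second"],
    "856": ["http://example.org/item/1"],
}

mapped, spare_parts = crosswalk(record)
print(mapped)
print(spare_parts)
```

Returning the leftovers instead of silently dropping them lets a cataloger review what the crosswalk could not carry across.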
Common Crosswalking System Designs
  • Type-broker model (Ockerbloom)
    • Facilitates crosswalking – allows users to query known systems
    • Provides analysis and facilitates unknown crosswalking systems:
      • Determines crosswalk path
      • Negotiates system nodes
      • Negotiates without requiring a control data layer, but allows clients to specify a control data layer that must be used in the conversion process.
Common Crosswalking System Designs
  • Dumb-down crosswalking model
    • Converting data to its lowest common denominator.
      • Example: OAI’s initial use of Dublin Core as a transfer format.
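
As a rough illustration of dumbing down, the sketch below collapses qualified elements to their unqualified base element. The dotted notation (`date.created`), the sample record, and the function name `dumb_down` are assumptions made for this example, not part of any specific tool.

```python
def dumb_down(qualified):
    """Collapse qualified elements such as 'date.created' to 'date'."""
    simple = {}
    for element, values in qualified.items():
        base = element.split(".", 1)[0]  # drop everything after the first dot
        simple.setdefault(base, []).extend(values)
    return simple


# Made-up sample record using an assumed dotted-qualifier notation.
qualified_record = {
    "title": ["Campus life"],
    "title.alternative": ["Student activities"],
    "date.created": ["1925"],
    "date.issued": ["2005"],
    "coverage.spatial": ["Mount Vernon, Iowa"],
}

simple_record = dumb_down(qualified_record)
print(simple_record)
```

After the conversion the two dates sit side by side under `date`, and nothing records which was the creation date. That loss of distinction is the cost of a lowest-common-denominator format.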
Metadata Tools
  • Perl-based:
  • Non-Perl-based:
    • MarcEdit – includes XML API and crosswalks for a number of common metadata schemas.
    • LC’s MARC tools:
  • MarcEdit 5.0
    • System requirements:
      • Using the .NET Framework
        • Windows 98, ME, NT, 2000, XP, 2003
        • .NET 1.1 Framework
        • MDAC 2.7 runtimes
      • Using the MONO Framework (hopefully available after August 2005)
        • Windows 2000+, Linux and Mac OS X
        • MONO system requirements
MarcEdit: crosswalking design
  • Utilizes a modified version of Ockerbloom’s type-broker system.
  • Unlike Ockerbloom’s system, which brokers transformations between known schemas, MarcEdit utilizes MARCXML as a control schema to facilitate translation.
MarcEdit: crosswalking design
  • Ockerbloom model: the broker system keeps performing translations until the desired format is reached. Example: MODS => Dublin Core => MARCXML => MARC
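
One way to picture a broker chaining translations is a shortest-path search over a graph of known crosswalks. The sketch below is only an illustration of that idea, not Ockerbloom's implementation; the graph contents and the function name `find_path` are hypothetical and simply mirror the example chain.

```python
from collections import deque

# Hypothetical registry: each schema lists the schemas it can be
# translated into directly.
KNOWN_CROSSWALKS = {
    "MODS": ["Dublin Core"],
    "Dublin Core": ["MARCXML"],
    "MARCXML": ["MARC"],
    "MARC": [],
}


def find_path(source, target, graph=KNOWN_CROSSWALKS):
    """Breadth-first search for the shortest chain of translations."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for next_schema in graph.get(path[-1], []):
            if next_schema not in seen:
                seen.add(next_schema)
                queue.append(path + [next_schema])
    return None  # no chain of known crosswalks reaches the target


print(find_path("MODS", "MARC"))
```

Every hop in the returned path is a separate translation, so a long chain gives data more chances to be lost or distorted along the way.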
Broker System model


[Diagram: type broker]

MarcEdit: crosswalking design
  • MarcEdit model:
    • So long as a schema has been mapped to MARCXML, any metadata combination can be used. This means that no more than two transformations will ever take place. Example: MODS => MARCXML => EAD
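
The control-schema design can be sketched as a hub with one mapping into it and one mapping out of it per schema. This is a toy model under stated assumptions: records are plain dictionaries, the field names are invented, and `TO_HUB`, `FROM_HUB`, and `convert` are hypothetical names rather than MarcEdit's API (real crosswalks would more likely be XSLT stylesheets).

```python
HUB = "MARCXML"

# Hypothetical mappings into the hub schema.
TO_HUB = {
    "MODS": lambda rec: {"245a": rec["title"], "650a": rec["subject"]},
}

# Hypothetical mappings out of the hub schema.
FROM_HUB = {
    "EAD": lambda hub: {"unittitle": hub["245a"], "controlaccess": hub["650a"]},
    "Dublin Core": lambda hub: {"title": hub["245a"], "subject": hub["650a"]},
}


def convert(record, source, target):
    """Convert via the hub; at most two transformations are ever applied."""
    steps = []
    if source != HUB:
        record = TO_HUB[source](record)
        steps.append(f"{source} => {HUB}")
    if target != HUB:
        record = FROM_HUB[target](record)
        steps.append(f"{HUB} => {target}")
    return record, steps


result, steps = convert(
    {"title": "Campus life", "subject": "College students"}, "MODS", "EAD"
)
print(steps)
print(result)
```

Adding a new schema only requires registering its mapping to and from the hub; it then works with every schema already registered.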
MarcEdit: crosswalking design
  • MarcEdit Crosswalk model
    • Pro
      • Crosswalks need not be directly related to each other
      • Requires the crosswalk author to have specific knowledge of only one schema
    • Con
      • Each known crosswalk must be mapped to MARCXML.
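
The trade-off can be made concrete by counting mappings. The sketch below assumes every conversion is one-directional and that the hub schema counts as one of the schemas; the function names are hypothetical.

```python
def direct_crosswalks(n_schemas):
    """One-directional mappings needed to connect every pair directly."""
    return n_schemas * (n_schemas - 1)


def hub_crosswalks(n_schemas):
    """Mappings needed when every non-hub schema maps to and from the hub."""
    return 2 * (n_schemas - 1)


for n in (3, 6, 12):
    print(n, direct_crosswalks(n), hub_crosswalks(n))
```

The pairwise count grows quadratically while the hub count grows linearly, which is why a control schema becomes more attractive as more schemas are supported.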
MarcEdit: Crosswalks for everyone
  • Example Crosswalks:
    • MODS => MARC
    • MODS => FGDC
    • MODS => Dublin Core
    • EAD => MODS
    • EAD=>HTML
MarcEdit: Crosswalks for everyone
  • What’s MarcEdit doing?
    • Facilitates the crosswalk by:
      • Performing character translations (MARC8-UTF8)
      • Facilitating interaction between binary and XML formats.
MarcEdit: Simplify Editing MODS records
  • New to MarcEdit 5.0 is the ability to edit MODS records in the MarcEditor as if they were regular MARC records.
    • Allows catalogers unfamiliar with MODS to work with MODS data in a familiar form.
    • Will automatically translate new fields into MODS equivalents.
    • Will only translate field data that has a MODS equivalent.
MarcEdit: Simplify Editing MODS records
  • How it works:
    • MODS file is translated to MARCXML
    • MARCXML is translated to MarcEdit Mnemonic format.
    • Internally, the MarcEditor tracks format and changes.
    • On save, the mnemonic file is translated back into MODS, with edited and added fields translated to their appropriate MODS mappings.
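
The second step above (MARCXML to mnemonic text) can be approximated in a few lines. This is a sketch of the general idea using Python's standard library, not MarcEdit's own code; the sample record is made up, the function names are hypothetical, and the output only approximates the mnemonic layout (tag, two spaces, indicators, `$`-prefixed subfields, with a backslash standing in for a blank indicator).

```python
import xml.etree.ElementTree as ET

MARC_NS = "{http://www.loc.gov/MARC21/slim}"

# Made-up MARCXML record for illustration.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">12345</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Metadata interoperability :</subfield>
    <subfield code="b">XML tools for catalogers</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">College students</subfield>
    <subfield code="z">Iowa</subfield>
  </datafield>
</record>"""


def _indicator(value):
    # A blank indicator is written as a backslash in the mnemonic text.
    return "\\" if value in (None, "", " ") else value


def to_mnemonic(marcxml):
    """Render control fields and data fields as mnemonic-style lines."""
    lines = []
    for field in ET.fromstring(marcxml):
        tag = field.get("tag")
        if field.tag == MARC_NS + "controlfield":
            lines.append(f"={tag}  {field.text or ''}")
        elif field.tag == MARC_NS + "datafield":
            indicators = _indicator(field.get("ind1")) + _indicator(field.get("ind2"))
            subfields = "".join(
                f"${sf.get('code')}{sf.text or ''}" for sf in field
            )
            lines.append(f"={tag}  {indicators}{subfields}")
    return lines


mnemonic_lines = to_mnemonic(SAMPLE)
print("\n".join(mnemonic_lines))
```

A flat, line-per-field text like this is what makes the record editable in a text editor; saving would run the same mapping in reverse.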
MarcEdit: Making OAI Simple
  • New to MarcEdit 5.0 is a Metadata Harvester.
    • From within the MarcEditor, users can harvest DC, oai_marc or MODS records directly into MARC.
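
A harvest of this kind boils down to an OAI-PMH `ListRecords` request followed by a crosswalk of each returned record. The sketch below builds the request URL and maps a canned `oai_dc` response to MARC-like fields without touching the network. The base URL, the sample response, the function names, and the tag choices (245 for title, 653 for subject) are illustrative assumptions, not MarcEdit's harvester.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up response body, shaped like an OAI-PMH ListRecords reply.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Campus life</dc:title>
          <dc:subject>College students</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for the given repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def dc_to_marc_like(response_xml):
    """Map each harvested Dublin Core record to (tag, value) pairs."""
    records = []
    root = ET.fromstring(response_xml)
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        dc = rec.find("oai:metadata/oai_dc:dc", NS)
        if dc is None:  # e.g. a deleted record with a header only
            continue
        fields = [("245", t.text) for t in dc.findall("dc:title", NS)]
        fields += [("653", s.text) for s in dc.findall("dc:subject", NS)]
        records.append(fields)
    return records


print(list_records_url("http://example.org/oai"))
print(dc_to_marc_like(SAMPLE_RESPONSE))
```

A real harvester would also fetch the URL, follow `resumptionToken` values across pages, and handle protocol errors; those steps are left out here.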
References
  • Ockerbloom, John. Mediating among diverse data formats. School of Computer Science, Carnegie Mellon University. CMU-CS-98-102. January 1998.
  • Digitization Costs & Funding. Digital Library Workshop. Oct. 2003.