
Christy Allen & Amy Rudersdorf State Library of North Carolina




  1. A case study in using the Connexion Digital Import tool to streamline metadata creation in a digital state documents collection, or, . . . Christy Allen & Amy Rudersdorf State Library of North Carolina Southeastern CONTENTdm Users Group Annual Meeting, Starkville, MS July 31, 2008

  2. The Good, the Bad, and the Ugly [Graphic Removed]

  3. What is Connexion Digital Import? New-ish* feature in Connexion that allows you to: • upload a digital object to a new or existing MARC record in WorldCat, and • automatically “dump” the record (mapped to Qualified Dublin Core) and object into a hosted instance of CONTENTdm (and into the Digital Archive if you have a subscription to that, too), • using the OCLC number as the connection point. *See OCLC’s announcement here: http://www.oclc.org/news/announcements/announcement247.htm

  4. What is required? • Connexion version 2.0 or higher • Full-level authorization status or higher in OCLC • A hosted version of CONTENTdm! • An OCLC authorization that includes CONTENTdm authorization • A WorldCat record to attach digital content to

  5. Why did we use it? • State Library of N.C. is the mandated depository for state government documents in North Carolina: • Need to provide access to all state documents • Source for original cataloging of *most* state documents in MARC • Depository Library survey indicated our clients want us to continue full MARC cataloging of documents -- let’s re-use that data! • Pilot project started using already-cataloged paper docs that have electronic versions

  6. How does it work?

  7. How does it work? (cont.)

  8. How does it work? (cont.)

  9. How does it work? (cont.)

  10. How does it work? (cont.)

  11. It’s Magic!!! completely gratuitous picture of Stonehenge taken by our Cataloging Branch Head

  12. The Good… • Multiple access points: WorldCat, ILS, CONTENTdm, and Google • Reuses already-existing metadata (MARC records) • Files are automatically moved into the Digital Archive for those who subscribe to it • Fits into existing cataloging workflow • CONTENTdm support is responsive

  13. The Good . . . (continued) • CONTENTdm is ready out of the box • Built-in functionalities: JPEG2000, full-text searchability, user-friendly interface • Compound object functionality: • Easy-to-use compound object interface • Builds compound objects on-the-fly from PDF files • Crosswalking does allow special characters/diacritics to come through from WorldCat (special characters/diacritics can’t be easily added to records created through the Acquisitions Station until the fall release of CONTENTdm)

  14. OK, maybe it’s not all magic . . . MARC and QDC are not quite the same. Likewise, a Stonehenge snow globe doesn’t have quite the same effect. [Graphic of Stonehenge snow globe Removed]

  15. . . . The Bad and the Ugly: post-crosswalk editing At first you feel like this guy. . . [Graphics of sad pig balloon and girl saying “I don’t care what you say. I’m gonna be a horse when I grow up.” Removed] . . . but after a while it’s not so bad

  16. Why edit, you say? • Doesn’t the full-text document contain everything the user needs? Well . . . • The mapping between MARC and QDC is defined by OCLC and is “fixed,” so you don’t get to pick which MARC fields map into which QDC fields! • This means that you may have: • Data mapping to a field in which you don’t want it • Data you don’t want at all that maps anyway • Data you want that doesn’t map anywhere
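The fixed, non-configurable mapping described above can be pictured as a lookup table you cannot edit. A minimal sketch in Python, assuming a simplified {tag: value} MARC representation; the tag/field pairs are examples drawn from the problems on the surrounding slides, not OCLC's actual crosswalk table:

```python
# Illustrative sketch of a FIXED MARC-to-QDC crosswalk (entries are
# examples from this presentation, not OCLC's real mapping table).
FIXED_CROSSWALK = {
    "092": "dc.subject",   # call number: data you don't want that maps anyway
    "546": "dc.language",  # free-text language note: maps where you don't want it
    "856": "dc.relation",  # OCLC URL: you'd rather have it in dc.identifier
    # "041", "260 $a", "245 $c" have no entry: data you want that doesn't map
}

def crosswalk(marc_record: dict) -> dict:
    """Apply the fixed mapping; tags without an entry are silently dropped."""
    qdc = {}
    for tag, value in marc_record.items():
        target = FIXED_CROSSWALK.get(tag)
        if target:
            qdc.setdefault(target, []).append(value)
    return qdc

record = {"092": "353.9 N87", "041": "eng", "546": "Text in English."}
qdc = crosswalk(record)  # the 041 code is lost; the call number lands in dc.subject
```

All three failure modes fall out of the same property: because the table is fixed, your only recourse is post-import editing, never reconfiguration.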

  17. Data mapping to a field in which you don’t want it • Where is this a problem? • dc.subject - 099/092/096 fields and non-LCSH subject terms applied by other institutions • dc.language – we use ISO 639-2 code as controlled vocabulary, but free text note field in MARC (546) maps to dc.language! • dc.relation – OCLC URL maps to this field instead of to dc.identifier
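The first two mis-mappings above lend themselves to a scripted post-crosswalk cleanup. A hedged sketch, assuming QDC records are plain dicts; the field names follow this kind of local profile, and the ISO 639-2 lookup is a deliberately incomplete stub:

```python
# Sketch of post-crosswalk edits for two of the mis-mappings above.
ISO_639_2 = {"english": "eng", "spanish": "spa"}  # stub lookup, not complete

def fix_language(qdc: dict) -> dict:
    """Replace the free-text 546 note that landed in dc.language
    with an ISO 639-2 code, when the language can be recognized."""
    note = qdc.get("dc.language", "")
    for name, code in ISO_639_2.items():
        if name in note.lower():
            qdc["dc.language"] = code
            break
    return qdc

def fix_url(qdc: dict) -> dict:
    """Move the OCLC URL from dc.relation into dc.identifier."""
    url = qdc.pop("dc.relation", None)
    if url and url.startswith("http"):
        qdc["dc.identifier"] = url
    return qdc

rec = {"dc.language": "Text in English.", "dc.relation": "http://example.org/record"}
rec = fix_url(fix_language(rec))
```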

  18. Data you don’t want at all that maps anyway (1/2) MARC 099/092/096 fields (call & cutter numbers) map to dc.subject field in CONTENTdm

  19. Data you don’t want at all that maps anyway (2/2) • Issues: • CONTENTdm supplies a controlled vocabulary (TGM) for this field, or you can implement your own. However, the CV is difficult to apply because every record now contains a unique value (the call number) that does not exist in the controlled vocabulary! • If you DO apply a controlled vocabulary to the dc.subject field and forget to remove the classification number while editing the record, the system will not let you save the record, and you may lose all your other edits to that record.
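Screening dc.subject for call-number-like values before applying the CV avoids the lost-edits trap above. A rough sketch; the regular expression is a crude heuristic for Dewey-style numbers (as mapped from 092/096/099), not a real classification parser:

```python
import re

# Heuristic for Dewey-style call numbers: three digits, optional decimal,
# optional cutter. Crude on purpose -- a screening aid, not a parser.
CALL_NUMBER = re.compile(r"^\d{3}(\.\d+)?(\s+\S+)?$")

def clean_subjects(subjects, vocabulary):
    """Split dc.subject values into CV-valid terms and values that
    would block the save in CONTENTdm (call numbers, unknown terms)."""
    kept, rejected = [], []
    for term in subjects:
        if CALL_NUMBER.match(term) or term not in vocabulary:
            rejected.append(term)
        else:
            kept.append(term)
    return kept, rejected

cv = {"Transportation", "State government publications"}
kept, rejected = clean_subjects(["353.9 N87", "Transportation"], cv)
```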

  20. Data you want that doesn’t map to any CONTENTdm field • 041 (language codes) • 780/785 (title replaces/replaced by fields): only certain indicator/subfield combinations are crosswalked • 260 $a (place of publication) • 245 $c (statement of responsibility) So, we manually add some of this information . . .
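Since the dropped data is already sitting in the MARC record, the manual re-keying could be scripted. A sketch using a simplified {tag: {subfield: value}} representation (a real workflow would read the record with a MARC library such as pymarc; the QDC target fields here are illustrative local choices, not part of any standard mapping):

```python
# Recover fields the fixed crosswalk drops. Target field names are
# illustrative local choices, not OCLC's mapping.
UNMAPPED = {
    ("260", "a"): "Place of Publication",        # local field, our choice
    ("245", "c"): "Statement of Responsibility",  # local field, our choice
}

def recover_unmapped(marc: dict, qdc: dict) -> dict:
    """Copy unmapped MARC subfields into the post-import QDC record."""
    for (tag, subfield), target in UNMAPPED.items():
        value = marc.get(tag, {}).get(subfield)
        if value:
            qdc[target] = value
    return qdc

marc = {"260": {"a": "Raleigh, N.C.", "b": "Dept. of Transportation"},
        "245": {"a": "Annual report", "c": "N.C. Dept. of Transportation."}}
qdc = recover_unmapped(marc, {})
```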

  21. Fields that don’t exist in MARC We repeatedly input the same data directly into multiple CONTENTdm records because . . . • the data simply doesn’t exist in the MARC record, and • you can’t apply a CONTENTdm template to a record dumped directly from Connexion Examples: “Collection,” “Digital Format,” “Rights,” etc.
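The repeated re-keying amounts to applying a constant-data template after import, which is easy to sketch. Field names and values below are examples in the spirit of the slide, not actual CONTENTdm template syntax:

```python
# "Constant data" applied after import. Names/values are examples only.
TEMPLATE = {
    "Collection": "State Government Publications",
    "Digital Format": "application/pdf",
    "Rights": "See collection rights statement",
}

def apply_template(record: dict, template: dict = TEMPLATE) -> dict:
    """Fill in constant fields, but never overwrite data already present."""
    for field, value in template.items():
        record.setdefault(field, value)
    return record

rec = apply_template({"Title": "Annual report", "Rights": "Custom statement"})
```

Using setdefault means a record that already carries its own value (here, "Rights") keeps it, which is the behavior you would want from a real template feature.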

  22. Controlled vocabulary issues • We use LCSH and LC name authorities in various fields • Terms were loaded into CONTENTdm after pulling the data from our Voyager system • If the WorldCat record had authority headings that were added or changed before load, those terms aren’t in our CV • In Admin module: new controlled vocabulary terms can’t be added to the CV directly from the record (must be laboriously added before record is edited)
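Because terms must be added to the CV before a record can be edited, it helps to diff a batch's headings against the loaded vocabulary up front. A minimal sketch, assuming records are dicts with list-valued dc.subject fields:

```python
# Diff incoming WorldCat headings against the CV loaded from our ILS,
# yielding the terms that must be added to the CV before editing begins.
def terms_to_add(incoming_records, vocabulary):
    used = set()
    for rec in incoming_records:
        used.update(rec.get("dc.subject", []))
    return sorted(used - vocabulary)

cv = {"Highway planning", "Transportation"}
records = [{"dc.subject": ["Transportation", "Roads--Interstate"]}]
missing = terms_to_add(records, cv)
```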

  23. MARC record authorization problems • Our OCLC authorization = “Enhance level” • Some of “our” MARC records have been upgraded to Elvl:[blank] (i.e., we can’t edit them anymore) • CDI process replaces record, but we no longer have authorization to do so! • OCLC has recommended we create a duplicate record • We are brainstorming other alternatives with OCLC

  24. Workflow for Editing New Items • New items added through CDI appear in the live repository (not in an approval queue) • We don’t insert a collection name into these records until they are edited/approved, so that they don’t come up in a collection-specific search (the item will still come up in a repository search)

  25. Workflow for Editing New Items • Newly imported records are batch-downloaded into the Acquisitions Station, edited, and re-uploaded with the Collection name • They then become accessible through collection and repository searches

  26. A search within the Publications Collection for “Dept. of Transportation” returns 7 hits (all edited records)

  27. A search across the entire repository for the same phrase returns 12 hits (3 of the first 4 are unedited records)

  28. Other Issues • Import isn’t always successful (sometimes, the digital object isn’t “there” when you index the collection) • Unspecified time lags may occur during digital import • Large bandwidth is required for digital import to work consistently • Can’t export administrative fields auto-populated by OCLC (e.g., the OCLC number)* *Not really a CDI issue, but since we’re here . . .

  29. Potential improvements? (1/2) • Use templates (or something) to apply “constant data” to imported records • Add controlled vocabulary terms directly from metadata record while working in Admin module • Attach digital content to ALL records (including CONSER/Elvl:[blank] records) • Suppress individual records in the “live” collection until ready to make them publicly available

  30. Potential improvements? (2/2) • Let CONTENTdm talk to the WorldCat authority file for controlled vocabularies • Some kind of visible “required fields” indicator in the Admin interface (customizable on a collection basis). During creation, editing, updating process, required fields would be obvious. • Export ALL fields (both administrative and Dublin Core) from CONTENTdm

  31. The Digital Import Process: Sometimes Weird but Very Useful The Flowbee cuts and vacuums hair at the same time! [Graphic of “Flowbee” in use Removed]

  32. State Library of North Carolina Check out our collections: http://statelibrary.dcr.state.nc.us/dimp/index.html Christy Allen Christy.E.Allen@ncmail.net (soon to be Christy.Allen@ncdcr.gov) Amy Rudersdorf Amy.Rudersdorf@ncmail.net (soon to be Amy.Rudersdorf@ncdcr.gov)
