
MAINTAINING QUALITY METADATA: TOWARD EFFECTIVE DIGITAL RESOURCE LIFECYCLE MANAGEMENT



Presentation Transcript


  1. MAINTAINING QUALITY METADATA: TOWARD EFFECTIVE DIGITAL RESOURCE LIFECYCLE MANAGEMENT
     Daniel Gelaw Alemneh, University of North Texas

  2. University of North Texas (UNT) Libraries Digital Initiatives
     • Collaborative Initiatives
       • CyberCemetery
         • GPO
         • NARA – Affiliated Archive
       • Texas Register Archive
         • Secretary of State’s Office
       • Texas Laws and Resolutions Archive
         • Secretary of State’s Office
       • The Portal to Texas History
         • 45 Libraries & Museums
       • Web-at-Risk Project
         • California Digital Library
         • New York University
       • National Digital Newspaper Program (NDNP)
         • Between 1836 and 1922
     ICKM 2008

  3. University of North Texas (UNT) Libraries Digital Initiatives
     • Library Digital Collections:
       • Congressional Research Service Archive
         • 10,000+ CRS Reports
       • World War Poster Collection
         • 500 WWI and WWII Posters
       • Advisory Commission on Intergovernmental Relations
         • 408 reports = 47,874 pages
       • Federal Communications Commission (FCC) Record
         • 136 issues = 43,115 pages (6 of 21 volumes completed)
       • Electronic Theses and Dissertations (ETDs)
         • 3,000+ more in queue
       • Jean-Baptiste Lully (Music) Collection
         • 27 scores = 10,000 pages
       • Other digitization projects
         • http://www.library.unt.edu/libraries-and-collections/digital-collections

  4. [image-only slide]

  5. Metadata Environment
     • Metadata-based digital resource management activities
       • UNT Libraries metadata: locally qualified Dublin Core-based descriptive metadata
       • Detailed technical and preservation metadata elements
       • Web-based metadata creation and editing
     • Interoperability
       • Metadata crosswalks: MODS, MARC, oai_dc, PREMIS
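The crosswalk idea above can be sketched in a few lines. This is an illustrative stand-in, not UNT's actual tooling: the qualified element names and the mapping table are invented, and real crosswalks to MODS or MARC involve much more than dropping qualifiers, but collapsing locally qualified Dublin Core down to simple oai_dc is roughly this shape.

```python
# Hypothetical crosswalk sketch: collapse locally qualified Dublin Core
# (records as dicts of qualified-element -> list of values) into simple
# oai_dc by mapping each qualified element to its unqualified parent.
# Element names here are invented for illustration.

QUALIFIED_TO_DC = {
    "title.main": "title",
    "title.series": "title",
    "creator.author": "creator",
    "date.created": "date",
    "description.content": "description",
}

def crosswalk_to_oai_dc(record):
    """Map a qualified-DC record to unqualified Dublin Core elements."""
    out = {}
    for qualified, values in record.items():
        element = QUALIFIED_TO_DC.get(qualified)
        if element is None:
            continue  # no oai_dc equivalent; element is dropped
        out.setdefault(element, []).extend(values)
    return out

record = {
    "title.main": ["Annual Report"],
    "creator.author": ["Alemneh, Daniel Gelaw"],
    "date.created": ["2008"],
}
print(crosswalk_to_oai_dc(record))
```

Elements with no target in the mapping table are silently dropped, which is the usual lossy behavior when crosswalking down to a simpler scheme.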

  6. Metadata Quality
     • The two aspects of digital library data quality:
       • The quality of the data in the objects themselves
       • The quality of the metadata associated with the objects
     • Consequences of poor metadata quality:
       • Ambiguities
       • Poor recall
       • Poor precision
       • Inconsistent search results

  7. Metadata Quality …
     • Most common errors:
       • Incorrect data
         • Letter transposition
         • Letter omission
         • Letter insertion
         • Letter substitution or mis-strokes
       • Missing data
         • Elements and values not present at all (null)
         • Insufficient or incomplete data
       • Ambiguous data
         • Confusing or inconsistent data, e.g. multiple spellings, multiple possible meanings, mixed cases, initials
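The "ambiguous data" category above (multiple spellings, mixed case, stray whitespace) is the kind of error a simple normalization pass can surface. A minimal sketch, not part of the UNT toolset: group raw values by a normalized key and flag any key that collects more than one raw spelling.

```python
# Illustrative sketch: flag element values that are probably the same
# term entered inconsistently (mixed case, extra whitespace, multiple
# spellings) by grouping raw values on a normalized key.
from collections import defaultdict

def find_inconsistent_values(values):
    groups = defaultdict(set)
    for v in values:
        # case-fold and collapse internal whitespace to build the key
        key = " ".join(v.lower().split())
        groups[key].add(v)
    # any normalized key with more than one raw spelling is suspect
    return {k: sorted(vs) for k, vs in groups.items() if len(vs) > 1}

subjects = ["World War I", "world war I", "World  War I", "Posters"]
print(find_inconsistent_values(subjects))
```

A reviewer would then decide which spelling is authoritative; the point is that the machine narrows the list humans must look at, echoing the human-review theme in the summary slide.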

  8. Factors Influencing Metadata Quality
     • Local requirements:
       • Object heterogeneity
         • What types of objects will the repository contain?
       • Granularity
         • How will they be described?
       • Functionality
         • What functionality is required?
         • How will it be interfaced?

  9. Factors Influencing Metadata Quality …
     • Collaborative requirements:
       • Diversity of users
         • How can diverse information-seeking behaviors best be met?
       • Interoperability
         • Will metadata be meaningful within aggregations of various kinds?
         • What is required for interoperability? (Structure, semantics, and syntax)
       • Digital rights issues
         • Will access restrictions be imposed?
         • Are requirements formal or informal?
         • Are there other access and associated digital rights issues?

  10. Factors Influencing Metadata Quality …
     • Training issues
       • Necessary expertise to create and manage rigorous metadata
       • Metadata quality is determined to a great extent by:
         • knowledge of the source, and
         • knowledge of the methodology used to create the statement
     • Cost
       • Rigorous metadata is resource-intensive and costly

  11. UNT Metadata Quality Assurance Mechanisms & Tools
     • The two main stages of metadata quality assurance:
       • Pre-ingest: 1. Metadata creation tools (templates)
       • Post-ingest: 2. Metadata analysis tools (web-based tools)

  12. Quality Assurance Mechanisms and Tools: Templates
     • Metadata creation tools (templates)
       • Validate mandatory elements
       • Metadata Template Creator
       • Template Reader
       • Controlled vocabularies (UNTLBS)
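The pre-ingest checks above amount to two rules: every mandatory element must have a value, and certain values must come from a controlled vocabulary. A minimal sketch of that logic, with element names and a stand-in vocabulary invented for illustration (they are not the actual UNTL scheme):

```python
# Hypothetical pre-ingest validation sketch: check mandatory elements
# and membership in a controlled vocabulary. The element names and the
# vocabulary below are illustrative, not UNT's actual template rules.

MANDATORY = ["title", "resourceType", "language"]  # assumed required set
RESOURCE_TYPES = {"text", "image", "sound", "physical-object"}  # stand-in vocabulary

def validate_record(record):
    """Return a list of error messages; an empty list means the record passes."""
    errors = []
    for element in MANDATORY:
        if not record.get(element):  # element missing or value empty
            errors.append(f"missing mandatory element: {element}")
    rtype = record.get("resourceType")
    if rtype and rtype not in RESOURCE_TYPES:
        errors.append(f"resourceType {rtype!r} not in controlled vocabulary")
    return errors

print(validate_record({"title": "WWI Poster", "resourceType": "poster"}))
```

In a template-driven editor, failing records would be rejected or flagged before ingest, which is much cheaper than repairing them after they reach the repository.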

  13–25. [image-only slides]

  26. UNT Metadata Quality Assurance Mechanisms & Tools …
     • 2. Metadata analysis tools
       • NULL values
       • List/browse all values (by each element and qualifier)
       • List authority values
       • Graphical reports and other fun stuff
         • Clickable maps by institution and collection
         • Word clouds by element
         • Records added over time and other graphical reports
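The first two analysis functions above (NULL-value reports and browsing all values of an element) reduce to simple scans over the record set. A sketch in that spirit; the real UNT tools are web-based, and the record shape here is an assumption for illustration.

```python
# Illustrative post-ingest analysis sketch (a stand-in for the web-based
# tools): report records with NULL/empty values for an element, and
# browse all distinct values of an element with occurrence counts.
from collections import Counter

def null_report(records, element):
    """Return IDs of records where the element is missing or empty."""
    return [rid for rid, rec in records.items() if not rec.get(element)]

def browse_values(records, element):
    """Return every distinct value of an element with its count."""
    counts = Counter()
    for rec in records.values():
        for value in rec.get(element, []) or []:
            counts[value] += 1
    return counts

records = {
    "meta-0001": {"subject": ["World War, 1914-1918"]},
    "meta-0002": {"subject": []},
    "meta-0003": {"subject": ["World War, 1914-1918", "Posters"]},
}
print(null_report(records, "subject"))   # records lacking a subject
print(browse_values(records, "subject")) # distinct subjects with counts
```

Browsing all values side by side is what makes near-duplicate spellings and stray authority terms visible to a human reviewer, and the counts feed naturally into word clouds and the other graphical reports.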

  27–39. [image-only slides]

  40. Summary
     • Determine the level of quality required
       • Partners may have much in common, but they have diverse and sometimes conflicting metadata requirements.
     • Determine the nature of the gap and how to close it
       • Effectiveness, efficiency, practicability, scalability
     • Machine versus human error handling
       • How much of the process can be automated?
       • Human review of results is still essential (e.g. highlighted items)
     • Compromise
       • One size does not fit all!
     • Prioritize
       • Resources are very unlikely to be available to meet all requirements
     • Test the workflow
       • Test, retest, and evaluate the quality cycle continuously

  41. [image-only slide]

  42. Questions? Daniel.alemneh@unt.edu
     Thank You!
