Metadata Tools for JISC Digitisation Projects of still images and text - PowerPoint PPT Presentation

Presentation Transcript

  1. Metadata Tools for JISC Digitisation Projects of still images and text. Ed Fay, BOPCRIS, Hartley Library, University of Southampton

  2. Overview: BOPCRIS today
     • Move to work natively with standards
       • Interoperability
       • Preservation
     • Design project procedures from the ground up with metadata in mind
       • File-naming and directory structuring
       • Metadata capture processes
     • Production workflow that automates where possible
       • Minimize possibility for human error / subjectivity
     • “Final package” of digital object that records preservation information on the “digital shelf” and aims for maximum interoperability between systems, all in one place

  3. Overview: technical details
     • File-naming / directory structure
       • Incorporating project-specific “unique ids”
     • Final package (digital object)
       • Internally consistent “tarball” [*.TAR]
       • Relative path-naming conventions
       • METS wrapper
       • Extension formats for metadata: descriptive (MODS); technical (MIX); process (PREMIS)
     • Production workflow
       • Automated production of final package
     • Metadata recording
       • Dynamic input by scanner operators
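The "internally consistent tarball" with relative path names on slide 3 could be produced along the following lines. This is a minimal sketch, not the BOPCRIS tool itself: the function name `build_package` and the directory layout in the usage note are assumptions made for illustration, and only Python's standard `tarfile` and `pathlib` modules are used.

```python
import tarfile
from pathlib import Path


def build_package(source_dir, tar_path):
    """Write every file under source_dir into an uncompressed TAR.

    Members are stored under paths relative to source_dir, so references
    inside the package stay valid wherever it is unpacked.
    (Hypothetical helper; the presentation does not show the real tool.)
    """
    root = Path(source_dir)
    target = Path(tar_path).resolve()
    members = []
    with tarfile.open(tar_path, "w") as tar:
        for path in sorted(root.rglob("*")):
            # Skip directories and the archive itself if it lives under root.
            if not path.is_file() or path.resolve() == target:
                continue
            arcname = path.relative_to(root).as_posix()
            tar.add(path, arcname=arcname)
            members.append(arcname)
    return members
```

Calling `build_package("object_0001", "object_0001.tar")` on a directory holding a METS file and an `images/` folder would store members such as `mets.xml` and `images/0001.tif`, with no absolute paths recorded.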

  4. History
     • Eighteenth Century Parliamentary Papers
       • Project under Phase 1 of JISC Digitization Programme
       • Proprietary system and data formats (Agora)
       • Manual input of metadata
         • Descriptive and Structural
     • Advantages and Disadvantages

  5. History: Advantages
     • Proprietary system with advanced functionality:
       • OCR workflow
       • Web presentation
     • Highly customizable
       • Metadata fields specified and modified at will

  6. History: Disadvantages
     • Non-standard metadata fields
       • No mapping to standard formats
       • → difficulties: interoperability; metadata harvesting
     • Translation
       • Between systems, or between “use” and “archive” formats
       • → introduces possibility of versioning issues
     • No scope for preservation metadata
       • Separation between workflow / presentation system and preservation strategy
     • Resulted in disparate collection of scripts and tools to manage data

  7. Present: Metadata Standards
     • Bibliographic database export
     • File-system level
       • Directory structure
       • File-naming conventions
     • Scanning level
       • TIFF headers
       • Additional descriptive metadata
     • METS profile
       • Tailored to project needs
       • Extension formats (MODS, MIX, PREMIS)
       • Checksums (MD5)
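The MD5 checksums listed on slide 7 can be computed with Python's standard `hashlib` module. The sketch below is illustrative only: the helper names `md5_checksum` and `checksum_manifest` and the idea of keying the results by relative path are assumptions, not something the presentation specifies.

```python
import hashlib
from pathlib import Path


def md5_checksum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_manifest(root):
    """Map each file's path (relative to root) to its MD5 digest.

    Hypothetical helper: the relative keys mirror the relative
    path-naming convention used for the final package.
    """
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): md5_checksum(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

Digests produced this way could be recorded against each file in the METS document and recomputed later to detect corruption on the "digital shelf".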

  8. Present: Metadata Origins (diagram)
     • Precursors
       • File-naming
       • Directory structure
       • Bibliographic metadata (MARC21 / MODS / etc.)
     • Generated
       • Scanned images: TIFF headers; MIX (Z39.87)
       • Other metadata: process; additional descriptive; PREMIS; custom dmdSec
       • OCR (Agora / ABBYY)
     • METS
       • File formats: TIFF master / derived JPEG; flat text (TXT) & word-co-ordinated OCR (TAR)

  9. Future
     • One tool for entire process, from scanned images to METS
     • Tool would:
       • Extract technical metadata
       • Include descriptive metadata
       • Build flat-structure METS
     • Tool would require:
       • File-naming, directory-structuring conventions
       • Image file sources
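As a rough illustration of the "build flat-structure METS" step on slide 9, the sketch below assembles a file section and a one-level structural map with Python's standard `xml.etree.ElementTree`. It is an assumption-laden outline: the function name, the `FILE0001`-style identifiers, and the `USE` and `TYPE` values are invented for the example, no descriptive (MODS), technical (MIX) or process (PREMIS) sections are included, and the output has not been validated against the METS schema or any project profile.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def _tag(name):
    """Qualify an element name with the METS namespace."""
    return f"{{{METS_NS}}}{name}"


def build_flat_mets(files):
    """Build a minimal METS document as a string.

    files: list of (relative_path, mime_type, md5_hex) tuples in page order.
    Identifiers and USE/TYPE values are hypothetical choices for this sketch.
    """
    mets = ET.Element(_tag("mets"))
    file_sec = ET.SubElement(mets, _tag("fileSec"))
    file_grp = ET.SubElement(file_sec, _tag("fileGrp"), {"USE": "master"})
    struct_map = ET.SubElement(mets, _tag("structMap"), {"TYPE": "physical"})
    item = ET.SubElement(struct_map, _tag("div"), {"TYPE": "item"})

    for order, (href, mime, md5_hex) in enumerate(files, start=1):
        file_id = f"FILE{order:04d}"
        file_el = ET.SubElement(file_grp, _tag("file"), {
            "ID": file_id,
            "MIMETYPE": mime,
            "CHECKSUM": md5_hex,
            "CHECKSUMTYPE": "MD5",
        })
        # Relative href keeps the package internally consistent.
        ET.SubElement(file_el, _tag("FLocat"), {
            "LOCTYPE": "URL",
            f"{{{XLINK_NS}}}href": href,
        })
        # Flat structure: one page-level div per file, no deeper nesting.
        page = ET.SubElement(item, _tag("div"), {"TYPE": "page", "ORDER": str(order)})
        ET.SubElement(page, _tag("fptr"), {"FILEID": file_id})

    return ET.tostring(mets, encoding="unicode")
```

The file list passed in would come from the file-naming and directory-structuring conventions that the slide names as the tool's requirements.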

  10. Future: Advantages
      • Abstraction = standardization
        • All digitization projects will produce metadata in similar formats → interoperability
        • Certain technical base-standards will be present → preservation
      • Any centrally developed preservation or presentation systems would be able to ingest output from any project
      • Saves wasted effort developing similar solutions many times, when one solution can be developed once and adapted

  11. Future: Questions…
      • Usefulness of such a tool?
      • Relevance to your project?
      • Problems / obstacles?
      • How much flexibility is necessary?
        • Manual input / editing?
      • Main points:
        • Abstraction, functionality, flexibility

  12. Further information
      • Ed Fay, Software Developer
      • BOPCRIS, Hartley Library
      • University of Southampton
      • ef1@soton.ac.uk
      • 023 8059 3575