IAC Digital Preservation Committee

10 April 2007

Yale University Library

IAC Digital Preservation Committee
  • Outline
    • Charge & members
    • Accomplishments
      • Policy
      • Best practices
    • What’s next

IAC Digital Preservation Committee

The DPC is an Integrated Access Council committee charged to:

  • Develop a digital preservation program by evaluating, compiling, documenting and articulating policies, procedures, best practices and systems in order to establish a digital preservation infrastructure at Yale University Library.
  • Work from a base of clearly articulated policies, then focus on preservation program planning and, finally, make recommendations for program implementation through digital preservation projects, initiatives, and system development.

IAC Digital Preservation Committee
  • Members:
    • Rebekah Irwin, BRBL
    • David Gewirtz, ILTS/AM&T
    • Kevin Glick, MSS/A
    • Audrey Novak, ILTS (Co-Chair)
    • Bobbie Pilette, Preservation (Co-Chair)
    • E.C. Schroeder, BRBL
    • Former members:
      • Ann Green, ILTS/ITS, Co-Chair
      • Nicole Bouche, Beinecke Library
      • Gretchen Gano, Social Science Library

IAC Digital Preservation Committee

  • Published a Digital Preservation policy that establishes a mission statement and promulgates preservation policies for institutional standards governing the quality, type and source of digital assets to be archived in the repository (revised Feb 2007).
  • Published best practices addressing: Local practice for implementing PREMIS; Preservation Strategies; Persistent Identifiers; Fixity (checksums, message digests and digital signatures); Format Registries; Encoding & Transmission of Structured Metadata; and Care and Handling of Originals.
  • Modeled an organizational structure for the ongoing coordination and management of digital preservation. This structure recognizes that the responsibility for the creation and administration of digital preservation services at Yale is shared by three services: Metadata, Repository and Preservation.

Digital Preservation Best Practices

Digital preservation does not have established and vetted standards. Issues and problems associated with preserving digital resources are numerous, complex and dynamic. DPC best practices are an effort to parse the larger digital preservation problem space into discrete issues and to identify processes, activities and/or methodologies that are emerging as standards. This work by the DPC is by no means finished. More work is required to establish additional best practices for the myriad of related topics and to keep these recommendations current with the latest thinking and research in this field. Note, too, that although informed by research, most of these best practices are untested in production preservation archives.

Best Practice – Care & Handling of Physical Collections

“White paper” to advise Library staff on how to protect originals during digital conversion. Available on the web site for easy access.

  • Sections include:
    • Assessment of Physical Collections
    • Criteria for Selecting Proper Scanning Equipment
    • Preparing the Scanning Surface
    • Specifications for Scanning
    • Handling Procedures for Library Materials

Care & Handling of Physical Collections, continued
  • Assessment of Physical Collections
    • Important to include Preservation Department; contact Tara Kennedy, Field Service Librarian
    • List of questions to ask before scanning an object
  • Criteria for Selecting Proper Scanning Equipment
    • Describes available equipment and appropriate use
    • Indicates which materials can be scanned safely on each type of equipment
  • Preparing the Scanning Surface
    • How to clean the scanning surface (flatbed)

Care & Handling of Physical Collections, continued
  • Specifications for Scanning
    • Illumination levels and types
    • Proper supports for bound materials
    • Environmental considerations (dust, temperature, relative humidity)
  • Handling Procedures for Library Materials
    • Mostly “common sense” reminders, but also specific suggestions, e.g. oversized materials
    • Includes paper-based, multimedia (sound, film, historical, optical), objects

Best Practice - Fixity
  • Fixity, in preservation terms, means that the digital object has not been changed between two points in time or events.
  • Fixity checks such as checksums, message digests and digital signatures are used to verify a digital object’s fixity.
  • Information created by these fixity checks provides evidence for the integrity and authenticity of digital objects and is essential to enabling trust.

Fixity, continued
  • Fixity checks are all used in the same basic way. A value is initially generated and saved. Then, in response to an event (e.g., ingest) or over time, it is recomputed and compared to the original to ensure the object (file or bitstream) has not changed.
  • Not all fixity checks are the same.
    • Checksums are the simplest and least reliable method. They are typically used in error-detection to find accidental problems in transmission and storage. They do not account for such changes as the re-ordering of bytes or changes that cancel one another out.
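
The weakness described above can be illustrated with a small sketch. It assumes a deliberately simple additive checksum (the sum of all byte values modulo 256); the function name `additive_checksum` and the sample byte strings are hypothetical, chosen only for illustration, and real checksum algorithms vary in strength.

```python
import hashlib


def additive_checksum(data: bytes) -> int:
    """Sum of all byte values modulo 256 (an intentionally simple checksum)."""
    return sum(data) % 256


original = b"preserve this"
# Same bytes as the original, but in a different order.
reordered = b"this preserve"
# First byte raised by 1 and second byte lowered by 1: the changes cancel out.
cancelled = bytes([original[0] + 1, original[1] - 1]) + original[2:]

# The additive checksum cannot tell these objects apart from the original.
same_sum_reordered = additive_checksum(original) == additive_checksum(reordered)
same_sum_cancelled = additive_checksum(original) == additive_checksum(cancelled)

# A message digest (SHA-1 here) is compared on the same three objects.
same_digest_reordered = hashlib.sha1(original).digest() == hashlib.sha1(reordered).digest()
same_digest_cancelled = hashlib.sha1(original).digest() == hashlib.sha1(cancelled).digest()

print("checksum misses re-ordering:", same_sum_reordered)
print("checksum misses cancelling changes:", same_sum_cancelled)
print("digest misses re-ordering:", same_digest_reordered)
print("digest misses cancelling changes:", same_digest_cancelled)
```

In this sketch the additive checksum reports all three objects as identical, while the message digest distinguishes them.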

Fixity, continued
  • Message digests are more secure. They are computed by applying a more complex algorithm to a file of any length to produce a short, uniform-length character string that is, for practical purposes, unique to that file. Change one pixel or one note in the file and the message digest will be completely different. (Ex: 93326bff6636655dcd6abff18ed2de997).
  • Digital signatures combine message digests with encryption. The message digest is created and then encrypted with the private key of a private/public key pair; anyone holding the matching public key can verify it.
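
As a rough illustration of the two message digest properties described above (uniform length regardless of input size, and a completely different value after a small change), the following sketch uses Python's standard `hashlib` module. The helper name `md5_hex` and the sample inputs are assumptions made for this example only.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 message digest of data as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


short_input = b"a"                   # a one-byte object
long_input = b"a" * 1_000_000        # a one-megabyte object
changed = b"a" * 999_999 + b"b"      # the same object with its last byte altered

digest_short = md5_hex(short_input)
digest_long = md5_hex(long_input)
digest_changed = md5_hex(changed)

# Both digests have the same length even though the inputs differ greatly in size.
print(len(digest_short), len(digest_long))

# Count the hexadecimal positions that differ after the one-byte change.
differing = sum(1 for x, y in zip(digest_long, digest_changed) if x != y)
print("positions that differ:", differing, "of", len(digest_long))
```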

Fixity, continued

Current best practice for digital preservation


  • The creation of message digests using two algorithms, MD5 and SHA-1.
    • These are implemented in the widely used JHOVE format identification, validation and characterization application (e.g., in the Rescue Repository before and after ingest).
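
The generate, save, recompute and compare cycle with both MD5 and SHA-1 might be sketched as follows. This is not JHOVE and does not reflect how the Rescue Repository is implemented; it is a minimal illustration using Python's standard `hashlib`, and the function names `compute_digests` and `verify_fixity`, the file name and the sample content are all hypothetical.

```python
import hashlib
import os
import tempfile

ALGORITHMS = ("md5", "sha1")  # the two algorithms named in the best practice


def compute_digests(path: str, chunk_size: int = 65536) -> dict:
    """Read the file once and return {algorithm name: hex digest}."""
    hashers = {name: hashlib.new(name) for name in ALGORITHMS}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def verify_fixity(path: str, stored: dict) -> bool:
    """Recompute the digests and compare them with the stored values."""
    return compute_digests(path) == stored


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "object.bin")
    with open(path, "wb") as handle:
        handle.write(b"digital object")

    stored = compute_digests(path)             # generated and saved, e.g. before ingest
    unchanged_ok = verify_fixity(path, stored) # recomputed later, object untouched

    with open(path, "ab") as handle:           # simulate an unintended change
        handle.write(b"!")
    changed_ok = verify_fixity(path, stored)

print("unchanged object passes:", unchanged_ok)
print("altered object passes:", changed_ok)
```

Recording two independent digests means a fixity failure must evade both algorithms at once to go unnoticed.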

Best Practice – Format Registries and Tools

What is a Format?

  • A technical specification describing a standard encoding or representation of digital content stored in a file.
    • A file format extension such as “.jpg” indicates the encoded content is a digital image.
  • File encoding standards are used by programs to read the encoded information and present useable content of the file to a user’s monitor or another output device.

Format Registries

What is a Format Registry?

  • A database that stores information about the technical specifications of an electronic file’s format.
  • Format registries record file format changes over time so that files remain readable even as a format standard becomes technologically obsolete.

How does a format registry work?

  • Global Digital Format Registry

File Format Tools

File format identification & validation tools answer two questions:

  • How can we tell a file's type?
  • If we know its type, how can we be sure that it conforms to its format specification so that we know it is still useable?
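
The first question is commonly answered by inspecting a file's leading bytes (its signature) rather than trusting its extension. The sketch below assumes a tiny, hand-written signature table and a hypothetical `identify` function; it is not how any particular tool is implemented, and it covers identification only. Validation, the second question, would require parsing the whole file against its format specification.

```python
# A small, illustrative table of well-known leading byte signatures.
SIGNATURES = [
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"%PDF-", "PDF"),
    (b"II*\x00", "TIFF"),   # little-endian TIFF
    (b"MM\x00*", "TIFF"),   # big-endian TIFF
]


def identify(header: bytes) -> str:
    """Return a format name based on the first bytes of a file, or 'unknown'."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


# A file named "scan.jpg" whose content is actually a PDF is identified by
# its bytes, not by its extension.
mislabelled_header = b"%PDF-1.4\n"
print(identify(mislabelled_header))
print(identify(b"GIF89a\x01\x00\x01\x00"))
print(identify(b"plain text with no signature"))
```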

File Format Tools
  • JHOVE: A widely used file type identification, validation and characterization tool developed by Harvard University Library and JSTOR.
    • Handles many format types (e.g., AIFF, ASCII, BYTESTREAM, GIF, HTML, JPEG, JPEG2000, PDF, TIFF, UTF8, WAV, XML).
    • Is configurable in many respects, including the options to: select full validation or “short” mode, in which only the header’s signature is analyzed; include or exclude message digests in the output; and choose from various output formats, including plain text and XML.
  • Because JHOVE does both file type identification and validation, it is currently Yale University Library’s format-related tool of choice.

File Format Tools

Other tools:

  • DROID (Digital Record Object Identification): A file type identification tool developed by the Digital Preservation Department of the National Archives of the United Kingdom to perform automated batch file format identification using the PRONOM registry.
  • National Library of New Zealand Preservation Metadata Extract Tool: A tool that extracts metadata from file headers. This Java tool uses “adapters” to extract metadata from file types including: MS Word, Word Perfect, Open Office, MS Works, MS Excel, MS PowerPoint, TIFF, JPEG, WAV, MP3, HTML, PDF, GIF, and BMP. This data is output in a standard XML format.

Best Practice – Persistent Identifiers
  • A persistent identifier (PI) is a unique name (identifier) associated with an internet resource that provides a link to the content and persists over changes of server location, ownership, and other state conditions.
    • A location (e.g., a given URL) is not a persistent identifier if the content moves to another location.
    • The principal problem addressed by PIs is broken links to internet resources, i.e., “the HTTP 404 Error – Document not found.”
  • Persistent identification is not possible without an associated service. It is the service that supports persistence. The identifier takes you to the service, the service resolves to the object.
  • Optimally a PI should be created and assigned when the digital object is created.
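
The division of labor between identifier and service can be sketched in a few lines. The class `ResolverService`, the identifier `pi:demo/0001` and the `example.org` URLs are hypothetical and stand in for a real resolution service; the point is only that the identifier stays fixed while the service's record of the current location is updated.

```python
class ResolverService:
    """Toy resolution service: maps persistent identifiers to current locations."""

    def __init__(self) -> None:
        self._locations = {}

    def register(self, identifier: str, url: str) -> None:
        """Assign an identifier when the digital object is created."""
        if identifier in self._locations:
            raise ValueError(f"identifier already assigned: {identifier}")
        self._locations[identifier] = url

    def move(self, identifier: str, new_url: str) -> None:
        """Record a new location; the identifier itself does not change."""
        if identifier not in self._locations:
            raise KeyError(identifier)
        self._locations[identifier] = new_url

    def resolve(self, identifier: str) -> str:
        """Return the current location for an identifier."""
        return self._locations[identifier]


service = ResolverService()
service.register("pi:demo/0001", "http://old-server.example.org/maps/1849.tif")
before = service.resolve("pi:demo/0001")

# The content moves to another server; only the service's record is updated.
service.move("pi:demo/0001", "http://new-server.example.org/objects/1849.tif")
after = service.resolve("pi:demo/0001")

print(before)
print(after)
```

A citation that stored the first URL directly would now be broken, whereas one that stored the identifier still resolves.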

Best Practice – Persistent Identifiers
  • Several technologies are available to create persistent identifiers such as:
    • CNRI Handle System – A generic system for assigning names to objects and resolving them. Key to it is the Global Handle Registry, which manages the namespace of all handle prefixes.
    • DOI (Digital Object Identifier) - An application of the CNRI Handle System that associates intellectual property with structured metadata. A typical use of a DOI is to give a scientific paper or article a unique identifying number that can be resolved through the DOI resolver or the CNRI global handle resolver.
    • PURL – A Persistent Uniform Resource Locator is a URL that describes an intermediate (and more persistent) location that, when retrieved, results in a standard HTTP redirect to the current location of the resource.
Persistent Identifiers - Handle Server
  • The implementation of a CNRI handle server at YUL is tightly coupled to the implementation of the VITAL/Fedora Digital Repository Service.
  • Digital objects within the Digital Repository Service will have handles such as:

http://moonpie:8085/fedora/get/hdl:10079.2F-2103288706 (opaque), or

http://hdl.rutgers.edu/1782.1/SPCOLSMAPS.Map.b1849 (semantic)

  • A handle server, like a web server, requires ongoing system administration, e.g., when resources are moved.
  • Research continues on the assignment of handles to resources in other YUL repositories such as the Rescue Repository, Image Commons (DL/Insight), etc.


Best Practice - Maintenance Strategies

A1. Clear Allocation of Responsibilities

A2. Provision of the appropriate technical infrastructure

A3. Establishment & implementation of a plan for system maintenance, support and replacement

A4. Establishment & implementation of plan for regular transfer of records to new storage media

A5. Adherence to appropriate storage and handling conditions for storage media

A6. Ensuring redundancy and regular backup

A7. Establishment of system security

A8. Disaster planning

Best Practice - Preservation Strategies

B1. Use of standards

B2. Data extraction and structuring

B3. Encapsulation

B4. Restricting the range of formats to be managed

B5. Technology preservation

B6. Reliance on backward compatibility

B7. Migration

B8. Software re-engineering

B9. Viewers and migration at the point of access

B10. Emulation

B11. Non-digital approaches

B12. Data restoration

Best Practice - PREMIS

PREservation Metadata: Implementation Strategies

  • Yale Working Group:
    • Matthew Beacom, Metadata Librarian, Catalog and Metadata Services (Co-chair)
    • Rebekah Irwin, Catalog Librarian for Digital Projects, Beinecke Library (Co-chair)
    • Youn Noh, Digital Resources Catalog Librarian, Catalog and Metadata Services
    • George Ouellette, Senior Programmer Analyst, Library ILTS
    • David Walls, Preservation Librarian, Library Preservation Dept
  • Yale Advisory Group:
    • Reed Beaman, Associate Director for Biodiversity Informatics, Peabody Museum
    • Lee Faulkner, Media Director, Digital Media Center for the Arts
    • David Gewirtz, Project Manager, Library Projects, ITS
    • Kevin Glick, Electronic Records Archivist, Manuscripts and Archives
    • Edward Kairiss, Director, Instructional Computing Instructional Technology, ITS
    • Daniel Lee, E-Publishing/Internet Marketing Manager, Yale University Press
    • Thomas Raich, Associate Director, Information Technology, Art Gallery

Best Practice - PREMIS

Develop PREMIS profiles that match specific digital collection and administrative needs:

  • Base profile (up to 6 elements): This base profile of elements would support digital preservation of a wide range of digital assets.
  • Full profile (over 200 elements): This full profile would provide guidance to administrators of digital information assets acting as trusted custodians of material deemed to be of long-term value.

Best Practices - Summary
  • Most of these best practices are the outcome of current research projects.
  • Few are tested in production preservation repositories.
  • At Yale the Rescue Repository is becoming a local testbed.
    • Fixity: MD5 and SHA-1 message digests
    • JHOVE file format identification and validation
    • Maintenance strategies
    • PREMIS base profile element set.
  • VITAL/Fedora Digital Repository Service implementation
    • Persistent identifiers through the CNRI Handle System.

What’s Next

  • Creation of a Transition Team to continue the work of the DPC and, most importantly, within a six-month timeframe, create the roadmap for implementing the permanent management model for an ongoing digital preservation program.
    • The recommended structure consists of a core team representing 2 FTE, composed of staff with expertise in metadata, repository and preservation services. It is modeled as a virtual Digital Curation Center (DCC). The DCC will put into practice the identified best practices and the Digital Repository Service (DRS) Preservation Archive.
  • The Transition Team will prepare a business plan for the Digital Curation Center. The business plan will identify the DCC’s: Vision, mission, goals and first year deliverables; Staffing models; Budget; and Timeline for creation.
