
Standards for Long-Term Retention of Digital Information: Can Ontologies Help?

Joshua Lubell, National Institute of Standards and Technology, lubell@nist.gov. Collaborative Expedition Workshop, National Science Foundation, July 18, 2007.


Presentation Transcript


  1. Standards for Long-Term Retention of Digital Information: Can Ontologies Help?
     Joshua Lubell, National Institute of Standards and Technology, lubell@nist.gov
     Collaborative Expedition Workshop, National Science Foundation, July 18, 2007

  2. The Problem
     • Too much digital data!
       • It takes about 15 minutes for the world to churn out new digital information equivalent to the entire collection of the US Library of Congress
     • Proprietary file formats
       • Expected lifetime of a typical manufacturing software application: only 3 years
     • Short-lived computing hardware and software
       • Expected lifetime of today’s storage/retrieval technologies: only 10 years
     • Products often outlive computer software/hardware by an order of magnitude
       • Aircraft can last 50 years or more
       • Healthcare records should be preserved through the patient’s lifetime, and perhaps beyond
     • Methods/tools address preservation, but not reuse or re-engineering requirements

  3. Data Standards
     • Necessary to avoid being locked into a vendor format or application that could disappear in the near future
     • Likely to be more stable than proprietary tools/formats
     • But data standards are only part of the solution
       • Information is more than just data!

  4. Information = Data + Interpretation
     (from the Reference Model for an Open Archival Information System, ISO 14721:2003)
     [Diagram labels]
     • Data Object (example: binary file)
     • Information Object (example: electronic tech manual)
     • Representation Information (metadata) (example: definition of PDF format)
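The "Information = Data + Interpretation" idea on slide 4 can be sketched as a small data structure. This is only an illustration: the class and field names below are hypothetical and are not taken from ISO 14721 or from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepresentationInformation:
    """Metadata describing how to interpret the bits (hypothetical fields)."""
    format_name: str
    specification: str


@dataclass(frozen=True)
class InformationObject:
    """A data object paired with the representation information needed to read it."""
    data_object: bytes
    representation: RepresentationInformation

    def describe(self) -> str:
        # Summarize the pairing of raw bytes with their interpretation.
        return (f"{len(self.data_object)} bytes interpreted as "
                f"{self.representation.format_name} per "
                f"{self.representation.specification}")


# Mirrors the slide's example: a binary file plus the definition of the PDF
# format together yield an electronic tech manual.
pdf_rules = RepresentationInformation("PDF", "definition of PDF format")
manual = InformationObject(b"%PDF-1.4 ...", pdf_rules)
print(manual.describe())
```

Without the `representation` field, the bytes alone would be data rather than information in the sense the slide uses.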

  5. An Information Package
     [Diagram labels]
     • Information Objects
       • Content Information
       • Preservation Description Information
         • Sub-categories: Reference, Provenance, Context, Fixity
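A minimal sketch of the information package from slide 5, assuming fixity is recorded as a SHA-256 digest of the content. The choice of SHA-256, the class names, and the sample values are all assumptions made for illustration; the slide names only the four sub-categories.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PreservationDescription:
    """The four sub-categories named on the slide (field types are assumed)."""
    reference: str   # identifier for the content
    provenance: str  # where the content came from
    context: str     # how the content relates to its environment
    fixity: str      # assumed here to be a hex SHA-256 digest


@dataclass(frozen=True)
class InformationPackage:
    content: bytes
    pdi: PreservationDescription


def make_package(content: bytes, reference: str, provenance: str,
                 context: str) -> InformationPackage:
    """Build a package, computing the fixity value from the content."""
    digest = hashlib.sha256(content).hexdigest()
    pdi = PreservationDescription(reference, provenance, context, digest)
    return InformationPackage(content, pdi)


def fixity_ok(package: InformationPackage) -> bool:
    """Return True if the content still matches its recorded digest."""
    return hashlib.sha256(package.content).hexdigest() == package.pdi.fixity


package = make_package(b"wing assembly model", "doc-0001",
                       "exported from a CAD system",
                       "part of an aircraft maintenance record")
# Simulate undetected modification: same description, different content.
tampered = InformationPackage(b"altered model", package.pdi)
print(fixity_ok(package), fixity_ok(tampered))
```

A fixity check like this detects that content changed, but not who changed it or why; that is what the provenance sub-category is for.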

  6. Tools for Tackling Long-Term Retention
     • Standards for representing digital artifacts
       • STEP – ISO 10303 (product data)
       • XML (documents)
       • Graphics, audio, video, multimedia standards
       • Scientific modeling standards
     • Methods for representing preservation information
       • Digital object typing/packaging
         • METS (Metadata Encoding and Transmission Standard)
         • MPEG-21
         • DOPs (Digital Object Prototypes)
       • Ontology languages
       • Rules languages
         • Schematron (ISO 19757-3:2006)
       • Digital format registries (UK Archives, Harvard, Univ. of Maryland)

  7. Sustaining Digital Information
     What is sustainability? From The Free Dictionary:
     • Noun: the act of sustaining life by food or providing a means of subsistence; "they were in want of sustenance"; "fishing was their main sustainment"
     • Transitive verb
       1. To keep in existence; maintain.
       2. To supply with necessities or nourishment; provide for.
       3. To support from below; keep from falling or sinking; prop.
       4. To support the spirits, vitality, or resolution of; encourage.
       5. To bear up under; withstand: can't sustain the blistering heat.
       6. To experience or suffer: sustained a fatal injury.
       7. To affirm the validity of: The judge has sustained the prosecutor's objection.
       8. To prove or corroborate; confirm.
       9. To keep up (a joke or assumed role, for example) competently.

  8. Sustaining Digital Information
     • Minimal
       • “Prop up”
       • Prevent destruction
     • Better
       • Preserve
       • Ensure authenticity, availability
     • Ideal
       • Nurture
       • “Care and feeding”
       • Enable reuse

  9. Sustainability Metrics
     • Library of Congress digital format sustainability factors
       • Disclosure
       • Adoption
       • Transparency
       • Self-documentation
       • External dependencies
       • Impact of patents
       • Technical protection mechanisms
     • What are the sustainability factors for an archiving and/or records management strategy?
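One way the seven factors on slide 9 could feed a comparison between formats is a simple unweighted average of per-factor ratings. This scoring scheme is an assumption made for illustration; neither the slide nor the Library of Congress factor list defines a numeric score, and the ratings below are invented, not real assessments of any format.

```python
# The seven factor names come from the slide; everything else is hypothetical.
FACTORS = (
    "disclosure",
    "adoption",
    "transparency",
    "self_documentation",
    "external_dependencies",
    "impact_of_patents",
    "technical_protection_mechanisms",
)


def sustainability_score(ratings: dict) -> float:
    """Average the ratings (each assumed to lie in 0.0..1.0) over all factors."""
    missing = [name for name in FACTORS if name not in ratings]
    if missing:
        raise ValueError("missing ratings for: " + ", ".join(missing))
    return sum(ratings[name] for name in FACTORS) / len(FACTORS)


# Invented ratings for two imaginary formats.
open_format = {name: 1.0 for name in FACTORS}
closed_format = {name: 0.5 for name in FACTORS}
print(sustainability_score(open_format), sustainability_score(closed_format))
```

An unweighted mean treats every factor as equally important; a real policy would probably weight them, which is part of the open question the slide ends on.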

  10. OAIS Functional Model

  11. Access Scenarios: The Three Rs
     • Reference
       • Preserve information in its original state
       • Example (product data engineering): 3D visualization
     • Reuse
       • Allow for future modification, re-engineering
       • Example: ISO 10303-203:1994 (STEP AP203)
     • Rationale
       • Encode construction history, design intent, tolerancing info, lifecycle management info, etc.
       • Example: STEP AP203 ed.2 ++
       • Ontologies and/or other representations needed
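The Three Rs can be read as increasingly demanding access scenarios. The sketch below checks which scenarios an archived item could support, given the kinds of content it holds. The content-kind labels and the mapping from scenario to required content are hypothetical, chosen only to echo the slide's examples.

```python
from enum import Enum


class AccessScenario(Enum):
    REFERENCE = "reference"
    REUSE = "reuse"
    RATIONALE = "rationale"


# Hypothetical requirements: which kinds of content each scenario needs.
REQUIRED = {
    AccessScenario.REFERENCE: {"visualization"},
    AccessScenario.REUSE: {"geometry"},
    AccessScenario.RATIONALE: {"geometry", "design_intent"},
}


def supported_scenarios(available: set) -> list:
    """Return the scenarios whose required content is all available."""
    return [s for s in AccessScenario if REQUIRED[s] <= available]


view_only = supported_scenarios({"visualization"})
full = supported_scenarios({"visualization", "geometry", "design_intent"})
print([s.value for s in view_only], [s.value for s in full])
```

An item archived only as a 3D visualization supports reference but not reuse, which matches the slide's point that preserving the original state is not enough for re-engineering.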

  12. Extended Functional Model

  13. So How Can Ontologies Help?
     • Digital object type classification
     • Prediction of records management policy consequences
     • Evaluating a records management system based on sustainability criteria
     • Tailoring repository access according to the Three Rs
     • Measure long-term sustainability based on the Three Rs
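As a rough illustration of the first bullet, digital object type classification, the sketch below uses a plain dictionary as a stand-in for an ontology's subclass hierarchy. The type names and the hierarchy are invented for this example; a real ontology language would support far richer relationships and reasoning than this single "is a kind of" link.

```python
# Hypothetical subclass links: each type maps to its direct parent type.
SUBCLASS_OF = {
    "STEP AP203 file": "product data",
    "product data": "digital object",
    "electronic tech manual": "document",
    "document": "digital object",
}


def ancestors(type_name: str) -> list:
    """Walk the parent links upward, nearest ancestor first."""
    chain = []
    current = type_name
    while current in SUBCLASS_OF:
        current = SUBCLASS_OF[current]
        chain.append(current)
    return chain


def is_a(type_name: str, candidate: str) -> bool:
    """True if type_name is the candidate type or one of its descendants."""
    return type_name == candidate or candidate in ancestors(type_name)


print(ancestors("STEP AP203 file"))
print(is_a("electronic tech manual", "digital object"))
```

With a hierarchy like this, a policy attached to "product data" could be applied to every more specific type beneath it, which hints at how classification connects to the policy bullets on the slide.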
