
Planning to Maximize Longevity of Digital Information

Howard Besser

UCLA School of Education & Information

http://www.gseis.ucla.edu/~howard


Planning to Maximize Longevity of Digital Info

  • The Ecology Metaphor

  • Why are you Managing this Information?

  • Major Issues Facing Digital Projects

  • The Short Life of Digital Info

  • Important Planning Considerations

  • Key Considerations for Imaging Projects



Why are you Managing this Information?

  • Organizational mission & type

  • Users

  • Uses


Major Issues Facing Digital Projects

  • Dangerous Changes in Intellectual Property Law

  • Intellectual Access

  • Storage

  • Delivery

  • Integration with other tools

  • Interoperability


Serious Longevity Problems

  • What we know from prior widespread digital file formats

  • Images separating from their metadata

  • Inaccessibility of software needed to view a work

  • Inability to even decode the file format of a work


The Short Life of Digital Info: Digital Longevity Problems

  • Disappearing Information

  • The Viewing Problem

  • The Scrambling Problem

  • The Inter-relation Problem

  • The Custodial Problem

  • The Translation Problem


The Viewing Problem

  • Digital Info requires a whole infrastructure to view it

  • Each piece of that infrastructure is changing at an incredibly rapid rate

  • How can we ever hope to deal with all the permutations and combinations


The Scrambling Problem: Dangers from:

  • Compression to ease storage & delivery

  • Container Architecture to enhance digital commerce


The Inter-relation Problem

  • Info is increasingly inter-related to other info

  • How do we make our own Info persist when it points to and integrates with Info owned by others?

  • What is the boundary of a set of information (or even of a digital object)?


The Custodial Problem

  • In the past, much of survival was due to redundancy

  • How do we decide what to save?

  • Who should save it?

    • Mellon-funded E-Journal Archives

  • How should they save it?


The Custodial Problem: How to save information?

  • Methods for later access

    • Refreshing

    • Migration

    • Emulation

  • Issues of authenticity and evidence


The Translation Problem

  • Content translated into new delivery devices changes meaning

    • A photo vs. a painting

    • If Info is produced originally in digital form in one encoded format, will it be the same when translated into another format?

    • Behaviors


Pieces of the Solution (1/2)

  • We need to insist upon clearly readable standardized ways for digital objects to self-identify their formats

  • We should discourage scrambling

  • We need to better understand how information inter-relates to other Info, and what constitutes “boundaries” of Info objects
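One way to picture format self-identification is the "magic number" convention that several common formats already follow: the first few bytes of the file announce what it is. The sketch below is only an illustration of that idea, not a method proposed in the slides. The byte signatures are the published ones for these formats; the table and function names are made up for this example.

```python
# Leading "magic number" bytes for a few well-known formats.
# The table name and function name are illustrative, not from the talk.
MAGIC_NUMBERS = {
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"\xff\xd8\xff": "JPEG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
    b"II*\x00": "TIFF (little-endian)",
    b"MM\x00*": "TIFF (big-endian)",
    b"%PDF-": "PDF",
}


def identify_format(data: bytes) -> str:
    """Return a format name based on the leading bytes, or 'unknown'."""
    for signature, name in MAGIC_NUMBERS.items():
        if data.startswith(signature):
            return name
    return "unknown"


if __name__ == "__main__":
    print(identify_format(b"%PDF-1.4 ..."))
    print(identify_format(b"no recognizable header"))
```

A file whose leading bytes match nothing in the table is exactly the "inability to even decode the file format" case: without a self-identifying header or external metadata, there is nowhere to start.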


Pieces of the Solution (2/2)

  • People and organizations wishing to make information persist need guidelines on how to go about doing it

  • We need to better understand how translating from one storage or display format to another affects the meaning of a work

  • We need to save the “behaviors” of a digital object, not just its “contents”


Conceptual Approaches to Digital Preservation

  • Refreshing always necessary due to volatility of physical strata

    • Impact on evidential value

  • Migration -- advantages & disadvantages

  • Emulation -- advantages & disadvantages
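Refreshing, in the sense used above, means copying the same bits onto fresh media. A minimal sketch of that step follows, assuming a checksum comparison is an acceptable way to confirm the copy is bit-identical; the function names `sha256_of` and `refresh` are hypothetical, and the slides do not prescribe any particular checksum.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh(source: Path, destination: Path) -> str:
    """Copy a file to new storage and verify the copy is bit-identical.

    Returns the digest so it can be recorded alongside the file.
    Raises OSError if the copy does not match the original.
    """
    before = sha256_of(source)
    shutil.copyfile(source, destination)
    after = sha256_of(destination)
    if before != after:
        raise OSError(f"refresh of {source} failed verification")
    return after
```

Recording the returned digest each time a file is refreshed leaves a simple trail that can support later questions about evidential value, since it shows the bits did not change across copies. Migration and emulation, by contrast, cannot be checked this way because the bits (or the environment) are deliberately different afterwards.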


To Deal with Immediately

  • Persistent IDs

  • Metadata


Persistent IDs--the Problem

  • Need to separate work ID from work location

  • URNs probably won’t be ready until 2003

  • Becomes a business process issue when one organization maintains the resource and another organization references it (i.e., licensed from vendors or managed by separate administrative structures)


More Persistent IDs--the Approach for today

  • PURLs

  • Handles

  • HTTP redirects

  • And worry about costs now and conversion costs when URNs become feasible
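The common idea behind the redirect-based approaches listed above is a lookup table that separates the work's ID from its current location. The sketch below illustrates only that idea; it is not how any particular PURL or Handle service is implemented, and the IDs, URLs, and function names are all hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical table: persistent ID -> current location of the work.
LOCATIONS: Dict[str, str] = {
    "id:example/photo-0001": "http://server-a.example.org/images/photo-0001.tif",
}


def resolve(persistent_id: str, table: Dict[str, str]) -> Tuple[int, Dict[str, str]]:
    """Return an HTTP-style (status, headers) pair for a persistent ID."""
    location = table.get(persistent_id)
    if location is None:
        return 404, {}
    # 302 tells the client where the work lives today without
    # promising that it will stay there.
    return 302, {"Location": location}


def relocate(persistent_id: str, new_location: str, table: Dict[str, str]) -> None:
    """Record that a work has moved; references to its ID stay valid."""
    table[persistent_id] = new_location


if __name__ == "__main__":
    print(resolve("id:example/photo-0001", LOCATIONS))
    relocate("id:example/photo-0001",
             "http://server-b.example.org/archive/photo-0001.tif", LOCATIONS)
    print(resolve("id:example/photo-0001", LOCATIONS))
```

When the file moves, only the table entry changes; every citation of the ID keeps working. The cost noted in the slide is real, though: someone has to maintain the table, and the entries would need converting if a different ID scheme is adopted later.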


Data Set Management: More issues with referencing IDs

  • References for mirror sites

  • References for back-up sites when main site is down or bottle-necked

  • References for off-site copies and archival copies
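A single ID may therefore need to point at several copies with different roles. As a rough sketch, assuming each ID maps to an ordered list of copies and that some availability check exists, a reference could fall back from the main site to a mirror, a back-up, and finally an archival copy. The role names, URLs, and the `is_up` callback are assumptions made for this example.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical table: persistent ID -> copies in order of preference.
COPIES: Dict[str, List[Tuple[str, str]]] = {
    "id:example/photo-0001": [
        ("primary", "http://main.example.org/photo-0001.tif"),
        ("mirror", "http://mirror.example.net/photo-0001.tif"),
        ("backup", "http://backup.example.org/photo-0001.tif"),
        ("archival", "http://offsite.example.org/photo-0001.tif"),
    ],
}


def first_available(
    persistent_id: str,
    copies: Dict[str, List[Tuple[str, str]]],
    is_up: Callable[[str], bool],
) -> Optional[Tuple[str, str]]:
    """Return the (role, url) of the first reachable copy, or None."""
    for role, url in copies.get(persistent_id, []):
        if is_up(url):
            return role, url
    return None


if __name__ == "__main__":
    # Pretend the main site is down or bottle-necked.
    main_is_down = lambda url: "main.example.org" not in url
    print(first_available("id:example/photo-0001", COPIES, main_is_down))
```

Passing the availability check in as a function keeps the sketch free of network code; a real deployment would decide for itself how to detect a down or overloaded site.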


Metadata can be the first line of defense

  • Can tell you

    • where the file is (if you can’t find the file)

    • where more info about the file is (if you have the file but most other metadata has become separated)

    • what the file format is

    • what the compression scheme is

    • what application program and version is needed for the file
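The five items above map naturally onto a small record kept alongside each file. The sketch below shows one possible shape for such a record, serialized as JSON so it stays readable without special software. The field names and sample values are hypothetical and do not follow any particular metadata standard.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class PreservationMetadata:
    """One record per file; each field answers one question from the slide."""
    file_location: str        # where the file is
    more_info_location: str   # where more info about the file is
    file_format: str          # what the file format is
    compression: str          # what the compression scheme is
    application: str          # what application program is needed
    application_version: str  # which version of that program


# Sample values are invented for illustration only.
record = PreservationMetadata(
    file_location="http://main.example.org/photo-0001.tif",
    more_info_location="http://main.example.org/catalog/photo-0001",
    file_format="TIFF",
    compression="none",
    application="ExampleViewer",
    application_version="2.1",
)

# Plain-text JSON that can travel with the file as a sidecar.
sidecar = json.dumps(asdict(record), indent=2)

if __name__ == "__main__":
    print(sidecar)
```

Keeping a copy of this record both with the file and in a separate catalog addresses the two failure cases named in the slide: losing the file, and losing the rest of the metadata.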


Structural Metadata Issues

  • http://sunsite.berkeley.edu/moa2


Architecture: Separating Longevity and Delivery Servers

[Diagram labels: User (x4), Berkeley Longevity Server, Berkeley Delivery Server, Other Delivery Server (x3)]


Groups Working on the Big Problem

http://sunsite.Berkeley.EDU/Longevity/

  • CPA Task Force

  • Getty “Time & Bits” Conference & Follow-ups

  • Emulation experiments in US and Europe

    • NEDLIB, CURL, Michigan

  • Mellon-funded E-Journal Archive experiments

  • Internet Archive

  • Long Now



Time & Bits Participants

  • Stewart Brand

  • Howard Besser

  • Brian Eno

  • Danny Hillis

  • Peter Lyman

  • Brewster Kahle

  • Kevin Kelly

  • Jaron Lanier

  • Doug Carlston

  • John Heilemann

  • Ben Davis

  • Margaret MacLean

  • Bruce Sterling

  • Paul Saffo


Groups Working on Pieces of the Big Problem

http://sunsite.berkeley.edu/Longevity/

  • Internet Archive

  • Long Now

  • Emulation experiments in US and Europe

    • NEDLIB, CURL, Michigan


Journal Archiving

  • License, don’t own; may not be even able to obtain right to make archival copy

  • Increasingly no paper back-up at all

  • Usually we don’t have the important redundancy factor

  • Stanford’s LOCKSS Project (Lots of Copies Keeps Stuff Safe) and its problems (http://lockss.stanford.edu)
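To see why redundancy matters, consider several institutions each holding a copy of the same article. The sketch below compares checksums across copies and flags any that disagree with the majority. It is a toy illustration of the redundancy argument only and makes no claim to reproduce the LOCKSS system's actual protocol; the holder names and the function name are hypothetical.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def find_damaged_copies(copies: Dict[str, bytes]) -> List[str]:
    """Return the holders whose copy differs from the majority version.

    `copies` maps a holder name to the bytes that holder has stored.
    If there is a tie, the version counted first is treated as the
    majority, so this is only meaningful with a clear majority.
    """
    if not copies:
        return []
    digests = {
        holder: hashlib.sha256(data).hexdigest()
        for holder, data in copies.items()
    }
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(h for h, d in digests.items() if d != majority_digest)


if __name__ == "__main__":
    article = b"Volume 12, Issue 3: full text of the article"
    holdings = {
        "library_a": article,
        "library_b": article,
        "library_c": article[:-1] + b"#",  # one corrupted byte
    }
    print(find_damaged_copies(holdings))
```

With only one licensed copy and no paper back-up, there is nothing to compare against, which is the situation the preceding bullets describe.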


Complexity of Rich Media

  • Works often have artistic nature (including video games)

  • Enormous number of elements can, at times, be very important to preserve (pacing, original artifact, elements used to construct the artifact)

  • Too complex to save every one of these aspects for every type of material

  • Importance of saving documentation


Important Planning Considerations

  • File Formats

  • Choosing Interoperable Systems

  • Adhere to standards

  • Vendors with large installed base

  • Refreshing and/or Migration


Key Considerations for Imaging Projects

  • Users' Needs

  • Image Quality

  • Intellectual Property

  • Standards

  • Topology

  • Tools & Processes


Key Considerations for Imaging Projects (1 of 3)

  • Users' Needs

    • Quality of Digital Surrogate

    • Interoperable desktop applications

  • Image Quality

    • Archival

    • Current online delivery


Key Considerations for Imaging Projects (2 of 3)

  • Intellectual Property

  • Standards

    • Modular and Layered Architecture

    • Terminology

    • Technical imaging information

  • Topology


Key Considerations for Imaging Projects (3 of 3)

  • Tools & Processes

    • Scanners

    • Compression techniques

    • Linking files

    • Workflow

    • Interoperable desktop applications


Some Nuts-and-Bolts Planning Considerations

  • Think about users (and potential users), uses, and type of material/collection

  • Scan at the highest quality that does not exceed the needs of the likely potential users/uses/material

  • Do not let today’s delivery limitations influence your scanning file sizes; understand the difference between digital masters and derivative files used for delivery

  • Many documents which appear to be bitonal actually are better represented with greyscale scans

  • Include color bar and ruler in the scan

  • Use objective measurements to determine scanner settings (do NOT attempt to make the image look good on your particular monitor or use image processing to color correct)

  • Don’t use lossy compression

  • Store in a common (standardized) file format

  • Capture as much metadata as is reasonably possible (including metadata about the scanning process itself)
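The advice against lossy compression for digital masters can be illustrated with a few lines of code. In this sketch, `zlib` stands in for any lossless scheme, and a crude quantization function stands in for lossy compression; it is not how JPEG or any real image codec works, and the sample "pixels" are synthetic.

```python
import zlib

# Synthetic stand-in for greyscale pixel data: every value 0..255, repeated.
pixels = bytes(range(256)) * 4

# Lossless: decompressing returns exactly the bytes that went in.
restored = zlib.decompress(zlib.compress(pixels))


def quantize(data: bytes, step: int) -> bytes:
    """Crude lossy stand-in: round each value down to a multiple of `step`."""
    return bytes((value // step) * step for value in data)


# Lossy: detail is discarded and cannot be recovered from the result.
lossy = quantize(pixels, 16)

if __name__ == "__main__":
    print("lossless round trip identical:", restored == pixels)
    print("lossy result identical:", lossy == pixels)
    print("distinct grey levels before:", len(set(pixels)))
    print("distinct grey levels after:", len(set(lossy)))
```

A derivative made for delivery can be regenerated from a lossless master at any time, but a master stored with lossy compression can never regain the discarded detail, which is why the two kinds of file are treated differently.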


One Final Question: Who will collect the digital works of today that should become the Special Collections of tomorrow?

  • web sites

  • zines

  • electronic journals

  • listserv and email discussions

  • drafts of works that later become famous


Planning to Maximize Longevity of Digital Information

Howard Besser

UCLA School of Education & Information

http://sunsite.berkeley.edu/Longevity/

http://www.gseis.ucla.edu/~howard

http://sunsite.berkeley.edu/moa2

http://lockss.stanford.edu

http://www.longnow.com/10klibrary/TimeBitsDisc/

http://www.archive.org/

