Steps Towards a Theory of Information Preservation

Giorgos Flouris, Carlo Meghini

Istituto di Scienza e Tecnologie dell’Informazione (ISTI) CNR, Pisa, Italy

{flouris,meghini}@isti.cnr.it

Invited Talk

(PresDB-07)

Introduction
  • Preservation:
    • Very important, difficult and interesting problem
    • Need for preservation is self-evident
  • Notes on this work:
    • Ongoing work for CASPAR (suggestions welcome)
    • About digital objects (not about databases, but can be applied to databases)
    • The focus of this work is not to perform preservation, but to describe formally what it means to perform preservation

Giorgos Flouris, PresDB-07

Purpose

We aim to give a formal, mathematical, logic-based description of preservation as a scientific discipline, in order to derive a methodology that rests on solid ground

(then, we will try to apply this methodology to CASPAR)

The Need for a Theory of Information Preservation
  • Why is such a theory important?
    • A formal, theoretical, mathematical framework allows impossibility and existence results to be proved
    • Allows us to ground existing (and future) methods upon a common formalism for comparison
    • Provides a set of formal desirable properties for existing and future preservation methods
    • Allows proving that a preservation method works well (or does not work well)
  • Where practitioners believe, a theory can prove

Preservation Types

[Figure: a PRODUCER (left) and a CONSUMER (right), with the following rows]
  • KR Level: “The first letter of the English alphabet” (on both sides)
  • Knowledge Level: Understands Concept (on both sides)
  • Symbol Level: A (producer writes symbol, consumer reads symbol)
  • Bits: 01000001 (producer writes bits, consumer reads bits)
  • Preservation types marked on the figure: Information Preservation, Data (or Object) Preservation, Bit Preservation
  • Horizontal axis: Time

Preservation Types: Example

Bit Preservation: Database is not corrupt (error correction techniques, backups, refreshment of media)

Data Preservation: Database can be opened (preserve format specification)

Information Preservation: Database can be understood (temperatures in Celsius, dates in dd/mm/yy)
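
The three levels can be illustrated on a single stored value. This is a minimal sketch that is not part of the original talk: the record contents, the use of SHA-256 as a fixity check, the ";" separator, the reading of "yy" as a year in the 2000s, and all function names are assumptions chosen for illustration.

```python
import hashlib

# Hypothetical stored record: a temperature reading and a date.
stored_bytes = b"21.5;23/03/07"
# Digest recorded when the record was written (assumed fixity mechanism).
recorded_digest = hashlib.sha256(stored_bytes).hexdigest()


def bits_preserved(data: bytes, digest: str) -> bool:
    """Bit preservation: the bytes read back match what was written."""
    return hashlib.sha256(data).hexdigest() == digest


def read_fields(data: bytes) -> list:
    """Data preservation: the format (ASCII text, ';'-separated) is still known."""
    return data.decode("ascii").split(";")


def interpret(fields: list) -> dict:
    """Information preservation: the unit and the date convention are still known."""
    day, month, year = fields[1].split("/")
    return {
        "temperature_celsius": float(fields[0]),
        "day": int(day),
        "month": int(month),
        "year": 2000 + int(year),  # assumption: two-digit years mean 20yy
    }


print(bits_preserved(stored_bytes, recorded_digest))
print(interpret(read_fields(stored_bytes)))
```

Each function can succeed while the next one fails: intact bytes are useless without the format, and parsed fields are misleading without the unit and date convention.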

Statics: Digital Object and UCK
  • A digital object depends on external information:
    • Bit Format (ASCII codes, integer representation, …)
    • Symbols’ Format (23/03/07 or 03/23/07)
    • Background Knowledge (what is the meaning of 23/03/07)
  • A digital object is attached to a single Underlying Community Knowledge (UCK) that contains this information
  • Therefore:
    • A digital object carries no meaning by itself
    • Its meaning (semantics) is derived from the attached UCK
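
The dependence on external information can be made concrete with a short sketch. It is an illustration only, not taken from the talk: the same bits and the same symbols are read under two different conventions. The date string 03/04/07 is an assumed example, used instead of the slide's 23/03/07 because it is valid under both conventions.

```python
from datetime import datetime

bits = "01000001"

# Two bit formats applied to the same 8 bits.
as_integer = int(bits, 2)      # read as an unsigned integer
as_ascii = chr(int(bits, 2))   # read as an ASCII character code

symbols = "03/04/07"

# Two symbol formats applied to the same string.
as_dd_mm_yy = datetime.strptime(symbols, "%d/%m/%y").date()
as_mm_dd_yy = datetime.strptime(symbols, "%m/%d/%y").date()

print(as_integer, as_ascii)
print(as_dd_mm_yy.isoformat(), as_mm_dd_yy.isoformat())
```

Nothing in the bits or in the string says which reading is intended; that choice lives outside the object, which is the role the slides assign to the UCK.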

Statics: Schematically

[Diagram not included in the transcript]

Information to be Preserved: Questions and Answers
  • Digital object: a set of questions and answers
    • Not all information in a digital object needs to be preserved
    • Example: a document (content, format, fonts, pagination)
  • The exact information to be preserved depends on:
    • Type of digital object
    • Producer’s intentions
    • Digital object’s intended reader (Designated Community)
    • Legal issues
    • Practical considerations

Statics: Information Preservation Structure (IPS)
  • IPS = UCK + Digital Object
    • UCK: <L,T>
    • Digital Object: <Q,ans>
  • L is further broken down:
    • L = <LL, V, VI, P, PC, ⊧>
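
The tuples on this slide can be written down as plain record types. The sketch below is an assumption-laden illustration, not the authors' definition: the transcript names the components LL, V, VI, P and PC without defining them, so they are kept opaque here; the entailment relation is modelled as a two-argument function; and representing Q as a set of strings and ans as a dictionary is a simplification.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet


@dataclass
class Logic:
    """L = <LL, V, VI, P, PC, |=>; components are opaque placeholders here."""
    LL: Any
    V: Any
    VI: Any
    P: Any
    PC: Any
    entails: Callable[[Any, Any], bool]  # stands in for the |= relation


@dataclass
class UCK:
    """UCK = <L, T>."""
    L: Logic
    T: Any  # assumed to be the theory expressed in L


@dataclass
class DigitalObject:
    """Digital Object = <Q, ans>."""
    Q: FrozenSet[str]     # questions to be preserved (assumed representation)
    ans: Dict[str, str]   # one answer per question (assumed representation)


@dataclass
class IPS:
    """IPS = UCK + Digital Object."""
    uck: UCK
    obj: DigitalObject


# Toy instance: a hypothetical "logic" whose entailment is set membership.
toy_logic = Logic(LL=None, V=None, VI=None, P=None, PC=None,
                  entails=lambda theory, statement: statement in theory)
toy_ips = IPS(
    uck=UCK(L=toy_logic, T={"dates are written dd/mm/yy"}),
    obj=DigitalObject(Q=frozenset({"date"}), ans={"date": "23/03/07"}),
)
print(toy_ips.uck.L.entails(toy_ips.uck.T, "dates are written dd/mm/yy"))
```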

IPS and Preservation Models
  • Preservation models provide a methodological framework for determining the content of an IPS
    • OAIS (ISO standard 14721:2003)
      • Representation Information (UCK)
        • Structural Information
        • Semantic Information
      • Preservation Description Information (questions and answers)
        • Provenance
        • Reference
        • Context
        • Fixity
      • Digital object’s content (questions and answers)

Purpose of Preservation

[Diagram not included in the transcript]

Preservation and Change
  • UCK evolves
    • If digital objects remained the same, they would be either unreadable or would carry the wrong meaning
  • Thus we need a methodology that will indicate the appropriate changes to all digital objects attached to a UCK, as a function of:
    • The old digital object
    • The old UCK (producer’s UCK)
    • The new UCK (consumer’s UCK)
    • The UCK evolution specification
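
A toy case shows what such a change to a digital object could look like. This sketch is hypothetical and covers only one hard-coded kind of evolution (a community switching temperature units); the talk treats the general method as an open question. `ToyUCK`, `meaning_in_kelvin` and `evolve` are invented names, and the evolution specification is implicit in the pair of units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyUCK:
    temperature_unit: str  # "C" or "F"; a stand-in for a full UCK


def meaning_in_kelvin(value: float, uck: ToyUCK) -> float:
    """What a stored number means under a given UCK, in a neutral unit."""
    if uck.temperature_unit == "C":
        return value + 273.15
    if uck.temperature_unit == "F":
        return (value - 32.0) * 5.0 / 9.0 + 273.15
    raise ValueError(f"unknown unit {uck.temperature_unit!r}")


def evolve(old_value: float, old_uck: ToyUCK, new_uck: ToyUCK) -> float:
    """Rewrite the stored number so that its meaning survives the UCK change."""
    kelvin = meaning_in_kelvin(old_value, old_uck)
    if new_uck.temperature_unit == "C":
        return kelvin - 273.15
    if new_uck.temperature_unit == "F":
        return (kelvin - 273.15) * 9.0 / 5.0 + 32.0
    raise ValueError(f"unknown unit {new_uck.temperature_unit!r}")


producer = ToyUCK("C")
consumer = ToyUCK("F")
old_value = 100.0
new_value = evolve(old_value, producer, consumer)

# Left unchanged, the object would be misread by the consumer.
misread_gap = abs(meaning_in_kelvin(old_value, consumer)
                  - meaning_in_kelvin(old_value, producer))
# After evolution, both parties attach the same meaning to it.
preserved_gap = abs(meaning_in_kelvin(new_value, consumer)
                    - meaning_in_kelvin(old_value, producer))
print(round(new_value, 6), round(misread_gap, 2), preserved_gap < 1e-9)
```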

Belief Change, Ontology Evolution and Information Preservation (1)
  • Initial thought: use well-established methods from belief change (belief revision) and ontology evolution
  • Not possible, in general:
    • The UCK may be a logic not supported by the above fields
    • Changes may affect the logic itself
    • Changes may be of infinite nature
    • Input/output may be different
  • Example: Roman to Arabic numerals
    • III → 3
    • IV → 4
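
The numeral example shows why a mapping is better given as a procedure than as a list of pairs. The following sketch is an illustration rather than anything from the talk: a few lines of code cover the two pairs on the slide (III and 3, IV and 4) and every other well-formed numeral. The function name is an assumption, and input is not validated.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_arabic(numeral: str) -> int:
    """Convert a well-formed Roman numeral to an integer.

    Subtractive rule: a symbol smaller than its right neighbour is
    subtracted (IV = 4); otherwise it is added (VI = 6).
    """
    total = 0
    for i, symbol in enumerate(numeral):
        value = ROMAN_VALUES[symbol]
        next_value = ROMAN_VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        total += -value if value < next_value else value
    return total


for numeral in ("III", "IV", "MCMXCIV"):
    print(numeral, roman_to_arabic(numeral))
```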

Belief Change, Ontology Evolution and Information Preservation (2)
  • However, it is possible under some assumptions:
    • The logic does not change
    • The logic in UCK is supported
    • Old UCK and digital object are known, evolution is known
    • Change can be finitely described using standard models
  • Example from astronomy:
    • Pluto was a Planet
    • Planet definition changed recently (24/08/06, Prague)
    • Pluto reclassified as a Dwarf Planet
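
The Pluto case fits these assumptions because only the theory changes, not the logic. The sketch below is a simplified illustration: the old knowledge is modelled as a conventional list of nine planets and the new knowledge as the three criteria of the 2006 definition. The class, the function names and the boolean encoding of each body are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Body:
    name: str
    orbits_sun: bool
    nearly_round: bool
    cleared_orbit: bool


# Old theory (simplified): planets are the members of a conventional list.
OLD_PLANETS = frozenset({
    "Mercury", "Venus", "Earth", "Mars", "Jupiter",
    "Saturn", "Uranus", "Neptune", "Pluto",
})


def classify_old(body: Body) -> str:
    return "Planet" if body.name in OLD_PLANETS else "Other"


def classify_new(body: Body) -> str:
    """New theory (simplified): classification by the three 2006 criteria."""
    if body.orbits_sun and body.nearly_round:
        return "Planet" if body.cleared_orbit else "Dwarf Planet"
    return "Other"


pluto = Body("Pluto", orbits_sun=True, nearly_round=True, cleared_orbit=False)
earth = Body("Earth", orbits_sun=True, nearly_round=True, cleared_orbit=True)

for body in (pluto, earth):
    print(body.name, classify_old(body), classify_new(body))
```

A digital object that records "Pluto is a Planet" answers the same question differently under the two theories, so it has to be re-expressed for the consumer if its original meaning is to survive.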

Dynamics: Schematically (Idealized Case)

[Figure labels: Producer, Consumer]

Dynamics: Schematically (General Case)

[Figure labels: Producer, Expanded, Consumer]

Various levels of preservation: complete, essential, modulo logical equivalence, indirect, approximate, partial, …

Dynamics: IPS Evolution Structure (IPSES)

[Figure: IPSES, drawn as three IPSs labelled Producer, Expanded and Consumer; each has a Digital Object (Q, ans) and a UCK (L, T), with L containing LL, V, PC, VI, P]

Mapping needs a finite representation: Turing Machines

IPSES’ definition is incomplete

Need a way to compute the green arrow from the information given (old digital object, producer’s UCK, consumer’s UCK, IPSES)

Putting it All Together: General Ideas

What is preservation?

Preservation is the process of retaining the meaning of a digital object unaltered for readers with different background, software, hardware etc

What are the preservation types?

Bit Preservation: Bits are not corrupt

Data Preservation: Bits’ format is understood/read

Information Preservation: Information is understood

Putting it All Together: Statics

What is a digital object?

A digital object is a sequence of bits (no meaning)

What gives meaning to a digital object?

The underlying (often implicit) format, knowledge, symbols’ meaning etc, represented by UCK

What should be preserved?

A set of questions and their answers

How do we determine the content of an IPS?

Preservation models can help

Putting it All Together: Dynamics (General)

Why is preservation needed?

Underlying knowledge (UCK) evolves; if digital objects remained the same, they would not be understood, or would be misunderstood

When is preservation achieved?

When digital objects retain their meaning

Can other research fields help?

Belief Revision and Ontology Evolution, but only partially

Putting it All Together: Dynamics (IPSES)

How can we describe UCK evolution?

Using an expanded UCK, plus a mapping and a number of correspondences between the UCKs

Is preservation always possible?

No; various levels of preservation

How should digital objects evolve?

Open question; a function of the old digital object, the two UCKs and the UCK evolution information (IPSES)

Future Work
  • Calculate the evolution of the digital object as a function of:
    • Old digital object
    • Producer’s UCK
    • Consumer’s UCK
    • IPSES (evolution information)
  • Ongoing work: refinements might be required
  • Extensive testing of the theory (real-world examples)
  • Tie the theory to structures that are more useful in practice


The End

Acknowledgements

This work was carried out during Giorgos Flouris’ tenure of an ERCIM “Alain Bensoussan” Fellowship Programme.

This work was partially supported by the EU project CASPAR (FP6-2005-IST-033572).

BACKUP SLIDES

Preservation Types: Revisited

[Figure: same layout as the Preservation Types figure, with a PRODUCER and a CONSUMER]
  • KR Level: “The last letter of the English alphabet” on one side, “The 6th letter of the Greek Alphabet” on the other
  • Knowledge Level: Understands Concept (on both sides)
  • Symbol Level: Z (producer writes symbol, consumer reads symbol)
  • Bits: 01011010 (producer writes bits, consumer reads bits)
  • Preservation types marked on the figure: Information Preservation, Data (or Object) Preservation, Bit Preservation

Preservation Types: Joke Analogy
  • In order to laugh at a joke, you must:
    • Hear the joke (bit preservation): the sound waves should reach your ears; if you are in another room, you won’t laugh at the joke
    • Understand the joke (data preservation): you should understand the language; if I tell a joke in Greek, you won’t laugh at the joke
    • Understand the context of the joke (information preservation): you should understand what the joke is about; if I tell a joke about the political situation in Greece, you won’t laugh at the joke

Statics: Underlying Community Knowledge (UCK)
  • UCK: a logical formalism, plus a logical theory
  • Because logics are:
    • Formal
    • Able to express knowledge
    • Suitable for capturing question-answering (using inference)
    • A well-studied, mature, well-established field with rich results
    • A basis for building theories that express background knowledge
  • We don’t embrace any particular logic

Contents of a UCK

[Figure labels: UCK, Producer, Intended Consumer, Digital Object, Knowledge P1, Knowledge P2, Knowledge P3, Knowledge C1, Knowledge C2, Common Knowledge]

Dynamics: Notes on IPSES
  • IPS Evolution Structure (IPSES):
    • IPSES = UCK + mapping
  • Exact specification of the change (no side-effects)
    • Usually change is partially specified (has side-effects)
    • Determining side-effects is orthogonal to preservation
  • Change may be infinite (finite representation needed)
    • Example: Roman and Arabic numerals
    • Need Turing Machines to represent the mapping
