Preservation and Long Term Access to Data and Records in a Knowledge-based Society

Reagan W. Moore

San Diego Supercomputer Center

moore@sdsc.edu

http://www.npaci.edu/DICE/

Staff
  • Reagan Moore
  • Ilkai Altintas
  • Chaitan Baru
  • Sheau Yen Chen
  • Charles Cowart
  • Amarnath Gupta
  • George Kremenek
  • M. Kulrul
  • Bertram Ludäscher
  • Richard Marciano
  • A. Memon
  • XuFei Qian
  • Roman Olshanowsky
  • Arcot Rajasekar
  • Abe Singer
  • Michael Wan
  • Ilya Zaslavsky
  • Bing Zhu

Graduate Students
  • A. Bagchi
  • S. Bansal
  • A. Behere
  • R. Bharath
  • S. Bharath
  • L. Sui

Undergraduate Interns
  • N. Cotofana
  • D. Le
  • J. Trang
  • L. Yin

+/- NN

Data and Knowledge Systems Group
Topics
  • Building persistent archives
  • Data grids
  • Authenticity mechanisms
  • Managing technology evolution
  • Knowledge-based access
Archival Processes
  • Appraisal - determine the archivable content.
  • Accession - determine the initial physical location for the data, and the relationship of the new collection to existing collections.
  • Arrangement - add administrative control, describe the information content (provenance, authenticity, structure, administrative), and decompose digital objects into their components as needed.
  • Description - complete the definition of collection attributes by iterating between arrangement, reformatting, and representation.
  • Preservation - build an archivable form of the digital entities, characterize the collection context, and manage their storage.
  • Access - provide query mechanisms for discovering, retrieving, and presenting the digital entities.

Common Approach (digital library, persistent archive, data grid)
  • Logical name space used to organize digital entities, and associate attributes
  • Separation of information management from data storage management
  • Definition of abstraction mechanisms for dealing with repositories
  • Emergence of need for knowledge management
SDSC Storage Resource Broker & Meta-data Catalog
Levels of Abstraction
  • Application
  • Clients: C, C++ libraries; Unix shell; Linux I/O; Web, WSDL; DLL / Python; Java, NT browsers; Prolog predicate
  • Consistency Management / Authorization-Authentication
  • Prime Server
  • Logical Name Space; Latency Management; Data Transport; Metadata Transport
  • Catalog Abstraction
    • Databases: DB2, Oracle, Sybase
  • Storage Abstraction
    • Databases: DB2, Oracle, Postgres
    • Archives: HPSS, ADSM, UniTree, DMF
    • File Systems: Unix, NT, Mac OSX
  • Servers: HRM

Authenticity
  • Guarantee that the data has not been changed
    • Collection owned data, only accessible through the data handling system
    • Support roles defining access (curation, owner, annotation, read)
    • Support access controls mapping users to roles
  • Audit trails that record all operations on files
  • Digital signatures - cryptographic checksums
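
The authenticity mechanisms above can be sketched in a few lines of Python. This is a minimal illustration, not the SRB implementation: the `Collection` class, the permission sets attached to each role, and the choice of SHA-256 are all assumptions made for the example. A plain checksum only detects change; a true digital signature would additionally sign the digest with a private key, which is omitted here.

```python
import hashlib
from dataclasses import dataclass, field

# Role names come from the slide; the permission sets are assumptions.
ROLE_PERMISSIONS = {
    "curation": {"read", "annotate", "write"},
    "owner": {"read", "annotate", "write"},
    "annotation": {"read", "annotate"},
    "read": {"read"},
}


def checksum(data: bytes) -> str:
    """Return a SHA-256 digest used as the fixity value for an object."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class Collection:
    """Hypothetical collection that owns its data and mediates all access."""
    user_roles: dict                                  # access control: user -> role
    objects: dict = field(default_factory=dict)       # logical name -> bytes
    checksums: dict = field(default_factory=dict)     # logical name -> digest
    audit_trail: list = field(default_factory=list)   # every attempted operation

    def _authorize(self, user: str, action: str, name: str) -> None:
        role = self.user_roles.get(user)
        allowed = action in ROLE_PERMISSIONS.get(role, set())
        # Record the operation whether or not it is permitted.
        self.audit_trail.append((user, action, name, "ok" if allowed else "denied"))
        if not allowed:
            raise PermissionError(f"{user} may not {action} {name}")

    def put(self, user: str, name: str, data: bytes) -> None:
        self._authorize(user, "write", name)
        self.objects[name] = data
        self.checksums[name] = checksum(data)

    def get(self, user: str, name: str) -> bytes:
        self._authorize(user, "read", name)
        return self.objects[name]

    def verify(self, name: str) -> bool:
        """True if the stored bytes still match the recorded checksum."""
        return checksum(self.objects[name]) == self.checksums[name]
```

Because users reach the data only through `put` and `get`, every operation passes the role check and lands in the audit trail, while `verify` can be run periodically to detect changes made outside the data handling system.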
Managing Technology Evolution
  • Data grids provide interoperability mechanisms to access data in multiple administration domains and multiple types of storage systems.
  • Persistent archives migrate collections from old technology to new technology to support presentation on new systems
  • Both require the ability to access heterogeneous systems
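
One way to picture the shared requirement is a common driver interface that hides each storage system behind the same calls. The sketch below is an assumption-laden illustration: `StorageDriver`, `InMemoryStore`, and `migrate` are hypothetical names, and the in-memory store merely stands in for a real file system, archive, or database driver.

```python
from abc import ABC, abstractmethod


class StorageDriver(ABC):
    """Hypothetical uniform interface hiding one storage system's details."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...

    @abstractmethod
    def list_paths(self) -> list: ...


class InMemoryStore(StorageDriver):
    """Stand-in for a real file system, archive, or database driver."""

    def __init__(self) -> None:
        self._blobs = {}

    def read(self, path: str) -> bytes:
        return self._blobs[path]

    def write(self, path: str, data: bytes) -> None:
        self._blobs[path] = data

    def list_paths(self) -> list:
        return sorted(self._blobs)


def migrate(old: StorageDriver, new: StorageDriver) -> int:
    """Copy every object from old to new using only the common interface."""
    count = 0
    for path in old.list_paths():
        new.write(path, old.read(path))
        count += 1
    return count
```

The same interface serves both uses: a data grid calls `read` on drivers in different administration domains, and a persistent archive calls `migrate` when an old storage technology is retired.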
Presentation of Digital Objects

  • Application
  • Operating System
  • Storage System
  • Display System
  • Digital Object

Technology Management - Emulation

  • Old Application
  • Wrap Application
  • New Operating System
  • New Storage System
  • New Display System
  • Digital Object

Technology Management

  • Old Application
  • Add Operating System Call
  • New Operating System
  • New Storage System
  • New Display System
  • Digital Object

Technology Management

  • Old Application
  • Add Operating System Call
  • New Operating System
  • Add Operating System Call
  • Old Storage System
  • Old Display System
  • Digital Object

Technology Management - Migration

  • New Application
  • New Operating System
  • New Storage System
  • New Display System
  • Migrate Encoding Format
  • Digital Object

Technology Management - SDSC

  • New Application
  • New Operating System
  • Wrap Storage System
  • Wrap Display System
  • Old Storage System
  • Old Display System
  • Migrate Encoding Format
  • Digital Object

Accessing Archived Data
  • Name transparency
    • Access data without knowing the file name
    • Map from attributes to a local file name
  • Location transparency
    • Access data without knowing where it is stored
    • Map from global file name to local file name
  • Collection transparency
    • Access data without knowing the collection attributes
    • Map from concept space to collection attributes
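
The three transparencies can be read as a chain of lookups. The sketch below is illustrative only: the catalog contents, resource names, and function names are invented for the example, and the attribute lookup is routed through the global name as an intermediate step, which is an assumption about how the mappings compose.

```python
# All names and values below are hypothetical, for illustration only.

# Collection transparency: discipline concept -> collection attributes.
CONCEPT_TO_ATTRIBUTES = {
    "sea surface temperature": {"variable": "sst"},
}

# Name transparency: attributes are recorded against each global name.
ATTRIBUTE_CATALOG = {
    "/archive/ocean/sst_1999": {"variable": "sst", "year": 1999},
    "/archive/ocean/salinity_1999": {"variable": "salinity", "year": 1999},
}

# Location transparency: global name -> (storage resource, local file name).
REPLICA_CATALOG = {
    "/archive/ocean/sst_1999": ("hpss-sdsc", "/hpss/u1/f00017.dat"),
    "/archive/ocean/salinity_1999": ("unix-fs", "/data/vol3/f00018.dat"),
}


def concept_to_attributes(concept: str) -> dict:
    """Map a discipline concept to the attributes that express it."""
    return CONCEPT_TO_ATTRIBUTES[concept]


def attributes_to_global_names(wanted: dict) -> list:
    """Find every global name whose attributes match all wanted values."""
    return sorted(
        name
        for name, attrs in ATTRIBUTE_CATALOG.items()
        if all(attrs.get(key) == value for key, value in wanted.items())
    )


def global_to_local(global_name: str) -> tuple:
    """Map a global name to the resource and local file that hold the data."""
    return REPLICA_CATALOG[global_name]


def resolve(concept: str) -> list:
    """Go from a concept to local file names without the user knowing either."""
    names = attributes_to_global_names(concept_to_attributes(concept))
    return [global_to_local(name) for name in names]
```

A caller can enter the chain at whichever level they know: `resolve` starts from a concept, `attributes_to_global_names` from attribute values, and `global_to_local` from a global file name.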
Information Management - Logical Name Space
  • Set of attributes to describe digital entities that are registered into the logical name space
    • SRB metadata - Unix file system semantics
    • Provenance metadata - Dublin Core
    • Resource metadata - User access control lists
    • Discipline metadata - User defined attributes
  • Each digital entity may have unique attributes
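
A minimal sketch of such a name space, assuming each registered entity carries one dictionary per attribute group listed above. The class and field names are hypothetical and do not reflect the MCAT schema; the Dublin Core keys in the usage note (`title`, `creator`) are standard element names.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalEntity:
    """Hypothetical record for one entity in the logical name space."""
    logical_name: str
    system: dict = field(default_factory=dict)      # Unix-like: size, owner, mtime
    provenance: dict = field(default_factory=dict)  # Dublin Core elements
    access: dict = field(default_factory=dict)      # access control list: user -> permission
    discipline: dict = field(default_factory=dict)  # user-defined attributes


class LogicalNameSpace:
    """Registry that organizes entities by logical name and attribute."""

    def __init__(self) -> None:
        self._entities = {}

    def register(self, entity: DigitalEntity) -> None:
        if entity.logical_name in self._entities:
            raise ValueError(f"{entity.logical_name} is already registered")
        self._entities[entity.logical_name] = entity

    def lookup(self, logical_name: str) -> DigitalEntity:
        return self._entities[logical_name]

    def query(self, group: str, key: str, value) -> list:
        """Return logical names whose attribute group has key == value."""
        return sorted(
            name
            for name, entity in self._entities.items()
            if getattr(entity, group).get(key) == value
        )
```

For example, registering `DigitalEntity("/demo/a", provenance={"title": "Survey", "creator": "X"}, discipline={"instrument": "ctd"})` and then calling `query("discipline", "instrument", "ctd")` returns that entity's logical name. Because the discipline group is a free-form dictionary, each entity can carry attributes no other entity has.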
Knowledge Management - Discovery across Collections
  • Mapping from collection attributes to discipline concepts
    • Make queries based on discipline concepts
  • Characterization of relationships between attributes
    • Semantic / logical - cross-walks
    • Procedural / temporal - records management
    • Structural / spatial - GIS
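
A semantic cross-walk, the first relationship type above, can be sketched as a table that maps one discipline concept to the attribute name each collection uses for it. Everything in this example (collection names, attribute names, records) is invented for illustration; procedural and structural relationships are not modeled.

```python
# Hypothetical cross-walk: discipline concept -> attribute name per collection.
CROSSWALK = {
    "author": {"library": "creator", "records": "originator"},
    "creation date": {"library": "date", "records": "filed_on"},
}

# Two hypothetical collections that describe the same ideas with different attributes.
COLLECTIONS = {
    "library": [
        {"id": "lib-1", "creator": "Moore", "date": "2001"},
        {"id": "lib-2", "creator": "Baru", "date": "2000"},
    ],
    "records": [
        {"id": "rec-7", "originator": "Moore", "filed_on": "1999"},
    ],
}


def query_by_concept(concept: str, value: str) -> list:
    """Search every collection using its own attribute name for the concept."""
    hits = []
    for collection, records in COLLECTIONS.items():
        attribute = CROSSWALK[concept].get(collection)
        if attribute is None:
            continue  # this collection has no attribute for the concept
        for record in records:
            if record.get(attribute) == value:
                hits.append((collection, record["id"]))
    return hits
```

The caller asks in terms of the concept ("author") and never needs to know that one collection calls it `creator` and the other `originator`.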
Knowledge Based Data Grids
  • Columns: Ingest Services, Management, Access Services
  • Knowledge
    • Ingest: Relationships Between Concepts
    • Management: Knowledge Repository for Rules
    • Access: Knowledge or Topic-Based Query / Browse
    • XTM DTD; Rules - KQL
  • (Model-based Access)
  • Information
    • Ingest: Attributes, Semantics
    • Management: Information Repository
    • Access: Attribute-based Query
    • XML DTD; SDLIP
  • (Data Handling System - SRB)
  • Data
    • Ingest: Fields, Containers, Folders
    • Management: Storage (Replicas, Persistent IDs)
    • Access: Feature-based Query
    • MCAT/HDF; Grids

Further Information

http://www.npaci.edu/DICE