
Mike Smorul, Mike McGann, Joseph JaJa

PAWN: A Policy-Driven Software Environment for Implementing Producer-Archive Interactions in Support of Long Term Digital Preservation. Mike Smorul, Mike McGann, Joseph JaJa, Institute for Advanced Computer Science Studies, University of Maryland, College Park


Presentation Transcript


  1. PAWN: A Policy-Driven Software Environment for Implementing Producer-Archive Interactions in Support of Long Term Digital Preservation
     Mike Smorul, Mike McGann, Joseph JaJa
     Institute for Advanced Computer Science Studies, University of Maryland, College Park
     Sponsored by National Archives and Records Administration, Library of Congress and NSF
     Archiving 2007

  2. Problems Facing Ingestion
     • Ensure integrity of data ingestion
     • Each producer-archive interaction is unique
     • Final destination for items in an archive is unique
     • Differing roles between producer and archive
     • Hostile producers

  3. What is PAWN?
     • Software that provides an ingestion framework
     • Distributed and secure ingestion of digital objects into an archive
     • Handles the process
       • From package assembly
       • To archival storage
     • Simple, customizable interface for end-users
     • Flexible interface for archive publication

  4. Package Workflow
     • Create Producer-Archive Agreement
     • Client package template
     • Create package based on template
     • Once approved, packages can be archived
     • Rejected packages can be held until rectified or deleted for resubmission
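The approve/reject/archive cycle on this slide can be read as a small state machine. The following is a minimal sketch of that reading, not PAWN's actual implementation: the class name, the state names, and the transition table are all assumptions chosen to mirror the bullets above.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

final class PackageWorkflowSketch {

    /** Hypothetical state names; the slide describes the steps but does not name states. */
    enum State { SUBMITTED, APPROVED, REJECTED, ARCHIVED, DELETED }

    /** Assumed transition table derived from the slide's bullets. */
    private static final Map<State, Set<State>> ALLOWED = new EnumMap<>(State.class);
    static {
        ALLOWED.put(State.SUBMITTED, EnumSet.of(State.APPROVED, State.REJECTED));
        // Once approved, a package can be archived.
        ALLOWED.put(State.APPROVED, EnumSet.of(State.ARCHIVED));
        // A rejected package is either rectified and resubmitted, or deleted.
        ALLOWED.put(State.REJECTED, EnumSet.of(State.SUBMITTED, State.DELETED));
        ALLOWED.put(State.ARCHIVED, EnumSet.noneOf(State.class));
        ALLOWED.put(State.DELETED, EnumSet.noneOf(State.class));
    }

    private State state = State.SUBMITTED;

    State state() {
        return state;
    }

    /** Moves to the next state, or fails if the workflow does not allow it. */
    void transition(State next) {
        if (!ALLOWED.get(state).contains(next)) {
            throw new IllegalStateException(state + " -> " + next + " is not allowed");
        }
        state = next;
    }

    public static void main(String[] args) {
        PackageWorkflowSketch pkg = new PackageWorkflowSketch();
        pkg.transition(State.REJECTED);   // archive rejects the first submission
        pkg.transition(State.SUBMITTED);  // producer rectifies and resubmits
        pkg.transition(State.APPROVED);
        pkg.transition(State.ARCHIVED);
        System.out.println("Final state: " + pkg.state());
    }
}
```

Keeping the allowed transitions in one table makes it easy to add per-workflow variations later without touching the transition logic.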

  5. Expanding a Simple Workflow
     • Support for multiple workflows
     • Grouped into logical domains
     • Definable roles per workflow
     • Pluggable components for assembly and archival publishing
     • Distributed components
     • Web-service based components

  6. Domain Organization
     • Producers are organized into domains; each domain contains a transfer agreement negotiated with the archive.
     • Each domain contains a hierarchical organization of data grouped into record sets/templates (convenient groupings from the transfer agreement).
     • Each domain contains its own users.
     • An end-user operates within a set of record sets.
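One way to picture the domain organization is as a small data model. The sketch below is illustrative only: the class and method names are hypothetical, and the rule that access to a record set extends to its child record sets is an assumption the slides do not state.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class DomainSketch {

    /** Hypothetical model of a domain: one agreement, a record-set tree, and its own users. */
    static final class Domain {
        final String name;
        final String transferAgreement;
        // Maps each record set to its parent; top-level record sets map to null.
        private final Map<String, String> parentOf = new HashMap<>();
        // Maps each user to the record sets they were granted directly.
        private final Map<String, Set<String>> grants = new HashMap<>();

        Domain(String name, String transferAgreement) {
            this.name = name;
            this.transferAgreement = transferAgreement;
        }

        void addRecordSet(String recordSet, String parent) {
            parentOf.put(recordSet, parent);
        }

        void grant(String user, String recordSet) {
            if (!parentOf.containsKey(recordSet)) {
                throw new IllegalArgumentException("unknown record set: " + recordSet);
            }
            grants.computeIfAbsent(user, u -> new HashSet<>()).add(recordSet);
        }

        /** Assumption: a grant on a record set also covers everything below it. */
        boolean canAccess(String user, String recordSet) {
            Set<String> granted = grants.getOrDefault(user, new HashSet<>());
            for (String rs = recordSet; rs != null; rs = parentOf.get(rs)) {
                if (granted.contains(rs)) {
                    return true;
                }
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Domain domain = new Domain("example-domain", "example transfer agreement");
        domain.addRecordSet("reports", null);
        domain.addRecordSet("reports/2006", "reports");
        domain.addRecordSet("images", null);
        domain.grant("alice", "reports");
        System.out.println(domain.canAccess("alice", "reports/2006"));
        System.out.println(domain.canAccess("alice", "images"));
    }
}
```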

  7. Domain Example

  8. Custom Roles
     • Actions in PAWN can be grouped together to create roles.
     • There are no common roles between archives, so allow custom ones.
     • Default roles
       • Producer – Individual data supplier
       • Records Manager – Oversight of producers
       • Archive Manager – Final review and archive publishing
       • Global Administrator – Creates domain, sysadmin-like account
     • Sample Actions
       • Setting permissions on record sets
       • Record Schedule creation and modification
       • Add or delete whole packages
       • Modify items in a package
       • …
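The idea of building roles out of grouped actions can be sketched as a set of permitted actions per role. This is an assumption-laden illustration: the identifiers are hypothetical, and the slide does not say which actions each default role holds, so the assignments below are invented for the example.

```java
import java.util.EnumSet;

final class RoleSketch {

    /** Actions paraphrased from the "Sample Actions" list; the identifiers are hypothetical. */
    enum Action {
        SET_RECORD_SET_PERMISSIONS,
        EDIT_RECORD_SCHEDULE,
        ADD_PACKAGE,
        DELETE_PACKAGE,
        MODIFY_PACKAGE_ITEMS
    }

    /** A role is just a named group of permitted actions. */
    static final class Role {
        private final String name;
        private final EnumSet<Action> actions = EnumSet.noneOf(Action.class);

        Role(String name, Action... permitted) {
            this.name = name;
            for (Action a : permitted) {
                actions.add(a);
            }
        }

        String name() {
            return name;
        }

        boolean permits(Action action) {
            return actions.contains(action);
        }
    }

    // Assumed action assignments; the slides do not specify them.
    static Role producer() {
        return new Role("Producer", Action.ADD_PACKAGE, Action.MODIFY_PACKAGE_ITEMS);
    }

    static Role recordsManager() {
        return new Role("Records Manager",
                Action.SET_RECORD_SET_PERMISSIONS,
                Action.EDIT_RECORD_SCHEDULE,
                Action.DELETE_PACKAGE);
    }

    public static void main(String[] args) {
        Role producer = producer();
        System.out.println(producer.name() + " may add packages: "
                + producer.permits(Action.ADD_PACKAGE));
        System.out.println(producer.name() + " may edit schedules: "
                + producer.permits(Action.EDIT_RECORD_SCHEDULE));
    }
}
```

An archive with different staffing would define its own `Role` instances from the same action list instead of reusing these defaults.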

  9. Custom Package Building
     • PAWN provides an API for developing custom package builders.
     • Custom package builders can be written in Java and implement a simple interface.
     • Builders interact with a hierarchically structured package.
     [Diagram labels: Manifest (Namespace, Type, Descriptive Name); Data (Type, Descriptive Name, Bits, Metadata …); Metadata (Type, Bits, Name)]
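A custom builder that implements a simple interface and fills in a hierarchical package might look roughly like the sketch below. PAWN's real builder interface is not shown in the slides, so `PackageBuilder`, `Node`, and `FlatBuilder` are hypothetical names, and the manifest/data/metadata nesting is only a guess based on the diagram labels.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PackageBuilderSketch {

    /** Hypothetical node in the package tree: a typed, named entry with children. */
    static final class Node {
        final String type;
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String type, String name) {
            this.type = type;
            this.name = name;
        }

        /** Adds a child and returns it so callers can keep nesting. */
        Node add(Node child) {
            children.add(child);
            return child;
        }

        /** Counts this node plus all of its descendants. */
        int size() {
            int total = 1;
            for (Node child : children) {
                total += child.size();
            }
            return total;
        }
    }

    /** Hypothetical builder interface; not PAWN's actual API. */
    interface PackageBuilder {
        Node build(List<String> fileNames);
    }

    /** Example builder: one data node per file, each with one metadata child. */
    static final class FlatBuilder implements PackageBuilder {
        @Override
        public Node build(List<String> fileNames) {
            Node manifest = new Node("manifest", "package");
            for (String fileName : fileNames) {
                Node data = manifest.add(new Node("data", fileName));
                data.add(new Node("metadata", fileName + ".meta"));
            }
            return manifest;
        }
    }

    public static void main(String[] args) {
        PackageBuilder builder = new FlatBuilder();
        Node manifest = builder.build(Arrays.asList("page1.tif", "page2.tif"));
        System.out.println("Nodes in package: " + manifest.size());
    }
}
```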

  10. PAWN Archive Gateway
      • Pluggable component that provides an API for developing gateways into various services.
      • Each gateway may have multiple instances, each configured differently.
      • PAWN handles managing and associating gateways with the appropriate data.
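The gateway idea (one pluggable interface, several differently configured instances, and a manager that routes data to the right one) can be sketched as follows. Everything here is hypothetical: the slides do not show the gateway API, and the in-memory gateway merely stands in for a real storage service.

```java
import java.util.HashMap;
import java.util.Map;

final class GatewaySketch {

    /** Hypothetical gateway interface; not PAWN's actual API. */
    interface ArchiveGateway {
        /** Stores the bits and returns the location they were stored under. */
        String publish(String name, byte[] bits);
    }

    /** Stand-in gateway that keeps objects in memory under a configurable prefix. */
    static final class InMemoryGateway implements ArchiveGateway {
        private final String prefix;
        private final Map<String, byte[]> store = new HashMap<>();

        InMemoryGateway(String prefix) {
            this.prefix = prefix;
        }

        @Override
        public String publish(String name, byte[] bits) {
            String location = prefix + "/" + name;
            store.put(location, bits.clone());
            return location;
        }
    }

    /** Associates record sets with gateway instances (the association key is an assumption). */
    static final class GatewayRegistry {
        private final Map<String, ArchiveGateway> byRecordSet = new HashMap<>();

        void associate(String recordSet, ArchiveGateway gateway) {
            byRecordSet.put(recordSet, gateway);
        }

        String publish(String recordSet, String name, byte[] bits) {
            ArchiveGateway gateway = byRecordSet.get(recordSet);
            if (gateway == null) {
                throw new IllegalArgumentException("no gateway for record set: " + recordSet);
            }
            return gateway.publish(name, bits);
        }
    }

    public static void main(String[] args) {
        GatewayRegistry registry = new GatewayRegistry();
        // Two instances of the same gateway type, configured differently.
        registry.associate("maps", new InMemoryGateway("disk-a"));
        registry.associate("scans", new InMemoryGateway("disk-b"));
        System.out.println(registry.publish("maps", "sheet1.tif", new byte[] {1, 2, 3}));
        System.out.println(registry.publish("scans", "page1.tif", new byte[] {4, 5}));
    }
}
```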

  11. PAWN Architecture
      • Divided into producer and archive side components
        • Producer: data supplying and domain management
        • Archive: data storage, resource allocation and archival publishing
      • Web-service based communication
      • Trust relationship between producer and archive components
        • SAML and PKI

  12. Components

  13. Case Studies
      • ICDL Book Builder
        • Custom package builder
        • Multiple data sources
        • Model logical books
      • SLAC Record Ingestion
        • Sample NARA ingestion
        • Model government roles
        • DOE Record Schedule
      • 10,000 CD-ROMs
        • Remote ingestion
        • Unskilled labor
        • Custom hardware

  14. PAWN Summary
      • Platform for ingestion
      • Customizable Components
        • Roles, ingest and publishing
      • Distributed architecture

  15. More information
      • Web site: http://www.umiacs.umd.edu/research/adapt
        • Wiki link for technical details.
      • Or “I’m feeling lucky” Google keywords: ADAPT UMIACS
