
Supporting Customized Archival Practices Using the Producer-Archive Workflow Network (PAWN)

Learn how the Producer-Archive Workflow Network (PAWN) provides support for customized archival practices, including extensibility, custom authorization, API for new ingestion interfaces, and flexible structure for publishing into repositories. Case studies highlight ways PAWN addresses bulk ingestion, modeling government interactions, and reliable data transfer.


Presentation Transcript


  1. Supporting Customized Archival Practices Using the Producer-Archive Workflow Network (PAWN) Mike Smorul, Mike McGann, Joseph JaJa

  2. Overview • Overview of PAWN extensibility • Custom authorization and role granting • APIs for building new ingestion interfaces • Flexible structure for publishing into repositories • Case studies using PAWN • Bulk ingestion of at-risk collections • Modeling government interactions

  3. Problems facing ingestion • Reliable data transfer from producer to archive. • Each producer-archive interaction is unique. • How the archive deals with each collection is unique as well.

  4. Distributed Ingestion with PAWN • Multiple producing sites with different requirements. • Separation of administrative responsibility. • Customizable roles for various parties. • Scalable infrastructure.

  5. Components

  6. Package Workflow Overview • Create Producer-Archive Agreement • Client package template. • Create package based on template • Once approved, packages can be archived • Rejected packages can be held until rectified or deleted for resubmission.
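The package lifecycle on this slide can be read as a small state machine. The sketch below is only an illustration of that reading: the class name, the state names, and the `moveTo` method are hypothetical and are not taken from PAWN's code. It assumes a package starts as submitted, can be approved or rejected, can be archived once approved, and that a rejected package is either resubmitted after being rectified or deleted.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of the package lifecycle; not PAWN's actual classes. */
final class PackageWorkflowSketch {

    enum State {
        SUBMITTED, APPROVED, REJECTED, ARCHIVED, DELETED;

        /** States reachable from this one, following the slide's bullets. */
        Set<State> next() {
            switch (this) {
                case SUBMITTED: return EnumSet.of(APPROVED, REJECTED);
                case APPROVED:  return EnumSet.of(ARCHIVED);
                // A rejected package is held until rectified, or deleted for resubmission.
                case REJECTED:  return EnumSet.of(SUBMITTED, DELETED);
                default:        return EnumSet.noneOf(State.class);
            }
        }
    }

    private State state = State.SUBMITTED;

    State state() {
        return state;
    }

    /** Moves the package to the target state, refusing transitions the slide does not describe. */
    void moveTo(State target) {
        if (!state.next().contains(target)) {
            throw new IllegalStateException(state + " -> " + target + " is not allowed");
        }
        state = target;
    }
}
```

Keeping the allowed transitions in one place makes it easy to adjust the sketch if a particular producer-archive agreement calls for extra review steps.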

  7. Producer-Archive Agreement

  8. Custom Roles • Actions in PAWN can be grouped together to create roles. • Modify items in a package, create users, etc. • Default roles • Producer – Individual data supplier • Records Manager – Oversight of producers • Archive Manager – Final review and archive publishing • Global Administrator – Creates domain, sysadmin-like account

  9. PAWN Actions • Domain creation, modification, deletion • Modification of the organizational structure of a domain • Account creation and modification • Role creation and modification • Record set creation and modification • Setting permissions on record sets • Record Schedule creation and modification • Add or delete whole packages • Modify items in a package • Limiting an account to working with its own packages, all packages, or all in a domain. • Approving, rejecting, and archiving items in a package • Lock or unlock entire packages to prevent modification • Configure publishing resources
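Taken together, the two slides above describe roles as named groups of actions, with an account optionally limited to its own packages, its domain, or everything. A minimal sketch of that idea follows. All type and method names are hypothetical, the action list is a small subset, and the choice of which actions a "Producer" receives is an assumption made for illustration rather than PAWN's documented default.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of roles as groups of actions; not PAWN's actual API. */
final class RoleSketch {

    /** A small subset of the actions listed on the slide. */
    enum Action {
        CREATE_ACCOUNT, ADD_PACKAGE, MODIFY_PACKAGE_ITEMS,
        APPROVE_ITEMS, REJECT_ITEMS, ARCHIVE_ITEMS, LOCK_PACKAGE
    }

    /** Which packages an account may touch. */
    enum Scope { OWN_PACKAGES, DOMAIN_PACKAGES, ALL_PACKAGES }

    static final class Role {
        final String name;
        final Scope scope;
        final Set<Action> actions;

        Role(String name, Scope scope, Action first, Action... rest) {
            this.name = name;
            this.scope = scope;
            this.actions = EnumSet.of(first, rest);
        }

        /** True when the role includes the action and the package falls inside the role's scope. */
        boolean permits(Action action, boolean ownsPackage, boolean sameDomain) {
            if (!actions.contains(action)) {
                return false;
            }
            switch (scope) {
                case OWN_PACKAGES:    return ownsPackage;
                case DOMAIN_PACKAGES: return sameDomain;
                default:              return true;
            }
        }
    }
}
```

For example, `new RoleSketch.Role("Producer", RoleSketch.Scope.OWN_PACKAGES, RoleSketch.Action.ADD_PACKAGE, RoleSketch.Action.MODIFY_PACKAGE_ITEMS)` would allow a producer to modify items only in packages it owns.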

  10. Custom Package Building • PAWN provides an API for developing custom package builders. • Custom package builders can be written in Java and implement a simple interface. • Builders interact with a hierarchically structured package.
  [Diagram: package structure. Data: Type, Descriptive Name, Bits, Metadata … Metadata: Type, Bits, Name. Manifest: Namespace, Type, Descriptive Name, Manifest …]
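To make the "simple interface" idea concrete, here is a minimal sketch of what a custom builder might look like. It is not PAWN's real API: the `PackageBuilder` interface, the `Manifest`, `Data`, and `Metadata` classes, and the `BookBuilder` example are all hypothetical. The field names follow one reading of the slide's diagram (a manifest with a namespace, type, and descriptive name; data items carrying bits and attached metadata).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a custom package builder; not PAWN's actual interface. */
final class PackageBuilderSketch {

    static final class Metadata {
        final String type;
        final String name;
        final byte[] bits;

        Metadata(String type, String name, byte[] bits) {
            this.type = type;
            this.name = name;
            this.bits = bits;
        }
    }

    static final class Data {
        final String type;
        final String descriptiveName;
        final byte[] bits;
        final List<Metadata> metadata = new ArrayList<>();

        Data(String type, String descriptiveName, byte[] bits) {
            this.type = type;
            this.descriptiveName = descriptiveName;
            this.bits = bits;
        }
    }

    /** A node in the package hierarchy; it may hold data items and nested manifests. */
    static final class Manifest {
        final String namespace;
        final String type;
        final String descriptiveName;
        final List<Data> data = new ArrayList<>();
        final List<Manifest> children = new ArrayList<>();

        Manifest(String namespace, String type, String descriptiveName) {
            this.namespace = namespace;
            this.type = type;
            this.descriptiveName = descriptiveName;
        }
    }

    /** Assumed shape of the builder contract: produce the root of a package tree. */
    interface PackageBuilder {
        Manifest build();
    }

    /** Example builder that turns a titled list of pages into one package. */
    static final class BookBuilder implements PackageBuilder {
        private final String title;
        private final List<String> pages;

        BookBuilder(String title, List<String> pages) {
            this.title = title;
            this.pages = pages;
        }

        @Override
        public Manifest build() {
            Manifest book = new Manifest("example:book", "book", title);
            for (int i = 0; i < pages.size(); i++) {
                Data page = new Data("text/plain", "Page " + (i + 1),
                        pages.get(i).getBytes(StandardCharsets.UTF_8));
                // Attach the book title to every page as descriptive metadata.
                page.metadata.add(new Metadata("dc:title", "title",
                        title.getBytes(StandardCharsets.UTF_8)));
                book.data.add(page);
            }
            return book;
        }
    }
}
```

A client interface would then only need to call `build()` and hand the resulting tree to the ingestion layer, which keeps collection-specific logic inside the builder.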

  11. Package Builders • Default Builder • Create files and folders • Attach descriptive metadata to files or folders • ICDL Builder • Create ‘books’ with Dublin Core metadata • Uses ICDL database as source for book list and metadata

  12. PAWN Archive Gateway • Pluggable component that provides an API for developing gateways into various services. • Each gateway may have multiple instances, each configured differently • PAWN handles managing and associating gateways with the appropriate data.

  13. SRB Gateway workflow • Before any submission, the gateway is configured with basic SRB information and associated with a domain. • The client supplies final destination and additional settings • Driver returns handle to final destination for log files
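The gateway slides describe three steps: a gateway instance is configured and associated with a domain, the client supplies a destination and settings at submission time, and the driver returns a handle to the final location. The sketch below illustrates that flow under stated assumptions. The `ArchiveGateway` interface, the registry, and the handle format are all hypothetical, and `FakeSrbGateway` is a stand-in that only builds a string; it does not talk to SRB.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of pluggable archive gateways; not PAWN's actual API. */
final class GatewaySketch {

    /** Assumed gateway contract: publish a package and return a handle to where it landed. */
    interface ArchiveGateway {
        String publish(String packageId, String destination, Map<String, String> settings);
    }

    /** Stand-in for an SRB-style driver. It performs no network I/O. */
    static final class FakeSrbGateway implements ArchiveGateway {
        private final String host;
        private final String zone;

        FakeSrbGateway(String host, String zone) {
            this.host = host;
            this.zone = zone;
        }

        @Override
        public String publish(String packageId, String destination, Map<String, String> settings) {
            // The handle format here is invented for the example.
            return "srb://" + host + "/" + zone + destination + "/" + packageId;
        }
    }

    /** Associates configured gateway instances with domains. */
    static final class GatewayRegistry {
        private final Map<String, ArchiveGateway> byDomain = new HashMap<>();

        void associate(String domain, ArchiveGateway gateway) {
            byDomain.put(domain, gateway);
        }

        String publish(String domain, String packageId, String destination,
                       Map<String, String> settings) {
            ArchiveGateway gateway = byDomain.get(domain);
            if (gateway == null) {
                throw new IllegalArgumentException("no gateway configured for domain " + domain);
            }
            return gateway.publish(packageId, destination, settings);
        }
    }
}
```

Because each domain maps to its own configured instance, two domains can use the same gateway type with different hosts or settings, which matches the "multiple instances, each configured differently" bullet.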

  14. Case Study: 15,000 CD-ROMs • 15,000 CD-ROMs containing Landsat data. • CDs in control of the library; processing and data storage located across campus. • Moving the CD collection was not feasible. • Needed untrained (student) labor to ingest without supervision. • Final copy needed to be accessible by several parties.

  15. Case Study: 15,000 CD-ROMs • Custom PAWN interface. • Two workstations, four CD drives apiece. • Generate thumbnails and barcode the CD-ROMs. • Use SRB as final archive, with the pre-existing PAWN-SRB driver.

  16. Case Study: SLAC Records • What parties are involved in transferring records from a government agency to NARA? • How can the Record Schedule view of required records be simplified and presented to a client?

  17. Case Study: SLAC Records • Created specialized roles • Records Creator • Create new packages and modify own submissions. • Records Liaison Officer • View or modify any packages in their domain. • Create users. • Create record templates. • Records Manager • Send packages on for more permanent storage. • Create domains and producer-archive agreements. • Used pre-existing SRB gateway

  18. Case Study: SLAC Records

  19. More information • Web site: • http://www.umiacs.umd.edu/research/adapt • Wiki link for technical details. • Or “I’m feeling lucky” Google keywords: • ADAPT UMIACS
