Persistent Digital Archives and Library System (PeDALS)

Dennis Bitterlich, Electronic Records Archivist

What is PeDALS?

  • A grant-funded, multi-state project financed by the Library of Congress, through its National Digital Information Infrastructure and Preservation Program (NDIIPP), and by the Institute of Museum and Library Services

  • Includes five state partners: Arizona, Florida, New York, South Carolina and Wisconsin, with Arizona as the lead partner

  • The project will run for 18 months, until the middle of 2009; if it is successful, the Wisconsin Historical Society (WHS) intends to continue participating beyond this period

  • At the end of the project each partner will have a functioning electronic records repository

Why is PeDALS Needed?

  • An increasing number of state government records of long-term value are created in electronic-only format

  • Because of the large and growing volume of electronic records in varied formats, traditional appraisal and acquisition practices are no longer effective; an automated, rules-based system like PeDALS is one possible response to this new reality

  • PeDALS is not an electronic records management system, but rather a way to acquire electronic records already scheduled for transfer

  • PeDALS is both a learning opportunity and a chance to implement a functioning system

Goals of the Project

  • Develop a methodology to support an automated, integrated workflow to process collections of electronic records

  • Implement an inexpensive storage system that can preserve the integrity and authenticity of electronic records over time

  • Remove barriers to adoption by keeping costs of the system as low as possible

  • Work with the Wisconsin Document Depository Program to develop ways to integrate state agency publications in digital format into PeDALS processes; since 2005 the Depository has worked to preserve e-publications acquired from state websites

Submission Information Package (SIP) and Archival Information Package (AIP)

  • SIP: Agency records with associated metadata are transferred to the PeDALS system

  • Initial checks for authenticity, integrity, restrictions, and any viruses or malware

  • AIP: Rules-based software will transform records into a format suitable for long-term storage
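
The integrity check at ingest can be illustrated with a minimal sketch. It assumes each SIP arrives with a manifest of SHA-256 checksums recorded by the agency; the manifest layout and the names `sha256_of` and `check_sip` are hypothetical and are not taken from the PeDALS design.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_sip(sip_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return names of files that are missing or whose checksum differs.

    `manifest` maps a relative file name to the SHA-256 digest recorded
    at transfer time (an assumed layout, for illustration only).
    """
    problems = []
    for name, expected in manifest.items():
        target = sip_dir / name
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(name)
    return problems
```

An empty list means every file in the manifest arrived intact; any name in the list would need review before the record moves on to AIP transformation.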

Lots of Copies Keeps Stuff Safe (LOCKSS)

  • Records are transferred into LOCKSS servers for long-term preservation

  • LOCKSS is a data storage system that scans for and repairs file corruption and other data integrity problems

  • Hardened firewalls and geographic distribution provide added security
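
The idea of detecting and repairing corruption across multiple copies can be sketched as a simple majority vote over checksums. This is a deliberately simplified illustration and not the actual LOCKSS polling protocol; the function name `repair_replicas` and the server names are hypothetical.

```python
import hashlib
from collections import Counter


def repair_replicas(replicas: dict[str, bytes]) -> dict[str, bytes]:
    """Replace any replica that disagrees with the majority content.

    `replicas` maps a (hypothetical) server name to the bytes it holds
    for one record. Raises ValueError when no strict majority exists,
    because then there is no trustworthy copy to repair from.
    """
    if not replicas:
        return {}
    digests = {name: hashlib.sha256(data).hexdigest()
               for name, data in replicas.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(replicas):
        raise ValueError("no majority among replicas; manual review needed")
    # Take the content from any replica that matches the winning digest.
    good = next(data for name, data in replicas.items()
                if digests[name] == winner)
    return {name: good for name in replicas}
```

With three copies, one corrupted copy is outvoted and overwritten; with only two copies that disagree, the sketch refuses to guess, which is one reason more copies are safer.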

Dissemination Information Package (DIP)

  • Web server will provide Internet access to records through a web-based search interface

  • Access to records restricted by statute or otherwise will be blocked during the restriction period

  • Records scheduled for transfer, but not access, are held in the electronic archive, but no user copy is sent to the web server until public access is allowed
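
The access rules above can be sketched as a single decision function. The `Record` fields and the name `publish_to_web` are assumptions made for illustration; the slides do not describe the actual data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Record:
    identifier: str
    public_access_allowed: bool               # False: transfer only, no access copy
    restricted_until: Optional[date] = None   # None: no restriction period


def publish_to_web(record: Record, today: date) -> bool:
    """Decide whether a user copy may be sent to the web server."""
    if not record.public_access_allowed:
        return False
    if record.restricted_until is not None and today < record.restricted_until:
        return False
    return True
```

In this sketch the archival copy is always kept; only the creation of the public user copy depends on the result.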

Microsoft BizTalk Overview

BizTalk is a middleware application that is, at its core, an XML message queue. It will:

Receive Objects → Convert & Perform Logic on Objects → Send Objects

All three steps are completed by BizTalk using XML
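
The receive, convert, and send flow can be sketched with a plain in-memory queue of XML messages. This is only an analogy written in Python with the standard library; it does not use or imitate the BizTalk API, and the message layout (`record`, `title`, `status`) is hypothetical.

```python
import queue
import xml.etree.ElementTree as ET


def receive(inbox: queue.Queue, raw_xml: str) -> None:
    """Receive step: parse an incoming XML message and queue it."""
    inbox.put(ET.fromstring(raw_xml))


def convert(message: ET.Element) -> ET.Element:
    """Convert step: apply logic to the message (here, tidy the title)."""
    title = message.find("title")
    if title is not None and title.text:
        title.text = title.text.strip().title()
    message.set("status", "processed")
    return message


def send(message: ET.Element) -> str:
    """Send step: serialize the message for the next system."""
    return ET.tostring(message, encoding="unicode")


def run(inbox: queue.Queue) -> list[str]:
    """Drain the queue, converting and sending each message in order."""
    sent = []
    while not inbox.empty():
        sent.append(send(convert(inbox.get())))
    return sent
```

Keeping each step as its own function mirrors the slide's three stages, so the conversion logic can change without touching how messages arrive or leave.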

BizTalk Pipelines

  • Connections between systems

    • Connect BizTalk to databases

    • Connect BizTalk to web

    • Connect BizTalk to file servers

    • Connect BizTalk to programs

BizTalk Business Rules

Business rules

  • BizTalk-speak for the high-level processes that determine which orchestrations will be performed

    • Example: if a record series is confidential or restricted, then go to the orchestration that populates restrictions

BizTalk Orchestrations

  • BizTalk-speak for the logic that processes objects

    • Example: build in logic to calculate the length of restrictions and to determine which database fields to populate
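
The relationship between a business rule and an orchestration can be sketched as a dispatcher that picks a processing function. This is plain Python for illustration, not BizTalk code; the field names (`confidential`, `restricted`, `restriction_years`) and function names are assumptions.

```python
from datetime import date


def populate_restrictions(series: dict, received: date) -> dict:
    """Orchestration: compute when the restriction ends and fill the fields."""
    years = series["restriction_years"]
    try:
        ends = received.replace(year=received.year + years)
    except ValueError:
        # 29 February in a non-leap target year: fall back to 28 February.
        ends = received.replace(year=received.year + years, day=28)
    return {"series_id": series["id"], "restricted": True,
            "restricted_until": ends.isoformat()}


def populate_open(series: dict, received: date) -> dict:
    """Orchestration: mark an unrestricted series as open."""
    return {"series_id": series["id"], "restricted": False,
            "restricted_until": None}


def apply_business_rules(series: dict, received: date) -> dict:
    """Business rule: choose which orchestration handles the series."""
    if series.get("confidential") or series.get("restricted"):
        return populate_restrictions(series, received)
    return populate_open(series, received)
```

The rule only decides where a series goes; the date arithmetic and the fields to populate live in the orchestration, which matches the split described on the two slides.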

Initial BizTalk Development Goals & Objectives

1 – Write ARCAT BizTalk Code pipeline

  • Series already cataloged

  • Reduced duplication of work & manual data entry

  • Pipeline will work for CGI/BIN Web Service

  • Copy programming code to create next pipelines

2 – Write Web Services BizTalk Code pipeline

  • Copied from CGI/BIN ARCAT Service pipeline

  • Generic HTTP pipeline to Agencies Web Pages

  • Can use for PeDALS “Drop Box”

Initial BizTalk Development Goals & Objectives

3 – Write DHS BizTalk Code pipeline

  • Code copied from prior pipelines

  • Connect to a database

  • Solve issues related to external networks

4 – Write DWD BizTalk Code pipeline

  • Connect to a file server

  • Issues related to external networks should be solved, but may be different for file server connection

Initial BizTalk Development Goals & Objectives

5 – Write BizTalk orchestration that calls JHOVE, MetaExtractor, or C# code to wrap records with preservation metadata

  • Once we can receive records through pipelines

  • Create logic to perform in BizTalk

  • Wrap records in XML preservation metadata

  • First, execute a third-party open source program such as JHOVE or MetaExtractor

  • Second, write code that interacts with programming languages such as C#
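
Wrapping a record in XML preservation metadata can be sketched with the Python standard library. The element names below are hypothetical and do not follow a PeDALS or PREMIS schema, and the metadata is computed directly in the sketch instead of being taken from JHOVE or MetaExtractor output.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def wrap_record(name: str, content: bytes, mime_type: str) -> str:
    """Wrap a record and basic preservation metadata in one XML document.

    Element names are illustrative assumptions, not a published schema.
    """
    root = ET.Element("preservedRecord")
    meta = ET.SubElement(root, "preservationMetadata")
    ET.SubElement(meta, "fileName").text = name
    ET.SubElement(meta, "mimeType").text = mime_type
    ET.SubElement(meta, "sizeBytes").text = str(len(content))
    ET.SubElement(meta, "sha256").text = hashlib.sha256(content).hexdigest()
    # Base64 keeps arbitrary binary content safe inside the XML wrapper.
    payload = ET.SubElement(root, "content", encoding="base64")
    payload.text = base64.b64encode(content).decode("ascii")
    return ET.tostring(root, encoding="unicode")
```

In a fuller version, the `mimeType` value and other format details would come from a characterization tool such as JHOVE, with this function only assembling the wrapper.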

Measurement of Success

1 – Ability to extract MARC records from ARCAT and insert into database

2 – Ability to create external web services pipeline to transfer records to WHS

3 – Ability to create external file pipeline to DHS Quest Archives Manager to transfer records to WHS

4 – Ability to create external file pipeline to DWD to transfer records to WHS

5 – Ability to wrap electronic records with preservation metadata inside of BizTalk

Process to Write Code

Iterative Process to:

1) Write BizTalk programming code

2) Test BizTalk programming code

3) Revise BizTalk programming code

4) Retest BizTalk programming code

Questions?

Dennis Bitterlich, Electronic Records Archivist
