
Initial BizTalk Programming Development Objectives for PeDALS

Dennis Bitterlich, Electronic Records Archivist


What is PeDALS? Persistent Digital Archives & Library System

  • A grant-funded multi-state project financed by the Library of Congress, through the National Digital Information Infrastructure & Preservation Program (NDIIPP), and the Institute of Museum and Library Services

  • Includes five state partners: Arizona, Florida, New York, South Carolina and Wisconsin, with Arizona as the lead partner

  • The project will run 18 months, until the middle of 2009; if successful, the Wisconsin Historical Society (WHS) intends to continue participation beyond this period

  • At the end of the project each partner will have a functioning electronic records repository


Why is PeDALS Needed?

  • An increasing number of state government records of long-term value are created in electronic-only format

  • Due to the large and increasing volume of electronic records in varied formats, traditional appraisal and acquisition practices are no longer effective; an automated, rules-based system like PeDALS is one possible response to this new reality

  • PeDALS is not an electronic records management system, but rather a way to acquire electronic records already scheduled for transfer

  • PeDALS is both a learning opportunity and a chance to implement a functioning system


Goals of the Project

  • Develop a methodology to support an automated, integrated workflow to process collections of electronic records

  • Implement an inexpensive storage system that can preserve the integrity and authenticity of electronic records over time

  • Remove barriers to adoption by keeping costs of the system as low as possible

  • Work with the Wisconsin Document Depository Program to develop ways to integrate digital-format state agency publications into PeDALS processes; since 2005 the Depository has worked to preserve e-publications acquired from state websites


Microsoft BizTalk Overview

BizTalk is a middleware application that, at its core, is an XML message queue. It will:

Receive Objects → Convert & Perform Logic on Objects → Send Objects

Completed by BizTalk using XML
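
The receive, convert, and send flow above can be sketched in a few lines of Python. This is only an illustration of the message-passing idea, not BizTalk code: the function names and the `record`, `ProcessedRecord`, and `Title` element names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def receive(raw_xml: str) -> ET.Element:
    """Receive step: parse an incoming XML message."""
    return ET.fromstring(raw_xml)


def process(message: ET.Element) -> ET.Element:
    """Convert/logic step: build a new message from the incoming one."""
    out = ET.Element("ProcessedRecord")  # hypothetical output element
    out.set("series", message.get("series", "unknown"))
    title = ET.SubElement(out, "Title")
    title.text = (message.findtext("title") or "").strip()
    return out


def send(message: ET.Element) -> str:
    """Send step: serialize the outgoing message as XML text."""
    return ET.tostring(message, encoding="unicode")


incoming = '<record series="1234"><title>  Annual Report </title></record>'
outgoing = send(process(receive(incoming)))
print(outgoing)
```

Each step only sees XML, which is what lets the middle step change without touching the systems on either side.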


BizTalk Pipelines

Pipelines

  • Connections between systems

    • Connect BizTalk to databases

    • Connect BizTalk to web

    • Connect BizTalk to file servers

    • Connect BizTalk to programs


BizTalk Business Rules

Business rules

  • BizTalk-speak for high-level processes that determine which orchestrations will be performed

    • If a record series is confidential or restricted, then go to the orchestration that populates restrictions
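
A rule like the one above amounts to a routing decision. The sketch below shows that decision in Python; it does not use the BizTalk Business Rules Engine, and the `RecordSeries` class, the access values, and the step names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RecordSeries:
    series_id: str
    access: str  # assumed values: "open", "confidential", "restricted"


def choose_orchestrations(series: RecordSeries) -> List[str]:
    """Return the names of the orchestrations to run for a record series."""
    steps = ["standard_processing"]  # hypothetical default step
    if series.access in {"confidential", "restricted"}:
        # The rule from the slide: route to the restrictions orchestration.
        steps.append("populate_restrictions")
    return steps


print(choose_orchestrations(RecordSeries("1234", "restricted")))
print(choose_orchestrations(RecordSeries("5678", "open")))
```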


BizTalk Orchestrations

Orchestrations

  • BizTalk-speak for the logic to process objects

    • Build in logic to calculate the length of restrictions and the database fields to populate
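
As a rough illustration of the restriction logic described above, the following Python sketch computes an end date from a creation date and a restriction period. The field names and the 75-year period are assumptions for the example, not values taken from the project.

```python
from datetime import date


def restriction_fields(created: date, restriction_years: int) -> dict:
    """Compute hypothetical database fields for a restricted record."""
    try:
        end = created.replace(year=created.year + restriction_years)
    except ValueError:
        # A 29 February start date has no match in a non-leap end year.
        end = created.replace(year=created.year + restriction_years, day=28)
    return {
        "restricted_from": created.isoformat(),
        "restricted_until": end.isoformat(),
        "restriction_years": restriction_years,
    }


print(restriction_fields(date(2008, 9, 1), 75))
```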


Initial BizTalk Development Goals & Objectives

1 – Write ARCAT BizTalk Code pipeline

  • Series already cataloged

  • Reduced duplication of work & manual data entry

  • Pipeline will work for CGI/BIN Web Service

  • Copy programming code to create next pipelines

2 – Write Web Services BizTalk Code pipeline

  • Copied from CGI/BIN ARCAT Service pipeline

  • Generic HTTP pipeline to Agencies Web Pages

  • Can use for PeDALS “Drop Box”


Initial BizTalk Development Goals & Objectives

3 – Write DHS BizTalk Code pipeline

  • Code copied from prior pipelines

  • Connect to a database

  • Solve issues related to external networks

4 – Write DWD BizTalk Code pipeline

  • Connect to a file server

  • Issues related to external networks should be solved, but may be different for file server connection


Initial BizTalk Development Goals & Objectives

5 – Write an orchestration that calls JHOVE, MetaExtractor, or C# code in BizTalk to wrap records with preservation metadata

  • Once we can receive records through pipelines

  • Create logic to perform in BizTalk

  • Wrap records in XML with preservation metadata

  • First, execute a third-party open source program such as JHOVE or MetaExtractor

  • Second, write code that interacts with programs written in languages such as C#
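
To make "wrapping a record in XML with preservation metadata" concrete, here is a minimal Python sketch. It is not the output format of JHOVE or MetaExtractor: the element names (`PreservedRecord`, `PreservationMetadata`, and so on) are invented for illustration, and the metadata is limited to what the standard library can compute.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def wrap_record(filename: str, content: bytes) -> str:
    """Wrap raw record bytes in an XML envelope with basic metadata."""
    wrapper = ET.Element("PreservedRecord")  # hypothetical envelope element
    meta = ET.SubElement(wrapper, "PreservationMetadata")
    ET.SubElement(meta, "FileName").text = filename
    ET.SubElement(meta, "SizeBytes").text = str(len(content))
    checksum = ET.SubElement(meta, "Checksum", algorithm="SHA-256")
    checksum.text = hashlib.sha256(content).hexdigest()
    # Base64 keeps arbitrary bytes safe inside an XML text node.
    payload = ET.SubElement(wrapper, "Content", encoding="base64")
    payload.text = base64.b64encode(content).decode("ascii")
    return ET.tostring(wrapper, encoding="unicode")


wrapped = wrap_record("minutes.txt", b"hello")
print(wrapped)
```

Storing a checksum next to the content gives a later process a way to confirm the record has not changed since it was wrapped.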


Measurement of Success

1 – Ability to extract MARC records from ARCAT and insert into database

2 – Ability to create external web services pipeline to transfer records to WHS

3 – Ability to create external file pipeline to DHS Quest Archives Manager to transfer records to WHS

4 – Ability to create external file pipeline to DWD to transfer records to WHS

5 – Ability to wrap electronic records with preservation metadata inside of BizTalk


Process to Write Code

Iterative Process to:

1) Write BizTalk programming code

2) Test BizTalk programming code

3) Revise BizTalk programming code

4) Retest BizTalk programming code


Pre-BizTalk Training Development Plans

Initial Thoughts on How I Would Get Objects into BizTalk (pre-September 2008)

Initially, PeDALS was to use FTP to receive electronic records

  • Authentication, integrity, security, and user-friendliness issues

  • Now a generic “Drop Box” (probably a Web service)

Initial Knowledge of BizTalk

  • A middleware application which at its core is an XML Message Queue

  • Uses XML to complete the connections to and from external applications

Needed automated processes to provide BizTalk with XML objects


Pre-BizTalk Training Development Plans

Use of Third Party Open Source Code to convert/wrap in XML:

MARC21 to MARCXML Converter: http://www.loc.gov/standards/marcxml/

MarcEdit: http://oregonstate.edu/~reeset/marcedit/html/index.php

JHOVE: http://hul.harvard.edu/jhove/

MetaExtractor: http://meta-extractor.sourceforge.net/


Pre-BizTalk Training Development Plans

MARC21 to MARCXML Converter: http://www.loc.gov/standards/marcxml/

The MARCXML toolkit is a set of Java programs which allow users to convert to and from the MARC file format (including full character set conversion) and other formats available in the MARCXML architecture. The toolkit requires Java and works best with Java 1.4. If using an earlier version of Java, you need to modify the marcxml.bat file to include an XML parser in the classpath. Unzip the marcxml.zip file in a directory and run marcxml.bat for more instructions. Make sure Java is in your PATH. In this version the stylesheets and character conversion mappings are downloaded via HTTP from LC's website; therefore, Internet access is required when using these utilities.


Pre-BizTalk Training Development Plans

MarcEdit: http://oregonstate.edu/~reeset/marcedit/html/index.php

MarcEdit is a MARC editing tool with a native Z39.50 client and automatic batch conversions to/from:

  • Comma/Tab Delimited Files

  • Dublin Core

  • EAD

  • MARC

  • OAI

  • XML


Pre-BizTalk Training Development Plans

JHOVE: http://hul.harvard.edu/jhove/

JHOVE provides functions to perform format-specific identification, validation, and characterization of digital objects.

Format identification is the process of determining the format to which a digital object conforms; in other words, it answers the question: "I have a digital object; what format is it?"

Format validation is the process of determining the level of compliance of a digital object to the specification for its purported format, e.g.: "I have an object purportedly of format F; is it?" Format validation conformance is determined at two levels: well-formedness and validity.

  • A digital object is well-formed if it meets the purely syntactic requirements for its format.

  • An object is valid if it is well-formed and it meets additional semantic-level requirements.
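
The two conformance levels can be illustrated with XML itself. The Python sketch below is not JHOVE and does not call it; it uses the standard library parser for the syntactic check and a single toy rule (an expected root element, a hypothetical requirement) to stand in for semantic validation.

```python
import xml.etree.ElementTree as ET


def check_xml(text: str, required_root: str) -> str:
    """Classify XML text as not well-formed, well-formed, or valid."""
    try:
        root = ET.fromstring(text)  # syntactic check: does it parse?
    except ET.ParseError:
        return "not well-formed"
    if root.tag != required_root:  # toy semantic rule for this example
        return "well-formed but not valid"
    return "well-formed and valid"


print(check_xml("<record><title></record>", "record"))
print(check_xml("<note/>", "record"))
print(check_xml("<record><title/></record>", "record"))
```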


Pre-BizTalk Training Development Plans

MetaExtractor: http://meta-extractor.sourceforge.net/

The Metadata Extraction Tool was developed by the National Library of New Zealand to programmatically extract preservation metadata from a range of file formats:

  • Images: BMP, GIF, JPEG and TIFF

  • Office documents: MS Word (version 2, 6), Word Perfect, Open Office (version 1), MS Works, MS Excel, MS PowerPoint, and PDF

  • Audio and Video: WAV and MP3

  • Markup languages: HTML and XML

The Metadata Extraction Tool:

  • Automatically extracts preservation-related metadata from digital files

  • Outputs that metadata in a standard format (XML) for use in preservation activities

  • The Tool was designed for preservation processes and activities, but can be used for other tasks, such as the extraction of metadata for resource discovery


Pre-BizTalk Training Development Plans

MarcEdit & ARCAT MARC Catalog Records:

1) Use Z39.50 gateway to retrieve records as .mrc files

2) Use MarcEdit to convert .mrc files to XML

3) BizTalk receives XML files

4) BizTalk performs logic

5) BizTalk inserts/updates SQL Server database
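
Steps 3 through 5 can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's implementation: SQLite stands in for SQL Server, the `series` table and its columns are hypothetical, and the sample record is invented. The namespace is the standard MARCXML one; field 001 is the control number and 245 subfield a is the title.

```python
import sqlite3
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Invented sample record for the example.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">series-0001</controlfield>
    <datafield tag="245" ind1="0" ind2="0">
      <subfield code="a">Example record series</subfield>
    </datafield>
  </record>
</collection>"""


def extract_series(marcxml: str) -> list:
    """Pull (control number, title) pairs out of a MARCXML collection."""
    root = ET.fromstring(marcxml)
    rows = []
    for rec in root.findall("marc:record", MARC_NS):
        control = rec.findtext(
            "marc:controlfield[@tag='001']", default="", namespaces=MARC_NS)
        title = rec.findtext(
            "marc:datafield[@tag='245']/marc:subfield[@code='a']",
            default="", namespaces=MARC_NS)
        rows.append((control, title))
    return rows


def upsert(conn: sqlite3.Connection, rows: list) -> None:
    """Insert new rows and replace rows whose control number already exists."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS series "
        "(control_no TEXT PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT OR REPLACE INTO series (control_no, title) VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
upsert(conn, extract_series(SAMPLE))
upsert(conn, extract_series(SAMPLE))  # a second run updates, not duplicates
stored = conn.execute("SELECT control_no, title FROM series").fetchall()
print(stored)
```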


Post-September BizTalk Training Development Plans

Pipelines can connect directly to:

  • Web services like ARCAT or OCLC or even HTTP

  • File servers like at DWD

  • Databases like DHS Quest Archives Manager

Orchestrations can:

  • Call other orchestrations

  • Call other executable programs

  • Call other applications written in various software languages (C# or Java)
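
Calling an external executable from a processing step generally means launching it and capturing its output. The Python sketch below shows that pattern only; it does not invoke JHOVE or MetaExtractor. The Python interpreter itself is used as a stand-in program, and `run_tool` is a hypothetical helper name.

```python
import subprocess
import sys


def run_tool(command: list) -> str:
    """Run an external program and return what it wrote to standard output."""
    completed = subprocess.run(
        command, capture_output=True, text=True, check=True)
    return completed.stdout


# Stand-in for a real metadata tool: a tiny program that prints XML.
stand_in = [sys.executable, "-c", "print('<metadata/>')"]
output = run_tool(stand_in)
print(output.strip())
```

With `check=True`, a non-zero exit code raises an exception, so a failed tool run is not mistaken for empty metadata.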


Post-BizTalk Training Development Plans

ARCAT MARC Catalog Records:

1) Create pipeline

  • From ARCAT

  • To PeDALS Database

2) Create a search page to enter variables or a list of series to retrieve from ARCAT

  • Automates the process

  • Decreases the manual labor needed compared to using MarcEdit

  • Reduces duplication of work


Post-BizTalk Training Development Plans

ARCAT MARC Catalog Records:

3) Create Orchestration

- To automatically map data from MARC to PeDALS database

- To execute MarcEdit (if necessary)

- That will insert or update PeDALS database

- Then export from PeDALS database to ARCAT, file, or OCLC


Possible Involvements (After Initial Development)

  • State Archivist: Peter Gottlieb

    • Ultimate sign off on development

  • Collection Development Archivist: Helmut Knies

    • Initial sign off on development

  • Electronic Records Archivist: Dennis Bitterlich

    • Programming, testing, & verification

  • Public Records Accessioner: Abbie Norderhaug

    • Testing & verification

  • Head of Cataloging & Collections Mgmt Services: Maija Cravens

    • Policies & procedures


Possible Involvements (After Initial Development)

  • Archivist: Jacquelyn Ferry

    • Policies & procedures

    • Testing & verification

  • Information Technology Director: Paul Hedges

    • Hardware, networks, & security

  • WI State Government Publications Librarian: Nancy Knies

    • State publications to store in LOCKSS

  • DHS Records Officer: Steve Bose

    • Transfer of records

  • DHS IT: Jovy Swanton

    • Hardware, network, programming, & security


Possible Involvements (After Initial Development)

  • DPI WDDP: Abby Swanton

    • State publications to store in LOCKSS

  • DWD Records Officer: Dawn Bluma

    • Transfer of records

  • DWD IT

    • Hardware, network, programming, & security

  • UW IT

    • Hardware, network, programming, & security


Thank You!

Collecting, Preserving and Sharing Stories Since 1846

