Using Scalable and Secure Web Technologies to Design Global Format Registry

Muluwork Geremew, Sangchul Song, and Joseph JaJa. Institute for Advanced Computer Science Studies, Department of ECE, University of Maryland. Sponsored by the Library of Congress and NSF.

Presentation Transcript


  1. Using Scalable and Secure Web Technologies to Design Global Format Registry • Muluwork Geremew, Sangchul Song, and Joseph JaJa • Institute for Advanced Computer Science Studies, Department of ECE, University of Maryland • Sponsored by the Library of Congress and NSF

  2. Motivation • Handling digital formats is an essential part of long-term preservation. • Format obsolescence • As technology evolves, obsolete systems and application software may leave users unable to access their old files. • Software developers may go out of business and stop supporting their applications. • Digital preservation requires • Different essential aspects of objects. • Tools for capturing the essential format characteristics of information stored as digital objects, and for processing it.

  3. Existing Methodologies • Standardization • Restricts digital content to a few common formats. • JPEG2000, OMF, and PDF/A are among the few selected open standard formats. • Migration • Transforms older versions to newer formats. • Tends to be costly and prone to errors. • Emulation • The original bit-streams are executed using an emulator. • Implementing such a strategy is extremely challenging and can be viewed as a transformation.

  4. Our Goal • A flexible framework for incorporating advances achieved through the existing approaches. • Development of an efficient, scalable and platform independent prototype to enable the tracking and handling of format obsolescence. • Development of a Global Digital Format Registry (GDFR) – FOrmat CUration Service (FOCUS) • Development of enabler modules that can interface between GDFR and end-user applications.

  5. FOCUS Architecture

  6. FOCUS on LDAP and SOAP • Interoperability • Both protocols are platform independent. • Performance • Most operations are read-only queries, and LDAP gives high performance in this environment. • Extensibility • The LDAP schema can be easily extended. • Scalability • Achieved through the use of distributed LDAP. • Security • SOAP can run on top of SSL (HTTPS). • An LDAP-based format registry can be easily integrated with any other LDAP-based authentication/authorization mechanism.
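
The read-only lookups mentioned on this slide can be pictured as LDAP search filters. The following is a minimal sketch, not the FOCUS implementation: the object class `gdfrFormat` and the attributes `formatExtension` and `formatMimeType` are hypothetical names chosen for illustration, since the slides do not show the registry schema. Only the escaping rules come from the LDAP filter syntax (RFC 4515).

```python
def escape_filter_value(value: str) -> str:
    """Escape the characters that RFC 4515 reserves inside a filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    # Escaping character by character avoids double-escaping the backslash.
    return "".join(replacements.get(ch, ch) for ch in value)


def format_lookup_filter(extension: str, mime_type: str) -> str:
    """Build an AND filter that matches one format entry.

    The object class and attribute names are hypothetical; the real
    registry schema is not given in the presentation.
    """
    return "(&(objectClass=gdfrFormat)(formatExtension={})(formatMimeType={}))".format(
        escape_filter_value(extension),
        escape_filter_value(mime_type),
    )


if __name__ == "__main__":
    print(format_lookup_filter("pdf", "application/pdf"))
```

A filter string like this could be passed to any LDAP client library; escaping user-supplied values first keeps a stray `*` or `)` from changing the meaning of the query.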

  7. Global Digital Format Registry • GDFR provides detailed information about formats. • Existing format registries: • UPenn’s FRED (http://tom.library.upenn.edu/fred) • PRONOM (http://www.nationalarchives.gov.uk/pronom/) • Wotsit’s Format (http://www.wotsit.org) • It is not clear how extensible or scalable these registries are, or how they can be interfaced with existing preservation systems.

  8. FOCUS [Diagram labels: Software; Web Service Agent; Global Digital Format Registry] • The registry contains information on • File formats • Software tools • FOCUS provides multiple ways to access the GDFR. • Directly, through the LDAP interface • Indirectly, through the SOAP interface
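
To make the indirect SOAP path concrete, here is a small sketch of how a client might build a SOAP 1.1 request body. The operation name `IdentifyFormat`, the parameter `fileName`, and the namespace `urn:example:focus` are assumptions made for illustration; the slides do not list the actual operations of the Web Service Agent.

```python
import xml.etree.ElementTree as ET
from typing import Dict

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
FOCUS_NS = "urn:example:focus"  # hypothetical namespace for the agent's operations


def build_request(operation: str, params: Dict[str, str]) -> bytes:
    """Wrap one operation call and its parameters in a SOAP 1.1 envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{FOCUS_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{FOCUS_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical operation and parameter names.
    print(build_request("IdentifyFormat", {"fileName": "report.pdf"}).decode("utf-8"))
```

The resulting bytes would be sent as the body of an HTTP POST to the agent's endpoint; sending it over HTTPS gives the transport-level security the presentation refers to.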

  9. GDFR: Internal Structure • Software tool entries • General descriptive properties. • Processing: formats taken as input and/or output. • File format entries • General descriptive properties. • Processing: rendering, editing, conversion, and validation services/systems.

  10. Web-Service Agent [Diagram labels: Format Inquiry Client; Web Service Agent; Global Digital Format Registry] • Mediator between user and registry • Serviced via SOAP • Contains a file format identifier module, FIDER • Java module for format identification • Uses the file's magic number • Tries signatures sequentially, from restrictive to general
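
The magic-number approach can be sketched in a few lines. This is an illustration in Python rather than FIDER's Java code, and the signature table is a small hand-picked sample, not the registry's contents. It reads "restrictive to general" as trying the most specific (longest) signatures first, which is an interpretation of the slide.

```python
from typing import List, Optional, Tuple

# Sample signatures only; a real registry would hold many more.
# Listed from most restrictive to most general, so that a specific
# match (a PDF version) wins over a broader one (any PDF).
SIGNATURES: List[Tuple[bytes, str]] = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"%PDF-1.4", "PDF 1.4"),
    (b"GIF89a", "GIF 89a"),
    (b"GIF87a", "GIF 87a"),
    (b"%PDF-", "PDF (version not matched)"),
    (b"PK\x03\x04", "ZIP container"),
    (b"\xff\xd8\xff", "JPEG"),
]


def identify(data: bytes) -> Optional[str]:
    """Return the label of the first signature that prefixes the data."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return None


if __name__ == "__main__":
    print(identify(b"%PDF-1.4\n..."))
    print(identify(b"%PDF-1.7\n..."))
    print(identify(b"plain text"))
```

Ordering matters here: if the general `%PDF-` entry came first, the version-specific entry would never be reached.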

  11. Web-Service Agent • Tailorability • The Web Service can be custom-tailored to meet the specific needs of an existing preservation system. • Interoperability • Independent of operating system and programming language. • Convenience • Multiple LDAP queries can be reduced to one Web Service function call. • Updates can be made in a single place, with no need to distribute new modules to end users.
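
The convenience point, several registry lookups hidden behind one call, might look like the sketch below. In-memory dictionaries stand in for the LDAP directory, and every identifier, field name, and URL is hypothetical.

```python
from typing import Dict, List

# Stand-ins for two branches of the registry (all contents hypothetical).
FORMATS: Dict[str, dict] = {
    "fmt-pdf": {"name": "PDF", "validators": ["sw-1", "sw-2"]},
}
SOFTWARE: Dict[str, dict] = {
    "sw-1": {"name": "ExampleValidator", "location": "https://validator.example.org/a"},
    "sw-2": {"name": "OtherValidator", "location": "https://validator.example.org/b"},
}


def find_validators(format_id: str) -> List[dict]:
    """One agent call that performs the individual lookups for the client."""
    format_entry = FORMATS[format_id]  # lookup 1: the format entry
    # lookups 2..n: one per software entry referenced by the format
    return [SOFTWARE[sw_id] for sw_id in format_entry["validators"]]


if __name__ == "__main__":
    for tool in find_validators("fmt-pdf"):
        print(tool["name"], tool["location"])
```

Because the chaining lives in the agent, a change to how formats reference tools only needs to be made there, not in every client.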

  12. FOCUS: Supplementary Tools • Validation Software • Verifies and validates the file format of a given file. • Rendering Software • Interprets the bit streams of files into a human-friendly representation on the screen. • Editing Software • Adds, deletes, or modifies the contents of a given file while keeping the file format correct. • Conversion Software • Converts a file to current or emerging formats.

  13. FOCUS Service Model [Diagram labels: Web Service Agent; Format Registry] • Identification Service: identifies the format of a specific digital object (DO) using its internal signature. • Validation Software: determines a verification service to verify the format of a specific DO. • Conversion Software: locates transformation services to convert a DO from its source format to the format of interest. • Rendering Software: identifies current rendering conditions for a specific digital format.

  14. Example Scenario: Digital Object Format Verification [Diagram labels: Web Service Agent; Format Registry; ID Service; Validation Service; Conversion Service; Rendering Service. Messages: Format?; Format ID / Format Info; Verifier?; App ID / App Info; Verify this?; Valid/Well-formed] • Step 1: The user asks the Web Service to identify the format of a file. • Step 2: The registry returns the format ID and format information. • Step 3: The user requests information on the verifiers available for this format. • Step 4: The registry returns the validation service ID and information, such as its service location. • Step 5: The user connects to the validation service and asks it to verify the format. • Step 6: The validation service returns the verification result.
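
The six steps of this scenario can be traced in a toy client. Everything here is a stand-in: the registry tables, the service ID and URL, and the PDF check (which only looks for the header and the `%%EOF` trailer) are assumptions for illustration, not the behavior of the actual FOCUS services.

```python
from typing import Dict, Optional, Tuple

# Hypothetical registry contents.
FORMAT_SIGNATURES: Dict[bytes, str] = {b"%PDF-": "fmt-pdf"}
FORMAT_INFO: Dict[str, dict] = {"fmt-pdf": {"name": "PDF"}}
VERIFIERS: Dict[str, dict] = {
    "fmt-pdf": {"id": "svc-pdf-check", "location": "https://validator.example.org/pdf"},
}


def identify(data: bytes) -> Optional[Tuple[str, dict]]:
    """Steps 1-2: return the format ID and format information, if known."""
    for magic, format_id in FORMAT_SIGNATURES.items():
        if data.startswith(magic):
            return format_id, FORMAT_INFO[format_id]
    return None


def find_verifier(format_id: str) -> Optional[dict]:
    """Steps 3-4: return the validation service ID and location."""
    return VERIFIERS.get(format_id)


def validate_pdf(data: bytes) -> str:
    """Steps 5-6: toy stand-in for a remote validation service."""
    return "well-formed" if data.rstrip().endswith(b"%%EOF") else "not well-formed"


def verify_object(data: bytes) -> str:
    identified = identify(data)
    if identified is None:
        return "unknown format"
    format_id, _info = identified
    verifier = find_verifier(format_id)
    if verifier is None:
        return "no verifier registered"
    # A real client would connect to verifier["location"]; this sketch
    # calls the local stand-in instead.
    return validate_pdf(data)


if __name__ == "__main__":
    print(verify_object(b"%PDF-1.4\n...\n%%EOF\n"))
```

The client only ever learns the verifier's location from the registry, which is what lets validation services be added or moved without changing client code.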

  15. Demo

  16. Conclusion • The FOCUS design offers maximum • Flexibility: the Web Service Agent can be easily tailored to meet the various needs of different preservation institutions. • Scalability: the format registry can also be distributed. • FOCUS integrates current format preservation techniques and makes them available through a SOAP-based web interface. • In summary, we believe that the FOCUS prototype represents a significant advance towards the development of a secure and scalable digital format registry.