Policy-Based Data Management integrated Rule Oriented Data System

Preservation is a Stage in the Data Life Cycle. Project Collection / Private / Local Policy. Data Grid / Shared / Distribution Policy. Digital Library / Published / Description Policy. Data Processing Pipeline / Analyzed / Service Policy. Reference Collection / Preserved / Representation Policy. Federa



    1. Policy-Based Data Management integrated Rule Oriented Data System
       - Reagan Moore, rwmoore@renci.org
       - Arcot Rajasekar, sekar@diceresearch.org
       - Mike Wan, mwan@diceresearch.org

    2. Preservation is a Stage in the Data Life Cycle

    3. Policy-based Preservation Environment
       - Purpose: reason a preservation environment is assembled
       - Properties: attributes needed to ensure the purpose
       - Policies: controls for ensuring maintenance of properties
       - Procedures: functions that implement the policies
       - State information: results of applying the procedures
       - Assessment criteria: validation that state information conforms to the desired purpose
       - Federation: controlled sharing of logical name spaces
       These are the necessary elements for a preservation environment.

    4. iRODS - Policy-based Data Management
       - Turn policies into computer actionable rules
         - Compose rules by chaining standard operations
         - Standard operations (micro-services) executed at the remote storage location
       - Manage state information as attributes on namespaces
         - Files / collections / users / resources / rules
       - Validate assessment criteria
         - Queries on state information, parsing of audit trails
       - Automate administrative functions
         - Minimize labor costs

    5. Policy-based Preservation - Authenticity
       - Purpose: maintain authenticity of records
       - Properties: define template for required representation information
       - Policies: extract and register representation information for each file on ingestion
       - Procedures: parse record / XML file to extract metadata
       - State information: register representation information into metadata catalog
       - Assessment criteria: compare registered metadata with template defining required values
       A preservation environment should automate each of these steps.
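
The authenticity steps on this slide can be sketched in a few lines of Python. This is only an illustration, not iRODS code: the template `REQUIRED_FIELDS`, the helpers `extract_metadata`, `ingest`, and `assess`, the dict standing in for the metadata catalog, and the sample records are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical template: representation information every record must carry.
REQUIRED_FIELDS = {"title", "creator", "date", "format"}


def extract_metadata(xml_text):
    """Procedure: parse an XML record and return its child elements as a dict."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


def ingest(catalog, name, xml_text):
    """Policy: on ingestion, extract and register metadata for the file."""
    catalog[name] = extract_metadata(xml_text)


def assess(catalog, required=REQUIRED_FIELDS):
    """Assessment: list, per record, the required fields that are missing or empty."""
    report = {}
    for name, metadata in catalog.items():
        missing = sorted(f for f in required if not metadata.get(f))
        if missing:
            report[name] = missing
    return report


# The dict below is a stand-in for a metadata catalog (state information).
catalog = {}
ingest(
    catalog,
    "record-001.xml",
    "<record><title>Annual report</title><creator>Records office</creator>"
    "<date>2009-01-15</date><format>PDF</format></record>",
)
ingest(
    catalog,
    "record-002.xml",
    "<record><title>Memo</title><date>2009-02-01</date></record>",
)

non_compliant = assess(catalog)
print(non_compliant)
```

Here the second record lacks two template fields, so the assessment flags it while the first record passes.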

    6. Assessment Criteria
       - NARA Electronic Records Archive capabilities list
         - 853 defined capabilities
         - Mapped to 174 computer actionable rules
         - Mapped to 212 state information attributes
       - RLG/NARA Trusted Repository Audit Checklist
         - Mapped to 105 computer actionable rules
         - Included 66 rules specific to preservation
       - ISO Mission Operations Information Management System repository audit checklist
         - 106 policies for operation and control
         - Mapped to 52 computer actionable rules

    7. Examples of Assessment Criteria
       - Specify
         - a template that governs the representation information required for a specific record series
         - content of a Submission Information Package (SIP)
         - content of an Archival Information Package (AIP)
         - number of replicas
       - Verify
         - compliance of SIP with specification
         - compliance of AIP with specification
         - compliance with required replica number
         - integrity of the replicas

    8. Preservation Communities
       - NARA Transcontinental Persistent Archive Prototype
         - Develop policies to automate preservation of selected digital holdings
       - National Optical Astronomy Observatory
         - Accession images from a telescope in Chile
       - Carolina Digital Repository
         - Preserve institutional collections

    9. National Archives and Records Administration Transcontinental Persistent Archive Prototype

    10. NOAO Zone Architecture

    12. Preservation Concepts
        - Preservation environments are inherently distributed and federated
          - Mitigate risk of data loss
          - Mitigate dependence on a single vendor
          - Mitigate dependence on a single institution
        - Management of technology evolution can be done through the same mechanisms that support interoperability across heterogeneous storage systems
          - At the point in time when new technology is added, both the old and new technologies are present
          - Migrate from old protocols to new protocols using data grids

    13. Preservation Concepts (Cont.)
        - Preservation requires management of communication with the future
          - Need to migrate records to future technology
          - Need procedural infrastructure independence to ensure that data formats can be parsed in the future
        - Preservation requires management of communication from the past
          - Need to know what policies and procedures were applied by prior archivists
          - Need to validate that policies were enforced
        - Federation minimizes risk of data loss
        - Deep archive implemented through rules that
          - turn on data staging, data versioning, replication
          - turn off deletion, external write, external data grid access

    14. Preservation Concepts (Cont.)
        - Periodic verification of assessment criteria
          - Check that required properties still hold
          - These rules are in addition to the rules that enforce policies
        - Compare values in metadata catalog with expected values
          - Number of replicas, checksums of files, required metadata
        - Verify relationships between files in storage and entries in metadata catalog
          - Metadata record <----> files in storage
        - Parse audit trails to track compliance over time
          - Evaluate impact of changing preservation policy
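
A periodic verification pass of the kind listed on this slide might look like the Python sketch below. It is an illustration under assumptions, not iRODS functionality: `storage` and `catalog` are in-memory stand-ins for storage resources and the metadata catalog, and `REQUIRED_REPLICAS`, `sha256`, and `verify` are hypothetical names. SHA-256 is chosen only for the example.

```python
import hashlib


def sha256(data):
    """Return the hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-ins: two storage resources and a metadata catalog.
storage = {
    "resA": {"report.pdf": b"annual report", "memo.txt": b"memo"},
    "resB": {"report.pdf": b"annual report", "memo.txt": b"mem0"},  # corrupted copy
}
catalog = {
    "report.pdf": {"checksum": sha256(b"annual report"), "replicas": ["resA", "resB"]},
    "memo.txt": {"checksum": sha256(b"memo"), "replicas": ["resA", "resB"]},
    "photo.tif": {"checksum": sha256(b"image"), "replicas": ["resA"]},  # not in storage
}
REQUIRED_REPLICAS = 2  # assumed policy value


def verify(catalog, storage, required_replicas):
    """Compare catalog entries with storage and return a list of findings."""
    findings = []
    for name, entry in sorted(catalog.items()):
        good = 0
        for resource in entry["replicas"]:
            data = storage.get(resource, {}).get(name)
            if data is None:
                findings.append((name, resource, "missing from storage"))
            elif sha256(data) != entry["checksum"]:
                findings.append((name, resource, "checksum mismatch"))
            else:
                good += 1
        if good < required_replicas:
            findings.append((name, None, f"only {good} valid replica(s)"))
    return findings


findings = verify(catalog, storage, REQUIRED_REPLICAS)
for finding in findings:
    print(finding)
```

Running such a check on a schedule and keeping the findings would give the compliance history that the audit-trail bullet refers to.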

    16. Managing Properties of Records
        - Namespaces
          - Record (file name)
          - Users
          - Storage resources
          - Rules
        - State information
          - User-defined metadata (provenance)
          - System attributes
        - Procedures
          - Basic operations performed on data
          - Store, retrieve, move, copy, replicate, parse, aggregate
          - Extract metadata, checksum, synchronize, version

    17. Migration of Procedures

    18. Format of an iRODS Rule
        Action | Condition | MS1, …, MSn | RMS1, …, RMSn
        - Action: name of action to be performed
          - Name known to the server and invoked by server
        - Condition: condition under which the rule applies
        - Micro-services: chain of micro-services to be executed
        - Recovery micro-services: if any micro-service fails, recovery micro-service(s) are executed to maintain transactional consistency
        - Example of MS/RMS
          - MS createFile(*F), RMS removeFile(*F)
          - MS ingestMetadata(*F,*M), RMS rollback
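
The pairing of micro-services with recovery micro-services can be illustrated with a small Python sketch. This is not the iRODS rule engine or its rule syntax. The function `apply_rule`, the context dict, and the Python versions of the slide's createFile / removeFile / ingestMetadata / rollback pairs are hypothetical. The recovery order (the failing step's recovery first, then earlier steps in reverse) is an assumption made for the example.

```python
def create_file(ctx):
    ctx["files"].add(ctx["name"])


def remove_file(ctx):
    ctx["files"].discard(ctx["name"])


def ingest_metadata(ctx):
    if not ctx["metadata"]:
        raise ValueError("no metadata supplied")
    ctx["catalog"][ctx["name"]] = ctx["metadata"]


def rollback(ctx):
    ctx["catalog"].pop(ctx["name"], None)


def apply_rule(condition, steps, ctx):
    """Run (micro-service, recovery) pairs in order; on failure, undo in reverse."""
    if not condition(ctx):
        return "skipped"
    attempted = []
    for micro_service, recovery in steps:
        attempted.append(recovery)
        try:
            micro_service(ctx)
        except Exception:
            # Assumed order: recover the failing step, then earlier steps, last first.
            for undo in reversed(attempted):
                undo(ctx)
            return "recovered"
    return "done"


# Action | Condition | MS1, MS2 | RMS1, RMS2, expressed as Python values.
steps = [(create_file, remove_file), (ingest_metadata, rollback)]


def condition(ctx):
    return ctx["name"].endswith(".xml")


good = {"name": "a.xml", "metadata": {"title": "A"}, "files": set(), "catalog": {}}
bad = {"name": "b.xml", "metadata": {}, "files": set(), "catalog": {}}
other = {"name": "c.txt", "metadata": {}, "files": set(), "catalog": {}}

result_good = apply_rule(condition, steps, good)
result_bad = apply_rule(condition, steps, bad)
result_other = apply_rule(condition, steps, other)
print(result_good, result_bad, result_other)
```

In the failing case the file created by the first step is removed again, so no file is left without its metadata.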

    19. iRODS - Distributed Operating System

    20. iRODS is a "coordinated NSF/OCI-Nat'l Archives research activity" under the auspices of the President's NITRD Program and is identified as among the priorities underlying the President's 2009 Budget Supplement in the area of Human and Computer Interaction Information Management technology research.
        - Reagan W. Moore, rwmoore@renci.org
        - http://irods.diceresearch.org
