
WMS, RUcore and Fedora Mini-Conference


ilya

Presentation Transcript


  1. WMS, RUcore and Fedora Mini-Conference • Wednesday Morning • Greetings and Introduction – Grace • Collaboration and Architecture Overview – Ron • RUcore Data Model – Grace • WMS Tutorial – Mary Beth, Kalaivani, Sharon • Lunch (box lunch in conference room) • Wednesday Afternoon • Hands-On Experience – Mary Beth, Kalaivani, Sharon • Feedback from WMS sessions • Collaboration Discussion – All

  2. WMS, RUcore and Fedora Mini-Conference • Thursday Morning • Brief Recap – Ron • WMS architecture – Yang • User Interface, Search engine and collections – Chad • Management services – Ron • Lunch (on your own) • Thursday Afternoon • Further collaboration discussion • Wrap-up and next steps

  3. Possible Areas for Collaboration • Data: Registries, File formats, Content Models • Software Development: Requirements, Sharing software, Joint development, Life cycle support • Sharing Content: Exchange, harvesting; Federated Searching • Fedora Experimentation: Relationship services, Directory ingest, Use of XACML, Very large files, Event management

  4. Fedora Enterprise Architecture Major Goals – 2007 thru 2009 • Paradigm Focus • Scholarly Communication Collaboration • Libraries and Museums Access and Publishing • Infinite Scalability • Size of and number of objects • Capacity and throughput (e.g. ingest 20TB a day) • Life cycle preservation • Trust Model • Transactions - Begin/Commit • Transactions across repositories • Enable graph based objects (compound objects)

  5. Layered Architecture (diagram labels) • Applications • Middleware • App. Prog. Interface • Repository • Persistence and Data

  6. Layered Architecture - RUcore • Applications and Portals (NJDH, RUcore, workflow, etc.) • Middleware Services (searching, alerting, integrity, etc.) • API • Fedora Core & Framework • FOXML & Datastreams

  7. RUcore - How it Works (diagram labels) • User Input • Metadata and Archival masters • XML • RUCORE Portal • NJ Digital Highway • Custom Portals • Dissertations • Faculty Submissions • User, Collection, & Preservation Services • Workflow Management System • Fedora Repository Service • Digital Object Ingest • Digital Object Repository (Fedora)

  8. Simple and Compound Objects • Article Object (Simple) • Persistent ID • Metadata • Behaviors (Disseminators) • Data streams: SMAP1 – StrMap (TOC); DJVU1 – presentation; PDF1 – presentation; XML1 – OCR text; ARCH1 – Archival master (tiffs of each page) • Compound Object - Graph Model • article, with objects A1 and A2 each linked to it by IsAnnotationOf

  9. Collections In RUcore • A digital collection is simply a grouping of objects according to some criteria. • Types of digital collections in RUcore • Explicit – A digital collection whose object membership is specified explicitly within the descriptive metadata. • Dynamic – A digital collection of objects which are grouped according to user specified criteria.
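The difference between explicit and dynamic collections can be sketched in a few lines of Python. This is only an illustration, not RUcore code: the `DigitalObject` class, the `member_of` field, the `coll:etd` identifier and the metadata keys are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    pid: str
    metadata: dict
    # Explicit membership: collection IDs named in the descriptive metadata
    # (hypothetical field name).
    member_of: set = field(default_factory=set)


def explicit_members(collection_id, objects):
    """Objects whose metadata names the collection directly."""
    return [o.pid for o in objects if collection_id in o.member_of]


def dynamic_members(criteria, objects):
    """Objects that satisfy a user-specified search criterion."""
    return [o.pid for o in objects if criteria(o)]


objects = [
    DigitalObject("rutgers-lib:1", {"type": "ETD", "department": "Physics"}, {"coll:etd"}),
    DigitalObject("rutgers-lib:2", {"type": "Preprint", "department": "Physics"}),
    DigitalObject("rutgers-lib:3", {"type": "ETD", "department": "History"}, {"coll:etd"}),
]

etd_collection = explicit_members("coll:etd", objects)
physics_collection = dynamic_members(
    lambda o: o.metadata.get("department") == "Physics", objects
)
```

An object can belong to both kinds of collection at once: `rutgers-lib:1` is an explicit member of the ETD collection and a dynamic member of the Physics grouping.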

  10. Using Explicit and Dynamic Collections • Personal Collections • Department Collections • Including Faculty Personal collections (e.g. preprints, reports, etc.) • ETDs for the Department • Centers and Grant Funded Research • New Jersey Digital Highway • Center for Remote Sensing and Spatial Analysis (CRSSA) – Access and preservation of GIS resources related to New Jersey

  11. RUcore Collection Architecture (diagram) • Legend: Circles – collection objects; Rectangles – content objects; Solid line – explicit membership; Dashed line – dynamic membership • Collection labels: RUCORE, NJDH (Grant Project), Rutgers University Libraries, Rutgers University Centers/Departments, New Jersey Historical Society, Eagleton Archive, Roosevelt, General Collections, Special Collections • Object labels: O1, O2, P1, P2, B1, N1, N2, N3, M1

  12. Collection Architecture - Lefty (diagram) • Collection labels: RUCORE, Princeton (1782.1), Penn State (1782.1), N’Western (1782.1), RUL (1782.1), Rutgers University (1782.2), ETDs (Graduate School), Department, Center/Dept Collections, RU ETDs, Dept. ETDs, FacColl One, FacColl Two • Object labels: D1, D2, D3 • http://hdl.rutgers.edu/1782.1/NorthwesternU.collection.165 • http://hdl.rutgers.edu/1782.1/PennStateUniv.collection.164 • http://hdl.rutgers.edu/1782.1/PrincetonUniv.collection.166 • Legend: Solid line – explicit membership; Dashed line – dynamic membership

  13. Management Services (incl. Collection and Preservation) • Management • Super-user editing (handles, datastreams, metadata) • Purging an object • Export (foxml, mets) • Collections • Collection administration • Statistics • Preservation • Creation of archival master • Creation of persistent ID (handle) • Checksum verification

  14. Management Services • Access to individual objects is provided by a special search portal using the same indexes as the public search but providing Fedora API management functionality: • Viewing, Exporting and/or purging objects • Editing metadata, adding/changing datastreams • Validating objects, checking audit trails, testing signatures • There is a special Fedora database search allowing access to all objects whether or not they are members of an active collection.

  15. Collection Administration • Edit collection information • Add parents to a collection • Add dynamic search terms to a collection • Generate an XML structure map

  16. Collections - Indexing and Ingest • Active Collections may be indexed individually or all together at any time, though this is typically done using a nightly cron job. • Ingest is done through the management API and is typically called by the WMS program, but may be called directly from the management interface as well.

  17. Preservation - Alerting • All Fedora API management functions trigger alerting messages, are stored in the Fedora audit trails, and are registered in the collection statistics database. • Statistics are kept for all object downloads as well as editing activities and may be accessed at collection or repository levels.

  18. Preservation – PIDs and Handles • Handles are normally created as part of the ingest process, but may be manually created, changed, or purged on a per object basis using the management interface. • Three global registries for RU • 1782.1 – Rutgers University Libraries • 1782.2 – Rutgers University • 1782.3 – NJ Digital Highway
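As a small illustration of how the three handle prefixes might be used, the sketch below builds and parses handle URLs. The resolver host `http://hdl.rutgers.edu` and the `prefix/local-name` layout are taken from the collection URLs on slide 12; the function names are hypothetical and this is not the management interface's actual code.

```python
# Handle prefixes listed on the slide, mapped to their registries.
HANDLE_PREFIXES = {
    "1782.1": "Rutgers University Libraries",
    "1782.2": "Rutgers University",
    "1782.3": "NJ Digital Highway",
}

# Resolver host as it appears in the slide 12 collection URLs.
RESOLVER = "http://hdl.rutgers.edu"


def handle_url(prefix, local_name):
    """Build a resolvable handle URL, rejecting unknown prefixes."""
    if prefix not in HANDLE_PREFIXES:
        raise ValueError(f"unknown handle prefix: {prefix}")
    return f"{RESOLVER}/{prefix}/{local_name}"


def parse_handle(url):
    """Split a handle URL into (prefix, local name, registry name)."""
    if not url.startswith(RESOLVER + "/"):
        raise ValueError(f"not a handle URL for {RESOLVER}: {url}")
    prefix, _, local_name = url[len(RESOLVER) + 1:].partition("/")
    return prefix, local_name, HANDLE_PREFIXES.get(prefix, "unknown")


example_url = handle_url("1782.1", "PrincetonUniv.collection.166")
example_parts = parse_handle(example_url)
```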

  19. Object Integrity – Verifying Checksums • Archival datastreams have SHA1 checksums, created during the WMS pipeline process, as well as filesize data, both stored in the technical metadata section of each object. • SHA1 checksums are tested using the sha1sum checking algorithm in conjunction with a management function that polls the repository and extracts sha1sum character strings from the techMD of individual objects or groups of objects. The function has a calendar feature that allows it to be run as a cron job on a subset of objects for each day of the week, with result reports emailed to the appropriate data managers.
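The fixity check described above could look roughly like the following Python sketch. It is an assumption-laden illustration rather than the RUcore management function: the recorded checksum and file size are passed in directly instead of being extracted from techMD, and `weekday_bucket` is a hypothetical way of spreading objects across the days of the week.

```python
import hashlib
import os
import tempfile


def sha1_of_file(path, chunk_size=1 << 20):
    """Stream the file so very large archival masters are not loaded whole."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(path, recorded_sha1, recorded_size):
    """Return a list of problems; an empty list means the datastream passed."""
    problems = []
    if os.path.getsize(path) != recorded_size:
        problems.append("size mismatch")
    if sha1_of_file(path) != recorded_sha1.lower():
        problems.append("sha1 mismatch")
    return problems


def weekday_bucket(pid):
    """Map a PID to a stable weekday (0-6) so each day checks one subset."""
    return int(hashlib.sha1(pid.encode("utf-8")).hexdigest(), 16) % 7


# Demo with a temporary file standing in for an archival datastream.
payload = b"archival master bytes"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name

recorded_sha1 = hashlib.sha1(payload).hexdigest()
ok_result = check_fixity(demo_path, recorded_sha1, len(payload))
bad_result = check_fixity(demo_path, "0" * 40, len(payload))
os.remove(demo_path)
```

A daily cron job could then process only the objects whose `weekday_bucket` equals the current weekday and email any non-empty problem lists to the data managers.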

  20. Certification as a Trusted Repository* • Ultimately, we want to become certified as a trusted repository. There are four major areas: • A. Organization – Repository staff have skills appropriate to their duties. • B. Repository Functions – Repository actively monitors Archival Information Package integrity. • C. Designated Community – Repository defines its Designated Community. • D. Technologies – Repository has technologies to monitor security. • * RLG/NARA draft “An Audit Checklist for the Certification of Trusted Digital Repositories”

  21. Preservation Services Architecture (diagram labels) • Preservation Portal • Preservation Services: Alerting, Migration, Monitoring, Statistics, . . . • Event Messaging • Preservation Integrity • Preservation Monitoring • Fedora Service Framework • Fedora Repository Service • Content Models • Format Registry • Digital Object Repository

  22. Content Models (Content Model Dissemination Architecture – CMDA) • The CM object specifies constraints on the digital object (DO) • MIME type and format • Min/max number of datastreams • Whether multiple datastreams are ordered • The CM is used to determine runtime behavior • On ingest, Fedora validates the DO based on CM constraints • Disseminators are not bound into the DO • Run-time binding occurs through the CM object and the rels-ext datastream • The CM can point to a format registry
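To make the constraint checking concrete, here is a minimal Python sketch of validating a digital object against content model rules on ingest. It is not Fedora's implementation: the `DsTypeModel` class and `validate` function are hypothetical, the rule fields mirror the MIME type, min/max and ordered constraints listed above, and the ordered flag is stored but not enforced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DsTypeModel:
    ds_id: str
    mime: str
    min_count: int = 1
    max_count: int = 1
    ordered: bool = False  # carried along, not checked in this sketch


def validate(datastreams, content_model):
    """Check (ds_id, mime) pairs against the content model; return error strings."""
    errors = []
    for rule in content_model:
        mimes = [mime for ds_id, mime in datastreams if ds_id == rule.ds_id]
        if not rule.min_count <= len(mimes) <= rule.max_count:
            errors.append(
                f"{rule.ds_id}: expected {rule.min_count}-{rule.max_count} "
                f"datastreams, found {len(mimes)}"
            )
        for mime in mimes:
            if mime != rule.mime:
                errors.append(f"{rule.ds_id}: MIME {mime} does not match {rule.mime}")
    return errors


# Hypothetical "book" content model with one PDF and one archival tar.
book_model = [
    DsTypeModel("PDF1", "application/pdf"),
    DsTypeModel("ARCH1", "application/tar"),
]

good_object = [("PDF1", "application/pdf"), ("ARCH1", "application/tar")]
bad_object = [("PDF1", "image/tiff")]

good_errors = validate(good_object, book_model)
bad_errors = validate(bad_object, book_model)
```

Here `bad_object` fails twice: its PDF1 datastream has the wrong MIME type and the required ARCH1 datastream is missing.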

  23. Content Models, Formats, and Disseminators (diagram) • Book Object: Persistent ID; Metadata; Rels-Ext (cmodel: book); Data streams: SMAP1 – StrMap (TOC); DJVU1 – presentation; PDF1 – presentation; XML1 – OCR text; ARCH1 – Archival master (tiffs of each page) • Content Model: Persistent ID; Metadata; Rels-Ext; Composite Model • Bdef Object: Persistent ID; Metadata; MethodMap • Bmech Object: Persistent ID; Metadata; Rels-Ext; WSDL • Relationships: hasCM, hasBdef, hasBmech • Format Registry: pdf, tar, tiff • Composite Model:
      <dsCompositeModel>
        <dsTypeModel ID="PDF1" ordered="false" min="1" max="1">
          <form MIME="application/pdf"></form>
        </dsTypeModel>
        <dsTypeModel ID="ARCH1" ordered="false" min="1" max="1">
          <form MIME="application/tar"></form>
        </dsTypeModel>
        . .
      </dsCompositeModel>

  24. Events and Outcomes • An event is an: • . . . action that involves at least one object, agent, and/or rights entity (PREMIS). • . . . occurrence that is significant to the performance of a task • Event outcome – a situation or state that follows an event and is a result of the event.

  25. Fedora Event Management • Generic Framework • Events can have messages which are associated with all types of services (preservation, collection, user, etc.) • Messages represent events with actions and outcomes • Fedora will provide a middleware messaging solution based on the open-source Java Message Service (JMS) • Fedora Working Group Focus • Preservation events are atomic (i.e. associated with a Fedora API) • The event message will be based on the PREMIS event entity • Initial types: ingest, delete, modify, fixityCheck

  26. The Event Message • Event message structure • The message payload will be XML-based and use the PREMIS event entity semantic units • Global identifiers (URIs) will be used for event type and outcome • An example might look like the following:
      <event>
        <eventIdentifier>
          <eventIdentifierType>Rucore event</eventIdentifierType>
          <eventIdentifierValue>30169</eventIdentifierValue>
        </eventIdentifier>
        <eventType>info:premis/preservation/event/ingest</eventType>
        <eventDateTime>2006-07-16T19:20:30</eventDateTime>
        <eventDetail>(to be used for general information)</eventDetail>
        <eventOutcomeInformation>
          <eventOutcome>info:premis/preservation/outcome/success</eventOutcome>
          <eventOutcomeDetail>(more text)</eventOutcomeDetail>
        </eventOutcomeInformation>
        <linkingAgentIdentifier>rutgers-lib:200</linkingAgentIdentifier>
        <linkingAgentIdentifier>rutgers-lib:400</linkingAgentIdentifier>
        <linkingObjectIdentifier>rutgers-lib:4291</linkingObjectIdentifier>
      </event>
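An event message with this shape could be generated programmatically. The Python sketch below uses the standard library's `xml.etree.ElementTree` and follows the element names of the slide's example, which is simpler than the full PREMIS schema; the `build_event` helper is a hypothetical name, not part of Fedora or RUcore.

```python
import xml.etree.ElementTree as ET


def build_event(event_id, event_type, when, outcome, agents, objects, detail=""):
    """Assemble an <event> element shaped like the slide's example."""
    event = ET.Element("event")
    identifier = ET.SubElement(event, "eventIdentifier")
    ET.SubElement(identifier, "eventIdentifierType").text = "Rucore event"
    ET.SubElement(identifier, "eventIdentifierValue").text = str(event_id)
    ET.SubElement(event, "eventType").text = event_type
    ET.SubElement(event, "eventDateTime").text = when
    ET.SubElement(event, "eventDetail").text = detail
    info = ET.SubElement(event, "eventOutcomeInformation")
    ET.SubElement(info, "eventOutcome").text = outcome
    for agent in agents:
        ET.SubElement(event, "linkingAgentIdentifier").text = agent
    for obj in objects:
        ET.SubElement(event, "linkingObjectIdentifier").text = obj
    return event


event = build_event(
    30169,
    "info:premis/preservation/event/ingest",
    "2006-07-16T19:20:30",
    "info:premis/preservation/outcome/success",
    agents=["rutgers-lib:200", "rutgers-lib:400"],
    objects=["rutgers-lib:4291"],
)

# Serialize, then parse again to confirm the payload is well-formed XML.
xml_text = ET.tostring(event, encoding="unicode")
parsed = ET.fromstring(xml_text)
```

Building the payload through an XML library rather than string concatenation guarantees matched closing tags and proper escaping of any free text placed in the detail fields.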

  27. Event Management - Ingest (Using the publisher/subscriber model) (diagram labels) • User Input • XML • Workflow Management System • Digital Object Ingest • Digital Object Repository (Fedora) • JMS Topic Queue: <eventType>ingest<>, <eventType>delete<>, <eventType>, <eventType>, <eventType> • JMS (snd/rcv) • JMS (snd/rcv) • JMS (snd/rcv) • Preservation Service (alerting) • Preservation Service (reporting)
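The publisher/subscriber flow in this diagram can be illustrated with a tiny in-memory topic. This Python sketch is a stand-in for the idea only and does not use the JMS API: the `Topic` class, the dictionary message layout and the two service functions are hypothetical.

```python
from collections import defaultdict


class Topic:
    """Tiny in-memory stand-in for a messaging topic (not the JMS API)."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback, event_types=None):
        # event_types=None means "receive every event type".
        self._subscribers.append((callback, event_types))

    def publish(self, message):
        # Deliver the message to every subscriber whose filter accepts it.
        for callback, event_types in self._subscribers:
            if event_types is None or message["eventType"] in event_types:
                callback(message)


alerts = []
report_counts = defaultdict(int)


def alerting_service(message):
    alerts.append(
        f"{message['eventType']} on {message['object']}: {message['outcome']}"
    )


def reporting_service(message):
    report_counts[message["eventType"]] += 1


topic = Topic()
topic.subscribe(alerting_service, event_types={"ingest", "delete"})
topic.subscribe(reporting_service)

# The repository would publish one message per management API call.
topic.publish({"eventType": "ingest", "object": "rutgers-lib:4291", "outcome": "success"})
topic.publish({"eventType": "fixityCheck", "object": "rutgers-lib:4291", "outcome": "success"})
```

The publisher never needs to know which services exist: a new preservation service can be added by subscribing to the topic, without changing the ingest code.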
