Versioning of Digital Objects in a Fedora-based Repository

Matthias Razum, FIZ Karlsruhe. DORSDL Workshop, Alicante, September 21, 2006.

Presentation Transcript


  1. Versioning of Digital Objects in a Fedora-based Repository • Matthias Razum, FIZ Karlsruhe • DORSDL Workshop, Alicante, September 21, 2006

  2. Outline • Motivation • Versioning Concepts in eSciDoc • Content Models • Technical Approach • Conclusion

  3. Project Setup and Mission • eSciDoc is a joint project of the Max Planck Society (MPS) and FIZ Karlsruhe • It is funded by a five-year, 6 million € grant (2004 – 2009) from the German Federal Ministry of Education and Research • It aims to build an integrated information, communication and publishing platform for web-based scientific work, demonstrated by way of example for multidisciplinary applications in the MPS • eSciDoc is not a mere research project, but aims at establishing an innovative production system

  4. Repositories for eScience • The contents of an institutional repository or a digital library form the ‘institutional memory’ of an organization • And just like human memory, they should allow for associating information objects in novel contexts, thus creating new scholarship • Interdisciplinary work is becoming increasingly important, so systems have to span scientific disciplines • Repositories should be open, application-independent and flexible, thus laying the ground today for repurposing the information in future applications

  5. Turning Static Objects into ‘Living’ Knowledge • e-Scholarship makes it possible to publish all intermediate results of knowledge generation, from first ideas, theories, and discussions with peers to final results • Institutional Repositories and Digital Libraries need to support scholars from the early steps of this process, enabling them to share their work in progress with peers • Thinking a step further leads to interactive authoring environments with support for collaboration and annotations • As a result, objects lose their static nature and become ‘active nodes’ in a network of knowledge

  6. Implications • The concept of ‘ownership’ of an artifact is loosened and partly replaced by an ongoing authoring process which spans persons, places, and time • Collaborative authoring raises an issue familiar to software developers: versioning of digital objects • All intermediate or working versions of artifacts should become part of the repository, not just the final versions • Good Scientific Practice requires provenance data for objects and versioning

  7. Outline • Motivation • Versioning Concepts in eSciDoc • Content Models • Technical Approach • Conclusion

  8. Versioning on Object Level • Fedora’s basic object model – as defined in FOXML – is composed of an identifier, some key descriptive properties and a set of datastreams • Currently, each change to a datastream leads to a new version of the datastream, but not of the object itself. • On the other hand, authors and editors perceive objects as one coherent entity, not as a set of datastreams. • They request a ‘whole-object’ versioning which complies with their mental model.

  9. Fixed and Floating Object References • Scholarly work strongly relies on citations and external references to existing material (e.g. primary data and supplementary material) • In the context of digital repositories, these associations are expressed as object relations. • Versioning of objects then raises the question of how to handle relations pointing to a versioned object. • eSciDoc implements two approaches: fixed relations, which point to exactly one given version of an object, and floating relations, which always point to the latest version of an object.
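The difference between fixed and floating relations can be illustrated with a minimal Python sketch. It is not eSciDoc code: the names `VersionedObject`, `Relation`, and `resolve`, and the convention that a missing version ID marks a floating relation, are assumptions made only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class VersionedObject:
    """Hypothetical repository object with an ordered version history."""
    pid: str
    versions: List[str] = field(default_factory=list)  # oldest first

    def latest(self) -> str:
        return self.versions[-1]


@dataclass
class Relation:
    """A relation to another object.

    Assumption for this sketch: version_id=None means 'floating',
    any other value means 'fixed' to that version.
    """
    target_pid: str
    version_id: Optional[str] = None


def resolve(relation: Relation,
            repository: Dict[str, VersionedObject]) -> Tuple[str, str]:
    """Return the (PID, version ID) pair the relation currently points to."""
    target = repository[relation.target_pid]
    if relation.version_id is None:
        # Floating: always follow the most recent version.
        return target.pid, target.latest()
    if relation.version_id not in target.versions:
        raise KeyError(f"{target.pid} has no version {relation.version_id}")
    # Fixed: stay on the version that was cited.
    return target.pid, relation.version_id
```

In this sketch a fixed relation keeps resolving to the cited version after the target changes, while a floating relation moves with it.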

  10. Internal and Public Versions • Versions represent intermediate work statuses and are only visible to authors of digital objects • Revisions are published versions of objects with persistent identifiers. • Creating a revision is an intellectual step which most often includes some form of quality assurance, whereas versioning is an automated process.
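The visibility rule for versions and revisions might be sketched as follows. This is an illustrative Python example under stated assumptions: the `Version` class, the `persistent_id` field, and the `visible_versions` helper are hypothetical names, and the real access control in eSciDoc may work differently.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Version:
    """Hypothetical version record; a revision is a version with a persistent ID."""
    version_id: str
    persistent_id: Optional[str] = None

    @property
    def is_revision(self) -> bool:
        return self.persistent_id is not None


def visible_versions(versions: List[Version],
                     authors: Set[str],
                     user: str) -> List[Version]:
    """Authors see every working version; everyone else sees only revisions."""
    if user in authors:
        return list(versions)
    return [v for v in versions if v.is_revision]
```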

  11. Container Objects • eSciDoc allows the grouping of objects by means of container objects like collections or bundles. • A change to one of the contained objects substantially changes the container object as well. Therefore, any change to a contained object should lead to a new version of the container object. • The same applies to revisioning: container objects are citable objects with their own persistent identifier. Revisioning of contained objects forces a new revision of the container object too.
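The propagation rule for containers can be sketched in a few lines of Python. The `RepoObject` class, the single-parent hierarchy, and the plain integer counters are simplifying assumptions for illustration, not a description of the eSciDoc implementation.

```python
from typing import Optional


class RepoObject:
    """Hypothetical object that may sit inside one container (its parent)."""

    def __init__(self, pid: str, parent: Optional["RepoObject"] = None):
        self.pid = pid
        self.parent = parent
        self.version = 1   # working-version counter
        self.revision = 0  # number of published revisions


def new_version(obj: RepoObject) -> None:
    """Version the object and every container above it."""
    node: Optional[RepoObject] = obj
    while node is not None:
        node.version += 1
        node = node.parent


def new_revision(obj: RepoObject) -> None:
    """Publish a revision of the object and force one on each enclosing container."""
    node: Optional[RepoObject] = obj
    while node is not None:
        node.revision += 1
        node = node.parent
```

Changes travel upwards only: revising a bundle in this sketch touches the collection that holds it, but not the items inside the bundle.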

  12. Outline • Motivation • Versioning Concepts in eSciDoc • Content Models • Technical Approach • Conclusion

  13. Content Models in General • An important part of implementing a Fedora repository is modeling the different classes or “genres” of digital objects that will be created, stored, and managed in the repository. • A content model will typically describe the following: • Datastream composition • the number and kinds of datastreams that must be present in the digital object • the format(s) for those datastreams, either MIME or format identifiers • whether each kind of datastream is required or optional • whether each kind of datastream has cardinality constraints • Semantic identifiers for each kind of datastream relationships • in the cases where a content model is a “graph” of related content models • Disseminators (optional)

  14. Structural View of Content Item • [Diagram] Entities: Content Item, Essential Properties, eSciDoc Metadata, Metadata, Content Component, License, CC Metadata, CC License • Relations shown: hasProperties, hasDefaultMD, hasRevision, hasMD, hasComponent, hasLicense, with cardinalities 1 or *

  15. Content Item Modeled as Fedora Object • [Diagram] Content Item hasComponent * Content Component • Content Item datastreams: RELS-EXT, eSciDoc MD, MD1 ... MDn, WOV MD • Content Component datastreams: RELS-EXT, CC MD, License1 ... Licensen, Content Stream

  16. Container Modeled as Fedora Object • [Diagram] Container hasMember * Content Item • Container datastreams: RELS-EXT, eSciDoc MD, MD1 ... MDn, Structure Map, WOV MD • Content Item datastreams: RELS-EXT, eSciDoc MD, MD1 ... MDn, WOV MD

  17. Outline • Motivation • Versioning Concepts in eSciDoc • Content Models • Technical Approach • Conclusion

  18. Whole-Object Versioning Metadata • Fedora versioning works automatically within objects, at the level of individual datastreams • The eSciDoc middleware keeps track of whole-object versions via objectVersion metadata • The eSciDoc middleware can also tag particular whole-object versions as “revisions”, which are the officially published views of the object

  19. Animated View • t0: Content Item (PID parent:1) VersionID 1.0, DOI --; CC1 (PID child:1) version t0; CC2 (PID child:2) version t0 • t1: VersionID 1.1, DOI --; CC1 version t0; CC2 version t1 • t2: VersionID 1.2, DOI --; CC1 version t0; CC2 version t1; CC3 (PID child:3) version t2 • t3 (revision): VersionID 1.3, DOI x.y/rev:1; CC1 version t0; CC2 version t1; CC3 version t2 • t4: VersionID 1.4, DOI --; CC1 version t4; CC2 version t1; CC3 version t2
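The timeline on this slide can be replayed with a small Python sketch of a whole-object version log. The class names `ObjectVersion` and `WholeObjectLog`, the `record` method, and the "1.N" numbering rule are assumptions chosen to match the slide, not the actual middleware API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ObjectVersion:
    """One whole-object version: a snapshot of the component versions."""
    version_id: str
    components: Dict[str, str]  # component PID -> datastream version label
    revision_id: Optional[str] = None


class WholeObjectLog:
    """Hypothetical stand-in for the middleware's version bookkeeping."""

    def __init__(self, pid: str):
        self.pid = pid
        self.versions: List[ObjectVersion] = []

    def record(self, components: Dict[str, str],
               revision_id: Optional[str] = None) -> ObjectVersion:
        # Assumed numbering scheme: 1.0, 1.1, 1.2, ...
        entry = ObjectVersion(f"1.{len(self.versions)}",
                              dict(components), revision_id)
        self.versions.append(entry)
        return entry

    def revisions(self) -> List[ObjectVersion]:
        return [v for v in self.versions if v.revision_id is not None]


log = WholeObjectLog("parent:1")
log.record({"child:1": "t0", "child:2": "t0"})                   # t0
log.record({"child:1": "t0", "child:2": "t1"})                   # t1: CC2 changed
log.record({"child:1": "t0", "child:2": "t1", "child:3": "t2"})  # t2: CC3 added
log.record({"child:1": "t0", "child:2": "t1", "child:3": "t2"},
           revision_id="x.y/rev:1")                              # t3: revision
log.record({"child:1": "t4", "child:2": "t1", "child:3": "t2"})  # t4: CC1 changed
```

Each entry stores which version of every component belonged to the object at that moment, so an older whole-object version can be reassembled from Fedora's datastream versions.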

  20. Object Version XML
      <objectVersion versionID="1.0">
        <comment>this is the first whole object version</comment>
        <component PID="child:5" dateTime="2006-05-10T12:21:57Z"/>
        <component PID="child:6" dateTime="2006-05-10T12:21:57Z"/>
      </objectVersion>
      <objectVersion versionID="1.1" revisionID="doi:10.11.1234">
        <comment>child:5 is the same; child:6 modified; child:7 ingested</comment>
        <component PID="child:5" dateTime="2006-05-10T12:21:57Z"/>
        <component PID="child:6" dateTime="2006-08-11T09:23:09Z"/>
        <component PID="child:7" dateTime="2006-08-11T09:23:09Z"/>
      </objectVersion>
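A client could read this metadata with Python's standard `xml.etree.ElementTree` module, as in the sketch below. The wrapping `<versionHistory>` root element and the helper names `load_versions` and `changed_components` are assumptions added for the example; the slide shows only the `objectVersion` fragments.

```python
import xml.etree.ElementTree as ET

# The slide's two objectVersion elements, wrapped in a hypothetical
# <versionHistory> root so that the fragment parses as one document.
WOV_XML = """<versionHistory>
  <objectVersion versionID="1.0">
    <comment>this is the first whole object version</comment>
    <component PID="child:5" dateTime="2006-05-10T12:21:57Z"/>
    <component PID="child:6" dateTime="2006-05-10T12:21:57Z"/>
  </objectVersion>
  <objectVersion versionID="1.1" revisionID="doi:10.11.1234">
    <comment>child:5 is the same; child:6 modified; child:7 ingested</comment>
    <component PID="child:5" dateTime="2006-05-10T12:21:57Z"/>
    <component PID="child:6" dateTime="2006-08-11T09:23:09Z"/>
    <component PID="child:7" dateTime="2006-08-11T09:23:09Z"/>
  </objectVersion>
</versionHistory>"""


def load_versions(xml_text):
    """Map each versionID to its revisionID and component timestamps."""
    root = ET.fromstring(xml_text)
    versions = {}
    for ov in root.findall("objectVersion"):
        versions[ov.get("versionID")] = {
            "revisionID": ov.get("revisionID"),  # None if not a revision
            "components": {c.get("PID"): c.get("dateTime")
                           for c in ov.findall("component")},
        }
    return versions


def changed_components(old, new):
    """PIDs that are new or carry a different timestamp in the newer version."""
    return sorted(pid for pid, stamp in new["components"].items()
                  if old["components"].get(pid) != stamp)


versions = load_versions(WOV_XML)
print(changed_components(versions["1.0"], versions["1.1"]))
```

Comparing the component timestamps of two whole-object versions shows which components were modified or ingested in between, which matches the comment in version 1.1.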

  21. Outline • Motivation • Versioning Concepts in eSciDoc • Content Models • Technical Approach • Conclusion

  22. Conclusion • Versioning is essential for repositories which cover the whole object lifecycle • Fedora already comes with a powerful versioning mechanism, but it cannot fulfill all requirements of eSciDoc • Atomistic content models make versioning even more complex • The proposed approach provides a solution for advanced versioning requirements and at the same time demonstrates Fedora’s flexibility and adaptability

  23. Acknowledgements The concepts in this presentation are based on • eSciDoc’s Logical Data Model, created by Natasa Bulatovic (ZIM, Max Planck Society) • a joint workshop of ZIM and FIZ with Sandy Payette and Carl Lagoze

  24. Questions • matthias.razum@fiz-karlsruhe.de • www.escidoc-project.de/homepage.html