eSciDoc, VIRR and Digitization Lifecycle - insights into an infrastructure for management of digitized resources. Natasa Bulatovic Max Planck Digital Library Research and Development. The Max Planck Digital Library (MPDL) in a Nutshell.


eSciDoc, VIRR and Digitization Lifecycle - insights into an infrastructure for management of digitized resources

Natasa Bulatovic

Max Planck Digital Library

Research and Development


The Max Planck Digital Library (MPDL) in a Nutshell

  • Max Planck Digital Library (MPDL) is a service unit within the Max Planck Society (MPG)

  • MPG consists of about 80 institutes in three scientific sections

    • the Chemistry, Physics and Technology Section

    • the Biology and Medicine Section

    • the Human Sciences Section

  • The core activities of the MPDL lie in building up service infrastructure and tools for publications and research data

  • MPDL develops software solutions in close cooperation with scientists, librarians and technicians

  • In the Human Sciences Section several institutes have digitized cultural artefacts and want to make them openly accessible


eSciDoc SOA Landscape


Which data are managed?


How?

  • PubMan – Publication Management

  • VIRR – Textual digitized resources management

  • IMEJI – Image management


PubMan: Management of publications


VIRR is about

  • Collaboration of the MPDL with the Max Planck Institute for European Legal History

  • Motivation: The period of the Holy Roman Empire produced an enormous corpus of legislative sources. Until now, no complete collection of these works exists.


ViRR: Key features

  • Web-based collaborative application

  • Editor (bibliographic metadata, table of contents and structural metadata)

  • Viewer (online representation)

  • Browser


ViRR Editor

  • Combines a set of tools

    • Paginator

    • Table of Contents Editor

    • Metadata Editor

  • One complex, but flexible workspace

  • No default order for the usage of the tools


ViRR Editor - Paginator

  • Assign the logical page numbers to the physical ones

  • Choose between different formats (Arabic, Latin, custom)

  • Paginate manually or automatically
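The automatic mode could work roughly as follows. This is a minimal sketch, not ViRR's implementation: it assumes that "Latin" on the slide refers to Roman numerals, and the names `to_roman`, `paginate` and the scan identifiers are hypothetical.

```python
def to_roman(number: int) -> str:
    """Convert a positive integer to a lowercase Roman numeral."""
    if number <= 0:
        raise ValueError("Roman numerals need a positive integer")
    numerals = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ]
    parts = []
    for value, symbol in numerals:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def paginate(physical_pages, start_index=0, first_number=1, style="arabic"):
    """Map physical pages (scans) to logical page labels.

    Pages before start_index stay unlabelled, so a user can still
    paginate them manually or in a second pass with another style.
    """
    labels = {}
    for offset, page in enumerate(physical_pages[start_index:]):
        number = first_number + offset
        labels[page] = to_roman(number) if style == "roman" else str(number)
    return labels


scans = ["scan_001", "scan_002", "scan_003", "scan_004"]
front_matter = paginate(scans[:2], style="roman")   # Roman numerals for the preface
body = paginate(scans, start_index=2)               # Arabic numerals from the third scan
```

Running two passes with different styles mirrors the slide's point that the format can be chosen per range of pages.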


ViRR Editor - ToC Editor

  • Gather the logical structure of a work by breaking it down into structural elements

  • Arrange the hierarchical order of structural elements in the tree

  • Assign scans to structural elements

  • Choose from fine-grained structural element types (over sixty)
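One way to picture the resulting table of contents is as a tree whose nodes carry a type, a label and the scans assigned to them. The sketch below is illustrative only: the class `StructuralElement`, its fields, and the type names "monograph", "chapter" and "section" are assumptions, not ViRR's actual data model or type list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StructuralElement:
    """A node in the logical structure of a digitized work."""
    element_type: str                      # hypothetical type name
    label: str
    scans: List[str] = field(default_factory=list)
    children: List["StructuralElement"] = field(default_factory=list)

    def add(self, child: "StructuralElement") -> "StructuralElement":
        """Attach a child element and return it so calls can be chained."""
        self.children.append(child)
        return child

    def all_scans(self) -> List[str]:
        """Scans of this element and its descendants, without duplicates."""
        scans = list(self.scans)
        for child in self.children:
            scans.extend(child.all_scans())
        return list(dict.fromkeys(scans))  # keeps first-seen order


def outline(element: StructuralElement, depth: int = 0) -> List[str]:
    """Render the tree as indented lines, one per structural element."""
    lines = ["  " * depth + f"{element.element_type}: {element.label}"]
    for child in element.children:
        lines.extend(outline(child, depth + 1))
    return lines


work = StructuralElement("monograph", "Example work")
chapter = work.add(StructuralElement("chapter", "Chapter 1",
                                     scans=["scan_003", "scan_004"]))
chapter.add(StructuralElement("section", "Section 1.1", scans=["scan_004"]))
print("\n".join(outline(work)))
```

A scan may belong to several elements (a section starting mid-page, for instance), which is why `all_scans` removes duplicates.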


ViRR Editor – Metadata Editor

Assign descriptive metadata to structural elements

  • Detailed description of every structural element

  • Systematic browsing

  • Dedicated search will be possible


ViRR Viewer

  • Browse by ToC

  • Navigate to page

  • View metadata of structural element

  • Browse by scan

  • Page (web resolution); on click: page (full resolution)


ViRR: Sharing and reuse

http://virr.mpdl.mpg.de


From ViRR to Digitization Lifecycle Project

  • Goal

    • support the complete Digitization Lifecycle with guidelines, standards, tools and a publishing platform

  • Partners:

    • MPI for European Legal History, Frankfurt

    • Kunsthistorisches Institut, Florenz (KHI)

    • Bibliotheca Hertziana, Rom

    • MPI for Human Development, Berlin

  • Related projects:

    • ViRR (see http://colab.mpdl.mpg.de/mediawiki/ViRR:_Virtueller_Raum_Reichsrecht)

    • XML-Workflow (see http://colab.mpdl.mpg.de/mediawiki/MPDL_Project_XML_Workflow)


Imeji: Management of image collections


Imeji: repository of Digital Images

Organized into

  • Collections

    Created and defined by the institution, project, working group

  • Albums

    Created and defined by the researcher


Imeji: what is so different about it?

Imeji is not Flickr, nor Facebook...

  • Freely definable metadata profiles at collection level

  • Controlled Vocabularies may be integrated

  • Smart search for dates, ranges (based on the metadata type)

  • Helps gather metadata more effectively

  • Focusses on collaboration and metadata quality

  • Repository: Data can be exported at any time
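To illustrate how a typed metadata profile can drive a range search, here is a minimal sketch. It is not Imeji's implementation: the profile layout, the field names and the function `search_range` are hypothetical, and dates are assumed to be stored as ISO 8601 strings.

```python
from datetime import date

# Hypothetical collection-level metadata profile: field name -> metadata type.
PROFILE = {"title": "text", "photographer": "text", "taken": "date"}


def search_range(items, profile, field_name, start, end):
    """Return items whose date field lies within [start, end] (inclusive).

    The profile is consulted first, so a range search is only offered
    for fields that are declared as dates.
    """
    if profile.get(field_name) != "date":
        raise ValueError(f"{field_name!r} is not a date field in this profile")
    lower, upper = date.fromisoformat(start), date.fromisoformat(end)
    return [
        item for item in items
        if field_name in item
        and lower <= date.fromisoformat(item[field_name]) <= upper
    ]


images = [
    {"title": "Fresco, north wall", "taken": "2009-05-12"},
    {"title": "Fresco, south wall", "taken": "2010-03-01"},
    {"title": "Undated sketch"},
]
hits = search_range(images, PROFILE, "taken", "2009-01-01", "2009-12-31")
```

Because the type lives in the profile rather than in the search code, a collection with a different profile gets range search on its own date fields without code changes.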


eSciDoc and other services


eSciDoc SOA Landscape


eSciDoc core infrastructure

  • Areas: Statistics, Security, Resources & Data

  • Handlers and managers:

    • Report Handler

    • Report Definition Handler

    • Aggregation Definition Handler

    • Statistics Data Handler

    • Scope Handler

    • Admin Handler

    • Set Handler (OAI-PMH)

    • Item Handler

    • Container Handler

    • Content Relation Handler

    • Context Handler

    • Organizational Unit Handler

    • Content Model Manager

    • User Account Handler

    • Role Handler

    • Group Handler


CoNE Service

  • Manages named entities

    • Journals

    • Persons

    • Dewey Decimal Classification (3 public levels)

    • Creative Commons Licenses (CC licenses)

    • ISO 639-3 Languages

    • MIME Types

    • PACS classification

    • Custom classifications

  • Reuse

    • Data delivered in multiple formats (JSON, HTML, RDF/XML, Options list)

  • Motivation

    • Metadata quality: autosuggest components in solutions during metadata editing

    • Disambiguation: each entity is a named graph

    • Data linking: CoNE identifiers in publication metadata

    • Technical facilitation: all lists in one place

    • Persons: Researcher Portfolio

  • Extensions

    • Refresh data from external sources


CoNE – Control of Named Entities (http://cone.mpdl.mpg.de/)

  • Example resource: http://pubman.mpdl.mpg.de/cone/persons/resource/persons2450

  • Content negotiation supported
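Content negotiation means the client states the representation it wants in the HTTP `Accept` header. The sketch below only builds such a request with Python's standard library and does not send it. The mapping from format names to media types uses standard MIME types, but which of them the CoNE service actually honours is an assumption; `build_cone_request` is a hypothetical helper.

```python
from urllib.request import Request

# Standard media types; whether CoNE serves each of them is an assumption.
ACCEPT_BY_FORMAT = {
    "json": "application/json",
    "rdf": "application/rdf+xml",
    "html": "text/html",
}


def build_cone_request(resource_url: str, fmt: str) -> Request:
    """Build (but do not send) a request asking for one representation."""
    try:
        accept = ACCEPT_BY_FORMAT[fmt]
    except KeyError:
        raise ValueError(f"unknown format: {fmt!r}") from None
    return Request(resource_url, headers={"Accept": accept})


request = build_cone_request(
    "http://pubman.mpdl.mpg.de/cone/persons/resource/persons2450", "rdf"
)
print(request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual call; the same URL then serves different formats depending only on the header.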


Transformation Service

  • Transforms textual data formats

    • Metadata

    • Resources

    • Standard formats

    • Specific formats (e.g. EndNote custom fields)

  • Motivation

    • Migration of data from MPI

    • Exports and dissemination

    • Imports

    • Continuous interoperability enhancement

    • Implement once, use wherever needed


Search&Export Service: Citation style manager

  • Searches and exports results

    • Citation styles (Citation style manager)

    • EndNote

    • BibTeX

  • Reuse

    • Data delivered in multiple formats (PDF, HTML, XML, ODT)

    • By external systems (content management, WordPress)

  • Motivation

    • Search results should be available in various outputs

    • One service – many presentations (e.g. WordPress plug-in)

    • One interface – easy inclusion of various export formats


Syndication Service

  • Provides the latest data updates as feeds

    • RSS

    • Atom

  • Reuse

    • Subscription to feeds and data reuse

    • By any external clients

  • Extensions

    • Media RSS

Feeds:

<feed>

<!--The title of the feed -->

<title>Recent releases in repository</title>

<!--Feed's description -->

<description>Recent releases in repository (item versions)</description>

</feed>

Diagram: Syndication Service and eSciDoc Repository

  • 1 and 4: (not labelled in the diagram)

  • 2: Get feed definition

  • 3: Search/retrieve items (eSciDoc Repository)
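Steps 2 and 3 can be sketched as: read the feed definition, then wrap the retrieved items in a feed document. This is an illustration under assumptions, not the service's code: the functions `read_feed_definition` and `build_atom_feed`, the item dictionaries and the `urn:example:` identifiers are hypothetical, and the repository search is replaced by a plain list.

```python
import xml.etree.ElementTree as ET

FEED_DEFINITION = """<feed>
  <!--The title of the feed -->
  <title>Recent releases in repository</title>
  <!--Feed's description -->
  <description>Recent releases in repository (item versions)</description>
</feed>"""

ATOM = "{http://www.w3.org/2005/Atom}"


def read_feed_definition(xml_text):
    """Step 2: extract title and description from a feed definition."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("title"),
        "description": root.findtext("description"),
    }


def build_atom_feed(definition, items, feed_id, updated):
    """Wrap already retrieved items (step 3) in a minimal Atom document."""
    feed = ET.Element(ATOM + "feed")
    ET.SubElement(feed, ATOM + "title").text = definition["title"]
    ET.SubElement(feed, ATOM + "subtitle").text = definition["description"]
    ET.SubElement(feed, ATOM + "id").text = feed_id
    ET.SubElement(feed, ATOM + "updated").text = updated
    for item in items:
        entry = ET.SubElement(feed, ATOM + "entry")
        ET.SubElement(entry, ATOM + "title").text = item["title"]
        ET.SubElement(entry, ATOM + "id").text = item["id"]
        ET.SubElement(entry, ATOM + "updated").text = item["updated"]
    return ET.tostring(feed, encoding="unicode")


definition = read_feed_definition(FEED_DEFINITION)
items = [  # stand-in for the result of a repository search
    {"id": "urn:example:item:1", "title": "Item 1",
     "updated": "2010-01-01T00:00:00Z"},
]
atom_xml = build_atom_feed(definition, items,
                           "urn:example:feed", "2010-01-01T00:00:00Z")
```

Keeping the definition separate from the rendering is what lets one definition be served as RSS or Atom.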


Validation service

  • Semantic validation

  • Contextual validation

  • Validation rule editor (upcoming)


Data acquisition service

  • Fetches data from known sources via identifier (unAPI interface)

  • Transforms data to other formats
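An unAPI endpoint is addressed with two query parameters: `id` alone lists the formats available for an object, and `id` plus `format` returns the object in that format. The sketch below only builds such URLs; the endpoint address and the identifier are placeholders, not the service's real values, and `unapi_url` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode


def unapi_url(endpoint: str, identifier: str,
              fmt: Optional[str] = None) -> str:
    """Build an unAPI request URL.

    Without fmt the URL asks which formats exist for the identifier;
    with fmt it asks for the object in that format.
    """
    params = {"id": identifier}
    if fmt is not None:
        params["format"] = fmt
    return f"{endpoint}?{urlencode(params)}"


# Placeholder endpoint and identifier, for illustration only.
formats_url = unapi_url("http://example.org/unapi", "escidoc:1")
record_url = unapi_url("http://example.org/unapi", "escidoc:1", "bibtex")
```

`urlencode` percent-encodes the identifier, which matters for identifiers containing colons or slashes.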


PubMan SWORD Server

  • Deposit of data packages (metadata and fulltexts)

  • Logic implements a PubMan-specific workflow


PID Cache Manager

  • Fetches Handles from the GWDG Handle System (dummy resolution)

  • Assigns a pre-fetched handle to the resource

  • Synchronizes the assigned handle with the resolution to a resource in the Handle system
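The three steps above (pre-fetch, assign, synchronize) can be sketched with an in-memory cache. This is a sketch under assumptions, not the PID Cache manager itself: the class `PidCache` and its methods are hypothetical, and the two callables stand in for calls to the Handle System, whose real API is not shown here.

```python
from collections import deque


class PidCache:
    """Keeps pre-fetched handles so that assignment never waits on the
    external handle service."""

    def __init__(self, fetch_handle, update_handle):
        self._fetch = fetch_handle    # stand-in: returns a new handle
        self._update = update_handle  # stand-in: sets a handle's target URL
        self._free = deque()          # fetched but not yet assigned
        self._pending = {}            # assigned but not yet synchronized

    def prefetch(self, count):
        """Step 1: fetch handles ahead of time (with dummy resolution)."""
        for _ in range(count):
            self._free.append(self._fetch())

    def assign(self, resource_url):
        """Step 2: hand out a pre-fetched handle for a resource."""
        if not self._free:
            self.prefetch(1)
        handle = self._free.popleft()
        self._pending[handle] = resource_url
        return handle

    def synchronize(self):
        """Step 3: push the real resolution targets to the handle service."""
        synced = 0
        for handle, url in list(self._pending.items()):
            self._update(handle, url)
            del self._pending[handle]
            synced += 1
        return synced


# Fake handle service for illustration: a counter and a dictionary.
counter = iter(range(1, 100))
registry = {}
cache = PidCache(lambda: f"demo/{next(counter)}", registry.__setitem__)
cache.prefetch(2)
handle = cache.assign("http://example.org/item/1")
```

Until `synchronize` runs, the handle exists but does not yet resolve to the resource, which matches the "dummy resolution" mentioned on the slide.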

EPIC – European Persistent Identifier Consortium (GWDG Germany, SARA Netherlands, CSC Finland, http://www.pidconsortium.eu/ )


A note on the metadata profiles

  • DCAP based (Dublin Core Application Profile)

  • DC terms (identified by URIs)

  • eSciDoc solution specific terms (identified by URIs)

  • METS/MODS

  • Publicly available

    • Functional description http://colab.mpdl.mpg.de/mediawiki/ESciDoc_Application_Profiles

    • Schemas http://metadata.mpdl.mpg.de/escidoc/metadata/schemas/0.1/

  • Interoperability levels

    • Shared term definitions (done)

    • Semantic interoperability (done)

    • Description set syntactic interoperability (prepared)

    • Description set profile interoperability (prepared)


Premises

  • Applications

    • Web-based

    • Internationalized

    • Integrated Help system

    • Easy to use

    • Easy to install

  • Services and infrastructure

    • Reusable, interoperable, composed, technology-independent

    • Extensible, scalable and performant

  • Data

    • Persistently identified, versioned and discoverable, with provenance and authenticity information and fine-grained authorization

    • Described with published metadata profiles

    • Interoperable and enabled for reuse and repurpose


Related projects and new developments

  • DARIAH

    Digital Research Infrastructure for Arts and Humanities (see http://dariah.eu)

    • Imeji

  • AWOB

    • Astronomers Workbench

  • Resource Registries

  • ECHO – European Cultural Heritage Online

    (see http://echo.mpiwg-berlin.mpg.de/home)


Thank you!

  • bulatovic@mpdl.mpg.de

    http://colab.mpdl.mpg.de

    http://escidoc.org