STK5800 and EPrints

Services for Object Storage and Preservation

March 2008

All content in these slides is considered work in progress. In no way does it represent an absolute view of any final end product and at this stage should purely be considered a set of realistic ideas.

Outline
  • StorageTek 5800 (the Honeycomb) provides high-resilience data storage with a built-in metadata layer.
  • EPrints is repository software for managing large collections of digital objects and their related metadata.
EPrints
  • Open-source repository software that provides open access to institutional output.
  • Provides a powerful plugin-based package which can easily be extended at any layer to suit a user's requirements.
  • Two types of archive:
    • Those used to manage publications and small objects.
    • Those used to deposit large objects. These tend to contain heavier customisation.
Preserv2
  • Preserv2 is the 2nd iteration of a project looking at preservation services for repositories.
  • Beyond simple backup: format renderers, format translation, risk assessment, interoperability and long-term storage.

Why use a Honeycomb?
  • A Honeycomb is not just a “Big Disk”
  • A Service Based Architecture:
    • Big object, big storage, more powerful plugins/services.
    • Smaller Repositories can jointly use a single Honeycomb as a “Preservation Service”.
  • Preservation Service Providers
    • Can combine several servers into a “Honeycomb Cloud”
EPrints Architecture

[Diagram: EPrints (Repository) Layer, Object Storage, Metadata Storage]

EPrints and Honeycomb

[Diagram: EPrints (Repository) Layer, STK5800 Honeycomb]

Services for Repositories

[Diagram: EPrints (Repository) Layer, Metadata Services, Storage Beans, Automated Wide Area Backup]

Metadata Services
  • Same resilience as data.
  • Averts the need to store a file ID or URL somewhere in order to find an object.
  • Enables collections to be constructed by independent parties.
  • Objects can be exported into many formats accurately.
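
The idea of finding objects through the metadata layer, rather than through a stored file ID or URL, can be sketched with a small in-memory model. This is an illustration only: `ToyMetadataStore` and its methods are hypothetical names and do not represent the STK5800 client API.

```python
import hashlib


class ToyMetadataStore:
    """In-memory stand-in for an object store with a built-in metadata layer.

    Hypothetical illustration; not the STK5800 API.
    """

    def __init__(self):
        self._objects = {}   # object id -> raw bytes
        self._metadata = {}  # object id -> metadata fields

    def store(self, data, metadata):
        # The id is derived from the content, so the depositor never picks one.
        oid = hashlib.sha1(data).hexdigest()
        self._objects[oid] = data
        self._metadata[oid] = dict(metadata)
        return oid

    def query(self, **criteria):
        # Return the ids of all objects whose metadata matches every criterion.
        return sorted(
            oid for oid, md in self._metadata.items()
            if all(md.get(key) == value for key, value in criteria.items())
        )

    def retrieve(self, oid):
        return self._objects[oid]


store = ToyMetadataStore()
store.store(b"thesis pdf bytes", {"creator": "Smith", "type": "thesis"})
store.store(b"dataset bytes", {"creator": "Smith", "type": "dataset"})
store.store(b"article bytes", {"creator": "Jones", "type": "article"})

# An independent party builds a collection purely from metadata,
# without having recorded any ids or URLs at deposit time.
smith_collection = [store.retrieve(oid) for oid in store.query(creator="Smith")]
```

Here the repository never keeps the ids returned by `store`; the collection is reconstructed from the metadata query alone.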
Storage Beans
  • Can operate on the objects in the system without relying on the repository to manage these processes (e.g. object translation).
  • Preservation services can provide feedback to repository administrators on potential risks to their objects (e.g. object classification, age).
  • Can be used to extend the metadata layer to provide more powerful access to objects and their parts/pages (e.g. “Retrieve page 10 of volume 6 of X”).
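
A rough sketch of how such beans might behave, assuming a bean is simply a function the store runs over each object. Everything here is hypothetical: the `ToyStore` class, the at-risk format list, and the form-feed page separator are assumptions made for illustration, not part of Honeycomb or EPrints.

```python
class ToyStore:
    """Hypothetical store that runs registered beans over its own objects."""

    def __init__(self):
        self.objects = {}  # object id -> (data, metadata)
        self.beans = []

    def register_bean(self, bean):
        self.beans.append(bean)

    def put(self, oid, data, metadata):
        self.objects[oid] = (data, dict(metadata))

    def run_beans(self):
        # The store, not the repository, drives every bean over every object.
        reports = []
        for oid, (data, metadata) in self.objects.items():
            for bean in self.beans:
                note = bean(oid, data, metadata)
                if note:
                    reports.append(note)
        return reports


def format_risk_bean(oid, data, metadata):
    # Assumed risk rule: flag any format on a made-up "at risk" list.
    at_risk = {"application/x-oldformat"}
    if metadata.get("format") in at_risk:
        metadata["risk"] = "high"
        return f"{oid}: format {metadata['format']} is at risk"
    return None


def page_index_bean(oid, data, metadata):
    # Extend the metadata layer with a page count.
    # Assumption: pages are separated by form-feed characters.
    metadata["pages"] = data.count(b"\f") + 1
    return None


def get_page(store, oid, page_number):
    # Part-level access: "retrieve page N of object X" (1-based).
    data, _ = store.objects[oid]
    return data.split(b"\f")[page_number - 1]


store = ToyStore()
store.register_bean(format_risk_bean)
store.register_bean(page_index_bean)
store.put("vol6", b"page one\fpage two\fpage three", {"format": "text/plain"})
store.put("old1", b"legacy", {"format": "application/x-oldformat"})

reports = store.run_beans()         # feedback for repository administrators
second_page = get_page(store, "vol6", 2)
```

The repository only deposits objects and reads the reports; classification and metadata enrichment happen inside the storage layer.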
Wide Area Replication (Backup)
  • Two or more Honeycombs can be linked together over a wide area to provide mirrored backup.
  • This can be implemented by the archive, which can store its objects in a “Honeycomb Cloud”.
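
Mirrored backup across linked Honeycombs could look roughly like the following, assuming writes go to every node and reads fall back to any reachable node. `ToyHoneycomb` and `ToyCloud` are hypothetical classes for illustration; real wide-area replication would also need to handle partial writes and resynchronisation, which this sketch ignores.

```python
class ToyHoneycomb:
    """Hypothetical single storage node."""

    def __init__(self, name):
        self.name = name
        self.objects = {}
        self.online = True

    def put(self, oid, data):
        if not self.online:
            raise ConnectionError(f"{self.name} is unreachable")
        self.objects[oid] = data

    def get(self, oid):
        if not self.online:
            raise ConnectionError(f"{self.name} is unreachable")
        return self.objects[oid]


class ToyCloud:
    """Hypothetical 'Honeycomb Cloud': mirrors each write to all nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def put(self, oid, data):
        # Mirror the object onto every node in the cloud.
        for node in self.nodes:
            node.put(oid, data)

    def get(self, oid):
        # Read from the first reachable node that holds the object.
        for node in self.nodes:
            try:
                return node.get(oid)
            except (ConnectionError, KeyError):
                continue
        raise KeyError(oid)


site_a = ToyHoneycomb("site-a")
site_b = ToyHoneycomb("site-b")
cloud = ToyCloud([site_a, site_b])

cloud.put("eprint-1", b"object bytes")
site_a.online = False                 # simulate losing one site
recovered = cloud.get("eprint-1")     # still served from the mirror
```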
Possible Architectures (2)

[Diagram: three repositories]

Possible Architectures (3)

[Diagram: three repositories]

Possible Architectures (4)

[Diagram: three repositories]

Preservation Services
  • A “Honeycomb Cloud” provides the basis for a preservation service which can be provided to many small-scale (<200 GB) repositories.
  • Options for object storage:
      • Locally with Honeycomb acting purely as a preservation service.
      • Hand all object storage and retrieval to Honeycomb Cloud.
      • A half and half solution:
        • Small Objects served locally, Large Objects from Honeycomb.
        • Recent and Popular Objects served locally, Older Objects considered preserved.
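
The “half and half” options can be expressed as a simple placement rule. The sketch below combines both variants (size, then recency and popularity) into one function; that combination and all three thresholds are assumptions chosen for illustration, not values from the slides.

```python
from datetime import date, timedelta

# Assumed thresholds; a real deployment would tune these.
LARGE_OBJECT_BYTES = 100 * 1024 * 1024   # objects this size or larger are "large"
RECENT_WINDOW = timedelta(days=180)       # deposited within this window is "recent"
POPULAR_DOWNLOADS = 50                    # monthly downloads that count as "popular"


def choose_location(size_bytes, deposited, downloads_last_month, today):
    """Return "local" or "honeycomb" for where an object should be served from."""
    # Large objects are always handed to the Honeycomb.
    if size_bytes >= LARGE_OBJECT_BYTES:
        return "honeycomb"
    # Small objects stay local while they are recent or popular.
    is_recent = today - deposited <= RECENT_WINDOW
    is_popular = downloads_last_month >= POPULAR_DOWNLOADS
    if is_recent or is_popular:
        return "local"
    # Older, rarely used objects are considered preserved.
    return "honeycomb"


today = date(2008, 3, 1)
mb = 1024 * 1024
new_paper = choose_location(5 * mb, date(2008, 1, 10), 3, today)
raw_dataset = choose_location(2048 * mb, date(2008, 2, 1), 500, today)
old_paper = choose_location(5 * mb, date(2005, 6, 1), 0, today)
old_classic = choose_location(5 * mb, date(2005, 6, 1), 120, today)
```

With these assumed thresholds, the new paper and the still-popular older paper are served locally, while the large dataset and the unused older paper go to the Honeycomb.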
EPrints with the STK 4500

The out-of-the-box repository solution for large repositories.

Thumper's “Big Disk”
  • The Thumper system (STK 4500) is essentially a “Big Disk” server.
  • “Out of the Box” solution.
  • Expansions:
    • Services to enable replication between two Thumpers.
    • Preservation services using a Honeycomb.

Aimed at Repositories where tape backup is not ideal.

eCrystals (Possible Use Case)
  • Large chemistry repository which currently stores only processed result objects (small).
  • These result files are generated from >1 GB raw datasets.
  • 8+ datasets generated a day.
  • After 6 months, result sets are of less value.
    • This represents roughly 1 TB of raw data in a 6-month period.
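
As a rough check on the 6-month volume, the figures above can be multiplied out. The number of days is an assumption: the slides do not say whether datasets are generated every calendar day or only on working days, so both are shown, using the lower bounds of 8 datasets a day and 1 GB per dataset.

```python
DATASETS_PER_DAY = 8        # lower bound from "8+ datasets a day"
DATASET_SIZE_GB = 1         # lower bound from ">1 GB raw datasets"

CALENDAR_DAYS = 182         # assumption: roughly half a year of calendar days
WORKING_DAYS = 26 * 5       # assumption: 26 weeks of 5 working days

calendar_gb = DATASETS_PER_DAY * DATASET_SIZE_GB * CALENDAR_DAYS
working_gb = DATASETS_PER_DAY * DATASET_SIZE_GB * WORKING_DAYS

print(f"Every calendar day: {calendar_gb} GB (~{calendar_gb / 1000:.1f} TB)")
print(f"Working days only:  {working_gb} GB (~{working_gb / 1000:.1f} TB)")
```

Either way the lower bound lands between about 1.0 and 1.5 TB, which is in line with the 1 TB figure quoted for the period.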
eCrystals – Single Honeycomb Architecture
  • Current repository remains
  • All result sets stored on Honeycomb

  • Pros
    • Simple architecture
    • Sole use of Honeycomb
    • Year of “on-site” storage
  • Cons
    • Cost
    • Backup procedure?

[Diagram: EPrints (Repository) Layer]

eCrystals – Thumper with “Honeycomb Cloud”

[Diagram: Thumper System, EPrints (Repository) Layer]

  • Pros
    • Single local machine
    • 6+ months locally accessible
    • Automated preservation
    • Preservation services managed by Honeycomb Cloud
    • Storage Beans on Honeycomb Cloud compress older/less popular objects
  • Cons
    • ?

Summary
  • Honeycomb provides:
    • Better separation of repository layer from storage layer.
    • Repository interoperability.
    • A new approach to storing and preserving data from institutional repositories based on EPrints and other software.