Experience of a low-maintenance distributed data management system

W. Takase (1), Y. Matsumoto (1), A. Hasan (2), F. Di Lodovico (3), Y. Watase (1), T. Sasaki (1)

1. High Energy Accelerator Research Organization (KEK), Japan
2. University of Liverpool, UK
3. Queen Mary, University of London, UK


Contents
  • KEK iRODS system
    • Running in production for over 2 years
    • Rules enable files to be stored efficiently
    • Federation with QMUL
  • iRODS applications
    • SCALA : Visualization tool for iRODS
    • iRODS XOR-based backup
  • Summary
iRODS overview
  • Distributed data management system
  • Client-server architecture
  • Allows data management policies to be enforced on the server-side
  • Provides interface to many different types of storage
  • Clients can access iRODS via
    • i-commands : Command-line utilities
    • iRODS Browser : Web interface
KEK iRODS Systems
  • 4 iRODS servers
    • RHEL 5.6
    • iRODS 2.5 ⇒ 3.2
    • PostgreSQL 9.1.1
    • In operation for over 2 years
  • iRODS Zone
    • KEK-T2K
    • KEK-MLF
    • KEKZone
    • demoKEKZone
  • Storage resource
    • HPSS (High Performance Storage System)
    • Disk System
Data Management for T2K
  • Tokai to Kamioka (T2K) Neutrino experimental group
  • The experimental data is stored on KEK storage
  • The group needed an easy way to quickly access the collected data from outside KEK in order to evaluate its quality
  • iRODS provided the solution

http://t2k-experiment.org/wp-content/uploads/t2kmap.gif

Data Management for T2K
  • The KEK-T2K Zone for the experimental group started operation in October 2010
  • Detected data are processed and then transferred to KEK iRODS
  • Group members became able to access the stored data easily and quickly via
    • i-commands
    • iRODS Browser
iRODS Rules for KEK-T2K Zone
  • Bundle and replicate the data

[Diagram labels: Client, rodsweb, T2K data server, iRODS server, DB, Disk system, HPSS; files are bundled into a tar file]

Each experimental data file is small (~several MB)

HPSS prefers large files
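
The bundling idea above can be sketched in a few lines of Python. This is only an illustration of packing many small files into one tar archive before it goes to a tape-oriented store; it is not the iRODS rule used at KEK, and the function names `bundle_files` and `list_bundle` as well as the sample file names are hypothetical.

```python
import io
import tarfile


def bundle_files(files: dict[str, bytes]) -> bytes:
    """Pack many small files into a single in-memory tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        # Sort by name so the archive layout is reproducible.
        for name, payload in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def list_bundle(bundle: bytes) -> list[str]:
    """Return the member names stored in a tar archive."""
    with tarfile.open(fileobj=io.BytesIO(bundle), mode="r") as archive:
        return archive.getnames()


if __name__ == "__main__":
    # Hypothetical small data files standing in for experimental output.
    small_files = {"run001.dat": b"a" * 10, "run002.dat": b"b" * 20}
    bundle = bundle_files(small_files)
    print(list_bundle(bundle), len(bundle), "bytes in one archive")
```

The storage system then sees one large object instead of many small ones, which is the property the slide attributes to HPSS.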

iRODS Rules for KEK-T2K Zone
  • Response to request

[Diagram labels: Client, rodsweb, T2K data server, iRODS server, DB, Disk system, HPSS, request, tar file, file]
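
One way to picture the request path is reading a single file back out of a bundle. The sketch below assumes the bundle is an ordinary tar archive already staged to disk; `make_bundle` and `fetch_from_bundle` are hypothetical helper names, and the actual KEK rule logic is not shown in the slides.

```python
import io
import tarfile


def make_bundle(files: dict[str, bytes]) -> bytes:
    """Build an in-memory tar archive (stand-in for a staged bundle)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def fetch_from_bundle(bundle: bytes, name: str) -> bytes:
    """Return the content of one member without unpacking the others."""
    with tarfile.open(fileobj=io.BytesIO(bundle), mode="r") as archive:
        member = archive.extractfile(name)  # raises KeyError if name is absent
        if member is None:  # not a regular file
            raise KeyError(name)
        return member.read()


if __name__ == "__main__":
    bundle = make_bundle({"run001.dat": b"first", "run002.dat": b"second"})
    print(fetch_from_bundle(bundle, "run002.dat"))
```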

Federation with QMUL
  • Data replication between the 2 sites
  • Each site shares its data with the other

[Diagram: KEK-T2K (experimental data) federated with QMULZone (analytical data)]

Amount of data in KEK-T2K

The T2K group started data taking on 22 December 2011

SCALA : Visualization tool for iRODS
  • Statistical Charts And Log Analyzer
  • iRODS lacked an interface for usage statistics and also for debugging problems
  • We developed a web interface for visualizing an overview of iRODS status
    • Statistical Charts page
    • Log Analyzer page
  • SCALA has been installed on KEK iRODS
SCALA Overview
  • Input : iRODS outputs
  • Output : Daily system status visualized as charts

[Diagram labels: iRODS, log files, database, resource usage; SCALA stages Parse, Summarize, Display; parsed table, summarized table]
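
The parse and summarize stages can be illustrated with a short Python sketch. The log line layout used here (month, day, time, `pid:`, level, message) and the names `parse` and `summarize` are assumptions made for the example; the slides do not show how SCALA itself is implemented.

```python
import re
from collections import Counter

# Assumed line layout: "Mar 14 10:05:40 pid:1234 ERROR: some message"
LINE_PATTERN = re.compile(
    r"^(?P<day>\w{3}\s+\d{1,2}) \d{2}:\d{2}:\d{2} pid:\d+ "
    r"(?P<level>[A-Z]+): (?P<message>.*)$"
)


def parse(lines):
    """Turn raw log lines into (day, level, message) rows; skip unparsable lines."""
    rows = []
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match:
            rows.append((match["day"], match["level"], match["message"]))
    return rows


def summarize(rows):
    """Count log entries per (day, level), ready to be drawn as a daily chart."""
    return Counter((day, level) for day, level, _ in rows)


if __name__ == "__main__":
    sample = [
        "Mar 14 10:02:11 pid:1234 NOTICE: client connected",
        "Mar 14 10:05:40 pid:1234 ERROR: connection refused",
        "Mar 15 09:00:00 pid:77 ERROR: disk full",
        "unparsable line",
    ]
    for key, count in sorted(summarize(parse(sample)).items()):
        print(key, count)
```

Keeping the parsed rows separate from the summary mirrors the parsed table and summarized table in the diagram: the summary feeds the charts, while the rows remain available for drilling into individual errors.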

Statistical Charts
  • Visualizes iRODS daily operational data
Log Analyzer
  • Provides an error debugging tool
    1. User clicks a bar
    2. Error detail is displayed
    3. User clicks an error message
    4. Related log is displayed

Download SCALA
  • http://tgwww.kek.jp/scala/
iRODS XOR-based backup
  • Full file replication
    • The current method for reliable data storage is to replicate the data
    • If a disk or server fails, a copy is still available
    • Requires much storage space
    • If a portion of a file becomes corrupt, the full file has to be replaced
  • XOR-based backup
    • Reduces the storage space while keeping the same robustness
    • Splits a file into blocks and creates parity blocks
    • If a block becomes corrupt, only the corrupted block has to be recreated
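
The split-and-parity idea can be shown with a minimal Python sketch. It uses a single XOR parity block, which can rebuild exactly one lost block; the scheme presented in the slides is not described in enough detail to reproduce, so the layout and the names `split_blocks`, `xor_blocks`, and `recover_block` are assumptions for illustration only.

```python
from typing import Optional


def split_blocks(data: bytes, count: int) -> list[bytes]:
    """Split data into `count` equal-sized blocks, zero-padding the tail."""
    size = max(1, -(-len(data) // count))  # ceiling division
    padded = data.ljust(size * count, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(count)]


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            result[i] ^= value
    return bytes(result)


def recover_block(blocks: list[Optional[bytes]], parity: bytes) -> list[bytes]:
    """Rebuild the single missing block (marked None) from survivors and parity."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) != 1:
        raise ValueError("single parity can rebuild exactly one missing block")
    survivors = [block for block in blocks if block is not None]
    repaired = list(blocks)
    repaired[missing[0]] = xor_blocks(survivors + [parity])
    return repaired


if __name__ == "__main__":
    data = b"neutrino event data"
    blocks = split_blocks(data, 3)
    parity = xor_blocks(blocks)
    damaged = [blocks[0], None, blocks[2]]  # pretend block 1 was corrupted
    repaired = recover_block(damaged, parity)
    print(b"".join(repaired)[:len(data)])
```

Only the damaged block is recomputed, and the extra storage is one parity block rather than a full second copy of the file.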
XOR-based backup: 100% recovery when any 2 servers fail
  • XOR-based backup uses 4 servers but only needs 200GB
  • Full-file replication uses 3 servers and needs 300GB
  • iRODS rule enables automatic processing
Summary
  • The KEK iRODS system has been running in production for over 2 years
  • iRODS gives a way to quickly and easily access data from outside KEK
  • The rule that bundles and replicates the data stores files efficiently
  • Federation with QMUL enables data sharing and backup between the sites
  • SCALA is a visualization tool and has been installed on KEK iRODS
    • It leads to better management of the overall iRODS service
  • XOR-based backup provides data reliability at a lower storage cost than replication
    • iRODS rule enables automatic processing
Thank you for your attention!

Wataru Takase wataru.takase@kek.jp