Experiences Deploying Xrootd at RAL

Chris Brew (RAL)

Contents

  • Introduction to xrootd

    • What is xrootd?

    • How does xrootd work?

    • What are xrootd’s advantages?

  • Deploying xrootd at RAL


What is xrootd?

  • xrootd (eXtended Root Daemon) was written at SLAC and INFN Padova as part of the work to migrate the BaBar event store from Objectivity to Root I/O

  • It’s a fully functional suite of tools for serving data, including server daemons and clients which talk to each other using the xroot protocol

  • Since Sept 2004 (Root V4.01-02) it has been distributed as part of the Root distribution as well as being separately available from http://xrootd.slac.stanford.edu


Xrootd Goals

  • High Performance File-Based Access

  • Fault tolerance

  • Flexible Security

  • Simplicity

  • Generality


xrootd Architecture

[Diagram: layered stack, listed top to bottom]

  • application

  • Protocol Manager (xrd)

  • Protocol Layer (xroot)

  • Filesystem Logical Layer (ofs)

  • Filesystem Physical Layer (oss)

  • Filesystem Implementation (_fs, mss)

[Other diagram labels: xrootd, Authentication, Authorization, optional, (included in distribution)]


Performance Features

  • Scalable request/response protocol

    • Connection multiplexing

    • Heavily Multithreaded

    • Request redirection supported

    • Request deferral

    • Unsolicited Reverse Request Mode

  • Adaptive Reconfiguration


Server Clustering

  • Multiple xrootd servers can be clustered using olbd software supplied with the release

    • Architected as self-configuring structured peer-to-peer (SP2) network of data servers and managers

    • Servers can be added & removed at any time

    • Single namespace can be shared across many servers

    • Provides scalability and Fault Tolerance

    • Client library understands SP2 layout and handles redirection and failover seamlessly for Application
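
The redirection and failover behaviour described above can be illustrated with a toy model. This is only a sketch and does not use the real xroot protocol or the olbd software: the names DataServer, Manager and read_with_failover are hypothetical, and the manager here is a simple in-memory lookup.

```python
from dataclasses import dataclass, field


@dataclass
class DataServer:
    """Toy stand-in for a data server holding part of the namespace."""
    name: str
    files: set = field(default_factory=set)
    up: bool = True

    def read(self, path):
        # A server that is down refuses the connection.
        if not self.up:
            raise ConnectionError(f"{self.name} is not responding")
        if path not in self.files:
            raise FileNotFoundError(path)
        return f"contents of {path}"


class Manager:
    """Toy stand-in for a manager that redirects clients to data servers."""

    def __init__(self):
        self.servers = {}

    def add(self, server):
        # Servers can be added at any time.
        self.servers[server.name] = server

    def remove(self, name):
        # Servers can also be removed at any time.
        self.servers.pop(name, None)

    def locate(self, path, exclude=()):
        """Return the name of a server believed to hold path, or None."""
        for name in sorted(self.servers):
            if name not in exclude and path in self.servers[name].files:
                return name
        return None


def read_with_failover(manager, path):
    """Ask the manager where the file is; go back to it if that server fails."""
    failed = []
    while True:
        name = manager.locate(path, exclude=failed)
        if name is None:
            raise FileNotFoundError(path)
        try:
            return name, manager.servers[name].read(path)
        except ConnectionError:
            # Remember the bad server and ask the manager for another one.
            failed.append(name)
```

In this sketch the application only ever calls read_with_failover; which server actually supplies the file is hidden from it, which is the point of the single namespace.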


Simple Multi-Server Example

[Diagram: a Client, a Load Balancer (xrootd daemon and olb daemon) and a Data Server (xrootd daemon and olb daemon)]


Simple Load Balanced Example


Load Balanced Example with MSS


The Cache File System

  • Xrootd allows us to span a single namespace across multiple servers

  • CacheFS allows us to span a single namespace across multiple disks on the same server

  • Distributed with xrootd but not a part of it

  • Provides scripts to manage namespace

    • Namespace served by xrootd is just a tree of links /some/file/path/…

    • Links point to /cacheNN/some%file%path%...

  • Links to the…
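
The link naming scheme above can be sketched as a pair of small functions. This is an illustration only, not the CacheFS scripts themselves: the function names are hypothetical, the two-digit width of NN is an assumption, and the sketch assumes file names never contain a literal "%".

```python
def cache_target(logical_path: str, cache_index: int) -> str:
    """Map a namespace path to the flat file name used on a cache disk.

    Assumption: NN in /cacheNN is a zero-padded two-digit disk number.
    """
    flat = logical_path.strip("/").replace("/", "%")
    return f"/cache{cache_index:02d}/{flat}"


def logical_path(cache_file: str) -> str:
    """Recover the namespace path from a flat cache file name."""
    flat = cache_file.rsplit("/", 1)[1]
    return "/" + flat.replace("%", "/")


if __name__ == "__main__":
    target = cache_target("/some/file/path", 7)
    print(target)
    print(logical_path(target))
```

Because the mapping is reversible, the namespace tree of links could in principle be rebuilt from the contents of the cache disks alone.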


MPS (Migration, Purging and Staging) Tools

  • The MPS commands manage the interface between the disk cache and the MSS

    • Migration step copies new data from the disk to MSS*

    • Purge step deletes unused data from disk*

    • Stage step copies data back from MSS to disk when requested by xrootd†

  • Supports pre-staging and stage request queuing

  • Files can be pinned on disk for a set period or indefinitely

  • Calls user defined commands to interface to any MSS
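
As an illustration of the purge step, here is a minimal sketch of choosing which files to delete. It is not the MPS implementation: the least-recently-used ordering, the rule that only already-migrated files may be purged, and all names (CachedFile, select_for_purge) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    path: str
    size: int                  # bytes on disk
    last_access: float         # seconds since the epoch
    migrated: bool             # True once a copy exists in the MSS
    pinned_until: float = 0.0  # float("inf") means pinned indefinitely


def select_for_purge(files, capacity, target_fraction, now):
    """Return paths to delete so usage drops to target_fraction of capacity.

    Assumed policy: oldest access first, skipping files that are pinned
    or that have not yet been migrated to the MSS.
    """
    used = sum(f.size for f in files)
    target = capacity * target_fraction
    victims = []
    for f in sorted(files, key=lambda f: f.last_access):
        if used <= target:
            break
        if not f.migrated or f.pinned_until > now:
            continue
        victims.append(f.path)
        used -= f.size
    return victims
```

A purged file remains in the namespace of the MSS, so a later request for it would go through the stage step rather than fail.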


Deploying Xrootd at RAL

  • Where we started

    • BaBar at RAL had 40TB increasing to 75TB of disk on 57 filesystems spread across 26 servers

    • Single namespace was implemented using a large directory tree containing links to files on the disks

    • Very difficult to maintain


  • Where we wanted to get to:

    • All BaBar data files served via xrootd

    • Files backed up in ADS

    • No single point of failure

    • Simple deployment


Server Layers

  • Xrootd/Olbd system serves files to clients

  • CacheFS aggregates multiple FS’s on each server

  • MPS manages the disk space, calls …

  • ADS link to put/retrieve files from …

  • Atlas Datastore, tape backup

[Slide diagram: a dividing line after the MPS layer, with the labels "XROOTD" and "RAL"]


BaBar RAL Data Model

  • Clients locate data via two olb manager machines with one DNS alias

  • Disks are held at 95-97% full, so if a server fails, requests for the data files it held automatically cause those files to be staged from the ADS into the spare space

  • Stage requests can also be triggered by excessive loads on a server

  • A purge process on each server will eventually delete these extra files returning us to the initial position

  • Daily catalogues of each filesystem should mean that we can rebuild lost filesystems from the ADS
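
A rough back-of-the-envelope check of the spare space this leaves, using the 75TB and 26 servers quoted earlier. The assumption that disk is spread evenly across servers is ours, purely for illustration.

```python
TOTAL_TB = 75.0   # total BaBar disk at RAL, from the figures above
SERVERS = 26      # number of servers, from the figures above


def spare_space_tb(total_tb, fill_fraction):
    """Free space left when disks are held at the given fill fraction."""
    return total_tb * (1.0 - fill_fraction)


# Assumption: disk is spread evenly, so this is the average share per server.
per_server_tb = TOTAL_TB / SERVERS
spare_low = spare_space_tb(TOTAL_TB, 0.97)
spare_high = spare_space_tb(TOTAL_TB, 0.95)

print(f"average disk per server: {per_server_tb:.2f} TB")
print(f"spare space at 95-97% full: {spare_low:.2f} to {spare_high:.2f} TB")
```

Under these assumptions the spare space is roughly 2.25 to 3.75 TB against an average of about 2.88 TB per server, so it is of the same order as the data held by a single server.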


Installation

  • Rpm from SLAC provides basic software plus some additional packages but does not include:

    • Configuration

    • Start up scripts

    • Interface to our MSS

  • These are provided by a local config rpm

  • Both rpms are installed/monitored by RAL’s Yum/YumIT infrastructure


RAL Config rpm

  • Installs the ADS link software

  • Creates init.d entries for xrootd and olbd and registers them with chkconfig

  • Builds server config file from common and machine specific files

  • Sets up the CacheFS system across the disks

  • Starts, condrestarts or stops the services for install, upgrade or removal of the rpm
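
The config-building step can be pictured as simple concatenation. This is a sketch under assumptions, not the RAL rpm's actual script: the function name is hypothetical, and placing the common part first followed by the machine-specific part is our guess at the ordering.

```python
def build_server_config(common: str, machine_specific: str) -> str:
    """Join a shared config fragment and a per-machine fragment.

    Assumption: the common settings come first and the machine-specific
    settings follow; each fragment is normalised to end with one newline.
    """
    parts = [common.strip("\n"), machine_specific.strip("\n")]
    return "\n".join(p for p in parts if p) + "\n"


if __name__ == "__main__":
    # The setting names below are placeholders, not real xrootd directives.
    common = "placeholder_common_setting value\n"
    local = "placeholder_local_setting value\n"
    print(build_server_config(common, local), end="")
```

Keeping the shared part in one file means a change to it reaches every server at the next rpm upgrade, while per-machine differences stay small and explicit.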


ADS Link

  • Locally written Perl module with wrapper utilities

  • Chains together sysreq path, datastore and tape commands to allow easy access to ADS entities by filename

  • Supported operations:

    • put

    • get

    • stat

    • rm
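
The four operations suggest a small storage interface. The sketch below shows that shape in Python with an in-memory stand-in; the real ADS Link is a Perl module driving sysreq, datastore and tape commands, none of which is reproduced here, and the class and method signatures are hypothetical.

```python
from abc import ABC, abstractmethod


class MassStore(ABC):
    """Hypothetical interface mirroring the put/get/stat/rm operations."""

    @abstractmethod
    def put(self, name: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, name: str) -> bytes: ...

    @abstractmethod
    def stat(self, name: str): ...

    @abstractmethod
    def rm(self, name: str) -> None: ...


class InMemoryStore(MassStore):
    """Stand-in back end for testing; holds everything in a dict."""

    def __init__(self):
        self._objects = {}

    def put(self, name, data):
        self._objects[name] = bytes(data)

    def get(self, name):
        # Raises KeyError if the entity does not exist.
        return self._objects[name]

    def stat(self, name):
        # Returns None for a missing entity, else a small info dict.
        if name not in self._objects:
            return None
        return {"size": len(self._objects[name])}

    def rm(self, name):
        self._objects.pop(name, None)
```

Addressing entities by file name, as here, is what lets a caller such as MPS stay unaware of how the store locates the data.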


Benefits

  • For Users:

    • Jobs don’t crash if a disk/server goes down; they back off, contact the olb manager and get the data from somewhere else

    • Queues aren’t stopped just because 2% of the data is offline

  • For Admins:

    • No need for heroic efforts to recover damaged filesystems

    • Much easier to schedule maintenance


Where We Are Now

  • Most of the above has been deployed very painlessly

    • xrootd/olbd and the CacheFS have been deployed on 26 servers with > 70TB of disk and > 65TB of data on them

    • The MPS / ADS Link / ADS chain has been written and tested; it is in successful production on one server and will be deployed on more in the very near future


Conclusion

  • Xrootd has proved to be easy to configure and link to our MSS. Initial indications are that the production service is both reliable and performant

  • This should improve the lives of both users and sysadmins, with huge advances in both the robustness of the system and its maintainability, without sacrificing performance

  • Talks, software (binaries and source), documentation and example configurations are available at http://xrootd.slac.stanford.edu

