SRM-Lite: overcoming the firewall barrier for data movement

Arie Shoshani

Alex Sim

Viji Natarajan

Lawrence Berkeley National Laboratory

SDM Center All-Hands Meeting

November, 2007


Outline

  • What are Storage Resource Managers (SRMs)

  • Requirements for using SRMs behind firewalls

  • Satisfying the Requirements

  • Architecture

  • Potential uses


Storage Resource Managers

SRMs are middleware components whose function is to provide:

dynamic space allocation AND file management in spaces

for storage components on the local or wide-area network

Based on a common standard

[Figure: client/user applications reach storage through SRM interfaces. SRM implementations shown: DPM, StoRM, dCache, CASTOR, BeStMan. Storage shown: GPFS, Unix-based disk pools. Site label: CCLRC RAL.]

Examples of storage systems currently supported by SRMs


Storage Resource Managers: Main concepts

  • Non-interference with local policies

  • Advance space reservations

  • Dynamic space management

  • Pinning files in spaces

  • Support abstract concept of a file name: Site URL (SURL)

  • Temporary assignment of file names for transfer: Transfer URL (TURL)

  • Directory Management and ACLs

  • Multi-file requests (srmRequestToPut, srmRequestToGet, srmCopy)

  • Transfer protocol negotiation

  • Peer to peer request support

  • Support for asynchronous multi-file requests

  • Support abort, suspend, and resume operations

  • SRM relies on other services for data movement (GridFTP, HTTPS, SCP, …)


Concepts: Site URL and Transfer URL

  • Provide: Site URL (SURL)

    • URL known externally – e.g. in Replica Catalogs

    • e.g. srm://ibm.cnaf.infn.it:8444/dteam/test.10193

  • Get back: transfer URL (TURL)

    • Path can differ from the SURL (SRM internal mapping)

    • Protocol chosen by SRM based on request protocol preference

    • e.g. gsiftp://ibm139.cnaf.infn.it:2811//gpfs/dteam/test.10193

  • One SURL can have many TURLs

    • Files can be replicated in multiple storage components

    • Files may be in near-line and/or on-line storage

  • In light-weight SRM (a single file system on disk)

    • The SURL can be the same as the TURL except for the protocol

  • File sharing is possible

    • Same physical file, but many requests

    • Needs to be managed by SRM
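The light-weight case above, where the TURL equals the SURL except for the protocol, can be illustrated with a small sketch. The function name `surl_to_turl` and the host `host.example.org` are hypothetical, and the sketch assumes host, port, and path are reused unchanged; a full SRM may map all three internally, as in the cnaf.infn.it example.

```python
from urllib.parse import urlsplit


def surl_to_turl(surl, protocol):
    """Swap only the scheme of a Site URL to obtain a Transfer URL.

    Assumption: a light-weight SRM over a single file system, so the
    host, port, and path are kept as-is. This is an illustration, not
    the mapping used by any particular SRM implementation.
    """
    parts = urlsplit(surl)
    if parts.scheme != "srm":
        raise ValueError(f"not a SURL: {surl}")
    return parts._replace(scheme=protocol).geturl()


# The protocol would normally come from the client's protocol preference.
print(surl_to_turl("srm://host.example.org:8444/dteam/test.10193", "gsiftp"))
```

Because one SURL can have many TURLs, a real SRM would choose among replicas and protocols at request time rather than compute the TURL from the SURL alone.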


Earth System Grid Analysis Environment (in production for 4 years)

>5000 users

160 TB managed

[Figure: ESG deployment across LBNL, ANL, NCAR, LLNL, ORNL, and ISI. Components shown: HPSS (High Performance Storage System), MSS (Mass Storage System), and disk; HRM and DRM (Storage Resource Management); gridFTP servers, including a striped server; openDAPg server; Tomcat servlet engine; MyProxy server and client; CAS (Community Authorization Services) and CAS client; MCS (Metadata Cataloguing Services) and MCS client; RLS (Replica Location Services) and RLS client; GRAM gatekeeper; SOAP and RMI links.]

SRMs are used and inter-communicate in several sites


Robust Data Movement provided by SRMs and DataMover

  • Problem: move thousands of files robustly

    • Takes many hours

    • Need error recovery

      • Mass storage system failures

      • Network failures

  • Solution: Use Storage Resource Managers (SRMs)

    • File streaming paradigm

    • By reserving and releasing storage space automatically

  • Problem: too slow

  • Solution:

    • In GridFTP

      • Use parallel streams

      • Use large FTP windows

    • Pre-stage files from MSS

    • Use concurrent transfers

[Figure: Example setup for Earth System Grid (ESG), with sites NCAR and LBNL. DataMover (running anywhere) gets the list of files and issues SRM-COPY (thousands of files). One SRM performs writes and issues SRM-GET (one file at a time); the other SRM performs reads. Files move by GridFTP GET (pull mode). Other labels: MSS, disk caches, network transfer, stage files, archive files.]
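The file streaming paradigm above (reserve cache space, move a file, release the space) can be sketched as a bounded pipeline. This is only an illustration under stated assumptions: `stream_files`, `stage`, `transfer`, and `cache_slots` are hypothetical names, and the cache is counted in whole files rather than bytes.

```python
import queue
import threading


def stream_files(files, stage, transfer, cache_slots=2):
    """Move files through a disk cache holding at most cache_slots files.

    stage(name) brings a file into the cache (e.g. from an MSS);
    transfer(name) sends it on. Both are caller-supplied callables.
    """
    slots = threading.BoundedSemaphore(cache_slots)  # free cache space
    staged = queue.Queue()
    done = object()  # sentinel marking the end of the stream

    def stager():
        try:
            for name in files:
                slots.acquire()   # reserve space; blocks while cache is full
                stage(name)
                staged.put(name)
        finally:
            staged.put(done)      # always unblock the consumer

    threading.Thread(target=stager, daemon=True).start()

    moved = []
    while True:
        name = staged.get()
        if name is done:
            return moved
        transfer(name)
        slots.release()           # release space once the file has been sent
        moved.append(name)
```

Staging runs ahead of the transfers only as far as the cache allows, which is the point of streaming: a request for thousands of files never needs space for all of them at once.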


File tracking shows recovery from transient failures

[Figure: file-tracking plot. Total: 45 GB]


Requirements for SRM-Lite

  • Run SRM behind a firewall

    • Cannot have third party transfers (source/target is local)

  • May not be able to run GridFTP

    • Remote site may not support it

    • Some communities choose not to use GSI

  • Need support for multi-file transfer

    • Or entire directory

  • Need support for asynchronous request

    • Also support for intermediate status of request

  • Need to support concurrent file transfers


Satisfying the Requirements: SRM-Lite

  • Run SRM behind a firewall

    • Must have a client tool (SRM-Lite)

  • May not be able to run GridFTP

    • Support high-performance SCP: Use HPN-SSH from Pittsburgh Supercomputing Center

    • But, also use other transfer protocols (GridFTP, bbcp, https, …)

  • Need support for multi-file transfer

    • Manage queues for large requests

  • Need support for asynchronous request

    • SRM-Lite returns a “request token”; token can be used for “request status”

  • Need to support concurrent file transfers

    • Use multi-threading to manage concurrent transfers

    • Monitor transfers and recover from mid-transfer interruptions
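The last three items above (request token, intermediate status, concurrent transfers with recovery) can be sketched together. This is not SRM-Lite's code: `TransferQueue`, its methods, and the retry policy are hypothetical, and the caller-supplied `transfer` callable is assumed to raise `OSError` on a failed or interrupted transfer.

```python
import concurrent.futures as cf
import uuid


class TransferQueue:
    """Hypothetical multi-file request manager, for illustration only."""

    def __init__(self, transfer, max_threads=4, max_attempts=3):
        self._transfer = transfer      # callable(source, target)
        self._pool = cf.ThreadPoolExecutor(max_workers=max_threads)
        self._max_attempts = max_attempts
        self._requests = {}            # request token -> list of futures

    def _run_one(self, source, target):
        # Retry an interrupted transfer a bounded number of times.
        for attempt in range(1, self._max_attempts + 1):
            try:
                self._transfer(source, target)
                return attempt
            except OSError:
                if attempt == self._max_attempts:
                    raise

    def submit(self, pairs):
        """Queue (source, target) pairs; return a request token at once."""
        token = uuid.uuid4().hex
        self._requests[token] = [
            self._pool.submit(self._run_one, s, t) for s, t in pairs
        ]
        return token

    def status(self, token):
        """Intermediate status: pending, done, and failed file counts."""
        counts = {"pending": 0, "done": 0, "failed": 0}
        for fut in self._requests[token]:
            if not fut.done():
                counts["pending"] += 1
            elif fut.exception() is not None:
                counts["failed"] += 1
            else:
                counts["done"] += 1
        return counts

    def wait(self, token):
        """Block until every file in the request has finished."""
        cf.wait(self._requests[token])
        return self.status(token)
```

A caller would keep the token returned by `submit` and poll `status(token)` while the worker threads drain the queue.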


Scenario A: firewall at one site

Process Steps

  • Login to ORNL using OTP

  • At ORNL invoke SRM-Lite

  • User composes XML input file, srmlite.xml, for selected files/directories to copy from/to another site

  • Or, user gives command line option for a selected file/directory

  • SRM-Lite uses srmlite.xml or command line input to automatically

    • Push/Pull files to/from NERSC

    • Use multiple threads for concurrent transfers

[Figure: after OTP login at ORNL, SRM-Lite reads srmlite.xml and moves files between ORNL and NERSC. Labels: SSH channel (SCP), SSH server, local commands and protocols, GridFTP/FTP/BBCP/HTTP transfers, disk cache at each site.]

Put example: Source: file:////my_directory/file_foo Target: scp://host/target_dir/file_foo

Get example: Source: GridFTP://host/target_dir/file_foo Target: file:////my_directory/file_foo
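The Put and Get examples suggest that the direction and protocol of a transfer follow from which side is a local file: URL. A minimal sketch of that rule, where `plan_transfer` is a hypothetical helper and not an SRM-Lite function:

```python
from urllib.parse import urlsplit


def plan_transfer(source, target):
    """Return (direction, protocol) for one source/target pair.

    Assumption: exactly one side is a local file: URL, since a site
    behind a firewall cannot take part in third-party transfers.
    """
    src_scheme = urlsplit(source).scheme   # urlsplit lowercases the scheme
    dst_scheme = urlsplit(target).scheme
    if src_scheme == "file" and dst_scheme != "file":
        return ("put", dst_scheme)         # push a local file to the remote site
    if dst_scheme == "file" and src_scheme != "file":
        return ("get", src_scheme)         # pull a remote file to local disk
    raise ValueError("exactly one of source/target must be a file: URL")


print(plan_transfer("file:////my_directory/file_foo",
                    "scp://host/target_dir/file_foo"))
```

The returned protocol name could then select the transfer tool (SCP, GridFTP, BBCP, and so on) for that file.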


Scenario B: one end has a firewall, the other end has SRM

[Figure: after OTP login at ORNL, SRM-Lite reads srmlite.txt and sends an SRM request to the SRM at NERSC; files move by GridFTP/FTP/SCP transfers. Other labels: disk cache at each site, HPSS.]

Put example: Source: file:////my_directory/file_foo Target: srm://host/target_dir/file_foo


Scenario C: firewalls at both ends

Process Steps

  • Login to site1 using OTP

  • At site1 invoke SRM-Lite

  • SRM-Lite at site1 uses SSH to invoke SRM-Lite at site2

  • Use SSH channel for SCP

  • Same as before:

    • User composes XML input file, srmlite.xml, for selected files/directories to copy from/to another site

    • Or, user gives command line option for a selected file/directory

[Figure: OTP login at site1; SRM-Lite runs at both site1 and site2. Labels: SSH channel (SCP), SSH server, srmlite.xml, disk cache at each site.]


Scenario C: SRM-Lite manages MSS access

[Figure: OTP login at site1; SRM-Lite runs at both site1 and site2. Labels: SSH channel (SCP), SSH server, srmlite.xml, two disk caches, two HPSS systems.]


GUI for SRM-Lite

  • Called DataMover-Lite

  • Versions exist for Linux, PC, Mac

  • Used in ESG

  • Special version for data movement to user workstations


Usage

  • Combustion project

    • The Applied Partial Differential Equations Center (APDEC)

    • John Bell

    • Efficient, robust data movement from sites behind firewalls

    • At DoE and DoD sites

  • Kepler-SRM-Lite actor

    • To be used for managing multi-file transfers from sites behind firewalls

    • Launch SRM-Lite remotely through SSH

      • Initial version – help from NCSU: Pierre Mouallem

    • Two modes

      • Entire request

      • Streaming file requests

    • To be used in CPES workflows first with Norbert’s help

