Standards in DPM


Outline

  • Disk Pool Manager (DPM)

  • Grid Data Management

  • Why Standards

  • POSIX Access (NFS4.1)

  • HTTP / WebDAV


Disk Pool Manager (DPM)


DPM Main Goals

  • Provide lightweight “grid enabled” storage

  • Manage space on distributed disk servers

  • Manage a hierarchical namespace

  • Expose interfaces for

    • Space management (socket, SRM 1.1, SRM 2.1, SRM 2.2)

    • Remote data access (GridFTP, HTTP/HTTPS)

    • POSIX-like access (rfio)


DPM Architecture

[Architecture diagram: a head node holding the metadata and running the DPNS, DPM, SRM and NFS 4.1 services; clients access it via the CLI, Python and C APIs; direct data access to the disk nodes goes over GridFTP, NFS 4.1, xroot, RFIO and HTTP.]


DPM Architecture

[Head node detail: file and directory metadata operations go to the DPNS namespace daemon; the DPM daemon handles file access requests and interacts with the disk node daemons; storage management is offered to SRM clients via the SRM daemons; an NFS 4.1 frontend also sits on the head node.]


DPM Architecture

[Disk node detail: GridFTP handles file upload/download using GSI over FTP; HTTP offers GET/PUT operations via an Apache web server; NFS 4.1, xroot and RFIO provide POSIX-like access directly on the disk nodes.]


DPM Main Features

  • Separation of data and metadata

  • Hierarchical namespace

    • /dpm/<domain>/home/<vo>

  • Strong security

    • X509 & KRB5 (auth), VOMS & Virtual IDs (authz)

  • UNIX like commands

    • dpns-ls, dpns-mkdir, dpns-<usual-command-here> (see the sketch after this list)

  • Database centric

    • Support for MySQL, Oracle and PostgreSQL, easily load balanced

  • Server side on SLC4/5 and Solaris; Debian clients; prototype for Mac OS X clients
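
A minimal sketch of the UNIX-like dpns commands mentioned above, driven from Python. The head node name, VO and paths are made-up examples, not taken from the slides.

```python
# Illustrative only: wraps the dpns-* command-line tools named in the slide.
# Assumes the DPM client tools are installed and a valid credential is available;
# the host and paths below are hypothetical.
import os
import subprocess

os.environ["DPNS_HOST"] = "dpmhead.example.org"   # hypothetical DPM head node

base = "/dpm/example.org/home/dteam"              # /dpm/<domain>/home/<vo> convention

def dpns(*args):
    """Run a dpns command and return its standard output."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout

dpns("dpns-mkdir", f"{base}/demo")                # create a directory in the namespace
print(dpns("dpns-ls", "-l", base))                # list the VO home directory
```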


DPM Further Details

  • Written mostly in C

  • Statistics from GStat

    • https://gstat-wlcg.cern.ch/gstat/stats/

    • Over 200 grid sites use DPM

  • Largest deployment: 1.2PB

  • More Information

    • https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm


Grid Data Management


Grid Data Management

  • Heterogeneous environment

  • > 250 sites in total

  • DPM is one of multiple storage elements

    • Others are CASTOR, dCache, StoRM/GPFS, …

  • SRM is the only “standard” protocol

    • Helps, but only for space management

    • And even here implementations differ

  • No standard protocol for data access

    • rfio, dcap(++), xroot


Grid Data Management

[Diagram: a GFAL-based client queries the information system (BDII) and the ToA, then talks to the different storage element flavours — CASTOR, dCache, DPM, … — each exposing its own mix of access protocols (RFIO, xroot, dcap/dcap++, GridFTP, HTTP/WebDAV).]

  • Clients need knowledge of storage backend type

  • Complex, hard to deploy and maintain


Grid Data Management

  • Library dependency issues

  • Requirement of user interfaces (UIs)

    • Entry points to the grid

    • Maintained by experts

  • Very hard to use “standard” distributions

    • Even the transition from SLC4 to SLC5 is problematic

  • Validation takes a long time


Why Standards?

  • Accessibility

    • Not limiting access to operating system X, version Y, with library Z

  • Validation

    • Using common validation and test tools

  • Stability

    • Evolution discussed in a wide group

  • Ease of implementation

    • Sharing of experiences, common code base

  • No vendor lock-in


POSIX Data Access & NFS 4.1


POSIX Data Access

  • We have some specific requirements

  • Strong authentication

    • Ideally using X509 certificates

  • Support for clustered filesystems

    • Separation of data and metadata access

  • (Global?) Hierarchical namespace

  • Performance (even in WAN)

    NFS4.1 offers all of this… and more

  • We now have a standard we can use

  • It’s just POSIX, no need for any additional library (see the sketch below)
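
A minimal sketch of what "it's just POSIX" means in practice, assuming the DPM namespace has been mounted over NFS 4.1; the mount point and file path are hypothetical.

```python
# Illustrative only: plain POSIX I/O against an NFS 4.1 mount of the DPM namespace.
# The mount point and file path are made-up examples.
import os

path = "/dpm/example.org/home/dteam/demo/data.root"

# Ordinary metadata and data operations, no grid-specific library involved.
info = os.stat(path)
print(f"{path}: {info.st_size} bytes")

with open(path, "rb") as f:
    header = f.read(1024)   # read the first kilobyte, as with any local file
```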


Some NFS History

  • NFS2 in 1989 (RFC 1094)

    • NFS1 was Sun internal

  • NFS3 in 1995 (RFC 1813)

    • Large file support (64bit)

    • Performance enhancements (larger transfer buffers, fewer round trips)

  • NFS4 in 2003 (RFC 3530)

    • Better WAN performance

    • Strong security

    • Locking

    • Delegations

    • Callbacks

    • Backwards compatible extensions

  • NFS4.1 in 2010 (RFC 5661)

    • Main feature is parallel NFS


NFS 4.1

  • IETF Standard (RFC 5661)

  • Different in nature from previous versions

  • Parallelism is the key word

    • No single server bottleneck

    • Meets needs of HPC and clustered systems

  • Supported by major vendors

  • Let’s look at it in a bit more detail…


NFS 4.1 Overview

[Diagram: the client talks to the metadata server over the pNFS protocol (with callbacks in the opposite direction) and obtains layouts; it then accesses the data servers directly over a storage access protocol; metadata and data servers are coordinated by a control protocol that the standard leaves undefined.]


NFS 4.1 Feature 1 – Unified Protocol

  • One protocol, one port (2049)

  • Previous versions required additional protocols

    • mount, lock, status, …


NFS 4.1 Feature 2 – Strong Auth(z)

  • Based on RPCSEC_GSS (RFC 2203)

  • Support for multiple security mechanisms

    • KRB5 is mandatory, negotiation is in the protocol

    • Working to have X509 support (probably via globus GSSAPI plugin)

  • String based identities

  • Basic permissions + ACLs

  • Example: Linux client

[Diagram: (1) an application's system call reaches the kernel NFS client; (2) through the mechglue layer, the user-space rpc.gssd daemon negotiates the security context using the credential store; (3) the established context is returned to the kernel client.]

NFS 4.1 Feature 3 – Bulk Operations

  • Protocol defines only two procedures

    • NULL and COMPOUND

  • COMPOUND procedure holds Operations

    • Open, Read, Write, Close, …

  • Far fewer round trips

    • Better performance, especially over WAN


NFS 4.1 Feature 4 - Sessions

  • Decouples transport (connection) from client

  • Persistent state on the server

    • Locks, opens, delegations, layouts, …

  • Multiple connections per session (from the same client)

  • Multiple sessions per client

  • Exactly Once Semantics (EOS)

    • Even in failure/recovery, thanks to reply cache


NFS 4.1 Feature 5 - Delegations

  • Given for files and directories

  • Moves part of the logic from server to client

    • Regarding access permissions, …

  • Multiple types of delegations

  • Can be recalled (via callback)

    • On conflicting request from other client

  • Improved Performance

    • Fewer round trips


NFS 4.1 Feature 6 - Layouts

  • Describe how to access data in storage

    • Multiple storage protocols supported

      • File, Object (RFC 5664), Block (RFC 5663)

      • Striping available with pNFS File Layout

  • Clients must request a layout from the MD server

  • Storage servers refuse access if request does not match layout

  • Layouts can be recalled, via callback

    • E.g. a change in access permissions


NFS 4.1 Feature 7 – Multi Server Namespace

  • Namespace spanning multiple domains

  • Servers redirect clients when data is not local

    • Redirection is the key word here

  • Can also be used to provide clients with alternative locations

[Diagram: (1) the client issues open(/grid/siteB/myFile) against NFS site A, which replies NFS4ERR_MOVED with an fs_locations attribute; (2) the client reissues open(/grid/siteB/myFile) against NFS site B.]


NFS 4.1 Additional Goodies

  • Clients provided by industry

    • Linux, Solaris, Windows

  • Free client caching

    • It’s just there… we benefit from experts implementing caching in the OS

  • Support from major industry vendors

    • Netapp, Panasas, IBM, Oracle, EMC

    • Waiting for wide client availability

    • dCache also has support for NFS4.1


NFS 4.1 Client Availability

  • Linux since 2.6.32

    • pNFS part coming with 2.6.36

    • pNFS builds in Fedora 12, 13

      • We also maintain a Debian build

    • pNFS expected in RHEL 6.1

  • Solaris driver available (but not shipped yet)

  • Windows driver available


DPM NFS4.1 Implementation

  • Aiming for a basic prototype by end of August

  • Additional frontend in the DPM Head Node

  • Possible to reuse existing implementations

    • Benefit from implementing a standard

    • Kernel Space: Linux spnfs

    • User Space: Ganesha (CEA)


DPM NFS4.1 Implementation

  • Based on spnfs

    • A prototype pNFS server implementation for Linux

    • Done as a proof of concept, but good for us to use as a starting point

  • Still, a lot of the code can be reused

    • We integrate the name server requests with the DPM API

[Diagram: the kernel nfsd module communicates through pipefs with the user-space DPM NFS daemon, which uses the DPM API.]

DPM NFS 4.1 Picture

[Diagram: on the client side, a user application in user space (with rpc.gssd) sits on top of the NFS kernel client; on the server side, the NFS kernel server (with rpc.svcgssd) hands requests to the user-space DPM NFS daemon, which uses the DPM API and the DPM database.]


More Info

https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm/Dev/NFS41


Data Transfer


HTTP(S)

  • DPM already supports HTTP(S)

    • As a transfer protocol

  • Easy authentication / authorization

    • Newer versions of openssl with X509 proxy support make this even easier

  • Implemented as an Apache module or CGI

  • Firewall friendly

  • Clients? They are everywhere… (one is sketched below)
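
As a rough illustration that HTTP clients are everywhere, the sketch below downloads a file over HTTPS with only the Python standard library. The host, path and the use of an X509 proxy as client certificate are assumptions, not details from the slides.

```python
# Illustrative only: HTTPS download using the Python standard library.
# Host, path and proxy location are hypothetical; a grid proxy file is used as
# the client certificate, assuming the server accepts X509 proxies.
import ssl
import urllib.request

proxy = "/tmp/x509up_u1000"                       # hypothetical X509 proxy file
ctx = ssl.create_default_context()
ctx.load_cert_chain(certfile=proxy, keyfile=proxy)

url = "https://dpmdisk.example.org/dpm/example.org/home/dteam/demo/data.root"
with urllib.request.urlopen(url, context=ctx) as resp, open("data.root", "wb") as out:
    out.write(resp.read())                        # a plain HTTP GET, any client would do
```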


WebDAV

  • Extensions to HTTP 1.1 for document management (RFC 2518)

  • Enables wide collaboration

    • Locking

    • Namespace management (copy, move, …)

    • Metadata / properties on files

  • Maybe not so interesting for HEP users

    • But very popular within other communities

    • dCache has had very good feedback on it

  • Implementation not yet scheduled, but in the plan (a sample request is sketched below)
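
A small sketch of the namespace management operations listed above: a WebDAV MOVE (rename) request, as defined in RFC 2518, issued with Python's http.client. The server name and paths are made up, and DPM does not expose this interface yet.

```python
# Illustrative only: a WebDAV MOVE (rename) request over HTTP.
# Server and paths are hypothetical; authentication is omitted for brevity.
import http.client

conn = http.client.HTTPConnection("webdav.example.org")
conn.request(
    "MOVE",
    "/dav/dteam/demo/old-name.root",
    headers={
        # WebDAV requires an absolute Destination URI for the new location.
        "Destination": "http://webdav.example.org/dav/dteam/demo/new-name.root",
        "Overwrite": "F",                     # fail instead of overwriting
    },
)
resp = conn.getresponse()
print(resp.status, resp.reason)               # 201 Created on success
```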


Conclusions

  • Our environment is not standards friendly

  • Standard protocols exist today that fit all our use cases

  • Benefits for users, developers, admins

    • Usability, maintainability, evolution

  • DPM will continue focusing on standards

    • And will soon use them for all our use cases

  • Ongoing work also within the EMI data management group in the same direction


Standards in DPM

?

