Advanced topic on data management



  1. Advanced topic on data management The SRM protocol and the StoRM implementation

  2. Advanced topic on data management • I will briefly describe how the classic SE works: • Highlight design points and consequences for file security. • File security: POSIX-like ACL access to files from the GRID. • I’ll then talk about the SRM protocol: • Its origin in allowing tape resources to be accessed from the GRID. • Particular attention to design differences with the classic SE. • SRM’s transition to an interface for disk storage resources. • Differences with tape-based systems. • I’ll finally talk about StoRM: an SRM implementation that allows POSIX-like ACL access.

  3. I. Classic SE

  4. Classic SE • It allows disk resources to be accessed from the GRID. • What makes a machine into an SE? Three components are needed: • A component that publishes information and tells the GRID that it is an available storage resource. • The usual framework for authentication: GSI. • A component that actually moves the files around: the characterizing feature!

  5. Classic SE • Component that allows the GRID to be aware of its presence, i.e. to be included in the GRID information system. • There is an LDAP server that publishes information about the SE. • Information is organised according to the GlueSchema: specifically by the GlueSEUniqueID entity. • Information describing the SE, such as its name and the listening port of the service. • Information specific to each VO that the SE is serving, such as the local path to the file-holding directory, available space, etc. • Part of the information is updated dynamically, especially that concerning the disk space available and occupied. • This is done through LDAP providers found in /opt/lcg/libexec. • The providers periodically run scripts that update the dynamic information. • Finally, the rest of the grid information system periodically polls the information that the SE makes available.
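
To make the lookup concrete, the sketch below queries such an LDAP server with the third-party Python ldap3 library; the hostname, port and search base are hypothetical placeholders, and only common GlueSE attributes are requested:

```python
# Minimal sketch: query an SE's LDAP server for GlueSchema information.
# Hostname, port and search base are hypothetical placeholders.
from ldap3 import Server, Connection, ALL

server = Server("se.example.org", port=2135, get_info=ALL)
conn = Connection(server, auto_bind=True)  # anonymous bind, as in the grid information system

# Look up the entity describing this SE.
conn.search(
    search_base="mds-vo-name=local,o=grid",
    search_filter="(objectClass=GlueSE)",
    attributes=["GlueSEUniqueID", "GlueSEName", "GlueSEPort"],
)
for entry in conn.entries:
    print(entry.GlueSEUniqueID, entry.GlueSEName, entry.GlueSEPort)
```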

  6. Classic SE • User authentication: Grid Security Infrastructure (GSI). • Core of the GLOBUS 2.4 libraries: used by the service in charge of moving files around! • i.e. /opt/globus/lib/libglobus_gsi_credential_gcc32dbg.so.0, /opt/globus/lib/libglobus_gsi_proxy_core_gcc32dbg.so.0, etc. • Set of scripts run by cron jobs to manage pool accounts: • /opt/edg/sbin/edg-mkgridmap creates a gridmap file by reading a local configuration file that specifies sources of allowed credentials, from an LDAP server or a specific file. • /opt/edg/sbin/lcg-expiregridmapdir removes the mapping to local credentials when a grid user is no longer working on that machine. • /opt/edg/sbin/edg-fetch-crl retrieves certificate revocation lists (CRLs) of invalid certificates.
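
As an illustration of what edg-mkgridmap produces, a grid-mapfile pairs certificate DNs with local accounts (a leading dot marks a pool account). A toy parser, with made-up DNs and account names:

```python
# Toy illustration of the grid-mapfile format produced by edg-mkgridmap.
# DNs and account names below are made up for the example.
import re

GRIDMAP = '''
"/C=IT/O=INFN/OU=Personal Certificate/CN=Mario Rossi" .dteam
"/C=IT/O=INFN/OU=Personal Certificate/CN=Anna Bianchi" annab
'''

def parse_gridmap(text):
    """Return a list of (DN, local account) pairs."""
    entries = []
    for line in text.strip().splitlines():
        m = re.match(r'^"(?P<dn>[^"]+)"\s+(?P<account>\S+)$', line)
        if m:
            entries.append((m.group("dn"), m.group("account")))
    return entries

for dn, account in parse_gridmap(GRIDMAP):
    kind = "pool account" if account.startswith(".") else "static account"
    print(f"{dn} -> {account} ({kind})")
```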

  7. Classic SE Component that carries out the functionality of moving files around the GRID. • In general it is just any implementation of a transport protocol that supports GSI! • GridFTP is the most common! • RFIO. • Anything that somebody comes up with, as long as it is GSI-enabled: it is just a matter of who will adopt and use it!

  8. Classic SE GridFTP: • Essentially an FTP server extended/optimised for large data transfers: • Parallel streams for speed. • Checkpointing during file transfers, so interrupted transfers can be resumed later. • Authentication through GSI certificates instead of user name + password.
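
For instance, a transfer with parallel streams can be launched with the globus-url-copy client; the sketch below simply shells out to it from Python (the endpoint and paths are hypothetical, and `-p` requests the number of parallel streams):

```python
# Sketch: drive a GridFTP transfer from Python by shelling out to globus-url-copy.
# The endpoint and paths are hypothetical; -p requests 4 parallel streams.
import subprocess

result = subprocess.run(
    [
        "globus-url-copy",
        "-p", "4",                                      # parallel streams for speed
        "file:///data/local/input.dat",                 # local source
        "gsiftp://se.example.org:2811/data/input.dat",  # remote GridFTP destination
    ],
    capture_output=True,
    text=True,
)
if result.returncode != 0:
    print("transfer failed:", result.stderr)
```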

  9. Classic SE • Central point: • It is FTP! A user can do whatever an FTP client allows! • There is no separation between what can be done from the grid and the actual transport protocol. • There is no explicit, separate list of file manipulation operations that can be done from the grid! • There is no uniform view of the possible file manipulations: they are tied to the underlying transport protocol! • Depending on the protocol, you may not have the same functionality. • For the same functionality the same specific protocol must be used: it may not be possible to access all SEs seamlessly!

  10. Classic SE Compare with CEs, which have an LRMS interface to forked jobs or to batch jobs. • It is an abstraction layer over the kinds of computations that can be done. • LRMS may not be a great protocol (gLite CEs are somewhat different)… yet it is an attempt to introduce an abstraction.

  11. Classic SE A more serious consequence of the lack of abstraction is how to apply POSIX-ACL-like control on files from the grid. It is left up to the transport protocol! • For GridFTP: • It is FTP modified for GSI. • FTP allows file manipulation compatible with the underlying Unix filesystem permissions. • If grid control on files is needed, it is the underlying filesystem that must be carefully managed: • Map users to specific local accounts, not pool accounts: each grid user can then be controlled individually once it gets onto the machine. • Partition local accounts into especially created groups that reflect data access patterns. • A carefully crafted directory tree guides data access (see the sketch below). • So a grid user with no access rights to a file is stopped, because the GridFTP server is stopped in its tracks by the local filesystem!
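
A minimal sketch of that behind-the-scenes filesystem setup, with hypothetical group names and a relative root so it runs anywhere:

```python
# Sketch: the "carefully crafted directory tree" idea. Each data-access group
# gets a directory readable only by its members; GridFTP is then stopped by
# the filesystem when a mapped user lacks rights. All names are hypothetical.
import os

VO_ROOT = "storage/myvo"                 # relative root, so the sketch runs anywhere
GROUPS = {"stocks-readers": 0o750, "bonds-readers": 0o750}

for group, mode in GROUPS.items():
    path = os.path.join(VO_ROOT, group)
    os.makedirs(path, exist_ok=True)
    os.chmod(path, mode)                 # rwx for owner, rx for group, nothing for others
    # In a real setup the directory's owning group would be set too, e.g.:
    # shutil.chown(path, group=group)    # requires the Unix group to exist
```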

  12. Classic SE • In any case the proposed solution is problematic, because data may be present in several SEs: • Users must have the same UID across all SEs. • The directory structure must be replicated/synchronised across all SEs. • Users must be supplied with tools to manage permissions coherently across all SEs.

  13. Classic SE Central point: • The GRID lacked the concept of access control within the same VO. • Access control only appeared in the passage to the local machine. • The local machine had the means to enforce it: users + group membership. • Security therefore is set up behind the scenes, at the implementation level! • No GRID concept is involved! No GRID abstraction is available to: • Express fine-grained authorization. • Express what can be accessed. • Check GRID credentials.

  14. Classic SE VOMS proxies and GridFTP • VOMS allows roles and groups to be defined: it therefore allows fine tuning of who the GRID user is. • It is up to the system receiving these detailed credentials to decide what local resources to use. • For SEs there is still the same problem of explicitly listing what these resources are: the dependency on the transport protocol, as stated.
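
VOMS attributes travel as FQANs of the form /vo[/subgroup]/Role=role; a receiving service can parse them to decide which local resources to grant. A minimal sketch (the FQAN is a made-up example):

```python
# Sketch: parse a VOMS FQAN (/vo[/subgroups]/Role=role[/Capability=cap])
# to extract VO, group and role. The FQAN below is a made-up example.
def parse_fqan(fqan):
    parts = [p for p in fqan.strip("/").split("/") if p]
    groups = [p for p in parts if "=" not in p]
    attrs = dict(p.split("=", 1) for p in parts if "=" in p)
    return {
        "vo": groups[0],
        "group": "/" + "/".join(groups),
        "role": attrs.get("Role", "NULL"),
    }

print(parse_fqan("/egrid/stocks/Role=manager"))
# -> {'vo': 'egrid', 'group': '/egrid/stocks', 'role': 'manager'}
```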

  15. II. The SRM protocol

  16. The SRM protocol Storage Resource Manager protocol: • Originally devised to allow grid access to tape-based resources that had a disk area acting as a cache. • Staging of files: • A request for a file arrives. • If the file is in the cache it is returned right away. • Otherwise it is first fetched from tape, copied to disk and then returned. • The system takes care of consistency between cache and tapes. • Needed to offset the latency of the robotic arm switching tapes.
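
Stripped to its essence, the staging logic is a cache-aside lookup; in the sketch below the tape and cache stores are stand-in dictionaries:

```python
# Sketch of the staging logic: serve from the disk cache when possible,
# otherwise copy from tape first. The stores are stand-in dictionaries.
tape = {"run42.dat": b"...bytes on tape..."}
cache = {}

def stage(filename):
    """Return the file content, staging it from tape into cache if needed."""
    if filename in cache:       # cache hit: return right away
        return cache[filename]
    content = tape[filename]    # cache miss: slow tape fetch (robot arm!)
    cache[filename] = content   # keep it in cache for later requests
    return content

stage("run42.dat")   # slow: fetched from tape
stage("run42.dat")   # fast: served from cache
```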

  17. The SRM protocol SRM was designed to handle that tape/disk-cache scenario from the GRID: • The presence of a cache area introduces the concept of file type: • Volatile: files get written into the cache and the system then removes them automatically after a lifetime expires. • Permanent: files that get into the cache are not removed automatically by the system. • Durable: files do have a lifetime that may expire, but the system does not remove them and instead sends an e-mail notification to the user.
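
The three types differ only in what the system does when a lifetime expires; a compact sketch of that policy (the sweep function and its bookkeeping are illustrative, not part of the protocol):

```python
# Sketch of the expiry policy behind the three SRM file types.
import time
from enum import Enum

class FileType(Enum):
    VOLATILE = "volatile"     # removed automatically on expiry
    PERMANENT = "permanent"   # never removed by the system
    DURABLE = "durable"       # kept on expiry, but the owner is notified

def sweep(files, now=None):
    """files: list of (name, FileType, expiry timestamp). Returns survivors."""
    now = now or time.time()
    kept = []
    for name, ftype, expiry in files:
        if ftype is FileType.VOLATILE and now > expiry:
            print(f"removing expired volatile file {name}")
        else:
            if ftype is FileType.DURABLE and now > expiry:
                print(f"notifying owner: durable file {name} expired")
            kept.append((name, ftype, expiry))
    return kept
```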

  18. The SRM protocol • File staging introduces the concept of asynchronous calls to get or put a file: • An SRM request is issued to get a file. • The server replies immediately, without waiting for staging to complete. • The server returns a Request Token, which the client uses to periodically poll the request’s status.
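
In client code the pattern looks like the sketch below; SrmClient is a hypothetical stand-in for a real SRM client library, but the token-and-poll flow follows the protocol's srmPrepareToGet / srmStatusOfGetRequest pair described later:

```python
# Sketch of the asynchronous get: issue the request, receive a token,
# poll until the file is staged. SrmClient is a hypothetical stand-in.
import time

class SrmClient:
    """Stand-in for a real SRM client; replies 'ready' on the third poll."""
    def __init__(self):
        self._polls = 0
    def prepare_to_get(self, surl):
        return "token-1234"                 # Request Token, returned immediately
    def status_of_get_request(self, token):
        self._polls += 1
        if self._polls < 3:
            return {"status": "SRM_REQUEST_QUEUED"}
        return {"status": "SRM_SUCCESS",
                "turl": "gsiftp://storage.example.org:2811/pool/data.txt"}

client = SrmClient()
token = client.prepare_to_get("srm://storage.example.org:8444/data.txt")
while True:
    reply = client.status_of_get_request(token)   # periodic polling with the token
    if reply["status"] == "SRM_SUCCESS":
        print("stage complete, transfer from:", reply["turl"])
        break
    time.sleep(1)                                 # wait before polling again
```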

  19. The SRM protocol • The cache area also introduces a partition of the file namespace: • Tape must store files: there have to be names that uniquely identify each file on tape! • The cache area must serve files. • It may return a path for fetching the file on disk that is different from the name that uniquely identifies the file on tape. • It can easily support different fetching mechanisms… that is, different transport protocols! • SRM reflects this distinction in the concepts of SURLs and TURLs: • SURL: Storage URL - a name that identifies a grid file in SRM storage: it is what the GRID sees! • srm://storage.egrid.it:8334/old-stocks/NYSE.txt • TURL: Transfer URL - a name that identifies a transport protocol and the path to fetch the file: it is how the GRID moves the file around! • gridftp://storage.egrid.it:2110/home/ecorso/examples/2005/data.txt
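
A toy resolver makes the mapping tangible; in reality the SRM server chooses both the transport protocol and the physical path, so the root and protocol below are hypothetical:

```python
# Toy SURL -> TURL resolver illustrating the namespace split:
# the SURL names the file in GRID storage, the TURL says how to fetch it.
# The physical root and protocol choice here are hypothetical.
from urllib.parse import urlparse

PHYSICAL_ROOT = "/storage/egrid"   # where files actually live on disk

def surl_to_turl(surl, protocol="gsiftp", gridftp_port=2811):
    parsed = urlparse(surl)
    assert parsed.scheme == "srm", "not a SURL"
    physical_path = PHYSICAL_ROOT + parsed.path
    return f"{protocol}://{parsed.hostname}:{gridftp_port}{physical_path}"

print(surl_to_turl("srm://storage.egrid.it:8334/old-stocks/NYSE.txt"))
# -> gsiftp://storage.egrid.it:2811/storage/egrid/old-stocks/NYSE.txt
```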

  20. The SRM protocol Central point: • SRM introduces an abstraction to separate transfer protocol from the file operation itself. • Although introduced to handle the cache area, it also solves classic SE issues! • It decouples file operations from transfer protocol!

  21. The SRM protocol Direct consequence: • SRM servers do not move files in and out of GRID storage! • They only return TURLs! • It is up to the SRM client, once it gets a TURL, to call a GridFTP/RFIO/etc. client to move files! • SRM acts only as a broker for file management requests! • Transfer is decoupled from data presentation!

  22. The SRM protocol Extra features and concepts in the protocol: • The big issue of not running out of space during a large file transfer. • The system is used by the HEP community to store/manage huge amounts of data from the LHC. • SRM therefore introduced a space management and reservation interface.

  23. The SRM protocol • It distinguishes three types of reserved disk space: • Volatile: will be freed by the system as soon as its lifetime expires. • Permanent: will not be freed by the system. • Durable: will not be freed, but the user that allocated it will be warned. • Space type and file type cannot be mixed in arbitrary ways: • Permanent space can host all three types of files. • Volatile space can only host Volatile files. • The general way of working (sketched below): • A space request is made. • The server returns a SpaceToken. • All subsequent SRM calls made by the client pass on the token. • The SRM server keeps track of tokens and recognises allocated space.
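
The token handshake, as a sketch (SrmClient is again a hypothetical stand-in; a real server would do the accounting on its side):

```python
# Sketch of the space reservation flow: reserve, receive a SpaceToken,
# then pass the token with every subsequent data-transfer call.
# SrmClient here is a hypothetical stand-in.
import uuid

class SrmClient:
    def __init__(self):
        self.spaces = {}                        # token -> remaining bytes
    def reserve_space(self, size_bytes, space_type="permanent"):
        token = str(uuid.uuid4())               # the SpaceToken
        self.spaces[token] = size_bytes
        return token
    def prepare_to_put(self, surl, size_bytes, space_token):
        if self.spaces.get(space_token, 0) < size_bytes:
            raise RuntimeError("not enough reserved space")
        self.spaces[space_token] -= size_bytes  # account against the reservation
        return "request-token-for-" + surl

client = SrmClient()
token = client.reserve_space(10 * 1024**3)      # reserve 10 GB up front
client.prepare_to_put("srm://storage.egrid.it:8334/new/file.dat",
                      2 * 1024**3, space_token=token)
```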

  24. The SRM protocol The protocol calls: Data Transfer Functions • A misnomer… no data is moved by an SRM server! • srmPrepareToPut, srmPrepareToGet: for putting a file into GRID storage or getting one out. • srmStatusOfPutRequest, srmStatusOfGetRequest: for polling! • They work on SURLs!

  25. The SRM protocol The protocol calls: Cache area management • srmExtendFileLifeTime for extending the lifetime of volatile files. • srmRemoveFiles to remove permanent files. • srmReleaseFiles, srmPutDone to force early lifetime expiry.

  26. The SRM protocol The protocol calls: Directory functions to manage files on tape • srmRmdir • srmMkdir • srmRm • srmLs • They work on SURLs!

  27. The SRM protocol • The protocol calls: Space management functions • srmReserveSpace • srmReleaseSpace • srmGetSpaceMetaData • The SpaceToken is returned and then used with all data transfer functions.

  28. III. SRM applied to disk storage!

  29. SRM applied to disk storage! • SRM addresses the issues of the classic SE: it is natural to use it also for disk resources. • There was also another important driving force for its adoption: • Many facilities were in place for LHC analysis of data coming from the experiments’ production centres. • These facilities had high-performance storage solutions in place, employing parallel disk file systems such as GPFS and Lustre. • With the advent of GRID technologies it became necessary to adapt the existing installations to the GRID.

  30. SRM applied to disk storage! • The context of operation is now different: • There is no tape with a cache in between. • In general all concepts are kept, with slight semantic adjustments: • The SURL/TURL distinction is kept - it decouples transfer protocol from data presentation, as stated. • The three file types are kept - some files may be copied and live just for a certain amount of time. • Space reservation is kept - it is an important functionality. • Directory functions are kept.

  31. SRM applied to disk storage! Some compromises: • The asynchronous nature of srmPrepareToGet, srmPrepareToPut and srmCopy remains, although it no longer makes sense without staging. • The SpaceType distinction makes less sense: • Arguably the whole disk can be seen as permanent space, which can then host all three file types. • This is akin to tapes, which are permanent by their nature. • Releasing of files and lifetime extension remain for volatile files; srmRemoveFiles, meant for managing cache files, does not make sense.

  32. IV. StoRM SRM implementation

  33. StoRM SRM implementation The result of a collaboration between: INFN - the Grid.IT Project, from the Physics community + ICTP - the EGRID Project, building a pilot national grid facility for research in Economics and Finance (www.egrid.it).

  34. StoRM SRM implementation • StoRM’s implementation of SRM 2.1.1 is meant to meet three important requirements from the Physics community: • Large volumes of data straining disk resources: space reservation is paramount. • Boosted performance for data management: direct POSIX I/O calls. • Security on data as expressed by VOMS: strategic integration with VOMS proxies.

  35. StoRM SRM implementation • EGRID requirements: • Data comes from stock exchanges, with very strict, legally binding disclosure policies: POSIX-like ACL access from the GRID environment. • Promiscuous file access: the existing file organisation on disk must be seamlessly available from the grid, and files entering from the grid must blend seamlessly with the existing file organisation. Very challenging - probably only partly achievable! • StoRM: a disk-based storage resource manager… it allows for controlled access to files - a major opportunity for low-level intervention during implementation.

  36. StoRM SRM implementation • How StoRM solves POSIX-like ACL access from the GRID: • All file requests are brokered through the SRM protocol. • When StoRM receives an SRM request for a file: • StoRM asks the policy source whether the given grid credentials have access rights to the given SURL. • The check is made at the grid credential level, not on a local user as before! And it is done on a grid view of the file, as identified by the SURL (see the sketch below)!
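
Schematically, the brokered check might look like the sketch below; the policy table and its lookup are a deliberate simplification of whatever policy source StoRM actually consults:

```python
# Sketch of StoRM's brokered authorization: the check is on grid
# credentials against a SURL, before any local user is involved.
# The policy table below is a hypothetical simplification.
POLICIES = {
    # (SURL prefix, grid subject DN) -> allowed operations
    ("srm://storage.egrid.it:8334/old-stocks/",
     "/C=IT/O=INFN/CN=Mario Rossi"): {"read"},
}

def is_authorized(surl, subject_dn, operation):
    """Ask the policy source: may these grid credentials do this to this SURL?"""
    for (prefix, dn), ops in POLICIES.items():
        if surl.startswith(prefix) and dn == subject_dn and operation in ops:
            return True
    return False

# Grant a read, deny a write -- all at the grid level, on SURLs.
print(is_authorized("srm://storage.egrid.it:8334/old-stocks/NYSE.txt",
                    "/C=IT/O=INFN/CN=Mario Rossi", "read"))   # True
print(is_authorized("srm://storage.egrid.it:8334/old-stocks/NYSE.txt",
                    "/C=IT/O=INFN/CN=Mario Rossi", "write"))  # False
```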

  37. StoRM SRM implementation • The only part of the implementation outside of the protocol is the Policy Source: a GRID service that is able to formulate/express physical access rules to resources. • StoRM leverages the grid’s file catalogue, the LFC, as policy source: it is intended for Logical File Names, so StoRM stretches its use. Still, it is very GRID-friendly: it is not a proprietary solution! • It would be better to have this explicitly in the SRM protocol: SRM 2.1.1 does have some permission functions (srmSetPermission, srmReassignToUser, srmCheckPermission), but their expressive power is weak, and in the next version of the protocol they will be re-addressed.

  38. StoRM SRM implementation • A last note: physical enforcement through JustInTime ACL setup. • Initially files have no ACLs set up: no user can access them. • The local Unix account corresponding to the grid credentials is determined. • An ACL granting the requested access is set up for the local user. • The ACL is removed when the file is no longer needed.
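
That just-in-time dance maps naturally onto the POSIX setfacl tool; a sketch, with a hypothetical local account and file path:

```python
# Sketch of JustInTime ACL enforcement with POSIX ACLs via setfacl.
# The local account and file path are hypothetical.
import subprocess

def grant_access(path, local_user, perms="rw"):
    """Add an ACL entry so the mapped local user can touch the file."""
    subprocess.run(["setfacl", "-m", f"u:{local_user}:{perms}", path], check=True)

def revoke_access(path, local_user):
    """Remove the ACL entry once the file is no longer needed."""
    subprocess.run(["setfacl", "-x", f"u:{local_user}", path], check=True)

# Typical request lifetime: grant, let the transfer happen, revoke.
grant_access("/storage/egrid/old-stocks/NYSE.txt", "egrid001")
# ... GridFTP transfer happens here, allowed by the ACL ...
revoke_access("/storage/egrid/old-stocks/NYSE.txt", "egrid001")
```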

  39. Advanced topic on data management Thank you!
