
Information and GLUE schema




Presentation Transcript


  1. Information and GLUE schema CASTOR ext’l conf CERN 13-15 Nov 2006

  2. What is a Storage Element?
  • Grid interface to storage
  • Provides:
    • A control protocol, e.g. SRM 1.1 or SRM 2.2
    • A data transfer protocol, e.g. GridFTP or RFIO
    • An information service (this talk)

  3. Storage & The Grid…
  [Diagram: the Grid storage stack — higher-level tools (RM, FPS, PhEDEx, Fireman, lcg-*, GFAL, FTS, LFC, srmcp) and the Computing Element talk to SEs, which front Storage via GridFTP.]

  4. Why an information service (theory)
  • To enable clients to discover:
    • Which protocols (control & data transfer) the SE supports
    • How much space is “available” and “used” (more…)
    • Storage with specific properties (more…)
    • Availability to my VO

  5. Why an information service (practice)
  • Low-level, SRM-only tools work without it:
    • srmcp
  • Higher-level Grid tools cannot access the SE without it:
    • lcg-*
    • GFAL
  • SRM implementers implement SRMs :-)
    • Sometimes without knowing the higher-level Grid software stack :-(

  6. What is the information service?
  • An LDAP implementation
    • (In GT2; OGSI in GT3, WSRF in GT4)
  • Globus MDS (“Monitoring and Discovery Service”)
  • Each Element provides a GRIS (“Globus Resource Information Service”)
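To make the LDAP/GRIS idea concrete, here is a minimal sketch of parsing the LDIF a GRIS might return for an SE entry. The attribute names follow the GLUE 1.x naming convention; the host and values are invented for illustration.

```python
def parse_ldif(text):
    """Parse a single LDIF entry into a dict of attribute -> list of values."""
    entry = {}
    for line in text.strip().splitlines():
        attr, _, value = line.partition(": ")
        entry.setdefault(attr, []).append(value)
    return entry

# Illustrative LDIF fragment (hostname and versions are made up)
sample = """\
dn: GlueSEUniqueID=srm.example.org,mds-vo-name=local,o=grid
GlueSEUniqueID: srm.example.org
GlueSEImplementationName: CASTOR
GlueSEImplementationVersion: 2.1.1
GlueSEStatus: Production
"""

se = parse_ldif(sample)
print(se["GlueSEImplementationName"])  # ['CASTOR']
```

A real client would issue an `ldapsearch` against the GRIS rather than parse a literal string; the point is only the attribute/value shape of what gets published.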

  7. Crash Course in GLUE Storage Schema: Version 1.2 Overview
  • An SE has:
    • Control Protocol (1..*)
    • Access Protocol (1..*)
    • SA (Storage Area, 1..*) with a Policy
  • SA State: volatile, durable, permanent
  • SA AccessControlBase:
    • In theory, each SA could publish multiple VOs
    • In practice, each SA is published once per VO
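The "once per VO" practice above can be sketched as follows: one underlying storage area shared by several VOs ends up as one GlueSA-like record per VO. The pool name, VO list, and paths are invented for illustration.

```python
def publish_sa_per_vo(sa_id, vos):
    """Emit one GlueSA-like record per VO, as GLUE 1.2 deployments did in practice."""
    return [
        {
            "GlueSALocalID": f"{sa_id}:{vo}",
            "GlueSARoot": f"{vo}:/castor/{vo}",          # illustrative path
            "GlueSAAccessControlBaseRule": vo,
        }
        for vo in vos
    ]

records = publish_sa_per_vo("shared-disk", ["atlas", "cms", "lhcb"])
print(len(records))  # 3 published entries for one underlying storage area
```

This duplication is exactly what the 1.3 VOInfo element (next slide) is meant to avoid.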

  8. Crash Course in GLUE Storage Schema: Version 1.3 Proposal Overview
  • An SE has:
    • Control Protocol (1..*)
    • Access Protocol (1..*)
    • SA (1..*) with a Policy
  • SA: State, RetentionPolicy, AccessLatency, ExpirationMode
  • Each SA has VOInfo (1..*) with AccessControlBase

  9. SEs in 1.3
  • Status: Production, Draining, Closing, Queueing
    • Analogous to the CE
  • Total online and nearline size
  • Implementation name and version
    • Can query the grid to see who has upgraded

  10. Types of storage
  • In GLUE 1.2:
    • FileLifetime: volatile, durable, permanent
  • In GLUE 1.3:
    • RetentionPolicy: replica, output, custodial
    • AccessLatency: offline, nearline, online
    • ExpirationMode: releaseWhenExpired, warnWhenExpired, neverExpire
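The 1.2 FileLifetime values correspond roughly to the 1.3 RetentionPolicy values. The one-to-one mapping sketched below is a common approximation, not something the schema itself defines, so treat it as an assumption.

```python
# Approximate correspondence between GLUE 1.2 FileLifetime and
# GLUE 1.3 RetentionPolicy (an informal mapping, not part of the schema)
LIFETIME_TO_RETENTION = {
    "volatile": "replica",
    "durable": "output",
    "permanent": "custodial",
}

# The remaining 1.3 vocabularies, verbatim from the slide
ACCESS_LATENCIES = ("offline", "nearline", "online")
EXPIRATION_MODES = ("releaseWhenExpired", "warnWhenExpired", "neverExpire")

print(LIFETIME_TO_RETENTION["permanent"])  # custodial
```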

  11. LCG Storage Classes
  • Tape is custodial
    • But custodial doesn’t have to be tape
  • Tape is nearline (or offline)
  • Publishing LCG classes in 1.3:
    • Tape1Disk1 → online, custodial
    • Tape0Disk1 → online, replica (or output)
    • Tape1Disk0 → nearline, custodial
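The three publishing rules above can be captured in a small helper. The parsing function is hypothetical; only the three mappings listed on the slide are given by the source.

```python
def lcg_class_to_glue(storage_class):
    """Map an LCG storage class like 'Tape1Disk0' to (AccessLatency, RetentionPolicy).

    A sketch of the publishing rules above: any disk copy means online
    access, and any tape copy means custodial retention.
    """
    tape = int(storage_class[4])    # M in "Tape{M}Disk{N}"
    disk = int(storage_class[-1])   # N in "Tape{M}Disk{N}"
    latency = "online" if disk >= 1 else "nearline"
    # Tape0Disk1 could also publish "output"; "replica" is the simpler choice
    retention = "custodial" if tape >= 1 else "replica"
    return latency, retention

print(lcg_class_to_glue("Tape1Disk0"))  # ('nearline', 'custodial')
```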

  12. Accounting for “free” space
  • Are multiple copies on disk recyclable?
    • Files with expired pins
    • “Volatile” files with expired lifetimes
  • Deleted files or other gaps on tape
  • Race when free space is used for selection (flocking)
  • Free != Available — across resources, 1 + 1 != 2
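The "1 + 1 != 2" point is easy to illustrate: free space summed across pools is not the space available to a single new file, because no one file can span pools. Pool names and sizes below are invented.

```python
# Two disk pools, each with 1 GB free (illustrative numbers)
pools = {"pool-a": 1, "pool-b": 1}

total_free = sum(pools.values())           # what naive accounting reports: 2 GB
largest_single_file = max(pools.values())  # what one new file can actually use: 1 GB

print(total_free, largest_single_file)  # 2 1 -- "free" is not "available"
```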

  13. Accounting for “used” space
  • Are deleted files “used”?
    • A gap (on tape) may not be reclaimable
  • Disk overhead?
  • Is Tape1Disk1 counted twice?
    • Account for nearline and online separately
  • Are multiple pinned copies “used”?
    • Internal optimisation vs DiskN for N>1

  14. Accounting
  • Also publish reserved space (?)
    • Reserved but unused space is not available
  • Total size
    • Also separate for nearline (tape) and online (disk)
  • Is Total = Available + Used + Reserved?
    • Not necessarily

  15. Accounting
  • srmGetSpaceMetadata
    • “default” space for space that isn’t
  • TMetaDataSpace:
    • RetentionPolicy
    • “Owner”
    • Total, guaranteed, used size
    • “Lifetime” (in seconds)
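The TMetaDataSpace fields listed above can be sketched as a record type. Field names below only loosely follow the SRM 2.2 naming, and the example values are invented.

```python
from dataclasses import dataclass

@dataclass
class TMetaDataSpace:
    """Sketch of the per-space metadata returned by srmGetSpaceMetadata."""
    retention_policy: str    # replica, output or custodial
    owner: str
    total_size: int          # bytes
    guaranteed_size: int     # bytes
    used_size: int           # bytes
    lifetime_assigned: int   # seconds

# Illustrative values: a 1 TB custodial space, 400 GB used, one-day lifetime
space = TMetaDataSpace("custodial", "atlas", 10**12, 10**12, 4 * 10**11, 86400)
print(space.retention_policy)  # custodial
```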

  16. Implementation at RAL
  • First version accounted for disk only (CK)
    • Fairly hairy query
  • Second query accounted for tape (JJ)
    • Queries the vmgr db only
    • Assumes SAs do not share tape pools
    • Counts deleted files and disabled tapes as “used”
    • Counts compressed data
    • Even hairier query; not deployed yet

  17. Implementation at CERN
  • Jean-Philippe wrote his own query
    • Uses the name server
    • Counts compressed files on tape

  18. Implementation TODO
  • Needed since October ’05
  • Adapt to new interpretations of {available-free, used, reserved, total}
  • Decide how we map between:
    • LCG service classes (TapeMDiskN)
    • The StorageArea in 1.2
    • The StorageArea in 1.3
    • Service classes (and other internals)

  19. Implementation TODO
  [Diagram: how per-VO storage areas (default, Atlas, CMS, LHCb) map between an SA in 1.2, an SA in 1.3 with its VOInfo entries, and a possible SA in 2.0]

  20. Spaces in the 1.3 information system
  • Select VOInfo:
    • In: VO name OR AccessControl (e.g. an FQAN)
    • Out: Path OR space token description
  • Select SA:
    • In: qualities
    • Out: find the appropriate VOInfo
  • Missing: selection by protocol (in 2.0?)
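The VOInfo selection step can be sketched as a filter over published records: look up by VO name or by AccessControl rule, then read off the path or space token. The records below (VOs, FQANs, paths, tokens) are invented.

```python
# Illustrative VOInfo records as a client might see them after an LDAP query
voinfos = [
    {"vo": "atlas", "acbr": "VOMS:/atlas/Role=production",
     "path": "/castor/atlas", "token": "ATLAS_TAPE"},
    {"vo": "cms", "acbr": "VOMS:/cms",
     "path": "/castor/cms", "token": "CMS_DISK"},
]

def select_voinfo(records, vo=None, acbr=None):
    """Return the VOInfo entries matching a VO name or an AccessControl rule."""
    return [r for r in records
            if (vo is not None and r["vo"] == vo)
            or (acbr is not None and r["acbr"] == acbr)]

match = select_voinfo(voinfos, vo="cms")
print(match[0]["token"])  # CMS_DISK
```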

  21. Implementation TODO
  • The SA is a space
  • In 1.3, each SA publishes multiple VOs
    • Compatibility problem with SA.Path
    • Clients should locate the VOInfo for their SA; the new version is VOInfo.Path
    • If clients need SA.Path, we still need to publish each SA once per VO
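The SA.Path / VOInfo.Path compatibility issue can be sketched as a hypothetical lookup helper: prefer the 1.3 VOInfo.Path when a VOInfo record exists, and fall back to the legacy 1.2 SA.Path otherwise. The paths are invented.

```python
def resolve_path(sa, voinfo=None):
    """Prefer the GLUE 1.3 VOInfo.Path; fall back to the legacy 1.2 SA.Path."""
    if voinfo and voinfo.get("Path"):
        return voinfo["Path"]
    return sa.get("Path")  # legacy per-VO SA location

sa = {"Path": "/castor/lhcb"}           # 1.2-style per-VO SA (illustrative)
voinfo = {"Path": "/castor/lhcb/prod"}  # 1.3-style VOInfo (illustrative)

print(resolve_path(sa, voinfo))  # /castor/lhcb/prod
print(resolve_path(sa))          # /castor/lhcb  (old client, no VOInfo)
```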

  22. Conclusion
  • Information is important!
  • More work is needed
    • Test with higher-level middleware
    • Track the ongoing GLUE process
