
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'SRM 2.2 Issues Well, er, and 2.3 too' - eudora

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
srm 2 2 issues well er and 2 3 too

SRM 2.2 Issues: Well, er, and 2.3 too

Jens Jensen (STFC RAL/GridNet2)

On behalf of GSM-WG

OGF22, Cambridge, MA

This Talk
  • Deviates from previous principles of being for beginners
    • Technical
    • Less polished…
    • May be useful for others…
  • Expose standard and protocol process
    • Not many answers – kickstart (restart) process
  • Combines the two sessions
    • Input (mainly) from dCache, CASTOR, StoRM
  • Revisit specification
    • Implementations’ deviations from OGF specifications
    • Ensure another group can interoperate
    • If someone else were to start from scratch
    • E.g. SRB (ASGC work)
  • Aim is not to start work on 2.3
    • I.e. the aim is not – not the aim is to not, not that aim is not to start
    • If that makes sense
A Very Brief History
  • Spec from 2006
  • Then came implementations
  • Then came WLCG
  • …revisit spec
  • Now getting experiences
  • …revisit spec, highlight issues
  • …think about next steps
  • Manage diverse storage systems (but nothing else)
  • User interface (not admin)
  • Open Standard
    • A standard is not a standard until it is a standard (next slide)
  • Open participation (no fees, no closed societies)
  • Protect storage from Grid?
  • Encourage best practices?
  • Encourage uniformity? Allow diversity?
  • The File is the unit of currency (not datasets)
Compare OASIS
  • “Approved within an OASIS Committee,”
  • “Submitted for public review,”
  • “Implemented by at least three organizations,”
  • “And finally ratified by the Consortium's membership at-large.”
  • We would add that the three implementations “must interoperate”!
  • Wide deployment
  • “Now get experience” with WLCG
  • MoU: Significant changes to spec…
  • Do they make sense? Process.
  • What about smaller customers?
    • …No. In cache does not mean always in cache
Space Tokens on Get
  • srmPrepareToPut uses a space token (description)
  • srmPrepareToGet doesn’t
    • Also for srmBringOnline
  • Problem for many implementations
    • dCache, CASTOR
    • dCache: MSS doesn’t see space token
    • StoRM: not needed
Other get issues
  • Getting directories?
    • Not supported?
    • Or special permissions required?
    • Also to apply for large bulk requests?
Finance Use Cases
  • Ezio Corso (ICTP/E-Grid) (StoRM)
    • Compare EGEE industry liaison
    • “Complexity of financial instruments”
    • “more stringent risking and reporting requirements”
    • “Point solution” grids inefficient (silo)
    • Big computing makes data bottleneck
    • Access control by individuals
  • Access Control on spaces
    • Also to be published in GLUE 1.3 schema as ACBR on VOInfo
  • Reserving subspaces of spaces
  • Summarising spaces for Owner
  • Query space status?
What is a Space Anyway?
  • A collection of at least one physical storage component area?
  • With a common baseline set of capabilities (access latency etc)?
  • Not to even mention “free” space, “used” space, etc.
    • Tricky to define
    • Even more tricky to measure
    • Still more tricky to get agreement
What is a Space Anyway?
  • Is everything a space?
    • Suggestion to have toplevel static spaces
  • Is disk a space? Or can space have disk?
  • Spaces can be named by space token descriptions
    • Always named by space token description?
    • Can be referenced by path? Non-uniquely?
    • Can be referenced (non-uniquely) by capabilities?
  • Is a (static) space an SA?
Space Behaviour
  • What happens if a file is released?
    • Space given back to the Space?
    • Space does not re-grow?
  • Permanent file in limited space?
    • Used to be: not permitted
    • Now, space is shrunk and released
    • Keep token around, or permit recycling?
  • Simple Unixy (POSIX) permissions
  • Default permissions on directories
    • Inheritance from above?
    • Consistent with space permissions, if applicable?
    • Default (per VO?)
  • Permit for roles and groups?
  • Stage in permission (protect write cache)
    • Not the same as reading
  • StoRM calls out to LFC
    • Access control API in SRM not adequate
    • Use LFC’s API
  • Multiple StoRMs can share an LFC
  • => Can synchronise between SE and LFC
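
One possible reading of the default-permission questions above (inherit from above, otherwise fall back to a per-VO default) can be sketched as follows. This is only an illustration: `effective_mode`, the `explicit` mapping and the `vo_default` parameter are hypothetical names, and nothing here is taken from the SRM specification or from any implementation.

```python
import posixpath
from typing import Dict


def effective_mode(path: str, explicit: Dict[str, int], vo_default: int) -> int:
    """Resolve a directory's POSIX-style mode.

    Order of preference (an assumption, not specified behaviour):
    the directory's own explicit mode, then the nearest ancestor's
    explicit mode, then the per-VO default.
    """
    current = posixpath.normpath(path)
    while True:
        if current in explicit:
            return explicit[current]
        parent = posixpath.dirname(current)
        if parent == current:
            # Reached the top of the tree without finding an explicit mode.
            return vo_default
        current = parent


# Hypothetical namespace: only /atlas carries an explicit mode.
explicit_modes = {"/atlas": 0o750}
inherited = effective_mode("/atlas/data/run1", explicit_modes, vo_default=0o700)
fallback = effective_mode("/cms/data", explicit_modes, vo_default=0o700)
```

Whether the result must also be consistent with permissions on the enclosing space is left open here, as it is on the slide.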
Return Codes
  • srmCopy()
Use of GSI authentication
  • Currently using SOAP over GSI sockets
  • GSI needed for delegation
  • Delegation needed for srmCopy() (only)
  • Incompatible with SSL
  • Proposal to use gLite delegation
    • SOAP API specifically for delegation
    • AstroGrid uses a home-made REST-based delegation
  • Not using WS-Anything
    • Many are Java only, too complex, not mature
  • Volatile, Durable, Permanent
  • Should have been:
  • ReleaseWhenExpired, WarnWhenExpired, NeverExpire
    • Avoid confusion with overloaded term from 1.1 – wrongly named in spec.
  • What is done on Durable/WarnWE timeout? (“raise error condition”)
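
The renaming suggested above can be written down as a small lookup. The sketch below pairs each file storage type with the proposed name and attaches an illustrative action on lifetime expiry. The action labels and the function `on_lifetime_expiry` are assumptions made for the example; in particular, what "raise error condition" should mean for Durable files is exactly the open question on the slide.

```python
from enum import Enum


class FileStorageType(Enum):
    """File storage types, with the names the slide says they should have had."""
    VOLATILE = "ReleaseWhenExpired"
    DURABLE = "WarnWhenExpired"
    PERMANENT = "NeverExpire"


def on_lifetime_expiry(file_type: FileStorageType) -> str:
    """Return a label for what a server might do when a file's lifetime ends.

    The labels are an illustrative reading of the proposed names, not
    behaviour mandated by the specification.
    """
    if file_type is FileStorageType.VOLATILE:
        return "release"        # file may be removed and its space reclaimed
    if file_type is FileStorageType.DURABLE:
        return "raise-error"    # file is kept, an error condition is flagged
    return "none"               # permanent files never expire
```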
Access Latency
  • OFFLINE not defined
  • Not used by WLCG
  • But does that mean it doesn’t exist?
  • LOST…
  • Certain aspects of API optional
    • Standard default?
    • Or implementation-defined default?
    • E.g., “default” space
  • Default filesize on put?
    • Is it 1?
    • Is it implementation dependent? Space dependent?
    • Is it returned?
  • Implicit pinning
  • Implicit reservations
  • Implicit lifetimes
  • Implicit changes on action:
  • Implicit changes on expiry
  • Surprising for users?
  • Complicates implementations?
  • What if permission denied for implicit action?
  • What is reasonable?

Explicit but unknown
  • Changing spaces (capabilities)
    • WLCG restricted D1T1 <-> D0T1 (more or less)
Best Practices for Clients
  • Propagate errors to user
  • Clean up after yourself…
    • Even after unclean exit
  • Should SRM use request timeout and keepalive?
    • Cancel at any point?
    • Or only when queueing
  • Was always slightly tricky (also in 1.0 → 1.1)
  • Needs delegation (GSI problem)
  • How and when does client check status
  • What if remote host is not an SRM2?
  • Push modes and pull modes – and firewalls
  • And then the GridFTP modes (push/pull)
  • And the GridFTP streams
  • Can’t always get good results if implementation uses defaults or tries to guess
  • No way to set most parameters
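
For the question of how and when a client checks status, one common answer is to poll with an increasing delay and give up after a fixed budget. The sketch below shows that pattern for an asynchronous request. `wait_for_request`, its parameters and the `check_status` callable are hypothetical; the two pending status strings are SRM 2.2 status codes, but the polling policy itself is an assumption and not something the specification prescribes.

```python
import time
from typing import Callable


def wait_for_request(check_status: Callable[[], str],
                     timeout: float = 300.0,
                     first_delay: float = 1.0,
                     max_delay: float = 60.0,
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll check_status() until the request leaves the pending states.

    The delay doubles after every poll, capped at max_delay. The budget
    is counted in requested sleep time rather than wall-clock time, which
    keeps the function easy to test with a fake sleep.
    """
    pending = {"SRM_REQUEST_QUEUED", "SRM_REQUEST_INPROGRESS"}
    waited = 0.0
    delay = first_delay
    while True:
        status = check_status()
        if status not in pending:
            return status       # success or failure: report it to the user
        if waited >= timeout:
            raise TimeoutError(f"request still {status} after {waited:.0f}s")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)
```

A client following the best practices above would pass the final status on to the user and abort the request if the timeout is raised.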
srmLs problem
  • Classical problem with large directories
  • Exercise: on a normal filesystem, run ls -R dir on a tree with large directories. While you wait, try to use the system.
  • Large data volumes in SOAP
    • Attachment supported?
  • Truncate, offset
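
The "truncate, offset" idea amounts to listing a large directory in bounded pages instead of one huge response. A minimal client-side sketch follows, assuming a hypothetical `list_page(path, offset, count)` callable that returns at most `count` entries starting at `offset`. The fake server function exists only to make the example runnable and does not model any real SRM endpoint.

```python
from typing import Callable, Iterator, List

# Hypothetical signature: return up to `count` entries starting at `offset`.
ListPage = Callable[[str, int, int], List[str]]


def list_directory(path: str, list_page: ListPage, count: int = 1000) -> Iterator[str]:
    """Yield every entry of a directory, fetching at most `count` per call."""
    if count <= 0:
        raise ValueError("count must be positive")
    offset = 0
    while True:
        page = list_page(path, offset, count)
        yield from page
        if len(page) < count:
            return              # a short page means the listing is complete
        offset += len(page)


def fake_list_page(path: str, offset: int, count: int) -> List[str]:
    """Stand-in for a server holding ten entries under any path."""
    entries = [f"{path}/file{i:03d}" for i in range(10)]
    return entries[offset:offset + count]


names = list(list_directory("/data", fake_list_page, count=3))
```

Each response stays small, at the cost of extra round trips and of possible inconsistency if the directory changes between pages.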
Which bits are optional…?
  • Many features
  • Most parameters
  • TExtraInfo
Next Steps
  • Continue this process
  • Define terminology
  • Assess “damage”

  • No, not yet
  • Too soon, not enough experience with 2.2
  • Adaption difficult

  • Do nothing
  • Too late (WLCG)
  • Document differences
  • Retrofit things into 2.2
  • Add to 2.2 (incremental)
  • Postpone to “2.3”
  • Postpone to 3.1
Future Stuff
  • WSRF
    • Rich Wellner (2004)
    • (WSRT?)
  • Avoid duplication
  • Compare OGSA-D-Arch
    • Proposes modular architecture for data
More Capabilities
  • Integrity checking
    • Act when integrity checking fails?
  • Service description, agreement (dynamic)
  • File content
  • Data sets, chunks
  • Dynamic resource allocation
    • Networks, additional storage, disk servers (now known as virtualisation)
    • Recovery