
LHCb input to DM and SM TEGs


bryga

Presentation Transcript


  1. LHCb input to DM and SM TEGs

  2. Introduction
     • We have already provided some input during our dedicated session of the TEG
     • Here is a list of questions we want to address for each session
       • Not exhaustive, limited to one slide per session
     • It would have been useful to hear the proposals for the evolution of SM and DM as well
       • These should be proposals, not diktats
       • Extensive discussions are needed with users and sites before embarking on full-scale “prototypes” that are no longer prototypes
     • Whatever the future is, WLCG must ensure long-term support (no external “firm”)
     • Do not underestimate the amount of work needed for users to adapt
       • Therefore plan well ahead, gather functionality requirements…
     F2F Data and Storage Management TEG, Amsterdam

  3. Data Placement and Federation
     • Does it imply replica catalogs are no longer needed?
     • How is brokering of jobs done?
       • Random or “where most files are”
     • When is data transfer performed?
       • By the WMS: implies a priori brokering, incompatible with pilot jobs
       • By the job: inefficient use of slots
     • Is it a cache (lifetime?) or just a download facility?
     • What is the advantage compared to placement using popularity?
       • Limited number of “master replicas” (e.g. 2)
       • Add replicas when popularity increases
       • Remove replicas when popularity decreases
       • … but still with a catalog and job brokering
     • What is the minimal number of sites for which it becomes worth it?
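The popularity-based alternative on this slide (a small fixed number of master replicas, more replicas as popularity rises, fewer as it falls) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the function names, the accesses-per-week metric, the threshold of 100 accesses per extra replica, and the cap of 6 replicas do not come from the slides.

```python
def target_replica_count(accesses_per_week, masters=2,
                         accesses_per_extra=100, max_replicas=6):
    """Return how many replicas a dataset should have.

    Assumed policy (hypothetical): always keep `masters` replicas, add one
    replica per `accesses_per_extra` weekly accesses, cap at `max_replicas`.
    """
    if accesses_per_week < 0:
        raise ValueError("accesses_per_week must be non-negative")
    extra = accesses_per_week // accesses_per_extra
    return min(masters + extra, max_replicas)


def plan_replica_change(current, accesses_per_week, **policy):
    """Return +n to add n replicas, -n to remove n, 0 to leave as is."""
    return target_replica_count(accesses_per_week, **policy) - current
```

A dataset that is no longer read falls back to the master replicas, so `plan_replica_change(4, 0)` asks for two removals, while a catalog and job brokering are still needed to make use of the result.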

  4. WAN protocols and FTS
     • We need third-party transfer!
       • HTTP third-party transfer? OK if commercial, why support it ourselves?
     • We need a transfer “batch” system!
       • Asynchronous bulk file transfer
       • Whatever reliable and efficient underlying protocol is used is just fine…
       • There is a need to define the service class where the file has to be put (or one service class per SE)
     • What about the dedicated network (OPN)?
       • Requires a service for using it?
     • Not all bells and whistles may be necessary
     • The real point for a user (experiment) is:
       • Transfer this list of LFNs to this SE (SE = storage class at site)
       • The actual physical source is irrelevant
       • The TS should discover whether there is an online replica; if not, it should bring one online before making the transfer
       • Ideally it (or the SE) should register the new replica (keep consistency)
     • All this was already said in… Mumbai (February 2006)!
     • FTS 3 was looking promising… why is it dead?
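The user-level request described on this slide ("transfer this list of LFNs to this SE", with source selection, staging and registration left to the service) can be sketched with a toy in-memory model. This is not FTS or any real transfer service: the class names, the SE names and the log format are all hypothetical, and the actual copy is only recorded, not performed.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    se: str        # storage element name (hypothetical)
    online: bool   # False means the copy is on tape only


@dataclass
class ToyTransferService:
    # LFN -> list of known replicas; stands in for the replica catalog
    catalog: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def transfer(self, lfns, target_se):
        """Replicate each LFN to target_se; the caller never names a source."""
        for lfn in lfns:
            replicas = self.catalog.get(lfn, [])
            if not replicas:
                self.log.append(("failed-no-source", lfn))
                continue
            if any(r.se == target_se for r in replicas):
                self.log.append(("already-there", lfn))
                continue
            # Prefer an online replica; otherwise stage one from tape first.
            source = next((r for r in replicas if r.online), None)
            if source is None:
                source = replicas[0]
                source.online = True
                self.log.append(("bring-online", lfn, source.se))
            self.log.append(("copy", lfn, source.se, target_se))
            # Register the new replica so the catalog stays consistent.
            replicas.append(Replica(target_se, online=True))
```

The design point is that the request carries only LFNs and a destination; the physical source stays an internal decision of the service.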

  5. Management of Catalogues and Namespaces
     • See Data placement…
     • Do we need a replica catalog?
       • LHCb answer is YES: we want to be able to do brokering of jobs
       • May only contain information on the SEs (+ file metadata + usability flags)
     • Do we need a catalog with URLs?
       • Not necessarily: the URL can be formed from the SE information and the LFN (trivial catalog), as SE information is quite static
     • Do we need a single URL (used for transfers and for protocol access)?
       • No problem as long as the access is transparent and fast
       • See SRM slide for more comments…
     • Namespace vs storage class?
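A minimal sketch of the "trivial catalog" idea: the replica catalog stores only which SEs hold each LFN, and the URL is derived from static SE information plus the LFN. The same catalog is then enough for brokering by "where most files are". All SE names, hosts, protocols and paths below are invented for illustration, and the URL layout is an assumption rather than the convention of any particular storage system.

```python
# Static per-SE description (all values hypothetical).
SE_INFO = {
    "SITE-A-DISK": {"protocol": "root", "host": "se.site-a.example", "base": "/storage"},
    "SITE-B-DISK": {"protocol": "https", "host": "se.site-b.example", "base": "/data"},
}

# The replica catalog only needs LFN -> SE names, no URLs.
REPLICAS = {
    "/lhcb/data/run1/file1.dst": ["SITE-A-DISK", "SITE-B-DISK"],
    "/lhcb/data/run1/file2.dst": ["SITE-A-DISK"],
}


def build_url(se_name, lfn):
    """Form the access URL from the SE description and the LFN."""
    if not lfn.startswith("/"):
        raise ValueError("LFN must be an absolute path")
    se = SE_INFO[se_name]
    return f"{se['protocol']}://{se['host']}{se['base']}{lfn}"


def rank_ses_for_job(lfns):
    """Order SEs by how many of the job's input files they hold."""
    counts = {}
    for lfn in lfns:
        for se in REPLICAS.get(lfn, []):
            counts[se] = counts.get(se, 0) + 1
    # Most files first; ties broken by name for a stable result.
    return sorted(counts, key=lambda se: (-counts[se], se))
```

Because the SE description changes rarely, it can live in configuration, and the catalog itself never has to be rewritten when a site changes its host or protocol.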

  6. Security and Access Control
     • We MUST protect our data from deletion
       • LHCb doesn’t care so much about protecting it from access
     • The current situation is UNACCEPTABLE
       • Anyone with a little knowledge (available on the web) can delete all files in Castor!
     • VOMS (or equivalent) identification and authorisation is a MUST! What about ARGUS?
       • Identity and role
       • Currently in Castor we have only 2 (uid, gid)!
     • Protection is done by the LFC, but all backdoors are open
       • Backdoors should be closed (nsrm, stager_xxx active commands…)
     • An explicit “delete” permission would be desirable
     • Change of DN should be straightforward (not trivial, but OK in LFC, DPM)
       • Action from VO administrator
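The request for identity-plus-role authorisation with an explicit "delete" permission can be sketched as below. This is a toy policy table, not the VOMS, ARGUS, LFC or Castor API; the role names, the DN and the policy contents are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identity:
    dn: str     # certificate distinguished name
    vo: str     # virtual organisation, e.g. "lhcb"
    role: str   # VOMS-style role (names here are hypothetical)


# Hypothetical policy: permission -> set of (vo, role) pairs allowed.
# "delete" is listed separately from "write" so it can be granted on its own.
POLICY = {
    "read": {("lhcb", "user"), ("lhcb", "production")},
    "write": {("lhcb", "production")},
    "delete": {("lhcb", "production")},
}


def is_allowed(identity, permission):
    """Check a permission against the (vo, role) pair, never the DN."""
    return (identity.vo, identity.role) in POLICY.get(permission, set())
```

In this sketch the policy is keyed on VO and role rather than on the DN, so a user whose DN changes keeps the same rights without any change to the permissions on files.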

  7. Separation of disk and tape
     • We need two (and only two) storage classes:
       • T1D0 and T0D1
       • This is because no space (storage class) change is possible in some implementations of SRM
     • T1D0 has two functions:
       • Archive of T0D1
       • Permanent storage of read-few data (RAW, RECO)
     • For this the BringOnline functionality is mandatory
       • We need to access the data directly from the MSS without a need for replication onto another storage
     • Pinning is also a must (suboptimal usage of tape drives without it)
       • Helps the garbage collector

  8. Storage Interfaces: SRM and Clouds
     • Clouds???
     • We need an interface for:
       • BringOnline
       • Pinning
       • Defining the storage class (unless different endpoints are used, i.e. different SEs)
     • Currently (Mumbai) this is done by gfal (what is its future?)
     • SRM is far from perfect but…
       • It provides the above
       • All efforts put into defining a standard were a miserable failure… don’t expect that any other interface will be any better
     • … but…
       • We could probably wrap the minimal functionality on top of the SE native interface, if available
       • BringOnline and pinning are not available for dCache except in SRM
     • Can xroot provide this functionality?
       • Isn’t there a danger that it becomes as clumsy as SRM, depending on the implementation?
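The "minimal functionality" asked for here (BringOnline, pinning, choosing the storage class) could be wrapped as a small abstract interface with one implementation per SE technology. The sketch below is only an illustration: the class and method names are hypothetical, it is not the SRM, gfal or xroot API, and the toy backend assumes that T0D1 files are always on disk while T1D0 files start on tape only.

```python
from abc import ABC, abstractmethod

STORAGE_CLASSES = {"T1D0", "T0D1"}


class MinimalStorage(ABC):
    """The three operations the slide asks for, as abstract methods."""

    @abstractmethod
    def put(self, path, storage_class):
        """Store a file in the given storage class."""

    @abstractmethod
    def bring_online(self, path):
        """Stage a tape-resident file to disk."""

    @abstractmethod
    def pin(self, path, lifetime_s):
        """Ask that the online copy stays on disk for lifetime_s seconds."""


class ToyStorage(MinimalStorage):
    """In-memory stand-in for a wrapper around an SE native interface."""

    def __init__(self):
        self.files = {}

    def put(self, path, storage_class):
        if storage_class not in STORAGE_CLASSES:
            raise ValueError(f"unknown storage class: {storage_class}")
        # Assumption: only T0D1 files are online right after being written.
        self.files[path] = {"class": storage_class,
                            "online": storage_class == "T0D1",
                            "pin_s": 0}

    def bring_online(self, path):
        self.files[path]["online"] = True

    def pin(self, path, lifetime_s):
        entry = self.files[path]
        if not entry["online"]:
            raise RuntimeError("file must be brought online before pinning")
        entry["pin_s"] = lifetime_s
```

Keeping the interface this small is the point of the sketch: anything beyond these three calls would be left to the native interface of each SE.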

  9. Storage IO, LAN Protocols
     • What is wrong with the POSIX file: protocol?
       • Very efficient in the StoRM-GPFS implementation used at CNAF
       • Of course abuse could be a danger (recursive ls), but this could be taken into account in the implementation (throttling)
       • Almost anybody can now write a FUSE plugin to make it happen, so why not use a powerful commercial protocol?
     • Should access protocols be more than protocols?
       • i.e. interact behind the scenes with the MSS, discover the file location, etc.
     • Can a tURL be just an access point?
       • <protocol>://diskserver.cern.ch//<path>
       • … or better file:<path>
     • Avoid accepting URLs like “/castor/cern.ch/…”
       • Needs to be fixed in the application layer? No “guesses”?
     • Do we need different URLs for different operations?
       • Transfer and POSIX-like access
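The "no guesses" point about tURLs can be made concrete with a small check at the application layer: accept only `<protocol>://<host>/<path>` or `file:<path>`, and reject a bare path that would force the application to guess the protocol. This is a minimal sketch using Python's standard `urllib.parse`; the function name and the acceptance rules are assumptions, not an LHCb or WLCG specification.

```python
from urllib.parse import urlsplit


def is_acceptable_turl(url):
    """Return True for '<protocol>://<host>/<path>' or 'file:<absolute path>'."""
    parts = urlsplit(url)
    if parts.scheme == "file":
        # Local POSIX access: only an absolute path is required.
        return parts.path.startswith("/")
    # Remote access: protocol, host and absolute path must all be explicit.
    return bool(parts.scheme) and bool(parts.netloc) and parts.path.startswith("/")
```

A bare path beginning with `/castor/cern.ch/` has neither protocol nor host, so it is rejected here rather than being silently mapped onto some default access method.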

  10. Conclusion
     • Whatever the future is:
       • Consider that we need a permanently running system
       • No disruption of service of more than 24 hours for migration
       • Millions of replicas are in the current system and… it works (even when not optimal)
     • Any drastic change requires a lot of work on the user side (experiments’ frameworks)
       • Old and new systems must exist in parallel
     • Requirements may differ depending on the Computing Model of each experiment
       • 7 to 12 analysis centers (LHCb) is different from 50 to 70 centers
       • Solutions may not be universal, and complication may not be required
