LCG2 Deployment Status

LCG2 Core Sites
• 8 Core Sites (a limited number of sites -> fast reaction to changes)
  • FZK
  • Barcelona-PIC (CASTOR MSS access)
  • FNAL (ENSTORE MSS access via dCache)
  • CNAF (CASTOR MSS access)
  • NIKHEF (MSS access via dCache (SARA))
  • Taipei
  • RAL
  • CERN (CASTOR MSS access, RLS endpoints)
• Currently released and installed: the LCG2 pre-release (~3 weeks ago)
  • Improved versions of RM, RLS, WP1 and the information system (faster, more reliable)
  • No "classic" SE distributed (SRM was assumed to be operational soon for disk and MSS)
  • New RLS production endpoints, which are not schema-compatible with LCG1
  • Information provider for CASTOR MSS and GRIS
  • Experiment software distribution mechanism

Grid Deployment Board – 10 February 2004 - 1
LCG2 pre-release: Problems
• SE with SRM
• Interoperability of the tools for data location and access
  • Several problems discovered: RM, POOL, SRM, CASTOR-gridFTP
• Some initial problems with the setup required for the experiment software distribution
  • Config errors (mostly fixed)
  • Write access to the shared file system used (AFS)
    • Affects mainly the CERN LSF site
Storage Element: SE
• The plan was to base LCG2 from the start on SRM-managed storage
• CASTOR SRM with MSS backend
  • Tested at CERN; basic functionality present
  • advisoryDelete not available, but used by RM
• CASTOR SRM for disk-based SEs
  • Was packaged by GD; no longer supported
• dCache SRM with ENSTORE backend at FNAL
  • Basic SRM tests work; the GFAL-dCache interface seems OK
  • RM tests in preparation
  • advisoryDelete not implemented, but the developers are working on it
• dCache SRM for disk pools
  • RAL has now finished packaging; tests started on Monday at CERN
  • advisoryDelete NOT implemented, but mandatory for a disk system
    • the functionality is provided via the gridFTP interface, and RM can access storage either through SRM or gridFTP
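The advisoryDelete gap above admits a client-side workaround: since the same storage is reachable both through SRM and through gridFTP, a deletion helper can try SRM first and fall back to the gridFTP interface when the SE does not implement the call. A minimal sketch of that pattern, with `srm_delete` and `gridftp_delete` as injected callables standing in for the real middleware clients (all names hypothetical, not the actual RM code):

```python
def delete_replica(sfn: str, srm_delete, gridftp_delete) -> str:
    """Delete a replica identified by its SFN.

    Tries the SRM advisoryDelete first; if the SE does not implement it
    (modelled here as NotImplementedError), falls back to deleting
    through the gridFTP interface. Returns which path was used.
    """
    try:
        srm_delete(sfn)
        return "srm"
    except NotImplementedError:
        # SE lacks advisoryDelete (e.g. the dCache disk-pool SRM above)
        gridftp_delete(sfn)
        return "gridftp"
```

Injecting the two clients keeps the fallback logic testable without any grid middleware installed.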
SE: New Plan
• Start with a "classic" gridFTP SE everywhere and use the RM tools
• CASTOR with MSS backend
  • Tests last week showed that a fix for the CASTOR gridFTP server was needed (done and tested)
  • A small remaining problem with access to the root storage dir. is fixed by the latest RM version
• Disk-only classic SE
  • RM test between the Taipei SE and CASTOR done; OK
• dCache
  • Tests needed to verify RM interoperability
  • No access yet to a gridFTP-interfaced dCache system
  • CERN is setting up a dCache system for tests, using the RAL distribution
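The RM interoperability test mentioned above (a copy between the Taipei SE and CASTOR) reduces to two steps: transfer the file between the two SEs over gridFTP, then record the new replica location in the catalogue. A toy model of that flow, with the catalogue as a plain dict standing in for RLS and `transfer` standing in for the gridFTP client (all names hypothetical):

```python
def copy_and_register(catalogue: dict, guid: str,
                      source_sfn: str, dest_sfn: str, transfer) -> list:
    """Copy a replica between two SEs and register the new location.

    catalogue maps a GUID to its list of SFNs (toy stand-in for RLS);
    transfer is an injected callable standing in for the gridFTP client.
    Returns the updated replica list for the GUID.
    """
    transfer(source_sfn, dest_sfn)                    # physical copy over gridFTP
    catalogue.setdefault(guid, []).append(dest_sfn)   # register the new replica
    return catalogue[guid]
```

The point of the sketch is the ordering: the catalogue entry is only added after the transfer succeeds, so a failed copy never leaves a dangling replica record.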
What does this mean?
• We now have the components for basic functionality in hand and can start
• For applications using the higher-level tools (RM), the change to SRM SEs will be transparent (conversion of the RLS entries has to be done by us)
• Disk-only SEs can be seen as staging buffers only
  • Migration of data and/or catalogue entries is almost certainly required

Next Steps
• dCache disk pool managers will be deployed as disk-only SEs
  • access via gridFTP, to be compatible with the "classic" SE
  • managed storage for tier2 sites
• SRM interfaces will be activated for MSS and disk SEs after
  • thorough testing of all components and their interoperability
  • the changes needed to provide the required functionality of a full SE are implemented
• In the last few days it became clear that the tools aren't as mature as was expected
RM/RLS/POOL Compatibility
• An RLS schema change made attribute entries case-insensitive
  • An RLS change that fixes this is available; it needs testing
  • A test server will be deployed on Tuesday by the DB group
• The EDG Replica Manager and POOL use different prefixes
  • RM uses:
    • LFN - lfn:this-is-my-logical-file-name
    • GUID - guid:73e16e74-26b0-11d7-b1e0-c5c68d88236a
    • SFN - srm://lxshare0384.cern.ch//flatfiles/cms/data/05/x.dat
  • POOL uses no prefix and supports the ROOT format
    • Protocol prefix: rfio:// or dcap://
  • Entries from POOL and RM can therefore be incompatible
• A short-term fix is now available
  • RM has been changed to drop the prefix for LFNs and GUIDs (some more tests needed)
• The long-term solution requires a better understanding of the relation between POOL/RM/RLS
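The short-term fix above amounts to normalising identifiers before they are compared or written to the catalogue. A minimal sketch of such a normaliser, using the prefix conventions listed on this slide (this helper is illustrative, not the actual RM patch):

```python
def normalize_identifier(name: str) -> str:
    """Strip the RM-style 'lfn:' or 'guid:' prefix so that entries
    written by the EDG Replica Manager and by POOL (which uses no
    prefix) can be compared on equal terms.
    """
    for prefix in ("lfn:", "guid:"):
        if name.startswith(prefix):
            return name[len(prefix):]
    return name  # already unprefixed (POOL style), or an SFN/protocol URL
```

Applied on every catalogue lookup, this makes `lfn:this-is-my-logical-file-name` and the bare POOL form resolve to the same entry.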
Plan for the next few days, based on the available tools
• Upgrade sites to the C&T version on Monday
• Return of the classic SE
  • Disk-only at all sites except those that run CASTOR
  • Publish the CASTOR gridFTP MSS SEs at CNAF, CERN and PIC
• Additional features of this version (compared with the pre-release)
  • Sites with disk-only storage can provide staging space
  • RM and the SEs interoperate
  • An alternative additional BDII that allows easy inclusion and exclusion of sites
  • New version of GridICE monitoring
• Not covered by this version
  • POOL <-> RM interoperability
• Resources
  • Limited (< 50 nodes)
  • Can be added when the functionality is adequate for ALICE
  • CERN has approx. 150 nodes reserved for this
• This version should enable ALICE to start the data challenge
Plan for the next 1-3 weeks
• After testing, move to interoperating versions of POOL, RLS and RM
  • This requires an upgrade of the RLS servers at CERN
  • This is the starting version for CMS (it will be available earlier on the EIS testbed)
• After receiving and testing the dCache disk managers, add them as SEs at tier2 centres
• Integrate the CERN LSF nodes (first tests successful)

Plan from then on
• add resources (sites and nodes)
• add functionality that doesn't interfere with the data challenges
  • proper accounting, better monitoring etc.
• switch to the new LCG-GIIS based information system

Medium term
• convert to SRM
• GFAL with more than PFNs
Taking ownership of the EDG code used in LCG
• RSYNC from the EDG CVS repository was stopped on Monday
• Autobuild setup at CERN
  • Almost working; only a few packages fail
• During the next few weeks we will switch to the LCG build system