Experiment Software Installation toolkit on LCG-2

www.eu-egee.org
EGEE is a project funded by the European Union under contract IST-2003-508833

The current implementation of the general schema discussed elsewhere (http://grid-deployment.web.cern.ch/grid-deployment/eis/docs/SoftwareInstallation/index.html) foresees software structured in three layers. Tank & Spark is a component of the lowest layer and is mainly used to propagate software to the rest of a farm when no shared file system is provided.

Structure of the Experiment Software Installation toolkit

lcg-ManageSoftware: the "middle layer" of the current implementation.
• It is the steering script invoked to install, remove, or validate application software.
• It checks the local (WN) environment and decides which workflow to perform:
  • Is another process already running for the same software?
  • Is the file system shared or not?
  • Is it an AFS file system (conversion of GSI credentials to AFS tokens)?
  • Is Tank&Spark installed (invocation of the propagation)?
• It reliably downloads the tarball(s), if specified, to the local WN through the lcg-* commands, bypassing the outbound connectivity requirement.
• It unpacks these tarball(s) if needed.
• It invokes the experiment-specific script (provided by the ESM) and checks the result of the script.
• It creates a temporary directory used to install/validate/remove the software and cleans it up afterwards.
• It publishes tags on the Information System with a flavor that depends on the action being performed and on the result of that action (see the man page of the command for more information).

lcg-ManageVOTag: a component of the lower-level layer.
• It is the command used by lcg-ManageSoftware to add/list/remove tags published on the Information System, using the GRIS running on a given CE. It simply adds or removes entries in the GlueHostApplicationSoftwareRunTimeEnvironment attribute of the IS.
• It can also be used as a standalone application.
• It only requires the following format for the tag: VO-<voname>-<whatever_string>

Tank&Spark: a component of the lower-level layer running on both the WN and the CE. Here it is mainly used to propagate software to other WNs, but:
• It can be used as a standalone, grid-independent mechanism.
• It can allow installation that bypasses grid job submission (high prioritization of software management).
• It keeps track of the installer, who is strongly authenticated and uniquely identified.
• It complies with external policies set by the site administrators.
• It can manage all possible file system topologies (shared, non-shared, AFS, or a mix of them).
• It allows asynchronous (currently) and synchronous installation.
• It allows failure recovery (retry of the installation on the node).
• It can allow roll-back of a given installation (not in place).
• It provides exhaustive notification (successes and problems, node by node) to the ESM and, automatically, to the site admin.
• It stores extensive information about a given software package, internally identified through GUIDs (e.g. date, size, owner, path, status and so on).
• Automatic farm management (Is a node out? Is a new node there?): it adds and removes nodes in its central DB (MySQL).
• It modifies the Information System by changing the "flavour" of the tag.

gssklog: another component of the lower-level layer, and part of an externally developed mechanism, gssklog-gssklogd. It is the client of this mechanism and converts GSI credentials into valid KRB5 AFS tokens.

lcg-asis: a user-friendly interface.
• It hides the difficulties that invoking lcg-ManageSoftware implies (see previous slides).
• It uploads the sources of the software (tarball(s)) to the grid, if specified.
• It loops over all sites available to the VO (complying with requirements provided by the user in terms of CPU, memory, disk space and CPU time) and for each site:
  • Checks through the Information System whether another software management process is running on the site (see later).
  • Automatically creates the JDL for that site.
  • Submits the jobs and stores the job information.

[Diagram labels: UI, lcg-ManageSoftware, WN, lcg-ManageVOTag, Tank&Spark, gssklog, CE]
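The tag format required by lcg-ManageVOTag can be illustrated with a small check. This is a minimal sketch, not the tool's own validation: the function name `parse_vo_tag` is hypothetical, and the assumption that a VO name contains no hyphen is made here only so that the tag can be split unambiguously.

```python
import re

# Shape taken from the slide: VO-<voname>-<whatever_string>.
# Assumption for this sketch: the VO name contains no '-' and no whitespace.
TAG_PATTERN = re.compile(r"^VO-(?P<vo>[^-\s]+)-(?P<rest>\S+)$")


def parse_vo_tag(tag):
    """Return (voname, rest) if the tag has the required shape, else None."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        return None
    return match.group("vo"), match.group("rest")


if __name__ == "__main__":
    print(parse_vo_tag("VO-dteam-orca-8.3"))
    print(parse_vo_tag("orca-8.3"))
```

A tag such as VO-dteam-orca-8.3 splits into the VO name and the free-form remainder, while a string without the VO- prefix is rejected.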

Tank & Spark

It consists of three components:
• Tank: a multithreaded (gSOAP-based) service running on the CE and listening for GSI-authenticated (and non-authenticated) connections.
• Spark: a client application running on each WN (through a cron job and/or a normal grid job from lcg-ManageSoftware) that contacts Tank to retrieve, insert, or delete software information.
• An rsync server running on another machine (an SE, for instance) and acting as the central repository of the software.

Tank & Spark installation workflow:
• The JDL installation job from the ESM arrives on the CE.
• The ESM request ends up on a WN, which becomes the Spark node.
• The software (labeled "c" in the diagram) is installed locally through the middle layer, lcg-ManageSoftware. A pre-validation is highly recommended before triggering the propagation. The Information System is updated.
• The Spark client program is called. The delegated credentials of the ESM are checked in Tank, and Spark asks for a software tag registration in the Tank central DB.
• Tank registers the new tag and synchronizes, through rsync, the new directory created on the Spark node into a central repository.
• Tank is contacted by all WNs, one at a time. External conditions are checked, and special site policies can be taken into account. Local installation on the WNs is triggered. No authentication is required: each WN trusts Tank.
• At the end of the whole process, Tank e-mails the ESM with the result of the installation, and the Information System is updated according to the result of the process.

[Diagram: the ESM outside the site firewall; inside it, the CE running Tank, the SE, and the WNs, with steps numbered 1 to 7. Software "c" is added to WNs that already hold "a" and "b".]
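The propagation phase described above (nodes handled one at a time, site policy checks, retries, and a node-by-node report) can be sketched as a small in-memory simulation. Everything here is hypothetical: the function names, the `allowed` policy callback, the status strings and the retry count are assumptions made for illustration and do not reflect the actual Tank & Spark code.

```python
def propagate(nodes, install, allowed=lambda node: True, max_tries=2):
    """Handle nodes one at a time and return {node: status}.

    install(node) -> bool stands in for the local installation on a WN.
    allowed(node) -> bool stands in for a site policy check.
    Status is "ok", "failed" (after max_tries attempts) or "skipped".
    """
    report = {}
    for node in nodes:
        if not allowed(node):
            report[node] = "skipped"
            continue
        report[node] = "failed"
        for _ in range(max_tries):
            if install(node):
                report[node] = "ok"
                break
    return report


def summary(report):
    """Build the node-by-node text that could be sent to the ESM."""
    return "\n".join(f"{node}: {status}" for node, status in sorted(report.items()))


if __name__ == "__main__":
    attempts = {}

    def flaky_install(node):
        # Hypothetical installer: wn2 fails on its first attempt only.
        attempts[node] = attempts.get(node, 0) + 1
        return node != "wn2" or attempts[node] > 1

    result = propagate(["wn1", "wn2", "wn3"], flaky_install,
                       allowed=lambda node: node != "wn3")
    print(summary(result))
```

Keeping the per-node status in a single dictionary makes it straightforward to produce both the notification text and an overall success decision from the same data.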

Flag flavors:
• VO-dteam-orca-8.3-processing-install: Installation ongoing
• VO-dteam-orca-8.3-processing-remove: Removal ongoing
• VO-dteam-orca-8.3-processing-validate: Validation ongoing
• VO-dteam-orca-8.3-aborted-install: Installation failure
• VO-dteam-orca-8.3-aborted-remove: Removal failure
• VO-dteam-orca-8.3-aborted-validate: Validation failure
• VO-dteam-orca-8.3-to-be-validated: Installation OK
• (no tag): Removal OK
• VO-dteam-orca-8.3: Validation OK

Advantages:
• Normal users continue to use the same mechanism to find out which software is available on a site.
• The ESM knows the status of their experiment software management jobs.
• Concurrent software management jobs for the same software version on the same site are not possible.
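The flavor scheme above amounts to a mapping from an action and its state to a published tag, plus a check for tags in the "processing" state. The following is a minimal sketch of that mapping; the function names and the state labels "processing", "aborted" and "ok" as arguments are assumptions for illustration, not the interface of lcg-ManageSoftware.

```python
ACTIONS = ("install", "remove", "validate")


def tag_for(base, action, state):
    """Return the tag to publish for a base tag such as 'VO-dteam-orca-8.3'.

    state is 'processing', 'aborted' or 'ok'. Returns None when no tag
    is published (successful removal).
    """
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if state in ("processing", "aborted"):
        return f"{base}-{state}-{action}"
    if state != "ok":
        raise ValueError(f"unknown state: {state}")
    if action == "install":
        return f"{base}-to-be-validated"
    if action == "remove":
        return None
    return base  # validation OK: the plain tag that normal users look for


def can_start(published_tags, base):
    """True if no management job for this software is marked as ongoing."""
    prefix = f"{base}-processing-"
    return not any(tag.startswith(prefix) for tag in published_tags)


if __name__ == "__main__":
    base = "VO-dteam-orca-8.3"
    print(tag_for(base, "install", "processing"))
    print(can_start([f"{base}-processing-install"], base))
```

Because the ongoing state is itself published as a tag, a second job for the same software version can detect it from the Information System before starting.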
