The ATLAS software in the Grid
Alessandro De Salvo <Alessandro.DeSalvo@roma1.infn.it>
20-02-2008
Outline
• Overview
• The Experiment Software Structure in the Grid
• Using the ATLAS software from a Grid node
• Selecting software releases
• Using the Installation System for EGEE
• Documentation & contacts
A. De Salvo, 20 Apr 2008
The ATLAS software in the Grid: overview
• The Grid nodes are installed with standard distribution kits
  • Using pacman from the central caches or from one of the mirrors
  • Same installation as you would have on your local machine
  • pacman -get am-CERN:13.0.40
• After the installation step, each site is validated using KitValidation
  • A site is considered validated only if it passes the KV tests
• When a site is validated for a given release number, the relevant tag is published to the Information System
  • In EGEE a tag is published to the CE (we'll see later how to use it)
  • In OSG this corresponds to publishing the release number and the path to the Information System
The Experiment Software Area
• The ATLAS software in the Grid is installed in the Experiment Software Area reserved for ATLAS
  • Disk area shared among the Worker Nodes via a shared filesystem (NFS, AFS, GPFS, PANFS, …)
• The user may access the software from a WN by using variables defined at runtime on the WN
  • EGEE: $VO_ATLAS_SW_DIR
  • OSG: $OSG_APP/atlas
• Example: CERN
  • VO_ATLAS_SW_DIR=/afs/cern.ch/project/gd/apps/atlas/slc3
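The two conventions above can be wrapped in a small helper so that a job script does not need to know which Grid flavour it landed on. This is only a sketch: the function name atlas_sw_area is hypothetical, and the choice to prefer the EGEE variable when both are defined is an assumption.

```shell
# Sketch: resolve the ATLAS Experiment Software Area on a worker node.
# atlas_sw_area is a hypothetical helper, not part of any ATLAS tool.
atlas_sw_area() {
    if [ -n "${VO_ATLAS_SW_DIR:-}" ]; then
        # EGEE convention: the area is given directly by the variable
        printf '%s\n' "$VO_ATLAS_SW_DIR"
    elif [ -n "${OSG_APP:-}" ]; then
        # OSG convention: the area is the atlas subdirectory of $OSG_APP
        printf '%s\n' "$OSG_APP/atlas"
    else
        echo "No ATLAS software area defined on this node" >&2
        return 1
    fi
}
```

A job could then start with something like `SW_AREA=$(atlas_sw_area) || exit 1` before touching any release.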
The structure of the software installations in EGEE
• Each release has a separate entry point
  • $VO_ATLAS_SW_DIR/software/<release_number>
  • Example: $VO_ATLAS_SW_DIR/software/13.0.40
• The entry point (logical installation) is a displaced installation of the physical release
  • The physical area is located in a different place and may have multiple releases sharing the same disk area
  • Different areas for production, development and nightly releases
    • Production: $VO_ATLAS_SW_DIR/prod/releases
    • Development: $VO_ATLAS_SW_DIR/dev/releases
    • Nightlies: $VO_ATLAS_SW_DIR/nightlies
• Users should never use the physical releases directly, since they could change location at any point
  • Always use the release entry point under $VO_ATLAS_SW_DIR
• Once the release has been set up from the logical release, the variable SITEROOT will be set to point to the physical installation area
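The relation between the logical entry point and the physical area can be made visible with a short helper. This is a minimal sketch, assuming a POSIX shell on a node where $VO_ATLAS_SW_DIR is defined; the function name setup_release is hypothetical. It uses `.`, the portable spelling of `source`.

```shell
# Sketch: set up a release through its logical entry point and report
# where the physical installation lives. setup_release is a hypothetical name.
setup_release() {
    local rel="$1"
    if [ -z "${VO_ATLAS_SW_DIR:-}" ]; then
        echo "VO_ATLAS_SW_DIR is not set" >&2
        return 1
    fi
    local entry="$VO_ATLAS_SW_DIR/software/$rel"
    if [ ! -s "$entry/setup.sh" ]; then
        echo "No entry point for release $rel under $entry" >&2
        return 1
    fi
    # Sourcing the logical entry point is what defines SITEROOT
    . "$entry/setup.sh" || return 1
    echo "logical entry point: $entry"
    echo "physical area      : ${SITEROOT:-<not set>}"
}
```

Calling `setup_release 13.0.40` would print both locations, which is also a quick way to confirm that a job never needs to hard-code the physical path.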
Using the main ATLAS releases in EGEE
• Set up the runtime environment
  • For releases < 13.0.30
        source $VO_ATLAS_SW_DIR/software/<rel_num>/setup.sh
        cd $SITEROOT/AtlasOffline/<rel_num>/AtlasOfflineRunTime/cmt
        source setup.sh
        cd -
  • For future releases (> 13.0.30)
        source $VO_ATLAS_SW_DIR/software/<rel_num>/setup.sh
• Run athena as normal
        athena <jobOption>
Simulating a Grid run: running Athena HelloWorld at Roma1
• From the practical point of view, running within the Grid environment and in local mode is equivalent, provided that
  • We have the environment variable $VO_ATLAS_SW_DIR set
  • We have access to the Experiment Software Area
• Let's simulate a Grid environment at Roma1 and run an Athena HelloWorld, using release 13.0.40
  • Log in to the Roma1 Tier2 (atlas-ui.roma1.infn.it)
  • Use the following commands
        # Set the Experiment Software Area
        export VO_ATLAS_SW_DIR=/opt/exp_soft/atlas
        # Set up the release
        source $VO_ATLAS_SW_DIR/software/13.0.40/setup.sh
        cd $SITEROOT/AtlasOffline/13.0.40/AtlasOfflineRunTime/cmt
        source setup.sh
        cd -
        # Start Athena
        athena AthExHelloWorld/HelloWorldOptions.py
Using patch releases
• Patch releases are shipped separately, after the main release is built
  • Main release: 13.0.40
  • Patch releases: 13.0.40.1, 13.0.40.2
• Each patch release
  • Shares the same physical area as the main release
  • Shares the same release entry point as the main release
• To use a patch release the user has to
  • Set up the main release
  • Set up the patch release from the AtlasProduction or AtlasPoint1 package
Using patch releases (2)
• Example: setting up AtlasProduction 13.0.40.2
        # Set up the main release
        source $VO_ATLAS_SW_DIR/software/13.0.40/setup.sh
        # Set up the patch release
        unset CMTPATH
        cd $SITEROOT/AtlasProduction/13.0.40.2/AtlasProductionRunTime/cmt
        source setup.sh
        cd -
• Example for recent releases (> 13.0.30)
        # Set up the patch release
        source $VO_ATLAS_SW_DIR/software/13.0.40/setup.sh -tag=13.0.40.1,AtlasProduction,runtime
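Since a patch release shares the entry point of its main release, a job that is only given the patch number has to derive the main release first. The sketch below assumes the numbering seen in the examples (main release = first three fields, patch = fourth field). The helper names main_release and patch_cmt_dir are hypothetical, and the `<project>RunTime` directory pattern is taken from the AtlasProduction example; that AtlasPoint1 follows the same pattern is an assumption.

```shell
# Sketch: helpers for locating a patch release. Both names are hypothetical.

# main_release: keep the first three dot-separated fields,
# e.g. 13.0.40.2 -> 13.0.40 (a main release number is returned unchanged)
main_release() {
    printf '%s\n' "$1" | cut -d. -f1-3
}

# patch_cmt_dir: runtime cmt directory of a patch release.
# $1 = project (AtlasProduction or AtlasPoint1), $2 = patch release number.
# Assumes SITEROOT was already set by sourcing the main release entry point.
patch_cmt_dir() {
    printf '%s\n' "$SITEROOT/$1/$2/${1}RunTime/cmt"
}
```

With these, the entry point to source for patch 13.0.40.2 would be `$VO_ATLAS_SW_DIR/software/$(main_release 13.0.40.2)/setup.sh`.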
Compiling user code
• User code may be compiled against an installed kit during a Grid job
• To compile user code
  • Create a test area using the create-cmthome.sh script from https://twiki.cern.ch/twiki/bin/view/Atlas/UseAtlasSoftwareProjectsKit
  • …or use a simple requirements file
        $> cat requirements
        set CMTSITE STANDALONE
        macro ATLAS_DIST_AREA ${SITEROOT}
        apply_tag projectArea
        macro SITE_PROJECT_AREA ${SITEROOT}
        macro EXTERNAL_PROJECT_AREA ${SITEROOT}
        apply_tag opt
        macro ATLAS_TEST_AREA ${HOME}/testarea
        use AtlasLogin AtlasLogin-* $(ATLAS_DIST_AREA)
        $> source $VO_ATLAS_SW_DIR/software/13.0.40/setup.sh
        $> cmt config
        $> source setup.sh -tag=AtlasOffline,13.0.40,opt,oneTest,runtime
  • Compile your code and use your test area ($HOME/testarea)
Which releases do we expect to find in the Grid?
• The release installation in the Grid is centrally managed
  • Production releases are automatically pushed to the sites
  • Obsolete releases are automatically removed when they are no longer required for production
• Analysis users should also play a fundamental role here
  • Releases used for analysis need to be pinned, even if considered obsolete from the production point of view
  • The ATLAS Installation System is able to manage user requests for installation, testing and removal of releases
• What you may install in the Grid
  • Production releases (>= 11.0.X)
  • (Some) development releases
  • Nightly releases (also with automatic deployment)
  • Production patches
  • Point1 patches
• Installation architectures
  • All the nodes are currently installed with SLC3, gcc323, 32-bit releases, independent of the underlying architecture
  • 64-bit releases are not yet officially released
  • SLC3 software must be used until the majority of the nodes in the Grid are updated to SL4 or a compatible OS
  • SLC4 nodes may run SLC3 software, while the reverse is not true
Selecting sites with the required software release
• For each release installed, a VO software tag is published
  • VO-atlas-<project>-<rel_num>
  • Examples
    • VO-atlas-production-13.0.40
    • VO-atlas-production-13.0.40.2
• The VO software tags may be used in the requirements of your jobs to select the sites and resources where a given software release is available and working
        Requirements = (Member("VO-atlas-production-13.0.40", other.GlueHostApplicationSoftwareRunTimeEnvironment));
• In case of failures of a release at a site
  • Try to identify the problem
  • Open a ticket in GGUS, possibly specifying the application you were running, the software release, the site and the node where you had the problem
  • http://www.ggus.org
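When several releases are used, the tag and the matching JDL line can be generated rather than typed by hand. A minimal sketch, assuming the tag format and the Requirements expression shown on this slide; the function names vo_tag and jdl_requirement are hypothetical.

```shell
# Sketch: build a VO software tag and the JDL Requirements line that
# selects sites publishing it. Both function names are hypothetical.

# vo_tag <project> <rel_num>  ->  VO-atlas-<project>-<rel_num>
vo_tag() {
    printf 'VO-atlas-%s-%s\n' "$1" "$2"
}

# jdl_requirement <project> <rel_num>  ->  one JDL Requirements line
jdl_requirement() {
    printf 'Requirements = (Member("%s", other.GlueHostApplicationSoftwareRunTimeEnvironment));\n' \
        "$(vo_tag "$1" "$2")"
}
```

For example, `jdl_requirement production 13.0.40.2 >> myjob.jdl` would append the requirement for the patch release to a job description (myjob.jdl being a placeholder file name).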
Checking if a release is actually present at the sites
• Sometimes the sites still publish tags even if the release is missing
• To check if a release is actually present at a site
  • Easy approach: check for the presence of the $VO_ATLAS_SW_DIR/software/<rel_num>/setup.sh file
        #!/bin/sh
        if [ ! -s $VO_ATLAS_SW_DIR/software/13.0.40/setup.sh ] ; then
            echo "Release not found!!!"
        fi
  • Safer approach: check the return code of "source $VO_ATLAS_SW_DIR/software/<rel_num>/setup.sh"
        #!/bin/sh
        source $VO_ATLAS_SW_DIR/software/13.0.40/setup.sh
        if [ $? -ne 0 ] ; then
            echo "Release not found!!!"
        fi
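The two approaches can be combined into one check that tells a missing entry point apart from a broken one. This is a sketch only: the function name release_available and the return codes 1 and 2 are arbitrary choices, not an ATLAS convention. The setup script is sourced in a subshell so that a failed attempt does not pollute the environment of the calling job.

```shell
# Sketch: combine the "easy" and "safer" checks. release_available is a
# hypothetical name. Returns 0 if usable, 1 if the setup file is missing
# or empty, 2 if sourcing it fails.
release_available() {
    local setup="$VO_ATLAS_SW_DIR/software/$1/setup.sh"
    # Easy check: the entry point file exists and is not empty
    [ -s "$setup" ] || return 1
    # Safer check: source it in a subshell and look at the return code
    ( . "$setup" ) >/dev/null 2>&1 || return 2
    return 0
}
```

A job wrapper could call `release_available 13.0.40 || { echo "Release not found!!!"; exit 1; }` before doing the real setup.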
The ATLAS Installation System for LCG/EGEE
• The software installations in EGEE are operated via the Installation System
  • https://atlas-install.roma1.infn.it/atlas_install
• Every user may browse the online status of the installations in LCG/EGEE
The ATLAS Installation System for LCG/EGEE (2)
• Users may also submit on-demand installation or maintenance requests
  • https://atlas-install.roma1.infn.it/atlas_install/protected/req.php
• The page will be shown only if you have a valid personal certificate imported in your browser
The ATLAS Installation System for LCG/EGEE (3)
• The Installation System may also give you the following information
  • Information on the releases (Release Matrix: https://atlas-install.roma1.infn.it/atlas_install/protected/rel.php)
    • Release name
    • Architecture
    • Project name
    • Installation paths
    • Tags
    • …
  • Snapshots of the current tags in the Grid (Tags Matrix: https://atlas-install.roma1.infn.it/atlas_install/protected/tags.php)
The ATLAS Installation System for LCG/EGEE (4)
• Users may pin releases to prevent the Installation System from removing them
  • For example, if you need a specific release for your analysis at a minimum number of sites and you want to keep it even if it has been considered obsolete
  • https://atlas-install.roma1.infn.it/atlas_install/protected/pin.php
The ATLAS Installation System for LCG/EGEE (5)
• Exercise
  • Log in to the Roma1 Tier2 (atlas-ui.roma1.infn.it) and set up release 13.0.40 using the logical area
  • Find out where release 13.0.40 is physically located
  • Compare the logical and physical paths you found with what you get from the Release Matrix page in the Installation System
Documentation and contacts
• Using the Distribution Kits
  • https://twiki.cern.ch/twiki/bin/view/Atlas/UseAtlasSoftwareProjectsKit
• In case of failures of a release at a site
  • Try to identify the problem
  • Open a ticket in GGUS, possibly specifying the application you were running, the software release, the site and the node where you had the problem
  • http://www.ggus.org
• For site-specific issues (EGEE only)
  • The ATLAS Installation Team: atlas-grid-install@cern.ch
• The ATLAS Installation System
  • Main page: https://atlas-install.roma1.infn.it/atlas_install
  • On-demand installation and test requests: https://atlas-install.roma1.infn.it/atlas_install/protected/req.php
  • Release pinning: https://atlas-install.roma1.infn.it/atlas_install/protected/pin.php
  • Overview and status of the installed releases: https://atlas-install.roma1.infn.it/atlas_install/protected/rel.php
  • Overview of the tags at the sites: https://atlas-install.roma1.infn.it/atlas_install/protected/tags.php