This document outlines the current strategies used for running Gaudi applications on the LHCbDirac environment, focusing on deployment through CVMFS across various Tier1 and Tier2 sites. It discusses application support capabilities, maintaining dual installations, and improving job bookkeeping with XML summary reports. The need for effective metadata management, user job tracking, and the integration of tools for better performance monitoring is highlighted. Additionally, challenges in application deployment and environment setup are addressed, ensuring robust and efficient job execution on the Grid.
Running Gaudi Applications on the Grid
• Application deployment
  • CVMFS on all Tier1s (except GridKa) and a few Tier2s
    • Pushing for it (GDB)
  • Other sites: mostly NFS or GPFS
  • Installation done with install_project in dedicated jobs
    • If the SW is not available, local installation in the job
  • Support for dual installation (shared / local) is mandatory
• Other use case: running ROOT in LHCbDirac
  • Need to set up the environment
    • Tell the user to "SetupProject LHCbDirac ROOT"
    • Set it up internally: currently broken by the dual-installation changes
• LHCbDirac is also deployed on the Grid and on CVMFS
  • Not used (yet) by jobs, but by users
• Application support
  • Can we be more generic, i.e. support any Gaudi application?
  • Support an arbitrary application?
  • Support for just ROOT: SetupProject ROOT?
Core Software workshop, PhC
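The dual shared/local installation support above could be sketched as a simple probe over candidate software areas. This is a minimal illustration, not the actual LHCbDirac logic: the CVMFS path, the VO_LHCB_SW_DIR variable, and the local fallback directory are assumptions.

```python
import os

def find_software_root(candidates=None):
    """Return the first available software area, or None.

    A hedged sketch of dual (shared / local) installation lookup:
    try the shared areas first, then fall back to a per-job local
    install. All paths here are illustrative assumptions.
    """
    if candidates is None:
        candidates = [
            "/cvmfs/lhcb.cern.ch/lib",          # shared area on CVMFS (assumed path)
            os.environ.get("VO_LHCB_SW_DIR"),   # site shared area (assumed variable)
            os.path.join(os.getcwd(), "localsw"),  # local install made by the job
        ]
    for root in candidates:
        # Skip unset entries and keep the first directory that exists
        if root and os.path.isdir(root):
            return root
    return None  # caller would then trigger install_project locally
```

A job wrapper could call this once and, on `None`, run the local installation before setting up the application environment.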
Extra packages, AppConfig
• At first sight, all OK now
  • Versions fixed in the step manager
  • Dynamic SQLite installation (see yesterday's talk from MCl)
    • Transparent?
• Additional options (set up by LHCbDirac)
  • Input dataset
  • Output dataset
  • Special options
    • Setting time stamps, defining the MsgSvc format
  • Should LHCbDirac know about them?
• Should AppConfig contain templates with just placeholders?
  • E.g. a file-name placeholder replaced by the actual name by Dirac:
    • @BHADRONFileName@ to be replaced with 00001234_0012.bhadron.dst
  • A mechanical operation rather than "knowledge"
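The "mechanical" placeholder substitution described above can be sketched with a simple regex pass. The `@...FileName@` convention comes from the slide; the option string and helper name are illustrative, not actual AppConfig content.

```python
import re

def fill_template(options_text, replacements):
    """Replace @KEY@ placeholders with concrete values.

    A minimal sketch of the substitution Dirac would perform on an
    AppConfig options template; it fails loudly on any placeholder
    the production system does not know about.
    """
    def sub(match):
        key = match.group(1)
        if key not in replacements:
            raise KeyError("unresolved placeholder: %s" % key)
        return replacements[key]
    return re.sub(r"@(\w+)@", sub, options_text)

# Illustrative option line, not a real AppConfig fragment
opts = 'OutputStream("Bhadron").Output = "@BHADRONFileName@"'
print(fill_template(opts, {"BHADRONFileName": "00001234_0012.bhadron.dst"}))
```

Failing on unknown placeholders keeps the operation purely mechanical: the template carries the knowledge, Dirac only fills in names.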
Job finalization
• LHCbDirac needs information from the application
  • Success / failure!
  • Bookkeeping report
    • Nb of events processed
    • Nb of events per output stream, GUID (also in the catalog)
    • Memory used, CPU time?
  • Production system
    • Files successfully processed
    • Failed files
    • Event number of a crash?
• Most (all) of this information is now in the XML summary reports
  • XML summary browsing implemented (Mario)
  • Needs thorough testing in jobs (already tested with many cases)
  • Get rid (!) of AnalyseLogFile…
• Any specific requirements for MC simulation?
  • Should this info be added to the BK?
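Extracting the finalization data from an XML summary might look like the sketch below. The element and attribute names are assumptions for illustration; the real Gaudi XMLSummarySvc schema may differ.

```python
import xml.etree.ElementTree as ET

# Illustrative summary document; not the actual XMLSummarySvc schema.
SAMPLE = """<summary>
  <success>True</success>
  <counter name="events_processed">1000</counter>
  <output name="00001234_0012.bhadron.dst" guid="ABCD-1234">250</output>
</summary>"""

def parse_summary(xml_text):
    """Pull success flag, event count, and per-stream output counts."""
    root = ET.fromstring(xml_text)
    return {
        "success": root.findtext("success") == "True",
        "events": int(root.find("counter[@name='events_processed']").text),
        # Map each output stream to its event count (GUID is also available
        # via o.get("guid") for catalog registration)
        "outputs": {o.get("name"): int(o.text) for o in root.findall("output")},
    }
```

Parsing a structured report like this is what would let LHCbDirac drop the fragile AnalyseLogFile log scraping.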
Bookkeeping
• Is there enough information?
  • Can it be used for producing reports (e.g. performance benchmarks)?
• Accessing metadata from within a job:
  • Use case: determine the DDDB to be used for MC
  • Would the BK sustain the load?
  • Where to query: jobs, BK GUI, ganga?
  • What if there is more than one in the dataset?
• User job bookkeeping?
  • Is it worth investing?
  • Definition of requirements
Step manager and production requests
• Is the interface adequate?
  • Reco / Stripping: mostly the production manager
  • MC: WG representatives, Gloria
• Is it a limitation that the Condition must be in the BK for creating a request?
  • E.g. preparing a reconstruction production before data is available?
  • Can this be avoided?
• Is the request progress monitoring OK?
• Currently, when enough events are in the BK, one kills jobs and deletes files
  • Is this useful? Should we just let go and flush?
  • Extra disk usage vs wasted CPU
Tools for users
• Not directly for Core Software but… education!
• How to get a proper PFN?
  • Too many users of PFN:/castor/cern.ch/…
    • Triggers a disk-to-disk copy to a smallish pool (overload)
    • Should be root://castorlhcb.cern.ch/
  • Currently available:
    • PFNs from the BK GUI
    • CLI tools: to be improved to get the PFN at a site directly
    • Resurrect genXMLCatalog and include it in LHCbDirac
  • Documentation and education!
• How to replicate datasets?
  • To local disk
  • To a shared SE
  • Tools exist but are neither streamlined nor documented
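The PFN fix-up described above, turning a raw castor path into the xrootd URL that avoids the disk-to-disk copy, could be sketched like this. The prefix handling is an assumption for illustration; the `root://castorlhcb.cern.ch/` redirector comes from the slide.

```python
def castor_pfn_to_xroot(pfn):
    """Rewrite a PFN:/castor/... path into a root://castorlhcb.cern.ch/ URL.

    Hedged sketch: accessing /castor paths directly triggers a
    disk-to-disk copy into a small pool, so they should be served
    through the xrootd redirector instead.
    """
    # Drop a leading "PFN:" marker if present
    path = pfn[len("PFN:"):] if pfn.startswith("PFN:") else pfn
    if path.startswith("/castor/cern.ch/"):
        return "root://castorlhcb.cern.ch/" + path
    return path  # already an access URL or a non-castor path

print(castor_pfn_to_xroot("PFN:/castor/cern.ch/grid/lhcb/data/file.dst"))
# root://castorlhcb.cern.ch//castor/cern.ch/grid/lhcb/data/file.dst
```

Baking a helper like this into the CLI tools (or genXMLCatalog) would remove the need for users to know the rule at all.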
Software tools
• Synergy between LHCbDirac and Core Software
• Eclipse
• SVN vs git
  • Why are branches / merging a nightmare for LHCbDirac and not for Core SW?
• Savannah vs JIRA
• Benefit from the Core Software and Applications' experience
  • Is LHCbDirac that different?
• Is packaging a bonus or a nuisance? Monolithic vs packages
• Is getpack useful? How to couple it with Eclipse?
  • Set up the environment from an Eclipse working area
• Can one use SetupProject to get LHCbDirac on WNs?
  • LHCb private pilot jobs
  • Any particular requirement?
• For LHCbDirac services and agents:
  • How to get a controlled Grid environment without doing GridEnv?