
The ATLAS Grid Progress


Presentation Transcript


  1. The ATLAS Grid Progress. Roger Jones, Lancaster University. GridPP CM, QMUL, 28 June 2006

  2. ATLAS partial "average" Tier-1 data flow (2008) [diagram]. Shows the flow from the Tier-0 tape and disk buffer through a Tier-1 CPU farm and disk storage, out to the other Tier-1s and to each associated Tier-2, plus the simulation and analysis data flow. Rates shown per Tier-1:
  • RAW: 1.6 GB/file, 0.02 Hz, 1.7K files/day, 32 MB/s, 2.7 TB/day
  • ESD: 0.5 GB/file, 0.02 Hz, 1.7K files/day, 10 MB/s, 0.8 TB/day
  • AOD: 10 MB/file, 0.2 Hz, 17K files/day, 2 MB/s, 0.16 TB/day
  • AODm: 500 MB/file, 0.004 to 0.04 Hz, 0.34K to 3.4K files/day, 2 to 20 MB/s, 0.16 to 1.6 TB/day
  • RAW + ESD2 + AODm2 combined: 0.044 Hz, 3.74K files/day, 44 MB/s, 3.66 TB/day
  • Overall: 1 Hz, 85K files/day, 720 MB/s
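
The rates in the diagram are internally consistent: the files-per-day, bandwidth and daily-volume figures all follow from the file size and the file rate in Hz. A minimal sketch of that arithmetic, using the RAW, ESD and AOD figures quoted above (decimal GB/TB conversion is an assumption; results match the slide to within rounding):

```python
# Sanity check of the data-flow arithmetic: daily file count, average
# bandwidth and daily volume all derive from file size and file rate.

SECONDS_PER_DAY = 86_400

def flow(file_size_gb: float, rate_hz: float) -> dict:
    """Derive daily file count, average bandwidth and daily volume."""
    return {
        "files_per_day": rate_hz * SECONDS_PER_DAY,                        # 0.02 * 86400 ~ 1.7K
        "bandwidth_MB_s": file_size_gb * 1000 * rate_hz,                   # 1.6 GB * 0.02 Hz = 32 MB/s
        "volume_TB_day": file_size_gb * rate_hz * SECONDS_PER_DAY / 1000,  # ~2.7 TB/day
    }

print(flow(1.6, 0.02))   # RAW into one Tier-1
print(flow(0.5, 0.02))   # ESD
print(flow(0.01, 0.2))   # AOD (10 MB/file)
```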

  3. Computing System Commissioning
  • ATLAS developments are all driven by the Computing System Commissioning (CSC)
    • Runs from June 2006 to ~March 2007
    • Not monolithic: many components
    • Careful scheduling of the interrelated components is needed; a workshop for package leaders is being held next week
  • Begins with Tier-0/Tier-1/(some) Tier-2s
    • Exercising the data handling and transfer systems
  • Lesson from the previous round of experiments at CERN (LEP, 1989-2000)
    • Reviews in 1988 underestimated the computing requirements by an order of magnitude!

  4. CSC items
  • Full Software Chain
  • Tier-0 Scaling
  • Streaming tests
  • Calibration & Alignment
  • High-Level Trigger
  • Distributed Data Management
  • Distributed Production
  • Physics Analysis

  5. ATLAS Distributed Data Management
  • ATLAS reviewed all its own Grid distributed systems (data management, production, analysis) during the first half of 2005
    • Data Management is key
  • A new Distributed Data Management System (DDM) was designed, based on:
    • A hierarchical definition of datasets
    • Central dataset catalogues, distributed file catalogues
    • Data blocks as units of file storage and replication
    • Automatic data transfer mechanisms using distributed services (dataset subscription system)
  • The DDM system supports the basic data tasks:
    • Distribution of raw and reconstructed data from CERN to the Tier-1s
    • Distribution of AODs (Analysis Object Data) to Tier-2 centres for analysis
    • Storage of simulated data (produced by Tier-2s) at Tier-1 centres for further distribution and/or processing
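
To make the dataset/catalogue/subscription vocabulary concrete, here is a minimal sketch of the concepts as plain data structures. The class and field names are illustrative only; they are not the actual DQ2 schema or API.

```python
from dataclasses import dataclass, field

@dataclass
class Dataset:
    name: str                                  # dataset name known to the central catalogue
    files: list = field(default_factory=list)  # logical file names (resolved by local catalogues)

@dataclass
class Subscription:
    dataset: str       # dataset name
    destination: str   # site that wants a complete replica

class CentralCatalogue:
    """Central dataset catalogue plus subscription list; file-level lookups
    are delegated to the per-region (local) file catalogues."""
    def __init__(self):
        self.datasets = {}        # name -> Dataset
        self.subscriptions = []   # outstanding Subscription requests

    def subscribe(self, dataset: str, site: str) -> None:
        # Subscribing a site to a dataset is what triggers the automatic
        # transfer of any files the site does not yet hold.
        self.subscriptions.append(Subscription(dataset, site))
```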

  6. ATLAS DDM Organization [diagram]

  7. Central vs Local Services
  • The DDM system has a central role with respect to ATLAS Grid tools
    • Its slow roll-out on LCG is causing problems for other components
  • It is predicated on distributed file catalogues and auxiliary services
    • We do not ask every single Grid centre to install ATLAS services
    • We decided to install "local" catalogues and services at Tier-1 centres
  • We then defined "regions", each consisting of a Tier-1 and all other Grid computing centres that:
    • Are well connected (in network terms) to this Tier-1
    • Depend on this Tier-1 for ATLAS services (including the file catalogue)
  • CSC will establish whether this scales to the needs of the LHC data-taking era:
    • Moving several tens of thousands of files per day
    • Supporting up to 100,000 organized production jobs per day
    • Supporting the analysis work of >1000 active ATLAS physicists
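
A toy illustration of the "region" idea: each Tier-1 provides the local file catalogue and other ATLAS services for the centres associated with it, so any site can be mapped back to the Tier-1 it depends on. The site names and groupings below are placeholders, not the real topology.

```python
# Illustrative "region" (cloud) mapping.
REGIONS = {
    "Tier1-A": ["Tier2-A1", "Tier2-A2"],
    "Tier1-B": ["Tier2-B1"],
}

def serving_tier1(site: str) -> str:
    """Return the Tier-1 whose catalogue and services a given site depends on."""
    for tier1, tier2s in REGIONS.items():
        if site == tier1 or site in tier2s:
            return tier1
    raise LookupError(f"{site} is not assigned to any region")

print(serving_tier1("Tier2-A2"))   # -> Tier1-A
```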

  8. ATLAS Data Management Model
  • In practice it turns out to be convenient (and more robust) to partition the Grid so that there are default (not compulsory) Tier-1↔Tier-2 paths
    • FTS channels are installed for these data paths for production use
    • All other data transfers go through normal network routes
  • In this model, a number of data management services are installed only at Tier-1s and act also on their "associated" Tier-2s:
    • VO Box
    • FTS channel server (both directions)
    • Local file catalogue (part of DDM/DQ2)
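
The default-path rule amounts to a simple routing decision: a transfer along a default Tier-1↔Tier-2 association uses the dedicated FTS channel hosted at the Tier-1, and anything else falls back to ordinary network routes. A sketch under that assumption, with hypothetical site names:

```python
# Routing decision implied by the model above.
DEFAULT_ASSOCIATIONS = {("Tier1-A", "Tier2-A1"), ("Tier2-A1", "Tier1-A")}

def transfer_route(source: str, destination: str) -> str:
    if (source, destination) in DEFAULT_ASSOCIATIONS:
        return "dedicated FTS channel (production path)"
    return "normal network route (no dedicated channel)"

print(transfer_route("Tier1-A", "Tier2-A1"))   # dedicated FTS channel
print(transfer_route("Tier2-A1", "Tier2-B1"))  # normal network route
```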

  9. Tiers of ATLAS [diagram]. The Tier-0 and each Tier-1 run a VO box and an LFC; FTS servers manage the Tier-0→Tier-1 and Tier-1→Tier-2 channels; the LFC is local within each 'cloud'; all SEs are SRM.

  10. Job Management: Productions
  • Next step: rework the distributed production system to optimise job distribution by sending jobs to the data (or as close as possible to them)
    • This was not the case previously, as jobs were sent to free CPUs and had to copy the input file(s) to the local worker node from wherever in the world the data happened to be
  • Make better use of the task and dataset concepts
    • A "task" acts on a dataset and produces more datasets
    • Use bulk submission functionality to send all jobs of a given task to the location of their input datasets
    • Minimise file transfers and waiting time before execution
    • Collect output files from the same dataset to the same SE and transfer them asynchronously to their final locations
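
In outline, "send jobs to the data" means the brokering step looks up where the input dataset already has a replica, bulk-submits the whole task there, and gathers the task's outputs at one SE for later asynchronous transfer. A sketch under those assumptions; the replica lookup and the submit callable are placeholders, not the real production-system API:

```python
def broker_task(task, dataset_replicas, submit):
    """task: {'input_dataset': str, 'jobs': list}
    dataset_replicas: dict mapping dataset name -> list of sites holding it."""
    sites = dataset_replicas.get(task["input_dataset"], [])
    if not sites:
        raise RuntimeError("no replica of the input dataset; transfer it first")
    target = sites[0]                # a site that already has the data
    for job in task["jobs"]:         # bulk-submit the whole task there
        submit(job, site=target)
    return target                    # outputs collected at this site's SE
```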

  11. Job Management: Analysis
  • A central job queue is good for scheduled productions (priority settings), but too heavy for user analysis
  • Interim tools have been developed to submit Grid jobs on specific deployments and with limited data management:
    • LJSF for the LCG/EGEE Grid
    • Pathena can generate ATLAS jobs that act on a dataset and submit them to PanDA on the OSG Grid
  • The baseline tool to help users submit Grid jobs is Ganga
    • Job splitting and bookkeeping
    • Several submission possibilities
    • Collection of output files
  • Now becoming useful as DDM is populated
  • Rapid progress after user feedback; rich features
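
For orientation, the kind of Ganga session implied above might look as follows. This is meant to be typed at Ganga's Python prompt; the plugin names (Athena, DQ2Dataset, LCG, AthenaSplitterJob) and the option-file and dataset names are assumptions about the ATLAS Ganga plugins of the time, not a verified API.

```python
# Rough sketch of a Ganga analysis job (names are assumptions, see above).
j = Job()
j.application = Athena()                        # run the user's Athena job options
j.application.option_file = "AnalysisSkeleton_jobOptions.py"   # hypothetical file name
j.inputdata = DQ2Dataset()                      # input taken from a DDM dataset
j.inputdata.dataset = "csc11.example.AOD.v1"    # hypothetical dataset name
j.backend = LCG()                               # submit to the LCG/EGEE Grid
j.splitter = AthenaSplitterJob(numsubjobs=10)   # split into subjobs (name assumed)
j.submit()                                      # Ganga then tracks status and output
```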

  12. ATLAS Analysis Work Model
  • Job preparation (local system, shell): prepare JobOptions → run Athena (interactive or batch) → get output
  • Medium-scale (on-demand) running and testing (local system, Ganga): prepare JobOptions, find dataset from DDM, generate and submit jobs → Grid: run Athena, store output on the Grid → local system (Ganga): job book-keeping, get output
  • Large-scale (scheduled) running (local system, Ganga): prepare JobOptions, find dataset from DDM, generate and submit jobs → ProdSys: run Athena on the Grid, store output on the Grid → local system (Ganga): job book-keeping, access output from the Grid, merge results
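
The final "merge results" step of the medium- and large-scale paths can be as simple as combining the per-subjob ROOT files that Ganga has retrieved. A sketch using PyROOT's TFileMerger; the file names are hypothetical:

```python
import ROOT

merger = ROOT.TFileMerger()
merger.OutputFile("analysis_merged.root")
for name in ("subjob_0.root", "subjob_1.root", "subjob_2.root"):
    merger.AddFile(name)          # per-subjob histogram/ntuple files
merger.Merge()                    # writes the combined analysis_merged.root
```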

  13. Analysis Jobs at Tier-2s
  • Analysis jobs must run where the input data files are
  • Most analysis jobs will take AODs as input for complex calculations and event selections
    • They will most likely output Athena-Aware Ntuples (AAN), to be stored on a nearby SE, and histograms, to be sent back to the user
  • People will develop their analyses on reduced samples many times before launching runs on a complete dataset
    • There will be a large number of failures due to people's code!
  • Exploring a priority system that separates centrally organised productions from analysis tasks

  14. ATLAS requirements
  • General production
    • Organized production
    • Share defined by the management
  • Group production
    • Organized production
    • About 24 groups identified
    • Share defined by the management
  • General users
    • Chaotic use pattern
    • Fair share between users
  • An analysis service is to be deployed over the summer
  • Various approaches to prioritisation (VOViews, gpbox, queues) are to be explored
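
A toy picture of the share structure: organised general and group production get management-defined shares, and the analysis share is divided fairly among active users. The numbers below are invented for the example and are not ATLAS policy.

```python
SHARES = {
    "general_production": 0.50,
    "group_production":   0.30,   # split across the ~24 identified groups
    "user_analysis":      0.20,   # fair share between individual users
}

def per_user_share(active_users: int) -> float:
    """Nominal fraction of the total resource for one active analysis user."""
    return SHARES["user_analysis"] / max(active_users, 1)

print(f"{per_user_share(1000):.3%} of the total per user with 1000 active users")
```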

  15. Conditions data model
  • Covers all non-event data for simulation, reconstruction and analysis
    • Calibration/alignment data, plus DCS (slow controls) data, subdetector and trigger configuration, monitoring, …
  • Several technologies are employed:
    • Relational databases: COOL for Intervals of Validity (IOVs) and some payload data, plus other relational database tables referenced by COOL
      • COOL databases in Oracle, MySQL, or SQLite file-based databases
      • Accessed via the CORAL software (a common, backend-independent database layer), independent of the underlying database
      • Mixing technologies is part of the database distribution strategy
    • File-based data (persistified calibration objects): stored in files, indexed/referenced by COOL
      • File-based data will be organised into datasets and handled using DDM (the same system as used for event data)
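
The central idea behind COOL is the interval of validity: each conditions payload applies from one time (or run/luminosity block) until the next one recorded. A minimal illustration of an IOV lookup in plain Python; this shows the concept only and is not the COOL API.

```python
from bisect import bisect_right

class IOVFolder:
    """Toy interval-of-validity store: payload i is valid from since[i]
    until the next recorded since value."""
    def __init__(self):
        self._since = []      # sorted interval start times
        self._payloads = []

    def store(self, since: int, payload: dict) -> None:
        self._since.append(since)        # assumes inserts arrive in time order
        self._payloads.append(payload)

    def retrieve(self, when: int) -> dict:
        i = bisect_right(self._since, when) - 1
        if i < 0:
            raise KeyError("no conditions valid at this time")
        return self._payloads[i]

calib = IOVFolder()
calib.store(0,      {"pedestal": 40.1})
calib.store(10_000, {"pedestal": 40.6})
print(calib.retrieve(12_345))   # -> {'pedestal': 40.6}
```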

  16. Calibration data challenge
  • So far, ATLAS Tier-2s have only done simulation/reconstruction
    • Static replicas of conditions data in SQLite files, or preloaded MySQL replicas: the conditions data are already known in advance
  • The ATLAS calibration data challenge (late 2006) will change this
    • Reconstruct misaligned/miscalibrated data, derive calibrations, re-reconstruct and iterate, staying as close as possible to real data
    • This will require 'live' replication of new data out to Tier-1/2 centres
  • Technologies to be used at Tier-2s
    • Will need COOL replication, either via local MySQL replicas or via Frontier
    • ATLAS tests of Frontier are just starting; experience is needed
    • A decision on what to use for the calibration data challenge is due in a few months
    • Will definitely need DDM replication of new conditions datasets (sites subscribe to evolving datasets)
  • External sites will submit updates as COOL SQLite files to be merged into the central CERN Oracle databases
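
The last bullet, sites shipping updates as SQLite files for merging into the central database, amounts to reading new interval-of-validity rows from the file and inserting them centrally. A much-simplified sketch with a hypothetical table layout; the real workflow goes through COOL tools rather than raw SQL:

```python
import sqlite3

def read_updates(sqlite_path: str):
    """Read new IOV rows from a site-supplied SQLite file (hypothetical schema)."""
    con = sqlite3.connect(sqlite_path)
    rows = con.execute(
        "SELECT channel, since, until, payload FROM calib_iovs"
    ).fetchall()
    con.close()
    return rows

def merge_into_central(rows, central_con):
    """Insert the rows into the central database via a DB-API connection."""
    with central_con:
        central_con.executemany(
            "INSERT INTO calib_iovs (channel, since, until, payload) VALUES (?, ?, ?, ?)",
            rows,
        )
```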

  17. Conclusions
  • We are trying not to impose any particular load on Tier-2 managers, by running distributed services at Tier-1s
    • Although this concept breaks the symmetry and forces us to set up default Tier-1 to Tier-2 associations
  • All that is required of Tier-2s is to set up the Grid environment
    • Including whichever job-queue priority scheme is found to be most useful
    • And SRM Storage Elements with (when available) a correct implementation of the space reservation and accounting system
