
ATLAS Grid Activities Preparing for Data Analysis


Presentation Transcript


  1. ATLAS Grid Activities Preparing for Data Analysis. Jim Shank

  2. Overview • ATLAS Monte Carlo production in 2008 • Data (cosmic and single beam) in 2008 • Production and Distributed Analysis (PanDA) system • Some features of the ATLAS Computing Model • Analysis model for the US • Distributed Analysis worldwide: Ganga/PanDA and HammerCloud + other readiness tests • Tier 3 centers in the US

  3. Beam Splash Event

  4. First ATLAS Beam Events, 10 Sept. 2008: Data Exports to T1s [plots: throughput in MB/s; number of errors]. Concurrent data access from centralized transfers and user activity overloaded a disk server, causing a CERN storage system overload. DDM worked; we subsequently limited user access to the storage system.

  5. December 2008 Reprocessing

  6. PanDA production (Monte Carlo Simulation/Reconstruction) 2008. Grouped by cloud = Tier 1 center + all its associated Tier 2 centers (sketched below).
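
A minimal sketch of the cloud grouping used above. BNL, AGLT2, MWT2, and FZK appear elsewhere in these slides; the other Tier-2 names are added for illustration and the mapping is not an authoritative configuration:

    # Illustrative sketch of the PanDA "cloud" grouping: one Tier-1 plus its
    # associated Tier-2s. Site lists here are examples, not real configuration.
    clouds = {
        "US": {"tier1": "BNL", "tier2s": ["AGLT2", "MWT2", "NET2", "SWT2", "WT2"]},
        "DE": {"tier1": "FZK", "tier2s": ["..."]},  # Tier-2 list elided
    }

    def cloud_of(site):
        """Return the cloud whose Tier-1 or Tier-2 list contains `site`."""
        for name, cloud in clouds.items():
            if site == cloud["tier1"] or site in cloud["tier2s"]:
                return name
        return None

    print(cloud_of("MWT2"))  # -> "US"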

  7. U.S. Production in 2008: more than our share, which indicates that others are not delivering at their expected levels.

  8. DDM: Data Replication. ATLAS beam and cosmics data replication from CERN to Tier-1s and calibration Tier-2s, Sep-Nov 2008. [Plots: data replication to Tier-2s (BNL & AGLT2, US Tier-2s); dataset subscription intervals.]
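
DDM replication is subscription-driven: a site is subscribed to a dataset, and the system keeps transferring missing files until the local replica is complete. A minimal sketch of that idea, with all names illustrative rather than the real DQ2 API:

    # Conceptual sketch of DDM's subscription model. All names are made up
    # for illustration; this is not the DQ2 interface.
    dataset_contents = {"data08_cos.00090000.RAW": ["f1", "f2", "f3"]}
    site_replicas = {"AGLT2": {"f1"}}
    subscriptions = [("data08_cos.00090000.RAW", "AGLT2")]

    def process_subscriptions():
        for dsn, site in subscriptions:
            replica = site_replicas.setdefault(site, set())
            missing = set(dataset_contents[dsn]) - replica
            for f in sorted(missing):
                # In reality this would queue a grid transfer; here we just add.
                replica.add(f)
            print(f"{dsn} -> {site}: replica complete ({len(replica)} files)")

    process_subscriptions()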

  9. DDM: Data replication between Tier-1s (Functional Test). [Status plots: Tier-1 to Tier-1 data replication; Tier-1 to Tier-1 and prestaging replication for data reprocessing.] FZK experienced problems with dCache, which affected its data export. All Tier-1s operational; red indicates data transfer completion at 95% (data staging at CNAF).

  10. PanDA Overview Workload management system for Production ANd Distributed Analysis • Launched 8/05 by US ATLAS to achieve scalable data-driven WMS • Designed for analysis as well as production • Insulates users from distributed computing complexity • Low entry threshold • US ATLAS production since late ‘05 • US analysis since Spring ’06 • ATLAS-wide production since early ‘08 • ATLAS-wide analysis still rolling out • OSG WMS program since 9/06
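
PanDA's scalability comes from a pilot-based pull model: lightweight pilots run on worker nodes and ask the central server for a job matched to the data at their site. A minimal sketch of that data-driven dispatch, with hypothetical function and field names rather than the real PanDA server API:

    # Illustrative sketch of PanDA's pilot "pull" dispatch: the server hands a
    # queued job to a pilot only if the job's input data is at the pilot's site.
    import queue

    job_queue = queue.Queue()
    job_queue.put({"id": 1, "input_site": "BNL", "task": "simulation"})
    job_queue.put({"id": 2, "input_site": "MWT2", "task": "reconstruction"})

    def get_job_for_pilot(site):
        """Return the first queued job whose input data lives at `site`."""
        matched, deferred = None, []
        while not job_queue.empty():
            job = job_queue.get()
            if matched is None and job["input_site"] == site:
                matched = job
            else:
                deferred.append(job)
        for job in deferred:       # requeue the jobs that did not match
            job_queue.put(job)
        return matched

    print(get_job_for_pilot("MWT2"))  # -> job 2: data-driven brokering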

  11. PanDA/pathena Users • 4 million jobs in the last 6 months • 473 users in the last 6 months • 352 users in the last 3 months • 90 users in the last month • 271 users with >1000 jobs • 96 users with >10000 jobs

  12. ATLAS ANALYSIS

  13. ATLAS Data Types • Still evolving…
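
The data-type table on this slide was an image. For orientation only, a sketch using approximate nominal per-event sizes from the ATLAS computing model of this era; treat the numbers as assumptions, not as content from the slide:

    # Approximate nominal event sizes (assumed values for illustration).
    EVENT_SIZE_KB = {
        "RAW": 1600,  # detector byte-stream
        "ESD": 500,   # Event Summary Data
        "AOD": 100,   # Analysis Object Data
        "TAG": 1,     # event-level metadata for selection
    }

    def sample_size_tb(n_events, data_type):
        """Total size in TB of n_events of the given type."""
        return n_events * EVENT_SIZE_KB[data_type] * 1e3 / 1e12

    print(f"{sample_size_tb(100e6, 'AOD'):.0f} TB")  # 100M AOD events -> 10 TB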

  14. ATLAS Analysis Data Flow

  15. Analysis Readiness Tests [plots: US Tier-2 sites]

  16. Ideas for a Stress Test (1) • Initiated by Jim Cochran (US ATLAS Analysis Support Group Chair). • Below is a summary of plans from Akira Shibata (March 10th). • Goal: stress testing of the analysis queues at the Tier2 sites with analysis jobs that are as realistic as possible in both volume and quality. We want to make sure the Tier2 sites are ready to accept real data and that the analysis queues can analyze it. • Time scale: sometime near the end of May 2009. • Outline of this exercise: • To make this exercise more useful and interesting, we will generate and simulate (Atlfast-II) a large mixed sample at the Tier2s. • We are currently trying to define the jobs for this exercise and expect to finalize them after the BNL jamboree this week. • The mixed sample is a blind mix of all Standard Model processes, which we call "data" in this exercise. • For the one-day stress test, we will invite people with existing analyses to analyze the data using Tier2 resources only. • We will compile a list of people who are able to participate. Nurcan Ozturk

  17. Ideas for a Stress Test (2) • Estimate of data volume: a very rough estimate is 100M-1B events. Assuming 100 kB/event (realistic, given no truth info and no trigger info), this sets an upper limit of 100 TB in total, split among 5 Tier2s (the arithmetic is worked out below). This is probably an upper limit given the current availability of USER/GROUP disk at the Tier2s (which is in addition to MC/DATA/PROD and CALIB disk). • Estimate of computing capability: there are "plenty" of machines assigned to analysis, though the current load on the analysis queues is rather low. The compute nodes are usually shared between production and analysis, typically configured with an upper limit and a priority. For example, MWT2 has 1200 cores and is set up to run analysis jobs with priority up to a limit of 400 cores; if production jobs are not coming in, the number of running analysis jobs can exceed this limit. • Site configuration: configuration varies among the Tier2 sites. We will compile a table showing the configuration of each analysis queue (direct reading versus local copying, xrootd versus dCache, etc.) and compare queue performance by configuration. Nurcan Ozturk
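
A worked version of the volume estimate above; the inputs come straight from the slide:

    # The slide's data-volume estimate: 100M-1B events at ~100 kB/event,
    # split across 5 Tier2s.
    events = 1e9                # upper end of the 100M-1B range
    event_size_bytes = 100e3    # 100 kB/event (no truth or trigger info)
    n_tier2s = 5

    total_tb = events * event_size_bytes / 1e12
    per_site_tb = total_tb / n_tier2s
    print(f"total: {total_tb:.0f} TB, per Tier2: {per_site_tb:.0f} TB")
    # -> total: 100 TB, per Tier2: 20 TB (the slide's 100 TB upper limit)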

  18. Four Types of Tier 3 Systems • T3gs: Tier 3 with Grid Services (details in next slides) • T3g: Tier 3 with Grid Connectivity (details in next slides) • T3w: Tier 3 Workstation (unclustered workstations...OSG, DQ2 client, ROOT, etc.) • T3af: Tier 3 system built into a lab or university analysis facility
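
On a T3w-style workstation, the typical pattern is to pull a dataset with the DQ2 client and open the files in ROOT. A minimal sketch, assuming the dq2-get command-line client is installed and downloads into a directory named after the dataset; the dataset name is a made-up placeholder:

    # Sketch of Tier 3 workstation analysis: fetch a dataset with the DQ2
    # client, then inspect the files with PyROOT. The dataset name is
    # hypothetical, and dq2-get's output layout may differ by installation.
    import glob
    import subprocess
    import ROOT

    dsn = "mc08.105001.pythia_minbias.recon.AOD.e357_s462_r541"  # hypothetical
    subprocess.check_call(["dq2-get", dsn])  # assumed to download into ./<dsn>/

    for path in glob.glob(f"{dsn}/*.root*"):
        f = ROOT.TFile.Open(path)
        print(path, [k.GetName() for k in f.GetListOfKeys()])
        f.Close()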

  19. Conclusions • Monte Carlo Simulation/Reconstruction working well worldwide with the PanDA submission system • Data reprocessing with PanDA working, but we need further tests of file staging from tape • Analysis model still evolving • In the U.S., big emphasis on getting T3s up and running • Analysis stress test coming in May-June • Ready for collision data in late 2009

  20. Backup

  21. PanDA Operation [diagram: data management, ATLAS production, analysis]. T. Maeno

  22. PanDA Production Dataflow/Workflow

  23. Analysis with PanDA: pathena • Running the ATLAS software: locally: athena <job opts>; on PanDA: pathena --inDS --outDS <job opts> • Outputs can be sent to an xrootd/PROOF farm, directly accessible for PROOF analysis. Tadashi Maeno
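
A concrete example of the pattern on this slide: the same job options run locally with athena or on the grid via pathena. The dataset names below are illustrative placeholders, not real datasets:

    # Same job options, local run vs. PanDA run; dataset names are made up.
    import subprocess

    job_opts = "MyAnalysis_jobOptions.py"
    in_ds = "mc08.105001.pythia_minbias.recon.AOD.e357_s462_r541"  # hypothetical
    out_ds = "user09.SomeUser.minbias.test01"                      # hypothetical

    subprocess.check_call(["athena", job_opts])                    # local run
    subprocess.check_call(["pathena", "--inDS", in_ds,
                           "--outDS", out_ds, job_opts])           # PanDA run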
