ATLAS computing in Russia

ATLAS computing in Russia. A.Minaenko Institute for High Energy Physics, Protvino JWGC meeting 10/03/08. ATLAS RuTier-2 tasks.

Presentation Transcript


  1. ATLAS computing in Russia. A.Minaenko, Institute for High Energy Physics, Protvino. JWGC meeting, 10/03/08

  2. ATLAS RuTier-2 tasks
     • The Russian Tier-2 (RuTier-2) computing facility is planned to supply computing resources to all 4 LHC experiments, including ATLAS. It is a distributed computing center that currently includes the computing farms of 6 institutions: ITEP, KI, SINP (all Moscow), IHEP (Protvino), JINR (Dubna), PNPI (St. Petersburg)
     • The main RuTier-2 task is to provide facilities for physics analysis using AOD, DPD and user-derived data formats such as ROOT trees
     • The full current AOD and 30% of the previous AOD version should be available
     • Development of reconstruction algorithms, which requires some subsets of ESD and RAW data, should be possible
     • All data used for analysis should be stored on disk servers (SE); some unique data (user and group DPD), as well as the previous AOD/DPD version, should also be saved on tape
     • The second important task is the production and storage of MC simulated data
     • The planned RuTier-2 resources should be sufficient to meet these goals

  3. ATLAS RuTier-2 resource evolution
     • The table above was included in the table of the Russian pledge to LCG and illustrates our current understanding of the resources needed. It can be corrected in the future when we understand our needs better
     • Not taken into account: the AOD increase due to inclusive streaming, the change in the rate of MC events (30% instead of 20%), a possible increase of the AOD event size (100 KB assumed), an increase of the total DPD size (0.5 of AOD assumed)
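The sensitivity of the disk estimate to the parameters listed on this slide can be sketched with a few lines of arithmetic. This is only an illustration: the function name, the decimal units, the reading of the MC rate as a fraction of the number of real events, the use of the same AOD event size for MC events, and above all the event count `N_REAL` are assumptions made here and are not taken from the talk.

```python
KB = 1_000                  # bytes, decimal units assumed
TB = 1_000_000_000_000      # bytes, decimal units assumed


def aod_dpd_disk_tb(n_real_events, mc_fraction=0.20,
                    aod_event_kb=100.0, dpd_to_aod=0.5):
    """Rough AOD + DPD disk volume in TB.

    Assumes MC events amount to `mc_fraction` of the real events and
    have the same AOD event size (both assumptions of this sketch).
    """
    n_events = n_real_events * (1.0 + mc_fraction)
    aod_tb = n_events * aod_event_kb * KB / TB
    dpd_tb = dpd_to_aod * aod_tb
    return {"aod_tb": aod_tb, "dpd_tb": dpd_tb, "total_tb": aod_tb + dpd_tb}


# Hypothetical number of real events, chosen only to make the arithmetic concrete.
N_REAL = 1_000_000_000

baseline = aod_dpd_disk_tb(N_REAL)                    # 20% MC, 100 KB AOD, DPD = 0.5 AOD
more_mc = aod_dpd_disk_tb(N_REAL, mc_fraction=0.30)   # 30% MC instead of 20%
```

With these illustrative inputs the total grows from about 180 TB to about 195 TB when the MC fraction goes from 20% to 30%, which shows why each of the items not taken into account can shift the pledge noticeably.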

  4. Current RuTier-2 resources for all experiments
     • Red – will be available in 1-2 months
     • ATLAS request for 2008 = 780 kSI2k, 280 TB

  5. Normalized CPU time (hour*kSI2k)

  6. RuTier-2 for ATLAS in 2007: ATLAS – 21%, ATLAS – 846 kh*kSI2k

  7. Site contributions to ATLAS in 2007

  8. ATLAS RuTier-2 in the SARA cloud
     • The RuTier-2 sites are associated with the ATLAS Tier-1 SARA
     • 5 sites (IHEP, ITEP, JINR, SINP, PNPI) are now included in the TiersOfAtlas list, and FTS channels are tuned for these sites
     • 4 sites (IHEP, ITEP, JINR, PNPI) successfully participated in the 2007 data transfer functional tests (next slide). This is a coherent data transfer test Tier-0 → Tier-1s → Tier-2s for all clouds, using existing SW to generate and replicate data and to monitor the data flow
     • Another 2007 ATLAS activity is the replication of produced MC AOD from Tier-1s to Tier-2s according to the ATLAS computing model. It is done using FTS and the subscription mechanism. RuTier-2 sites (except ITEP) did not participate in this activity because of a severe lack of free disk space
     • 4 sites (IHEP (15%), ITEP (20%), JINR (100%), PNPI (20%)) participated in the replication of M4 data. The percentages show the fraction of the data that each site requested for replication. Only JINR obtained all the data; the other sites were limited by the amount of free disk space
     • During the one-week M4 exercise (Aug-Sep 07), about two million real muon events were detected, written to disk and tape, and reconstructed at the ATLAS Tier-0. The reconstructed data (ESD) were then exported in quasi-real time to the Tier-1s and their associated Tier-2s. The whole chain worked as it should during real LHC data taking. This was the first successful experience of this sort for ATLAS
     • Two slides (10, 11) illustrate the M4 exercise; the second one shows the results for the SARA cloud: practically all subscribed data were successfully transferred

  9. Activities. Functional Tests: 10 Tier-1s and 46 Tier-2s participated. Timeline: Sep 06, Oct 06, Nov 06, Sep 07, Oct 07. New DQ2 SW releases: 0.2.12; DQ2 0.3 (Jun 2007); DQ2 0.4 (Oct 2007)

  10. M4 Data Replication Activity Summary for All Sites. Panels: summary for all Tier-1 sites; summary for all Tier-2 sites; IHEP, ITEP, JINR, PNPI. Legend: datasets subscribed, complete replicas, incomplete replicas

  11. M4 Data Replication Activity Summary for SARA Cloud (ESD data only)
     Transfer status:
     • IHEP: 1 trouble file (0.2%)
     • JINR: 1 trouble file (0.3%)
     • ITEP: no troubles
     • PNPI: no troubles

  12. M5 Data Replication Activity Summary: ITEP, IHEP, JINR, PNPI participated; delay in replication < 24h. Legend: total subscriptions, completed transfers (IHEP, ITEP, JINR, PNPI)

  13. Russian contribution to the central ATLAS sw/computing
     • The Russian contribution to the Category A ATLAS M&O budget amounted to 0.5 FTE this year. Two of our colleagues (I.Kachaev, V.Kabachenko) were involved in central ATLAS activities at CERN concerning core sw maintenance. They fulfilled a number of tasks:
       • Support of the atlas-support@cern.ch list, i.e. managing user quotas, scratch space distribution, user requests/questions concerning AFS space, access rights, etc.
       • Support of the atlas-sw-cvsmanagers@cern.ch list, i.e. managing the central ATLAS CVS
       • Official ATLAS sw release builds: releases 13.0.20, 13.0.26, 13.0.28, 13.0.30 have been built and 13.0.40 is under construction
       • Corresponding documentation updates: release pages, librarian documentation
       • ATLAS AFS management
       • Many scripts have been written to support release builds, release copy and move, a command-line interface to TagCollector, cvs tag search and comparison in TagCollector, etc.

  14. Russian contribution to the central ATLAS sw/computing
     • Two of our colleagues (A.Zaytsev, S.Pirogov) visited CERN (4+4 months) to contribute to the activity of the ATLAS Distributed Data Management (DDM) group. Their tasks included the corresponding sw development as well as participation in central ATLAS DDM operations such as support of the data transfer functional tests, the M4 exercise, etc. Special attention was given to the SARA cloud, to which the Russian sites are attached
     • During the visit the following main tasks were fulfilled:
       • Development of the LFC/LRC Test Suite and its application to measuring the performance of the updated version of the production LFC server and of a new GSI-enabled LRC testbed
       • Extending the functionality of the DDM Data Transfer Request Web Interface and documenting it
       • Installing and configuring a complete PanDA server and a new implementation of the PanDA Scheduler Server (Autopilot) at CERN, and assisting the LYON Tier-1 site in doing the same
       • Contributing to the recent DDM/DQ2 Functional Tests (Aug 2007), developing tools for statistical analysis of the results and applying them to the data gathered during the tests
     • All the results were reported at ATLAS internal meetings and at the computing conference CHEP2007
     • Part of the activity (0.3 FTE) was accounted as the Russian contribution to the ATLAS M&O Category A budget (Central Operations part)

  15. Challenges in 2008
     • FDR-1
       • 10 hrs. data taking @200 Hz a few days in a row
     • CCRC-1
       • 4 weeks operation of full Computing Model
       • All 4 LHC experiments simultaneously
     • Sub detector runs
     • M6
       • First week of March
     • FDR-2 Simulation Production
       • 100M events in 90 days plus merging
       • Using new release
     • CCRC-2
       • Like CCRC-1 but the whole month of May
     • FDR-2
       • Like FDR-1 but at higher luminosity
       • Timing uncertain now
     • M7 ?
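The two quantitative items on this slide translate into event counts and rates with simple arithmetic. The sketch below only restates the quoted figures (10 hours at 200 Hz; 100M events in 90 days); the function names are introduced here for illustration, and the production rate assumes the events are spread evenly over the 90 days, which the slide does not state.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def events_recorded(hours, rate_hz):
    """Events written during `hours` of data taking at `rate_hz` events per second."""
    return round(hours * SECONDS_PER_HOUR * rate_hz)


def required_rate(total_events, days):
    """Average events per second needed to finish `total_events` in `days`."""
    return total_events / (days * SECONDS_PER_DAY)


# FDR-1: 10 hours of data taking at 200 Hz on each day of the exercise.
fdr1_events_per_day = events_recorded(hours=10, rate_hz=200)

# FDR-2 simulation production: 100M events in 90 days (merging not included).
fdr2_sim_rate_hz = required_rate(total_events=100_000_000, days=90)
fdr2_sim_events_per_day = 100_000_000 / 90
```

FDR-1 therefore corresponds to 7.2 million events per day of data taking, and the FDR-2 simulation target to roughly 1.1 million simulated events per day, or about 13 events per second sustained across all participating sites.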

  16. Planned ATLAS activity in 2008

  18. ATLAS Production Tiers status (Feb 08, Full Dress Rehearsal): 10 Tier-1s and 56 “Tier-2s”
     • Metric for T1 success: 100% of data transferred (from CERN, from Tier-1s and to Tier-2s)
     • Metric for T2/T3 success: 95+% of data transferred (transfer within cloud)
     • Metric for cloud success: 75% of sites participated in the test and 75% passed the test
     • Legend: done, part, failed, no test
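The three success metrics can be written down as a small check. This is a sketch under stated assumptions, not the tool actually used for the rehearsal: the `SiteResult` type and the function names are hypothetical, and the 75% pass rate is taken here relative to all sites in the cloud (the slide does not say whether it refers to all sites or only to the participating ones).

```python
from dataclasses import dataclass
from typing import List, Optional

T1_THRESHOLD = 1.00      # Tier-1: 100% of data transferred
T2_THRESHOLD = 0.95      # Tier-2/Tier-3: 95+% of data transferred
CLOUD_THRESHOLD = 0.75   # cloud: 75% of sites participated and 75% passed


@dataclass
class SiteResult:
    name: str
    tier: int                     # 1 for a Tier-1, 2 or 3 otherwise
    transferred: Optional[float]  # fraction transferred; None = site not tested


def site_passed(site: SiteResult) -> bool:
    """Apply the tier-specific transfer threshold to one site."""
    if site.transferred is None:
        return False
    threshold = T1_THRESHOLD if site.tier == 1 else T2_THRESHOLD
    return site.transferred >= threshold


def cloud_passed(sites: List[SiteResult]) -> bool:
    """Cloud metric; the pass fraction is counted over all sites (assumption)."""
    if not sites:
        return False
    participated = sum(s.transferred is not None for s in sites)
    passed = sum(site_passed(s) for s in sites)
    n = len(sites)
    return participated / n >= CLOUD_THRESHOLD and passed / n >= CLOUD_THRESHOLD


# Made-up example cloud: one Tier-1 and three Tier-2s, one of them not tested.
example_cloud = [
    SiteResult("T1", 1, 1.00),
    SiteResult("site_a", 2, 0.97),
    SiteResult("site_b", 2, 0.96),
    SiteResult("site_c", 2, None),
]
example_ok = cloud_passed(example_cloud)
```

In the made-up example three of four sites took part and all three met their thresholds, so the cloud just reaches both 75% limits; one more failed or untested site would make it fail.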

  20. CCRC08-1 results at RuTier-2. Activity Summary ('2008-02-24 08:50' to '2008-03-01 12:50')

  22. Structure of ATLAS data used for physics analysis
     • The streaming of ATLAS data is under discussion now and the final decision has not been taken yet
     • Streaming is based on the trigger decision, and the assignment of a given event to a stream cannot change over time (it does not depend on offline procedures)
     • There will be 4-7 RAW/ESD physics streams
     • One or a few AOD streams per ESD stream, with about 10 final AOD streams in total
     • There are two possible types of streaming:
       • Inclusive streaming – the same event can be assigned to several streams if it has the corresponding trigger types
       • Exclusive streaming – a given event can be assigned to only one stream; if it has signatures that would allow it to be assigned to more than one stream, it goes to a special overlap stream
     • Inclusive streaming is currently considered preferable
     • A given DPD is intended for a given type (or types) of analysis and can collect events from different streams. A DPD contains only the set of events needed for a given analysis and only the needed part of the event information
     • Physics analysis will be carried out using AOD streams and (mainly) different DPDs, including specific user-created formats (such as ROOT trees)
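The difference between the two streaming types can be made concrete with a toy assignment function. This is a minimal sketch of the rules as stated on the slide, not ATLAS software: the stream names ("muon", "egamma", "jet"), the helper names and the four toy events are invented for illustration.

```python
from collections import Counter
from typing import Callable, Iterable, List, Set

OVERLAP_STREAM = "overlap"


def assign_inclusive(fired: Set[str]) -> List[str]:
    """Inclusive: the event is written to every stream whose trigger fired."""
    return sorted(fired)


def assign_exclusive(fired: Set[str]) -> List[str]:
    """Exclusive: one stream only; multi-stream events go to the overlap stream."""
    if len(fired) > 1:
        return [OVERLAP_STREAM]
    return sorted(fired)


def stream_sizes(events: Iterable[Set[str]],
                 assign: Callable[[Set[str]], List[str]]) -> Counter:
    """Count how many event copies end up in each stream."""
    sizes: Counter = Counter()
    for fired in events:
        sizes.update(assign(fired))
    return sizes


# Toy sample: each event is the set of streams its triggers qualify it for.
toy_events = [{"muon"}, {"egamma"}, {"muon", "egamma"}, {"jet"}]

inclusive = stream_sizes(toy_events, assign_inclusive)
exclusive = stream_sizes(toy_events, assign_exclusive)
```

For the four toy events, inclusive streaming stores five event copies (the muon+egamma event is written twice), while exclusive streaming stores four, one of them in the overlap stream. The duplicated copies are the reason inclusive streaming increases the total AOD volume.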

  23. Possible scenarios of data distribution and analysis in RuTier-2
     • Scenario A: a given AOD stream (or DPD) is kept in its entirety at a given Tier-2 site:
       • advantage – can easily be done from the technical point of view using the present ATLAS DDM and analysis tools
       • disadvantage – very hard to ensure a uniform CPU load. At some sites (with “popular” data) CPUs will be overloaded, while at others CPUs will be idle
     • Scenario B: each AOD stream (or large DPD) is split between all the sites:
       • advantage – uniform CPU load
       • disadvantage – i) possible difficulties with subscriptions providing automated splitting of data (?); ii) will analysis grid sub-jobs be able to find the sites with the needed data (?)
     • From the point of view of functionality, scenario B is preferable, but the question is whether the existing ATLAS tools allow this scenario to be realized (present answer – yes, but this needs to be tested in practice)
     • AOD and DPD are to be distributed between the participating sites in proportion to their CPU (kSI2k)
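The proportional split of scenario B can be sketched as follows. This is one possible way to do it, assuming a stream is divided at the level of whole files and using the largest-remainder method to round the shares; the function name, the site labels and the kSI2k values are hypothetical and do not come from the talk or from the ATLAS DDM tools.

```python
from typing import Dict


def proportional_shares(cpu_ksi2k: Dict[str, float], n_files: int) -> Dict[str, int]:
    """Split `n_files` between sites in proportion to their CPU power.

    Whole files are allocated with the largest-remainder method so that
    the shares always add up to `n_files`.
    """
    total_cpu = sum(cpu_ksi2k.values())
    if total_cpu <= 0:
        raise ValueError("total CPU power must be positive")

    exact = {site: n_files * cpu / total_cpu for site, cpu in cpu_ksi2k.items()}
    shares = {site: int(value) for site, value in exact.items()}

    # Hand the files lost to rounding down to the largest fractional remainders.
    leftover = n_files - sum(shares.values())
    by_remainder = sorted(exact, key=lambda s: exact[s] - shares[s], reverse=True)
    for site in by_remainder[:leftover]:
        shares[site] += 1
    return shares


# Hypothetical CPU capacities (kSI2k) for three sites and a 10-file stream.
example_cpu = {"site_a": 300.0, "site_b": 200.0, "site_c": 100.0}
example_shares = proportional_shares(example_cpu, n_files=10)
```

In the example the exact shares are 5, 3.33 and 1.67 files, so the single file left over after rounding down goes to `site_c`, giving 5, 3 and 2 files. Analysis sub-jobs would then have to be sent to whichever site holds the files they need, which is the open question raised under scenario B.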
