
GRID in JINR and participation in the WLCG project



  1. GRID in JINR and participation in the WLCG project Korenkov Vladimir LIT, JINR 17.07.12

  2. Tier Structure of GRID Distributed Computing: Tier-0/Tier-1/Tier-2/Tier-3
  Tier-0 (CERN):
  • accepts data from the CMS Online Data Acquisition and Trigger System
  • archives RAW data
  • performs the first pass of reconstruction and Prompt Calibration
  • distributes data to the Tier-1 centers
  Tier-1 (11 centers):
  • receives data from Tier-0
  • data processing (re-reconstruction, skimming, calibration, etc.)
  • distributes data and MC to the other Tier-1 and Tier-2 centers
  • secure storage and redistribution of data and MC
  Tier-2 (>200 centers):
  • simulation
  • user physics analysis
  Ian.Bird@cern.ch

  3. Some history
  1999 – MONARC project: early discussions on how to organise distributed computing for LHC
  2001-2003 – EU DataGrid project: middleware & testbed for an operational grid
  2002-2005 – LHC Computing Grid (LCG): deploying the results of DataGrid to provide a production facility for the LHC experiments
  2004-2006 – EU EGEE project phase 1: starts from the LCG grid; shared production infrastructure; expanding to other communities and sciences
  2006-2008 – EU EGEE-II: building on phase 1; expanding applications and communities
  2008-2010 – EU EGEE-III
  2010-2012 – EGI-InSPIRE

  4. Russian Data Intensive Grid infrastructure (RDIG)
  The Russian consortium RDIG (Russian Data Intensive Grid) was set up in September 2003 as a national federation in the EGEE project. Now the RDIG infrastructure comprises 17 resource centers with >20,000 kSI2K of CPU and >4,500 TB of disk storage.
  RDIG Resource Centres:
  – ITEP
  – JINR-LCG2 (Dubna)
  – RRC-KI
  – RU-Moscow-KIAM
  – RU-Phys-SPbSU
  – RU-Protvino-IHEP
  – RU-SPbSU
  – Ru-Troitsk-INR
  – ru-IMPB-LCG2
  – ru-Moscow-FIAN
  – ru-Moscow-MEPHI
  – ru-PNPI-LCG2 (Gatchina)
  – ru-Moscow-SINP
  – Kharkov-KIPT (UA)
  – BY-NCPHEP (Minsk)
  – UA-KNU
  – UA-BITP

  5. JINR Tier2 Center
  Availability and reliability: 99%. CPU: 2560 cores; total performance: 24,000 HEPSpec06; disk storage capacity: 1800 TB. More than 7.5 million tasks were executed and more than 130 million units of normalised CPU time (HEPSpec06) were consumed during July 2011 - July 2012.
  [Charts: growth of the JINR Tier2 performance and disk storage capacity in 2003 - 2011.]

  6. Scheme of the CICC network connections

  7. Support of VOs and experiments
  As a grid site of the global grid infrastructure, CICC JINR now supports computations of 10 virtual organizations (alice, atlas, biomed, cms, dteam, fusion, hone, lhcb, rgstest and ops) and also makes its grid resources available to the CBM and PANDA experiments.

  8. JINR monitoring main page
  JINR LAN infrastructure monitoring is one of the most important services. The Network Monitoring Information System (NMIS) with a web-based graphical interface – http://litmon.jinr.ru – supports the operational monitoring processes and watches the most important network elements: JINR LAN routers and switches, remote routers and switches, cooling units, DNS servers, dCache servers, Oracle servers, RAIDs, WLCG servers, worker nodes, etc. More than 350 network nodes are under round-the-clock monitoring.
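Below is a minimal sketch, in the spirit of such round-the-clock availability checks, of how a set of network nodes could be polled over TCP. It is not the NMIS implementation; the host names, ports and polling interval are illustrative placeholders.

```python
# Minimal availability-check sketch (not the actual NMIS code).
# Hosts, ports and the polling interval below are placeholders.
import socket
import time

NODES = {
    "router.example.jinr.ru": 22,    # placeholder: a LAN router's SSH port
    "dcache.example.jinr.ru": 2811,  # placeholder: a dCache GridFTP door
    "dns.example.jinr.ru": 53,       # placeholder: a DNS server
}

def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def poll_once() -> dict:
    """Check every monitored node once and return {node: 'up'|'down'}."""
    return {host: ("up" if is_reachable(host, port) else "down")
            for host, port in NODES.items()}

if __name__ == "__main__":
    while True:                      # round-the-clock polling loop
        print(time.strftime("%Y-%m-%d %H:%M:%S"), poll_once())
        time.sleep(300)              # re-check every 5 minutes (placeholder)
```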

  9. Normalized CPU time per country, LHC VOs (July 2011 - July 2012): all countries – 3,808,660,150; Russia – 95,289,002 (2.5%). For 2011: all countries – 2,108,813,533; Russia – 45,385,987.

  10. Normalized CPU time per site in Russia, LHC VOs (July 2011 - July 2012)

  11. CMS Computing Support at the JINR
  CMS T2 center at JINR: CPU 1240 kSI2k; 495 job slots; disk storage 376 TB. Statistics for the last 12 months (08/2011 - 07/2012): contribution of the RDMS CMS sites to all CMS sites – 4% (16,668,792 kSI2K); contribution of JINR among the CMS sites in Russia – 52%.

  12. JINR Monitoring and Analysis Remote Centre
  • Monitoring of detector systems
  • Data monitoring / express analysis
  • Shift operations (except for run control)
  • Communications of the JINR shifter with personnel at the CMS Control Room (SX5) and the CMS Meyrin centre
  • Communications between JINR experts and CMS shifters
  • Coordination of data processing and data management
  • Training and information
  Ilya Gorbunov and Sergei Shmatov, “RDMS CMS Computing”, AIS2012, Dubna, May 18, 2012

  13. ATLAS Computing Support at JINR
  ATLAS at JINR, statistics for the last 12 months (07.2011 - 07.2012) and share among the ATLAS sites in Russia: CPU time (HEPSpec06) – 51,948,968 (52%); jobs – 4,885,878 (47%).

  14. System of remote access in real time (SRART) for monitoring and quality assessment of ATLAS data at JINR
  One of the most significant results of the ATLAS TDAQ team at LIT during the last few years has been its participation in the development of the ATLAS TDAQ project at CERN. The system of remote access in real time (SRART) for monitoring and quality assessment of ATLAS data at JINR was put into operation. At present the system is being debugged on real data of the ATLAS experiment.

  15. ALICE Computing Support at JINR
  Over the last 6 months seven RDIG sites (IHEP, ITEP, JINR, PNPI, RRC-KI, SPbSU and Troitsk) contributed 4.19% of the total ALICE CPU and 9.18% of the total number of DONE jobs. Job statistics for the RDIG group in July 2012: delivered CPU 1293, Wall 2395, CPU/Wall 54.02%; jobs assigned 717,047, completed 528,445, efficiency 73.7%.

  16. SE usage
  More than 1 PB of disk space is allocated for ALICE today by the seven RDIG sites, of which ~744 TB is currently in use. The ALICE management asks for an increase of disk capacity, since new data are taken and produced much faster than disks are cleaned. An example of the CPU usage share between the seven RDIG sites is also shown.

  17. Aggregate SE traffic
  Over the last 6 months the total SE incoming traffic was 943 TB and the outgoing traffic was 9.6 PB.

  18. Worldwide LHC Computing Grid Project (WLCG)
  The protocol between CERN, Russia and JINR on participation in the LCG project was approved in 2003. The MoU on the Worldwide LHC Computing Grid (WLCG) was signed by Russia and JINR in October 2007.
  The tasks of Russia & JINR in the WLCG (2012):
  Task 1. MW (gLite) testing (supervisor O. Keeble)
  Task 2. LCG vs Experiments (supervisor I. Bird)
  Task 3. LCG monitoring (supervisor J. Andreeva)
  Task 4. Tier3 monitoring (supervisors J. Andreeva, A. Klementov)
  Task 5/6. Genser / MCDB (supervisor W. Pokorski)
  Task 7. Tier1 (supervisor I. Bird)

  19. Worldwide LHC Computing Grid Project (WLCG)
  The tasks of JINR in the WLCG:
  • WLCG infrastructure support and development at JINR;
  • participation in WLCG middleware testing/evaluation;
  • grid monitoring and accounting tools development;
  • development of software for HEP applications;
  • MCDB development;
  • support of users, virtual organizations (VOs) and applications;
  • user & administrator training and education;
  • support of the JINR Member States in the WLCG activities.

  20. JINR CICC RDIG monitoring & accounting: http://rocmon.jinr.ru:8080
  • Monitoring – allows one to keep an eye on the parameters of grid sites' operation in real time
  • Accounting – resource utilization on grid sites by virtual organizations and individual users
  Monitored values:
  • CPUs – total / working / down / free / busy
  • Jobs – running / waiting
  • Storage space – used / available
  • Network – available bandwidth
  Accounting values (a small sketch of the arithmetic follows this slide):
  • Number of submitted jobs
  • Used CPU time: total sum in seconds; normalized (by WN productivity); average time per job
  • Waiting time: total sum in seconds; average ratio of waiting to used CPU time per job
  • Physical memory: average per job
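As an illustration of the accounting values listed above, here is a small sketch of the underlying arithmetic: summing used CPU time, normalizing it by worker-node productivity (HEPSpec06 per core) and deriving per-job averages. It is not the RDIG accounting code; the JobRecord fields and the sample numbers are assumptions for the example.

```python
# Sketch of the accounting arithmetic (not the actual RDIG accounting code).
from dataclasses import dataclass

@dataclass
class JobRecord:
    cpu_seconds: float          # used CPU time reported by the batch system
    wait_seconds: float         # time the job spent waiting in the queue
    hepspec06_per_core: float   # productivity of the worker node that ran it

def summarize(jobs: list[JobRecord]) -> dict:
    """Compute the accounting values shown on the slide for a set of jobs."""
    n = len(jobs)
    with_cpu = [j for j in jobs if j.cpu_seconds > 0]
    cpu_total = sum(j.cpu_seconds for j in jobs)
    # "Normalized" CPU time: raw seconds weighted by worker-node productivity.
    cpu_normalized = sum(j.cpu_seconds * j.hepspec06_per_core for j in jobs)
    return {
        "jobs": n,
        "cpu_seconds_total": cpu_total,
        "cpu_seconds_normalized": cpu_normalized,
        "avg_cpu_per_job": cpu_total / n if n else 0.0,
        "wait_seconds_total": sum(j.wait_seconds for j in jobs),
        "avg_wait_to_cpu_ratio": (
            sum(j.wait_seconds / j.cpu_seconds for j in with_cpu) / len(with_cpu)
            if with_cpu else 0.0
        ),
    }

if __name__ == "__main__":
    sample = [JobRecord(3600, 120, 8.0), JobRecord(7200, 600, 10.5)]
    print(summarize(sample))
```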

  21. Architecture

  22. FTS (File Transfer Service) monitoring for the Worldwide LHC Computing Grid (WLCG) project
  A monitoring system has been developed which provides a convenient and reliable tool for obtaining detailed information about the current FTS state and for analysing errors on data transfer channels, thereby supporting FTS maintenance and optimizing the technical support process. The system can significantly improve FTS reliability and performance.
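To make the idea of error analysis on data transfer channels concrete, here is a hedged sketch that groups failed transfers per source-destination channel and reports the most frequent error reasons. It is not the actual FTS monitoring system; the record fields (source_se, dest_se, state, reason) are illustrative placeholders.

```python
# Sketch of per-channel transfer error analysis (not the production system).
from collections import Counter, defaultdict

def analyze_transfers(transfers: list[dict]) -> dict:
    """Group failed transfers by channel and count the most frequent errors."""
    errors_by_channel: dict[str, Counter] = defaultdict(Counter)
    totals: Counter = Counter()
    for t in transfers:
        channel = f"{t['source_se']} -> {t['dest_se']}"
        totals[channel] += 1
        if t["state"] == "Failed":
            errors_by_channel[channel][t.get("reason", "unknown")] += 1
    report = {}
    for channel, errs in errors_by_channel.items():
        failed = sum(errs.values())
        report[channel] = {
            "failure_rate": failed / totals[channel],
            "top_errors": errs.most_common(3),
        }
    return report

if __name__ == "__main__":
    sample = [
        {"source_se": "JINR", "dest_se": "CERN", "state": "Finished"},
        {"source_se": "JINR", "dest_se": "CERN", "state": "Failed",
         "reason": "SRM_AUTHORIZATION_FAILURE"},
    ]
    print(analyze_transfers(sample))
```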

  23. The Worldwide LHC Computing Grid (WLCG)

  24. ATLAS DQ2 Deletion Service
  During 2010, work was carried out on developing the deletion service for the ATLAS Distributed Data Management (DDM) system. The work started in the middle of April 2010; at the end of August the new version of the Deletion Service was tested on a set of sites, and from November 2010 it ran on all sites managed by DQ2. The development comprised building new interfaces between the parts of the deletion service (based on web service technology), creating a new database schema, rebuilding the deletion service core, developing extended interfaces to mass storage systems and extending the deletion monitoring system. The deletion service is maintained by JINR specialists.
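The following sketch illustrates the general shape of such a deletion core: replicas queued for deletion are grouped per storage endpoint and removed in bulk through a backend-specific interface, with per-endpoint statistics left for the deletion monitoring. It is not the DQ2 Deletion Service itself; the StorageBackend class and its bulk_delete method are hypothetical stand-ins for the real mass storage interfaces.

```python
# Sketch of a bulk-deletion cycle (not the actual DQ2 Deletion Service).
from collections import defaultdict

class StorageBackend:
    """Placeholder for an interface to a mass storage system."""
    def __init__(self, endpoint: str):
        self.endpoint = endpoint

    def bulk_delete(self, paths: list[str]) -> list[str]:
        # A real backend would issue the actual deletions here and return
        # the paths that failed; this sketch pretends everything succeeds.
        return []

def run_deletion_cycle(queued: list[dict],
                       backends: dict[str, StorageBackend]) -> dict:
    """Process one cycle of queued replica deletions, grouped per endpoint."""
    per_endpoint = defaultdict(list)
    for replica in queued:
        per_endpoint[replica["endpoint"]].append(replica["path"])
    stats = {}
    for endpoint, paths in per_endpoint.items():
        failed = backends[endpoint].bulk_delete(paths)
        stats[endpoint] = {"requested": len(paths), "failed": len(failed)}
    # The real service would feed these numbers to the deletion monitoring.
    return stats

if __name__ == "__main__":
    backends = {"SE-A": StorageBackend("SE-A")}
    queue = [{"endpoint": "SE-A", "path": "/atlas/dataset/file1"}]
    print(run_deletion_cycle(queue, backends))
```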

  25. Tier 3 sites monitoring project
  • Traditional LHC distributed computing: Tier-0 (CERN) → Tier-1 → Tier-2; additionally → Tier-3
  • This requires a global view of the LHC computing activities
  • The LIT participates in the development of a software suite for Tier-3 sites monitoring
  • A virtual testbed has been created at JINR which allows simulation of various Tier-3 clusters and data storage solutions
  [Diagram: data flow from Tier-0 (CERN Analysis Facility, RAW) through Tier-1 (12 sites worldwide, RAW/AOD/ESD) and Tier-2 (120+ sites worldwide, AOD) to Tier-3 (DPD, interactive analysis: plots, fits, toy MC, studies); the testbed clusters combine headnodes and worker nodes under PBS (Torque), PROOF, Condor and Oracle Grid Engine, with XRootD managers (redirectors), Lustre/OSS/MDS storage servers and a Ganglia web frontend.]

  26. Tier 3 sites monitoring project (2011)
  Tier-3 sites consist of resources mostly dedicated to data analysis by geographically close or local scientific groups. A set of Tier-3 sites can be joined into a federation. Many institutes and national communities have built (or plan to build) Tier-3 facilities. Tier-3 sites comprise a range of architectures and many do not possess grid middleware, which would render the application of grid monitoring systems useless. This is a joint effort of ATLAS, BNL (USA), JINR and CERN IT (ES group).
  Objectives for Tier-3 monitoring (a minimal fabric-monitoring sketch follows this list):
  • Monitoring of a Tier-3 site:
  – detailed monitoring of the local fabric (overall cluster or clusters monitoring, monitoring of each individual node in the cluster, network utilization);
  – monitoring of the batch system;
  – monitoring of the mass storage system (total and available space, number of connections, I/O performance);
  – monitoring of VO computing activities at the site.
  • Monitoring of a Tier-3 sites federation:
  – monitoring of the VO usage of the Tier-3 resources in terms of data transfer and job processing, and of the quality of the provided service, based on job processing and data transfer monitoring metrics.
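As a minimal sketch of the fabric-level monitoring item above, the snippet below collects a few node metrics with the Python standard library and aggregates them into a site summary of the kind a federation view would consume. It is not the Tier-3 monitoring suite; the chosen metrics and the storage path are assumptions.

```python
# Sketch of local-fabric metric collection and site aggregation
# (not the Tier-3 monitoring software suite).
import os
import shutil

def node_metrics(storage_path: str = "/") -> dict:
    """Collect simple metrics for the node this script runs on."""
    load1, _load5, _load15 = os.getloadavg()
    usage = shutil.disk_usage(storage_path)
    return {
        "load_1min": load1,
        "disk_total_gb": usage.total / 1e9,
        "disk_free_gb": usage.free / 1e9,
    }

def site_summary(per_node: dict[str, dict]) -> dict:
    """Aggregate node metrics into a site-level view."""
    nodes = list(per_node.values())
    if not nodes:
        return {"nodes": 0}
    return {
        "nodes": len(nodes),
        "avg_load_1min": sum(n["load_1min"] for n in nodes) / len(nodes),
        "disk_total_gb": sum(n["disk_total_gb"] for n in nodes),
        "disk_free_gb": sum(n["disk_free_gb"] for n in nodes),
    }

if __name__ == "__main__":
    # In a real cluster each worker node would report its metrics to a
    # collector; here one node pretends to be the whole site.
    print(site_summary({"wn01": node_metrics()}))
```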

  27. xRootD monitoring architecture
  The implementation of the XRootD monitoring foresees three levels of hierarchy: site, federation and global. Monitoring at the site level is implemented in the framework of the Tier-3 monitoring project and is currently under validation by the ATLAS pilot sites. The site collector reads the xrootd summary and detailed flows, processes them in order to reformat them into event-like information, and publishes the reformatted information to MSG (a simplified sketch of this step follows). At the federation level, the data consumed from MSG is processed using Hadoop/MapReduce to correlate events which relate to the same transfer and to generate federation-level statistics, which become available via a UI and APIs. The UI is similar to the UI of the global WLCG transfer Dashboard, so most of the UI code can be reused. Finally, messages similar to FTS transfer messages, describing every particular transfer, are published from the federation to MSG to be further consumed by the global WLCG transfer Dashboard.
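A simplified sketch of the site-collector step described above: raw xrootd summary records are reformatted into event-like messages and handed to a publisher for the messaging system (MSG). It is not the production collector; the record fields and the publish() stub (which only prints instead of talking to a broker) are illustrative placeholders.

```python
# Sketch of a site collector: reformat raw records and publish them to MSG
# (the record layout and the publish() stub are placeholders).
import json
import time

def reformat(record: dict) -> dict:
    """Turn a raw per-transfer record into an event-like message."""
    return {
        "site": record["site"],
        "file": record["lfn"],
        "client": record["client_host"],
        "bytes_read": record["read_bytes"],
        "timestamp": record.get("timestamp", int(time.time())),
    }

def publish(topic: str, message: dict) -> None:
    """Stand-in for sending the message to the MSG broker."""
    print(f"[{topic}] {json.dumps(message)}")

def run_collector(raw_records: list[dict],
                  topic: str = "xrootd.site.events") -> None:
    for record in raw_records:
        publish(topic, reformat(record))

if __name__ == "__main__":
    run_collector([{"site": "JINR", "lfn": "/atlas/data/file1",
                    "client_host": "wn01.example.org", "read_bytes": 123456}])
```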

  28. xRootD monitoring: the schema of the information flow

  29. Data transfer global monitoring system

  30. Grid training and education (T-infrastructure)
  • Training courses on grid technologies:
  – for users (basic ideas and skills),
  – for system administrators (grid site deployment),
  – for developers of grid applications and services.
  • Evaluation of different implementations of grid middleware and testing of particular grid services.
  • Grid services development and porting of existing applications into the grid environment.

  31. T-infrastructure implementation
  • All services are deployed on VMs (OpenVZ)
  • Main parts: three grid sites on gLite middleware, a GT5 testbed, a desktop grid testbed based on BOINC, and a testbed for WLCG activities
  • Running since 2006

  32. JINR grid infrastructure for training and education – first step towards construction of the JINR Member States grid infrastructure
  Consists of three grid sites at JINR and one at each of the following institutes:
  • Institute of High-Energy Physics – IHEP (Protvino),
  • Institute of Mathematics and Information Technologies of the Academy of Sciences of the Republic of Uzbekistan – IMIT (Tashkent, Uzbekistan),
  • Sofia University "St. Kliment Ohridski" – SU (Sofia, Bulgaria),
  • Bogolyubov Institute for Theoretical Physics – BITP (Kiev, Ukraine),
  • National Technical University of Ukraine "Kyiv Polytechnic Institute" – KPI (Kiev, Ukraine).
  Letters of Intent with Moldova ("MD-GRID"), Mongolia ("Mongol-Grid") and Kazakhstan; a project with Cairo University.
  https://gridedu.jinr.ru

  33. Participation in the GridNNN project
  • Grid support for the Russian national nanotechnology network
  • Goal: to provide science and industry with effective access to distributed computational, informational and networking facilities
  • Expecting a breakthrough in nanotechnologies
  • Supported by a special federal program
  Main points:
  • based on a network of supercomputers (about 15-30)
  • has two grid operations centers (main and backup)
  • is a set of grid services with a unified interface
  • partially based on Globus Toolkit 4

  34. GridNNN infrastructure
  10 resource centers at the moment in different regions of Russia: RRC KI, «Chebyshev» (MSU), IPCP RAS, CC FEB RAS, ICMM RAS, JINR, SINP MSU, PNPI, KNC RAS, SPbSU.

  35. Russian Grid Network
  • Goal: to build a computational base for hi-tech industry and science
  • Uses the network of supercomputers and the original software created within the recently finished GridNNN project
  • Some statistics: 19 resource centers, 10 virtual organizations, 70 users, more than 500,000 tasks processed

  36. Project: Model of a shared distributed system for acquisition, transfer and processing of very large-scale data volumes, based on Grid technologies, for the NICA accelerator complex
  Terms: 2011-2012. Cost: federal budget – 10 million rubles; extrabudgetary sources – 25% of the total cost. Leading executor: LIT JINR. Co-executor: VBLHEP JINR.
  MPD data processing model (from "The MultiPurpose Detector – MPD Conceptual Design Report v. 1.4").

  37. WLCG Tier1 in Russia
  • Proposal to create an LCG Tier-1 center in Russia: an official letter by A. Fursenko, Minister of Science and Education of Russia, was sent to CERN DG R. Heuer in March 2011
  • The corresponding points were included in the agenda of the Russia – CERN 5x5 meeting in 2011
  • The center will serve all four experiments: ALICE, ATLAS, CMS and LHCb
  • ~10% of the total existing Tier-1 resources (without CERN), increasing by 30% each year
  • Draft planning (proposal under discussion): a prototype by the end of 2012 and full resources in 2014, to be ready for the start of the next LHC run
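For illustration only, the sketch below works out the growth arithmetic implied by the plan (a 30% increase each year on top of an initial installation). The 2012 baseline figures in it are placeholders, not the numbers of the actual proposal.

```python
# Sketch of the 30%-per-year growth arithmetic; baseline values are placeholders.
def project(baseline: float, growth: float = 0.30, years: int = 3) -> list[float]:
    """Return the resource level for the baseline year and each following year."""
    return [baseline * (1 + growth) ** n for n in range(years + 1)]

if __name__ == "__main__":
    cpu_hepspec06 = project(30000)   # placeholder prototype capacity in 2012
    disk_tb = project(2500)          # placeholder disk capacity in 2012
    for year, (cpu, disk) in enumerate(zip(cpu_hepspec06, disk_tb), start=2012):
        print(f"{year}: ~{cpu:,.0f} HEPSpec06, ~{disk:,.0f} TB disk")
```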

  38. JINR Resources

  39. Frames for Grid cooperation of JINR
  • Worldwide LHC Computing Grid (WLCG);
  • EGI-InSPIRE;
  • RDIG development;
  • CERN-RFBR project "Global data transfer monitoring system for the WLCG infrastructure";
  • NASU-RFBR project "Development and support of LIT JINR and NSC KIPT grid infrastructures for distributed CMS data processing of the LHC operation";
  • BMBF grant "Development of the Grid infrastructure and tools to provide joint investigations performed with participation of JINR and German research centers";
  • "Development of Grid segment for the LHC experiments", supported in the framework of the JINR – South Africa cooperation agreement;
  • Development of a Grid segment at Cairo University and its integration into the JINR GridEdu infrastructure;
  • JINR – FZU AS Czech Republic project "The GRID for the physics experiments";
  • JINR-Romania cooperation within the Hulubei-Meshcheryakov programme;
  • JINR-Moldova cooperation (MD-GRID, RENAM);
  • JINR-Mongolia cooperation (Mongol-Grid);
  • JINR-Slovakia cooperation;
  • JINR-Kazakhstan cooperation (ENU Gumilyov);
  • Project "Russian Grid Network".

  40. WEB PORTAL "GRID AT JINR" – "ГРИД В ОИЯИ": http://grid.jinr.ru
  A new information resource has been created at JINR: the web portal "GRID AT JINR". Its content provides detailed information on the JINR grid site and JINR's participation in grid projects (section titles translated from Russian):
  • GRID CONCEPT: grid technologies; grid projects; RDIG consortium
  • JINR GRID SITE: infrastructure and services; scheme; statistics; support of VOs and experiments (ATLAS, CMS, CBM and PANDA, HONE); how to become a user
  • JINR IN GRID PROJECTS: WLCG; GridNNN; EGEE; RFBR projects; INTAS projects; SKIF-GRID
  • GRID MIDDLEWARE TESTING
  • JINR MEMBER STATES
  • MONITORING AND ACCOUNTING: RDIG monitoring; dCache monitoring; Dashboard; FTS monitoring; H1 MC monitoring
  • GRID CONFERENCES: GRID; NEC
  • TRAINING: training grid infrastructure; courses and lectures; training materials
  • DOCUMENTATION: articles; training materials
  • NEWS
  • CONTACTS
