
BINP Contribution to ATLAS TDAQ SysAdmin Group Activities (2007-2010)




Presentation Transcript


  1. BINP Contribution to ATLAS TDAQ SysAdmin Group Activities (2007-2010)

  2. Overview
     • BINP has been contributing to all activities of the ATLAS Trigger/DAQ SysAdmin Group since 2007:
       • D.Popov (2007-2008, 1 visit)
       • A.Zaytsev (2008-2010, 2 visits up to now)
       • A.Korol (2009-2010, 1 visit up to now)
       • A.Bogdanchikov (2009-2010, 2 visits up to now)
     • The contribution includes:
       • Support of the existing TDAQ environment (> 1500 servers, > 200 racks of equipment in SDX1 and USA15, ATLAS Main Control Room and Satellite Control Room equipment)
       • Support of ATLAS Point 1 users (> 3000 users, > 300 user roles)
       • Development of various system administration tools for internal use within the group
       • Building and validating hardware solutions for future use in the ATLAS TDAQ environment
       • Taking part in 24-hour TDAQ SysAdmin shifts (since mid-summer 2008)
     BINP Contribution to ATLAS TDAQ

  3. [Figure; labels: LHC Point1, SysAdmins, IT Centre, Lab4, Lab32, Lab40]

  4. BINP Contribution to ATLAS TDAQ

  5. ATLAS Point 1 Computing Facilities ATLAS-Novosibirsk Group Meeting @ Novosibirsk 11 June, 2009

  6. SysAdmin Group Evolution (2008-2010)
     • Nominal amount of resources assigned to the team: 10 FTE (stabilized for 2011-2012)
     • Minimum number of people ever observed in the team: 4 (2009Q1)
     • Present situation: 10 sysadmins working on site plus 10-12 sysadmins on remote sites (Pakistan + Russia: BINP only)
     • 3 shifts per month per person on average
     • Two rotation cycles are now established:
       • 3 people in the loop (BINP: the 2nd cycle ongoing, remote operations are allowed)
       • 10 people in the loop (Pakistan: 1st cycle ongoing, remote operations not allowed)
     • 80% of the team has been renewed since 2007
     • No more than 30% staff renewal is expected in 2010-2011

  7. Previous Achievements (2009)
     • Migration of the ATLAS Gateways to new servers running a Xen-based virtualization solution:
       • Initial deployment was performed in 2008Q4
       • Migration was finalized in 2009Q2-3
     • Implementation of bulk server firmware upgrade tools for the netbooted nodes deployed in ATLAS Point 1:
       • Successfully applied in 2008Q4 to upgrade more than 1000 nodes installed in SDX1
     • Deployment and support of ATLAS Remote Monitoring servers:
       • Evaluation of commercial and free NX servers and SGD (Sun Secure Global Desktop) based solutions for the ATLAS remote monitoring infrastructure
     • Implementation of monitoring and accounting data analysis tools based on the ROOT toolbox, successfully applied in 2008Q4-2009Q2 for:
       • ATLAS DCS and Nagios RRD temperature data analysis for SDX1
       • ATLAS Gateway accounting system data visualization
     • Contributing to everyday activities of the group, including ATLAS TDAQ SysAdmin shifts since Sep 2008, and taking part in multiple hardware maintenance operations in SDX1 and the ATLAS Control Room

  8. Recent Achievements (2010Q1-2)
     • Major upgrade of the ATLAS Remote Monitoring nodes:
       • Reinstalling the nodes under SLC5.4 x86_64
       • The current installation is fully documented
     • Supporting the ATLAS P1 Gateways and Remote Monitoring nodes:
       • Keeping the nodes up-to-date
       • Adding more functionality and increasing the reliability of these subsystems
       • Getting smoothly through the highest peaks of user activity, e.g. the recent LHC media day (Mar 30, 2010)
     • Continuing to contribute to everyday activities supporting the ATLAS TDAQ computing environment over the period of LHC data taking
     • Providing the ATLAS TDAQ SysAdmins Team with virtualized nodes used for testing solutions for new components, e.g.:
       • New ATLAS P1 webservers
       • Tools for deploying the nodes of the ATLAS HLT farm (BWM, Quattor/Puppet), etc.
     • Taking part in commissioning of the new ATLAS TDAQ HLT computing hardware to be deployed in Point1 in 2010Q3:
       • 10 racks of equipment (new high density computing nodes)
       • Adding more than 5000 CPU cores to the ATLAS HLT computing farm (SDX1)

  9. New High Density Machines for HLT Farm
     • New HLT racks: 95 boxes
       • One 2U box has 4 motherboards
       • 10 HLT racks: 80 boxes
       • 15 extras for ONL/MON, LFSes, replacements
     • Overall Dell chassis features:
       • 4 CPU sockets/1U
       • 16 real CPU cores/1U
       • 32 CPU threads/1U
       • 64 GB RAM/1U
       • 1 kW/1U ($300/CPU thread)
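
The per-1U figures on slide 9 can be scaled up to the whole delivery. The following is a minimal Python sketch, assuming every box is a 2U chassis populated exactly as the slide describes; the names `ChassisSpec`, `BOX_HEIGHT_U` and `farm_totals` are hypothetical and introduced here only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChassisSpec:
    """Per-1U figures as quoted on slide 9."""
    sockets_per_u: int = 4
    cores_per_u: int = 16       # "real" (physical) CPU cores
    threads_per_u: int = 32     # hardware threads
    ram_gb_per_u: int = 64
    power_kw_per_u: float = 1.0


BOX_HEIGHT_U = 2  # assumption: every box is a 2U chassis


def farm_totals(boxes: int, spec: ChassisSpec = ChassisSpec()) -> dict:
    """Scale the per-1U figures to a given number of 2U boxes."""
    units = boxes * BOX_HEIGHT_U
    return {
        "rack_units": units,
        "sockets": units * spec.sockets_per_u,
        "cores": units * spec.cores_per_u,
        "threads": units * spec.threads_per_u,
        "ram_gb": units * spec.ram_gb_per_u,
        "power_kw": units * spec.power_kw_per_u,
    }


# 80 boxes fill the 10 HLT racks; 95 includes the 15 extras.
for label, boxes in (("10 HLT racks", 80), ("all boxes incl. extras", 95)):
    print(label, farm_totals(boxes))
```

Under these assumptions the 80 boxes in the 10 HLT racks come to 2560 physical cores and 5120 hardware threads. The "more than 5000 CPU cores" quoted on slide 8 therefore appears to count threads rather than physical cores; that reading is an inference, not something the slides state.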

  10. Areas of Our Responsibility (2010-2011)
     • Support/Maintenance (since 2009)
       • ATLAS P1 Gateways (‘atlasgw’)
       • Preseries Gateways (‘preseriesgw’)
       • ATLAS RMON Infrastructure (‘pc-atlas-rmon’)
     • Development/Validation (added in 2010)
       • ATCN Test VM Box (test webservers, LFC, Puppet, ClamAV)
       • GPN Test VM Box (test public webserver, Puppet, upgraded BWM infrastructure VMs)
     • Future prospects (starting from 2010Q3)
       • Put the virtualized BWM infrastructure into production
       • Virtualization of Lab32 (for the sake of compactness)
       • Virtualization of the ATLAS TDAQ MON subsystem (achieving higher stability)
       • Load balancing solutions for Point1 proxy and webservers (achieving better handling of the peak load)

  11. Generic Milestones in 2010-2011
     • Past (up to 2010Q2)
       • ATLAS RMONs reinstallation under SLC5 (Feb 2010)
       • LHC Media Day (Mar 30, 2010): continuous data taking period begins, no more intensive development allowed
       • ATLAS P1 Gateways upgrade (new VM image, Apr 2010)
       • ATLAS P1 Gateways proxy authentication scheme upgrade (migration to NTLM, May 2010)
       • Recovery from 18 kV power line failure (end of May 2010)
     • Near Future (2010)
       • “LHC First Heavy Ion Physics” Public Event (?)
       • Put extra 5000 CPU cores into production in the HLT farm (SDX1)
       • Put ConfDB UI v2.0 into production
       • Migrating to the new ATLAS Point1 webservers
       • 2010-2011 Christmas Shutdown
     • Distant Future (2011)
       • Put an improved access manager into production
       • Replacing the extender solution for the ACR (?)
       • LHC long term shutdown at the end of 2011

  12. Talks and Conference Contributions (2008-2009)
     [Figure; labels: published; ATLAS TDAQ Week, 2008Q4; 2008Q2; CHEP2009 Poster Contribution, Mar 2009]

  13. Talks and Conference Contributions (2010)
     • ICSOFT2010 Poster Contribution, Jul 2010 (accepted)
     • CHEP2010, Oct 2010 (not yet accepted)

  14. Questions & Discussion
