IT-GD-OPS attendance to EGEE’09

Presentation Transcript


  1. IT-GD-OPS attendance to EGEE’09 IT/GD Group Meeting, 09 October 2009

  2. General • Good location! • Good attendance -> good networking, discussions over coffee • Demos at lunch time not well attended... • Plenary talks... IT/GD Group Meeting, 05 February 2009

  3. Sessions attended • User support related sessions • Regional Operation tools • MPI • WLCG operations • SA1 coordination meeting • Other

  4. User support • Operations support in EGEE and EGI • Presentation of the GGUS interface to regional helpdesks, and of several regional helpdesk solutions already implemented • Dissemination of the plans to introduce the messaging interface • USAG • Five candidates for prototyping the new EGEE TPM model presented their offers • Long discussions, mix of EGEE-EGI bids • Master/slave tickets: the relationship is retained after ticket closure. However, reopening a ticket that has a master or a slave does not automatically reopen the associated ticket(s).
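The master/slave rule on this slide can be sketched as a small model. This is only an illustration of the two stated behaviours (links survive closure, reopening does not cascade); the `Ticket` class, its fields and its method names are hypothetical and are not taken from GGUS.

```python
class Ticket:
    """Hypothetical ticket model; not the GGUS schema or API."""

    def __init__(self, ticket_id):
        self.ticket_id = ticket_id
        self.status = "open"
        self.master = None   # the master ticket, if this one is a slave
        self.slaves = []     # slave tickets, if this one is a master

    def add_slave(self, slave):
        # Link the two tickets in both directions.
        slave.master = self
        self.slaves.append(slave)

    def close(self):
        # Closing changes only the status; the master/slave links are kept.
        self.status = "closed"

    def reopen(self):
        # Reopening affects this ticket only; linked tickets are not touched.
        self.status = "open"


master = Ticket(1)
slave = Ticket(2)
master.add_slave(slave)

master.close()
slave.close()
master.reopen()
```

After these calls the master is open again, the slave is still closed, and `slave.master` still points at the master, which matches the behaviour described on the slide.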

  5. User support • Ensuring reliable User Support beyond EGEE-III • User communities: positive feedback from HEP VOs, new feature requests to GGUS, SSC/EGI commitment to use GGUS • Middleware: plans to use GGUS as the main entry point

  6. Regional Operation Tools • Introduction to Regional Operational Tools • Dissemination session • SAM, Regional Nagios, regional dashboard, myEGEE portal, GOCDB4, APEL and ActiveMQ, Gstat2 • Monitoring a grid site using Nagios • 2 tutorials: Installing Site and ROC Nagios, and Messaging between Site and ROC Nagios • Very well attended! • Monitoring of the activities of the user communities on the EGEE infrastructure • Dissemination about FTS dashboard and experiment dashboards, discussion between team members

  7. MPI • Support for MPI Applications within EGEE • SAM team is developing MPI sensors based on TMB WG recommendations (not very clear) • The current implementation of MPI on the EGEE infrastructure has a lot of problems: the job success rate is unacceptably low, and finding the cause of a problem takes a long time. • MPI Task Force created to work out present issues and improve deployment/usage until the end of EGEE • New guidelines for SAM tests discussed with them and already implemented

  8. WLCG operations • WLCG Operations: Perspectives for Imminent Data Taking • Summary of STEP09 • Daily WLCG ops meeting and how representatives gather information for the meeting • RSS feeds, LCG_ROLLOUT, GOC Wiki for community support • Survey carried out to identify the main issues faced by experiments (SRM scalability, poor error messages, difficulty of installing gLite UI, etc.): https://twiki.cern.ch/twiki/bin/view/LCG/WLCGTechnicalForum • Discussions on whether data management and workload management are ready for the start-up of the LHC

  9. WLCG operations • Services and Support for the WLCG, HEP and Related Communities • ROSCOE - RObust Scientific Communities for EGI • EGI SA4 (heavy users) • Details of areas covered, money and FTEs (moving target)

  10. SA1 coordination meeting • OAT plans • Update on milestones, most of them late • Agreed minimal feasible plan: regional Nagios interfaced to central ops tools • Release/deployment of operation tools • SLA changes • Suspend sites with low A/R • Monthly reports of low A/R values • Site support metrics to be measured • Changes in intervention procedures (downtime declaration and publication) • Security: recent progress and issues, transition to EGI • Security monitoring, incident response • EGEE migration plan to EGI global tasks • NGIs that will get responsibility, timeline and plan to migrate • TPM transition: present model until the end of the year, then EGI model

  11. Cloud computing • Mostly about experiences of running virtual WNs. Not much actual cloud computing involved, apart from a couple of references to Amazon EC2 and how the I/O is what makes it too expensive for sites to use. • The following sites/infrastructures have implemented or are implementing virtualization: • CNAF (including a basic interface to EC2): in production, being used by ATLAS • Grid Ireland: in production • CERN: work in progress • SARA: work in progress • BalticGrid • There were discussions on the pros and cons of using different types of images (site generated, third party repositories, user generated) and also on which VMware is most suitable. No solid conclusions were reached.
