
Federico Ruggieri – INFN on behalf of the Italian Grid GGF16 – Athens 14 February 2006

GRID Infrastructure in Italy. Federico Ruggieri – INFN on behalf of the Italian Grid GGF16 – Athens 14 February 2006. Outline: the Italian Grid Production Infrastructure; Operations and Organisation; Support, Monitoring & Accounting; long term sustainability of a Grid Infrastructure in Italy.


Presentation Transcript


  1. GRID Infrastructure in Italy Federico Ruggieri – INFN on behalf of the Italian Grid GGF16 – Athens 14 February 2006

  2. Outline • The Italian Grid Production Infrastructure • Operations and Organisation • Support, Monitoring & Accounting • Long term sustainability of a Grid Infrastructure in Italy. • Conclusions

  3. The Italian Grid Production Infrastructure • about 40 Resource Centers • The grid resources can be accessed through central or VO-specific services (e.g. Resource Brokers) • 28 sites are also part of the EGEE/LCG Grid infrastructure (and are registered in the central database of the Grid Operation Center) • the other 12 sites can be accessed through the Italian grid services only http://grid-it.cnaf.infn.it

  4. Organisation & Operations [Diagram showing: Regional Operation Centre (ROC); Central Management Team (CMT); Common Tools; Site Managers]

  5. Grid.IT Production Grid: Operations Portal • User documentation • site managers documentation • Software repository • Monitoring • Trouble tickets system • Knowledge base http://grid-it.cnaf.infn.it

  6. The Italian Regional Operation Centre (ROC) • First level support: Italy • Geographically based, local front-line support to Virtual Organizations, Users and Resource Centres • The ROC team is organized in daily shifts: 2 people per shift, 2 shifts per day, from Monday to Friday, covering working hours (8.30-19.30), with experts on call • Shifters come from both CNAF and the major Italian Centres. • Checklist to be covered during the shift: • log trouble tickets created, updated and closed, and problems on grid services and sites; monitor successful site certification • check the actions of the previous shift and the downtime page • check the status of production grid services and the GRIS status of production CEs and SEs • check the status of the production sites using the Site Functional Tests report • Periodic (every 15 days) phone conference (ROC teams and site managers) • The ROC reports to EGEE and draws on the activities of the Italian Production Grid Central Management Team (CMT) • Second level support: EU • Operation of the EU e-Infrastructure • The Italian ROC guarantees the weekly shifts rotating among major EU Centres.
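
The first-level coverage window described on this slide (Monday to Friday, 8.30-19.30) can be expressed as a small check. This is only an illustrative sketch: the function name `roc_shift_on_duty` is hypothetical, and it assumes the hours are local time and that both ends of the window are inclusive, neither of which the slide states.

```python
from datetime import datetime, time

# Working hours quoted on the slide; local time is an assumption.
SUPPORT_START = time(8, 30)
SUPPORT_END = time(19, 30)


def roc_shift_on_duty(moment: datetime) -> bool:
    """Return True if `moment` falls inside the first-level support window."""
    is_weekday = moment.weekday() < 5  # Monday is 0, Friday is 4
    return is_weekday and SUPPORT_START <= moment.time() <= SUPPORT_END
```

Outside this window the slide only mentions experts on call, so a caller would need a separate escalation path.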

  7. The Central Management Team (CMT) • Guarantees Release Distribution and Site Certification • The CMT is responsible for certification: dynamically checking the functionality and configuration of a site's services before including it in the Italian production grid. • In particular, the following checks are done systematically: • Information System data consistency. • Local job submission (LRMS). • Grid submission with Globus (globus-job-run). • Grid submission with the EGEE Resource Broker. • Replica Manager functionality. • To certify a site the CMT uses dedicated grid services located at CNAF - Bologna. • Only certified sites are dynamically included in the production grid, to guarantee the robustness of operations.
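
The "all checks must pass" logic of the certification step can be sketched as follows. The check names mirror the list on this slide, but the probe callables are hypothetical stand-ins: the slide does not say how the CMT actually runs each test, so nothing here should be read as the real tooling.

```python
from typing import Callable, Dict, List, Tuple

# Names follow the slide; the order is the order in which the slide lists them.
CHECKS: List[str] = [
    "information_system_consistency",
    "local_job_submission_lrms",
    "globus_job_run_submission",
    "resource_broker_submission",
    "replica_manager",
]


def certify_site(
    site: str, probes: Dict[str, Callable[[str], bool]]
) -> Tuple[bool, List[str]]:
    """Run every check against `site`; a missing probe counts as a failure.

    Returns (certified, names_of_failed_checks).
    """
    failed = []
    for name in CHECKS:
        probe = probes.get(name)
        if probe is None or not probe(site):
            failed.append(name)
    return (not failed, failed)
```

Collecting every failure, rather than stopping at the first one, gives the site manager a complete list to fix before asking for another run.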

  8. Grid Central Management Team • Site registration procedure • Site managers of the candidate site contact the Italian ROC (Regional Operation Centre) team, provide all the relevant information (contact points, SLA, etc.) and sign a policy document for acceptance. • The ROC team verifies the completeness of the information and then creates a new record for the site in the GOC database. • The site is flagged as 'candidate' in the DB. • The site administrators of the candidate site fill in the GOC database with additional information and ask the ROC to validate the site. • The site status is set to 'uncertified'. • Site Certification Procedure; the resource administrators of a site should: • Apply for DTEAM and INFNGRID VO membership in order to be able to submit test jobs that check the correctness of the local installation. • Ask the Central Management Team to perform acceptance tests before including the new site in the Information System. If the acceptance tests are successful, the site information is published in the Information System and the site status is set to 'certified' in the GOC database. • The site is included in the daily report and functional test of the production infrastructure.
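
The three site statuses named on this slide form a simple forward progression, which can be sketched as a small state machine. The status strings come from the slide; the `SiteStatus` enum, the `advance` helper and the choice to treat 'certified' as a final state are assumptions made for illustration (the slide does not describe suspension or decertification).

```python
from enum import Enum


class SiteStatus(Enum):
    CANDIDATE = "candidate"
    UNCERTIFIED = "uncertified"
    CERTIFIED = "certified"


# Allowed forward transitions only; anything else is rejected.
ALLOWED = {
    SiteStatus.CANDIDATE: {SiteStatus.UNCERTIFIED},
    SiteStatus.UNCERTIFIED: {SiteStatus.CERTIFIED},
    SiteStatus.CERTIFIED: set(),
}


def advance(current: SiteStatus, target: SiteStatus) -> SiteStatus:
    """Return `target` if the transition is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

For example, `advance(SiteStatus.CANDIDATE, SiteStatus.UNCERTIFIED)` succeeds, while jumping straight from 'candidate' to 'certified' raises an error, matching the requirement that acceptance tests come first.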

  9. INFNGRID-2.7.0: deployed services [Diagram labels: LCG 2.7; INFNGRID 2.7; GridICE; FTS; MyProxy; RB (DGAS); HLR; LFC; VOMS; BDII]

  10. INFN GRID 2.7.0 • INFN-GRID 2.7.0 customizations to LCG 2.7.0: • Support for the following VOs: • egrid, babar, zeus, biomed, magic, esr (managed via LDAP VO server); • libi, pamela, infngrid, cdf, gridit, compchem, planck, bio, enea, theophys, ingv, inaf, virgo, argo (managed via VOMS server); • euchina, eumed (optional and managed via VOMS server). • DGAS (DataGrid Accounting System): • Patched WMS lcg2.1.73 on the Resource Broker to support DGAS • DGAS HLR (Home Location Register) server: it is responsible for keeping the accounting information for both users and grid resources. • Network Monitor Element, interfaced with GridICE for data presentation. • Support for MPI jobs via home synchronization with scp and host-based authentication. • Special setup for Worker Nodes on a private network: • A local DNS can be run on an Apt+Kickstart server; the Computing Element acts as gateway for the Worker Nodes. • New profiles for Worker Nodes without AFS • Customized tools to install and use the grid: • installation by a customized version of LCG yaim (ig-yaim); • support for interfacing ig-yaim with a Quattor installation; • UIPnP: a Plug-and-Play User Interface to access the grid as a user of any Linux system without RPMs.

  11. User, Operation and VO support • The user support system provides a system for the exchange of tickets between: • ROC on Duty and site managers • Site managers and Central Management Team, and vice versa • Site manager and certification team during installation/upgrade • Global Grid User Support (GGUS) and the ROC. • The Italian ROC user support system is interfaced to the Global Grid User Support helpdesk application using web-services technologies

  12. The support system • The Italian ROC ticketing system is built upon Xhelp, a suite of web-based tools written in PHP • The support system components are accessible from the main interface of the deployment portal (grid-it.cnaf.infn.it), which provides a certificate-based SSO point of registration/identification. • The end-user can open a request, and view and follow his/her own tickets and related replies; • A supporter can view tickets assigned to his/her own groups, add responses and solutions, and change status/priority. • A ticket manager can, in addition, work with the FAQ and with ticket assignment (to other supporters and groups). • While working on tickets, side content is always available to all classes of users (according to their access level): • Site Functional Tests • site downtimes calendaring system • file archive • net query tools • IRC applet, contextual questions and answers • reports from CMT daily shifts
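
The three user classes described on this slide map naturally onto sets of permitted actions. The sketch below is not taken from Xhelp: the role and action names are hypothetical labels for the capabilities the slide lists, and it assumes the ticket manager inherits everything a supporter can do.

```python
# Actions per role, following the slide's description of each user class.
END_USER = {"open_request", "view_own_tickets"}
SUPPORTER = {"view_group_tickets", "add_response", "change_status_priority"}
# Assumption: a ticket manager has the supporter's rights plus FAQ and assignment.
TICKET_MANAGER = SUPPORTER | {"manage_faq", "assign_ticket"}

PERMISSIONS = {
    "end_user": END_USER,
    "supporter": SUPPORTER,
    "ticket_manager": TICKET_MANAGER,
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if `role` may perform `action`; unknown roles get nothing."""
    return action in PERMISSIONS.get(role, set())
```

With this table, `is_allowed("supporter", "assign_ticket")` is false while the same action is permitted for a ticket manager.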

  13. Grid Monitoring • The status of the Italian grid infrastructure is monitored using GridICE, a monitoring tool developed by INFN. • It is one of the monitoring tools used by EGEE • It is used to check: • the status of the submission queues • process/daemon status in the services (RB, BDII) • VO view: list of CEs and SEs available for a given VO and their status • Job monitoring

  14. Monitoring

  15. Accounting • The DataGrid Accounting System (DGAS) has been developed within the EDG and EGEE projects. • It implements resource usage metering and economic accounting in a fully distributed grid environment • It is part of the INFN-GRID middleware release and has been deployed on the Italian Grid Infrastructure • Grid computing resources and grid users are registered in appropriate servers, known as HLRs (Home Location Registers), which keep track of every submitted job. An arbitrary number of HLR servers can be used.
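
The idea of an HLR as a register that knows its users and resources and keeps a record of every job can be illustrated with a toy in-memory model. This is not the DGAS API or data model: the class, its methods and the `cpu_seconds` field are all hypothetical, and a single register holding both users and resources is a simplifying assumption.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class JobRecord:
    job_id: str
    user: str
    resource: str
    cpu_seconds: float  # hypothetical usage metric


@dataclass
class HomeLocationRegister:
    users: Set[str] = field(default_factory=set)
    resources: Set[str] = field(default_factory=set)
    jobs: List[JobRecord] = field(default_factory=list)

    def register_user(self, user: str) -> None:
        self.users.add(user)

    def register_resource(self, resource: str) -> None:
        self.resources.add(resource)

    def record_job(self, record: JobRecord) -> None:
        """Store a job record; both the user and the resource must be registered."""
        if record.user not in self.users:
            raise KeyError(f"unknown user: {record.user}")
        if record.resource not in self.resources:
            raise KeyError(f"unknown resource: {record.resource}")
        self.jobs.append(record)

    def usage_by_user(self) -> Dict[str, float]:
        """Sum the recorded usage for each user."""
        totals: Dict[str, float] = defaultdict(float)
        for job in self.jobs:
            totals[job.user] += job.cpu_seconds
        return dict(totals)
```

Because each register is independent, several instances could serve different groups of users or sites, in line with the slide's remark that an arbitrary number of HLR servers can be used.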

  16. DGAS HLR flow

  17. Jobs per site (15 – 31 January) Total jobs = 179,310

  18. Italian MW activities • MW components supported and released by INFN include: • WMS: Workload Management Service (with EDG, LCG, EGEE) for distributed scheduling and resource management in a Grid environment • Data Management Services • COSTANZA: Virtual Db replication and Replicas Consistency Service (with Grid.it) • SToRM: Storage Resource Management Service for storage allocation and file pinning with SRM interface over Unix file systems (with ICTP) • Portals and Grid User Interface: UI (with Datamat), Genius Portal (with Nice) • With PDA and cellular phone interface • VOMS: VO-oriented Authentication/Authorization Service (with LCG, EGEE, Grid.it) • GridICE: General Grid Monitoring Service (with LCG) • DGAS: Economy-based Grid Accounting Service (with EGEE) • G-PBOX: VO-oriented policy enforcing framework (with Grid.it) • VO-oriented User Support systems (with Grid.it) integrated with GGUS • Important outcome of Grid.it from CNR and Uni-Pi • Parallel Programming Environments (Assist) • All are made available under the general Open Source License of EGEE, are supported, and are evolving towards SOA

  19. Long term sustainability • Italian Grid Infrastructure, originally based on INFN-GRID project, was then extended to other Scientific and Academic Institutions in the Grid.It project funded by the Italian Ministry of Education, University and Research (MIUR). • New organisations needed to: • Support and operate the Italian Grid Infrastructure (IGI). • Manage and support long term availability of middleware releases to favour industrial take-up (c-Omega).

  20. IGI – Italian Grid Infrastructure • Originally conceived to provide national coordination of the different pieces of the national e-Infrastructure present in EGEE II. • Recognized at EU level as a Joint Research Unit, supported by MIUR. • Focus on setting up and operating a common e-Infrastructure for Italian science, including the main public resource providers: INFN, CNR, SPACI, ENEA, ICTP, INAF, INGV, Computing Centres, Regional Initiatives, etc. • Provide a consistent, coordinated Italian strategy as a step towards the European Grid Organization (EGO) and an interface to: • EU Grid infrastructure projects, eIRG and ESFRI • International activities • Support the activities of a vast range of scientific disciplines: Physics, Astrophysics, Biology, Health, Chemistry, Geophysics, Economy, Finance, with possible extensions to other sectors such as Civil Protection, e-Learning, and dissemination in universities and secondary schools. • Agreement reached on 23 Jan ’06.

  21. C-OMEGA • The main objective of c-OMEGA is to support innovation and the commercial exploitation of grids in Italy. • Other objectives are: • Become the national reference organization, also for activities at EU and international level, aiming to develop, support, disseminate and exploit a platform of Open Source components derived from current Grid project components and increasingly compliant with international standards • Favour synergy between research and academia and the industrial world, in particular SMEs, public services (Health, Administration, etc.) • Support, with training and dissemination activities and pilot projects, the early commercial adoption of grids to increase Italian and EU competitiveness. • Partners involved: Public Research Institutions, Universities, Computing Consortia, large end-user companies, international IT industries, national IT companies, SMEs, etc. • Good chances of providing a foundation for Italian and EU middleware support and its industrial exploitation • c-OMEGA, together with UK OMII, is at the foundation of the current EU OMII proposal.

  22. Conclusions • The first generation of Grid services in the LCG/EGEE and DEISA production Grids is currently in use in Italy and Europe. • These services are still evolving towards greater functionality, robustness and security. • Some needed services are still under test, under development or missing, and we may still discover that other important functionality is required by specific user communities. • Together with the National Initiatives, the EU needs to address the long term sustainability of GRID Infrastructures.
