
INFN-GRID Globus evaluation (WP 1)

Massimo Sgaravatto, INFN Padova, for the INFN Globus group (globus@infn.it, http://www.infn.it/globus)



Presentation Transcript


  1. INFN-GRID Globus evaluation (WP 1) • Massimo Sgaravatto, INFN Padova, for the INFN Globus group • globus@infn.it • http://www.infn.it/globus

  2. Globus • Some basic services (security, information service, resource management, …) must be in place before a Grid can be built and used for real applications • Globus identified as a possible Grid framework providing these services • … but it has been developed mainly for “traditional” computing, which differs from computing in HEP • High performance vs. high throughput • Supercomputers vs. PC farms • Distributed data-intensive computing not addressed • Need to assess what can be used in the HEP environment → WP1 “Installation and Evaluation of the Globus Toolkit” of the INFN-GRID Project • Goal: evaluation of the Globus toolkit • Which services can be useful? • What needs to be integrated or modified? • What is missing?

  3. Globus Architecture • Applications • High-level Services and Tools: GlobusView, Testbed Status, DUROC, MPI, MPI-IO, CC++, Nimrod/G, globusrun • Core Services: Nexus, GRAM, Metacomputing Directory Service, Globus Security Interface, Heartbeat Monitor, Gloperf, GASS • Local Services: Condor, MPI, TCP, UDP, LSF, Easy, NQE, AIX, Irix, Solaris

  4. Proposed work plan • Security • Mechanisms for user authentication and authorization are needed to access GRID resources • → Evaluation of GSI service • Information Service • Mechanisms to “publish” the GRID resources (CPU, storage, network, …) must be defined so that they can be discovered • → Analysis of GIS service to “publish” information using a uniform and standard interface • Resource Management • A uniform interface is needed to submit jobs to GRID resources • Uniform standard interface to different resource management systems • Uniform standard language for task management • → Assessment of Globus services for resource allocation and process management

  5. Proposed work plan • Data Access and Migration • High performance and reliable tools needed to “manage” data (access to remote data, data transfers, wide area replica, …) • → Assessment of Globus tools for data management (GASS, Globusftp) • Fault Monitoring • Faults in a GRID environment must be promptly detected and recovery mechanisms must be implemented • → Evaluation of HBM service for fault detection • Execution Environment Management • Code migration (moving the application where the job will actually be executed) as a possible implementation strategy • → Evaluation of GEM service to support code migration • Globus installation tools • Reduce complexity and manpower for Globus installation and maintenance

  6. Globus installation tools • Flavia’s presentation • INFN-GRID installation tool to shorten the installation time of the Globus toolkit, avoid common mistakes, and support specific customizations • Options to install optional software, to apply INFN-specific customizations (INFN CA, configuration of a hierarchical GIS architecture), and to install and use specific INFN tools • Proven successful within INFN (used to set up an INFN GRID testbed) and also outside (CERN, FNAL, …)

  7. Security • Evaluation of Globus GSI • User authentication (implementation based on X.509 certificates) • User authorization “managed” by grid-mapfile (mapping between Grid users and local users) • Some shortcomings, but the GSI security model seems to satisfy our requirements • Some shortcomings already addressed • INFN-CA used to sign certificates • CRL (issued by INFN CA) distribution • Centralized management of grid-mapfiles

  8. Security • Centralized management of the grid-mapfiles • Goal: ease the sharing of the same access policies (represented by the grid-mapfiles) among groups of hosts with common purposes • Proposed system • Central repository (LDAP server) to store user certificates and to define groups of users • Certificates published by the CA manager • Group manager responsible for editing group memberships (using an LDAP client) • Resource owners (Globus administrators) periodically (e.g. via a cron job) “connect” to this repository, “download” the subjects of the certificates that meet a specified criterion (e.g. all users of group X), and produce grid-mapfile entries
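The last step of the proposed system, turning the downloaded certificate subjects into grid-mapfile entries, can be sketched as follows. This is a minimal illustration, not the INFN tool: the `User` record, the `REPOSITORY` list standing in for the LDAP server, the group names, and the local account `cmsuser` are all hypothetical. The only grounded part is the entry layout, a quoted certificate subject followed by a local user name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """Hypothetical record for one entry of the central repository."""
    subject: str        # X.509 certificate subject
    groups: frozenset   # names of the groups the user belongs to


def grid_mapfile_entries(users, group, local_account):
    """Return one grid-mapfile line per member of `group`, sorted by subject."""
    subjects = sorted(u.subject for u in users if group in u.groups)
    return [f'"{subject}" {local_account}' for subject in subjects]


# Hypothetical repository content; a real setup would query the LDAP server.
REPOSITORY = [
    User("/C=IT/O=INFN/L=Padova/CN=Example User One", frozenset({"cms"})),
    User("/C=IT/O=INFN/L=Bologna/CN=Example User Two", frozenset({"cms", "atlas"})),
    User("/C=IT/O=INFN/L=Padova/CN=Example User Three", frozenset({"atlas"})),
]

if __name__ == "__main__":
    # A cron job could run this and write the lines to the grid-mapfile.
    for line in grid_mapfile_entries(REPOSITORY, "cms", "cmsuser"):
        print(line)
```

Mapping a whole group to one shared local account is only one possible policy; a per-user mapping would need an extra attribute in the repository.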

  9. Security • AFS tests • Analysis of what can be done now with the existing tools (quite unfit for any real need) • Possible ways to address the existing shortcomings identified • New Globus tool (gsiklog) available

  10. Information Service • Alessandro’s presentation • Evaluation of Globus GIS (Grid Information Service) • Definition and implementation of a hierarchical architecture of GIS 1.1.3 • Performance and scalability tests • Web interface for browsing • Various shortcomings must be addressed (to use the GIS in a production environment) • Mixed push/pull model more suitable than a pull model • Performance • Lack of security • …

  11. INFN GIS Topology Dc=infn,dc=it, o=grid Top Level INFN GIIS Exp=cms, o=grid Dc=bo, Dc=infn, dc=it,o=grid INFN CMS GIIS Dc=pd,Dc=infn, dc=it,o=grid GIIS GIIS GRIS Padova Bologna

  12. Resource Management • Most of these activities carried out in collaboration with the Grid Workload Management work package • Evaluation of Globus resource management architecture • Evaluation of Globus GRAM • Tests with fork, Condor, LSF and PBS as underlying resource management systems • The model is fine, but it lacks the “robustness” needed for real production environments • Memory leaks in the Globus job manager (fixed) • Scalability (one job manager for each job) • Reliability (the job manager is not persistent) • …

  13. Globus resource management architecture (simplified design) • Submit jobs (RSL) → Broker • Broker: resource discovery through the Grid Information Service (GIS); chooses the resources to which the jobs are submitted (not implemented in the Globus framework) • GIS: information on characteristics and status of local resources • Broker → (RSL) → Globus GRAM at Site1, Site2, Site3: uniform interface to different local resource management systems • Local Resource Management Systems: CONDOR, LSF, PBS • Farms
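The broker role in this design (choosing a resource from the information published in the GIS) is not part of the Globus framework, so the following is only a sketch of one possible selection policy. The attribute names `contact` and `free_cpus`, the host names, and the "most free CPUs" rule are assumptions made for illustration.

```python
def choose_resource(resources, cpus_needed):
    """Return the contact of the resource with the most free CPUs.

    `resources` is a list of dicts with the hypothetical keys
    "contact" and "free_cpus". Returns None if no resource has
    at least `cpus_needed` free CPUs.
    """
    candidates = [r for r in resources if r["free_cpus"] >= cpus_needed]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["free_cpus"])["contact"]


# Hypothetical snapshot of what an information service might report.
SNAPSHOT = [
    {"contact": "site1.example.org/jobmanager-condor", "free_cpus": 4},
    {"contact": "site2.example.org/jobmanager-lsf", "free_cpus": 12},
    {"contact": "site3.example.org/jobmanager-pbs", "free_cpus": 0},
]

if __name__ == "__main__":
    print(choose_resource(SNAPSHOT, cpus_needed=2))
```

A production broker would also have to cope with stale information and with jobs that fail after submission, which this sketch ignores.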

  14. Resource Management • Evaluation of GRAM API • Evaluation of GRAM Reporter (“cooperation” between GRAM and GIS), in particular for farms • Many useless attributes (at least for our needs), attributes not calculated (always defined as 0), some attributes not properly calculated by Globus shell scripts • Some important information describing the farms and the submitted jobs (necessary, for example, for a resource broker) is missing • → Draft proposal for a possible modification of the default schema • Evaluation of RSL as uniform language to specify resources • More flexibility required • Submission of Condor jobs to Globus resources • Condor-G (useful as a reliable crash-proof job submission service) • GlideIn • Evaluation of MPICH-G2 vs. MPICH • Some shortcomings found (lack of support for shared memory, worse latency than MPICH)
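To make the role of RSL as a uniform job description language concrete, here is a small sketch that formats a set of attributes as an RSL conjunction of the form `&(name=value)(name=value)`. The helper `to_rsl` is hypothetical and not part of any Globus API; it covers only simple single-valued attributes, and it rejects values containing double quotes instead of guessing at an escaping rule.

```python
def to_rsl(attributes):
    """Format a dict of single-valued attributes as an RSL conjunction.

    Values containing whitespace are wrapped in double quotes so that
    they remain a single token. Values that already contain a double
    quote are rejected, since escaping is outside this sketch.
    """
    parts = []
    for name, value in attributes.items():
        text = str(value)
        if '"' in text:
            raise ValueError(f"unsupported character in value for {name!r}")
        if any(ch.isspace() for ch in text):
            text = f'"{text}"'
        parts.append(f"({name}={text})")
    return "&" + "".join(parts)


if __name__ == "__main__":
    # Prints: &(executable=/bin/hostname)(count=1)
    print(to_rsl({"executable": "/bin/hostname", "count": 1}))
```

A string produced this way could be passed to a submission tool such as globusrun; the exact command-line options depend on the installed toolkit version and are not shown here.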

  15. Data management • Tests with GASS • Service to ease access to remote files without a distributed file system and/or to transfer files from/to remote storage systems • Tests with command line tools and APIs • Problems (sharp drop in transfer rate) when transferring large files • Tests with Globusftp alpha release 2 • Collaboration with WP network INFN-GRID • Tests of new features • Support for GSI mechanisms • Capability of resuming interrupted file transfers • Throughput tests using parallel data transfers • Antonio’s presentation
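The two transfer features tested above, parallel data transfers and resuming an interrupted transfer, both come down to dividing the bytes still to be moved into ranges. The sketch below shows that bookkeeping only; it is not how Globusftp implements either feature, and the function name and parameters are hypothetical.

```python
def split_ranges(total_size, streams, already_received=0):
    """Split the bytes not yet received into contiguous (offset, length) chunks.

    The bytes in [already_received, total_size) are divided as evenly as
    possible among at most `streams` chunks; empty chunks are dropped.
    """
    if streams < 1:
        raise ValueError("streams must be at least 1")
    if not 0 <= already_received <= total_size:
        raise ValueError("already_received must be between 0 and total_size")
    remaining = total_size - already_received
    base, extra = divmod(remaining, streams)
    ranges = []
    offset = already_received
    for i in range(streams):
        # The first `extra` chunks carry one additional byte each.
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue
        ranges.append((offset, length))
        offset += length
    return ranges


if __name__ == "__main__":
    print(split_ranges(10, 3))                       # fresh transfer, 3 streams
    print(split_ranges(10, 3, already_received=4))   # resumed after 4 bytes
```

Each (offset, length) pair could be handed to a separate stream; resuming simply restarts the split from the number of bytes already on disk.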

  16. Other services • Fault Monitoring (HBM) • Evaluation of HBM for fault detection (for “system” and “user” processes) • … but the HBM package is not seeing active development • Execution Environment Management (GEM) • Evaluation of GEM as service for code migration • … but the GEM service now provides only limited capabilities (executable staging)

  17. WP 1: Deliverables & Milestones • Deliverables • Tools, documentation and operational procedures for Globus deployment (6 months) • Final report on suitability of the Globus toolkit as basic Grid infrastructure (6 months) • Milestones • Basic deployment Grid infrastructure for the INFN GRID (6 months) • Globus installed on ~40 machines at ~10 different sites

  18. Conclusions • The activities of WP 1 are over • The Globus toolkit can provide basic services useful to create and deploy usable Grids, but many shortcomings and issues must be addressed • … more details in the report • Other info: http://www.infn.it/globus
