
GRID Implementation and Requirements

GRID Implementation and Requirements. F. Semeria INFN-Bologna HEPix/HEPnt LAL, 23 April 2001. Introduction. In a distributed environment like a Grid, one of the primary needs is to collect and retrieve resource information. The Grid Information Service (GIS)

Presentation Transcript


  1. GRID Implementation and Requirements F. Semeria INFN-Bologna HEPix/HEPnt LAL, 23 April 2001

  2. Introduction • In a distributed environment like a Grid, one of the primary needs is to collect and retrieve resource information. • The Grid Information Service (GIS) is the way of making information available to Grid applications.

  3. GIS=GRIS+GIIS • Globus implements the GIS by using two kinds of LDAP (v2) servers: • GRIS (Grid Resource Information Service) runs on each resource (machine). It uses an LDAP shell backend to gather the resource configuration and status. It registers itself to a GIIS providing info about itself.

  4. GIS=GRIS+GIIS (cont.) • GIIS (Grid Index Information Service) is an LDAP server that usually runs on a few machines per organization and is a search engine for the GRIS’s registered under it.

  5. Why LDAP? • PROs: • it is a standard way to describe and collect data. • it provides an effective distributed model for the data. • CONs: • directories are designed more for reading than for writing. Good for address book or NIS, but not for storing dynamic data like the CPU load of a machine.

  6. General implementation • The proposed implementation of the Information Service is a hierarchical structure of servers (GIIS’s) with a root server at CERN.

  7. General implementation (cont.) • Each organization has its top level GIIS registered on the root server, but can choose its own low level topology

  8. Diagram, GIIS hierarchy:
     EU GIIS (CERN): o=grid
       INFN (Italy): dc=infn,dc=it,o=grid
       France GIIS: dc=fr,o=grid
         IN2P3 (France): dc=in2p3,dc=fr,o=grid
       LIP (Portugal): dc=lip,dc=pt,o=grid
       ...

  9. INFN implementation • INFN has implemented a hierarchical structure of GIIS based on INFN departments (about 25) • Each GRIS registers itself to the site GIIS which in turn registers itself to the top level INFN GIIS

  10. Diagram, INFN hierarchy:
      Top level GIIS: dc=infn,dc=it,o=grid
        GIIS Bologna: dc=bo,dc=infn,dc=it,o=grid
          GRIS
        GIIS Milano: dc=mi,dc=infn,dc=it,o=grid
          GRIS
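
The suffixes in the two diagrams nest by appending components on the right: dc=bo,dc=infn,dc=it,o=grid lies under dc=infn,dc=it,o=grid, which lies under o=grid. A minimal Python sketch of that containment check follows. The helper names are illustrative, and the parsing is simplified (it assumes no escaped commas and no multi-valued RDNs). It models only how the names nest, not the GRIS/GIIS registration protocol itself.

```python
def parse_dn(dn):
    """Split a DN such as 'dc=bo,dc=infn,dc=it,o=grid' into (attr, value) pairs."""
    # Simplified parsing: no escaped commas, no multi-valued RDNs.
    pairs = []
    for rdn in dn.split(","):
        attr, _, value = rdn.strip().partition("=")
        pairs.append((attr.strip().lower(), value.strip().lower()))
    return pairs


def is_under(dn, base):
    """Return True if dn equals base or lies below it in the directory tree."""
    dn_parts, base_parts = parse_dn(dn), parse_dn(base)
    if len(base_parts) > len(dn_parts):
        return False
    # The base must match the rightmost components of the DN.
    return dn_parts[len(dn_parts) - len(base_parts):] == base_parts


print(is_under("dc=bo,dc=infn,dc=it,o=grid", "dc=infn,dc=it,o=grid"))   # True
print(is_under("dc=lip,dc=pt,o=grid", "dc=infn,dc=it,o=grid"))          # False
```

A search with base o=grid and subtree scope would therefore cover every suffix shown on the slides.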

  11. INFN top level GIIS • 11 GIIS’s registered • More than 40 GRIS’s

  12. http://bond.cnaf.infn.it/cgi-bin/mdsbrowse1.pl

  13. GIS Requirements • Each experiment needs to be able to select its own set of machines (with its own name space) • We need more attributes to describe the status of jobs and machines. • Data replication for failure recovery and mirroring

  14. Experiments resources • Each GRIS can register itself to several GIIS’s. • This allows repartitioning of resources by experiment.

  15. Diagram: Top level INFN GIIS (dc=infn,dc=it,o=grid); EU CMS GIIS (ou=cms,o=grid); GIIS Bologna; GIIS Milano; GRIS; GRIS
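
The idea of slides 14 and 15, one GRIS registered with several GIIS’s, can be modelled with a toy in-memory registry. This is only a sketch: the `Registry` class and the host names are hypothetical, and real registration happens through the Globus LDAP servers, not through Python objects.

```python
from collections import defaultdict


class Registry:
    """Toy stand-in for GRIS-to-GIIS registrations (hypothetical, in memory only)."""

    def __init__(self):
        self._members = defaultdict(set)  # GIIS suffix -> registered GRIS hosts

    def register(self, gris_host, giis_suffix):
        self._members[giis_suffix].add(gris_host)

    def resources(self, giis_suffix):
        # Use .get so that asking about an unknown GIIS does not create an entry.
        return sorted(self._members.get(giis_suffix, set()))


registry = Registry()
# A Bologna machine registers with its site GIIS and with the CMS GIIS.
registry.register("gris1.bo.example", "dc=bo,dc=infn,dc=it,o=grid")
registry.register("gris1.bo.example", "ou=cms,o=grid")
# A Milano machine registers with its site GIIS only.
registry.register("gris2.mi.example", "dc=mi,dc=infn,dc=it,o=grid")

print(registry.resources("ou=cms,o=grid"))  # ['gris1.bo.example']
```

The same machine appears in the site view and in the experiment view, which is what allows resources to be repartitioned by experiment without changing the site hierarchy.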

  16. Jobs and machines info • The underlying resource management systems, like Condor, LSF and PBS, provide useful information about machines and jobs that should be published in the GIS.

  17. Examples of jobs info • job id • current status of the job • the size of the executable • the name of the user • the submitting and the executing host • why the job is not running • etc.

  18. Example of machines info • the total and available physical memory and swap space • the speed of the machine in MIPS • the number of CPUs • the CPU load average • etc.

  19. Extending the GRIS • The GRIS uses programs called information providers to collect information from the machines. • The requirements for an information provider are: • the program must emit LDIF objects to stdout • the object generated must respect the GLOBUS schema
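
A minimal sketch of an information provider in Python, following the first requirement above: it writes one LDIF object to stdout. The object class and attribute names used here (`ExampleComputeResource`, `hn`, `cpucount`, `cpuload1`) are placeholders chosen for illustration and are not taken from the Globus schema; a real provider would have to use the names that schema defines. Values are written as plain text, so anything that LDIF requires to be base64-encoded is out of scope.

```python
import os
import sys


def machine_entry(hostname, base_dn):
    """Collect a few machine facts as (attribute, value) pairs."""
    attrs = [
        ("dn", f"hn={hostname},{base_dn}"),
        # Placeholder object class: NOT from the Globus schema.
        ("objectclass", "ExampleComputeResource"),
        ("hn", hostname),
        ("cpucount", str(os.cpu_count() or 1)),
    ]
    try:
        load1, _load5, _load15 = os.getloadavg()
        attrs.append(("cpuload1", f"{load1:.2f}"))
    except (AttributeError, OSError):
        # Load average is not available on every platform; omit it then.
        pass
    return attrs


def to_ldif(attrs):
    """Render one entry as LDIF: 'name: value' lines ended by a blank line."""
    return "\n".join(f"{name}: {value}" for name, value in attrs) + "\n\n"


if __name__ == "__main__":
    entry = machine_entry("host.example.org", "dc=bo,dc=infn,dc=it,o=grid")
    sys.stdout.write(to_ldif(entry))
```

Job information from a batch system could be published the same way, with one LDIF object per job.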

  20. Data flow • Information is not pushed periodically from a GRIS to a GIIS; instead, the GIIS queries the GRIS’s when an application needs information.

  21. Diagram: application --query--> GIIS --query--> GRIS

  22. Caching • Information is stored in a cache for a period of time (TTL = Time To Live). • The higher the level of the GIIS, the higher the TTL and the lower the level of detail.

  23. Diagram: application --query--> GIIS; cache not expired: the GIIS answers from its cache; cache expired: GIIS --query--> GRIS
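
The pull model with a TTL cache described on slides 20 to 23 can be sketched in a few lines of Python. This is an illustration of the idea, not the Globus implementation: `CachingIndex`, `fake_gris` and the injected clock are hypothetical names, and a real GIIS would query the GRIS over LDAP.

```python
import time


class CachingIndex:
    """Answer lookups from a cache and query the resource only after the TTL expires."""

    def __init__(self, query_resource, ttl_seconds, clock=time.monotonic):
        self._query_resource = query_resource  # callable standing in for a GRIS query
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # resource name -> (expiry time, data)

    def lookup(self, resource):
        now = self._clock()
        cached = self._cache.get(resource)
        if cached is not None and now < cached[0]:
            return cached[1]  # cache not expired: no query to the GRIS
        data = self._query_resource(resource)  # cache expired or empty: pull
        self._cache[resource] = (now + self._ttl, data)
        return data


calls = []


def fake_gris(resource):
    """Stand-in for a slow GRIS query; records each call."""
    calls.append(resource)
    return {"cpuload": 0.5}


now = [0.0]  # manually advanced clock, so the example needs no sleeping
index = CachingIndex(fake_gris, ttl_seconds=30, clock=lambda: now[0])
index.lookup("gris1")  # cache empty: queries the GRIS
index.lookup("gris1")  # within the TTL: served from the cache
now[0] = 31.0
index.lookup("gris1")  # TTL expired: queries the GRIS again
print(len(calls))  # 2
```

A higher-level index would simply be created with a larger `ttl_seconds`, trading freshness for fewer queries.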

  24. Performance • In the worst case the whole set of machines must be queried. • Some indexing techniques should be used to implement a search space pruning. • A periodic information update mechanism could also be investigated.
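
One possible form of the search space pruning mentioned above is an inverted index over attributes that rarely change, so that only matching GRIS’s are queried for dynamic data. The Python sketch below is an assumption about how such an index could look, not something the slides specify; `AttributeIndex`, the attribute names and the host names are all hypothetical.

```python
from collections import defaultdict


class AttributeIndex:
    """Inverted index from (attribute, value) to the GRIS names that advertise it."""

    def __init__(self):
        self._index = defaultdict(set)
        self._all = set()

    def add(self, gris, attributes):
        self._all.add(gris)
        for item in attributes.items():
            self._index[item].add(gris)

    def candidates(self, **wanted):
        """Return the GRIS names matching every requested attribute value."""
        if not wanted:
            return set(self._all)  # no constraints: the worst case, query everything
        sets = [self._index.get(item, set()) for item in wanted.items()]
        return set.intersection(*sets)


index = AttributeIndex()
index.add("gris-bo-1", {"os": "linux", "experiment": "cms"})
index.add("gris-mi-1", {"os": "linux", "experiment": "other"})
index.add("gris-mi-2", {"os": "solaris", "experiment": "cms"})

print(sorted(index.candidates(os="linux", experiment="cms")))  # ['gris-bo-1']
```

Only static attributes suit this kind of index; fast-changing values such as CPU load would still have to be pulled from the selected machines.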

  25. Some tests • We have tested how performance depends on caching and CPU load. • Tests were made over a WAN. • The same queries on a GIIS take < 1 sec. when the cache is on and > 10 sec. when it is off.

  26. Some tests (cont.) • When a GRIS has a loaded CPU, the response time from its own GIIS is much higher when the cache is expired (> 1 min. vs 1 sec.). • Also, when a GIIS has a loaded CPU and the cache is not expired, the response time is higher (6-7 sec.): better not to use a GIIS for computation!

  27. Security and access policies • In the current implementation any machine can register itself to a GIIS. • There is no access control when searching the GIIS. From any LDAP client I can run: ldapsearch -p 389 -h mds.infn.it -b "o=grid" -s sub "*=*" and get all the information from the GIIS.

  28. Conclusions • The Globus Information Service is based on a standard protocol (LDAP). • It provides flexibility and a potentially good distributed data model. • But...

  29. Conclusions (cont.) • A good topology for the HEP experiments must still be implemented. • The GRIS must be extended with new information providers. • Data replication is lacking. • Some new mechanism should be introduced to improve performance and security.
