
Computing Element & Torque Server Installation

Computing Element & Torque Server Installation. Giuseppe Andronico INFN Sezione di Catania First EELA Tutorial Madrid - 21 February 2006. Outline. What is a Computing Element (CE) ? What is a Torque Server ? How to install a Computing Element and a Torque Server. How to configure.


Presentation Transcript


  1. Computing Element & Torque Server Installation Giuseppe Andronico INFN Sezione di Catania First EELA Tutorial Madrid - 21 February 2006

  2. Outline
     • What is a Computing Element (CE)?
     • What is a Torque Server?
     • How to install a Computing Element and a Torque Server.
     • How to configure.
     EELA First Tutorial, Madrid, 21.02.2006

  3. What is a CE?
     • The CE is a service representing a computing resource.
     • Its main function is job management (job submission, job control, etc.).
     • For job submission, the CE can work in:
       • push model (the job is pushed to a CE for execution);
       • pull model (the CE asks the WMS for jobs).

  4. What is Torque?
     • TORQUE (Tera-scale Open-source Resource and QUEue manager) is a resource manager providing control over batch jobs and distributed compute resources.
     • The Torque system is composed of:
       • pbs_server, which provides the basic batch services, such as receiving/creating a batch job or protecting the job against system crashes.
       • job_scheduler, which implements the site's policy and uses it to decide which job must be executed.
       • pbs_mom, which places the job into execution. It is also responsible for returning the job's output to the user.

  5. Installing CE + Torque Server

  6. Installation
     • Start from a fresh install of SLC 3.0.4.
     • Installation via:
       • Installer script (http://glite.web.cern.ch/glite/packages)
       • APT (http://glite.web.cern.ch/glite/packages/APT.asp)
     • Installation will install all dependencies, including:
       • other necessary gLite modules
       • external dependencies
     • Java is not included in the distribution. Install it separately (>= 1.4.2_10): http://java.sun.com/j2se/1.4.2/download.html

  7. Installing pre-requisites
     • Request a host certificate for the CE:
       • https://gilda.ct.infn.it/CA/mgt/restricted/srvreq.php
     • Install the host certificate (hostcert.pem and hostkey.pem) in /etc/grid-security:
         chmod 644 hostcert.pem
         chmod 400 hostkey.pem
     • If you plan to use certificates issued by CAs that EGEE does not support, make sure that their public keys and CRLs (usually distributed as an RPM) are installed.
     • The CRLs of the GILDA CA are available from https://gilda.ct.infn.it/RPMS/ca_GILDA-0.28.1.i386.rpm
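The two chmod settings above are easy to get wrong after copying files around, so a quick check can help. The following is a minimal sketch, assuming Python 3 and the file names and modes given on the slide; the function name check_host_cert_modes is hypothetical and is not part of gLite.

```python
import os
import stat

# Modes requested on the slide: 644 for the certificate, 400 for the key.
EXPECTED_MODES = {"hostcert.pem": 0o644, "hostkey.pem": 0o400}


def check_host_cert_modes(directory="/etc/grid-security"):
    """Return a list of problems found; an empty list means both files look fine."""
    problems = []
    for name, expected in EXPECTED_MODES.items():
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            problems.append(f"{path}: missing")
            continue
        # S_IMODE keeps only the permission bits of the file mode.
        actual = stat.S_IMODE(os.stat(path).st_mode)
        if actual != expected:
            problems.append(f"{path}: mode {actual:o}, expected {expected:o}")
    return problems
```

Calling check_host_cert_modes() on the CE node and printing the returned list would show any file that is missing or has the wrong mode.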

  8. Installing pre-requisites (cont.)
     • The Resource Management System must be installed on the CE node, or on a separate dedicated node, before installing and configuring the CE module.
     • This release of the CE module supports PBS, Torque and LSF.

  9. Installing CE+Torque Server via apt
     • Verify whether apt is present:
         rpm -qa | grep apt
     • Install apt if necessary:
         rpm -ivh http://linuxsoft.cern.ch/cern/slc30X/i386/SL/RPMS/apt-0.5.15cnc6-8.SL.cern.i386.rpm
     • Add the gLite apt repository:
       • Put this line in a file (e.g. glite.list) inside the /etc/apt/sources.list.d directory (R 1.4):
           rpm http://glitesoft.cern.ch/EGEE/gLite/APT/R1.4/ rhel30 externals Release1.4 updates
       • Then run:
           apt-get update
           apt-get upgrade
     • Install Torque Server + CE:
         apt-get install glite-torque-server-config
         apt-get install glite-ce-config
     See http://glite.web.cern.ch/glite/packages/APT.asp

  10. Installing CE+Torque Server via apt (cont.)
      • If the installation succeeds, the following components are installed:
        • gLite in /opt/glite
        • Condor in /opt/condor-x.y.z (where x.y.z is the current Condor version)
        • Globus in /opt/globus
        • Tomcat in /var/lib/tomcat5
        • Torque in /var/spool/pbs

  11. Torque Server & CE Configuration
      • Configuration is done by running Python scripts, which take XML files as input.
      • Services are therefore configured by editing these XML files.
      • Attributes in the XML files are well commented and self-explanatory.
      • The XML files are provided as templates under /opt/glite/etc/config/templates
      • Copy the template files to /opt/glite/etc/config
      • Edit each of them separately.
      • Then launch the configuration scripts for the Torque Server and the CE.
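The copy step above can be sketched as follows. This is an illustration only, assuming Python 3 and the two directories named on the slide; the helper name copy_templates and the choice to skip files that already exist (so that edited files are never overwritten) are assumptions, not gLite behaviour.

```python
import glob
import os
import shutil


def copy_templates(template_dir="/opt/glite/etc/config/templates",
                   config_dir="/opt/glite/etc/config"):
    """Copy *.cfg.xml templates into config_dir and return the files created."""
    copied = []
    for src in sorted(glob.glob(os.path.join(template_dir, "*.cfg.xml"))):
        dst = os.path.join(config_dir, os.path.basename(src))
        if os.path.exists(dst):
            continue  # keep a file that may already have been edited
        shutil.copy(src, dst)
        copied.append(dst)
    return copied
```

Running it a second time returns an empty list, which makes it safe to repeat after a partial configuration.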

  12. Torque Server & CE Configuration (cont.)
      • List of XML files to configure:
          glite-global.cfg.xml
          glite-security-utils.cfg.xml
          glite-torque-server.cfg.xml
          glite-rgma-common.cfg.xml
          glite-rgma-gin.cfg.xml
          glite-rgma-servicetool.cfg.xml
          glite-rgma-servicetool-serviceName.cfg.xml
          glite-dgas-client.cfg.xml
          glite-ce.cfg.xml

  13. glite-global.cfg.xml
      <JAVA_HOME
        description="Environment variable pointing to the SUN Java JRE or J2SE package for example '/usr/java/j2re1.4.2_08/' or '$JAVA_HOME' (if it is defined as an environment variable)"
        value="/usr/java/j2sdk1.4.2_10"/>
      Check which Java package is installed on your system and set the value accordingly.

  14. glite-security-utils.cfg.xml
      • Set the parameters needed to correctly build the files required by GSI.
      • Enable the fetch-crl cron job:
      <install.fetch-crl.cron
        description="Install the glite-fetch-crl cron job. Possible values are 'true' (install the cron job) or 'false' (do not install the cron job)"
        value="true"/>

  15. glite-security-utils.cfg.xml (cont.)
      • Enable the glite-mkgridmap cron job:
      <install.mkgridmap.cron
        description="Install the glite-mkgridmap cron job and run it once. Possible values are 'true' (install the cron job) or 'false' (do not install the cron job)"
        value="true"/>

  16. /opt/glite/etc/glite-mkgridmap.conf
      Edit /opt/glite/etc/glite-mkgridmap.conf as follows:
        #### GROUP: group URI [lcluser]
        group vomss://voms.ct.infn.it:8443/voms/gilda?/gilda .gilda
        gmf_local /opt/glite/etc/grid-mapfile-local
      Edit /opt/glite/etc/grid-mapfile-local as follows:
        "/gilda/*" .gilda

  17. glite-torque-server.cfg.xml
      <instance name="gil09.ciemat.es" service="wn-torque">
        <parameters>
          <torque-wn.name
            description="worker node name to be used by the torque server. It can also be the CE itself. Example: lxb1426.cern.ch."
            value="gil09.ciemat.es"/>
          <torque-wn.number.processors
            description="Number of virtual processors on the node. Example: 1,2, .... [Type: string]"
            value="1"/>
        </parameters>
      </instance>

  18. glite-rgma-common.cfg.xml
      <rgma.server.hostname
        description="Host name of the R-GMA server. [Example: lxb1420.cern.ch] [Type: 'string']"
        value="rgmasrv.ct.infn.it"/>
      <rgma.schema.hostname
        description="Host name of the R-GMA schema service. (See also configuration parameter 'rgma.server.run_schema_service' in the R-GMA server configuration file in case you install a server). [Example: lxb1420.cern.ch] [Type: 'string']"
        value="rgmasrv.ct.infn.it"/>

  19. glite-rgma-common.cfg.xml (cont.)
      <rgma.registry.hostname
        description="Host name of the R-GMA registry service. You must specify at least one hostname and you can specify several if you want to use several registries. (See also configuration parameter 'rgma.server.run_registry_service' in the R-GMA server configuration file in case you install a server). [Example: lxb2029.cern.ch] [Type: 'string']">
        <value>rgmasrv.ct.infn.it</value>
      </rgma.registry.hostname>

  20. glite-rgma-gin.cfg.xml
      <rgma.gin.run_generic_info_provider
        description="Run generic information provider (gip) backend (yes|no). GIP cannot be used together with the R-GMA CE provider and this parameter must be set to no if gin is used on a gLite CE node (where rgma.gin.run_ce_provider is set to yes by default) [Example='yes'][Type='string']"
        value="no"/>

  21. glite-rgma-gin.cfg.xml (cont.)
      <rgma.gin.run_fmon_provider
        description="Run fmon backend (yes|no). This is used by LCG for gridice. If desired, this provider can be used together with either GIP or the CE Provider [Example='yes'][Type='string']"
        value="yes"/>
      If this is set to yes, you have to install an extra package, gridice-sensor-1.5.1-pl5_sl3.i386.rpm, available from http://infnforge.cnaf.infn.it/project/showfiles.php?group_id=8

  22. glite-rgma-servicetool.cfg.xml
      • Define the parameters for the gLite R-GMA servicetool service:
      <rgma.servicetool.sitename
        description="DNS name of the site publisher node. This parameter must have the same value as the rgma.site-publisher.sitename parameter in the R-GMA Server configuration. [Example: lxb2029.cern.ch] [Type: 'string']"
        value="${HOSTNAME}"/>

  23. glite-rgma-servicetool-serviceName.cfg.xml
      <rgma.servicetool.service_type
        description="The service type. This should be uniquely defined for each service type. The naming convention is org.glite.subsystemname.componentname for gLite components and corresponding names for external components. [Example: org.glite.rgma.server] [Type: 'string']"
        value="org.glite.rgma.server"/>
      <rgma.servicetool.name
        description="Name of the service. This should be globally unique. The naming convention is hostname_voname_servicetype. [Example: ${HOSTNAME}_${vo.name}_${rgma.servicetool.type}] [Type: 'String']"
        value="${HOSTNAME}_${vo.name}_${rgma.servicetool.type}"/>

  24. glite-rgma-servicetool-serviceName.cfg.xml (cont.)
      <rgma.servicetool.vo
        description="List of VOs that this service is considered part of. Optional parameter - you can specify one or several or it can be left empty or be removed. [Example: EGEE] [Type: 'string']">
        <value>gilda</value>
      </rgma.servicetool.vo>

  25. glite-dgas-client.cfg.xml
      <dgas-client.atmClient.resource.PA.id
        description="Specifies the contact string of the PA where the Computing Element is registered (i.e. the PA that is responsible for setting the CE's price). The PA contact string is formed as: PA host name:port:subject of host cert"
        value="grid-demo1.ct.infn.it:56567:/C=IT/O=GILDA/OU=Host/L=INFN Catania/CN=grid-demo1.ct.infn.it/emailAddress=gilda-ca@ct.infn.it"/>

  26. glite-dgas-client.cfg.xml (cont.)
      <dgas-client.atmClient.resource.Bank.id
        description="Specifies the contact string of the site HLR where the Computing Element is registered (i.e. the HLR that manages the CE's account). The HLR contact string is formed as: HLR host name:port:subject of host cert"
        value="grid-demo1.ct.infn.it:56568:/C=IT/O=GILDA/OU=Host/L=INFN Catania/CN=grid-demo1.ct.infn.it/emailAddress=gilda-ca@ct.infn.it"/>
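Both DGAS parameters use the same "host name:port:subject of host cert" layout, and a typo in any of the three parts is hard to spot in a long string. The sketch below, assuming Python 3, shows one way to build and split such a string; build_contact and parse_contact are hypothetical helper names, not part of the DGAS client.

```python
def build_contact(host, port, subject):
    """Join the three parts in the order described by the parameter: host:port:subject."""
    return f"{host}:{port}:{subject}"


def parse_contact(contact):
    """Split a contact string into (host, port, subject).

    Only the first two colons are treated as separators, so a colon inside
    the certificate subject would be preserved.
    """
    host, port, subject = contact.split(":", 2)
    return host, int(port), subject
```

For example, parse_contact applied to the PA value shown on the previous slide yields the host grid-demo1.ct.infn.it, the port 56567 and the certificate subject as separate items.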

  27. glite-ce.cfg.xml
      <voms.voname>
        <value>gilda</value>
      </voms.voname>
      <voms.vomsnode
        description="The full hostname of the VOMS server responsible for each VO. Even if the same server is responsible for more than one VO, there must be exactly one entry for each VO listed in the 'voms.voname' parameter.">
        <value>voms.ct.infn.it</value>
      </voms.vomsnode>

  28. glite-ce.cfg.xml (cont.)
      <voms.vomsport>
        <value>15001</value>
      </voms.vomsport>
      <voms.vomscertsubj>
        <value>/C=IT/O=GILDA/OU=Host/L=INFN Catania/CN=voms.ct.infn.it/Email=emidio.giorgio@ct.infn.it</value>
      </voms.vomscertsubj>

  29. glite-ce.cfg.xml (cont.)
      <voms.cert.url
        description="URL from where the VOMS server certificate public key can be downloaded for each VO listed in the voms.voname parameter. If no URL is available for one or more VOs, set the corresponding value to empty string [Example: http://www.nikhef.nl/VOMS/kuiken.nikhef.nl.pem] [Type: 'string']">
        <value>/etc/grid-security/vomsdir/gilda-voms.ct.infn.it.pem</value>
      </voms.cert.url>

  30. glite-ce.cfg.xml (cont.)
      <vo.sgm.user
        description="User name of the account allowed to update software management tags on this CE node. This is a parameter array that must match the list in voms.voname">
        <value>gildasgm</value>
      </vo.sgm.user>
      <vo.sgm.group
        description="Group name of the account allowed to update software management tags on this CE node. This is a parameter array that must match the list in vo.sgm.user">
        <value>gildasgm</value>
      </vo.sgm.group>

  31. glite-ce.cfg.xml (cont.)
      <vo.sgm.vo.role
        description="The VO Role mapped to the SGM account in the grid-mapfile. This is a parameter array that must match the list in vo.sgm.user [Example: LCGAdmin][Type: string]">
        <value>SoftwareManager</value>
      </vo.sgm.vo.role>

  32. glite-ce.cfg.xml (cont.)
      <pool.account.basename
        description="The prefix of the set of pool accounts to be created for each VO. Existing pool accounts with this prefix are not recreated">
        <value>gilda</value>
      </pool.account.basename>
      <pool.account.group
        description="The group name of the pool accounts to be used for each VO. It can be left empty to use the base name as group name. For some batch systems like LSF, this group may need a specific gid. The gid can be set using the pool.lsfgid parameter in the LSF configuration section">
        <value>gildausers</value>
      </pool.account.group>

  33. glite-ce.cfg.xml (cont.)
      <pool.account.number
        description="The number of pool accounts to create for each VO. Each account will be created with a username of the form prefixXXX where prefix is the value of the pool.account.basename parameter. If matching pool accounts already exist, they are not recreated. The range of values for this parameter is from 1 to 999">
        <value>50</value>
      </pool.account.number>
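To see which accounts the settings above would lead to, the naming rule can be sketched in a few lines of Python 3. The description only says "prefixXXX" with values from 1 to 999; reading XXX as a zero-padded three-digit counter starting at 001 is an assumption made here, and pool_account_names is a hypothetical helper, not the code used by the gLite configuration script.

```python
def pool_account_names(basename, number):
    """List pool account names of the form prefixXXX.

    Assumption: XXX is a three-digit, zero-padded counter starting at 001.
    """
    if not 1 <= number <= 999:
        raise ValueError("pool.account.number must be between 1 and 999")
    return [f"{basename}{index:03d}" for index in range(1, number + 1)]
```

With the values from the slides (basename gilda, number 50), this gives gilda001 through gilda050 under the stated assumption.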

  34. glite-ce.cfg.xml (cont.)
      <cemon.wms.host
        description="Array of the hostnames of the WMS server(s) that should receive notifications from this CE. This list is used to create the predefined subscriptions to be used in pull mode (asynchronous mode with predefined subscriptions). Entries in this array must match entries in the cemon.wms.port array">
        <value>gil05.ciemat.es</value>
        <value>glite-rb.ct.infn.it</value>
        <value>glite-rb2.ct.infn.it</value>
      </cemon.wms.host>

  35. glite-ce.cfg.xml (cont.)
      <cemon.wms.port
        description="Array of the port number on which the WMS server(s) receiving notifications from this CE is listening. Entries in this array must match entries in the cemon.wms.host array">
        <value>5120</value>
        <value>5120</value>
        <value>5120</value>
      </cemon.wms.port>
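Because cemon.wms.host and cemon.wms.port are parallel arrays, it is worth checking that they have the same number of entries before running the configuration script. The following is a minimal sketch using Python 3's standard xml.etree.ElementTree module; the enclosing parameters element in SAMPLE and the function names are assumptions for illustration, since the slides show only fragments of glite-ce.cfg.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element around the two fragments shown on the slides.
SAMPLE = """<parameters>
  <cemon.wms.host>
    <value>gil05.ciemat.es</value>
    <value>glite-rb.ct.infn.it</value>
    <value>glite-rb2.ct.infn.it</value>
  </cemon.wms.host>
  <cemon.wms.port>
    <value>5120</value>
    <value>5120</value>
    <value>5120</value>
  </cemon.wms.port>
</parameters>"""


def array_values(root, name):
    """Return the text of every value child of the first element called name."""
    element = next(root.iter(name), None)
    if element is None:
        return []
    return [(child.text or "").strip() for child in element.findall("value")]


def wms_endpoints(xml_text):
    """Pair each WMS host with its port, failing if the arrays differ in length."""
    root = ET.fromstring(xml_text)
    hosts = array_values(root, "cemon.wms.host")
    ports = array_values(root, "cemon.wms.port")
    if len(hosts) != len(ports):
        raise ValueError(f"{len(hosts)} hosts but {len(ports)} ports")
    return [(host, int(port)) for host, port in zip(hosts, ports)]
```

The same array_values helper could be reused for the other parameter arrays that must match each other, such as voms.voname and vo.sgm.user.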

  36. glite-ce.cfg.xml (cont.)
      <cemon.lrms
        description="The type of Local Resource Management System. It can be pbs, lsf or condor. The value pbs is also used for torque. If this parameter is absent or empty, the default type is pbs"
        value="pbs"/>
      <cemon.lrms.version
        description="The version of the Local Resource Management System. Example: OpenPBS_2.3"
        value="OpenPBS_2.3"/>
      <cemon.cetype
        description="The type of Computing Element. It can be blah, condorc or gram. If this parameter is absent or empty, the default type is blah. blah and condorc are equivalent, they are both valid values for historical reasons"
        value="condorc"/>

  37. glite-ce.cfg.xml (cont.)
      <cemon.cluster
        description="The cluster entry point host name. Normally this is the CE host itself"
        value="gil09.ciemat.es"/>
      <cemon.cluster-batch-system-bin-path
        description="The path of the lrms commands. Example: /usr/pbs/bin or /usr/local/lsf/bin"
        value="/usr/bin"/>
      <cemon.queues
        description="A list of queues defined on this CE node. Example: long, short, infinite, etc">
        <value>long</value>
        <value>short</value>
        <value>infinite</value>
      </cemon.queues>

  38. glite-ce.cfg.xml (cont.)
      <lb.user
        description="The account name of the user that runs the local logger daemon. If the user doesn't exist it is created. In the current version, the host certificate and key are used as service certificate and key and are copied in this user's home in the directory specified by the global parameter 'user.certificate.path' in the glite-global.cfg.xml file"
        value="root"/>

  39. glite-ce.cfg.xml (cont.)
      <custom.runtime.environment
        description="The entries specified in this array parameter are added to the CE info provider file as additional GlueHostApplicationSoftwareRunTimeEnvironment entries. [Example: MY_APP_1_0_0] [Type: 'string']">
        <value>CSOUND-4.13</value>
        <value>DEMTOOLS-1.0</value>
        <value>POVRAY-3.5</value>
        <value>RASTER-3D</value>
      </custom.runtime.environment>

  40. Installation of VOMS Certificate
      • Replace every remaining "changeme" value in the XML files.
      • Install the GILDA VOMS server host certificate, gilda-voms.ct.infn.it.pem, in the directory /etc/grid-security/vomsdir/
      • Edit the /opt/glite/etc/vomses file as follows:
          "gilda" "voms.ct.infn.it" "15001" "/C=IT/O=GILDA/OU=Host/L=INFN Catania/CN=voms.ct.infn.it/Email=emidio.giorgio@ct.infn.it" "gilda"
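Leftover "changeme" placeholders are easy to miss across nine XML files, so a simple text search can help before launching the configuration scripts. This is a minimal sketch, assuming Python 3 and that the edited files sit in /opt/glite/etc/config with the .cfg.xml suffix used on the slides; find_changeme is a hypothetical helper name.

```python
import glob
import os


def find_changeme(config_dir="/opt/glite/etc/config"):
    """Return (path, line number) pairs for every line still containing 'changeme'."""
    hits = []
    for path in sorted(glob.glob(os.path.join(config_dir, "*.cfg.xml"))):
        with open(path, encoding="utf-8", errors="replace") as handle:
            for lineno, line in enumerate(handle, start=1):
                if "changeme" in line:
                    hits.append((path, lineno))
    return hits
```

An empty result means no placeholder was found. It is a plain substring match, so it would also flag a comment that happens to mention the word.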

  41. CondorC
      In /opt/condor-c/etc/condor_config set:
        GRIDMANAGER_DEBUG = D_FULLDEBUG

  42. Post configuration
      • To commit the configuration, execute:
          python /opt/glite/etc/config/scripts/glite-torque-server-config.py --configure
          python /opt/glite/etc/config/scripts/glite-torque-server-config.py --start
      • Check for queues:
          qmgr -q
      • If no queues are available, use:
          qmgr < pbs_server.conf
        where pbs_server.conf is a configuration file for Torque/PBS.
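The slides do not show the contents of pbs_server.conf. As an illustration only, the Python 3 sketch below generates a minimal set of qmgr directives for the three queue names used in cemon.queues. The helper name qmgr_queue_commands is hypothetical, and the attributes chosen here (an execution queue that is enabled and started, plus server scheduling) are an assumption about a bare minimum; a real site would add its own limits and access rules.

```python
def qmgr_queue_commands(queues):
    """Build qmgr input that creates each queue as an enabled, started execution queue."""
    lines = []
    for name in queues:
        lines.append(f"create queue {name}")
        lines.append(f"set queue {name} queue_type = Execution")
        lines.append(f"set queue {name} enabled = True")
        lines.append(f"set queue {name} started = True")
    # Let the scheduler start dispatching jobs.
    lines.append("set server scheduling = True")
    return "\n".join(lines) + "\n"
```

Writing the returned text for ["long", "short", "infinite"] to pbs_server.conf would give a file that can be fed to qmgr as shown above.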

  43. Post configuration (cont.)
          python /opt/glite/etc/config/scripts/glite-ce-config.py --configure
          python /opt/glite/etc/config/scripts/glite-ce-config.py --start
      Your CE should now be able to receive jobs coming from the WMS.

  44. Key-exchange
      • Edit /etc/ssh/sshd_config and add the following lines at the end:
          HostbasedAuthentication yes
          IgnoreUserKnownHosts yes
          IgnoreRhosts yes
      • Restart the server with:
          /sbin/service sshd restart

  45. Key-exchange (cont.)
      • On the CE, generate an updated version of /etc/ssh/ssh_known_hosts by running:
          /opt/edg/sbin/edg-pbs-knownhosts
      • Copy that file to all the Worker Nodes.

  46. Questions…
