1 / 26

Special Jobs

Special Jobs. Claudio Cherubino INFN Catania Fourth EELA Tutorial for Managers and Users Mexico City, 28 August-1 September 2006. Outline. MPI jobs on gLite DAG Job Collection Parametric jobs. MPI Overview.

Download Presentation

Special Jobs

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Special Jobs Claudio Cherubino INFN Catania Fourth EELA Tutorial for Managers and Users Mexico City, 28 August-1 September 2006

  2. Outline • MPI jobs on gLite • DAG • Job Collection • Parametric jobs Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  3. MPI Overview • Execution of parallel jobs is an essential issue for modern informatics and applications. • Most used library for parallel jobs support is MPI (Message Passing Interface) • At the state of the art, parallel jobs can run inside single Computing Elements (CE) only; • several projects are involved into studies concerning the possibility of executing parallel jobs on Worker Nodes (WNs) belonging to different CEs. Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  4. Requirements & settings • In order to guarantee that MPI job can run, the following requirements MUST BE satisfied: • the MPICHsoftware must be installed and placed in the PATH environment variable on each WNs of the CE. • Some MPI’s applications require a shared filesystem among the WNs to run. • The variable VO_<name_of_VO>_SW_DIR will contain the name of a directory in case of SHARED filesystem. • The variable VO_<name_of_VO>_SW_DIR will contain “.” if there is NO SHARED filesystem. Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  5. This attribute defines the required number of CPUs needed for the application • From the user’s point of view, jobs to be run as MPI are specified setting the JDL JobType attribute to MPICH and specifying the NodeNumber attribute as well. • E.g.: … • JobType = “MPICH”; NodeNumber = 4; … Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  6. (other.GlueCEInfoTotalCPUs >= NodeNumber) && Member (“MPICH”,other.GlueHostApplicationSoftwareRunTimeEnvironment) • When the previous two attributes are included in a JDL, the User Interface (UI) automatically adds the following expression: to the JDL Requirements expression in order to find out the best resource where the job can be executed. Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  7. MPI exercise Create the file “mpi-test.jdl” inside $HOME/gLite/Other and put this contents inside the file: [ Type = "Job"; JobType = "MPICH"; Executable = “cpi"; NodeNumber = 2; StdOutput = “cpi.out"; StdError = “cpi.err"; InputSandbox = {"cpi"}; OutputSandbox = {“cpi.err",“cpi.out"}; RetryCount = 0; ] Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  8. MPI submission [glite-tutor] /home/claudio > glite-job-submit -o id mpi-test.jdl Selected Virtual Organisation name (from proxy certificate extension): gilda Connecting to host glite-rb.ct.infn.it, port 7772 Logging to host glite-rb.ct.infn.it, port 9002 ========== glite-job-submit Success ====================== The job has been successfully submitted to the Network Server. Use glite-job-status command to check job current status. Your job identifier is: - https://glite-rb.ct.infn.it:9000/bsrbbzbcXZWSzU3iUYlm6g The job identifier has been saved in the following file: /home/claudio/id ========================================================== Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  9. MPI status and output Query the status of the job using the following command: [glite-tutor] /home/claudio > glite-job-status -i id ……………………………………………. When the status of the job is “DONE”, you can retrieve output with the following command: [glite-tutor] /home/claudio > glite-job-output -i id …………………………………………… Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  10. MPI on the web… • LCG-2 User Guide Manuals Series • https://edms.cern.ch/file/454439/LCG-2-UserGuide.pdf • http://oscinfo.osc.edu/training/ • http://www.netlib.org/mpi/index.html • http://www-unix.mcs.anl.gov/mpi/learning.html • http://www.ncsa.uiuc.edu/ Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  11. Workload Manager Proxy Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  12. WMProxy overview • WMProxy (Workload Manager Proxy) • is a new service providing access to the gLite Workload Management System (WMS) functionality through a simple Web Services based interface. • has been designed to efficiently handle a large number of requests for job submission and control to the WMS • the service interface addresses the Web Services and SOA architecture standards, in particular adhering to WS-I • developed in C++ using gsoap 2.7.6b as soap stubs generator Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  13. New request types • Support for new types strongly relies on newly developed JDL converters and on the DAG submission support • all JDL conversions are performed on the server • a single submission for several jobs • All new request types can be monitored and controlled through a single handle (the request id) • each sub-jobs can be however followed-up and controlled independently through its own id • “Smarter” WMS client commands/API • allow submission of DAGs, collections and parametric jobs exploiting the concept of “shared sandbox” • allow automatic generation and submission of collections and DAGs from sets of JDL files located in user specified directories on the UI Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  14. WMProxy : submission & monitoring • In order to submit jobs with WMProxy, it’s mandatory to delegate credentials: • The submission/monitoring commands are slightly different, but most of the “old” options are supported glite-wms-job-delegate-proxy -d del_ID glite-wms-job-submit -d del_ID collection.jdl glite-wms-job-status jobID glite-wms-job-output \ https://glite-rb.ct.infn.it:9000/LHIIGaCVdl7Olm sz0jpI_g Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  15. DAG job nodeA nodeB nodeC NodeF nodeD • A DAG job is a set of jobs where input, output, or execution of one or more jobs can depend on other jobs • Dependencies are represented through Directed Acyclic Graphs, where the nodes are jobs, and the edges identify the dependencies Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  16. JDL structure Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  17. Attribute: Nodes Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  18. Attribute: Dependencies Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  19. DAG jdl [ type = "dag"; max_nodes_running = 4; nodes = [ nodeA = [ file ="nodes/nodeA.jdl" ; ]; nodeB = [ file ="nodes/nodeB.jdl" ; ]; nodeC = [ file ="nodes/nodeC.jdl" ; ]; nodeD = [ file ="nodes/nodeD.jdl"; ]; dependencies = { {nodeA, nodeB}, {nodeA, nodeC}, { {nodeB,nodeC}, nodeD } } ]; ] Node description could also be done here, instead of using separate files Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  20. Job Collection • A job collection is a set of independent jobs that user wants to submit and monitor via a single request • Jobs of a collection are submitted as DAG nodes without dependencies • JDL is a list of classad, which describes the subjobs [ Type = "collection"; VirtualOrganisation = “gilda"; nodes = { [ <job descr 1 >], [ <job descr 2 >], … }; ] Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  21. ‘Scattered’ Input Sandboxes • Input Sandbox can contain • file paths on the UI machine (i.e. the usual way) • URI pointing to files on a remote gridFTP/HTTPS server InputSandbox = { "gsiftp://neo.datamat.it:2811/var/prg/sim.exe", "https://ghemon.cnaf.infn.it:8443/data/idat_1", "file:///home/pacio/myconf“ }; • A base URI to be applied to all sandbox files can also be specified InputSandboxBaseURI = "gsiftp://matrix.datamat.it:2811/var"; • Only local files (file://) are uploaded to the WMS node • File pointed by URIs are directly downloaded on the WN by the JobWrapper just before the job is started Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  22. ‘Scattered’ Output Sandboxes • JDL has been enriched with new attributes for specifying the destinations for the files listed in the OutputSandbox attribute list OutputSandbox = { "jobOutput", "run1/event1", "jobError" }; OutputSandboxDestURI = { "gsiftp://matrix.datamat.it/var/jobOutput", "https://grid003.ct.infn.it:8443/home/cms/event1", "gsiftp://matrix.datamat.it/var/jobError" }; • A base URI to be applied to all sandbox files can also be specified OutputSandboxBaseDestURI = "gsiftp://neo.datamat.it/home/run1/"; • Files are copied when the job has completed execution by the JobWrapper to the specified destination without transiting on the WMS node Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  23. Job collection example [ type = "collection"; InputSandbox = {"date.sh"}; RetryCount = 0; nodes = { [ file ="jobs/job1.jdl" ; ], [ [ Executable = "/bin/sh"; Arguments = "date.sh"; Stdoutput = "date.out"; StdError = "date.err"; OutputSandbox ={"date.out", "date.err"}; ] ], [ file ="jobs/job3.jdl" ; ] }; ] All nodes will share this Input Sandbox Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  24. Parametric Job • A parametric job is a job where one or more of its attributes are parameterized • Values of attributes vary according to a parameter • Job monitoring / managing is always done through an unique jobID, as if the job was single (see submission of collection [ JobType = "Parametric"; Executable = "/bin/sh"; Arguments = "md5.sh input_PARAM_.txt"; InputSandbox = {"md5.sh", "input_PARAM_.txt"}; StdOutput = "out_PARAM_.txt"; StdError = "err_PARAM_.txt"; Parameters = 4; ParameterStart = 1; ParameterStep = 1; OutputSandbox = {"out_PARAM_.txt", "err_PARAM_.txt"}; ] Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  25. Parametric job / 2 • Parameter can be also a list of string • InputSandbox (if present) has to be coherent with parameters [ui-test] /home/giorgio/param>cat param2.jdl [ JobType = "Parametric"; Executable = “/bin/cat"; Arguments = “input_PARAM_.txt”; InputSandbox = "input_PARAM_.txt"; StdOutput = "myoutput_PARAM_.txt"; StdError = "myerror_PARAM_.txt"; Parameters = {earth,moon,mars}; OutputSandbox = {“myoutput_PARAM_.txt”}; ] [ui-test] /home/giorgio/param > ls inputEARTH.txt inputMARS.txt inputMOON.txt param2.jdl Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

  26. References • JDL attributes specification for WM proxy • https://edms.cern.ch/document/590869/1 • WMProxy quickstart • http://egee-jra1-wm.mi.infn.it/egee-jra1-wm/wmproxy_client_quickstart.shtml • WMS user guides • https://edms.cern.ch/document/572489/1 Fourth EELA Tutorial, Mexico City, 28 August-1 September 2006

More Related