Globus Toolkit I - Basic structure and job submission. Grid Computing, B. Wilkinson, 2004
Globus: Open source grid software toolkit that has been in development since the late 1990s. Provides: • An underlying Grid Security Infrastructure (GSI) • Resource management • Data management • Information services. Higher-level tools are meant to be implemented above these basic services.
Grid Security Infrastructure: Provides security functions including: • Authentication • Authorization • Delegation • Confidential communication
Resource Management: • Job submission • Job status • Basic resource allocation. Globus does not have its own job scheduler to find resources and automatically send jobs to suitable machines. For that, use a separate scheduler; we will use Condor-G.
Information Services: • Support for gathering and querying information. • Based upon the Lightweight Directory Access Protocol (LDAP).
Data Management: • Support for transferring files between machines.
Key Components: • GSI (Grid Security Infrastructure): grid security. • GRAM (Globus Resource Allocation Manager): remote job submission and control. • GridFTP: secure data transfer. • MDS (Monitoring and Discovery Service): interface to system and service information.
Version 2 (pre-2004): Contained the named components: • GSI • GRAM • GridFTP • MDS. Relatively stable.
From: “Introduction to Grid Computing with Globus,” IBM Redbooks, SG24-6895-01, 2003, Fig. 4-6.
Globus version 2. From: “Introduction to Grid Computing with Globus,” IBM Redbooks, SG24-6895-01, 2003, Fig. 7-2.
Globus version 2. From: “Introduction to Grid Computing with Globus,” IBM Redbooks, SG24-6895-01, 2003, Fig. 7-3.
Resource Specification Language (RSL): Provides a specification for: • Resource requirements: machine type, number of nodes, memory, etc. • Job description: directory, executable, arguments, environment.
• Version 1 (pre-GT3): a metalanguage describing a job and its required execution. The specification is given in an RSL file. • Version 2 (GT3): the specification is described in an XML document with a schema.
RSL Version 1 Constraints Example. Conjunction (AND): &
• To create 3-5 instances of myProg, each on a machine with at least 64 Mbytes of memory available to me for 1 hour:
    & (executable=myProg) (count>=3)(count<=5)(memory>=64) (max_time=60)
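The meaning of the conjunction can be illustrated with a small Python sketch. This is not Globus code: the `satisfies` helper and the `offer` dictionary are hypothetical names, and the sketch only assumes that `&` requires every relation to hold at once. The `executable` relation is left out because it describes the job rather than a constraint on the machine.

```python
import operator

# Comparison operators that appear in the RSL example above.
OPS = {"=": operator.eq, ">=": operator.ge, "<=": operator.le}


def satisfies(relations, offer):
    """Return True only if every (attribute, op, value) relation holds for `offer`."""
    return all(
        attr in offer and OPS[op](offer[attr], value)
        for attr, op, value in relations
    )


# The conjunction from the slide: 3-5 instances, at least 64 Mbytes, 60 minutes.
request = [
    ("count", ">=", 3),
    ("count", "<=", 5),
    ("memory", ">=", 64),
    ("max_time", "=", 60),
]

# Hypothetical resource offers used only for illustration.
print(satisfies(request, {"count": 4, "memory": 128, "max_time": 60}))  # True
print(satisfies(request, {"count": 4, "memory": 32, "max_time": 60}))   # False
```

The second offer fails because a single unmet relation (memory below 64) is enough to reject a conjunction.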
RSL Version 1 Constraints Example. Disjunction (OR): |
• To create 5 instances of myProg, each on a machine with at least 64 Mbytes of memory, or 7 instances of myProg, each on a machine with at least 32 Mbytes of memory:
    &(executable=myProg) (|(&(count=5)(memory>=64)) (&(count=7)(memory>=32)))
RSL Version 1 Requesting multiple resources. Multirequest: +
• To execute 5 instances of myProg1 on a machine with at least 64 Mbytes of memory and execute 2 instances of myProg2:
    +(&(count=5)(memory>=64)(executable=myProg1)) (&(count=2)(executable=myProg2))
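The three RSL version 1 operators (&, |, +) can also be composed programmatically. The following is a minimal sketch with hypothetical helper names (`rel`, `group`, `conj`, `disj`, `multi`); it only concatenates strings with balanced parentheses and does not validate attribute names or talk to Globus.

```python
def rel(attr, op, value):
    """One relation, e.g. (memory>=64)."""
    return f"({attr}{op}{value})"


def group(expr):
    """Wrap a sub-expression in parentheses so it can be nested."""
    return f"({expr})"


def conj(*parts):
    """Conjunction (AND): all parts must hold."""
    return "&" + "".join(parts)


def disj(*parts):
    """Disjunction (OR): any one part may hold."""
    return "|" + "".join(parts)


def multi(*requests):
    """Multirequest: several independent requests submitted together."""
    return "+" + "".join(group(r) for r in requests)


# 5 instances with >= 64 Mbytes, or 7 instances with >= 32 Mbytes.
either = disj(
    group(conj(rel("count", "=", 5), rel("memory", ">=", 64))),
    group(conj(rel("count", "=", 7), rel("memory", ">=", 32))),
)
single = conj(rel("executable", "=", "myProg"), group(either))

# Two different programs requested in one multirequest.
both = multi(
    conj(rel("count", "=", 5), rel("memory", ">=", 64), rel("executable", "=", "myProg1")),
    conj(rel("count", "=", 2), rel("executable", "=", "myProg2")),
)

print(single)
print(both)
```

Building the string from nested calls makes unbalanced parentheses, an easy mistake when writing RSL by hand, impossible by construction.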
Can specify different resource managers on different machines using the resourceManagerContact attribute. Example:
    + ( & (resourceManagerContact="flash.isi.edu:754:/C=US/…/CN=flash.isi.edu-fork") (count=1) (executable=my_appl1) )
      ( & (resourceManagerContact="sp139.sdsc.edu:8711:/C=US/…/CN=sp097.sdsc.edu-lst") (count=2) (executable=my_appl2) )
RSL creation with Globus version 2: • The GT2 command globus-job-run can generate RSL from command-line arguments with the -dumprsl flag. • -help lists the options.
We used Globus version 2.4 in a Supercomputing 2003 demo organized by the University of Melbourne. • 21 countries involved, numerous sites.
Version 3: A re-implementation based upon the Open Grid Services Architecture (OGSA). • OGSI-compliant Java interfaces. • Contains GSI, GRAM, GridFTP, and MDS. • Includes additional Web service components, some built on top of OGSI. • Version 3.2 (March 2004) is a major revision of version 3.0. Even the commands changed.
From http://www.globus.org
GT3 structure, early representation. Figure labels: non-GT3 services; replica management for large data sets, ...; job management, etc.; SSL, certificates; OGSI (Grid services).
GT3: More recent representation
Grid Service Container: • “Contains” all the files of the GT3 system, shielding it from the environment. • The “white” part is the GT3 core. Operates with a web service engine; a stand-alone web service container is provided for test purposes. Tomcat can be used.
Globus 3.2: • Has a suite of command-line tools. • To start the container: globus-start-container (from the $GLOBUS_LOCATION directory). You should see a list of available services, including any you have created.
GUI: • GT3 also has a Java-based GUI “service browser.” To start it: globus-service-browser (from the $GLOBUS_LOCATION directory).
From “Globus Toolkit 3.0 Quick Start,” IBM, Sept 2003, http://www.redbooks.ibm.com/redpapers/pdfs/redp3697.pdf For educational purposes only. Double-click a service to invoke it.
From “Globus Toolkit 3.0 Quick Start,” IBM, Sept 2003, http://www.redbooks.ibm.com/redpapers/pdfs/redp3697.pdf For educational purposes only.
Resource Management issues in a Grid: • Sites owned by others • Different systems and software • Different policies, ...
GT 3.2 GRAM (“Globus Resource Allocation Manager”): A set of OGSI-compliant services provided to start remote jobs, notably: • Master Managed Job Factory Service (MMJFS). Also a set of non-OGSI-compliant services (Gatekeeper, Jobmanager) from pre-GT3.
Resource Specification Language Version 2 (RSL-2): • Now an XML schema. • Requirements are specified against the RSL-2 schema in an XML file. • Can specify everything from executable, paths, arguments, input/output, error file, number of processes, max/min execution time, max/min memory, job type, etc.
RSL-2: • Much more elegant and flexible, and in keeping with systems using XML. • Can use XML parsers. • Allows more powerful mechanisms with job schedulers. • Resource scheduler/broker applies the specification to local resources.
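Because RSL-2 is plain XML, a standard parser can read a job description. The sketch below uses Python's `xml.etree.ElementTree` on a cut-down fragment; the namespaces and element names are the ones used in the RSL-2 slides of this deck, while the fragment itself is an illustrative subset rather than a complete job description.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as they appear in the RSL-2 examples in these slides.
NS = {
    "rsl": "http://www.globus.org/namespaces/2003/04/rsl",
    "gram": "http://www.globus.org/namespaces/2003/04/rsl/gram",
}

# Illustrative subset of a job description (not a complete RSL-2 document).
DOC = """<rsl:rsl xmlns:rsl="http://www.globus.org/namespaces/2003/04/rsl"
         xmlns:gram="http://www.globus.org/namespaces/2003/04/rsl/gram">
  <gram:job>
    <gram:executable>
      <rsl:path><rsl:stringElement value="/bin/echo"/></rsl:path>
    </gram:executable>
    <gram:count><rsl:integer value="1"/></gram:count>
  </gram:job>
</rsl:rsl>"""

root = ET.fromstring(DOC)
job = root.find("gram:job", NS)

# Pull the executable path and the instance count out of the job element.
executable = job.find("gram:executable/rsl:path/rsl:stringElement", NS).get("value")
count = int(job.find("gram:count/rsl:integer", NS).get("value"))

print(executable, count)  # /bin/echo 1
```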
RSL-2 Example: Specifying Executable (executable=/bin/echo)
    <gram:executable>
      <rsl:path> <rsl:stringElement value="/bin/echo"/> </rsl:path>
    </gram:executable>
RSL-2 Example: Specifying Directory (directory="/bin")
    <gram:directory>
      <rsl:path> <rsl:stringElement value="/bin"/> </rsl:path>
    </gram:directory>
RSL-2 Example: Specifying Number (count=1)
    <gram:count> <rsl:integer value="1"/> </gram:count>
RSL-2 Example: Specifying Arguments (arguments="Hello World")
    <gram:arguments>
      <rsl:string> <rsl:stringElement value="Hello World"/> </rsl:string>
    </gram:arguments>
RSL-2 Example: Specifying input, output, and error (stdin=/dev/null, stdout="stdout", stderr="stderr")
    <gram:stdin>
      <rsl:path> <rsl:stringElement value="/dev/null"/> </rsl:path>
    </gram:stdin>
    <gram:stdout>
      <rsl:pathArray>
        <rsl:path>
          <rsl:substitutionRef name="HOME"/>
          <rsl:stringElement value="/stdout"/>
        </rsl:path>
      </rsl:pathArray>
    </gram:stdout>
    <gram:stderr>
      <rsl:pathArray>
        <rsl:path>
          <rsl:substitutionRef name="HOME"/>
          <rsl:stringElement value="/stderr"/>
        </rsl:path>
      </rsl:pathArray>
    </gram:stderr>
RSL and RSL-2 Comparison for program echo (echo used in assignment 3)
RSL:
    &((executable=/bin/echo) (directory="/bin") (arguments="Hello World") (stdin=/dev/null) (stdout="stdout") (stderr="stderr") (count=1) )
RSL-2:
    <?xml version="1.0" encoding="UTF-8"?>
    <rsl:rsl xmlns:rsl="http://www.globus.org/namespaces/2003/04/rsl"
             xmlns:gram="http://www.globus.org/namespaces/2003/04/rsl/gram"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="
               http://www.globus.org/namespaces/2003/04/rsl
               c:/ogsa-3.0/schema/base/gram/rsl.xsd
               http://www.globus.org/namespaces/2003/04/rsl/gram
               c:/ogsa-3.0/schema/base/gram/gram_rsl.xsd">
      <gram:job>
        <gram:executable>
          <rsl:path> <rsl:stringElement value="/bin/echo"/> </rsl:path>
        </gram:executable>
        <gram:directory>
          <rsl:path> <rsl:stringElement value="/bin"/> </rsl:path>
        </gram:directory>
        <gram:arguments>
          <rsl:string> <rsl:stringElement value="Hello World"/> </rsl:string>
        </gram:arguments>
        <gram:stdin>
          <rsl:path> <rsl:stringElement value="/dev/null"/> </rsl:path>
        </gram:stdin>
        <gram:stdout>
          <rsl:pathArray>
            <rsl:path>
              <rsl:substitutionRef name="HOME"/>
              <rsl:stringElement value="/stdout"/>
            </rsl:path>
          </rsl:pathArray>
        </gram:stdout>
        <gram:stderr>
          <rsl:pathArray>
            <rsl:path>
              <rsl:substitutionRef name="HOME"/>
              <rsl:stringElement value="/stderr"/>
            </rsl:path>
          </rsl:pathArray>
        </gram:stderr>
        <gram:count> <rsl:integer value="1"/> </gram:count>
        <gram:jobType>
          <gram:enumeration>
            <gram:enumerationValue> <gram:multiple/> </gram:enumerationValue>
          </gram:enumeration>
        </gram:jobType>
        <gram:gramMyJobType>
          <gram:enumeration>
            <gram:enumerationValue> <gram:collective/> </gram:enumerationValue>
          </gram:enumeration>
        </gram:gramMyJobType>
        <gram:dryRun> <rsl:boolean value="false"/> </gram:dryRun>
        <gram:saveState> <rsl:boolean value="true"/> </gram:saveState>
        <gram:twoPhase> <rsl:integer value="600"/> </gram:twoPhase>
      </gram:job>
    </rsl:rsl>
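The correspondence between the two notations can be sketched as a small converter. The Python below maps three RSL attributes (executable, directory, count) to the RSL-2 elements shown in this comparison; `to_rsl2`, `path_element`, and `q` are hypothetical helper names, and the output is an illustrative fragment that omits the schema location and the remaining job elements.

```python
import xml.etree.ElementTree as ET

# Namespace URIs taken from the RSL-2 document above.
RSL = "http://www.globus.org/namespaces/2003/04/rsl"
GRAM = "http://www.globus.org/namespaces/2003/04/rsl/gram"
ET.register_namespace("rsl", RSL)
ET.register_namespace("gram", GRAM)


def q(ns, tag):
    """Qualified tag name in ElementTree's {namespace}tag form."""
    return f"{{{ns}}}{tag}"


def path_element(parent, name, value):
    """Add <gram:NAME><rsl:path><rsl:stringElement value=.../></rsl:path></gram:NAME>."""
    outer = ET.SubElement(parent, q(GRAM, name))
    path = ET.SubElement(outer, q(RSL, "path"))
    ET.SubElement(path, q(RSL, "stringElement"), {"value": value})


def to_rsl2(spec):
    """Convert a dict with executable, directory, and count into an RSL-2 tree."""
    root = ET.Element(q(RSL, "rsl"))
    job = ET.SubElement(root, q(GRAM, "job"))
    path_element(job, "executable", spec["executable"])
    path_element(job, "directory", spec["directory"])
    count = ET.SubElement(job, q(GRAM, "count"))
    ET.SubElement(count, q(RSL, "integer"), {"value": str(spec["count"])})
    return root


root = to_rsl2({"executable": "/bin/echo", "directory": "/bin", "count": 1})
print(ET.tostring(root, encoding="unicode"))
```

A real job description would also need the stdin, stdout, stderr, and job-type elements before it could be submitted.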
Running a Job (Assignment 3): GT3 command:
    managed-job-globusrun -factory factoryservicename -file xmlfile
where: • factoryservicename specifies the job factory service to process the request • xmlfile specifies the XML file containing the RSL-2 specification of the job
Starting a job: • The Master Managed Job Factory Service (MMJFS) is needed to submit a job.
Invoking MMJFS: • Invoke MMJFS with managed-job-globusrun and two arguments: the named master job factory service to process the job, and an XML file to specify the job. • The command is equivalent to the GT2 globusrun command.
Example
    [user $GLOBUS_LOCATION] $ grid-proxy-init
    [user $GLOBUS_LOCATION] $ managed-job-globusrun -factory http://terra:8080/ogsa/services/base/gram/MasterForkManagedJobFactoryService -file $GLOBUS_LOCATION/etc/test.xml
This command causes the service to process test.xml, an XML document that contains the description of the job.
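When submitting many jobs, the command line from this example can be assembled in a script. The sketch below only builds and prints the argument list; `globusrun_command` is a hypothetical helper, the `-factory` and `-file` flags are the ones shown in the slide, and `/opt/globus` is an assumed fallback path used purely for illustration.

```python
import os
import shlex


def globusrun_command(factory_url, job_file):
    """Build the argument list for the managed-job-globusrun call shown above."""
    return ["managed-job-globusrun", "-factory", factory_url, "-file", job_file]


# "/opt/globus" is a hypothetical fallback, used only if GLOBUS_LOCATION is unset.
globus_location = os.environ.get("GLOBUS_LOCATION", "/opt/globus")
factory = "http://terra:8080/ogsa/services/base/gram/MasterForkManagedJobFactoryService"
command = globusrun_command(factory, os.path.join(globus_location, "etc", "test.xml"))

# Print the command for inspection; running it needs a GT3 installation and a valid proxy.
print(shlex.join(command))
```

Keeping the arguments as a list rather than a single string means the same value can later be passed to `subprocess.run` without shell quoting problems.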
More details and practice: • See Assignment 3
More Information On-line:
• “Introduction to Grid Computing with Globus,” IBM Redbooks, SG24-6895-01, 2003. http://redbooks.ibm.com/redbooks/pdf/sg246895.pdf
• “Globus Toolkit 3.0 Quick Start,” IBM, Sept 2003. http://redbooks.ibm.com/redpapers/pdfs/redp3697.pdf
• “Introduction to the Globus Toolkit™,” Resource Management Services, Nov. 6, 2003. http://www.globus.org
• “The Master Managed Job Factory Service (MMJFS)” by V. Silva, May 25, 2004. http://www-106.ibm.com/developerworks/grid/library/gr-factory/?ca=dgr-lnxw961GridMMJFS
• http://grid.hpctools.uh.edu/6397