
Globus: A Core Grid Middleware




  1. Globus: A Core Grid Middleware Source: The Globus Project Argonne National Laboratory, University of Southern California / ISI www.globus.org Localised by: Rajkumar Buyya

  2. The Globus Project • Basic research in grid-related technologies • Resource & data management, security, QoS, policy, communication, adaptation, etc. • Development of Globus Toolkit • Core services for grid-enabled tools & apps • Construction of production grids & testbeds • Multiple deployments to distributed organizations for production & prototyping • Application experiments • Distributed applications, tele-immersion, etc.

  3. Globus Approach • A toolkit and collection of services addressing key technical problems • Modular “bag of services” model • Not a vertically integrated solution • General infrastructure tools (aka middleware) that can be applied to many application domains • Inter-domain issues, rather than clustering • Integration of intra-domain solutions • Distinguish between local and global services

  4. Globus Hourglass • Focus on architecture issues • Propose set of core services as basic infrastructure • Use to construct high-level, domain-specific solutions • Design principles • Keep participation cost low • Enable local control • Support for adaptation • “IP hourglass” model [Hourglass diagram: Applications → diverse global services → core Globus services → local OS]

  5. Technical Focus & Approach • Enable incremental development of grid-enabled tools and applications • Model neutral: Support many programming models, languages, tools, and applications • Evolve in response to user requirements • Deploy toolkit on international-scale production grids and testbeds • Large-scale application development & testing • Information-rich environment • Basis for configuration and adaptation

  6. Globus Toolkit Services • Security (GSI) • PKI-based Security (Authentication) Service • Job submission and management (GRAM) • Uniform Job Submission • Information services (MDS) • LDAP-based Information Service • Remote file management (GASS) • Remote Storage Access Service • Remote Data Catalogue and Management Tools • Supported by Globus 2.0, released in 2002

  7. Grid Services Architecture [Layer diagram] • Applications: high-energy physics data analysis, collaborative engineering, on-line instrumentation, regional climate studies, parameter studies • Application Toolkit Layer: data-intensive, remote viz, remote control, high throughput, collab. design, . . . • Grid Services Layer: information, resource mgmt, security, data access, fault detection, . . . • Grid Fabric Layer: transport, multicast, instrumentation, control interfaces, QoS mechanisms, . . .

  8. Layered Architecture [Stack diagram] • Applications • Application Toolkits: GlobusView, Testbed Status, DUROC, MPI, Condor-G, HPC++, Nimrod/G, globusrun • Grid Services: Nexus, GRAM, GSI-FTP, I/O, HBM, GASS, GSI, MDS • Grid Fabric: Condor, MPI, TCP, UDP, DiffServ, Solaris, LSF, PBS, NQE, Linux, NT

  9. Sample of High-Level Services • Resource brokers and co-allocators • DUROC, Nimrod/G, Condor-G, GridbusBroker • Communication & I/O libraries • MPICH-G, PAWS, RIO (MPI-IO), PPFS, MOL • Parallel languages • HPC++, CC++, Nimrod Parameter Specification • Collaborative environments • CAVERNsoft, ManyWorlds • Others • MetaNEOS, NetSolve, LSA, AutoPilot, WebFlow

  10. The Nimrod-G Grid Resource Broker • A resource broker for managing, steering, and executing task farming (parameter sweep/SPMD model) applications on the Grid based on deadline and computational economy. • Based on users’ QoS requirements, our Broker dynamically leases services at runtime depending on their quality, cost, and availability. • Key Features • A single window to manage & control experiment • Persistent and Programmable Task Farming Engine • Resource Discovery • Resource Trading • Scheduling & Predictions • Generic Dispatcher & Grid Agents • Transportation of data & results • Steering & data management • Accounting • Uses Globus – MDS, GRAM, GSI, GASS

  11. Condor-G: Condor for the Grid • Condor is a high-throughput scheduler • Condor-G uses Globus Toolkit libraries for: • Security (GSI) • Managing remote jobs on Grid (GRAM) • File staging & remote I/O (GSI-FTP) • Grid job management interface & scheduling • Robust replacement for Globus Toolkit programs • Globus Toolkit focus is on libraries and services, not end user vertical solutions • Supports single or high-throughput apps on Grid • Personal job manager which can exploit Grid resources

  12. Production Grids & Testbeds • Production deployments underway at: • NSF PACIs National Technology Grid • NASA Information Power Grid • DOE ASCI • European Grid • Research testbeds • EMERGE: Advance reservation & QoS • GUSTO: Globus Ubiquitous Supercomputing Testbed Organization • Particle Physics Data Grid • World-Wide Grid (WWG)

  13. Production Grids & Testbeds [Maps: NASA’s Information Power Grid, the Alliance National Technology Grid, the GUSTO testbed]

  14. World Wide Grid (WWG) [Map of WWG testbed sites, demonstrated via Gridbus+Nimrod-G and GMonitor over the Internet at SC 2002, Baltimore] • Australia: Melbourne + Monash U (VPAC, Physics), MEG visualisation, Solaris WS • North America: ANL (SGI/Sun/SP2), NCSA (cluster), Wisc (PC/cluster), NRC Canada, many others • Europe: ZIB (T3E/Onyx), AEI (Onyx), CNR (cluster), CUNI/CZ (Onyx), Poznan (SGI/SP2), Vrije U (cluster), Cardiff (Sun E6500), Portsmouth (Linux PC), Manchester (O3K), Cambridge (SGI), Grid MarketDirectory, many others • Asia: AIST Japan (Solaris cluster), Osaka University (cluster), Doshisha (Linux cluster), Korea (Linux cluster)

  15. Example Application Projects (via Nimrod-G or Gridbus) • Molecular Docking for Drug Discovery • Docking molecules from chemical databases with target protein • Neuroscience • Brain Activity Analysis • High Energy Physics • Belle Detector Data Analysis • Natural Language Engineering • Analyzing audio data (e.g., to identify the emotional state of a person!)

  16. Example Application Projects • Computed microtomography (ANL, ISI) • Real-time, collaborative analysis of data from X-Ray source (and electron microscope) • Hydrology (ISI, UMD, UT; also NCSA, Wisc.) • Interactive modeling and data analysis • Collaborative engineering (“tele-immersion”) • CAVERNsoft @ EVL • OVERFLOW (NASA) • Large CFD simulations for aerospace vehicles

  17. Example Application Experiments • Distributed interactive simulation (CIT, ISI) • Record-setting SF-Express simulation • Cactus • Astrophysics simulation, viz, and steering • Including trans-Atlantic experiments • Particle Physics Data Grid • High Energy Physics distributed data analysis • Earth Systems Grid • Climate modeling data management

  18. The Globus Advantage • Flexible Resource Specification Language which provides the necessary power to express the required constraints • Services for resource co-allocation, executable staging, remote data access and I/O streaming • Integration of these services into high-level tools • MPICH-G: grid-enabled MPI • globus-job-*: flexible remote execution commands • Nimrod-G Grid Resource broker • Gridbus: Grid Business Infrastructure • Condor-G: high-throughput broker • PBS, GRD: meta-schedulers

  19. Resource Management • Resource Specification Language (RSL) is used to communicate requirements • The Globus Resource Allocation Manager (GRAM) API allows programs to be started on remote resources, despite local heterogeneity • A layered architecture allows application-specific resource brokers and co-allocators to be defined in terms of GRAM services

  20. Resource Management Architecture [Diagram: an application passes RSL to a broker, which specializes the RSL using queries & info from the Information Service; the resulting ground RSL goes to a co-allocator, which splits it into simple ground RSL for individual GRAMs, each fronting a local resource manager such as LSF, EASY-LL, or NQE]

  21. GRAM Components [Diagram] • The client uses MDS client API calls to locate resources (MDS Grid Index Info Server) and to get resource info (MDS Grid Resource Info Server, which queries the current status of the resource) • GRAM client API calls request resource allocation and process creation; crossing the site boundary, the request passes through the Globus Security Infrastructure to the gatekeeper • The gatekeeper creates a job manager, which parses the RSL (RSL library), has the local resource manager allocate & create processes, and then monitors & controls them • State changes are reported back via GRAM client API callbacks

  22. A simple run • [raj@belle raj]$ globus-job-run belle.anu.edu.au /bin/date • Mon May 3 15:05:42 EST 2004

  23. Resource Specification Language (RSL) • Common notation for exchange of information between components • Syntax similar to MDS/LDAP filters • RSL provides two types of information: • Resource requirements: Machine type, number of nodes, memory, etc. • Job configuration: Directory, executable, args, environment • API provided for manipulating RSL

  24. RSL Syntax • Elementary form: parenthesis clauses • (attribute op value [ value … ] ) • Operators supported: • <, <=, =, >=, >, != • Some supported attributes: • executable, arguments, environment, stdin, stdout, stderr, resourceManagerContact, resourceManagerName • Unknown attributes are passed through • May be handled by subsequent tools
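As a rough illustration of the clause syntax above (these helper functions are hypothetical, not part of the Globus Toolkit), an RSL string can be assembled from attribute/operator/value triples:

```python
def rsl_clause(attribute, op, *values):
    """Render one RSL parenthesis clause: (attribute op value [value ...])."""
    return "({}{}{})".format(attribute, op, " ".join(str(v) for v in values))

def rsl_conjunction(*clauses):
    """Join clauses under the RSL '&' constraint operator."""
    return "&" + "".join(clauses)

request = rsl_conjunction(
    rsl_clause("count", ">=", 5),
    rsl_clause("executable", "=", "myprog"),
)
print(request)  # &(count>=5)(executable=myprog)
```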

  25. Constraints: “&” • Clauses are combined with “&”, e.g.: globusrun -o -r belle.anu.edu.au "&(executable=/bin/date)" • For example: &(count>=5)(count<=10)(max_time=240)(memory>=64)(executable=myprog) means “Create 5-10 instances of myprog, each on a machine with at least 64 MB memory that is available to me for 4 hours”

  26. Disjunction: “|” • For example: &(executable=myprog)(|(&(count=5)(memory>=64))(&(count=10)(memory>=32))) • Create 5 instances of myprog on a machine that has at least 64 MB of memory, or 10 instances on a machine with at least 32 MB of memory
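Purely for illustration (this is not the Globus matchmaker), the elementary clauses inside such an expression can be checked against a machine description, showing how the two alternatives of the disjunction above would be evaluated:

```python
import operator
import re

# Map RSL comparison operators to Python comparisons.
OPS = {"<=": operator.le, ">=": operator.ge, "!=": operator.ne,
       "<": operator.lt, ">": operator.gt, "=": operator.eq}

def clause_holds(clause, machine):
    """Check one elementary clause, e.g. 'memory>=64', against a
    machine description dict (illustrative sketch only)."""
    attr, op, value = re.match(r"(\w+)(<=|>=|!=|<|>|=)(\w+)", clause).groups()
    return OPS[op](machine[attr], int(value))

machine = {"count": 10, "memory": 32}
# The disjunction above: 5 nodes with >=64 MB, or 10 nodes with >=32 MB.
alt1 = clause_holds("count=5", machine) and clause_holds("memory>=64", machine)
alt2 = clause_holds("count=10", machine) and clause_holds("memory>=32", machine)
print(alt1 or alt2)  # True: the second alternative matches this machine
```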

  27. Multirequest: “+” • A multi-request allows us to specify multiple resource needs, for example + (& (count=5)(memory>=64) (executable=p1)) (&(network=atm) (executable=p2)) • Execute 5 instances of p1 on a machine with at least 64M of memory • Execute p2 on a machine with an ATM connection • Multirequests are central to co-allocation

  28. Co-allocation • Simultaneous allocation of a resource set • Handled via optimistic co-allocation based on free nodes or queue prediction • In the future, advance reservations will also be supported • globusrun and globus-job-* will co-allocate specific multi-requests • Uses a Globus component called the Dynamically Updated Request Online Co-allocator (DUROC)
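As a sketch of how a co-allocator might take a “+” multi-request apart before handing each piece to a resource manager (illustrative only; this is not DUROC's actual parser), the top-level subrequests can be separated by tracking parenthesis depth:

```python
def split_multirequest(rsl):
    """Split an RSL '+' multi-request into its top-level subrequests
    by tracking parenthesis depth (illustrative sketch only)."""
    body = rsl.strip()
    assert body.startswith("+"), "expected a '+' multi-request"
    parts, depth, start = [], 0, None
    for i, ch in enumerate(body):
        if ch == "(":
            if depth == 0:
                start = i  # beginning of a top-level subrequest
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                parts.append(body[start:i + 1])
    return parts

req = "+(&(count=5)(memory>=64)(executable=p1))(&(network=atm)(executable=p2))"
print(split_multirequest(req))
# ['(&(count=5)(memory>=64)(executable=p1))', '(&(network=atm)(executable=p2))']
```

Each piece could then be submitted to a different GRAM, which is the essence of co-allocation.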

  29. DUROC Functions • Submit a multi-request • Edit a pending request • Add new nodes, edit out failed nodes • Commit to configuration • Delay to last possible minute • Barrier synchronization • Initialize computation • Bootstrap library • Monitor and control collection

  30. DUROC Architecture [Diagram: a controlling application submits an RSL multi-request to DUROC, can edit the pending request, and synchronizes at a barrier; DUROC fans the subjobs out as controlled jobs 1-5 to resource managers RM1-RM4 and relays subjob status back to the controlling application]

  31. RSL Creation Using globus-job-run • globus-job-run can be used to generate RSL from command-line args: globus-job-run –dumprsl \ -: host1 -np N1 [-s] executable1 args1 \ -: host2 -np N2 [-s] executable2 args2 \ ... > rslfile • -np: number of processors • -s: stage file • argument options for all RSL keywords • -help: description of all options

  32. Job Submission Interfaces • Globus Toolkit includes several command line programs for job submission • globus-job-run: Interactive jobs • globus-job-submit: Batch/offline jobs • globusrun: Flexible scripting infrastructure • Other High Level Interfaces • General purpose • Nimrod-G, Condor-G, PBS, GRD, etc • Application specific • ECCE’, Cactus, Web portals

  33. globus-job-run • For running of interactive jobs • Additional functionality beyond rsh • Ex: Run 2 process job w/ executable staging globus-job-run -: host -np 2 -s myprog arg1 arg2 • Ex: Run 5 processes across 2 hosts globus-job-run \ -: host1 -np 2 -s myprog.linux arg1 \ -: host2 -np 3 -s myprog.aix arg2 • For list of arguments run: globus-job-run -help

  34. globus-job-submit • For running of batch/offline jobs • globus-job-submit: submit job • Same interface as globus-job-run • Returns immediately • globus-job-status: check job status • globus-job-cancel: cancel job • globus-job-get-output: get job stdout/err • globus-job-clean: cleanup after job

  35. globusrun • Flexible job submission for scripting • Uses an RSL string to specify job request • Contains an embedded globus-gass-server • Defines GASS URL prefix in RSL substitution variable: (stdout=$(GLOBUSRUN_GASS_URL)/stdout) • Supports both interactive and offline jobs • Complex to use • Must write RSL by hand • Must understand its esoteric features • Generally you should use globus-job-* commands instead

  36. Resource Brokers [Diagram: successive refinement of a request] • “Run a distributed interactive simulation involving 100,000 entities” → DIS-specific broker → “Supercomputers providing 100 GFLOPS, 100 GB, < 100 msec latency” → supercomputer resource broker (consulting the Information Service) → “80 nodes on Argonne SP, 256 nodes on CIT Exemplar, 300 nodes on NCSA O2000” → simultaneous-start co-allocator → “Run SF-Express on 80 nodes” / “Run SF-Express on 256 nodes” / “Run SF-Express on 300 nodes” → Argonne, CIT, and NCSA resource managers • Other brokers follow the same pattern, e.g. a parameter-study-specific broker (“Perform a parameter study involving 10,000 separate trials”) or a collaborative-environment-specific resource broker (“Create a shared virtual space with participants X, Y, and Z”)

  37. Brokering via Lowering • Resource location by refining an RSL expression (RSL lowering): (MFLOPS=1000) ⇒ (&(arch=sp2)(count=200)) ⇒ (+(&(arch=sp2)(count=120)(resourceManagerContact=anlsp2))(&(arch=sp2)(count=80)(resourceManagerContact=uhsp2)))

  38. Remote I/O and Staging • Tell GRAM to pull executable from remote location • Access files from a remote location • stdin/stdout/stderr from a remote location

  39. What is GASS? (a) GASS file access API • Replace open/close with globus_gass_open/close; read/write calls can then proceed directly (b) RSL extensions • URLs used to name executables, stdout, stderr (c) Remote cache management utility (d) Low-level APIs for specialized behaviors

  40. GASS Architecture [Diagram: a job whose RSL names its executable by URL, &(executable=https://…), is started via GRAM (b: RSL extensions); the program calls globus_gass_open(…), read(fd,…), and globus_gass_close(fd) against a local cache (a: GASS file access API), which is filled from a GASS server, HTTP server, or FTP server; the % globus-gass-cache utility manages the remote cache (c); low-level APIs customize the cache & GASS server (d)]

  41. GASS File Naming • URL encoding of resource names: https://quad.mcs.anl.gov:9991/~bester/myjob (protocol, server address, file name) • Other examples: https://pitcairn.mcs.anl.gov/tmp/input_dataset.1 https://pitcairn.mcs.anl.gov:2222/./output_data http://www.globus.org/~bester/input_dataset.2 • Supports http & https • Supports ftp & gsiftp
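The three URL parts called out above (protocol, server address, file name) can be pulled apart with Python's standard urlparse, shown here purely as an illustration of the naming scheme:

```python
from urllib.parse import urlparse

# Split a GASS-style URL (taken from the slide) into the three parts
# named there: protocol, server address, and file name.
url = "https://quad.mcs.anl.gov:9991/~bester/myjob"
parts = urlparse(url)
print(parts.scheme)   # https            (protocol)
print(parts.netloc)   # quad.mcs.anl.gov:9991  (server address)
print(parts.path)     # /~bester/myjob   (file name)
```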

  42. GASS RSL Extensions • executable, stdin, stdout, stderr can be local files or URLs • executable and stdin loaded into local cache before job begins (on front-end node) • stdout, stderr handled via GASS append mode • Cache cleaned after job completes

  43. GASS/RSL Example &(executable=https://quad:1234/~/myexe) (stdin=https://quad:1234/~/myin) (stdout=/home/bester/output) (stderr=https://quad:1234/dev/stdout)

  44. Example GASS Applications • On-demand, transparent loading of data sets • Caching of data sets • Automatic staging of code and data to remote supercomputers • (Near) real-time logging of application output to remote server

  45. GASS File Access API • Minimum changes to application • globus_gass_open(), globus_gass_close() • Same as open(), close() but use URLs instead of filenames • Caches URL in case of multiple opens • Return descriptors to files in local cache or sockets to remote server • globus_gass_fopen(), globus_gass_fclose()

  46. GASS File Access API (cont) • Support for different access patterns • Read-only (from local cache) • Write-only (to local cache) • Read-write (to/from local cache) • Write-only, append (to remote server)

  47. globus_gass_open() / globus_gass_close() [Flowchart] • globus_gass_open(): URL in cache? If no, download the file into the cache; then open the cached file and add a cache reference • globus_gass_close(): modified? If yes, upload changes; then remove the cache reference

  48. GASS File API Semantics • Copy-on-open to cache if not truncate or write-only append and not already in cache • Copy on close from cache if not read only and not other copies open • Multiple globus_gass_open() calls share local copy of file • Append to remote file if write only append: e.g., for stdout and stderr • Reference counting keeps track of open files
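The open/close semantics above can be sketched in a few lines of Python (a hypothetical model, not the Globus implementation; the URL and the in-memory "remote" dict are stand-ins for a GASS server):

```python
# Illustrative sketch of the semantics above: copy-on-open into a local
# cache, reference counting of open descriptors, and copy-on-close back
# to the remote side. Remote transfer is faked with in-memory dicts.

remote = {"https://quad/~/data": "contents"}   # stands in for a GASS server
cache = {}        # url -> local copy
refcount = {}     # url -> number of open descriptors

def gass_open(url):
    """Copy-on-open: fetch into the cache unless already present."""
    if url not in cache:
        cache[url] = remote[url]
    refcount[url] = refcount.get(url, 0) + 1
    return url  # a real implementation returns a file descriptor

def gass_close(url, modified=False):
    """Copy-on-close: when the last descriptor closes, upload if modified."""
    refcount[url] -= 1
    if refcount[url] == 0:
        if modified:
            remote[url] = cache[url]
        del cache[url], refcount[url]

fd = gass_open("https://quad/~/data")
cache[fd] += " updated"        # simulate writing through the descriptor
gass_close(fd, modified=True)
print(remote["https://quad/~/data"])  # contents updated
```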

  49. globus-gass-server • Simple file server • Run by user wherever necessary • Secure https protocol, using GSI • APIs for embedding server into other programs • Example: globus-gass-server -r -w -t • -r: Allow files to be read from this server • -w: Allow files to be written to this server • -t: Tilde expand (~/… → $(HOME)/…) • -help: For list of all options

  50. GRAM & GASS: Putting It Together [Diagram: globus-job-run takes command-line args and] 1. Derives the contact string (host name, jobmanager) 2. Builds the RSL string 3. Starts up a GASS server 4. Submits the request to the gatekeeper 5. Returns program output (stdout) via the GASS server
