
Generalized Resource Management In Computational Grids

Generalized Resource Management In Computational Grids. Carl Kesselman, Information Sciences Institute, University of Southern California, http://www.globus.org/. Acknowledgements: presentation is on work in progress; joint work with Ian Foster; contributions by Steve Tuecke, Alain Roy (ANL)



Presentation Transcript


  1. Generalized Resource Management In Computational Grids
     Carl Kesselman
     Information Sciences Institute, University of Southern California
     http://www.globus.org/

  2. Acknowledgements
     • Presentation is on work in progress
     • Joint work with Ian Foster
     • Contributions by:
       • Steve Tuecke, Alain Roy (ANL)
       • Soonwook Hwang, Bob Lindell (ISI)
       • Craig Lee, James Stepanek (The Aerospace Corporation)

  3. Computational Grids
     • Assemble distributed resources ...
       • High-end computers
       • Information sources
       • Scientific instruments, etc.
     • ... and apply to challenging problems
       • Smart instruments
       • Collaborative engineering
       • Data mining

  4. Example: Real-time Microtomography (APS beamline @ Argonne)
     • Resource location
     • “10 Gflop/sec, 20 Mb/sec, 10 minutes; rendering, 10 GB storage”
     • Resource allocation
     • Configuration
     • Parallel computation
     • Remote I/O

  5. Resource Management in Grids
     • Resources include:
       • Computers, networks, data, people, ...
     • Problem: How do we manage heterogeneous collections of distributed high-performance resources?
       • Locating resources
       • Allocating resources
       • Authentication and access control
       • Activities to prepare a resource for use

  6. Why Is It Hard?
     • Site autonomy
       • No control over local administration
     • Heterogeneous substrate
       • Many different platforms
     • Policy extensibility
       • Application-specific allocation requirements
     • Co-allocation
       • Simultaneous access to resources
     • Online control
       • Must access resources from applications

  7. Resource Allocation
     • Interact with local allocation systems
       • LoadLeveler, NQE, LSF, etc.
     • Coordinate allocation across multiple domains
     • Control resulting resource allocation
       • Status, terminate, etc.
     • Must deal with unavailability of resources
       • No guarantees

  8. Initial Approach
     • Local resource managers
       • Site autonomy, heterogeneous substrate
     • Resource specification language
       • Online control, policy extensibility
     • Resource brokers
       • Map high-level requests into local requests
     • Resource co-allocators
       • Co-allocation

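  9. Resource Management Architecture
     [Figure: The application issues RSL to a broker, which performs RSL specialization using queries to and information from the information service. The resulting ground RSL goes to a co-allocator, which passes simple ground RSL to the local resource managers: three GRAMs in front of LSF, EASY-LL, and NQE.]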
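The broker step in this architecture (turning an abstract request into a ground request bound to one resource) can be sketched roughly as follows. This is a minimal illustration, not the Globus implementation: the `Resource` record, the `INFO_SERVICE` list, the contact strings, the `resource_manager_contact` attribute name, and the best-fit selection rule are all assumptions made for the example, and the printed string only loosely imitates an RSL conjunction of `(attribute=value)` pairs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Resource:
    contact: str      # hypothetical contact string for a local GRAM
    scheduler: str    # local scheduler behind it, e.g. "LSF"
    free_nodes: int   # state the information service would report


# Hypothetical stand-in for the information service.
INFO_SERVICE: List[Resource] = [
    Resource("gram.site-a.example", "LSF", 16),
    Resource("gram.site-b.example", "EASY-LL", 64),
    Resource("gram.site-c.example", "NQE", 32),
]


def specialize(request: Dict[str, object], resources: List[Resource]) -> Dict[str, object]:
    """Broker step: bind an abstract request to one concrete resource."""
    candidates = [r for r in resources if r.free_nodes >= request["count"]]
    if not candidates:
        raise LookupError("no resource satisfies the request")
    # Assumed policy: pick the smallest resource that still fits (best fit).
    chosen = min(candidates, key=lambda r: r.free_nodes)
    ground = dict(request)
    ground["resource_manager_contact"] = chosen.contact
    return ground


def to_rsl(spec: Dict[str, object]) -> str:
    """Render a spec as an RSL-like conjunction of (attribute=value) pairs."""
    return "&" + "".join(f"({key}={value})" for key, value in spec.items())


ground_request = specialize({"executable": "/bin/sim", "count": 24}, INFO_SERVICE)
print(to_rsl(ground_request))
```

A real broker would apply application-specific policy here; best fit is used only because it is easy to follow.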

  10. Local Resource Management
     [Figure: A GRAM client uses MDS client API calls to locate resources and GRAM client API calls to request resource allocation and process creation. Inside the site boundary, the gatekeeper authenticates the request through the Globus Security Infrastructure and creates a job manager. The job manager parses the request with the RSL library, asks the local resource manager to allocate resources and create processes, and monitors and controls those processes. The GRAM reporter queries the current status of the resource and updates MDS with resource state information.]

  11. Limitations
     • Focus on resource allocation, not scheduling
       • Necessary but not sufficient
     • Cumbersome support for introducing quality of service constraints
       • RSL extensions
     • Difficult to support advanced reservation
       • Needed for effective co-allocation

  12. Resource Scheduling
     • Traditionally allocations have been “best effort”
       • IP networks, time-sharing CPU schedulers, queueing systems
     • Not sufficient to support advanced applications
     • Advanced reservation essential for effective co-allocation
     • Integration of quality of service into resource management architecture
     • Quality of service concerns

  13. Extending the Architecture
     • Support end-to-end management of networks, computers, memory, disks, etc.
       • Advance reservations, QoS, adaptation, etc.
     • Integrate diverse approaches
       • Current Globus CPU scheduling
       • Klara Nahrstedt’s work on CPU/ATM scheduling
       • Work on RSVP signaling (Qualis)
       • Differentiated services (Clipper)
     • Proposed new approach
       • Enhance Globus RM architecture

  14. Generalized GRAMs + Reservations
     • Enhance scope of Globus RM elements
       • Use “GRAMs” to control network, memory, etc.
       • Brokers for networks, computers, etc.
       • Treat end-to-end management as co-allocation
     • Separate concepts of reservation and creation
       • Reservation as explicit abstraction in architecture
       • Reservation specifies when and how much
       • Reservation doesn't guarantee allocation success

  15. Advanced Reservations
     • Required for resource allocation request
       • Default “best-effort” reservation preserves current behavior
     • Specifies when and how much
       • Start time, duration; both can be unknown
       • How much: initially focused on fractional resources such as CPU cycles or bandwidth
     • RSL used to express reservations
     • Reservation request produces a handle which can be passed around and reused
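One way to picture the reservation described on this slide is as a small record with optional "when" and "how much" fields plus a handle. The sketch below is only an assumption about what such a record might look like; the `Reservation` class, its field names, and the rule that a reservation with nothing specified counts as best effort are all hypothetical, not taken from the Globus design.

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional

_next_handle = itertools.count(1)  # hypothetical source of unique handles


@dataclass
class Reservation:
    resource: str                        # e.g. "cpu" or "net a"
    amount: Optional[float] = None       # how much; None means unspecified
    start_time: Optional[float] = None   # when; may be unknown
    duration: Optional[float] = None     # how long; may be unknown
    handle: int = field(default_factory=lambda: next(_next_handle))

    @property
    def is_best_effort(self) -> bool:
        """Assumed rule: nothing specified means default best-effort behavior."""
        return self.amount is None and self.start_time is None and self.duration is None


# A reservation for half a CPU, starting at t=100 s and lasting 600 s.
half_cpu = Reservation("cpu", amount=0.5, start_time=100.0, duration=600.0)
# The default reservation, which keeps today's best-effort semantics.
default = Reservation("cpu")
print(half_cpu.handle, half_cpu.is_best_effort, default.is_best_effort)
```

Because the handle is a plain value, it can be handed to another component (for example a co-allocator) and reused in later requests.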

  16. Allocation With Reservations
     • Allocation can be for range of resources
       • Flow, thread, process, etc.
     • Reservation provided with allocation request
     • Reservation can be changed for established allocation
     • Object can be destroyed without destroying reservation
     • RSL used to specify objects

  17. API Overview
     • create_reservation
       • Maps RSL to reservation handle
     • create_object
       • Maps RSL and reservation handle to process, flow, etc.
     • modify_reservation
       • Alters reservation associated with an object
     • Callback interface
       • Monitors state of reservation and object
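The four API elements listed here can be illustrated with a toy in-memory manager. This is a sketch under stated assumptions, not the Globus API: only the names `create_reservation`, `create_object`, and `modify_reservation` come from the slide, while the `ReservationManager` class, `register_callback`, `destroy_object`, the event names, and the use of plain dictionaries in place of RSL are hypothetical.

```python
import itertools
from typing import Callable, Dict, List


class ReservationManager:
    """Toy manager for one resource with a fixed capacity (e.g. 1.0 CPU)."""

    def __init__(self, capacity: float) -> None:
        self.capacity = capacity
        self._ids = itertools.count(1)
        self.reservations: Dict[int, float] = {}  # reservation handle -> amount
        self.objects: Dict[int, int] = {}         # object handle -> reservation handle
        self._callbacks: List[Callable[[str, int], None]] = []

    def register_callback(self, fn: Callable[[str, int], None]) -> None:
        """Callback interface: fn(event, handle) is called on every state change."""
        self._callbacks.append(fn)

    def _notify(self, event: str, handle: int) -> None:
        for fn in self._callbacks:
            fn(event, handle)

    def _check(self, amount: float, excluding: int = 0) -> None:
        used = sum(a for h, a in self.reservations.items() if h != excluding)
        if used + amount > self.capacity:
            raise RuntimeError("insufficient capacity")

    def create_reservation(self, spec: dict) -> int:
        """Map a specification (a dict standing in for RSL) to a handle."""
        self._check(spec["amount"])
        handle = next(self._ids)
        self.reservations[handle] = spec["amount"]
        self._notify("reservation_created", handle)
        return handle

    def create_object(self, spec: dict, reservation: int) -> int:
        """Create a process, flow, etc. under an existing reservation."""
        if reservation not in self.reservations:
            raise KeyError("unknown reservation handle")
        handle = next(self._ids)
        self.objects[handle] = reservation
        self._notify("object_created", handle)
        return handle

    def modify_reservation(self, reservation: int, spec: dict) -> None:
        """Change the amount reserved while objects keep running."""
        self._check(spec["amount"], excluding=reservation)
        self.reservations[reservation] = spec["amount"]
        self._notify("reservation_modified", reservation)

    def destroy_object(self, obj: int) -> None:
        """Destroy the object; its reservation stays valid for reuse."""
        del self.objects[obj]
        self._notify("object_destroyed", obj)
```

A typical sequence would be to create a reservation, create an object under it, raise the reservation while the object runs, and then destroy the object while keeping the reservation handle for a later object.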

  18. Example: Online Data Analysis
     [Figure labels: Requirements; Broker(s); Information Service; candidate resources; Reservation co-allocator; {ResHandles}; Object creation co-allocator; {ObjHandles}; Modify reservation; Online monitor. Request shown: {net a: 100 Mb/s, MPP 1: 40 nodes, net b: 30 Mb/s, CPU: 0.5}. Resources shown: MPP1, MPP2, and networks a, b, c, d, with nodes labeled G. Annotations: 40 nodes, 100 Mb/s, 50 Mb/s, 30 Mb/s, 0.5 CPU, Exclusive.]
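The reservation co-allocator in this example has to obtain all of the requested reservations or none of them. A minimal sketch of that all-or-nothing behavior follows. The request values are the ones shown in the example, but the `Resource` class, the `co_reserve` function, the resource capacities, and the release-on-failure strategy are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


class Resource:
    """Hypothetical reservable resource with a fixed capacity."""

    def __init__(self, name: str, capacity: float) -> None:
        self.name = name
        self.capacity = capacity
        self.reserved = 0.0

    def reserve(self, amount: float) -> None:
        if self.reserved + amount > self.capacity:
            raise RuntimeError(f"{self.name}: insufficient capacity")
        self.reserved += amount

    def release(self, amount: float) -> None:
        self.reserved -= amount


def co_reserve(resources: Dict[str, Resource],
               request: Dict[str, float]) -> Optional[List[Tuple[str, float]]]:
    """Reserve every item in the request, or release everything and return None."""
    granted: List[Tuple[str, float]] = []
    try:
        for name, amount in request.items():
            resources[name].reserve(amount)
            granted.append((name, amount))
    except (KeyError, RuntimeError):
        # Roll back the partial set so no resource is held needlessly.
        for name, amount in reversed(granted):
            resources[name].release(amount)
        return None
    return granted


# Capacities are assumed; the request matches the one in the example.
grid = {
    "net a": Resource("net a", 155.0),  # Mb/s
    "MPP 1": Resource("MPP 1", 64.0),   # nodes
    "net b": Resource("net b", 45.0),   # Mb/s
    "CPU": Resource("CPU", 1.0),        # fraction of one CPU
}
request = {"net a": 100.0, "MPP 1": 40.0, "net b": 30.0, "CPU": 0.5}
handles = co_reserve(grid, request)
```

Sequential reserve-then-rollback is the simplest scheme to show; it is not necessarily the technique the authors intend for reservation co-allocation.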

  19. Generalized GRAM Architecture
     [Figure labels: GRAM client; MDS; GRAM reporter; Auth/map server; Object GRAM; Resv. GRAM; object manager; objects; resource manager; resv. manager; policy manager; object records; resv. list; site boundary. Interactions shown: MDS client API calls to locate resources; update MDS with resource state information; GRAM client API calls request reservation and object creation; authenticate; create object manager; create & monitor object; delete object, modify reservation; read and write object records; create reservation, delete reservation or register reservation, delete reservation; read and write reservation list; check policy. Side panel: API (object creation, file operations), with privilege levels root, globus, and user.]

  20. Issues
     • Open versus closed systems
       • We may not control all access to resources
     • Limited support for advanced reservation on current platforms
       • May have to provide reservation support as part of system
     • Preemption and failure
       • Notification mechanism needed
     • Reservation brokering and co-allocation techniques

  21. Summary
     • Advanced reservation is critical for computational grid applications
     • Existing Globus resource management architecture can be extended to include reservation
     • Power of brokers, RSL, and GRAMs can be applied to reservations as well as allocations
       • Addresses end-to-end problem
     • More detailed design in progress
