Adding Standards Based Job Submission to a Commodity Grid Broker

David Wallom (1), David Colling (2), A. Stephen McGough (2), Tiejun Ma (1), Vesso Novov (2), Jazz Mack Smith (2), and Xin Xiong (1). (1) Oxford e-Research Centre, University of Oxford, 7 Keble Road, Oxford, OX1 3QG, UK

Presentation Transcript


  1. Adding Standards Based Job Submission to a Commodity Grid Broker. David Wallom (1), David Colling (2), A. Stephen McGough (2), Tiejun Ma (1), Vesso Novov (2), Jazz Mack Smith (2), and Xin Xiong (1). (1) Oxford e-Research Centre, University of Oxford, 7 Keble Road, Oxford, OX1 3QG, UK. (2) Imperial College London, London SW7 2AZ, UK

  2. Outline • Introduction • System Architectural Design • Implementation • Evaluation • Conclusion

  3. Introduction • Diversity of incompatible Grid schedulers, platform/ecosystem dependence, private interfaces • Lack of an OGSA HPC Basic Profile compliant metascheduler • Grid Information Collector + Condor + GridSAM + DRMs as the solution

  4. Compute Related Standards • Architecture: OGSA EMS Scenarios (GFD 106) • Use Cases: Grid Scheduling Use Cases (GFD 64) • Education: ISV Primer (GFD 141) • Agreement: WS-Agreement (GFD 107) • Programming Interface: SAGA (GFD 90) • Job Description: JSDL (GFD 56/136) • Programming Interface: DRMAA (GFD 22/133) • Accounting: Usage Record (GFD 98) • Application Description: HPC Application (GFD 111) • Application Description: SPMD Application (GFD 115) • Job Management: OGSA-BES (GFD 108) • Job Parameterization: Parameter Sweep (GFD 149) • Information: GLUE Schema 2.0 (GFD 147) • File Transfer: HPC File Staging (GFD 135) • HPC Domain Specific Profile: HPC Basic Profile (GFD 114) • Diagram group labels: Job Definition, Profiles • Diagram relationship labels: Uses, Supports, Produces, Extend, Describes

  6. Our Aims • Direct submission from GridSAM into PBS (PBS Pro and Torque), LSF and gLite resources • Condor Matchmaker (Broker) with standards based resource discovery and job submission • GLUE resource information retrieval • Translation of GLUE information to Condor ClassAds • Use of GridSAM job submission to give HPC-BP support
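The GLUE to ClassAd translation named in the aims can be pictured with a minimal Python sketch. It is an illustration only, not the Grid Information Collector's code: the GLUE 1.x attribute names are taken from the published schema, while the mapping table, the chosen ClassAd attribute names, and the helper functions are assumptions made for this example.

```python
# Hypothetical mapping from GLUE 1.x attribute names to ClassAd attribute
# names; the real collector's mapping is not given in the slides.
GLUE_TO_CLASSAD = {
    "GlueCEUniqueID": "Name",
    "GlueCEStateFreeCPUs": "FreeCpus",
    "GlueCEInfoTotalCPUs": "TotalCpus",
    "GlueHostMainMemoryRAMSize": "Memory",
    "GlueHostOperatingSystemName": "OpSys",
}


def classad_literal(value):
    """Render a Python value as a ClassAd literal (bool, number, or quoted string)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def glue_to_classad(glue_record):
    """Translate one GLUE record (a dict) into ClassAd text, one attribute per line.

    Attributes without an entry in GLUE_TO_CLASSAD are ignored.
    """
    lines = ['MyType = "Machine"']
    for glue_name, ad_name in GLUE_TO_CLASSAD.items():
        if glue_name in glue_record:
            lines.append(f"{ad_name} = {classad_literal(glue_record[glue_name])}")
    return "\n".join(lines)


if __name__ == "__main__":
    record = {
        "GlueCEUniqueID": "ce.example.org:2119/jobmanager-pbs",  # made-up resource
        "GlueCEStateFreeCPUs": 12,
        "GlueHostOperatingSystemName": "Linux",
    }
    print(glue_to_classad(record))
```

A table-driven mapping keeps the translation declarative, so supporting a further GLUE attribute is a one-line change rather than new code.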

  7. Our Contribution • Implementations of high functionality clients for OGF recommended standards • Adding automated meta-scheduler capability to Condor to interface directly with any standards compliant DRM • Any new ‘HPC-BP’ compliant system may be quickly and easily added to the grid, providing a clean and clear interface between the DRM and Condor • Concrete examples: integration of a Microsoft HPC Server 2008 cluster through the OGSA HPC Basic Profile and of other resources through GridSAM, with jobs submitted to these systems

  8. Grid Brokering Architecture • Resources: advertise and accept jobs • DRM: distributed resource manager • Grid Broker: job matchmaking • Users: submit jobs
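The roles on this slide (resources advertise, the broker matches, users submit) can be sketched as a toy two-sided matchmaker in Python. This is only a conceptual illustration in the spirit of Condor matchmaking, not Condor's algorithm or ClassAd language: the dictionary layout and the `requirements`, `rank`, and `accepts` callables are assumptions made for this example.

```python
def match(job, resources):
    """Return the best-ranked resource that both satisfies the job and accepts it.

    job:       dict with 'requirements' (resource -> bool) and 'rank' (resource -> number)
    resources: list of dicts, each with an 'accepts' callable (job -> bool)
    Returns None when no resource matches.
    """
    candidates = [
        r for r in resources
        if job["requirements"](r) and r["accepts"](job)
    ]
    if not candidates:
        return None
    return max(candidates, key=job["rank"])


if __name__ == "__main__":
    # Made-up resource advertisements.
    resources = [
        {"name": "cluster-a", "free_cpus": 4, "memory_mb": 2048,
         "accepts": lambda job: job["owner"] != "banned"},
        {"name": "cluster-b", "free_cpus": 64, "memory_mb": 8192,
         "accepts": lambda job: True},
    ]
    # Made-up job: needs at least 4 GB of memory, prefers more free CPUs.
    job = {
        "owner": "alice",
        "requirements": lambda r: r["memory_mb"] >= 4096,
        "rank": lambda r: r["free_cpus"],
    }
    best = match(job, resources)
    print(best["name"] if best else "no match")
```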

  9. System Architecture

  10. Grid Information Collector

  11. Job Submission and Monitoring

  12. Data Staging • Condor uses elements within the ClassAd to list the files that need to be staged to and from the resource. This information is available to the GAHP server because it receives the whole ClassAd. • In a JSDL document, files are marked for transfer along with a URL indicating where each file should be staged from or to. • If the location where the user has placed their files is already exposed through a file transfer protocol (such as FTP, GridFTP or HTTP(S)), then that location can be used as the URL. • If the files are not in exposed locations, then the GAHP server itself will copy the files to a location where they can be accessed.
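The staging rules above can be sketched in Python as a translation from ClassAd file lists to JSDL `DataStaging` elements. This is a minimal sketch under stated assumptions, not the GAHP server's implementation: the element names and namespace come from JSDL 1.0 and `TransferInput`/`TransferOutput` are the Condor job ClassAd attributes, but the helper functions, the plain-dict ClassAd, and the `staging_base_url` parameter (standing in for wherever the GAHP server copies unexposed files) are hypothetical.

```python
import posixpath
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
ET.register_namespace("jsdl", JSDL_NS)


def q(tag):
    """Qualify a tag name with the JSDL namespace."""
    return f"{{{JSDL_NS}}}{tag}"


def data_staging(file_name, uri, stage_in):
    """Build one jsdl:DataStaging element with a Source (stage in) or Target (stage out)."""
    ds = ET.Element(q("DataStaging"))
    ET.SubElement(ds, q("FileName")).text = file_name
    ET.SubElement(ds, q("CreationFlag")).text = "overwrite"
    ET.SubElement(ds, q("DeleteOnTermination")).text = "true"
    endpoint = ET.SubElement(ds, q("Source" if stage_in else "Target"))
    ET.SubElement(endpoint, q("URI")).text = uri
    return ds


def split_list(value):
    """Split a comma-separated ClassAd file list into trimmed, non-empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def staging_from_classad(ad, staging_base_url):
    """Translate TransferInput/TransferOutput entries into DataStaging elements.

    Entries that are already URLs are used as-is; plain file names are assumed
    to have been copied under staging_base_url (a hypothetical exposed location).
    """
    elements = []
    for stage_in, attr in ((True, "TransferInput"), (False, "TransferOutput")):
        for entry in split_list(ad.get(attr, "")):
            if "://" in entry:
                uri = entry
                file_name = posixpath.basename(urlparse(entry).path)
            else:
                file_name = posixpath.basename(entry)
                uri = staging_base_url.rstrip("/") + "/" + file_name
            elements.append(data_staging(file_name, uri, stage_in))
    return elements


if __name__ == "__main__":
    ad = {  # made-up job ClassAd, reduced to the two transfer attributes
        "TransferInput": "input.dat, gsiftp://host.example.org/data/ref.db",
        "TransferOutput": "result.out",
    }
    for element in staging_from_classad(ad, "https://gahp.example.org/stage/"):
        print(ET.tostring(element, encoding="unicode"))
```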

  13. Evaluation • At present, the system brokers jobs to around 40 Grid and cluster resources, both locally in Oxford and in the UK National Grid Service (NGS). • During this time our system has smoothly processed in excess of 600,000 individual jobs, managing job distribution according to user requirements. • This has allowed us to develop a whole Grid ecosystem which can form the basis of a campus or other Grid toolkit, allowing quick and easy deployment of Grid systems across institutions. • We have been able to reinstall this system onto a variety of OS and hardware configurations; it is currently operating on our institutional cloud.

  14. Evaluation: View • Map

  15. Future Work • Support the GLUE 2.0 standard for both BDII and UDDI. • Ensure support for newer HPC-BP compliant systems. • Incorporate the system into the current OGF Global HPC Basic Profile Interoperability demonstration.

  16. Conclusion • Shown how the Condor system can be integrated with OGF recommended standards for job submission (JSDL and OGSA-BES) and resource description (GLUE). • Allows the use of a powerful brokering service with resources exposed through standards based interfaces. • A more scalable solution than using GAHP alone, as different DRM systems need only develop a BES interface for their existing software rather than developing both a client and a server interface to integrate with Condor alone. • An added advantage is that, because their software exposes a BES interface, other Grid users can access these resources from standards based clients.

  17. Thanks Questions? david.wallom@oerc.ox.ac.uk Oxford e-Research Centre, University of Oxford, 7 Keble Road, Oxford, OX1 3QG, UK
