
Interactivity and fast job allocation at the level of Resource Brokers


Presentation Transcript


  1. Interactivity and fast job allocation at the level of Resource Brokers. Miquel A. Senar, Universitat Autònoma de Barcelona, Spain

  2. Abstract
     • Most Grid middleware developed in recent years targets the execution of sequential batch jobs. However, some users require a form of interactive access.
     • This talk describes our experience with CrossBroker in providing transparent and reliable support for interactive applications.
     • Our solution is based on two main notions:
       • split execution and interposition agents that stream I/O, which we have applied transparently to sequential and MPI applications;
       • a simple multiprogramming mechanism that starts interactive applications as fast as possible.
     Budapest EGEE’07, 1-5 October 2007

  3. Outline
     • System Architecture and Interactivity Problems
     • CrossBroker: Functionality and Architecture
     • I/O streaming
     • Glidein and Multiprogramming
     • Open Issues and Conclusions

  4. Grid Architecture
     [Diagram. Labels: SERVICES, Middleware (x3), Internet, REMOTE SITE (x2). Functions listed: resource directory, file replica management, monitoring, unified resource vision, authentication/security, start/control of jobs, file transfer, communication.]

  5. Batch execution on Grids
     [Diagram. Labels: Job, F1, F2, O1, O2, SERVICES, Middleware (x3), Internet, REMOTE SITE (x2).]

  6. Interactive Job Execution
     • Fast start-up
     • Execution in high-occupancy situations
     [Diagram. Labels: Job, F1, F2, SERVICES, Middleware (x3), Internet, REMOTE SITE (x2).]

  7. Grid Environment Constraints
     • No privileges
     • Minimal need for preinstalled components
     • No changes to the LRMS or applications
     • Outdated information
     • Dynamic changes
     • LRMS (PBS, LSF, Condor): limited external control
     • Non-cooperative LRMS
     • Local user jobs
     [Diagram. Labels: SERVICES, Information Index, Internet, middleware (x2), LRMS (x2).]

  8. CrossBroker
     [Architecture diagram. Labels: Information Index, Migrating Desktop, Scheduling Agent, Resource Searcher, CrossBroker, Replica Manager, Application Launcher, Condor-G, DAGMan, CE (x2), EGEE/Globus (x2), LRMS (x2), WN (x2).]

  9. CrossBroker features
     • Jobs are described in a text file using JDL (Job Description Language)
     • gLite interoperability
       • accepts jobs from gLite's UI
       • able to submit jobs to gLite resources (LCG-CE and gLite CE)
     • Focus on jobs not fully supported by gLite:
       • Parallel jobs (MPI): run on more than one resource/site in a coordinated fashion.
       • Interactive jobs: the user interacts with the application during its execution.

  10. Interactivity Requirements
     • Online input/output streaming: the application's input and output are available to the user while the application runs.
     • Fast start-up: the application can start immediately, even in scenarios where all computing resources are busy running batch jobs.

  11. I/O Streaming Support
     • The I/O stream is managed by a streaming engine. Typical architecture: one console (user interface) and one agent (application).
     • Examples:
       • Condor Bypass: library interposition
       • glogin + gvid: shell-like
     • CrossBroker injects interactive agents that enable communication between the user and the job
       • Transparent to the user
       • Full integration with Bypass and glogin & gvid
       • Support for interactivity in all kinds of jobs: sequential and all MPI flavors

  12. I/O streaming
     • CrossBroker: injects the agent into the job.
     • Job Console: console-like component, started at the UI or at the MD (Migrating Desktop).
     • Job Agent: traps input/output operations and sends them to the Job Shadow.
     • Supported agents: glogin (shell-like), Condor Bypass (I/O library interception).
     [Diagram. Labels: CrossBroker, Job Console, Resource, Job, RPC I/O, Job Agent, stdin, stdout, stderr, SO.]
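
The agent/console split on this slide can be illustrated with a toy in-process relay. This is only a sketch, not CrossBroker, glogin, or Condor Bypass code: `ForwardingStream` and `run_with_agent` are hypothetical names, and the `console` callback stands in for the RPC channel back to the user's console.

```python
import contextlib
import io
import sys
from typing import Callable


class ForwardingStream(io.TextIOBase):
    """File-like object that forwards each complete line to a console callback."""

    def __init__(self, name: str, console: Callable[[str, str], None]) -> None:
        self._name = name
        self._console = console
        self._buffer = ""

    def writable(self) -> bool:
        return True

    def write(self, text: str) -> int:
        self._buffer += text
        # Forward only complete lines; keep any partial line buffered.
        while "\n" in self._buffer:
            line, self._buffer = self._buffer.split("\n", 1)
            self._console(self._name, line)
        return len(text)

    def flush_partial(self) -> None:
        """Forward whatever is left when the job ends without a newline."""
        if self._buffer:
            self._console(self._name, self._buffer)
            self._buffer = ""


def run_with_agent(job: Callable[[], None],
                   console: Callable[[str, str], None]) -> None:
    """Hypothetical agent: run job() with stdout and stderr trapped and relayed."""
    out = ForwardingStream("stdout", console)
    err = ForwardingStream("stderr", console)
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        job()
    out.flush_partial()
    err.flush_partial()


if __name__ == "__main__":
    def demo_job() -> None:
        print("step 1 done")
        print("something odd happened", file=sys.stderr)

    # The "console" here just echoes to the real stdout of this process.
    real_stdout = sys.stdout
    run_with_agent(demo_job, lambda stream, line: real_stdout.write(f"[{stream}] {line}\n"))
```

The job itself is unchanged: it writes to its usual streams and the interposed object decides where the data goes, which is the property the slide describes as transparent to the user.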

  13. Interactive Support for video streaming
     [Diagram. Labels: Interactive Job Submission Plugin, Migrating Desktop, Roaming Access Server, CrossBroker, Glogin submission support, Java visualization plugin, G-login (x2), GVid (x2), SERVICES, Information Index, Replica Manager, Internet, GSS secured Video Stream, gLite CE (x2), WN (x2).]

  14. JDL: Interactive jobs
     • INTERACTIVE: true/false. Indicates that the job is interactive and that the broker should treat it with higher priority.
     • INTERACTIVEAGENT: the streaming mechanism (currently bypass and glogin).
     • INTERACTIVEAGENTARGUMENTS
     • INTERACTIVEAGENT and INTERACTIVEAGENTARGUMENTS specify the command (and its arguments) used to communicate with the user.

  15. JDL: Interactive jobs

        Type = "Job";
        VirtualOrganisation = "imain";
        JobType = "Parallel";
        SubJobType = "openmpi";
        NodeNumber = 11;
        Interactive = TRUE;
        InteractiveAgent = "glogin";
        InteractiveAgentArguments = "-r -p 195.168.105.65:23433";
        Executable = "test-app";
        InputSandbox = {"test-app", "inputfile"};
        OutputSandbox = {"std.out", "std.err"};
        StdErr = "std.err";
        StdOutput = "std.out";
        Rank = other.GlueHostBenchmarkSI00;
        Requirements = other.GlueCEStateStatus == "Production";
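
A job description like the one on this slide could be generated from a script. The sketch below is illustrative only: `render_jdl`, `format_value`, and `Expr` are hypothetical helpers, not part of CrossBroker or gLite, and the rule that an interactive job must name one of the two agents from slide 14 is an assumption about how a client might validate its input.

```python
from typing import Dict, List, Union

# Agent names taken from the slides; treating them as a closed set is an assumption.
KNOWN_AGENTS = {"bypass", "glogin"}


class Expr(str):
    """Marks a value as a raw JDL expression that is emitted without quotes."""


JdlValue = Union[str, int, bool, List[str]]


def format_value(value: JdlValue) -> str:
    """Render one Python value in the JDL syntax used on the slide."""
    if isinstance(value, Expr):
        return str(value)
    if isinstance(value, bool):  # must precede the int check: bool subclasses int
        return "TRUE" if value else "FALSE"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, list):
        return "{" + ", ".join(f'"{item}"' for item in value) + "}"
    return f'"{value}"'


def render_jdl(attrs: Dict[str, JdlValue]) -> str:
    """Hypothetical helper: check the interactive attributes, then emit JDL text."""
    if attrs.get("Interactive"):
        agent = attrs.get("InteractiveAgent")
        if agent not in KNOWN_AGENTS:
            raise ValueError(f"interactive job needs a known agent, got {agent!r}")
    return "\n".join(f"{key} = {format_value(val)};" for key, val in attrs.items())


if __name__ == "__main__":
    print(render_jdl({
        "JobType": "Parallel",
        "SubJobType": "openmpi",
        "NodeNumber": 11,
        "Interactive": True,
        "InteractiveAgent": "glogin",
        "InputSandbox": ["test-app", "inputfile"],
        "Rank": Expr("other.GlueHostBenchmarkSI00"),
    }))
```

Marking `Rank` and `Requirements` values with `Expr` keeps them unquoted, since in the slide they are expressions rather than strings.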

  16. Fast Start-Up for interactivity
     • Interactive jobs are sent to sites with available machines (inaccurate information can cause problems here).
     • If no machines are available, time sharing is used.
     • This makes it possible to start the application immediately, even in scenarios where all computing resources are running batch jobs.
     • Users should run interactive jobs with limitations. Otherwise, nobody would run batch jobs, since interactive jobs would run immediately at no cost to the user.
     • Resource usage has to be paid for somehow, and the cost has to be higher when users run interactive jobs.
     • “Payment” can be made in terms of user priority (more later).
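
The placement rule in the first two bullets can be sketched as a small decision function. The `Site` record, its fields, and the tie-breaking by largest count are assumptions made for illustration; the slides do not say how CrossBroker ranks candidate sites.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Site:
    name: str
    free_machines: int    # as published by the information index (may be stale)
    glidein_agents: int   # hypothetical count of agents able to host an interactive job


def place_interactive_job(sites: List[Site]) -> Optional[Tuple[str, str]]:
    """Return (site name, mode), or None if the job cannot start anywhere."""
    # First choice: a site that reports idle machines.
    free = [s for s in sites if s.free_machines > 0]
    if free:
        best = max(free, key=lambda s: s.free_machines)
        return best.name, "exclusive"
    # Fallback: share a machine that is already running a batch job.
    shared = [s for s in sites if s.glidein_agents > 0]
    if shared:
        best = max(shared, key=lambda s: s.glidein_agents)
        return best.name, "time-sharing"
    return None


if __name__ == "__main__":
    grid = [Site("site-a", 0, 2), Site("site-b", 3, 0)]
    print(place_interactive_job(grid))
```

Because the published counts may be outdated, a real broker would also need to handle the case where the chosen site turns out to be full; that retry logic is left out here.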

  17. Time Sharing on Grid Resources: Glide-in mechanism
     • Main idea:
       • Wrap every batch job with an agent (glide-in) that takes control of the remote machine independently of its local resource manager.
     • Goals:
       • Agents enable simple multiprogramming between interactive and batch jobs. Interactive jobs may run even when no free resources are available.
       • Agents can also be used as a fast start-up mechanism.
       • An agent can control the amount of CPU that an interactive job gets, according to the QoS requirements expressed by the user in the JDL.

  18. Time Sharing on Grid Resources: Glide-in mechanism
     • For each batch job, the broker submits an agent (glide-in) to the Grid.
     • The agent is created on a temporarily acquired Grid resource. The resource is logically divided into two “Virtual Machines” (VMs).
     • The agent reports back to the broker and starts the batch job in one VM.
     • When needed, the broker sends an interactive job directly to the agent, which runs it in the second VM with a higher priority than the batch job (simple time sharing).
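
The two-slot behaviour described above can be modelled in a few lines. This is a toy model under stated assumptions, not the CrossBroker agent: `GlideinAgent` and its methods are hypothetical, and expressing the interactive job's priority as a fixed CPU fraction is a simplification of the QoS control mentioned on slide 17.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class GlideinAgent:
    """Toy agent holding one batch slot and one interactive slot."""
    batch_job: str
    interactive_job: Optional[str] = None
    interactive_share: float = 0.0  # fraction of CPU given to the interactive job

    def start_interactive(self, job: str, cpu_share: float) -> None:
        """Accept an interactive job sent directly by the broker."""
        if self.interactive_job is not None:
            raise RuntimeError("interactive slot is busy")
        if not 0.0 < cpu_share <= 1.0:
            raise ValueError("cpu_share must be in (0, 1]")
        self.interactive_job = job
        self.interactive_share = cpu_share

    def finish_interactive(self) -> None:
        """Release the interactive slot; the batch job gets the CPU back."""
        self.interactive_job = None
        self.interactive_share = 0.0

    def cpu_allocation(self) -> Dict[str, float]:
        """Current CPU split between the two slots."""
        alloc = {self.batch_job: 1.0 - self.interactive_share}
        if self.interactive_job is not None:
            alloc[self.interactive_job] = self.interactive_share
        return alloc


if __name__ == "__main__":
    agent = GlideinAgent(batch_job="batch-1")
    agent.start_interactive("viz-session", cpu_share=0.75)
    print(agent.cpu_allocation())
    agent.finish_interactive()
    print(agent.cpu_allocation())
```

The batch job is never evicted in this model; it only loses CPU while an interactive job is present, which matches the "simple time sharing" wording on the slide.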

  19. Time Sharing [Diagram. Labels: Grid Resource, CrossBroker, LRMS, BATCH JOB, Scheduling Agent, Condor-G.]

  20. Time Sharing [Diagram. Labels as in slide 19, plus Application Launcher.]

  21. Time Sharing [Diagram. Labels as in slide 20, plus Agent, VM1, VM2.]

  22. Time Sharing [Diagram. Same labels as slide 21.]

  23. Time Sharing [Diagram. Same labels as slide 21. Caption: Batch job running.]

  24. Time Sharing [Diagram. Labels as in slide 21, with INT. JOB added and the batch job labelled BATCH.]

  25. Time Sharing [Diagram. Same labels as slide 24.] Startup-time reduction. Only one layer involved.

  26. Response Time [Chart. Labels: CrossBroker, CE + WN.]

  27. Open Issues
     • A generalized priority scheme for grid users and jobs (business model) is still missing. It is more a social than a technical issue.
     • Things are easier if the infrastructure is simple (single broker, uniform middleware, same LRMS, same policies, …).
     • More likely, users and institutions will share the same local infrastructure between several Grids (the interactive grid will be one of them): multiple brokers, multiple LRMS, different policies, …
     • Some solutions (under investigation):
       • Fair share with different penalty factors
         • Batch jobs worsen the priority according to the resources used
         • Interactive jobs worsen the priority faster than batch ones (twice, three times, …?)
         • If batch and interactive jobs run in the multiprogramming scheme, priority worsening is adjusted according to the amount of CPU consumed by each one
       • Interactive CE
       • Automatic injection of glidein wrappers for all jobs submitted from sources other than CrossBroker
       • Possibility to add advance reservations
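
The fair-share idea under investigation can be made concrete with a small accounting sketch. Everything here is an assumption for illustration: the slide leaves the interactive penalty open ("twice, three times, …?"), so the factor of 2 below is just one of the candidates, and `UserAccount` and `charge_shared_slot` are hypothetical names.

```python
from dataclasses import dataclass

BATCH_PENALTY = 1.0
INTERACTIVE_PENALTY = 2.0  # assumption: "twice"; the slide does not fix this value


@dataclass
class UserAccount:
    name: str
    priority: float = 0.0  # lower is better; it worsens (grows) with consumption

    def charge(self, cpu_seconds: float, interactive: bool) -> None:
        """Worsen the priority in proportion to the CPU consumed."""
        factor = INTERACTIVE_PENALTY if interactive else BATCH_PENALTY
        self.priority += factor * cpu_seconds


def charge_shared_slot(batch_user: UserAccount, interactive_user: UserAccount,
                       wall_seconds: float, interactive_share: float) -> None:
    """Split the charge when both jobs ran in the multiprogramming scheme."""
    interactive_user.charge(wall_seconds * interactive_share, interactive=True)
    batch_user.charge(wall_seconds * (1.0 - interactive_share), interactive=False)


if __name__ == "__main__":
    alice = UserAccount("alice")   # owns the batch job
    bob = UserAccount("bob")       # owns the interactive job
    charge_shared_slot(alice, bob, wall_seconds=100.0, interactive_share=0.75)
    print(alice.priority, bob.priority)
```

With these numbers the interactive user is charged for 75 CPU-seconds at double rate and the batch user for 25 CPU-seconds at the normal rate, so each job pays only for the CPU it actually consumed.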

  28. Interactivity and Grid Business Model
     [Diagram. Labels: RC A, RC B, RAS, LCG Broker, UI, LCG CE (x2), Int.eu.grid CE (x2), SE classical, VOMS, SE SRM (x2), CrossBroker, VO Applications, Users, Cluster Manager, WN (x3).]

  29. Conclusion
     Need for a business model that addresses:
     • Commitment of resources (specific to a VO, shared between VOs, …)
     • Payment / charging to users (VOs)
     • Happy users: access to resources
     • Happy providers: resources are used productively

  30. Thank you
