
Schedd On The Side



What is it?

Specialized scheduler operating on schedd’s jobs.

[Figure: a job queue holding Job 1 through Job 5 and its Schedd, with the Schedd On The Side alongside holding Job 4*.]
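
A minimal sketch of the idea, assuming that "Job 4*" stands for a modified copy of Job 4 produced by the side scheduler. The dict layout, the attribute names (`universe`, `state`, `routed_from`), and the `route_one` function are all hypothetical illustrations, not Condor's data model or API.

```python
from copy import deepcopy

def route_one(queue, is_candidate, transform):
    """Return a transformed copy of the first candidate job, or None.

    The original job stays in the queue untouched; the side scheduler
    only works on the copy.
    """
    for job in queue:
        if is_candidate(job):
            routed = transform(deepcopy(job))
            routed["routed_from"] = job["id"]  # hypothetical bookkeeping attribute
            return routed
    return None

# Hypothetical queue: plain dicts standing in for job descriptions.
queue = [{"id": n, "universe": "vanilla", "state": "idle"} for n in range(1, 6)]
for job in queue[:3]:
    job["state"] = "running"  # Jobs 1-3 already found a slot locally

def to_grid(job):
    # Assumed transformation: mark the copy as a grid-universe job.
    job["universe"] = "grid"
    return job

job4_star = route_one(queue, lambda j: j["state"] == "idle", to_grid)
```

Here the first idle job is Job 4, so `job4_star` is its grid-universe copy while the entry in `queue` is left as it was.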


Condor Farm Story

  • Now that this is working, how can I use my collaborator’s resources too?

[Figure: a Condor farm. Labels: Application, condor_submit, job queue, Schedd, Negotiator, Startd, Resources; repeated boxes labeled "Random Seed".]



Option #1: Merge Farms

  • Combine machines with collaborator into one Condor resource pool.

    • Everything works just like it did before.

    • Excellent option for small to medium clusters.

    • Requires bidirectional connectivity to all startds, or equivalent via GCB.

    • Requires some administrative coordination (e.g. upgrades, negotiator policy, security).


Option #2: Flocking Together

  • full featured (std universe etc.)

  • automatic matchmaking

  • easy to configure

  • requires bidirectional connectivity

  • both sites must run Condor

[Figure: flocking between two pools. Labels: Schedd, Local Startds, Remote Startds, and two Negotiators; repeated boxes labeled "Random Seed".]


Option #3: Grid Universe

  • easier to live with private networks

  • may use non-Condor resources

  • restricted Condor feature set (e.g. no std universe over grid)

  • must pre-allocate jobs between vanilla and grid universes

[Figure: labels Schedd, Negotiator, Startds, vanilla, Gatekeeper, X, site X; repeated boxes labeled "Random Seed".]


Option #4: Routing Jobs

  • dynamic allocation of jobs between vanilla and grid universes.

  • not every job is appropriate for transformation into a grid job.

[Figure: labels Schedd, Schedd On The Side, Negotiator, Local Startds, vanilla, Gatekeeper, X, Y, Z, site X, site Y, site Z; repeated boxes labeled "Random Seed".]



What About Flow Control?

  • May restrict routing to jobs which have been rejected by negotiator.

  • May limit maximum actively routed jobs on a per site basis.

  • May limit maximum idle routed jobs per site.

  • Periodic removal of idle routed jobs is possible, but there is no guarantee of optimal rescheduling.

  • Routing table may be reconfigured dynamically.

  • Multicast? Might be interesting to try.
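
The first three limits above can be read as a single admission check made before each job is routed. The following is a rough sketch under that reading; `SiteLimits`, `may_route`, the `rejected_by_negotiator` flag, and the counter names are hypothetical and do not correspond to actual Condor configuration knobs.

```python
from dataclasses import dataclass

@dataclass
class SiteLimits:
    max_jobs: int        # cap on jobs currently routed to this site
    max_idle_jobs: int   # cap on routed jobs still idle at this site

def may_route(job, site_counts, limits, require_rejected=True):
    """Decide whether one more job may be routed to a site.

    job         -- dict; 'rejected_by_negotiator' is an assumed flag
    site_counts -- dict with current 'routed' and 'idle' totals for the site
    """
    # Optionally route only jobs the local negotiator could not match.
    if require_rejected and not job.get("rejected_by_negotiator", False):
        return False
    # Per-site cap on actively routed jobs.
    if site_counts["routed"] >= limits.max_jobs:
        return False
    # Per-site cap on routed jobs that are still waiting.
    if site_counts["idle"] >= limits.max_idle_jobs:
        return False
    return True

limits = SiteLimits(max_jobs=10, max_idle_jobs=2)
ok = may_route({"rejected_by_negotiator": True}, {"routed": 3, "idle": 1}, limits)
```

Capping idle routed jobs keeps the router from piling work onto a site that is not starting it, which is the situation where periodic removal would otherwise be needed.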



What About I/O?

  • Jobs must be sandboxable (i.e. specifying input/output via transfer-files mechanism).

  • Routing of standard universe is not supported.

  • Additional restrictions may apply, depending on site network and disk.
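
As an illustration, the first two restrictions could be checked per job before routing. This is only a sketch: the job is modeled as a plain dict, and the keys `universe` and `uses_file_transfer` are hypothetical stand-ins for whatever the job description actually records.

```python
def routing_obstacles(job):
    """Return the reasons a job cannot be routed; an empty list means it can."""
    reasons = []
    # Standard universe jobs are excluded outright.
    if job.get("universe") == "standard":
        reasons.append("standard universe is not supported")
    # The job must declare its input/output through file transfer,
    # so that it can run from a sandbox at the remote site.
    if not job.get("uses_file_transfer", False):
        reasons.append("input/output not declared via file transfer")
    return reasons

sandboxed = {"universe": "vanilla", "uses_file_transfer": True}
shared_fs = {"universe": "vanilla", "uses_file_transfer": False}
```

Site-specific network and disk restrictions are not modeled here, since they depend on the destination.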


What Types of Grids?

  • Routing table may contain any combination of grid types supported by the grid universe.

  • Example: Condor-C

  • for two Condor sites, schedd-to-schedd submission requires no additional software

  • however, still not as trivial to use as flocking

[Figure: labels Schedd, Schedd On The Side, Negotiator, Schedd X, site X; repeated boxes labeled "Random Seed".]
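
To make "any combination of grid types" concrete, here is a small sketch of a routing table with two entries of different types. The table layout, route names, and hostnames are hypothetical. The `grid_resource` strings are modeled on the grid universe's submit syntax, where the first word names the grid type (`condor` for Condor-C, `gt2` for a Globus gatekeeper); check the manual for the exact form before relying on them.

```python
# Hypothetical routing table: one Condor-C route and one Globus route.
ROUTES = [
    {"name": "site_x_condor_c",
     "grid_resource": "condor schedd.site-x.example collector.site-x.example"},
    {"name": "site_y_globus",
     "grid_resource": "gt2 gatekeeper.site-y.example/jobmanager-pbs"},
]

def grid_type(route):
    """The grid type is taken to be the first word of the grid_resource string."""
    return route["grid_resource"].split()[0]

def routes_by_type(routes):
    """Group route names by grid type."""
    by_type = {}
    for route in routes:
        by_type.setdefault(grid_type(route), []).append(route["name"])
    return by_type

summary = routes_by_type(ROUTES)
```

In this sketch the Condor-C route needs only the remote schedd and collector names, which matches the point that schedd-to-schedd submission between two Condor sites requires no additional software.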


Routing Behind the Scenes

  • navigate internal firewalls

  • provide custom routes for special users

  • improve scalability

  • However, keep in mind I/O requirements etc.

[Figure: labels Schedd, Schedd On The Side, Gatekeeper, Schedd X3, X2, X.]


Future Step: Glidein Factory

  • true late binding of jobs to resources

  • may run on top of non-Condor sites

  • supports full feature set of Condor (e.g. standard universe)

  • requires GCB on network boundary (initiated by schedd-on-the-side?)

[Figure: labels home, Schedd, Schedd On The Side, Negotiator, Startds, glidein jobs, Gatekeeper, X, site X; repeated boxes labeled "Random Seed".]
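
"Late binding" here means a job is chosen only once a glidein has actually started on a resource, rather than when the request was sent to the site. A toy sketch of that ordering follows; `run_pilot`, the job names, and the slot names are hypothetical, and a real glidein would join the home pool and be matched by the negotiator rather than pop from a list.

```python
from collections import deque

def run_pilot(resource, home_queue):
    """Called at the moment a pilot (glidein) starts on a resource.

    The job is picked now, from whatever is still waiting at home,
    so no job is committed to a slot that never materializes.
    """
    if not home_queue:
        return None
    job = home_queue.popleft()
    return (job, resource)

home_queue = deque(["job-a", "job-b", "job-c"])

# Suppose three glideins were requested but only two have started so far.
started = ["site-X-slot-2", "site-X-slot-7"]
bindings = [run_pilot(resource, home_queue) for resource in started]
```

The third job stays in the home queue, free to run locally or on any other glidein that starts later.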


Glideing in the Works

schedd-to-schedd

  • hierarchical strategy for scalability and reliability

  • better match for private networks

schedd-to-gatekeeper

  • may require some additional horsepower from gatekeeper machine, perhaps a dedicated element for “edge services”.

[Figure: labels Schedd, Schedd On The Side, glidein factory, site X; repeated boxes labeled "Random Seed".]



Thanks

Interested? Let us know.

We are currently using job routing for specific users at UW. Future development will focus on more use-cases.

Dan Bradley

[email protected]

