
Connecting LRMS to GRMS

Jeff Templon

PDP Group, NIKHEF

HEPiX Batch Workshop

12-13 May 2005

Example Site Scenario
  • Computer cluster at NIKHEF:
    • 50% guaranteed for SC-Grid
    • 50% guaranteed for LHC experiments
    • Allow either group to exceed 50% if other group not active
    • Allow D0 experiment to scavenge any crumbs
    • Give ‘dteam’ (operations group) extremely high priority; but limit to 2 concurrent jobs
    • Limit running jobs from production groups to ~ 95% of capacity (always keep a few CPUs free for e.g. operations checks)
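The policy above maps fairly directly onto fair-share directives in a batch scheduler. A hypothetical sketch in Maui-style syntax (group names, priority values, and the exact directive spellings are illustrative and vary by scheduler and version):

```
# Hypothetical Maui-style fairshare sketch -- not NIKHEF's actual config.
GROUPCFG[scgrid] FSTARGET=50+          # 50% guaranteed; '+' = floor, may exceed if LHC idle
GROUPCFG[lhc]    FSTARGET=50+          # 50% guaranteed; may exceed if SC-Grid idle
GROUPCFG[dzero]  PRIORITY=1            # lowest priority: scavenges leftover cycles only
GROUPCFG[dteam]  PRIORITY=100000 MAXJOB=2   # very high priority, capped at 2 concurrent jobs
# A site-wide cap keeping ~5% of slots free would be a separate global
# limit; the right knob depends on the scheduler in use.
```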
Example User Scenarios
  • “polite user”
    • Uses grid job submission tools ‘bare’ & lets grid figure it out
  • “high-throughput user”
    • Ignores grid suggestions on sites; blast each site until jobs start piling up in ‘waiting’ state, then go to next site
  • “sneaky high-throughput user”
    • Like above but doesn’t even look at whether jobs pile up … jobs aren’t real jobs, they are ‘pilot’ jobs (supermarket approach)
  • “fast turnaround user”
    • Wants jobs to complete as soon as possible (special priority)
Connect Users to Sites with “Maximal Joint Happiness”
  • Users: work finished ASAP
  • Sites: always full and usage matches fair-share commitments
Key Question: How Long to Run?
  • Users: want to submit to sites that will complete job as fast as possible
  • Sites: site may be “full” i.e. no free CPUs, BUT:
    • NIKHEF 100% full for ATLAS means that any ‘SC-Grid’ jobs submitted will run as soon as a free CPU appears
    • If you can’t get this message to users, you won’t get any SC-Grid jobs
  • Should be clear from this that answer to “how long” depends on who is asking!
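The per-VO answer can be made concrete. A minimal sketch, assuming a toy model (the function, its parameters, and the growth factor are illustrative, not a real information-system schema): a VO below its fair-share target waits only about one scheduling cycle, while an over-share VO waits progressively longer.

```python
# Hypothetical sketch: the "how long until my job starts" answer
# depends on which VO asks. Model and constants are illustrative.

def estimated_start_delay(vo_usage, vo_share, scheduling_cycle=120):
    """Rough per-VO time-to-start estimate (seconds) for a full cluster.

    vo_usage: fraction of the cluster this VO currently uses (0..1)
    vo_share: fair-share fraction guaranteed to this VO (0..1)
    """
    if vo_usage < vo_share:
        # Below its guaranteed share: the next free CPU goes to this VO,
        # so it waits roughly one scheduling cycle.
        return scheduling_cycle
    # At or over its share: crudely model the wait as growing with how
    # far over-share the VO already is.
    overshoot = vo_usage - vo_share
    return scheduling_cycle * (1 + 10 * overshoot / max(vo_share, 1e-9))

# NIKHEF "100% full for ATLAS": ATLAS faces a long wait,
# while an SC-Grid job would start after about one cycle.
print(estimated_start_delay(vo_usage=1.0, vo_share=0.5))
print(estimated_start_delay(vo_usage=0.0, vo_share=0.5))
```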
Different answers, same question

[Figure: time to start (sec) vs. real time (sec), shown separately for dteam and ATLAS; black lines are measured, blue triangles are statistical predictions. See Laurence’s talk.]

How Long to Run
  • Need reasonable normalized estimates from users
  • Need normalized CPU units
  • Need solution for heterogeneous CPU population behind most sites’ grid entry points (NIKHEF has these)
  • Probably see Laurence’s talk here too!
  • Added value: good run-time estimates help LRMS scheduling (e.g. MPI jobs & backfill)
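Normalized estimates only work if both sides agree on a unit. A minimal sketch, assuming SI2k-style (SpecInt2000) benchmark ratings for CPUs; the benchmark values and function name are illustrative:

```python
# Hypothetical sketch of run-time normalization across a heterogeneous
# cluster. SI2k numbers below are illustrative, not real hardware ratings.

REFERENCE_SI2K = 1000  # rating of the agreed "reference" CPU

def local_walltime(estimate_ref_seconds, node_si2k):
    """Scale a user's run-time estimate, given in reference-CPU seconds,
    to the expected wall time on a specific node."""
    return estimate_ref_seconds * REFERENCE_SI2K / node_si2k

# A job estimated at 2 hours (7200 s) on the reference CPU:
print(local_walltime(7200, node_si2k=500))    # slower node: longer wall time
print(local_walltime(7200, node_si2k=2000))   # faster node: shorter wall time
```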
Sneaky HT vs Polite Users
  • Polite almost always loses
  • Sneaky HT good for sites to 0th order – mix of waiting jobs allows good scheduling
  • However
    • Templon needs to run 10 jobs
    • Submits 10 jobs to each of 100 sites in grid
    • First ten to start grab the ‘real’ jobs
    • Other 990 look exactly like black hole jobs
    • Waste ~ 16 CPU hrs (2 min scheduling cycle * 500 passes)
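The waste figure can be checked with the slide’s own numbers (990 phantom pilot jobs handled over roughly 500 scheduler passes at a 2-minute cycle):

```python
# Check the slide's waste estimate: 2-minute scheduling cycle * 500 passes.
passes = 500
cycle_minutes = 2
wasted_cpu_hours = passes * cycle_minutes / 60
print(wasted_cpu_hours)  # about 16.7 CPU hours, i.e. the slide's "~16"
```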
Polite Users still Lose unless we solve:
  • One question, one answer … one size fits nobody
  • High overhead in WMS: avg 250 sec life cycle for 20 sec job!
[Figure: grid speedup vs. number of jobs submitted, assuming a two-hour job, a single user, a single RB at best RB performance, and the scheduling cycle as the only delay at the site.]
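The WMS-overhead point can be quantified directly from the slide’s numbers: a 20-second job with an average 250-second life cycle spends only a small fraction of its grid lifetime actually computing.

```python
# Efficiency implied by the slide's WMS overhead figure.
job_seconds = 20
lifecycle_seconds = 250
efficiency = job_seconds / lifecycle_seconds
print(f"{efficiency:.0%}")  # -> 8%
```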

High Priority Users
  • Soln 1: dedicated CPUs (standing reservations) (expensive!)
  • Soln 2: virtualization with preemption (long way off?)
Other Issues
  • Transferring Info to LRMS
    • Run-time estimate
      • helps enormously in e.g. scheduling MPI jobs
      • Also may help in answering “the question”
    • Memory usage, disk space needs, etc.
  • MPI & accounting – what about “the dip”?
  • Self-disabling sites (avoid hundreds of lost jobs and tens of lost person-hours)
  • “Circuit breakers”? (Miron Livny)