1 / 13

SPRUCE S pecial PR iority and U rgent C omputing E nvironment Advisor Demo

SPRUCE S pecial PR iority and U rgent C omputing E nvironment Advisor Demo. http://spruce.teragrid.org/. Nick Trebon University of Chicago Argonne National Laboratory. Urgent Computing Resource Selection. Given an urgent computation and a deadline

karma
Download Presentation

SPRUCE S pecial PR iority and U rgent C omputing E nvironment Advisor Demo

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. SPRUCESpecial PRiority and Urgent ComputingEnvironmentAdvisor Demo http://spruce.teragrid.org/ Nick Trebon University of Chicago Argonne National Laboratory

  2. Urgent Computing Resource Selection • Given an urgent computation and a deadline how does one select the “best” resource? • “Best”: Resource that provides the configuration most likely to meet the deadline • Configuration: a specification of the runtime parameters for an urgent computation on a given resource • Runtime parameters: # cpus, input/output repository, priority, etc. • Cost function (priority): • Normal --> No additional cost • SPRUCE --> Token + intrusiveness to other users

  3. Probabilistic Bounds on Total Turnaround Time • Input Phase: (IQ,IB) • Resource Allocation Phase: (AQ,AB) • Execution Phase: (EQ,EB) • Output Phase: (OQ,OB) • If each phase is independent, then: • Overall bound = IB + AB + EB +OB • Overall quantile ≥ IQ * AQ * EQ *OQ

  4. IB+AB+EB+OB: File Staging Delay • Methodology is the same for input/output file staging • Utilize Network Weather Service to generate bandwidth predictions based upon probe • Wolski, et al.: Use MSE as sample variance of a normal distribution to generate upper confidence interval • Problems? • Predicting bandwidth for output file transfers • NWS uses TCP-based probes, transfers utilize GridFTP

  5. IB+AB+EB+OB: Resource Allocation Delay • Normal priority (no SPRUCE) • Utilize the existing Queue Bounds Estimation from Time Series (QBETS) • SPRUCE policies • Next-to-run • Modified QBETS - Monte Carlo simulation on previously observed job history • Pre-emption • Utilize Binomial Method Batch Predictor (BMPB) to predict bounds based upon preemption history • Other possibilities: elevated priority, checkpoint/restart

  6. Resource Allocation Concerns • Resource allocation bounds are predicted entirely on historical data and not current queue state • What if necessary nodes are immediately available? • What if it is clear that the bounds may be exceeded? • Example: Higher priority jobs already queued • Possible solution: Query Monitoring and Discovery Services framework (Globus) to return current queue state

  7. IB+RB+EB+OB: Execution Delay • Approach: Generate empirical bound for a given urgent application on each warm standby resource • Utilize BMBP methodology to generate probabilistic bounds • Methodology is general and non-parametric

  8. Calculating the Composite Probability • Goal: to determine a probabilistic upper bound for a given configuration • Approximate approach: query a small subset (e.g., 5) of quantiles for each individual phase and select the combination that results in highest composite quantile with bound < deadline

  9. Advisor Demo Screenshots

More Related