
Local Resource Management System & State Estimation




  1. Local Resource Management System & State Estimation • Local resource management systems • Condor, Maui, LSF, PBS • Prediction techniques • example: NWS • improving resource selection

  2. Condor - Introduction • Batch job system that allows the use of both dedicated and non-dedicated machines • Provides users with extra computing power • Introduces complexities • jobs may be removed before they are finished (preemption) • jobs must run on a wide array of machines (matchmaking)

  3. Condor – Preemptive Resume Scheduling • Advantages • use resources that are only available occasionally, by means of checkpoints, preemption and allocation • no backfilling (backfilling: taking advantage of holes in the schedule to run more jobs and thereby increase efficiency) • fair sharing of resources among jobs and users • compute on demand (low vs. high priority)

  4. Condor – Scheduling • Submit jobs to the local computer queue • Interact with the matchmaker to run a job (1 CPU per job) • Run the appropriate job (described by a ClassAd) by claiming it

  5. Triumvirate • User agent – makes sure the job finishes; resubmits on failure, etc. • Owner agent – enforces the owner's policy on how the computer is used; responsible for running submitted jobs • Matchmaker – finds matches between user and owner agents and implements system-wide policies

  6. Triumvirate (2)

  7. Condor – Matchmaking & Claiming • The user submits a job to the queue; it gets a unique identification • The user agent sends a ClassAd (every 5 min) as long as there are jobs that are not yet running • The owner agent sends a ClassAd (every 5 min) describing the computer it is responsible for • The matchmaker accepts ClassAds and attempts to find matches – negotiation • On a match, the user and owner agents work out the details independently of the matchmaker (using up-to-date information) • The user agent sends the job to the owner agent, and it runs
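
A minimal Python sketch of the matchmaking idea above (illustrative only, not Condor's implementation; the attribute names ImageSize, Memory, Arch and KeyboardIdle are assumptions): each side advertises attributes plus a Requirements expression, and a match exists only when both expressions hold against the other side's ad.

# Minimal sketch of ClassAd-style matchmaking (illustrative, not Condor code).
# Each ad is a dict of attributes plus a 'Requirements' predicate over the other ad.
job_ad = {
    "Owner": "alice",
    "ImageSize": 512,   # MB of memory the job needs (assumed attribute)
    "Requirements": lambda machine: machine["Memory"] >= 512 and machine["Arch"] == "X86_64",
}
machine_ad = {
    "Arch": "X86_64",
    "Memory": 2048,
    "KeyboardIdle": 1200,   # seconds since last keyboard activity (assumed attribute)
    "Requirements": lambda job: job["ImageSize"] <= 2048,
}

def matches(job, machine):
    # A match requires the Requirements of both sides to evaluate to true.
    return job["Requirements"](machine) and machine["Requirements"](job)

if matches(job_ad, machine_ad):
    print("matchmaker: notify both agents; they claim the resource between themselves")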

  8. Condor – Matchmaking & Claiming (2) • On problems outside the process, matchmaking is redone; on a program error, the problem is recorded and the user is informed • When the program starts, another process (the shadow) is started on the user agent's side; it is responsible for Condor’s remote I/O capabilities • Running jobs continue even if the matchmaker fails

  9. Condor - Preemption • Preemption is necessary to respect the interests of all parties • The key to success is checkpoint creation • when preempted from a machine • manual checkpoint creation • periodic checkpoint creation to safeguard against failures • Crashes/disruptions happen frequently in grids • Checkpointing and reacting to preemptions are an essential part of Condor’s approach to reliability
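
As a rough illustration of periodic checkpointing (not Condor's transparent checkpoint mechanism; the file name job.ckpt and the toy loop are made up), a long-running computation can regularly save its state so a preempted or crashed run resumes where it left off:

# Sketch of periodic checkpointing: save state every N steps, resume if a checkpoint exists.
import os, pickle

CKPT = "job.ckpt"        # hypothetical checkpoint file name

def run(total_steps=1_000_000, ckpt_every=100_000):
    # Resume from the last checkpoint if one exists, otherwise start fresh.
    if os.path.exists(CKPT):
        with open(CKPT, "rb") as f:
            step, acc = pickle.load(f)
    else:
        step, acc = 0, 0

    while step < total_steps:
        acc += step          # stand-in for the real work
        step += 1
        if step % ckpt_every == 0:
            with open(CKPT, "wb") as f:
                pickle.dump((step, acc), f)   # safeguard against preemption or failure
    return acc

if __name__ == "__main__":
    print(run())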

  10. Condor – user preemption • Manual preemption • Automation of the above process (e.g. based on running time) • Preemption on behalf of Condor • e.g. check whether the job could run on a better machine • not supported in the current version of Condor • needs consideration of issues such as ‘thrashing’ (constantly looking for a better computer without getting any work done)

  11. Condor – owner / matchmaker preemption • The owner removes a job running on his machine • automated by Condor (e.g. by monitoring keyboard activity) • manually, by running a command • The matchmaker can enforce administrator policies to increase efficiency • e.g. run a better job on a machine already running one • Condor, however, strongly prefers not to preempt jobs if they can be run on an idle machine

  12. Condor - conclusion • Condor can balance the desires of all stakeholders • Condor can both take advantage of sporadically available resources and react to problems such as failures • This flexibility and robustness is the key to its success

  13. Maui Scheduler - Introduction • High-performance scheduler for local clusters • Includes resource reservation, availability estimation and allocation management • External manager that extends and enhances the capabilities and performance of an existing scheduler

  14. Maui – Allocation properties • Concept of reservation to maintain resource allocations • the most important feature is future allocations • set aside a block of resources for various purposes, such as cluster maintenance or a guaranteed job start time • resource expression: the quantity and type conditions a resource must meet to be included • access control list (ACL): which consumers may utilize the reserved resources • timeframe: the time period over which the reservation actually blocks resources
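
The three reservation components above can be pictured with a small Python sketch (illustrative only; the field names and example values are assumptions, not Maui's data model):

# Sketch of a reservation with a resource expression, an ACL and a timeframe (not Maui internals).
from dataclasses import dataclass, field
from datetime import datetime, timedelta

@dataclass
class Reservation:
    resources: dict                                  # quantity/type, e.g. {"nodes": 16}
    acl: set = field(default_factory=set)            # which consumers may use the reserved block
    start: datetime = field(default_factory=datetime.now)
    duration: timedelta = timedelta(hours=2)

    def covers(self, when: datetime) -> bool:
        # The reservation only blocks resources inside its timeframe.
        return self.start <= when < self.start + self.duration

    def allows(self, user: str) -> bool:
        return user in self.acl

# Block 16 nodes for maintenance starting tomorrow, accessible only to the admin group.
maint = Reservation(resources={"nodes": 16}, acl={"admin"},
                    start=datetime.now() + timedelta(days=1))
print(maint.allows("admin"), maint.covers(maint.start))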

  15. Maui – Allocation properties (2) • Revocation of allocation • support for both revocable and irrevocable reservations • e.g. for strict time constraints on data availability or job completion • the default is irrevocable; reservations are maintained until their timeframe has expired or they are explicitly removed • Guaranteed completion time of allocations • locked to an exact time, guaranteed to complete before a certain time, or guaranteed to start after a given time • the scheduler regularly tries to optimize

  16. Maui – Allocation properties (3) • Guaranteed number of attempts to complete a job • don’t attempt to start a job until all prerequisites are met • using the defer mechanism, Maui can specify how many times to try to locate resources for a job before giving up or putting it on hold • Allocation run-to-completion • configure Maui to disable all or a subset of preemptions, thus guaranteeing that a job completes without interference • Exclusive allocations • request dedicated resources to guarantee exclusive access
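
A toy Python sketch of the defer idea (illustrative only; locate_resources and the retry limits are made up): resources are searched for a bounded number of times before the job is put on hold.

# Sketch of the defer mechanism: bounded retries, then hold (not Maui code).
import random, time

def locate_resources(job):
    # Placeholder for a real resource query; here it succeeds 30% of the time.
    return random.random() < 0.3

def schedule_with_defer(job, max_attempts=3, defer_seconds=1):
    for attempt in range(1, max_attempts + 1):
        if locate_resources(job):
            return f"{job}: started on attempt {attempt}"
        time.sleep(defer_seconds)        # defer and try again later
    return f"{job}: placed on hold after {max_attempts} failed attempts"

print(schedule_with_defer("job42"))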

  17. Maui – Allocation properties (4) • Malleable Allocations • all aspects can be dynamically modified • if a job consumes excessive resources, Maui can preempt or even cancel the job, depending on the resource-utilization policy

  18. Maui - Access to available scheduling info • Access to the tentative schedule • provides information on all possible availability times • the scheduler can request a single estimated start time for a job • Exclusive control • Maui maintains exclusive control over the execution • Event notification • generalized event-management interface; responds immediately to changes in the environment

  19. Maui – Requesting resources • Allocation offers • full contextual information regarding the request and whether and how Maui can satisfy it • Allocation cost or objective information • interfaces with allocation management systems that assist in assigning costs to resource consumption • Advance reservation • gives peers full control over the scheduling of jobs through time • Requirement for providing maximum allocation time in advance • credential-based walltime limits can be configured based on various criteria

  20. Maui – Requesting resources (2) • Deallocation policy • support for single-step resource allocation requests; creates a resource allocation valid until job completion • two-phase courtesy reservation: after the courtesy reservation is sent, a reservation commit must be received, otherwise the job is removed • Remote co-scheduling • stage remote jobs to a local cluster • Consideration of job dependencies • offers basic job-dependency support to block certain job steps until specific prerequisites are met
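
A minimal Python sketch of the two-phase idea (illustrative only; the class and method names are made up): a courtesy hold is created first, and the job is removed if no commit arrives before a timeout.

# Sketch of a two-phase courtesy reservation (not Maui code).
import time

class CourtesyReservation:
    def __init__(self, job, timeout=5.0):
        self.job = job
        self.created = time.monotonic()   # phase 1: tentative hold on the resources
        self.timeout = timeout
        self.committed = False

    def commit(self):
        # Phase 2: the requester must confirm within the timeout window.
        if time.monotonic() - self.created <= self.timeout:
            self.committed = True
        return self.committed

    def expired(self):
        return not self.committed and time.monotonic() - self.created > self.timeout

hold = CourtesyReservation("job7", timeout=2.0)
if not hold.commit():                     # no commit in time -> remove the job
    print("courtesy reservation expired; removing job", hold.job)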

  21. Maui – Manipulating the allocation execution • Preemption • suspend operations are supported as far as that capability is available in the underlying manager • Checkpointing • ‘checkpoint and terminate’ & ‘checkpoint and continue’ are supported • Migration • support for intra-domain job migration, but no support for QoS, load balancing, or other optimization • Restart • checkpoints used if available

  22. LSF - Introduction • Load Sharing Facility • Considered here in its role as a low-level scheduler

  23. LSF – Available-information attributes • Access to the tentative schedule • often impractical in real-world applications; not supported • Exclusive control • LSF executes in user space, so its control is not exclusive; it can only provide the necessary measures • Event notification • supplies an event-notification service for high-level schedulers


  25. LSF – Requesting resources • Allocation offers • does not expose potential resource allocations • Allocation cost or objective information • unsupported • Advance reservation • provides both built-in and Maui-integrated capabilities • Requirement for providing maximum allocation time in advance • held in high regard

  26. LSF – Requesting resources (2) • Deallocation policy • automatic • Remote co-scheduling • supported via higher-order scheduling instances • Consideration of job dependencies • built-in support for job dependencies via logical expressions over 15 dependency conditions
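
To illustrate the dependency-expression idea (a sketch only; the condition names and job states below are assumptions, not LSF's syntax), a job is dispatched only when a logical expression over the states of other jobs evaluates to true:

# Sketch of evaluating a job-dependency expression before dispatch (not LSF's implementation).
job_state = {"prep": "DONE", "convert": "EXIT", "analyse": "RUN"}   # hypothetical jobs

def done(name):    return job_state.get(name) == "DONE"
def exited(name):  return job_state.get(name) == "EXIT"
def started(name): return job_state.get(name) in {"RUN", "DONE", "EXIT"}

# "Run the report only after prep finished successfully and convert has terminated (even with an error)."
dependency_met = done("prep") and (done("convert") or exited("convert"))
print("dispatch report job" if dependency_met else "keep report job pending")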

  27. LSF – Allocation properties • Revocation of allocation • not needed in case of resource shortage, etc. • Guaranteed completion time of allocations

  28. LSF – Allocation properties (2) • Guaranteed number of attempts to complete a job • distinguishes, with complete flexibility, between attempts that are an execution pre-condition and those that are an execution condition • Allocation run-to-completion • with the implicit assumption that allocations do not exceed resource limits, for example • Exclusive allocations • can dispatch jobs to hosts where no other LSF job is running

  29. LSF – Allocation properties (3) • Malleable Allocations • built-in mechanisms allow allocations to decay consumption over time on a per-resource basis

  30. LSF – Manipulating the allocation execution • Preemption • supported since 1995; preempted workloads retain their resources • Checkpointing • assuming the application supports it, LSF provides an interface • Migration • provides a mechanism, to be driven by a high-level scheduler • Restart • provides an interface

  31. LSF - Conclusion • Supports most attributes of a low-level scheduler that can be exploited by a high-level scheduler

  32. PBS – Introduction • Portable Batch System • Flexible workload management and batch job scheduling system • Covers the entire Grid computing space: security, information, compute and data • Middleware technology that sits between compute-intensive or data-intensive applications and the network, hardware and OS • All jobs go into a single virtual pool, which is scheduled and distributed on the grid

  33. PBS – Security • Fundamental capabilities are secure authentication and authorization • Internally it makes use of user-name-based authentication • Support for X.509 Grid standard identification • certificate lifetime (expire/renew) • Identity mapping between sites is handled by a mapping function

  34. PBS - Information • Information management with access to the state of the infrastructure • Collects real-time state data with job-executor daemon processes (MOMs) • Easy integration with larger Grid information databases

  35. PBS - Compute • Advance reservation support • check for conflicts • e.g. reserve resources for a car-crash test, including compute cycles, network, database and facility • Cycle harvesting • expand available computing resources by using idle workstations • Peer scheduling • enables a site or sites with different PBS installations to automatically run jobs from each other • no job will be moved if it cannot run immediately

  36. PBS - Data • The most basic capability of a data Grid: file staging • automatic handling of copying files onto execution nodes (stage-in) prior to running the job • and of copying files off execution nodes (stage-out) after the job completes • PBS will not run a job until stage-in is fully done • Support for the Globus Toolkit, scp, GridFTP, etc.
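
The stage-in / run / stage-out ordering can be sketched in Python (illustrative only; the paths and helper names are made up, and real PBS handles staging transparently):

# Sketch of the stage-in -> run -> stage-out ordering (not PBS code).
import shutil
from pathlib import Path

def stage_in(inputs, workdir: Path):
    workdir.mkdir(parents=True, exist_ok=True)
    for src in inputs:                       # copy input files onto the execution node first
        shutil.copy(src, workdir)

def run_job(workdir: Path):
    # Placeholder for the real computation; it only starts once stage-in has finished.
    (workdir / "result.txt").write_text("done\n")

def stage_out(outputs, destdir: Path):
    destdir.mkdir(parents=True, exist_ok=True)
    for out in outputs:                      # copy results off the execution node afterwards
        shutil.copy(out, destdir)

work = Path("scratch")
stage_in([], work)                           # no input files in this toy example
run_job(work)
stage_out([work / "result.txt"], Path("results"))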

  37. PBS – Available-information attributes • Access basic information by typing qstat • Email notification

  38. PBS – Requesting resources • Single resource solution to a job request • Estimated completion time is configurable • the absence of this information, however, hampers performance (it is needed by backfilling, for example) • Job dependencies • Co-scheduling by simply configuring the queues of the system

  39. PBS – Allocation properties • Any allocation can be revoked, both while a job is queued and while it is running • Preemption by the scheduler is also possible; choice of suspension, checkpointing, requeuing or termination • Configurable job-completion attempts • Configurable exclusive allocation, etc. • No support for malleable allocation (e.g. adding or revoking resources during runtime)

  40. PBS - Manipulating the allocation execution • Support for requeue and restart • On preemption, checkpoint generation and migration are supported

  41. Prediction techniques • The problems of scheduling and resource allocation are central to Grid performance • Applications must balance performance gains against the communication overhead that parallelism produces • Grid resources differ widely in performance • A resource allocator must choose the right combination of resources from a pool that is constantly changing

  42. Prediction techniques (2) • Categorization into static and dynamic performance characteristics based on speed of change • static: clock speed (CPU) for example • dynamic: CPU load, network throughput

  43. Grid resource performance prediction • For a grid scheduler, two characteristics can be exploited to overcome the complexities introduced by the dynamics of Grid performance • Observable Forecast Accuracy • predictions of future performance measurements can be evaluated by recording their accuracy once the measurements are actually gathered • Near-term Forecasting Epochs • the scheduler can make decisions dynamically, just before execution begins; since accuracy usually degrades further into the future, the decision is made at the last possible moment
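
Observable forecast accuracy can be sketched in a few lines of Python (illustrative only; the sample numbers are made up): keep each prediction until the real measurement arrives, then record the error so the scheduler knows how far to trust the forecaster.

# Sketch of tracking observable forecast accuracy (illustrative).
errors = []

def record(prediction, actual):
    errors.append(abs(prediction - actual))   # the error becomes observable once the measurement arrives

def mean_error():
    return sum(errors) / len(errors) if errors else None

# Pairs of (prediction, later-observed measurement), e.g. available CPU fraction.
for pred, actual in [(0.80, 0.75), (0.70, 0.72), (0.65, 0.50)]:
    record(pred, actual)

print(f"observed mean absolute error: {mean_error():.3f}")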

  44. Prediction – an example (NWS) • Provides 3 fundamental functionalities • Monitoring, Forecasting, Reporting • NWS – Network Weather Service • a grid monitoring and forecasting tool designed to support dynamic resource allocation and scheduling • sensor control subsystem • historical data for future performance prediction • multiple reporting interfaces • a convenient methodology for replication and caching

  45. Prediction – an example (NWS) (2) • A performance monitoring and forecasting system must be able to execute on all platforms available to the user • written in C; highest portability with standard libraries • Two types of monitors (CPU probe) • passive: reads measurements gathered through some other means (e.g. the local OS), such as the UNIX load average • non-intrusive • possibly inaccurate • active: loads the resource itself and observes the performance response • knows the exact performance • intrusive
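
The two probe styles can be contrasted with a short Python sketch (illustrative only, not NWS code; the iteration count is arbitrary and os.getloadavg is UNIX-only):

# Passive vs. active CPU probing (illustrative).
import os, time

def passive_cpu_probe():
    # Read the 1-minute UNIX load average the OS already maintains (non-intrusive, possibly stale).
    return os.getloadavg()[0]

def active_cpu_probe(iterations=2_000_000):
    # Time a fixed amount of work; slower completion implies a more loaded CPU (intrusive but direct).
    start = time.perf_counter()
    total = 0
    for i in range(iterations):
        total += i
    return time.perf_counter() - start

print("passive:", passive_cpu_probe(), " active (s):", active_cpu_probe())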

  46. Prediction – an example (NWS) (3) • Intrusiveness vs. scalability (network probe) • probe the network by timing packet travel duration • with more hosts, probe collisions occur, resulting in loss of bandwidth • NWS uses a token-passing method to prevent such problems

  47. Prediction – an example (NWS) (4) • Forecasting • an inherent problem of prediction • assumptions are made about what resource performance will be when the job runs • in Grid settings, available resource performance can fluctuate dynamically • NWS uses statistical methods to mechanize and automate forecasting based on historical data
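
In that spirit, a small Python sketch (illustrative only; the predictors and sample values are assumptions, not NWS's actual forecaster set) runs several cheap predictors over the measurement history and uses the one whose past error has been lowest:

# Sketch of history-based forecasting: pick the predictor with the lowest replayed error (not NWS code).
from statistics import mean, median

predictors = {
    "last_value":    lambda h: h[-1],
    "running_mean":  lambda h: mean(h),
    "window_median": lambda h: median(h[-5:]),
}

def forecast(history):
    # Score each predictor by its mean absolute error when replayed over the history.
    def replay_error(p):
        return mean(abs(p(history[:i]) - history[i]) for i in range(1, len(history)))
    best = min(predictors, key=lambda name: replay_error(predictors[name]))
    return best, predictors[best](history)

bandwidth = [9.1, 8.7, 9.0, 5.2, 8.9, 9.2, 8.8]   # e.g. measured Mbit/s samples
print(forecast(bandwidth))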

  48. Prediction - Conclusions • Effective resource allocation and scheduling are critical to performance • Immediate performance-history data is used to make implicit predictions • To be truly effective, the performance-gathering system must be robust, portable and non-intrusive • The overhead introduced by the performance-gathering system must be carefully controlled • Using fast, robust techniques it is possible to improve the accuracy of performance predictions

  49. Improve resource selection with prediction • Run-time predictions • statistical analysis of applications that have already run • automatic code analysis or instrumentation • Explanation of two techniques, both using statistical data together with information provided to the scheduler when the job is run

  50. Categorization prediction technique • Derive run-time predictions from historical information about previous similar runs • there are many ways to identify similar applications: application name, user, arguments, submission time, etc. • a genetic algorithm is used to identify good templates (e.g. user + time) for a given workload • a mean prediction type is used • results show an average error of 39%
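
A toy Python sketch of the categorization idea (illustrative only; the records, template and field names are made up): past runs are grouped by a template such as (user, queue) and the category mean is used as the prediction.

# Sketch of template-based (categorization) run-time prediction with a mean prediction type.
from collections import defaultdict
from statistics import mean

history = [  # (user, queue, runtime in seconds) - made-up records
    ("alice", "short", 120), ("alice", "short", 150),
    ("alice", "long", 3600), ("bob", "short", 90),
]

def build_categories(records):
    # Template chosen here: (user, queue); a genetic algorithm could search for better templates.
    cats = defaultdict(list)
    for user, queue, runtime in records:
        cats[(user, queue)].append(runtime)
    return cats

def predict(cats, user, queue):
    runs = cats.get((user, queue))
    return mean(runs) if runs else None       # mean prediction type
cats = build_categories(history)
print(predict(cats, "alice", "short"))        # -> 135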
