
Job Scheduling: Market-based and Proportional Sharing



  1. Job Scheduling: Market-based and Proportional Sharing Richard Dutton CSci 780 – P2P and Grid Systems November 22, 2004

  2. Importance • Computing and storage resources are being shared among many users • Grids, utility computing, “cluster utilities” • Sharing can improve resource efficiency and flexibility and bring more computing power • Difficulty comes in controlling resource usage

  3. Outline • Market-based framework • Motivation/Problem • Characteristics of approach • Interposed Proportional Sharing • Goals • Methodology • Experimentation/Results • Conclusions • Discussion

  4. Market-based approach

  5. Motivation • Grids enable sharing of resources • Users have access to more resources • Resource usage must be arbitrated for competing demands • Both parties (resource provider and consumer) must benefit • Provider – effective global use of resources • Consumer – fairness, predictable behavior, control over relative priority of jobs

  6. Motivation (2) • Why market-based approach? • Large scale of resources and participants in Grid • Varying supply and demand • Market-based approach is good because • Provides decentralized resource management • Selfish consumers lead to global goal • User gets jobs done quickly, providers efficiently delegate resources • Laissez faire

  7. Ideas from batch systems • Batch systems incorporate: • User priority • Weighted proportional sharing • SLAs for setting bounds on available resources • Additionally, market-based approach will use relative urgency and cost of jobs

  8. Value-based system • Value-based (sometimes called user-centric) scheduling allows users to define a value (yield or utility) to each job • System is trying to maximize total value of jobs completed rather than simply meeting deadlines or reaching a certain throughput • Users bid for resources and pay with the value (value → currency) • System sells to highest “bidder” to maximize profits

  9. Risk vs. Reward • Focus here is scheduling in grid service sites • Since price is derived from completion time, the scheduler must weigh a task’s length against its value and opportunity cost • What this means: the scheduler must balance the risk of deferring a task against the reward of scheduling it

  10. Example: Market-Based Task Service • Tasks are batch computation jobs • Self-contained units of work • Execute anywhere • Consume known resources • Tasks give some value upon completion • Tasks associated with value function – gives value as function of completion time

  11. Example: Market-Based Task Service • Characteristics of a market-based task service • Negotiation between customers and providers • Value → price and quality of service → completion time • Form contracts for task execution • Not meeting terms of the contract implies a penalty • Consumers look for the best deal and each site attempts to maximize its profits

  12. Market Framework [Diagram: the customer sends Bid (value, service demand) to the task service sites; each site replies Accept (completion time, price) or Reject; the customer returns Accept (contract) to the chosen site]

  13. Goals • Develop policy choices for the task service sites to maximize profits • Acceptance – admission control • Scheduling • Use value metric to balance risk and reward in bidding and scheduling • Not concerned with currency supply, pricing systems, incentive mechanisms, payment…

  14. Value functions • Negotiation between site and bidder establishes agreement on price and QoS • Value function maps service quality (completion time) to value • Want the formulation to be “simple, rich, and tractable” • Generalization of linear decay value functions from Millennium • Expresses how the value of a task degrades with time – decay_i

  15. Value function [Figure: value as a function of completion time – the task earns its maximum value if it completes within its runtime; the value then decays linearly with time and eventually becomes a penalty]
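To make the linear-decay value function concrete, here is a minimal Python sketch; the field names, decay constant, and example numbers are illustrative assumptions, not values from the paper.

```python
# Sketch of a linear-decay value function: a task earns its maximum value if
# it completes within its runtime, and the value then decays linearly at rate
# `decay`, eventually going negative (a penalty).
from dataclasses import dataclass

@dataclass
class Task:
    max_value: float   # value earned if the task completes with no delay
    runtime: float     # processing time the task needs
    decay: float       # value lost per unit of delay beyond the runtime

def value(task: Task, completion_time: float) -> float:
    """Value earned if the task finishes at completion_time."""
    delay = max(0.0, completion_time - task.runtime)
    return task.max_value - task.decay * delay

t = Task(max_value=100.0, runtime=10.0, decay=2.0)
print(value(t, 10.0))   # 100.0: finished with no delay
print(value(t, 35.0))   # 50.0:  25 time units late
print(value(t, 80.0))   # -40.0: late enough to incur a penalty
```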

  16. Decisions • Which tasks to admit? • When to run an admitted task? • Wish to maximize profit • How much should a task be charged? • Based on value functions • Must find highest priced tasks and reject those which do not meet some minimum levels

  17. Experimental Methodology • Simulator that models bidding and scheduling in a task service economy with linear value functions • Use synthetic traces that are representative of real batch workloads • Compare against FirstPrice from Millennium • Concerned with relative performance and sensitivity analysis of using value and decay

  18. Risk/Reward Heuristics • Discounting Future Gains • Leans toward shorter tasks – less likely to be preempted • Realizes gains more quickly with short tasks – risk-averse scheduler • Opportunity Cost – takes into account the slope of decay • Leans toward more urgent tasks • If all tasks must be completed, it is best to complete most urgent tasks first

  19. Discounting Future Gains • Based on Present Value from finance • PV_i = yield_i / (1 + discount_rate × RPT_i) • PV_i can be thought of as investment value • Interest is earned at discount_rate for Remaining Processing Time (RPT) • High discount_rate causes the system to be more risk-averse • Heuristic called PV selects jobs in order of discounted gain PV_i / RPT_i
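A rough Python sketch of the PV heuristic described above; the task tuple layout and the example numbers are assumptions for illustration only.

```python
# PV (present value) heuristic sketch: discount each task's expected yield
# over its remaining processing time (RPT), then pick tasks in decreasing
# order of discounted gain per unit of remaining work, PV_i / RPT_i.
def present_value(yield_i: float, rpt_i: float, discount_rate: float) -> float:
    return yield_i / (1.0 + discount_rate * rpt_i)

def pv_order(tasks, discount_rate=0.05):
    """tasks: list of (name, expected_yield, remaining_processing_time)."""
    def score(task):
        _, y, rpt = task
        return present_value(y, rpt, discount_rate) / rpt
    return sorted(tasks, key=score, reverse=True)

tasks = [("short", 40.0, 5.0), ("long", 100.0, 50.0)]
print(pv_order(tasks))  # the short task sorts first: its gains are realized sooner
```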

  20. Improvement for PV

  21. Opportunity Cost • Extended heuristic to account for losses from opportunity cost • Loss in revenue from choosing some task i before task j • Opportunity cost to start i is given by the aggregate loss of all other competing tasks • Bounded penalties require O(n²) time • Unbounded penalties computed in O(log n)
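A simplified sketch of the opportunity-cost idea, assuming linear decay rates; the paper's actual computation (with its O(n²) and O(log n) bounds) is more involved, so treat this only as an illustration of "aggregate loss of all other competing tasks".

```python
# Simplified opportunity cost under linear decay: running task i for RPT_i
# time units delays every other queued task j by RPT_i, and each delayed task
# loses value at its own decay rate.
def opportunity_cost(i, tasks):
    """tasks: dict name -> (rpt, decay). Aggregate loss imposed on the other
    tasks while task i occupies the server."""
    rpt_i, _ = tasks[i]
    return sum(decay_j * rpt_i
               for name, (_, decay_j) in tasks.items() if name != i)

tasks = {"urgent": (10.0, 5.0), "relaxed": (10.0, 0.5)}
print(opportunity_cost("urgent", tasks))   # 5.0:  running it first is cheap
print(opportunity_cost("relaxed", tasks))  # 50.0: delaying "urgent" is costly
```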

  22. Balancing Gains and Opportunity Cost • It is risky to defer gains from a high-value task based solely on opportunity cost • Solution: FirstReward • reward_i = (α · PV_i − (1 − α) · cost_i) / RPT_i • The α parameter controls how much the system considers expected gains • α = 1 and discount_rate = 0 reduce FirstReward to PV
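A one-function sketch of the FirstReward score from this slide; cost_i stands for whatever opportunity-cost estimate is in use, and the example numbers are made up.

```python
# FirstReward sketch: blend discounted gain (PV_i) against opportunity cost
# (cost_i), normalized by remaining processing time. alpha controls how much
# weight expected gains get relative to costs.
def first_reward(pv_i: float, cost_i: float, rpt_i: float, alpha: float) -> float:
    return (alpha * pv_i - (1.0 - alpha) * cost_i) / rpt_i

# With alpha = 1 (and discount_rate = 0 when computing PV_i) the cost term
# drops out and FirstReward reduces to the plain PV heuristic.
print(first_reward(pv_i=80.0, cost_i=30.0, rpt_i=10.0, alpha=0.3))  # 0.3
```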

  23. Bounded Penalties • Shows it is more important to consider costs than gains → low α • Most effective around α ≈ 0.3

  24. Unbounded Penalties • Shows it is ONLY important to consider costs, not gains • Magnitude of improvements much greater

  25. Negotiation • Client submits task bids • Site accepts/rejects bid • If site accepts, it negotiates to set a price and completion time

  26. Admission Control • Steps for proposed tasks • Integrate task into candidate schedule according to FirstReward • Determine yield for the task if accepted • Apply acceptance heuristic to determine acceptability • If accepting, issue a bid to the client • If client accepts the contract, place task into schedule to execute • Acceptance heuristic based on amount of additional delay the task can allow before its value falls below some yield threshold
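The steps above can be sketched roughly as follows, assuming the linear-decay value functions used earlier; the trivial candidate schedule (append the task at the end of the queue), the use of expected yield as the quoted price, and the yield threshold are all simplifying assumptions, not the paper's policy.

```python
# Simplified admission-control sketch: build a candidate schedule, estimate
# the task's yield if accepted, and accept only if the task still has slack
# before its yield falls below a threshold.
def consider_bid(task, queued_work: float, yield_threshold: float):
    """task: (max_value, runtime, decay). Returns (price, completion_time)
    for a bid, or None to reject."""
    max_value, runtime, decay = task
    completion_time = queued_work + runtime          # candidate schedule: run last
    delay = max(0.0, completion_time - runtime)
    expected_yield = max_value - decay * delay       # yield if accepted
    # Acceptance heuristic: additional delay the task can absorb before its
    # yield drops below the threshold.
    slack = (expected_yield - yield_threshold) / decay if decay > 0 else float("inf")
    if slack <= 0:
        return None                                  # reject the bid
    return expected_yield, completion_time           # quote price and deadline

print(consider_bid((100.0, 10.0, 2.0), queued_work=20.0, yield_threshold=30.0))
print(consider_bid((100.0, 10.0, 2.0), queued_work=60.0, yield_threshold=30.0))  # None
```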

  27. Summary of Market-based Services • Develops heuristics for scheduling and admission control in market-based grid task service • Value-based scheduling allows user to specify the value and urgency of the job • Maximizing user value in turn maximizes yield globally • Approach based on computational economy

  28. Interposed Proportional Sharing

  29. Overview • This paper deals with share-based scheduling algorithms for differentiated service in network services, in particular storage service utilities • Allows a server to be shared among many request flows, with a probabilistic assurance that each flow receives some minimum share of resources

  30. Situation • Sharing of resources must be fair • SLAs often define contractual obligations between client and service

  31. Goals of This Research • Performance isolation • A surge from one flow should not degrade the performance of another flow • Differentiated application service quality • Performance should be predictable and configurable • Non-intrusive resource control • Designed to work without changes to existing services, like commercial storage servers • Views server as a black box • Control server resources externally

  32. Idea in Words • As the name suggests, the idea is to interpose a request scheduler between the client and server • The scheduler will intercept requests to the server • Depending on the request and state of previous requests, it will delay, reorder, or simply dispatch the request

  33. Idea in Pictures

  34. Interposed Request Scheduling • Scheduler intercepts requests • Dispatches according to some policies seeking to fairly share resources among all flows • Parameter D limits maximum number of outstanding requests • Each flow has separate queue • Scheduler dispatches from each queue on FIFO basis
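A minimal Python skeleton of an interposed request scheduler with per-flow FIFO queues and a depth limit D; the trivial flow picker and the print statement standing in for the server call are placeholders (the SFQ(D) policy described later replaces the picking rule).

```python
# Skeleton of an interposed request scheduler: one FIFO queue per flow, with
# at most `depth` (D) requests outstanding at the server at any time.
from collections import deque

class InterposedScheduler:
    def __init__(self, depth: int):
        self.depth = depth            # D: max outstanding requests at the server
        self.outstanding = 0
        self.queues = {}              # flow id -> FIFO queue of requests

    def submit(self, flow, request):
        self.queues.setdefault(flow, deque()).append(request)
        self._dispatch()

    def on_completion(self):
        self.outstanding -= 1
        self._dispatch()

    def _dispatch(self):
        # Keep the server at its concurrency limit while work is queued.
        while self.outstanding < self.depth:
            backlogged = [f for f, q in self.queues.items() if q]
            if not backlogged:
                break
            flow = self._pick_flow(backlogged)       # scheduling policy goes here
            request = self.queues[flow].popleft()    # FIFO within each flow
            self.outstanding += 1
            self._send_to_server(flow, request)

    def _pick_flow(self, backlogged):
        return backlogged[0]          # placeholder; SFQ(D) replaces this choice

    def _send_to_server(self, flow, request):
        print(f"dispatch {request} from flow {flow}")
```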

  35. Related Approaches • Façade proposes an interposed request scheduler that uses Earliest Deadline First (EDF) • Drawback: unfair – cannot provide performance isolation • Uses priority scheduling to achieve isolation • SLEDS – per-client network adapter • Uses leaky bucket filter to shape and throttle I/O flows • Not work-conserving

  36. Proportional Sharing • Proposes 3 proportional sharing algorithms • SFQ(D) – Start-time Fair Queuing • FSFQ(D) – Four-tag Start-time Fair Queuing • RW(D) – Request Windows • These are general and configurable solutions that provide • Performance isolation • Fairness • Work-conservation

  37. Fair Queuing • Each flow is assigned a weight Φ_f • Resources allocated among active flows in proportion to weight • A flow is active if it has at least 1 outstanding request • Fair: proven property bounding the difference in work done for any pair of active flows (lag) • Work-conserving: surplus resources consumed by active flows without penalty

  38. Start-time Fair Queuing • Start-time Fair Queuing (SFQ) is the basis for the scheduling algorithms because of its fairness properties • SFQ assigns a tag to each request upon arrival and dispatches requests in ascending order of tags • Fairness stems from the method of computing and assigning tags

  39. SFQ • Assigns a start tag and a finish tag to each request • Start tag: S(r_f^j) = max{ v(A(r_f^j)), F(r_f^(j-1)) }, where A(r_f^j) is the arrival time of flow f’s j-th request r_f^j • Finish tag: F(r_f^j) = S(r_f^j) + c_f^j / Φ_f, where c_f^j is the request’s cost • Defines a system notion of virtual time v(t) that advances as active flows progress • For example, v(t) advances quickly with less competition in order to use surplus resources

  40. SFQ • Start tag of a flow’s most recent request acts as the flow’s virtual clock • A flow with a small tag value is behind and will receive priority • A flow with a large tag value is ahead and may be held back • However, newly active flows will have their tag values set by v(t) so that there is fair competition between all active flows • Drawback: traditional SFQ [specifically v(t)] does not work well in the face of concurrency

  41. Interposed Proportional Sharing • Goal: use a variant of SFQ for interposed scheduling which handles up to D requests concurrently • Ideal goal: the interposed scheduler can dispatch enough jobs concurrently to completely use resources • Scheduler wants to always have D concurrent outstanding requests • This value D represents a tradeoff between server resource utilization and scheduler fairness • Example: large D allows server to always have jobs waiting, but also increases the wait time for incoming requests

  42. Min-SFQ • Adaptation of SFQ which defines v(t) as the minimum start tag of any outstanding request • Issue: • v(t) advances according to the slowest active flow f • A sudden burst from f will penalize aggressive flows which have been using the surplus from f’s idle resources • If v(t) lags behind, the scheduler degrades to the Virtual Clock algorithm • If v(t) gets too far ahead, it becomes FIFO • Both are known to be unfair

  43. SFQ(D) • Goal is to advance v(t) fairly • Solution 1: derive v(t) from active flows, not lagging flows • v(t) is defined as the start tag of the queued request with the lowest start tag at the last dispatch • Still uses the initial rules for determining the tags
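A compact sketch of the SFQ(D) bookkeeping described on the last few slides, assuming known per-request costs; completion tracking per request, error handling, and the actual server interface are omitted.

```python
# SFQ(D) sketch: each request r of flow f gets
#   start(r)  = max(v(t) at arrival, finish tag of f's previous request)
#   finish(r) = start(r) + cost(r) / weight(f)
# Queued requests are dispatched in increasing start-tag order, and v(t) is
# set to the lowest start tag among queued requests at each dispatch.
import heapq

class SFQD:
    def __init__(self, weights: dict, depth: int):
        self.weights = weights                 # flow -> weight (phi_f)
        self.depth = depth                     # D: max outstanding requests
        self.outstanding = 0
        self.vt = 0.0                          # virtual time v(t)
        self.last_finish = {f: 0.0 for f in weights}
        self.queue = []                        # heap of (start_tag, seq, flow, request)
        self.seq = 0

    def submit(self, flow, request, cost: float):
        start = max(self.vt, self.last_finish[flow])
        self.last_finish[flow] = start + cost / self.weights[flow]
        heapq.heappush(self.queue, (start, self.seq, flow, request))
        self.seq += 1
        self._dispatch()

    def on_completion(self):
        self.outstanding -= 1
        self._dispatch()

    def _dispatch(self):
        while self.queue and self.outstanding < self.depth:
            self.vt = self.queue[0][0]         # lowest queued start tag
            start, _, flow, request = heapq.heappop(self.queue)
            self.outstanding += 1
            print(f"dispatch {request} (flow {flow}, start tag {start:.2f})")
```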

  44. SFQ(D) • Since the algorithm for dispatching is strictly SFQ, earlier properties still hold • Fairness • Bound on lag for different flows • Authors prove that SFQ(D) has fairness and lag bounds for requests completed • Client’s view of fairness

  45. SFQ(D) • Problems • v(t) advances monotonically on request dispatch events, but may not advance on every dispatch • Therefore, bursts of requests may get the same start tag regardless of being behind or ahead • It is most fair for the scheduler to be biased in these situations against flows that have been using surplus resources • Realization: MinSFQ doesn’t suffer from this

  46. FSFQ(D) • Refinement of SFQ(D) that favors slow flows over ones that are ahead • Four-tag Start-time Fair Queuing • Combines fairness policies of SFQ(D) and MinSFQ(D) • Adds two new “adjusted” tags derived from MinSFQ(D) • The new tags are used to break ties in favor of lagging flows

  47. Problems • SFQ(D) and FSFQ(D) require a central point of interposition to intercept and schedule all requests • Designed for a network switch or router • Introduces a single point of failure and complexity • Scheduling overhead grows at best logarithmically – limits scalability • Authors propose a simple decentralized approach called Request Windows (RW)

  48. Request Windows • Credit-based server access scheme • Interposed at the client • Each flow i is given a number of credits n_i • Each request from i uses a portion of i’s credit allocation • For a given flow i, its share of the total concurrency D is proportional to its weight Φ_i
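A minimal sketch of the credit split, assuming the simple proportional allocation n_i = D · Φ_i / Σ_j Φ_j (the exact rule in the paper may differ); a flow can only have as many requests outstanding as it holds credits, and a completion returns a credit.

```python
# Request Windows credit split sketch: each flow's window of credits is its
# weighted share of the total concurrency D.
def credit_allocation(weights: dict, depth: int) -> dict:
    total = sum(weights.values())
    return {flow: depth * w / total for flow, w in weights.items()}

print(credit_allocation({"backup": 1.0, "oltp": 3.0}, depth=16))
# {'backup': 4.0, 'oltp': 12.0}
```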

  49. Request Windows

  50. Request Windows • Pros • Under light load, a flow will encounter little congestion and complete quickly • Similar to self-clocking nature of TCP • Cons • RW is not fully work-conserving • Yields tight fairness bound, but may limit concurrency and ability to use surplus resources • As with SFQ(D), able to prove bound on lag between active flows
