Model-Based Resource Provisioning for a Web Service Utility

Ron Doyle*, Jeff Chase, Omer Asad, Wei Jin, Amin Vahdat. Internet Systems and Storage Group, Department of Computer Science, Duke University.


Presentation Transcript


  1. Model-Based Resource Provisioning for a Web Service Utility Ron Doyle*, Jeff Chase, Omer Asad, Wei Jin, Amin Vahdat Internet Systems and Storage Group Department of Computer Science Duke University *

  2. Internet Service Utilities • Shared server cluster • Web hosting centers • Shared reserve capacity to handle surges and failures. • Service/load multiplexing • Dynamic provisioning • Service is contractual • Performance isolation • Differentiated service • SLAs

  3. Utility Resource Management • Goal: meet contractual service quality (SLA) targets under changing load; use resources efficiently. • Approach: assign each hosted service a dynamic “slice” of resources. • Combine “slivers” of shared servers, i.e., CPU time and memory. • Resource containers [Banga99], VMware ESX [Waldspurger02], PlanetLab • Assign shares of storage server I/O throughput. • Given the mechanisms for performance isolation and proportional sharing, how do we set the knobs?

  4. Adaptive Multi-Resource Provisioning • This work addresses resource allocation policy for multiple resources, with a focus on memory & storage. • 1. Provisioning: how much? [Muse SOSP01] • 2. Assignment: which servers and storage units? [Figure labels: clients; utility data center; monitor (observations); utility OS executive or service manager; actuator (directives).]

  5. Model-Based Provisioning • Resources interact in complex ways to determine overall service performance. • Incorporate a model of application behavior. • Model predicts effects of candidate allotments. • Plan allotments that are predicted to yield desired behavior. • Monitor load and adapt as load intensity varies. [Figure labels: resource manager; candidate allotments; performance predictions; application models; workload profiles (e.g., access locality); storage models.]

  6. Goals • Research question: how can a resource manager incorporate these models when they exist? • Manage multiple resources for diverse system goals. • Meet SLA targets for response time • Use surplus to optimize global average response time, yield, or value. • Adjust to constraints discovered during assignment. • Storage-aware caching [Forney03] • Demonstrate that even simple models are a powerful basis for dynamic resource management.

  7. Non-goals • We are NOT trying to: • build better models (you can plug in your favorite) • parameterize or adapt models online from system observations • manage network bandwidth • schedule resources within each slice • solve the assignment problem (bin-packing) • allocate resources across the wide area • make probabilistic performance guarantees • Assume stable average case behavior at each load level, and provision for average response time.

  8. System Context [Figure labels: MBRP; load and performance measures; configuration commands; clients; reconfigurable redirecting switch; offered load λ per service; server pool (stateless, interchangeable); storage tier; Muse [SOSP01].]

  9. Enforcing Slices • Our prototype uses the Dash Web server [Asad02] to enforce resource control for slices at user level. • Based on Flash [Pai99] using DAFS network storage. • Asynchronous I/O from user space to user-level cache • Low overhead (zero-copy, etc.), and user-level control • Fully asynchronous, event-driven server • “SEDA meets Click.” • Independently size caches for co-hosted services. • Request Windows [Jin03]: control the number of outstanding I/Os on a per-service basis. • Dash is part of the utility’s trusted computing base.
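The request-window idea on this slide (bounding outstanding I/Os per service) can be illustrated with a counting semaphore. This is only a minimal sketch of the general technique, not Dash's implementation, which the slides do not describe; the class name `RequestWindow`, its `submit` method, and the `peak` counter are hypothetical names chosen for illustration.

```python
import asyncio


class RequestWindow:
    """Hypothetical per-service window: at most `window` I/Os outstanding."""

    def __init__(self, window: int) -> None:
        # Create inside a running event loop so the semaphore binds to it.
        self._sem = asyncio.Semaphore(window)
        self.outstanding = 0  # I/Os currently in flight
        self.peak = 0         # highest number ever in flight (for inspection)

    async def submit(self, io):
        """Run the coroutine function `io` once a window slot is free."""
        async with self._sem:
            self.outstanding += 1
            self.peak = max(self.peak, self.outstanding)
            try:
                return await io()
            finally:
                self.outstanding -= 1


async def demo():
    window = RequestWindow(2)  # this service may have 2 I/Os in flight

    async def fake_io():
        await asyncio.sleep(0.01)  # stand-in for a storage request
        return 1

    results = await asyncio.gather(*(window.submit(fake_io) for _ in range(6)))
    return window.peak, sum(results)
```

Running `asyncio.run(demo())` issues six requests, but no more than two are ever outstanding at once; shrinking or growing the window is the knob a resource manager would turn.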

  10. A Simple Web Service Model • Streams of requests with stable average-case behavior per request class • Varying load intensity λ • Provision each stage, and M • Downstream demand grows and shrinks with M (inverse) • Bottlenecks limit demand downstream • Generalize to stages or tiers [Figure labels: arrival rate λ; CPU; object cache (M); storage arrival rate λ_S; storage; M yields hit rate H; $\lambda_S = \lambda(1 - H)$.]

  11. Web Cache Model • Footprint T objects • Average size S • Size is independent of popularity • Cache M objects • Given Zipf popularity parameter α • LFU approximation • Integrate over the Zipf PDF: $H = \frac{1 - M^{1-\alpha}}{1 - T^{1-\alpha}}$ [Figure: hit rate H versus cache size (M).]
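As a quick numeric check of the cache model, the closed form can be evaluated directly. This is a minimal sketch; the function name `hit_rate` and the input checks are assumptions added for illustration, and the closed form is undefined at α = 1, which the sketch rejects.

```python
def hit_rate(M: float, T: float, alpha: float) -> float:
    """Predicted hit rate H for a cache holding M of T objects.

    Evaluates H = (1 - M**(1 - alpha)) / (1 - T**(1 - alpha)),
    where alpha is the Zipf popularity parameter.
    """
    if T <= 1 or not 1 <= M <= T:
        raise ValueError("need T > 1 and 1 <= M <= T")
    if alpha == 1:
        raise ValueError("closed form is undefined at alpha == 1")
    e = 1.0 - alpha
    return (1.0 - M ** e) / (1.0 - T ** e)


if __name__ == "__main__":
    # Illustrative numbers only: 10,000-object footprint, alpha = 0.6.
    for M in (10, 100, 1000, 10000):
        print(M, round(hit_rate(M, 10_000, 0.6), 3))
```

The curve rises steeply for small caches and flattens as M approaches T, which is why memory has diminishing returns.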

  12. Storage Arrival Rate (IOPS) • Each miss requires S I/O operations. • S determines intensity of bulk I/O in this service’s storage load. • Model predicts storage response time R_S for load λ_S given a per-service IOPS share. • Account for prefetching and sequential locality indirectly. $\lambda_S = \lambda S (1 - H)$ [Figure: storage arrival rate λ_S versus cache size (M).]
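The relationship between cache size and storage load can be sketched by combining the hit-rate formula from the previous slide with λ_S = λS(1 − H). The function names and the sample numbers below are illustrative assumptions, not values from the talk.

```python
def hit_rate(M: float, T: float, alpha: float) -> float:
    """Zipf-based hit rate H = (1 - M**(1-alpha)) / (1 - T**(1-alpha))."""
    e = 1.0 - alpha
    return (1.0 - M ** e) / (1.0 - T ** e)


def storage_arrival_rate(lam: float, S: float, H: float) -> float:
    """Storage load in IOPS: each of the lam * (1 - H) misses costs S I/Os."""
    return lam * S * (1.0 - H)


if __name__ == "__main__":
    lam, S, T, alpha = 200.0, 4.0, 10_000, 0.6  # hypothetical workload profile
    for M in (100, 1000, 5000):
        H = hit_rate(M, T, alpha)
        print(f"M={M:5d}  H={H:.3f}  lambda_S={storage_arrival_rate(lam, S, H):.1f} IOPS")
```

Growing the cache lowers the IOPS the service needs, which is the interaction between memory and storage that the allocator exploits.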

  13. An Example using Dash • IBM 2001 segment • Load λ grows during trace segment. • Dynamic cache resizing • Storage IOPS demand λS matches model prediction (squint) • A few transient shifts in request locality

  14. A Model-Based Allocator • MBRP is a package of three primitives that coordinate with an assignment planner. • Candidate • Plan an initial allotment vector with CPU share and [M, storage IOPS share] • LocalAdjust • Adjust a vector to adapt to a resource constraint or surplus, while staying on target for response time. • GroupAdjust • Modify a set of vectors to adapt to a fixed resource constraint or surplus exposed during assignment. • Use any surplus to meet system-wide goals.

  15. Candidate • There is a large space of possible allotment vectors to meet a given response time target. • Simplify the search space with a simple principle: Build a balanced system. • Set the CPU share and storage allotment to hit a preconfigured target utilization level. • The target utilization level determines response time at storage and CPU. • Select the minimum M and H that can hit the SLA target for overall response time. • Refine the storage allotment based on M and H and resulting λ_S. • Converges quickly.
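The Candidate steps can be sketched under some loudly stated assumptions that are not in the slides: CPU and storage are treated as simple queues whose response time is demand / (1 − ρ), where ρ is the target utilization; `D_cpu` (CPU seconds per request) and `D_io` (seconds per I/O) are hypothetical profile parameters; and overall response time is CPU time plus the miss fraction times S I/Os. With these assumptions the plan has a closed form, so the iterative refinement mentioned on the slide collapses into one pass.

```python
from dataclasses import dataclass


@dataclass
class Allotment:
    cpu_share: float   # fraction of one CPU
    M: float           # cache size in objects
    H: float           # hit rate the plan relies on
    iops_share: float  # storage throughput share in IOPS


def candidate(lam, R_target, rho, D_cpu, D_io, S, T, alpha) -> Allotment:
    """Plan a balanced allotment (sketch; queueing model is an assumption)."""
    if not 0.0 < rho < 1.0:
        raise ValueError("target utilization must be in (0, 1)")
    # 1. The utilization target fixes per-stage response times.
    R_cpu = D_cpu / (1.0 - rho)
    R_io = D_io / (1.0 - rho)
    # 2. Minimum hit rate so that R_cpu + (1 - H) * S * R_io <= R_target.
    budget = R_target - R_cpu
    if budget <= 0.0:
        raise ValueError("SLA target is infeasible at this CPU utilization")
    H = max(0.0, 1.0 - budget / (S * R_io))
    # 3. Invert H = (1 - M**e) / (1 - T**e) to get the minimum cache size.
    e = 1.0 - alpha
    M = (1.0 - H * (1.0 - T ** e)) ** (1.0 / e)
    # 4. Size CPU and storage shares so each runs at utilization rho.
    lam_S = lam * S * (1.0 - H)
    return Allotment(cpu_share=lam * D_cpu / rho, M=M, H=H,
                     iops_share=lam_S / rho)


if __name__ == "__main__":
    print(candidate(lam=200, R_target=0.02, rho=0.5, D_cpu=0.002,
                    D_io=0.005, S=4, T=10_000, alpha=0.6))
```

A richer storage model, in which response time depends on the share itself, would presumably need the iterative refinement the slide describes.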

  16. LocalAdjust • LocalAdjust adapts to a constraint in one resource by adding more of another. • Take as much as you can of the constrained resource, then rebalance to meet the SLA target. • E.g., in this graph it grows memory to respond to an IOPS constraint. • Note: it’s not linear. [Figure: graph comparing the Candidate and LocalAdjust allotments.]
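The memory-for-IOPS trade can be sketched as follows. The sketch assumes, beyond what the slide states, that storage should stay at a target utilization `rho`, so a cap of `iops_cap` IOPS admits at most `rho * iops_cap` I/Os per second; all function and parameter names are hypothetical.

```python
def cache_size_for_hit_rate(H: float, T: float, alpha: float) -> float:
    """Invert H = (1 - M**e) / (1 - T**e), with e = 1 - alpha, for M."""
    e = 1.0 - alpha
    return (1.0 - H * (1.0 - T ** e)) ** (1.0 / e)


def local_adjust_for_iops(lam, S, T, alpha, rho, iops_cap, H_candidate):
    """Take all available IOPS, then grow memory to stay on target.

    Returns the (hit rate, cache size) needed once storage is capped.
    """
    max_lam_S = rho * iops_cap                    # storage load the cap admits
    H_needed = 1.0 - max_lam_S / (lam * S)        # from lam_S = lam * S * (1 - H)
    H = min(1.0, max(H_candidate, H_needed))      # never drop below the plan
    return H, cache_size_for_hit_rate(H, T, alpha)


if __name__ == "__main__":
    # Halving the IOPS cap repeatedly: memory grows faster than linearly.
    for cap in (640, 320, 160):
        H, M = local_adjust_for_iops(200, 4, 10_000, 0.6, 0.5, cap, 0.6)
        print(f"cap={cap:4d} IOPS  H={H:.2f}  M={M:.0f} objects")
```

Because the hit-rate curve flattens at large M, each further cut in IOPS costs more memory than the last, matching the slide's note that the trade is not linear.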

  17. GroupAdjust • Input: set of allotment vectors, with a group constraint or surplus. • E.g., planner mapped all vectors to a shared server, leaving surplus memory. • Adapt vectors to conform to constraint or use the surplus to meet a global goal. • E.g., for services with the same profiles (α, S, T), prefer the service with the heaviest load.

  18. Example: Differentiated Service • Four identical services: • same load λ • same profiles (α, S, T) • same storage units • Different SLA targets. • Provision memory to meet targets first, then optimize global response time. • (Give next unit of surplus memory to the most constrained service.)
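One way to read the surplus rule is as a greedy loop: hand out memory one unit at a time to the service whose load-weighted miss rate is predicted to drop the most. This is an interpretation for illustration, not the paper's stated algorithm; the function `group_adjust`, the tuple layout, and the unit size are assumptions. For identical services the rule picks the one with the least memory, and for equal memory it picks the heaviest load.

```python
def hit_rate(M: float, T: float, alpha: float) -> float:
    """Zipf-based hit rate H = (1 - M**(1-alpha)) / (1 - T**(1-alpha))."""
    e = 1.0 - alpha
    return (1.0 - M ** e) / (1.0 - T ** e)


def group_adjust(services, surplus, unit=1.0):
    """Distribute surplus memory greedily (sketch).

    services: list of (lam, M, T, alpha) tuples, where M already meets
    each service's own SLA target. Returns the new cache sizes.
    """
    sizes = [float(M) for _, M, _, _ in services]
    remaining = float(surplus)
    while remaining >= unit:
        # Predicted drop in miss traffic if each service got the next unit.
        gains = []
        for (lam, _, T, alpha), M in zip(services, sizes):
            grown = min(M + unit, T)
            gains.append(lam * (hit_rate(grown, T, alpha) - hit_rate(M, T, alpha)))
        best = max(range(len(sizes)), key=gains.__getitem__)
        if gains[best] <= 0.0:
            break  # every cache already holds its whole footprint
        sizes[best] = min(sizes[best] + unit, services[best][2])
        remaining -= unit
    return sizes


if __name__ == "__main__":
    # Four identical services whose SLA targets required different memory.
    svcs = [(100.0, M, 10_000, 0.6) for M in (100, 200, 400, 800)]
    print(group_adjust(svcs, surplus=300, unit=100))
```

In the demo the surplus goes to the services holding the least memory first, which narrows the gap between their response times and those of the better-provisioned services.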

  19. Some Other Results in the Paper • 1. GroupAdjust for services with different profiles and equivalent loads: prefer higher-locality services. • 2. Simple dynamic example to optimize for global response time in a storage-aware fashion. • 3. “Putting it all together” experiment: adjust to changes in locality, SLA targets, and available resources as well as changes in load. • 4. Handle overload by shifting a co-hosted service to another server (bin-packing assignment). • 5. Preliminary evaluation of storage model.

  20. Conclusion • Models are important for self-managing systems. • MBRP shows how to use models to adapt proactively. • Respond proactively to changing load signal, rather than reacting to off-target performance measures. • It’s easy to plug better models into the framework. • It seems clear that we can generalize this. • Broader class of systems (e.g., multi-tier) and system goals (e.g., availability). • But: models may be brittle or just plain wrong (HAL). • Self-managing systems will combine proactive and reactive mechanisms.

  21. http://issg.cs.duke.edu http://www.cs.duke.edu/~chase

  22. Utility Center with Distributed Servers and Storage: Assignment Planning • Map services to servers and storage units • Allocator primitives work in concert with assignment planning • Bin-packing services, balancing affinity, migration costs, local constraints/surplus

  23. Related Work • Proportional-share schedulers: mechanism to enforce provisioning policies. • Resource Containers [Banga99], Cluster Reserves [Aron00] • Response-time schedulers: meet SLA targets without explicit partitioning/provisioning. • Neptune [Shen02], Facade [Lumb03] • Adaptive Resource Management for Servers: reactive, feedback-based adjustment of server resources. • Web Server Performance Guarantees [Abdelzaher02], Predictable Web Server QoS [Aron-PhD], SEDA [Welsh01] • Memory/storage management: goal-directed allotment of resources to services. • Storage Aware Caching [Forney02], Value Sensitive Caching [Kelly99], Hippodrome [Anderson02]

  24. Multiple Shared Resources • Bottleneck Behavior • Non-bottleneck resource adjustments have little effect. • Global Constraints • Services compete for resources in zero-sum game • Local Constraints • Service assignment to nodes exposes local resource constraints. • Caching • Memory allotment affects storage load for single service, impacting available resources for other services

  25. Adaptive Resource Provisioning • Utility OS Services • Predictable average-case response time • Resource intensive • Workload Models predict • Resource Demand • Resource Interaction • Effect of allotment decisions • Framework is reactive to changes in workload characteristics for dynamic adaptation

  26. Outline • Overview • Resource control mechanisms • Web Service Models • Model-Based Allocator • Conclusions
