Budget-based Control for Interactive Services with Partial Execution Yuxiong He, Zihao Ye, Qiang Fu, Sameh Elnikety Microsoft Research
Motivation • Interactive services specify stringent SLAs on response time • Long response times cause user dissatisfaction and revenue loss • Important to bound response time (e.g. mean, 95th percentile) • Address two challenges • Adapt to a dynamic and changing environment • Achieve high response quality Goal: Develop a self-managed scheduling system that meets the response time target while achieving high quality.
Existing Techniques (1) • Static admission control approach • Define a fixed queue length limit; drop requests when the queue is full. • Issues • Only works for a static system. • Determining an appropriate queue length for every setting and load is challenging. • Small queue length => underutilized resources • Large queue length => long response time • Cannot adapt to a dynamic and changing environment.
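The static approach on this slide can be sketched as a bounded queue. This is a minimal illustration, not code from the talk; the class name `StaticAdmissionQueue` and its methods are hypothetical.

```python
from collections import deque


class StaticAdmissionQueue:
    """Bounded FIFO queue that drops arrivals once a fixed limit is reached.

    Hypothetical illustration of static admission control; the limit never
    changes, which is exactly the weakness the slide points out.
    """

    def __init__(self, limit):
        self.limit = limit        # fixed queue length limit
        self.pending = deque()    # admitted requests waiting for service
        self.dropped = 0          # number of rejected requests

    def offer(self, request):
        """Admit the request if there is room; otherwise drop it."""
        if len(self.pending) >= self.limit:
            self.dropped += 1
            return False
        self.pending.append(request)
        return True

    def take(self):
        """Remove and return the oldest pending request, or None if empty."""
        return self.pending.popleft() if self.pending else None
```

With a limit of 2, a third arrival is dropped until a pending request is taken. Choosing that limit is the hard part: too small leaves the server idle, too large lets response time grow.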
Existing Techniques (2) • Classic feedback control approach: • Feedback control on queue length • Decrease queue length when response time is above target • Issue • Dropping requests results in degraded quality • Does not consider partial execution of requests
Partial Execution & Response Quality • Incomplete execution of requests may still return meaningful partial results • Many interactive services support partial execution • Web search, web server, video streaming, finance server • Quality profile • A function that maps request execution time to response quality
Our Contributions • Propose a budget-based control model for interactive services with partial execution • Use feedback control to meet response time target • Apply optimization procedure to improve response quality • Exploit partial execution and request quality profile • Evaluation • Implementation at Bing search server • Simulation on finance server
Budget-based Control Model • Control Variable • Budget: amount of computation time for all pending requests • Control mechanism • Determine the budget based on response time feedback • Control budget to meet response time • Optimization procedure • Given a budget, assign processing time to requests • Exploit partial results of a query • Scheduling to improve quality
Control Mechanism • Basic idea • If response time is above the target, decrease the budget • If response time is below the target, increase the budget • Criteria • Meet the response time target accurately and quickly • Incur little runtime overhead.
Control Mechanism: Background • Integral control • Adjust budget based on the difference between the observed and target response time • Advantage: eliminates steady-state error • Limitation: response is slow (long settling time) • Adaptive control • Model estimator + Linear quadratic optimal controller • Advantage: quick adaptation, fast response • Limitation: computationally expensive, steady-state error
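The integral control rule on this slide can be sketched as follows. The class name, the gain value, and the clamping bounds are assumptions for illustration; the talk does not give the controller's parameters.

```python
class IntegralBudgetController:
    """Integral controller on the budget (hypothetical sketch).

    Each update adds gain * (target - observed) to the budget, so the
    budget shrinks while responses are too slow and grows while they are
    faster than needed.
    """

    def __init__(self, gain, initial_budget_ms,
                 min_budget_ms=0.0, max_budget_ms=float("inf")):
        self.gain = gain                  # assumed tuning parameter
        self.budget_ms = initial_budget_ms
        self.min_budget_ms = min_budget_ms
        self.max_budget_ms = max_budget_ms

    def update(self, observed_ms, target_ms):
        """Accumulate the response time error into the budget."""
        error_ms = target_ms - observed_ms
        self.budget_ms += self.gain * error_ms
        # Keep the budget inside its allowed range.
        self.budget_ms = max(self.min_budget_ms,
                             min(self.max_budget_ms, self.budget_ms))
        return self.budget_ms
```

Because the correction accumulates until the error reaches zero, the budget stops changing only when the observed response time matches the target. A small gain makes this convergence slow, which is the long settling time noted above.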
Control Mechanism: Hybrid Control • Combine the integral and adaptive control • Run adaptive control periodically in a coarse-grain time interval • Use integral control for execution of each request for fine-grain adjustment • Meet our goal • Quick and accurate adaptation • Little runtime overhead.
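The two time scales of the hybrid scheme might be structured as in the sketch below. The coarse-grain step here is a deliberately simplified stand-in, not the model estimator and linear quadratic controller named in the talk: it assumes response time is roughly proportional to the budget, fits that slope by least squares, and resets the budget accordingly. All names and parameters are hypothetical.

```python
class HybridBudgetController:
    """Fine-grain integral steps plus a periodic coarse-grain reset (sketch)."""

    def __init__(self, target_ms, gain, initial_budget_ms, period):
        self.target_ms = target_ms
        self.gain = gain                  # assumed integral gain
        self.budget_ms = initial_budget_ms
        self.period = period              # requests between coarse steps
        self.samples = []                 # (budget_ms, observed_ms) pairs

    def on_request_done(self, observed_ms):
        """Update the budget after a request completes."""
        self.samples.append((self.budget_ms, observed_ms))
        if len(self.samples) >= self.period:
            self._coarse_step()
        else:
            # Fine-grain integral adjustment on every request.
            error_ms = self.target_ms - observed_ms
            self.budget_ms = max(0.0, self.budget_ms + self.gain * error_ms)
        return self.budget_ms

    def _coarse_step(self):
        # ASSUMPTION: observed_ms ~= slope * budget_ms. Fit the slope through
        # the origin and pick the budget that the model maps to the target.
        num = sum(b * r for b, r in self.samples)
        den = sum(b * b for b, _ in self.samples)
        if num > 0 and den > 0:
            self.budget_ms = self.target_ms / (num / den)
        self.samples.clear()
```

The cheap integral update runs on every request, while the more expensive model-based step runs only once per `period` requests, which is how the hybrid design keeps runtime overhead low.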
Optimization Procedure • Objective: maximize total response quality • Input: budget, pending requests • Output: assigned processing time to requests • Optimization procedure depends on applications
Bing index server • Core part of Bing search • For a user query, match and rank docs, return top results • Concave quality profile • The first half of a request's execution yields a higher quality gain than the second half.
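A concave quality profile can be illustrated with a saturating exponential. The functional form and the `sharpness` parameter are assumptions chosen for illustration; the actual profile of the index server is not given in the slides.

```python
import math


def concave_quality(executed_ms, demand_ms, sharpness=3.0):
    """Map execution time to response quality in [0, 1] (hypothetical form).

    Quality rises quickly at first and flattens as execution approaches the
    request's full service demand, so early work is worth more than late work.
    """
    if demand_ms <= 0:
        raise ValueError("demand_ms must be positive")
    fraction = min(max(executed_ms / demand_ms, 0.0), 1.0)
    return (1.0 - math.exp(-sharpness * fraction)) / (1.0 - math.exp(-sharpness))


first_half_gain = concave_quality(5.0, 10.0)
second_half_gain = concave_quality(10.0, 10.0) - first_half_gain
```

Under this assumed profile, `first_half_gain` exceeds `second_half_gain`, which is the property that makes truncating a long request cheaper in quality terms than dropping a whole request.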
Optimization Procedure for Index Server • Run the portion of requests with higher gain • Prevent long requests from starving short ones • Combine two techniques • Reservation at light load: • Reserve time for later requests in the queue based on mean service demand • Equal sharing at heavy load: • Allocate resource equally among requests
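The two allocation techniques can be sketched as below. Both functions are one possible reading of the slide, not the authors' implementation: the names, the capping at each request's demand, and the way the reservation is computed are assumptions.

```python
def equal_share(budget_ms, demands_ms):
    """Heavy load: split the budget equally, capped at each request's demand.

    Requests are visited from shortest to longest so that time a short
    request cannot use is passed on to the longer ones.
    """
    alloc = [0.0] * len(demands_ms)
    remaining = budget_ms
    order = sorted(range(len(demands_ms)), key=lambda i: demands_ms[i])
    for pos, i in enumerate(order):
        share = remaining / (len(order) - pos)
        alloc[i] = min(demands_ms[i], share)
        remaining -= alloc[i]
    return alloc


def reserve_for_later(budget_ms, demands_ms, mean_demand_ms):
    """Light load: serve in queue order, holding back the mean demand
    for every request still waiting behind the current one."""
    alloc = []
    remaining = budget_ms
    n = len(demands_ms)
    for pos, demand in enumerate(demands_ms):
        reserved = mean_demand_ms * (n - pos - 1)
        available = max(0.0, remaining - reserved)
        grant = min(demand, available)
        alloc.append(grant)
        remaining -= grant
    return alloc
```

For example, `equal_share(30, [5, 20, 40])` gives the short request its full 5 ms and splits the remaining 25 ms between the two longer ones, so a long request cannot starve the short one. `reserve_for_later(50, [40, 10, 10], 10)` truncates the long head request to 30 ms so that the two later requests still receive 10 ms each.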
Evaluation • Implemented and evaluated at Bing index server • Meet response time target and achieve high quality • Simulation study on finance server • Double system throughput at desired quality
Bing Index Server • Implementations • BudgetIS • Feedback control on budget • Hybrid control + optimization procedure • QueueIS • Feedback control on queue length • Evaluation • Production trace
Compare Queue vs. Budget Approach Mean response time = 35ms • Budget approach • Meets the response time target accurately • Achieves high quality
Conclusion • Propose a budget-based control optimization model for interactive services with partial execution • Hybrid control mechanism to meet response time target • Optimization procedure to improve response quality • Evaluation • Implemented and evaluated at Bing index server • Meet response time target and achieve high quality • Simulation study on finance server • Double system throughput at desired quality