Explore new analysis techniques for efficient server farm resource utilization and management in the era of cloud computing. Develop models to address key design decisions and optimize server performance.
Stochastic Models and Analysis for Resource Management in Server Farms Thesis Proposal VARUN GUPTA
SMART : Stochastic Models and Analysis for Resource managemenT in server farms Thesis Proposal VARUN GUPTA
SARCASM : Stochastic models and Analysis for ResourCe mAnagement in Server farMs Thesis Proposal VARUN GUPTA
Server farm popularity is on the rise
Examples: Supercomputers, Data center pods, Cloud computing, Array-of-Wimpy-Nodes, Multi-core chips
+ high compute capacity
+ incremental growth
+ fault-tolerance
+ efficient resource utilization
+ energy efficiency
+ high parallelism
A simple server farm “template” (Frontend dispatcher/router + Backend servers)
Design Choice 1: Which server to send a job to?
Design Choice 2: Scheduling policy for backend servers?
Design Choice 3: When to turn servers on/off for efficiency?
Design Choice 4: How many servers to buy? Of what capacity?
A simple server farm “template” (Frontend dispatcher/router + Backend servers)
Design Choice 1: Dispatching policy
Design Choice 2: Scheduling policy
Design Choice 3: Dynamic capacity scaling
Design Choice 4: Provisioning
Thesis Goal
Stochastic modeling and analysis to answer the questions faced by server farm designers/managers
Long history of stochastic modeling and analysis
• Erlang (1909): Operator provisioning in telephone exchanges
• Inventory/production management
• Call center staffing
Several gaps between traditional models and compute server farms
• New constraints
• New opportunities
• New metrics
Bridge these gaps by developing new models and analysis techniques relevant to requirements of today’s server farms
Application 1 : Web server farms/cluster computing
Setting: Immediate dispatch to Processor Sharing (PS) servers
Q: Good load balancing dispatchers? How many servers?
Existing work is limited to First-Come-First-Served servers or Exponential job size distributions
GAP : Processor Sharing servers + high-variance job sizes
Model : Dispatching policies for M/G/K-PS
Poisson arrivals, dispatcher (policy ???), K homogeneous PS servers
• Join-Shortest-Queue (JSQ) : most popular
• Balances load
• Greedy
Q: Is JSQ optimal for general job size distribution?
Bonomi [90] : Optimal for Exponential job size distribution when job sizes unknown
Q: Analysis of JSQ for general job size distribution?
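To make the dispatching question concrete, here is a minimal simulation sketch comparing JSQ against random dispatch. It assumes Exponential job sizes (so each PS server completes jobs at total rate mu whenever it is non-empty and only the queue lengths need tracking); it does not cover the general job size case the slide asks about. The function name `simulate` and all parameter values are illustrative, not taken from the talk.

```python
import random

def simulate(policy, k=4, lam=3.2, mu=1.0, n_events=200_000, seed=1):
    """Estimate mean response time for K PS servers with Exponential job sizes.

    The system is simulated as a continuous-time Markov chain on queue lengths.
    policy is "jsq" (join the shortest queue) or anything else for random dispatch.
    """
    rng = random.Random(seed)
    queues = [0] * k
    area = 0.0   # time-integral of the total number of jobs in the system
    clock = 0.0
    for _ in range(n_events):
        busy = [i for i in range(k) if queues[i] > 0]
        total_rate = lam + mu * len(busy)
        dt = rng.expovariate(total_rate)
        area += sum(queues) * dt
        clock += dt
        if rng.random() < lam / total_rate:
            # Arrival: pick a server according to the dispatching policy.
            if policy == "jsq":
                shortest = min(queues)
                target = rng.choice([i for i in range(k) if queues[i] == shortest])
            else:
                target = rng.randrange(k)
            queues[target] += 1
        else:
            # Departure: each non-empty PS server completes jobs at total rate mu.
            queues[rng.choice(busy)] -= 1
    mean_jobs = area / clock
    return mean_jobs / lam  # Little's law: E[T] = E[N] / lambda

if __name__ == "__main__":
    print("JSQ   :", simulate("jsq"))
    print("RANDOM:", simulate("random"))
```

Under random dispatch each server behaves as an independent M/M/1 queue, which gives a closed-form sanity check for the simulator before trusting the JSQ number.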
Simulation Results
[Figure: Mean Response Time under RANDOM, Round-Robin, JSQ and OPT-0 dispatching for job size distributions Det, Exp, Bim-1, Weib-2, Bim-2, Weib-1; x-axis: increasing job-size variance (same mean)]
Model : Dispatching policies for M/G/K-PS (Poisson arrivals, JSQ dispatching, K homogeneous PS servers)
Conjecture: JSQ is near-optimal (even among size-aware dispatching policies)
Performance of JSQ is “nearly-insensitive” to the job size distribution
Contribution 1: The Single-Queue-Approximation [Performance’07]
• Goal : Approx. for mean response time under Exponential job sizes
• Compensate for the effect of other queues via state-dependent arrival rates
• λ(n) easier to approximate (only need to worry about λ(1), λ(2))
• < 2% error in mean response time for up to 64 servers
M/M/K-JSQ/PS ≈ Mn/M/1/PS, where λ(n) = state-dependent arrival rate
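Once the arrival rates λ(n) are given, an Mn/M/1/PS queue with Exponential job sizes is a birth-death chain, so its mean response time follows from the stationary distribution and Little's law. The sketch below shows only that last step; how the talk approximates λ(n) is not reproduced here, and the rates in `example_rate` are hypothetical placeholders.

```python
def mean_response_time(arrival_rate, mu=1.0, max_jobs=500):
    """Mean response time of an Mn/M/1/PS queue with Exponential job sizes.

    arrival_rate(n) is the arrival rate when n jobs are present.
    The state space is truncated at max_jobs (arrivals are blocked there).
    """
    # Unnormalized stationary weights: pi(n+1) = pi(n) * lambda(n) / mu.
    weights = [1.0]
    for n in range(max_jobs):
        weights.append(weights[-1] * arrival_rate(n) / mu)
    total = sum(weights)
    pi = [w / total for w in weights]
    mean_jobs = sum(n * p for n, p in enumerate(pi))
    # Rate of accepted arrivals, averaged over the stationary distribution.
    throughput = sum(arrival_rate(n) * pi[n] for n in range(max_jobs))
    return mean_jobs / throughput  # Little's law

def example_rate(n):
    """Hypothetical state-dependent rates: fewer arrivals to longer queues."""
    return {0: 1.6, 1: 0.9}.get(n, 0.3)

if __name__ == "__main__":
    print(mean_response_time(example_rate))
```

With a constant rate the function reduces to the M/M/1 formula 1/(μ - λ), which is a quick check that the chain is set up correctly.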
Contribution 2: Many-server heavy-traffic analysis (PROPOSED)
• Goal 1: Quantify the “near-insensitive” behavior
• Goal 2: Optimal dispatching policies for heterogeneous servers
• Hard to prove anything in general, must resort to limiting regimes
• The many-server heavy-traffic scaling
• Shows the effect of job size variability
• Intuition into behavior of JSQ
Scaling: K PS servers under JSQ, λ = K - constant, K → ∞
Application 2 : Energy-Performance trade-off in Data centers/Cloud computing
Q: When to turn servers ON/OFF to adapt to demand?
Existing work assumes zero setup delays and knowledge of the future demand pattern
GAP : non-zero setup penalties + unpredictable demand patterns
Model : Dynamic capacity scaling in M/M/∞ with setup delays
Poisson arrivals, First-In-First-Out buffer; server states: ON, SETUP, OFF
• Contribution: New traffic-oblivious policy DELAYEDOFF [Performance’10]
• Servers turn off after being idle for t_wait
• If an arrival sees all servers busy, turn a new server ON
• Most-Recently-Busy (MRB) dispatching: send job to the server which idled last
• Theorem: Under DELAYEDOFF, as the load […], the number of ON servers is concentrated around […]
DELAYEDOFF is asymptotically optimal
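The three DELAYEDOFF rules above can be sketched as a small bookkeeping class. This is a simplified illustration, not the policy as analyzed in the talk: it ignores the setup delay and the FIFO buffer (a newly started server is treated as immediately usable), and the class and method names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class DelayedOffFarm:
    """Sketch of DELAYEDOFF with Most-Recently-Busy (MRB) dispatching."""
    t_wait: float
    # Idle servers as (server_id, idle_since), most recently idled last.
    idle: list = field(default_factory=list)
    busy: set = field(default_factory=set)
    next_id: int = 0
    setups_started: int = 0

    def expire(self, now):
        """Turn OFF every server that has been idle for at least t_wait."""
        self.idle = [(s, t) for (s, t) in self.idle if now - t < self.t_wait]

    def arrive(self, now):
        """Dispatch an arriving job and return the chosen server id."""
        self.expire(now)
        if self.idle:
            # MRB: reuse the server that became idle most recently.
            server, _ = self.idle.pop()
        else:
            # All ON servers are busy: turn a new server ON.
            server = self.next_id
            self.next_id += 1
            self.setups_started += 1
        self.busy.add(server)
        return server

    def depart(self, server, now):
        """Record that a server finished its job and starts its idle timer."""
        self.busy.remove(server)
        self.idle.append((server, now))
```

Calls are assumed to be made in non-decreasing time order, so the idle list stays sorted by idle time. MRB concentrates work on a few servers, which lets the remaining idle servers reach t_wait and switch off.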
Simulation Results for DELAYEDOFF
PROPOSED: Refine DELAYEDOFF, prove performance guarantees
Application 3 : Fully replicated databases
Setting: servers fed from a First-In-First-Out buffer
Q: How many servers, and what speed?
No exact analysis; existing approximations are good only for low job size variance
GAP : Job sizes have very high variance
Model : M/G/K/FCFS
Poisson arrivals, First-In-First-Out buffer
C² = squared coeff. of variation of job size dist.; typically C² > 20
The Holy Grail of queueing theory (model for many other applications), yet no exact analysis!
Approximation: Lee-Longton [1959]
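The formula on this slide did not survive extraction. The two-moment approximation usually attributed to Lee and Longton scales the M/M/K mean delay by (C² + 1)/2; the sketch below computes it that way, with the M/M/K delay taken from the Erlang-C formula. Treat the function names and the example parameters as illustrative assumptions.

```python
from math import factorial

def erlang_c(k, a):
    """Probability that an arrival must wait in M/M/k, offered load a = lambda/mu < k."""
    rho = a / k
    tail = a**k / factorial(k) / (1 - rho)
    head = sum(a**n / factorial(n) for n in range(k))
    return tail / (head + tail)

def mmk_mean_delay(k, lam, mean_size):
    """Mean waiting time (excluding service) in M/M/k/FCFS."""
    a = lam * mean_size
    return erlang_c(k, a) * mean_size / (k - a)

def lee_longton_mean_delay(k, lam, mean_size, scv):
    """Two-moment approximation: E[Delay M/G/k] ~ (C^2 + 1)/2 * E[Delay M/M/k]."""
    return (scv + 1) / 2 * mmk_mean_delay(k, lam, mean_size)

if __name__ == "__main__":
    # Hypothetical example: 4 servers, load 0.8 per server, C^2 = 20.
    print(lee_longton_mean_delay(k=4, lam=3.2, mean_size=1.0, scv=20.0))
```

For K = 1 the expression coincides with the Pollaczek-Khinchine formula, so it is exact there; for K > 1 it uses only the first two moments of the job size distribution.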
Contribution 1: Inapproximability results [QUESTA’10]
• Goal: Show that no approximation based only on the first 2 moments can be accurate
• Pick a subclass of distributions
• Analytically tractable
• Large enough to fix 2 moments, but wiggle room to prove gap
[Figure: E[Delay] over {G | 2 moments} for H2 job size distributions with increasing 3rd moment, compared against the Lee-Longton Approximation]
Contribution 2: Tight moment-based bounds
• Goal: Better approximation using n moments?
• [QUESTA ’10] : Conjectured extremal distributions
• M/G/K/FCFS under light-traffic
• Extremality should be invariant to load
• Verify conjectures for n = 2,3
• Also for other queueing systems with no exact analysis
[Figure: tight bounds on E[Delay] over {G | n moments}; n = 4, 5, 6… marked as proposed work]
Application 5 : Managing VMs in the cloud
Q: Which server to start VM on? What capacity servers to buy?
Existing work: Assumption of permanent items
GAP : VMs depart + VM migration possible
Contribution : Stochastic bin packing model with job departure/migration
PROPOSED: develop packing/migration schemes for efficient packing
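To illustrate what bin packing with departures looks like, here is a minimal sketch that places VMs with the standard First Fit heuristic and frees space when a VM departs. First Fit is used only as a familiar baseline; it is not the packing/migration scheme proposed in the talk, migration is not modeled, and the class name and sizes are hypothetical.

```python
class FirstFitCluster:
    """First Fit placement of VMs onto equal-capacity servers, with departures."""

    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self.servers = []  # one dict per server: vm_id -> memory size in MB

    def start_vm(self, vm_id, size_mb):
        """Place the VM on the first server with enough free memory."""
        if size_mb > self.capacity_mb:
            raise ValueError("VM does not fit on any server")
        for idx, vms in enumerate(self.servers):
            if sum(vms.values()) + size_mb <= self.capacity_mb:
                vms[vm_id] = size_mb
                return idx
        # No existing server has room: open a new one.
        self.servers.append({vm_id: size_mb})
        return len(self.servers) - 1

    def stop_vm(self, vm_id):
        """Remove a departing VM, freeing its memory on whichever server holds it."""
        for vms in self.servers:
            if vms.pop(vm_id, None) is not None:
                return

    def active_servers(self):
        """Number of servers currently hosting at least one VM."""
        return sum(1 for vms in self.servers if vms)
```

Departures leave holes that later arrivals may or may not fill, which is the inefficiency that migration is meant to repair.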
Summary (Application | GAP | Status | Proposed Work)
• Web server farms | No analysis for PS server farms | 70% Completed | Optimal dispatch policies for heterogeneous servers; characterizing “near-insensitivity” (Nov ’10 - Jan ’11)
• Energy management in Data centers | Setup penalties + unpredictable demands | 80% Completed | Refine the proposed DELAYEDOFF policy; performance guarantees for traffic-oblivious capacity scaling (Feb - Mar ’11)
• Fully replicated DBs | High variance in job sizes | Completed | Verifying conjectures on tight moment-based bounds beyond the scope of the thesis
• Database Thrashing | | Completed |
• VM management | VM migration and departures | 10% Completed | Develop and analyze heuristics for online stochastic bin packing with item departures and migrations (Oct ’10 - Jan ’11)
Expected Graduation: MAY 2011