Server Consolidation

Server Consolidation • Xiujiao Gao • xiujiaog@buffalo.edu • 12/02/2011

Overview • Introduction • Server consolidation problems and solutions • Static Server Allocation Problem (SSAP) and its extensions [1] • Shares and Utilities based Server Consolidation [2] • Server Consolidation with Dynamic Bandwidth Demand [3] • Conclusion

Introduction • Server Consolidation • The process of combining the workloads of several different servers or services on a set of target (physical) servers • The Gartner Group estimates that the utilization of servers in data centers is less than 20 percent

Introduction • Server Virtualization • Provides the technical means to consolidate multiple servers, leading to increased utilization of physical servers • A virtual machine appears to a “guest” operating system as hardware, but it is simulated in a contained software environment by the host system • Reduced deployment time and easier system management, which lower hardware and operating costs

Overview • Introduction • Server consolidation problems and solutions • Static Server Allocation Problem (SSAP) and its extensions [1] • Shares and Utilities based Server Consolidation [2] • Server Consolidation with Dynamic Bandwidth Demand [3] • Conclusion

SSAP and its Extensions [1] • Decision problems • Available data in data centers • Problem Formulation • Complexity and Algorithms • Experimental Setup • Simulation Results

Decision Problems • The model applies to three widespread scenarios • Investment decisions • Operational costs (i.e., energy, cooling, and administrative costs) • A rack of identical blade servers (which subset of servers to use) • Minimize the sum of server costs in terms of purchasing, maintenance, administration, or a combination of them ($c_i$ has a different meaning in each scenario)

Available Data in Data Centers • Data centers reserve certain amounts of IT resources for each single service or server • CPU capacity: SAPS or HP computons • Memory: gigabytes • Bandwidth: megabits per second • Resource demand has seasonal patterns on a daily, weekly, or monthly basis • The authors of [1] use a large set of workload traces from their industry partner http://doi.ieeecomputersociety.org/10.1109/TSC.2010.25

An Example of Available Data

Available Data in Data Centers • Workload traces can change over extended time periods • IT service managers monitor workload developments regularly • Servers are reallocated if necessary • Models cover both initial and subsequent allocation problems

Problem Formulation • Static Server Allocation Problem (SSAP) • Static Server Allocation Problem with variable workload (SSAPv) • Extensions of the SSAP • Max No. of Services Constraints • Separation Constraints • Combination Constraints • Technical Constraints and Preassignment Constraints • Limits on the number of reallocations

SSAP • n services $j \in J$ are to be served by m servers $i \in I$ • Different types of resources $k \in K$ • Server i has a certain capacity $s_{ik}$ of resource k • $c_i$ describes the potential cost of server i • Service j orders $u_{jk}$ units of resource k • $y_i$ are binary decision variables indicating which servers are used • $x_{ij}$ are binary decision variables describing which service is allocated to which server

SSAP • The SSAP represents a single service’s resource demand as constant over time (side constraints 2)
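The formula on this slide did not survive extraction. As a sketch reconstructed only from the variable definitions on the previous slide (servers $i \in I$, services $j \in J$, resources $k \in K$, costs $c_i$, capacities $s_{ik}$, demands $u_{jk}$), the SSAP can be written as the following binary program. The grouping and order of the constraints are an assumption and may not match the numbering used in [1].

```latex
\begin{align*}
\min \quad & \sum_{i \in I} c_i \, y_i \\
\text{s.t.} \quad
  & \sum_{i \in I} x_{ij} = 1
      && \forall j \in J
      && \text{(each service runs on exactly one server)} \\
  & \sum_{j \in J} u_{jk} \, x_{ij} \le s_{ik} \, y_i
      && \forall i \in I,\ \forall k \in K
      && \text{(capacity of every resource on every used server)} \\
  & y_i,\ x_{ij} \in \{0, 1\}
      && \forall i \in I,\ \forall j \in J
\end{align*}
```

Because $u_{jk}$ has no time index, the capacity constraints treat each service's demand as a single constant value.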

SSAPv • Considers variations in the workload • Time is divided into a set of intervals T indexed by $t \in \{1, \dots, r\}$ • $u_{jkt}$ describes how much capacity service j requires from resource type k in time interval t • $u_{jkt}$ depends on the load characteristics of the servers to be consolidated
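The benefit of the time index can be illustrated with a small feasibility check. This is a minimal sketch, not code from [1]: the function name `ssapv_feasible` and the data layout (`demand[j][k][t]`, `capacity[i][k]`) are assumptions chosen for illustration. It verifies that, for every server, resource, and interval, the summed demand stays within capacity.

```python
from typing import Dict, List


def ssapv_feasible(
    assign: Dict[int, int],
    demand: List[List[List[int]]],
    capacity: List[List[int]],
) -> bool:
    """Check the SSAPv capacity constraints for a given allocation.

    assign[j]       -> server index i hosting service j
    demand[j][k][t] -> demand of service j for resource k in interval t
    capacity[i][k]  -> capacity of server i for resource k
    """
    load: Dict[tuple, int] = {}
    for j, i in assign.items():
        for k, series in enumerate(demand[j]):
            for t, u in enumerate(series):
                load[(i, k, t)] = load.get((i, k, t), 0) + u
    # Every (server, resource, interval) load must fit into the server.
    return all(total <= capacity[i][k] for (i, k, _t), total in load.items())


# Two services with complementary peaks, one resource, two intervals.
time_varying = [[[8, 2]], [[2, 8]]]
# The same services described only by their peak demand (SSAP view).
constant_peak = [[[8, 8]], [[8, 8]]]
one_server = [[10]]
both_on_server_0 = {0: 0, 1: 0}

fits_with_time_index = ssapv_feasible(both_on_server_0, time_varying, one_server)
fits_with_peaks_only = ssapv_feasible(both_on_server_0, constant_peak, one_server)
```

In this toy instance the two services share one server when their daily pattern is modeled, but need two servers when each is represented by its peak demand.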

Extensions of SSAP • Max No. of Services Constraints • Separation Constraints • Combination Constraints • Technical Constraints • Limits on the number of reallocations

Complexity and Algorithms • SSAP is strongly NP-hard • A straightforward proof reduces the multidimensional bin packing problem (MDBP) to SSAP http://doi.ieeecomputersociety.org/10.1109/TSC.2010.25 • NP-hardness does not necessarily mean that the problem is intractable for practical problem sizes • Open questions: which problem sizes can be solved exactly, and how far heuristic solutions can get in terms of both problem size and solution quality

Complexity and Algorithms • Polynomial-time approximation schemes (PTAS) with worst-case guarantees on the solution quality of MDBP have been published • The first important result was produced in C. Chekuri and S. Khanna, “On Multi-Dimensional Packing Problems,” Proc. ACM-SIAM Symp. Discrete Algorithms, pp. 185-194, 1999 • For any fixed ε > 0, the algorithm delivers an approximate solution for constant d (d is the dimension of MDBP) • The algorithm works in two steps

Algorithms for MDBP • First step • Solve a linear programming relaxation, which makes fractional assignments for at most dm vectors (d dimensions, m bins) • Second step • Assign the set of fractionally assigned vectors greedily: find the largest possible set

Algorithms for SSAP(v) • SSAP with only one resource • Branch & Bound (SSAP B&B) • First Fit (FF) • First-Fit Decreasing (FFD) • SSAPv • Branch & Bound (SSAPv B&B) • LP-relaxation-based heuristic (SSAPv Heuristic) • Use the results of an LP relaxation • Use an integer program to find an integral assignment (in contrast to the PTAS)
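The two bin-packing heuristics named above can be sketched for the single-resource case with identical servers. This is a minimal illustration, not the Java implementation used in [1]; the function names and the list-based data layout are assumptions.

```python
from typing import List, Tuple


def first_fit(demands: List[int], capacity: int) -> Tuple[List[int], int]:
    """Place each service on the first open server with enough room.

    Returns (assignment, servers_used), where assignment[j] is the
    server index of service j.
    """
    free: List[int] = []          # remaining capacity of each open server
    assignment: List[int] = []
    for u in demands:
        if u > capacity:
            raise ValueError("a service exceeds the capacity of one server")
        for i, room in enumerate(free):
            if u <= room:
                free[i] -= u
                assignment.append(i)
                break
        else:
            # No open server fits: open a new one.
            free.append(capacity - u)
            assignment.append(len(free) - 1)
    return assignment, len(free)


def first_fit_decreasing(demands: List[int], capacity: int) -> Tuple[List[int], int]:
    """Run First Fit on the services sorted by decreasing demand."""
    order = sorted(range(len(demands)), key=lambda j: demands[j], reverse=True)
    sorted_assignment, used = first_fit([demands[j] for j in order], capacity)
    assignment = [0] * len(demands)
    for position, j in enumerate(order):
        assignment[j] = sorted_assignment[position]  # map back to input order
    return assignment, used


demands = [3, 3, 3, 7, 7, 7]
ff_assignment, ff_servers = first_fit(demands, capacity=10)
ffd_assignment, ffd_servers = first_fit_decreasing(demands, capacity=10)
```

On this instance First Fit packs the three small services together and then needs one server per large service, while sorting first lets each large service share a server with a small one.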

Algorithms for SSAP(v) • For SSAP B&B, SSAPv B&B, and SSAPv Heuristic, the number of servers has a significant impact on the computation time • Each additional server increases the number of binary decision variables by n+1 • Use a specific iterative approach to keep the number of binary variables as low as possible • Lower bound (LB) on the number of servers: assumes identical capacity s and fractional allocation of services

Algorithms for SSAP(v) • Start solving the problem with m set to the LB • If the problem is infeasible, increment m by 1 • Repeat until a feasible solution is found • Because the servers are identical and m grows by one at a time, the first feasible solution found in the B&B search tree is already optimal
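The iterative scheme can be sketched for one resource and identical servers. This is an illustration under stated assumptions: the lower bound is taken as total demand divided by the server capacity, rounded up (fractional allocation), and a small exact backtracking search stands in for the B&B solver used in [1]. All function names are hypothetical.

```python
import math
from typing import List


def lower_bound(demands: List[int], capacity: int) -> int:
    """Servers needed if services could be split fractionally."""
    return math.ceil(sum(demands) / capacity)


def fits(demands: List[int], capacity: int, m: int) -> bool:
    """Exact check (backtracking): can all services be packed on m servers?"""
    free = [capacity] * m
    items = sorted(demands, reverse=True)   # large items first prunes faster

    def place(idx: int) -> bool:
        if idx == len(items):
            return True
        tried = set()                       # skip servers with equal free room
        for i in range(m):
            if free[i] >= items[idx] and free[i] not in tried:
                tried.add(free[i])
                free[i] -= items[idx]
                if place(idx + 1):
                    return True
                free[i] += items[idx]       # undo and try the next server
        return False

    return place(0)


def min_servers(demands: List[int], capacity: int) -> int:
    """Start at the lower bound and add one server until feasible."""
    if any(u > capacity for u in demands):
        raise ValueError("a service exceeds the capacity of one server")
    m = lower_bound(demands, capacity)
    while not fits(demands, capacity, m):
        m += 1
    return m


lb = lower_bound([6, 6, 6], 10)
optimum = min_servers([6, 6, 6], 10)
```

Since every smaller value of m has already been shown infeasible, the first m for which `fits` succeeds is the minimum number of servers.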

Experimental Data • 3 consecutive months measured in intervals of 5 minutes • 160 traces for the resource usage of Web/Application/Database servers (W/A/D) • 259 traces describing the load of servers exclusively hosting ERP applications • Resource demands are in terms of CPU and memory • Strong diurnal seasonality in nearly all servers, and some weekly seasonality • CPU is the bottleneck resource for these types of applications • CPU demand of ERP services is significantly higher than that of W/A/D services

Data Preprocessing • Characterize the daily patterns in the workload traces discretely and solve the allocation problem as a discrete optimization problem • Two-step process to derive the parameters $u_{jkt}$ for the optimization models from the original workload traces • $u_{jkt}^{raw}$: original workload traces • $u_{jkt}$: an estimator derived from the set of $u_{jkt}^{raw}$

Data Preprocessing • First step • Derive an estimator for each interval • A day is the period of observation • p: number of periods contained in the load data (p = 92) • ϒ’: number of intervals in a single period (ϒ’ = 288) • Derive $u_{jkt}$ from the resulting distribution of the p raw values in each interval

Data Preprocessing • The y-axis captures a sample of about 92 values • Risk attitude: the 0.95-quantile of $U_{jkt}$ is an estimator for the resource requirement of service j at which 95 percent of requests can be satisfied
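The first preprocessing step can be sketched as follows. This is a minimal illustration assuming a trace stored as one flat list of measurements, day after day; the function names and the particular empirical quantile definition (smallest sample value with at least a fraction q of the samples at or below it) are assumptions, since the slides do not state which definition [1] uses.

```python
import math
from typing import List


def quantile(values: List[float], q: float) -> float:
    """Smallest sample value with at least a fraction q of samples <= it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def daily_profile(raw: List[float], intervals_per_day: int, q: float = 0.95) -> List[float]:
    """Estimate one demand value per interval of the day.

    raw holds consecutive measurements; interval t of day d is at index
    d * intervals_per_day + t. For each t, the q-quantile is taken over
    the values observed in that interval across all complete days.
    """
    days = len(raw) // intervals_per_day
    return [
        quantile([raw[d * intervals_per_day + t] for d in range(days)], q)
        for t in range(intervals_per_day)
    ]


# Toy trace: 4 days with 3 intervals per day.
trace = [1, 10, 5,
         2, 20, 5,
         3, 30, 5,
         4, 40, 5]
median_profile = daily_profile(trace, intervals_per_day=3, q=0.5)
worst_case_profile = daily_profile(trace, intervals_per_day=3, q=1.0)
```

With the data on the slides, the call would use `intervals_per_day=288` on a trace covering 92 days, so each estimate is drawn from 92 observations.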

Data Preprocessing • Second step • Aggregate these intervals to reduce the number of parameters for the optimization
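The aggregation step can be sketched as below. The slides do not say how the values within an aggregated interval are combined; taking the maximum is an assumption made here because it never underestimates demand. The function name `aggregate_max` is hypothetical.

```python
from typing import List


def aggregate_max(profile: List[float], group_size: int) -> List[float]:
    """Merge consecutive intervals, keeping the largest value of each group."""
    if group_size <= 0 or len(profile) % group_size != 0:
        raise ValueError("profile length must be a multiple of group_size")
    return [
        max(profile[start:start + group_size])
        for start in range(0, len(profile), group_size)
    ]


coarse = aggregate_max([1, 5, 2, 3, 4, 0], group_size=2)
# 288 five-minute intervals become 24 hourly intervals with group_size=12.
hourly = aggregate_max(list(range(288)), group_size=12)
```

Fewer intervals mean fewer capacity constraints per server and resource in SSAPv, at the price of a coarser and more conservative demand profile.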

Experimental Design • Model (SSAP and SSAPv) • Algorithms (B&B, Heuristic, FF, FFD) • Service type (W/A/D, ERP) • Number of services • Server capacity (CPU only) • Risk attitude • Number of time intervals considered in SSAPv • Sensitivity with respect to additional allocation constraints

Experimental Design • lp_solve 5.5.9: revised simplex and B&B • COIN-OR CBC branch-and-cut IP solver with the CLP LP solver • Java 1.5.0: FF and FFD • Timeout is 20 minutes (already up to 700 servers)

Simulation Results • Computation Time Depending on Problem Size • Examine 24 time intervals • 95th percentile of 5-minute intervals • 5000 SAPS server capacity • For each number of services, 20 instances were sampled (with replacement) • x-axis: number of services • y-axis: computation time • y-axis: proportion of instances solvable within 20 minutes

Computation time Depending on Problem Size

Proportion of solvable W/A/D instances

Proportion of solvable ERP instances • Within 20 minutes, only much smaller instances can be solved than for W/A/D services

Solution Quality Depending on Problem Size • The computed number of required servers may exceed the lower bound on the number of servers • The ratio of the computed number to the lower bound is the solution quality Q • The closer Q is to 1, the better the solution

Solution Quality Depending on Problem Size

Solution Quality Depending on Problem Size • W/A/D • ERP

Impact of Risk Attitude on Solution Quality • The previous simulations assumed that the decision maker selects the 95th percentile in data preprocessing • 95 percent of the historical service demand would have been satisfied without delay at this capacity • Risk attitude • Actual overbooking of server resources (aggregate demands) • More conservative estimate (reduction in variance) • Analysis of capacity violations • 10 different consolidation problems of 250 W/A/D services • Quantiles: 0.4, 0.45, ..., 1 • Solved with SSAPv B&B

Impact of Risk Attitude on Solution Quality

Influence of the Interval Size • SSAP • SSAPv

Influence of Additional Allocation Constraints • Upper bound on the number of services per server • The number of servers increases • Computation time increases • Combination and separation constraints • Little effect on the solution quality • Negative impact on computation time • Technical constraints • Little effect on the number of servers needed • Computation time decreases

Overview • Introduction • Server consolidation problems and solutions • Static Server Allocation Problem (SSAP) and its extensions [1] • Shares and Utilities based Server Consolidation [2] • Server Consolidation with Dynamic Bandwidth Demand [3] • Conclusion

Shares and Utilities based Server consolidation [2] • Min, max and shares • Problem formulation • Algorithms • Basic Overprovision (BO) • Greedy Max (GM) • Greedy Min Max (GMM) • Expand Min Max (EMM) • Power Expand Min Max (PEMM) • Hypothetical Upper Bound Algorithm (HUB) • Experimental Evaluation

Min, Max and Shares • Not all applications are created equal • Different priorities • High-priority applications: e-commerce web server • Low-priority applications: intranet blogging server • Different resource affinities • Example: a web server may value additional CPU cycles much more than a storage backup does • Under high load, CPU resources are best allocated to the higher-utility application (the web server)

Min, Max and Shares • Take advantage of the Min, Max and Shares parameters • Min: ensures that a VM receives at least that amount of resources when it is powered on • Max: ensures that a low-priority application does not use more than that amount, keeping resources available for high-priority applications • Shares: guide the virtualization scheduler in distributing resources between contending VMs (e.g., a shares ratio of 1:4)
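How the three parameters interact under contention can be sketched with a simplified model. This is not the scheduler of any real hypervisor: it assumes that every VM first receives its Min, that the remaining capacity is split in proportion to Shares, and that any amount above a VM's Max is redistributed to the other VMs. The function name `allocate` and the tuple layout are hypothetical.

```python
from typing import List, Tuple


def allocate(capacity: float, vms: List[Tuple[float, float, float]]) -> List[float]:
    """Distribute capacity among VMs given as (min, max, shares) tuples."""
    alloc = [mn for mn, _mx, _sh in vms]
    remaining = capacity - sum(alloc)
    if remaining < 0:
        raise ValueError("the minimum reservations exceed the capacity")
    # VMs that can still grow and take part in the proportional split.
    active = [i for i, (mn, mx, sh) in enumerate(vms) if mx > mn and sh > 0]
    while remaining > 1e-9 and active:
        total_shares = sum(vms[i][2] for i in active)
        # VMs whose proportional slice would reach or exceed their Max.
        capped = [
            i for i in active
            if alloc[i] + remaining * vms[i][2] / total_shares >= vms[i][1]
        ]
        if not capped:
            for i in active:
                alloc[i] += remaining * vms[i][2] / total_shares
            remaining = 0.0
            break
        for i in capped:
            remaining -= vms[i][1] - alloc[i]
            alloc[i] = vms[i][1]
        active = [i for i in active if i not in capped]
    return alloc


# Low-priority and high-priority VM with a shares ratio of 1:4.
uncapped = allocate(100, [(10, 100, 1), (10, 100, 4)])
# The same shares reversed, but the favored VM is limited by its Max of 30.
with_cap = allocate(100, [(10, 30, 4), (10, 100, 1)])
```

In the first call the 80 units left after the minimums are split 16 to 64; in the second, the capped VM stops at 30 and the surplus flows to the other VM.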

Min, Max and Shares Impact • Experiment • 3 VMware ESX servers • 12 VMs (6 low priority and 6 high priority) • Low load: desire 35% of the total available CPU • High load: desire 100% of the total available CPU • Under high load conditions, MMS delivers 47% more utility than BASE

Problem Formulation • The set of VMs • $V_i.m$: minimum resources needed (CPU only) • $V_i.M$: maximum resources needed (CPU only) • $V_i.u$: utility derived from the VM when it is allocated $V_i.m$ • $V_i.U$: utility derived from the VM when it is allocated $V_i.M$ • The set of physical servers • $C_j$: the CPU capacity of server $S_j$ • $P_j$: power cost of server $S_j$ if it is turned on

Problem Formulation • The set of VMs allocated to server $S_j$

Problem Formulation • Maximize • Subject to • A unique multi-knapsack problem • Items can be elastic between min and max • Try to find the best size

Algorithms-BO • Power-aware: chooses servers with lower power cost per unit of resource first • First-fit • Packs VMs at their maximum requirements • Conservative: uses only 9/10 of each server’s capacity • Fails to favor higher-utility VMs
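The properties listed for Basic Overprovision can be combined into a short sketch. This is an illustration built only from those bullet points, not the implementation from [2]: the function name, the tuple layout for servers, and the choice to process VMs in input order are assumptions.

```python
from typing import Dict, List, Tuple


def basic_overprovision(
    vm_max: List[float],
    servers: List[Tuple[float, float]],
    fill: float = 0.9,
) -> Dict[int, int]:
    """First-fit placement of VMs at their maximum CPU requirement.

    vm_max[i]  -> maximum CPU requirement of VM i
    servers[j] -> (cpu_capacity, power_cost) of server j
    fill       -> fraction of each server's capacity that may be used
    Returns {vm index: server index}; VMs that fit nowhere are left out.
    """
    # Power-aware order: cheapest power cost per unit of CPU first.
    order = sorted(range(len(servers)), key=lambda j: servers[j][1] / servers[j][0])
    free = {j: fill * servers[j][0] for j in order}
    placement: Dict[int, int] = {}
    for i, need in enumerate(vm_max):
        for j in order:
            if need <= free[j]:
                free[j] -= need
                placement[i] = j
                break
    return placement


# Server 1 is cheaper per unit of CPU (20/100) than server 0 (50/100).
servers = [(100, 50), (100, 20)]
placement = basic_overprovision([60, 40, 25, 95], servers)
```

Utilities never enter the decision: a VM is placed or skipped purely by whether its maximum fits, which is why the approach cannot favor higher-utility VMs.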
