server consolidation n.
Skip this Video
Loading SlideShow in 5 Seconds..
Server Consolidation PowerPoint Presentation
Download Presentation
Server Consolidation

Loading in 2 Seconds...

play fullscreen
1 / 91

Server Consolidation - PowerPoint PPT Presentation

  • Uploaded on

Server Consolidation. Xiujiao Gao 12/02/2011. Overview. Introduction Server consolidation problems and solutions Static Server Allocation Problems (SSAP) and its extensions [1] Shares and Utilities based Server Consolidation [2]

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'Server Consolidation' - xylia

Download Now An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
server consolidation

Server Consolidation



  • Introduction
  • Server consolidation problems and solutions
    • Static Server Allocation Problems (SSAP) and its extensions[1]
    • Shares and Utilities based Server Consolidation[2]
    • Server Consolidation with Dynamic Bandwidth Demand[3]
  • Conclusion
  • Server Consolidation
    • The process of combining the workloads of several different servers or services on a set of target (physical) servers
  • The Gartner Group estimates that the utilization of servers in datacenter is less than20 percent.



  • Server Virtualization
    • Provide technical means to consolidate multiple servers leading to increased utilization of physical servers
    • Virtual machine appears to a “guest” operating system as hardware, but it is simulated in a contained software environment by the host system
    • Reduced time for deployment, easier system management— lower hardware and operating costs
  • Introduction
  • Server consolidation problems and solutions
    • Static Server Allocation Problems (SSAP) and its extensions[1]
    • Shares and Utilities based Server Consolidation[2]
    • Server Consolidation with Dynamic Bandwidth Demand[3]
  • Conclusion
ssap and its extensions 1
SSAP and its Extensions [1]
  • Decision problems
  • Available data in data centers
  • Problem Formulation
  • Complexity and Algorithms
  • Experimental Setup
  • Simulation Results
decision problems
Decision Problems
  • It applies to three widespread scenarios
    • Investment decision
    • Operational costs(i.e. energy, cooling and administrative cost)
    • Rack of identical blade servers (which subset of servers to use)
  • Minimize the sum of server costs in terms of purchasing, maintenance, administration or sum of them (cihas different meanings)
available data in data centers
Available Data in Data Centers
  • Date centers reserve certain amounts of IT resources for each single service or server
    • CPU capacity — SAPS or HP computons
    • Memory —Gigabyte
    • Bandwidth —Megabits per second
  • Resource demand has seasonal patterns on a daily, weekly or monthly basis
    • large set of workload traces from their industry partner

available data in data centers1
Available Data in Data Centers
  • Workloads traces can change in extended time periods
  • IT service managers monitor workload developments regularly
  • Reallocate servers if it is necessary
  • Models for initial and subsequent allocation problems
problem formulation
Problem Formulation
  • Static Server Allocation Problems (SSAP)
  • Static Server Allocation Problem with variable workload (SSAPv)
  • Extensions of the SSAP
    • Max-No. of services Constraints
    • Separation Constraints
    • Combination Constraints
    • Technical Constraints and Preassignment Constraints
    • Limit on the number of reallocations
  • n services jϵJ that are to be served by m servers iϵI
  • Different types of resources k ϵK
  • Serverihas a certain capacity sikof resource k
  • cidescribes the potential cost of a server
  • Service j ordersujkunits of resource k
  • yi are binary decision variables indicating which servers are used
  • xijdescribes which service is allocated on which server

The SSAP represents a single service’s resource demand

as constant over time (side constraints 2)

  • Consider variations in the workload
  • Time is divided into a set of intervals T indexed by t={1,….r}
  • ujkt describes how much capacity service j requires from resource type k in time interval t
  • ujktdepend on the load characteristics of the servers to be consolidated
extensions of ssap
Extensions of SSAP
  • Max No. of Services Constraints
  • Separation Constraints
  • Combination Constraints
  • Technical Constraints
  • Limits on the number of reallocations
complexity and algorithms
Complexity and Algorithms
  • SSAP is strongly NP-hard
  • A straightforward proof by reducing SSAP to the multidimensional bin packing problem (MDBP)
  • NP-hard does not necessarily mean that it is intractable for practical problem sizes
  • Which problem sizes can be solved exactly and how far one can get with heuristic solutions, both in terms of problem size and solution quality
complexity and algorithms1
Complexity and Algorithms
  • Polynomial-time approximation schemes (PTAS) with worst-case guarantees on the solution quality of MDBP have been published.
  • The first important result was produced in

C. Chekuri and S. Khanna, “On Multi-Dimensional Packing Problems,” Proc. ACM-SIAM Symp. Discrete Algorithms, pp. 185-194, 1999

  • For any fixed ε>0, delivers a

approximate solution for constant d (d is dimension of MDBP)

Two steps

algorithms for mdbp
Algorithms for MDBP
    • First step
      • Solves linear programming relaxation :make fractional assignments for at most dm vectors in d dimensions and m bins
  • Second step
    • The set of fractionally assigned vectors is assigned greedily—find the largest possible set
algorithms for ssap v
Algorithms for SSAP(v)
  • SSAP with only one source
    • Branch & Bound (SSAP B&B)
    • First Fit (FF)
    • First-Fit Decreasing (FFD)
  • SSAPv
    • Branch & Bound (SSAPv B&B)
    • LP-relaxation-based heuristic (SSAPv Heuristic)
      • Use the results of an LP-relaxation
      • Use an integer program to find an integral assignment (Compared to the PTAS)
algorithms for ssap v1
Algorithms for SSAP(v)
  • For SSAP B&B, SSAPv B&B and SSAPv Heuristic, the number of servers used does have a significant impact on the computation time
  • Each additional server increases the number of binary decision variables by n+1
  • Use specific iterative approach to keep the number of binary variables as low as possible

Lower bound number of servers

Same capacity s

Fractional allocation of services

algorithms for ssap v2
Algorithms for SSAP(v)
  • Start to solve the problem with m being the LB
  • If the problem is infeasible, m is incremented by 1
  • Repeat until a feasible solution is found
  • The first feasible solution found in B&B search tree is obviously an optimal solution
experimental data
Experimental Data
  • Experimental Data (3 consecutive months measured in intervals of 5minutes)
    • 160 traces for the resource usage of Web/Application/Database servers (W/A/D)
    • 259 traces describing the load of servers exclusively hosting ERP applications
    • Resources demands are in terms of CPU and memory
    • Strong diurnal seasonality with nearly all servers and some weekly seasonality
    • CPU is the bottleneck resource for these types of applications
    • CPU demand of ERP services is significantly higher than W/A/D
data preprocessing
Data Preprocessing
  • Data Preprocessing: discrete characterization of daily patterns in the workload traces and solve the allocation problem as a discrete optimization problem
  • Two- step process to derive the parameters ujktfor our optimization models from the original workload traces

ujktraworiginal workload traces

ujktan estimator from the set of ujktraw

data preprocessing1
Data Preprocessing
  • First step
    • Derive an estimator for each interval

A day as a period of observation

p number of periods contained in the load data (p=92)

ϒ’ intervals in a single period (ϒ’=288)

Deriveujkt from the above distribution

data preprocessing2
Data Preprocessing

Y-axis captures a sample of about 92values

Risk attitude : 0.95-quantile of Ujktis an estimator for

the resource requirement of service j where 95percent

of requests can be satisfied

data preprocessing3
Data Preprocessing
  • Second step
    • Aggregate these intervals to reduce the number of parameters for the optimization
experimental design
Experimental Design
  • Experimental Design
    • Model (SSAP and SSAPv)
    • Algorithms (B&B, Heuristic, FF,FFD)
    • Service type (W/A/D, ERP)
    • Number of services
    • Server capacity (CPU only)
    • Risk attitude
    • Number of time intervals considered in SSAPv
    • Sensitivity with respect to additional allocation constraints
experimental design1
Experimental Design
  • Experimental Design
    • lp_solve 5.5.9 :revised simplex and B&B
    • COIN-OR CBC branch-and-cut IP solver with the CLP LP server
    • Java 1.5.0 : FF and FFD
    • Time out is 20 mins (already up to 700 servers)
simulation results
Simulation Results
  • Computation time Depending on Problem Size
    • Examine 24 time intervals
    • 95th percentile of 5-minute intervals
    • 5000 SAPS server capacity
    • For each of different numbers of services, 20 instances have been sampled(with replacement)
    • Different number of services—x-axis
    • Computation time—y-axis
    • Proportion of solvable instances within 2o mins—y-axis
proportion of solvable erp instances
Proportion of solvable ERP instances

Solve much smaller instances compared with W/A/D services with 20mins

solution quality depending on problem size
Solution Quality Depending on Problem Size

Computed number of required servers exceeds the lower bound number of servers

Refer to this excess ratio Q as solution quality

The closer Q is to 1, the better the solution is

impact of risk attitude on solution quality
Impact of Risk Attitude on Solution Quality
  • Previous simulation assumed the decision maker to select 95th percentile in data processing
  • Percent of the historical service demand would have been satisfied without delay at this capacity
  • Risk attitude
    • Actual overbooking of server resources (aggregate demands)
    • More conservative estimate (reduction in variance)
  • Analysis of capacity violations
    • 10 different consolidation problems of 250 W/A/D services
    • Quantiles :….1
    • Use SSAPv B&B
influence of additional allocation constraints
Influence of Additional Allocation Constraints
  • Up bound on the number of services per server
    • The number of servers increases
    • Computation time increases
  • Combination and separation constraints
    • Little effect on the solution quality
    • Negative impact on computation time
  • Technical constraints
    • Little effect on the number of servers needed
    • Computation time decreases
  • Introduction
  • Server consolidation problems and solutions
    • Static Server Allocation Problems (SSAP) and its extensions[1]
    • Shares and Utilities based Server Consolidation[2]
    • Server Consolidation with Dynamic BandwidthDemand [3]
  • Conclusion
shares and utilities based server consolidation 2
Shares and Utilities based Server consolidation [2]
  • Min, max and shares
  • Problem formulation
  • Algorithms
    • Basic Overprovision (BO)
    • Greedy Max (GM)
    • Greedy Min Max (GMM)
    • Expand Min Max (EMM)
    • Power Expand Min Max (PEMM)
    • Hypothetical Upper Bound Algorithm (HUB)
  • Experimental Evaluation
min max and shares
Min, Max and Shares
  • Not all the applications are created equal.
  • Different priority
    • High priority applications : e-commerce web server
    • Low priority applications: the intranet blogging server
  • Different resource affinities
    • Ex : web server may value additional CPU cycles much more than a storage backup
  • Under situation of high load, CPU resources are best to allocated to higher utility application-web server
min max and shares1
Min, Max and Shares
  • Take advantage of the Min, Max and Shares parameters
    • Min: ensure VM receive at least that amount of resources when it is power on
    • Max: ensure low priority application does not use more resources and keep them available for high priority applications
    • Shares: provide advice to the virtualization scheduler distribute resources between contending VMs (shares ratio of 1:4)
min max and shares impact experiment
Min, Max and Shares Impact Experiment

3 Vmware ESX servers

12 VMs (6 low priority and 6 high priority)

  • Low load: desire 35% of the total available CPU
  • High load: desire 100% of the total available CPU
  • Under high load conditions, MMS delivers 47% more utility than BASE
problem formulation1
Problem Formulation
  • The set of VMs
    • Vi.mminimum resources needed (CPU only)
    • Vi.Mmaximum resources needed (CPU only)
    • Vi.uutility derived from the VM when it is allocated Vi.m
    • Vi.Uutility derived from the VM when it is allocated Vi.M
  • The set of physical servers
    • Cjthe CPU capacity of the server Sj
    • Pjpower cost for the server Sjif it is turned on
problem formulation2
Problem Formulation

The set of VMs allocated to server Sj

problem formulation3
Problem Formulation
  • Maximize
  • Subject to

Unique Multi-knapsack problem:

Items can be elastic between min and max

Try to find the best size

algorithms bo

Power-aware: choosing lower power cost per unit resource


Packing VMs at their maximum requirements

Conservative use of 9/10 of servers’ capacities

Fail to choose higher utility VMs

algorithm gm gmm
Algorithm-GM & GMM
  • GM: Sorts the VMs by their profitablity
    • Provide higher utility for every unit resource than BO
    • Assign maximum requested allocation
    • Some VMs are more profitable at another smaller size
  • GMM: Sorts the VMs by and
    • When it choose one corresponding to , then


algorithm gmm
  • GMM
    • Pick right combination of VMs especially when some VMs are more profitable at their min
    • Fail to get additional resources even when the incremental value they provide beyond min is much higher than other VMs
    • Fail to get resources when some nodes still have left room after first-fitting
algorithms emm
  • No first-fit fashion
  • Compute an estimated utility for each node if the new VM were added to that node
  • Choose the node gives best utility
  • How to compute the utility of a node?
  • Given a feasible set Q for Sj, use algorithm 2 to estimate the utility of node Sj
compute node utility
Compute Node Utility

Expanding VMs that give the most incremental utility per unit capacity until either the node’s capacity is reached or no more expansion is possible

algorithms emm1
  • Limitation of EMM
  • It tends to use all the servers that it has access to (considering when empty server is available)
  • It is disadvantage in terms of power costs
  • How to detect whether to start a new node or not?
algorithms pemm
  • One important change compared with EMM
  • Use the node utility gain minus the proportional power cost incurred on the new node as the comparative measure
  • If Sj is an already open node
  • If Sj is a new node
algorithms hub
  • Provide a upper bound
    • Relax multi-bin packing constraint by “lining up the servers end-to-end”
    • Allow single VMs to be placed over multiple servers
    • Charge only for each VM’s fractional power consumption over its respective machines
  • Allow achieving 100% capacity usage on each machine thus providing an upper bound
experimental evaluation
Experimental Evaluation
  • Large synthetic data center experiments
    • Simulate thousands of VMs and hundreds of servers
    • Utility and Min-Max inputs: normal distribution
    • Power cost is a percentage of total utility received
  • Real testbed experiments
    • Three Vmware ESX 3.5 servers
    • 12 VMs run RedHat Enterprise Linux (RHEL)
    • Low priority (1) and high priority (4) (6/6)
    • Workloads in each VM were generated using HPCC

HPC Challenge Benchmark,

utility under different number of servers
Utility under Different Number of Servers

MM: Packing VMs at other sizes (compared with BO using maximum size)

150-200 servers: Shrinking resources given to high-profitable VMs in favor of fitting more low-profitable VMs

200-300 servers: more room to expand high –utility VMs

300-350 servers: steady state :place all the VMS; PEMM with higher utility

number of vms excluded under different number of servers
Number of VMs Excluded under Different Number of Servers

All algorithms leave many VMs out up to 150 servers

PEMM and EMM are able to fit all the VMs earliest

utility under different number of vms at min per server
Utility under Different Number of VMs (at min) per Server

Lower numbers: more servers are needed and hence power costs rise leading to lower utility

High numbers: fewer servers are needed and utility goes up

PEMM yields the highest utility placements

utility under different standard deviation of server capacity
Utility under Different Standard Deviation of Server Capacity

As the standard deviation increases, certain servers will have high capacity, lower power to capacity rate and will be filled first, since the utility goes up

computation time under different vms
Computation Time under Different VMs

EMM has a greater execution time as it attempts to use many more servers than PEMM

PEMM can easily scale to larger systems

utility under different percentage of low priority load
Utility under Different Percentage of Low Priority Load

GMM places all VMs in a single server allocating only the min amount of resources resulting in poor utility

PEMM use less servers than EMM to generate more utility

utility under different percentage of power cost
Utility under Different Percentage of Power Cost

As the power cost increases, it becomes increasingly expensive to use extra servers, causing drops in utility

PEMM does the best by adjusting to those increasing cost

  • Introduction
  • Server consolidation problems and solutions
    • Static Server Allocation Problems (SSAP) and its extensions[1]
    • Shares and Utilities based Server Consolidation[2]
    • Server Consolidation with Dynamic Bandwidth Demand[3]
  • Conclusion
server consolidation with dynamic bandwidth demand 3
Server Consolidation with Dynamic Bandwidth Demand [3]
  • Dynamic bandwidth demand: normal distribution
  • Modulate it into Stochastic Bin Packing (SBP)
  • Propose an online algorithm to solve SBP by which the number of servers required is within of the optimal for any
  • Simulation Results
dynamic bandwidth demand
Dynamic Bandwidth Demand
  • Normal Distribution

The mean is positive

The standard deviation is small enough compared with the mean

The probability of bandwidth demand being negative is very small

stochastic bin packing
Stochastic Bin Packing

The total size of items in a bin follows normal distribution



stochastic bin packing2
Stochastic Bin Packing
  • Equivalent Size
  • Classical Bin Packing:
    • The number of bins used is the sum of the item sizes plus the sum of the residual capacity of each bin
    • Pack each bin in a compact way so as to reduce the residual capacity of each bin
stochastic bin packing3
Stochastic Bin Packing
  • SBP
    • Reducing the residual capacity does not necessarily reduce the number of bins used since the equivalent size can change
  • Solution
    • Reduce the residual capacity
    • Reduce the equivalent size at the same time while packing items
stochastic bin packing4
Stochastic Bin Packing
  • One way: 1-4(full),5-13(full),14-35
  • Second way:1-2 & 5-8 &14 (full), 3-4 & 9-10 & 15 (full), the rest must be in two bins
  • Both of them pact as compact as possible
  • The total equivalent size in the second method is larger than the first one
online algorithm
Online Algorithm
  • Find methods of dividing groups
    • The items with both means and standard deviations in the same interval belong to the same group
  • Run Group Packing algorithm
online algorithm1
Online Algorithm
  • Two scenarios
    • Finite possibilities of means
    • Generic scenarios
  • Worst case performance ratio
    • B(L) is the number of bins used by the packing algorithm
      • First scenario
      • Second scenario
simulation setup
Simulation Setup
  • Use traffic dataset from global operational data centers.
  • 9K VMs and servers are equipped with 1Gbps Ethernet card

The bandwidth requirement for VM i

simulation results1
Simulation Results
  • The number of servers used : our algorithm 421 HARMONIC 609

C. C. Lee and D. T. Lee, "A simple on-line bin-packing algorithm," J. ACM, vol. 32, no. 3, pp. 562-572, 1985

  • Use as the bandwidth for requirement VM i
  • The number of servers become 402 for HARMONIC
  • But the violation probability exceeds 0.01
simulation setup1
Simulation Setup
  • Number of items: 2000-20,000
  • for FF and FFD
  • Introduction
  • Server consolidation problems and solutions
    • Static Server Allocation Problems (SSAP) and its extensions [1]
    • Shares and Utilities based Server Consolidation [2]
    • Server Consolidation with Dynamic Bandwidth Demand [3]
  • Conclusion
  • They(except SSAP) do not use deterministic values to characterize the demands over time anymore
    • Some modulate the problem into ILP while considering real world constraints, then solve the ILP with heuristic algorithm [1]
    • Some try to take advantages of Min, Max and Shares features inherent in Server consolidation [2]
    • Some use SBP to consolidate dynamic servers [3]
  • 1 Speitkamp, B.; Bichler, M., “A Mathematical Programming Approach for Server Consolidation Problems in Virtualized Data Centers,” IEEE Transactions on Services Computing, vol.3, no.4, pp.266-278, Oct.-Dec. 2010
  • 2 Cardosa, M.; Korupolu, M.R.; Singh, A., “Shares and utilities based power consolidation in virtualized server environments,” IM ’09. IFIP/IEEE International Symposium on Integrated Network Management, 2009, pp.327-334, 1-5 June 2009
  • 3 Meng Wang; XiaoqiaoMeng; Li Zhang, “Consolidating virtual machines with dynamic bandwidth demand in data centers,” 2011 Proceedings IEEE INFOCOM , pp.71-75, 10-15 April 2011
  • 4 Ying Song; Yanwei Zhang; Yuzhong Sun; Weisong Shi, “Utility analysis for Internet-oriented server consolidation in VM-based data centers,” 2009. CLUSTER ’09. IEEE International Conference on Cluster Computing and Workshops, pp.1-10, Aug. 31 2009-Sept. 4 2009
  • 5 C.Chekuri and S. Khanna, “On Multi-Dimensional PackingProblems,” Proc.ACM-SIAM Symp. Discrete Algorithms, pp. 185-194, 1999.
questions and answers
Questions and Answers
  • Q1: How to understand the combination constraints in the extensions of SSAP?[1]
  • Answer: If is zero, that means the service e is not allocated on server i, from the formulation

we can get that must be zero, in other words, the other services are not allocated on the server i, too. If is one, it means service e is allocated on server i, then all the other services have to be allocated on the same server i.

questions and answers1
Questions and Answers
  • Q2: Explain the constraints of limit on the number of reallocations. [1]
  • Professor Qiao explained this in an much easier way, just reformulate the constraints into , here r is the limit number of reallocations, and n is the size of the set in the current time.
questions and answers2
Questions and Answers
  • Q3: How does GMM work if the is higher, but the corresponding max size cannot be satisfied?[2]
  • If the max size cannot be satisfied, it will not allocate this service at max, and then keep traversing the list, leaving the corresponding in the list.
questions and answers3
Questions and Answers
  • Q4: How to get the utility inputs in Min, Max and Shares paper? [2]
  • They use normal distribution to generate the utility. First, they generate the utility at min, and then on top of this, they get the utility at max using certain variance. But for the utility of the size between min and max, the paper does not explain very clearly.
questions and answers4
Questions and Answers
  • Q5: In which way they get the HUB method? [2]
  • HUB is ideal. It allows VMs use different servers in different time interval and also VMs can be fractional allocated. In real technology world, we cannot really do this. If we assume that migration over time does not consume time and VMs can be fractionally allocated, we can get this upper bound.
questions and answers5
Questions and Answers
  • Q6 : Since “the dynamic bandwidth demand” paper is too abstract and theoretical, just introduce the main ideas. [3]
  • The core idea is that we should pact items as compact as possible, and at the same time try to minimize the equivalent size of the items. First they divided the items into different groups according to the mean and variance interval, and then used on line algorithm to pact the items in each group. They improved that the worst case performance ratio is .