Virtualizing Mission-Critical Apps
1PM EST, 3/29/2011
Ilya Mirman, Philip Thomas


Presentation Transcript




Agenda

  • The Rise of “The Virtualization Chasm”

  • 3 Fundamental inefficiencies

  • Best practices

  • Live demonstration


Background


Before Virtualization

  • Traditional IT guarantees apps’ performance by

    • Dedicating physical machines (PMs) to apps

    • Provisioning sufficient capacity to service peak loads

  • Consider an app requiring 16 cores, 8 GB of memory and 10k IOPS (I/O operations per second) of IO bandwidth to service its peaks

[Figure: capacity of a dedicated PM versus peak CPU workload. CPU capacity: 16 cores; memory capacity: 8 GB; IO capacity: 10k IOPS. Excess capacity keeps utilization under 80%.]


Over-Provisioning Waste

[Figure: PM capacity over-provisioned for peak demands. Average utilization: 10%; the remainder is wasted capacity.]

Workloads are ‘bursty’: Average/peak is often under 10%

Dedicating hardware wastes the slack capacity between average & peak
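The slack between average and peak can be put in numbers. The sketch below is only an illustration: the function name `wasted_fraction` is hypothetical, and the 16-core PM with a 10% average load is borrowed from the figures on the slides rather than from any measurement.

```python
def wasted_fraction(average_demand: float, provisioned_capacity: float) -> float:
    """Share of the provisioned capacity that sits idle on average."""
    if provisioned_capacity <= 0:
        raise ValueError("provisioned capacity must be positive")
    return 1.0 - average_demand / provisioned_capacity


# Assumed numbers: a PM provisioned with 16 cores for the peak,
# carrying an average load of 10% of that capacity (1.6 cores).
peak_cores = 16
average_cores = 0.10 * peak_cores
waste = wasted_fraction(average_cores, peak_cores)
print(f"Idle on average: {waste:.0%} of {peak_cores} cores")
```

With these assumed inputs, roughly nine tenths of the dedicated hardware is idle on average.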


Virtualization is Set to Resolve This Waste

Consolidate workloads into shared PMs

This increases average utilization additively

But it also increases interference among VMs

E.g., peak traffic of VM1 can interfere with CPU availability for other VMs

[Figure: peak workloads of VM1 through VM10, consolidated into shared PMs.]
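Both effects of consolidation, the additive gain in average utilization and the added risk of interference, can be sketched in a few lines. The function names and the four-VM workload below are hypothetical, chosen only to make the arithmetic visible.

```python
from typing import Sequence


def average_utilization(avg_demands: Sequence[float], capacity: float) -> float:
    """Average utilization of a shared PM: the VMs' average demands add up."""
    return sum(avg_demands) / capacity


def worst_case_shortfall(peak_demands: Sequence[float], capacity: float) -> float:
    """Capacity missing if every VM peaks at the same moment (0 if none)."""
    return max(0.0, sum(peak_demands) - capacity)


# Hypothetical workloads: four VMs on one 16-core PM,
# each averaging 0.8 cores and peaking at 6 cores.
avg = [0.8, 0.8, 0.8, 0.8]
peak = [6, 6, 6, 6]
util = average_utilization(avg, 16)         # averages add: 3.2 of 16 cores
shortfall = worst_case_shortfall(peak, 16)  # coincident peaks need 24 cores
print(f"utilization {util:.0%}, worst-case shortfall {shortfall:.0f} cores")
```

In this made-up case utilization doubles compared with two VMs per PM, but simultaneous peaks would leave the VMs 8 cores short.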


VMs Compete for Resources

Best-effort resource allocations (vs. dedicated)

VMs get their allocations, if capacity is available

VMs experience interference when capacity is insufficient

Interference can create congestion, bottlenecks and delays

Performance-insensitive apps can tolerate interference

Permit simple, risk-free virtualization

But mission-critical apps are highly vulnerable to interference!


The Rise of “The Virtualization Chasm”

[Figure: ROI versus percentage of apps virtualized (20% to 100%). Virtualization 1.0 covers performance-insensitive apps, Virtualization 2.0 covers production apps, and “The Virtualization Chasm” lies between them.]

  • Virtualization 1.0: Virtualize performance-insensitive apps

    • E.g., print servers, non-critical web apps (the low-hanging fruit)

    • 20%-30% of enterprise apps

  • Virtualization 2.0: Virtualize production apps

    • The remaining 70%-80% important/critical production apps


Virtualizing Mission-Critical Apps


The Key Challenge: Ensuring That Production Apps Get Their Resources

Interference results from statistical over-commitment

Apps’ demands can exceed capacity momentarily

Interference may be controlled by two mechanisms

Resource allocation: protect apps against over-commitment

Workload placement: move workloads to minimize interference
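Momentary over-commitment can be spotted by adding up per-interval demands. The following is a minimal sketch under stated assumptions: the helper `overcommitted_intervals` and the sample demand series are invented for illustration and do not come from the presentation.

```python
from typing import Dict, List


def overcommitted_intervals(demands: Dict[str, List[float]], capacity: float) -> List[int]:
    """Return the indices of the intervals where aggregate demand exceeds capacity."""
    totals = [sum(per_interval) for per_interval in zip(*demands.values())]
    return [i for i, total in enumerate(totals) if total > capacity]


# Hypothetical CPU demand (cores) of three VMs over six intervals on a 16-core PM.
demands = {
    "VM1": [2, 2, 10, 2, 2, 2],
    "VM2": [3, 3, 3, 3, 9, 3],
    "VM3": [1, 1, 6, 1, 8, 1],
}
hot = overcommitted_intervals(demands, capacity=16)
print("Intervals with interference:", hot)
```

Most intervals here use 6 of the 16 cores, yet two intervals exceed capacity because peaks overlap. Those are the moments that resource allocation or workload placement would have to address.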

Let’s take a look at recommendations from the hypervisor vendors…


VMware Best Practices: Managing Production Apps’ Performance

Avoid Over-Commitment:

“For performance-critical Exchange virtual machines (i.e., production systems), try to ensure the total number of vCPUs assigned to all the virtual machines is equal to or less than the total number of cores on the host machine.”

Assure Peak Utilization:

“It is recommended that standalone servers… be designed to not exceed 70% utilization during peak period.”

Best Practice Guide to Exchange Server Virtualization:

http://www.vmware.com/files/pdf/Exchange_2010_on_VMware_-_Best_Practices_Guide.pdf


VMware Best Practices: Managing Production Apps’ Performance

VMware Production Apps Strategy Rests on 2 Rules:

VMs running production apps should ensure that:

R-I: “Resource allocations are sufficient to serve peak demands.”

R-I guarantees that an app may get its peak demands served, if capacity is available.

R-II: “Aggregate allocations do not exceed the PM capacity.”

R-II guarantees that the capacity allocation will be available.

I.e., if VM1 and VM2 each need 4 vCPUs, we need a PM with ≥8 CPUs!
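The two rules translate directly into checks. The sketch below is illustrative: the function names are hypothetical and it tracks vCPUs only, whereas a real check would repeat the test for memory and IO.

```python
from typing import Dict


def satisfies_rule_1(peak: Dict[str, int], allocation: Dict[str, int]) -> bool:
    """R-I: every VM's allocation covers its peak demand."""
    return all(allocation[vm] >= peak[vm] for vm in peak)


def satisfies_rule_2(allocation: Dict[str, int], pm_capacity: int) -> bool:
    """R-II: aggregate allocations do not exceed the PM capacity."""
    return sum(allocation.values()) <= pm_capacity


# The slide's example: VM1 and VM2 each need 4 vCPUs at peak.
peak = {"VM1": 4, "VM2": 4}
allocation = {"VM1": 4, "VM2": 4}
ok_on_8 = satisfies_rule_1(peak, allocation) and satisfies_rule_2(allocation, 8)
ok_on_6 = satisfies_rule_1(peak, allocation) and satisfies_rule_2(allocation, 6)
print(ok_on_8, ok_on_6)
```

An 8-core PM passes both rules, while a 6-core PM fails R-II even though the two VMs may rarely peak together.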


Wait… Really? Then Why Virtualize?

  • Though there’s no sharing of resources, we still enjoy the other benefits of virtualization (app isolation, VM set-up, back-up, etc.)

R-I: “Resource allocations are sufficient to serve peak demands.”

R-I guarantees that an app may get its peak demands served, if capacity is available.

R-II: “Aggregate allocations do not exceed the PM capacity.”

R-II guarantees that the capacity allocation will be available.


Virtualization Can Result in 3 Fundamental Inefficiencies

1. Over-provisioning inefficiency

2. Workload packing inefficiency

3. Non-adaptive control inefficiency

These fundamental inefficiencies are considered next…


Over-provisioning Inefficiency


How to Avoid Over-Provisioning Waste?

  • To avoid waste: increase average workload without increasing reservations

    • Add performance-insensitive apps with high average workload

    • E.g., consolidate spam-filter apps, email archival apps alongside mission-critical apps

  • Need additional best practice rule: Smart consolidation

Best Practice #1:

Maintain a consolidation-balance between performance-sensitive and insensitive workloads
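A small calculation shows why balanced consolidation helps. This is a sketch under assumptions: the `pm_summary` helper, the 16-core PM and the workload figures are hypothetical, and the performance-insensitive apps are modelled as best-effort workloads with no reservation.

```python
from typing import Dict, List


def pm_summary(workloads: List[Dict[str, float]], capacity: float) -> Dict[str, float]:
    """Return the total reservation and the average utilization for one PM."""
    reserved = sum(w["reserved"] for w in workloads)
    if reserved > capacity:
        raise ValueError("reservations exceed PM capacity")
    return {
        "reserved": reserved,
        "utilization": sum(w["average"] for w in workloads) / capacity,
    }


# Hypothetical 16-core PM. The mission-critical app reserves its 16-core peak;
# the performance-insensitive apps run best-effort with no reservation.
critical = [{"average": 1.6, "reserved": 16.0}]
insensitive = [
    {"average": 4.0, "reserved": 0.0},  # e.g., a spam filter
    {"average": 4.0, "reserved": 0.0},  # e.g., email archival
]
before = pm_summary(critical, capacity=16.0)
after = pm_summary(critical + insensitive, capacity=16.0)
print(f"{before['utilization']:.0%} -> {after['utilization']:.0%}")
```

Reservations stay at 16 cores in both cases; only the average load on the otherwise idle capacity changes.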


Workload-Packing Inefficiency


A Greatly Simplified Example

[Figure: manual ad-hoc assignment of virtualized workloads VM1 through VM6 to PM1, PM2 and PM3. Capacity labels: CPU capacity 16 cores, memory capacity 8 GB, IO capacity 10k IOPS.]


What If We Get New VMs?

[Figure: new workloads VM7 through VM10 arrive; ad hoc assignment spreads the workloads across PM1 through PM5.]

  • Can we do better?

  • Optimized assignment uses 40% fewer resources (3 PMs vs. 5)
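The 40% figure is simply (5 − 3) / 5. How an optimizer might find a tighter assignment is not described on the slides; the sketch below substitutes a standard first-fit-decreasing heuristic over three resource dimensions. The VM demands are made up and do not reproduce the slide's example.

```python
from typing import Dict, List, Tuple

Demand = Tuple[float, float, float]  # (CPU cores, memory GB, k-IOPS)


def first_fit_decreasing(vms: Dict[str, Demand], capacity: Demand) -> List[List[str]]:
    """Pack VMs onto identical PMs; returns the VM names placed on each PM."""
    def size(d: Demand) -> float:
        # Largest normalized demand across the three resource dimensions.
        return max(x / c for x, c in zip(d, capacity))

    for name, d in vms.items():
        if size(d) > 1:
            raise ValueError(f"{name} does not fit on an empty PM")

    placements: List[List[str]] = []
    free: List[List[float]] = []  # remaining capacity per PM
    for name in sorted(vms, key=lambda n: size(vms[n]), reverse=True):
        demand = vms[name]
        for pm, remaining in enumerate(free):
            if all(x <= r for x, r in zip(demand, remaining)):
                placements[pm].append(name)
                free[pm] = [r - x for r, x in zip(remaining, demand)]
                break
        else:  # no existing PM has room in every dimension: open a new one
            placements.append([name])
            free.append([c - x for c, x in zip(capacity, demand)])
    return placements


# Hypothetical peak demands; each PM offers 16 cores, 8 GB and 10k IOPS.
vms = {
    "VM1": (8, 4, 5), "VM2": (8, 4, 5), "VM3": (6, 2, 2),
    "VM4": (4, 3, 3), "VM5": (4, 2, 4), "VM6": (2, 1, 1),
}
placements = first_fit_decreasing(vms, capacity=(16, 8, 10))
savings = (5 - 3) / 5  # the slide's 3 PMs vs. 5
print(len(placements), "PMs:", placements)
```

A VM is placed only where CPU, memory and IO all fit. First-fit decreasing is a heuristic, so it gives a reasonable packing rather than a guaranteed optimum.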


What Can We Learn from This Example?

Changes may require (re-)assignment of workloads

Even a trivialized example can be very complex

Complexity and waste can grow dramatically

When the number of VMs increases

When physical machines vary

When there are constraints (e.g., storage access, security policies)

When the rate of changes is high

Ad hoc processes can lead to costly inefficiencies

Planning and workload placement must consider all workload types (not just CPU)


Overcoming the Packing Inefficiency

Use improved workload placement algorithms

Look holistically at all workloads and resources

Exploit the flexibility of performance-insensitive workloads

Exploit the dynamics of workload peaks & troughs

Best Practice #2:

Use improved workload placement algorithms


Non-adaptive Control Inefficiency


Mission-Critical App Example

  • Virtualized MS Exchange app

  • High IOPS during the night (2AM-5AM)

    • Peak: 10 k-IOPS

    • <1 k-IOPS during the rest of the time

[Figure: IOPS rate (k-IOPS, from 1 to 10) versus time of day (hours 01 to 24).]


What If Workloads Grow?

What if VM1 needs more memory & storage?

[Figure: placement of workloads VM1 through VM6 on PM1 through PM4 as VM1’s demands grow.]

  • Can we do better?

  • Optimized assignment uses 25% fewer resources


Adaptive vs. Non-Adaptive Workload Control

  • Workload demands (and interference) change over time

    • E.g., Exchange server is active through the night

    • Why keep its reservation during the day?

  • Static workload mgmt is limited in handling emergent problems

    • App profiles reflect long-term statistics; fluctuations can cause interference

  • Adaptive workload control offers superior mgmt

    • Exploit workload dynamics to reduce waste of static policies

    • Eliminate emergent interferences

Best Practice #3:

Provide adaptive control to optimize resource use & avoid interference

Best Practice #4:

Use forward-looking workload projection
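Best Practices #3 and #4 can be illustrated together with a toy projection. Everything in this sketch is an assumption made for illustration: the helper names, the 20% headroom, the three days of invented hourly k-IOPS history shaped like the Exchange example, and the projection rule itself (highest demand seen in each hour, plus headroom).

```python
from typing import List


def project_hourly_reservation(history: List[List[float]], headroom: float = 0.2) -> List[float]:
    """Project tomorrow's per-hour reservation from the highest demand seen in that hour."""
    hours = len(history[0])
    if any(len(day) != hours for day in history):
        raise ValueError("every day must have the same number of hourly samples")
    return [max(day[h] for day in history) * (1 + headroom) for h in range(hours)]


def day_profile(night_peak: float, base: float) -> List[float]:
    """Hypothetical Exchange-like day: busy from 02:00 up to 05:00, quiet otherwise."""
    return [night_peak if 2 <= h < 5 else base for h in range(24)]


# Three days of made-up hourly k-IOPS history.
history = [day_profile(10.0, 0.8), day_profile(9.0, 1.0), day_profile(9.5, 0.9)]
schedule = project_hourly_reservation(history)

static_reservation = max(schedule) * 24  # hold the peak reservation all day
adaptive_reservation = sum(schedule)     # follow the projected profile hour by hour
released = 1 - adaptive_reservation / static_reservation
print(f"Reservation released by the adaptive schedule: {released:.1%}")
```

With this invented history, following the projected profile frees most of the reservation outside the nightly window. A production controller would also need to react when actual demand departs from the projection.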


Adaptive Control: Too Complex for Manual Management

Manual management requires administrators to:

Master voluminous details of hypervisor and application internals

Manage interference and waste problems manually

Manage resource allocations and move applications as workloads change

Maintain tight coordination between virtualization & app administrators

This complexity is a central barrier for Virtualization 2.0!


Virtualizing Production Apps: Improved Best Practices


Conclusions

Workload placement can be very inefficient

Over-provisioning waste; workload-packing waste; non-adaptive inefficiencies

Virtualization is much too complex for manual administration

Must be augmented by workload management:

Eliminate the over-provisioning waste through balanced consolidation

Minimize the workload-packing waste by exploiting workload features

Support adaptive control to optimize resource use & avoid interference

Virtualization 2.0 Strategy:

Replace manual mgmt with automated optimized workload management


Live Demonstration


Thank you!

www.vmturbo.com

