
Shared Resource Monitoring and Throughput Optimization in Cloud-Computing Datacenters

Shared Resource Monitoring and Throughput Optimization in Cloud-Computing Datacenters. By Jaideep Moses, Ravi Iyer, Ramesh Illikkal and Sadagopan Srinivasan. Abstract: Datacenters employ server consolidation to maximize the efficiency of platform resource usage.

Presentation Transcript


  1. Shared Resource Monitoring and Throughput Optimization in Cloud-Computing Datacenters By Jaideep Moses, Ravi Iyer, Ramesh Illikkal and Sadagopan Srinivasan

  2. Abstract • Datacenters employ server consolidation to maximize the efficiency of platform resource usage. • Consolidated VMs contend for shared platform resources, which impacts their performance. • Focus: Use shared resource monitoring to • Understand resource usage. • Collect resource usage and performance data. • Migrate VMs that are resource-constrained. • Result: Improved overall datacenter throughput and improved Quality of Service (QoS)

  3. Focus • Monitor and address shared cache contention. • Propose a new optimization metric that captures the priority of the VM and the overall weighted throughput of the datacenter • Conduct detailed experiments emulating data center scenarios including on-line transaction processing workloads. • Results: Monitoring shared resource contention is highly beneficial to better manage throughput and QoS in a cloud-computing datacenter environment.

  4. Keywords • Benchmarks - TPCC, SPECjAppServer, SPECjbb, PARSEC • Virtualization • LLC - Last Level Cache • Shared-cache • CMP - Chip Multiprocessing • Cache Contention • Virtual Platform Architecture • MPI - Misses Per Instruction • IPC - Instructions Per Cycle
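The two per-VM metrics the deck leans on, MPI and IPC, are simple ratios over hardware counter values. A minimal sketch, with illustrative counter numbers (not measurements from the paper):

```python
def mpi(llc_misses, instructions):
    """Misses Per Instruction: LLC misses divided by retired instructions."""
    return llc_misses / instructions

def ipc(instructions, cycles):
    """Instructions Per Cycle: retired instructions divided by core cycles."""
    return instructions / cycles

# Illustrative counter readings for one VM's measurement interval.
print(mpi(2_000_000, 1_000_000_000))    # 0.002 misses/instruction
print(ipc(1_000_000_000, 800_000_000))  # 1.25 instructions/cycle
```

A rising MPI alongside a falling IPC for the same VM is the signature of cache contention that the later slides use to flag suffering VMs.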

  5. Outline • Introduction • Background and Motivation • Proposed Approach • Simulation • Related Work • Summary and Conclusions.

  6. Introduction • Evolved data centers run a large number of heterogeneous applications within virtual machines on each platform. • vSphere • Service Level Agreements. • Key Aspects: • Shared Resource Monitoring • VM Migration • QoS and Datacenter Throughput

  7. Contributions • A simple methodology of using cache occupancy for a shared cache environment. • A new optimization metric that captures QoS as part of the throughput measure of the datacenter. • Detailed experiments emulating data center scenarios, resulting in improvement in QoS and throughput. • Work is unique as it addresses application/VM scheduling in the context of SLAs. • Management of shared cache occupancy. • Focus on shared cache contention, which has first-order impact on performance. • LLC monitoring.

  8. Typical Datacenter Platform and VM Usage

  9. Background and Motivation • Cloud-computing virtualized datacenters of the future will have machines that are based on CMP architecture with multiple cores sharing the same LLC. • Measured the performance of Intel's latest Core 2 Duo platform when running all 26 applications (in Windows XP) from the SPEC CPU2000 benchmark suite individually and in pair-wise mode.

  10. Impact of Cache/Memory Contention

  11. Cache sensitivity of Server Workloads

  12. TPCC performance while co-running with other workloads on same shared LLC

  13. Proposed MIMe Approach • Key components : • Mechanism used to monitor VM resource usage and identifying VMs that suffer due to resource contention. • Techniques used to identify candidate VMs for migration based on priorities and their behavior to achieve improved weighted throughput and determinism across priorities. • A metric that quantifies the goodness/efficiency of the datacenter as weighted throughput measure.

  14. MIMe Key components to improve the efficiency of datacenter weighted throughput

  15. Monitoring resource usage • VPA architecture • VPAID
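Under a VPA-style scheme, each VM carries a VPAID tag and the shared LLC can report how many lines each VPAID currently occupies. The sketch below simulates that bookkeeping in software; real hardware would expose the counts through monitoring registers, and all names and sizes here are illustrative assumptions:

```python
from collections import Counter

class SharedLLC:
    """Toy model of a shared LLC that tags each line with the VPAID
    of the VM that filled it, so per-VM occupancy can be read out."""

    def __init__(self, total_lines):
        self.total_lines = total_lines
        self.owner = {}  # cache line index -> VPAID of the filling VM

    def fill(self, line_index, vpaid):
        # An eviction simply re-tags the line with the new owner.
        self.owner[line_index] = vpaid

    def occupancy(self):
        """Lines occupied per VPAID, as a fraction of the whole LLC."""
        counts = Counter(self.owner.values())
        return {vpaid: n / self.total_lines for vpaid, n in counts.items()}

llc = SharedLLC(total_lines=8192)
for i in range(6000):
    llc.fill(i, vpaid=1)      # a cache-hungry VM fills most of the LLC
for i in range(6000, 8000):
    llc.fill(i, vpaid=2)      # a VM with a smaller working set
print(llc.occupancy())
```

The occupancy fractions are exactly the signal the MIMe approach correlates with per-VM IPC to decide who is squeezing whom.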

  16. IPC sensitivity for TPCC

  17. Identifying VM candidates for migration • Two key factors: • The VM's priority as agreed upon by an SLA • Behavior, e.g. cache sensitivity. • Example scenarios wherein an application like TPCC can exhibit a huge variation in performance depending on the co-scheduled application.

  18. The basic algorithm to identify a candidate VM for migration
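The slide's figure holds the actual algorithm; the sketch below is only a plausible reconstruction of its selection step from the two factors the previous slide names (SLA priority and cache sensitivity). All field names and the occupancy threshold are illustrative assumptions:

```python
def pick_migration(vms, occupancy_floor=0.25):
    """Pick a lower-priority VM to migrate away so a suffering
    high-priority VM regains cache. vms: list of dicts with keys
    vpaid, priority (higher = more important), cache_sensitive (bool),
    occupancy (fraction of the shared LLC currently held)."""
    # A VM "suffers" if it is high priority, cache sensitive, and
    # its LLC occupancy has been squeezed below the floor.
    suffering = [v for v in vms
                 if v["priority"] > 1 and v["cache_sensitive"]
                 and v["occupancy"] < occupancy_floor]
    if not suffering:
        return None
    worst = max(suffering, key=lambda v: v["priority"])
    # Migrate the lowest-priority co-runner to free up cache for it.
    candidates = [v for v in vms if v["priority"] < worst["priority"]]
    return min(candidates, key=lambda v: v["priority"]) if candidates else None

low = pick_migration([
    {"vpaid": 1, "priority": 3, "cache_sensitive": True, "occupancy": 0.15},
    {"vpaid": 2, "priority": 1, "cache_sensitive": False, "occupancy": 0.55},
])
print(low["vpaid"])  # 2
```

Running this selection each monitoring cycle matches the deck's point that the process is cyclic, so phase changes and SLA changes are picked up on the next pass.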

  19. Goal • After the migrations, no VM of interest with a higher priority runs less efficiently than a VM of lower priority. • The whole process is cyclic, which ensures that workload phase changes or changes in SLAs with customers can be addressed with ease

  20. Metric to quantify the efficiency of a datacenter • Baseline measure - total system IPC. • The vConsolidate benchmarking concept proposes a weighted normalized performance metric, with weights associated with workload performance. • Our new metric incorporates the QoS value as part of the throughput measure: the QoS-weighted throughput performance metric
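The deck does not spell out the formula on this slide, so the following is a hedged sketch of a QoS-weighted throughput score: each VM's IPC is normalized against a solo-run baseline and weighted by its SLA priority, so the datacenter score rewards keeping high-priority VMs efficient. The weights and the normalization baseline are assumptions, not the paper's exact definition:

```python
def qos_weighted_throughput(vms):
    """vms: list of (ipc, baseline_ipc, qos_weight) tuples, where
    baseline_ipc is the VM's IPC when running alone on the platform."""
    return sum(weight * (ipc / baseline) for ipc, baseline, weight in vms)

score = qos_weighted_throughput([
    (1.0, 1.25, 3.0),   # high-priority VM running at 80% of its solo IPC
    (0.6, 0.75, 1.0),   # low-priority VM also at 80% of its solo IPC
])
print(round(score, 2))  # 3.2
```

With this shape, a migration that lifts the high-priority VM from 80% to 100% of baseline raises the score far more than the same lift on the low-priority VM, which is the behavior the slide's metric is after.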

  21. RESULTS AND ANALYSIS • Simulation-based methodology using CMPSched$im, a parallel multi-core performance simulator. • Utilizes the Pin binary instrumentation system to evaluate the performance of single-threaded, multi-threaded, and multi-programmed workloads on a single/multi-core processor. • Dynamically feeds instructions and memory references to the simulator. • Modified to be used as a trace-driven simulator. • Server workload traces for TPCC, SPECjbb, SPECjAppServer, an indexing workload and PARSEC. • Result: In the absence of any enforcement mechanism in the hardware to control cache occupancy, scheduling decisions must rely on monitoring information alone.

  22. TPCC IPC and Occupancy with QoS values

  23. After Migration TPCC IPC and Occupancy with QoS values

  24. Effect of minimizing contention for HP applications

  25. Mean IPC after VM migration for reducing cache contention for HP applications

  26. Mean IPC after VM migration for reducing cache contention for HP applications

  27. Experiment Results • Logically clustering identical machines together, then applying the migration policy. • The overall score increases by 8% for TPCC workloads. • With SPECjAppServer workloads the increase is 4.5%

  28. RELATED WORK • Most other studies have focused on a single machine, not virtualized environments. • Recently, a few studies, such as those by Cherkasova and by Enright Jerger, have focused on sharing in caches and on better scheduling policies. • We show how identical machines can be logically clustered, and how, based on VPA monitoring, higher-priority applications that we care about are always guaranteed to get more platform resources (cache) than lower-priority applications. • We also propose a new metric that incorporates QoS into the throughput measure

  29. CONCLUSION • Contention in the shared cache is a critical problem in virtualized cloud-computing data centers. • High-priority applications can suffer if scheduling at the data center level is not done with cache contention in mind. • The problem can be solved without waiting for enforcement mechanisms to become available in the shared LLC. • A very simple solution based on a VPA architecture

  30. Future Work • Incorporating memory bandwidth as part of the VPA architecture. • Scheduling optimizations. • Profiling of VMs to inform scheduling decisions. • Monitoring and enforcement for cache, memory bandwidth, and also power can be used very efficiently

  31. THANK YOU !!
