Presentation Transcript

Cellular Disco: Resource management using virtual clusters on shared-memory multiprocessors

Kinshuk Govil, Dan Teodosiu*, Yongqiang Huang, and Mendel Rosenblum

Computer Systems Laboratory, Stanford University

* Xift, Inc., Palo Alto, CA

www-flash.stanford.edu


Motivation

  • Why buy a large shared-memory machine?

    • Performance, flexibility, manageability, show-off

  • These machines are not being used at their full potential

    • Operating system scalability bottlenecks

    • No fault containment support

    • Lack of scalable resource management

  • Operating systems are too large to adapt


Previous approaches

  • Operating system: Hive, SGI IRIX 6.4, 6.5

    • Knowledge of application resource needs

    • Huge implementation cost (a few million lines)

  • Hardware: static and dynamic partitioning

    • Cluster-like (fault containment)

    • Inefficient: coarse granularity, requires OS changes, poor fit for large applications

  • Virtual machine monitor: Disco

    • Low implementation cost (13K lines of code)

    • Cost of virtualization


Questions

  • Can virtualization overhead be kept low?

    • Usually within 10%

  • Can fault containment overhead be kept low?

    • In the noise

  • Can a virtual machine monitor manage resources as well as an operating system?

    • Yes


Overview of virtual machines

  • IBM, 1960s

  • Trap privileged instructions

  • Physical-to-machine address mapping

  • No or minor OS modifications

[Slide diagram: applications and an OS running inside a virtual machine, on top of the virtual machine monitor, which runs on the hardware]
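The second and third bullets are the heart of the approach: the guest OS manages what it believes are physical pages, while the monitor keeps a second mapping from those pages to real machine pages. A minimal sketch of that indirection is below; all class and method names, the page size, and the allocation policy are illustrative inventions, not taken from Cellular Disco.

```python
# Sketch of the physical-to-machine page indirection a VMM maintains.
# Names and policy are illustrative, not from the Cellular Disco code.

PAGE_SIZE = 4096

class Monitor:
    """Owns the real (machine) pages of the box."""
    def __init__(self, num_machine_pages):
        self.free = list(range(num_machine_pages))

    def alloc_machine_page(self):
        return self.free.pop()

class VirtualMachine:
    """Each VM sees a contiguous range of 'physical' pages; the monitor
    lazily backs them with machine pages via the pmap."""
    def __init__(self, monitor, num_phys_pages):
        self.monitor = monitor
        self.pmap = [None] * num_phys_pages  # physical page -> machine page

    def translate(self, phys_addr):
        ppn, offset = divmod(phys_addr, PAGE_SIZE)
        if self.pmap[ppn] is None:           # first touch: allocate backing
            self.pmap[ppn] = self.monitor.alloc_machine_page()
        return self.pmap[ppn] * PAGE_SIZE + offset
```

Because the guest only ever sees physical addresses, the monitor can later change a pmap entry (for example, to move a page closer to the VCPU using it) with no OS modifications, which is what makes the resource management in the later slides possible.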


Avoiding OS scalability bottlenecks

[Slide diagram: several virtual machines, each running its own OS and applications, hosted side by side by Cellular Disco on the CPUs of a 32-processor SGI Origin 2000, connected by the interconnect]


Experimental setup

  • Cellular Disco running IRIX 6.2 on a 32P Origin 2000, vs. native IRIX 6.4 on a 32P Origin 2000

  • Workloads

    • Informix TPC-D (decision-support database)

    • Kernel build (parallel compilation of IRIX 5.3)

    • Raytrace (from Stanford SPLASH suite)

    • SpecWEB (Apache web server)


MP virtualization overheads

  • Worst-case uniprocessor overhead only 9%

[Slide chart: multiprocessor overheads relative to native IRIX of +20%, +10%, +4%, and +1% across the four workloads]


Fault containment

  • Requires hardware support, as designed in the FLASH multiprocessor

[Slide diagram: the CPUs partitioned into fault-containment cells under Cellular Disco; a hardware fault in one cell takes down only the virtual machines using that cell, while VMs in other cells keep running]


Fault containment overhead @ 0%

[Slide chart: overheads of +1%, +1%, +1%, and -2% across the four workloads]

  • 1000 fault-injection experiments (SimOS): 100% success


Resource management challenges

  • Conflicting constraints

    • Fault containment

    • Resource load balancing

  • Scalability

  • Decentralized control

  • Migrate VMs without OS support


CPU load balancing

[Slide diagram: the VCPUs of several virtual machines spread by Cellular Disco across the CPUs of the machine, connected by the interconnect]


Idle balancer (local view)

  • Check neighboring run queues (intra-cell only)

  • VCPU migration cost: 37µs to 1.5ms

    • Cache and node memory affinity: > 8 ms

  • Backoff

  • Fast, local

[Slide diagram: four per-CPU run queues holding VCPUs A0, A1, B0, and B1; an idle CPU steals a VCPU from a loaded neighbor's queue]
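The idle balancer described above can be sketched as a small work-stealing loop: an idle CPU scans the run queues of its neighbors within the same cell and steals a runnable VCPU, backing off when nothing is available. This is a hypothetical reconstruction, not the paper's code; the names, the leave-one-behind threshold, and the doubling backoff are all assumptions.

```python
# Illustrative sketch of an intra-cell idle balancer (work stealing).
# Not the Cellular Disco implementation; names and policy are assumed.

from collections import deque

class CPU:
    def __init__(self, cpu_id, cell_id):
        self.cpu_id, self.cell_id = cpu_id, cell_id
        self.run_queue = deque()   # runnable VCPUs on this CPU
        self.backoff = 1           # retry interval grows after failed scans

def idle_balance(idle_cpu, cpus):
    # Intra-cell only: never migrate across a fault-containment boundary,
    # since that would add a fault dependency on another cell.
    neighbors = [c for c in cpus
                 if c.cell_id == idle_cpu.cell_id and c is not idle_cpu]
    # Scan the most loaded neighbors first.
    for neighbor in sorted(neighbors, key=lambda c: -len(c.run_queue)):
        if len(neighbor.run_queue) > 1:     # leave the neighbor one VCPU
            vcpu = neighbor.run_queue.pop()
            idle_cpu.run_queue.append(vcpu)
            idle_cpu.backoff = 1            # success: reset backoff
            return vcpu
    idle_cpu.backoff *= 2                   # nothing to steal: back off
    return None
```

The cheap local scan matches the 37µs-to-1.5ms migration cost on the slide; migrating across nodes is only worthwhile when the affinity loss (> 8 ms) is paid back by keeping the CPU busy.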


Periodic balancer (global view)

  • Check for disparity in load tree

  • Cost

    • Affinity loss

    • Fault dependencies

[Slide diagram: a load tree over four CPUs with per-CPU loads 1, 0, 2, and 1, internal nodes holding the subtree sums 1 and 3, and a root load of 4; a fault-containment boundary separates the two halves; VCPUs A0, A1, B0, and B1 sit in the per-CPU run queues]
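The load tree on the slide can be sketched as follows: leaves hold per-CPU run-queue lengths, internal nodes cache subtree sums, and the periodic balancer descends greedily toward the busiest and idlest leaves instead of scanning every CPU. This is an illustrative reconstruction under assumed names and a made-up disparity threshold; a real balancer would also weigh the migration costs listed above (affinity loss, fault dependencies) before moving a VCPU.

```python
# Sketch of a load tree and a disparity check, as on the slide.
# Names, the greedy descent, and the threshold are assumptions.

class LoadNode:
    def __init__(self, children=None, load=0):
        self.children = children or []
        # Internal nodes cache the sum of their subtree's loads.
        self.load = sum(c.load for c in self.children) if children else load

def _descend(node, pick):
    # Walk from the root toward a leaf, greedily following the subtree
    # chosen by `pick` (max for the busiest path, min for the idlest).
    while node.children:
        node = pick(node.children, key=lambda c: c.load)
    return node

def find_disparity(root, threshold=2):
    """Return (busiest_leaf, idlest_leaf) when their load difference
    meets the threshold, else None (no migration worth its cost)."""
    busiest = _descend(root, max)
    idlest = _descend(root, min)
    if busiest.load - idlest.load >= threshold:
        return busiest, idlest
    return None
```

With the slide's numbers (leaf loads 1, 0, 2, 1), the descent finds the CPU with load 2 and the idle CPU with load 0, a disparity large enough to consider migrating a VCPU across the tree.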


CPU management results

  • IRIX overhead (13%) is higher

[Slide chart: Cellular Disco overheads of +9% and +0.3% for the two experiments]


Memory load balancing

[Slide diagram: the memory of several virtual machines spread by Cellular Disco across the RAM of the nodes, connected by the interconnect]


Memory load balancing policy

  • Borrow memory before running out

  • Allocation preferences for each VM

  • Borrow based on:

    • Combined allocation preferences of VMs

    • Memory availability on other cells

    • Memory usage

  • Loan when enough memory available
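The borrowing policy above can be sketched as a watermark check plus a lender-selection rule: a cell starts borrowing before it runs out, and only cells with enough spare memory will loan. The watermarks, chunk size, and lender choice below are invented for illustration (the real policy also folds in each VM's allocation preferences and memory usage, which this sketch omits).

```python
# Illustrative sketch of the memory-borrowing policy. Thresholds and
# the "most free memory" lender rule are assumptions, not the paper's.

LOW_WATERMARK = 16    # start borrowing below this many free pages
LOAN_THRESHOLD = 64   # only cells above this will loan pages out

def pick_lender(cells, borrower):
    """cells: dict mapping cell_id -> free page count."""
    candidates = {cid: free for cid, free in cells.items()
                  if cid != borrower and free > LOAN_THRESHOLD}
    if not candidates:
        return None
    # Prefer the cell with the most free memory; the real policy also
    # consults the VMs' allocation preferences.
    return max(candidates, key=candidates.get)

def maybe_borrow(cells, cell_id, chunk=32):
    if cells[cell_id] >= LOW_WATERMARK:
        return None                  # not yet under memory pressure
    lender = pick_lender(cells, cell_id)
    if lender is None:
        return None                  # no cell has enough to loan
    cells[lender] -= chunk
    cells[cell_id] += chunk
    return lender
```

Borrowing early (before free memory hits zero) is the point of the low watermark: the cell never has to stall a VM while waiting for a loan to arrive.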


Memory management results

  • Only +1% overhead

  • Ideally: same time if perfect memory balancing

[Slide diagram: two database (DB) workloads running under Cellular Disco on a 32-CPU, 3.5GB machine; with inter-cell borrowing, a cell under memory pressure runs nearly as if memory were perfectly balanced]


Comparison to related work

  • Operating system (IRIX 6.4)

  • Hardware partitioning

    • Simulated by disabling inter-cell resource balancing

[Slide diagram: a 16-process Raytrace and TPC-D running under Cellular Disco across four 8-CPU cells, connected by the interconnect]


Results of comparison

  • CPU utilization: 31% (hardware partitioning) vs. 58% (virtual clusters)


Conclusions

  • Virtual machine approach adds flexibility to system at a low development cost

  • Virtual clusters address the needs of large shared-memory multiprocessors

    • Avoid operating system scalability bottlenecks

    • Support fault containment

    • Provide scalable resource management

    • Small overheads and low implementation cost
