
Cellular Disco: Resource management using virtual clusters on shared-memory multiprocessors



Presentation Transcript


  1. Cellular Disco: Resource management using virtual clusters on shared-memory multiprocessors
     Kinshuk Govil, Dan Teodosiu*, Yongqiang Huang, and Mendel Rosenblum
     Computer Systems Laboratory, Stanford University
     * Xift, Inc., Palo Alto, CA
     www-flash.stanford.edu

  2. Motivation
     • Why buy a large shared-memory machine?
       • Performance, flexibility, manageability, show-off
     • These machines are not being used at their full potential
       • Operating system scalability bottlenecks
       • No fault containment support
       • Lack of scalable resource management
     • Operating systems are too large to adapt

  3. Previous approaches
     • Operating system: Hive, SGI IRIX 6.4, 6.5
       • Knowledge of application resource needs
       • Huge implementation cost (a few million lines)
     • Hardware: static and dynamic partitioning
       • Cluster-like (fault containment)
       • Inefficient, coarse granularity; requires OS changes; poor fit for large apps
     • Virtual machine monitor: Disco
       • Low implementation cost (13K lines of code)
       • Cost of virtualization

  4. Questions
     • Can virtualization overhead be kept low?
       • Usually within 10%
     • Can fault containment overhead be kept low?
       • In the noise
     • Can a virtual machine monitor manage resources as well as an operating system?
       • Yes

  5. Overview of virtual machines
     • IBM, 1960s
     • Trap privileged instructions
     • Physical-to-machine address mapping
     • No/minor OS modifications
     [Figure: virtual machines, each an App on its own OS, running on a Virtual Machine Monitor, which runs on the hardware]
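As a toy illustration of the extra level of indirection on this slide, here is a minimal C sketch of a per-VM table that maps guest "physical" pages to machine pages; the table layout, page size, and names are assumptions for this example, not Cellular Disco's actual code.

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT  12      /* assume 4 KB pages */
#define GUEST_PAGES 1024    /* pages in this toy virtual machine */

/* Per-VM table mapping guest "physical" page numbers to machine page
 * numbers. The monitor keeps one such table per virtual machine; the
 * guest OS never sees machine addresses. (Illustrative only.) */
static uint64_t pmap[GUEST_PAGES];

/* Translate a guest physical address to a machine address. In a real
 * monitor this happens when a guest address-translation update traps
 * into the VMM. */
static uint64_t phys_to_machine(uint64_t guest_pa)
{
    uint64_t ppn    = guest_pa >> PAGE_SHIFT;
    uint64_t offset = guest_pa & ((1u << PAGE_SHIFT) - 1);
    return (pmap[ppn] << PAGE_SHIFT) | offset;
}

int main(void)
{
    /* Pretend the monitor backed guest page 5 with machine page 900. */
    pmap[5] = 900;
    printf("machine addr = 0x%llx\n",
           (unsigned long long)phys_to_machine((5ull << PAGE_SHIFT) + 0x2a));
    return 0;
}
```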

  6. Avoiding OS scalability bottlenecks
     [Figure: multiple virtual machines, each running an application on its own operating system, hosted by Cellular Disco on a 32-processor SGI Origin 2000 (CPUs connected by an interconnect)]

  7. Experimental setup
     • Cellular Disco (running IRIX 6.2) vs. IRIX 6.4, each on a 32P Origin 2000
     • Workloads
       • Informix TPC-D (decision-support database)
       • Kernel build (parallel compilation of IRIX 5.3)
       • Raytrace (from Stanford SPLASH suite)
       • SpecWEB (Apache web server)

  8. MP virtualization overheads
     • Worst-case uniprocessor overhead only 9%
     [Chart: multiprocessor overheads of +20%, +10%, +4%, and +1% across the workloads]

  9. Fault containment
     • Requires hardware support, as designed in the FLASH multiprocessor
     [Figure: virtual machines running on Cellular Disco, with the 8 CPUs and interconnect divided into fault containment cells]

  10. Fault containment overhead
     • 1000 fault injection experiments (SimOS): 100% success
     [Chart: per-workload overheads of 0%, +1%, +1%, +1%, and -2%]

  11. Resource management challenges
     • Conflicting constraints
       • Fault containment
       • Resource load balancing
     • Scalability
       • Decentralized control
     • Migrate VMs without OS support

  12. CPU load balancing
     [Figure: six virtual machines spread by Cellular Disco across 8 CPUs connected by the interconnect]

  13. Idle balancer (local view)
     • Check neighboring run queues (intra-cell only)
     • VCPU migration cost: 37µs to 1.5ms
     • Cache and node memory affinity: > 8ms
     • Backoff
     • Fast, local
     [Figure: CPUs 0-3 with run queues holding VCPUs A0, A1, B0, B1]
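A minimal C sketch of the idle-stealing idea, assuming each CPU's run queue is just a count of runnable VCPUs; the data structures and helper names are hypothetical, not taken from Cellular Disco.

```c
#include <stdio.h>

#define CPUS_PER_CELL 4

/* Toy per-CPU run queue: just a count of runnable VCPUs. */
struct runq { int length; };
static struct runq runq[CPUS_PER_CELL];

/* Called from a CPU's idle loop: scan the other run queues in the same
 * cell (intra-cell only, to respect fault containment) and steal one
 * VCPU from the longest. Returns the victim CPU, or -1 if nothing is
 * runnable elsewhere, in which case the caller backs off before
 * retrying, doubling its delay each time. */
static int idle_steal(int self)
{
    int victim = -1, longest = 0;
    for (int cpu = 0; cpu < CPUS_PER_CELL; cpu++) {
        if (cpu != self && runq[cpu].length > longest) {
            longest = runq[cpu].length;
            victim  = cpu;
        }
    }
    if (victim < 0)
        return -1;             /* nothing to steal: back off */
    runq[victim].length--;     /* migrate one VCPU (costs 37µs-1.5ms) */
    runq[self].length++;
    return victim;
}

int main(void)
{
    runq[2].length = 3;        /* CPU 2 is overloaded, CPU 0 is idle */
    printf("CPU 0 stole a VCPU from CPU %d\n", idle_steal(0));
    return 0;
}
```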

  14. Periodic balancer (global view)
     • Check for disparity in the load tree
     • Factors considered:
       • Cost
       • Affinity loss
       • Fault dependencies
     [Figure: load tree over CPUs 0-3 with leaf loads 1, 0, 2, 1, inner-node sums 1 and 3, and root 4; a fault containment boundary separates the two halves; run queues hold VCPUs A0, A1, B0, B1]
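The load tree can be illustrated with a toy C sketch that reuses the leaf loads from the slide's figure (1, 0, 2, 1); the flat-array layout and function names are assumptions for this example, not the paper's implementation.

```c
#include <stdio.h>

#define NCPUS 4    /* leaves of the toy load tree */

/* Binary load tree stored as a flat array: node i has children 2i+1
 * and 2i+2. The NCPUS leaves hold per-CPU run-queue lengths and each
 * inner node caches the sum of its subtree. (Illustrative layout.) */
static int tree[2 * NCPUS - 1];

static void recompute(void)
{
    for (int i = NCPUS - 2; i >= 0; i--)
        tree[i] = tree[2 * i + 1] + tree[2 * i + 2];
}

/* Walk from the root toward the larger of each pair of sibling
 * subtrees to find the load disparity. A real balancer would also
 * weigh migration cost, affinity loss, and fault-containment
 * boundaries before moving a VCPU. */
static void find_disparity(void)
{
    int i = 0;
    while (i < NCPUS - 1) {    /* stop at a leaf */
        int l = 2 * i + 1, r = 2 * i + 2;
        printf("node %d: left=%d right=%d\n", i, tree[l], tree[r]);
        i = (tree[l] >= tree[r]) ? l : r;  /* descend into loaded side */
    }
    printf("busiest CPU: %d (load %d)\n", i - (NCPUS - 1), tree[i]);
}

int main(void)
{
    int load[NCPUS] = { 1, 0, 2, 1 };     /* leaf loads from the slide */
    for (int c = 0; c < NCPUS; c++)
        tree[NCPUS - 1 + c] = load[c];
    recompute();                          /* root becomes 4, as shown */
    find_disparity();
    return 0;
}
```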

  15. CPU management results
     • IRIX overhead (13%) is higher
     [Chart: Cellular Disco CPU management overheads of +9% and +0.3%]

  16. Memory load balancing
     [Figure: four virtual machines drawing pages from RAM across the machine through Cellular Disco, over the interconnect]

  17. Memory load balancing policy
     • Borrow memory before running out
     • Allocation preferences for each VM
     • Borrow based on:
       • Combined allocation preferences of VMs
       • Memory availability on other cells
       • Memory usage
     • Loan when enough memory available
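A hedged C sketch of the borrowing decision described above, assuming simple watermark thresholds and an additive preference score; the constants, scoring rule, and names are illustrative assumptions, not the policy's actual parameters.

```c
#include <stdio.h>

#define NCELLS 4
#define LOW_WATERMARK  64    /* start borrowing below this (assumed) */
#define LEND_THRESHOLD 256   /* lend only above this (assumed) */

/* Toy per-cell memory state, in pages. */
struct cell {
    int free_pages;
    int vm_preference[NCELLS];  /* combined allocation preferences of
                                   this cell's VMs toward each remote
                                   cell (e.g. for locality) */
};

/* Pick a cell to borrow from: among cells with a lendable surplus,
 * choose the one that best combines VM preference with availability.
 * Returns -1 if no cell can lend right now. */
static int pick_lender(const struct cell cells[], int self)
{
    int best = -1, best_score = -1;
    for (int c = 0; c < NCELLS; c++) {
        if (c == self || cells[c].free_pages < LEND_THRESHOLD)
            continue;
        int score = cells[self].vm_preference[c] + cells[c].free_pages;
        if (score > best_score) {
            best_score = score;
            best = c;
        }
    }
    return best;
}

int main(void)
{
    struct cell cells[NCELLS] = {
        { .free_pages = 32, .vm_preference = { 0, 10, 2, 2 } },
        { .free_pages = 300 },
        { .free_pages = 500 },
        { .free_pages = 100 },
    };
    /* Cell 0 is below its watermark, so it borrows proactively. */
    if (cells[0].free_pages < LOW_WATERMARK)
        printf("cell 0 borrows from cell %d\n", pick_lender(cells, 0));
    return 0;
}
```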

  18. Memory management results
     • Only +1% overhead
     • Ideally: same time if memory balancing were perfect
     [Figure: DB workload on Cellular Disco, 32 CPUs, 3.5GB, shown in two configurations across the interconnect]

  19. Comparison to related work
     • Operating system (IRIX 6.4)
     • Hardware partitioning
       • Simulated by disabling inter-cell resource balancing
     [Figure: 16-process Raytrace and TPC-D on Cellular Disco, with the machine divided into four 8-CPU partitions over the interconnect]

  20. Results of comparison
     • CPU utilization: 31% (hardware partitioning) vs. 58% (virtual clusters)

  21. Conclusions
     • Virtual machine approach adds flexibility to the system at a low development cost
     • Virtual clusters address the needs of large shared-memory multiprocessors
       • Avoid operating system scalability bottlenecks
       • Support fault containment
       • Provide scalable resource management
     • Small overheads and low implementation cost
