1 / 25

Disco: Running Commodity Operating Systems on Scalable Multiprocessors

Bugnion et al. Presented by: Ahmed Wafa. Disco: Running Commodity Operating Systems on Scalable Multiprocessors. Overview. Shared Memory Multiprocessor Machines System Software for Shared Memory Multi-processor Disco as a virtual machine monitor Performance Conclusions.

ursa
Download Presentation

Disco: Running Commodity Operating Systems on Scalable Multiprocessors

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Bugnion et al. Presented by: Ahmed Wafa Disco: Running Commodity Operating Systems on Scalable Multiprocessors

  2. Overview • Shared Memory Multiprocessor Machines • System Software for Shared Memory Multi-processor • Disco as a virtual machine monitor • Performance • Conclusions

  3. Shared Memory Multiprocessors • UMA (Uniform Memory Access) • NUMA (Non Uniform Memory Access) • ccNUMA (Cache-Coherent NUMA)

  4. The Stanford FLASH Multiprocessor Fig1. The FLASH system architecture

  5. System Software for Shared Memory Multi-processors • Invest a large development effort for designing/developing custom OS • Statically Partition the machine and run commodity systems that communicate together using distributed protocols • Run multiple commodity systems on top of a virtual machine monitor that provide dynamic resource sharing

  6. Disco: A Virtual Machine Monitor Fig.2 Disco's Architecture

  7. Challenges facing Virtual Machines • Overheads. • Resource Management. • Communication and Sharing.

  8. Disco • Interface. • Implementation. • Running Commodity Operating systems.

  9. Disco's Interface • Processors • Virtual CPUs that abstracts the MIPS R10000 • Physical Memory • Contiguous space starting at address 0 • Near Uniform memory access time • I/O Devices • Virtualize access to devices. • Virtual subnet to allow communication

  10. Disco's Implementation • Virtual CPUs • Virtual Physical Memory • NUMA Memory Management • Virtual I/O Devices • Copy-on-write Disks • Virtual Network Interfaces

  11. Implementation: Virtual CPUs • Direct Execution when possible • MIPS modes • Kernel • Supervisor • User

  12. Implementation: Virtual Physical Memory • Fast and Correct Mapping from the VM virtual address space to the real machine address space • Handling of TLB misses and the pmap data structures • TLB miss cost and Disco's Second-Level TLB

  13. Implementation: NUMA Memory Management • Satisfying Cache misses from local memory • Page Migration and Replication • FLASH cache miss counter and Hot Pages • Disco memmap and TLB shootdowns Fig.3 Page Replication

  14. Implementation: I/O • Virtual I/O Devices • Intercepting device access and DMA requests • Special device drivers, single trap per request • Copy-on-Write Disks • Used for shared disks • Speed up disk reads by sharing memory Fig.4 Copy-on-Write Page Sharing

  15. Virtual Network Interface • Virtual subnet to allow VM communication • No MTU limit for packets • Aligned messages that span complete pages will be re-mapped rather than copied • Sharing vs Replicating for maximizing data locality Fig.5 NFS sharing, replacing bcopy to remap data

  16. Running Commodity Operating Systems • IRIX 5.3 a SVR4 based UNIX from SG. • Changes for IRIX • MIPS and KSEG0 • Device Drivers • HAL • Network mbuf sharing

  17. Experimental Results • Setup • Execution Overheads • Memory Overheads • Scalability • Dynamic Page Migration and Replication

  18. Experimental Setup • The FLASH machine was not available at the time of development • Hardware emulation using SimOS

  19. Execution Overheads • 3-16% slowdown • TLB misses • System calls Fig.2 Virtualization Overhead Fig.6 Service Breakdown for pmake workload

  20. Memory Overheads Fig.7 Service Breakdown for pmake workload

  21. Scalability • IRIX and memory management • Using 8 VM(s) reduces delay by 40% Fig.8 Workload Scalability Under Disco

  22. Dynamic Page Migration and Replication • The NUMA problem • IRIX(non NUMA aware) vs DISCO Fig.9 Performance Benefits of Page Migration

  23. Conclusions • Developing system for software for Shared Memory Multiprocessor machines can cost much • Disco can present such a machine as a cluster of connected machines through virtualization • Disco will allow commodity operating system to be used on such machines

  24. Conclusions • Virtualization overhead can be as low as 3%-16% • Optimization techniques can enhance scalability and hide NUMAness from commodity OS

  25. References • “The Stanford FLASH multiprocessor”, kuskin et al. • MIPS R4400 manual http://groups.csail.mit.edu/cag/raw/documents/R4400_Uman_book_Ed2.pd

More Related