

  1. Disco: Running Commodity Operating Systems on Scalable Multiprocessors (Edouard Bugnion et al.) Presented by Vidhya Sivasankaran

  2. Outline • Goal • Problems & Solution • Virtual Machine Monitors (VMM) • Disco architecture • Disco implementation • Experimental results • Conclusion

  3. Goal • Extend a modern commodity OS to run efficiently on a shared-memory multiprocessor without many changes to the OS • Use a virtual machine monitor (VMM) to run multiple copies of the OS on a scalable multiprocessor

  4. Problems • System software for scalable multiprocessor machines lags behind the hardware • Extensive custom modifications to the OS are needed to support scalable machines • Such modification is implementation-intensive and raises reliability issues

  5. Solution • Insert a VMM between the OS and the hardware to resolve the problem • The VMM is implemented as Disco

  6. Virtual Machine Monitors • A VMM is a software layer between the hardware and the OS • Combined with commodity operating systems, a VMM forms a flexible system software solution that supports a large application base • It virtualizes all resources of the machine, exporting a conventional hardware interface to each OS • Virtual machines communicate with each other through standard distributed-system protocols

  7. Disco Architecture (figure: virtual machines running commodity operating systems on top of the Disco VMM, which runs on the multiprocessor hardware)

  8. Advantages • Scalability • Fault containment • Hides the NUMA-ness of the machine from the OS • Flexibility

  9. Challenges facing virtual machines • Overhead • Resource management • Communication and sharing

  10. Disco Implementation • Disco is a multithreaded shared-memory program • Careful attention is paid to NUMA placement and cache-aware data structures • Disco's code segment is replicated into the memory of each FLASH node so instruction cache misses are satisfied from local memory (data locality) • Communication happens through shared memory

  11. Virtual CPU (vCPU) • Disco emulates the execution of a virtual CPU by using direct execution on the real CPU • Scheduling: Disco sets the registers of the real CPU to those of the vCPU and jumps to its current PC • State: for each vCPU, Disco keeps a data structure that maintains the state of the vCPU • Disco's virtual CPUs provide the abstraction of a MIPS R10000 processor

  12. Virtual CPU (contd.) • Additional data structures hold the privileged registers and TLB contents of each vCPU • On MIPS, Disco itself runs in kernel mode • When transferring control to a VM, Disco puts the processor in supervisor mode to run the guest OS • Supervisor mode does not give access to privileged instructions, so those instructions trap back into Disco for emulation (see the sketch below)
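
As a rough illustration of slides 11 and 12, here is a minimal C sketch of what the per-vCPU state and the trap-and-emulate dispatch path could look like. This is not Disco's actual source; every type and helper here (load_registers, enter_supervisor_mode, jump_to, emulate_privileged) is a hypothetical name introduced for illustration.

```c
/* Hypothetical sketch of per-vCPU state and dispatch; names are
 * illustrative, not from the real Disco source. */
#include <stdint.h>

#define GUEST_TLB_ENTRIES 64        /* assumed TLB size */

struct tlb_entry {
    uint64_t vpn;                   /* virtual page number + ASID tag */
    uint64_t ppn;                   /* guest "physical" page number   */
    uint32_t flags;                 /* valid / dirty / read-only bits */
};

struct vcpu {
    uint64_t gpr[32];               /* MIPS general-purpose registers  */
    uint64_t pc;                    /* where direct execution resumes  */
    uint64_t priv[32];              /* saved privileged (CP0) registers */
    struct tlb_entry tlb[GUEST_TLB_ENTRIES]; /* saved guest TLB state  */
};

/* Assumed low-level helpers (assembly in a real monitor). */
extern void load_registers(const uint64_t *gpr);
extern void enter_supervisor_mode(void);   /* deprivilege the guest OS */
extern void jump_to(uint64_t pc);
extern void emulate_privileged(struct vcpu *v, uint32_t insn);

/* Scheduling a vCPU: restore its registers, drop the processor to
 * supervisor mode, and jump to its saved PC (direct execution). */
void vcpu_run(struct vcpu *v) {
    load_registers(v->gpr);
    enter_supervisor_mode();
    jump_to(v->pc);
}

/* A privileged instruction traps back into Disco (kernel mode);
 * Disco applies its effect to the vCPU's saved privileged state. */
void on_privileged_trap(struct vcpu *v, uint32_t insn) {
    emulate_privileged(v, insn);    /* updates v->priv, v->tlb, ... */
    v->pc += 4;                     /* step past the trapped insn   */
    vcpu_run(v);                    /* resume direct execution      */
}
```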

  13. Virtual Physical Memory • Disco adds a level of translation, performing physical-to-machine address mapping using the TLB • When the OS tries to insert a virtual-to-physical mapping into the TLB, Disco emulates the instruction, looks up the machine address for that physical address, and inserts a virtual-to-machine mapping instead • Disco's pmap data structure contains one entry for each physical page of a VM
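
A hedged sketch of that extra translation level: assuming a per-VM pmap array indexed by guest physical page number, a trapped guest TLB write could be rewritten into a virtual-to-machine entry as below. The types and real_tlb_insert are assumptions, not Disco's real interface.

```c
/* Sketch: pmap[] maps each guest physical page to a machine page,
 * and trapped guest TLB writes are rewritten to virt->machine. */
#include <stdint.h>

typedef uint64_t vpn_t;   /* virtual page number */
typedef uint64_t ppn_t;   /* guest physical page */
typedef uint64_t mpn_t;   /* real machine page   */

struct vm {
    mpn_t *pmap;              /* one entry per guest physical page */
    uint64_t num_phys_pages;
};

extern void real_tlb_insert(vpn_t vpn, mpn_t mpn, uint32_t flags);

/* The guest's "insert TLB entry" instruction trapped into Disco. */
void emulate_tlb_insert(struct vm *vm, vpn_t vpn, ppn_t ppn,
                        uint32_t flags) {
    mpn_t mpn = vm->pmap[ppn];        /* physical -> machine        */
    real_tlb_insert(vpn, mpn, flags); /* install virtual -> machine */
}
```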

  14. Continued… • On the MIPS processor, kernel-mode memory references bypass the TLB and directly access memory and I/O, so the guest OS code and data must be relinked into the mapped address space • The MIPS processor tags each TLB entry with an address space identifier (ASID) • Workloads executing on top of Disco suffer from an increased number of TLB misses • A second-level software TLB lessens the performance impact (sketched below)
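
The slide says only that a second-level software TLB exists; one plausible shape for it is a small direct-mapped cache of recent virtual-to-machine translations, consulted on a hardware TLB miss before the full pmap-based emulation path. The size and names below are assumptions.

```c
/* Assumed design: direct-mapped software cache of translations. */
#include <stdint.h>

#define L2TLB_SIZE 2048   /* assumed; must be a power of two */

struct l2tlb_entry {
    uint64_t tag;         /* virtual page number + ASID       */
    uint64_t mpn_flags;   /* machine page and protection bits */
};

static struct l2tlb_entry l2tlb[L2TLB_SIZE];

/* Returns 1 and fills *out on a hit; 0 means fall back to the
 * slower pmap walk and emulation path. */
int l2tlb_lookup(uint64_t tag, uint64_t *out) {
    struct l2tlb_entry *e = &l2tlb[tag & (L2TLB_SIZE - 1)];
    if (e->tag == tag) { *out = e->mpn_flags; return 1; }
    return 0;
}

/* Cache a translation after the slow path resolves it. */
void l2tlb_fill(uint64_t tag, uint64_t mpn_flags) {
    struct l2tlb_entry *e = &l2tlb[tag & (L2TLB_SIZE - 1)];
    e->tag = tag;
    e->mpn_flags = mpn_flags;
}
```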

  15. NUMA Memory Management • Cache misses should be satisfied from local memory to avoid the latency of remote accesses • To accomplish this, Disco implements page replication and migration • Page migration: a page heavily accessed by a single node is migrated to that node • Disco transparently changes the physical-to-machine address mapping • It invalidates the TLB entries mapping the old machine page, then copies the data to a local page (see the sketch below)
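
A sketch of the migration step just described, with the TLB shootdown done before the copy so no CPU can write the stale page. All helpers (invalidate_tlb_entries, alloc_page_on_node, pmap_set, and so on) are hypothetical names, not Disco's real interface.

```c
/* Migration sketch: move a page hot on one node into that node's
 * local memory, transparently to the guest OS. */
#include <stdint.h>

typedef uint64_t ppn_t;
typedef uint64_t mpn_t;
struct vm;

extern void  invalidate_tlb_entries(mpn_t mpn); /* shootdown on all CPUs */
extern mpn_t alloc_page_on_node(int node);
extern void  copy_page(mpn_t dst, mpn_t src);
extern void  pmap_set(struct vm *vm, ppn_t ppn, mpn_t mpn);
extern void  free_page(mpn_t mpn);

void migrate_page(struct vm *vm, ppn_t ppn, mpn_t old_mpn, int node) {
    invalidate_tlb_entries(old_mpn);    /* stop all access first     */
    mpn_t new_mpn = alloc_page_on_node(node);
    copy_page(new_mpn, old_mpn);        /* copy data to a local page */
    pmap_set(vm, ppn, new_mpn);         /* re-point phys -> machine  */
    free_page(old_mpn);                 /* old page can be reclaimed */
}
```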

  16. Continued… • Page replication: for a page read-shared by multiple nodes, Disco downgrades the TLB entries for the machine page to read-only, copies the page to the local node, and updates that node's TLB entries to point at the replica • Disco maintains a memmap data structure that contains an entry for each real machine memory page

  17. Page Replication • Disco uses the physical-to-machine mapping to replicate pages • Virtual pages from both CPUs of the same virtual machine map to the same physical page of their virtual machine • Disco transparently maps each virtual page to the machine-page replica that is local to the CPU's node (see the sketch below)
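
Putting slides 16 and 17 together, a replication operation might look like the following sketch: downgrade to read-only, copy to the local node, record the replica in memmap, and remap that node's vCPUs onto the copy. The helper names are assumptions.

```c
/* Replication sketch: a read-shared hot page gets a per-node copy.
 * The original is downgraded to read-only so replicas cannot
 * diverge; memmap records the bookkeeping per machine page. */
#include <stdint.h>

typedef uint64_t ppn_t;
typedef uint64_t mpn_t;
struct vm;

extern void  downgrade_to_readonly(mpn_t mpn);  /* all TLB entries -> RO */
extern mpn_t alloc_page_on_node(int node);
extern void  copy_page(mpn_t dst, mpn_t src);
extern void  memmap_add_replica(mpn_t orig, mpn_t replica);
extern void  remap_vcpus_on_node(struct vm *vm, int node,
                                 ppn_t ppn, mpn_t replica);

void replicate_page(struct vm *vm, ppn_t ppn, mpn_t mpn, int node) {
    downgrade_to_readonly(mpn);           /* writes now fault to Disco */
    mpn_t replica = alloc_page_on_node(node);
    copy_page(replica, mpn);
    memmap_add_replica(mpn, replica);     /* bookkeeping in memmap     */
    remap_vcpus_on_node(vm, node, ppn, replica); /* local CPUs use copy */
}
```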

  18. Virtual I/O Devices • To virtualize I/O devices, Disco intercepts all device accesses from a virtual machine and forwards them to the physical devices • In a system like Disco, the sequence looks something like this: • The VM executes an instruction to access an I/O device • A trap generated by the CPU (based on memory or privilege protection) transfers control to the VMM • The VMM emulates the I/O instruction, saving information about where it came from
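
The trap sequence above could be dispatched to a per-device handler along these lines. The vdevice interface and every helper here are invented for illustration, not Disco's real API.

```c
/* Sketch of the interception path: a guest load/store to device
 * memory traps, Disco decodes it, and forwards it to the virtual
 * device's handler. */
#include <stdint.h>

struct vcpu;   /* per-vCPU state, as sketched earlier */

struct vdevice {
    uint64_t base, len;                   /* emulated register range */
    uint64_t (*read)(struct vdevice *, uint64_t off);
    void     (*write)(struct vdevice *, uint64_t off, uint64_t val);
};

extern struct vdevice *find_vdevice(struct vcpu *v, uint64_t addr);
extern void set_dest_reg(struct vcpu *v, uint64_t val);
extern void advance_pc(struct vcpu *v);

/* Called when a load/store to device memory traps into Disco. */
void on_device_trap(struct vcpu *v, uint64_t addr, uint64_t val,
                    int is_write) {
    struct vdevice *dev = find_vdevice(v, addr);
    if (is_write)
        dev->write(dev, addr - dev->base, val);  /* forward the store */
    else
        set_dest_reg(v, dev->read(dev, addr - dev->base));
    advance_pc(v);                               /* skip trapped insn */
}
```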

  19. Copy-on-Write Disks • Disk reads can be serviced by the monitor; if the request size is a multiple of the machine page size, the monitor simply remaps the machine pages into the VM's physical memory instead of copying • The mapped pages are read-only, and an attempt to modify one generates a copy-on-write fault handled by the monitor • Read-only pages brought in from disk can be transparently shared between virtual machines, creating a global buffer cache shared across VMs and reducing the memory footprint
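
A sketch of the copy-on-write fault handler implied above, reusing the assumed pmap helpers from the earlier sketches: the faulting VM gets a private writable copy, while the shared page stays in the global buffer cache for other VMs.

```c
/* Copy-on-write sketch: shared read-only disk pages stay shared;
 * a write faults into Disco, which privatizes the page. */
#include <stdint.h>

typedef uint64_t ppn_t;
typedef uint64_t mpn_t;
struct vm;

extern mpn_t pmap_lookup(struct vm *vm, ppn_t ppn);
extern mpn_t alloc_page(void);
extern void  copy_page(mpn_t dst, mpn_t src);
extern void  pmap_set_writable(struct vm *vm, ppn_t ppn, mpn_t mpn);

void on_cow_fault(struct vm *vm, ppn_t ppn) {
    mpn_t shared = pmap_lookup(vm, ppn);  /* page in global buffer cache */
    mpn_t priv = alloc_page();
    copy_page(priv, shared);              /* private copy for this VM    */
    pmap_set_writable(vm, ppn, priv);     /* remap writable; other VMs   */
                                          /* keep sharing the original   */
}
```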

  20. Virtual Network Interface • The monitor's virtual network device remaps the data page from the source machine address to the destination machine address instead of copying it • For a transfer between VMs, the monitor remaps the data page from the driver's mbuf to the client's buffer cache (sketched below)
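
A zero-copy delivery step consistent with the slide might look like this sketch: the machine page holding the data is downgraded to read-only and mapped into the receiver rather than copied. All names are assumptions carried over from the earlier sketches.

```c
/* Zero-copy virtual network sketch: remap the data page from the
 * sender's physical address space into the receiver's. */
#include <stdint.h>

typedef uint64_t ppn_t;
typedef uint64_t mpn_t;
struct vm;

extern mpn_t pmap_lookup(struct vm *vm, ppn_t ppn);
extern void  downgrade_to_readonly(mpn_t mpn);
extern void  pmap_set_readonly(struct vm *vm, ppn_t ppn, mpn_t mpn);

/* Deliver a page-sized payload from src_ppn in the sender to
 * dst_ppn in the receiver without copying the data. */
void vnet_deliver(struct vm *sender, ppn_t src_ppn,
                  struct vm *receiver, ppn_t dst_ppn) {
    mpn_t mpn = pmap_lookup(sender, src_ppn);  /* page holding the data */
    downgrade_to_readonly(mpn);                /* shared, so no writes  */
    pmap_set_readonly(receiver, dst_ppn, mpn); /* map into the receiver */
}
```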

  21. Execution Overhead • Experiments ran on a uniprocessor, once with IRIX directly on the hardware and once with Disco running IRIX in a single virtual machine • The overhead of virtualization ranges from 3% to 16%

  22. Memory Overhead • Ran a single workload of eight different instances of pmake under six different system configurations • Effective sharing of kernel text and the buffer cache limits the memory overhead of running multiple VMs

  23. Scalability • Ran the pmake workload under six configurations • IRIX suffers from high synchronization overheads • Using a single VM has high overhead; increasing to eight VMs reduces execution time to 60% of the base IRIX time

  24. NUMA • The performance of a UMA machine determines the lower bound for the execution time on a NUMA machine • Disco achieves significant performance improvements by enhancing memory locality through page migration and replication

  25. Conclusion • System software for scalable shared-memory multiprocessors can be developed without massive development effort • Experimental results show that the overhead of virtualization is modest • Disco provides a solution for NUMA memory management
