
Presentation Transcript


  1. Cellular Disco Kinshuk Govil, Dan Teodosiu, Yongqiang Huang, Mendel Rosenblum Presented by: Sagnik Bhattacharya

  2. Overview • Problems of current shared memory multiprocessors and our requirements • Cellular Disco as a solution • architecture • prototype • hardware-fault containment • CPU management • Memory management • statistics • Cellular Disco and ubiquitous environments • Conclusion

  3. Problem • Extending modern operating systems to run efficiently on shared-memory multiprocessors. • Software development has not kept pace with hardware development. • Common operating systems fail to scale beyond about 12 processors.

  4. What we need… • The system should be reliable • It should be scalable • It should be fault-tolerant • It should not require excessive development time or effort

  5. Traditional approaches • Hardware partitioning - lacks resource sharing; creates rigid physical clusters. • Software-centric approaches (significant development time and cost): • modify the existing OS • develop a new OS

  6. A scenario… (figure): a Smart Space with a control unit and four processors - no rebooting necessary.

  7. Solution : Cellular Disco • Extension of previous work - Disco • Uses the concept of Virtual machine monitors • Partitions the multiprocessor system into virtual clusters.

  8. Virtual Machine Monitor (figure): the monitor runs directly on the hardware and hosts two virtual machines - VM1 running Win NT on virtual µP's 1, 2, 3 and VM2 running IRIX 6.2 on virtual µP's 1, 3, 5, 8.

  9. (figure) The guest OS on VM1 (Win NT, µP's 1, 2, 3) issues an I/O request; VM2 (IRIX 6.2, µP's 1, 3, 5, 8) runs alongside on the Virtual Machine Monitor.

  10. (figure) The Virtual Machine Monitor traps the I/O request and performs the I/O on behalf of the guest.

  11. (figure) The monitor completes the I/O and sends an interrupt back to the requesting virtual machine.

  12. (figure) Both virtual machines continue running on the Virtual Machine Monitor (VM1 - µP's 1, 2, 3; VM2 - µP's 1, 3, 5, 8).

  13. Issues it addresses • Scalability • NUMA awareness • Hardware fault-containment • Resource management

  14. Basic Cellular Disco Architecture

  15. Prototype • Runs on a 32-processor SGI Origin 2000 • Supports shared-memory systems based on the MIPS R10000 architecture • The prototype runs piggybacked on IRIX 6.4 • The host OS is made dormant and is only used to invoke some device drivers

  16. Hardware Virtualization • Physical resources - the resources visible to a virtual machine • Machine resources - the actual hardware resources, allocated by Cellular Disco • CD operates in the kernel mode of the MIPS processor • CD intercepts all privileged operations attempted by the guest OS
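
The distinction between physical and machine resources is easiest to see for memory. Below is a minimal C sketch, not taken from the Cellular Disco sources, of a per-VM "pmap"-style table that a monitor could use to translate the physical page numbers seen by the guest OS into the machine page numbers it actually allocated; all names, fields, and the lazy-allocation helper are illustrative assumptions.

    #include <stdint.h>
    #include <stddef.h>

    #define INVALID_MPN ((uint64_t)-1)

    /* Per-VM translation from guest "physical" page numbers (what the OS
     * sees) to machine page numbers (what the hardware really has). */
    struct vm_pmap {
        uint64_t *phys_to_machine;   /* indexed by physical page number */
        size_t    num_phys_pages;
    };

    /* On a fault trapped by the monitor, look up the machine page backing a
     * guest physical page; allocate one lazily if none is assigned yet. */
    uint64_t lookup_machine_page(struct vm_pmap *pmap, uint64_t ppn,
                                 uint64_t (*alloc_machine_page)(void))
    {
        if (ppn >= pmap->num_phys_pages)
            return INVALID_MPN;                    /* out-of-range access */
        if (pmap->phys_to_machine[ppn] == INVALID_MPN)
            pmap->phys_to_machine[ppn] = alloc_machine_page();
        return pmap->phys_to_machine[ppn];
    }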

  17. Resource Management • CPU management - each processor maintains its own run queue • Memory management - memory borrowing mechanism • Each OS instance is only given as many resources as it can handle. Large applications are split across virtual machines, and communication between the parts is established using shared-memory regions.

  18. CPU Management • VCPU migration: - Intra-node (37 µsec) - Inter-node (520 µsec) - Inter-cell (1520 µsec)
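
The three migration paths differ mainly in cost, so a balancer can treat the choice as a simple lookup. The C fragment below is illustrative only, not the actual Cellular Disco code: the microsecond figures come from the slide, while the struct layout and helper name are assumptions.

    /* Approximate VCPU migration costs from slide 18 (microseconds). */
    enum {
        COST_INTRA_NODE_US = 37,
        COST_INTER_NODE_US = 520,
        COST_INTER_CELL_US = 1520
    };

    struct cpu { int node_id; int cell_id; };

    /* Pick the cost class for moving a VCPU between two physical CPUs.
     * A balancer would migrate only when the expected gain outweighs this
     * cost plus the loss of cache affinity. */
    static int migration_cost_us(const struct cpu *from, const struct cpu *to)
    {
        if (from->cell_id != to->cell_id)
            return COST_INTER_CELL_US;
        if (from->node_id != to->node_id)
            return COST_INTER_NODE_US;
        return COST_INTRA_NODE_US;
    }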

  19. (figure) VCPU migration: cells connected by the Cellular Disco interconnect, each cell containing nodes of CPUs; a VCPU can be moved between CPUs.

  20. (figure) Intra-node migration: the VCPU moves between CPUs within the same node.

  21. (figure) Inter-node migration: the VCPU moves to a CPU on a different node within the same cell.

  22. (figure) Inter-cell migration: the VCPU moves to a CPU in a different cell across the interconnect.

  23. CPU Management (contd.) • CPU balancing: Idle balancer, Periodic balancer (a load-balancing scenario follows on the next slides)

  24. Idle balancer (figure): CPU3 is idle; CPU0 and CPU1 run VC A0 and VC A1, while CPU2's run queue holds VC B0 and VC B1. The idle balancer asks: does VC B1 still have enough cache affinity to CPU2?

  25. Idle balancer (figure): the answer is no - VC B1 no longer has enough cache affinity to CPU2.

  26. Idle balancer (figure): VC B1 is therefore migrated to the idle CPU3.
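
Putting slides 24-26 together, the idle balancer is essentially affinity-aware work stealing: an idle CPU scans a neighbour's run queue and takes only a VCPU that no longer has useful cache affinity there. The C sketch below illustrates that reading; the affinity window, struct fields, and function names are hypothetical.

    #include <stdbool.h>
    #include <stddef.h>

    #define AFFINITY_WINDOW 1000UL   /* hypothetical window, in scheduler ticks */

    struct vcpu {
        struct vcpu *next;
        unsigned long last_ran_time;  /* when it last ran on its current CPU */
    };

    struct runqueue { struct vcpu *head; };

    /* Treat a VCPU as cache-hot if it ran on its current CPU very recently;
     * a hot VCPU is not worth stealing because its cache state would be lost. */
    static bool cache_hot(const struct vcpu *v, unsigned long now)
    {
        return (now - v->last_ran_time) < AFFINITY_WINDOW;
    }

    /* Idle balancer: an idle CPU scans a neighbour's run queue and steals the
     * first VCPU that no longer has enough cache affinity (slides 24-26). */
    struct vcpu *idle_steal(struct runqueue *neighbour, unsigned long now)
    {
        struct vcpu **pp;
        for (pp = &neighbour->head; *pp != NULL; pp = &(*pp)->next) {
            if (!cache_hot(*pp, now)) {
                struct vcpu *victim = *pp;
                *pp = victim->next;       /* unlink from the neighbour's queue */
                victim->next = NULL;
                return victim;
            }
        }
        return NULL;                      /* nothing worth stealing; stay idle */
    }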

  27. Periodic Balancer • Does a depth-first traversal of the load tree (figure: root load 4, children 1 and 3, leaves 1, 0, 2, 1)

  28. Periodic Balancer • Checks the load difference of two siblings and ignores the pair if the difference is < 2 (figure: both leaf pairs have diff = 1)

  29. Periodic Balancer • If diff >= 2, performs load balancing when the benefit exceeds the cost (figure: the two subtrees differ by 2)
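
A compact way to read slides 27-29 is as a recursive walk over the load tree. The C sketch below is illustrative only: the difference-of-two rule and the benefit-versus-cost test come from the slides, while the tree representation and the stand-in benefit, cost, and migration helpers are assumptions.

    #include <stddef.h>

    /* One node of the load tree; leaves correspond to physical CPUs. */
    struct load_node {
        int load;                         /* runnable VCPUs in this subtree */
        struct load_node *left, *right;   /* NULL at the leaves */
    };

    /* Hypothetical stand-ins for the real benefit/cost estimate and for the
     * actual VCPU migration. */
    static int benefit(const struct load_node *heavy, const struct load_node *light)
    {
        return heavy->load - light->load;
    }
    static int cost(const struct load_node *heavy, const struct load_node *light)
    {
        (void)heavy; (void)light;
        return 1;                         /* stand-in migration cost */
    }
    static void move_one_vcpu(struct load_node *from, struct load_node *to)
    {
        from->load--; to->load++;         /* stand-in for the real migration */
    }

    /* Depth-first traversal (slide 27); sibling pairs with a load difference
     * below 2 are ignored (slide 28); otherwise balance when benefit > cost
     * (slide 29). */
    void periodic_balance(struct load_node *n)
    {
        if (n == NULL || n->left == NULL || n->right == NULL)
            return;
        int diff = n->left->load - n->right->load;
        if (diff >= 2 || diff <= -2) {
            struct load_node *heavy = diff > 0 ? n->left : n->right;
            struct load_node *light = diff > 0 ? n->right : n->left;
            if (benefit(heavy, light) > cost(heavy, light))
                move_one_vcpu(heavy, light);
        }
        periodic_balance(n->left);
        periodic_balance(n->right);
    }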

  30. Gang Scheduling • For each physical CPU we select the VCPU that is to run on it. • The VCPU selected is the highest-priority gang-runnable VCPU, i.e., all non-idle VCPUs of that VM are either • running, or • waiting on run queues of processors running lower-priority VMs.

  31. Example (figure): VM1 has VCPUs 1, 3, 8 (idle); VM2 has VCPUs 2, 4, 6 (idle), 7; VM3 has VCPUs 5, 9. Per-processor queues (currently executing VCPU first, then the wait queue in priority order): µP1: VC1 | VC7, VC5; µP2: VC2 | VC1, VC9; µP3: VC5 | VC3, VC4.

  32. Example (figure): the same configuration, now marking which VCPUs are gang-runnable.

  33. Example (figure): after rescheduling - µP1: VC5 | VC7, VC1; µP2: VC9 | VC1, VC2; µP3: VC5 | VC3, VC4 (new executing VCPU first, then the new wait queue).
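
The gang-runnable condition from slide 30 can be written as a simple predicate over the VCPUs of a VM. The sketch below is a hypothetical C rendering of that test, not the monitor's real scheduler code; the data structures, the fixed CPU count, and the priority array are assumptions.

    #include <stdbool.h>

    #define NUM_CPUS 32                   /* size of the prototype machine */

    enum vcpu_state { VCPU_IDLE, VCPU_RUNNING, VCPU_WAITING };

    struct gs_vcpu {
        enum vcpu_state state;
        int waiting_on_cpu;               /* valid only when VCPU_WAITING */
    };

    struct gs_vm {
        int priority;                     /* higher value = higher priority */
        struct gs_vcpu *vcpus;
        int nvcpus;
    };

    /* Priority of the VM currently running on each physical CPU. */
    int cpu_running_priority[NUM_CPUS];

    /* Slide 30: a VM is gang-runnable if every non-idle VCPU is either already
     * running, or waiting on the run queue of a processor that is currently
     * running a lower-priority VM (so the whole gang can start together). */
    bool gang_runnable(const struct gs_vm *vm)
    {
        for (int i = 0; i < vm->nvcpus; i++) {
            const struct gs_vcpu *v = &vm->vcpus[i];
            if (v->state == VCPU_IDLE || v->state == VCPU_RUNNING)
                continue;
            if (cpu_running_priority[v->waiting_on_cpu] >= vm->priority)
                return false;             /* queued behind an equal/higher VM */
        }
        return true;
    }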

  34. Memory Management • Each cell maintains its own freelist, and allocates memory to other cells in its allocation preference list on request (via RPC). • Speed - 758 µsec for 4 MB. • A threshold is set for the minimum amount of local free memory. • Paging is avoided as far as possible.

  35. Memory Borrowing • freelist - list of free pages in the cell • allocation preference list - list of cells from which borrowing memory is more beneficial than paging.

  36. Memory Borrowing (figure): freelist sizes of Cells 1-5, with a 32 MB lending threshold and a 16 MB borrowing threshold.

  37. Memory Borrowing (figure): a cell whose freelist has dropped below the 16 MB borrowing threshold asks another cell for memory.

  38. Memory Borrowing (figure): the request is refused - the asked cell is below the 32 MB lending threshold.

  39. Memory Borrowing (figure): a cell whose freelist is still above the 16 MB borrowing threshold cannot ask for memory.

  40. Memory Borrowing (figure): the borrowing cell asks the next cell on its allocation preference list.

  41. Memory Borrowing (figure): that cell is above the lending threshold and gives 4 MB.

  42. Memory Borrowing (figure): the resulting freelist sizes after the transfer.
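
Slides 36-42 describe the borrowing protocol in terms of two thresholds and a preference list. The C sketch below models that exchange under those rules; the 16 MB and 32 MB thresholds come from the slides, while the cell structure, the RPC being reduced to a direct function call, and the helper names are assumptions.

    #include <stdbool.h>
    #include <stddef.h>

    #define MB (1024UL * 1024UL)
    #define BORROW_THRESHOLD (16 * MB)    /* may ask only below this */
    #define LEND_THRESHOLD   (32 * MB)    /* may lend only while above this */

    struct cell {
        unsigned long freelist_bytes;     /* size of this cell's free page list */
        struct cell **pref_list;          /* allocation preference list */
        size_t npref;
    };

    /* Lender side of the RPC: refuse if lending would drop the freelist below
     * the lending threshold (slide 38: the request is refused). */
    static bool try_lend(struct cell *lender, unsigned long bytes)
    {
        if (lender->freelist_bytes < LEND_THRESHOLD + bytes)
            return false;
        lender->freelist_bytes -= bytes;
        return true;
    }

    /* Borrower side: once below the borrowing threshold, walk the allocation
     * preference list until some cell agrees to lend (slides 40-41: 4 MB). */
    bool borrow_memory(struct cell *c, unsigned long bytes)
    {
        if (c->freelist_bytes >= BORROW_THRESHOLD)
            return false;                 /* slide 39: may not ask yet */
        for (size_t i = 0; i < c->npref; i++) {
            if (try_lend(c->pref_list[i], bytes)) {
                c->freelist_bytes += bytes;
                return true;
            }
        }
        return false;                     /* all refused: fall back to paging */
    }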

  43. Memory Management (contd.) • Paging algorithm - second-chance FIFO • Page-sharing information is maintained in a control data structure • Cellular Disco traps all read and write requests made by the operating systems

  44. Second-chance FIFO • A reference bit is added to each page in the FIFO scheme • Every time a page is accessed, its bit is set to 1 • If a page is selected by FIFO and its reference bit is 1, the bit is cleared to 0 and another page is looked for • A page becomes the eviction target if it is selected by FIFO and its reference bit is 0

  45. Example (figure): a page fault occurs; the oldest page in the FIFO has its reference bit set to 1, the second-oldest has 0.

  46. Example (figure): second-chance FIFO clears the oldest page's reference bit to 0 and moves on to the next candidate.

  47. Example (figure): the page with reference bit 0 is evicted; the page that was given a second chance remains, now the oldest, with its bit cleared.
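
For reference, the eviction step of second-chance FIFO from slides 44-47 can be sketched in a few lines of C. This is a textbook rendering under the assumption of a circular, non-empty FIFO of resident pages, not Cellular Disco's actual pager.

    #include <stddef.h>

    struct page {
        int ref_bit;                      /* set to 1 whenever the page is accessed */
        struct page *next;                /* circular FIFO of resident pages */
    };

    /* Starting from the oldest page, clear the reference bit of any recently
     * used page and give it a second chance; evict the first page whose bit
     * is already 0. The circular list guarantees the loop terminates. */
    struct page *select_victim(struct page **fifo_oldest)
    {
        struct page *p = *fifo_oldest;
        for (;;) {
            if (p->ref_bit == 0) {
                *fifo_oldest = p->next;   /* the next page becomes the oldest */
                return p;
            }
            p->ref_bit = 0;               /* second chance */
            p = p->next;
        }
    }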

  48. Hardware fault-containment • The failure rate increases with the number of processors. • Cellular Disco is internally structured as a set of semi-independent cells. • A failure in one cell does not impact VMs running in other cells (localization of faults). • Assumption - CD itself is a trusted software layer.

  49. Cellular Structure (figure): a fault in one cell does not affect the others.

  50. Hardware fault-containment (contd.) • Inter-cell communication modes - fast inter-processor RPC - messages • Side benefit - software fault containment, i.e., an individual OS crash does not bring down the whole system.
