
Region scheduling

Region scheduling: a cache-aware scheduler for CMP environments. Abstract: the last-level cache is becoming more performance-critical. Hardware approaches include Intel's Smart Cache and NUCA (non-uniform cache architecture). The software approach is to manage caches and schedule better.

seth-horne

Presentation Transcript


  1. Region scheduling
     A cache-aware scheduler for CMP environments

  2. Abstract
     • The last-level cache is becoming more performance-critical
     • HW approaches
       • Intel's Smart Cache
       • NUCA (non-uniform cache architecture)
     • SW approach
       • Caches need to be managed, and scheduling improved
       • Uses the hypervisor’s ability to control memory access (regions)

  3. Abstract
     • Working set
       • How big is it? What does it consist of? How much of it is shared?
     • A ‘region’ is used to capture these working sets
     [Figure: current working sets of Tasks A, B, C, and D placed on two caches. Bad: one cache is over-utilized and the other is under-utilized. Good: both caches are fully utilized.]

  4. Region
     • Physical memory is partitioned into regions
     • Regions are allocated to caches
     • A core can access only its allowed regions
       • Enforced through page tables
     • Regions are private or shared
       • We focus on private regions
     [Figure: physical pages P1 to P19 (and beyond) grouped into regions R1(2), R2(2), R3(2), R4(2), R5(5), R6(4), R7(2), where the number in parentheses is the region size. Legend: private region, shared region, physical page.]
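The partitioning on this slide can be sketched in a few lines of Python. This is only an illustration, not the Xen implementation: it assumes regions cover consecutive physical pages in the order and sizes labelled on the slide, it borrows the Cache0/Cache1 core layout from the later examples, and the `REGION_TO_CACHE` allocation and all function names are hypothetical.

```python
def build_page_to_region(region_sizes):
    """Assign consecutive physical pages (1, 2, 3, ...) to regions in order."""
    page_to_region = {}
    next_page = 1
    for region_id, size in region_sizes.items():
        for page in range(next_page, next_page + size):
            page_to_region[page] = region_id
        next_page += size
    return page_to_region


# Region sizes as labelled on the slide: R1(2), R2(2), ..., R7(2).
REGION_SIZES = {"R1": 2, "R2": 2, "R3": 2, "R4": 2, "R5": 5, "R6": 4, "R7": 2}
PAGE_TO_REGION = build_page_to_region(REGION_SIZES)

# Hypothetical allocation of regions to caches (not given in the slides).
REGION_TO_CACHE = {"R1": 0, "R2": 1, "R3": 0, "R4": 1, "R5": 0, "R6": 1, "R7": 0}

# Core layout used in the later examples: Cache0 serves PCPU0 and PCPU2,
# Cache1 serves PCPU1 and PCPU3.
PCPU_TO_CACHE = {0: 0, 2: 0, 1: 1, 3: 1}


def can_access(pcpu, page):
    """A core may touch a page only if the page's region lives in its cache."""
    return REGION_TO_CACHE[PAGE_TO_REGION[page]] == PCPU_TO_CACHE[pcpu]
```

With this layout, `can_access(0, 1)` is true because R1 is allocated to Cache0, while `can_access(1, 1)` is false.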

  5. Region
     • Regions are implemented in Xen
       • Transparent to the guest
     • The guest’s memory accesses are controlled
       • Enforced through page tables
     • Caches are effectively managed by Xen
     • Regions can change dynamically with the guest’s behavior
       • E.g., an application’s phase change

  6. Region
     • Page tables are managed to provide a ‘page touch’
     • A page touch is generated when a VCPU illegally accesses a non-allocated region
     • A page touch invokes microscheduling
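A minimal sketch of the page-touch mechanism, assuming each cache has its own page-table view in which only the pages of allocated regions are present. The class and function names are hypothetical, and a Python exception stands in for the hardware fault that the hypervisor would intercept.

```python
class PageTouch(Exception):
    """Raised when a VCPU touches a page that is not mapped for its cache."""

    def __init__(self, vcpu, page):
        super().__init__(f"VCPU{vcpu} touched unmapped page {page}")
        self.vcpu = vcpu
        self.page = page


class CachePageTable:
    """Per-cache view of guest memory: only allocated regions are present."""

    def __init__(self):
        self.present = set()

    def map_region(self, pages):
        self.present.update(pages)

    def unmap_region(self, pages):
        self.present.difference_update(pages)

    def access(self, vcpu, page):
        if page not in self.present:
            raise PageTouch(vcpu, page)
        return page


def run_access(table, vcpu, page, on_page_touch):
    """Perform one access; hand any page touch to the scheduler callback."""
    try:
        table.access(vcpu, page)
        return "hit"
    except PageTouch as touch:
        on_page_touch(touch)  # this is where microscheduling would be invoked
        return "page touch"
```

Passing a callback keeps detection (the page table) separate from policy (the scheduler), which mirrors the split the slide describes.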

  7. Region
     • Regions are allocated to caches
     • A VCPU may need to run on another core to access a region
       • This is microscheduling

  8. Example 1 (single VCPU)
     • Region ID 0x1884f is allocated to Cache0
       • It can be accessed only from PCPU0 and PCPU2
     • Region ID 0x12b44 is allocated to Cache1
       • It can be accessed only from PCPU1 and PCPU3
     • When a VCPU wants to access a certain region, it may need to run on another core
     [Figure: microscheduling trace of VCPU : region ID.]
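The decision in this example can be sketched as follows. The region IDs and core layout come from the slide; the `microschedule` function and its preference for a non-busy core are assumptions made for illustration, since the slides do not say how the target core is chosen.

```python
# Allocation taken from the slide.
REGION_TO_CACHE = {0x1884F: 0, 0x12B44: 1}
CACHE_TO_PCPUS = {0: [0, 2], 1: [1, 3]}


def microschedule(current_pcpu, region_id, busy=frozenset()):
    """Return the PCPU a VCPU should run on to access region_id.

    Hypothetical policy: stay put if the current core already shares the
    owning cache; otherwise prefer a core on that cache that is not busy.
    """
    candidates = CACHE_TO_PCPUS[REGION_TO_CACHE[region_id]]
    if current_pcpu in candidates:
        return current_pcpu
    for pcpu in candidates:
        if pcpu not in busy:
            return pcpu
    return candidates[0]  # every candidate is busy: fall back to the first
```

For instance, a VCPU on PCPU0 that touches region 0x12b44 is moved to PCPU1, or to PCPU3 if PCPU1 is busy.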

  9. Example 2 (multiple VCPUs)
     • Cache0 (PCPU0, 2) holds 0x13d83, 0xd7d4, 0xf3a6, and so on
     • Cache1 (PCPU1, 3) holds 0xb638, 0xcd4b, and so on
     [Figure: region accesses over time.]

  10. Example 2 (multiple VCPUs)
      • Each region is allocated to a cache
      • E.g., VCPU0 is microscheduled to PCPU1 or PCPU3 to access region 0xb638
      [Figure labels, grouped in extracted order: VCPU0: 0xf3a6, 0xb638. VCPU1: 0xfcad, 0xcd4b. VCPU2: 0x13d83, 0x1225e. VCPU3: 0xb638, 0xf3a6, 0x3d92e, 0x1909b, 0xd7d4. Cache0 (PCPU0&2): 0x13d83, 0xf3a6, 0x3d92e, 0x1909b, 0xd7d4, 0x18d38. Cache1 (PCPU1&3): 0x1225e, 0xb638, 0xfcad, 0xcd4b.]
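To make the VCPU0 case concrete, the sketch below counts how many microscheduling moves a sequence of region accesses would cause. It uses only the region-to-cache allocation stated in the bullets of Example 2; the function name, the starting cache, and the idea of replaying a trace are assumptions for illustration.

```python
# Allocation stated in the bullets of Example 2.
REGION_CACHE = {
    0x13D83: 0, 0xD7D4: 0, 0xF3A6: 0,  # held by Cache0 (PCPU0, PCPU2)
    0xB638: 1, 0xCD4B: 1,              # held by Cache1 (PCPU1, PCPU3)
}


def count_microschedules(trace, start_cache):
    """Count the cache changes a VCPU makes while replaying a region trace."""
    moves = 0
    current = start_cache
    for region_id in trace:
        target = REGION_CACHE[region_id]
        if target != current:
            moves += 1  # page touch: the VCPU is moved to the owning cache
            current = target
    return moves


# VCPU0 accesses 0xf3a6 and then 0xb638, assumed to start on Cache0.
vcpu0_moves = count_microschedules([0xF3A6, 0xB638], start_cache=0)
```

Here `vcpu0_moves` is 1: the access to 0xf3a6 is served by Cache0, and the access to 0xb638 forces a move to a core on Cache1.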

  11. Initial results
      • Simple experiments show good results
        • Ran multiple copies of SPEC2006 libquantum
        • Over 40% reduction in CPU time
      • More experiments are needed
        • Data is on the way

  12. Conclusion
      • A SW approach to managing caches
      • Transparent to the guests
      • Memory accesses are controlled by the hypervisor
