
Processes Scheduling on Heterogeneous Multi-core Architecture with Hardware Support

Shouqing Hao, Institute of Computing Technology, Chinese Academy of Sciences, haoshouqing@ict.ac.cn


Presentation Transcript


  1. Processes Scheduling on Heterogeneous Multi-core Architecture with Hardware Support. Shouqing Hao, Institute of Computing Technology, Chinese Academy of Sciences, haoshouqing@ict.ac.cn

  2. Contents. Introduction; hardware support for LLC-miss latency; LA-ACMP scheduling algorithm; evaluation and analysis.

  3. Introduction. Heter-CMP (heterogeneous chip multi-processor): composed of some big cores and some small cores. Big cores: large area, high power, high performance; suited to CPU-bound programs, serial programs, and so on. Small cores: small area, low power, low performance; suited to memory-bound programs, parallel programs, and so on. Advantages: makes good use of chip resources; reduces wasted power and performance. Challenges: identifying applications' behavior while they execute; scheduling each program onto a suitable core.

  4. Hardware Support (1). Identifying programs' behavior: last-level cache (LLC) miss latency. An LLC miss results in a memory access. Memory accesses incur high latency, which lowers a program's efficiency while it executes and keeps it from making full use of the core's performance. Scheduling rules: programs with high LLC miss latency should be scheduled to small cores; programs with low LLC miss latency should be scheduled to big cores.

  5. Hardware Support (2). Identifying programs' behavior: last-level cache (LLC) miss latency. Mechanism: the LLC miss delay is the period between a miss request and its miss response; misses may be non-overlapped or overlapped. The LLC miss latency is recorded for each core with hardware support.
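
The distinction between overlapped and non-overlapped misses can be illustrated with a small software model. This is only a sketch under stated assumptions: the slide does not say how the hardware combines overlapping misses, so the interval-merging rule, the function name `miss_latency`, and the cycle values are all hypothetical.

```python
def miss_latency(misses):
    """Model LLC miss delay for one core.

    misses: list of (request_cycle, response_cycle) pairs (hypothetical input).
    Returns (summed, merged):
      summed - every miss counted in full, even when misses overlap
      merged - cycles during which at least one miss was outstanding
    """
    summed = sum(resp - req for req, resp in misses)

    merged = 0
    cur_start = cur_end = None
    for req, resp in sorted(misses):
        if cur_end is None or req > cur_end:
            # This miss starts after the previous window closed.
            if cur_end is not None:
                merged += cur_end - cur_start
            cur_start, cur_end = req, resp
        else:
            # This miss overlaps the current window: extend it.
            cur_end = max(cur_end, resp)
    if cur_end is not None:
        merged += cur_end - cur_start
    return summed, merged


# Two overlapping misses followed by one isolated miss.
summed, merged = miss_latency([(0, 100), (50, 120), (200, 260)])
```

In this example the first two misses overlap, so the merged figure is smaller than the plain sum; which of the two the hardware actually records is not stated in the slides.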

  6. Hardware Support (3). Implemented on Godson-3A. LLC miss requests and responses are recorded for each core with hardware support.

  7. Hardware Support (4)

  8. LA-ACMP Scheduling Algorithm (1). LA-ACMP: Latency-Aware Asymmetry CMP. Identifying the heterogeneity of cores: based on Linux kernel 2.6.18; the BogoMIPS value of each core is calculated to evaluate that core's performance. Workload assignment balance: uses the scaled load method, L = N/P, where L is a core's scaled load, N is the number of workloads in its queue, and P is the processor's performance. If Lmax - Lmin <= 1, the workload assignment is balanced.
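
The scaled-load check on this slide can be sketched in a few lines. The slide does not say how P is normalized, so this sketch assumes P is each core's BogoMIPS divided by that of the slowest core; the function names and the BogoMIPS figures are hypothetical.

```python
def relative_performance(bogomips):
    """Assumed normalization: performance relative to the slowest core."""
    slowest = min(bogomips)
    return [b / slowest for b in bogomips]


def scaled_loads(queue_lengths, performance):
    """L = N / P for each core."""
    return [n / p for n, p in zip(queue_lengths, performance)]


def is_balanced(queue_lengths, performance):
    """Balanced when Lmax - Lmin <= 1."""
    loads = scaled_loads(queue_lengths, performance)
    return max(loads) - min(loads) <= 1


# One fast core and three slow cores (made-up BogoMIPS values).
perf = relative_performance([1000.0, 500.0, 500.0, 500.0])

balanced = is_balanced([4, 2, 2, 2], perf)    # fast core holds twice the tasks
unbalanced = is_balanced([6, 1, 1, 1], perf)  # fast core is overloaded
```

Under this normalization a core that is twice as fast is expected to hold about twice as many queued tasks before the assignment counts as unbalanced.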

  9. LA-ACMP Scheduling Algorithm (2). LLC-delay buffer: each run queue is extended with an LLC-delay buffer that saves each task's LLC miss latency.

  10. LA-ACMP Scheduling Algorithm (3). Updating the LLC-delay buffer: when a thread starts running, clear its LLC-delay value; when it exhausts its time slice, save its LLC-delay value; when a thread is migrated from queue A to queue B, migrate its LLC-delay value as well.
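
The three update rules can be modeled with a toy run queue. This is a user-space sketch, not the kernel implementation: the class `RunQueue`, its method names, and the reading of "clear" as resetting the saved value to zero are assumptions made for illustration.

```python
class RunQueue:
    """Toy run queue with an attached LLC-delay buffer (hypothetical model)."""

    def __init__(self):
        self.tasks = []      # task ids in queue order
        self.llc_delay = {}  # task id -> saved LLC miss latency

    def enqueue(self, tid, delay=0):
        self.tasks.append(tid)
        self.llc_delay[tid] = delay

    def on_run(self, tid):
        # Rule 1: when the thread starts running, clear its value.
        self.llc_delay[tid] = 0

    def on_timeslice_expired(self, tid, measured_delay):
        # Rule 2: when the time slice is exhausted, save the measured value.
        self.llc_delay[tid] = measured_delay


def migrate(src, dst, tid):
    # Rule 3: the saved LLC-delay value moves together with the thread.
    src.tasks.remove(tid)
    dst.enqueue(tid, src.llc_delay.pop(tid))


queue_a, queue_b = RunQueue(), RunQueue()
queue_a.enqueue("t1")
queue_a.on_run("t1")
queue_a.on_timeslice_expired("t1", 1200)  # made-up latency in cycles
migrate(queue_a, queue_b, "t1")
```

Keeping the value with the thread means the destination queue can use it immediately, without waiting for the thread to run another full time slice.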

  11. LA-ACMP Scheduling Algorithm (4). The LA-ACMP algorithm is executed when load balance is checked, and it does not destroy the balance.
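
The slides do not spell out the algorithm's steps. One way to honor both the scheduling rules (high LLC miss latency to small cores, low latency to big cores) and the requirement not to destroy balance is a pairwise swap, which leaves every queue length unchanged. The sketch below is that assumption only, not the authors' method; the function name and the latency values are hypothetical.

```python
def latency_aware_swap(big, small):
    """Swap one task pair between a big-core and a small-core queue.

    big, small: dicts mapping task id -> saved LLC miss latency.
    Moves the highest-latency task off the big core and the lowest-latency
    task off the small core, but only if that improves the placement.
    Queue lengths are unchanged, so the scaled load N/P is preserved.
    Returns the swapped pair, or None if no swap was made.
    """
    if not big or not small:
        return None
    hi = max(big, key=big.get)      # most memory-bound task on the big core
    lo = min(small, key=small.get)  # least memory-bound task on the small core
    if big[hi] <= small[lo]:
        return None                 # placement already follows the rules
    small[hi] = big.pop(hi)
    big[lo] = small.pop(lo)
    return hi, lo


big_queue = {"a": 900, "b": 100}
small_queue = {"c": 50, "d": 700}
first = latency_aware_swap(big_queue, small_queue)   # swaps "a" and "c"
second = latency_aware_swap(big_queue, small_queue)  # nothing left to swap
```

A swap was chosen here over a one-way migration because a migration changes N on both queues and could undo the balance that was just established.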

  12. Evaluation and analysis (1). Platform: Godson-3A-heter; four cores, one running at 1 GHz and three running at 500 MHz; asynchronous FIFOs are used for synchronization. Benchmark: SPEC CPU2000.

  13. Evaluation and analysis (2). Applications' execution speedup, compared to the original OS: with LLC miss rate, a 15.4% performance improvement; with LLC miss delay, a 19.8% performance improvement. Application groups with higher heterogeneity get a higher performance improvement: the third group shows the highest improvement and the second group the lowest.

  14. Thanks!
