Shouqing Hao Institute of Computing Technology, Chinese Academy of Sciences haoshouqing@ict.ac

Processes Scheduling on Heterogeneous Multi-core Architecture with Hardware Support
Shouqing Hao
Institute of Computing Technology, Chinese Academy of Sciences
haoshouqing@ict.ac.cn

Contents
- Introduction
- Hardware support for LLC-miss latency
- LA-ACMP scheduling algorithm
- Evaluation and analysis

Introduction
- Heter-CMP: Heterogeneous Chip Multi-Processor
  - Composed of some big cores and some small cores
  - Big cores: large area, high power, high performance
    - Suited to CPU-bound programs, serial programs, ...
  - Small cores: small area, low power, low performance
    - Suited to memory-bound programs, parallel programs, ...
- Advantages
  - Makes good use of chip resources
  - Reduces wasted power and performance
- Challenges
  - Identify applications' behavior during execution
  - Schedule each program to a suitable core

Hardware Support (1)
- Identify programs' behavior
  - Last-level cache (LLC) miss latency
  - LLC miss → memory access
  - Memory accesses incur high latency
  - This lowers a program's efficiency while it executes
  - The program cannot make full use of the core's performance
- Scheduling rules
  - Programs with high LLC miss latency should be scheduled to small cores
  - Programs with low LLC miss latency should be scheduled to big cores
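The two scheduling rules can be illustrated with a small Python sketch. The slides give no concrete cutoff, so `LATENCY_THRESHOLD`, the `Program` type, and the sample numbers are all hypothetical and serve only to show the direction of the decision.

```python
from dataclasses import dataclass

# Hypothetical cutoff (cycles per time slice); the slides do not give a value.
LATENCY_THRESHOLD = 1_000_000


@dataclass
class Program:
    name: str
    llc_miss_latency: int  # cycles spent waiting on LLC misses in the last slice


def preferred_core(program, threshold=LATENCY_THRESHOLD):
    """Return "small" for high-latency programs and "big" for low-latency ones."""
    return "small" if program.llc_miss_latency > threshold else "big"


# Made-up sample programs, one memory-bound and one CPU-bound.
programs = [Program("mem_bound", 5_000_000), Program("cpu_bound", 20_000)]
placement = {p.name: preferred_core(p) for p in programs}
```

A fixed threshold is only one way to split programs; ranking them by latency relative to each other would follow the same rules.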

Hardware Support (2)
- Identify programs' behavior
  - Last-level cache (LLC) miss latency
- Mechanism
  - LLC miss delay is the period between a miss request and its miss response
  - Un-overlapped, overlapped
  - LLC miss latency is recorded for each core, with hardware support
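The slide does not spell out how overlapped and un-overlapped misses are accounted for. The Python sketch below shows one possible reading, as an assumption: each miss is a (request, response) pair of cycle counts, and cycles covered by several outstanding misses are counted once. The function name and sample timestamps are hypothetical, and the real mechanism is a hardware counter, not software.

```python
def miss_latency(intervals):
    """intervals: list of (request_cycle, response_cycle) pairs for one core.

    Returns (summed, merged):
      summed adds every miss delay on its own;
      merged counts cycles covered by overlapping misses only once.
    """
    summed = sum(resp - req for req, resp in intervals)

    merged = 0
    cur_start = cur_end = None
    for req, resp in sorted(intervals):
        if cur_end is None or req > cur_end:
            # A new, non-overlapping miss window starts here.
            if cur_end is not None:
                merged += cur_end - cur_start
            cur_start, cur_end = req, resp
        else:
            # This miss overlaps the current window; extend it if needed.
            cur_end = max(cur_end, resp)
    if cur_end is not None:
        merged += cur_end - cur_start
    return summed, merged


# Two overlapping misses (100-300, 200-400) and one isolated miss (1000-1150).
summed, merged = miss_latency([(100, 300), (200, 400), (1000, 1150)])
```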

Hardware Support (3)
- Implemented on Godson-3A
- LLC miss requests and responses are recorded for each core, with hardware support

Hardware Support (4)

LA-ACMP Scheduling Algorithm (1)
- LA-ACMP: Latency-Aware Asymmetry CMP
- Identify heterogeneity of cores
  - Based on Linux kernel 2.6.18
  - Calculate the BogoMIPS value of each core to evaluate its performance
- Workload assignment balance
  - Uses the scaled-load method
  - L = N/P: each core's scaled load
    - N: number of workloads in the core's queue
    - P: the processor's performance
  - If Lmax - Lmin <= 1, the workload assignment is balanced
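The scaled-load balance test can be written out as a short Python sketch. The slides derive P from each core's BogoMIPS value; how P is normalised is not stated, so the relative values 2.0 and 1.0 below are assumptions chosen to mirror one fast core and three slow ones.

```python
def scaled_load(n_tasks, performance):
    """L = N / P for one core."""
    return n_tasks / performance


def is_balanced(queue_lengths, performances):
    """True when Lmax - Lmin <= 1 across all cores."""
    loads = [scaled_load(n, p) for n, p in zip(queue_lengths, performances)]
    return max(loads) - min(loads) <= 1


# Assumed relative performance: one fast core (2.0) and three slow cores (1.0).
perf = [2.0, 1.0, 1.0, 1.0]

balanced = is_balanced([4, 2, 2, 2], perf)    # scaled loads 2, 2, 2, 2
unbalanced = is_balanced([1, 3, 1, 1], perf)  # scaled loads 0.5, 3, 1, 1
```

With scaled load, the fast core is expected to hold proportionally more tasks before the system counts as unbalanced.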

LA-ACMP Scheduling Algorithm (2)
- LLC-delay buffer
  - Each run queue is extended with an LLC-delay buffer
  - The buffer saves each task's LLC miss latency

LA-ACMP Scheduling Algorithm (3)
- Update the LLC-delay buffer
  - When a thread starts running, clear its LLC-delay value
  - When a thread exhausts its time slice, save its LLC-delay value
  - When a thread migrates from queue A to queue B, migrate its LLC-delay value as well
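The three update rules can be modelled in a few lines of Python. This is a user-space sketch, not the kernel code: the `RunQueue` class, its method names, and the value 4200 are hypothetical, and the buffer is modelled as a plain dictionary keyed by task.

```python
class RunQueue:
    """A run queue extended with an LLC-delay buffer (task -> saved latency)."""

    def __init__(self, name):
        self.name = name
        self.tasks = []
        self.llc_delay = {}

    def add(self, task, delay=0):
        self.tasks.append(task)
        self.llc_delay[task] = delay

    def on_run(self, task):
        # Rule 1: clear the value when the thread starts running.
        self.llc_delay[task] = 0

    def on_slice_expired(self, task, measured_delay):
        # Rule 2: save the latency measured over the finished time slice.
        self.llc_delay[task] = measured_delay


def migrate(task, src, dst):
    # Rule 3: the LLC-delay value moves together with the thread.
    src.tasks.remove(task)
    dst.add(task, src.llc_delay.pop(task))


queue_a, queue_b = RunQueue("A"), RunQueue("B")
queue_a.add("t1")
queue_a.on_run("t1")
queue_a.on_slice_expired("t1", 4200)  # made-up measured latency
migrate("t1", queue_a, queue_b)
```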

LA-ACMP Scheduling Algorithm (4)
- LA-ACMP algorithm
  - Executed when the scheduler checks load balance
  - Does not destroy the balance
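The slide does not describe the algorithm's steps. One way to move tasks by latency without disturbing the balance is a pairwise swap, which leaves every queue length unchanged. The Python sketch below illustrates that idea only; it is an assumption, not the author's algorithm, and `swap_by_latency` and the sample latencies are hypothetical.

```python
def swap_by_latency(big_queue, small_queue):
    """Each queue maps task name -> saved LLC miss latency.

    Swap the highest-latency task on the big core with the lowest-latency
    task on the small core when that improves placement. Queue lengths stay
    the same, so an existing load balance is preserved.
    Returns True if a swap happened.
    """
    if not big_queue or not small_queue:
        return False
    hi = max(big_queue, key=big_queue.get)
    lo = min(small_queue, key=small_queue.get)
    if big_queue[hi] <= small_queue[lo]:
        return False  # placement already follows the latency rule
    small_queue[hi] = big_queue.pop(hi)
    big_queue[lo] = small_queue.pop(lo)
    return True


big = {"a": 900, "b": 50}
small = {"c": 30, "d": 700}
swapped = swap_by_latency(big, small)
```

After the call, the high-latency task `a` sits on the small core and the low-latency task `c` on the big core, with two tasks per queue as before.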

Evaluation and Analysis (1)
- Platform
  - Godson-3A-heter
  - Four cores: one runs at 1 GHz, three run at 500 MHz
  - Uses asynchronous FIFOs for synchronization
- Benchmark
  - SPEC CPU2000

Evaluation and Analysis (2)
- Applications' execution speedup
  - Compared to the original OS
  - LLC miss rate: 15.4% performance improvement
  - LLC miss delay: 19.8% performance improvement
- Application groups with higher heterogeneity get a larger performance improvement
  - The third group has the highest improvement
  - The second group has the lowest improvement

Thanks!
