This paper explores the integration of machine learning techniques to improve compiler heuristics in optimization tasks. Given the complexity of modern architectures and the NP-hard nature of many compiler problems, traditional heuristic approaches often fall short due to their reliance on trial-and-error. The authors propose using machine learning to search the priority function space automatically, thereby enhancing the quality and efficiency of solutions. Through case studies like hyperblock formation and register allocation, the effectiveness of these methodologies is demonstrated.
Meta Optimization: Improving Compiler Heuristics with Machine Learning
Mark Stephenson, Una-May O’Reilly, Martin Martin, and Saman Amarasinghe
MIT Computer Architecture Group
http://www.cag.lcs.mit.edu/metaopt
Motivation
• Compiler writers are faced with many challenges:
  • Many compiler problems are NP-hard
  • Modern architectures are inextricably complex
  • Simple models can’t capture architecture intricacies
  • Micro-architectures change quickly
Motivation
• Heuristics alleviate complexity woes
  • Find good approximate solutions for a large class of applications
  • Find solutions quickly
• Unfortunately…
  • They require a lot of trial-and-error tweaking to achieve suitable performance
Priority Functions
• A heuristic’s Achilles heel
  • A single priority or cost function often dictates the efficacy of a heuristic
• Priority functions rank the options available to a compiler heuristic
  • Graph coloring register allocation (selecting nodes to spill)
  • List scheduling (identifying instructions in worklist to schedule first)
  • Hyperblock formation (selecting paths to include)
Machine Learning
• We propose using machine learning techniques to automatically search the priority function space
  • Search space is feasible
  • Make use of spare computer cycles
Case Study I: Hyperblock Formation
• Find predicatable regions of control flow
• Enumerate paths of control in region
  • Exponential, but in practice it’s okay
• Prioritize paths based on several characteristics
  • The priority function we want to optimize
• Add paths to hyperblock in priority order
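The last two steps above (prioritize paths, then add them in priority order) can be sketched as a greedy selection. This is a minimal illustration, not IMPACT's implementation: the `Path` fields, the `op_budget` stopping rule, and the toy priority `exec_ratio / num_ops` are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Path:
    name: str
    exec_ratio: float  # hypothetical: fraction of region executions taking this path
    num_ops: int       # hypothetical: number of operations on the path


def select_paths(paths: List[Path],
                 priority: Callable[[Path], float],
                 op_budget: int) -> List[Path]:
    """Visit paths in descending priority order, adding each one that still fits."""
    chosen: List[Path] = []
    used = 0
    for p in sorted(paths, key=priority, reverse=True):
        # Assumed resource check: a simple cap on total operations.
        if used + p.num_ops <= op_budget:
            chosen.append(p)
            used += p.num_ops
    return chosen


paths = [Path("A-B-D", 0.70, 12), Path("A-C-D", 0.25, 30), Path("A-C-E", 0.05, 8)]
picked = select_paths(paths, lambda p: p.exec_ratio / p.num_ops, op_budget=25)
print([p.name for p in picked])  # ['A-B-D', 'A-C-E']
```

Only the `priority` argument changes when a different priority function is tried, which is what makes it a convenient target for automatic search.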
Case Study I: IMPACT’s Function
• Favor frequently executed paths
• Favor short paths
• Penalize paths with hazards
• Favor parallel paths
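The four callouts annotate a formula that did not survive on this slide. As a hedged reconstruction, the form usually quoted for IMPACT's hyperblock priority is `exec_ratio * h * (2.1 - d_ratio - o_ratio)`, with `h = 0.25` for a path containing a hazard and `1` otherwise. The constants, the parameter names, and the normalization by region maxima below are recalled from the published description, not taken from the slide, so treat them as assumptions.

```python
def impact_priority(exec_ratio, dep_height, num_ops, has_hazard,
                    max_dep_height, max_num_ops):
    """Sketch of an IMPACT-style path priority (constants are assumptions)."""
    h = 0.25 if has_hazard else 1.0          # penalize paths with hazards
    d_ratio = dep_height / max_dep_height    # tall dependence chains are less parallel
    o_ratio = num_ops / max_num_ops          # long paths score lower
    return exec_ratio * h * (2.1 - d_ratio - o_ratio)  # frequent paths score higher


clean = impact_priority(0.8, 4, 10, False, max_dep_height=8, max_num_ops=20)
hazardous = impact_priority(0.8, 4, 10, True, max_dep_height=8, max_num_ops=20)
print(clean > hazardous)  # True
```

Each factor corresponds to one callout: `exec_ratio` favors frequently executed paths, `o_ratio` favors short paths, `h` penalizes hazards, and `d_ratio` favors parallel paths.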
Hyperblock Formation
• What are the important characteristics of a hyperblock formation priority function?
  • IMPACT uses four characteristics
• Extract all the characteristics you can think of and have a machine learning algorithm find the priority function
Genetic Programming
• GP’s representation is a directly executable expression
  • Basically a Lisp expression (or an AST)
• In our case, GP variables are interesting characteristics of the program
[Figure: expression tree with nodes *, /, -, predictability, num_ops, 2.3, 4.1]
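To make "directly executable expression" concrete, here is a minimal sketch of an evaluator for such trees. The nested-tuple encoding, the `evaluate` helper, and the feature values are illustrative assumptions. The tree shape is also a guess: it reads the extracted node list as `(* (/ predictability num_ops) (- 2.3 4.1))`.

```python
import operator

# Binary operators allowed at internal nodes of the tree.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def evaluate(expr, env):
    """Evaluate a nested-tuple expression tree against feature values in env."""
    if isinstance(expr, tuple):      # internal node: (operator, left, right)
        op, left, right = expr
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(expr, str):        # variable: a program characteristic
        return env[expr]
    return float(expr)               # numeric constant


tree = ("*", ("/", "predictability", "num_ops"), ("-", 2.3, 4.1))
value = evaluate(tree, {"predictability": 0.9, "num_ops": 3})
print(value)
```

Because the tree is plain data, a search procedure can rewrite it freely and the compiler can still call it as its priority function.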
Genetic Programming
• Search algorithm analogous to natural selection
  • Maintain a population of expressions
• Selection
  • The fittest expressions in the population are more likely to reproduce
• Sexual reproduction
  • Crossing over subexpressions of two expressions
• Mutation
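Crossover and mutation on expression trees can be sketched as follows. This assumes a nested-tuple encoding `(op, child, ...)` chosen for illustration. The mutation shown is a simplification that swaps in a random terminal, whereas GP systems commonly generate a whole random subtree. All helper names are hypothetical.

```python
import random


def subtrees(expr, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of child indices."""
    yield path, expr
    if isinstance(expr, tuple):
        for i, child in enumerate(expr[1:], start=1):  # index 0 holds the operator
            yield from subtrees(child, path + (i,))


def replace_at(expr, path, new):
    """Return a copy of expr with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return expr[:i] + (replace_at(expr[i], path[1:], new),) + expr[i + 1:]


def crossover(a, b, rng):
    """Graft a randomly chosen subtree of b onto a randomly chosen point in a."""
    point, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    return replace_at(a, point, donor)


def mutate(expr, rng, terminals):
    """Replace one randomly chosen node with a random terminal (simplified)."""
    point, _ = rng.choice(list(subtrees(expr)))
    return replace_at(expr, point, rng.choice(terminals))


rng = random.Random(0)
parent_a = ("mul", "exec_ratio", ("sub", 2.1, "num_ops"))
parent_b = ("div", "num_ops", "dependence_height")
child = crossover(parent_a, parent_b, rng)
mutant = mutate(child, rng, ["exec_ratio", "num_ops", 1.0])
print(child, mutant)
```

Both operators return new trees and leave the parents untouched, so fit parents can survive alongside their offspring.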
Genetic Programming
Flowchart: Create initial population (initial solutions), then repeat Evaluation, Selection, and Generation of variants (mutation and crossover) while Generations < Limit, then END
• Create initial population
  • Most expressions in the initial population are randomly generated
  • It is also seeded with the compiler writer’s best guesses
• Evaluation
  • Each expression is evaluated by compiling and running benchmark(s)
  • Fitness is the relative speedup over the baseline on benchmark(s)
  • The baseline expression is the one that’s distributed with Trimaran
• Selection
  • Just as with natural selection, the fittest individuals are more likely to survive and reproduce
• Generation of variants
  • Use crossover and mutation to generate new expressions
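The loop in the flowchart can be sketched generically. Several things here are stand-ins and not the authors' setup: candidates are single floats instead of expressions, `toy_time` replaces compiling and running benchmarks, and selection is plain truncation (keep the fitter half), not the probabilistic selection the slides describe.

```python
import random


def relative_speedup(baseline_time, candidate_time):
    """Fitness: speedup of the candidate relative to the baseline."""
    return baseline_time / candidate_time


def evolve(initial, evaluate_time, vary, baseline_time, generations, rng):
    """Evaluate, keep the fitter half, refill with variants, repeat."""
    def fitness(candidate):
        return relative_speedup(baseline_time, evaluate_time(candidate))

    population = list(initial)
    for _ in range(generations):                      # Generations < Limit?
        scored = sorted(population, key=fitness, reverse=True)   # Evaluation
        survivors = scored[: max(1, len(scored) // 2)]           # Selection
        children = [vary(rng.choice(survivors), rng)             # Variants
                    for _ in range(len(population) - len(survivors))]
        population = survivors + children
    return max(population, key=fitness)


def toy_time(weight):
    """Hypothetical stand-in for a measured run time; fastest at weight 3."""
    return 10.0 + (weight - 3.0) ** 2


def jitter(weight, rng):
    """Hypothetical variation operator: add Gaussian noise."""
    return weight + rng.gauss(0.0, 0.5)


rng = random.Random(1)
baseline = toy_time(0.0)
# Seed the population with the baseline guess plus random candidates.
start = [0.0] + [rng.uniform(-5.0, 5.0) for _ in range(7)]
best = evolve(start, toy_time, jitter, baseline, generations=20, rng=rng)
print(relative_speedup(baseline, toy_time(best)) >= 1.0)  # True
```

Seeding the population with the baseline guess and always keeping the survivors means the best fitness never drops below 1.0, mirroring the slide's point about seeding with the compiler writer's best guesses.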
Hyperblock Results: Compiler Specialization
[Figure: bar chart of speedup (y-axis, 0 to 3.5) per benchmark for the train data set and an alternate data set. Benchmarks: toast, huff_enc, mpeg2dec, huff_dec, rawdaudio, rawcaudio, g721encode, g721decode, 129.compress, Average. Data labels shown: 1.54 and 1.23.]
Evolved expressions shown on the slide:
(add (sub (cmul (gt (cmul $b0 0.8982 $d17)…$d7)) (cmul $b0 0.6183 $d28)))
(add (div $d20 $d5) (tern $b2 $d0 $d9))
Hyperblock Results: A General Purpose Priority Function
Cross Validation: Testing General Purpose Applicability
Case Study II: Register Allocation (A General Purpose Priority Function)
Register Allocation Results: Cross Validation
Conclusion
• Machine learning techniques can identify effective priority functions
• ‘Proof of concept’ by evolving two well-known priority functions
• Human cycles vs. computer cycles
GP Hyperblock Solutions: General Purpose
(add (sub (mul exec_ratio_mean 0.8720) 0.9400) (mul 0.4762 (cmul (not has_pointer_deref) (mul 0.6727 num_paths) (mul 1.1609 (add (sub (mul (div num_ops dependence_height) 10.8240) exec_ratio) (sub (mul (cmul has_unsafe_jsr predict_product_mean 0.9838) (sub 1.1039 num_ops_max)) (sub (mul dependence_height_mean num_branches_max) num_paths)))))))
Slide callouts on this expression:
• Intron that doesn’t affect solution
• Favor paths that don’t have pointer dereferences
• Favor highly parallel (fat) paths
• If a path calls a subroutine that may have side effects, penalize it
Case Study I: IMPACT’s Algorithm
[Figure: control-flow graph with blocks A, B, C, D, E, F, G; edge and block execution counts shown include 4k, 24k, 22k, 2k, 28k, 10, and 25]