
GCC ICI (Interactive Compilation Interface)

GCC ICI (Interactive Compilation Interface). Grigori Fursin. ALCHEMY Group INRIA Futurs France. January, 2007. Funded by HiPEAC network. Outline. Introduction and Motivation Iterative Interactive Compiler Framework Interactive Compilation Interface (ICI) Tools and Experiments


Presentation Transcript


  1. GCC ICI (Interactive Compilation Interface) Grigori Fursin ALCHEMY Group INRIA Futurs France January, 2007 Funded by HiPEAC network

  2. Outline • Introduction and Motivation • Iterative Interactive Compiler Framework • Interactive Compilation Interface (ICI) • Tools and Experiments • Conclusions and Future Work

  3. Motivation
     • Current compilers fail to deliver the best performance on modern processors due to:
       • rapidly evolving hardware
       • simplistic hardware models
       • fixed black-box optimization heuristics
       • inability to fine-tune applications
       • lack of run-time information
     • Different research compilers or transformation tools:
       • are rewritten from scratch to “clean” internals and understand behavior (time consuming)
       • have many unnecessary duplications of other compiler internals
       • are often incompatible with each other and non-portable
       • usually support a limited number of languages
       • still often have ambiguous and non-portable optimization heuristics

  4. Goals
     • Instead of developing a new compiler or new transformation tools, modify current popular (non-research) rigid compilers into simpler, transparent, open transformation toolsets with externally tunable optimization heuristics through a standardized Interactive Compilation Interface (ICI)
     • Control only the decision process at global and local levels, and avoid revealing the whole intermediate compiler representation, to allow further transparent compiler evolution
     • Narrow down the optimization space by suggesting only legal transformations
     • Enable an iterative recompilation algorithm to apply sequences of transformations
     • Treat the current optimization heuristic as a black box and progressively adapt it to a given program and a given architecture
     • Allow life-long, whole-program optimization research with optimization knowledge reuse

  5. Current Compilers [Diagram. Labels: Application; source-to-source transformers; compiler optimization heuristic made up of sub-heuristics 1, 2, i, j, k; for each transformation 1 … i, a decision for the transformation followed by performing it; binary-to-binary transformers; Binary.]

  6. Iterative Interactive Compiler Framework [Diagram. Labels: Application; Iterative Interactive Compiler containing the rigid compiler optimization heuristic (“black box”); for each transformation 1, 2, … i, a decision for the transformation, an ICI hook (ICI1, ICI2, ICIi) and performing the transformation; external compiler drivers; Program Optimization Database; Binary.]

  7. Interactive Compilation Interface [Diagram. Labels: Application; Iterative Interactive Compiler with the steps “analysis, decision and parameters for decision for optimization” and “apply transformation”; Executable.]

  8. Interactive Compilation Interface [Same diagram, adding write mode: saved decisions and parameters for transformations go to an external output transformation file.]

  9. Interactive Compilation Interface [Same diagram, adding read mode: modified decisions and parameters for transformations come from an external input transformation file or socket communication; output goes to an external output transformation file or socket communication. Modes shown: write, read, read/write.]

  10. Interactive Compilation Interface
     • Invoking ICI
       • Through the command line:
         • Write mode:
             gcc -fici-generate -ftree-loop-linear -funroll-loops *.c
         • Read/Write mode:
             gcc -fici-generate -fici-use -ftree-loop-linear -funroll-loops *.c
       • Through environment variables (to enable transparent continuous optimizations):
         • Write mode:
             export GCC_ICI_GEN=1
             make
         • Read/Write mode:
             export GCC_ICI_GEN=1
             export GCC_ICI_USE=1
             export GCC_ICI_OPTS="-ftree-loop-linear -funroll-loops"
             make
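
The environment-variable route above can also be driven from a script. The following is a minimal Python sketch, assuming only the three variable names shown on the slide (GCC_ICI_GEN, GCC_ICI_USE, GCC_ICI_OPTS); the function name `ici_environment` and the mode strings are hypothetical and not part of GCC ICI.

```python
import os


def ici_environment(mode, opts=(), base=None):
    """Return a copy of an environment with the ICI variables from the slide set.

    mode is "write" or "read-write" (hypothetical names for the slide's two modes).
    base defaults to the current process environment.
    """
    if mode not in ("write", "read-write"):
        raise ValueError(f"unknown ICI mode: {mode!r}")
    env = dict(os.environ if base is None else base)
    env["GCC_ICI_GEN"] = "1"              # both modes write decisions out
    if mode == "read-write":
        env["GCC_ICI_USE"] = "1"          # also read decisions back in
        if opts:
            env["GCC_ICI_OPTS"] = " ".join(opts)
    return env


# Start from an empty base so the result is easy to inspect.
env = ici_environment("read-write", ["-ftree-loop-linear", "-funroll-loops"], base={})
```

The resulting dictionary could then be passed to a build, for example `subprocess.run(["make"], env=env)`, so that an unmodified makefile picks up the ICI settings.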

  11. Current Implementation
      External output transformation xml file:

          <?xml version="1.0"?>
          <compiler_ici>
            <file_name="swim.f">
              <transformation name="unroll_and_peel">
                <function>calc1</function>
                <loop_number>4</loop_number>
                <depth>1</depth>
                <decision>4</decision>
                <factor>7</factor>
              </transformation>
              <transformation name="unroll_and_peel">
                <function>calc1</function>
                <loop_number>3</loop_number>
                <depth>1</depth>
                <decision>4</decision>
                <factor>7</factor>
              </transformation>
              …
            </file_name>
          </compiler_ici>

  12. Current Implementation
     • Supported optimizations:
       • global:
         • program phase reordering
       • local:
         • loop interchange
         • loop peeling
         • loop unrolling
       • more optimizations soon …

  13. Current Implementation
     • Based on PathScale ICI (2004-2006):
       • inlining
       • array padding (global/local)
       • loop fusion/fission
       • loop interchange
       • loop blocking
       • loop unrolling
       • register tiling
       • prefetching
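
An external driver has to read the transformation file from slide 11 before it can modify any decision. The sketch below is one way to do that in Python, not the project's own tooling. It assumes the file really is written as printed on the slide, where `<file_name="swim.f">` is not well-formed XML, so the text is first rewritten to `<file_name name="swim.f">`; that rewrite and the function name `parse_transformations` are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# The two records shown on slide 11 (the elided "…" entries are left out).
SLIDE_XML = """<?xml version="1.0"?>
<compiler_ici>
  <file_name="swim.f">
    <transformation name="unroll_and_peel">
      <function>calc1</function>
      <loop_number>4</loop_number>
      <depth>1</depth>
      <decision>4</decision>
      <factor>7</factor>
    </transformation>
    <transformation name="unroll_and_peel">
      <function>calc1</function>
      <loop_number>3</loop_number>
      <depth>1</depth>
      <decision>4</decision>
      <factor>7</factor>
    </transformation>
  </file_name>
</compiler_ici>
"""

INT_FIELDS = ("loop_number", "depth", "decision", "factor")


def parse_transformations(text):
    """Return one dict per <transformation> element."""
    # Assumption: turn <file_name="x"> into <file_name name="x"> so that a
    # standard XML parser accepts the document.
    fixed = re.sub(r'<file_name="', '<file_name name="', text)
    root = ET.fromstring(fixed)
    records = []
    for file_el in root.iter("file_name"):
        for tr in file_el.iter("transformation"):
            record = {
                "file": file_el.get("name"),
                "name": tr.get("name"),
                "function": tr.findtext("function"),
            }
            for field in INT_FIELDS:
                record[field] = int(tr.findtext(field))
            records.append(record)
    return records


records = parse_transformations(SLIDE_XML)
```

Each record then carries the function name, loop number, depth, decision code and unroll factor as plain Python values that a driver can inspect or change.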

  14. Iterative Recompilation Algorithm
      Iterative recompilation algorithm to apply sequences of transformations:

          clear transformation_file_out.xml
          set PATHSCALE_ICI_W to 1
          compile program (write transformation_file_out.xml)
          set PATHSCALE_ICI_R to 1
        _label_recompile:
          copy transformation_file_out.xml to transformation_file_in.xml
          modify transformation_file_in.xml if needed
          compile program (read transformation_file_in.xml, write transformation_file_out.xml)
          if transformation_file_in.xml is not the same as transformation_file_out.xml
            go to _label_recompile
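
The loop above is a fixed-point iteration: recompile until the decisions read in equal the decisions written out. A minimal Python sketch follows, with the compiler replaced by a callable so the control flow can be run without GCC or PathScale. The names `iterate_to_fixed_point`, `toy_compiler` and `request_factor_12`, the `max_rounds` guard (the slide's loop is unbounded) and the toy behaviour of clamping unroll factors to 8 are all assumptions for illustration.

```python
def iterate_to_fixed_point(compile_program, modify=lambda d: d, max_rounds=10):
    """Recompile until the input decisions equal the output decisions.

    compile_program(decisions_in) returns decisions_out; None means
    "no input file", i.e. the initial write-only compilation.
    """
    out = compile_program(None)            # first compile: write mode only
    for rounds in range(1, max_rounds + 1):
        inp = modify(out)                  # copy out -> in, edit if needed
        out = compile_program(inp)         # read in, write out
        if inp == out:                     # nothing changed: stop
            return out, rounds
    raise RuntimeError("no fixed point within max_rounds")


def toy_compiler(decisions):
    """Stand-in compiler: proposes factor 7, never accepts more than 8."""
    if decisions is None:
        return {"calc1:4": 7}
    return {loop: min(factor, 8) for loop, factor in decisions.items()}


def request_factor_12(decisions):
    """Stand-in external edit: ask for factor 12 wherever the default 7 is used."""
    return {loop: 12 if factor == 7 else factor for loop, factor in decisions.items()}


result, rounds = iterate_to_fixed_point(toy_compiler, request_factor_12)
```

In this toy run the first recompilation changes the requested factor, so a second round is needed before input and output agree.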

  15. GCC Instrumentation (Phase Reordering)
      gcc/passes.c

          #include "fici.h"

          void execute_pass_list (…)
          {
            …
            /* GCC ICI: in read mode, run the passes in the externally supplied order. */
            if (flag_ici_use)
              {
                int i;
                for (i = 0; i < fici_pass_count (type); i++)
                  execute_one_pass (pass_list[fici_reorder_pass_number (type, i)]);
              }
            else
              execute_pass_list (pass, new_type);
            …
            do
              {
                /* GCC ICI: in write mode, record each pass name and index. */
                if (flag_ici_generate)
                  fici_reorder_add_pass (type, pass->name, pass->index);
                if (execute_one_pass (pass) && pass->sub)
                  execute_pass_list (pass->sub, type);
                pass = pass->next;
              }
            while (pass);
          }
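
The idea behind the instrumentation above can be shown with a toy model in Python: passes run in their default order unless an external order is supplied, and the default order can be recorded for later editing. This is only a sketch of the concept, not of GCC's pass manager; `run_passes` and the two arithmetic "passes" are hypothetical.

```python
def run_passes(passes, program, order=None, log=None):
    """Apply passes to program.

    passes: list of (name, function) pairs in default order.
    order:  optional list of indices giving an external order (read mode).
    log:    optional list that receives (name, index) pairs (write mode).
    """
    if log is not None:
        log.extend((name, index) for index, (name, _) in enumerate(passes))
    indices = range(len(passes)) if order is None else order
    for index in indices:
        program = passes[index][1](program)
    return program


# Two stand-in passes whose result depends on the order they run in.
PASSES = [("double", lambda x: x * 2), ("increment", lambda x: x + 1)]

recorded = []
default_result = run_passes(PASSES, 3, log=recorded)      # (3 * 2) + 1
reordered_result = run_passes(PASSES, 3, order=[1, 0])    # (3 + 1) * 2
```

The two results differ, which is the reason phase ordering is worth exposing to an external search.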

  16. GCC Instrumentation (Transformations)
      gcc/loop-unroll.c

          #include "fici.h"

          static void
          decide_unrolling_and_peeling (struct loops *loops, int flags)
          {
            …
            decide_unroll_constant_iterations (loop, flags);
            if (loop->lpt_decision.decision == LPT_NONE)
              decide_unroll_runtime_iterations (loop, flags);
            if (loop->lpt_decision.decision == LPT_NONE)
              decide_unroll_stupid (loop, flags);
            if (loop->lpt_decision.decision == LPT_NONE)
              decide_peel_simple (loop, flags);

            /* GCC ICI: in read mode, take the decision and factor from outside. */
            if (flag_ici_use)
              fici_unroll_in (get_name (current_function_decl), loop->num, loop->depth,
                              &(loop->lpt_decision.decision), &(loop->lpt_decision.times));
            /* GCC ICI: in write mode, save the final decision and factor. */
            if (flag_ici_generate)
              fici_unroll_out (get_name (current_function_decl), loop->num, loop->depth,
                               &(loop->lpt_decision.decision), &(loop->lpt_decision.times));

            loop = next;
          }
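
The pattern in the instrumentation above is: let the built-in heuristic decide, optionally replace that decision with an external one, then optionally record whatever was finally chosen. A small Python model of that pattern is sketched below; the names `LoopDecision` and `decide` and the stand-in heuristic are hypothetical, and the decision codes are arbitrary integers rather than GCC's actual enum values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopDecision:
    decision: int   # which transformation to apply (arbitrary code here)
    times: int      # unroll or peel factor


def decide(function, loop_num, depth, heuristic, overrides=None, recorded=None):
    """Heuristic decision, then external override, then recording."""
    key = (function, loop_num, depth)
    chosen = heuristic(function, loop_num, depth)
    if overrides is not None and key in overrides:   # read mode
        chosen = overrides[key]
    if recorded is not None:                          # write mode
        recorded[key] = chosen
    return chosen


def builtin_heuristic(function, loop_num, depth):
    """Stand-in for the compiler's own choice."""
    return LoopDecision(decision=4, times=7)


log = {}
kept = decide("calc1", 4, 1, builtin_heuristic, recorded=log)
forced = decide("calc1", 3, 1, builtin_heuristic,
                overrides={("calc1", 3, 1): LoopDecision(4, 2)}, recorded=log)
```

Because the override is applied before recording, the log always reflects the decision that was actually used.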

  17. GCC Instrumentation (Features)
      gcc/tree-loop-linear.c

          #include "fici.h"

          void
          linear_transform_loops (struct loops *loops)
          {
            …
            if (flag_api_generate)
              {
                dump_file_tmp = dump_file;
                dump_flags_tmp = dump_flags;
                dump_file = fici_features_group_start_out (FICI_FGR_LOOP_DEPS);
                dump_flags = TDF_DETAILS | TDF_STATS;
                fapi2_features_start_dump_tmp ();
              }
            …
          }

      Reuse GCC dump information and progressively clean it

  18. Preliminary Results

  19. Using Framework
      Porting from the PathScale Continuous Optimization Framework (2003-present) to GCC, or developing:
     • Continuous iterative optimization driver with run-time adaptation at function, loop or instruction level using a low-overhead phase detection technique
     • Driver to continuously collect all possible optimization parameters
     • Driver to automatically and continuously rebuild the compiler optimization heuristic and adapt it to a specific architecture, using statistical methods and collective optimization knowledge reuse among different programs and architectures
     • Prototype framework to replace a model-based compiler heuristic with an automatically learned one by connecting ICI with WEKA, an open-source machine learning software package

  20. Iterative Continuous Optimizations [Diagram. Labels: application; source-to-source transformations; current compilers; binary; execution; binary-to-binary transformations.]

  21. Iterative Continuous Adaptive Optimizations [Diagram. Labels: application; source-to-source transformations; Iterative Interactive Compiler; Program Transformation Database; binary; execution; Iterative Optimizations / Machine Learning; binary-to-binary transformations.]

  22. ML to Remove Compiler Heuristic [Diagram. Labels: application1 … applicationN, each with GCC ICI, transformations, features and execution time; Building Model with WEKA.]

  23. ML to Remove Compiler Heuristic [Same diagram, adding labels: new application; GCC ICI; features; transformations.]

  24. ML to Remove Compiler Heuristic [Same diagram, adding a further GCC ICI step for the new application.]

  25. ML to Remove Compiler Heuristic [Same diagram, adding label: best execution time.]
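
The slides use WEKA to build the model. As a rough illustration of the same workflow (learn from features and best transformations of known programs, then predict transformations for a new program from its features), the sketch below swaps in a plain nearest-neighbour lookup in Python. The function name, the feature vectors and the transformation dictionaries are made-up illustrative values, not data from the presentation.

```python
import math


def predict_transformations(training, features):
    """Return the transformations of the training program closest in feature space.

    training: list of (feature_vector, best_transformations) pairs, where
              best_transformations gave the best execution time for that program.
    features: feature vector of the new program.
    """
    nearest = min(training, key=lambda item: math.dist(item[0], features))
    return nearest[1]


# Hypothetical training data: two programs, two numeric features each.
TRAINING = [
    ((1.0, 4.0), {"unroll_factor": 7}),
    ((3.0, 100.0), {"unroll_factor": 2}),
]

prediction = predict_transformations(TRAINING, (1.0, 6.0))
```

A real setup would replace the lookup with a model trained on many programs, but the inputs and outputs keep the same shape: features in, transformation decisions out.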

  26. Conclusions
     • We demonstrate a simple, practical and non-intrusive way to turn current rigid compilers into a powerful interactive transformation toolset with an Interactive Compilation Interface that allows compiler optimization decisions to be biased externally
     • We avoid the pitfalls of rigidifying the compiler internals while granting access to features rich enough to take performance-critical decisions
     • We considerably reduce the optimization search space by analyzing and applying only legal transformations
     • We develop tools for continuous collective life-long optimizations and knowledge reuse across different programs and architectures
     • We use the framework in EU projects to automatically adapt and optimize programs for performance, code size, power consumption, multiple ISAs, etc.

  27. Future work
     • Porting ICI to GCC in collaboration with IBM, NXP (Philips), STMicro, ARC, multiple universities within the HiPEAC network of excellence, and within the EU-funded projects MilePost, SARC and GGCC
     • Adding more transformations and enabling phase reordering at function level in GCC
     • Unifying optimization naming conventions to enable portability and knowledge reuse, so that optimization heuristics can be built automatically
     • Implementing a run-time adaptation technique to select different program versions at run time depending on program behavior
     • Finishing the framework for practical continuous life-long whole-program optimizations with statistical or machine learning techniques
     • Porting ICI to JIT compilers (Jikes, .NET) to unify run-time optimizations
     • Would you like to participate? http://sourceforge.net/projects/gcc-ici

  28. Questions?
      Software development web-site for GCC ICI: http://sourceforge.net/projects/gcc-ici
      Thanks to Sebastian Pop, Cupertino Miranda and Hamid Daoud for help with gcc modifications
      Collaborations and Support: IBM, NXP (Philips), STMicro, ARC, CAPS, Universities within HiPEAC
      This work is funded by HiPEAC: http://www.hipeac.net
      Contact e-mail: grigori.fursin@inria.fr
      More information: http://fursin.net/research_desc.html
