
WCET-aware Register Allocation based on Integer-Linear Programming

Heiko Falk, Norman Schmitz, Florian Schmoll. TU Dortmund, Computer Science 12, Design Automation for Embedded Systems.


Presentation Transcript


  1. WCET-aware Register Allocation based on Integer-Linear Programming • Heiko Falk, Norman Schmitz, Florian Schmoll • TU Dortmund, Computer Science 12, Design Automation for Embedded Systems

  2. Outline • Introduction • State of the Art in Compiler Design • Register Allocation • Traditional ILP-based Register Allocation • ILP Model • Limitations • WCET-aware Register Allocation using ILP • Model of the WCET • Model of Pipeline-Related Spill Costs • Results • Summary & Future Work

  3. Current State of the Art in Compiler Design Objective Function of Compiler Optimizations • Usually reduction of Average-Case Execution Times (ACET): Accelerate a “typical” execution of a program using “typical” input data • No statements about WCETs possible Optimization Strategy • Naive: Current compilers lack precise ACET timing model • Application of an optimization if “promising” • Effect of optimizations on a program’s ACET fully unknown to the compiler itself. • ACET-optimizations not useful for WCET minimization

  4. Register Allocation Goals • Considered the most important compiler optimization • Registers are fastest and most efficient memories • Register Allocation should make optimal use of registers Tasks • Assembly code before register allocation: virtual registers (VREGs) • Map all (potentially many) VREGs to (usually few) physical registers (PHREGs) of a processor • Insert memory loads and stores (spill code) whenever VREGs don’t fit into the register file

  5. Well-Known Register Allocators. Graph Coloring [P. Briggs, Register Allocation via Graph Coloring, 1992]: • De-facto standard approach nowadays • Heuristics decide about allocation and spill code generation • Fast approach of moderate complexity • Spill heuristic might lead to poor code quality. Register Allocation via Integer-Linear Programming (ILP) [D. W. Goodwin, K. D. Wilken, Optimal and Near-optimal Global Register Allocation Using 0-1 Integer Programming, 1996]: • Formal mathematical model of allocation and spilling • Achieves minimal spill code overhead, i.e. minimizes the total number of spill instructions • Relatively high complexity, but optimal quality

  6. Traditional ILP-based Register Allocation. Allocation decisions: 0/1 decision variables map VREGs to PHREGs. Spilling decisions: further 0/1 decision variables model the insertion of spill code. Constraints: guarantee correctness of allocation and spilling decisions, e.g. • ensure that each VREG is assigned to at least one PHREG, • that at most one VREG can be assigned to a single PHREG, • ...

  7. Traditional ILP-based Register Allocation Objective Function • Minimizes spill code-related overhead • Under the assumption: • Each spill instruction contributes by same constant amount to objective function • Example: minimization of spill-related code size
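The traditional model of slides 6 and 7 can be illustrated with a minimal sketch. It does not use an ILP solver and is not the authors' formulation: it simply enumerates every 0/1-style decision for a tiny, hypothetical instance and keeps the cheapest legal one. The function name `allocate`, the live-range representation and the rule "overlapping live ranges must not share a PHREG" are assumptions made for illustration. Each spilled VREG contributes the same constant `spill_cost`, which is exactly the assumption the traditional objective relies on.

```python
from itertools import product


def allocate(live_ranges, num_phregs, spill_cost=1):
    """Exhaustively search a tiny allocation problem (illustrative only).

    live_ranges maps a VREG name to a half-open interval (start, end).
    Returns (total_spill_cost, mapping), where mapping[vreg] is a PHREG
    index or None if the VREG is spilled.
    """
    vregs = sorted(live_ranges)

    def interfere(a, b):
        # Two VREGs interfere when their live ranges overlap.
        (s1, e1), (s2, e2) = live_ranges[a], live_ranges[b]
        return s1 < e2 and s2 < e1

    best = None
    # None means "spilled"; 0 .. num_phregs-1 are PHREG indices.
    choices = [None] + list(range(num_phregs))
    for assignment in product(choices, repeat=len(vregs)):
        mapping = dict(zip(vregs, assignment))
        # Correctness constraint: interfering VREGs never share a PHREG.
        legal = all(
            mapping[a] is None
            or mapping[a] != mapping[b]
            or not interfere(a, b)
            for i, a in enumerate(vregs)
            for b in vregs[i + 1:]
        )
        if not legal:
            continue
        # Objective: every spill contributes the same constant amount.
        cost = spill_cost * sum(1 for r in assignment if r is None)
        if best is None or cost < best[0]:
            best = (cost, mapping)
    return best


# Three mutually overlapping VREGs, two PHREGs: one spill is unavoidable.
ranges = {"v1": (0, 4), "v2": (1, 3), "v3": (2, 5)}
cost, mapping = allocate(ranges, num_phregs=2)
```

Exhaustive search is only feasible for toy inputs; a real allocator would hand the same decisions and constraints to an ILP solver.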

  8. WCET Minimization via ILP-based Allocation? Limitation of the traditional approach: • Assumption: each spill instruction contributes the same constant amount to the objective function • This assumption only holds for trivial objectives such as code size. Challenges: • How to model and minimize the Worst-Case Execution Time (WCET) as a non-trivial objective? • How to deal with complex processor pipelines executing spill instructions in parallel with other code?

  9. Challenge 1: ILP Model of the WCET. The Worst-Case Execution Path (WCEP): • WCET of a program = length of the program’s longest execution path (WCEP) • WCET minimization: optimization of only those parts of a program lying on the WCEP • Code optimization apart from the WCEP will not reduce the WCET • Only those spill-related decision variables that actually lie on the WCEP must contribute to the ILP’s objective function • But: spilling decisions affect the WCET of basic blocks and thus the WCEP within a program • How to model the WCEP via ILP depending on spill-related decision variables?

  10. Spill Code-dependent Costs • Costs of a basic block: a cost variable models the WCET of the block depending on the WCET of potentially inserted spill code • It equals the block’s WCET without any spill code, plus the WCET of all spill code inside the block
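As a hedged sketch, the block cost described on slide 10 can be written as a linear expression. All symbols are assumptions introduced here and are not the slide's own notation: c_b is the cost variable of basic block b, C_b is the WCET of b without any spill code, S_b is the set of spill instructions that may be inserted into b, x_s is the 0/1 decision variable stating whether spill instruction s is generated, and k_s is the WCET contribution of s.

```latex
c_b \;=\; C_b \;+\; \sum_{s \in S_b} k_s \cdot x_s ,
\qquad x_s \in \{0, 1\}
```

With a constant k_s this reduces to the traditional model; the pipeline discussion on slides 13 and 14 is about the cases where the contribution of a spill instruction depends on its neighbouring instructions.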

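  11. Intraprocedural Control Flow • Modeling of a function’s control flow. Acyclic sub-graphs: the variable of a block A models the WCET of the longest path starting at A. (Reducible) Loops: • Treat the body of the innermost loop like an acyclic sub-graph • Fold the loop into a single node and assign costs to the folded loop • Continue with the next innermost loop. [Figure: control-flow graph with blocks A, B, C, D, E; blocks B, C and D form loop L, which is folded into a single node]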

  12. Objective Function • WCET of an entire function: each function has a dedicated entry block • The variable of the entry block models the WCET of the longest path within the function starting at the entry block • This variable therefore models the WCET of the entire function
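The path model of slides 11 and 12 can be sketched outside of an ILP as a plain longest-path computation. This is an illustrative simplification, not the authors' ILP: block costs are fixed numbers here instead of expressions over spill variables, and the cost of a folded loop is assumed to be the loop's maximum iteration count times the longest path through its body. The control-flow edges, block costs, iteration bound and the function names `longest_path_wcet` and `fold_loop` are all hypothetical.

```python
from functools import lru_cache


def longest_path_wcet(succs, cost, entry):
    """WCET of the longest path starting at `entry` in an acyclic CFG.

    succs maps a block to its successors, cost maps a block to its WCET.
    """
    @lru_cache(maxsize=None)
    def w(block):
        # Own cost plus the most expensive continuation (0 at an exit).
        return cost[block] + max(
            (w(s) for s in succs.get(block, ())), default=0
        )

    return w(entry)


def fold_loop(body_succs, body_cost, header, max_iterations):
    """Cost of a folded loop (assumption: bound times longest body path)."""
    return max_iterations * longest_path_wcet(body_succs, body_cost, header)


# Hypothetical loop L with body B, C, D (back edge D -> B removed).
body_succs = {"B": ["C", "D"], "C": ["D"]}
body_cost = {"B": 2, "C": 5, "D": 3}
loop_cost = fold_loop(body_succs, body_cost, "B", max_iterations=4)

# After folding, the function is the acyclic graph A -> L -> E.
succs = {"A": ["L"], "L": ["E"]}
cost = {"A": 1, "L": loop_cost, "E": 2}
function_wcet = longest_path_wcet(succs, cost, "A")
```

In an ILP the `max` cannot be written directly; a common linearization is one inequality per control-flow edge, stating that a block's path variable is at least its own cost plus the path variable of that successor.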

  13. Challenge 2: Pipeline-Related Spill Costs. Example: The Infineon TriCore Pipelines • Integer I-Pipeline: executes usual integer ALU instructions • Load/Store LS-Pipeline: executes memory loads/stores and address arithmetic • Ideal case: one I-instruction and one LS-instruction are executed in parallel within the same clock cycle • However, a WAW (write after write) hazard causes a stall of 1 cycle. Example: I-instruction "add d0,d1,d2; # d0 = d1 + d2" followed by LS-instruction "ld d0,[a0]; # d0 = mem[a0]", both writing d0. (Some even more subtle cases of the TriCore pipelines are omitted here.)

  14. ILP Example for Costs of Spill Instruction s, which follows an instruction i. Case 1: i is an LS-instruction, e.g. "st [a1],d1; # i: mem[a1] = d1" followed by "ld d0,[a0]; # s: d0 = mem[a0]" • s costs 1 cycle if s is actually generated. Case 2: s is a spill-load and i is an I-instruction, e.g. "add d0,d1,d2; # i: d0 = d1 + d2" followed by "ld d0,[a0]; # s: d0 = mem[a0]" • s costs 1 cycle if s is actually generated and a WAW hazard between i and s exists via a PHREG
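The two cases of slide 14 can be summarized as a small decision function. This is a sketch under stated assumptions rather than the ILP constraints themselves: the function name `spill_cycles` and its parameters are hypothetical, only the two cases named on the slide are covered, and a non-generated spill instruction is assumed to cost nothing.

```python
def spill_cycles(prev_pipeline, spill_is_load, generated, waw_hazard=False):
    """Extra cycles caused by spill instruction s that follows instruction i.

    prev_pipeline: "I" or "LS", the pipeline that executes i.
    spill_is_load: True if s is a spill-load.
    generated:     True if s is actually inserted into the code.
    waw_hazard:    True if i and s write the same PHREG.
    """
    if not generated:
        # A spill instruction that is never emitted costs no cycles.
        return 0
    if prev_pipeline == "LS":
        # Case 1: i and s compete for the LS-pipeline, so s costs 1 cycle.
        return 1
    if prev_pipeline == "I" and spill_is_load:
        # Case 2: i and s can issue together unless a WAW hazard exists.
        return 1 if waw_hazard else 0
    # The slides omit the remaining (more subtle) cases.
    raise NotImplementedError("case not covered by the slides")


# "st [a1],d1" followed by spill-load "ld d0,[a0]": case 1.
case1 = spill_cycles("LS", spill_is_load=True, generated=True)
# "add d0,d1,d2" followed by spill-load "ld d0,[a0]": case 2 with WAW on d0.
case2 = spill_cycles("I", spill_is_load=True, generated=True, waw_hazard=True)
```

Case 2 is the reason a constant per-spill cost is too coarse: the cost depends on both the spilling decision and the register assignment, so in an ILP the product of the two decisions has to be expressed through additional linear constraints.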

  15. Results – Worst-Case Execution Times [Bar chart of relative WCET_EST per benchmark; chart annotations: 98%, x2, 80%, 19%] • Compiler: WCC at optimization level -O3 (42 optimizations) • Target Processor: TriCore TC1796 • 100%: WCET_EST using Graph Coloring • Comparison with [H. Falk, WCET-aware Register Allocation based on Graph Coloring, DAC 2009]

  16. Results – Average-Case Execution Times • Compiler: WCC at optimization level -O3 (42 optimizations) • Target Processor: TriCore TC1796 • 100%: ACET using Graph Coloring

  17. Results – CPU Runtimes ILP-based Allocator • Runtimes range from 1 CPU second to 54:08 CPU minutes • Including WCET analysis and ILP solver • Average runtime for 55 benchmarks: 3:33 CPU minutes WCET-aware Graph Coloring • Average runtime for 55 benchmarks: 4:13 CPU minutes • Reason: Performs a costly WCET analysis after register allocation for each individual basic block

  18. Summary & Future Work. Summary: • Current state of the art: compilers are unaware of timing and use naive optimization strategies • Standard register allocators are unaware of worst-case properties and may thus generate spill code along the WCEP • WCET-aware ILP-based register allocation: sophisticated models of WCET and pipeline-related spill costs • Average WCET reduction over 55 benchmarks: 20.2% • Outperforms WCET-aware graph coloring by a factor of 2. Future Work: • Reduce runtimes of the ILP-based register allocator • Improve code quality further by integrating rematerialization
