
A Uniform Optimization Technique for Offset Assignment Problems

Rainer Leupers, Fabian David. University of Dortmund, Germany, Dept. of Computer Science 12. Overview: offset assignment problem, related work, genetic algorithm approach, exploitation of modify registers.


Presentation Transcript


  1. A Uniform Optimization Technique for Offset Assignment Problems
     Rainer Leupers, Fabian David
     University of Dortmund, Germany
     Dept. of Computer Science 12

  2. Overview
     • Offset assignment problem
     • Related work
     • Genetic algorithm approach
     • Exploitation of modify registers
     • Results & conclusions

  3. Offset assignment problem
     Context: code generation for DSPs
     Given: a DSP address generation unit (AGU) with
     • k address registers (ARs)
     • m modify registers (MRs)
     • auto-increment range (AIR) r
     Auto-increment capabilities:
     • AR[i] += d, with |d| <= r
     • AR[i] += MR[j]
     All other address computations cause extra code.
     Problem: assign program variables to memory addresses and ARs such that the use of auto-increment address computations is maximized.

  4. Offset assignment: example
     Variables: { a, b, c, d }
     Access sequence: (b, d, a, c, d, a, c, b, a, d, a, c, d)
     Simple Offset Assignment: k = 1, m = 0, r = 1

     Offset      0  1  2  3
     Layout 1    a  b  c  d
     Layout 2    c  a  d  b

     Access   Layout 1   Layout 2
     b        AR = 1     AR = 3
     d        AR += 2    AR--
     a        AR -= 3    AR--
     c        AR += 2    AR--
     d        AR++       AR += 2
     a        AR -= 3    AR--
     c        AR += 2    AR--
     b        AR--       AR += 3
     a        AR--       AR -= 2
     d        AR += 3    AR++
     a        AR -= 3    AR--
     c        AR += 2    AR--
     d        AR++       AR += 2
     Cost     9          5

     AR++ and AR-- are covered by auto-increment (r = 1); every other instruction, including the initial AR load, counts as one unit of cost.
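
The two costs in this example can be reproduced with a few lines of Python. This is a minimal sketch of the Simple Offset Assignment cost model as the slide describes it (one AR, no MRs); the function name `soa_cost` and the convention of charging one unit for the initial AR load are assumptions chosen to match the slide's totals.

```python
def soa_cost(layout, accesses, r=1):
    """Count address computations that cannot use auto-increment.

    layout   -- variables in memory order; layout[i] lives at offset i
    accesses -- variable names in access order
    r        -- auto-increment range; steps with |delta| <= r are free
    """
    offset = {var: i for i, var in enumerate(layout)}
    cost = 1  # assumption: the initial "AR = ..." load costs one instruction
    for prev, cur in zip(accesses, accesses[1:]):
        if abs(offset[cur] - offset[prev]) > r:
            cost += 1  # needs an explicit "AR += d" / "AR -= d"
    return cost


accesses = list("bdacdacbadacd")
print(soa_cost("abcd", accesses))  # 9 (Layout 1)
print(soa_cost("cadb", accesses))  # 5 (Layout 2)
```

Raising `r` to 2 in the same call shows how a wider auto-increment range makes more of the steps free.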

  5. Related work
     Offset assignment for different AGU models:

     Reference        #ARs  #MRs  AIR
     [Bartley92]      1     -     1
     [Liao95]         k     -     1
     [Leupers96]      k     m     1
     [Wess97]         1     -     2
     [Sudarsanam97]   k     -     r
     this work        k     m     r

     Further work on address optimization for fixed layout

  6. Genetic algorithm approach (1)
     Chromosomal representation: n variables, k address registers
     • each individual is a permutation of { 1, ..., n + k - 1 }
     • gene values 1, ..., n are variables, mapped to offsets in order of appearance
     • a gene value greater than n means "switch to next AR"
     Example: n = 6, k = 2

     Chromosome   2  5  3  1  7  6  4
     Offset       0  1  2  3  -  4  5
     Register     AR[1]       |  AR[2]

     Fitness function:
     F(I) = # transitions (v, w) in the access sequence such that v and w are assigned to different ARs, or |off(v) - off(w)| <= r
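
The decoding and fitness rule on this slide can be sketched as follows. This is an illustrative reading, not the authors' implementation: the helper names `decode` and `fitness` are hypothetical, separator genes are assumed to be exactly the values greater than n, and the access sequence in the usage lines is made up for the example.

```python
def decode(chromosome, n):
    """Map each variable gene to (AR index, memory offset)."""
    assignment = {}
    ar, offset = 0, 0
    for gene in chromosome:
        if gene > n:      # separator gene: switch to the next AR
            ar += 1
        else:             # variable gene: takes the next free offset
            assignment[gene] = (ar, offset)
            offset += 1
    return assignment


def fitness(chromosome, n, accesses, r=1):
    """F(I) as stated on the slide: count transitions (v, w) where v and w
    use different ARs or lie within the auto-increment range of each other."""
    assignment = decode(chromosome, n)
    score = 0
    for v, w in zip(accesses, accesses[1:]):
        (ar_v, off_v), (ar_w, off_w) = assignment[v], assignment[w]
        if ar_v != ar_w or abs(off_v - off_w) <= r:
            score += 1
    return score


chromosome = [2, 5, 3, 1, 7, 6, 4]   # the slide's example, n = 6, k = 2
accesses = [2, 5, 1, 6, 4, 2]        # hypothetical access sequence
print(decode(chromosome, 6))
print(fitness(chromosome, 6, accesses))
```

For the slide's chromosome, `decode` places variables 2, 5, 3, 1 at offsets 0 to 3 under AR[1] and variables 6, 4 at offsets 4 and 5 under AR[2].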

  7. Genetic algorithm approach (2)
     Mutation: exchange two gene values
       ( ... x ... y ... )  ->  ( ... y ... x ... )
     Crossover: standard order crossover operation
     Optimization procedure:
       form initial population
       for N generations do:
         select parent individuals
         generate offspring
         mutate offspring
       emit best individual
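
The procedure above can be sketched in Python. The slide fixes only the operators (swap mutation, order crossover) and the overall loop; everything else here is an assumption made for illustration: truncation selection with elitism, the population size, the mutation probability, and the toy fitness used in the usage lines (free transitions for the single-AR example from slide 4).

```python
import random


def order_crossover(p1, p2, rng):
    """Order crossover (OX): keep a slice of p1, fill the rest in p2's order."""
    size = len(p1)
    lo, hi = sorted(rng.sample(range(size + 1), 2))
    child = [None] * size
    child[lo:hi] = p1[lo:hi]
    kept = set(p1[lo:hi])
    # Walk p2 cyclically from the second cut point, skipping kept genes.
    fill = [p2[(hi + i) % size] for i in range(size)
            if p2[(hi + i) % size] not in kept]
    free = [(hi + i) % size for i in range(size)
            if child[(hi + i) % size] is None]
    for pos, gene in zip(free, fill):
        child[pos] = gene
    return child


def swap_mutation(individual, rng):
    """Exchange two gene values in place."""
    i, j = rng.sample(range(len(individual)), 2)
    individual[i], individual[j] = individual[j], individual[i]


def evolve(fitness, genes, pop_size=30, generations=100, p_mut=0.2, seed=0):
    rng = random.Random(seed)
    population = [rng.sample(genes, len(genes)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]   # assumed: truncation selection
        offspring = []
        while len(parents) + len(offspring) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = order_crossover(p1, p2, rng)
            if rng.random() < p_mut:
                swap_mutation(child, rng)
            offspring.append(child)
        population = parents + offspring        # parents survive (elitism)
    return max(population, key=fitness)


# Usage on the single-AR example from slide 4 (r = 1).
accesses = list("bdacdacbadacd")


def free_transitions(layout):
    offset = {var: i for i, var in enumerate(layout)}
    return sum(abs(offset[v] - offset[w]) <= 1
               for v, w in zip(accesses, accesses[1:]))


best = evolve(free_transitions, list("abcd"))
print(best, free_transitions(best))
```

Because both operators map permutations to permutations, every individual stays a valid layout and no repair step is needed.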

  8. Exploitation of modify registers (1)
     [Leupers96]: A modification of Belady's optimal page replacement algorithm can be used for optimal exploitation of m MRs for a fixed offset assignment (postpass optimization only).
     PRA(I) = # address computations that can be saved by exploiting MRs for a given offset assignment
     Modified fitness function: F'(I) = F(I) + PRA(I)
     => exploitation of MRs is included in the GA
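
The slide does not spell out how Belady's algorithm is modified, so the following is only one plausible reading, with all names hypothetical. It assumes that MRs are keyed by the magnitude of an address step (an MR can be added or subtracted), that loading an MR costs as much as one explicit address computation, and that a value therefore saves an instruction only when it is reused while still resident. On a miss with all MRs occupied, the candidate reused furthest in the future is dropped, which may be the incoming value itself.

```python
def pra_savings(deltas, m, r=1):
    """Count address computations saved by holding step sizes in m MRs."""
    # Steps within the auto-increment range are free and never need an MR.
    refs = [abs(d) for d in deltas if abs(d) > r]

    def next_use(value, start):
        for pos in range(start, len(refs)):
            if refs[pos] == value:
                return pos
        return float("inf")

    resident, saved = set(), 0
    for i, value in enumerate(refs):
        if value in resident:
            saved += 1              # "AR += MR" replaces an explicit add
            continue
        if m == 0:
            continue
        if len(resident) < m:
            resident.add(value)     # a free MR is available: load it
            continue
        # Belady-style choice: drop the candidate whose next use is furthest
        # away; if that is the new value, leave the MRs unchanged.
        candidates = resident | {value}
        victim = max(candidates, key=lambda v: next_use(v, i + 1))
        if victim != value:
            resident.remove(victim)
            resident.add(value)
    return saved


# Address steps of Layout 1 (a, b, c, d) from slide 4, after the initial load.
deltas = [2, -3, 2, 1, -3, 2, -1, -1, 3, -3, 2, 1]
print(pra_savings(deltas, m=1))  # 3
print(pra_savings(deltas, m=2))  # 6
```

Under these assumptions the returned count plays the role of PRA(I) and would be added to F(I) to obtain F'(I).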

  9. Exploitation of modify registers (2)
     [Figure: AR/MR address code listings for three methods, shown side by side: heuristic OA, heuristic OA + PRA, genetic algorithm. Memory layouts shown for offsets 0 to 4: (v4, v0, v2, v1, v3), which appears twice, and (v3, v2, v1, v0, v4).]

  10. Results & conclusions
     • Statistical evaluation:
       • 32 % improvement over OA heuristic with postpass MR optimization
       • 32 % improvement over Wess' simulated annealing technique
     • Runtime: typically 10 CPU seconds (Pentium II)
     • Main contributions:
       • First uniform offset assignment technique, arbitrary k, m, r values
       • Significant improvements in code quality over previous techniques, largely due to better exploitation of MRs
