1 / 43

Preliminary Transformations

Chapter 4 of Allen and Kennedy. Preliminary Transformations. Overview. Why do we need this? Requirements of dependence testing Stride 1 Normalized loop Linear subscripts Subscripts composed of functions of loop induction variables Higher dependence test accuracy

brad
Download Presentation

Preliminary Transformations

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Chapter 4 of Allen and Kennedy Preliminary Transformations

  2. Overview • Why do we need this? • Requirements of dependence testing • Stride 1 • Normalized loop • Linear subscripts • Subscripts composed of functions of loop induction variables • Higher dependence test accuracy • Easier implementation of dependence tests

  3. Programmers optimized code Confusing to smart compilers An Example INC = 2 KI = 0 DO I = 1, 100 DO J = 1, 100 KI = KI + INC U(KI) = U(KI) + W(J) ENDDO S(I) = U(KI) ENDDO

  4. Applying induction-variable substitution Replace references to AIV with functions of loop index An Example INC = 2 KI = 0 DO I = 1, 100 DO J = 1, 100 ! Deleted: KI = KI + INC U(KI + J*INC) = U(KI + J*INC) + W(J) ENDDO KI = KI + 100 * INC S(I) = U(KI) ENDDO

  5. Second application of IVS Remove all references to KI An Example INC = 2 KI = 0 DO I = 1, 100 DO J = 1, 100 U(KI + (I-1)*100*INC + J*INC) = U(KI + (I-1)*100*INC + J*INC) + W(J) ENDDO ! Deleted:KI = KI + 100 * INC S(I) = U(KI + I * (100*INC)) ENDDO KI = KI + 100 * 100 * INC

  6. Applying Constant Propagation Substitute the constants An Example INC = 2 ! Deleted: KI = 0 DO I = 1, 100 DO J = 1, 100 U(I*200 + J*2 - 200) = U(I*200 + J*2 -200) + W(J) ENDDO S(I) = U(I*200) ENDDO KI = 20000

  7. Applying Dead Code Elimination Removes all unused code An Example DO I = 1, 100 DO J = 1, 100 U(I*200 + J*2 - 200) = U(I*200 + J*2 - 200) + W(J) ENDDO S(I) = U(I*200) ENDDO

  8. Transformations need knowledge Loop Stride Loop-invariant quantities Constant-values assignment Usage of variables Information Requirements

  9. Loop Normalization • Lower Bound 1 with Stride 1 • To make dependence testing as simple as possible • Serves as information gathering phase

  10. Algorithm Procedure normalizeLoop(L0); i = a unique compiler-generated LIV S1: replace the loop header for L0 DO I = L, U, S with the adjusted loop header DO i = 1, (U – L + S) / S; S2: replace each reference to I within the loop by i * S – S + L; S3: insert a finalization assignment I = i * S – S + L; immediately after the end of the loop; end normalizeLoop; Loop Normalization

  11. Loop Normalization • Caveat • Un-normalized: DO I = 1, M DO J = I, N A(J, I) = A(J, I - 1) + 5 ENDDO ENDDO Has a direction vector of (<,=) • Normalized: DO I = 1, M DO J = 1, N – I + 1 A(J + I – 1, I) = A(J + I – 1, I – 1) + 5 ENDDO ENDDO Has a direction vector of (<,>)

  12. Loop Normalization • Caveat • Consider interchanging loops • (<,=) becomes (=,>) OK • (<,>) becomes (>,<) Problem • Handled by another transformation • What if the step size is symbolic? • Prohibits dependence testing • Workaround: use step size 1 • Less precise, but allow dependence testing

  13. Definition-use Graph • Traditionally called Definition-use Chains • Provides the map of variables usage • Heavily used by the transformations

  14. Definition-use graph is a graph that contains an edge from each definition point in the program to every possible use of the variable at run time uses(b): the set of all variables used within the block b that have no prior definitions within the block defsout(b): the set of all definitions within block b that are not killed within the block killed(b): the set of all definitions that define variables killed by other definitions within block b Definition-use Graph

  15. Computing reaches for one block b may immediately change all other reaches including b itself since reaches(b) is an input into other reaches equations Archiving correct solutions requires simultaneously solving all individual equations There is a workaround this Definition-use Graph

  16. Definition-use Graph

  17. Definition-use Graph

  18. Dead Code Elimination • Removes all dead code • What is Dead Code ? • Code whose results are never used in any ‘Useful statements’ • What are Useful statements ? • Are they simply output statements ? • Output statements, input statements, control flow statements, and their required statements • Makes code cleaner

  19. Dead Code Elimination

  20. Constant Propagation • Replace all variables that have constant values at runtime with those constant values

  21. Constant Propagation

  22. Constant Propagation

  23. Static Single-Assignment • Reduces the number of definition-use edges • Improves performance of algorithms

  24. Example Static Single-Assignment

  25. Example Forward Expression Substitution DO I = 1, 100 K = I + 2 A(K) = A(K) + 5 ENDDO DO I = 1, 100 A(I+2) = A(I+2) + 5 ENDDO

  26. Forward Expression Substitution • Need definition-use edges and control flow analysis • Need to guarantee that the definition is always executed on a loop iteration before the statement into which it is substituted • Data structure to find out if a statement S is in loop L • Test whether level-K loop containing S is equal to L

  27. Forward Expression Substitution

  28. Forward Expression Substitution

  29. Forward Expression Substitution

  30. Forward Expression Substitution

  31. Definition: an auxiliary induction variable in a DO loop headed by DO I = LB, UB, S is any variable that can be correctly expressed as cexpr * I + iexprL at every location L where it is used in the loop, where cexpr and iexprL are expressions that do not vary in the loop, although different locations in the loop may require substitution of different values of iexprL Induction Variable Substitution

  32. Induction Variable Substitution • Example: DO I = 1, N A(I) = B(K) + 1 K = K + 4 … D(K) = D(K) + A(I) ENDDO

  33. Induction Variable Recognition

  34. Induction Variable Substitution • Induction Variable Recognition

  35. Induction Variable Substitution

  36. Induction Variable Substitution

  37. Induction Variable Substitution • More complex example DO I = 1, N, 2 K = K + 1 A(K) = A(K) + 1 K = K + 1 A(K) = A(K) + 1 ENDDO • Alternative strategy is to recognize region invariance DO I = 1, N, 2 A(K+1) = A(K+1) + 1 K = K+1 + 1 A(K) = A(K) + 1 ENDDO

  38. Induction Variable Substitution • Driver

  39. IVSub without loop normalization DO I = L, U, S K = K + N … = A(K) ENDDO DO I = L, U, S … = A(K + (I – L + S) / S * N) ENDDO K = K + (U – L + S) / S * N

  40. IVSub without loop normalization • Problem: • Inefficient code • Nonlinear subscript

  41. IVSub with Loop Normalization I = 1 DO i = 1, (U-L+S)/S, 1 K = K + N … = A (K) I = I + 1 ENDDO

  42. IVSub with Loop Normalization I = 1 DO i = 1, (U – L + S) / S, 1 … = A (K + i * N) ENDDO K = K + (U – L + S) / S * N I = I + (U – L + S) / S

  43. Summary • Transformations to put more subscripts into standard form • Loop Normalization • Constant Propagation • Induction Variable Substitution • Do loop normalization before induction-variable substitution • Leave optimizations to compilers

More Related