1 / 78

CMPUT680 - Winter 2006

CMPUT680 - Winter 2006. Topic G: Static Single-Assignment Form José Nelson Amaral http://www.cs.ualberta.ca/~amaral/courses/680. Reading Material. Chapter 19 of the “Tiger book” (with a grain of salt!!). Bilardi, G., Pingali, K., “The Static Single Assignment Form and its

karsen
Download Presentation

CMPUT680 - Winter 2006

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CMPUT680 - Winter 2006 Topic G: Static Single-Assignment Form José Nelson Amaral http://www.cs.ualberta.ca/~amaral/courses/680 CMPUT 680 - Compiler Design and Optimization

  2. Reading Material Chapter 19 of the “Tiger book” (with a grain of salt!!). Bilardi, G., Pingali, K., “The Static Single Assignment Form and its Computation,” unpublished? (citeseer). Cytron, R., Ferrante, J., Rosen, B. K., Wegman, M. N., Zadeck, F. K., “An Efficient Method of Computing Static Single Assignment Form,” ACM Symposium on Principles of Programming Languages (PoPL), pp. 25-35, Austin, TX, Jan., 1989. Cytron, R., Ferrante, J., Rosen, B. K., Wegman, M. N., “Efficiently Computing Static Single Assignment Form and the Control Dependence Graph,” ACM Transactions on Programming Languages and Systems (TOPLAS), Vol. 13, No. 4, October, 1991, pp. 451-490. Sreedhar, V. C., Gao, G. R., “A Linear Time Algorithm for Placing -Nodes,” ACM Symposium on Principles of Programming Languages (PoPL), pp. 62-73, 1995. CMPUT 680 - Compiler Design and Optimization

  3. Static Single-Assignment Form Each variable has only one definition in the program text. This single staticdefinition can be in a loop and may be executed many times. Thus even in a program expressed in SSA, a variable can be dynamically defined many times. CMPUT 680 - Compiler Design and Optimization

  4. Advantages of SSA Simpler dataflow analysis No need to use use-def/def-use chains, which requires NM space for N uses and M definitions SSA form relates in a useful way with dominance structures. SSA simplifies algorithms that construct interference graphs. CMPUT 680 - Compiler Design and Optimization

  5. SSA Form in Control-Flow Path Merges Is this code in SSA form? B1 b  M[x] a  0 No, two definitions of a appear in the code (in B1 and B3) B2 if b<4 How can we transform this code into a code in SSA form? B3 a  b We can create two versions of a, one for B1 and another for B3. B4 c  a + b CMPUT 680 - Compiler Design and Optimization

  6. We define a fictional function that “knows” which control path was taken to reach the basic block B4: SSA Form in Control-Flow Path Merges But which version should we use in B4 now? B1 b  M[x] a1  0 B2 if b<4 B3 a2  b B4 c a?+ b CMPUT 680 - Compiler Design and Optimization

  7. We define a fictional function that “knows” which control path was taken to reach the basic block B4: SSA Form in Control-Flow Path Merges But which version should we use in B4 now? B1 b  M[x] a1  0 B2 if b<4 B3 a2  b B4 a3  (a2,a1) c  a3 + b CMPUT 680 - Compiler Design and Optimization

  8. A Loop Example a  0 a1  0 b0  undef c0  undef b  a+1 c  c+b a  b*2 if a < N a3  (a1,a2) b1  (b0,b2) c2  (c0,c1) b2  a3+1 c1  c2+b2 a2  b2*2 if a < N return (b0,b2) is not necessary because b1 is never used. But the phase that generates  functions does not know it. Unnecessary  functions are later eliminated by dead code elimination. return CMPUT 680 - Compiler Design and Optimization

  9. The  Function How can we implement a  function that “knows” which control path was taken? Answer 1: We don’t!! The  function is used only to connect use to definitions during optimization, but is never implemented. Answer 2: If we must execute the  function, we can implement it by inserting MOVE instructions in all control paths. CMPUT 680 - Compiler Design and Optimization

  10. Criteria for Inserting  Functions We could insert one  function for each variable at every joinpoint (a point in the CFG with more than one predecessor). But that would be wasteful. What criteria should we use to insert a  function for a variable a at nodez of the CFG? Intuitively, we should add a function  if there are two definitions of a that can reach the point z through distinct paths. CMPUT 680 - Compiler Design and Optimization

  11. Path Convergence Criterion (Cytron-Ferrante/89) Insert a  function for a variable a at a node z if allthe following conditions are true: 1. There is a block x that defines a 2. There is a block y x that defines a 3. There is a non-empty paths xz and yz 4. Paths xz and yz don’t have any nodes in common other than z 5. The node z does not appear within bothxz and yz prior to the end, but it may appear in one or the other. Note: The start node contains an implicit definition of every variable. CMPUT 680 - Compiler Design and Optimization

  12. -Candidates are Join Nodes Notice that according to the path convergence criterion, the node z that will receive the  function must be a join node. z is the first node that joins the paths Pxz and Pyz. CMPUT 680 - Compiler Design and Optimization

  13. This algorithm is extremely costly, because it requires the examination of every triple of nodes x, y, z and every path from x to z and from y to z. Can we do better? Iterated Path-Convergence Criterion The  function itself is a definition of a. Therefore the path-convergence criterion is a set of equations that must be satisfied. whilethere are nodes x, y, z satisfying conditions 1-5 andz does not contain a  function for a do inserta (a0, a1, …, an) at node z CMPUT 680 - Compiler Design and Optimization

  14. The SSA Conversion Problem For each variable x defined in a CFG G=(V,E), given the set of nodes S  V that contain a definition for x, find the minimal set, J(S) of nodes that requires a (xi,xj) function. By definition, the START node defines all the variables, therefore  S  V, START  S. If we need to compute  nodes for several variables, it may be efficient to precompute data structures based on the CFG. CMPUT 680 - Compiler Design and Optimization

  15. Processing Time for SSA Conversion The performance of an SSA conversion algorithm should be measured by the processing timeTp, the preprocessing spaceSp, and the query timeTq. (Shapiro and Saint 1970): outline an algorithm (Reif and Tarjan 1981): extend the Lengauer-Tarjan dominator algorithm to compute -nodes. (Cytron et al. 1991): show that SSA conversion can use the idea of dominance frontiers, resulting on an O(|V|2) algorithm. (Sreedhar and Gao, 1995): An O(|E|) algorithm, but in private commun. with Pingali in 1996 admits that it is in practice 5 times slower than Cytron et al. CMPUT 680 - Compiler Design and Optimization

  16. Processing Time for SSA Conversion Bilardi, Pingali, 1999:present a generalized framework and a parameterized Augmented Dominator Tree (ADT) algorithm that allows for a space-time tradeoff. They show that Cytron et al. and Gao-Shreedhar are special cases of the ADT algorithm. • Bilardi and Pingali describe three strategies to compute • -placement: • Two-Phase Algorithms • Lock-Step Algorithms • Lazy Algorithms CMPUT 680 - Compiler Design and Optimization

  17. CFG DF Computation DF Graph S Reachability J(S) Two-Phase Algorithms First build the entire Dominance Frontier Graph, then find the nodes reachable from S Simple DF Graph may be quite large CMPUT 680 - Compiler Design and Optimization

  18. Lock-Step Algorithms Performs the reachability computation incrementally while the DF relation is computed. CFG • Avoid storing the DF Graph. • Perform computations at all • nodes of the graph, even though • most are irrelevant • Inneficient when computing the • -nodes for many variables. DF Computation Reachability S J(S) CMPUT 680 - Compiler Design and Optimization

  19. Lazy Algorithms Lazily compute only the portion fo the DF Graph that is needed. Carefully select a portion of the DF Graph to compute eagerly (before it is needed). CFG DF Computation A Two-Phase Algorithm is an extreme case of a lazy algorithm. DF Graph SubGraph Reachability S J(S) CMPUT 680 - Compiler Design and Optimization

  20. Computing a Dominator Tree (n: # of nodes; m: # of edges) (Lowry and Medlock, 1969):Introduce the problem and give an O(n4) algorithm. (Lengauer and Tarjan, 1979):Give a complicated O(m(m.n)) algorithm [(m.n) is the inverse Ackermann’s function]. (Harel, 1985):Give a linear time algorithm. (Alstrup, Harel and Thorup, 1997):Give a simpler version of Harel’s algorithm. CMPUT 680 - Compiler Design and Optimization

  21. Dominance Property of the SSA Form In SSA form definitions dominate uses, i.e.: 1. If x is used in a  function in block n, then the definition of x dominates every predecessor of n. 2. If x is used in a non- statement in block n, then the definition of x dominates n. CMPUT 680 - Compiler Design and Optimization

  22. The Dominance Frontier A node xdominates a node w if every path from the start node to w must go through x. A node xstrictly dominates a node w if x dominates w and x w. The dominance frontier of a node x is the set of all nodes w such that x dominates a predecessor of w, but x does not strictly dominates w. CMPUT 680 - Compiler Design and Optimization

  23. 2 9 5 3 10 6 11 7 12 8 4 Example 1 13 What is the dominance frontier of node 5? CMPUT 680 - Compiler Design and Optimization

  24. 2 9 5 3 10 6 11 7 12 8 4 Example 1 13 First we must find all nodes that node 5 dominates. CMPUT 680 - Compiler Design and Optimization

  25. 2 9 5 3 10 6 11 7 12 8 4 Example 1 13 A node w is in the dominance frontier of node 5 if 5 dominates a predecessor of w, but 5 does not strictly dominatesw itself. What is the dominance frontier of 5? CMPUT 680 - Compiler Design and Optimization

  26. 10 6 11 7 Example 1 2 5 9 3 8 12 4 13 A node w is in the dominance frontier of node 5 if 5 dominates a predecessor of w, but 5 does not strictly dominatesw itself. What is the dominance frontier of 5? CMPUT 680 - Compiler Design and Optimization

  27. 10 6 11 7 Example 1 2 5 9 3 8 12 4 DF(5) = {4, 5, 12, 13} 13 A node w is in the dominance frontier of node 5 if 5 dominates a predecessor of w, but 5 does not strictly dominatesw itself. What is the dominance frontier of 5? CMPUT 680 - Compiler Design and Optimization

  28. Dominance Frontier Criterion Dominance Frontier Criterion: If a node x contains a definition of variable a, then any node z in the dominance frontier of x needs a  function for a. Can you think of an intuitive explanation for why a node in the dominance frontier of another node must be a join node? CMPUT 680 - Compiler Design and Optimization

  29. 10 6 11 7 Example 1 If a node (12) is in the dominance frontier of another node (5), than there must be at least two paths converging to (12). 2 5 9 3 8 12 4 These paths must be non-intersecting, and one of them (5,7,12) must contain a node strictly dominated by (5). 13 CMPUT 680 - Compiler Design and Optimization

  30. Dominator Tree To compute the dominance frontiers, we first compute the dominator tree of the CFG. There is an edge from node x to node y in the dominator tree if node x immediately dominates node y. I.e., xdominatesyx, and xdoes not dominate any other dominator of y. Dominator trees can be computed using the Lengauer-Tarjan algorithm(1979). See sec. 19.2 of Appel. CMPUT 680 - Compiler Design and Optimization

  31. 6 10 7 11 Example: Dominator Tree 1 2 5 9 Dominator Tree 3 1 8 12 4 2 4 5 12 9 13 3 10 11 13 Control Flow Graph 6 7 8 CMPUT 680 - Compiler Design and Optimization

  32. Local Dominance Frontier Cytron-Ferrante define the local dominance frontier of a node n as: DFlocal[n] = successors of n in the CFG that are not strictly dominated by n CMPUT 680 - Compiler Design and Optimization

  33. 10 6 7 11 Example: Local Dominance Frontier In the example, what are the local dominance frontier of nodes 5, 6 and 7? 1 2 5 9 DFlocal[5] =  DFlocal[6] = {4,8} DFlocal[7] = {8,12} 3 8 12 4 13 Control Flow Graph CMPUT 680 - Compiler Design and Optimization

  34. Dominance Frontier Inherited From Its Children The dominance frontier of a node n is formed by its local dominance frontier plus nodes that are passed up by the children of n in the dominator tree. The contribution of a node c to its parents dominance frontier is defined as [Cytron-Ferrante, 1991]: DFup[c] = nodes in the dominance frontier of c that are not strictly dominated by the immediate dominator of c CMPUT 680 - Compiler Design and Optimization

  35. 10 6 7 11 Example: Local Dominance Frontier 1 In the example, what are the contributions of nodes 6, 7, and 8 to its parent dominance frontier? 2 5 9 3 8 12 First we compute the DF and the immediate dominator of each node: DF[6] = {4,8}, idom(6)= 5 DF[7] = {8,12}, idom(7)= 5 DF[8] = {5,13}, idom(8)= 5 4 13 Control Flow Graph CMPUT 680 - Compiler Design and Optimization

  36. 10 6 11 7 Example: Local Dominance Frontier First we compute the DF and the immediate dominator of each node: DF[6] = {4,8}, idom(6)= 5 DF[7] = {8,12}, idom(7)= 5 DF[8] = {5,13}, idom(8)= 5 1 2 5 9 3 8 12 4 Now we check for the DFup condition: DFup[6] = {4} DFup[7] = {12} DFup[8] = {5,13} 13 Control Flow Graph CMPUT 680 - Compiler Design and Optimization

  37. A note on implementation We want to represent these sets efficiently: DF[6] = {4,8} DF[7] = {8,12} DF[8] = {5,13} If we use bitvectors to represent these sets: DF[6] = 0000 0001 0001 0000 DF[7] = 0001 0001 0000 0000 DF[8] = 0010 0000 0010 0000 CMPUT 680 - Compiler Design and Optimization

  38. Strictly Dominated Sets We can also represent the strictly dominated sets as vectors: SD[1] = 0011 1111 1111 1100 SD[2] = 0000 0000 0000 1000 SD[5] = 0000 0001 1100 0000 SD[9] = 0000 1100 0000 0000 Dominator Tree 1 2 4 5 12 9 13 3 10 11 6 7 8 CMPUT 680 - Compiler Design and Optimization

  39. A note on implementation If we use bitvectors to represent these sets: DF[6] = 0000 0001 0001 0000 DF[7] = 0001 0001 0000 0000 DF[8] = 0010 0000 0010 0000 SD[5] = 0000 0001 1100 0000 DFup[c] = DF[6] ^ ~SD[5] DFup[c] = nodes in the dominance frontier of c that are not strictly dominated by the immediate dominator of c CMPUT 680 - Compiler Design and Optimization

  40. Dominance Frontier Inherited From Its Children The dominance frontier of a node n is formed by its local dominance frontier plus nodes that are passed up by the children of n in the dominator tree. Thus the dominance frontier of a node n is defined as [Cytron-Ferrante, 1991]: CMPUT 680 - Compiler Design and Optimization

  41. 10 6 7 11 Example: Local Dominance Frontier 1 What is DF[5]? Remember that: DFlocal[5] =  DFup[6] = {4} DFup[7] = {12} DFup[8] = {5,13} DTchildren[5] = {6,7,8} 2 5 9 3 8 12 4 13 Control Flow Graph CMPUT 680 - Compiler Design and Optimization

  42. 10 6 7 11 Example: Local Dominance Frontier 1 What is DF[5]? Remember that: DFlocal[5] =  DFup[6] = {4} DFup[7] = {12} DFup[8] = {5,13} DTchildren[5] = {6,7,8} 2 5 9 3 8 12 4 13 Control Flow Graph Thus, DF[5] = {4, 5, 12, 13} CMPUT 680 - Compiler Design and Optimization

  43. Join Sets In order to insert -nodes for a variable x that is defined in a set of nodes S={n1, n2, …, nk} we need to compute the iterated set of join nodes of S. Given a set of nodes S of a control flow graph G, the set of join nodes of S, J(S), is defined as follows: J(S) ={z G|  two paths Pxz and Pyz in G that have z as its first common node, x  S and y  S} CMPUT 680 - Compiler Design and Optimization

  44. Iterated Join Sets Because a -node is itself a definition of a variable, once we insert -nodes in the join set of S, we need to find out the join set of S  J(S). Thus, Cytron-Ferrante define the iterated join set of a set of nodes S, J+(S), as the limit of the sequence: CMPUT 680 - Compiler Design and Optimization

  45. Now we can define the iterated dominance frontier, DF+(S), of a set of nodes S as the limit of the sequence: Iterated Dominance Frontier We can extend the concept of dominance frontier to define the dominance frontier of a set of nodes as: Exercise: Find an example in which the IDF of a set S is different from the DF of the set! CMPUT 680 - Compiler Design and Optimization

  46. An important result proved by Cytron-Ferrante is that: Location of -Nodes Given a variable x that is defined in a set of nodes S={n1, n2, …, nk} the set of nodes that must receive -nodes for x is J+(S). Thus we are mostly interested in computing the iterated dominance frontier of a set of nodes. CMPUT 680 - Compiler Design and Optimization

  47. Algorithms to Compute Dominance Frontier The algorithm to insert -nodes, due to Cytron and Ferrante (1991), computes the dominance frontier of each node in the set S before computing the iterated dominance frontier of the set. In the worst case, the combination of the dominance frontier of the sets can be quadratic in the number of nodes in the CFG. Thus, Cytron-Ferrante’s algorithm has a complexity O(N2). In 1994, Shreedar and Gao proposed a simple, linear algorithm for the insertion of -nodes. CMPUT 680 - Compiler Design and Optimization

  48. 6 10 7 11 Sreedhar and Gao’s DJ Graph 1 2 5 9 Dominator Tree 3 1 8 12 4 2 4 5 12 9 13 3 10 11 13 Control Flow Graph 6 7 8 CMPUT 680 - Compiler Design and Optimization

  49. 10 6 7 11 Sreedhar and Gao’s DJ Graph 1 D nodes 2 5 9 Dominator Tree 3 1 8 12 4 2 4 5 12 9 13 3 10 11 13 Control Flow Graph 6 7 8 CMPUT 680 - Compiler Design and Optimization

  50. 10 6 11 7 Sreedhar and Gao’s DJ Graph D nodes 1 J nodes 2 5 9 Dominator Tree 3 1 8 12 4 2 4 5 12 9 13 3 10 11 13 Control Flow Graph 6 7 8 CMPUT 680 - Compiler Design and Optimization

More Related