
Data Flow Analysis 3 15-411 Compiler Design

  1. Data Flow Analysis 3, 15-411 Compiler Design, Nov. 8, 2005

  2. Key Reference on Global Optimization • Gary A. Kildall, A Unified Approach to Global Program Optimization, ACM Symposium on Principles of Programming Languages, 1973, pages 194-206. • From the abstract: • “A technique is presented for global analysis of object code generated for expressions. The global expression optimization presented includes constant propagation, common sub-expression elimination, elimination of redundant register load operations and live expression analysis. A general purpose program flow analysis algorithm is developed which depends on an optimizing function. The algorithm is defined formally using a directed graph model of program flow structure and is shown to be correct. …”

  3. Kildall’s Contribution • A number of techniques had been developed for compile-time optimization to •  locate redundant computations, •  perform constant computations, •  reduce the number of store-load sequences, etc. • Some provided analysis of only straight-line sequences of instructions; others tried to take program branching into account. • Kildall gave a single unified flow analysis algorithm which extended all the straight-line techniques to include branching. • He stated the algorithm formally and proved it correct in his POPL paper.

  4. Constant Propagation – Example program

    begin
      integer i, a, b, c, d, e;
      a := 1; c := 0; …
      for i := 1 step 1 until 10 do
      begin
        b := 2; …
        d := a + b; …
        e := b + c; …
        c := 4; …
      end
    end

  5. Directed Graph Representation Nodes represent sequences of instructions with no branches. Edges represent control flow between nodes.
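
A minimal Python sketch of the example's flow graph as plain data (not from the original slides). The split of one assignment per node and the labels A through F are assumptions, reconstructed from the pools on slide 7 and the paths on slides 13-15:

    # Hypothetical reconstruction: one assignment per node, labels A..F assumed.
    cfg_nodes = {
        "A": ("a", "1"),      # a := 1
        "B": ("c", "0"),      # c := 0
        "C": ("b", "2"),      # b := 2   (loop body starts here)
        "D": ("d", "a + b"),  # d := a + b
        "E": ("e", "b + c"),  # e := b + c
        "F": ("c", "4"),      # c := 4
    }
    cfg_edges = {("A", "B"), ("B", "C"), ("C", "D"),
                 ("D", "E"), ("E", "F"), ("F", "C")}  # F -> C closes the loop
    entry_nodes = {"A"}

These sketches are reused in the snippets after later slides.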

  6. Constant Propagation • Convenient to associate a pool of propagated constants with each node in the graph. • Pool is a set of ordered pairs which indicate variables that have constant values when node is encountered. • The pool at node B denoted by PB consists of a single element (a,1) since the assignment a:= 1 must occur before B.

  7. Constant Propagation (cont.) • Fundamental problem of constant propagation is to determine the pool of constants for each node in an arbitrary program graph. • By inspection of the program graph for the example, the pool of constants at each node is • PA = ∅, PB = {(a, 1)}, PC = {(a, 1)}, PD = {(a, 1), (b, 2)}, PE = {(a, 1), (b, 2), (d, 3)}, PF = {(a, 1), (b, 2), (d, 3)}

  8. Constant Propagation (cont.) • PN may be determined for each node N in the graph as follows: • Consider each path (A, p1, p2, …, pn, N), and apply constant propagation along the path to obtain a set of constants at node N. • The intersection of these sets over all paths to N is the set of constants which can be assumed for optimization. • (It is unknown which path will be taken at execution time, so the intersection is the conservative choice.)

  9. Global Analysis Algorithm--Informal • Start with an entry node in the program graph, along with a given entry pool corresponding to this entry node. • Process the entry node and produce optimization information for all immediate successors of the entry node. • Intersect the incoming optimizing pools with the already established pools at the successor nodes. • (The first time a node is encountered, take the incoming pool as the first approximation and continue processing.) • For each successor, if the amount of optimizing information is reduced by this intersection, then process the successor like an initial entry node.

  10. Global Analysis Algorithm (cont.) • It is useful to define an optimizing function f which maps an input pool, together with a particular node, to a new output pool. • Given a set of propagated constants, it is possible to examine the operation of a particular node and determine the set of constants that can be assumed after the node is executed. • In the case of constant propagation, let V be a set of variables, C be a set of constants, and N be the set of nodes in the graph. • The set U = V × C represents the ordered pairs which may appear in any constant pool. • In fact, all constant pools are elements of the power set of U, denoted P(U). • Thus, f: N × P(U) → P(U), where (v, c) ∈ f(N, P) if and only if • (cont.)

  11. Global Analysis Algorithm (cont.) • 1. (v, c) ∈ P and the operation at node N does not assign a new value to the variable v, or • 2. The operation at N assigns an expression to the variable v, and the expression evaluates to the constant c.
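
A small Python sketch of this definition for the example, assuming each node holds a single assignment whose right-hand side is either a constant or a sum of variables and constants (as in the program on slide 4). The names transfer and eval_expr are ours, not Kildall's, and the nodes table is the cfg_nodes sketch after slide 5:

    def eval_expr(expr, env):
        """Evaluate expr if every operand is a known constant, else return None."""
        total = 0
        for t in (t.strip() for t in expr.split("+")):
            if t.isdigit():
                total += int(t)           # literal constant
            elif t in env:
                total += env[t]           # variable with a propagated constant
            else:
                return None               # value unknown at compile time
        return total

    def transfer(node, pool, nodes):
        """f(N, P) for constant propagation: a pool is a set of (variable, constant) pairs."""
        var, expr = nodes[node]
        out = {(v, c) for (v, c) in pool if v != var}   # case 1: pairs not killed by the assignment
        value = eval_expr(expr, dict(pool))
        if value is not None:
            out.add((var, value))                       # case 2: the right-hand side folds to a constant
        return frozenset(out)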

  12. Constant Propagation (cont.) • Successively longer paths from A to D can be evaluated, resulting in PD,3, PD,4, …, PD,n for arbitrarily large n. • The pool of constants that can be assumed no matter what flow of control occurs is the set of constants common to all PD,i, i.e. • ∩i PD,i • This procedure is not effective, since the number of such paths may have no finite bound and the procedure would not halt.

  13. Optimization Function for Example • The optimizing function can be applied to node A with an empty constant pool, resulting in • f(A, ∅) = {(a, 1)}. • The function can be applied to B with {(a, 1)} as the constant pool, yielding • f(B, {(a, 1)}) = {(a, 1), (c, 0)}.
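
Reproducing the two applications above with the transfer sketch introduced after slide 11:

    p_A = transfer("A", frozenset(), cfg_nodes)
    print(sorted(p_A))        # [('a', 1)]
    p_B = transfer("B", p_A, cfg_nodes)
    print(sorted(p_B))        # [('a', 1), ('c', 0)]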

  14. Extending f to Paths in the Graph • Given a path from entry node A to an arbitrary node N, the optimizing pool for the path is determined by composing the function f. • For example, f(C, f(B, f(A, ∅))) = {(a, 1), (c, 0), (b, 2)} is the constant pool for D along this path.
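
The same composition can be written as a left-to-right fold over the nodes of a path; a sketch, again reusing the transfer function and cfg_nodes table assumed earlier:

    def pool_along_path(path, entry_pool, nodes):
        """Compose f along a path: f(p_n, … f(p_2, f(p_1, entry_pool)) …)."""
        pool = entry_pool
        for n in path:
            pool = transfer(n, pool, nodes)
        return pool

    # f(C, f(B, f(A, ∅))), i.e. the pool reaching D along (A, B, C):
    print(sorted(pool_along_path(["A", "B", "C"], frozenset(), cfg_nodes)))
    # [('a', 1), ('b', 2), ('c', 0)]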

  15. Constant Propagation (cont.) • The pool of propagated constants at node D can be determined as follows: • A path from entry node A to the node D is (A, B, C, D). For this path the first approximation to the pool for D is • PD,1 = {(a, 1), (b, 2), (c, 0)}. • A longer path from A to D is (A, B, C, D, E, F, C, D), which results in the pool • PD,2 = {(a, 1), (b, 2), (c, 4), (d, 3), (e, 2)}.
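
Writing the two approximations out as literal Python sets shows how their intersection yields the pool PD quoted on slide 7; this snippet stands alone:

    PD_1 = frozenset({("a", 1), ("b", 2), ("c", 0)})
    PD_2 = frozenset({("a", 1), ("b", 2), ("c", 4), ("d", 3), ("e", 2)})
    print(sorted(PD_1 & PD_2))     # [('a', 1), ('b', 2)]  -- the pairs safe on both paths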

  16. Computing the Pool of Optimizing Information • The pool of optimizing information which can be assumed at node N in the graph, independent of the path taken at execution time, is • PN = ∩ {x | x ∈ FN}. • Here FN = { f(pn, f(pn-1, …, f(p1, P)) …) | (p1, p2, …, pn, N) is a path from an entry node p1 with corresponding entry pool P to node N }.

  17. Directed Graphs and Paths • A finite directed graph G = <N, E> is an arbitrary finite set of nodes N and edges E ⊆ N × N. • A path from node A to node B in G is a sequence (p1, p2, …, pk) such that p1 = A and pk = B, where (pi, pi+1) ∈ E for 1 ≤ i < k. • The length of the path is k – 1.

  18. Program Graphs • A program graph is a finite directed graph G with a non-empty set of entry nodes I ⊆ N. • Given N ∈ N, we assume there exists a path (p1, p2, …, pn) such that p1 ∈ I and pn = N. • (i.e., there is a path to every node in the graph from an entry node.)

  19. Successors and Predecessors of a Node • The set of immediate successors of a node N is given by • I(N) = { N' ∈ N | (N, N') ∈ E }. • The set of immediate predecessors of N is given by • I-1(N) = { N' ∈ N | (N', N) ∈ E }.
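
With edges stored as a set of pairs (as in the cfg_edges sketch after slide 5), both sets are one-line comprehensions:

    def successors(N, edges):
        """I(N): immediate successors of N."""
        return {M for (A, M) in edges if A == N}

    def predecessors(N, edges):
        """I-1(N): immediate predecessors of N."""
        return {A for (A, M) in edges if M == N}

    print(successors("C", cfg_edges))     # {'D'}
    print(predecessors("C", cfg_edges))   # {'B', 'F'} (set order may vary)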

  20. Meet-Semilattices • Let the finite set L be the set of all possible optimizing pools for a given application. • Let ∧ be a meet operation with the properties: • ∧ : L × L → L • x ∧ y = y ∧ x • x ∧ (y ∧ z) = (x ∧ y) ∧ z • where x, y, z ∈ L. The set L and the ∧ operation define a finite meet-semilattice.
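
For the constant-propagation instance, the pools are the elements of L and the meet is plain set intersection; a quick sketch checking the stated properties on sample pools (the sample values are ours):

    def meet(x, y):
        """Meet of two constant pools: the pairs common to both."""
        return frozenset(x) & frozenset(y)

    x = frozenset({("a", 1), ("b", 2)})
    y = frozenset({("a", 1), ("c", 0)})
    z = frozenset({("a", 1)})
    assert meet(x, y) == meet(y, x)                    # commutativity
    assert meet(x, meet(y, z)) == meet(meet(x, y), z)  # associativity
    print(sorted(meet(x, y)))                          # [('a', 1)]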

  21. Ordering on Meet-Semilattices • The ∧ operation defines a partial ordering on L by • x ≤ y if and only if x ∧ y = x. • Similarly, • x < y if and only if x ≤ y and x ≠ y.

  22. Generalized Meet Operation • If X ⊆ L, the generalized meet operation ∧X is defined as the pairwise application of ∧ to the elements of X. • L is assumed to have a “zero element” 0 such that 0 ≤ x for all x ∈ L. • An augmented set L' is constructed from L by adding a “unit element” 1 such that 1 is not in L and 1 ∧ x = x for all x in L. • The set L' = L ∪ {1}. It follows that x < 1 for all x in L.
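
In code, one convenient way to realize the added unit element 1 is a sentinel value that the meet treats as an identity; a sketch (the name TOP is ours, not the paper's):

    TOP = None   # stands for the unit element 1, the initial approximation at every node

    def meet_with_unit(x, y):
        """Meet on L' = L ∪ {1}: 1 ∧ x = x, otherwise set intersection."""
        if x is TOP:
            return y
        if y is TOP:
            return x
        return frozenset(x) & frozenset(y)

    print(meet_with_unit(TOP, frozenset({("a", 1)})))   # frozenset({('a', 1)})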

  23. Optimizing Function • An “optimizing function” f is defined • f: N × L → L. • It must have the homomorphism property: • f(N, x ∧ y) = f(N, x) ∧ f(N, y) for all N ∈ N and x, y ∈ L. • Note that f(N, x) < 1 for all N ∈ N and x ∈ L.

  24. Global Analysis Algorithm • Global analysis starts with an entry pool set EP ⊆ I × L, where (e, x) ∈ EP if e ∈ I is an entry node with optimizing pool x ∈ L. • A1 [initialize] L := EP. • A2 [terminate?] If L = ∅ then halt. • A3 [select node] Let L' ∈ L, L' = (N, Pi) for some N ∈ N and Pi ∈ L. Then L := L – {L'}. • A4 [traverse] Let PN be the current approximate pool for node N (initially PN = 1). If PN ≤ Pi then go to step A2. • A5 [set pool] PN := PN ∧ Pi, L := L ∪ {(N', f(N, PN)) | N' ∈ I(N)}. • A6 [loop] Go to step A2. • (Here the L updated in A1-A5 is the worklist of (node, pool) pairs, not the semilattice L of pools.)
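
A runnable Python sketch of steps A1-A6, specialized to the constant-propagation example. It repeats the small helpers from the earlier sketches so it can be run on its own; all names (kildall, worklist, TOP, and so on) are ours rather than the paper's:

    TOP = None   # plays the role of the unit element 1

    def eval_expr(expr, env):
        """Evaluate expr if every operand is a known constant, else None."""
        total = 0
        for t in (t.strip() for t in expr.split("+")):
            if t.isdigit():
                total += int(t)
            elif t in env:
                total += env[t]
            else:
                return None
        return total

    def transfer(node, pool, nodes):
        """f(N, P): the constant-propagation optimizing function."""
        var, expr = nodes[node]
        out = {(v, c) for (v, c) in pool if v != var}
        value = eval_expr(expr, dict(pool))
        if value is not None:
            out.add((var, value))
        return frozenset(out)

    def meet(x, y):
        """x ∧ y on L': 1 is an identity, otherwise intersect the pools."""
        if x is TOP:
            return y
        if y is TOP:
            return x
        return x & y

    def leq(x, y):
        """x ≤ y iff x ∧ y = x; the unit element is strictly above every pool."""
        return x is not TOP and x <= y

    def successors(n, edges):
        return {m for (a, m) in edges if a == n}

    def kildall(nodes, edges, entry_pools):
        """A sketch of steps A1-A6 of the global analysis algorithm."""
        worklist = set(entry_pools)              # A1: the worklist starts as EP
        approx = {n: TOP for n in nodes}         # every PN starts at 1
        while worklist:                          # A2: halt when the worklist is empty
            node, p_i = worklist.pop()           # A3: select (N, Pi) and remove it
            p_n = approx[node]
            if leq(p_n, p_i):                    # A4: the incoming pool adds nothing new
                continue
            approx[node] = meet(p_n, p_i)        # A5: PN := PN ∧ Pi ...
            out = transfer(node, approx[node], nodes)
            for succ in successors(node, edges): # ... and hand f(N, PN) to each successor
                worklist.add((succ, out))
        return approx                            # A6 is the loop back to A2 above

    cfg_nodes = {
        "A": ("a", "1"), "B": ("c", "0"), "C": ("b", "2"),
        "D": ("d", "a + b"), "E": ("e", "b + c"), "F": ("c", "4"),
    }
    cfg_edges = {("A", "B"), ("B", "C"), ("C", "D"),
                 ("D", "E"), ("E", "F"), ("F", "C")}
    pools = kildall(cfg_nodes, cfg_edges, {("A", frozenset())})
    for n in "ABCDEF":
        print(n, sorted(pools[n]))
    # Expected output, matching slide 7:
    # A []
    # B [('a', 1)]
    # C [('a', 1)]
    # D [('a', 1), ('b', 2)]
    # E [('a', 1), ('b', 2), ('d', 3)]
    # F [('a', 1), ('b', 2), ('d', 3)]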
