Languages and Compiler Design II Basic Blocks

Languages and Compiler Design II: Basic Blocks. Material provided by Prof. Jingke Li; stolen with pride and modified by Herb Mayer, PSU, Spring 2010, rev.: 5/18/2010. Agenda: Definition; Sample: Basic Block; Identifying Basic Blocks (BB); Control Flow Graph (CFG); Sample: Quicksort; Quicksort CFG.


Presentation Transcript


  1. Languages and Compiler Design II: Basic Blocks. Material provided by Prof. Jingke Li. Stolen with pride and modified by Herb Mayer, PSU, Spring 2010, rev.: 5/18/2010. CS322

  2. Agenda
     • Definition
     • Sample: Basic Block
     • Identifying Basic Blocks (BB)
     • Control Flow Graph (CFG)
     • Sample: Quicksort
     • Quicksort CFG
     • Loops
     • CFG Synthesis

  3. Definition: Basic Block
     A Basic Block (BB) is a sequence of one or more consecutive instructions that starts with a unique entry (the header, also known as the leader) and ends with an exit instruction that either transfers control to another BB or ends the program (e.g. a Halt instruction).
     • Single-instruction BBs are possible.
     • Leaders are created explicitly when an instruction is the destination of a branch or a call.
     • Leaders are created implicitly when the previous instruction branches away (via jump or call), or by fall-through, e.g. in the case of conditional branches.

  4. Sample: Basic Block
     A basic block is a sequence of one or more consecutive operations whose first is the sole entry and whose last is the sole exit point.
     • Only the first statement can be a label or the target of a jump. Being the first operation of a BB via fall-through is also possible.
     • Only the last statement can be a jump statement. Non-control-flow operations can also be exit points, for example when the next operation happens to be the target of a branch.

     (1)-(4) form a Basic Block:
         (0) L1:
         (1) i := m-1
         (2) j := n
         (3) t1 := 4*n
         (4) v := a[t1]
         (5) L2: ...

     (0) is a Basic Block:
         (0) L3: goto foo

     Multiple Basic Blocks:
         (0) i := m-1
         (1) j := n
         (2) L4:
         (3) t2 := 4 * i
         (4) goto bar
         (5) t3 := t3 -j

  5. Identifying Basic Blocks
     1.) Identify the "leaders", i.e. the first statements of basic blocks. Leaders are:
         • The first statement of the program, e.g. the first instruction of the main() function
         • The target of a call, conditional branch, or unconditional branch
         • The operation following a control-transfer instruction; this operation is an implicit target by fall-through. Note that the successor of an unconditional branch is a candidate for unreachable code
     2.) For each leader: its basic block consists of the leader itself plus all 0 or more operations up to and excluding the next leader, or up to the halt instruction.

     Example:
         L0: (1) a := 0
         L1: (2) b := b+1
             (3) c := c+b
             (4) a := b*2
             (5) if a<N goto L1
             (6) return c

     Leaders: (1), (2), (6)
     Basic Blocks: (1) | (2) (3) (4) (5) | (6)
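The two steps on this slide can be sketched in Python. This is a minimal illustration, not the course's implementation: the `(label, text)` instruction encoding and the helper names `find_leaders` and `basic_blocks` are assumptions made for the example, and only `goto` and `return` are treated as control transfers.

```python
import re

def find_leaders(instrs):
    """Return the sorted 0-based indices of all leaders.

    instrs is a list of (label_or_None, text) pairs (hypothetical encoding).
    """
    label_at = {lab: i for i, (lab, _) in enumerate(instrs) if lab}
    leaders = {0}                                  # first statement of the program
    for i, (_, text) in enumerate(instrs):
        m = re.search(r"\bgoto\s+(\w+)", text)
        if m:
            leaders.add(label_at[m.group(1)])      # target of a branch
        is_transfer = m is not None or text.startswith("return")
        if is_transfer and i + 1 < len(instrs):
            leaders.add(i + 1)                     # implicit leader after a transfer
    return sorted(leaders)

def basic_blocks(instrs):
    """Each block runs from its leader up to, but excluding, the next leader."""
    bounds = find_leaders(instrs) + [len(instrs)]
    return [list(range(bounds[k], bounds[k + 1])) for k in range(len(bounds) - 1)]

# The example from this slide; statement (n) sits at index n-1.
program = [
    ("L0", "a := 0"),
    ("L1", "b := b+1"),
    (None, "c := c+b"),
    (None, "a := b*2"),
    (None, "if a<N goto L1"),
    (None, "return c"),
]

leaders_1based = [i + 1 for i in find_leaders(program)]
blocks_1based = [[i + 1 for i in blk] for blk in basic_blocks(program)]
print(leaders_1based)   # [1, 2, 6]
print(blocks_1based)    # [[1], [2, 3, 4, 5], [6]]
```

Note that label L0 does not by itself create a leader here; statement (1) is a leader only because it is the first statement of the program.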

  6. Control Flow Graph (CFG)
     A program's Control Flow Graph is a directed graph whose nodes are Basic Blocks and whose edges are the program-defined flows of control from one Basic Block to another.

     Example:
             (1) a := 0
         L1: (2) b := b+1
             (3) c := c+b
             (4) a := b*2
             (5) if a<N goto L1
             (6) return c

     Resulting Basic Blocks:
         BB1: a := 0
         BB2: b := b+1
              c := c+b
              a := b*2
              if a<N goto L1
         BB3: return c
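As a rough sketch, the edges of this example CFG can be derived from the last instruction of each block. The dictionary layout and the function name `cfg_edges` are hypothetical, and the sketch assumes that only `goto` and `return` end a block without falling through.

```python
import re

def cfg_edges(blocks):
    """Derive (source, target) edges from each block's last instruction."""
    by_label = {b["label"]: b["name"] for b in blocks if b["label"]}
    edges = []
    for k, b in enumerate(blocks):
        last = b["code"][-1]
        m = re.search(r"\bgoto\s+(\w+)", last)
        if m:                                           # explicit branch target
            edges.append((b["name"], by_label[m.group(1)]))
        falls_through = not last.startswith(("goto", "return"))
        if falls_through and k + 1 < len(blocks):       # implicit successor
            edges.append((b["name"], blocks[k + 1]["name"]))
    return edges

blocks = [
    {"name": "BB1", "label": None, "code": ["a := 0"]},
    {"name": "BB2", "label": "L1",
     "code": ["b := b+1", "c := c+b", "a := b*2", "if a<N goto L1"]},
    {"name": "BB3", "label": None, "code": ["return c"]},
]

edges = cfg_edges(blocks)
print(edges)   # [('BB1', 'BB2'), ('BB2', 'BB2'), ('BB2', 'BB3')]
```

The conditional branch at the end of BB2 contributes two edges: one back to BB2 itself and one by fall-through to BB3.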

  7. Sample: Quicksort
     // assume an external input-output array: int a[]
     void quicksort( int m, int n )
     {
         int i, j, v, x;   // temps
         if ( n <= m ) return;
         i = m-1; j = n; v = a[n];
         while(1) {
             do i=i+1; while( a[i] < v );
             do j=j-1; while( a[j] > v );
             if ( i >= j ) break;
             x = a[i]; a[i] = a[j]; a[j] = x;
         } //end while
         x = a[i]; a[i] = a[n]; a[n] = x;
         quicksort( m, j );
         quicksort( i+1, n );
     } //end quicksort

  8. Quicksort IR Code
              (1)  i := m-1
              (2)  j := n
              (3)  t1 := 4*n
              (4)  v := a[t1]
     L0: L1:  (5)  i := i+1
              (6)  t2 := 4*i
              (7)  t3 := a[t2]
              (8)  if t3<v goto L1
     L2:      (9)  j := j-1
              (10) t4 := 4*j
              (11) t5 := a[t4]
              (12) if t5>v goto L2
              (13) if i>=j goto L3
              (14) t6 := 4*i
              (15) x := a[t6]
              (16) t7 := 4*i
              (17) t8 := 4*j
              (18) t9 := a[t8]
              (19) a[t7] := t9
              (20) t10 := 4*j
              (21) a[t10] := x
              (22) goto L0
     L3:      (23) t11 := 4*i
              (24) x := a[t11]
              (25) t12 := 4*i
              (26) t13 := 4*j     //Jingke
              (27) t14 := a[t13]
              (28) a[t12] := t14
              (29) t15 := 4*j     //Jingke
              (30) a[t15] := x
              (31) 2 calls ...

  9. Quicksort CFG
     Control Flow Graph, with the IR instruction range of each block:
         BB1: (1)-(4)
              i := m-1
              j := n
              t1 := 4*n
              v := a[t1]
         BB2: (5)-(8)
              i := i+1
              t2 := 4*i
              t3 := a[t2]
              if t3<v goto BB2
         BB3: (9)-(12)
              j := j-1
              t4 := 4*j
              t5 := a[t4]
              if t5 > v goto BB3
         BB4: (13)
              if i >= j goto BB6
         BB5: (14)-(22)
              t6 := 4 * i
              x := a[t6]
              t7 := 4*i
              t8 := 4*j
              t9 := a[t8]
              a[t7]:= t9
              t10 := 4*j
              a[t10]:= x
              goto BB2
         BB6: (23)-(30)
              t11 := 4*i
              x := a[t11]
              t12 := 4*i
              t13 := 4*j
              t14 := a[t13]
              a[t12]:= t14
              t15 := 4*j
              a[t15]:= x

  10. Loops
     • Since a cfg is a graph, it may contain loops, AKA strongly-connected components (SCC)
     • Generally, a loop is a directed subgraph in which every node can reach every other node along some path
     • This includes "unstructured" loops, with multiple entry points and multiple exit points
     • A structured loop (proper loop) has just 1 entry point and (generally) a single point of exit
     • Loops created by mapping high-level source programs to IR or assembly code are proper, unless disturbed by Goto (and Break) statements
     • Goto can create any loop; Break creates additional exits

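  11. Loops, Cont'd
     How many loops?
     [Figure: two example CFGs with numbered nodes; the edges are not preserved in this transcript. One graph is captioned "Unstructured" with Loop: 2, 3, 4, 5. The other is captioned "2 proper loops, one unstructured loop" with Loop1: 2, 3; Loop2: 2, 4; Loop3: 2, 3, 4.]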

  12. Natural Loops
     • Given a "back edge" t -> h, the natural loop of t -> h is the subgraph whose node set contains h and all the nodes that (1) are dominated by h and (2) can reach t without passing through h, and whose edge set connects all the nodes in this node set
     • Node h is the loop header, which is the unique entry node to the loop
     • Dominance Relation: a node d dominates a node i if every execution path from the CFG entry to i includes d, i.e. one can't execute i without executing d first
     • Recursive Dominance Definition: a dom b, meaning node a dominates node b, if and only if
         • a = b, or
         • a is the unique immediate predecessor of b, or
         • b has at least one immediate predecessor and a dominates all the immediate predecessors of b
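The dominance relation can be computed with a simple fixed-point iteration: start from "every node dominates every node" and repeatedly intersect the dominator sets of each node's predecessors. The sketch below is one common way to do this, not necessarily the method used in the course; the function name `dominators` is an assumption, the edges of `quicksort_cfg` are read off the branch targets on slide 9, and BB6 is treated as the exit.

```python
def dominators(succ, entry):
    """Return {node: set of nodes that dominate it}.

    succ maps every node to its list of successors. Assumes every node is
    reachable from entry; nodes without predecessors are left untouched.
    """
    pred = {n: set() for n in succ}
    for n, targets in succ.items():
        for s in targets:
            pred[s].add(n)
    dom = {n: set(succ) for n in succ}     # start with "everything dominates n"
    dom[entry] = {entry}
    changed = True
    while changed:                         # iterate until nothing changes
        changed = False
        for n in succ:
            if n == entry or not pred[n]:
                continue
            new = {n} | set.intersection(*(dom[p] for p in pred[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

quicksort_cfg = {
    "BB1": ["BB2"],
    "BB2": ["BB2", "BB3"],
    "BB3": ["BB3", "BB4"],
    "BB4": ["BB5", "BB6"],
    "BB5": ["BB2"],
    "BB6": [],
}

dom = dominators(quicksort_cfg, "BB1")
print(sorted(dom["BB5"]))   # ['BB1', 'BB2', 'BB3', 'BB4', 'BB5']
print(sorted(dom["BB6"]))   # ['BB1', 'BB2', 'BB3', 'BB4', 'BB6']
```

BB5 does not dominate BB6, because BB6 can be reached from BB4 directly.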

  13. Back Edges
     We call an edge t -> h a back edge if h dominates t.
     Finding Back Edges:
     • Find a spanning tree of the CFG, e.g. using a depth-first search algorithm
     • Edges that are not included in the spanning tree are candidates for back edges; check each candidate against the dominance relation
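A small sketch of this procedure on the Quicksort CFG, followed by the natural loop of each back edge found. The names `spanning_tree_edges`, `back_edges`, and `natural_loop` are hypothetical; the spanning tree is built with a plain worklist traversal rather than a strict depth-first search, and the dominator sets in `dom` are written out by hand for this graph rather than computed.

```python
def spanning_tree_edges(succ, entry):
    """Collect the edges through which each node is first discovered."""
    seen, tree, work = {entry}, set(), [entry]
    while work:
        n = work.pop()
        for s in succ[n]:
            if s not in seen:
                seen.add(s)
                tree.add((n, s))
                work.append(s)
    return tree

def back_edges(succ, entry, dom):
    """Non-tree edges t -> h are candidates; keep those where h dominates t."""
    tree = spanning_tree_edges(succ, entry)
    candidates = [(t, h) for t in succ for h in succ[t] if (t, h) not in tree]
    return sorted((t, h) for t, h in candidates if h in dom[t])

def natural_loop(succ, t, h):
    """h plus every node that can reach t without passing through h."""
    pred = {n: set() for n in succ}
    for n, targets in succ.items():
        for s in targets:
            pred[s].add(n)
    loop, work = {h}, [t]
    while work:                      # walk backwards from t, stopping at h
        n = work.pop()
        if n not in loop:
            loop.add(n)
            work.extend(pred[n])
    return loop

succ = {"BB1": ["BB2"], "BB2": ["BB2", "BB3"], "BB3": ["BB3", "BB4"],
        "BB4": ["BB5", "BB6"], "BB5": ["BB2"], "BB6": []}
dom = {  # hand-written dominator sets for this graph
    "BB1": {"BB1"},
    "BB2": {"BB1", "BB2"},
    "BB3": {"BB1", "BB2", "BB3"},
    "BB4": {"BB1", "BB2", "BB3", "BB4"},
    "BB5": {"BB1", "BB2", "BB3", "BB4", "BB5"},
    "BB6": {"BB1", "BB2", "BB3", "BB4", "BB6"},
}

found = back_edges(succ, "BB1", dom)
print(found)   # [('BB2', 'BB2'), ('BB3', 'BB3'), ('BB5', 'BB2')]
print(sorted(natural_loop(succ, "BB5", "BB2")))   # ['BB2', 'BB3', 'BB4', 'BB5']
```

The two self-loops are the inner do-while loops of the source program, and BB5 -> BB2 closes the outer while(1) loop.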

  14. BB Analysis See separate .doc presentation

  15. Well-Structured CFG
     A CFG is well-structured (AKA reducible) iff all its loops are natural loops characterized by their back edges.
     Important Properties:
     • In a well-structured control-flow graph there are no jumps into the middle of a loop, i.e. each loop is entered only through its header
     • A cfg derived from a program that exclusively uses structured flow-of-control statements, such as if-then-else, while-do, continue, and break, is always well-structured
     Many dataflow analysis algorithms work only on well-structured CFGs.
     Example: the simplest irreducible flow graph has three nodes, 1, 2, and 3 (figure; its edges are not preserved in this transcript).
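To make the irreducible case concrete, the sketch below assumes the usual textbook shape for a three-node irreducible graph: 1 -> 2, 1 -> 3, 2 -> 3, 3 -> 2. This edge set is an assumption, since the slide's figure did not survive. The graph contains the cycle 2 -> 3 -> 2, yet neither node dominates the other, so the cycle has no back edge and is not a natural loop. The helper name `dominators` is hypothetical.

```python
def dominators(succ, entry):
    """Iterative dominator sets; assumes every node is reachable from entry."""
    pred = {n: set() for n in succ}
    for n, targets in succ.items():
        for s in targets:
            pred[s].add(n)
    dom = {n: set(succ) for n in succ}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in succ:
            if n == entry or not pred[n]:
                continue
            new = {n} | set.intersection(*(dom[p] for p in pred[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

# Assumed textbook shape: the cycle {2, 3} can be entered at 2 or at 3.
irreducible = {1: [2, 3], 2: [3], 3: [2]}

dom = dominators(irreducible, 1)
back = [(t, h) for t in irreducible for h in irreducible[t] if h in dom[t]]
print(dom)    # {1: {1}, 2: {1, 2}, 3: {1, 3}}
print(back)   # []
```

Because the cycle can be entered through either node 2 or node 3, it has no unique header, which is exactly the "jump into the middle of a loop" that a well-structured CFG rules out.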

  16. CFG Synthesis
     Definition: A Control Flow Graph (cfg) of some program p, named cfg(p), is a static abstraction of p in which each node represents a Basic Block (BB). The edges connecting the nodes in the cfg represent the control flow from any one basic block to its successors. A cfg represents only the static control flow; hence it is not necessary to store which of the 2 successors of an If Expression (the Then Clause or the Else Clause) is taken when the condition is true. Only the fact that there are 2 successors matters.

  17. CFG Synthesis
     The cfg Algorithm cfg_build(pc):
     • Aside from its parameter pc, the input to the cfg Algorithm cfg_build() is a list of instructions I, broken into Basic Blocks. One of these BBs holds the selected entry instruction at address pc
     • The cfg Algorithm creates a new cfg node for each BB
     • For each successor s of a Basic Block BB(n), cfg_build() installs a directed edge from BB(n) to s. During this process each BB encountered is labeled as reached. At completion, all BBs are inspected; those not reached are filtered out as unreachable BBs, hence each of their instructions is unreachable code
     • The output of cfg_build() is a pointer to the cfg node associated with address pc
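A minimal Python sketch of cfg_build() under stated assumptions: the basic blocks arrive as a hypothetical mapping `bb_succs` from each block's start address to the start addresses of its successors, the `CfgNode` class is invented for the example, and the function returns the list of unreachable block addresses alongside the entry node purely for illustration (the slide specifies only the entry node as output).

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class CfgNode:
    addr: int                                   # start address of the basic block
    succs: list = field(default_factory=list)   # outgoing edges
    reached: bool = False

def cfg_build(pc, bb_succs):
    """Build cfg nodes and edges starting at pc; report unreached blocks."""
    nodes = {addr: CfgNode(addr) for addr in bb_succs}   # one node per BB
    nodes[pc].reached = True
    work = [pc]
    while work:
        n = nodes[work.pop()]
        for s in bb_succs[n.addr]:
            n.succs.append(nodes[s])            # install directed edge n -> s
            if not nodes[s].reached:
                nodes[s].reached = True
                work.append(s)
    unreachable = sorted(a for a, n in nodes.items() if not n.reached)
    return nodes[pc], unreachable

# Hypothetical program: the block at address 12 is never the target of a
# reached block, so all of its instructions are unreachable code.
bb_succs = {0: [4], 4: [4, 8], 8: [], 12: [8]}

entry, unreachable = cfg_build(0, bb_succs)
print(entry.addr, [s.addr for s in entry.succs])   # 0 [4]
print(unreachable)                                 # [12]
```

Edges are installed only for blocks that are actually reached, so the edge from the dead block at address 12 never appears in the graph.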

  18. CFG Synthesis See separate .doc presentation