
Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

MIT 6.035 Conversion to Low Level Intermediate Representation Unstructured Flow of Control and Instruction Flattening. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology.


Presentation Transcript


  1. MIT 6.035: Conversion to Low Level Intermediate Representation. Unstructured Flow of Control and Instruction Flattening. Martin Rinard, Laboratory for Computer Science, Massachusetts Institute of Technology

  2. Goal: Remain Largely Machine Independent But Move Closer to Standard Machine Model (flat address space, branches)

  3. Control Flow Graph (CFG)
     • Starting point: AST plus symbol tables
     • Target: CFG
     • CFG Nodes are Instruction Nodes
         • stl, sta, stf, cbr, ret nodes are instruction nodes
     • Instruction nodes refer to expression nodes
         • ldl, lda, ldp, len, +, <, ... are expression nodes
     • CFG Edges Represent Flow of Control
         • Forks At Conditional Jump Instructions
         • Merges When Flow of Control Can Reach A Point Multiple Ways
     • Entry and Exit Nodes

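  4. Pattern for while loop
     while (i < v.length) v[i] = v[i]+x;
     [Diagram: CFG with entry, cbr, sta, and exit instruction nodes. The cbr refers to the expression (< (ldl i) (len (ldf v))). The sta refers to the expressions (ldf v), (ldl i), and (+ (lda (ldf v) (ldl i)) (ldp x)). Legend: instruction and expression edges; control flow edges.]

  5. Pattern for if then else
     if (x < y) { a = 0; } else { a = 1; }
     [Diagram: CFG with entry, cbr, two stl, and exit instruction nodes. The cbr refers to the expression (< (ldl x) (ldl y)); one branch leads to stl a with value 0, the other to stl a with value 1.]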

  6. Short-Circuit Conditionals
     • In program, conditionals have a condition written as a boolean expression
         ((i < v.len) && (v[i] != 0)) || (i > k)
     • Semantics say should execute only as much as required to determine condition
         • Evaluate (v[i] != 0) only if (i < v.len) is true
         • Evaluate (i > k) only if ((i < v.len) && (v[i] != 0)) is false
     • Use control-flow graph to represent this short-circuit evaluation

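  7. Short-Circuit Conditionals
     while (i < v.length && v[i] != 0) { i = i+1; }
     [Diagram: CFG with entry, two cbr nodes, stl i, and exit. The first cbr refers to (< (ldl i) (len (ldf v))), the second to (!= (lda (ldf v) (ldl i)) 0), and stl i to (+ (ldl i) 1).]

  8. More Short-Circuit Conditionals
     if (a < b || c != 0) { i = i+1; }
     [Diagram: CFG with entry, two cbr nodes, stl i, and exit. The first cbr refers to (< (ldl a) (ldl b)), the second to (!= (ldl c) 0), and stl i to (+ (ldl i) 1).]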

  9. Routines for Destructuring Program Representation
     destruct(n)
         generates lowered form of structured code represented by n
         returns (b,e) - b is begin node, e is end node in destructed form
     shortcircuit(c, t, f)
         generates short-circuit form of conditional represented by c
         if c is true, control flows to t node
         if c is false, control flows to f node
         returns b - b is begin node for condition evaluation
     new kind of node - nop node

  10. Destructuring Seq Nodes
      destruct(n) generates lowered form of structured code represented by n
      returns (b,e) - b is begin node, e is end node in destructed form
      if n is of the form seq x y
          (bx,ex) = destruct(x);
          (by,ey) = destruct(y);
          next(ex) = by;
          return (bx, ey);
      [Diagram: seq node with children x and y; in the destructed form, x spans bx to ex, y spans by to ey, and ex is linked to by.]
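
The seq case can be sketched in Python as follows. This is only an illustration under assumptions: the `Instr`, `Stmt`, and `Seq` classes and the `linearize` helper are hypothetical names that do not appear in the lecture, and the `next` field stands in for next(...) on the slide.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(eq=False)
class Instr:
    """Low-level instruction node with a single successor (hypothetical)."""
    text: str
    next: Optional["Instr"] = None


@dataclass(eq=False)
class Stmt:
    """Hypothetical AST leaf: one simple statement."""
    text: str


@dataclass(eq=False)
class Seq:
    """AST node 'seq x y': run x, then y."""
    x: object
    y: object


def destruct(n) -> Tuple[Instr, Instr]:
    """Return (begin, end) of the lowered form of n."""
    if isinstance(n, Seq):
        bx, ex = destruct(n.x)
        by, ey = destruct(n.y)
        ex.next = by              # next(ex) = by
        return bx, ey
    if isinstance(n, Stmt):
        node = Instr(n.text)      # a leaf is its own begin and end
        return node, node
    raise TypeError(f"unsupported node: {n!r}")


def linearize(begin: Instr) -> list:
    """Walk the next-chain and collect instruction texts."""
    out, cur = [], begin
    while cur is not None:
        out.append(cur.text)
        cur = cur.next
    return out


tree = Seq(Stmt("a = 0"), Seq(Stmt("b = 1"), Stmt("c = 2")))
b, e = destruct(tree)
print(linearize(b))  # ['a = 0', 'b = 1', 'c = 2']
```

Nested seq nodes flatten into one chain because each call only links the end of x to the begin of y.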

  15. Destructuring If Nodes
      destruct(n) generates lowered form of structured code represented by n
      returns (b,e) - b is begin node, e is end node in destructed form
      if n is of the form if c x y
          (bx,ex) = destruct(x);
          (by,ey) = destruct(y);
          e = new nop;
          next(ex) = e;
          next(ey) = e;
          bc = shortcircuit(c, bx, by);
          return (bc, e);
      [Diagram: if node with children c, x, and y; in the destructed form, bc branches to bx or by, and both ex and ey are linked to the nop e.]

  22. Destructuring While Nodes
      destruct(n) generates lowered form of structured code represented by n
      returns (b,e) - b is begin node, e is end node in destructed form
      if n is of the form while c x
          e = new nop;
          (bx,ex) = destruct(x);
          bc = shortcircuit(c, bx, e);
          next(ex) = bc;
          return (bc, e);
      [Diagram: while node with children c and x; in the destructed form, bc branches to bx or to the nop e, and ex is linked back to bc.]
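
The if and while cases can be sketched together in Python. The tuple encoding of the AST and the `Node` and `Cbr` classes are assumptions made for this example. As a simplification, `shortcircuit` here treats every condition as a single computed test and does not handle &&, ||, or !.

```python
class Node:
    """Low-level CFG node; `next` is the fall-through successor."""
    def __init__(self, text):
        self.text = text
        self.next = None


class Cbr(Node):
    """Conditional branch: control goes to t if cond holds, else to f."""
    def __init__(self, cond, t, f):
        super().__init__(f"cbr {cond}")
        self.t, self.f = t, f


# Hypothetical AST encoding, tagged by the first tuple element:
#   ("stmt", text)   ("if", cond, x, y)   ("while", cond, x)

def shortcircuit(c, t, f):
    # Simplification: every condition is one computed test.
    return Cbr(c, t, f)


def destruct(n):
    """Return (begin, end) of the lowered form of n."""
    kind = n[0]
    if kind == "stmt":
        node = Node(n[1])
        return node, node
    if kind == "if":
        _, c, x, y = n
        bx, ex = destruct(x)
        by, ey = destruct(y)
        e = Node("nop")           # merge point for both arms
        ex.next = e
        ey.next = e
        bc = shortcircuit(c, bx, by)
        return bc, e
    if kind == "while":
        _, c, x = n
        e = Node("nop")           # loop exit
        bx, ex = destruct(x)
        bc = shortcircuit(c, bx, e)
        ex.next = bc              # back edge to the condition
        return bc, e
    raise ValueError(kind)


loop = ("while", "i < n",
        ("if", "x < y", ("stmt", "a = 0"), ("stmt", "a = 1")))
b, e = destruct(loop)
inner = b.t                        # the lowered if inside the loop body
print(b.text, "|", inner.text, "|", inner.t.next.next is b)
# cbr i < n | cbr x < y | True
```

The nop gives each construct a single end node to return, so the caller can link it without knowing how many arms were merged.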

  28. Shortcircuiting And Conditions
      shortcircuit(c, t, f) generates shortcircuit form of conditional represented by c
      returns b - b is begin node of shortcircuit form
      if c is of the form c1 && c2
          b2 = shortcircuit(c2, t, f);
          b1 = shortcircuit(c1, b2, f);
          return (b1);
      [Diagram: c1 && c2; b1 branches to b2 when c1 is true and to f when c1 is false; b2 branches to t or f.]

  32. Shortcircuiting Or Conditions
      shortcircuit(c, t, f) generates shortcircuit form of conditional represented by c
      returns b - b is begin node of shortcircuit form
      if c is of the form c1 || c2
          b2 = shortcircuit(c2, t, f);
          b1 = shortcircuit(c1, t, b2);
          return (b1);
      [Diagram: c1 || c2; b1 branches to t when c1 is true and to b2 when c1 is false; b2 branches to t or f.]

  36. Shortcircuiting Not Conditions
      shortcircuit(c, t, f) generates shortcircuit form of conditional represented by c
      returns b - b is begin node of shortcircuit form
      if c is of the form ! c1
          b = shortcircuit(c1, f, t);
          return (b);
      [Diagram: ! c1; b branches to f when c1 is true and to t when c1 is false.]

  37. Computed Conditions
      shortcircuit(c, t, f) generates shortcircuit form of conditional represented by c
      returns b - b is begin node of shortcircuit form
      if c is of the form e1 < e2
          b = new cbr(e1 < e2, t, f);
          return (b);
      [Diagram: cbr node referring to the expression (< e1 e2), with branch targets t and f.]
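
The four shortcircuit cases (&&, ||, !, and a computed test) fit in one recursive function. The Python sketch below assumes a hypothetical encoding in which compound conditions are tuples and computed tests are plain strings. The `run` helper is also hypothetical and exists only to show which tests get evaluated.

```python
class Cbr:
    """Conditional branch on a computed test with true/false targets."""
    def __init__(self, test, t, f):
        self.test, self.t, self.f = test, t, f


# Hypothetical condition encoding:
#   ("and", c1, c2)  ("or", c1, c2)  ("not", c1)  or a string like "a < b"

def shortcircuit(c, t, f):
    """Return the begin node of the short-circuit form of c."""
    if isinstance(c, str):                  # computed condition
        return Cbr(c, t, f)
    op = c[0]
    if op == "and":
        b2 = shortcircuit(c[2], t, f)
        return shortcircuit(c[1], b2, f)    # c1 true -> test c2
    if op == "or":
        b2 = shortcircuit(c[2], t, f)
        return shortcircuit(c[1], t, b2)    # c1 false -> test c2
    if op == "not":
        return shortcircuit(c[1], f, t)     # swap the targets
    raise ValueError(op)


def run(node, env):
    """Follow branches using env (test -> bool). Return the target reached
    and the list of tests that were actually evaluated."""
    seen = []
    while isinstance(node, Cbr):
        seen.append(node.test)
        node = node.t if env[node.test] else node.f
    return node, seen


cond = ("or", ("and", "i < v.len", "v[i] != 0"), "i > k")
b = shortcircuit(cond, "T", "F")
print(run(b, {"i < v.len": False, "v[i] != 0": True, "i > k": True}))
# ('T', ['i < v.len', 'i > k'])
```

When i < v.len is false, the test v[i] != 0 is never reached, which is the behavior the short-circuit semantics require. Note that no node is created for !, since negation only swaps the two targets.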

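  38. Nops In Destructured Representation
      while (i < v.length && v[i] != 0) { i = i+1; }
      [Diagram: CFG with entry, two cbr nodes, stl i, a nop node, and exit. The first cbr refers to (< (ldl i) (len (ldf v))), the second to (!= (lda (ldf v) (ldl i)) 0), and stl i to (+ (ldl i) 1).]

  39. Eliminating Nops Via Peephole Optimization
      [Diagram: an instruction sequence containing a nop node, before and after the nop is removed.]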
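
The slide gives only the title, so the following Python sketch is one plausible reading rather than the lecture's method: every edge that targets a nop is redirected to the nop's successor. The `Node` class with a `succs` list and both function names are assumptions made for this example.

```python
class Node:
    """CFG node with an ordered list of successors (hypothetical)."""
    def __init__(self, op, succs=None):
        self.op = op
        self.succs = succs if succs is not None else []


def skip_nops(node):
    """Follow single-successor nop nodes until a real node is reached.
    The `seen` set guards against a cycle made only of nops."""
    seen = set()
    while node.op == "nop" and len(node.succs) == 1 and id(node) not in seen:
        seen.add(id(node))
        node = node.succs[0]
    return node


def eliminate_nops(entry):
    """Redirect every edge that targets a nop to the nop's successor.
    Returns the (possibly new) entry node."""
    entry = skip_nops(entry)
    stack, visited = [entry], set()
    while stack:
        n = stack.pop()
        if id(n) in visited:
            continue
        visited.add(id(n))
        n.succs = [skip_nops(s) for s in n.succs]
        stack.extend(n.succs)
    return entry


# Loop shape from the nop example: both branches exit through one nop.
exit_ = Node("exit")
nop = Node("nop", [exit_])
cbr1 = Node("cbr i < v.length")
cbr2 = Node("cbr v[i] != 0")
stl = Node("stl i", [cbr1])
cbr1.succs = [cbr2, nop]
cbr2.succs = [stl, nop]

entry = eliminate_nops(cbr1)
print([s.op for s in cbr1.succs], [s.op for s in cbr2.succs])
# ['cbr v[i] != 0', 'exit'] ['stl i', 'exit']
```

After the pass the nop is unreachable and can be dropped. A nop without a successor is left in place, because there is nothing to redirect its incoming edges to.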

  40. Flattening Expression Trees
      • Start with expression tree
          (- (+ (ldl i) (ldl j)) 1)
      • Produce flat sequence of three-address instructions
          ldl t1, i
          ldl t2, j
          add t3, t1, t2
          sub t4, t3, 1
      • Facilitates translation to machine code
      • Facilitates optimizations and transformations
      • Key concept: compiler-generated temps
      [Diagram: expression tree with - at the root, children + and 1; the + node has children ldl i and ldl j.]

  41. Handling Temps
      • Each procedure has its own set of temps
      • Make a temp table for the procedure
      • Store information about temps in temp table

  42. Three-Address Instructions
      • stl temp, local
      • stp parm, temp
      • stf temp, field
      • sta temp, tempArray, tempIndex
      • ldl temp, local
      • ldp temp, parameter
      • ldf temp, field
      • lda temp, tempArray, tempIndex
      • len temp, tempArray
      • add dst, src1, src2
      • sub dst, src1, src2
      • sll dst, src1, src2
      • slt dst, src1, src2
      All of these have a reference to the next instruction to execute
      dst, src1, src2 all temps (or constants)

  43. Conditional Branch Instructions
      • Two conditional branch instructions
          • breqz temp, (trueIns, falseIns)
          • brneqz temp, (trueIns, falseIns)
      • Branches have two instruction references
          • Next instruction if branch taken
          • Next instruction if branch not taken

  44. while (i < v.length) v[i] = v[i]+x;
      Flattened condition and branch (after entry):
          ldl t8, i
          ldf t9, v
          len t10, t9
          slt t11, t8, t10
          brneqz t11
      Flattened store (followed by exit):
          ldf t1, v
          ldl t2, i
          ldf t3, v
          ldl t4, i
          lda t5, t3, t4
          ldp t6, x
          add t7, t5, t6
          sta t7, t1, t2
      [Diagram: the flattened CFG shown next to the high-level CFG with entry, cbr, sta, and exit nodes and their expression trees.]

  45. How to Flatten Expression Trees
      • Simple depth-first traversal
      • Generates sequence of instruction nodes
      • One instruction (and one temp) for each value in tree
      • Leaves contain load instructions: generate an instruction to load value into temp
      • Internal nodes combine values from subtrees
          • Generate compute instruction for each internal node
          • Use temps from subtrees as operands of instruction
          • New temp holds new value
      • Link generated instructions for each expression tree together as they are generated
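
The depth-first traversal can be sketched in Python. The tuple encoding of expression trees, the operator-to-mnemonic table, and the generator used in place of a temp table are all assumptions made for this example. Constants are passed through as operands, since the three-address slide allows sources to be temps or constants.

```python
import itertools

LOADS = {"ldl", "ldp", "ldf"}                   # leaf loads of a named value
BINOPS = {"+": "add", "-": "sub", "<": "slt"}   # assumed operator mnemonics


def new_temps():
    """Per-procedure temp supply: t1, t2, ... (stands in for a temp table)."""
    return (f"t{i}" for i in itertools.count(1))


def flatten(expr, code, temps):
    """Depth-first flattening. Appends instructions to `code` and returns
    the operand (temp name or constant) that holds the value of `expr`."""
    if isinstance(expr, int):        # constants are used directly as operands
        return str(expr)
    op, *args = expr
    if op in LOADS:                  # leaf: load the value into a new temp
        dst = next(temps)
        code.append(f"{op} {dst}, {args[0]}")
        return dst
    srcs = [flatten(a, code, temps) for a in args]   # subtrees first
    dst = next(temps)                                 # new temp, new value
    code.append(f"{BINOPS.get(op, op)} {dst}, {', '.join(srcs)}")
    return dst


code = []
result = flatten(("-", ("+", ("ldl", "i"), ("ldl", "j")), 1), code, new_temps())
print(code)
# ['ldl t1, i', 'ldl t2, j', 'add t3, t1, t2', 'sub t4, t3, 1']
```

Appending to one shared list keeps the generated instructions in execution order, which corresponds to linking them together as they are generated.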

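  46. while (i < v.length) v[i] = v[i]+x;
      Flattened condition expression:
          ldl t8, i
          ldf t9, v
          len t10, t9
          slt t11, t8, t10
      Flattened operand expressions of the store:
          ldf t1, v
          ldl t2, i
          ldf t3, v
          ldl t4, i
          lda t5, t3, t4
          ldp t6, x
          add t7, t5, t6
      [Diagram: the flattened expression sequences shown next to the high-level CFG with entry, cbr, sta, and exit nodes and their expression trees.]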

  47. How to Flatten Instructions
      • Leverage expression tree flattening
      • Store instructions
          • Flatten expression trees for operands of store
          • Generate a store instruction that uses temps from flattened expression subtrees
      • Branch instructions
          • Flatten condition expression
          • Generate a conditional branch instruction that uses temp from flattened condition expression
      • Link generated instructions for each instruction together as they are generated
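
Instruction flattening can be sketched in Python on top of an expression flattener. The tuple encodings `("sta", array, index, value)` and `("cbr", condition)` are assumptions made for this example, as is the choice of brneqz for the branch. Branch targets are left out because control-flow edges are reconnected in a later step.

```python
import itertools

BINOPS = {"+": "add", "-": "sub", "<": "slt"}   # assumed operator mnemonics


def flatten_expr(expr, code, temps):
    """Depth-first expression flattening; returns the result operand."""
    if isinstance(expr, int):
        return str(expr)
    op, *args = expr
    if op in ("ldl", "ldp", "ldf"):
        dst = next(temps)
        code.append(f"{op} {dst}, {args[0]}")
        return dst
    srcs = [flatten_expr(a, code, temps) for a in args]
    dst = next(temps)
    code.append(f"{BINOPS.get(op, op)} {dst}, {', '.join(srcs)}")
    return dst


def flatten_instr(instr, temps):
    """Flatten one high-level instruction node into a list of instructions."""
    code = []
    op = instr[0]
    if op == "sta":                  # hypothetical: ("sta", array, index, value)
        _, array, index, value = instr
        a = flatten_expr(array, code, temps)
        i = flatten_expr(index, code, temps)
        v = flatten_expr(value, code, temps)
        code.append(f"sta {v}, {a}, {i}")
    elif op == "cbr":                # hypothetical: ("cbr", condition)
        c = flatten_expr(instr[1], code, temps)
        code.append(f"brneqz {c}")   # targets are linked in a later step
    else:
        raise ValueError(op)
    return code


# while (i < v.length) v[i] = v[i]+x;
temps = (f"t{n}" for n in itertools.count(1))
store = ("sta", ("ldf", "v"), ("ldl", "i"),
         ("+", ("lda", ("ldf", "v"), ("ldl", "i")), ("ldp", "x")))
branch = ("cbr", ("<", ("ldl", "i"), ("len", ("ldf", "v"))))
body_code = flatten_instr(store, temps)
cond_code = flatten_instr(branch, temps)
print(body_code[-1])   # sta t7, t1, t2
print(cond_code[-1])   # brneqz t11
```

Flattening the store before the branch is what makes the temps come out as t1 to t7 for the body and t8 to t11 for the condition, as in the while example.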

  48. while (i < v.length) v[i] = v[i]+x;
      Flattened condition expression:
          ldl t8, i
          ldf t9, v
          len t10, t9
          slt t11, t8, t10
      Flattened store:
          ldf t1, v
          ldl t2, i
          ldf t3, v
          ldl t4, i
          lda t5, t3, t4
          ldp t6, x
          add t7, t5, t6
          sta t7, t1, t2
      [Diagram: the flattened sequences shown next to the high-level CFG with entry, cbr, sta, and exit nodes and their expression trees.]

  49. Reconnecting Control Flow Graph Edges
      • Instruction correspondence map M
      • M(n) = (n1,n2)
          • n is a node in high-level IR
          • n1 is the first node in instruction sequence resulting from the flattening of n
          • n2 is the last node in instruction sequence resulting from flattening of n
      • M is used to reestablish control-flow links after flattening
      • Typical implementation of M would be a hash table
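
A small Python sketch of the map M, using a dict as the hash table. The `LowInstr` class, the `chain` and `reconnect` helpers, and the use of strings to name high-level nodes are all assumptions made for this example.

```python
class LowInstr:
    """Flattened instruction. `succs` holds one target, or the pair
    (taken, not taken) for a conditional branch."""
    def __init__(self, text):
        self.text = text
        self.succs = []


def chain(texts):
    """Build a straight-line sequence and return its (first, last) nodes."""
    nodes = [LowInstr(t) for t in texts]
    for a, b in zip(nodes, nodes[1:]):
        a.succs = [b]
    return nodes[0], nodes[-1]


# M maps each high-level node (named by a string here) to (first, last).
M = {
    "entry": chain(["entry"]),
    "cbr": chain(["ldl t8, i", "ldf t9, v", "len t10, t9",
                  "slt t11, t8, t10", "brneqz t11"]),
    "sta": chain(["ldf t1, v", "ldl t2, i", "ldf t3, v", "ldl t4, i",
                  "lda t5, t3, t4", "ldp t6, x", "add t7, t5, t6",
                  "sta t7, t1, t2"]),
    "exit": chain(["exit"]),
}

# Control-flow edges of the high-level CFG for the while loop.
high_edges = {"entry": ["cbr"], "cbr": ["sta", "exit"],
              "sta": ["cbr"], "exit": []}


def reconnect(high_edges, M):
    """For each high-level edge n -> s, link last(n) to first(s)."""
    for n, succs in high_edges.items():
        last = M[n][1]
        last.succs = [M[s][0] for s in succs]


reconnect(high_edges, M)
branch = M["cbr"][1]
print([s.text for s in branch.succs])  # ['ldf t1, v', 'exit']
```

Only the last instruction of each sequence is rewired. The links inside a sequence were already set up when the sequence was generated.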

  50. while (i < v.length) v[i] = v[i]+x;
      Flattened condition and branch (after entry):
          ldl t8, i
          ldf t9, v
          len t10, t9
          slt t11, t8, t10
          brneqz t11
      Flattened store (followed by exit):
          ldf t1, v
          ldl t2, i
          ldf t3, v
          ldl t4, i
          lda t5, t3, t4
          ldp t6, x
          add t7, t5, t6
          sta t7, t1, t2
      [Diagram: the flattened CFG shown next to the high-level CFG, with the first and last instruction of each flattened sequence labeled first and last.]
